SETI - Programme "Sécurité et Informatique"

Adaptive Fiabilisation of Multi-Agent Cooperative Applications – FACOMA

Submission summary

The possibility of partial failures is a fundamental characteristic of distributed applications. The fault
tolerance research community has developed solutions (algorithms and architectures), mostly based on
the concept of replication, applied for instance to databases. However, these techniques are almost always
applied explicitly and statically. It is the responsibility of the application designer to identify
explicitly which critical servers should be made robust, and also to decide which strategies (active or
passive replication) and which configurations (how many replicas, their placement) to use. Meanwhile, new
cooperative applications, such as decision support systems, distributed control, electronic commerce,
crisis management systems, and intelligent sensor networks, are very dynamic and are increasingly
modeled as a set of cooperative agents (multi-agent systems). For such applications it is very difficult,
or even impossible, to identify in advance the most critical agents. This is because the
roles and relative importance of the agents can vary greatly during the course of computation, interaction
and cooperation: agents can change roles, strategies, and plans, and agents may also
join or leave the application (open system). Our approach is consequently to give the multi-agent
system itself the capacity to dynamically identify the most critical agents and to decide which reliability
strategies to apply to them. This is analogous to load balancing, but for reliability. This project includes
several complementary aspects:
- the design of a prototype platform for adaptive replication, with innovative dynamicity characteristics
  (dynamic application of replication, dynamic change of replication strategy...). A first version
  (DarX platform) has been developed in our teams;
- the study of replication, consistency management and recovery policies adapted to the specificities
  of agents;
- the study of how to combine agent replication techniques with other reliability and adaptation
  techniques (replanning, task reallocation between agents, etc.), and in the first place with exception
  handling and cooperative recovery techniques;
- the study of the central question of the automatic control of replication strategies (which agent,
  which strategy, which parameters...). Various types of information at the system/network level
  (communication, CPU...) and at the agent level (nature of the communications, roles, plans,
  commitments...), as well as models (of faults, of replication costs...), could be used, compared and
  combined (multi-criteria decision, learning...);
- experimentation on various application examples. Notably, our collaboration with Eurocontrol aims to
  test and validate our approach on scenarios of cooperative distributed air traffic control.
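
To make the idea of automatic replication control more concrete, the following is a minimal sketch and not the DarX platform or its API. It assumes a hypothetical criticality score that blends one agent-level signal (messages received) with a hypothetical role weight, then shares a fixed replica budget among agents in proportion to that score. All names (`AgentStats`, `criticality`, `plan_replication`), the 50/50 weighting, and the `active_threshold` value are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    name: str
    messages_received: int  # hypothetical agent-level signal
    role_weight: float      # hypothetical importance of the current role, in [0, 1]


def criticality(stats: AgentStats, max_messages: int) -> float:
    """Blend normalized traffic and role weight into a score in [0, 1]."""
    traffic = stats.messages_received / max_messages if max_messages else 0.0
    return 0.5 * traffic + 0.5 * stats.role_weight  # assumed equal weighting


def plan_replication(agents, budget, active_threshold=0.7):
    """Share `budget` replicas among agents in proportion to criticality.

    Returns {agent name: (replica count, strategy)}.
    """
    max_messages = max((a.messages_received for a in agents), default=0)
    scores = {a.name: criticality(a, max_messages) for a in agents}
    total = sum(scores.values())
    if total == 0:
        return {name: (0, "none") for name in scores}

    # Largest-remainder rounding so replica counts sum exactly to the budget.
    shares = {name: budget * s / total for name, s in scores.items()}
    counts = {name: int(share) for name, share in shares.items()}
    leftover = budget - sum(counts.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1

    plan = {}
    for name, count in counts.items():
        if count == 0:
            strategy = "none"
        elif scores[name] >= active_threshold:
            strategy = "active"   # most critical agents: active replication
        else:
            strategy = "passive"  # less critical agents: passive replication
        plan[name] = (count, strategy)
    return plan


agents = [
    AgentStats("coordinator", messages_received=100, role_weight=1.0),
    AgentStats("planner", messages_received=50, role_weight=0.5),
    AgentStats("logger", messages_received=0, role_weight=0.0),
]
plan = plan_replication(agents, budget=3)
```

Calling `plan_replication` again whenever the observed statistics or roles change would yield a new plan, which is the adaptive part of the approach: the allocation follows the agents' current importance rather than a choice fixed at design time.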

Jacques MALENFANT (Université)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

ANR funding: 124,350 euros
Beginning and duration of the scientific project: - 36 Months
