Blanc SIMI 2 - Sciences de l'information, de la matière et de l'ingénierie : Sciences de l’information, simulation

Motor Adaptive and Cognitive Scaffolding for iCub – MACSi

Motor Affective Cognitive Scaffolding for the iCub

Personal robots are expected to become a major market product in the near future. They will have to be flexible and adaptive because they will have to perform a wide variety of tasks in unpredictable environments. In this context, programming every reaction to every situation in advance is no longer a viable option, in contrast with previous practice in robotics. The alternative is the developmental approach taken in this project, which addresses behaviour learning at its sensori-motor root.

Four complementary challenges

The project is based on four complementary challenges:

- How can a robot learn efficient perceptual representations of its body and of external objects given initially only low-level perceptual capabilities?

- How can a robot learn motor representations and use them to build basic affordant reaching and manipulation skills?

- What guidance heuristics should be used to explore vast sensori-motor spaces in an unknown changing environment?

- How can mechanisms for building efficient representations/abstractions, mechanisms for learning manipulation skills, and guidance mechanisms be integrated in the same experimental robotic architecture and reused for different robots?

The project follows the developmental robotics approach, which consists in endowing the robot with learning capabilities inspired by those of children and letting the robot build on these capabilities to acquire increasingly complex skills for interacting with its environment.

In practice, the technical work carried out within the MACSi project focuses on three domains:

- learning for vision, which consists in endowing the robot with very basic perceptual capabilities and letting it learn how to structure the elementary information thus extracted into ever more complex representations.
As a result, the robot becomes able to recognize diverse objects, its own body and the users around it.

- imitation learning and reinforcement learning mechanisms for the improvement of motor capabilities. Based on these basic machine learning processes, the robot can progressively build on its own an ever richer and more efficient repertoire of motor skills.

- guidance mechanisms for action selection, based on computational models of the intrinsic curiosity of the robot, of the interaction with caregivers and of the choice between curiosity-driven and user-driven selection of action.

The project has generated many intermediate publications at all stages of the progress towards each of the four individual goals. Some of these publications had a significant impact on the international communities in their respective domains.

Finally, all local elements have been integrated with a global computing architecture that encompasses all the capabilities developed by all teams.

In practice, the iCub humanoid robot faces a number of objects in its environment and tries to learn how to recognize them. To do so, it must see these objects from various sides, using many different perspectives.

To see these sides, it has to choose between manipulating the object by itself, using the motor capabilities it is acquiring, or calling upon a caregiver who may show it another side of the object.
We have shown that, through the guidance process, the robot learns to make this choice as a function of the difficulty it has in orienting the object by itself as needed. Furthermore, we have shown that the capability of the robot to learn to recognize the objects is improved by this guidance process.
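As an illustration only, the choice between self-manipulation and calling on a caregiver can be sketched as a selection between two strategies, each scored by the recognition gain it has recently produced. This is a minimal sketch under stated assumptions, not the MACSi architecture: the class `StrategySelector`, the function `simulate`, the strategy names, the sliding window, the epsilon-greedy rule and the numeric gains are all hypothetical.

```python
import random
from collections import deque


class StrategySelector:
    """Choose between exploration strategies from their recent learning gains."""

    def __init__(self, strategies, window=5, epsilon=0.1, seed=0):
        # One sliding window of recent gains per strategy (names are hypothetical).
        self.history = {name: deque(maxlen=window) for name in strategies}
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def progress(self, name):
        # Mean recent gain; an untried strategy scores +inf so it is tried once.
        gains = self.history[name]
        return sum(gains) / len(gains) if gains else float("inf")

    def choose(self):
        # Occasionally pick at random, otherwise pick the most rewarding strategy.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.history))
        return max(self.history, key=self.progress)

    def update(self, name, gain):
        # Record the gain observed after applying the chosen strategy.
        self.history[name].append(gain)


def simulate(expected_gain, steps=50, epsilon=0.1, seed=0):
    """Count how often each strategy is chosen under fixed, made-up gains."""
    selector = StrategySelector(expected_gain, epsilon=epsilon, seed=seed)
    counts = {name: 0 for name in expected_gain}
    for _ in range(steps):
        name = selector.choose()
        selector.update(name, expected_gain[name])
        counts[name] += 1
    return counts
```

Calling `simulate({"manipulate_self": 0.02, "ask_caregiver": 0.10})` models a robot that struggles to orient the object by itself: the selector tends to call on the caregiver. Swapping the two values models a robot that has become skilled at manipulation, and the preference reverses.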

These results have been published in an IEEE Transactions on Autonomous Mental Development paper that synthesizes the integrative work performed in the last part of the project.

Within the project, we developed a set of robust learning mechanisms which constitute a solid foundation for building high-level cognitive skills.
Thus our work opens new perspectives in developmental robotics that are available to the international research community.

More locally, the project was the opportunity to initiate complementary research efforts around the iCub humanoid robot platform, among which:

- the FP7 CODYCO project, which studies the whole-body control of the robot in the context of contacts with the environment, and in which machine learning methods for control are integrated;
- the EDDHI project, financed by the SMART labex project, which studies human-robot interactions in collaboration with experimental psychology partners.

This field only contains publications in journals and major conferences. See the web site for a complete publication list.

Ivaldi, S.; Nguyen, S.M.; Lyubova, N.; Droniou, A.; Padois, V.; Filliat, D.; Oudeyer, P.-Y.; Sigaud, O. (2013) Object learning through active exploration. IEEE Transactions on Autonomous Mental Development.


Droniou, A., Sigaud, O. (2013) Gated autoencoders with tied input weights. Proc. 30th International Conference on Machine Learning, Atlanta, Georgia, USA.

Nguyen, M., Oudeyer, P-Y. (2013) Active Choice of Teachers, Learning Strategies and Goals for a Socially Guided Intrinsic Motivation Learner, Paladyn Journal of Behavioural Robotics.

Stulp, F. & Sigaud, O. (2013). Adaptation de la matrice de covariance pour l’apprentissage par renforcement direct. Revue d'intelligence artificielle - n. 2/2013, p. 243-263.

Ivaldi, S.; Sigaud, O.; Berret, B.; Nori F. (2012). From Humans to Humanoids: the Optimal Control framework. Paladyn. Journal of Behavioral Robotics. DOI: 10.2478/s13230-012-0022-3 Pages 1-17.

Stulp, F. & Sigaud, O. (2012). Path Integral Policy Improvement with Covariance Matrix Adaptation. Proceedings of the 29th International Conference on Machine Learning, Edinburgh, UK.


T. Degris, M. White, R. S. Sutton (2012) Linear Off-Policy Actor-Critic. In Proceedings of the International Conference on Machine Learning.

Oudeyer, P-Y. (2012) GX-29 n'est pas un objet comme les autres, Sciences et Avenir Hors-Série, dec/jan 2011, «Qu'est-ce-que l'homme?».


O. Sigaud, C. Salaün and V. Padois. On-line regression algorithms for learning mechanical models of robots: A survey. Robotics and Autonomous Systems 59 (2011) 1115–1129.

Filliat, D. (2010) Manuel d'éducation des jeunes robots à l'usage de leurs maitres. La Jaune et la Rouge.


For most of the previous century, the majority of robots performed the same manufacturing task again and again in extremely structured environments such as automobile factories. Everything could be envisioned in advance and the achievement of the task could be pre-programmed by the designer. By contrast, the personal robots that represent the future of robotics will have to operate in unpredictable environments such as homes and streets. They will have to achieve a large variety of tasks and adapt to the needs of very different users. In this new context, programming the behaviour of the robot in advance to achieve any task in any context is no longer a viable approach.

An obvious alternative to programming behaviour at the design stage consists in endowing the robot with learning capabilities that let it adapt its behaviour on the fly to the circumstances it experiences. With this goal in mind, research in artificial intelligence, machine learning and pattern recognition has produced a tremendous amount of results and concepts in recent decades. A growing number of learning paradigms (supervised, unsupervised, reinforcement, active, associative, symbolic, neural, situated, hybrid, distributed, and others) have fuelled the development of highly sophisticated algorithms for robotic capabilities such as visual object recognition, speech recognition, walking, grasping and navigation. Yet we are still very far from being able to build robots capable of adapting to the physical and social environment with the flexibility, robustness, and versatility of a one-year-old human child.

Developmental (or epigenetic) robotics is an approach to robotics that tries to endow a robot with the above properties by taking inspiration from developmental psychology and from the developmental processes that take place in children. Within this research agenda, our approach aims at importing some of the principles of infant sensorimotor development into machines to build mechanisms that allow a robot to efficiently learn new basic sensorimotor skills, driven by its own motivations as well as by social incentives and feedback from a guiding human. The challenge is to build robots that can continuously discover, adapt and develop new skills and new knowledge in unknown and changing environments, as human children do.

More precisely, the central target of the MACSi project is to build mechanisms that allow a robot to efficiently develop new basic sensorimotor skills through both autonomous exploration and social interaction with humans in partially unknown environments. Our approach will consist in designing a set of well-identified core capabilities and learning mechanisms that will provide a good starting point on which more complex capabilities can be developed in the future.

In practice, we will implement a scenario in which the iCub robot is seated at a table with a few objects within reach. The robot will typically perform organized motor babbling and explore what it can do with its hands and with objects. A human caregiver will sometimes be in front of the robot, giving feedback on the robot's behaviour and attracting the robot's attention toward particular objects (for example by shaking the object). From these interactions, the robot is expected to build increasingly complex representations of the surrounding world, leading to the construction of basic affordances.

Project coordination

Olivier Sigaud (UNIVERSITE PARIS VI [PIERRE ET MARIE CURIE]) – olivier.sigaud@upmc.fr

The author of this summary is the project coordinator, who is responsible for the content of this summary. The ANR declines any responsibility for its contents.

Partners

UPMC UNIVERSITE PARIS VI [PIERRE ET MARIE CURIE]
INRIA INRIA Centre Bordeaux Sud-Ouest
ENSTA ParisTech ECOLE NATIONALE SUPERIEURE DES TECHNIQUES AVANCEES
GOSTAI GOSTAI SAS
SOFTBANK ROBOTICS EUROPE

ANR funding: 408,718 euros
Beginning and duration of the scientific project: - 36 Months
