BLANC - Blanc 2008

Multistability and Perceptual Grouping in Audition and Speech – Multistap

Submission summary

When dealing with natural scenes, sensory systems have to process an often messy and ambiguous flow of information. A stable perceptual organization must nevertheless be achieved in order to guide behaviour. The neural mechanisms involved can be highlighted by intrinsically ambiguous situations. In such cases, bistable perception may occur: distinct interpretations of the unchanging stimulus alternate spontaneously in the mind of the observer. A typical case is the so-called Necker cube. Multistability has been widely studied in vision and has led to a very rich set of data and theories about perceptual binding, the cognitive formation of objects, attention, decision, and even consciousness. Multistability can also occur in audition, as was recently shown by two partners of the present project (Pressnitzer & Hupé, 2006), but it has almost never been studied in that modality. Yet the potential benefits of considering multistability in the auditory modality are many. On the one hand, a number of results and assumptions about attention, decision, and consciousness are based solely on multistability in the visual modality, so they should be put to further tests in another modality such as audition. On the other hand, multistability is a very interesting paradigm for addressing crucial questions in hearing, such as auditory binding and the formation of auditory streams and objects in the human brain. Introducing audition into the multistability paradigm also makes it possible to extend the paradigm to new questions about multimodality, asking how and where the auditory and visual binding-and-decision processes are coupled in the cognitive architecture. Furthermore, audition brings new questions and dimensions to the binding-and-decision multistability process. Firstly, the temporal dimension, potentially present but not necessary in visual multistability, is fundamental in audition. Auditory multistability makes temporal binding and sequential processing central.
Secondly, sound leads towards speech, which displays a long-known multistability phenomenon, namely 'verbal transformations' (uttering 'life' in a loop makes 'fly' perceptually emerge, then 'life' again, and so on). This brings into the field new questions about the role of schemas and about the link between perception and action, including the multimodal (audiovisual) perspective as well, as has recently been shown by another partner of the present project (Schwartz, Sato et al., 2006, 2007). The MULTISTAP project deals with multistability in audition and speech from a dynamical, multimodal, and perceptuo-motor perspective, combining behavioural, neurophysiological, and modelling approaches to better understand binding and the formation of perceptual objects by the human brain. The project brings together specialists in audition (LNSCC and ENS-DEC) and in speech (GIPSA-Lab), a combination that is surprisingly rare in France. The fourth partner, CERCO, provides the crucial background on the visual and audio-visual literature. MULTISTAP addresses three fundamental questions about human perception, selected to structure the project. (1) Binding mechanisms in audition and speech: binding and multistability are considered here as manifestations of the same core mechanisms concerned with the formation of perceptual objects in the human brain out of complex sensory scenes. With stimuli that have a time dimension, binding is indeed a crucial step for multistability, and multistability in return provides an ideal paradigm for studying binding. (2) Decision mechanisms in audition and speech: the question is to formalize how decisions switch from one percept to another in auditory and audiovisual speech and non-speech sequences. The issue has been extensively studied in vision, which will provide a very useful background. Linking visual and auditory paradigms, we shall study the phenomenology of the decision process. (3) Functional architectures.
The third question deals with the functional architectures compatible with the experimental data on binding and multistability in audition and speech, both already available and to be acquired. Multistability in hearing is a powerful tool for probing the mechanisms of perceptual organization as they unfold over time. To address this question, two approaches will be combined: determining plausible brain architectures with neurocognitive tools, and proposing model architectures for binding and multistability in audition and speech. Since the binding/streaming process is crucial in audio and speech processing, particularly in noisy environments, MULTISTAP also offers a number of potential spin-offs that could be important for applications related to disability and to computer technologies.

Project coordination

Research organisation

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines all responsibility for it.

Partnership

ANR funding: 320,000 euros
Beginning and duration of the scientific project: - 36 months
