CE28 - Cognition, education and lifelong learning

From lip- to script-reading: An integrative view of audio-visual associations in language processing – AVA

The early exposure to speech and speakers’ articulatory gestures is the basis of language acquisition and is a fingerprint of audiovisual association learning. Is this initial ability to associate speech sounds and visual inputs a precursor of infants’ reading ability? Answering this question requires a good understanding of the developmental trajectory, cognitive and neural bases of both language abilities and of how they interact in the language system. This is the aim of the present program.

Investigate the relationships between the two main forms of audio-visual associations in language processing, i.e., “speech sounds-visual articulatory gestures” and “speech sounds-orthography”

The relationship between “lip-reading” and “script-reading” will be investigated in three main aspects:
1) Underlying cognitive and neural mechanisms: Do the association between speech and articulatory gestures and the association between speech and orthography rely on common cognitive and neural mechanisms? If so, what is the nature of those mechanisms?
2) Speech representations: How do the two types of visual information contribute to forming phonological representations?
3) Language development: Is there a developmental link between “lip-reading” and “script-reading”? In other words, is the ability to associate speech sounds with visual articulatory gestures a precursor of the ability to associate speech sounds with visual symbols and, if so, what are the underlying mechanisms?
At the theoretical level, the outcome of the research should lead to a unified framework that explains how multimodal inputs and representations acquired at different stages of language development jointly contribute to forming rich and coherent representations of language. In addition to the theoretical advances in the field of reading and speech processing, a better understanding of the possible links between “lip-reading” and “script-reading” can have important practical implications for spoken and written language learning in both children and adults, as well as for early detection and remediation of reading deficits.

The research program combines behavioral, neurophysiological, and brain-imaging techniques, which provide information on task performance as well as on the underlying cognitive and neural mechanisms.
The data are collected in adult skilled readers and in children of different ages (from 5 to 8 years, corresponding to kindergarten, 1st grade, 2nd grade, and 3rd grade). The data obtained in adults allow us to characterize a speech processing system that has already reached maturity, while those obtained in young children allow us to follow the evolution of the speech system from the time before children have learned to read until they become readers. The relationship between the development of reading skills in young children and their sensitivity to visual information coming from articulatory gestures is investigated using both longitudinal and cross-sectional protocols.

An EEG study was conducted on adult skilled readers to investigate the cognitive and neural processes underlying the associations between speech and articulatory gestures and between speech and orthography. Participants’ brain responses were recorded while they performed speech processing tasks in which a spoken word was presented 1) alone, 2) in synchrony with a static image of the viseme of the word’s initial phoneme, or 3) in synchrony with the first letter of the spoken word. While the viseme allowed an activation of the initial speech sound through the visuo-motor system, the letter did so through an abstract visual symbol. Overall, the neurophysiological marker of audiovisual integration showed that both visual inputs were integrated with speech sounds during both the early stage of phonetic processing and lexical processing. However, in skilled readers, the contribution of the artificial orthographic code to audiovisual integration is stronger than that of the natural articulatory code. The finding suggests that, despite the abstract/artificial nature of the speech sounds-orthography association, once we learn to read, our spoken language system becomes highly sensitive to the orthographic code, and this code plays a strong role in spoken word recognition.
The conclusion of the EEG study described above is consistent with the findings of a study in which we investigated the contribution of orthography and articulatory gestures to the acquisition of L2 speech sounds. In this study, native speakers of French learned minimal pairs of novel English words containing the English /θ/-/f/ phonemic contrast (not present in French) under one of three exposure conditions: 1) the auditory forms of novel words alone, 2) the auditory forms associated with articulatory gestures, or 3) the auditory forms associated with orthography. The benefits of the three methods were compared at different moments. During training, the presence of each type of visual cue allowed a better distinction of the ambiguous L2 speech sounds and thus facilitated novel word learning well beyond the benefit of the auditory input alone. However, these additional benefits did not persist when participants’ L2-sound discrimination and novel-word-learning performance were assessed immediately after training, when the visual cues were no longer presented. Most surprisingly, after a night’s sleep, only participants who had been exposed to orthography during training showed a spontaneous improvement in both L2-sound discrimination and novel-word memorization compared to the previous day, although no additional training was provided between the two days. The findings suggest that being exposed to word spelling led to a better consolidation of lexical knowledge and also reinforced the perceptual ability that allowed participants to better discriminate the non-native phonemic contrast.

The findings reported above in adult skilled readers clearly indicate a strong bias toward higher sensitivity to (and a benefit of) orthography compared to visual articulatory gestures in different speech processing contexts. This bias was observed at the level of both task performance (i.e., the ability to store new words in the mental lexicon and to perceive non-native ambiguous speech sounds) and the cognitive process underlying audiovisual integration. Considering the naturalness of the association between speech sounds and articulatory gestures, their relatively reduced role compared to orthography may seem surprising. Follow-up studies are planned to identify the factors that could explain this observation, such as task demands or interindividual variability in terms of age and reading level.
In addition to these follow-up studies, the developmental trajectory of lip-reading and text-reading ability in young children of different ages (5 to 8 years old) is currently being investigated.

Pattamadilok, C., Welby, P., & Tyler, M.D. (in press). The contribution of visual articulatory gestures and orthography to speech processing: Evidence from novel word learning. Journal of Experimental Psychology: Learning, Memory, and Cognition.

Language processing is a multisensory activity. From very early on, infants are exposed to speech that they naturally associate with visual information from speakers’ articulatory gestures. This early exposure to multisensory inputs is at the basis of language acquisition and provides the first form of audio-visual (AV) communication. Many years later, children learn to associate the same speech with an orthographic code. Unlike the initial association, which relies on the biological link between action and perception, the association between speech and orthography is artificial, arbitrary, and requires years of practice to become automatic. Based on their distinct fundamental properties, “lip-reading” and “script-reading” have been considered to be two cognitive processes accounted for by distinct theoretical models.
The present proposal adopts a novel perspective that seeks to establish the missing link between these “biological” and “artificial” AV associations, with the theoretical aim of elaborating a unified framework explaining how different inputs jointly contribute to forming a coherent representation of speech. A multidisciplinary approach combining knowledge in linguistics, psycholinguistics, and neuroscience will be adopted to investigate the commonalities and dissimilarities between these two forms of AV association in three aspects: 1) their underlying cognitive and neural mechanisms, 2) the nature of speech representations influenced by articulatory gestures and orthographic inputs, and 3) the developmental trajectories of “lip-reading” and “script-reading” abilities, with the ultimate goal of investigating whether the early-developed sensitivity to the association between speech sounds and articulatory gestures is a precursor to a child’s reading ability.
These three aspects will be examined using a unique approach that studies both forms of association within the same individuals and experimental paradigm. Task performance and the associated neural measures of temporal dynamics, functional activity, and functional connectivity will be recorded in adults whose ability to associate speech sounds with articulatory gestures and with orthography has already reached maturity. The developmental link between lip-reading and script-reading abilities will be examined using a longitudinal protocol applied to young children at different stages of reading development. A predictive model of a child’s reading ability measured at the end of the research program will be constructed based on the longitudinal data on his or her sensitivity to articulatory gestures, reading-related skills, and general cognitive and socio-demographic profile.
The final outcome of the program should provide a significant advance in the understanding of how multisensory language inputs are learned, processed, and stored in the brain, and how they affect performance. The novel perspective of a dependency between the development of lip-reading and that of script-reading will have important implications not only for theoretical models of speech and reading development but also for the educational and clinical domains.

Project coordination

Chotiga Pattamadilok (Laboratoire Parole et Langage)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.


Santé Lyon-Est - Louis Léopold Ollier
CNRS DR Rhône-Auvergne Institut des Sciences Cognitives Marc Jeannerod
LPL Laboratoire Parole et Langage

ANR grant: 342,385 euros
Beginning and duration of the scientific project: January 2020 - 48 Months
