Blanc SHS 2 - Humanities and social sciences: Human development and cognition, language and communication 2010

Early language acquisition: experimental and computational approaches – BootLang

Submission summary

Children across all cultures quickly and reliably acquire the language or languages spoken in their environment. Whereas the developmental landmarks of language acquisition during the first years of life have been described in increasing detail, the mechanisms underpinning them remain poorly understood. Indeed, the complexity of the learning problem is daunting, as several levels of linguistic description have to be acquired simultaneously: the sound structure of words and phrases (phonology), the association between the sound pattern of words and their meaning (lexicon), the internal structure of words and their grammatical function (morphology), and the organization of words into sentences (syntax and semantics). Moreover, these levels are heavily interdependent, in that acquiring one level seems impossible unless one or more other levels have already been acquired. These dependencies are the source of a ‘bootstrapping problem’ for any theory of language acquisition: the lexicon is necessary for the acquisition of phonology on the one hand and syntax on the other, while phonology and syntax are in turn necessary for lexical acquisition. Finally, in contrast to learning a second language at school or learning to read, first language acquisition occurs spontaneously, in an unsupervised fashion, through immersion in a linguistic environment.

Our general hypothesis is that language acquisition relies on a global architecture relating a number of specialized representational levels: acoustic representations, phonetic categories, prosodic structure, morphological representations, syntactic categories, etc. In infants, learning at each level relies primarily on regularities that can be observed in the acoustic signal, and is hence imperfect. However, as learning proceeds, interactions between levels enable the child to refine the representations and tune them better to the properties of the language in the environment. Hence, acquisition is first driven by bottom-up processes, and then refined through between-level interactions. Using both computational modeling and behavioral experiments with infants, toddlers, and adults, we explore this two-step bootstrapping process for two linguistic levels for which the early phases of development have been extensively documented: phonology and syntax. More specifically, we focus on the acquisition of phonological and grammatical categories as they emerge during the first two years of life.

Our project is novel in several respects. The overall approach integrates, for the first time, expertise in linguistics, cognitive psychology and computational modeling. The objectives are ambitious (a blueprint of early language acquisition), and the method represents a significant departure from earlier work in two ways. First, instead of testing the learning algorithms on small or artificially simplified corpora, we aim to test them on large-scale databases of real speech signals. Second, in addition to studying individual cues and processing components, we systematically address the multiple interactions among components within an integrated architecture.

Project coordination

Sharon Peperkamp (Ecole Normale Supérieure)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partnership

LSCP Ecole Normale Supérieure

ANR funding: 230,000 euros
Beginning and duration of the scientific project: - 48 Months
