The CODIM project focuses on the two main linguistic resources for organizing monologues or conversations in human languages: discourse markers (DMs; e.g. therefore/donc, well/ben, bon in English/French) and prosody (in particular intonation). It will evaluate their status with respect to two major views of communication: compositionality (the possibility of combining meaningful expressions into more complex meaningful expressions) and pattern- or construction-based approaches (the idea that language users exploit partly ‘frozen’ strings of words). We will compare the semantic and prosodic properties of simple and complex French DMs (e.g. ah + bon) found in corpora of written and spoken French, using complementary approaches to DM identification (category-driven text mining), clustering (statistics and machine learning) and prosody research (ToBI representation, speech analysis/synthesis). This work will foster new collaborations between linguists and computer scientists and reinforce existing ones.
Ms Mathilde DARGNAT (Analyse et Traitement Informatique de la Langue Française)
The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.
LLF Université de Paris
LORIA Institut national de la recherche en informatique et automatique
ATILF Analyse et Traitement Informatique de la Langue Française
ANR funding: 412,459 euros
Start and duration of the scientific project: December 2022, 48 months