CE38 - Interfaces: digital sciences - humanities and social sciences

Compositionality and Discourse Markers – CODIM

Submission summary

The CODIM project focuses on the two main linguistic resources for organizing monologues and conversations in human languages: discourse markers (DMs), such as therefore/donc and well/ben, bon in English/French, and prosody (in particular intonation). It will evaluate their status with respect to two major views on communication: compositionality (the possibility of combining meaningful expressions into more complex meaningful expressions) and pattern-based or construction-based approaches (the idea that language users exploit partly 'frozen' strings of words). We will compare the semantic and prosodic properties of simple and complex French DMs (e.g. ah + bon) found in corpora of written and spoken French, using complementary approaches to DM identification (category-driven text mining), clustering (statistics and machine learning) and prosodic research (ToBI representation, speech analysis and synthesis). This will foster or reinforce strong collaborations between linguists and computer scientists.

Project coordination

Mathilde DARGNAT (Analyse et Traitement Informatique de la Langue Française)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.


LLF Université de Paris
LORIA Institut national de la recherche en informatique et automatique
ATILF Analyse et Traitement Informatique de la Langue Française

ANR funding: 412,459 euros
Beginning and duration of the scientific project: December 2022 - 48 Months
