DS0707 - Human-machine interaction, connected objects, digital content, big data and knowledge

Knowledge extraction from large corpora of human-human conversation data from WEB chat services – DATCHA

Submission summary

The goal of the DATCHA project is to extract knowledge from very large databases of web chat conversations between operators and clients in customer contact centers. Extracting knowledge from chat corpora remains a challenging research problem. Simply applying traditional text mining tools is sub-optimal, because they take into account neither the interaction dimension nor the particular nature of chat language, which shares properties of both spoken and written language. The DATCHA project will address scientific issues including intra-conversation analysis, through a deep semantic analysis (syntactic, semantic, discursive and structural analysis), and inter-conversation analysis (definition of semantic and discursive similarity between conversations). It will propose innovative solutions for various use cases, including analytics report generation, prediction of conversation success based on criteria defined by operational units, and online conversation solving.

Project coordination

FREDERIC BECHET (Laboratoire d'Informatique Fondamentale de Marseille)

The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.


AMU Laboratoire d'Informatique Fondamentale de Marseille
UPS-IRIT Université Toulouse III [Université Paul Sabatier]

ANR funding: 409,806 euros
Start and duration of the scientific project: September 2015, 42 months
