JCJC - SIMI 2 - Computer science and applications

Automatic Understanding of Biomedical Texts for Translational Research – CABeRneT

Submission summary

Much clinical and biomedical knowledge is contained in the text of published articles, Electronic Health Records (EHRs) or online patient forums and is not directly accessible for automatic computation. Natural Language Processing (NLP) techniques have been successfully developed to extract information from text and convert it to machine-readable representations. The most advanced applications have focused on identifying clinically relevant entities and concepts from English text. However, for many biomedical informatics tasks it is necessary to go beyond the identification of isolated instances in single documents – the context of concept occurrences and the nature of the relationships between co-occurring concepts are often crucial for a specific understanding of the analyzed text. Furthermore, while most of the literature is available in English, EHRs in French hospitals are written in French. Therefore, it is important to develop advanced methods for French that will provide structured representations of clinical text compatible with existing representations for English.
This research project will focus on the following aims:
1. Providing French-language material for text analysis in a specialized domain (the biomedical domain)
2. Adapting NLP tools developed for general language to a specialized domain
3. Applying these tools to the automatic detection of links between the clinical characteristics and medical history of patients described in EHRs, predictive biomarkers identified by immunologic or genetic studies, and evidence of such associations reported in the literature
The proposed research is innovative and will provide an in-depth study of multiple types of biomedical text, in French (EHRs) and in English (literature). It will be guided by linguistic principles and by the application to personalized medicine. A comprehensive approach should ensure that the methods used can be generalized to other biomedical applications.

Project coordination

Aurélie Névéol (Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.


LIMSI-CNRS Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

ANR funding: 225,853 euros
Start and duration of the scientific project: August 2013 - 48 months
