Automatic Understanding of Biomedical Texts for Translational Research – CABeRneT
Much clinical and biomedical knowledge is contained in the text of published articles, Electronic Health Records (EHRs) or online patient forums, and is not directly accessible for automatic computation. Natural Language Processing (NLP) techniques have been successfully developed to extract information from text and convert it to machine-readable representations. The most advanced applications have focused on identifying clinically relevant entities and concepts in English text. However, many biomedical informatics tasks require going beyond the identification of isolated instances in single documents – the context in which concepts occur and the nature of the relationships between co-occurring concepts are often crucial for a precise understanding of the analyzed text. Furthermore, while most of the literature is available in English, EHRs in French hospitals are written in French. It is therefore important to develop advanced methods for French that will provide structured representations of clinical text compatible with existing representations for English.
This research project will focus on the following aims:
1. Providing material for text analysis in a specialized domain (i.e. the biomedical domain) in French
2. Adapting NLP tools developed for general language to a specialized domain
3. Applying these tools to the automatic detection of links between the clinical characteristics and medical history of patients described in EHRs, predictive biomarkers identified by immunologic or genetic studies, and evidence of such associations reported in the literature
The proposed research is innovative and will provide an in-depth study of multiple biomedical texts in French (EHRs) and in English (literature). It will be guided by linguistic principles and by the application to personalized medicine. A global approach should ensure that the methods used can be generalized to other biomedical applications.
Project coordination
Aurélie Névéol (Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
Partner
LIMSI-CNRS Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur
ANR funding: 225,853 euros
Beginning and duration of the scientific project: August 2013, 48 months