CE23 - Artificial intelligence and data science

Automatic Simplification of Scientific Texts – SimpleText

Submission summary

Information access systems provide users with key information from reliable sources such as scientific literature; however, non-experts tend to avoid these sources because of their complex language or because they lack the necessary background knowledge. Text simplification removes some of these barriers. SimpleText aims to be a step towards making research truly open, accessible and understandable for everyone, and to help counter fake news based on scientific results (sustainable development goal QUALITY EDUCATION). This is especially important given the explosion of open science during the COVID-19 pandemic. Simplified texts are more accessible to non-native speakers, young readers, and people with reading disabilities or lower levels of education (sustainable development goal REDUCED INEQUALITY). Automatic text simplification could be useful in various domains such as scientific communication, science journalism, politics and education.
SimpleText tackles both technical and evaluation challenges by providing appropriate algorithms, data and benchmarks for text simplification, and aims to answer the following research questions:
RQ1 - Which textual expressions carrying information should be simplified (i.e. which documents and passages should be included in the simplified summary)?
RQ2 - What kind of background information should be provided (i.e. which terms should be contextualised by giving a definition, use case, example, etc.)?
RQ3 - How can the readability of a given short text be improved (e.g. by reducing vocabulary and syntactic complexity) with an acceptable rate of information distortion?
RQ4 - To what extent are the approaches developed for English applicable to French?
We will make the algorithms, collections and evaluation tools openly available to the scientific community (to the extent permitted by third-party copyrights). These resources will be put to use in international evaluation campaigns, e.g. CLEF, as well as in classes on pre-editing, website localisation, technical writing and digital humanities.

Project coordination

Liana ERMAKOVA (Université de Brest (UBO); Laboratoire Histoires et Constructions dans le Texte (HCTI))

The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.

Partners

University of Minho
UBO-HCTI Université de Brest (UBO); Laboratoire Histoires et Constructions dans le Texte (HCTI)
UAPV-LIA Université Avignon et Pays du Vaucluse (UAPV); Laboratoire d'Informatique d'Avignon
UBS-HCTI Université Bretagne Sud (UBS), Laboratoire Héritages et Constructions dans le Texte et l'Image (HCTI)
AMU-LIS Université Aix-Marseille (AMU), Laboratoire Informatique et Systèmes (LIS)
University of Amsterdam

ANR funding: 295,557 euros
Start date and duration of the scientific project: January 2023 - 48 months
