Computational Resources and Processing for Regional Languages – RESTAURE
Research aimed at producing resources and tools for low-resourced languages is currently experiencing a resurgence of interest, in particular through the creation of corpora and lexicons. The ultimate goal is to help preserve and disseminate cultural heritage. The regional languages of France can be considered low-resourced. What all languages with few resources have in common is that their computerisation has low financial profitability, which does not compensate for the considerable development costs. However, endowing these languages with electronic resources (corpora, lexicons, dictionaries) and tools is a major concern for their dissemination, protection and teaching (including for new speakers). In a broader perspective, the diversity of the world's languages would be better preserved, and the amount of data available to researchers in the human and social sciences (linguistics, sociology, anthropology, literature, history, ...) would increase.
The overall objective of the RESTAURE project is to provide computational resources and processing tools for three regional languages of France: Alsatian, Occitan and Picard. To achieve this goal, it will be necessary to develop new computational models suitable for low-resourced and poorly standardized languages. The initial choice of these three languages is motivated by several reasons: they cover various language families, and significant work has been done in the areas covered by the project. It will thus be possible to build upon existing work in order to share the different approaches, experiences and tools developed in previous projects.
Project coordination
Delphine BERNHARD (Linguistique, Langues, Parole - Université de Strasbourg)
The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.
Partnership
LiLPa - UNISTRA Linguistique, Langues, Parole - Université de Strasbourg
CLLE-ERSS Cognition Langues Langage Ergonomie – Équipe de Recherche en Syntaxe et Sémantique
CERCLL - LESCLAP Centre d'études des relations et des contacts linguistiques et littéraires - Laboratoire Linguistique Et Sociolinguistique : Contacts, Lexique, Appropriations, Politiques
LIMSI-CNRS Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur
ANR funding: 394,674 euros
Beginning and duration of the scientific project: September 2014, 42 months