CE23 - Data, Knowledge, Big Data, Multimedia Content, Artificial Intelligence

End-to-End Neural Approaches for Speech Translation – ON-TRAC

Submission summary

The ON-TRAC project proposes to radically change the architectures currently used in speech translation. It relies on end-to-end neural models for machine translation and targets lightweight, portable speech translation applications that Airbus is developing for security operations in theaters of operation.

Beyond the study of end-to-end approaches for language pairs with large-scale training data, ON-TRAC will study the development of models for low-resource oral languages and dialects.
An end-to-end approach to speech translation, as we envision it, would allow us to rethink the data collection methodology used to develop a speech translation system.
Indeed, with this approach, a transcription of the source language becomes unnecessary: the cost of producing the data needed to train a speech translation system is therefore greatly reduced, and the development of such a system for new languages (including those without a writing system) would be facilitated and accelerated.
Since the project targets portable translation applications, ON-TRAC will also study the computation time and memory footprint required for neural speech translation.
ON-TRAC will address three distinct language pairs of increasing difficulty and increasing operational, security and defense interest (English-French, Pashto-French, Tamacheq-French).

The ON-TRAC project is part of Axis 4 "Data, Knowledge, Big Data, Multimedia Content, Artificial Intelligence" of Challenge 7 "Information and Communication Society" of the ANR 2018 Action Plan.
Through its main scientific theme, speech translation with end-to-end neural approaches, it is clearly positioned within the themes "Data to knowledge" and "Processing of multimedia content".

The technologies developed in the ON-TRAC project will be tested on three language pairs, with written French systematically used as the target language.
The first language pair studied will be spoken English to written French, both for simplicity and because English is sufficiently well understood by all project participants to let them observe translation phenomena by analyzing the outputs of our systems.
Pashto will be the source language of the second language pair. This choice is motivated by the project's stated objective of handling an oral dialect and by reduced collection costs, since the consortium already has about 100 hours of audio recordings in Pashto, with their written French translations (as well as their Pashto transcriptions).
Finally, the source language of the third pair will be Tamacheq, an oral dialect spoken by the Tuaregs in several areas of interest for intelligence and security (Sahel, Niger, Mali, Burkina Faso, Libya ...). As such, it is of great interest, and this interest has already been expressed by the relevant State services.

Project coordinator

Mr. Yannick Estève (Laboratoire Informatique d’Avignon)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.


LIA Laboratoire Informatique d’Avignon
UGA Université Grenoble Alpes

ANR funding: 599,999 euros
Start date and duration of the scientific project: - 36 months
