Blanc SIMI 2 - Science informatique et applications

Advanced error analysis for speech recognition – VERA

Submission summary

The proposal aims to develop tools for the diagnosis, localization, and measurement of automatic transcription errors. It is based on a consortium of leading academic actors in this field. The objective is to study the errors in detail (at the perceptual, acoustic-phonetic, lexical, and syntactic levels) in order to provide a precise diagnosis of possible shortcomings of current standard models on certain classes of linguistic phenomena.

At the application level, the proposal is motivated by an observation: many applications for content access in multimedia data are made possible by automatic speech transcription: subtitling of video broadcasts, searching for precise excerpts in audiovisual archives, automated meeting reports, and information extraction and structuring (Speech Analytics) in multimedia content (Web, call centers, ...). However, their large-scale deployment is often slowed by the fact that transcriptions from automatic speech recognition systems contain too many errors. Research and development in speech recognition has so far focused, successfully, on improving the methods and models used in the transcription process, with progress measured by the word error rate; however, beyond a certain performance level, the marginal cost of reducing the residual errors increases exponentially.

Transcription errors thus persist, and are more or less troublesome depending on the application. Information retrieval is tolerant of errors (up to 30%), but systematic errors on certain named entities can be prohibitive. In contrast, subtitling and meeting transcription have very low error tolerance, and even word error rates that are very low compared to the state of the art (below 5%) are too high for end users.
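
As an illustration of the word error rate mentioned above, here is a minimal sketch of how the metric is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a system hypothesis, divided by the number of reference words. This is not the project's own tooling. The function name, the whitespace tokenization, and the sample sentences are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Assumption: words are separated by whitespace, with no text normalization
    (case, punctuation), which real evaluations usually apply first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of six reference words.
rate = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"{rate:.1%}")
```

In this made-up example a single substitution in a six-word reference already gives a rate of about 17%, above the 5% level cited for subtitling. Note that the metric weights all words equally, so it does not capture the point made above that errors on named entities can matter more than others.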

Error processing is not limited to increasing the acceptability of applications based on automatic speech transcription. Error classification, impact measurement through perceptual tests, and error diagnosis of state-of-the-art systems are a crucial first step toward identifying the shortcomings of current models and preparing future generations of automatic speech recognition systems.
Through close cooperation between complementary partners who excel in their fields, the proposal aims to set up an infrastructure for detection, diagnosis, and qualitative measurement that makes it possible to create a virtuous circle of improvement for large and very large vocabulary continuous speech recognition systems.

Project coordination

Yannick Estève (Laboratoire d'Informatique de l'Université du Maine)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partnership

LIMSI Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur
LPP Laboratoire de Phonétique et de Phonologie
LNE Laboratoire National de Métrologie et d'Essais
LIUM Laboratoire d'Informatique de l'Université du Maine

ANR funding: 337,556 euros
Beginning and duration of the scientific project: - 36 months
