DS0901 - Freedom and protection of citizens and residents

Speech And Language technologies for Security Applications – SALSA

Submission summary

The goal of the SALSA project (Speech And Language technologies for Security Applications) is to develop a set of speech and language processing tools specifically designed to assist analysts in processing and exploiting audio data for security purposes, such as judicial, law enforcement and intelligence applications.

The SALSA consortium is composed of four technology and research partners with complementary expertise and excellent track records in their respective fields, and three user partners, who are also members of the advisory board. The coordinator, Vocapia Research (www.vocapia.com), is a software publisher specialized in speech processing. The Centre National de la Recherche Scientifique - Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (CNRS-LIMSI) brings extensive expertise in spoken language processing. The Laboratoire de Phonétique et de Phonologie (LPP), a CNRS-Paris 3 University research unit, brings acoustic-phonetic and articulatory expertise in speech production from a linguistic perspective. The fourth partner, Intelligences, specializes in applying language technologies to judicial investigations. The three user partners are the French Defence Procurement Agency (DGA), the General Directorate of the National Police (DGPN) and the Home Affairs Technologies and Information Systems department (ST(SI)2), which also consolidates the needs of several other agencies (DCPJ, IRCGN, OCRIEST, PP and PTS). These users will ensure that SALSA covers the spectrum of needs that security agencies face when coping with large amounts of speech data.

Filtering information in audio data is critical for national security and the protection of citizens, and is a strategic need for many governmental agencies. The SALSA project will improve on the current state of the art in language technologies in order to develop aids for analysts who are currently unable to process the exponentially growing volumes of data. The main objectives of the project are to enhance the efficiency of analysts while reducing their workload, and to provide support for novel data mining in audio. To do so, the following technological innovations will be explored: new learning methods for spontaneous speech; linguistic investigations of accented speech and speech with code-switching; new decoding methods for transcription and keyword spotting; and novel user interfaces for intelligence analysis of audio data. The three users will be involved in the development loop and in evaluation, ensuring a close match to security analysts' needs.

Project coordination


The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.


INT Intelligences SARL
LPP Laboratoire de Phonétique et de Phonologie
LIMSI Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

ANR funding: 798,738 euros
Beginning and duration of the scientific project: September 2014 - 36 months
