Blanc SIMI 3 - Hardware and software for systems and communications

Reliability in biometric voice identification – FaBiole

Fabiole aims to bring a paradigm shift to the speaker recognition area: from performance (error rates) to reliability.
In terms of applications, this project has considerable potential. First, it will give a clearer view of the possibilities and limits of forensic voice comparison. Second, it will increase the effectiveness of speaker recognition for commercial applications, a growing area, through the robustness of a reliability-based approach.

Biometric voice authentication: an area with large and growing needs

Voice comparison, or more generally biometric voice authentication, attracts strong interest in three different directions.
First, the topic is very present in the forensic field, where needs are growing with the new telecommunication technologies. At the international level, forensic voice comparison is a controversial topic. Since, despite the doubts, a significant number of forensic voice comparison reports are produced each year, the risk of a judicial disaster like the Outreau case is very serious. Introducing automatic approaches in this context offers real hope but, to date, nothing guarantees that the risks and problems will be resolved.
The second area of interest is national security, and more particularly the fight against terrorism. Several private companies have been active for a few years in this area, which appears to be growing very quickly.
Lastly, there is strong interest in voice authentication for electronic transactions in commercial applications. Security of electronic payments and protection of personal information are the two areas seeing the most development.
Previous work has shown that the very good performance obtained by the latest speaker recognition approaches varies a lot from one trial to another. The reasons for this large variation are not explained and cannot be predicted. This point is clearly the main obstacle to the development of real-world applications.
The expected outcomes of Fabiole will offer an answer to this problem, with applications in the areas listed above.

Fabiole aims to move beyond the notion of “performance”, where only error rates are measured without trying to explain them (i.e. without identifying the information present in the signal that explains the results), and to propose the notion of “reliability”, where each speaker-characteristic factor is isolated and quantified before voice authentication is carried out.
This ambition is split into four phases. It requires acquiring knowledge about the idiosyncratic criteria that characterize a speaker, the automatic methods to detect these criteria, and the evaluation approaches themselves (in order to evaluate the benefits in terms of reliability). The final aim is to propose a confidence measure based on this idiosyncratic information.
Fabiole will take advantage of state-of-the-art speaker recognition approaches as well as international evaluation campaigns and protocols.

Different results are expected, corresponding to the four points listed in the previous section.
The first work concerns an evaluation protocol within the reliability paradigm.

The perspectives will be defined later.

The results of Fabiole will be presented in several publications, targeting the main conferences and journals of the field.
The corresponding software will be distributed as free, open-source software.

This project belongs to the field of biometric speech authentication. More precisely, it concerns the value of “phonetic information” in this context.
For about 15 years, automatic speaker recognition systems have been evaluated by NIST through the Speaker Recognition Evaluation (SRE) campaigns. The history of these evaluations shows a significant improvement in performance, which makes it possible to consider numerous applications of speaker recognition, particularly in the forensic area.
The potential consequences of such applications lead us to question the reliability of the performance criteria currently used to evaluate the systems.
Indeed, these performance criteria are estimated globally over a large set of biometric voice tests (voice comparisons). The criteria are mainly a global Equal Error Rate (EER) and a global decision cost obtained with a Decision Cost Function (DCF). These measures are computed as averages over a large set of tests coming from different speakers and conditions. They do not really reflect the practical application context, where a system must return an answer in a specific situation. In such a situation, only two specific recordings, coming from one or two specific speakers and with specific content, are available. More precisely, these measures account neither for the differences between a speech recording and a speaker nor for the differences between speakers. Some recent work demonstrates the limits of these performance measures.
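To make these two global criteria concrete, the sketch below computes an equal error rate and a detection cost from lists of comparison scores. It is a minimal illustration and not the NIST scoring tool: the function names, the toy scores, and the default cost parameters (`c_miss`, `c_fa`, `p_target`) are assumptions chosen for the example, and the cost is the commonly used weighted sum of the miss and false-alarm rates.

```python
def error_rates(target_scores, nontarget_scores, threshold):
    """Return (miss rate, false-alarm rate) when a trial is accepted if score >= threshold."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return p_miss, p_fa


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping the observed scores as thresholds and
    keeping the point where the miss and false-alarm rates are closest."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    p_miss, p_fa = min(
        (error_rates(target_scores, nontarget_scores, t) for t in thresholds),
        key=lambda rates: abs(rates[0] - rates[1]),
    )
    return (p_miss + p_fa) / 2


def detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=1.0, p_target=0.5):
    """Weighted sum of the two error rates; the default weights are illustrative only."""
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


# Toy scores (hypothetical): higher means "more likely the same speaker".
targets = [2.0, 1.5, 0.2, 1.0]        # same-speaker comparisons
nontargets = [-1.0, 0.5, -0.3, 0.1]   # different-speaker comparisons

eer = equal_error_rate(targets, nontargets)
p_miss, p_fa = error_rates(targets, nontargets, threshold=0.5)
print("EER:", eer)
print("Detection cost at threshold 0.5:", detection_cost(p_miss, p_fa))
```

Both numbers are pooled over every trial in the set, so they say nothing about how trustworthy any single comparison between two given recordings is.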
This project has two objectives. First, the acoustic and phonetic factors linked to inter-speaker variability will be studied. The aim is no longer to report a global performance but to highlight the share of inter-speaker variability explained by each of these acoustic/phonetic factors. Second, these results will be used to propose a confidence measure dedicated to biometric voice authentication/comparison. This confidence measure will work only on the two speech recordings involved in a given voice comparison, independently of the speaker recognition approach. Basically, the speaker-characteristic information contained in the two recordings will be quantified and its coherence estimated in order to compute the confidence measure.
Even though this project will provide very useful information on speaker-specific characteristics, its main objective is to push the biometric voice comparison domain to take a big step toward reliability evaluation, instead of remaining in the paradigm of performance evaluation.
The role of automatic speaker recognition systems will change after this project. Today, they have to make a decision and also evaluate the confidence in that decision. Afterwards, they will only have to provide the binary decision, while the reliability of that decision will be provided by the outcomes of this project.
The potential real-world impact of this project is very large. The results will allow a better understanding of the potential and limits of forensic voice comparison, and will also drastically increase the reliability of speaker recognition applications in the commercial field. Because the confidence measure is independent of the speaker recognition system itself, the results of this project could be integrated into virtually all existing applications.

Project coordination

Jean-François BONASTRE (Laboratoire Informatique d'Avignon) – jean-francois.bonastre@univ-avignon.fr

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

LIG Laboratoire d'Informatique de Grenoble
LNE Laboratoire National de Métrologie et d'Essais
LIA Laboratoire Informatique d'Avignon

ANR funding: 282,000 euros
Beginning and duration of the scientific project: February 2013 - 36 Months
