ANR-JST CREST IS - French-Japanese call for projects: Symbiotic Interaction

Speaker identity cloning and protection – VoicePersonae

Submission summary

Due to recent advances in speech and language processing capabilities, humans can today interact with intelligent technology by using their voice. The number of voice-enabled devices is growing exponentially and consumers are adapting quickly to using their voice as a natural means of interaction. Today’s voice interfaces can produce synthetic speech of such quality that it is nearly indistinguishable from human speech. They are also capable of responding to requests or commands issued in the form of entirely natural, conversational speech and can even recognise or identify the user from their speech alone.
This project concerns the components of voice interfaces that relate to, or impact upon, the notion of speaker/voice identity. They include speech generation and speaker recognition technologies. Speech generation technologies are the components of a voice interface that aim to produce a natural human voice. They include speech synthesis and voice conversion technologies, both of which have the capacity to produce speech signals that are representative of a specific speaker identity. Speaker recognition technologies are the components of a voice interface that aim to determine or verify the identity of a human speaker.
In some respects, speech generation and speaker recognition technologies have potentially conflicting objectives. Speech generation technologies aim to produce human speech artificially, whereas speaker recognition technologies aim to verify the authenticity of human speech and a claimed identity. Speaker recognition systems may thus be used to help train speech generation systems. As a consequence, artificially generated speech has the potential to fool recognition systems. Herein lies the first conflict. A second conflict stems from the use of speaker recognition technologies when a speaker may not wish to be identified or tracked. To protect the right to anonymity, de-identification solutions are required in order to suppress identity information from speech while retaining linguistic information (the message). The three aspects of speaker/voice identity, namely speech generation, speaker recognition and privacy, are thus closely intertwined.
The VoicePersonae project will bridge the technical gap between the fields linked to voice identities and will (a) advance speaker identity modelling, (b) improve the security and robustness of biometric speaker recognition and (c) invent new solutions to preserve speaker privacy. For the accurate modelling of voice identities, required for new applications such as personal avatars and in the health domain, VoicePersonae will unify several classical multi-speaker speech generation tasks, namely multi-speaker speech synthesis, voice conversion and speech enhancement. VoicePersonae will harness speaker recognition technologies in order to achieve this goal. To improve the robustness of speaker recognition to the security threats presented by advances in speech generation, VoicePersonae will also deliver advances in anti-spoofing. This work will be undertaken on the assumption that fraudsters are aware of anti-spoofing technologies and hence attempt to spoof not only biometric recognition systems but also anti-spoofing systems. Finally, VoicePersonae will deliver speaker anonymization capabilities in order to provide for speaker privacy. To fuel progress in this area, VoicePersonae will organise the first speech anonymization and re-identification challenge.

Project coordination

Jean-François BONASTRE (Laboratoire d'Informatique d'Avignon)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

EURECOM EURECOM
NII National Institute of Informatics
LIA Laboratoire d'Informatique d'Avignon

ANR funding: 495,760 euros
Beginning and duration of the scientific project: January 2019 - 60 Months