CE33 - Interaction, robotics 2023

Silent Pitch – SilentPitch

Submission summary

Speech prosody, which includes intonation, rhythm and voice timbre, carries both expressive and speech-structuring information. Pathologies (e.g. throat cancer) that affect vocal fold vibration deprive patients of control over their intonation and thus severely impact their intelligibility and social interactions. In such cases, a central aspect of speech rehabilitation is the prediction of intonation from other channels. Intonation encodes multiple streams of information, such as delimitative cues that segment discourse, focalisation on a constituent of the utterance, or melodic modulation that expresses a social attitude. Several studies have also demonstrated a strong correlation between intonational variations and speech co-occurring gestures such as lip, tongue, eyebrow, head or hand movements. Moreover, these gestures are continuously adapted to the communication situation, as well as to the speaker's own auditory feedback. Given these considerations, the goal of this project is to study the extent to which intonation can be automatically predicted and controlled from speech co-occurring gestures (orofacial or hand), by combining two approaches: 1) we will consider each prosodic function (delimitative, focalising, expressive) separately by associating them with different gesture channels; 2) automatic prediction of intonation from orofacial and hand gestures will be integrated into a speech rehabilitation system that converts whisper into speech in real time. This will make it possible to quantify the impact of the automatic intonation prediction system on speech co-occurring gestures, and the user's ability to adapt them to obtain the best compromise between intelligibility and quality of prosody reconstruction.

Project coordination

Olivier Perrotin (Grenoble Images Parole Signal Automatique)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partnership

University College London
GIPSA-lab Grenoble Images Parole Signal Automatique

ANR funding: 314,772 euros
Start and duration of the scientific project: December 2023, 48 months
