Distant speech communication with heterogeneous unconstrained microphone arrays – DiSCogs
DiSCogs aims to solve fundamental sound processing problems in order to place speech at the center of a new hands-free and flexible communication experience, exploiting the many microphone-equipped devices that populate our everyday life. I propose to apply machine learning methods based on deep learning to some of the challenges posed by such heterogeneous microphone arrays. In particular, I propose to recast the problem of synchronizing devices at the signal level as a multi-view learning problem that aims to extract complementary information from the devices at hand. I also propose to explore new learning methodologies, inspired by knowledge distillation and adaptation, to circumvent data annotation and to train noise-robust acoustic models directly from the multichannel signals. The approaches developed during DiSCogs will be evaluated in the "smart apartment" platform at Loria.
Mr. Romain SERIZEL (Laboratoire lorrain de recherche en informatique et ses applications)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
LORIA Laboratoire lorrain de recherche en informatique et ses applications
ANR funding: 284,853 euros
Start date and duration of the scientific project: August 2018, 42 months