Artificial Intelligence applied to augmented acoustic scenes – HAIKUS
Audition is a key modality for understanding and interacting with our spatial environment, and it plays a major role in Augmented Reality (AR) applications. The HAIKUS project investigates the use of Artificial Intelligence (AI) for synthesising augmented acoustic scenes. Embedding computer-generated or pre-recorded auditory content into a user's real acoustic environment creates an engaging and interactive experience that can be applied to video games, museum guides or radio plays. Audio signal processing tools for real-time 3D sound spatialisation and artificial reverberation are now mature and can handle both multichannel loudspeaker systems and binaural rendering over headphones. However, the seamless and congruent integration of computer-generated and pre-recorded objects within a live context remains challenging: the rendering of each virtual object must be adapted automatically to the acoustic properties of the user’s real environment.
Among the different subcategories of AI, machine learning (ML) is well suited to audio processing in virtual and augmented reality applications. ML has shown strong potential for solving complex acoustic problems such as sound source localisation and source separation. In the HAIKUS project, ML is applied to the identification and manipulation of the acoustic channels between the sources and the listener. The three main objectives of the project are (a) the blind estimation of room acoustic parameters and/or the room geometry from the observed reverberant audio signals originating from live sounds occurring in the room, (b) the inference of plausible rules to modify the spatialisation parameters and methods to interpolate between room impulse responses according to the movement of the listener, and (c) the blind estimation of the listener's head-related transfer functions (HRTFs) from binaural signals captured in a real environment with in-ear microphones. All three objectives benefit from the mobility of the listener, which allows knowledge about the acoustic environment to be accumulated gradually.
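To make "room acoustic parameter" concrete, the sketch below computes one classic example, the reverberation time RT60, in the easy non-blind case where the room impulse response is already known. It uses Schroeder backward integration on a toy impulse response (exponentially decaying noise). This is only a reference point: objective (a) targets the harder blind case where no impulse response is available, and nothing here reflects the project's actual methods. The function names, sample rate and fitting range are assumptions made for illustration.

```python
import numpy as np

def synthetic_rir(rt60, fs=16000, duration=1.0, seed=0):
    """Toy impulse response (assumption): white noise whose amplitude
    envelope has dropped by 60 dB at t = rt60 seconds."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * duration)) / fs
    envelope = 10.0 ** (-3.0 * t / rt60)  # -60 dB in amplitude at t = rt60
    return rng.standard_normal(t.size) * envelope

def schroeder_rt60(rir, fs, fit_range_db=(-5.0, -25.0)):
    """Estimate RT60 from a known impulse response.

    The energy decay curve (EDC) is the backward-integrated squared
    response. A line is fitted to the EDC in dB over fit_range_db and
    extrapolated to a 60 dB decay.
    """
    energy = rir ** 2
    edc = np.cumsum(energy[::-1])[::-1]      # backward integration
    edc_db = 10.0 * np.log10(edc / edc[0])   # normalised to 0 dB at t = 0
    upper, lower = fit_range_db
    mask = (edc_db <= upper) & (edc_db >= lower)
    t = np.arange(rir.size) / fs
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # slope in dB per second
    return -60.0 / slope

rir = synthetic_rir(rt60=0.5)
estimate = schroeder_rt60(rir, fs=16000)
print(f"estimated RT60: {estimate:.2f} s")
```

The estimate should land close to the 0.5 s used to generate the toy response; with real measurements, the noise floor limits the usable fitting range.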
The HAIKUS project brings together three research teams with complementary expertise in signal processing, machine learning, acoustics, and audio technology. The general methodology combines statistical methods, acoustic modelling, and machine learning. The scientific programme is structured around the three main objectives. Each objective requires the development of statistical deep regression methods that map audio features extracted from observed signals to the acoustic parameters to be estimated. Each objective tackles this problem from a different perspective, i.e. with different input and output features, and different assumptions about the known and unknown variables. Learning the mapping between the observed audio features and the target acoustic parameters requires the creation of dedicated audio datasets, built either from numerical modelling or from real-world recordings.
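The general pattern described above (simulate a dataset, extract audio features, learn a regression to an acoustic parameter) can be illustrated with a deliberately small sketch. It is not the project's method: a plain linear least-squares fit stands in for deep regression, the "observed signals" are toy decaying noise with an unknown gain, and the features are frame-wise log energies. All names, ranges and constants below are assumptions chosen for illustration.

```python
import numpy as np

FS = 16000      # sample rate in Hz (assumed)
FRAME = 320     # 20 ms analysis frames
N_FRAMES = 20   # 0.4 s of signal per example

def toy_example(rt60, rng):
    """Decaying noise with a random gain, standing in for a reverberant tail."""
    n = FRAME * N_FRAMES
    t = np.arange(n) / FS
    gain = 10.0 ** rng.uniform(-1.0, 1.0)  # level is unknown to the estimator
    return gain * rng.standard_normal(n) * 10.0 ** (-3.0 * t / rt60)

def features(signal):
    """Frame-wise log energy in dB (the input features)."""
    frames = signal.reshape(N_FRAMES, FRAME)
    return 10.0 * np.log10(np.mean(frames ** 2, axis=1))

def make_dataset(n_examples, rng):
    """Simulated dataset: features X, decay rate y in dB/s, true RT60."""
    rt60 = rng.uniform(0.3, 1.0, size=n_examples)
    X = np.stack([features(toy_example(r, rng)) for r in rt60])
    return X, 60.0 / rt60, rt60

def add_bias(X):
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_linear(X, y):
    """Least-squares regression weights (stand-in for a deep regressor)."""
    w, *_ = np.linalg.lstsq(add_bias(X), y, rcond=None)
    return w

def predict_rt60(w, X):
    """Predict the decay rate, then convert it back to RT60 in seconds."""
    return 60.0 / (add_bias(X) @ w)

rng = np.random.default_rng(0)
X_train, y_train, _ = make_dataset(500, rng)
X_test, _, rt60_test = make_dataset(100, rng)

w = fit_linear(X_train, y_train)
rt60_pred = predict_rt60(w, X_test)
mean_abs_error = float(np.mean(np.abs(rt60_pred - rt60_test)))
print(f"mean absolute RT60 error on held-out toy data: {mean_abs_error:.3f} s")
```

Regressing the decay rate rather than RT60 itself keeps the target linear in the log-energy features, which is why a linear model suffices on this toy data. Real reverberant speech or music breaks that assumption, which is one reason nonlinear (deep) regressors and realistic datasets are needed.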
The scientific results will be disseminated in journals and conferences in the fields of signal processing, acoustics and audio. Beyond the theoretical results, practical outcomes will include the development of a high-order spherical microphone array. In the spirit of open research, the generated or collected audio datasets will be made publicly available to serve the scientific community. Given the increasing interest in applications of machine learning and auditory scene analysis, two workshops will be organised during the project. The workshops will address both the scientific community and companies involved in audio augmented reality (AAR) research and development, as well as those from other potential application domains such as audio/video gaming, cultural heritage, professional audio production, and broadcasting. Work dedicated to the personalisation of HRTFs using binaural recordings should lead to an original web-based solution for personalised binaural rendering accessible to any consumer.
Mr Olivier Warusfel (INST RECH COORD ACOUSTIQ MUSIQ)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
INRIA NGE Centre de Recherche Inria Nancy - Grand Est
d'Alembert Institut Jean le rond d'Alembert
IRCAM INST RECH COORD ACOUSTIQ MUSIQ
ANR funding: 630,546 euros
Beginning and duration of the scientific project: December 2019 - 42 Months