DS07 - Société de l'information et de la communication

Multivariate Analysis and Knowledge Inference for Musical Orchestration – MAKIMOno

Understanding the Perception of Orchestral Sound Mixtures Using AI

This project paves the way for a whole new research field on the perception and learning of sound mixtures, which was still an entirely unexplored topic. The impact of this project can be directly evaluated through its numerous outputs and the several prizes obtained in a variety of application domains (scientific and artistic alike). Furthermore, practical outputs such as first-of-their-kind hardware and software were directly produced and are already distributed freely, along with a unique knowledge database.

A scientific treatise on orchestration to push the limits of musical creativity through artificial intelligence

This project scientifically addresses the use of timbre to shape music through sound mixtures and orchestration. This aims to create cutting-edge technologies for human interaction with digital media that will change our understanding of music, provide innovative and accessible tools for the creation of musical content and lead to a better understanding of perceptual principles underlying the practice of orchestration. The project also aims to create new techniques of effective analysis, machine learning and interaction applicable to other scientific fields. These techniques include artificial intelligence of musical signals and deep representational learning that will automatically decipher the structure of multimodal representations (musical symbols, acoustic properties, perceptual outcomes) to provide optimal descriptors for understanding the principles of orchestration. Relying on both solid perceptual principles and empirically characterized orchestration examples, it further aims to build a scientifically grounded theory of musical orchestration from a large multimodal database created during the research.

This project targets the emergent properties of instrumental mixtures through multimodal information linking spectral (mathematical properties), symbolic (musical writing) and perceptual (timbre) aspects. To understand orchestral effects, the project develops innovative machine learning algorithms that target perception as well as the understanding of the underlying pedagogical knowledge. The project therefore aims to:
• model human perception through optimal acoustic and symbolic representations of perceptual effects in music;
• design new representation learning algorithms to link the signal-symbol interactions underlying sound mixtures;
• enable semantic inference and metric relationships between these different modalities, thus allowing innovative methods of musical generation and creativity;
• validate the spaces and knowledge extracted by carrying out perceptual experiments, but also educational and compositional applications.
By closing the loop between perceptual effects, analyses and perceptual validation of AI algorithms, this project provides generic methods widely applicable to other scientific fields.

This project allowed the creation of the first perceptual orchestration database (Orchard), the first system to democratize synthesizers (FlowSynth) and the first synthesizer to embed deep AI synthesis (Neurorack). This research led to the creation of the first international consortium on orchestration (ACTOR), with 19 partners in 9 countries ($1.5 million over 7 years), as well as funding from ANR MERCI (€400k), Sorbonnes ACIMO (€100k) and Emergence(s) ACIDITEAM (€500k). Our work has also received awards on numerous occasions, in both scientific and musical fields. We received the international Ars Electronica prize for the piece Convergence (Alexander Schubert), the prize for the best presentation at ISMIR 2018 (P. Esling) and the prize for the best internship from the Polytechnic school (C. Tabary). Finally, the project's technologies were the subject of a special program on ARTE, which aired on October 23 in the Xenius series.

The longer-term research horizon is directly reflected in our new large-scale collaboration network, which will continue along the path opened by this project. Potential longer-term economic by-products of this project include integration of the scholarly results and technological developments into knowledge databases, an online treatise, and programs for teaching orchestration complemented by a large repertoire of rendered orchestration examples. Such tools could become the gold standard in orchestration pedagogy, and would be available and affordable for music schools, teachers and composers around the world. Furthermore, the research in theoretical machine learning for artificial creativity led to the development of several cutting-edge technologies. Specifically, a whole new category of embedded instruments relying on deep learning has been developed through this project, and these instruments have already been used in unique musical pieces. These technologies are also broadly applicable to other fields and have received proposals for commercialization. In summary, this project will provide many significant opportunities for intellectual, cultural, societal and economic contributions of social sciences and humanities research in strong interplay with other disciplines in the fine arts and engineering.

The project is distinguished not only by the very large quantity of its outputs, but above all by their thematic diversity, ranging from publications in sociology to deep learning technologies embedded in electronics, and including concerts for the general public and targeted interventions for children. Thus, the project produced 5 unique technologies, 12 journal articles and 46 international conference papers, along with a large number of invited or public talks open to all within the framework of international festivals. Finally, many musical pieces have been produced thanks to this project.

Musical orchestration is the subtle art of writing musical pieces for orchestra, by combining instruments to achieve a particular sonic goal. This complex skill has an enormous impact on classical, popular, film and television music, and video game development. For centuries and up to this day, orchestration has been transmitted empirically and a true scientific theory of orchestration has never emerged. Orchestration pedagogy has focused solely on describing how composers have scored instruments in different pieces rather than understanding why they made such choices. This needs to be addressed rationally and scientifically, even though the obstacles this analysis and formalization must surmount are tremendous. This project aims to create the first partnership towards the long-term goal of a true scientific theory of orchestration by coalescing the domains of computer science, artificial intelligence, experimental psychology, digital signal processing, computational audition, and music analysis. To achieve this aim, the project will exploit a large number of orchestral pieces in digital form for both symbolic scores and multi-track acoustic renderings of independent instrumental tracks that are currently collected through a leveraged research project, for which partners of this project are also collaborators. These excerpts are currently being annotated by panels of experts, in terms of the occurrence of given perceptual orchestral effects. This leads to an unprecedented library of orchestral knowledge, readily available in both symbolic and signal formats for data mining and deep learning approaches. We intend to harness this unique source of knowledge, by first evaluating the optimal representations for symbolic scores and audio recordings of orchestration, by assessing their predictive capabilities on given perceptual effects. 
Then, we will develop novel models of learning and knowledge extraction based on the revealed representations of perceptual effects in orchestration. To this end, we will develop deep learning methods able to link musical signal, symbolic score information, and perceptual analyses by targeting multimodal embedding spaces (transforming multiple sources of information into a unified coordinate system). These spaces can provide metric relationships between modalities that can be exploited for both automatic generation and knowledge extraction. The results from the models will then feed back to and be validated through extensive perceptual studies. These will enhance the organization of the knowledge database, as well as the software of our collaborating industrial partner OrchPlayMusic. By closing the loop between perceptual effects and learning, while validating the higher-level knowledge that will be extracted, this project will revolutionize creative approaches to orchestration and its pedagogy. The predicted outputs include the development of technological tools for the automatic analysis of musical scores, for predicting the perceptual results of combining multiple musical sources, as well as the development of digital media environments for orchestration pedagogy, computer-aided orchestration and instrumental performance in simulated ensembles. The developed learning algorithms will also be broadly applicable to multiple fields with both signal and symbol information such as Music Information Retrieval (MIR) tasks, search engines, and speech recognition. This project will also involve non-academic artistic communities and private partners to evaluate directly the creative and pedagogical applications of scientific results.
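The idea of a multimodal embedding space mentioned above can be illustrated with a minimal sketch. It is not the project's method: the projection matrices below are hand-set, hypothetical values (a real system would learn them with deep networks), and the names W_AUDIO, W_SCORE, project, distance and nearest_score are assumptions introduced only for this illustration. The sketch maps an audio descriptor vector and a symbolic descriptor vector into the same 2-D coordinate system, then uses the distance in that space as a metric relationship between the two modalities.

```python
import math

# Hypothetical, hand-set projection matrices (assumption: a real system
# would learn these). Each maps modality-specific features into the same
# 2-D shared space.
W_AUDIO = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]   # 3 audio descriptors -> 2-D
W_SCORE = [[0.5, 0.5],
           [0.0, 1.0]]        # 2 symbolic descriptors -> 2-D


def project(weights, features):
    """Apply a linear map: one dot product per output dimension."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def distance(a, b):
    """Euclidean distance between two points of the shared space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_score(audio_features, score_library):
    """Return the name of the score excerpt closest to the audio input.

    Both modalities are projected into the shared space first, so the
    comparison is made in a single coordinate system.
    """
    query = project(W_AUDIO, audio_features)
    return min(
        score_library,
        key=lambda name: distance(query, project(W_SCORE, score_library[name])),
    )
```

Calling nearest_score with an audio descriptor vector and a dictionary of named symbolic descriptor vectors returns the name of the closest excerpt in the shared space. The same distance could in principle be read in the other direction, from score to audio, which is what makes a shared space usable for both retrieval and generation.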

Project coordination

Carlos Agon (Institut de recherche et coordination acoustique/ musique IRCAM - UMR STMS)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.


Ecole de musique Schulich
IRCAM Institut de recherche et coordination acoustique/ musique IRCAM - UMR STMS

ANR funding: 378,126 euros
Start and duration of the scientific project: November 2017, 36 months
