CE38 - Digital revolution: relationships to knowledge and culture

Gender Equality Monitor – GEM

Submission summary

The GEM project aims to automatically describe differences in the representation and treatment of women and men in French-language media, including TV, radio, newspapers and collections of song lyrics.
The ambition of this project is to analyze several million documents sampled over a period of more than 80 years, to produce the largest study on the representation of men and women in the media.

The project is supported by a transdisciplinary consortium composed of two major audiovisual actors (INA, Deezer), two STEM laboratories specialized in automatic information extraction from text and speech (LIUM, LIMSI) and three humanities laboratories specialized in the study of gender and media (CARISM, LERASS, ENS LYON). It also has an expert committee, composed of the CSA, the DEPS, and Radio France, that is interested in the industrial and social outcomes of the project.

The proposed approach for describing gender representation differences is based on three complementary lines of work:

The first line of work is to formalize descriptors relevant to quantifying differences in representation. This work will be carried out through qualitative analyses of several thematic corpora: gender-related incivilities, borderline cases of vocal binarity, media treatment of feminist movements, the figure of the anonymous woman in the public space, and the place granted to the body. A strength of the project lies in the complementarity of the methodologies used to analyze common material: discourse analysis, case studies, reception studies, interviews with actors, and prosodic analysis of speakers' performances.

The second line of work consists in implementing the defined descriptors using information extraction methods based on the automatic processing of written and spoken language, speaker and singer characterization, and face categorization. The questions raised by the GEM project help orient research aimed at improving core technologies, especially end-to-end extraction of semantic information directly from the audio signal, and the control of biases (e.g., gender bias) learned by automatic models, which propagate the stereotypes conveyed by their training data.

The third, cross-cutting line of work consists in carrying out quantitative studies that exploit the automatically obtained descriptors, through several phases of requirements gathering and evaluation through use. It involves a number of technological issues, including the ability to process and exploit massive volumes of data. It also addresses theoretical issues: exploiting this unprecedented mass of data will contribute not only to creating new knowledge in the human sciences, but also to raising new questions that call for qualitative studies. In particular, studying the borderline cases output by classification algorithms should make it possible to question the criteria used for classification, by relating the observed cases to the interpretation that will have to be made of them.

This project addresses scientific issues, in both the humanities and STEM, as well as industrial issues (automatic estimation of gender representation in broadcasts, exploration of digital collections) and societal issues (impact of public equality policies, objective measurement of differences in treatment likely to shed light on the public debate). The first results obtained by INA, based on the analysis of the speaking time of men and women, suggest that the proposed project could have a strong social and media impact, and that it is consistent with citizens' concerns and contemporary demands in the area of equality.
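
As an illustration of the kind of descriptor mentioned above, the share of speaking time by gender can be sketched as follows. This is a minimal sketch, not the project's actual pipeline: it assumes that an upstream tool has already labeled each audio segment, and the function name `speaking_time_share`, the labels `"female"`, `"male"` and `"music"`, and the sample segments are all hypothetical.

```python
def speaking_time_share(segments):
    """Return each gender's share of total speech time.

    `segments` is an iterable of (label, start_sec, end_sec) tuples.
    The labels "female" and "male" are assumed for illustration;
    any other label (music, noise, ...) is ignored.
    """
    totals = {"female": 0.0, "male": 0.0}
    for label, start, end in segments:
        if label in totals:
            totals[label] += end - start

    speech = sum(totals.values())
    if speech == 0:
        return None  # no speech detected, shares are undefined
    return {label: duration / speech for label, duration in totals.items()}


# Hypothetical one-minute broadcast excerpt.
segments = [
    ("male", 0.0, 30.0),
    ("music", 30.0, 40.0),
    ("female", 40.0, 50.0),
    ("male", 50.0, 60.0),
]
shares = speaking_time_share(segments)
print(shares)
```

Non-speech segments are excluded from the denominator, so the two shares sum to one; whether this is the right normalization depends on the question being studied.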

Project coordination

David Doukhan (Institut national de l'audiovisuel / D3I / DRIN / Service de la Recherche)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.


INA Institut national de l'audiovisuel / D3I / DRIN / Service de la Recherche
LIMSI Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

ANR funding: 799,956 euros
Beginning and duration of the scientific project: December 2019 - 42 Months
