CE38 - Digital revolution: relationships to knowledge and culture

Computer Vision and Historical Analysis of Scientific Illustration Circulation – VHS

Computer vision and historical analysis of the circulation of scientific knowledge through illustrations

VHS brings together researchers in history of science and computer vision to design a novel tool for the historical study of scientific knowledge circulation. The aim is to develop unsupervised or weakly supervised learning methods to study the evolution and transformation of images in large-scale scientific corpora from the Middle Ages and the modern period.

Designing deep learning analysis methods adapted to ancient illustrated corpora, to better understand the circulation of scientific knowledge through images in history

Illustrations and their evolution in the scientific corpora of the Middle Ages and of modern Western cultures have only been partially studied. This is notably because the role and status of the image in the construction and dissemination of scientific knowledge raise complex questions that remain historically difficult to grasp and for which suitable analysis tools are lacking. To fill this gap and renew the methods for studying the evolution of visual scientific knowledge throughout history, VHS closely associates two approaches: on the one hand, a historical approach that sees the image not as a closed and isolated entity but as an essential vector in the transmission of scientific knowledge; on the other hand, the development of automated methods for analyzing similarities and contents in medieval and modern illustrated corpora with little or no annotation.
More precisely, our approach consists in designing Computer Vision-based methods to detect iconographic similarities (images that were copied from or partially inspired by each other) and textual similarities (images that illustrate similar textual content but may be visually different) between illustrations in large corpora. One of the main objectives is to obtain such associations automatically, relying as little as possible on expert annotations, which are expensive and complicated to obtain. We hypothesize that such methods will make it possible to compare and connect images far more effectively, as they will provide, on the one hand, new groupings of images according to their content and the texts they illustrate and, on the other hand, fine distinctions between the various modes of representation by the image.
The historical analysis of these new groupings of images, of the different types of similarities and differences they contain (between images, and between texts and images) and of their evolution over time will provide new information on how scientific illustrations circulated (implicit or explicit borrowing processes, modes of production and diffusion, readers, etc.). It will lead to the discovery of new iconographic circulation networks and, in doing so, open up new work and new lines of study on the role of the image in the construction and diffusion of scientific knowledge.
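As a rough illustration of what detecting iconographic similarities can look like in practice, the sketch below ranks illustrations by the cosine similarity of their feature vectors. It is only a minimal sketch under stated assumptions: the summary does not specify how images are represented or compared, so the descriptors, the identifiers and the function names (`cosine_similarity`, `rank_similar`) are all hypothetical, and real descriptors would come from a learned model rather than being written by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query_id, descriptors, top_k=3):
    """Return up to top_k (id, score) pairs, most similar to query_id first."""
    query = descriptors[query_id]
    scores = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in descriptors.items()
        if other_id != query_id
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Toy, hand-made descriptors (hypothetical identifiers and values).
descriptors = {
    "ms_a_fol12": [0.9, 0.1, 0.0],
    "ms_b_fol03": [0.8, 0.2, 0.1],
    "print_c_p45": [0.0, 0.1, 0.9],
}

ranking = rank_similar("ms_a_fol12", descriptors, top_k=2)
for other_id, score in ranking:
    print(f"{other_id}: {score:.3f}")
```

A ranking of this kind only proposes candidate pairs; deciding whether two images are actually related by copying or borrowing remains a matter for historical analysis.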

The VHS project proposes a new approach to the historical study of the circulation of scientific knowledge based on new methods of illustration analysis. Building on recent developments in AI and Computer Vision, the aim is to design a new analysis tool based on the study of the evolution and transformation of images in illustrated scientific corpora from the Middle Ages and the modern period. To this end, we will develop unsupervised or weakly supervised learning methods that will allow us to carry out large-scale automatic searches adapted to these corpora, based on the detection of iconographic similarities (between images, in particular to identify copying and borrowing processes) and textual similarities (between images and captions or associated texts, in order to identify, e.g., different images describing similar textual content). These methods will provide historians with new associations of illustrations and possible relationships (inter-iconic and/or inter-textual), the analysis of which will help answer several essential questions in studies of the circulation of illustrated scientific knowledge, starting with the place and role of the image in the transmission of this knowledge.
The work is being carried out on two medieval manuscript corpora and two printed corpora from the modern period, covering three fields of knowledge: natural history, mathematics and pharmacology. The project involves:
- the constitution of these four corpora, their indexing and the automatic extraction of their illustrations into an image database in IIIF format, equipped with a shared digital interface for consultation and annotation,
- the development of similarity detection methods, the joint analysis of tests carried out on illustrations extracted from the four corpora, and the organization of the annotation work required to improve the detection algorithms,
- the historical study of the results obtained and an interdisciplinary analysis of the associated methodological issues,
- the academic dissemination of results (publications, conferences, preparation of a collective work) in all the disciplines involved (history of science, digital humanities, computer vision), and their use in education.
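To make the IIIF side of the first item more concrete, the sketch below builds an IIIF Image API URL that requests one rectangular region of a page image, which is one way an extracted illustration could be addressed. It is a minimal sketch under stated assumptions: the server address, the image identifier and the function name `iiif_region_url` are hypothetical, and the summary does not say how the project stores or serves its illustrations. Only the URL pattern (`identifier/region/size/rotation/quality.format`) comes from the IIIF Image API itself.

```python
from urllib.parse import quote


def iiif_region_url(base_url, identifier, x, y, width, height):
    """Build an IIIF Image API URL for a pixel region of an image.

    The region is given as x,y,w,h in pixels; the image is requested at
    maximum available size, unrotated, in default quality, as JPEG.
    """
    if width <= 0 or height <= 0:
        raise ValueError("region width and height must be positive")
    region = f"{x},{y},{width},{height}"
    # Identifiers must be percent-encoded, including any slashes they contain.
    encoded_id = quote(identifier, safe="")
    return f"{base_url.rstrip('/')}/{encoded_id}/{region}/max/0/default.jpg"


# Hypothetical server and identifier, for illustration only.
url = iiif_region_url("https://example.org/iiif/", "ms/a f12", 10, 20, 300, 400)
print(url)
```

Addressing an illustration as a region of its page image, rather than as a separate cropped file, keeps the link between the illustration and the page it comes from.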

The approach adopted, closely combining humanities and data sciences, places the project firmly in the field of Digital Humanities, where VHS will provide a concrete and important result for historians of science, technology and art and for the visual studies community: the methods developed will be made available in the form of an IIIF API, designed as a research environment for the analysis of the circulation of scientific knowledge through images. These same methods will also constitute original contributions in the field of AI on several important challenges, such as weakly supervised learning of fine-grained image representations (including from weakly aligned text data) and weakly supervised learning of image style. The results of the project will be disseminated academically (publications, conferences, preparation of a collective work) in all the disciplines involved. They will also be put to educational use through the development of teaching materials for primary and secondary school teachers, as well as the creation of a Master's-level teaching unit in Digital Humanities.

VHS proposes innovative automatic analysis methods and a study methodology capable of taking into account crucial characteristics of the historical circulation patterns of scientific illustration, including their inter-iconic and inter-textual dimensions.
In the history of science, assembling and comparing a very large number of images, some of which would never have been detected without the contribution of Computer Vision, and studying their relations will allow us to better understand the place, role and epistemological status of illustrations and of visual information in the circulation of scientific knowledge, issues that have until now been difficult to grasp for medieval and modern historians alike. More broadly, VHS will question whether there existed, in science, visual cultures common to the Middle Ages and the modern period.
In Computer Vision and Machine Learning, this project will be an opportunity to tackle core challenges such as unsupervised fine-grained classification and learning from weakly aligned text data. We believe that the precise context of this work and the specific nature of the corpora studied will make it possible to develop and demonstrate original solutions to these challenges. We expect the new tasks defined and the algorithms developed to be of interest to the vision community and to contribute to progress on these problems.
The tests carried out on four corpora covering a long period and three distinct fields of knowledge will help establish the robustness of the analysis methods, so that they can be further developed and made available in the form of an API. This API constitutes another result, intended for the Digital Humanities community, for teams working on illustrated corpora, and beyond.

The results of the VHS project include:
- all the publications and presentations in conferences and congresses by the team members in the different disciplines involved;
- the preparation of a collective work synthesizing historical reflections on the circulation of scientific knowledge through illustration, based on the results obtained;
- a consultation interface to the iconographic database built from the four corpora of the project and to the networks and circulation processes detected;
- the implementation of the learning methods developed in the form of an API compatible with the IIIF format. This API will be freely available so that the community can use it and test it on other illustrated corpora. It will take the form of a research environment in which users can load a corpus, apply the analysis functionalities developed to it, and visualize and annotate the results. Special attention will be paid to the design of the functionalities for viewing, manipulating and annotating the results, in order to ensure their relevance for historians;
- a set of educational materials for primary and secondary school teachers developed to enable them to easily design lessons and pedagogical activities focusing on the role of image in the historical development of scientific knowledge;
- a teaching unit in Digital Humanities at Master level (5 ECTS), directly linked to the research conducted in the project, to be deployed in at least one training course (at Sorbonne University);
- the organization of a multidisciplinary closing conference to present the main results of the project, in particular the analysis methods developed, the VHS API, and the historical synthesis of the work conducted.

To achieve these goals, the project involves three recognized partners (the Digital Humanities team of the Institut des Sciences du Calcul et des Données at Sorbonne University; the Monde Byzantin team from the Orient & Méditerranée laboratory (UMR 8167); the Imagine team from the Gaspard Monge Computer Science Laboratory at École des Ponts ParisTech (ENPC)), which bring together specialists in history of science, scientific illustration, Computer Vision and Deep Learning.
The VHS team is driven by a strongly interdisciplinary approach that will be structured over the four years of the project around two activities: regular internal workshops, which will allow the cross-fertilization of skills and approaches and the organization of work, and a monthly public seminar dedicated to studies on the circulation of scientific illustration. The work will be carried out on four illustrated corpora, two medieval manuscript corpora and two printed corpora from the modern period, covering three fields of knowledge: natural history, mathematics and pharmacology. These corpora will be indexed and their illustrations automatically extracted into an IIIF database equipped with a consultation and annotation interface, the implementation of which will constitute the first phase of the project. The consortium will then work in parallel on the development of similarity detection methods, the joint analysis of the tests carried out, and the historical study of the results obtained and of the associated methodological issues.

Project coordination

Alexandre GUILBAUD (INSTITUT DES SCIENCES DU CALCUL ET DES DONNEES)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

ISCD INSTITUT DES SCIENCES DU CALCUL ET DES DONNEES
Orient et Méditerranée, textes - archéologie - histoire
LIGM Laboratoire d'Informatique Gaspard-Monge

ANR funding: 612,386 euros
Beginning and duration of the scientific project: December 2021 - 48 Months
