Interactions between text and multimodal remote sensing images to provide easier access to information – TAMMI
In recent years, remote sensing images have become more available than ever. These images contain information that is already used to track climate change, improve security, and understand and manage the environment. This data is, however, hard to interpret and often requires manual processing. As the amount of data grows, interpretation becomes a limiting factor, affecting both the speed at which information can be extracted and the range of domains in which such data can be used. Although the data exists, it remains out of reach for a large audience. In this project, we aim to make the information contained in multimodal data easier to access and to open it to a new audience.
To this end, we propose to use natural language as a means of extracting information from such data. We adopt a generic approach: the learnt data representation is not specific to a single task. To achieve this objective, a new database will be constructed, targeting tasks such as Visual Question Answering, Image Captioning and Image Query. We will study cross-modal shared representations, with a focus on robustness to missing data. Furthermore, we will aim to enhance the interpretability of predictions made from text and multimodal data through new methodological developments, using Visual Question Answering as the working example. A minimal sketch of such a cross-modal model is given below.
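To make the idea of a cross-modal shared representation for Visual Question Answering concrete, the sketch below combines an image encoding and a question encoding in a common space and classifies over a fixed answer vocabulary. This is an illustrative toy example, not the project's actual architecture: the encoders, fusion by element-wise product, and all dimensions are assumptions chosen for brevity.

```python
# Illustrative sketch only: a toy cross-modal VQA model. All module choices
# and sizes are assumptions, not the architecture developed in TAMMI.
import torch
import torch.nn as nn


class ToyVQAModel(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=128, hidden_dim=256, num_answers=50):
        super().__init__()
        # Visual branch: a small CNN standing in for a remote sensing image encoder.
        self.visual = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, hidden_dim),
        )
        # Textual branch: embedding + LSTM standing in for a question encoder.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Classifier over candidate answers, applied to the fused representation.
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_answers),
        )

    def forward(self, image, question_tokens):
        v = self.visual(image)                        # (batch, hidden_dim)
        _, (h, _) = self.lstm(self.embed(question_tokens))
        q = h[-1]                                     # (batch, hidden_dim)
        fused = v * q                                 # shared cross-modal representation
        return self.classifier(fused)                 # answer scores

# Example: one RGB image patch and one tokenised question.
model = ToyVQAModel()
image = torch.randn(1, 3, 64, 64)
question = torch.randint(0, 1000, (1, 12))
print(model(image, question).shape)  # torch.Size([1, 50])
```

The element-wise product is only one simple fusion choice; attention-based or bilinear fusion schemes are common alternatives when studying shared representations and robustness to missing modalities.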
Project coordination
Sylvain Lobry (LABORATOIRE INFORMATIQUE PARIS DESCARTES)
The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its content.
Partner
LIPADE LABORATOIRE INFORMATIQUE PARIS DESCARTES
ANR grant: 222,140 euros
Beginning and duration of the scientific project:
December 2021
- 42 months