Deep learning for multiple targets detection and recognition in variable background – DEEPDETECT
In recent years, computer vision has made a significant breakthrough with the emergence of deep learning techniques. Image classification has benefited in particular: since 2012, deep learning has outperformed earlier state-of-the-art methods in challenges such as the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). In this project we propose to use this kind of method, in particular convolutional neural networks (CNNs), for the detection and recognition of multiple small objects in images. Two applications are considered: the detection and mapping of whales using satellite imagery, and the detection and recognition of objects (vehicles) in infrared images.
For both applications, the object size typically ranges from 5x5 to 10x10 pixels. To address this detection and recognition problem, we propose to design two different architectures. The first is the most common approach and consists of a sliding window that extracts patches (small parts of the full image). The patches are then fed to a trained CNN to differentiate objects from the background and, possibly, to classify them. The second approach processes the full image in one step. In this case, we design a deep classification network for semantic segmentation. In the final segmented map, each pixel receives a label that ideally represents its class. These two approaches will be developed by the members of the consortium using the synthetic database provided by MBDA. Note that these CNN architectures must be designed with the operational constraints in mind.
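The sliding-window approach described above can be sketched as follows. This is only a minimal illustration, not the consortium's implementation: the function names, the patch size, the stride, and the threshold are assumptions, and `placeholder_score` is a hypothetical stand-in (mean patch intensity) for the trained CNN that would score each patch in practice.

```python
import numpy as np


def extract_patches(image, patch_size=8, stride=8):
    """Yield (row, col, patch) for every window position in a 2-D image."""
    height, width = image.shape
    for row in range(0, height - patch_size + 1, stride):
        for col in range(0, width - patch_size + 1, stride):
            yield row, col, image[row:row + patch_size, col:col + patch_size]


def placeholder_score(patch):
    """Hypothetical stand-in for a trained CNN: mean intensity of the patch."""
    return float(patch.mean())


def detect(image, patch_size=8, stride=8, threshold=0.5):
    """Return the top-left corners of patches scored above the threshold."""
    return [
        (row, col)
        for row, col, patch in extract_patches(image, patch_size, stride)
        if placeholder_score(patch) > threshold
    ]


# Synthetic 64x64 image with one bright 8x8 target on a dark background.
image = np.zeros((64, 64), dtype=np.float32)
image[24:32, 24:32] = 1.0

hits = detect(image, patch_size=8, stride=8, threshold=0.5)
```

In a real detector the stride would usually be smaller than the patch size, so that a 5x5 to 10x10 object is never split across non-overlapping windows, at the price of more patches to classify.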
Following this work, we will address the variability of the background in the test database compared with the background available in the training database. The objects to be detected and recognized are assumed to be available in the training database. The goal is to evaluate possible operational situations and the potential losses in the final results.
We will also study how a CNN trained on simulated data performs on real data. To carry out this experiment, new acquisitions in operational situations will be conducted by MBDA to complete the existing database of real images. For the whale mapping application, the idea is to evaluate the adaptation capacity when lower resolutions are used in the test phase. In addition, thanks to the available real data, we propose to evaluate common incremental learning methods to specialize the proposed architecture.
At each step of the project, we will evaluate the performance of the CNN and the results obtained. The goal is to monitor the learning process and to use criteria to quantify the final detection and recognition results.
Finally, as mentioned above, the CNN design will take into account the possible operational constraints. We will therefore analyse potential solutions for real-time implementation in embedded systems, addressing both the hardware (GPU, energy consumption…) and the software (how to reduce computation time) aspects.
Monsieur Alexandre BAUSSARD (Laboratoire des sciences techniques de l'information, de la communication et de la connaissance)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility as to its contents.
AMURE AMÉNAGEMENT DES USAGES DES RESSOURCES ET DES ESPACES MARINS ET LITTORAUX
IRISA Institut de Recherche en Informatique et Systèmes Aléatoires
MBDA Systems MBDA FRANCE
Lab-STICC Laboratoire des sciences techniques de l'information, de la communication et de la connaissance
ANR funding: 297,522 euros
Beginning and duration of the scientific project: December 2017 - 30 Months