DETECT: New statistical approaches for computer vision and bioinformatics
Context: The origins of the DETECT project lie in the relationship between several recent practical problems in computer vision and bioinformatics and methodologies from statistical theory and machine learning. On the one hand, understanding actions and human interactions in video is still an open problem in computer vision, and solving it would have many applications, such as the automatic annotation of video collections. On the other hand, the ever-increasing amounts of biological data (CGH profiles from new sequencing technologies, gene expression or protein level measurements under different conditions, DNA sequencing data, etc.) require new statistical tools for reliable, accurate and fast processing. Two detection problems are central to these two domains: multiple change-point detection (i.e., partitioning a sequence of observations into homogeneous contiguous segments) and multiple testing (i.e., simultaneously performing a large number of hypothesis tests while controlling the number of false positives). However, solving the related vision and bioinformatics problems is a major challenge for statistics and machine learning. It requires designing change-point detection algorithms for multivariate structured data, designing multiple testing procedures that automatically adapt to dependency structures that are unknown and impossible to estimate, designing robust methods (i.e., methods that do not require unrealistic assumptions), and understanding precisely the trade-offs between increased precision and higher computing time for the associated algorithms. Strong, two-way and long-term interactions are thus needed between statistical machine learning, computer vision and bioinformatics.
Objectives: The main objective of the DETECT project is to develop interactions around detection problems.
By allowing exchanges between these domains, we aim at:
- in computer vision, automatically recognizing actions in videos,
- in bioinformatics, robustly segmenting multiple CGH profiles, detecting exceptional DNA motifs (i.e., motifs that are under- or over-represented), and detecting genes or groups of genes that are differentially expressed, among tens of thousands of candidates, from only a few observations,
- in machine learning, improving the theoretical understanding of methods that are frequently used in vision and bioinformatics, and proposing new algorithms for these problems.
Program: The work program for the DETECT project is split into four main work packages. Two work packages correspond to the two main types of problems identified: change-point detection and multiple testing. The two main application domains (computer vision and bioinformatics) each have their own work package. The main tools we propose to use for solving these problems are resampling, model selection and kernel-based statistics. To reinforce the links between the four work packages, we will start a monthly reading group dedicated to multi-disciplinary interactions; we will also take advantage of the physical proximity of all team members to organize frequent working meetings between two or three of them.
Impact: The DETECT project aims at obtaining results in the fields of mathematical statistics, computer vision and bioinformatics. They will be disseminated by:
- publishing in highly ranked international journals and conference proceedings,
- participating in national and international conferences,
- collaborating with biologists (we are already in contact with researchers from INRA Jouy and Institut Curie),
- giving public access to open-source packages (for example, R and Matlab packages) under GPL licensing.
In the medium term, significant impact is expected from this project, e.g.:
- allowing the automatic annotation of videos and the extraction of segments of interest,
- discovering predisposition genes for heart diseases (in humans) or for increased productivity (in plants such as wheat),
- assisting medical diagnosis for cancer detection through the analysis of CGH profiles.
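To make the two detection problems named in the Context concrete, here is a minimal Python sketch under stated assumptions. It is not the project's methodology: the summary proposes resampling, model selection and kernel-based statistics, whereas this sketch uses two textbook baselines chosen purely for illustration. The first is an exact least-squares segmentation of a univariate signal by dynamic programming, with the number of segments assumed known. The second is the Benjamini-Hochberg step-up procedure, which controls the false discovery rate (an expected proportion of false positives) and assumes independent or positively dependent p-values. The function names `segment_least_squares` and `benjamini_hochberg` and the toy data are hypothetical.

```python
import numpy as np


def segment_least_squares(signal, n_segments):
    """Return the start indices of segments 2..n_segments for the partition
    of a 1-D signal into contiguous pieces with minimal total squared error
    around each piece's mean (exact dynamic programming, O(K * n^2))."""
    x = np.asarray(signal, dtype=float)
    n = len(x)
    # Prefix sums give the squared error of any segment x[i:j] in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x ** 2)))

    def cost(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[k, j]: minimal cost of splitting x[:j] into k segments.
    best = np.full((n_segments + 1, n + 1), np.inf)
    last = np.zeros((n_segments + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1, i] + cost(i, j)
                if c < best[k, j]:
                    best[k, j] = c
                    last[k, j] = i
    # Backtrack from the end of the signal to recover the change points.
    change_points = []
    j = n
    for k in range(n_segments, 1, -1):
        j = last[k, j]
        change_points.append(int(j))
    return sorted(change_points)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of rejected hypotheses (BH step-up procedure)."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = np.nonzero(p[order] <= thresholds)[0]
    rejected = np.zeros(m, dtype=bool)
    if passed.size > 0:
        # Reject every hypothesis up to the largest rank passing its threshold.
        rejected[order[: passed[-1] + 1]] = True
    return rejected


# Toy data (hypothetical): three constant levels, then five p-values.
toy_signal = [0, 0, 0, 0, 5, 5, 5, 5, 1, 1, 1, 1]
toy_change_points = segment_least_squares(toy_signal, n_segments=3)
toy_rejections = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.6])
```

In practice the number of segments is unknown and the data are multivariate and dependent, which is where the model selection and resampling tools mentioned in the Program would come in; the sketch deliberately leaves those aspects out.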
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
ANR funding: 0 euros
Start date and duration of the scientific project: - 0 months