Medically Explainable Generative Deep Network for Analyzing Clinical Study Datasets – BIODEEP
This interdisciplinary project aims to develop new methods in mathematics and computer science to stratify patients into homogeneous groups based on biological variables, i.e. biomarkers. Specifically, we will develop a multi-clustering algorithm that operates on biological data and can reveal new clinical interpretations. This approach would be particularly useful in psychiatry, where it is now established that patients presenting the same symptoms, for example those diagnosed with schizophrenia, actually suffer from different pathologies. To achieve this objective, we will use two datasets of clinical and biological variables from two cohorts of patients with schizophrenia, FACE-SZ and OPTiMiSE. The biological variables, essentially biomarkers obtained from blood samples, will be used to predict the clinical variables, in particular the clinical scores established by psychiatrists.
In this project, we will develop a generative deep neural network for multi-facet clustering of biological data. A facet corresponds to a particular clustering which makes it possible to group patients according to common characteristics in their biological samples. By learning multiple facets simultaneously, physicians and biologists will be able to discover new ways to stratify patients. They could then discover new common traits shared by the patients analyzed and improve their care. We hope that some of these traits will correspond to clinical variables commonly used by psychiatrists.
The architecture of the neural network will follow the model of a variational autoencoder and will be learned in an unsupervised way. The project aims at three major innovations: 1) learning latent variables that correspond to relevant non-linear transformations of biological data, 2) modeling dependencies between latent variables with a Bayesian network, a type of probabilistic graphical model, and 3) building a neural network for multi-clustering whose architecture mimics the structure of the optimal Bayes classifier.
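As a rough illustration of the variational autoencoder on which the architecture is modeled, the sketch below shows two standard ingredients of its training objective: the reparameterization trick for sampling latent variables and the closed-form Kullback-Leibler term. It is not the project's implementation. The function names, the standard normal prior, and the Gaussian decoder with squared-error reconstruction are assumptions made for the example.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def negative_elbo(x, x_reconstructed, mu, log_var):
    """Reconstruction error plus KL regularizer (assumed Gaussian decoder)."""
    reconstruction = 0.5 * sum((a - b) ** 2
                               for a, b in zip(x, x_reconstructed))
    return reconstruction + kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical encoder outputs for one sample with two latent variables.
    mu, log_var = [0.3, -1.2], [-0.5, 0.1]
    z = reparameterize(mu, log_var, rng)
    print("latent sample:", z)
    print("KL term:", kl_to_standard_normal(mu, log_var))
```

In a full model, `mu` and `log_var` would be produced by an encoder network from the biological variables and `x_reconstructed` by a decoder network from `z`; both networks are omitted here.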
Latent variables will allow physicians and biologists to identify combinations of biological variables that best stratify patients based on objective biological data. The Bayesian network, particularly relevant for modeling statistical dependencies between variables, will serve as a meta-model that will guide the connectivity between neurons and layers within the deep neural network. This type of graphical model, often used in the health field, will facilitate exchanges between clinicians, biologists, and experts in machine learning. Finally, designing the network after the optimal Bayes classifier is intended to give the best possible prediction performance and to make the best use of the information contained in the databases.
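For readers unfamiliar with the optimal Bayes classifier: when the class priors and class-conditional densities are known, it assigns each observation to the class with the highest posterior probability. The sketch below is a minimal illustration assuming one-dimensional Gaussian class-conditional densities and made-up parameters (toy numbers, not values from the cohorts). How the project's network mimics this structure is not described in this summary.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x, priors, means, stds):
    """Posterior class probabilities P(class | x) via Bayes' rule."""
    joint = [p * gaussian_pdf(x, m, s)
             for p, m, s in zip(priors, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]


def bayes_classify(x, priors, means, stds):
    """Return the index of the class with the highest posterior probability."""
    post = posteriors(x, priors, means, stds)
    return max(range(len(post)), key=post.__getitem__)


if __name__ == "__main__":
    # Hypothetical two-class problem with equal priors (toy parameters).
    priors, means, stds = [0.5, 0.5], [0.0, 4.0], [1.0, 1.0]
    for x in (1.0, 3.0):
        print(x, bayes_classify(x, priors, means, stds),
              posteriors(x, priors, means, stds))
```

In practice the priors and densities are unknown and must be estimated from data, which is where a learned model comes in.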
Beyond its interest for signal processing, mathematics and statistics, our project would have a significant impact in the medical field by enabling patients with the same symptoms to be stratified on the basis of objectively measurable blood biomarkers. Patients belonging to the same class could then be prescribed the treatment best suited to them, in line with the principle of personalized medicine.
Project coordination
Lionel FILLATRE (Laboratoire informatique, signaux systèmes de Sophia Antipolis)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
Partnership
I3S Laboratoire informatique, signaux systèmes de Sophia Antipolis
IPMC Institut de pharmacologie moléculaire et cellulaire
FondaMental Fondation FONDAMENTAL
ANR funding: 292,151 euros
Beginning and duration of the scientific project: March 2024, 42 months