DS04 - Life, health and well-being

The Automatic Biomolecule Random Walk Analyser: Single Molecule Science at the Age of Big Data – TRamWAy

Submission summary

Single molecule biology is demonstrating impressive technical advances. New imaging techniques are introduced every other month, enabling the recording of large sets of biomolecules at ever-higher densities and faster temporal resolutions. Millions of localization events and hundreds of thousands of trajectories can now be obtained in a single cell.
Single molecule science is following the global trend in biology towards big data. The exponential increase in data is forcing a shift away from Karl Popper’s principles, in which hypotheses are formulated first and then tested empirically, to be falsified or corroborated. The variety of biomolecule dynamics and cellular environments now being recorded outpaces both the production of theories and the proper statistical treatment of all environmental properties. This growth in data size raises numerous challenges. In a field already dominated by noise, owing to the physical scale of the processes, the heterogeneity of the cellular environment and the variability between individual cells, large datasets make it formidably difficult to analyse data properly and to extract robust, reliable and reproducible results. Hence, single molecule biology needs to reinvent itself to adapt to a new reality in which massive amounts of data are produced, numerous biological conditions are tested, full-cell-scale data are acquired in tens of seconds, and scientific hypotheses and theories are generated as data are observed.
Our solution to these challenges is the TRamWAy software platform. The main objective of the project is to develop a global and fully Bayesian statistical framework that covers all the steps involved in the analysis of single biomolecule dynamics. Namely, it will include generative machine learning models for localization, graph theoretical methods for the probabilistic assignment of molecule identities, which eliminate the biases associated with tracking, and Bayesian model selection for inferring the underlying physical mechanisms that generate the observed motion. This unified framework will enable quantitative and rigorous comparison and selection of models describing the biomolecule dynamics, complemented by full assessments of the quality of the results, posterior distributions of the parameters and statistical evidence for competing models. This framework will be integrated into the TRamWAy software platform. To build a platform able to handle big datasets, the project will focus on developing computationally efficient algorithms that can run on multiple computing infrastructures.
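As an illustration of what "statistical evidence for competing models" can mean in practice, the sketch below compares two hypothetical models of a 2D trajectory: pure Brownian diffusion, and diffusion with a constant drift. It is not taken from TRamWAy and does not use its API. The function names, the uniform priors and the grid bounds are all assumptions made for this example, and the evidence is approximated by brute-force averaging of the likelihood over a regular grid, which is only practical for such low-dimensional toy models.

```python
import numpy as np


def simulate_steps(n, diffusivity, dt, vx, vy, seed):
    """Hypothetical helper: draw n 2D displacements with drift (vx, vy)."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(2.0 * diffusivity * dt)  # per-axis step standard deviation
    dx = vx * dt + sigma * rng.standard_normal(n)
    dy = vy * dt + sigma * rng.standard_normal(n)
    return dx, dy


def _log_mean_exp(log_values):
    """Numerically stable log of the mean of exp(log_values)."""
    m = np.max(log_values)
    return m + np.log(np.mean(np.exp(log_values - m)))


def log_evidences(dx, dy, dt, d_grid, v_grid):
    """Approximate log evidences of two models under uniform priors.

    M0: pure diffusion, parameter D on d_grid.
    M1: diffusion plus drift, parameters (D, vx, vy) on d_grid x v_grid x v_grid.
    With a uniform prior on a regular grid, the evidence is approximated
    by the mean likelihood over the grid points (an assumption of this sketch).
    """
    n = dx.size
    s2 = np.sum(dx**2 + dy**2)      # sufficient statistics of the steps
    sx, sy = dx.sum(), dy.sum()
    var = 2.0 * d_grid * dt         # per-axis step variance for each D

    # M0: isotropic 2D Gaussian steps with zero mean.
    log_l0 = -n * np.log(2.0 * np.pi * var) - s2 / (2.0 * var)

    # M1: same Gaussian, with mean (vx * dt, vy * dt).
    vx = v_grid[None, :, None]
    vy = v_grid[None, None, :]
    r2 = s2 - 2.0 * dt * (vx * sx + vy * sy) + n * dt**2 * (vx**2 + vy**2)
    var3 = var[:, None, None]
    log_l1 = -n * np.log(2.0 * np.pi * var3) - r2 / (2.0 * var3)

    return _log_mean_exp(log_l0), _log_mean_exp(log_l1)


if __name__ == "__main__":
    dt = 0.05
    d_grid = np.linspace(0.01, 0.5, 400)   # assumed prior range for D
    v_grid = np.linspace(-2.0, 2.0, 81)    # assumed prior range for each drift component
    dx, dy = simulate_steps(500, 0.1, dt, 1.0, 0.0, seed=0)
    log_z0, log_z1 = log_evidences(dx, dy, dt, d_grid, v_grid)
    print("log Bayes factor (drift vs. pure diffusion):", log_z1 - log_z0)
```

A positive log Bayes factor favours the drift model and a negative one favours pure diffusion. Because the drift model spreads its prior over two extra parameters, it is penalised automatically when the data do not need them.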
With the help of these statistical tools, our second goal is to show that screening experiments on dynamical biomolecules are not only possible but will also benefit our understanding of cellular functioning. Multiplexing experiments will make it possible to acquire large amounts of data and to sample biological diversity. The software platform will provide a reliable framework to analyse data and to quantify all sources of noise and variability. We will build a physical random walk atlas of numerous cell lines by screening thousands of cells. Furthermore, we will complement this first screen with an inhibitory synapse atlas screen based on previous and ongoing experiments in our collaborators' teams.
This project makes the case that by combining machine learning, graph theory, Bayesian inference and a physical view of random walks, we can provide a robust, reliable and efficient framework to probe and understand biomolecule dynamics. This is a great challenge and we believe that exciting achievements lie ahead.

Project coordinator

Mr Jean-Baptiste MASSON (INSTITUT PASTEUR (BP))

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partner

DBC INSTITUT PASTEUR (BP)

ANR funding: 277,624 euros
Beginning and duration of the scientific project: October 2017 - 42 months
