Artificial life as a benchmark for molecular evolution and genomics – Evoluthon
Evoluthon: Artificial life as a benchmark for molecular evolutionary studies
We propose an original, principled way of benchmarking models and methods for molecular evolution studies with computer simulations. We are inspired by the "double-blind" principle that governs test studies in science in general, as well as by software development practices in which development and test teams are separated and work independently.
Artificial life as a benchmark
Methods in molecular evolution face a validation issue: it is not possible to travel in time and verify hypotheses and predictions, which concern events that can be up to 4 billion years old. Throughout the scientific literature, the most popular validation approach remains computer simulation. Genome evolution can be simulated in silico for a much higher number of generations than in experimental evolution, at a much lower cost. The results of simulations can then be used as test instances for inference methods.

Performing simulations for validation requires epistemological and organizational thinking. Very often, an individual method is tested with an ad hoc simulation, i.e. a simulation made on purpose to test it. In that situation, some elements of the method are inevitably integrated into the simulator, which is then likely to generate only easy instances for this method and is unlikely to match the complexity of real data.
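The evaluation loop this implies can be sketched in a few lines. The sketch below is purely illustrative and not part of the project's software: `score_inference`, the toy instances, and the `naive_ancestor` "method" are all hypothetical names. The key point is the interface: the inference method sees only the observable data, while the ground truth recorded during simulation is used exclusively for scoring.

```python
def score_inference(infer, benchmark_instances):
    """Evaluate an inference method on simulated benchmark instances.

    Each instance pairs observable data (what a real study would see)
    with the hidden ground truth recorded during simulation. The
    inference method never sees the truth; the simulator's internals
    never leak into the method -- the "blind" separation advocated here.
    """
    correct = 0
    for observed, truth in benchmark_instances:
        correct += (infer(observed) == truth)
    return correct / len(benchmark_instances)

# Toy instances: pairs of descendant sequences, with the true ancestral
# state of the first site as ground truth (hypothetical format).
instances = [(("AC", "AG"), "A"), (("TT", "TA"), "T")]

# A deliberately naive "inference method": read the first site of the
# first sequence. Any real method would plug in the same way.
naive_ancestor = lambda pair: pair[0][0]

print(score_inference(naive_ancestor, instances))
```

The same scoring function works for any method with this interface, which is what makes a shared, method-agnostic benchmark possible.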
(1) Inference methods and simulated benchmarks should not be built by the same team. Moreover, the benchmark and inference teams, while sharing a common biological culture, should be "methodologically blind" to each other: principles from the inference methods should not be included in the simulations, and conversely, principles specific to the simulations should not be used by the inference methods. To this end, the simulations and the inference methods should be produced by teams belonging to different scientific communities.
(2) The simulated benchmarks should be produced by a model that was not designed to be used as a benchmarking tool. While this may seem hardly feasible, and somewhat contradictory, we argue that it is the way to approach the double-blind principle, and that it is possible for molecular evolution because disjoint scientific communities exist around the modeling of genome evolution.
(3) As much as possible, processes, not patterns, should be simulated. This means that instead of tuning parameters so that the output resembles empirical data in some arbitrary sense, we should uncover the processes that produce these empirical data and implement them in a mechanistic model. Although it is desirable to produce simulated data that look like empirical data, the definitions of the similarity measures can themselves be ad hoc design choices, dependent on a particular inference method.
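To make the process-vs-pattern distinction concrete, here is a minimal mechanistic forward simulation: a neutral Wright-Fisher population with point mutations. This is a deliberately simplistic sketch, not Aevol's model; all names and parameter values are hypothetical. Note that sequence divergence is never set as a parameter: it emerges from the simulated process (population size, mutation rate, number of generations), which is exactly the sense in which a process, rather than a pattern, is being simulated.

```python
import random

random.seed(42)  # arbitrary seed, for reproducibility of the sketch

ALPHABET = "ACGT"

def mutate(seq, mu):
    """Point mutations: each site changes to another base with probability mu."""
    return "".join(
        random.choice(ALPHABET.replace(c, "")) if random.random() < mu else c
        for c in seq
    )

def simulate_population(ancestor, n_generations, pop_size, mu):
    """Neutral Wright-Fisher forward simulation.

    Each generation, every offspring picks a uniformly random parent and
    inherits its (possibly mutated) sequence. Only the process is specified;
    observable patterns such as divergence emerge from it.
    """
    population = [ancestor] * pop_size
    for _ in range(n_generations):
        population = [mutate(random.choice(population), mu)
                      for _ in range(pop_size)]
    return population

ancestor = "".join(random.choice(ALPHABET) for _ in range(100))
descendants = simulate_population(ancestor, n_generations=50, pop_size=20, mu=1e-3)

# Divergence from the ancestor is an emergent, measured quantity,
# not a tuned input parameter.
divergence = sum(a != b for a, b in zip(ancestor, descendants[0])) / len(ancestor)
print(f"divergence after 50 generations: {divergence:.2f}")
```

A pattern-based simulator would instead draw sequences at a target divergence chosen to match empirical data, which is precisely the circularity that principle (3) seeks to avoid.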
There is a need for a cooperative effort to organize and standardize benchmarks, as acknowledged for example by the addition of a section dedicated to benchmarking in PLoS Computational Biology, or the upcoming 2019 special issue of Genome Biology on benchmarking studies.
The artificial life simulator is under construction.
As soon as it is operational, we will use the data it generates to test phylogeny programs.
We propose to implement an original computer-simulation testing principle for evolutionary genomic inference methods. These methods, although used daily in areas as diverse as health, agriculture, biodiversity protection and forensics, produce historical inferences that are difficult to test experimentally. The weakness of current evaluation systems is that they insert into the simulations the same simplifying hypotheses as in the inference methods, because both are developed by the same designers for validation purposes. For example, genes are defined a priori as evolutionary units, which makes their annotation and classification trivial. Extinct species are usually not simulated if no measurement is made from them, even when they interfere through hybridization or horizontal gene transfer.
We propose a benchmarking approach rather than a validation approach, in which the development teams and the test team are separate. We bring together two teams, one in phylogenetics, the other in artificial life, to build simulations that are "blind" to the inference methods.
Preliminary tests have shown that this universally recognized scientific principle (blind testing), never before applied in evolutionary studies, can reveal unexpected pitfalls of the methods and help correct them. We will make the simulation results available to the community as a validation standard. This proof of principle is encouraging for the effectiveness of our approach.
We will generalize this principle to evolutionary studies by adapting a program from artificial life research, Aevol. Importantly, Aevol was not designed to generate a benchmark, which paradoxically makes it interesting to use as one. We will organize the collaboration between the two teams as a series of mutually addressed challenges, allowing communication about biological processes while avoiding communication about computational models, which should remain as separate as possible.
We will organize an international competition to promote this approach and test a large number of methods. The project will therefore also enhance cooperation between teams and promote best practices in method comparison.
In particular, we will apply the benchmark to modern phylogeny methods developed in the team, which integrate several evolutionary scales and their interactions.
Project coordination
Eric TANNIER (Centre de Recherche Inria Grenoble - Rhône-Alpes)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
Partner
INRIA GRA Centre de Recherche Inria Grenoble - Rhône-Alpes
LBBE BIOMÉTRIE ET BIOLOGIE EVOLUTIVE
ANR grant: 298,339 euros
Beginning and duration of the scientific project: October 2019 - 48 months