Stochastic Models: Scalable Model Checking – Stoch-MC
STOCH-MC
Stochastic Models: Scalable Model Checking.
Scalable Model Checking and System Biology Applications
The aim of our research project Stoch-MC is to develop scalable model checking techniques that can handle large stochastic systems and give sure answers. Large stochastic systems arise naturally in many different contexts, from networks to system biology. We will demonstrate the techniques that we will develop on two biomodels: a model for Hela cells death and a model of stress response in yeast.
Our main technique is to obtain exact results through approximate computation, allowing scaling to larger system. The results are analysed to see whether a sure answer can be reached (e.g. by monitoring the accuracy of the approximation). If a sure answer cannot be derived, the approximation will be recomputed with a better accuracy.
We consider two kinds of models:
-purely stochastic systems (Markov Chains, Dynamic Bayesian Network (DBNs)). It allows modelling the behaviour of large system biology systems as the apoptosis pathways of Hela, under Trail treatment.
-Markov Decision Processes (MDPs as well as different semantics as Probabilistic Automaton de Rabin) allowing to model stochastic control, for instance of a population of yeasts under osmotic stress.
At T0+17 months:
-We obtained a first DBN model (using sparse Conditional Probability Tables) for the apoptotic pathway. (T5)
-We have a prototype for the approximated inference algorithm for DBNs (with or without sparse representation). Experiments are being performed to monitor its accuracy. (T2)
-We have a theoretical analysis of the one step error made, expressed under the Kullback–Leibler pseudo distance. (T3)
- We can control fairly accurately single yeast using standard control theory. (T4-T5)
-We have several approximation schemes for the control of a population of cells. The exact problem sums up to checking whether a (Rabin's) probabilistic automaton has a word leading to a large proportion of cells being in the Goal states. This problem is highly undecidable, subsuming the Skolem Problem. (T4)
- For a unary probabilistic automaton, we characterise when its language is regular, allowing a simple model checking procedure (T4).
This ANR Project gathers researchers from Formal Methods, System Biology, and also Information Theory. This unique combination already provided interesting results regarding DBN modelling of system biology and its inference.
Beyond the challenging goals of the project, namely obtaining tangible results for biological systems - which is notoriously hard, we want to demonstrate that Model Checking can be used to analyse complex systems. Our ultimate long term goal is to be able to design automatically controllers for large population models. The project STOCH-MC is the first step to show that this is doable.
Some main publications at the heart of the STOCH-MC project.
L. Maruthi, I. Tkachev, A. Carta, E. Cinquemani, P. Hersen, G. Batt. , A. Abate. Towards real-time control of gene expression at the single cell level: a stochastic control approach CMSB 2014, LNCS/LNBI, Springer, 155-172.
François Bertaux, Szymon Stoma, Dirk Drasdo, Gregory Batt. Modeling dynamics of cell-to-cell variability in TRAIL-induced apoptosis explains fractional killing and predicts reversible resistance. PLoS Computational Biology, 10(10):e1003893, 2014.
François Dufour Tomás Prieto-Rumeau. Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities STOCHASTICS, 2015.
Manindra Agrawal, S. Akshay, Blaise Genest, P.S. Thiagarajan. Approximate Verification of the Symbolic Dynamics of Markov Chains. JACM 62(1): 34-65, 2015.
Model Checking is nowadays a mature field of research, being awarded a Turing award in 2007. It consists in describing formally a system in the form of a mathematical model, specifying the properties it has to fulfill in a formal logic, and providing algorithms to decide whether the model satisfies the property. Model checking started with Boolean systems and properties, sequential finite state models, etc. This is now well established. In order to handle realistic (and thus large) systems, abstractions techniques such as the so called CEGAR method, namely counter example guided abstraction refinement (see e.g. [CCKSVW02]), or abstract interpretation (see the work of Cousot et al.) have been developed.
These methods work extremely well for verifying (large, sequential) programs, where Boolean predicates (e.g. variable x is below value a) are well suited. However, current abstraction techniques are not suited to the verification of large stochastic systems. To verify large stochastic systems, techniques such as Statistical Model Checking (based on Monte Carlo property testing) can be used. However, the latter do not give sure answers, but only answers correct up to some probabilities. To date, there is no (sure) Model Checking method which scales to large stochastic systems (e.g. Markov Chains with 10^20 states and over, for instance 20 variables which can take 10 different values each). For instance, the PRISM model checker and other comparable tools scale up to 10^10 states. This is the main reason why Clarke, Cousot et al. launched the CMACS initiative (Computational Modeling and Analysis for Complex Systems) in order to create MCAI 2.0, the next-generation Model Checking and Abstract Interpretation.
The aim of our research project Stoch-MC is to develop scalable model checking techniques that can handle large stochastic systems and give sure answers. Large stochastic systems arise naturally in many different contexts, from networks to system biology. We will demonstrate the techniques on two key stochastic models: a model for Hela cells death and a model of stress response in yeast.
Our key idea to obtain scalable algorithms is that probabilities can be approximated. A bound on the error made by the approximation is computed a posteriori, based on the characteristics of the model. We thus obtain a probability range (e.g. [0.4, 0.6]) instead of a precise point value as in current model checkers. If the answer is not precise enough (e.g. we want to know whether the probability is bigger than 0.45), then more time is devoted to the approximation, such that the error decreases, until a definite answer can be given. This mimics the CEGAR method in spirit.
The method thus calls for several tools. First, parametric algorithms should be proposed which can perform good approximated analysis. The parameter allows making it more precise at the expense of computational time. Second, an error analysis should be proposed, in order to evaluate a bound on the error. Some preliminary approaches have already been proposed by us, see [AAGT12, PALGT12].
Ultimately, we want to model check large Markov Decision Processes (MDPs). In the realm of system biology, compared to Markov Chains, MDPs allows model with different modes (e.g. pathway saturated/non-saturated by a drug). Also, MDPs allow to model different possibilities (the amount of drug to put at some moment, etc). Model checking a MDP thus amounts to finding optimal choices according to a given reward function, for instance designing cells with the right behavior.
We believe that designing scalable model checking techniques for large MDPs against a full version of the logic is not doable in the next 4 years. We will however develop techniques for every key feature and demonstrate them on biological pathways, forming a proof of concept that scalable (sure) model checking that can handle large MDPs is feasible, and will be the basis of an ERC project proposal in 4 years.
Project coordination
Blaise GENEST (Inria, Centre de recherche de Rennes - Bretagne Atlantique, equipe SUMO)
The author of this summary is the project coordinator, who is responsible for the content of this summary. The ANR declines any responsibility as for its contents.
Partner
LIAFA - CNRS UMR 7089 Laboratoire d'Informatique Algorithmique: Fondements et Applications (LIAFA)
LaBRI Laboratoire Bordelais de Recherche en Informatique
INRIA Paris - Rocquencourt INRIA Paris - Rocquencourt
Inria Rennes - Bretagne Atlantique - EPI SUMO Inria, Centre de recherche de Rennes - Bretagne Atlantique, equipe SUMO
Help of the ANR 361,900 euros
Beginning and duration of the scientific project:
January 2014
- 48 Months