CE46 - Numerical models, simulation, applications

Big Data reduction for predictive computational modeling – DataRedux

Our main novelty is to define network reduction techniques in relation to the dynamical processes occurring on the networks. To this end, we will develop methods to go from data to information and knowledge at different scales in a human-accessible way by extracting structures from high-resolution, diverse and heterogeneous data. Our methodology will involve identifying the most relevant subparts of time-resolved datasets while remapping the remaining parts of the system, building simultaneous structural-temporal representations of time-varying networks, developing parsimonious data representations that extract meaningful structures at mesoscales (“mesostructures”), and building models of interactions that include mesostructures of various types. Our aim is to identify data aggregation methods at intermediate scales and new types of data representations that carry the richness of information of the original data, keeping their most relevant patterns and summarising less salient properties, so that they can be integrated more manageably into data-driven models for decision making and actionable insights.

P3's modeling work on the COVID-19 epidemic had a great societal impact: it estimated the risk of importing cases at the very beginning of the epidemic, then provided scenarios for various types of restrictions, evaluated the impact of various measures a posteriori, and proposed protocols for reopening in the best possible conditions.
P3 and P1 also collaborated within the working group on the development of the TousAntiCovid application, where this modeling work made it possible to validate the application's effectiveness. Workpackages WP3, WP4
From a theoretical point of view, P1 and P2 have contributed to the development of the theme of higher-order interactions, i.e. how to go beyond networks that can only represent pairwise interactions: in many contexts, in particular social ones, interactions also take place in groups, and the modeling must therefore be adapted (in particular with hypergraphs). P1 and P2 have proposed new models of interaction and social contagion, showing a new phenomenology of contagion processes, and new methods for investigating structures in temporal hypergraphs. These topics are currently very promising. Workpackages WP1, WP3, WP4
In addition, P1 and P2 have proposed new procedures for representing and embedding temporal networks in a finite-dimensional space (“embeddings”), which make it possible to address tasks such as predicting the size (or the complete course) of an epidemic process that is only partially observed. Finally, multi-view canonical correlation analysis (CCA) approaches on graphs have been proposed. These methods, embeddings or CCA, have many potential applications. Workpackages WP1, WP2

In the next period, we plan to continue the work in the directions defined by the Workpackages, in particular: development and analysis of a new method for representing temporal networks that makes it possible to detect changes from one state to another in these networks; definition and analysis of the “temporal rich club”, generalizing the “rich club coefficient” to temporal networks; definition of methods for comparing temporal networks; definition and study of contagion processes on hypergraphs, and study of their relevance for infectious diseases.
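For readers unfamiliar with the quantity being generalized, the following is a minimal sketch of the standard rich club coefficient on a static, undirected network: the density of links among nodes whose degree exceeds k, phi(k) = 2 E_{>k} / (N_{>k} (N_{>k} - 1)). It is not the temporal generalization planned by the project, whose definition is not given here. The function name `rich_club_coefficient` and the toy edge list are illustrative assumptions.

```python
from itertools import combinations


def rich_club_coefficient(edges, k):
    """Density of links among nodes whose degree is strictly greater than k.

    Static, undirected, unweighted case only (illustrative sketch).
    Returns None when fewer than two nodes have degree > k.
    """
    # Build an undirected adjacency map, ignoring self-loops and duplicates.
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # The "club": nodes with degree > k.
    club = {node for node, nbrs in neighbors.items() if len(nbrs) > k}
    if len(club) < 2:
        return None

    # Count links with both endpoints inside the club.
    links = sum(1 for u, v in combinations(club, 2) if v in neighbors[u])
    possible = len(club) * (len(club) - 1) / 2
    return links / possible


# Hypothetical toy network: a triangle of hubs (a, b, c), each with one leaf.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("a", "d"), ("b", "e"), ("c", "f")]
print(rich_club_coefficient(toy_edges, 1))  # hubs only: fully connected
print(rich_club_coefficient(toy_edges, 0))  # all six nodes
```

In this toy network the three hubs form a complete subgraph, so the coefficient at k = 1 is 1, whereas over all six nodes (k = 0) only 6 of the 15 possible links exist.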

see report

Submission summary

The DataRedux project focuses on developing radically new methods for the reduction of the complexity of large networked datasets to feed effective and realistic data-driven models of spreading phenomena.
Many rich datasets on actions and interactions of individuals have recently become available, commonly encoded as networked systems, arising from heterogeneous sources with details at different scales and resolutions, and potentially containing geographical and temporal information as well as metadata. These outstanding sources of information and knowledge fuel a wide spectrum of data-driven numerical simulations of dynamical processes. Data alone, however, even in huge amounts, do not easily transform into knowledge or predictive models. The rich and diverse information they contain raises crucial challenges concerning their analysis, representation and interpretation, the extraction of meaningful structures, and their integration into data-driven models.
In this context, DataRedux puts forward an innovative framework to reduce networked data complexity while preserving its richness, by working at intermediate scales (“mesoscales”). Our objective is to reach a fundamental breakthrough in the theoretical understanding and representation of rich and complex networked datasets for use in predictive data-driven models. Our main novelty is to define network reduction techniques in relation to the dynamical processes occurring on the networks. To this end, we will develop methods to go from data to information and knowledge at different scales in a human-accessible way by extracting structures from high-resolution, diverse and heterogeneous data. Our methodology will involve identifying the most relevant subparts of time-resolved datasets while remapping the remaining parts of the system, building simultaneous structural-temporal representations of time-varying networks, developing parsimonious data representations that extract meaningful structures at mesoscales (“mesostructures”), and building models of interactions that include mesostructures of various types. Our aim is to identify data aggregation methods at intermediate scales and new types of data representations that carry the richness of information of the original data, keeping their most relevant patterns and summarising less salient properties, so that they can be integrated more manageably into data-driven models for decision making and actionable insights.
The scientific program of DataRedux will benefit from the diverse expertise of the participating teams to reach the objectives of the project. The project will last 48 months and is organised in six work packages: four scientific, one on dissemination, and one on management. It involves three teams, each with a leading position in its own field of research: the coordinator, the DANTE INRIA team, hosted by the Laboratoire de l'Informatique du Parallélisme and the IXXI Complex System Institute at ENS Lyon, expert in the exploration of massive enriched networked datasets on human behaviour, statistical methods and data-driven modelling of social contagion phenomena; the Statistical Physics and Complex Systems team from CNRS, CPT Marseille, with expertise in complex networks, temporal networks, spreading processes and dynamical processes; and the Pierre Louis Institute of Epidemiology and Public Health from INSERM, with expertise in computational epidemiology, data-driven modelling and host dynamics.
This proposal is an invited resubmission after being ranked 2nd on the waiting list of the AAPG ANR 2018. The new proposal has been revised to take into account the evaluation remarks and the results obtained by the partners since last year.

Project coordination

Alain BARRAT (Centre National de la Recherche Scientifique Délégation Provence et Corse _Centre de physique théorique)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

iPLESP Institut Pierre Louis d'épidémiologie et de santé publique
LIP Laboratoire d'Informatique du Parallélisme
CNRS DR12 _CPT Centre National de la Recherche Scientifique Délégation Provence et Corse _Centre de physique théorique

ANR funding: 711,702 euros
Beginning and duration of the scientific project: December 2019 - 48 Months
