INtegrating and Sharing Health dAta for REsearch – INSHARE

CONTEXT: Health Big Data (HBD) is radically changing the way researchers relate to information for medical research. HBD can be exploited at different levels and across different domains, especially for multidisciplinary research. The secondary use of HBD is a promising step towards decreasing research costs, increasing patient-centered research, and speeding up the rate of new medical discoveries. The policy of opening up HBD for health data science is supported by public authorities and scientific communities. Clinical data warehouse (CDW) technologies have emerged as one of the solutions for exploiting HBD.
CHALLENGES: Sharing and exploiting HBD efficiently requires tackling the following challenges:
- HBD is sensitive: reusing HBD must comply with data protection governance that takes into account legal, ethical and deontological aspects, so as to enable a trusting, transparent and win-win relationship between researchers, citizens and data providers.
- HBD has a limited level of interoperability: data are compartmentalized and therefore syntactically and semantically heterogeneous.
- HBD is of highly variable quality depending on the source. This factor has to be taken into account for effective data management and statistical analysis.
The position of the INSHARE project is to explore, through an experimental proof of concept, how recent technologies could overcome such issues. Our approach is to bring together stakeholders, computer scientists and researchers in order to imagine and create the conditions needed to carry out real-world scenarios of secondary use of HBD in medical research.
OBJECTIVE: The project aims to demonstrate the feasibility and added value of an IT platform based on CDW and dedicated to collaborative HBD sharing for medical research. INSHARE will be designed to be self-scaling and accessible, yet highly secure.
The specific objectives of the project are:
- To design the governance of data sharing, grounded in an analysis of the actual needs and use cases of the identified end users and stakeholders.
- To develop and implement a prototype of a trusted third-party HBD sharing platform based on innovative CDW frameworks, including a disruptive approach to data protection and big data processing.
- To evaluate the prototype on 3 different real-world use cases:
o Registry enrichment: comorbidities and drug exposure in end-stage renal disease (ESRD) patients.
o Characterizing the healthcare trajectories of children (and their mothers) included in a birth defect registry.
o Cross-domain study: cancer, diabetes and ESRD in a cohort study.
METHODS: The consortium of this 3-year project includes teams specialized in medical informatics, statistics and epidemiology, data protection, and computer science, as well as 6 data providers (2 academic hospitals and 3 national or regional registries).
The scientific program is organized into 3 main work packages: (1) Definition of the use cases, the platform governance, and the data privacy and sharing procedures. (2) Data integration and platform realization, including tasks on data quality, big data processing, data encryption and watermarking. (3) Platform evaluation.
The originality and innovations of the project will be: (i) A new governance of HBD sharing in which data providers entrust their data sources to a trusted third-party framework involving actors specialized in medical informatics, reinforcing their role as health data scientists. (ii) New methods for enhancing health data integration with big data technologies. (iii) A new approach to measuring data quality for better exploitation. (iv) Innovation in data protection using cryptowatermarking for better security in terms of integrity and traceability. The expected result is an operational prototype of the platform by the end of the project, providing effective win-win services that meet research needs and the expectations of data providers and citizens.

Marc CUGGIA (Laboratoire du traitement du signal et de l'image - Equipe projet Données Massives en Santé (DMS))

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.


CDC BREST Centre de Données Cliniques du CHRU de BREST
EHESP Ecole des Hautes Etudes en Santé Publique
REGISTRECANCER Registre général des cancers de Poitou-Charentes
ABM-REIN Agence de la biomédecine (Registre REIN)
DMS-LTSI Laboratoire du traitement du signal et de l'image - Equipe projet Données Massives en Santé (DMS)

ANR funding: 846,074 euros
Beginning and duration of the scientific project: September 2015 - 36 Months

