RESO - REcherche sur les pratiques et enjeux de la Science Ouverte 2024

Data sharing and reuse in the humanities and social sciences – PaRéDo SHS

Sharing and reuse of research data in the humanities and social sciences (HSS)

When it comes to opening up data from social sciences and humanities research, how can we strike a balance between the diversity of practices, methods, and relationships with a multitude of source materials, the prescription of best practices and standards, technical requirements, and political guidelines?

HSS research data in practice

Answering this question requires interdisciplinary research on the entire cycle of practices involved in the production, mobilization, and circulation of research materials in the humanities and social sciences, and on how these practices relate to infrastructure governance, in order to understand how they both govern and are governed by infrastructure. Within OpenEdition Lab and in partnership with the Centre Internet et Société, the PaRéDo SHS project takes an in-depth look at the policies, mechanisms, and practices of data sharing and reuse in the humanities and social sciences at the level of the COMMONS shared digital infrastructure through three work packages: (WP1) a large-scale analysis of a corpus of policy documents relating to open data, research on data in the humanities and social sciences, and technical and normative specifications; (WP2) an ethnography of the data practices of COMMONS users; and (WP3) reflections and recommendations on the forms of governance necessary to preserve the epistemic diversity of the humanities and social sciences and thus promote the openness, sharing, and reuse of data.

Within the conceptual framework of an ecology of infrastructure governance (Mounier and Dumas Primbault, 2023a) and in line with the COMMONS project's observatory of uses (OE Lab) and the PathOS project's analysis of the economic and societal impact of open science (CIS), this “research on research” project will need to be strongly interdisciplinary in nature. It will be necessary to weave together a variety of theoretical perspectives in a methodological crossover:

● Science and technology studies (STS) will enable us to understand open data infrastructures as socio-technical devices that take place within a broader ecosystem;

● Information and communication sciences (ICS) will highlight how, when interacting with these infrastructures, users engage in practices of appropriation, circumvention, and diversion in order to satisfy their needs for data access, sharing, and reuse;

● Data science will enable us to exploit the traces left by these practices, using tools that identify patterns of engagement between data and between platforms.

This interdisciplinary crossover will enable us to articulate several levels of analysis:

● a qualitative approach based on survey methods derived from ethnography (interviews, questionnaires, observations, focus groups) and semiotics (study of interfaces, classifications, search tools, and content hierarchy);

● a quantitative approach that will use text mining on a massive corpus as well as machine learning methods on usage traces (server logs, Matomo data, web crawling) to identify recurring patterns that may indicate typologies of uses and users.
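As an illustration only, the first step of such a trace analysis can be sketched as follows. This is a minimal sketch, not the project's method: it assumes server logs in Common Log Format, uses invented URL prefixes (`/search`, `/dataset`, `/download`) and sample lines, and replaces machine learning with a simple counting heuristic that groups visitors by their most frequent action. All function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumed input: Common Log Format lines. Paths and addresses are invented.
LOG_LINE = re.compile(
    r'^(?P<visitor>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical mapping from URL prefixes to coarse action types.
ACTIONS = (("/search", "search"), ("/download", "download"), ("/dataset", "view"))

SAMPLE_LOG = [
    '10.0.0.1 - - [01/Aug/2024:10:00:00 +0200] "GET /search?q=archives HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Aug/2024:10:00:05 +0200] "GET /search?q=interviews HTTP/1.1" 200 498',
    '10.0.0.1 - - [01/Aug/2024:10:00:09 +0200] "GET /dataset/42 HTTP/1.1" 200 2048',
    '10.0.0.2 - - [01/Aug/2024:11:00:00 +0200] "GET /download/42.zip HTTP/1.1" 200 90000',
    '10.0.0.2 - - [01/Aug/2024:11:00:03 +0200] "GET /download/43.zip HTTP/1.1" 404 0',
]


def classify(path):
    """Map a request path to a coarse action type."""
    for prefix, action in ACTIONS:
        if path.startswith(prefix):
            return action
    return "other"


def usage_profiles(lines):
    """Count action types per visitor, keeping only successful (2xx) requests."""
    profiles = defaultdict(Counter)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None or not match.group("status").startswith("2"):
            continue  # skip malformed lines and failed requests
        profiles[match.group("visitor")][classify(match.group("path"))] += 1
    return dict(profiles)


def dominant_action(profile):
    """Most frequent action in a profile; ties are broken alphabetically."""
    return min(profile, key=lambda action: (-profile[action], action))


def typology(profiles):
    """Group visitors by dominant action, a crude stand-in for clustering."""
    groups = defaultdict(list)
    for visitor, profile in sorted(profiles.items()):
        groups[dominant_action(profile)].append(visitor)
    return dict(groups)


if __name__ == "__main__":
    print(typology(usage_profiles(SAMPLE_LOG)))
```

In a real study, the per-visitor counts would more plausibly feed a clustering method than a dominant-action rule, and visitor identifiers such as IP addresses would need to be pseudonymized before analysis.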

Numerous studies of data-related practices, as well as a number of criticisms voiced directly by the academic community, show that one of the most important obstacles to the sharing and reuse of research data in the humanities and social sciences (HSS) is epistemological. The principles, policies, and tools, and consequently the framework for specifying, circulating, and promoting research data in HSS, are likely to fail to respect the diversity of the epistemic cultures of these disciplines: among other things, the diversity of materials, the social dynamics of their circulation, the constructivist methodologies that shape the relationship of communities to their materials, and the importance of "doing" one's field or one's archives, or of maintaining a certain relationship with one's respondents.
Open data infrastructures are the nexus where public policies, national and international public bodies, private players, socio-technical devices, the material practices of a variety of users, and the data itself come together. They are therefore a place of productive tension, where the principle of openness and the need for standardization (which enables the storage, documentation, circulation, and interoperability of increasingly massive data) meet the need to support HSS practices as closely as possible in their singularity (in France in particular, owing to institutional and infrastructural constructs, and in contrast to STEM fields and their data) and to preserve epistemic diversity within HSS itself. To guarantee a certain form of academic autonomy (notably in terms of research practices) while advocating good open data practices (notably through norms and standards), the operational question that arises is that of epistemic data governance: how to arbitrate, in context, between the diversity of practices, methods, and relationships to a multiplicity of source materials, the prescription of good practices and standards, technical imperatives, and political orientations?
To answer this question, it is necessary to carry out interdisciplinary research on the whole cycle of practices of production, mobilization, and circulation of research materials in HSS, and to link these to the governance of infrastructures in order to understand how these practices both govern and are governed by infrastructures. Within OpenEdition Lab and in partnership with the Centre Internet et Société, the PaRéDo SHS project takes an in-depth look at the policies, mechanisms, and practices of data sharing and reuse in HSS at the scale of the COMMONS shared digital infrastructure, through: (WP1) a large-scale analysis of a corpus of open data policy documents, research on HSS data, and technical and normative specifications; (WP2) a research-on-research component in the form of an ethnography of the data practices of COMMONS users; and (WP3) reflections and recommendations on the forms of governance needed to preserve the epistemic diversity of HSS and thus foster data openness, sharing, and reuse.

Project coordination

Simon Dumas-Primbault (OpenEdition)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partnership

CIS: Centre Internet et Société
OE: OpenEdition

ANR funding: 268,200 euros
Start date and duration of the scientific project: August 2024, 36 months
