DS07 - Information and communication society

Enhancing Link Keys: Extraction and Reasoning – ELKER

Submission summary

Society at large demands access to data held by various bodies: governments, universities, cultural actors, etc. This has led to the release of a vast quantity of linked data, i.e., data expressed in semantic web formalisms (RDF). Part of the added value of linked data lies in the links identifying the same entity in different datasets. For instance, they may identify the same books and articles in different bibliographical data sources. Links make it possible to jointly exploit the content of data sources and to draw inferences across datasets. Thus, finding the manifestations of the same entity across several datasets is a crucial task for linked data.

One novel way to generate such links is to extract and use link keys. Link keys generalize database keys in two independent directions: they deal with data in RDF, and they apply across two datasets. The goal of ELKER is to extend the foundations and algorithms of link keys in two complementary ways: extracting link keys automatically from datasets, and reasoning with link keys.
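
To make the notion of a link key concrete, here is a minimal sketch, not taken from the project. It assumes a simplified reading in which a link key is a set of property pairs, and two instances are linked when they share at least one value for every pair. The datasets, property names (`title`, `titre`, `creator`, `auteur`) and the function `apply_link_key` are all hypothetical.

```python
from itertools import product

# Two toy datasets (hypothetical): instance -> property -> set of values.
books = {
    "b1": {"title": {"Les Misérables"}, "creator": {"Victor Hugo"}},
    "b2": {"title": {"Notre-Dame de Paris"}, "creator": {"Victor Hugo"}},
}
livres = {
    "l1": {"titre": {"Les Misérables"}, "auteur": {"Victor Hugo"}},
}

def apply_link_key(key, d1, d2):
    """Return the pairs (s1, s2) that share a value for every property pair in key."""
    links = set()
    for (s1, props1), (s2, props2) in product(d1.items(), d2.items()):
        # A missing property is treated as an empty value set, so it never matches.
        if all(props1.get(p, set()) & props2.get(q, set()) for p, q in key):
            links.add((s1, s2))
    return links

# Assumed link key: books with the same title and the same creator are the same.
key = {("title", "titre"), ("creator", "auteur")}
links = apply_link_key(key, books, livres)
```

With this key only `b1` and `l1` are linked; a key made of the creator pair alone would also link `b2` to `l1`, which is why the choice of key matters.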

Concerning link key extraction, ELKER will delve into the parallel between link key extraction and formal concept analysis. This will make it possible to extend the types of link keys that can be extracted and to take advantage of optimised extraction procedures. We will also deal with dependent link keys, which occur naturally when the classes in ontologies are interdependent. For that purpose, we will consider the procedures defined for relational concept analysis and adapt them to link keys. We will also develop a fixed-point semantics for link keys that depend on each other, which would make it possible to generate more links. Finally, we will explore description building techniques for optimising extraction, i.e., taking advantage of the quality measures used for selecting link keys during the extraction process so that the search space can be reduced.
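
The parallel with formal concept analysis can be illustrated with a naive sketch, which is an assumption-laden simplification and not the project's algorithm. It treats pairs of instances as objects and pairs of properties as attributes: an instance pair "has" a property pair when the two instances share a value on it. Candidate link keys are then the attribute sets closed under intersection, each with the links it would generate. All names and data below are hypothetical.

```python
from itertools import product

# Toy datasets (hypothetical): instance -> property -> set of values.
books = {
    "b1": {"title": {"Les Misérables"}, "creator": {"Victor Hugo"}},
    "b2": {"title": {"Notre-Dame de Paris"}, "creator": {"Victor Hugo"}},
}
livres = {
    "l1": {"titre": {"Les Misérables"}, "auteur": {"Victor Hugo"}},
}

def agreement_sets(d1, d2):
    """Map each instance pair to the property pairs on which it shares a value."""
    result = {}
    for (s1, props1), (s2, props2) in product(d1.items(), d2.items()):
        shared = frozenset(
            (p, q) for p in props1 for q in props2 if props1[p] & props2[q]
        )
        if shared:
            result[(s1, s2)] = shared
    return result

def candidate_link_keys(d1, d2):
    """Return {candidate key: links it generates}, closing agreement sets under intersection."""
    agree = agreement_sets(d1, d2)
    closed = set(agree.values())
    changed = True
    while changed:  # naive fixed-point loop; fine for toy data only
        changed = False
        for x, y in product(list(closed), repeat=2):
            z = x & y
            if z and z not in closed:
                closed.add(z)
                changed = True
    return {k: {pair for pair, s in agree.items() if k <= s} for k in closed}

candidates = candidate_link_keys(books, livres)
```

On this data two candidates emerge: the creator pair alone, which links both books to `l1`, and the title and creator pairs together, which link only `b1`. Quality measures would then be needed to choose between them.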

Regarding reasoning with link keys, ELKER will extend description logic techniques for reasoning with ontologies, data and link keys. Tableau methods for description logics will be adapted to infer axioms and link keys from ontologies and link keys. We will also consider distributing this reasoning process, for cases where ontologies and datasets cannot be centralized. Such techniques may be used off-line for generating new link keys that can be evaluated on data. For high-throughput link generation, we will transform link keys into Datalog rules, using an adaptation of probabilistic Datalog that can carry the uncertainty attached to link keys and axioms.
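
As a rough illustration of turning a link key into a rule, the sketch below prints a Datalog-style rule for a link key given as an ordered list of property pairs. It is only a sketch under stated assumptions: the predicate names (`sameAs`, `Book`, `Livre`), the function `link_key_to_rule` and the `confidence ::` prefix are illustrative notation, not the syntax of any particular engine nor the translation the project will define.

```python
def link_key_to_rule(key, class1, class2, confidence=None):
    """Render a link key as a Datalog-style rule string (illustrative notation)."""
    body = [f"{class1}(X)", f"{class2}(Y)"]
    for i, (p, q) in enumerate(key, start=1):
        # Each property pair must agree on one shared value Vi.
        body += [f"{p}(X, V{i})", f"{q}(Y, V{i})"]
    rule = f"sameAs(X, Y) :- {', '.join(body)}."
    # Hypothetical annotation carrying the uncertainty attached to the link key.
    return rule if confidence is None else f"{confidence} :: {rule}"

rule = link_key_to_rule(
    [("title", "titre"), ("creator", "auteur")], "Book", "Livre", confidence=0.9
)
print(rule)
```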

The theoretical outcomes of ELKER will be implemented and integrated in software maintained by the partners, and these tools will be connected together. They will be distributed as open source software. Moreover, the designed methods and tools will be evaluated both on purpose-built benchmarks that test the unique aspects of link keys and on real-world datasets.

The ELKER consortium comprises three complementary teams specialising in data interlinking and semantic web technologies and models, formal concept analysis, and reasoning in description logics.

Project coordination

Manuel Atencia (Laboratoire d'Informatique de Grenoble)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

LIG Laboratoire d'Informatique de Grenoble
LIASD Laboratoire d’Informatique Avancée de Saint-Denis
UMR 1142 Laboratoire d’Informatique Médicale et d'Ingénierie des connaissances E-santé
Inria Nancy Grand Est Centre de Recherche Inria Nancy - Grand Est

ANR funding: 500,976 euros
Beginning and duration of the scientific project: September 2017 - 48 months
