DS0705 - Fondements du numérique 2014

Aggregation Queries – Aggreg

AGGREG

Aggregation Queries.

Aggregation Queries

The main goal of the Aggreg project is to develop efficient algorithms for answering aggregate queries for databases and data streams of various kinds. Aggregate queries are central for computing statistics on various data collections such as Rdf stores, NoSQL databases, streams of data trees in Json or Xml for- mat, relational databases, and datawarehouses. The problem to answer aggregate queries is more difficult than counting problems that are known to be computationally hard. This precludes from general and efficient solutions. Instead, we propose to: study the complexity of fragments of the class of aggregate queries, search for efficient algorithms for tractable fragments, find general online and of- fline algorithms that are gracefully degrading, and also efficient approximation algorithms. We apply methods from algebra, automata, probability, algorithmics and complexity theory.

Algebra, automata, probability, algorithmics and complexity theory.

State of the art
1 Aggregate queries for databases
2 Tractable query classes
3 Aggregation in constraint satisfaction
4 Online query answering on data streams
5 Probabilistic approximation algorithms
6 Aggregate path queries
7 Aggregation over incomplete data

Results

Working Packages, Tasks, and Deliverables
P1 Offline aggregation algorithms
P2 Online aggregation algorithms
P3 Aggregation complexity

Prospects

Project is running.

Scientific productions and patents

P1:
Jcss’14 [15], Tcs’14 [16], Stacs’15 [8], Sat’15 [6], Sat’14 [9], PhD thesis Capelli. Icalp’15 [3], Pods’16 [4], Cikm’15 [1], Icde’16 [5], Edbt’16 [2],
P2:
Tcs’15 [14], Sofsem’16 [23], Master thesis Sakho, QuiXPath tool, PhD thesis Sebastian [22]. PhD project Sakho (2016-19). Lata’15 [21]. BigData’15 [13], Esa’15 [18], Tcs’16 [12]. Lata’15 [10],
P3:
Jcss’16 [19], Master thesis Blind. PhD project Vigny (2015-18).
Stacs’16 [20], Pods’14 [17]. Ijcai’16 [7].

Submission summary

The main goal of the Aggreg project is to develop efficient algorithms
for answering aggregate queries for databases and data streams of
various kinds. Aggregate queries are central for computing statistics on data
collections: Rdf stores, NoSql databases, streams of data trees in Xml
or Json format, uncertain databases, relational databases, and datawarehouses.

Considering that counting is the basis of aggregate queries the principal difficulty here
is to overcome the inherent computational hardness of many
counting problems, which precludes general and efficient solutions.
Instead, we propose to:
study the complexity of expressive fragments of the class of aggregate queries,
search for efficient algorithms on tractable fragments,
identify which parameters can be fixed in order to obtain tractability,
find general algorithms that are gracefully degrading, and also efficient approximation algorithms.
We apply methods from algebra, automata, probability, algorithmics and complexity theory.

Joachim NIEHREN (Institut National de Recherche en Informatique et en Automatique (INRIA) Centre Lille - Nord Europe)

The author of this summary is the project coordinator, who is responsible for the content of this summary. The ANR declines any responsibility as for its contents.

LIF Laboratoire d'Informatique Fondamentale
INRIA Centre Lille-Nord Europe Institut National de Recherche en Informatique et en Automatique (INRIA) Centre Lille - Nord Europe
IMJ-PRG Institut de Mathématiques de Jussieu-Paris Rive Gauche

Help of the ANR 423,587 euros
Beginning and duration of the scientific project: September 2014 - 48 Months

Explorez notre base de projets financés

ANR makes available its datasets on funded projects, click here to find more.