CE25 - Mathématiques – Informatique théorique

# Aggregation Queries – Aggreg

## AGGREG

Aggregation Queries.

### Aggregation Queries

The main goal of the Aggreg project is to develop efficient algorithms for answering aggregate queries for databases and data streams of various kinds. Aggregate queries are central for computing statistics on various data collections such as Rdf stores, NoSQL databases, streams of data trees in Json or Xml for- mat, relational databases, and datawarehouses. The problem to answer aggregate queries is more difficult than counting problems that are known to be computationally hard. This precludes from general and efficient solutions. Instead, we propose to: study the complexity of fragments of the class of aggregate queries, search for efficient algorithms for tractable fragments, find general online and of- fline algorithms that are gracefully degrading, and also efficient approximation algorithms. We apply methods from algebra, automata, probability, algorithmics and complexity theory.

State of the art
1 Aggregate queries for databases
2 Tractable query classes
3 Aggregation in constraint satisfaction
4 Online query answering on data streams
5 Probabilistic approximation algorithms
6 Aggregate path queries
7 Aggregation over incomplete data

P1 Offline aggregation algorithms
P2 Online aggregation algorithms
P3 Aggregation complexity

Project is running.

P1:
Jcss’14 [15], Tcs’14 [16], Stacs’15 [8], Sat’15 [6], Sat’14 [9], PhD thesis Capelli. Icalp’15 [3], Pods’16 [4], Cikm’15 [1], Icde’16 [5], Edbt’16 [2],
P2:
Tcs’15 [14], Sofsem’16 [23], Master thesis Sakho, QuiXPath tool, PhD thesis

The main goal of the Aggreg project is to develop efficient algorithms
for answering aggregate queries for databases and data streams of
various kinds. Aggregate queries are central for computing statistics on data
collections: Rdf stores, NoSql databases, streams of data trees in Xml
or Json format, uncertain databases, relational databases, and datawarehouses.

Considering that counting is the basis of aggregate queries the principal difficulty here
is to overcome the inherent computational hardness of many
counting problems, which precludes general and efficient solutions.
study the complexity of expressive fragments of the class of aggregate queries,
search for efficient algorithms on tractable fragments,
identify which parameters can be fixed in order to obtain tractability,
find general algorithms that are gracefully degrading, and also efficient approximation algorithms.
We apply methods from algebra, automata, probability, algorithmics and complexity theory.

## Project coordinator

Institut National de Recherche en Informatique et en Automatique (INRIA) Centre Lille - Nord Europe (Laboratoire public)

The author of this summary is the project coordinator, who is responsible for the content of this summary. The ANR declines any responsibility as for its contents.

## Partner

Laboratoire d'Informatique Fondamentale
Institut National de Recherche en Informatique et en Automatique (INRIA) Centre Lille - Nord Europe
Institut de Mathématiques de Jussieu-Paris Rive Gauche

Help of the ANR 423,587 euros
Beginning and duration of the scientific project: September 2014 - 48 Months