Parameter ANalysis of certain classes of DAGs – PANDAG
Directed Acyclic Graphs (DAGs): modeling and property analysis
This project proposes a combinatorial study of computer science data structures modeled by directed acyclic graphs (DAGs). We focus on the analysis of objects arising from compaction procedures, particularly Boolean circuits and decision diagrams. The main challenge consists in specifying new combinatorial objects and extending generating function algebra to these new contexts.
Understanding DAGs and their fundamental roles in computer science
Directed acyclic graphs (DAGs) constitute a fundamental data structure in computer science, ubiquitous in numerous domains: scheduling problems, persistent data structures with sharing, collaborative version control systems. Despite their crucial importance, the combinatorial analysis of these structures remains complex and presents significant methodological challenges. Current problems: The counting of DAGs with a given number of vertices was solved in the 70s for two models (labeled or unlabeled nodes), but existing approaches present important limitations. Methods based on the inclusion-exclusion principle, although theoretically efficient, lead to inefficient random sampling algorithms due to the numerous rejections required. More recent approaches using graphic generating functions, although promising, still require recourse to the inclusion-exclusion principle, limiting their practical applicability. Project objectives: Our project aims to develop a constructive approach for DAG analysis, avoiding the drawbacks of the inclusion-exclusion principle. We focus on the study of compacted tree structures and have developed advanced tools to analyze DAG structures obtained by compacting trees using common substructure sharing. The main objectives include: 1. Development of advanced combinatorial tools: Creation of new analysis methods using analytic and enumerative combinatorics to structurally describe complex combinatorial objects. 2. Specification of new objects: Identification and formalization of new classes of combinatorial objects emerging from compaction procedures, particularly Boolean circuits and decision diagrams. 3. Extension of generating function algebra: Adaptation and extension of existing tools to handle these new combinatorial contexts, enabling finer analysis of structural properties. 4. Efficient sampling algorithms: Development of constructive algorithms for uniform DAG sampling, bypassing the rejection problems of classical methods. 5. Analysis of universal properties: Study of emergent properties of large random structures in different computer science contexts, revealing universal characteristics of data structures.
Project synthesis: Combinatorial analysis of compacted data structures
This project proposes an innovative combinatorial approach for studying four families of data structures obtained through compaction.
1. Boolean circuits
The study of Boolean functions represented by compacted circuits constitutes a major challenge. Unlike classical tree-like formulas, circuits allow substructure sharing, creating complex directed acyclic graphs. Central questions concern the probabilistic distribution of functions: probability of obtaining a tautology, persistence of the Shannon effect. Initial results suggest different behavior from tree models. The challenge lies in developing combinatorial tools adapted to sharing-based models.
2. Binary decision diagrams
These structures revolutionize Boolean function representation through decision tree compaction. The main combinatorial challenge consists in establishing a generating function requiring complete profile memorization. A recent approach uses inclusion-exclusion to avoid this complexity. Objectives include efficient uniform sampling and analysis of cryptographic properties of obtained functions.
3. Increasing DAG structures
Extension of recursive trees, these structures relax labeling constraints by allowing label repetitions. This model generalizes phylogenetic processes where multiple branches divide simultaneously. The approximate Borel transform solves complex functional equations. Challenges include extension through subtree substitution and adaptation to non-planar phylogenetic models.
4. Exploratory extensions
Two directions emerge: knowledge compilation and textual algorithms. d-DNNF languages in artificial intelligence represent knowledge bases with determinism constraints. In the textual domain, compacted trie trees reveal economies through suffix sharing. DAWG structures present complex dependencies between suffixes, constituting major combinatorial challenges.
Impact
This project unifies the analysis of distinct structures under the compaction prism. The developed tools will revolutionize theoretical understanding with impacts in algorithmics, cryptography, artificial intelligence and bioinformatics.
Currently a Franco-Austrian team of 3 researchers, one PhD student and one PhD candidate are working on modeling for point 1.
A first result has just been submitted for publication for point 2. This is the article (and associated code) presented in this repository: hal.sorbonne-universite.fr/hal-05183438v1.
The two researchers, associated with a PhD student, are exploiting these quantitative results to define random generation and unranking algorithms. The tools are being finalized and are being written up.
Concerning point 3, a postdoctoral researcher as well as two project researchers are working in two different directions: one to revisit the context of binary trees constructed via growth processes and the other to analyze layered DAGs arising from shortest path substructures in graphs.
Point 4. is currently glimpsed in the perspectives of the results being finalized for point 2.
All our advances rely on generation and simulation tools that we offer for download accompanying our scientific articles.
We met in May 2025 to compare our various research paths and unify our approaches: pandag.proj.lip6.fr/workshop-2025/
The initial results are promising, and we are confident in being able to present major advances for each of the 4 points we raised when writing the project.
Several bilateral France/Austria meetings, but also Caen/Paris and Paris/Paris meetings are planned during autumn 2025.
Directed acyclic graphs are important due to their close relation to partial orders, making them very general structures. Families of DAGs have long been appearing in computer science when compacting tree-structures sharing subtrees. First properties were found in the 1970s, in particular on the enumeration of DAGs. Our expertise lies in analysis of algorithms using Analytic Combinatorics in order to structurally describe combinatorial objects, to find new specifications and to exhibit universal properties of data structures. In this project we aim at studying tree structures and their associated DAGs induced by common substructures. We plan to extend the general
methodology to these DAGs, following three main directions:
(i) Quantitative analysis of the compaction of tries.
(ii) Typical structure of large DAGs whose decompaction relies on large binary trees.
(iii) Enumeration, random sampling and satisfiability of decision diagrams (ROBDDs, ZDDs, QDDs...) and Boolean circuits.
Project coordination
Antoine GENITRINI (LIP6)
The author of this summary is the project coordinator, who is responsible for the content of this summary. The ANR declines any responsibility as for its contents.
Partnership
GREYC Groupe de recherche en Informatique, Image, Automatique et Instrumentation de Caen
LIP6 LIP6
LIPN Laboratoire d'Informatique de Paris-Nord
Help of the ANR 295,534 euros
Beginning and duration of the scientific project:
February 2024
- 36 Months