D-DAL – Publications

Journal articles 2026

Corrections to “On the data complexity of consistent query answering over graph databases [Journal of Computer and System Sciences 88 (2017) 164–194]”

Pablo Barceló, Gaëlle Fontaine, Sylvain Salvati, Sophie Tison

Abstract

Applications of graph databases are prone to inconsistency due to interoperability issues. This raises the need for studying query answering over inconsistent graph databases in a simple but general framework. We follow the approach of consistent query answering (CQA), and study its data complexity over graph databases for conjunctive regular-path queries (CRPQs) and conjunctive regular-path constraints (CRPCs). We deal with subset, superset and symmetric-difference repairs. Without restrictions, CQA is undecidable for the semantics of superset- and symmetric-difference repairs, and _2^P-complete for subset-repairs. However, we identify restrictions on CRPCs and databases that lead to decidability, and even tractability of CQA.

Preprints, Working Papers, … 2025

Cutwidth Bounds via Vertex Partitions

Antoine Amarilli, Benoît Groz

HAL arXiv PDF

Abstract

We study the cutwidth measure on graphs and ways to bound the cutwidth of a graph by partitioning its vertices. We consider bounds expressed as a function of two quantities: on the one hand, the maximal cutwidth y of the subgraphs induced by the classes of the partition, and on the other hand, the cutwidth x of the quotient multigraph obtained by merging each class to a single vertex. We consider in particular the decomposition of directed graphs into strongly connected components (SCCs): in this case, y is the maximal cutwidth of an SCC, and x is the cutwidth of the directed acyclic condensation multigraph. We show that the cutwidth of a graph is always in O(x + y), specifically it can be upper bounded by 1.5x + y. We also show a lower bound justifying that the constant 1.5 cannot be improved in general.

Preprints, Working Papers, … 2025

Locality Testing for NFAs is PSPACE-complete

Antoine Amarilli, Mikaël Monet, Rémi de Pretto

HAL arXiv PDF

Preprints, Working Papers, … 2025

Confluence of the Node-Domination and Edge-Domination Hypergraph Rewrite Rules

Antoine Amarilli, Mikaël Monet, Rémi de Pretto

HAL arXiv PDF

Conference papers 2025

Shape Expressions with Inheritance

Iovka Boneva, Jose Emilio Labra Gayo, Eric Prud’hommeaux, Katherine Thornton, Andra Waagmeester

DOI HAL arXiv PDF

Abstract

We formally introduce an inheritance mechanism for the Shape Expressions language (ShEx). It is inspired by inheritance in object-oriented programming languages, and provides similar advantages such as reuse, modularity, and more flexible data modelling. Using an example, we explain the main features of the inheritance mechanism. We present its syntax and formal semantics. The semantics is an extension of the semantics of ShEx 2.1. It also directly yields a validation algorithm as an extension of the previous ShEx validation algorithms, while maintaining the same algorithmic complexity.

Conference papers 2025

Common Foundations for SHACL, ShEx, and PG-Schema

Shqiponja Ahmetaj, Iovka Boneva, Jan Hidders, Katja Hose, Maxime Jakubowski, Jose Emilio Labra Gayo, Wim Martens, Fabio Mogavero, Filip Murlak, Cem Okulmus, Axel Polleres, Ognjen Savković, Mantas Šimkus, Dominik Tomaszuk

DOI HAL arXiv PDF

Abstract

Graphs have emerged as a foundation for a variety of applications, including capturing factual knowledge, semantic data integration, social networks, and informing machine learning algorithms. Formalising properties of the data and ensuring data quality requires describing schemas of such graphs. Driven by diverse applications, the Semantic Web and database communities developed not only different graph data models-RDF and property graphs-but also different graph schema languages-SHACL, ShEx, and PG-Schema. Each language has its unique approach to defining constraints and validating graph data, leaving potential users in the dark about their commonalities and differences. In this paper, we provide concise formal definitions of the core components of these languages, employ a uniform framework to facilitate a comprehensive comparison between them, and identify a common set of functionalities, shedding light on both overlapping and distinctive features.

Preprints, Working Papers, … 2025

Vizitig: a pangenome and pantranscriptome explorer

Bastien Degardins, Charles Paperman, Camille Marchet

DOI HAL PDF

Abstract

Vizitig is the first platform for real-time exploration and querying of DNA and RNA sequence graphs across many samples, unifying visualization, metadata, and flexible search. It integrates raw and reference-based data, handles complex variation, and provides a human-readable query language with scalable graph loading. Vizitig enables fast, reproducible analysis in both pangenomics and pantranscriptomics.

Conference papers 2025

Linear Time Subsequence and Supersequence Regex Matching

Antoine Amarilli, Florin Manea, Tina Ringleb, Markus Schmid

DOI HAL arXiv PDF

Abstract

It is well-known that checking whether a given string w matches a given regular expression r can be done in quadratic time O(|w|⋅ |r|) and that this cannot be improved to a truly subquadratic running time of O((|w|⋅ |r|)^{1-ε}) assuming the strong exponential time hypothesis (SETH). We study a different matching paradigm where we ask instead whether w has a subsequence that matches r, and show that regex matching in this sense can be solved in linear time O(|w| + |r|). Further, the same holds if we ask for a supersequence. We show that the quantitative variants where we want to compute a longest or shortest subsequence or supersequence of w that matches r can be solved in O(|w|⋅ |r|), i. e., asymptotically no worse than classical regex matching; and we show that O(|w| + |r|) is conditionally not possible for these problems. We also investigate these questions with respect to other natural string relations like the infix, prefix, left-extension or extension relation instead of the subsequence and supersequence relation. We further study the complexity of the universal problem where we ask if all subsequences (or supersequences, infixes, prefixes, left-extensions or extensions) of an input string satisfy a given regular expression.

Conference papers 2025

Dynamic Membership for Regular Tree Languages

Antoine Amarilli, Corentin Barloy, Louis Jachiet, Charles Paperman

DOI HAL arXiv PDF

Abstract

We study the dynamic membership problem for regular tree languages under relabeling updates: we fix an alphabet Σ and a regular tree language L over Σ (expressed, e.g., as a tree automaton), we are given a tree T with labels in Σ, and we must maintain the information of whether the tree T belongs to L while handling relabeling updates that change the labels of individual nodes in T. Our first contribution is to show that this problem admits an O(log n / log log n) algorithm for any fixed regular tree language, improving over known O(log n) algorithms. This generalizes the known O(log n / log log n) upper bound over words, and it matches the lower bound of Ω(log n / log log n) from dynamic membership to some word languages and from the existential marked ancestor problem. Our second contribution is to introduce a class of regular languages, dubbed almost-commutative tree languages, and show that dynamic membership to such languages under relabeling updates can be decided in constant time per update. Almost-commutative languages generalize both commutative languages and finite languages: they are the analogue for trees of the ZG languages enjoying constant-time dynamic membership over words. Our main technical contribution is to show that this class is conditionally optimal when we assume that the alphabet features a neutral letter, i.e., a letter that has no effect on membership to the language. More precisely, we show that any regular tree language with a neutral letter which is not almost-commutative cannot be maintained in constant time under the assumption that the prefix-U1 problem from [Antoine Amarilli et al., 2021] also does not admit a constant-time algorithm.