Arrow Research search

Author name cluster

Victor Marsault

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
1 author row

Possible papers


KR Conference 2023 Conference Paper

Run-Based Semantics for RPQs

  • Claire David
  • Nadime Francis
  • Victor Marsault

RPQs (regular path queries) are an important building block of most query languages for graph databases. They are generally evaluated under homomorphism semantics; in particular, only the endpoints of the matched walks are returned. However, practical applications often need the full matched walks to compute aggregate values. In those cases, homomorphism semantics are not suitable since the number of matched walks can be infinite. Hence, graph-database engines adapt the semantics of RPQs, often neglecting theoretical red flags. For instance, the popular query language Cypher uses trail semantics, which ensures that the result is finite at the cost of making computational problems intractable. We propose a new kind of semantics for RPQs, including in particular simple-run and binding-trail semantics, as a candidate to reconcile theoretical considerations with practical aspirations. Both ensure that the output is finite in a way that is compatible with homomorphism semantics: projection on endpoints coincides with homomorphism semantics. Hence, testing the emptiness of the result is tractable, and known methods readily apply. Moreover, simple-run and binding-trail semantics support bag semantics, and enumeration of the bag of results is tractable.
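To make the endpoint-only view concrete, here is a minimal Python sketch of evaluating an RPQ under homomorphism semantics by a breadth-first search over pairs (graph node, automaton state). It is not taken from the paper: the function name rpq_endpoints, the edge-triple encoding of the graph, and the dictionary encoding of the automaton are all assumptions made for illustration. The example graph has a cycle between x and y, so infinitely many walks match a*b, yet the set of endpoint pairs stays finite.

```python
from collections import deque


def rpq_endpoints(edges, nfa_delta, nfa_start, nfa_finals):
    """Return all (source, target) pairs joined by a walk whose label word
    is accepted by the automaton (hypothetical helper, illustration only).

    edges:      list of (source, label, target) triples
    nfa_delta:  dict mapping (state, label) to a set of successor states
    """
    out = {}
    nodes = set()
    for u, label, v in edges:
        out.setdefault(u, []).append((label, v))
        nodes.update((u, v))

    answers = set()
    for src in nodes:
        # Each (node, state) pair is explored at most once, so the search
        # terminates even when the graph has cycles.
        seen = {(src, nfa_start)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in nfa_finals:
                answers.add((src, node))
            for label, nxt in out.get(node, []):
                for nstate in nfa_delta.get((state, label), ()):
                    if (nxt, nstate) not in seen:
                        seen.add((nxt, nstate))
                        queue.append((nxt, nstate))
    return answers


# Assumed toy data: a cycle x -a-> y -a-> x, plus an exit edge y -b-> z.
edges = [("x", "a", "y"), ("y", "a", "x"), ("y", "b", "z")]
# Automaton for the regular expression a*b: state 0 is initial, 1 is final.
delta = {(0, "a"): {0}, (0, "b"): {1}}

answers = rpq_endpoints(edges, delta, 0, {1})
print(sorted(answers))  # [('x', 'z'), ('y', 'z')]
```

The search visits each (node, state) pair once per source, which is why returning only endpoints remains cheap even though the underlying set of matched walks is infinite here.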

Highlights Conference 2022 Conference Abstract

Simple-Run Semantics for RPQs

  • Victor Marsault

In database theory, RPQs (regular path queries) are the building block of most query languages for graph databases. RPQs are generally evaluated under homomorphism semantics; in particular, only the endpoints of the matched walks are returned. In contrast, due to user pressure, most real graph-database engines actually return the full matched walks. Under homomorphism semantics, there might be an infinite number of such walks. Hence each real query language had to adapt the semantics of RPQs in order to meet this popular demand, often neglecting theoretical implications. For instance, the most popular query language, Cypher, uses trail semantics: only walks with no repeated edges are returned. In that case, the result set is indeed finite, but even the simplest computational problems are intractable. We propose new semantics for RPQs, called simple-run semantics, as a candidate to reconcile theoretical considerations with practical aspirations. Like trail semantics, simple-run semantics aims at keeping the output finite by filtering out redundant results. Trail semantics filters based on redundancy in the computed walk: repeated edges are forbidden. Simple-run semantics filters based on redundancy in the run: a node can be reused only if the query computation has progressed compared to the previous times the node was visited. Joint work with Claire David and Nadime Francis.
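The trail filter described above (no edge used twice) can be sketched as a backtracking enumeration. This is an illustrative Python sketch only, not the authors' algorithm and not how Cypher is implemented: the name rpq_trails, the edge-triple graph encoding, and the dictionary automaton encoding are assumptions. The sketch tracks the set of reachable automaton states along each walk, so every matching trail is reported once (set semantics rather than bag semantics). It does not attempt simple-run semantics, whose precise definition is not given in the abstract.

```python
def rpq_trails(edges, nfa_delta, nfa_start, nfa_finals):
    """Enumerate matched walks that never repeat an edge (trail semantics).

    Hypothetical helper for illustration. Returns a list of
    (source, walk) pairs, where walk is a tuple of edge triples.
    """
    out = {}
    nodes = set()
    for idx, (u, label, v) in enumerate(edges):
        out.setdefault(u, []).append((idx, label, v))
        nodes.update((u, v))

    results = []

    def extend(src, node, states, used, walk):
        # Report the current walk if some automaton run accepts it.
        if states & nfa_finals:
            results.append((src, tuple(walk)))
        for idx, label, nxt in out.get(node, []):
            if idx in used:
                continue  # trail condition: each edge at most once
            nstates = set()
            for s in states:
                nstates |= nfa_delta.get((s, label), set())
            if nstates:
                used.add(idx)
                walk.append(edges[idx])
                extend(src, nxt, nstates, used, walk)
                walk.pop()
                used.remove(idx)

    for src in sorted(nodes):
        extend(src, src, {nfa_start}, set(), [])
    return results


# Assumed toy data: a cycle x -a-> y -a-> x, plus an exit edge y -b-> z.
edges = [("x", "a", "y"), ("y", "a", "x"), ("y", "b", "z")]
# Automaton for the regular expression a*b: state 0 is initial, 1 is final.
delta = {(0, "a"): {0}, (0, "b"): {1}}

trails = rpq_trails(edges, delta, 0, {1})
for src, walk in trails:
    print(src, walk)
```

On this toy graph the enumeration is finite because each walk can use each of the three edges at most once, whereas without the filter the cycle between x and y would yield infinitely many matching walks. The backtracking search may take exponential time on larger graphs, which is consistent with the intractability the abstract mentions.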