Author name cluster

Inge Li Gørtz

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers

2 author rows

SODA Conference 2024 Conference Paper

Sparse Regular Expression Matching

Philip Bille
Inge Li Gørtz

A regular expression specifies a set of strings formed by single characters combined with concatenation, union, and Kleene star operators. Given a regular expression R and a string Q, the regular expression matching problem is to decide if Q matches any of the strings specified by R. Regular expressions are a fundamental concept in formal languages and regular expression matching is a basic primitive for searching and processing data. A standard textbook solution [Thompson, CACM 1968] constructs and simulates a nondeterministic finite automaton, leading to an O(nm) time algorithm, where n is the length of Q and m is the length of R. Despite considerable research efforts only polylogarithmic improvements of this bound are known. Recently, conditional lower bounds provided evidence for this lack of progress when Backurs and Indyk [FOCS 2016] proved that, assuming the strong exponential time hypothesis (SETH), regular expression matching cannot be solved in O (( nm ) 1-ɛ ), for any constant ɛ > 0. Hence, the complexity of regular expression matching is essentially settled in terms of n and m. In this paper, we take a new approach and introduce a density parameter, Δ, that captures the amount of nondeterminism in the NFA simulation on Q. The density is at most nm + 1 but can be significantly smaller. Our main result is a new algorithm that solves regular expression matching in time. This essentially replaces nm with Δ in the complexity of regular expression matching. We complement our upper bound by a matching conditional lower bound that proves that we cannot solve regular expression matching in time O (Δ 1-ɛ ) for any constant ɛ > 0 assuming SETH. The key technical contribution in the result is a new linear space representation of the classic position automaton that supports fast state-set transition computation in near-linear time in the size of the input and output state sets. To achieve this we develop several new insights and techniques of independent interest, including new structural properties of the parse trees of regular expression, a decomposition of state-set transitions based on parse trees, and a fast batched predecessor data structure. * The full version of the paper can be accessed at https: //arxiv. org/abs/1907. 04752

Details

TCS Journal 2022 Journal Article

Partial sums on the ultra-wide word RAM

Philip Bille
Inge Li Gørtz
Frederik Rye Skjoldjensen

Details DOI

TCS Journal 2022 Journal Article

String indexing for top-k close consecutive occurrences

Philip Bille
Inge Li Gørtz
Max Rishøj Pedersen
Eva Rotenberg
Teresa Anna Steiner

Details DOI

MFCS Conference 2019 Conference Paper

From Regular Expression Matching to Parsing

Philip Bille
Inge Li Gørtz

Given a regular expression R and a string Q, the regular expression parsing problem is to determine if Q matches R and if so, determine how it matches, e. g. , by a mapping of the characters of Q to the characters in R. Regular expression parsing makes finding matches of a regular expression even more useful by allowing us to directly extract subpatterns of the match, e. g. , for extracting IP-addresses from internet traffic analysis or extracting subparts of genomes from genetic data bases. We present a new general techniques for efficiently converting a large class of algorithms that determine if a string Q matches regular expression R into algorithms that can construct a corresponding mapping. As a consequence, we obtain the first efficient linear space solutions for regular expression parsing.

Details

TCS Journal 2018 Journal Article

Time–space trade-offs for Lempel–Ziv compressed indexing

Philip Bille
Mikko Berggren Ettienne
Inge Li Gørtz
Hjalte Wedel Vildhøj

Details DOI

TCS Journal 2016 Journal Article

Longest common extensions in trees

Philip Bille
Paweł Gawrychowski
Inge Li Gørtz
Gad M. Landau
Oren Weimann

Details DOI

TCS Journal 2014 Journal Article

Compact q-gram profiling of compressed strings

Philip Bille
Patrick Hagge Cording
Inge Li Gørtz

Details DOI

TCS Journal 2012 Journal Article

String matching with variable length gaps

Philip Bille
Inge Li Gørtz
Hjalte Wedel Vildhøj
David Kofoed Wind

Details DOI

TCS Journal 2006 Journal Article

Asymmetry in k -center variants

Inge Li Gørtz
Anthony Wirth

Details DOI