Arrow Research search

Author name cluster

Thomas Colcombet

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers
2 author rows

Possible papers

25

MFCS Conference 2025 Conference Paper

Lambdas, Transducers and MSO (Invited Talk)

  • Thomas Colcombet

This talk will revolve around the classical question: What kind of maps from words to words or trees to trees can be considered as "well behaved" from an automata theoretic point of view? What means to be well-behaved is left unspecified on purpose, but one thing is sure is that one would expect as minimal requirement that the inverse image of a regular language under such a map be effectively regular. One would also expect to have nice closure properties for these maps, in particular under composition. Ideally, we would also like to have several different equivalent computation models for representing them. This talk advertises the use of simply-typed lambda calculus as a good way to approach this question, and presents newer results concerning translations from and to exponential growth maps. This line of research is rooted in a long literature, starting from the works of Damm on higher order grammars, and the works of Engelfriet and Vogler on macro-tree transducers, and more generally from works on higher-order tree transducers, as well as the results of Ong et al. on decidability of MSO over higher-order recursion schemes (a model that produces infinite trees). These works show in a broad sense that models of computation defined in simply typed lambda-calculus do preserve regularity under taking the inverse image. Another connection is the recent implicit automata research programme initiated by Nguy~ên and Pradic, that was inspired by the seminal work of Hillebrand and Kanellakis. Here, automata models a directly viewed as programs written in some form of typed lambda-calculus, using Church encoding for representing inputs and outputs. When appropriately controlling the type system various it is possible to characterize various classes of finite state transducers studied in the literature. A last approach comes from finite model-theory. In this context, Monadic Second-Order Logic (MSO for short, the extension of first-order in which one can further quantify over sets) is the prime logic for describing regular languages over words or trees. This logic can also be used for defining transformations from words/trees to words/trees. These are the notions of MSO-interpretations or MSO-transducers. A major contribution here is the study of the expressive power of MSO-interpretations of polynomial growth from words to words that was undertaken by Bojańczyk, Kiefer and Lhote, and shown equivalent to other formalisms of transducers and programming languages. This is the class of polyregular functions, and it satisfies all the expectation of the question. Yet it is limited to finite words as inputs and outputs, and to polynomial growth. In a first part of the talk, we shall see how to use simply-typed lambda-calculus as a general mechanism for transforming trees into trees, and how it gives a positive answer to the original question. This description essentially repackages many ideas existing from the literature, in a way similar Gallot’s thesis. It also introduces a few new ones. In particular, we shall see generic results that some clearly identified classes of functional programs describing transformations from trees to trees can be effectively compiled into transducer-like models which perform only one pass on the input (i. e. higher-order tree transducers). This class of programs is also closed under composition. We shall also see how the logic MSO itself can be natively embedded in such models of computation. In a second part, we will see how this presentation relates to recent works with Lhote, Nguy~ên and Ohlmann in which extensions of polyregular functions to maps from words to words of exponential growth are studied. This work has the particularity to be the first one in the model-theoretic side of this theory in which it is key to allow lambda-terms to be unsafe.

CSL Conference 2025 Conference Paper

On the Expansion of Monadic Second-Order Logic with Cantor-Bendixson Rank and Order Type Predicates

  • Thomas Colcombet
  • Alexander Rabinovich

In this work, we consider two extensions of monadic second-order logic, and study in what cases the classical decidability results are preserved. The first extension, MSO[CBrank_β], is MSO (over the signature of the binary tree) augmented with the extra ability to express that the subtree over a set X has Cantor-Bendixson rank β, for some fixed countable ordinal β. We show that this extension is decidable over the binary tree if and only if β is finite, which means that it is decidable if and only if it is equivalent in expressiveness to MSO. The second extension, MSO[otp_α], is MSO (over the signature of order) augmented with the extra ability to express that the suborder induced by a set X has order type α for some fixed countable ordinal α. We show that this extension is decidable over countable ordinals if and only if α < ω^ω, which means that it is decidable if and only if it is equivalent in expressiveness to MSO. The first result can be established as a consequence of the second. The second result relies on the undecidability results of the logic BMSO (itself relying on the undecidability of MSO+U) in the case of ω^β for β a limit ordinal, and on entirely new techniques when β is a successor ordinal. We also have some partial extensions of the second result to some uncountable cases.

Highlights Conference 2023 Conference Abstract

Cascades of condensations

  • Thomas Colcombet

This is joint worked with Adsul Bharat, thanks to the Highlights Extended Stay Support Scheme 2023. We show that all regular languages can be described by applying a sequence of simple so-called ``condensation maps. ''A condensation map consists, given an input word, in selecting maximal factors that belong to a condensation language, and evaluate all of them in parallel. A condensation language is a language for which it is guaranteed that in the above definition maximal factors cannot overlap: it is assumed to be infix closed and to have the property that if $ua$ and $av$ are in the language, then $uav$ also is. The simple condensation maps correspond to very simple cases. This result captures classical proof techniques in the study of regular languages and exhibit similarities with the Krohn-Rhodes' theorem and the theorem of factorisation forest of Simon. Contributed talk given by Thomas Colcombet

MFCS Conference 2022 Conference Paper

A Complexity Approach to Tree Algebras: the Polynomial Case

  • Thomas Colcombet
  • Arthur Jaquard

In this paper, we consider infinitely sorted tree algebras recognising regular language of finite trees. We pursue their analysis under the angle of their asymptotic complexity, i. e. the asymptotic size of the sorts as a function of the number of variables involved. Our main result establishes an equivalence between the languages recognised by algebras of polynomial complexity and the languages that can be described by nominal word automata that parse linearisation of the trees. On the way, we show that for such algebras, having polynomial complexity corresponds to having uniformly boundedly many orbits under permutation of the variables, or having a notion of bounded support (in a sense similar to the one in nominal sets). We also show that being recognisable by an algebra of polynomial complexity is a decidable property for a regular language of trees.

Highlights Conference 2022 Conference Abstract

Uniformization of MSO over Countable Ordinals

  • Thomas Colcombet

Joint work with Alex Rabinovich. The question of uniformization consists, given a formula $\Psi(X, \bar Y)$, in constructing a formula $\Psi^*(X, \bar Y)$ such that, for all valuations of $\bar Y$, if there exists some $X$ such that $\Psi(X, \bar Y)$ holds, then there exists a unique $X$ such that $\Psi^*(X, \bar Y)$, and $\Psi(X, \bar Y)$. It is known, from [Lifsches and Shelah 98], that uniformization is not possible on the ordinal $\omega^\omega$. The non-uniformizable property $\Phi_\omega(X)$ states that $X$ has order type $\omega$ and is cofinal in the universe of the chain. In this work, we establish two results: (1) MSO can be uniformized over all countable ordinals in the extended logic MSO+$u_\omega$. There, $u_\omega(Y)$ is a new construct which, given a set $Y$, outputs a set $X: =u_{\omega}(Y)\subseteq Y$ of order-type $\omega$ and cofinal in $Y$. In some sense, this result shows that the counter-example of [Lifsches and Shelah 98] is the maximal obstruction that prevents MSO from being uniformizable over countable ordinals. (2) It is decidable, given a monadic formula, whether it can be uniformized over all ordinals. The results relies on the composition method / the algebraic theory of monadic theories over chains. On the algebraic side, it relies on classical tool for semigroups and monoids that are Green's relations and factorization forests (in a slightly non standard version which is definable over ordinals).

CSL Conference 2021 Conference Paper

Learning Automata and Transducers: A Categorical Approach

  • Thomas Colcombet
  • Daniela Petrisan
  • Riccardo Stabile

In this paper, we present a categorical approach to learning automata over words, in the sense of the L*-algorithm of Angluin. This yields a new generic L*-like algorithm which can be instantiated for learning deterministic automata, automata weighted over fields, as well as subsequential transducers. The generic nature of our algorithm is obtained by adopting an approach in which automata are simply functors from a particular category representing words to a "computation category". We establish that the sufficient properties for yielding the existence of minimal automata (that were disclosed in a previous paper), in combination with some additional hypotheses relative to termination, ensure the correctness of our generic algorithm.

Highlights Conference 2018 Conference Abstract

Universal parity graphs and lower bounds for parity games

  • Thomas Colcombet

ABSTRACT. This is a collaboration with Nathanaël Fijalkow. Fix a number of priorities d. A "parity graph" is a (d-)parity labelled graph such that all cycles have an even maximal priority (said differently, a parity game played by Adam only, in which Eve wins). It is "n-universal" if it contains all size-n parity graphs as subgraphs. In this talk, I will explain how these universal parity graphs provide neat explanations of all the subexponential algorithms for solving parity games in the literature. In particular, in each of these algorithms, one can find in its structure a universal parity graph, and conversely, one can devise a variant of the algorithm for each choice of a universal parity graph. As a consequence, it explains the recent result of Czerwinski, Fijalkow, Jurdzinski, Lazic and Parys, and shows that all these algorithms face the same difficulty if one attempts to further optimization.

MFCS Conference 2017 Conference Paper

Automata in the Category of Glued Vector Spaces

  • Thomas Colcombet
  • Daniela Petrisan

In this paper we adopt a category-theoretic approach to the conception of automata classes enjoying minimization by design. The main instantiation of our construction is a new class of automata that are hybrid between deterministic automata and automata weighted over a field.

Highlights Conference 2017 Conference Abstract

Perfect Half Space Games

  • Thomas Colcombet
  • Marcin Jurdziński
  • Ranko Lazić
  • Sylvain Schmitz

We introduce perfect half space games, in which the goal of Player 2 is to make the sums of encountered multi-dimensional weights diverge in a direction which is consistent with a chosen sequence of perfect half spaces (chosen dynamically by Player 2). We establish that the bounding games of Jurdziński et al. (ICALP 2015) can be reduced to perfect half space games, which in turn can be translated to the lexicographic energy games of Colcombet and Niwiński, and are positionally determined in a strong sense (Player 2 can play without knowing the current perfect half space). We finally show how perfect half space games and bounding games can be employed to solve multi- dimensional energy parity games in pseudo-polynomial time when both the numbers of energy dimensions and of priorities are fixed, regardless of whether the initial credit is given as part of the input or existentially quantified. This also yields an optimal 2-EXPTIME complexity with given initial credit, where the best known upper bound was non-elementary. Abstract available in PDF; paper presented at LICS 2017 and available from arXiv.

Highlights Conference 2015 Conference Abstract

On Unambiguity

  • Thomas Colcombet

A possible way to relax the notion of determinism is to consider unambiguous machines: nondeterministic machines that have at most one accepting run on each input. In this talk, we will consider several situations in which non-ambiguity arises in automata theory, surveying the cases of standard finite words up to infinite trees, as well as data-words. The main conclusion of this talk is that the notion of unambiguity, though relevant in many situations, is so far not well understood and yields difficult open questions. 10: 30 10: 50 Coffee Break

Highlights Conference 2014 Conference Abstract

An Intriguing Tiling Problem

  • Thomas Colcombet

This talk introduces some simple tiling problems. Their decidability is open. These problems are tightly related to the decidability of some quantitative variants of MSO over infinite words. The technical content of this talk is part of a collaboration with Achim Blumensath and Olivier Carton that will be presented at MFCS '14. 17: 00 17: 15 Break

Highlights Conference 2013 Conference Abstract

Approximate comparison of distance automata

  • Laure Daviaud
  • Thomas Colcombet

Distance automata are automata weighted over the semiring (N U {+infinity}, min, +) that compute functions from words to N U {+infinity}. It is known that testing f 0 and two functions f, g computed by distance automata, answers "yes" if f (1+e)g(w), and may answer "yes" or "no" in all other cases. This is a joint work with Thomas Colcombet.

Highlights Conference 2013 Conference Abstract

Boundedness games

  • Nathanaël Fijalkow
  • Krishnendu Chatterjee
  • Thomas Colcombet
  • Florian Horn
  • Denis Kuperberg
  • Michał Skrzypczak
  • Martin Zimmermann

In this talk, I will discuss some recent developments about boundedness games. They are two-player games played over graphs, featuring counters. The first player's objective is to satisfy a parity condition and that the counter values are bounded. A lot of questions are open: what is the algorithmic complexity of deciding the winner, and does there always exist finite-memory winning strategies? In particular, as shown by Thomas Colcombet, proving the existence of finite-memory winning strategies for (some) boundedness games is the last missing item to prove the decidability of cost-MSO over infinite trees. I will give an overview of where we stand now and what are the next challenges. 13: 00 14: 30 Lunch

CSL Conference 2013 Conference Paper

Deciding the weak definability of Büchi definable tree languages

  • Thomas Colcombet
  • Denis Kuperberg
  • Christof Löding
  • Michael Vanden Boom

Weakly definable languages of infinite trees are an expressive subclass of regular tree languages definable in terms of weak monadic second-order logic, or equivalently weak alternating automata. Our main result is that given a Büchi automaton, it is decidable whether the language is weakly definable. We also show that given a parity automaton, it is decidable whether the language is recognizable by a nondeterministic co-Büchi automaton. The decidability proofs build on recent results about cost automata over infinite trees. These automata use counters to define functions from infinite trees to the natural numbers extended with infinity. We reduce to testing whether the functions defined by certain "quasi-weak" cost automata are bounded by a finite value.

Highlights Conference 2013 Conference Abstract

Magnitude Monadic Logic over words and the use of relative internal set theory

  • Thomas Colcombet

Cost monadic logic extends monadic second-order logic with the ability to measure the cardinality of sets and comes with decision procedures for boundedness related questions. We provide new decidability results allowing the systematic investigation of questions involving “relative boundedness”. We first introduce a suitable logic, magnitude monadic logic. We then establish the decidability of this logic over finite words. We finally advocate that developing the proofs in the axiomatic system of “relative internal set theory”, a variant of nonstandard analysis, entails a significant simplification of the proofs. 11: 15 11: 45 Coffee break

Highlights Conference 2013 Conference Abstract

μ-calculus on data words

  • Thomas Colcombet
  • Amaldev Manuel

We introduce and explore various fragments of μ-calculus on data words. We show that full μ-calculus is undecidable, but its ν-fragment is decidable by reduction to data automata. We disclose two subclasses of the ν-fragment that enjoy the further property of being closed under negation, namely the class of Bounded Reversal (BR) and the class of Bounded Mode Alternation (BMA). The latter is contained in the former class and is already sufficiently expressive for capturing several logics studied in the literature like $\fotwo$ and Data-LTL. We then characterize these two fragments as cascades of suitably defined transducer models. We finally prove the separation of BMA from BR using a novel paradigm of circuits and prove hierarchy theorems for BMA, Data-LTL and $\fotwo$.

MFCS Conference 2011 Conference Paper

On the Use of Guards for Logics with Data

  • Thomas Colcombet
  • Clemens Ley
  • Gabriele Puppis

Abstract The notion of orbit finite data monoid was recently introduced by Bojańczyk as an algebraic object for defining recognizable languages of data words. Following Büchi’s approach, we introduce the new logic ‘rigidly guarded MSO’ and show that the data languages definable in this logic are exactly those recognizable by orbit finite data monoids. We also establish, following this time the approach of Schützenberger, McNaughton and Papert, that the first-order variant of this logic defines exactly the languages recognizable by aperiodic orbit finite data monoids. Finally, we give a variant of the logic that captures the larger class of languages recognized by non-deterministic finite memory automata.

CSL Conference 2008 Conference Paper

The Nesting-Depth of Disjunctive µ-Calculus for Tree Languages and the Limitedness Problem

  • Thomas Colcombet
  • Christof Löding

Abstract In this paper we lift the result of Hashiguchi of decidability of the restricted star-height problem for words to the level of finite trees. Formally, we show that it is decidable, given a regular tree language L and a natural number k whether L can be described by a disjunctive μ -calculus formula with at most k nesting of fixpoints. We show the same result for disjunctive μ -formulas allowing substitution. The latter result is equivalent to deciding if the language is definable by a regular expression with nesting depth at most k of Kleene-stars. The proof, following the approach of Kirsten in the word case, goes by reduction to the decidability of the limitedness problem for non-deterministic nested distance desert automata over trees. We solve this problem in the more general framework of alternating tree automata.

STOC Conference 2005 Conference Paper

Tree-walking automata do not recognize all regular languages

  • Mikolaj Bojanczyk
  • Thomas Colcombet

Tree-walking automata are a natural sequential model for recognizing tree languages. Every tree language recognized by a tree-walking automaton is regular. In this paper, we present a tree language which is regular but not recognized by any (nondeterministic) tree-walking automaton. This settles a conjecture of Engelfriet, Hoogeboom and Van Best. Moreover, the separating tree language is definable already in first-order logic over a signature containing the left-son, right-son and ancestor relations.