Arrow Research search
Back to TCS

TCS 2017

Inferring strings from Lyndon factorization

Journal Article journal-article Computer Science · Theoretical Computer Science

Abstract

The Lyndon factorization of a string w is a unique factorization ℓ 1 p 1, …, ℓ m p m of w such that ℓ 1, …, ℓ m is a sequence of Lyndon words that is monotonically decreasing in lexicographic order. In this paper, we consider the reverse-engineering problem on Lyndon factorization: Given a sequence S = ( ( s 1, p 1 ), …, ( s m, p m ) ) of ordered pairs of positive integers, find a string w whose Lyndon factorization corresponds to the input sequence S, i. e. , the Lyndon factorization of w is in a form of ℓ 1 p 1, …, ℓ m p m with | ℓ i | = s i for all 1 ≤ i ≤ m. Firstly, we show that there exists a simple O ( n ) -time algorithm if the size of the alphabet is unbounded, where n is the length of the output string. Secondly, we present an O ( n ) -time algorithm to compute a string over an alphabet of the smallest size. Thirdly, we show how to compute only the size of the smallest alphabet in O ( m ) time. Fourthly, we give an O ( m ) -time algorithm to compute an O ( m ) -size representation of a string over an alphabet of the smallest size. Finally, we propose an efficient algorithm to enumerate all strings whose Lyndon factorizations correspond to S.

Authors

Keywords

  • Lyndon factorization
  • Reverse engineering problem

Context

Venue
Theoretical Computer Science
Archive span
1975-2026
Indexed papers
16261
Paper id
854665264266870431