On suffix tree detection

Amihood Amir, Eitan Kondratovsky, Avivit Levy

Research output: Contribution to journalArticlepeer-review

Abstract

A suffix tree is a fundamental data structure for string processing and information retrieval, however, its structure is still not well understood. The suffix trees reverse engineering problem, which its research aims at reducing this gap, is the following. Given an ordered rooted tree T with unlabeled edges, determine whether there exists a string w such that the unlabeled-edges suffix tree of w is isomorphic to T. Previous studies on this problem consider the relaxation of having the suffix links as well as assume a binary alphabet. This paper is the first to consider the suffix tree detection problem, in which the relaxation of having suffix links as input is removed. We study suffix tree detection on two scenarios that are interesting per se. We provide a suffix tree detection algorithm for general alphabet periodic strings. Given an ordered tree T with n leaves, our detection algorithm takes O(n+|Σ|p)-time, where p is the unknown in advance length of a period that repeats at least 3 times in a string S having a suffix tree structure identical to T, if such S exists. Therefore, it is a polynomial time algorithm if p is a constant and a linear time algorithm if, in addition, the alphabet has a sub-linear size. We also show some necessary (but insufficient) conditions for binary alphabet general strings suffix tree detection. By this we take another step towards understanding suffix trees structure.

Original languageEnglish
Article number114728
JournalTheoretical Computer Science
Volume1012
DOIs
StatePublished - 12 Oct 2024

Bibliographical note

Publisher Copyright:
© 2024 Elsevier B.V.

Keywords

  • Periodic string
  • Reverse engineering
  • Suffix tree
  • Suffix tree detection

Fingerprint

Dive into the research topics of 'On suffix tree detection'. Together they form a unique fingerprint.

Cite this