TY - JOUR
T1 - On suffix tree detection
AU - Amir, Amihood
AU - Kondratovsky, Eitan
AU - Levy, Avivit
N1 - Publisher Copyright:
© 2024 Elsevier B.V.
PY - 2024/10/12
Y1 - 2024/10/12
N2 - A suffix tree is a fundamental data structure for string processing and information retrieval, however, its structure is still not well understood. The suffix trees reverse engineering problem, which its research aims at reducing this gap, is the following. Given an ordered rooted tree T with unlabeled edges, determine whether there exists a string w such that the unlabeled-edges suffix tree of w is isomorphic to T. Previous studies on this problem consider the relaxation of having the suffix links as well as assume a binary alphabet. This paper is the first to consider the suffix tree detection problem, in which the relaxation of having suffix links as input is removed. We study suffix tree detection on two scenarios that are interesting per se. We provide a suffix tree detection algorithm for general alphabet periodic strings. Given an ordered tree T with n leaves, our detection algorithm takes O(n+|Σ|p)-time, where p is the unknown in advance length of a period that repeats at least 3 times in a string S having a suffix tree structure identical to T, if such S exists. Therefore, it is a polynomial time algorithm if p is a constant and a linear time algorithm if, in addition, the alphabet has a sub-linear size. We also show some necessary (but insufficient) conditions for binary alphabet general strings suffix tree detection. By this we take another step towards understanding suffix trees structure.
AB - A suffix tree is a fundamental data structure for string processing and information retrieval, however, its structure is still not well understood. The suffix trees reverse engineering problem, which its research aims at reducing this gap, is the following. Given an ordered rooted tree T with unlabeled edges, determine whether there exists a string w such that the unlabeled-edges suffix tree of w is isomorphic to T. Previous studies on this problem consider the relaxation of having the suffix links as well as assume a binary alphabet. This paper is the first to consider the suffix tree detection problem, in which the relaxation of having suffix links as input is removed. We study suffix tree detection on two scenarios that are interesting per se. We provide a suffix tree detection algorithm for general alphabet periodic strings. Given an ordered tree T with n leaves, our detection algorithm takes O(n+|Σ|p)-time, where p is the unknown in advance length of a period that repeats at least 3 times in a string S having a suffix tree structure identical to T, if such S exists. Therefore, it is a polynomial time algorithm if p is a constant and a linear time algorithm if, in addition, the alphabet has a sub-linear size. We also show some necessary (but insufficient) conditions for binary alphabet general strings suffix tree detection. By this we take another step towards understanding suffix trees structure.
KW - Periodic string
KW - Reverse engineering
KW - Suffix tree
KW - Suffix tree detection
UR - http://www.scopus.com/inward/record.url?scp=85198338978&partnerID=8YFLogxK
U2 - 10.1016/j.tcs.2024.114728
DO - 10.1016/j.tcs.2024.114728
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
AN - SCOPUS:85198338978
SN - 0304-3975
VL - 1012
JO - Theoretical Computer Science
JF - Theoretical Computer Science
M1 - 114728
ER -