TY - GEN

T1 - Towards real-time suffix tree construction

AU - Amir, A.

AU - Kopelowitz, T

AU - Lewenstein, M

AU - Lewenstein, N

N1 - Place of conference: Argentina

PY - 2005

Y1 - 2005

N2 - The quest for a real-time suffix tree construction algorithm is over three decades old. To date there is no convincing, understandable solution to this problem. This paper makes a step in this direction by constructing a suffix tree online in time O(log n) for every single input symbol. Clearly, it is impossible to achieve better than O(log n) time per symbol in the comparison model; therefore no true real-time algorithm can exist for infinite alphabets. Nevertheless, the best that can be hoped for is that the construction time for every symbol does not exceed O(log n) (as opposed to an amortized O(log n) time per symbol, achieved by currently known algorithms). To our knowledge, our algorithm is the first that spends O(log n) time in the worst case for every single input symbol.
We also provide a simple algorithm that constructs online an indexing structure (the BIS) in time O(log n) per input symbol, where n is the number of text symbols input thus far. This structure and fast LCP (Longest Common Prefix) queries on it provide the backbone for the suffix tree construction. Together, our two data structures provide a searching algorithm for a pattern of length m whose time is O(min(m log |Σ|, m + log n) + tocc), where tocc is the number of occurrences of the pattern.

UR - https://scholar.google.co.il/scholar?q=Towards+Real+Time+Indexing%2C+%E2%80%A2%09A.+Amir%2C+T.+Kopelowitz%2C+M.+Lewenstein+and+N.+Lewenstein&btnG=&hl=en&as_sdt=0%2C5

UR - https://scholar.google.co.il/scholar?q=Towards+Real-Time+Suffix+Tree+Construction%2C+Amir+Amihood+&btnG=&hl=en&as_sdt=0%2C5

M3 - Conference contribution

BT - International Symposium on String Processing and Information Retrieval

A2 - Consens, Mariano

A2 - Navarro, Gonzalo

PB - Springer Berlin Heidelberg

ER -