TY - UNPB

T1 - Managing Unbounded-Length Keys in Comparison-Driven Data Structures with Applications to On-Line Indexing

AU - Amir, Amihood

AU - Franceschini, Gianni

AU - Grossi, Roberto

AU - Kopelowitz, Tsvi

AU - Lewenstein, Moshe

AU - Lewenstein, Noa

PY - 2013/6/3

Y1 - 2013/6/3

N2 - This paper presents a general technique for optimally transforming any dynamic data structure that operates on atomic and indivisible keys by constant-time comparisons, into a data structure that handles unbounded-length keys whose comparison cost is not a constant. Examples of these keys are strings, multi-dimensional points, multiple-precision numbers, multi-key data (e.g.~records), XML paths, URL addresses, etc. The technique is more general than what has been done in previous work as no particular exploitation of the underlying structure of is required. The only requirement is that the insertion of a key must identify its predecessor or its successor. Using the proposed technique, online suffix tree can be constructed in worst case time $O(\log n)$ per input symbol (as opposed to amortized $O(\log n)$ time per symbol, achieved by previously known algorithms). To our knowledge, our algorithm is the first that achieves $O(\log n)$ worst case time per input symbol. Searching for a pattern of length $m$ in the resulting suffix tree takes $O(\min(m\log |\Sigma|, m + \log n) + tocc)$ time, where $tocc$ is the number of occurrences of the pattern. The paper also describes more applications and show how to obtain alternative methods for dealing with suffix sorting, dynamic lowest common ancestors and order maintenance.

AB - This paper presents a general technique for optimally transforming any dynamic data structure that operates on atomic and indivisible keys by constant-time comparisons, into a data structure that handles unbounded-length keys whose comparison cost is not a constant. Examples of these keys are strings, multi-dimensional points, multiple-precision numbers, multi-key data (e.g.~records), XML paths, URL addresses, etc. The technique is more general than what has been done in previous work as no particular exploitation of the underlying structure of is required. The only requirement is that the insertion of a key must identify its predecessor or its successor. Using the proposed technique, online suffix tree can be constructed in worst case time $O(\log n)$ per input symbol (as opposed to amortized $O(\log n)$ time per symbol, achieved by previously known algorithms). To our knowledge, our algorithm is the first that achieves $O(\log n)$ worst case time per input symbol. Searching for a pattern of length $m$ in the resulting suffix tree takes $O(\min(m\log |\Sigma|, m + \log n) + tocc)$ time, where $tocc$ is the number of occurrences of the pattern. The paper also describes more applications and show how to obtain alternative methods for dealing with suffix sorting, dynamic lowest common ancestors and order maintenance.

KW - cs.DS

M3 - פרסום מוקדם

BT - Managing Unbounded-Length Keys in Comparison-Driven Data Structures with Applications to On-Line Indexing

PB - arXiv preprint

ER -