TY - GEN
T1 - Indexing with gaps
AU - Lewenstein, Moshe
PY - 2011
Y1 - 2011
N2 - In Indexing with Gaps one seeks to index a text so as to support pattern queries that contain gaps. Formally, a gapped pattern over alphabet Σ is a pattern of the form p = p_1 g_1 p_2 g_2 ⋯ g_ℓ p_{ℓ+1}, where ∀i, p_i ∈ Σ* and each g_i is a gap length ∈ ℕ. Often one considers these patterns with some bound constraints, for example, all gaps are bounded by a gap-bound G. Near-optimal solutions have lately been proposed for the case of one gap only with a predetermined size. More specifically, these are indexing solutions for patterns of the form p_1·g·p_2, where g is known a priori. In this case the solutions mentioned use O(n log^ε n) preprocessing time and O(n) space, and pattern queries are answered in O(|p_1| + |p_2|) time, for constant-sized alphabets. For the more general case when there is a bound G, these results can be easily adapted with a multiplicative factor of O(G) for the preprocessing, i.e. O(nG log^ε n) preprocessing time and O(nG) preprocessing space. Alas, these solutions do not lend themselves to more than one gap. In this paper we propose a solution for k gaps with preprocessing time O(nG^{2k} log^k n log log n), space O(nG^{2k} log^k n), and query time O(m + 2^k log log n), where m = Σ_{i=1}^{k+1} |p_i|.
AB - In Indexing with Gaps one seeks to index a text so as to support pattern queries that contain gaps. Formally, a gapped pattern over alphabet Σ is a pattern of the form p = p_1 g_1 p_2 g_2 ⋯ g_ℓ p_{ℓ+1}, where ∀i, p_i ∈ Σ* and each g_i is a gap length ∈ ℕ. Often one considers these patterns with some bound constraints, for example, all gaps are bounded by a gap-bound G. Near-optimal solutions have lately been proposed for the case of one gap only with a predetermined size. More specifically, these are indexing solutions for patterns of the form p_1·g·p_2, where g is known a priori. In this case the solutions mentioned use O(n log^ε n) preprocessing time and O(n) space, and pattern queries are answered in O(|p_1| + |p_2|) time, for constant-sized alphabets. For the more general case when there is a bound G, these results can be easily adapted with a multiplicative factor of O(G) for the preprocessing, i.e. O(nG log^ε n) preprocessing time and O(nG) preprocessing space. Alas, these solutions do not lend themselves to more than one gap. In this paper we propose a solution for k gaps with preprocessing time O(nG^{2k} log^k n log log n), space O(nG^{2k} log^k n), and query time O(m + 2^k log log n), where m = Σ_{i=1}^{k+1} |p_i|.
UR - http://www.scopus.com/inward/record.url?scp=80053985128&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-24583-1_14
DO - 10.1007/978-3-642-24583-1_14
M3 - Conference contribution
AN - SCOPUS:80053985128
SN - 9783642245824
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 135
EP - 143
BT - String Processing and Information Retrieval - 18th International Symposium, SPIRE 2011, Proceedings
T2 - 18th International Symposium on String Processing and Information Retrieval, SPIRE 2011
Y2 - 17 October 2011 through 21 October 2011
ER -