Indexing with gaps

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

17 Scopus citations


In Indexing with Gaps one seeks to index a text so as to support pattern queries that contain gaps. Formally, a gapped pattern over an alphabet $\Sigma$ is a pattern of the form $p = p_1 g_1 p_2 g_2 \cdots g_\ell p_{\ell+1}$, where $p_i \in \Sigma^*$ for all $i$ and each $g_i \in \mathbb{N}$ is a gap length. Often one considers these patterns under bound constraints, for example, all gaps are bounded by a gap bound $G$. Near-optimal solutions have recently been proposed for the case of a single gap of predetermined size, that is, an index for patterns of the form $p_1 \cdot g \cdot p_2$, where $g$ is known a priori. In this case the index is built in $O(n \log^{\varepsilon} n)$ time and $O(n)$ space, and pattern queries are answered in $O(|p_1| + |p_2|)$ time, for constant-sized alphabets. For the more general case in which the gap is bounded by $G$, these results can easily be adapted with a multiplicative factor of $O(G)$ in the preprocessing, i.e. $O(nG \log^{\varepsilon} n)$ preprocessing time and $O(nG)$ preprocessing space. Alas, these solutions do not extend to more than one gap. In this paper we propose a solution for $k$ gaps with preprocessing time $O(nG^{2k} \log^{k} n \log\log n)$, space $O(nG^{2k} \log^{k} n)$, and query time $O(m + 2^{k} \log\log n)$, where $m = \sum_{i=1}^{k+1} |p_i|$.
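To make the definition of a gapped pattern concrete, here is a minimal sketch of a naive matcher that scans the text directly. It is not the index proposed in the paper and has none of its time bounds. It only illustrates what an occurrence of $p_1 g_1 p_2 \cdots g_\ell p_{\ell+1}$ is, assuming each gap $g_i$ means exactly $g_i$ arbitrary characters. The function name `find_gapped` and its parameters are hypothetical.

```python
from typing import List, Sequence


def find_gapped(text: str, parts: Sequence[str], gaps: Sequence[int]) -> List[int]:
    """Return all start positions where the gapped pattern occurs in text.

    parts = [p_1, ..., p_{l+1}] and gaps = [g_1, ..., g_l].
    Assumption: gap g_i stands for exactly g_i arbitrary characters
    between p_i and p_{i+1}.
    """
    if len(parts) != len(gaps) + 1:
        raise ValueError("need exactly one more part than gaps")
    if any(g < 0 for g in gaps):
        raise ValueError("gap lengths must be non-negative")

    # Offset of each part relative to the start of an occurrence.
    offsets = []
    pos = 0
    for i, part in enumerate(parts):
        offsets.append(pos)
        pos += len(part)
        if i < len(gaps):
            pos += gaps[i]
    total = pos  # full length of an occurrence, gaps included

    hits = []
    for start in range(len(text) - total + 1):
        # Every part must match at its fixed offset from this start.
        if all(text.startswith(part, start + off)
               for part, off in zip(parts, offsets)):
            hits.append(start)
    return hits


# "ab", then a gap of one character, then "a"
print(find_gapped("abracadabra", ["ab", "a"], [1]))  # [0, 7]
```

This scan costs time proportional to the text length for every query, which is the cost an index is meant to avoid by preprocessing the text once.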

Original language: English
Title of host publication: String Processing and Information Retrieval - 18th International Symposium, SPIRE 2011, Proceedings
Number of pages: 9
State: Published - 2011
Event: 18th International Symposium on String Processing and Information Retrieval, SPIRE 2011 - Pisa, Italy
Duration: 17 Oct 2011 to 21 Oct 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7024 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349



