Less space: Indexing for queries with wildcards

Moshe Lewenstein, J. Ian Munro, Venkatesh Raman, Sharma V. Thankachan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Text indexing is a fundamental problem in computer science: the task is to index a given text (string) $T[1..n]$ so that, whenever a pattern $P[1..p]$ arrives as a query, we can efficiently report all locations where $P$ occurs as a substring of $T$. In this paper, we consider the case where $P$ contains wildcard characters (each of which can match any character). The first non-trivial solution to the problem was given by Cole et al. [STOC 2004], where the index space is $O(n \log^{k} n)$ words, or $O(n \log^{k+1} n)$ bits, and the query time is $O(p + 2^{h} \log\log n + occ)$. Here $k$ is the maximum number of wildcard characters allowed in $P$, $h \le k$ is the number of wildcard characters in $P$, and $occ$ is the number of occurrences of $P$ in $T$. Even though many indexes offering different space-time trade-offs were later proposed, a clear improvement on this result is still not known. In this paper, we first propose an index of $O(n \log^{k+\varepsilon} n)$ bits that achieves the same query time as Cole et al.'s index, where $0 < \varepsilon < 1$ is an arbitrarily small constant. We then propose another index of size $O(n \log^{k} n \log \sigma)$ bits, but with a slightly higher query time of $O(p + 2^{h} \log n + occ)$, where $\sigma$ denotes the alphabet size.
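To make the query semantics concrete, the following is a minimal sketch of what a wildcard query is expected to report. It is a naive scan taking O(np) time per query, not the index proposed in the paper nor that of Cole et al.; the function name `wildcard_occurrences` and the choice of `?` as the wildcard symbol are assumptions made only for illustration.

```python
def wildcard_occurrences(text: str, pattern: str, wildcard: str = "?") -> list[int]:
    """Return the 1-based start positions where `pattern` matches in `text`.

    A wildcard symbol in the pattern matches any single text character.
    This is a brute-force baseline (hypothetical helper), not an index.
    """
    n, p = len(text), len(pattern)
    hits = []
    for start in range(n - p + 1):
        # Every pattern position must be a wildcard or equal the text character.
        if all(pc == wildcard or pc == text[start + j]
               for j, pc in enumerate(pattern)):
            hits.append(start + 1)  # 1-based, following the T[1..n] notation
    return hits


if __name__ == "__main__":
    T = "abracadabra"
    P = "a?ra"                      # h = 1 wildcard
    print(wildcard_occurrences(T, P))  # [1, 8], so occ = 2
```

An index in the sense of the abstract would preprocess `T` once so that each such query avoids rescanning the whole text; the sketch only fixes what the correct output is.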

Original language: English
Title of host publication: Algorithms and Computation - 24th International Symposium, ISAAC 2013, Proceedings
Pages: 89-99
Number of pages: 11
State: Published - 2013
Event: 24th International Symposium on Algorithms and Computation, ISAAC 2013 - Hong Kong, China
Duration: 16 Dec 2013 to 18 Dec 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8283 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 24th International Symposium on Algorithms and Computation, ISAAC 2013
Country/Territory: China
City: Hong Kong
Period: 16/12/13 to 18/12/13
