Succinct online dictionary matching with improved worst-case guarantees

Tsvi Kopelowitz, Ely Porat, Yaron Rozen

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

7 Scopus citations

Abstract

In the online dictionary matching problem, the goal is to preprocess a set of patterns D = {P_1, ..., P_d} over an alphabet Σ of size σ so that, given an online text (one character at a time), we report all occurrences of patterns that are a suffix of the current text before the following character arrives. We introduce a succinct Aho-Corasick-like data structure for the online dictionary matching problem. Our solution uses a new succinct representation for multi-labeled trees, in which each node has a set of labels from a universe of size λ. We consider lowest labeled ancestor (LLA) queries on multi-labeled trees, where given a node and a label we return the lowest proper ancestor of the node that has the queried label. In this paper we introduce a succinct representation of multi-labeled trees for λ = ω(1) that supports LLA queries in O(log log λ) time. Using this representation of multi-labeled trees, we introduce a succinct data structure for the online dictionary matching problem when σ = ω(1). In this solution the worst-case cost per character is O(log log σ + occ) time, where occ is the size of the current output. Moreover, the amortized cost per character is O(1 + occ) time.
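
To make the problem statement concrete, the following is a minimal sketch of the classical (non-succinct) Aho-Corasick automaton used as an online matcher. It is not the data structure introduced in the paper: it stores explicit hash maps and pointers, assumes non-empty patterns, and relies on expected-time dictionary lookups. The names OnlineMatcher, feed and report are hypothetical and chosen only for this illustration.

```python
from collections import deque


class OnlineMatcher:
    """Plain Aho-Corasick automaton (illustrative; not the paper's succinct structure)."""

    def __init__(self, patterns):
        self.patterns = list(patterns)  # assumed non-empty strings
        self.goto = [{}]    # goto[v][ch] -> child of trie node v
        self.fail = [0]     # failure link: longest proper suffix that is a trie node
        self.out = [[]]     # indices of patterns ending exactly at this node
        self.report = [0]   # nearest node on the failure chain with output (0 = none)
        for idx, pattern in enumerate(self.patterns):
            node = 0
            for ch in pattern:
                nxt = self.goto[node].get(ch)
                if nxt is None:
                    nxt = len(self.goto)
                    self.goto[node][ch] = nxt
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append([])
                    self.report.append(0)
                node = nxt
            self.out[node].append(idx)
        # Breadth-first pass: depth-1 nodes keep fail = 0, deeper nodes are derived.
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.goto[node].items():
                f = self.fail[node]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                fc = self.goto[f].get(ch, 0)
                self.fail[child] = fc
                self.report[child] = fc if self.out[fc] else self.report[fc]
                queue.append(child)
        self.state = 0

    def feed(self, ch):
        """Consume one text character; return the patterns that end at it."""
        s = self.state
        while s and ch not in self.goto[s]:
            s = self.fail[s]
        s = self.goto[s].get(ch, 0)
        self.state = s
        found = []
        node = s if self.out[s] else self.report[s]
        while node:  # visits only nodes that actually carry output
            found.extend(self.patterns[i] for i in self.out[node])
            node = self.report[node]
        return found


matcher = OnlineMatcher(["he", "she", "his", "hers"])
results = [matcher.feed(c) for c in "ushers"]
# results[3] == ["she", "he"] and results[5] == ["hers"]; all other entries are empty
```

In this plain version the failure-link walk inside feed is constant only in the amortized sense, so a single character can trigger many link traversals. The abstract's worst-case bound per character addresses exactly that step, in addition to the space savings of a succinct representation.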

Original language: English
Title of host publication: 27th Annual Symposium on Combinatorial Pattern Matching, CPM 2016
Editors: Roberto Grossi, Moshe Lewenstein
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Pages: 6.1-6.13
ISBN (Electronic): 9783959770125
State: Published - 1 Jun 2016
Event: 27th Annual Symposium on Combinatorial Pattern Matching, CPM 2016 - Tel Aviv, Israel
Duration: 27 Jun 2016 - 29 Jun 2016

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 54
ISSN (Print): 1868-8969

Conference

Conference: 27th Annual Symposium on Combinatorial Pattern Matching, CPM 2016
Country/Territory: Israel
City: Tel Aviv
Period: 27/06/16 - 29/06/16

Bibliographical note

Publisher Copyright:
© Tsvi Kopelowitz, Ely Porat, and Yaron Rozen.

Keywords

  • Aho-Corasick
  • Dictionary matching
  • Labeled trees
  • Succinct indexing
