Abstract
In the online dictionary matching problem, the goal is to preprocess a set of patterns D = {P_1, ..., P_d} over an alphabet Σ of size σ so that, given an online text that arrives one character at a time, we report all occurrences of patterns that are a suffix of the current text before the next character arrives. We introduce a succinct Aho-Corasick-like data structure for the online dictionary matching problem. Our solution uses a new succinct representation for multi-labeled trees, in which each node has a set of labels from a universe of size λ. We consider lowest labeled ancestor (LLA) queries on multi-labeled trees: given a node and a label, return the lowest proper ancestor of the node that has the queried label. In this paper we introduce a succinct representation of multi-labeled trees for λ = ω(1) that supports LLA queries in O(log log λ) time. Using this representation of multi-labeled trees, we introduce a succinct data structure for the online dictionary matching problem when σ = ω(1). In this solution the worst-case cost per character is O(log log σ + occ) time, where occ is the size of the current output. Moreover, the amortized cost per character is O(1 + occ) time.
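To make the problem statement concrete, the following is a minimal sketch of online dictionary matching using the classical pointer-based Aho-Corasick automaton. It is not the succinct data structure introduced in the paper and makes no attempt to match its space or worst-case time bounds. The class name `OnlineDictionaryMatcher`, the method `feed`, and the `report` shortcut array are hypothetical names chosen for illustration, and patterns are assumed to be non-empty.

```python
from collections import deque


class OnlineDictionaryMatcher:
    """Classical (non-succinct) Aho-Corasick automaton; illustrative only."""

    def __init__(self, patterns):
        self.patterns = list(patterns)  # assumed non-empty strings
        self.goto = [{}]    # goto[state][char] -> child state in the trie
        self.fail = [0]     # failure link: longest proper suffix that is a trie node
        self.out = [[]]     # indices of patterns ending exactly at this state
        self.report = [0]   # nearest failure-chain state with output (0 = none)
        for idx, pat in enumerate(self.patterns):
            state = 0
            for ch in pat:
                if ch not in self.goto[state]:
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append([])
                    self.report.append(0)
                    self.goto[state][ch] = len(self.goto) - 1
                state = self.goto[state][ch]
            self.out[state].append(idx)
        self._build_links()
        self.state = 0      # state reached by the text read so far

    def _build_links(self):
        # Breadth-first traversal; depth-1 states keep fail = report = 0 (root).
        queue = deque(self.goto[0].values())
        while queue:
            u = queue.popleft()
            for ch, v in self.goto[u].items():
                f = self.fail[u]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[v] = self.goto[f].get(ch, 0)
                fv = self.fail[v]
                self.report[v] = fv if self.out[fv] else self.report[fv]
                queue.append(v)

    def feed(self, ch):
        """Consume one character; return the patterns that are suffixes of
        the text read so far, longest first."""
        s = self.state
        while s and ch not in self.goto[s]:
            s = self.fail[s]
        self.state = self.goto[s].get(ch, 0)
        matches = []
        s = self.state if self.out[self.state] else self.report[self.state]
        while s:
            matches.extend(self.patterns[i] for i in self.out[s])
            s = self.report[s]
        return matches


matcher = OnlineDictionaryMatcher(["he", "she", "his", "hers"])
for ch in "ushers":
    print(ch, matcher.feed(ch))
```

Each call to `feed` returns its matches before the next character is supplied, which mirrors the online requirement in the abstract. The `report` array lets the output loop visit only states that actually carry a pattern, so the reporting work per character is proportional to the number of matches returned.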
Original language | English |
---|---|
Title of host publication | 27th Annual Symposium on Combinatorial Pattern Matching, CPM 2016 |
Editors | Roberto Grossi, Moshe Lewenstein |
Publisher | Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing |
Pages | 6.1-6.13 |
ISBN (Electronic) | 9783959770125 |
State | Published - 1 Jun 2016 |
Event | 27th Annual Symposium on Combinatorial Pattern Matching, CPM 2016 - Tel Aviv, Israel. Duration: 27 Jun 2016 → 29 Jun 2016 |
Publication series
Name | Leibniz International Proceedings in Informatics, LIPIcs |
---|---|
Volume | 54 |
ISSN (Print) | 1868-8969 |
Conference
Conference | 27th Annual Symposium on Combinatorial Pattern Matching, CPM 2016 |
---|---|
Country/Territory | Israel |
City | Tel Aviv |
Period | 27/06/16 → 29/06/16 |
Bibliographical note
Publisher Copyright: © Tsvi Kopelowitz, Ely Porat, and Yaron Rozen.
Funding
Work supported in part by NSF grants CCF-1217338, CNS-1318294, and CCF-1514383.
Funders | Funder number |
---|---|
National Science Foundation | CCF-1217338, CNS-1318294, CCF-1514383 |
Keywords
- Aho-Corasick
- Dictionary matching
- Labeled trees
- Succinct indexing