TY - GEN
T1 - Generalized substring compression
AU - Keller, Orgad
AU - Kopelowitz, Tsvi
AU - Landau, Shir
AU - Lewenstein, Moshe
PY - 2009
Y1 - 2009
N2 - In substring compression one is given a text to preprocess so that, upon request, a compressed substring is returned. Generalized substring compression is the same with the following twist: the queries contain an additional context substring (or a collection of context substrings), and the answers are the substring in compressed format, where the context substring is used to make the compression more efficient. We focus our attention on generalized substring compression and present the first non-trivial correct algorithm for this problem. In our algorithm we inherently propose a method for finding the bounded longest common prefix of substrings, which may be of independent interest. In addition, we propose an efficient algorithm for substring compression which makes use of range searching for minimum queries. We present several tradeoffs for both problems. For compressing the substring S[i..j] (possibly with the substring S[α..β] as a context), the best query times we achieve are O(C) and O(C log((j-i)/C)) for substring compression queries and generalized substring compression queries, respectively, where C is the number of phrases encoded.
AB - In substring compression one is given a text to preprocess so that, upon request, a compressed substring is returned. Generalized substring compression is the same with the following twist: the queries contain an additional context substring (or a collection of context substrings), and the answers are the substring in compressed format, where the context substring is used to make the compression more efficient. We focus our attention on generalized substring compression and present the first non-trivial correct algorithm for this problem. In our algorithm we inherently propose a method for finding the bounded longest common prefix of substrings, which may be of independent interest. In addition, we propose an efficient algorithm for substring compression which makes use of range searching for minimum queries. We present several tradeoffs for both problems. For compressing the substring S[i..j] (possibly with the substring S[α..β] as a context), the best query times we achieve are O(C) and O(C log((j-i)/C)) for substring compression queries and generalized substring compression queries, respectively, where C is the number of phrases encoded.
UR - http://www.scopus.com/inward/record.url?scp=70350633622&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-02441-2_3
DO - 10.1007/978-3-642-02441-2_3
M3 - Conference contribution
AN - SCOPUS:70350633622
SN - 3642024408
SN - 9783642024405
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 26
EP - 38
BT - Combinatorial Pattern Matching - 20th Annual Symposium, CPM 2009, Proceedings
T2 - 20th Annual Symposium on Combinatorial Pattern Matching, CPM 2009
Y2 - 22 June 2009 through 24 June 2009
ER -