TY - JOUR
T1 - On the relationship between histogram indexing and block-mass indexing
AU - Amir, Amihood
AU - Butman, Ayelet
AU - Porat, Ely
PY - 2014/5/28
Y1 - 2014/5/28
N2 - Histogram indexing, also known as jumbled pattern indexing and permutation indexing is one of the important current open problems in pattern matching. It was introduced about 6 years ago and has seen active research since. Yet, to date there is no algorithm that can preprocess a text T in time o(|T|2/polylog|T|) and achieve histogram indexing, even over a binary alphabet, in time independent of the text length. The pattern matching version of this problem has a simple linear-time solution. Blockmass pattern matching problem is a recently introduced problem, motivated by issues in mass-spectrometry. It is also an example of a pattern matching problem that has an efficient, almost linear-time solution but whose indexing version is daunting. However, for fixed finite alphabets, there has been progress made. In this paper, a strong connection between the histogram indexing problem and the block-mass pattern indexing problem is shown. The reduction we show between the two problems is amazingly simple. Its value lies in recognizing the connection between these two apparently disparate problems, rather than the complexity of the reduction. In addition, we show that for both these problems, even over unbounded alphabets, there are algorithms that preprocess a text T in time o(|T|2/polylog|T|) and enable answering indexing queries in time polynomial in the query length. The contributions of this paper are twofold: (i) we introduce the idea of allowing a trade-off between the preprocessing time and query time of various indexing problems that have been stumbling blocks in the literature. (ii) We take the first step in introducing a class of indexing problems that, we believe, cannot be pre-processed in time o(|T|2/polylog|T|) and enable linear-time query processing.
AB - Histogram indexing, also known as jumbled pattern indexing and permutation indexing is one of the important current open problems in pattern matching. It was introduced about 6 years ago and has seen active research since. Yet, to date there is no algorithm that can preprocess a text T in time o(|T|2/polylog|T|) and achieve histogram indexing, even over a binary alphabet, in time independent of the text length. The pattern matching version of this problem has a simple linear-time solution. Blockmass pattern matching problem is a recently introduced problem, motivated by issues in mass-spectrometry. It is also an example of a pattern matching problem that has an efficient, almost linear-time solution but whose indexing version is daunting. However, for fixed finite alphabets, there has been progress made. In this paper, a strong connection between the histogram indexing problem and the block-mass pattern indexing problem is shown. The reduction we show between the two problems is amazingly simple. Its value lies in recognizing the connection between these two apparently disparate problems, rather than the complexity of the reduction. In addition, we show that for both these problems, even over unbounded alphabets, there are algorithms that preprocess a text T in time o(|T|2/polylog|T|) and enable answering indexing queries in time polynomial in the query length. The contributions of this paper are twofold: (i) we introduce the idea of allowing a trade-off between the preprocessing time and query time of various indexing problems that have been stumbling blocks in the literature. (ii) We take the first step in introducing a class of indexing problems that, we believe, cannot be pre-processed in time o(|T|2/polylog|T|) and enable linear-time query processing.
KW - Classes of indexing problems
KW - Complexity
KW - Reduction
UR - http://www.scopus.com/inward/record.url?scp=84899053471&partnerID=8YFLogxK
U2 - 10.1098/rsta.2013.0132
DO - 10.1098/rsta.2013.0132
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
C2 - 24751866
AN - SCOPUS:84899053471
SN - 1364-503X
VL - 372
JO - Philosophical transactions. Series A, Mathematical, physical, and engineering sciences
JF - Philosophical transactions. Series A, Mathematical, physical, and engineering sciences
IS - 2016
M1 - 20130132
ER -