Mismatch sampling

Raphaël Clifford, Klim Efremenko, Benny Porat, Ely Porat, Amir Rothschild

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

We reconsider the well-known problem of pattern matching under the Hamming distance. Previous approaches have shown how to count the number of mismatches efficiently, especially when a bound is known for the maximum Hamming distance. Our interest is different in that we wish to collect a random sample of mismatches of fixed size at each position in the text. Given a pattern p of length m and a text t of length n, we show how to sample with high probability up to c mismatches from every alignment of p and t in O((c + log n)(n + m log m) log m) time. Further, we guarantee that the mismatches are sampled uniformly and can therefore be seen as representative of the types of mismatches that occur.
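To make the problem statement concrete, the following is a minimal sketch of a naive baseline, not the algorithm from the paper. It checks every alignment directly in O(nm) time and uses reservoir sampling to keep a uniform sample of up to c mismatch positions per alignment. The function name `sample_mismatches` and its parameters are hypothetical, chosen only for illustration.

```python
import random


def sample_mismatches(p, t, c, rng=None):
    """Naive O(n*m) baseline (an illustration, not the paper's method).

    For each alignment i of pattern p against text t, return a uniform
    random sample of up to c pattern positions j where p[j] != t[i + j].
    """
    rng = rng or random.Random()
    m, n = len(p), len(t)
    samples = []
    for i in range(n - m + 1):
        reservoir = []  # sampled mismatch positions for this alignment
        seen = 0        # number of mismatches seen so far
        for j in range(m):
            if p[j] != t[i + j]:
                seen += 1
                if len(reservoir) < c:
                    reservoir.append(j)
                else:
                    # Reservoir sampling: the new mismatch replaces a
                    # stored one with probability c / seen.
                    k = rng.randrange(seen)
                    if k < c:
                        reservoir[k] = j
        samples.append(reservoir)
    return samples
```

For example, `sample_mismatches("abc", "abcabd", 2)` returns one list per alignment: an empty list where the pattern matches exactly, and at most two mismatch positions elsewhere. The paper's contribution is achieving the same uniform guarantee (with high probability) in near-linear time rather than the quadratic time of this baseline.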

Original language: English
Pages (from-to): 112-118
Number of pages: 7
Journal: Information and Computation
Volume: 214
State: Published - May 2012

Bibliographical note

Funding Information:
This work was supported in part by the Binational Science Foundation (BSF) grant 2006334 and Israel Science Foundation (ISF) grant 1484/08 as well as the Engineering and Physical Sciences Research Council (EPSRC).

Funding

Funders                                              Funder number
Engineering and Physical Sciences Research Council   EP/J011940/1
United States-Israel Binational Science Foundation   2006334
Israel Science Foundation                            1484/08
