Abstract
We present solutions for the k-mismatch pattern matching problem with don't cares. Given a text t of length n and a pattern p of length m with don't care symbols and a bound k, our algorithms find all the places that the pattern matches the text with at most k mismatches. We first give a Θ(n(k + log m log k) log n) time randomised algorithm which finds the correct answer with high probability. We then present a new deterministic Θ(nk² log² m) time solution that uses tools originally developed for group testing. Taking our derandomisation approach further we develop an approach based on k-selectors that runs in Θ(nk polylog m) time. Further, in each case the location of the mismatches at each alignment is also given at no extra cost.
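To make the problem statement concrete, the following is a minimal brute-force sketch of k-mismatch matching with don't cares. It is not any of the algorithms from the paper: it runs in O(nm) time and serves only as a reference definition of the expected output. The function name `k_mismatch_naive` and the choice of `?` as the don't care symbol are assumptions made for illustration, and don't cares are handled in the pattern only, as in the abstract.

```python
def k_mismatch_naive(text, pattern, k, wildcard="?"):
    """Brute-force baseline (hypothetical helper, not the paper's method).

    Returns a dict mapping each alignment i (0-based start in text) with at
    most k mismatches to the list of pattern positions that mismatch there.
    Positions where the pattern holds the wildcard never count as mismatches.
    """
    n, m = len(text), len(pattern)
    result = {}
    for i in range(n - m + 1):
        mismatches = []
        for j in range(m):
            if pattern[j] == wildcard:
                continue  # don't care: matches any text symbol
            if text[i + j] != pattern[j]:
                mismatches.append(j)
                if len(mismatches) > k:
                    break  # bound exceeded, abandon this alignment
        if len(mismatches) <= k:
            result[i] = mismatches
    return result


if __name__ == "__main__":
    # Pattern "a?d" against "abcabd", allowing at most one mismatch.
    print(k_mismatch_naive("abcabd", "a?d", 1))
```

As in the abstract, the sketch reports the mismatch locations for each qualifying alignment along with the alignment itself; the algorithms in the paper obtain the same output in near-linear time in n for small k.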
Original language | English |
---|---|
Pages (from-to) | 115-124 |
Number of pages | 10 |
Journal | Journal of Computer and System Sciences |
Volume | 76 |
Issue number | 2 |
DOIs | |
State | Published - Mar 2010 |
Bibliographical note
Funding Information: E-mail addresses: [email protected] (R. Clifford), [email protected] (K. Efremenko), [email protected] (E. Porat), [email protected] (A. Rothschild). Footnote 1: Research supported in part by the Binational Science Foundation (BSF). Footnote 2: Throughout this paper we assume the RAM model with multiplication when giving the time complexity of the FFT. This is in order to be consistent with the large body of previous work on pattern matching with FFTs.
Funding
Funders | Funder number |
---|---|
Engineering and Physical Sciences Research Council | EP/F02682X/1 |
United States-Israel Binational Science Foundation | |
Keywords
- Group testing
- Pattern matching
- Randomised algorithms
- String algorithms