Abstract
The string matching with mismatches problem is that of finding the number of mismatches between a pattern P of length m and every length-m substring of a text T of length n. Currently, the fastest algorithms for this problem are the following. The Galil-Giancarlo algorithm finds all locations where the pattern has at most k errors (where k is part of the input) in time O(nk). The Abrahamson algorithm finds the number of mismatches at every location in time O(n√(m log m)). We present an algorithm that is faster than both. Our algorithm finds all locations where the pattern has at most k errors in time O(n√(k log k)). We also show an algorithm that solves the above problem in time O((n + nk³/m) log k).
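To make the problem statement concrete, here is a minimal sketch of the two tasks the abstract describes: counting mismatches at every alignment, and reporting alignments with at most k mismatches. It is the naive O(nm) baseline, not the algorithm of the paper or of Galil-Giancarlo or Abrahamson. The function names `mismatch_counts` and `k_mismatch_locations` are hypothetical and chosen only for illustration.

```python
from typing import List


def mismatch_counts(text: str, pattern: str) -> List[int]:
    """Hamming distance between pattern and every length-m window of text.

    Naive O(n*m) baseline for illustration only.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    return [
        sum(1 for a, b in zip(text[i:i + m], pattern) if a != b)
        for i in range(n - m + 1)
    ]


def k_mismatch_locations(text: str, pattern: str, k: int) -> List[int]:
    """Start positions where pattern occurs with at most k mismatches.

    Stops scanning a window as soon as it exceeds k errors; the worst
    case is still O(n*m).
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    hits = []
    for i in range(n - m + 1):
        errors = 0
        for j in range(m):
            if text[i + j] != pattern[j]:
                errors += 1
                if errors > k:
                    break
        else:
            # Inner loop finished without exceeding k mismatches.
            hits.append(i)
    return hits
```

For example, with text `"abcabd"` and pattern `"abd"`, `mismatch_counts` returns `[1, 3, 3, 0]`, and `k_mismatch_locations` with k = 1 returns `[0, 3]`.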
Original language | English |
---|---|
Pages (from-to) | 257-275 |
Number of pages | 19 |
Journal | Journal of Algorithms |
Volume | 50 |
Issue number | 2 |
State | Published - Feb 2004 |
Bibliographical note
Funding Information: Authors: A. Amir, M. Lewenstein, E. Porat. Partially supported by NSF grant CCR-96-10170, BSF grant 96-00509, and a BIU internal research grant.
Funding
Funders | Funder number |
---|---|
National Science Foundation | CCR-96-10170 |
United States-Israel Binational Science Foundation | 96-00509 |
Keywords
- Approximate string matching
- Combinatorial algorithms on words
- Design and analysis of algorithms
- Hamming distance