Faster algorithms for string matching with k mismatches

Research output: Contribution to journal › Article › peer-review

134 Scopus citations

Abstract

The string matching with mismatches problem is that of finding the number of mismatches between a pattern P of length m and every length-m substring of the text T of length n. Currently, the fastest algorithms for this problem are the following. The Galil-Giancarlo algorithm finds all locations where the pattern has at most k errors (where k is part of the input) in time O(nk). The Abrahamson algorithm finds the number of mismatches at every location in time O(n√(m log m)). We present an algorithm that is faster than both. Our algorithm finds all locations where the pattern has at most k errors in time O(n√(k log k)). We also show an algorithm that solves the above problem in time O((n + (nk^3)/m) log k).
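To make the problem statement concrete, here is a minimal sketch of the naive O(nm) baseline, not the algorithm presented in the paper. The function names `mismatch_counts` and `k_mismatch_positions` are illustrative assumptions; they simply compute the Hamming distance at every alignment and report the alignments with at most k mismatches.

```python
from typing import List


def mismatch_counts(text: str, pattern: str) -> List[int]:
    """Number of mismatches between pattern and every length-m window of text."""
    n, m = len(text), len(pattern)
    return [
        sum(1 for j in range(m) if text[i + j] != pattern[j])
        for i in range(n - m + 1)
    ]


def k_mismatch_positions(text: str, pattern: str, k: int) -> List[int]:
    """Start positions where pattern matches text with at most k mismatches.

    Naive baseline: each window is abandoned as soon as it exceeds k errors.
    """
    n, m = len(text), len(pattern)
    hits = []
    for i in range(n - m + 1):
        errors = 0
        for j in range(m):
            if text[i + j] != pattern[j]:
                errors += 1
                if errors > k:
                    break  # too many mismatches at this alignment
        else:
            hits.append(i)  # inner loop finished without exceeding k
    return hits


print(mismatch_counts("abcde", "abd"))          # [1, 2, 3]
print(k_mismatch_positions("abcde", "abd", 2))  # [0, 1]
```

A baseline like this is mainly useful as a correctness reference when testing faster implementations on small inputs.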

Original language: English
Pages (from-to): 257-275
Number of pages: 19
Journal: Journal of Algorithms
Volume: 50
Issue number: 2
State: Published - Feb 2004

Bibliographical note

Funding Information:
Authors: A. Amir, M. Lewenstein, E. Porat. Partially supported by NSF grant CCR-96-10170, BSF grant 96-00509, and a BIU internal research grant.

Funding

Funders                                               Funder number
National Science Foundation                           CCR-96-10170
United States-Israel Binational Science Foundation    96-00509

    Keywords

    • Approximate string matching
    • Combinatorial algorithms on words
    • Design and analysis of algorithms
    • Hamming distance
