The κ-mismatch problem revisited

Raphael Clifford, Allyx Fontaine, Ely Porat, Benjamin Sach, Tatiana Starikovskaya

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

52 Scopus citations

Abstract

We revisit the complexity of one of the most basic problems in pattern matching. In the k-mismatch problem we must compute the Hamming distance between a pattern of length m and every m-length substring of a text of length n, as long as that Hamming distance is at most k. Where the Hamming distance is greater than k at some alignment of the pattern and text, we simply output "No". We study this problem in both the standard offline setting and also as a streaming problem. In the streaming k-mismatch problem the text arrives one symbol at a time and we must give an output before processing any future symbols. Our main results are as follows. Our first result is a deterministic O(nk² log k/m + n polylog m) time offline algorithm for k-mismatch on a text of length n. This is a factor of k improvement over the fastest previous result of this form from SODA 2000 [9, 10]. We then give a randomised and online algorithm which runs in the same time complexity but requires only O(k² polylog m) space in total. Next we give a randomised (1 + ε)-approximation algorithm for the streaming k-mismatch problem which uses O(k² polylog m/ε²) space and runs in O(polylog m/ε²) worst-case time per arriving symbol. Finally we combine our new results to derive a randomised O(k² polylog m) space algorithm for the streaming k-mismatch problem which runs in O(√k log k + polylog m) worst-case time per arriving symbol. This improves the best previous space complexity for streaming k-mismatch from FOCS 2009 [26] by a factor of k. We also improve the time complexity of this previous result by an even greater factor to match the fastest known offline algorithm (up to logarithmic factors).
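To make the problem statement concrete, the following is a minimal sketch of a naive O(nm)-time baseline for the offline problem. It is not any of the algorithms from the paper; it only illustrates the required output at each alignment. The function name `k_mismatch_naive` and the use of `None` to stand for the "No" output are assumptions made for this illustration.

```python
from typing import List, Optional


def k_mismatch_naive(text: str, pattern: str, k: int) -> List[Optional[int]]:
    """Naive baseline (hypothetical helper, not the paper's algorithm).

    For every alignment i of the pattern against the text, return the
    Hamming distance if it is at most k, otherwise None (standing in
    for the "No" output).
    """
    n, m = len(text), len(pattern)
    results: List[Optional[int]] = []
    for i in range(n - m + 1):
        mismatches = 0
        for j in range(m):
            if text[i + j] != pattern[j]:
                mismatches += 1
                if mismatches > k:
                    # Distance already exceeds k; stop scanning this alignment.
                    break
        results.append(mismatches if mismatches <= k else None)
    return results


print(k_mismatch_naive("abcabd", "abd", 1))  # prints [1, None, None, 0]
```

With k = 1, the alignments starting at positions 0 and 3 have Hamming distance 1 and 0 respectively, while the two alignments in between exceed k and are reported as `None`.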

Original language: English
Title of host publication: 27th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016
Editors: Robert Krauthgamer
Publisher: Association for Computing Machinery
Pages: 2039-2052
Number of pages: 14
ISBN (Electronic): 9781510819672
DOIs
State: Published - 2016
Event: 27th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016 - Arlington, United States
Duration: 10 Jan 2016 → 12 Jan 2016

Publication series

Name: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms
Volume: 3

Conference

Conference: 27th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016
Country/Territory: United States
City: Arlington
Period: 10/01/16 → 12/01/16

Bibliographical note

Publisher Copyright:
© Copyright (2016) by SIAM: Society for Industrial and Applied Mathematics.
