Approximate matching is one of the fundamental problems in pattern matching, and a ubiquitous problem in real applications. The Hamming distance is a simple and well-studied example of approximate matching, motivated by typing errors or noisy channels. Biological and image processing applications assign a different value to mismatches of different symbols. We consider the problem of approximate matching in the $L_1$ metric: the $k$-$L_1$-distance problem. Given a text $T = t_0, \ldots, t_{n-1}$ and a pattern $P = p_0, \ldots, p_{m-1}$, both strings of natural numbers, and a natural number $k$, we seek all text locations $i$ where the $L_1$ distance of the pattern from the length-$m$ substring of the text starting at $i$ is not greater than $k$, i.e. $\sum_{j=0}^{m-1} |t_{i+j} - p_j| \le k$. We provide an algorithm that solves the $k$-$L_1$-distance problem in time $O(n \sqrt{k} \log k)$. The algorithm applies a bounded divide-and-conquer approach and makes novel uses of non-boolean convolutions.
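To make the problem statement concrete, the following is a minimal sketch of a naive baseline that checks the definition directly at every text location. It is not the algorithm of the paper: it runs in O(nm) time in the worst case and uses neither divide-and-conquer nor convolutions. The function name `k_l1_matches` and its parameters are assumptions chosen for illustration.

```python
from typing import List


def k_l1_matches(text: List[int], pattern: List[int], k: int) -> List[int]:
    """Return all locations i with sum_j |text[i+j] - pattern[j]| <= k.

    Naive baseline that applies the definition directly; hypothetical
    helper for illustration, not the O(n sqrt(k) log k) algorithm.
    """
    n, m = len(text), len(pattern)
    matches = []
    for i in range(n - m + 1):
        dist = 0
        for j in range(m):
            dist += abs(text[i + j] - pattern[j])
            if dist > k:
                # The distance only grows, so this location cannot match.
                break
        if dist <= k:
            matches.append(i)
    return matches


if __name__ == "__main__":
    T = [1, 5, 3, 2, 6, 4]
    P = [5, 3, 2]
    print(k_l1_matches(T, P, 2))
    print(k_l1_matches(T, P, 7))
```

The early exit once the running sum exceeds `k` does not change the result; it only skips work at locations that have already failed the bound.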
|Number of pages||13|
|Journal||Lecture Notes in Computer Science|
|State||Published - 2005|
|Event||16th Annual Symposium on Combinatorial Pattern Matching, CPM 2005 - Jeju Island, Korea, Republic of|
Duration: 19 Jun 2005 → 22 Jun 2005