Abstract
Given an alphabet $\Sigma = \{1, 2, \ldots, |\Sigma|\}$, a text string $T \in \Sigma^n$ and a pattern string $P \in \Sigma^m$, for each $i = 1, 2, \ldots, n-m+1$ define $L_p(i)$ as the $p$-norm distance when the pattern is aligned below the text and starts at position $i$ of the text. The problem of pattern matching with $L_p$ distance is to compute $L_p(i)$ for every $i = 1, 2, \ldots, n-m+1$. We discuss the problem for $p = 1, 2, \infty$. First, in the case of $L_1$ matching (pattern matching with an $L_1$ distance), we show a reduction of the string matching with mismatches problem to the $L_1$ matching problem, and we present an algorithm that approximates the $L_1$ matching up to a factor of $1+\varepsilon$ and has an $O(\frac{1}{\varepsilon^2} n \log m \log |\Sigma|)$ run time. Then, the $L_2$ matching problem (pattern matching with an $L_2$ distance) is solved with a simple $O(n \log m)$ time algorithm. Finally, we provide an algorithm that approximates the $L_\infty$ matching up to a factor of $1+\varepsilon$ with a run time of $O(\frac{1}{\varepsilon} n \log m \log |\Sigma|)$. We also generalize the problem of string matching with mismatches to have weighted mismatches and present an $O(n \log^4 m)$ algorithm that approximates the results of this problem up to a factor of $O(\log m)$ in the case that the weight function is a metric.
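To make the definition of $L_p(i)$ concrete, the following is a minimal brute-force sketch that evaluates it directly for every alignment. It is not the algorithm from the paper: it runs in $O(nm)$ time and only serves as a reference for what the faster exact and approximate algorithms compute. The function name `lp_distances` is hypothetical, the code assumes the standard $p$-norm (the maximum absolute difference for $p = \infty$), and it uses 0-based indices, so list entry `i` corresponds to $L_p(i+1)$ in the abstract's notation.

```python
import math


def lp_distances(text, pattern, p):
    """Return [L_p(1), ..., L_p(n-m+1)] by direct evaluation in O(nm) time.

    text and pattern are sequences of integers (symbols of the alphabet);
    p is a positive number, or math.inf for the L-infinity distance.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    result = []
    for i in range(n - m + 1):  # 0-based start of the alignment
        # Absolute differences between the pattern and the aligned text window.
        diffs = [abs(text[i + j] - pattern[j]) for j in range(m)]
        if p == math.inf:
            result.append(max(diffs))
        else:
            result.append(sum(d ** p for d in diffs) ** (1.0 / p))
    return result
```

For example, with text `[1, 3, 2, 5]` and pattern `[2, 2]`, the three alignments have absolute differences `[1, 1]`, `[1, 0]` and `[0, 3]`, so `lp_distances([1, 3, 2, 5], [2, 2], 1)` returns `[2.0, 1.0, 3.0]` and the same call with `math.inf` returns `[1, 1, 3]`.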
Original language | English |
---|---|
Pages (from-to) | 335-348 |
Number of pages | 14 |
Journal | Algorithmica |
Volume | 60 |
Issue number | 2 |
State | Published - Jun 2011 |
Bibliographical note
Funding Information: Research supported in part by US-Israel Binational Science Foundation.
Funding
Funders | Funder number |
---|---|
United States-Israel Binational Science Foundation | |
Keywords
- Approximate string matching
- Combinatorial algorithms on words
- Design and analysis of algorithms
- Hamming distance