Abstract
We present problems in several application areas (tandem repeats in computational biology, poetry and music analysis, and author validation) that require a more sophisticated pattern matching model than hitherto considered.
We introduce a new matching criterion, generalized function matching, that encapsulates the notion suggested by the above problems. The generalized function matching problem has as its input a text T of length n over alphabet Σ_T ∪ {ϕ} and a pattern P = P[0]P[1]⋯P[m−1] of length m over alphabet Σ_P ∪ {ϕ}. We seek all text locations i where a prefix of the substring that starts at i is equal to f(P[0])f(P[1])⋯f(P[m−1]), for some function f that maps pattern symbols to strings over the text alphabet.
We give a polynomial time algorithm for the generalized function matching problem over bounded alphabets. We identify in this problem an interesting phenomenon that has been rare in pattern matching: the complexity of the naive solution is a polynomial with the alphabet size in the exponent. This causes a significant complexity difference between the bounded alphabet and infinite alphabet cases. We prove that the generalized function matching problem over infinite alphabets is NP-hard.
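To make the matching criterion concrete, the following is a minimal backtracking sketch in Python. It is an illustration of the definition, not the algorithm of the paper. It assumes that f maps each pattern symbol to a nonempty string, and it leaves out the special symbol ϕ, whose semantics the abstract does not spell out. The function names `matches_at` and `generalized_function_matches` are hypothetical.

```python
def matches_at(text, pattern, start):
    """Return True if some f makes f(P[0])...f(P[m-1]) a prefix of text[start:].

    Assumptions (not from the source): every image f(c) is nonempty,
    and the special symbol ϕ is not handled.
    """
    def extend(p_idx, t_idx, f):
        if p_idx == len(pattern):
            return True  # every pattern symbol has been consumed
        sym = pattern[p_idx]
        if sym in f:
            # The symbol is already bound: its image must occur at t_idx.
            image = f[sym]
            return (text.startswith(image, t_idx)
                    and extend(p_idx + 1, t_idx + len(image), f))
        # Unbound symbol: try every nonempty substring starting at t_idx.
        for end in range(t_idx + 1, len(text) + 1):
            f[sym] = text[t_idx:end]
            if extend(p_idx + 1, end, f):
                return True
            del f[sym]  # undo the binding and try a longer image
        return False

    return extend(0, start, {})


def generalized_function_matches(text, pattern):
    """List every text location i at which the pattern matches."""
    return [i for i in range(len(text)) if matches_at(text, pattern, i)]
```

For example, under these assumptions the pattern `"AA"` matches the text `"abab"` only at location 0, with f(A) = "ab". The search branches once per distinct pattern symbol, which gives an intuition for why the number of distinct symbols ends up in the exponent of a naive running time.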
Original language | American English |
---|---|
Title of host publication | 15th Annual International Symposium on Algorithms and Computation (ISAAC 2004) |
State | Published - 2004 |