Generalized function matching

Amihood Amir, Igor Nor

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

We present problems in different application areas: tandem repeats (computational biology), poetry and music analysis, and author validation, that require a more sophisticated pattern matching model than hitherto considered. We introduce a new matching criterion - generalized function matching - that encapsulates the notion suggested by the above problems. The generalized function matching problem has as its input a text T of length n over alphabet Σ_T ∪ {φ} and a pattern P = P[0]P[1]⋯P[m-1] of length m over alphabet Σ_P ∪ {φ}. We seek all text locations i where the prefix of the substring that starts at i is equal to f(P[0])f(P[1])⋯f(P[m-1]), for some function f : Σ_P → Σ_T^*. We give a polynomial time algorithm for the generalized function matching problem over bounded alphabets. We identify in this problem an important new phenomenon in pattern matching: a significant complexity difference between the bounded alphabet case and the infinite alphabet case. We prove that the generalized function matching problem over infinite alphabets is NP-hard. To our knowledge, this is the first case in the literature where a pattern matching problem over a bounded alphabet can be solved in polynomial time but the infinite alphabet version is NP-hard.

Keywords: Pattern matching, function matching, parameterized matching, NP-hard.
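To make the matching criterion concrete, here is a minimal brute-force sketch in Python. It is not the polynomial time algorithm of the paper: it simply tries every candidate image for each pattern symbol by backtracking, which takes exponential time in the worst case. The function name `generalized_function_matches` is hypothetical. The sketch also makes two simplifying assumptions that the abstract does not state: f maps every symbol to a non-empty string (otherwise every location would match trivially), and the special symbol φ is ignored and treated like any other character.

```python
def generalized_function_matches(text: str, pattern: str) -> list[int]:
    """Return every index i such that some function f makes
    f(pattern[0]) f(pattern[1]) ... f(pattern[m-1]) a prefix of text[i:].

    Assumptions (not from the abstract): images under f are non-empty,
    and the symbol phi gets no special treatment.
    """

    def extend(p_idx: int, t_idx: int, f: dict[str, str]) -> bool:
        # All pattern symbols consumed: the images form a prefix of text[i:].
        if p_idx == len(pattern):
            return True
        # Each remaining symbol needs at least one text character.
        if len(text) - t_idx < len(pattern) - p_idx:
            return False
        symbol = pattern[p_idx]
        if symbol in f:
            # f is a function, so a repeated symbol must reuse its image.
            image = f[symbol]
            return text.startswith(image, t_idx) and extend(
                p_idx + 1, t_idx + len(image), f
            )
        # First occurrence of this symbol: try every non-empty image.
        for end in range(t_idx + 1, len(text) + 1):
            f[symbol] = text[t_idx:end]
            if extend(p_idx + 1, end, f):
                return True
            del f[symbol]  # backtrack and try a longer image
        return False

    # A fresh mapping is used for every candidate start position.
    return [i for i in range(len(text)) if extend(0, i, {})]


print(generalized_function_matches("abab", "XX"))  # [0], via f(X) = "ab"
```

Note that f need not be injective here: two different pattern symbols may receive the same image, which is what separates function matching from parameterized matching.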

Bibliographical note

Funding Information:
★ Partly supported by NSF grant CCR-01-04494 and ISF grant 282/01. ★★ Partly supported by ISF grant 282/01.

Funding

Funders                        Funder number
National Science Foundation    CCR-01-04494
Israel Science Foundation      282/01
