## Abstract

We present problems in several application areas (tandem repeats in computational biology, poetry and music analysis, and author validation) that require a more sophisticated pattern matching model than hitherto considered. We introduce a new matching criterion, generalized function matching, that encapsulates the notion suggested by these problems. The generalized function matching problem has as its input a text T of length n over alphabet Σ_{T} ∪ {φ} and a pattern P = P[0] P[1] ⋯ P[m - 1] of length m over alphabet Σ_{P} ∪ {φ}. We seek all text locations i where a prefix of the substring that starts at i is equal to f(P[0]) f(P[1]) ⋯ f(P[m - 1]), for some function f : Σ_{P} → Σ_{T}^{*}. We give a polynomial time algorithm for the generalized pattern matching problem over bounded alphabets. We identify in this problem an interesting phenomenon that has been rare in pattern matching: the complexity of the naive solution is a polynomial with the alphabet size in the exponent. This causes a significant complexity difference between the bounded alphabet and infinite alphabet cases. We prove that the generalized pattern matching problem over infinite alphabets is NP-hard.
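To make the matching criterion concrete, the following is a minimal brute-force sketch in Python. It is an illustration, not the paper's polynomial time algorithm. It makes two assumptions that the abstract does not state: the symbol φ is not handled (both strings are taken to be φ-free), and every pattern symbol must map to a non-empty string, since allowing empty images would make every location match trivially. The names `matches_at` and `generalized_function_matches` are hypothetical.

```python
def matches_at(text, pattern, start):
    """Return True if some function f sends the pattern onto a prefix of text[start:].

    Assumptions (not from the paper): no wildcard symbols, and every
    pattern symbol is mapped to a non-empty substring of the text.
    """
    def extend(p_idx, t_idx, f):
        if p_idx == len(pattern):
            return True  # whole pattern consumed; matching a prefix is enough
        sym = pattern[p_idx]
        if sym in f:
            # f(sym) is already fixed, so it must occur at the current position.
            image = f[sym]
            if text.startswith(image, t_idx):
                return extend(p_idx + 1, t_idx + len(image), f)
            return False
        # First occurrence of sym: try every non-empty image starting here.
        for end in range(t_idx + 1, len(text) + 1):
            f[sym] = text[t_idx:end]
            if extend(p_idx + 1, end, f):
                return True
            del f[sym]  # backtrack and try a longer image
        return False

    return extend(0, start, {})


def generalized_function_matches(text, pattern):
    """List every start index at which the pattern function-matches the text."""
    return [i for i in range(len(text)) if matches_at(text, pattern, i)]
```

For example, `generalized_function_matches("abab", "XX")` reports location 0 only, with f(X) = "ab". Because the search tries up to about n candidate images for each distinct pattern symbol, its cost grows with the pattern alphabet size in the exponent, in line with the behaviour of the naive solution described in the abstract.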

Original language | English |
---|---|
Pages (from-to) | 514-523 |
Number of pages | 10 |
Journal | Journal of Discrete Algorithms |
Volume | 5 |
Issue number | 3 |
State | Published - Sep 2007 |

## Keywords

- Function matching
- NP-hard
- Parameterized matching
- Pattern matching