TY - JOUR
T1 - Generalized function matching
AU - Amir, Amihood
AU - Nor, Igor
PY - 2007/9
Y1 - 2007/9
N2 - We present problems in different application areas: tandem repeats (computational biology), poetry and music analysis, and author validation, that require a more sophisticated pattern matching model than hitherto considered. We introduce a new matching criterion, generalized function matching, that encapsulates the notion suggested by the above problems. The generalized function matching problem has as its input a text T of length n over alphabet ΣT ∪ {φ} and a pattern P = P[0] P[1] ⋯ P[m - 1] of length m over alphabet ΣP ∪ {φ}. We seek all text locations i where the prefix of the substring that starts at i is equal to f(P[0]) f(P[1]) ⋯ f(P[m - 1]), for some function f : ΣP → ΣT*. We give a polynomial-time algorithm for the generalized pattern matching problem over bounded alphabets. We identify in this problem an interesting phenomenon that has been rare in pattern matching: the complexity of the naive solution is a polynomial with the alphabet size in the exponent. This causes a significant complexity difference between the bounded alphabet and infinite alphabet cases. We prove that the generalized pattern matching problem over infinite alphabets is NP-hard.
AB - We present problems in different application areas: tandem repeats (computational biology), poetry and music analysis, and author validation, that require a more sophisticated pattern matching model than hitherto considered. We introduce a new matching criterion, generalized function matching, that encapsulates the notion suggested by the above problems. The generalized function matching problem has as its input a text T of length n over alphabet ΣT ∪ {φ} and a pattern P = P[0] P[1] ⋯ P[m - 1] of length m over alphabet ΣP ∪ {φ}. We seek all text locations i where the prefix of the substring that starts at i is equal to f(P[0]) f(P[1]) ⋯ f(P[m - 1]), for some function f : ΣP → ΣT*. We give a polynomial-time algorithm for the generalized pattern matching problem over bounded alphabets. We identify in this problem an interesting phenomenon that has been rare in pattern matching: the complexity of the naive solution is a polynomial with the alphabet size in the exponent. This causes a significant complexity difference between the bounded alphabet and infinite alphabet cases. We prove that the generalized pattern matching problem over infinite alphabets is NP-hard.
KW - Function matching
KW - NP-hard
KW - Parameterized matching
KW - Pattern matching
UR - http://www.scopus.com/inward/record.url?scp=34248667020&partnerID=8YFLogxK
U2 - 10.1016/j.jda.2006.10.001
DO - 10.1016/j.jda.2006.10.001
M3 - Article
AN - SCOPUS:34248667020
SN - 1570-8667
VL - 5
SP - 514
EP - 523
JO - Journal of Discrete Algorithms
JF - Journal of Discrete Algorithms
IS - 3
ER -