TY - GEN
T1 - Asynchronous pattern matching - metrics (Extended abstract)
AU - Amir, Amihood
PY - 2005
Y1 - 2005
N2 - Traditional Approximate Pattern Matching (e.g. Hamming distance errors, edit distance errors) assumes that various types of errors may occur to the data, but an implicit assumption is that the order of the data remains unchanged. Over the years, some applications identified types of "errors" where the data remains correct but its order is compromised. The earliest example is the "swap" error motivated by a common typing error. Other widely known examples such as transpositions, reversals and interchanges are motivated by biology. We propose that it is time to formally split the concept of "errors in data" and "errors in address" since they present different algorithmic challenges solved by different techniques. The "errors in address" model, which we call asynchronous pattern matching, since the data does not arrive in a synchronous sequential manner, is rich in problems not addressed hitherto. We will consider some reasonable metrics for asynchronous pattern matching, such as the number of inversions, or the number of generalized swaps, and show some efficient algorithms for these problems. As expected, the techniques needed to solve the problems are not taken from the standard pattern matching "toolkit".
AB - Traditional Approximate Pattern Matching (e.g. Hamming distance errors, edit distance errors) assumes that various types of errors may occur to the data, but an implicit assumption is that the order of the data remains unchanged. Over the years, some applications identified types of "errors" where the data remains correct but its order is compromised. The earliest example is the "swap" error motivated by a common typing error. Other widely known examples such as transpositions, reversals and interchanges are motivated by biology. We propose that it is time to formally split the concept of "errors in data" and "errors in address" since they present different algorithmic challenges solved by different techniques. The "errors in address" model, which we call asynchronous pattern matching, since the data does not arrive in a synchronous sequential manner, is rich in problems not addressed hitherto. We will consider some reasonable metrics for asynchronous pattern matching, such as the number of inversions, or the number of generalized swaps, and show some efficient algorithms for these problems. As expected, the techniques needed to solve the problems are not taken from the standard pattern matching "toolkit".
UR - http://www.scopus.com/inward/record.url?scp=84869113730&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84869113730
SN - 8001033074
SN - 9788001033074
T3 - Proceedings of the Prague Stringology Conference '05
SP - 31
EP - 36
BT - Proceedings of the Prague Stringology Conference '05
T2 - Prague Stringology Conference '05, PSC 2005
Y2 - 29 August 2005 through 31 August 2005
ER -