TY - GEN
T1 - Approximate string matching with swap and mismatch
AU - Lipsky, Ohad
AU - Porat, Benny
AU - Porat, Elly
AU - Shalom, B. Riva
AU - Tzur, Asaf
PY - 2007
Y1 - 2007
N2 - Finding the similarity between two sequences is a major problem in computer science. It is motivated by many issues from computational biology as well as from information retrieval and image processing. These fields take into account possible corruptions of the data caused by genome rearrangements, typing mistakes, and more. Therefore, many applications do not require an exact match between the sequences, but rather an approximate one. We consider mismatches and swaps as natural errors, of which only a small number are allowed. The edit distance problem with swap and mismatch operations was discussed by Amir et al. [3], who solved it in O(n√log m) time. Since then, the problem of string matching with at most k swap and mismatch errors has remained open. In this paper we present an algorithm that finds all locations where the pattern has at most k mismatch and swap errors in time O(n√k log m).
AB - Finding the similarity between two sequences is a major problem in computer science. It is motivated by many issues from computational biology as well as from information retrieval and image processing. These fields take into account possible corruptions of the data caused by genome rearrangements, typing mistakes, and more. Therefore, many applications do not require an exact match between the sequences, but rather an approximate one. We consider mismatches and swaps as natural errors, of which only a small number are allowed. The edit distance problem with swap and mismatch operations was discussed by Amir et al. [3], who solved it in O(n√log m) time. Since then, the problem of string matching with at most k swap and mismatch errors has remained open. In this paper we present an algorithm that finds all locations where the pattern has at most k mismatch and swap errors in time O(n√k log m).
UR - http://www.scopus.com/inward/record.url?scp=38349004713&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-77120-3_75
DO - 10.1007/978-3-540-77120-3_75
M3 - Conference contribution
AN - SCOPUS:38349004713
SN - 9783540771180
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 869
EP - 880
BT - Algorithms and Computation - 18th International Symposium, ISAAC 2007, Proceedings
PB - Springer Verlag
T2 - 18th International Symposium on Algorithms and Computation, ISAAC 2007
Y2 - 17 December 2007 through 19 December 2007
ER -