TY - GEN

T1 - Quick greedy computation for minimum common string partitions

AU - Goldstein, Isaac

AU - Lewenstein, Moshe

PY - 2011

Y1 - 2011

N2 - In the minimum common string partition problem one is given two strings S and T with the same character statistics and one seeks the smallest partition of S into substrings so that T can also be partitioned into the same substring multiset. The problem is fundamental in several variants of edit distance with block operations, e.g. signed reversal distance with duplicates and edit distance with moves. The minimum common string partition problem is known to be NP-complete and the best approximation known is of order O(log n log* n). Since this problem is of utmost practical importance one seeks a heuristic that will (1) usually have a low approximation factor and (2) will run fast. A simple greedy algorithm is known and it has been well-studied from an approximation point of view. It has been shown to have a bad worst case approximation factor. However, all the bad approximation factors presented so far stem from complicated recursive construction. In practice the greedy algorithm seems to have small approximation factors. However, the best current implementation of greedy runs in quadratic time. We propose a novel method to implement greedy in linear time.

AB - In the minimum common string partition problem one is given two strings S and T with the same character statistics and one seeks the smallest partition of S into substrings so that T can also be partitioned into the same substring multiset. The problem is fundamental in several variants of edit distance with block operations, e.g. signed reversal distance with duplicates and edit distance with moves. The minimum common string partition problem is known to be NP-complete and the best approximation known is of order O(log n log* n). Since this problem is of utmost practical importance one seeks a heuristic that will (1) usually have a low approximation factor and (2) will run fast. A simple greedy algorithm is known and it has been well-studied from an approximation point of view. It has been shown to have a bad worst case approximation factor. However, all the bad approximation factors presented so far stem from complicated recursive construction. In practice the greedy algorithm seems to have small approximation factors. However, the best current implementation of greedy runs in quadratic time. We propose a novel method to implement greedy in linear time.

UR - http://www.scopus.com/inward/record.url?scp=79960098377&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-21458-5_24

DO - 10.1007/978-3-642-21458-5_24

M3 - Conference contribution

AN - SCOPUS:79960098377

SN - 9783642214578

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 273

EP - 284

BT - Combinatorial Pattern Matching - 22nd Annual Symposium, CPM 2011, Proceedings

T2 - 22nd Annual Symposium on Combinatorial Pattern Matching, CPM 2011

Y2 - 27 June 2011 through 29 June 2011

ER -