TY - GEN
T1 - Restricted common superstring and restricted common supersequence
AU - Clifford, Raphaël
AU - Gotthilf, Zvi
AU - Lewenstein, Moshe
AU - Popa, Alexandru
PY - 2011
Y1 - 2011
N2 - The shortest common superstring and the shortest common supersequence are two well-studied problems with a wide range of applications. In this paper we consider both problems with resource constraints, denoted as the Restricted Common Superstring (RCSstr) problem and the Restricted Common Supersequence (RCSseq) problem. In the RCSstr (RCSseq) problem we are given a set S of n strings, s_1, s_2, ..., s_n, and a multiset t = {t_1, t_2, ..., t_m}, and the goal is to find a permutation π: {1, ..., m} → {1, ..., m} that maximizes the number of strings in S that are substrings (subsequences) of π(t) = t_π(1) t_π(2) ⋯ t_π(m) (we call this ordering of the multiset, π(t), a permutation of t). We first show that in its most general setting the RCSstr problem is NP-complete and hard to approximate within a factor of n^(1-ε), for any ε > 0, unless P = NP. Afterwards, we present two separate reductions to show that the RCSstr problem remains NP-hard even when the elements of t are drawn from a binary alphabet or when all input strings are of length two. We then present approximation results for several variants of the RCSstr problem. In the second part of this paper, we turn to the RCSseq problem, for which we present hardness results, tight lower bounds and approximation algorithms.
AB - The shortest common superstring and the shortest common supersequence are two well-studied problems with a wide range of applications. In this paper we consider both problems with resource constraints, denoted as the Restricted Common Superstring (RCSstr) problem and the Restricted Common Supersequence (RCSseq) problem. In the RCSstr (RCSseq) problem we are given a set S of n strings, s_1, s_2, ..., s_n, and a multiset t = {t_1, t_2, ..., t_m}, and the goal is to find a permutation π: {1, ..., m} → {1, ..., m} that maximizes the number of strings in S that are substrings (subsequences) of π(t) = t_π(1) t_π(2) ⋯ t_π(m) (we call this ordering of the multiset, π(t), a permutation of t). We first show that in its most general setting the RCSstr problem is NP-complete and hard to approximate within a factor of n^(1-ε), for any ε > 0, unless P = NP. Afterwards, we present two separate reductions to show that the RCSstr problem remains NP-hard even when the elements of t are drawn from a binary alphabet or when all input strings are of length two. We then present approximation results for several variants of the RCSstr problem. In the second part of this paper, we turn to the RCSseq problem, for which we present hardness results, tight lower bounds and approximation algorithms.
UR - http://www.scopus.com/inward/record.url?scp=79960081879&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-21458-5_39
DO - 10.1007/978-3-642-21458-5_39
M3 - Conference contribution
AN - SCOPUS:79960081879
SN - 9783642214578
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 467
EP - 478
BT - Combinatorial Pattern Matching - 22nd Annual Symposium, CPM 2011, Proceedings
T2 - 22nd Annual Symposium on Combinatorial Pattern Matching, CPM 2011
Y2 - 27 June 2011 through 29 June 2011
ER -