Abstract
Finding the longest common subsequence (LCS) of two given strings $A_1$ and $A_2$ is a well-studied problem. The constrained longest common subsequence (C-LCS) of three strings $A_1$, $A_2$ and $B_1$ is the longest common subsequence of $A_1$ and $A_2$ that contains $B_1$ as a subsequence. The fastest algorithm solving the C-LCS problem has a time complexity of $O(m_1 m_2 n_1)$, where $m_1$, $m_2$ and $n_1$ are the lengths of $A_1$, $A_2$ and $B_1$ respectively. In this paper we consider two general variants of the C-LCS problem. First, we show that in the case of two input strings and an arbitrary number of constraint strings, it is NP-hard to approximate the C-LCS problem. Moreover, it is easy to see that in the case of an arbitrary number of input strings and a single constraint, the problem of finding the constrained longest common subsequence is NP-hard. Therefore, we propose a linear-time approximation algorithm for this variant. Our algorithm yields a $1/\sqrt{m_{\min}|\Sigma|}$ approximation factor, where $m_{\min}$ is the length of the shortest input string and $|\Sigma|$ is the size of the alphabet.
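To make the basic C-LCS definition concrete, the following is a minimal sketch of a textbook-style dynamic program that computes the length of the constrained LCS for two input strings and one constraint string, within the $O(m_1 m_2 n_1)$ time bound mentioned above. It is an illustration only and is not claimed to be the specific algorithm the abstract refers to, nor the approximation algorithm proposed in the paper. The function name `clcs_length` and its parameter names are hypothetical.

```python
def clcs_length(a1, a2, b):
    """Length of the longest common subsequence of a1 and a2 that
    contains b as a subsequence, or None if no such sequence exists.

    Illustrative sketch only: `clcs_length` is a hypothetical name.
    """
    NEG = float("-inf")  # marks "no feasible common subsequence"
    m1, m2, n = len(a1), len(a2), len(b)

    # f[i][j][k]: best length over common subsequences of a1[:i] and
    # a2[:j] that contain b[:k] as a subsequence (NEG if none exists).
    f = [[[NEG] * (n + 1) for _ in range(m2 + 1)] for _ in range(m1 + 1)]

    for i in range(m1 + 1):
        for j in range(m2 + 1):
            for k in range(n + 1):
                if i == 0 or j == 0:
                    # An empty prefix can only satisfy the empty constraint.
                    f[i][j][k] = 0 if k == 0 else NEG
                    continue
                # Option 1: drop the last character of a1 or of a2.
                best = max(f[i - 1][j][k], f[i][j - 1][k])
                if a1[i - 1] == a2[j - 1]:
                    # Option 2: match the characters without consuming b.
                    best = max(best, f[i - 1][j - 1][k] + 1)
                    if k > 0 and a1[i - 1] == b[k - 1]:
                        # Option 3: the match also covers b[k - 1].
                        best = max(best, f[i - 1][j - 1][k - 1] + 1)
                f[i][j][k] = best

    result = f[m1][m2][n]
    return None if result == NEG else int(result)
```

With an empty constraint the function reduces to the ordinary LCS length. The full three-dimensional table is kept for readability; since row `i` depends only on row `i - 1`, a space-conscious version could keep just two layers.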
Original language | American English |
---|---|
Title of host publication | Annual Symposium on Combinatorial Pattern Matching |
Editors | Paolo Ferragina, Gad M. Landau |
Publisher | Springer Berlin Heidelberg |
State | Published - 2008 |