TY - GEN
T1 - Configurations and minority in the string consensus problem
AU - Amir, Amihood
AU - Paryenty, Haim
AU - Roditty, Liam
PY - 2012
Y1 - 2012
N2 - The Closest String Problem is defined as follows. Let S be a set of k strings {s_1, ..., s_k}, each of length ℓ; find a string ŝ such that the maximum Hamming distance of ŝ from each of the strings is minimized. We denote this distance by d. The string ŝ is called a consensus string. In this paper we present two main algorithms: the Configuration algorithm, with O(k^2 ℓ^k) running time for this problem, and the Minority algorithm. The problem was introduced by Lanctot, Li, Ma, Wang and Zhang [13]. They showed that the problem is NP-hard and provided an IP approximation algorithm. Since then the closest string problem has been studied extensively. This research can be roughly divided into three categories: approximate, exact and practical solutions. This paper falls under the exact solutions category. Despite the great effort to obtain efficient algorithms for this problem, an algorithm with the natural running time of O(ℓ^k) was not known. In this paper we close this gap. Our result means that algorithms solving the closest string problem in times O(ℓ^2), O(ℓ^3), O(ℓ^4) and O(ℓ^5) exist for the cases of k = 2, 3, 4 and 5, respectively. It is known that, in fact, the cases of k = 2, 3 and 4 can be solved in linear time. No efficient algorithm is currently known for the case of k = 5. We prove the minority lemma, which exploits surprising properties of the closest string problem and enables constructing the closest string in a sequential fashion. This lemma, with some additional ideas, gives an O(ℓ^2) time algorithm for computing a closest string of 5 binary strings.
UR - http://www.scopus.com/inward/record.url?scp=84867541672&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-34109-0_6
DO - 10.1007/978-3-642-34109-0_6
M3 - Conference contribution
AN - SCOPUS:84867541672
SN - 9783642341083
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 42
EP - 53
BT - String Processing and Information Retrieval - 19th International Symposium, SPIRE 2012, Proceedings
PB - Springer Verlag
T2 - 19th International Symposium on String Processing and Information Retrieval, SPIRE 2012
Y2 - 21 October 2012 through 25 October 2012
ER -