TY - JOUR

T1 - Configurations and Minority in the String Consensus Problem

AU - Amir, Amihood

AU - Paryenty, Haim

AU - Roditty, Liam

N1 - Publisher Copyright: © 2015, Springer Science+Business Media New York.

PY - 2016/4/1

Y1 - 2016/4/1

N2 - The Closest String Problem is defined as follows. Let (Formula presented.) be a set of (Formula presented.) strings (Formula presented.), each of length (Formula presented.). Find a string (Formula presented.) such that the maximum Hamming distance of (Formula presented.) from each of the strings is minimized. We denote this distance by (Formula presented.). The string (Formula presented.) is called a consensus string. In this paper we present two main algorithms: the Configuration algorithm, with (Formula presented.) running time for this problem, and the Minority algorithm. The problem was introduced by Lanctot et al. [SODA’99; Inf Comput 185(1):41–55, 2003]. They showed that the problem is (Formula presented.)-hard and provided an approximation algorithm based on Integer Programming. Since then, the closest string problem has been studied extensively in both computational biology and theoretical computer science. This research can be roughly divided into three categories: approximate, exact, and practical solutions. This paper falls under the exact solutions category. Despite the great effort to obtain efficient algorithms for this problem, an algorithm with the natural running time of (Formula presented.) was not known. In this paper we close this gap. Our result means that algorithms solving the closest string problem in times (Formula presented.) and (Formula presented.) exist for the cases of (Formula presented.) and (Formula presented.), respectively. It is known that, in fact, the cases of (Formula presented.) and (Formula presented.) can be solved in linear time. No efficient algorithm is currently known for the case of (Formula presented.). We prove two lemmas, the unit square lemma and the minority lemma, that exploit surprising properties of the closest string problem and enable constructing the closest string in a sequential fashion. These lemmas, with some additional ideas, give a (Formula presented.) algorithm for computing a closest string of (Formula presented.) binary strings. Algorithm Minority is based on these lemmas.

KW - Closest string problem

KW - Consensus

KW - Strings

UR - http://www.scopus.com/inward/record.url?scp=84928139538&partnerID=8YFLogxK

U2 - 10.1007/s00453-015-9996-7

DO - 10.1007/s00453-015-9996-7

M3 - Article

AN - SCOPUS:84928139538

SN - 0178-4617

VL - 74

SP - 1267

EP - 1292

JO - Algorithmica

JF - Algorithmica

IS - 4

ER -