Abstract
The Closest String Problem is defined as follows. Let (Formula presented.) be a set of (Formula presented.) strings (Formula presented.), each of length (Formula presented.). Find a string (Formula presented.) such that the maximum Hamming distance of (Formula presented.) from each of the strings is minimized. We denote this distance by (Formula presented.). The string (Formula presented.) is called a consensus string. In this paper we present two main algorithms for this problem: the Configuration algorithm, with (Formula presented.) running time, and the Minority algorithm. The problem was introduced by Lanctot et al. [SODA’99 and (Inf Comput 185(1):41–55, 2003)]. They showed that the problem is (Formula presented.)-hard and provided an approximation algorithm based on Integer Programming. Since then, the closest string problem has been studied extensively in both computational biology and theoretical computer science. This research can be roughly divided into three categories: approximate, exact, and practical solutions. This paper falls under the exact solutions category. Despite the great effort to obtain efficient algorithms for this problem, an algorithm with the natural running time of (Formula presented.) was not known. In this paper we close this gap. Our result means that algorithms solving the closest string problem in times (Formula presented.) and (Formula presented.) exist for the cases of (Formula presented.) and (Formula presented.), respectively. It is known that, in fact, the cases of (Formula presented.) and (Formula presented.) can be solved in linear time. No efficient algorithm is currently known for the case of (Formula presented.). We prove two lemmas, the unit square lemma and the minority lemma, that exploit surprising properties of the closest string problem and enable constructing the closest string in a sequential fashion. These lemmas, together with some additional ideas, give a (Formula presented.) algorithm for computing a closest string of (Formula presented.) binary strings. Algorithm Minority is based on these lemmas.
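
To make the problem definition concrete, the following is a minimal brute-force sketch in Python. It is not the Configuration or Minority algorithm of the paper: it simply enumerates every candidate string over the symbols that occur in the input and keeps the one whose maximum Hamming distance to the input strings is smallest, so its running time is exponential in the string length. The function names `hamming` and `closest_string_bruteforce` are illustrative and do not come from the paper.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def closest_string_bruteforce(strings):
    """Return (consensus, radius) by exhaustive search.

    Illustrative only: tries every string over the symbols seen in the
    input, so it is practical only for very short strings.
    """
    if not strings:
        raise ValueError("need at least one input string")
    length = len(strings[0])
    if any(len(s) != length for s in strings):
        raise ValueError("all strings must have the same length")

    # Symbols outside the input never help, so restrict the search to them.
    alphabet = sorted(set("".join(strings)))

    best, best_radius = None, length + 1
    for candidate in product(alphabet, repeat=length):
        c = "".join(candidate)
        # Objective from the definition: the largest distance to any input.
        radius = max(hamming(c, s) for s in strings)
        if radius < best_radius:
            best, best_radius = c, radius
    return best, best_radius


if __name__ == "__main__":
    consensus, radius = closest_string_bruteforce(["0011", "0101", "0110"])
    print(consensus, radius)
```

For the two binary strings `000` and `111`, every candidate is at total distance 3 from the pair, so the smallest achievable maximum distance is 2; the sketch returns one such consensus string together with that distance.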
Original language | English |
---|---|
Pages (from-to) | 1267-1292 |
Number of pages | 26 |
Journal | Algorithmica |
Volume | 74 |
Issue number | 4 |
State | Published - 1 Apr 2016 |
Bibliographical note
Publisher Copyright: © 2015, Springer Science+Business Media New York.
Funding
Amihood Amir: Partly supported by NSF Grant CCR-09-04581 and ISF Grant 571/14.
Funders | Funder number |
---|---|
National Science Foundation | CCR-09-04581 |
Israel Science Foundation | 571/14 |
Keywords
- Closest string problem
- Consensus
- Strings