TY - JOUR
T1 - Reversibility and efficiency in coding protein information
AU - Tamir, Boaz
AU - Priel, Avner
PY - 2010/12/21
Y1 - 2010/12/21
N2 - Why the genetic code has a fixed length? Protein information is transferred by coding each amino acid using codons whose length equals 3 for all amino acids. Hence the most probable and the least probable amino acid get a codeword with an equal length. Moreover, the distributions of amino acids found in nature are not uniform and therefore the efficiency of such codes is sub-optimal. The origins of these apparently non-efficient codes are yet unclear. In this paper we propose an a priori argument for the energy efficiency of such codes resulting from their reversibility, in contrast to their time inefficiency. Such codes are reversible in the sense that a primitive processor, reading three letters in each step, can always reverse its operation, undoing its process.We examine the codes for the distributions of amino acids that exist in nature and show that they could not be both time efficient and reversible. We investigate a family of Zipf-type distributions and present their efficient (non-fixed length) prefix code, their graphs, and the condition for their reversibility. We prove that for a large family of such distributions, if the code is time efficient, it could not be reversible. In other words, if pre-biotic processes demand reversibility, the protein code could not be time efficient. The benefits of reversibility are clear: reversible processes are adiabatic, namely, they dissipate a very small amount of energy. Such processes must be done slowly enough; therefore time efficiency is non-important. It is reasonable to assume that early biochemical complexes were more prone towards energy efficiency, where forward and backward processes were almost symmetrical.
AB - Why the genetic code has a fixed length? Protein information is transferred by coding each amino acid using codons whose length equals 3 for all amino acids. Hence the most probable and the least probable amino acid get a codeword with an equal length. Moreover, the distributions of amino acids found in nature are not uniform and therefore the efficiency of such codes is sub-optimal. The origins of these apparently non-efficient codes are yet unclear. In this paper we propose an a priori argument for the energy efficiency of such codes resulting from their reversibility, in contrast to their time inefficiency. Such codes are reversible in the sense that a primitive processor, reading three letters in each step, can always reverse its operation, undoing its process.We examine the codes for the distributions of amino acids that exist in nature and show that they could not be both time efficient and reversible. We investigate a family of Zipf-type distributions and present their efficient (non-fixed length) prefix code, their graphs, and the condition for their reversibility. We prove that for a large family of such distributions, if the code is time efficient, it could not be reversible. In other words, if pre-biotic processes demand reversibility, the protein code could not be time efficient. The benefits of reversibility are clear: reversible processes are adiabatic, namely, they dissipate a very small amount of energy. Such processes must be done slowly enough; therefore time efficiency is non-important. It is reasonable to assume that early biochemical complexes were more prone towards energy efficiency, where forward and backward processes were almost symmetrical.
KW - Energy dissipation
KW - Huffman prefix code
KW - Reversible computation
KW - Zipf distribution
UR - http://www.scopus.com/inward/record.url?scp=77957273240&partnerID=8YFLogxK
U2 - 10.1016/j.jtbi.2010.09.025
DO - 10.1016/j.jtbi.2010.09.025
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
C2 - 20868696
AN - SCOPUS:77957273240
SN - 0022-5193
VL - 267
SP - 519
EP - 525
JO - Journal of Theoretical Biology
JF - Journal of Theoretical Biology
IS - 4
ER -