Prediction of DNA-binding residues from sequence

Yanay Ofran, Venkatesh Mysore, Burkhard Rost

Research output: Contribution to journal › Article › peer-review

149 Scopus citations


Motivation: Thousands of proteins are known to bind to DNA; for most of them the mechanism of action and the residues that bind to DNA, i.e. the binding sites, are still unknown. Experimental identification of binding sites requires expensive and laborious methods such as mutagenesis and binding assays. Hence, such studies are not applicable on a large scale. If the 3D structure of a protein is known, it is often possible to predict DNA-binding sites in silico. However, for most proteins, such knowledge is not available.

Results: It has been shown that DNA-binding residues have distinct biophysical characteristics. Here we demonstrate that these characteristics are so distinct that they enable accurate prediction of the residues that bind DNA directly from amino acid sequence, without requiring any additional experimental or structural information. In a cross-validation based on the largest non-redundant dataset of high-resolution protein-DNA complexes available today, we found that 89% of our predictions are confirmed by experimental data. Thus, it is now possible to identify DNA-binding sites on a proteomic scale even in the absence of any experimental data or 3D-structural information.

Original language: English
Pages (from-to): i347-i353
Issue number: 13
State: Published - 1 Jul 2007
Externally published: Yes

Bibliographical note

Funding Information:
We thank Guy Nimrod, Nir Ben-Tal (Tel Aviv University), and Trevor Siggers (Harvard University) for helpful discussions. We thank Jinfeng Liu, Andrew Kernytsky and Michael Honig (Columbia University) for help with computers and databases. This work was supported by grants 1-R01-GM64633 from the National Institute of General Medical Sciences (NIGMS) at the National Institutes of Health (NIH) and 2-R01-LM007329 from the National Library of Medicine (NLM). Last, but not least, we thank all those who deposit their experimental data in public databases and those who maintain these databases.

