Abstract
Transcription factors (TFs) achieve DNA-binding specificity through contacts with functional groups of bases (base readout) and readout of structural properties of the double helix (shape readout). Currently, it remains unclear whether DNA shape readout is utilized by only a few selected TF families, or whether this mechanism is used extensively by most TF families. We resequenced data from previously published HT-SELEX experiments, the most extensive mammalian TF–DNA binding data available to date. Using these data, we demonstrated the contributions of DNA shape readout across diverse TF families and its importance in core motif-flanking regions. Statistical machine-learning models combined with feature-selection techniques helped to reveal the nucleotide position-dependent DNA shape readout in TF-binding sites and the TF family-specific position dependence. Based on these results, we proposed novel DNA shape logos to visualize the DNA shape preferences of TFs. Overall, this work suggests a way of obtaining mechanistic insights into TF–DNA binding without relying on experimentally solved all-atom structures.
Original language | English |
---|---|
Article number | 910 |
Journal | Molecular Systems Biology |
Volume | 13 |
Issue number | 2 |
State | Published - 6 Feb 2017 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2017 The Authors. Published under the terms of the CC BY 4.0 license.
Funding
This work was performed in part while Y.O. and R.S. were visiting the Simons Institute for the Theory of Computing at UC Berkeley. The work was supported by the National Institutes of Health (grants R01GM106056 and U01GM103804 to R.R.) and an Alfred P. Sloan Research Fellowship (to R.R.), the Israel Science Foundation (grant 317/13 to R.S.) and the Raymond and Beverly Sackler Chair in Bioinformatics (to R.S.), and Knut and Alice Wallenberg Foundation and Swedish Research Council grants (to J.T.). L.Y. and Y.O. acknowledge support through Dan David Prize scholarships. Y.O. was partially supported by the Edmond J. Safra Center for Bioinformatics at Tel Aviv University. Open-access charges were defrayed in part through the National Science Foundation (grant MCB-1413539 to R.R.).
Funders | Funder number |
---|---|
Edmond J. Safra Center for Bioinformatics | |
Simons Institute for the Theory of Computing at UC Berkeley | |
National Science Foundation | MCB-1413539 |
National Institutes of Health | U01GM103804 |
National Institute of General Medical Sciences | R01GM106056 |
Alfred P. Sloan Foundation | |
Israel Science Foundation | 317/13 |
Knut och Alice Wallenbergs Stiftelse | |
Vetenskapsrådet | |
Tel Aviv University | |
Keywords
- DNA shape
- binding specificity
- feature selection
- quantitative modeling
- transcription factor