A comparative analysis of transcription factor binding models learned from PBM, HT-SELEX and ChIP data

Yaron Orenstein, Ron Shamir

Research output: Contribution to journal › Article › peer-review

76 Scopus citations


Understanding gene regulation is a key challenge in today's biology. The new technologies of protein-binding microarrays (PBMs) and high-throughput SELEX (HT-SELEX) allow measurement of the binding intensities of one transcription factor (TF) to numerous synthetic double-stranded DNA sequences in a single experiment. Recently, Jolma et al. reported the results of 547 HT-SELEX experiments covering human and mouse TFs. Because 162 of these TFs were also covered by PBM technology, for the first time, a large-scale comparison between implementations of these two in vitro technologies is possible. Here we assessed the similarities and differences between binding models, represented as position weight matrices, inferred from PBM and HT-SELEX, and also measured how well these models predict in vivo binding. Our results show that HT-SELEX- and PBM-derived models agree for most TFs. For some TFs, the HT-SELEX-derived models are longer versions of the PBM-derived models, whereas for other TFs, the HT-SELEX models match the secondary PBM-derived models. Remarkably, PBM-based 8-mer ranking is more accurate than that of HT-SELEX, but models derived from HT-SELEX predict in vivo binding better. In addition, we reveal several biases in HT-SELEX data including nucleotide frequency bias, enrichment of C-rich k-mers and oligos and underrepresentation of palindromes.
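To make the abstract's terms concrete, here is a minimal Python sketch of how a position weight matrix can score a DNA sequence on both strands, together with a check for reverse-complement palindromes. It is an illustration only and is not the authors' pipeline. The count matrix `COUNTS`, the pseudocount, the uniform background, and all function names are hypothetical choices made for this example.

```python
import math

ALPHABET = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical count matrix for a 4-bp motif (rows: positions; columns: A, C, G, T).
# These numbers are invented for illustration and come from no real experiment.
COUNTS = [
    [8, 1, 0, 1],
    [0, 9, 1, 0],
    [1, 0, 9, 0],
    [0, 1, 0, 9],
]


def to_log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert counts to a log2-odds PWM against a uniform background (an assumption)."""
    pwm = []
    for row in counts:
        total = sum(row) + pseudocount * len(row)
        pwm.append([math.log2((c + pseudocount) / total / background) for c in row])
    return pwm


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def score_window(pwm, window):
    """Sum the per-position log-odds for a window as long as the PWM."""
    return sum(pwm[i][ALPHABET.index(base)] for i, base in enumerate(window))


def best_score(pwm, seq):
    """Return the maximum PWM score over all windows on both strands."""
    k = len(pwm)
    if len(seq) < k:
        raise ValueError("sequence is shorter than the PWM")
    best = float("-inf")
    for strand in (seq, reverse_complement(seq)):
        for start in range(len(strand) - k + 1):
            best = max(best, score_window(pwm, strand[start:start + k]))
    return best


def is_palindrome(seq):
    """True if the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


PWM = to_log_odds(COUNTS)
```

With these definitions, `best_score(PWM, "TTACGTAA")` scores higher than `best_score(PWM, "TTTTTTTT")`, because the first sequence contains the consensus `ACGT`. Ranking all 8-mers by such a score is one simple way to compare a model against measured binding intensities.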

Original language: English
Pages (from-to): e63
Journal: Nucleic Acids Research
Issue number: 8
State: Published - Apr 2014
Externally published: Yes

Bibliographical note

Funding Information:
Israel Science Foundation (ISF) [802/08, 317/13]; Edmond J. Safra Center for Bioinformatics at Tel Aviv University, the Dan David Foundation, and the Israeli Center for Research Excellence (I-CORE), Gene Regulation in Complex Human Disease, center 41/11 (to Y.O.). Funding for Open Access charge: ISF grant [317/13] and I-CORE.

