RNA G-quadruplexes (rG4s) are RNA secondary structures that are formed by guanine-rich sequences and have important cellular functions. Researchers would therefore like to know where and when rG4s form throughout the transcriptome. Measuring rG4s experimentally is a long and laborious process, so researchers often rely on computational methods to predict the rG4 propensity of a given RNA sequence. However, existing computational methods for rG4 propensity prediction are sub-optimal because they rely on specific sequence features and/or were trained on small datasets without considering rG4 stability information. Here, we developed rG4detector, a convolutional neural network that predicts the rG4 propensity of any given RNA sequence. We demonstrated that rG4detector outperforms existing methods on various transcriptomic datasets. In addition, we used rG4detector to detect potential rG4s in transcriptomic data and showed that it improves detection performance compared to existing methods. Last, we interrogated rG4detector for the important features it learned and discovered both known and novel molecular principles behind rG4 formation. We expect rG4detector to advance future rG4 research by enabling accurate detection and propensity prediction of rG4s. The code, trained models, and processed datasets are publicly available via github.com/OrensteinLab/rG4detector.
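As a rough illustration of the contrast drawn in the abstract, the sketch below shows (a) a scan based on a single fixed sequence feature and (b) the kind of one-hot encoding that a convolutional network typically takes as input. This is not the rG4detector implementation. The motif pattern (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) is a commonly used canonical definition assumed here for illustration, and the names `CANONICAL_RG4`, `find_canonical_rg4`, and `one_hot` are hypothetical.

```python
import re

# Assumption: a commonly used canonical G-quadruplex pattern, not taken from the paper.
# Four G-runs (>= 3 G's each) separated by three loops of 1-7 nucleotides.
CANONICAL_RG4 = re.compile(r"G{3,}(?:[ACGU]{1,7}?G{3,}){3}")

# Assumption: channel order A, C, G, U; any other symbol maps to an all-zero row.
ALPHABET = "ACGU"


def normalize(seq):
    """Upper-case the sequence and convert DNA-style T to RNA-style U."""
    return seq.upper().replace("T", "U")


def find_canonical_rg4(seq):
    """Return (start, end, matched_sequence) for non-overlapping motif hits."""
    rna = normalize(seq)
    return [(m.start(), m.end(), m.group()) for m in CANONICAL_RG4.finditer(rna)]


def one_hot(seq):
    """Encode a sequence as a list of 4-element rows, one row per nucleotide."""
    rna = normalize(seq)
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in rna]


if __name__ == "__main__":
    example = "AAGGGAGGGAGGGAGGGAA"
    print(find_canonical_rg4(example))
    print(one_hot("GA"))
```

A fixed pattern like this can only report whether a sequence matches the assumed motif, whereas a network trained on one-hot encoded sequences can in principle learn a graded propensity score from data.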
|Title of host publication||Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB 2022|
|Publisher||Association for Computing Machinery, Inc|
|State||Published - 7 Aug 2022|
|Event||13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB 2022 - Chicago, United States|
Duration: 7 Aug 2022 → 8 Aug 2022
Bibliographical note
Funding Information:
This research was partially supported by the Israel Cancer Association (grant no. 20221519) and by the Israeli Council for Higher Education (CHE) via Data Science Research Center, Ben-Gurion University of the Negev, Israel.
© 2022 Owner/Author.
- Deep neural networks
- RNA G-quadruplex