PrismEXP: gene annotation prediction from stratified gene-gene co-expression matrices

Alexander Lachmann, Kaeli A. Rizzo, Alon Bartal, Minji Jeon, Daniel J.B. Clarke, Avi Ma’ayan

Research output: Contribution to journal, article, peer-reviewed

Abstract

Background: Gene-gene co-expression correlations measured by mRNA-sequencing (RNA-seq) can be used to predict gene annotations based on the co-variance structure within these data. In our prior work, we showed that uniformly aligned RNA-seq co-expression data from thousands of diverse studies is highly predictive of both gene annotations and protein-protein interactions. However, the performance of the predictions varies depending on whether the gene annotations and interactions are cell type- and tissue-specific or agnostic. Tissue- and cell type-specific gene-gene co-expression data can be useful for making more accurate predictions because many genes perform their functions in unique ways in different cellular contexts. However, identifying the optimal tissues and cell types to partition the global gene-gene co-expression matrix is challenging.

Results: Here we introduce and validate an approach called PRediction of gene Insights from Stratified Mammalian gene co-EXPression (PrismEXP) for improved gene annotation predictions based on RNA-seq gene-gene co-expression data. Using uniformly aligned data from ARCHS4, we apply PrismEXP to predict a wide variety of gene annotations, including pathway membership, Gene Ontology terms, and human and mouse phenotypes. Predictions made with PrismEXP outperform predictions made with the global cross-tissue co-expression correlation matrix approach on all tested domains, and training using one annotation domain can be used to predict annotations in other domains.

Conclusions: By demonstrating the utility of PrismEXP predictions in multiple use cases, we show how PrismEXP can be used to enhance unsupervised machine learning methods to better understand the roles of understudied genes and proteins. To make PrismEXP accessible, it is provided via a user-friendly web interface, a Python package, and an Appyter.

Availability: The PrismEXP web-based application, with pre-computed PrismEXP predictions, is available from: https://maayanlab.cloud/prismexp; PrismEXP is also available as an Appyter: https://appyters.maayanlab.cloud/PrismEXP/; and as a Python package: https://github.com/maayanlab/prismexp.
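
The contrast the abstract draws between a global co-expression matrix and stratified matrices can be illustrated with a small sketch. This is not the PrismEXP implementation, and it does not use the PrismEXP package API; the abstract does not specify the model. It assumes, purely for illustration, that a gene is scored against an annotation by its mean Pearson correlation with the genes already carrying that annotation, computed once over all samples and once per sample stratum. The function names, the toy expression values, and the stratum labels "A" and "B" are all hypothetical.

```python
import numpy as np

def mean_set_correlation(expr, set_idx):
    """Score each gene by its mean Pearson correlation with the genes in set_idx.

    expr: (genes x samples) array; every gene must vary across the samples,
          otherwise np.corrcoef yields NaN for that gene.
    set_idx: indices of genes annotated with the term (at least two).
    A set member's correlation with itself is left out of its own average.
    """
    corr = np.corrcoef(expr)                      # genes x genes correlation matrix
    member = np.zeros(corr.shape[0], dtype=bool)
    member[set_idx] = True
    sums = corr[:, member].sum(axis=1)
    counts = np.full(corr.shape[0], member.sum(), dtype=float)
    sums[member] -= np.diag(corr)[member]         # drop self-correlation
    counts[member] -= 1.0
    return sums / counts

def stratified_features(expr, strata, set_idx):
    """Return one score column per stratum (e.g. per tissue or cell type)."""
    labels = np.unique(strata)
    cols = [mean_set_correlation(expr[:, strata == s], set_idx) for s in labels]
    return labels, np.column_stack(cols)

# Toy data (made up): 4 genes x 6 samples, three samples in each stratum.
expr = np.array([
    [1, 2, 3, 1, 2, 3],
    [2, 4, 6, 6, 4, 2],
    [3, 1, 2, 2, 3, 1],
    [1, 3, 2, 3, 1, 2],
], dtype=float)
strata = np.array(["A", "A", "A", "B", "B", "B"])
gene_set = [0, 3]                                 # hypothetical annotated genes

global_scores = mean_set_correlation(expr, gene_set)
labels, features = stratified_features(expr, strata, gene_set)
```

In this toy example, gene 1 tracks the gene set closely in stratum "A" but not in stratum "B", a difference that the single global score averages away. In a setup like this, the per-stratum columns in `features` would be the inputs to a downstream model, whereas the global approach supplies only one number per gene.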

Original language: English
Article number: e14927
Journal: PeerJ
Volume: 11
State: Published - Feb 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
Copyright 2023 Lachmann et al.

Funding

This work was supported by the National Institutes of Health (NIH) grants U24CA224260, U24CA264250, OT2OD030160, RC2DK131995, and R01DK131525. There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Funder: National Institutes of Health
Funder numbers: RC2DK131995, U24CA224260, R01DK131525, U24CA264250, OT2OD030160

    Keywords

    • Druggable genome
    • Gene expression
    • Gene function predictions
    • RNA-seq
    • Transcriptomics
    • Unsupervised learning