Massive mining of publicly available RNA-seq data from human and mouse

Alexander Lachmann, Denis Torre, Alexandra B. Keenan, Kathleen M. Jagodnik, Hoyjin J. Lee, Lily Wang, Moshe C. Silverstein, Avi Ma'ayan

Research output: Contribution to journal > Article > peer-review

448 Scopus citations

Abstract

RNA sequencing (RNA-seq) is the leading technology for genome-wide transcript quantification. However, publicly available RNA-seq data is currently provided mostly in raw form, which is a significant barrier for global and integrative retrospective analyses. ARCHS4 is a web resource that makes the majority of published RNA-seq data from human and mouse available at the gene and transcript levels. To develop ARCHS4, available FASTQ files from RNA-seq experiments in the Gene Expression Omnibus (GEO) were aligned using a cloud-based infrastructure. In total, 187,946 samples are accessible through ARCHS4: 103,083 from mouse and 84,863 from human. Additionally, the ARCHS4 web interface provides intuitive exploration of the processed data through querying tools, interactive visualization, and gene pages. For each gene, these pages provide average expression across cell lines and tissues, the top co-expressed genes, and predicted biological functions and protein-protein interactions based on prior knowledge combined with co-expression.

Original language: English
Article number: 1366
Journal: Nature Communications
Volume: 9
Issue number: 1
DOIs
State: Published - 10 Apr 2018
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2018 The Author(s).

Funding

This work was partially supported by the National Institutes of Health (NIH) grants U54HL127624, U54CA189201, OT3OD025467, and U24CA224260 as well as cloud credits from the NIH BD2K Commons Cloud Credit Pilot project.

Funder: Funder number
National Institutes of Health: OT3OD025467, U24CA224260, U54HL127624
National Cancer Institute: U54CA189201
