Removal of batch effects using distribution-matching residual networks

Uri Shaham, Kelly P. Stanton, Jun Zhao, Huamin Li, Khadir Raddassi, Ruth Montgomery, Yuval Kluger

Research output: Contribution to journal › Article › peer-review

112 Scopus citations

Abstract

Motivation: Sources of variability in experimentally derived data include measurement error in addition to the physical phenomena of interest. This measurement error is a combination of systematic components, originating from the measuring instrument, and random measurement errors. Several novel biological technologies, such as mass cytometry and single-cell RNA-seq (scRNA-seq), are plagued with systematic errors that may severely affect statistical analysis if the data are not properly calibrated.

Results: We propose a novel deep learning approach for removing systematic batch effects. Our method is based on a residual neural network, trained to minimize the Maximum Mean Discrepancy (MMD) between the multivariate distributions of two replicates, measured in different batches. We apply our method to mass cytometry and scRNA-seq datasets, and demonstrate that it effectively attenuates batch effects.
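
To make the training objective concrete, the following is a minimal sketch of how a squared MMD between two batches might be estimated, and how it shrinks once a systematic shift is removed. It is not the authors' implementation. The Gaussian kernel, the bandwidth `sigma`, the synthetic data, and the mean-shift correction (a toy stand-in for the trained residual network) are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=2.0):
    """Gaussian kernel matrix between rows of a and rows of b (assumed kernel)."""
    # Pairwise squared Euclidean distances, clipped at 0 to absorb round-off.
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma**2))


def mmd_squared(x, y, sigma=2.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = gaussian_kernel(x, x, sigma)
    k_yy = gaussian_kernel(y, y, sigma)
    k_xy = gaussian_kernel(x, y, sigma)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


# Synthetic replicates: batch_b equals batch_a's distribution plus a
# hypothetical additive batch effect of 0.8 in every dimension.
rng = np.random.default_rng(0)
batch_a = rng.normal(size=(200, 5))
batch_b = rng.normal(size=(200, 5)) + 0.8

mmd_before = mmd_squared(batch_a, batch_b)

# Toy calibration of the form x -> x + delta. Here delta is just the
# difference of batch means; a residual network would learn delta(x) instead.
delta = batch_b.mean(axis=0) - batch_a.mean(axis=0)
calibrated_a = batch_a + delta

mmd_after = mmd_squared(calibrated_a, batch_b)
print(f"squared MMD before: {mmd_before:.4f}, after: {mmd_after:.4f}")
```

The map x -> x + delta mirrors the shape of a residual block, which starts near the identity and learns only a correction. In this sketch the correction is a constant vector, so it can remove only a pure mean shift.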

Original language: English
Pages (from-to): 2539-2546
Number of pages: 8
Journal: Bioinformatics
Volume: 33
Issue number: 16
State: Published - 15 Aug 2017
Externally published: Yes

Bibliographical note

Publisher Copyright:
© The Author 2017.

Funding

This research was partially funded by NIH grant 1R01HG008383-01A1 (Y.K.).

Funder: Funder number
National Institutes of Health: 1R01HG008383-01A1
National Institute of Allergy and Infectious Diseases: U19AI089992
