Abstract
Motivation: Sources of variability in experimentally derived data include measurement error in addition to the physical phenomena of interest. This measurement error is a combination of systematic components, originating from the measuring instrument, and random measurement errors. Several novel biological technologies, such as mass cytometry and single-cell RNA-seq (scRNA-seq), are plagued with systematic errors that may severely affect statistical analysis if the data are not properly calibrated.

Results: We propose a novel deep learning approach for removing systematic batch effects. Our method is based on a residual neural network, trained to minimize the Maximum Mean Discrepancy between the multivariate distributions of two replicates, measured in different batches. We apply our method to mass cytometry and scRNA-seq datasets, and demonstrate that it effectively attenuates batch effects.
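To make the training objective concrete, the following is a minimal sketch of a Maximum Mean Discrepancy estimate between two samples. It is an illustration under stated assumptions, not the authors' implementation: the single Gaussian kernel, the bandwidth `sigma`, the biased estimator, and the names `gaussian_kernel`, `mmd_squared`, `batch_a`, `batch_b_shifted` and `batch_b_same` are all hypothetical choices, and the synthetic data stand in for real replicates.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances via the expansion |u - v|^2.
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma**2))


def mmd_squared(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between samples x and y.

    Assumption: one Gaussian kernel with a fixed bandwidth; this is an
    illustrative choice, not necessarily the kernel used in the paper.
    """
    k_xx = gaussian_kernel(x, x, sigma)
    k_yy = gaussian_kernel(y, y, sigma)
    k_xy = gaussian_kernel(x, y, sigma)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


# Synthetic stand-ins for two replicates measured in different batches.
rng = np.random.default_rng(0)
batch_a = rng.normal(0.0, 1.0, size=(200, 5))
batch_b_shifted = rng.normal(0.5, 1.0, size=(200, 5))  # simulated batch shift
batch_b_same = rng.normal(0.0, 1.0, size=(200, 5))     # no batch shift

mmd_shifted = mmd_squared(batch_a, batch_b_shifted, sigma=2.0)
mmd_same = mmd_squared(batch_a, batch_b_same, sigma=2.0)
print(f"shifted: {mmd_shifted:.4f}  same: {mmd_same:.4f}")
```

In this sketch the estimate is larger for the shifted batch than for the unshifted one. A calibration network of the kind described above would be trained so that its output for one batch yields a small value of this quantity against the other batch.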
Original language | English |
---|---|
Pages (from-to) | 2539-2546 |
Number of pages | 8 |
Journal | Bioinformatics |
Volume | 33 |
Issue number | 16 |
State | Published - 15 Aug 2017 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © The Author 2017.
Funding
This research was partially funded by NIH grant 1R01HG008383-01A1 (Y.K.).
Funders | Funder number |
---|---|
National Institutes of Health | 1R01HG008383-01A1 |
National Institute of Allergy and Infectious Diseases | U19AI089992 |