Quantifying selection in high-throughput Immunoglobulin sequencing data sets

Gur Yaari, Mohamed Uduman, Steven H. Kleinstein

Research output: Contribution to journal › Article › peer-review

154 Scopus citations

Abstract

High-throughput immunoglobulin sequencing promises new insights into the somatic hypermutation and antigen-driven selection processes that underlie B-cell affinity maturation and adaptive immunity. The ability to estimate positive and negative selection from these sequence data has broad applications not only for understanding the immune response to pathogens, but is also critical to determining the role of somatic hypermutation in autoimmunity and B-cell cancers. Here, we develop a statistical framework for Bayesian estimation of Antigen-driven SELectIoN (BASELINe) based on the analysis of somatic mutation patterns. Our approach represents a fundamental advance over previous methods by shifting the problem from one of simply detecting selection to one of quantifying selection. Along with providing a more intuitive means to assess and visualize selection, our approach allows, for the first time, comparative analysis between groups of sequences derived from different germline V(D)J segments. Application of this approach to next-generation sequencing data demonstrates different selection pressures for memory cells of different isotypes. This framework can easily be adapted to analyze other types of DNA mutation patterns resulting from a mutator that displays hot/cold-spots, substitution preference or other intrinsic biases.

Original language: English
Pages (from-to): e134
Journal: Nucleic Acids Research
Volume: 40
Issue number: 17
State: Published - 1 Sep 2012
Externally published: Yes

Bibliographical note

Funding Information:
National Institutes of Health (NIH) [R03AI092379-01 to S.H.K.]; Yale University Biomedical High Performance Computing Center (NIH) [RR19895]. Funding for open access charge: NIH [R03AI092379-01].

Funding

Funders                                  Funder number
National Institutes of Health            R03AI092379-01
National Center for Research Resources   S10RR019895
