Practical guidelines for B-cell receptor repertoire sequencing analysis

Gur Yaari, Steven H. Kleinstein

Research output: Contribution to journal › Review article › peer-review

157 Scopus citations


High-throughput sequencing of B-cell immunoglobulin repertoires is increasingly being applied to gain insights into the adaptive immune response in healthy individuals and in those with a wide range of diseases. Recent applications include the study of autoimmunity, infection, allergy, cancer and aging. As sequencing technologies continue to improve, these repertoire sequencing experiments are producing ever larger datasets, with tens to hundreds of millions of sequences. These data require specialized bioinformatics pipelines to be analyzed effectively. Numerous methods and tools have been developed to handle different steps of the analysis, and integrated software suites have recently been made available. However, the field has yet to converge on a standard pipeline for data processing and analysis. Common file formats for data sharing are also lacking. Here we provide a set of practical guidelines for B-cell receptor repertoire sequencing analysis, starting from raw sequencing reads and proceeding through pre-processing, determination of population structure, and analysis of repertoire properties. These include methods for unique molecular identifiers and sequencing error correction, V(D)J assignment and detection of novel alleles, clonal assignment, lineage tree construction, somatic hypermutation modeling, selection analysis, and analysis of stereotyped or convergent responses. The guidelines presented here highlight the major steps involved in the analysis of B-cell repertoire sequencing data, along with recommendations on how to avoid common pitfalls.

Original language: English
Article number: 121
Journal: Genome Medicine
Issue number: 1
State: Published - 20 Nov 2015

Bibliographical note

Publisher Copyright:
© 2015 Yaari and Kleinstein.

