A BALB/c IGHV Reference Set, Defined by Haplotype Analysis of Long-Read VDJ-C Sequences From F1 (BALB/c x C57BL/6) Mice

Katherine J.L. Jackson, Justin T. Kos, William Lees, William S. Gibson, Melissa Laird Smith, Ayelet Peres, Gur Yaari, Martin Corcoran, Christian E. Busse, Mats Ohlin, Corey T. Watson, Andrew M. Collins

Research output: Contribution to journal, Article, peer-review

The immunoglobulin genes of inbred mouse strains that are commonly used in models of antibody-mediated human diseases are poorly characterized. This compromises data analysis. To infer the immunoglobulin genes of BALB/c mice, we used long-read SMRT sequencing to amplify VDJ-C sequences from F1 (BALB/c x C57BL/6) hybrid animals. Strain variations were identified in the Ighm and Ighg2b genes, and analysis of VDJ rearrangements led to the inference of 278 germline IGHV alleles. 169 alleles are not present in the C57BL/6 genome reference sequence. To establish a set of expressed BALB/c IGHV germline gene sequences, we computationally retrieved IGHV haplotypes from the IgM dataset. Haplotyping led to the confirmation of 162 BALB/c IGHV gene sequences. A musIGHV398 pseudogene variant also appears to be present in the BALB/cByJ substrain, while a functional musIGHV398 gene is highly expressed in the BALB/cJ substrain. Only four of the BALB/c alleles were also observed in the C57BL/6 haplotype. The full set of inferred BALB/c sequences has been used to establish a BALB/c IGHV reference set, hosted at https://ogrdb.airr-community.org. We assessed whether assemblies from the Mouse Genome Project (MGP) are suitable for the determination of the genes of the IGH loci. Only 37 (43.5%) of the 85 confirmed IMGT-named BALB/c IGHV and 33 (42.9%) of the 77 confirmed non-IMGT IGHV were found in a search of the MGP BALB/cJ genome assembly. This suggests that current MGP assemblies are unsuitable for the comprehensive documentation of germline IGHVs and more efforts will be needed to establish strain-specific reference sets.

Original language: English
Article number: 888555
Journal: Frontiers in Immunology
State: Published - 3 Jun 2022

Bibliographical note

Funding Information:
MO was supported by a grant from the Swedish Research Council (grant number 2019-01042). GY and AP were supported by a grant from the Israel Science Foundation (grant number 2940/21). MC was funded by the Swedish Research Council, grant No. 532 2017-00968. WL and GY were also supported by funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No 825821.

Publisher Copyright:
Copyright © 2022 Jackson, Kos, Lees, Gibson, Smith, Peres, Yaari, Corcoran, Busse, Ohlin, Watson and Collins.


  • BALB/c
  • IGHV
  • SMRT sequencing
  • haplotyping
  • substrains

