Slicing and Dicing the Genome: A Statistical Physics Approach to Population Genetics

Yosef E. Maruvka, Nadav M. Shnerb, Sorin Solomon, Gur Yaari, David A. Kessler

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

The inference of past demographic parameters from current genetic polymorphism is a fundamental problem in population genetics. The standard techniques utilize a reconstruction of the gene genealogy, a cumbersome process that may be applied only to small numbers of sequences. We present a method that compares the total number of haplotypes (distinct sequences) with the model prediction. By chopping the DNA sequence into pieces, we condense the immense information hidden in sequence space into a function for the number of haplotypes versus subsequence size. The details of this curve are robust to statistical fluctuations and are seen to reflect the process parameters. This procedure allows for a clear visualization of the quality of the fit and, crucially, the numerical complexity grows only linearly with the number of sequences. Our procedure is tested against both simulated data and empirical mtDNA data from China, and provides excellent fits in both cases.
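
As a rough illustration of the "haplotypes versus subsequence size" curve described above, the following Python sketch counts distinct subsequences in aligned windows of a given length. It is not the authors' implementation: the function name `haplotype_curve`, the use of non-overlapping windows, and the averaging over windows are all assumptions made for this example, and the toy sequences are invented. Because each window is processed with a hash set, the work per window length grows linearly with the number of sequences.

```python
def haplotype_curve(sequences, lengths):
    """Mean number of distinct haplotypes per window, for each window length.

    Assumptions (not taken from the paper): sequences are pre-aligned strings
    of equal length, windows are non-overlapping, and the curve value is the
    plain average of the distinct-haplotype count over all full windows.
    """
    if not sequences:
        return {}
    total = len(sequences[0])
    if any(len(s) != total for s in sequences):
        raise ValueError("sequences must be aligned to equal length")

    curve = {}
    for ell in lengths:
        if not 0 < ell <= total:
            raise ValueError(f"window length {ell} is out of range")
        counts = []
        # Chop every sequence at the same positions; a trailing partial
        # window (if any) is skipped.
        for start in range(0, total - ell + 1, ell):
            # A set of substrings holds the distinct haplotypes in this window.
            window_haplotypes = {s[start:start + ell] for s in sequences}
            counts.append(len(window_haplotypes))
        curve[ell] = sum(counts) / len(counts)
    return curve


# Toy data, invented for illustration only.
toy = ["AAAA", "AAAT", "ATAT", "AAAA"]
curve = haplotype_curve(toy, [1, 2, 4])
```

In an actual analysis, a curve of this kind would be compared with the model prediction for candidate demographic parameters; that fitting step is not sketched here.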

Original language: English
Pages (from-to): 1302-1316
Number of pages: 15
Journal: Journal of Statistical Physics
Volume: 142
Issue number: 6
Early online date: 12 Jan 2011
State: Published - Apr 2011

Bibliographical note

Funding Information:
Acknowledgement: This work was supported by the EU 6th framework CO3 pathfinder. NMS and YM acknowledge many useful discussions with John Wakeley on scalable approaches to population genetics.

Keywords

  • Galton-Watson theory
  • Haplotype statistics
  • Population genetics
