Abstract
We propose a novel and simple approach to elucidate genomic patterns of divergence using principal component analysis (PCA). We applied this methodology to the metric space generated by M. musculus genome-wide SNPs. Distance profiles were computed between M. musculus and the closely related species M. spretus, which was used as an external reference. While the speciation dynamics were apparent in the first principal component, differentiation within M. musculus gave rise to three minor components. We were unable to obtain a clear divergence signature discriminating laboratory strains, suggesting a stronger effect of genetic drift. These results were at odds with those for wild strains, which exhibit defined deterministic signals of divergence. Finally, we were able to rank novel and previously known genes according to their likelihood of being under selective pressure. In conclusion, we posit PCA as a robust methodology to unravel diverging DNA regions without any a priori forcing.
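As a rough illustration of the general idea only, the following Python sketch runs PCA on a matrix of distance profiles. It is not the authors' pipeline: the data are synthetic, and the matrix layout (genomic windows as rows, strains as columns), the function name `pca`, and the ranking of windows by their score on a minor component are all assumptions made for this example.

```python
import numpy as np

def pca(profiles, n_components=4):
    """PCA via SVD of the column-centered matrix.

    profiles: (n_windows, n_strains) array; entry [i, j] is a hypothetical
    distance between strain j and the external reference in window i.
    Returns (scores, loadings, explained_variance_ratio).
    """
    X = np.asarray(profiles, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each strain's profile
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, len(s))
    scores = U[:, :k] * s[:k]                # window coordinates on each PC
    variances = s ** 2
    ratio = variances[:k] / variances.sum()  # share of variance per PC
    return scores, Vt[:k], ratio

# Synthetic stand-in data (assumption): one signal shared by all strains,
# mimicking divergence from the reference, plus small strain-specific noise.
rng = np.random.default_rng(0)
n_windows, n_strains = 500, 6
shared = rng.normal(0.0, 1.0, size=(n_windows, 1))
noise = rng.normal(0.0, 0.2, size=(n_windows, n_strains))
profiles = 5.0 + shared + noise

scores, loadings, ratio = pca(profiles)

# Illustrative ranking (assumption): windows with the largest absolute
# score on the second component are listed first.
top_windows = np.argsort(-np.abs(scores[:, 1]))[:10]
```

In this toy setup the shared signal dominates the first component, while the remaining components capture strain-specific variation. Real SNP-derived distance profiles would need their own preprocessing before a comparable decomposition could be interpreted.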
Original language | English |
---|---|
Pages (from-to) | 611-622 |
Number of pages | 12 |
Journal | Evolutionary Bioinformatics |
Volume | 2012 |
Issue number | 8 |
State | Published - 2012 |
Externally published | Yes |
Keywords
- Adaptive evolution
- Genetic drift
- Multi-scale modeling
- Principal component analysis
- Reproductive isolation
- Speciation