Abstract
Motivation: High-density single nucleotide polymorphism (SNP) genotyping arrays are efficient and cost-effective platforms for the detection of copy number variation (CNV). To ensure accuracy in probe synthesis and to minimize production costs, short oligonucleotide probe sequences are used. The use of short probe sequences limits the specificity of binding targets in the human genome. The specificity of these short probeset sequences has yet to be fully analysed against a normal reference human genome. Sequence similarity can artificially elevate or suppress copy number measurements, and hence reduce the reliability of affected probe readings. For the purpose of detecting narrow CNVs reliably down to the width of a single probeset, sequence similarity is an important issue that needs to be addressed.
Results: We surveyed the Affymetrix Human Mapping SNP arrays for probeset sequence similarity against the reference human genome. Utilizing the sequence similarity results, we identified a collection of fine-scale putative CNVs between genders from autosomal probesets whose sequences match various loci on the sex chromosomes. To detect these variations, we utilized our statistical approach, Detecting REcurrent Copy number change using rank-order Statistics (DRECS), and showed that it was superior to, and more stable than, the t-test in detecting CNVs. By applying DRECS to the HapMap population datasets with multi-matching probesets filtered out, we identified biologically relevant SNPs in aberrant regions across populations, covered by the span of a single probe, with known associations to physical traits such as height. This provided empirical confirmation of the existence of naturally occurring narrow CNVs, as well as of the sensitivity of the Affymetrix SNP array technology in detecting them.
Availability: The MATLAB implementation of DRECS is available at http://ww2.cs.mu.oz.au/~gwong/DRECS/index.html.
Contact: [email protected].
Supplementary information: Supplementary information is available at Bioinformatics online.
Original language | English |
---|---|
Article number | btq088 |
Pages (from-to) | 1007-1014 |
Number of pages | 8 |
Journal | Bioinformatics |
Volume | 26 |
Issue number | 8 |
DOIs | |
State | Published - 15 Apr 2010 |
Externally published | Yes |
Bibliographical note
Funding Information: This project is partially supported by NICTA. NICTA is funded by the Australian Government through the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program.
Funding
Funders | Funder number |
---|---|
National ICT Australia | |
Australian Research Council | |
Department of Broadband, Communications and the Digital Economy, Australian Government | |