A system for exact and approximate genetic linkage analysis of SNP data in large pedigrees

  • Mark Silberstein
  • Omer Weissbrod
  • Lars Otten
  • Anna Tzemach
  • Andrei Anisenia
  • Oren Shtark
  • Dvir Tuberg
  • Eddie Galfrin
  • Irena Gannon
  • Adel Shalata
  • Zvi U. Borochowitz
  • Rina Dechter
  • Elizabeth Thompson
  • Dan Geiger
  • Technion-Israel Institute of Technology
  • University of Texas at Austin
  • University of California at Irvine
  • University of Ottawa
  • Bnai-Zion Medical Center
  • The Galilee Society
  • Holy Family Hospital
  • University of Washington

Research output: Contribution to journal, Article, peer-review

46 Scopus citations

Abstract

Motivation: The use of dense single nucleotide polymorphism (SNP) data in genetic linkage analysis of large pedigrees is impeded by significant technical, methodological and computational challenges. Here we describe Superlink-Online SNP, a powerful new online system that streamlines the linkage analysis of SNP data. It features a fully integrated, flexible processing workflow comprising both well-known and novel data analysis tools, including SNP clustering, erroneous data filtering, exact and approximate LOD calculations and maximum-likelihood haplotyping. The system draws its power from thousands of CPUs, performing data analysis tasks orders of magnitude faster than a single computer. By providing an intuitive interface to sophisticated state-of-the-art analysis tools coupled with high computing capacity, Superlink-Online SNP helps geneticists unleash the potential of SNP data for detecting disease genes.

Results: Computations performed by Superlink-Online SNP are automatically parallelized using novel paradigms, and executed on an unlimited number of private or public CPUs. One novel service is large-scale approximate Markov chain Monte Carlo (MCMC) analysis. The accuracy of the results is reliably estimated by running the same computation on multiple CPUs and evaluating the Gelman-Rubin score to set aside unreliable results. Another service within the workflow is a novel parallelized exact algorithm for inferring maximum-likelihood haplotypes. The reported system enables genetic analyses that were previously infeasible. We demonstrate the system's capabilities through a study of a large complex pedigree affected with metabolic syndrome.
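
As a rough illustration of the convergence check mentioned in the Results, the sketch below computes the standard Gelman-Rubin statistic (potential scale reduction factor) from several independent MCMC chains estimating the same quantity. It is not the Superlink-Online SNP implementation: the function names `gelman_rubin` and `is_reliable`, and the cutoff of 1.1, are assumptions chosen for the example, and the abstract does not state which threshold the system uses.

```python
from math import sqrt
from statistics import mean, variance


def gelman_rubin(chains):
    """Return the Gelman-Rubin statistic for equal-length chains.

    `chains` is a list of m sequences, each holding n samples of the
    same scalar quantity from an independent MCMC run.
    """
    m = len(chains)
    n = len(chains[0]) if m else 0
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")

    chain_means = [mean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = mean([variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * variance(chain_means)
    if within == 0:
        raise ValueError("within-chain variance is zero")

    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return sqrt(pooled / within)


def is_reliable(chains, threshold=1.1):
    """Hypothetical filter: accept a result only if the statistic is
    below `threshold` (1.1 is a common rule of thumb, assumed here)."""
    return gelman_rubin(chains) < threshold
```

Values close to 1 suggest the chains agree with one another; values well above 1 suggest the runs have not converged, so the corresponding result would be set aside.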

Original language: English
Pages (from-to): 197-205
Number of pages: 9
Journal: Bioinformatics
Volume: 29
Issue number: 2
State: Published - 15 Jan 2013
Externally published: Yes

Bibliographical note

Funding Information:
Funding: This work was supported by the National Institutes of Health [5R01HG004175-03] (to D.G., R.D. and E.T.), the Israeli Science Foundation (to D.G.) and the Israeli Ministry of Science and Technology [3-8095] (to A.S. and Z.B.).

Funding

Funders and funder numbers:
  • National Institutes of Health: 5R01HG004175-03
  • National Institute of General Medical Sciences: R37GM046255
  • Israel Science Foundation
  • Ministry of Science and Technology, Israel: 3-8095
