Abstract
The popular Genome-wide Complex Trait Analysis (GCTA) software uses random-effects models to estimate narrow-sense heritability from GWAS data on unrelated individuals, without knowing or identifying the causal loci. Many methods have since extended this approach to various situations. However, because the proportion of causal loci among the variants is typically very small and GCTA uses all variants to calculate the similarities among individuals, the heritability estimate may be unstable, with a large variance. Moreover, if the causal SNPs are not genotyped, GCTA sometimes greatly underestimates the true heritability. We present a novel narrow-sense heritability estimator, named HERRA, that uses well-developed ultra-high-dimensional machine-learning methods and, like other existing methods, is applicable to continuous or dichotomous outcomes. Additionally, HERRA is applicable to time-to-event or age-at-onset outcomes, which, to our knowledge, no existing method can handle. Compared to GCTA and LDAK for continuous and binary outcomes, HERRA often has a smaller variance, and when causal SNPs are not genotyped, HERRA has a much smaller empirical bias. We applied GCTA, LDAK and HERRA to a large colorectal cancer dataset with a dichotomous outcome (4,312 cases and 4,356 controls, genotyped using Illumina 300K); the respective heritability estimates were 0.068 (SE = 0.017), 0.072 (SE = 0.021) and 0.110 (SE = 5.19 × 10⁻³). HERRA thus yields an increase of more than 50% in the heritability estimate compared to GCTA or LDAK.
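The "over 50%" comparison can be checked directly from the three point estimates quoted in the abstract. The short sketch below is only an illustration of that arithmetic; the function name `relative_increase` and the dictionary layout are assumptions made here, not part of HERRA, GCTA or LDAK.

```python
# Heritability point estimates as quoted in the abstract (not recomputed here).
estimates = {"GCTA": 0.068, "LDAK": 0.072, "HERRA": 0.110}


def relative_increase(new: float, old: float) -> float:
    """Return the relative increase of `new` over `old` as a fraction."""
    return (new - old) / old


for method in ("GCTA", "LDAK"):
    gain = relative_increase(estimates["HERRA"], estimates[method])
    print(f"HERRA vs {method}: {gain:.1%}")
# prints:
# HERRA vs GCTA: 61.8%
# HERRA vs LDAK: 52.8%
```

Both relative increases exceed 50%, consistent with the statement in the abstract. The calculation uses point estimates only and does not account for the reported standard errors.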
Original language | English |
---|---|
Article number | e0181269 |
Journal | PLoS ONE |
Volume | 12 |
Issue number | 8 |
DOIs | |
State | Published - Aug 2017 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2017 Public Library of Science. All rights reserved. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding
This work was supported by National Institutes of Health: R01 CA189532, Drs. Li Hsu and Malka Gorfine; National Institutes of Health: R01 CA60987, Dr. Loic Le Marchand; German Research Council, Deutsche Forschungsgemeinschaft: BR 1704/6-1, BR 1704/6-3, BR 1704/6-4 and CH 117/1-1; German Federal Ministry of Education and Research: 01KH0404 and 01ER0814; National Institutes of Health: R01 CA48998, Dr. Martha L. Slattery; P01 CA033619 and R01 CA63464; National Institutes of Health (NIH), Genes, Environment and Health Initiative (GEI): Z01 CP 010200, NIH U01 HG004446, and NIH GEI U01 HG 004438; National Institutes of Health: K05 CA154337; National Heart, Lung, and Blood Institute, National Institutes of Health: HHSN268201100046C, HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C, and HHSN271201100004C; and National Cancer Institute: U01 CA137088, U01 CA164930, U01 CA185094, GECCO.
Funders | Funder number |
---|---|
GECCO | |
National Cancer Institute | U01 CA137088, U01 CA164930, U01 CA185094 |
National Institutes of Health | R01 CA60987, R01 CA189532, R01 CA48998, P01 CA033619, R01 CA63464, Z01 CP 010200, U01 HG004446, GEI U01 HG 004438, K05 CA154337 |
National Heart, Lung, and Blood Institute | HHSN268201100004C, HHSN271201100004C |
NIH Office of the Director | S10OD020069 |
Deutsche Forschungsgemeinschaft | BR 1704/6-3, BR 1704/6-1, CH 117/1-1, BR 1704/6-4 |
Bundesministerium für Bildung und Forschung | 01KH0404, 01ER0814 |