Abstract
SNP heritability, the proportion of phenotypic variance explained by SNPs, has been reported for many hundreds of traits. Its estimation requires strong prior assumptions about the distribution of heritability across the genome, but current assumptions have not been thoroughly tested. By analyzing imputed data for a large number of human traits, we empirically derive a model that more accurately describes how heritability varies with minor allele frequency (MAF), linkage disequilibrium (LD) and genotype certainty. Across 19 traits, our improved model leads to estimates of common SNP heritability on average 43% (s.d. 3%) higher than those obtained from the widely used software GCTA and 25% (s.d. 2%) higher than those from the recently proposed extension GCTA-LDMS. Previously, DNase I hypersensitivity sites were reported to explain 79% of SNP heritability; using our improved heritability model, their estimated contribution is only 24%.
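To illustrate what a heritability model of this kind looks like, the sketch below computes each SNP's expected share of heritability from its MAF, an LD-based weight and a genotype-certainty score. The abstract does not give the functional form, so everything here is an assumption made for illustration: the form [f(1 − f)]^(1 + alpha) × w × r, the function name `expected_heritability_shares`, the parameter names, and the example values. It is not the authors' implementation.

```python
from typing import List, Sequence


def expected_heritability_shares(
    maf: Sequence[float],
    ld_weight: Sequence[float],
    info: Sequence[float],
    alpha: float,
) -> List[float]:
    """Return each SNP's expected share of total SNP heritability.

    Assumed (hypothetical) form: share_j is proportional to
    [f_j * (1 - f_j)] ** (1 + alpha) * w_j * r_j, where f_j is the MAF,
    w_j an LD-based weight and r_j a genotype-certainty score in [0, 1].
    """
    if not (len(maf) == len(ld_weight) == len(info)):
        raise ValueError("maf, ld_weight and info must have the same length")

    raw = [
        (f * (1.0 - f)) ** (1.0 + alpha) * w * r
        for f, w, r in zip(maf, ld_weight, info)
    ]
    total = sum(raw)
    if total <= 0.0:
        raise ValueError("at least one SNP must have a positive contribution")

    # Normalize so that the shares sum to one.
    return [x / total for x in raw]


# Illustrative inputs only; these are not values from the paper.
maf = [0.05, 0.20, 0.50]
unit = [1.0, 1.0, 1.0]

# alpha = -1 with unit weights: every SNP gets the same expected share.
uniform = expected_heritability_shares(maf, unit, unit, alpha=-1.0)

# alpha closer to 0: expected share grows with f * (1 - f).
maf_dependent = expected_heritability_shares(maf, unit, unit, alpha=-0.25)
```

Under these assumptions, changing `alpha` or the weights redistributes expected heritability across SNPs without changing the data, which is why the choice of model can move the resulting estimates.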
Acknowledgements
Access to Wellcome Trust Case Control Consortium data was authorized as work related to the project “Genome-wide association study of susceptibility and clinical phenotypes in epilepsy,” while access to Children's Hospital of Philadelphia (CHOP) data was granted under Project 49228-1, “Assumptions underlying estimates of SNP heritability.” We thank A. Molloy, J. Mills and L. Brody for permission to use genotype data from the Trinity College Dublin Student Study and S. Langley for help accessing the CHOP data. This work is funded by the UK Medical Research Council under grant MR/L012561/1 (awarded to D.S.) and the British Heart Foundation under grant RG/10/12/28456 (the UCLEB Consortium) and is supported by researchers at the National Institute for Health Research (NIHR) University College London Hospitals Biomedical Research Centre. N.C. is an ESPOD Fellow from the European Molecular Biology Laboratory, European Bioinformatics Institute, and Wellcome Trust Sanger Institute. M.R.J. receives funding from the Imperial College NIHR Biomedical Research Centre (BRC) Scheme. S.N. is a Wellcome Trust Senior Research Fellow in Basic Biomedical Science and is also supported by the NIHR Cambridge Biomedical Research Centre. Analyses were performed with the use of the UCL Computer Science Cluster and the help of the CS Technical Support Group, as well as the use of the UCL Legion High-Performance Computing Facility (Legion@UCL) and associated support services.
Author information
Contributions
D.S. and N.C. performed the analyses. D.S. and D.J.B. wrote the manuscript with assistance from N.C., M.R.J., S.N. and members of the UCLEB Consortium.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Additional information
A full list of members and affiliations appears in the Supplementary Note.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–25 and Supplementary Tables 1–12 (PDF 3715 kb)
About this article
Cite this article
Speed, D., Cai, N., the UCLEB Consortium. et al. Reevaluation of SNP heritability in complex human traits. Nat Genet 49, 986–992 (2017). https://doi.org/10.1038/ng.3865
DOI: https://doi.org/10.1038/ng.3865