Abstract
The consensus approach to genome-wide association studies (GWAS) has been to assign equal prior probability of association to all sequence variants tested. However, some sequence variants, such as loss-of-function and missense variants, are more likely than others to affect protein function and are therefore more likely to be causative. Using data from whole-genome sequencing of 2,636 Icelanders and the association results for 96 quantitative and 123 binary phenotypes, we estimated the enrichment of association signals by sequence annotation. We propose a weighted Bonferroni adjustment that controls the family-wise error rate (FWER), using as weights the enrichment of sequence annotations among association signals. We show that this weighted adjustment increases the power to detect association over the standard Bonferroni correction. We use the enrichment estimates obtained in Iceland to derive significance thresholds for other populations with different numbers and combinations of sequence variants.
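The weighted adjustment described in the abstract can be illustrated with a minimal sketch. It assumes the generic form of a weighted Bonferroni correction: each annotation class receives a share of the FWER budget proportional to its enrichment, so the per-variant thresholds summed over all tested variants still equal alpha. The function name, the class labels, the variant counts and the enrichment values below are hypothetical placeholders chosen for illustration; they are not the estimates reported in the paper, and the authors' exact procedure may differ.

```python
from typing import Dict


def weighted_bonferroni_thresholds(
    counts: Dict[str, int],
    enrichment: Dict[str, float],
    alpha: float = 0.05,
) -> Dict[str, float]:
    """Return a per-variant P-value threshold for each annotation class.

    counts[c]     -- number of variants tested in class c
    enrichment[c] -- relative enrichment (weight) of class c
    The thresholds satisfy sum_c counts[c] * threshold[c] == alpha,
    which is the condition under which a weighted Bonferroni
    correction controls the family-wise error rate at level alpha.
    """
    if set(counts) != set(enrichment):
        raise ValueError("counts and enrichment must cover the same classes")
    if any(n <= 0 for n in counts.values()):
        raise ValueError("variant counts must be positive")
    if any(e <= 0 for e in enrichment.values()):
        raise ValueError("enrichment weights must be positive")

    # Total weight across all tested variants (normalising constant).
    total_weight = sum(counts[c] * enrichment[c] for c in counts)
    return {c: alpha * enrichment[c] / total_weight for c in counts}


# Hypothetical inputs, NOT values from the paper.
counts = {"loss_of_function": 10_000, "missense": 200_000, "other": 30_000_000}
enrichment = {"loss_of_function": 100.0, "missense": 20.0, "other": 1.0}

thresholds = weighted_bonferroni_thresholds(counts, enrichment)

# The FWER budget actually spent: per-variant thresholds summed over variants.
budget = sum(counts[c] * thresholds[c] for c in counts)

# Standard Bonferroni threshold for comparison (equal weight for every variant).
standard = 0.05 / sum(counts.values())

for name, value in thresholds.items():
    print(f"{name}: {value:.3e}")
print(f"standard Bonferroni: {standard:.3e}")
```

In this sketch, classes with higher enrichment get a more lenient threshold than the standard Bonferroni value, while the large low-enrichment class gets a slightly stricter one. That reallocation of the error budget is the mechanism by which weighting can raise power without increasing the FWER.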
Acknowledgements
The authors thank all the participants in the study. We also thank the staff at the Patient Recruitment Center and the deCODE Genetics core facilities.
Author information
Contributions
The study was designed and results were interpreted by G.S., A.A., G.M., H.H., A.K., U.T., P.S., D.F.G. and K.S. G.S., A.A., F.Z., S.A.G., A.O., G.M., A.K., P.S. and D.F.G. performed the statistical and bioinformatics analyses. The manuscript was drafted by G.S., A.A., P.S., D.F.G. and K.S. All authors contributed to the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Tables 1–10 and Supplementary Note. (PDF 889 kb)
About this article
Cite this article
Sveinbjornsson, G., Albrechtsen, A., Zink, F. et al. Weighting sequence variants based on their annotation increases power of whole-genome association studies. Nat Genet 48, 314–317 (2016). https://doi.org/10.1038/ng.3507
DOI: https://doi.org/10.1038/ng.3507