Key Points
- A large proportion of genomic variation might be associated with human disease phenotypes. New approaches should improve our understanding of this variation and its functional significance.
- Because molecular epidemiologists have insufficient guidance on how to select variants for a study, methods that prioritize the choice of genetic variants need to be built into molecular epidemiological studies.
- Laboratory-based studies can provide the strongest evidence for the functional role of a genetic variant, but they are difficult to mount on the scale that may be required to characterize all human genetic variants, and their results might not always reflect in vivo genotype function in humans.
- Novel approaches to assessing the function of genetic variants are required so that molecular epidemiological association studies have the information needed to choose candidate genes and the variants within them, and to optimally interpret observed associations.
- Novel experimental approaches to assessing genotype function that might be informative include the HaploCHIP method, gene tagging, gene trapping, N-ethyl-N-nitrosourea (ENU) mutagenesis, proteomics methods and the evaluation of epigenetic mechanisms.
- Non-laboratory-based approaches for assessing SNP function should also be considered. These include approaches that use evolutionary similarity or the structural effects of genetic variants, such as those implemented in the SIFT, PolyPhen or CODDLE algorithms.
- Population and evolutionary genetics data can be directly incorporated into association studies to optimize the identification of genes that are causally related to disease risk. The 'set association' approach is one example of this method.
- We propose an algorithm that can be useful in determining when a genetic variant might be functionally significant. This algorithm can be applied in the design and interpretation of molecular epidemiological association studies to maximize the chance that truly causative associations are identified.
Abstract
Knowledge of inherited genetic variation has a fundamental impact on understanding human disease. Unfortunately, our understanding of the functional significance of many inherited genetic variants is limited. New approaches to assessing functional significance of inherited genetic variation, which combine molecular genetics, epidemiology and bioinformatics, promise to enhance reproducibility and plausibility of associations between genotypes and disease.
Acknowledgements
Some of the work discussed in this review was supported by grants from the Public Health Service and the University of Pennsylvania Cancer Center.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Glossary
- LINKAGE DISEQUILIBRIUM
The observation that two or more alleles, usually at loci that are physically close together on a chromosome, are not inherited independently but occur together more frequently than predicted under Mendel's law of independent assortment.
- NIFEDIPINE
A calcium-channel-blocker drug (also called Procardia) that was one of the first drugs recognized to be metabolized by CYP3A4, and for which a regulatory element specific to the CYP3A4 gene was named.
- MENARCHE
The first occurrence of menstruation in a woman.
- HARDY-WEINBERG PROPORTIONS
The binomial distribution of genotypes that results in a population when there are no external pressures that cause deviations from it; that is, the frequencies of genotypes AA, Aa and aa will be p², 2pq and q², respectively, where p is the frequency of allele A and q is the frequency of allele a.
- TEST STATISTIC
A quantity, usually computed from observed data, whose value is used to decide whether or not the null hypothesis should be rejected.
- RESTENOSIS
The renewed constriction, narrowing or blockage of a coronary artery after an initial treatment, such as angioplasty, that was aimed at removing the blockage.
- ANGIOPLASTY
An operation that is used to repair a damaged blood vessel or unblock a coronary artery.
- ADMIXTURE
The combination of two or more populations into a single group. Admixture has implications for studies of genotype–disease associations if the component populations have different genotypic distributions.
- CELL CYCLE CHECKPOINTS
Steps in the normal sequence of development and division in the cell. Disruption of these steps can lead to uncontrolled cell growth, and possibly cancer.
- ODDS RATIO
A measure of relative risk that is usually estimated from case–control studies: the odds of exposure (for example, carrying a particular genotype) among cases divided by the odds of exposure among controls.
- TYPE I ERROR
Incorrectly rejecting a null hypothesis when the null hypothesis is correct; that is, a false-positive result.
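As an illustration of the HARDY-WEINBERG PROPORTIONS, TEST STATISTIC, ODDS RATIO and TYPE I ERROR entries, the following is a minimal Python sketch rather than a method taken from the article. The function names, the genotype and case–control counts, and the 0.05 significance threshold are all assumptions chosen for illustration. The sketch computes expected genotype counts p², 2pq and q² from observed counts, a Pearson chi-square test statistic for departure from those proportions, and an odds ratio from a 2×2 table.

```python
from math import erfc, sqrt


def hardy_weinberg_expected(n_AA, n_Aa, n_aa):
    """Expected counts of genotypes AA, Aa, aa under Hardy-Weinberg proportions."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele a
    return (n * p * p, n * 2 * p * q, n * q * q)


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Pearson chi-square test statistic (1 degree of freedom) and its p-value.

    Assumes both alleles are present, so no expected count is zero.
    """
    observed = (n_AA, n_Aa, n_aa)
    expected = hardy_weinberg_expected(n_AA, n_Aa, n_aa)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For a chi-square variable with 1 d.f., P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


def odds_ratio(case_exposed, case_unexposed, control_exposed, control_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (case_exposed * control_unexposed) / (case_unexposed * control_exposed)


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, Aa, aa); not data from the article.
    stat, p_value = hwe_chi_square(30, 40, 30)
    alpha = 0.05  # assumed tolerated type I error rate
    print(f"chi-square = {stat:.2f}, p = {p_value:.4f}, reject HWE: {p_value < alpha}")

    # Hypothetical 2x2 table: variant carriers vs non-carriers in cases and controls.
    print(f"odds ratio = {odds_ratio(20, 80, 10, 90):.2f}")
```

Rejecting the null hypothesis whenever the p-value falls below `alpha` fixes the type I error rate at that level for a single test; a study that tests many variants would need a correspondingly stricter threshold.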
Cite this article
Rebbeck, T., Spitz, M. & Wu, X. Assessing the function of genetic variants in candidate gene association studies. Nat Rev Genet 5, 589–597 (2004). https://doi.org/10.1038/nrg1403