Main

Coronary artery disease (CAD) is a major health problem in industrialized countries. Some risk factors for CAD such as insulin resistance and type II diabetes mellitus (T2DM) are also increasing even among pediatric populations primarily because of changes in lifestyle including a high caloric intake and low physical activity.

Many genome-wide association studies (GWASs) have been used to identify genetic risk factors for markers associated with selected CAD risk factors (1,2). Typically, GWAS has used dichotomous traits (affected versus nonaffected controls) to identify new loci for CAD risk factors such as glucose levels (insulin resistance) (1,2), T2DM (1), obesity (BMI) (3), blood pressure (4), and serum lipids (5).

A consistent finding of all these GWASs of individual CAD risk factors and associated phenotypes has been the relatively weak contribution of any specific locus to the relative risk. Dichotomous studies of disease risk for T2DM, for example, have typically found only a 1.1-fold increase in relative risk for any specific genetic marker. This low effect size contrasts markedly with the results of twin studies for each of the biochemical, physiological, and anthropometric measures associated with the development of CAD risk factors such as T2DM (6). Moreover, the poor predictive power of genetic loci, either singly or in combination, has led some to conclude that clinical measures are in fact more prognostic than genetic predispositions (7). Similarly, quantitative (continuous) trait studies typically report that specific genetic components have an effect size of 1% or less. This has led to the hypothesis that traits associated with CAD risk factors are highly polygenic, with many loci each contributing a very small proportion of variation.

We hypothesized that study of younger populations could prove more valuable in identifying genetic contributions to CAD risk for several reasons. First, genetic markers associated with CAD risk factors could be highly valuable tools to identify individuals who might benefit most from early interventions. Second, young adults lack many of the comorbidities that complicate gene/phenotype associations. The covariance of the comorbidities in older subjects would serve to lessen the effect size of any specific genetic locus; thus, young adults may show much larger effect sizes. Finally, the study of young adults should uncover the genetic loci associated with onset of the disease, rather than later disease progression. There are also disadvantages to studying young adults. The variance of values is considerably less in young adults, with fewer subjects showing abnormal values of the marker, and this might be expected to reduce statistical power. However, we hypothesized that the reduced statistical power would be offset by larger effect sizes of strong genetic components.

In this study, we analyzed 20 GWAS-identified single-nucleotide polymorphisms (SNPs) in younger populations. The loci studied have been replicated in numerous studies, with seven loci associated with serum lipids (5,8,9), 10 associated with T2DM (1), and three associated with the development of CAD (8,10). The initial cohort studied was college-aged individuals (Functional Polymorphism Associated with Human Muscle Size and Strength; FAMUSS), and gene associations were queried for LDL (C), HDL (C), triglycerides (TG), glucose, homoeostasis model assessment (HOMA), and insulin. Validation of positive associations was performed in a cohort of sixth-grade volunteers (Cardiovascular Health Intervention Program; CHIP).

We show that the 1p13.3 LDL (C) locus was strongly associated with a CAD risk factor in both of these young populations. LDL (C) is a strong predictor of CAD (11), and this locus could be used to identify individuals who could benefit from early interventions to prevent CAD.

METHODS

Genotyping of the 20 GWAS-associated SNPs was performed in FAMUSS to identify loci associated with the following CAD risk factors: LDL (C), HDL (C), TG, glucose, HOMA, and insulin with and without adjustment for age and BMI. Significant loci were validated in the CHIP cohort.

FAMUSS cohort.

The FAMUSS study design is a population-based study of individuals (average age = 24 y) designed to identify genetic factors associated with skeletal muscle changes as a result of resistance training (12). We collected selected baseline CAD risk factors such as serum lipids [LDL (C), HDL (C), and TG], glucose, and insulin following an overnight fast (Quest Diagnostics, Chantilly, VA) in 548 White participants. The HOMA values were obtained using this calculation: (insulin × fasting glucose)/22.5 (13). The study was approved by the Children's National Medical Center (CNMC) Institutional Review Board (IRB) (protocol #2449) in compliance with the Helsinki Declaration. Informed consent was obtained from all subject participants.
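As an illustration of the HOMA calculation given above, the following minimal sketch applies the stated formula. It assumes insulin is expressed in μU/mL and fasting glucose in mmol/L (the units reported for these traits in the Results); the function name and the input values are hypothetical and are not study data.

```python
def homa(fasting_insulin_uU_per_mL: float, fasting_glucose_mmol_per_L: float) -> float:
    """HOMA = (insulin x fasting glucose) / 22.5, as stated in the text.

    Units are an assumption: insulin in uU/mL, glucose in mmol/L.
    """
    return fasting_insulin_uU_per_mL * fasting_glucose_mmol_per_L / 22.5


# Hypothetical subject (illustrative values only, not from FAMUSS or CHIP).
example_value = homa(10.0, 4.5)
print(round(example_value, 2))  # 2.0
```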

CHIP cohort.

CHIP is a population-based study of sixth graders (11.5 y of age) in Michigan that includes a screening component and health education program (manuscript in preparation). The screenings have included the following CAD risk factor assessments: physical activity, cardiovascular fitness, body composition, blood pressure, family history, fasted blood lipids [LDL (C), HDL (C), and TG], glucose, dietary assessments, grip strength, and a blood spot sample. Serum lipids and markers of glucose metabolism were available for 810 participants. The study was approved by the Central Michigan University in compliance with the Helsinki Declaration. Informed consent was obtained from all mothers and/or fathers of participants in the CHIP study. De-identified blood spots were sent to CNMC for genotyping, and genotyping was approved by the CNMC IRB.

The Study of Targeted Risk Reduction Intervention through Defined Exercise study (STRRIDE).

This study examines defined modes, doses, and intensities of exercise in a structured exercise program, with the goal to define the relationship between mode, amount and intensities of exercise, and improvements in risk parameters of cardiovascular health. The STRRIDE encompassed 555 middle-aged subjects with features of metabolic syndrome [25 < BMI < 35; fasting hyperinsulinemia (>10 IU/mL); and dyslipidemia with LDL (C) between 130 and 190 and/or HDL <45 for women, <35 for men]. Subjects were assigned to one of six groups addressing exercise mode, intensity, and duration. STRRIDE participants signed informed consent associated with IRB approval by Duke University Medical Center.

Genotyping methods.

Genomic DNA was isolated using Puregene kits (Gentra Systems, Minnesota) from peripheral blood (FAMUSS) or from blood spots using Qiagen kits (CHIP). Genotyping was done using TaqMan allele discrimination assays (14). Both alleles were detected simultaneously and automatically called (see Table 1).

Table 1 Assay IDs for SNPs genotyped in FAMUSS study

Statistical analyses.

All genotype data sets were tested for Hardy-Weinberg equilibrium using a χ² test to compare the observed genotype frequencies with those expected under equilibrium. Normality of each quantitative trait was tested using the Shapiro-Wilk normality test, and any trait not meeting the criteria was appropriately transformed.
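The Hardy-Weinberg comparison described above can be sketched as follows. This is a generic textbook version of the χ² test with one degree of freedom, not the authors' Stata code; the function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_common_hom: int, n_het: int, n_rare_hom: int):
    """Chi-square test of observed genotype counts against Hardy-Weinberg expectations.

    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    n = n_common_hom + n_het + n_rare_hom
    # Frequency of the common allele estimated from the genotype counts.
    p = (2 * n_common_hom + n_het) / (2 * n)
    q = 1.0 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_common_hom, n_het, n_rare_hom]
    # Skip classes with zero expected count (monomorphic markers).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts showing a deficit of heterozygotes (not study data).
chi2_stat, p_val = hwe_chi_square(400, 400, 200)
print(round(chi2_stat, 2))
```

A small p-value indicates departure from equilibrium, which would flag a marker for closer inspection of genotyping quality.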

Bivariate correlation analyses of each quantitative measurement showed several significant correlations with age and baseline weight; therefore, associations between each SNP and quantitative phenotype were assessed using analysis of covariance methods using both dominant genetic models and quantitative trait models.

Linear regression analysis, including likelihood ratio tests between full (containing genotype and covariates) and constrained (containing covariates only) models, was performed to estimate the proportion of variance in each quantitative measurement attributable to each SNP's genotype. All statistical analyses were performed using Stata version 8 (StataCorp, College Station, Texas).
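The full-versus-constrained comparison can be illustrated with a small, self-contained sketch. It is not the Stata analysis used in the study: it assumes a single covariate, a dominant genotype coding (1 = TT, 0 = CT/CC), Gaussian errors, and an asymptotic χ² reference distribution with one degree of freedom. All names and data values are hypothetical.

```python
import math


def ols_rss(columns, y):
    """Residual sum of squares from least squares of y on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((y[r] - sum(b * x for b, x in zip(beta, X[r]))) ** 2 for r in range(n))


def genotype_contribution(genotype, covariate, y):
    """Return (gain in R^2 from adding genotype, likelihood ratio statistic, p-value)."""
    n = len(y)
    mean_y = sum(y) / n
    tss = sum((v - mean_y) ** 2 for v in y)
    rss_constrained = ols_rss([covariate], y)        # covariate only
    rss_full = ols_rss([covariate, genotype], y)     # covariate + genotype
    r2_gain = (rss_constrained - rss_full) / tss
    # Gaussian likelihood ratio statistic for nested linear models.
    lr_stat = n * math.log(rss_constrained / rss_full)
    p_value = math.erfc(math.sqrt(lr_stat / 2.0))    # chi-square, 1 df
    return r2_gain, lr_stat, p_value


# Hypothetical toy data (not study data): dominant coding, BMI, and LDL values.
genotype = [1, 1, 1, 1, 0, 0, 0, 0]
bmi = [21.0, 24.0, 22.0, 26.0, 23.0, 25.0, 20.0, 27.0]
ldl = [105.0, 112.0, 101.0, 118.0, 92.0, 99.0, 88.0, 104.0]
r2_gain, lr_stat, p_value = genotype_contribution(genotype, bmi, ldl)
```

The gain in R² between the two models corresponds to the "percent variation attributable to genotype" reported in the tables, under the assumptions listed above.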

The CHIP cohort was used as a validation data set. Thus, only statistically significant associations surviving multiple testing thresholds seen in FAMUSS were subsequently tested in the CHIP cohort. Statistical significance threshold was established as p < 0.05, as this was a validation cohort with no multiple testing.

RESULTS

Analysis of selected CAD risk factor GWAS loci in young adults.

The FAMUSS cohort was used as the test cohort. Traits used for association studies were fasting glucose (mmol/L), HOMA, insulin (μU/mL), LDL (C) (mg/dL), HDL (C) (mg/dL), and TG (mg/dL) with and without adjustment for BMI (kg/m2). Average values for all subjects and stratified by sex are provided in Table 2.

Table 2 Sample characteristics for FAMUSS and CHIP population-based cohorts

The percentage of subjects showing abnormal levels of each CAD risk factor is provided in Table 3. BMI was >30 in 12.0% of all subjects (9.5% of females; 16.5% of males). A relatively large proportion of volunteers showed abnormal HDL (C) levels (81% of males; 29% of females). Females showed an overall lower risk for mean blood pressure (mm Hg) ≥100 (2.6% abnormal values in females; 14.5% in males) and for the diagnosis of metabolic syndrome (4.6% of females; 18.0% of males).

Table 3 Incidence of abnormal values for FAMUSS and CHIP

Loci previously reported to be associated with CAD, CAD risk factors, or glucose metabolism phenotypes were genotyped in FAMUSS (Tables 1 and 4). Resulting genotypes were tested for Hardy-Weinberg equilibrium; two SNPs showed divergence from expected genotype frequencies (rs4846914 and rs1333049). Twenty loci were tested against six phenotypes [fasting glucose, TG, HDL (C), LDL (C), fasting insulin, and HOMA] in FAMUSS. This required adjustment for multiple testing with a p-value threshold of <0.00042 for significance (adjusted for the testing of 20 SNPs and six phenotypes).
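The adjusted threshold quoted above is consistent with a Bonferroni-style correction of a nominal α of 0.05 across 20 SNPs × 6 phenotypes = 120 tests. The sketch below reproduces that arithmetic; the nominal α of 0.05 is an assumption inferred from the validation threshold given in the Methods, and the variable names are hypothetical.

```python
# Assumed nominal significance level (inferred, not stated for this step).
nominal_alpha = 0.05
n_snps = 20
n_phenotypes = 6

n_tests = n_snps * n_phenotypes            # 120 SNP-by-phenotype tests
adjusted_threshold = nominal_alpha / n_tests

print(n_tests)                             # 120
print(round(adjusted_threshold, 5))        # 0.00042

# The whole-cohort p-value reported for rs646776 and LDL (C) is 3.0e-6.
reported_p_whole_cohort = 3.0e-6
print(reported_p_whole_cohort < adjusted_threshold)  # True
```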

Table 4 SNPs genotyped in FAMUSS study

Nineteen of the 20 loci tested did not reach the adjusted p-value threshold of 0.00042 for any T2DM-related or lipid trait in the FAMUSS group. Given the lower average serum values in young adults compared with older adults, and lower variance, our data do not preclude the other 19 loci from having a phenotype effect in a larger population. We are underpowered to detect a <1% effect size published in larger adult studies.

A single locus met our threshold for significance, and this was the rs646776 locus at 1p13.3, containing the SORT1 gene with association for circulating LDL (C) levels (Table 5; Fig. 1A). There was a dose effect of the polymorphism, suggesting a quantitative or additive effect of the alleles; however, the numbers of homozygotes for the rare allele were relatively low (Fig. 1A). The association between rs646776 and LDL (C) was significant for the entire cohort (p = 3.0 × 10⁻⁶) and significant in both male (p = 0.009) and female (p = 8.8 × 10⁻⁵) subcohorts. Inclusion of age and/or BMI as covariates in the statistical model showed little change in statistical support for the association of rs646776 with LDL (C). The amount of variation in serum LDL (C) explained by the rs646776 genotype was 3.6% for the entire cohort (4.1% in females; 3.0% in males; Table 5).

Table 5 1p13.3 variant (rs646776) is associated with LDL (C) in FAMUSS cohort (24 yr olds)
Figure 1

1p13.3 variant (rs646776) shows an effect on LDL (C) in both FAMUSS and CHIP cohorts. Both sexes are shown for the FAMUSS study (1A) and the CHIP study (1B). In both population-based studies, individuals with the TT genotype had significantly higher LDL (C) values than the CT/CC genotypes. Figure 1A: *p = 0.000003; n = 345 for TT, n = 203 for CT/CC. Figure 1B: **p = 0.000008; n = 551 for TT, n = 259 for CT/CC.

LDL (C) levels have been shown to have a weak correlation with BMI in children and adolescents (ages 3–18 y) (15). Oxidized LDL (C) has a strong relationship with waist circumference and has been shown to have a strong relationship with LDL (C) levels (16–19). Our association data in young adults have shown that more of the variation in LDL (C) in 24 y olds was due to rs646776 than to an increase in BMI, which is supported by the findings of Lee et al. (15). To test this, regression models of LDL (C) were done with and without rs646776 genotype and BMI to determine the percent variation attributable to each factor. In both the FAMUSS and CHIP cohorts, more of the variation in LDL (C) can be attributed to rs646776 than BMI (3.6% versus 3.0% in FAMUSS; 2.5% versus 1.4% in CHIP). Although there are significant associations between LDL (C) and BMI in both cohorts (p = 0.03 in FAMUSS; p = 0.02 in CHIP), the associations between LDL (C) and rs646776 are statistically much more significant (p = 0.000003 in FAMUSS; p = 0.00008 in CHIP).

The association between rs646776 and LDL (C) is replicated in the 11.5-y-old CHIP cohort.

The CHIP cohort comprised 810 predominantly White sixth-grade students attending public school in Michigan. As expected, average values for selected CAD risk factors in the 11.5-y-old CHIP volunteers were healthier compared with the FAMUSS, given the well-accepted increase in risk factors with age (Table 2).

We then looked at the abnormal threshold levels for each phenotype, using age-adjusted criteria (Table 3) (20). In this population-based study, 44.6% of children met the criteria for an abnormally high BMI (41.8% for girls, 47.6% for boys). Forty percent showed abnormal total cholesterol, whereas 25% of the children showed abnormal values for HDL (C), LDL (C), and TG as individual laboratory phenotypes.

As this was a replication cohort, we tested only the association of rs646776 with serum LDL (C). There was a strong statistical association between rs646776 and LDL (C), similar to that observed in FAMUSS (Fig. 1B; Table 6). There was a similar dosage effect of the genotype, with homozygotes for the ancestral common allele showing the highest LDL (C) levels. The percentage of variation explained by genotype was 2.5% for the entire cohort (2.7% for females; 2.0% for males).

Table 6 Replication data for 1p13.3 variant (rs646776) shows association with LDL (C) in the CHIP cohort (11.5 yr olds)

The rs646776 locus at 1p13.3 shows a greater effect in females than males.

In both populations, females showed a larger percent variation attributable to SORT1 genotype (see Tables 5 and 6). In the FAMUSS study, the SORT1 locus contributed 4.1% to LDL (C) values in females while contributing 3.0% in males. We see a similar trend in the validation population, CHIP, where SORT1 contributed 2.7% to LDL (C) values in females and 2.0% in males.

The rs646776 locus at 1p13.3 shows a greater effect in young populations, compared with older, sicker populations.

The association between the rs646776 locus at 1p13.3 and LDL (C) has been studied in many older populations (5), some with overt T2DM (9). Although each study has shown a similar effect of the ancestral (common) allele associated with higher LDL (C), the effect size (proportion of variation explained by genotype) has been consistently quite small (1%). We hypothesized that the larger effect size seen in the younger populations studied here (e.g. 4.1% in 24-y-old females; 2.7% in 11.5-y-old females) could be explained by the relative absence of comorbidities that serve to obscure underlying genetic associations. To test this, we compared the association data from FAMUSS and CHIP with STRRIDE. STRRIDE includes older (age = 54 y; Table 7) subjects with features of metabolic syndrome but without a T2DM diagnosis (21). As expected, STRRIDE participants showed much higher average serum LDL (C) values (Fig. 2) but not a significant association of LDL (C) with rs646776 genotype. This contrasts with the 11.5-y-old CHIP children and 24-y-old FAMUSS young adults, who showed a significant genotype correlation. However, the 24 y olds showed a higher genotype effect than the 11.5 y olds (3.6% versus 2.5%, respectively).

Table 7 Sample characteristics for STRRIDE study
Figure 2

LDL (C) as a function of 1p genotype shows higher effect size in young population-based cohorts compared with older subjects with features of metabolic syndrome. For the CHIP cohort, LDL (C) was significantly different between genotypes (*p < 0.001) with a mean ΔLDL (C) = 0.103 and with genotype explaining 2.5% of the variability in LDL (C). For the FAMUSS cohort, LDL (C) was significantly different between genotypes (*p < 0.001) with a mean ΔLDL (C) = 0.116 and with genotype explaining 3.6% of the variability in LDL (C). For the STRRIDE cohort, LDL (C) was not significantly different between genotypes with a mean ΔLDL (C) = 0.014 and with genotype explaining <0.1% of the variability in LDL (C).

DISCUSSION

GWASs have enabled the identification of key polymorphisms linked to complex disease, including T2DM (1) and related phenotypes (see reviews from Ref. 22). This approach is frequently cited as holding potential to transform the practice of medicine. Most frequently, the integration of genetic risk factors into clinical practice is envisioned through preventive medicine. For example, those at highest risk for T2DM and/or CAD can be identified before onset of clinical symptoms, with interventions targeted toward those at greatest risk, perhaps personalizing interventions for specific genotypes. There are some shortcomings to this rationale. First, clinical trials will be needed to provide the evidence-based data to bring such presymptomatic interventions into standard of care. Because T2DM and CAD take decades to develop, the progression to overt disease is unlikely to be a practical endpoint in such trials. Thus, alternative sensitive and reliable endpoints are needed that can be assessed over a reasonable amount of time (e.g. 1 y). Second, many CAD risk factor loci such as those for T2DM have been identified with case-control cohorts, with advanced or end-stage disease. Thus, it is possible, if not likely, that existing GWAS CAD risk factor loci identified in adults are associated with advanced development of the risk factor. We hypothesized that the identification of T2DM- and dyslipidemia-related loci showing association with related biomarkers in young population-based cohorts would overcome many of these limitations.

We tested 20 CAD risk factor and T2DM GWAS loci (Table 4) and found that the large majority (19/20) showed no evidence of association with related phenotypes as quantitative traits in a young adult (24-y-old) population. The phenotypes tested were serum lipids [LDL (C), HDL (C), TG], glucose, HOMA, and insulin with and without adjustment for age and BMI. One locus, rs646776 at 1p13.3, showed strong associations with serum LDL (C) in our 24-y-old FAMUSS cohort, and this association (Table 5; Fig. 1A) remained significant after adjustment for multiple testing in the entire cohort (p = 3.0 × 10⁻⁶), males (p = 0.009), and females (p = 8.8 × 10⁻⁵). The association between rs646776 and LDL (C) was validated in a population-based 11.5-y-old replication cohort and found to show similar significance (Table 6; Fig. 1B).

We found a considerably larger effect size of the 1p13.3 LDL (C)-associated locus in young populations, compared with the previously published older cohorts. All previous studies have shown that rs646776 explains 1% of the population variation in LDL (C) levels (5,8). For comparison, a recent GWAS identified 30 loci for dyslipidemia (including the 1p13.3 locus) where the cumulative effect size of all 30 loci was 5% of the population variance in lipoprotein levels (5). In contrast, our data in 24 y olds showed the rs646776 SNP explained 3.6% of population variation, a considerably higher effect size than seen in most other loci (Table 5). This was despite lower average serum LDL (C) levels in our younger population (Fig. 2) and lower population variance in LDL (C) values. Our data in sixth graders (Table 6) also showed larger effect sizes than previously published (2.5% versus 1%); however, the effect size in our youngest population (11.5 y old) was not significantly higher than that in our 24 y olds (FAMUSS; 3.6% effect size).

The specific functional variant at the 1p13.3 LDL (C)-associated locus has recently been shown to be a noncoding SNP in the gene promoter of the SORT1 gene, a multiligand sorting receptor directly involved in lipid trafficking in the liver (23). This locus showed results consistent with our hypothesis that younger populations would show a stronger genetic component in population variation in serum markers. However, the other 19 loci tested did not show significance in the 24 y olds. This could be due to poor statistical power in younger subjects due to lower average serum marker levels and lower variance in blood levels for the serum markers. In addition, some genetic loci significant in older subjects may require the onset of age-related covariates before showing the published genetic associations. For example, the well-studied TCF7L2 T2DM locus has been associated with renal failure in diabetics and nondiabetics (24), a phenotype not expected to be expressed in younger subjects.

The rs646776 locus may provide the first opportunity to conduct clinical trials aimed at genotype-stratified prevention of dyslipidemia. Specifically, in our 11.5-y-old sixth-grader cohort, 27.3% of children showed levels of LDL (C) above the threshold considered abnormal (Table 3). Increasing LDL (C) levels increase the risk of CAD (11). Interventions can be designed to test the efficacy of treatment strategies such as dietary interventions and low-dose, lifelong lipid-lowering medication to decrease LDL (C) in children identified as at risk by rs646776 genotype.

Our findings were consistent with previous reports showing that the ancestral common allele is associated with higher LDL (C) levels. Thus, the larger population is at high risk for elevated LDL (C), whereas the more recently derived allele imparts protection from high LDL (C) levels to the subset of rare allele carriers. The relatively low relative risk associated with rs646776 and T2DM (1.1-fold), and the small effect sizes (1%) in older, sicker populations, are likely a consequence of the majority (two-thirds) of individuals carrying the high-risk genotype for increased LDL (C) levels (homozygous for the common rs646776 allele), and of multiple environmental and polygenic confounders obscuring some of the stronger effects seen earlier in young adults (Fig. 2).