Main

Genome-wide association studies in individuals of European descent have identified three genomic regions associated with atrial fibrillation on chromosomes 4q25 (PITX2)2,3,4,5, 16q22 (ZFHX3)3,5 and 1q21 (KCNN3)2. Studies of electrocardiographic traits have also identified a number of loci associated with atrial fibrillation6,7. However, despite these findings, much of the heritability of atrial fibrillation remains unexplained, justifying the search for additional genetic variants underlying atrial fibrillation risk8. Large-scale meta-analysis of GWAS results is a powerful method to identify additional genetic variation underlying traits and conditions. We therefore conducted a meta-analysis of multiple well-phenotyped GWAS samples of European ancestry to identify additional atrial fibrillation susceptibility loci.

Six prospective cohort and 10 prevalent study samples contributed to the discovery analysis, which was adjusted for age and sex (Table 1, Online Methods and Supplementary Note). Atrial fibrillation status was systematically ascertained in each sample (Online Methods and Supplementary Note). After application of quality control SNP exclusion criteria in each study (Supplementary Table 1), meta-analysis was performed, applying genomic control to each study. The genomic control inflation factor for the meta-analysis was 1.042 for the full set of SNPs and 1.040 after omitting all SNPs within 500 kb of the association signals that reached genome-wide significance. The quantile-quantile plot of the expected versus observed P-value distributions for association of the 2,609,549 SNPs analyzed is shown (Supplementary Fig. 1). We identified ten loci that exceeded our preset threshold for genome-wide significance (P < 5 × 10−8) (Fig. 1). The three loci most significantly associated with atrial fibrillation were at previously identified atrial fibrillation susceptibility loci on chromosomes 4q25 near PITX2 (rs6817105; P = 1.8 × 10−74)4, 16q22 in ZFHX3 (rs2106261; P = 3.2 × 10−16)3,5 and 1q21 in KCNN3 (rs6666258; P = 2.0 × 10−14)2 (Table 2).
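As an illustration of how an inflation factor such as the 1.042 reported above is typically obtained, the following minimal sketch computes λ as the ratio of the median observed 1-d.f. chi-square statistic to its expected median under the null. The function names and the input P values are hypothetical and are not taken from the analysis pipeline used in this study.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a 1-d.f. chi-square distribution (about 0.455), derived rather than hard-coded.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.25) ** 2


def p_to_chi2(p):
    """Convert a two-sided P value (0 < p <= 1) to the equivalent 1-d.f. chi-square statistic."""
    return _STD_NORMAL.inv_cdf(p / 2.0) ** 2


def inflation_factor(p_values):
    """Genomic control lambda: median observed chi-square divided by the expected median."""
    return median(p_to_chi2(p) for p in p_values) / CHI2_1DF_MEDIAN


# Hypothetical input: evenly spaced P values imitating a well-calibrated null distribution.
null_like = [(i + 0.5) / 10001 for i in range(10001)]
lam = inflation_factor(null_like)
print(f"lambda = {lam:.3f}")
```

A value close to 1 suggests little residual stratification or cryptic relatedness; recomputing λ after removing SNPs near genome-wide significant signals, as described above, checks that true associations are not driving the inflation.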

Table 1 Subject characteristics
Figure 1: Manhattan plot of meta-analysis results for genome-wide association with atrial fibrillation.

The −log10 (P value) is plotted against the physical position of each SNP on each chromosome. The threshold for genome-wide significance, P < 5 × 10−8, is indicated by the dashed line. The three previously reported loci for atrial fibrillation are indicated in blue, and the seven new loci that exceeded the genome-wide significance threshold are indicated in orange.

Table 2 Summary of GWAS meta-analysis results with P < 5 × 10−8

Seven new genomic loci were associated with atrial fibrillation with P < 5 × 10−8 in the discovery stage (Table 2). The most significantly associated SNP in each of the seven new loci was genotyped and tested for association with atrial fibrillation in an additional 3,132 to 5,289 independent individuals with atrial fibrillation and 8,159 to 11,148 referent individuals derived from six studies of individuals of European ancestry (Supplementary Table 2). Six of the loci associated with atrial fibrillation in the discovery stage met our criteria for independent replication. Study-specific replication results are detailed (Supplementary Table 3). The results from meta-analysis of the discovery and replication results are shown (Table 2), as are regional plots (Fig. 2). Recognizing that the genes in closest physical proximity to the associated SNPs are not always the causative genes, we report below the genetic associations in order of statistical significance and describe the nearest gene.

Figure 2: Regional plots for seven new atrial fibrillation loci in the discovery sample with P < 5 × 10−8.

SNPs are plotted by meta-analysis P value and genomic position (NCBI Build 36). The SNP of interest is labeled. The strength of LD is indicated by red coloring. Estimated recombination rates are shown by the blue peaks, and gene annotations are indicated by dark green arrows. LD and recombination rates are based on the Utah residents of Northern and Western European ancestry (CEU) HapMap cohort (release 22). Plots were prepared using SNAP27.

The most significant new association in the discovery stage was on chromosome 1q24 (rs3903239; overall P = 8.4 × 10−14) in PRRX1, which encodes a homeodomain transcription factor highly expressed in the developing heart, particularly in connective tissue. Biological interaction between PRRX1 and a related homeobox transcription factor gene, PRRX2, results in abnormalities of great vessel development in a mouse knockout model9. In a separate PRRX1 knockout, fetal pulmonary vasculature development was impaired10.

A second locus was identified on chromosome 7q31 (rs3807989; overall P = 3.6 × 10−12) in CAV1, which encodes caveolin-1, a cellular membrane protein involved in signal transduction. CAV1 is selectively expressed in the atria11, and its knockout has been associated with dilated cardiomyopathy12. The CAV1 protein colocalizes with and negatively regulates the activity of KCNH2 (ref. 13), a potassium channel involved in cardiac repolarization; the corresponding KCNH2 gene was found to be associated with atrial fibrillation in a candidate gene association study, although not in our present analysis14. The top SNP at the CAV1 locus identified in the current study, rs3807989, was previously identified in a GWAS of the PR and QRS intervals and related to atrial fibrillation6,7. The relationships between other previously reported PR-associated loci and atrial fibrillation are reported (Supplementary Table 4). Of note, significant associations with atrial fibrillation were observed for SNPs related to the PR interval in SOX5, TBX5, SCN5A and SCN10A6,7.

The third locus was on chromosome 14q23 (rs1152591; overall P = 5.8 × 10−13) and was located in an intron of SYNE2, which encodes numerous nesprin-2 isoforms, some of which are highly expressed in the heart and skeletal muscle. Nesprin-2 localizes throughout the sarcomere and is involved in maintaining nuclear structural integrity by anchoring the nucleus to the cytoskeleton. In a candidate gene approach, mutations in SYNE2 were found to segregate in some families with Emery-Dreifuss muscular dystrophy15, which is characterized by skeletal muscle atrophy, cardiomyopathy and cardiac conduction defects.

The fourth locus was on chromosome 9q22 (rs10821415; overall P = 4.2 × 10−11) and was located in an ORF on chromosome 9. Genes at this locus include FBP1 and FBP2, which are important for gluconeogenesis. Autosomal recessive FBP1 deficiency has been described, but cardiovascular features did not seem to be prominent16. Variants at 9q22 have been implicated in the regulation of height, pulmonary function, angiogenesis and attention deficit hyperactivity disorder, although rs10821415 is not in substantial linkage disequilibrium (LD) with any of these SNPs (r2 < 0.30).

A fifth signal was located at 15q24 (rs7164883; overall P = 2.8 × 10−17) in the first intron of HCN4. The HCN4 protein is the predominant cardiac hyperpolarization-activated cyclic nucleotide–gated channel and is highly expressed in the sinoatrial node. HCN4 activity underlies the funny current (If) that governs cardiac pacemaking, and mutations in HCN4 have been associated with various forms of sinus nodal dysfunction17,18.

A sixth locus on chromosome 10q22 (rs10824026; overall P = 4.0 × 10−9) was located 5 kb upstream of SYNPO2L and 20 kb upstream of MYOZ1. The proteins encoded by SYNPO2L and MYOZ1 are both expressed in skeletal and cardiac muscle, localize to the Z-disc and interact with numerous other proteins. However, the precise role of either gene in cardiovascular physiology is unknown19,20. A mouse knockout of MYOZ1 showed increased calcineurin activity and cardiac hypertrophy in response to pressure overload. However, candidate gene approaches have not supported a prominent role for MYOZ1 mutations in causing familial dilated cardiomyopathy21. Of note, the SYNPO2L locus is located within a previously reported atrial fibrillation susceptibility locus identified in a family with autosomal dominant atrial fibrillation22.

One other locus was identified in the meta-analysis of atrial fibrillation in the WNT8A gene (rs2040862; P = 3.2 × 10−8); however, this association failed to replicate in additional independent cohorts with atrial fibrillation (replication P = 0.36; combined P = 2.5 × 10−7).

There was evidence of significant heterogeneity in the discovery meta-analysis at the previously published atrial fibrillation susceptibility signals at 4q25 near PITX2 and at 16q22 in ZFHX3 (Table 2). Effect heterogeneity at the PITX2 locus has already been observed4,23,24.

We then sought to determine whether the top SNPs or their proxies at each locus were associated with alterations in gene expression in an expression quantitative trait locus (eQTL) database, the Genotype-Tissue Expression eQTL browser. The top SNP at the SYNPO2L locus, rs10824026, is in strong LD with a SNP, rs12570126 (r2 = 0.932), that was found to correlate with the expression of MYOZ1 and SYNPO2L (P = 1.5 × 10−6; data derived from lymphoblastoid cell lines in 270 individuals from the HapMap Consortium)25. Furthermore, the top SNP at the SYNPO2L locus is in LD with a nonsynonymous SNP in SYNPO2L, rs3812629 (r2 = 0.8; encoding the SYNPO2L P707L variant with respect to transcript Q9H987-1), that is predicted to be damaging by both the PolyPhen-2 and the SIFT algorithms. None of the other identified atrial fibrillation risk SNPs were associated with variations in gene expression in the Genotype-Tissue Expression eQTL browser.

We next assessed the generalizability of our findings in a separate GWAS of individuals of Japanese ancestry, including 843 subjects with and 3,350 subjects without atrial fibrillation in the BioBank Japan study (Supplementary Fig. 2a,b). Within the Japanese GWAS, only the chromosome 4q25 (PITX2) locus exceeded the preset threshold for genome-wide significance (rs2634073; odds ratio (OR) = 1.84, 95% confidence interval (CI) = 1.59–2.13; P = 3.7 × 10−17; Supplementary Figs. 2a and 3 and Supplementary Table 5). At the previously published locus at 16q22 (ZFHX3)3,5, rs12932445 was associated with atrial fibrillation in participants of Japanese ancestry (P = 6.8 × 10−4). The relationship between atrial fibrillation and variants at the KCNN3-PMVK locus on chromosome 1q21 failed to replicate (P = 0.17); however, a regional plot of this locus revealed a distinct signal at the rs7514452 SNP (approximately 375 kb away from rs6666258, the top SNP in the European ancestry sample) that was modestly associated with atrial fibrillation in the Japanese sample (P = 4.9 × 10−5; r2 = 0.002). At PRRX1 and CAV1, the top SNPs in the European samples were also associated with atrial fibrillation in Japanese individuals. For the loci near PRRX1 and C9orf3, alternate SNPs in the Japanese cohort were more significantly associated with atrial fibrillation than the top SNP at each locus in Europeans (Supplementary Fig. 3 and Supplementary Table 5).

Our study was subject to a number of limitations. To maximize both the power and the generalizability of our study, we included all available individuals with atrial fibrillation; thus, some individuals had comorbidities, such as systolic dysfunction and hypertension. However, none of the identified risk variants for atrial fibrillation were strongly associated (P < 1 × 10−5) with systolic dysfunction in the EchoGen Consortium26, a meta-analysis of echocardiographic data from five community-based cohorts consisting of over 12,000 individuals of European descent. Further, when our replication results were adjusted for hypertension status, the identified variants remained significantly associated with atrial fibrillation (Supplementary Table 3). Ultimately, the development of a comprehensive risk score incorporating clinical, biochemical and genetic marker data will be necessary to clarify the incremental benefit of our findings in clinical care. Our eQTL analyses were limited to data available within the Genotype-Tissue Expression eQTL browser; future eQTL analyses in cardiac tissue may be helpful in identifying a relationship between the SNPs associated with atrial fibrillation risk and gene expression. Finally, we acknowledge that the identified variants may not be causal but may represent causal elements in the same or different molecular pathways; future statistical, bioinformatic and biological analyses investigating potential genetic interactions are warranted. Fine mapping and deep resequencing will be necessary to uncover the genetic architecture accompanying the identified common atrial fibrillation susceptibility signals.

In summary, our GWAS meta-analysis for atrial fibrillation has identified six new susceptibility loci in or near plausible candidate genes involved in pacemaking activity, signal transduction and cardiopulmonary development. Our results show that atrial fibrillation has multiple genetic associations and identify new targets for biological investigation.

URLs.

BIMBAM, http://quartus.uchicago.edu/~yguan/bimbam/index.html; Genotype-Tissue Expression eQTL browser, http://www.ncbi.nlm.nih.gov/gtex/GTEX2/gtex.cgi; iControlDB database, http://www.illumina.com/science/icontroldb.ilmn; IMPUTE, https://mathgen.stats.ox.ac.uk/impute/impute.html; MACH, http://www.sph.umich.edu/csg/abecasis/MACH/; METAL, http://www.sph.umich.edu/csg/abecasis/metal/index.html; PolyPhen-2, http://genetics.bwh.harvard.edu/pph2/; SIFT, http://sift.jcvi.org/; SNAP, http://www.broadinstitute.org/mpg/snap/ldsearch.php.

Methods

In each center, the local institutional review board reviewed and approved all study procedures; written informed consent was obtained from each participant, including consent to use DNA for genetic analyses of cardiovascular disease.

Discovery cohorts.

For a detailed description of the study cohorts used in the discovery phase, please see the Supplementary Note.

Replication cohorts.

Independent subjects with atrial fibrillation were identified from the German Competence Network for Atrial Fibrillation (AFNET) Study, and controls without atrial fibrillation were obtained from the Cooperative Health Research in the Region Augsburg (KORA) S4 study for the replication phase of the current study. Independent subjects with atrial fibrillation and controls without atrial fibrillation were identified from the Heart and Vascular Health (HVH) study for replication; included in the replication sample were atrial fibrillation cases of ≥66 years of age or with clinically recognized structural heart disease at atrial fibrillation diagnosis and referent subjects without atrial fibrillation, who were frequency matched to atrial fibrillation cases on the basis of age, sex, hypertension and year of identification. Independent subjects with early-onset atrial fibrillation were identified from the Massachusetts General Hospital (MGH) Atrial Fibrillation Study; referent subjects without atrial fibrillation were drawn from the local hospital catchment.

The Health, Aging and Body Composition (Health ABC) Study is a National Institute on Aging–sponsored ongoing cohort study of the factors that contribute to incident disability and the decline in function of healthier older persons, with a particular emphasis on changes in body composition in old age. Health ABC enrolled well-functioning, community-dwelling black (n = 1,281) and white (n = 1,794) men and women aged 70–79 years between April 1997 and June 1998. Participants were recruited from a random sample of white, and from all black, Medicare-eligible residents in the Pittsburgh, Pennsylvania, and Memphis, Tennessee, metropolitan areas. The key components of Health ABC include a baseline exam, annual follow-up clinical exams and phone contacts every 6 months to identify major health events and document functional status between clinic visits.

The Malmö Study consists of cases with prevalent or incident atrial fibrillation from two population-based cohorts from Malmö, Sweden, (Malmö Diet and Cancer and the reexamination of the Malmö Preventive Project) identified from national registers that were matched 1:1 to controls from the same cohort by sex, age (±1 year), date of baseline exam (±1 year) and requirement for a follow-up exam exceeding that for the corresponding case.

The Ottawa Heart Atrial Fibrillation study consists of individuals with lone atrial fibrillation or atrial fibrillation and hypertension, recruited from the Arrhythmia Clinic at the University of Ottawa Heart Institute (UOHI). Enrollment requires at least one episode of electrocardiographically documented atrial fibrillation, characterized by erratic atrial activity without distinct P waves and irregularly irregular QRS intervals. Exclusion criteria consist of a history of coronary artery disease, a left ventricular ejection fraction of <50% or significant valvular disease on echocardiography. Control subjects were drawn from the control arm of the Ottawa Heart Genomics Study, an ongoing case-control study for coronary artery disease at the UOHI. Male control subjects were ≥65 years of age, and female controls were ≥70 years of age. Control subjects with a documented history of atrial fibrillation were excluded from this study. All cases and controls were of western European ancestry.

Atrial fibrillation GWAS in Japanese.

We used 843 atrial fibrillation cases who participated in the BioBank Japan project between 2003 and 2006. Control subjects consisted of 2,444 Japanese individuals registered in BioBank Japan as subjects with 11 diseases (hepatic cirrhosis, osteoporosis, colorectal cancer, breast cancer, prostate cancer, lung cancer, uterine myoma, amyotrophic lateral sclerosis, drug eruption, gallbladder and bile duct cancer and pancreatic cancer) and 906 healthy volunteers recruited from the Osaka-Midosuji Rotary Club.

Illumina Human610-Quad and HumanHap550v3 Genotyping BeadChips were used for case and control groups, respectively. We applied quality control criteria (call rate of ≥0.99 in both cases and controls and Hardy-Weinberg equilibrium test P of ≥1.0 × 10−6 in the control population); 430,963 SNPs on all chromosomes passed the quality control filters. All cluster plots were examined by visual inspection by trained personnel to exclude SNPs with ambiguous calls.
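The SNP-level filters described above can be sketched as follows. The text does not state which Hardy-Weinberg test was applied, so this example assumes a 1-d.f. chi-square goodness-of-fit test; the function names and genotype counts are hypothetical and serve only to illustrate the thresholds (call rate ≥0.99 in both groups, control HWE P ≥ 1.0 × 10−6).

```python
import math


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of a 1-d.f. chi-square test for Hardy-Weinberg equilibrium.

    Assumption: a chi-square test is used here for simplicity; an exact test
    may be preferable for rare alleles.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: the test is uninformative
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a 1-d.f. chi-square via the complementary error function.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(call_rate_cases, call_rate_controls, control_genotypes,
              min_call_rate=0.99, min_hwe_p=1.0e-6):
    """Apply the call-rate filter to both groups and the HWE filter to controls only."""
    return (call_rate_cases >= min_call_rate
            and call_rate_controls >= min_call_rate
            and hwe_chi2_p(*control_genotypes) >= min_hwe_p)


# Hypothetical SNPs: one in perfect equilibrium, one with no heterozygotes at all.
print(passes_qc(0.995, 0.992, (250, 500, 250)))
print(passes_qc(0.995, 0.992, (500, 0, 500)))
```

Testing equilibrium in controls only avoids discarding SNPs whose genotype distribution is distorted in cases by a true disease association.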

Genotyping.

Detailed information on the genotyping platforms and exclusions in each cohort for the GWAS meta-analysis is provided (Supplementary Table 1). Replication genotyping was performed using TaqMan assays (Applied Biosystems) in the Ottawa sample or Sequenom iPlex single-base primer extensions with MALDI-TOF mass spectrometry (Sequenom) for AFNET/KORA S4, HVH, Malmö and MGH samples.

Statistical analysis.

For the meta-analysis, over 2.5 million HapMap SNPs were imputed within each study using the HapMap CEU population. MACH v1.0.1x was used by AFNET, AGES, Rotterdam (RS-1), Vanderbilt, MGH, FHS, ARIC, the Cleveland Clinic (CC) and WGHS; BIMBAM was used by HVH and CHS; and IMPUTE v0.5 was used by Study of Health in Pomerania (SHIP). In studies for which population structure was associated with the atrial fibrillation phenotype (FHS, MGH and Cleveland Clinic), analyses were adjusted for the principal components of genotype associated with phenotype28. The primary analysis in each center used logistic or proportional hazards regression, as appropriate, adjusting for age at DNA draw and sex. ARIC and CHS also adjusted for study site. Each SNP was modeled using an additive genetic effect. The ratio of observed-to-expected variance in the imputed SNP genotype counts29, the MACH Rsq statistic, which is a variation on this metric, or a measure of the observed statistical information associated with the imputed genotype that was computed by IMPUTE, was used as a quality control metric for imputed SNPs. All three metrics range from 0 to 1, with 1 indicating high imputation quality and 0 indicating no imputation information. For each SNP, all studies with quality scores greater than 0.10 were included in meta-analyses. For each SNP, a fixed-effects model was used for meta-analysis of the genotype logistic regression parameters (log odds ratios), using inverse-variance weights as implemented in the meta-analysis utility METAL. Before meta-analysis, genomic control was applied to each study having a genomic control inflation factor (λ) of >1.0, by multiplying the standard error of the SNP regression parameter by the square root of the study-specific λ value. A total of 2,609,549 SNPs with average minor allele frequencies of ≥0.01 across participating studies were included in meta-analyses. 
We preset a threshold of P < 5 × 10−8 corresponding to Bonferroni adjustment for 1 million independent tests as our criterion for genome-wide significance30. Our preset criterion for replication was that the meta-analysis of the discovery and replication studies would have a smaller P value than the discovery meta-analysis.
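
The pooling steps described above can be summarized in a short sketch: study-specific genomic control (standard error multiplied by √λ when λ > 1), inverse-variance fixed-effects pooling of log odds ratios, the Bonferroni-derived significance threshold and the replication criterion. This is not the METAL implementation; the function names and the per-study estimates are hypothetical. Only the WNT8A P values in the final lines are taken from the results reported in the text.

```python
import math

# Bonferroni adjustment of alpha = 0.05 for 1 million independent tests.
GENOME_WIDE_ALPHA = 0.05 / 1_000_000


def genomic_control_se(se, lam):
    """Inflate a standard error by sqrt(lambda); applied only when lambda > 1."""
    return se * math.sqrt(lam) if lam > 1.0 else se


def fixed_effects_meta(studies):
    """Inverse-variance fixed-effects pooling of log odds ratios.

    studies: iterable of (beta, se, lam) tuples, where beta is the per-allele
    log odds ratio, se its standard error and lam the study-specific lambda.
    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    weight_sum = 0.0
    weighted_beta_sum = 0.0
    for beta, se, lam in studies:
        weight = 1.0 / genomic_control_se(se, lam) ** 2
        weight_sum += weight
        weighted_beta_sum += weight * beta
    pooled_beta = weighted_beta_sum / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    z = pooled_beta / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return pooled_beta, pooled_se, p


def meets_replication_criterion(discovery_p, combined_p):
    """Replication requires the combined P value to be smaller than the discovery P value."""
    return combined_p < discovery_p


# Hypothetical per-study estimates (not data from this study).
example = [(0.20, 0.05, 1.03), (0.15, 0.08, 0.99), (0.25, 0.10, 1.01)]
beta, se, p = fixed_effects_meta(example)
print(f"OR = {math.exp(beta):.3f}, P = {p:.2e}, genome-wide significant = {p < GENOME_WIDE_ALPHA}")

# WNT8A (rs2040862): discovery P = 3.2e-8, combined P = 2.5e-7, so it fails the criterion.
print(meets_replication_criterion(3.2e-8, 2.5e-7))
```

Because weights are the reciprocals of the corrected variances, larger and less inflated studies contribute most to the pooled estimate.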

Prediction of SNP function and eQTL analyses.

Proxies for each of the three previously published and seven newly identified top SNPs were obtained from SNAP Proxy Search. The HapMap (release 22) CEU population was used as the reference panel, and the r2 threshold was 0.8. We limited the maximum physical distance to 500 kb. The seven newly identified top SNPs, along with their proxies, were then used for SNP function and eQTL analysis. eQTL analysis was performed by searching against the Genotype-Tissue Expression eQTL browser, which compiled data sets collected from multiple studies. Functional annotation of these SNPs was obtained from the dbSNP database. Nonsynonymous SNPs were selected and submitted to PolyPhen-2 and SIFT for functional effect prediction.
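
The proxy criteria above (r2 of at least 0.8 and no more than 500 kb from the index SNP) can be illustrated with a minimal sketch. It does not reproduce SNAP; the function names, SNP identifiers, positions and frequencies are hypothetical, and treating the r2 threshold as inclusive is an assumption.

```python
def r_squared(p_ab, p_a, p_b):
    """Pairwise LD (r2) between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A and allele B, each strictly between 0 and 1
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def select_proxies(index_pos, candidates, min_r2=0.8, max_dist_bp=500_000):
    """Return names of candidate SNPs meeting the r2 and distance thresholds.

    candidates: iterable of (name, position_bp, r2_with_index_snp) tuples.
    """
    return [name for name, pos, r2 in candidates
            if r2 >= min_r2 and abs(pos - index_pos) <= max_dist_bp]


# Hypothetical candidates around an index SNP at position 1,000,000.
candidates = [
    ("snp_near_high_ld", 1_020_000, 0.93),
    ("snp_near_low_ld", 1_030_000, 0.40),
    ("snp_far_high_ld", 1_700_000, 0.95),
]
print(select_proxies(1_000_000, candidates))
```

Restricting proxies by both LD and distance keeps the functional annotation focused on variants that plausibly tag the same association signal.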