Introduction

Bladder cancer, >85% being superficial, is the ninth most frequent cancer worldwide (Theodorescu, 2003). Of the patients, approximately 50% will experience a recurrence within 2 years after an initial diagnosis, and 16–25% will have recurrence after endoscopic resection (Kim and Bae, 2008). Thus, the frequent recurrence of bladder cancer is a major medical problem. Numerous clinical variables and molecular biomarkers, including chromosomal markers, genetic variations or epigenetic alterations, have been investigated for their prediction of recurrence. However, the results from published studies remain conflicting rather than conclusive.

Tobacco smoking, as a predominant risk factor for bladder cancer, is responsible for about half the cases in men and a third in women (Parkin et al., 2005). As tobacco-related carcinogens often form bulky DNA adducts, base damage, single-strand breaks and double-strand breaks (Pryor et al., 1983), the variation in individual DNA repair capacity may explain the considerable variability in the risk in the general population and progression of bladder cancer and clinical outcomes in patients with similar tumor pathological grades and clinical stages.

There are several DNA repair pathways in human cells; oxidative DNA damage is repaired mainly by two of them, base-excision repair and nucleotide excision repair (NER) (Frosina, 2000; Wood et al., 2001). Xeroderma pigmentosum group F (XPF), also known as excision repair cross-complementing group 4 (ERCC4), is critically involved in the NER pathway (Berwick and Vineis, 2000). XPF also has an important role in recombination repair, mismatch repair and possibly immunoglobulin class switching (Kornguth et al., 2005). XPF contains the catalytic domain of the nuclease, and ERCC1 is required for DNA binding and stabilization of XPF (Tsodikov et al., 2005). The ERCC1–XPF complex can remove 3′ single-stranded flaps from DNA ends (De Laat et al., 1998) and cleaves the 5′ side of a bubble in NER to excise toward the lesion (Sijbers et al., 1996).

Although hundreds of polymorphisms of XPF have been identified, their effects on DNA repair and clinical outcomes in cancers have not been well characterized. Several studies have investigated the associations between the XPF polymorphisms and risk of cancers, including cancer of the breast (Smith et al., 2003; Lee et al., 2005; Milne et al., 2006; Han et al., 2008), lung (Shao et al., 2008), head and neck (An et al., 2007), skin (Winsey et al., 2000) and pancreas (McWilliams et al., 2008), but the results are not consistent, suggesting that additional functional studies are needed to provide some mechanistic support. Recently, an intronic XPF SNP rs744154 has been identified from among 640 SNPs in 111 genes as a possible causal variant for breast cancer risk in a two-stage case–control study (Milne et al., 2006).

In this study, we conducted a two-stage case–control analysis of 364 bladder cancer cases and 400 cancer-free controls in the Chinese population. We genotyped three known tagging single nucleotide polymorphisms (tagSNPs that tag 37 SNPs), including the recently reported intronic XPF SNP rs744154, to evaluate the associations between these three XPF variants and the risk of bladder cancer. Through further resequencing, phylogenetic analyses and molecular studies, we identified a novel, functional promoter polymorphism, –357A>C, which is likely to be the causal variant that may contribute to the development and recurrence of bladder cancer.

Results

Characteristics of study subjects and SNP genotyping

Frequency distributions of selected characteristics of the cases and controls are presented in Supplementary Table 1. The cases and controls appeared to be adequately matched on age and sex (P=0.485 and P=0.223, respectively, in the first set and P=0.912 and P=0.167, respectively, in the second set). The selected SNPs with locations and allele frequencies are summarized in Supplementary Table 2. The genotype frequencies of all four XPF SNPs among the controls were in agreement with the Hardy–Weinberg equilibrium (P>0.05 for all).

Association analysis of tagSNPs

In the first-set analysis of the three tagSNPs, the allele and genotype frequencies between cases and controls were significantly different for SNP rs744154 (P=0.006 and 0.014, respectively, for the allele and genotype) but not for the other two tagSNPs. Specifically, the CG and the combined CG/GG genotypes were associated with significantly reduced odds ratios (ORs) of 0.56 (95% confidence interval (CI)=0.38–0.83) and 0.55 (0.38–0.81), respectively (Table 1), which were further confirmed in the second-set analysis (OR=0.60, 95% CI=0.36–0.99 for CG and 0.57, 0.35–0.93 for CG/GG genotypes) (Table 2). When the two sets were combined, the association remained significant and the confidence intervals narrowed (OR=0.61, 95% CI=0.44–0.83, P=0.002 for the CG genotype and OR=0.60, 95% CI=0.44–0.81, P=0.001 for the CG/GG genotypes, compared with the CC genotype) (Table 2).
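
As an illustration of how such an odds ratio and its confidence interval can be obtained, the following minimal sketch computes a crude OR with a log-based (Woolf) 95% CI from a 2×2 genotype table. It is not the adjusted logistic regression used in this study, the function name is hypothetical, and the counts are invented for the example rather than taken from Table 1 or Table 2.

```python
import math


def odds_ratio_ci(case_carrier, case_ref, ctrl_carrier, ctrl_ref, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    carrier = subjects with the variant genotypes (e.g. CG/GG),
    ref     = subjects with the reference genotype (e.g. CC).
    """
    odds_ratio = (case_carrier * ctrl_ref) / (case_ref * ctrl_carrier)
    # Standard error of ln(OR) from the four cell counts
    se = math.sqrt(1 / case_carrier + 1 / case_ref + 1 / ctrl_carrier + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT data from this study
or_value, ci_low, ci_high = odds_ratio_ci(60, 140, 100, 100)
print(f"OR={or_value:.2f}, 95% CI={ci_low:.2f}-{ci_high:.2f}")
```

An interval that lies entirely below 1, as in this invented example, corresponds to a significantly reduced risk at the 0.05 level.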

Table 1 Genotype frequencies of the XPF tagSNPs among cases and controls and their association with bladder cancer risk in the first-set analysis
Table 2 Association between the XPF SNPs and risk of bladder cancer among cases and controls in the second-set analysis and all subjects

Candidate cancer susceptibility variants in the XPF promoter region

As the SNP rs744154 within the first intron was seemingly non-functional, we hypothesized that this SNP is in linkage disequilibrium (LD) with other functional or causative SNPs in the XPF promoter. Thus, we performed a phylogenetic analysis of the region spanning the promoter to the first intron with the ClustalW 2.0 software. We found that this region was highly conserved between human and mouse (Supplementary Figure 1), suggesting that the causal variants may lie in this region. We then resequenced this region (that is, −1000 to +131 bp of the XPF gene) in 40 DNA samples and identified three common variants with MAF≥0.05 (that is, −644C>T, −357A>C and −30T>A). These variants showed strong LD with one another in the HapMap database (r²=1.00 for each pair of loci) and were also in complete LD with the SNP rs744154 (r²=1.00).
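
For readers less familiar with the LD measure, the sketch below shows how r² between two biallelic loci can be computed from haplotype and allele frequencies. The function name and the frequencies are hypothetical and serve only to illustrate that r²=1.00 arises when two alleles always travel on the same haplotype.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Hypothetical frequencies: the two minor alleles always occur together
complete_ld = ld_r_squared(p_ab=0.20, p_a=0.20, p_b=0.20)
# Hypothetical frequencies: the two loci are independent
no_ld = ld_r_squared(p_ab=0.04, p_a=0.20, p_b=0.20)
```

With complete LD, genotyping one locus fully predicts the other, which is why a tagSNP such as rs744154 can act as a proxy for an untyped promoter variant.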

Identification of the DNA protein-binding region and allele-specific effects of SNPs in the XPF promoter region

To determine whether these three promoter variants affect transcription factor binding, we performed the electrophoretic mobility shift assay (EMSA) to analyze the binding of oligonucleotide probes containing the different alleles of these SNPs (that is, −644T, −644C, −357C, −357A, −30A and −30T) to nuclear proteins. As shown in Figure 1a, no allele-specific DNA–protein complex was found for the −644T and −644C probes or the −30A and −30T probes. However, a specific DNA–protein complex was formed by the −357A probe but not by the −357C probe. We then conducted a computer-based MOTIF search (http://motif.genome.jp) for the XPF −357A>C polymorphism and found that the −357A allele, but not the −357C allele, creates a binding motif for the C/EBPα transcription factor. We further performed a super-shift EMSA to examine whether C/EBPα was a component of the allele-specific complex (Figure 1b). The specific complex formed in the presence of the −357A probe could not be supershifted with the anti-C/EBPα antibodies (lane 5), suggesting that the complex contains proteins other than C/EBPα. By purifying the specific binding proteins of the XPF −357A>C site on PEG6000, we identified four binding proteins, EEF1G, SIRT4, REXO1 and HMGB2, with molecular masses of approximately 32 to 66 kDa (Supplementary Figure 3). However, the functions of these proteins need to be clarified in future studies.

Figure 1

Analysis of transcription factor-binding sites in the XPF promoter region. (a) EMSA with biotin-labeled probes and NIH-3T3 cell nuclear extracts. Lanes 1, 4, 7, 10, 13 and 16 showed the mobilities of the labeled probes without nuclear extracts; Lanes 2, 5, 8, 11, 14 and 17 represented the mobilities of the labeled probes with nuclear extracts in the absence of competitors. A specific nuclear protein binding can be completely abolished by both 150-fold unlabeled −644T and −644C probes, but not −357A probe (Lanes 3, 6 and 12). No specific nuclear protein binding was found with −357C, −30A or −30T probes (Lanes 8, 14 and 17). (b) EMSA with biotin-labeled −357A and C/EBPα probes, confirmed by supershift with anti-C/EBPα antibody. A specific nuclear protein binding can be completely abolished by a 250-fold unlabeled −357A probe, but not 150-fold unlabeled −357A probe (Lanes 3 and 4). Super-shift assays incubating with anti-C/EBPα antibody showed a super-shifted protein complex with the C/EBPα probe, but not with the –357A probe (Lanes 5 and 7).

Effect of the XPF −357A>C polymorphism on transcriptional activity

We further constructed luciferase reporter vectors by using the pGL3-basic vector with either the −357A or −357C allele (Supplementary Figure 2a). The results showed that the vectors with the −357A allele had a 60 to 75% increase in the relative luciferase activities, compared with the −357C allele (P<0.05 for all cell lines) (Supplementary Figure 2b). In addition, transcriptional activity of the XPF gene was reduced in cells treated with HMGB2 siRNA, whereas the cells with these two constructs did not differ in luciferase activities in response to UV irradiation (Supplementary Figure 2c).

Effects of the XPF −357A>C polymorphism on XPF mRNA and protein expression

We then evaluated the effect of the XPF −357A>C polymorphism on XPF mRNA expression in 22 bladder tumor tissues by real-time quantitative RT–PCR. As shown in Supplementary Figure 4, the XPF mRNA expression levels were significantly higher in individuals with the AA genotype (n=15) than in those with the AC/CC genotypes (n=7) (2.32±0.25 versus 1.00±0.18, in arbitrary units, P<0.01). In addition, we analyzed the XPF protein expression in 55 bladder tumor tissues (30 with the AA genotype and 25 with the AC or CC genotype). We found that the expression level was significantly higher for the AA genotype (46.7%) than for the AC/CC genotypes (16.0%) in both the cytoplasm and nucleus (P=0.022) (Supplementary Figure 5).

The XPF –357A>C polymorphism and bladder cancer risk and recurrence

We then examined the contribution of the newly identified XPF promoter −357A>C polymorphism to bladder cancer risk and recurrence in the second-set analysis. Specifically, of the 130 bladder cancer patients, 79 (60.8%) had superficial bladder cancer (pTis, pTa and pT1); the remaining 51 (39.2%) had invasive disease (pT2–pT4). Also, 109 (83.8%) had low-risk tumors (G1–G2), and the remaining 21 (16.2%) had high-risk tumors (G3). As shown in Table 2, the −357AC/CC genotypes were significantly associated with an increased risk of bladder cancer (OR=1.94, 95% CI=1.19–3.19) compared with the AA genotype, and this association was more pronounced among subgroups of age >65 years (2.09, 1.03–4.24), men (2.04, 1.17–3.56), smokers (2.06, 1.00–4.25), patients with low tumor grade (1.99, 1.20–3.32) and superficial bladder cancer cases (1.91, 1.08–3.56).

We further focused on the association between the XPF promoter −357A>C polymorphism and bladder cancer recurrence in the second set, for which clinical follow-up data were available. Among the 79 superficial bladder cancer cases, 38 developed recurrence after a 36-month follow-up, with a 15-month median recurrence-free survival time. The −357AA homozygotes had a significantly longer median recurrence-free survival time (19.0 months) than the carriers of the −357AC/CC genotypes (11.0 months) (log-rank test, P=0.025) (Figure 2). Compared with the −357AA carriers, individuals with the −357C allele had a significant increase in recurrence risk (hazard ratio (HR)=2.13, 95% CI=1.04–4.40) (Table 3). In stratified analysis by age, sex, pack-years smoked and tumor grade, this increased risk of recurrence was evident in men (2.37, 1.08–5.24) and those with low-risk tumors (3.62, 1.42–9.28) (Table 3).
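
To make the notion of median recurrence-free survival concrete, the sketch below implements a plain Kaplan–Meier estimator and reads off the first time at which the estimated survival drops to 0.5 or below. The function names are hypothetical and the follow-up data are invented for illustration; they are not the patient data analyzed here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier curve as a list of (time, survival) at each event time.

    times:  follow-up in months
    events: 1 = recurrence observed, 0 = censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # recurrences at time t
        n_leaving = 0  # recurrences plus censored subjects at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up data, NOT data from this study
months = [3, 6, 6, 9, 12, 15, 20, 24, 30, 36]
recurred = [1, 1, 0, 1, 1, 1, 0, 1, 0, 0]
km_curve = kaplan_meier(months, recurred)
km_median = median_survival(km_curve)
```

Comparing two such curves, one per genotype group, is what the log-rank test formalizes.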

Figure 2

Kaplan–Meier survival function for recurrence among superficial bladder cancer patients by genotypes of XPF –357A>C polymorphism.

Table 3 Cox proportional hazard regression analysis for superficial bladder cancer patients' recurrence with XPF –357A>C polymorphism by selected variables using the second set of study subjects

Discussion

In a two-stage case–control analysis with resequencing for novel variants and functional analysis, we examined the associations between tagSNPs in the XPF gene and bladder cancer risk and found that the potentially functional promoter −357A>C polymorphism is likely to be associated with bladder cancer risk. Furthermore, we showed that bladder cancer patients with the −357AC/CC genotypes had a shorter recurrence-free survival.

As XPF (ERCC4) has a critical role in removal of the damaged DNA through an ERCC1–ERCC4 heterodimeric protein complex with endonuclease activity (Friedberg, 2001), several variants of the XPF gene have been studied for their associations with cancer risk, including bladder cancer (Garcia-Closas et al., 2006). However, most of the published studies focused on only a few SNPs within the coding regions, including Arg415Gln (Smith et al., 2003; Garcia-Closas et al., 2006) and Ser835Ser (Lee et al., 2005; Garcia-Closas et al., 2006; Shao et al., 2008). Recently, Han et al. (2008) reported the associations between nine XPF SNPs and breast cancer risk, including −644C>T (rs3136038) (Shao et al., 2008), and Milne et al. (2006) identified a common intronic XPF variant, rs744154, that protected against breast cancer. These findings suggested that although some potentially functional variants in the XPF gene may contribute to cancer susceptibility, other ‘non-functional’ variants, such as tagging markers, may be in LD with functional or disease-causing variants in the genome.

Our analysis showed that the promoter and the first intron of the XPF gene were highly conserved across species and that the SNPs in these regions may be in strong LD (Milne et al., 2006). Accumulating evidence suggests that polymorphisms in the promoter region can disrupt the binding of transcriptional regulators, leading to altered transcription (Bond et al., 2004; Sun et al., 2007). By resequencing the XPF promoter region, we found that the novel –357A>C polymorphism, in strong LD with the reportedly significant first-intron tagSNP rs744154 (Milne et al., 2006), could abolish the binding of several nuclear proteins, including EEF1G, SIRT4, REXO1 and HMGB2. There are two subgroups of HMGB proteins that differ in their DNA-binding specificity, and HMGB2 is a nuclear protein believed to significantly affect DNA interactions by altering nucleic acid flexibility (Watanabe et al., 1994; Bustin and Reeves, 1996; Bustin, 2001). Our experimental data showed that the promoter harboring the –357A in the binding site of HMGB2 had increased transcriptional activity, and this activity was completely reduced by HMGB2 siRNA. These data suggested that HMGB2 may facilitate transcription by either binding directly to the transcription factor or stabilizing DNA looping (Travers et al., 1994; Laser et al., 2000; Kruppa et al., 2001).

High-penetrance defects of genes in the NER pathway have been implicated in XP patients, who have a 1000-fold increased risk of cutaneous malignancies because of UV light-induced DNA damage (Kraemer et al., 1994). However, we found that although the promoter containing the −357A allele had increased promoter activity, no difference was observed between the −357A and −357C alleles after UV irradiation, suggesting that the −357A>C SNP was not associated with a differential response to UV irradiation, which is not an etiological factor for bladder cancer.

Owing to the unique role of XPF in the NER, insufficient DNA repair caused by the decreased XPF expression associated with the –357C allele may not be compensated by other NER enzymes, resulting in increased susceptibility to bladder cancer in carriers of this variant. Indeed, we found that the seemingly non-functional but important XPF rs744154 SNP in the first intron (Milne et al., 2006) was in complete LD with the promoter −357A>C polymorphism, and that this LD was highly conserved across species (Milne et al., 2006). We further found that the −357C allele was associated with decreased expression levels of both XPF mRNA and protein in bladder tumor tissues, a finding that provides some mechanistic support for the observed association between this variant and bladder cancer risk and recurrence.

Tobacco smoking is a known risk factor for bladder cancer (Cohen et al., 2000), and our results indicated that the risk associated with the XPF variant −357AC/CC genotypes was more pronounced in smokers who might have high levels of DNA damage induced by reactive oxygen species (Hecht, 2002), particularly in older men, which is consistent with a perceived longer exposure to smoking or unknown risk factors involved in the etiology of bladder cancer, such as certain chemical carcinogens in the workplace (Cohen et al., 2000) or environmental pollutants (Cohen and Johansson, 1992). These factors may interact with genotypes or act as potential confounders. Unfortunately, we did not have detailed information on these factors in these retrospective studies, and thus we were unable to perform further analysis; nor was our small sample size sufficiently large to identify significant gene–environment interactions. Therefore, future larger studies with more detailed environmental exposure data and inclusion of ethnically diverse populations are warranted.

Our results showed that the XPF −357A>C polymorphism could affect the recurrence of bladder cancer, which may add to the valuable molecular markers for predicting bladder cancer recurrence. Specifically, we showed a significant association between the novel XPF variant −357AC/CC genotypes and shorter overall recurrence-free survival time for bladder cancer, particularly for low-risk tumors and male smokers who may have accumulated more DNA damage, a finding consistent with the results reported by Gu et al. (2005). However, as the detectable effect size in our study was relatively small and the follow-up time was shorter than 60 months, the clinical utility of our findings should be interpreted with caution. Further larger, better-designed studies are needed to validate our findings for their potential clinical application.

In conclusion, we conducted a two-stage case–control analysis that identified the −357A>C polymorphism in the XPF promoter region as a susceptibility locus for both the risk and the recurrence of bladder cancer. Our further functional analysis of this variant may provide a ‘proof-of-principle’ approach for mechanistically studying variants of candidate genes in susceptibility to bladder cancer.

Materials and methods

Study subjects

The study design with two independent sets of subjects is summarized in Figure 3. The first set included 234 patients with histologically confirmed bladder transitional cell carcinoma and 254 cancer-free control subjects who were recruited from First Affiliated Hospital of Nanjing Medical University between January 2003 and May 2006 as previously described (Wang et al., 2008). The second validation set included 130 cases and 150 controls from Huai-An Affiliated Hospital of Nanjing Medical University in our ongoing study starting from May 2005. All controls were recruited from healthy subjects who were seeking health care in outpatient clinics at the same hospitals. We used a short questionnaire to obtain information about demographic and risk factors and frequency-matched the controls to the cases by age (±5 years) and sex. Over 85% of eligible cases and controls agreed and consented to participate in the study, each donating 5 ml of blood for genomic DNA extraction. Those subjects who had smoked >100 cigarettes in their lifetime were considered ever smokers, and the others were never smokers. The second-set patients were followed up for additional clinical information (that is, recurrences and their pathological grades and clinical stages) every 3 months by telephone calls and histological confirmation after the first visit to the hospitals. The distributions of selected characteristics of the cases and controls are also presented in Supplementary Table 1. The research protocol was approved by the institutional review board of Nanjing Medical University.

Figure 3

Study design and working model for investigating the association between the novel functional XPF −357A>C polymorphism and risk of bladder cancer.

SNP selection, identification and genotyping

We searched for all reported XPF variants, including 37 common SNPs (that is, minor allele frequency (MAF)≥0.05) in the HapMap database. The rs1799801 T>C (S835S) was directly selected from public SNP databases, because it was within the coding region (Shen et al., 2005). The rs744154 was forced into the selection, because it has been reported to be associated with breast cancer risk (Milne et al., 2006). We did not include other coding region SNPs (that is, rs1800067, rs1800124, rs2020955 and rs99802), because of their low MAF (<0.05) in Asian populations. The rs31870 was among a set of XPF tagSNPs selected with the following criteria: a minimal set of tagSNPs ensuring an Rh² of ≥0.8 to cover all haplotypes with a frequency of ≥5%, as evaluated by the tagSNPs program (Stram et al., 2003). As a result, we selected three tagSNPs (rs744154, rs31870 and rs1799801) with an MAF>0.05 from the Chinese population included in the HapMap database. The selected tagSNPs could accurately predict the common (MAF>0.05) haplotypes with a minimum Rh² of 0.936.

The selected XPF SNPs were genotyped by the PCR-restriction fragment length polymorphism (RFLP) method with positive (known genotypes) and negative (no DNA) samples on every 96-well plate. The genotypes were independently evaluated by two persons in a blind manner. Also, more than 10% of the samples were randomly selected for repeated assays, and the results were 100% concordant. In the first-set analysis, four DNA samples of the controls failed to generate reliable results; therefore, only 250 controls were included in the final first-set analysis.

Construction of reporter plasmids

We constructed two reporter plasmids encompassing the XPF promoter region amplified from two genomic DNA samples with different alleles. The amplified fragments were then cloned into the pGL3-basic vector (Promega, Madison, WI, USA). After cloning, the vectors were sequenced to confirm the orientation and integrity of each construct's inserts.

Transient transfections, luciferase assays and ultraviolet radiation

HeLa, NIH-3T3 and T24 cells were seeded in 24-well plates, and each well was transfected with 2.25 μg of the vector DNA or 20 pmol HMGB2 siRNA by Lipofectamine 2000 (Invitrogen, Carlsbad, CA, USA). After transfection for 4 h, cells were treated with or without ultraviolet (UV) irradiation (10 J/m²) for 5 min. Luciferase activity was measured with a dual luciferase reporter assay system (Promega, Madison, WI, USA).

Electrophoretic mobility shift assay

EMSA was performed with synthetic 3′ biotin-labeled oligonucleotides and cell nuclear extracts by using the LightShift Chemiluminescent EMSA Kit (Pierce, Rockford, IL, USA). For each gel shift reaction (10 μl), a total of 20 fmol labeled probe was combined with 1 μg nuclear extract prepared from NIH-3T3 cells. The band positions of the biotin-labeled probe in the membrane were detected according to the manufacturer's instructions.

DNA-binding protein purification

The DNA-binding proteins in the XPF −357A>C region were purified by using the DNA-binding protein purification Kit (Roche Diagnostics, Indianapolis, IN, USA). The concatemeric oligonucleotides containing the −357A allele sequence were prepared by self-primed PCR with the primers. The eluted protein solutions were dialyzed and analyzed by SDS–PAGE.

MALDI-TOF MS analysis

The purified DNA-binding protein was in-gel digested with trypsin, and mass spectrometry was performed using a Bruker Biflex IV MALDI-TOF MS (Bruker Daltonics, Bremen, Germany). Data were screened against the NCBInr or Swiss-Prot databases using the MASCOT search program (http://www.matrixscience.com/) with the following search parameters: 100 p.p.m. for the precursor ion and 0.3 Da for the fragment ions.

Real-time analysis of XPF mRNA

To further detect the correlation between the XPF mRNA levels and –357A>C polymorphism, 22 bladder tumor tissues with different genotypes were subjected to extraction of total RNA by using the Trizol Reagent (Invitrogen Inc., USA). Each single-strand cDNA was diluted for subsequent PCR amplification of the XPF and β-actin genes, the latter being used as an internal quantitative control.

Immunohistochemistry

We also used representative paraffin sections of primary bladder tumor tissues from 55 patients, which were from the second-set analysis. All steps of the immunohistochemistry were performed using the Benchmark Automate (Ventana, Tucson, AZ, USA). The slides were incubated with the anti-XPF antibody (Bioworld, Dublin, OH, USA). The evaluation of XPF nuclear and cytoplasmic expression was made blindly by two independent pathologists simultaneously. The sum of the intensity and percentage scores was used as the final XPF nuclear or cytoplasmic staining score, and defined as follows: 0–2, negative or weak; 3–4, moderate; and 5–6, strong.
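
The scoring rule above can be written as a small helper, shown here only as a sketch. The function name is hypothetical, and the assumption that the intensity and percentage scores each range from 0 to 3 is ours; the text states only the categories for their sum.

```python
def xpf_staining_category(intensity_score, percentage_score):
    """Map the summed staining score (0-6) to the categories used in the text.

    Assumption: each component score is an integer from 0 to 3.
    """
    total = intensity_score + percentage_score
    if not 0 <= total <= 6:
        raise ValueError("summed score must lie between 0 and 6")
    if total <= 2:
        return "negative or weak"
    if total <= 4:
        return "moderate"
    return "strong"


# Example: intensity 3 and percentage 2 sum to 5
example_category = xpf_staining_category(3, 2)
```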

Statistical analysis

The chi-square test was used to compare differences in the frequency distributions of selected demographic variables and known risk factors such as tobacco smoking, as well as XPF alleles and genotypes, between cases and controls. The Hardy–Weinberg equilibrium of the control genotype distributions was tested by a goodness-of-fit χ²-test. Unconditional univariate and multivariate logistic regression analyses were performed to obtain crude and adjusted ORs and their 95% CIs. In the first-set screening test, SNPSpD, which accounts for the correlation (LD) among markers when correcting P-values, was used to control the inflation of the type I error rate in multiple testing (Nyholt, 2004). Recurrence-free survival by genotype was estimated using the Kaplan–Meier method and compared by the log-rank test. The HR for risk of recurrence was estimated from a multivariate Cox proportional hazards model, with adjustment for age, sex, pack-years smoked and tumor grade. As all patients were treated by transurethral resection of bladder tumors and all superficial bladder cancer patients underwent chemotherapy including mitomycin and pirarubicin, but not immunotherapy with Bacillus Calmette-Guerin, the effect of intravesical therapy on disease recurrence could not be adjusted for in the analysis. Stratified analysis was also performed by age, sex, pack-years smoked and tumor grade. All tests were two-sided and were performed using the SAS software (version 9.1; SAS Institute Inc., Cary, NC, USA).
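
The Hardy–Weinberg goodness-of-fit test mentioned above can be sketched as follows. This is an illustrative implementation under the usual assumption of a biallelic locus with one degree of freedom, not the SAS procedure used in the study; the function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium (1 df).

    Assumes both alleles are present; otherwise an expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts, NOT data from this study
chi2_stat, hwe_p = hwe_chi_square(160, 190, 50)
```

A P-value above 0.05, as for these invented counts, is read as no evidence of departure from Hardy–Weinberg equilibrium.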

Conflict of interest

The authors declare no conflict of interest.