Abstract
Genomic analysis of tumours has led to the identification of hundreds of cancer genes on the basis of the presence of mutations in protein-coding regions. By contrast, much less is known about cancer-causing mutations in non-coding regions. Here we perform deep sequencing in 360 primary breast cancers and develop computational methods to identify significantly mutated promoters. Clear signals are found in the promoters of three genes. FOXA1, a known driver of hormone-receptor positive breast cancer, harbours a mutational hotspot in its promoter leading to overexpression through increased E2F binding. RMRP and NEAT1, two non-coding RNA genes, carry mutations that affect protein binding to their promoters and alter expression levels. Our study shows that promoter regions harbour recurrent mutations in cancer with functional consequences and that the mutations occur at similar frequencies as in coding regions. Power analyses indicate that more such regions remain to be discovered through deep sequencing of adequately sized cohorts of patients.
References
Fredriksson, N. J., Ny, L., Nilsson, J. A. & Larsson, E. Systematic analysis of noncoding somatic mutations and gene expression alterations across 14 tumor types. Nat. Genet. 46, 1258–1263 (2014)
Weinhold, N., Jacobsen, A., Schultz, N., Sander, C. & Lee, W. Genome-wide analysis of noncoding regulatory mutations in cancer. Nat. Genet. 46, 1160–1165 (2014)
Araya, C. L. et al. Identification of significantly mutated regions across cancer types highlights a rich landscape of functional molecular alterations. Nat. Genet. 48, 117–125 (2016)
Melton, C., Reuter, J. A., Spacek, D. V. & Snyder, M. Recurrent somatic mutations in regulatory regions of human cancer genomes. Nat. Genet. 47, 710–716 (2015)
Nik-Zainal, S. et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature 534, 47–54 (2016)
Lawrence, M. S. et al. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505, 495–501 (2014)
Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013)
Lawrence, M. S. et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499, 214–218 (2013)
Nik-Zainal, S. et al. Mutational processes molding the genomes of 21 breast cancers. Cell 149, 979–993 (2012)
Ciriello, G. et al. Comprehensive molecular portraits of invasive lobular breast cancer. Cell 163, 506–519 (2015)
The Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70 (2012)
Ellis, M. J. et al. Whole-genome analysis informs breast cancer response to aromatase inhibition. Nature 486, 353–360 (2012)
Stephens, P. J. et al. The landscape of cancer genes and mutational processes in breast cancer. Nature 486, 400–404 (2012)
Banerji, S. et al. Sequence analysis of mutations and translocations across breast cancer subtypes. Nature 486, 405–409 (2012)
Getz, G. et al. Comment on “The consensus coding sequences of human breast and colorectal cancers”. Science 317, 1500 (2007)
Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013)
Roberts, S. A. et al. An APOBEC cytidine deaminase mutagenesis pattern is widespread in human cancers. Nat. Genet. 45, 970–976 (2013)
Kim, J. et al. Somatic ERCC2 mutations are associated with a distinct genomic signature in urothelial tumors. Nat. Genet. 48, 600–606 (2016)
Carter, S. L. et al. Absolute quantification of somatic DNA alterations in human cancer. Nat. Biotechnol. 30, 413–421 (2012)
Horn, S. et al. TERT promoter mutations in familial and sporadic melanoma. Science 339, 959–961 (2013)
Huang, F. W. et al. Highly recurrent TERT promoter mutations in human melanoma. Science 339, 957–959 (2013)
Huang, W. et al. DDX5 and its associated lncRNA Rmrp modulate TH17 cell effector functions. Nature 528, 517–522 (2015)
Standaert, L. et al. The long noncoding RNA Neat1 is required for mammary gland development and lactation. RNA 20, 1844–1849 (2014)
Carroll, J. S. et al. Chromosome-wide mapping of estrogen receptor binding reveals long-range regulation requiring the Forkhead protein FoxA1. Cell 122, 33–43 (2005)
Hurtado, A., Holmes, K. A., Ross-Innes, C. S., Schmidt, D. & Carroll, J. S. FOXA1 is a key determinant of estrogen receptor function and endocrine response. Nat. Genet. 43, 27–33 (2011)
Ross-Innes, C. S. et al. Differential oestrogen receptor binding is associated with clinical outcome in breast cancer. Nature 481, 389–393 (2012)
Badve, S. et al. FOXA1 expression in breast cancer–correlation with luminal subtype A and survival. Clin. Cancer Res. 13, 4415–4421 (2007)
Thorat, M. A. et al. Forkhead box A1 expression in breast cancer is associated with luminal subtype and good prognosis. J. Clin. Pathol. 61, 327–332 (2008)
Mehta, R. J. et al. FOXA1 is an independent prognostic marker for ER-positive breast cancer. Breast Cancer Res. Treat. 131, 881–890 (2012)
Fu, X. et al. FOXA1 overexpression mediates endocrine resistance by altering the ER transcriptome and IL-8 expression in ER-positive breast cancer. Proc. Natl Acad. Sci. USA 113, E6600–E6609 (2016)
Jeselsohn, R. et al. TransCONFIRM: identification of a genetic signature of response to fulvestrant in advanced hormone receptor-positive breast cancer. Clin. Cancer Res. 22, 5755 (2016)
Fisher, S. et al. A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries. Genome Biol. 12, R1 (2011)
Gnirke, A. et al. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat. Biotechnol. 27, 182–189 (2009)
Pugh, T. J., Banerji, S. & Meyerson, M. Pugh et al. reply. Nature 520, E12–E14 (2015)
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009)
Cibulskis, K. et al. ContEst: estimating cross-contamination of human samples in next-generation sequencing data. Bioinformatics 27, 2601–2602 (2011)
Costello, M. et al. Discovery and characterization of artifactual mutations in deep coverage targeted capture sequencing data due to oxidative DNA damage during sample preparation. Nucleic Acids Res. 41, e67 (2013)
Ramos, A. H. et al. Oncotator: cancer variant annotation tool. Hum. Mutat. 36, E2423–E2429 (2015)
Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, R41 (2011)
Landau, D. A. et al. Mutations driving CLL and their evolution in progression and relapse. Nature 526, 525–530 (2015)
Purcell, S. M. et al. A polygenic burden of rare disruptive mutations in schizophrenia. Nature 506, 185–190 (2014)
Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006)
Gonzalez-Perez, A. & Lopez-Bigas, N. Functional impact bias reveals cancer drivers. Nucleic Acids Res. 40, e169 (2012)
Lochovsky, L., Zhang, J., Fu, Y., Khurana, E. & Gerstein, M. LARVA: an integrative framework for large-scale analysis of recurrent variants in noncoding annotations. Nucleic Acids Res. 43, 8123–8134 (2015)
Martincorena, I. et al. Tumor evolution. High burden and pervasive positive selection of somatic mutations in normal human skin. Science 348, 880–886 (2015)
Dees, N. D. et al. MuSiC: identifying mutational significance in cancer genomes. Genome Res. 22, 1589–1598 (2012)
Geyer, C. J. & Meeden, G. D. Fuzzy and randomized confidence intervals and P values. Stat. Sci. 20, 358–366 (2005)
Routledge, R. Practicing safe statistics with the mid-p. Can. J. Stat. 22, 103–110 (1994)
Kamburov, A. et al. Comprehensive assessment of cancer missense mutation clustering in protein structures. Proc. Natl Acad. Sci. USA 112, E5486–E5495 (2015)
Getz, G., Gould, J. & Monti, S. Boosting permutation tests for marker selection. Broad Institute publications http://www.broadinstitute.org/mpr/publications/projects/Computational_Biology/GetzGouldMonti.pdf (2006)
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Stat. Soc. B 57, 289–300 (1995)
The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012)
Matys, V. et al. TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res. 31, 374–378 (2003)
Sandelin, A., Alkema, W., Engström, P., Wasserman, W. W. & Lenhard, B. JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res. 32, D91–D94 (2004)
Hallikas, O. et al. Genome-wide prediction of mammalian enhancers based on analysis of transcription-factor binding affinity. Cell 124, 47–59 (2006)
Jolma, A. et al. Multiplexed massively parallel SELEX for characterization of human transcription factor binding specificities. Genome Res. 20, 861–873 (2010)
Jolma, A. et al. DNA-binding specificities of human transcription factors. Cell 152, 327–339 (2013)
Wei, G. H. et al. Genome-wide analysis of ETS-family DNA-binding in vitro and in vivo. EMBO J. 29, 2147–2160 (2010)
Touzet, H. & Varré, J. S. Efficient and accurate P value computation for position weight matrices. Algorithms Mol. Biol. 2, 15 (2007)
The Cancer Genome Atlas Research. Comprehensive, integrative genomic analysis of diffuse lower-grade gliomas. N. Engl. J. Med. 372, 2481–2498 (2015)
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009)
Cowper-Sal lari, R. et al. Breast cancer risk–associated SNPs modulate the affinity of chromatin for FOXA1 and alter gene expression. Nat. Genet. 44, 1191–1198 (2012)
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010)
Fuerer, C. & Nusse, R. Lentiviral vectors to probe and manipulate the Wnt signaling pathway. PLoS ONE 5, e9370 (2010)
Cao, L. et al. Independent binding of the retinoblastoma protein and p107 to the transcription factor E2F. Nature 355, 176–179 (1992)
Hallstrom, T. C. & Nevins, J. R. Specificity in the activation and control of transcription factor E2F-dependent apoptosis. Proc. Natl Acad. Sci. USA 100, 10848–10853 (2003)
Lazzerini Denchi, E. & Helin, K. E2F1 is crucial for E2F-dependent apoptosis. EMBO Rep. 6, 661–668 (2005)
Dick, F. A. & Dyson, N. pRB contains an E2F1-specific binding domain that allows E2F1-induced apoptosis to be regulated separately from other E2F activities. Mol. Cell 12, 639–649 (2003)
Coser, K. R. et al. Antiestrogen-resistant subclones of MCF-7 human breast cancer cells are derived from a common monoclonal drug-resistant progenitor. Proc. Natl Acad. Sci. USA 106, 14536–14541 (2009)
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011)
Acknowledgements
We thank the patients who contributed samples to this study. This study was a collaboration of the Broad Institute in Cambridge, Massachusetts, USA, and the National Institute of Genomic Medicine (INMEGEN) in Mexico City, Mexico. The work was conducted as part of the Slim Initiative in Genomic Medicine for the Americas (SIGMA), a project funded by the Carlos Slim Foundation in Mexico. We are grateful to S. Romero-Cordoba, R. Rebollar, and L. Alfaro-Ruiz for sample collection and processing. We thank the Broad Institute Genomics Platform and Target Accelerator for assistance; N. Dyson for assistance with E2F experiments; A. Kamburov and D. Rosebrock for computational help; M. Snyder, J. Reuter, and C. Cenik for discussion on TBC1D12; and S. Nik-Zainal for data access guidance. E.R., M.R., A.T.W., C.S., M.C., and J.S.B. were partly funded by SIGMA. J.M.E. was supported by the Fannie and John Hertz Foundation. P.P. and A.B. were partly funded by the Massachusetts General Hospital startup funds of G.G. G.G. was partly funded by the Paul C. Zamecnick, MD, Chair in Oncology at Massachusetts General Hospital.
Author information
Contributions
G.G., M.M., T.R.G., and E.S.L. conceived and designed the study. A.H.-M., S.R.-C., J.B., and L.W.E. contributed patient samples. E.R. and G.G. designed analysis and developed methods. E.R., J.K., G.T., A.T.-W., and P.S. performed data analysis. P.P., J.G., J.M.E., T.S., Z.Z., J.L., and E.R. performed experiments. M.S.L., J.H., M.R., T.J.P., Y.E.M., and C.S. contributed data and analysis tools. M.L.C., S.S., C.C., and A.T. provided project management. G.G., S.B.G., J.S.B., M.M., A.J.I., A.B., T.R.G., and E.S.L. provided project leadership. E.R., E.S.L., and G.G. wrote the manuscript.
Ethics declarations
Competing interests
Competing financial interests: A.J.I. holds equity in and receives royalties from ArcherDx.
Additional information
Reviewer Information: Nature thanks J. Carroll and the other anonymous reviewer(s) for their contribution to the peer review of this work.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Figure 1 Patient cohort characteristics.
a, Comprehensive overview of coding and non-coding mutations in 360 breast cancer samples assayed on the ExomePlus platform. Samples are ordered on the basis of the promoter mutation events, then by known breast cancer coding drivers. b, Copy number profiles for 360 breast cancers.
Extended Data Figure 2 Targeted validation of promoter mutations.
a, Targeted sequencing validation of selected promoter mutations in 47 patients from the ExomePlus cohort with Illumina TruSeq Custom Amplicon panel (TSCA)-targeted sequencing technology. b, Validation rate of promoter mutations calculated as validated mutations over all sequenced and powered mutations. c, Median detection sensitivity at mutated sites for significantly mutated promoters. Each point indicates a single mutated position. d, PCR-MiSeq for the FOXA1 promoter locus for 126 patients with sufficient coverage for mutation calling from the original ExomePlus cohort. Three out of four mutations validated in experiment (green and red bars). PCR-MiSeq for 140 patients included but not covered in original ExomePlus experiment and 64 additional tumours yielded three novel mutations in each set (light and dark blue bars). No germline mutations at this site were detected in normal samples.
Extended Data Figure 3 Bi-allelic hits for TBC1D12 and LEPROTL1 promoter mutations.
a, Sequencing read alignment for tumour BDD-162 shows location of TBC1D12 hotspot mutations on mutually exclusive alleles. b, Location of hotspot mutations near the LEPROTL1 transcription start on mutually exclusive sequencing reads in patient BDD-MEX-BR-116. Reference bases are indicated in grey, mismatched bases in their respective colours (A, green; C, blue; G, orange; T, red). Hotspot mutation sites are outlined with black boxes. Images generated with the Integrative Genomics Viewer (ref. 70).
Extended Data Figure 4 Characterization of TBC1D12 mutations.
a, TBC1D12 hotspot mutations are present in patients from TCGA (exome sequencing; numbers in parentheses indicate total number of patients). b, Exome hybrid capture alignment confirms mutual exclusivity of TBC1D12 mutations in a patient with bladder cancer (TCGA-C4-ACF1). Image generated with the Integrative Genomics Viewer (ref. 70). c, TBC1D12 genomic locus (hg19) depicting location of promoter region and overlap with MCF-7 breast cancer cell line DNase signal. Red bar indicates native promoter region and TBC1D12 5′ UTR included in the promoter mutation reporter assay construct. Zoomed-in region shows two upstream putative alternative translation start sites (methionine, highlighted in green) potentially giving rise to larger luciferase protein products. Multiple sequence alignment of amino-acid sequence in primates illustrates evolutionary conservation of upstream translation start sites and downstream protein sequence in most species. Image generated with the Integrative Genomics Viewer (ref. 70). d, Western blot of luciferase expressed from TBC1D12 and control reporter assay construct. Note that luciferase expressed from TBC1D12 construct is approximately 80 kDa larger than the control.
Extended Data Figure 5 Luciferase reporter assay and EMSA for additional promoter mutations.
a, EMSA shows gel shift for FOXA1 WT (lanes 1 and 2) and mutant (lanes 5 and 6) probes when incubated with HEK293T nuclear cell extract. WT FOXA1 competitor competes off protein from WT probes in a concentration-dependent manner (1 and 5 molar excess), but fails to do so for the mutant FOXA1 probe. Luciferase reporter assay and EMSA for WT and mutated probes in ZNF143 (b), LEPROTL1 (c), ALDOA (d), and TBC1D12 (e) show significantly decreased expression activity and a trend for loss of binding in promoter mutants (except for TBC1D12, where there is no binding). Individual data points in reporter assays (black) overlap summary statistic boxplots (grey) with median indicated by black horizontal line. P values calculated with two-sided Student’s t-test. Lanes 1 and 4 in each EMSA show biotinylated probes only. Lanes 2 and 5 show that addition of HEK293T nuclear extract induces a mobility shift of the biotinylated WT and mutant probes, indicating protein binding to the probe. Gel shift is prevented by the addition of excess matched unlabelled probes (lanes 3 and 6). No binding occurs for either WT or mutant probes in the TBC1D12 promoter (e), suggesting that these mutations do not affect transcriptional regulation from DNA.
Extended Data Figure 6 Increased binding of E2F/DP1 to the mutant FOXA1 promoter.
a, Immunoblot for haemagglutinin (HA)-tagged E2F3 and DP1 shows binding of both proteins in HEK293T cells transfected with either WT or mutant FOXA1 promoter luciferase construct. Immunoblot against tubulin serves as loading control. b, EMSA for HEK293T cells transfected with E2F3/DP1 expression constructs. EMSA was then performed for FOXA1 WT (lanes 1–3) and mutant (lanes 4–6) promoter probes. Ectopic expression of E2F3/DP1 increases nuclear protein binding signal to the mutant promoter compared with WT (compare lane 6 with lane 3), suggesting that the increase in binding observed in mutant over WT is at least in part because of increased recruitment of the E2F/DP1 complex.
Extended Data Figure 7 IGR analysis.
a, Motif instances overlapping open chromatin in MCF-7 cells were considered for analysis (example of FOXA1 is shown). b, E2F1 average ChIP-seq signal from MCF-7 cells at WT, mutant, and control scramble motif locations measured in a 400 bp region surrounding motifs. Grey lines, 95% confidence interval.
Extended Data Figure 8 Stable overexpression of FOXA1 in MCF-7 cells.
MCF-7 cells stably transfected with FOXA1 show strong FOXA1 overexpression compared with MCF-7 cells transfected with empty vector.
Extended Data Figure 9 Discovery power in TCGA data set.
Discovery power of TCGA breast cancer whole genomes (100 patients) with median detection sensitivity of 93%. Black vertical line indicates power values for 100 patients. Horizontal red line demarcates 90% power.
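As a rough illustration of what a discovery-power curve of this kind measures, the sketch below computes binomial power for a single promoter. It is not the authors' method: the function names, the per-patient background probability, and the significance threshold are all hypothetical, and only the cohort size (100 patients), the 93% detection sensitivity, and the 90% power line come from the caption.

```python
from math import comb


def binom_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def discovery_power(n_patients, driver_freq, sensitivity, background_prob, alpha):
    """Power to call one promoter significantly mutated (illustrative model).

    background_prob: assumed chance that a patient carries a passenger
    mutation in the promoter; alpha: assumed significance threshold.
    """
    # Per-patient probability of a detected mutation under the null.
    p0 = background_prob * sensitivity
    # Under the alternative, a fraction driver_freq of patients also
    # carries a driver mutation, detected with the same sensitivity.
    p1 = 1 - (1 - p0) * (1 - driver_freq * sensitivity)
    # Smallest mutated-patient count whose null tail probability is <= alpha.
    # k = n_patients + 1 always qualifies, because its tail is an empty sum.
    k_crit = next(
        k for k in range(n_patients + 2)
        if binom_sf(k, n_patients, p0) <= alpha
    )
    return binom_sf(k_crit, n_patients, p1)


if __name__ == "__main__":
    # Hypothetical inputs: 5% driver frequency, 1e-3 background, alpha 1e-6.
    for n in (100, 360, 1000):
        power = discovery_power(n, 0.05, 0.93, 1e-3, 1e-6)
        print(n, round(power, 3), "meets 90% power" if power >= 0.9 else "underpowered")
```

Under these assumed inputs, power rises with cohort size, which is the qualitative point of the figure; the actual curves depend on the background model and multiple-testing correction used in the paper.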
Extended Data Figure 10 Lack of association between promoter mutation rate in ExomePlus cohort and covariates shown to correlate with mutation rate in coding genes.
Each bin represents a covariate quintile, and mutation rates are aggregates over all promoters in each bin. Error bars, s.d. of 1,000 bootstrap simulations. H3K4me1 signal from ENCODE breast luminal epithelial cells.
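The binning and error-bar procedure described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the tuple layout `(covariate, n_mutations, n_bases)`, the function names, and the choice to resample promoters within each bin are hypothetical; the five quintile bins, the aggregate rate per bin, and the 1,000 bootstrap replicates follow the caption.

```python
import random
import statistics


def aggregate_rate(promoters):
    """Pooled mutation rate: total mutations over total bases in the bin."""
    return sum(p[1] for p in promoters) / sum(p[2] for p in promoters)


def quintile_rates(promoters, n_boot=1000, seed=0):
    """Return [(rate, bootstrap_sd), ...] for the five covariate quintiles.

    promoters: list of (covariate, n_mutations, n_bases) tuples
    (hypothetical layout); requires at least five promoters.
    """
    rng = random.Random(seed)
    ranked = sorted(promoters, key=lambda p: p[0])
    n = len(ranked)
    results = []
    for q in range(5):
        members = ranked[q * n // 5:(q + 1) * n // 5]
        rate = aggregate_rate(members)
        # Resample promoters with replacement within the bin and recompute
        # the pooled rate; the s.d. of these replicates is the error bar.
        boots = [
            aggregate_rate(rng.choices(members, k=len(members)))
            for _ in range(n_boot)
        ]
        results.append((rate, statistics.pstdev(boots)))
    return results


if __name__ == "__main__":
    gen = random.Random(1)
    # Synthetic promoters: random covariate, 0-3 mutations, 1,000 covered bases.
    demo = [(gen.random(), gen.randint(0, 3), 1000) for _ in range(200)]
    for i, (rate, sd) in enumerate(quintile_rates(demo), start=1):
        print(f"quintile {i}: rate={rate:.5f} sd={sd:.5f}")
```

If the pooled rates of the five bins agree within their bootstrap error bars, the covariate shows no evident association with promoter mutation rate, which is the reading the figure title gives.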
Supplementary information
Supplementary Information
This file contains a Supplementary Note, Supplementary References and Supplementary Figure 1, the uncropped blots. (PDF 3408 kb)
Supplementary Data
This file contains Supplementary Tables 1-8. (XLSX 1936 kb)
Cite this article
Rheinbay, E., Parasuraman, P., Grimsby, J. et al. Recurrent and functional regulatory mutations in breast cancer. Nature 547, 55–60 (2017). https://doi.org/10.1038/nature22992
DOI: https://doi.org/10.1038/nature22992