Abstract
CpG islands (CGIs) are prominent in the mammalian genome owing to their GC-rich base composition and high density of CpG dinucleotides1,2. Most human gene promoters are embedded within CGIs that lack DNA methylation and coincide with sites of histone H3 lysine 4 trimethylation (H3K4me3), irrespective of transcriptional activity3,4. In spite of these intriguing correlations, the functional significance of non-methylated CGI sequences with respect to chromatin structure and transcription is unknown. By performing a search for proteins that are common to all CGIs, here we show high enrichment for Cfp1, which selectively binds to non-methylated CpGs in vitro5,6. Chromatin immunoprecipitation of a mono-allelically methylated CGI confirmed that Cfp1 specifically associates with non-methylated CpG sites in vivo. High throughput sequencing of Cfp1-bound chromatin identified a notable concordance with non-methylated CGIs and sites of H3K4me3 in the mouse brain. Levels of H3K4me3 at CGIs were markedly reduced in Cfp1-depleted cells, consistent with the finding that Cfp1 associates with the H3K4 methyltransferase Setd1 (refs 7, 8). To test whether non-methylated CpG-dense sequences are sufficient to establish domains of H3K4me3, we analysed artificial CpG clusters that were integrated into the mouse genome. Despite the absence of promoters, the insertions recruited Cfp1 and created new peaks of H3K4me3. The data indicate that a primary function of non-methylated CGIs is to genetically influence the local chromatin modification state by interaction with Cfp1 and perhaps other CpG-binding proteins.
Main
To characterize the chromatin modifications typical of CGIs, we used the methyl-CpG-sensitive restriction endonuclease HinPI (cleavage site GCGC) to release small chromatin fragments from purified brain nuclei, as described previously9. As sites for this enzyme in bulk chromatin are rare and generally uncleavable owing to DNA methylation, the released fraction predominantly contains non-methylated CGIs. Confirming this, further digestion of the deproteinized DNA with HpaII (cleavage site CCGG) specifically collapsed the nucleosomal ladder generated by HinPI, but had little effect on DNA released from bulk chromatin with MseI (cleavage site TTAA; Supplementary Fig. 1)9. Western blotting confirmed that non-methylated CGI chromatin is enriched for histone modifications associated with actively transcribed genes (acetylated histone H3, H3K4me3 and H3K4me2) compared with bulk chromatin (Fig. 1a). In contrast, CGI chromatin was depleted for marks not found at active promoters: H3K36me3, H3K9me3, H3K27me3 and H4K20me3 (Fig. 1a). Agreement between these results and genome-wide studies of chromatin modifications3,4,10,11 indicated that this fraction could be used to identify proteins that preferentially localize to non-methylated CGIs. We first tested CXXC finger protein 1 (Cfp1), which binds to non-methylated CpG dinucleotides in vitro by a CXXC zinc finger domain6,12. The data showed that Cfp1 is enriched within the CGI fraction of the genome (Fig. 1a). Similarly, Kdm2a, an H3K36 demethylase that also contains a CXXC domain13, was enriched in the CGI fraction.
Focusing on Cfp1, we tested its in vivo binding specificity by chromatin immunoprecipitation (ChIP) at an endogenous CGI that is present in both methylated and non-methylated states. The Xist CGI is mono-allelically methylated in female cells, but fully methylated in males, which only have one X chromosome14. ChIP analysis of mouse brain tissue identified a peak of Cfp1 binding over the Xist CGI in females, but no peak was present in males, suggesting that Cfp1 exclusively binds to the non-methylated allele (Fig. 1b). To test this more stringently, we used bisulphite sequencing across the Xist locus to determine the methylation status of the immunoprecipitated chromatin recovered from females. As expected, input DNA comprised equal numbers of methylated and non-methylated DNA clones. DNA immunoprecipitated by the Cfp1 antibody, however, was almost exclusively non-methylated (96%), whereas DNA immunoprecipitated with an antibody against the methyl-CpG-binding protein MeCP2 (refs 15–17) was predominantly methylated (88%; Fig. 1c). We conclude that Cfp1 selectively binds to non-methylated CpGs in vivo.
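The clone-counting logic behind the bisulphite percentages above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes clones are already aligned to the reference and of equal length, and the function names and the 0.5 threshold for calling a clone "methylated" are hypothetical choices made here.

```python
def cpg_sites(reference):
    """Return 0-based positions of the cytosine in each CpG of the reference."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def clone_methylation(reference, clone):
    """Fraction of CpG sites read as C (methylated) in one aligned clone.

    After bisulphite conversion a non-methylated C reads as T, whereas a
    methylated C still reads as C. Other bases at a CpG site are ignored.
    """
    calls = [clone[i].upper() for i in cpg_sites(reference)]
    calls = [base for base in calls if base in ("C", "T")]
    if not calls:
        return None
    return calls.count("C") / len(calls)


def percent_methylated_clones(reference, clones, threshold=0.5):
    """Percentage of clones whose CpG methylation exceeds the threshold.

    The threshold is an assumption of this sketch, not a value from the paper.
    """
    levels = [clone_methylation(reference, clone) for clone in clones]
    levels = [level for level in levels if level is not None]
    if not levels:
        raise ValueError("no clone contained an informative CpG site")
    methylated = sum(1 for level in levels if level > threshold)
    return 100.0 * methylated / len(levels)


if __name__ == "__main__":
    # Toy data only: two fully methylated and two fully converted clones.
    reference = "ACGTTCGAACGT"
    clones = ["ACGTTCGAACGT", "ATGTTTGAATGT", "ACGTTCGAACGT", "ATGTTTGAATGT"]
    print(percent_methylated_clones(reference, clones))
```

With an equal mix of methylated and non-methylated clones, as in the toy input, the function reports half of the clones as methylated, which mirrors the pattern described for input DNA from a mono-allelically methylated locus.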
To test whether Cfp1 is concentrated at non-methylated CpGs within CGIs, we analysed the genome-wide distribution of Cfp1 using high-throughput DNA sequencing of immunoprecipitated DNA (ChIP-Seq). Prominent peaks of Cfp1 binding co-localized with non-methylated CGIs (Fig. 2a), 81% of which were Cfp1-associated. Cfp1 has been identified as part of the Setd1 H3K4 methyltransferase complex8 and ChIP-Seq with H3K4me3 antibodies showed that 93% of Cfp1-bound CGIs also possess this histone modification (Fig. 2b and Supplementary Table 1). Consistent with the possibility that Cfp1 binding is responsible for recruiting the Setd1 complex to these sites, Cfp1-negative non-methylated CGIs (19% of the total) also lack H3K4me3 (Fig. 2b). Despite being rich in non-methylated CpGs, these CGIs are somehow refractory to Cfp1 binding. One potential explanation came from alignment with the published18 distribution of the polycomb-associated mark H3K27me3 (ref. 19) in mouse brain. More than half (58%) of Cfp1-negative and H3K4me3-negative CGIs contained the H3K27 modification (Fig. 2a, b and Supplementary Fig. 2). In these cases H3K27me3 and polycomb binding may render a CpG island refractory to Cfp1 binding and to H3K4 methylation.
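Percentages such as the fraction of CGIs associated with a Cfp1 peak come down to an interval-overlap count. The sketch below shows one simple way to compute such a fraction; it is not the custom in-house tooling mentioned in the Methods. It assumes half-open (start, end) coordinates, counts any overlap of at least one base as an association, and uses hypothetical function names and toy coordinates.

```python
def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def fraction_overlapping(islands, peaks):
    """Fraction of islands that overlap at least one peak.

    Both arguments are lists of (chromosome, start, end) tuples.
    """
    if not islands:
        raise ValueError("islands must not be empty")
    # Group peaks by chromosome so each island is only compared locally.
    peaks_by_chrom = {}
    for chrom, start, end in peaks:
        peaks_by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, start, end in islands:
        candidates = peaks_by_chrom.get(chrom, [])
        if any(intervals_overlap(start, end, ps, pe) for ps, pe in candidates):
            hits += 1
    return hits / len(islands)


if __name__ == "__main__":
    # Toy coordinates for illustration only.
    islands = [("chr1", 100, 200), ("chr1", 500, 600),
               ("chr2", 50, 150), ("chrX", 10, 20)]
    peaks = [("chr1", 150, 250), ("chr2", 0, 60), ("chr1", 600, 700)]
    print(fraction_overlapping(islands, peaks))
```

The pairwise comparison is quadratic per chromosome, which is adequate for a sketch; a genome-scale analysis would more likely sort the intervals or use an interval tree.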
To assess the importance of Cfp1 for the recruitment of H3K4me3, we used stably expressed short hairpin RNAs (shRNAs) directed against Cfp1 to reduce its level in NIH3T3 cells. Single shRNAs reduced Cfp1 (Supplementary Fig. 3), but a combination of three gave a greater effect (Fig. 3a). Depleted cells showed altered morphology (Fig. 3b) and retarded growth (Fig. 3c). ChIP analysis revealed a loss of Cfp1 binding compared with vector-only transfected cells accompanied by a precipitous drop in levels of H3K4me3 across CGIs at the brain-derived neurotrophic factor (Bdnf), β-actin (Actb), c-Myc and Dlx5/6 genes (Fig. 3d). The same results were obtained with clones expressing each of two independent shRNA sequences, ruling out off-target effects of shRNA expression (Supplementary Fig. 3). As a further control, H3K27me3 profiles at the same loci were unaffected by depletion of Cfp1 (Fig. 3d and Supplementary Fig. 3b). The loss of H3K4me3 at four randomly selected CGI promoters in Cfp1-depleted cells argues that this modification is dependent on the presence of Cfp1.
Although Cfp1 binds non-methylated CpGs and seems to be required for H3K4 methylation at CGIs, it is possible that this reflects indirect recruitment of Setd1 by RNA polymerase II, which is present at active CGI promoters. Alignment of ChIP-Seq profiles for Cfp1, H3K4me3 and the unphosphorylated form of RNA polymerase II indeed showed co-localization of all three signals at 86% of all Cfp1-bound CGIs (Supplementary Table 1 and Supplementary Fig. 4). In a small proportion (7%) of cases, however, RNA polymerase II was undetectable, despite the presence of robust peaks of H3K4me3 and Cfp1 (Supplementary Fig. 4). This raised the possibility that RNA polymerase II may not be required and that Cfp1 binding is sufficient to direct H3K4 trimethylation. To test this hypothesis, we used embryonic stem (ES) cell lines in which artificial promoterless CpG-rich DNA sequences had been introduced into the genome at sites that normally lack H3K4me3. The DNA insert in ES line TβC44 (ref. 20) comprises a 720-base-pair (bp) enhanced green fluorescent protein (eGFP) coding sequence containing 60 CpGs (ref. 21) adjacent to a 600-bp puromycin-resistance gene with 93 CpGs (Fig. 4a). The inserted sequence has the typical CpG density of a CGI, but lacks a promoter. Bisulphite analysis showed that the integrated sequence is non-methylated (Fig. 4a). In the targeted cells, prominent domains of Cfp1 and H3K4me3 coincided with the inserted CpG-rich DNA (Fig. 4b). Interestingly, the peaks of H3K4me3 and Cfp1 tracked CpG density as expected if H3K4me3 is determined by this DNA dinucleotide sequence (Fig. 4b, broken line). No peak of RNA polymerase was detected. An independent ES cell line carrying an eGFP insertion on the X chromosome22 (Fig. 4c) also created a peak of H3K4me3 and Cfp1 (Fig. 4d). In this case, bisulphite sequencing showed that approximately a quarter of the integrated sequences were hypomethylated and the remainder were densely methylated (Fig. 4e, input panel). ChIP-bisulphite analysis demonstrated that Cfp1 and H3K4me3 antibodies significantly enriched the hypomethylated sequences (Fig. 4e). We conclude that clusters of non-methylated CpG are sufficient to recruit Cfp1 and create a peak of H3K4me3 modification, even in the absence of a promoter.
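The CpG density of the TβC44 insert can be worked out directly from the lengths and CpG counts quoted above, and a density profile along a sequence can be approximated with fixed windows. The following is an illustrative sketch only: the function names, window size and toy sequence are hypothetical, and how the CpG density trace in Fig. 4b was actually computed is not stated here.

```python
def cpgs_per_100bp(cpg_count, length_bp):
    """CpG dinucleotides per 100 bp for a sequence of the given length."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return 100.0 * cpg_count / length_bp


def windowed_cpg_counts(sequence, window, step):
    """Count CpGs in successive windows along a sequence.

    A CpG that straddles a window boundary is not counted, which is an
    accepted simplification in this sketch.
    """
    seq = sequence.upper()
    return [seq[start:start + window].count("CG")
            for start in range(0, len(seq) - window + 1, step)]


if __name__ == "__main__":
    # Lengths and CpG counts as quoted in the text for the TβC44 insert.
    egfp = cpgs_per_100bp(60, 720)
    puro = cpgs_per_100bp(93, 600)
    combined = cpgs_per_100bp(60 + 93, 720 + 600)
    print(round(egfp, 1), round(puro, 1), round(combined, 1))

    # Toy sequence, not the real insert.
    print(windowed_cpg_counts("CGCGAAAAAACG", window=4, step=4))
```

On the quoted figures, the eGFP segment carries roughly 8 CpGs per 100 bp and the puromycin-resistance segment roughly 15 to 16, so the two halves of the insert differ about twofold in CpG density.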
The density of non-methylated CpG is ∼50-fold higher in CGIs than in bulk genomic DNA, as CpG in the latter is deficient (20% of expected; ref. 23) and mostly methylated (∼70%). It is unclear whether this high CpG density arises as a passive consequence of events at promoters and has no functional significance, or whether it has been selected over evolutionary time because it facilitates transcription (or other DNA-related processes). Our results favour selection, as they indicate that CpG density per se can directly influence histone modification status by the recruitment of the Cfp1 protein and its associated Setd1 histone H3K4 methyltransferase complex. The ability of an exogenous promoter-less CpG-rich insertion to create de novo an H3K4me3 focus provides strong support for this notion. An attractive biological rationale for this phenomenon may be simplification of the large mammalian genome by the creation of ‘beacons’ of H3K4me3 that highlight CGI promoters within the genomic landscape1.
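A statement such as "20% of expected" refers to the ratio of observed CpG frequency to the frequency expected from base composition. The sketch below uses the widely used convention of observed CpGs multiplied by sequence length and divided by the product of the C and G counts; whether the cited genome analysis computed its figure in exactly this way is an assumption, and the function name is hypothetical.

```python
def cpg_observed_expected(sequence):
    """Observed/expected CpG ratio: (CpG count * length) / (C count * G count).

    Returns 0.0 when the sequence contains no C or no G, because the
    expected CpG count is then zero and the ratio is undefined.
    """
    seq = sequence.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    # Toy sequences for illustration only.
    for toy in ("CCGG", "CGCG", "CATG"):
        print(toy, cpg_observed_expected(toy))
```

Under this convention a ratio near 0.2 corresponds to the CpG deficiency described for bulk genomic DNA, whereas CpG-dense sequences give markedly higher values.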
Whether CpG clustering is sufficient to create stable non-methylated CGIs is uncertain. There is evidence that H3K4me3 is incompatible with de novo methylation as components of the DNA methyltransferase complex (Dnmt3L) are repelled by this modification24. In theory, therefore, Cfp1-bound CGIs should be intrinsically stable in the non-methylated state. Previous studies suggest, however, that transcription also has a role. Maintenance of non-methylated CGIs through the waves of de novo methylation in the early embryo depends on promoter function, as point mutations that prevent transcription factor binding without significantly reducing CpG density destroy the immunity of a CpG island to DNA methylation25,26. It follows that H3K4 methylation due to CpG clustering may not be sufficient to reliably perpetuate the non-methylated state. Indeed, more than half of cells carrying the promoter-less eGFP insertion at the Mecp2 locus had acquired dense methylation in ES cells despite the presence of a CpG cluster.
Our data suggest that chromatin modification need not arise secondarily as a result of, for example, transcriptional status, but can be determined genetically due to the sequence characteristics of the underlying DNA. In particular CpG, by virtue of its widely varying local densities and alternative modification states, has the properties of a signalling module that locally influences genome function. As shown here, DNA methylation-free CpG clusters can recruit Cfp1 and probably other CXXC domain proteins. Densely methylated CGIs, on the other hand, attract methyl-CpG-binding proteins, which in turn recruit enzymes that can reinforce repressive histone modifications17,27,28. Future studies of proteins that read and interpret CpG signals promise to shed further light on both genetic and epigenetic determinants of chromosome function.
Methods Summary
Mouse brain nuclei were incubated with restriction enzymes and then pelleted to liberate small fragments of CGI chromatin that were then analysed by western blotting. ChIP was performed on mature mouse brain with antibodies against various chromatin proteins. Immunoprecipitated DNA was used for: (1) quantitative PCR (qPCR) analysis of specific loci; (2) bisulphite analysis to determine DNA methylation patterns; and (3) ligation of linkers and Solexa sequencing to identify sites of binding. To knock down Cfp1, mouse NIH3T3 cells were transfected using shRNA targeting Cfp1 or vector alone as a control, and stable clones were selected by puromycin resistance. RNA and protein samples were prepared to verify the knockdown of Cfp1. For Fig. 3, three shRNAs were used in combination, but comparable results were obtained with two individual shRNAs (Supplementary Fig. 3). ChIP with antibodies against Cfp1, H3K4me3 and H3K27me3 determined the effect of Cfp1 knockdown at four loci using qPCR. The eGFP insert was targeted to the Mecp2 locus by homologous recombination of a construct containing a PGK-Neo cassette flanked by loxP sites. Cre-mediated recombination was used to delete the selectable marker before bisulphite and ChIP analysis.
Online Methods
Release of CGI chromatin
Nuclei were prepared from brains of 4-week-old mice as previously described31. Nuclear preparations were digested with a twofold excess of HinPI or MseI in a buffer containing 50 mM Tris-HCl, pH 8, 100 mM NaCl, 5 mM MgCl2, 0.1 mM EGTA and 1 mM β-mercaptoethanol. The released chromatin was retained in the supernatant after centrifugation at 3,800g for 5 min and the proteins were precipitated using trichloroacetic acid before western blot analysis.
Antibodies
Antibodies used are listed in Supplementary Table 2.
Chromatin immunoprecipitation and bisulphite sequencing
ChIP on brain tissue was performed as described17 using antibodies as shown in Supplementary Table 2. Most ChIP-qPCR profiles were replicated using independent Cfp1 antibodies. Illumina linkers were ligated in-house and Solexa sequencing was carried out on Illumina 2G Solexa sequencers, with two replicate lanes per biological sample. ChIP-Seq was analysed using custom bioinformatic tools generated in-house (see Supplementary Table 3 for the parameters used). ChIP using formaldehyde crosslinked NIH3T3 cells was performed as previously described32. Bisulphite sequencing was performed as described29. Real-time PCR was carried out with Quantace Sensimix Plus on a Biorad iCycler according to the manufacturer’s instructions (primer sequences are available on request).
Generation of stable Cfp1-knockdown cells
NIH3T3 cells were transfected using lipofectamine reagent (Invitrogen) with three independent pSuper vectors containing short hairpin constructs directed against Cfp1 (Oligoengine) or vector alone. Target sequences were as follows: target 986, 5′-GAAGGUGAAGCACGUGAAG-3′; target 1250, 5′-CAGCCAACCGAAUCUAUGA-3′; and target 1920, 5′-CUUCACCAAACGAUCCAAC-3′. Stable clones were selected for puromycin resistance. A combination of the three shRNAs reduced Cfp1 more robustly and was therefore used for the data in Fig. 3. Individual shRNAs gave comparable results by western blotting and ChIP (see Supplementary Fig. 3). RNA was extracted using Tri reagent (Sigma) and complementary DNA was prepared using reverse transcriptase (Promega). Expression levels were determined using real-time PCR analysis (primer sequences available on request).
ES cell lines
ES cell line TβC44 was generated by homologous recombination as described20. A Mecp2-eGFP knock-in targeting vector was constructed by sequential cloning of 5′ (5.3 kb) and 3′ (1.9 kb) regions of Mecp2 homology into peGFP-N1 (Clontech). A PGK-Neo cassette flanked by loxP sites was added to enable selection of transfected cells. Gene targeting was carried out in the ES cell line E14 TG2a to generate an insertion into the Mecp2 gene transcription unit at the junction between the open reading frame and the 3′ untranslated region. This construct was initially designed to create a MeCP2-eGFP fusion protein after transcription and translation. Cells were grown on gelatinized dishes in the presence of recombinant human LIF in Glasgow MEM (Invitrogen) supplemented with 10% FBS (Globepharm), 1× MEM non-essential amino acids, sodium pyruvate (1 mM) and β-mercaptoethanol (50 μM; all Invitrogen). ES cells (5 × 10⁷ cells) were transfected with linearized targeting vector (250 μg DNA in 0.8 ml HEPES buffered saline) by electroporation (800 V, 3 μF, BioRad Gene Pulser) and plated at 5 × 10⁶ cells per dish. Correctly targeted clones were first identified by PCR specific for homologous recombination. The integrity of the targeted locus was confirmed by Southern blot analyses. A single positive clone was transiently transfected with pCAGGS-CRE33 for the Cre-mediated deletion of the selectable marker and a recombinant clone was then used for this study.
References
Bird, A. P. CpG-rich islands and the function of DNA methylation. Nature 321, 209–213 (1986)
Suzuki, M. M. & Bird, A. DNA methylation landscapes: provocative insights from epigenomics. Nature Rev. Genet. 9, 465–476 (2008)
Guenther, M. G., Levine, S. S., Boyer, L. A., Jaenisch, R. & Young, R. A. A chromatin landmark and transcription initiation at most promoters in human cells. Cell 130, 77–88 (2007)
Bernstein, B. E. et al. A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell 125, 315–326 (2006)
Lee, J. H. & Skalnik, D. G. CpG-binding protein is a nuclear matrix- and euchromatin-associated protein localized to nuclear speckles containing human trithorax. Identification of nuclear matrix targeting signals. J. Biol. Chem. 277, 42259–42267 (2002)
Shin Voo, K., Carlone, D. L., Jacobsen, B. M., Flodin, A. & Skalnik, D. G. Cloning of a mammalian transcriptional activator that binds unmethylated CpG motifs and shares a CXXC domain with DNA methyltransferase, human trithorax, and methyl-CpG binding domain protein 1. Mol. Cell. Biol. 20, 2108–2121 (2000)
Lee, J. H. & Skalnik, D. G. CpG-binding protein (CXXC finger protein 1) is a component of the mammalian Set1 histone H3-Lys4 methyltransferase complex, the analogue of the yeast Set1/COMPASS complex. J. Biol. Chem. 280, 41725–41731 (2005)
Lee, J. H., Tate, C. M., You, J. S. & Skalnik, D. G. Identification and characterization of the human Set1B histone H3-Lys4 methyltransferase complex. J. Biol. Chem. 282, 13419–13428 (2007)
Tazi, J. & Bird, A. Alternative chromatin structure at CpG islands. Cell 60, 909–920 (1990)
Mikkelsen, T. S. et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448, 553–560 (2007)
Barski, A. et al. High-resolution profiling of histone methylations in the human genome. Cell 129, 823–837 (2007)
Lee, J. H., Voo, K. S. & Skalnik, D. G. Identification and characterization of the DNA binding domain of CpG-binding protein. J. Biol. Chem. 276, 44669–44676 (2001)
Tsukada, Y. et al. Histone demethylation by a family of JmjC domain-containing proteins. Nature 439, 811–816 (2006)
McDonald, L. E., Paterson, C. A. & Kay, G. F. Bisulfite genomic sequencing-derived methylation profile of the Xist gene throughout early mouse development. Genomics 54, 379–386 (1998)
Lewis, J. D. et al. Purification, sequence and cellular localisation of a novel chromosomal protein that binds to methylated DNA. Cell 69, 905–914 (1992)
Nan, X., Tate, P., Li, E. & Bird, A. DNA methylation specifies chromosomal localization of MeCP2. Mol. Cell. Biol. 16, 414–421 (1996)
Skene, P. J. et al. Neuronal MeCP2 is expressed at near histone-octamer levels and globally alters the chromatin state. Mol. Cell 37, 457–468 (2010)
Mikkelsen, T. S. et al. Dissecting direct reprogramming through integrative genomic analysis. Nature 454, 49–55 (2008)
Schwartz, Y. B. & Pirrotta, V. Polycomb complexes and epigenetic states. Curr. Opin. Cell Biol. 20, 266–273 (2008)
Chambers, I. et al. Nanog safeguards pluripotency and mediates germline development. Nature 450, 1230–1234 (2007)
Yang, T. T., Cheng, L. & Kain, S. R. Optimized codon usage and chromophore mutations provide enhanced sensitivity with the green fluorescent protein. Nucleic Acids Res. 24, 4592–4593 (1996)
Guy, J., Gan, J., Selfridge, J., Cobb, S. & Bird, A. Reversal of neurological defects in a mouse model of Rett syndrome. Science 315, 1143–1147 (2007)
Waterston, R. H. et al. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002)
Ooi, S. K. et al. DNMT3L connects unmethylated lysine 4 of histone H3 to de novo methylation of DNA. Nature 448, 714–717 (2007)
Brandeis, M. et al. Sp1 elements protect a CpG island from de novo methylation. Nature 371, 435–438 (1994)
Macleod, D., Charlton, J., Mullins, J. & Bird, A. P. Sp1 sites in the mouse aprt gene promoter are required to prevent methylation of the CpG island. Genes Dev. 8, 2282–2292 (1994)
Lorincz, M. C., Dickerson, D. R., Schmitt, M. & Groudine, M. Intragenic DNA methylation alters chromatin structure and elongation efficiency in mammalian cells. Nature Struct. Mol. Biol. 11, 1068–1075 (2004)
Lorincz, M. C., Schubeler, D. & Groudine, M. Methylation-mediated proviral silencing is associated with MeCP2 recruitment and localized histone H3 deacetylation. Mol. Cell. Biol. 21, 7913–7922 (2001)
Illingworth, R. et al. A novel CpG island set identifies tissue-specific methylation at developmental gene loci. PLoS Biol. 6, e22 (2008)
Meissner, A. et al. Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature 454, 766–770 (2008)
Klose, R. J. & Bird, A. P. MeCP2 behaves as an elongated monomer that does not stably associate with the Sin3a chromatin remodeling complex. J. Biol. Chem. 279, 46490–46496 (2004)
Schmiedeberg, L., Skene, P., Deaton, A. & Bird, A. A temporal threshold for formaldehyde crosslinking and fixation. PLoS ONE 4, e4636 (2009)
Araki, K., Araki, M. & Yamamura, K. Targeted integration of DNA using mutant lox sites in embryonic stem cells. Nucleic Acids Res. 25, 868–872 (1997)
Acknowledgements
We are grateful to D. Skalnik for the gift of a Cfp1 antibody, I. Chambers for the TβC44 ES cell line, R. Klose for discussions, E. Sheridan for testing the DNA sequencing protocol, and K. Auger and J. Parkhill for coordinating the DNA sequencing. We also thank R. Ekiert and J. Connelly for comments on the manuscript. This work was funded by a Cancer Research UK studentship to J.P.T. and by grants from the Wellcome Trust, the Medical Research Council and the European Union ‘Epigenome’ Network of Excellence.
Author Contributions
J.P.T. and P.J.S. performed the ChIP, knockdown and bisulphite analysis. T.C. did preliminary ChIP analysis. J.S. and J.G. generated the MeCP2-eGFP cell line. J.S. performed bisulphite analysis of the TβC44 ES cell line. R.I. prepared samples for sequencing. K.D.J., D.J.T. and R.A. performed the sequencing and mapping. S.W., A.R.W.K., A.D. and R.I. performed the bioinformatic analysis. J.P.T., P.J.S., T.C., R.I. and A.B. wrote the manuscript.
Supplementary information
Supplementary Information
This file contains Supplementary Figures S1 - S4 with legends and reference for Supplementary Figure S2, legend for Supplementary Table S1 (see separate Supplementary Table file) and Supplementary Tables S2 - S3. (PDF 558 kb)
Supplementary Table 1
This table shows the association of Cfp1, histone H3K4me3, H3K27me3 and RNA polymerase II with non-methylated CGIs. (see Supplementary Information file for full legend). (XLS 1090 kb)
Cite this article
Thomson, J., Skene, P., Selfridge, J. et al. CpG islands influence chromatin structure via the CpG-binding protein Cfp1. Nature 464, 1082–1086 (2010). https://doi.org/10.1038/nature08924
DOI: https://doi.org/10.1038/nature08924