Abstract
Active DNA demethylation in mammals involves oxidation of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). However, genome-wide detection of 5fC at single-base resolution remains challenging. Here we present fC-CET, a bisulfite-free method for whole-genome analysis of 5fC based on selective chemical labeling of 5fC and subsequent C-to-T transition during PCR. Base-resolution 5fC maps showed limited overlap with 5hmC, with 5fC-marked regions more active than 5hmC-marked ones.
Change history
21 September 2015
In the version of this article initially published online, Chuan He is incorrectly affiliated with Tsinghua-Peking Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing, China, and is missing an affiliation with the Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China. This error has been corrected for the print, PDF and HTML versions of this article.
References
1. Tahiliani, M. et al. Science 324, 930–935 (2009).
2. Ito, S. et al. Science 333, 1300–1303 (2011).
3. He, Y.F. et al. Science 333, 1303–1307 (2011).
4. Song, C.X. & He, C. Trends Biochem. Sci. 38, 480–484 (2013).
5. Bachman, M. et al. Nat. Chem. 6, 1049–1055 (2014).
6. Raiber, E.A. et al. Genome Biol. 13, R69 (2012).
7. Shen, L. et al. Cell 153, 692–706 (2013).
8. Song, C.X. et al. Cell 153, 678–691 (2013).
9. Maiti, A. & Drohat, A.C. J. Biol. Chem. 286, 35334–35338 (2011).
10. Pfaffeneder, T. et al. Angew. Chem. Int. Ed. Engl. 50, 7008–7012 (2011).
11. Booth, M.J., Marsico, G., Bachman, M., Beraldi, D. & Balasubramanian, S. Nat. Chem. 6, 435–440 (2014).
12. Lu, X. et al. Cell Res. 25, 386–389 (2015).
13. Wu, H., Wu, X., Shen, L. & Zhang, Y. Nat. Biotechnol. 32, 1231–1240 (2014).
14. Booth, M.J. et al. Science 336, 934–937 (2012).
15. Rivera, C.M. & Ren, B. Cell 155, 39–55 (2013).
16. Neri, F. et al. Cell Rep. 10, 674–683 (2015).
17. Song, C.X. et al. Nat. Biotechnol. 29, 68–72 (2011).
18. Yu, M. et al. Cell 149, 1368–1380 (2012).
19. Iurlaro, M. et al. Genome Biol. 14, R119 (2013).
20. Spruijt, C.G. et al. Cell 152, 1146–1159 (2013).
21. Wang, D. et al. Biochemistry 42, 6747–6753 (2003).
22. Obeid, S. et al. EMBO J. 29, 1738–1747 (2010).
23. Peng, X. et al. Nucleic Acids Res. 43, e35 (2015).
24. Krueger, F. & Andrews, S.R. Bioinformatics 27, 1571–1572 (2011).
25. Zhang, Y. et al. Genome Biol. 9, R137 (2008).
26. Heinz, S. et al. Mol. Cell 38, 576–589 (2010).
27. Xiao, S. et al. Cell 149, 1381–1392 (2012).
28. Williams, K. et al. Nature 473, 343–348 (2011).
29. Hon, G.C. et al. Mol. Cell 56, 286–297 (2014).
Acknowledgements
The authors thank R. Meng, S.T. Huang, J.Y. Liu, J.Y. Li, X.T. Shu, X.Y. Li and C.X. Zhu for technical assistance; X.X. Zhang and H.S. Guo (Peking University, Beijing, China) for providing genomic DNA at the beginning of the project; C.F. Xia for synthetic suggestions; and O. Stovicek for editing the manuscript. This work was supported by the National Basic Research Foundation of China (grant 2014CB964900 to C.Y.), the National Natural Science Foundation of China (grants 31270838 and 21472009 to C.Y.), and the US National Institutes of Health (grant R01 HG006827 to C.H.). C.H. is supported by the Howard Hughes Medical Institute.
Author information
Contributions
B.X. and C.Y. conceived the original idea and designed the experiments with the help of X.L. and C.H.; B.X. performed the experiments with the help of X.L., H.Z., M.L. and X.J.; D.H. performed bioinformatics analysis; Z.S. and A.Z. synthesized the chemicals; Q.Y. and W.X. helped with the library preparation; C.H. and C.Y. supervised the project; and B.X. and C.Y. wrote the manuscript with contributions from D.H., X.L. and C.H.
Ethics declarations
Competing interests
B.X., A.Z. and C.Y. are co-inventors on a filed patent (WO2015043493) for the labeling strategies and sequencing methods reported herein.
Integrated supplementary information
Supplementary Figure 1 Chemical labeling of 5fC via Friedländer reaction.
(a) Proposed chemical labeling of 5fC based on the principle of the Friedländer reaction (ref. 30). (b) Chemicals screened for 5fC labeling, with reaction yield indicated. 1, ethyl acetoacetate (EA). 2, methyl acetoacetate (MA). 3, ethyl 6-azido-3-oxohexanoate (EAO). 4, diethyl malonate (DM). 5, 1,3-indandione (I). 6, 5-(2-azidoethyl)-1,3-indandione (AI). 7, ethyl nitroacetate. 8, ethyl cyanoacetate. 9, acetylacetone. 10, benzoylacetone. 11, barbituric acid. 12, benzoylacetonitrile. n.d., no detectable products as measured by MALDI-TOF mass spectrometry.
Supplementary Figure 2 MALDI-TOF characterizations of chemical labeling of 5fC in a 9-mer model DNA.
Calculated and observed molecular weights are shown. The proposed chemical structures of labeling products of 5fC are also shown. *Because this reaction was performed in alkaline methanol solution, the ethyl ester underwent transesterification to form the methyl ester. 5fC-AA: reaction product between 5fC and ethyl or methyl acetoacetate. 5fC-EAO: reaction product between 5fC and ethyl 6-azido-3-oxohexanoate. 5fC-DM: reaction product between 5fC and diethyl malonate. 5fC-I: reaction product between 5fC and 1,3-indandione.
Supplementary Figure 3 HPLC analysis of 1,3-indandione labeling on 5fC-containing 9-mer model DNA.
The 9-mer 5fC and 5fC-I oligonucleotides were digested to nucleosides and analyzed by HPLC on a C18 column (detection at 260 nm and 310 nm). The 5fC peak was undetectable after the reaction, indicating complete labeling. Authentic 5fC and 5fC-I nucleosides were also analyzed as positive controls.
Supplementary Figure 4 Properties of 5fC labeling products during PCR and/or bisulfite treatment.
(a) Labeling 5fC with diethyl malonate protects it from bisulfite-mediated deamination, so the labeled base is read as C in canonical bisulfite sequencing, similar to the previously reported fCAB-Seq (ref. 8). (b) The adduct between 5fC and 1,3-indandione results in a C-to-T transition during PCR. (c) Single-nucleotide incorporation of dATP, dGTP, dCTP and dTTP opposite 5fC-I after 1, 2, 5, 10, 30 and 60 min. A template in which X is T was also included as a positive control. (d) Quantification of single-nucleotide incorporation. Values are presented as mean ± s.d. (n = 3). (e) Schematic comparison of fC-CET with fCAB-Seq.
Supplementary Figure 5 Potential base-pairing properties of different cyclic adducts.
(a) Cyclic adducts between 5fC and ethyl acetoacetate, diethyl malonate or ethyl 6-azido-3-oxohexanoate. These adducts contain a 4-NH group and are expected to pair with dG in a canonical Watson-Crick fashion. (b) Hypothesized base-pairing modes of 5fC-I with dA. Left: N3-protonated 5fC-I could base-pair with dA in the Watson-Crick manner. Right: alternatively, the 6-amino group of dA could form a bidentate hydrogen bond with 5fC-I. (c) Acid/base titration curves of 5fC-I solution (in DMSO:H2O = 5.5:1 solvent). The 13C NMR spectra of the free nucleoside, alone and in 2 equivalents of hydrochloric acid (pH ~2), are shown in Supplementary Note 2. These data suggest that the free 5fC-I nucleoside did not undergo protonation.
Supplementary Figure 6 Monitoring the efficiency of AI-mediated 5fC labeling on both model sequence and genomic DNA.
(a) AI labeling of 5fC generates a pair of isomers. HPLC chromatograms (260 nm and 310 nm) of the nucleosides digested from the 9-mer 5fC-AI oligonucleotides are shown, including the split peaks of the two reaction products. (b) Dot-blot assay monitoring the efficiency of 5fC labeling on both model sequence (starting from 100 ng) and Tdg–/– mESC gDNA (starting from 1,000 ng). Lambda DNA (λDNA, starting from 1,000 ng) was used as a negative control. (c) Matrix-assisted laser desorption–ionization time-of-flight (MALDI-TOF) mass spectrometry characterization of fC-CET in 9-mer 5fC oligonucleotides. Calculated and observed m/z are shown. (d) Agarose gel analysis of AI-labeled double-stranded 5fC model DNAs, indicating little degradation. Four replicates are shown. (e) Recovery efficiency for each step of fC-CET. 9-mer 5fC oligonucleotide, 5fC model DNA or gDNA samples were used to characterize the recovery efficiency. Values are presented as mean ± s.d. (n = 4 for oligonucleotide and model DNA; n = 5 for gDNA).
Supplementary Figure 7 AI-mediated labeling of 5fC is highly selective among cytosine derivatives.
9-mer DNAs with a central C (a), 5mC (b), 5hmC (c) and 5caC (d) were labeled under the same conditions as 5fC and then analyzed with MALDI-TOF. No cross-reactivity was observed for these cytosines.
Supplementary Figure 8 AI labeling of 5fC and subsequent C-to-T transition.
(a) Biotin conjugation for pulldown and DTT cleavage. (b) Sanger sequencing results for the fC-CET-treated 5fC-containing 76-mer DNA, showing the C-to-T transition introduced during PCR amplification. (c) FspI-digestion tests to confirm the C-to-T transition caused by AI-mediated 5fC labeling. The PCR-amplified input 70-mer 5fC oligo, with an FspI restriction site in the middle, can be cleaved, whereas amplified products of 5fC-AI-Biotin or 5fC-AI-SH stay intact, suggesting loss of the restriction site and hence a C-to-T transition. (d,e) Efficiency of 5fC labeling on a model DNA. T%, calculated from TOPO cloning (d) or high-throughput sequencing on a MiSeq (e), was used to measure the C-to-T transition rate of 5fC after AI labeling and AI-mediated pulldown. In both cases, a normal C next to the target 5fCpG was used as a control.
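The T% readout in panels d and e can be illustrated with a minimal sketch. It assumes that T% is the fraction of T among the C and T calls at one position, which the legend does not spell out; the function name `t_percent` and the base calls are hypothetical and are not data from the paper.

```python
from collections import Counter


def t_percent(bases):
    """Return the percentage of T among the C and T calls at one position.

    Assumption: calls other than C or T (e.g. N) are ignored.
    """
    counts = Counter(b.upper() for b in bases)
    informative = counts["C"] + counts["T"]
    if informative == 0:
        raise ValueError("no C or T calls at this position")
    return 100.0 * counts["T"] / informative


# Hypothetical base calls, one per clone or read (illustration only).
target_calls = list("TTTTTTTTCT")   # position of the labeled 5fC
control_calls = list("CCCCCCCCCC")  # unmodified C next to the 5fCpG

target_rate = t_percent(target_calls)
control_rate = t_percent(control_calls)
print(f"target T% = {target_rate:.1f}, control T% = {control_rate:.1f}")
```

Comparing the target position with the neighboring unmodified C separates the label-dependent transition from background sequencing or PCR error.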
Supplementary Figure 9 Choice of polymerase to maximize C-to-T signals during PCR.
(a) 76-mer oligonucleotide with two 5fC sites labeled with AI, PCR amplified and then subjected to Sanger sequencing. The graph shows that the polymerase can readily read through the modified sequence even with two successive 5fC-AI adducts next to each other (in a 5ʹ-fCGfCG-3ʹ context). (b) 76-mer control oligonucleotide with G and A replacing the two 5fCs in the sequence in a. (c) Sanger sequencing results of Taq polymerase reading through the adduct between 5fC and AI. PCR bias diminishes the effective C-to-T signal; hence different polymerases were tested to maximize the signal for 5fC detection. Alternatively, new labeling probes could be explored to minimize chemical scarring on the 5fC base (ref. 31). (d) Commercially available MightyAmp DNA polymerase best overcomes the PCR bias and hence allows maximal C-to-T transition during PCR. (e) Schematic diagram of the on-bead wash step (with NaOH solution) to remove any strands that do not contain 5fC (“supernatant” depicted in Step 2). The output (depicted in Step 3) is therefore enriched for 5fC-AI-SH. (f) AI-labeled 76-bp dsDNA with a central 5fC-AI-SH:G pair. The C-to-T transition peak can be observed at the 5fC site, although the signal of C (resulting from the G in the complementary strand) is higher than that of T (from 5fC-AI-SH). (g) Elution from the DTT-cleavage step. The eluted fraction is enriched for 5fC-AI-SH, and hence a near-complete C-to-T transition peak can be observed at the 5fC site.
Supplementary Figure 10 Genome browser view of 5fC at the Fbxl18 gene.
(a) 5fC-enriched regions in both Tdgfl/fl and Tdg–/– mESCs. Results from two replicates are shown, demonstrating high reproducibility of fC-CET. 5hmC-enriched peaks by hmC-Seal are also shown. (b) Single-base 5fC sites, along with 5mC and 5hmC, in Tdgfl/fl and Tdg–/– mESCs are shown. The 5fC peaks correspond to 5mC sites with low abundance.
Supplementary Figure 11 Venn diagrams of the 5fC-enriched regions.
(a,b) 5fC-enriched regions in two biological replicates of Tdgfl/fl (a) and Tdg–/– (b) mESCs. (c) The majority of 5fC-marked regions in the Tdgfl/fl mESCs fall within those in Tdg–/– mESCs. (d,e) Comparisons of 5fC-enriched regions detected by fC-CET with results from fC-Seal in the Tdgfl/fl and Tdg–/– mESCs.
Supplementary Figure 12 5fC sites at single-base resolution.
(a) Comparisons of 5fC sites in the Tdgfl/fl and Tdg–/– mESCs. (b) 5fC sites in Tdg–/– mESCs are grouped based on genomic elements. (c) The relative enrichment of Tdg–/– 5fC sites in different genomic elements.
Supplementary Figure 13 Locus-specific validation of 5fC sites.
(a,b) fCAB-Seq (ref. 8) and MAB-Seq (ref. 13) were performed (in parallel with fC-CET) on the 76-mer 5fC model sequence. Results from the TOPO-cloning assay (a) and high-throughput sequencing (b) are shown. In both cases, a normal C next to the 5fCpG was used as a control. (c–e) Locus-specific validation of 5fC sites with fC-CET (c,d) and fCAB-Seq (e). The columns represent the percentage of T at the target 5fC sites. For each site, a P value is given to show whether it is a statistically significant 5fC site (binomial test for c and d, and Fisher’s test for differences between fCAB and BS datasets in e). *P < 0.05, **P < 0.01, ***P < 0.001; #, not detected because of technical issues.
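The binomial test named for panels c and d can be sketched as follows. The legend does not state the null error rate or whether the test is one- or two-sided, so this example assumes a one-sided upper-tail test against a hypothetical background C-to-T rate; `binomial_upper_tail`, the read counts and the 1% rate are illustrative assumptions, not values from the paper.

```python
from math import comb


def binomial_upper_tail(t_reads, total_reads, background_rate):
    """Return P(X >= t_reads) for X ~ Binomial(total_reads, background_rate).

    Assumption: T calls at a non-5fC site arise independently at
    `background_rate`, so a small value argues for a genuine 5fC site.
    """
    if not 0 <= t_reads <= total_reads:
        raise ValueError("t_reads must lie between 0 and total_reads")
    if not 0.0 <= background_rate <= 1.0:
        raise ValueError("background_rate must lie between 0 and 1")
    q = 1.0 - background_rate
    return sum(
        comb(total_reads, k) * background_rate**k * q ** (total_reads - k)
        for k in range(t_reads, total_reads + 1)
    )


# Hypothetical site: 6 T calls among 20 reads, assumed 1% background rate.
p_value = binomial_upper_tail(6, 20, 0.01)
print(f"P = {p_value:.3g}")
```

The exact sum is adequate for locus-specific read depths; for genome-wide use, a library routine with multiple-testing correction would be the more usual choice.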
Supplementary Figure 14 5fC sites in Tdg–/– mESCs at different regulatory elements.
Normalized read densities of 5fC in Tdg–/– mESCs at H3K4me1, H3K27ac, p300 and Tet1 regions. The 5fC signals at these regions are higher than those of 5hmC, suggesting that 5fC is a marker of more active regions than 5hmC.
Supplementary Figure 15 5fC is more enriched than 5hmC at different genomic elements.
The genomic loci analyzed included CTCF-binding regions, DNase I hypersensitive regions, enhancers and p300-binding regions; the relative enrichment (observed over expected) is used for comparison. The 5fC sites are more enriched than the 5hmC sites in enhancers and p300-binding regions, suggesting that 5fC-marked regions are more active. This is further supported by the DNase I hypersensitive regions, which represent the more decondensed genomic regions.
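The observed-over-expected comparison can be written out as a short sketch. The legend does not say how the expected value was derived; this example assumes that it is the fraction of the genome covered by the element, and `relative_enrichment` and all counts below are hypothetical.

```python
def relative_enrichment(sites_in_element, total_sites, element_bp, genome_bp):
    """Return observed/expected enrichment of sites in a genomic element.

    Assumption: under the null, sites fall uniformly across the genome, so
    the expected fraction equals the element's share of the genome length.
    """
    if total_sites <= 0 or element_bp <= 0 or genome_bp <= 0:
        raise ValueError("counts and lengths must be positive")
    observed_fraction = sites_in_element / total_sites
    expected_fraction = element_bp / genome_bp
    return observed_fraction / expected_fraction


# Hypothetical numbers: 200 of 1,000 sites fall in elements that cover
# 50 Mb of a 1,000 Mb genome (illustration only).
enrichment = relative_enrichment(200, 1000, 50_000_000, 1_000_000_000)
print(f"observed/expected = {enrichment:.2f}")
```

A value above 1 indicates that the sites are over-represented in the element relative to uniform placement, which is the sense in which the legend compares 5fC with 5hmC.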
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–15, Supplementary Tables 1–3 and Supplementary Notes 1–3 (PDF 3048 kb)
About this article
Cite this article
Xia, B., Han, D., Lu, X. et al. Bisulfite-free, base-resolution analysis of 5-formylcytosine at the genome scale. Nat Methods 12, 1047–1050 (2015). https://doi.org/10.1038/nmeth.3569
DOI: https://doi.org/10.1038/nmeth.3569