Abstract
RnBeads is a software tool for large-scale analysis and interpretation of DNA methylation data, providing a user-friendly analysis workflow that yields detailed hypertext reports (http://rnbeads.mpi-inf.mpg.de/). Supported assays include whole-genome bisulfite sequencing, reduced representation bisulfite sequencing, Infinium microarrays and any other protocol that produces high-resolution DNA methylation data. Notable applications of RnBeads include the analysis of epigenome-wide association studies and epigenetic biomarker discovery in cancer cohorts.
Acknowledgements
We thank D. Brocks, H. Hernandez-Vargas, A. Houseman, E. Schneider, A. Schönegger and all users of RnBeads for their extensive testing and feedback. We also thank G. Friedrich, J. Büch and the Information Services and Technology team at the Max Planck Institute for technical support. This work is funded in part by the European Union's Seventh Framework Programme (FP7/2007-2013) grant agreement no. 282510 (BLUEPRINT) and grant agreement no. 267038 (NOTOX), as well as by the German Science Ministry grant no. 01KU1216A (DEEP).
Author information
Contributions
Y.A., F.M. and P.L. developed and maintain RnBeads; J.W., T.L. and C.B. supervised the project; all authors contributed to the writing of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Analysis of DNA methylation in a cancer cohort based on Infinium 450K data.
RnBeads was used to rediscover a clinically distinct subgroup of glioblastoma patients characterized by increased DNA methylation levels (termed G-CIMP+), and to predict the G-CIMP status for a total of 124 patients using Infinium 450K data obtained from the TCGA project (http://cancergenome.nih.gov).
(a) Detection of genetic duplicates among the patient samples (columns) using a clustered heatmap of intensity values for the genotyping probes that are present on the Infinium microarray (rows). The inset shows that two samples exhibit a high level of genetic identity, and they are indeed derived from tumors of the same patient.
(b) Quality control plot summarizing the outcome of the data filtering. The bar plots on the top left show that the majority of CpG sites (top) and samples (bottom) are of good quality and can be retained. The relatively straight line in the quantile-quantile plot indicates that the probe filtering does not have a major impact on the distribution of DNA methylation in the dataset.
(c) Identification of a small but clearly distinguished cluster of G-CIMP+ glioblastoma samples with elevated DNA methylation levels especially in CpG-rich genomic regions (light blue in the leftmost column). In the heatmap, blue colors denote high levels of DNA methylation, red indicates low levels and grey represents intermediate levels. For visualization purposes, only the 100 gene promoters (rows) with the highest levels of inter-sample variation in DNA methylation are shown (columns), but the hierarchical clustering is based on the full set of promoters.
(d) Global assessment of the similarity between the DNA methylation profiles, plotting all glioblastoma samples according to their second and third principal components. The samples exhibit strong separation according to the G-CIMP status (denoted by point shape) and IDH1 mutation status (denoted by point color).
(e) Analysis of significant associations between all user-provided sample annotations. Significant p-values (<0.05) are highlighted in the left triangle, and the corresponding statistical tests are annotated in the right triangle (orange: Pearson correlation followed by permutation-based estimation of the p-value; green: Fisher’s exact test; blue: Wilcoxon rank sum test; violet: Kruskal-Wallis one-way analysis of variance).
(f) Genome-scale comparison between the DNA methylation levels of G-CIMP positive (y-axis) and G-CIMP negative (x-axis) tumor samples, focusing on CpG islands (left scatterplot) and on 5-kilobase tiling regions with a CpG content in the bottom quartile (right scatterplot), respectively. Genomic regions that are differentially methylated with an FDR below 0.05 are presented as red points. All other regions are displayed in blue, and color brightness denotes point density.
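Panel (e) mentions a Pearson correlation followed by permutation-based estimation of the p-value. The following Python sketch illustrates that general idea only. It is not the RnBeads implementation; the function names, the number of permutations, the two-sided comparison and the add-one correction are all assumptions made for illustration.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_permutations=10000, seed=0):
    """Two-sided permutation p-value for the Pearson correlation of x and y.

    Hypothetical helper: the permutation count, the seed handling and the
    add-one correction are illustrative choices, not taken from RnBeads.
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        # Shuffling y breaks any real association with x.
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point rounding in ties.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    age = [34, 41, 47, 52, 58, 63, 66, 71]        # made-up sample annotation
    score = [0.21, 0.25, 0.31, 0.30, 0.38, 0.41, 0.44, 0.47]  # made-up values
    print(permutation_p_value(age, score, n_permutations=2000))
```

The permutation approach avoids distributional assumptions about the annotations, at the cost of a p-value resolution that is limited by the number of permutations.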
Supplementary Figure 2 RnBeads-based Methylome Resource of reference epigenome data sets.
Screenshot of the Methylome Resource (http://rnbeads.mpi-inf.mpg.de/methylomes.php), which makes large DNA methylation datasets more readily available for follow-up research. On the one hand, it provides detailed analysis reports for publicly available methylome datasets that can be explored interactively. On the other hand, the Methylome Resource website lets RnBeads users download all data and configurations that are needed to re-run all or part of the DNA methylation analyses in their local or cloud-based computing environment. These re-runnable analysis configurations make it straightforward for RnBeads users to analyze their own DNA methylation data in the context of publicly available reference epigenome maps.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1 and 2, Supplementary Table 2 and Supplementary Note (PDF 886 kb)
Supplementary Table 1
Comparison between software tools for DNA methylation analysis (XLSX 37 kb)
About this article
Cite this article
Assenov, Y., Müller, F., Lutsik, P. et al. Comprehensive analysis of DNA methylation data with RnBeads. Nat Methods 11, 1138–1140 (2014). https://doi.org/10.1038/nmeth.3115