Abstract
RNA-seq is increasingly used for quantitative profiling of small RNAs (for example, microRNAs, piRNAs and snoRNAs) in diverse sample types, including isolated cells, tissues and cell-free biofluids. The accuracy and reproducibility of the currently used small RNA-seq library preparation methods have not been systematically tested. Here we report results obtained by a consortium of nine labs that independently sequenced reference, 'ground truth' samples of synthetic small RNAs and human plasma-derived RNA. We assessed three commercially available library preparation methods that use adapters of defined sequence and six methods using adapters with degenerate bases. Both protocol- and sequence-specific biases were identified, including biases that reduced the ability of small RNA-seq to accurately measure adenosine-to-inosine editing in microRNAs. We found that these biases were mitigated by library preparation methods that incorporate adapters with degenerate bases. MicroRNA relative quantification between samples using small RNA-seq was accurate and reproducible across laboratories and methods.
Change history
31 July 2018
In the version of this article initially published online, the text "Beth Israel Deaconess Medical Center/Dana Farber Cancer Institute (BIDMC/DFCI)" was inserted into the last sentence in the right-hand column of p. 10, beginning "It is worth noting....". In addition, on p. 2, the acronym for The Cancer Genome Atlas was given as TGCA, rather than TCGA; and on p. 3, UUTR should have been defined as University of Utrecht, the Netherlands. Finally, ref. 48 was cited in the Online Methods after "4N_Xu protocol was performed as previously described35"; this extra citation has been deleted. The errors have been corrected for the print, PDF and HTML versions of this article.
01 October 2018
Nat. Biotechnol. 10.1038/nbt.4183; corrected online 31 July 2018. In the version of this article initially published online, the text “Beth Israel Deaconess Medical Center/Dana Farber Cancer Institute (BIDMC/DFCI)” was inserted into the last sentence in the right-hand column of p. 10, beginning “It is worth noting....”
References
Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008).
Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10, 57–63 (2009).
Levin, J.Z. et al. Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nat. Methods 7, 709–715 (2010).
Jayaprakash, A.D., Jabado, O., Brown, B.D. & Sachidanandam, R. Identification and remediation of biases in the activity of RNA ligases in small-RNA deep sequencing. Nucleic Acids Res. 39, e141 (2011).
Viollet, S., Fuchs, R.T., Munafo, D.B., Zhuang, F. & Robb, G.B. T4 RNA ligase 2 truncated active site mutants: improved tools for RNA analysis. BMC Biotechnol. 11, 72 (2011).
Zhang, Z., Lee, J.E., Riemondy, K., Anderson, E.M. & Yi, R. High-efficiency RNA cloning enables accurate quantification of miRNA expression by deep sequencing. Genome Biol. 14, R109 (2013).
Song, Y., Liu, K.J. & Wang, T.-H. Elimination of ligation dependent artifacts in T4 RNA ligase to achieve high efficiency and low bias microRNA capture. PLoS One 9, e94619 (2014).
Baran-Gale, J. et al. Addressing bias in small RNA library preparation for sequencing: a new protocol recovers microRNAs that evade capture by current methods. Front. Genet. 6, 352 (2015).
Sorefan, K. et al. Reducing ligation bias of small RNAs in libraries for next generation sequencing. Silence 3, 4 (2012).
Bellingham, S.A., Coleman, B.M. & Hill, A.F. Small RNA deep sequencing reveals a distinct miRNA signature released in exosomes from prion-infected neuronal cells. Nucleic Acids Res. 40, 10937–10949 (2012).
Nolte-'t Hoen, E. et al. Deep sequencing of RNA from immune cell-derived vesicles uncovers the selective incorporation of small non-coding RNA biotypes with potential regulatory functions. Nucleic Acids Res. 40, 9272–9285 (2012).
Huang, X. et al. Characterization of human plasma-derived exosomal RNAs by deep sequencing. BMC Genomics 14, 319 (2013).
Tietje, A., Maron, K.N., Wei, Y. & Feliciano, D.M. Cerebrospinal fluid extracellular vesicles undergo age dependent declines and contain known and novel non-coding RNAs. PLoS One 9, e113116 (2014).
Lunavat, T.R. et al. Small RNA deep sequencing discriminates subsets of extracellular vesicles released by melanoma cells--Evidence of unique microRNA cargos. RNA Biol. 12, 810–823 (2015).
Tosar, J.P. et al. Assessment of small RNA sorting into different extracellular fractions revealed by high-throughput sequencing of breast cell lines. Nucleic Acids Res. 43, 5601–5616 (2015).
van Balkom, B.W.M., Eisele, A.S., Pegtel, D.M., Bervoets, S. & Verhaar, M.C. Quantitative and qualitative analysis of small RNAs in human endothelial cells and exosomes provides insights into localized RNA processing, degradation and sorting. J. Extracell. Vesicles 4, 26760 (2015).
Burgos, K.L. et al. Identification of extracellular miRNA in human cerebrospinal fluid by next-generation sequencing. RNA 19, 712–722 (2013).
Bahn, J.H. et al. The landscape of microRNA, Piwi-interacting RNA, and circular RNA in human saliva. Clin. Chem. 61, 221–230 (2015).
Freedman, J.E. et al. Diverse human extracellular RNAs are widely detected in human plasma. Nat. Commun. 7, 11106 (2016).
Wecker, T. et al. MicroRNA profiling in aqueous humor of individual human eyes by next-generation sequencing. Invest. Ophthalmol. Vis. Sci. 57, 1706–1713 (2016).
Yuan, T. et al. Plasma extracellular RNA profiles in healthy and cancer patients. Sci. Rep. 6, 19413 (2016).
Bullard, J.H., Purdom, E., Hansen, K.D. & Dudoit, S. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11, 94 (2010).
Risso, D., Ngai, J., Speed, T.P. & Dudoit, S. Normalization of RNA-seq data using factor analysis of control genes or samples. Nat. Biotechnol. 32, 896–902 (2014).
Lin, Y. et al. Comparison of normalization and differential expression analyses using RNA-Seq data from 726 individual Drosophila melanogaster. BMC Genomics 17, 28 (2016).
't Hoen, P.A.C. et al. Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories. Nat. Biotechnol. 31, 1015–1022 (2013).
SEQC/MAQC-III Consortium. A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nat. Biotechnol. 32, 903–914 (2014).
Leinonen, R., Sugawara, H., Shumway, M. & International Nucleotide Sequence Database Collaboration. The sequence read archive. Nucleic Acids Res. 39, D19–D21 (2011).
Kodama, Y., Shumway, M., Leinonen, R. & International Nucleotide Sequence Database Collaboration. The Sequence Read Archive: explosive growth of sequencing data. Nucleic Acids Res. 40, D54–D56 (2012).
Kalra, H. et al. Vesiclepedia: a compendium for extracellular vesicles with continuous community annotation. PLoS Biol. 10, e1001450 (2012).
Simpson, R.J., Kalra, H. & Mathivanan, S. ExoCarta as a resource for exosomal research. J. Extracell. Vesicles 1, 18374 (2012).
Kim, D.-K. et al. EVpedia: an integrated database of high-throughput data for systemic analyses of extracellular vesicles. J. Extracell. Vesicles 2, 20384 (2013).
Weinstein, J.N. et al. The Cancer Genome Atlas pan-cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).
Subramanian, S.L. et al. Integration of extracellular RNA profiling data using metadata, biomedical ontologies and Linked Data technologies. J. Extracell. Vesicles 4, 27497 (2015).
Ainsztein, A.M. et al. The NIH Extracellular RNA Communication Consortium. J. Extracell. Vesicles 4, 27493 (2015).
Xu, P. et al. An improved protocol for small RNA library construction using high-definition adapters. Methods Next Gener. Seq. 2, http://dx.doi.org/10.1515/mngs-2015-0001 (2015).
Hansen, K.D., Brenner, S.E. & Dudoit, S. Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Res. 38, e131 (2010).
Hafner, M. et al. RNA-ligase-dependent biases in miRNA representation in deep-sequenced small RNA cDNA libraries. RNA 17, 1697–1712 (2011).
Fuchs, R.T., Sun, Z., Zhuang, F. & Robb, G.B. Bias in ligation-based small RNA sequencing library construction is determined by adaptor and RNA structure. PLoS One 10, e0126049 (2015).
Robinson, M.D., McCarthy, D.J. & Smyth, G.K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
McCarthy, D.J., Chen, Y. & Smyth, G.K. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40, 4288–4297 (2012).
Love, M.I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Law, C.W., Chen, Y., Shi, W. & Smyth, G.K. voom: Precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 15, R29 (2014).
Cortez, M.A. et al. MicroRNAs in body fluids: the mix of hormones and biomarkers. Nat. Rev. Clin. Oncol. 8, 467–477 (2011).
Yang, W. et al. Modulation of microRNA processing and expression through RNA editing by ADAR deaminases. Nat. Struct. Mol. Biol. 13, 13–21 (2006).
Kawahara, Y. et al. Redirection of silencing targets by adenosine-to-inosine editing of miRNAs. Science 315, 1137–1140 (2007).
Wang, Y. et al. Systematic characterization of A-to-I RNA editing hotspots in microRNAs across human cancers. Genome Res. 27, 1112–1125 (2017).
Warnefors, M., Liechti, A., Halbert, J., Valloton, D. & Kaessmann, H. Conserved microRNA editing in mammalian evolution, development and disease. Genome Biol. 15, R83 (2014).
Linsen, S.E.V. et al. Limitations and possibilities of small RNA digital gene expression profiling. Nat. Methods 6, 474–476 (2009).
Baker, M. 1,500 scientists lift the lid on reproducibility. Nature 533, 452–454 (2016).
Goodman, S.N., Fanelli, D. & Ioannidis, J.P.A. What does research reproducibility mean? Sci. Transl. Med. 8, 341ps12 (2016).
Risso, D., Schwartz, K., Sherlock, G. & Dudoit, S. GC-content normalization for RNA-Seq data. BMC Bioinformatics 12, 480 (2011).
Hansen, K.D., Irizarry, R.A. & Wu, Z. Removing technical variability in RNA-seq data using conditional quantile normalization. Biostatistics 13, 204–216 (2012).
Leek, J.T. & Storey, J.D. Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet. 3, 1724–1735 (2007).
Li, S. et al. Detecting and correcting systematic variation in large-scale RNA sequencing data. Nat. Biotechnol. 32, 888–895 (2014).
Chu, A. et al. Large-scale profiling of microRNAs for The Cancer Genome Atlas. Nucleic Acids Res. 44, e3 (2016).
Markham, N.R. & Zuker, M. UNAFold: software for nucleic acid folding and hybridization. Methods Mol. Biol. 453, 3–31 (2008).
Acknowledgements
We are grateful to J.S. Rozowsky, R. Kitchen, S.L. Subramanian, W. Thistlethwaite, M.B. Gerstein, A. Milosavljevic and N. Sakhanenko for facilitating access to the exceRpt pipeline and helpful conversations and suggestions. We are also grateful to P.A.C. 't Hoen for helpful discussions. We acknowledge funding support from the NIH Extracellular RNA Communication Common Fund grants: U01 grants HL126499 to M.T., HL126496 to D.J.G. and K.W., HL126493 to D.J.E. and P.G.W., HL126494 to L.C.L., HL126495 to J.E.F., HL126497 to I.G., and UH3 grant TR000891 to K.V.K.-J. M.D.G. acknowledges initial support from a Rio Hortega Fellowship (CM10/00084) and later from a Martin Escudero Fellowship. E.N.M.N.-'t.H. and T.A.P.D. received funding from the European Research Council under the European Union's Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement number 337581 and from the Netherlands Organization for Scientific Research (NWO Enabling Technologies project nr 435002022). Y.E.W. received funding from the Dana-Farber Strategic Plan Initiative. K.W. received funding from DOD (W911NF-10-2-0111) and DTRA (HDTRA1-13-C-0055). Research reported in this publication was also supported by the National Cancer Institute of the NIH under Award Number P30CA046592 by the use of the following Cancer Center Shared Resource at the University of Michigan: DNA Sequencing. D.J.G. also acknowledges a special technology support award from the Pacific Northwest Research Institute to his lab. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author information
Contributions
M.D.G. designed, led and coordinated the overall study. This comprised contributions throughout the entire process, including designing, preparing and distributing synthetic RNA pools, creating detailed instructions for the participating labs, performing experiments, coordinating experimental work and communication across labs, and organizing and managing all aspects of the project. In addition, she interpreted study results and was the primary writer of the manuscript. R.M.S. led the computational analyses and data management for this study, including processing data, designing and performing data analyses, identifying and applying methods to visualize data and results, and coordinating data and metadata incoming from collaborating laboratories. He also designed the composition of the ratiometric pools, interpreted results and contributed to the manuscript by preparing the figures, drafting figure legends and writing the computational methods. A.E. contributed to experimental design and preparation of synthetic RNA pools. He also performed experimental work, interpreted results and provided comments on the manuscript. He also developed and contributed the core in-house 4N protocol, variations of which were then used by multiple laboratories. M.T., D.J.G. and D.J.E. helped to design the study and interpret the results, along with contributions from the rest of the study team, including L.C.L., P.G.W., K.V.K.-J., I.G., Y.E.W., K.W., J.E.F., H.J., E.N.M.N.-'t.H. and H.P. J.B. contributed to design of statistical analyses and data interpretation. P.M.G., A.J.B., S.S., P.L.D.H., K.T., A.C., S.L., J.K., R.R., D.B. and T.A.P.D. carried out experiments. M.T. and D.J.G. supervised the overall study and did primary editing of the manuscript with substantial input from D.J.E. All of the authors contributed to reviewing, editing and/or providing comments on the manuscript.
Ethics declarations
Competing interests
The spouse of L.C. Laurent is an employee of Illumina, Inc., the manufacturer of the TruSeq Small RNA Library Preparation Kit. L.C. Laurent and her spouse's equity interest in Illumina, Inc. represents <<1% of the company. The other authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–20, Supplementary Note 1, and Supplementary Protocols 1–7 (PDF 21474 kb)
Supplementary Tables
Supplementary tables 1–10 (XLSX 1311 kb)
About this article
Cite this article
Giraldez, M., Spengler, R., Etheridge, A. et al. Comprehensive multi-center assessment of small RNA-seq methods for quantitative miRNA profiling. Nat Biotechnol 36, 746–757 (2018). https://doi.org/10.1038/nbt.4183
DOI: https://doi.org/10.1038/nbt.4183