Abstract
Nanopore sequencers can be used to selectively sequence certain DNA molecules in a pool by reversing the voltage across individual nanopores to reject specific sequences, enabling enrichment and depletion to address biological questions. Previously, we achieved this using dynamic time warping to map the signal to a reference genome, but the method required substantial computational resources and did not scale to gigabase-sized references. Here we overcome this limitation by using graphics processing unit (GPU) base-calling. We show enrichment of specific chromosomes from the human genome and of low-abundance organisms in mixed populations without a priori knowledge of sample composition. Finally, we enrich targeted panels comprising 25,600 exons from 10,000 human genes and 717 genes implicated in cancer, identifying PML–RARA fusions in the NB4 cell line in <15 h of sequencing. These methods can be used to efficiently screen any target panel of genes without specialized sample preparation, using any computer with a suitable GPU. Our toolkit, readfish, is available at https://www.github.com/looselab/readfish.
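The selection step described in the abstract (base-call the start of a read, map it to a reference, then either keep sequencing or reverse the voltage to reject the molecule) can be illustrated with a minimal sketch. This is not the readfish implementation: the function names, the decision labels ("stop_receiving", "unblock", "proceed"), the `max_chunks` cutoff and the policy for unmapped reads are all assumptions chosen for illustration, and the mapping hit is supplied directly rather than produced by a real base-caller and aligner.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one contig (assumed convention)


def build_index(targets: Dict[str, List[Interval]]) -> Dict[str, List[Interval]]:
    """Sort and merge overlapping target intervals for each contig."""
    index: Dict[str, List[Interval]] = {}
    for contig, intervals in targets.items():
        merged: List[Interval] = []
        for start, end in sorted(intervals):
            if merged and start <= merged[-1][1]:
                # Overlaps or touches the previous interval: extend it.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[contig] = merged
    return index


def on_target(index: Dict[str, List[Interval]], contig: str, position: int) -> bool:
    """Return True if the mapped position falls inside any target interval."""
    intervals = index.get(contig, [])
    # Find the last interval whose start is <= position.
    i = bisect_right(intervals, (position, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= position < intervals[i][1]


def decide(
    index: Dict[str, List[Interval]],
    hit: Optional[Tuple[str, int]],
    chunks_seen: int,
    max_chunks: int = 4,  # hypothetical cutoff, not a value from the paper
) -> str:
    """Choose an action for one read in a hypothetical enrichment experiment."""
    if hit is None:
        # Unmapped so far: wait for more signal, then give up (assumed policy).
        return "unblock" if chunks_seen >= max_chunks else "proceed"
    contig, position = hit
    # On-target reads are sequenced to completion; off-target reads are rejected.
    return "stop_receiving" if on_target(index, contig, position) else "unblock"


if __name__ == "__main__":
    idx = build_index({"chr1": [(100, 200), (150, 300)]})
    print(decide(idx, ("chr1", 250), chunks_seen=1))
    print(decide(idx, ("chr1", 500), chunks_seen=1))
```

For a depletion experiment the two mapped outcomes would simply be swapped, so that on-target reads are rejected and everything else is sequenced.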
Data availability
All reads generated in the course of this study are available from the ENA under project ID PRJEB36644.
Code availability
Our code is available open source at http://www.github.com/LooseLab/readfish. See also “readfish code availability” above.
Acknowledgements
We thank J. Quick, J. Tyson, J. Simpson and N. Loman for helpful comments and (mainly) criticisms and E. Birney, N. Goldman and A. Senf for helpful insights and discussion on these approaches. We thank M. Hubank and L. Gallagher for access to materials and reagents as well as general boundless enthusiasm. We thank M. Jain for assisting in manipulating data. We also thank S. Reid, C. Wright, C. Seymour, J. Pugh and G. Pimm from ONT for advice on MinKNOW and Guppy operations as well as extensive troubleshooting. This work was supported by the Biotechnology and Biological Sciences Research Council (grant numbers BB/N017099/1, R.M. and M.L.; BB/M020061/1, M.L.; and BB/M008770/1, 1949454 A.P.), the Wellcome Trust (grant number 204843/Z/16/Z, N.H. and M.L.) and the Defence Science and Technology Laboratory (grant number DSTLX-1000138444, R.M. and M.L.).
Author information
Contributions
M.L. and A.P. conceived the study. A.P., N.H. and M.L. acquired data. T.C. and R.M. designed and implemented metagenomics applications. A.P., B.J.D. and M.L. analyzed and interpreted data. All authors discussed the results and contributed to the final manuscript.
Ethics declarations
Competing interests
M.L. was a member of the MinION access program and has received free flow cells and sequencing reagents in the past. M.L. has received reimbursement for travel, accommodation and conference fees to speak at events organized by ONT.
Additional information
Peer review information: Nature Biotechnology thanks Jan Korbel and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–19, Tables 1–5, Note 1 and Data 1 description.
Supplementary Data 1
COSMIC panel coordinates for selective sequencing and the mean coverage across each run.
Cite this article
Payne, A., Holmes, N., Clarke, T. et al. Readfish enables targeted nanopore sequencing of gigabase-sized genomes. Nat Biotechnol 39, 442–450 (2021). https://doi.org/10.1038/s41587-020-00746-x