Abstract
Cas9-mediated, high-throughput, saturating in situ mutagenesis permits fine-mapping of function across genomic segments. Disease- and trait-associated variants identified in genome-wide association studies largely cluster at regulatory loci. Here we demonstrate the use of multiple designer nucleases and variant-aware library design to interrogate trait-associated regulatory DNA at high resolution. We developed a computational tool for the creation of saturating-mutagenesis libraries with single or multiple nucleases with incorporation of variants. We applied this methodology to the HBS1L-MYB intergenic region, which is associated with red-blood-cell traits, including fetal hemoglobin levels. This approach identified putative regulatory elements that control MYB expression. Analysis of genomic copy number highlighted potential false-positive regions, thus emphasizing the importance of off-target analysis in the design of saturating-mutagenesis experiments. Together, these data establish a widely applicable high-throughput and high-resolution methodology to identify minimal functional sequences within large disease- and trait-associated regions.
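The library-design idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' DNA Striker tool; it is a hypothetical illustration under stated assumptions: a 20-nt spacer immediately 5′ of the PAM, a blunt cut 3 bp upstream of the PAM, the NGG and NGA PAMs named in the supplementary library tables, and variants limited to single-nucleotide substitutions. The function names (`find_guides`, `design_library`, `adjacent_gaps`) are invented for this example.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_guides(seq, pam="NGG", spacer_len=20):
    """List (spacer, strand, cut) for every spacer+PAM site on both strands.

    `cut` is a 0-based top-strand boundary: the predicted break lies between
    seq[cut - 1] and seq[cut]. Assumes a cut 3 bp 5' of the PAM.
    """
    pam_re = "".join("[ACGT]" if base == "N" else base for base in pam)
    # A lookahead reports overlapping sites, which saturating designs need.
    pattern = re.compile(r"(?=([ACGT]{%d})%s)" % (spacer_len, pam_re))
    guides = []
    for m in pattern.finditer(seq):
        guides.append((m.group(1), "+", m.start() + spacer_len - 3))
    for m in pattern.finditer(revcomp(seq)):
        cut_rc = m.start() + spacer_len - 3
        guides.append((m.group(1), "-", len(seq) - cut_rc))
    return guides


def apply_snv(seq, pos, ref, alt):
    """Return seq with the reference base at 0-based pos replaced by alt."""
    if seq[pos] != ref:
        raise ValueError("reference allele does not match sequence")
    return seq[:pos] + alt + seq[pos + 1:]


def design_library(seq, pams=("NGG", "NGA"), snvs=()):
    """Collect reference guides plus guides that exist only on alternate alleles.

    Each SNV is applied on its own (no haplotype combinations in this sketch).
    """
    library = {}
    for pam in pams:
        guides = set(find_guides(seq, pam))
        for pos, ref, alt in snvs:
            guides |= set(find_guides(apply_snv(seq, pos, ref, alt), pam))
        library[pam] = sorted(guides, key=lambda g: (g[2], g[1], g[0]))
    return library


def adjacent_gaps(guides):
    """Distances between neighboring predicted cut sites (a saturation measure)."""
    cuts = sorted({cut for _, _, cut in guides})
    return [b - a for a, b in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    region = "A" * 20 + "TGG" + "A" * 5  # toy sequence, not a real locus
    lib = design_library(region, snvs=[(10, "A", "C")])
    for pam, guides in lib.items():
        print(pam, guides)
    print("gaps:", adjacent_gaps(lib["NGG"] + lib["NGA"]))
```

Pooling the cut sites from several PAM specificities before computing gaps shows why adding a second nuclease can tighten coverage: a region with sparse NGG sites may still be reachable through NGA sites. A real design would also need the off-target and copy-number checks that the abstract highlights, which this sketch omits.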
References
Maurano, M.T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).
Bauer, D.E. et al. An erythroid enhancer of BCL11A subject to genetic variation determines fetal hemoglobin level. Science 342, 253–257 (2013).
Canver, M.C. et al. BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis. Nature 527, 192–197 (2015).
Canver, M.C. et al. Characterization of genomic deletion efficiency mediated by clustered regularly interspaced palindromic repeats (CRISPR)/Cas9 nuclease system in mammalian cells. J. Biol. Chem. 289, 21312–21324 (2014).
Corradin, O. et al. Combinatorial effects of multiple enhancer variants in linkage disequilibrium dictate levels of gene expression to confer susceptibility to common traits. Genome Res. 24, 1–13 (2014).
Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013).
Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823–826 (2013).
Ran, F.A. et al. Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell 154, 1380–1389 (2013).
Hsu, P.D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol. 31, 827–832 (2013).
Uda, M. et al. Genome-wide association study shows BCL11A associated with persistent fetal hemoglobin and amelioration of the phenotype of beta-thalassemia. Proc. Natl. Acad. Sci. USA 105, 1620–1625 (2008).
Lettre, G. et al. DNA polymorphisms at the BCL11A, HBS1L-MYB, and beta-globin loci associate with fetal hemoglobin levels and pain crises in sickle cell disease. Proc. Natl. Acad. Sci. USA 105, 11869–11874 (2008).
Thein, S.L. et al. Intergenic variants of HBS1L-MYB are responsible for a major quantitative trait locus on chromosome 6q23 influencing fetal hemoglobin levels in adults. Proc. Natl. Acad. Sci. USA 104, 11346–11351 (2007).
Galarneau, G. et al. Fine-mapping at three loci known to affect fetal hemoglobin levels explains additional genetic variation. Nat. Genet. 42, 1049–1051 (2010).
Farrell, J.J. et al. A 3-bp deletion in the HBS1L-MYB intergenic region on chromosome 6q23 is associated with HbF expression. Blood 117, 4935–4945 (2011).
Stadhouders, R. et al. HBS1L-MYB intergenic variants modulate fetal hemoglobin via long-range MYB enhancers. J. Clin. Invest. 124, 1699–1710 (2014).
Mtatiro, S.N. et al. Genome wide association study of fetal hemoglobin in sickle cell anemia in Tanzania. PLoS One 9, e111464 (2014).
Bae, H.T. et al. Meta-analysis of 2040 sickle cell anemia patients: BCL11A and HBS1L-MYB are the major modifiers of HbF in African Americans. Blood 120, 1961–1962 (2012).
Ganesh, S.K. et al. Multiple loci influence erythrocyte phenotypes in the CHARGE Consortium. Nat. Genet. 41, 1191–1198 (2009).
Soranzo, N. et al. A genome-wide meta-analysis identifies 22 loci associated with eight hematological parameters in the HaemGen consortium. Nat. Genet. 41, 1182–1190 (2009).
Kamatani, Y. et al. Genome-wide association study of hematological and biochemical traits in a Japanese population. Nat. Genet. 42, 210–215 (2010).
Menzel, S., Garner, C., Rooks, H., Spector, T.D. & Thein, S.L. HbA2 levels in normal adults are influenced by two distinct genetic mechanisms. Br. J. Haematol. 160, 101–105 (2013).
van der Harst, P. et al. Seventy-five genetic loci influencing the human red blood cell. Nature 492, 369–375 (2012).
Chen, Z. et al. Genome-wide association analysis of red blood cell traits in African Americans: the COGENT Network. Hum. Mol. Genet. 22, 2529–2538 (2013).
Esvelt, K.M. et al. Orthogonal Cas9 proteins for RNA-guided gene regulation and editing. Nat. Methods 10, 1116–1121 (2013).
Mali, P., Esvelt, K.M. & Church, G.M. Cas9 as a versatile tool for engineering biology. Nat. Methods 10, 957–963 (2013).
Ran, F.A. et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature 520, 186–191 (2015).
Kleinstiver, B.P. et al. Broadening the targeting range of Staphylococcus aureus CRISPR-Cas9 by modifying PAM recognition. Nat. Biotechnol. 33, 1293–1298 (2015).
Kleinstiver, B.P. et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature 523, 481–485 (2015).
Zetsche, B. et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell 163, 759–771 (2015).
Doench, J.G. et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat. Biotechnol. 34, 184–191 (2016).
Kurita, R. et al. Establishment of immortalized human erythroid progenitor cell lines able to produce enucleated red blood cells. PLoS One 8, e59890 (2013).
Canver, M.C. & Orkin, S.H. Customizing the genome as therapy for the β-hemoglobinopathies. Blood 127, 2536–2545 (2016).
Sankaran, V.G. et al. MicroRNA-15a and -16-1 act via MYB to elevate fetal hemoglobin expression in human trisomy 13. Proc. Natl. Acad. Sci. USA 108, 1519–1524 (2011).
Rajagopal, N. et al. High-throughput mapping of regulatory DNA. Nat. Biotechnol. 34, 167–174 (2016).
Munoz, D.M. et al. CRISPR screens provide a comprehensive assessment of cancer vulnerabilities but generate false-positive hits for highly amplified genomic regions. Cancer Discov. 6, 900–913 (2016).
Aguirre, A.J. et al. Genomic copy number dictates a gene-independent cell response to CRISPR-Cas9 targeting. Cancer Discov. 6, 914–929 (2016).
Sanjana, N.E., Shalem, O. & Zhang, F. Improved vectors and genome-wide libraries for CRISPR screening. Nat. Methods 11, 783–784 (2014).
Pinello, L. et al. Analyzing CRISPR genome-editing experiments with CRISPResso. Nat. Biotechnol. 34, 695–697 (2016).
Findlay, G.M., Boyle, E.A., Hause, R.J., Klein, J.C. & Shendure, J. Saturation editing of genomic regions by multiplex homology-directed repair. Nature 513, 120–123 (2014).
Yang, L. et al. Targeted and genome-wide sequencing reveal single nucleotide variations impacting specificity of Cas9 in human stem cells. Nat. Commun. 5, 5507 (2014).
Doench, J.G. et al. Rational design of highly active sgRNAs for CRISPR-Cas9-mediated gene inactivation. Nat. Biotechnol. 32, 1262–1267 (2014).
Auton, A. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Shalem, O., Sanjana, N.E. & Zhang, F. High-throughput functional genomics using CRISPR-Cas9. Nat. Rev. Genet. 16, 299–311 (2015).
Chen, S. et al. Genome-wide CRISPR screen in a mouse model of tumor growth and metastasis. Cell 160, 1246–1260 (2015).
Shalem, O. et al. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science 343, 84–87 (2014).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Pinello, L., Xu, J., Orkin, S.H. & Yuan, G.-C. Analysis of chromatin-state plasticity identifies cell-type-specific regulators of H3K27me3 patterns. Proc. Natl. Acad. Sci. USA 111, E344–E353 (2014).
Whyte, W.A. et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319 (2013).
Grant, C.E., Bailey, T.L. & Noble, W.S. FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017–1018 (2011).
Mathelier, A. et al. JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles. Nucleic Acids Res. 42, D142–D147 (2014).
Solovieff, N. et al. Fetal hemoglobin in sickle cell anemia: genome-wide association studies suggest a regulatory region in the 5′ olfactory receptor gene cluster. Blood 115, 1815–1822 (2010).
Giarratana, M.C. et al. Proof of principle for transfusion of in vitro-generated red blood cells. Blood 118, 5071–5079 (2011).
Acknowledgements
We thank Z. Herbert, M. Berkeley, and M. Vangala (Dana-Farber Cancer Institute Molecular Biology Core Facility) for sequencing, F. Lu at the HHMI Sequencing facility, and members at the Hematologic Neoplasia Flow Cytometry and the Flow Cytometry Core facilities at the Dana-Farber Cancer Institute for cell sorting. We also thank J. Doench, M. Haeussler, J.-P. Concordet, R. Barretto, V. Sankaran, and J. Xu for helpful discussions. M.C.C. is supported by a National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) award (F30DK103359-01A1). L.P. is supported by a National Human Genome Research Institute (NHGRI) Career Development Award (K99HG008399). S.L. is funded by a Canadian Institutes of Health Research Banting Doctoral Scholarship. E.N.S. is supported by a Hematology Opportunities for the Next Generation of Research Scientists (HONORS) Award from the American Society of Hematology. G.-C.Y. is supported by awards from the National Heart, Lung, and Blood Institute (NHLBI) (R01HL119099). G.L. is funded by the Canada Research Program, the Montreal Heart Institute Foundation, and the Canadian Institutes of Health Research (MOP123382). A portion of the DNA genotyping was performed as part of the Biogen Sickle Cell Disease Consortium. D.E.B. is supported by NIDDK (K08DK093705, R03DK109232), NHLBI (DP2OD022716), the Burroughs Wellcome Fund, a Doris Duke Charitable Foundation Innovations in Clinical Research Award, an ASH Scholar Award, a Charles H. Hood Foundation Child Health Research Award, and a Cooley's Anemia Foundation Fellowship. S.H.O. is supported by an award from the NHLBI (P01HL032262) and an award from the NIDDK (P30DK049216, Center of Excellence in Molecular Hematology).
Author information
Contributions
M.C.C., D.E.B., and S.H.O. conceived this study. M.C.C. developed the DNA Striker computational tool and performed computational analysis of degrees of PAM saturation. M.C.C., Y.W., E.N.S., A.J.N., D.D.C., P.P.D., M.A.C., and J.Z. performed the experiments. S.L., Y.I., F.G., C.B., A.K., C.M., M.R., and G.L. performed the genotyping and genetic analysis. R.K. and Y.N. provided the HUDEP-2 cell line. M.C.C., S.L., Y.I., L.P., G.-C.Y., and G.L. performed computational and statistical analysis. D.E.B. and S.H.O. supervised this work. M.C.C., D.E.B., and S.H.O. wrote the manuscript with input from all authors.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–23 (PDF 23705 kb)
Supplementary Table 1
HbF-associated SNPs. Genome-wide significant SNPs from HbF meta-analysis. (XLSX 10 kb)
Supplementary Table 2
Previously published red blood cell trait-associated SNPs (ref. 22). (XLSX 29 kb)
Supplementary Table 3
Conditional analysis of HbF-associated SNPs. (XLSX 8 kb)
Supplementary Table 4
Genomic cleavage distribution for 8 PAM sequences by chromosome. (XLSX 11 kb)
Supplementary Table 5
Genomic cleavage distribution. Distances between adjacent genomic cleavages for 8 PAM sequences in (a) DHS, (b) enhancers, and (c) repressed regions for 9 ENCODE cell lines as well as (d) RefSeq gene annotations. (XLSX 13 kb)
Supplementary Table 6
NGG-restricted sgRNA library. (XLSX 392 kb)
Supplementary Table 7
NGA-restricted sgRNA library. (XLSX 546 kb)
Supplementary Table 8
sgRNA for Cas9 activity reporters. (XLSX 8 kb)
Supplementary Table 9
MYB shRNA sequences. (XLSX 8 kb)
Cite this article
Canver, M., Lessard, S., Pinello, L. et al. Variant-aware saturating mutagenesis using multiple Cas9 nucleases identifies regulatory elements at trait-associated loci. Nat Genet 49, 625–634 (2017). https://doi.org/10.1038/ng.3793