Abstract
Comprehensive identification of all functional elements encoded in the human genome is a fundamental need in biomedical research. Here, we present a comparative analysis of the human, mouse, rat and dog genomes to create a systematic catalogue of common regulatory motifs in promoters and 3′ untranslated regions (3′ UTRs). The promoter analysis yields 174 candidate motifs, including most previously known transcription-factor binding sites and 105 new motifs. The 3′-UTR analysis yields 106 motifs likely to be involved in post-transcriptional regulation. Nearly one-half are associated with microRNAs (miRNAs), leading to the discovery of many new miRNA genes and their likely target genes. Our results suggest that previous estimates of the number of human miRNA genes were low, and that miRNAs regulate at least 20% of human genes. The overall results provide a systematic view of gene regulation in the human, which will be refined as additional mammalian genomes become available.
Acknowledgements
We thank B. Birren, M. Kamal, K. O'Neill and A. Subramanian for advice and discussions. This work was supported in part by grants from the National Human Genome Research Institute.
Ethics declarations
Competing interests
The authors declare that they have no competing financial interests.
Supplementary information
Supplementary Notes
This file contains details of the Supplementary Methods used in the paper. It also contains two Supplementary Figures, 11 Supplementary Tables and additional references. (PDF 592 kb)
About this article
Cite this article
Xie, X., Lu, J., Kulbokas, E. et al. Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals. Nature 434, 338–345 (2005). https://doi.org/10.1038/nature03441
DOI: https://doi.org/10.1038/nature03441