Abstract
Understanding how proteins and their complex interaction networks convert genomic information into a dynamic living organism is a fundamental challenge in the biological sciences. As an important step towards understanding the systems biology of a complex eukaryote, we cataloged 63% of the predicted Drosophila melanogaster proteome by detecting 9,124 proteins from 498,000 redundant and 72,281 distinct peptide identifications. This unprecedentedly high proteome coverage for a complex eukaryote was achieved by combining sample diversity, multidimensional biochemical fractionation and analysis-driven experimentation feedback loops, whereby data collection is guided by statistical analysis of prior data. We show that high-quality proteomics data provide crucial information to amend genome annotation and to confirm many predicted gene models. We also present experimentally identified proteotypic peptides matching ∼50% of D. melanogaster gene models. This library of proteotypic peptides should enable fast, targeted and quantitative proteomic studies to elucidate the systems biology of this model organism.
Acknowledgements
We thank Bernd Roschitzki, Bertran Gerrits, Eva Niederer, Marko Jovanovic, Cristian Köpfli and Michael Walser for technical help, Hans Jespersen and Soeren Schandorff from Proxeon Bioinformatics for discussions regarding the proteotypic peptide data analysis and Hubert K. Rehrauer for help with statistical analysis. The project was funded by the University Research Priority Program Systems Biology/Functional Genomics of the University of Zurich. E.B., S.M., S.S. and S.L. are members of the Center for Model Organism Proteomes (C-MOP) which is funded by the University of Zurich (http://www.mop.unizh.ch). S.L. was supported by a Career Development Award of the University of Zurich. This work was also supported in part by a UBS grant to E.B. and K. Basler, and with federal funds from the US National Heart, Lung, and Blood Institute, National Institutes of Health under contract No. N01-HV-28179.
Author information
Contributions
E.B. and S.M. conducted most of the experimental work; E.B. performed most of the LTQ measurements, coordinated the interdisciplinary project and carried out the proteomics-based genome annotation work; S.M. performed the GO analyses; C.H.A. coordinated and carried out the bioinformatic analyses of the data set (Pfam analysis with C.P.), and conceived the ADE strategy; H.B. established the necessary statistical infrastructure for the project, carried out statistical analyses together with C.H.A. and performed the simulations; S.L. implemented the SBEAMS database, generated the Drosophila Peptide Atlas (supported by E.W.D.) and supported E.B. with the proteomics-based genome annotation work; F.P. and C.P. (supported by E.W.D.) implemented and maintained the computational infrastructure at the Functional Genomics Center Zurich (FGCZ); U.L. provided the protein parameter computations; O.R. performed the gel-filtration experiments; H.L. helped with the setup of the LTQ instrument and LTQ MS analyses; P.G.A.P. generated the software for FCF calculations; J.M. set up the Free Flow Electrophoresis system and helped with experiments; K.K. and S.S. helped with selected experiments; F.K. helped with LCQ and the initial experimental strategy; J.K. shared large data sets and performed the LTQ and LTQ-FT-ICR measurements in the lab of A.J.R.H.; R.S. supported the project with FGCZ resources; E.H. and R.A. initiated the project and provided intellectual and financial support; R.A. carried senior authorship responsibility.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Fig. 1
Schematic drawing that depicts the various steps and procedures used to prepare samples for mass spectrometric analysis. (PDF 381 kb)
Supplementary Fig. 2
Visualization of the statistical analysis of protein parameter distributions of all Drosophila proteins (population) and a subset of experimentally identified proteins (sample) using graphs of cumulative distributions and combined histograms. (PDF 524 kb)
Supplementary Table 1
The Drosophila melanogaster proteome based on Berkeley Drosophila Genome Project (BDGP) release 3.2. (DOC 27 kb)
Supplementary Table 2
Overview of all experiments grouped by developmental stage or cell line. (DOC 121 kb)
Supplementary Table 3
PFAM and GO slim analysis. (DOC 324 kb)
Supplementary Table 4
List of unvalidated peptides identified by cross-comparative database searches. (XLS 60 kb)
Supplementary Table 5
List of experimentally observed proteotypic peptides (PTPs). (XLS 2743 kb)
About this article
Cite this article
Brunner, E., Ahrens, C., Mohanty, S. et al. A high-quality catalog of the Drosophila melanogaster proteome. Nat Biotechnol 25, 576–583 (2007). https://doi.org/10.1038/nbt1300
DOI: https://doi.org/10.1038/nbt1300