Key Points
- Variations in the rate of protein evolution are determined by biases in the mutation rate and fixation rate (which are either protein specific or linked to genomic location).
- By drawing on accumulating genomic data, evolutionary studies have moved from studying individual proteins to characterizing global cellular factors.
- Protein-specific biases in fixation rate are due to differences in both purifying and positive selection across genes.
- Although theoretical considerations that are based on purifying selection suggest that the importance of a gene (or its dispensability) is a key determinant of protein evolution, experimental data confirm at best a moderate influence.
- An important concept in thinking about protein evolution is fitness density, that is, measuring the weighted fraction of sites at which mutations result in phenotypes with modified fitness.
- Selection on protein structure and stability is presumably responsible for the largest contribution to fitness density.
- The position of a protein in biological networks seems to be only of minor importance, despite much recent excitement.
- Broadly expressed and highly expressed proteins evolve slowly; expression level is by far the strongest predictor of evolutionary rate in yeast (possibly because of selection for robust folding in highly expressed proteins).
- Some recent studies suggest that a large fraction (∼30%) of amino-acid changes might be driven by positive selection, contrary to expectations that are based on the (nearly) neutral theory.
- Positive selection often reflects compensatory mutations or arms races rather than adaptation.
- Further research is needed to understand the relative importance of the different factors that affect protein evolution; future studies will be most effective if combined with the development of a coherent theory that is based on population genetics models.
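The notion of fitness density as a weighted fraction of sites can be made concrete with a small sketch. This is an illustration only, not a method from the article: the function name `fitness_density` and the convention that each site carries a weight between 0 (mutations have no fitness effect) and 1 (mutations have the strongest effect) are assumptions introduced here.

```python
from typing import Sequence


def fitness_density(site_weights: Sequence[float]) -> float:
    """Return the weighted fraction of sites under selection.

    site_weights[i] is a hypothetical weight in [0, 1] that summarizes
    how strongly mutations at site i modify fitness (assumed scaling,
    not taken from the article).
    """
    if not site_weights:
        raise ValueError("a protein must have at least one site")
    if any(w < 0.0 or w > 1.0 for w in site_weights):
        raise ValueError("weights must lie in [0, 1]")
    # Average weight per site: unconstrained sites contribute 0,
    # fully constrained sites contribute 1.
    return sum(site_weights) / len(site_weights)


# Toy protein with five residues: one fully constrained site,
# two partially constrained sites and two unconstrained sites.
toy_weights = [1.0, 0.5, 0.0, 0.0, 0.5]
print(fitness_density(toy_weights))
```

Under this toy weighting, a protein in which most sites tolerate substitutions has a low fitness density, whereas a protein constrained at many sites (for example, by stability requirements) has a high one.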
Abstract
Why do proteins evolve at different rates? Advances in systems biology and genomics have facilitated a move from studying individual proteins to characterizing global cellular factors. Systematic surveys indicate that protein evolution is not determined exclusively by selection on protein structure and function, but is also affected by the genomic position of the encoding genes, their expression patterns, their position in biological networks and possibly their robustness to mistranslation. Recent work has allowed insights into the relative importance of these factors. We discuss the status of a much-needed coherent view that integrates studies on protein evolution with biochemistry and functional and structural genomics.
Acknowledgements
The authors wish to thank L. Hurst and L. Loewe for their insightful comments. C.P. and B.P. are supported by the Hungarian Scientific Research Fund (OTKA). C.P. is also supported by an EMBO (European Molecular Biology Organization) Long-term Fellowship. B.P. is a fellow of the Human Frontier Science Program. M.J.L. acknowledges financial support from the DFG (Deutsche Forschungsgemeinschaft).
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Glossary
- Genetic drift
  The stochastic changes in allele frequencies in a population that occur owing to random sampling effects in the formation of successive generations.
- Purifying selection
  The removal of a deleterious genetic variant from the population owing to the reduced reproductive success of its carriers.
- Positive selection
  The accelerated spread of a beneficial genetic variant in the population owing to the increased reproductive success of its carriers.
- Dispensability
  A measure that is inversely related to the overall importance of a gene. It is usually approximated by the fitness (or growth rate) of the corresponding gene knockout strain under various laboratory conditions.
- Transition matrix
  A matrix that contains the probabilities of each type of amino-acid substitution for a given period of evolution.
- Maximum-likelihood framework
  A method that takes a model (for example, of sequence evolution) and searches for the combination of parameter values that best describes the observed data (for example, the aligned sequences).
- Synonymous (change)
  A nucleotide change in the protein-coding region of a gene that leaves the encoded amino acid unchanged.
- Nearly neutral (mutation)
  A mutation is nearly neutral when its fitness effect is too small to be governed only by selection, and so its fate is determined largely by genetic drift.
- Non-synonymous (change)
  A nucleotide change in the protein-coding region of a gene that alters the encoded amino acid.
- Interference (Hill–Robertson effects)
  A phenomenon that describes a reduction in the efficiency at which selection functions simultaneously at genetically linked sites, especially in regions of low recombination.
- Fitness density
  The proportion of residues in a protein that are under natural selection, with the contribution of each site weighted by the fitness effects of mutations. Besides functional requirements, selection can favour many fitness components, including stability and robustness against errors. Therefore, fitness density is expected to be higher than functional density.
- Imprinted gene
  A gene in which expression is determined by the parent from which it is inherited.
- Effective population size
  The number of individuals in a population that contribute to the next generation. It is generally much smaller than the number of individuals in the population, and is influenced by factors that include population structure, sex ratio, mating system and age distribution.
- Essential protein
  One for which deletion of the encoding gene results in a lethal phenotype, which is usually measured under laboratory conditions.
- Orthologous
  Proteins that are encoded by genes that evolved from a common ancestral gene through speciation.
- Protein designability
  The number of possible amino-acid sequences that are compatible with a given protein structure.
- Overdispersion
  When the variance in the substitution rate across lineages exceeds its mean. This indicates that the substitution process does not follow a Poisson distribution.
- Module
  A discrete entity that is isolated through spatial localization, gene-expression pattern, chemical specificity or position in a biological network (for example, a protein complex, or a metabolic or signal-transduction pathway). Ideally, the biological function of a module is separable from that of other modules.
- Overlapping reading frames
  Adjacent protein-coding genes that share one or more nucleotides.
- Sexual selection
  Competition among members of one sex for mating opportunities with the other sex.
- Gene conversion
  Non-reciprocal transfer between a pair of non-allelic or allelic DNA sequences during meiosis and mitosis, such that the receiving sequence becomes more similar to the donating sequence.
- Codon usage bias
  The non-random usage of synonymous codons for the same amino acid.
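Two of the glossary entries above lend themselves to a short illustration: the distinction between synonymous and non-synonymous changes, and the variance-to-mean comparison that underlies overdispersion. The following is a minimal sketch, assuming the standard genetic code; the function names `classify_change` and `dispersion_index` and the example substitution counts are hypothetical and are not taken from the article.

```python
from itertools import product
from statistics import mean, variance

# Standard genetic code, codons enumerated in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(codon: str, position: int, new_base: str) -> str:
    """Classify a single-nucleotide change within one codon.

    position is 0-based (0, 1 or 2). Returns 'synonymous' if the encoded
    amino acid is unchanged, otherwise 'non-synonymous'.
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    if mutated == codon:
        raise ValueError("new_base must differ from the original base")
    same = CODON_TABLE[codon] == CODON_TABLE[mutated]
    return "synonymous" if same else "non-synonymous"


def dispersion_index(substitution_counts: list[int]) -> float:
    """Ratio of sample variance to mean of per-lineage substitution counts.

    For a Poisson process the expected ratio is 1; values well above 1
    are what the glossary calls overdispersion.
    """
    return variance(substitution_counts) / mean(substitution_counts)


# GGT and GGC both encode glycine; GAA (glutamate) to GTA (valine) does not.
print(classify_change("GGT", 2, "C"))
print(classify_change("GAA", 1, "T"))

# Hypothetical substitution counts for one protein in four lineages.
print(dispersion_index([4, 10, 1, 9]))
```

The dispersion index here uses the sample variance; with only a handful of lineages the estimate is noisy, so the ratio is best read as a rough indicator rather than a test.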
Cite this article
Pál, C., Papp, B. & Lercher, M. An integrated view of protein evolution. Nat Rev Genet 7, 337–348 (2006). https://doi.org/10.1038/nrg1838