Key Points
- Before the development of second-generation sequencing, the consensus opinion based on both fossil evidence and genetic evidence (which is based on single-locus and multilocus data) regarding human origins tended to favour a single recent African origin over a multiregional evolution model. The sequencing of the archaic Neanderthal and Denisovan genomes has shown that much more complex intermediate models are needed to explain the data.
- The genome of an Australian Aborigine seems to provide evidence for two waves of migration through Asia shortly after anatomically modern humans (AMHs) left Africa ~60,000–50,000 years ago. However, recent estimates of the human mutation rate have raised the question of whether population divergence dates between Africans and non-Africans that are estimated from genetic data reflect the movement of people out of Africa or a much more complex demographic scenario that involves substantial ancient population structure and back migration.
- Whole-genome sequencing (WGS) data from modern hunter-gatherers in Africa show that the click-speaking Khoe–Sans, followed by African Pygmies, are the most diverged AMH population alive today and that the Khoe–Sans maintain some genetic link with other click-speakers in Tanzania. However, robust inference of population history in Africa will require approaches that take into account 'ghost' archaic hominins from which ancient DNA is unlikely to be retrieved in the near future.
- The geographical distribution of classical markers, non-recombining portions of the Y chromosome and mitochondrial DNA has resulted in substantial debate regarding the relative contributions of Paleolithic and Neolithic populations to the genetic ancestry of modern Europeans. Second-generation sequencing of ancient samples from these two prehistoric periods seems to suggest a major reshaping of European genetic diversity as part of the transition to a farming way of life.
- WGS of ancient genomes is providing evidence that the prehistoric colonization of the Americas involved multiple waves of migrations over the Bering Strait and that the source populations are likely to derive from a relatively heterogeneous gene pool, which reflects periods of substantial demographic change in East Asia and Siberia since the Last Glacial Maximum.
- Substantial efforts have been devoted to developing methods for inferring demographic history that can correct for, or are insensitive to, nucleotide calling errors that are produced during second-generation sequencing, especially when examining the allele frequency spectrum using low and medium coverage data sets. In addition, the sequentially Markovian coalescent model is providing the basis for many methods that are attempting to incorporate recombination into analytically novel methods, whereas third-generation sequencing technologies may provide extensive haplotype-phased data to fully exploit such approaches.
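As a rough illustration of the allele frequency spectrum mentioned in the last key point, the sketch below tabulates an unfolded and a folded spectrum from a small matrix of haplotypes. It is a minimal example under stated assumptions, not any published method: the function names and the toy data are hypothetical, alleles are assumed to be coded 0 (ancestral) and 1 (derived), and genotypes are assumed to be called without error, which is exactly the assumption that the error-aware methods referred to above are designed to relax.

```python
def site_frequency_spectrum(haplotypes):
    """Unfolded spectrum: entry i-1 counts the sites at which the derived
    allele (coded 1) is carried by exactly i of the n sampled chromosomes.
    Monomorphic sites (0 or n derived copies) are ignored."""
    n = len(haplotypes)
    num_sites = len(haplotypes[0])
    sfs = [0] * (n - 1)
    for site in range(num_sites):
        derived = sum(h[site] for h in haplotypes)
        if 0 < derived < n:
            sfs[derived - 1] += 1
    return sfs


def fold(sfs):
    """Folded spectrum: counts sites by minor-allele count, for use when
    the ancestral state is unknown."""
    n = len(sfs) + 1
    folded = [0] * (n // 2)
    for i, count in enumerate(sfs, start=1):
        minor = min(i, n - i)
        folded[minor - 1] += count
    return folded


# Hypothetical toy data: 4 chromosomes (rows) typed at 5 sites (columns).
toy_haplotypes = [
    [0, 1, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
]

unfolded = site_frequency_spectrum(toy_haplotypes)
print(unfolded)        # [1, 1, 1]
print(fold(unfolded))  # [2, 1]
```

With low or medium coverage, miscalled genotypes tend to distort the rare-variant end of the spectrum in particular, which is why the key point above stresses methods that correct for, or are insensitive to, calling errors rather than direct counting of this kind.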
Abstract
Examining patterns of molecular genetic variation in both modern-day and ancient humans has proved to be a powerful approach to learn about our origins. Rapid advances in DNA sequencing technology have allowed us to characterize increasing amounts of genomic information. Although this clearly provides unprecedented power for inference, it also introduces more complexity into the way we use and interpret such data. Here, we review ongoing debates that have been influenced by improvements in our ability to sequence DNA and discuss some of the analytical challenges that need to be overcome in order to fully exploit the rich historical information that is contained in the entirety of the human genome.
Acknowledgements
The authors thank J. Watkins, F. Mendez, A. Woerner, J. Burger and D. Caramelli for their comments on the manuscript and L. Johnstone for her help with figure 3. Support for this work was provided by the US National Institutes of Health to M.F.H. (R01_HG005226).
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary information S1 (box)
Inferring changes in effective population size (Ne). (PDF 221 kb)
Supplementary information S2 (box)
Construction of the PCA plot of aDNA samples (Figure 3B). (PDF 186 kb)
Glossary
- Mitochondrial DNA
-
(mtDNA). A circular piece of non-recombining DNA of ~16,000 bp that is found in the mitochondrion and that is inherited exclusively from the maternal parent.
- Non-recombining portion of the Y chromosome
-
(NRY). The middle ~95% of the Y chromosome that is passed from father to son and that does not undergo recombination during meiosis, thereby allowing inheritance of genetic ancestry to be traced exclusively down the paternal line.
- Uniparentally inherited systems
-
Genetic material in organisms with distinct sexes that is passed on to offspring through inheritance only from one sex; that is, mitochondrial DNA and the non-recombining portion of the Y chromosome.
- Short tandem repeat
-
(STR). A DNA sequence that contains a variable number (typically ≤50) of tandem repeated short sequence motifs of 2–6 bp, such as (GATA)n.
- Population structure
-
The distribution of individuals into partially isolated local subpopulations or demes that are interconnected by migration.
- Phylogeographical approaches
-
Methods that use the geographical distribution of genetic lineages, which are deduced from phylogenetic methods, to infer the demographic history of a set of individuals or populations.
- Model-based inference methods
-
Analyses that specify demographic models, investigate the model that best fits the genetic data and infer parameters of interest (such as population size changes, divergence times and migration events) for the best-fitting model.
- Anatomically modern humans
-
(AMHs). Individuals that are classified as Homo sapiens on the basis of the set of morphological characteristics that distinguish them from other, now extinct, members of the genus Homo (that is, archaic humans). According to the fossil record, AMHs emerged ~200,000–150,000 years ago.
- Reciprocal monophyly
-
The phenomenon whereby all lineages within a species are genealogically closer to each other (in this context, on the basis of the sharing of common genetic variants) than they are to any lineages in other species that are considered in a phylogeny.
- Clades
-
Groups of entities (such as genes or organisms) in a phylogenetic tree that have all arisen from a common ancestor.
- Introgression
-
Gene flow between populations or species whose individuals hybridize.
- Range expansions
-
Increases in the geographical distribution of a population through time from some region of origin.
- Linkage disequilibrium
-
(LD). The nonrandom association of alleles that are carried at different loci. LD can arise for various reasons (such as novel mutations, genetic drift, natural selection and admixture), but recombination is the main process that removes it.
- Founder events
-
Scenarios in which a new population is founded by a small number of incoming individuals. Similarly to a bottleneck, the founder effect severely reduces genetic diversity and increases the effect of random drift.
- Admixture
-
Gene flow between two or more groups that have been separated for a long enough period of time to be genetically distinct.
- Coalescence
-
A process that describes the genealogy of chromosomes or genes under a particular demographic model. The genealogy is constructed backwards in time and starts with the present-day sample. Lineages coalesce until the most recent common ancestor of the sample is reached.
- Melanesians
-
The putative indigenous inhabitants of the islands of Melanesia in the Pacific, which is a subregion of Oceania that includes the modern-day countries Papua New Guinea, Solomon Islands, Vanuatu and Fiji.
- D statistic
-
A statistic that detects admixture by examining patterns of allele sharing between single genomes of two sister populations and a more diverged third population that has putatively experienced gene flow to a greater degree with one of these two sister populations since they diverged; an outgroup is also used to determine the ancestral state of alleles.
- Derived alleles
-
Alleles that arise in a population following the mutation of an ancestral allele. Ancestral alleles can be distinguished from derived alleles, as the ancestral allele will typically be present in an outgroup species (for example, the chimpanzee sequence when examining variants in humans).
- Allele frequency spectrum
-
(AFS). A distribution of the counts of single-nucleotide polymorphisms with a given frequency in a single population or in multiple populations.
- 'Ghost' archaic hominin
-
Archaic hominin species for which there are currently no available genetic sequence data, although there may be fossil evidence. Including such species in demographic models of anatomically modern humans and of archaic species for which there are sequence data may improve the fit to the data, or at least provide an equally good fit.
- Effective population size
-
(Ne). Formulated by Wright in 1931, Ne reflects the size of an idealized population that would experience genetic drift in the same way as an actual population under study. Ne can be lower (and occasionally higher) than census population size owing to various factors, including variance in reproductive success, a history of population bottlenecks and reduced recombination.
- Haplogroups
-
Specific lineages of either mitochondrial DNA or the non-recombining portion of the Y chromosome that are defined by a genealogically concordant combination of alleles (that is, haplotypes) at slowly evolving binary markers.
- Levantine corridor
-
A narrow land route that lies between the coast of the Mediterranean Sea to the west and Arabian deserts to the east, which connects Africa to Europe and Asia. It is seen as a common prehistorical bidirectional route of movement for both flora and fauna (including the genus Homo).
- D4P statistic
-
A statistic that is similar to the D statistic in that it examines patterns of allele sharing as they relate to deviations from congruence between gene trees and population trees. However, D4P does not rely on an outgroup, such as chimpanzees, to infer ancestral and derived states; instead, it uses variants that are found in at least two of four test genomes, each from a different population. It is thus, among other things, more robust to sequencing errors.
- Identity by state
-
The condition in which two alleles have the same state (that is, the same sequence). Such alleles may or may not be identical by descent, owing to the possibility of multiple mutation events.
- Pairwise sequentially Markovian coalescent model
-
A specialization of the sequentially Markovian coalescent model that considers only two chromosomes.
- Cline
-
In the context of genetic data, a regular and directional change in genotype or allele frequencies across a geographical region.
- Demic diffusion model
-
A migration model in which populations diffuse into new geographical areas and displace or interbreed with indigenous populations.
- Isolation-by-distance model
-
A model in which the amount of gene flow between two locations decreases as a function of distance. At equilibrium, this model predicts that genetic differentiation increases as a function of geographical distance.
- Identity by descent
-
The phenomenon whereby two alleles are copies of the same allele that was carried in an ancestral individual.
- Principal component analysis
-
(PCA). A statistical method that is used to simplify a complex data set by transforming a series of correlated variables into a smaller number of uncorrelated variables known as principal components.
- Singleton
-
A genetic variant that is present in only a single chromosome from the sample analysed.
- Sequentially Markovian coalescent model
-
A simplification of the standard model of coalescence with recombination, such that the addition of crossover events while moving along the genome has a Markovian structure; that is, the addition of recombination at a given position along the genome depends only on the previous genealogy in which recombination was considered, rather than on the whole ancestral recombination graph from the beginning of the chromosome.
- Haplotype-phased
-
Pertaining to DNA sequence or genotyping data in an individual for which the combination of alleles contributed by each parental chromosome is resolved.
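As an illustration of three of the glossary entries above (D statistic, allele frequency spectrum and singleton), the following is a minimal sketch rather than the method used in any of the cited studies. It assumes one haploid sequence per population that is already aligned site by site, treats the outgroup allele as ancestral, and uses the sign convention D = (ABBA − BABA)/(ABBA + BABA), which differs between publications. The function names `d_statistic` and `allele_frequency_spectrum` are hypothetical, and the significance testing that published analyses typically apply is omitted.

```python
from collections import Counter


def d_statistic(p1, p2, p3, outgroup):
    """Return D for two sister genomes (p1, p2), a diverged genome (p3)
    and an outgroup, given as equal-length aligned haploid sequences."""
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if c == o:
            continue  # p3 carries the ancestral allele: site is uninformative
        if a == o and b == c:
            abba += 1  # derived allele shared by p2 and p3
        elif b == o and a == c:
            baba += 1  # derived allele shared by p1 and p3
    total = abba + baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites in the alignment")
    return (abba - baba) / total


def allele_frequency_spectrum(sites):
    """Return the unfolded AFS for rows of 0/1 alleles (0 = ancestral,
    1 = derived), one row per site and one column per chromosome.
    Entry i - 1 counts the sites whose derived allele is seen on exactly
    i chromosomes, for i = 1 .. n - 1."""
    n = len(sites[0])
    counts = Counter(sum(site) for site in sites)
    return [counts.get(i, 0) for i in range(1, n)]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(d_statistic("ATG", "CTA", "CAG", "ATA"))
    afs = allele_frequency_spectrum([[1, 0, 0, 0], [1, 1, 0, 0],
                                     [0, 0, 1, 0], [1, 1, 1, 0]])
    print(afs, "singletons:", afs[0])
```

Under this convention, a positive D indicates that the second genome shares more derived alleles with the third genome than the first genome does, and the first entry of the spectrum is the number of singletons in the sample.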
Cite this article
Veeramah, K., Hammer, M. The impact of whole-genome sequencing on the reconstruction of human population history. Nat Rev Genet 15, 149–162 (2014). https://doi.org/10.1038/nrg3625