Introduction

The Lower Xiajiadian culture (LXC) was a main branch of the bronze culture of northern China, dating to 4500–3500 years ago and found mainly in the West Liao-River valley (Figure 1). It was a flourishing civilization characterized by a complex social structure, a highly developed agricultural economy, distinctive painted pottery and elaborate artifacts.1 Its ethnic composition and its relationship with the Central Plain, an important site for Chinese civilization, have been the focus of multi-disciplinary research. The LXC was replaced abruptly by a totally different culture, the Upper Xiajiadian culture (UXC), between 2900 and 2700 a BP.2 The UXC absorbed and inherited many of the strong characteristics of the Bronze Age cultures that developed in the steppes of northern China. As no archaeological sites with transitional links or intermediate forms have been found, the cause of the transition from farming back to a pasturing lifestyle in the prehistoric West Liao-River valley has been debated for more than a century. Of particular interest is whether population replacement or gene flow accompanied the cultural transition from the LXC to the UXC.

Figure 1

Geographic location of the Dadianzi site. A full color version of this figure is available at the Journal of Human Genetics journal online.

The Dadianzi site, located in Chifeng, Inner Mongolia Autonomous Region of China, was dated to about 3600 years ago by 14C testing. Cultural relics such as painted pottery jars and bronze wares, together with the burial practices, presented a typical Lower Xiajiadian cultural identity.3 As a rare well-preserved burial ground of the LXC, the Dadianzi site provides us with a valuable opportunity to solve the mysteries of the LXC using modern molecular tools.

In this study, two uniparentally inherited markers, mitochondrial DNA (mtDNA) and Y-chromosome single-nucleotide polymorphisms (Y-SNPs), were analyzed in 14 human remains excavated from the Dadianzi site in order to study the genetic characteristics of the LXC population. By comparing the ancient Dadianzi DNA with that of ancient and extant populations in the West Liao-River valley, the Central Plain and other surrounding regions in Asia, we reconstructed the migration history and evaluated the genetic continuity in this area. This information will contribute to the understanding of the chief factor(s) involved in the formation and transition of culture in this area.

Materials and methods

Samples

The Dadianzi site (42°18′ N, 120°36′ E) is located in northeast China, where the annual average temperature is 4 °C–6 °C. The cold and dry climate is favourable for DNA conservation. Well-preserved molars were collected from 14 human remains for DNA analysis to minimize the possibility of modern DNA contamination. The archaeological and anthropological data of the ancient individuals are shown in Table 1.

Table 1 Sampling information from the Dadianzi site

Contamination precautions

In this study, standard contamination precautions were followed as closely as possible to ensure the accuracy and reliability of the results.4, 5 Isolated rooms were used for different experimental steps (three rooms for sample preparation, DNA extraction and PCR amplification). Pre-PCR and post-PCR procedures were thus carried out in separated areas, and strict cleaning procedures were performed by regular treatment with 10% liquid sodium hypochlorite and UV light (254 nm). Full-body protective clothing, facemasks and gloves (frequently changed) were used during all handling processes. All consumables, although purchased as DNA-free, were sterilized at 121 °C for 15 min, and reagents were further UV irradiated before use for at least 20 min. Extraction and amplification blank samples were included in every PCR assay as negative controls. To identify potential laboratory-based contamination, the mitochondrial hypervariable I sequences of all researchers were obtained and compared with those of the samples. In addition, only female researchers were involved in the pre-PCR procedures in the Y-SNP study, preventing possible contamination from modern Y-chromosome DNA.
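The comparison of researcher and sample HVS-I sequences described above can be sketched as a simple lookup. This is a minimal illustration, not the procedure actually used in this study: the function name `find_shared_haplotypes`, the identifiers and the variant lists are all hypothetical, and each haplotype is assumed to be recorded as a set of differences from the reference sequence.

```python
def find_shared_haplotypes(samples, researchers):
    """Return (sample_id, researcher_id) pairs with identical HVS-I variant sets.

    Both arguments map an identifier to an iterable of variant labels
    (for example "16223T"). All identifiers and variants are hypothetical.
    """
    matches = []
    for sample_id, sample_variants in sorted(samples.items()):
        for researcher_id, researcher_variants in sorted(researchers.items()):
            if set(sample_variants) == set(researcher_variants):
                matches.append((sample_id, researcher_id))
    return matches


# Hypothetical example data, invented purely for illustration.
example_samples = {
    "ancient_A": ["16223T", "16362C"],
    "ancient_B": ["16189C", "16223T", "16362C"],
}
example_researchers = {
    "researcher_1": ["16129A", "16223T"],
    "researcher_2": ["16304C"],
}

# An empty list means that no ancient sequence is identical to a researcher's.
shared = find_shared_haplotypes(example_samples, example_researchers)
```

An empty result does not by itself prove authenticity; it only excludes the listed researchers as an obvious source of an identical sequence.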

Ancient DNA extraction, amplification and purification

Tooth samples were immersed in 5% liquid sodium hypochlorite for 20 min, and washed using ultra-pure water and 100% alcohol. Each side of the tooth was then exposed to UV light for 30 min. The teeth were ground to fine powder in liquid nitrogen in a 6750 Freezer Mill (Spex SamplePrep, Metuchen, NJ, USA) and stored at −20 °C. Tooth powder (0.5 g) was incubated in 3 ml of solution containing 0.45 M ethylenediaminetetraacetic acid, 0.5% SDS and 0.7 mg ml−1 Proteinase K at 50 °C in a shaker (220 r.p.m.) for 24 h. DNA was extracted using the QIAquick PCR Purification Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol.

Two sets of overlapping primers were used to amplify the mtDNA HVS-I between positions 16 035–16 409 (Table 2). PCR amplification was carried out in 25 μl of a reaction mixture containing 2 μl extract, 1.5 × reaction buffer (Fermentas, Burlington, Canada), 1 U of Taq polymerase (Fermentas), 2.5 mM MgCl2 (Fermentas), 0.2 mM dNTP Mix (Promega, Madison, WI, USA), 0.8 mg ml−1 BSA (Takara, Da Lian, China) and 0.2 μM of each primer (Sangon, Shanghai, China). PCR conditions were: initial denaturation at 94 °C for 4 min, followed by 33 cycles of 94 °C for 40 s, 52 °C–54 °C for 45 s and 72 °C for 40 s, with a final extension of 10 min at 72 °C and storage at 4 °C. Amplification products were purified using the QIAquick Gel Extraction Kit (Qiagen).
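As a worked check of the reaction set-up, the short sketch below converts the stated final concentrations into absolute amounts per 25 μl reaction and sums the programmed hold times of the cycling profile. The numbers are taken from the paragraph above; the variable and function names are ours, and ramp times between temperatures are not included because they depend on the thermal cycler.

```python
import math

REACTION_VOLUME_UL = 25.0

def amount_in_reaction(final_concentration, volume_ul=REACTION_VOLUME_UL):
    """Multiply a final concentration by the reaction volume in microlitres.

    Unit bookkeeping: uM x ul gives pmol, mM x ul gives nmol,
    and mg/ml x ul gives ug.
    """
    return final_concentration * volume_ul

primer_pmol = amount_in_reaction(0.2)   # 0.2 uM of each primer
dntp_nmol = amount_in_reaction(0.2)     # 0.2 mM dNTP mix
mgcl2_nmol = amount_in_reaction(2.5)    # 2.5 mM MgCl2
bsa_ug = amount_in_reaction(0.8)        # 0.8 mg/ml BSA

# Cycling profile: 4 min, then 33 cycles of 40 s + 45 s + 40 s, then 10 min.
INITIAL_DENATURATION_S = 4 * 60
CYCLES = 33
CYCLE_STEPS_S = (40, 45, 40)
FINAL_EXTENSION_S = 10 * 60

def programmed_hold_time_s():
    """Total programmed hold time in seconds, excluding ramping and the 4 °C hold."""
    return INITIAL_DENATURATION_S + CYCLES * sum(CYCLE_STEPS_S) + FINAL_EXTENSION_S

total_s = programmed_hold_time_s()  # 4965 s, about 83 min
```

Under these figures each reaction contains 5 pmol of each primer and 20 μg of BSA, which may be convenient when preparing a master mix.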

Table 2 All primers used in this study

mtDNA sequencing and SNP typing

Amplification products were sequenced directly using the ABI 310 Terminator Sequencing Kit (Applied Biosystems, Foster City, CA, USA) according to the manufacturer's instructions. Sequence reaction products were analyzed on an ABI PRISM 310 automated DNA sequencer.

To validate the mtDNA haplogroup, key SNPs of the mitochondrial coding regions were also typed. Seven sets of primers (for haplogroups M/N, D, D4, F, M7, M9 and Z) were used to amplify the mtDNA coding sequences by amplified product-length polymorphisms analysis.6, 7 Haplogroups A, G and M10 were typed by sequencing. The PCR reaction conditions were the same as those for the mitochondrial HVS-I amplification.

Sex identification and Y-chromosome SNP typing

The amelogenin fragment was amplified using primers shown in Table 2 for sex determination in all samples, and male samples were chosen for further analysis. We screened all male samples with four bi-allelic markers (M89-F, M9-K, M214-NO and M45-P) that define the major branches on the Eurasian haplogroup tree.8, 9 Subsequent analysis was restricted to markers (M231-N, Tat-N1c, M175-O, M119-O1, M95-O2, M122-O3, M242-Q and M173-R) on the appropriate sub-branch of the haplogroup tree.8, 10 The PCR reaction conditions were the same as those for the mitochondrial HVS-I amplification, but the PCR products were all approximately 100–130 bp in length. All primers are listed in Table 2.

Cloning of PCR products

The mtDNA HVS-I products were cloned using the pGEM-T Easy Vector System I (Promega) according to the manufacturer's instructions. The remains of eight individuals were randomly selected for cloning analysis. For each, six to ten clones from two independent amplifications were selected for automated DNA sequencing, using vector M13 primers. As damaged DNA or jumping PCR would not result in mistakes in the determination of Y-SNP alleles, we did not clone the PCR products of the Y-SNP.10
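One common way to summarize several clone sequences is a per-position majority consensus, in which sporadic substitutions caused by DNA damage appear in only a minority of clones and are outvoted. The sketch below illustrates that idea only; the text does not state how the clone sequences were compared with the direct sequences, so the majority rule, the function name `majority_consensus` and the toy sequences are our assumptions.

```python
from collections import Counter

def majority_consensus(clones):
    """Return a per-position majority consensus of equal-length, aligned clones.

    A position is called only if one base occurs in more than half of the
    clones; otherwise it is reported as "N".
    """
    if not clones:
        raise ValueError("at least one clone sequence is required")
    if len({len(clone) for clone in clones}) != 1:
        raise ValueError("clone sequences must be aligned to equal length")
    consensus = []
    for column in zip(*clones):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count > len(clones) / 2 else "N")
    return "".join(consensus)


# Toy example: two clones carry isolated substitutions that do not recur.
toy_clones = ["ACGT", "ACGT", "ATGT", "ACGT", "ACGA", "ACGT"]
toy_consensus = majority_consensus(toy_clones)  # "ACGT"
```

A consensus that agrees with the direct sequence, with deviations confined to single clones, is the pattern expected for authentic but damaged templates.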

Data analysis

Sequence alignments were analyzed using CLUSTAL X1.83 (http://www.clustal.org/download/1.X/ftp-igbmc.u-strasbg.fr/pub/ClustalX/). Comparison of DNA sequence homology was performed with the Blast program from the National Center for Biotechnology Information. An analysis of the molecular variance was performed on the 393 bp HVS-I sequences (np spanning 16 017–16 409), using ARLEQUIN 3.11.11 Fst values were considered significantly different with P-values under the threshold of 0.05. All results obtained for the comparison between the Dadianzi population and each population of the database were graphically plotted on a map with Surfer v.8.0 (Golden Software, Golden, CO, USA), using the location of each population given in the corresponding study.
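To make the quantity being compared more concrete, the sketch below computes a pairwise Fst from aligned sequences using mean pairwise differences within and between two populations, Fst = 1 − Hw/Hb. This is a simplified estimator chosen for illustration; it is not the AMOVA-based computation or the permutation test performed by ARLEQUIN, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations, product

def count_differences(seq_a, seq_b):
    """Number of positions at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))

def mean_within(population):
    """Mean pairwise difference within one population (needs >= 2 sequences)."""
    pairs = list(combinations(population, 2))
    return sum(count_differences(a, b) for a, b in pairs) / len(pairs)

def mean_between(pop_1, pop_2):
    """Mean pairwise difference between sequences drawn from two populations."""
    pairs = list(product(pop_1, pop_2))
    return sum(count_differences(a, b) for a, b in pairs) / len(pairs)

def pairwise_fst(pop_1, pop_2):
    """Fst = 1 - Hw / Hb, with Hw the average of the two within-population means."""
    h_within = (mean_within(pop_1) + mean_within(pop_2)) / 2
    h_between = mean_between(pop_1, pop_2)
    if h_between == 0:
        return 0.0  # all sequences identical: no differentiation to measure
    return 1 - h_within / h_between


# Toy populations (hypothetical four-base "sequences").
fixed = pairwise_fst(["AAAA", "AAAA"], ["TTAA", "TTAA"])    # 1.0
partial = pairwise_fst(["AAAA", "AAAT"], ["TTAA", "TTAT"])  # 0.6
```

With small samples this estimator can return negative values, which are usually interpreted as no detectable differentiation.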

Results and Discussion

Sequence authentication

Strict procedures and systematic controls were instituted to minimize the potential for exogenous DNA contamination. All the PCR controls as well as extraction controls yielded negative results. At least two extractions and two amplifications of different extractions were carried out on different teeth for each sample to assess the reproducibility of the results. All the sequences of the ancient individuals were confirmed to be different from those of the laboratory researchers (Supplementary Table S1). Correspondence of haplogroup inference was found between coding and control region data, and the sequencing of the clones further confirmed the results obtained by direct sequencing (Supplementary Table S2).

An inverse correlation between the size of the PCR amplicons and the amplification efficiency for the samples (138 bp > 209 bp > 235 bp > 363 bp) was found in this study. Molecular sex identification results were in accordance with morphological sex assignments (Table 1), providing confidence for the presence of endogenous nuclear DNA. In order to validate the results generated in our laboratory, human remains from two samples (S9 and S12) were sent to the ancient DNA laboratory of Fudan University for further analysis, and identical results were obtained. All of these independent safeguards and checks provide confidence that the data obtained with these ancient samples are authentic.

mtDNA and Y-SNP analysis

Reproducible sequences were obtained from all 14 individual remains. The 393 bp fragments of the mtDNA HVS-I were compared with the revised Cambridge Reference Sequence.12 There were a total of 23 polymorphic sites, including 22 transitions and 1 transversion (16 232 C → A). Based on the HVS-I and coding region data combined with the eastern Eurasian mtDNA classification tree,7, 13, 14 the 14 sequences, which contained 13 haplotypes, were assigned to 9 haplogroups (Table 3). The dominant haplogroup in the Dadianzi people was D4, shared by five individuals who were associated with four different haplotypes. The other haplotype belonging to haplogroup D in the Dadianzi population was designated as D5 by the mutation at site 16 189 (T to C). The haplogroup M7c included two haplotypes, carried by two individuals among the ancient Dadianzi people. The other haplogroups, including A4, F1b, G1a, M9a, M10 and M8z, were each present in one individual.
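The distinction between transitions and transversions used above follows directly from base chemistry: a transition exchanges a purine for a purine or a pyrimidine for a pyrimidine, whereas a transversion exchanges one class for the other. A minimal sketch follows, with a function name of our choosing:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(reference_base, observed_base):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, obs = reference_base.upper(), observed_base.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or obs not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == obs:
        raise ValueError("reference and observed bases are identical")
    same_class = ({ref, obs} <= PURINES) or ({ref, obs} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


# The two substitutions named in the text:
site_16232 = classify_substitution("C", "A")  # "transversion"
site_16189 = classify_substitution("T", "C")  # "transition"
```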

Table 3 Nucleotide changes of mtDNA and Y-SNP in the 14 Dadianzi specimens

Seven male samples among the 14 individuals were chosen for Y-chromosome SNP typing. Three samples (S1, S2 and S13) exhibited the mutations M89 C → T, M9 C → G, M214 T → C and M231 G → A, and were attributed to haplogroup N (×N1c). Two samples (S8 and S12) exhibited the mutations M89 C → T, M9 C → G, M175 5 bp del and M122 T → C, and belonged to haplogroup O3 (M122). We failed to obtain any product from two samples (S5 and S14) (Table 3).
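The hierarchical logic of the Y-SNP assignment can be sketched as a walk down a marker tree: the haplogroup reported is that of the deepest marker found in the derived state. The marker-to-haplogroup pairs below are those listed in the Methods; the parent relationships encode the standard nesting of these branches (F > K > NO > N, O; K > P > Q, R), and the function names are hypothetical. The sketch does not distinguish N (×N1c) from N, nor does it handle missing or conflicting calls.

```python
# marker: (haplogroup label, parent marker or None)
MARKER_TREE = {
    "M89": ("F", None),
    "M9": ("K", "M89"),
    "M214": ("NO", "M9"),
    "M45": ("P", "M9"),
    "M231": ("N", "M214"),
    "Tat": ("N1c", "M231"),
    "M175": ("O", "M214"),
    "M119": ("O1", "M175"),
    "M95": ("O2", "M175"),
    "M122": ("O3", "M175"),
    "M242": ("Q", "M45"),
    "M173": ("R", "M45"),
}

def marker_depth(marker):
    """Number of steps from a marker up to the root of the tree (M89)."""
    depth = 0
    while MARKER_TREE[marker][1] is not None:
        marker = MARKER_TREE[marker][1]
        depth += 1
    return depth

def assign_haplogroup(derived_markers):
    """Return the haplogroup of the deepest derived marker, or None if none is known."""
    known = sorted(m for m in derived_markers if m in MARKER_TREE)
    if not known:
        return None
    deepest = max(known, key=marker_depth)
    return MARKER_TREE[deepest][0]


# Derived markers reported in the text for the two groups of samples.
group_n = assign_haplogroup({"M89", "M9", "M214", "M231"})   # "N"
group_o3 = assign_haplogroup({"M89", "M9", "M175", "M122"})  # "O3"
```

Typing the upstream markers first, as described in the Methods, limits the number of downstream reactions needed for each sample.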

Genetic characteristics of the LXC population

Due to its particular geographic location, the West Liao-River valley was a contact zone between northern steppe tribes and the Central Plain farming population. The formation and development of the LXC population was likely a complex process affected by admixture of ethnically different people. An archaeological study showed that the shapes and decorative patterns of ancient painted potteries were influenced by the Erlitou culture, which existed just before the LXC time in the Central Plain.15 Moreover, the climate of the West Liao-River valley was warmer at the beginning of the Early Bronze Age and was suitable for agricultural development, which may be one of the driving forces for the northward migration of the Central Plains farming population.

In the present study, mtDNA profiles of the Dadianzi population suggested that they mainly comprised northeast Asian predominant haplogroups such as A, D, G and M9a. Some haplogroups widespread in Northern and Eastern Asia such as M7c, F1b and M8 were also detected (Table 4). In China, there is a distinct north–south geographic genetic cline for maternal lineages.16 The Central Plain, as the name suggests, is located in the middle of China and hence possesses both northern and southern dominant haplogroups at medium frequencies. The exception is haplogroup M10, which ancient and extant Central Plain populations show to have its highest frequency in this area (Table 4). Haplogroup M10 also occurs at a high frequency in the West Liao-River valley in modern times (5.9%) as well as in the ancient population in the present study (7.2%), but it is rarely found in other places of China. It is worth noting that M10 appeared at an extremely high frequency (28.0%) in the ancient Taoshi population, which existed in the Central Plain 4500 years ago.17 Therefore, we deduce that the ancient population living in the West Liao-River valley experienced immigration from the Central Plain, increasing the genetic diversity of populations in this region.

Table 4 Haplogroup frequency distribution in the Dadianzi population and the 17 Eurasian reference populations

Determinations of the Y haplogroups gave further support to the conclusion above. The Dadianzi population contains two Y haplogroups, N (M231) and O3 (M122). Haplogroup N has a wide geographic distribution throughout northern Eurasia and is absent or only occasionally observed in more southerly areas.18, 19 By contrast, haplogroup O3 (M122) is dominant among East Asian and Southeast Asian populations, especially in the Chinese Han population, with an average frequency of 52.3%.20 Y-SNP analysis of prehistoric people (6400–3100 BP) along the Yangtze River showed that 65% of ancient individuals belonged to haplogroup O3.10 However, studies on the ancient Xiongnu, Siberian and Mongolian populations indicated that the dominant haplogroups were C, N, Q and R, while haplogroup O3 was found at quite a low frequency.19, 21, 22 As with the maternal genetic data, these Y chromosome results indicate that, aside from the northern lineage (haplogroup N), the Dadianzi population likely received haplogroup O3 through immigration from the south, most likely from the Central Plain. Intriguingly, one individual (S12) in the Dadianzi population possessed both a northern maternal lineage (D4) and a southern paternal lineage (O3-M122), suggesting that there was genetic admixture from the Central Plain during the Dadianzi time.

Genetic continuity in the West Liao-River valley from the Bronze Age

As the archaeological culture was replaced by the nomadic UXC and no transitional types have been found, there is much speculation about what happened to the LXC people. Two hypotheses have been proposed to explain the whereabouts of the LXC people. (1) According to pollen analysis as a record of climate change,23 temperatures dropped suddenly between 2900 and 2700 a BP, resulting in a cold climate. Thus, the people of the LXC, who were engaged in agriculture, were forced to migrate south and were replaced by nomadic populations from the northern steppe. (2) The ancient people changed their method of subsistence from farming back to a pasture-based lifestyle and adapted to the colder climate. Cultural exchange with nomadic people to the north was responsible for the transition back to pastoralism. To investigate whether the LXC ancestral genetic components extended to the extant population of the West Liao-River valley, we compared all the haplotypes found in the Dadianzi population with the ancient and extant populations in Liaoxi and Asia. Figure 2 illustrates the current and past distributions of all the mtDNA haplotypes of our ancient LXC samples. The distribution of the ancient Dadianzi (DDZ) mtDNA haplotypes was mainly concentrated in the West Liao-River valley, but the haplotypes were also distributed widely in the surrounding areas, especially in the Central Plain. For the UXC, only five individuals belonged to the A, D, M8 and M* haplogroups, which were also found in the LXC population, but we found no sharing of haplotypes with the Dadianzi people.24

Figure 2

Geographic distribution of the populations shared by each DDZ haplotype. Every dot represents an extant individual, and each triangle represents an ancient individual.

The population pairwise Fst comparison of maternal DNA showed that populations inhabiting the Central Plain presented the lowest genetic divergence from the Dadianzi population, lower than that of other Northeast Asian populations such as the Koreans and Mongolians (Figure 3). A similar close relationship with the Dadianzi population also existed in the Buryats.19 High divergences were identified in the south and southeastern areas of China, such as Guangdong and Guangxi, and in ethnic minority populations as well as in the western regions such as Jiangxi and Tibet.

Figure 3

Map of Fst values obtained from the pairwise comparisons of the DDZ maternal lineages with those of ancient and extant populations in surrounding areas of Asia. The dark colour scale represents the Fst values calculated between the mtDNA data of the DDZ population and the populations of interest from the geographic location, which are represented by numbers in the map. 1: Dadianzi (this study); 2: Gansu; 3: Dalian; 4: Chifeng; 5: Xining; 6: Xian; 7: Hefei; 8: Changding; 9: Tianlin; 10: Changsha; 11: Nanjing; 12: Nanchang; 13: Shanghai; 14: Weicheng; 15: Huizhe; 16: Hangzhou;27 17: Fengcheng; 18: Qingdao; 19: Uraqi; 20: Zhanjiang; 21: Wuhan; 22: Kunming;28 23: Taian;29 24: Oroqen; 25: Ewenki;29 26: Mongolian; 27: Buryat; 28: Yakut; 29: Khamnigan; 30: Korean; 31: Tuvinia; 32: Altaian-Kizhi; 33: Shor; 34: Telenghit; 35: Khakassian; 36: Teleut; 37: Evenki;19 38: Ulchi; 39: Udegey; 40: Negidel;30 41: Taiyuan;13 42: Guangzhou;13 43: Henan; 44: Hebei.31

Combining the genetic characteristics and Fst comparisons of ancient and extant populations in the West Liao-River valley, we found no significant difference between the two populations in different historical periods. These findings point to genetic continuity in this area after the early Bronze Age. However, due to the insufficient ancient population data for the UXC and other historic periods in this area, we cannot rule out the possibility that the genetic continuity was generated by returning migrant populations. Nevertheless, the distribution of shared haplotypes and the low genetic divergence between the LXC and the Central Plain populations suggest that part of the Dadianzi population migrated to the south, likely due to the cooling climate, and that these southward-migrating LXC people contributed to the gene pool of the Central Plain in the Bronze Age. This viewpoint is supported by archaeological evidence. Among the Yin Ruins relics of the Shang Dynasty, which existed in the Central Plain during 3600–3050 a BP, a large proportion of artifacts with northern cultural influences were identified.25 Furthermore, some archaeologists have even maintained that the ancestor tribe of the Shang Dynasty included part of the southward-migrating population from the Yan Mountain and West Liao-River valley.26

Conclusions

Without major geographic barriers, population movement and cultural exchanges have been continuously occurring in the West Liao-River valley, which is located between the northern steppe and the Central Plain. Based on our analysis in this study, the main part of the LXC population of northern China comprised people carrying the haplogroups of northern Asians who had lived in this region since the Neolithic period, together with genetic components due to immigration from the Central Plain. Climate change was a key factor in population migration in this area, as part of this population moved south later in the Bronze Age in response to a cooling climate, ultimately affecting the gene pool in the Central Plain. Although the local genetic continuity did not seem to be affected by people migrating out of the region, more data, particularly from the UXC population, are needed to determine whether returning migration was relevant to this genetic continuity.