Main

In contemporary central and eastern Eurasia, mobile dairy-based pastoralism is a key subsistence practice for many people1. Much of the eastern Eurasian Steppe is covered by dryland grasses which, while challenging for grain agriculture, can sustain large meat and dairy-producing herds2,3. Across the steppe, dairy is a staple food and the product of rich culinary traditions. In the Mongolian countryside, fresh, fermented, processed and distilled dairy products provide a major source of hydration and up to 50% of summer caloric intake1,4. Moreover, milk provides a protein- and fat-rich dietary component, while the processing of milk into dairy products enables the creation of a storable and transportable food source. In Mongolia today, dairy livestock, including sheep, goat, horse, cow, yak, reindeer and camel, are exploited for milk, meat, traction and transport across diverse environmental niches.

The adoption of dairy into adult human diets was a major transition in prehistoric subsistence5. In western Eurasian contexts, biomolecular approaches have been extensively applied to investigate the archaeological antiquity of a dairy diet. Following their initial domestication in Southwest Asia, cattle (Bos taurus), sheep (Ovis aries) and goats (Capra hircus) spread eastwards across the Eurasian steppe into Central Asia6. Biomolecular evidence for dairy lipids has been identified in ceramics from Neolithic Anatolia7 and eastern Europe8, as well as Copper and Bronze Age Kazakhstan9,10, indicating the potential spread of dairying out of Southwest Asia. As far east as the Tarim Basin of the Xinjiang region in northwestern China in the Middle Bronze Age, milk proteins have been identified in a woven basket11 and pieces of well-preserved kefir cheese12 (Fig. 1). In Bronze Age Mongolia, a recent study of human dental calculus from individuals across multiple sites in the northern Khövsgöl aimag identified milk proteins from sheep, goat and cattle1. Ancient DNA analysis of the same population found that almost all individuals were of predominantly local ancestry and only a single individual had over 10% of western steppe herder ancestry. This suggests that by the late second millennium bc, dairy pastoralism had been fully adopted by, or originated with, local northern populations, leaving an open question of when and how dairy subsistence arrived in this region.

Fig. 1: Ruminant and equine dairying in prehistoric Eurasia and contemporary Mongolia.

a, Map of Eurasia showing major geographical features referred to in the text and sites where evidence of dairying has been previously found using proteomic approaches: (1) Khövsgöl1, (2) Xiaohe11, (3) Gumugou10, (4) Subeixi68, (5) Bulanovo29, (6) Hatsarat29, (7) Çatalhöyük West69, (8) Tomb of Ptahmes70, (9) Szöreg-C (Sziv Utca)29 and (10) Olmo di Nogara29. Locations for the earliest evidence of ruminant dairying based on the presence of milk fats in ceramics are shown in blue7 and those for the earliest evidence of horse dairying9 are shown in pink. Details for each site included in this figure are referenced in Supplementary Table 3. b–f, Mongolian dairy products from Khövsgöl aimag: yoghurt starter culture, Khöröngö (Хөрөнгө) (b); curd from reindeer milk, ‘kurd’ (c); dried curd from mixed yak and cow milk, aaruul (ааруул) (d); clotted cream from mixed yak and cow milk, öröm (өрөм) (e); and fermented horse milk, airag (айраг) (f). g, Dairying ritual from Dundgobi aimag, Mongolia, blessing the first horse airag production of the season. Credit: Photograph c provided by Matthäus Rest; photograph d provided by Jessica Hendy; and all others provided by Björn Reichhardt.

Direct evidence for the timing and nature of pastoral economies in Mongolia from other datasets is exceedingly rare. On the eastern steppe, the ephemeral nature of pastoral campsites and severe wind deflation in most contexts make it challenging to detect occupational sites with direct information on subsistence economies13,14. As a result, archaeologists have often been forced to draw conclusions about local subsistence from materials found in ritual human burials under stone monuments, which dominate the Bronze Age archaeological landscape and occasionally include satellite animal burials. Specific features of burial mounds (stone type, shape and ringed fences) can be used to assign interred individuals to different culture groups, as mound construction styles changed alongside evolving cultural traditions in Bronze Age Mongolia15,16. Prior to the Bronze Age, before the appearance of constructed stone burial mounds, very few occupation or ritual sites have been uncovered, and pre-Bronze Age subsistence strategies are not well understood. However, it is assumed that Neolithic subsistence strategies included hunting, gathering and fishing, although the possibility of pastoralism should not be completely discounted14. Human burials associated with Afanasievo and Chemurchek culture groups (circa 3000–2500 bc) contain ovicaprid, bovine, equid and dog remains15,17,18,19, yet it is unclear whether these remains derive from domesticated animals or their wild relatives, such as Ovis ammon and Capra ibex in the case of ovicaprids20. Later Bronze Age (1500–800 bc) campsites containing ruminant and equine remains suggest the dietary consumption of horses and cattle, as well as sheep and/or goats, as part of a fully pastoral economy21. Satellite burials containing multiple animals surrounding ritual human interments attest to pastoral culling and herd management patterns of sheep, goat, cattle and horses during this period22,23. By 1200 bc, remains of domesticated horses became almost ubiquitous at ritual burial sites in northern Mongolia, with some of the crania showing evidence for equine dentistry and horse bridling and riding24,25.

In later time periods, written records from neighbouring regions, such as China and the Middle East, as well as from within Mongolia, document the importance of ruminant and equine dairy in day-to-day subsistence, particularly the consumption of fermented horse milk as early as the Xiongnu Empire (circa 200 bc to ad 100)26, along with camel milk by the Mongol period (circa ad 1206–1368)27,28. Even though many archaeologists and historians assume milk had been included in ancient steppe diets, little direct evidence has been available about where and when specific animal species were first exploited for dairy on the steppe east of the Altai Mountains.

The analysis of ancient proteins extracted from ancient human dental calculus (calcified dental plaque or tooth tartar) has been established as an approach for detecting milk consumption in past individuals1,29,30,31. Differences in amino acid sequences between taxa enable the detection of the livestock species or ‘zooarchaeology by proxy’—a method to detect past animal use by the analysis of human remains alone. Here, we apply the proteomic analysis of ancient dental calculus to 32 individuals spanning from the late Neolithic through the Middle Ages to characterize the antiquity and species diversity of ruminant and equine dairying in Mongolia. We report the earliest direct evidence for dairy consumption in East Asia (east of the Altai Mountains), finding that ruminant dairy consumption was a feature of ancient diets in Mongolia from its initial pastoral occupation, circa 3000 bc and occurred in association with archaeological sites linked to western steppe cultures. We show that ruminant dairying became widespread by the Middle Bronze Age (1800–1200 bc) and from 1200 bc we observe the onset of horse milk consumption, perhaps the progenitor of the alcoholic drink airag, in tandem with the first evidence for mounted horseback riding and highly mobile economies—a tradition that remained important through the great nomadic empires and into the modern era.

Results

Milk proteins were identified in 72% of individuals analysed (23 of 32 individuals, Table 1), indicating the widespread consumption of dairy foods across multiple time periods in prehistoric and historic Mongolia. Specifically, we detected evidence of the milk proteins β-lactoglobulin (BLG I and II), α-S1-casein, kappa-casein, β-casein, α-lactalbumin, lysozyme C and peptidoglycan recognition protein 1. BLG was the most frequently detected milk protein, a pattern consistent with previous observations from ancient dental calculus1,32. We observe evidence of milk consumption from diverse taxa across multiple environmental zones within Mongolia (Fig. 2). Of the seven dairy livestock species used in contemporary Mongolia, we identified milk peptides from goat, sheep, cattle, horse and camel but did not detect peptides that could be assigned specifically to reindeer or yak milk.

Table 1 Presence of dairy proteins by individual and archaeological site
Fig. 2: Mongolian dairy consumption by period.

a–d, Maps showing changes in dairy consumption for Neolithic to Early Bronze Age (a), Middle–Late Bronze Age (b), Iron Age and Early Medieval (c) and Late Medieval (d). Archaeological site cultural affiliation is indicated by colours and symbols. Solid filled symbols indicate individuals with positive evidence of milk proteins, while symbols bisected with a diagonal line indicate individuals where no milk proteins were identified. Individuals of the same site are contained within brackets. Individual AT-923, associated with Ulaanzuukh, is not directly radiocarbon dated and is not included in this figure. Taxonomic icons only indicate the most specific taxa identified in a phylogenetic branch. The full list of dairy species identified for each individual is given in Table 1 and Supplementary Dataset 2. Data used in the creation of this figure are included in Supplementary Table 4.

We find evidence of milk proteins in the earliest directly dated individual in our sample set, AT-26 (3316–2918 cal bc; 2σ range), at the Afanasievo burials of Shatar Chuluu, the earliest known mounded burial features associated with pastoral economies in the territory of Mongolia. Specifically, we observe peptides deriving from a taxonomically ambiguous region of the milk whey protein BLG, where the species can be assigned either to a bovid in the Bovinae subfamily (cow, yak, bison and water buffalo) or to the Ovis genus (sheep), which prevents a more specific taxonomic assignment for this individual’s milk consumption. In the two individuals associated with the Chemurchek culture at the sites of Khundii Gobi (2886–2577 cal bc; 2σ range) and Yagshiin Khuduu (2567–2468 cal bc; 2σ range), we detect milk peptides from BLG and α-S1-casein matching the subfamily Caprinae (sheep or goat) and the genus Ovis (sheep). At Khundii Gobi, some of these identifications are specific to sheep (six BLG peptides), while others are specific only to the subfamily Caprinae or the infraorder Pecora (all even-toed ruminant mammals). These results align with recent archaeofaunal data from the early second millennium bc, which suggest a significant role for sheep in the prehistoric economy of pastoral occupants of the Mongolian Altai33.
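The logic of assigning a peptide to the most specific taxon shared by every species it matches can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: the `LINEAGES` table and the function name `most_specific_taxon` are hypothetical, and the lineages are abbreviated to ranks mentioned in the text.

```python
# Abbreviated lineages (broadest rank first). Illustrative only:
# intermediate ranks not discussed in the text are omitted.
LINEAGES = {
    "Ovis aries": ["Pecora", "Bovidae", "Caprinae", "Ovis"],
    "Capra hircus": ["Pecora", "Bovidae", "Caprinae", "Capra"],
    "Bos taurus": ["Pecora", "Bovidae", "Bovinae", "Bos"],
    "Bos grunniens": ["Pecora", "Bovidae", "Bovinae", "Bos"],
    "Rangifer tarandus": ["Pecora", "Cervidae", "Rangifer"],
}


def most_specific_taxon(matching_species):
    """Return the deepest rank shared by all species whose protein
    sequence contains the peptide, or None if nothing is shared."""
    lineages = [LINEAGES[name] for name in matching_species]
    shared = None
    # Walk down the ranks in parallel until the lineages diverge.
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared = ranks[0]
    return shared


print(most_specific_taxon(["Ovis aries"]))                       # Ovis
print(most_specific_taxon(["Ovis aries", "Capra hircus"]))       # Caprinae
print(most_specific_taxon(["Ovis aries", "Bos taurus"]))         # Bovidae
print(most_specific_taxon(["Capra hircus", "Rangifer tarandus"]))  # Pecora
```

Under this simple scheme, a peptide shared by sheep and Bovinae species would be reported at the family level (Bovidae); the 'Bovinae/Ovis' label used in the text is more informative because it also records which members of the family carry the sequence.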

In the Middle Bronze Age (circa 1800–1200 bc) the consumption of ruminant dairy milk can be seen in four of seven individuals analysed in the central eastern site of Ulaanzuukh and in seven of nine individuals in the previously published Khövsgöl sites in northern central Mongolia. Ulaanzuukh individuals date to slightly earlier than the Bronze Age burial sites analysed in Jeong et al.1, where evidence of ruminant milk consumption was found in individuals associated with slope burials and khirigsuur ritual monuments. At Ulaanzuukh, BLG is the most frequently recovered protein across all samples, with peptides from α-S1-casein and kappa-casein (proteins that are associated more with milk curds than milk whey) identified to a lesser extent. The dental calculus from the three Ulaanzuukh individuals who did not show evidence of dairy consumption showed a poor level of protein preservation, with a general absence of typical salivary and bacterial proteins reported previously in dental calculus studies34, which may suggest that an absence of evidence for milk consumption in these individuals could be due to overall poor biomolecular preservation.

At Bronze Age sites dating to after 1200 bc, we found evidence for dairy consumption in all four individuals tested. We identified ruminant milk peptides from the same two proteins (BLG and α-S1-casein) identified in the previous time period, as well as from two additional proteins: α-lactalbumin and β-casein. In addition to ruminant milk proteins, we also detected the first palaeoproteomic evidence of horse (Equus) milk, including horse-specific peptides from BLG I and II (horse BLG is derived from two paralogous genes) in two of four individuals. An individual from the site of Shunklai Uul (circa 1000 bc), in central Mongolia, had 126 peptide spectral matches (PSMs) from ruminant BLG, another 50 from horse BLG I and II (Fig. 3) and an additional nine from α-S1-casein, β-casein and α-lactalbumin.

Fig. 3: Alignment of observed BLG peptides for two individuals analysed in this study, showing the number of Equus and ruminant BLG peptides detected.

For Equus, peptides from both LGB1 and LGB2 paralogues are shown (see Supplementary Table 5 for data associated with figure). Where peptides from these two taxa overlap, this has been indicated by a blue/orange cross-hatch pattern. The arrow in Individual AT-775 indicates two contiguous but independent peptides. Beneath each individual is a consensus sequence of B. taurus BLG (UniProt: P02754) and E. caballus BLG1 (UniProt: P02758) with dark grey indicating sequence identity and pale grey indicating sites with sequence differences.

The three Early Iron Age individuals (800–400 bc), two from the northwestern Mongolian site of Chandman Mountain and one from Dartsagt (north-central Mongolia), all showed moderate to high preservation and contained an abundance of human and oral microbiome proteins. Two of these contained no robustly identified milk peptides; in contrast, one individual from Chandman Mountain contained an abundance of PSMs to milk proteins. During this period, BLG peptides specific to both sheep and goat were detected, along with others that can be assigned only to the higher taxonomic levels of Caprinae and Bovidae. We also find peptides derived from casein and α-lactalbumin specific to caprines, and peptides from both BLG I and II specific to equine milk.

In the Late Iron Age, during the tenure of the Xiongnu Empire (circa 200 bc to ad 100), calculus from each of the three individuals studied contained evidence for dairy consumption. One individual had PSMs specific only to horse BLG, while the second had Bos-specific BLG and caprine α-S1-casein peptides. The third individual had ruminant casein and whey proteins (BLG and α-S1-casein) as well as lysozyme C peptides specific to Equus. In the post-Xiongnu period, a single individual archaeologically classified as Turkic era did not show any evidence of milk consumption.

Nine of the eleven Mongol Empire individuals showed evidence for the consumption of dairy, with many individuals showing evidence for the consumption of milk from multiple species. For example, one individual showed evidence for the consumption of ruminant, equine and camel milk, whilst a further five individuals showed evidence for the consumption of both ruminant and horse dairy products (Fig. 3). During this period, we observed the first evidence for the consumption of camel milk through the detection of peptidoglycan recognition protein 1 (UniProtKB: Q9GK12), an immune protein that has been isolated from modern camel milk.

Deamidation of glutamine and asparagine has been proposed as a marker of taphonomic degradation in ancient proteins35. We applied an analysis of bulk deamidation using a previously published approach36 to five individuals from this study to examine potential patterns of archaeological degradation in milk proteins. The milk proteins appear to have deamidation levels similar to those of the human oral proteins in the calculus, which may suggest that the milk proteins in our samples are ancient. All deamidation values are reported in Supplementary Table 2. We observed that the older samples generally showed higher levels of deamidation. The milk-origin peptides retrieved from the Early Bronze Age individuals showed an average of 23.9% glutamine deamidation, those from the Late Bronze Age 13.5% and those from the Mongol period 3.3%. The same pattern was observed in the deamidation of asparagine, with an average of 52.6% deamidation of milk-origin peptides in the Early Bronze Age and 32.9% in the Late Bronze Age. However, the milk peptides recovered from the Mongol period individual do not fit this pattern in terms of asparagine deamidation, with 48.3% deamidation (Supplementary Table 2).

Discussion

Earliest evidence for dairying is associated with western steppe herder archaeological cultures

Our results demonstrate the oldest known evidence of dairy consumption in Mongolia and the eastern Eurasian steppe (circa 3000–2500 bc), in Early Bronze Age Afanasievo- and Chemurchek-associated individuals from both central and western Mongolia (Table 1). Previous ancient DNA analysis showed that one of the individuals at Shatar Chuluu with ruminant dairy proteins (AT-26) had a non-local mitochondrial haplogroup consistent with western steppe herder populations37, supporting the interpretation of individuals associated with the Afanasievo culture as migrants from eastern Europe via the Russian Altai38,39,40 and a probable vector for the initial introduction of domestic animals into Mongolia14. The identification of ruminant milk proteins (Ovis and Bovinae/Ovis) supports the domestic nature of fauna found in these burial features and indicates that dairy pastoralism formed an important element of the subsistence base of these late fourth millennium bc transcontinental migrants. Most significantly, these findings suggest that human migrations associated with the expansion of the Afanasievo culture present a viable candidate for the initial introduction of dairy and domestic livestock into eastern Eurasia. The earliest individual in this dataset already showed evidence of dairy consumption. To track the earliest instances of dairy consumption on the eastern steppe, it would therefore be necessary to analyse individuals from earlier time periods. This would be particularly informative for untangling the presence or absence of dairying before Yamnaya–Afanasievo migrations.

The antiquity of eastern steppe horse milk consumption

Temporal patterning in our protein results suggests that horse milk consumption played a key role in the emergence and proliferation of mobile pastoralism in Mongolia (Fig. 4). Today, horses play a vital role in traditional Central Asian pastoral lifeways, improving herd management as well as providing a primary source of meat and milk. Horseback herders can manage larger herd sizes. Horses can break through snow and ice to access the sustenance underneath, exposing grass oases for other animals in the herd41,42,43. We observe direct evidence of horse milk consumption on the eastern steppe, in the form of equine (Equus)-specific peptides from the milk whey proteins BLG (I and II) and lysozyme C in individuals associated with the Baitag and Slab Grave cultures in western Mongolia dated to the late second millennium bc (Table 1). In addition to the appearance of horses in dietary assemblages21, this time period is linked with the proliferation of horses in ritual sites, the first direct archaeological evidence for horse bridling and riding25,44, the first evidence for horse breeding and management22,23, innovations in horse healthcare24, an expanded use of dry intermontane grasslands23,24,45 and the emergence of mobile, horse-facilitated pastoralism in eastern Eurasia. Our findings suggest that the incorporation of horses into dairy herds may have been closely linked to this multifaceted economic transformation in the use of horses45.

Fig. 4: Timeline of evidence for the consumption of different livestock milk in prehistoric and historic Mongolia.

Radiocarbon dates for each individual were calibrated using OxCal (OxCal v.4.3.2 Bronk Ramsey66; r:5 IntCal13 atmospheric curve67) and resulting radiocarbon probabilities were grouped by the taxa of dairy proteins identified in that individual (indicated by AT-numbers), with ruminant taxa (Ovis, Capra and Bovinae) indicated in purple, Equus indicated by orange and Camelus indicated by green. Dairy peptides identified in individual AT-26 (indicated with an asterisk) are specific to Bovinae/Ovis. Individuals without direct radiocarbon dates are indicated by unfilled boxes and are placed on the timeline based on the estimated time spans for the Xiongnu and Mongol Empires. For data used in this figure, refer to Supplementary Table 6.

Following the Bronze Age, direct proteomic evidence for horse milk consumption continues through the imperial Xiongnu and Mongol periods (Table 1), in agreement with extensive textual and zooarchaeological evidence underscoring the significance of horses to historic economies. During the Iron Age, we identify horse milk proteins at Tamiryn Ulaan Khoshuu, a site within the heartland of the Xiongnu Empire46,47. Historical Chinese documents record that an assortment of dairy products was consumed during the Xiongnu period (circa 200 bc to ad 100), including dried curds from ruminant milk (aaruul); however, lao (horse alcohol) was the most consistently referenced dairy product and it held a prominent place in the cultural practices and identities of the steppe peoples48,49,50. By the Mongol period, horse milk consumption appears in >80% (9 of 11) of tested individuals, a finding that matches historical accounts of widespread consumption of fermented mare’s milk, known as airag (Mongolian) or koumiss/kymyz (Turkic languages)27.

Horse milk differs from ruminant milks in important ways: in particular, it contains less curd protein (caseins) and much more lactose41,51,52. While the lower casein content makes it undesirable for producing most dried curd products (like aaruul, a staple of the traditional Mongolian diet), its high lactose content makes it highly suitable for making alcoholic beverages with ethanol contents as high as 10–12% (ref. 53). Communal airag-drinking was and continues to be an important social activity (Fig. 1) and it has been frequently noted in both Mongolian and foreign historical texts. The cultural significance of airag continued throughout the Mongol period, with social gatherings at the time referred to as ‘going to drink airag with people’, and social rankings reinforced by how close one sat in relation to the pitchers of airag. Favoured associates were charged with serving the airag and with choosing the order in which airag was served at feasts27. At the Mongol capital of Kharkorum, a silver fountain was said to flow with fermented mare’s milk, dispensing airag54.

Identification of other milks and fermentation agents

In addition to cattle, sheep, goats and horses, domesticated Bactrian camels (Camelus bactrianus), yaks (Bos grunniens) and reindeer (Rangifer tarandus) are also milked in contemporary Mongolia; however, very little is known about the dairying history of these three species. Here we report protein evidence of camel milk consumption (together with horse and sheep milk) in a Mongol period individual buried in the Gobi Desert (Table 1), a habitat of Bactrian camels55. Although not unique to milk in other mammal species, the camel protein we identify here, peptidoglycan recognition protein 1 (PRP1), is an important component of camel milk whey56,57. Mongol period historical accounts, such as the thirteenth-century Secret History of the Mongols27, and historic accounts from foreign travellers into the empire, such as Marco Polo28, include stories of camels used for subsistence (milk, meat and blood) and transport of people and gers (round tents). While probably not the earliest instance of camel milk-drinking, the PRP1 protein data presented here provide biomolecular evidence for its consumption. Camel dairy use is not well understood and this finding provides an insight into exploitation of this historic traction animal for milk.

We did not detect any milk peptides specifically identified as yak or reindeer in this study. While yak is a commonly herded species in Mongolia today, their past use as a dairying animal is not well understood. Yak proteins are difficult to specifically identify due to their sequence similarity to other cattle species. Yak BLG, for example, differs from that of cattle at only a single amino acid across the entire BLG protein. While it is possible that some of the peptides assigned in this study to Bos or the higher taxonomies of Pecora, Bovidae and Bovinae could have originated from domestic yak, we only observed cattle-specific variants when sufficient protein coverage enabled it. The absence of reindeer-specific milk consumption may be expected based on historical accounts on the use of this animal for dairying in this region. Contemporary reindeer herders in Mongolia do milk their animals but these families migrated from Siberia into northern Mongolia only in the last century58,59.

In the present study, we did not identify any proteins specific to any bacterial taxa or other fermentation agents used in the processing of milk into other dairy products. Although processing agents were not found in the samples analysed in this study, it does not mean these were not used in ancient and historic Mongolia and future studies should continue to look for their presence alongside milk peptides.

These results provide the earliest evidence for dairy pastoralism in Mongolia’s first herding societies, pushing back previous estimates of dairying on the eastern steppe by more than 1,700 years and tracing pastoralist connections between the western and eastern steppes to the Early Bronze Age. These results show that within 5,000 years after the earliest evidence for dairying in the Near East, this practice and its associated animals had spread more than 7,000 km eastward to become a successful mode of subsistence on the Mongolian steppe. While the routes of this movement remain to be fully understood, our data suggest that dairy pastoralism, and in particular the emergence of horse riding and horse milking circa 1200 bc, provided the economic, political and social support necessary for the success of subsequent nomadic empires on the vast grasslands of Eurasia.

Methods

Sample collection

Dental calculus samples were collected from the Department of Anthropology and Archaeology at the National University of Mongolia (NUM) from 32 previously excavated individuals (Table 1, listed by NUM accession number; see Supplementary Table 1 for site details). Individuals were selected from archaeological sites assigned to time periods between the Neolithic and the Mongol period. Dental calculus was removed from the tooth using sterilized dental scalers and stored in Eppendorf tubes until extraction. Nitrile gloves were used during sample collection to avoid contamination from skin proteins. Samples were exported to the Max Planck Institute for the Science of Human History under permission from the Ministry of Culture, Education, Science and Sports (export no. 10/413 (7b/52) was received on 2 February 2017, no. A0109258, MN DE 7 643). Protein extractions were conducted in a dedicated laboratory for the extraction of ancient proteins at the Max Planck Institute for the Science of Human History, Jena, using a filter-aided sample preparation protocol previously published in Jeong et al.1. Following protein extraction, digested peptides were stored at −80 °C before being analysed by liquid chromatography tandem mass spectrometry (LC–MS/MS) at the Functional Genomics Center Zürich, ETH/University of Zürich.

LC–MS/MS analysis

Mass spectrometry analysis was performed on a Q Exactive HF mass spectrometer (Thermo Scientific) equipped with a Digital PicoView source (New Objective) and coupled to an M-Class UPLC (Waters). Solvent composition at the two channels was 0.1% formic acid for channel A and 0.1% formic acid, 99.9% acetonitrile for channel B. Column temperature was 50 °C. For each sample, 4 μl of peptides were loaded on a commercial ACQUITY UPLC M-Class Symmetry C18 Trap Column (100 Å, 5 µm, 180 µm × 20 mm, Waters) followed by an ACQUITY UPLC M-Class HSS T3 Column (100 Å, 1.8 µm, 75 µm × 250 mm, Waters). The peptides were eluted at a flow rate of 300 nl min–1 by a gradient from 5 to 40% B in 120 min. The column was cleaned after the run by increasing to 98% B and holding at 98% B for 5 min before re-establishing the loading condition. Samples were acquired in a randomized order. The mass spectrometer was operated in data-dependent acquisition (DDA) mode, acquiring full-scan MS spectra (350−1,500 m/z) at a resolution of 120,000 at 200 m/z after accumulation to a target value of 3,000,000 and a maximum injection time of 50 ms, followed by higher-energy collision dissociation fragmentation on the 12 most-intense signals per cycle. Higher-energy collision dissociation spectra were acquired at a resolution of 30,000 using a normalized collision energy of 28 and a maximum injection time of 50 ms. The automatic gain control was set to 100,000 ions. Charge state screening was enabled. Singly charged ions, ions with unassigned charge states and ions with charge states higher than eight were rejected. Only precursors with intensity above 90,000 were selected for MS/MS. Precursor masses previously selected for MS/MS measurement were excluded from further selection for 30 s and the exclusion window was set at 10 ppm. The samples were acquired using internal lock mass calibration on m/z 371.1012 and 445.1200.
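As a worked illustration of what a 10 ppm window means in absolute terms, the short sketch below converts a parts-per-million tolerance into an m/z interval. The function name `ppm_window` is hypothetical, and treating the tolerance as symmetric (plus or minus 10 ppm around the precursor) is an assumption; the instrument software may define the window differently.

```python
def ppm_window(mz, ppm=10.0):
    """Return the (low, high) m/z bounds of a symmetric ppm tolerance.

    Assumes the tolerance applies as +/- ppm around the given m/z.
    """
    delta = mz * ppm / 1e6  # absolute tolerance in m/z units
    return mz - delta, mz + delta


# Example with one of the lock masses quoted in the text.
low, high = ppm_window(445.1200, ppm=10.0)
print(f"{low:.4f} to {high:.4f}")
```

At m/z 445.12 a 10 ppm tolerance corresponds to roughly 0.0045 m/z units on either side, and the absolute width scales linearly with m/z.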

Data analysis

To account for as much variation of milk-associated proteins as possible during MS/MS ion searches, a supplementary database of unreviewed milk protein sequences was curated from UniProtKB. As an additional source of dairy protein sequences, genomic data covering the two BLG genes in ancient horses generated by Gaunitz and colleagues60 were translated into amino acid sequences, aligned with their respective modern sequences, and any putatively divergent proteins were concatenated to the supplementary database. In total, 244 additional accessions from UniProtKB and four putatively divergent BLG sequences from ancient horse genomes were added (Supplementary Dataset 1).

Peak lists were generated from raw files by selecting the top 100 peaks using MSConvert from the ProteoWizard software package v.3.0.11781 (ref. 61). Each sample was searched using Mascot62 (v.2.6.0) against Swiss-Prot in combination with the curated milk protein database (Supplementary Dataset 1). Results were exported from Mascot as csv files and further processed through an internally created tool, MS-MARGE63 to estimate the validity of peptide identifications and summarize the findings. MS-MARGE is an R script that relies on an Rmarkdown file to generate the following: an HTML report summarizing the search (an example is given in the Supplementary Information file), a csv file containing confidently identified PSMs and a FASTA file of confidently identified peptide sequences. As input, MS-MARGE accepts a csv file exported from a Mascot MS/MS ion search against an amino acid database containing decoyed sequences with the Group Protein Families option turned off. Additionally, two parameters can be provided: an expected value (e-value) cutoff and a minimum number of peptides to support an identification (default 0.01 and 2, respectively). To estimate false discovery rate (FDR) at the PSM and protein level, MS-MARGE counts the number of decoy hits after filtering for e-value and minimum peptide support, and divides this value by the number of target hits minus the number of decoys. The resulting value is multiplied by 100 to provide an estimate of FDR as a percentage. We aimed for a protein FDR of under 5% and a peptide FDR of under 2% (Supplementary Dataset 2). A minimum of two individual PSMs were required for specific protein identifications and only peptides with an e-value below 0.01 were accepted. After filtering criteria were applied, we observed a range of variation in the numbers of proteins identified, with samples ranging from 2 to 209 confidently identified protein families.
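The filtering and FDR estimate described above can be sketched in a few lines. This is not the MS-MARGE code itself (which is an R script); it is a minimal Python illustration of the stated arithmetic, and the record layout (dictionary keys `protein`, `e_value`, `is_decoy`) and the function name `estimate_fdr` are hypothetical. It assumes that 'target hits' means non-decoy PSMs that pass both filters.

```python
from collections import defaultdict


def estimate_fdr(psms, e_value_cutoff=0.01, min_peptides=2):
    """Estimate PSM-level FDR (%) as decoys / (targets - decoys) * 100.

    psms: iterable of dicts with hypothetical keys
          'protein' (str), 'e_value' (float), 'is_decoy' (bool).
    Returns None when the denominator would be zero or negative.
    """
    # Keep only PSMs with an e-value below the cutoff, grouped by protein.
    by_protein = defaultdict(list)
    for psm in psms:
        if psm["e_value"] < e_value_cutoff:
            by_protein[psm["protein"]].append(psm)

    # Keep only proteins supported by the minimum number of PSMs.
    kept = [
        psm
        for group in by_protein.values()
        if len(group) >= min_peptides
        for psm in group
    ]

    decoys = sum(1 for psm in kept if psm["is_decoy"])
    targets = len(kept) - decoys
    if targets - decoys <= 0:
        return None
    return 100.0 * decoys / (targets - decoys)
```

With 52 filtered target PSMs and 2 filtered decoy PSMs, for example, this gives 2 / (52 − 2) × 100 = 4%. A protein-level estimate would follow the same arithmetic but count proteins rather than PSMs.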

Deamidation levels in five samples (AT-628, AT-590b, AT-26, AT-233 and AT-835) were calculated to assess authenticity using a previously published approach36 (Extended Data Fig. 1 and Supplementary Table 2). These samples were specifically chosen to verify the antiquity of the oldest samples in this study, as well as a more recent sample from the Mongol period for comparison of deamidation patterns. The raw MS/MS files for each were run through MaxQuant64 v.1.6.2.6a against the previously described milk database and against the human proteome. A semitryptic search strategy was used, allowing for a maximum of two missed cleavages, and the score cutoff for modified and unmodified peptides was set to 60 with no correction for FDR. Carbamidomethyl (C) was added as a fixed modification, while variable modifications included: oxidation (M), acetyl (protein N-term), deamidation (NQ), Gln-pyro-Glu (N-term Q), Glu-pyro-Glu (N-term E), phospho (ST) and hydroxyproline. The deamidation levels of all milk proteins were averaged per sample. All identified peptides, including their posttranslational modifications, are reported in Supplementary Dataset 2.
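As a rough illustration of the final averaging step, the following Python sketch computes a count-based deamidation fraction per protein and then averages those fractions for one sample. It is a simplification and does not reproduce the previously published approach36, whose details are not given here; the function name, the tuple layout and the use of site counts rather than peptide intensities are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def sample_deamidation(peptides):
    """Return (per_protein_fraction, sample_mean) for one sample.

    peptides: iterable of (protein, sequence, n_deamidated) tuples, where
              n_deamidated is the number of deamidated N/Q sites observed
              on that peptide (hypothetical input layout).
    """
    sites = defaultdict(int)       # total N + Q residues per protein
    deamidated = defaultdict(int)  # deamidated N + Q residues per protein

    for protein, sequence, n_deamidated in peptides:
        nq = sequence.count("N") + sequence.count("Q")
        if nq == 0:
            continue  # peptide carries no site that can deamidate
        sites[protein] += nq
        deamidated[protein] += min(n_deamidated, nq)

    # Fraction of N/Q sites deamidated, per protein.
    per_protein = {p: deamidated[p] / sites[p] for p in sites}

    # Average across proteins; undefined if no informative peptide was seen.
    sample_mean = mean(per_protein.values()) if per_protein else None
    return per_protein, sample_mean
```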

Radiocarbon dating

A total of 26 bones and teeth from 24 individuals were radiocarbon dated at two research facilities, the Oxford Radiocarbon Accelerator Unit (ORAU, laboratory code OxA) and the Groningen Radiocarbon Laboratory (laboratory code GrM). The ORAU followed routine pretreatment and measurement procedures65. In brief, 200–600 mg of bone or dentine was drilled using a hand-held dentist's drill and collagen was extracted through a series of chemical steps that involved immersion in HCl, removal of humic acids using NaOH and removal of adsorbed CO2 via a final HCl wash. Only four of the samples prepared at the ORAU (OxA-36230, OxA-36231, OxA-36232 and OxA-36233) underwent ultrafiltration using Vivaspin ultrafilters, due to initial indications of poor collagen preservation. Extracted collagen was frozen overnight and lyophilized. Between 2 and 5 mg of collagen was combusted in an elemental analyser (EA) and its C and N stable isotopes were measured on an isotope ratio mass spectrometer (IRMS) linked to the EA, before excess CO2 gas was collected, graphitized and measured at a High Voltage Engineering Europa (HVEE) accelerator, alongside blanks and standards. These were used for contamination calculation and final correction of the data. For these samples, collagen yields ranged widely, from 0.8% to 17.8%; C:N ratios of the extracted collagen fell within the expected range (3.2–3.4), with the exception of OxA-36233 (C:N = 3.6); and percentage C in the combusted collagen was 37–46%. The measurements are reported in radiocarbon years before present (bp), where present is ad 1950. Samples from Oxford were calibrated using OxCal v.4.3.2 (ref. 66) and an IntCal13 atmospheric curve67.
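The collagen quality indicators quoted above (yield and C:N ratio) can be computed as in the following minimal Python sketch. The function names are hypothetical, and the sketch assumes that the EA reports carbon and nitrogen as percentages by weight, which are converted to an atomic ratio using standard atomic masses. The default range (3.2–3.4) is simply the range reported in the text, used here for illustration and not as a laboratory acceptance criterion.

```python
ATOMIC_MASS_C = 12.011  # standard atomic mass of carbon
ATOMIC_MASS_N = 14.007  # standard atomic mass of nitrogen


def collagen_yield_percent(collagen_mg, starting_mg):
    """Collagen yield as a percentage of the starting bone or dentine mass."""
    return 100.0 * collagen_mg / starting_mg


def atomic_cn_ratio(percent_c, percent_n):
    """Atomic C:N ratio from weight percentages of carbon and nitrogen."""
    return (percent_c / ATOMIC_MASS_C) / (percent_n / ATOMIC_MASS_N)


def within_range(cn_ratio, cn_range=(3.2, 3.4)):
    """True if the ratio lies inside the (inclusive) illustrative range."""
    low, high = cn_range
    return low <= cn_ratio <= high
```

For example, 4.8 mg of collagen from 600 mg of starting material corresponds to the lowest yield quoted above (0.8%), and a ratio of 3.6, as reported for OxA-36233, falls outside the illustrative range.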

For radiocarbon dates processed at the Groningen Radiocarbon Laboratory, samples were decalcified over at least a 24-h period using mild acid (HCl, 2–4% w/vol; room temperature) at the Center for Stable Isotope Research at the University of Groningen. For each sample still not fully decalcified, the solution was refreshed, removing and storing soft portions separately in demineralized water until further preparation. Soft and pliable fragments were rinsed thoroughly with demineralized water. Extracts were then exposed to NaOH (1%, ~30 min) to eliminate humic acids, rinsed to neutrality and treated once more with acid (HCl, 4% w/vol, 15 min). The raw collagen fraction was denatured to gelatin in acidified demineralized water (pH 3) at 80 °C for 18 h. Before drying, the dissolved gelatin was filtered through a 50-μm mesh to eliminate any remaining foreign particulates and the crystalline collagen scraped from the glass. Approximately 4-mg aliquots of the reduced carbon fraction were then weighed into tin capsules for combustion in an EA (IsotopeCube NCS, Elementar). The EA was coupled to an IRMS (Isoprime 100), allowing the δ13C value of the sample to be measured, as well as a fully automated cryogenic system to trap CO2 liberated on combustion. After run completion, the individual reaction vessels were transferred to a graphitization manifold, where a stoichiometric excess of H2 gas (1:2.5) was added and the CO2 gas reduced to graphite over an Fe(s) catalyst. The graphite samples were then pressed and the radioisotopic ratio determined on a MICADAS accelerator mass spectrometer. Samples from Groningen were calibrated using OxCal v.4.3.2 (ref. 66) and an IntCal13 atmospheric curve67.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.