Introduction

Small size at birth represents a major public health burden in South Asia, where 45% of infants are born small for gestational age and 26% of neonates are low in birth weight1. Suboptimal intrauterine growth in this region, often attributed to short maternal stature or malnutrition during pregnancy2, may increase postnatal risks of infant mortality3, stunted childhood growth4, poor cognitive development5, and chronic disease later in life6. These short- and long-term health consequences suggest that nutritional insults during the highly sensitive and critical period of fetal development may result in systemic and permanent modifications of gene expression, cell size and number, and organ structure and function that can adversely affect health outcomes throughout life7. However, our understanding of biological processes that are affected by poor fetal development and maintained in postnatal life remains incomplete.

A comprehensive analysis of tissue or circulating proteins using a comparative proteomics approach may help to reveal pathophysiological pathways or associated biomarkers of phenotypes altered by intrauterine growth retardation (IUGR). For example, experimental animal studies have demonstrated that IUGR changed protein expression in liver, muscle, kidney, and small intestine, contributing to abnormal absorption and metabolism of nutrients in newborn pigs and rats8,9. Other studies have shown that prenatal undernutrition affects hypothalamus and brain proteomes in ways that may disturb energy and redox homeostasis and brain plasticity and maturation in newborn or adult rats10,11. A limited number of human studies have reported differentially abundant serum proteins in umbilical cord or venous blood samples between IUGR and non-IUGR neonates12,13,14, revealing that differential protein biomarker abundances can be detected in the circulatory system shortly after birth. However, to our knowledge, no human studies have evaluated whether IUGR-related differences in plasma protein abundance persist into childhood or assessed such differences by multiple anthropometric size indicators at birth.

We assessed the nutritional and health status of a cohort of children who had birth measurements and were born to mothers who had participated in a micronutrient supplementation trial in southeastern Nepal15,16. Using a quantitative proteomics approach, we previously showed in this cohort that suites of plasma proteins were associated with various nutritional and health conditions, including status of multiple micronutrients17,18,19,20, body size and composition21, inflammation22, and subsequent cognitive function in a subset of children23. In this study, we test the hypothesis that 6–8-year-old children who had been born small differ in their plasma protein profiles from those of normal birth size.

Results

Birth anthropometry of study participants

Birth measurements of study children (n = 500) are summarized in Table 1. Average (SD) weight, length, and head circumference at birth were 2.67 (0.41) kg, 47.6 (2.2) cm, and 32.7 (1.3) cm, respectively. Percentages of children who were born stunted, underweight, and wasted (length-for-age [LAZ], weight-for-age [WAZ], and weight-for-length [WLZ] z-scores < −2) were 16.3%, 26.0%, and 18.1%, respectively, and 20.3% of children were born with a small head circumference (head circumference-for-age z-score [HCZ] < −2). Children were, on average (SD), 7.5 (0.4) years old at the time of blood draw. Characteristics of children and households can be found in Supplementary Table S1.

Table 1 Anthropometric characteristics of children at birth for proteomics analysis (n = 500).

Differentially abundant plasma proteins between children born with small versus normal sizes

The relationships between each birth size indicator and all 982 proteins detected in >10% of children (n > 50) are shown in the four panels of volcano plots (Fig. 1). Percent differences (%) in relative abundance of proteins (x-axis) were estimated with adjustment for potential confounding factors including child age, sex, height, body mass index, ethnicity, caste, schooling, maternal age, parity at the 1st trimester, gestational age, and household wealth index. Proteins that were more or less abundant in children born small and that passed the pre-determined significance cut-off (q < 0.05) are colored in blue and red, respectively. No proteins were differentially abundant based on being stunted (LAZ < −2), underweight (WAZ < −2) or wasted (WLZ < −2) at birth (all q > 0.05) (Fig. 1A–C); however, 25 proteins were differentially abundant in those born with a small versus normal head circumference (HCZ < −2 vs. ≥ −2) (Fig. 1D). Among these proteins, angiopoietin-like 6 (ANGPTL6) was 19.4% more abundant (q = 0.0094) and the remaining 24 proteins were 7–21% less abundant (all q < 0.05) in children born with a small versus normal-sized head, all adjusted for multiple covariates (Table 2).

Figure 1

Volcano plots of differentially abundant plasma proteins in children born with small versus normal sizes. (A) Length-for-age, (B) Weight-for-age, (C) Weight-for-length, and (D) Head circumference-for-age z-scores < −2 versus ≥ −2 (n = 500). All 982 proteins quantified by mass spectrometry in >10% of children were plotted based on corresponding percent differences in relative abundance (x-axis) and −log10(p-values) (y-axis) estimated by using linear mixed-effects models adjusted for child age, sex, height, body mass index, ethnicity, caste, schooling, maternal age, parity at the 1st trimester, gestational age, and household wealth index. Proteins passing the pre-determined significance cut-off (q < 0.05) were colored in blue and red for more and less abundant proteins, respectively, in children with small compared to normal birth sizes. Abbreviations: HCZ, head circumference-for-age z-score; LAZ, length-for-age z-score; WAZ, weight-for-age z-score; WLZ, weight-for-length z-score. Children with implausible or unavailable z-scores were excluded from analyses for LAZ (n = 1), WLZ (n = 42), and HCZ (n = 3)51.

Table 2 Plasma proteins differentially abundant between children born with small head size and children born with normal head size (head circumference-for-age z-scores < −2 or ≥ −2), q < 0.05.

Over-representation analysis showed that 6 annotation terms of the Gene Ontology (GO) database were significantly enriched, by 3- to 12-fold, among the proteins associated with a small head at birth relative to the number expected from the reference list (all Bonferroni-corrected P < 0.05) (Table 3). The enriched GO terms were structural constituent of cytoskeleton (GO:0005200; P = 2.0 × 10⁻⁸) and actin binding (GO:0003779; P = 4.2 × 10⁻³) in the Molecular Function ontology, actin cytoskeleton (GO:0015629; P = 1.9 × 10⁻⁶) and intracellular (GO:0005622; P = 8.1 × 10⁻⁴) in the Cellular Component ontology, and cellular component morphogenesis (GO:0032989; P = 3.5 × 10⁻²) and muscle contraction (GO:0006936; P = 5.0 × 10⁻²) in the Biological Process ontology.

Table 3 Over-represented Gene Ontology categories in the list of proteins differentially abundant between children with small and normal head circumference (head circumference-for-age z-scores < −2 or ≥ −2) at birth, compared to the reference protein list.

The less abundant proteins comprised actin proteins (α- and β-actin); actin-binding proteins that form the actin filament complex (α-actinin, vinculin, talin, parvin, and filamin) and regulate actin cytoskeleton remodeling (cofilin, profilin, and gelsolin); proteins involved in muscle contraction (tropomyosin 3 and 4, transgelin 2, and myosin light polypeptide 6); a chaperone protein (14-3-3 zeta/delta); and glycolytic enzymes [glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and phosphoglycerate kinase 1 (PGK1)]. ANGPTL6 was the only protein that was more abundant in children born with a small vs. normal head circumference. A correlation matrix of the remaining, less abundant proteins shows that they were highly positively correlated with each other (median r = 0.68; IQR: 0.57 to 0.73) (Fig. 2).

Figure 2

Correlation matrix of plasma proteins differentially abundant between children with small head size and children with normal head size (head circumference-for-age z-scores < −2 or ≥ −2) at birth (q < 0.05). Blue and red indicate positive and negative correlations, respectively, and color intensity reflects the strength of association. Abbreviations: ACTA1, actin, alpha skeletal muscle; ACTB, beta actin; ACTN1, alpha-actinin-1; ANGPTL6, angiopoietin-like 6; CALR, calreticulin; CAP1, adenylyl cyclase-associated protein 1; CFL1, cofilin-1; FLNA, filamin-A; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; GSN, gelsolin; HCZ, head circumference-for-age z-score; MSN, moesin; MYL6, myosin light polypeptide 6; PARVB, beta-parvin; PFN1, profilin 1; PGK1, phosphoglycerate kinase 1; PPIA, peptidylprolyl isomerase A; S100A9, protein S100-A9; SH3BGRL3, SH3 domain-binding glutamic acid-rich-like protein 3; TAGLN2, transgelin-2; TLN1, talin 1; TPM3, tropomyosin alpha-3 chain; TPM4, tropomyosin alpha-4 chain; VASP, vasodilator-stimulated phosphoprotein; VCL, vinculin; YWHAZ, 14-3-3 protein zeta/delta.

Discussion

In this typical, rural setting of Nepal, poor prenatal growth manifests as stunting, underweight or wasting at birth, as well as deficits in other less measured dimensions such as a small head circumference. Using an untargeted proteomics approach, we sought to detect plasma proteomic signatures of reduced fetal growth in a generally malnourished population cohort of school-aged children. Our results revealed an absence of quantifiable proteins associated with indicators of newborn weight and length, but a significant differential relative abundance for twenty-five proteins in children born with a small versus normal sized head circumference. These results offer evidence that reduced fetal cranial growth may reflect lastingly altered protein regulation in early life, evident in the plasma proteome at six to eight years of age.

Angiopoietin-like 6 (ANGPTL6) was the only protein whose abundance was elevated in children with a small head circumference at birth. It is a member of an angiopoietin-like family, which is involved in angiogenesis and metabolic homeostasis24. Animal studies have shown increased serum ANGPTL6 concentration to be associated with increased energy expenditure and an improved lipid profile and insulin sensitivity25,26. However, human studies have reported paradoxical results, with serum ANGPTL6 being elevated in women with pregnancy-induced hypertension27,28, diabetic patients29 and individuals with metabolic syndrome30, suggesting compensatory up-regulation mechanisms. Although results from these few studies are mixed, an elevated plasma ANGPTL6 abundance in children with compromised fetal cranial growth may indicate an early trajectory of disturbed endothelial and metabolic functions, which should be further tested in cohort studies.

Most proteins that were less abundant in children with a small head size at birth were actin or actin-binding proteins. Actin is the most abundant intracellular protein, forming the actin filament complex with crosslinkers (α-actinin, vinculin, talin, parvin, and filamin) and assembly/disassembly promoters (cofilin, profilin, and gelsolin)31. A chaperone protein, 14-3-3 zeta/delta (YWHAZ)32, glycolytic enzymes (GAPDH and PGK1)33,34, and peptidylprolyl isomerase A (PPIA)35 also showed high positive correlations with other actin-related proteins (Fig. 2), corroborating their known roles in the regulation of cytoskeleton structure. Few human data support the hypothesis that changes in intracellular structural composition are associated with inadequate intrauterine growth. Some studies have found typical plasma proteins involved in inflammatory or immune response, nutrient transport, and blood coagulation to be differentially abundant in umbilical cord or venous blood samples between IUGR and non-IUGR newborns12,14. The difference between these findings and those reported here may result from the depletion of high-abundance proteins carried out in the present study, which allowed us to detect less abundant intracellular proteins in the plasma36. It is also possible that the differences in immunologic or metabolic responses between IUGR and non-IUGR neonates do not remain significant in childhood. Experimental animal studies have shown changes in expression of cytoskeleton-related proteins in kidney, brain, and small intestine in newborn IUGR offspring8,9,11. Swaili et al. have reported global changes in gatekeeper genes and proteins, including cytoskeletal proteins, in mouse embryos, suggesting that cytoskeletal remodeling and cell cycle regulation are the causal mechanisms of nutritional programming37.

On the other hand, earlier animal studies have suggested that nutritional deprivation in fetal life can disturb cell multiplication and that deficits in tissue or organ cell number are not fully recoverable38. These observations lead us to postulate that reduced abundance of the actin filament complex in plasma may reflect reduced cellularity or impaired cell differentiation of vulnerable tissues, or epigenetic regulation of gene expression related to cell structure and the cell cycle39. The fundamental roles of the cytoskeleton in early developmental processes and in cell physiology and fate may provide a platform to mediate prenatal effects on postnatal life.

Because the identified proteins were specific in their association with head size but not with other body size parameters at birth, one might consider that the affected proteins could reflect neurological impairment in the brain. Head circumference at birth is well correlated with brain growth in newborns40 and has shown positive associations with cognitive abilities of children in some studies41,42. For example, the actin cytoskeleton plays a critical role in developmental processes of the brain including neurite outgrowth, proliferation, and migration43. Disruption of the actin cytoskeleton in the brain is associated with microcephaly and abnormal cortical development44. In the larger cohort of the same children in this study, head circumference at birth was positively associated with test scores of general intelligence, executive function, and motor function45. However, in a separate analysis, we found no association between the proteins observed to be related to a small head circumference and performance on these same cognition tests23. As ANGPTL6 is mainly a liver-derived protein and tropomyosin 3 and 4 are abundant in muscle46, the brain might not be the only organ that was affected by prenatal exposures. Because the actin cytoskeleton, for example, is ubiquitous, it is possible that the identified proteins reflect systemic changes in peripheral tissue proteomes in response to what were probably the most severe nutritional deficits during early life in this study population. The tissue origins and the physiological and clinical significance of the identified proteins need further investigation.

To the best of our knowledge, this is the first comparative plasma proteomics study in human subjects that has examined enduring effects of restricted fetal growth on the plasma proteome in mid-childhood. Rigorous methods of pregnancy assessment and repeated birth anthropometry within 72 hours of birth under rural field settings15 increase confidence in the reliability of neonatal measurements. In the laboratory, our strategies of random sampling and assignment of plasma samples to mass spectrometry channels and experiments minimized chances of contamination or experimental artifacts47. An untargeted and high-throughput proteomics approach offered by mass spectrometry allowed detection of subtle changes in multiple individual proteins that are functionally coherent, strengthening the validity of our findings. Among limitations, although we adjusted for an extensive set of variables covering maternal pregnancy, child characteristics and household socioeconomic status in our models, the possibility of residual confounding cannot be ruled out. Because proteins were quantified on a relative scale, it has not yet been possible to measure absolute changes in plasma abundance. Lastly, proteomics data observed at a single time point are insufficient to definitively discern whether observed differences are transient or persistent. Further cohort assessments at older ages will be required to ascertain whether the observed differences in suppressed or overexpressed proteins are sustained into adulthood and whether these patterns are associated with functional or health outcomes.

In this systematic exploration of the plasma proteome, we identified a novel cluster of biomarkers associated with a constrained head size in a South Asian population of school-aged children. As affected proteins may be expected to vary by population exposure, phenotype and proteomics methods employed, these findings should be considered preliminary and in need of verification in other birth cohorts. Further studies are warranted to examine clinical and public health implications of plasma proteomic patterns associated with growth, nutrition and other exposures early in life.

Methods

Study population and design

In a community-based trial conducted from 1999 to 2001 in Sarlahi District, located in rural southeastern Nepal, nearly 5,000 pregnant women were randomized to receive, from early pregnancy through 12 weeks postpartum, daily antenatal micronutrient supplements containing vitamin A alone as the control or folic acid, iron-folic acid, iron-folic acid-zinc, or multiple micronutrients15. In this trial, iron-folic acid and multiple micronutrient supplements improved multiple dimensions of birth size and reduced the risk of low birthweight compared to the control. Children born to mothers who had participated in the trial were then followed up in 2006–2008, when they were 6–8 years of age16,48. Children in the present plasma proteomics study were a subset of the larger child cohort. Full details of this sub-study sample, study design, and sampling strategies have been published elsewhere17. Briefly, among 3,524 children assessed at the time of the follow-up, 2,130 children met our sampling frame criteria (i.e., availability of sufficient plasma volumes, complete epidemiological data collected during both the maternal trial and child follow-up assessment, and birth size measures obtained within 72 hours after birth). These children were stratified into one of five maternal micronutrient supplementation groups, from which 1,000 were randomly selected, 200 per maternal trial supplement group, for extensive biochemical nutritional analyses49. From each stratum, we randomly selected a 50% sample, or 100 children per maternal trial group, for plasma proteomics analysis. The original maternal micronutrient supplementation trial was registered at ClinicalTrials.gov as NCT00115271. Due to high illiteracy in the study population, oral informed consent was obtained from parents of eligible children by trained field staff during the child follow-up.

Ethical approval for both maternal and child follow-up studies was obtained from the institutional review board at Johns Hopkins University, Baltimore, MD, USA and the Nepal Health Research Council in Kathmandu, Nepal. All methods were carried out in accordance with the approved guidelines and regulations.
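
The two-stage stratified selection described in this section (200 children per maternal supplement group, then 100 per group) can be illustrated with a minimal sketch. Everything here is hypothetical apart from the counts taken from the text: the record layout, the group labels, the seed, and the evenly balanced sampling frame are assumptions for illustration, not the study's actual procedure or data.

```python
import random
from collections import defaultdict

def stratified_sample(children, group_of, n_per_group, rng):
    """Draw n_per_group children at random, without replacement, from each stratum."""
    strata = defaultdict(list)
    for child in children:
        strata[group_of(child)].append(child)
    selected = []
    for group in sorted(strata):
        selected.extend(rng.sample(strata[group], n_per_group))
    return selected

# Hypothetical labels for the five maternal supplement groups named in the text.
GROUPS = ["control", "folic_acid", "iron_folic_acid",
          "iron_folic_acid_zinc", "multiple_micronutrients"]

rng = random.Random(2017)  # arbitrary seed, for reproducibility of the sketch only

# Hypothetical sampling frame of 2,130 eligible children (balanced here for simplicity).
frame = [{"id": i, "group": GROUPS[i % 5]} for i in range(2130)]

# Stage 1: 200 per group for biochemical analyses (1,000 children).
biochem = stratified_sample(frame, lambda c: c["group"], 200, rng)
# Stage 2: a 50% sample of each stratum for proteomics (500 children).
proteomics = stratified_sample(biochem, lambda c: c["group"], 100, rng)
```

Sampling the second stage from the first-stage selection, rather than from the full frame, mirrors the nesting described in the text, so every proteomics child also has the biochemical measurements.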

Birth assessment

Because most women delivered at home, birth anthropometry was conducted by trained anthropometrists during a home visit15. All birth anthropometry data in this study were collected within 72 hours of birth. Birth weight was measured to the nearest 2 g using a digital scale. Recumbent length was determined in triplicate to the nearest 0.1 cm on a length board. Head circumference was measured in triplicate to the nearest 0.1 cm with an insertion tape. We used the median of the three values of length and head circumference and calculated z-scores for weight-for-age (WAZ), length-for-age (LAZ), weight-for-length (WLZ), and head circumference-for-age (HCZ) based on the World Health Organization (WHO) child growth standards50. Implausible z-scores (LAZ < −6, n = 1; WLZ < −5, n = 1; HCZ < −5, n = 3) and unavailable z-scores for WLZ (recumbent length < 45 cm, n = 41) were treated as missing51. For each indicator (WAZ, LAZ, WLZ, and HCZ), children with a z-score less than −2 relative to the reference population of the WHO growth standards were classified as born small, and all other children were classified as normal. Maternal data including age and parity were collected at the 1st trimester of pregnancy. Gestational age was estimated based on the first day of the last menstrual period.
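
The handling of triplicate measurements and the small-versus-normal classification can be sketched as follows. This is an illustrative sketch, not the study's code: the function names are hypothetical, the z-scores are assumed to have been computed already from the WHO standards, and only the plausibility bounds stated in the text are encoded (no bound is stated for WAZ, so none is applied).

```python
from statistics import median

# Lower plausibility bounds taken from the text; z-scores below them are treated as missing.
PLAUSIBLE_MIN = {"LAZ": -6.0, "WLZ": -5.0, "HCZ": -5.0}

def triplicate_value(measurements):
    """Return the median of the three repeated measurements (length or head circumference)."""
    return median(measurements)

def classify_birth_size(indicator, z):
    """Return 'small' (z < -2), 'normal' (z >= -2), or None when the z-score is missing."""
    if z is None:
        return None  # unavailable, e.g. WLZ when recumbent length is < 45 cm
    lower = PLAUSIBLE_MIN.get(indicator)
    if lower is not None and z < lower:
        return None  # implausible value, treated as missing
    return "small" if z < -2 else "normal"

length_cm = triplicate_value([47.5, 47.6, 47.9])
status = classify_birth_size("HCZ", -2.3)
```

Returning `None` for missing or implausible values keeps those children out of the comparison for that indicator only, which matches the indicator-specific exclusions reported in the Figure 1 caption.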

Child follow-up assessment & blood sample collection

Child characteristics (e.g., literacy, attained years of schooling) and household socio-economic status (e.g., asset ownership, ethnicity, caste and head of household education) were collected during the follow-up study16. The same team of trained anthropometrists as during the maternal trial visited children in their homes to measure child weight, height, and left mid-upper arm circumference following standard procedures. Height-for-age, weight-for-age and body mass index [BMI, weight (kg) / height² (m²)]-for-age z-scores were calculated based on the WHO growth reference52. On the morning after the anthropometry assessment, field phlebotomists visited the homes and collected overnight-fasted venous blood samples from children48. Biospecimens were brought to the field laboratory for plasma extraction, stored and shipped in dry liquid nitrogen tanks to the Center for Human Nutrition, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA, where they were stored in −80 °C freezers until thawed for analyses.

Plasma proteomics

Plasma proteomics analysis procedures have been previously reported17. Briefly, six high-abundance proteins (albumin, haptoglobin, immunoglobulin A and G, transferrin, and anti-trypsin), comprising 85% of total plasma proteins, were removed from each of the 500 plasma samples (40 μl each) to enhance detection sensitivity for low-abundance proteins, using a Human 6 multiple affinity removal system column (Agilent Technologies, California, USA)36. Depleted plasma samples (each containing ~100 μg of protein) were treated with trypsin overnight for protein digestion. Peptide samples from 7 individual samples and one pooled sample (internal standard) were randomly labeled with 8-plex isobaric Tags for Relative and Absolute Quantitation (iTRAQ) reagents (AB Sciex), which contain different reporter ions. The eight samples were combined and separated by strong cation exchange chromatography into 24 fractions. Each fraction of labeled peptide samples was analyzed by mass spectrometry using an Eksigent 2D nano LC interfaced with an LTQ Orbitrap Velos mass spectrometer (Thermo Scientific). Peptides were identified by searching precursor and fragment mass data against the RefSeq 40 protein database using MASCOT (Matrix Science) through Proteome Discoverer software (v1.3, Thermo Scientific). Peptide identification was performed with a confidence threshold of <5% false discovery rate. A total of 72 iTRAQ 8-plex mass spectrometry experiments were run for all plasma samples of children (n = 500).
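
The experiment count follows from the design: with 7 individual samples plus one pooled standard per 8-plex experiment, 500 samples need ceil(500 / 7) = 72 experiments. The sketch below illustrates one way such a random assignment could be laid out. It is an assumption-laden illustration: the function name, the `POOLED_STANDARD` label, the seed, and the choice to randomize the pooled standard's channel are hypothetical, and how the partly filled final experiment was handled in the study is not stated in this section.

```python
import math
import random

def assign_to_experiments(sample_ids, per_experiment=7, n_channels=8, rng=None):
    """Randomly assign samples to 8-plex experiments, one pooled standard per experiment.

    Returns a list of dicts mapping channel index -> sample id or 'POOLED_STANDARD'.
    """
    rng = rng or random.Random()
    ids = list(sample_ids)
    rng.shuffle(ids)  # random order of samples across experiments
    experiments = []
    for start in range(0, len(ids), per_experiment):
        batch = ids[start:start + per_experiment]
        channels = list(range(n_channels))
        rng.shuffle(channels)  # random reporter-ion channel within the experiment
        layout = {channels[0]: "POOLED_STANDARD"}
        for channel, sample_id in zip(channels[1:], batch):
            layout[channel] = sample_id
        experiments.append(layout)
    return experiments

experiments = assign_to_experiments(range(500), rng=random.Random(0))
n_expected = math.ceil(500 / 7)  # 71 full experiments plus one partly filled
```

In this sketch the last experiment carries only 3 individual samples (500 − 71 × 7), leaving 4 channels unused.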

Statistical analyses

Statistical methods of protein relative abundance quantification from the iTRAQ reporter ions were previously reported47. Briefly, the relative abundance of proteins in each channel of each iTRAQ experiment was estimated by computing the median of all the median-polished log2-transformed iTRAQ reporter ion intensities across all spectra belonging to each protein. Varying numbers of missing values were observed across proteins, and unobserved values were considered to be missing at random47. We estimated mean differences in relative abundance of proteins between two groups of children classified by small vs. normal size at birth. We employed linear mixed-effects models with each protein as a dependent variable, each dichotomized birth size group as a fixed effect, and iTRAQ experiment as a random effect to take into account any random effects that can be derived from extreme values. P-values were calculated by using a two-sided test of the null hypothesis that there is no difference in protein relative abundance between the two groups. We estimated q-values to control the false discovery rate (FDR) and considered proteins passing an FDR threshold of <5% (q < 0.05) as being significantly differentially abundant53. We considered child age, sex, height, body mass index, and schooling as covariates. We identified household ethnicity, caste, and wealth index, maternal age and parity during pregnancy, and gestational age as potential confounding factors and adjusted for them. The household wealth index variable was created by calculating the 1st principal component of the polychoric correlation of selected items of household assets (construction materials of ground floor, first floor, and roof of house, bicycle, radio, television, electricity, cattle, goat, and land ownership). Maternal micronutrient supplementation during pregnancy was not included in the adjusted model because it had no effect on the child plasma proteome (Lee SE et al., unpublished data, 2017).

We report adjusted differences in relative abundance of proteins between small vs. normal size at birth; unadjusted estimates are listed in Supplementary Table S2. We drew volcano plots to display all analyzed plasma proteins, with the corresponding adjusted percentage differences in relative abundance of proteins (%) on the x-axis and statistical significance (−log10(p-value)) on the y-axis.
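
Two quantities used on the volcano plots can be illustrated with a short sketch, under stated assumptions. First, because abundances are on a log2 scale, a model-estimated mean difference d can be expressed as a percent difference of (2^d − 1) × 100; that this is the exact conversion used in the study is an assumption. Second, the text cites reference 53 for q-values without naming the estimator in this section, so the Benjamini-Hochberg adjustment below is a stand-in technique for FDR control, not necessarily the authors' method. Function names are hypothetical.

```python
def percent_difference(log2_diff):
    """Convert a difference on the log2 scale to a percent difference on the raw scale."""
    return (2.0 ** log2_diff - 1.0) * 100.0

def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (a stand-in for the q-values in the text)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical p-values for four proteins.
q = bh_qvalues([0.01, 0.04, 0.03, 0.005])
significant = [i for i, value in enumerate(q) if value < 0.05]
```

Under this conversion a log2 difference of −0.3, for example, corresponds to a protein that is roughly 19% less abundant in the small-size group.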

We built a correlation matrix of proteins associated with small size at birth to examine biological relationships among the associated proteins. Because proteins were quantified as relative abundance within each iTRAQ experiment, we calculated protein-protein Pearson correlation coefficients using complete pairwise data within each mass spectrometry experiment and averaged the coefficients across all experiments. The order of proteins was determined by optimal leaf ordering, which places more highly correlated elements adjacent to one another.
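
The within-experiment correlation and averaging step can be sketched as below. This is a minimal illustration, not the study's implementation: the data layout (one dict per experiment mapping protein name to per-channel values, with `None` for missing), the function names, the example values, and the minimum of 3 complete pairs are all assumptions.

```python
import math

def pearson(x, y, min_pairs=3):
    """Pearson correlation using only positions where both values are observed."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < min_pairs:
        return None
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # correlation undefined when one variable is constant
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    return sxy / math.sqrt(sxx * syy)

def averaged_correlation(experiments, protein_a, protein_b):
    """Average the per-experiment correlations, skipping experiments where r is undefined."""
    coefficients = []
    for experiment in experiments:
        r = pearson(experiment[protein_a], experiment[protein_b])
        if r is not None:
            coefficients.append(r)
    return sum(coefficients) / len(coefficients) if coefficients else None

# Hypothetical relative abundances in two experiments.
toy_experiments = [
    {"ACTB": [1.0, 2.0, 3.0, 4.0], "PFN1": [2.0, 4.0, 6.0, 8.0]},
    {"ACTB": [1.0, 2.0, 3.0, None], "PFN1": [3.0, 2.0, 1.0, 5.0]},
]
r_mean = averaged_correlation(toy_experiments, "ACTB", "PFN1")
```

Computing r inside each experiment before averaging avoids mixing values across experiments, where relative abundances are not on a common scale.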

Functional analysis

To identify statistically over- or under-represented functional clusters in the list of proteins differentially abundant between small vs. normal size at birth, we conducted an over-representation test using the PANTHER (Protein ANalysis THrough Evolutionary Relationships) classification system (version 11.1, released 2016-10-24)54. For protein annotation, we used the default PANTHER Gene Ontology (GO)-Slim datasets, which are hierarchically composed of GO terms in three aspects: molecular function, cellular component, and biological process55. Identified proteins associated with small birth size were used as the input analyzed list, and all proteins quantified by mass spectrometry and detected in >10% of study children were used as the input reference list. Numbers of classified proteins in the analyzed and reference lists were compared in each functional cluster. P-values were calculated by binomial statistics under the null hypothesis that identified proteins associated with small size at birth are sampled from the same general population as proteins from the reference set56. Annotation categories with a Bonferroni-corrected p-value < 0.05 were considered statistically significant.
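
The logic of a binomial over-representation test can be sketched as follows. This is a simplified illustration and not PANTHER's implementation: it computes only the upper-tail (over-representation) probability, and the category counts and number of tested categories in the example are hypothetical. Only the list sizes (25 analyzed, 982 reference) come from the text.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def over_representation(k_hits, n_analyzed, ref_hits, n_reference, n_tests):
    """Return (fold enrichment, raw p-value, Bonferroni-corrected p-value) for one category."""
    p_category = ref_hits / n_reference     # category frequency in the reference list
    expected = n_analyzed * p_category      # hits expected by chance in the analyzed list
    fold = k_hits / expected
    p_raw = binomial_upper_tail(k_hits, n_analyzed, p_category)
    p_bonferroni = min(1.0, p_raw * n_tests)
    return fold, p_raw, p_bonferroni

# Hypothetical category: 40 of 982 reference proteins and 10 of 25 analyzed proteins
# carry the annotation; 200 categories tested (hypothetical).
fold, p_raw, p_bonferroni = over_representation(10, 25, 40, 982, 200)
```

The Bonferroni step multiplies each raw p-value by the number of categories tested and caps the result at 1, which is conservative when categories overlap, as GO terms often do.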

Data availability

The datasets of birth anthropometry and relative abundance of proteins included in this published article are available in Supplementary Table S3. All analyses were performed by using the R Environment for Statistical Computing (version 3.1.2; R Development Core Team).