Introduction

Human skin color is one of the most conspicuous phenotypic variations among humans and is determined primarily by the amount and type of melanin synthesized within melanosomes in melanocytes (Thong et al., 2003). Human skin color varies widely, ranging from dark brown to nearly colorless (Sturm, 2009).

Owing to advances in high-throughput SNP genotyping technology, several genome-wide association studies (GWAS) of pigmentary traits (iris color, hair color, skin color, and tanning ability) have been conducted. To date, eleven genes have been identified as related to pigmentation: TYR (11q14.3), TYRP1 (9p23), OCA2 (15q12-q13.1), SLC45A2 (5p13.2), SLC24A5 (15q21.1), MC1R (16q24.3), ASIP (20q11.22), KITLG (12q21.32), SLC24A4 (14q32.12), IRF4 (6p25.3), and TPCN2 (11q13.3) (Stokowski et al., 2007; Sulem et al., 2007, 2008; Han et al., 2008; Kayser et al., 2008; Nan et al., 2009). Comparative genomics has identified 378 loci (171 cloned genes and 207 uncloned genes) that influence pigmentation in mice; these loci and their human and zebrafish homologues are available from the ESPCR web site (www.espcr.org/micemut/). Thus, hundreds of candidate genes have been found in mice, whereas only eleven genes are known to be related to human pigmentary traits. In addition, three of these eleven genes (SLC24A4, IRF4, and TPCN2) were newly identified in human GWAS and had not previously been known to be related to color variation in mice (Sturm, 2009). Therefore, as-yet-unidentified genetic loci and culprit genes associated with human pigmentary traits may well exist.

Most genes reported by those GWAS are associated with general pigmentary traits. Considering that individuals of European ancestry show a higher degree of variation in eye and hair color than in skin color (Rees, 2003; Sturm and Frudakis, 2004), the genes regulating skin color variation may differ from those regulating eye and hair color. Furthermore, most previous studies of skin color focused largely on tanning ability. Notably, the question of which genes are responsible for variation in constitutive skin color among individuals has not yet been clearly answered. The only GWAS of constitutive skin color was conducted by Stokowski et al. (2007), who identified SLC24A5 (15q21.1), TYR (11q14.3), and SLC45A2 (5p13.2) as candidate genes. They measured skin color at six sites (both inner forearms, the outer forearms, and the inner arms above the elbows) and regarded the lightest skin color among those sites as constitutive skin color. To measure constitutive skin color more precisely and to further control for the confounding effect of sun exposure, we conducted a gene mapping study using objectively measured skin color in the buttock area, a prime example of a site with a lifelong lack of sun exposure.

Human skin color is a complex trait influenced by multiple genes. Isolated populations are well suited to identifying genes that regulate complex traits because they have reduced genetic heterogeneity and environmental variation. We therefore selected an isolated population in rural Mongolia whose members share the same ethnic background and lifestyle.

In this study, we conducted a genome-wide linkage analysis of intrinsic skin color and then used family-based association tests for fine mapping within the regions showing the highest linkage peaks.

Results

Participant characteristics

Table 1 shows the basic characteristics of the pedigrees and the subjects. The pedigree data used for this study comprised 344 individuals from 59 families, with 706 parent-offspring, 155 sibling, 74 half-sibling, 384 grandparent-grandchild, 175 avuncular, and 69 cousin pairs. The average pedigree size was 11 (range, 3 to 33). The mean age of the study subjects was 30.4 yr. The proportion of females (n = 199, 57.8%) was higher than that of males (n = 145, 42.2%).

Table 1. Basic characteristics of the pedigrees and the subjects. SD, standard deviation; MI, melanin index.

The melanin index (MI) is widely used as an objective indicator of skin color and is lower in individuals with lighter skin. We examined the MI in the buttock, which is representative of unexposed skin. The detailed results for MI according to sex and age are shown in Table 2. The mean MI in females (mean = 255.3, SD = 111.9) was significantly lower than that in males (mean = 319.5, SD = 126.4; P value < 0.0001). To investigate the effect of age on the sexual dimorphism of skin color, we compared subjects before puberty (< 13 yr) and after puberty (≥ 13 yr); these groups comprised 65 and 279 subjects, respectively. The mean MI was lower after puberty than before puberty in both sexes. Before puberty, there was no sex difference in mean MI, whereas after puberty the mean MI of females was significantly lower than that of males (P value < 0.0001).

Table 2. Comparison of MI between males and females before and after puberty. SD, standard deviation; MI, melanin index. *P values for the comparison between males and females were estimated with the Kolmogorov-Smirnov two-sample test because MI was not normally distributed.
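As an illustration of the two-sample Kolmogorov-Smirnov comparison named in the Table 2 footnote, the following is a minimal pure-Python sketch. It is not the SAS procedure used in the study: the function names, the toy MI values, and the use of the asymptotic Kolmogorov series for the P value are all assumptions made for illustration, and the asymptotic approximation is rough for samples this small.

```python
import math
from bisect import bisect_right


def ks_statistic(x, y):
    """Largest absolute gap between the two empirical distribution functions."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)  # share of x values <= v
        fy = bisect_right(ys, v) / len(ys)  # share of y values <= v
        d = max(d, abs(fx - fy))
    return d


def ks_asymptotic_p(d, n, m, terms=100):
    """Asymptotic two-sided P value from the Kolmogorov series (an approximation)."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The series converges slowly here and the true value is essentially 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical MI values for two groups; these are NOT the study's data.
group_a = [210, 230, 250, 260, 280]
group_b = [300, 320, 340, 310, 290]

d_stat = ks_statistic(group_a, group_b)
p_value = ks_asymptotic_p(d_stat, len(group_a), len(group_b))
```

Because the test compares whole distributions rather than means, it does not require the normality assumption that the untransformed MI failed to meet.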

Familial correlations and heritability

The familial correlations between family pairs and the heritability of MI are shown in Table 3. The parent-offspring correlation was the largest (pair r = 0.52, SE = 0.08) and was significant (P value < 0.0001). The correlation between siblings was also statistically significant (P value = 0.002). Notably, the spouse correlation was not significant (P value = 0.971). The heritability of MI was very high and statistically significant (h² = 0.82, SE = 0.11; P value < 0.0001). These results provide strong evidence that genetic factors are important in controlling skin color.

Table 3. Intraclass correlations between family pairs and heritability of MI. SE, standard error. *Heritability and intraclass correlations were estimated from residual values after adjustment for significant covariates such as sex, age, and sex × age².
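For reference, the narrow-sense heritability reported in Table 3 can be written as a variance ratio. The symbols below are notational assumptions, and the split of the phenotypic variance into an additive genetic part and a residual part is the usual polygenic variance-component form, assumed here because the text does not spell out the fitted model.

```latex
h^2 = \frac{\sigma_A^2}{\sigma_P^2},
\qquad
\sigma_P^2 = \sigma_A^2 + \sigma_E^2
```

Here \sigma_A^2 is the additive genetic variance, \sigma_E^2 the residual (environmental) variance, and \sigma_P^2 the phenotypic variance of the covariate-adjusted MI. Under this reading, an estimate of h² = 0.82 attributes about 82% of the adjusted phenotypic variance to additive genetic effects.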

Linkage analysis

In the genome-wide multipoint linkage scan, we found four suggestive regions with a logarithm of odds (LOD) score over 2 among the 22 autosomes (Table 4 and Figure 1). An LOD score over 3.3 has been proposed as the threshold for significant evidence of linkage (Lander and Kruglyak, 1995). The highest LOD score, 3.39, was found on 11q24.2 and met this significance threshold; the nearest marker in that region was D11S4151 (empirical P value < 0.0001). On chromosome 11, the linkage support interval, spanning LOD scores from 1.5 to the maximum, ranged from 123 cM to 147 cM. The other candidate linkage regions were located on chromosome 17 (LOD score = 3.03, empirical P value = 0.0002; nearest marker D17S794, support interval 89-99 cM), chromosome 6 (LOD score = 2.76, empirical P value = 0.0002; nearest marker D6S1687, support interval 147-156 cM), and chromosome 13 (LOD score = 2.27, empirical P value = 0.001; nearest marker D13S797, support interval 100-105 cM).

Table 4. Results of genome-wide linkage scan for MI (LOD score > 2). LOD, logarithm of odds. *The support interval is defined as the region from an LOD score of 1.5 to the maximum LOD score.
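The support interval in the Table 4 footnote can be illustrated with a short sketch. The wording can be read either as the region where the LOD score stays at or above an absolute value of 1.5, or as a 1.5-LOD drop from the peak; the helper below therefore takes the cutoff as a parameter and shows both readings. The function name and the LOD profile are hypothetical and are not taken from the study.

```python
def support_interval(profile, cutoff):
    """Return (start_cM, end_cM) of the contiguous run around the peak
    in which every LOD score is at or above the cutoff.

    profile: list of (position_cM, lod) tuples ordered by position.
    """
    peak = max(range(len(profile)), key=lambda i: profile[i][1])
    left = peak
    while left > 0 and profile[left - 1][1] >= cutoff:
        left -= 1
    right = peak
    while right < len(profile) - 1 and profile[right + 1][1] >= cutoff:
        right += 1
    return profile[left][0], profile[right][0]


# Hypothetical LOD profile along one chromosome; NOT the study's data.
toy_profile = [(110, 0.4), (115, 0.9), (120, 1.7), (125, 2.4),
               (130, 3.4), (135, 2.2), (140, 1.8), (145, 1.1)]
peak_lod = max(lod for _, lod in toy_profile)

# Reading 1: absolute cutoff, LOD >= 1.5.
absolute_interval = support_interval(toy_profile, cutoff=1.5)
# Reading 2: 1.5-LOD drop from the peak, LOD >= peak - 1.5.
drop_interval = support_interval(toy_profile, cutoff=peak_lod - 1.5)
```

On this toy profile the absolute cutoff gives a wider interval than the 1.5-LOD drop, which is why the definition matters when selecting SNPs for fine mapping.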
Figure 1

Genome-wide multipoint linkage analysis results for MI across 22 autosomes (A), peak of linkage on chromosome 11 (B), chromosome 17 (C), chromosome 6 (D), and chromosome 13 (E).

To exclude the effect of the difference in mean MI before and after puberty, we also carried out an additional linkage analysis restricted to postpubertal subjects. The linkage peaks in our four candidate regions were maintained, although the maximum LOD scores were slightly reduced (data not shown).

Family-based association test

To test for association in the presence of linkage, we conducted a family-based association study of MI within the linkage support intervals of the four candidate linkage regions (Figures 1B-1E). For fine mapping under a linkage peak, we used the null hypothesis of linkage but no association. This approach is applicable when the same data set is used to test both linkage and association (Laird and Lange, 2006). The family-based association results under the linkage peaks are shown in Table 5. In the linkage region on chromosome 11, we identified ten significant SNPs (P value < 1.0 × 10⁻⁶), six of which were located within three candidate genes: ETS1 (v-ets erythroblastosis virus E26 oncogene homolog 1 (avian)), UBASH3B (ubiquitin associated and SH3 domain containing, B), and ASAM (adipocyte-specific adhesion molecule). In addition, two significant SNPs were found on chromosome 17, both located within the CLTC (clathrin, heavy chain (Hc)) gene, another candidate gene for regulating skin color. On chromosomes 6 and 13, no SNP reached P value < 1.0 × 10⁻⁶. These candidate genes are expressed in multiple tissues, including skin (Supplemental Data Figure S1). Figure 2 shows regional association plots for the loci near the four candidate genes.

Table 5. Family-based association results between SNPs and MI under significant linkage regions. Chr, chromosome; FBAT, family-based association test; MAF, minor allele frequency. The 12 SNPs that reached significance (P value < 1.0 × 10⁻⁶) are shown. *Positions are based on NCBI Build 36.
Figure 2

Regional association plots for the ETS1 (A), UBASH3B (B), and ASAM (C) regions on chromosome 11 and the CLTC (D) region on chromosome 17. The blue diamond indicates the most strongly associated SNP in each gene region. The circle colors represent the LD structure with the most strongly associated SNP (white, r² < 0.2; yellow, 0.2 ≤ r² < 0.4; orange, 0.4 ≤ r² < 0.8; red, r² ≥ 0.8).

In addition, we carried out a replication study of the 11 genes previously reported to be associated with pigmentary traits: TYR (11q14.3), TYRP1 (9p23), OCA2 (15q12-q13.1), SLC45A2 (5p13.2), SLC24A5 (15q21.1), MC1R (16q24.3), ASIP (20q11.22), KITLG (12q21.32), SLC24A4 (14q32.12), IRF4 (6p25.3), and TPCN2 (11q13.3). The association results are shown in Supplementary Data Table S1. No SNP remained significant after adjustment for multiple testing with the stringent Bonferroni correction. We therefore applied a less stringent significance level of P value ≤ 1 × 10⁻². Under this criterion, 5 of the 11 candidate genes were associated with MI (TYR, OCA2, SLC45A2, SLC24A4, and IRF4). The lowest P value in our results was found in the OCA2 gene (P value = 0.0018).
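The arithmetic behind the two significance criteria can be sketched as follows. The text does not state the family-wise error rate or the number of tests used for the Bonferroni correction; the sketch assumes α = 0.05 and takes the 2,360 SNPs listed in the Methods for this analysis as the number of tests. Both choices are assumptions made for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


ALPHA = 0.05    # assumed family-wise error rate; not stated in the text
N_SNPS = 2360   # SNPs in the replication analysis (Methods); assumed test count

threshold = bonferroni_threshold(ALPHA, N_SNPS)  # roughly 2.1e-5

lowest_p = 0.0018                            # lowest P value reported (OCA2)
passes_bonferroni = lowest_p <= threshold    # False under these assumptions
passes_relaxed = lowest_p <= 1e-2            # True under the relaxed criterion
```

Under these assumptions the OCA2 signal falls about two orders of magnitude short of the corrected threshold, which is consistent with its description as a moderate association.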

Discussion

Skin color is known to be related to age and sex. Sexual dimorphism in skin color has been reported, with females tending to have lighter skin color than males after puberty (Cartwright, 1975; Clark et al., 1981; Jaswal, 1983; Frost, 1988; Mehrai and Sunderland, 1990; Williams-Blangero and Blangero, 1991). Lightening of skin color is reported to occur during puberty in both sexes, but the process is less noticeable in males, so that females are lighter than males in adulthood (Mesa, 1983). In a recent study, the age at menarche in an Asian rural area whose population shares demographic features with our participants was reported to be 12.8 yr (Rah et al., 2009). We therefore divided the participants into those younger than 13 yr and those aged 13 yr or older and analyzed the effect of puberty on skin color. Table 2 shows the MI of participants according to age and sex. Our data were consistent with these reports.

Familial correlation and heritability are widely used as indicators of the genetic contribution to a trait of interest before a gene mapping study is undertaken (Burton and Tobin, 2005). We found that the parent-offspring and sibling correlations for MI were the strongest, whereas the spouse correlation, which reflects the effect of a shared environment, was not significant (Table 3). This pattern of significant correlations among closer biological relatives supports an important genetic effect. Previous studies reported high heritability of human skin color, ranging from 0.55 to 0.83 (Clark et al., 1981; Frisancho et al., 1981). Our heritability estimate of 0.82 is comparable (Table 3).

Through genome-wide linkage analysis, we found four candidate regions associated with the regulation of constitutive skin color (Table 4 and Figure 1). In the family-based association study under the linkage regions on chromosomes 11, 17, 6, and 13, we found twelve significant SNPs and four candidate genes (Table 5 and Figure 2). To the best of our knowledge, our linkage locus and significant SNPs have not previously been reported to be related to variation in human skin color.

Significant evidence of linkage was observed on 11q24.2, with an LOD score of 3.39, and the second most statistically significant SNP (rs7126621) was located within the gene ETS1, which encodes the Ets-1 transcription factor (Figure 2A). Target genes of Ets-1 include various matrix metalloproteinases, and Ets-1 has been reported to be related to the invasion of malignant melanoma (Rothhammer et al., 2004). Intriguingly, during early embryogenesis, Ets-1 is expressed in migrating neural crest cells, which eventually differentiate into melanocytes (Meyer et al., 1997). The gene KITLG (KIT ligand) is also expressed in those cells and regulates migration from the neural crest to the skin, and an association between polymorphisms within KITLG and pigmentation was previously reported (Sulem et al., 2007; Han et al., 2008). These data suggest that genes regulating the early migration of melanocyte progenitors could affect skin color and that ETS1 could have a role in regulating skin pigmentation.

The third most significant SNP of the region (rs10790522) and two other significant SNPs (rs10790521 and rs11218771) were located within the gene UBASH3B (Figure 2B). The gene product of UBASH3B is regarded as being involved in the ubiquitination pathway. Interestingly, mice with a null mutation in the gene MGRN1 (mahogunin, ring finger 1), which encodes an E3 ubiquitin ligase, show a dark coat color and produce only eumelanin (dark, brown-black melanin) and not pheomelanin (light, red-yellow melanin) (He et al., 2003). Furthermore, ubiquitin-mediated proteasomal degradation plays an important role in the balance between synthesis and degradation of tyrosinase, the critical rate-limiting enzyme in melanin biosynthesis (Park et al., 2009). These data suggest that an abnormal ubiquitination pathway might be responsible for controlling constitutive skin color.

Both significant SNPs, rs11605822 and rs11607994, were located within the ASAM gene, also known as CLMP (CXADR-like membrane protein) (Figure 2C). The gene product of ASAM is regarded as being involved in cell-to-cell adhesion and cell migration. This gene was reported to be expressed in malignant melanoma cells (Cheng et al., 2006). Therefore, as in the case of ETS1, we think it could affect constitutive skin color through involvement in the migration of early melanocyte progenitors.

On 17q23.2, we found a candidate linkage region with an LOD score of 3.03. Both significant SNPs in that region (rs10515171 and rs7224631) were located within the gene CLTC (Figure 2D). The CLTC gene encodes a clathrin, a major protein component of intracellular organelles that functions in sorting cargo molecules in membrane traffic pathways. Clathrin is found in stage I melanosomes, the melanocyte-specific organelles in which the synthesis, storage, and transport of melanin take place (Wasmeier et al., 2008). A polymorphism in this gene could alter the trafficking of melanosomal proteins and could therefore affect constitutive skin color.

In the association study performed to verify genes previously reported to be associated with human pigmentary traits, we replicated several genes with moderate significance: TYR, OCA2, SLC45A2, SLC24A4, and IRF4. Although these associations were moderate, OCA2, one of the most frequently replicated candidate genes for pigmentary traits, had the lowest SNP P value in this replication analysis. These data suggest that our results extend previous studies of human pigmentary traits. However, we also identified genes that have not previously been reported in relation to human skin color and that showed much stronger associations than the known candidate genes. Several reasons may explain this. First, most of the genes reported in past studies are associated with general pigmentary traits or tanning tendency, whereas this study was designed to examine constitutive skin color; the genes we found therefore appear to be responsible only for the genetically determined color of the skin. Second, our family-based study design may provide an opportunity to discover as-yet-unidentified genetic loci because, unlike population-based studies, family studies can detect rare effects present in specific families (Manolio et al., 2009). Although a replication study to validate our novel results is still needed, the consistent results from our two gene mapping studies make this hypothesis reasonable. In addition, the family-based design is robust against population stratification, which strengthens our results.

In conclusion, in the genome-wide multipoint linkage scan, we found a novel genomic region regulating constitutive skin color on 11q24.2, with an LOD score of 3.39. In addition, we discovered three other candidate regions for controlling skin pigmentation: 17q23.2, 6q25.1, and 13q33.2. Furthermore, family-based association tests within the linkage support regions revealed ten and two SNPs that reached significance in the linkage regions on chromosomes 11 and 17, respectively, suggesting four possible candidate genes: ETS1, UBASH3B, ASAM, and CLTC. Our results could serve as a basis for further functional studies of melanocyte biology and may contribute to improving the current treatment of pigmentary skin diseases.

Methods

Participants

This study was conducted as part of the GENDISCAN (Gene Discovery for Complex traits in Asian population of Northeast area) project, which is designed for the genetic mapping of complex traits in Asian populations (Im et al., 2009, 2010; Lee et al., 2009, 2010). Skin pigmentation was measured in a total of 750 individuals recruited in Dashbalbar, Dornod Province, Mongolia. To increase the power to detect linkage, we selected 59 large pedigrees consisting of 344 individuals and genotyped them. Basic information, including pedigree relationships, was confirmed by personal interviews, genotype data, and questionnaires. Informed consent was obtained from all recruited subjects, and our study protocol was approved by the institutional review board (IRB) of Seoul National University (approval number, H-0307-105-002). The study was conducted according to the Declaration of Helsinki Principles.

Skin pigmentation measurement

Skin pigmentation was measured with a Mexameter MX18® (Courage and Khazaka, Köln, Germany). The measurement is based on the absorption principle: the probe of the Mexameter MX18 emits light, and a receiver measures the light reflected by the skin. The MI is calculated from the proportion of the light emitted by the probe that is absorbed by the skin and ranges from 0 to 999. The measurement site was the buttock area, which was regarded as representative of constitutive skin color.

Genotyping

Most of our genotyping protocols were described previously (Ju et al., 2008; Kim et al., 2009). Briefly, leukocyte DNA was extracted by standard protocols. For all subjects, 1,039 short tandem repeat (STR) microsatellite markers were genotyped. Markers with Mendelian or non-Mendelian errors were removed. In addition, SNP markers were genotyped in subsamples with the Illumina Human 610-Quad Beadchip for the family-based association study under the linkage regions. In SNP quality control, markers with a low call rate (< 99%), a high error rate (> 1%), or a low MAF (< 0.01) were excluded. Chromosomes 11, 17, 6, and 13 (maximum LOD score > 2) were selected, and on each chromosome the linkage support region was defined as ranging from an LOD score of 1.5 to the maximum LOD score. A total of 7,087 SNPs in these regions were used. Additionally, we carried out a replication study of the 11 genes previously reported to be associated with human pigmentation, using the SNPs within 500 kb upstream and downstream of each gene. A total of 2,360 SNPs were used in this analysis.
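The SNP quality-control criteria above amount to three threshold checks per marker. The sketch below applies them to a few made-up records; the class name, field names, and SNP identifiers are hypothetical, and it is not the pipeline actually used with the Illumina data.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    snp_id: str
    call_rate: float    # fraction of samples with a genotype call
    error_rate: float   # fraction of genotypes flagged as errors
    maf: float          # minor allele frequency


def passes_qc(snp, min_call_rate=0.99, max_error_rate=0.01, min_maf=0.01):
    """Keep a SNP unless it has call rate < 99%, error rate > 1%, or MAF < 0.01."""
    return (snp.call_rate >= min_call_rate
            and snp.error_rate <= max_error_rate
            and snp.maf >= min_maf)


# Hypothetical markers; identifiers and values are invented for illustration.
snps = [
    SnpStats("snp_a", call_rate=0.998, error_rate=0.000, maf=0.23),
    SnpStats("snp_b", call_rate=0.975, error_rate=0.000, maf=0.31),   # low call rate
    SnpStats("snp_c", call_rate=0.999, error_rate=0.020, maf=0.12),   # high error rate
    SnpStats("snp_d", call_rate=0.999, error_rate=0.000, maf=0.004),  # low MAF
]

kept = [s.snp_id for s in snps if passes_qc(s)]
```

A marker is dropped as soon as it fails any one criterion, so the filters can be applied in any order with the same result.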

Statistical analysis

Before genetic analysis, we evaluated whether MI in the buttock was normally distributed. The distribution of MI was right skewed (Kolmogorov-Smirnov P value < 0.01) and, after log transformation, was normally distributed (Kolmogorov-Smirnov P value = 0.13). Basic statistical analysis was performed with SAS, version 9.1. We considered age, sex, age × sex, age², and age² × sex as covariates. Significant covariates were selected with the screen option of the Sequential Oligogenic Linkage Analysis Routines (SOLAR) package (Almasy and Blangero, 1998), and the residual phenotypic variation of MI after adjustment for these covariates was used. To assess the genetic contribution to MI, we estimated familial correlations and heritability. The FCOR option in Statistical Analysis for Genetic Epidemiology (S.A.G.E.), Release 6.0.1 (http://darwin.cwru.edu/) was used to estimate familial correlations (Keen and Elston, 2003). Narrow-sense heritability (h²), the proportion of phenotypic variance attributable to additive genetic variance, was estimated with the variance component method in SOLAR. Multipoint identity-by-descent (MIBD) matrices were calculated using the Markov chain Monte Carlo method in the Loki package. To localize the QTLs that influence MI, multipoint linkage analysis was performed using SOLAR. To estimate empirical P values, we performed 10,000 simulation replicates using the lodadj option in SOLAR. The PBAT tool in Helixtree software, version 6.4 (GoldenHelix Inc., Bozeman, MT) was used for family-based association testing. Based on the "Linkage and no association (sandwich variance)" null hypothesis, which can be used in extended pedigrees because it provides a robust variance estimate, we used the FBAT-GEE (generalized estimating equation for FBAT) test statistic and an additive genetic model.
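The idea of an empirical P value from simulation replicates can be sketched generically: it is the share of LOD scores simulated under the null hypothesis of no linkage that reach or exceed the observed score. This is an illustration of the principle only, not a description of how SOLAR computes it internally; the function name and the toy null scores are hypothetical.

```python
def empirical_p(observed_lod, null_lods):
    """Return (p, exceed): the share and the count of null replicates
    whose LOD score is at least as large as the observed one."""
    exceed = sum(1 for lod in null_lods if lod >= observed_lod)
    return exceed / len(null_lods), exceed


# Ten hypothetical null replicates; the study used 10,000 simulated ones.
toy_null = [0.0, 0.1, 0.0, 0.5, 1.2, 0.0, 0.3, 2.1, 0.0, 0.7]

p_moderate, n_moderate = empirical_p(1.0, toy_null)   # 2 of 10 replicates reach 1.0
p_extreme, n_extreme = empirical_p(3.39, toy_null)    # no replicate reaches 3.39

# With N replicates the smallest resolvable P value is 1/N, so zero exceedances
# are reported as "P < 1/N" rather than as P = 0.
resolution_study = 1 / 10_000
```

With 10,000 replicates the resolution is 0.0001, which matches the form in which the smallest empirical P value is reported in the Results (< 0.0001).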