Abstract
The genetic correlation between a character in two environments is of considerable interest in the context of plant and animal breeding, for the prediction of evolutionary trajectories, and for the evaluation of the amount of genetic variance maintained at equilibrium in subdivided populations. The two-way analysis of variance with genotype and environment as crossed factors is the usual basis for estimating this genetic correlation. In plasticity experiments, the genetic variance can differ widely between environments, for instance when the variance component associated with the genotype–environment interaction is not constant over environments. When this is the case, the assumption of homoscedasticity is violated, and the ANOVA method tends to underestimate the absolute value of the genetic correlation. To solve this problem, we developed further a variance-stabilizing transformation previously applied in a multivariate ANOVA context. This development resulted in a new procedure (method 3), in which the genetic correlation is estimated from the transformed data (i.e. after among-environment heteroscedasticity is removed, while the within-environment means are maintained). In a simulation study and an analysis of Chlamydomonas reinhardtii growth rate data, we compared method 3 with two existing methods in which the genetic correlation is estimated from the raw data. Method 1 uses one ‘global’ variance component associated with the genotype–environment interaction, and method 2 uses two variance components associated with the genotype and obtained from one-way ANOVAs conducted separately in the two environments. Under increasing among-environment heteroscedasticity, method 1 produces increasingly biased genetic correlation estimates, whereas method 3 almost consistently provides accurate estimates; the performance of method 2 is intermediate, with more estimates out of range or indeterminate.
This is the first demonstration that a variance-stabilizing transformation of the data removes the bias in the estimation of genetic correlation caused by among-environment heteroscedasticity, while allowing valid statistical testing in an ANOVA-based approach.
Introduction
Organisms allowed to develop in different environments typically show phenotypic plasticity (e.g. Via, 1993; Schlichting & Pigliucci, 1995). A central idea in the evolution of phenotypic plasticity is that a character measured in different environments may represent character states that are more or less correlated genetically (Falconer, 1952; Via & Lande, 1985). Therefore, selection on a trait in a particular environment may affect this trait differently when the selected population is raised in another environment.
Selection gradients and additive genetic variances and covariances have been used in equations to predict the evolutionary trajectory of character states (Via & Lande, 1985). The square root of the heritability of character states and their genetic correlation can be used equivalently in such equations (Falconer, 1989; Grant & Grant, 1995). Moreover, genetic correlation is a dimensionless descriptor of the state of populations (Houle, 1992) that can be used to compare the evolutionary potential of different character states (e.g. Fry et al., 1996). The genetic correlation between character states is of considerable interest, therefore, in the context of plant and animal breeding, especially when the selective environment differs from the environment in which the improved population will live. It also provides information on the rate at which optimum phenotypes are attained under disruptive selection in a spatially variable environment and on the amount of genetic variance maintained at equilibrium in the improved population (Via & Lande, 1985, 1987; Bell, 1992).
The two-way ANOVA with genotype (e.g. clones or sib-groups) and environment as crossed factors is the usual basis for the estimation and significance testing of the genetic correlation (Robertson, 1959; Yamada, 1962; Fry, 1992). The standard method using variance components associated with the genotype and the genotype–environment interaction is based on the assumption that the genetic variance (i.e. the variance computed among the genotypic means) is constant over environments (Yamada, 1962, p. 504). A common finding in plasticity experiments, however, is that the environment affects both the genetic architecture of populations (i.e. the genetic variances and covariances) and the phenotypic expression of genotypes (e.g. Gebhardt & Stearns, 1988; Bell, 1991; Simons & Roff, 1996). Consequently, the usual two-way ANOVA approach tends to underestimate the absolute value of the genetic correlation between character states when the variance component associated with the genotype–environment interaction differs between environments (Yamada, 1962, pp. 504–505). To correct for the bias, the genetic correlation has been estimated with modified formulae (Yamada, 1962, p. 505; Bell, 1990, pp. 306–307). Because the F-ratio test of the correlation is no longer valid, the execution of separate one-way ANOVAs in each environment, followed by the use of the resulting variance components associated with the genotype in the estimation of the genetic correlation, has been recommended by some without further justification (Via, 1984; Fry, 1992, p. 543).
In the ANOVA approach to the study of phenotypic plasticity, Dutilleul & Potvin (1995) developed various data transformations to remove the statistical nuisance of the among-environment heteroscedasticity or that of genetic autocorrelation (i.e. when the responses expressed by the same genotype in two different environments are more similar or dissimilar than two randomly associated values), or both. One of their recommendations was to apply the transformation that removes heteroscedasticity, while taking autocorrelation into account by modified F-testing. The authors suggested (p. 1818) that further investigation was needed before using their transformation in the context of genetic correlation analysis. The present paper develops and validates the variance-stabilizing transformation of Dutilleul & Potvin (1995) in that context. First, we define a new version of the transformation. Secondly, the resulting method of genetic correlation estimation based on two-way ANOVA of the transformed data is compared theoretically with the standard method and that based on separate one-way ANOVAs in each environment, both performed on the raw data. Thirdly, the accuracy and precision of the three methods are compared in a simulation study, in which the bias and variance of the genetic correlation estimates are analysed in relation to the level of among-environment heteroscedasticity and the number of replicates per genotype and environment. Fourthly, the methods are applied to Chlamydomonas reinhardtii growth rate data published by Bell (1991). Finally, the discussion is extended to other methods that were not studied further because of their lack of generality or poor performance following preliminary results.
The mixed two-way analysis-of-variance model
Following Fry (1992) and Dutilleul & Potvin (1995), the genotype and environment factors are considered random and fixed, respectively; this allows the expected value (i.e. the theoretical mean) of the phenotypic response to differ among environments. The model parameters are those of the ‘SAS model’, in which the genotype variance component represents the variance of the genotype main effects, and not of the ‘Scheffé model’, in which that variance component is the variance of the genotypic means. The SAS model is recommended for its natural application for estimating the genetic correlation and testing whether it differs from zero (Fry, 1992).
In a standard plasticity experiment, the phenotypic response of replicate k (k=1,..., rij) of genotype i (i=1,..., n) in environment j (j=1,..., p) can be expressed as
$$Y_{ijk} = \mu + G_i + e_j + Ge_{ij} + \varepsilon_{ijk} \qquad (1)$$
where μ is the intercept; Gi, ej, and Geij are the deviations attributable to genotype i and environment j and the interaction between them, respectively; and the εijk are residual deviations of microenvironmental and individual nature. Whereas μ+ ej=μj represents the expected value of the response of an individual in environment j, the other terms are all considered to be normally distributed with an expected value of 0 and a given variance component: σ2G for Gi, σ2Ge for Geij under among-environment homoscedasticity, and σ2ε for the error term. Under among-environment heteroscedasticity, the variance of the interaction term, Geij, may change from environment to environment, so there may be as many variance components as there are environments: σ2Ge,j (j=1,..., p).
When replicates follow from the replication of the np genotype–environment combinations in different growth chambers, as in the Chlamydomonas example considered here, it is justified to define profile vectors of repeated measures on the same genotype within a growth chamber, yik=(Yi1k,..., Yipk), and to consider the following p-variate model:
$$\mathbf{y}_{ik} = \mathbf{m} + \mathbf{G}_i + \mathbf{e}_{ik} \qquad (2)$$
where m=(μ+ e1,..., μ+ ep), Gi=(Gi+ Gei1,..., Gi+ Geip) and eik=(εi1k,..., εipk); the last term would incorporate the growth chamber effects. In model (2), m is the mean vector of the profile vectors yik, whereas the variances of and the covariance between the phenotypic responses of genotype i in environments j and j′ (j≠ j′) are given by σ2G+σ2Ge,j+σ2ε, σ2G+σ2Ge,j′+σ2ε, and σ2G, respectively. The variance–covariance structure of the genotypic profiles yik can be described by two variance–covariance matrices, ΣG and Σε. The diagonal entries of ΣG are given by σ2G+σ2Ge,j (j=1,..., p), and the off-diagonal ones are the genetic autocovariances (i.e. the genetic correlations multiplied by the square root of the product of the corresponding variances). Matrix ΣG can be decomposed as the product of the diagonal matrix with entries σ2G+σ2Ge,j (j=1,..., p) and the genetic autocorrelation matrix, which has unit diagonal entries and off-diagonal entries equal to rg,0. The variance–covariance structure among replicates is assumed to be spherical (i.e. there is independence and homogeneity of variances among replicates): Σε=σ2ε Ip, where Ip is the p× p identity matrix. The multivariate model (eqn 2) and its assumptions listed above are used in the simulation study.
When the np genotype–environment combinations are replicated in a completely random way (i.e. there are no ‘blocks’ like growth chambers), the experimental unit for repeated measurements is the genotype. However, because there is then no link among replicates, the multivariate model applies to the genotypic mean profiles ¯yi=(¯Yi1,..., ¯Yip), where ¯Yij denotes the mean response computed across the rij replicates available for the genotype–environment combination (i, j) (Dutilleul & Potvin 1995).
The three estimation methods
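All three methods are based on the following equation written under the SAS model (eqn 1):
$$r_g = \frac{\sigma^2_G}{\sqrt{(\sigma^2_G+\sigma^2_{Ge,j})(\sigma^2_G+\sigma^2_{Ge,j'})}} \qquad (3)$$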
To estimate the variance components involved in the calculation of the genetic correlations, we wrote a computer program (PLASTIC) using the SAS/IML language (SAS Institute Inc., 1988). For each of the three methods, the program implements a procedure equivalent to the VARCOMP procedure, option type I (SAS Institute Inc., 1989), in which the variance component estimates are solutions of the system of expected mean squares given in response to the RANDOM statement. The data generated in the simulation study and the real data used in the example are balanced (i.e. the number of replicates is the same for all np genotype–environment combinations: rij= r for i=1,..., n; j=1,..., p), so the argument of bias put forward by Fernando et al. (1984) does not apply here.
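The idea of solving the system of expected mean squares can be sketched for the balanced case as follows. This is an illustrative Python sketch, not the PLASTIC program: the function name `variance_components` and the nested-list data layout are assumptions made here, and the expected mean squares used are those of the unrestricted (SAS) mixed model for a balanced design.

```python
from statistics import mean

def variance_components(y):
    """Hypothetical helper. y[i][j][k] is the response of replicate k of
    genotype i in environment j (balanced design, r >= 2 replicates).
    Returns (var_G, var_Ge, var_err) obtained by equating the mean squares
    to their expectations under the unrestricted mixed model:
      E[MS_G]  = var_err + r*var_Ge + r*p*var_G
      E[MS_GE] = var_err + r*var_Ge
      E[MS_E]  = var_err
    """
    n, p, r = len(y), len(y[0]), len(y[0][0])
    cell = [[mean(y[i][j]) for j in range(p)] for i in range(n)]
    geno = [mean(cell[i]) for i in range(n)]
    env = [mean(cell[i][j] for i in range(n)) for j in range(p)]
    grand = mean(geno)
    ms_g = r * p * sum((g - grand) ** 2 for g in geno) / (n - 1)
    ms_ge = r * sum((cell[i][j] - geno[i] - env[j] + grand) ** 2
                    for i in range(n) for j in range(p)) / ((n - 1) * (p - 1))
    ms_e = sum((v - cell[i][j]) ** 2
               for i in range(n) for j in range(p)
               for v in y[i][j]) / (n * p * (r - 1))
    var_err = ms_e
    var_ge = (ms_ge - ms_e) / r        # may be negative in small samples
    var_g = (ms_g - ms_ge) / (r * p)   # may be negative in small samples
    return var_g, var_ge, var_err
```

With p=2 and balanced data, the genotype component returned here coincides with the sample covariance between the genotypic means of the two environments, which is why it can serve as the numerator of eqn (3).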
Method 1
Under model (1) with p=2, one ‘global’ variance component associated with the genotype–environment interaction is computed over environments j and j′, that is σ2Ge,j=σ2Ge,j′=σ2Ge in eqn (3). Therefore, this method is strictly valid only in the homoscedastic case. The same holds true for testing, when the significance of the genetic correlation is assessed with an F-ratio test based on the genotype mean square divided by the genotype–environment interaction mean square (Fry, 1992, p. 542); the denominator would tend to be overestimated with heteroscedastic data, resulting in a lack of power of the test (Yamada, 1962).
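Why a single global interaction component shrinks the estimate can be seen at the population level. With p=2 and balanced data, the interaction mean square has expectation σ2ε + r(σ2Ge,1+σ2Ge,2)/2, so the global component targets the average of the two interaction variances, and the denominator of the correlation becomes the arithmetic rather than the geometric mean of the two genetic variances. The sketch below compares the two quantities; the function names and the numerical values are illustrative assumptions.

```python
import math

def rg_geometric(var_g, var_ge_1, var_ge_2):
    """Genetic correlation with one interaction component per environment:
    the denominator is the geometric mean of the two genetic variances."""
    return var_g / math.sqrt((var_g + var_ge_1) * (var_g + var_ge_2))

def rg_global_limit(var_g, var_ge_1, var_ge_2):
    """Value targeted when a single global interaction component is used:
    the denominator is the arithmetic mean of the two genetic variances."""
    pooled = (var_ge_1 + var_ge_2) / 2
    return var_g / (var_g + pooled)

# Illustrative values only: a strongly heteroscedastic pair of environments.
print(rg_geometric(0.2, 0.01, 5.12))     # geometric-mean denominator
print(rg_global_limit(0.2, 0.01, 5.12))  # arithmetic-mean denominator, smaller
```

Because the arithmetic mean is never smaller than the geometric mean, the second value is never larger in absolute value than the first, and the two agree only when the interaction variances are equal.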
Method 2
Fry (1992, p. 543) briefly mentioned that, under among-environment heteroscedasticity (i.e. σ2Ge,j≠σ2Ge,j′, j≠ j′), it is preferable first to estimate the variance components associated with the genotype in one-way ANOVAs performed on the raw data in the two environments separately, and then to use the resulting variance component estimates under the square root in eqn (3). Via (1984) had already considered such an estimation procedure without real justification and labelled it ‘method 2’.
The justification for this procedure is the following. Using eqn (1) and rewriting it explicitly for environments j and j′ (e.g. j=1 and j′=2) provides
$$Y_{i1k} = \mu + G_i + e_1 + Ge_{i1} + \varepsilon_{i1k}, \qquad Y_{i2k} = \mu + G_i + e_2 + Ge_{i2} + \varepsilon_{i2k}.$$
Grouping then the terms that do not depend on i (i.e. μ, e1, e2), those that depend on i only (i.e. Gi, Gei1, Gei2), leaving the last term depending on i and k (i.e. εi1k, εi2k), results in
$$Y_{i1k} = (\mu + e_1) + (G_i + Ge_{i1}) + \varepsilon_{i1k} = \mu_1 + G'_{i1} + \varepsilon_{i1k}, \qquad Y_{i2k} = (\mu + e_2) + (G_i + Ge_{i2}) + \varepsilon_{i2k} = \mu_2 + G'_{i2} + \varepsilon_{i2k}.$$
The variance components associated with G′i1 and G′i2 are those that need to be estimated (i.e. σ2G+σ2Ge,1 and σ2G+σ2Ge,2). Clearly, the estimation of negative or zero variance components is a limiting factor for method 2 because of the square root in eqn (3). Furthermore, there is no test because the denominator of the eventual F-ratio is a mixture of variance components estimated from correlated data.
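A minimal sketch of this procedure for one pair of environments is given below. It is not the authors' program: the function names and data layout are hypothetical, and the numerator is taken to be the sample covariance between genotypic means, which in the balanced two-environment case equals the genotype variance component of the two-way ANOVA. The sketch returns `None` when a variance component under the square root is zero or negative, which is the indeterminate case described above.

```python
import math
from statistics import mean

def one_way_genotype_component(env_data):
    """env_data[i][k]: replicate k of genotype i in one environment
    (balanced, r >= 2). Returns (MS_among - MS_within) / r."""
    n, r = len(env_data), len(env_data[0])
    gmeans = [mean(g) for g in env_data]
    grand = mean(gmeans)
    ms_among = r * sum((m - grand) ** 2 for m in gmeans) / (n - 1)
    ms_within = sum((v - m) ** 2
                    for g, m in zip(env_data, gmeans) for v in g) / (n * (r - 1))
    return (ms_among - ms_within) / r

def method2_rg(env1, env2):
    """Hypothetical helper: genetic correlation from two separate one-way
    ANOVAs; None if a variance component estimate is not positive."""
    n = len(env1)
    m1 = [mean(g) for g in env1]
    m2 = [mean(g) for g in env2]
    c1, c2 = mean(m1), mean(m2)
    cov = sum((a - c1) * (b - c2) for a, b in zip(m1, m2)) / (n - 1)
    v1 = one_way_genotype_component(env1)
    v2 = one_way_genotype_component(env2)
    if v1 <= 0 or v2 <= 0:
        return None  # square root in eqn (3) undefined or zero denominator
    return cov / math.sqrt(v1 * v2)
```

Nothing in this sketch constrains the result to the [−1, 1] range, so out-of-range estimates remain possible when the variance component estimates are small.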
Method 3
Yamada (1962) stated that ‘the standard two-way analysis [of variance] is no longer valid [for estimating genetic correlations] unless some transformation to make homogeneous variances is made’. Accordingly, Dutilleul & Potvin (1995) proposed a transformation in which the genetic variance of the transformed data was fixed to 1.0 in each environment, while mentioning (p. 1818) that it may be justified to scale the genetic variances to a common value other than 1.0; in all cases, the transformation maintains the within-environment means. The version considered here for the analysis of genetic correlations uses the geometric mean of the genetic variances of environments j and j′ as common genetic variance after transformation. The basis for that choice is that the denominator in eqn (3) is, by definition, the geometric mean of the variances σ2G+σ2Ge,j and σ2G+σ2Ge,j′. Using the geometric mean of the two variances in the transformation therefore produces equality between the two terms under the square root in eqn (3), while maintaining their product equal to (σ2G+σ2Ge,j)(σ2G+σ2Ge,j′). The properties of our transformation (i.e. the means are maintained and the variances made homogeneous are fixed to an intermediate value) are illustrated in Fig. 1, using the data from two environments with extreme variances in the Chlamydomonas example.
If ^Σ denotes the sample covariance matrix estimated from the genotypic mean profiles (Dutilleul & Potvin 1995, p. 1817), the new transformation can be defined by
where ¯y is the overall sample mean vector computed over yik (i=1,..., n; k=1,..., r); ¯σgeom denotes the positive square root of the geometric mean of the genetic variances of environments j and j′; ^Σjj′ is the 2×2 submatrix of ^Σ corresponding to environments j and j′; and diag and 0.5 denote the diagonal and square root operators of matrix algebra, respectively (Graybill, 1983); other notations are as in eqn (2).
Method 3 uses eqn (3) with the data transformed after eqn (4), for which σ2Ge,j=σ2Ge,j′=σ2Ge because the data so transformed are homoscedastic (Fig. 1). As the genetic variances involved in the geometric mean in eqn (4) are diagonal entries of the ^Σ matrix, they are inflated by the error variance; this is analogous to the contamination of the product–moment correlation of genotypic means. The contaminating term is σ2ε divided by r, so the higher the number of replicates, the less the contamination (Via, 1984; Roff & Preziosi, 1994). To provide a method also valid in small samples, we developed an adjustment for the inflation by subtracting the error mean square divided by r from each of the two genetic variances before computing the geometric mean in eqn (4). The simulation study will show whether this adjustment is effective. Method 3 should then allow valid F-ratio testing based on the genetic correlations estimated on the transformed data, whatever the sample size.
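The two stated properties of the transformation (within-environment means maintained, genetic variances set to their geometric mean) can be illustrated with a short sketch. The exact form of eqn (4) is not reproduced here; the code assumes that each observation's deviation from its environment mean is rescaled by ¯σgeom divided by the square root of that environment's variance among genotypic means, and it omits the small-sample adjustment for the error mean square. The function name `stabilize_pair` is hypothetical.

```python
import math
from statistics import mean, variance

def stabilize_pair(env1, env2):
    """env[i][k]: replicate k of genotype i in one environment.
    Assumed form of the transformation: deviations from the environment
    mean are rescaled so that the variance among genotypic means equals
    the geometric mean of the two original variances. The unadjusted
    variances are used (no subtraction of the error mean square / r)."""
    def geno_var(env):
        return variance([mean(g) for g in env])
    v1, v2 = geno_var(env1), geno_var(env2)
    target = math.sqrt(v1 * v2)  # geometric mean of the two variances
    out = []
    for env, v in ((env1, v1), (env2, v2)):
        centre = mean(mean(g) for g in env)
        scale = math.sqrt(target / v)  # sigma_geom / sqrt(v)
        out.append([[centre + scale * (x - centre) for x in g] for g in env])
    return out[0], out[1]
```

After this rescaling the environment with the larger variance is compressed and the one with the smaller variance is expanded, while both environment means are left unchanged; the two-way ANOVA of method 1 can then be applied to the transformed pair.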
The simulation procedure
Equation (2) was used for simulation with rij= r for any (i, j). The simulation parameters were the theoretical genetic correlation, rg,0 (i.e. the genetic correlation generated in the data and expected from the correct estimation method), the number of replicates per genotype and environment, r, and the level of among-environment heteroscedasticity. The rg,0-values considered were −1.0, −0.5, 0.0, 0.5 and 1.0. The numbers of replicates were 2, 4 and 8. The heteroscedastic pattern considered among eight environments (i.e. p=8), as there are eight environments in the Chlamydomonas example, is defined by σ2G=0.2 and σ2Ge,j=0.01(p+1− j)^3 (j=1,..., p=8), so that σ2G+σ2Ge,j ranges from 0.21 to 5.32. When the number of environments had to be decreased to three in order to ensure that the genetic autocorrelation matrix was positive semidefinite so that its square root existed, the three σ2Ge,j-values considered were 0.01, 0.2263 (the geometric mean of the other two) and 5.12; when p=2, only the two extreme values were retained. Such ratios of σ2G+σ2Ge,j fall within the range of values observed in other studies (e.g. Bell, 1991; A. R. Aldous, P. Dutilleul and M. J. Waterway, unpubl. manuscript).
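The numbers quoted in this paragraph, and the reason the number of environments must shrink for negative correlations, can be checked with a few lines of Python. A matrix with unit diagonal and a common off-diagonal value ρ has eigenvalues 1+(p−1)ρ (once) and 1−ρ (p−1 times), so it is positive semidefinite only if ρ ≥ −1/(p−1). The function names below are illustrative assumptions.

```python
import math

def interaction_variances(p=8):
    """sigma^2_Ge,j = 0.01 * (p + 1 - j)^3 for j = 1..p."""
    return [0.01 * (p + 1 - j) ** 3 for j in range(1, p + 1)]

def equicorrelation_is_psd(rho, p, tol=1e-12):
    """Eigenvalues of a p x p matrix with unit diagonal and common
    off-diagonal rho are 1 + (p - 1) * rho and 1 - rho."""
    return 1 + (p - 1) * rho >= -tol and 1 - rho >= -tol

var_g = 0.2
genetic_variances = [var_g + v for v in interaction_variances()]
print(min(genetic_variances), max(genetic_variances))  # close to 0.21 and 5.32
print(math.sqrt(0.01 * 5.12))                          # close to 0.2263

for rho in (-1.0, -0.5, 0.0, 0.5, 1.0):
    largest_p = max(p for p in range(2, 9) if equicorrelation_is_psd(rho, p))
    print(rho, largest_p)
```

Under this criterion a common correlation of −0.5 is feasible for at most three environments and −1.0 for at most two, while the non-negative values are feasible for all eight.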
The multivariate intercept m was maintained the same for all simulation runs, whatever the values of the simulation parameters; it was fixed to 5+0.5 exp [0.25 (p+1− j)] (j=1,..., p), in order to mimic the decreasing pattern in the phenotypic response of log relative growth rate over environments in the Chlamydomonas example (Dutilleul & Potvin, 1995). Also, the number of genotypes, n, was 12 and the error variance, σ2ε, was 0.6 (i.e. σ2ε/σ2G=3.0) for all simulation runs.
Model (2), with the covariance matrices ΣG and Σε, allows the simulation of genetic correlations of any sign and size for any among-environment variance–covariance pattern. Given an intercept m, a set of σ2G+σ2Ge, j (j=1,..., p), a genetic autocorrelation matrix and an error variance σ2ε, a profile vector yik can then be simulated as follows:
where e1 and e2 are two p-variate vectors of pseudorandom numbers from a standard normal distribution with zero mean and unit variance (SAS Institute Inc., 1990; function RANNOR); other notations are as in eqn (4).
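For a single pair of environments, data with a prescribed genetic correlation can be generated along the following lines. This is a sketch under stated assumptions rather than the authors' SAS/IML code: it uses a 2×2 Cholesky-type factor of the correlation matrix instead of the matrix square root, draws the genotype effects once per genotype, and all names and the parameter layout are hypothetical.

```python
import math
import random

def simulate_pair(n, r, rg0, v1, v2, var_err, m1, m2, rng):
    """Simulate n genotypes with r replicates in two environments.
    v1, v2: genetic variances (sigma^2_G + sigma^2_Ge,j) of the two
    environments; rg0: genetic correlation; var_err: error variance;
    m1, m2: environment means; rng: a random.Random instance."""
    root = math.sqrt(max(0.0, 1.0 - rg0 ** 2))
    sd_err = math.sqrt(var_err)
    env1, env2 = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        g1 = math.sqrt(v1) * z1                      # genotype effect, env 1
        g2 = math.sqrt(v2) * (rg0 * z1 + root * z2)  # correlated effect, env 2
        env1.append([m1 + g1 + sd_err * rng.gauss(0, 1) for _ in range(r)])
        env2.append([m2 + g2 + sd_err * rng.gauss(0, 1) for _ in range(r)])
    return env1, env2

# Example call with the extreme pair of genetic variances and error variance
# quoted in the text; the means are placeholders.
data1, data2 = simulate_pair(n=12, r=2, rg0=0.5, v1=5.32, v2=0.21,
                             var_err=0.6, m1=0.0, m2=0.0,
                             rng=random.Random(1))
```

Passing a seeded `random.Random` keeps runs reproducible, which helps when the bias of several estimation methods is compared on the same simulated data sets.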
For a given set of simulation parameters, the empirical bias of the genetic correlation estimates was calculated for each method as the sample mean of the estimated values minus the theoretical value rg,0; the empirical variance was provided by the sample variance. A standard one-mean t-test was performed to assess the departure of the empirical bias from 0.0; the asymptotic normality of the sample mean was verified empirically. All these outputs are available in the computer program PLASTIC, which is available from the first author upon request and on WWW at ftp://gnome.agrenv.mcgill.ca/pub/genetics/software.
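The summary statistics described here amount to a one-sample t-test of the estimates against the theoretical value. A minimal sketch follows; the function name is hypothetical and the input is assumed to be the list of genetic correlation estimates from the simulation runs in which an estimate could be computed.

```python
import math
from statistics import mean, stdev

def bias_summary(estimates, rg0):
    """Returns (empirical bias, empirical variance, t statistic) for a
    list of genetic correlation estimates and the theoretical value rg0.
    The t statistic has len(estimates) - 1 degrees of freedom."""
    n = len(estimates)
    bias = mean(estimates) - rg0
    sd = stdev(estimates)            # sample standard deviation (n - 1)
    t = bias / (sd / math.sqrt(n))
    return bias, sd ** 2, t
```

The sketch does not compute a P-value; the t statistic would be compared with the Student t distribution using a statistical table or library.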
The Chlamydomonas example
We used part of Bell's (1991) data set to compare the three estimation methods; the same data were used by Dutilleul & Potvin (1995) for illustration. It originates from a series of experiments on the ecology and fitness of Chlamydomonas reinhardtii (Bell, 1990, 1991, 1992). The data reanalysed here are log relative growth rates of strain CC-410 (mt−) grown in eight environments (i.e. p=8). Twelve genotypes (i.e. n=12) were grown in each environment and the design was replicated twice (i.e. r=2). Experimental and technical details can be found in Bell (1991). We estimated the genetic correlations between the 28 pairs of environments by each method. Genetic correlation estimates were compared in regression biplots for two random variables, in which a 95% confidence interval was computed for the slope of the major axis following Sokal & Rohlf (1995, pp. 544–549).
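The slope of the major axis treats both sets of estimates as random variables. The sketch below computes the slope only, using the standard principal-axis formula from the sample variances and covariance; it does not reproduce the confidence interval procedure of Sokal & Rohlf, and the function name is an assumption.

```python
import math
from statistics import mean

def major_axis_slope(x, y):
    """Slope of the major (principal) axis of the bivariate scatter:
    b = (s_yy - s_xx + sqrt((s_yy - s_xx)^2 + 4 s_xy^2)) / (2 s_xy).
    Requires a nonzero sample covariance."""
    n = len(x)
    mx, my = mean(x), mean(y)
    s_xx = sum((a - mx) ** 2 for a in x) / (n - 1)
    s_yy = sum((b - my) ** 2 for b in y) / (n - 1)
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    if s_xy == 0:
        raise ValueError("major axis slope undefined for zero covariance")
    d = s_yy - s_xx
    return (d + math.sqrt(d * d + 4 * s_xy * s_xy)) / (2 * s_xy)
```

A slope below 1.0 with one method on the ordinate and another on the abscissa indicates that the first method yields estimates of smaller absolute value than the second, which is how the comparisons between methods are read.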
Results and discussion
The simulation study
The primary objective here was to establish the best estimation method using simulated data in which the magnitude and sign of the theoretical genetic correlation rg,0 are fixed for a given among-environment heteroscedasticity; this will serve as a basis for comparison when the three methods are applied to the Chlamydomonas example. Results are presented in order of decreasing value of the theoretical genetic correlation (Tables 1–5).
Under among-environment heteroscedasticity, the following trends are observed. First, the higher the heteroscedasticity and genetic correlation, the poorer the performance of the standard method (Tables 1, 2, 4 and 5). In fact, when |rg,0|>0.0, method 1 is only valid for low heteroscedasticity, whatever the number of replicates, and, as expected (Yamada, 1962; Fry, 1992), the bias is negative for positive rg,0 and positive for negative rg,0. Secondly, for a theoretical genetic correlation of 0.0 (Table 3), all three methods perform very well, with no statistically significant bias. Thirdly, as expected on a theoretical basis (see The three estimation methods section), method 3 performs well to very well, even when r=2 and the level of heteroscedasticity is low (see the pair of environments 7–8 in Table 1 and Table 2, and the pair 1–2 in Table 4). On the other hand, method 2 gets worse with increasing r, especially for high (negative or positive) rg,0-values under high heteroscedasticity (Tables 1 and 5). In particular, method 2 provides less reliable genetic correlation estimates than the other two methods, especially for two replicates, in which case about 35% of the correlation estimates were either out of range or indeterminate because of a negative variance component estimate under the square root in eqn (3). Maximum likelihood estimation of the variance components would not improve the performance of method 2 because negative estimates would be rounded to zero. Fourthly, the bias of method 1 is almost constant when |rg,0|>0.0 under moderate and high heteroscedasticity, whereas there is no evidence of a relationship between bias and number of replicates for methods 2 and 3. The adjustment for inflated genetic variance estimates in method 3 is thus confirmed to be effective; this is reported here for a σ2ε/σ2G ratio of 3.0 and was observed for σ2ε/σ2G ratios of 4.0 or less (results not reported).
Finally, for all methods, the variance tends to decrease when the number of replicates increases.
Under among-environment homoscedasticity (unpubl. results), the three estimation methods behave very similarly in terms of absolute value of the bias and its statistical significance, especially when rg,0=0.0, with a slight advantage overall for the standard method. In particular, method 1 performs better for high and negative genetic correlation. The most significant biases are for rg,0=1.0. The lack of reliability mentioned above for method 2 holds true under homoscedasticity.
Overall, the novel method 3 performs better than the other two methods. Fig. 2 illustrates the bias for three non-negative values of rg,0 when r=4. When rg,0=0.0, among-environment heteroscedasticity has no effect on the bias whatever the method. Method 1 is strongly affected by heteroscedasticity when rg,0=0.5 (slope=−0.014, P<0.001) and 1.0 (slope=−0.030, P<0.001). In contrast, the departure from zero is nonsignificant (P>0.05) for the slopes of methods 2 and 3 when rg,0=0.5 and 1.0, with a mere tendency to increase for method 2; the intercepts, however, are significantly (P<0.001) different from zero when rg,0=1.0 (intercept=0.052 and 0.048 for methods 2 and 3, respectively). Method 2, more than method 3, thus tends to overestimate positive genetic correlations.
In conclusion, under among-environment heteroscedasticity, methods 2 and 3 perform better than the standard method 1, except when the theoretical genetic correlation is near zero. Method 3 is recommended in all other cases, with the exception of moderate genetic correlation and low to moderate heteroscedasticity (e.g. ratios of about 2 to 12 between the genetic variances), for which method 2 is almost equivalent to method 3 but suffers from lack of reliable genetic correlation estimates. Increasing the number of replicates per genotype and environment affects the performance of method 2, especially when the genetic correlation is strong, whether positive or negative. Increasing the number of replicates does not affect the performance of method 3, as its adjustment for inflated genetic variance estimates in the data transformation is effective in the range of σ2ε/σ2G ratios considered (i.e. 1.0–4.0). This simulation study represents the first demonstration that a method of genetic correlation estimation based on data transformation is efficient in removing the nuisance effects of among-environment heteroscedasticity in an ANOVA-based approach.
The Chlamydomonas example
Bell's (1991) data are distinctly heteroscedastic, the highest ratio of genetic variances between environments being equal to 32.6 (Dutilleul & Potvin, 1995). Based on the results of the simulation study, therefore, we expected method 1 to underestimate the absolute value of the genetic correlation consistently. Indeed, the slope of the major axis between the genetic correlations estimated with methods 1 and 2 is significantly lower than 1.0 (Fig. 3: n=19, slope=0.85, 95% confidence interval=[0.76, 0.95]), as is that of the regression contrasting methods 1 and 3 (Fig. 3: n=24, slope=0.84, 95% confidence interval=[0.72, 0.98]). The slope of the major axis between the genetic correlations estimated with methods 2 and 3 is very close to 1.0 (Fig. 3: n=19, slope=1.019, 95% confidence interval=[1.001, 1.037]), which indicates that methods 2 and 3 are almost equivalently unbiased. Nevertheless, method 2 was less reliable than method 3 because the former provided no estimates of the genetic correlation for seven pairs of environments and yielded two estimates outside the [−1, 1] range, whereas method 3 always produced an estimate, even though four of them were out of the range. Method 1 yielded a genetic correlation estimate for all 28 pairs of environments (as did method 3), and only one of them was out of range. From the analysis of the Chlamydomonas data, one may conclude that method 1 consistently underestimated the absolute value of the genetic correlation compared with methods 2 and 3, and that method 3 should be preferred to method 2 because of its higher reliability.
Lack of effectiveness of other methods
We complete our discussion by elaborating on other methods that were not retained for the simulation study, either on a theoretical basis or after poor preliminary results. The three methods below are all based on eqn (3) and are performed on the log-transformed data, transformed data with a zero mean and a unit variance and the raw data in the framework of the mixed ANOVA models, respectively.
The log transformation is a well-known and very simple variance-stabilizing transformation (e.g. Sokal & Rohlf, 1995). In model (1), it would be applied to all observations Yijk indiscriminately, but in eqn (4) the data from two environments with unequal genetic variances are transformed differently: the dispersion of the observations from the environment with the higher variance is decreased, whereas that of the observations from the environment with the lower variance is increased (Fig. 1). More importantly, the log transformation modifies the within-environment means and thus the environment main effects in model (1) (i.e. the ‘mean plasticity’; Bell & Lechowicz, 1994), without completely removing the among-environment heteroscedasticity. On that basis, it cannot be recommended.
Transforming all the data from each environment to a zero mean and a unit variance (including the replicates) removes the environment main effects from model (1) (i.e. there remains no term for mean plasticity) and imposes a particular among-environment homoscedasticity with a common variance of 1.0 that can sometimes be quite out of range (i.e. much higher or much lower than the variances computed on the raw data). The variance components estimated from such transformed data are of no use per se; two experiments cannot be compared on the basis of their within-environment variances if these are all fixed at 1.0. Furthermore, the common within-environment variance is not a common genetic variance, because it is computed over genotypes and replicates within a genotype instead of among the genotypic means, and it incorporates the entire error variance because the variance of observation Yijk is σ2G+σ2Ge, j+σ2ε in model (1). This point is particularly important from the perspective of genetic correlation estimation from which the contamination by the error variance should be absent or at least minimized. Nevertheless, the estimates of variance components σ2G and σ2Ge change in such a way after (0, 1) transformation that the resulting genetic correlation estimates are similar to those provided by method 3 when r=1, because the genotype–environment interaction is then indistinguishable from the error term in eqn (1). Otherwise, the (0, 1) transformation only approximates method 3 in both estimation and testing. In summary, in the broad framework of plasticity analysis, the (0, 1) transformation is not recommended; only when an approximation of the genetic correlation is sufficient (without testing) can this transformation be used.
To recall, the environment main effects, or similarly the within-environment means, are maintained by the new transformation (eqn 4; Fig. 1; see also Dutilleul & Potvin, 1995). Equation (4) also uses an intermediate common genetic variance given by the geometric mean of the genetic variances of environments j and j′ (Fig. 1), and method 3 provides the user with an adjustment for the contamination of the genetic variance estimates by the error variance divided by the number of replicates.
Lastly, PROC MIXED (SAS Institute Inc., 1995) may seem to be an obvious solution to the problem of among-environment heteroscedasticity in the analysis of genetic correlations. In fact, this procedure carries out repeated-measures ANOVA [the random vectors yik in model (2) are profile vectors of repeated measures on the same genotype within a growth chamber], while estimating one variance component associated with each random term [genotype main effects and genotype–environment interaction in model (1)] and a variance–covariance matrix for the errors. Unfortunately, when using the REPEATED statement of PROC MIXED, the variance estimated for each environment separately is then an error variance instead of a genotype–environment interaction variance component and the covariance estimated between environments is computed between the corresponding errors. The correlation derived from this covariance will thus generally be far from the theoretical genetic correlation.
References
Bell, G. (1990). The ecology and genetics of fitness in Chlamydomonas. I. Genotype-by-environment interaction among pure strains. Proc R Soc B, 240: 295–321.
Bell, G. (1991). The ecology and genetics of fitness in Chlamydomonas. III. Genotype-by-environment interaction within strains. Evolution, 45: 668–679.
Bell, G. (1992). The ecology and genetics of fitness in Chlamydomonas. V. The relationship between genetic correlation and environmental variance. Evolution, 46: 561–566.
Bell, G. and Lechowicz, M. J. (1994). Spatial heterogeneity at small scales and how plants respond to it. In: Caldwell, M. M. & Pearcy, R. W. (eds) Exploitation of Environmental Heterogeneity by Plants: Ecophysiological Processes Above and Below Ground, pp. 391–414. Academic Press, San Diego, CA.
Dutilleul, P. and Potvin, C. (1995). Among-environment heteroscedasticity and genetic autocorrelation: implications for the study of phenotypic plasticity. Genetics, 139: 1815–1829.
Falconer, D. S. (1952). The problem of environment and selection. Am Nat, 86: 293–298.
Falconer, D. S. (1989). Introduction to Quantitative Genetics. 3rd edn. Longman, Harlow, Essex.
Fernando, R. L., Knights, S. A. and Gianola, D. (1984). On a method of estimating the genetic correlation between characters measured in different environmental units. Theor Appl Genet, 67: 175–178.
Fry, J. D. (1992). The mixed-model analysis of variance applied to quantitative genetics: biological meaning of the parameters. Evolution, 46: 540–550.
Fry, J. D., Heinsohn, S. L. and Mackay, T. F. C. (1996). The contribution of new mutations to genotype–environment interaction for fitness in Drosophila melanogaster. Evolution, 50: 2316–2327.
Gebhardt, M. D. and Stearns, S. C. (1988). Reaction norms for developmental time and weight at eclosion in Drosophila mercatorum. J Evol Biol, 1: 335–354.
Grant, P. R. and Grant, B. R. (1995). Predicting microevolutionary responses to directional selection on heritable variation. Evolution, 49: 241–251.
Graybill, F. A. (1983). Matrices with Applications in Statistics, 2nd edn. Wadsworth, Pacific Grove, CA.
Houle, D. (1992). Comparing evolvability and variability of quantitative traits. Genetics, 130: 195–204.
Robertson, A. (1959). The sampling variance of the genetic correlation coefficient. Biometrics, 15: 469–485.
Roff, D. A. and Preziosi, R. (1994). The estimation of the genetic correlation: the use of the jackknife. Heredity, 73: 544–548.
SAS INSTITUTE INC. (1988). SAS/IML™ User's Guide, Release 6.03. SAS Institute Inc., Cary, NC.
SAS INSTITUTE INC. (1989). SAS/STAT® User's Guide, Version 6, 4th edn. SAS Institute Inc., Cary, NC.
SAS INSTITUTE INC. (1990). SAS® Language: Reference, Version 6. SAS Institute Inc., Cary, NC.
SAS INSTITUTE INC. (1995). Introduction to the MIXED Procedure Course Notes. SAS Institute Inc., Cary, NC.
Schlichting, C. D. and Pigliucci, M. (1995). Gene regulation, quantitative genetics and the evolution of reaction norms. Evol Ecol, 9: 154–168.
Simons, A. M. and Roff, D. A. (1996). The effect of a variable environment on the genetic correlation structure of a field cricket. Evolution, 50: 267–275.
Sokal, R. R. and Rohlf, F. J. (1995). Biometry: the Principles and Practice of Statistics in Biological Research. 3rd edn. Freeman, New York.
Via, S. (1984). The quantitative genetics of polyphagy in an insect herbivore. II. Genetic correlations in larval performance within and among host plants. Evolution, 38: 896–905.
Via, S. (1993). Adaptive phenotypic plasticity: target or byproduct of selection in a variable environment? Am Nat, 142: 352–365.
Via, S. and Lande, R. (1985). Genotype–environment interaction and the evolution of phenotypic plasticity. Evolution, 39: 505–522.
Via, S. and Lande, R. (1987). Evolution of genetic variability in a spatially heterogeneous environment: effects of genotype–environment interaction. Genet Res, 49: 147–156.
Yamada, Y. (1962). Genotype by environment interaction and genetic correlation of the same trait under different environments. Jap J Genet, 37: 498–509.
Acknowledgements
The authors are indebted to Dr G. Bell for permission to reanalyse his data in the present paper. We are grateful to Dr M. J. Kearsey and Dr T. J. Crawford for their editorial work, and to an anonymous referee for his suggestions and comments. The research work of both authors is funded through NSERC grants in Ecology and Evolution. The research work of the first author is also supported by FCAR.
Cite this article
Dutilleul, P., Carrière, Y. Among-environment heteroscedasticity and the estimation and testing of genetic correlation. Heredity 80, 403–413 (1998). https://doi.org/10.1046/j.1365-2540.1998.00267.x