Average semivariance directly yields accurate estimates of the genomic variance in complex trait analyses
https://doi.org/10.1093/g3journal/jkac080
Advance Access Publication Date: 20 April 2022
Investigation
Department of Plant Sciences, University of California, Davis, CA 95616, USA,
*Corresponding author: Department of Plant Sciences, University of California, Davis, CA 95616, USA. Email: [email protected]
Abstract
Many important traits in plants, animals, and microbes are polygenic and challenging to improve through traditional marker-assisted selection. Genomic prediction addresses this by incorporating all genetic data in a mixed model framework. The primary method for predicting breeding values is genomic best linear unbiased prediction, which uses the realized genomic relationship or kinship matrix ($K$) to connect genotype to phenotype. Genomic relationship matrices share information among entries to estimate the observed entries' genetic values and predict unobserved entries' genetic values. One of the main parameters of such models is genomic variance ($\sigma_g^2$), or the variance of a trait associated with a genome-wide sample of DNA polymorphisms, and genomic heritability ($h_g^2$); however, the seminal papers introducing different forms of $K$ often do not discuss their effects on the model estimated variance components despite their importance in genetic research and breeding. Here, we discuss the effect of several standard methods for calculating the genomic relationship matrix on estimates of $\sigma_g^2$ and $h_g^2$. With current approaches, we found that the genomic variance tends to be either overestimated or underestimated depending on the scaling and centering applied to the marker matrix ($Z$), the value of the average diagonal element of $K$, and the assortment of alleles and heterozygosity ($H$) in the observed population. Using the average semivariance, we propose a new matrix, $K_{ASV}$, that directly yields accurate estimates of $\sigma_g^2$ and $h_g^2$ in the observed population and produces best linear unbiased predictors equivalent to routine methods in plants and animals.
Keywords: average semivariance; genomic heritability; genomic variance; genomic relatedness; linear mixed model; genomic best linear
unbiased predictor
selection reliability, prediction error variance, and response to genomic selection (Goddard 2009; Hickey et al. 2009; Gorjanc et al. 2015). Of these ratios, genomic heritability has been the most frequently reported in public research (Speed et al. 2012, 2017; Speed and Balding 2015; de los Campos et al. 2015; Legarra 2016; Lehermeier et al. 2017; Yang et al. 2017).

Genomic heritability is

$$h_g^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}, \tag{1}$$

where $\sigma_g^2$ is the genomic variance and $\sigma_e^2$ is the residual variance

differences among genotypic values ($g$), i.e. $\frac{1}{2}\mathrm{var}(g_i - g_j)$ (Webster and Oliver 2007; Piepho 2019). Piepho (2019) derived ASV from a study's observations, worked out the semivariance and took the average across all pairs of observations. In our context, there is an equivalent alternative derivation based on the sample variance of the genotypic values (Estaghvirou et al. 2013). The sample variance among genotypic values is $s_g^2 = (n-1)^{-1}\sum_{i=1}^{n}(g_i - \bar{g})^2$. That is to say that the expected value of the sample variance of genotypic values is the ASV, i.e. $E(s_g^2) = \theta_g^{ASV}$. ASV can be used to estimate and partition the total variance in LMM analyses into parts, such as the total variance, as in Piepho (2019), the variance explained by large effect markers and marker–marker interactions, as in Feldmann et al.
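The equivalence between the average semivariance and the sample variance described above can be checked numerically. The following is a minimal sketch, not the authors' code: it uses the empirical analogue of the semivariance, $\frac{1}{2}(g_i - g_j)^2$, averaged over all pairs, and the genotypic values in `g` are hypothetical numbers chosen only for illustration.

```python
from itertools import combinations


def sample_variance(g):
    """Sample variance with the (n - 1) divisor."""
    n = len(g)
    mean = sum(g) / n
    return sum((x - mean) ** 2 for x in g) / (n - 1)


def average_semivariance(g):
    """Average of half the squared difference over all pairs of entries."""
    pairs = list(combinations(g, 2))
    return sum(0.5 * (a - b) ** 2 for a, b in pairs) / len(pairs)


# Hypothetical genotypic values (illustration only, not data from the study).
g = [1.2, -0.4, 0.7, 2.1, -1.5]

asv = average_semivariance(g)
s2 = sample_variance(g)
print(abs(asv - s2) < 1e-12)
```

The two quantities agree because $\sum_{i<j}(g_i - g_j)^2 = n\sum_i (g_i - \bar{g})^2$, so dividing half of the left side by the $n(n-1)/2$ pairs gives the $(n-1)$ divisor of the sample variance.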
The ASV definition of the genomic variance is

$$\theta_g^{ASV} = (n-1)^{-1}\mathrm{tr}(ZZ^{T}P)\,\sigma_g^2 = (n-1)^{-1}\mathrm{tr}(\bar{Z}\bar{Z}^{T})\,\sigma_g^2 = (n-1)^{-1}\mathrm{tr}(K)\,\sigma_g^2, \tag{5}$$

where $\bar{Z} = PZ$ is the mean-centered marker matrix, and $K = \bar{Z}\bar{Z}^{T}$ is the realized genomic relationship or kinship matrix described by VanRaden (2008), omitting the scaling constant $2\sum_j p_j(1 - p_j)$, where $p_j$ is the allele frequency of the $j$th SNP, which requires Hardy–Weinberg equilibrium (HWE) to hold (de los Campos et al. 2015), and $\mathrm{tr}(ZZ^{T}P) = \mathrm{tr}(\bar{Z}\bar{Z}^{T})$. The trace of $\bar{Z}\bar{Z}^{T}$ is a function of heterozygosity in the observed population (Vitezica et al. 2013,

Notably, the genomic variance $\theta_g^{ASV}$ is on the same scale as the residual variance $\theta_e^{ASV}$, and both are defined such that (4) is accurate. REML estimates of the residual variance are equivalent to ASV estimates when best linear unbiased estimators or LSMs are the response variable $y$.

Two equivalent methods yield accurate $h_g^2$ estimates

There are two equivalent ways to obtain accurate estimates of genomic variance and subsequently genomic heritability. The first method, our recommended approach, utilizes $K_{ASV}$ (2) in the LMM analysis and directly yields accurate estimates of the genomic variance components from the model by rescaling the GRM. The first method works because $V = K\sigma_g^2 + I_n\sigma_e^2$ is a true statement regardless of $K$, but different choices of $K$ change the scaling and interpretation of $\sigma_g^2$. Thus, variance components estimated by ASV can then be substituted directly into (1) without any adjustment.

The second method is to adjust the genomic variance component estimates from any form of $K$ by multiplying them by a scaling factor ($(n-1)^{-1}\mathrm{tr}(K)$) defined by the population size ($n$) and the

This formulation can be used directly with any form of $K$ by substituting REML variance component estimates. Note that $(n-1)^{-1}\mathrm{tr}(K)$ is the same as the scaling coefficient used in (2). The second strategy is analogous to the post hoc adjustment approach Feldmann et al. (2021) proposed.

Materials and methods

Genomic relationship matrices

We calculated and applied seven relationship matrices for each population, simulated or case example, including $K_{ASV}$. We used AGHmatrix::Gmatrix() to calculate the Yang et al. (2010) ($K_Y$) and VanRaden (2008) relationship ($K_{VR}$) matrices (Rampazo Amadeu et al. 2016), rrBLUP::A.mat() to calculate the Endelman and Jannink (2012) ($K_{EJ}$) relationship matrix, and statgenGWAS::kinship() to estimate the Astle et al. (2009) ($K_{AB}$) and IBS relationship ($K_{IBS}$) matrices (van Rossum and Kruijer 2020).

The form proposed by VanRaden (2008) is

$$K_{VR} = \frac{ZZ^{T}}{2\sum_{j=1}^{m} p_j(1 - p_j)}, \tag{8}$$

where $Z$ is the marker matrix centered on column means ($2p_j$),

This form is the most numerically similar to $K_{ASV}$ and only differs by a single denominator degree of freedom.

The form of the relationship matrix proposed by Endelman and Jannink (2012) is

$$K_{EJ} = \frac{\delta\langle S_{ii}\rangle I + (1 - \delta)S + \langle Z_{\bullet j}\rangle\langle Z_{\bullet j}\rangle^{T}}{2\langle p_j(1 - p_j)\rangle}, \tag{10}$$

where $\delta \approx (n/m)\mathrm{CV}^2$ is a shrinkage factor, $\mathrm{CV}^2$ is the coefficient of variation of the eigenvalues of $S$, $S = m^{-1}ZZ^{T} - \langle Z_{\bullet k}\rangle\langle Z_{\bullet k}\rangle^{T}$; $\langle S_{ii}\rangle$ is the mean of diagonal elements of $S$. Notably, at high marker densities, when $\delta = 0$, Endelman and Jannink (2012) is equivalent to VanRaden (2008).

The method proposed by Yang et al. (2010) also centers the columns of $Z$ by subtracting $2p_j$

$$K_{Y,ik} = \begin{cases} m^{-1}\sum_{j=1}^{m} \dfrac{(z_{ji} - 2p_j)(z_{jk} - 2p_j)}{2p_j(1 - p_j)}, & i \neq k \\[2ex] 1 + m^{-1}\sum_{j=1}^{m} \dfrac{z_{ji}^2 - (1 + 2p_j)z_{ji} + 2p_j^2}{2p_j(1 - p_j)}, & i = k \end{cases} \tag{11}$$

$$K_{AB} = (2m)^{-1}\sum_{j=1}^{m} \frac{(z_j - 2p_j\mathbf{1})(z_j - 2p_j\mathbf{1})^{T}}{2p_j(1 - p_j)}, \tag{12}$$

where $z_j$ is the i-element vector of the $j$th SNP.

The classical identity-by-state definition is (Astle et al. 2009):

$$K_{IBS} = (2m)^{-1}\sum_{j=1}^{m} (z_j - \mathbf{1})(z_j - \mathbf{1})^{T} + \frac{1}{2}. \tag{13}$$

Note that this is the only calculation that is not scaled or centered by any function of $p_j$.
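To make the scaling factor $(n-1)^{-1}\mathrm{tr}(K)$ concrete, the sketch below builds two GRMs from a small marker matrix and applies the post hoc adjustment described earlier. It is an illustration under stated assumptions rather than the study's implementation: equation (2) is not reproduced in this section, so `k_asv` assumes that $K_{ASV}$ is the centered cross-product divided by $(n-1)^{-1}\mathrm{tr}(\bar{Z}\bar{Z}^{T})$; the marker matrix `Z` and the variance components are hypothetical values; and all function names are invented for this example.

```python
def center_columns(Z):
    """Subtract each marker (column) mean, i.e. form PZ."""
    n, m = len(Z), len(Z[0])
    means = [sum(row[j] for row in Z) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in Z]


def cross_product(Zc):
    """Return the n x n matrix Zc Zc^T."""
    return [[sum(a * b for a, b in zip(ri, rk)) for rk in Zc] for ri in Zc]


def asv_scaling(K):
    """Scaling factor (n - 1)^-1 tr(K)."""
    n = len(K)
    return sum(K[i][i] for i in range(n)) / (n - 1)


def k_vanraden(Z):
    """Equation (8): centered cross-product over 2 * sum_j p_j (1 - p_j)."""
    n, m = len(Z), len(Z[0])
    p = [sum(row[j] for row in Z) / (2 * n) for j in range(m)]
    denom = 2 * sum(pj * (1 - pj) for pj in p)
    return [[x / denom for x in row] for row in cross_product(center_columns(Z))]


def k_asv(Z):
    """ASSUMED form of K_ASV: centered cross-product over (n - 1)^-1 tr(.)."""
    G = cross_product(center_columns(Z))
    c = asv_scaling(G)
    return [[x / c for x in row] for row in G]


def adjusted_h2(K, var_g, var_e):
    """Post hoc adjustment: rescale var_g by (n - 1)^-1 tr(K), then apply (1)."""
    theta_g = asv_scaling(K) * var_g
    return theta_g / (theta_g + var_e)


# Hypothetical genotypes coded 0/1/2 for n = 4 entries and m = 3 SNPs.
Z = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 2]]
K_vr, K_asv = k_vanraden(Z), k_asv(Z)

# Hypothetical variance components on the K_VR scale (not REML output).
var_g_vr, var_e = 0.8, 1.0
# The same covariance K * var_g expressed on the K_ASV scale.
var_g_asv = asv_scaling(K_vr) * var_g_vr

print(adjusted_h2(K_vr, var_g_vr, var_e), adjusted_h2(K_asv, var_g_asv, var_e))
```

Because `asv_scaling(K_asv)` equals 1 by construction, the adjustment leaves a variance component estimated with this form of $K_{ASV}$ unchanged, which is why the two printed values coincide.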
4 | G3, 2022, Vol. 12, No. 6
For each model and each simulation, we estimated two variance components ($\sigma_g^2$ and $\sigma_e^2$) using sommer::mmer() and took the ratio of variance components in R v4.1.0 (R Core Team 2020). We estimated genomic heritability using the standard form by substituting REML estimates from (3) into (1).

LMM analysis in R

In the sommer R package (Covarrubias-Pazaran 2016), LMM (3) is expressed as

    mmer(fixed = Y ~ 1,
         random = ~ vs(Entry, Gu = K),
         rcov = ~ units,
         data = data)

where data is an n × 2 matrix with Y as a column of LSMs, Entry is a column of factor-coded entries, and K is one of the seven GRMs given in this study. A large number of statistical computing solutions can fit this model, including regress (Clifford and McCullagh 2006), ASREML (Butler 2021), rrBLUP (Endelman 2011), GEMMA (Zhou and Stephens 2012), emmREML (Akdemir and Okeke 2015), brms (Bürkner 2017), and lme4GS (Caamal-Pat et al. 2021).

Simulated data

We generated 36 experiment designs with different heterozygosity $H$ = 0.0, 0.25, 0.5, and 0.75 and different trait heritability $h_g^2$ = 0.2, 0.5, and 0.8 and for population sizes of $n$ = 250, 500, and 1,000. In all examples, 1,000 populations genotyped at $m$ = 5,000 causal loci were used to generate the genetic traits. We simulated all $m$ = 5,000 marker effects following a normal distribution with $\mu = 0$ and $\sigma = 1$. When multiplied by the marker genotypes and summed, the score is an individual's true genetic value, $g$. Residuals were simulated with $\mu = 0$ and $\sigma_e^2 = s_g^2(1 - h^2)/h^2$ to obtain a trait with the desired genomic heritability (Endelman 2011), where $s_g^2 = (n-1)^{-1}\sum_{i=1}^{n}(g_i - \bar{g})^2$ is the sample variance among genotypic values (Estaghvirou et al. 2013). In this study, the true value of $h_g^2$ = 0.2, 0.5, or 0.8. All plots were made with the ggplot2 package (Wickham 2016) in R 4.1.0 (R Core Team 2020).

Empirical data

We analyzed five publicly available data sets using seven methods for calculating the realized relationship matrix and estimated $h_g^2$. First, we analyzed six traits from Kumar et al. (2015), who evaluated a breeding population of $n$ = 247 apple (Malus domestica) hybrids genotyped at $m$ = 2,829 SNPs with $H$ = 0.348. The reported traits were fruit weight (WT), fruit firmness (FF), greasiness (GRE), crispiness (CRI), juiciness (JUI), and flavor intensity (FIN). The shrinkage factor $\delta$ from Endelman and Jannink (2012) was equal to 0.02. Second, we analyzed the wheat data set from Crossa et al. (2010), who evaluated $n$ = 599 wheat (Triticum aestivum) fully inbred lines ($H$ = 0.0; $\delta$ = 0.03) for grain yield (GY) in four environments genotyped for $m$ = 1,278 SNPs. We evaluated each environment (i.e. GY-E1, GY-E2, GY-E3, and GY-E4) with an independent model. Third, we analyzed data from Valdar et al. (2006), who evaluated a laboratory population of $n$ = 1,814 stock mice (M. musculus) for body mass index (BMI), body length, and weight and genotyped for $m$ = 10,346 SNPs ($H$ = 0.363; $\delta$ = 0.01). Fourth, we analyzed a population of $n$ = 1,057 naturally occurring Arabidopsis (A. thaliana) ecotypes phenotyped for the mean ($\mu$) and SD of flowering time under 10 °C (FT10) and 16 °C (FT16) and genotyped at $m$ = 193,697 SNPs ($H$ = 0.0; $\delta$ = 0.0) from Atwell et al. (2010) and Alonso-Blanco et al. (2016). Fifth, we analyzed a commercial pig (S. scrofa) population made available by PIC (a Genus company) with $n$ = 3,534 entries genotyped at $m$ = 52,843 SNPs ($H$ = 0.311; $\delta$ = 0.0) that were phenotyped for five traits: T1, T2, T3, T4, and T5 (Cleveland et al. 2012). For each population, we calculate the seven relationship matrices (8–9) and apply them in (3) for each trait to estimate $\hat{h}_g^2$ with (1).

We performed cross-validation to determine predictive ability $r(\hat{g}, y)$, or the correlation between BLUPs and LSM, which is a measure of success commonly reported in genomic prediction studies that indicates how informative the phenotype is as a measure of the genomic value. We also estimated the prediction accuracy $r(\hat{g}, y)/\sqrt{\hat{h}_g^2}$, which is a measure of success that scales the predictive ability by $\sqrt{\hat{h}_g^2}$. An ideal situation for genomic prediction is a low value of predictive ability and a high value of prediction accuracy. When the predictive ability is high, genomic selection is unlikely to outperform phenotypic selection. When the prediction accuracy is low, the model is bad at capturing the variation in genomic values. We first split each population into 80% train and 20% test and estimated genomic BLUPs and then calculated the accuracy as the correlation between the estimated LSM $y$ and the BLUP $\hat{g}$ for all entries in the test set. We performed this cross-validation scheme 100 times for each population and each trait.

Results

Analysis of simulated data confirms that ASV yields accurate estimates of genomic variance

The ASV relationship matrix yielded suitable estimates of genomic variance and genomic heritability in the observed populations, while the other methods varied with the level of heterozygosity. When heterozygosity $H$ < 0.5, the genomic variance tends to be underestimated, and when $H$ > 0.5, the genomic variance tends to be overestimated (Fig. 1) by methods excluding (2) and (9). This pattern was realized regardless of the population size, e.g. $n$ = 250, 500, and 1,000. All methods tend to produce accurate estimates when $H$ = 0.5, in which case the inbreeding coefficient $f$ = 0 and HWE is not violated.

The precision (variance) improved by increasing the population size ($n$), but the accuracy (bias) did not improve. It has been demonstrated ad nauseam that increasing $n$ increases precision or lowers the sampling variance of the estimates but does not eliminate bias (Laird and Ware 1982; Searle et al. 1992; Lynch and Walsh 1998; Legarra 2016). Notably, the entire parameter space of $h_g^2$ was observed when the population size is small (Fig. 1). Only $K_{ASV}$ and $K_{GN}$ yielded stable precision as $H$ increased (Fig. 2). Other methods that we examined have variable precision and variable accuracy depending on the sample size, heterozygosity, and the true value of $h_g^2$ (Figs. 1 and 2). Interestingly, we observed an interaction between $h_g^2$ and $H$ that impacted the precision of genomic heritability estimation but did not affect $K_{GN}$ or $K_{ASV}$. Precision improved as $H$ increased for high heritability traits and precision worsened as $H$ increased for low heritability traits. For traits where $h_g^2$ = 0.5, precision was constant.

Analysis of simulated and empirical data confirms that ASV does not impact BLUPs or prediction accuracy

Neither the predictive ability ($r(\hat{g}, y)$) nor the BLUPs from genomic best linear unbiased predictor are affected by ASV. In our simulated populations, the predictive ability was equal across all
M. J. Feldmann et al. | 5
seven GRMs that we tested (Fig. 3), but the prediction accuracy ($\hat{h}_g^{-1} r(\hat{g}, y)$) varies with the choice of GRM and therefore the heterozygosity in the sampled populations. In 22 empirical trait population examples we evaluated, the differences in the prediction accuracy, when present, appeared to be negligible and do not lend themselves clearly to "better" or "worse" categories (Figs. 4 and 5). While the choice of $K$ does not impact BLUP, it does impact estimates of genomic variance $\hat{\sigma}_g^2$, genomic heritability $\hat{h}_g^2$, prediction accuracy $\hat{h}_g^{-1} r(\hat{g}, y)$ (Fig. 5), average prediction error variance PEV, and selection reliability $1 - \sigma_g^{-2}\,\mathrm{PEV}$, which all rely on $\hat{\sigma}_g^2$. Differences in Fig. 5 are more pronounced for the fully inbred populations, e.g. Arabidopsis and wheat, than the partially inbred populations, e.g. pig, mouse, and apple. ASV allows users to understand how well GS is performing relative to phenotypic selection and to predict how reliable genomic selection can be for certain traits in specific populations more accurately than other methods since it directly yields accurate estimates of $\sigma_g^2$ and $h_g^2$ (Figs. 3–5).

The relationship between $K_{ASV}$ and $K_{GN}$

We found that the normalized $K$, i.e. $K_{GN}$ (9), proposed by Forni et al. (2011) and further described by Legarra (2016), yields estimates of $K$ that only deviated from $K_{ASV}$ by a single degree of freedom in the denominator of the matrix scaling factor. Although these estimators were derived through different approaches and with different concepts in mind, they are numerically similar, apart from a single degree of freedom difference in the divisor of the GRM: Forni et al. (2011) used the number of entries ($n$), whereas we used $df_g = n - 1$ for calculating the sample variance (Bulmer 1979). Instead of being biased by a factor of $1/(1 + f)$, $K_{GN}$ is biased by a factor of $(n - 1)/n$. Our simulations confirm this deviation and the median genomic variance estimates using $K_{GN}$ were slightly larger than $K_{ASV}$, which was equal to the true value in the simulations (Fig. 1). This work, Forni et al. (2011), and Legarra (2016) all arrive at numerically similar solutions through conceptually different derivations, which we feel is indicative of the value of these approaches for the plant, animal, and human genetic studies that rely on genomic relatedness, e.g. GWAS, genomic prediction, or inferring population structure and ancestry.

$K_{ASV}$ yields genomic variance estimates that naturally account for inbreeding

Inbreeding changes the patterns of among and within entry genomic variance and drives deviations from HWE (Bernardo 2002; Wricke and Weber 2010; Legarra 2016; Isik et al. 2017). A challenge of partial inbreeding is that researchers may not know or infer the reference population, making unadjusted genomic variance estimates hard to interpret (Legarra 2016). In genomic evaluations in plants and animals, the current population is often interpreted as the reference population, but this is an inaccurate
Fig. 3. Effect of heritability ($h_g^2$), population size ($n$), and heterozygosity ($H$) on the predictive ability $r(\hat{g}, y)$. Phenotypic observations were simulated for 1,000 samples with $n$ = 250, 500, and 1,000 (left to right) genotyped for $m$ = 5,000 SNPs and the average heterozygosity $H$ = 0%, 25%, 50%, and 75%. $r(\hat{g}, g)$ estimates from LMMs fit using the seven relationship matrices are shown for true genomic heritability $h_g^2$ = 0.2 (upper panel), 0.5 (middle panel), and 0.8 (lower panel). Each box's upper and lower halves correspond to the first and third quartiles (the 25th and 75th percentiles). The notch corresponds to the median (the 50th percentile). The upper whisker extends from the box to the highest value within 1.5 × IQR of the third quartile, where IQR is the interquartile range or distance between the first and third quartiles. The lower whisker extends from the first quartile to the lowest value within 1.5 × IQR of the quartile.
Fig. 4. Cross-validated predictive ability from five case studies and including 22 phenotypic traits using seven GRMs. Cross-validated predictive ability ($r(\hat{g}, y)$) results are presented from 100 realizations of 80:20 cross-validation using the seven relationship matrices for six traits in an apple population with $n$ = 247 entries genotyped at $m$ = 2,829 SNPs (Kumar et al. 2015) (first row), four traits in an Arabidopsis population with $n$ = 1,057 entries genotyped at $m$ = 193,697 SNPs (Atwell et al. 2010) (second row), three traits in a mouse population with $n$ = 1,814 entries genotyped at $m$ = 10,346 SNPs (Valdar et al. 2006) (third row), five traits in a pig population with $n$ = 3,534 entries genotyped at $m$ = 52,843 SNPs (Cleveland et al. 2012) (fourth row), and four traits in a wheat population with $n$ = 599 entries genotyped at $m$ = 1,278 SNPs (Crossa et al. 2010) (fifth row). For the Arabidopsis data set (second row), $K_Y$ systematically produced singular systems in sommer::mmer() and prediction accuracy was not estimated for either FT10$_\mu$ or FT16$_\mu$. Each box's upper and lower halves correspond to the first and third quartiles (the 25th and 75th percentiles). The notch corresponds to the median (the 50th percentile). The upper whisker extends from the box to the highest value within 1.5 × IQR of the third quartile, where IQR is the interquartile range or distance between the first and third quartiles. The lower whisker extends from the first quartile to the lowest value within 1.5 × IQR of the quartile.
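The cross-validated predictions summarized in Fig. 4 are unchanged when a GRM is rescaled by a constant, provided the genomic variance is rescaled in the opposite direction, because only the product $K\sigma_g^2$ enters the predictor. The sketch below illustrates this for a single train/test split. It is a simplified illustration, not the sommer workflow used in the study: the variance components are treated as known rather than estimated by REML, the intercept is the plain mean of the training phenotypes, and `K`, `y`, the split, and the constant `c` are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def gblup_predict(K, y, train, test, var_g, var_e):
    """Predict genomic values of test entries from training phenotypes."""
    mu = sum(y[i] for i in train) / len(train)  # plug-in intercept (simplification)
    V = [[var_g * K[i][k] + (var_e if i == k else 0.0) for k in train] for i in train]
    w = solve(V, [y[i] - mu for i in train])
    return [var_g * sum(K[t][k] * wk for k, wk in zip(train, w)) for t in test]


# Hypothetical symmetric, diagonally dominant GRM for five entries.
K = [
    [1.0, 0.3, 0.1, -0.2, 0.2],
    [0.3, 1.0, 0.2, 0.1, -0.1],
    [0.1, 0.2, 1.0, 0.3, 0.1],
    [-0.2, 0.1, 0.3, 1.0, 0.2],
    [0.2, -0.1, 0.1, 0.2, 1.0],
]
y = [3.1, 2.4, 4.0, 1.7, 2.9]  # hypothetical phenotypes
train, test = [0, 1, 2], [3, 4]

c = 3.7  # arbitrary rescaling constant
K_scaled = [[x / c for x in row] for row in K]

pred = gblup_predict(K, y, train, test, var_g=2.0, var_e=1.0)
pred_scaled = gblup_predict(K_scaled, y, train, test, var_g=2.0 * c, var_e=1.0)
print(all(abs(a - b) < 1e-9 for a, b in zip(pred, pred_scaled)))
```

This invariance covers only GRMs that differ by a scalar, such as rescalings of the same centered cross-product. GRMs with a different structure, such as the IBS matrix, are not addressed by this sketch.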
interpretation unless the population is at HWE and $H$ = 0.5 by design or happenstance. It may be that the only reference population that is concretely defined is the sample population. In connection to Legarra (2016), our work will allow researchers to directly obtain accurate estimates of the genomic variance in the sample population regardless of whether the assumptions of HWE are met.

When the study populations are entirely, or partially, inbred as in wheat, Arabidopsis, or inbred per se evaluations in hybrid crops, such as maize, tomato, and rice, the covariance among marker effects increases. Lehermeier et al. (2017) proposed a novel method (termed method M2) to account for the covariance of marker effects, which increases the genomic variance estimates in recombinant inbred line populations. Our analyses of the same flowering time data with
Table 1. Genomic heritability ($\hat{h}_g^2$) estimates for the 22 traits from five case studies, including six traits in an apple population with $n$ = 247 entries genotyped at $m$ = 2,829 SNPs (Kumar et al. 2015), four traits in a wheat population with $n$ = 599 entries genotyped at $m$ = 1,278 SNPs (Crossa et al. 2010), four traits in an Arabidopsis population with $n$ = 1,057 entries genotyped at $m$ = 193,697 SNPs (Atwell et al. 2010), three traits in a mouse population with $n$ = 1,814 entries genotyped at $m$ = 10,346 SNPs (Valdar et al. 2006), and five traits in a pig population with $n$ = 3,534 entries genotyped at $m$ = 52,843 SNPs (Cleveland et al. 2012) using the seven GRMs compared in this article.
Case study | Trait | ASV | Forni et al. (2011) | VanRaden (2008) | Astle and Balding (2009) | Yang et al. (2010) | Endelman and Jannink (2012) | IBS
Pincot et al. 2020; Petrasch et al. 2021; Fan et al. 2021), and for accounting for population structure and relatedness in marker-trait association analyses (Kang et al. 2010; Yang et al. 2010, 2011; Tian et al. 2011; Peiffer et al. 2014; Spindel et al. 2016; Alqudah et al. 2016; Pincot et al. 2018; Ferguson et al. 2021; Freebern et al. 2020). As advocated by Speed and Balding (2015) and Legarra (2016), the ragged diagonal elements of $K_{ASV}$ equal 1, on average, and the off-diagonal elements equal 0, on average. ASV directly yields accurate estimates of genomic heritability in the observed population and can be used to adjust deviations that arise from other commonly used methods for calculating genomic relationships regardless of the population constitution, such as inbred lines and F1 hybrids, unstructured GWAS populations, and animal herds or flocks (Fig. 1).

The interpretation of genomic variance and heritability estimates was systematically affected by the available methods used to estimate $K$. The bias that we show in this paper is independent of sampling error (large data sets mitigate sampling error) and exists even for enormous data sets. We derived a new relationship matrix, $K_{ASV}$, using the ASV that yielded consistent variance component estimates. We also derived a correction factor $(n-1)^{-1}\mathrm{tr}(K)$ that allowed accurate estimates of genomic heritability in the observed population from LMM analyses using various software packages (Clifford and McCullagh 2006; Endelman 2011; Zhou and Stephens 2012; Pérez and de Los Campos 2014; Akdemir and Okeke 2015; Covarrubias-Pazaran 2016; Bürkner 2017; Runcie and Crawford 2019; Butler 2021; Caamal-Pat et al. 2021). Adopting experiment designs that enable screening of a greater number of entries $n$ yields more precise estimates of key variance components in research programs (Smith et al. 2006; Moehring et al. 2014; Borges et al. 2019; Mackay et al. 2019; Hoefler et al. 2020) and ASV can ensure that those estimates are accurate and comparable across populations. In many plant quantitative genetic studies, the population sizes are $n \leq 500$, which may pose a general problem for variance component and ratio estimation as those variance components can have high sampling variability between replicated experiments (Fig. 1). For large populations, common in human and domesticated animal studies, it is possible to obtain precise (low variance) but inaccurate (high bias) estimates of $\sigma_g^2$ and $h_g^2$ resulting from different relationship matrices, unless the assumptions of HWE happen to be perfectly met in the study population.

We did not explore differences that arise from population structure or rare alleles, which is a limitation to our simulation approach (Astle et al. 2009; Lee et al. 2012, 2013; Speed et al. 2012). We believe, but have not demonstrated, that our ASV approach could be applied to many of the existing methods that have been proposed to handle these real-world situations. For example, Lee et al. (2012) propose that $K$ be calculated among different sets of SNPs with similar MAFs and then the genomic variances for each MAF bin are jointly estimated and summed to account for unique variation attributable to common vs rare alleles. Speed et al. (2012) proposed a scaling factor for each SNP based on its own sample variance ($\mathrm{var}(x_l)^s$), where $s$ ranges from −2 to 2 and $x_l$ is a vector of marker genotypes at the $l$th locus (Speed et al. 2012; Lee et al. 2013). This means that SNPs are either being centered and scaled ($s$ = –1), which is equal to $K_{GN}$, or that SNPs are being centered but not scaled ($s$ = 0). While Speed et al. (2012) indicate that $s$ = –1 yields more stable estimates of $h_g^2$, it is not entirely clear how to optimally select a value of $s$ for each locus.

Our simulations exposed systematic differences between (2) and other forms of $K$. Our simulation and empirical experiments also suggested limited, if any, differences between genomic variance estimates from five other commonly cited GRMs (Fig. 1; Table 1). The lack of significant differences is perturbing. In every case, there are multiple reasons given for using one relationship
matrix over any other that do not seem to play any role in either bias (accuracy) or variance (precision) of the genomic variance component estimates. Both (2) and (9) have the necessary numeric properties advocated by Speed and Balding (2015) that enable the variance components from LMM (3) to be interpreted directly as the genomic variance in the sampled population. We recommend that the ASV approach be considered for adoption by genetic researchers working in humans, microbes, or (un)domesticated plants and animals.

Data availability

The input and output data from simulations and analyses have

Astle W, Balding DJ. Population structure and cryptic relatedness in genetic association studies. Stat Sci. 2009;24:451–471.
Atwell S, Huang YS, Vilhjálmsson BJ, Willems G, Horton M, Li Y, Meng D, Platt A, Tarone AM, Hu TT, et al. Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature. 2010;465(7298):627–631.
Bernardo R. Breeding for Quantitative Traits in Plants, Vol. 1. Woodbury (MN): Stemma Press; 2002.
Bloom JS, Ehrenreich IM, Loo WT, Lite TLV, Kruglyak L. Finding the sources of missing heritability in a yeast cross. Nature. 2013;494(7436):234–237.
Borges A, González-Reymundez A, Ernst O, Cadenazzi M, Terra J, Gutiérrez L. Can spatial modeling substitute for experimental de-
associated genetic variance and heritability in complex trait analyses. PLoS Genet. 2021;17(8):e1009762.
Ferguson JN, Fernandes SB, Monier B, Miller ND, Allen D, Dmitrieva A, Schmuker P, Lozano R, Valluru R, Buckler ES, et al. Machine learning-enabled phenotyping for GWAS and TWAS of WUE traits in 869 field-grown sorghum accessions. Plant Physiol. 2021;187(3):1481–1500.
Forni S, Aguilar I, Misztal I. Different genomic relationship matrices for single-step analysis using phenotypic, pedigree and genomic information. Genet Sel Evol. 2011;43:1–7.
Freebern E, Santos DJ, Fang L, Jiang J, Gaddis KLP, Liu GE, VanRaden PM, Maltecca C, Cole JB, Ma L. Gwas and fine-mapping of livability
spatial designs outperform classic experimental designs? JABES. 2020;25(4):523–552.
Huang W, Mackay TFC. The genetic architecture of quantitative traits cannot be inferred from variance component analysis. PLoS Genet. 2016;12(11):e1006421.
Isik F, Holland J, Maltecca C. Genetic Data Analysis for Plant and Animal Breeding. Berlin (Germany): Springer; 2017.
Jensen J, Su G, Madsen P. Partitioning additive genetic variance into genomic and remaining polygenic components for complex traits in dairy cattle. BMC Genet. 2012;13:44.
Jivanji S, Worth G, Lopdell TJ, Yeates A, Couldrey C, Reynolds E, Tiplady K, McNaughton L, Johnson TJ, Davis SR, et al. Genome-wide association analysis reveals qtl and candidate mutations in-
Statistical Genomics: Two Volume Set; 2019. p. 501–520. New York, NY.
Maier RM, Zhu Z, Lee SH, Trzaskowski M, Ruderfer DM, Stahl EA, Ripke S, Wray NR, Yang J, Visscher PM, et al. Improving genetic prediction by leveraging genetic correlations among human diseases and traits. Nat Comm. 2018;9:1–17.
Makowsky R, Pajewski NM, Klimentidis YC, Vazquez AI, Duarte CW, Allison DB, de Los Campos G. Beyond missing heritability: prediction of complex traits. PLoS Genet. 2011;7(4):e1002051.
Meuwissen T, Hayes B, Goddard M. Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001;157(4):1819–1829.
Meuwissen T, Hayes B, Goddard M. Genomic selection: a paradigm
Schmidt P, Hartung J, Rath J, Piepho HP. Estimating broad-sense heritability with unbalanced data from agricultural cultivar trials. Crop Sci. 2019b;59(2):525–536.
Searle SR, Casella G, McCulloch CE. Variance Components. New York: John Wiley & Sons; 1992.
Smith A, Lim P, Cullis BR. The design and analysis of multi-phase plant breeding experiments. J Agric Sci. 2006;144(5):393–409.
Speed D, Balding DJ. Relatedness in the post-genomic era: is it still useful? Nat Rev Genet. 2015;16(1):33–44.
Speed D, Cai N, Johnson MR, Nejentsev S, Balding DJ; UCLEB Consortium. Reevaluation of snp heritability in complex human traits. Nat Genet. 2017;49(7):986–992.
Speed D, Hemani G, Johnson MR, Balding DJ. Improved heritability
Wricke G, Weber E. Quantitative Genetics and Selection in Plant Breeding. New York (NY): Walter de Gruyter; 2010.
Yadav S, Wei X, Joyce P, Atkin F, Deomano E, Sun Y, Nguyen LT, Ross EM, Cavallaro T, Ks A, et al. Improved genomic prediction of clonal performance in sugarcane by exploiting non-additive genetic effects. Theor Appl Genet. 2021;134:1–18.
Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010;42(7):565–569.
Yang J, Manolio TA, Pasquale LR, Boerwinkle E, Caporaso N, Cunningham JM, de Andrade M, Feenstra B, Feingold E, Hayes MG, et al. Genome partitioning of genetic variation for complex traits using common snps. Nat Genet. 2011;43(6):519–525.
Yang J, Zeng J, Goddard ME, Wray NR, Visscher PM. Concepts, estimation and interpretation of snp-based heritability. Nat Genet. 2017;49(9):1304–1310.
Yu J, Pressoir G, Briggs WH, Bi IV, Yamasaki M, Doebley JF, McMullen MD, Gaut BS, Nielsen DM, Holland JB, et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet. 2006;38(2):203–208.
Zhou X, Stephens M. Genome-wide efficient mixed-model analysis for association studies. Nat Genet. 2012;44(7):821–824.

Communicating editor: A. Lipka