This study evaluated sparse testing strategies using genomic prediction in the early yield testing stages of a CIMMYT spring wheat breeding program. Genomic prediction enabled expanding the number of testing environments without increasing phenotyping costs. Results showed that substantial overlap between lines tested across environments achieved optimal prediction accuracy. Genomic best linear unbiased prediction was found to be the best predictor of true breeding value and proposed as the selection decision metric for the early stages. The proportion of lines overlapping between the top 20% selected using genomic prediction and true breeding values was used to assess prediction performance.


Theoretical and Applied Genetics

https://doi.org/10.1007/s00122-022-04085-0

ORIGINAL ARTICLE

Sparse testing using genomic prediction improves selection for breeding targets in elite spring wheat

Sikiru Adeniyi Atanda1 · Velu Govindan1 · Ravi Singh1 · Kelly R. Robbins2 · Jose Crossa1 · Alison R. Bentley1

Received: 10 January 2022 / Accepted: 16 March 2022

Abstract

Key message Sparse testing using genomic prediction can be efficiently used to increase the number of testing environments while maintaining selection intensity in the early yield testing stage without increasing the breeding budget.

Sparse testing using genomic prediction enables expanded use of selection environments in early-stage yield testing without increasing phenotyping cost. We evaluated different sparse testing strategies in the yield testing stage of a CIMMYT spring wheat breeding pipeline characterized by multiple populations each with small family sizes of 1–9 individuals. Our results indicated that a substantial overlap between lines across environments should be used to achieve optimal prediction accuracy. As sparse testing leverages information generated within and across environments, the genetic correlations between environments and genomic relationships of lines across environments were the main drivers of prediction accuracy in multi-environment yield trials. Including information from previous evaluation years did not consistently improve the prediction performance. Genomic best linear unbiased prediction was found to be the best predictor of true breeding value, and therefore, we propose that it should be used as a selection decision metric in the early yield testing stages. We also propose it as a proxy for assessing prediction performance to mirror breeder’s advancement decisions in a breeding program so that it can be readily applied for advancement decisions by breeding programs.

Communicated by Huihui Li.

* Sikiru Adeniyi Atanda
[email protected]
* Alison R. Bentley
[email protected]

1 International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
2 Section of Plant Breeding and Genetics, School of Integrative Plant Sciences, Cornell University, Ithaca, NY, USA

Content courtesy of Springer Nature, terms of use apply. Rights reserved.

Introduction

Genomic prediction (GP) is a statistical method to predict the genetic potential of unobserved lines based on genomic information. It has been identified as a viable tool to accelerate genetic gain and to reduce phenotyping costs in plant breeding programs, particularly as genotyping costs become cheaper than phenotyping costs (Crossa et al. 2010; Juliana et al. 2019; Santantonio et al. 2020; Atanda et al. 2021a). GP is a flexible approach that can be implemented at different stages in a breeding program and for different purposes, depending on the objectives and overall breeding strategy.

The CIMMYT global spring wheat breeding program uses two yield testing stages to identify parents for the next breeding cycle and promising candidates to advance based on high and stable yield across managed selection environments (SEs) (Suppl. Figure 1). These candidates are then tested internationally through collaborative trials with partners selecting elite lines for use as parents in national breeding programs and/or for variety release. The SEs are defined by varying sowing time and management conditions in a single location (Ciudad Obregon, Mexico). Although the SEs were defined within a single location, they are constructed to predict the performance in global target populations of environments (Crespo-Herrera et al. 2021). In the initial yield testing stage (denoted PYT, or stage 1), lines with desirable agronomic and grain traits, and resistance to diseases, especially rusts, are evaluated for yield potential in an optimal five irrigations bed planting environment to discard low yielding lines while maintaining a range of maturity. Accurately capturing genotype x environment (GxE) interaction is critical in identifying promising lines with the greatest potential to perform in international trials (Falconer and Mackay 1996; Cooper et al. 1995; Mohammadi and Amri 2011). Therefore, selected lines from stage 1 are further evaluated in a subsequent Elite Yield Trial (denoted EYT, or stage 2) in six SEs. This two-stage process of identifying superior performing lines is time consuming and costly, requiring two consecutive cycles of yield testing. Therefore, a key objective of the CIMMYT spring wheat program is shortening the selection cycle by advancing lines directly to multi-environment trials in year 1 using GP to discard lines with low genomic best linear unbiased predictions (GBLUPs) for grain yield and other relevant traits (Suppl. Figure 1).

Sparse testing, in which the phenotyping of lines is split across environments, is a robust strategy to help achieve two objectives, specifically (1) increased number of lines tested across multiple, diverse environments and (2) increased number of testing environments while maintaining the same selection intensity (Burgueño et al. 2012; Jarquin et al. 2020; Atanda et al. 2021b). The latter is the proposed usage of GP in CIMMYT spring wheat breeding where phenotypic data across SEs serve as a calibration model to predict GBLUPs to make earlier selection of promising lines for international trials (Suppl. Figure 1).

The size of the full-sib family in the CIMMYT spring wheat breeding program in the early yield testing stage is relatively small. Therefore, in the dataset used in this study, individuals within a family were likely to be absent across environments (contrary to Atanda et al. (2021b)), and the size of the calibration set is relatively small in each environment. Both factors are likely to influence the prediction accuracy when applying sparse testing in the early yield testing stage. Good prediction accuracies have been reported within bi-parental populations and lower prediction accuracies across populations due to inconsistent quantitative trait loci (QTL)-marker linkage phase across populations (Clark et al. 2012; Lehermeier et al. 2014; Brandariz and Bernardo 2019; Atanda et al. 2021a). However, in a scenario where the prediction and calibration set are heterogeneous, the effect of changes in LD-marker phase across populations on prediction accuracy might be minimal. Consequently, we evaluated the efficiency of sparse testing with GP when the prediction and calibration set are heterogeneous using different sparse testing strategies defined by the proportion of overlapping lines across SEs to identify an optimal strategy without sacrificing selection accuracy. The sparse testing is implemented in stage 1 (where current stage 2 will become stage 1), and thus the dataset used here mimics the data architecture expected in direct stage 2 by skipping stage 1 when GBLUP will be used to select promising lines for further testing (Suppl. Figure 1). In addition, past breeding lines with genotypic and phenotypic data in relevant environments constitute a resource to increase the size of the calibration set (Mangin et al. 2019; Brandariz and Bernardo 2019; Auinger et al. 2021; Atanda et al. 2021a). Here, we also evaluate the merit of using past breeding information to increase the size of training set without increasing costs for sparse testing using GP.

In most plant breeding programs, the genetic value of genotypes is estimated through adjusted best linear unbiased estimates (BLUEs) (Falconer and Mackay 1996; Santantonio et al. 2020; Bernardo 2020; Lell et al. 2021). Theoretically, breeding value is a predictor of selection candidate potential to produce superior progenies in the next generation. However, true breeding value is unknown; thus, the efficiency of the advancement decision depends on a selection metric that is predictive of the true breeding value. In animal breeding, GBLUP estimated from phenotypic information of an individual and relatives using a marker or pedigree relationship matrix in mixed model equations is widely used as a selection metric for candidates (Zhang et al. 2011; Junjie and Shengjie 2019). Recently, adoption of GBLUP as a selection decision metric is gaining traction in plant breeding (Bernardo 2020; Lell et al. 2021). Therefore, we propose and test the use of GBLUP as an advancement decision metric for traits with low to medium heritability, especially in the early yield testing stages in order to improve selection accuracy.

Prediction accuracy is often assessed as Pearson correlation between predicted GBLUP and the BLUE (Crossa et al. 2014; Zhang et al. 2019). However, this more closely reflects predictive ability rather than accuracy, and a proxy for prediction accuracy is needed to reflect the breeding program advancement decision strategy. Therefore, we assessed the prediction performance of sparse testing aided GP as the proportion of lines that overlap between the top 20% of lines selected using the Smith–Hazel selection index (Smith 1936; Hazel 1943) with GBLUP from the prediction model and with GBLUP estimated from full data across the SEs.

Using breeding data from the CIMMYT spring wheat program, the overall objective of this study was to test sparse testing strategies using GP via a number of approaches, namely: (1) highlighting selection accuracy using GBLUP as a selection metric for line advancement decisions in direct stage 2 skipping stage 1 testing; (2) determining the optimal sparse testing aided GP strategy in direct stage 2 trials; (3) determining the contribution of historical data to increasing the calibration set size and improving prediction accuracy of untested lines across SEs without increasing cost; and (4) determining the appropriate method for evaluating prediction accuracy to closely mirror breeder advancement decisions.

Materials and methods

Plant material and field evaluation

The genetic material used in this study consisted of F4:8 (stage 1) and F4:9 (stage 2) CIMMYT spring wheat
breeding lines. The genetic material was classified into three datasets (denoted DS1, DS2 and DS3). DS1 consisted of 1260 stage 1 lines grouped into 45 trials, each with 28 entries and two checks, evaluated for grain yield (GY) and other agronomic traits in an optimal five irrigation bed planting SE (B5IR). Each trial was planted in an alpha-lattice incomplete block design with two replicates in the 2019–2020 crop season at Norman E. Borlaug research station, Ciudad Obregon, Sonora, Mexico (27° 29′ N, 109° 56′ W). DS2 consisted of the 280 stage 2 lines advanced from the DS1 1260 lines based on GY performance, agronomic, disease, grain zinc and processing quality traits. They were evaluated in six SEs at the same location in the 2020–2021 season. In each SE, the lines were grouped into five trials, each with 56 entries and 4 checks, and the trials were planted in an alpha-lattice incomplete block design with two replicates.

The SEs were defined by a combination of factors including planting date, irrigation, and planting condition (flat or bed) as follows:

1. Optimal planting date and five irrigations bed planting (B5IR). Approximately 500 mm of water was applied through flood irrigation. Optimal planting date implies planting during the third week of November to first week of December.
2. Optimal planting date and five irrigations flat planting (F5IR), with the same total amount of water applied as B5IR, through drip irrigation.
3. Optimal planting date and two irrigations bed planting (B2IR). Approximately 250 mm of water was applied through flood irrigation.
4. Optimal planting date and drought stress flat planting (FDRIP). Approximately 180 mm of water was applied through drip irrigation.
5. Early heat stress bed planting (BEHT). Lines were planted about 3 weeks earlier (1st week of November) than the optimal planting date with the aim of evaluating the lines for heat tolerance during the early growth stage. Approximately 500 mm of water was applied through flood irrigation.
6. Late heat stress bed planting (BLHT). Contrary to BEHT, lines were sown 90 days after optimal planting date to evaluate the lines for heat tolerance during the flowering and grain filling stages of the plant. Approximately 500 mm of water was applied through flood irrigation.

DS3 consisted of the stage 2 lines advanced from 2018–2019 stage 1 (data not used) and consisted of 253 lines evaluated in six SEs at the same location in the 2019–2020 season.

To account for spatial variation in the field, the trials were sown as a grid of (6, 8, 6) rows and (15, 15, 15) columns for DS1, DS2 and DS3 data, respectively.

The DS2 lines advanced from DS1 and the DS3 lines were genotyped using genotyping-by-sequencing (GBS), and 93,349 SNP markers were generated. After removing SNPs with more than 20% missing values and with a minor allele frequency less than 5%, 20,985 SNPs remained and were used for the analysis. Missing SNPs were imputed with Beagle 5.1 (Browning et al. 2018).

Cross-validation scheme

We evaluated the efficiency of two sparse testing GP scenarios based on the approach reported by Jarquin et al. (2020):

1. Phenotyping a different set of lines in each SE, i.e. lines were not repeated across SEs. In this scenario, the 280 lines in DS2 were divided into six unique sets with some SEs having 46 lines, while others had 47 lines in the calibration set (Fig. 1). Therefore, the prediction set in each SE consisted of 234 lines where 46 lines were considered tested, and 233 lines when 47 lines were considered tested in the SE. Splitting of lines across the SEs was repeated 30 times.
2. A subset of lines overlapping across the SEs to allow borrowing of information across SEs while varying the number of lines that serve as connectivity across the SEs. We considered the following sets of lines as overlapping across SEs: 10, 20, 30, 40 and 50% of the DS2 (n = 280) (Fig. 1B and Suppl. Figure 2:3). When 10% of the total lines overlapped across the SEs, the remaining 252 lines were divided into six unique sets. Thus, in total 70 lines were used as a calibration set in each SE to predict the genetic value of the prediction set (210 lines) in each SE. For 20, 30, 40 and 50% of the total DS2 lines, 58, 88, 112 and 142, respectively, overlapped across the SEs. The calibration sets for each SE for the four different overlapping sizes were 95, 120, 140 and 165 lines, respectively. The prediction sets in each SE for the four overlapping scenarios were 185, 160, 140 and 115 lines, respectively. Again, the process of line allocation across the SEs was repeated 30 times for each overlapping size scenario.

For all the cross-validation schemes, DS1 and DS3 were used separately to increase the size of the calibration set. The contribution to accuracy of prediction in each SE was then evaluated independently for both DS1 and DS3.

We report two measures of the prediction accuracy: firstly, the Pearson correlation of the predicted GBLUP or PBLUP and the best linear unbiased estimates (BLUE) calculated using all observed records of tested and untested lines in

Fig. 1 Allocation of 280 DS2 lines to the six SEs. Each column represents a discrete SE and the green sections in each column correspond to unique lines tested in each SE. A: The green sections correspond to 46 or 47 lines unique to each SE with no overlapping lines across the SEs. B: The pink section represents 28 lines (10% of DS2 lines) that overlapped across the SEs, and the green sections are the 42 DS2 lines unique to each SE
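
The allocation shown in Fig. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function name `allocate_lines`, the line labels and the random seed are hypothetical, and the only inputs taken from the text are the 280 DS2 lines, the six SEs and the overlap counts (0, 28, 58, 88, 112 and 142).

```python
import random


def allocate_lines(line_ids, n_envs=6, n_overlap=0, seed=0):
    """Return one calibration set (a set of line IDs) per selection environment.

    n_overlap lines are phenotyped in every SE; the remaining lines are split
    as evenly as possible so that each of them is phenotyped in exactly one SE.
    """
    rng = random.Random(seed)
    shuffled = list(line_ids)
    rng.shuffle(shuffled)
    shared = shuffled[:n_overlap]   # lines connecting all SEs
    unique = shuffled[n_overlap:]   # lines tested in a single SE
    base, extra = divmod(len(unique), n_envs)
    calibration, start = [], 0
    for env in range(n_envs):
        size = base + (1 if env < extra else 0)
        calibration.append(set(shared) | set(unique[start:start + size]))
        start += size
    return calibration


# Hypothetical labels standing in for the 280 DS2 lines.
lines = [f"L{i:03d}" for i in range(280)]
for n_overlap in (0, 28, 58, 88, 112, 142):
    calib = allocate_lines(lines, n_overlap=n_overlap, seed=1)
    calib_sizes = sorted({len(c) for c in calib})
    pred_sizes = [len(lines) - s for s in calib_sizes]
    print(n_overlap, calib_sizes, pred_sizes)
```

Repeating the call with 30 different seeds would mimic the 30 random splits described in the text; in each SE, the lines outside the returned set form the prediction set.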

each SE. Secondly, we used a Smith–Hazel (SH) selection index (Smith 1936; Hazel 1943) to define the top 20% of candidates based on GBLUP of the lines estimated from full data and GBLUP (estimated and predicted) of lines in the six SEs obtained from the prediction model. Accuracy was estimated as the proportion of lines that intersected between individuals selected using the GBLUP estimates of the lines from full data and the GBLUP obtained from the prediction model. To calculate the SH index, we used the genetic (from Eqs. 1 and 2) and phenotypic variance–covariance matrices and vector of economic weight (0.25, 0.1, 0.3, 0.2, 0.1, 0.05) that optimized the ability to select promising candidates across the SEs, thus maximizing genetic gain. The latter method was used to reflect the advancement decision strategy in the early yield testing stages in the CIMMYT spring wheat breeding program (Suppl. Figure 1).

To assess the accuracy of GBLUP as a selection decision metric in the early yield testing stages, we estimated the BLUE and GBLUP of the 1260 DS1 lines and the 280 DS2 advanced lines using the full data set. Given that the 1260 DS1 lines
were only evaluated in a single SE (B5IR), only information from this SE was used for this analysis. We calculated the Pearson correlation of the GBLUP obtained for the complete 280 DS2 lines as well as BLUEs in the two stages of yield testing. We also measured accuracy as percentage of lines that intersect between select top (10, 20, 30, 40 and 50%) lines in the two testing stages to account for possible selection intensity in the early yield testing stage.

Genomic selection models

We fitted a multi-environment linear mixed model in ASREML (Gilmour 1999) for the sparse testing aided GP analysis using DS2, combined DS1 and DS2 as well as combined DS2 and DS3 as follows:

$$ y = \mathbf{1}_n \mu + X_1 b_1 + Z_1 u_1 + Z_2 u_2 + Z_3 u_3 + Z_4 u_4 + Z_5 u_5 + \varepsilon \tag{1} $$

$$ y = \mathbf{1}_n \mu + X_1 b_1 + Z_1 u_1 + Z_2 u_2 + Z_3 u_3 + Z_{4.1} u_{4.1} + Z_{5.1} u_{5.1} + \varepsilon \tag{2} $$

For single-environment analysis, to evaluate the efficiency of GBLUP as a selection decision metric in the early yield testing stages we fitted a linear mixed model as follows:

$$ y = \mathbf{1}_n \mu + X_1 b_1 + Z_{1.1} u_{1.1} + Z_{2.1} u_{2.1} + Z_3 u_3 + Z_{4.2} u_{4.2} + \varepsilon \tag{3} $$

where $y$ ($n \times 1$) is the vector of phenotypes of the lines measured in the environments ($1 \ldots k$), $\mu$ is the overall mean and $\mathbf{1}_n$ ($n \times 1$) is a vector of ones, $b_1$ is a fixed effect of replication, $u_1$ is a random effect of SE, $u_{1.1}$ is the random effect of the genomic effect of g-th line, $u_2$ is the random effect of the interaction between the genomic effect of g-th line and k-th SE, $u_{2.1}$ is the random effect of replication nested within trial, $u_3$ is the random effect of the trial, $u_4$ is the random effect of replication nested within SE and trial, $u_{4.1}$ is the random effect of replication nested within SE, trial and year for the multi-year dataset, $u_{4.2}$ is the random effect of incomplete block nested within replication and trial, $u_5$ is the random effect of incomplete block nested within replication, trial and SE, $u_{5.1}$ is the random effect of incomplete block nested within replication, trial, SE and year for the multi-year dataset. The number of fixed and random effects are represented as $n$ and $p$, while $X_n$ and $Z_p$ are incidence matrices for fixed and random effects, respectively. The random effects $u_2$, $u_{2.1}$, $u_3$, $u_4$, $u_{4.1}$, $u_{4.2}$, $u_5$ and $u_{5.1}$ were assumed to be distributed as:

$$ u_p \sim N\left(\mathbf{0}, I_p \sigma^2_{u_p}\right) \tag{4} $$

where $I_p$ and $\sigma^2_{u_p}$ are the identity matrix and variance of the p-th random effect, except $u_{2.1}$ where $I$ is either genomic ($G$) or pedigree ($A$) relationship matrix. In Eqs. 1 and 2, the random GEI effect $u_2$ is defined as the Kronecker product ($\otimes$) between $G$ or $A$ relationship matrix with dimension $g \times g$ and the $k \times k$ variance–covariance matrix of the genomic effect of lines in and between SEs ($G_o$). For combined DS1 and DS2 analysis, the $G$ or $A$ relationship matrix is a block diagonal matrix such that only 280 lines that overlap across the two datasets were used in the analysis as the remaining 980 lines do not have marker information.

$$ u_2 \sim N\left[\mathbf{0}, \left(G \otimes G_o\right)\right] \tag{5} $$

The covariance of the genomic effect of the line $u_2$ in multi-environment model can be represented as:

$$ \mathrm{Cov}(u, u') = G_o \otimes G \tag{6} $$

$$ G_o \otimes G = \begin{bmatrix} \sigma^2_{g_1} & \sigma_{g_{12}} & \cdots & \sigma_{g_{1k}} \\ \sigma_{g_{21}} & \sigma^2_{g_2} & \cdots & \vdots \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{g_{k1}} & \cdots & \cdots & \sigma^2_{g_k} \end{bmatrix} \otimes G \tag{7} $$

where $G_o$ represents the $k \times k$ variance–covariance matrix of the genomic effect of lines in the SEs. The diagonal of the $G_o$ matrix is the additive genetic variance $\sigma^2_{g_k}$ within the k-th SE. The off-diagonal ($\sigma_{g_{1k}}$) elements represent the genetic covariance between SEs.

The factor analytic (FA) model, which is a parsimonious approach for fitting GEI and complex covariance structure among environments (Piepho 1998; Smith et al. 2001; Crossa et al. 2004; Oakey et al. 2016; Smith and Cullis 2018), was used in this study. We use the extended FA (XFA) model that allows a non-full rank variance matrix for the GEI effects; therefore, the mixed model equation is sparser, resulting in reduced computational requirements compared to the standard FA model, as reported in Thompson et al. (2003) and Meyer (2009). In general, FA identifies one or few factors underlying the correlation among environments by their relationship to unobservable latent variables. Thus, GEI is modeled as an interaction between the genomic effect of the g-th line and one or few factors underlying the environmental influences on the line (Piepho 1998; Smith et al. 2001; Crossa et al. 2004; Kelly et al. 2007).

FA model for $\mathrm{Cov}(u_g, u_g')$ is expressed as:

$$ \left(\boldsymbol{\Lambda}\boldsymbol{\Lambda}' + \boldsymbol{\Psi}\right) \otimes G \tag{8} $$

where $\boldsymbol{\Lambda}$ is a $k \times m$ matrix of loading factors and the columns of $\boldsymbol{\Lambda}$ are associated with the environmental loadings for the m-th latent factor. $\boldsymbol{\Psi}$ is a $k \times k$ heterogeneous diagonal matrix with specific environment genetic variances $\boldsymbol{\Psi}_k$ on the diagonal and zero covariance between environments.

The residual variance for Eqs. 1, 2 and 3 can be specified as:
$$ \varepsilon \sim N(0, R) \tag{9} $$

where $R$ is a block diagonal matrix with the error variances within SEs, except for Eq. 3. To allow separate spatial covariance structure within SEs, $R$ for each SE was defined as:

$$ R_k = \sigma^2_k \left[\mathrm{AR1}(\rho_c) \otimes \mathrm{AR1}(\rho_r)\right] \tag{10} $$

$\sigma^2_k$ is the spatial residual variance in k-th SE and $\mathrm{AR1}(\rho_c) \otimes \mathrm{AR1}(\rho_r)$ is the Kronecker product of first-order autoregressive processes across columns and rows, respectively.

Plot-level heritability for k-th SE was derived from the variance components obtained from the model as:

$$ h^2_k = \frac{\sigma^2_{g_k}}{\sigma^2_{g_k} + \sigma^2_{\varepsilon_k}} \tag{11} $$

where $\sigma^2_{g_k}$ and $\sigma^2_{\varepsilon_k}$ are the genetic and residual variance estimates for k-th SE.

Best linear unbiased estimates (BLUEs) for the lines were computed using Eqs. 1–3 while allowing $u_2$ and $u_{2.1}$ to be fixed instead of random. The BLUEs were used as reference genotypic value to compare the PBLUP or GBLUP models.

Results

GBLUP gives higher correlation values compared to BLUEs for stage 1-to-2 line advancement

As expected, GBLUP was found to be a more effective metric for line advancement from stage 1 (DS1) to stage 2 (DS2). The GBLUP correlation between the two yield testing stages was 0.45 compared to 0.35 when BLUEs were used as the advancement decision metric (Fig. 2). GBLUP consistently improved selection accuracy compared to BLUE as the selection decision metric based on the proportion of lines that overlap between the two testing stages for the select top (10, 20, 30, 40 and 50%) of lines (Suppl. Figure 4).

Fig. 2 Correlation estimates between the 280 lines selected from 1260 lines in stage 1 and further evaluated in stage 2 using GBLUP and BLUE calculated from complete dataset as the selection criteria. GBLUP_stage 1 and GBLUP_stage 2 denote GBLUP estimates of the same 280 lines in stage 1 using DS1 and stage 2 using DS2, respectively. Similarly, BLUE_stage 1 and BLUE_stage 2 represent BLUE estimates of the lines in stage 1 and stage 2, respectively

Heritability was high within SEs and genetic correlations varied significantly between pairs of SEs

The plot-level heritability for SEs in DS2 (stage 2) ranged from moderate to high except BEHT which had a low plot-level heritability value of 0.18 (Fig. 3). Similar results were obtained in the analyses when DS1 or DS3 were used to augment the calibration set in DS2 (Suppl. Figures 5 and 6). The genetic correlation between SEs in DS2 ranged from − 0.04 to 0.67. Furthermore, when SEs were defined across years (B5IR, F5IR, B2IR, FDRIP, BEHT, BLHT and B5IR*; with B5IR* representing the B5IR SE in DS1 (stage 1)), the genetic correlation between SEs ranged from − 0.05 to 0.85 (Suppl. Figure 5). The genetic correlation between B5IR* and B5IR was high and positive (0.80), whereas that between B5IR* and B2IR, and FDRIP was negative (− 0.11 and − 0.05, respectively). In the analysis where DS3 was used to increase the size of calibration set in DS2, most of the SEs in DS3 denoted by (**) suffix have moderate to high genetic correlation with SEs in DS2 (Suppl. Figure 6).

Predictive ability increases dependent on heritability, SE correlations and calibration set size using different sparse testing strategies

Predictive ability increased with higher heritability and genetic correlation between SEs. The BEHT SE had the lowest plot-level heritability (0.18) and had consistently low predictive ability irrespective of the sparse testing strategy and prediction model used (Fig. 4). Higher predictive ability was generally observed in each SE with increased number of genotypes overlapping across SEs. For instance, using only the DS2 the predictive ability increased by 42, 111, 59, 64, 989 and 67%, respectively, when 50% of the genotypes overlapped across the SEs (B5IR, F5IR, B2IR, FDRIP, BEHT, BLHT) compared to non-overlap of genotypes across the

Fig. 3 Plot-level heritability (diagonal) and genetic correlation between pairs of SEs (upper diagonal) from factor analytic model analysis of complete DS2
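
The quantities displayed in Fig. 3 follow from the factor analytic structure in Eq. 8 and the heritability definition in Eq. 11. The sketch below shows that arithmetic in plain Python for three environments and one latent factor. All numbers (loadings, specific variances, residual variances) are hypothetical values chosen for illustration, and the function names are not from the paper; the actual estimates were obtained by the authors from their fitted model.

```python
import math


def fa_covariance(loadings, specific_var):
    """Build G_o = Lambda Lambda' + Psi from a k x m loading matrix and k specific variances."""
    k = len(loadings)
    g_o = [[sum(a * b for a, b in zip(loadings[i], loadings[j])) for j in range(k)]
           for i in range(k)]
    for i in range(k):
        g_o[i][i] += specific_var[i]  # Psi is diagonal, so it only touches variances
    return g_o


def genetic_correlation(g_o):
    """Scale the genetic covariance matrix to a correlation matrix."""
    k = len(g_o)
    return [[g_o[i][j] / math.sqrt(g_o[i][i] * g_o[j][j]) for j in range(k)]
            for i in range(k)]


def plot_heritability(g_o, residual_var):
    """Eq. 11: genetic variance over genetic plus residual variance, per environment."""
    return [g_o[i][i] / (g_o[i][i] + residual_var[i]) for i in range(len(g_o))]


# Hypothetical parameters for k = 3 environments and m = 1 factor.
loadings = [[0.6], [0.5], [0.1]]
specific = [0.04, 0.05, 0.09]
residual = [0.40, 0.30, 0.45]

g_o = fa_covariance(loadings, specific)
print([[round(r, 2) for r in row] for row in genetic_correlation(g_o)])
print([round(h, 2) for h in plot_heritability(g_o, residual)])
```

An environment with a small loading on the common factor ends up weakly correlated with the others, which is the pattern the text describes for stress environments with low heritability.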

SEs. However, the predictive ability did not increase linearly with expansion in the number of lines that connected the SEs.

The augmentation of the calibration set with DS1 or DS3 improved predictive ability of untested lines for some SEs, especially with DS1 using only the 280 lines evaluated in the B5IR SE. For example, when lines were not repeated across SEs, the predictive ability of the 234 lines in F5IR SE increased by 21 and 1% using DS1 and DS3, respectively, to augment the calibration set. In the FDRIP SE, it increased by 5 and 8% using DS1 and DS3, respectively, to increase the size of the calibration set. In general, the use of DS1 to augment the calibration set improved predictive ability in each SE compared to DS3. However, the result of the population structure of the datasets assessed by spectral decomposition of the genomic relationship matrix of the lines in DS2 and DS3 shows significant overlap across the datasets (Suppl. Figure 7). This possibly indicates that genetic correlation between environments influences predictive ability in multi-environment genomic prediction.

The prediction performance of the PBLUP model was similar to the GBLUP model, which is not surprising as DS2 consisted of multiple populations with family sizes that ranged from 1 to 9 (Suppl. Figure 8). However, the predictive ability was lowest when DS3 was combined with the calibration set in each SE. For example, with no overlapping lines across SEs, using the DS3 to increase the calibration set size resulted in a predictive ability obtained from the PBLUP model across the SEs ranging from 4 to 31%, while the prediction ability obtained from the GBLUP model ranged from 9 to 46% across the SEs.

Mimicking selection advancement decision strategy as proxy for prediction performance

The prediction performance improved with increasing size of the calibration set, further corroborating the importance of genetic connectivity across SEs/environments in sparse testing (Fig. 5). Contrary to the prediction performance in each SE, prediction accuracy was higher using DS3 to increase the size of the calibration set compared to DS1, which suggests the relevance of including information from all SEs. In general, the addition of DS1 or DS3 to increase the size of the calibration set did not consistently improve prediction accuracy. This prediction accuracy method reflects the advancement decision strategy in early yield testing stage of the CIMMYT spring wheat breeding program utilizing GBLUP and selection indices to select parents for the next breeding cycle and candidates for further testing.


Fig. 4 Predictive ability of untested lines in each SE for the different sparse testing strategies. The different colors denote SEs (B2IR, B5IR, BEHT, BLHT, F5IR and FDRIP). G, G* and G** represent predictive ability obtained as the Pearson correlation of the predicted GBLUPs to the observed BLUEs. The suffix (* and **) represents predictive ability obtained when calibration set was augmented with DS1 and DS3, respectively. Also, A, A* and A** represent predictive ability obtained as the Pearson correlation of the predicted PBLUPs to the observed BLUEs with the suffix (* and **) as above
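
The predictive ability plotted in Fig. 4 is a Pearson correlation computed over the untested lines of one SE. A minimal sketch of that calculation is given below; the helper names, the line labels and the numeric values are hypothetical, and the percentage helper simply illustrates how relative gains such as those reported in the Results could be derived from two correlations.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def predictive_ability(predicted, observed_blue, untested_lines):
    """Correlate predictions with BLUEs using only lines not phenotyped in this SE."""
    return pearson([predicted[line] for line in untested_lines],
                   [observed_blue[line] for line in untested_lines])


def relative_gain(baseline, improved):
    """Percentage change of `improved` relative to `baseline`."""
    return 100.0 * (improved - baseline) / baseline


# Hypothetical values for five untested lines in one SE.
gblup = {"a": 0.3, "b": -0.1, "c": 0.5, "d": 0.0, "e": -0.4}
blue = {"a": 5.9, "b": 5.2, "c": 6.1, "d": 5.6, "e": 5.0}
print(round(predictive_ability(gblup, blue, ["a", "b", "c", "d", "e"]), 2))
print(round(relative_gain(0.10, 0.15)))
```

Because the gain is expressed relative to the baseline, an SE with a very low starting correlation can show a very large percentage increase even when the absolute change is modest.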

Discussion

Genomic prediction is a powerful tool to reduce selection cycle time and increase selection intensity (Meuwissen et al. 2001; Burgueño et al. 2012; Jacobson et al. 2014; Oakey et al. 2016; Crossa et al. 2017; Santantonio et al. 2020). This study aimed to examine sparse testing using GP for increasing first-year yield trial testing across SEs with lines advanced based on GBLUP without increasing breeding costs. Sparse testing using GP in which new lines are evaluated in different but genetically correlated environments relies on utilization of information from closely related individuals within and across environments using multi-environment models (Burgueño et al. 2012; Jarquin et al. 2020; Atanda et al. 2021b).

The multiple populations with small family size, ranging in our study from 1 to 9 individuals, imply that individuals within a family are likely to be absent across environments, and this might explain the low predictive ability observed when non-overlapping lines were randomly distributed in different SEs. In this scenario, genetic correlation between pairs of SEs was estimated through replication of alleles across

Fig. 5 Prediction accuracies obtained for the different allocation of lines to SEs using proportion of lines that intersect between select top 20% lines based on SH selection index model using GBLUP from the prediction model and GBLUP estimates from full data across the SEs. The suffix (* and **) represents prediction accuracy obtained when calibration set was augmented with DS1 and DS3, respectively

SEs. Generally, increases in genetic connectivity across SEs overlapped across SEs. Further studies are required to exam-
improved modeling of the genotype by environment inter- ine the magnitude of each individual factor and their combi-
action. This gave more efficient correlation estimates and nations on the fraction of lines that overlap across environ-
consequently better use of information from related lines ments. This will enable optimal/desired predictive ability
tested in correlated SEs. Therefore, the improved predictive when implementing sparse testing using GP in breeding.
ability with increased number of lines connecting the SEs Given the multiple populations that constitute the datasets
suggests that the efficiency of sparse testing aided GP relies used in this study, we suspect differences in QTL-marker
on leveraging information within and across environments linkage phase across the populations might affect the pre-
(Burgueño et al. 2012; Jarquin et al. 2020). dictive ability. The observed predictive ability might be
Jarquin et al. (2020) reported improved predictive ability due to the number of QTLs segregating across the popula-
by increasing the number of lines that overlapped across tion limiting the influence of QTL-marker linkage phase.
three environments. Due to the data structure, our study did Although this assumption was not evaluated here, it was
not exclusively investigate factors that might contribute to previously reported by Schopp et al. (2017). The authors
the proportion of lines that should overlap across the SEs to reported improved predictive ability obtained for half-sib
obtain optimal predictive ability, and this is acknowledged as families compared to un-related families due to higher seg-
a key limitation of the study. We hypothesize factors such as regation of QTLs among half-sib families rather than con-
population size, the number of crosses per parent, number of sistency of QTL-marker linkage phases across the families.
half-sib families, yield testing stage and expected predictive Theoretically, predictive ability improves with increased
ability might largely influence the proportion of lines that size of the calibration set. Previous studies (Habier et al.

Content courtesy of Springer Nature, terms of use apply. Rights reserved.

Theoretical and Applied Genetics

2007; Clark et al. 2012; Riedelsheimer et al. 2013; Lee et al. lines allowing use of the phenotypic information from indi-
2017; Campos et al. 2013; Atanda et al. 2021a; Lopez-Cruz vidual lines and relatives (Henderson 1975; Lell et al. 2021).
et al. 2021) have emphasized the influence of size of the In the early yield testing stage where genetic merit of lines is
calibration set and the influence of degree of relatedness evaluated in few environments, our results indicate GBLUP
between calibration and prediction set on predictive ability. improved selection accuracy compared to BLUE as selec-
Considering the degree of genetic variation in the prediction tion decision metric. Other factors that contribute to pheno-
set, there is minimal possibility of improvement by apply- typic expression such as environmental variables can also
ing genetic optimization criteria accounting for relatedness be adjusted for in the GBLUP model to achieve estimation
between prediction set and the individuals in DS3 in our of genetic merit of lines that is potentially close to the true
study. This is because it is unlikely to have the same QTL breeding value of the lines (Jarquín et al. 2014; Monteverde
segregating across all the populations in the prediction set. et al. 2019; Costa-Neto et al. 2021; Crossa et al. 2021).
In Atanda et al. (2021a) and Brandariz and Bernardo (2019),
historical data were optimized around each bi-parental popu-
lation. The optimized individuals from historical data were
assumed to be in the same QTL-marker LD phase with full- Conclusion
sibs in the prediction population resulting in improved pre-
dictive ability. However, in the current study the prediction The results from our study show that GBLUP should be used
set comprised several populations with few progenies per as an advancement decision metric and parental selection
cross; therefore, relatedness between individuals in DS3 and criteria in the early yield testing stages, where genetic merit
the prediction set was ambiguous and did not consistently of lines is evaluated in few environments. For programs
improve predictive performance. implementing sparse testing GP for multi-environment
Unsurprisingly, the predictive performance of the PBLUP yield trials, consideration should be given to the propor-
model was similar to the GBLUP model. For across population of lines that overlap across environments in the early
tion predictions and based on the architecture of the dataset yield testing stages to increase the size of the SE or selection
used in this study, the predictive ability depends largely on intensity. Our study suggests including a substantial num-
the variation in genomic and pedigree relationship between ber of common lines across environments to ensure precise
families. Therefore, the PBLUP model was able to track estimation of genetic correlation between environments and
segregating QTLs across the families and thus explained a to enable improved modeling of GxE interaction effects. In
large proportion of the genetic variance among the families. general, sparse testing using GP is a promising strategy for
This result agreed with previous studies (Crossa et al. 2010; increasing genetic gain in a breeding program by optimizing
Albrecht et al. 2011; Schopp et al. 2017; Basnet et al. 2019; testing across SE while keeping the breeding costs constant.
Calleja-Rodriguez et al. 2020).
Supplementary Information The online version contains supplemen-
In a breeding program, accurate estimate of selection can- tary material available at https://d oi.o rg/1 0.1 007/s 00122-0 22-0 4085-0.
didates’ breeding value is critical as a predictor of genetic
potential and the ability to generate superior progenies in Acknowledgements The lead author was grateful to Giovanny Eduardo
the subsequent generation (Falconer and Mackay 1996; Gall Covarrubias Pazaran for reading the first version and providing useful
suggestions. We thank Kate Dreher for organizing the data used in this
and Bakar 2002; Zhang et al. 2011; Crossa et al. 2021). In
study and making it public. We thank Dr. Huihui Li and the two anony-
practice, programs are unlikely to have an estimate of true mous reviewers whose suggestions helped improve this manuscript.
breeding value of genotypes; however, GBLUP can be used
as a surrogate of true breeding value. This is especially true Author contribution statement SA conceptualized, analyzed, inter-
for traits with inherent low to medium heritability such as preted the result and drafted the manuscript. VG coordinated the field
experiments and supervision. AB made critical reviews and coordi-
grain yield (Henderson 1975; Gall and Bakar 2002; Zhang
nated with SA on the final draft. RS, JC, RR and other authors contrib-
et al. 2011; Lell et al. 2021) particularly in the early yield uted to the editing of the manuscript.
testing stages where genetic merit of lines is evaluated in
few environments. According to Lell et al. (2021), the use of Funding This research was supported by grant from Bill and Melinda
GBLUP or BLUE as selection criteria will be influenced by Gates Foundation [OPP1215722] and co-funded by Foreign and Com-
monwealth Office became the Foreign, Commonwealth & Development
the reliability of the available information (evaluation stage)
Office (FCDO) of the UK Government to CIMMYT.
as well as whether the assumption of genotype independence
can be valid in the evaluation stage. As a result, GBLUP Data availability The dataset used in this study is available in Dataverse
superiority over BLUE or BLUE superiority over GBLUP at https://hdl.handle.net/11529/10548639.
cannot be generalized across evaluation stages. The model
underlying BLUE assumes all lines were independent, while
GBLUP model incorporates genomic relationship between

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


Declarations

Conflict of interest The authors declare no conflict of interest.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Albrecht T, Wimmer V, Auinger H-J, Erbe M, Knaak C, Ouzunova M, Simianer H, Schön C-C (2011) Genome-based prediction of testcross values in maize. Theor Appl Genet 123(2):339–350
Atanda SA, Olsen M, Burgueño J, Crossa J, Dzidzienyo D, Beyene Y, Gowda M et al (2021a) Maximizing Efficiency of Genomic Selection in CIMMYT’s Tropical Maize Breeding Program. Theor Appl Genet. https://doi.org/10.1007/s00122-020-03696-9
Atanda SA, Olsen M, Crossa J, Burgueño J, Rincent R, Dzidzienyo D, Beyene Y et al (2021b) Scalable sparse testing genomic selection strategy for early yield testing stage. Front Plant Sci 12:658978
Auinger H-J, Lehermeier C, Gianola D, Mayer M, Melchinger AE, da Silva S, Knaak C, Ouzunova M, Schön C-C (2021) Calibration and validation of predicted genomic breeding values in an advanced cycle maize population. Theor Appl Genet. https://doi.org/10.1007/s00122-021-03880-5
Basnet BR, Crossa J, Dreisigacker S, Pérez-Rodríguez P, Manes Y, Singh RP, Rosyara U, Camarillo-Castillo F, Murua M (2019) Hybrid wheat prediction using genomic, pedigree, and environmental covariables interaction models. Plant Genome. https://doi.org/10.3835/plantgenome2018.07.0051
Bernardo R (2020) Reinventing quantitative genetics for plant breeding: something old, something new, something borrowed, something BLUE. Heredity 125(6):375–385
Brandariz SP, Bernardo R (2019) Small Ad Hoc versus large general training populations for genomewide selection in maize biparental crosses. Theor Appl Genet. https://doi.org/10.1007/s00122-018-3222-3
Browning BL, Zhou Y, Browning SR (2018) A one-penny imputed genome from next-generation reference panels. Am J Hum Genet 103(3):338–348
Burgueño J, de los Campos G, Weigel K, Crossa J (2012) Genomic prediction of breeding values when modeling genotype × environment interaction using pedigree and dense molecular markers. Crop Sci 52(2):707–719. https://doi.org/10.2135/cropsci2011.06.0299
Calleja-Rodriguez A, Pan J, Funda T, Chen Z, Baison J, Isik F, Abrahamsson S, Wu HX (2020) Evaluation of the efficiency of genomic versus pedigree predictions for growth and wood quality traits in scots pine. BMC Genomics 21(1):796
Clark SA, Hickey JM, Daetwyler HD, van der Werf JHJ (2012) The importance of information on relatives for the prediction of genomic breeding values and the implications for the makeup of reference data sets in livestock breeding schemes. Genet Sel Evol 44:4
Cooper M, Woodruff DR, Eisemann RL, Brennan PS, Delacy IH (1995) A selection strategy to accommodate genotype-by-environment interaction for grain yield of wheat: managed-environments for selection among genotypes. Theor Appl Genet 90(3–4):492–502
Costa-Neto G, Fritsche-Neto R, Crossa J (2021) Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genome-based prediction in multi-environment trials. Heredity 126(1):92–106
Crossa J, Fritsche-Neto R, Montesinos-Lopez OA, Costa-Neto G, Dreisigacker S, Montesinos-Lopez A, Bentley AR (2021) The modern plant breeding triangle: optimizing the use of genomics, phenomics, and enviromics data. Front Plant Sci. https://doi.org/10.3389/fpls.2021.651480
Crespo-Herrera LA, Crossa J, Huerta-Espino J, Mondal S, Velu G, Juliana P, Vargas M et al (2021) Target population of environments for wheat breeding in India: definition, prediction and genetic gains. Front Plant Sci 12:638520
Crossa J, Pérez-Rodríguez P, Cuevas J, Montesinos-López O, Jarquín D, de los Campos G, Burgueño J et al (2017) Genomic selection in plant breeding: methods, models, and perspectives. Trends Plant Sci. https://doi.org/10.1016/j.tplants.2017.08.011
Crossa J, Pérez P, Hickey J et al (2014) Genomic prediction in CIMMYT maize and wheat breeding programs. Heredity 112:48–60. https://doi.org/10.1038/hdy.2013.16
Crossa J, de los Campos G, Pérez P, Gianola D, Burgueño J, Araus JL, Makumbi D et al (2010) Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics 186(2):713–724
Crossa J, Yang R-C, Cornelius PL (2004) Studying crossover genotype × environment interaction using linear-bilinear models and mixed models. J Agric Biol Environ Stat 9(3):362–380. https://doi.org/10.1198/108571104x4423
de los Campos G, Vazquez AI, Fernando R, Klimentidis YC, Sorensen D (2013) Prediction of complex human traits using the genomic best linear unbiased predictor. PLoS Genet 9(7):e1003608
Falconer DS, Mackay TFC (1996) Introduction to quantitative genetics. Longman, Essex
Gall GAE, Bakar Y (2002) Application of mixed-model techniques to fish breed improvement: analysis of breeding-value selection to increase 98-day body weight in Tilapia. Aquaculture. https://doi.org/10.1016/s0044-8486(02)00024-8
Gilmour AR (1999) ASREML reference manual. NSW Agric Biom Bull 3:1–210
Habier D, Fernando RL, Dekkers JCM (2007) The impact of genetic relationship information on genome-assisted breeding values. Genetics 177(4):2389–2397
Hallauer AR, Carena MJ, Miranda Filho JB (2010) Quantitative genetics in maize breeding. Springer, Berlin
Hazel LN (1943) The genetic basis for constructing selection indexes. Genetics. https://doi.org/10.1093/genetics/28.6.476
Henderson CR (1975) Best Linear unbiased estimation and prediction under a selection model. Biometrics. https://doi.org/10.2307/2529430
Jacobson A, Lian L, Zhong S, Bernardo R (2014) General combining ability model for genomewide selection in a biparental cross. Crop Sci. https://doi.org/10.2135/cropsci2013.11.0774
Jarquin D, Howard R, Crossa J, Beyene Y, Gowda M, Martini JWR, Covarrubias Pazaran G et al (2020) Genomic prediction enhanced sparse testing for multi-environment trials. G3 10(8):2725–2739
Jarquín D, Crossa J, Lacaze X, Cheyron PD, Daucourt J, Lorgeou J, Piraux F et al (2014) A reaction norm model for genomic selection using high-dimensional genomic and environmental data. Theor Appl Genet. https://doi.org/10.1007/s00122-013-2243-1
Juliana P, Montesinos-López OA, Crossa J, Mondal S, González Pérez L, Poland J, Huerta-Espino J et al (2019) Integrating genomic-enabled prediction and high-throughput phenotyping in breeding for climate-resilient bread wheat. Theor Appl Genet 132(1):177–194
Junjie B, Shengjie Li (2019) The genetic parameters of growth traits and breeding value estimation in largemouth bass (Micropterus Salmoides). Genet Breed Mol Marker-Assist Sel Breed Largemouth Bass. https://doi.org/10.1016/b978-0-12-816473-0.00002-5
Kelly AM, Smith AB, Eccleston JA, Cullis BR (2007) The accuracy of varietal selection using factor analytic models for multi-environment plant breeding trials. Crop Sci 47(3):1063–1070. https://doi.org/10.2135/cropsci2006.08.0540
Lee SH, Clark S, van der Werf HJ (2017) Estimation of genomic prediction accuracy from reference populations with varying degrees of relationship. PLoS ONE 12(12):e0189775
Lell M, Reif J, Zhao Y (2021) Optimizing the setup of multi-environmental hybrid wheat yield trials for boosting the selection capability. Plant Genome 14(3):e20150
Lehermeier C, Krämer N, Bauer E, Bauland C, Camisan C, Campo L, Flament P et al (2014) Usefulness of multiparental populations of maize (Zea Mays L) for genome-based prediction. Genetics 198(1):3–16
Lopez-Cruz M, de Los Campos G (2021) Optimal breeding-value prediction using a sparse selection index. Genetics. https://doi.org/10.1093/genetics/iyab030
Mangin B, Rincent R, Rabier C-E, Moreau L, Goudemand-Dugue E (2019) Training set optimization of genomic prediction by means of EthAcc. PLoS ONE 14(2):e0205629
Meuwissen THE, Hayes BJ, Goddard ME (2001) Prediction of total genetic value using genome-wide dense marker maps. Genetics. https://doi.org/10.1093/genetics/157.4.1819
Meyer K (2009) Factor-analytic models for genotype × environment type problems and structured covariance matrices. Genet Sel Evol. https://doi.org/10.1186/1297-9686-41-21
Mohammadi R, Amri A (2011) Genotype X environment interaction for durum wheat grain yield and selection for drought tolerance in irrigated and droughted environments in Iran. J Crop Sci Biotechnol. https://doi.org/10.1007/s12892-011-0011-9
Monteverde E, Gutierrez L, Blanco P, Pérez F, de Vida JE, Rosas VB, Quero G, McCouch S (2019) Integrating molecular markers and environmental covariates to interpret genotype by environment interaction in rice (Oryza sativa L.) grown in subtropical areas. G3 9(5):1519–1531
Oakey H, Cullis B, Thompson R, Comadran J, Halpin C, Waugh R (2016) Genomic selection in multi-environment crop trials. G3 6(5):1313–1326
Piepho H-P (1998) Empirical best linear unbiased prediction in cultivar trials using factor-analytic variance-covariance structures. Theor Appl Genet 97(1–2):195–201. https://doi.org/10.1007/s001220050885
Riedelsheimer C, Endelman JB, Stange M, Sorrells ME, Jannink J-L, Melchinger AE (2013) Genomic predictability of interconnected biparental maize populations. Genetics. https://doi.org/10.1534/genetics.113.150227
Santantonio N, Atanda SA, Beyene Y, Varshney RK, Olsen M, Jones E, Roorkiwal M et al (2020) Strategies for effective use of genomic information in crop breeding programs serving Africa and South Asia. Front Plant Sci 11(March):353
Schopp P, Müller D, Wientjes YCJ, Melchinger AE (2017) Genomic prediction within and across biparental families: means and variances of prediction accuracy and usefulness of deterministic equations. G3 7(11):3571–3586
Smith AB, Cullis BR (2018) Plant breeding selection tools built on factor analytic mixed models for multi-environment trial data. Euphytica. https://doi.org/10.1007/s10681-018-2220-5
Smith A, Cullis B, Thompson R (2001) Analyzing variety by environment data using multiplicative mixed models and adjustments for spatial field trend. Biometrics 57(4):1138–1147. https://doi.org/10.1111/j.0006-341x.2001.01138.x
Smith FH (1936) A discriminate function for plant selection. Ann Eugen 7:240–250. https://doi.org/10.1111/j.1469-1809.1936.tb02143.x
Thompson R, Cullis B, Smith A, Gilmour A (2003) A sparse implementation of the average information algorithm for factor analytic and reduced rank variance models. Aust N Z J Stat 45(4):445–459. https://doi.org/10.1111/1467-842x.00297
Zhang T, Kong J, Luan S, Wang Q, Luo K, Tian Yi (2011) Estimation of genetic parameters and breeding values in shrimp Fenneropenaeus Chinensis using the REML/BLUP procedure. Acta Oceanol Sin. https://doi.org/10.1007/s13131-011-0093-8
Zhang H, Yin L, Wang M, Yuan X, Liu X (2019) Factors affecting the accuracy of genomic selection for agricultural economic traits in maize, cattle, and pig populations. Front Genet 10:189

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


21 pages
MMW Questions Answer 1. T-Test For Correlated Sample
No ratings yet
MMW Questions Answer 1. T-Test For Correlated Sample
21 pages
Math Grade 70 92 80 74 65 83 English Grade 74 84 63 87 78 90
No ratings yet
Math Grade 70 92 80 74 65 83 English Grade 74 84 63 87 78 90
2 pages
Jurnal Bu Lastri - INTB PDF
No ratings yet
Jurnal Bu Lastri - INTB PDF
34 pages
Assessing Oral Cancer Awareness Among Undergraduate Student in Higher Education Institution Using Multiple Linear Regression
No ratings yet
Assessing Oral Cancer Awareness Among Undergraduate Student in Higher Education Institution Using Multiple Linear Regression
8 pages
Final New Syllabus of BBA II 2023.07.10
No ratings yet
Final New Syllabus of BBA II 2023.07.10
12 pages
Job Satisfaction and Acculturation Among Filipino Registered Nurses Request PDF
No ratings yet
Job Satisfaction and Acculturation Among Filipino Registered Nurses Request PDF
1 page
Mini Tab
No ratings yet
Mini Tab
30 pages
Kyu Edu 2301 WK12
No ratings yet
Kyu Edu 2301 WK12
5 pages
Prior-To-Class Quiz 10 - Statistics For Business-T123PWB-1
No ratings yet
Prior-To-Class Quiz 10 - Statistics For Business-T123PWB-1
6 pages
Textbook Correlation and Regression Analysis Egypt en
No ratings yet
Textbook Correlation and Regression Analysis Egypt en
39 pages
Communication Competence, Interpersonal Communication
No ratings yet
Communication Competence, Interpersonal Communication
11 pages
F.Y.B.COM SEM-I AND II SYLLABUS 2023-24 (1)
No ratings yet
F.Y.B.COM SEM-I AND II SYLLABUS 2023-24 (1)
43 pages
Output Aiteman
No ratings yet
Output Aiteman
32 pages
Assignment 1 Each One of You Are Assigned Roll No Wise 1 Question Individually That You Are Submitting
No ratings yet
Assignment 1 Each One of You Are Assigned Roll No Wise 1 Question Individually That You Are Submitting
10 pages
A Waste Relationship Model and Center Point Tracking Metric For Lean Manufacturing Systems PDF
No ratings yet
A Waste Relationship Model and Center Point Tracking Metric For Lean Manufacturing Systems PDF
20 pages
Statistics For Data Science
No ratings yet
Statistics For Data Science
30 pages
Greenberg 1999
No ratings yet
Greenberg 1999
7 pages
Coursera Statistics One - Notes and Formulas
No ratings yet
Coursera Statistics One - Notes and Formulas
48 pages
Math 100: Mathematics in The Modern World (MMW) Data Management
No ratings yet
Math 100: Mathematics in The Modern World (MMW) Data Management
32 pages
Open Stat Reference
No ratings yet
Open Stat Reference
403 pages
Statistical Relationship Between Income and Expenditures
100% (1)
Statistical Relationship Between Income and Expenditures
27 pages
Emem
0% (1)
Emem
31 pages


Theoretical and Applied Genetics

Sparse testing using genomic prediction improves selection for breeding targets in elite spring wheat

Received: 10 January 2022 / Accepted: 16 March 2022

Introduction

The CIMMYT global spring wheat breeding program

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


matrices for fixed and random effects, respectively. The vari-


$\sigma_k^2$ is the


Fig. 3 Plot-level heritability


Discussion

relies on utilization of information from closely related

