GSM3
ABSTRACT Genomic selection (GS) increases genetic gain by reducing the length of the selection cycle, as has been exemplified in maize using rapid cycling recombination of biparental populations. However, no results of GS applied to maize multi-parental populations have been reported so far. This study is the first to show realized genetic gains of rapid cycling genomic selection (RCGS) for four recombination cycles in a multi-parental tropical maize population. Eighteen elite tropical maize lines were intercrossed twice, and self-pollinated once, to form the cycle 0 (C0) training population. A total of 1000 ear-to-row C0 families was genotyped with 955,690 genotyping-by-sequencing SNP markers; their testcrosses were phenotyped at four optimal locations in Mexico to form the training population. Individuals from families with the best plant types, maturity, and grain yield were selected and intermated to form RCGS cycle 1 (C1). Predictions of the genotyped individuals forming cycle C1 were made, and the best predicted grain yielders were selected as parents of C2; this was repeated for more cycles (C2, C3, and C4), thereby achieving two cycles per year. Multi-environment trials of individuals from populations C0, C1, C2, C3, and C4, together with four benchmark checks were evaluated at two locations in Mexico. Results indicated that realized grain yield from C1 to C4 reached 0.225 ton ha⁻¹ per cycle, which is equivalent to 0.100 ton ha⁻¹ yr⁻¹ over a 4.5-yr breeding period from the initial cross to the last cycle. Compared with the original 18 parents used to form cycle 0 (C0), genetic diversity narrowed only slightly during the last GS cycles (C3 and C4). Results indicate that, in tropical maize multi-parental breeding populations, RCGS can be an effective breeding strategy for simultaneously conserving genetic diversity and achieving high genetic gains in a short period of time.

KEYWORDS tropical maize; multiparental population; rapid cycling recombination; genomic selection; realized genetic gains; genetic diversity; Multiparental Populations; MPP
Copyright © 2017 Zhang et al.
doi: https://doi.org/10.1534/g3.117.043141
Manuscript received January 4, 2017; accepted for publication May 15, 2017; published Early Online May 22, 2017.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
¹Corresponding authors: CIMMYT, Apdo. Postal 6-641, 06600 México D.F., México. E-mail: [email protected]; and [email protected]

In the last 20 yr, marker-assisted selection has been widely used in plant breeding where a few markers significantly associated with the phenotypic trait are employed to predict the genetic value of the candidates for selection (Bernardo 2008, 2016). On the other hand, genomic-assisted breeding (genomic selection, GS) incorporates all available marker information simultaneously into a model to predict the genetic value of the candidates for selection (Meuwissen et al. 2001). In plants, a computer simulation study (Bernardo and Yu 2007) showed that better prediction accuracy of breeding and genetic values was achieved by incorporating all markers, as compared to using a subset of markers significantly associated with QTL. This result was verified by Massman et al. (2013), who used a biparental temperate maize population derived from a cross between two distinct heterotic groups (B73 and Mo17); the testcrosses were evaluated under well-watered conditions, and the population advanced using rapid cycling GS (RCGS where all markers are used for prediction) and marker-assisted recurrent selection (MARS where only significant markers are used for prediction). Massman et al. (2013) reported that RCGS had a superior response for stover yield, as
MATERIALS AND METHODS

Developing the training population from 18 tropical maize inbred lines
The RCGS experiment was designed in 2009 as part of the MasAgro project funded by Mexico’s Secretariat of Agriculture, Livestock, Rural Development, Fisheries and Food (SAGARPA, its acronym in Spanish) through the Sustainable Modernization of Traditional Agriculture program (MasAgro; http://masagro.mx).
The steps in the breeding scheme used for RCGS are shown in Figure 1. In total, 18 CIMMYT tropical maize inbred lines (CML247, CML264, CML448, CML494, CML498, CML531, CLRCW72, CLRCW75, CLRCW76, CLRCW93, CLRCW100, CLRCW260, CLWN201, CLWN228, CLWN229, CLWN247, CLG2312, and CLSPLW04), widely used in lowland tropical breeding environments, were crossed as parents to form the training population through twice intermated pollination and one self-pollination; the parents were selected based on their general combining ability for grain yield and per se, visual evaluation information for major stress tolerance and disease resistance in lowland tropical breeding environments. All 18 original parents tended to group in heterotic pattern group “B” (flint type kernel).
In the 2010B season, half-diallel crosses were made between the 18 original parents to generate all possible F1 progenies (Figure 1). In the 2011A season, all F1 were planted ear-to-row and intermated to form the S1 population; then, all the F1 were separated into two groups of equal size. Bulk pollen from the first group was used to pollinate all plants of the other group and vice versa; three ears were harvested from each F1 family, and equal amounts of seed from each selected F1 ear were bulked to form the subsequent generation for planting. In the 2011B season, 4800 S1 individuals were planted and self-pollinated and advanced to S2. The best 1000 S2 ears were selected, and planted ear-to-row in the 2012A season (Table 1). A single-cross tester (CML495/CML549) from the complementary heterotic group (dent type kernel) was used to generate testcrosses. All pollination activities were conducted at CIMMYT’s experiment station in Agua Fria, Puebla.
The training population (C0) for developing genomic prediction models was formed with the best 1000 selected S2s. Testcrosses of all 1000 selected S2 were planted using a partial replicated design with 25% of replicated genotypes at four optimal Mexican locations. Phenotypic data were collected at all locations for >10 agronomic traits, including grain yield at 12.5% moisture content (GY), anthesis date (AD), silking date (SD), plant height (PH), ear height (EH), and moisture content (MOI). For each S2 family, DNA was extracted by bulking equal amounts of leaf tissue from 15 individual plants. Genotyping-by-sequencing was performed at Cornell University Biotechnology Resource Center as described by Wu et al. (2016), where 955,690 SNPs were generated for each DNA sample. In the training population, the genomic prediction model was developed by using only 331,740 filtered SNPs with minor allele frequency (>0.05), and where the missing data rate was <10%.

Cycle 0 (C0) phenotypic selection and formation of cycle 1 (C1)
In C0, phenotypic selection was conducted by ranking the grain yield of the 1000 S2 testcrosses. The best 50 respective S2 families were selected and planted ear-to-row, 25 plants per family (Table 1). Cycle 1 (C1) was formed by intermating the 50 selected S2 families. The 50 families were divided into two equal groups, and bulk pollen from the first group was used to pollinate all plants in the other group and vice versa. Based on visual evaluation of flowering time, plant type, plant/ear height, well-filled ears, and reaction to naturally occurring major diseases, along with among-family and within-family selection, 157 ears (1–6 ears from each family) were harvested and shelled individually to form C1.
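The SNP filtering rule stated for the training population (minor allele frequency >0.05 and missing data rate <10%) can be illustrated with a minimal sketch. It is not the pipeline used in the study: the 0/1/2 allele-count coding, the use of None for a missing call, and all function names are assumptions made for illustration only.

```python
from typing import List, Optional, Sequence

# Hypothetical coding: each call is the count of one allele (0, 1, or 2);
# None marks a missing call. The study's own data format may differ.
Call = Optional[int]


def minor_allele_frequency(calls: Sequence[Call]) -> float:
    """Minor allele frequency of one SNP, ignoring missing calls."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1.0 - freq)


def missing_rate(calls: Sequence[Call]) -> float:
    """Fraction of samples with no call at this SNP."""
    if not calls:
        return 1.0
    return sum(c is None for c in calls) / len(calls)


def filter_snps(snps: Sequence[Sequence[Call]],
                min_maf: float = 0.05,
                max_missing: float = 0.10) -> List[int]:
    """Indices of SNPs with MAF > min_maf and missing rate < max_missing."""
    return [
        i for i, calls in enumerate(snps)
        if minor_allele_frequency(calls) > min_maf
        and missing_rate(calls) < max_missing
    ]


# Toy data: one list of calls per SNP (invented values, not study data).
example = [
    [0, 1, 2, 1, 0, 2, 1, 1, 0, 2],        # common variant, complete data
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],        # monomorphic, fails the MAF rule
    [0, 1, None, None, 2, 1, 0, 1, 2, 1],  # 20% missing, fails the missing rule
]
kept = filter_snps(example)
```

Only the first toy SNP passes both thresholds. Both comparisons are strict, following the ">0.05" and "<10%" wording in the text.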
Table 2 Mean of GY (ton ha⁻¹) for each genomic cycle C0, C1, C2, C3, and C4, broad-sense heritability (H²), and mean of the four testers at each location (Agua Fria and Tlaltizapan), and combined across the two locations

                                  Agua Fria               Tlaltizapan             Combined
Cycle                             Entry   H²     Checks   Entry   H²     Checks   GY      H²     Checks
C0                                6.65    0.27   5.47     10.40   0.65   8.08     8.52    0.42   6.77
C1                                6.49    0.06   5.73     10.29   0.59   9.32     8.40    0.63   7.52
C2                                7.02    0.26   6.02     10.20   0.46   9.30     8.62    0.47   7.52
C3                                6.88    0.38   5.64     10.95   0.59   9.31     8.92    0.67   7.52
C4                                7.13    0.21   5.70     10.96   0.25   9.30     9.05    0.43   7.61
LSD0.05 (C0–C4)                   0.402   -      -        0.412   -      -        0.252   -      -
LSD0.05 (C1–C4)                   0.408   -      -        0.404   -      -        0.191   -      -
Average gain per cycle (C0–C4)    0.131   -      -        0.177   -      -        0.158   -      -
Average gain per cycle (C1–C4)    0.171   -      -        0.276   -      -        0.225   -      -

The average genetic gain in GY across cycles was estimated for each location and across both locations including all selection cycles (C0–C4), and including only the genomic selection cycles (C1–C4). Least significant differences (LSD) test at the 0.05 probability level including all selection cycles (C0–C4) and only the genomic selection cycles (C1–C4). The highest value is indicated in boldface.
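As a worked check of the combined-location figures in Table 2, the sketch below regresses mean GY on cycle number by ordinary least squares and converts the per-cycle gain to a per-year gain using the two cycles per year and 4.5-yr period stated in the abstract. That the reported average gain per cycle is an OLS slope is an assumption; it reproduces the combined values (0.158 and 0.225 ton ha⁻¹ per cycle), but the estimation method is not stated in this section. Function and variable names are illustrative.

```python
from typing import Sequence


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Combined-location mean GY (ton/ha) for C0..C4, copied from Table 2.
cycles = [0, 1, 2, 3, 4]
combined_gy = [8.52, 8.40, 8.62, 8.92, 9.05]

# Assumed method: gain per cycle as the regression slope over the cycles.
gain_c0_c4 = ols_slope(cycles, combined_gy)          # all cycles
gain_c1_c4 = ols_slope(cycles[1:], combined_gy[1:])  # genomic cycles only

# Per-year gain as given in the text: (cycles per year x gain per cycle) / years.
CYCLES_PER_YEAR = 2
YEARS_FROM_INITIAL_CROSS = 4.5
gain_per_year = CYCLES_PER_YEAR * gain_c1_c4 / YEARS_FROM_INITIAL_CROSS

# Relative increase in mean GY between the first and last cycle compared.
pct_c0_c4 = 100 * (combined_gy[4] - combined_gy[0]) / combined_gy[0]
pct_c1_c4 = 100 * (combined_gy[4] - combined_gy[1]) / combined_gy[1]

print(f"gain per cycle C0-C4: {gain_c0_c4:.3f}")
print(f"gain per cycle C1-C4: {gain_c1_c4:.3f}")
print(f"gain per year:        {gain_per_year:.3f}")
print(f"percent gain C0-C4:   {pct_c0_c4:.1f}")
print(f"percent gain C1-C4:   {pct_c1_c4:.1f}")
```

With the rounded means in the table, the same slope calculation matches the single-location entries only approximately, which is expected if the published values were computed from unrounded data.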
1000, 157, 91 and 44 in C0, C1, C2 and C3, respectively. Families from C4 were not genotyped. Genotyping-by-sequencing, SNP calling, imputation and filtering were performed as described by Wu et al. (2016). Briefly, genomic DNA was digested with the restriction enzyme ApeKI. GBS libraries were constructed in 96-plex and sequenced on Illumina HiSeq2000 (Elshire et al. 2011). SNP calling was performed using the TASSEL GBS Pipeline, and a GBS 2.7 TOPM (tags on physical map) file was used to anchor reads to the Maize B73 RefGen_v2 reference genome (Glaubitz et al. 2014). Imputation was carried out with the FILLIN method in TASSEL 5.0 (Swarts et al. 2014), which anonymized GBS 2.7 haplotypes made from 8000-site windows. In total, 955,690 SNPs were generated for each sample; filtering was performed with minor allele frequency (>0.05) and a missing data rate of <10%.

Assessing the genetic diversity of the selection cycles
Based on genomic data, we computed two genetic diversity indices between the families of the different selection cycles as well as the 18 parents. We calculated the Shannon Diversity Index of the sample for each selection cycle as $-\frac{1}{A}\sum_{a=1}^{A}\hat{p}_{a}\ln(\hat{p}_{a})$, where $\hat{p}_{a}$ is the frequency of the major allele in the $a$th marker over the entire sample, and $A$ is the total number of markers. The expected proportion of heterozygous loci per individual was computed as the mean of heterozygosity for each marker as $0 \le \frac{1}{L}\sum_{l=1}^{L}\left(1-\sum_{a=1}^{n}\hat{p}_{la}^{2}\right) \le 1$, where $\hat{p}_{la}$ is the frequency of the major allele in the $a$th marker of the $l$th individual, and $L$ is the number of individuals.
Multidimensional scaling (MDS) was performed with the TASSEL software (http://www.maizegenetics.net/tassel) to assess the genetic similarity of all the materials in each selection cycle.

Data availability
The phenotypic and genotypic data for the training population (cycle C0) evaluated in four sites, the phenotypic and genotypic data for the evaluation of the entries from the different selection cycles (C0, C1, C2, C3, and C4), as well as a brief GUIDE can be found in the link http://hdl.handle.net/11529/10927. A marker information file and characteristics of the genetic materials are also included in the link.

Figure 2 Distribution of parents, cycle C0 entries (A) and the selected parents, and cycle C1 entries (B) and the selected parents based on rapid cycling genomic selection-assisted recombination.

RESULTS

Heritability and prediction accuracy of GY in the training population
The combined GY heritability across both locations was 0.34, while GY heritability at individual locations was 0.48, and 0.19 in Agua Fria and Tlaltizapan, respectively. Low-to-intermediate GY heritability was observed in the individual location analysis and combined analysis, mainly because GY is a complex trait. To mimic future prediction problems we will face, we implemented a fivefold random cross-validation with 100 replicates using entries in C0 (training population) for GY; the mean correlation between the predicted and observed values was 0.55.

Realized genetic gains from rapid cycling recombination GS for grain yield
A total of five groups of entries from C0, C1, C2, C3, and C4 plus four checks were used for field evaluation at two Mexican locations (Agua Fria and Tlaltizapan). Mean grain yield for each cycle and average gains per cycle are shown in Table 2. The Tlaltizapan location had the highest mean yield, with C4 reaching 10.96 ton ha⁻¹. At both individual locations, the average performance across all C4 entries surpassed the grain yield performance of the other cycles; average grain yield performance across all C4 entries was 7.13 ton ha⁻¹, and 10.96 ton ha⁻¹ for Agua Fria and Tlaltizapan, respectively. Also, the combined analyses of the two locations showed an increase in mean grain yield in C4 of 9.05 ton ha⁻¹ over that achieved in C3 (8.92 ton ha⁻¹) and over the other GS cycles.
Table 4 The Shannon Diversity Index, heterozygosity, and number of SNPs of the 18 original parents, the number of families in cycles C0–C3 (in parentheses), and the selected parents in C0–C3, and including all the entries

                        Parents   C0 (1000)  C0 (50)   C1 (157)  C1 (25)   C2 (91)   C2 (18)   C3 (44)   C3 (22)   All Entries
Shannon’s Index         0.0661    0.0728     0.020     0.0776    0.052     0.0765    0.043     0.0588    0.063     0.0740
Heterozygosity          0.1104    0.1226     0.1208    0.1297    0.1250    0.1276    0.1228    0.0973    0.0923    0.1245
Number of SNP markers   950,248   952,825    943,344   951,390   947,868   953,199   953,453   954,058   954,924   954,960

Numbers in parentheses refer to the size of the cycle population and the selected parents to form the subsequent cycle.
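The two diversity indices reported in Table 4 can be illustrated with a small sketch. It rests on assumptions: the Shannon index is taken as the mean of p·ln(p) over markers, with p the major allele frequency, and a leading minus sign so that the index is non-negative (consistent with the positive values in Table 4); heterozygosity is computed as the standard expected heterozygosity for biallelic markers, which may differ in detail from the formula used in the study. Function names and the toy frequencies are hypothetical.

```python
import math
from typing import Sequence


def shannon_index(major_freqs: Sequence[float]) -> float:
    """Mean of -p * ln(p) over markers, p = major allele frequency (assumed form)."""
    total = sum(p * math.log(p) for p in major_freqs if p > 0.0)
    return -total / len(major_freqs)


def expected_heterozygosity(major_freqs: Sequence[float]) -> float:
    """Mean over biallelic markers of 1 - (p^2 + q^2), with q = 1 - p."""
    per_marker = [1.0 - (p * p + (1.0 - p) * (1.0 - p)) for p in major_freqs]
    return sum(per_marker) / len(per_marker)


# Toy major-allele frequencies for four markers (invented, not study data).
freqs = [0.95, 0.90, 1.00, 0.85]
shannon = shannon_index(freqs)
heteroz = expected_heterozygosity(freqs)
```

A fixed marker (p = 1) contributes zero to both indices, so a cycle in which many markers become monomorphic shows lower values on both measures.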
Table 5 Number of SNP markers with allele swaps, number of polymorphic markers that became monomorphic and number of markers that were monomorphic and became polymorphic from [parents-C0] to [C3–C4]
Number of Number of Polymorphic Number of Monomorphic Total SNPs with Changes Total Number
Chromosome Allele Swaps to Monomorphic to Polymorphic N % of SNPs
1 1 3 4 2 1 11
2 1 1 4 1 1 1 9
3 1 1 2
4 2 2 1 1 6
5 2 9 3 2 1 1 1 19
6 1 3 2 1 1 8
7 2 2 1 1 6
8 2 4 2 2 10
9 1 2 1 3 1 8
10 1 4 3 1 9
Total 6 1 1 1 31 21 12 5 4 2 1 1 1 1 88
Figure 6 depicts the location of the clusters for each chromosome in the genome. For example, chromosome 1 has one cluster with four markers that changed their frequency (green color), one cluster of four markers that changed from monomorphic in the 18 parents and cycle C0 to polymorphic in cycles C3 and C4 (blue color), and three clusters with three markers, four clusters with four markers, and two clusters with six markers that changed from polymorphic in the 18 parents and cycle C0 to monomorphic in cycles C3 and C4 (black color). As shown in Table 6, most of the clusters with changes in their allele frequency occurred in chromosome 5.

Figure 6 Genome location of clusters of SNPs with changes in their polymorphic status.

DISCUSSION
Previous studies on temperate and tropical maize showed realized gains of RCGS in biparental populations (Massman et al. 2013; Beyene et al. 2015; Vivek et al. 2016). In this study, our results showed realized gains of RCGS in a multi-parental tropical maize population that originated from crosses of 18 CIMMYT elite tropical maize lines. From a practical breeding perspective, multi-parental populations might not be an attractive option because the mean of 18 parental lines might be lower than the mean of the best few lines that could be used in biparental crosses; however, as diversity becomes an important issue in GS, multi-parental populations offer the opportunity to maintain diversity while still achieving rapid cycles with high realized grain yield genetic gains in a shorter period of time, as found in this study. As for the decrease in genetic diversity, this is not of much concern in short-term selection (four to five cycles), especially if the newly developed lines from C4 are crossed with other lines for further breeding.

Trends in the realized genetic gains of multi-parental populations for grain yield
The genetic gains per unit of time are given by the breeders’ equation, which is Gain = (i·r·h)/I, where i is the selection intensity, r is the selection accuracy, h is the square root of narrow-sense heritability, and I is the time (in years) it takes to complete a selection cycle. In this study, the gains in GY in different selection cycles were not consistent, decreasing slightly from C0 to C2, while increasing significantly from C2 to C3, and from C3 to C4. As for analyses combining the two sites, the gains in grain yield were 6.2 and 7.7% from C0 to C4 and from C1 to C4, respectively. The combined realized genetic gains reached 0.158 ton ha⁻¹ per cycle for C0–C4 (Figure 7A) and 0.225 ton ha⁻¹ per cycle for RCGS C1–C4 (Figure 7B).
The lower GY observed in C1 compared with C0 arises because the best GY entries selected from C0 as parents of C1 were further intermated and selected based on flowering, plant, and ear height, etc. This also helped to broaden and maintain the genetic diversity observed in C1 and C2, which later declined in C3. As already mentioned, the other reason for the lower GY observed in C1 compared with C0 is that the best 50 selected families (not randomly selected families) were used to represent selection cycle C0 in the genetic gain evaluation study. The GY mean of the best 50 selected families is much higher than that of 50 randomly selected families.
In terms of the prediction models used to predict the genetic values of the entries to be selected in each genomic cycle, we used the direct genotyping-by-sequencing markers as biallelic. Since a multi-parental (not a biparental) population was used, haplotypes rather than biallelic markers could have been used in order to attempt to capture the whole allelic diversity. However, the problem of how to define the length of the haplotype segment in each chromosome could be a major drawback of this approach; different haplotype methods exist, but none of them seems to give clear superiority in terms of genomic-enabled prediction accuracy.

Realized genetic gains per unit of time
To compute the realized genetic gains per year (ton ha⁻¹ yr⁻¹), it is necessary to account for the number of cycles per year (two cycles per year in this study), and also for the time from the initial cross to the last cycle (4.5 yr from F1 development to harvesting the C4 in this study).
Therefore, given that grain yield from C1 (8.40 ton ha⁻¹) to C4 (9.05 ton ha⁻¹) increased by 7.74%, the average genetic gain of 0.225 ton ha⁻¹ per cycle (Table 2) is equivalent to 0.100 ton ha⁻¹ yr⁻¹ [i.e., (2 × 0.225)/4.5] under optimal conditions.
Masuka et al. (2017a) conducted a review of genetic gain studies that used conventional pedigree selection on tropical hybrid maize germplasm under optimal conditions in Sub-Saharan Africa, which gave gains of 0.109 ton ha⁻¹ yr⁻¹. For tropical open-pollinated maize varieties, realized genetic gains reached 0.109 ton ha⁻¹ yr⁻¹ in the early maturity group, and 0.079 ton ha⁻¹ yr⁻¹ in the intermediate-to-late group (Masuka et al. 2017b). Therefore, the genetic gains from the RCGS observed in the MPPs used in this study (0.100 ton ha⁻¹ yr⁻¹) are at the same or higher level than those observed in other studies under phenotypic selection but with a shorter breeding cycle. However, the 0.070 ton ha⁻¹ yr⁻¹ achieved by Beyene et al. (2015) in bi-parental populations is not comparable to the results of this study because the genetic gains from RCGS in biparental populations targeted managed drought environments (not optimal environments), and the RCGS in this MPP targeted optimal environments.
In this study, results obtained from MPPs in optimal environments reinforce the usefulness of GS-assisted recombination for achieving high genetic gains in GY. Although only two cycles per year were completed in this study (Beyene et al. 2015, completed three cycles per year in biparental populations), it is still time-efficient when compared to the 1.5 yr per selection cycle required for making testcrosses, phenotyping testcrosses, and conducting selection and recombination in conventional pedigree breeding.

Trends in genetic diversity under rapid cycling recombination GS
There are not many reports on the influence of RCGS on genetic variance in plant breeding. In a simulation study, Jannink et al. (2010) were the first to caution about the possible decline in genetic variance due to RCGS. Genetic gains in GS for stem rust in wheat were reported by Rutkoski et al. (2015); genetic gains in GS were compared