
Estimation of Heritability From Limited Family Data Using Genome-Wide Identity-By-Descent Sharing

This study estimates heritability using genome-wide identity-by-descent (IBD) sharing among close relatives using small family samples. The researchers simulated IBD relationships among full siblings assuming a genome size similar to humans. Genetic variance was estimated from phenotypic data assuming IBD relationships could be recreated using genome markers. Estimates based on a single large full-sibling family were most accurate. This method is more robust than pedigree-based analysis and can estimate genetic variance using limited data from small populations.


Ødegård and Meuwissen Genetics Selection Evolution 2012, 44:16

http://www.gsejournal.org/content/44/1/16

RESEARCH Open Access

Estimation of heritability from limited family data using genome-wide identity-by-descent sharing
Jørgen Ødegård1,2* and Theo HE Meuwissen2

Abstract
Background: In classical pedigree-based analysis, additive genetic variance is estimated from between-family
variation, which requires the existence of larger phenotyped and pedigreed populations involving numerous
families (parents). However, estimation is often complicated by confounding of genetic and environmental family
effects, with the latter typically occurring among full-sibs. For this reason, genetic variance is often inferred based
on covariance among more distant relatives, which reduces the power of the analysis. This simulation study shows
that genome-wide identity-by-descent sharing among close relatives can be used to quantify additive genetic
variance solely from within-family variation using data on extremely small family samples.
Methods: Identity-by-descent relationships among full-sibs were simulated assuming a genome size similar to that
of humans (effective number of loci ~80). Genetic variance was estimated from phenotypic data assuming that
genomic identity-by-descent relationships could be accurately re-created using information from genome-wide
markers. The results were compared with standard pedigree-based genetic analysis.
Results: For a polygenic trait and a given number of phenotypes, the most accurate estimates of genetic variance
were based on data from a single large full-sib family only. Compared with classical pedigree-based analysis, the
proposed method is more robust to selection among parents and for confounding of environmental and genetic
effects. Furthermore, in some cases, satisfactory results can be achieved even with less ideal data structures, i.e., for
selectively genotyped data and for traits for which the genetic variance is largely under the control of a few major
genes.
Conclusions: Estimation of genetic variance using genomic identity-by-descent relationships is especially useful for
studies aiming at estimating additive genetic variance of highly fecund species, using data from small populations
with limited pedigree information and/or few available parents, i.e., parents originating from non-pedigreed or
even wild populations.

* Correspondence: [email protected]
1 Nofima, P.O. Box 210, NO-1431 Ås, Norway
Full list of author information is available at the end of the article

© 2012 Ødegård and Meuwissen; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background
Estimates of additive genetic variance are commonly based on data from large pedigreed populations incorporating all known relationship information. Additive genetic relationships can be defined as twice the identity-by-descent (IBD) probability of two randomly drawn alleles, which can be estimated from pedigree data. The advantages of these pedigree-based analyses are that they do not require any knowledge about the genetic architecture of the traits and that the additive relationships are easily inferred from a known pedigree. However, these methods also have some major limitations. First, such analyses ignore relationships beyond those included in the known pedigree. Second, the assumed relationships are expected relationships (based on expected sharing of IBD alleles) rather than actual relationships. In fact, the pedigree relationship is exact only under an infinitesimal model [1], i.e., assuming that the additive genetic effects of the quantitative traits are controlled by an infinite number of unlinked loci. Under a more realistic finite-locus model (and assuming that some of the loci are linked), the actual relationships will be distributed around the expectation, with variable relationships among full-sibs and other relatives [2]. By assuming (incorrectly) homogeneous relationships among the same type of relatives (e.g., sibs), in pedigree-based analyses, the genetic (co)variance components are estimated based on between-family variation only, since the Mendelian sampling deviations of non-parents cannot be separated from the residual (or permanent environmental) effects on the same animals [3]. Estimation of genetic variance based on pedigree relationships is further complicated by the fact that common environmental effects may be important for some relatives, especially full-sibs (e.g., maternal environment, rearing environment, litter effects, etc.), which means that the genetic variance must be estimated from covariances among phenotypes of more distant relatives (e.g., half sibs, cousins, etc.).

Due to linkage between loci within the same chromosome, parents tend to pass on long segments of DNA to their offspring. Hence, the “effective” number of segregating loci within a full-sib family will be much lower than the corresponding number for the whole population, even for species with a larger genome. For example, recent reports have indicated that the effective number of segregating loci among full-sib pairs in humans is only about 80 [4,5]. When the effective number of segregating loci is low, the actual relationships among full-sibs vary substantially among sib-pairs. Visscher et al. [5] have estimated that actual relationships among human full-sibs vary from 0.37 to 0.62, and used these relationships to quantify the additive genetic variation of human height based on within-family segregation only, i.e., free from non-genetic factors. In that study, the heritability values were based on more than 3000 sib pairs. With such a large dataset, including numerous families, the main challenge is not to estimate between-family variation, but rather to separate genetic effects from other effects that act on a family level. Visscher et al. [5] pointed out that one limitation of their method was that it required large datasets with densely genotyped individuals. Indeed, for a sib-pair design (twin study), a large number of full-sib pairs would be needed. However, for livestock, aquaculture species and laboratory animals, population structures are usually very different from those in humans, with much larger progeny groups of either full- or half sibs (or both). Therefore, the aim of the current study was to test whether genetic variance could be accurately estimated with relatively small datasets and a limited number of families, using a population structure typical of a high fecundity species (e.g., insects, crustaceans, fish or poultry), and whether the results could also be generalized to species in which only one of the sexes (usually males) has a large reproductive potential (e.g., mammalian livestock).

Methods
Simulation study
Genomic identity-by-descent relationships
The IBD relationships were simulated so that they closely resembled the relationships estimated with real data for humans, and thus they were typical of species with relatively large genomes. Variation in IBD sharing was simulated using a model with 80 “effective loci” ($n_e$) within a family (equivalent to human genome size). Effective loci are defined as the number of independently segregating “loci” that would yield the same standard deviation of the proportion of genome shared among full-sibs as observed in real genomic data from human sib pairs [4]. Hence, an “effective locus allele” is not a specific mutation, but is equivalent to a long haplotype block passed on from parent to offspring. For simplicity, it was assumed that different families were unrelated and that inbreeding was zero. For an “effective locus” i, the IBD relationship of two full-sibs was therefore defined as 0 if none of the paternal and maternal “alleles” (haplotype blocks) were IBD, 0.5 if either their paternal or maternal “alleles” were IBD and 1 if both their paternal and maternal “alleles” were IBD. The actual relationship between two full-sibs was then defined as the average relationship across all “effective loci” (i.e., representing the whole genome). An example of the distribution of actual relationships in a large simulated full-sib family is shown in Figure 1. Since all relationships among full-sibs are based on the inheritance of a limited number of “effective loci” ($n_e = 80$), the actual relationship matrix cannot be of full rank for large families, which introduces numerical problems in data simulation and analysis. Therefore, the relationship matrix was forced to be positive definite by adding a small positive value ($10^{-3}$) to each diagonal element (sufficiently small to have a negligible effect on the genetic (co)variance structure).

Figure 1 Example of actual relationships among simulated full-sibs in a single family (N = 1000).

Data sets and data structures
Nine data structures were generated, using various numbers of full-sib families (1-10) and individuals (200-1000) with data (Table 1). Furthermore, three scenarios were defined (Table 2), all assuming moderate heritability, but differing with respect to the distribution of genetic variance; either equally distributed over genomic regions (Scenarios 1 and 2) or located in a single region (“one effective locus”) only (Scenario 3). Furthermore, Scenario 1 included common environmental family effects in addition to additive genetic effects, while Scenarios 2 and 3 assumed that common environmental effects were absent.

Table 1 Simulated data structures
Structure   Animals   Full-sib families   Animals per family
1           200       1                   200
2           200       5                   40
3           200       10                  20
4           500       1                   500
5           500       5                   100
6           500       10                  50
7           1000      1                   1000
8           1000      5                   200
9           1000      10                  100

Table 2 Input variance components
Scenario   Additive genetic   Common environment   Residual   Heritability*
1          0.50               0.25                 1.00       0.33
2          0.50               0.00                 1.00       0.33
3          0.50†              0.00                 1.00       0.33
*heritability was calculated as genetic variance/(genetic variance + residual variance); †all genetic variance was located at one of the “effective loci”.

All combinations of structures and scenarios were run in 50 replicates and the results were averaged over replicates. However, for single-family structures, in which all animals are necessarily within the same family environment, there was no practical difference between Scenarios 1 and 2 (the environmental family effect will be included in the overall mean). Hence, 1200 different datasets were generated and analyzed.

Phenotypes were generated using the following model:

$$y = \mathbf{1}\mu + Z_a a + Z_f f + e \qquad (1)$$

where $\mu$ is the overall mean, $a \sim N(0, G\sigma_a^2)$, $f \sim N(0, I\sigma_f^2)$, $e \sim N(0, I\sigma_e^2)$, $G$ is the actual IBD relationship matrix, $I$ is an identity matrix of appropriate size, the $Z$-matrices are appropriate incidence matrices and $\sigma_a^2$, $\sigma_f^2$ and $\sigma_e^2$ are the additive genetic, common environmental and residual variances, respectively. For Scenarios 1 and 2, $G$ was set up over all “effective loci”, while for Scenario 3, only the first “effective locus” was used to calculate $G$. The breeding values in $a$ were then generated as:

$$a = Lz \qquad (2)$$

where $L$ is a lower triangular Cholesky decomposition of $G$, and $z$ is a vector of standard normal deviates of length N (number of animals in the dataset). This assumes that (1) genetic variance is evenly distributed across the genome, and (2) gene effects are normally distributed, or that the aggregated effect of many genes, i.e., the breeding values, are approximately normally distributed, even when individual gene effects are not (due to the central limit theorem). It also assumes that the different founder alleles at an “effective locus” have unique allelic effects, because an “effective locus” contains many genes and thus contains a unique combination of alleles at these genes. All datasets were generated using the MATLAB® software (http://www.mathworks.com).

Statistical analysis
The data sets were analyzed with the general linear model

$$y = X\beta + Z_a a + e$$

where $X\beta$ includes fixed effects of each family, or in absence of common environmental family effects, the overall mean only. If fitted, common environmental family effects were included as fixed effects due to the fact that the number of families included was very small (and the number of observations per family large) and thus the associated variance was difficult to estimate.

Model 1: Genomic IBD animal model. In this model, the additive genetic effects were assumed to follow $a \sim N(0, G\sigma_a^2)$, where $G$ was calculated over all genomic regions, i.e., it was assumed that the inheritance of the DNA segments from parents to offspring could be accurately traced using marker information, and genetic variance was evenly distributed over genome segments (irrespective of the simulation scenario). The genomic IBD animal model is equivalent to a genome-wide gametic model, i.e., a model in which the original gametes received from the sire and dam and their associated actual relationships are reconstructed using genomic data [6]. The relationship between two individuals in the animal model is twice the average of the four gametic relationships (coancestry) for the two individuals, and the animal genetic variance is twice the gametic variance.

Model 2: Pedigree-based animal model. This is the classical animal model, assuming $a \sim N(0, A\sigma_a^2)$, where $A$ is the numerator relationship matrix (inferred from the pedigree). Furthermore, since the classical model uses only between-family variation to estimate additive genetic effect variance, environmental effects common to full-sibs could not be included in the model for this simple data structure (irrespective of whether they were present or not).
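The simulation of a single full-sib family described under "Simulation study" can be sketched in a few lines of Python with NumPy (the original datasets were generated in MATLAB and analyzed with REML). This is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical; the breeding values are scaled by $\sigma_a$ so that $\mathrm{Var}(a) = G\sigma_a^2$, which is an assumption about the intended scaling of Equation 2; and the last function is a simple moment-based regression of sib-pair cross-products on actual IBD sharing, used here only as a stand-in to illustrate estimation from within-family variation. With 80 independent loci, the per-locus relationship (0, 0.5 or 1 with probabilities 1/4, 1/2, 1/4) has variance 0.125, so the expected standard deviation of sib relationships is $\sqrt{0.125/80} \approx 0.0395$.

```python
import numpy as np


def simulate_fullsib_G(n_sibs, n_loci=80, ridge=1e-3, rng=None):
    """Actual IBD relationship matrix for one non-inbred full-sib family."""
    rng = np.random.default_rng() if rng is None else rng
    # For every sib and "effective locus", record which of the sire's two and
    # which of the dam's two haplotype blocks was inherited, coded as -1/+1.
    pat = 2.0 * rng.integers(0, 2, size=(n_sibs, n_loci)) - 1.0
    mat = 2.0 * rng.integers(0, 2, size=(n_sibs, n_loci)) - 1.0
    # pat @ pat.T equals (loci shared) - (loci not shared); rescale it to the
    # fraction of loci at which two sibs carry the same parental block.
    share_pat = (pat @ pat.T + n_loci) / (2.0 * n_loci)
    share_mat = (mat @ mat.T + n_loci) / (2.0 * n_loci)
    # The per-locus relationship is 0, 0.5 or 1, so the genome-wide average is
    # half the paternal sharing plus half the maternal sharing.
    G = 0.5 * share_pat + 0.5 * share_mat
    # Add a small value to the diagonal so that G is positive definite.
    return G + ridge * np.eye(n_sibs)


def simulate_phenotypes(G, var_a=0.5, var_e=1.0, mu=0.0, rng=None):
    """Phenotypes y = mu + a + e for a single family (no separate family effect)."""
    rng = np.random.default_rng() if rng is None else rng
    n = G.shape[0]
    L = np.linalg.cholesky(G)  # lower triangular factor, G = L @ L.T
    # Assumption: scale L z by sigma_a so that Var(a) = G * var_a.
    a = np.sqrt(var_a) * (L @ rng.standard_normal(n))
    e = np.sqrt(var_e) * rng.standard_normal(n)
    return mu + a + e


def moment_estimate_var_a(y, G):
    """Slope of sib-pair phenotype cross-products on actual IBD sharing.

    This is NOT the REML analysis used in the paper; it is a simple
    moment-based stand-in that uses within-family variation only.
    """
    i, j = np.triu_indices(len(y), k=1)  # all distinct sib pairs
    d = y - y.mean()                     # remove the family mean
    slope, _intercept = np.polyfit(G[i, j], d[i] * d[j], 1)
    return slope


if __name__ == "__main__":
    rng = np.random.default_rng(2012)
    G = simulate_fullsib_G(1000, rng=rng)
    y = simulate_phenotypes(G, rng=rng)
    off = G[np.triu_indices(1000, k=1)]
    print("mean and SD of sib relationships:", off.mean(), off.std())
    print("moment estimate of additive variance:", moment_estimate_var_a(y, G))
```

The matrix product is used instead of a loop over sib pairs so that a family of 1000 sibs is handled in one step; the moment estimator is expected to be noisier than a REML fit of the same data.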

For both models, variance components were estimated with restricted maximum likelihood methodology using the ASREML software package [7].

Selective genotyping. The genomic IBD model assumes that all animals are genotyped with a sufficiently dense marker map covering the entire genome. However, in some studies, selective genotyping of phenotypically extreme (high/low) animals within each family may be used to save costs. This may be a useful approach for QTL (Quantitative Trait Loci) detection, but our aim was to evaluate whether such data could be used to estimate quantitative genetic variation as well. In these analyses, we assumed a single family with 200, 500 or 1000 full-sibs, in which only individuals with phenotypes deviating more than one residual standard deviation from the mean were genotyped. However, because including only the genotyped (phenotypically extreme) animals in the analysis would probably yield overestimated variance components, the non-genotyped animals were also included in the analysis. For the analyses, genomic IBD relationships among genotyped individuals were combined with pedigree relationships of non-genotyped individuals in a common relationship matrix [8,9].

Results
The estimated heritabilities (across-replicate means and standard deviations) for the different structures under Scenarios 1 and 2 are presented in Figures 2 and 3, respectively. For the classical pedigree-based analyses, the data structure did not make it possible to separate permanent environmental effects common to full-sibs from genetic effects, since both factors are estimated from between-family variation only (and no other relatives than full-sibs were present). Hence, the estimated heritability in the classical model was biased by the common environmental component, resulting generally in overestimated genetic variance. Furthermore, when the number of families included in the dataset was low, the estimates also varied substantially from replicate to replicate. For the one-family designs, no between-family variation existed, and therefore, by definition, genetic variance could not be estimated with a classical pedigree-based model. However, for all the designs, the genomic IBD model was able to estimate genetic variance, due to the fact that the model inferred genetic variance from within-family variation, and multiple families were therefore not needed. Moreover, even with multiple families, the heritability estimates were unbiased and much more accurate than with the classical model. Furthermore, the precision of the heritability estimate in the IBD model increased with increasing family size and was highest for single-family designs (i.e., largest family size for a given number of observations). For the latter design, heritabilities were estimated with moderate to high precision even with the smallest datasets (200 animals).

Figure 2 Across-replicate averages of estimated heritability (with between-replicate standard deviations of the estimates) by total number of observations using a classical animal model (a) and a genomic IBD animal model (b) with 1, 5 or 10 families for Scenario 1. The dotted line represents the true input heritability.

When assuming no common environmental variance (Scenario 2), the pedigree-based analyses were also unbiased but they were less precise than the genomic IBD analyses (Figure 3). As expected, if common environmental effects were not included in the data, the precision of the estimated heritability was improved, in particular for the smallest datasets using the classical model, while the precision of the IBD model was unaffected for the largest datasets (1000 individuals). The differences between the two models were most pronounced with larger datasets with a few families. For the IBD genomic model, within-family variation dominated estimation of genetic variance, and thus reducing family sizes to give room for more families led to more imprecise estimates of genetic variance.

Figure 3 Across-replicate averages of estimated heritability (with between-replicate standard deviations of the estimates) by total number of observations using a classical animal model (a) and a genomic IBD animal model (b) with 1, 5 or 10 families for Scenario 2. The dotted line represents the true input heritability.

For selectively genotyped data, the genomic IBD model was also able to estimate the genetic variance based on a single large (N = 1000) family. However, single-family estimates based on smaller samples (200 or 500) tended to be overestimated, and the precision of the estimates was reduced compared to that with full genotyping (Figure 4).

Figure 4 Across-replicate averages of estimated single-family heritability (with between-replicate standard deviations of the estimates) by total number of observations using a genomic IBD animal model with selective genotyping. The dotted line represents the true input heritability.

If all the genetic variance was located in only one “effective locus”, and no common environmental variance existed (Scenario 3), estimation of heritability was still unbiased for both the genomic IBD and the pedigree-based (more than one family) methods (Figure 5). With larger datasets (500-1000 individuals), the genomic IBD method was more precise, but the two methods were equally imprecise for the smallest datasets, and, in contrast with the earlier results, the single-family design yielded highly imprecise results with the genomic IBD model.

Discussion
This study shows that tracing genomic IBD relationships using genomic information has clear advantages, not only for prediction of individual breeding values [10] but also for estimation of genetic (co)variance components. Both the current and earlier studies have shown that genetic variance can be estimated based on within-family variation. In contrast, estimation of genetic variance in a classical genetic analysis is based only on between-family variation. Hence, for the latter, it is imperative that genetic and non-genetic family effects are properly separated by the model, which puts major limitations on the usefulness of family data, e.g., resemblance among full-sibs may also be due to similarities in the environment. Furthermore, for an accurate estimation of genetic variance in a classical model, many families must be included in the study and selection of data should be avoided. However, by using actual IBD sharing among sibs instead of expected relationships, genetic variation can be quantified solely from within-family variation [5], which also facilitates proper separation of genetic and non-genetic family effects (as the latter do not affect within-family variation). The current study shows that with the genomic IBD approach, genetic variance can be accurately inferred from a single family, and for a given number of observations, including more families gives less accurate results (even in the absence of common environmental effects). For a classical analysis (and in absence of common environmental effects), Falconer and Mackay [11] showed that the optimal family size for a specific number of observations under a full-sib design was $n = 2/h^2$. However, using the genomic IBD approach, this formula is no longer valid. For a genomic IBD analysis, including only a single large family will maximize the precision of the predicted Mendelian sampling deviations of all family members, and there is also no need to separate genetic and environmental family effects. Furthermore, the number of families is necessarily lower than the number of individuals, and prediction of Mendelian sampling deviations is therefore more informative than prediction of family means with respect to genetic variance. Estimation of genetic variance from family means (classical model) is also sensitive to selection of data (both within and among families), and failure to account for this is expected to give downward biased estimates of the genetic variance. In populations undergoing artificial selection, parents of the phenotyped animals are usually selected, and unbiased genetic analysis requires that the selection history is properly included in the data, i.e., the analysis should involve data on multiple generations, which is not always available. However, selection among parents will have little impact on the within-family genetic variance (in absence of inbreeding). Since the genomic IBD model uses mainly within-family variation, it is expected to be more robust to selection among parents.

Even with selective genotyping, the genomic IBD model could estimate genetic variance relatively precisely for large families (N = 1000), although there was a tendency towards overestimated heritability values and less precise results for smaller family sizes. This may be explained by the fact that only the phenotypically most extreme individuals are genotyped and only these are informative with respect to partitioning of residual and Mendelian sampling variances.

The current study assumes that inheritance of the haplotype blocks from parents to offspring is known. In real data, this is never the case but we may observe genome-wide marker genotypes and this information can be used to trace inheritance of the haplotype blocks. Furthermore, since the number of recombinations per gamete is limited, sharing of haplotype blocks within a family can be estimated with a high degree of accuracy, even with sparsely distributed genome-wide markers [12]. Reconstructing paternal and maternal haplotype blocks is equivalent to reconstructing the original gametes received from the sire and dam, making the genomic IBD animal model and a genomic IBD gametic model equivalent.

In species with a low reproductive potential, the proposed single-family mating design is of little relevance, since large full-sib groups cannot be produced. However, some species have a high reproductive potential among males, while the reproduction of females is often limited (e.g., in mammalian livestock). For such species, a sire gametic model may be more relevant. In such a model, the sires’ gametes are reconstructed and the actual relationships between them estimated. Genetic variance can then be estimated from variation among the sires’ gametes (half the within-family genetic variance), rather than variation among individuals (sires’ and dams’ gametes). In this model, genetic variation due to the dams’ gametes will be included in the random residual term and genetic variance may be estimated from samples of offspring (e.g., daughters of dairy bulls).

The proposed method can also be generalized to IBD tracing in more complex pedigrees, i.e., beyond a single generation, allowing information from various types of relatives to be exploited by linkage analysis (e.g., [13]). More distant relatives are generally less related but their relationships are also expected to deviate more from their expected values [14]. Hence, distant relatives may provide additional value for estimating genetic variance, especially for populations with smaller full- and half-sib groups. However, tracing haplotype blocks over multiple generations will be more challenging (shorter DNA blocks due to more recombination) and will require denser marker maps for accurate tracing compared with the one-generation (sib) approach.

The proposed method will underestimate the total genetic variance in cases where a fraction of the genome is not covered by the markers, i.e., if some of the “effective loci” are not accounted for in the G matrix. For instance, if a fraction q of the genome is not covered by markers, the total variance will also be underestimated by a fraction q when the single full-sib family design is used. When the design contains several families, the between-family variances are quite accurately predicted, even if part of the genome is not covered by markers, which will recover some of the underestimation. The underestimation may be completely recovered by including a polygenic effect in the model, which has a covariance structure equal to the pedigree-based relationship matrix, requiring several families of data.
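The statement that a fraction q of uncovered genome leads to an underestimation by a fraction q in the single-family design can be made explicit with a short derivation. This is a sketch under the simulation assumptions of this study (independent “effective loci” with evenly distributed genetic variance, no inbreeding); the superscripts C and U for covered and uncovered loci are notation introduced here for illustration only.

```latex
\begin{align*}
G_{ij} &= (1-q)\,G^{C}_{ij} + q\,G^{U}_{ij},
  \qquad \mathrm{E}\!\left[G^{U}_{ij}\right] = \tfrac{1}{2} \\
\mathrm{Cov}\!\left(a_i, a_j \mid G^{C}_{ij}\right)
  &= \sigma_a^2\left[(1-q)\,G^{C}_{ij} + q\,\mathrm{E}\!\left[G^{U}_{ij}\right]\right] \\
  &= (1-q)\,\sigma_a^2\,G^{C}_{ij} + \tfrac{q}{2}\,\sigma_a^2
\end{align*}
```

Only the first term varies with the observed sharing among sibs, so the variance that can be attributed to the observed relationships is $(1-q)\sigma_a^2$. The second term is the same for every sib pair and, within a single family, cannot be distinguished from the family mean.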

Figure 5 Across-replicate averages of estimated heritability (with between-replicate standard deviations of the estimates) by total number of observations using a classical animal model (a) and a genomic IBD animal model (b) with 1, 5 or 10 families for Scenario 3. The dotted line represents the true input heritability.

Scenarios 1 and 2 in the current study assumed that genetic variance is evenly distributed over genomic regions, as assumed in the genomic BLUP (GBLUP) model [10]. The main difference between the GBLUP model and the genomic IBD model is that the first model uses identity-by-state (IBS) relationships, while

the latter uses IBD relationships (based on marker alleles traced back to a common ancestor). The assumption that genetic variance is distributed evenly across genomic regions has been shown to be an appropriate approximation for a number of traits [15,16]. However, there are also examples of the opposite assumption, e.g., genetic variation in resistance against infectious pancreas necrosis in Atlantic salmon seems largely controlled by a single major QTL [17,18]. For the latter type of traits, some of the underlying assumptions of both the pedigree-based and genomic IBD models are violated. First, within-family genetic variance will vary greatly among families, depending on the actual parental genotypes (“effective alleles”) for the genomic region that primarily affects the trait (although it will still on average be $\tfrac{1}{2}\sigma_a^2$); in the example on Atlantic salmon, the within-family genetic variance will depend on whether or not the parents segregate for the major QTL. Second, IBD relationships in the most important linkage group(s) will dominate genetic covariance between relatives, not the overall genomic or expected (pedigree-based) IBD relationships. Still, even for such data, the genomic IBD model could estimate genetic variance more accurately than the classical pedigree-based analysis. Hence, although the genomic IBD relationships are not necessarily representative of the genetic covariance structure among sibs in this situation, they are still more informative than the pedigree-based relationships. In this setting, the differences between the classic pedigree and genomic IBD models increased with the size of the dataset (no practical difference with 200 individuals but a substantial difference with 1000 individuals). However, estimation of genetic variance within a single family was, as expected, highly prone to sampling effects. In Scenario 3, the real number of different breeding values represented within a single full-sib family is actually limited to four (two “effective alleles” per parent), which

genomic IBD model (or equivalent gametic model) requires only a single large family for proper and accurate estimation of heritability for quantitative traits. In contrast, classical pedigree-based estimation requires the establishment of a sizeable pedigreed population consisting of numerous full- and (preferably) half-sib families to produce estimates with acceptable accuracy. Furthermore, the proposed genomic IBD model is expected to be less affected by selection among parents and will facilitate the separation of genetic and non-genetic family effects (e.g., effects of common rearing).

Acknowledgements
This work was supported by the grant 203699 (New statistical tools for integrating and exploiting complex genomic and phenotypic data sets), financed by the Research Council of Norway. The helpful comments of two anonymous reviewers are gratefully acknowledged.

Author details
1 Nofima, P.O. Box 210, NO-1431 Ås, Norway. 2 Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, P.O. Box 5003, NO-1432 Ås, Norway.

Authors’ contributions
JØ was mainly responsible for conception and design of the study, data analysis and writing of the manuscript. THEM contributed to the design of the study and to writing and revision of the manuscript. Both authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Received: 1 February 2012 Accepted: 8 May 2012 Published: 8 May 2012

References
1. Fisher RA: The correlation between relatives on the supposition of Mendelian inheritance. Trans Royal Soc Edinburgh 1918, 52:399-433.
2. Hill WG: Variation in genetic composition in backcrossing programs. J Heredity 1993, 84:212-213.
3. Ødegård J, Meuwissen THE, Heringstad B, Madsen P: A simple algorithm to estimate genetic variance in an animal threshold model using Bayesian inference. Genet Sel Evol 2010, 42:29.
4. Gagnon A, Beise J, Vaupel JW: Genome-wide identity-by-descent sharing among CEPH siblings. Genet Epidem 2005, 29:215-224.
5. Visscher PM, Medland SE, Ferreira MAR, Morley KI, Zhu G, Cornes BK,
explains the large between-replicate deviations in the Montgomery GW, Martin NG: Assumption-free estimation of heritability
estimated heritability. Thus, in real data, for which the from genome-wide identity-by-descent sharing between full siblings.
PLoS Genet 2006, 2:e41.
underlying genetics of the trait is generally unknown, it 6. Goddard M: Genomic selection: prediction of accuracy and maximisation
is recommended to use more than one family for quan- of long term response. Genetica 2009, 136:245-257.
titative genetic analysis, even when applying the geno- 7. Gilmour AR, Gogel BJ, Cullis BR, Thompson R: ASReml user guide release 3.0.
30 edition. Hemel Hempstead: VSN International Ltd; 2009.
mic IBD approach. 8. Aguilar I, Misztal I, Johnson DL, Legarra A, Tsuruta S, Lawlor TJ: Hot topic: A
unified approach to utilize phenotypic, full pedigree, and genomic
Conclusions information for genetic evaluation of Holstein final score. J Dairy Sci
2010, 93:743-752.
The proposed genomic IBD method is particularly rele- 9. Christensen O, Lund M: Genomic prediction when some animals are not
vant for quantitative genetic studies aiming at estimating genotyped. Genet Sel Evol 2010, 42:2.
additive genetic variance of highly fecund species, using 10. Meuwissen THE, Hayes BJ, Goddard ME: Prediction of total genetic value
using genome-wide dense marker maps. Genetics 2001, 157:1819-1829.
data on populations with limited pedigree information 11. Falconer DS, Mackay TFC: Introduction to quantitative genetics Essex:
and/or few available parents. For example, genetic var- Longman Group Ltd; 1996.
iance may be estimated based on a few full-sib-families 12. Habier D, Fernando RL, Dekkers JCM: Genomic selection using low-density
marker panels. Genetics 2009, 182:343-353.
with parents sampled from the wild or from non-pedi- 13. Fernando R, Grossman M: Marker assisted selection using best linear
greed domesticated populations. In principle, the unbiased prediction. Genet Sel Evol 1989, 21:467-477.
14. Hill WG, Weir BS: Variation in actual relationship as a consequence of Mendelian sampling and linkage. Genet Res (Camb) 2011, 93:47-64.
15. Visscher PM, Macgregor S, Benyamin B, Zhu G, Gordon S, Medland S, Hill WG, Hottenga J-J, Willemsen G, Boomsma DI, Liu Y-Z, Deng H-W, Montgomery GW, Martin NG: Genome partitioning of genetic variation for height from 11,214 sibling pairs. Am J Human Genet 2007, 81:1104-1110.
16. Yang J, Manolio TA, Pasquale LR, Boerwinkle E, Caporaso N, Cunningham JM, de Andrade M, Feenstra B, Feingold E, Hayes MG, Hill WG, Landi MT, Alonso A, Lettre G, Lin P, Ling H, Lowe W, Mathias RA, Melbye M, Pugh E, Cornelis MC, Weir BS, Goddard ME, Visscher PM: Genome partitioning of genetic variation for complex traits using common SNPs. Nat Genet 2011, 43:519-525.
17. Moen T, Baranski M, Sonesson AK, Kjøglum S: Confirmation and fine-mapping of a major QTL for resistance to infectious pancreatic necrosis in Atlantic salmon (Salmo salar): population-level associations between markers and trait. BMC Genomics 2009, 10:368.
18. Houston RD, Haley CS, Hamilton A, Guy DR, Mota-Velasco JC, Gheyas AA, Tinch AE, Taggart JB, Bron JE, Starkey WG, McAndrew BJ, Verner-Jeffreys DW, Paley RK, Rimmer GSE, Tew IJ, Bishop SC: The susceptibility of Atlantic salmon fry to freshwater infectious pancreatic necrosis is largely explained by a major QTL. Heredity 2010, 105:318-327.

doi:10.1186/1297-9686-44-16
Cite this article as: Ødegård and Meuwissen: Estimation of heritability
from limited family data using genome-wide identity-by-descent
sharing. Genetics Selection Evolution 2012 44:16.
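
Illustrative sketch. The single-QTL argument in the Discussion (a full-sib family holds at most four distinct breeding values, and within-family genetic variance differs between families while averaging half the additive genetic variance) can be checked numerically. The Python sketch below is not the simulation design of the study. The Gaussian allele effects, the function names `within_family_variance` and `simulate`, and all parameter values are assumptions made only for illustration.

```python
import random
import statistics


def within_family_variance(sire, dam):
    """Genetic variance among full sibs for a trait controlled by one QTL.

    `sire` and `dam` are pairs of allele effects (the two "effective alleles"
    of each parent). Each parent transmits either allele with probability 1/2,
    so the four sums below are equally likely offspring genetic values.
    """
    values = [s + d for s in sire for d in dam]  # at most four distinct values
    return statistics.pvariance(values)


def simulate(n_families=20000, sigma2_a=1.0, seed=1):
    """Within-family genetic variance for many independent full-sib families.

    Assumption (not from the paper): allele effects are drawn independently
    from a normal distribution with variance sigma2_a / 2, so that the additive
    genetic variance in the population is sigma2_a.
    """
    rng = random.Random(seed)
    sd_allele = (sigma2_a / 2.0) ** 0.5
    variances = []
    for _ in range(n_families):
        sire = (rng.gauss(0.0, sd_allele), rng.gauss(0.0, sd_allele))
        dam = (rng.gauss(0.0, sd_allele), rng.gauss(0.0, sd_allele))
        variances.append(within_family_variance(sire, dam))
    return variances


if __name__ == "__main__":
    v = simulate()
    print("mean within-family variance:", statistics.mean(v))
    print("smallest and largest family:", min(v), max(v))
```

Under these assumptions the mean across families should lie close to half of `sigma2_a`, while individual families range from almost no segregating variance to several times the average. This is the sampling problem that motivates using more than one family when the genetic architecture of the trait is unknown.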
