
CREDIBILITY USING SEMIPARAMETRIC MODELS

BY VIRGINIA R. YOUNG

School of Business
University of Wisconsin-Madison

ABSTRACT

To use Bayesian analysis to model insurance losses, one usually chooses a parametric conditional loss distribution for each risk and a parametric prior distribution to describe how the conditional distributions vary across the risks. A criticism of this method is that the prior distribution can be difficult to choose and the resulting model may not represent the loss data very well. In this paper, we apply techniques from nonparametric density estimation to estimate the prior. We use the estimated model to calculate the predictive mean of future claims given past claims. We illustrate our method with simulated data from a mixture of a lognormal conditional over a lognormal prior and find that the estimated predictive mean is more accurate than the linear Bühlmann credibility estimator, even when we use a conditional that is not lognormal.

KEYWORDS

Kernel density estimation, claim estimation, Bayesian estimation.

1. INTRODUCTION

In a portfolio of insurance policyholders (also called risks), risks are heterogeneous; that is, the insurance losses of different risks follow different loss distributions. The premium an insurer charges a given risk depends on the information available concerning the loss distribution of that risk. If the insurer knew the exact loss distribution of a risk, then the appropriate net premium to charge would be the expectation of that loss distribution. On the other hand, if the insurer has no information about a specific policyholder, then the net premium is the expectation over the entire portfolio of policyholders. For the situation between these two extremes, if the insurer has prior claim data for the risk, then the net premium is the conditional expectation of future claims given the prior claims.

To use Bayesian analysis to model insurance losses, one usually chooses a parametric conditional loss distribution for each risk and a parametric prior distribution to describe how the conditional distributions vary across the risks. A criticism of this method is that the prior distribution can be difficult to choose and the resulting model may not represent the loss data very well. One method of circumventing this problem is to apply empirical Bayesian analysis, in which one uses the data to estimate the parameters of the model (Klugman, 1992).
In this paper, we use a semiparametric mixture model to represent the
insurance losses of a portfolio of risks: We choose a flexible parametric
conditional loss distribution for each risk with unknown conditional mean that
varies across the risks. This conditional distribution may depend on parameters
other than the mean, and we use the data to estimate those parameters. Then, we
apply techniques from nonparametric density estimation to estimate the
distribution of the conditional means.
In Section 2, we describe a mixture model for insurance claims and estimate
the prior density using kernel density estimation. In Section 3, we calculate the
credibility estimator assuming squared-error loss and also give the projection of
that estimator onto the space of linear functions. Finally, in Section 4, we apply
our methodology to simulated data from a mixture of a lognormal conditional
over a lognormal prior. We show that our method can lead to good credibility
formulas, as measured by the mean squared error of the claim predictor, even
when we use a gamma conditional instead of a lognormal conditional.

2. SEMIPARAMETRIC MIXTURE MODEL

2.1. Notation and Assumptions


Assume that the underlying claim of risk $i$ per unit of exposure is a conditional random variable $Y|\theta_i$, $i = 1, 2, \ldots, r$, with probability density function $f(y|\theta_i)$. For each of the $r$ risks, we observe the average claims per unit of exposure $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{in_i})$ with an associated exposure vector $\mathbf{w}_i = (w_{i1}, w_{i2}, \ldots, w_{in_i})$, $i = 1, 2, \ldots, r$. Thus, the observed average claim $x_{ij}$ is the arithmetic average of $w_{ij}$ claims, each of which is an independent realization of the conditional random variable $Y|\theta_i$. For example, if a risk is a group policyholder, then $x_{ij}$ may be the average claim per insured member of the group in the $j$th policy period, and $w_{ij}$ is the number of members in the group during the $j$th policy period. For the data from Hachemeister (1975), a risk is the collection of insureds in a particular state covered by bodily injury automobile insurance, $x_{ij}$ represents the average claim severity during period $j$, and $w_{ij}$ is the corresponding number of claims.
Assume that the parameter $\theta$ is the conditional mean, $E[Y|\theta] = \theta$. There may be other parameters that characterize the conditional distribution, such as the shape parameter $\alpha$ for the gamma density. However, in this paper, we assume that parameters, other than the conditional mean, are fixed across the risks. The loss distribution of a given risk is, therefore, characterized by its conditional mean, although that mean is generally unknown. Denote the probability density function of $\theta$ by $\pi(\theta)$, also called the structure function (Bühlmann, 1970). The structure function characterizes how the conditional mean $\theta$ varies from risk to risk. We argue that assuming $\theta$ to be continuous is reasonable because in the Bayesian paradigm, our uncertainty about $\theta$ for any particular risk would be represented by a continuous random variable. Also, if $r$ is large, then the variable $\theta$ can be well approximated by a continuous random variable. Even if $r$ is not large, the collection of $r$ risks may be a sample from a larger population of risks whose distribution can be approximated by a continuous distribution. Assume that the experience of different risks is independent.
Note that our model is a special case of the one given by Bühlmann and Straub (1970). Because $X_{ij}$ is the random variable of an average of $w_{ij}$ iid claims $Y_1, Y_2, \ldots, Y_{w_{ij}}$, given $\theta_i$, we have that $E[X_{ij}|\theta_i] = E[Y|\theta_i] = \theta_i$ is independent of the period $j$. It also follows that

$$\mathrm{Cov}[X_{ij}, X_{ik}|\theta_i] = \begin{cases} \dfrac{\mathrm{Var}[Y|\theta_i]}{w_{ij}}, & \text{if } j = k, \\[4pt] 0, & \text{if } j \neq k, \end{cases}$$

as in the Bühlmann-Straub model. In the literature, $E[Y|\theta_i]$ is called the hypothetical mean and $\mathrm{Var}[Y|\theta_i]$ the process variance. Note that we assume the observations for a risk arise as arithmetic averages of an underlying claim random variable $Y|\theta$, while Bühlmann and Straub (1970) do not assume this in their more general model.
The goal of credibility theory is to predict the future claim $y$ (or an average of future claims) of a risk, given that the risk's claim experience is $\mathbf{x}$ and exposure $\mathbf{w}$. In this paper, we restrict our attention to credibility formulas that are functions of a single statistic because they are easier to estimate and to use. We choose the sample mean as our statistic, $\bar{x}_i = \frac{\sum_{j=1}^{n_i} w_{ij} x_{ij}}{\sum_{j=1}^{n_i} w_{ij}}$, because the claim experience $\mathbf{x}$ is a vector of averages. However, we do not restrict a claim estimator to be linear.
To pick a parametric conditional distribution for $Y|\theta$, we use the following criteria:
• $E[Y|\theta] = \theta$.
• The sample mean is a sufficient statistic for $\theta$.
• The functional form of $f(y|\theta)$ is closed under averaging. That is, if $X$ is an average of $w$ claims that follow the distribution given by $f(y|\theta)$, then the density of $X$ has the same functional form as $f(y|\theta)$.
Three such families of densities are commonly used in actuarial science to model insurance losses: (1) the normal, with mean $\theta$ and fixed variance $\sigma^2$; (2) the gamma, with mean $\theta = \alpha/\beta$ and fixed shape parameter $\alpha$; and (3) the inverse gaussian, with mean $\theta$ and fixed $\lambda = \theta^3/\mathrm{Var}[Y|\theta]$. Indeed, $Y|\theta \sim N(\theta, \sigma^2)$ implies that if $X$ is an average of $w$ iid claims $Y_1, Y_2, \ldots, Y_w$, given $\theta$, then $X|\theta \sim N(\theta, \sigma^2/w)$. Similarly, if $Y|\theta \sim G(\theta, \alpha)$, then $X|\theta \sim G(\theta, w\alpha)$, and the probability density function of $Y|\theta$ is

$$f(y|\theta) = \frac{(\alpha/\theta)^{\alpha}}{\Gamma(\alpha)}\, y^{\alpha-1} e^{-\alpha y/\theta}, \quad y > 0.$$

Finally, if $Y|\theta \sim InvG(\theta, \lambda)$, then $X|\theta \sim InvG(\theta, w\lambda)$ and the probability density function of $Y|\theta$ is

$$f(y|\theta) = \sqrt{\frac{\lambda}{2\pi y^3}}\, \exp\!\left(-\frac{\lambda (y-\theta)^2}{2 y \theta^2}\right), \quad y > 0.$$

We use the family of gamma conditional distributions in an example in Section 4.


In practice, one might use the normal conditional if the conditional variance is assumed constant across the risks. One might use the gamma conditional if the conditional coefficient of variation is assumed constant across risks, or the inverse gaussian conditional if one wanted to use a loss distribution with a long tail. Note that for these three families, the predictive mean is a function of the sample mean for any prior distribution $\pi$. See Young (1997) for examples of credibility estimators that are functions of a one-dimensional sufficient statistic, not necessarily the sample mean.

In the Bayesian spirit, for a given loss function $L = L(y, d(\bar{x}))$ of the future claim $y$ and the claim predictor $d$, we propose that the credibility estimator $d$ be the function that minimizes the expected loss

$$E[L(y, d(\bar{x}))],$$

in which we take the expectation with respect to the joint density of the sample mean and future claim. In our mixture model, this joint density is $\int f(y|\theta)\, f(\bar{x}|\theta)\, \pi(\theta)\, d\theta$. Therefore, we require an estimate of the density $\pi(\theta)$.

2.2. Kernel Density Estimation

We use kernel density estimation (Silverman, 1986) to estimate the probability density $\pi(\theta)$. A kernel $K$ acts as a weight function and satisfies the condition

$$\int K(t)\, dt = 1.$$

If we were to observe directly the conditional means $\theta_1, \theta_2, \ldots, \theta_r$, then the kernel density estimate of $\pi(\theta)$ with kernel $K$ would be given by

$$\hat{\pi}(\theta) = \frac{1}{r} \sum_{i=1}^{r} \frac{1}{h_i} K\!\left(\frac{\theta - \theta_i}{h_i}\right), \quad (2.1)$$

in which $h_i$ is a positive parameter called the window width, or bandwidth. Assume that the kernel is symmetric; therefore, the expectation of $\theta$ under the estimated density is the sample mean of the $\theta_i$.
Because we observe only data $\mathbf{x}_i$ and $\mathbf{w}_i$, and not the true conditional means $\theta_i$, we rely on the law of large numbers and use the sample mean $\bar{x}_i$ to estimate $\theta_i$ consistently, $i = 1, 2, \ldots, r$ (Serfling, 1980). In the expression in (2.1), one may wish to weight the terms in the sum according to the relative number of claims for the $i$th risk, so that the expectation of $\theta$ is the sample mean

$$\bar{x} = \frac{\sum_{i=1}^{r} w_i \bar{x}_i}{\sum_{i=1}^{r} w_i} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{n_i} w_{ij} x_{ij}}{\sum_{i=1}^{r} \sum_{j=1}^{n_i} w_{ij}},$$

in which $w_i = \sum_{j=1}^{n_i} w_{ij}$. We, therefore, propose the following kernel density estimator for $\pi(\theta)$:

$$\hat{\pi}(\theta) = \sum_{i=1}^{r} \frac{w_i}{w_{tot}} \frac{1}{h_i} K\!\left(\frac{\theta - \bar{x}_i}{h_i}\right), \quad (2.2)$$

in which $w_{tot} = \sum_{i=1}^{r} w_i = \sum_{i=1}^{r} \sum_{j=1}^{n_i} w_{ij}$. See the Appendix for a discussion of the asymptotic mean square consistency of $\hat{\pi}(\theta)$.
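To make (2.2) concrete, the following is a minimal sketch in Python (ours, not from the paper); the array names xbar, w, and h, holding the per-risk sample means $\bar{x}_i$, exposures $w_i$, and bandwidths $h_i$, are hypothetical.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)

def epanechnikov(t):
    """Epanechnikov kernel of (2.3): 3/(4*sqrt(5)) (1 - t^2/5) on (-sqrt(5), sqrt(5))."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) < SQRT5, 3.0 / (4.0 * SQRT5) * (1.0 - t**2 / 5.0), 0.0)

def prior_density_estimate(theta, xbar, w, h, kernel=epanechnikov):
    """Weighted kernel estimate (2.2):
    pi_hat(theta) = sum_i (w_i / w_tot) (1 / h_i) K((theta - xbar_i) / h_i)."""
    theta = np.atleast_1d(theta).astype(float)
    weights = w / w.sum()                               # w_i / w_tot
    u = (theta[None, :] - xbar[:, None]) / h[:, None]   # one row per risk i
    return (weights[:, None] / h[:, None] * kernel(u)).sum(axis=0)
```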
Two commonly used symmetric kernels are (1) the Gaussian kernel, $G$,

$$G(t) = \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}, \quad -\infty < t < \infty,$$

and (2) the Epanechnikov kernel, $Epa$,

$$Epa(t) = \begin{cases} \dfrac{3}{4\sqrt{5}} \left(1 - \dfrac{t^2}{5}\right), & -\sqrt{5} < t < \sqrt{5}, \\[4pt] 0, & \text{else.} \end{cases} \quad (2.3)$$

In our example in Section 4, we use the Epanechnikov kernel because its domain is bounded, and we can, therefore, easily restrict the support of $\hat{\pi}(\theta)$ to lie in the positive real numbers.

Remark: The Epanechnikov kernel is optimal with respect to mean integrated square error (Silverman, 1986). The efficiency of the Gaussian kernel with respect to the optimal Epanechnikov kernel is roughly 95% (Silverman, 1986), so one does not lose much efficiency by using the Gaussian kernel. Silverman, therefore, suggests that one choose the kernel according to auxiliary requirements, such as ease of computation. •
There are many techniques for choosing the window width $h_i$; see, for example, Silverman (1986, Section 3.4) and Jones, Marron, and Sheather (1996). In our example in Section 4, we use a (modified) fixed window width selected by reference to a standard distribution (Silverman, 1986, Section 3.4.2). The window width $h$ that minimizes the mean integrated squared error is given by

$$h = \left\{\int t^2 K(t)\, dt\right\}^{-2/5} \left\{\int K(t)^2\, dt\right\}^{1/5} \left\{\int \pi''(\theta)^2\, d\theta\right\}^{-1/5} r^{-1/5}. \quad (2.4)$$

To approximate this optimal window width $h$, one assumes that $\pi(\theta)$ is, say, normal with mean 0 and standard deviation $\sigma$. In that case, the term $\int \pi''(\theta)^2\, d\theta$ equals $\frac{3}{8}\pi^{-1/2}\sigma^{-5}$. We modify the window width $h$ at each point $\bar{x}_i$ to ensure that the density has support on the nonnegative real numbers. Specifically, we set $h_i$ equal to $h$ if $h \le \bar{x}_i/\sqrt{5}$; otherwise, we set $h_i$ equal to $\bar{x}_i/\sqrt{5}$.
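Continuing the hypothetical sketch above, this bandwidth rule might be coded as follows; the kernel constants are those of the Epanechnikov kernel (2.3), and the normal-reference constant is the one just derived.

```python
def select_bandwidths(xbar):
    """Rule-of-thumb bandwidth from (2.4) with a normal reference for pi, then
    the per-risk truncation h_i = min(h, xbar_i / sqrt(5)) for nonnegative support."""
    r = len(xbar)
    q1, q3 = np.percentile(xbar, [25, 75])
    sigma = (q3 - q1) / 1.34            # robust estimate of the scale of pi
    # For (2.3): int t^2 K(t) dt = 1 and int K(t)^2 dt = 3/(5*sqrt(5)) ~ 0.268;
    # for a normal pi, int pi''(theta)^2 d theta = (3/8) pi^(-1/2) sigma^(-5) ~ 0.212 sigma^(-5).
    h = 0.268**0.2 * 0.212**(-0.2) * sigma * r**(-0.2)
    # Each kernel term has support (xbar_i - sqrt(5) h_i, xbar_i + sqrt(5) h_i),
    # so cap h_i at xbar_i / sqrt(5) to keep the estimated prior on [0, inf).
    return np.minimum(h, xbar / SQRT5)
```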


3. CREDIBILITY USING SQUARED-ERROR LOSS

In this section, we use squared-error loss to determine a credibility estimator, as is used in greatest accuracy credibility theory (Willmot, 1994; Herzog, 1996). The squared-error loss function has the form

$$L(y, d(\bar{x})) = (y - d(\bar{x}))^2.$$

It is straightforward to show that the minimizer of the expected loss is the predictive mean (Bühlmann, 1967), which in this case is the posterior mean of $\theta$ given the sample mean $\bar{x}$, which we estimate by

$$\hat{\mu}(\bar{x}) = \int E[Y|\theta]\, \hat{\pi}(\theta|\bar{x})\, d\theta = \hat{E}[\theta|\bar{x}].$$

For a general kernel $K$ and bandwidths $h_i$, this estimated posterior mean of $\theta$ can be written

$$\hat{\mu}(\bar{x}) = \frac{\int \theta f(\bar{x}|\theta)\, \hat{\pi}(\theta)\, d\theta}{\int f(\bar{x}|\theta)\, \hat{\pi}(\theta)\, d\theta} = \frac{\sum_{i=1}^{r} \frac{w_i}{w_{tot}\, h_i} \int \theta f(\bar{x}|\theta)\, K\!\left(\frac{\theta - \bar{x}_i}{h_i}\right) d\theta}{\sum_{i=1}^{r} \frac{w_i}{w_{tot}\, h_i} \int f(\bar{x}|\theta)\, K\!\left(\frac{\theta - \bar{x}_i}{h_i}\right) d\theta}. \quad (3.1)$$

Recall that $\bar{x}$ is an average of $w$ iid claims, each of which follows the density $f(y|\theta)$, as in Section 2.1. If we constrain the estimator $d$ to be linear, then it is well known that the least-squares linear estimator of $E[Y|\bar{x}] = E[\theta|\bar{x}]$ is

$$d(\bar{x}) = (1 - Z)\, E[Y] + Z \bar{x}, \quad (3.2)$$

in which $Z = \frac{w}{w + k}$ with $k = \frac{E[\mathrm{Var}[Y|\theta]]}{\mathrm{Var}[\theta]}$ (Bühlmann, 1967). Using our estimate for the prior density (2.2), we obtain $E[Y] = E[\theta] = \bar{x}$, as noted in Section 2.2. In the case of the normal conditional, $k = \frac{\sigma^2}{\mathrm{Var}[\theta]}$; in the case of the gamma conditional, $k = \frac{E[\theta^2]/\alpha}{\mathrm{Var}[\theta]}$; and in the case of the inverse gaussian conditional,

$$k = \frac{E[\theta^3]/\lambda}{\mathrm{Var}[\theta]}.$$
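For comparison, a sketch of the linear estimator (3.2) with a gamma conditional, where the moments of $\theta$ are taken under the estimated prior tabulated on a grid (our illustration; the names are ours):

```python
def buhlmann_linear(x, w_claims, theta, pi_hat, alpha):
    """Linear credibility estimator (3.2): d(x) = (1 - Z) E[Y] + Z x, with
    Z = w/(w + k) and, for a gamma conditional, k = (E[theta^2]/alpha)/Var[theta]."""
    m1 = trapezoid(theta * pi_hat, theta)       # E[theta] = E[Y]
    m2 = trapezoid(theta**2 * pi_hat, theta)    # E[theta^2]
    k = (m2 / alpha) / (m2 - m1**2)             # E[Var[Y|theta]] / Var[theta]
    Z = w_claims / (w_claims + k)
    return (1.0 - Z) * m1 + Z * x
```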


To end this section, we show that as $w$ approaches $\infty$, $\hat{\mu}(\bar{x})$ approaches the true expected value $\theta_0$ for the given risk. Because $\bar{X}|\theta$ has mean $\theta$ and variance $\frac{\mathrm{Var}[Y|\theta]}{w}$, under certain regularity conditions (DeGroot, 1970; Walker, 1969), the density $f(\bar{x}|\theta)$ approaches the delta function with its mass concentrated at the point $\bar{x} = \theta_0$. Then,

$$\hat{\mu}(\bar{x}) = \frac{\int \theta f(\bar{x}|\theta)\, \hat{\pi}(\theta)\, d\theta}{\int f(\bar{x}|\theta)\, \hat{\pi}(\theta)\, d\theta} \longrightarrow \theta_0.$$

Thus, as an actuary gets more claim information for a given policyholder ($w$ gets large), the estimated expected claim approaches the true expected claim with probability 1.

4. SIMULATED DATA FROM A LOGNORMAL-LOGNORMAL MIXTURE

The lognormal distribution is used by actuaries to model the distribution of claim severity. It is also used to model the distribution of total claims in some lines of insurance, such as health insurance. In this section, we assume that we are given individual claim data; that is, $w_{ij} = 1$ for all risks $i$ and policy periods $j$, and $X = Y$. We model the lognormal-lognormal mixture as follows:

$$f(x|\phi) = \frac{1}{\sqrt{2\pi}\, \sigma x} \exp\!\left(-\frac{(\ln x - \ln \phi)^2}{2\sigma^2}\right), \quad x > 0,$$

in which $\sigma > 0$ is a known parameter, and

$$\pi(\phi) = \frac{1}{\sqrt{2\pi}\, \tau \phi} \exp\!\left(-\frac{(\ln \phi - \ln \mu)^2}{2\tau^2}\right), \quad \phi > 0,$$

in which $\mu > 0$ and $\tau > 0$ are known parameters. That is, $(\ln X)|\phi \sim N(\ln \phi, \sigma^2)$, and $\ln \phi \sim N(\ln \mu, \tau^2)$. The marginal distribution of $X$ is lognormal; $\ln X \sim N(\ln \mu, \sigma^2 + \tau^2)$.
Given claim data for a specific policyholder, $\mathbf{x} = (x_1, x_2, \ldots, x_n) \in [0, \infty)^n$, the posterior distribution of $\phi|\mathbf{x}$ is lognormal; $(\ln \phi)|\mathbf{x} \sim N(\ln \mu^*, \tau^{*2})$, in which

$$\mu^* = \exp\!\left(\frac{\sigma^2 \ln \mu + \tau^2 \sum_{i=1}^{n} \ln x_i}{\sigma^2 + n\tau^2}\right)$$

and

$$\tau^{*2} = \frac{\sigma^2 \tau^2}{\sigma^2 + n\tau^2}.$$

Thus, the predictive distribution of $X_{n+1}|\mathbf{x}$ is lognormal; $(\ln X_{n+1})|\mathbf{x} \sim N(\ln \mu^*, \sigma^2 + \tau^{*2})$. It follows that the true predictive mean is a function of the observed claims:

$$\mu(\mathbf{x}) = \mu^* \exp\!\left(\frac{\sigma^2 + \tau^{*2}}{2}\right) = \mu^* \exp\!\left(\frac{\sigma^2 (\sigma^2 + (n+1)\tau^2)}{2(\sigma^2 + n\tau^2)}\right). \quad (4.1)$$
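As a sketch, (4.1) translates directly into code (ours; the names are our choices, and NumPy from the earlier snippets is assumed):

```python
def true_predictive_mean(x, mu, sigma2, tau2):
    """True predictive mean (4.1) of the lognormal-lognormal mixture, given a
    vector x of n observed claims for one policyholder."""
    x = np.atleast_1d(x).astype(float)
    n = len(x)
    denom = sigma2 + n * tau2
    log_mu_star = (sigma2 * np.log(mu) + tau2 * np.log(x).sum()) / denom
    tau2_star = sigma2 * tau2 / denom          # posterior variance of ln(phi)
    # E[X_{n+1} | x] = mu* exp((sigma^2 + tau*^2) / 2)
    return np.exp(log_mu_star + 0.5 * (sigma2 + tau2_star))
```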

We performed 200 simulations of a lognormal-lognormal mixture of claims. We let $\sigma^2 = 0.25$, $\tau^2 = 0.50$, and $\mu = 2000 e^{-0.25}$. The marginal expectation of $X$ is 2267, and the marginal standard deviation is 2395. For each simulation run, we simulated claim data from this lognormal-lognormal mixture for $r = 100$ risks (values of $\phi$). For each of the 100 risks, we simulated $n_i = w_i = 5$ claims. To estimate the distribution of the conditional means, we used kernel density estimation with the Epanechnikov kernel, as given by (2.3). Also, we used a fixed window width $h$, chosen by reference to a normal distribution with mean 0 and standard deviation $\sigma$. We estimated the standard deviation by the interquartile range of the sample means, $R$, divided by 1.34 (Silverman, 1986, Section 3.4). The bandwidth $h$ was calculated by

$$h = (1)^{-2/5} (0.268)^{1/5} (0.212)^{-1/5}\, \frac{R}{1.34}\, 100^{-1/5} \approx 0.312 R,$$

as in (2.4). We truncated this bandwidth $h$ for a given risk if, by otherwise using it, the prior density would have negative support. Specifically, if $h > \bar{x}_i/\sqrt{5}$, then we set the bandwidth $h_i$ equal to $\bar{x}_i/\sqrt{5}$ to guarantee that the support of the estimated density of $\theta$ be contained in the nonnegative real numbers, as described in Section 2.2.
Instead of assuming that the conditional is lognormal, we assumed that the coefficient of variation is constant from risk to risk and, therefore, fit a gamma conditional to each risk. In each simulation run, we estimated the parameter $\alpha$ by the median of the following sample statistic:

$$\frac{\bar{x}_i^2}{\frac{1}{n_i - 1} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2}.$$

We used the estimated prior density along with the gamma conditional to estimate the marginal density of $X$.
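A sketch of one simulation run and of this moment-based estimate of $\alpha$ (our code, using the parameter values above; the seed is arbitrary):

```python
rng = np.random.default_rng(0)
sigma2, tau2, mu = 0.25, 0.50, 2000 * np.exp(-0.25)
r, n = 100, 5

# Draw r conditional parameters phi, then n lognormal claims per risk.
phi = np.exp(rng.normal(np.log(mu), np.sqrt(tau2), size=r))
claims = np.exp(rng.normal(np.log(phi)[:, None], np.sqrt(sigma2), size=(r, n)))

xbar_i = claims.mean(axis=1)
s2_i = claims.var(axis=1, ddof=1)
# For a gamma conditional, alpha = 1 / CV^2, estimated robustly by the median.
alpha_hat = np.median(xbar_i**2 / s2_i)
```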
We used the estimated mixture model to estimate the predictive mean of $X_{n+1}$ given claim data $\mathbf{x}$. We also computed the Bühlmann credibility estimator, $lin(x)$, for which we estimated the expected process variance by

$$EPV = \frac{1}{100(5-1)} \sum_{i=1}^{100} \sum_{j=1}^{5} (x_{ij} - \bar{x}_i)^2$$

and the variance of the hypothetical means by

$$VHM = \frac{1}{100 - 1} \sum_{i=1}^{100} (\bar{x}_i - \bar{x})^2 - \frac{EPV}{5}$$

(Willmot, 1994, Section 5.1).
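Continuing the simulation sketch above, and under our reconstruction of the $EPV$ and $VHM$ formulas, these quantities might be computed as follows:

```python
EPV = ((claims - xbar_i[:, None])**2).sum() / (r * (n - 1))
VHM = xbar_i.var(ddof=1) - EPV / n   # (1/(r-1)) sum_i (xbar_i - xbar)^2 - EPV/n
k_hat = EPV / VHM

def lin(x):
    """Linear Buhlmann estimator (3.2) for a single future claim (w = 1)."""
    Z = 1.0 / (1.0 + k_hat)
    return (1.0 - Z) * xbar_i.mean() + Z * x
```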


TABLE 4.1
DESCRIPTIVE STATISTICS OF h, MSE, MSEB, AND RATIO

Variable    Mean      Median    StDev     Q1        Q3
h           564.35    561.00    91.64     500.25    623.75
MSE         16,450    12,111    13,146    7,808     21,623
MSEB        74,559    69,595    37,539    44,466    94,878
Ratio       0.2984    0.1777    0.3239    0.0890    0.3819

For $n = w = 1$, we compared the estimated predictive mean, $\hat{\mu}(x)$, and the Bühlmann credibility estimator, $lin(x)$, with the true predictive mean, $\mu(x)$. To compare these credibility estimators numerically, for each of the 200 simulation runs, we calculated the mean squared errors up to the 95th percentile of $X$, namely 6,500:

$$MSE = \int_0^{6500} (\hat{\mu}(x) - \mu(x))^2 f(x)\, dx \quad \text{and} \quad MSEB = \int_0^{6500} (lin(x) - \mu(x))^2 f(x)\, dx.$$

See Table 4.1 for descriptive statistics of the bandwidth $h$; the mean squared errors, $MSE$ and $MSEB$; and the ratio of $MSE$ to $MSEB$, Ratio.
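A sketch of this comparison, continuing the snippets above (our code): the squared errors are weighted by the true marginal density of $X$ and integrated numerically up to 6,500.

```python
from scipy.integrate import quad
from scipy.stats import lognorm

# Marginal of X: ln X ~ N(ln mu, sigma^2 + tau^2).
marginal = lognorm(s=np.sqrt(sigma2 + tau2), scale=mu)

def weighted_mse(estimator, upper=6500.0):
    """Integral of (estimator(x) - mu(x))^2 f(x) dx over (0, upper), n = w = 1."""
    def integrand(x):
        err = estimator(x) - true_predictive_mean(x, mu, sigma2, tau2)
        return err**2 * marginal.pdf(x)
    return quad(integrand, 0.0, upper)[0]

# e.g., MSEB = weighted_mse(lin); MSE = weighted_mse(lambda x: predictive_mean_hat(
#     x, 1, alpha_hat, xbar_i, np.full(r, float(n)), select_bandwidths(xbar_i)))
```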
Thus, we see that up to the 95th percentile, on average, our estimated predictive mean performs much better than the linear Bühlmann credibility estimator. See Figure 4.1 for a scatter plot of $MSE$ versus $h$. Note the quadratic relationship between the two variables and that the minimum of $MSE$ occurs near the average value of $h$, 564. We fit a quadratic to these observations by minimizing the sum of the absolute values of the errors and obtained the fitted model

$$MSE = 196{,}603 - 691.36\, h + 0.6402\, h^2,$$

[Figure: scatter plot of the 200 simulated $(h, MSE)$ pairs, with $h$ ranging from 300 to 900 and $MSE$ from 0 to 80,000, and the fitted quadratic superimposed.]

FIGURE 4.1: Scatter Plot of MSE versus h with Quadratic Superimposed.


with vertex at 542. See Figure 4.1 for a graph of this quadratic superimposed on a
scatter plot of the observations.
We also computed some of the mean squared errors up to the 99th percentile and found that the estimated predictive mean compared poorly relative to the Bühlmann credibility estimator. We conclude that our estimate of the prior density at larger conditional means may suffer. Silverman (1986) suggests a variable bandwidth approach for estimating densities with long tails, which uses larger bandwidths in the regions of lower density. We tried this method without increased accuracy in the upper percentiles of our claim estimator. We suspect that the poor fit at the higher percentiles may be due to our using a medium-tailed gamma conditional to model a heavy-tailed lognormal. We encourage the interested reader to investigate using an inverse gaussian instead of a gamma conditional to model the conditional claim distribution.

[Figure: estimated and true marginal densities of $X$, plotted for claim amounts from 1,300 to 6,500.]

FIGURE 4.2: Estimated and True Marginal Densities of Claims.
See Figure 4.2 for graphs of the estimated and true marginal densities of $X$ for one of the simulations¹. Of the graphs we plotted, Figure 4.2 is typical, in that the estimated marginal density of $X$ is less skewed than the true density.

See Figure 4.3 for the corresponding graphs of the estimated and true predictive means. Notice how closely the estimated predictive mean follows the true predictive mean, compared with the linear Bühlmann estimator, for claims less than 4000. Also note how the estimated predictive mean diverges upward for claims larger than 4000. This phenomenon occurred in all of the several graphs that we plotted and is due, we believe, to the fact that we used a gamma conditional to estimate a lognormal. It may also be due to computational errors, because there are only a few simulated claims in the right tail. One way to adjust the estimated predictive mean to eliminate this divergence is to extend it linearly beyond some large value of the sample mean. Another solution may be to use a conditional distribution with a longer tail, such as the inverse gaussian. Yet another solution may be to apply my method of blending the criteria of accuracy and linearity (Young, 1997).

[Figure: estimated predictive mean (solid), linear Bühlmann estimator (dotted), and true predictive mean (dashed), plotted for claim amounts from 1,300 to 6,500.]

FIGURE 4.3: Credibility Estimators.

¹ In this run, $h = 476$, $MSE = 12{,}076$, and $MSEB = 84{,}571$. Recall that $n = 1$ and that the claim amount 6,500 is the 95th percentile of $X$.

5. SUMMARY AND CONCLUSIONS

The Bühlmann-Straub credibility method results in a linear estimator with a different slope (or credibility weight) for each risk. Therefore, to apply their method to a risk not used to construct the original model, one would be required to recalculate the model to obtain a linear estimator for the new risk. An advantage of our method is that it is applicable to risks outside the original data set, if one assumes that the average claims and corresponding exposures of the new risk come from the same parent (mixture) population as the data. Another advantage of our method is increased accuracy over a linear estimator, as demonstrated in the example in Section 4, even when we use an 'incorrect' conditional density.

One may wish to use the underlying mixture model and kernel density estimation in combination with other loss functions, such as a linear combination of a squared-error term and a second-derivative term to blend the goals of accuracy and linearity (Young, 1997). Also, it would be interesting to extend the model to include a trend component, as in Hachemeister (1975), and apply kernel density estimation in the more general model.
ACKNOWLEDGMENTS

I thank the Committee for Knowledge Extension and Research of the Society of
Actuaries (SOA) and the Actuarial and Education Research Fund for financial
support. I especially thank my SOA Project Oversight Group (Hans Gerber and
Gary Venter, led by Thomas Herzog and assisted by Warren Luckner of the
SOA) for helpful guidance.

APPENDIX
ASYMPTOTIC MEAN SQUARE CONSISTENCY OF (2.2)

Let $\tilde{\pi}(\theta) = \sum_{i=1}^{r} \frac{w_i}{w_{tot}} \frac{1}{h_i} K\!\left(\frac{\theta - \theta_i}{h_i}\right)$ denote the kernel density estimator of $\pi$ when we are given the observations $\theta_i$, $i = 1, 2, \ldots, r$. Consider the mean squared error of the density estimate $\hat{\pi}$ at a fixed value $\theta$, expanded about $\tilde{\pi}(\theta)$:

$$E[(\hat{\pi}(\theta) - \pi(\theta))^2] = E[(\hat{\pi}(\theta) - \tilde{\pi}(\theta))^2] + 2E[(\hat{\pi}(\theta) - \tilde{\pi}(\theta))(\tilde{\pi}(\theta) - \pi(\theta))] + E[(\tilde{\pi}(\theta) - \pi(\theta))^2].$$

By the law of large numbers (Serfling, 1980), $\bar{x}_i$ approaches $\theta_i$ with probability one as $w_i$ approaches infinity. Therefore, as $w_i$ approaches infinity, the first term in the mean squared error goes to zero. By Silverman (1986) or Thompson and Tapia (1990), the second and third terms go to zero as $r$ goes to infinity if

$$\lim_{r \to \infty} h_i = 0 \quad \text{and} \quad \lim_{r \to \infty} r h_i = \infty.$$

REFERENCES

BÜHLMANN, H. (1967), Experience rating and credibility, ASTIN Bulletin, 4: 199-207.
BÜHLMANN, H. (1970), Mathematical Models in Risk Theory, Springer-Verlag, New York.
BÜHLMANN, H. and E. STRAUB (1970), Glaubwürdigkeit für Schadensätze, Mitteilungen der Vereinigung Schweizerischer Versicherungs-Mathematiker, 70: 111-133.
DEGROOT, M. H. (1970), Optimal Statistical Decisions, McGraw-Hill, New York.
HACHEMEISTER, C. A. (1975), Credibility for regression models with application to trend, in Credibility: Theory and Applications (ed. P. M. Kahn), 129-163, Academic Press, New York.
HERZOG, T. L. (1996), Introduction to Credibility Theory, second edition, ACTEX, Abington, Connecticut.
JONES, M. C., J. S. MARRON, and S. J. SHEATHER (1996), A brief survey of bandwidth selection for density estimation, Journal of the American Statistical Association, 91: 401-407.
KLUGMAN, S. A. (1992), Bayesian Statistics in Actuarial Science with Emphasis on Credibility, Kluwer, Boston.
SERFLING, R. J. (1980), Approximation Theorems of Mathematical Statistics, Wiley, New York.
SILVERMAN, B. W. (1986), Density Estimation for Statistics and Data Analysis, Chapman & Hall, London.
THOMPSON, J. R. and R. A. TAPIA (1990), Nonparametric Function Estimation, Modeling, and Simulation, Society for Industrial and Applied Mathematics, Philadelphia.
WALKER, A. M. (1969), On the asymptotic behaviour of posterior distributions, Journal of the Royal Statistical Society, Series B, 31: 80-88.
WILLMOT, G. E. (1994), Introductory Credibility Theory, Institute of Insurance and Pension Research, University of Waterloo, Waterloo, Ontario.
YOUNG, V. R. (1997), Credibility using a loss function from spline theory: Parametric models with a one-dimensional sufficient statistic, to appear, North American Actuarial Journal.

VIRGINIA R. YOUNG
School of Business
975 University Avenue
University of Wisconsin-Madison
Madison, WI USA 53706
