Approximations of The Aggregate Loss Distribution
To CAS Members:
This is the Winter 2001 Edition of the Casualty Actuarial Society Forum. It
contains seven Ratemaking Discussion Papers, five Data Management/Data Quality/
Data Technology Call Papers, two committee reports, and three additional papers.
The Casualty Actuarial Society Forum is a non-refereed journal printed by the
Casualty Actuarial Society. The viewpoints published herein do not necessarily re-
flect those of the Casualty Actuarial Society.
The CAS Forum is edited by the CAS Committee for the Casualty Actuarial
Society Forum. Members of the committee invite all interested persons to submit
papers on topics of interest to the actuarial community. Articles need not be written
by a member of the CAS, but the paper's content must be relevant to the interests of
the CAS membership. Members of the Committee for the Casualty Actuarial Society
Forum request that the following procedures be followed when submitting an article
for publication in the Forum:
The CAS Forum is printed periodically based on the number of call paper
programs and articles submitted. The committee publishes two to four editions dur-
ing each calendar year.
All comments or questions may be directed to the Committee for the Casualty
Actuarial Society Forum.
Sincerely,
The Winter 2001 Edition of the CAS Forum is a cooperative effort between
the CAS Forum Committee and two CAS Research and Development Committees:
the Committee on Ratemaking and the Committee on Management Data and Infor-
mation.
The CAS Committee on Ratemaking presents for discussion seven papers pre-
pared in response to its Call for 2001 Ratemaking Discussion Papers. In addition, the
Committee on Management Data and Information presents three papers submitted in
response to the 2001 Call for Data Management/Data Quality/Data Technology Pa-
pers.
This Forum includes papers that will be discussed by the authors at the 2001
CAS Seminar on Ratemaking, March 12-13, in Las Vegas, Nevada.
Abstract
The goal of ratemaking methodologies is to estimate the future expected costs for a
book of business. Using past experience, including both internal and external data,
the actuary attempts to quantify the required premium level to achieve an acceptable
profit.
However, if one looks at the rate activity in a market, it is apparent that company
actions do not always follow the indications. Surprisingly, such decisions often lead
to successful results. It seems that there must be something going on that is invisible
to the naked eye. Do indications really mean so little? Or are there other factors,
buried in the actuarial judgment of the experienced actuary but difficult to quantify?
It is the premise of this paper that such factors do indeed exist. One such factor is the
effect of the rate change on market behavior. In this paper, we will describe one
method for quantifying some of this effect. The methodology described will require
much research to determine reasonable assumptions before it can be used in practice. It
is our hope that it will stimulate further discussion, research, and a move toward
acceptance of dynamic economic principles in ratemaking.
RATEMAKING FOR MAXIMUM PROFITABILITY
Introduction
The goal of ratemaking methodologies is to estimate the future expected costs for a
book of business. Using past experience, including both internal and external data,
the actuary attempts to quantify the required premium level to achieve an acceptable
profit.
However, if one looks at the rate activity in a market, it is apparent that company
actions do not always follow the indications. It is not uncommon for a company to
leave rates relatively unchanged even with large indicated increases. To the purist,
such action seems illogical.
Even more surprising is the fact that such a decision often leads to successful results.
It seems that there must be something going on that is invisible to the naked eye. Do
indications really mean so little? Or are there other factors, buried in the actuarial
judgment of the experienced actuary but difficult to quantify?
It is the premise of this paper that such factors do indeed exist. The great unknown in
current actuarial methodologies is a function of the fact that the future book of
business is not necessarily the same as the historical one. To the extent that the
nature of the book changes, traditional methodologies are inadequate. They are using
outdated data, based on policies that no longer figure in future costs.
For very moderate rate changes, this distortion may be minimal. However, large
revisions may cause significant changes in the nature of a company's book. This is
logical, because a company does not operate in a vacuum. One company's actions
can have an effect on the actions of its competitors, and (perhaps more importantly)
will have an effect on the behavior of its own policyholders. Will current customers
renew? Will new business levels be affected?
Therefore, one major factor in future results is the effect of the rate change on market
behavior. In this paper, we will describe one method for quantifying some of this
effect. The methodology described will require much research to determine
reasonable assumptions before it can be used in practice. It is our hope that it will
stimulate further discussion, research, and a move toward acceptance of dynamic
economic principles in ratemaking.
Every actuary knows the basic steps to developing rate indications. In a loss ratio
method, premium is adjusted to current rate level, and trended if the exposure base is
inflation-sensitive. Accident year losses are developed to ultimate, trended, and
adjusted to reflect catastrophe risk. Credibility of the data is considered, and external
data is used if necessary. Loss adjustment expenses are loaded by some method
(often being treated the same as losses), and underwriting expenses are reflected. Of
course, determination of the profit provision is an important, and often controversial,
step. The projected loss ratio is then compared to the target loss ratio to determine
the indicated rate level change.
Pure premiums methods use a similar procedure, except that projected indicated
average premiums are compared to current average premiums to determine the
indicated change.
Both approaches implicitly assume that:
• The future book of business will have essentially the same characteristics as the
current (or historical) one.
• Rate changes will have no effect on the actions of other companies in the market.
• The indicated rate level change (if taken) will be equal to the change in premium
volume.
• Rate changes will have no effect on the company's retention or ability to write
new business.
• The profit provision can be determined academically, rather than being dictated
by the market.
The assumption concerning profit provisions deserves further comment. The idea of
setting a regulated profit provision is a function of the insurance industry's history.
For much of the 20th century, competition in the industry was limited. For roughly
the first ¾ of the century, bureau rates were not uncommon. In this environment, the
filed rate was the same for most (if not all) of the market. Therefore, it was
impossible for the market to gravitate to a profit level determined by competition.
This led to a "public utility" attitude toward regulation. In that environment, a
regulated profit provision is not unreasonable - the consumer needs more protection
when market power is absent. However, in the current, increasingly competitive
market, it makes sense for insurance markets to work like other private industries.
There are many ways that one could attempt to reflect the inadequacies of traditional
ratemaking assumptions. This paper will address the profit margin as a tool for
reflecting the dynamic nature of the market. By using such statistics as price
elasticity, we will quantify, to some extent, the way that economic forces operate on
the profit margin. As you will see, such an approach would allow us to have
indications that more reasonably reflect the probable future results.
Price Elasticity
Price elasticity is defined as the percentage decline in the units of a good sold for
every percentage increase in the price.¹ Let X = price elasticity, the ratio of the
percentage change in units sold to the percentage change in price. The traditional
indication formula (1) implicitly assumes that:
• Rate changes will only impact average premium and not policy counts.
• Profit provisions should be determined in advance.
In a highly elastic marketplace, this approach can fail for two reasons:
Companies may be more concerned with maximizing profit and/or market share than
profit margin. Note this is not always the case in insurance because it is so capital
intensive. However, while additional volume may require additional capital, it also
implies future profits (or losses) in the present value of the renewals. Therefore, an
insurance company might reasonably seek to write $10 million at a 3% margin rather
than $5 million at a 5% margin. Since traditional formulas determine the profit margin
in advance, volume considerations are explicitly ignored.
The Internet is an even more interesting example. A policy sold through a company
web site would have an even greater percentage of fixed expenses, with virtually all
acquisition costs as fixed.
With a high proportion of fixed costs, firms could receive significant short-term benefits
to the income statement by increasing volume and spreading fixed costs over
additional premium.
Of course, not all fixed expenses are truly fixed. Ultimately, all expenses are variable
expenses. This paper will not examine how best to classify expenses as "fixed" or
"variable," except to note that this is an increasingly sensitive assumption in the
indication process.
The traditional formula assumes that the fixed expense ratio will vary with the magnitude
of the rate change. Suppose an insurer has a forecast 100% loss/LAE ratio, a 10% fixed
expense ratio, and a 15% variable expense ratio, with a 3% underwriting profit target.
The traditional indicated rate level change is +34.1%.
This assumes that fixed expenses will fall to 7.5% of premium. However, such a
large increase will not likely produce nearly a 34.1% increase in premium. It is very
likely that actual premium volume will decline in such an environment. Therefore,
the fixed expense ratio assumed by the formula will be far too low. The opposite case
can be made with declining rate levels.
The pricing actuary must consider the elasticity of her product in order to make
reasonable estimates of the impact of fixed expenses.
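To make the arithmetic concrete, here is a minimal Python sketch of the example above. The paper's formula (1) is not reproduced in the text, so the loss-ratio form used below (losses plus fixed expenses, divided by the permissible loss-and-fixed-expense ratio) is an assumption; it reproduces the +34.1% figure and the 7.5% post-change fixed expense ratio quoted above.

```python
loss_lae_ratio = 1.00         # forecast loss/LAE ratio
fixed_expense_ratio = 0.10    # fixed expenses as a percent of current premium
variable_expense_ratio = 0.15
profit_target = 0.03

# Traditional indication: fixed expenses loaded as a flat percentage of current premium.
indicated = (loss_lae_ratio + fixed_expense_ratio) / \
            (1 - variable_expense_ratio - profit_target) - 1
print(f"traditional indication: {indicated:+.1%}")   # about +34.1%

# Implied fixed expense ratio if premium really grew by the full indicated amount.
print(f"fixed expense ratio after the change: {fixed_expense_ratio / (1 + indicated):.1%}")  # about 7.5%
```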
An alternative, formula (2), treats the fixed expense ratio as unchanged; this is
equivalent to assuming that total premium volume is unchanged by the rate change.
Therefore, Premium before the rate change = Premium after the rate change, where
Premium after the rate change = (1 + RateChange) x (Avg Premium) x (Policy Count after Rate Change)
Premium before the rate change = (Avg Premium) x (Policy Count before Rate Change)
so that
Policy Count after Rate Change = Policy Count before Rate Change / (1 + RateChange)
and the implied elasticity is
Elasticity = [1/(1 + RateChange) - 1] / RateChange
= -1/(1 + RateChange)
Note that for any rate increase, -1 < Elasticity < 0, so this formula implies more
elastic markets than (1). In the example provided, the elasticity is approximately -0.7,
as a 39% increase by formula (2) would cause a 28% drop in policy counts.
This formula also implies that elasticity can never be <= -1, which is not a reasonable
conclusion if the insurance market is elastic.
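A short sketch of the relationship just derived. The +39% figure is the formula (2) indication discussed above; the function below is only the algebraic identity implied by holding premium flat, not a statement about actual market elasticity.

```python
def implied_elasticity(rate_change):
    """Elasticity implied when total premium is held flat by the rate change."""
    return -1.0 / (1.0 + rate_change)

rc = 0.39                                 # the formula (2) indication in the example
print(round(implied_elasticity(rc), 2))   # -0.72, roughly the -0.7 cited above
print(round(1 - 1 / (1 + rc), 2))         # 0.28: the implied drop in policy counts
```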
Pricing Theory
There is considerable debate in the economics profession whether any meaningful
general theories of pricing can be formulated. Few pricing managers in any business
consult economic theory when setting prices.³ However, it is well known that firms
tend to pursue market share as well as profitability.⁴
Customers tend to infer the overall level of price from those items most frequently
purchased. This explains why large super stores may tend to have very competitive
prices ("loss leaders") for staples such as milk.⁵
³ Devinney, p. 337.
⁴ Devinney, p. 240.
⁵ Gabor, p. 170.
For a multi-line insurer, that may mean pricing lines that are purchased more
frequently (auto) differently than those items which are purchased less often (life
insurance).
For commodity products, firms tend to use market forces more than cost based
pricing. Differentiated products are typically priced on a "cost plus" basis, which is
analogous to traditional rate level indications.
The Model
In this section, we introduce an alternative model which can be used for pricing
insurance. This type of model could also be used as a benchmark for existing pricing
decisions.
The model we will develop in this section is appropriate for a highly elastic book of
business. Specifically, we are going to use non-standard auto as our example because
this book is not only highly elastic, but also has very different characteristics for new
and renewal business.
New business tends to be highly elastic and highly unprofitable, while renewal
business tends to be less elastic (but not necessarily inelastic) and highly profitable
due to significant improvements in loss and expense ratios (note: this paper only
considers improvements in loss ratios).
Note that the elasticity of the different types of business leads to different profit
assumptions in the marketplace.
A company that followed traditional actuarial pricing models for new and renewal
business would be uncompetitive on new business and very competitive on renewal
business. While this would maximize profit margins on any particular risk, the
overall portfolio of risks would not be as profitable as a price discriminating book.
And as an economist would expect, the market is much less competitive for the less
elastic part of the book.
The process for building an "ad hoc" model to employ both competitive forces and
the difference in new business and renewal profitability will be to calculate the
profitability for each increment of proposed rate change. While the "profit
maximizing" rate change could be calculated directly, we will maximize income by
inspection because this will more easily allow for stochastic simulation.
Inputs
Let AdjPol = Policies adjusted for the competitive environment (assuming no rate
change)
PropNB% = (RCNB%) / (RCNB% + RCRB%)
PropRB% = 1 - PropNB%
Note that
LR / (1 + RC) = (NBLR)(NB%) + (RBLR)(RB%),
which simplifies to
NBLR = LR / (1 + RC) + (RenBet)(RB%)
and RBLR = Renewal Business Loss/LAE ratio after the proposed rate change
= NBLR - RenBet
In order to simplify the analysis, we will only examine underwriting income. In our
example,
IL = EP x [(NBLR)(PropNB%) + (RBLR)(PropRB%)]
FE = (FER)(CP)
VE = (VER)(EP)
Maximizing Income
Developing the expressions derived above as a function of rate changes, we find that
Income = EP - IL - FE - VE
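Pulling these pieces together, here is a Python sketch of the income calculation. Several of the input definitions were lost in reproduction, so the interpretations below (NE and RE as the new and renewal business elasticities, CE as a competitive-environment adjustment applied to all policy counts, and RCNB%/RCRB% as the elasticity-adjusted shares of the current book) are assumptions inferred from the worked examples that follow, not definitions quoted from the paper.

```python
def underwriting_income(rc, LR, VER, FER, RenBet, NBpct, RBpct, NE, RE, CE, CP):
    """Underwriting income for a proposed rate change rc, per the equations above."""
    # Elasticity-adjusted shares of the current book (interpreted as RCNB% and RCRB%).
    rcnb = NBpct * (1 + NE * rc)            # new business
    rcrb = RBpct * (1 + RE * rc)            # renewal business
    prop_nb = rcnb / (rcnb + rcrb)          # PropNB%
    prop_rb = 1.0 - prop_nb                 # PropRB%

    # Policies adjusted for the competitive environment (interpreted as AdjPol),
    # then earned premium after the change.
    adj_pol_factor = (1 + CE) * (rcnb + rcrb)
    EP = CP * (1 + rc) * adj_pol_factor

    # Loss/LAE ratios after the change.
    NBLR = LR / (1 + rc) + RenBet * RBpct
    RBLR = NBLR - RenBet

    IL = EP * (NBLR * prop_nb + RBLR * prop_rb)   # incurred loss/LAE
    FE = FER * CP                                 # fixed expenses (on current premium)
    VE = VER * EP                                 # variable expenses
    return EP - IL - FE - VE
```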
For our first example (Example 1), assume the following inputs:
LR = 60%
VER = 15%
NB% = 50%
RB% = 50%
FER = 10%
RenBet = 20%
NE = -6
RE=-2
CE = -10%
AP = $1000
CP = 20,000,000
We will also examine "proposed market share" by simply comparing the proposed
premium with the market. In this example, assume the company's initial market share is
5%. The traditional indication using (1) is -12.5% with a 5% profit target. Here is a
graph of profit and market share versus rate change:
[Figure: Example 1 - Underwriting Profit and Market Share for a Given Rate Change]
The maximum profit occurs with a +1% rate change, and an underwriting profit of 2.5
million.
The forecast loss/alae ratio is 60%. Renewal business is projected to be 20 points better
than new business. Using the formulas derived above,
NBLR = .60/1.01 + (.20)(.50) = .6941
And RBLR = .6941 - .20 = .4941
The slight increase will readjust the renewal and new business percentages. Using the
formulas above, we get:
PropNB% = .4896
PropRB% = .5104
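Under the stated assumptions, searching the sketch above over whole-percent rate changes reproduces the Example 1 result quoted above (a +1% change and roughly $2.5 million of underwriting profit):

```python
ex1 = dict(LR=0.60, VER=0.15, FER=0.10, RenBet=0.20,
           NBpct=0.50, RBpct=0.50, NE=-6, RE=-2, CE=-0.10, CP=20_000_000)

best_rc = max((pct / 100 for pct in range(-30, 31)),
              key=lambda rc: underwriting_income(rc, **ex1))
print(best_rc)                                       # 0.01, i.e. a +1% rate change
print(round(underwriting_income(best_rc, **ex1)))    # about 2,500,000
```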
A firm may decide to trade short-term profits for additional market share and lower rates
further. "Market share" can be considered a measure of long-term profitability. In the
example above, initial market share was 5%. Assuming that the overall market is neither
growing nor decreasing, the new market share will simply be the initial market share
multiplied by one plus the change in premium. In this example, premium changes by -12.7%,
which decreases market share to 4.4%.
Example 2 uses the following inputs:
LR = 80%
VER = 15%
NB% = 80%
RB% = 20%
FER = 35%
RenBet = 20%
NE = -6
RE=-2
CE = -10%
AP = $1000
CP = 3,000,000
Assume this is a new state which is performing poorly. This has happened for two
reasons: the book is heavily weighted toward unprofitable new business, and fixed
expenses are high relative to premium.
In this case, "fixed expenses" are projected to remain high on an absolute basis over the
next rate revision. This may be due to advertising contracts, leases, or other long-term
commitments, but we can assume that they will not be eliminated.
The traditional formula would indicate an increase of +44%. However, the profit
maximizing increase is just +8.5%. Note that in this case the maximum "profit" is actually a
loss of $834K. Given the fixed expenses, however, this is the best result that can be projected.
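Evaluating the same sketch at the Example 2 inputs comes out near the figure quoted above:

```python
ex2 = dict(LR=0.80, VER=0.15, FER=0.35, RenBet=0.20,
           NBpct=0.80, RBpct=0.20, NE=-6, RE=-2, CE=-0.10, CP=3_000_000)

print(round(underwriting_income(0.085, **ex2)))   # about -834,000 at the +8.5% change
```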
[Figure: Example 2 - Underwriting Profit and Market Share for a Given Rate Change]
Example 3 - A moderate loss ratio in a mature state with high fixed expenses and
low variable expenses
LR = 71%
VER = 7.5%
NB% = 50%
RB% = 50%
FER = 17.5%
RenBet = 20%
NE=-6
RE=-2
CE = 0%
AP = $1000
CP = 10,000,000
Initial Market Share = 7.0%
In this example, the traditional model indicates an increase of +1.1%. This is not a
dramatically different result from the +4.0% increase that the "profit maximizing" model
produces.
[Figure: Example 3 - Underwriting Profit and Market Share for a Given Rate Change]
Using the traditional formula in this case would cause a decline in underwriting income
of only $25K and a gain in premium of $1,484K. Therefore, one should probably look
closely at market share in such a situation.
Introducing Simulation
How does the "profit maximizing" model respond to differences in expectations from the
traditional model?
5,000 simulations of Example 3 were run, with distributions substituted for the
loss ratio, new business elasticity, and renewal elasticity.
The most sensitive variable in a rate indication is the forecast loss ratio.
We replaced the loss ratio pick of 71% in the example above with a lognormal
distribution with a mean of .71 and a standard deviation of .15. We also truncated the
result to a minimum of 0.
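A possible implementation of that loss ratio distribution, assuming the 0.71 mean and 0.15 standard deviation are the moments of the lognormal itself (so the underlying normal parameters are obtained by moment matching):

```python
import numpy as np

mean, sd = 0.71, 0.15
sigma2 = np.log(1 + (sd / mean) ** 2)    # variance of ln(loss ratio)
mu = np.log(mean) - sigma2 / 2           # mean of ln(loss ratio)

rng = np.random.default_rng(2001)
loss_ratios = rng.lognormal(mean=mu, sigma=np.sqrt(sigma2), size=5000)
print(loss_ratios.mean(), loss_ratios.std())   # close to 0.71 and 0.15
```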
The variance was selected judgmentally, but a review of the histogram of the results
shows a reasonable approximation of results for a line not subject to significant
catastrophe losses:
[Histogram: simulated loss ratios]
Ideally, empirical studies would determine the elasticity of the product. Also, the
elasticity would be expected to change somewhat over the range of rate increases. For
example, the elasticity of a 10% rate increase would probably be different than a 1%
increase. However, we have not changed our elasticities in this example.
Not surprisingly, "traditional indications" correlate closely with the loss ratio, since
elasticities do not impact the result of this formula:
[Figure: Example 3 - Traditional Rate Indication under 5,000 simulations]
The "profit maximizing" indications, shown below, provide an interesting contrast to the
traditional indications:
[Figure: Example 3 - "Profit Maximizing" Rate Indication under 5,000 simulations]
Note that the shape of the distribution is remarkably different from the traditional
indication. The standard deviation of the results from the traditional indication is .171,
versus .092 from the "profit maximizing" model.
Conclusion
There are many assumptions in our examples that can be refined and improved. For
example, price elasticity is not constant for all indications - as a result our results would
be less accurate for extreme rate indications. The emphasis of the paper is on a method
of thinking about rate indications in a dynamic market. As we have shown, traditional
methodologies do not adequately account for the effect rate changes have on retention
and other economic factors.
We hope that our paper will lead to further research. For the model to be usable in the
real world, empirical studies will need to develop reasonable assumptions for price
elasticity functions, distributions of new versus renewal business, and other model inputs.
There are also several simplifying assumptions which would need to be refined.
This proposed approach is only a first step, but we are convinced that it is a step in the
right direction. Companies that are able to reflect market forces in their rate analysis can
gain a competitive advantage. A ratemaking approach that considers price elasticity to
maximize profit would be a useful tool by itself, but could be even more valuable as a
component of dynamic financial analysis.
As stated in our introduction, these types of concerns are reflected indirectly every time
an actuary chooses not to propose the indicated rate level. With this approach, an
insurance company has a better chance of measuring the effect of such decisions, creating
a rate structure that balances profit and market concerns.
Considerations in Estimating Loss Cost Trends
Abstract
The application of loss trends has long been a fundamental part of the
ratemaking process. Despite this, the actuarial literature is somewhat lacking in
the description of methods by which one can estimate the proper loss trend from
empirical data. Linear or exponential least squares regression is widely used in
this regard. However, there are problems with the use of least squares
regression when applied to insurance loss data.
The results of various methods are compared using industry loss data.
Stochastic simulation is also used as a means of evaluating various trend
estimation methods.
The concepts presented are not new. They are presented here in the context of
analyzing insured loss data to provide actuaries with additional tools for
estimating loss trends.
Introduction
This paper is organized into eight sections. The first section will describe the
importance of estimating loss cost trends in Property/Casualty ratemaking. In
addition, it will introduce the common industry practices used to estimate the
underlying loss cost inflation rate.
The second section will provide a review of basic regression analysis since
regression is commonly utilized for estimating loss trends. It will also describe
other relevant statistical formulae.
The third section will describe some characteristics of insured loss data. This
section will describe how insured losses violate some of the basic assumptions of
the ordinary least squares model. It will also describe the complications that
result because of these violations.
The fourth section will describe several methods that can be utilized along with
informed judgement to identify outliers.
The fifth and sixth sections will describe two alternative methods that address the
shortcomings of ordinary least squares regression on insured loss data.
The seventh section applies the common method of exponential least squares
regression and the two alternative methods to industry loss data and compares
the results.
In the last section, the performance of exponential least squares regression and
the alternative methods will be evaluated using stochastic simulation of loss data
with a known underlying trend.
In addition to credibility, there are many other considerations that must be taken
into account when applying loss trends, such as the effect of limits and
deductibles. These issues are beyond the scope of this paper.
The actuarial literature is sparse on the process of selecting the type of data to
evaluate, preparing trend data, choosing the most appropriate model and
assessing the appropriateness of the selected trends.
There are papers addressing several of the important basic issues of trending.
These include the appropriate trending period and the overlap fallacy.³ In
addition, the CAS examination syllabus addresses the permissibility of using
calendar year data to determine trends applied to accident year data.⁴ These
authors have well and fully addressed these topics and they need not be
revisited.
² ASP #13...
⁴ Cook, ibid.
In much of the syllabus material, both past and present, there are considerable
differences between the types of data used for trending and the amount of
discussion dedicated to the selection of the trend. Generally, each paper selects
either calendar or accident year data and utilizes either the simple linear or
exponential regression model with little guidance regarding which is more
appropriate or discussion of the data to which the model is applied. These
omissions are understandable since the subject of the articles is ratemaking, of
which trend selection is only one component. There are acknowledgements of a
need for better loss trending procedures contained in several papers.
The validity of using linear or exponential least squares regression, the basic
assumptions of regression analysis and the characteristics of loss data, in
evaluating ratemaking trends has not been widely addressed. When selecting a
model to estimate future trends, it is important to consider whether the data used
violates assumptions of the model.
Loss Data
An essential consideration in evaluating loss trend involves the selection of the
type of loss statistics to analyze. It is often useful to analyze both paid and
incurred loss frequency and severity if available.
For example, paid claim counts may include claims closed without payment.
Therefore, changes in claim handling procedures during the period under review
may affect the trend estimate. Likewise, changes in case reserving practices and
adjuster caseloads may affect incurred and/or paid severity amounts.
Analysis of both paid and incurred amounts, or amounts net versus gross of
salvage and subrogation, can assist in identifying changes in claims handling. In
any event, the loss statistics used should be defined consistently throughout the
experience period. For example, if the paid loss amounts are recorded gross of
salvage and subrogation for a portion of the time period, and net for the
remaining, the amounts should be restated to a consistent basis prior to analysis.
A general regression model can be written as
Y_i = f(X_i, β) + ε_i
where
Y_i is the i-th observation of the response variable,
β is a vector of model parameters to be estimated,
X_i is a vector of the independent variables, and
ε_i is the random error term.
Regression models are designed to use empirical data to measure the
relationship between one or more independent variables and a dependent
variable assuming some functional relationship between the variables. The
functional relationship can be linear, quadratic, logarithmic, exponential or any
other form.
The important point is that the functional relationship, the model, is assumed
prior to calculation of the model parameters. Incorrect selection of the model is
an element of parameter risk.
The assumption that the error terms are independent and normally distributed is
essential in the development of statistical tests regarding the parameter
estimates and the performance of the selected model.
The simple linear regression model is
Y_i = β_0 + β_1 X_i + ε_i
where
Y_i is the i-th observation of the response variable,
β_0 and β_1 are the model parameters to be estimated,
X_i is the i-th value of the independent variable, and
ε_i is the random error term.
The parameters of the regression model are estimated from observed data using
the method of least squares. This method will not be described in detail here. It
is sufficient for our purpose to note that the least squares estimators, b_0 and b_1,
are unbiased and, when the error terms are independent with constant variance,
have minimum variance among all unbiased linear estimators.
Because the normal distribution of the error terms is assumed, various statistical
inferences can be made. Hypothesis testing can be performed. For example,
the hypothesis that the trend is zero can be tested. Confidence intervals for the
regression parameters can be calculated. Also, confidence intervals for the predicted
values and a confidence band for the regression line can be calculated. These very
useful results make simple linear regression appealing.
Exponential Regression
While linear regression models are often satisfactory in many circumstances,
there are situations where non-linear models seem more appropriate. Loss cost
inflation is often assumed to be exponential. The exponential model assumes a
constant percentage increase over time rather than a constant dollar increase for
each time period.
The exponential regression model is
Y_i = γ_0 e^(γ_1 X_i)
The model is linearized by taking natural logarithms:
ln(Y_i) = β_0 + β_1 X_i + ε_i
and the trend is obtained from the linear least squares regression estimate of β_1.
If the error term of the linear regression model, ε_i, is assumed to have a
N(0, σ²) distribution, it can be shown that the error term in the transformed model
is lognormal with expected value e^(σ²/2). The error terms are positively skewed.
This distribution of the error terms in the linearized model may be preferable to
the normal distribution if the analyst believes it is more likely that observed
values are above the mean than below the mean. This certainly may be the case
with insured loss data.
Note that the lognormal distribution of the error term in the linearized model
affects the calculation of confidence intervals and test statistics for the model.
The familiar forms of the test statistics based on the normal distribution do not
apply.
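A brief sketch of the log-linear fit described above; the quarterly spacing and the severity values are illustrative assumptions, not data from the paper.

```python
import numpy as np

def annual_exponential_trend(y, periods_per_year=4):
    """Annual trend implied by an exponential (log-linear) least squares fit."""
    t = np.arange(len(y), dtype=float)
    slope, _intercept = np.polyfit(t, np.log(y), 1)
    return np.exp(slope * periods_per_year) - 1

quarterly_severity = [2500, 2540, 2610, 2650, 2700, 2760, 2800, 2870]  # hypothetical
print(f"annual trend: {annual_exponential_trend(quarterly_severity):+.1%}")
```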
The Durbin-Watson statistic is used to test for serial correlation of the error terms:
D = Σ (e_t - e_(t-1))² / Σ e_t²
where the numerator is summed over t = 2, ..., n and the denominator over t = 1, ..., n.
This value is compared to critical values, d_L and d_U, calculated by Durbin and
Watson. The critical values define the lower and upper bounds of a range for
which the test is inconclusive. When D > d_U, there is no serial correlation
present. When D < d_L, there is some degree of serial correlation present.⁶
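The statistic can be computed directly from the residuals of the fitted trend model; the residuals below are hypothetical.

```python
import numpy as np

def durbin_watson(residuals):
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

residuals = [0.04, 0.03, 0.01, -0.02, -0.03, -0.01, 0.02, 0.03]
print(durbin_watson(residuals))   # values well below 2 suggest positive serial correlation
```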
There are several distinct characteristics of insured loss data that should be
recognized when selecting a regression model. In broad terms, one expects
data to be comprised of an underlying trend, a seasonality component, a possible
cyclical nature and a random portion.⁷ These traits make the estimation of the
underlying trend more difficult and the rigid use of simple linear or exponential
regression imprudent.
⁶ Neter, et al., Applied Linear Statistical Models, 4th ed., McGraw-Hill, Boston, 1996, p. 504.
[Figure: Oregon Homeowners, 4QE paid frequency by Year/Qtr]
A review of the severity data for the same time period shows a corresponding,
though less dramatic, drop in claim severity. This is typical of a high frequency,
low severity weather loss event. This drop in claim severity may go unnoticed if it
were not for the associated increase in frequency. Again, due to the twelve-
month moving organization of the data, the error terms are not independently
distributed.
[Figure: Oregon Homeowners, 4QE paid severity by Year/Qtr]
Shock Losses
A high severity claim in a small portfolio may cause a distortion in the data and
affect the trend calculated by ordinary least squares methods if no adjustments
are made. A visual inspection of Nevada Private Passenger Auto Bodily Injury
severity data provided by the Insurance Services Office shows an unusual
occurrence in the first quarter of 1998.
The quarterly data shows the elevated severity in the first quarter of 1998 neatly
as one high point, while the four-quarter-ending data exhibits this phenomenon as
a four-point plateau. This phenomenon occurs more often in smaller portfolios,
even when utilizing basic limit data.
[Figure: Nevada PPA Bodily Injury, quarterly paid severity by Year/Qtr]
[Figure: Nevada PPA Bodily Injury Liability, 4QE paid severity by Year/Qtr]
As demonstrated above, insured loss frequency and severity data may exhibit
abnormally high random error. If these errors occur early in the time series, the
resulting trend estimates from least squares regression will be understated.
Conversely, if the shock value occurs late in the time series, the trend estimate
will be overstated. The use of twelve-month-moving data compounds this effect
since the shock is propagated to three additional data points.
There are several methods available to identify outliers and measure their
influence on the regression results. These include Studentized Deleted
Residuals, DFFITS, Cook's Distance and DFBETAS.⁸ The identification of such
occurrences is addressed in section four below.
Seasonality of Data
The nature of insurance coverage creates seasonal variation in claim frequency
and severity. For example, winter driving conditions may cause higher Collision
and Property Damage Liability claims in the first quarter. Similarly, lightning
claims may be more prevalent during the summer months in certain states. The
probability of severe house fires may be higher during the winter months. Auto
thefts may be more frequent in summer months causing elevated severity for
Comprehensive coverage.
When reviewing New York Private Passenger Auto data for Collision coverage
on a quarterly basis, one can see the seasonal nature of claim frequencies. This
seasonality can be illustrated by grouping like quarters together.
[Figure: New York PPA Collision, quarterly claim frequency by Year/Qtr]
⁸ Neter, et al., ibid., and Edmund S. Scanlon, "Residuals and Influence in Regression," CAS
Proceedings, Vol. LXXI, p. 123.
As noted above, the twelve-month-moving organization of the data creates serially
correlated errors when used in ordinary least squares regression.
Any organization of data that has overlapping time periods from one point to the
next, by its construction, results in serially correlated error terms. Serial
correlation of error terms occurs when the residual errors are not independent.
This result is shown for twelve-month-moving calendar year data in Exhibit 2
using the Durbin-Watson statistic.
Additionally, one can plot residuals to detect serial correlation. Below the
residual plot is displayed for twelve-month-moving New York Collision frequency.
As one can see, the errors for adjacent points are related. As noted above, the
independence of the error terms in ordinary least squares regression is generally
assumed and certain conclusions about the regression statistics are based on
this assumption.
[Figure: New York Collision frequency, residual plot (twelve-month-moving data)]
According to Neter, et al., when this assumption is not met the following
consequences result.
1. The estimated regression coefficients are still unbiased, but they no longer
have the minimum variance property and may be quite inefficient.
2. The mean squared error may seriously underestimate the true variance of the
error terms.
3. The estimated standard deviations of the regression coefficients may seriously
understate their true variability.
4. Confidence intervals and tests using the t and F distributions are not
strictly applicable.
Remedial Measures
Each of the first two issues with the insured loss data, widespread loss events
and extraordinary claim payments, can be resolved by removing outlying points
before calculating the exponential or linear regression. The removal technique
must rely on statistical tests and actuarial judgment. This will be discussed in the
following section. Seasonality and serial correlation can be addressed using
regression with indicator variables on quarterly data. Regression with indicator
variables explicitly incorporates seasonality as a component of the model. The
use of quarterly data eliminates the serial correlation resulting from the use of
overlapping time periods.
Comments on Goodness-of-Fit
Estimating the underlying trend in a given dataset entails more than simply fitting
a line to a set of data. During the estimation process, it is important to determine
whether the underlying assumptions are met and whether the equation
accurately models the observed data.⁹
Many consider R², the coefficient of determination, the most important statistic for
evaluating the goodness-of-fit. The coefficient of determination is the proportion
of the data's variability over time that is explained by the fitted curve. However, it
is widely agreed that this is not sufficient.¹⁰ The coefficient of determination, by
itself, is a poor measure of goodness-of-fit.¹¹
To assume that a low R² implies a poor fit is not appropriate. It has been shown
that a low or zero trend, by its nature, has a low R² value.¹² Also, whenever the
random variation is large compared to the underlying trend, the R² will not be
sufficient to determine whether the fitted model is appropriate. One can illustrate
the low R² values associated with data exhibiting no trend over time. The scatter
plot below was generated from a simulation with an underlying trend of zero.
[Scatter plot: Simulation Results, Underlying Trend = 0% (R² versus estimated trend)]
⁹ Scanlon, ibid.
¹⁰ D. Lee Barclay, "A Statistical Note on Trend Factors: The Meaning of R-Squared," CAS Forum,
Fall 1991, p. 7; Ross Fonticella, "The Usefulness of the R² Statistic," CAS Forum, Winter
1998, p. 55; Scanlon, ibid.; and Neter et al., ibid.
¹¹ Barclay, ibid.
¹² Barclay, ibid.
The residuals between the actual and fitted points are highly useful for studying
whether a given regression model is appropriate for the data being studied.¹³ It
is useful to graph the fitted data against the observed data to look for patterns.¹⁴
A random scattering of residuals occurs when the fit is proper.¹⁵ It is important
that the error term not appear systematically biased when compared to
neighboring points.
The use of the R² statistic or plots of the residuals may result in the decision that
the model is an appropriate fit to the data. This conclusion applies to the
historical period based on this analysis. Another consideration is the
extrapolation of the trend model into the future. As McClenahan illustrates with
the use of the 3rd degree polynomial, a perfect fit within the data period does not
always result in the appropriate trend in the future.¹⁶ Extrapolation beyond the
data period should also be considered before the decision to proceed with the
model is undertaken.
Section 4: Identification of Outliers
This section describes methods by which one can identify extraordinary values
from observed loss data. These methods are designed to identify outliers from a
dataset on which regression is to be performed. An excellent reference on these
and other statistical methods is Applied Linear Statistical Models by Neter et al.
¹⁴ Fonticella, ibid.
¹⁵ Barclay, ibid.
¹⁶ McClenahan, ibid.
Therefore, the selected points should always be compared to the original
dataset.
Visual Methods
When performing simple linear regression there are several visual methods
which can result in easy identification of outlying points. Among these graphs
are residual plots against the independent variable, box plots, stem-leaf plots and
scatter plots.¹⁷ While residual plots may lead to the proper inference regarding
outliers, there are instances when this is more difficult. When the outlier imposes
a great amount of leverage on the fitted regression line, the outlier may not be
readily identifiable due to the resulting reduction of the residual.
Studentized Residuals
There are several standard methods that can be utilized to assist with the
identification of outliers, each with advantages and disadvantages. The
studentized residual detects outliers based on the proportional difference of the
error term, e_i, and the variance of these errors. The studentized residual is
defined:
r_i = e_i / s{e_i}
A disadvantage of this statistic is that it does not measure the influence of the
observation. In addition, there is no statistical test from which one can base a
decision regarding outliers.
DFFITS
One measure of influence is the DFFITS statistic. The DFFITS is the
standardized difference between the fitted regression with all points included and
with the i-th point omitted:
DFFITS_i = t_i [h_ii / (1 - h_ii)]^(1/2)
where t_i is the studentized deleted residual and h_ii is the leverage of the i-th observation.
Cook's D
Another measure of influence is Cook's Distance measure, D_i. Scanlon utilizes
Cook's D statistic to identify outliers.¹⁸ Cook's D measures the influence of the
i-th case on all fitted values:
D_i = Σ_j (Yhat_j - Yhat_j(i))² / (p · MSE) = [e_i² / (p · MSE)] · [h_ii / (1 - h_ii)²]
As with all models, good judgement is imperative and comparison to the original
data is advised. In addition to the methods described above, one can calculate a
confidence band around the fitted curve. Observations outside the confidence
band are candidates for removal.
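As one possible implementation, the statsmodels influence diagnostics return the studentized deleted residuals, DFFITS, and Cook's D discussed above; the frequency series below is hypothetical.

```python
import numpy as np
import statsmodels.api as sm

freq = np.array([6.2, 6.0, 6.3, 6.1, 9.5, 6.0, 5.9, 6.2, 6.0, 5.8])  # one shock value
X = sm.add_constant(np.arange(len(freq), dtype=float))

fit = sm.OLS(np.log(freq), X).fit()
influence = fit.get_influence()

print(influence.resid_studentized_external)   # studentized deleted residuals
print(influence.dffits[0])                    # DFFITS for each observation
print(influence.cooks_distance[0])            # Cook's D for each observation
```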
Each of these methods is designed to identify a single outlier from the remaining
data. These techniques may not be sufficient to distinguish outliers when other
outliers are adjacent or nearby. Each of these methods is extendable to identify
multiple outliers from the remaining data. However, a discussion of these
extensions is beyond the scope of this paper.
¹⁸ Scanlon, ibid.
Section 5: Manual Intervention - Deletion/Smoothing of Outliers
Manual Intervention
The identification of extraordinary values is certainly a matter of judgement. In
the analysis that follows, the determination of outliers is completed by use of
visual inspection.
Treatment of Outliers
Once the outliers have been identified, one can proceed in several ways. First,
the analyst may simply remove the outlying point from consideration and
complete the analysis as if the observation did not occur. While this alternative
may seem appealing, it does not allow for the reconstruction of twelve-month-
moving data.
The second approach is to replace the outlier with the fitted point from the
regression after removal of the outlier. This removes the outlier from the
regression entirely, but allows reconstruction of the four-quarter-ending data.
The final approach is to replace the outlying point with the fitted point plus or
minus the width of a confidence interval, as appropriate. This choice mitigates
the extent to which the outlier affects the regression results, without removing the
point entirely.
For simplicity, the authors have selected the first approach for comparison
purposes but acknowledge that the other two procedures may be appropriate in
other circumstances.
Parameter Estimation
Estimation of the underlying trend in the data is completed through exponential
regression on the quarterly data, excluding the outliers, with indicator variables to
recognize any seasonality.
Section 6: Qualitative Predictor Variables for Seasonality
This method of least squares regression recognizes the seasonal nature of
insured losses through the use of qualitative predictor variables, or indicator
variables. Indicator variables are often used when regression analysis is applied
to time series data. Also, since the data used in this method is quarterly rather
than twelve-month-moving, first-order autocorrelation of the error terms is not
present. Hence, the issues that arise from such autocorrelation are eliminated.
The model, applied to the natural logarithm of the quarterly frequency or severity, is
ln(Y_i) = β_0 + β_1 X_i + β_2 D2_i + β_3 D3_i + β_4 D4_i + ε_i
where X_i is the time index and D2, D3 and D4 are indicator variables that equal 1 for
observations from the second, third and fourth quarters, respectively, and 0 otherwise.
The model above can be viewed as four regression models, one for each set of
quarterly data.
The exponential equivalents, without error terms, are
Y = e^(β_0) e^(β_1 X) for the first quarter and Y = e^(β_0 + β_k) e^(β_1 X) for quarters
two through four (k = 2, 3, 4).
One can think of e^(β_1 X) as the trend component of the model and e^(β_2), e^(β_3) and
e^(β_4) as the seasonal adjustments to e^(β_0).
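A sketch of this indicator-variable regression on quarterly data; the frequencies below are hypothetical, but the structure mirrors the exhibits (log frequency regressed on time plus D2 through D4).

```python
import numpy as np
import statsmodels.api as sm

freq = np.array([6.2, 6.0, 6.3, 7.3, 6.1, 6.0, 6.4, 7.5,
                 5.9, 5.8, 6.2, 7.2, 5.8, 5.7, 6.1, 7.1])   # four years, Q1..Q4
quarter = np.tile([1, 2, 3, 4], 4)
t_years = np.arange(len(freq), dtype=float) / 4.0           # time in years

X = np.column_stack([t_years,
                     (quarter == 2).astype(float),          # D2
                     (quarter == 3).astype(float),          # D3
                     (quarter == 4).astype(float)])         # D4
X = sm.add_constant(X)

fit = sm.OLS(np.log(freq), X).fit()
b0, b1, b2, b3, b4 = fit.params
print(f"annual trend: {np.exp(b1) - 1:+.1%}")
print("seasonal factors (Q2-Q4 vs Q1):", np.round(np.exp([b2, b3, b4]), 3))
```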
Detailed calculations using the Oregon Homeowners data are shown in the
attached exhibits. The results in the tables below show the annual trend derived
from each method and the associated R² value in parentheses.

Table 1 - Oregon Homeowners Frequency
                                # Years of Observations
Method                2 yr.           3 yr.           4 yr.           5 yr.
12 MM                 -1.5% (.06)     -13.9% (.53)    -17.0% (.62)    -9% (.17)
Quarterly             -15.6% (.32)    -26.7% (.45)    -13.2% (.21)    -3.9% (.03)
Annual                --              -5.3% (.50)     -19.2% (.72)    -10.1% (.34)
Manual Adjustment     --              -6.8% (.79)     -8.4% (.58)     -2.6% (.20)
Indicator Variables   -9.4% (.91)     -22.2% (.75)    -10.9% (.48)    -2.6% (.27)
Table 2 - New York PPA Collision Frequency
                                # Years of Observations
Method                2 yr.           3 yr.           4 yr.           5 yr.
12 MM                 0.3% (.04)      -1.7% (.43)     -2.2% (.61)     -1.9% (.58)
Quarterly             -0.6% (.00)     -1.6% (.07)     -2.8% (.17)     -1.7% (.10)
Annual                --              -0.6% (.14)     -2.3% (.66)     -1.2% (.37)
Manual Adjustment     --              --              -1.0% (.80)     -0.8% (.84)
Indicator Variables   1.7% (.83)      -0.6% (.80)     -2.2% (.76)     -1.2% (.74)
Table 3 - Nevada PPA Bodily Injury Severity
                                # Years of Observations
Method                2 yr.           3 yr.           4 yr.           5 yr.
12 MM                 1.2% (.06)      3.0% (.52)      3.1% (.72)      3.1% (.78)
Quarterly             4.9% (.10)      4.3% (.20)      4.1% (.31)      2.7% (.25)
Annual                --              3.5% (.63)      2.8% (.71)      3.7% (.85)
Manual Adjustment     --              1.2% (.85)      1.9% (.65)      1.4% (.41)
Indicator Variables   9.4% (.57)      4.9% (.36)      4.0% (.37)      2.7% (.27)
The manual adjustment method and regression using indicator variables provide
additional estimates of the underlying loss trend to assist the actuary in selecting
an appropriate adjustment for ratemaking.
Simulation Parameter Estimation
Based on the Nevada PPA Bodily Injury severity analysis from the previous
section, the following simulation parameters were selected.
The shock probability and magnitude were chosen based on the observed data.
Of the 23 observations, only one observation appeared to have an extraordinarily
high severity. The magnitude of the shock is fixed at 20%. The simulation could
be further modified to include a stochastic variable for the shock magnitude.
Simulations for other states and lines of business would incorporate other
parameter values based on observed data.
where
Pr[I_t = (1 + δ_t)] = 1/23,
Pr[I_t = 1.00] = 22/23,
and
ε_t is N(0, σ²).
The shock value applied to the natural logarithm of the severity, 1 + δ_t,
corresponding to the 20% shock value of the severity must be calculated; the required
value of δ_t follows from the trended severity in the shocked period.
Likewise, the error variance, σ², for ln(Y_t) is derived from the estimated
variance of Y_t (the MSE of the severity regression) according to the following
relationship:
MSE / Yhat_0² = e^(σ²) (e^(σ²) - 1)
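A sketch of one way to generate such simulated data. The starting severity, the 3.5% annual trend, the quarterly spacing, and the choice to apply the 20% shock multiplicatively to the severity itself (rather than to its logarithm) are assumptions made for illustration; the 1/23 shock probability follows the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_severity(n=20, start=8000.0, annual_trend=0.035,
                      shock_prob=1 / 23, shock_size=0.20, sigma=0.03):
    """Quarterly severities with an underlying trend, occasional shocks, and noise."""
    t = np.arange(n)
    base = start * (1 + annual_trend) ** (t / 4.0)                    # quarterly trend
    shocks = np.where(rng.random(n) < shock_prob, 1 + shock_size, 1.0)
    noise = rng.lognormal(mean=-sigma ** 2 / 2, sigma=sigma, size=n)  # mean-one error
    return base * shocks * noise

print(simulate_severity().round(0))
```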
Simulation Results
Ten thousand simulated data sets were generated. The five estimation methods
were applied to each data set.
The table below summarizes the results of each regression method based on
10,000 simulations of twenty observations. Since the underlying trend in the
simulation is known, accuracy is measured using the absolute difference
between the estimated trend and the actual trend. The percentage of estimates
above the actual trends is also shown in order to detect upward bias in the
estimation method. Also, the percent of estimates within various neighborhoods
of the actual trend is calculated.
Table 5 - Comparison of Methods (based on 10,000 simulations)

                                                   Percentage of Estimates
                      Average     Average      Above     Within .5%   Within .75%   Within 1%   Average
Method                Trend Est.  Abs. Diff.   Actual    of Actual    of Actual     of Actual   R²
12 MM                 3.52%       0.82%        50.7%     37.7%        54.1%         66.9%       .74
Quarterly             3.33%       0.91%        44.1%     34.5%        49.0%         62.5%       .34
Annual                3.51%       0.93%        50.2%     33.6%        48.6%         61.3%       .75
Indicator Variables   3.51%       0.92%        50.4%     34.4%        49.3%         62.0%       --
Manual Adjustment     3.50%       0.81%        49.4%     37.7%        53.9%         67.6%       --
A similar process can be used to simulate frequency data which include the
probability of loss events that produce large numbers of claims.
NO SHOCKS
Table 6 - Comparison of Methods (based on 10,000 simulations)

                                                   Percentage of Estimates
                      Average     Average      Above     Within .5%   Within .75%   Within 1%   Average
Method                Trend Est.  Abs. Diff.   Actual    of Actual    of Actual     of Actual   R²
12 MM                 3.50%       0.69%        50.0%     43.2%        60.7%         75.0%       --
Quarterly             3.33%       0.78%        43.2%     39.2%        55.8%         68.9%       --
Annual                3.51%       0.78%        50.7%     39.0%        55.4%         68.8%       --
Indicator Variables   3.51%       0.78%        50.8%     39.2%        55.4%         68.9%       --
Manual Adjustment     3.51%       0.78%        50.8%     39.2%        55.4%         --          --
The results of this simulation show that there is little difference between
traditional regression techniques and regression using qualitative predictor
variables for seasonality.
ALL SHOCKED
[Table 7 - Comparison of Methods (based on 10,000 simulations): average trend estimate, average absolute difference, and percentage of estimates above / within .5% / .75% / 1% of the actual trend, by method]
The results of the simulation using only data with shocks illustrate the increased
accuracy of the manual adjustment method described previously under these
circumstances.
SHOCKED EARLY
[Table 8 - Comparison of Methods (based on 10,000 simulations): average trend estimate, average absolute difference, and percentage of estimates above / within .5% / .75% / 1% of the actual trend, by method]
This simulation illustrates the understatement of trend estimates by traditional
methods when shock values occur early in the time series. While proper
elimination of the shocks may be difficult, this simulation shows the value of
proper identification.
SHOCKED LATE
Table 9 - Comparison of Methods (based on 10,000 simulations)

                                                   Percentage of Estimates
                      Average     Average      Above     Within .5%   Within .75%   Within 1%   Average
Method                Trend Est.  Abs. Diff.   Actual    of Actual    of Actual     of Actual   R²
12 MM                 5.37%       1.93%        93.3%     12.4%        19.1%         --          .79
Quarterly             5.23%       1.85%        89.5%     15.5%        23.1%         31.3%       .39
Annual                --          2.22%        93.5%     10.6%        16.8%         22.7%       .80
Indicator Variables   --          --           --        --           --            --          --
Manual Adjustment     3.52%       --           --        37.1%        52.3%         65.1%       --
Conclusion
The regression concepts discussed here are not new to actuaries. Nor are the
characteristics of insured loss data. Actuaries are familiar with the stochastic
nature of claim frequency and severity. Actuaries are also keenly aware of the
potential for loss events, be they weather events that generate an extraordinary
number of "normal" sized claims, or single claims with extraordinary severity, that
do not fit the assumptions of basic regression analysis.
While outlier identification techniques are described in section four, they have not
been applied to the industry data. The evaluation of these techniques is a
subject worthy of further research. In addition, the authors would welcome
development of techniques to discriminate between random noise and
seasonality, to identify turning points in the trend and to distinguish between
outliers and discrete "jumps" in the level of frequency and severity.
Hopefully, the authors have presented some additional tools for ratemaking and
stimulated interest in developing trend estimation techniques that recognize the
unique characteristics of insured losses.
Acknowledgements
The authors would like to thank the Insurance Services Office for their generosity
in supplying the industry data used in this analysis, the ratemaking call paper
committee for their guidance and our families for their understanding and support
throughout the process of drafting and editing this paper.
Index to Exhibits
Nevada Bodily Injury
Insurance Industry Loss Data
Includes copyrighted material of Insurance Services Office, Inc. with its permission. Copyright, Insurance Services Office, 1999.

New York Collision
Insurance Industry Loss Data

Oregon Homeowner
Insurance Industry Loss Data
Includes copyrighted material of Insurance Services Office, Inc. with its permission. Copyright, Insurance Services Office, 1999.
Oregon Homeowners
Exponential Regression with Indicator Variables on Quarterly Frequency

[Observation-level table: 20 quarterly observations (1994/4 through 1999/3) of frequency, time index X, quarter indicators D2-D4, ln(frequency), fitted values and residuals, with a residual plot]

Durbin-Watson statistic: D = 0.82 (4 X variables, 20 observations)
dU at .05 = 1.83, dL at .05 = 0.90: test is inconclusive
dU at .01 = 1.57, dL at .01 = 0.68: test is inconclusive

Regression output: Trend -2.58%, R² = 0.27, Obs = 20
Seasonality factors: 1st Qtr 1.000, 2nd Qtr 1.028, 3rd Qtr 1.078, 4th Qtr 1.488
Oregon Homeowners
Exponential Regression on Quarterly Frequency

[Observation-level table, Durbin-Watson calculation, residual plot and regression output]

Durbin-Watson test: dU at .01 = 1.15, dL at .01 = 0.95: uncorrelated
Oregon Homeowners
Exponential Regression on 4QE Frequency

[Observation-level table, Durbin-Watson calculation and residual plots for the 4QE and annual regressions]

Durbin-Watson test (4QE data):
dU at .05 = 1.41, dL at .05 = 1.20: first order auto-correlated
dU at .01 = 1.15, dL at .01 = 0.95: first order auto-correlated
Durbin-Watson test (annual data): critical values not available
Oregon Homeowners
Manually Adjusted Exponential Regression with Indicator Variables on Quarterly Frequency

[Observation-level table: 19 quarterly observations (the 1996/4 outlier removed) of frequency, time index X, quarter indicators D2-D4, ln(frequency), fitted values and residuals, with a residual plot]

Durbin-Watson statistic: D = 0.86 (4 X variables, 19 observations)
dU at .05 = 1.85, dL at .05 = 0.86: first order auto-correlated

Regression output: Trend -2.58%, Obs = 19
Seasonality factors: 1st Qtr 1.003, 2nd Qtr 1.028, 3rd Qtr 1.078, 4th Qtr 1.171
Ratemaking for Excess Workers Compensation
By Owen M. Gleeson, FCAS, MAAA
Abstract
The market for Excess Workers Compensation in the United States has grown rapidly
over the last two decades. There are estimates that the annual premium volume in the
excess $500,000 attachment segment of this market is now in excess of $1 billion. This
paper presents a method of estimating rates for this type of coverage. The method
generates the loss distribution of the total cost of individual large claims. Medical costs are
estimated from data samples. Indemnity costs, however, are for the most part estimated
from the benefits mandated in the Workers Compensation statutes.
I. Introduction
A. General Remarks
The market for property/casualty insurance in the United States has evolved rapidly in the
past 15 years. In particular, the alternative market for Workers Compensation insurance
has shown explosive growth. Many of the entities that incur workers compensation costs
are now self-insured on the lower cost layer, e.g. the first $100,000 per claim. These self-
insured firms or groups still purchase insurance protection above retentions of
$100,000 or higher. The market for this type of coverage is now very large and in premium
dollar terms easily exceeds $1 billion. Another measure of the size of the market is that
the Self-Insurance Institute of America has over one thousand corporate members.
The task of estimating rates for this type of business is made difficult by several of the
characteristics of large workers compensation claims. The first is that large workers
compensation claims are infrequent and thus the amount of data available for ratemaking
is severely limited. A second characteristic is that large workers compensation claims
develop very slowly with the result that the ultimate cost of an individual claim,
particularly those involving medical, may not be known for many years. Another aspect of
these claims is that there are distinct components of the loss: medical and indemnity. The
view adopted here is that the medical costs and the indemnity costs follow separate and
distinct distributions. As a result the distribution of the variable which is the sum of these
costs is quite complex. It is thus very difficult to model the underlying distribution of
these costs by using a sample of incurred losses.
Currently there is no pricing mechanism in the United States for this class of business
that provides comfort to the users and is widely accessible. The objective of this paper is
to provide a solution to the problem of pricing this line of business which will be seen as
generally satisfactory. There are of course no claims implied that what is presented in the
following is the only solution or the solution that is "best" in some sense. In addition, this
paper will not explore the issue of risk loading or required profit. Rather the paper will
focus on the sufficiently difficult task of estimating the pure loss cost.
B. Types of Claims
The focus of this paper is excess workers compensation costs. It follows that only those
types of claims whose cost might exceed a given limit e.g. $100,000 would be of interest.
Workers Compensation claims are often classified into six types: Medical Only,
Temporary Total, Minor Permanent Partial, Major Permanent Partial, Permanent Total
and Fatal. It's assumed for the purposes of this paper that no claim falling into one of the
first three classifications will be large enough to pierce the limits of interest. Therefore
only the remaining three types of claims will be analyzed.
At this point a discussion of the characteristics of each of the three types of claims will be
presented. It is hoped that this will provide motivation for the methods and tactics used in
producing the cost estimates. Each of the types of claims to be discussed, i.e. Fatal,
Permanent Total, and Major Permanent Partial, has a medical component of the total
claim cost and an indemnity component. These will be discussed separately.
1. Fatal
a. Indemnity Benefits
The statutory specification of the indemnity benefits associated with fatal claims
can be quite complex. In highly simplified terms the parameters specifying the
benefits might be described as (1) period of benefits, (2) basic percentage of wage,
and (3) degree of dependency. For example, the period of benefits could be
lifetime. However, the period of benefits could be limited by attained age, say age
65, or limited by amount (the maximum amount of fatal benefits in Florida is
$100,000). The basic percentage of wage is usually expressed in terms such as
"66 2/3 percent of the fatally injured individual's average weekly wage." (Many
workers in the United States do not receive the same amount of compensation
every week. As a result, it is necessary to determine the amount that should be
deemed the average weekly wage in the event of injury. Each state has developed
a complex set of rules to decide this question. This subject will not be explored
here.) The degree of dependency in a fatal case is determined generally by
familial status e.g. spouse, spouse and dependent children, dependent parents or
siblings, etc.
The specifications vary from one state to another. Thus the first step in dealing
with the costs of fatal claims is to analyze the laws of the state for which rates are
being estimated. Another step in the process is to decide on the simplifying
assumptions that need to be made in order to make the calculations tractable.
b. Medical Costs
It would be reasonable to think that there are probably little or no medical costs
associated with a Fatal claim. However, the data sets that the author and his
associates have reviewed have virtually all presented some fatal claims with
related medical costs. For the majority of fatal claims the medical cost is found to
be zero. However, there are medical costs associated with the other fatal claims
and these seem to fall into the following categories: small, medium and very
large. We speculate that the small costs are ambulance and emergency room fees
for individuals who survive a matter of hours. The medium costs may be
associated with claims where the injured party survived for a matter of days and
then expired.
The very large costs were likely the result of heroic and extensive efforts to treat a
very seriously injured person with the result that life was sustained for a year or
two followed by the expiration of the injured person. This last group averages
over $1,000,000, but seems extremely rare.
The above view has been developed by examining claim files, discussions with
claims adjusters and from conversations with others personally familiar with the
details of high cost workers compensation claims.
2. Permanent Total
a. Indemnity
As in the case with fatal claims, the statutory specifications of indemnity benefits due an
impaired party can be fairly complex. The general parameters are (1) the period of
benefits, (2) limitations and/or offsets and (3) basic percentage of wage. The period of
benefits for permanent total claims in most states is lifetime. Many states mandate
payment of full benefits to injured individuals as long as they survive. However in other
states there are limitations or offsets most of which are associated in one way or another
with Social Security. For example, some states mandate payments only until eligibility
for Social Security. On the other hand some states require that the basic benefits be offset
by benefits obtainable under the Disability provisions of Social Security. The offsets vary
widely from state to state and can have significant impact on the cost of permanent total
claims. Finally there is the question of the basic percentage. This is usually expressed as
something like 66 2/3 percent of wages. However the percent is different from one state
to another and may be expressed as a percent of spendable income.
Again the law of the state under consideration must be analyzed carefully. Also as is the
case with fatal claims, it may be necessary to make some simplifying assumptions.
b. Medical
Many Permanent Total claims are characterized by extremely large medical costs. Not
only are the costs large but the costs seem to develop upwards throughout the life of the
claim which may be on the order of several decades. Unfortunately, most data collecting
agencies do not follow the development on individual claims for a sufficiently long time.
This is not to be construed as criticism but rather recognizes the fact that the development
in PT claims while perhaps very large for an individual claim may not contribute a
significant amount of development to the overall workers compensation total loss cost.
As an example, if the developed medical cost on PT's through, say, 10 years is 4% of the
total loss cost dollar and the remaining development is 50% (probably too negative a
view), then the overall pure premium might be underestimated by 2%.
However, the interest here is not in aggregates but in the size of individual claims. The
data used by the author is drawn from a number of private well-maintained databases of
individual workers compensation claims. In each of these, there are claims from many
accident years. The open claims are developed individually. The method will be
addressed in a later section. Both closed and open are then trended to the experience
period. Since Permanent Total claims are rather rare it seems virtually impossible to
generate a data set that can be used to provide an empirical size of loss distribution that
can be used without resorting to some smoothing. Thus, some smoothing (graduation)
must be introduced before the "tail" of the distribution can be used for pricing.
3. Major Permanent Partial
The approach used here was to obtain data by state on claims designated Major
Permanent Partial and to examine the characteristics of the data. This was supplemented
by information drawn from Workers Compensation Loss Cost filings from New York and
Pennsylvania which contain considerable detail. State Workers Compensation laws were
also consulted with respect to benefits provided for permanent partial.
Evaluation of this body of information led to conclusions with respect to the medical
distribution and the indemnity distribution. The expected value and the range of the
distribution as well as some general characteristics are discussed in the following.
a. Indemnity
The indemnity associated with a Permanent Partial claim generally depends on the type
of injury. Examples of the type of injury are "Loss of a hand", "Loss of an arm", "Loss of
a foot", and so forth. An example of the compensation is "Loss of a hand - 335 weeks".
The amount of compensation is usually a percent of wage, e.g. 66 2/3 percent. As shown
in Appendix A, state workers compensation laws list many specific types of injury, each of
which entitles the injured party to a particular set of benefits.
The large number of categories alone would make modelling of the costs difficult even if
there were good data on the frequency of type of injury. However, this is not the case. In
addition, analysis indicates that Permanent Partial claims do not contribute significantly
to the overall excess costs. This is due to the fact that review of an extensive amount of
data shows that, while the Permanent Partial claims are serious with a large average
value, the frequency of claims in excess of say $500,000 is low and that there are also no
truly catastrophic claims.
Given the above it was decided to resort to analysis of sample data to estimate the
distribution of indemnity of Major Permanent Partial claims.
b. Medical
Indemnity costs on Major Permanent Partial are relatively well constrained by the
limitations resulting from statutorily defined benefits. However, injuries resulting in
Permanent Partial disability can result in a large range of incurred medical costs. In some
cases, such as loss of a hand, the injury may be satisfactorily treated rapidly and at a low
medical cost. On the other hand there are catastrophic injuries such as severe burns or
injuries to the spinal column where the injured party will require significant medical
treatment but will eventually be able to return to work. At this point it might be observed
that there are some individuals who find that their quality of life is enhanced if they are
able to resume some sort of gainful employment no matter how serious the injury. Thus,
these individuals cannot be considered to be permanently and totally disabled.
It's probably worthwhile at this point to recall that the objective is to determine rates for
excess workers compensation coverages. Thus by far the largest number of claims
incurred under Workers Compensation coverages are, by definition, of no interest. For
example, consider the following data extracted from a Pennsylvania Compensation
Rating Bureau Loss Cost filing.
Table I
Ultimate Number of Injuries
Period: Fatal, Permanent Total, Major Permanent Partial, Minor Permanent Partial, Temporary Total
(table values not reproduced)
From the point of view of credibility standards, it can be seen that there are insufficient
claims of the type of interest for rate making purposes even if the claims were restricted
to basic limits as found in other lines of business. Of course as previously mentioned the
size of some of the claims encountered in Statutory Workers Comp range up to $20
million. While it would be interesting to determine the number of claims necessary for
full credibility on claims of this size the knowledge gained is probably not worth the
effort. However, we suspect that it is well in excess of all the claims of the size under
consideration that are incurred in the United States in the span of a decade. Thus the
answer is irrelevant since the number required exceeds the number available. Therefore it
is necessary to develop an approach that circumvents this lack of data.
Excess Workers comp rates are needed by state since the statutory benefits vary by state
with respect to the indemnity portion of the claim. This compounds the data availability
problem in that a smaller number of claims are available in a given jurisdiction. Also
whereas relatively large states like Pennsylvania and Texas, which have respective
populations of approximately 12 million and 19 million, might have enough claims to
provide a basis for a reasonably accurate estimate, the problem of constructing rates for
states like Iowa and Oregon with populations of approximately 3 million each remains.
Another issue that surfaced in the process of the construction of the rates is that the
indemnity portion of a serious workers compensation claim develops much differently
from the medical portion of the claim. The data in the table below has been generated by
using data drawn from a recent Pennsylvania Loss Cost filing to demonstrate that the
indemnity costs develop much more rapidly than medical costs. This stands to reason.
Consider a typical Permanent Total claim. Within a matter of five to ten years it should
be certain that the claimant is entitled to Permanent Total benefits. At this point, the cost
of the indemnity portion of the claim has been precisely determined. However the
medical costs are a function of how well the claimant responds to treatment, indicated
alternative treatment paths that emerge, new developments in medical care and so forth.
Required Reserve / Current Reserve (table values not reproduced)
The above suggests that applying a single development factor to the total of indemnity
and medical will likely produce less satisfactory results than the process of applying
development factors separately if possible or avoiding the use of development factors if
feasible.
Another aspect of the data problem is the question of combining data from different
states. Because the indemnity benefits (which account for about 50% of Major Permanent
Partial and 2/3 of Permanent Total costs) vary so significantly from one state to another
as a result of offsets, limitations, etc., not to mention escalation, it was decided that the
approach that would produce the most accurate results would be to estimate the
indemnity costs by state if at all possible.
On the other hand medical costs are not statutorily determined. While costs of some of
the more minor aspects of medical care such as bandages, splints, and emergency room costs
probably display regional variations, the larger dollar costs such as treatment at national
burn care units or spinal treatment centers demonstrate more homogeneity than
indemnity. In addition the treatment proposed for estimating state indemnity costs has no
analogue for medical cost.
The above characteristics of serious workers compensation claims: low frequency, high
severity, different types of development for component costs and lack of comparability of
cost from state to state, led to the solution proposed in the next section.
II. General Approach to Solution of Estimating Excess Workers Compensation Costs
The basic solution to modeling the distribution of costs of large claims consists of two
steps. The first step was to create a distribution of costs for each type of serious claim:
Fatal, Permanent Total and Major Permanent Partial. This step required the creation of
separate distributions for indemnity and medical. These distributions were then used to
create a joint distribution for each type of claim. Excess cost factors are then
generated for each type of claim.
The second step was to determine the portion of the pure premium that is Fatal,
Permanent Total or Major Permanent Partial and then weight the excess factors of the
individual components.
The statement of the solution is fairly simple. However, the physical execution of it is
not. For example, given the above, the number of cost outcomes or cells for Permanent
Total costs is numbered in the millions using an approximating method of calculating the
costs. Essentially what is determined is the frequency function C_PT(m, w, a, l), where m is
medical cost, w is wage, a is age at time of injury and l is the number of years lived after
the injury. The distribution of the costs of fatal claims, C_F(m, w, a, l), is calculated in a
similar manner. The cost distribution for Major Permanent Partial is obtained in a slightly
different manner. One component is the medical cost. The other is the indemnity.
However, the awards are not so life or age dependent since there are certain lump sums
statutorily provided for regardless of age or wage. Thus for this type of injury the
distribution of indemnity is determined from a statistical sample. The compound
distribution of costs is denoted C_MPP(m, i).
For a given retention, R, the excess costs as a percentage of total costs are obtained by
type of injury for a given state. These percents are then weighted by the percent of the
pure premium ascribable to that type of injury. For example, suppose the retention for
State G is 500,000. Further suppose that 58.8% of total PT costs are excess 500,000;
2.48% of total Fatal costs are excess 500,000 and 3.36% of Major Permanent Partial are
excess 500,000. Also suppose that 12.2% of the pure premium (loss cost only) is the cost
of PT's, 3.1% is the Fatal cost and 63.3% is the Major PP cost, with 21.2% of loss costs
attributable to other types of injuries.
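In code, the weighting just described might look like the following sketch (it assumes, for illustration, that the other injury types contribute no losses excess of the retention):

```python
# Sketch of the excess-factor weighting for State G at a 500,000 retention.
# Assumption for illustration: injury types other than PT, Fatal and Major PP
# contribute no losses excess of the retention.
excess_pct = {"PT": 0.588, "Fatal": 0.0248, "MajorPP": 0.0336, "Other": 0.0}
loss_cost_weight = {"PT": 0.122, "Fatal": 0.031, "MajorPP": 0.633, "Other": 0.212}

state_excess_factor = sum(excess_pct[t] * loss_cost_weight[t] for t in excess_pct)
print(f"Excess factor for State G at 500,000: {state_excess_factor:.2%}")
# roughly 9.4% of the total pure premium under these assumed inputs
```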
The problem to be solved, the difficulties, motivation and methodology have been
outlined above. What follows are some examples that are designed to assist in the
understanding of the methodology.
B. Examples
1. Example #1
In this example it is assumed that there are three types of claims which account
collectively for all the incurred loss. The goal is to determine the excess costs for an
attachment point of $500,000. Each type of claim is comprised of two components. The
components are considered to be independent. The distribution of the components of each
type of claim are given in the tables below.
Claim Type 1, Claim Type 2, Claim Type 3
(component distribution tables not reproduced)
If a joint distribution is created for each type of claim and the percent excess of 500,000
is calculated for each, the excess percent is as shown in the following table.
Excess Cost
Claim Type    Prcnt.
#1            39.50%
#2            13.40%
#3            7.70%
Further assume that the percent of the pure premium is known to be distributed as follows:
Distribution of Loss Cost
Claim Type    Prcnt.
#1            5.2%
#2            71.3%
#3            23.5%
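Example #1 as reproduced here stops at the loss cost distribution; assuming the intended final step is the same weighting described in Section II, a minimal sketch would be:

```python
# Sketch of the final step of Example #1 (assumed): weight each claim type's
# excess-of-500,000 percentage by its share of the loss cost.
excess_pct = {1: 0.395, 2: 0.134, 3: 0.077}   # excess cost percents from the table above
loss_cost = {1: 0.052, 2: 0.713, 3: 0.235}    # distribution of loss cost from the table above

overall_excess = sum(excess_pct[k] * loss_cost[k] for k in excess_pct)
print(f"Overall excess factor at 500,000: {overall_excess:.1%}")   # about 13.4%
```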
2. Example #2
This example illustrates some of the calculations involved in estimating the distribution
of costs for Fatal claims. In order to estimate the distribution of indemnity costs for a
fatal claim a number of parameters need to be specified. These are as follows:
a. Wage Distribution
Ratio of AWW to SAWW*    Percent of Workers Earning AWW
0.30                     5.0%
0.60                     30.0%
1.00                     40.0%
1.35                     10.0%
1.50                     15.0%
* SAWW = $600
b. Age Distribution
Age at Injury    Percent of Workers at Age
20               20.0%
30               20.0%
40               20.0%
50               20.0%
60               20.0%
d. Benefit Assumptions
e. Life Table
US Life Table - 1980
See Appendix B.
Given data in a., b., d., and e. above the distribution of indemnity costs for fatalities
suffered by individuals aged 40 is as given in the following table.
Distribution of Indemnity Costs at Age 40
(table values not reproduced)
The figures in the above table were obtained by first calculating the costs for each
individual cell. For example, suppose a fatally injured worker was earning $810 per
week. Then the surviving spouse's weekly benefits would be (66 2/3%)($810) = $540 or
an annual amount of $28,080. Also assume that the spouse receives benefits for exactly
twenty years and then dies. The amount received is (20)($28,080) = $561,600 and the
probability of this event is (10%)((84,789 - 83,726)/94,926) = .112% (see wage
distribution and Appendix B). The outcomes were then grouped into intervals of
$100,000. The outcome of the above described event would fall into group G6.
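A minimal sketch of this cell-by-cell calculation, using the wage distribution above and l_x values of the kind shown in Appendix B (the spouse is assumed, for illustration, to follow the same life table as the worker), might look like:

```python
# Sketch: one cell of the fatal indemnity cost distribution for a worker age 40 at death.
# Assumptions for illustration: benefit is 66 2/3% of the AWW, payable until the
# spouse's death, and the spouse's survival follows the Appendix B style life table.
SAWW = 600.0
wage_dist = {0.30: 0.05, 0.60: 0.30, 1.00: 0.40, 1.35: 0.10, 1.50: 0.15}  # ratio to SAWW: prob
lx = {40: 94926, 59: 84789, 60: 83726}   # excerpt of l_x values; full table in Appendix B

def cell_cost_and_prob(ratio, years_of_benefit, age=40):
    """Cost and probability of one (wage, years-of-benefit) cell."""
    weekly_benefit = (2.0 / 3.0) * ratio * SAWW
    cost = weekly_benefit * 52 * years_of_benefit
    # probability of death in the year matching the worked figure, (l59 - l60) / l40
    prob_death = (lx[age + years_of_benefit - 1] - lx[age + years_of_benefit]) / lx[age]
    return cost, wage_dist[ratio] * prob_death

# The worked example above: $810 per week (ratio 1.35) and benefits for twenty years
cost, prob = cell_cost_and_prob(1.35, 20)
group = int(cost // 100_000) + 1           # grouping into $100,000 intervals
print(round(cost), f"{prob:.3%}", f"G{group}")   # roughly 561,600, 0.112%, G6
```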
A graph of the distribution of indemnity costs for a person age 40 is shown in Figure #1.
This is followed by a graph of the distribution of costs for a person age 30 in Figure #2.
A few things should be noted about the two graphs. One is that the distribution of costs in
the age 30 graph is somewhat to the right of the age 40 distribution. This would be
intuitively expected since the individuals age 30 at time of death would provide about an
additional 10 years of benefits to their survivors.
Figure #1: "Age Group 30 Loss Distribution" (increment of 100,000) - probability by cost group, G1 through G26 (graph not reproduced)
Figure #2: "Age Group 40 Loss Distribution" (increment of 100,000) - probability by cost group, G1 through G26 (graph not reproduced)
It's also interesting to note that both of the distributions are somewhat "lumpy". The
distributions have been created from a life table which is fairly smooth, and the
combination of a wage distribution and certain benefits assumptions.
It seems that the fact that the wage distribution shows an uneven distribution of wages
and the statutory benefits display certain maximums and minimums is the cause of the
unevenness. Thus it is doubtful that there is any existing statistical distribution currently
widely used that would fit these curves.
The graph in Figure 3 shows the distribution of costs for all ages 20 through 60. Note that
some of the "lumpiness" still remains. The right hand portion of the graph is of greatest
concern to excess reinsurers and it is important to test the assumptions that go into the
creation of this tail.
Fatal Medical Distribution
Amount        Probability
$0            99.0%
$100,000      0.75%
$1,000,000    0.25%
When a joint distribution is created using the above distribution and the distribution of
indemnity costs shown in Figure 3, the distribution shown in Figure 4 is obtained.
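The mechanics of forming such a joint distribution are simple convolution under independence. The sketch below uses the fatal medical distribution above together with a stand-in indemnity distribution, since the Figure 3 values are not reproduced:

```python
# Sketch: joint distribution of fatal costs from independent medical and indemnity
# components, regrouped into $100,000 bands. The medical distribution is the one in
# the text; the indemnity distribution is illustrative only.
from collections import defaultdict

medical = {0: 0.990, 100_000: 0.0075, 1_000_000: 0.0025}
indemnity = {150_000: 0.30, 350_000: 0.40, 550_000: 0.20, 750_000: 0.10}  # stand-in values

joint = defaultdict(float)
for m, pm in medical.items():
    for i, pi in indemnity.items():
        group = int((m + i) // 100_000) + 1      # $100,000 bands G1, G2, ...
        joint[group] += pm * pi                  # independence assumed

for g in sorted(joint):
    print(f"G{g}: {joint[g]:.3%}")
```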
A. General Remarks
Figure #3: "Age Groups 20 - 60 Loss Distribution" (increment of 100,000) - probability by cost group, G1 through G26 (graph not reproduced)
Figure #4: joint distribution of fatal indemnity and medical costs - probability by cost group, G1 through G36 (graph not reproduced)
B. Permanent Total
1. Medical
a. Comments/Range of Amounts
On the other end of the spectrum, ultimate incurred medical amounts that
are less than $25,000 have been observed. This is difficult to explain.
However, it has been suggested that accidents that are disabling, such as
blinding, might be one explanation. Another is that some states have
customarily awarded permanent total status for what seem to be minimal
injuries. An example of this is an actual case where permanent total
disability was awarded for tendinitis of the elbow. The medical costs of
treating an injury of this type would be expected to be nominal.
Intuitively, the data may not be satisfying but given that the same thing is
shown in several data sets it is reasonable to accept the indications.
c. Development
Having cleaned up the data as much as possible, the next step taken was to
project individual costs to ultimate. At this point the only type of costs
under discussion are the medical incurred amounts. Data drawn from
a recent Pennsylvania Loss Cost filing was used to develop the estimates
in the following table.
Period    Case Res. Devl. Factor    Period    Case Res. Devl. Factor    Period    Case Res. Devl. Factor
(factor values not reproduced)
These factors are applied to the case reserves on individual open claims
where the factors are selected according to the accident year of the claim.
For example, suppose the year in which the data is being analyzed is 1999
and the accident year is 1992. Also assume that the undeveloped medical
incurred is $272,312, of which paid = $118,705, so that the case reserve is
$153,607. Then, with a selected factor of 3.26, the ultimate medical
incurred is (3.26)(153,607) + 118,705 = 619,464.
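A minimal sketch of this development step (the 3.26 factor is the one quoted for a 1992 accident year evaluated in 1999; the full factor table is not reproduced):

```python
# Sketch of the case-reserve development step described above. Only the 3.26 factor
# quoted in the text is from the paper; the dictionary is otherwise illustrative.
case_reserve_devl = {1992: 3.26}   # accident year -> selected case reserve development factor

def ultimate_medical(incurred, paid, accident_year):
    """Factor applies to the case reserve only; paid losses are carried forward as-is."""
    case_reserve = incurred - paid
    return case_reserve_devl[accident_year] * case_reserve + paid

print(round(ultimate_medical(272_312, 118_705, 1992)))   # about 619,464
```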
d. Trending
The next step is to trend the cost on individual claims up to the current
date. A good source of data for this purpose that is easily accessible is the
Bureau of Labor Statistics. The web site address is www.bls.gov. The
medical increases for the last 10 years have been in the 3+% range.
After bringing the costs up to current level the costs are then projected to
the middle of the period for which the rate will be applicable. Use of a
future trend factor of approximately 3.5% at the writing of this paper
seems reasonable.
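A sketch of the trending step, with the historical and future annual trends treated as assumptions of roughly 3% and 3.5%:

```python
# Sketch of trending: bring a developed claim cost to the current cost level, then
# project to the midpoint of the prospective rating period. The trend rates, dates
# and claim amount below are assumptions for illustration.
def trended_cost(cost, years_to_current, years_current_to_midpoint,
                 past_trend=0.03, future_trend=0.035):
    at_current_level = cost * (1 + past_trend) ** years_to_current
    return at_current_level * (1 + future_trend) ** years_current_to_midpoint

# e.g. a claim from 4 years ago, rates effective for a period whose midpoint is
# 1.5 years in the future
print(round(trended_cost(619_464, 4.0, 1.5)))
```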
e. Statistical Modeling
In previous applications of this method it has been found that the data
even after the previously described adjustments is not smooth enough over
various intervals to be used immediately. In particular it is often the case
that there are ranges of several million dollars where there are no claims.
Conversely - but occurring less often - there are instances when a fairly
narrow interval might include two or more fairly large claims. "Fairly
large" as used in this context means over $5 million.
This makes it difficult to justify
fitting a single curve over the whole range of values. Some fitting over
limited ranges may seem workable but the benefits seem questionable.
One well known curve that initially seemed appealing was the log-normal
curve. However when a goodness of fit test was used (Kolmogorov-
Smirnov) on a medium size set of data the results were found to be
inconclusive. Later, when testing on a much larger body of data, it became
clear that the test results were indicating that it was unlikely that the data
was generated as a sample from a log-normal distribution.
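A sketch of this kind of goodness-of-fit check, on simulated data purely for illustration:

```python
# Sketch: fit a log-normal to a sample of developed, trended medical costs and apply
# a Kolmogorov-Smirnov test. The sample here is simulated, not from the paper.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
claims = rng.lognormal(mean=11.5, sigma=1.6, size=400)   # stand-in for a claim sample

shape, loc, scale = stats.lognorm.fit(claims, floc=0)    # fit with location fixed at zero
ks_stat, p_value = stats.kstest(claims, 'lognorm', args=(shape, loc, scale))
print(f"K-S statistic {ks_stat:.3f}, p-value {p_value:.3f}")
# A small p-value suggests the sample is unlikely to have come from the fitted
# log-normal (note: fitting the parameters on the same sample makes the nominal
# p-value somewhat optimistic).
```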
The solution adopted was to simply use the data as a foundation for an
empirical curve. Before final construction of the curve, smoothing was
conducted over consecutive intervals. A facsimile of the final curve is
shown in Figures 5a-f.
2. Indemnity
(1) Variation by State
SAWW; Florida-100% SAWW; Iowa-200% SAWW; Mississippi-66
2/3% SAWW. The maximum for New York is a dollar amount =
$400.
(c) Limits
In addition to the specifications in (1) and (2) above some states have
limits specified in either time and/or amounts. Usually when there are
limits these are expressed in both time and amounts. For example,
South Carolina-500 weeks, $241,735; Mississippi-450 weeks,
$131,787. For the most part however, the benefits are granted for life,
although some states have offsets and other types of limitations that
are discussed in the next section.
(d) Offsets
Some states have introduced Offsets and this trend has continued into
the present time. For example: Arkansas-Reduce PP 50% of non-
employee portion of public/private funded retirement/pension plan of
65 years or older; Colorado-Social Security, unemployment
compensation, an employer-paid pension plan; Michigan-Disability,
unemployment compensation, pension, old age Social Security
retirement; New Jersey-Social Security; Pennsylvania-unemployment
compensation, Social Security Old Age and certain severance and
pension payments.
(e) Escalation
Some states, Massachusetts among them, mandate escalation tied to the CPI, but limited
to 5% in Massachusetts. Nevada's benefits are increased by an amount equal to
the change in the SAWW.
(a) Wage Distribution
If it is assumed that the SAWW is 600 and the benefit is 66 2/3% of the
AWW, then the figures in the Wtd. Avg. column should be multiplied by
400, producing the following table.
Benefit Distribution
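The Benefit Distribution table itself is not reproduced. A sketch of the calculation it summarizes, using the wage distribution from Example #2 and an assumed statutory maximum of 100% of the SAWW (actual maximums and minimums vary by state), follows:

```python
# Sketch: map each wage ratio to 66 2/3% of the AWW, subject to an assumed state
# maximum, to obtain a weekly benefit distribution. The 100%-of-SAWW maximum is an
# assumption for illustration only.
SAWW = 600.0
wage_dist = {0.30: 0.05, 0.60: 0.30, 1.00: 0.40, 1.35: 0.10, 1.50: 0.15}  # ratio: probability
max_weekly = 1.00 * SAWW   # assumed statutory maximum

benefit_dist = {}
for ratio, prob in wage_dist.items():
    benefit = min((2.0 / 3.0) * ratio * SAWW, max_weekly)
    benefit_dist[benefit] = benefit_dist.get(benefit, 0.0) + prob

for benefit, prob in sorted(benefit_dist.items()):
    print(f"${benefit:,.2f} per week: {prob:.0%}")
```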
(b) Age
The protocol outlined in this paper is to assume that some workers are
age 20 at time of injury, some 25 and so forth in five-year intervals up
to age 60. This makes the number of ages more manageable and it
seems, through some research and analysis, still provides a good
estimate of the costs.
The life tables used in these calculations are the tables from the 1979-
1981 experience period and are for the total population. Thus, they include
males, females and all races. They are obtainable from the Centers for
Disease Control and can be downloaded from their website.
These tables are used based on the assumption that the U.S. work
force has the same proportions of men and women as does the general
population. Another assumption implicitly made here is that men and
women have equal exposure to serious injury.
It could and has been argued extensively that for Permanent Total
injuries, or at least certain subsets, an impaired life table should be
used. However medical care today has advanced to the point that even
very seriously injured individuals can expect a normal life span. The
NCCI undertook a study of impaired lives fairly recently (within the
last 10 years) and published a life table based on the study. Review of
that table did not offer convincing evidence that anything other than
the total U.S. population table should be used.
(d) Offsets
Next the Social Security benefits for the disabled workers must be
estimated. The benefits are based on earnings through the previous year
and hence the earnings are adjusted back to 1999. We then estimate the
Social Security benefit based on that number and using the Social
Security benefit structure. (This can be obtained from the Social
Security website.) In this case, the Social Security benefits are
found to be $210.34 per week.
The next step is to estimate the benefit under Workers Comp. Since the
individual is earning $475 per week and the benefit is awarded at 2/3 of
AWW, the benefit is $316.67 per week. The sum of the Social Security
benefit and the Workers Comp benefit is $527.01, which exceeds 80%
of $475 by $147.01. This amount is the "Offset". Thus the Workers
Comp benefits are reduced from $316.67 per week to $169.66. This
leaves the sum at $380.00 = (.8)(AWW).
It should also be noted at this point that Florida provides for escalating
benefits for a period. The interpretation of this part of the law made
here is that the 5% increase applies to the amount $169.66.
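A minimal sketch of the offset arithmetic just described (benefits capped so that Workers Comp plus Social Security does not exceed 80% of the AWW):

```python
# Sketch of the Social Security offset computation: weekly Workers Comp benefits are
# reduced so that WC plus Social Security does not exceed 80% of the AWW.
def offset_wc_benefit(aww, ss_weekly, wc_fraction=2.0 / 3.0, cap_fraction=0.80):
    wc_benefit = wc_fraction * aww
    cap = cap_fraction * aww
    offset = max(0.0, wc_benefit + ss_weekly - cap)
    return wc_benefit - offset

# The worked figures above: AWW of $475 and a $210.34 weekly Social Security benefit
print(round(offset_wc_benefit(475.0, 210.34), 2))   # about 169.66
```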
(e) State Average Weekly Wage
Since rates are made to be effective for some period in the future
historical information must be trended to that period. When a history of
State Average Weekly Wage is available, this is used to trend to the rate
effective period. An example of this is given in the following table.
10/1/97-9/30/98    $665.55
10/1/96-9/30/97    $631.03
10/1/95-9/30/96    $604.03
10/1/94-9/30/95    $585.66
10/1/93-9/30/94    $565.94
10/1/92-9/30/93    $543.30
10/1/91-9/30/92    $515.52
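A sketch of how such a history might be used, with the projection lag to the rating period midpoint treated as an assumption:

```python
# Sketch: estimate an annual wage trend from the SAWW history above and project the
# latest value forward. The 2-year lag to the rating period midpoint is an assumption.
saww = {1992: 515.52, 1993: 543.30, 1994: 565.94, 1995: 585.66,
        1996: 604.03, 1997: 631.03, 1998: 665.55}   # keyed by fiscal year ending 9/30

years = sorted(saww)
annual_trend = (saww[years[-1]] / saww[years[0]]) ** (1 / (years[-1] - years[0])) - 1
print(f"Average annual SAWW trend: {annual_trend:.2%}")   # about 4.3%

years_to_midpoint = 2.0
print(f"Projected SAWW: {saww[years[-1]] * (1 + annual_trend) ** years_to_midpoint:.2f}")
```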
C. Fatal
1. Medical
As noted earlier most Fatal claims do not have any medical cost associated
with them. However some Fatal claims do display medical costs in small,
medium or even large amounts. The average cost of the medical on Fatal
claims is very', very small in comparison to the indemnity costs, ltowever
the task here is to estimate the Excess costs and thus the medical costs
although small in relation to first dollar costs can add significantly to
Excess costs. This is especially true when the Fatal benefits are extremely
limited as in Florida where Fatal benefits are limited to $100,000.
Fatal Medical Distribution
Amount        Probability
$0            95.0%
$500          4.0%
$100,000      0.7%
$1,500,000    0.3%
2. Indemnity
It is assumed here that the ages of workers are uniformly distributed and
that the propensity to suffer a fatality is the same at each age. It must
be noted here that this is a simplifying assumption. There is some data
available that would indicate that the frequency of mortality is slightly
higher for workers in their twenties than for workers at higher ages. It
has been speculated that this is a result of young workers either not
having been fully trained in safety procedures, simply lacking
experience, being either more inclined to take risks or being less
careful. It should be mentioned here that similar data indicates that
workers between fifty and sixty are more inclined to suffer permanent
total injuries than younger workers. In this case it has been speculated
that older workers are simply less physically fit than younger workers
with the following implications. The first is that the execution of a
particular task is more likely to result in an injury to an older worker
than a younger worker e.g. lifting an object weighing 70 pounds. The
second is that, given a particular injury, it may be that a younger
worker would have a propensity to heal more quickly and completely
than an older worker. These considerations have not been incorporated
into the model due to the lack of a highly reliable database.
In order to perform the calculations it is necessary to assume a certain
age or potential ages of the deceased worker. As is the case with
Permanent Total claims discussed previously, the assumption used here
is that the worker's age at time of death was either 20, 25, . . . , up to
60.
b. Wage
c. Percent Award
d. Maximum, Minimum
Weekly fatality benefits are limited as is the case for Permanent Total.
Usually the maximum and minimum can be shown to be a function of
the SAWW. However benefits to children may cause some small
exceptions. These limitations play a significant role when the benefits
are payable for life but are not nearly as important when there are time
or amount limitations. For example consider Florida where the
limitation on Fatal benefits is $100,000.
The maximum weekly benefit in Florida is $522 per week. Thus the
length of payments is about 3.7 years. If the weekly maximum was
50% higher the length of the payments would be about 2.5 years and if
50% lower, the length would be about 5.5 years. Thus the average
point of payment would be either 1.85 years, 1.25 years or 2.75 years
with the difference between any of these being no more than a year
and a half. This is insignificant from the point of view of the time
value of money and for excess rating purposes.
e. Offset, Limitation
(1) Offsets
(2) Limitations
f. Escalation
g. Mortality Table
As is the case with Permanent Total claims, the life tables used in
these calculations are the tables from the 1979-1981 U.S.
experience and are derived from total population statistics. The
implicit assumption made here is that men and women suffer
fatalities equally in the workplace. This is probably not a precisely
correct assumption and it has been speculated that perhaps the
mortality rate is higher for men since men engage in inherently
more hazardous work e.g. contracting, roofing, logging and
fishing. However a considerable number of women drive or ride in
vehicles as part of the job and many of the fatalities experienced in
the course of work result from vehicle accidents. Whatever the true
exposure, the unavailability of good data makes attempts to
measure the mix of males and females with respect to fatal claims
somewhat impractical. It should also be noted that use of an "all
lives" mortality table when most of the workers compensation
fatalities are men adds a degree of conservatism. It might be noted
here that in developing this methodology many similar decision
points were encountered and the decision was made to make
conservative selections due to the large degree of risk taken in
underwriting an excess workers comp program.
D. Major Permanent Partial
1. Medical
a. Source of Data
b. Range of Amounts
It was mentioned earlier that the range of the medical costs associated
with this type of injury can be surprisingly large. Some databases that
we were able to access displayed claims whose maximum incurred
medical was not much over $500,000. But other databases presented
claims in the multiple millions of dollars. Serious injuries such as
damage to the spinal column, severe burns requiring extensive
reconstructive surgery and electrical burns causing nerve and muscle
damage are only a few of the examples of medical catastrophes that
are very costly but which may allow an individual to return to work.
The expected value of the average medical claim for Major Permanent
Partial has been estimated to be between $80,000 and $100,000 in
PCRB filings in recent years.
2. Indemnity
For Fatal and Permanent Total claims it was felt that direct recourse to
state Statutes would generate the best available estimate of indemnity
costs associated with these types of claims. However this is not true
with respect to the benefits provided for Major Permanent Partial. For
one thing there are an inordinate number of categories e.g. loss of
index finger, thumb, eye, great toe, toe other than great toe, foot, arm,
hand, leg and on and on and on.
thereby obtaining, it is hoped, homogeneous data and constructing the
indemnity distribution curves. When there is not recourse to additional
data for all the states the curve is adjusted by reviewing the details of
the Statutory PP indemnity benefits.
b. Range of Amounts
It may seem surprising but the determination of the weights by type of loss
may be the weakest link in this methodology. Often the weighting must be
based on data that is the summary of data on a handful of claims.
A. Source of Data
The asterisk (*) indicates that the figure is based on less than 25 cases.
Given this, it might be expected that the indicated weight is not especially
accurate since the sample size is small and that the range of values of
individual claims is quite large.
In addition to the above cited weakness, the 1998 Edition also did not
display weights for several states. Some were large states, notably Ohio
and Pennsylvania.
Similar weights can be extracted from the rate filings of other rating
bureaus such as PCRB, NYCIRB and WCIRBC.
B. Development
All Policy Years: 1. Experience as Reported; 2. Developed Experience. (Exhibit values not reproduced.)
It should be noted that the above figures are taken from a primary rate
filing with the development terminated after a reasonable amount of time.
However, experience with Permanent Total claims would suggest that the
cost of this type of claim continues to develop over a period measured in
decades. Thus the distribution percentage for PT in particular is likely on
the low side even at what is construed to be ultimate for the purposes of
the rate filing. Thus the selection of the weights requires some judgement.
For example the Permanent Total column of the above constructed
facsimile shows weights between 5.2% and 7.5%. The states displaying
5.2% as the weight for PT are Alabama and Massachusetts. However
Massachusetts is a much higher benefit state than Alabama with not only
a higher average weekly wage but also with escalating benefits to age 65.
On the other hand the fatal benefits in Texas are about the same as in
Louisiana, so it is difficult to justify the difference in weights shown in the
table. Thus, when selecting weights, consideration must not only be given
to whatever data is available but also to the state mandated benefits.
V. Examples
A. Example #3
Figures 5a-5f: facsimile of the smoothed empirical medical claim cost distribution referenced in the Permanent Total discussion above, shown in segments covering roughly 0-50, 50-500, 500-1,000, 1,000-2,500, 2,500-5,000 and 5,000-17,000 (medical claim cost in 000's); graphs not reproduced.
generated by assuming a wage distribution similar to that produced by
NCCI in the past and assuming a given level of SAWW and benefits.
The SAWW is assumed to be $600 in this example.
Fatal Medical Distribution
Amount        Probability
0             25.0%
8,000         67.5%
75,000        4.0%
300,000       3.0%
1,750,000     0.5%
Weights by Type of Loss
Type of Loss       Weights
Fatal              2.0%
Perm. Total        11.5%
Maj. Perm. Partial 55.0%
Figures 6a-6c: Example #3 medical claim cost distribution, in segments covering roughly 15-100, 100-1,000 and 1,000-5,000 (medical claim cost in 000's); graphs not reproduced.
Figures 7a-7b: Example #3 indemnity claim cost distribution, in segments covering roughly 15-100 and 100-1,000 (indemnity claim cost in 000's); graphs not reproduced.
These weights combine with the previously estimated excess factors to
produce an excess factor of 13.4%, i.e.
(2.0%)(34.1%) + (11.5%)(55.8%) + (55.0%)(11.5%) = 13.4%.
B. Example #4
In Appendix D, the reader will find a complete set of excess factors for
Pennsylvania. These were developed using the described methodology.
VI. Miscellaneous
1. Change of Benefits
Over the past decade workers compensation laws have been revised
often with varying levels of impact emanating from given changes.
Many of the changes have been focused on benefits. In an effort to
bring the benefits accruing to an injured worker to a level equal to
economic benefits accruing from other events, the benefits have
generally been reduced and/or the administration of the law modified.
For example, Maine at one time mandated escalating benefits for
workers that had been killed or had been permanently and totally
disabled. The benefits plus the rate regulation grew so onerous that
eventually the insurance industry stopped underwriting workers
compensation exposures in that state. The resulting problems that this
caused businesses that operated in Maine were partially remedied by
reducing the statutory benefits. Currently, instead of escalating lifetime
benefits, fatal claims receive level benefits for 500 weeks. Permanent
Total claims now receive level lifetime benefits but these are now offset
to an extent by Social Security benefits and other benefits such as
employer funded benefits.
Pennsylvania and Louisiana are two other states which have revised
the statutorily mandated benefits in the last decade.
which is often more critical. In addition to the above, the view
presented here is that the estimate generated using the described
methodology is, at any rate, more accurate since it does not depend on
a small sample of claims.
3. Pricing of Layers
does not guarantee that the results are accurate it is a simple task to
perform and may identify missteps in calculations.
It has been suggested that adjusting the rates to reflect the Hazard
Group profile might produce the appropriate rates. However it should
be noted that somewhat over 90% of all risks fall into either Hazard
Group II or Hazard Group III. Thus adjusting the statewide rates by
Hazard Group may produce some improvement in matching the rates
to the risk but it would seem that the progress would be minimal.
Statistics to begin the above suggested process are available from the
various ratemaking bureaus.
6. Allocated Loss Adjustment Expense Considerations
The methodology and the examples presented in this paper did not
consider the impact of allocated loss adjustment expense. However it
is felt that this methodology can be extended to include allocated loss
adjustment expense costs. It would seem that this would add an
additional layer of complexity. Evidence available to the author of this
paper suggests that ALAE is not a direct add-on. That is, it would be
inappropriate to load each claim value by say 10%. For example, a
claim whose size is $15,000,000 would not carry an associated ALAE
cost of $1,500,000.
On the other hand, whereas the medical and indemnity costs seem to
be independent, it would appear that the ALAE amount is, in some
way, related to the size of the claim cost excluding ALAE. However
incurred ALAE as a percent of incurred losses seems to be negatively
correlated to the size of loss.
7. Payout Rates
8. Closing Remarks
The effort, cost, and acceptance of the methodology do not guarantee,
of course, that the rates are as accurate as they might be. This is due in
part to the difficulties previously discussed. It is also due to
assumptions that have been untested but where at least a degree of
testing may be possible. Thus work must continue to refine the
methodology.
Appendix A
Death Benefits
• If there are six children, 66 2/3 percent of wages of deceased, but not
in excess of the SAWW
Spouse and children: To the widow or widower, if there is one child, 60 percent
of wages, but not in excess of the SAWW. To the widow or widower, if there are
two children, 66 2/3 percent of wages but not in excess of the SAWW. To the
widow or widower, if there are three or more children 66 2/3 per cent of wages,
but not in excess of the SAWW.
Appendix A cont'd
Miscellaneous Benefits:
Funeral Expenses Whether or not there are dependents, the reasonable expense of
burial, not exceeding $3,000 will be paid by the employer or insurer directly to
the undertaker (without deduction of any amounts already paid for compensation
or for medical expenses).
Schedule of Permanent Injuries. For all disability resulting from permanent injuries of the
following classes, 66 2/3 percent of wages is exclusively paid for the following number
of weeks.
Appendix A cont'd
(Schedule of permanent injuries table not reproduced.)
Appendix B
Age      lx        Age      lx
0     100,000      56     87,551
1      98,740      57     86,695
2      98,648      58     85,776
3      98,584      59     84,789
4      98,535      60     83,726
5      98,495      61     82,581
6      98,459      62     81,348
7      98,426      63     80,024
8      98,396      64     78,609
9      98,370      65     77,107
10     98,347      66     75,520
11     98,328      67     73,846
12     98,309      68     72,082
13     98,285      69     70,218
14     98,248      70     68,248
15     98,196      71     66,165
16     98,129      72     63,972
17     98,047      73     61,673
18     97,953      74     59,279
19     97,851      75     56,799
20     97,741      76     54,239
21     97,623      77     51,599
22     97,499      78     48,878
23     97,370      79     46,071
24     97,240      80     43,180
25     97,110      81     40,208
26     96,982      82     37,172
27     96,856      83     34,095
28     96,730      84     31,012
29     96,604      85     27,960
30     96,477      86     24,961
31     96,350      87     22,038
32     96,220      88     19,235
33     96,088      89     16,598
34     95,951      90     14,154
35     95,808      91     11,908
36     95,655      92      9,863
37     95,492      93      8,032
38     95,317      94      6,424
39     95,129      95      5,043
40     94,926      96      3,884
41     94,706      97      2,939
42     94,465      98      2,185
43     94,201      99      1,598
44     93,913      100     1,150
45     93,599      101       815
46     93,256      102       570
47     92,882      103       393
48     92,472      104       267
49     92,021      105       179
50     91,526      106       119
51     90,986      107        78
52     90,402      108        51
53     89,771      109        33
54     89,087      110        21
55     88,348      111         0
Appendix C
(Table not reproduced; only the final line is legible: 100.000000%    753,447.87)
Appendix D
State: Pennsylvania
Effective Year: 1999
Excess of     Excess Factor
100,000       37.67%
150,000       30.51%
200,000       25.32%
250,000       21.44%
300,000       18.31%
350,000       15.68%
400,000       13.43%
450,000       11.51%
500,000       9.88%
750,000       4.97%
1,000,000     2.50%
1,250,000     1.55%
1,500,000     1.07%
2,000,000     0.66%
Surplus Allocation for the Internal Rate of
Return Model: Resolving the Unresolved Issue
SURPLUS ALLOCATION FOR THE INTERNAL RATE OF RETURN MODEL:
RESOLVING THE UNRESOLVED ISSUE
Abstract
In this paper, it is shown that with a certain definition of risk-based discounted loss
reserves and a certain method of surplus allocation, there is an amount of premium for a
contract which has the following properties:
(1.) It is the amount of premium required for the contract to neither help nor hurt the
insurer's risk-return relation.
(2.) It produces an internal rate of return equal to the insurer's target return.
If the insurer gets more than this amount of premium, then the insurer can get more return
with the same risk by increasing the percentage of the premium for the overall book
which is in the segment. Conversely, if the insurer gets less than this amount of
premium, the insurer can increase its return by decreasing the percentage of the overall
premium which is in the segment. The amount of premium is equal to the risk-based
premium in "Pricing to Optimize an Insurer's Risk-Return Relation," (PCAS 1996).
The above property 1 of risk-based premium is proven by Theorem 2 of the 1996 PCAS
paper and not by the present paper. The present paper proves property 2.
1. INTRODUCTION
The problem of relating pricing to the risk-return relation has been discussed in many
recent actuarial papers. Surplus allocation is not described in these papers as something
that can be done in a theoretically justifiable way. Actually, though, surplus allocation
has been used in a way that has been proven by a theorem (Theorem 2 of [1]) to derive
the amount of premium for any contract which will neither improve nor worsen the
insurer's risk-return relation. Certain estimates have to be used of course, e.g.
covariances and expected losses. The precise mathematical relationship between this
premium and the risk-return relation is specified by the statement of Theorem 2. This
theorem, and the corresponding definition of risk-based premium, are very rarely
mentioned in recent papers relating pricing to the risk-return relation.
Since the internal rate of return (IRR) model has been a part of the CAS exam syllabus
for years, and since it is a widely used method in insurance and other industries, it may be
possible to explain the method of [1] to a larger group of readers by relating it to the IRR
method.
The IRR model can be used to measure the rate of return for an insurance contract or a
segment of business, but only if the method used for allocating surplus can be related to
the insurer's risk-return relation. The model is generally presented without a
theoretically justifiable method of allocating surplus. But if an arbitrary method of
allocation (such as allocating in proportion to expected losses) is used, the results are
almost meaningless. The purpose of this paper is to complete the IRR method.
Incidentally, there are several actuarial papers which argue that surplus allocation doesn't
make sense because risk is not additive, or because in the real world all of surplus is
available to support all risks. Actually, just as a function f(x) associates each number x
with another number, surplus allocation is a mathematical function which associates each
member of a set of risks with a portion of surplus. This function can be used as a part of
a chain of reasoning in order to prove a theorem, as was done in Theorem 2 of [1].
Although surplus allocation was used in deriving the properties of risk-based premium in
[1], the derivation doesn't actually require any mention of surplus allocation. When risk-
based premium is related to the IRR method in section 4 of this paper, surplus allocation
is used because that is the traditional way of explaining the IRR method.
The premium derived in this paper by the IRR method is the same as that determined by
the method in [1]. The method in [1] is simpler to apply, but the IRR model has the
advantage of being widely used and understood. It is on the CAS syllabus and has also
been used by non-actuaries for many years. In order to relate the method in [1] to the
IRR method, explanations will be given of both methods. However, since both methods
are explained at length in the literature (see [2],[3],[4]), the explanations will be brief and
informal. This could actually be an advantage, since it could make the presentation more
lively and readable. The part of the paper which is new is the derivation of the
equivalence of the two methods, given certain assumptions and conditions.
2. THE IRR MODEL
The IRR model is a method of estimating the rate of return from the point of view of the
suppliers of surplus. Suppose for example that an investor supplied $100 million to
establish a new insurer, and that $200 million in premium was written the next day. Also
suppose that twenty years later the insurer was sold for $800 million. Ignoring taxes, the
return r to the investor satisfies the equation $100 million x (1 + r)^20 = $800 million.
Therefore, the return r equals 10.96%.
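In other words, r = (800/100)^(1/20) - 1 = 8^(1/20) - 1, which is approximately 0.1096, or 10.96%.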
Suppose that, beginning at the time of the above initial investment, each dollar of surplus
is thought of as being assigned to either an insurance policy currently in effect, a loss
reserve liability, or some other risk. Suppose that a cash flow consisting of premium,
losses, expenses, outflows of surplus, and inflows of surplus, is assigned to each policy in
such a way that the following is satisfied: the total of all the cash flows minus the
outstanding liabilities immediately prior to the time at which the above insurer is sold for
$800 million produces an $800 million surplus. It is then possible to express the input of
$100 million, and the payback of $800 million twenty years later, as the total result of the
individual cash flows assigned to each policy. Based on the individual cash flows, the
overall return of 10.96% could then be expressed as a weighted average of individual
returns for each policy. The individual return for a policy is called its internal rate of
return. The following example is taken from [2].
Suppose an insurer writes a policy on January 1, 1989 for a premium of $1,000, expects
to pay losses of $500 on January 1, 1990 and $500 on January 1, 1991, earns 10% per
annum on invested assets, and holds surplus equal to half of its reserves (a 2:1 reserves
to surplus ratio).
The internal rate of return analysis models the cash flows to and from investors. The
cash transactions among the insurer, its policyholders, claimants, financial markets, and
taxing authorities are relevant only in so far as they affect the cash flows to and from
investors.
Reviewing each of these transactions should clarify the equity flows. On January 1,
1989, the insurer collects $1,000 in premium and sets up a $1,000 reserve, first as an
unearned premium reserve and then as a loss reserve. Since the insurer desires a 2:1
reserves to surplus ratio, equity holders must supply $500 of surplus. The combined
$1,500 is invested in the capital markets (e.g., stocks or bonds).
At 10% per annum interest, the $1,500 in financial assets earns $150 during 1989, for a
total of $1,650 on December 31, 1989. On January 1, 1990, the insurer pays $500 in
losses, reducing the loss reserve from $1,000 to $500, so the required surplus is now
$250.
The $500 paid loss reduces the assets from $1,650 to $1,150. Assets of $500 must be
kept for the second anticipated loss payment, and $250 must be held as surplus. This
leaves $400 that can be returned to the equity holders. Similar analysis leads to the $325
cash flow to the equity holders on January 1, 1991.
Thus, the investors supplied $500 on 1/1/89, and received $400 on 1/1/90 and $325 on
1/1/91. Solving the following equation for v:
$500 = $400v + $325v^2
yields v = 0.769, or r = 30%. (v is the discount factor and r is the annual interest rate, so
v = 1/(1+r).)
The internal rate of return to investors is 30%. If the cost of equity capital is less than
30%, the insurer has a financial incentive to write the policy.
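A minimal sketch of solving for this internal rate of return numerically (simple bisection on the net present value; no particular financial library is assumed):

```python
# Sketch: internal rate of return on the equity flows of the example above
# (-500 at t=0, +400 at t=1, +325 at t=2), found by bisection on the NPV.
def npv(rate, flows):
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(flows))

def irr(flows, lo=0.0, hi=10.0, tol=1e-8):
    # assumes exactly one sign change in NPV between lo and hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, flows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"{irr([-500, 400, 325]):.1%}")   # about 30%
```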
Feldblum doesn't claim that the method of surplus allocation in the illustration can be
directly related to the risk-return relation. Allocating in proportion to expected losses
doesn't distinguish between the riskiness of unearned premium, loss reserves, property
risks, casualty risks, catastrophe covers, excess layers, and ground-up layers, for
example. Different methods of surplus allocation could be judgmentally applied to
different types of contracts, but from a theoretical risk-return perspective a certain use of
covariance is required. This will be explained in the next section.
3. RISK-BASED PREMIUM
What follows is an informal explanation of the derivation in [1] of the properties of risk-
based premium. In the discussions below of an insurer's risk-return relation over a one
year time period, "return" refers to the increase in surplus, using the definition of surplus
below. (The term "risk-based discounted" is used in the definition and will be explained
later.)
surplus = market value of assets - risk-based discounted loss and loss adjustment reserves
- market value of other liabilities (3.1)
At any given time, the return in the coming year is a random variable. The variance of
this random variable is what we refer to by the term "risk". The expression "optimizing
the risk-return relation" is used in the same way that Markowitz [5] used it, i.e.,
maximizing return with a given risk or minimizing risk with a given return. (Markowitz
was awarded the Nobel Prize several years ago for his work on optimizing the risk-return
relation of asset portfolios.)
For an insurance contract, or for a segment of business, the risk-based premium can be
expressed as follows (the term "loss" will be used for "loss and loss adjustment
expense"):
risk-based premium = expense provision + risk-based discounted loss provision
+ risk-based profit margin (3.2)
The above expense provision is equal to expected expenses discounted at a risk-free rate.
The starting time T for discounting recognizes the delay in premium collection.
Expenses are considered to be predictable enough so that the risk-free rate is appropriate.
The risk-based discounted loss provision is equal to the sum of the discounted values,
using a risk-free rate and the above time T, of
(a.) the expected loss payout during the year
(b.) the expected discounted loss reserve at year-end, discounted as of year-end at
a "risk-based" (not "risk-free') discount rate
A risk-free rate is used to discount (a) and (b) above because the risk arising from the fact
that (a) and (b) may differ from the actual results is theoretically correctly compensated
by the risk-based profit margin (see (3.2) above).
The phrase "contract or segment of business" will be replaced below by "contract", since
the covariance method used below has the following property: the risk-based premium
for a segment equals the sum of the risk-based premiums of the contracts in the segment.
At the inception of an insurance policy, the payout of losses during the year that the
contract is effective, and the estimated risk-based discounted loss reserve for the contract
at the end of the year, are unknown. The effect of the contract on surplus at the end of
the year, i.e., the difference in end of year surplus with and without the contract, can be
thought of as a random variable X at inception. The insurer's return, i.e., the increase in
surplus during the year, is also a random variable. Call it Y.
Assuming that the contract premium equals the risk-based premium, the expected effect
of the contract on surplus at the end of the year is equal to the accumulated value, at the
risk-free interest rate, of the risk-based profit margin. This is true because the expense
provision portion of the formula (3.2) above pays the expenses, and the risk-based
discounted losses portion pays the losses during the year and also accumulates at risk-
free interest, to the expected value of the risk-based discounted loss reserves at the end of
the year. Therefore, by formulas (3.1) and (3.2), above, the effect of the contract equals
the accumulated value of the risk-based profit margin.
The random variables X and Y were defined above. If
Cov(X,Y)/Var(Y)=E(X)/E(Y) (3.3)
then, according to Theorem 2 of [1], the contract neither improves nor worsens the risk-
return relation, in a certain sense. This was defined above in the abstract. Note that
E(X), above, equals the accumulated value of the risk-based profit margin.
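As an illustration only (the joint distribution below is hypothetical and not part of the derivation in [1]), the two sides of condition (3.3) can be computed from simulated values of X and Y in S-Plus/R-style code; they coincide only when the contract is written at its risk-based premium.
set.seed(1)
n  <- 100000
Y0 <- rnorm(n, mean = 8, sd = 15)               # return on the portfolio without the contract (assumed)
X  <- 0.2 * Y0 + rnorm(n, mean = 1.28, sd = 3)  # effect of the contract on year-end surplus (assumed)
Y  <- Y0 + X                                    # total return including the contract
cov(X, Y) / var(Y)                              # left-hand side of (3.3)
mean(X) / mean(Y)                               # right-hand side of (3.3)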
One of the components of risk-based premium is the expected value of the risk-based
discounted loss reserves at the end of the year. This expected value is greater than the
expected value of the loss reserves discounted at the risk-free rate corresponding to the
duration of the loss reserves. This is how the risk-based premium provides a reward for
the risk of loss reserve variability. The risk-based discount rate is therefore less than the
risk-free rate.
At the end of each year following the effective period of a contract, if the matching assets
for the risk-based discounted loss reserves are invested at the risk-free rate, their expected
value at the end of the following year will be greater than the expected discounted
liability. This is because the risk-based discount rate is less than the risk-free rate.
Assume, for example, that the loss payout is exactly equal to the expected loss payout.
At the moments that loss payments are made, both the discounted loss reserve and the
matching assets are reduced by the same amount. At other times, the matching assets are
growing at the risk-free rate and the discounted liability is growing at the lower risk-
based discount rate.
At the beginning of the second year after the inception of the policy, the end of the year
matching assets minus the discounted loss reserve can be thought of as a random variable
Z. If Cov(Y,Z)/Var(Y) is equal to E(Z)/E(Y), then, according to Theorem 2 of [1], the
risk-based discounted loss reserve and matching assets neither improve nor worsen the
risk-return relation for the year. It is possible to compute a discount rate before the
inception of the contract such that Cov(Y,Z)/Var(Y) is equal to E(Z)/E(Y). Note that if
the matching assets are not risk-free, that affects both Cov(Y,Z) and E(Z) and may have a
slight effect on the risk-based discount rate. If risk-based discount rates are computed for
each of the years until the loss reserve is expected to be fully paid, the risk-based
premium for the contract is determined.
For practical purposes, the above derivation of risk-based discounting of losses for a
contract can be simplified if certain estimates are used. For example a single risk-based
discount rate can be used for all future years. This approach was used in [1]. Since risk-
based premium is determined by the estimated expense payout, loss payout, risk-based
profit margin, and risk-based discount rate, the explanation of risk-based premium has
now been concluded. The following two examples are taken from [1].
Assume that:
1. The probability of zero losses to the catastrophe cover is .96, and the probability
that the losses will be $25 million is .04. Therefore, the variance of the losses is
24 trillion, and the expected losses are $1 million.
2. Property premium earned for the year is $100 million, and there is no casualty
premium.
3. The standard deviation of pre-tax underwriting return is $15 million.
4. The expected pre-tax return from the entire underwriting portfolio is $8 million.
5. Taxes have the same proportional effect on the expected pre-tax returns on total
premium and on the catastrophe cover, and on the standard deviations of returns.
6. The covariance between the catastrophe cover's losses and total property losses
net of the cover is equal to .50 times the variance of the cover's losses.
7. The discount rate for losses is zero.
8. Total underwriting return, and the return on the catastrophe cover, are statistically
independent of non-underwriting sources of surplus variability.
It follows from 1, 6 and 8 above, and from the fact that Cov(X,Y+Z) = Cov(X,Y) +
Cov(X,Z), that the covariance with surplus of the pre-tax return on the catastrophe cover
is 24 trillion + .50(24 trillion); i.e., 36 trillion. It follows from 3 and from Cov(X,Y+Z) =
Cov(X,Y)+Cov(X,Z) that the corresponding covariance for total underwriting is (15
million) 2 , i.e., 225 trillion. Therefore, it follows from assumption 4 that the risk-based
profit margin for the catastrophe cover should be such that the pre-tax return from re-
assuming the catastrophe cover is given by (36/225)($8 million)= $1.28 million. (This is
greater than the cover's expected losses.) If the cover costs more than $2.28 million, then
it improves the insurer's risk-return relation to re-assume it. However, the cover may be
necessary to maintain the insurer's rating and policyholder comfort.
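The figures in this illustration can be retraced with a few lines of S-Plus/R-style arithmetic; nothing below goes beyond assumptions 1 through 8 and the quantities already quoted in the text.
p.loss     <- 0.04                              # probability of a $25 million catastrophe loss
loss.amt   <- 25e6
exp.loss   <- p.loss * loss.amt                 # expected losses: $1 million
var.cover  <- p.loss * loss.amt^2 - exp.loss^2  # variance of the cover's losses: 24 trillion
cov.cover  <- var.cover + 0.50 * var.cover      # covariance with surplus: 36 trillion (assumption 6)
cov.total  <- (15e6)^2                          # covariance for total underwriting: 225 trillion
req.return <- (cov.cover / cov.total) * 8e6     # required pre-tax return: (36/225) * $8 million = $1.28 million
req.return + exp.loss                           # break-even cost of the cover: $2.28 million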
Let a and b denote the standard deviations of the losses to the higher and lower layers, respectively. Let ρ denote the correlation. With the above assumptions, the pre-tax covariances with surplus for the higher and lower layers, respectively, are given by a² + ρab and b² + ρab.
The allocated surplus for the 0-$500,000 layer is 202.5/29.25 (i.e., 6.9) times as great as the
allocated surplus for the $500,000 excess of $500,000 layer. The expected losses are
nine times as great for the lower layer. Therefore, the required profit margin, as a
percentage of expected losses, is 1.3 (i.e., ((9)(29.25))/202.5) times as great for the higher
layer as it is for the lower layer. This is expected due to the higher layer's larger
coefficient of variation.
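The ratios quoted for the two layers follow directly from the covariances with surplus; the short S-Plus/R-style sketch below redoes only that arithmetic, since the standard deviations and correlation behind the 202.5 and 29.25 figures appear in the omitted display.
cov.lower  <- 202.5                     # 0-$500,000 layer
cov.higher <- 29.25                     # $500,000 excess of $500,000 layer
cov.lower / cov.higher                  # ratio of allocated surplus: about 6.9
loss.ratio <- 9                         # expected losses are nine times as great for the lower layer
(loss.ratio * cov.higher) / cov.lower   # relative profit margin as a percentage of expected losses: 1.3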
4. RISK-BASED PREMIUM AND THE IRR MODEL
An example is given below to show the following. Suppose that the risk-based premium
of my model for a certain contract corresponds to a certain expected rate of return for the
insurer. Then, the expected rate of return for the contract, using the IRR model, also
equals that target rate if the method of allocating surplus for the IRR model is the
covariance method of my model.
Suppose the target rate of return is 15%. The risk-based premium for a contract equals
expense provision + risk-based profit margin + risk-based discounted losses. Suppose
that premium and expenses are paid at the end of the year, and the expected loss payout is
$100 at the end of each year for four years. Suppose expenses are $70 and the risk-based
profit margin is $30. Suppose that the risk-based discount rate is 4%. It then follows that
the risk-based discounted losses at the end of the year equal $377.51 and the risk-based
premium equals $70 + $30 + $377.51, or $477.51.
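The premium figure can be verified in a couple of lines of S-Plus/R-style code, using only the payout, discount rate, expense provision, and profit margin stated in the example.
payout <- rep(100, 4)                              # expected loss payments at the end of years 1-4
r.rb   <- 0.04                                     # risk-based discount rate
rb.disc.losses <- sum(payout / (1 + r.rb)^(0:3))   # discounted to the end of year 1: 377.51
rb.premium     <- 70 + 30 + rb.disc.losses         # expenses + profit margin + discounted losses: 477.51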
By the words "surplus" and "return," I mean them as defined in my model. The portion
of surplus allocated to the contract for the first year will be called S1 and is determined
using (3.3).
It is possible to estimate the taxes corresponding to underwriting return at the end of the
year of a contract, and the taxes corresponding to the return on risk-based discounted loss
reserves and matching assets in the following years. The effect on taxes of premium
earned, expenses incurred, investment income from premium, and losses paid during the
year, as well as the effect of loss reserves discounted at the beginning and the end of the
year, can be used. In the case of risk-based discounted loss reserves and matching assets,
the expected taxes are less than taxes on the matching assets. This is because loss
reserves are discounted from a point in time one year later at the end of the year than at
the beginning of the year, producing a loss for tax purposes. If the discount rate used for
tax purposes equals the risk-free rate, but the tax law payout rate is faster than the actual
payout rate, the tax effect is the same as the effect of using the actual payout rate and a
certain discount rate which is lower than the risk-free rate.
Assume for simplicity that the insurer's assets earn 6% for the period of the coming year
and that the tax produced by each type of return is 35% of the return. Then, the above
$30 risk-based profit margin and the above allocated surplus S1 satisfy the equation
.15S1 = .65(.06S1 + 30)
since the 15% target return on allocated surplus is produced by the remainder, after 35%
tax, of 6% investment income on allocated assets plus the $30 risk-based profit margin.
Solving the equation gives S1 = $175.68.
The surplus allocated to the loss reserves and matching assets for the second year will be called S2. The amounts of surplus allocated in the next two years are defined similarly and will be called S3 and S4.
The risk-based discounted loss reserve corresponding to S2 equals $277.51 and satisfies
the equation
.15S2 = .65(.06S2 + 5.55)
since the 15% return is equal to the after-tax return from investment income from the
allocated assets plus the after-tax return on loss reserves and matching assets. The
matching assets earn 6% and the discounted reserves increase at a rate of 4%, i.e., the
risk-based discount rate, so the pre-tax return on the reserves and matching assets is
(.06 - .04)($277.51), or $5.55. Solving the equation gives S2 = $32.50.
Similarly, the risk-based discounted loss reserves corresponding to S3 and S4 are $188.61
and $96.15, respectively. Therefore,
.15S3 = .65(.06S3 + 3.77) and .15S4 = .65(.06S4 + 1.93)
So S3 = $22.09, and S4 = $11.26. It will now be shown that the return on the contract is
15% according to the IRR model, using the same allocation of surplus as above. It was
shown above that
.15S1 = .65(.06S1 + 30)
.15S2 = .65(.06S2 + 5.55)
.15S3 = .65(.06S3 + 3.77)
.15S4 = .65(.06S4 + 1.93)
If S1, S2, S3, and S4 are added, respectively, to both sides of the four equations above,
and each equation is divided on both sides by 1.15, we get
S1 = (1/1.15)(S1 + .65(.06S1 + 30)) (4.5)
S2 = (1/1.15)(S2 + .65(.06S2 + 5.55)) (4.6)
S3 = (1/1.15)(S3 + .65(.06S3 + 3.77)) (4.7)
S4 = (1/1.15)(S4 + .65(.06S4 + 1.93)) (4.8)
Therefore,
S1 = (1/1.15)((S1 - S2) + .65(.06S1 + 30) + S2) (4.9)
S2 = (1/1.15)((S2 - S3) + .65(.06S2 + 5.55) + S3) (4.10)
S3 = (1/1.15)((S3 - S4) + .65(.06S3 + 3.77) + S4) (4.11)
By substituting the expression which is equal to S4 in Equation (4.8) for S4 in Equation
(4.11), we get an expression for S3. By substituting the above expression for the term S3
at the extreme right of Equation (4.10), we get an expression for S2. By substituting this
expression for the term S2 at the extreme right of Equation (4.9), we get an expression
for S1.
Therefore, S1 is the discounted value, at a 15% return, of amounts at the end of years
1, 2, 3 and 4 which are each equal to the following: the sum of supporting surplus which is
no longer needed at the end of the year plus the after-tax return during the year resulting
from investment income from supporting surplus and from the contract. So, according to
the IRR model, the rate of return on the contract is 15%. This completes the
demonstration of the relationship between risk-based premium and the IRR model.
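The whole demonstration can be retraced numerically; the S-Plus/R-style sketch below solves the four allocation equations and then confirms that discounting the year-end amounts (released surplus plus after-tax return) at 15% reproduces S1. Small differences from the figures in the text are due to rounding of the margins.
tax <- 0.35; i.inv <- 0.06; target <- 0.15
margin <- c(30, 5.55, 3.77, 1.93)                        # profit margin, then 2% of each year-end reserve
S <- (1 - tax) * margin / (target - (1 - tax) * i.inv)   # solves .15*S = .65(.06*S + margin)
round(S, 2)                                              # approximately 175.68, 32.50, 22.08, 11.30
released <- S - c(S[2:4], 0)                             # supporting surplus no longer needed at each year end
flows    <- released + (1 - tax) * (i.inv * S + margin)  # year-end amounts in the IRR model
sum(flows / 1.15^(1:4))                                  # equals S[1], i.e., a 15% return on the contract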
REFERENCES
Fitting to Loss Distributions with
Emphasis on Rating Variables
Abstract
This paper focuses on issues and methodologies for fitting
Key words
Loss Distributions, Generalized Linear Models, Curve Fitting,
Right Censored and Left Truncated data, Rating Variables, Maximum
Likelihood Estimation.
1. Introduction
This section presents some preliminaries regarding losses,
section 5.
Losses are given on an individual basis, and have not been
losses, zero losses should be excluded.
treated. A reported loss with a value in excess of its deductible
underlying policy limit, then any amount greater than the loss
those losses have not been censored. Varying policy limits are
complete, that is, when there are neither left truncated nor
discussed in sections 3 and 4.
to losses is given by Hogg and Klugman (1984). This paper
distribution.
heterogeneous. Each risk has its own risk characteristics and its
own propensity to produce a potential loss. For instance, two
factors. For this reason, risks with the same values for their
data.
1. Consideration of a number of parametric probability
loss distribution.
distributions.
distribution(s) in step 3.
example. The first step requires considering a number of
application of estimation procedures suitable for complete data
of a loss distribution is based upon proper specification of the
data.
loss amount.
exclusive and exhaustive forms, written as Li1, Li2, Li3, and Li4,
as defined below. In addition, four indicator variables, δi1, δi2,
δi3 and δi4, are used in order to write a succinct expression for the likelihood. For example, for complete data,
Li1 = f(yi; θ, φ) (2.1a)
and for right censored data,
Li3 = 1 - F(PLi; θ, φ) (2.3a)
δi3 = 1 if Di = 0 and yi ≥ PLi, and 0 otherwise (2.3b)
The contribution of the i-th loss to the likelihood function is
given by
Li = Li1^δi1 Li2^δi2 Li3^δi3 Li4^δi4
L = ∏i Li (2.6)
l = Σi log(Li) (2.7a)
  = Σi li (2.7b)
li = log(Li) (2.8a)
   = δi1 log(Li1) + δi2 log(Li2) + δi3 log(Li3) + δi4 log(Li4) (2.8b)
where log denotes the natural logarithm.
The third step requires a criterion for ranking or comparing the fitted distributions, such as the AIC of
Akaike (1973). The AIC equals twice the negative of the maximized log-likelihood function plus twice the number of estimated parameters.
When two models are compared, the model with a smaller AIC value
is the more desirable one.
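As a sketch only (using the AIC definition above), the Table 1 ranking can be reproduced in S-Plus/R-style code from the reported negative log-likelihoods and parameter counts:
nll  <- c(lognormal = 897.8, Pareto = 895.2, Weibull = 899.8,
          gamma = 914.5, inverse.gamma = 893.7, exponential = 986.4)
npar <- c(2, 2, 2, 2, 2, 1)
aic  <- 2 * nll + 2 * npar       # AIC = 2 x (negative maximized log-likelihood) + 2 x (parameters)
sort(aic)                        # the smallest value, 1791.4 for the inverse gamma, is preferred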
deductible, policy limit, and the code for a type of construction
are stated. For the time being, let us ignore the information
Table 1

Distribution        Negative maximized        Number of      AIC
                    log-likelihood function   Parameters
lognormal                  897.8                  2         1799.6
Pareto                     895.2                  2         1794.4
Weibull                    899.8                  2         1803.6
gamma                      914.5                  2         1833.0
inverse gamma              893.7                  2         1791.4
exponential                986.4                  1         1974.8

With regard to Table 1, it should be noted that the values of
Table 2
For our data, most of the losses are of case 2, i.e., losses with
paucity of data, we concentrate only on case 2.
The conditional limited expected value for the lognormal distribution is given by
E[min(X,b) | X > a] = { e^(μ + σ²/2) [Φ((log(b) - μ - σ²)/σ) - Φ((log(a) - μ - σ²)/σ)] + b [1 - Φ((log(b) - μ)/σ)] } / [1 - Φ((log(a) - μ)/σ)]
where Φ denotes the standard normal distribution function.
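A convenient way to use this formula is to code it directly; the short S-Plus/R-style function below is a sketch, and the parameter values in the simulation check are illustrative rather than the fitted ones.
clev.lognormal <- function(a, b, mu, sigma) {
  # E[min(X, b) | X > a] for a lognormal with parameters mu and sigma
  tail.a <- 1 - pnorm((log(a) - mu) / sigma)
  part1  <- exp(mu + sigma^2 / 2) *
            (pnorm((log(b) - mu - sigma^2) / sigma) -
             pnorm((log(a) - mu - sigma^2) / sigma))
  part2  <- b * (1 - pnorm((log(b) - mu) / sigma))
  (part1 + part2) / tail.a
}
# rough simulation check with illustrative parameters
x <- rlnorm(200000, meanlog = 6, sdlog = 1.3)
mean(pmin(x, 50000)[x > 500])        # sample value
clev.lognormal(500, 50000, 6, 1.3)   # formula value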
Table 3 summarizes the comparison of t h e o r e t i c a l and sample
Table 3
Comparison of Conditional Probabilities and
Conditional Limited Expected Value for Fitted
Lognormal with its Sample Values
a = 500
The comparisons of fitted and sample quantities in Table 3
values for parameters, and f) the treatment of outliers. Last but
3. Fitting a Family of Distributions to Loss
Data: A Mean Approach
and Nelder (1989). An alternative solution is presented in
section 4.
distributions to our data.
and the link. The random component: the density of the random variable Y in the
exponential family is
f(y; θ, φ) = exp{ (yθ - b(θ))/a(φ) + c(y, φ) }
where a(.), b(.) and c(.) are some specific functions, θ is the canonical parameter, and φ is the dispersion parameter. The systematic component and the link relate the mean of the response to the explanatory variables through
g(E(Y)) = η = Σj βj xj
Each explanatory variable is considered either as a factor
statistical model. Let ηi denote the linear predictor for the i-th loss,
ηi = xi'β = β0 + Σj βj xij,  j = 1, ..., p,
where p is the number of rating variables in the model. Note that when rating variables are not
available, then p takes on the value of zero. This corresponds to
described in section 2.
defined as follows:
shall use the logarithm of the building value instead of building
predictors as follows:
construction.
The linear predictor (3.1C) is used when we wish to examine
distribution.
xi' = (1  log(BVi)  Ci1  Ci2) represents the contribution of the i-th
loss when the number of explanatory variables is
three.
entertain the following statistical tests of hypotheses:
inclusion of building value or construction in the linear
approximately valid.
where β0, β1, ..., βp are regression-like parameters and the xij's
losses.
variables.
Likelihood ratio test statistics are needed for performing nested

Table 4
Likelihood Statistics for Alternative
Statistical Models
"Mean" Models

Model    Linear Predictor    Negative of logarithm of Likelihood function
A        ηi = β0             897.7654

                                              DF        95th perc.
Test of        Likelihood Ratio*              for       of
Hypothesis     Test Statistics                Chi-sq.   Chi-sq.
Let us interpret the results given by Table 4; later on we shall
Finally, Model D has the largest likelihood value. Based upon
distribution to the data. Thus, the consideration of rating
is statistically significant.
affected the mean of the distribution but not the scale, the
predictors as follows:
These models will be referred to as "Scale" models. Parallel to
defined as
H0: β1 = β2 = β3 = 0 (4.2)
H0: β2 = β3 = 0 (4.3)
H0: β1 = 0 (4.4)
Table 5
Likelihood Statistics for Alternative
Statistical Models
"Scale" Models
Nested Hypotheses Based On Model D
Comparison of "Mean" & "Scale" Models

                                                                 DF       95th perc.
Test of               Likelihood Ratio         Mean      Scale   for      of
Hypothesis            Test Statistics*         Model     Model   Chi-sq.  Chi-sq.
H0: β1 = β2 = β3 = 0  -2(log LA - log LD)      10.1110   19.7090 3        7.8147

*Depending upon the context, the LA, LB, LC, and LD above correspond
to likelihood functions for "Mean" or "Scale" Models A, B, C, and D.
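As a sketch, the first row of Table 5 for the "Mean" models can be reproduced in S-Plus/R-style code from the maximized negative log-likelihoods of Model A (Table 4) and Model D (Appendix B, Exhibit 2):
nll.A <- 897.7654                # "Mean" Model A
nll.D <- 892.7099                # "Mean" Model D (Appendix B, Exhibit 2)
lrt   <- 2 * (nll.A - nll.D)     # likelihood ratio statistic, 10.1110
crit  <- qchisq(0.95, df = 3)    # 7.8147, the 95th percentile of a chi-square with 3 df
lrt > crit                       # TRUE: reject H0, so the rating variables are significant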
Once again we should be careful to interpret the results
5. Conclusion
interaction of parametric probability distributions, deductibles,
paper suggest that for any specific data set, there may be many
References
Appendix A: TABLE A
Appendix B: Exhibit 1
An S-Plus Program to Compute Maximum Likelihood
Estimate of Parameters & Maximized Likelihood
Statistic for Weibull Distribution
mydata <- TableA
m <- data.frame(mydata)
# Negative log-likelihood for the Weibull distribution with deductibles (D),
# policy limits (PL), and reported losses (y); z is the capped ground-up loss.
Weibull <- function(lamda, alfa, data = data.matrix)
{ D  <- data.matrix[, 1]
  PL <- data.matrix[, 2]
  y  <- data.matrix[, 3]
  z  <- D + ((y < PL) * y + (y >= PL) * PL)
  delta1 <- (D == 0) * (y < PL)    # complete data
  delta2 <- (D >  0) * (y < PL)    # left truncated data
  delta3 <- (D == 0) * (y >= PL)   # right censored data
  delta4 <- (D >  0) * (y >= PL)   # left truncated and right censored data
  L1 <- alfa * lamda * (z^(alfa - 1)) * exp(-lamda * (z^alfa))
  L2 <- (alfa * lamda * (z^(alfa - 1)) * exp(-lamda * (z^alfa))) / exp(-lamda * (D^alfa))
  L3 <- exp(-lamda * (z^alfa))
  L4 <- exp(-lamda * (z^alfa)) / exp(-lamda * (D^alfa))
  logL <- delta1 * log(L1) + delta2 * log(L2) + delta3 * log(L3) + delta4 * log(L4)
  -logL }
min.Weibull <- ms(~ Weibull(lamda, alfa), data = m, start = list(lamda = 1, alfa = 0.15))
min.Weibull
value: 899.802
parameters:
     lamda      alfa
 0.4484192  0.223073
formula: ~ Weibull(lamda, alfa)
100 observations
call: ms(formula = ~ Weibull(lamda, alfa), data = m, start = list(lamda = 1, alfa = 0.15))
Appendix B: Exhibit 2
An S-Plus Program to Compute Maximum Likelihood
Estimate of Parameters & Maximized Likelihood
Statistic for a Family of Lognormal Distributions
Based on "Mean" Model D
mydata <- TableA
m <- data.frame(mydata)
# Negative log-likelihood for "Mean" Model D: mu depends on the rating variables.
lognormal.model.D <- function(b0, b1, b2, b3, sigma, data = data.matrix)
{ D  <- data.matrix[, 1]
  PL <- data.matrix[, 2]
  y  <- data.matrix[, 3]
  z  <- D + (y * (y < PL) + PL * (y >= PL))
  cnst <- data.matrix[, 4]
  C1 <- cnst == 1
  C2 <- cnst == 2
  d  <- D + (D == 0) * 1
  mu <- b0 + b1 * log(PL) + b2 * C1 + b3 * C2
  delta1 <- (D == 0) * (y < PL)
  delta2 <- (D >  0) * (y < PL)
  delta3 <- (D == 0) * (y >= PL)
  delta4 <- (D >  0) * (y >= PL)
  L1 <- dlnorm(z, mu, sigma)
  L2 <- dlnorm(z, mu, sigma) / (1 - plnorm(d, mu, sigma))
  L3 <- 1 - plnorm(z, mu, sigma)
  L4 <- (1 - plnorm(z, mu, sigma)) / (1 - plnorm(d, mu, sigma))
  logL <- delta1 * log(L1) + delta2 * log(L2) + delta3 * log(L3) + delta4 * log(L4)
  -logL }
min.model.D <- ms(~ lognormal.model.D(b0, b1, b2, b3, sigma), data = m,
  start = list(b0 = 4.568, b1 = 0.238, b2 = 1.068, b3 = 0.0403, sigma = 1.322))
min.model.D
value: 892.7099
parameters:
       b0        b1       b2        b3    sigma
 1.715296 0.3317345 2.154994 0.4105021 1.898501
formula: ~ lognormal.model.D(b0, b1, b2, b3, sigma)
100 observations
call: ms(formula = ~ lognormal.model.D(b0, b1, b2, b3, sigma), data = m,
  start = list(b0 = 4.568, b1 = 0.238, b2 = 1.068, b3 = 0.0403, sigma = 1.322))
Appendix B: Exhibit 3
An S-Plus Program To Compute Maximum Likelihood
Estimate of Parameters & Maximized Likelihood
Statistic for a Family of Lognormal Distributions
Based on "Scale" Model B
mydata <- TableA
m <- data.frame(mydata)
# Negative log-likelihood for "Scale" Model B: sigma depends on the construction indicators.
lognormal.Scale.model.B <- function(b0, b1, b2, mu, data = data.matrix)
{ D  <- data.matrix[, 1]
  PL <- data.matrix[, 2]
  y  <- data.matrix[, 3]
  cnst <- data.matrix[, 4]
  z  <- D + (y * (y < PL) + PL * (y >= PL))
  C1 <- cnst == 1
  C2 <- cnst == 2
  d  <- D + (D == 0) * 1
  sigma <- b0 + b1 * C1 + b2 * C2
  delta1 <- (D == 0) * (y < PL)
  delta2 <- (D >  0) * (y < PL)
  delta3 <- (D == 0) * (y >= PL)
  delta4 <- (D >  0) * (y >= PL)
  L1 <- dlnorm(z, mu, sigma)
  L2 <- dlnorm(z, mu, sigma) / (1 - plnorm(d, mu, sigma))
  L3 <- 1 - plnorm(z, mu, sigma)
  L4 <- (1 - plnorm(z, mu, sigma)) / (1 - plnorm(d, mu, sigma))
  logL <- delta1 * log(L1) + delta2 * log(L2) + delta3 * log(L3) + delta4 * log(L4)
  -logL }
min.Scale.B <- ms(~ lognormal.Scale.model.B(b0, b1, b2, mu), data = m,
  start = list(b0 = 2, b1 = 0, b2 = 0, mu = 6))
min.Scale.B
value: 892.4242
parameters:
       b0       b1        b2      mu
 1.583642 1.324647 0.1066956 6.55098
formula: ~ lognormal.Scale.model.B(b0, b1, b2, mu)
100 observations
call: ms(formula = ~ lognormal.Scale.model.B(b0, b1, b2, mu), data = m,
  start = list(b0 = 2, b1 = 0, b2 = 0, mu = 6))
Approximations of the
Aggregate Loss Distribution
Approximations of the Aggregate Loss Distribution
Abstract
Aggregate Loss Distributions are used extensively in actuarial practice, both in ratemaking and reserving.
A number of approaches have been developed to calculate aggregate loss distributions, including the
Heckman-Meyers method, Panjer method, Fast Fourier transform, and stochastic simulations. All these
methods are based on the assumption that separate loss frequency and loss severity distributions are
available.
Sometimes, however, it is not practical to obtain frequency and severity distributions separately, and only
aggregate information is available for analysis. In this case the assumption about the shape of aggregate
loss distribution becomes very important, especially in the "tail" of the distribution.
This paper will address the question of what type of probability distribution is the most appropriate to use
to approximate an aggregate loss distribution.
Introduction
Aggregate loss distributions are used extensively in actuarial practice, both in ratemaking
and reserving. A number of approaches have been developed to calculate aggregate loss
distribution, including the Heckman-Meyers method, Panjer method, Fast Fourier
transform, and stochastic simulations. All these methods are based on the assumption that
separate loss frequency and loss severity distributions are available.
This paper will address the question of what type of probability distribution is the most
appropriate to use to approximate an aggregate loss distribution. We start with a brief
summary of some important results that have been published about the approximations to
the aggregate loss distribution.
Dropkin [3] and Bickerstaff [1] have shown that the Lognormal distribution closely
approximates certain types of homogeneous loss data. Hewitt, in [6], [7], showed that two
other positive distributions, the gamma and log-gamma, also provide a good fit.
Pentikainen [8] noticed that the Normal approximation gives acceptable accuracy only
when the volume of risk business is fairly large and the distribution of the amounts of the
individual claims is not too heterogeneous. To improve the results of Normal
approximation, the NP-method was suggested. Pentikainen also compared the NP-
method with the Gamma approximation. He concluded that both methods give good
accuracy when the skewness of the aggregate losses is less than 1, and neither Gamma
nor NP is safe when the skewness of the aggregate losses is greater than 1.
Seal [9] has compared the NP method with the Gamma approximation. He concluded that
the Gamma provides a generally better approximation than NP method. He also noted
that the superiority of the Gamma approximation is even more transparent in the "tail" of
the distribution.
Sundt [11] in 1977 published a paper on the asymptotic behavior of the compound claim
distribution. He showed that under some special conditions, if the distribution of the
number of claims is Negative Binomial, then the distribution of the aggregate claims
behaves asymptotically as a gamma-type distribution in its tail. A similar result is
described in [2] (Lundberg Theorem, 1940). The theorem states that under certain
conditions, a negative binomial frequency leads to an aggregate distribution, which is
approximately Gamma.
The skewness of the Gamma distribution is always twice its coefficient of variation.
Since the aggregate loss distribution is usually positively skewed, but does not always
have skewness double its coefficient of variation, adding a third parameter to the Gamma
was suggested by Seal [9]. However, this procedure may give positive probability to
negative losses. Gendron and Crepeau [4] found that, if severity is Inverse Gaussian and
frequency is Poisson, the Gamma approximation produces reasonably accurate results and
is superior to the Normal, NP and Esscher approximations when the skewness is large.
In 1983, Venter [12] suggested the Transformed Gamma and Transformed Beta
distributions to approximate the aggregate loss distributions. These gamma-type
distributions, allowing some deviation from the Gamma, are thus appealing candidates.
This paper continues the research into the accuracy of different approximations of the
aggregate loss distribution. However, there are two aspects that differentiate it from
previous investigations.
Second, all prior research was based upon theoretical considerations, and did not consider
directly the goodness of fit of various approximations. We are using a different approach,
building a large simulated sample of aggregate losses, and then directly testing the
goodness of fit of various approximations to this simulated sample.
The ideal method to test the fit of a theoretical distribution to a distribution of aggregate
losses would be to compare the theoretical distribution with an actual, statistically
representative, sample of observed values of the aggregate loss distribution.
Unfortunately, there is no such sample available: no one insurance company operates in
an unchanged economic environment long enough to observe a representative sample of
aggregate (annual) losses. Economic trend, demography, judicial environment, even
global warming, all impact the insurance marketplace and cause the changes in insurance
losses. Considering periods shorter than a year does not work either because of seasonal
variations.
Our analysis involved the following formal steps:
Conducting our study, we kept in mind that the aggregate loss distribution could
potentially behave very differently, depending on the book of business covered. Primary
insurers usually face massive frequency (large number of claims), with limited
fluctuation in severity (buying per occurrence excess reinsurance). To the contrary, an
excess reinsurer often deals with low frequency, but a very volatile severity of losses. To
reflect possible differences, we tested several scenarios that are summarized in the
following table.
Number of claims distribution for all scenarios was assumed to be Negative Binomial.
Also, we used Pareto for the severity distribution in both primary and working excess
layers. In these (relatively) narrow layers, the shape of the severity distribution selected
has a very limited influence on the shape of the aggregate distribution. In a high excess
layer, where the type of severity distribution can make a material difference, we tested
two severity distributions: Pareto and Lognormal. More details on parameter selection for
the frequency and severity distribution can be found in the exhibits that summarize our
findings for each scenario.
Distributions Used for the Approximation of Aggregate Losses
Initially we used both the Maximum Likelihood Method and the Method of Moments to
estimate parameters for the approximating distributions. The parameter estimates
obtained by the two methods were reasonably close to each other. Also, the distribution
based on the parameters obtained by the Method of Moments provided a better fit than
the one based on the parameters obtained by the Maximum Likelihood Method. For these
reasons we have decided to use the Method of Moments for parameter estimates.
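The following S-Plus/R-style sketch illustrates the construction described above: simulate a sample of aggregate annual losses from a Negative Binomial claim count and a Pareto-type severity, then fit a Gamma by the Method of Moments. The frequency and severity parameters below are purely illustrative and are not those of the seven scenarios.
set.seed(7)
n.years <- 50000                                   # size of the simulated sample
freq <- rnbinom(n.years, size = 5, prob = 1/11)    # Negative Binomial claim counts, mean 50
agg  <- sapply(freq, function(k)
          sum(25000 * (runif(k)^(-1/2.5) - 1)))    # Pareto-type severities, shape 2.5, scale 25000
mean.agg <- mean(agg)
var.agg  <- var(agg)
alpha <- mean.agg^2 / var.agg                      # Gamma shape by the Method of Moments
theta <- var.agg / mean.agg                        # Gamma scale by the Method of Moments
quantile(agg, 0.99)                                # sample 99th percentile
qgamma(0.99, shape = alpha, scale = theta)         # fitted Gamma 99th percentile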
Once the simulated sample of aggregate losses and the approximating distributions were
constructed, we tested the goodness of fit. While the usual "deviation" tests (Kolmogorov
- Smirnov and χ²-test) provide a general measurement of how close two distributions are,
they cannot help to determine if the distributions in question systematically differ from
each other for a broad range of values, especially in the "tail". To pick up such
differences, we used two tests that compare two distributions on their full range.
The Percentile Matching Test compares the values of distribution functions for two
distributions at various values of the argument up to the point when the distribution
functions effectively vanish. This test is the most transparent indication of where two
distributions are different and by how much.
The Excess Expected Loss Cost Test compares the conditional means of two distributions
in excess of different points. It tests values E[X - x | X > x] * Prob{X > x}. These values
represent the loss cost of the layer in excess o f x if X is the aggregate loss variable. The
excess loss cost is the most important variable for both the ceding company and
reinsurance carrier, when considering stop loss coverage, aggregate deductible coverage,
and other types of aggregate reinsurance transactions.
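A sketch of this test, using the simulated sample agg and the Method-of-Moments parameters alpha and theta from the previous sketch, together with the identity E[X - x | X > x] * Prob{X > x} = E[(X - x)+]:
excess.sample <- function(x, losses) mean(pmax(losses - x, 0))
excess.gamma  <- function(x, alpha, theta)          # E[(X - x)+] for a Gamma(alpha, theta)
  alpha * theta * (1 - pgamma(x, shape = alpha + 1, scale = theta)) -
  x * (1 - pgamma(x, shape = alpha, scale = theta))
x.points <- quantile(agg, c(0.50, 0.90, 0.99))      # attachment points taken from the sample
sapply(x.points, excess.sample, losses = agg)       # empirical excess loss costs
sapply(x.points, excess.gamma, alpha = alpha, theta = theta)   # fitted Gamma excess loss costs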
The four exhibits at the end of the paper document the results of our study for each of the
seven scenarios described above. The exhibits show the characteristics of the frequency
and severity distributions selected for each scenario, estimators for the parameters of the
three approximating distributions, and the results of the two goodness-of-fit tests.
The results of the study are quite uniform: for all seven scenarios the Gamma distribution
provides a much better fit than the Normal and Lognormal. In fact, both Normal and
Lognormal distributions show unacceptably poor fits, but in different directions.
The Normal distribution has zero skewness and, therefore, is too light in the tail. It could
probably provide a good approximation for a book of business with an extremely large
expected number of claims. We have not considered such a scenario however.
In contrast, the Lognormal distribution is overskewed to the right and puts too much
weight in the tail. The Lognormal approximation significantly misallocates the expected
losses between excess layers. For the Lognormal approximation, the estimated loss cost
for a high excess layer could be as much as 1500% of its true value.
On the other hand, the Gamma approximation performs quite well for all seven scenarios.
It still is a little conservative in the tail, but not as conservative as the Lognormal. This
level of conservatism varies with the skewness of the underlying severity distribution,
and reaches its highest level for scenario 2 (Large Book of Business with Low
Retention). When dealing with this type of aggregate distribution, one might try other
alternatives.
As the general conclusion of this study, we can state that the Gamma distribution gives
the best fit to aggregate losses out of the three considered alternatives for the cases
considered. It can be recommended to use the Gamma as a reasonable approximation
when there is no separate frequency and severity information available.
Bibliography.
1. Bickerstaff, D. R. Automobile Collision Deductibles and Repair Cost Groups - The
Lognormal Model, PCAS LIX (1972), p. 68.
6. Hewitt, C. C. Distribution by Size of Risk - A Model, PCAS LIII (1966), p. 106.
7. Hewitt, C. C. Loss Ratio Distributions - A Model, PCAS LIV (1967), p. 70.
10. Stanard, J. N. A Simulation Test of Prediction Errors of Loss Reserve Estimation
Techniques, PCAS LXXII (1985), p. 124.
12. Venter, G. Transformed Beta and Gamma Distributions and Aggregate Losses, PCAS
LXX (1983).
Exhibit 1
Scenario 1
Scenario 2.
Scenario 3.
Scenario 5.
Scenario 6.
EXTENDED WARRANTY RATEMAKING
TABLE OF CONTENTS
Introduction
Pitfalls
Conclusion
Abstract
The warranty business is a relatively new line of insurance in the property-casualty market. For
the most part insurance coverage for warranties, extended warranties and service contract
reimbursement policies has been introduced over the last thirty years. There is great opportunity
in this line of business for the pricing actuary. It is an area where one can use his imagination
and creativity in developing actuarially sound models to price and evaluate warranty business.
This paper starts with auto extended warranty ratemaking, where there is usually plenty of data
to use the traditional actuarial approaches to ratemaking. From there the paper discusses a
non-traditional rate-making approach when historical experience is not available. This "back-to-
basics" approach focuses on developing the pure premium by independently deriving frequency
and severity. The next topic is the inclusion of unallocated loss adjustment expense (ULAE) into
the pricing equation. In this line of business, because of the long-term commitments, ULAE
must be carefully analyzed and provided for. Lastly, the paper discusses a number of pricing
pitfalls to avoid. Some of these errors have been made by the author, and it is in the hope that by
exposing these pitfalls they can be avoided by others.
Introduction
The warranty business is a relatively new line of business to the property/casualty market. It is
generally within the last thirty years that insurance coverage has become an integral method to
transfer this risk. Warranty coverage is basically mechanical breakdown insurance; if a product
does not work due to some mechanical or component failure and it is covered under a warranty
contract, then the product is either repaired or replaced, depending on the type of coverage in
force.
Relatively speaking, there is very little actuarial literature on the topic of warranty business in
general. Several that come to mind are the 1994 Proceedings paper by Roger Hayne,
"Extended Service Contracts" and two papers in the 1993 CAS Forum Ratemaking Call Papers,
"A Pricing Model for New Vehicle Extended Warranties" by Joseph S. Cheng and Stephen J.
Bruce, and "The Use of Simulation Techniques in Addressing Auto Warranty Pricing and
Reserving issues" by Simon J. Noonan. Some of the topics addressed in those papers will be
touched on in this paper.
The pricing of a warranty product lends itself to the pricing actuary's expertise. It is generally a
line that has predictable frequencies and severities, given a credible amount of data. On the
auto warranty class, there is usually a great deal of data available to analyze using traditional
actuarial methods. Other product areas do not have large amounts of data and the actuary is
forced to develop a price by deriving a value for frequency and severity.
The warranty market today can be divided up into five basic segments, each with its own set of
distinguishing characteristics. These segments would be the automobile service contracts,
commercial warranties (example; policies covering business equipment), home warranties
(example; public service policies covering furnaces and air conditioners), retail warranties
(example; policies covering VCRs) and Original Equipment Manufacturers (OEM) warranties. In
this paper we will discuss auto extended warranty ratemaking and OEM warranty ratemaking,
as well as several general topics which touch all areas.
VEHICLE EXTENDED WARRANTY
The auto extended warranty concept dates back to the early 1970's. Prior to that the only
warranties on automobiles were the manufacturer's warranties on new vehicles, which were
generally limited to 12 months or 12,000 miles. Used cars were usually sold with no warranty.
In the early 1970's a few independent companies, generally not insurance companies but third
party administrators (TPA) began to offer limited warranties on used cars. Soon there were a
number of companies offering one, two and three year terms for these warranties.
Eventually these independents recognized another market could be extending the warranty
beyond what was offered by the manufacturer. Covering new vehicles appeared to be a great
cash flow bonanza, as the money for the coverage was paid up front, while claims would be
delayed by the year's coverage under the manufacturer's warranty. Interest rates were very high
in the early and middle of the 1980's, and investors were lured by the promise of high returns.
Manufacturers began to offer their own extended warranties, forcing independent TPAs out or to
reduce pricing. Some of these TPAs were backed by insurance companies; many were not.
The late 1980's saw a turmoil in this business as pricing on new vehicle service contracts (VSC)
was woefully inadequate. During this time the manufacturers also began to lengthen the term of
the underlying warranty to three years or thirty-six thousand miles. This posed an immediate
pricing problem. Purchasers of an extended warranty would expect the pricing to go down as
the manufacturer now covered more claims. However, actuarial studies indicated that double-digit
rate increases were necessary. Interest rates also were coming down, lowering the investment
income.
TPAs that raised rates lost much of their volume almost overnight, as dealers had a choice of
the manufacturers' or other independents' products. However, a number of independents did
survive this period. Most of these are either owned by or closely affiliated with an insurance
company for security reasons, as long-term promises of vehicle service are being made. The
manufacturers control about 70% of the new vehicle extended warranty market with the
independents sharing the rest. The independents have a greater share of the used vehicle
market.
Insurance companies play an important role in the selling of the extended warranty product. The
extended warranty is an after-market product, that is, the dealer and consumer will generally
conclude the purchase of the vehicle before introducing the availability of the extended
warranty. If the dealer is successful in selling the consumer an extended warranty or service
contract, the dealer has then committed to a long-term relationship to service that vehicle.
In most states, the extended warranty service contract is not considered insurance and is not
regulated by the insurance department. It is simply a contract between the dealer and the car
buyer and is covered under contract law. What is considered insurance by most states and is
regulated by the various insurance departments is the Service Contract Reimbursement Policy
(SCRIP). If the dealer chooses to sell an independent TPA's VSC, the dealer needs to assure
himself that the TPA will be there to fulfill the promises made to the consumer. The consumer
also must satisfy himself that should he move from the area or the dealer goes out of business,
covered repairs will still be made. The TPA must therefore show that he is secure; most TPA's,
through an insurance company, therefore provide a SCRIP to the dealer. This SCRIP provides a
guarantee to the dealer and the consumer that if a covered repair is necessary it will be done,
either at the selling dealer or at an authorized repair shop.
The vehicle service contract
The vehicle service contract (VSC) has a number of options in terms of limits and coverage. The
predominate products will be discussed here. The discussion will be broken into three
segments; used vehicles, new vehicles and near-new vehicles. Used vehicles are those which
are being resold to the consumer by a dealer and which no longer are covered by the
manufacturer's warranty. New vehicles are those which have had no previous owners and have
the full protection of the manufacturer's warranty. Near-new vehicles are those that have had a
previous owner and are being resold by the dealer with some protection still under the
manufacturer's warranty.
a. One-year term - The VSC coverage is limited to one year from purchase of vehicle. Mileage
on the vehicle at time of purchase is also used as an eligibility factor, i.e., a vehicle with
mileage beyond a certain limit will not be eligible for an extended warranty.
b. Two-year term - This VSC coverage is limited to two years from the purchase of the vehicle.
Again a mileage limit as described above is in place, but it is usually lower than the one-year
eligibility as the coverage lasts longer.
c. Three-year term - This VSC coverage is limited to three years from the time of purchase
with an eligibility mileage limit in place. Again, this eligibility limit would normally be lower
than that for the two-year term.
The limits on a new VSC are almost always a combination of years and mileage. The most
popular combinations are usually in multiples of whole years (5,6 or 7) and multiples of 10,000
miles, from 60,000 to 100,000. An example of how this is shown would be 5/100,000, which
represents 5 years or 100,000 miles, whichever comes first. At one time an option for unlimited
mileage was offered, but industrywide experience was so poor that this option is now very
seldom seen. Coverage starts upon the purchase of the vehicle.
These limits would normally be expressed as those shown for new VSCs. In fact, until recently
this group was not separated from the "new" grouping. A new VSC would be sold to a consumer
as long as there was still coverage under the manufacturer's warranty, the theory being that
there was very little exposure to loss anytime during the period under which the vehicle was
covered by the manufacturer. Upon analysis, however, it was found that loss costs were higher
for new VSCs sold more than 18 months after coverage started under the manufacturer's warranty than
for new VSCs sold on vehicles within that 18-month period.
We initially began to study the loss costs of this group because we noted that a program which
we underwrote for motorcycles had much higher loss experience for older bikes which were
grandfathered into the program. These older motorcycles were only eligible for the new program
if they had been purchased no more than one year prior to the inception of the program. The
resulting loss costs on these bikes were significantly higher than the rest of the program; we
guessed that there was some type of adverse selection taking place.
If adverse selection was taking place in our motorcycle program where we provided an option to
purchase a VSC more than a year after the bike was bought, then it would be reasonable to
assume that the same adverse selection was taking place when a car owner purchased a VSC
more than a year after he bought the car. As noted above our subsequent analysis of the near-
new group showed significantly higher loss costs in comparison to the new group, and we
therefore created the near-new group with higher rates.
Before the two-year lease option became popular, this group of vehicles was very small.
However, this group has grown substantially over the last five years as the two year lease
became predominant. Remember, the most prominent manufacturer's warranty is now
3/36,000, so a vehicle coming off a two-year lease still has up to a year of underlying coverage,
depending on mileage.
Coverage under the VSC is for mechanical breakdown due to failure of a covered component
only, and perhaps some incidental coverage such as rental reimbursement and towing when a
covered mechanical breakdown has occurred. No physical damage due to other perils is
covered. For instance, an engine breakdown caused when a vehicle is caught in a flood is not
covered.
There are usually several options available in terms of coverage. There are a myriad of
components that make up the automobile, with some obviously being more essential to the
actual running of the auto than others. Basic coverage would normally cover the powertrain of
the vehicle, such as the engine and transmission. Other options could be offered, up to
"bumper-to-bumper" which pretty much covers everything in and on the car.
Before discussing the actual ratemaking for VSCs, it is important to understand the makeup of
the total price paid by the ultimate consumer, the purchaser of the vehicle. The total price is
comprised of:
P = I + A + T + M, where
I = Insurer cost,
A = Agent's commissions,
T = TPA administrative costs and fees, and
M = Dealer markup.
To clarify, let us build the ultimate price to the consumer from the bottom up. First, the insurer
determines the expected loss costs and adds any internal company expenses. This is passed to
the TPA as the insurer cost. The TPA has administrative costs (underwriting, claims, systems,
etc) which then get added on to the insurer cost. For the most part the TPA has an independent
agency force in place to sell the SCRIP to the dealer, thus agent's commissions must also be
included. (Note that as we pointed out earlier, the dealer sells the consumer a VSC, which is not
typically considered insurance, and thus the dealer is not an insurance agent.) All of the above
costs make up what is called the dealer's cost, to which the dealer then adds whatever markup
he can to arrive at the total price. Since this markup is not regulated in any state but Florida,
total price for the same VSC can vary from consumer to consumer, depending on the
negotiating skills of the buyer and seller.
Dealer markup is not regulated in any state but Florida, and therefore is not included as a cost
in filed rates anywhere but Florida. The remaining costs, however, may or may not be included
in filed rates. Some companies file rates which only include insurer costs (I); the TPA will then
collect a fee per VSC (T + A) from the dealer, which he will then have to use to pay the TPA's
expenses as well as any commissions to his distribution force, if any. The filed rates may
include I +T + A, in which case the insurer will pay out a commission to the TPA equal to T+A.
In Florida, the filed rates include all costs. While these different scenarios do not present a
problem for ratemaking, it does cause difficulty if one is trying to do a competitive rating study
among various companies, as unless the costs included in the ratemaking are known,
comparisons are almost worthless.
Insurer costs (I) are the next item of evaluation. Insurer costs are made up of expected loss
costs and the insurer's expenses. The expected loss costs are a function of many variables,
including but not limited to:
b. Coverage option
f. Deductible option
h. Special factors ( four-wheel drive, commercial use, advanced technology for example).
The company must decide what loss cost variables they would like to include in the ratemaking;
the above would be a pretty standard method to analyze data. As the variables above are all
important elements that differentiate rates, it is important that the data be captured in the same
detail. It is also important that the data be analyzed on a policy year basis. Because of the multi-
year terms of the policies, it is important to match the losses to the policies that generated those
losses. It also avoids any distortions caused by improper earning of the premium.
The earning of the premium for a warranty product is not straightforward. In general, premium is
earned over the policy period to reflect the exposure to loss during that policy period. For an
annual policy the premium is usually earned pro rata as losses are assumed to be uniformly
distributed over the policy period. This is not true in the extended warranty coverage.
For used VSCs losses generally come in faster than a pro rata distribution. A useful rule of
thumb is that half of the losses have emerged when the term is one-third expired, and two-
thirds of the losses are emerged when the term is half done. For example, on a two-year term
used VSC, two-thirds of the losses have emerged one year into the term. One primary reason
for this accelerated loss pattern is that mechanical problems on used vehicles can occur pretty
quickly after the sale. Sometimes a used car dealer will use the extended warranty as a
maintenance program. (This will be discussed later in the dealer management section.) For
used VSCs, the premiums should be earned accordingly.
On new VSCs the earning is somewhat trickier. First, very few losses are expected under the
extended warranty while the underlying warranty has not expired. The only losses during this
period would be towing or rental expenses over and beyond what the underlying covers. Once
the underlying warranty has expired, losses emerge on the extended warranty cover. As the
frequency and severity of repairs are expected to increase during the remainder of the service
contract we would envision an ever-increasing loss payout pattern. This type of pattern is well
described by the reverse sum of the digits function (see Exhibit E for definition and formula),
and this pattern is often used.
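Exhibit E is not reproduced here, so the following is only one common formulation of the reverse sum-of-the-digits rule: over n earning periods, the portion earned in period k is proportional to k, giving a linearly increasing pattern.
reverse.sod <- function(n.periods) (1:n.periods) / sum(1:n.periods)
round(reverse.sod(5), 3)     # a 5-year term: 0.067 0.133 0.200 0.267 0.333
cumsum(reverse.sod(5))       # cumulative portion earned by the end of each year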
However, in actuality, while loss emergence does accelerate for a period of time after the
expiration of the underlying warranty, this emergence slows down considerably towards the end
of the term. This variable is sometimes called the attrition factor. Several things may happen
during the life of the VSC; the mileage limit could be hit before the term limit, the car may be
sold and the warranty not transferred, the owner voids the warranty by poor maintenance, or
even the owner just doesn't keep track of the warranty contract. In any event, this attrition factor
does exist, and it causes the loss payout pattern to take an "S" shape, slow starting out, grows
quickly in the middle and slows down at the end. Premiums should be earned in the same
fashion.
The loss payout patterns are direct byproducts of the actuarial analysis of the policy year loss
triangles. The actuary decides at what level the earnings should be done, and has the data
collected in these levels. For instance, earnings may be done by term and mileage, so
premiums and losses would be segregated into term and mileage subsets by policy year.
Losses are developed to ultimate using a variety of methods. Because the loss emergence is
low in the beginning of the contract period, more recent policy years benefit from the use of the
Bornhuetter/Ferguson (B/F)* and the Stanard/Buhlmann (S/B)** methods in addition to simply
multiplying the selected loss development factor by the emerged losses. It is also valuable to
use average claim costs to develop ultimate losses (See Exhibit A). Note that for more recent
years the paid loss projection is erratic as there are few emerged losses.
We also calculate a pure premium projection of ultimate losses (columns 13-15 in Exhibit A.)
We use the B/F annual projection to get an ultimate pure premium per contract (column 13.)
The B/F projection is used as its values are between the paid and the S/B projections, and thus
we hope to be neither too optimistic nor too conservative. In column 14 we convert the annual
pure premium into a running cumulative pure premium. In this way we incorporate mature years'
pure premiums which have minimal actuarial adjustments along with the more recent years'
pure premiums which are very dependent on actuarial assumptions on development. We then
multiply the number of contracts written (column 2) by the cumulative pure premium to obtain
the pure premium projection in column 15.
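As a sketch of columns 13 through 15 with made-up figures (Exhibit A is not reproduced here, and the contract-weighted running average is one reading of the cumulative pure premium in column 14):
contracts <- c(12000, 15000, 18000, 21000)                  # column 2: contracts written (illustrative)
bf.pp     <- c(310, 325, 345, 360)                          # column 13: B/F pure premium per contract (illustrative)
cum.pp    <- cumsum(bf.pp * contracts) / cumsum(contracts)  # column 14: running cumulative pure premium
proj.ult  <- contracts * cum.pp                             # column 15: pure premium projection of ultimate losses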
* For definition and explanation of the B/F method, please see Foundations of Casualty
Actuarial Science, pages 210-214.
** For definition and explanation of the S/B method, please see Foundations of Casualty
Actuarial Science, pages 352-354.
Of course, the other actuarial adjustments must also be made. Premiums must also be
developed to ultimate as well as put on current rate level, and losses must be trended from the
midpoint of the experience period to the midpoint of the proposed policy period. Individual
policy years are then averaged and compared against the expected loss ratio to compute the
required rate level indication.
LOSS TREND
Loss trend is a function of change in frequency vs a change in severity. For auto warranty
business, normally the frequency is high and the severity is low. Frequency is affected by
changes to the underlying manufacturers' warranties, the quality of the vehicles, the changing
mix of business, and the dealers' service departments' propensity to use the warranty coverage.
Severity is affected by the change in technology, change in mix, change in labor rates,
availability of parts and again the service departments' willingness to use the warranty product.
Both internal and external sources of data should be used to finally select a trend factor. Exhibit
B shows an internal measure by component for frequency and severity, as well as an external
measure of change in severity, using the government's PPI index as a source. For the external
measure, we have examined the PPI for auto parts, both new and rebuilt, and for labor charges.
We have weighted these indices together to get a combined external index. As labor charges
usually make up about half of the total repair bill, we have given it a weight of 50%. We have
given auto parts new and rebuilt each a weight of 25%, which assumes that half the time new
parts are used in the repair job and half the time rebuilt parts are used.
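In S-Plus/R-style code, the combined external index is simply a weighted average; the annual PPI changes below are hypothetical.
ppi.change <- c(labor = 0.032, parts.new = 0.015, parts.rebuilt = 0.008)   # hypothetical annual changes
wts        <- c(labor = 0.50,  parts.new = 0.25,  parts.rebuilt = 0.25)    # weights discussed above
sum(wts * ppi.change)        # combined external severity trend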
The selection of annual loss trend factors in auto warranty business is not straightforward. We
include external indices in our determination as it is often difficult to explain why internal factors
change. For instance, in Exhibit B we show a change in frequency for the new VSC group. This
is counterintuitive as it is generally accepted that the quality of new vehicles has improved;
shouldn't we then see a decrease in frequency? Perhaps our mix of vehicle make and model
has changed. Let's say that we determine that our mix did change. Would we expect the same
mix change in the next policy year for which we are projecting rates?
Another problem arises because of the multiyear policy terms. On the new and near-new groups
we must wait several years before we become comfortable with projecting a true frequency and
severity. We then must use a four or five year old trend factor to project lost costs for the
upcoming policy year. We have current calendar year data, but that is a mix of claims from up to
seven policy years. If the volume and mix of business is stable over the ratemaking experience,
then calendar year trends can be useful; otherwise they can lead to distortions.
It is therefore necessary to include external factors to smooth the results of our internal trend
analysis. It is appropriate to give a higher weight to the external factors as they are determined
from an industrywide database. This is important because a SCRIP program will most likely get
a spread of business from all makes and models. These industrywide or government indices are
also important as they tend to smooth the results from internal analysis. As we are often
projecting many older policy years in calculating the rate level indication, we must be conscious
of the compounding effect of many years of trend to this calculation.
As is seen in Exhibit A, nine policy years have been used in the ratemaking study. We also
know from the discussion above that there have been changes over that period of time, most
notably the change in the underlying manufacturer's warranty from 1 year / 12,000 miles to 3
years / 36,000 miles. This shift would have a significant impact on the older years. Can these
older years be used?
If the TPA or insurer keeps very detailed claim data, an actuary can "as if" the older years.
Claims from those older years can actually be recast as if the new terms and conditions were in
place. This is helpful not only in getting more accurate projection data but also in calculating
loss development factors. Thus older years not only can be used but they are very valuable as
they represent truly mature loss data.
IMPORTANCE OF RATEMAKING
The importance of an accurate extended warranty rate level indication cannot be stressed enough.
Remember, rates are being set on contracts that could be up to seven years in duration. These
contracts are single-premium and are non-cancelable by the insurer. Oftentimes it is several
years before the adequacy of the current rates can be ascertained, which means you may have
written several years of inadequately priced business. If you lower the rates you will most likely
lose business and thus revenue just when the claim activity is increasing. It is therefore very
important to perform rate level analyses every twelve to eighteen months and make adjustments
as necessary.
DEALER MANAGEMENT
The actuary, from the pricing analysis, especially the analysis of frequency, can often find some
trouble spots. Notice above that both frequency and severity can be affected by the dealers, or
more precisely, the dealers' service departments. It is important, therefore, to keep track of the
frequency and severity for each dealer. It is a relatively simple matter to set up a test of
significance for an individual dealer's frequency and severity. If either measure is significant,
i.e., it is outside the normal range of frequency or severity, then appropriate dealer rehabilitation
measures must be taken. By rehabilitation it is meant that the dealer must be put on a program
in which frequency and severity are closely monitored, with special reporting done monthly. If
within a prescribed time period the dealer's experience has not improved, then the SCRIP will
most likely be cancelled. Of course, the TPA (and the insurer) are still responsible for the run-off
of the inforce VSCs, which may last up to seven years.
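One simple way to operationalize the significance test mentioned above is a normal approximation to the binomial: compare a dealer's observed claim frequency with the program-wide frequency. The sketch below is illustrative only; the dealer counts, the program frequency, and the critical value are assumptions, not figures from this paper.

    import math

    def dealer_frequency_outlier(claims, contracts, program_freq, z_crit=1.96):
        # Normal approximation to the binomial: if the dealer's true frequency equals the
        # program-wide frequency, the observed frequency should fall within z_crit
        # standard errors of program_freq.
        observed = claims / contracts
        std_err = math.sqrt(program_freq * (1.0 - program_freq) / contracts)
        return abs((observed - program_freq) / std_err) > z_crit

    # Hypothetical dealer: 540 claims on 1,000 in-force contracts vs. a 45% program frequency.
    print(dealer_frequency_outlier(claims=540, contracts=1000, program_freq=0.45))   # True

The same test applied to severity (for example, comparing the dealer's average repair cost with the program average, using the observed standard deviation of severity) would flag the overzealous service departments described above.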
As in any line of insurance, fraud must be guarded against. In the warranty business, you must
be vigilant against increases in frequency because severity cannot be changed too drastically. A
good dealer management program is a must in this business and the pricing actuary can
certainly play an important role.
The vehicle service contract ("VSC") industry is young relative to most standard casualty lines of
business. As such, it is still evolving. The programs offered by the various third-party
administrators of VSCs are constantly changing. These changes in coverage terms and
conditions, coverage term options, deductibles and eligibility guidelines are driven by two sets of
factors: marketing requirements and changes in the environment of the marketplace. It is
important to understand the dynamics of these evolutionary changes and to incorporate such
understanding into the ratemaking process.
MARKETING REQUIREMENTS
Innovation is an important marketing tool in the VSC industry. A VSC administrator's need to
take an offensive position, to capture or retain market share, generally results in program
changes that increase risk. Most VSC administrators rely on a network of independent general
agents to distribute their programs to their first-level customers, automobile dealers.
Participating auto dealers employ after-sale specialists, finance and insurance ("F&I")
managers, to sell VSCs to the second-level customers, automobile purchasers. All auto dealers
sell VSCs.
A reasonably effective F&I manager will place a VSC on 30-40% of the retail sales transactions
at the dealership. The average profit generated by a VSC sale can add 50-100% to the profit
generated by the sale of the vehicle itself. Competition for the auto dealer's business is fierce.
Any innovation gives the agent new ammunition to improve his sales pitch. The latest change
might have enough impact to tip the account his way. Changes to VSC programs which expand
vehicle or mileage eligibility can increase penetration rates at existing accounts. Expanded
coverages or benefits give the F&I manager more reasons to justify higher retail pricing,
increasing gross profit margins.
ENVIRONMENTAL CHANGES
All of the foregoing is meant to illustrate one point. In order to ensure accuracy in ratemaking,
especially when measuring trend, know the history of the block of business you are observing.
In your due diligence study, prior to starting any rate adequacy study, pay special attention to
the following:
Data Integrity - Have all data items, especially manually-coded indicators, been entered and
maintained in a consistent manner throughout the history of the database? Are changes in
coverage reflected in changes in plan/coverage codes? Run comparison tests on contracts and
claims involving similar vehicles/repairs over multiple policy and accident years.
Benefits - Have ancillary benefit packages (substitute transportation, towing, trip interruption)
changed in composition or in the extent/nature of the benefits provided? Include benefit
packages in your comparison of coverage, conditions, exclusions.
Claims Adjustment Policies - What changes have been made in the interpretations of
coverages, conditions and exclusions over the years? When were such changes introduced?
Obtain copies of all procedure manuals, both external and internal, as well as any pertinent
policy memoranda.
Rate Structures - Have there been changes instituted in the method of rating vehicles? Have
surcharges been added/dropped? Have vehicle classifications changed? Obtain copies of all
rate charts and state premium filing exhibits.
Vehicle Mix - Has the mix of makes, models, equipment changed enough to affect trends in
composite loss development patterns or ultimate losses? Has the geographic mix of business
changed over the years under study? Obtain historical state/agent loss ratio reports.
IN CONCLUSION
Assessing the impact of change, and the rating provisions employed to offset change, is an
essential ingredient in the vehicle service contract ratemaking process. By initially focusing
your attention on this aspect of the ratemaking process, you will learn how to apply your
analytical skills and techniques to the best advantage.
RATEMAKING WITHOUT HISTORICAL DATA
The most accurate ratemaking is done when there is credible historic program data with which
to work. Many times, however, historic data is not available. It may be that the program is new.
Oftentimes the program is immature; remember that extended warranty contracts are usually
multiyear terms, thus it is usually a number of years before the first policy year is completely
expired. It is in these situations that one must use a "back to basics" approach. To price a
program properly, one must start with an accurate pure premium, which is the product of
frequency times severity.
An interesting example of using a pure premium approach is the pricing of a new program such
as the second generation of wind turbines. In the early 1980s, the U.S. government, in an effort
to decrease our dependency on foreign oil, granted tax credits for the advancement of
alternative energy sources. As part of this initiative, a number of wind turbines were hastily
developed and deployed. Each of these machines had manufacturer's warranties, most of which
were subsequently insured. Coverage included both mechanical problems and business
interruption. Through the ensuing years, the wind turbines proved mechanically deficient and
large losses were paid out by insurance companies.
In the mid-nineties, a second generation of wind turbines was being developed and coverage
sought for manufacturer's warranties. As there had been problems in the past, the financial
backers of these new wind turbines were asking for four specific warranties from the
manufacturer: workmanship, efficacy, availability, and design defect. Each of these coverages is
described in more detail below.
Workmanship - This covers both mechanical breakdown of the machine and the installation of
the machine, and would usually be limited to one year from start-up.
Efficacy - This would cover the buyer of the wind turbine for lost revenues as a result of the
machine not reaching the promised power generation levels.
Availability - Coverage is given for lost revenues due to down-time in excess of a prescribed
number of hours. Total hours functioning would be determined by average sustained wind
speed at the field site.
Design Defect - This would cover the retrofitting and lost income due to failure of the wind
turbines to perform due to faulty component design. Failure rate thresholds for various
components would be established.
Each of the above coverages poses a challenge to the actuary with respect to developing
frequency and severity. A thorough examination of the engineering of the new machine must be
done. As the actuary is not usually suited for this role, an independent engineering analysis
must be sought.
The U.S. and other governments often can provide data on failure rates of similar components
(gears, generators, bearings, etc) used in the wind turbine. Deductibles must be established so
this does not become a maintenance program and aggregates must also be in place so that a
worst-case loss can be determined. Also, as variation exists about all expected values for
variables such as failure rates, a risk premium must be considered.
Exhibit C shows a possible approach to determining the pure premium for the above coverages
for year 1 of a multiyear manufacturer's warranty. Three separate calculations are made:
revenue loss exposure per wind turbine, design defect loss exposure, and materials and
workmanship loss exposure.
Revenue loss exposure/wind turbine - This calculation includes the business interruption
coverage from both the efficacy and availability sections above. Potential downtimes are given
for repairs or retrofits of various components along with the probability that failure of that specific
component will occur. For example, given that downtime projected for normal maintenance is
274 hours annually and that 125% of those hours will be used, we can expect 342.5 hours to be
used annually in normal maintenance. In total, we expect 1,396.1 hours of downtime; in this
program we are allowed 10%, or 876, hours of downtime annually (876 hrs = 10% of 24 hr/day x
365 days). This is shown at the bottom of Exhibit C, and is the deductible feature of the
program. As noted above, about 40% of the deductible would be used for normal maintenance;
the other 60% would be to reduce dollar-swapping as well as have the insured share in some
risk. With 520.1 hours of downtime expected in excess of the deductible, a machine expected to
produce 82 kwh per hour, and a price of $.08/kwh, a resultant loss of $3,412 is expected. A
worst-case scenario is also provided, with the probability of occurrence increased by two
standard deviations of the expected probability of failure.
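The revenue loss arithmetic can be sketched as follows. The figures are those quoted above; the calculation simply applies the downtime in excess of the deductible to the assumed output and energy price.

    # Year-1 revenue loss exposure per wind turbine, using the figures quoted in the text.
    expected_maintenance_hrs = 274 * 1.25        # 342.5 hours expected for normal maintenance
    expected_downtime_hrs = 1396.1               # total expected downtime from all causes
    deductible_hrs = 0.10 * 24 * 365             # 876 hours of downtime allowed annually

    output_kwh_per_hr = 82                       # expected production while running
    price_per_kwh = 0.08                         # assumed energy price

    excess_hrs = max(expected_downtime_hrs - deductible_hrs, 0.0)
    expected_revenue_loss = excess_hrs * output_kwh_per_hr * price_per_kwh
    print(f"{excess_hrs:.1f} excess hours -> expected revenue loss ${expected_revenue_loss:,.0f}")  # about $3,412
    print(f"Normal maintenance uses {expected_maintenance_hrs / deductible_hrs:.0%} of the deductible")  # about 40%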
Design defect loss exposure - This calculation includes the retrofit cost (severity) and the
probability of failure (frequency) by component. Expected costs for each component are
calculated; the expected cost per wind turbine for this coverage would be $1,040. The worst-
case scenario includes revised retrofit costs as well as increased frequencies as described
above.
Materials and workmanship loss exposure - As above, a retrofit cost and probability of
failure is assigned for each component resulting in an expected cost of failure for each
component. The total expected cost for this coverage would be $544. Worst-case scenario is
calculated as described above.
As mentioned above, consideration must be given to adding a risk premium to the above. A
number of assumptions have been made which, if wrong, can materially affect the calculated
pure premium. For instance, the wind turbine is expected to produce 82 kwh per hour. This has
not been proven. Also, a rate of $.08 per kwh produced may vary widely in today's fluctuating
energy market. Probabilities of failure for similar components tested in government studies
might not be representative of the actual components used in the design and manufacture of the
wind turbines. In place of a risk premium, a retrospective rating policy might be considered. In
any event, while a determination of a pure premium can be made, its accuracy is only as good
as the assumptions made. There can be a wide range into which the correct premium may fall.
HANDLING OF UNALLOCATED LOSS ADJUSTMENT EXPENSE
Unallocated loss adjustment expense (ULAE) can be defined as that part of loss adjustment
expense which covers the creation and maintenance of a claims department, among other
things. It has been overlooked in the past and is one of the reasons why entities have turned to
insuring the warranty exposure. Consider the warranty product. The pure premium is typically
made up of high frequency low severity occurrences, i.e., there are many small losses.
Expected losses in this scenario are generally predictable, and in the early days of shorter-term
(mostly annual) warranties the manufacturer kept this risk. As both manufacturer's warranties
and extended warranties increased in length of policy term, problems were created.
Manufacturers or retail outlets which sold warranties went out of business on occasion, leaving
the consumer with a worthless warranty, one on which he most likely paid the premium up front.
A warranty is a promise to pay for a covered repair or replacement to a product; if the provider is
not around at the end of a five or ten year policy term, that promise goes unfulfilled. This is one
reason that the transfer of this risk by insurance is now so common. However, insurance
companies may decide that they no longer want to be in the warranty business or may go out of
business themselves, and non-recognition of ULAE costs can lead to financial difficulties in
these instances.
Take for example the auto extended warranty provider. Typically a new-car buyer may purchase
an extended warranty for up to seven years or one hundred thousand miles, whichever comes
first. The warranty insurer gets the full premium at the time of purchase of the car and is now
obligated for the full term of the contract. This means that if for whatever reason the insurer
leaves the warranty business, some provision for the fulfilling of the warranty promise must be
made. The creation and maintenance of a claims department to fulfill this promise falls under
the heading of ULAE and is an important consideration for the actuary in pricing the warranty
risk.
Exhibit D illustrates the calculation of ULAE by showing the cost of maintaining a claims
operation for the duration of the inforce policies. The calculation starts with the number of claims
expected annually, and then the determination of how many underwriters, claim adjusters,
auditors and clerks would be needed to service those claims. Also factored in would be the cost
of equipment, and facilities for these people. As can be seen, the total cost can then be reduced
to a rate per contract and included in total price.
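A stripped-down version of that calculation might look like the sketch below. The claim run-off pattern roughly follows the PY 1998 row of Exhibit D, but the workload per adjuster, the loaded cost per adjuster, the inflation rate, and the contract count are assumptions made only for illustration; the point is that the projected run-off cost reduces to a ULAE loading per contract.

    # Claim run-off pattern loosely based on Exhibit D; all other inputs are assumed.
    annual_claims = [33000, 19500, 19800, 26400, 13200, 13200, 3960, 2640]
    claims_per_adjuster = 10000        # assumed annual workload per claims adjuster
    cost_per_adjuster = 60000          # assumed salary, equipment and facilities per person
    inflation = 0.05                   # assumed annual cost inflation
    contracts_written = 60000          # assumed contracts written in the policy year

    total_cost = 0.0
    for year, claims in enumerate(annual_claims):
        staff = max(1, round(claims / claims_per_adjuster))   # keep at least a skeleton staff
        total_cost += staff * cost_per_adjuster * (1 + inflation) ** year

    print(f"ULAE provision per contract: ${total_cost / contracts_written:.2f}")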
The most important calculation in Exhibit D may be the distribution of claims. In this example
warranty contracts are sold with terms varying from 1 year to 7 years. For policy year 1998, the
contract sold on December 31 of that year will not expire until December 31, 2005. If no more
contracts were ever written, there would be a need for a claims staff for seven more years. It is
important that claims data can be linked to policy information in order to determine the claim
development (it is not uncommon for warranty administrators to keep premium and claims data
completely separated, though this is becoming less and less common). If no data is available, a
distribution can be developed by working with sources knowledgeable about the product being
warrantied. There may also be similar products being warrantied about which claim
development data is available that can be used as a proxy.
The actual ULAE costs can be determined in one of two ways. The costs may be determined
by viewing the claims operation either as an on-going business or as a run-off operation.
Viewing it as a run-off operation would lower the costs as claim-paying standards would most
likely drop. The insurer is no longer interested in maintaining a strong service image. For
example, in an on-going operation the standard of issuing a claim payment from notice of claim
may be five days; in a run-off operation this standard could be relaxed to two weeks or more.
This philosophy would also influence the setting of a ULAE reserve.
ULAE can be collected in various ways, depending on the way an insurer provides the warranty
product. If the insurer administers the settling of the claims it can be included in the warranty
premium. If a third party administrator (TPA) handles the claims, it may be provided for by fees
charged by the administrator to the dealer or retailer or it may be part of the commission
structure. For example, the TPA may earn a commission of 25%, but only get 15% with the
remaining 10% amortized over several years.
As can be seen, not recognizing the ULAE costs on a multiyear non-cancelable policy can have
financial implications. At the very least, a liability should be shown in the financial statements. At
the worst, it could lead to a claims department totally unprepared to handle the volume of claims
in the future.
PITFALLS
Many companies have entered the market for insuring manufacturers' and extended warranties
and many have failed, losing great amounts of money. Most often failures occur because the
risk being transferred was not understood. Let's face it, at the outset, this business looks very
attractive, as for the most part premiums are paid up front in full, and claims may occur years
later. Just think of all the investment income to be made!
Vehicle service contracts ("VSCs") present us with a unique risk/exposure structure. In no other
form of insurance are the insured, the producing agent, and the service provider the same entity.
This structure is akin to a doctor selling health insurance to his own patients. As you might
imagine, such a structure is full of moral hazards and conflicts of interest.
None of the previous examples can be controlled through underwriting or claims adjustment
efforts or controls. Without effective account management systems, administrators are left with
three equally unpalatable alternatives: raise rates, post-claims underwrite or cancel bad
accounts. If rates are raised beyond competitive levels, business will fall off. Generally, the
greatest loss of business is among the lowest risk, most profitable vehicle makes. The artificially high
rates become attractive only to high-risk dealers, selling high-risk cars, which will soon prove
even the artificially high rates to be inadequate. Tightening claim adjustment policies can have
the same effect - lost business. Cancellation of poorly-performing accounts, while eliminating
the problem, can end up eliminating all of a company's business as well.
Identifying problem accounts is simply a matter of generating a listing of accounts whose earned
loss ratios exceed a specific target. The three major areas of VSC groupings involve new
vehicles, near-new vehicles (or extended eligibility new vehicles) and used vehicles. If any or all
of the target loss ratios for these groupings are exceeded, the account should show on the
listing. If programming resources permit, it is also useful to develop some sort of ranking
system, encompassing factors such as: newly acquired account shock losses, the number of VSC
grouping target loss ratios exceeded, whether the overall target loss ratio is exceeded, and the
amount by which the targets have been exceeded.
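A possible shape for such a listing and ranking is sketched below. The dealer names, target loss ratios, and results are invented; only the idea of flagging and ranking exceedances reflects the text.

    # Hypothetical targets and account experience, for illustration only.
    targets = {"new": 0.65, "near_new": 0.70, "used": 0.75}
    accounts = {
        "Dealer A": {"new": 0.62, "near_new": 0.80, "used": 0.95},
        "Dealer B": {"new": 0.55, "near_new": 0.60, "used": 0.70},
        "Dealer C": {"new": 0.90, "near_new": 0.88, "used": 0.72},
    }

    def exceedances(experience):
        return {g: lr - targets[g] for g, lr in experience.items() if lr > targets[g]}

    problems = {name: exceedances(exp) for name, exp in accounts.items() if exceedances(exp)}
    # Rank by the number of groupings exceeded, then by the total amount over target.
    for name, excess in sorted(problems.items(),
                               key=lambda item: (len(item[1]), sum(item[1].values())),
                               reverse=True):
        print(name, {g: f"+{v:.0%}" for g, v in excess.items()})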
Identifying problem areas within the operation of the targeted accounts is a more complex issue.
In order to begin the analysis of specific problem areas, a more complex target set, or model, is
necessary. This model needs to be constructed according to major franchise group (standard
Asian, luxury Asian, standard domestic, luxury domestic, standard European, luxury European)
and reflect acceptable frequency and severity targets for each VSC grouping (New, Near-New,
Used, Total). Frequency and severity targets for this matrix can be calculated by averaging the
results of several accounts within each franchise group whose loss ratios for all VSC groupings
are at or below target levels.
Once the variances from frequency/severity targets are established, specific causes for such
variances can be derived and solutions proposed. High rates of early used vehicle claims can
be traced to less than adequate used vehicle reconditioning practices. High claim
severity (usually combined with high rates of multi-item repairs) generally points to highly
incentivized service writers/technicians "discovering" failures that were not prompted by
customer complaints. Generally, high frequency levels point to some type of customer incentive
program, e.g. free inspections or other service specials.
In order to implement solutions, the internal systems and the state filings must be flexible
enough to provide support for: reduced claim reimbursement (factory time and/or labor rates as
opposed to retail); claim elimination periods (typically 30 days on used vehicles); premium
adjustments (individual rate premium modifier factors); and underwriting restrictions (high mileage
used vehicles, long term new vehicle plans). Rate adjustments, elimination periods and
underwriting restrictions are used to address selection and reconditioning issues involving the
sale of the VSCs. Reduced claim reimbursement is used to combat overzealousness in the
service department. By focusing the solutions on the specific areas of the account's operation
that are causing the problem, recovery is speeded and recovery rates are increased.
WARRANTY IN GENERAL
In the early days companies evaluated warranty business on a calendar year basis. Premiums
on multi-year terms were earned evenly over the contract period. Unfortunately, losses tended
to occur later in the term of the warranty. In Exhibit E, it can be seen how this combination
understates the loss ratio in the first calendar year of the warranty term. Now, since the loss
ratio is so low, an obvious albeit erroneous conclusion would be that not only should we write
more of this business, we should reduce rates to help our marketers! It only takes a few years to
dig a deep hole, as inadequately priced business has now been written for several years. Rate
relief is essential. Of course, this leads to further problems. If the rate level increase needed is
large, there may be difficulty getting approval from the various states. Even if approvals are
finally received, implementing a large rate increase could lead to a very rapid drop-off in VSCs
written, as dealers can use a competing program. A large drop-off in VSCs would mean a large
reduction in revenue, just when the cash is needed to pay the claims from the old business. It is
easy to see how this could become a run-off operation.
Earning premiums correctly is very important as can be seen above. Premiums should be
earned in direct proportion to the loss payout pattern. Earning premiums in this fashion
maintains the proper loss ratio for the life of the policy period, as shown in Exhibit E. Hopefully
existing loss payout data is available in order to determine the payout pattern. In cases where
the data is not available and the losses are expected to start out slowly in the beginning of the
term and monotonically increase over the life of the contracts, the reverse sum of the digits rule
can be used. Exhibit E shows the loss payout pattern described by this rule. As shown, we
would earn 1/36 of the premium in the first year, 2/36 in the second year, and so on up to 8/36
in the last year. Note that this earning methodology is conservative; it does not recognize the
aforementioned "attrition factor." The state of Louisiana actually requires that a non-insurance
company that guarantees warranties or extended warranties earn its income no faster than the
reverse sum of the digits rule. If the term of the contract period is annual, this rule is often
referred to as the reverse rule of 78s (using monthly earnings).
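The rule is easy to compute. The sketch below earns a $100,000 policy-year premium over an eight-year term using the reverse sum of the digits; the same function applied to 12 monthly periods gives the reverse rule of 78s.

    def reverse_sum_of_digits(term_periods):
        # Fraction of premium earned in each period: 1/total, 2/total, ..., n/total,
        # where total = 1 + 2 + ... + n (36 for 8 years, 78 for 12 months).
        total = sum(range(1, term_periods + 1))
        return [period / total for period in range(1, term_periods + 1)]

    premium = 100000
    earned = [round(premium * f) for f in reverse_sum_of_digits(8)]
    print(earned)        # [2778, 5556, 8333, 11111, 13889, 16667, 19444, 22222]
    print(sum(earned))   # 100000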
Pricing of warranties or extended warranties should be done by product or at most by
homogeneous classes of products. Do not make the mistake of giving one overall rate for a
warranty program made up of many different products. Exhibit F, example 1, illustrates what
can happen. Company A administers a warranty program for "brown and white goods" (basically
electronics, appliances, and office equipment). Loss costs are available, and Company A is
looking to transfer the warranty risk to insurance company B. Since B will insure the entire
program, B decides to give a single program rate of $169. Unfortunately for B, A writes a new
account which only sells refrigerators. This changes the mix of risks, thus changing loss costs
and making the single rate of $169 inadequate as the new rate should be $193. Practically
speaking, rates would not be modified every time a new account came on line, so it would be
better to charge a rate by class to minimize the mix change problem.
Another pricing pitfall to avoid is basing the rate on the overall revenue an administrator gets for
the warranty contract. Again in Exhibit F, example 2, Company A (the administrator) sells a
warranty contract for $50, and Company B (the insurer) determines that the loss cost is $5. B
then grosses the loss cost up for expenses and wants $7 in premium. B then sets a rate of 14%
of revenue. Unfortunately for B, next year A decides to lower its selling price of the warranty to
$40. Now B only gets $5.60 per contract, which barely covers his loss costs let alone his
expenses.
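The arithmetic behind example 2 is worth making explicit, since the pitfall is entirely mechanical: a rate expressed as a percentage of the administrator's selling price moves with that price, while the insurer's loss cost does not.

    loss_cost = 5.00
    required_premium = 7.00                 # loss cost grossed up for insurer expenses
    price_year_1 = 50.00
    rate = required_premium / price_year_1  # 14% of the selling price
    print(f"Rate set in year 1: {rate:.0%}")

    price_year_2 = 40.00                    # administrator cuts its selling price
    collected = rate * price_year_2
    print(f"Year 2 premium collected: ${collected:.2f} vs. ${required_premium:.2f} needed")
    # $5.60 barely covers the $5.00 loss cost and none of the expenses.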
Another problem often encountered by the pricing actuary on warranty business is the lack of
quality data. To properly price a warranty product, policy year data must be used. Most
administrators do not show data in policy year format; some cannot show it as losses cannot be
tied back to the premium. Obviously in this type of operation there can be no verification of
coverage; the claim is paid when it is presented. This type of account cannot be soundly priced.
If triangular data is available, it must be reconciled with the TPA's audited financials. Again,
many TPAs are not used to providing actuarial data, so a thorough checking of the data is
required.
CONCLUSION
Ratemaking in the extended warranty line of business is well-suited to take advantage of the
actuarial approach. The business is driven by frequency rather than severity so that it lends
itself to actuarial modeling. For the vehicle extended warranty there is often credible data
available. When no data is available, the "back-to-basics" approach is best done by an
actuary. The actuary is an essential member of the warranty pricing team.
AUTO EXTENDED WARRANTY EXHIBIT A
LOSS PROJECTIONS
[Table: policy years 1990 through 1998, showing contract counts, written premium, average premium, reported losses and loss ratios, development lags, the percentage of losses unreported, expected loss ratios, IBNR, ultimate losses and projected loss costs per contract, together with the supporting Stanard-Buhlmann (Cape Cod) calculation and the calculation of the ELR estimate.]
AUTO EXTENDED WARRANTY EXHIBIT B
TREND ANALYSIS
INTERNAL REPAIR COST ANALYSIS
[Table: policy counts, claim counts, payments, frequency and severity by covered component (rental, water pump, A/C compressor, automatic transmission internal parts and assemblies, engine assembly, differential assemblies, and other components) for the new VSC group at two successive evaluations.]
EXTERNAL TREND ANALYSIS
Selected exponential trend = 1.8%
WIND TURBINES EXHIBIT C
YEAR 1 EXPOSURE
Design defect loss exposure

                           retrofit   probability of   probable   reasonable worst   worst case    worst case
item                       cost       occurrence       exposure   case retrofit cost probability   exposure
normal maintenance         n/a        1.00             $0         n/a                2.00          $0
rotor blade repair         $2,000     0.05             $100       $16,500            0.08          $1,292
hub retrofit               $5,000     0.06             $300       $6,500             0.30          $1,952
teeter damper retrofit     $2,500     0.08             $200       $4,000             0.45          $1,791
gearbox retrofit           $4,000     0.06             $240       $19,200            0.16          $3,052
generator retrofit         $1,000     0.04             $40        $7,000             0.07          $478
mainframe repair           $0         0.06             $0         $7,500             0.16          $1,192
yaw bearing retrofit       $4,000     0.04             $160       $5,500             0.15          $842
tower repair               $0         0.08             $0         $3,000             0.31          $919
totals                                                 $1,040                                      $11,518
210
Oustanding ULAE Estimates EXHIBIT D
AS of 10/97 PAGE 1
Assumptions
CY1998 CY1999 CY2000 CY2001 CY2002 CY2003 CY2004 CY2005 Total O/S
PY98 33,000 19,500 19,800 26,400 13.200 13,200 3,960 2,640 132,000
PY97 18,975 18,975 25,300 12,650 12.650 3.795 2,530 94.875
PY96 18,150 24.200 12.100 12.100 3,630 2,420 72,600
PY96 23,100 11.550 11.550 3,465 2,310 51,975
PY94 11,000 11,000 3,300 2,200 27,5~
PY93 10,450 3,135 2,090 15,675
PY92 2,970 1,980 4,950
PY91 1,870 1,670
" 0.0%
Inflation Factors Dates
1998 86,515 19 0 40 20 40 760,000 180.000 t00 00~ 100,000 193.800 1,333,800 5 0% 1,333,e00
1999 70,840 160 30 1.0 30 640,000 135,000 50.000 75,000 153,000 t,053,000 50% 1,105,650
2000 54,340 12 0 20 10 20 480,000 90,000 50,000 50000 113,900 783.900 50% 864,250
2001 30,415 70 20 10 2.0 280,000 90.000 50,000 50,000 79900 549,900 5 0% 636,578
2002 18,590 40 10 10 10 160,000 45.000 50000 25,000 47.600 327,600 5 0% 398,200
2003 6,215 20 10 10 10 80,000 45,000 50,000 25.000 34000 234,000 50% 298,65Q
2004 2,530 10 t 0 10 10 40,000 45,000 50,000 25,000 27,200 187,200 5 0% 250,866
1998 33,000 70 20 10 20 280,000 90.000 50.000 50,000 79,900 549,900 50% 549.900
1999 19,8OO 40 10 10 10 160,000 45000 50.000 25,000 47,600 327,600 50% 343.980
2000 19,8OO 40 10 10 10 160,000 45,000 50,0OO 25,000 47,600 327,600 5 0% 361,179
2~1 26,4OO 60 10 10 1.0 240,OO0 45.000 50,000 25.000 61,200 421,200 50% 487.592
2002 13,2~ 30 10 10 10 120,000 45.000 50,000 25,000 40,8OO 280,600 5 0% 341.314
2~3 13,2~ 30 10 10 10 120,000 45,000 50,000 2 fi,0CO 40,800 280,800 50% 358,360
2004 3,960 10 10 10 10 40,000 45.000 50,000 25,OO0 27,200 187,200 50% 2,50.~
2~5 2,~0 10 10 10 10 40,00(} 45.000 50,000 25,OO0 27,200 187,200 5 0% 263.409
+2
I X 3o,oi x"7,1 x104 x +312,1 x +418o,ol x÷620o,o x÷,20o,o x÷,13o,o TOTALI o,o
EXTENDED WARRANTY EXHIBIT E
PREMIUM EARNING COMPARISON
For use with the examples below: POLICY YEAR WRITTEN PREMIUM = $100,000; EXPECTED LOSS RATIO = 75%

                          Year 1    Year 2    Year 3    Year 4    Year 5    Year 6    Year 7    Year 8     Total
Premium earned evenly over the term:
Cumulative Earnings       $12,500   $25,000   $37,500   $50,000   $62,500   $75,000   $87,500   $100,000
Incurred Losses           $2,250    $5,250    $7,500    $9,000    $11,250   $15,000   $15,000   $9,750     $75,000
Cumulative Losses         $2,250    $7,500    $15,000   $24,000   $35,250   $50,250   $65,250   $75,000
Policy Year X Loss Ratio  18%       30%       40%       48%       56%       67%       75%       75%

Premium earned in proportion to loss payout:
Incurred Losses           $2,250    $5,250    $7,500    $9,000    $11,250   $15,000   $15,000   $9,750     $75,000
Cumulative Losses         $2,250    $7,500    $15,000   $24,000   $35,250   $50,250   $65,250   $75,000
Policy Year X Loss Ratio  75%       75%       75%       75%       75%       75%       75%       75%

Premium earned by the reverse sum of the digits rule:
Cumulative Earnings       $2,778    $8,333    $16,667   $27,778   $41,667   $58,333   $77,778   $100,000
Incurred Losses           $2,250    $5,250    $7,500    $9,000    $11,250   $15,000   $15,000   $9,750     $75,000
Cumulative Losses         $2,250    $7,500    $15,000   $24,000   $35,250   $50,250   $65,250   $75,000
Policy Year X Loss Ratio  81%       90%       90%       86%       85%       86%       84%       75%
EXTENDED WARRANTY EXHIBIT F
EXAMPLE 1: CHANGE IN LOSS COSTS DUE TO CHANGE IN MIX - TPA ADDS REFRIGERATOR ACCOUNT
[Table: product-level loss costs and contract counts before and after the refrigerator-only account is added, showing the single program rate moving from $169 to $193.]
EXAMPLE 2: RATE SET AS A PERCENTAGE OF THE ADMINISTRATOR'S SELLING PRICE
Year 2 - rate still 14% of price: selling price $40.00, loss cost $5.00, expenses $2.00, required premium $7.00, required rate 18% of price, premium collected $5.60
A Macro Validation Dataset for
U.S. Hurricane Models
A Macro Validation Dataset for U.S. Hurricane Models
Abstract
Public and regulatory acceptance of catastrophe models has been hampered by the
complexity and proprietary nature of the models. The outside user is generally
dependent on the modeler to demonstrate the validity and reasonableness of model
results. Accordingly, we have developed a dataset permitting macro validation - one
that would allow a lay person to compare the overall results oir a hurricane model to
an historical record.
The macro validation dataset consists of the aggregate insured losses from
hurricanes affecting the continental United States from 1900 through 1999. The
historical losses in each county have been "trended" - adjusted from the conditions
at the time to those existing today. The trending reflects not only estimated changes
in price levels, but also estimated changes in the value of the stock of properties and
contents, and changes in the insurance system. Our work extends and improves
upon similar work by Landsea and Pielke (1998), published by the American
Meteorological Society.
The paper describes the construction of the validation dataset and summarizes the
resulting size of loss distributions by event, state and county. It also provides tables
summarizing key statistics about all hurricanes affecting the United States (and
Puerto Rico, the U.S. Virgin Islands and Bermuda) during the 20th century. Finally, we
compare summary statistics from the dataset to the results of a hypothetical
probabilistic hurricane model.
I. INTRODUCTION
Hurricane Andrew in 1992 heightened the concern among property insurers and
reinsurers about the potential for losses from natural catastrophes. This heightened
concern spread beyond hurricanes to other perils with the Northridge earthquake in
1994, and several major winter snowstorms and tornadoes during the nineties.
Major catastrophes outside the U.S. during this time have also helped keep
catastrophe issues in the forefront for property insurers and reinsurers worldwide.
Since natural catastrophes are infrequent, traditional actuarial pricing methods are of
limited value. Actuaries are accustomed to estimating rate adequacy by adjusting a
body of historical insurance premium and loss experience to reflect the anticipated
future environment. For property insurance, this typically involves a projection using
three to six years of recent, mature experience. Prior to hurricane Andrew, the
actuarial literature suggested using a thirty-year experience period for measuring
excess wind loads in property insurance ratemaking.
When extreme events in a particular region are expected to happen only once every
hundred years or more, alternative approaches are clearly required. This is true
whether the objective is to measure expected losses for rating purposes or probable
maximum losses¹ for risk and capital management purposes. For catastrophe risk
management, probabilistic computer simulation models have been developed as
such an alternative. These models incorporate longer-term historical data about the
physical events as well as engineering knowledge about their destructive potential.
Insurers, reinsurers and rating agencies have generally accepted use of the models
to project losses.
The models and their use as a ratemaking tool have not been free from controversy.
Some insurance regulators have rejected their use in rate filings, citing the difficulty
of verifying the model results. Regulators have also cited extreme rate indications
and inconsistent results between competing models as a basis of their rejection.
Despite these issues, the use of models continues to increase because they provide
the most comprehensive use of available data to measure the costs and risks of
catastrophes. In response, regulators in Florida and Louisiana have set up formal
processes for evaluating catastrophe models.
Model Validation
Fundamentally, all catastrophe models proceed along the same analytical path. First,
the key scientific parameters describing a specific historical or hypothetical event are
determined. The models then estimate the incidence of damaging forces to property
from that event. Finally, the resulting property damage and insured loss are estimated
based on the characteristics of the structure and the policy terms. More specifically, a
probabilistic hurricane model contains the following four basic steps.

¹ The probable maximum loss, or PML, is the loss amount that is estimated to be exceeded
with a specific probability, for example 1% (or exceeded once within a specified return
period, for example 100 years), resulting from one or more causes of loss affecting a portfolio
of properties.
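The PML definition in the footnote can be illustrated with a toy simulation: draw many years of aggregate losses and read off the loss level exceeded with 1% probability (a 100-year return period). The frequency and average severity below echo the roughly 164 landfalling storms and $1.75 billion average normalized loss discussed later in the paper, but the Poisson/lognormal form and the dispersion parameter are assumptions made only for this sketch.

    import math
    import random

    random.seed(0)

    def poisson(lam):
        # Knuth's method; adequate for a small mean.
        limit, k, p = math.exp(-lam), 0, 1.0
        while p > limit:
            k += 1
            p *= random.random()
        return k - 1

    mean_storms_per_year = 1.64                  # ~164 landfalling hurricanes per 100 years
    sigma = 2.0                                  # assumed severity dispersion
    mu = math.log(1.75e9) - sigma ** 2 / 2       # lognormal mean of ~$1.75 billion per storm

    def annual_loss():
        return sum(random.lognormvariate(mu, sigma) for _ in range(poisson(mean_storms_per_year)))

    losses = sorted(annual_loss() for _ in range(20000))
    pml_100yr = losses[int(0.99 * len(losses))]  # empirical 99th percentile of annual losses
    print(f"Simulated 1-in-100-year aggregate loss: ${pml_100yr / 1e9:.1f} billion")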
At each step of the process, error is introduced to the extent that model results do
not fully agree with actual observations. Model error is present because no model
can precisely replicate an actual physical event. By definition, a model is a
representation of the event; it seeks to capture the key underlying variables and their
inter-relationships, leaving estimation errors from variables and inter-relationships
not captured. Simulating a large number of hypothetical events can reduce certain
of these errors. Some of the key contributors to hurricane model error are:
• In estimating the wind speeds at specific locations affected by each event
  - limited availability of wind speed data for a sufficient number of locations for a
    sufficient number of historical events
  - limited ability to simulate the actual impact of land, vegetation and man-made
    objects on wind speeds
  - limited ability to simulate the possible variations in windfield shape (i.e., the
    distribution of wind velocity by distance and direction from the center),
    particularly including localized bursts of wind.
• In estimating the damage caused by storm surge
  - limitations in our ability to determine the portion of damage due to flood
    rather than wind.
These errors can be significant or modest in relation to the final results produced by
the model. For example, Kelly and Zeng (Kelly and Zeng 1996) suggest that, based
on their experience with one hurricane model, the errors introduced by the damage
step are generally much less than a single order of magnitude while the errors
introduced by the event steps can be several orders of magnitude. In other words,
the model's estimate of expected losses for a particular risk might be off by 20% due
to a mis-specified damage function, but those same expected losses might be off by
200% due to mis-estimation of the landfall probability.
In the authors' view, public (and regulatory) acceptance of these models is hampered
by the complexity of this layered validation approach, which leaves the outside user
with an unclear picture of the overall goodness of fit between the model and
historical data. The problem is only exacerbated when the model formulas and the
validation results are treated as proprietary by the modelers. Accordingly, we set out
to develop and publish a dataset permitting macro validation - one that would allow
a lay person to compare the overall results of the model to an historical record. In
addition to a comparison of model results to historical results, the dataset also
demonstrates the limitations of the historical experience and data.
The macro validation dataset consists of the aggregate insured losses from each
hurricane affecting the continental United States from 1900 through 1999. The
dataset includes storms determined by NOAA to have caused hurricane conditions
over land. Exhibit 1 lists these hurricanes³ and shows their magnitude, as
determined by NOAA, in each of the coastal states affected. The overall losses for
each event have been allocated to county, based on estimates of relative loss within
the state. The historical losses in each county have then been "trended" - adjusted
from the conditions at the time to those existing today. Our work extends and
improves upon similar work published by Pielke and Landsea (Pielke and Landsea
1998), which looks at total economic damages rather than insured losses and does
not cover the entire 20th century.
Because the models are used primarily by the insurance industry, our focus was to
estimate the aggregate insured losses directly sustained by the U.S. insurance
industry. The same approaches described in the paper can be used to project total
economic losses as well.
The remainder of this paper has two major sections. Section II describes the
construction of the validation dataset, which consists of the losses from each
historical event adjusted to 2000 cost and exposure levels. Section III illustrates the
use of the dataset.
Historical Losses
Data on the losses sustained from past hurricanes is available from a variety of public
and private sources. The various data sources differ as to the types of costs
included, the level of detail, and whether the figures are actual results or estimates.
The National Weather Service (NWS, which is part of NOAA) compiles data on the
economic impact of each U.S. hurricane; that data is published annually in the
Monthly Weather Review. A summary of this historical data from 1900 forward is
presented in Deadliest, Costliest, and Most Intense United States Hurricanes of This
Century (Hebert, Jarrell and Mayfield 1996). The data published by NWS are
estimates based on surveys of the areas affected and consultations with experts, not
a tabulation of actual costs incurred. The estimates include all direct costs stemming
from the event, including insured losses, uninsured property losses, federal disaster
assistance outlays, agriculture and environmental losses, etc. (Technically, the
insured losses include some secondary costs due to the inclusion of business
interruption and additional living expense claims.) Typically, the estimates for each
event are not broken down by state or county. Separate estimates are made when a
single hurricane makes more than one distinct landfall.
³ The summary tables on Exhibit 1, Sheet 3, show total storms by category and state.
Appendix A displays key statistics on hurricanes affecting Bermuda, Hawaii, Puerto Rico and
USVI during the 20th century.
Property Claim Services, Inc. (PCS), a subsidiary of Insurance Services Office (ISO),
prepares estimates of the direct insured losses for each natural catastrophe,
including hurricanes. Their historical data extends back only to 1949. To be
considered a catastrophe by PCS, the aggregate insured losses from the event must
exceed a set dollar threshold. This threshold was originally set at $1 million; over
time it has been raised to its current level of $25 million. The estimates published by
PCS are based on surveys of insurers' reported loss activity, insurer market share
data and a database of the number and types of structures by county. The current
PCS practice is to prepare an initial loss estimate approximately two weeks after the
event and to revise its estimates based on new information after subsequent 60 day
periods until the estimate stabilizes, at which point no further revisions are made.
Until the late 1980s, PCS estimates were rarely updated after 60 days and evidence
suggests that these estimates often underestimated the total loss.
The PCS estimates are intended to include all insured losses paid directly by U.S.
insurers under property and inland marine insurance coverages. This would include
payment of the costs to repair or replace damaged property and contents,
reimbursement for alternative housing while repairs are effected, and compensation
for business interruption losses. The insurer's specific expenses for adjusting the
claims are not included. The PCS estimates for each event are currently broken
down by state, separately for personal property, commercial property and
automobile, and also include the number of claims and the average payment.
TABLE 1
Comparison of PCS Estimates of Industry Losses to Estimates from the AIRAC Survey
Certain state insurance departments also conduct studies of hurricane losses in their
state. In the case of hurricane Andrew, the Florida Department of Insurance
compiled the actual losses for the insurance industry. Under emergency rules
promulgated by the Department, each insurer operating in the state was required to
report their accumulated losses to the Department at the end of each quarter. The
reported figures include only losses (i.e., not including costs of adjusting the claims),
for Florida business only. Losses in Louisiana and elsewhere are not included.⁴ The
results as of March 31, 1994 were published in The Journal of Reinsurance (Lilly,
Nicholson and Eastman 1994). In the aggregate, insurers reported 798,356 claims
from hurricane Andrew, with a total dollar cost of approximately $16.1 billion. As of
that date, insurers had paid out roughly 91.9% of that figure, with the balance
representing their estimate of payments still to be made pending final adjustment.
The final PCS estimate for Florida losses from Hurricane Andrew was $15 billion.
⁴ Anecdotally, we would point out that insurance losses could be sustained by policyholders
far away from the event. For example, in the case of hurricane Andrew an insurer sustained a
loss by a Massachusetts policyholder who lost a camera while vacationing in Florida at the
time. This loss would not be included in the figures quoted above.
(Hebert, Jarrell and Mayfield 1996). The normalized loss for these hurricanes
represents only about 3% of the total normalized loss.
Once a best estimate of the industry aggregate insured losses was selected, the
losses were allocated to county. We devised a damage index for each county that
reflected the estimated relative impact of the hurricane. The damage indices for all
counties affected by an event were scaled such that, when multiplied by the number
of housing units in the county at the time, the sum across all counties balanced to the
selected industry aggregate insured loss.
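The scaling just described is a one-line calculation once the damage indices and housing unit counts are in hand. In the sketch below the county names, indices and housing counts are invented; only the balancing logic reflects the text.

    selected_industry_loss = 1.5e9   # assumed aggregate insured loss selected for the event

    damage_index = {"County A": 0.040, "County B": 0.015, "County C": 0.002}      # hypothetical
    housing_units = {"County A": 120000, "County B": 300000, "County C": 500000}  # at time of event

    raw = {c: damage_index[c] * housing_units[c] for c in damage_index}
    scale = selected_industry_loss / sum(raw.values())
    county_loss = {c: scale * v for c, v in raw.items()}

    assert abs(sum(county_loss.values()) - selected_industry_loss) < 1.0   # balances to the selection
    for county, loss in sorted(county_loss.items()):
        print(f"{county}: ${loss / 1e6:,.0f} million")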
The damage indices for an event are derived from the ToPCat hurricane model. The
use of these indices means that the allocation of losses to county (and to state, prior
to PCS estimates) is model-dependent. Nevertheless, the total insured loss estimates
for each storm are not model dependent as they are balanced to the selected
industry loss estimate.
Trending
The historical losses reflect the price levels and property exposure existing at the
time of the event. If the same event were to happen today, the losses arising from
that event would reflect
• today's price levels, reflecting the general inflation in price levels that occurred
during the intervening period
• the current stock of properties and contents, reflecting the increase in the number
of structures of various types, any increases in the average size or quality of the
structures, and the greater amounts and value of the typical contents in the
structures
The impact of monetary inflation was measured by reference to the Implicit Price
Deflator (IPD) for Gross National Product, published by the Department of Commerce
in their annual Economic Report to the President. An inflation trend factor was
computed by dividing the estimated value of the IPD at year-end 2000 by the value at
the time of the event. The IPD is only available back to 1950. For prior years, a 3.5%
annual trend was assumed.
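A small sketch of that trend factor follows. The IPD levels shown are placeholders, not actual index values; only the base year and the 3.5% pre-1950 assumption come from the text.

    def inflation_factor(event_year, ipd, base_year=2000, pre_1950_trend=0.035):
        # IPD-based price trend from the event year to the base year; years before 1950
        # are bridged with the assumed 3.5% annual trend described in the text.
        start = max(event_year, 1950)
        factor = ipd[base_year] / ipd[start]
        if event_year < 1950:
            factor *= (1 + pre_1950_trend) ** (1950 - event_year)
        return factor

    ipd = {1950: 16.0, 2000: 106.9}    # placeholder index values for illustration only
    print(f"1926 event trended to 2000 price levels: x{inflation_factor(1926, ipd):.1f}")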
Of course, property values have increased by more than inflation. For example, the
average size of houses and the amount of contents have gradually increased over
time. The national growth in the value of property was measured using estimates of
Fixed Reproducible Tangible Wealth (FRTW) published by the Department of
Commerce's Bureau of Economic Analysis. FRTW measures the total value of all
structures and equipment owned by businesses, institutions, and government as well
as residential structures and durable goods owned by consumers. In this context,
structures include buildings of all types, utilities, railroads, streets and highways, and
military facilities. Similarly, equipment includes industrial machinery and office
equipment, trucks, autos, ships, and boats. While FRTW includes some elements not
entirely relevant to property insurance such as military facilities and highways, these
elements represented less than 10% of the total as of year-end 1995.
The national growth in property exposure has been far from uniform geographically.
The general migration of the U.S. population towards the South and West over the
last several decades has been well publicized. Of particular relevance to potential
hurricane losses is the increased concentration of people and property in vulnerable
coastal locations.
Pielke and Landsea (Pielke and Landsea 1998) have suggested that the national
property growth factor be adjusted based on relative growth of the population in the
affected region versus the nation as a whole. They introduce a population
adjustment equal to the ratio of the growth in population in the affected coastal
counties to the growth in population nationally. While this approach reasonably
captures the migration of the U.S. population to the Sunbelt, it fails to take into
account the explosive growth in vacation homes. (Census population data accounts
for people at the location of their principal residence.) This issue is particularly
significant because a large number of vacation homes are located in coastal resort
areas: Cape Cod, Long Island, Cape Hatteras, Florida, etc.
We improve upon Pielke and Landsea's approach by using the growth in the total
number of housing units in each county during the time period for which it is
available, rather than the growth in population. Housing unit data is available from
the Census, back to 1940. (County data from the decennial census was interpolated
to obtain annual housing unit estimates for each county. Prior to 1940, we used
population statistics to estimate housing units.)
A second improvement relates to the way in which the county data is used. Pielke
and Landsea (Pielke and Landsea 1998) identified the coastal counties that were
affected by each event and based their geographic adjustment on the aggregate
change in population for all counties combined. Because we estimated the insured
loss by county, we were able to weight the growth by relative damage in each
county.
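In code, the damage-weighted growth adjustment amounts to a weighted average of county-level housing unit growth, with weights equal to each county's estimated share of the event's damage. All of the county figures below are hypothetical.

    damage_share = {"County A": 0.60, "County B": 0.30, "County C": 0.10}   # share of event damage
    units_at_event = {"County A": 20000, "County B": 45000, "County C": 80000}
    units_today = {"County A": 95000, "County B": 120000, "County C": 130000}

    growth = {c: units_today[c] / units_at_event[c] for c in damage_share}
    weighted_growth = sum(damage_share[c] * growth[c] for c in damage_share)
    print(f"Damage-weighted housing unit growth factor: {weighted_growth:.2f}")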
Since we are adjusting insured losses, a final adjustment was necessary to account
for changes in the insurance system. Ideally, this adjustment should account for
each of the following.
• Changes in the prevalence of insurance coverage. Coverage for the wind peril is
fairly universal today, primarily because mortgage lenders require it. (This
requirement does not exist for earthquake insurance, resulting in significantly
lower market penetration for that coverage, even in earthquake-prone areas.)
Property that is uninsured tends to be lower valued. Prior to the introduction of
multiple peril policies in the 1960s, wind coverage was far less universal. The
introduction of FAIR plans and wind pools has also contributed to more universal
coverage.
• Changes in the level and structure of coverage. Competition has led to gradual
increases in the level of coverage offered by standard insurance policies. For
example, coverage for contents, generally written as a standard percentage of
building coverage on personal lines policies, has increased over time. More
significantly, there has also been a longer-term trend away from actual cash value
to replacement cost coverage. This shift has been widespread in homeowners;
even some business-owners coverage is now written on a replacement cost basis.
Conning (Conning & Company 1996) has pointed out that this change in coverage
significantly increases the insurer's exposure, essentially changing it from a net
(of depreciation) to a gross value basis. One coverage trend has acted to reduce
insurers hurricane exposure in recent years. Subsequent to Hurricane Andrew,
there was a significant increase in required deductibles in coastal areas. While
individuals have tended to resist voluntary increases in retentions, there has been
a longer-term trend toward larger self-insured retentions in the commercial
insurance sector.
• Changes in the typical practices regarding claim settlements. While this element
may be the hardest to specify, industry professionals believe that policyholders
have a greater propensity to file claims, particularly claims relating to minor or
consequential damage. At the same time, insurers are more willing to interpret
the coverage in a manner favorable to the insured (contrary to public perception),
in the interests of customer satisfaction, particularly after a catastrophe.
Taken collectively, all of these factors work to increase the extent of economic losses
covered by insurance, particularly as one goes further back in time. The insurance
utilization index was derived from a review of ratios of PCS insured loss estimates to
NOAA economic loss estimates from 1949 through 1995. The data and selected
insurance utilization index are compared in the graph on Appendix B, Exhibit 2. The
selected index from 1950 through 1995 was based on a linear least squares fit of the
data. The fit produced a line from approximately 21% in 1950 through 55% in 1995.
From 1995 through 2000, the insurance utilization rate was kept at a constant 55% to
judgmentally reflect the increasing use of deductibles. Prior to 1950, a linear trend
from 10% in 1900 through 21% in 1950 was judgmentally selected. As total
economic losses were used as the starting point for normalization prior to 1949, this
latter assumption has virtually no impact on normalized losses.
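The selected utilization index reduces to a simple piecewise-linear function of the event year; the sketch below encodes the 10%, 21% and 55% anchor points described above.

    def insurance_utilization(year):
        # Linear from 10% (1900) to 21% (1950), linear from 21% (1950) to 55% (1995),
        # then held constant at 55% through 2000.
        if year <= 1950:
            return 0.10 + (0.21 - 0.10) * (year - 1900) / 50
        if year <= 1995:
            return 0.21 + (0.55 - 0.21) * (year - 1950) / 45
        return 0.55

    for y in (1900, 1938, 1969, 1992, 2000):
        print(y, f"{insurance_utilization(y):.0%}")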
Appendix B, Exhibit 1 displays the historical growth rates in the IPD and FRTW
indexes as well as the national growth in population and housing units.
• Limitations of the normalization process itself (these limitations would also relate
to comparisons of normalized and modeled historical storm results)
228
12
the insured loss or its allocation to county can produce large changes in the
normalized amount for events that occurred many years ago; this distortion
should be less significant at the statewide level or for groups of neighboring
counties)
- - trending of exposures based solely on housing units (normalized losses in
counties with commercial property growth significantly different than housing
unit growth will be distorted)
- - probabilistic models reflect a wide range of potential storm paths, sizes and intensities,
which can produce results that differ significantly from the results of a one hundred-year
period that are influenced greatly by the location of the 5 or 10 largest or most intense storms
- - probabilistic model industry loss estimates are dependent on the accuracy of
the modeler's estimate of total insured property exposures by ZIP code or
county that are used in the modeling to estimate industry loss (these industry
exposure sets are independently developed by modelers, or may be
developed by users, based on insurance industry or external statistics on
property values)
- - probabilistic models may include tropical storms that do not reach hurricane strength or
strafing hurricanes that do not produce hurricane winds over land (these differences can
distort loss comparisons as well as frequency comparisons)
Results
Inflation: 297.4%
Growth in wealth per capita (2.317 / 1.703): 36.1%
Growth in insurance utilization: 55.6%
Growth in housing units: 222.8%
Thus, in Hancock County, the impact of inflation (297.4%) is less than the combined
impact of the other three factors (584% = (1.361 x 1.556 x 3.228) - 1), the most
important of which is the growth in the number of housing units.
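To make the arithmetic concrete, the short Python sketch below chains the four Hancock County factors quoted above onto an illustrative historical loss; the function name and the $1 million loss amount are hypothetical and are used only to show how the normalization factors combine.

```python
# Sketch of the normalization arithmetic described above; the factor values are
# the Hancock County figures from the text, while the function name and the
# illustrative $1 million loss are hypothetical.

def normalize_loss(historical_loss, inflation, wealth_per_capita,
                   insurance_utilization, housing_unit_growth):
    """Restate an historical loss at year-2000 levels by chaining the four
    growth factors, each expressed as a percentage increase."""
    factor = ((1 + inflation)
              * (1 + wealth_per_capita)
              * (1 + insurance_utilization)
              * (1 + housing_unit_growth))
    return historical_loss * factor

# Inflation 297.4%, wealth per capita 36.1%, utilization 55.6%, housing 222.8%
example = normalize_loss(1_000_000, 2.974, 0.361, 0.556, 2.228)
print(f"Normalized loss: {example:,.0f}")   # roughly 27 times the original loss
```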
Exhibit 3 summarizes the estimated actual and normalized losses for hurricanes
affecting the U.S. during the 20th century. The normalized losses for these 164
hurricanes average $1.75 billion per storm, or about $2.9 billion per year. The resulting
size of loss distribution by Saffir-Simpson category on Exhibit 3, Sheet 4 shows the
impact of storm severity on insurance losses. While only about 9% of historical
events were category 4 hurricanes, those events produced 55% of the normalized
losses. Interestingly, the category 5 hurricanes have not produced a similarly
skewed impact because the only two such events (#2 in 1935 and Camille in 1969)
did not hit densely populated areas.
Exhibit 3, Sheet 4 also shows the variation in normalized loss by decade, most
notably the high losses in the twenties and the relatively low losses in the seventies
and eighties.
As 100 years is not a sufficiently long time period to credibly determine the likely loss
levels at the longer return periods, random elements are evident in the state
distributions. For example, the 100-year loss for South Carolina, Hurricane Hugo in
1989, is approximately 10 times the 100-year loss in Georgia, Hurricane Opal in 1995.
Georgia was not hit heavily in the 20th century, having had no landfalling events, but
saw several major hurricanes in the 19th century. On a probabilistic basis, it is
reasonable to expect the 100-year loss in Georgia to be somewhat closer to the
South Carolina 100-year loss.
• What data are the Model T frequency distributions based on, and why do they
differ from the 20th century distributions?
• What are the paths and Saffir-Simpson categories of the typical 50 year and 100
year return events in Model T, compared to the worst events by state during the
20th century?
• Why are the Model T expected losses in Texas so much lower and New York and
New Jersey so much higher than the normalized 20th century expected losses?
• How do these and other key differences from the 20th century storm set affect the
results of Model T on a specific insurer's portfolio?
Exhibit 5 displays annual aggregate loss distributions for counties with significant
annual expected losses in Texas and Florida. Random elements are even more
evident at the county level. For example, Dade County has expected losses over 3
times expected losses in Broward County and over 5 times those in Palm Beach
County, Florida, due to the influence of Hurricane Andrew and storm number 6 of
1926.
Exhibit 6, Sheet 3 compares the normalized losses from the 50 largest events of the
20th century to the Model T results for those same events. Here we see evidence that
modeled individual storm estimates often differ significantly from the normalized
amounts. Differences of over 50% occur on 18 of the 50 storms. These differences
occur primarily on storms prior to the advent of PCS estimates in 1949. Only 2 of the
18 (Hurricane King in 1950 and Hurricane Donna in 1960) have normalized estimates
based on PCS. These differences indicate the uncertainty in both normalizing and
modeling these older storms.
In conclusion, the normalized hurricane loss database provides a variety of tools for
hurricane model users to perform macro validation tests of model assumptions. In
keeping with the spirit of this call for papers on data, the authors will provide
interested readers with an electronic copy of the normalized loss database by event
and county. We trust that future research will expand the scope of hurricane loss
data to include not only hurricanes of the 21st century, but improvements to this 20th
century database, and perhaps also the addition of estimates of hurricane losses in
prior centuries.
IV. REFERENCES
Conning & Company. 1996. Homeowners Insurance: The Problem is the Product.
Hartford, CT.
Hebert, P.J., J.D. Jarrell, and M. Mayfield. Updated February 1996. Deadliest,
Costliest, and Most Intense United States Hurricanes of This Century (and other
Frequently Requested Hurricane Facts). NOAA Technical Memorandum NWS TPC-1.
Kelly, P.J., and L. Zeng. 1996. The Engineering, Statistical, and Scientific Validity of
EQECAT USWIND Modeling Software. Page 2. Presented at the ACI Conference for
Catastrophe Reinsurance. New York, NY.
Lilly III, C.C., J.E. Nicholson, and K. Eastman. 1994. Hurricane Andrew: Insurer Losses
and Concentration. Journal of Reinsurance Volume 1, Number 3, p. 34.
Neumann, Charles J., Brian R. Jarvinen, Colin J. McAdie and Joe D. Elms. 1993.
Tropical Cyclones of the North Atlantic Ocean, 1871-1992. 4th revision. Asheville, NC:
National Climatic Data Center.
Pielke, Jr., R.A., and C.W. Landsea. 1998. Normalized Hurricane Damages in the
United States: 1925-1995. Weather and Forecasting Number 13, p. 621.
Tucker, Terry. Amended 1995. Beware the Hurricane. Bermuda: The Island Press
Limited.
Exhibit 1
Sheet 1
[Listing of U.S. landfalling hurricanes, 1900 through 1941: year, storm number, date of
landfall, and Saffir-Simpson category for each coastal state affected.]
Exhibit 1
Sheet 2
[Listing of U.S. landfalling hurricanes, 1941 through 1985: year, storm number or name, date
of landfall, and Saffir-Simpson category for each coastal state affected.]
Exhibit 1
Sheet 3
[Summary counts of the 164 landfalling hurricanes of the 20th century by Saffir-Simpson
category.]
Notes:
Coastal states affected, and Category designations according to the Saffir-Simpson Hurricane Scale, based on Neumann (Neumann, Jarvinen, McAdie and
Elms 1993) through 1992 and on NOAA summary reports for 1993-1999. States "affected" reflects NOAA's judgment as to which areas received hurricane
conditions at the intensity of the defined Saffir-Simpson category. In some cases, the conditions may have existed only in very localized areas and may not
have existed in areas that contained significant amounts of insured property. Additional states with normalized losses greater than $25 million noted by 'L'.
First landfall indicated by italics (strafing of coastal islands not considered as first landfall if subsequent landfall more significant).
Saffir-Simpson      Central
Number              Pressure      OR    Winds      OR    Surge
(Category)          (Millibars)         (MPH)            (Feet)
1                   >979                74-95            4-5
2                   965 - 979           96-110           6-8
3                   945 - 964           111-130          9-12
4                   920 - 944           131-155          13-18
5                   <920                >155             >18
Exhibit 2
Exhibit 3
Sheet 1
Columns: Hurricane Number/Year and Name; Estimated Actual Loss at Time of Event (Total
Economic, Insurance Utilization, Insured, Source); Normalized Loss to 2000; State/Region;
Max Category
Exhibit 3
Sheet 2
Columns as in Sheet 1: Hurricane Number/Year and Name; Estimated Actual Loss at Time of
Event (Total Economic, Insurance Utilization, Insured, Source); Normalized Loss to 2000;
State/Region; Max Category
Exhibit 3
Sheet 3
Hurricane Loss Estimates
Continental U.S. 1900-1999
Dollars in Thousands
Columns as in Sheet 1: Hurricane Number/Year and Name; Estimated Actual Loss at Time of
Event (Total Economic, Insurance Utilization, Insured, Source); Normalized Loss to 2000;
State/Region; Max Category
Exhibit 3
Sheet 4
Columns as in Sheet 1: Hurricane Number/Year and Name; Estimated Actual Loss at Time of
Event (Total Economic, Insurance Utilization, Insured, Source); Normalized Loss to 2000;
State/Region; Max Category
Notes:
Where based on NOAA, insured loss equals economic loss times insurance utilization factor times flood
adjustment factor. Only the following storms, which had unusual amounts of uninsured flood damage, were
reduced to reflect flood: 1900 #1 (50%), 1915 #2 (75%), 1916 #1 (50%), 1955 Diane (5%).
PCS losses exclude the following states and territories, which were excluded from the normalization model:
1975 Eloise: PA, PR
1979 David: PR, VI, VA to MA
1979 Frederic: KY, NY, OH, PA, WV
1980 Allen: PR, VI
1989 Hugo: PR, VI
1995 Opal: NC, SC, TN
1996 Fran: PA, OH
1997 Danny: NC, SC
1998 Georges: PR, VI
1999 Floyd: PA, RI
Exhibit 4
Sheet 1
[Normalized losses by return period and expected annual losses, by state (dollars in
thousands); the rows surviving in this extraction are South Carolina, North Carolina, and
Total All States.]
Note: Return period loss based on distribution by state of normalized losses in Exhibit 3, e.g., 100 year
return is the worst year in the 20th century, 50 year return is the second worst year, 25 year return is the 4th
worst year, etc. Not to be confused with probabilistic return period distributions and expected losses based
on catastrophe models, which are intended to reflect longer term probabilities.
Exhibit 4
Sheet 2
[Largest normalized single-event losses by return period, by state (dollars in thousands), with
the worst historical event noted; the rows surviving here are South Carolina (1989 - Hugo),
North Carolina (1954 - Hazel), New Jersey (1938 - #4 or 1954 - Hazel), and New York
(1938 - #4, the "Great New England" hurricane).]
Note: Return period loss based on distribution by state of the largest normalized loss per year in
Exhibit 3, e.g., 100 year return is the worst event, 50 year return is the second worst event, 25 year
return is the 4th worst event, etc. Not to be confused with probabilistic return period distributions and
expected losses based on catastrophe models, which are intended to reflect longer term probabilities.
Exhibit 5
Normalized Hurricane Loss - Annual Aggregate Severity Distributions by State and County
Counties with Significant Annual Expected Losses
Dollars in Thousands
                                    Estimated Normalized Actual 20th Century Return Period (Years)            Expected      Expected Loss
State  County        2000 Housing Units       100          50          25          20         10        5     Annual Loss   Per Unit ($'s)
TX
       Harris             1,305,351     9,953,674   8,841,048     729,077     560,265    199,602        0       245,595       188
       Galveston            110,157     4,606,461   4,084,453     360,805     316,733     44,502    1,106       104,432       948
       Nueces               122,333     7,287,137   2,001,912      90,356      53,950     36,982        0        98,660       806
       Brazoria              88,261     1,359,509     581,793     166,175     164,674     28,757      434        33,046       374
       Fort Bend            121,367       911,594     401,431     160,493     153,787     14,463        0        23,965       197
       Cameron              114,432       647,510     513,497      68,357      32,195      3,978        0        14,581       127
       Aransas               14,188     1,203,723     114,140       6,721       4,802      1,624        0        14,044       990
       San Patricio          26,640     1,032,527     136,714       6,968       5,220      3,636        0        12,619       474
       Montgomery           114,584       285,815     244,840      53,763      22,594      3,909        0         7,953        69
       Hidalgo              184,668       665,041     119,872      14,585       5,975          0        0         7,720        42
       Jefferson             97,658       261,334     165,980      33,504      21,097      7,430       32         6,103        63
       Matagorda             18,329       179,112     141,720      42,226       9,220      1,892      206         4,539       248
       Chambers               9,305       145,296     127,939       8,388       4,940      1,430       11         3,147       338
       Victoria              31,792       268,874      14,153       5,013       1,338        355        0         3,067        96
FL
       Dade                 960,587    24,841,690  21,503,754   2,448,916   1,154,922    528,163   32,834       594,201       690
       Broward              784,873     8,274,310   1,837,931   1,276,267   1,250,347    432,580   30,674       188,435       240
       Palm Beach           580,029     2,613,939   2,449,415   1,278,092     874,908    186,600   30,599       119,848       207
       Monroe                48,610     3,285,189   1,306,132     815,359     659,162     93,993    8,586        86,746     1,785
       Lee                  232,004     4,333,589   1,174,856     282,775     278,928     47,403   14,434        75,937       327
       Escambia             122,238     1,242,614     537,338     243,515      86,124      8,999      156        26,799       219
       Brevard              228,560       805,310     688,639     202,758     173,427     23,231    2,025        25,084       110
       Collier              134,052     1,510,837     345,577     110,492      68,745     12,317    4,488        25,022       187
       Sarasota             174,066     1,167,395     723,028     112,817      51,022     23,990    5,187        24,846       143
       Pinellas             470,889       603,486     470,479     152,418      95,421     58,754    9,286        23,269        49
       Santa Rosa            52,623       961,706     639,907     150,955      83,197      8,161      250        22,866       435
       St. Lucie             94,806     1,110,664     376,664     115,185      76,406     24,309    1,799        21,996       232
       Hillsborough         413,122       749,675     222,368     134,788      95,736     26,053    4,100        16,790        41
       Okaloosa              79,064       632,113     336,647     121,265     100,763      5,794      336        14,755       187
       Martin                64,667       619,485     272,745     117,000      74,602     10,303    1,420        14,627       226
       Manatee              133,772       483,954     468,404      71,797      39,658     23,189    3,284        13,879       104
       Volusia              216,688       314,543     278,835     148,743     137,068     14,648    1,118        13,635        63
       Orange               339,869       411,441     196,923     134,578     122,244     20,628      343        13,610        40
       Polk                 213,034       375,193     365,058     124,589     109,153     16,023    1,041        13,420        63
       Indian River          52,411       562,726     174,576      40,527      37,628     12,896      509        11,084       211
       Charlotte             84,296       568,944     270,309      22,544      16,893      6,665      761        10,036       119
       Pasco                175,854       219,943     162,509      47,060      30,902     11,942    1,696         6,880        39
       Lake                 106,250       186,706     179,379      44,272      42,158      8,788      301         6,538        62
       Seminole             152,097       145,588      95,484      61,216      55,408      6,372        0         5,571        37
       Duval                317,548       232,279      84,687      46,432      28,001      5,734        0         5,544        17
       Bay                   81,598       264,066     100,921      36,975      17,167      5,810        0         5,423        66
       Osceola               70,504       148,485      65,616      36,843      23,752      6,872      219         4,080        58
       Marion               124,315       131,971     107,128      23,137      22,395      8,516      243         4,071        33
       Highlands             46,304        60,603      52,898      25,421      22,991      2,042      236         2,745        59
Note: Return period loss based on distribution by state and county of normalized losses in Exhibit 3, e.g., 100 year return
is the worst year in the 20th century, 50 year return is the second worst year, 25 year return is the 4th worst year, etc.
Not to be confused with probabilistic return period distributions and expected losses based on catastrophe models, which
are intended to reflect longer term probabilities. Expected loss per unit compares expected annual losses (personal,
commercial, and auto) with residential-only housing units, i.e., it is intended as a relative measure of cost per unit of
exposure but not as a measure of actual costs per unit.
Exhibit 6
Sheet 1
Notes: Countrywide (CW) normalized figures based on continental U.S. from Exhibits 3 and 4.
Texas and Florida actual frequencies from Exhibit 1.
Texas and Florida normalized damages from Exhibit 4 and underlying data.
Model T is a hypothetical probabilistic hurricane model.
Exhibit 6
Sheet 2
Comparison of Actual vs. Modeled Hurricane Expected Losses by State
[Bar chart comparing Model T and normalized expected losses for each state from Georgia
north through Maine (GA, SC, NC, VA, MD, DE, NJ, NY, CT, RI, MA, NH, ME).]
Exhibit 6
Sheet 3
Comparison of Actual vs. Modeled Hurricane Losses
Top 50 Historical Normalized Events
Columns: Rank; Hurricane Number/Year and Name; Normalized Loss; Model T Loss;
State/Region; Max Category
Total                              264,812,155          294,300,000
Appendix A
Exhibit 1
Hurricanes Affecting the Bermuda, Hawaii, Puerto Rico and USVI 1900-1999
[Counts of hurricanes affecting Bermuda, Hawaii, Puerto Rico and the USVI, by Saffir-Simpson
category.]
Note: Category designations, according to the Saffir-Simpson Hurricane Scale, are based on estimated sustained winds over
land reflecting the authors' judgment based on review of:
- NOAA summary reports and best track files (www.nhc.noaa.gov/pastall.html)
- Neumann (Neumann, Jarvinen, McAdie and Elms, 1993, p. 31)
- Hebert (Hebert, Jarrell and Mayfield, 1996, Table 14)
- Tucker (Tucker, 1995)
No hurricanes have affected the west coast of the U.S. during the 20th century. According to the
National Weather Service office in Oxnard, California, two storms are recognized as having
produced tropical storm conditions over land:
Appendix A
Exhibit 2
Hurricanes Affecting the Bermuda, Hawaii, Puerto Rico and USVI 1900-1999
Estimated Damage at Time of Event
Dollars in Thousands
Bermuda
1900 4 Unk
1903 6 Unk
1915 3 Unk
1916 10 Unk
1918 4 Unk
1921 3 Unk
1922 2 Unk
1926 10 Unk
1939 4 Unk
1947 9 Unk
1948 6 Unk
1948 8 Unk
1953 Edna Unk
1963 Arlene 75 Tucker
1987 Emily 35,000 NOAA
1989 Dean 5,000 NOAA
1999 Gert Unk
Hawaii
1950 Hiki Unk
1957 Nina 200 Hebert
1959 Dot 6,000 Hebert
1982 Iwa 137,000 PCS
1992 Iniki 1,906,000 PCS
                     Insured
Economic        PR          USVI        Source
Puerto Rico and USVI
1916 5/San Hipolito 1,000 Hebert
1916 12 Unk
1926 1/San Liborio 5,000 Hebert
1928 4/San Felipe 85,000 Hebert
1930 2 Unk
1931 6/San Nicolas 200 Hebert
1932 7/San Ciprian 30,000 Hebert
1956 Santa Clara (Betsy) 40,000 10,000 PCS
1960 Donna Unk Hebert
1989 Hugo 440,000 800,000 PCS
1995 Marilyn 75,000 800,000 PCS
1996 Bertha Unk
1996 Hortense 150,000 PCS
1998 Georges 1,750.000 50,000 PCS
1999 Lenny Unk
Appendix B
Exhibit 1
Notes: Implied price deflator available back to 1950; 3.5% trend assumed for 1960 and prior.
FRTW is fixed reproducible tangible wealth, Department of Commerce Bureau of Economic Analysis.
- Available back to 1925; 2.5% trend assumed for 1925 and prior.
Housing units and population growth based on annual growth between each decennial census.
Insurance utilization index based on linear trends from 1900 to 1950 and from 1950 to 1995.
- See text and graph on Appendix B, Exhibit 2 for further information.
Appendix B
Exhibit 2
[Graph: ratio of insured to economic hurricane losses and the selected insurance utilization
index, 1900-2000, on a 0% to 90% scale.]
Neural Networks Demystified
Title: Neural Networks Demystified
by Louise Francis
Abstract:
This paper will introduce the neural network technique of analyzing data as a
generalization of more familiar linear models such as linear regression. The reader is
introduced to the traditional explanation of neural networks as being modeled on the
functioning of neurons in the brain. Then a comparison is made of the structure and
function of neural networks to that of linear models that the reader is more familiar with.
The paper will then show that backpropagation neural networks with a single hidden
layer are universal function approximators. The paper will also compare neural networks
to procedures such as Factor Analysis which perform dimension reduction. The
application of both the neural network method and classical statistical procedures to
insurance problems such as the prediction of frequencies and severities is illustrated.
One key criticism of neural networks is that they are a "black box". Data goes into the
"black box" and a prediction comes out of it, but the nature of the relationship between
independent and dependent variables is usually not revealed. Several methods for
interpreting the results of a neural network analysis, including a procedure for visualizing
the form of the fitted function will be presented.
Acknowledgments:
The author wishes to acknowledge the following people who reviewed this paper and
provided many constructive suggestions: Patricia Francis-Lyon, Virginia Lambert,
Francis Murphy and Christopher Yaure
Neural Networks Demystified
Introduction
Artificial neural networks are the intriguing new high tech tool for finding hidden gems
in data. They belong to a broader category of techniques for analyzing data known as data
mining. Other widely used tools include decision trees, genetic algorithms, regression
splines and clustering. Data mining techniques are used to find patterns in data.
Typically the data sets are large, i.e. have many records and many predictor variables.
The number of records is typically at least in the tens of thousands and the number of
independent variables is often in the hundreds. Data mining techniques, including neural
networks, have been applied to portfolio selection, credit scoring, fraud detection and
market research. When data mining tools are presented with data containing complex
relationships they can be trained to identify the relationships. An advantage they have
over classical statistical models used to analyze data, such as regression and ANOVA, is
that they can fit data where the relation between independent and dependent variables is
nonlinear and where the specific form of the nonlinear relationship is unknown.
Artificial neural networks (hereafter referred to as neural networks) share the advantages
just described with the many other data mining tools. However, neural networks have a
longer history of research and application. As a result, their value in modeling data has
been more extensively studied and better established in the literature (Potts, 2000).
Moreover, sometimes they have advantages over other data mining tools. For instance,
decision trees, a method of splitting data into homogeneous clusters with similar expected
values for the dependent variable, are often less effective when the predictor variables are
continuous than when they are categorical.1 Neural networks work well with both
categorical and continuous variables.
Neural Networks are among the more glamorous of the data mining techniques. They
originated in the artificial intelligence discipline where they are often portrayed as a brain
in a computer. Neural networks are designed to incorporate key features of neurons in
the brain and to process data in a manner analogous to the human brain. Much of the
terminology used to describe and explain neural networks is borrowed from biology.
Many other data mining techniques, such as decision trees and regression splines were
developed by statisticians and are described in the literature as computationally intensive
generalizations of classical linear models. Classical linear models assume that the
functional relationship between the independent variables and the dependent variable is
linear. Classical modeling also allows linear relationship that result from a
transformation of dependent or independent variables, so some nonlinear relationships
can be approximated. Neural networks and other data mining techniques do not require
that the relationships between predictor and dependent variables be linear (whether or not
the variables are transformed).
The various data mining tools differ in their approach to approximating nonlinear
functions and complex data structures. Neural networks use a series of neurons in what is
known as the hidden layer that apply nonlinear activation functions to approximate
complex functions in the data. The details are discussed in the body of this paper. As the
focus of this paper is neural networks, the other data mining techniques will not be
discussed further.
Despite their advantages, many statisticians and actuaries are reluctant to embrace neural
networks. One reason is that they are a "black box". Because of the complexity of the
functions used in the neural network approximations, neural network software typically
does not supply the user with information about the nature of the relationship between
predictor and target variables. The output of a neural network is a predicted value and
some goodness of fit statistics. However, the functional form of the relationship between
independent and dependent variables is not made explicit. In addition, the strength of the
relationship between dependent and independent variables, i.e., the importance of each
variable, is also often not revealed. Classical models as well as other popular data mining
techniques, such as decision trees, supply the user with a functional description or map of
the relationships.
This paper seeks to open that black box and show what is happening inside the neural
networks. While some of the artificial intelligence terminology and description of neural
networks will be presented, this paper's approach is predominantly from the statistical
perspective. The similarity between neural networks and regression will be shown. This
paper will compare and contrast how neural networks and classical modeling techniques
deal with three specific modeling challenges: 1) nonlinear functions, 2) correlated data
and 3) interactions. How the output of neural networks can be used to better understand
the relationships in the data will then be demonstrated.
This is analogous to the use of such statistical procedures as regression and logistic
regression for prediction and classification. A network trained using unsupervised
learning does not have a target variable. The network finds characteristics in the data,
which can be used to group similar records together. This is analogous to cluster analysis
in classical statistics. This paper will discuss only the former kind of network, and the
discussion will be limited to a feedforward MLP neural network with one hidden layer.
This paper will primarily present applications of this model to continuous rather than
discrete data, but the latter application will also be discussed.
Figure 1 displays the structure of a feedforward neural network with one hidden layer.
The first layer contains the input nodes. Input nodes represent the actual data used to fit a
model to the dependent variable and each node is a separate independent variable. These
are connected to another layer of neurons called the hidden layer or hidden nodes, which
modifies the data. The nodes in the hidden layer connect to the output layer. The output
layer represents the target or dependent variable(s). It is common for networks to have
only one target variable, or output node, but there can be more. An example would be a
classification problem where the target variable can fall into one of a number of
categories. Sometimes each of the categories is represented as a separate output node.
As can be seen from Figure 1, each node in the input layer connects to each node in
the hidden layer and each node in the hidden layer connects to each node in the output
layer.
Figure 1
This structure is viewed in the artificial intelligence literature as analogous to that of
biological neurons. The arrows leading to a node are like the axons leading to a neuron.
Like the axons, they carry a signal to the neuron or node. The arrows leading away from
a node are like the dendrites of a neuron, and they carry a signal away from a neuron or
node. The neurons of a brain have far more complex interactions than those displayed in
the diagram; however, the developers of neural networks view neural networks as
abstracting the most relevant features of neurons in the human brain.
Neural networks "learn" by adjusting the strength o f the signal coming from nodes in the
previous layer connecting to it. As the neural network better learns how to predict the
target value from the input pattern, each o f the connections between the input neurons
and the hidden or intermediate neurons and between the intermediate neurons and the
output neurons increases or decreases in strength. A function called a threshold or
activation function modifies the signal coming into the hidden layer nodes. In the early
days o f neural networks, this function produced a value o f I or 0, depending on whether
the signal from the prior layer exceeded a threshold value. Thus, the node or neuron
would only "fire" if the signal exceeded the threshold, a process thought to be similar to
that o f a neuron. It is now known that biological neurons are more complicated than
previously believed. A simple all or none rule does not describe the behavior o f
biological neurons, Currently, activation functions are typically sigmoid in shape and can
take on any value between 0 and 1 or between -1 and 1, depending on the particular
function chosen. The modified signal is then output to the output layer nodes, which also
apply activation functions. Thus, the information about the pattern being learned is
encoded in the signals carried to and from the nodes. These signals map a relationship
between the input nodes or the data and the output nodes or dependent variable.
In this example the true relationship between an input variable X and an output variable
Y is exponential and is of the following form:
Y = e^(X/2) + ε
where:
ε ~ N(0, 75)
X ~ N(12, .5)
and N(μ, σ) is understood to denote the Normal probability distribution with parameters
μ, the mean of the distribution, and σ, the standard deviation of the distribution.
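For concreteness, the data for this example can be simulated as in the sketch below (Python); the random seed is arbitrary, and the sample size of 100 is an assumption consistent with the 80/20 train/test split used later in the paper.

```python
# Sketch of simulating the Example 1 data: Y = exp(X/2) + eps, with
# X ~ N(12, 0.5) and eps ~ N(0, 75).  Seed and sample size are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=12.0, scale=0.5, size=100)
eps = rng.normal(loc=0.0, scale=75.0, size=100)
Y = np.exp(X / 2.0) + eps          # true exponential curve plus additive noise
```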
Figure 2
Figure 3
A simple neural network with one hidden layer was fit to the simulated data. In order to
compare neural networks to classical models, a regression curve was also fit. The result
of that fit will be discussed after the presentation of the neural network results. The
structure of this neural network is shown in Figure 4.
Figure 4
As neural networks go, this is a relatively simple network with one input node. In
biological neurons, electrochemical signals pass between neurons. In neural network
analysis, the signal between neurons is simulated by software, which applies weights to
the input nodes (data) and then applies an activation function to the weights.
Neuron signal of the biological neuron system -> Node weights of neural networks
The weights are used to compute a linear sum of the independent variables. Let Y denote
the weighted sum:
Y = w0 + w1*X1 + w2*X2 + ... + wn*Xn
The activation function is applied to the weighted sum and is typically a sigmoid
function. The most common of the sigmoid functions is the logistic function:
f(Y) = 1 / (1 + e^(-Y))
The logistic function takes on values in the range 0 to 1. Figure 5 displays a typical
logistic curve. This curve is centered at an X value of 0 (i.e., the constant w0 is 0). Note
that this function has an inflection point at an X value of 0 and f(x) value of .5, where it
shifts from a convex to a concave curve. Also note that the slope is steepest at the
inflection point, where small changes in the value of X can produce large changes in the
value of the function. The curve becomes relatively flat as X approaches both 1 and -1.
Figure 5
Logistic Function
Another sigmoid function often used in neural networks is the hyperbolic tangent
function which takes on values between -1 and 1:
f(Y) = (e^Y - e^(-Y)) / (e^Y + e^(-Y))
In this paper, the logistic function will be used as the activation function. The Multilayer
Perceptron is a multilayer feedforward neural network with a sigmoid activation function.
The logistic function is applied to the weighted input. In this example, there is only one
input, therefore the activation function is:
h = f(X; w0, w1) = f(w0 + w1*X) = 1 / (1 + e^(-(w0 + w1*X)))
This gives the value or activation level of the node in the hidden layer. Weights are then
applied to the hidden node:
w2 +w3h
The weights w0 and w2 are like the constants in a regression and the weights w1 and w3
are like the coefficients in a regression. An activation function is then applied to this
"signal" coming from the hidden layer:
o = f(h; w2, w3) = 1 / (1 + e^(-(w2 + w3*h)))
The output function o for this particular neural network with one input node and one
hidden node can be represented as a double application of the logistic function:
o(X) = 1 / (1 + e^(-(w2 + w3 * (1 / (1 + e^(-(w0 + w1*X)))))))
It will be shown later in this paper that the use of sigmoid activation functions on the
weighted input variables, along with the second application of a sigmoid function by the
output node, is what gives the MLP the ability to approximate nonlinear functions.
One other operation is applied to the data when fitting the curve: normalization. The
independent variable X is normalized. Normalization is used in statistics to minimize the
impact of the scale of the independent variables on the fitted model. Thus, a variable
with values ranging from 0 to 500,000 does not prevail over variables with values
ranging from 0 to 10, merely because the former variable has a much larger scale.
Various software products will perform different normalization procedures. The software
used to fit the networks in this paper normalizes the data to have values in the range 0 to
1. This is accomplished by subtracting a constant from each observation and dividing by
a scale factor. It is common for the constant to equal the minimum observed value for X
in the data and for the scale factor to equal the range of the observed values (the
maximum minus the minimum). Note also that the output function takes on values
between 0 and 1 while Y takes on values between -∞ and +∞ (although for all practical
purposes, the probability of negative values for the data in this particular example is nil).
In order to produce predicted values the output, o, must be renormalized by multiplying
by a scale factor (the range of Y in our example) and adding a constant (the minimum
observed Y in this example).
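A minimal sketch of this min-max normalization and the corresponding renormalization is given below (Python); the function names are illustrative and are not those of any particular neural network package.

```python
# Sketch of the min-max normalization and renormalization steps described
# above; function and variable names are illustrative.
import numpy as np

def normalize(v):
    """Scale a vector to [0, 1] and return the constants needed to undo it."""
    lo, span = v.min(), v.max() - v.min()
    return (v - lo) / span, lo, span

def renormalize(o, lo, span):
    """Map network output in [0, 1] back to the original scale of Y."""
    return lo + span * o
```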
min E[(Y - Ŷ)²]
Warner and Misra (Warner and Misra, 1996) point out that neural network analysis is in
many ways like linear regression, which can be used to fit a curve to data. Regression
coefficients are solved for by minimizing the squared deviations between actual
observations on a target variable and the fitted value. In the case of linear regression, the
curve is a straight line. Unlike linear regression, the relationship between the predicted
and target variable in a neural network is nonlinear, therefore a closed form solution to
the minimization problem does not exist. In order to minimize the loss function, a
numerical technique such as gradient descent (which is similar to backpropagation) is
used. Traditional statistical procedures such as nonlinear regression, or the solver in
Excel use an approach similar to neural networks to estimate the parameters of nonlinear
functions. A brief description of the procedure is as follows:
1. Initialize the neural network model using an initial set of weights (usually
randomly chosen). Use the initialized model to compute a fitted value for an
observation.
2. Use the difference between the fitted and actual value on the target variable to
compute the error.
3. Change the weights by a small amount that will move them in the direction of a
smaller error
• This involves multiplying the error by the partial derivative of the
function being minimized with respect to the weights. This is because
the partial derivative gives the rate of change with respect to the
weights. This is then multiplied by a factor representing the "learning
rate" which controls how quickly the weights change. Since the
function being approximated involves logistic functions of the weights
of the output and hidden layers, multiple applications of the chain rule
are needed. While the derivatives are a little messy to compute, it is
straightforward to incorporate them into software for fitting neural
networks.
4. Continue the process until no further significant reduction in the squared error can
be obtained
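For readers who want to see the mechanics, the following is a minimal sketch of this fitting procedure for the one-input, one-hidden-node network described earlier, using plain gradient descent on the squared error. The learning rate, number of iterations, random seed and starting weights are arbitrary illustrative choices, not the settings of any particular commercial package.

```python
# Minimal sketch of the training procedure outlined above for a network with
# one input node and one hidden node.  Gradient descent on the squared error;
# hyperparameters are illustrative assumptions.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(x, y, lr=0.5, epochs=5000, seed=0):
    """x and y are the normalized inputs and targets (both scaled to [0, 1])."""
    rng = np.random.default_rng(seed)
    w0, w1, w2, w3 = rng.normal(scale=0.5, size=4)   # step 1: random start
    for _ in range(epochs):
        h = sigmoid(w0 + w1 * x)                     # hidden-node activation
        o = sigmoid(w2 + w3 * h)                     # output-node activation
        err = o - y                                  # step 2: fitted minus actual
        # step 3: chain rule through the two logistic functions
        g_out = err * o * (1.0 - o)
        g_hid = g_out * w3 * h * (1.0 - h)
        w2 -= lr * g_out.mean()
        w3 -= lr * (g_out * h).mean()
        w0 -= lr * g_hid.mean()
        w1 -= lr * (g_hid * x).mean()
    return w0, w1, w2, w3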
Further details are beyond the scope of this paper. However, more detailed information is
supplied by some authors (Warner and Misra, 1996, Smith, 1996). The manuals of a
number of statistical packages (SAS Institute, 1988) provide an excellent introduction to
several numerical methods used to fit nonlinear functions.
However, the assumption for the purposes of this paper is that the overwhelming majority
of readers will use a commercial software package when fitting neural networks. Many
hours of development by advanced specialists underlie these tools. Appendix 1 discusses
some of the software options available for doing neural network analysis.
Table 1
                                W0        W1
Input Node to Hidden Node -3.088 3.607
Hidden Node to Output Node -1.592 5.281
To produce the fitted curve from these coefficients, the following procedure must be
used:
1. Normalize each xi by subtracting the minimum observed value 2 and dividing by the
scale coefficient equal to the maximum observed X minus the minimum observed X.
The normalized values will be denoted X*.
2. Determine the minimum observed value for Y and the scale coefficient for Y.3
3. For each normalized observation x*i compute
h(x*i) = 1 / (1 + e^(-(-3.088 + 3.607*x*i)))
4. Apply the output node weights and the logistic function to the result:
o(h(x*i)) = 1 / (1 + e^(-(-1.592 + 5.281*h(x*i))))
5. Compute the estimated value for each yi by multiplying the normalized value from
the output layer in step 4 by the Y scale coefficient and adding the Y constant. This
value is the neural network's predicted value for yi.
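The recipe can be written out directly, as sketched below (Python). The Table 1 weights are hard coded; x_min, x_range, y_min and y_range stand for the normalization constants from steps 1 and 2, whose values depend on the simulated sample and are not reproduced here.

```python
# Sketch of the prediction procedure above using the Table 1 weights; the
# normalization constants are placeholders to be taken from the sample data.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, x_min, x_range, y_min, y_range):
    x_star = (x - x_min) / x_range                # step 1: normalize X
    h = sigmoid(-3.088 + 3.607 * x_star)          # step 3: hidden node
    o = sigmoid(-1.592 + 5.281 * h)               # step 4: output node
    return y_min + y_range * o                    # step 5: renormalize to Y
```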
Table 2 displays the calculation for the first 10 observations in the sample.
Table 2
Figure 6
Figure 7
It is natural to compare this fitted value to that obtained from fitting a linear regression to
the data. Two scenarios were used in fitting the linear regression. First, a simple straight
line was fit, since the nonlinear nature of the relationship may not be apparent to the
analyst. Since Y is an exponential function of X, the log transformation is a natural
transformation for Y. However, because the error term in this relationship is additive, not
multiplicative, applying the log transformation to Y produces a regression equation which
is not strictly linear in both X and the error term:
Y = A*e^(BX/2) + ε    =>    ln(Y) = ln(A*e^(BX/2) + ε) ≠ ln(A) + BX/2 + ε
Nonetheless, the log transformation should provide a better approximation to the true
curve than fitting a straight line to the data. The regression using the log of Y as the
dependent variable will be referred to as the exponential regression. It should be noted
that the nonlinear relationship in this example could be fit using a nonlinear regression
procedure which would address the concern about the log transform not producing a
relationship which is linear in both X and ε. The purpose here, however, is to keep the
exposition simple and use techniques that the reader is familiar with.
The table below presents the goodness of fit results for both regressions and the neural
network. Most neural network software allows the user to hold out a portion of the
sample for testing. This is because most modeling procedures fit the sample data better
than they fit new observations presented to the model which were not in the sample. Both
the neural network and the regression models were fit to the first 80 observations and
then tested on the next 20. The mean of the squared errors for the sample and the test
data is shown in Table 3.
Table 3
Method Sample MSE Test MSE
Linear Regression 4,766 8,795
Exponential Regression 4,422 7,537
Neural Network 4,928 6,930
As expected, all models fit the sample data better than they fit the test data. This table
indicates that both of the regressions fit the sample data better than the neural network
did, but the neural network fit the test data better than the regressions did.
The results of this simple example suggest that the exponential regression and the neural
network with one hidden node are fairly similar in their predictive accuracy. In general,
one would not use a neural network for this simple situation where there is only one
predictor variable, and a simple transformation of one of the variables produces a curve
which is a reasonably good approximation to the actual data. In addition, if the true
function for the curve were known by the analyst, a nonlinear regression technique would
probably provide the best fit to the data. However, in actual applications, the functional
form of the relationship between the independent and dependent variable is often not
known.
A graphical comparison of the fitted curves from the regressions, the neural network and
the "true" values is shown in Figure 8.
Figure 8
The graph indicates that both the exponential regression and the neural network model
provide a reasonably good fit to the data.
Figure 9
[Logistic curves for several values of the coefficient w1.]
Figure 10
Logistic Curve With Varying Constants
Varying the values of w0 while holding w1 constant shifts the curve right or left. A great
variety of shapes can be obtained by varying the constant and coefficients of the logistic
functions. A sample of some of the shapes is shown in Figure 11. Note that the X values
on the graph are limited to the range of 0 to 1, since this is what the neural networks use.
In the previous example the combination of shifting the curve and adjusting the steepness
coefficient was used to define a curve that is exponential in shape in the region between 0
and 1.
Figure 11
[Two panels of logistic curves, one with constant = 2 and one with constant = -2, for X in the
range 0 to 1.]
Using Neural Networks to Fit a Complex Nonlinear Function:
To facilitate a clear introduction to neural networks and how they work, the first example
in this paper was intentionally simple. The next example is a somewhat more complicated
curve.
X ~ U(500, 5000)
e ~ N(0, .2)
Note that U denotes the uniform distribution, and 500 and 5,000 are the lower and upper
ends of the range of the distribution.
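A minimal sketch of simulating data from this curve is shown below (Python); the seed is arbitrary, and 200 points are drawn to match the scatterplot discussed next.

```python
# Sketch of simulating the Example 2 data: Y = sin(X/675) + ln(X) + e, with
# X ~ U(500, 5000) and e ~ N(0, 0.2).  Seed and sample size are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(500.0, 5000.0, size=200)
Y = np.sin(X / 675.0) + np.log(X) + rng.normal(0.0, 0.2, size=200)
```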
A scatterplot of 200 random values for Y along with the "true" curve are shown in Figure
12
Figure 12
Scatterplot of Y = sin(X/675)+ln(X) + e
This is a more complicated function to fit than the previous exponential function. It
contains two "humps" where the curve changes direction. To illustrate how neural
networks approximate functions, the data was fit using neural networks of different sizes.
The results from fitting this curve using two hidden nodes will be described first. Table 4
displays the weights obtained from training for the two hidden nodes. W0 denotes the
constant and W1 denotes the coefficient applied to the input data. The result of applying
these weights to the input data and then applying the logistic function is the values for the
hidden nodes.
Table 4
            W0         W1
Node 1    -4.107      7.986
Node 2     6.549     -7.989
A plot of the logistic functions for the two intermediate nodes is shown below (Figure
13). The curve for Node 1 is S shaped, has values near 0 for low values of X and
increases to values near 1 for high values of X. The curve for Node 2 is concave
downward, has a value of 1 for low values of X and declines to about .2 at high values of
X.
Figure 13
Table 5 presents the fitted weights connecting the hidden layer to the output layer:
Table 5
   W0          W1          W2
6.154 -3.0501 -6.427
Table 6 presents a sample of applying these weights to several selected observations from
the training data to which the curve was fit. The table shows that the combination of the
values for the two hidden node curves, weighted by the coefficients above produces a
curve which is like a sine curve with an upward trend. At low values of X (about 500),
the value of node 1 is low and node 2 is high. When these are weighted together, and the
logistic function is applied, a moderately low value is produced. At values of X around
3,000, the values of both nodes 1 and 2 are relatively high. Since the coefficients of both
nodes are negative, when they are weighted together, the value of the output function
declines. At high values of X, the value of node 1 is high, but the value ofnode 2 is low.
When the weight for node 1 is applied (-3.05) and is summed with the constant the
value of the output node reduced by about 3. When the weight for node 2 (-6.43) is
applied to the low output of node 2 (about .2) and the result is summed with the constant
and the first node, the output node value is reduced by about 1 rcsulting in a weighted
hidden node output of about 2. After the application of the logistic function the value of
the output node is relatively high, i.e. near 1. Since the coefficient of node 1 has a lower
absolute value, the overall result is a high value for the output function. Figure 14
presents a graph showing the values of the hidden nodes, the weighted hidden nodes
(after the weights are applied to the hidden layer output but before the logistic function is
applied) and the value of the output node (after the logistic function is applied to the
weighted hidden node values). The figure shows how the application of the logistic
function to the weighted output of the two hidden layer nodes produces a highly
nonlinear curve.
Table 6
Computation of Predicted Values for Selected Values of X
Columns: (1) X; (2) Normalized X = ((1) - minimum X) / (range of X); (3) Output of Node 1;
(4) Output of Node 2; (5) Weighted Hidden Node Output = 6.15 - 3.05*(3) - 6.43*(4);
(6) Logistic Function of Weighted Output = 1 / (1 + exp(-(5))); (7) Predicted Y = minimum Y
+ (range of Y)*(6)

      X       Normalized X   Output of   Output of   Weighted Hidden   Logistic    Predicted
                             Node 1      Node 2      Node Output       Function    Y
    508.48        0.00         0.016       0.999        -0.323           0.420       7.889
  1,503.00        0.22         0.088       0.992        -0.498           0.378       7.752
  3,013.40        0.56         0.596       0.890        -1.392           0.199       7.169
  4,994.80        1.00         0.980       0.190         1.937           0.874       9.369
Figure 15 shows the fitted curve and the "true" curve for the two node neural network
just described. One can conclude that the fitted curve, although producing a highly
nonlinear curve, does a relatively poor job of fitting the curve for low values of X. It
turns out that adding an additional hidden node significantly improves the fit of the curve.
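Before turning to the three-node fit, the two-hidden-node forward pass just traced through Table 6 can be sketched as follows (Python). The constants used to normalize X and renormalize the output (approximately 508, 4,486, 6.52 and 3.26) are approximate readings of the Table 6 calculations, not exact values from the fitting software.

```python
# Sketch of the two-hidden-node forward pass using the Table 4 and Table 5
# weights; the normalization constants are approximate readings of Table 6.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_two_node(x):
    x_star = (x - 508.48) / 4486.3                  # normalize X to [0, 1]
    h1 = sigmoid(-4.107 + 7.986 * x_star)           # hidden node 1 (Table 4)
    h2 = sigmoid(6.549 - 7.989 * x_star)            # hidden node 2 (Table 4)
    o = sigmoid(6.154 - 3.0501 * h1 - 6.427 * h2)   # output node (Table 5)
    return 6.52 + 3.26 * o                          # renormalize to the Y scale

print(predict_two_node(3013.4))   # roughly 7.2, close to the 7.169 in Table 6
```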
Figure 14
Figure 15
Table 7 displays the weights connecting the hidden node to the output node for the
network with 3 hidden nodes. Various aspects of the hidden layer are displayed in Figure
16. In Figure 16, the graph labeled "Weighted Output of Hidden Node" displays the
result of applying the Table 7 weights obtained from the training data to the output from
the hidden nodes. The combination of weights, when applied to the three nodes produces
a result which first increases, then decreases, then increases again. When the logistic
function is applied to this output, the output is mapped into the range 0 to I and the curve
appears to become a little steeper. The result is a curve that looks like a sine function
with an increasing trend. Figure 17 displays the fitted curve, along with the "true" Y
value.
Table 7
Weight 0 Weight 1 Weight 2 Weight 3
 -4.2126      6.8466      -7.999      -6.0722
Figure 16
-"-.\
2
09
1
\
,f~'\ /
I
/1
04
I"~\\ "'\\
i --
// j
\\~,~,/// I
0 1500 3000 4500 1100 2 2 0 0 33(30 4400
X X
Figure 17
Fitted 3 Node Neural Network and True Y Values
It is clear that the three node neural network provides a considerably better fit than the
two node network. One of the features of neural networks which affects the quality of
the fit and which the user must often experiment with is the number of hidden nodes. If
too many hidden nodes are used, it is possible that the model will be overparameterized.
However, an insufficient number of nodes could be responsible for a poor approximation
of the function.
This particular example has been used to illustrate an important feature of neural
networks: the multilayer perceptron neural network with one hidden layer is a universal
function approximator. Theoretically, with a sufficient number of nodes in the hidden
layer, any nonlinear function can be approximated. In an actual application on data
containing random noise as well as a pattern, it can sometimes be difficult to accurately
approximate a curve no matter how many hidden nodes there are. This is a limitation that
neural networks share with classical statistical procedures.
Neural networks are only one approach to approximating nonlinear functions. A number
of other procedures can also be used for function approximation. A conventional
statistical approach to fitting a curve to a nonlinear function when the form of the
function is unknown is to fit a polynomial regression:
Y = a + b1*X + b2*X^2 + ... + bn*X^n
Another method for approximating nonlinear functions is to fit regression splines. Regression
splines fit piecewise polynomials to the data. The fitted polynomials are constrained to
have continuous second derivatives at each breakpoint; hence a smooth curve is produced.
Regression splines are an example of contemporary data mining tools and will not be
discussed further in this paper. Another function approximator that actuaries have some
familiarity with is the Fourier transform, which uses combinations of sine and cosine
functions to approximate curves. Among actuaries, their use has been primarily to
approximate aggregate loss distributions. Heckman and Meyers (Heckman and Meyers,
1983) popularized this application.
In this paper, since neural networks are being compared to classical statistical procedures,
the use of polynomial regression to approximate the curve will be illustrated. Figure 18
shows the result of fitting a 4th degree polynomial curve to the data from Example 2.
This is the polynomial curve which produced the best fit to the data. It can be concluded
from Figure 18 that the polynomial curve produces a good fit to the data. This is not
surprising given that, using a Taylor series approximation, both the sine function and log
function can be approximated relatively accurately by a series of polynomials.
Figure 18 allows the comparison of both the Neural Network and Regression fitted
values. It can be seen from this graph that both the neural network and regression
provide a reasonable fit to the curve.
Figure 18
While these two models appear to have similar fits to the simulated nonlinear data, the
regression slightly outperformed the neural network in goodness of fit tests. The r2 for the
regression was higher for both training (.993 versus .986) and test (.98 versus .94) data.
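The polynomial comparison can be reproduced along the lines of the sketch below (Python, using numpy's polyfit). The 80/20 train/test split and the rescaling of X are assumptions of this sketch; the paper does not state the split or the fitting software it used for this example.

```python
# Sketch of the conventional alternative discussed above: fit a 4th-degree
# polynomial to simulated Example 2 data and compare train/test R-squared.
# The 80/20 split and rescaling of X are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(500.0, 5000.0, 200)
Y = np.sin(X / 675.0) + np.log(X) + rng.normal(0.0, 0.2, 200)

x = X / 5000.0                                  # rescale to avoid ill-conditioning
train, test = slice(0, 160), slice(160, 200)
coefs = np.polyfit(x[train], Y[train], deg=4)   # 4th-degree polynomial fit

def r_squared(xs, ys):
    resid = ys - np.polyval(coefs, xs)
    return 1.0 - np.sum(resid ** 2) / np.sum((ys - ys.mean()) ** 2)

print(r_squared(x[train], Y[train]), r_squared(x[test], Y[test]))
```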
The previous sections discussed how neural networks approximate functions of a variety
of shapes and the role the hidden layer plays in the approximation. Another task
performed by the hidden layer of neural networks will be discussed in this section:
dimension reduction.
Data used for financial analysis in insurance often contains variables that are correlated.
An example would be the age of a worker and the worker's average weekly wage, as
older workers tend to earn more. Education is another variable which is likely to be
correlated with the worker's income. All of these variables will probably influence
Workers Compensation indemnity payments. It could be difficult to isolate the effect of
the individual variables because of the correlation between the variables. Another
example is the economic factors that drive insurance inflation, such as inflation in wages
and inflation in medical care. For instance, analysis of monthly Bureau of Labor
Statistics data for hourly wages and the medical care component of the CPI from January
of 1994 through May of 2000 suggests these two time series have a (negative) correlation
of about .9 (see Figure 19). Other measures of economic inflation can be expected to
show similarly high correlations.
Figure 19
Suppose one wanted to combine all the demographic factors related to income level or all
the economic factors driving insurance inflation into a single index in order to create a
simpler model which captured most of the predictive ability of the individual data series.
Reducing many factors to one is referred to as dimension reduction. In classical
statistics, two similar techniques for performing dimension reduction are Factor Analysis
and Principal Components Analysis. Both of these techniques take a number of
correlated variables and reduce them to fewer variables which retain most of the
explanatory power of the original variables.
The assumptions underlying Factor Analysis will be covered first. Assume the values on
three observed variables are all "caused" by a single factor plus a factor unique to each
variable. Also assume that the relationships between the factors and the variables are
linear. Such a relationship is diagrammed in Figure 20, where F1 denotes the common
factor, U1, U2 and U3 the unique factors and X1, X2 and X3 the variables. The causal
factor FI is not observed. Only the variables X1, X2 and X3 are observed. Each of the
unique factors is independent of the other unique factors, thus any observed correlations
between the variables is strictly a result of their relation to the causal factor F1.
Figure 20
FU" *X2, - U2
~ 3 * U3
For instance, assume an unobserved factor, social inflation, is one of the drivers of
increases in claims costs. This factor reflects the sentiments of large segments of the
population towards defendants in civil litigation and towards insurance companies as
intermediaries in liability claims. Although it cannot be observed or measured, some of
its effects can be observed. Examples are the change over time in the percentage of
claims being litigated, increases in jury awards and perhaps an index of the litigation
environment in each state created by a team of lawyers and claims adjusters. In the social
sciences it is common to use Factor Analysis to measure social and psychological
concepts that cannot be directly observed but which can influence the outcomes of
variables that can be directly observed. Sometimes the observed variables are indices or
scales obtained from survey questions.
Figure 21
[Diagram: a Social Inflation Factor pointing to Litigation Rates (with unique factor U1), Size of
Jury Awards (U2), and an Index of State Litigation Environment (U3).]
In scenarios such as this one, values for the observed variables might be used to obtain
estimates for the unobserved factor. One feature of the data that is used to estimate the
factor is the correlations between the observed variables: If there is a strong relationship
between the factor and the variables, the variables will be highly correlated. If the
relationship between the factor and only two of the variables is strong, but the
relationship with the third variable is weak, then only the two variables will have a high
correlation. The highly correlated variables will be more important in estimating the
unobserved factor. A result of Factor Analysis is an estimate of the factor (FI) for each
of the observations. The F1 obtained for each observation is a linear combination of the
values for the three variables for that observation. Since the values for the variables will
differ from record to record, so will the values for the estimated factor.
causal relationship between the factors and the variables. It simply tries to find the
factors or components which seem to explain most of the variance in the data. Thus both
Factor Analysis and Principal Components Analysis produce a result of the form:

Index = w1*X1 + w2*X2 + ... + wn*Xn

where X1 through Xn are the observed variables, w1 through wn are estimated weights and
Index denotes the estimated factor or component.
One can then use these indices in further analyses and discard the original variables.
Using this approach, the analyst achieves a reduction in the number of variables used to
model the data and can construct a more parsimonious model.
Factor Analysis is an example of a more general class of models known as Latent
Variable Models. For instance, observed values on categorical variables may also be the
result of unobserved factors. It would be difficult to use Factor Analysis to estimate the
underlying factors because it requires data from continuous variables, thus an alternative
procedure is required. While a discussion of such procedures is beyond the scope of this
paper, the procedures do exist.
In fact Masterson created such indices for the Property and Casualty lines in the 1960s,
5 Principal Components, because it does not have an underlying causal factor, is not a latent variable model.
neural networks, the hidden layer performs the dimension reduction. Since it is
performed using nonlinear functions, it can be applied where nonlinear relationships
exist.
Factor1 ~ N(1.05, .025)

On average this factor produces a 5% inflation rate. To make this example concrete,
Factor1 will represent the economic factor driving the inflationary results in a line of
business, say Workers Compensation. Factor1 drives the observed values on three
simulated economic variables: Wage Inflation, Medical Inflation and Benefit Level
Inflation. Although unrealistic, in order to keep this example simple it was assumed that
no factor other than the economic factor contributes to the value of these variables and
that the relationship of the factor to the variables is approximately linear.
Figure 22
[Path diagram: the economic factor drives the three simulated inflation variables]
Figure 23
[Diagram of a neural network with three inputs and a single hidden node]
Also, to keep the example simple it was assumed that one economic factor drives
Workers Compensation results. A more realistic scenario would separately model the
indemnity and medical components of Workers Compensation claim severity. The
economic variables are modeled as follows6:

ln(Wage Inflation) = .7 * ln(Factor1) + e,   e ~ N(0, .005)
ln(Benefit Level Trend) = .5 * ln(Factor1) + e,   e ~ N(0, .005)
Two hundred fifty records of the unobserved economic inflation factor and observed
inflation variables were simulated. Each record represented one of 50 states for one of 5
years. Thus, in the simulation, inflation varied by state and by year. The annual inflation
rate variables were converted into cumulative inflationary measures (or indices). For each
state, the cumulative product of that year's factor and that year's observed inflation
6 Note that according to Taylor's theorem the natural log of a variable whose value is close to one is
approximately equal to the variable's value minus one, i.e., ln(1+x) ~ x. Thus, the economic variables are, to
a close approximation, linear functions of the factor.
measures (the random observed variables) were computed. For example the cumulative
unobserved economic factor is computed as:
CumFactor1_t = Factor1_1 * Factor1_2 * ... * Factor1_t
A base severity, intended to represent the average severity over all claims for the line of
business for each state for each of the 5 years was generated from a lognormal
distribution. 7 To incorporate inflation into the simulation, the severity for a given state
for a given year was computed as the product of the simulated base severity and the
cumulative value for the simulated (unobserved) inflation factor for its state. Thus, in
this simplified scenario, only one factor, an economic factor, is responsible for the
variation over time and between states in average severity. The parameters for these
variables were selected to make a solution using Factor Analysis or Principal
Components Analysis straightforward and are not based on an analysis of real insurance
data. This data therefore had significantly less variance than would be observed in actual
insurance data.
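The simulation just described can be sketched in a few lines of Python (numpy assumed). The coefficient used for Medical Inflation (0.6) and the reading of .05 as a standard deviation are assumptions made only for this sketch; they are not taken from the paper, whose exact parameter choices for those items are not reproduced above.

    import numpy as np

    rng = np.random.default_rng(1)
    n_states, n_years = 50, 5

    factor = rng.normal(1.05, 0.025, size=(n_states, n_years))        # Factor1, one value per state-year
    eps = lambda: rng.normal(0, 0.005, size=(n_states, n_years))
    wage    = np.exp(0.7 * np.log(factor) + eps())                    # ln(Wage Inflation) = .7 ln(Factor1) + e
    medical = np.exp(0.6 * np.log(factor) + eps())                    # coefficient .6 is a placeholder assumption
    benefit = np.exp(0.5 * np.log(factor) + eps())                    # ln(Benefit Level Trend) = .5 ln(Factor1) + e

    cum_factor = np.cumprod(factor, axis=1)                           # CumFactor1_t for each state
    base_severity = np.exp(rng.normal(8.47, 0.05, size=(n_states, n_years)))  # .05 read as a standard deviation
    severity = base_severity * cum_factor                             # inflated average severity per state-year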
Note that the correlations between the variables are very high. All correlations between the
variables are at least .9. This means that the problem of multicollinearity exists in this
data set. That is, each variable is nearly identical to the others, adjusting for a constant
multiplier, so typical regression procedures have difficulty estimating the parameters of
the relationship between the independent variables and severity. Dimension reduction
methods such as Factor Analysis and Principal Components Analysis address this
problem by reducing the three inflation variables to one, the estimated factor or index.
Factor Analysis was performed on variables that were standardized. Most Factor
Analysis software standardizes the variables used in the analysis by subtracting the mean
and dividing by the standard deviation of each series. The coefficients linking the
variables to the factor are called loadings. That is:
X1 = b1 * Factor1
X2 = b2 * Factor1
X3 = b3 * Factor1
Where X1, X2 and X3 are the three observed variables, Factor1 is the single underlying
factor and b1, b2 and b3 are the loadings.
In the case of Factor Analysis the loadings are the coefficients linking a standardized
factor to the standardized dependent variables, not the variables in their original scale.
Also, when there is only one factor, the loadings also represent the estimated correlations
between the factor and each variable. The loadings produced by the Factor Analysis
procedure are shown in Table 8.
7 This distribution will have an average of 5,000 the first year (after application of the inflationary factor for
year 1). Also ln(Severity) ~ N(8.47, .05).
Table 8
Variable                         Loading    Weight
Wage Inflation Index              .985       .395
Medical Inflation Index           .988       .498
Benefit Level Inflation Index     .947       .113
Table 8 indicates that all the variables have a high loading on the factor, and thus all are
likely to be important in the estimation of an economic index. An index value was
estimated for each record using a weighted sum of the three economic variables. The
weights used by the Factor Analysis procedure to compute the index are shown in Table
8. Note that these weights (within rounding error) sum to 1. The new index was then
used as a dependent variable to predict each state's severity for each year. The
regression model was of the form:
Severity = a + b * Index + e
where a and b are the regression intercept and coefficient and e is a random error term.
The results of the regression will be discussed below where they are compared to those of
the neural network.
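A minimal sketch of the Factor Analysis step follows, assuming Python with scikit-learn. It regenerates a small illustrative data set (the 0.6 exponent for the middle series is again a placeholder assumption), standardizes the observed indices, extracts one factor, and regresses severity on the estimated index. It illustrates the workflow described above rather than reproducing the paper's software or results.

    import numpy as np
    from sklearn.decomposition import FactorAnalysis
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    factor = np.cumprod(rng.normal(1.05, 0.025, size=(50, 5)), axis=1)    # cumulative unobserved factor
    noise = lambda: np.exp(rng.normal(0, 0.01, size=factor.shape))
    X = np.column_stack([(factor ** w * noise()).ravel() for w in (0.7, 0.6, 0.5)])  # three observed indices
    severity = (np.exp(rng.normal(8.47, 0.05, size=factor.shape)) * factor).ravel()

    X_std = (X - X.mean(axis=0)) / X.std(axis=0)       # most FA software standardizes the inputs first
    fa = FactorAnalysis(n_components=1)
    scores = fa.fit_transform(X_std)                   # estimated index, one value per record
    loadings = fa.components_[0]                       # loading of each variable on the factor

    reg = LinearRegression().fit(scores, severity)     # Severity = a + b * Index + e
    print(loadings, reg.intercept_, reg.coef_[0], reg.score(scores, severity))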
The simple neural network diagramed in Figure 23 with three inputs and one hidden node
was used to predict a severity for each state and year. Figure 24 displays the relationship
between the output of the hidden layer and each of the predictor variables. The hidden
node has a linear relationship with each of the independent variables, but is negatively
correlated with each of the variables. The relationship between the neural network
predicted value and the independent variables is shown in Figure 25. This relationship is
linear and positively sloped. The relationship between the unobserved inflation factor
driving the observed variables and the predicted values is shown in Figure 26. This
relationship is positively sloped and nearly linear. Thus, the neural network has produced
a curve which is approximately the same form as the "true" underlying relationship.
Figure 24
[Panels of each inflation variable against the hidden node output (e.g., MedCPI by HiddenNode); x-axis: HiddenNode]
Figure 25
[Panels of each variable against the neural network predicted value]
Figure 26
[Neural network predicted value versus the unobserved inflation factor; x-axis: Inflation Factor]
Interpreting the Neural Network Model
With Factor Analysis, a tool is provided for assessing the influence of a variable on a
factor and therefore on the final predicted value. The tool is the factor loadings, which
show the strength of the relationship between the observed variable and the underlying
factor. The loadings can be used to rank each variable's importance. In addition, the
weights used to construct the index8 reveal the relationship between the independent
variables and the predicted value (in this case the predicted value for severity).
For a neural network, a variable's importance can be assessed with the following sensitivity procedure (a code sketch appears after the list):
1. Hold one of the variables constant, say at its mean or median value.
2. Apply the fitted neural network to the data with the selected variable held
constant.
3. Compute the squared errors for each observation produced by these modified
fitted values.
4. Compute the average of the squared errors and compare it to the average squared
error of the full model.
5. Repeat this procedure for each variable used by the neural network. The
sensitivity is the percentage reduction in the error of the full model, compared to
the model excluding the variable in question.
6. If desired, the variables can be ranked based on their sensitivities.
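The following sketch implements one reading of the procedure above for any fitted model exposing a predict method (numpy assumed; names illustrative). In step 5, the sensitivity is computed here as the error reduction relative to the restricted model, which is one possible interpretation of the definition in the text.

    import numpy as np

    def sensitivities(model, X, y):
        """Sensitivity of each column of X for a fitted model with a .predict method."""
        base_mse = np.mean((y - model.predict(X)) ** 2)        # error of the full model
        result = {}
        for j in range(X.shape[1]):
            X_mod = X.copy()
            X_mod[:, j] = np.median(X[:, j])                   # hold variable j constant at its median
            mse_j = np.mean((y - model.predict(X_mod)) ** 2)   # error with variable j neutralized
            result[j] = (mse_j - base_mse) / mse_j             # percentage reduction in error
        return result

Because the fitted parameters are reused, no refitting is required for any of the variables.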
8 This would be computed as the product of each variable's weight on the factor times the coefficient of the
factor in a linear regression on the dependent variable (.85 in this example).
Since the same set of parameters is used to compute the sensitivities, this procedure does
not require the user to refit the model each time a variable's importance is being
evaluated. The following table presents the sensitivities of the neural network model
fitted to the factor data.
Table 10
Sensitivities of Variables in Factor Example
Variable              Sensitivity
Benefit Level            23.6%
Medical Inflation        33.1%
Wage Inflation            6.0%
According to the sensitivities, Medical Inflation is the most important variable followed
by Benefit Level and Wage Inflation is the least important. This contrasts with the
importance rankings of Benefit Level and Wage Inflation in the Factor Analysis, where
Wage Inflation was a more important variable than Benefit Level. Note that these are the
sensitivities for the particular neural network fit. A different initial starting point for the
network or a different number of hidden nodes could result in a model with different
sensitivities.
Figure 27 shows the actual and fitted values for the neural network and Factor Analysis
predicted models. This figure displays the fitted values compared to actual randomly
generated severities (on the left) and to "true" expected severities on the right. The x-axis
of the graph is the "true" cumulative inflation factor, as the severities are a linear
Figure 27
[Fitted values compared to actual severities (left) and to "true" expected severities (right); x-axis: Cumulative Factor]
function of the factor. However, it should be noted that when working with real data,
information on an unobserved variable would not be available.
The predicted neural network values appear to be more jagged than the Factor Analysis
predicted values. This jaggedness may reflect a weakness of neural networks: over
fitting. Sometimes neural networks do not generalize as well as classical linear models,
and fit some of the noise or randomness in the data rather than the actual patterns.
Looking at the graph on the right showing both predicted values as well as the "true"
value, the Factor Analysis model appears to be a better fit as it has less dispersion around
the "true" value. Although the neural network fit an approximately linear model to the
data, the Factor Analysis model performed better on the data used in this example. The
Factor Analysis model explained 73% of the variance in the training data compared to
71% explained by the neural network model and 45% of the variance in the test data
compared to 32% for the neural network. Since the relationships between the independent
and dependent variables in this example are approximately linear, this is another instance
of a situation where a classical linear model would be preferred over a more complicated
neural network procedure.
Interactions
Another common feature of data which complicates the statistical analysis is interactions.
An interaction occurs when the impact of two variables is more or less than the sum of
their independent impacts. For instance, in private passenger automobile insurance, the
driver's age may interact with territory in predicting accident frequencies. When this
happens, youthful drivers have a higher accident frequency in some territories than that
given by multiplying the age and territory relativities. In other territories it is lower. An
example of this is illustrated in Figure 28, which shows hypothetical curves9 of expected
or "true"(not actual) accident frequencies by age for each of four territories.
The graph makes it evident that when interactions are present, the slope of the curve
relating the dependent variable (accident frequency) to an independent variable varies
based on the values of a third variable (territory). It can be seen from the figure that
younger drivers have a higher frequency of accidents in territories 2 and 3 than in
territories 1 and 4. It can also be seen that in territory 4, accident frequency is not related
to age and the shape and slope of the curve is significantly different in Territory 1
compared to territories 2 and 3.
9 The curves are based on simulated data. However, data from the Baxter (Venables and Ripley) automobile
claims database was used to develop parameters for the simulation.
Figure 28
[Expected accident frequency by driver age, one panel per territory (Territories 1 through 4)]
As a result of interactions, the true expected frequency cannot be accurately estimated by
the simple product of the territory relativity times the age relativity. The interaction of
the two terms, age and territory, must be taken into account. In linear regression,
interactions are estimated by adding an interaction term to the regression. For a
regression in which the classification relativities are additive:

Y_ta = B0 + Bt*Territory + Ba*Age + Bat*Territory*Age

Where:
Y_ta is either a pure premium or loss ratio for territory t and age a
B0 = the regression constant
Bt, Ba and Bat are the coefficients of the Territory, Age and Age-Territory interaction terms
It is assumed in the regression model above that Territory enters the regression as a
categorical variable. That is, if there are N territories, N-1 dummy variables are created
which take on values of either 1 or 0, denoting whether an observation is or is not from
each of the territories. One territory is selected as the base territory, and a dummy
variable is not created for it. The value for the coefficient B0 contains the estimate of the
impact of the base territory on the dependent variable. More complete notation for the
regression with the dummy variables is:
Y_ta = B0 + Bt1*T1 + Bt2*T2 + Bt3*T3 + Ba*Age + Bat1*T1*Age + Bat2*T2*Age + Bat3*T3*Age

where T1, T2 and T3 are the dummy variables with values of either 1 or 0 described
above, Bt1 - Bt3 are the coefficients of the dummy variables and Bat1 - Bat3 are the
coefficients of the age and territory interaction terms. Note that most major statistical
coefficients of the age and territory interaction terms. Note that most major statistical
packages handle the details of converting categorical variables to a series of dummy
variables.
The interaction term represents the product of the territory dummy variables and age.
Using interaction terms allows the slope of the fitted line to vary by territory. A similar
formula to that above applies if the class relativities are multiplicative rather than
additive; however, the regression would be modeled on a log scale:
ln(Y_ta) = B*0 + B*t*Territory + B*a*Age + B*at*Territory*Age

where B*0, B*t, B*a and B*at are the log scale constant and coefficients of the Territory, Age
and Age-Territory interaction.
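A short sketch of such an interaction regression is shown below, assuming Python with pandas and statsmodels. The tiny data frame is illustrative; the C(territory) term expands into dummy variables with one base level, and the * operator adds both the main effects and the age-by-territory interaction terms described above.

    import pandas as pd
    import statsmodels.formula.api as smf

    # illustrative cell-level data: frequency y by age and territory
    df = pd.DataFrame({
        "y":         [0.30, 0.22, 0.18, 0.35, 0.25, 0.20, 0.12, 0.12, 0.12],
        "age":       [20,   40,   60,   20,   40,   60,   20,   40,   60],
        "territory": [1,    1,    1,    2,    2,    2,    4,    4,    4],
    })
    model = smf.ols("y ~ C(territory) * age", data=df).fit()
    print(model.params)    # intercept, territory dummies, age, and age:territory interactions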
Example 3: Interactions
To illustrate the application of both neural networks and regression techniques to data
where interactions are present, 5,000 records were randomly generated. Each record
represents a policyholder. Each policyholder has an underlying claim propensity
dependent on his/her simulated age and territory, including interactions between these
two variables. The underlying claim propensity for each age and territory combination
was that depicted above in Figure 28. For instance, in territory 4 the claim frequency is a
flat .12. In the other territories the claim frequency is described by a curve. The claim
propensity served as the Poisson parameter for claims following the Poisson distribution:
P(X = x; λ_at) = (λ_at^x / x!) * e^(-λ_at)
Here λ_at is the claim propensity or expected claim frequency for each age, territory
combination. The claim propensity parameters were used to generate claims from the
Poisson distribution for each of the 5,000 policyholders.10
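The claim simulation can be sketched as follows (Python with numpy assumed). The frequency function below is illustrative only; it mimics the qualitative features described in the text (a flat territory 4, a youthful-driver effect that varies by territory) rather than reproducing the paper's exact curves.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 5000
    age = rng.integers(16, 81, size=n)
    territory = rng.integers(1, 5, size=n)

    def expected_frequency(age, territory):
        if territory == 4:
            return 0.12                                    # flat frequency in territory 4
        base = 0.10 if territory == 1 else 0.15            # territories differ in level...
        youth = 0.20 if territory in (2, 3) else 0.08      # ...and in the youthful-driver effect
        return base + youth * np.exp(-0.1 * (age - 16))

    lam = np.array([expected_frequency(a, t) for a, t in zip(age, territory)])
    claims = rng.poisson(lam)                              # simulated claim counts per policyholder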
In logistic regression, the probability of a claim is modeled as a logistic function of the predictors:

p(y_i = 1) = 1 / (1 + exp(-(B0 + B1*x_i1 + ... + Bn*x_in)))

where x_i1 ... x_in are the independent variables for observation i, y_i is the response (either 0
or 1) and B1 ... Bn are the coefficients of the independent variables in the logistic
regression. This logistic function is similar to the activation function used by neural
networks. However, the use o f the logistic function in logistic regression is very different
from its use in neural networks. In logistic regression, a transform, the logit transform, is
10 The overall distribution of drivers by age used in the simulation was based on fitting a curve to
information from the US Department of Transportation web site.
applied to a target variable modeling it directly as a function of predictor variables. After
parameters have been fit, the function can be inverted to produce fitted frequencies. The
logistic functions in neural networks have no such straightforward interpretation.
Numerical techniques are required to fit logistic regression when the maximum
likelihood technique is used. Hosmer and Lemshow (Hosmer and Lemshow, 1989)
provide a clear but detailed description of the maximum likelihood method for fitting
logistic regression. Despite the more complicated methods required for fitting the model,
in many other ways, logistic regression acts like ordinary least squares regression, albeit,
one where the response variable is binary. In particular, the logit of the response variable
is a linear function of the independent variables. In addition interaction terms,
polynomial terms and transforms of the independent variables can be used in the model.
A simple approach to performing logistic regression (Hosmer and Lemshow, 1989), and
the one which will be used for this paper, is to apply a weighted regression technique to
aggregated data. This is done as follows (a code sketch appears after the list):
1. Group the policyholders into age groups such as 16 to 20, 21 to 25, etc.
2. Aggregate the claim counts and exposure counts (here the exposure is
policyholders) by age group and territory.
3. Compute the frequency for each age and territory combination by dividing the
number of claims by the number of policyholders.
4. Apply the logit transform to the frequencies (for logistic regression). That is,
compute log(p/(1-p)) where p is the claim frequency or propensity. It may be
necessary to add a very small quantity to the frequencies before the transform is
computed, because some of the cells may have a frequency of 0.
5. Compute a value for driver age in each cell. The age data has been grouped and a
value representative of driver ages in the cell is needed as an independent variable
in the modeling. Candidates are the mean and median ages in the cell. The
simplest approach is to use the midpoint of the age interval.
6. The policyholder count in each cell will be used as the weight in the regression.
This has the effect of causing the regression to behave as if the number of
observations for each cell equals the number of policyholders.
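The sketch below follows these six steps, continuing from the age, territory and claims arrays simulated in the earlier sketch (Python with pandas and statsmodels assumed; all names are illustrative).

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.DataFrame({"age": age, "territory": territory, "claims": claims})
    df["age_group"] = pd.cut(df["age"], bins=[15, 20, 25, 35, 45, 55, 65, 80])   # step 1

    cells = (df.groupby(["territory", "age_group"], observed=True)               # step 2
               .agg(claims=("claims", "sum"), policyholders=("claims", "size"))
               .reset_index())
    cells["freq"] = cells["claims"] / cells["policyholders"]                     # step 3
    eps = 1e-6                                                                   # guards against cells with frequency 0
    cells["logit"] = np.log((cells["freq"] + eps) / (1 - cells["freq"] + eps))   # step 4
    cells["midpoint"] = cells["age_group"].apply(lambda iv: (iv.left + iv.right) / 2)  # step 5

    wls = smf.wls("logit ~ C(territory) * midpoint", data=cells,
                  weights=cells["policyholders"]).fit()                          # step 6
    fitted_freq = 1 / (1 + np.exp(-wls.fittedvalues))                            # invert the logit transform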
One of the advantages of using the aggregated data is that it accommodates observations with
more than one claim. That is, the observations on individual records are not strictly binary,
since values of 2 claims and even 3 claims sometimes occur.
such as multinomial logistic regression N can be used to model discrete variables with
more than 2 categories. When the data is aggregated, all the observations of the
dependent variable are still in the range 0 to 1 and the logit transform still is appropriate
for such data. Applying the logit transform to the aggregated data avoids the need for a
more complicated approach. No transform was applied to the data to which the neural
network was applied, i.e., the dependent variable was the observed frequencies. The
result of aggregating the simulated data is displayed in Figure 29.
Figure 29
[Observed claim frequency by age group, one panel per territory; x-axis: Age]
Interpreting the neural network is more complicated than interpreting a typical regression.
In the previous section, it was shown that each variable's importance could be measured
by a sensitivity. Looking at the sensitivities in Table 12, it is clear that both age and
territory have a significant impact on the result. The magnitude of their effects seems to
be roughly equal.
Table 12: Sensitivity of Variables in Interaction Example
Variable Sensitivity
Age 24%
Territory 23%
Neither the weights nor the sensitivities help reveal the form of the fitted function.
However graphical techniques can be used to visualize the function fitted by the neural
network. Since interactions are of interest, a panel graph showing the relationship
between age and frequency for each territory can be revealing. A panel graph has panels
displaying the plot of the dependent variable versus an independent variable for each
value of a third variable, or for a selected range of values of a third variable. (Examples
of panel graphs have already been used in this paper in this section, to help visualize
interactions). This approach to visualizing the functional form of the fitted curve can be
useful when only a small number of variables are involved. Figure 30 displays the neural
network predicted values by age for each territory. The fitted curves for territories 2 and 3
are a little different, even though the "true" curves are the same. The curve for territory 4
is relatively flat, although it has a slight upward slope.
Figure 30
[Neural network predicted frequency by age, one panel per territory; x-axis: Age]
Regression fit
Table 13 presents the fitted coefficients for the logistic regression. Interpreting these
coefficients is more difficult than interpreting those of a linear regression, since the logit
represents the log of the odds ratio (p/(1-p)), where p represents the underlying true claim
frequency. Note that as the coefficients of the logit of frequency become more positive,
the fitted frequencies themselves become larger. Hence, variables with positive
coefficients are positively related to the dependent variable and coefficients with negative
signs are negatively related to the dependent variable.
Figure 31 displays the frequencies fitted by the logistic regression. As with neural
networks, graphs are useful for visualizing the function fitted by a logistic regression. A
noticeable departure from the underlying values can be seen in the results for Territory 4.
The fitted curve is upward sloping for Territory 4, rather than flat as the true values are.
Figure 31
[Logistic regression fitted frequency by age, one panel per territory]
Table 14
Results of Fits: Mean Squared Error
                   Training Data    Test Data
Neural Network         0.005          0.014
Regression             0.007          0.016
In this example the neural network had a better performance than the regression. Table
14 displays the mean square errors for the training and test data for the neural network
and the logistic regression. Overall, the neural network had a better fit to the data and did
a better job of capturing the interaction between Age and Territory. The fitted neural
network model explained 30 % of the variance in the training data versus 15% for the
regression. It should be noted that neither technique fit the "true" curve as closely as the
curves in previous examples were fit. This is a result of the noise in the data. As can be
seen from Figure 29, the data is very noisy, i.e., there is a lot of randomness in the data
relative to the pattern. The noise in the data obscures the pattern, and statistical
techniques applied to the data, whether neural networks or regression will have errors in
their estimated parameters.
The examples used thus far were kept simple, in order to illustrate key concepts about
how neural networks work. This example is intended to be closer to the typical situation
where data is messy. The data in this example will have nonlinearities, interactions,
correlated variables as well as missing observations.
To keep the example realistic, many of the parameters of the simulated data were based
on information in publicly available databases and the published literature. A random
sample of 5,000 claims was simulated. The sample represents 6 years of claims history.
(A multiyear period was chosen, so that inflation could be incorporated into the
example). Each claim represents a personal automobile claim severity developed to
ultimate12. As an alternative to using claims developed to ultimate, an analyst might use
a database of claims which are all at the same development age. Random claim values
were generated from a lognormal distribution. The scale parameter, μ, of the lognormal
(which is the mean of the logged variables) varied with the characteristics of the claim.
The claim characteristics in the simulation were generated by eight variables. The
variables are summarized in Table 15. The μ parameter itself has a probability
distribution. A graph of the distribution of the parameter in the simulated sample is
shown in Figure 32. The parameter had a standard deviation of approximately .38. The
objective of the analysis is to distinguish high severity policyholders from low severity
12 The analyst may want to use neural network or other data mining techniques to develop the data.
policyholders. This translates into an estimate of μ which is as close to the "true" μ as
possible.
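A minimal sketch of generating claim severities with record-specific lognormal parameters is given below (Python with numpy assumed). The mapping from rating variables to μ and the value of the residual sigma are purely illustrative assumptions, not the paper's actual generating function.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 5000
    age = rng.integers(16, 81, size=n)
    car_age = rng.integers(0, 21, size=n)

    mu = 7.7 - 0.004 * (age - 16) - 0.01 * np.minimum(car_age, 10)   # illustrative mu per claim
    mu = mu + rng.normal(0, 0.1, size=n)                             # other (omitted) drivers of mu
    severity = rng.lognormal(mean=mu, sigma=1.1)                     # simulated claim severities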
Figure 32
[Distribution of Mu; x-axis: MU]
Table 15 below lists the eight predictor variables used to generate the data in this example.
These variables are not intended to serve as an exhaustive list of predictor variables for
the personal automobile line. Rather, they are examples of the kinds of variables one
could incorporate into a data mining exercise. A ninth variable (labeled Bogus) has no
causal relationship to average severity. It is included as a noise variable to test the
effectiveness of the statistical procedures at using the data. An effective prediction
model should be able to distinguish between meaningful variables and variables which
have no relationship to the dependent variable. Note that in the analysis of the data, two
of the variables used to create the data are unavailable to the analyst as they represent
unobserved variables (the Auto BI and Auto PD underlying inflation factors). Instead,
six inflation indices which are correlated with the unobserved factors are available to the
analyst for modeling. Some features of the variables are listed below.
Table 15
Variable                     Variable Type    Number of Categories    Missing Data
Age of Driver                Continuous                               No
Territory                    Categorical      45                      No
Age of Car                   Continuous                               Yes
Car Type                     Categorical                              No
Credit Rating                Continuous                               Yes
Auto BI Inflation Factor     Continuous                               No
Note that some of the data is missing for two of the variables. Also note that a law
change was enacted in the middle of the experience period which lowered expected claim
severity values by 20%. A more detailed description of the variables is provided in
Appendix 2.
Figure 33
[Distribution of log(Severity)]
The data was separated into a training database of 4,000 claims and a test database of
1,000 claims. A neural network with 7 nodes in the hidden layer was run on the 4,000
claims in the training database. As will be discussed later, this network was larger than
the final fitted network. This network was used to rank variables in importance and
eliminate some variables. Because the amount of variance explained by the model is
relatively small (8%), the sensitivities were also small. Table 16 displays the results of
the sensitivity test for each of the variables. These rankings were used initially to
eliminate two variables from the model: Bogus, and the dummy variable for car age
missing. Subsequent testing of the model resulted in dropping other variables. Despite
their low sensitivities, the inflation variables were not removed. The low sensitivities
were probably a result of the high correlations of the variables with each other. In
addition, it was deemed necessary to include a measure of inflation in the model. Since
the neural network's hidden layer performs dimension reduction on the inflation
variables, in a manner analogous to Factor or Principal Components Analysis, it seemed
appropriate to retain these variables.
One danger that is always present with neural network models is overfitting. As more
hidden layer nodes are added to the model, the fit to the data improves and the r2 of the
model increases. However, the model may simply be fitting the features of the training
data, and therefore its results may not generalize well to a new database. A rule of thumb for
the number of intermediate nodes to include in a neural network is to use one half of the
number of variables in the model. After eliminating 2 of the variables, 13 variables
remained in the model. The rule of thumb would indicate that 6 or 7 nodes should be
used. The test data was used to determine how well networks of various sizes performed
when presented with new data. Neural networks were fit with 3, 4, 5, 6 and 7 hidden
nodes. The fitted model was then used to predict values of claims in the test data.
Application of the fitted model to the test data indicated that a 4 node neural network
provided the best model. (It produced the highest r2 in the test data). The test data was
also used to eliminate additional variables from the model. In applying the model to the
test data it was found that dropping the territory and credit variables improved the fit.
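A sketch of this model-selection step is shown below. MLPRegressor from scikit-learn stands in for the neural network software actually used in the paper, and the synthetic X and y arrays are placeholders for the prepared training and test data.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(4)
    X = rng.normal(size=(5000, 13))                       # placeholder for the 13 remaining predictors
    y = X @ rng.normal(size=13) + rng.normal(size=5000)   # placeholder target (log severity)
    X_train, X_test, y_train, y_test = X[:4000], X[4000:], y[:4000], y[4000:]

    best_r2, best_size = -np.inf, None
    for size in (3, 4, 5, 6, 7):
        net = MLPRegressor(hidden_layer_sizes=(size,), activation="logistic",
                           max_iter=5000, random_state=0).fit(X_train, y_train)
        r2 = net.score(X_test, y_test)                    # r-squared on the holdout data
        if r2 > best_r2:
            best_r2, best_size = r2, size
    print(best_size, best_r2)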
Goodness of Fit
The fitted model had an r2 of 5%. This is a low r2, but not out of line with what one
would expect with the highly random data in this example. The "true" μ (true expected
log(severity)) has a variance equal to 10% of the variance of the log of severity. Thus, if
one had perfect knowledge of μ, one could predict individual log(severities) with only
10% accuracy. However, if one had perfect knowledge of the true mean value for severity
for each policyholder, along with knowledge of the true mean frequency for each
policyholder, one could charge the appropriate rate for the policy, given the particular
characteristics of the policyholder. In the aggregate, with a large number of
policyholders, the insurance company's actual experience should come close to the
experience predicted from the expected severities and frequencies.
With simulated data, the "true" μ for each record is known. Thus, the model's accuracy
in predicting the true parameter can be assessed. Figure 34 plots the relationship between
μ and the predicted values (for the log of severity). It can be seen that as the predicted
value increases, μ increases. The correlation between the predicted values and the
parameter μ is .7.
Figure 34
Scatterplot of Neural Network Predicted vs Mu
As a further test of the model fit, the test data was divided into quartiles and the average
severity was computed for each quartile. A graph of the result is presented in Figure 35.
This graph shows that the model is effective in discriminating high and low severity
claims. One would expect an even better ability to discriminate high severity from low
severity observations with a larger sample. This is supported by Figure 36, which
displays the plot of "true" expected severities for each of the quartiles versus the neural
network predicted values. This graph indicates that the neural network is effective in
classifying claims into severity categories. These results suggest that neural networks
could be used to identify the more profitable insureds (or less profitable insureds) as part
of the underwriting process.
Figure 35
[Average actual severity by quartile of the neural network predicted value; x-axis: NN Predicted]
Figure 36
[Average "true" expected severity by quartile of the neural network predicted value]
In the previous example some simple graphs were used to visualize the form of the fitted
neural network function. Visualizing the nature of the relationships between dependent
and independent variables is more difficult when a number of variables are incorporated
into the model. For instance, Figure 37 displays the relationship between the neural
network predicted value and the driver's age. It is difficult to discern the relationship
between age and the network predicted value from this graph. One reason is that the
predicted value at a given age is the result of many other predictor variables as well as
age. Thus, there is a great deal of dispersion of predicted values at any given age due to
these other variables, disguising the fitted relationship between age and the dependent
variable.
Figure 37
[Neural network predicted value versus driver age]
Researchers on neural networks have been exploring methods for understanding the
function fit by a neural network. Recently, a procedure for visualizing neural network
fitted functions was published by Plate, Bert and Band (Plate et al., 2000). The procedure
is one approach to understanding the relationships being modeled by a neural network.
Plate et al. describe their plots as Generalized Additive Model style plots. Rather than
attempting to describe Generalized Additive Models, a technique for producing the plots
is simply presented below. (Both Venables and Ripley and Plate et al. provide
descriptions of Generalized Additive Models). The procedure is implemented as follows
(a code sketch appears after the list):
1. Set all the variables except the one being visualized to a constant value. Means
and medians are logical choices for the constants.
2. Apply the neural network function to this dataset to produce a predicted value for
each value of the independent variable. Alternatively, one could choose to apply
the neural network to a range of values for the independent variable selected to
represent a reasonable set of values of the variable. The other variables remain at
the selected constant values.
3. Plot the relationship between the neural network predicted value and the variable.
4. Plate et al. recommend scaling all the variables onto a common scale, such as 0 to
1. This is the scale of the inputs and outputs of the logistic functions in the neural
network. In this paper, variables remain in their original scale.
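The sketch below implements the procedure for any fitted model with a predict method (Python with numpy and matplotlib assumed; the function name and arguments are illustrative, not part of the Plate et al. software).

    import numpy as np
    import matplotlib.pyplot as plt

    def gam_style_plot(model, X, var_index, var_name, n_points=50):
        grid = np.linspace(X[:, var_index].min(), X[:, var_index].max(), n_points)
        X_const = np.tile(np.median(X, axis=0), (n_points, 1))   # all variables held at their medians
        X_const[:, var_index] = grid                              # except the one being visualized
        plt.plot(grid, model.predict(X_const))                    # fitted relationship for that variable
        plt.xlabel(var_name); plt.ylabel("predicted value")
        plt.show()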
The result of applying the above procedure is a plot of the relationship between the
dependent variable and one of the independent variables. Multiple applications of this
procedure to different variables in the model provides the analyst with a tool for
understanding the functional form of the relationships between the independent and
dependent variables.
The visualization method was applied to the data with all variables set to constants except
for driver age. The result is shown in Figure 38. From this graph we can conclude that
the fitted function declines with driver age. Figure 39 shows a similar plot for car age.
This function declines with car age, but then increases at older ages.
Figure 38
Figure 39
[Fitted function for car age, other variables held constant; x-axis: car age]
Suppose we wanted to visualize the relationship between a predictor variable which takes
on discrete values and the dependent variable. For instance, suppose we wanted to know
the impact of the law change. We can create fitted values for visualizing as described
above but instead of producing a scatterplot, we can produce a bar chart. Figure 40
displays such a graph. On this graph, the midpoint for claims subject to the law change
(a value of 1 on the graph) is about .2 units below the midpoint of claims not subject to
the law change. This suggests that the neural network estimates the law effect at about
20% because a .2 impact on a log scale corresponds approximately to a multiplicative
factor of 1.2, or .8 in the case of a negative effect (Actually, the effect when converted
from the log scale is about 22%). The estimate is therefore close to the "true" impact of
the law change, which is a 20% reduction in claim severity.
Figure 40
[Distribution of the neural network predicted value for claims subject and not subject to the law change; x-axis: Predicted]
The visualization procedure can also be used to evaluate the impact of inflation on the
predicted value. All variables except the six economic inflation factors were fixed at a
constant value while the inflation variables entered the model at their actual values. The
predicted values are then plotted against time. Figure 41 shows that the neural network
estimated that inflation increased by about 40% during the six year time period of the
sample data. This corresponds roughly to an annual inflation rate of about 7%. The
"true" inflation underlying the model was approximately 6%.
One way to visualize two-way interactions is to allow two variables to take on their
actual values in the fitting function while keeping the others constant. Figure 42 displays
such a panel graph for the age and car age interaction. It appears from this graph that the
function relating car age to the predicted variable varies with the value of driver age.
Figure 41
[Neural network predicted value over time, inflation variables at their actual values; x-axis: Quarters of a Year]
Figure 42
[Panel graph of the fitted relationship between car age and the predicted value, by driver age]
Regression Model
A regression model was fit to the data. The dependent variable was the log of severity.
Like neural networks, regression models can be subject to overfitting. The more
variables in the model, the better the fit to the training data. However, if the model is
overfit, it will not generalize well and will give a poor fit on new data. Stepwise
regression is an established procedure for selecting variables for a regression model. It
tests the variables to find the one that produces the best r2. This variable is added to the model. It
continues cycling through the variables, testing variables and adding a variable each
cycle to the model until no more significant variables can be found. Significance is
usually determined by performing an F-test on the difference in the r2 of the model
without a given variable and then with the variable.
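A minimal sketch of forward stepwise selection is shown below (Python with statsmodels assumed; X is a pandas DataFrame of candidate predictors and y the log-severity target, both illustrative names). The t-test on each newly added variable is equivalent to the partial F-test described above.

    import statsmodels.api as sm

    def forward_stepwise(X, y, alpha=0.05):
        selected = []
        while True:
            remaining = [c for c in X.columns if c not in selected]
            best_p, best_col = 1.0, None
            for col in remaining:
                fit = sm.OLS(y, sm.add_constant(X[selected + [col]])).fit()
                p = fit.pvalues[col]                    # significance of the candidate variable
                if p < best_p:
                    best_p, best_col = p, col
            if best_col is None or best_p > alpha:
                return selected                         # no more significant variables to add
            selected.append(best_col)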
Stepwise regression was used to select variables to incorporate into the model. Then a
regression on those variables was run. The variables selected were driver age, car age, a
dummy variable for the law change and the hospital inflation factor. Note that the
hospital inflation factor had a very high correlation with both underlying inflation factors
(even though the factors were generated to be independent of each other13). Thus, using
just the one variable seems to adequately approximate inflation. On average, the increase
in the hospital inflation index was 4.6%. Since a factor of 1.15 (see Table 17) was
applied to the hospital inflation factor, inflation was estimated by the regression to be a
little over 5% per year. The regression model estimated the impact of the law change as a
reduction of .3 on the log scale, or about 35%, as opposed to the estimate of about 22% for
the neural network. Thus, the neural network overestimated inflation a little, while the
regression model underestimated it a little. The neural network estimate of the law
13 This may be a result of using a random walk procedure to generate both variables. Using random walk
models, the variables simulated have high correlations with prior values in the series.
change effect was close to the "true" value, while the regression overestimated the
magnitude of the effect.
The regression found a negative relationship between driver age and severity and
between car age and severity. An interaction coefficient between age and car age was
also estimated to be negative. The results correspond with the overall direction of the
"true" relationships. The results of the final regression are presented in Table 17.
The fitted regression had a somewhat lower r2 than the neural network model. However,
on some goodness of fit measures, the regression performance was close to that of the
neural network. The regression predicted values had a .65 correlation with μ, versus .70
for the neural network. As seen in Figures 43 and 44, the regression was also able to
discriminate high severity from low severity claims with the test data. Note that neither
model found the Bogus variable to be significant. Also, neither model used all the
variables that were actually used to generate the data, such as territory or credit
information. Neither technique could distinguish the effect of these variables from the
overall background noise in the data.
Figure 43
[Average actual severity by quartile of the regression fitted log(Severity)]
Figure 44
[Average "true" expected severity by quartile of the regression predicted value]
Since the models predict the log of severity, the predictions must be converted back to the
severity scale. For a lognormal severity, the expected value is:

E(Y_i) = exp(μ_i + σ2/2)

where μ_i is the mean of the logged severity for claim i and σ2 is its variance.
Since μ_i and σ2 are unknown, estimates of their values must be used. The predicted
value from the neural network or regression is the usual choice for an estimate of μ_i. The
mean square error of the neural network or regression can be used as an estimate of σ2 in
the formula above. A predicted value was computed for the claims that were used to fit
the neural network model. A plot of the predicted severities versus the "true" expected
severities is displayed in Figure 43.
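In code, the back-transform is a one-liner; the names net, X_train, y_train and X_test below carry over from the earlier model-selection sketch and are assumptions, not the paper's code.

    import numpy as np

    predicted_log = net.predict(X_test)                           # fitted values of log(severity)
    sigma2_hat = np.mean((y_train - net.predict(X_train)) ** 2)   # MSE as an estimate of sigma-squared
    expected_severity = np.exp(predicted_log + sigma2_hat / 2)    # lognormal mean formula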
Figure 43
Scatterplot of Expected versus Predicted Severity
However, many of the underwriting applications of modeling and prediction require both
a frequency and a severity estimate. A company may wish to prune "bad" risks from its
portfolio, pursue "good" risks or actually use models to establish rates. For such
applications either the loss ratio or pure premium will be the target variable of interest.
There are two approaches to estimating the needed variable: 1) One can separately
estimate frequency and severity models and combine the estimates of the two models.
An illustration of fitting models to frequencies was provided in Example 4 and an
example of fitting models to severities was supplied in Example 5. 2) Alternatively, one
can estimate a pure premium or loss ratio model directly.
One difficulty of modeling pure premiums or loss ratios is that in some lines of business,
such as personal lines, most of the policyholders will have no losses, since the expected
frequency is relatively low. It is desirable to transform the data onto a scale that does not
allow for negative values. The log transformation accomplishes this. However, since the
log is not defined for a value of zero, it may be necessary to add a very small constant to
the data in order to apply the log transform.
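For example (Python with numpy assumed; the pure premium values and the size of the constant are illustrative choices):

    import numpy as np

    pure_premium = np.array([0.0, 0.0, 1200.0, 0.0, 350.0])   # most policyholders have no losses
    eps = 1.0                                                  # small relative to typical pure premiums
    log_pure_premium = np.log(pure_premium + eps)              # log transform now defined everywhere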
Once a pure premium is computed, it can be converted into a rate by loading for expenses
and profit. Alternatively, the pure premium could be ratioed to premium at current rate
levels to produce a loss ratio. A decision could be made as to whether the predicted loss
ratio is acceptable before underwriting a risk. Alternatively the loss ratio prediction for a
company's portfolio of risks for a line of business can be loaded for expenses and profit
and the insurance company can determine if a rate increase is needed.
Summary
This paper has gone into some detail in describing neural networks and how they work.
The paper has attempted to remove some of the mystery from the neural network "black
box". The author has described neural networks as a statistical tool which minimizes the
squared deviation between target and fitted values, much like more traditional statistical
procedures do. Examples were provided which showed how neural networks 1) are
universal function approximators and 2) perform dimension reduction on correlated
predictor variables. Classical techniques can be expected to outperform neural network
models when data is well behaved and the relationships are linear or variables can be
transformed into variables with linear relationships. However neural networks seem to
have an advantage over linear models when they are applied to complex nonlinear data.
This is an advantage neural networks share with other data mining tools not discussed in
detail in this paper. Future research might investigate how neural networks compare to
some of these data mining tools.
Note that the paper does not advocate abandoning classical statistical tools, but rather
adding a new tool to the actuarial toolkit. Classical regression performed well in many of
the examples in this paper. Some classical statistical tools such as Generalized Linear
Models have been applied successfully to problems similar to those in this paper. (See
Holler et al. for an example).
A disadvantage of neural networks is that they are a "black box". They may outperform
classical models in certain situations, but interpreting the result is difficult because the
nature of the relationship between the predictor and target variables is not usually revealed.
Several methods for interpreting the results of neural networks were presented. Methods
for visualizing the form of the fitted function were also presented in this paper.
Incorporating such procedures into neural network software should help address this
limitation.
Appendix 1: Neural Network Software
Neural network software is sold at prices ranging from a couple of hundred dollars to
$100,000 or more. The more expensive prices are generally associated with more
comprehensive data mining products, which include neural networks as one of the
capabilities offered. Some of the established vendors of statistical software such as SPSS
and SAS sell the higher end data mining products14. These products are designed to
function on servers and networks and have the capability of processing huge databases.
They also have some of the bells and whistles useful to the analyst in evaluating the
function fit by the neural network, such as a computation of sensitivities. Both of these
products allow the user to apply a number of different kinds of neural networks,
including types of networks not covered in this paper.
Many of the less expensive products provide good fits to data when the database is not
large. Since the examples in this paper used modestly sized databases, an expensive
product with a lot of horsepower was not required. Two of the less expensive tools were
used to fit the models in this paper: a very inexpensive neural network package,
Brainmaker, and the S-PLUS neural network function, nnet. The Brainmaker tool has a
couple of handy features. It creates a file that contains all the parameters of the fitted
neural network function for the hidden and output layers. It also has the capability of
producing the values of the hidden nodes. Both of these features were helpful for the
detailed examination of neural networks contained in this paper. However, the
Brainmaker version employed in this analysis had difficulty fitting networks on larger
databases15, so the S-PLUS nnet function was used for the last example. The S-PLUS
nnet function is contained in a library supplied by Venables and Ripley, rather than the
vendors of S-PLUS, but it is included in the basic S-PLUS package. This software also
provides the fitted parameters for the hidden and output layers. (However, it does not
provide the fitted values for the hidden nodes). Chapter 9 of Venables and Ripley
describes the software and how to use it. (Venables and Ripley, 1999).
Numerous other products with which the author of this paper has no experience are also
available for fitting neural networks. Thus, no statement made in this paper should be
interpreted as an endorsement of any particular product.
14 The SPSS data mining product is called Clementine. The SAS product is called the Enterprise Miner.
SPSS also sells an inexpensive neural network product, Neural Connection. The author has used Neural
Connection on moderately sized databases and found it to be effective on prediction and classification
problems.
15 It should be noted that the vendors of Brainmaker sell a professional version which probably performs
better on large databases.
Appendix 2
This appendix is provided for readers wishing a little more detail on the structure of the
data in the Example 5.
Automobile Bodily Injury (ABI) inflation factor and Automobile Property Damage and
Physical Damage (APD) inflation factor: These factors drive quarterly increases in the
bodily injury, property damage and physical damage components of average severity.
They are unobserved factors. The ABI factor is correlated with three observed variables:
the producer price index for hospitals, the medical services component of the consumer
price index and an index of average hourly earnings. The APD factor is correlated with
three observed variables: the producer price index for automobile bodies, the producer
price index for automobile parts and the other services component of the consumer price
index. Bureau of Labor Statistics data was reviewed when developing parameters for the
factors and for the "observed" variables. The ABI factor was given a 60% weight and the
APD factor was given a 40% weight in computing each claim's expected severity.
16 This database of Automobile claims is available as an example database in S-PLUS. Venables and
Ripley supply the S-PLUS data for claim severities in an S-PLUS library. See Venables and Ripley, p. 467.
Law Change: A change in the law is enacted which causes average severities to decline
by 20% after the third year.
Interactions:
Table 18 shows the variables with interactions. Three of the variables have interactions.
In addition, some of the interactions are nonlinear (or piecewise linear). An example is
the interaction between age and car age. This is a curve that has a negative slope at
older car ages and younger driver ages, but is flat for older driver ages and younger car
ages. The formula used for generating the interaction between age, car age and car type
is provided below (after Table 19). In addition to these interactions, other relationships
exist in the data, which affect the mix of values for the predictor variables in the data.
Young drivers (<25 years old) are more likely not to have any credit limits (a condition
associated with a higher average severity on the credit variable). Younger and older
(>55) drivers are more likely to have older cars.
Table 18
Interactions
Driver Age and Car Type
Driver Age and Car Age
Driver Age and Car Age and Car Type
Nonlinearities
A number of nonlinear relationships were built into the data. The relationship between
Age and severity follows an exponential decay (see formula below). The relationships
between some of the inflation indices and the Factors generating actual claim inflation
are nonlinear. The relationship between car age and severity is piecewise linear. That is,
there is no effect below a threshold age; then the effect increases linearly up to a maximum
effect and remains at that level at higher ages.
Missing Data
In our real life experience with insurance data, values are often missing on variables
which have a significant impact on the dependent variable. To make the simulated data in
this example more realistic, data is missing on two of the independent variables. Table 19
presents information on the missing data. Two dummy variables were created with a
value of 0 for most of the observations, but a value of 1 for records with a missing value
on car age and/or credit information. In addition, a value of 1 was recorded for car age
and credit leverage where data was missing. These values were used in the neural
network analysis. The average of each of the variables was substituted for the missing
data in the regression analysis.
Table 19
Missing Values
Car Age 10% of records missing information if driver age is < 25. Otherwise 5% of data
is missing
Credit 25% of records are missing the information if Age < 25, otherwise 20% of data is
missing.
The μ parameter of the lognormal severity distribution was created with the following
function:
where
References
Berry, Michael J. A., and Linoff, Gordon, Data Mining Techniques, John Wiley and Sons, 1997
Dhar, Vasant and Stein, Roger, Seven Methods for Transforming Corporate Data Into Business
Intelligence, Prentice Hall, 1997
Derrig, Richard, "Patterns, Fighting Fraud With Data", Contingencies, pp. 40-49.
Freedman, Roy S., Klein, Robert A. and Lederman, Jess, Artificial Intelligence in the Capital Markets,
Probus Publishers 1995
Hatcher, Larry, A Step by Step Approach to Using the SAS System for Factor Analysis, SAS Institute,
1996
Heckman, Phillip E. and Meyers, Glen G., "The Calculation of Aggregate Loss Distributions from Claim
Severity and Claim Count Distributions", Proceedings of the Casualty Actuarial Society, 1983, pp. 22-61.
Holler, Keith, Sommer, David, and Trahair, Geoff, "Something Old, Something New in Classification
Ratemaking With a New Use of GLMs for Credit Insurance", Casualty Actuarial Society Forum, Winter
1999, pp. 31-84.
Hosmer, David W. and Lemshow, Stanley, Applied Logistic Regression, John Wiley and Sons, 1989
Keefer, James, "Finding Causal Relationships By Combining Knowledge and Data in Data Mining
Applications", Paper presented at Seminar on Data Mining, University of Delaware, April, 2000.
Kim, Jae-On and Mueller, Charles W., Factor Analysis: Statistical Methods and Practical Issues, SAGE
Publications, 1978
Lawrence, Jeannette, Introduction to Neural Networks: Design, Theory and Applications, California
Scientific Software, 1994
Martin, E. B. and Morris, A. J., "Artificial Neural Networks and Multivariate Statistics", in Statistics and
Neural Networks Advances at the Interface, Oxford University Press, 1999, pp. 195 - 292
Masterson, N. E., "Economic Factors in Liability and Property Insurance Claims Cost: 1935-1967",
Proceedings of the Casualty Actuarial Society, 1968, pp. 61 - 89.
Monaghan, James E., "The Impact of Personal Credit History on Loss Performance in Personal Lines",
Casualty Actuarial Society Forum, Winter 2000, pp. 79-105
Plate, Tony A., Bert, Joel, and Band, Pierre, "Visualizing the Function Computed by a Feedforward Neural
Network", Neural Computation, June 2000, pp. 1337-1353.
Potts, William J.E., Neural Network Modeling: Course Notes, SAS Institute, 2000
Smith, Murray, Neural Networks for Statistical Modeling, International Thompson Computer Press, 1996
Speights, David B., Brodsky, Joel B., and Chudova, Durya I., "Using Neural Networks to Predict Claim
Duration in the Presence of Right Censoring and Covariates", Casualty Actuarial Society Forum, Winter
1999, pp. 255-278.
Venables, W.N. and Ripley, B.D., Modern Applied Statistics with S-PLUS, third edition, Springer, 1999
Warner, Brad and Misra, Manavendra, "Understanding Neural Networks as Statistical Tools", American
Statistician, November 1996, pp. 284 - 293
Actuarial Applications of Multifractal Modeling
Part I: Introduction and Spatial Applications
Actuarial Applications of Multifractal Modeling
Abstract
Multifractals are mathematical generalizations of fractals, objects displaying "fractional
dimension," "scale invariance," and "self-similarity." Many natural phenomena, includ-
ing some of considerable interest to the casualty actuary (meteorological conditions,
population distribution, financial time series), have been found to be well-represented by
(random) multifractals. In this Part I paper, we define and characterize multifractals and
show how to fit and simulate multifractal models in the context of two-dimensional
fields. In addition, we summarize original research we have published elsewhere
concerning the multifractal distribution of insured property values, and discuss how we
have used those findings in particular and multifractal modeling in general in a severe
storm catastrophe model.
Introduction
In this section, we introduce the concepts of fractals and multifractals.
Fractals
Mathematicians have known of sets whose dimension is not a whole number for some
time, but the term "fractal'" emerged on the scientific and popular scenes with the work of
Benoit Mandlebrot in the 1960s and 1970s [Mandlebrot 1982].
Mathematically, a fractal can be defined as a point set with possibly non-integer
dimension. Examples of fractals include continuous random ~alks (Weiner processes),
the Cantor set, and the Sierpinski triangle (the latter two discussed below). Phenomena in
natqre that resemble fi-actals include dust spills and coastlines.
Regular fractals possess the attribute of self-similarity. This means that parts of the set
are similar (in the geometrical sense of equivalence under a transformation consisting of
magnification, rotation, translation, and reflection) to the whole. This gives regular
fractals an "infinite regress" look, as the same large-scale geometrical features are
repeated at ever smaller and smaller scales. Self-similarity is also known as scale
The authors would like to thank Joel Mangano for his contributions to this paper, Shaun Lovejoy and
Daniel Schertzer for their helpful conversations, and Gary Venter for his review of an early draft. Errors,
of course, are solely the responsibility of the authors.
322
symmetry or scaling - the fractal doesn't have a characteristic scale at which its features
occur; they occur at all scales equally.
Irregular fractals do not possess strict self-similarity, but possess statistical self-similarity
and scaling. This will be clarified below.
The key numerical index of a fractal, its fractal dimension, deserves further explanation. It is not immediately obvious how the concept of dimension from linear algebra, the maximum number of linearly independent vectors in a space, can be generalized to include the possibility of noninteger values. While there are several ways of doing so (and they often coincide), the so-called capacity dimension (sometimes misnamed the Hausdorff dimension2) is perhaps the easiest to understand.
Consider a closed and bounded subset S of N-dimensional Euclidean space R^N. We define a covering of S of size λ to be a set of hypercubes {H_i} such that (1) each hypercube is of size λ on a side and (2) the set S is contained within the union of all hypercubes ∪H_i. For any λ, let n(λ) be the minimum number of hypercubes needed to be a covering of S. The dimension of S can then be defined in terms of the scaling behavior of coverings of S, i.e., the behavior of n(λ) as λ→0.
Examples:
• If S consists of a finite number of points, then for all λ less than the minimum distance between points, a covering needs to have as many hypercubes as there are points: n is constant for small λ.
• If S consists of a line segment of length L, then n(λ) = L/λ: n varies as the reciprocal of the first power of scale λ.
• If S consists of a (sub-)hypercube of dimension m and length L on a side, then n(λ) is approximately (L/λ)^m: n varies as the reciprocal of the mth power of scale λ.
This exponential relation, n(λ) ∝ λ^{-m}, motivates the definition of fractal dimension:

d = -\lim_{\lambda \to 0} \frac{\log(n(\lambda))}{\log(\lambda)}   (1)

The previous examples show that by this definition, a set of isolated points has dimension zero, a line segment has dimension one, and an m-hypercube has dimension m, as we would expect.3
Subsets of the unit interval may have various dimensions less than or equal to one, and cardinality is no guarantee of dimension for infinite sets. Finite point sets have dimension zero, of course, but there are countable subsets with dimension zero and those with dimension one. For example, the set of rational numbers (a countable set) is a dense subset of the real numbers, meaning that any open set around a real number contains a rational. Therefore, the fractal dimension of the rationals is the same as that of the reals (they need exactly the same covering sets), that is, one.
On the other hand, the countable set consisting of points x_k = α^k, k = 1, 2, 3, ..., where 0 < α < 1, has dimension zero. This can be seen by considering coverings by blocks of length λ = α^j for some arbitrary j. The first block covers all points x_j, x_{j+1}, ..., and at most (j-1) blocks are needed to cover the other (j-1) points. Thus,
log(n(λ))/log(λ) ≤ log(j)/(j·log(α)) → 0.
2 The definition of Hausdorff dimension is more technically complicated, involving an infimum rather than a limit, thereby handling cases where the limit (in equation 1) does not exist.
3 Note that the dimension N of the embedding space is irrelevant. While it is true that a line segment of finite length can be made to fit in a hypercube of arbitrarily small side if the dimension of the hypercube is big enough, what really matters is the scaling behavior. That is, if the side of the hypercube is halved, then two of them are needed to cover the line segment, implying the line segment has dimension one.
Nothing in the definition of fractal dimension precludes the possibility of a set S having a noninteger dimension d. We now present some examples to show how this can happen.
The Cantor set is a subset of a line segment and is defined recursively as follows. Start with the entire line segment. Remove the middle third, leaving two disconnected closed line segments. Repeat the process on each remaining line segment, ad infinitum. In the limit, we have the Cantor set. At stage k of the construction (the whole segment being stage 0), we have 2^k subsegments, each of length 3^{-k}, for a total length of (2/3)^k. In the limit, the Cantor set has measure4 zero (it consists of points with no net length) because in the limit, (2/3)^k goes to zero. For any length λ = 3^{-k}, we need 2^k segments H_i to cover the set. Therefore the fractal dimension of the Cantor set is log(2)/log(3) = 0.63093..., corresponding to something between a line and a set of isolated points.
The self-similarity of the Cantor set follows directly from its construction. Each subsegment is treated in precisely the same way (up to a scale factor) as the original segment.
As an example of a noninteger fractal dimension in a 2-dimensional space, consider the Sierpinski triangle (also known as the Sierpinski gasket). This subset of the unit square is defined recursively as follows: Start with an equilateral triangle and its interior. Draw an inscribed triangle (point down) connecting the midpoints of each side. This divides the triangle into four similar and congruent sub-triangles. Remove the interior of the inscribed triangle. Repeat the process on each of the remaining three sub-triangles. Figure 1 shows an approximation to the result. As with the Cantor set, the Sierpinski triangle has zero measure (no area), because at stage k the construction takes up (3/4)^k of the area of the outer triangle. Assuming the original triangle is inscribed in a unit square, at stage k of the construction, we need 3^k squares H_i of side λ = 2^{-k} to cover the set. Therefore, the Sierpinski triangle has fractal dimension log(3)/log(2) = 1.584963..., corresponding to something between a linear and a planar figure.
The self-similarity of the Sierpinski triangle again follows directly from its construction. Each sub-triangle is a miniature version of the original triangle and is similar to all other triangles appearing in the set.
The analysis of fractal dimension by this method is generally termed box-counting. There are other approaches, but they will not be discussed here. Note that the method applies to arbitrary sets, not just self-similar ones. A non-self-similar set is called an irregular fractal if it has a noninteger fractal dimension.
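As an aside (not part of the original exposition), the box-counting recipe is easy to carry out numerically. The sketch below approximates the Sierpinski triangle by the well-known "chaos game" and regresses log n(λ) on log λ; the point count, box depths, and seed are arbitrary illustrative choices, and numpy is assumed.

```python
import numpy as np

def sierpinski_points(n=200_000, seed=0):
    """Generate points of the Sierpinski triangle by the 'chaos game'."""
    rng = np.random.default_rng(seed)
    verts = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])
    p = np.zeros(2)
    pts = np.empty((n, 2))
    for i in range(n):
        p = (p + verts[rng.integers(3)]) / 2.0   # jump halfway to a random vertex
        pts[i] = p
    return pts

def box_counting_dimension(pts, depths=range(2, 9)):
    """Estimate d = -lim log n(lambda) / log lambda by linear regression."""
    log_lam, log_n = [], []
    for k in depths:
        lam = 2.0 ** -k                                    # box side length
        cells = np.unique(np.floor(pts / lam).astype(int), axis=0)
        log_lam.append(np.log(lam))
        log_n.append(np.log(len(cells)))                   # n(lambda) = occupied boxes
    slope, _ = np.polyfit(log_lam, log_n, 1)
    return -slope

print(box_counting_dimension(sierpinski_points()))  # ~1.58, close to log(3)/log(2)
```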
Among natural phenomena, coastlines are frequently cited as good examples of irregular fractals. The measured length of a coastline depends on the scale of accuracy of the measuring tool. Comparing maps at various scales, one can see progressive deterioration of detail as larger scales are used. What appears as a wrinkled inlet on one map is abstracted to a simple polygon on the next and then obliterated completely on the next. [Barnsley] gives the fractal dimension of the coast of Great Britain as approximately 1.2. [Woo] discusses numerous areas where fractal laws relate to natural hazard processes.
This notion of scale-dependent measurements will play a central role in the practical application of fractal and multifractal theory to real-world problems.
Multifractals
Multifractals, also known as fractal measures, generalize the notion of fractals. Mandelbrot also worked on multifractals in the 1970s and 1980s [Mandelbrot 1988], but the first use of the term is credited to U. Frisch and G. Parisi [Mandelbrot 1989]. Rather than being point sets, multifractals are measures (distributions) exhibiting a spectrum of fractal dimensions.
A brief review of measure theory is in order. A measure μ on a space X is a function from a set of subsets of X (a σ-algebra of "measurable sets") to the real numbers R. In order to be a measure, the function μ must satisfy μ(∅) = 0, μ(S) ≥ 0, and μ of any countable collection of disjoint sets must equal the sum of μ on each set. Actuaries typically encounter only probability measures, where, in addition, μ(X) = 1. The usual measure on R^N is Lebesgue measure v(S), characterized by the fact that if S is a rectangular solid with sides of lengths λ_i, i = 1, ..., N, then v(S) = ∏_i λ_i.
If a measure μ on R^N is zero on every set for which v is zero (i.e., it is absolutely continuous), then the ratio of measures μ(H)/v(H), where H is a neighborhood (with nonzero measure) around a point x, is well-defined, and in the limit, as the neighborhood shrinks to measure zero, the ratio f(x), if it exists, is the density of μ, also known as the Radon-Nikodym derivative. Not all measures have densities; think of a probability function with a point mass at zero. As H shrinks around the point mass, μ(H) cannot become less than the point mass, but v(H) goes to zero; the density becomes infinite. Multifractals, as measures, tend to be extremely ill-behaved, not characterizable in terms of densities and point, line, plane, etc., masses.
The simplest way to create a multifractal is by a multiplicative cascade. Consider the "binomial multifractal," constructed on a half-open unit interval (0,1] with uniform density as follows: Divide the interval into two halves (open on the left) of equal length. Distribute 0<p<1 of the mass uniformly on the left half and 1-p of the mass uniformly on the right half (here p is a constant throughout all stages of the construction). Repeat on each subinterval. Figure 2 shows several stages of construction with p = 1/3. The horizontal axes show the unit interval and the vertical axes show density. The upper left panel shows stage 1, where 1/3 of the mass is on the left half and 2/3 is on the right. Note that the average density is 1. The upper right panel shows stage 2, where the left and right halves have each been divided. The 2nd and 3rd quarters of the interval have the same density because they have masses of (1/3)·(2/3) and (2/3)·(1/3), respectively. The lower left panel shows stage 4, where the interval has been divided into 2^4 = 16 subsegments. Some local maxima seem to be appearing; they are labeled. The lower right panel shows stage 7, and begins to give a sense of what the ultimate multifractal looks like. Note the similarity of the left and right halves.
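As an illustration (ours, with p = 1/3 as in Figure 2), the binomial cascade is a few lines of code: repeatedly split each cell's mass in the proportions p and 1-p, then divide the stage-k masses by the cell width 2^-k to obtain the densities plotted above.

```python
import numpy as np

def binomial_cascade(p=1/3, stages=7):
    """Return the density of the binomial multifractal after a number of stages."""
    mass = np.array([1.0])
    for _ in range(stages):
        # each cell sends fraction p of its mass to its left half, 1-p to its right half
        mass = np.column_stack((p * mass, (1 - p) * mass)).ravel()
    density = mass * len(mass)          # density = mass divided by cell width 2**-stages
    return density

dens = binomial_cascade(stages=7)
print(dens.mean(), dens.max())          # mean density 1.0; for p < 1/2 the max is (2*(1-p))**stages
```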
As you can see, at the local maxima, the density "blows up" as the scale resolution gets finer. Note how the maximum density increases from panel to panel. However, the rate of divergence is different at different points. The sets of locations with particular (different) rates of divergence turn out to be fractals (with different fractal dimensions). Thus we have layers of fractals representing different "orders of singularities," with a relationship between the rate of divergence and the fractal dimension. See Appendix A for mathematical details.
This relationship is known as the spectrum of singularities: no single fractal dimension suffices to characterize the fractal measure, hence the name multifractal.
Having a spectrum of singularities means that the multifractal measure consists of infinitely spiky peaks sprinkled throughout predominant valleys, but that with proper mathematical technology, the peaks can be classified by the rate at which they diverge to infinity, and comparable peaks can be collected together into fractal "mountain ranges."
Figure 3 shows a real-world density field that approximates a multifractal. It is the population density of the northeastern USA. The big spike in the middle is New York City. Lesser spikes pick out other densely populated cities.
In their analysis of turbulent meteorological phenomena, [Schertzer & Lovejoy] write the functional relationship between a chosen scale of resolution λ and the average densities φ_λ measured at that scale as:

\Pr\{\varphi_{\lambda} \geq \lambda^{\gamma}\} \propto \lambda^{-c(\gamma)}   (2)

This is very much in the spirit of box-counting for fractals, except the equivalent formulation for fractals would have (1) the event inside Pr{ } being the probability of finding any point of the fractal in a λ-neighborhood, instead of points that satisfy a certain degree of singularity, and (2) the exponent on the right-hand side being a constant, the fractal dimension of the set, instead of a function. In this formulation, the function c(γ) carries all the information necessary to characterize, in a statistical sense, the multifractal.5
5 It is tempting to read this equation as a statement about the probability of encountering a point with exponent γ or higher, or the probability of fractal dimension. However, if the fractal dimension of points having exponent γ or higher is less than the dimension of the embedding space, then such points make up a set of (Lebesgue or probability) measure zero. In the typical multifractal, "almost" all the mass is concentrated in "almost" none of the region. The equation is really a statement about the scaling relationship between intensity and probability.
Compare Figure 3 with Figures 4 and 5. The former measures population density at the resolution of 8 miles. The latter two measure it at resolutions of 16 and 32 miles, respectively. Clearly, one's impression of this density field is largely driven by the scale of resolution used. A systematic investigation of the appearance of a field using various scales of resolution is at the heart of multifractal analysis.
A box-counting approach developed by [Lavallée et al.], known as Probability Distribution Multiple Scaling (PDMS), can be used to estimate the probabilities of singularities with assorted rates of divergence. (See also [Lovejoy & Schertzer 1991].) It turns out that directly estimating c(γ) in such a fashion is not a productive approach to analyzing real data sets for multifractality, due to the severe demands that the procedure places on the sample data. In the next section, we will show how multifractals can be understood equally well through the behavior of their moments.
[Pecknold et al.] give many examples of (apparent) multifractals in nature. See also [Ladoy et al.]. These include rain and cloud fields (measured from scales of a thousand kilometers and years down to millimeters and seconds; see [Lovejoy & Schertzer 1991]), human population density (as above; also see further discussion below), and foreign exchange rates. Part of the impetus for the development and practical application of multifractal analysis came from "the burgeoning mass of remotely sensed satellite and radar data" [Tessier et al., 1993]. Depending on the scale of resolution used, measurements of cloud cover could be made to vary drastically; moreover, how this variation with scale behaved was also dependent on the level of intensity chosen as a threshold: just the sort of fractals-within-fractals behavior to be expected from multifractal fields.
Spatial Fields
In this section, we delve into the general theory of self-similar random fields, focusing on the two-dimensional case. (The extension to three or more dimensions is straightforward.) Examples are taken from our applications in property-liability insurance.
A random field φ(r) is a collection of random variables indexed by a parameter r, which may be:
• an integer, for example, in the case where the random field is a (discrete) time series,
• a real number, for example, in the case where the random field is a (continuous) stochastic process,
• a vector in D-dimensional Euclidean space R^D, in the case of a general random field.
Typically, we focus on D = 1 for financial/econometric time series and D = 2 for spatial distributions in geography or meteorology.
To analyze a random multifractal, we must first respect the fact that it is a measure, and strictly speaking may not (typically does not) possess real-valued densities. Therefore, we cannot treat a random multifractal as a random field φ(r).6 However, as we have seen in previous sections, when viewed at a finite scale of resolution L, a multifractal does have a well-behaved density that we can treat as a random field φ_L(r). Thus, the approach to studying random multifractals is to consider sequences of random fields that describe the density of the measure at various scales of resolution, and to study the scaling behavior of those sequences.
Appendix B outlines the mathematics. The box-counting approach appears to admit straightforward application (and becomes PDMS) as discussed above. For various reasons discussed below, it is more fruitful to deal with moments of the random fields. The key object of the analysis is the so-called K(q) function, describing the scaling behavior of the qth moments of the sequence of random density fields as the scale of resolution λ varies. At finer resolutions, the density fields appear more "spiky," and average q-powers of the fields for q>1 (q<1) get arbitrarily large (small) according to the power law

\langle \varphi_{\lambda}^{\,q} \rangle \propto \lambda^{K(q)}

where λ denotes the ratio of the outer scale to the scale of resolution. The boundary conditions K(0) = K(1) = 0 further constrain the K(q) curve.
Take a unit square with uniform density. Divide it into four quadrants and multiply the density in each quadrant by the corresponding element of a. Note that the average of the four elements of a is 1.0, so the average density across the entire square is unchanged. Repeat the procedure on each quadrant, recursively. In the limit, we have a multifractal.
At stage k, neighborhoods of the upper left corner have average density 2^k. That point has the highest degree of singularity.7 The lower left corner has a different sort of singularity, with density 0.6^k approaching zero as the scale shrinks. The entire lower right quadrant is empty (density zero). Like the Sierpinski triangle, in fact, the square is almost everywhere empty: at each stage k, the area with nonzero density is (3/4)^k, which approaches zero as k increases without bound. Figure 6 depicts the result.
6 It might be tempting to consider a random measure as a collection of random variables indexed by subsets of the underlying R^D space, but that quickly becomes awkward to work with.
7 Countably many other points have the same degree of singularity. These are the "upper left corners" of nonzero subcells; at all stages k after some stage a, they have density m·2^{k-a}.
A random version of the Sierpinski multifractal can be seen in Figures 7 and 8. Here, the
positions of the elements of a are randomly shuffled at each downward step in the
cascade. Statistically, the random and regular versions are identical, but visually, the
random version suggests phenomena taken from biology or geography.
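A two-dimensional cascade of this kind is equally short to code. The array below is our own guess at the unreproduced generator a; it is not taken from the paper, but it matches the stated properties (upper-left factor 2, lower-left 0.6, one empty quadrant, average 1.0). Setting shuffle=True gives the randomized version of Figures 7 and 8.

```python
import numpy as np

# Assumed generator array: the published matrix a was not reproduced in the text.
A = np.array([[2.0, 1.4],
              [0.6, 0.0]])        # average of the four factors is 1.0

def quadrant_cascade(a=A, stages=7, shuffle=False, seed=0):
    """Multiplicative cascade on the unit square: at each stage every cell is split
    into four quadrants and multiplied by the elements of a (optionally shuffled
    independently cell by cell, as in the randomized version)."""
    rng = np.random.default_rng(seed)
    field = np.ones((1, 1))
    for _ in range(stages):
        n = field.shape[0]
        new = np.empty((2 * n, 2 * n))
        for i in range(n):
            for j in range(n):
                f = rng.permutation(a.ravel()).reshape(2, 2) if shuffle else a
                new[2 * i:2 * i + 2, 2 * j:2 * j + 2] = field[i, j] * f
        field = new
    return field

field = quadrant_cascade(stages=7, shuffle=True)
print(field.mean())    # stays 1.0: the cascade conserves average density
```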
Figure 9 shows the empirically fit and theoretical ("universal") K(q) curves for the
Sierpinski multifractal. The latter will be explained in the next section.
Universality Classes; Form of K(q)
By making certain plausible assumptions about the mechanisms generating a multifractal, we can arrive at a "universal" theory, akin to a central limit theorem, for multifractals. The critical assumption is that the underlying generator (analogous to the multiplicative factors in the matrix of the previous example) is a random variable with a specific type of distribution: the exponentiated extremal Lévy distribution. This is plausible because Lévy distributions generalize the Gaussian distribution in the central limit theorem.
This leads to a two-parameter family of K(q) curves:

K(q) = \begin{cases} \dfrac{C_1}{\alpha - 1}\left(q^{\alpha} - q\right), & \alpha \neq 1 \\ C_1\, q \log(q), & \alpha = 1 \end{cases}   (5)

where C_1 acts as a magnification factor and α, related to the tail index of the Lévy generator, determines curvature. These parameters in turn can be related to position and scale parameters μ and σ to be applied to a "standard" Lévy variable A_α(-1).
The derivation, and an introduction to Lévy variables, is presented in Appendix C.
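For reference, equation 5 is a one-liner to evaluate numerically; the sketch below (ours, not the authors' code) handles the α = 1 limiting case explicitly, and the parameter values shown are placeholders.

```python
import numpy as np

def K_universal(q, alpha, C1):
    """Universal moment scaling function K(q) of equation 5."""
    q = np.asarray(q, dtype=float)
    if np.isclose(alpha, 1.0):
        return C1 * q * np.log(q)            # alpha = 1 limiting case
    return C1 / (alpha - 1.0) * (q ** alpha - q)

print(K_universal([0.0, 1.0, 2.0], alpha=1.5, C1=0.5))   # K(0) = K(1) = 0
```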
Synthesis of Multifractals: Extremal Lévy Generators
In creating multifractals for liability applications, we adopt this still somewhat controversial theory of universality.8 That is, each step of a simulated multiplicative cascade is a multiplication by the random factor a given by equation 32 (Appendix C) for appropriately chosen parameters. A cdf of random step factors corresponding to the best universal fit to the Sierpinski cascade example above is shown in Figure 11 (thick curve). A multiplicative cascade with these random step factors could be used instead of the four-element array used above (shown as a thin step function) to construct a multifractal with roughly the same properties as the Sierpinski multifractal.
The Laplace transform of the logarithm of these factors takes on the particularly simple forms described in Appendix C. This fact is exploited in data analysis, as will be explained later in the discussion of Trace Moments.
8 The scope and relevance of the necessary conditions to real-world phenomena are hotly debated.
Example Spectrum Analysis: Insured Property Portfolios
A preliminary step, to be taken before fitting a K(q) curve to suspected multifractal data, is spectrum (Fourier) analysis. The key point is that a multifractal must possess a spectral density having a certain shape: a straight line in a log-log plot. Furthermore, the slope of that line has additional implications. Therefore, spectrum analysis is used as a screening step before applying multifractal analysis. The mathematics relating K(q) to spatial spectral density is presented in Appendix D.
The spatial distribution of the human environment has been studied in geography and human ecology. [Major] analyzed homeowners insurance property as a two-stage Poisson process. Multifractal approaches include the analysis by [Tessier et al. 1994] of the global meteorological network (i.e., locations of weather stations) and [Appleby]'s study of population in the USA and Great Britain. Until [Lantsman et al.], no one had studied the spatial distribution of insured property values (Total Insured Value, or "TIV"). [Lantsman et al.] show that some portfolios of insured homeowners properties display a spatial distribution consistent with multifractal behavior (over appropriate scales). Figure 12 shows the isotropic power spectra of the insured value density of five geographically distinct regions of an insured property portfolio.
The preparation of such graphs starts with a grid of insured values at a sufficiently small scale of resolution. First, accumulate insured value totals over a 2^Tm × 2^Tm grid over the U×U area. In practice, we have found Tm = 7 or 8 to be comfortable for Pentium-III class machines. If the data originates as individual observations (e.g., geocoded lat-lon locations), then each observation must be assigned to the appropriate grid cell. If the data originates as areal data (e.g., accumulated values for polygons), then the data must be allocated to the grid. In any case, make sure that L = U/2^Tm is larger than the resolution of the data. For analysis of large portfolios with ZIP-level data, we typically use U = 512 or 1024 miles9 with Tm = 6 or 7, resulting in a resolution of L = 8 miles, which is a bit bigger than the square root of the average area of a ZIP code.10
The second step is to compute the 2-dimensional discrete Fourier transform (DFT) of the array. The third is to convert to an isotropic power spectrum. Appendix D has details. Roughly speaking, the isotropic power spectrum reveals the strength (vertical axis) of various periodicities (horizontal axis) in the spatial data, averaged over all directions. The horizontal axis of Figure 12 represents the wavenumber (spatial frequency) r where, e.g., wavenumber r = 10 corresponds to a periodicity of 512/10 = 51.2 miles. The plots stop at the finest resolution of 8 miles, corresponding to wavenumber r = 512/8 = 64. The vertical axis represents the power (spectral density, i.e., squared Fourier component amplitude) P(r), with arbitrary constant factors used to separate the five curves.
10 Since most of the population resides in smaller, more densely populated ZIP codes, we feel that an 8-mile resolution is appropriate.
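A minimal version of the second and third steps is sketched below, assuming the insured values have already been accumulated into a square numpy grid; binning by integer wavenumber is one reasonable choice among several, and this is not the authors' production code.

```python
import numpy as np

def isotropic_power_spectrum(grid):
    """2-D DFT of a gridded density, reduced to an isotropic spectrum P(r)
    by averaging |FFT|^2 over annuli of integer wavenumber r."""
    n = grid.shape[0]
    H = np.fft.fft2(grid / grid.mean() - 1.0)     # remove the mean so r = 0 does not dominate
    power = np.abs(H) ** 2
    k = np.fft.fftfreq(n) * n                     # integer wavenumbers per axis
    kx, ky = np.meshgrid(k, k, indexing="ij")
    r = np.rint(np.sqrt(kx ** 2 + ky ** 2)).astype(int)
    A = np.zeros(r.max() + 1)
    counts = np.zeros(r.max() + 1)
    np.add.at(A, r.ravel(), power.ravel())        # accumulate A(r)
    np.add.at(counts, r.ravel(), 1.0)
    return A / np.maximum(counts, 1.0)            # average over the contributing cells

# Example (using the illustrative cascade field built earlier):
# P = isotropic_power_spectrum(quadrant_cascade(stages=7, shuffle=True))
```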
All but one curve show the smooth, loglinear relationship between power and wavenumber that is to be expected from a self-similar random field. The exception displays higher than expected spectral amplitude at wavenumbers 45-50 (about 11 miles) and less than expected at wavenumbers 30-35 (about 16 miles). This anomaly was traced to unique factors in this insurer's distribution channel. They had a strong affinity marketing program for military personnel. In Washington DC proper, the portfolio's spatial density of insured value was nearly zero. However, in two suburban enclaves adjacent to nearby military bases, the value density was among the highest observed in the region. The two groups were about 11 miles apart and 16 miles away from the center of DC. If not for this unusual geographic structure to the market, the power spectrum would have been similar to that of the other regions.
The dressed fields are obtained by aggregating the data to a series of coarser grids: for each T between 0 and Tm, the average density is computed over each (r,c) element of the 2^T grid.11 These represent the same field, but at progressively coarser scales of resolution.12 See Figures 3 through 5, mentioned previously. Note that for each grid, the average cell value is one. The coarsest grid, corresponding to T = 0 and scale U, consists of the single entry, one.
The fourth step is to compute qth powers of the dressed fields and look for a loglinear relationship between them and the scale. If multifractal scaling is present, we should see, for each fixed q, a linear relationship between T (the label identifying the coarseness of a grid, equal to log2 of the number of rows or columns in the grid) and the logarithm of the average of the qth power of the grid entries.
11 Recall the original data was at the ZIP code level of resolution, so entire ZIP codes were allocated to particular grid squares, introducing a bit of distortion at the smallest scales.
12 As a refinement of this process, we start with two grids, the 2^Tm-sided grid as described, as well as a slightly coarser 3·2^(Tm-2)-sided grid, and operate on them in parallel. This way, we get a factor of 1.5 or 1.33 (ideally it would be the square root of two) between adjacent scale ratios instead of a factor of two. This doubles the sample of scale ratios in the analysis.
Figure 13 shows this relationship for q = 0.6, 0.9, and 1.4. These so-called trace
moments are close enough to linear to make the multifractal model appropriate.
Having satisfied ourselves that scaling is present, the fifth step is to estimate K(q) values as coefficients in a linear regression version of equation 16 (Appendix B), for each of a range of values for q. A certain amount of judgment is called for, however, in choosing the range over which the regression should be carried out. [Essex] and [Lavallée et al.] discuss "symmetry breaking" that results from the limitations of sample data. The selected range of scaling must avoid these extremes in order to deliver unbiased estimates of moments, and hence undistorted K(q) estimates. Linear regression in this case suggests that K(0.6) = -0.2, K(0.9) = -0.1, and K(1.4) = 0.3. An example of the resulting empirical K(q) curve, based on slopes estimated from regressions of trace moments corresponding to q values ranging from 0.16 to 4.5, is shown in Figure 14.
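The fourth and fifth steps can be sketched as follows (our illustration, not the authors' code; the q values and the range of T retained in the regression are the judgment calls discussed above):

```python
import numpy as np

def trace_moment_Kq(finest, qs=(0.6, 0.9, 1.4)):
    """Estimate K(q) by regressing log <phi_T^q> on T*log(2), where the dressed
    fields phi_T are successive 2x2 block averages of the finest unit-mean grid."""
    fields = [finest / finest.mean()]
    while fields[-1].shape[0] > 1:
        f = fields[-1]
        n = f.shape[0] // 2
        # aggregate to the next coarser grid by averaging 2x2 blocks
        fields.append(f.reshape(n, 2, n, 2).mean(axis=(1, 3)))
    T = np.array([int(np.log2(f.shape[0])) for f in fields])
    K = {}
    for q in qs:
        log_moment = np.array([np.log((f ** q).mean()) for f in fields])
        slope, _ = np.polyfit(T * np.log(2.0), log_moment, 1)
        K[q] = slope          # convention: <phi_T^q> ~ 2**(T*K(q))
    return K

# Example: K = trace_moment_Kq(quadrant_cascade(stages=7, shuffle=True))
```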
Before considering how best to fit a universal K(q) to the empirical curve, we must address additional limitations of the methodology. The relation between K(q) and c(γ) (the latter the "box counting" exponent expressing the scaling behavior of the probability of extreme values) is given by a Legendre transform; there is a one-to-one correspondence between moments and orders of singularities [Tessier et al. 1993]. Realistic limitations of data (rounding low values to zero, finite sample size, bounded sample) can limit the range of observable singularities and consequently introduce distortions in the measured K(q). In addition, estimating the universal parameters C1 and α by nonlinear least squares may run afoul of a substantial degree of collinearity between the parameters.
For such instances, [Tessier et al. 1993, 1994] developed the double trace moments technique. This is based on the observation that if a universal field is exponentiated first by η, then averaged to scale λ, then exponentiated to q, the resulting moment scaling exponent satisfies K(q, η) = η^α K(q).
In this case, a standard two-parameter nonlinear regression does fine, with α = 0.66 and C1 = 0.72 obtained. The resulting theoretical K(q) curve is compared to the empirical version in Figure 16.
While working with exposure data aggregated to the ZIP code level may suffice for hurricane risk analysis, it does not for thunderstorm wind, tornado, or hail perils. On the one hand, the average size of a ZIP code is 8 by 8 miles, and the distribution of properties over the area is typically very sparse, irregular, and non-uniform. A damage potential (expected damage rate) field representing a hail or tornado event is of a comparable scale (scattered patches less than a mile wide by a few miles long for hail; narrower and longer for tornadoes), and it, too, is highly non-uniform (e.g., 90% of the damage potential from a tornado occurs in less than 5% of its area). Given that the details of the hazard and exposure fields must be superimposed to obtain a reasonable estimate of losses sustained, one can appreciate the difficulty of working with aggregate data.
Previous solutions to the problem were simplistic and reflected a characterization of TIV over the area either as regularly or randomly uniformly distributed or, at the other extreme, concentrated at a single point (i.e., the area's centroid). The result of this kind of misrepresentation is a critical misestimation of the variability inherent in the process of loss generation. Figures 17 and 18 illustrate this.
Figure 17 is a map of a portion of a real homeowner property portfolio. The scale is 8 miles on a side, the average size of a ZIP code. Figure 18 shows a realization of the same number of homes assuming a uniform spatial point process. The true portfolio shows more "clumps and gaps" than the relatively smoother uniform random version. Figure 19 shows the results of applying the multifractal model. While it does not reproduce the original portfolio (no random model would be expected to), it does appear to exhibit the same spatial statistics. When intersected with a number of simulated damage footprints from hail or tornadoes, it will clearly do a better job of estimating the damage probability distribution than will either the uniform random version or a version that puts all the properties at the center of the figure. The uniform distribution will result in too many small loss events and not enough large loss events, and vice-versa for the centroid.
The construction of a synthetic geocoding proceeds as follows:
1. Create a multifractal field over the area in question. Typically, we use a five- to seven-stage process, depending on the outer scale. A seven-stage process divides a square into 2^7 × 2^7 = 16,384 grid cells; this is sufficient to carve an 8-mile square into 2.5-acre parcels. At each stage i = 0 to Tm, instantiate a 2^i × 2^i array of independent and identically distributed exponentiated extremal Lévy random variables (see equation 32 of Appendix C).13 In the example of Figure 19 we used the parameters α = 0.8, C1 = 0.6. In [Lantsman et al.], we reported different parameters for industry and selected client portfolios.14 Combine factors via multiplicative cascade as described for the Sierpinski multifractal.
13 [Samorodnitsky & Taqqu] has an efficient algorithm for simulating Lévy variables.
14 Specifically, α = 1.024 and C1 = 0.560 for industry TIV measured at the ZIP code level, and α = 0.552, C1 = 0.926 for a geocoded client portfolio. The implications of this difference are discussed in that paper.
2. Normalize the field and use it as a probability distribution to drive a multinomial point process. If the area is a polygon other than a square, then grid cells must be identified as to being inside or outside the polygon. Outside grid cells are zeroed out; inside cell intensity values are divided by the total of all inside values to renormalize. Say the grid probability in cell i is p_i. The desired number of homes, N, is then allocated to the cells, N_i homes to cell i, by a multinomial(N, p_1, p_2, ..., p_{4^Tm}) joint random variable draw. In practice, this is implemented by a sequence of conditional binomial r.v. realizations. The first r.v. is N_1 ~ binomial(N, p_1). Subsequent cells' realizations are conditional on all that precede, viz., N_3 ~ binomial(N - N_1 - N_2, p_3/(1 - p_1 - p_2)), etc.
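A compact sketch of step 2 follows (ours; numpy's joint multinomial sampler is equivalent to the sequence of conditional binomials just described, so we use it directly):

```python
import numpy as np

def condense_to_points(intensity, n_homes, inside=None, seed=0):
    """Normalize a multifractal intensity grid to a probability distribution
    and allocate n_homes to grid cells by a single multinomial draw."""
    rng = np.random.default_rng(seed)
    p = np.where(inside, intensity, 0.0) if inside is not None else intensity.copy()
    p = p.ravel() / p.sum()                    # renormalize over the inside cells
    counts = rng.multinomial(n_homes, p)       # joint multinomial allocation
    return counts.reshape(intensity.shape)     # homes per grid cell

# Example: counts = condense_to_points(quadrant_cascade(stages=7, shuffle=True), 500)
```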
The production of tornadoes and hail involves meteorological processes exhibiting
complex behavior over a wide range of scales, from synoptic weather patterns (thousands
of km) down to the size of the hailstone (millimeters or centimeters). We have made use
of multifractal modeling, not only to distribute property values in statistically appropriate
patterns, but directly in the simulation of the hazards themselves.
Multifractal modeling is not appropriate to all scales, however. Thunderstorms exhibit a
strong seasonality during the year, nonhomogeneity of occurrence frequencies over
distances of thousands of km, and anisotropy in terms of preferred directions of
movement. At the smaller scales, the structure of tornado tracks and hail streaks
(continuous bands of hailfall) are also highly idiosyncratic. In between, however, we
have found that the scale of the swath (tens to hundreds of km) on a single day is
amenable to multifractal modeling.
Figure 20 shows a set of reported hail occurrences for 3/30/98. Unfortunately, while
swaths may make conceptual and meteorological sense, data are not reported in swath
groupings. Before we can analyze swaths, we must identify them, using various tools
including Bayesian classification, modal clustering, and nearest-neighbor methods.
Figure 21 shows the same set of reports, now grouped into meaningful swaths.
In order to expand the data into a meaningful set of possible alternative scenarios, we have followed the practice of other modelers in using the historical data as a template for a synthetic "probabilistic database" of possible events. Figure 22 exemplifies the typical practice of equally displacing all reported events by some random X-Y translation vector.15 One of our innovations is to use multifractal modeling to create and recreate alternative detailed patterns within a given swath.
Our procedure is as follows:
1. Historical reports are grouped into swaths as mentioned above.
2. Swaths are characterized by a small number of key parameters: the location, size, orientation, and eccentricity of the bounding ellipse; the prevailing storm motion direction within it; and parameters describing the overall intensity level of the activity. In the case of hail, intensity is defined by a categorical type label and the total volume of hail (number of hailstones). In the case of tornado, intensity is defined by Fujita class-specific Poisson parameters for the number of touchdowns and two principal component scores defining the conditional distribution of tornado path lengths. In the case of non-tornado wind, intensity is defined as total wind power (watts).
3. When an historical swath is drawn from the database as a template for a simulated swath, the ellipse is gridded at the 1-km scale and a multifractal field (with parameters appropriate to the peril and type) is laid down over the grid. As described above for simulated geocoding, this field is "condensed" to a schedule of report (hail, tornado, or wind event) locations.
15 Since this translation is by no more than a degree in either direction, it is a bit difficult to see at first.
4. Details of each report (hail streak size and intensity details; tornado F-class
and track length, etc.) are drawn from conditional distributions, with
correlation induced with the intensity of the underlying multifractal field at
the point of condensation.
Figure 23 shows several realizations of the multifractal simulation of these particular
swaths. Note how they respect the ellipse boundaries, yet vary dramatically in their inner
detail. A much richer variety of possible outcomes is made possible, compared to simple
location-shift models, but the statistics of event properties and their spatial colocation are
still respected.
Conclusion
In this Part I paper, we introduced the ideas of fractal point sets and multifractal fields.
We showed that while those mathematical constructs are rather bizarre from a traditional
point of view (e.g., theory of smooth, differentiable functions), they nonetheless have
applicability to a wide range of natural phenomena, many of which are of considerable
interest to the casualty actuary. We showed how to analyze sample data from
multidimensional random fields, detect scaling through the use of the power spectrum,
detect and measure multifractal behavior by the trace moments and double trace moments
techniques, fit a "universal" model to the trace moments function K(q), and use that
model to simulate independent realizations from the underlying process by a
multiplicative cascade. In particular, we discussed synthetic geocoding and the
simulation of non-hurricane atmospheric perils.
In the companion Part II paper, we focus on time series analysis and financial applications.
Appendix A: Binomial Multifractal
This appendix establishes a relationship between orders of singularities and fractal dimension in the binomial multifractal on the half-open unit interval (0,1]. We follow the presentation in [Mandelbrot 1989].
Divide the interval into two halves (each open on the left) of equal length. Distribute 0<p<1 of the mass uniformly on the left half and 1-p of the mass uniformly on the right half (here p is a constant throughout all stages of the construction). Repeat on each subinterval.
At stage k of the construction, we have 2^k pieces of length 2^{-k}, of which k!/(h!(k-h)!) of them have density p^h(1-p)^{k-h}.
Any point x in the interval can be expanded as a binary number 0.b1b2b3....16 By considering the sequence of expansions truncated at b_k, we can make meaningful statements about the behavior of the measure at x. For example, define

f(k) = \frac{1}{k}\sum_{i=1}^{k} b_i   (7)

16 Since binary xyz0111... is the same as xyz1000..., let us agree to use only the 111... representation for such cases. (This is consistent with our closing the right side of intervals.)
The number of stage-k pieces sharing a given value of f = f(k) is the binomial coefficient k!/((fk)!((1-f)k)!), which by Stirling's approximation behaves like 2^{-k[f log2(f) + (1-f) log2(1-f)]} / √(2πk f(1-f)).
This gives us a fractal dimension d = -[f log2(f) + (1-f) log2(1-f)]. Since the exponent a = f log2(1-p) + (1-f) log2(p) is also a function of f, we have a functional relationship between the order of the singularity a and the fractal dimension d of the set of points having that exponent.
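The pair (a, d) is trivial to tabulate; the short sketch below (ours) simply evaluates the two formulas above over a grid of f values with p = 1/3, the value used in Figure 2.

```python
import numpy as np

def binomial_spectrum(p=1/3, f=np.linspace(0.01, 0.99, 99)):
    """Order of singularity a(f) and fractal dimension d(f) for the binomial
    multifractal, per the formulas in this appendix."""
    a = f * np.log2(1 - p) + (1 - f) * np.log2(p)
    d = -(f * np.log2(f) + (1 - f) * np.log2(1 - f))
    return a, d

a, d = binomial_spectrum()
print(d.max())   # the largest dimension, 1.0, occurs at f = 1/2
```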
Appendix B: Analysis of Multifractal Fields
A random field is called stationary17 if the distribution of φ(r1) is the same as that of φ(r2) for any distinct r1 and r2. This does not imply the two random variables are independent, however. For example, a multivariate normal may have identical marginal distributions but nonetheless possess a nontrivial correlation structure. A nonstationary field is said to have stationary increments if the distribution of φ(r1) - φ(r2) depends only on the difference vector r1 - r2. Furthermore, for D > 1, such a field is said to be isotropic if the distribution of φ(r1) - φ(r2) depends only on the magnitude of that vector, |r1 - r2|.
Our discussion follows [Novikov & Stewart], [Schertzer & Lovejoy], [Marshak et al.], and [Menabde et al.] in the general context of a D-dimensional space and for stationary fields. The generalization to non-stationary fields will be discussed in Appendix D.
Formally, consider a measure μ(X) whose domain consists of a σ-field of subsets X of R^D. Define the scale-L average density as:

\varphi_L(\mathbf{r}) = L^{-D}\,\mu(V_L)   (10)

where V_L is a hypercube of side L centered at r, and define the ratio

a_{L,U}(\mathbf{r}) = \varphi_L(\mathbf{r}) / \varphi_U(\mathbf{r})   (11)

where L < U, therefore V_L ⊂ V_U. This is only defined for nonzero values of φ_U, but note that when it is zero, so must be φ_L. We have the property that:

a_{L,U} = a_{L,P}\, a_{P,U}, \quad L < P < U   (12)

Stationarity implies that a_{L,U} is a random variable whose distribution does not depend on the position of the volume center r. Furthermore, we assume it depends only on the ratio L/U and that the random variables a_{L,P} and a_{P,U} in equation (12) are independent. This last statement is the technical definition of μ(X) being a statistically self-similar (a.k.a. scale-invariant, or scaling) random measure.
19 It is helpful to think of the measure μ as a physical quantity, such as mass, rather than a probability measure. That way, probability statements about the random μ will not be confused with statements about the properties of particular realizations of μ.
Scaling of Moments, K(q) Function
It is possible to show that under these assumptions the statistical moments of a_{L,U} have the property:

E(a_{L,U}^{q}) = E(a_{L,P}^{q})\, E(a_{P,U}^{q})   (13)

where E(·) is the expected value operator. Since the moments of a_{L,U} depend only on the ratio L/U, the most general expression for the scaling behavior of statistical moments is:

E(a_{L,U}^{q}) = (L/U)^{-K(q)}   (14)

This form reveals K(q) as the coefficient in a log-linear regression between the scale index T and the average q-power of the field, as used in empirical data analysis.
Appendix C: Universality Classes; Form of K(q)
To further explore the structure and behavior of the K(q) function we follow [Schertzer & Lovejoy], [Lovejoy & Schertzer 1990], and especially [Menabde et al.] and formalize the idea of a multiplicative cascade generator (MCG):
The random variables on the right-hand side of equation (18) are assumed to be independent and identically distributed random variables with a pdf:
which depends solely on the scale ratio, (L/U)^{1/n}. The property expressed in equations 19 and 20 implies that the probability density for G_{L,U} belongs to the class of infinitely divisible distributions [Feller]. The natural candidate for an MCG would therefore be a random variable with a stable Lévy distribution.
An aside on Lévy random variables is in order. Lévy random variables generalize Gaussian (normal) random variables in the Central Limit Theorem. The CLT states that the distribution of a sum of a set of N independent, identically distributed random variables with finite variance converges to a normal distribution as the number N increases without bound. More generally, if the restriction to finite variance is removed, we can say that the sum converges to a Lévy distribution.
Lévy distributions are characterized by four parameters: α, which must be in (0,2]; β, which must be in [-1,1]; and μ and σ>0, which are otherwise unrestricted. The latter two are location and scale parameters, respectively, allowing us to express a Lévy random variable as μ + σA_α(β), where A is "standardized" and depends on only two parameters. Note that σ is not the standard deviation because, in general, variance is infinite for a Lévy random variable. The parameter α is the tail index: the case α=1 gives the Cauchy distribution while the case α=2 gives the Normal distribution. As x increases without bound, the probability that a Lévy random variable exceeds x is proportional to x^{-α}. The second parameter, β, is a symmetry index: if β=0, then the distribution is symmetric; otherwise, the probability of the upper tail is proportional to 1+β and the probability of the lower tail is proportional to 1-β (in the large-x limit). When α=2, the β parameter becomes irrelevant, and is conventionally set to 0. While there is no closed-form expression for the distribution function for Lévy variables, the characteristic function is analytically tractable.20
To develop a moment scaling relation for the random multifractal μ(X) we apply the Laplace transform to the density function p(g; L/U):

\psi(s; L/U) = \int_0^{\infty} e^{-s g}\, p(g; L/U)\, dg   (21)

where s > 0.
Because p(g; L/U) is the pdf of an infinitely divisible distribution, from equation 18 we can conclude:

Z(s) = \int_0^{\infty} \frac{1 - e^{-s x}}{x}\, M(dx)   (24)

For the processes under consideration, with some degree of rigor we can limit ourselves to considering only measures M having a density M*. In such cases we can replace the Lebesgue integral with a Riemann integral, replacing M(dx) with M*(x)dx. It is this density function M* (or equivalently Z(s) or p(g; L/U)) that completely determines the properties of the MCG and therefore the (statistical) properties of the self-similar multifractal μ(X).
The expression in equation 21 could be considered as an expectation of exp(-sG_{L,U}) and can be rewritten as follows:

\psi(s; L/U) = E[\exp(-s G_{L,U})] = E\!\left[\exp\!\left(s \ln\!\frac{\varphi_L L^D}{\varphi_U U^D}\right)\right] = E\!\left[(\varphi_L/\varphi_U)^s\right](L/U)^{sD}   (26)
From equations 14 (Appendix B) and 27, after replacing s with q, we can get the following expression:
One can choose any form for the density measure M that satisfies the convergence and normalization conditions of equations 25 and 28. The most appealing measure is:

M^*(x) \propto x^{-\alpha}   (30)

(specifying only the limiting behavior for large x), which corresponds to a stable Lévy distribution [Feller]. With this choice of measure and proper renormalization we can express K(q) as:

K(q) = \begin{cases} \dfrac{C_1}{\alpha - 1}\left(q^{\alpha} - q\right), & \alpha \neq 1 \\ C_1\, q \log(q), & \alpha = 1 \end{cases}   (31)
This expression represents the classes of "universal generators" [Schertzer & Lovejoy]. The first remarkable thing to notice is that a universal generator is characterized by only two fundamental parameters (C1, α). The idea behind the introduction of universality classes is that whatever generator actually underlies the multiplicative cascade giving rise to a random multifractal, it may "converge" (in some sense) to a well-defined universal generator.
With only two degrees of freedom, the K(q) curves represented by universal multifractals are of a limited variety. As mentioned previously, K(q) is constrained to go through the points (0,0) and (1,0), with negative values when 0<q<1 and positive values for q>1. The parameter C1 clearly behaves as a vertical scaling factor. The α parameter affects the curvature, as can be seen in Figure 10, with the extreme case of α → 0 converging to a straight line (with a discontinuity at q = 0).
For this "universality" result to be useful, we must also investigate which classes of MCG are stable and attractive under addition and will at least converge for some positive moments (not necessarily of integer order). The task of specifying universality classes could be accomplished by considering the Lévy distribution in a Fourier framework, i.e., its characteristic function. The restriction imposed by the Laplace transform (equation 21) is that we require a steeper than algebraic fall-off of the probability distribution for positive order moments; hence, with the exception of the Gaussian case (α = 2), we have to employ strongly asymmetric, "extremal" Lévy laws (β = -1), as emphasized by [Schertzer & Lovejoy]. The Lévy location parameter μ is fixed by the normalization constraint and the scale parameter σ is derived from C1 [Samorodnitsky & Taqqu].
Roughly speaking, the universality theory states that multifractals built from random multiplicative cascades are statistically equivalent to those built from a special class, the exponentiated extremal Lévy variables:

a = \exp(\mu + \sigma A_{\alpha}(-1))   (32)

According to [Schertzer & Lovejoy], we can designate the following main universality classes by specifying the parameter α:
1. α = 2: the Gaussian generator is almost everywhere (almost surely) continuous. The resulting field is a realization of the log-normal multiplicative cascade introduced by [Kolmogorov], [Obukhov], and [Mandelbrot 1972] to account for the effects of inhomogeneity in three-dimensional turbulent flows (turbulent cascades).
2. 2 > α > 0: the Lévy generator is almost everywhere (almost surely) discontinuous and is extremely asymmetric.
3. α = 0+: this limiting case corresponds to divergence of every statistical moment of the generator and represents the so-called "β" model.
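Sampling cascade factors of the form of equation 32 can be sketched with scipy's stable-law sampler (beta = -1 gives the extremal case). Note the hedges: scipy's default parameterization need not coincide exactly with the A_α(-1) convention used here, and the location constant below is set by empirical renormalization rather than by deriving μ and σ analytically from C1.

```python
import numpy as np
from scipy.stats import levy_stable

def cascade_factors(alpha, n, sigma=0.1, seed=0):
    """Draw exponentiated extremal Levy cascade factors a = exp(mu + sigma*A_alpha(-1)).
    Here mu is fixed by empirical renormalization so that the factors average to 1;
    the analytic normalization implied by C1 (equation 31) is not reproduced."""
    rng = np.random.default_rng(seed)
    A = levy_stable.rvs(alpha, -1.0, size=n, random_state=rng)   # extremal: beta = -1
    a = np.exp(sigma * A)
    return a / a.mean()            # renormalize so the cascade conserves the mean

factors = cascade_factors(alpha=1.5, n=10_000)
print(factors.mean(), np.quantile(factors, 0.99))
```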
Appendix D: Spectrum Analysis; K(q) and Spectrum Slope
In this section, we explore the relation between the moment scaling function K(q) and the power spectrum of the stationary field φ_L that represents a random multifractal at the (sufficiently small) scale of resolution L.21 Recall that the power spectrum of a time series or one-dimensional stochastic process quantifies the magnitude (amplitude) of cycles of various lengths (periods). Spectral analysis generalizes to multidimensional fields by characterizing not only the amplitude and periodicity of such "waves" but their directions as well. An isotropic power spectrum averages the D-dimensional power spectrum over all directions, converting it to a one-dimensional spectrum.22
Because of the Fourier duality between the correlation function of the field and its power spectrum [Feller], it is customary in the analysis of empirical stochastic processes to examine the correlation structure of a process and then map it into Fourier space. But the correlation function is not well suited to analyzing non-stationary fields, so we need to develop some guidance as to how to check for stationarity and, if it exists, how to quantify the underlying field.
Because in the case of stationarity the functional form of the correlation function closely relates to the K(q) function, we can be reasonably confident in establishing a direct link between the power spectrum and the K(q) function of the field. Following [Menabde et al.], we demonstrate how it could be accomplished.
21 Historically, power spectrum analysis played a central role in identifying and characterizing the scaling properties of self-similar random fields. Recent advances [Marshak et al.] in understanding the limitations of applicability and sensitivity of power spectrum analysis lead one to realize that the issue of stationarity is critical in qualifying and quantifying intermittency of the field. The erroneous assumption that everything could be extracted from knowledge of the spectral exponent leads to a failure to discriminate between qualitatively different fields.
where δ( ) is a delta function (1 at 0, 0 elsewhere) and P(k) is the isotropic power spectrum. On the other hand, from equations 33 and 34 we can get the expression:
Convert this to the isotropic power spectrum by accumulating values |H(k,h)|^2 (i.e., complex magnitude squared) into one-dimensional array cells A(r), where r is the magnitude |(k,h)| rounded to the nearest integer. (Here, the vertical bars indicate vector magnitude, i.e., square root of sum of squares.) Then convert the A values to averages P by dividing each accumulated A entry by the number of H cells contributing to the entry.
Equation 37 could be utilized in many ways: to check a D-dimensional stationary isotropic field for self-similar (SS) properties, to verify the validity of a numerical approximation of the K(q) function at the point q = 2, or to examine a non-stationary field with stationary increments (Brownian motion and "fractional Brownian motion"). Note that P(k) and K(2) can be computed by independent methods from the same data, enabling one to verify the consistency of assumptions about stationary increments.
If we relax the assumption of stationarity, the problem of identification and characterization of SS fields develops some complications. We outline some important guidelines for handling non-stationary fields:
1. First of all, power spectrum analysis can still indicate self-similarity of the field under investigation, revealing the following form (a sketch of estimating the exponent appears after this list):

P(k) \propto k^{-\beta}   (40)

2. For D-dimensional fields, the condition β > D will indicate lack of stationarity, but some transformations of the original field (like power-law filtering or taking the absolute value of small-scale gradients) could produce a stationary field.
3. The spectral exponent β contains information about the degree of stationarity of the field. The introduction of a new parameter H (sometimes called the Hurst exponent) related to β could aid in the task of characterizing the degree of persistence or long-term memory of the field. We will illustrate the importance of the parameter H for time series in the Part II paper.
4. The argument that the correlation function is not well suited for non-stationary situations (because of its translation dependence) led to the development of new ideas about how the statistical properties of non-stationary fields can be properly estimated by spatial averaging procedures. The Wiener-Khinchine relation applicable to fields with stationary increments [Monin & Yaglom] states that it is the second-order structure function, not the correlation function, that is in Fourier duality with the power spectrum. We will introduce the structure function in the context of time series analysis in the Part II paper and illustrate how the structure function is the one-dimensional analog of the K(q) function.
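To make guideline 1 concrete, the spectral exponent β can be read off a log-log regression of the isotropic spectrum computed earlier. This is a sketch under the same illustrative assumptions as before; the fitted range of wavenumbers is a judgment call, just as with the trace moments.

```python
import numpy as np

def spectral_slope(P, r_min=2, r_max=None):
    """Fit P(r) ~ r**(-beta) over wavenumbers [r_min, r_max] by log-log regression."""
    r_max = r_max if r_max is not None else len(P) // 2   # stay within well-sampled annuli
    r = np.arange(r_min, r_max + 1)
    mask = P[r] > 0                                        # skip any empty annuli
    slope, _ = np.polyfit(np.log(r[mask]), np.log(P[r][mask]), 1)
    return -slope

# Example: beta = spectral_slope(isotropic_power_spectrum(field))
```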
A further refinement of the multiplicative cascade is to pass from the discrete cascade,
which is what has been described up to this point, to the continuous cascade. The idea
behind a continuous cascade is that rather than proceeding in identifiable steps, the
multiplicative transfer of intensity variation between scales happens continuously at all
scales. [Schertzer & Lovejoy] describe a method of implementing continuous cascades
by means of the Fourier transform.
The functional form for K(q) (equation 31 in Appendix C) could be extended to nonstationary fields, and fractional integration (power-law filtering in Fourier space) could be used to transform simulated stationary random fields to any desired degree of non-stationarity (in the sense of the spectral exponent β). This is considered more fully in the Part II paper.
References
Appleby, Stephen. (1996). "Multifractal Characterization of the Distribution Pattern of
the Human Population," Geographical Analysis, vol. 28, no. 2, pp. 147-160.
Barnsley, Michael (1988). Fractals Everywhere, Boston: Academic Press.
Davis, Anthony, Alexander Marshak, Warren Wiscombe, and Robert Calahan. (1994).
"Multifractal Characterizations of Nonstationarity and Intermittency in Geophysical
Fields: Observed, Retrieved, or Simulated," Journal of Geophysical Research v. 99, no.
D4, pp. 8055-8072.
Essex, Christopher. (1991). "Correlation Dimension and Data Sample Size," in Non-
Linear Variability in Geophysics: Scaling and Fractals, Daniel Schertzer and Shaun
Lovejoy, eds., The Netherlands: Kluwer Academic Publishers.
Feller, W. (1971). An Introduction to Probability Theory and its Applications, volume 2,
New York: Wiley.
Kolmogorov, A. N. (1962). "A Refinement of Previous Hypothesis Concerning the
Local Structure of Turbulence in Viscous Incompressible Fluid at High Reynolds
Number," Journal of Fluid Mechanics, vol. 13, pp. 82-85.
Ladoy, P., S. Lovejoy, and D. Schertzer. (1991). "Extreme Variability of Climatological
Data: Scaling and Intermittency," in Non-Linear Variability in Geophysics: Scaling and
Fractals, Daniel Schertzer and Shaun Lovejoy, eds., The Netherlands: Kluwer Academic
Publishers.
Lantsman, Yakov, John A. Major, and John J. Mangano. (1999). "On the Multifractal Distribution of Insured Property," to appear in Fractals.
Lavallée, D., D. Schertzer, and S. Lovejoy. (1991). "On the Determination of the Codimension Function," in Non-Linear Variability in Geophysics: Scaling and Fractals, Daniel Schertzer and Shaun Lovejoy, eds., The Netherlands: Kluwer Academic Publishers.
Lovejoy, Shaun, and Daniel Schertzer. (1990). "Multifractals, Universality Classes and Satellite and Radar Measurements of Cloud and Rain Fields," Journal of Geophysical Research v. 95, no. D3, pp. 2021-2034.
Lovejoy, Shaun, and Daniel Schertzer. (1991). "Multifractal Analysis Techniques and the Rain and Cloud Fields from 10^-3 to 10^6 m," in Non-Linear Variability in Geophysics:
Scaling and Fractals, Daniel Schertzer and Shaun Lovejoy, eds., The Netherlands:
Kluwer Academic Publishers.
Major, John A. (1999). "Index Hedge Performance: Insurer Market Penetration and Basis
Risk," in The Financing of Catastrophe Risk, K. Froot, ed., National Bureau of Economic
Research, Chicago: University of Chicago Press, pp. 391-426.
Mandelbrot, B. (1972). "Statistical Models of Turbulence," in Lecture Notes in Physics, vol. 12, M. Rosenblatt and C. Van Atta, eds., Springer Verlag, p. 333.
Mandelbrot, B. (1982). The Fractal Geometry of Nature, Freeman.
Mandelbrot, B. (1988). "An Introduction to Multifractal Distribution Functions," in Fluctuations and Pattern Formation, H. E. Stanley and N. Ostrowsky, eds., Kluwer.
Mandelbrot, Benoit B. (1989). "The Principles of Multifractal Measures," in The Fractal Approach to Heterogeneous Chemistry, D. Avnir, ed., Chichester: John Wiley & Sons.
Marshak, Alexander, Anthony Davis, Warren Wiscombe, and Robert Calahan. (1997). "Scale Invariance in Liquid Water Distributions in Marine Stratocumulus. Part II: Multifractal Properties and Intermittency Issues," Journal of the Atmospheric Sciences, vol. 54, June, pp. 1423-1444.
Menabde, Merab, Alan Seed, Daniel Harris, and Geoff Austin. (1997). "Self-Similar
Random Fields and Rainfall Simulation," Journal of Geophysical Research v. 102, no.
DI2, pp. 13,509-13,515.
Monin, A. S., and A. M. Yaglom. (1975). Statistical Fluid Mechanics, volume 2, Boston:
MIT Press.
Novikov, E. A., and R. Stewart. (1964). "lntermittency of Turbulence and Spectrum of
Fluctuations in Energy-Dissipation," Izv. Akad. Nauk SSSR, Ser. Geofiz. vol. 3, pp. 408-
412.
Obukhov, A. (1962). "Some Specific Features of Atmospheric Turbulence," Journal of
Geophysical Research v. 67, pp. 3011-3014.
Parisi, G., and U. Frisch. (1985). "A Multifractal Model of lntermittency," in Turbulence
and Predictability in Geophysical Fluid Dynamics and Climate Dynamtcs, Ghil, Benzi,
and Parisi, eds., North-Holland, pp. 84-88.
Pecknold, S., S. Lovejoy, D. Schertzer, C. Hooge, and J. F. Malouin. (1998). "'The
Simulation of Universal Multifractals," in Cellular Automata. Prospects in Astrophysical
Applications, J. M. Perdang and A. Lejeune, eds., World Scientific.
Samorodnitsky, Gennady, and Murad S. Taqqu, (1994). Stable Non-Gaussian Random
Processes. Stochastic Models with Infinite Variance, New York: Chapman and Hall.
Schertzer, Daniel, and Shaun Lovejoy. (1991). "Nonlinear Geodynamical Variability:
Multiple Singularities, Universality and Observables," in Non-Linear Variability in
Geophysics: Scaling and Fractals, Daniel Schertzer and Shaun Lovejoy, eds., The
Netherlands: Kluwer Academic Publishers.
Tessier, Y., S. Lovejoy, and D. Schertzer. (1993). "'Universal Multifractals: Theory and
Observations for Rain and Clouds," Journal ofApphed Meteorolog?,; vol. 32, February,
pp. 223-250.
Tessier, Y., S. Lovejoy, and D. Schertzer. (1994). "'Multifractal Analysis and Simulation
of the Global Meteorological Network," Journal of Applied Meteorology, vol. 33,
December, pp. 1572-1586.
Wilson, J., D. Schertzer, and S. Lovejoy. (1991). "Continuous Multiplicative Cascade
Models of Rain and Clouds," in Non-Linear Variabilio' in Geoph.vsics: Scaling and
348
Fractals, Daniel Schertzer and Shaun Lovejoy, eds., The Netherlands: Kluwer Academic
Publishers.
Woo, Gordon 0999). The Mathematics of Natural Catastrophes, London: Imperial
College Press.
349
Figures for Part I
John A. Major
Yakov Lantsman
[Figures for Part I are not reproduced here; only fragments survive in this extraction. Recoverable labels include: cascade realizations at Stage 1, Stage 2, Stage 4 and Stage 7 (with local maxima marked); an empirical vs. theoretical moment-scaling curve plotted against the exponent q; curves for alpha = 0.1, 0.5, 1, 1.5 and 2; a log-log cumulative probability plot; trace-moment plots for q = 0.6, 0.9 and 1.4 against log eta; geocoded scatter plots; and Figure 18: Poisson Simulation of Portfolio / Multifractal Simulated Geolocations.]
Actuarial Applications of Multifractal Modeling
Part II: Time Series Applications
by Yakov Lantsman, Ph.D. and John A. Major, ASA, MAAA
email:lant [email protected], [email protected]
Abstract
Multifractals are mathematical generalizations of fractals, objects displaying "fractional
dimension," "scale invariance," and "self-similarity." Many natural phenomena, inclu-
ding some of considerable interest to the casualty actuary (meteorological conditions,
population distribution, financial time series), have been found to be well-represented by
(random) multifractals. In part II of this paper, we show how to fit multifractal models in
the context of one-dimensional time series. We also present original research on the
multifractality of interest rate time series and the inadequacy of some state-of-the-art
diffusion models in capturing that multifractality.
Introduction
In the accompanying part I paper, we introduced the ideas of fractal point sets and
multifractal fields. We showed that those mathematical constructs are applicable to a
wide range of natural phenomena, many of which are of considerable interest to the
casualty actuary. We showed how to analyze sample data from multidimensional random
fields, detect and measure multifractal behavior, fit a "universal" model, and use that
model to simulate independent realizations from the underlying process. In particular, we
discussed synthetic geocoding and the simulation of non-hurricane atmospheric perils.
The theory of self-similar random time series is more fully developed than the general
multidimensional case. In this part II paper, we focus on time series analysis and financial
applications. We present some additional theoretical machinery here and discuss
applications to weather derivatives and financial modeling.
Time Series
Introduction to Multifractal Time Series Analysis; Structure Function
Financial and geophysical time series feature a large range of time scales and they are
governed by strongly non-linear processes; this suggests the possible applicability of
scaling (multifractal) models. We consider a random process X(t) defined on the time
segment [0, T]. The process X(t) has variously represented exchange rates, interest rates,
temperature and precipitation in our work.
As in the two-dimensional case, scale invariance is most readily tested by computing
P(k), the power spectrum of X(t). In the case of a one-dimensional time series, standard
techniques of spectral (Fourier) analysis are available in many off-the-shelf statistical and
mathematical packages, including Microsoft EXCEL.
For a scaling process, one expects power law behavior:
P(k) ∝ k^(−β)    (1)

over a large range of wave-numbers k (inverse of time). If β < 1, the process is stationary
in the most accepted sense of the word [1], that is, X(t) is statistically invariant under
translation in t. If 1 < β < 3, the process is non-stationary but has stationary increments
and, in particular, the small-scale gradient (derivative or first difference) process will be
stationary. Introducing the Hurst exponent H (0 < H < 1), a parameter describing the
degree of stationarity of X(t), we can express the exponent β as follows:

β = 2H + 1    (2)
We can demonstrate a wide range of self-similar processes by changing the Hurst
exponent: Brownian motion (H = 0.5, β = 2), "anti-persistent" fractional Brownian
motion (0 < H < 0.5, 1 < β < 2), and "persistent" fractional Brownian motion (0.5 < H < 1,
2 < β < 3). This is the class of additive models. The last has become popular for
modeling financial time series.
Most financial and geophysical time series demonstrate non-stationary behavior. This
creates major complications if power spectrum analysis is the only available tool. It is
well known [2] that knowledge of β alone is insufficient to distinguish radically different
types of statistical behavior (the phenomenon of "spectral ambiguity"). It is not difficult
to construct two processes with identical power spectra - one additive and sufficiently
smooth, and the other multiplicative with a high degree of intermittency. But such
cases can be resolved with the help of multifractal analysis, which can be viewed as an
extension into the time domain of scale-invariant spectral analysis.
An appealing statistical characteristic to use in exploring time series is the structure
function. Structure function analysis of processes with stationary increments consists of
studying the scaling behavior of non-overlapping fluctuations ΔX_τ(t) = |X(t + τ) − X(t)|
for different time increments τ. One estimates the statistical moments of these
fluctuations, which - assuming both scaling (1) and statistical translational invariance in
time (i.e., the property of stationary increments) - depend only on the time increment τ
in a scaling way:

E(ΔX_τ(t)^q) = E(ΔX_T^q) (τ/T)^ζ(q)    (3)

where E(ΔX_T^q) is a constant (T is the fixed largest time scale), q > 0 is the order of the
moment, and ζ(q) is the scale-invariant structure function. The expectation E(ΔX_τ(t)^q)
is assumed finite for q in an interval [0, q_max). The structure function ζ(q) is a focal
concept in the one-dimensional theory of multifractals.

We examine some properties of ζ(q). By definition, ζ(0) = 0. Davis et al.
[1] show that ζ(q) is concave: d²ζ(q)/dq² ≤ 0. This is sufficient to define a
"hierarchy of exponents" using ζ(q):
H(q) = ζ(q) / q    (4)
It can also be shown that H(q) is a non-increasing function. The second moment is linked
to the exponent β as follows:

β = 1 + ζ(2) = 2H(2) + 1    (5)

Obtaining ζ(q) or, equivalently, H(q) is the goal of structure function analysis. A
process with a constant H(q) function can be classified as "monofractal" or
"monoaffine"; in the case of decreasing H(q), multifractal or "multiaffine."
Additive processes can be shown to have linear ζ(q) or constant H(q). For Brownian
motion and fractional Brownian motion (viewed as an order-h fractional integral of
Gaussian white noise) we have:

ζ(q) = q(h − ½)    (7)

Note that Brownian motion corresponds to h = 1 (an ordinary integral of Gaussian white
noise), which gives H = ½.
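As a quick consistency check (ours, not the paper's, but it follows directly from (2), (5) and (7)): setting h = 1 in (7) gives ζ(q) = q/2, hence ζ(2) = 1 and

β = 1 + ζ(2) = 2,

which agrees with β = 2H + 1 at H = ½, the ordinary Brownian case.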
In the case of the more exotic "Lévy flight" (additive processes with Lévy noise) the
behavior of ζ(q) is still linear. In this case, there is a Lévy index α (0 ≤ α ≤ 2), which
characterizes the divergence of the moments of the Lévy noise. In general ζ(q) diverges
for q > α, but for finite samples we obtain the following ζ(q) function for a Lévy flight
of index α:

ζ(q) = qH − [C₁ / (α − 1)] (q^α − q)    (9)

where H = ζ(1) (the same H as in (2)), C₁ is a parameter with the same role as in equation (31)
of part I, and α is the Lévy index.
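A brief check of (9), again ours rather than the paper's: the correction term vanishes at q = 1, so ζ(1) = H, consistent with the identification just made; and combining (9) with (5),

β = 1 + ζ(2) = 2H + 1 − [C₁ / (α − 1)] (2^α − 2),

so any intermittency (C₁ > 0) pulls the spectral exponent below the value 2H + 1 that a purely additive process with the same H would have.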
Analogous considerations lead us to modify part I's equation (31) to express the
K(q) function for a non-conservative field:

K(q) + qH = [C₁ / (α − 1)] (q^α − q)    for α ≠ 1
K(q) + qH = C₁ q log(q)                 for α = 1        (10)

where the H parameter is the degree of non-stationarity of the process. In other words,
one first brings the field to a state of stationarity (by fractional differentiation, i.e., power-law
filtering in Fourier space, or by a small-scale gradient transformation) to eliminate the linear
part qH, and then proceeds with the analysis as for conservative fields.
To summarize, the basic steps are:
1. Examine the data for evidence of intermittency and self-similarity; this could be
accomplished by studying the power spectrum.
2. Establish the status of multifractality (or monofractality) and qualitatively
characterize the system under investigation; for this, we use the structure function.
3. Fit model parameters to the universal form of ζ(q).
4. Simulate, using multiplicative cascade techniques based on the universal form of the
generator.
5. Apply, including, possibly, drawing inferences about the underlying process.
To compensate for the consequences of these characteristics, the number of parameters in
the "classical" models has been increasing over time. If this continues unchecked, it
could make models unstable and decrease their predictive power.
We distinguish two major classes of models in use by practitioners today: continuous
time stochastic diffusion models ("diffusion models") and discrete time series models
("discrete models").
Diffusion models build on the well-understood theory of Brownian motion. The
development of stochastic calculus (in particular, Itô integrals) and the theory of martingales
created the essential mathematical apparatus for equilibrium theory. The assumption of
arbitrage free pricing (rule of one price) has a very elegant mathematical interpretation as
a change of stochastic measure and the transformation to a risk-neutral stochastic process.
Application of diffusion models is a crucial element in the valuation of a wide variety of
financial instruments (derivatives, swaps, structured products, etc). Researchers have,
however, long recognized major discrepancies between models based on Brownian
motion and actual financial data, including long-term memory, volatility clustering and
fat tails. To resolve these problems some extensions of diffusion models were offered.
Often, this means introducing more stochastic factors, creating so-called multi-factor
models.
Modern discrete models extend classical auto-regressive (AR) moving average (MA)
models with recent advances in the parameterization of time-conditional density
functions. These include ARCH, GARCH, PGARCH, etc. Discrete models have been
partially successful in compensating for lack of long-term memory, volatility clustering
and fat tails, but at the cost of an increasing number of parameters and structural
equations. Using appropriate diagnostic techniques one can demonstrate that the
statistical properties of discrete models (viz., self-similarity of moments, long-term
memory, etc.) are essentially the same as for Brownian motion.
There is a third class of models, in little use by practitioners, but familiar to academics.
This group constitutes the so-called additive models, including fractional Brownian
motion, Lévy flight and truncated Lévy flight models. These models can replicate the mono-
fractal structure of underlying processes - their corresponding structure functions ζ(q)
((7), (8)) are linear - but they cannot produce multifractal (nonlinear) behavior.
1 A similar "Ptolemaic crisis" afflicted meteorological precipitation modeling in the 1980s. See, e.g., the
Water Resources Research special issue on Mesoscale Precipitation Fields, August 1985.
of the straight line is the parameter β, here equal to 1.592. This value suggests the
underlying process may be non-stationary but with stationary increments.
An important application of multifractal analysis is to characterize all order moments for
the validation of a scaling model. The appropriate tool to do this for the particular case of
a time series is structure function analysis.
To apply the structure function method, we rewrite equation (3) in logarithmic form:

log[E(ΔX_τ(t)^q)] = log[E(ΔX_T^q)] + ζ(q){log(τ) − log(T)}    (11)

with the expectation estimated by the sample average

E(ΔX_τ(t)^q) ≈ (1/N) Σ_t |ΔX_τ(t)|^q    (12)

(see Fisher et al. [4]). We then plot log[E(ΔX_τ(t)^q)] against log(τ) for various values
of q and various values of τ. Linearity of these plots for given values of q indicates self-
similarity. Linearity can be checked by visual inspection or by more sophisticated
techniques (e.g., a significance test for higher-order regression terms). The slope of the
line, estimated by least-squares regression, gives an estimate of the scaling function ζ(q)
for that particular q.
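To make the recipe concrete, here is a minimal VBA sketch (ours, not the authors'); it estimates ζ(q) for a single order q by regressing the log of the sample moment in (12) on the log of the lag. The range name "Series" and the choice of dyadic lags up to 256 are assumptions, so the series should hold at least a few thousand observations:

Function EstimateZeta(q As Double) As Double
    Dim x As Variant, n As Long, tau As Long, t As Long
    Dim m As Double, k As Long
    Dim logTau(1 To 8) As Double, logMom(1 To 8) As Double
    x = ActiveSheet.Range("Series").Value          'n x 1 array of observations
    n = UBound(x, 1)
    For k = 1 To 8                                 'dyadic lags tau = 2, 4, ..., 256
        tau = 2 ^ k
        m = 0
        For t = 1 To n - tau
            m = m + Abs(x(t + tau, 1) - x(t, 1)) ^ q
        Next t
        m = m / (n - tau)                          'sample moment E(|dX_tau|^q)
        logTau(k) = Log(tau)
        logMom(k) = Log(m)
    Next k
    'least-squares slope of log-moment against log-lag estimates zeta(q)
    EstimateZeta = Application.WorksheetFunction.Slope(logMom, logTau)
End Function

Calling EstimateZeta for a grid of q values and plotting the results against q reproduces the kind of curve shown in Figures 4 and 8.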
The structure function, mapping q to its slope, is depicted in Figure 4. Here, we also draw
an envelope of two straight lines corresponding to Brownian motion (slope 0.5) and
fractional Brownian motion (slope 0.6), respectively. The non-linear shape of the
empirical curve is the signature of multifractality.

Having established the existence of multifractality in the data, we can move to the next
step - fitting parameters. In the case of a one-dimensional (time series) field, we use
equation (9) to find the universal parameters. For the FX data, the universal parameters are: H =
0.532, α = 1.985, C₁ = 0.035.
indicates that this interest rate series might be modeled by a non-stationary process with
stationary increments.

Figure 8 represents the ζ(q) curve for interest rates, with the same Brownian motion and
fractional Brownian motion lines that we used for the FX analysis overlaid on the graph
for reference. Again, the signature of multifractality is clearly present in the data. We
obtain the following universal values: H = 0.612, α = 1.492, C₁ = 0.095. These values
could be used to simulate interest rates by applying a multiplicative cascade technique.
Andersen-Lund is Not Multifractal

We present an original analysis of the three-factor Andersen-Lund model of interest rates
and show that even this model, with its highly complex structural equations and difficult
fitting techniques, cannot replicate key features of empirical interest rate data.
The general form of the diffusion model (Vetzal [5]) is

dr_t = θ·dt + σ·dW_t    (14)

where r_t is the interest rate at time t, θ is the average growth rate of the process, and σ is
a volatility scale parameter.
Perhaps the most sophisticated of the analytically tractable models is the Cox-Ingersoll-
Ross (CIR) model [7]:

dr_t = κ(θ − r_t)·dt + σ·√r_t·dW_t

where κ is the mean reversion constant and θ is the global mean of the process. CIR adds
realism to the Merton model by introducing mean reversion and volatility that is
functionally dependent on the level of the rate.
Visual analysis of the interest rate time series graphs (as well as statistical diagnostics)
reveals several distinctive features of US interest rates that cannot be accommodated by
the CIR model.
1. Local trends in interest rate movements, indicating a changing mean to which the
process reverts.
2. Heteroscedasticity that is not simply a function of the level of the rates.
3. Volatility clustering.
To address these limitations of CIR and previous models, Andersen and Lund [8] intro-
duced the following (analytically intractable) three-factor model:
simulated by the A-L model and that of Brownian motion are nearly identical; the
stochastic process underlying the A-L model appears to be monofractal. 2
The fundamental difference in scaling behavior revealed by the structure function
comparison could lead to qualitatively different time series behavior. The universal
parameters fit to the empirical process in the previous section indicate that the underlying
mechanism should have a multiplicative cascade structure with (approximate) Lévy
generator, rather than an additive process of information accumulation (Brownian motion
type). Paraphrasing Müller et al. [10], the large scale volatility predicts small scale
volatility much better than the other way around. This behavior can be compared to the
energy flux in hydrodynamic turbulence, which cascades from large scales to smaller
ones, not vice-versa.
Conclusions
In the companion part I paper, we introduced the ideas of fractal point sets and
multifractal fields. We showed that while those mathematical constructs are rather
bizarre from a traditional point of view (e.g., theory of smooth, differentiable functions),
they nonetheless have applicability to a wide range of natural phenomena, many of which
are of considerable interest to the casualty actuary. We showed how to analyze sample
data from multidimensional random fields, detect scaling through the use of the power
spectrum, detect and measure multifractal behavior by the trace moments and double
trace moments techniques, fit a "universal" model to the trace moments function K(q),
and use that model to simulate independent realizations from the underlying process by a
multiplicative cascade. In particular, we discussed synthetic geocoding and the
simulation of hail and tornadoes.
In this part II paper, we showed how to analyze time series through the structure function,
and showed particular examples of foreign exchange and interest rate time series. We
discussed the variety of time series models in use by practitioners and theoreticians and
showed how even state-of-the-art diffusion models are not able to adequately reflect the
multifractal behavior of real financial time series.
The field of stochastic modeling is constantly growing and evolving, so the term
"Copernican revolution" might be too strong to describe the advent of multiplicative
cascade modeling. Nonetheless, multifractals have clearly taken hold in the realm of
geophysical and meteorological modeling, and it seems clear that they will eventually
find their place in the world of financial models, as well. However, there are still
numerous open questions, such as how to implement arbitrage-free pricing, that need to
be answered before multifractal models can replace diffusion models as explanations of
market pricing mechanisms.
References
1. A. Davis, A. Marshak, W. Wiscombe, and R. Cahalan, "Multifractal characterizations of nonstationarity and intermittency in geophysical fields: Observed, retrieved, or simulated," Journal of Geophysical Research, Vol. 99, No. D4, pp. 8055-8072, April 20, 1994.
2. D. Schertzer and S. Lovejoy, "Physical modeling and analysis of rain and clouds by anisotropic scaling multiplicative processes," Journal of Geophysical Research, Vol. 92, pp. 9693-9714, 1987.
3. F. Schmitt, D. Schertzer, and S. Lovejoy, "Multifractal analysis of foreign exchange data," submitted to Applied Stochastic Models and Data Analysis (ASMDA).
4. A. Fisher, L. Calvet, and B. Mandelbrot, "Multifractality of Deutschmark / US dollar exchange rates," Cowles Foundation Discussion Paper #1165, 1997.
5. K. Vetzal, "A survey of stochastic continuous time models of the term structure of interest rates," Insurance: Mathematics and Economics #14, pp. 139-161, 1994.
6. R. C. Merton, "Theory of rational option pricing," Bell Journal of Economics and Management Science, Vol. 4, pp. 141-183, 1973.
7. J. Cox, J. Ingersoll, and S. Ross, "A theory of the term structure of interest rates," Econometrica #53, pp. 385-407, 1985.
8. T. Andersen and J. Lund, "Stochastic volatility and mean drift in the short rate diffusion: sources of steepness, level and curvature in the yield curve," Working Paper #214, 1996.
9. A. Gallant and G. Tauchen, "Estimation of continuous time models for stock returns and interest rates," Manuscript, Duke University, 1995.
10. U. Müller, M. Dacorogna, R. Dave, R. Olsen, O. Pictet, and J. von Weizsäcker, "Volatilities of different time resolutions - Analyzing the dynamics of market components," Journal of Empirical Finance #4, pp. 213-239, 1997.
Figures for Part II
Yakov Lantsman
John A. Major
[Figures for Part II are not reproduced here; only axis scales survive in this extraction. The structure-function plots referenced in the text as Figures 4 and 8 are among them.]
Let Me See:
Visualizing Actuarial Information
Aleksey S. Popelyukhin, Ph.D.
"Human inside"
Dossier
Aleksey Popelyukhin is a Senior Vice-President of Technology with Sam Sebe LLC and a
Vice-President of Information Systems with Commercial Risk Re in Stamford, Connecticut.
He holds a Ph.D. in Mathematics and Mathematical Physics from Moscow University (1988).
• Prize for the best 1997 article in the "Data Management discussion paper" program, entitled
"The Big Picture: Actuarial Process from the Data Management Point of View" (1996)
• Creation and distribution of popular actuarial utilities like Triangle Maker™ (1994) and
Triangle Maker™ Pro (1997), Actuarial Toolchest™ (1998) and Enabler™ (1999)
• Design, development and coding of the 2nd and 3rd (current) generation of the very powerful
and flexible actuarial software package called Affinity (1996)
• Promotion (through his papers and presentations) of his notions like the Ideal Actuarial System
and Data Quality Shield, and paradigms like object-oriented actuarial software and data-
driven visualization.
Let Me See:
Visualizing Actuarial Information
Aleksey S. Popelyukhin, Ph.D.
Abstract
No one would argue that there are limits on how much information a human being can perceive,
process and comprehend. Even as advances in computer technology throw more and more data at
actuaries, these limits stay the same. It is time to delegate to computers the very important task of
presentation of information.
The article will try to demonstrate how existing data-driven technologies can help to evolve an
Ideal Actuarial System from an actuarial tool into a company's Alarm System. Utilizing tools
readily available to everyone who owns a contemporary Office Suite package, actuaries can
present information in such a way that the effectiveness of Corporate Decision Making, Data
Error Detection and suitability of Actuarial Algorithms will increase dramatically.
Actuarial results, properly combined, summarized and filtered by importance, may be arranged
into a so-called Digital Dashboard that serves as a portal into the wealth of detailed actuarial
information and the calculations behind it. This article itself can be considered a portal that
refers actuaries to the wealth of information on visualization techniques and data-driven
technologies.
Let Me See:
Visualizing Actuarial Information
Aleksey S. Popelyukhin, Ph.D.
Introduction
"Let's see..."
to Polyphemus

The Actuarial Process, like every analytical process, consists of three major stages:
• Data Preparation
• Application of Algorithms
• Representation of Results

Figure 1

The amount of information to be processed is so vast, though, that there is no way to do it without
computers.
But computers are used strikingly differently throughout the process: in the first two stages we
serve computers (feeding them with data and supplying them with actuarial parameter selections),
and only in the last stage do computers serve us (providing us with answers for making decisions).
The irony here is that the first two stages (data transformation and application of algorithms) have
been automated to a fair degree, while the presentation-of-results stage still remains the least
computerized of all, and the tools available for reporting and visualization are severely underused by
actuaries.
At~ortuletms
The amount of numeric information at all stages of the Actuarial Process is either exceedingly
massive (raw data), overwhelming the recipient, or too small (summary) to adequately represent
all the nuances and data patterns.
Human perception relies heavily on Short-Term Memory (STM) - a small "buffer" where
external information is recognized and perceived (see [1]). Unfortunately, STM is limited in
capacity - it can hold only 5-7 similar items at once. Another problem with STM is that new
information replaces old (it's just a "buffer"), and new information is coming in continuously.
Every change (in color, size or position) attracts a person's attention and changes the focus of his
perception. Even in the best-case scenario, without any distractions, STM can hold information
only 30 seconds or so*.
A few approaches seem to alleviate these limitations of human perception (and even exploit its
weaknesses to attract a person's attention toward the important information): visualization,
adaptive reporting and alarm systems. Indeed, "traditional" ways of displaying just myriad
"boring" numbers in a spreadsheet are not adequate anymore. Without proper assistance, it is
practically impossible to notice imperfections in the data; the inapplicability of a particular
actuarial technique; or to pinpoint a claim, line or location that demonstrate unusual development.
The solution lies in augmenting standard report techniques with information filtered by
importance. It means that only a few outstanding items with alarming behavior show up (or
somehow get highlighted) in the report. The task of defining alarms and assigning levels of
importance to actuarial results lies squarely on actuarial shoulders.
It is important to realize that nowadays, with the proliferation of Office Suite software, a wide
variety of visualization tools is within the reach of every actuary. Almost every chapter of this
article is illustrated by an example from an Office program. Equally important, one can safely
assume that everybody understands the text of a BASIC program. Coding in VBA has become
a skill nearly as essential and vital as reading and writing.
* Conduct the following experiment: read a new telephone number digit-by-digit and then (without
attempting to repeat it, combine digits into groups, or make associations and other mnemonic rules), after 30
seconds pass, try to dial it. Even without distraction it is practically impossible - that is probably why 411-
type services repeat telephone numbers at least two times.
Adaptive Reporting
"Data is not necessarily Information."
Every company almost certainly has established a fixed way of presenting the results of actuarial
analysis. The overwhelming majority of these presentations are static reports with predetermined
content and layout: think of it as a list of reserves for 100 lines of business or a list of net present
values of premium for 1000 treaties. There is nothing wrong with that way of presenting
information, except that human perception cannot effectively span beyond 5-7 similar items.
Nobody can guarantee that equal attention will be paid to every item in a long, monotonous
report. To alleviate this problem, sometimes information is presented in a summarized form
without important details. Either way, important information about the 68th LOB or the 729th
treaty may escape the reader's attention.
The solution lies in the use of data-driven technologies to create dynamic or adaptive reports -
reports whose size, shape and format adapt to the data. Placing these reports in an interactive
environment such as a spreadsheet allows the user to interact dynamically with the report
(effectively creating a whole family of reports rather than a single one), shaping it to the level of
detail that suits the user. The techniques described below include:
• Filtering,
• Outlining,
• Sorting,
• OLAP-enabled tools.
Filtering
The simplest and most straightforward way to reduce the amount of information displayed in the
report is Filtering. If information is organized as a list or a table in a spreadsheet and there is an
easy way to define a relevant subset, then Filtering fits the bill.
The AutoFilter feature of Microsoft Excel is a powerful and elegant implementation of Filtering.
Accessible and customizable through either the interface or the VBA "macro" language,
AutoFilter serves as an ideal Filtering tool*.
* Reporting and Visualization tools, including Filtering, are available in many products, not only
in Excel. In fact, database products with a built-in SQL language provide much more powerful and
robust Filtering tools. However, these products may be outside the reach of many actuaries.
One can use this tool, for example, to limit the visible data table to a particular LOB, location
and Open/Closed status. Another interesting use of Filtering tools, and Excel's AutoFilter in
particular, is checking for all distinct values in a list: AutoFilter's drop-down boxes display a
sorted list of all distinct values present in a column. The fastest way to check if a huge
spreadsheet populated with data contains all requested LOBs, States or Policy Numbers is to
initiate AutoFilter and click on the down-arrow in the corresponding column.
Figure 2
Example 1. For an illustration of using AutoFilter for something less straightforward than
plain-vanilla filtering (i.e., by LOB or Location), observe how to use it to filter losses in
the 90th percentile of their Incurred Value:

Const colLOSS As Integer = 6    'Loss value is located in the column number colLOSS

Sub CreativeUseOfAutoFilter()
    Dim rRange As Range, nRows As Long
    ActiveSheet.Cells(1, 1).Select
    With Selection
        nRows = .CurrentRegion.Rows.Count
        Set rRange = .Offset(1, colLOSS - 1).Resize(nRows - 1, 1)
        .AutoFilter
        .AutoFilter Field:=colLOSS, Criteria1:=">" & Application.Percentile(rRange, 0.9)
    End With
End Sub
Filtering is a fast and effective way to cut down the amount of data displayed. However, if a
filtered subset is still too large, or there is a need to see different levels of detail for different
groups of data, or the user has to see different aggregate values (subtotals, averages), then
Outlining or Pivot Tables would be a better choice of tools.
Outlining
Outlining is a hierarchical representation of data organized into a list or table, with the ability to
hide or display details of all or selected groups on any level of the hierarchy. Every user of
Windows Explorer or any other File Directory tool is familiar with the notion of Outlining.
Excel's implementation of Outline allows both horizontal (rows/records) and vertical
(columns/fields) outlining. Along with the ability to display detailed records, Excel's Outline
supports aggregate functions such as sum, count, average, standard deviation, etc. This capability
may come in handy in situations when different members of a hierarchy are treated at different
levels of detail.
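As a concrete illustration, here is a minimal VBA sketch (ours, not the paper's); it assumes a flat list starting in cell A1 with the grouping key (say, LOB) in column 1 and a loss amount in column 3:

Sub OutlineByLOB()
    'subtotal the losses in column 3 for each value of column 1; Excel builds
    'the outline automatically, and ShowLevels collapses it to the subtotal rows
    With ActiveSheet.Cells(1, 1).CurrentRegion
        .Subtotal GroupBy:=1, Function:=xlSum, TotalList:=Array(3), _
                  Replace:=True, SummaryBelowData:=True
    End With
    ActiveSheet.Outline.ShowLevels RowLevels:=2
End Sub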
[Screenshot: an outlined loss listing with subtotals by LOB and State (e.g., WC NY Total, CT Total).]
Figure 3
Figure 4
Sorting

Sorting is a powerful technique that brings the most important records to the top of the display or,
in the case of a printed report, to the first page. Sorting does not reduce the amount of data
displayed, but it assures that the first several records get more attention before the report reader
gets tired. Consequently, the main skill in using Sorting is in defining what constitutes
importance. The ability to rank information in accordance with its significance, and to identify
which actuarial measurements affect decision-making the most, is a yardstick that separates great
report designers from mere mortals.
"Combined Loss Ratios," "Net Profitability" and "Reserve Adequacy" - all these measurements
may serve as actuarial significance indicators. Both more sophisticated ones, like "Percent Change
of the Current Estimate of the Net Ultimate Loss, Hindsight Estimated," and simpler ones, like
"Time since the Latest Loss Run," help to sort data and generate useful actuarial information.
Figure 5
Sub FillAdjacentColumn()
    'copy the active cell's formula down the column, one row for every record
    'of the current region (e.g., to compute a significance indicator per row)
    With ActiveCell
        .Resize(.CurrentRegion.Rows.Count + .CurrentRegion.Row - .Row, 1).Formula _
            = .Formula
    End With
End Sub
Conditional Formatting
Conditional Formatting is a feature of Microsoft Excel that allows users to define the font, color,
border and background pattern of a cell as a function of the values in other cells. When values in
the referenced cells change, so does the conditional format. Despite some limitations (currently,
Excel supports up to 3 variable formats per cell), this feature opens unprecedented
creative possibilities for report designers. Combined with other reporting techniques like
Filtering, Sorting and Pivot Tables, Conditional Formatting is indispensable for attracting the
report reader's attention to the most crucial information. Due to its dynamic nature, Conditional
Formatting can serve as a building block for an actuarial Alarm System (a cell automatically
becomes red when an actuarially significant value becomes too high or too low).
Given that the format condition's formula can be any expression that uses user-defined functions
along with built-in ones, the use of Conditional Formatting is limited only by one's imagination:
its use can range from data-error detection to Thomas Mack-like assumption testing to statistical
outlier warnings (see [2]).
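For instance, here is a minimal sketch (ours, not the paper's) of programmatically attaching such an outlier alarm to a range of age-to-age factors; it assumes the factors occupy a range named "Tri", the same name the Alarm System example uses later:

Sub AddOutlierFormat()
    Dim fc As FormatCondition
    With ActiveSheet.Range("Tri")
        .FormatConditions.Delete
        'flag a factor that sits more than two standard deviations from the average
        Set fc = .FormatConditions.Add(Type:=xlExpression, _
            Formula1:="=ABS(" & .Cells(1, 1).Address(False, False) & _
                      "-AVERAGE(" & .Address & "))>2*STDEV(" & .Address & ")")
        fc.Interior.ColorIndex = 3      'flagged cells turn red
    End With
End Sub

Because the first cell's reference is left relative while the AVERAGE and STDEV ranges are absolute, Excel evaluates the test separately for every cell in "Tri".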
[Screenshot: a triangle of age-to-age factors with a conditional-formatting formula flagging statistical outliers.]
Example 6. As an example of a user-defined format condition formula, one can use an
algorithm that assigns different types to treaties based on the relationships among
parameters like Premium, Ultimate Loss, Aggregate Limit and others. Every treaty's
record in this kind of report can be formatted in accordance with its assigned type. And as
the estimate of the ultimate net loss changes, the type of the treaty (potentially) changes,
and so does the formatting.
Conditional Formatting used in combination with other reporting techniques is especially
powerful. Sorting a list of records by one criterion (e.g., Net Profitability) and Conditionally
Formatting them according to another criterion (e.g., treaties with a particular insurance company)
can create quite an impressive display and produce an easy-to-comprehend overall picture.
OLAP

OLAP stands for Online Analytical Processing, and there are many tools from numerous vendors
that provide OLAP functionality. However, in order to demonstrate the accessibility of these
tools, only one particular implementation of OLAP will be considered - Microsoft Excel's Pivot
Tables.

* Bear in mind that, for true Outlining, it is necessary to support hierarchies. Starting with Excel 2000,
Pivot Tables' dimensions do support hierarchies. See [3] for examples of actuarial hierarchies.
The total flexibility of Pivot Tables may cause some problems for actuaries.
First, it is unclear which dimensions to choose for display and which ones for aggregation (or
cross-section) to get actuarially meaningful results. Also, actuaries should define additional
(calculated) fields with some kind of "actuarial significance" indicator, which can later be used in
Sorting, Filtering or Conditional Formatting.
Second, unlike other professions where creating a Pivot Table is a destination - the final act of the
analytical process - actuaries frequently use aggregated data as a starting point of their analysis
(see [4]). If created properly, a Pivot Table can serve as convenient storage for actuarial
triangles with selective drill-down capabilities. One can create an OLAP Cube hierarchy in such a
way that any suspicious element of the triangle can be drilled down for details, to individual
Claims and even the individual payments level:
[Screenshot: the Pivot Table field list showing an OLAP cube hierarchy with dimensions such as LOB, Accident Year and Layers.]
Figure 6
The problem is that, with an unpredictable shape/size of the Pivot Table, it is hard to incorporate
its content into subsequent calculations. One workaround is to use the GETPIVOTDATA
spreadsheet function, while another is to use the Pivot Tables' Calculated Fields - the ability to
add fields/dimensions to the Pivot Table that are calculated on the fly.
Example 7. Sometimes, a Calculated Field is the only mechanism to add new dimensions
correctly. As an example, consider the Loss Ratio field. A Loss Ratio, like any ratio, is a
nonlinear operation and, consequently, cannot be summarized properly: a sum of ratios is
practically never* a ratio of the sums. That is where Excel's Pivot Tables make a clear
distinction between input fields (for subtotals, the sums of ratios are calculated) and
Calculated Fields (for subtotals, the ratios of sums are calculated).

* Unless the ratios have the same absolute values and different signs.
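A minimal sketch (ours, not the paper's) of adding such a field in VBA; the pivot table name "PolicyPivot" and the field names NetLoss, NetALAE and NetPrem are assumptions patterned on the screenshot below:

Sub AddLossRatioField()
    With ActiveSheet.PivotTables("PolicyPivot")
        'the Calculated Field divides the sums, so subtotals are true ratios of sums
        .CalculatedFields.Add Name:="Net Loss Ratio", _
                              Formula:="=(NetLoss+NetALAE)/NetPrem"
        .PivotFields("Net Loss Ratio").Orientation = xlDataField
    End With
End Sub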
[Screenshot: the Pivot Table Calculated Field dialog defining a Net Loss Ratio field with the formula =(NetLoss+NetALAE)/NetPrem.]
Example 8. Once a (calculated on the fly) Loss Ratio field is added to the Pivot Table,
one can use it for Sorting and Filtering. The screenshot below illustrates 3 important
features of Pivot Tables simultaneously. First, by simply dragging the field label,
one can rearrange the Pivot Table from a "Policies by Underwriting Years" to an
"Underwriting Years by Policies" view. Second, one can Sort a field by the results of on-
the-fly calculations: in our case we sorted Policies by Loss Ratios in descending order.
And, third, one can Filter by the results of on-the-fly calculations: in our case we chose to
display just the 5 worst policies per underwriting year based on the Loss Ratio indicator.
Note that we could choose any indicator (like Net Profitability or Discounted Loss Ratio)
that is available in the Pivot Table as an input field or a dynamically calculated field. As
one can see, the impressive reporting tools are all there; the quest is on for actuarial
indicators.
[Screenshot: the Pivot Table showing the 5 worst policies per underwriting year, sorted by Loss Ratio.]
Visualization
Seeing is believing
Popular belief
Visualization (see [5]) is the process of exploring, transforming and viewing data as images to
gain understanding and insight into the data. Studies in human perception, computer graphics,
imaging, numerical analysis, statistical methods and data analysis have helped to bring
visualization to the forefront. Images have unparalleled power to convey information and ideas.
Informally, visualization is the transformation of data or information into pictures... It engages
the primary human sensory apparatus, vision, as well as the processing power of the human mind.
The result is an effective medium for communicating complex and/or voluminous information.
As the amount of data overwhelms the ability of the human to assimilate and understand it, there
is no escape from visualization. So actuaries have to develop conventions for representation of
their data and results.
There are unavoidable problems with multidimensional visualization: projection (only two or
three dimensions are available) and understanding (we as humans do not easily comprehend more
than three dimensions or three dimensions plus time animation). Projections can be implemented
if the three most important dimensions are identified in such a way that the remaining dimensions
can be ignored. Once again, it is an actuarial task to choose these dimensions.
Charts
This paper is too short to discuss all the possible uses of charts and diagrams in the actuarial
process. A great number of wonderful books (see [6]-[8]) explain which type of chart to use in
which situation: a line chart for displaying increases, bar for shares, pedestal for ranks, Gantt for
schedules, etc... However, it is up to actuaries to decide how to display triangular data.
Example 9. By examining the 3-D chart representing logarithms of age-to-age factors, one
can formulate a hypothesis about changes in calendar year trends (arguably not
immediately evident from looking at the raw data): the last 4 years have a different trend
than the rest of the triangle. Rotating the graph for a look from another angle allows
one to confirm or discard that hypothesis. The final check comes from the algorithm
described, for example, in [9].
[3-D surface chart of the logarithms of the age-to-age factors tabulated below.]

        12-24   24-36   36-48   48-60   60-72   72-84   84-96   96-108
1990    0.892   0.353   0.241   0.214   0.097   0.083   0.067   0.081
1991    0.864   0.341   0.257   0.118   0.058   0.087   0.059
1992    0.870   0.367   0.175   0.125   0.087   0.071
1993    0.884   0.304   0.130   0.106   0.084
1994    0.801   0.317   0.150   0.154
1995    0.802   0.270   0.181
1996    0.797   0.313
1997    0.797
Figure 8
For almost 86 years, since 1914, CAS members have not yet agreed on a standard graphical
representation of a triangle (one of the most basic actuarial notions). The author firmly believes
that properly displayed triangles may reveal some important trends that are not evident in
numerical representations.
Not only does visualization inspire useful hypotheses for actuarial analysis: sometimes it's the
only way to deal with the data*. Developments in new areas of actuarial science present new
challenges in demonstrating findings. DFA, which deals with many less traditional notions, such as
scenarios and strategies, is a good example of such a challenge.
Example 10. The majority of actuarial information contains a location code associated
with the values. Legislative requirements, types of coverage, rates, exposures and loss
performance differ from region to region. Geo-coding swiftly emerges as one of the
hottest actuarial applications. Yet it is hard to imagine how one can notice trends and
dependencies in geographically related data without visualization. Microsoft MapPoint -
ideologically, an integral part of Microsoft Office - provides the means for precisely that
type of visualization. For example, a map of WC Ultimate development factors by State
based on NCCI data (see [10]) is presented below.
Figure 9
Animation

If a picture is worth a thousand words, then an animation is worth a thousand pictures.

Animation is the best way to exploit an aspect of the human psyche called "selective attention" -
people readily react (by shifting the focus of their attention) to any movement, including a change
in color, size or position. Animation is suitable for visualization of the range of uncertainty and/or
development - two of the most important actuarial phenomena. While not a standard feature of a
plain-vanilla spreadsheet, animation is nevertheless quite within reach of every Microsoft Excel
user.

* So-called "quarterly" triangles, sometimes studied by actuaries, can easily reach a size of 60x60 or more,
which makes them impractical to examine by traditional means (in a spreadsheet), yet visualization
techniques would shine in this circumstance.
Example 11. Below is the code for conceptual animation:

Sub AnimationInExcel()
    Dim i As Long
    Call SetupAnimation                'see Appendix 2
    Application.Iteration = True
    Application.MaxIterations = 1000
    Application.MaxChange = 0.1
    For i = 1 To 1000
        Calculate
    Next i
End Sub    'do not forget to restore the original Calculation Mode!!!
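The SetupAnimation routine it calls is given in Appendix 2; purely as a hypothetical illustration of the kind of setup that makes the Calculate loop animate something, one might seed a circular random-walk cell for a chart to plot:

Sub SetupAnimation()
    'hypothetical stand-in, not the paper's Appendix 2 version
    Application.Iteration = True                    'allow the circular reference below
    Application.Calculation = xlCalculationManual   'recalculate only when Calculate is called
    'with iterative calculation on, B1 adds a random step on every calculation pass,
    'so a chart plotting B1 (or a history of B1) changes as the animation loop runs
    ActiveSheet.Range("B1").Formula = "=B1+(RAND()-0.5)"
End Sub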
Taking a tip from the computer games industry, animation can be effectively used for
visualization of the simulation process (see [11]). Indeed, with the growing importance of DFA
and other non-analytical modeling techniques, simulations are steadily becoming a technique of
choice for the majority of actuaries. Currently, however, all intermediate steps used in a simulation
are hidden from the user - such a wealth of information is, essentially, discarded. Use of
animation may prevent this "waste" of intermediate calculations. Animated display of a
simulation's steps may help the user to visualize the dynamics of the simulated process or
appreciate the range of uncertainty in simulated scenarios.
Interactive Selection

Another application of visual technologies is the interactive selection of actuarial parameters. For
example, selecting parameters by moving points directly on the graph of a development pattern
appears to be much more intuitive and convenient than typing numbers into spreadsheet cells.
Figure 10
This functionality - a two-way link between the graphical display of numbers and their values in
a spreadsheet - is not yet as user-friendly as other Office tools. But given that this feature is
available in numerous other applications, it is only a matter of time until Visual Editing becomes
an equal member of the Office tools roster and actuaries are able to incorporate interactive
graphical manipulation of numbers into their spreadsheets.
Example 12. Even though "Visual Chart Editing" is not a standard Office feature, with
some amount of VBA programming it is possible to establish a two-way link between
numbers and shapes in Excel (below is the code behind the buttons from Figure 10):

Sub VisualizationFromShapeToSpreadsheet()
    Dim cell As Range, node As ShapeNode, n As Integer
    Set cell = ActiveCell
    ActiveSheet.Range("Coordinates").CurrentRegion.Clear
    ActiveSheet.Shapes("InteractiveSelect").Select
    n = 0
    For Each node In ActiveSheet.Shapes("InteractiveSelect").Nodes
        n = n + 1
        ActiveSheet.Range("Coordinates").Cells(n, 1) = node.Points(1, 1)
        ActiveSheet.Range("Coordinates").Cells(n, 2) = (210 - node.Points(1, 2)) / 200
    Next node
    cell.Select
End Sub

Sub VisualizationFromSpreadsheetToShape()
    Dim c As Range, cell As Range
    Set cell = ActiveCell
    On Error Resume Next
    ActiveSheet.Shapes("InteractiveSelect").Delete
    With ActiveSheet.Shapes.BuildFreeform(msoEditingAuto, 0, 0)
        For Each c In ActiveSheet.Range("Coordinates").CurrentRegion.Resize(, 1)
            .AddNodes msoSegmentLine, msoEditingAuto, c.Value, 210 - 200 * c.Cells(1, 2).Value
        Next c
        .ConvertToShape.Select
    End With
    Selection.Name = "InteractiveSelect"
    ActiveSheet.Shapes("InteractiveSelect").Nodes.Delete 1
    Selection.Placement = xlFreeFloating
    cell.Select
End Sub
Alarm Systems

It's the eleventh hour - do you know where your reserves are?
Actuarial proverb

Data Quality

Companies that build data warehouses and clean up their data soon realize that the majority of
their data comes from external sources (TPAs, industry bodies, self-insureds), which are neither
clean nor in a single format. It is time to combine efforts and make sure that every source can
supply high quality data in a timely manner.
There are some recommendations on data-quality procedures by the IDMA (see [12], [13]) and data
element definitions by ISO and NCCI (see [10], [14]), but they are not part of everyday life
in every data collection entity. In fact, a study of more than 40 TPAs (see [15]) showed that
practically every one of them has failed even the most primitive data quality checks.
Example 13. An Alarm System that is worth its while should trigger some action when a
problem is found. Painting some cells in a spreadsheet is a good example of such an
action, but automatically sending an e-mail with a description of the problem would be
much more effective. The code below continues Example 5: first, it checks a triangle of
age-to-age factors for outliers, and then it sends an e-mail to the System Administrator
with the addresses of all problematic cells:

Sub AlarmEvent()
    Dim c As Range, sAlarm As String
    Dim otlApp As Outlook.Application, eMail As Object
    For Each c In Range("Tri").SpecialCells(xlCellTypeSameFormatConditions)
        c.Select
        Range("Temp").Formula = c.FormatConditions(1).Formula1
        If Not Application.IsError(Range("Temp").Value) Then
            If Range("Temp").Value Then
                sAlarm = sAlarm & "Problem at: " & c.Address & Chr(13)
            End If
        End If
    Next c
    If sAlarm <> "" Then                            'if there is any problem
        Set otlApp = New Outlook.Application        'launch Outlook
        Set eMail = otlApp.CreateItem(olMailItem)   'and create an e-Mail message
        With eMail
            .To = "[email protected]"
            .Subject = "ALARM from " & ActiveWorkbook.Name & "!" & ActiveSheet.Name
            .Body = sAlarm
            .Send                                    'send the e-Mail
        End With
        Set eMail = Nothing
        otlApp.Quit
        Set otlApp = Nothing
    End If
End Sub
Data Quality can be tested both on the detail level and on pre-aggregated levels (see [15]). In both
cases, reporting techniques like Filtering, Sorting and Conditional Formatting may help attract
attention to the problem (the same applies to visualization techniques, which can help to pinpoint
a problem). One can calculate changes in case reserves and sort claims in descending order by
that field to bring the largest outstanding claims to the top. Or, using conditional formatting, one can
highlight outliers among age-to-age factors (see Example 5).
Algorithms Applicability

Closely related to Data Quality tests on the pre-aggregated (as opposed to detailed) level is
actuarial assumption testing (see [15]). Indeed, a monotonically increasing number of claims can
be both a data quality test and a requirement for the applicability of the Berquist-Sherman
algorithm. The same holds for the assumption of lognormality in ICRFS, which coincides with the
check that requires incremental gross payments to be positive. An Alarm System may warn users
about Thomas Mack-style test (see [2], [16]) failures.
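A minimal sketch (ours, not the paper's) of one such applicability check; it assumes cumulative claim counts sit in a range named "Counts", one accident year per row and one development age per column:

Sub CheckMonotoneCounts()
    Dim r As Long, c As Long, msg As String
    With ActiveSheet.Range("Counts")
        For r = 1 To .Rows.Count
            For c = 2 To .Columns.Count
                If Not IsEmpty(.Cells(r, c)) Then
                    'cumulative counts should never decrease with development age
                    If .Cells(r, c).Value < .Cells(r, c - 1).Value Then
                        msg = msg & "Decrease at " & .Cells(r, c).Address & Chr(13)
                    End If
                End If
            Next c
        Next r
    End With
    If msg <> "" Then MsgBox msg Else MsgBox "Claim counts are monotone."
End Sub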
[Screenshot: an assumption-testing display marking PASS/FAIL results on a triangle (e.g., a failed calendar-year test).]
Figure 11
Digital Dashboard

Digital Dashboard is Microsoft's name for a portal that consolidates the most important personal,
professional, corporate and external information with immediate access to analytical and
collaborative tools. In a single view, the user can see charts, Alarm messages, Pivot Tables,
calendars, etc. Thus, the Digital Dashboard looks like an obvious place for all important reports and
alarms. The Dashboard's space limitations re-emphasize the necessity of smart and space-conscious
reporting techniques: the Dashboard's start screen is the place for the most important information
presented in the most concise way.
While every reporting and visualization technique described in this paper is powerful and
effective, it is their combinations (Filtering + Sorting, Pivot Tables + Conditional Formatting,
etc.) that convert a flood of data into truly useful and indispensable information. The Digital
Dashboard - a "combination of combinations" of reporting tools - is just a logical extension
of the mechanisms that make this information immediately available and accessible. By the same
token, the Digital Dashboard is a most natural interface for an Alarm System. Not only can it display
all types of alarms in a single location, but also - thanks to its portal capabilities - it can provide
links to the detailed information that triggered an alarm.
Figure 12
With the proliferation of the Internet, portal interfaces have become very popular: the ability to
organize a wealth of information into a concise and focused display is very appealing. In fact, this
article itself is organized as a portal into a wealth of information on reporting and visualization
techniques: it is just a concentrated extract of the most important facts about tools available to
actuaries, and as such it serves as a starting point for further research.
Actuarial Significance

"The whole system makes me feel... insignificant."
Antz
Actuaries have a lot of work to do before any of the aforementioned techniques can be used to
generate useful reports and visualizations.
For adaptive reporting, actuaries have to decide which measurements are of actuarial significance,
so reports can be filtered or sorted accordingly. Every step of the actuarial process requires
different significance indicators. For the data preparation step, actuaries should find
measurements that will catch severe data errors that may considerably affect the consequent
application of actuarial methods. For the results presentation step, actuaries should define (and
calculate during the application of algorithms step) indicators that will aid decision-making. That
will help to concentrate the attention of the report readers on important issues and will energize
and strengthen the decision-making process.
Another important area that needs assistance from actuaries is Alarm Systems. Nobody likes
false alarms. It is the actuary's job to come up with and fine-tune alarm definitions, to determine
which combination of circumstances should trigger an actuarial alarm and attract immediate
attention.
Selection of these most important variables depends on the available data and the goal of the
display, and is clearly an actuarial task. It would make sense for actuaries to develop conventions
that cover most situations. Unfortunately, to the best of the author's knowledge, this work has not
even started.
Conclusion
"The gods help them that help themselves."
Aesop
A list of easily accessible presentation techniques with examples of their uses should help
actuaries to realize what tools are available for Actuarial Reporting. But it is up to actuaries to
express themselves using these tools. Indeed, the reporting techniques described in this article are
so flexible that it no longer makes sense to use a limited number of pre-designed "canned" reports.
In addition, reporting tools are highly interactive - they were designed to give the end user (i.e.,
the actuary) report-creation power. And they are so easy to use that it is a sin not to use them.
Acknowledgements
"Thank you very much."
Many thanks to Leigh Walker who helped to make this paper a better presentation of the author's
ideas: he issued an alarm every time he saw a grammatical error or an unclear passage. The
author is eternally grateful to Boris Privman, FCAS, who greatly affected his views on the
actuarial profession. And without the understanding and support of the author's family, this
article would not have been presentable at all.
Stamford, 2000
Appendix 0
Readers of the Acrobat version of this paper (downloaded from the www.casact.org website) can
copy code snippets from the text and paste them into Excel's VBA Editor as described below.
Appendix 1
Appendix 2
Bibliography
(in the order of reference)
[1] Theo Mandel. The Elements of User Interface Design. Wiley Computer Publishing, 1997
[2] Thomas Mack. Measuring the Variability of Chain Ladder Reserve Estimates. CAS, 1993
[4] Aleksey S. Popelyukhin. The Big Picture: Actuarial Process from the Data Processing Point
of View. Library of Congress, 1996
[5] Will Schroeder, Ken Martin, Bill Lorensen. The Visualization Toolkit. Prentice Hall, 1998
[6] Gene Zelazny. Say It with Charts: The Executive's Guide to Visual Communication.
McGraw-Hill, 1996
[9] Ben Zehnwirth. Probabilistic Development Factor Models with Applications to Loss Reserve
Variability, Prediction Intervals, and Risk Based Capital
[11] Aleksey S. Popelyukhin. SimActuary: The Game We Can Play. Submitted to CAS Forum,
Spring 2003
[13] Data Quality Certification Model (Framework and Guidelines) for Insurance Data
Management. IDMA, ISO, 1995
Acronyms
i BASIC - Beginner's All-purpose Symbolic Instruction Code: one of the earliest and simplest high-level
programming languages - still a very popular choice among educators.
ii LOB - Line of Business: here, a type of insurance coverage like Workers' Compensation or Professional
Liability.
iii DFA - Dynamic Financial Analysis: a process for analyzing the financial condition of an insurance
entity.
iv TPA - Third Party Administrator: a company in the business of handling day-to-day activities and/or
providing services on insurance claims. Consequently, a TPA is a primary source of actuarial data. See [14].
v IDMA - Insurance Data Management Association: an independent nonprofit association dedicated to
increasing the level of professionalism in insurance data management. http://www.ins-data-mgmt.org.
vi ISO - Insurance Services Office, Inc.: leading supplier of statistical, actuarial, underwriting, and claims
information. http://www.iso.com.
vii NCCI - National Council on Compensation Insurance, Inc.: a value-added collector, manager, and
distributor of information related to workers' compensation insurance. http://www2.ncci.com/ncciweb.
viii ICRFS - Interactive Claims Reserving Forecasting System: commercially available statistical
modeling framework from Insureware. http://www.insureware.com.
ix VBA - Visual BASIC for Applications: version of BASIC embedded into a host application
(i.e., Excel) with access to the host's objects - a better "macro language". http://www.microsoft.com.
Materiality and ASOP No. 36:
Considerations for the Practicing Actuary
CAS Committee on
Valuation, Finance, and Investments
November 17, 2000
To Actuaries Preparing Statements of Actuarial Opinion Regarding
Property/Casualty Loss and Loss Adjustment Expense Reserves:
The Casualty Actuarial Society's (CAS) Valuation, Finance, and Investments
Committee (VFIC) has prepared the attached note entitled "Materiality and
ASOP No. 36: Considerations for the Practicing Actuary".
Actuarial Standard of Practice No. 36, Statements of Actuarial Opinion Re-
garding Property~Casualty Loss and Loss Adjustment Expense Reserves,
became effective on October 15, 2000. Among other things, the new ASOP
requires the actuary to use the concept of materiality in a number of impor-
tant ways. The American Academy of Actuaries' Committee on Property
and Liability Financial Reporting (COPLFR) asked VFIC to prepare a note
that would aid the actuary considering materiality in the context of ASOP
No. 36.
This note is the result. It is intended to be distributed as an appendix to the
Practice Note prepared by COPLFR as well as via the CAS website and The
Actuarial Forum.
Some of the general concepts of materiality discussed in the note may be
relevant beyond statements of actuarial opinion. However, this note does
not discuss the intended purposes of analyses in any other contexts, and in-
tended purpose is key to consideration of materiality.
IMPORTANT CAVEAT: This note is intended only as an aid and does
not supersede the actuary's professional judgment or the language of
ASOP No. 36. Although the note has been prepared by knowledgeable
members of VFIC, it has not received the professional review process
required for establishment of actuarial standards. Accordingly, the note
is not an authoritative document for actuaries and is not binding on any
actuary. VFIC recommends that this note be read in conjunction with
ASOP No. 36.
Materiality and ASOP No. 36:
Considerations for the Practicing Actuary
Introduction
This note has been prepared by the Valuation, Finance, and Investments Com-
mittee (VFIC) of the Casualty Actuarial Society as an aid to the actuary con-
sidering the concept of materiality contained in Actuarial Standard of Prac-
tice (ASOP) No. 36, Statements of Actuarial Opinion Regarding Property/
Casualty Loss and Loss Adjustment Expense Reserves.
ASOP No. 36 requires the actuary to use the concept of materiality in a num-
ber of important ways, including:
determination of whether or not to issue a qualified opinion,
determination of the need for disclosure of significant risks and
uncertainties,
consideration of factors likely to affect the actuary's reserve
analysis, and
determination of the need for a number of other possible disclo-
sures.
There is no formulaic approach to determining the standard of materiality
the actuary should use for a given statement of actuarial opinion (SAO). The
ASOP instructs the actuary to evaluate materiality based on professional judg-
ment, any applicable guidelines or standards, and the intended purpose of
the SAO.
VFIC intends this note to aid the actuary who must evaluate materiality in
the course of preparing a SAO. Following this introduction are three sec-
tions:
1. Materiality and ASOP No. 36: Discusses the use of the concept of
materiality in ASOP No. 36, highlighting its impact on decisions
made by the actuary in the course of preparing a SAO.
2. Materiality in Accounting Contexts: Reviews the concept of
materiality in accounting contexts, including both regulatory and
SEC financial reporting. This discussion is not intended to be
guidance for the actuary, since an actuary's issues and concerns are
not in general the same as those of accountants. Instead, this review
is provided to enrich the discussion of potential issues with regard to
materiality.
3. Materiality, Statements of Actuarial Opinion, and ASOP No. 36:
Discusses qualitative and quantitative concepts the actuary may
wish to consider while coming to a professional judgment on
materiality in the context of ASOP No. 36. Although certain
quantitative measures can be suggested for consideration in certain
circumstances, no formulaic approach to a quantitative materiality
standard can be developed.
Several caveats are in order at this point:
This note is intended only as an aid and does not supersede the
actuary's professional judgment or the language of ASOP No.
36. Although the note has been prepared by knowledgeable
members of VFIC, it has not received the professional review
process required for establishment of actuarial standards.
Accordingly, the note is not an authoritative document for
actuaries and is not binding on any actuary. VFIC recommends
that this note be read in conjunction with ASOP No. 36.
This note discusses concepts of materiality relevant to the SAO's
that are the subject of ASOP No. 36. This note does not focus on
considerations of materiality that may be required for other pur-
poses, such as GAAP or Statutory financial statements. Although
some of the general concepts of materiality that are discussed here
are relevant in other contexts, key to the concept of materiality is
consideration of the intended purpose of the analysis. Discussion of
the intended uses of financial statements is beyond the scope of this
document.
ASOP No. 36 applies to any written SAO on loss and loss expense
reserves. Many SAO's are prepared to be filed for regulatory
purposes with an insurer's statutory annual financial statements. If
the actuary is preparing an SAO for some other purpose, e.g.,
valuation of a company or of a book of business, then the actuary's
materiality standards may differ from those relevant to the statutory
SAO.
Materiality and ASOP No. 36
ASOP No. 36 applies to actuaries issuing written statements of actuarial opin-
ion regarding property/casualty loss and loss adjustment expense reserves in
the following situations:
the opinion is provided to comply with requirements of law or
regulation for a statement of actuarial opinion; or
the opinion is represented by the actuary as a statement of actuarial
opinion.
Further, if the actuary's statement includes opinions regarding amounts for
items other than loss and loss adjustment expense reserves, ASOP No. 36
applies only to the portion of the statement of actuarial opinion that relates to
loss and loss adjustment expense reserves.
Whenever the actuary determines that a material condition exists, the actu-
ary is required to make some response to the condition. The following lists
sections of ASOP No. 36 that use the word "material". For convenience, the
discussion below quotes some of the context showing how the term material
(with added highlighting) is used in the section.
Again, please note that VFIC has not reproduced ASOP No. 36 in this
note. Actuaries should read that document in conjunction with this one.
Section 3.3.2 d: "The actuary is not required to issue a qualified opinion if
the actuary reasonably believes that the item or items in question are not
likely to be material."
Section 3.3.3: "When the actuary reasonably believes that there are signifi-
cant risks and uncertainties that could result in material adverse deviation,
the actuary should also include an explanatory paragraph in the statement of
actuarial opinion." This statement is further clarified. "The actuary is not
required to include in the explanatory paragraph general, broad statements
about risks and uncertainties due to economic changes, judicial decisions,
regulatory actions, political or social forces, etc., nor is the actuary required
to include an exhaustive list of all potential sources of risks and uncertain-
ties."
Section 3.4: "... the actuary should consider the purposes and intended uses
for which the actuary prepared the statement of actuarial opinion. The actu-
ary should evaluate materiality based on professional judgment, materiality
guidelines or standards applicable to the statement of actuarial opinion and
the actuary's intended purpose for the statement of actuarial opinion."
Section 3.5: "In addition to the reserve methods used, the actuary should
consider the relevant past, present, or reasonably foreseeable future condi-
tions that are likely to have a material effect on the results of the actuary's
reserve analysis or on the risk and uncertainties arising from such condi-
tions."
Specific considerations listed in Section 3.5 are the following:
A. NAIC Accounting Practices and Procedures Manual
The Codification defines a material omission or misstatement of an item in a
statutory financial statement as having a magnitude such that it is probable
that the judgment of a reasonable person relying upon the statutory financial
statement would be changed or influenced by the inclusion or correction of
the item.
In narrowing the definition, the following considerations are discussed:
Some items are more important than others and require closer
scrutiny. These include items which may put the insurer in danger
of breach of covenant or regulatory requirement (such as an RBC
trigger), turn a loss into a profit, reverse a downward earning trend,
or represent an unusual event.
The use of a percentage or numerical threshold may provide the
basis for a preliminary assumption regarding materiality.
Materiality, Statements of Actuarial Opinion, and ASOP No. 36
VFIC intends that the prior section's review of materiality in an accounting
context be regarded as suggestive of issues an actuary may consider in evalu-
ating materiality in the context of ASOP No. 36. One common element
between financial reporting and the SAO is that judgments regarding materi-
ality involve both qualitative and quantitative considerations. As noted in
Section 3.4 of ASOP No. 36:
"The actuary should evaluate materiality based on professional judg-
ment, materiality guidelines or standards applicable to the statement of
actuarial opinion and the actuary's intended purpose for the statement of
actuarial opinion."
Requiring the use of professional judgment and placing importance on in-
tended purpose both emphasize the role of qualitative considerations in evalu-
ating materiality.
Actuaries will naturally also focus on quantitative considerations related to
judgments on materiality. No formula can be developed that will substitute
for professional judgment by providing a materiality level for each situation.
What can be done is to highlight some of the numerical considerations that
may be relevant to the determination of materiality in some situations.
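As a purely hypothetical illustration of such numerical considerations (the figures and the choice of
comparison bases are assumptions of this sketch, not guidance from VFIC or ASOP No. 36), an actu-
ary might express a candidate reserve difference relative to several bases that a user of the opinion
could care about:

    Sub MaterialityRatios()
        ' Purely illustrative, hypothetical figures: compare a candidate
        ' reserve difference to bases a reader of the opinion might consider.
        Dim difference As Double, surplus As Double, reserves As Double
        difference = 6000000       ' assumed reserve difference
        surplus = 80000000         ' assumed policyholders' surplus
        reserves = 150000000       ' assumed carried net reserves
        Debug.Print "As % of surplus:  "; Format(difference / surplus, "0.0%")
        Debug.Print "As % of reserves: "; Format(difference / reserves, "0.0%")
    End Sub

Whether any such ratio is "material" remains a matter of professional judgment and intended purpose.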
White Paper on Fair Valuing Property/Casualty
Insurance Liabilities
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
Executive Summary
This white paper was undertaken by the CAS Task Force on Fair Value Liabilities in reaction to
recent developments by the Financial Accounting Standards Board (FASB) and the International
Accounting Standards Committee (IASC). It is meant to be an objective discussion of the issues
surrounding the fair valuing of property/casualty insurance liabilities, particularly in the United
States. While the recent FASB and IASC proposals are mentioned and quoted, the white paper is
meant to be applicable to the "fair value" issue in general, wherever the issue appears.
The paper begins with an introduction and background, including a definition of "fair value." In
general, fair value is defined as the market value, if a sufficiently active market exists, or an
estimated market value otherwise. Most definitions also include a requirement that the value
reflect an "arms length" price between willing parties, so as to eliminate "fire sale" valuations.
Most observers agree that a sufficiently active market does not exist in most cases for
property/casualty insurance liabilities. Hence, estimation methods have to be used to determine
their fair value.
A short history of the fair value concept then follows. In brief, the concept of "fair value"
gained prominence as a result of the 1980's Savings & Loan crisis in the United States. The
accounting rules for these banks at that time did not require the recording of assets at market
value, hence, banks were able to manipulate their balance sheets through the selective selling of
assets. Troubled banks could sell those assets with market values higher than recorded book
values and inflate their reported equity, even as the quality of their balance sheet was
deteriorating. The concern was raised that any time financial assets are not held at their
economic value (i.e., market or fair value), financial reports can be manipulated through the
selective buying and selling of assets.
Since then, the FASB has been embarked on a long-term project to incorporate "fair value"
concepts in the accounting for financial assets and liabilities. In December of 1999, they
released a document labeled "Reporting Financial Instruments and Certain Related Assets and
Liabilities at Fair Value (Preliminary Views)." This document proposed, for the first time, that
certain insurance liabilities also be reported at "fair value."
At around the same time, the IASC, in its efforts to develop consistent international accounting
standards, released its "Insurance Issues" paper. This paper also proposed a fair value standard
for the recording of insurance liabilities.
The paper is organized into the following sections after the introduction:
A. Background regarding fair value concepts
B. Fair Value in the insurance context
C. Alternatives to Fair Value Accounting for p/c insurance liabilities.
D. Methods of Estimating Risk Adjustments - a brief discussion of possible methods for
determining risk adjustments, required in the fair valuing of insurance liabilities. Pros
and cons for each method are listed. Detailed discussions of these methods can be
found in the technical appendix.
E. Accounting Presentation Issues, including alternative income statement or balance
sheet formats in a "fair value" world.
F. Implementation Issues surrounding the fair valuing of p/c insurance liabilities for
financial accounting statements.
G. Accounting Concepts, or how well fair value accounting and the issues discussed in
the earlier sections would be viewed in the context of general accounting concepts
(such as reliability, relevance and representational faithfulness).
H. Credit Standing and Fair Value Liabilities, a discussion of issues related to the
reflection of credit standing in determining the fair value of liabilities. This issue has
given rise to vigorous discussion, both within and outside the actuarial profession. Due
to the controversial nature of this issue, it has been given its own separate section,
rather than including it within the earlier sections.
I. Professional Readiness
J. Summary and observations.
K. Technical Appendices.
These sections are meant to be conceptual discussions, with any discussion of detailed
implementation procedures left to the technical appendices. The appendices also include a list of
references for each section.
1. New requirement
In all the accounting conventions that we were aware of, insurance liabilities have not been
stated at fair value, resulting in a lack of established practice to draw on. This has implications
in numerous areas, including estimation methods, implementation problems and practitioner
standards. As with any new requirement, the switch to a fair value valuation standard for
property/casualty insurance liabilities would probably result in many unanticipated
consequences. These consequences could be mitigated if implementation is phased in. For
example, one phase-in alternative would be to institute disclosure requirements at first, followed
by full fair value reporting depending on the results of the disclosure period.
3. Expected Value versus best estimate
All the methods discussed in this paper assume that expected value estimates are the starting
point in the fair value estimation process. The task force recognizes that confusion sometimes
exists as to where current practice stands. While the term "best estimate" is commonly used in
current accounting literature, it is not clear whether this means the best estimate of the expected
value (mean), or the mode (i.e., most likely value), median (the value which will be too low half
the time, and too high half the time) or midpoint (the average of the high and low of the range of
"reasonable" estimates). While a recent U.S. actuarial standard has cleared up some of this
confusion (ASOP No. 36, Statements of Actuarial Opinion Regarding Property/Casualty Loss and
Loss Adjustment Expense Reserves, discussion of "expected value estimates" and "risk
margins"), the task force believes that clarification on this topic within the accounting standards
would be beneficial, and would become even more important in a fair value context.
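A small numerical sketch (the lognormal form and its parameters are assumptions chosen for illus-
tration, not taken from the accounting literature or the task force) shows how far apart these candi-
date "best estimates" can be for a skewed liability:

    Sub SkewedEstimateExample()
        ' Illustrative only: for a lognormal outcome with mu = 0 and sigma = 1,
        ' the mode, median and mean are all different, which is why the term
        ' "best estimate" is ambiguous for skewed insurance liabilities.
        Dim mu As Double, sigma As Double
        mu = 0
        sigma = 1
        Debug.Print "Mode:   "; Exp(mu - sigma ^ 2)         ' about 0.37
        Debug.Print "Median: "; Exp(mu)                     ' 1.00
        Debug.Print "Mean:   "; Exp(mu + sigma ^ 2 / 2)     ' about 1.65
    End Sub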
4. Multiple methods
There are multiple methods for estimating the fair value of property/casualty insurance liabilities.
All of these methods have their own advantages and disadvantages. No one method works well
in all situations. As such, those estimating fair value may need to use a variety of methods. The
task force sees a need for any accounting standard to provide for flexibility in estimation
methods.
8. When market prices and "fair value" estimates are in conflict.
The task force observed that there are at least four situations where market prices may be in
conflict with the results of a fair value estimation process. In these situations, the fair value
estimation process may be preferred over a market value for financial reporting. These situations
include:
• Market disequilibrium. Given a belief in an efficient market, disequilibrium positions
should be only temporary, but how long is temporary? Restrictions on insurance market
exit and entry (legal, regulatory and structural) can lead to disequilibrium positions that
last years. The underwriting cycle is viewed by some as a sign of temporary
disequilibrium, whereby the market price at certain points in the cycle may not equal
what some believe to be a fair value.
• Market disruption. At various points in time, new events lead to significant uncertainty
and temporary disruption in the market for insurance products. Examples can include a
threatening hurricane, a newly released wide-ranging court decision and new legislation
(e.g., Superfund, or California Proposition 103?). At such times, market prices right after
the event may be wildly speculative, or the market may even be suspended, making fair
value estimation even more uncertain.
• Information Asymmetry. The market price for a liability traded on an active market is
likely to be quite different depending on the volume of liabilities actually traded. For
example, if a primary insurer cedes 1% of its liabilities, the reinsurers will quite rationally
believe that this liability is not a fair cross-section of the primary's entire portfolio: i.e.,
the ceding insurer is selecting against the reinsurer. Consequently, the price will be
rather high, compared to the case where the entire portfolio (or a pro-rata section of it) is
transferred. Thus, the "actual market price" is not a better fair value representation than
an internal cash flow based measurement unless most of the insurer's liabilities are
actually transferred. This situation arises because the market (i.e., reinsurance market)
does not have access to the insurer's private information on the liabilities. If all of the
private information were public, then the actual market prices for liability transfers would
better represent their fair value.
• Significant intangibles. Market prices for new business may be set below expected costs
for such business, due to the value of expected future renewals. As such, an estimated
fair value that ignores this intangible may be materially different from the market price.
Both the IASC and FASB proposals indicate a preference for the use of observed market values
over estimated valuations. Given the imbalances noted above, the task force is uncertain as to
how to reconcile the realities of the insurance marketplace with the IASC's and FASB's
preferences for observed market value. It may be that internal estimates can sometimes be
preferable to market based estimates in a fair value accounting scheme.
is the sum of their two risk margins.
Not all risk margin methods result in value additivity. When this is the case, reporting problems
can occur. For example, if the risk margin for the sum of line A and line B is less than the sum
of the two risk margins, how should this synergy be reported? As an overall adjustment, outside
of the line results? Via a pro-rata allocation to the individual lines?
The issue of risk margins and value-additivity centers around discussions of whether markets
compensate for diversifiable risk. Diversifiable risk is generally not additive. For example, the
relative risk or uncertainty in insuring 2,000 homes across the country is generally less than
twice the relative uncertainty from insuring 1,000 homes across the country.
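This can be sketched under an explicit independence assumption (an assumption of this illustration,
not a position of the task force): if individual home losses are independent and identically distributed,
the coefficient of variation of the pooled losses falls with the square root of the number of homes.

    Function PooledCV(nHomes As Long, cvOneHome As Double) As Double
        ' Under assumed independence, the coefficient of variation of total
        ' losses for n identical homes is the single-home CV divided by Sqr(n),
        ' so 2,000 homes are relatively less risky than 1,000 homes.
        PooledCV = cvOneHome / Sqr(nHomes)
    End Function

With a single-home coefficient of variation of 0.3, PooledCV(2000, 0.3) is roughly 0.0067 versus
0.0095 for 1,000 homes, and the standard deviation of total losses grows by a factor of about 1.41
rather than 2.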
It is not clear whether value-additivity should or should not exist for risk margins in a fair value
system. A key question in the debate is the role of transaction costs, i.e., the costs of managing
and/or diversifying risk, and how the market recognizes those costs in its quantification of risk
margins.
The task force has not taken a final position on this issue. Instead it has flagged the issue
wherever it has been a factor in the discussion.
12. Historical comparisons - implementation issues, presentation issues
The implementation of fair value accounting would cause problems with the traditional ways of
making historical comparisons, particularly for historic development triangles. One difficulty
involves the possible need to restate history, to bring past values to a fair value basis. Should
these restated values reflect perfect hindsight, or should some attempt be made to reflect the
uncertainty (and estimation risk) that probably existed back then? (Any such restatement may
have to consider restating several years of history, based on current reporting requirements.) Or
should historic development data not be reported on a fair value basis, similar to current
reporting requirements in the U.S. statutory statement, Schedule P, whereby undiscounted values
are reported even if the held reserves are discounted?
17. Professional Readiness
Given no established practice in this area to-date, some education effort will probably be
required. Professional readiness may also not be determinable until general understanding of the
issue increases.
The task force hopes this white paper will aid in the understanding of fair value accounting
issues as applied to property/casualty insurance. We acknowledge that no one paper can include
all that is known about a topic, especially one as new and emerging as this one. As such, we
expect this to be only an initial step in the understanding of the issue.
Casualty Actuarial Task Force on Fair Value Liabilities
December 1999 - August 2000
(other contributors: David Appel, Paul Brehm, Roger Hayne, Gary Josephson, Joe Lebens,
Steve Lowe, Glenn Meyers, Elizabeth Smith, Pat Teufel, Gary Venter, American Academy of
Actuaries - Committee on Property and Liability Financial Reporting)
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
Table of Contents
Executive Summary
Introduction
Section A - Background
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
Introduction
This paper is not meant to advocate any particular position, but is instead meant to be a "white
paper," an objective discussion of the actuarial issues associated with fair value accounting.
2) Scope
The scope of this paper is limited to the issue of fair valuing of p/c insurance liabilities (and
related insurance assets), with particular emphasis on insurance accounting in the United States.
The analysis includes discussion of estimation issues and their application to accounting. It does
not address fair valuing of life or health insurance liabilities, although we recognize the benefits
of a consistent approach, where possible, across all insurance liabilities.
The scope is meant to include all material property/casualty insurance liabilities, regardless of
the type of entity reporting them in their accounting statements. This would include insurance
liabilities held by self-insureds, captives, reinsurers, etc. It would also include unearned
premium liabilities, accrued retrospective premium assets/liabilities, material contingent
commission liabilities and the like. We have not addressed all possible insurer liabilities, but we
have addressed those we believe to be material at an insurance industry level.
I. Professional Readiness
J. Summary and Observations
K. Technical Appendices
These sections are meant to be conceptual discussions, with any discussion of detailed
implementation procedures left to the technical appendices.
CAS Fair Value Task Force
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
Section A - Background
1) Definition of "fair value"
What is "fair value?" Accounting authorities do not currently have a consistent definition for this
term. However, a short definition 1 could be:
a. the market value, if a sufficiently active market exists, OR
b. an estimated market value, otherwise.
If no active market exists, an estimated market value can be determined from the market price of
similar assets (or liabilities). If no sufficiently similar assets (or liabilities) exist, the estimated
market value is based on a present value of future cash flows. These cash flows are to be
adjusted for "the effects of ... risk, market imperfections, and similar factors if market-based
information is available to estimate those adjustments." 2
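A minimal sketch of this cash flow approach, assuming one expected payment per future year and a
single discount rate that is taken to reflect all such adjustments (the function and its arguments are
illustrative, not part of the FASB or IASC text):

    Function FairValueEstimate(payments As Range, adjustedRate As Double) As Double
        ' Discount an expected payment stream (one expected payment per future
        ' year) at a single rate assumed to already reflect risk and other
        ' adjustments.
        Dim t As Long, pv As Double
        For t = 1 To payments.Count
            pv = pv + payments.Cells(t).Value / (1 + adjustedRate) ^ t
        Next t
        FairValueEstimate = pv
    End Function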
In adjusting these cash flows, one of the more controversial possible adjustments is the impact of
the entity's (or obligor's) own credit standing. Under some proposals, the weaker the obligor's
financial situation, the lower the fair value of their liabilities would be. The assumption is that
the parties the entity is indebted to would lower their settlement demands, recognizing the risk
of possibly getting much less if the entity went insolvent. This would represent a major change
to the accounting paradigm for "troubled" companies. A separate section of the white paper has
been devoted to this issue, due to its controversial nature and its impact on almost every facet of
the fair value discussion.
Note that the fair value is an economic value, but not the only possible "economic value." Other
examples of economic values include economic "value-in-use" and forced liquidation value.
Economic value-in-use can be defined as the marginal contribution of an item to the overall
entity's value. The forced liquidation value is the cash value achievable in a forced sale. Due to
the pressures involved, the forced sale price may be materially different from the normal market
price.
While fair value accounting could be applied to any asset or liability, it is most commonly an
issue for financial assets or liabilities. Financial assets are generally either cash or contractual
rights to receive cash or another financial asset. 3 Financial liabilities are generally obligations to
1 There is no universally accepted definition of "fair value" to-date, although they all follow the same general
concept given by this short definition. The detailed definition that FASB is proposing can be found in FASB's
Preliminary Views document titled "Reporting Financial Instruments and Certain Related Assets and Liabilities at
Fair Value," dated December 14, 1999, and labeled "No. 204-B." The definition starts on paragraph 47, with
discussion and clarification continuing through paragraph 83. Paragraph 47 states:
"Fair value is an estimate of the price an entity would have realized if it had sold an asset orpaid if it had been
relieved o f a liability on the reporting date in an arm 's-length exchange motivated by normal business
considerations. That is, it is an estimate of an exit price determined by market interactions."
The IASC has a similar definition (found on page A181 of their Insurance Issues Paper, released November 1999).
It reads:
"The amount for which an asset could be exchanged, or a liability settled, between Imowledgeable, willing
parties in an arm's length transaction. "
2 Paragraph 56 of the FASB Preliminary Views document mentioned above.
3 This is a simplified definition. A more complete definition includes both options and equities in its scope. Note
provide financial assets. 4
Lastly, a fair value accounting system focuses on the measurement of assets and liabilities, not
income. The income statement in such a paradigm is just a consequence of the changing balance
sheet. 5 This is in contrast to a "deferral and matching" approach, such as that used to justify
prepaid expense assets (e.g., Deferred Acquisition Costs, or DAC), where the focus is to match
revenues and expenses in the income statement. As a result, a fair value income statement could
look very different from traditional income statements.
Historically, many financial assets were accounted for at cost or amortized cost. These values
were readily available and verifiable, resulting in balance sheet values that could be produced at
minimal cost and that were relatively easy to audit. Likewise, many financial liabilities were at
ultimate settlement value, a value that in many cases is contractually set and hence, readily
available and auditable. 6
During the U.S. banking crisis of the late 1980s, this accounting approach caused problems.
Banks, which held many financial assets at historical cost, were undergoing financial strains.
Many became aware that their reported balance sheet value could be improved by selling those
assets with a market value greater than book value, where the book values were based on
historical or amortized cost. Assets with market values less than book values were retained, as
selling them would only decrease the reported book equity. 7 As a result, many banks were left
with asset portfolios dominated by weak and underperforming assets, and many of these banks
eventually went insolvent.
The FASB, 8 and many others, felt that a balance sheet based on market values would have
provided earlier warning of a bank's financial weakness. They proposed that all bank financial
assets be reported at market value, at least for U.S. GAAP financial statements. These concerns
resulted in FAS 9 115, which requires market value accounting for those assets held in a "trading
portfolio." These discussions also led to the discussion of fair value accounting for financial
assets and liabilities.
New problems arose when determining the scope of FAS 115. Recognizing the fact that many
financial institutions compete against one another, whether in the same narrowly defined
industry or not, FASB proposed that all U.S. financial institutions be subject to their new asset
reporting rules. This would include securities firms, life insurers and p/c insurers (although it is
less obvious how p/c insurers compete directly with the others on this list). The FASB's concern
was that to not treat all competitors equally in these rules would result in an uneven playing field.
Several parties raised concerns with requiring assets to be held at market value, when the
liabilities were not reported at market. They believed that this would cause reported equity to
become very volatile and not meaningful. Given the desire for consistency between asset and
liability valuation, and the belief by many that market value (or even fair value) accounting for
insurance liabilities was not possible, they proposed that the standard's scope exclude the
insurance industry. The FASB was not swayed by this argument. They decided to include the
insurance industry in the scope of FAS 115, and possibly address the balance sheet inconsistency
at a later date.
Since then, the FASB has had a stated vision of having all financial assets and liabilities reported
at fair value, pending resolution of any remaining implementation issues. 10
3) FASB Fair Value project
In 1986, FASB added a broad-based project concerning the appropriate accounting for financial
assets and liabilities (i.e., financial instruments) to its agenda. As a result of the influences
mentioned above (and others), it has evolved into the FASB Fair Value project.
The FASB has held discussions on this project during much of 1999. In December of 1999, they
issued a "Preliminary Views" document on this project, which was intended to communicate
their initial decisions and to "solicit comments on the Board's views about issues involved in
reporting financial instruments at fair value." The preliminary views document had a comment
8 Financial Accounting Standards Board, the principal setter of GAAP accounting standards in the U.S. The FASB's
standards are superseded only by the Securities and Exchange Commission (SEC). The FASB also must approve
AICPA standards of practice before they can become effective.
9 Financial Accounting Standard. Financial Accounting Standards, or FASs, are issued by the FASB.
10 In paragraph 3 of the previously mentioned FASB Preliminary Views document is a quote from FAS 133, that
states as follows. "The Board is committed to work diligently toward resolving, in a timely manner, the conceptual
and practical issues related to determining the fair values of financial instruments and portfolios of financial
instruments. Techniques for refining the measurement of the fair values of all financial instruments continue to
develop at a rapid pace, and the Board believes that all financial instruments should be carried in the statement of
financial position at fair value when the conceptual and measurement issues are resolved. [paragraph 334]"
deadline of May 31, 2000.
This FASB document states that insurance obligations settled in cash (which represents nearly all
insurance liabilities) are financial instruments, hence, the goal should be to have them reported at
fair value. This includes reinsurance obligations. In addition, paragraph 46 of this FASB
document "would prohibit capitalization of policy acquisition costs of insurance enterprises."
Presumably, the effect of prepaying these expenses would be picked up in the fair valuing of
unearned premium liabilities.
As to how to estimate the fair value of these, the preliminary views document references the new
FASB Concepts Statement of Present Value-Based Measurements, released February 11, 2000.
Efforts in the area of financial instruments in general include International Accounting Standard
(IAS) 39, issued in 1998, and the Joint Working Group on Financial Instruments, currently
working to develop a standard by the end of 2000. IAS 39 is very similar to FAS 115, in that it
requires investments in a "trading portfolio" to be held at fair value. Unlike FAS 115, it creates
an exception to fair value accounting for any "financial asset ... that does not have a quoted
market price in an active market and whose fair value cannot otherwise be reliably measured." 12
During December 1999, the IASC released an "Issues Paper" focused solely on insurance
accounting, with a comment deadline of May 31, 2000.
(Note that neither the IASC nor the FASB documents, nor their GAAP consequences impact
statutory accounting unless the NAIC takes explicit action.)
11 Per the IASC web site as of January 18, 2000 (http://www.iasc.org.uk/frame/cenl.htm), "The International
Accounting Standards Committee (IASC) is an independent private-sector body working to achieve uniformity in
the accounting principles that are used by businesses and other organisations for financial reporting around the
world."
12 Chapter 30, paragraph 21 of "The IASC-U.S. Comparison Project: A Report on the Similarities and Differences
between IASC Standards and U.S. GAAP," Second Edition, published by the FASB in 1999.
13 These two bullets come from the IASC Issues Paper on Insurance, pages iv-v, bullets (d) and (k).
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
1) General statement
In general, the fair value projects of both the FASB and IASC propose that any asset or liability
that ultimately settles in cash should ideally be valued at its "fair value." This would include (but
not be exclusively limited to):
In addition, a fair value accounting approach (at least according to the FASB) would not
recognize prepaid acquisition costs as an asset. Hence, these assets would disappear under fair
value accounting.
Premium deficiency reserves would also disappear under fair value accounting, as any expected
price inadequacy on in-force policies would be directly reflected in the unearned premium
reserve valuation.
Given the absence of an active market for most (maybe all) of these items, their fair value would
have to be based on an estimate. The estimate would involve discounted cash flows.
For now, the focus from the FASB and the IASC is on contractual cash flows. Non-contractual
cash flows, such as future renewals, would be precluded from the cash flows used to estimate fair
value, even when the renewals are largely unavoidable due to existing legal or regulatory rules.
The only renewal business flows to be included in these cash flows are those that are
contractually guaranteed. 14
14 The treatment of renewal business is still an open issue. The quandary these accounting organizations face is that
renewal business IS considered currently by those valuing the overall net worth of insurance enterprises. Therefore,
a "market value" of the enterprise would include these intangibles. If a market price would include them, then why
should a cash flow estimation procedure, generally meant to estimate a hypothetical market value, exclude them?
So far, they have leaned against including them, despite a risk of being inconsistent with real-life market valuations,
due to problems with reliably estimating the renewal flows.
While both the FASB and IASC proposals include contractually guaranteed renewals in these projected cash flows,
the IASC definition further requires that the insurer's pricing flexibility for these renewals be restricted in some
fashion.
These discounted cash flows may need to be adjusted for:
• Risk or uncertainty in the flows (with the size of the adjustment based on market
compensation for such risk)
• Credit standing of the obligor
• "Market imperfections," including possibly illiquidity.
We expect little disagreement that the risk in insurance liabilities is "identifiable" and
"significant." We expect the principal discussion to be on the measurability of this risk, in an
accounting context.
6) Potential advantages and disadvantages of fair value accounting in the insurance context
Below are some of the advantages and disadvantages to fair value accounting, as it might be
applied to insurance liabilities, that have been discussed in prior literature. This partial list is
intended to aid in comparing fair value accounting to the various alternatives, discussed in the
next section. More detailed discussion of these and other advantages/disadvantages can be found
throughout the later sections of this paper.
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
Introduction
For many, the proposals by FASB and IASC present some radically new ways to value balance
sheet items and to measure income for insurers. Most of the proposed changes have a reasonable
theoretical basis, but a practical implementation of the new methodology will undoubtedly
present significant challenges to the actuarial and accounting professions.
For example, as discussed in the Methods for Estimating Fair Value section, all of the methods
currently available to measure the risk margin suffer from various disadvantages. None of these
methods is presently in widespread use for actual valuation of balance sheet liabilities (however,
some are commonly used for ratemaking). Although it is likely that more research will evolve
given an accounting standard that requires a risk margin, it is difficult to see a route that will
arrive at a widely adopted standard approach. Lacking a standard approach (with appropriate
guidelines for the magnitude of risk margins by lines of business), it may be difficult to enforce a
reliable comparison across insurers.
It is also not clear that all the proposed changes will benefit the industry, its customers or
investors. An example is the inclusion of the effect of credit risk in the fair value of liabilities.
This requirement implies that an insurer experiencing a lowered credit standing will see its
earnings improve. This creates an incentive for companies to increase operational risk and
thereby increase the insolvency cost to customers. (For a more detailed discussion of credit risk,
see the separate section of this paper on this topic.)
For these reasons, it is prudent to consider some alternatives to the full implementation of the
FASB and IASC proposals. The following are alternatives that we have considered or that have
been presented in the accounting literature. We do not necessarily endorse any of them, but we
list them here in order to enhance the discussion of this topic.
Use the undiscounted expected value of the estimated liability payments as its accounting value.
This alternative is essentially the status quo for property-liability insurers, although some may
have historically used estimates of amounts other than the mean (such as the median or mode). It
implicitly assumes that the risk margin equals the discount on the liability. Note that current
statutory and GAAP accounting standards allow discounting for some losses (e.g., workers'
compensation life pensions). However, the vast majority of liabilities are not explicitly
discounted.
The FASB and IASC proposals indicate that the proper way to view the estimation of uncertain
cash flows is that the expected value of the cash flows is the relevant measurement. Note that the
proposals do not directly address this issue with respect to the intended accounting treatment.
However, the examples in the documents clearly show the preference for expected value.
The actuarial profession has also recently adopted the expected value criterion. The new
Actuarial Standard of Practice No. 36, "Statements of Opinion Regarding Property/Casualty Loss
and Loss Adjustment Expense Reserves," specifically requires that the preferred basis for reserve
valuation be expected value.
Section 3.6.3 of the ASOP states "In evaluating the reasonableness of reserves, the actuary
should consider one or more expected value estimates of the reserves, except when such
estimates cannot be made based on available data and reasonable assumptions. Other statistical
values such as the mode (most likely value) or the median (50th percentile) may not be
appropriate measures for evaluating loss and loss adjustment expense reserves, such as when the
expected value estimates can be significantly greater than these other measures." For some, this
may be viewed as a change to the previous status quo, while for others, this is merely putting in
writing the current practice.
The U.S. regulators' point of view, as expressed in the NAIC Issue Paper No. 55, proposes that
the reserves to be booked be "management's best estimate," although the term "best estimate" is
not currently defined.
When discussing "expected value" in this paper, we define the term to be without a risk margin,
unless stated otherwise.
Advantages
• This is easiest to accomplish. There is no change to current accounting procedures.
• The risk margin equals the amount of the discount, so a risk margin is implicitly included
in the liability value.
• The risk margin is directly correlated with the amount of the discount. This is intuitively
appealing, since many believe that the amount of risk is positively related to the length of
the loss payment tail.
• It is easy to measure the runoff of the liability.
Disadvantages
• It fails to overcome the many problems associated with current accounting, including
a) Incentive for accounting arbitrage, or transactions undertaken strictly for a favorable
accounting result, despite no economic benefit.
b) Misleading information for decision making, in that transactions that have a poor
economic result may look better than those creating a favorable economic result.
c) Items with significant long-term uncertainty may appear inestimable on an
undiscounted basis, even when estimable on a present-value basis.
d) Companies writing different types of insurance would not be comparable.
• It is a poor calculation of either the risk margin or the present value of the liability.
Hence, this alternative results in an accounting value for equity that may not adequately
represent the value to investors, policyholders or other parties.
Use the present value of the estimated liability payments as the accounting value. This alternative
is equivalent to the fair value, except for the risk margin and adjustment for credit standing.
Some would view this as the best practical alternative to fair value, given the difficulties in
estimating the risk margin and credit risk adjustment. For some lines of business, such as
workers compensation, actuaries routinely calculate present values of the liabilities (although
typically using a conservative discount rate). For other lines, the loss and LAE payments patterns
needed for present values are usually a by-product of normal loss reserving or ratemaking
practices.
Advantages
• This method is feasible with current actuarial skills and practices. Many insurers
currently discount loss reserves for some lines of business. Also, the requisite cash flow
patterns are commonly produced in estimating the undiscounted reserves.
• Discounting has widespread acceptance and is fundamental to the life/health industry.
• There is no dispute over how the risk margin should be calculated and applied to
individual companies.
• Measuring and displaying the runoff of the liability is not difficult.
Disadvantages
• It will require more work and, therefore, expense compared to not discounting.
• A risk margin is not calculated, so the fair value of the liabilities will be underestimated.
• The transition to discounted reserves will expose insurers who have carried inadequate
undiscounted reserves that are implicitly discounted (an example is environmental
liability). When they are forced to explicitly discount all reserves, some insurers will
further discount an already implicitly discounted reserve, rather than admit that the
original reserve was inadequate.
• Earnings will emerge closer to the time when the policy is written (i.e., they are front-
ended). This may provide an incentive to write risky long-tail business for companies that
have weak earnings.
This alternative is similar to #2 above. It uses the present value of expected liability payments as
its accounting value, but the present value is taken using a risk-adjusted interest rate. Here, risk-
adjusted rate is defined as a rate that produces a present value higher than the present value
obtained using the appropriate risk-free interest rate (as in #2 above). To accomplish this, the
risk-adjusted rate must be lower than the risk-free interest rate. The difference between the two
interest rates is called the risk adjustment. For some short-tail liabilities such as catastrophe loss
exposure (embedded in unexpired contracts) an adjustment to the interest rate may not be
appropriate. In these instances, a risk margin, as a percentage of the present value of expected
loss, can be added to the present value.
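A numerical sketch of the mechanics, using invented figures: with a 5% risk-free rate and a 2% risk
adjustment, the liability is discounted at 3%, and the excess over the risk-free present value is the
implicit risk margin.

    Sub RiskAdjustedRateExample()
        ' Illustrative figures only: a payment of 100 due in one year.
        Dim riskFree As Double, riskAdjustment As Double, adjustedRate As Double
        riskFree = 0.05
        riskAdjustment = 0.02
        adjustedRate = riskFree - riskAdjustment    ' 3%, below the risk-free rate
        Debug.Print "Risk-adjusted value:  "; Format(100 / (1 + adjustedRate), "0.00")   ' 97.09
        Debug.Print "Risk-free value:      "; Format(100 / (1 + riskFree), "0.00")       ' 95.24
        Debug.Print "Implicit risk margin: "; _
            Format(100 / (1 + adjustedRate) - 100 / (1 + riskFree), "0.00")              ' 1.85
    End Sub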
This method is conceptually equivalent to the fair value (with no credit risk adjustment), except
that the risk adjustment is determined on an industry-wide basis. Thus, in many cases, the
circumstances of the individual insurer would be ignored in favor of accounting simplicity.
There are several approaches that could be applied to determine the industry-standard risk
adjustment. A standard-setting organization (such as the AAA or NAIC) could promulgate risk
adjustments by line of business or for all lines taken together. The organization might apply
some of the methods discussed in Section D and then use judgment to weigh the results in
producing the risk adjustment(s). The adjustment could also be set to be the same for all lines, or
to vary by line.
Advantages
• It is nearly as easy as #2 above and it has all of the same advantages plus others.
• It produces a fair value for a typical company's liabilities, since (an) appropriate industry-
wide risk margin(s) are (is) provided.
• Comparability between companies may be enhanced, since the risk margins (per unit of
like liability) would be the same for each insurer.
• Given the difficulties in accurately estimating risk margins at the industry level in this
alternative, it remains questionable whether company-specific fair value estimates would
be reliable enough for accounting purposes. Hence, this may be the most practical
approach to implementing something akin to fair value.
Disadvantages
• It has the same disadvantages as #2 above except for the omission of a risk margin.
• It may not be a very accurate or reliable calculation of the risk margin for an insurer with
atypical liabilities. If risk margins vary by line of business and a single risk margin is
applied to all lines, then insurers writing different types of insurance would not be
comparable.
• In the case where line-by-line standards are set, new lines may develop for which no
standards yet exist. The standard setters may forever be trying to catch up to market
developments.
• There is no formal process to determine the standard-setting body.
Use fair value for some liabilities and one or more of the alternatives for other liabilities.
Categories that possibly may require this treatment include unexpired risk (loss embedded in the
unearned premium reserve, or UPR), catastrophe losses, environmental losses, ceded losses and
loss adjustment expense.
For example, estimating the fair value of UPR runoff can be very difficult when the valuation
date occurs as a storm or major catastrophe is threatening, but the public release or reporting of
that value is after the event, when the storm either did or did not hit. In this case, an accurate fair
value as of the balance sheet date has little relevance at the time losses are reported. Note that
retaining the current UPR calculation, and not reflecting fair value until the loss is incurred,
would be a "mixture" that retains the current "deferral and matching" paradigm of GAAP
accounting.
Under this alternative, either the accounting standard-setting body would establish which
categories get which treatment, or the insurer would decide on the basis of a materiality criterion.
Advantages
• This may be the most practical solution to the problems associated with full
implementation of the fair value concept.
• This alternative is flexible. It could be amended as actuaries, accountants and other
professionals became more adept at measuring the proposed fair value components.
Disadvantages
• It may be difficult to decide which items should get the full fair-value treatment and
which items should continue to be valued as they are now.
• It could lead to inconsistent accounting of like items.
• There would be a possibility for accounting arbitrage, or "gaming" the system.
• This alternative could lead to "cliff" changes in liabilities, if a given liability could
change valuation standards over its life (such as when the loss component of the UPR
becomes incurred).
5. Entity-specific measurement
Advantages
• The insurer would have the most control with this approach.
• An insurer with unique liabilities would be able to use the proper risk margin.
• The method recognizes the current lack of a market for many insurance liabilities,
including the large information asymmetry that impedes the existence of an active
market. Given this information imbalance, the "market" price is either not transferable to
similar liabilities (due to individual portfolio differences), or is a naive price.
• It focuses on the marginal contribution of the item to the total value of the firm, not the
exit price for an item for which exit is not a viable alternative. Hence, it may be a more
relevant measure to the firm.
Disadvantages
• It might place an additional burden on individual insurers, who would need to derive their
specific risk margins.
• It would tend to produce liability values that are not comparable between companies.
This would partially defeat the purpose of fair value.
• The method would likely be subject to manipulation by the reporting entity to a greater
extent than other alternatives.
6. Cost-accumulation measurement
This approach is discussed on page 22 of the FASB document "Using Cash Flow Information
and Present Value in Accounting Measurements" (3-31-99). This method attempts to capture the
incremental cost that the insurer anticipates it will incur in satisfying the liability over its
expected term. This method typically excludes the markup and risk premium that third parties
would incorporate in the price they would charge to assume the liability.
For insurers, these items are the reinsurer's expenses and profit load associated with reinsuring
the liabilities. In practice, measurement should be similar to that of the present value alternative
(#2) above. Insurers would estimate the liability cash flows and discount them using a prescribed
interest rate.
Advantages
• Same as #2.
Disadvantages
• Same as #2.
• It can be dependent on the current corporate structure. For example, it may assume that
existing affiliates providing services at marginal cost (to the affiliate) will always be
around. This could result in substantial changes in value if the corporate structure
changes (e.g., breakup of the parent conglomerate).
• It may not adequately represent what the market would require to transfer the liability.
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
Risk adjustments
Fair value estimates reflect expected cash flows, the time value of money and an adjustment for
risk. This section focuses on the last of these components, the risk adjustment. The methods
discussed here assume that expected cash flows and risk-free discount rates are already available.
For the purpose of all subsequent discussion, the starting point for the discount rate before risk
adjustment is the risk-free rate.
Other balance sheet insurance items, such as contingent commissions and deductible recoverable
amounts, may also be subject to a risk adjustment in estimating their fair value. The risk
adjustment for these items is not addressed in this section, although some of the methods
discussed here may also be feasible for estimating their fair value.
This section begins with a conceptual discussion of risk margins, including a discussion of
diversifiable versus nondiversifiable risk. Next, the methods listed below are presented. These
presentations are meant to give the reader a brief conceptual overview of the methods (a more
involved discussion is included in the appendices). At the end of this section, a chart comparing
the listed methods is provided.
(Note: Neither the inclusion of a method, exclusion of a method, nor the order of the methods
listed is meant to imply any preference or priority by the task force. Methods were listed if
members of the task force felt they deserved consideration, whether or not consensus was
achieved.)
1) Capital Asset Pricing Model (CAPM) based methods, where the liability beta is
calculated from insurers' asset and equity betas.
2) Internal Rate of Return (IRR) method, where the risk adjustment is derived from cash
flow and rate of return on equity (ROE) estimates.
3) Single Period Risk-Adjusted Discount method, where the calendar year ROE is used to
find a risk adjusted interest rate.
10) Other methods.
The IASC (paragraph 243) and FASB (Concept Statement 5, paragraphs 62 - 71) documents
require the use of a risk margin when measuring the fair value of uncertain liabilities (such as
an insurer's liabilities) by discounting the expected liability cash flows. The finance and actuarial
literature generally support this approach. (Butsic, Cummins, D'Arcy, and Myers-Cohn.)
The economic rationale for a risk margin is that a third party would not accept compensation for
a transfer of liabilities if such payment reflected only the present value of the cash flows at a
risk-free interest rate. The acquiring entity would get an expected risk-free return while bearing
risk. A market exchange of the liability would therefore require a premium or risk margin over
and above the present value of the liability discounted at the risk-free rate.
In this section we discuss various feasible methods for estimating a risk margin. All of
these methods have been used for estimating risk margins, either for direct application to balance
sheet liabilities or in ratemaking. Financial theory indicates that the same principles for
estimating the risk margin in pricing would also apply to a fair valuation of outstanding
liabilities. For certain kinds of short tail liabilities, such as claim liabilities associated with
catastrophes, the risk margins for pricing may be much larger than the risk margins for liabilities,
however. This is because, once a catastrophe has occurred, the uncertainty regarding future
payments may be relatively modest compared to the quite large level of uncertainty before the
event has occurred.
There are two major paradigms used to compute risk loads that are represented in this paper: the
finance perspective and the actuarial perspective. These two paradigms differ in their treatment
of diversifiable versus nondiversifiable risk. In the context of liability fair value, diversifiable
risk is defined as risk that can be reduced, per unit of liability volume, as more volume is added.
For example, if two statistically independent risks are combined, their joint risk will be reduced
due to the tendency of bad outcomes from one being offset by good outcomes in the other. In
contrast, nondiversifiable (or systematic) risk is defined as risk that cannot be reduced, per unit
of liability volume, as more volume is added. Here, bad or good outcomes in one risk are
matched with the same result in the other.
The amount of diversification depends on the correlation between the units being added. This
effect is evident in the square root rule for summing standard deviations:

σ(x + y) = √(σ_x² + σ_y² + 2ρ σ_x σ_y)

where ρ is the correlation between x and y, σ_x is the standard deviation of x and σ_y is the
standard deviation of y.
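For concreteness, the following Python sketch (with purely hypothetical means and standard deviations) applies the square root rule to two units of equal size and shows how the risk per unit of volume falls as the correlation decreases.

    from math import sqrt

    def combined_sd(sd_x, sd_y, rho):
        # Square root rule: standard deviation of the sum of two risks with correlation rho.
        return sqrt(sd_x ** 2 + sd_y ** 2 + 2.0 * rho * sd_x * sd_y)

    mean_x, sd_x = 100.0, 30.0   # hypothetical first unit of liability volume
    mean_y, sd_y = 100.0, 30.0   # a second, statistically identical unit

    for rho in (1.0, 0.5, 0.0, -0.5):
        sd_sum = combined_sd(sd_x, sd_y, rho)
        # Normalizing by the mean of the portfolio gives the risk per unit of volume.
        per_unit = sd_sum / (mean_x + mean_y)
        print(f"rho={rho:+.1f}  sd(x+y)={sd_sum:6.2f}  risk per unit={per_unit:.3f}")

At a correlation of one the risk per unit equals that of a single unit (no diversification); at lower correlations it declines.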
Adding more units to a portfolio may or may not reduce its risk. If the correlation between the
units is one, then there is no reduction in risk per unit volume from adding more of the units. In
this case the standard deviation of the sum will equal the sum of the standard deviations, and
when this is normalized by dividing by the mean of the portfolio, the risk per unit is unchanged.
In investing, for instance, adding more shares of a given company's stock to one's portfolio will
not reduce the portfolio's risk, since the shares added will be perfectly correlated with the shares
the investor already owns.
If the correlation between the units is less than one, then there is a reduction in risk per unit
volume from adding the units. Thus, if an investor adds to the portfolio shares of a company not
already in it, the risk should decline since the correlation of the new stock with stocks in the
portfolio should be less than one. If the correlation is negative then there can be a significant
reduction in risk.
An example of diversifiable risk from insurance is the random occurrence of losses, where the
fortuitous amount of one claim does not influence the amount of another claim. An example of
nondiversifiable risk from insurance is medical inflation, where a change in the cost of medical
care will simultaneously affect the value of general liability and workers compensation reserves.
Another example is parameter risk, where the mean (or other parameter) of a loss distribution is
unknown. Here the uncertainty in the mean affects all losses included in the distribution.
18 Certain approaches, such as Arbitrage Pricing Theory, allow factors other than beta to be used in the quantification
of risk. Except for some very recent research work, these approaches have not influenced the finance-based
methods used to compute risk loads in property and casualty insurance.
The characterization of the finance approach as quantifying only nondiversifiable risk and the
actuarial approach as including both diversifiable and nondiversifiable risk is an
oversimplification. Stulz 19 points out that in the real world, total risk often matters: costs
incurred by companies to control total risk are rewarded in the financial markets, and the failure
to do so may be punished. For some kinds of insurance, such as catastrophe insurance, it could
be difficult to find a market unless some kinds of "diversifiable" risk were rewarded. Property
catastrophe risk is diversifiable in a perfect market, but the mechanisms for diversifying it are so
costly that in practice it is only partially diversifiable. As in the case of formally
nondiversifiable risk, the whole industry is in the same boat, so the market treats the risk as
systematic and policyholders in catastrophe-exposed areas pay a risk premium for insurance
coverage. If an efficient means of diversification were to arise, then that situation would change.
While the actuarial based methods often explicitly incorporate process (diversifiable) and
parameter (nondiversifiable) risk components into the risk load formulas, some of the finance-
based methods, such as internal rate of return, may implicitly incorporate this risk as part of the
total return on equity required by an insurance company.
The discussion surrounding diversifiable versus nondiversifiable risk is still evolving. The
reader should be aware that differing views exist as to whether only nondiversifiable risk, or both
diversifiable and nondiversifiable risk, should be included in risk adjustments. The reader should
also be aware that there are also very different approaches to measuring the nondiversifiable
component.
Method 1 - The CAPM Approach
(Note: references to specific authors mentioned below and in the discussion of subsequent
methods can be found in the Appendix.)
CAPM is the method used in Massachusetts rate filings in the Automobile and Workers
Compensation lines. Myers and Cohn developed the underlying theory.
The method equates the present value of the premium to be charged on a policy to the present
value of the losses plus the present value of the underwriting profits tax plus the present value of
the tax on invested surplus and premium.
Losses are discounted at a risk-adjusted rate. The premium portion of underwriting profits is
discounted at a risk-free rate and the liability portion is discounted at a risk-adjusted rate.
Investment tax is discounted at the risk-free rate. The risk-adjusted rate used in the calculations is
derived from CAPM.
r_L = r_f + β_L (r_m - r_f)

where r_L = risk-adjusted rate
r_f = one-period risk-free rate
β_L = Cov(r_L, r_m) / Var(r_m) = the liability or underwriting beta
r_m = expected rate of return on the market portfolio
β_L, the underwriting beta, is a measure of the covariance between the underwriting profits for a
line of business and the stock market. It represents the systematic risk to the insurer for writing
the policy. Note that β_L is usually considered to be negative. Otherwise insurance companies
would incur exposure to risk for a reward equal to or less than the risk-free rate, an illogical
conclusion.
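As an illustration only, the following Python sketch applies the CAPM relationship with hypothetical values for the risk-free rate, the market return and the (negative) underwriting beta; it is not the Massachusetts calculation, merely the formula above restated in code.

    def capm_liability_rate(rf, beta_l, rm):
        # Risk-adjusted rate for discounting liabilities: r_L = r_f + beta_L * (r_m - r_f).
        return rf + beta_l * (rm - rf)

    rf, rm = 0.06, 0.13        # hypothetical risk-free rate and expected market return
    beta_l = -0.25             # hypothetical (negative) underwriting beta

    r_l = capm_liability_rate(rf, beta_l, rm)     # 0.06 + (-0.25) * 0.07 = 0.0425
    pv_risk_free = 100.0 / (1 + rf)               # a $100 expected loss paid in one year
    pv_fair = 100.0 / (1 + r_l)                   # larger than pv_risk_free: the risk margin
    print(r_l, pv_risk_free, pv_fair)

Because the beta is negative, the risk-adjusted rate falls below the risk-free rate, and the risk-loaded liability value exceeds the risk-free present value.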
Although the Myers-Cohn approach is typically applied in ratemaking to compute risk adjusted
premiums for new policies, the risk-adjusted discount rate from the calculation can be used to
discount outstanding reserve liabilities as well.
There are at least three approaches to computing β_L. The first method is broadly similar to the
direct estimation technique (Method 7 of this section). Here, a time series of publicly traded
insurer data is analyzed. A beta of equities is determined from insurance company stock prices.
A beta of assets is determined from a weighted average of insurance company asset betas. The
liability beta is determined by subtracting the asset and equity betas, weighted by their respective
leverage values. The risk margin, as a reduction of the risk-free rate, equals the liability beta
times the market risk premium. This is the method used in Massachusetts.
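One common formalization of this weighting step, stated here as an assumption rather than as the exact procedure used in the filings, is the balance-sheet identity β_A × Assets = β_E × Equity + β_L × Liabilities. A minimal Python sketch with hypothetical inputs:

    def liability_beta(beta_assets, beta_equity, assets, equity):
        # Assumed identity: beta_A * A = beta_E * E + beta_L * L, with L = A - E,
        # i.e. the asset beta is the leverage-weighted average of the equity and liability betas.
        liabilities = assets - equity
        return (beta_assets * assets - beta_equity * equity) / liabilities

    beta_a, beta_e = 0.15, 0.90      # hypothetical asset and equity betas
    assets, equity = 300.0, 100.0    # hypothetical balance sheet (liabilities = 200)
    print(liability_beta(beta_a, beta_e, assets, equity))   # (45 - 90) / 200 = -0.225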
The second method uses accounting data to measure the covariance between insurance
underwriting returns and the market.20 A third CAPM-based approach measures beta for a line
of business by quantifying the covariance of that line's underwriting return with the return for all
property and casualty lines.21
Advantages
• The method has actually been implemented. In Massachusetts it is the standard method used in
the workers compensation and personal auto lines, with risk margins being positive and stable.
Note that it has only been applied to lines that are relatively homogeneous, and where
public data is generally available.
• The method is objective and the analysis is reproducible.
• The method has been in use for over a decade and has been reviewed by many
economists.
Disadvantages
• Several stages of estimation can produce measurement errors.
a) Some insurers in the data are also life insurers; carving them out requires estimating
the equity beta of the life operation.
b) The liabilities may be under- or overstated in the financial statements.
c) Mutual insurers, nonpublic companies, self insurers and captives are not included in
the analysis, introducing a potential bias.
• Intangible assets like franchise value could distort the results. Another similar problem is
that the present value of income taxes is embedded in the liability value and cannot be
easily separated from it.
• Measurement errors on the beta for assets have a leveraged effect on the measurement of
underwriting betas.
• It relies on the CAPM model, which may not accurately predict returns for insurance
firms, as discussed below.
The CAPM beta has come under considerable criticism recently in the finance literature. CAPM
only recognizes nondiversifiable risk, assuming an efficient, friction-free market. The magnitude
of transaction costs to diversify an insurance portfolio violates the friction-free assumption,
casting doubt as to the applicability of CAPM to valuing insurance liabilities.
Fama and French have shown that factors other than beta contribute significantly to the
explanation of company stock returns. 22 Their work has caused a great deal of discussion in the
finance community about the use of CAPM and beta for estimating equity returns and computing
cost of capital. Alternatives to CAPM that look CAPM-like but incorporate factors other than
beta into the determination of the risk premium have attempted to address some of the
deficiencies of the CAPM model. For instance, Fama and French have presented a method for
deriving costs of equity that uses two additional factors as well as beta. 23 Some of the models
that appear to be generalizations of CAPM and use factors other than beta are better known as
examples of the Arbitrage Pricing Model. An introduction to this more general approach is
provided by D'Arcy and Doherty. 24
Members of the actuarial community (as opposed to members of the finance community) have
also criticized CAPM approaches. Much of the criticism focuses on the unreliability of estimates
of underwriting betas as opposed to estimates of equity betas examined by Fama and French.
Kozik 25 notes that a number of authors have measured the underwriting beta to be zero or
negative (i.e., no risk load necessary on insurance). He provides a detailed discussion of the
flaws in current methods of measurements of the underwriting beta, which can cause such results
to be obtained.
Note that much of the underlying theory of CAPM is widely used and accepted, although the
actual mechanisms for measurement have been criticized. Some of the criticisms of CAPM have
been addressed in extensions of CAPM such as contained in the Automobile Insurance Bureau's
Massachusetts Rate Filing (1998). Extending CAPM to address some of its limitations is
currently an area of active research.
It should be noted that many of the limitations of the CAPM approach may apply to other
methods presented in this paper, whenever those methods use CAPM to determine a rate of
return.
22 Fama, Eugene and French, Kenneth, "The Cross Section of Expected Stock Returns," Journal of Finance, Vol. 47,
1992, pp. 427-465.
23 Fama, Eugene and French, Kenneth, "The Cross Section of Expected Stock Returns," Journal of Finance, Vol. 47,
1992, pp. 427-465.
24 D'Arcy, S. P., and Doherty, N. A., "The Financial Theory of Pricing Property-Liability Insurance Contracts,"
Huebner Foundation, 1988.
25 Kozik, Thomas, "Underwriting Betas - The Shadows of Ghosts," Proceedings of the Casualty Actuarial Society
(PCAS) LXXXI, 1994, pp. 303-329.
The Pricing-Based Methods (Methods 2 and 3)
Under this general category of methods, the fair premium for a group of policies (which could be
those of a line of business or an entire company) is first determined. In this calculation, the value
of all nonliability premium components (such as commissions and general expenses) is excluded
from the fair premium calculation. The resulting premium amount, by definition, is the fair value
of the liability (losses and loss adjustment expenses). Since the liability fair value and its
expected payments are known, the implicit risk-adjusted interest rate at which the payments are
discounted can be readily found. Subtracting this value from the risk-free rate gives an estimate
of the risk adjustment to the risk-free rate. Note that this approach can be used to compute a
dollar-value risk load (to apply to liabilities discounted at the risk-free rate) rather than an
adjustment to the discount rate.
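A minimal sketch of this back-solving step, using hypothetical cash flows and a simple bisection (the actual pricing models discussed below are considerably richer), is:

    def present_value(cash_flows, rate):
        # Payments assumed to fall at the ends of years 1, 2, ...
        return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))

    def implied_rate(fair_value, cash_flows, lo=-0.5, hi=0.5):
        # Bisect for the discount rate at which the expected payments equal the fair value.
        for _ in range(100):
            mid = (lo + hi) / 2.0
            if present_value(cash_flows, mid) > fair_value:
                lo = mid     # present value too high, so the rate must be higher
            else:
                hi = mid
        return (lo + hi) / 2.0

    expected_payments = [40.0, 35.0, 25.0]   # hypothetical loss and LAE payout pattern
    fair_value = 93.0                        # fair premium net of non-liability components
    risk_free = 0.06
    rate = implied_rate(fair_value, expected_payments)
    print(rate, risk_free - rate)            # the second figure is the risk adjustment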
This method can be applied to any prospective pricing model that uses expected cash flows. The
most prevalent cash flow approaches are the internal rate of return (IRR) and the risk-adjusted
discount (RAD) models.
It should be noted that the standard pricing-based methods give a risk margin that is a composite
of the risk characteristics of liabilities already incurred and the unexpired policy liability. As the
time since policy issuance increases, there may be a significant information gain in a book of
liabilities (e.g., the insurer knows more about claims once they are reported). This effect is most
pronounced for property insurance with significant catastrophe potential. To separately measure
the risk margins in the reserve and unexpired policy portions of the insurer's liabilities, the
pricing methods can be modified. For example, in the IRR model, the capital requirement and/or
the required ROE may be different per unit of liability for the two liability types.
Method 2 - Internal Rate of Return (IRR) Method
The IRR method is used by the NCCI in workers compensation rate filings.26 It does not directly
produce a risk margin, but it can easily be adapted to do so. The underlying theory is standard
capital budgeting.
Under the IRR method, a cohort of policies, written at the same time, is modeled over time until
all claim payments are made. At each stage (usually quarterly or annually) the cash flows
(premiums, losses, expenses, income taxes and investment returns) and balance sheet values are
estimated. Capital is added based on capital allocation rules, frequently as a fixed proportion of
liabilities. The application of these capital allocation rules results in an initial amount of capital,
then a subsequent capital flow, based on the amount of capital that must be added or withdrawn
to maintain the capital allocation assumption at each point in the policy flows.
When the internal rate of return on the capital contributions and withdrawals equals the required
rate of return on the capital (equity), then the fair premium is obtained.
The inputs to the IRR method are the capital allocation rules (e.g., the required amount of equity
per unit of liability), the expected payment pattern of the policy flows, the investment return on
cash flows, the income tax rate and the required return on equity. Note that the expenses and the
premium cash flows need not be included in this calculation, since we are only trying to value
the liability itself.
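The following Python sketch illustrates the mechanics under deliberately simplified assumptions (reserves carried at the nominal amount of remaining payments, capital a fixed proportion of that reserve, no taxes, annual cash flows); it is a sketch only, not the NCCI specification.

    def fair_value_irr(payments, cap_ratio, inv_yield, roe):
        # Simplified capital-flow model: reserves carried at the nominal remaining payments,
        # capital held as a fixed proportion of that reserve, income taxes ignored.
        n = len(payments)
        reserves = [sum(payments[t:]) for t in range(n + 1)]    # R_0 ... R_n (R_n = 0)
        equity_flows = []
        for t in range(1, n + 1):
            assets = reserves[t - 1] * (1 + cap_ratio) * (1 + inv_yield)   # grown start-of-year assets
            required = reserves[t] * (1 + cap_ratio)                       # assets to be held going forward
            equity_flows.append(assets - payments[t - 1] - required)       # released to shareholders
        # The fair value P satisfies P - R_0*(1 + cap_ratio) + NPV(equity flows at the required ROE) = 0,
        # which is equivalent to requiring the IRR on the equity flows to equal the required ROE.
        npv_flows = sum(f / (1 + roe) ** t for t, f in enumerate(equity_flows, start=1))
        return reserves[0] * (1 + cap_ratio) - npv_flows

    payments = [40.0, 35.0, 25.0]    # hypothetical expected payout pattern for the liability
    print(fair_value_irr(payments, cap_ratio=0.5, inv_yield=0.07, roe=0.13))

The implied risk-adjusted discount rate can then be recovered from the resulting fair value and the payment pattern, as described for the pricing-based methods above.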
The required ROE can be determined using a variety of approaches. A simple approach often
used by insurance companies is to select a rate of return based on examining actual historical
rates of return on equity for insurance companies. Roth advocates this approach.28 Another
approach is to use CAPM to estimate the industry-average insurer equity beta and then to derive
the appropriate ROE, given beta. An alternative way to estimate the required ROE is to use the
dividend growth model, which has been documented in rate filings. Still another approach might
use the "hurdle rate" for an insurer that is derived from its experience raising capital.
The required capital could be based on the company's internal capital allocation rules. Absent
this, industry-wide "rules of thumb" or rating agency dictated norms might be used. Note that
the capital typically used in this calculation is "required" or "target" capital, not actual capital.
Care must be taken where the capital allocation assumption is dependent on the required ROE
assumption.
An additional complication arises where fair value rules require the use of "market assumptions"
wherever possible, over individual company assumptions. This could imply that the capital
allocation rules that drive the market price (if one can be said to exist) should be used instead of
the company's own internal capital assumptions.
The investment return under a fair value paradigm typically is the set of currently available
market yields for investments. This may be complicated by investment in tax-exempt
investments, especially where the company has significant tax advantages or disadvantages
relative to the market. Many users of IRR models make the simplifying assumption that all
investments are made in taxable securities.
28 Roth, R., "Analysis of Surplus and Rates of Return Using Leverage Ratios," 1992 Casualty Actuarial Society
Discussion Paper Program - Insurer Financial Solvency, Volume I, pp. 439-464.
Advantages
• The IRR is commonly used to price insurance products. The extension to calculate risk
margins is straightforward and will produce positive and stable risk margins.
• The method is conceptually simple and easy to explain.
• The method is objective and the analysis is reproducible.
• The method will work at the individual insurer level.
Disadvantages
• All of the methods for determining the required return on equity have problems and they
can produce different answers:
a) A required ROE based on historical returns depends on the historical period chosen.
b) A required ROE based on CAPM is subject to the limitations and criticisms that apply
to CAPM (see Method #1 above).
c) The dividend growth method requires some subjective estimation; it will not work
for companies with erratic or no dividends.
d) Internal management "hurdle" rates, based on a company's experience in raising
capital, are very subjective and may not be consistent with the market value approach
under fair value.
• The number of steps required makes this a fairly indirect method.
• Estimating the present value of income taxes requires a modification to the method.
• A required capital estimate is needed. There is no agreed upon method for doing this, and
no consensus as to whether it should be the company's or the industry's capital allocation
or requirement.
Method 3 - Single Period Risk-Adjusted Discount Method
This method shares some features of the above IRR method. It is based on the risk-adjusted
discount method.29,30 Here the relationship between the required ROE, the expected investment
return, the income tax rate and the capital ratio is used to find the implied risk-adjusted interest
rate. Like the above IRR method, the balance sheet values are fair value quantities. It is simpler
than the IRR model since the risk adjustment is derived directly from a formula (shown in the
Appendix), rather than by an iterative process.
The inputs to the single-period RAD method are the required amount of equity, the investment
return on cash flows, the risk-free rate, the effective income tax rate and the required return on
29 Butsic, Robert, "Determining the Proper Discount Rate for Loss Reserve Discounting: An Economic Approach,"
1988 Casualty Actuarial Society Discussion Paper Program - Evaluating Insurance Company Liabilities, pp. 147-
188.
30 D'Arcy, Stephen P., 1988, "Use of the CAPM to Discount Property-Liability Loss Reserves," Journal of Risk and
Insurance, September 1988, Volume 55:3, pp. 481-490.
equity. The required ROE can be determined using one of the methods described above for the
IRR approach. The required capital and the investment return are estimated using historical
industry data, or from one of the alternative methods described above for the IRR approach. Note
that the required capital needs to be consistent with the fair value of the liabilities. For example,
if the fair value of reserves were less than a non-fair value such as ultimate undiscounted
liabilities, the required capital would go up.
The simplicity of this method arises from the assumption that the risk adjustment (as a reduction
to the risk-free rate) is uniform over time. Thus, evaluating an insurance contract over a single
period will be sufficient to determine the risk adjustment. To illustrate the method, we assume
the following:
• capital is 50% of liability fair value,
• required ROE is 13%,
• expected investment return (EIR) is 7%,
• risk-free rate (RFR) is 6%,
• income tax rate is zero, and
• fair value for the liability is $100 at time zero.
Under these assumptions, the implied risk-adjusted rate is 4%, i.e., a 2% risk adjustment below
the risk-free rate. To see that this works, note that the beginning assets are the fair premium for
the liability of $100 plus the required capital of $50. This amount grows to $160.50 (i.e., $150 x
1.07) at the end of the year. The expected amount of liability grows at the risk-adjusted rate of
4% to $104. Subtracting this amount from assets gives $56.50, which represents the required
13% return (56.5 / 50 = 1.13).
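The zero-tax arithmetic of this example can be reproduced with the short Python sketch below; the closed-form line for the risk-adjusted rate is an assumed rearrangement of the single-period balance, and the full formula including taxes is the one given in the Appendix.

    def rad_rate_no_tax(cap_ratio, eir, roe):
        # Assumed rearrangement of the one-period balance for the zero-tax case:
        # roe * c * L = eir * (1 + c) * L - r_L * L  =>  r_L = eir * (1 + c) - roe * c.
        return eir * (1 + cap_ratio) - roe * cap_ratio

    cap_ratio, roe, eir, rfr = 0.50, 0.13, 0.07, 0.06
    liability_fv = 100.0

    r_l = rad_rate_no_tax(cap_ratio, eir, roe)          # 0.04
    risk_adjustment = rfr - r_l                         # 0.02 reduction to the risk-free rate

    assets_end = liability_fv * (1 + cap_ratio) * (1 + eir)   # 150 * 1.07 = 160.50
    liability_end = liability_fv * (1 + r_l)                  # 104.00
    equity_growth = (assets_end - liability_end) / (cap_ratio * liability_fv)
    print(r_l, risk_adjustment, equity_growth)                # 0.04, 0.02, 1.13 (the 13% return)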
In practice, the income tax rate is not zero, so the formula for the risk adjustment (see the
Appendix) is somewhat more complicated than shown here. The Appendix provides the
complete formula and also gives a numerical illustration of the method.
Advantages
• The method is very simple and transparent. It is easy to explain and to demonstrate with a
spreadsheet.
• The method is reliable, robust and will produce positive and stable risk margins.
• Inputs are presently available from published sources. For example, many rate filings
with state insurance departments have estimates for required ROE and capital leverage.
Disadvantages
• The method will only produce an industry-average or company-average risk adjustment
(to the risk-free rate). It would be difficult to apply the method to produce risk
adjustments for specific lines of business.
• This method has the same disadvantages relative to the selected ROE as the IRR method.
• This method has the same disadvantages relative to the selected "required capital" as the
IRR method.
Method 4 - Methods Based on Underwriting Data
Typically, risk adjustments based on underwriting data use information published in insurance
companies' annual statements. To obtain stable results by line of business applicable to a typical
company, data aggregated to industry level by sources such as A. M. Best can be used.
The published literature on risk adjustments using underwriting data primarily focuses on
estimating a risk adjustment to the factor used to discount liabilities. Alternative methods for
computing risk-adjusted discount rates use a CAPM approach to compute the risk adjustment.
Although we focus on using underwriting data to compute risk-adjusted discount rates, the same
data can be used to derive an additive risk load instead. 31 Risk adjustments incorporated through
the discount rate are discussed first, followed by discussion of risk adjustment via an additive
risk load.
Butsic introduced the concept of using risk adjusted discount rates to discount insurance
liabilities.32 He argued that a liability whose value is certain should be discounted at a risk free
rate. The appropriate risk free rate to use for the certain liabilities is the spot rate for maturities
equal to the duration of the liabilities. If certain liabilities are discounted at the risk free rate,
then uncertain liabilities should be discounted at a rate below the risk free rate. The formula for
the risk-adjusted rate is:
31 There are several different ways to make a risk adjustment. One way is through an additive risk load to the
otherwise calculated present value estimate (based on risk-free discount rates). A second is by discounting the
expected cash flows using a risk-adjusted discount rate. A third is by adjusting the individual expected cash flow
amounts for each time period, replacing each uncertain amount with the certainty equivalent amount (i.e., the fixed
amount for which the market would be indifferent between it and the uncertain amount being estimated). A fourth is
by adjusting the timing of the estimated cash flows (sometimes used when timing risk is thought to dominate
amount risk).
32 Butsic, Robert, "Determining the Proper Discount Rate for Loss Reserve Discounting: An Economic Approach,"
1988 Casualty Actuarial Society Discussion Paper Program - Evaluating Insurance Company Liabilities, pp. 147-
188.
i_L = i - e(R - i)

The term e(R - i) represents the adjustment to the risk-free rate for the riskiness of the
liabilities.
There is an analogy between this formula and that for a company's cost of equity based on the
CAPM.
i_e = i + β_e(R - i)
The specific procedure for computing the adjustment is described in detail in the Appendix.
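A minimal Python sketch of the formula, with a hypothetical value for e (the Appendix describes how the adjustment is actually estimated from underwriting data):

    def risk_adjusted_rate(i, e, R):
        # i_L = i - e * (R - i), per the formula above.
        return i - e * (R - i)

    i, R = 0.06, 0.13      # hypothetical risk-free rate and expected market return
    e = 0.30               # hypothetical risk factor estimated from underwriting data

    i_l = risk_adjusted_rate(i, e, R)                    # 0.06 - 0.30 * 0.07 = 0.039
    payments = [40.0, 35.0, 25.0]                        # hypothetical payout pattern
    pv = lambda r: sum(p / (1 + r) ** t for t, p in enumerate(payments, start=1))
    print(i_l, pv(i), pv(i_l), pv(i_l) - pv(i))          # last figure: the dollar risk margin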
Note that the method's results can be very sensitive to the historical time period used as the source
of the underwriting data. For example, the selection of an historical period that includes a major
market disruption, such as a workers' compensation crisis, major catastrophe, or mass tort eruption,
can produce drastically different indications than a time period that excluded this major disruption.
Thus, it is necessary to consider how long a time period is required to obtain stable and reasonable
results and whether the method is unstable over time. The longer the historical period used for
computing the risk adjustment, the more stable the results will be, but the less likely they are to
reflect current trends in the underwriting cycle or business environment. The shorter the historical
period used, the more likely it is that the adjustment will reflect the current environment, but at a
cost of being more unstable and more susceptible to infrequent random events such as catastrophes
(or the short-term absence of the long-term catastrophe or large loss risk).
An additional effect that must be considered is the effect of taxes. As shown by Myers and Cohn 33
33 Myers, S. and Cohn, R., "A Discounted Cash Flow Approach to Property-Liability Rate Regulation," Fair Rate of
Return in Property-Liability Insurance, Cummins, J.D., Harrington, S.A., Eds., Kluwer-Nijhoff Publishing, 1987, pp.
55-78.
and Butsic,34 taxes increase the premium needed to obtain a target rate of return and therefore
decrease the effective risk-adjusted discount rate. This effect is embedded in the data used to derive
the risk-adjusted discount rate. It might be desirable to segregate this effect from the pure risk
adjustment. A procedure for doing this is discussed in the Appendix.
Advantages
• The approach produces an adjustment to the discount rate without requiring the
computation of a liability beta. As discussed above in the CAPM method for estimating a
risk adjustment, the liability beta is one of the more controversial features of the CAPM
approach.
• The approach does not require the computation of a leverage ratio.
• The approach is relatively easy to implement. Spreadsheets containing a sample
calculation can be placed on a web site.
• The data required, such as Best's Aggregates and Averages, is relatively inexpensive and
readily available.
• A paper presenting the approach has been included in the syllabus of the Casualty
Actuarial Society for over 10 years. A description of this technique is, therefore, readily
accessible to actuaries (or anyone else who accesses the CAS web site.)
• This method can easily be applied to individual lines where annual statement data is
available.
Disadvantages
• Results can be very different depending on the historical time period used. This
committee's research indicates that changing the time period used for the calculation in
one instance changed the all-lines risk adjustment from 4.5% to 1.0%. The committee
believes that the results for recent historical periods reflect certain well-known market
disruptions such as the impact of the recognition of asbestos and environmental
liabilities. Also, the industry has been in a protracted soft market, which has depressed
underwriting profitability in the recent historical data.
• Results for a single line can be unstable. Some lines are unprofitable for extended
periods of time and this method may not produce a positive risk load. Useful data for
lines with very long tails (or without industry data available) may be a problem.
Examples include medical malpractice-occurrence and directors & officers
(D&O), for which industry accident year data may not be available.
• Pricing adequacy may vary by line based upon individual line characteristics such as
regulatory environment, market conditions, geography, etc. An impact of this is cross
subsidization of lines, where some lines are undercharged at the expense of other lines.
Thus the results for a single line, even over relatively long time periods, can be
misleading. (Our research showed that at least one regulated line had a negative risk
adjustment using this approach for 30 years.)
34 Butsic, Robert P., 2000, Treatment of Income Taxes in Present Value Models of Property-Liability Insurance,
Unpublished Working Paper.
• Results will be affected by "smoothing" in published financial numbers.
• The method requires accident year data to do the computation correctly, or else it is
susceptible to distortion from events with long-term latency issues, such as mass torts or
construction defect.
• Results using individual company data may be too volatile; hence, the method has usually
been applied mostly to industry data.
Computing Additive Risk Loads Instead of Risk Adjustments to the Discount Rate
Since the procedures described here focus on computing a risk adjustment to the discount rate,
the procedure to compute an additive, dollar-value risk load must convert the risk-adjusted rate
into a risk load (as a ratio to the liability value). However, it is possible to compute the risk load
directly using the same data for computing a risk adjustment to the discount rate. This approach
might be preferred for a short tail line.
One approach to computing an additive risk load is simply to calculate the ratio of the profit on
the policies at the beginning of the period to the average discounted losses, where losses are
discounted at a risk-free rate rather than a risky rate. Thus, the risk load (expressed as a
percentage of the present value losses) is equal to the present value of the premiums minus the
present value of expenses minus the present value of the losses (plus loss adjustment expenses)
divided by the present value of the losses. All quantities are discounted at the risk-free rate.
Unlike the adjustment to the discount rate, this risk load would not be meaningful unless
computed by line, since the duration of the liabilities varies by line. An example of this
computation is shown in the Appendix.
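A Python sketch of that ratio with hypothetical cash flows (the Appendix example remains the authoritative illustration):

    def additive_risk_load(premiums, expenses, losses, rfr):
        # (PV premium - PV expenses - PV losses) / PV losses, all discounted at the risk-free rate.
        pv = lambda flows: sum(cf / (1 + rfr) ** t for t, cf in enumerate(flows, start=1))
        return (pv(premiums) - pv(expenses) - pv(losses)) / pv(losses)

    premiums = [100.0]               # hypothetical premium, taken at the end of year 1 for simplicity
    expenses = [28.0]                # hypothetical expenses paid at the end of year 1
    losses = [30.0, 25.0, 15.0]      # hypothetical losses plus loss adjustment expense
    print(additive_risk_load(premiums, expenses, losses, rfr=0.06))   # roughly 0.076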
Method 5 - Actuarial Distribution-Based Risk Loads 35
The evolution of this approach relative to pricing is given first, followed by the extension to the
valuation of liabilities.
Pricing context
Probability-based actuarial risk loads are among the oldest procedures developed by actuaries for
estimating the risk adjustment to losses. These approaches continue to develop, even as other
approaches, which largely evolved from other disciplines (such as economics and finance),
continue to add to the tools used for deriving risk loads. Distribution-based loads arose in the
context of insurance pricing to fill the perceived need to apportion the targeted underwriting
profit to different classes of business according to their actual riskiness, as described
mathematically by the probability distribution of the loss.
The first approaches to the problem focused on the volatility of the individual loss, characterized
mainly by the severity distribution. In 1970, Hans Bühlmann set forth three possible principles
that might be applied to the problem:
• The standard deviation principle: Risk Load = λ × Standard Deviation.
• The variance principle: Risk Load = λ × Variance.
• The utility principle, under which the risk load is determined implicitly from a utility function.
Actuarial distribution-based risk loads often invoke collective risk theory to explain the
derivation o f the risk load. Collective risk theory provides a model of the insurance loss
generating process that can be used to derive aggregate probability distributions. The theory also
allows derivation of the distribution parameters such as standard deviations or variances, which
are used in the risk load formulas. Recent developments in collective risk theory have given rise
to an additional principle used to derive risk loads:
• The expected policyholder deficit (EPD 36) principle: Risk Load = λ × Surplus Requirement.
Surplus is determined based on the expected policyholder deficit, which is derived from the
35 This exposition draws heavily on Glenn Meyers' September 18, 1998 presentation to Casualty Actuaries of New
England (CANE).
36 The "expected policyholder deficit" is the total expected level of uncompensated losses over the total expected
level of all losses, for a given level of assets (reserves plus surplus) supporting a risk. For example, assume 99% of
the time losses are only $1, 1% of the time they are $100, and the total level of assets supporting this risk is $90.
Then expected uncompensated losses are $0.10. Total expected losses are $1.99. The expected policyholder deficit
is 0.10/1.99, or around 5%. For further discussion of this concept, see "Solvency Measurement for Property-
Liability Risk-Based Capital Applications" by Robert P. Butsic, published in the 1992 CAS discussion paper
program titled "Insurer Financial Solvency".
aggregate probability distribution of either losses or surplus (assets minus losses). This principle
is very similar to the tail-value-at-risk principle proposed by Meyers.37
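The arithmetic in footnote 36 can be reproduced directly; the short Python sketch below assumes a discrete loss distribution given as probability/loss pairs.

    def expected_policyholder_deficit(outcomes, assets):
        # EPD ratio: expected uncompensated losses divided by total expected losses,
        # for a discrete loss distribution given as (probability, loss) pairs.
        expected_loss = sum(p * x for p, x in outcomes)
        expected_deficit = sum(p * max(x - assets, 0.0) for p, x in outcomes)
        return expected_deficit / expected_loss

    outcomes = [(0.99, 1.0), (0.01, 100.0)]    # the footnote's two-point loss distribution
    print(expected_policyholder_deficit(outcomes, assets=90.0))   # 0.10 / 1.99, about 5%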
Each of the above principles contains an arbitrary coefficient λ, constant across classes of
business (and concealed in the utility function), that can be adjusted to yield the desired overall
underwriting profit or rate of return on surplus. In much of the literature the time element is not
addressed explicitly. It is straightforward, however, to apply the risk load to discounted liabilities.
The first two of the principles were applied in the practical context of increased limits
ratemaking at the Insurance Services Office (ISO) in the late seventies and early eighties.
During the eighties, regulatory pressures brought the Capital Asset Pricing Model (CAPM) into
the debate regarding how to incorporate risk into insurance prices. CAPM is founded on certain
axioms that are violated in the context of insurance pricing (e.g., no default, frictionless
markets), but this intrusion of modern financial theory stimulated much thought as to how the
risk load formalism can address enterprise-wide and market-wide issues that had been neglected
in the earlier formulations. The concept of systematic risk, already familiar to actuaries as
parameter risk, was incorporated into practical treatments intended for actual insurance pricing.
The answer given by this scheme is a contract risk loading proportional to the change in the
variance of the insurer's bottom line caused by the addition of that one contract to the insurer's
portfolio. This raised an interesting parallel with work being done at about the same time on
reinsurance pricing based on marginal surplus requirements. 39 The Competitive Market
Equilibrium result can be re-expressed in terms of the marginal surplus (risk capital) required to
support the additional business, and thus linked to the cost of risk capital. More recent work
using probability distributions has referenced the expected policyholder deficit concept, rather
than standard deviation, variance or probability of ruin to motivate the computation of marginal
surplus requirement and, therefore, of risk load.40,41
The above methods apply prospectively to situations where the losses have not yet taken place
and only rating information is available. For risk-adjusted valuation of insurance liabilities, such
methods would apply to the Unearned Premium Reserve (UPR) and Incurred But Not Reported
Reserves (IBNR). As long as one has some kind of runoff schedule giving estimates of the number
and type of claims not yet reported, one can apply these methods to estimate the variability of
unreported claims.
Estimating the variability of reported claims is a different problem because of the information
available to the insurance company about actual reported claims. Meyers has addressed the
problem in the context of reserving for workers' compensation pensions, using a parametric
model for the mortality table and calculating the variance of conditional future payments.42
Hayne has used the collective risk model with information about claim counts and severities as
the claim cohort ages and assumptions as to distributions and correlation structures to estimate
the distribution of outstanding losses.43 Heckman has applied distribution and regression
techniques to estimating the expected ultimate value of claims already reported and of IBNR
claims. 44 For the two latter methods, the conditional loss distribution provides the information
needed to calculate risk loads for the reserves.
There are some unsolved problems associated with approaches based on probability
distributions. Research is in progress to develop methods for measuring correlations of lines or
segments of the business with other segments, but there is no generally accepted approach for
incorporating correlations into the measure of risk. This is believed to be important, as these
correlations may make a significant contribution to, and in some cases may reduce, overall risk.
In addition, some of the risk load procedures such as those based on standard deviation and
variance approaches are not value additive. That is, the risk load of the sum is not equal to the
sum of the risk loads.
Advantages
• Actuaries have used the approaches for a long time to compute risk loads.
40 Meyers, Glenn, "The Cost of Financing Insurance," paper presented to the NAIC's Insurance Securitization
Working Group at the March 2000 NAIC quarterly meeting.
41 Philbrick, Stephen W., "Accounting for Risk Margins," Casualty Actuarial Society Forum, Spring 1994, Volume
1, pp. 1-87.
42 Meyers, Glenn G., "Risk Theoretic Issues in Loss Reserving: The Case of Workers Compensation Pension
Reserves," Proceedings of the Casualty Actuarial Society (PCAS), LXXVI, 1989, p. 171.
43 Hayne, Roger M., "Application of Collective Risk Theory to Estimate Variability in Loss Reserves," Proceedings
of the Casualty Actuarial Society (PCAS), LXXVI, 1989, pp. 77-110.
44 Heckman, Philip, "Seriatim Claim Valuation from Detailed Process Models," paper presented at the Casualty Loss
Reserve Seminar, 1999.
• This is an area of active research with many worked out examples of how the method can
be applied.
• The method is intuitive: risk load is related to actual risk for a body of liabilities.
• The data required to compute the risk loads is readily available within many insurance
companies and many actuaries are qualified to perform the computation.
• Many reserving actuaries are familiar with using aggregate loss probabilities to establish
confidence intervals around their reserve estimates.
• This method can be used with company-specific data.
• This method can be used by line to reflect unique line of business risks.
Disadvantages
• The approaches have often been criticized as being inconsistent with modern financial
theory, as classically formulated, relative to compensation for diversifiable risk. For
example, the risk loads often fail to satisfy the one-price rule, whereby two insurers
offering identical insurance coverage would charge the same price.
• Sometimes the weight given to process risk relative to parameter risk in determining the
risk load can appear to be too large. Many researchers and practitioners believe that risk
loads apply only to nondiversifiable (parameter or systematic) risk, not to unique (or
process) risk. It should be noted that it is not universally accepted that only nondiversifiable
risk matters when computing risk loads.45,46
• The risk loads may not satisfy value additivity. As a result, two companies with identical
lines but a different mix can have different risk margins (see discussion below).
• A large number of methods for doing these calculations exist, yielding a variety of
results. There is little guidance regarding which of the available methods is appropriate
for a given set of circumstances.
• Certain parameters are not only subjective, but there is little guidance on how to calibrate
them. For instance, only the more recent papers discuss a conceptual framework for
selecting λ.
• Parameters are often determined in a subjective manner and may therefore be inaccurate.
• Actuaries are still struggling with measuring the correlations between lines of business.
This may be a significant source of risk to companies.
Note that the lack of value additivity is not universally accepted as a disadvantage. For
example, some believe there is much less risk in a $1 million (undiscounted) share of a large
company's auto liability reserves than in the entire $1 million in undiscounted auto liability
reserves for a small regional insurer. Thus, the former may be worth more than the latter
(i.e., valued with a smaller risk margin).
Method 6 - Using the reinsurance market to estimate the fair value of liabilities
The reinsurance market offers the most direct approach to estimating the fair value of an
insurance company's liabilities. Blocks of liabilities are often sold either on a retrospective basis,
in transactions such as loss portfolio transfers, or on a prospective basis in more commonly
purchased excess of loss treaties. The price structures associated with these contracts provide
another glimpse of the implicit risk load required to record the liabilities at their fair value.
Reinsurance prices may require some adjustment before they could be used to estimate the fair
value of liabilities. For example, market prices offered by some reinsurers reflect an embedded
option value equal to the value of their default on their liabilities. Such market prices would have
to be adjusted upward to remove this default value. Another example is portfolio transfers that
include customer lists or renewal rights. The effect of these lists or rights on the total price
would have to be isolated and removed before the portfolio transfer price could be used for a fair
value estimate.
There are numerous practical issues that need to be addressed before the method can be
implemented in practice. For example, how would a ceding company measure the risk loading in
the reinsurer's price structure? How could the analysis of a particular treaty structured to reinsure
a portion of the company's liability be generalized to estimate the fair value of all its liabilities?
Possible approaches are:
• Reinsurance Surveys: On a regular basis, leading companies can be surveyed to evaluate the
risk loading implicit in their reinsurance structure. The survey can be structured to
discriminate between various lines of insurance and sizes of ceding companies. The implicit
risk loading can then be published and employed by all companies with a particular set of
attributes (size, type of business, balance sheet leverage, etc.). Note that this is a
controversial suggestion. (Asking companies to share loss information is one thing. Asking
them to share pricing information is something else entirely. First, the pricing "assumption"
may not be as objective an item as a loss amount. It may be a gut call that varies by sale.
Second, there are many more antitrust issues in sharing pricing information than in sharing
loss information.)
Conceptually, this would operate similarly to the PCS Catastrophe Options currently offered
by the Chicago Board of Trade. These options are priced based on an index, which is
constructed in the following way:
",4 survey of companies, agents, and adjusters is one part of the estimating process. PCS
conducts confidential surveys of at least 70% of the market based on premium-written
market share. PCS then develops a composite of individual loss and claim estimates
reported by these sources. Using both actual and projected claim figures, PCS
extrapolates to a total industry estimate by comparing this information to market share
data. "' 47
• Extrapolating from a company's own reinsurance program: Companies that submit their
reinsurance programs to bid will receive reinsurance market price information from a number
of providers. At a minimum, even the information contained in one well-documented bid
may be sufficient to compare the reinsurer's price to the ceding company's best estimate of
the ceded liabilities discounted at the risk-free rate. In practice, a number of adjustments to
this risk load may be appropriate. For example, if the only reinsurance purchased is high
layer excess, then the risk loading will be commensurate with the increased risk associated
with that layer. Publicly available increased limits tables (e.g., ISO) might be suitable in
some cases to evaluate the relative risk at each layer of coverage. An insurer's policy limits
profile can then be employed to evaluate the weighted total limits of their liability portfolio
and the resulting risk load.
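As a rough Python sketch of the comparison described above (hypothetical figures; no adjustment is made for the reinsurer's expenses, profit load or default value, which the surrounding text notes would be needed):

    def implied_reinsurance_risk_load(quoted_price, expected_ceded, rfr):
        # Excess of the quoted price over the ceded losses discounted at the risk-free rate,
        # expressed as a ratio to that present value.
        pv = sum(cf / (1 + rfr) ** t for t, cf in enumerate(expected_ceded, start=1))
        return (quoted_price - pv) / pv

    expected_ceded = [12.0, 8.0, 5.0]    # hypothetical ceded loss payout pattern (best estimate)
    print(implied_reinsurance_risk_load(26.0, expected_ceded, rfr=0.06))   # roughly 0.15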
Advantages
• The reinsurance market is the closest structure to a liquid market for insurance liabilities;
• Most insurers have access to the reinsurance market and can therefore gain information
regarding their unique risk profile;
• Similar to catastrophe options, once the survey results are published, it would be
relatively straightforward to estimate fair value
Disadvantages
• Results can be sensitive to capacity changes in the reinsurance market. As such, the
values at any point in time may not represent future values. In fact, in highly competitive
market cycles, a negative risk load could be obtained for some coverages.
• Unstable reinsurance prices also make it difficult to update estimates for each reporting
period. If the information required for the fair value estimate could not be obtained
quickly enough, all estimates would have to be recalculated each reporting period.
• The credit risk of the reinsurer's default on its obligation is embedded in the price. For
reinsurance, this can be material, and would have to be removed, but the isolation of this
item from the total price (and other risks) may be problematic.
• This approach would also raise difficulties in updating the values, as it would require
47 Chicago Board of Trade web site: PCS Catastrophe Insurance Options - Frequently Asked Questions.
regular surveys or continual shopping of ceded business to reset the risk charges.48
• Some reinsurance quotes are not transparent, so that the implied risk loading may be
difficult to ascertain. Often, the insurer and reinsurer would each have different estimates
of the expected loss and other components of price.
• The users of this method will only sample the reinsurance market; that is, they will not be
using the entire market for estimation. This could introduce bias.
• Reinsurance markets focus much more on prospective exposures rather than past
exposures, partly due to the current accounting treatment of most retroactive reinsurance
contracts. As such, there are fewer market prices potentially available (and a much
smaller market) for reinsurance of existing claim liabilities.
• Reinsurance prices embed antiselection bias. The price of reinsurance for the portion of
an insurer's portfolio ceded may be higher than the price if all risks were ceded.
48 Note that continual updates would be required under fair value accounting. This is because fair value accounting
is meant to be an idealized market value, i.e., an actual market value if a sufficiently active market exists, or an
estimate of what a fair market value would be otherwise. As such, a fair value estimate would have to be updated as
often as an active market value would be updated. In general, market values in an active market change constantly.
Method 7 - Direct estimation of market values
This is the method of Allen, Cummins and Phillips. 49 In this approach, a time series of publicly
traded insurer data is analyzed. The output of the analysis is an estimate of the market value of
each insurer's liabilities for each year of the history. The market value of liabilities is derived by
subtracting the market value of the equity from the market value of total assets. The market value
of equity is calculated by extending the method of Ronn and Verma to avoid the problem of
including intangible asset values in the equity measurement.50 Here, the equity value is
determined so that the measured volatility of the insurer's stock price and of its asset values are
consistent. This method is described in the section on measurement of credit risk. The market
value of assets is estimated from the separate asset categories, most of which are publicly traded.
The market value of liabilities thus obtained contains an embedded option value equal to the
value of default on the liabilities. This value of the default can be separately determined by the
Ronn-Verma method.
Adding back the default value gives the market value of the liability as if there were no credit
risk. Next, the nominal (undiscounted) value of the liability is compared to the no-default market
value to determine the implied interest rate at which the nominal value is discounted to get the
market value. This calculation requires an estimation of the payment pattern of the liabilities
(also used in the above-average payment duration). The risk margin, as a reduction to the risk-
free rate, is the difference between the risk-free rate and the implied rate underlying the market
value.
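A Python sketch of that back-solving step with hypothetical inputs (the published method estimates the no-default market value and the payment pattern far more carefully):

    def implied_liability_rate(nominal, pattern, market_value, lo=-0.5, hi=0.5):
        # Bisect for the single rate at which the nominal liability, paid out according to
        # 'pattern' (fractions by year), discounts to the no-default market value.
        pv = lambda r: sum(nominal * w / (1 + r) ** t for t, w in enumerate(pattern, start=1))
        for _ in range(100):
            mid = (lo + hi) / 2.0
            lo, hi = (mid, hi) if pv(mid) > market_value else (lo, mid)
        return (lo + hi) / 2.0

    pattern = [0.40, 0.35, 0.25]     # hypothetical payout fractions by year
    implied = implied_liability_rate(nominal=100.0, pattern=pattern, market_value=95.0)
    risk_free = 0.06
    print(implied, risk_free - implied)   # the second figure is the risk margin (rate reduction)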
Advantages
• The method is theoretically sound. It produces a risk load consistent with modern financial
theory without requiring the calculation of a beta.
• The method is objective and the analysis is reproducible.
• The method is a type of direct measurement of liabilities that may be desirable to the
accounting profession. However, the measurement is direct for the industry, but not for a
particular company.
Disadvantages
• There are difficulties with the estimation of parameters:
a) Some insurers in the data are also life insurers, or involved in multiple lines not
relevant to a particular company at issue; carving them out requires estimating the
49 Allen, Franklin, J. David Cummins and Richard D. Phillips, 1998, "Financial Pricing of Insurance in a Multiple
Line Insurance Company," Journal of Risk and Insurance, 1998, volume 65, pp. 597-636.
50 Ronn, Ehud I., and Avinash K. Verma, 1986, Pricing Risk-Adjusted Deposit Insurance: An Option-Based Model,
Journal of Finance, 41 (4): 871-895.
market equity value of these other operations.
b) Some companies are members of financial conglomerates, or general conglomerates
(e.g., General Electric).
c) Not all insurers are publicly traded. These include foreign companies, privately held
companies and mutuals or reciprocals.
• The liabilities may be under- or overstated in the financial statements. Therefore, the
market value may reflect an adjustment to the book value, based on market perceptions of
this bias. Any perceived change in this bias may make prior history unusable.
• Measurement problems make it difficult to provide a stable estimate for individual line of
business risk margins. It is also difficult to get a reliable estimate for an individual firm.
• Most actuaries don't have any experience with this method. It has not yet been used in
practice.
Method 8 - Distribution Transform Method
A number of authors have proposed risk-loading procedures based on transforming the aggregate
loss probability distribution.51 The risk-loaded losses are computed from the mean of the
transformed distribution. A simple example of such a transform is the scale transform:

x → kx
As a simple but unrealistic example (unrealistic because insurance losses tend to have positive skewness), suppose x is
a normal variable, that is, aggregate losses follow a normal distribution, and k is 1.1; then the loss
distribution's expected mean is shifted upwards by 10%. Thus, a company purchasing the liabilities
would require 10% above the present value of the liabilities (at a risk-free rate) in order to be
adequately compensated for the riskiness of the liabilities. If one is using this distribution to
compute primary losses for an exposure where limits apply to losses in the aggregate, the
expected mean would be increased by less than 10%, but losses excess of the primary limit would be
increased by more than 10%.
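The layer effect can be made concrete with a brief simulation. The sketch below relies on assumptions that are not part of the original example (a lognormal aggregate loss distribution with arbitrary parameters and an arbitrary primary limit); it only illustrates that a uniform 10% scale transform loads capped primary losses by less than 10% and excess losses by more than 10%:

# A minimal Monte Carlo sketch of the scale transform x -> kx with k = 1.1.
# The lognormal assumption, its parameters and the primary limit are
# hypothetical; they only illustrate the layer effect described in the text.
import math
import random

random.seed(0)
k, limit, n = 1.1, 2000.0, 200_000
mu, sigma = 7.0, 1.0                       # lognormal parameters (assumed)

losses = [math.exp(random.gauss(mu, sigma)) for _ in range(n)]

def layer_means(sample):
    """Average capped (primary) and excess losses for a given primary limit."""
    primary = sum(min(x, limit) for x in sample) / len(sample)
    excess = sum(max(x - limit, 0.0) for x in sample) / len(sample)
    return primary, excess

p0, e0 = layer_means(losses)
p1, e1 = layer_means([k * x for x in losses])   # transformed (risk-loaded) losses

print(f"total mean loading:   {k - 1:.1%}")
print(f"primary mean loading: {p1 / p0 - 1:.1%}")   # comes out below 10%
print(f"excess mean loading:  {e1 / e0 - 1:.1%}")   # comes out above 10%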
In the more recent literature on the transform method, the power transform is used.52 (Other
transforms such as the Esscher transform also appear in the literature.) This approach raises the
survival or tail probability to a power.
S*(x) = S(x)^r
where S(x) = the original survival distribution, 1 - F(x) (i.e., 1 minus the cumulative probability
distribution);
S*(x) = the transformed survival probability.
If r is between 0 and 1, the tail probabilities will increase and the transformed distribution will
have a higher mean than the original distribution.
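A short numerical check, sketched below under an assumed exponential loss distribution (an assumption made only for tractability, not one drawn from the text), shows the transformed mean exceeding the original mean when r is less than 1:

# A numerical check of the power (PH) transform S*(x) = S(x)^r, assuming, purely
# for illustration, an exponential loss distribution with mean 1,000 and r = 0.8.
# For the exponential, S(x)^r = exp(-r*x/theta), so the transformed distribution
# is again exponential with mean theta/r; the integral below reproduces that.
import math

theta, r = 1000.0, 0.8                     # assumed mean and transform parameter

def survival(x):
    return math.exp(-x / theta)

def mean_from_survival(S, upper=20_000.0, steps=200_000):
    """E[X] = integral of the survival function (trapezoidal rule)."""
    h = upper / steps
    return h * (0.5 * S(0.0) + sum(S(i * h) for i in range(1, steps)) + 0.5 * S(upper))

original_mean = mean_from_survival(survival)
transformed_mean = mean_from_survival(lambda x: survival(x) ** r)

print(f"original mean     ~ {original_mean:,.1f}")       # ~ 1,000
print(f"transformed mean  ~ {transformed_mean:,.1f}")     # ~ 1,250 = theta / r
print(f"implied risk load ~ {transformed_mean / original_mean - 1:.1%}")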
The choice of the transformation parameter r is guided by the uncertainty of the business being
priced. The greater the uncertainty, the lower r will be.
51 Venter, Gary G., 1991, "Premium Implications of Reinsurance Without Arbitrage," ASTIN Bulletin, 21, No. 2: 223-
232. Also,
Wang, Shaun, 1998, "Implementation of the PH-Transform in Ratemaking," [Presented at the Fall 1998 meeting of
the Casualty Actuarial Society]. Also,
Butsic, Robert P., 1999, "Capital Allocation for Property-Liability Insurers: A Catastrophe Reinsurance Application,"
Casualty Actuarial Society Forum, Fall 1999.
52 Wang, Shaun, 1998, "Implementation of the PH-Transform in Ratemaking," [Presented at the Fall 1998 meeting of
the Casualty Actuarial Society]. Also,
Venter, Gary G., 1998, (Discussion of) "Implementation of the PH-Transform in Ratemaking," [by Shaun Wang;
presented at the Fall 1998 meeting of the Casualty Actuarial Society].
In practice, this may mean that one
calibrates the parameter by selecting a transformation that approximates current market premiums
for a given class of exposures. Wang suggests that using a distribution transformation to derive risk
loads is equivalent to including a provision for parameter risk, but not process risk, in the
formula for risk loads. Thus, one might select r based on subjective probabilities about the
parameter uncertainty of the business.
Wang (1998) has suggested that one could apply this approach in two ways.53 The first applies
a transform separately to the frequency and severity distributions used to price policies. The
second transforms the probability distribution of aggregate losses (i.e., the convolution of the
frequency and severity distributions). However, Venter suggests that one could obtain
inconsistent results when applying a transform to aggregate losses, and prefers working with the
frequency and severity distributions.54
Option pricing theory and the distribution transform method are related. The parameters of the
probability distributions used in the option pricing formulas typically reflect "risk neutral"
probabilities, rather than real probabilities. Thus, for example, the parameters used to price
interest rate options are generally derived from current actual prices of bonds of different
maturities, or from the current yield curve, rather than from empirical time series data of the
various interest rates. One could view the "risk neutral" probabilities as a transformation of the
distribution for the underlying asset values.
Advantages
• The method produces a risk load consistent with modern financial theory without
requiring the calculation of a beta. Risk loads are value additive. (Note again that there
is not universal agreement among actuaries that risk loads should be value additive.) The
approach is similar to that used in pricing options.
• The method is conceptually straightforward to understand and explain. Once r or a similar
parameter has been selected, it can be reused subsequently.
• This approach is currently used in reinsurance pricing.
• It is theoretically viable for estimating risk loads by layer. Many of the other methods do not
address layers or deductibles.
• It is an area of active research for those investigating risk load methodologies.
Disadvantages
• It is not in common use for producing prices or risk loads on primary business. Currently
its primary use is in producing risk load for layers.
53 Wang, Shaun, 1998, "Implementation of the PH-Transform in Ratemaking," [Presented at the Fall 1998 meeting of
the Casualty Actuarial Society].
54 Venter, Gary G., 1998, (Discussion of) "Implementation of the PH-Transform in Ratemaking," [by Shaun Wang;
presented at the Fall 1998 meeting of the Casualty Actuarial Society].
• As currently applied, in order to calibrate the parameters, it often requires knowledge of the
risk loads on primary business.
• Because it is a new approach, actuaries are not as familiar with it as with some of the others
presented in this paper.
• The parameters may be selected based on the analyst's experience with a particular line of
business. This introduces an element of subjectivity, where different analysts may choose
different values for the parameter.
• It is not clear which transform choice to use. Many of the transformation methods are
chosen for their mathematical tractability, and are not supported with empirical evidence.
Method 9 - The Rule-of-Thumb Method
The methods presented so far require that the person computing the risk-adjusted present value
of liabilities do original analytical work. In some situations there may not be adequate data or
other resources to develop the risk adjustment from scratch. In such situations it might be
appropriate to use a rule of thumb that provides a "quick and dirty" way to derive a risk
adjustment. Such methods would be relatively easy to apply but would produce broadly
reasonable results. Examples of rules of thumb would be:
• Compute a risk-adjusted discount rate by subtracting 3% from the risk-free rate.
• The risk load should be 10% of the present value of General Liability liabilities and 5%
of the present value of Homeowners liabilities.
The numbers in the examples above are for illustrative purposes only. A separate body of
actuaries and other experts could determine actual guideline values. This group would review
existing research and perform additional studies where necessary. Quite likely, it would
consolidate results from using one or more of the other methods in this document.
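The arithmetic involved is deliberately trivial. The sketch below applies the two illustrative rules to hypothetical present values of liabilities; none of the percentages or amounts are actual guideline values:

# A sketch of applying the two illustrative rules of thumb above.  The 3%, 10%
# and 5% figures are the paper's illustrative placeholders, not actual guideline
# values, and the liability amounts are hypothetical.

risk_free_rate = 0.06

# Rule 1: risk-adjusted discount rate = risk-free rate minus 3 points.
risk_adjusted_rate = risk_free_rate - 0.03

# Rule 2: flat risk load as a percentage of the risk-free present value.
rule_of_thumb_loads = {"General Liability": 0.10, "Homeowners": 0.05}
pv_at_risk_free = {"General Liability": 50_000_000, "Homeowners": 20_000_000}

print(f"risk-adjusted discount rate: {risk_adjusted_rate:.1%}")
for line, pv in pv_at_risk_free.items():
    load = rule_of_thumb_loads[line]
    print(f"{line}: risk load {pv * load:,.0f}, risk-adjusted liability {pv * (1 + load):,.0f}")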
Advantages
• For the individual company, it would be simpler to apply than any of the other
alternatives. It would reduce the work effort for actuaries and others, who would not have
to separately develop risk adjustments.
• This approach may lead to industry standard risk adjustments being used, thus creating
comparability from company to company.
• It may reduce the likelihood that a risk adjustment methodology can be used to
manipulate a company's financial statements.
Disadvantages
• Fair values produced using this approach may be less accurate because the unique risk
factors for a company may not be reflected.
• It precludes actuaries from applying methods that reflect new developments for
determining risk adjustments.
• An industry body may be required to perform research to parameterize the risk
adjustments. This may create antitrust issues. It is not clear that the industry body would
be sufficiently authoritative for its research to be used in financial valuations.
Method 10 - Alternative Methods
This paper has presented a number of possible approaches to estimating the fair value of
insurance liabilities. Most of these approaches are rooted in analytical methods documented in
the actuarial literature. However, research continues into how to determine risk adjustments. Not
all current developments are covered in this paper and undoubtedly others will be published. A
company may wish to use alternative approaches not presented in this paper. In such cases, there
are a number of points one should consider:
• Once selected, the approaches should be used consistently. Changing approaches from
year to year may result in inappropriate income statements.
Converting a risk-adjusted discount to an additive risk load
A number of the methods presented in this paper produce an adjustment to the risk-free discount
rate. Risk-adjusted present values of liabilities are then derived by discounting the liabilities
using the risk-adjusted rate. An approach to deriving a dollar-value risk load is to work from the
risk-adjusted discount rates. This approach might be used if one wanted to discount losses at the
risk-free rate and apply the risk load to the losses directly. The procedure begins by discounting
the liabilities at the risk-adjusted and the risk-free rate. It then computes the difference between
the two discounted quantities. The risk load is this difference divided by the present value of the
liabilities, discounted at the risk-free rate. The table below presents an example where this
calculation is performed for liabilities of various durations, when the assumed risk-free rate and
the risk adjustment remain constant.
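A sketch of the same calculation, using hypothetical rates (a 6% risk-free rate and a 3% risk adjustment) and a single payment at each duration, shows how the additive risk load grows with the duration of the liability even though the rates themselves are held constant:

# A sketch of converting a risk-adjusted discount rate to an additive risk load,
# with hypothetical rates and a single payment at each duration.
# risk load = (PV at risk-adjusted rate - PV at risk-free rate) / PV at risk-free rate

risk_free_rate = 0.06
risk_adjustment = 0.03                       # reduction to the risk-free rate
risk_adjusted_rate = risk_free_rate - risk_adjustment

for duration in (1, 2, 5, 10):               # years until the liability is paid
    pv_risk_free = 1.0 / (1.0 + risk_free_rate) ** duration
    pv_risk_adjusted = 1.0 / (1.0 + risk_adjusted_rate) ** duration
    risk_load = (pv_risk_adjusted - pv_risk_free) / pv_risk_free
    print(f"duration {duration:2d} years: additive risk load ~ {risk_load:.1%}")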
Unearned Premium (or Unexpired Policy) liability methods
As noted in the background section, a fair value accounting system focuses on the measurement
of assets and liabilities, not income. As such, the current recording of unearned premium under
U.S. GAAP accounting conventions would be replaced with the fair value of the business written
but not yet earned. The methods used to estimate this fair value have much in common with the
above methods that estimate the fair value of the liabilities for unpaid losses. However,
additional methods may be applicable since it may be easier to discern the market prices
underlying earned premium. One can argue that the booked premium represents the "market
price" charged by the particular insurer.
One area where such additional methods may be needed is property insurance, particularly where
catastrophe exposure exists.
• The price at which the business was written, the original entry price. The initial fair
value for a policy's liability may be the premium charged (less expenses).
• The price at which similar business is currently being written by the market, e.g., a broad
average price. It is an indication of the current entry price. (This value may only be
available retrospectively shortly after the balance sheet date.)
• The price at which reinsurance is being purchased for this risk, both quota share
reinsurance, which prices the entire risk, and excess of loss reinsurance, which should
provide a market guide to one of the more volatile components of the risk. This also is an
indication of the current exit price.
• An actuarial estimate of the expected value of discounted losses associated with the
business written but not yet earned, adjusted for risk. The estimate of the necessary risk
adjustment would be based on the above methods for estimating the market value of
unpaid losses. In particular, return on equity models, internal rate of return models, and
models based on the aggregate probability distribution of losses can be directly applied
to future losses (losses not yet incurred on business written).
Note that the actuarial methods applicable to lines of business that contain a significant
catastrophe potential may require modification to consider the seasonality of the exposures.
Summary
A number of methods for computing risk adjustments to discounted liabilities have been
presented. These are the approaches that the committee thought were worthy of discussion. Not
all would be feasible for the individual company actuary to implement. As fair value becomes
established as an accounting procedure, more research and application will be performed, and
more methods will become feasible.
Some methods would require an "official" body such as a committee of the American Academy
of Actuaries to perform research to establish parameters. Once established, the parameters could
thereafter be used at individual companies without further research or analysis being required.
This would hold only if one agrees that it is acceptable to ignore risks that are unique to
companies, such as those classified under diversifiable risk.
Methods such as those based on CAPM and IRR pricing models should be straightforward to
modify for estimating the fair value of liabilities. Actuaries are also well acquainted with
methods based on aggregate probability distributions. Actuaries should be able to apply one or
more of the methods to a line of business for which they are computing risk-adjusted discounted
reserves.
Some methods are more appropriate for some lines of business. For instance, methods based on
using risk-adjusted discount rates have been applied to lines of business with longer tails such as
Automobile Liability and Workers Compensation. However, they may be inappropriate for short
tail volatile lines such as property catastrophe because the risk is not time-dependent. Methods
based on applying aggregate probability distributions might be appropriate for such short tail
volatile lines. However, their use outside of increased limits and catastrophe pricing has not
been well researched.
The direct estimation method is relatively new and has only been applied by academic
researchers. Therefore, it could be difficult for practitioners to apply until further study has been
done. Using reinsurance pricing to develop a risk load is, in principle, the most consistent with
computing market-based estimates of liabilities. However, due to limitations on available data,
the extent of the market, and a lack of published research on the approach, it might be difficult to
apply in practice. There might be special situations where it could be used, such as in evaluating
catastrophe liabilities.
In general, risk adjustments based on industry-wide information will be more stable than risk
adjustments based entirely on company-specific data. Also, risk adjustments based on individual
line of business data will be less stable than risk adjustments established using all-lines data.
However, such risk adjustments will fail to incorporate some of the risk components that are
unique to lines of business or to companies.
This summary and discussion provided by the task force of methods available for computing the
risk-adjusted present value of liabilities demonstrates that actuaries have the theoretical
understanding needed to implement fair valuing of insurance liabilities. We have identified a
number of models that are available and appropriate for actuaries to use in estimating fair value
liabilities. No issues have been identified that are not susceptible of actuarial estimation.
The following table summarizes our findings on the methods of deriving risk adjustments.
Summary of Features of Estimation Methods
Features indicated for each method: uses industry data; uses company-specific data; has a specific
time element; uses leverage ratios; incorporates systematic risk; incorporates process risk; is value
additive; commonly used in pricing; commonly used for reserve margins.
Methods compared: CAPM; Internal Rate of Return; Single Period RAD; Using Underwriting
Results; Based on Probability Distributions; Based on Reinsurance; Direct Estimation; Distribution
Transforms; Rule-of-Thumb Methods.
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
Section E - Accounting Presentation Issues
The purpose of this section is to discuss financial reporting presentation issues resulting from a
change to fair value accounting. Financial reporting presentation deals with the design of the
reporting template, i.e., what financial values should be displayed, and in what format. It
assumes that any required value can be determined, such as through the various methods in
Section D. While many implementation issues may arise from the choice of a particular
reporting template, such issues will not be discussed in this section. All implementation issues
will be discussed in the next section (Section F), whether arising from the estimation method
chosen (Section D), or arising from the presentation template chosen.
The following actuarial presentation issues will be discussed. This list is meant to stimulate
awareness of the various actuarial issues/concerns surrounding presentation and fair value
accounting. It is not meant to give definitive guidance on how presentation should be done. The
final choice of any presentation template is a judgment call, depending on the goals, priorities
and preferences of the template designer(s).
a) Risk margins. The risk margin for a given coverage year runs off over time to a value of
zero as the losses are paid. In addition, the perception of risk changes over time. For
example, the risk margin for hurricane losses would have been valued lower before the
recent large hurricane losses in Florida. The perceived risk for mass tort liabilities is also
now much greater than was believed in the 1970s and prior. Are the purposes of these
historical exhibits furthered or hindered by including historic risk margin estimates in
the reported history?
b) Time value of money. The amount of discount runs off to zero as losses are paid out.
Interest rates also fluctuate over time. As such, historical exhibits that reflect the time
value of money might show development trends impacted strictly by changes in new
money investment yields or the unwinding of interest discount. The economic impact of
these trends depends on how the corresponding asset portfolio was impacted. How
should the historic loss development exhibits handle this issue?
A possible way of addressing the above two sub-issues might be to require historic loss
development exhibits to be on an undiscounted, expected value basis. This would isolate
the issues surrounding the expected value estimate (although it would ignore the issues
surrounding the amount of the discount or risk margin). An alternative approach for
evaluating the amount of the discount would be to require loss development exhibits to
show all actual and projected values discounted back to the beginning of the coverage year.
This would allow reflection of time value of money issues and expected value estimate
issues, without the distortion from interest rate fluctuations. The issue would remain
regarding whether to use the historical interest rates at the first valuation of the coverage
year or restate at the current interest rates.
2. Disclosure of fair value estimation methods - Should the methods used to determine the fair
value estimates be disclosed in the financial statements and if disclosed, where, and in what
levels of detail? Depending upon the method(s) employed, the fair value components may
differ by line of business as well as subline of business, duration of payments, location of
the liabilities, and the currency that will pay out the liabilities. In addition, any changes to
the method(s) or the values used to determine the fair value of liabilities may need to be
disclosed in the financial statement.
3. Gross versus net of reinsurance and other recoverables - A decision needs to be made with
regard to how much of the fair value presentation should be on a gross versus net basis.
Should fair value adjustments be included in both gross and recoverable reportings, or
would an overall net adjustment suffice? Where various amounts are reported in more
detail, should these fair value adjustments be disclosed in the aggregate or by individual
reinsurer or excess insurer (for a self-insured's financial statement)?
Long duration policies cause additional presentation issues if premium revenue is defined as
written premium. Should revenue from long duration policies be reported or disclosed
separately in financial reports, so as not to distort analyses of annual exposure growth?
These policies may also distort otherwise reported policy year loss development trends.
Should a single long duration policy be broken into separate 12-month policies for the
purposes of policy year loss development exhibits?
Special policy features such as death, disability, and retirement benefits may also be
impacted by a change in premium recognition. Should such benefits be accounted for as
loss reserves or as unexpired policy benefits, under a fair value system?
a) Unwinding of interest discount - The principal question here is whether the unwinding of
interest discount should be separately reported in income, and if so, where? Currently
when companies discount property/casualty loss reserves for anticipated investment
income, the unwinding of this discount over time flows through underwriting income, as
a change in incurred losses, and is not separately identified. Discount unwinding for life
insurance reserves also flows through as a change in incurred losses, but is separately
identified in U.S. statutory accounting statements. Alternatively, the unwinding could be
reported as interest expense, not in underwriting income.
Reflection of this unwinding in incurred losses maintains consistent treatment of any item
affecting paid or outstanding losses, at the cost of distorting comparisons of losses to
charged premiums. This distortion is caused by premiums being fixed in time, with no
reflection of future investment income potential. If loss reserve discount is all unwound
in incurred losses, then reported histories of incurred losses to premiums will tend to
show excessive loss ratios for any long-tail line, distorting the true profitability picture.
Reflection in interest expense allows more direct comparisons of losses to charged
premiums.
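A small numerical sketch of the unwind, using hypothetical amounts, may help frame the question of where the amount should be reported:

# A sketch of the unwinding of discount over one year, with hypothetical values:
# a reserve of 100 payable in three years, discounted at an assumed 5% rate.
# The increase in the carried reserve is the unwind, which could be reported
# either as incurred loss (underwriting income) or as interest expense.

rate = 0.05
nominal = 100.0

reserve_now = nominal / (1 + rate) ** 3        # carried value today
reserve_next_year = nominal / (1 + rate) ** 2  # carried value one year later, no change in estimate
unwind = reserve_next_year - reserve_now       # equals reserve_now * rate

print(f"carried reserve today:     {reserve_now:.2f}")
print(f"carried reserve next year: {reserve_next_year:.2f}")
print(f"unwinding of discount:     {unwind:.2f}")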
b) Interest rate changes. How should changes in market interest rates used in discounting
existing liabilities be reflected? Should the effect of these changes flow through
underwriting? Should the effect flow through investment earnings? Should it be
reflected in the same manner as unrealized capital gains, as a change in interest rates
should affect both liabilities and assets similarly in a matched portfolio? Or should
changes in loss reserves for any purpose other than unwinding of discount (e.g., change
in expected ultimate payout, change in expected payment pattern, change in interest rates,
etc.) all be reported in the aggregate, with no differentiation as to the cause?
c) Experience adjustments and changes in assumptions. Another issue is how an insurer
should present the effect of experience adjustments and changes in assumptions. Should
changes due to actual cash flows being different from expected be reported separately
from changes in assumptions about the future? The first are "realized" and the second are
currently "unrealized". Should there be an effort to keep consistency with how similar
issues for invested assets are treated? Should changes in risk margins be isolated, or
combined with changes in any other assumptions?
6. Consistent Treatment of Assets and Liabilities - This issue arises whenever recoveries are
available (beyond the initial premium) to offset changes in the estimated liabilities.
Examples include retrospectively rated insurance policies, deductible policies, policyholder
dividends, (re)insurance policies for which reinsurance (or retrocession) protection exists,
and contingent commission plans (on reinsurance contracts). In these examples the change
in a claim (or similar) liability should lead to an offsetting change (either in full or in part) in
either an asset or another liability.
For example, a direct retrospectively rated insurance policy may be subject to reinsurance.
This could result in at least three balance sheet entries after losses have started to occur:
• a liability for direct claims
• an asset (liability) for additional (return) premiums on the retrospectively rated policy
• an asset or contra-liability for the portion of the claim liability that is recoverable from
reinsurers.
The presentation issue concerns how to report these amounts and their fair value adjustments
in a consistent manner, and in such a way that their individual adjustments will
not easily be taken out of context.
(Note that to the extent the retrospective rating plan and the reinsurance coverage transfer
risk, the overall net risk adjustment for all three items should be less than the risk adjustment
on direct claim liabilities. This implies that the risk adjustment for some of the individual
components may be a help to surplus.)
8. Disclosure of Credit Standing Impact - If the fair value of liabilities is to include the impact
of credit standing, these impacts should probably be disclosed separately in the financial
statements. (The credit standing issue is discussed in more detail in Section H.)
9. Consolidated Financial Statements - Fair valuation generally requires that transactions be
measured as if they were at arms-length. A key question regarding consolidated versus legal
entity reporting is the difficulty in measuring fair value for legal entities of the same quota
share group, especially when applied to a fresh start valuation of old claim liabilities. Thus,
it may be necessary to estimate fair value for each pool member's direct book of business
separately, rather than determining the fair value of the total quota share pool and then
allocating the total pool result to the pool members.
A related issue is how to report values containing risk margins if the component reporting
entities have risk margins that do not add to the total risk margin of the consolidated entity.
Should the component risk margins be scaled back to show value additivity?
10. Regulation and Tax Requirements - The change to fair value will impact both the absolute
value of many of the statement items as well as the format of the financial statements. This
may impact existing regulatory and tax use of financial information that may have come to
depend on the existing financial statements. The final "fair value" statements may have to
include accommodations for these needs. Alternatively, the regulatory and tax processes
could be changed to adapt to the new financial statements. A third alternative would be to
create additional supplemental reporting, based on the old accounting standards, as if
nothing had changed. Examples of areas potentially impacted include federal income taxes,
solvency testing, and market conduct exams.
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
Section F - Implementation Issues
Introduction
Up to now, this paper has dealt primarily with two areas associated with fair valuing insurance
liabilities. The first of these areas was contained in Sections C and D, "Fair Value Alternatives"
and "Methods of Estimating Risk Adjustments." Sections C and D discuss a variety ofways that
a liability's fair value could be determined in theory. The second area addressed so far in the
paper was that of presentation. This was the subject of Section E.
The current section, Section F, goes the next step, discussing issues arising from the
implementation of these concepts and methods and presentations. Implementation issues can be
categorized as:
1. Issues related to the availability and usability of market information. These include:
1.1. The robustness of the transactions occurring in the marketplace.
1.2. Intangibles included in market prices that might not be relevant in a fair value liability
valuation.
1.3. Influence of information asymmetry on market prices.
1.4. The existence of disequilibriums or temporary disruptions in market prices.
1.5. The lag between event occurrence and the reporting of the event in the marketplace.
2. General issues related to developing parameters for fair value methods. These are issues
that are not related to any particular fair value methodology. Rather they deal with concepts
that can be thought of as some of the theoretical underpinnings of fair value accounting.
These include:
2.1. Whether or not a risk charge should always be included in the fair value of a liability.
2.2. What properties a risk charge should have, specifically related to the inclusion of a
value for diversifiable risk and value additivity.
2.3. Whether an adjustment for an entity's own credit risk should be included in that entity's
fair value valuation of its liabilities.
2.4. The issues that need to be weighed when deciding to use industry-wide data or
company-specific data in a fair value calculation.
3. Application of fair value methodologies - general issues. This section discusses issues
that relate to questions that fair value practitioners will need to address when preparing fair
value financial statements. These issues are ones that relate to how to physically create
numbers to put on fair value financial statements, but that are not specific to any one
methodology. Included under this heading are:
3.1. The steps the actuarial profession might need to take to prepare for the implementation
of a new requirement.
3.2. What items should contain fair value adjustments in their carrying value?
3.3. How renewal business ought to be considered when developing fair values.
3.4. How judgment should be accommodated when developing fair value estimates.
4. Application issues related to specific methodologies. These are issues, such as reliance on
public data, sensitivity to the chosen historical time period, and suitability for particular lines
of business, that arise when individual estimation methods are applied.
5. Presentation issues. These are issues associated with the actual presentation of results in a
fair value financial statement. Items include:
5.1. Updating carried values from valuation date to valuation date, especially between full-
scale analytical re-estimations of appropriate carrying values (in accounting parlance, a
"fresh-start" valuation).
5.2. Issues associated with the initial development of exhibits that show historical
development.
1. Issues related to the availability and usability of market information
This is the first item to be discussed because it is FASB's and the IASC's stated preference that
market valuations be used wherever possible. However, we are skeptical as to the usability of
market information for developing fair value valuations of insurance liabilities. The five specific
reasons for this skepticism are as follows:
1.1. Is the observed market active and robust enough for fair value estimation purposes?
A key principle espoused by both FASB and the IASC is that the first choice for the
development of fair values is from the marketplace. 55 However, there is not currently much
o f an active market that can be used to establish price comparisons. Moreover, the
transactions that are being done may suffer from a lack o f " m a r k e t relevancy" whereby the
marketplace transaction was for a block of liabilities that was similar but not exactly the
same as the block of liabilities a company is trying to value. The company in this situation
is faced with trying to decide how the market would respond to the differences between the
c o m p a n y ' s liabilities and those that were involved the marketplace transaction.
1.2. The observed market values may contain intangibles not relevant to the valuation at
hand. A similar but unrelated marketplace issue is the quantification of the value of
noneconomic considerations in a market price. A company could have a variety of reasons
for accepting one market price over another that are particular to that company. One
example could be the nature of the relationship that exists with a particular reinsurer. The
chosen reinsurer might not be the lowest cost option available to the company, but because
the company trusts its relationship with the reinsurer, the company may feel the
noneconomic "relationship value" is worth the extra cost. A different company looking to
price a similar block of liabilities might not have the same relationship with a reinsurer.
For the second company, then, the relationship value does not exist, and the market price
assigned to the first company's liabilities would not be an appropriate valuation for the second
company's liabilities.
55 There is no universally accepted definition of "fair value" to date, although definitions all follow the same general
concept given by this short definition. The detailed definition that FASB is proposing can be found in FASB's
Preliminary Views document titled "Reporting Financial Instruments and Certain Related Assets and Liabilities at
Fair Value," dated December 14, 1999, and labeled "No. 204-B." The definition starts in paragraph 47, with
discussion and clarification continuing through paragraph 83. Paragraph 47 states: "Fair value is an estimate of the
price an entity would have realized if it had sold an asset or paid if it had been relieved of a liability on the
reporting date in an arm's-length exchange motivated by normal business considerations. That is, it is an estimate
of an exit price determined by market interactions."
The IASC has a similar definition (found on page A181 of their Insurance Issues Paper, released November 1999).
It reads: "The amount for which an asset could be exchanged, or a liability settled, between knowledgeable, willing
parties in an arm's length transaction."
1.3. Available market information, such as stock analyst estimates or isolated reinsurance
prices, may not be reliable due to information asymmetry. The market price for an
actual liability traded on an active market is likely to be quite different than the market
value of an insurer's entire portfolio of liabilities. It is the latter item that is important in
fair value accounting, not the former. Unless all the insurer's liabilities are transferred, the
assuming reinsurers will quite rationally believe that the ceding insurer is selecting against
the reinsurer. This situation arises because the market (the reinsurers) does not have access
to the insurer's private information on the liabilities. Thus, the "actual market price" might
not be a better fair value representation than an internal cash flow-based measurement
unless most of the insurer's liabilities are actually transferred.
1.4. Market data available at a given valuation date may be distorted by disequilibriums
or temporary disruptions. The existence of an underwriting cycle can be viewed as
tangible evidence of the ongoing disequilibrium in the insurance marketplace, whereby
product pricing swings back and forth between underpricing and overpricing generally over
a seven-to-ten-year cycle. Market disruptions can be characterized as new events that lead
to significant uncertainty and temporary disruption in the market for insurance products.
Examples can include a threatening hurricane, a newly released wide-ranging court decision
and new legislation (e.g., Superfund, or California Proposition 103). At such times, market
prices right after the event may be wildly speculative, or the market may even be
suspended, greatly complicating the use of market prices for fair value valuations.
1.5. The data available in the marketplace may be out of date. Depending upon the source
being considered, there are often lags between event occurrence and event reporting. For
example, an insurer, on behalf of its participation in an underwriting pool, may be exposed
to certain liabilities that will ultimately be shared by all members of the underwriting pool.
If someone were to base a fair value estimate on the pool's reported financials, the fair
value estimate could reflect a lag of anywhere from several months to several years
between when the pool actually experienced the results being reported and the reporting of
them.
2. General issues related to the development of parameters for fair value methods
These issues are ones that do not specifically pertain to any one fair value method. These are
"concept-type" items. Some of these, such as risk charge and credit risk, are items that relate to
the general concepts that will underlie fair value implementation. Others, such as the use of
industry-wide versus company-specific assumptions, are issues that cannot be resolved with a
global decision and instead will need to be considered each time that a fair value methodology is
applied.
2.1. Should a risk charge always be incorporated into the fair value of a liability? Most of
the guidance to date (from the FASB and IASC) mandates including such a risk charge
when it is material and estimable, and can be "estimated" from market information.
Paragraph 62 of FASB's Statement of Financial Accounting Concepts No. 7, Using Cash
Flow Information and Present Value in Accounting Measurements, says:
"An arbitrary adjustment for risk, or one that cannot be evaluated by comparison to
marketplace information, introduces an unjustified bias into the measurement .... in
many cases a reliable estimate of the market risk premium may not be obtainable .... In
such situations, the present value of expected cash flows, discounted at a risk-free rate
of interest, may be the best available estimate of fair value in the circumstances."
Given that there is no active market for many insurance liabilities, there is no readily
available, direct information on the market risk premium associated with their fair value.
The market risk premium would have to be estimated. It is unclear as to what marketplace
information would be required under such guidance for an acceptable estimate of the risk
premium. Would the information have to be insurance specific or even insurance product
specific, or could it be based on overall market pricing for risk in general financial
markets? It is also unclear how much judgment may be used to produce an acceptable
estimate of this risk premium.
If the guidance is worded and interpreted too stringently, then it may never be possible to
include a risk premium in the fair value of insurance liabilities. Liabilities of high risk
would be indistinguishable from liabilities of low risk, as long as the present value of
expected cash flows was the same. More lenient interpretations may allow risk premiums
for the more common liabilities, but the more unusual or higher risk liabilities may not
qualify for a risk premium. This would result in a lower liability value (due to absence of a
risk premium) for the highest risk items, a counterintuitive result. Attempts to always
include a risk margin may raise reliability and auditability issues.
2.2. W h a t properties should risk margins have? The following two items are separate, but
related. They are separate in that each is an issue in its own right, but they are related in
that it may only be possible to reflect one or the other, depending upon the fair value
methodology that is chosen. For example, a methodology that reflects process risk in each
line of business within a company might result in a series of fair values for each line, that
when added together, produce a fair value in excess of the fair value that would be
applicable to the company as a whole. This would be a reflection of process risk that
violates value additivity. Both of these are discussed in greater detail in Section D.
• Should a value be placed on process (diversifiable) risk in the valuation?
• Should results have value additivity or not?
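The tension between these two properties can be sketched numerically. In the sketch below the risk margin is set, for illustration only, as a multiple of each block's standard deviation and the two lines are assumed independent; the by-line fair values then sum to more than the fair value computed for the company as a whole:

# A sketch of how a process-risk margin set by line can violate value additivity,
# using assumed figures.  The risk margin is taken, for illustration only, as a
# multiple of each block's standard deviation, with the two lines independent.
import math

k = 0.5                                   # assumed margin per unit of standard deviation
lines = {"Line A": (1_000.0, 300.0),      # (expected loss, standard deviation)
         "Line B": (1_000.0, 400.0)}

by_line_fair_values = {name: mean + k * sd for name, (mean, sd) in lines.items()}
sum_of_lines = sum(by_line_fair_values.values())

# For the company as a whole, means add but independent standard deviations
# combine in quadrature, so the combined margin is smaller than the sum of margins.
total_mean = sum(mean for mean, _ in lines.values())
total_sd = math.sqrt(sum(sd ** 2 for _, sd in lines.values()))
whole_company_fair_value = total_mean + k * total_sd

print(f"sum of by-line fair values: {sum_of_lines:,.0f}")              # 2,350
print(f"whole-company fair value:   {whole_company_fair_value:,.0f}")  # 2,250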
2.3. Should an adjustment for an entity's credit risk be incorporated into that entity's fair
value of its liabilities? Section H contains the discussion of this issue.
2.4. Use of industry-wide assumptions. The two options for data and assumptions to be used
in the methodologies described in Sections C and D are industry-wide or company-specific
ones. Consideration must be given to the balance between the greater reliability of the
industry data and the greater applicability of the company-specific data. Availability of
data at the industry or company level is also a factor in selecting data for risk adjustment
computations. Industry-wide data provides more consistent and reliable results, but may
overlook important differences between the risks underlying the industry data and the
company-specific risks being valued. Company-specific data will be more reflective of the
underlying nature of the risks being valued, but the volume and the volatility of the data
must be considered. If the company-specific data is too sparse or too volatile, it might not
be usable. This is an issue that will need to be addressed on a situation by situation basis.
3.1. What steps will the actuarial profession need to take to prepare for the
implementation of a new requirement? As with any new requirement, the switch to a fair
value valuation standard for property/casualty insurance liabilities would probably result in
many unanticipated consequences. Many of these consequences would not be evident at
first, and may take time to resolve once they are discovered. This may involve refinement
of existing and development o f new actuarial models and revisions to the initial accounting
standards.
3.2. Fair value accounting will affect more than just loss reserves. Should the same
methodologies that are being used for loss reserves also be used for other items? How
can consistency of underlying assumptions be maintained in the valuation of all items
with fair value adjustments?
Examples of the items that might warrant fair value adjustments include:
• The liability associated with the unexpired portion of policies in-force at the valuation
date
• Liability associated with the unexpired portion of multi-year contracts
• Reinsurance contracts with embedded options, including commutation terms,
cancellation terms, contingent commission provisions, etc.
• Differences between the fair value of liabilities on a net basis versus a gross basis
• Accrued retrospective premium asset or liability
• Salvage and subrogation
The real issue is not so much what contains fair value adjustments as how the adjustments
are to be made. The accounting standards will determine those items that should contain
fair value adjustments. The challenge will be to quantify the adjustments for these different
items in a manner that is consistent with the adjustments underlying loss reserves. The
implementation issue facing fair value practitioners is to keep in mind that there should at
least be consistency of assumptions when producing fair value adjustments for all those
items requiring adjustments.
3.3. Should renewal business be considered in the fair value estimate and if so, how?
While future accounting guidance will include some discussion of what renewal guarantees
are required for renewals to be included in fair value estimates, there undoubtedly will be
areas of gray, such as how far a contractual provision regarding renewals has to go before it
is considered a guarantee of renewal. For example, would a guarantee of a renewal at a
price no more than the full policy limit (i.e. a riskless contract for the insurer) be considered
a renewal guarantee?
3.4. How should judgment be accommodated in the development of fair value estimates?
All fair value methodologies have at least some judgmental elements within them. One of
the objectives of fair value is to have the same liability, held by two different entities, carry
identical values on each of the entities' financial statements. The inclusion of
judgment in the development of fair value estimates could result in situations in which
different analysts are looking at similar liabilities but produce different results solely
because of the judgmental elements.
4.2. Methods that rely on public data: not all companies' data is publicly available. This
makes any method that relies on publicly available data subject to whatever distortions
might exist from using a subset of all companies. Additionally, the data that is publicly
available can contain distortions arising from systematic overstatement or understatement
of liabilities by the entities providing the data. Lastly, there could be data compatibility
issues arising from changes in the available data sets due to such things as mergers,
insolvencies, divestitures, acquisitions, restructurings, etc. that alter the entities included in
the data sets.
4.3. Methods that produce results only on a total company basis: if a method is used that
produces results on an all-company basis but presentation requires that fair value results be
displayed at a more detailed level, the methodology must be adapted to the presentation
needs.
4.4. Time period sensitivity: the selection of the historical time period used as the basis for
determining future parameters and assumptions could greatly influence the results.
4.5. Incorporates process risk: not all methods produce results that include a value for process
risk.
4.6. Value additivity: not all methods produce results that are value additive.
4.7. Nature of the line of business: some methods are not well suited to the development of
fair value estimates of liabilities arising from volatile short-tailed lines. All of the methods
can be used for the development of fair value estimates of long-tailed lines' liabilities.
List of Considerations when Selecting an Estimation Method
Considerations indicated for each method: reliance on public data; reliance on CAPM; produces
results only on a total company basis; time period sensitivity; incorporates process risk; is value
additive; not designed for short-tail volatile lines.
Methods compared: Undiscounted value; Present value at a risk-free interest rate; Present value at
an alternative interest rate; Entity-specific measurement; Cost-accumulation measurement; CAPM;
Internal Rate of Return; Single Period RAD; Using Underwriting Results; Based on Probability
Distributions; Based on Reinsurance; Direct Estimation; Distribution Transforms; Rule-of-Thumb
Methods.
* Can use other methods to develop the parameter input for the required return on equity.
** Public data is required when using public reinsurance quotes. Public data is not needed if the fair value estimates
are derived from quotes made specifically for the entity that is developing the fair value estimate.
5. Presentation issues
The items presented here relate to the actual presentation of fair value results in a financial
statement. These items are not "actuarial" in nature, but rather relate to the mechanics of
financial statement presentation and disclosures required within the financial statement
framework.
5.1. The selected method or methods may be appropriate for fresh-start valuations but not
interim valuations. Fresh-start in this context refers to the accounting concept, not the tax
one. The accounting concept of fresh-start involves "remeasuring an item using current
information and assumptions" at each valuation date. (IASC Insurance Issues Paper, page
A182.)
For example, suppose a company performs a full-scale actuarial review of reserves for a
block of business twice a year. The company must publish financial statements quarterly,
though. The liabilities booked after each full-scale review would be viewed as fresh-start
valuations. However, for the financial statements produced between reviews, the company
will need to have some other method of quantifying the proper liability value to record.
The company can't just keep the same liability value from the previous financial statement.
At a minimum, the company will need to adjust the recorded value to reflect payments
made, unwinding of discount, and changes in the discount rate between the two statement
dates. This process of updating the reported value without undergoing a full-scale analysis
is an example of an interim valuation.
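A minimal sketch of such an interim roll-forward is shown below, with hypothetical amounts and durations; it adjusts the prior carried value for the unwinding of discount, the quarter's payments, and a change in the discount rate, without re-estimating the underlying cash flows:

# A minimal sketch of an interim (between fresh-start) update of a fair-valued
# reserve, with hypothetical amounts.  The carried value is rolled forward for
# the unwinding of discount, the quarter's payments and a change in the discount
# rate; the underlying expected cash flows are not re-estimated.

prior_carried_value = 90.0      # fair value at the last fresh-start valuation
payments_in_quarter = 10.0      # losses paid during the quarter
prior_rate, current_rate = 0.050, 0.055
remaining_duration = 3.0        # assumed average remaining payment duration, in years

value = prior_carried_value * (1 + prior_rate) ** 0.25   # 1. unwind one quarter of discount
value -= payments_in_quarter                             # 2. deduct the quarter's payments
value *= ((1 + prior_rate) / (1 + current_rate)) ** remaining_duration  # 3. restate for the rate change

print(f"interim carried value ~ {value:.2f}")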
5.2. How should a restatement of historical exhibits to reflect historical fair value
estimates be done? Any exhibits that show historical data would need to be restated to a
fair value basis the first time fair value financial statements are produced. The question is
how to do the restatement. Fair value should reflect conditions and market perceptions at
the valuation date. It is difficult, if not impossible, to reconstruct these items after the fact,
when the outcomes of situations that were then uncertain are now known.
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
Section G - Accounting Concepts
Introduction
This section discusses the proposed fair value adjustments in terms of the attributes demanded for
sound accounting bases. We set out below the criteria (termed accounting precepts) that
accountants and accounting standard setters judge accounting bases by, and consider who the users
of financial statements are. We then consider each of the major fair value adjustments in terms of
the accounting precepts. The fair value adjustment for the entity's own credit standing is discussed
in section H.
Fair value accounting could be applied to any financial reporting: GAAP financial statements,
statutory (regulatory) financial statements, or even tax returns or internal management reports.
While, in the U.S., GAAP financial reporting is determined by the FASB and the SEC, statutory
financial statements will remain the responsibility of the NAIC. Even if fair value accounting
were adopted for GAAP financial statements, a different non-GAAP basis might well be
maintained for statutory financial statements.
Fortunately, to a large extent the two documents agree as to what is desirable. The FASB
document is longer and more discursive.
The IASC framework document defines the object of financial statements as:
"to provide information about the financial position, performance and changes in.financial
position o f an enterprise that is useful to a wide range o f users in making economic
decisions. "
Relevance. To be relevant, information must be capable of making a difference to users'
decisions. This is achieved either because the information can directly feed into a prediction of
the future position of the enterprise, or because the information can be used to refine previous
expectations. Untimely information generally has little relevance. The IASC framework details
a separate characteristic of "understandability," stating it is an essential characteristic of financial
statements that the information is readily understood by diligent users. This is implicit in the
FASB concept of relevance; information which cannot be readily understood lacks the
characteristic of being able to inform users' decision making. Also implicit in the two concept
statements is the concept of transparency, i.e. that items in financial statements should be clearly
disclosed so as to maximize their utility to financial statement users. (Neither the IASC nor
FASB documents listed above mention transparency explicitly, although the IASC notes
"substance over form," that is, following the economic substance rather than legal form as a
basic requirement).
Reliability. Reliability depends on the representational faithfulness with which a reported item
reflects the underlying economic resource, obligation or transactions. Reliability does not imply
a need for certainty, and reporting the degree of uncertainty in an item may provide a better
representation of the underlying economic reality than a single point estimate. In certain cases
the measurements of the financial effects of items could be so uncertain that enterprises would
not be allowed to recognize them in their financial statements (for instance, nonpurchased
goodwill). Financial statements should be free from bias in their measurements. FASB, but not
the IASC, notes verifiability as a characteristic that helps constrain bias in financial statements.
Comparability and Consistency. Financial statements should be comparable over time and
between different enterprises in order to be able to ascertain trends and the relative position of
different companies. Conformity to a uniform set of accounting standards helps achieve
comparability and consistency.
Neutrality. Financial statements should be free from bias. However, the IASC framework notes
that where an element of a financial statement is subject to uncertainty a degree of caution is
needed in the exercise of judgment in making the required estimates.
Cost-Benefit. The balance between cost and benefit is a constraint on "good" accounting
paradigms rather than one of their qualities. If accounting information can only be generated at
substantial cost, the relevance and utility of that information to users needs to be established
before it is sensible to adopt accounting standards that demand such information.
Fundamental Assumptions
The IASC framework notes two fundamental assumptions for the preparation of financial
statements. These are:
• The Accruals basis: Transactions are recognized when they occur, not when cash changes
hands, and reported in the financial period to which they relate.
• The Going Concern basis: Financial statements are prepared on the basis that the
enterprise will continue in business for the foreseeable future. If there is the likelihood or
intention to substantially curtail business or to cease to trade, financial statements may
need to reflect this in their choice of accounting policies, and the circumstances are to be
disclosed.
Accounting paradigms
There are two types of modern accounting paradigm.
The first is the revenue-expense (deferral and matching) approach. These models are income
statement focused: revenues are matched with the costs incurred to generate them, and both are
recognized in the same reporting period.
The alternative is the asset-liability approach. These models are balance sheet focused. Their
aim is to accurately reflect the assets and obligations of a company at periodic intervals. The
changes in the values of assets and obligations become the profit (or loss) for that period. A fair
value accounting approach for the assets and liabilities of insurance enterprises is one potentially
available asset-liability paradigm.
The IASC paper essentially analyses three alternative methods of accounting for insurance: the
current deferral-matching model, full fair value accounting, and an alternative asset-liability
model.
Who uses financial accounting, what are their needs, and on what do they focus
Most prospective commercial insureds and reinsureds and their brokers are interested in the
solidity of (re)insurers with whom they place business. Essentially they need to evaluate the risk
of the (re)insurer being unable to pay claims in full once they become due. While income
statement information is not irrelevant, their basic focus is on the balance sheet strengths and
weaknesses of the company.
Rating agencies have aims similar to those of commercial insureds and reinsureds. Their basic focus is on
balance sheet solidity. They, like insurance sector analysts, are sophisticated users of financial
information, and have access to more detailed financial information than that presented in the
financial statements.
Bankers and Other Creditors
Bond issues and bank loans are most likely to be the obligation of the holding company of
insurance groups, not the individual insurance entities underneath the holding company. The
bond holders and bankers behind this debt will be interested in the ability of insurance groups to
service borrowings and repay loans; this is a function of both balance sheet strength and the
future profitability of the company. In addition, both these creditor groups may be interested in
ascertaining that covenants are satisfied.
Regulators
Regulators have, at least in the US, two perspectives on insurance companies. First, they are
interested in the solidity of insurance companies and in minimizing any call on guarantee funds.
Second, they may wish to use the financial statements as a resource in the regulation of prices.
Regulatory analysis in both these areas might be made more difficult if reported profit measures
are volatile. Well understood and accepted measures of shareholder equity would also be
advantageous. Regulators have access to other financial information. Indeed, in the US,
statutory financial reports will be their primary source for the financial review of an insurance
company's operation.
Outside the US, regulators make more use of a company's general purpose financial statements,
and generally desire a single accounting paradigm for general purpose and regulatory financial
reports.
Employees
Employees will be concerned primarily with two questions: how secure is the company? and
how well is it doing? Most employees will be unsophisticated users of financial statements.
If insurance company investments are recorded at fair value, then reporting insurance liabilities
at fair value will create consistent balance sheet accounting, and will improve relevance and
representational faithfulness of reported income and equity.
There are alternatives to fair value accounting for liabilities that react to some, if not all, of the
same variables impacting the investment market value. These alternatives may produce more
relevant financial reports than the current status quo for U.S. GAAP (where most liabilities are
undiscounted but many assets are at market). They may also be easier to implement than full
reflection of fair value for liabilities. The risk is that they may cause an unacceptable level of
inconsistency relative to the assets, for those financial variables that would impact market values
but not the alternative standard liability values.
Currently, most p/c loss reserves are carried at an undiscounted value. This use of
undiscounted loss reserves has the following advantages and disadvantages.
Advantages
• It is easy to understand
• It locks in a margin that cannot be distributed to shareholders. (A plus in the eyes of
regulators and policyholders)
Disadvantages
• It is typically an unreliable measure of the economic value of liabilities. Further, the
degree of distortion varies between different enterprises depending on their mix of
business and growth history. As a result, return on equity comparisons are distorted both
within the insurance sector and with other industries. In particular, insurance company
equity is understated in most cases compared to values for other industries. This
understatement of insurance company equity leads to an overstatement in returns on
equity.
• It results in different valuation bases for assets and liabilities, which can result in spurious
earnings volatility when interest rates change even when the underlying cash flows are
broadly matched.
• It distorts profit recognition.
• Booking undiscounted reserves may provide grounds for accounting arbitrage.
Fair value proponents, and others in favor of moving to a discounted basis for insurance
liabilities, would argue that moving to a discounted basis for loss reserves, etc., removes or at
least substantially reduces:
• The inconsistency between the valuation basis of assets and liabilities, to the extent assets
are either at market or at some version of cost (which is effectively an historic market
value).
• The inconsistency between enterprises writing different classes of business where the
economic value of two reserves shown at the same amount may be substantially different.
• The conservative bias that may be implicit in undiscounted liability values.
They would argue that the profits reported on a discounted basis would be a better (more
relevant) reflection of an enterprise's earnings for a period. The use of a fair value liability
valuation (in conjunction with holding assets at market) will put assets and liabilities on a
consistent footing, so that changes in the values of assets and changes in the discounted value of
liabilities broadly mirror each other when interest rates change, so long as liabilities and assets
are matched. This will eliminate that part of the interest rate volatility that does not reflect
economic change for the insurance enterprise. Further, fair value proponents would maintain
that the balance sheet values calculated on a discounted basis better discern between different
enterprises (that is, they are more relevant), and do not contain conservative biases (that is, they
are neutral).
Fair value proponents would also argue that well-thought-out presentation in the income
statement, matching investment return against the unwinding of the discount, could do much to
mitigate the potential confusion that may be suffered by some users as a result of moving to a
discounted basis for loss reserves.
Others who oppose the introduction of discounted amounts would argue that liability values
currently reported by insurers reflect two offsetting biases, i.e., lack of provision for future
investment income and optimistic evaluation of ultimate settlement values (resulting in insurance
liabilities that they believe are already implicitly discounted). The introduction of explicit
discounting would remove one of the two biases. However, valuing loss reserves at discounted
values without addressing the second bias would probably be a disservice to all users as it would
overstate available capital and overstate profitability.
Further, such observers might argue that if fair values are assessed by direct comparison to exit
prices available in the reinsurance market, there is a danger that values substantially different
from the net present value of the cost to the enterprise of running off liabilities may be recorded.
Substantial overvaluations are possible when there is a hard reinsurance market. Substantial
undervaluations are possible when there is a soft reinsurance market, precisely the time at which
such valuations cause regulators most concern.
The use of discounted liabilities will not necessarily result in more or less reliable estimates than
the undiscounted ones. Discounting techniques are well understood and generally introduce little
additional subjectivity into the liability valuation process. When the uncertainties are
concentrated in the tail, discounting of the reserves may even reduce the uncertainty in the
estimated liability value. In this task force's opinion, fair value accounting in practice may not
significantly alter the inconsistency between different companies' accounts due to variations in
reserve strength.
Essentially similar arguments apply to the introduction of discounting for the estimates of other
insurers' liabilities or assets.
Fair value proponents would argue that discounting in conjunction with adding risk margins to
liabilities provides the best basis for profit recognition. The profit on the book of business will
emerge as the associated risk expires.
This approach has the drawback that it is a difficult concept to grasp and may confuse amateur
(and some professional) users of accounts. Clear disclosure of the risk adjustment may help such
users.
The lack of market depth in the exchange of insurance liabilities between enterprises makes a
direct market assessment o f the price for the risk margin impossible in most instances. Risk
adjustments derived from methods that use industry-wide data to derive industry level risk
adjustments may not succeed in producing financial information that can be used to distinguish
between one insurance enterprise and its peers. In addition market-based information will be
impossible to obtain in countries that do not have significant stock markets, or that have
integrated financial service industries where the major insurance carriers also have banking and
securities interests within one quoted vehicle.
Other enterprise-specific risk measures can to a greater or lesser extent be criticized as requiring
significant subjective input. Proponents of such methods would argue such judgment calls are
inherent in arriving at other accounting measures such as the bad debt adjustment to trade
receivables in manufacturers' balance sheets.
This is an area where standard setters may well be faced with determining a trade off between
reliable (less subjective) and relevant measures.
If there is a wide range of acceptable methods for calculating the fair value adjustment, this may well
lead to a greater spread of the range of acceptable "values" for the various elements of financial
statements. Accounting/actuarial guidance is likely in practice to increase the consistency of the
calculation of the risk margin.
The introduction of subjective elements into fair value assessments also means that there is
additional scope for managing (or manipulating) financial results. Methods that reduce the scope
for subjectivity in the assessment, such as an IRR model using regulatory capital, curtail the
scope for inconsistency between different insurance enterprises (but, possibly, at the expense of
relevance, see above). More company specific methods may result in greater scope for
inconsistency (the scope might well in practice be reduced by accounting or actuarial guidance).
The task force suspects, however, that the increase in inconsistency due to differences in the basis
on which fair values are calculated is likely to be of second order compared to differences in
the strength of companies' loss reserves.
Opponents of risk margins would argue that a risk margin for insurance liabilities cannot be
reliably determined, so that (per FASB's Concepts Statement No. 7, paragraph 62) discounted
values with no risk adjustment should be used. Others would argue that undiscounted values
would be preferable to discounted values without risk adjustments, which, they would contend,
could grossly understate a company's liabilities.
This is the most contentious of the fair value adjustments, and is separately discussed in section
H.
Taxation
The extent of the link between taxes and the financial statements of enterprises varies between
different countries. Where the calculation of taxable profits is substantially based on the profit
disclosed in the enterprise's general purpose (i.e., GAAP) financial statements, it is certainly
possible that at least some companies may suffer a greater burden of taxation. It is possible this
may be mitigated to some extent by the recognition for tax purposes of some allowance (i.e., risk
margin) for the uncertainty in estimated claim liabilities. In the U.S., the explicit recognition of
risk margins may cause them to be removed from allowed claim liability deduction, thereby
increasing federal income taxes unless the margins are allowed by the IRS as a part of the
liabilities' economic value. If the reserves are currently reported at expected value, the risk
margins would have no impact on taxes (if the margins are accounted for as an asset) but would
restrict the disposable income.
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
A highly controversial proposed adjustment to estimated cash flows in the determination of fair
value liabilities is the impact of the entity's (or obligor's) own credit standing. Under some
proposals, the weaker the obligor's financial situation, the lower the fair value of their liabilities
would be. This adjustment would recognize that a financially weak company would be less
likely to satisfy its obligations in full than a financially strong company.
This issue may not be material for most insurers, as it is very difficult for an insurer to be both
viable and of questionable financial health. Companies viewed to be strong financially have
historically experienced very small rates of default. 56 Therefore, the concern and controversy
surrounding this issue is focused largely on its impact on troubled companies.
This section of the white paper presents the arguments for each side of the issue, without stating
an overall preference. It also discusses the issues associated with estimating, implementing and
presenting liabilities that reflect the obligor's credit standing.
Arguments in favor of reflecting credit standing include the following:
• Credit risk is reflected in the fair value of assets, and the assets and liabilities should be
valued consistently.
• The public debt of a company has a market value, and that market value reflects the
debtor's credit standing. Hence, requiring a company to report their publicly issued debt
(a liability for them) at market value leads to requiring them to reflect their own credit
standing when valuing a liability. The alternative, not requiring a company to report such
debt at market value, would allow a company to manipulate its earnings by buying back
existing debt or issuing new debt.
• If public debt is to be held at a fair value that reflects credit standing, then all liabilities
should be reported at a fair value that reflects credit standing. This is the argument FASB
made in their Concepts Statement Number 7, paragraph 85.
• Parties owed money by a company of questionable solvency will frequently settle for less
than the stated amount of the obligation, due to the risk of possibly getting much less if
that company (i.e., the obligor) goes insolvent. In other words, reflecting an entity's own
56 One-year default rates for debt rated A or above (by Moody's) were less than 0.1% for 1983-1999. Ten-year
default rates for the same rating category were less than 4% for 1920-1999. Source: January 2000 report by
Moody's on corporate bond defaults from 1920-1999.
credit standing in valuing its liabilities reflects the true market cost to settle those
liabilities.
• The obligor's credit standing is easily measurable, at least in those jurisdictions where
established rating agencies exist.
• Due to limited liability, the owners' interest (e.g., as reflected in share price) of a
company can never go below zero. Thus, the fair value of its equity is always greater
than or equal to zero. If the fair value of the equity is greater than or equal to zero, and
the fair value of the assets is less than the contractual "full value" liabilities, then the fair
value of the liabilities must be less than this "full value."
Arguments against reflecting credit standing include the following:
• There is no active market for such liabilities; hence there is no reliable way of measuring
this adjustment for credit standing.
• Users of financial statements could be misled as to the financial strength of weak
companies.
• A liability valuation that reflects the liability holder's credit standing would not be
relevant to a potential "buyer" of the liability. In the insurance situation, and possibly
other situations, the buyer would not be able to enforce the same credit standing discount
on the obligee. The obligee would view the prior liability holder's credit standing as
totally irrelevant. Hence, the buyer would also view the credit standing of the liability
seller as irrelevant to the liability's market value.
• An obligor's financial statements that included a reduction in the fair value of its
liabilities due to the obligor's credit standing would not be relevant to creditors.
• An insurance company's principal product is its promise to pay. In return for cash up-
front, an insurance company sells a promise to pay in the event of a specified
contingency. If an insurer attempts to pay less than the full initial promise, due to its
weakened credit standing, it is in effect abandoning its franchise. In fact, a troubled
company that is trying to remain a going concern will do all it can to pay the full amount,
in an attempt to retain its franchise. As such, reflection of credit standing in the
estimation of fair value liabilities is counter to going-concern accounting, and is relevant
only to liquidation accounting for a runoff business. (The party trying to collect from a
troubled company is also arguably negotiating under duress. As such, any settlement
amount they would arrive at would not meet the definition of "fair value.")
• If credit standing is reflected in liability valuation, then favorable business results could
cause a drop in earnings, due to an improved credit standing increasing the fair value of
liabilities. Likewise, unfavorable results that lead to a drop in credit standing could result
in earnings improvement. This is counterintuitive and noninformative.
• It does not make sense to reflect credit standing in the value of liabilities without also
reflecting the impact of credit standing on intangibles. A company with a worsening
credit standing may see the fair value of its liabilities decrease, but it would also see the
fair value of various intangibles, such as franchise value, decrease. In fact, the existence
of the intangible franchise value helps keep insurers from increasing their operational risk
in order to increase shareholder value at the expense of policyholders. Therefore, while
the fair value of a company's liabilities may be decreasing as credit standing decreases, it
is offset by an item not to be reflected in the fair value accounting standards as currently
proposed by the FASB and IASC. If intangibles are not to be estimated nor reflected in a
fair value standard, then the impact of credit standing on the liabilities should not be
reflected.
• Credit standing is (usually) an attribute of the corporate whole, not the individual
business segments. Hence, business segment reporting could be complicated drastically
by this approach, as the segment results would not add to the corporate whole without an
overall credit standing adjustment.
• To the extent that the credit standing adjustment is based on the obligor's judgment, a
potential moral and ethical dilemma exists. Management may be forced to state the
probability that it won't pay its obligations at the same time that it may be professing
before customers, partners, capital providers, etc. its integrity, financial soundness and
full intent to meet all obligations.
• If an entity's own credit standing is reflected in valuing their liabilities, and the valuation
considers the reduced amounts their policyholders may be willing to accept as claim
settlement, some companies may be motivated to employ unreasonably optimistic
assumptions in setting their reserve levels. Troubled companies may be incented to
anticipate that claim settlements will be resolved on extremely favorable terms and hence
record an inappropriate reserve.
Methods for estimating the impact of credit standing on liabilities, if included in the fair value
definition
Our task force was able to envision several methods that might be used to estimate this credit
risk adjustment. Four such methods are listed here. It is important to note that, to our
knowledge, none of these methods have actually been used to estimate the fair value of liability
default for property-liability insurers in any practical setting. The first three methods are
discussed in more detail in the appendix, including examples.
57 Cummins, J. David, 1988, "Risk-Based Premiums for Insurance Guaranty Funds," Journal of Finance, September,
43: 823-838. Also,
Doherty, Neil A. and James R. Garven, 1986, "Price Regulation in Property-Liability Insurance: A Contingent-
Claims Approach," Journal of Finance, December, 41: 1031-1050. Also,
Derrig, Richard A., 1989, "Solvency Levels and Risk Loadings Appropriate for Fully Guaranteed Property-Liability
Thus, the theory underlying the credit risk adjustment (in the insurance context) is that the fair
value of owners' equity is increased by the value of the option implicitly given to the equity
owners by the policyholders. If the liabilities are measured without the credit risk adjustment,
then the fair value of the owners' equity is understated.
The implied option value can be determined by the method of Ronn and Verma,58 which is used
in the Allen, Cummins and Phillips analysis.59 Under this method, the market value of the firm's
assets is first estimated. Then the implied volatility of the firm's market value is estimated from
the Black-Scholes formula for the value of the equity owners' call option.60 Other inputs
required for this estimation are the undiscounted liability value, the average time until payment
of the liabilities and the risk-free interest rate.
Once the above inputs are obtained, the default value is determined by applying the Black-
Scholes option model with a set time to expiration and an exercise price equal to the expected
liability value at the end of the same time horizon. The call option is valued relative to the asset
market value. The Appendix provides an example of the calculation.
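As a rough sketch of the option-pricing mechanics described above (not the task force's worked example, which appears in the Appendix), the following Python fragment values the equity owners' call option with the Black-Scholes formula and backs out the implied default value by put-call parity. All input figures are hypothetical, and the step of inferring the asset value and its implied volatility from traded equity is omitted.

```python
from math import log, sqrt, exp
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cumulative distribution function

def bs_call(A, L, r, sigma, T):
    """Black-Scholes value of a call on assets A with strike L (the expected
    liability value at the horizon T), risk-free rate r, asset volatility sigma."""
    d1 = (log(A / L) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return A * N(d1) - L * exp(-r * T) * N(d2)

# Hypothetical inputs (not taken from the white paper's exhibit):
A = 1300.0      # estimated market value of the firm's assets
L = 1000.0      # expected liability value at the time horizon
r = 0.06        # risk-free rate
sigma = 0.15    # implied volatility of the firm's asset value
T = 3.0         # average time until payment of the liabilities

equity_call = bs_call(A, L, r, sigma, T)
# By put-call parity, the implicit default option held by the equity owners is:
default_value = equity_call - A + L * exp(-r * T)

print(f"Equity call value:     {equity_call:,.2f}")
print(f"Implied default value: {default_value:,.2f}")
```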
Advantages
• For publicly traded insurers, this approach can provide results using an insurer's own
data.
• The method is relatively straightforward in terms of the complexity of the calculation.
• The method has been used to measure default risk for both insurance firms and banks. It
is well known in the finance literature.
Disadvantages
• This method can only be done for publicly traded companies.
• It is difficult to carve out the property/casualty pieces of firms that have non-
property/casualty business segments.
• The method is sensitive to variations in input values.
• The method relies on accounting value of liabilities. This presents problems with
measuring reserve adequacy.
• It ignores side guarantees or implicit guarantees, such as that from a majority owner with
a reputation to uphold. Such an entity cannot afford to walk away without losing brand-
name value. It also ignores the side guarantee arising from an insurance guaranty fund.
Insurance Contracts: A Financial View," Financial Models of Insurance Solvency, J. D. Cummins and R. A. Derrig
eds., Kluwer Academic Publishers, Boston, 303-354. Also,
Butsic, Robert P., 1994, "Solvency Measurement for Property-Liability Risk-Based Capital Applications," Journal
of Risk and Insurance, 61: 656-690.
58 Ronn, Ehud I., and Avinash K. Verma, 1986, "Pricing Risk-Adjusted Deposit Insurance: An Option-Based Model,"
Journal of Finance, 41(4): 871-895.
59 Allen, Franklin, J. David Cummins and Richard D. Phillips, 1998, "Financial Pricing of Insurance in a Multiple
Line Insurance Company," Journal of Risk and Insurance, 65: 597-636.
60 Black, Fischer and Myron Scholes, 1973, "The Pricing of Options and Corporate Liabilities," Journal of Political
Economy, May-June, 81: 637-659.
• It may ignore the relative credit-worthiness for different lines or entities within the
corporate total, if they have separate publicly traded securities.
Method 2 - Stochastic modeling using Dynamic Financial Analysis (DFA)
DFA models attempt to incorporate the dynamics of the insurance business by including
interactions between the different variables. Some DFA models also attempt to model the
underwriting cycle.
Among the outputs of stochastic DFA models are probability distributions of future surplus.
They can be used to compute the expected policyholder deficit (the expected cost of default), or
the average amount of unpaid liabilities, should the company experience insolvency in the future.
Insolvency would be deemed to have occurred whenever the company's surplus dropped below a
pre-specified level.
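As an illustration of how the expected policyholder deficit might be read off a stochastic DFA model's output, the following sketch computes it from simulated end-of-horizon surplus values. The normal random draw stands in for a real DFA simulation, and all figures are hypothetical.

```python
import random

def expected_policyholder_deficit(surplus_outcomes, insolvency_threshold=0.0):
    """Average shortfall below the threshold across all simulated scenarios.
    Scenarios ending above the threshold contribute zero deficit."""
    shortfalls = [max(insolvency_threshold - s, 0.0) for s in surplus_outcomes]
    return sum(shortfalls) / len(shortfalls)

# Placeholder for a DFA model's simulated end-of-horizon surplus (hypothetical).
random.seed(1)
simulated_surplus = [random.gauss(mu=500.0, sigma=250.0) for _ in range(100_000)]

epd = expected_policyholder_deficit(simulated_surplus)
prob_insolvent = sum(s < 0 for s in simulated_surplus) / len(simulated_surplus)
print(f"Probability of insolvency:     {prob_insolvent:.2%}")
print(f"Expected policyholder deficit: {epd:,.2f}")
```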
Advantages
• The method is insurer-specific.
• The method can be applied to all insurers.
• A comprehensive DFA model can better incorporate important company-specific risk
factors than the other methods.
• Many companies currently use these models to make strategic business decisions. A great
deal of research effort has recently been devoted to their development.
Disadvantages
• Good DFA models tend to be complex and are therefore labor-intensive and expensive.
(However, if an insurer already has such a model, adapting it to estimate credit risk may
require little additional cost.)
• DFA models are designed to work off of data. They may not reflect risks that are not in
the historical data.
• Not all insurers currently have these models, since their management has determined that
they are not worth the cost. Insurers would need the models to be tailored to the unique
features of their business.
• There is presently not enough expertise available to construct a suitable DFA model for
each insurer.
• The models may not produce comparable results for similar companies, due to different
model structures and parameter assumptions.
• The ability of these models to reliably estimate insolvency probabilities is not universally
accepted. Many believe that these models are stronger at estimating the normal variation
61 This is a feature of stochastic DFA models, but not necessarily all DFA models.
resulting from the current processes, and not the shocks and paradigm shifts that may be
more likely to be the cause of an insolvency. Therefore, they may not be reliable when
applied to the stronger companies (although these companies are not expected to have a
material credit-standing adjustment).
• It may be impractical to model insolvency for large, multinational or multi-industry
conglomerates.
• Business and legal problems may exist for companies estimating their own probability of
reneging on their obligations, either directly or through a DFA model estimate.
Method 3 - Incorporate historic default histories by credit rating from public rating agencies
This method would use publicly available historic default rates by credit rating, based on the
entity's current credit rating from A. M. Best, S&P, Moody's or some other public rating service.
At least one of these rating services (Moody's) publishes historic default rates by credit rating,
for a one year and multiple-year horizon, by year and averaged over several decades. These
default rates would allow determination of the expected default rate -- some other method would
have to be used to determine the risk premium associated with this expected value.
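A minimal sketch of this method follows. The default-rate table is a hypothetical stand-in for a rating agency's published history, and only the expected cost of default is computed; no risk premium is added, since, as noted above, a separate method would be needed for that.

```python
# Hypothetical cumulative default probabilities by rating and horizon (years).
# A real application would use a rating agency's published default study.
CUM_DEFAULT_RATES = {
    "Aa":  {1: 0.0002, 5: 0.0030, 10: 0.0080},
    "A":   {1: 0.0005, 5: 0.0070, 10: 0.0200},
    "Baa": {1: 0.0015, 5: 0.0200, 10: 0.0500},
}

def expected_default_adjustment(pv_liability, rating, horizon_years):
    """Expected cost of default = discounted liability times the cumulative
    default probability for the obligor's rating over the liability horizon."""
    rate = CUM_DEFAULT_RATES[rating][horizon_years]
    return pv_liability * rate

pv_reserves = 850.0   # present value of loss reserves (hypothetical)
adj = expected_default_adjustment(pv_reserves, rating="A", horizon_years=5)
print(f"Expected default adjustment: {adj:.2f}")
print(f"Credit-adjusted liability:   {pv_reserves - adj:.2f}")
```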
Advantages
• Simple to use and explain, when using the expected cost of default from the public data.
• Requires little direct analytical cost to the insurer.
• Avoids an entity having to estimate its own probability of reneging on promises.
Disadvantages
• Ambiguity would exist if the various public ratings are not consistent. For example, it is
common for the ratings from Moody's and S&P to differ. This would add judgment to
the process and potential manipulation.
• Not all companies are rated.
• A single rating may exist for the enterprise (such as a group rating), that may not be
appropriate for a particular group member or a line of business.
• Would require default history for a given rating. These may not be available from some
rating agencies.
• Requires ratings to be consistently applied over time. This may not be the case, as rating
methodologies change over time.
• Ratings may exist for debt, but not for all other liabilities. This problem could be
compounded by the existence of guaranty funds, particularly where those guarantees vary
by state and line.
Method 4 - Use credit spreads observed on public debt
This method would utilize observed interest rate spreads on public debt to quantify the credit risk
adjustment. Public debt has no amount risk, other than default risk, and no timing risk (absent
call provisions). Hence, it can be used to isolate the market's pricing of credit risk. The discount
that the market places on a dollar owed at time X, given a credit rating of Y, compared to the
same market value for a dollar owed at time X by the U.S. government, quantifies the credit risk
adjustment for a time horizon of X, rating of Y.
Ideally, this would be done based on the market value for each company's publicly held,
noncallable debt. If not available, then public debt of companies with a similar credit standing
(as measured by a public rating agency) could be used instead.
It may also be possible to use the developing market for credit derivatives rather than public debt
in applying this approach.
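A simple sketch of the spread-based adjustment follows. The cash flows, Treasury rates, and spreads are hypothetical, and the call features, guaranty funds, and liquidity effects discussed below are ignored.

```python
def credit_adjusted_pv(cash_flows, treasury_curve, spread_curve):
    """Discount each liability cash flow at the Treasury rate plus the credit
    spread observed (or imputed) for the obligor's rating at that maturity.
    The spread carries the market's pricing of credit risk."""
    pv = 0.0
    for t, cf in cash_flows:
        rate = treasury_curve[t] + spread_curve[t]
        pv += cf / (1.0 + rate) ** t
    return pv

# Hypothetical expected liability payments (year, amount) and rate curves.
payments = [(1, 400.0), (2, 350.0), (3, 250.0)]
treasuries = {1: 0.058, 2: 0.060, 3: 0.061}
spreads = {1: 0.009, 2: 0.011, 3: 0.013}     # spreads vs. Treasuries for rating Y

pv_risk_free = credit_adjusted_pv(payments, treasuries, {t: 0.0 for t in treasuries})
pv_with_credit = credit_adjusted_pv(payments, treasuries, spreads)
print(f"PV at Treasury rates:       {pv_risk_free:.2f}")
print(f"PV with credit spreads:     {pv_with_credit:.2f}")
print(f"Credit standing adjustment: {pv_risk_free - pv_with_credit:.2f}")
```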
Advantages
• Relatively simple to use and explain.
• Requires little direct analytical cost to the insurer.
• Avoids an entity having to estimate its own probability of reneging on promises.
• Consistent with credit risk adjustment for public debt issued by the same entity.
• Relies heavily on market-based values rather than internal estimates.
Disadvantages
• Requires information on a range of public debt instruments that may not exist for all
companies. The entity may not have any actively traded public debt, or may not have a
broad enough range of noncallable public debt to handle all the time horizons of interest.
• Where reliance is made on other entities' public debt with similar credit standings, it
requires a determination of whether or when another entity has a similar credit standing.
This adds additional judgment and estimation to the method.
• Debt holders' credit risk is not perfectly aligned with policyholder credit risk. Due to the
different priorities of creditors in a bankruptcy or insolvency proceeding, the amount
recoverable under a bankruptcy could be drastically different for policyholders as
opposed to debt holders. In addition, since debt is frequently at the holding company
level, it is possible that the bankruptcy administrator could arrange for a buyer to take
over the insurance operation such that the policyholders would be made "whole", at the
expense of the debt holders.
• Does not allow for guaranty funds or other side guarantees not applicable to public debt.
These guaranty funds and side guarantees can also vary by state and line, further
distancing the public debt information from the task at hand.
• The public debt may only exist for the enterprise (e.g., parent or holding company),
which may include many other businesses and operations besides the insurance operation.
The net credit risk may actually vary drastically by operation, so that the enterprise's
public debt credit risk is not indicative of the insurance operation credit risk.
• To the extent that the observed debt is callable, this could distort the application of
observable spreads to liability credit standing adjustments.
• Observed spreads versus U.S. Treasuries could include factors other than credit risk, such
as relative liquidity.
Presentation issues.
The following are a few presentation issues surrounding the reflection of credit standing in the
fair value of liabilities, assuming that such a reflection is made.
• Historical loss development - Should historical loss development include the impact of
changing credit ratings (of the liability holder)? Choices are to include this impact, to
exclude this impact, or include this impact but separately disclose this impact.
• Current balance sheet impact - The task force generally agreed that the current impact of
credit standing reflection on the balance sheet should be disclosed, so as to provide useful
information for those interested in the total legal obligations of the entity.
• Impact on income - Should the impact of credit standing reflection be separately disclosed
when reporting period earnings?
• Impact on segment results - Most financial statements include various types of "segment"
disclosures, i.e., disclosures about certain business or operating segments of the business.
Current U.S. statutory reporting also includes many disclosures by product or line-of-
business. Where a corporation's debt is held principally at the holding company corporate
level, and not at the segment or operating level, it may not be appropriate to reflect credit
standing adjustments in business or operating segment results. In such a case, credit standing
adjustments would be reported only at the total corporate level, as an overall adjustment to
the business segment "pieces." Alternatively, credit standing could be incorporated at the
business-segment level, at the cost of potentially misstating the earnings or value of the
business segment.
If reported at the business-segment level, credit standing adjustments could distort reported
business-segment results in another way. Consider the case where most debt is at the holding
company level, the total corporate credit standing is weak, and the principal cause is a single
business unit. If credit standing is reported at a detail level, operating earnings of the
stronger business units would be impacted by the results of the unrelated, poorly performing
unit. Worsening results in that poorly performing unit could lead to improved earnings (due
to reduction in liability valuations) for the stronger units, while improving results for the
poorly performing unit could cause lower earnings for the stronger units.
Implementation issues. The following are some possible implementation issues associated with
reflection of credit standing in fair value estimates.
• Consistent treatment where offsets exist - Some liabilities have corresponding offsets,
recorded either as assets, contraliabilities, or even as other liabilities. Examples include
accrued retro premiums for retrospectively rated business, deductible recoverables, and
contingent commissions. If a liability is valued in a manner that reflects the obligor's credit
standing, then the valuation of offsets for that liability should also be impacted in a consistent
manner. This may not be a simple task, and may materially complicate the estimation
process for both the direct liability and the offsets.
• Guaranty fund reflection - The credit standing adjustment of a liability could be materially
impacted by any guaranty fund (or similar) protection. The rationale is that the party owed
money (e.g., a claimant) may be unwilling to consider lowering their cash settlement
demands despite the financial weakness of the obligor, to the extent that there is backup
protection provided by a guaranty fund. Guaranty funds do not exist for all lines nor in all
states. They typically provide less than full protection (e.g., many funds cap the benefits, and
may pay claims only after significant delays). As such, proper reflection of guaranty fund
impacts may be very difficult, especially for a writer of multiple products in multiple states.
• Management dilemmas - It may be difficult for management to value its liabilities reflecting
less than full contractual obligations, at the same time it is making assurances and promises
to consumers and creditors, especially when the impact of the credit standing is significant.
• Auditor dilemmas - Whoever audits a company reporting fair value liabilities lowered for
credit standing impacts may find itself in the same position as a rating agency. That is, it
may be forced to quantify the likelihood of client solvency when auditing their financial
statements. This may be outside their normal expertise, and could open up additional areas
of auditor liability.
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
Previous sections of this white paper have discussed what fair valuing means, what
methods can be used to accomplish it, and what theoretical and practical issues must be
dealt with in order to implement the fair valuing of insurance liabilities. This section
discusses what the actuarial profession needs to do to prepare for its role in this process.
Evaluating what casualty actuaries need to do to prepare for fair valuing insurance
liabilities requires addressing the following four issues:
• Do actuaries currently have a theoretical understanding of fair value concepts
adequate to estimate liabilities under a fair value standard?
• Are models currently available that can be used by actuaries to estimate fair value
liabilities?
• Are actuaries prepared to implement these models and make these estimates in
practice?
• What steps can the profession take to aid individual actuaries in implementing
effective processes for fair valuation of insurance liabilities for their companies or
their clients?
Note that professional readiness for this task should be evaluated relative to a
hypothetical implementation date sometime in the future. Fair valuing insurance
liabilities is not currently required of insurers in the United States, and we assume that
initiation of such a requirement would be accompanied by a reasonable implementation
period.
The analysis done by the task force and presented in the preceding sections demonstrates
that actuaries have the theoretical understanding needed to implement fair valuing of
insurance liabilities. We have identified a number of models that are available and
appropriate for actuaries to use in estimating fair value liabilities. No issues have been
identified that are not susceptible of actuarial estimation.
As noted above, fair valuing insurance liabilities is not a current requirement for most
insurers in the United States. Therefore, actuaries generally have not established the
systems and procedures that would be required to efficiently support fair valuation of
liabilities for the financial reporting process. However, casualty actuaries performing
insurance pricing and corporate financial functions have used many of the fair value
models that have been identified in prior sections of this white paper, and the task force
believes that this precedent demonstrates that actuaries can estimate fair value liabilities
in practice.
The task force has identified a number of issues concerning fair value that require
clarification prior to implementation. The task force presumes that many of these issues
will be clarified later in the accounting standards development process. The task force
also presumes that a reasonable period will be provided for implementation of any new
accounting standard requiring fair valuing insurance liabilities. Given those assumptions,
the task force believes that actuaries will be able to develop and use models that provide
efficient and effective estimates of the fair value of insurance liabilities in accordance
with those new accounting standards.
The task force believes that there are a number of steps that can and should be taken by
the actuarial profession to aid individual practitioners if fair value accounting for
insurance liabilities is adopted for U.S. GAAP or statutory accounting. Depending on the
course of future accounting standards developments, the same may be true if the IASC
adopts fair value accounting for insurance liabilities.
1. You hold in your hands the first step, a white paper that discusses fair valuation of
insurance liabilities for general or property/casualty insurers. The task force hopes
this document will aid accounting standards setters in developing higher quality
standards for insurers. The task force also hopes this document will be a starting
point for casualty actuaries seeking both to better understand the issues underlying
fair value accounting and to plan what methods to use in fair valuing insurer liabilities
for their own companies or clients.
2. The actuarial profession should continue its active participation in the ongoing
discussions of fair value accounting for insurers. As is evident from the prior sections
of this white paper, fair value accounting is a complex issue, and actuaries should
continue to provide active assistance to accounting standards setters in order to ensure
that the adopted standards are of high quality and are practical to implement.
3. The profession should seize any opportunities to broaden the numbers of actuaries
engaged in the discussion of fair value accounting. CAS meetings and the Casualty
Loss Reserve Seminar (CLRS) are the most obvious opportunities to discuss these
concepts with more casualty actuaries. Publication of this white paper in the CAS
Forum, on the CAS web site, and in other appropriate public forums should also be
encouraged.
4. Once an accounting standard setting organization adopts fair valuing for insurance
liabilities, a practice note designed to highlight the issues that practicing actuaries
may wish to consider in implementing that standard should be produced as soon as
possible. Practice notes are designed to provide helpful information quickly, so they
do not go through the due process required of a new Actuarial Standard of Practice
(ASOP). Accordingly, they are not authoritative for actuaries. In addition to
being published, any such practice note should be presented at the CLRS and at CAS
meetings.
5. Finally, the task force believes that issues will arise during implementation that have
not been anticipated in advance. Initially these should be handled through updates to
the practice note. Once some experience has been accumulated, there may be need
for consideration of a new or revised ASOP. The task force has not identified any
need for a new or revised ASOP at this time and believes it is better to defer
developing any such standard until actual practice under a fair value accounting
standard has had a chance to develop. Premature development of an ASOP may
mean that unanticipated but important issues are not addressed in the ASOP. Also, an
ASOP developed too soon may tend to impede the development of good practice by
requiring more justification for estimation methods not yet contemplated during the
drafting of the ASOP.
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
This white paper has discussed many of the major issues involved in fair value
accounting as applied to insurance liabilities. While the focus has been on
property/casualty insurance liabilities, many of the issues are also applicable to other
insurance liabilities.
there is no clear "right" answer. The selection of any financial accounting
paradigm is at least partially a value judgement, not a pure scientific exercise.
Reflection of credit standing is a controversial issue - There are arguments for
and against the reflection of credit standing in fair value estimates of insurance
liabilities. The task force has consciously avoided taking a position on this issue.
Instead we have attempted to present both sides in a clear, objective fashion.
The task force chair wishes to thank all involved with this project for the tremendous
amount of work done in a short period of time. In approximately six months, the task
force team (with the help of key contributors) produced what I believe to be an excellent
work product, one that hopefully will be a major contribution to the profession's
understanding of the fair value issue. Thank you, once again.
CAS Task Force on Fair Value Liabilities
White Paper on Fair Valuing Property/Casualty Insurance Liabilities
Section K - Appendices
Table of Contents
Appendix 1: CAPM Method
This appendix presents an example of computing a risk-adjusted discount rate using
CAPM.
In its simplest form, the approach used in Massachusetts assumes that the equity beta for
insurance companies is a weighted average of an asset beta and an underwriting beta.
The underwriting beta can therefore be backed into from the equity beta and the asset
beta.
β_E = (1 + ks) β_A + s β_U

where
β_E is the equity beta for insurance companies, or alternatively for an individual insurer
β_A is the beta for insurance company assets
β_U is the beta for insurance company underwriting profits
k is the funds generating coefficient, and represents the lag between the receipt of
premium and the average payout of losses in a given line
s is a leverage ratio
Since

β_E = Cov(r_e, r_M) / Var(r_M),
or the equity beta is the covariance between the company's stock return and the overall
market return divided by the variance of the overall market return. It can be measured
by regressing historical P&C insurance company stock returns on a return index such as
the S&P 500 Index. Similarly, β_A can be measured by evaluating the mix of investments
in insurance company portfolios. The beta for each asset category, such as corporate
bonds, stocks, and real estate, is determined. The overall asset beta is a weighted average of
the betas of the individual assets, where the weights are the market values of the assets.
Example:
Assume detailed research using computerized tapes of security returns, such as those
available from CRSP, concluded that β_E for the insurance industry is 1.0 and β_A for the
insurance industry is 0.15. By examining company premium and loss cash flow patterns,
it has been determined that k is 2. The leverage ratio s is assumed to equal 2. The
underwriting Beta is
0. =/L-(~+l)#.
$
542
or fly = .5"(1. - (2"2+1). 15) = .125
Once β_U has been determined overall for the P&C industry, an approach to deriving the
beta for a particular line is to assume that the only factor affecting the covariance of a
given line's losses with the market is the duration of its liabilities:
In order to derive the risk-adjusted rate, the risk free rate and the market risk premium are
needed. Assume the current risk free rate is 6% and the market risk premium (i.e., the
excess of the market return over the risk free return) is 9%. Then the risk-adjusted rate is:

r_L = r_f + β_L (r_M - r_f) = 6% + β_L x 9%
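The arithmetic above can be strung together in a few lines of code. The sketch below simply reproduces the example's calculation; because the duration-based derivation of a line-specific liability beta is not reproduced in the text, the all-lines underwriting beta of 0.125 is used for the liability beta purely for illustration.

```python
def underwriting_beta(beta_equity, beta_assets, k, s):
    """Back the underwriting beta out of the equity and asset betas,
    as in the Massachusetts CAPM approach described above."""
    return (beta_equity - (k * s + 1.0) * beta_assets) / s

def risk_adjusted_rate(risk_free, beta_liability, market_risk_premium):
    """CAPM risk-adjusted discount rate for the liabilities."""
    return risk_free + beta_liability * market_risk_premium

beta_u = underwriting_beta(beta_equity=1.0, beta_assets=0.15, k=2.0, s=2.0)
rate = risk_adjusted_rate(risk_free=0.06, beta_liability=beta_u,
                          market_risk_premium=0.09)
print(f"Underwriting beta:  {beta_u:.3f}")   # 0.125
print(f"Risk-adjusted rate: {rate:.4f}")     # 0.06 + 0.125 x 0.09 = 0.0713
```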
Appendix 2: IRR Method
All balance sheet values are at fair value. Thus, the liability value at each evaluation date
must be calculated using a risk-adjusted interest rate. Since we are trying to find this
value, it is an input that is iterated until the IRR equals the desired ROE. (This is easily
done using the "Goal Seek" function in an Excel spreadsheet.)
The present value of the income taxes is a liability under a true economic valuation
method. However, in the FASB and IASC proposals, it is not included.1 The basis for this
calculation is found in Butsic (Butsic, 2000). To a close approximation, the PV of income
taxes equals the present value of the tax on investment income from capital, divided by 1
minus the tax rate. The PV is taken at an after-tax risk-free rate.
Exhibit A2 shows an example of the risk adjustment calculation, using the IRR method,
for a liability whose payments extend for three periods.
1 Note that the present value of income taxes is not the same as the deferred tax liability. For example, the
present value of income taxes includes the PV of taxes on future underwriting and investment income
generated by the policy cash flows.
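The "Goal Seek" iteration can be sketched in code. The toy model below strips out taxes and assumes the assets earn only the risk-free rate, so the resulting yield is not the one in Exhibit A2; it simply illustrates iterating a candidate risk-adjusted yield until the IRR on the equity flows equals the target ROE. All inputs are illustrative.

```python
def equity_cash_flows(risk_adjusted_yield, loss=1000.0, risk_free=0.06,
                      capital_ratio=0.5):
    """Simplified one-period transaction (no taxes, assets earn the risk-free
    rate): premium is the loss discounted at the candidate risk-adjusted yield,
    shareholders contribute capital proportional to the reserve, and everything
    is released when the loss is paid at time 1."""
    premium = loss / (1.0 + risk_adjusted_yield)
    capital = capital_ratio * premium
    terminal_equity = (premium + capital) * (1.0 + risk_free) - loss
    return [-capital, terminal_equity]

def irr_one_period(flows):
    # For a two-flow stream the internal rate of return has a closed form.
    return -flows[1] / flows[0] - 1.0

def solve_risk_adjusted_yield(target_roe, lo=-0.5, hi=0.5, tol=1e-10):
    """Iterate the risk-adjusted yield (the spreadsheet 'Goal Seek' step)
    until the IRR on the equity flows equals the desired ROE."""
    mid = 0.5 * (lo + hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        irr = irr_one_period(equity_cash_flows(mid))
        if abs(irr - target_roe) < tol:
            break
        # In this model the IRR falls as the risk-adjusted yield rises.
        if irr > target_roe:
            lo = mid
        else:
            hi = mid
    return mid

yield_star = solve_risk_adjusted_yield(target_roe=0.132)
print(f"Risk-adjusted yield: {yield_star:.4f}")   # about 0.0240 in this toy model
print(f"Risk adjustment:     {0.06 - yield_star:.4f}")
```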
Exhibit A2
Calculation of Risk Adjustment Using Internal Rate of Return Model
Notes to Exhibit A2
Rows (Note that "R1" denotes Row 1, "R2" denotes Row 2, etc.):
1. Rate for portfolio of U. S. Treasury securities having same expected cash flows as the
losses.
2. Expected return for the insurer's investment portfolio. Note that the yield on a bond is
not an expected return. The yield must be adjusted to eliminate expected default.
Municipal bond yields are adjusted to reflect the implied return as if they were fully
taxable.
3. Statutory income tax rate on taxable income.
4. Estimates can be obtained from Value Line, Yahoo Finance or other services.
5. Estimates are commonly available in rate filings (e.g., Massachusetts).
6. All-lines value can be estimated by adjusting historical industry reserve values to
present value and adding back the after-tax discount to GAAP equity. See Butsic
(1999) for an example. For individual lines, a capital allocation method can be used,
such as Myers and Read (1999).
7. An arbitrary round number used to illustrate the method.
10. R1 + (R4 x R5).
11. R1 - R16.
12. (1 - R3) x R1.
13. R25 + R26 (at time 0).
16. This value is iterated until the IRR (Row 45) equals R10.
21. R22 (Prior Year) + R37 + R38 + R39.
22. R21 + R43.
25. Present value of negative R38 using interest rate R11.
26. Present value of R41 using interest rate R12. Result is divided by (1 - R3).
27. (R6, capital/reserve) x R25.
28. R27 + R43.
31. Time 0: R37 - R25. Time 1 to 3: - R11 x R25 (Prior Year).
32. (R22, Prior Year) x R2.
33. R31 + R32.
34. (R28, Prior Year) x R1.
37. R13.
38. R7 x payment pattern in Rows 4 through 7.
39. R3 x R33.
41. R3 x R34.
43. R28 - R27.
45. Internal rate of return on Row 43 cash flows.
Appendix 3: Single Period RAD model
Here, there is no iteration needed, since the risk adjustment is derived directly from the
equations relating the variables to each other. Butsic (2000) derives this result.
The formula, expressed in terms of the quantities in Exhibit A3 (the capital-to-reserve ratio c, the
required ROE R, the risk-free rate i, the income tax rate t, the expected investment return r_A, and
the after-tax risk-free rate i_at), is

z = c (R - i) / (1 - t) - (r_A - i) [1 + c (1 + i) / (1 + i_at)]
Although the risk adjustment can be calculated directly from the above formula, we have
provided Exhibit A3, which shows that the risk adjustment in fact produces the required
ROE and internal rate of return. The format of Exhibit A3 is similar to that of Exhibit A2.
However, only a single time period is needed.
Note that exhibits A2 and A3 give slightly different results for the risk adjustment. This is
because capital is needed for both asset and liability risk. In a multiple period model, the
relationship between the assets and loss reserve fair value is not strictly proportional. This
creates a small discrepancy.
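Because the single-period adjustment is a closed-form expression, it is easy to verify numerically. The sketch below reproduces the calculated values shown in Exhibit A3 (following Note 15 to the exhibit) from the exhibit's fixed inputs; it is a verification aid rather than part of the original exhibit.

```python
def single_period_risk_adjustment(risk_free, invest_return, tax_rate,
                                  equity_beta, market_risk_premium,
                                  capital_to_reserve):
    """Required ROE from CAPM, after-tax risk-free rate, then the direct
    (non-iterative) risk adjustment per Note 15 to Exhibit A3."""
    required_roe = risk_free + equity_beta * market_risk_premium
    after_tax_rf = (1.0 - tax_rate) * risk_free
    adj = (capital_to_reserve * (required_roe - risk_free) / (1.0 - tax_rate)
           - (invest_return - risk_free)
           * (1.0 + capital_to_reserve * (1.0 + risk_free) / (1.0 + after_tax_rf)))
    return required_roe, after_tax_rf, adj

roe, i_at, z = single_period_risk_adjustment(
    risk_free=0.060, invest_return=0.080, tax_rate=0.350,
    equity_beta=0.800, market_risk_premium=0.090, capital_to_reserve=0.500)

print(f"Required ROE:        {roe:.4f}")          # 0.1320
print(f"After-tax risk-free: {i_at:.4f}")         # 0.0390
print(f"Risk adjustment:     {z:.5f}")            # about 0.02518
print(f"Risk-adjusted yield: {0.060 - z:.4f}")    # about 0.0348
```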
Exhibit A3
Calculation of Risk Adjustment Using Single Period ROE Model
Fixed Inputs
 1  Risk-free rate                           0.060
 2  Expected investment return               0.080
 3  Income tax rate                          0.350
 4  Equity beta                              0.800
 5  Market risk premium                      0.090
 6  Capital/reserve                          0.500
 7  Loss & LAE                             1000.00

Calculated values
10  Required ROE                             0.1320
11  Risk-adjusted yield                      0.0348
12  After-tax risk-free rate                 0.0390
14  Premium                                981.38
15  Risk adjustment                          0.02518

Balance sheet, at fair value              Time 0      Time 1
Assets
20  Investments, before dividend            976.12      546.96
21  Investments, after dividend            1459.30        0.00
Liabilities
24  Loss & LAE                              966.35        0.00
25  Income tax liability                     15.02        0.00
26  Capital, before dividend                  0.00      546.96
27  Capital after div (required amount)     483.18        0.00

Income
30  Underwriting income                      15.02      -33.65
31  Investment income                                   116.74
32  Net income, pretax                       15.02       83.10
33  Inv income, capital (risk-adjusted)                  28.99

Insurance Cash Flows
36  Premium                                 981.38        0.00
37  Loss & LAE                                0.00    -1000.00
38  Income tax                               -5.26      -29.08
40  Income tax, capital (risk-adjusted)                  10.15
42  Capital flow (dividend)                 483.18     -546.96

44  ROE                                     13.29%
46  Internal rate of return                 13.20%
Notes to Exhibit A3
Rows (Note that "R1" denotes Row 1, "R2" denotes Row 2, etc.):
1. Rate for portfolio of U. S. Treasury securities having same expected cash flows as the
losses.
2. Expected return for the insurer's investment portfolio. Note that the yield on a bond is
not an expected return. The yield must be adjusted to eliminate expected default.
Municipal bond yields are adjusted to reflect the implied return as if they were fully
taxable.
3. Statutory income tax rate on taxable income.
4. Estimates can be obtained from Value Line, Yahoo Finance or other services.
5. Estimates are commonly available in rate filings (e.g., Massachusetts).
6. All-lines value can be estimated by adjusting historical industry reserve values to
present value and adding back the after-tax discount to GAAP equity. See Butsic
(1999) for an example. For individual lines, a capital allocation method can be used,
such as Myers and Read (1999).
7. An arbitrary round number used to illustrate the method.
10. R1 + (R4 x R5).
11. R1 - R15.
12. (1 - R3) x R1.
14. R24 + R25 (at time 0).
15. R6 x (R10 - R1) / (1 - R3) - (R2 - R1) x [1 + R6 x (1 + R1) / (1 + R12)].
20. R21 (Prior Year) + R36 + R37 + R38.
21. R20 + R42.
24. Present value of R7 using interest rate R11.
25. Present value of R40 using interest rate R12. Result is divided by (1 - R3).
26. Time 0: 0; Time 1: R20 - R24 - R25.
27. R6 x R24.
30. Time 0: R36 - R24. Time 1: - R11 x R24 (Prior Year).
31. (R21, Prior Year) x R2.
32. R30 + R31.
33. (R27, Prior Year) x R1.
36. R14.
37. Time 0: 0. Time 1: - R7.
38. R3 x R32.
40. R3 x R33.
42. R27 - R26.
44. (R26, Time 1) / (R27, Time 0) - 1.
46. Internal rate of return on Row 42 cash flows.
Appendix 4: Using Underwriting Data
This appendix describes Butsic's procedure for computing risk adjusted discount rates.
The following relationship is used for the computation:

C = P (1 + i)^(-u) - E (1 + i)^(-w) - L (1 + i_A)^(-t)

where:
C is the cash flow on a policy and can be thought of as the present value of the
profits, both underwriting and investment income, on the policy,
P is the policy premium,
E is expenses and dividends on the policy,
L is the losses and adjustment expenses,
u is the average duration of the premium, or the average lag between the
inception of the policy and the collection of premium,
w is the average duration of the expenses,
t is the average duration of the liabilities.
i is the risk free rate of return
i_A is the risk adjusted rate of return
This formula says that the present value cash flow or present value profit on a group of
policies is equal to the present value of the premium minus the present value of the
components of expenses minus the present value of losses. Premiums and expenses are
discounted at the risk free rate. Each item is discounted for a time period equal to its
duration, or the time difference between inception of the policy or accident period and
expiration of all cash flows associated with the item. Losses are discounted at the risk-
adjusted rate. Underwriting data in ratio form, i.e., expense ratios, loss ratios, etc. can be
plugged into the formula. When that is done, P enters the formula as 1, since the ratios
are to premium.
Using as a starting point the rate of return on surplus, where the surplus supporting a
group of policies is assumed to be eV_m, or the leverage ratio times the average discounted
reserve, Butsic (Butsic, 1988) derived the following simplified expression for the risk
adjustment:
Z = e(R - i) = (1 + i) C / V_m,

where:
Z is the risk adjustment to the interest rate, or the percentage amount to be subtracted
from the risk free rate = e(R - i)
C and i are as defined above
V_m is the average discounted reserve for the period
V_m is generally taken as the average of the discounted unpaid liabilities at the beginning
of the accident or policy period (typically 100% of the policy losses) and the discounted
unpaid liabilities at the end of the period. In general, this would be equal to 100% plus
the percentage of losses unpaid at the end of the period (one year if annual data is used)
divided by 2. The discount rate is the risk-adjusted rate. If V_m is computed as a ratio to
premium, then published loss ratios are discounted and used in the denominator.
To complete the calculation, the quantity C, or the ratio of discounted profit to premium,
should be multiplied by (1 + i) and divided by V_m (V_m in ratio form). To derive initial
estimates of the risk adjustment, it is necessary to start with a guess as to the value of the
risk adjustment to the discount rate in order to obtain a value for discounted liabilities.
The following is an example of the computation of the risk adjustment using this
method. It is necessary to start with a guess for the risk adjustment and then perform
the calculation iteratively until it converges on a solution. This example is based on
data in Butsic's (1988) paper.
Parameter assumptions
Interest Rate Rf                      0.0972
Fraction of losses OS after 1 year 0.591
Initial Risk Adjustment 0.044
Calculation
6  Premium - Expenses, Discounted             (2) - (3) - (4)     0.702
7  Premiums - Expenses - Losses, Discounted   (6) - (1)           0.021
8  C x (1 + i)                                (7) x (1.0972)      0.024
9  Z = C x (1 + i) / Vm                       (8) / (5)           0.042
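A sketch of the iterative calculation follows. The premium-less-expense present value, the fraction of losses outstanding after one year, and the interest rate come from the exhibit above; the undiscounted loss ratio and the average liability duration are hypothetical stand-ins for the exhibit rows not reproduced in the text, and a bracketing search is used in place of naive successive substitution for numerical stability.

```python
def z_residual(z, loss_ratio, pv_prem_less_exp, frac_unpaid_1yr,
               risk_free, avg_duration):
    """Residual of Butsic's fixed point Z = (1 + i) * C / Vm, where losses are
    discounted at the candidate risk-adjusted rate i - Z."""
    i_a = risk_free - z
    pv_losses = loss_ratio / (1.0 + i_a) ** avg_duration
    c = pv_prem_less_exp - pv_losses                   # discounted profit, ratio to premium
    v_m = pv_losses * (1.0 + frac_unpaid_1yr) / 2.0    # average discounted reserve
    return (1.0 + risk_free) * c / v_m - z

def solve_z(lo=0.0, hi=0.2, tol=1e-8, **inputs):
    """Bracketing search for the converged risk adjustment."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if z_residual(mid, **inputs) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# 0.702, 0.591 and 0.0972 are from the exhibit above; the loss ratio and
# duration are hypothetical.
z = solve_z(loss_ratio=0.755, pv_prem_less_exp=0.702, frac_unpaid_1yr=0.591,
            risk_free=0.0972, avg_duration=2.0)
print(f"Converged risk adjustment Z: {z:.3f}")   # roughly 0.044 with these inputs
```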
r_L = C / [L (1 + i)^(-t)]

where r_L is the additive risk load and i is the risk free interest rate.
An example is shown below:
Parameter assumptions
Interest Rate Rf                      0.0972
Calculation
5  Premium - Expenses, Discounted                 (2) - (3) - (4)     0.702
6  C = Premiums - Expenses - Losses, Discounted   (5) - (1)           0.063
7  C / PV(Losses)                                 (6) / (1)           0.133
Appendix 5: The Tax Effect
More recent work by Butsic (Butsic, 2000) has examined the effect of taxes on the risk
adjusted discount rates and insurance premium. Butsic argued that, due to double
taxation of corporate income, there is a tax effect from stockholder supplied funds.
Stockholder funds are the equity supplied by the stockholder to support the policy. In the
formulas above, stockholder supplied funds are denoted by E, and e is taken to be the ratio of
E to the present value of losses, V = L(1 + i_A)^(-t). For a one period policy an amount E is
invested at the risk free rate i, an amount Ei of income is earned, but because it is taxed at
the rate t, the after tax income is Ei(1 - t). The reduced investment income on equity will
be insufficient to supply the amount needed to achieve the target return. In order for the
company to earn its target after tax return, the amount lost to taxes must be included in
the premium. However, the underwriting profit on this amount will also be taxed. The
amount that must be added to premium to compensate for this tax effect is:
Eit / ((1 - t)[1 + i(1 - t)])
This is the tax effect for a one period policy if the discount rate for taxes is the same as
the discount rate for pricing the policy, i.e., the risk adjusted discount rate. Butsic shows
that there is an additional tax effect under the current tax law, where losses are discounted
at a higher rate than the risk adjusted rate. There is also a premium collection tax effect,
due to lags between the writing and collecting of premium. This is because some
premium is taxed before it is collected. Butsic developed an approximation for all of
these effects taken together, as well as the multiperiod nature of cash flows into the
following adjustment to the risk adjusted discount rate:
i_A' = i - e(1 - t)(r_e - i), where
i_A' is the tax and risk adjusted rate,
e is a leverage ratio,
t is the tax rate,
r_e is the pre tax return on equity.
This is the effective rate used to discount losses to derive economic premium. The tax
effect acts like an addition to the pure risk adjustment. Since premiums as stated in
aggregate industry data already reflect this tax effect, no adjustment is needed for the risk
adjusted discount rate used for pricing. However, for discounting liabilities, it may be
desirable to segregate the tax adjustment from the pure risk adjustment, since the tax
effect really represents a separate tax liability. Using the formula above, as well as the
formula for determining the pure risk adjustment to the discount rate the two effects
could be segregated. One would need to have an estimate of the total pre tax return on
equity.
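As a small illustration of the adjustment, the sketch below evaluates Butsic's approximation for a hypothetical set of inputs; the leverage ratio, tax rate, and pre-tax return on equity shown are not taken from the paper.

```python
def tax_and_risk_adjusted_rate(risk_free, leverage_ratio, tax_rate, pretax_roe):
    """Butsic's approximation i_A' = i - e(1 - t)(r_e - i) for the combined
    tax- and risk-adjusted discount rate."""
    return risk_free - leverage_ratio * (1.0 - tax_rate) * (pretax_roe - risk_free)

# Hypothetical inputs for illustration only.
i_adj = tax_and_risk_adjusted_rate(risk_free=0.06, leverage_ratio=0.5,
                                   tax_rate=0.35, pretax_roe=0.18)
print(f"Tax- and risk-adjusted discount rate: {i_adj:.4f}")  # 0.06 - 0.5 x 0.65 x 0.12 = 0.021
```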
Appendix 6: Using Aggregate Probability Distributions
This example uses the Collective Risk Model to compute a risk load. It represents only
one of the many approaches based on aggregate probability distributions. This is in order
to keep the illustration simple.
Therefore, in order or compute a risk load, two quantities are needed: ~. and Var[Loss],
since SD(Loss) = Var[Loss] In. The following algorithm from Meyers (Meyers, 1994)
will be used to compute the variance of aggregate losses.
The Model:
2. Assume the Poisson parameter, n (the claim distribution mean), varies from
risk to risk.
4. Select the claim count, K, at random from a Poisson distribution with mean xn,
where the random variable X is multiplied by the random Poisson mean n.
5. Select occurrence severities, Z~, Z2, .., ZK, at random from a distribution with
mean 1/and variance o~.
The expected occurrence count is n ( i.e. El(Z n] = E[n] = n). n is used as a measure of
exposure.
When there is no parameter uncertainty in the claim count distribution c = O,
554
and variance is a linear function o f exposures.
var[x] = ,, + ,~v.
where
, , = ~ + o ~)
and
v= c~ 2
For example, assume an insurer writes two lines of business. The expected claim volume
for the first line is 10,000 and the expected claim volume for the second line is 20,000.
The parameter c for the first line is 0.01 and for the second line is 0.005. Let the severity
for line 1 be lognormal with a mean of $10,000 and volatility parameter (the standard
deviation of the logs of losses) equal to 1.25 and the severity for line 2 be Iognormal with
severity of $20,000 and volatility equal to 2. Applying the formula above for the
variance o f aggregate losses, we fred that the variance for line 1 is 1.05x 10 t4 and the
variance of line two is 1.24 x l0 ts and the sum o f the variances for the two lines is ! .34 x
10 tS. The standard deviation is $36,627,257.
One approach to determining the multiplier )1.would be to select the multiplier ISO uses
in its increased limits rate filings. In the increased limits rate filings, ~. is applied to the
variance of losses and is on the order of 107.(Meyers, 1998)
In recent actuarial literature, the probability of rain has been used to determine the
multipliers of SD(loss)or Var(Loss). (Kreps 1998, Meyers 1998, Philbrick, 1994). The
probability o f ruin or expected policyholder deficit is used to compute the amount of
surplus required to support the liabilities. To keep the illustration simple, we use the
probability o f ruin approach. However, the expected policyholder deficit or tail value at
risk (which is similar to expected policyholder deficit) approaches better reflect the
current literature on computing risk loads. Suppose the company wishes to be 99.9% sure
that it has sufficient surplus to pay the liabilities, ignoring investment income, the
company will require surplus of 3.1 times the standard deviation of losses, if one assumes
that losses are normally distributed. 2 In order to complete the calculation, we need to
know the company's required return on equity, re. This can be determined by examining
historical return data for the P&C insurance industry. Then the required risk margin for
one year is re x 3.1x 36,627,257. For instance, if re is 10% then the risk margin is
2 If one assumes that aggregate losses are lognormallydistributed,then the company needs approximately
e(z33"~)* the expected losses as sugphut,where ,06 is the volatilityparameter,derived from the rne~ and
variance of the distribution..
555
1 !,354,450 or about 2.0% of expected losses. In this example, the parameter lambda is
equal to 3.1 re. The result computed above could be converted into a risk margin for
discounted losses by applying the 2% to losses discounted at the risk free rate. This
would require the assumption that the risks o f investment income on the assets supporting
the losses being less than expected is much less than the risk that losses will be greater
than expected. When the assets supporting the liabilities are primarily invested in high
quality bonds, this assumption is probably reasonable. (see D'Arcy el. al., 1997)
Philbrick in his paper commissioned by the CAS "Accounting for Risk Margins" had a
slightly different approach to determining the risk margin. Philbrick's formula for risk
margin, given a total surplus requirement S, (i.e. 3.1" standard deviation in this example),
a rate o f return on equity re and a risk free rate i is:
RM = (r,-i)xS
l+r,
This is a risk margin for discounted losses not undiscounted losses.. The formula above
assumes that some of the required return on surplus is obtained from investing the surplus
at the risk free rate. if i = 5%, and re = 10% the risk margin in this illustration would be
$5,161,113.
in this example, it should be noted that the majority of the standard deviation is due to
parameter risk, as process risk for such large claim volumes is minimal. However, only
parameter risk for claims volumes has been incorporated. A more complete model would
incorporate parameter risk for the severity distribution. This risk parameter has been
denoted the "mixing parameter" in the actuarial literature. The algorithm for
incorporating this variance into the measure of aggregate loss variance is as follows:
X = Z,~_IZ , / B
v a r [ x l = . ( l + b ) ( u ~ + o~)+.~(b+~+b~)** ~.
556
Procedures for estimating b and c are provided by Meyers and Schenker. The procedures
use the means and variances of the claim count and the loss distribution to compute b and
c. The parameter b can also be ~,iewed as the uncertainty contributed to the total estimate
of losses due to uncertainty in the trend and development factors. Methods for measuring
the variance due to development are presented by Hayne, Venter and Mack. Regression
statistics containing information about the variances of trend factors are published in ISO
circulars and can be developed from internal data. To continue our example, we will
assume that the b parameter for line 1 is 0.02 and for line 2 is 0.05. Then the standard
deviation of aggregate losses is $95,663,174. The risk load using Philbrick's formula is
$13,479,811 or 2.7% of expected undiscounted losses. The load is intended to be applied
to discounted liabilities where liabilities are discounted at the risk free rate. Thus if
losses take one year to pay out the risk margin is 2.8% of the present value of liabilities.
The above risk load is consistent with liabilities that expire in one year. When losses
take more than one year to pay, Philbrick uses the following formula to derive a risk load.
_ (r,-i)S
This formula can be applied to liabilities of any maturity. Where S i is the surplus
requirement for outstanding liabilities as of yearj. In the above example if losses pay out
evenly over 3 years then the risk margin is $20,693,737or 4.6% of he discounted
liabilities. The calculation is shown below.
The computation above assumes that the relative variability of the liabilities remains
constant as the liabilities mature. As this may not be the case, refinements to the measure
of variability by age of liability may be desirable. One approach to modeling the
uncertainty in reserves would derive measures of variability from observed loss
development variability. This is the approach used by Zenwirth, Mack and Hayne.
Another approach, consistent with how risk base capital is computed, would measure
historic reserve development for P&C companies for a line of business from Schedule P.
557
Appendix 7: Direct Estimation of Market Values
Below we illustrate how to estimate the risk adjustment to the interest rate for a single
firm, based on empirical data.
Assume that the market value of assets is 1400 and the book (undiscounted) value of the
liabilities is 1000. Both of these values are available from the insurer's published
financial statements. Also, assume that using the Ronn-Verma method (see the discussion
in the Credit Risk Appendix), the estimated market value of the firm's equity is 500 and
that the value of the expected default (the credit risk adjustment) is 10. The market value
of the equity adjusted to exclude default is 510.
The discounted risk adjusted liabilities equals the market value of the assets minus the
market value of the equity or 900 = 1400 - 500. The implied market value of the
liabilities adjusted for default equals the market value of the assets minus the market
value of the equity adjusted for default, or 890 = 1400 - 510.
Assume that the risk-free interest rate applicable to valuing the insurer's expected
liability payments is 6% and that the liability payment pattern is 10% per year for 10
years (paid at the end of each year). The present value of the liabilities at the risk-free rate
is 730. Thus, the risk margin, expressed in dollars is 160 = 890 - 730. Alternatively, the
interest rate that gives a present value of 890 using the above payment pattern is 2.18%.
This value implies a risk adjustment of 3.82%.
The following discussion provides an example of the Ronn and Verma method.
Let A be the market value of assets, L the market value of liabilities and o" the volatility
of the asset/liability ratio. The formula for the owners' equity, where there is a possibility
of default, is the call option with expiration in one year:
Notice that equity value with no default is simply E, = A - L. For an insurer with
stochastic assets and liabilities, o'~, the volatility of the equity, is related to the
asset/liability volatility by
(2) tre = N ( d ) A t r / E .
558
Equations (!) and (2) are solved simultaneously to get E and o'.
The expected default value equals E - E , , or the derived market value o f the equity
minus the equity value with .no default.
The method is easily demonstrated with a numerical example. Assume that A = 130, L =
100 and cre = 0.5. Solving the simultaneous equations gives E = 40.057 and cr = 0.117.
Therefore, the value of the expected default is
For an insurer, the market value of assets is readily determined from the published
balance sheet. Discounting the reserves at a risk-free rate can approximate the market
value of liabilities. The equity volatility can be estimated by analyzing the insurer's stock
price over a recent time frame, as done by Allen, Cummins and Phillips.
559
A p p e n d i x 8: Distribution T r a n s f o r m Method
Assume expected claim counts for a policy equal 100 and ground up severities follow a
Pareto distribution:
G(x)= [b/(b+x)] ~
Therefore
E[X]=b/(q-I)
E(aggregate loss) = 100" E[X]
If the market risk premium is 10% then risk loaded premiums equal:
b b
100 1.1 = l O 0 - -
q-I qr-I
r=[(q-1)/i.l+l]/q= (q+O.l)/l.lq.
Expected values for higher layers could be computed by replacing q with qr in the Pareto
distribution and using the Pareto formula for limited expected value to price the excess
layers.:
For instance the formula for the transformed mean of a Poisson distribution with a mean
of 100 and transformation parameter r is:
560
y_((eZ°°F(j) - r ( j , lOO))/r(j)) ~
J
This formula could be combined with the formula for the transformed severity
distribution to produce a risk loaded mean.
561
Appendix 9: Credit Risk
To make the credit risk adjustment calculation more tractable, it is customary to assume
an annual time horizon and that future insolvencies have the same probability as for the
current one-year horizon. For longer-term liabilities, one can further assume that the
insolvency probabilities are independent year-to year and then determine the overall
expected default by a formula suggested by the above 5-year calculation:
D=t)~[1-w,(l-p)-w:(l-.p) 2 - . . . - w . ( l - p ) ~] ~_ D , [ w , + 2 w 2 + . . . + n w ~ ] .
P
Here, D~ is the fair value of the expected default for the one-year horizon, p is the one-
year insolvency probability and the weight w, is the expected proportion of loss paid in
year i (the weights sum to 1). Using the approximation above, the fair value over an n
year time horizon of a company's option to default can be expressed as a function of its
•one year default value.
It should be noted that the published research relating to bond default rates does not
support the assumption that annual default rates over the life of a bond are independent
and identically distributed. That is, for many categories of bonds, the default rate during
the third and fourth year is higher than the default rate during the first and second years
after issuance. If the assumption of independent and identically distributed default rates
is inappropriate for bonds, it may be inappropriate for some of the companies issuing
bonds (i.e. insurance companies) and therefore the approximations in the above formula
would not be appropriate.
562
A related technical issue that must be addressed in calculating the credit risk adjustment
is the length of the time horizon over which defaults are recognized. At one extreme, it
may be argued the applicable horizon is unlimited. Insurers are obliged to pay claims
occurring during the contractual coverage period, no matter how long the reporting and
settlement processes take. On the other hand, solvency monitoring and financial reports
have a quarterly or annual cycle. Also, it is important to recognize that capital funding
and withdrawal decisions are made with an approximate quarterly or annual cycle. An
approach that often makes the solution easier to derive is to assume that one may view
the time horizon as being a fairly short duration. According to this view, if the company
is examined over short increments such as one year, corrective action is applied and
insolvency over a longer term is avoided. The task force considers this view to be
controversial. The alternative view is that insurance liabilities are often obligations with
relatively long time horizons, and these longer horizons need to be considered when
evaluating the companies' option to default on its obligations.
In the numerical examples below, we have determined the annual fair value o f default.
The extension to longer-duration liabilities is straightforward, using the above formula, if
one assumes the formula to be appropriate. If one assumes the formula to be
inappropriate, many of the methods below can be modified to adjuste for the longer time
horizon of insurance liabilities.
The following (until #2, the DFA example), is a repeat of a few pages ago immediately
following Appendix 7,
The following discussion provides an example of the Ronn and Verma method.
Let A be the market value o f assets, L the market value of liabilities and o" the volatility
of the asset/liability ratio. The formula for the owners' equity, where there is a possibility
of default, is the call option with expiration in one year:
(1) E=A.N(d)-L.N(d-~r),
where d = ln(A/L)/cr + cr/2 and N(d) is the standard normal distribution evaluated at
d.
Notice that equity value with no default is simply E, = A - L . For an insurer with
stochastic assets and liabilities, o'n, the volatility of the equity, is related to the
asset/liability volatility by
563
(2) cr E = N(d)Atr/E.
The expected default value equals E - E , , or the derived market value of the equity
minus the equity value with no default.
The method is easily demonstrated with a numerical example. Assume that A = 130, L =
100 and o"e = 0.5. Solving the simultaneous equations gives E = 40.057 and cr = 0.117.
Therefore, the value of the expected default is
For an insurer, the market value of assets is readily determined from the published
balance sheet. Discounting the reserves at a risk-free rate can approximate the market
value of liabilities. The equity volatility can be estimated by analyzing the insurer's stock
price over a recent time frame, as done by Allen, Cummins and Phillips.
An insurer has initial liabilities of $100 million, measured at fair value, but under the
assumption that all contractual obligations will be paid. Assume that the DFA model has
been run using 10,000 simulations. The time horizon is one year. Wc examine all
observations where the terminal fair value (before default) of liabilities exceeds the
market value of the assets. Suppose that there are 22 of them, with a total deficit (liability
minus asset value) of $660 million. The average default amount per simulation is $0.066
million.
The expected terminal fair value is then discounted at a risk-adjusted interest rate to get
the fair value of the credit risk adjustment, With a 4% risk-adjusted interest rate, for
example, the fair value of the default is $0.063 million = 0.066/1.04. Thus, the fair value
of the liabilities, adjusted for credit risk, is $99.94 million ($100 million - $.06 million.
564
3. Rating Agency Method: Example
This example shows how the table of default ratios might look, ifa one-year time horizon
approach was used. Alternatively, a matrix o f default ratios by rating and lag year could
be used, similar to those available from Moody's (e.g., Moody's January 2000 report titled
"Historical Default Rates of Corporate Bond Issuers, 1920-1999"). Here the ratings are
the current A. M. Best categories. The values in the table below are purely hypothetical.
The raw results would be based on historical insolvency data. A simulation model or a
closed-form model could be applied to a large sample of companies within each rating
group to produce the adjusted results. These results might be further adjusted to ensure
that a higher rating had a corresponding lower default expectation.
To show how the above table would be applied, assume that an insurer has initial
liabilities of $100 million. These are measured at fair value, but under the assumption that
all contractual obligations will be paid. Assume also that the insurer has an A- Best's
rating. The expected default is 0.05% of $100 million, or $50,000.
The expected terminal fair value is then discounted at a risk-adjusted interest rate to get
the fair value of the credit risk adjustment. With a 4% risk-adjusted interest rate, for
example, the fair value of the default is $48,100 = 50,000/1.04. Thus, the fair value o f the
liabilities, adjusted for credit risk, is $99.95 million.
565
A p p e n d i x 10: R e f e r e n c e s
.Sec~nA - Background
I) FASB Preliminary Views document titled "Reporting Financial Instruments and Certain Related
Assets and Liabilities at Fair Value", available for download (via the "Exposure Drafts" link) at:
http://www.rutgers edu/Accounting/raw/fasb/new/index.html
5) Philbrick, Stephen W., "Accounting for Risk Margins," Casualty Actuarial Society Forum, Spring
1994, Volume I, pp. 1-87, available for download at:
http://www.casact.org/library/reserves/94SPFI.PDF
1) Stulz, Rene, "'Whats wrung with modem capital budgeting?", Address to the Eastern Finance
Association, April, 1999
2) D'Arey, S. P., and Doherty, N. A, "The Financial Theory of Pricing Proper~y-Liability Insurance
Contracts," Huebner Foundation, 1988
3) Fairley, William, "Investment Income and Profit Margins in Property Liability Insurance: theory
and Empirical Evidence," Fair Rate of Return in Property-Liability Insurance, Cummins, JD.,
Harrington S.A, Eds, Kluwer-Nijhoff Publishing, 1987, pp. 1-26.
4) Fama, Eugene and French, Kenneth, "The Cross Section of Expected Stock Retums" Journal of
Finance, Vol 47, 1992, pp. 427-465
5) Fama, Eugene and Kenneth French, "Industry Costs of Equity," Journal of Financial Economics,
Vol43, 1997, pp. 153-193
6) Feldblum, Shalom, "'Risk Load for Insurers", PCAS LXXVII, 1990, pp. 160- 195
566
7) Kozik, Thomas, "Underwriting Betas-The Shadows of Ghosts," PCAS LXXXI, 1994, pp. 303-
329.
8) Mahler, Howard, 'q'he Meyers-Cohn Profit Model, A Practical Application," PCAS LXXXV,
1998, pp. 689 - 774.
9) Meyers, S and Cohn, R, "A Discounted Cash Flow Approach to Property-Liability Rate
Regulation," Fair Rate of Return in Property-Liability Insurance, Cummins, J.D., Harrington
S.A., Eds, Kluwer-Nijhoff Publishing, 1987, pp. 55-78.
10) Myers, S. C. and R. Cohn, 1987, "Insurance rate of Return Regulation and the Capital Asset
Pricing Model, Fair Rate of Return in Property Liability Insurance", in J. D Cummins and S.
Han-ington, eds. Kluwer-Nijhoff Publishing Company, Norwell MA.
I) Brealy, Richard A. and Stuart C. Myers, 1996, "Principles of Corporate Finance (5th Edition)",
McGraw-Hill, New York.
3) Roth, R., "Analysis of Surplus and Rates of Return Using Leverage Ratios"~ 1992 Casualty
Actuarial Society Discussion Paper Program - Insurer Financial Solvency, Volume 1, pp 439-464
Butsic, Robert, "'Determining the Proper Discount Rate for Loss Reserve Discounting: An
Economic Approach," 1988 Casualty Actuarial Society Discussion Paper Program - Evaluating
Insurance Company Liabilities, pp. 147-188.
2) D'Arcy, Stephen P., 1988, "Use of the CAPM to Discount Property-Liabdity Loss Reserves",
Journal of Risk and Insurance, September 1988, Volume 55:3, pp. 481-490.
1) Butsic, Robert, "Determining the Proper Discount Rate for Loss Reserve Discounting: An
Economic Approach," 1988 Casualty Actuarial Society Discussion Paper Program - Evaluating
Insurance Company Liabilities, pp. 147-188.
2) Butsie, Robert P., 2000, Treatment of Income Taxes in Present Value Models of Property-Liability
Insurance, Unpublished Working Paper.
3) Myers, S and Cohn, R, "'A Discounted Cash Flow Approach to Property-Liability Rate
Regulation," Fair Rate of Return in Property-Liability Insurance, Cummins, J.D., Hanington
S.A., Eds, Kluwer-NijhoffPublishing, 1987, pp. 55-78
567
Method 5 - Actuarial Distribution-Based Risk Loads
I) Butsic, Robert P., 1994, "Solvency Measurement for Property-Liability Risk-Based Capital
Applications', Journal of Risk and Insurance, 61: 656-690.
2) Comell, Bradford, "Risk. Duration and Capital Budgeting: New Evidence on Some Old
Questions", Journal of Business, 1999 vol 72, pp 183-200.
3) Hayne, Roger M., "Application of Collective Risk Theory to Estimate Variability in Loss
Reserves." Proceedings of the Casualty Actuarial Society (PCAS), LXXVI, t989, p. 77-110
4) Heckman, Philip. "Seriatim, Claim Valuation from Detailed Process Models." paper presented at
Casualty Loss Reserve Seminar, 1999.
5) Heckman, Philip and Meyers, Glenn, "The Calculation of Aggregate Loss Distributions from
Claim Severity and Claim Count Distributions," PCAS, 1983, pp 22-621
7) Kreps, Rodney E., "Reinsurer Risk Loads from Marginal Surplus Requirements," Proceedings of
the Casualty Actuarial Society (PCAS), LXXVII, 1990, p. 196
8) Mack. Thomas, "Which Stochastic Model is Underlying the Chain Ladder Method,", CAS Forum,
Fall 1995, pp 229-240
9) Meyers, Glenn, "The Cost of Financing Insurance", paper presented to the NAIC's Insurance
Securitization Working Group at the March 2000 NAIC quarterly meeting.
10) Meyers, Glenn G , "The Competitive Market Equilibrium Risk Load Formula for Increased Limits
Ratemaking," Proceedings of the Casualty Actuarial Society (PCAS), LXXVIII, 1991, pp 163-200
I 1) Meyers, Glenn G., "Risk Theoretic Issues in Loss Reserving: The Case of Workers Compensation
Pension Reserves," Proceedings of the Casualty Actuarial Society (PCAS), LXXVI, 1989, p. 17 I
12) Meyers, Glen and Schenker, Nathaniel. "Parameter Uncertainty in the Collective Risk Model,"
PCAS, 1983, pp. I I 1-143
13) Philbrick, Stephen W, "Accounting for Risk Margins," Casualty Actuarial Society Forum, Spring
1994, Volume 1, pp. 1-87
14) Stulz. Rene. "Whats wrong with modern capital budgeting?", Address to the Eastern Finance
Association, April, 1999
15) Zehnwirth, Ben, "Probabilistic Development Factor Models with Application to Loss Reserve
Variability, Prediction Intervals and Risk Based Capital," CAS Forum, Spring 1994, pp. 447-606.
568
Method 7- Direct estimation of market values
Alien, Franklin, J. David Cummins and Richard D. Phillips, 1998, "Financial Pricing of Insurance
in a Multiple Line Insurance Company", Joumal of Risk and Insurance, 1998, volume 65, pp.
597-636.
2) Ronn, Ehun l., and Avinash K. Verma, 1986, Pricing Risk-Adjusted Deposit Insurance: An
Option-Based Model, Journal of Finance, 41 (4): 871-895.
I) Butsic, Robert P, 1999, Capital Allocation for Property Liability Insurers: A Catastrophe
Reinsurance Application. Casualty Actuarial Society Forum, Fall 1999.
2) Venter, Gary G., 1991, Premium Implications of Reinsurance Without Arbitrage, ASTIN Bulletin,
21 No. 2: 223-232.
3) Venter, Gary G., 1998, (Discusssion of ) Implementation of the PH-Transform in Ratemaking, [by
Shaun Wang; presented at the Fall, 1998 meeting of the Casualty Actuarial Society]
4) Wang, Shaun, 1998, Implementation of the PH-Transform in Ratemaking, [Presented at the Fall,
1998 meeting of the Casualty Actuarial Society].
1) Derrig, Richard A., 1994, Theoretical Considerations of the Effect of Federal Income Taxes on
Investment Income in Property-Liability Ratemaking, Journal of Risk and Insurance, 61: 691-709.
2) Meyers, Glenn G., "The Cost of Financing Catastrophe Insurance," Casualty Actuarial Society
Forum - Summer 1998, pp. 119 - 148.
3) Meyers, Glenn G., "Calculation of Risk Margin Levels for Loss Reserves," 1994 Casualty Loss
Reserve Seminar Transcript
4) Myers, S. C. and J. Read, 1998, "Line-by-Line Surplus Requirements for Insurance Companies,"
[Unpublished paper originally prepared for the Automobile Insurance Bureau of Massacbusetta.]
5) Robbin, Ira, The Underwriting Profit Provision, CAS Study Note, 1992
Allen, Franklin, J. David Cummins and Richard D. Phillips, 1998, "Financial Pricing of Insurance
in a Multiple Line Insurance Company", Journal of Risk and Insurance, 1998, volume 65, pp.
597-636.
569
2) Black, Fischer and Myron Scholes, 1973. The pricing of Options and Corporate Liabilities,
Journal of Political Economy, May-June, 81 : 637-659.
3) Butsic, Robert P., 1994, "Solvency Measurement for Property-Liability Risk-Based Capital
Applications", Journal of Risk and Insurance, 61: 656-690.
4) Cummins, L David, 1988, Risk-Based Premiums for Insurance Guaranty Funds, Journal of
Finance. September. 43:823-838
5) Derrig. Richard A,, 1989, Solvency Levels and Risk Loadings Appropriate for Fully Guaranteed
Property-Liability Insurance Contracts: A Financial View, Financial Models of Insurance
Solvency, J. D Cummins and R. A. Derrig eds., Kluwer Academic Publishers, Boston, 303-354.
6) Doherty, Nell A. and James R. Garven, 1986, Price Regulation in Property-Liability Insurance: A
Contingent-Claims Approach, Journal of Finance, December, 41:1031-1050.
7) Ronn, Ehun I, and Avinash K. Verma, 1986, Pricing Risk-Adjusled Deposit Insurance: An
Option-Based Model, Journal of Finance, 41 (4): 871-895,
t) CAS Valuation and Financial Analysis Committee, Subcommittee on Dynamic Financial Models,
"Dynamic Financial Models of ProperW/Casualty Insurers," CAS Forum, Fall 1995, pp, 93-127.
2) Correnti, S,; Sonlin, S M.; and Isaac, D.B., "Applying A DFA Mc~del to Improve Strategic
Business Decisions," CAS Forum, Summer 1998, pp 15-51.
3) D'Arcy, Stephen P,; Ciorvert,Richard W.; Herbers, Joseph A.; Hertinger. Thomas E; Lehmann,
Steven G.: and Miller, Michael, "Building a Public Access PC Based DFA Model," Casualty
Actuarial Society Forum, Summer 1997, Vol 2, pp.l-40
4) D'Arcy, S.P.; Gorvett, R W; Hettinger, T E ; and Wailing, R J., "Using the Public Access DFA
Model: A Case Study," CAS Forum, Summer 1998, pp. 53-118
5) Kirschner. G.S.; and Scheel, W.C., "Specifying the Functional ParametErs of a Corporate
Financial Model for Dynamic Financial Analysis," CAS Forum, Summer 1997, Volume 2, pp. 41-
88. Although the candidate should be familiar with the information and concepts presented in the
exhibits, no questions will be drawn directly from them.
Lowe, SP.; and Stanard, JN., "An Integrated Dynamic Financial Analysis and Decision Support
System for a Property Catastrophe Reinsurer," ASTIN Bulletin, Volume 27, Number 2, November
1997, pp 339-371.
Method 3 - Incorporate historic default histories b~" credit rating from public rating agencies_
1) Airman, Edward, "'Measuring Corporate bond Mortality and Performance", The Journal of
Finance. Sept, 1989, pp.909-922
570
Determining the Change in Mean Duration Due
to a Shift in the Hazard Rate Function
Daniel R. Corro
571
Abstract:
From a major worm event (such as a military action) to a seemingly minor detail (such
as the use o f a new plastic washer in a faucet design) change must be accounted for when
collecting, interpreting and analyzing data. Indeed the intervention itself may be the
focus o f the study. Theoretically, the best way to model some interventions, especially
time-dependent ones, is via the hazard function. On the other hand, it may be necessary
to translate into simpler concepts in order to answer practical questions. The average
duration, f o r example, may have well-understood relationships with costs, making it the
best choice for presenting the result.
For example, Shuan Wang [3] discusses deforming the hazard function by a constant
multiplicative factor--proportional hazard transform--as a way to price risk load, with
the mean playing the role o f the pure loss premium.
This paper investigates how a shift in the hazard rate impacts the mean. The primary
focus o f the discussion is the case o f bounded hazard rate functions o f finite support. A
formal framework is defined for that case and a practical calculation is described for
measuring the impact on the mean duration o f any deformation o f the hazard function.
The primary tool is the Cox Proportional Hazard model Several formal results are
derived and concrete illustrations o f the calculation are provided in an Appendix, using
the SAS implementation. The paper establishes that the method can be applied in a very
general context and, in particular, to deformations which are not globally proportional
shifts. Indeed the method demands no assumed form for either the survival distribution
or the deformation. The discussion begins with a case study that illustrates the
application o f these ideas to assess the cost impact o f a TPA referral program.
572
Introduction
Recall that the survival function, S(t), is just the probability of surviving to maturity time
t and that the hazard function, h(0, is the rate of failure at time t. We assume some
general familiarity with these concepts in this discussion--they are introduced formally
in Section II. While both functions equally well determine a model of survivorship, the
survival function is the more common and the hazard function the more arcane. Often
though, the best way to model a change in circumstances, especially a time-dependent
intervention, is via the hazard function. On the other hand, it may be necessary to
translate into simpler concepts in order to answer practical questions. The average
duration, for example, may have a well-understood relationship with costs which makes it
the best choice for presenting the result.
For example, Shuan Wang [3] discusses deforming the hazard function by a constant
multiplicative factor--proportional hazard transform--as a way to price risk load, with
the mean playing the role of the pure loss premium.
This paper investigates how a shift in the hazard rate impacts the mean. The primary
focus of the discussion is the case of bounded hazard rate functions of finite support. A
formal framework is defined for that ease and a practical calculation is described for
measuring the impact on the mean duration of any deformation of the hazard function.
The primary tool is the Cox Proportional Hazard model (see [1]). Several formal results
are derived and concrete illustrations of the calculation are provided in an Appendix,
using the Statistical Analysis System [SAS] implementation (c.f. [1]) of the Cox model.
The paper establishes that the method can be applied in a very general context and, in
particular, to deformations which are not globally proportional shifts. Indeed, the method
demands no assumed form for either the survival distribution or the deformation.
The paper begins with a case study that illustrates how these ideas were used to assess the
cost impact of a Third Party Administrator (TPA) referral program. While this paper has
a distinctly theoretical focus, the best way to explain the basic concepts is through a real
world example. Indeed, most of the ideas are a direct consequence of attempts to achieve
a better understanding of the case study outlined in Section I. The study illustrates that
for most practieal issues it is sufficient to determine the mean duration to failure via
numerical integration. For many purposes, there is little need to invoke the more esoteric
results developed in the subsequent sections. Still, the example illustrates the potential
value of building a survivorship model whose hazard structure is designed to
accommodate the issues under consideration. Among the technical results of the paper is
a description of just such a survivorship model. While the discussion of the case study is
largely self-contained for anyone generally familiar with the terminology of survivorship
models, the discussion does make an occasional reference to the notation and
observations developed in the subsequent sections.
Section II introduces the notation and formal set-up. The language shifts from rather
discursive to decidedly technical. Section III discusses some well known examples. The
remainder of the paper is devoted to several technical findings on how duration is
573
impacted by a hazard shift. Specifically, Section IV discusses the case of finite support
that is the case of primary interest. Section V considers how to combine ha7ards of finite
support into more complex models suited to empirical d~t~ and the kind of investigation
described in the case study.
574
Section I: A Case Study
Consider the following situation (while the data is based on a real world study, some
liberties are taken in this discussion; in particular, the thought process, as described,
follows hindsight more than foresight). The context is workers compensation (WC)
insurance. We are required to assess whether a third party claims administrator (TPA) is
saving money for two of its clients that have been selectively referring a portion of their
WC claims over to be managed by the TPA. These clients are both large multi-state
employers that are "self-insured" inasmuch as they do not purchase a WC insurance
policy. The medical bills and loss of wages benefits are the direct responsibility of the
employers and each has built internal systems to process their WC claims. The data
captured by these systems is designed for administering claims, however, rather than for
analytical use. As such, the data is comparatively crude relative to claim data of
insurance companies or TPA's. They do, however, capture the date and jurisdiction of
the injury, a summary of payments made to date, as well as if and when the claim is
settled. There are, however, no "case reserves" available nor are there sufficient details,
such as impairment rating or diagnosis, to adequately assess the severity of the claim.
Over the past few years, the employers have selectively farmed out the more complex
claims to the TPA. The TPA has its own claim data on the cases referred to it and there
is sufficient overlap to identify common claims within the TPA and the employer files.
Moreover, the TPA files are more like insurance carrier data files and contain
considerably more information, including the date of the referral, impairment ratings,
claimant demographics and other claim characteristics.
A major problem is referral selection bias. The selection process itself is not well defined,
even within an employer. Also, when the TPA first entered the picture, a greater
percentage of referred claims were older, outstanding cases. Simply comparing the
average cost per case of referred versus retained cases would not yield any meaningful
information. Indeed, the selection process refers claims that are more expensive. Not
only does this result in a higher severity for the referred cases, it renders the retained
cases less severe over time. In such a circumstance, no matter how successful the TPA is
in reducing costs, its mean cost per case will be comparatively high.
One fact that stood out for both employers is that the percentage of cases that closed
within one year had more than doubled since the TPA became involved. Also, the
referral rate shot up dramatically, suggesting that the TPA is, at some pragmatic level,
viewed as being effective. Of course, that could also be the effect of imposed cost
reductions on the staffthe employer is now willing to maintain for WC claims handling,
given the money spent on the TPA.
Another complicating issue is that the benefits that will be paid on some WC claims are
paid out over many years. Without any consistent reserves it is very problematic to fred
comparable data. The challenge here is to make an assessment using the currently
available payment data.
575
Without the presence of case reserves or enough claim characteristics to grade the
severity of the claims, conventional actuarial approaches do not work well. As noted, the
employer data, being collected largely for administrative purposes, did include the key
dates of injury and settlement. This, combined with what was noted in regard to claim
closure rates, suggested an approach based on survival analysis. In this context, a "life"
corresponds with a claim, beginning at the date of injury and "failing" at claim
settlement. Information on unsettled (open) claims is then "right censored". It was
hoped that the survival analysis models would enable us to deal with censored data, since
there were no ease reserves available for that purpose.
Merging the TPA data together with the employer claim data, we built a data set that
included an indicator of referral and, where so indicated, the date of referral. Other
covariates captured are:
A claim survivorship model was constructed from this data. As defined in later sections
in a formal way, the conceptual base of the model is a "hazard" function. The model
assumes that the various explanatory variables impact the hazard function as a
proportional shift, i.e., multiplication by a constant proportionality factor. Such
survivorship models are referred to as proportional hazard models. Referral to the TPA is
an exception in as much as it is captured as a so-called "time-dependent" intervention.
576
Instead of a constant value for the explanatory variable, the TPA referral indicator is
allowed to take on two values so as to be able to capture into the model the time frame o f
referral (=0 prior to referral, =1 afterward). The proportional adjustment factor
associated with TPA referral confirmed the expectation that referral was associated with a
greater hazard, i.e., shortened claim duration. While the effect on the hazard was
measured, the assignment demanded that it be translated into savings. In order to do that,
it was necessary to convert the result back into factors related to claim costs. Whence the
basic question of this paper: how to translate a change in hazard into a change in (mean)
duration.
The task is to assess the cost impact of the TPA program, but that is not clearly defined.
Due to the limited time frame of the data, the lack of case reserves or multiple loss
valuations, it was clear that the "ultimate" cost impact could not be assessed using the
available data, at least not directly. Also, ''ultimate cost impact" is a more complicated
notion than what the clients were after. We interpreted the task more simply: since we
had the actual payments made on TPA referred cases, what we needed to measure is
hypothetical: what would the payments on those claims have been without use of the
TPA?
There is a catch, however. Consider a simplified case: the "original" payout pattern is $1
per day for 100 days on all claims. Assume that the referral to the TPA results in a single
$100 payment on the first day. A little thought will convince the reader that at any point
in time, ignoring discounting and the prospect that the business fails, the TPA will appear
more costly. The comparison will not be fair unless it takes into account the unpaid
balance: no matter how simply you frame the issue, reserves cannot be completely
ignored.
The data included payment and duration, so there were ways available to translate a
change in mean duration to dollars. Our choice was to use the non-referred claims to
build a regression model in which the dependent variable is (log of) the benefits paid to
date. The explanatory variables would include available claim characteristics together
with the (log of) the payment duration. The characteristics (such as employer, accident
year, jurisdiction or nature of injury, as above, together with perhaps additional
covariates if available like age, wage, gender, part of body) are assumed independent of
TPA referral and their mean values over the TPA-referred claims are readily determined.
The only missing piece is the duration variable. Again, the question reduces to the topic
of this paper: determining the impact on the mean duration.
The Cox proportional hazard model is well suited to this context. The model was run on
pooled TPA-referred and non-referred data, with TPA-referral included among the
explanatory variables in the model. This captures TPA-referral as a deformation of the
hazard function and the methods of the paper can be applied to finish the job. Appendix 2
provides output that details the calculations.
The case study, however, illustrates an additional complexity. More precisely, the TPA-
referral was incorporated into the Cox proportional hazard model as a time-dependent
577
intervention (both the date of injury and date of referral being available). Also, as in the
paper, the deformation of the hazard function was modeled as a combination of
proportional shifts over three time intervals, as shown in the following table (refer to
Appendix 2, page 5 of the listing):
The pattern of the hazard ratios supports the TPA's contention that its early intervention
is more cost effective. Indeed, TPA intervention has its greatest, and most statistically
significant, impact during the first six months. Although not critical to this context, that
was an important finding of the study.
The difference in the values of the hazard ratios suggests that not only is it appropriate to
model TPA referral as a time-dependent intervention, it is also appropriate to mitigate the
global proportional hazard assumption by specializing to several time intervals. This is a
very direct approach to that issue; the technical discussion of the subsequent sections
follows that approach. An alternative way to mitigate that assumption--the one in fact
used in the study report--is to group the TPA intervention by the lag time to referral.
That formulation produces similar results and more directly supports the greater impact
of early intervention. Conceptually, it is easy to regard TPA-referral within a few days of
the injury as being an essentially different intervention than referral after several months.
The remainder of this discussion is somewhat more technical and makes reference to
some of the notation and results presented in the subsequent sections of the paper.
The SAS PHREG procedure is used not only to estimate the three proportional hazard
ratios ~0,. It optionally outputs paired values (t,S(t))ofa "baseline" survival function
S(t) at time t as well. We chose to determine a baseline survival function, S(t),
corresponding to the value of 0 for all eovariates in the model. In particular, it applies to
the ease of non-referral as defined by the vanishing of the TPA-referral indicator variable.
Observe that for the purpose of determining the baseline survival, only the non-time
dependent TPA-referral indicator is used, since the baseline option is not available in the
presence of time-dependent interventions.
This baseline survival function provides the expected duration distribution for the non-
referred claims at the formal value 0 for the other explanatory variables in the model.
Because referral is captured as a time dependent intervention, the deformation of the
hazard function is itself dependent on the lag time to referral of the individual claims.
Consequently, no single survival function of the form Sa(t) (see Section II) can suffice to
measure the impact on mean duration. This presents a somewhat more complicated
situation than that considered in this paper.
578
To deal with this, let x represent a TPA-referred claim and fl = fl(x) be the proportional
~azurd ratio associated to x by the model, which therefore includes the factor ~ = ¢~(x)
for the TPA-referral as a time-dependent intervention. Let D(x) represent the claim
duration function; recall that we seek a hypothetical alternative/)(x) which associates
what the duration would have been had x not been referred. Letting S,,S, denote,
respectively, the survival curves for x with and without referral, and a = a(x)the lag time
to TPA-referral, we have the following picture:
a D(x) b(x)
Since the baseline survival curve S(t)is known, this provides a way to determine
/~(x) for any TPA-referred claim x. The methods described in the paper can now be
invoked to estimate what the mean payment duration of those claims would have been
had they not been referred. Again, the details of the calculation can be found in Appendix
2. The following table summarizes the ffmdings in the ease study (pages 10 and 15 oftbe
listing):
579
Assumption Mean Duration
As Referred to TPA 0.737 years
(actual)
No Referral 0.826 years
(hypothetical)
Note that the application of the logic used to define D(x) v-~ D(x) becomes somewhat
problematic when crossing a boundary of the time intervals used to define the ~ . That is
another reason that, in the study, we chose to partition the TPA-intervention by layer of
referral lag a = a(x).
Finally, these mean duration figures can be plugged into the cost models and translated
into dollar savings attributable to TPA-referral. This case study is included to illustrate a
non-traditional application of survival analysis to an insurance problem, emphasizing the
power that manipulating the hazard function can bring to the analysis. The remainder of
the paper develops a formal context in which this can be done. The focus is on formal
relationships between the more "arcane" changes in haTard and the more "presentable"
effect on mean duration.
580
Section II: Basic Terminology and Notation
Let ~ + denote the set ofnonnegative real numbers. Let h(t) denote a function from
some subinterval Z _q ~+ to 91 ÷ . The set X is called the support. We assume
throughout that h(t) is (Lebesgue) integrable on Z and that 0 ¢ ~ is in the closure o f the
support. Any such h : ~ - - - ~ } ~ + can be viewed as a hazard rate function and survival
As is customary, we refer to S(t) as the survival function, f (t) as the probability density
function [PDF] and t as time. We also let T denote the random variable for the
distribution of survival times and g = Er(T ) the mean duration. When we axiom h(t)
with a subscript, superscript, etc., we make the convention that these associated functions
all follow suit. There are many well-known relationships and interpretations o f these
functions---refer to Allison[l ] for a particularly succinct discussion which also discusses
the SAS implementation of the Cox proportional hazard model.
af
-- 2
dh = dt + h 2; h is decreasing at t ¢~, df < _ f "
dt S dt S
We are concerned with what happens when h(t) is changed or "shifted" in some fashion.
This paper deals particularly with proportional shifts as the Cox model provides a viable
way to measure that type of shift (c.f.[2]). More precisely, we are interested in shills of
the form:
d/= d/(a,qo) for a,~p >- 0 where 6(h) = h,~ is defined as h~ (t) = ~ h(t) t <_a
t ~(t) t > a
581
The following are immediate consequences of this definition and our notational
conventions:
g(t) t < ct
sat) =
g(ct) + ¢(g(t) - g(a))
= (1 - ¢~)g(a) + q~g'(t) t > a
~S(t) t<a
S,(t) = ( S(a),_~ S(t), t > ct
If(t) t "; at
We are particularlyinterested in the effect that such a shift has on mean duration, which
is formally captured in the function:
Section III illustrates this notation in the case of two of the (infinite support) distributions
commonly used in survival analysis. However, we choose to deal exclusively with the
case of hazard functions with finite support in the remainder of the paper. Section IV
discusses the additional assumptions, notation and conventions applicable specifically to
fmite support haTnrd functions and presents some examples. Section V discusses
decomposing and combining finite support hazards and presents the main result: a
formula for calculating the effect on mean duration of a shift in the hazard rate function.
We also provide two appendices that detail the calculations referenced in the paper using
SAS and, in particular, illustrate how the SAS proportional baTards model procedure
(PHREG) can be used to do all the heavy lifting.
582
S e c t i o n III: F a m i l i a r E x a m p l e s
In this section we illustrate our notation with some distributions with infinite support
= (0,oo) which have found common application in survival analysis. The first three are
selected to present straightforward illustrations of the notation and concepts and for those
we only consider the case a = 0 (recall the identification d~(0,q0 = ~). We begin with
the simplest example of a hazard function:
oo co
A(O,~) = bt,, j ~t dt = . dt . 1
= J - - - hm - I
o ( l + t ) ~'+l o(l+t) ~ t - ~ ~o(1+ t) *-
in which the right hand side limits both exist for ¢ > 1 . For ¢ = ! the right hand side
diverges to + 0% whence/~¢ ~ ~t is infinite for ¢ ~ I. This illustrates that a proportional
increase in the hazard function can reduce an infinite mean duration to a finite number
and, conversely, that a proportional decrease can make a finite mean duration become
infinite.
The next example describes one of the most popular survival distributions, often defined
via its PDF:
Example Ill.4. Weibulll density with parameters a,b > O. In this example, define
583
f(a,b,'O = a m " b - l e -at'
then (see, e.g. [2] Hogg-Klugman, pp. 231-232)
ba ~
This distribution conforms to a proportional hazard model, indeed:
f ~ ( a , b ; t ) = f ( f a a , b;t),
b(~) ~ ~,~
t
Letting F(u)F(a; t) = Is a-le-'ds define the incomplete gamma function (as in [2], p.
0
217), we leave to the reader the verification that for the Weibull density:
Example III.5. Pareto density with parameters a,b > 0. In this example, define
f ( a , b; t ) = a b a ( b + t) -a-'
then (see, e.g. [21, pp. 222-223)
584
A(a,b;ct, qO =/.~6~,.~)
-a-1 kb--g-gJCg:-i--~J~b-7-d~J~,~-I-U
The last two examples are suggestive of the common approach to performing calculations
in survival analysis: fu'st, we select a form for the distribution, then we fit parameters to
the data. Finally, we calculate whatever statistics are needed using formulas specific to
that distribution (e.g. as found in [2]). This paper suggests the expediency of a simpler
more empirical approach to calculating/~s that avoids making any assumptions as to the
form of the distribution as well as any parameter estimation. Also, we can use the
method with time-dependent interventions and it is especially easy to do in practice.
585
Section IV: Hazard with Finite Support
Most survival analysis discussions use distributions whose natural support is the set of
positive real numbers, as in the previous section. The impetus for this work came from
insurance, particularly claims analysis. Although actuaries customarily employ the usual
collection of survival distributions-with their infinite supports-in practical applications
claim duration is subject to limits. Moreover, the specific structure of the very far "tail" is
either intrinsically unknowable, irrelevant, or both. Accordingly, this study focuses on
the situation in which the data is limited to a finite time interval.
As described in the case study section, the insurance problem prompting this
investigation arose in the line of workers compensation insurance. A very small
percentage of those claims involve pension benefits that can continue for decades. Even
the best insurance data bases, however, rarely track a coherent set of losses for more than
10 annual evaluations. That study concerned the implementation of a new program and
the available data consisted of a one snapshot evaluation of claims captured into various
automated systems. The data typically went back only four years and even the most
matured cohort included a high percentage of open ("right censored") cases.
In this section we introduce the assumptions and notation for our case of interest: support
Z = (0,1]. We make the assumption that h(t) is piecewise continuous. Observe that g(t)
andS(t) are both continuous on [0,1], the former nondecreasing and the latter
nonincreasing. Let p = S(I), 0 < p < 1. The distribution T has a point mass of
probability p at { 1}. We will make extensive use of the following:
1 1
E ( T " ) = nJt"- S(t)dt.
0
Proof" The proof is really just the integration by parts the diligent reader would have
done a few times already in the previous section:
Letting a2 denote the variance ofT, the following two corollaries are apparent:
586
Corollary IV.l:
I
i) 14 = IS(t)dt
O
I
Corollary IV.2:
a I
decreases the mean duration, i.e., that A = P6 is a decreasing function o f q~. A bit
more thought should convince the reader that A = g6 is an increasing function of a for
> 1 and decreasing for q~ < 1. Since g(t) is increasing, the following result f o r m a l i z ~
this:
Proposition IV.2:
OA I
i) -~a = ( ~ - l ) f ( a ) S ( a ) - $ ~S(t)~ dt
Proof: i) From Corollary IV.2, the fundamental theorem of calculus and the product rule
for differentiation:
8A 0 ~ i a ~,
oa a Lo 1 J
587
$6 (t) = f s ( t ) ~ .. t < ol
[ S ( a ) -* S ( t ) " t >a
0 t~a
OS
m ~
)
- s(t)~ (l~(S(a))s(,~) '-~') t ~
Noting that our assumptions enable us to differentiate under the integral, and recalling
that g(t) = - l n ( S ( t ) ) , we find that:
The graph ofA = 126 is a tent with a single "pole" of unit height at the origin, a front wall
of infinite length and constant height 12 and a back wall of decreasing height:
588
The following refines this:
I
I
u =g(t) du =h(t)dt; v= ~S(w)dw dv=S(t)dt
o
I t ~1 I t
~g(t)S(t)dt = g(t) ~oS(W)dWJo - ~h(t) ~S(w)dwdt
o o o
l I S(w)
<-In(p)/d-~f(t)Sldwdt as >l f o r w < t
0 0 S(t)
1
= -In(p)/t - Sf(t)tdt = - I n ( p ) ~ - ( / t - p).
0
While the effect o f an increase (decrease) o f the hazard function clearly has the opposite
affect on the m e a n duration, the effect on the variance is unclear. Indeed, the reader can
use Corollary IV.1 to verify that:
Before we discuss some examples, we note the following integration formula, in which
589
I
F(a)F(b)
B ( a , b) = Ix°-I (1 - x ) b-I dx = V a, b > 0 , the usual beta and g a m m a
0 F ( a + b)
functions.
~t'(I -tb)~ dt =
0 b
1
~ta(l _ t b )c dt = 1 it o-b+, (1 - t~)Cbt (b-') dt
0 O0
1 '( ' ho-b÷t
=~tx~; (l-xyax
1 ' ---1
as claimed.
W e next present some examples. The first, while especially simple, will play a major role
in later findings.
Example IV.1. Constant hazard function, let h(t) m 1, 0 < t < 1. Then, as in Example III.1,
we have:
h~(t)=¢; gg(t)=q~; S,(t)=e-~; f~,(t)=q~e - ~
and we observe that
1
Pv = S~,(1) = e -~' = p~', where p = Pl = -
e
More generally, for 0 < a < 1, we find:
590
{~ t e [O,a]
h 6 (t) = t e [at,l]
t t a [O,a]
ga(t)= q~(t-a)+a te[a,I]
Sa(t)= { e -t t ~[O,a]
p a e-~,(t-a) t e [a,1]
1 ct I pa+Pt~(l_p~O(,_a))
A(ct, q~) = laa = ~Sa(t)dt = ~e-t dt + p a Ie-~'(t-a) dt = 1-
0 0 a
1 -p~
In particular, we find that A(0,~) = ~9 = . We will make considerable use o f this
~a
example in later sections where we deal with combining hazards and show h o w to use the
Cox Proportional Hazard model to approximate any hazard o f finite support by a step
function.
For the special case p = 0, the formulas simplify considerably and we have:
591
i
/~" q~+l and S,(t)=(i-t) P
1~,2 + c r 2 = 2 i t ( l _ t ) g d t = 2F(2)F(q~
I)F(+~ 3)+ =
0
(~ + 2)(~ + 1)
2= ~P
::~ O', (q~+ l)2(~p + 2 )
In this case,
tr =--; ~ =trc:~tp~ 1,
12
The next example is a simple way to define a new hazard function from an old one.
Example IV.3. Reversed hazard function, let h(t) be any hazard function o f finite support
and define h(t) = h(1 - t ) , then clearly
(/~)~(t) = t p h ( 1 - t ) = (~')(t) for every (p > 0
which shows that the reverse o f a proportional hazard model is also one. Clearly, the
reverse o f an increasing (decreasing) hazard function is decreasing (increasing) and
= h . Letting u = 1 - t , we find that
I I-t
p = p; g(t) p . ~f dt
The reverse o f Example IV.I is, o f course, again Example IV.I. The reverse o f Example
The next example is another simple way to define a new hazard function from an old one.
Example IV.4. Complement hazard function, let h(t) be any hazard function o f finite
support such that f ( t ) < !, 0 < t < 1, and define J-(t) = ! - f ( t ) , then
592
t !
/~(t) = I - I)7"(s)ds = 1 -,~(1 - f ( s ) ) d s = 1 - t + (l - S ( t ) ) : 2 - t - S ( t )
0 0
b = 1 - p; i,(t) = l -/(t)
2 - t - S(t)
.?(t)
Area = p ]
Area = I-p
Then, clearly, f ( t ) > 0, when 0 < t < ! and the above lemma implies that
I
Sf(a,b,c,p;t)dt = 1- p
0
593
f(a'b'c'p;t)=B(~_,c+l),.o
a+l ~ ' - j ~kf
When c is an integer, the reduction formula for the Gamma function gives:
c! b'" c!
l~I(a~+ bJ + 1~ (-Ia+bj+ 1
j-o~, b ) .o
b(l - p ) [ I (a + bj + 1)
f(a,b,c,p;t)= j--o ~ (:~
• k---O
( 1 - P ) l,=o
- I ( a + bJ + l) ~ ( l)kt °+"
¢ c
j=k
594
h(t)
a=b=l,c=4,p=l/~~
0 t
The final example is a slight variation of the previous one.
Example IV.6: Assume a,c> -1, b > 0, 0 < p < i and define
f(a,b,c,p;t)= b(l - p ) ( l - t°(l - t~)")
Againf(t) >_0,when 0 _<t < 1 and the above lemma implies that
I
h(O
711//~b=l'c=l/g'2~
0 t
In the event that a particular shape of the hazard function is required, the last two
examples provide candidates for parameter estimation. The following section argues
that, for most purposes, a simple step function is preferable, from both the conceptual and
computational perspectives.
This section concludes with two results. The first is one more observation on the
difference/z -/.t~. The second revisits how for a finite haTard the survival function is a
convenient device for computing moments, in this case relating it with the moment
generating function.
595
In Example IV.2, note that
p+l p¢+l - 1
1 l p [ 1-p22 +p¢+lq~+l-l]j
p-/.t¢ = 2 (~o+ l)(p- I)
Proof'.
I a I
= fS(t)- p.~-~S(t)*dt
ct
It follows, therefore, by continuity and the preceding equation, that /.t = P6 would forge
Now if p = p~, then the right hand side is clearly 0 and the result holds. So consider the
ease /z = 1.t6, p~ < p. We then have both:
596
which clearly forces ¢~= 1. The result again follows since ¢~ = 1 makes the right hand
side O.
The upshot is that we may now assume that/z * #6- Becausefis continuous and does not
change sign, the generalized intermediate value theorem for integrals ~ ~ ( ~ (a,1) such
that
I I
f(O~S(t)-p~'-*S(t)'dt= f(S(t)-p.t-oS(t>'~(t)dt, f(()>O
a a
Noting that dS = - f (t)dt; t = a ~ S(t) = p~; t = I c~ S(t) = p . With the change of
variable we have:
f(O(/.t-/.t6) = i(S(t)-pal-C'S(t)v~(t)dt=P](s-pal-'S'~S
a p
S2 I-e) S q~-I ]P° Pa 2 patp+l p2 pq)+l
=--j--Pa - ~+----"~Jp = 2 -pal-qs qg+l 2 +pal-~° q)+l
= pa 2 '-(I "~
2
+ \P,~ )
~+1
I
Proposition IV.4. M r (x) = 1 + x Se " S ( t ) d t
o
Proof: By definition:
I
M r ( x ) = E(e ~r ) = [.e~t f (t)dt + pe ~
0
597
1
M r (x) - pe ~ = le'~f(t)dt
0
t ® [xt~ k
--
*° x k l k
= ~ --if, I t f ( t ) d t
k:O K: 0
i qoxk 1 k l
= If(oat + zX~kft - S(t)dt - p]
o ~.~ k! L o J
k-I fl ..~]
= i - p + x ~Itk-~S(t)ctt -
km, K - ~Lo
0o r I x k - l t k-I "1
:,-,,+,,z/jI=IL.0 qt#l--1)~ j
f.-""
k=l k!
.,, r1 k k "1 = Xk
= l + = x/I x-t- S(t)dtl-
,:oLo k~ j s'~oT.'
rl ® (xt ~k "]
= ] + =/I]E ~'/--LZ--S(t)dtl - p e ~
Lo*:o k! j
i
= 1 + x[.e~'S(t)dt - pe=.
0
598
Section V: Combining Finite Support Hazard Functions
We continue with the notation and assumptions o f the previous section. Consider first
the case o f two hazard functions, h1(t) and/11 (t) . If these represent independent causes
o f failure, then their sum h~ + h2 provides the corresponding hazard function. In this
ease, we clearly have:
1
g=gl +g2; S=StS2; f=Stf2 +flS2; /t= ~Sl(t)S2(t)dt,
0
and we can readily generalize this to the case o f compounding together any finite number
o f hazards.
Consider the case o f adding a constant hazard, i.e., the case h2(t ) ---a > 0. While this will
clearly decrease the mean duration to failure, the issue is by how much. From Example
IV.l, we have S2(t ) = e - " , and from Proposition IV.4 we find:
1 t 1 - Mr, ( - a )
Mr, (-a) = 1 - a fe-°'S,(t)dt = l - a f S , ( t ) S 2 ( t ) d t =1-a/.t ::~ /.t
o" 0~ a
While adding hazards is formally very simple, this suggests that the effect o f the mean
duration can become complicated in even the simplest contexts. Moreover, the more
useful and challenging task would be to reverse this process: to decompose a compound
hazard into mutually independent hazards. Fortunately, our needs are much less
demanding.
In this section we detail a very simple and straightforward way to combine hazard
functions. This provides the framework needed to exploit the Cox Proportional hazard
model to approximate hazard functions with step functions. The approach also fits in
well within the context o f time-dependent interventions.
Begin with a finite support hazard function h(t) and let {0 = a 0 < a~ <... < a , = 1}be a
partition o f [0,1] into n subintervals. We can readily decomposeh(t) into n finite support
hazard functions:
h,(t)=h(a,_, +t(ct~-a,_,)) O<t <_l,i=l,2,...,n
Fortunately, this process is readily reversed, i.e. given an ordered set o f n finite support
hazard functions {h,(t), i = 1,2,...,n} together with a partition {0 = a 0 < a m< ... < a , = I}
o f [0,1 ] into n subintervals, we define their gauntlet hazard function on [0,1 ] by
h(t) = {hl,h 2 ..... hn; 0 = cr0 < a L < ... < a n = l}(t) = h~ where a~_ 1 < t < ot~
We observe that when theh~, i = 1,2,...,n are all constant hazard functions
(h~(t) m q~ = h,,, i = 1,2,...,n from Example IV.l) their gauntlet hazard function is a step
599
function. Conversely, any hazard step function is the gauntlet of constant hazard
functions in an essentially unique way.
"
I n a, n i-I , _tZi_ I |
la = lS(t)dt = E IS(t) dt = Z I-I Pk I Sil--! dt
0 i=la,, i=lk=l a_, ~,O~i - a i - I }
n i-I a,--a, ~ I
= Z l-IPk (a, -ai_l)lSi(u) a'-a'-' du
i=l k = l 0
n i-I a~-a,_,
= Z l-I
i=1 k=l
pk -
i
Case 1: Assume the partition is uniform, that is, at = - then the formula becomes:
r/
i
.---:E[I-Ip, ,),
Case 2: Assume the hazard is constant on all the intervals (step function). Then by
Example IV. 1,
600
1 - e ¢'{a'-''-a')
h,-= ~, = ( . , ) o _ .... = ~,(~,-a,_,)'
and the formula becomes
-~
" e ;~"......... ( 1 - e"f
~. ...... "~
)/
Finally, when both apply, in the case of a step function with uniform partition, the
formula simplifies to:
I t-s ( _ ¢~._L
# = ~-'~e 1-e
i=1 ~i
In the example below, we consider how to make use of this, given a set of empirical
observations. The formulas suggest that it may prove useful to approximate the hazard
function by a step function. In that regard, notice that the natural choice for q~, ~ h, (t) is
the average value of the hazard function over the ith interval. This, in turn, is readily
determined from the survival function:
1 '~'
f h ( , ., u, j, . . g. ( c. t l ). - g ( o 6 _,) _ l n ( S ( a , _ , ) ) - l n ( S ( ~ , ) )
J
O/i -- ~i-I a~_I ~ i -- a i - I ~ t -- O~i-I
We conclude with a simple example that illustrates how, despite the awkwardness of the
formulas, the calculations can be quite simple in practice.
Example V. 1 Let ho (t) -~1, for 0 < t < 1 be the constant unit hazard function and
601
h(t) = (ho),(t) =
flI 0 ~ t <-~
2_<t<
SAS was used to simulate two survival data sets One and Two, conforming to the hazard
functions h 0 and h , respectively. The PDFs are readily determined from earlier
examples and were used to perform the simulations (refer to Appendix l for details). A
survival function was produced from Two. An excerpt of the output is provided below
(page l0 of the listing),
t s(t) g(t)
0 1 0
0.71665 0.33317,/1~3
~3 0.36770 1.00048~-1
1 0.26359 1.33335~~33
The estimation of the hazard function h(t) from the survival function is:
t ~ [0,/], h(t).~ g(l//33)-g(O) /-0
/-0 -- / --I
t ~[/,2~3 ], h(t).~
'4
/ =2
4
----1
,~ tN,ii, h(,) ~ ~ - : i
/
The simple average of an upper and a lower Riemann sum of the survival function over
[0,1 ] (equivalent to the trapezoidal rules since the survival function is monotonically
decreasing) was used to estimate the mean duration to failure to be 0.56193 (page 16 of
the listing):
I~ = iS(t)dt ~ 0.56193
o
Compare this with the value determined using the above formula:
602
¢~ = ~3 = I, ~ = 2 :
= 0.562077
Finally, data set Two observations were flagged and pooled with set One survival data.
The SAS PHREG procedure was then run on the combined data set with the flagged data
modeled as a time-dependent intervention applicable to the middle interval. The PHREG
procedure produced a hazard ratio of 2.000 (page 4 of the listing) for that intervention,
illustrating how the Cox proportional hazard model can be used to approximate a hazard
function by a step function. By the same token, it illustrates how that procedure may
provide the means to unpack this process. More precisely, the procedure results may
reveal a change in hazard as (approximated by) a combination of shifts like the ones
1 2 2 1
considered here: 8 = 8 ( 3 ' ) o 8 ( ~ - , ~-) . From that, the results of this paper
can be used to translate this into the effect on the mean time to failure.
603
References:
[1] Allison, Paul D., Survival Analysis Using the SAS~ System." A Practical Guide,
The SAS Institute, Inc., 1995.
[2] Hogg, Robert V. and Klugrnan, Stuart A., Loss Distributions, John Wiley & Sons,
1984.
604
APPENDIX 1
I~SLOG I
****************************** ;
10
11 DATA ONE;SET ZERO;
12 KEEP T CLOSED SHOCK;
13 RETAIN COUNT;
14 IF N - I;
15 C L O S E D - I;
16 SHOCK - 0;
17 C O U N T - 0;
18 DO I - I TO 1000;
19 T - 1/1000;
20 DO J - 1 TO ROUND(50*EXP(-T),I);
21 COUNT ÷ I;0UTPUT;END;END;
22 T - 1;P = EXP{-1);
23 C L O S E D = 0;
24 DO J - 1 TO (P/(1-P))*COUNT;
25 OUTPUT;END;
NOTE: The data set WORK.ONE has 49980 observations and 3 variables.
NOTE: The data set WORK.TWO has 49956 observations and 3 variables.
605
49 D A T A T H R E E ; S E T O N E TWO;
SO TITLE 'PHREG PAPER:TEST';
606
MPRINT(MEANDUR): PROC P H R E G S I M P L E DATA-ZDATA;
MPRINT(MEANDUR): MODEL T'CLOSED(0)- / C O R R B COVB;
MPRINT(MEANDUR): BASELINE OUT=BASE SURVIVAL-S;
NOTE: There are no explanatory variables in t h e M O D E L s t a t e m e n t .
NOTE: The data set WORE.BASE has 1001 observations and 2 variables.
N O T E : T h e P R O C E D U R E P H R E G p r i n t e d p a g e 5.
MPRINT(MEANDUR): DATA BASE;
MPRINT(MEANDUR) : SET BASE END - EOF;
MPRINT(MEANDUR) : IF N = 1 T H E N DO;
M P R I N T ( M E A N D U R ) : T = 0;
M P R I N T ( M E A N D U R ) : S = I;
MPRINT (MEANDUR) : OUTPUT;
M P R I N T ( M E A N D U R ) : END;
MPRINT(MEANDUR) : IF T < 1 T H E N O U T P U T ;
MPRINT(M~UR) : IF E O F O R T >= 1 T H E N DO;
M P R I N T ( M E A N D U R ) : T = 1;
MPRINT (MEANDUR) : OUTPUT;
M P R I N T ( M E A N D U R ) : END;
NOTE: The data set WORK.BASE has 1002 observations and 2 variables.
607
NOTI~: T h e d a t a s e t WORK.MEAN has 1 observations and 3 variables.
MPRINT(MEANDUR): PROC PRINT DATA - MEAN;
NOTE: The data set WORK.ZDATA has 49956 observations and 3 variables.
MPRINT(MEANDUR}: PROC PHREG SIMPLE DATA-ZDATA;
MPRINT(MEANDUR): MODEL T'CLOSED(0)- /CORRB COVB;
MPRINT(MEANDUR}: BASELINE OUT-BASE SURVIVAL-S;
NOTE: There are no explanatory variables in the MODEL statement.
NOTE: The data set WORK.BASE has 1001 observations and 2 variables.
NOTE: The PROCEDURE P H R E G p r i n t e d p a g e 9.
MPRINT(MEANDUR): DATA B~E;
MPRINT(MEANDUR} : S E T B A S E E N D ffi EOF;
MPRINT(MEANDUR) : IF N - 1 T H E N DO;
MPRINT(MEANDUR) : T ffi 0;
MPRINT(MEANDUR) : S ~ 1;
M P R I N T (M E A N D U R ) : OUTPUT;
MPRINT (MEANDUR) : END;
MPRINT(MEANDLTR) : IF T < 1 THEN OUTPUT;
MPRINT(MEANDUR) : IF E O F O R T >ffi 1 T H E N DO;
MPRINT(MEANDUR) : T - 1;
MPRINT (MEANDUR) : OUTPUT;
MPRINT (MEANDUR) : END;
NOTE: The data set WORK.BASE has 1002 observations and 2 vari3bles.
MPRINT(MEANDUR): PROC SORT N O D U P D A T A ffi B A S E ;
MPRINT(MEANDUR): B Y T;
608
MPRINT(MEANDUR): IF D > 0 T H E N DO;
MPRINT(MF2%NDUR): UPPER + D*OLD_S;
MPRINT(MEANDUR): LOWER + D-S;
MPRINT(M]F~qNDL*R}: END;
MPRINT(MEANDUR): O L D _ T - T;
MPRINT(MEANDUR): O L D S - S;
MPRINT(MF-~%NDUR): IF E O F T H E N DO;
MPRINT (MEANDUR) : MEAN - (UPPER + LOWER)/2;
MPRINT {MEANDUR} : OUTPUT;
MPRINT (MEANDUR) : END;
NOTE: The data set WORK.ZDATA has 99936 observations and 3 variables.
NOTE: The data set WORK.BASE has 1002 observations and 2 variables.
609
NOTE: The data set WORK.SUBBASE has 60 o b s e r v a t i o n s and 3 variables.
610
LIBTING
Percent
Total Event Censored Censored
Total Sample
Standard
Variable N Mean Deviation Minimum Maximum
611
PHREG PAPER:TEST page 2
The PHREG P r o c e d u r e
Without With
Criterion Covariates Covariates Model C h i - S q u a r e
SHOCK
SHOCK 0.0000591043
SHOCK
SHOCK 1.000000000
612
PHREG PAPER:TEST page 3
Percent
Total Event Censored Censored
Total Sample
Standard
Variable N Mean Deviation Minimum Maximum
613
PHREG PAPER:TEST page 4
Without with
Criterion Covariates Covariates Model Chi-Souare
Estimated C o v a r i a n c e Matrix
TSHOCK
TSHOCK 0.0001563092
E s t i m a t e d C o r r e l a t i o n Matrix
TSHOCK
TSHOCK 1.000000000
Data Set: W O R K . Z D A T A
Dependent Variable: T
Censoring Variable: CLOSED
Censoring Value(s): 0
Ties Handling: BRESLOW
S u m m a r y of the Number of
Event and Censored Values
Percent
Total Event Censored Censored
-2 LOG L - 657271.7
614
PHREG PAPER: TEST BASE ONE pages 6-7
oBs T S G
1 0.000 1.00000 0.00000
2 0.001 0.99900 0.00100
3 0.002 0.99800 0.00200
4 0.003 0.99700 0.00301
5 0.004 0.99600 0.00401
6 0.005 0.99500 0.00501
7 0.006 0.99600 0.00602
0 0.007 0.59300 0.00703
9 0.008 0.99200 0.00804
I0 0.009 0.99100 0.00904
11 0.324 0.72327 0.32397
12 0.325 0.72255 0.32697
13 0.326 0.72183 0.32597
14 0.327 0.72111 0.32697
15 0.328 0.72039 0.32797
16 0,329 0.71967 0.32897
17 0.330 0.71895 0.32997
18 0.331 0.71823 0.33097
19 0.332 0.71751 0.33197
20 0.333 0.71679 0.33298
21 0.334 0.71607 0.33398
22 0,335 0.71535 0.33499
23 0,336 0,71463 0.33600
24 0.337 0.71391 0.33700
25 0.338 0,71319 0.33801
26 0.339 0.71247 0.33902
27 0.340 0.71174 0.34004
28 0.341 0.71102 0.34105
28 0.342 0.71030 0.34206
30 0.363 0.70960 0.34305
31 0.657 0.51845 0.65692
32 0.658 0.51793 0.65792
33 0.659 0.51761 0.65893
34 0.660 0.51689 0.65993
35 0.661 0.51637 0.66094
36 0.662 0.51585 0.66195
37 0.663 0.51533 0.66256
38 0.664 0.51481 0.66397
39 0.665 0.51429 0.66498
40 0.665 0.51377 0.66599
41 0.667 0.51325 0.66700
42 0.668 0.51273 0.66802
63 0.669 0.51220 0.66903
44 0.670 0.51168 0.67005
45 0.671 0.51116 0.67106
46 0.672 0.51064 0.67208
47 0.673 0.51012 0.67310
60 0.674 0.50962 0.67408
69 0.675 0.50912 0.67505
50 0.676 0.50862 0.67605
51 0.991 0.37117 0.99110
52 0.992 0.37079 0.99212
53 0.993 0.37041 0.99315
54 0.994 0.37003 0.59410
55 0.995 0.36967 0.99515
56 0.996 0.36931 0.99612
57 0.997 0.36895 0.99710
58 0.998 0.36859 0.99808
59 0.999 0.36823 0.95905
60 1.000 0.36787 1.00003
615
PHREG PAPER: TEST BASE ONE page 8
D a t a Set: W O R K . Z D A T A
Dependent Variable: T
Censoring Variable: CLOSED
Censoring Value(s): 0
Ties Handling: BRESLOW
S u m m a r y of the N u m b e r of
E v e n t and C e n s o r e d V a l u e s
Percent
Total Event Censored Censored
-2 L O G L - 7 5 7 6 0 4 . 6
616
PHREG PAPER : TEST BASE TWO pages 10-11
OBS T S G
1 0.000 1.00000 0.00000
2 0.001 0.99900 0.00100
3 0.002 0.99800 0.00200
4 0.003 0.99700 0.00301
5 0.004 0.99600 0.00401
6 0.005 0.99500 0.00502
7 0.006 0.99399 0.00602
8 0.007 0.99299 0.00703
9 0,008 0.99199 0.00804
10 0.009 0.99099 0.00905
11 0.324 0.72314 0.32416
12 0.32S 0.72242 0.32515
13 0.326 0,72170 0.32615
14 0.327 0.72097 0,32715
15 0.328 0.72025 0.32815
16 0.329 0.71953 0.32915
17 0.330 0.71881 0.33015
18 0.331 0.71809 0.33116
19 0.332 0.71737 0.33216
2O 0.333 0,71665 0.33317
21 0.334 0.71521 0.33518
22 0.335 0,71379 0.33717
23 0.336 0.71237 0.33916
24 0.337 0.71095 0.34116
25 0.338 0.70952 0.34316
26 0.339 0.70810 0.34517
27 0.340 0.70668 0.34717
28 0.341 0.70526 0.34919
29 0.342 0.70386 0.35110
3O 0.343 0.70246 0.35317
31 0.657 0.37473 0.98155
32 0.658 0.37399 0.98353
33 0.659 0.37325 0,98551
34 0.660 0.37251 0.98750
35 0.661 0.37177 0.98949
36 0.662 0.37103 0.99148
37 0.663 0.37029 0.99348
38 0.664 0.36955 0.99540
39 0.665 0.36880 0.99749
40 0.666 0.36806 0.99950
41 0.667 0.36770 1.00048
42 0.668 0.36734 1.00146
43 0.669 0,36698 1,00244
44 0.670 0.36662 1.00342
45 0.671 0.36626 1.00441
46 0.672 0.36690 1.00539
47 0.673 0,36554 1.00637
40 0,674 0.36518 1.00736
49 0.675 0.36482 1.00835
50 0.676 0.36446 1.00934
51 0.991 0.26593 1.32451
52 0.992 0.26567 1.32549
53 0.993 0.26541 1.32647
64 0.994 0.26515 1.32765
55 0.995 0.26489 1.32043
56 0.996 0.26463 1.32941
97 0.997 0.26437 1.33040
58 0.998 0.26411 1.33138
59 0.999 0.26385 1.33237
60 1.000 0.26359 1.33335
617
PHREG PAPER: TEST BASETWO page 12
The P H R E G Procedure
D a t a Set: W O R K . Z D A T A
Dependent Variable: T
Censoring Variable: CLOSED
Censoring Value(s): 0
Ties H a n d l i n g : B R E S L O W
S u m m a r y of the N u m b e r of
Event a n d C e n s o r e d V a l u e s
Percent
Total Event Censored Censored
-2 L O G L - 1510536
618
PI~REG PAPER: TEST BASE THREE pages 14-15
OBS T 8 G
1 0.000 1.00000 0.00000
2 0.001 0.99900 0.00100
3 0 . 002 0. 9 9 8 0 0 0. 00200
4 0.003 0.99700 0.00301
5 0.004 0.99600 0.00401
6 0.00S 0.99500 0.00502
7 0 . 006 0. 99400 0. 00602
8 0.007 0.99300 0.00703
9 0.008 0.99199 0.00804
10 0.009 0.99099 0.00905
11 0,324 0.72320 0.32407
12 0 . 325 0. 72248 0. 32506
13 0.326 0.72176 0.32606
14 0.327 0.72106 0.32706
15 0.328 0.72032 0.32806
16 0.329 0.71960 0.32906
17 0.330 0.71888 0.33006
18 0.331 0.71816 0.33106
19 0.332 0,71746 0.33207
20 0.333 0.71672 0.33307
21 0.334 0.71564 0.33450
22 0.335 0.71457 0.33608
23 0,336 0 .71350 0 . 33758
24 0.337 0.71243 0.33908
25 0.338 0.71136 0.34058
26 0.339 0.71028 0.34209
27 0.340 0.70921 0.34380
28 0.341 0.70814 0,34511
29 0.342 0,70708 0.34661
30 0.343 0.70603 0.34809
31 0.657 0.44661 0.80608
3~ 0.658 0.44598 0.00749
33 0.659 0.64535 0.80891
34 0.660 0.84471 0.81032
35 0.661 0.44400 0.81174
36 0.662 0.44345 0.81316
37 0.663 0.44202 0.81458
30 0.664 0.44219 0,81601
39 0.665 0.44156 0.81744
40 0.666 0.44093 0.81806
41 0.667 0.64049 0.81986
42 0.668 0.44005 0.82086
43 0.669 0.43961 0.82186
44 0.670 0~43917 0.82287
45 0.671 0.43873 0.82387
46 0.672 0,43829 0.82687
47 0.673 0.43785 0.82588
48 0.674 0.43742 0.82886
89 0,675 0.43699 0.02705
60 0.676 0.43686 0.02883
81 0.991 0.31856 1.14393
52 0.992 0.31824 1.14494
53 0.993 0.31792 1.14594
56 0,994 0.31760 1.14695
55 0.995 0.31729 1.14793
56 0.996 0.31698 1.14891
S7 0.997 0.31667 1.14909
$8 0,998 0.31636 1.15087
$9 0.999 0.31605 1.15185
60 1.000 0.31574 1.15283
619
APPENDIX 2
8ASLOG,
350 ******************************************************************
351 ***BEGIN CODE FOR CASE STUDY **CT********************************
352 %MACRO VLIST;
353 EMPL2
354 AY93-AY94
355 MF01 EC01
356 NOI SPR NOI CUT
357 %MEND VLIST;
N O T E : T h e d a t a set W O R K . I N R I S K h a s 1 o b s e r v a t i o n s a n d 8 v a r i a b l e s .
368 t
369 PROC PHREG SIMPLE DATA=ONE;
370 M O D E L T ' C L O S E D 0 1 ( 0 ) - % V L I S T TPA;
MPRINT(VLIST): EMPL2 AY93-AY94 MF01 EC01 NOI SPR NOI CUT
371 BASELINE COVARIATES=INRISK OUT:EASE SURVIVAL=S / NOMEAN;
N O T E : T h e d a t a s e t W O R K . B A S E h a s 958 o b s e r v a t i o n s a n d 10 v a r i a b l e s .
N O T E : T h e P R O C E D U R E P H R E G p r i n t e d p a g e s 1-2.
NOTE: T h e d a t a s e t W O R K . B A S E h a s 957 o b s e r v a t i o n s a n d 2 v a r i a b l e s .
374 D A T A B A S E ; S E T B A S E E N D = EOF;
375 IF N - 1 T H E N D O ; T = 0;S = I ; O U T P U T ; F ~ ;
376 IF T < 1 THEN OUTPUT;
377 IF E O F O R T >= 1 T H E N D O ; T w 1 ; O U T P U T ; E N D ;
620
380 DATA BASE;SET BASE END-EOF;
381 A R R A Y MATT(I) T I - T I 0 0 0 ;
382 ARRAY MATS(I},SI-S1000;
383 KEEP TI-TI000 SI-S1000;
384 RETAIN TI-TI000 SI-SI000;
385 I = MIN(_N_,1000);
386 M A T T m T;
387 M A T S = S;
388 IF E O F T H E N DO;
389 D O I = N + 1 T O I000;
390 M A T T = I;
391 M A T S = 0;
392 END;
393 OUTPUT;
394 END;
395 *RUN PROPOTIONAL HAZARD MODEL';
396 *TXPA X=I,2,3 ARE TIME-DEPENDENT REFERRAL VARIABLES';
397 TITLE 'PROPORTIONAL HAZARD MODEL WITH TIME DEPENDENT REFERRAL';
NOTE: T h e d a t a set W O R K . B A S E h a s 1 o b s e r v a t i o n s a n d 2 0 0 0 v a r i a b l e s .
398 PROC PHREG SIMPLE DATA-ONE OUTEST=PARMS;
399 MODEL T*CLOSED01(0)m %VLIST
MPRINT(VLIST): EMPL2 AY93-AY94 MF01 EC01 NOI_SPR NOI_CUT
400 T I T P A T 2 T P A T3TPA;
401 IF T P A = I & T >= L A G 2 T P A T H E N T T P A ~ I ; E L S E T T P A = 0;
402 IF 1/6 • T T H E N T I T P A - T T P A ; E L S E T I T P A = 0;
403 IF I/3 • T >= 1/6 T H E N T 2 T P A = T T P A ; E L S E T 2 T P A ' = 0;
404 IF T >- 1/3 T H E N T 3 T P A = T T P A ; E L S E T 3 T P A = 0;
405 * D E T E R M I N E P H I m R E F E R R A L R I S K R A T I O BY T L A Y E R ;
406 T I T L E ' H A Z A R D R A T I O PHI BY T I M E LAYER';
407 D A T A PAP/MS;SET PARMS;
N O T E : T h e d a t a set W O R K . P A R M S h a s 1 o b s e r v a t i o n s a n d 14 v a r i a b l e s .
N O T E : T h e P R O C E D U R E P H R E G p r i n t e d p a g e s 3-4.
408 K E E P T L A Y E R PHI;
409 T L A Y E R - 1;PHI = E X P ( T I T P A ) ; O U T P U T ;
410 TLAYER - 2;PHI = EXP(T2TPA);OUTPUT;
411 T L A Y E R m 3;PHI = E X P ( T 3 T P A ) ; O U T P U T ;
NOTE: T h e d a t a set W O R K . O N E h a s 1 2 5 1 2 o b s e r v a t i o n s a n d 12 v a r i a b l e s .
419 PROC SORT DATAzONE ;BY T L A Y E R ;
621
NOTE: T h e d a t a set WORK.PARMS has 3 observations and 2 variables.
421 DATA ONE;MERGE ONE(IN-INO} PARMS(IN-INP);BY TLAYER;
422 IF I N O & INP;
622
NOTE: The data set WORK.BASE has 1325 observations and 10 v a r i a b l e s .
NOTE: The PROCEDURE PHREG printed pages 6-9.
N O T E : T h e P R O C E D U R E P R I N T p r i n t e d p a g e I0.
493 PROC PHREG SIMPLE DATA=ONE;BY TPA;
494 MODEL ADJT*CLOSED01(0)= %VLIST;
MPRINT(gqSIST): EMPL2 AY93-AY94 MFOI EC01 NOI_SPR NOI_CUT
495 BASELINE OUT=BASE SURVIVAL=S;
496 TITLE 'ADJUSTED MEAN DURATION FROM SURVIVAL FUNCTION AT MEANS';
623
510 L O W E R + (T - O L D _ T ) * S ;
511 END;
512 O L D T s T;
513 O L D _ S - S;
514 IF L A S T . T P A T H E N DO;
515 MEAN - (UPPER + LOWER)/2;OUTPUT;
516 END;
624
SAB L I S T I ~ *
D a t a Set: W O R K . O N E
Dependent Variable: T
Censoring Variable: CLOSED01
Censoring Value(s): 0
Ties H a n d l i n g : B R E S L O W
S u m m a r y of the N u m b e r of
Event and Censored Values
Percent
Total Event Censored Censored
Total Sample
Standard
Variable N Mean Deviation Minimum Maximum
625
PROPORTIONAL HAZARD MODEL FOR BASELINE page 2
The PHREG Procedure
Without With
Criterion Covariates Covariates Model Chi-Square
626
HAZARD RATIO PHI B Y T I M E L A Y E R page 3
D a t a Set: W O R K . O N E
Dependent Variable: T
Censoring Variable: CLOSED01
C e n s o r i n g Value(s): 0
Ties Handling: BRESLOW
S u m m a r y of the N u m b e r of
E v e n t and C e n s o r e d V a l u e s
Percent
Total Event Censored Censored
Total Sample
Standard
Variable N Mean Deviation Minimum Maximum
627
HAZARD RATIO PHI B Y T I M E LAYER page 4
Without With
Criterion Covariates Covariates Model Chi-Square
1 1 1.42402
2 2 1.20251
3 3 1.12151
628
ACTUAL MEAN DURATION FROM S U R V I V A L FUNCTION AT MEANS page 6
TPA-0
D a t a Set: W O R K . O N E
D e p e n d e n t Variable: T
C e n s o r i n g Variable: C L O S E D 0 1
C e n s o r i n g Value(s): O
Ties Handling: B R E S L O W
S u m m a r y of the N u m b e r of
Event and C e n s o r e d V a l u e s
Percent
Total Event Censored Censored
Total Sample
Standard
Variable N Mean Deviation Minimum Maximum
629
ACTUR~L M E A N D U R A T I O N FROM SURVIVAL FUNCTION AT MEANS page 7
TPA-0
Without With
Criterion CovariateB Covariates Model Chi-Square
630
ACTUAL MEAN DURATION FROM SURVIVAL FUNCTION AT MEANS page 8
TPA~ 1
The P H R E G Procedure
D a t a Set: W O R K . O N E
Dependent Variable: T
Censoring Variable: CLOSED01
Censoring Value(s): 0
Ties H a n d l i n g : B R E S L O W
S u m m a r y of the N u m b e r of
Event and Censored Values
Percent
Total Event Censored Censored
Total Sample
Standard
Variable N Mean Deviation Minimum Maximum
631
ACTUAL MEAN DURATION FROM SURVIVAL FUNCTION AT MEANS page 9
TPA-1
Without With
Criterion Covariates Covariates Model Chi-Square
632
ADJUSTED MEAN DURATION FROM SURVIVAL FUNCTION AT MEANS page II
TPA-0
D a t a Set: W O R K . O N E
D e p e n d e n t Variable: A D J T
C e n s o r i n g Variable: C L O S E D 0 1
C e n s o r i n g Value(s): 0
Ties Handling: BRESLOW
S u m m a r y of the N u m b e r of
Event and C e n s o r e d V a l u e s
Percent
Total Event Censored Censored
Total Sample
Standard
Variable N Mean Deviation Minimum Maximum
633
ADJUSTED MEAN DURATION FROM SURVIVAL FUNCTION AT MEANS page 12
TPA-0
Without With
Criterion Covariates Covariates Model Chi-Square
634
ADJUSTED MEAN DURATION FROM SURVIVAL F U N C T I O N AT MEANS page 13
TPA-I
Percent
Total Event Censored Censored
Total Sample
Standard
Variable N Mean Deviation Minimum Maximum
635
ADJUSTED MEAN DURATION FROM SURVIVAL FUNCTION AT MEANS page 14
TPAJl
The P H R E G P r o c e d u r e
Without With
Criterion Covariates Covariates Model Chi-Square
636
Modeling Multi-Dimensional Survival with
Hazard Vector Fields
Daniel R. Corro
637
Abstract:
Traditional survival analysis deals with functions of one variable, "'time. " This paper
explains the case o f multiple and interacting aging metrics by introducing the notion o f a
hazard vector fieid This approach is shown to provide a more general framework than
traditional survival analysis, including the ability to model multi-dimensional censored
data~ A simple example illustrates how Green's Theorem in the plane applies to evaluate
and even to theoretically optimize a course of action~ One evident apph'cation is to the
evaluation and promulgation of claim administration protocols.
638
Introduction
Although survival analysis has long recognized the need to account for different causes
of death or failure, it recognizes only one way of measuring age. Consequently, survival
f u n c t i o ~ v e n "select" survival functions--are functions ofone variable, typically
denoted "t" and interpreted as "time". This paper explains the need to study observed
lives from multiple perspectives. For example, a vehicle may burn several different types
of fuel with varying and inter-related consumption patterns. The ability to determine
whether a particular trip is possible and if so to find an efficient route may be best
approached as a multi-dimensional problem. That is, it may not always be practical or
revealing to reduce survival into functions of a single variable.
This work evolved from studying workers compensation insurance claims data and the
motivation comes from that context. A quick claim resolution may not achieve a cost-
effective result for either the insurer or the injured worker. A useful measure of "age"
for the insurer may be the paid to date benefit cost of the claim while for the claimant the
most important metric is likely his or her lost income. Traditional survival analysis can
be helpful here, especially in dealing with open claims, i.e., "right-censored" data (c.f.
[2], [4]). Simply taking "t" in the sui'vival analysis models to be paid loss can yield
useful reserve estimates (c.f. [4]). workers compensation claims typically involve both
medical and wage replacement benefits. Each is expected to follow a distinctive payment
pattern that need not be independent of the other. Indeed, that inter-relationship may
prove to be a key cost driver. This paper illustrates how a multi-dimensional survival
model can reveal those inter-relationships and their cost implications.
Consider, for instance, an issue from the ongoing debate over claim administration
protocols. In the workers compensation context, is it better to pursue aggressive medical
treatment quickly in an effort to minimize time lost from work, or is it more efficient to
spend those resources another way, such as providing job retraining. Clearly the answer
may vary tremendously based on the nature of the injury, the age of the worker, the
applicable benefit provisions, and a myriad of other considerations.
The main conceptual result of this paper is that traditional survival analysis can be
inherently limiting. This is established formally by showing that it is not always possible
to define a survival function. The first section of the paper presents a generalization of
the survival function to a function of several variables. Many of the basic formulas of
survival analysis are readily generalized. The next section discusses censored data and
shows how this can introduce new complications in the multi-dimensional context. The
concept of a hazard vector field is defined and shown to provide a more general
framework than traditional survival analysis. In particular, this framework is capable of
dealing with multi-dimensional censored data. It is shown that the existence of a survival
function conforms exactly to the "conservative force field" ofcisssical physics. A simple
example illustrates how Green's Theorem in the plane applies to comparing and even
optimizing paths of action, e.g., as in evaluating claim administration protocols.
639
The concepts introduced in this paper may lead to the ability to help identify optimum
claim handling practices. As noted in the section on further research, much additional
work is required to test this approach. Some work that uses this approach to study the
resolution pattern ofworkers compensation back strains shows some promise but is very
preliminary. The examples presented here are only numeric illustrations; many have no
practical application and some details are left to the reader. Those wishing additional
details on the numerical examples or on the application to back strain cases may contact
the author.
Let ~+ denote the set ofnormegative real numbers and ~ n denote n-dimensional space.
For any a = (a I ..... a, ) • 9t', 3 , = {(x~....,x, ) Ix, > a I , 1 < i < n }; in particular, let
3 = 30 denote the "positive quadrant." We regard 9~n as a model for "space-time" in
which each coordinate represents an aging metric. The most natural case is when n= 1
and the metric is time. For insurance applications, metrics to keep in mind would be
cumulative payments or accumulation of some other quantity associated with claim
resolution (e.g. xl = time from injury, xz = indemnity paid to date, x3 = medical paid to
date, x4 = ALAE paid to date, etc.). We regard :3 as all possible "failures" or "deaths", all
of whose lives begin at the origin. More generally, ~3° represents the possible future
(failure) values subsequent to attaining the point a • ~Rn. Clearly b e 3 a ¢::' 3b c 3 a .
f : ~3..--~~ + If=l.
3
3
Observe t h a t f a n d S uniquely determine one another; indeed, from the fundamental
theorem of calculus:
OnS
f = (-If
Ox~...ax,,"
640
For b E ~a, define fo (b) = f ( b ) . This defines a PDF on ~1a ill which the origin has been
s(a)
shifted to a and which has survival function s a (b) = ~ the conditional probability
S(a) '
assuming survival to a.
Let X b e the random variable with P D F f a n d sample space ~. Because ~ is closed under
vector addition (it is an additive semigroup), it is natural to consider the expression:
/z = E ( X ) = Zf(x)x
x~J
as a candidate "expected failure vector". More generally, for a E 3 this suggests that the
expected failure vector for survival beyond a be expressed as:
p(a)= Zfa(X)(x-a)
XE~[.
This infinite weighted sum, properly interpreted as a limit, can be found (when finite) via
integration. Let x i : 91 n ~ 91 denote the u s ~ coordinate projection functions and
{ci = (0,...0,1,0,...0) I 1 ~ i < n} the usual set of coordinate unit vectors. Continuity and
linearity imply:
/J.
P ( a ) = i~=
I
x~(fa(xXx-a)) / ei
S(a)~,lko, o
Lemma: For any continuous function g : 91+ ~ 91+, b ~ 91+ with Jg(t)dt < oo
0
t - b)g(t)dt = g(w)dwdt
b bt
641
cc c
bt b
C
v=t-b
t
du = - g ( t ) d t i dr=dr
=[uv]~- ~vdu:
cFclcc (t-b)fg(w)dw + ~(t-b)g(t)dt
b L t ]b b
C
= S(t - b)g(t) dt
b
Define n functions:
oo oo
QO oo
Invoking the above lemma and rearranging the order of integration (Fubini's Theorem):
p(a)= t-ai)g~(t)dt e,
which implies that this candidate for expected survival vector can be determined from
conditional survival parallel to the coordinate axes. Note that p : ~ -) ~ is a vector field
and that:
642
It = p(O) = S(t¢i)dt ci
i=lk 0 )
Recall that for n=l the hazard function h can be defined as h(t) = f ! t ! or equivalently as
S(t)
h= d(In(S(t)))
- - - - - - - ~ . While the first definition readily generalizes to define the hazard
function h = - ~ : 3 ~ 9~+ for any n, it is the second that is of greater interest. Given a
survival function S on ~ the corresponding hazard vector field is defined as:
We conclude this section with some simple examples for n=2, in which case we revert to
the more conventional xy.plane notation.
Example 1: Let (a, b) e ~ be a vector in the positive quadrant. The exponential survival
function with parameter vector (agb) is defined as:
643
S(x,y)=e -ax-by f(x,y)=abS(x,y) h(x,y)=ab
Note that this models the case of constant expected survival, p(x,y) = (!,1), and
constant hazard field ~x,.v) (a,b). =
The generalization of Example ! to n>2 is clear. It is not surprising that the expected
survival vector is constant exactly when the hazard field is constant. The inverse
relationship between the two in that event, ft. p = , , has an added geometric appeal since
survival is "global" while hazard is "local" (See [3] for a more systematic discussion of
the relationship between hazard and expected survival.)
Example 2: After a constant vector field (Example 1), the next simplest vector field is
,vT-
where *(x) = d ' Ie 2 a~ is the standard normal cumulative density function.
¥ ~ _®
Example 3: Suppose r ,T,g define another hazard field, survival, and PDF, respectively.
Then r/+r has survival function the product STand PDF:
When n=l. a hazard function h(t) is often viewed as belonging to a one-parameter family
{ch(t) I c ~ ~+} of"proportional b a i r d " functions ([2] considers the mean survival over
644
such a family). A proportional shift h(t) ~ ch(t), c ~ ~+ in the hazard function
corresponds to exponentiation of the survival function S(t) ~ S(t) c . The next
example shows that this concept becomes more complicated in higher dimensions.
Survival Data
y Status Count
1 0 Open 378
1 1 Closed 393
2 2 Closed 229
Total 1,000
In this context "failure" means claim closure, as that corresponds to the end of the life of
a claim. Cases open when the information is collected are regarded as censored. The
reported values of x and y represent medical and indemnity paid to date figures at that
evaluation. For closed cases, the final incurred costs are reported. Consistent with the
assumed unit of payment, no case survives beyond (2,2).
Let P~ denote the probability of survival from point a to point b. The task is to determine
the probability of survival from (0,0) to the point (I,1) = p(l.i)
• (o.o)" Note that there are no
p((u> 1000 = 393 = 0.607 Taking into account the censoring at (1,0), however, implies
°"~ = 1 ~
645
illustrates how censored data leads to a problem determining a probability of survival
SO,I) from (0,0) to (1,1).
The component functions of a hazard vector field r/determined from a survival function
S are readily expressed in terms o r s and the P D F f For example when n=2 we have:
"~f(x,v)dv lf(u,y)a,,
' = S(x,y) ' S(x,y)
in which the common denominator S(x,y) measures the probability of survival to (x,y) and
the numerators the observed "marginal failures" subsequent to (x,y).
646
l+x+y
f0 ( x , y ) = (x + 1) 3 (3' + 1) 3
fl (x,y) = xy
(x + 1) 3 (y + 1) 3
(~fi(x,v)dv ~(u,y)du
rlI(x'Y)=(PJ(x'Y)'Q'(x'Y))=IY S~x,y) ,x S~x,y)
k
= OPI OQ~=(~2Y)f(x,y )
Oy ox
It follows that the vector field/71 does not have a potential function and in particular does
not have the form - V In S I for any survival function S]. This points out the need to
generalize our definitions, as is done in the next section.
Let F = { C ° [ a ~ 3 ; Co a life path for a}, r/: 3 --~ ~ a continuous vector field.
The corresponding generalizedsurvivalfunction S : F ~ 91+ is determined from
S ( C . ) = e c.
The pair r/, S is referred to as a generalizedsurvivalmodelon 5.
Observe that if q,s and z,T are generalized survival models, then so is arl+by,SaT b , for
any a,b e ~1+. In particular, this generalized survival formulation captures situations that
cannot be modeled with PDF's, from both this formal arithmetic perspective and as
regards the ability to relate survival with choice of life path.
O f course, even for n = l , any continuous function h :91 + ~ .91÷ can formally define a
-ih(w)a~
survival function as S(t) = e o but setting f(t) = h(t)S(t) need not yield a
continuous PDF, as considered here. Indeed, Sf (t)dt = 1 - p w h e r e P = lira S ( t ) can
o t-~0
be greater than 0. In that case it is easy to augtmmtf(t) by a point mass o f p r o b a b i l i t y p
647
to achieve a mixed PDF based model. For n>l, the relationship between path
dependence and the existence of a PDF based model lies somewhat deeper.
Also, for n=l, the haTard is interpreted as the instantaneous rate of failure. Consider now
the case n=2. Following standard convention, express the hazard field as
rl(x,y) = (P(x,y),Q(x,y))and assume also that P and Q are continuously differentiable.
Note that for any t>O, any life path of(a+t,b) passing through (a,b) has the form
C + D, where C is a life path of (a,b) and D, (s) = (a + s, b), 0 < s < t. It follows that the
conditional probability of survival from (a,b) to (a+t, b) is uniquely determined as:
L
e-2 ~ _ S ( C + D) _ A t )
S(63
The mean rate of failurea(t) per horizontal unit along D, is also independent of the
choice of C :
(s(c)-l.)S<C) 1
a(t) = ~, s(-c) ) l-p(t)
t t
We are interested in the instantaneous horizontal rate of failure at (a,b), which is just the
limit:
a = lim a(t) = - lim p(t)- p(O) = dp
t-,O t--*O t- 0 dt It=0
On the other hand:
-40
p ( t ) = e o,
(a+t,b)
- l n p ( / ) = ~rl= SP(a+x,b)dx+Q(a+x,b)dy
19, ( a,b )
t
-- fp(,~ + s,b)~
0
since dy=0 along D,. First differentiating by t and then letting t --> 0 :
648
determines a generalized survival function. This discussion shows the converse: a
generalized survival function determines conditional probability, whence failure rates
parallel to the coordinate axes, which in turn determine a hazard vector field.
For any n and a ¢ 3, define the curve Da,i.t (s) = a + s6i, 0 < s ~ t, I < i < n . The
above discussion on failure rate noted that conditional survival parallel to a coordinate
axis is independent of choice of path and the discussion in Section I then suggests the
following definition for the expected survival vector
p(a,b)=(~e-~oP(a+s,b)ds ~-~Q(a,b+s)ds
at, Je ° dt)
o o
t i
-[P(s,b)ds ~ -~Q(a,s)ds
= ( je ; dt, Je z dr)
a b
i f I
e dt= dt+ Ie • dt
a a c
e r
C
i
C
f i
a C
649
The corresponding result for n=l is a special case of tin2 and the case n>2 is a
straightforward induction using the case rm2. In general, we have:
b ~ 3a ~ b + p ( b ) ~ 3a+p(a)
This is intuitively what one would expect and has implications to the task of determining
a hazard vector field approximating empirical data (c.f. [3]).
~(x,y) =
0,o) 2
~c,~= J Y__a~+x~a~ y=0,~=0
(o,o) 2
I
=~0a~+x2(0)=0 ~S(Cl)=l
0
S(C 2) = 1
S(Ct + C3 ) = 1 ~ 0.368
e
650
Clearly, a hazard vector field 1(x, y) = (P(x. y), Q(x, y)) and the corresponding expected
survival vector field q(x, y) should be “inversely related” in some sense. In this
example, as in Example 1, their component functions are found to be multiplicative
inverses of each other. The interested reader can verify that this is characteristic of the
ap
c.asewhenr=+ do=O(c.f. [3]).
Again consider the case w2 and let (a, b) E 3 be a point in the positive quadrant with
life paths C and D. We are interested in comparing S(C) with S(D). The case of most
interest is when (a,!~)is the “first” point beyond the origin at which the life paths meet
and so assume tinther that C lies beneath D in the rectangle [0, a] x [0, b] . The picture
is:
We are interested in comparing the probabilities of failumkurvival over the two paths.
As in the previous section, express the hazard field as rl(x,y) = (P(x,y),Q(x,y)) and assume
P and Q are continuously differentiable. Under these conditions, C-D is a closed curve
enclosing a simply comucted region R. Green’s themem, a topic coverod in most
651
advanced calculus courses, relates the line integral over the boundary with an integral
over the enclosed region. In this ease, it states that:
47= = ff OQ op
C D C-D
OQ OP
Letting r(x, y) -sometimes called the "rotation" of r/ at (x,y)---it follows
Ox Oy
that:
S(D) = eaS(C) where a = Sfr
R
In particular,
r ( x , y ) ~ 0 on R ~ S ( D ) >_S ( C )
r ( x , y ) < 0 on R ~ S ( D ) < S ( C )
with strict inequality holding when r does not vanish on R. Clearly, the function r(x,y) is
key to the task of identifying paths of least or greatest resistance, i.e., optimum paths for
failure or survival.
Example6 (Continued): Here r(x, y) = 2 x - y and as before the focus stays on survival
to the point (a,b) = (1,1). All life paths are contained within the unit square where the sign
of r is depicted below:
(U)
+
(Ouo)
652
The picture suggests considering the life path defined as:
G(t) =
I t
(t,2t)
(t,l)
0I t 5;
+I
or using Green-that:
Consider a deformation of C6 H C6 downward that would invade the region for which
r>O. Taking C;6= C, C, = D in the above, we find that S(C6) > S(C6). On the other
hand, any deformation of C6 H C6 upward would invade the region for which HO.
Taking C6 = D, C6 = C in the above, we find that s(C6) > S(?6 ). It follows that the life
path c6 provides the maximum probability of survival to (1,l). A similar argument
shows that the life path C, + c3 provides the minimum probability of survival to (1,l).
Finally, consider, as in Section II, the interpretation when values of x and,y represent
medical and indemnity benefits paid to date. Subject to this hazard function, the path
Cl + C3 (which corresponds to the “sports medicine” approach of first focusing all
resources to medical care) maximizes the probability of claim resolution at (1,l).
It is apparent from the example that optimal paths can be expected to trace along
solutions to r(x,y)=O and the boundary of the rectangle. Observe that in the interpretation
of Example 6, time was not included among the coordinates. Instead, time was relegated
to the role of parameter of life paths. That is appropriate provided the focus is more on
costs than on their specific timing. If, for example, it is desired to estimate expected time
to failure, it would make senseto include time among the coordinates and look
particularly at the expected survival vector component in that direction. Similarly, if the
timing of payments is at issue, such as with claim administration protocols, it is natural to
explicitly include time as a coordinate in the model. Given the way data is collected,
time stamped payment information is the most natural source for capturing a life path and
time is the most natural parameter.
Green’s theorem comes neatly into play when considering alternative paths for getting to
the same place, i.e., when resources are already allocated and it is a question of
optimizing their effect on claim resolution. Logically prior to this, of course, is the issue
of allocating resources, as illustrated in yet another revisit to the example:
653
ExJmple6 (Continued): Suppose we have fixed resources p > 0and we consider the
portion a ~ [0,1] to be speni on medical. Clearly this involves considering life paths to the
linex + y = ft. So let C,,.p denote the linear life path from (0,0) to (afl,(I - a ) p ) . We
leave to the reader the verification that:
y(/2, f l ) = #T] = (Of--O~3)/~ 3
c.,, 6
It follows that for any fl > 0, y(a, p) has a relative maximum at a = J 1 and so allocating
that portion of every dollar to medical would follow along the straight path
?+,Ii
that maximizes the probability of resolving the claim.
There is also the converse issue, suppose you are confronted with a claim that requires a
certain amount of work to close, how can you minimize the cost outlay? This related
allocation problem is illustrated in our final revisit to the example:
(,+2 y°,
and that the outlay a + a~ais minimized when a = ~ - 1. We find that in the most cost-
We conclude this section with a formulation of Green's theorem suitable for comparing
survival along any two life paths C and D of (a, b) e ~3. For any x ~ [0, a] let
/;x = {(x,t) l t ~ [0, b]} be the vertical line segment above x. Our assumptions imply that:
Lx 17O = {(x, t) I t ¢ [dl(x), d 2(x)] }
Lx NC = {(x,t) l t ~ [q(x),c2(x)]}
And we may define:
-i d2(x)<cl(x)
~(x,y)= + c2(x)<dl(x)
otherwise
654
Pictorially, 8 is 1 when C lies below D and - i when D lies below C, in effect flagging
the two possible orientations the life paths can traverse around the region R they enclose.
All life paths to (a,O lie in the closed rectangle [0, a] × [0, b] and the path:
~(t) _1 I
( 2 a ( t - _- ), , b ) -~ ~ t _< 1
is the "top" top life path. Let
C- D C-C D-C: R, R, R
S(D) = eaS(C) where a = ~Sr
R
This provides a general comparison formula that is amenable to numeric evaluation. In
practice, though, a simple chart of the sign of r(x,y) over the applicable rectangle is the
best starting point. The key, therefore, to identifying optimal paths is a representation of
r/that yields a sufficiently accurate picture ofr.
The question remains how to determine a hazard field from empirical data. One simple
approach is to restrict the domain of the function to regions over which the hJ,7~rd vector
is assumed constant and then approximate it by estimating the coordinate failure rates.
For this, traditional survival analysis methods suffice. SAS, for example, is well suited
since its survival analysis procedures can be performed over cells of data and its time
variable can be set to measure changes parallel to the coordinate variables (see [1]).
General curve fitting techniques can then be used to paste the pieces together. Clearly a
more systematic approach, especially a computer algorithm, would be useful. An
alternative is to first estimate the'expected survival vector field p --which is more
straightforward in concept--and then "invert" that field in some fashion to derive the
hazard vector field r/(this is considered in [3]).
A generalized survival model can be used to assign a case reserve "vector". Unlike
traditional reserve formulas, the vector would account for the interaction of component
cost liabilities. Properly formulated, it would provide integrated benchmarks for both the
655
prospective duration and various dollar costs of a claim. Note that the definition of
expected survival vector field presented here is strictly prospective. It would be
interesting to see whether the theory can yield a "tangent reserve vector" (or higher
derivative vectors) defined on life paths and sensitive to the prior history of the claim.
It would also be interesting and potentially very valuable to determine whether an insurer
has any tendency to follow paths of"greatest or least resistance" in resolving cases. The
ability to identify optimum paths might eventually yield valuable information on
protocols for case management. Example 6 is indicative of how to exploit Green's
Theorem in such an investigation, not to mention first semester calculus.
Example 4 illustrates that the concept of a proportional hazard relationship becomes more
complicated in higher dimensions. Indeed, the concept itself can be blown up n2-fold
from scalar to matrix multiplication. Further research is needed to determine what
concepts work best. The Cox proportional hazard model (see [1]) is the standard tool for
relating explanatory variables ("eovariates") with the hazard function. Because each
component along a life path implies essentially the same failure sequence, the Cox model
will typically associate the same covariate proportional shift irrespective of which
coordinate xi is chosen as the time t variable. Alternatively, a parameter for the life path
could be used as time t. As a result, the Cox model can be used in this context but only
with the understanding that the proportional effect is assumed to be uniform over all
values of all components. By the same token, so-called "time dependent" interventions
can also be analyzed using the Cox model provided the intervention is consistently
defined among the n components. This should not be a problem with time-stamped data
where time is used to parameterize the life paths.
Of particular value would be a generalization of the Cox model approach that avoids such
strong "inter-dimensional" assumptions on constant proportionality. The ideal solution
would be the ability to model covariate impact on the hazard vector field via pre or post
multiplication by a constant matrix. Presumably, determining the "best" such matrix
would involve constructing appropriate maximum likelihood functions. The discussion
in Section 111, however, suggests that this may not be straightforward.
Sometimes all of ~ may exceed the "natural" sample space for a particular problem. A
subset (e.g. manifold as in [5]) might be more appropriate and the "Stokes type" theorems
may prove useful in that context, analogous to the use of Green's theorem in the simple
example discussed here. Applications of "advanced calculus" have traditionally been the
purview of physicists and engineers, not actuaries. Use of multivariate survival models
may help level that playing field.
656
References
[1] Allison, Paul D., Survival Analysis Using the SAS~ System: A Practical Guide,
The SAS Institute, Inc., 1995.
[2] Corro, Dan, Calculating the Change in Mean Duration of a Shift in the Hazard
Rate Function, to appear in CAS Forum, Winter 2001.
[3] Corro, Dan, A Note on the Inverse Relationship of Hazard with Life Expectancy,
in preparation.
[4] Corro, Dan, Modeling Loss Development with Micro Data, CAS Forum, Fall
2000.
657
658
Surplus Allocation: A DFA Application
659
BIOGRAPHY
Kevin J. Olsen is a Fellow of the Casualty Actuarial Society, Associate of the Society of
Actuaries, Member of the American Academy of Actuaries, and a Chartered Financial
Analyst charterholder. Mr. Olsen is employed by GuideOne Insurance in West Des
Moines, Iowa where he is active in reserving issues and financial projects. He received
his B.S.B.A in Actuarial Science from Drake University. Mr. Olsen is an active Casualty
Actuarial Society member currently serving on the Examination Committee.
ACKNOWLEDGEMENTS
The author would like to thank David A. Withers for his encouragement and advice. He
also contributed much appreciated time and assistance reviewing, editing, and suggesting
improvements to this paper. The author would also like to recognize Thomas E.
Hettinger for a final read through and critique.
660
Surplus Allocation:
A DFA Application
Kevin J. Olsen
ABSTRACT
Surplus allocation has been requested from actuaries many times over the years. There
are those who feel surplus allocation o f any sort is incomprehensible. Since actuaries are
asked to allocate surplus, we need to ensure the processes being used are sound. It is
such a request from upper management that sent the author looking for the methods
employed by others and pondering what additional methods could be constructed. This
paper reviews reserve and duration based allocation methods and then ventures into
661
PURPOSE
The purpose of this paper is to share methods for surplus allocation with others, receive
feedback on these methods, and promote further development. The author is a company
actuary in the pursuit of answers for management. This project was begun to answer a
question presented by the company's CFO. The questions raised were non-actuarially
based but needed to be answered by someone with a financial understanding. Given the
company's surplus, what is the optimal distribution of surplus by line of business? This
will allow tracking, calculating, and determining profitability of each line of business on
its own.
662
INTRODUCTION
This paper will review and analyze three methods of allocating surplus. The methods can
be used to distribute current surplus by line of business. This is desired for many reasons
surplus ratios by line of business, and distributing investment income to line of business.
Although many ways have been discussed to allocate surplus, there is no single standard
accepted by everybody. California Proposition 103 used the proportion of loss and
unearned premium reserve to allocate surplus. It has been suggested that surplus being
used for pricing purposes should be allocated based on one's favorite risk load formula I.
duration, or based on the coefficient of variation in loss ratios. This paper will start with
the simpler methods and venture into a variance-based method. The methods will discuss
Keep in mind the allocation of surplus to line of business will not mean line of business
independence, because the total amount of the surplus is still there to support the
company as a whole. The standard deviation of the enterprise surplus or operating gain
will always be less than the sum of the standard deviations by line of business, due to less
663
METHODS
REVIEW OF SURPLUS
What is the purpose of surplus? Surplus is there for two purposes 1) to support insurance
company operations and 2) to support other activities. The surplus allocation for the
purposes of this paper deals with supporting insurance company operations. This is an
amount necessary to cover risks such as the variation in liabilities at a point in time, as
well as prepare for future needs. From a statutory view, as the company grows the
expenses are realized immediately, while the premiums are earned over the course of the
policy. If the company accelerates its growth, there will be a reduction of surplus to
cover the current expenses. From a going-concern basis, the future liabilities also need to
"Surplus [exists to] protect the insurer against several types of risk. Asset risk is
the risk that financial assets will depreciate (e.g., bonds will default or stock
prices will drop). Pricing risk is the risk that at policy expiration, incurred losses
and expenses will be greater than expected. Reserving risk is the risk that loss
reserves will not cover ultimate loss payments. Assotdiability mismatch risk is
the risk that changes in interest rates will affect the market value of certain assets,
such as bonds, differently than that of liabilities. Catastrophe risk is the risk that
unforeseen losses, such as hurricanes or earthquakes, will depress the return
realized by the insurer. Reinsurance risk is the risk that reinsurance recoverables
will not be collected. Credit risk is the risk that agents will not remit premium
balances or that insureds will not remit accrued retrospective premiums." [5]
664
RESERVE METHOD
Distributing surplus based on loss reserves and unearned premium reserves may be the
easiest method. Allocating surplus according to the volume of business per line is a
logical choice since surplus is committed when the policy is written and released when
the loss is paid. If it is a stable book of business, the loss reserves and unearned premium
reserves will remain relatively constant from year to year. California's proposition 103
used this method to allocate surplus to line of business for their calculations.
This method matches available surplus to line of business in proportion to reserves held.
There are no tricky calculations or multiple iterations. The necessary information can be
The method begins by listing the ultimate loss reserves needed by line of business and
summing them for the enterprise. The same is done for the unearned premium reserve.
These are shown in columns I and 2 of Table 1 below, while the sum is shown in column
3. For each line of business, take the respective reserve sum and divide by the enterprise
sum. This gives the distribution of reserves by line of business, which can be applied to
surplus. (See Table 1 or Exhibit 1 .) Enterprise surplus can then be multiplied by the
665
Table 1
The reserve method is a quick and easy method to use, but there are several
disadvantages to using this method. It does not consider the length of the reserve pay-out
tail, adjustments in the reserve payment pattern, or the time value of money. All of these
can cause variations or unexpected results, the precise thing surplus is there to cover.
The method is also a static view of the business at a point in time and does not consider
future changes in the distribution by line of business.

The reserve method of allocation considers only the pricing and reserving risks. Larger
amounts of surplus are allocated to the lines of business holding larger reserves. This
method ignores the five other significant areas of variability referenced above in
determining the surplus allocation. These five neglected risks include asset risk, asset-liability
mismatch risk, catastrophe risk, reinsurance risk, and credit risk.
For more information on different reserve-based methods, reference "An Evaluation of
Surplus Methods Underlying Risk Based Capital Calculations" by Miller and Rapp [7].
DURATION METHOD
Many people on the CAS web site and CASNET touted duration as a means to allocate
surplus. Duration improves on a straight reserve allocation since duration considers
payment pattern changes and interest rate changes in the duration calculation. Longer
tail lines receive relatively more surplus to cover their greater exposure to these changes.
Duration is a time value weighted pay-out length. In other words, duration is a weighted
average term to completion where the years are weighted by the present value of the cash flows:
$$\text{Duration} = \frac{\displaystyle\sum_{t=1}^{n} \frac{t \cdot CF_t}{(1+y)^t}}{\displaystyle\sum_{t=1}^{n} \frac{CF_t}{(1+y)^t}}$$
Duration Example
Table 2
Column 1 of Table 2 shows the losses expected to be paid in each calendar year. This includes payments from all accident years 1997 and
prior. Column 2 shows the time until payment, measured from the beginning of 1997;
the average payment is expected to fall halfway through each calendar year, assuming
that within any calendar year the payments are uniform. The present value of column 1 at
6.5% is shown in column 3, while column 4 is (2) times (3). Column 4 gives a weighted
present value based on the length until payment. The Macaulay duration is the sum of
column 4 divided by the sum of column 3. The Modified duration is the Macaulay duration divided by (1 + y), where y is the discount rate.
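The sketch below reproduces the Table 2 arithmetic in code: calendar-year payments are discounted from mid-year at 6.5%, and the Macaulay duration is the present-value-weighted average time to payment. The payment amounts are illustrative placeholders rather than the Table 2 values.

```python
# Minimal sketch of the Table 2 duration arithmetic. The payment amounts are
# hypothetical; the 6.5% rate and mid-year payment timing follow the text.
payments = [110_000_000, 145_000_000, 69_000_000, 31_000_000, 12_600_000]  # calendar-year paid losses (col 1)
y = 0.065

macaulay_num = 0.0   # running sum of t * PV(payment)  -> column 4
macaulay_den = 0.0   # running sum of PV(payment)      -> column 3
for i, cf in enumerate(payments):
    t = i + 0.5                  # column 2: payments assumed uniform, so mid-year on average
    pv = cf / (1 + y) ** t       # column 3
    macaulay_num += t * pv       # column 4 = (2) x (3)
    macaulay_den += pv

macaulay = macaulay_num / macaulay_den
modified = macaulay / (1 + y)    # Modified duration = Macaulay / (1 + y)
print(round(macaulay, 4), round(modified, 4))
```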
The duration method can be applied using a Dynamic Financial Analysis (DFA) model
that incorporates changing discount rates, payment patterns, and inflation amounts by
iteration in the calculation. Dynamo2 is an Excel based model developed by the actuarial
consulting firm of Miller, Rapp, Herbers & Terry (MRH&T) and used by the author. Further background on the model is provided in Appendix A.
The DFA model needed some programming additions to capture and calculate the
necessary components for duration. Appendix B lays out the changes made to generate
the payments by accident year and calendar year and to generate the interest rate.
With the necessary information obtained, formulas were inserted in the DFA model to
calculate duration as shown in the example above. A sample iteration of this duration
process for the homeowners line of business is shown in Exhibit 2. After the DFA ran
1,000 iterations (the maximum the available computing capacity allowed), durations were selected equal to the mean of the simulated values by line of business.
Table 3 below shows the process of the duration method. Determining a distribution of
surplus begins by normalizing the duration by line of business against the enterprise
duration. Multiplying these relativities by the enterprise surplus-to-premium ratio gives the surplus
needed per dollar of premium. The next step is to apply the appropriate premium from
column 4 to each line of business to arrive at the estimated surplus in column 5. From
here a distribution may be determined by dividing the line of business estimated surplus by
the enterprise surplus. The resulting distribution in column 6 can then be used to spread
the real surplus to line of business. This method is also outlined in Exhibit 3.
Table 3
One might ask why use the Macaulay duration and not the Modified duration. Since this method deals
with relative duration by dividing each line of business by the enterprise duration, it does
not matter which one is used. If the Modified duration were used, then all the durations
would be divided by the same factor, maintaining the same relativities between them.
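In symbols, and assuming a single yield y applies to every line (as it does within one model iteration), the choice does not affect the relativities:

$$D^{\text{mod}}_i = \frac{D^{\text{mac}}_i}{1+y}
\quad\Longrightarrow\quad
\frac{D^{\text{mod}}_i}{D^{\text{mod}}_{\text{ent}}}
  = \frac{D^{\text{mac}}_i/(1+y)}{D^{\text{mac}}_{\text{ent}}/(1+y)}
  = \frac{D^{\text{mac}}_i}{D^{\text{mac}}_{\text{ent}}}$$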
In addition to the advantages listed above, there are a few disadvantages to using the
duration method to allocate surplus. The duration method distributes surplus based on
projected ultimate losses for past years and the payment pattern for those years. From
that point of view, using duration has a run-off viewpoint. It allocates surplus to lines of
business in relation to how those lines will run-off and in relation to current premium
volume. This covers the vulnerability to greater variation in the longer payout lines of
business. This is a static view of business at a point in time. It does not consider future
growth or changes in the mix of business going forward. For a company that plans on
continuing to write business and grow, this might not be the best option. Surplus needs to
be allocated for future premium growth. For statutory accounting, the expenses of
writing policies are recognized immediately, while the premiums are earned over the
course of the policies. This is why companies with accelerated growth may see surplus
decline (statutory surplus, not market value surplus). "Rapid premium growth precedes
nearly all of the major failures. Rapid growth is not harmful, per se. However, rapid
premium growth reduces the margin for error in the operation of insurers." [4]
Additional surplus is needed to cover the reduced margin of error for growing lines of
business.
The duration method does a better job than the reserve method of considering the risks
surplus is to protect against. The lines of business with longer pay-out patterns have
higher durations. Here duration is a proxy for the riskiness of the long tail lines. Longer
tailed lines are exposed to more interest rate and payment pattern changes incorporated in
reserving risk, asset risk, and asset-liability mismatch risk. By using duration as a proxy,
this allocation method covers these risks. Again, four risks surplus is to protect against
are not even considered by this method (pricing risk, catastrophe risk, reinsurance risk, and credit risk).
Keep in mind that even though this model considers variation in the payment pattern,
judicial or legislative changes that could affect payments are not considered. Such
changes would create greater variability, but are difficult if not impossible to predict.
These types of changes cannot be foreseen by any method presented here.
VARIATION METHOD
The variation method is one that the author developed while working with the DFA
model and trying to answer the CFO's questions. It is a forward-looking method focused on what
may happen. Loss reserves are already set up to cover losses that have occurred. Surplus
exists for unexpected events or variations from the norm. This method uses standard deviations of simulated operating results to measure such variation.
The variation method uses the calendar year operating gain by line of business from each
iteration of the DFA, calculated by adding net underwriting profit to the investment
return during the calendar year. To calculate such information, additions needed to be
made to the DFA spreadsheet to capture interest earned by line of business. This is
described in Appendix C.
Following the steps described in Appendix C, the investment return by line of business to
be included in the operating profit was derived as the amount of reserves available for
investment times the rate of return for the appropriate year. The calendar year net
operating profit was calculated by adding this investment return and the calendar year net underwriting profit.
The next step was to compare the variation between lines of business. Using the variance
of operating profit alone would give results that are difficult to compare between lines of
business with differing premium volumes and numbers of policies. To put all lines of
business on a comparable basis, the net operating gain was divided by the net written
premium for that line of business. This ratio is a unitless measure, with the dollar units
canceling out. It put all lines of business on a net operating gain per dollar of net written
premium basis before the variability was measured.
As the steps of the variation method calculation are described, reference will be made to
the portions of Exhibit 4 discussed in the text. Exhibit 4 shows this method laid out in its
entirety. By capturing the operating gain by line of business for each calendar year, the
@Risk software calculates the standard deviation of each line and year over the 1,000
iterations.
Table 4 shows the results of the simulation. Columns (1) through (5) are the standard
deviations of net operating gain per dollar of net written premium. This information was
generated by the DFA model and @Risk. Appendix D lays out the credibility weightings applied to these standard deviations.
Table 4
Table 5 below shows the remaining steps to determine the surplus allocation of the
variation method. Dividing the credibility weighted standard deviations (Table 8, column
12, Appendix D) by the average standard deviation of the enterprise (Table 8, column 13,
Appendix D) produces a relativity for each line of business. Multiplying the
resulting relativities by the inverse of the company's premium-to-surplus ratio changes
the relativities to the amount of surplus needed per dollar of premium. The next step is to
apply the appropriate premium to each line of business to arrive at the estimated surplus
(column 17 = (15) * (16)). Appropriate premium could include the year-end premium by
line of business or the first year's projected premium. From here a distribution may be
determined by dividing the line of business estimated surplus by the enterprise estimated
surplus. The resulting distribution in column 18 can then be used to spread the real
surplus to line of business.
Table 5
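The sketch below mirrors the Table 4 and Table 5 arithmetic: standard deviations of operating gain per dollar of net written premium are computed across iterations, scaled against the enterprise average, converted to surplus per dollar of premium with an assumed 0.60 surplus-to-premium ratio, and turned into a distribution. All of the inputs are illustrative placeholders, and the Appendix D credibility weighting is omitted.

```python
# Sketch of the variation-method arithmetic (Tables 4 and 5). The simulated ratios,
# premiums, enterprise surplus, and the 0.60 surplus-to-premium ratio are placeholders.
import statistics

# operating gain per $ of net written premium, one value per DFA iteration
sim_ratios = {
    "Homeowners": [0.12, -0.35, 0.08, 0.20, -0.10],
    "CPP Liab":   [0.05,  0.03, 0.06, 0.04,  0.05],
}
premium = {"Homeowners": 55_000_000, "CPP Liab": 100_000_000}
surplus_to_premium = 0.60          # inverse of the premium-to-surplus ratio
enterprise_surplus = 500_000_000

std_dev = {line: statistics.stdev(vals) for line, vals in sim_ratios.items()}
avg_std = statistics.mean(std_dev.values())        # enterprise average standard deviation

# relativity * (S/P) * premium = estimated surplus by line
estimated = {line: (sd / avg_std) * surplus_to_premium * premium[line]
             for line, sd in std_dev.items()}
total_est = sum(estimated.values())

# distribution of estimated surplus, applied to the real enterprise surplus
allocation = {line: enterprise_surplus * est / total_est
              for line, est in estimated.items()}
for line, amount in allocation.items():
    print(f"{line}: {amount:,.0f}")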
This method contains many of the characteristics that are desired from a surplus
allocation method. The length and amount of the tail are considered with varying
payment patterns, incorporating reserving risk. The DFA model varies interest rates to
include asset risk into the considerations. The varying interest rate is brought into the
operating gain through the investment income. Using operating gains also reflects
pricing risk by including the variability of loss ratios embedded in the operating return, and the
impact of asset-liability mismatch risk is included by varying the interest rates in the
model as well as varying the ultimate loss and payment patterns included in the operating
gains. The DFA model does not consider reinsurance risk or credit risk, but these could be added to the analysis.

This method goes beyond the first two methods and looks to the future. This is a going-concern
method, which tries to reveal what the distribution by line of business should be
going forward. To do this it incorporates the company's growth plans by line of business
and the variability by line of business based on the growth plans and past experience. If
the company is going to cut rates to grow more, then this is included in the variation in
net operating gain per dollar of net written premium and figured into the standard
deviation. Most company changes in growth, mix of business, or type of business are
reflected in the operating gains as long as the DFA model is set up appropriately to reflect
these changes.

The distribution created by the variation method may raise some questions. Why is it that
the property and physical damage coverages receive more surplus based on this method?
The property and physical damage coverages are subject to catastrophes and therefore
more variation from year to year. The variation is a result of both the frequency and the
severity of catastrophes. The liability lines have the potential for high single occurrence
pay-outs by policy, but the number of these is relatively consistent from year to year.
The law of large numbers makes predicting the results for these lines of business more
consistent.

A good example of this contrast is Commercial Auto Physical Damage (CAPD) and CPP
Liability (CPP Liab). As can be seen in Table 5, both of these lines are allocated
$11,000,000 of surplus, whereas the premium for CPP Liab is 11.5 times as large as that
for CAPD. In CPP Liab, the law of large numbers helps smooth results, while CAPD is
subject to catastrophes. The reinsurance in place also underlies these results.

Both liability and property lines of business for smaller companies are affected by
variations in large losses from year to year. The author did not test which lines of
business had more variability in large losses, attributing the major variations between
the lines to catastrophe exposure.
Non-catastrophe reinsurance levels also influence the variability by line of business. On
a net of reinsurance basis, as the threshold for excess of loss coverage is reduced, the
variability of results also declines. "[A]ny risk which lowers the aggregate Exposure
Ratio of the portfolio has added capacity to the portfolio." [9] The exposure ratio is
defined in Stone's capacity framework [9]; as the net variability of a line declines, the
level of surplus needed for the line of business decreases, freeing up surplus for other
uses.
COMPARISON OF METHODS
The three different methods presented give widely varying results, as can be seen in
Table 6 below.
Table 6
These variations are the result of the reasoning behind the methods. Looking at personal
auto liability, the loss reserve method in column (1) allocates 33.5% to this line, whereas,
the duration method apportions 20.5% and the variation method only 7.1%. The personal
auto liability line of business has a consistent amount of losses every year and consistent
sales growth producing higher reserves held. The reserve method reflects this explicitly.
The duration method analysis notes that the payout pattern is weighted heavily to the
earlier years. This does not allow much time for adverse development. The variation
method looks at the reserves and the payout pattern, but also considers that from year to
year the loss ratios are consistent. The ultimate personal auto liability losses can be
reasonably estimated from year to year without much variation from expected. Therefore the variation method assigns this line a relatively small share of surplus (7.1%).
The homeowners line of business is another good example. With the payouts being quick
and settlements rather fast, the level of reserves carried is relatively low. The reserve
method looks only at the carried reserves to determine the allocation (7.3%). The
duration method considers that the pay-out pattern is relatively short meaning less surplus
is necessary (5.8%). Yet when losses are compared from year to year there is greater
variation due to catastrophes. The share of surplus necessary to cover these greater
variations under the variation method is 13.0%.
The driving risks affecting surplus differ for each line of business. For example,
catastrophes have more of an impact on property lines than liability coverages. There are
certain sets of risks for each line of business that maintain significant influence on results.
SURPLUS
What overall amount of surplus should be used? All of the methods discussed above
allocate a stated surplus amount. There are a few different methods to determine how much total surplus should be used.
ACTUAL
The most straightforward method would be to use the company's actual surplus as of
year-end. This amount would then be distributed back to line of business based on the
method of choice. A problem with this method arises if the company is over
capitalized (under capitalized); if this were the case, then too much (too little) would be
allocated. As stated toward the beginning of the paper, surplus is there to support
insurance operations as well as other activities. The surplus to allocate should be the
portion that supports the insurance operations.
Actual surplus also has many definitions to consider. If allocating actual surplus, is it
market value, statutory value, or GAAP value? Should equity in the unearned premium
reserve or the discounted amount from the loss reserve discount factor be included?
PREMIUM-SURPLUS RATIO
A second method pegs the surplus at a certain premium to surplus ratio (P/S). There are a
variety of reasons and justifications for selecting a certain P/S ratio. The P/S ratio could
be selected by management's desire not to exceed a certain P/S ratio, say 2:1. It could be
pegged to match a certain peer group in the industry. A word of caution: P/S ratios can
vary widely across companies and lines of business.
OPERATING GAIN DISTRIBUTION
The amount of surplus needed by a company is based on its aversion to risk. Assume that
a company's risk manager determines that they want to be 95% confident that the surplus
doesn't decrease, or 90% confident the surplus decreases by no more than 10%. To do
this the company would need to generate a distribution of the change in surplus for the
year. Another alternative would be to use operating gain for the year. In both of these
cases, choosing the corresponding amount from the distribution allows it to be used to determine the needed surplus.
The net operating gains from the DFA model iterations used in the allocation process can
be captured and set into a distribution. Using the example from above, the goal would be
90% confidence that surplus would decrease by at most 10% in the given year. From the
1,000 iterations, the Enterprise operating gain for 1998 was captured, with the resulting percentiles shown in Table 7 below.
Table 7
Enterprise
Operating Gain
1998
5% Perc =  (71,619,600)
10% Perc = (50,000,000)
15% Perc = (32,521,600)
20% Perc = (10,258,950)
25% Perc = (763,150)
30% Perc = 12,859,600
35% Perc = 38,245,700
40% Perc = 60,052,150
45% Perc = 84,517,300
This is a portion of the full distribution that increases in 5% increments up to 95%. This
table communicates for example that 5% of the operating gain samples are less than
$(71,619,600) and that 30% of the samples are less than $12,859,600. At a 90%
confidence level, $(50,000,000) is the operating gain. Similarly, at a 95% confidence level
the operating gain is $(71,619,600).

At the 90% confidence level, surplus would decrease by at most $50,000,000 in the
year. If the company started with only $50,000,000, this would not be a pleasant
prospect. The question then becomes, "How much surplus is company management
willing to lose in any year?" For example, suppose
management is willing to risk a decrease of at most 10% of starting surplus. In other
words, the $(50,000,000) would equal (0.10) times the needed surplus. The surplus needed is therefore $500,000,000.
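A minimal sketch of that calculation is shown below; the simulated gains are randomly generated stand-ins for the 1,000 captured iterations, so the printed amounts will not reproduce Table 7.

```python
# Hedged sketch: converts a simulated operating-gain distribution into a needed-surplus
# amount. The gains below are random placeholders for the 1,000 captured iterations.
import math
import random

random.seed(0)
operating_gains = [random.gauss(100_000_000, 120_000_000) for _ in range(1000)]

def empirical_percentile(values, p):
    """Smallest simulated value with at least a fraction p of the sample at or below it."""
    ordered = sorted(values)
    k = max(math.ceil(p * len(ordered)), 1)
    return ordered[k - 1]

confidence = 0.90     # be 90% confident that ...
max_loss_pct = 0.10   # ... surplus falls by no more than 10% in the year

worst_tolerated_gain = empirical_percentile(operating_gains, 1 - confidence)
# e.g. a $(50,000,000) tenth-percentile gain and a 10% tolerance imply $500,000,000 of surplus
needed_surplus = -worst_tolerated_gain / max_loss_pct if worst_tolerated_gain < 0 else 0.0
print(f"{worst_tolerated_gain:,.0f} -> needed surplus {needed_surplus:,.0f}")
```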
The calculated surplus is the theoretical amount needed to support business as a going-concern
under the stated constraints. This amount should be used in the ROE and pricing
calculations. Comparing this surplus to the enterprise surplus may indicate a redundancy
or deficiency. If the calculated surplus is less than the company surplus, the redundancy
isn't necessarily excess to squander. The total enterprise surplus may need to be
maintained to support the company's other activities.
Many insurance companies are being evaluated from a financial viewpoint. The question
that comes up is the level of ROE that the company wants to target. The level of surplus
affects this ROE measure. Lower surplus translates into a higher leverage ratio
increasing the potential ROE while generating a greater chance of ruin. To reduce the
chance of ruin more surplus would be held, reducing the ROE. This puts the insurance
company in the position of balancing higher returns against security.
With the DFA model it is possible to test out different levels of surplus. One can begin
with a certain level of surplus, capture the appropriate values for ROE and ending surplus, and then repeat the process at other surplus levels to examine the trade-off.
CONCLUSION
As the line between the financial industry and the insurance industry blurs, actuaries are
becoming the financial leaders in the insurance industry. From a financial perspective
there is a strong desire to allocate surplus to measure, track, and rate performance on a
line of business basis. There are many ways to allocate surplus once the overall needed
amount is determined. Of the methods presented here, the variation method
incorporates the most characteristics desired from a surplus allocation method. However,
this is just the starting point for others to build upon and to improve.
Exhibit 1
Distribution by Reserves
Exhibit 2
DURATION CALCULATION
Line of Business: Homeowners
Accident Years
[Exhibit 2 table, one sample DFA iteration at the iteration's 6.53% discount rate: columns (1)-(10) show the projected payments by accident year through 1997, column (11) the total calendar-year payment, column (12) the mid-year time to payment (0.5, 1.5, ...), column (13) the present value of the calendar-year payment, and column (14) the time-weighted present value, for calendar years 1997 through 2021. Column totals: present value 346,909,087; weighted present value 632,916,747.]
Exhibit 3
Duration Distribution

(1) Duration   (2) Duration Relativity   (3) S/P   (4) Premium   (5) Estimated Surplus   (6) Split
Homeowners 1.8698 0.8023 0.4814 54,552,830 26,261,991 5.4%
P Auto Liab 2.0946 0.8988 0.5393 179,204,200 96,644,215 19.7%
P Auto Phys Dam 1.2064 0.5177 0.3106 164,174,860 50,994,823 10.4%
C Auto Liab 2.5792 1.1068 0.6641 24,369,986 16,183,090 3.3%
C Auto Phys Dam 1.4776 0.6340 0.3804 8,872,202 3,375,212 0.7%
CPP Liab 3.2283 1.3853 0.8312 102,639,868 85,311,534 17.4%
CPP Prop 1.8037 0.7740 0.4644 137,153,660 63,692,602 13.0%
Other Liab 2.6794 1.1498 0.6899 15,129,588 10,437,285 2.1%
Other Liab - Umbrella 2.6639 1.1431 0.6859 630,543 432,473 0.1%
Workers Comp 3.6035 1.5463 0.9278 148,116,140 137,419,704 26.0%
(7) Adjusted Surplus   (8) P/S
Homeowners 26,756,836 2.0388
P Auto Liab 98,465,245 1.6200
P Auto Phys Dam 51,955,699 3.1599
C Auto Liab 16,488,021 1.4780
C Auto Phys Dam 3,438,810 2.5800
CPP Liab 86,919,027 1.1809
CPP Prop 64,892,737 2.1135
Other Liab 10,633,951 1.4228
Other Liab - Umbrella 440,622 1.4310
Workers Comp 140,009,051 1.0579
(1) Duration
(2) Line Duration / Enterprise Duration
(3) (2)" Surplus/Premium ratio of 0.60 = (1/1,6697)
(4) Premium
(5) (3) * (4) = (Surplus/Premium) * (Premium) = Estimated Surplus
(6) (5) / (Enterprise 5)
(7) (6) * Enterprise Surplus
By Line: [ (5) / Consumer or Commercial (5) ] * Consumer or Commercial (7)
(8) (4)/(7)
Exhibit 4
Variation Distribution
[Exhibit 4 table: standard deviations of operating gain per $ of net written premium by line of business and calendar year (columns (1)-(5)), the credibility weighting of Appendix D, and the allocation steps of Table 5. Legible summary figures: Consumer 397,928,456; Commercial 436,911,953; Enterprise 834,840,409; Premium/Surplus ratio 1.67; Surplus/Premium ratio 0.60 (assumed).]
Appendix A
The following includes excerpts from the papers "Building a Public Access PC-Based
DFA Model" (1997 CAS Summer Forum, Vol.2) [2] and "Using the Public Access DFA
Model: A Case Study" (1998 CAS Summer Forum) [3]. Both papers are used with the permission of the authors.
The Dynamic Financial Analysis (DFA) model used in this paper is a public access
model. The actuarial consulting firm Miller, Rapp, Herbers & Terry (MRH&T)
created Dynamo2. Dynamo2 is Excel based enabling the user to create calculations as
needed.
Each iteration of the model starts with detailed underwriting and financial data showing
the historical and current positions of the company. It randomly selects values for 4,387
stochastic variables, calculates the effect on the company of each of these selected values,
and produces summary financial statements of the company for the next five years based
on the combined effect of the random variables and other deterministic factors.
The model consists of several different modules, each of which calculates a component of
the model indications. Separate modules are included for investments, catastrophes,
underwriting, taxation, interest rates, and loss reserve development. The number of lines
of business can be expanded or contracted to fit the needs of the user. The model used for this paper reflects the lines of business shown in the exhibits.
For each line of business, the underwriting gain or loss is calculated separately for: 1)
new business, 2) 1st renewal business and 3) 2nd and subsequent renewals. This division
is provided to reflect the aging phenomenon, in which loss experience improves with the
length of time a policyholder has been with a company. These three categories are then combined to produce the results for the line of business.
The values for each simulation are shared among the different modules. Thus, if the
random number generator produces a high value for the short term interest rate, this high
interest rate is used in the investment module as well as the underwriting module.
Similarly, a high value for catastrophes in the catastrophe module carries through to the underwriting results.
The primary risks that are reflected in the model are pricing risk, reserving risk, asset risk, asset-liability mismatch risk, and catastrophe risk.
CAVEATS OF THE MODEL
Some factors, having a potentially significant impact on results, are omitted from the
model because, in the opinion of the authors, they are beyond the scope of an actuarial
analysis. Whether fraud is likely to occur (or is currently occurring) at a particular insurer is not
something the model attempts to evaluate; the effects of fraudulent behavior are simply
omitted from the model. Other examples of omitted
factors that definitely could have a significant effect on insurance operations include a
change in the tax code, repeal of the McCarran-Ferguson Act, a major shift in the
application of a legal doctrine or the risk of a line of business being socialized by a state,
province, or federal government. Thus, the range of possible outcomes from operating an
insurance company is actually greater than a DFA model would indicate; the model is
designed to account only for the risks that can be realistically quantified.
The values used as input in the model are derived from past experience and current
operational plans. To the extent that something happens in the future that is completely
out of line with past events, the model will be inaccurate. For example, the size of a
specific catastrophe is based on a lognormal distribution with the parameter values based
on experience over the period 1949 - 1995 (adjusted for inflation). However, if this
process had been used just prior to 1992, the chance of two events occurring within the
next 2 ½ years, both of which exceeded the largest previous loss by a factor of more than
2, would have been extremely small. However, Hurricane Andrew caused $15.5 billion
in losses in August of 1992 and the Northridge earthquake caused $12.5 billion in insured
losses in January 1994. The largest insured loss prior to that was Hurricane Hugo, which
had caused $4.2 billion in losses in 1989. Also, if changes in any operations occur, then the model inputs need to be adjusted to reflect them.
The DFA model encompasses catastrophes, which have a significant impact on the
property lines of business. The liability lines of business are more influenced by changes
in public attitudes, and legislative or judicial changes. These changes are difficult if not
impossible to model accurately. The variation method considers these to the extent that they are reflected in the historical experience underlying the model.
The number of years used may affect the credibility of results. The DFA model results
have a compounding effect from year to year (e.g., the first year results are used in the
second year, the second year results are used in the third year, and so on). With nominal
growth assumptions, this will result in larger variation for the more distant years. If
ample simulations are run, then the distant years' variation becomes more stable.
When a significant legislative or judicial change occurs, the model should be adjusted to
reflect such changes. The surplus allocation process should then be run once again to reflect the new environment.
MODEL USAGE
Before relying on a DFA model for any purpose, the user must be comfortable with the model's underlying assumptions.
Appendix B
The assumption was made that all payments would be made by the end of the twenty-first
year for each accident year; however, in the original model only five calendar years of
payments were calculated for each accident year. These payments needed to be extended
to twenty years past the last projected accident year. Extended payments were produced
in the same fashion as done by the model for the first five years.
After projecting how much is going to be paid from each accident year for any calendar
year, these payments are summed across all accident years for the appropriate calendar
year. This generates projected calendar year paid amounts as in column 1 of Table 2.
The total column in Exhibit 2 also shows calendar year paid amounts.
A discount rate was needed to find the present value of these calculations. The discount
rate used came from the first year's projected investment information of the model.
Dividends, coupon payments, and interest were summed and divided by the average book
value amount invested in stocks, bonds, and cash over the year. This avoids both realized
and unrealized capital gains or losses. Calculating the discount rate within the DFA allows the
interest rate to vary with the projected economic conditions for each iteration.
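A small sketch of that discount-rate derivation follows, with hypothetical figures and the simplifying assumption that the average book value is the mean of the beginning and ending balances.

```python
# Minimal sketch of the Appendix B discount-rate derivation; the figures are hypothetical.
# Realized and unrealized capital gains are deliberately excluded.
dividends, coupons, interest = 4_000_000, 18_000_000, 3_000_000
book_value_begin = 380_000_000   # stocks + bonds + cash at the start of the year
book_value_end   = 410_000_000   # and at the end of the year (assumed averaging basis)

investment_income = dividends + coupons + interest
average_book_value = (book_value_begin + book_value_end) / 2
discount_rate = investment_income / average_book_value
print(f"{discount_rate:.4f}")
```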
For calculating present values, a uniform payment pattern during each year was assumed, so each calendar year's payments were discounted from the middle of that year.
Appendix C
To capture interest earned by line of business, adjustments had to be made to the DFA
model.
The pay-out rate of reserves was determined from the payment patterns already
mentioned. The percentage of reserves available for investment over the course of the
year is:
$$\frac{\left[(1 - PP_T) + (1 - PP_{T+1})\right]/2}{1 - PP_T}$$
Using the above calculation, the amount available for investment was found by multiplying this percentage by the reserves:

$$\text{Amount available for investment} = \frac{\left[(1 - PP_T) + (1 - PP_{T+1})\right]/2}{1 - PP_T} \times \text{Reserves}$$

where Reserves are the reserves at the beginning of the calendar year, $PP_T$ is the cumulative percentage of losses paid at the beginning of the calendar year, and $PP_{T+1}$ is the cumulative percentage paid by the end.
Example: For a given line of business and accident year, 20% of the losses had
been paid by the beginning of the current calendar year and 40% paid by the end.
The year began with $5,000,000 in reserves for this particular line and accident
year. The percentage available is [(0.80 + 0.60)/2] / 0.80 = 87.5%. In other words,
87.5% of the beginning of the year reserves, or $4,375,000, are available for investment over the year.
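The sketch below reproduces this worked example as a small function; the function name is illustrative and not part of the model.

```python
# Sketch of the Appendix C calculation, reproducing the worked example in the text:
# 20% of losses paid at the start of the year, 40% by the end, $5,000,000 of reserves.
def reserves_available_for_investment(paid_pct_begin, paid_pct_end, reserves_begin):
    """Average reserves held over the year, as a share of beginning reserves, times reserves."""
    fraction = ((1 - paid_pct_begin) + (1 - paid_pct_end)) / 2 / (1 - paid_pct_begin)
    return fraction * reserves_begin

print(reserves_available_for_investment(0.20, 0.40, 5_000_000))  # 4,375,000 = 87.5% of 5,000,000
```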
The method of calculating the rate of return used for this method was based on the market
value return. The ending market value of the stocks, bonds, and cash was added to the
sum of dividends, coupon payments, and interest received. The resulting amount was
divided by the sum of the beginning market value of the stocks, bonds, and cash. A
separate rate of return was calculated in this way for each calendar year of each iteration.
Appendix D
Credibility Weightings
Even after running 1,000 iterations, the information is not necessarily fully credible since
this calculation deals with the standard deviation. The model itself should be fully
credible, but the standard deviation deals with the number of samplings. If enough
iterations are run, the standard deviations should be relatively stable from year to year.
Due to computing limitations, only 1,000 iterations were run, which lacks full credibility.
Table 8 below lays out the credibility weighting of the standard deviations. Applying the
Bühlmann credibility across the years and between the lines with the use of columns 6
through 10, credibility is determined by line of business and displayed in column 11.
This process is shown explicitly in Exhibit 4. Giving credibility weight to the expected
value for a line of business over the years (column 6) and the complement to the average
of the expected values for all lines of business (column 8) results in a credibility weighted
standard deviation of net operating gain per dollar of net written premium (column 12),
This is the main factor in helping determine the distribution of surplus. The rest of the
steps are similar to those used in the duration method. The Bühlmann credibility calculation itself is described in Appendix E.
Table 8
(11) Credibility   (12) Cred. Wtd. Std Deviations
Home 0.7691 0.8621
P Auto Liab 0.9999 0.1441
P Auto Phys Dam 0.9418 0.3569
C Auto Liab 0.9891 1.5628
C Auto Phys Dam 0.7541 0.9035
CPP Liab 0.9999 0.0770
CPP Prop 0.4155 1.0756
Other Liab 0.9996 0.2133
Other Liab - Umbrella 0.9798 1.0980
Workers Comp 0.9996 0.1680
* The expected value is a straight average of the individual line of business data points.
** The between variance is the sum of the squared differences between the line of business data point and the
expected value, all divided by the number of lines of business.
Appendix E
BÜHLMANN CREDIBILITY
The within variance is calculated within the same class or line of business across years or
periods. In this application it would be the variance for a certain line of business over the
5-year period.

$$\text{Within variance} = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$$

where $X_i$ is an individual observation, $\bar{X}$ is the average observation within the line of business, and $n$ is the number of years (here, 5).
The between variance is calculated within the same year but across lines of business.

$$\text{Between variance} = \frac{1}{m}\sum_{i=1}^{m}\left(Y_i - \bar{Y}\right)^2$$

where $Y_i$ is an individual observation in the year, $\bar{Y}$ is the average observation for the year over all lines of business, and $m$ is the number of lines of business.
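A hedged sketch of one standard Bühlmann credibility estimator consistent with these within/between definitions is shown below; the paper's exact estimator is not fully reproduced in the extracted text, so the data and the variance-of-hypothetical-means adjustment are illustrative assumptions rather than the author's calculation.

```python
# Hedged sketch of a common Buhlmann credibility formulation, Z = n / (n + EPV/VHM),
# applied to hypothetical standard deviations by line of business and year.
import statistics

# rows: lines of business, columns: 5 simulated years of std deviations (illustrative numbers)
data = {
    "Homeowners": [0.90, 0.85, 0.80, 0.88, 0.92],
    "CPP Liab":   [0.08, 0.07, 0.09, 0.08, 0.07],
    "CPP Prop":   [1.30, 0.60, 1.50, 0.90, 1.10],
}
n = 5  # years per line

line_means = {line: statistics.mean(vals) for line, vals in data.items()}
grand_mean = statistics.mean(line_means.values())

# expected within-class variance (population form, matching the 1/n definition above)
epv = statistics.mean(statistics.pvariance(vals) for vals in data.values())
# variance of the hypothetical means, estimated from the spread of the line means
vhm = max(statistics.pvariance(line_means.values()) - epv / n, 1e-12)

k = epv / vhm
credibility = {line: n / (n + k) for line in data}
# credibility weight to the line's own mean, complement to the all-line average
cred_weighted = {line: z * line_means[line] + (1 - z) * grand_mean
                 for line, z in credibility.items()}
print(credibility)
print(cred_weighted)
```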
REFERENCES
[1] Calfarm v. Deukmejian, 48 Cal. 3d 805, 771 P.2d 1247 (1989).
[2] "Building a Public Access PC-Based DFA Model," Casualty Actuarial Society Forum, Summer 1997, Vol. 2.
[3] D'Arcy, S. P., R. W. Gorvett, T. E. Hettinger, and R. J. Walling, III, 1998, "Using
the Public Access DFA Model: A Case Study," Casualty Actuarial Society Forum,
Summer 1998, pp. 53-118.
[4] Ettlinger, K.H.; Hamilton, K.L.; and Krohm, G., State Insurance Regulation (First
Edition), Insurance Institute of America, 1995, Chapter 8, pp. 209-231.
[5] Feldblum, S. "Pricing Insurance Policies: The Internal Rate of Return Model," May
1992.
[6] Ferguson, R. E., "Duration," Proceedings of the Casualty Actuarial Society, Vol.
LXX, 1983, pp. 265-288.
[7] Miller, M. and Rapp, J., "An Evaluation of Surplus Methods Underlying Risk Based
Capital Calculations," 1992 Discussion Paper Program, Volume 1, pp. 1-122.
[9] Stone, J.M., "A Theory of Capacity and the Insurance of Catastrophe Risks," The
Journal of Risk and Insurance, June 1973, Vol. XL No. 2, Part I, pp. 231-243, and
September 1973, Vol. XL No. 3, Part II, pp. 339-355.