Tutorial 1 Solutions


THE UNIVERSITY OF HONG KONG

DEPARTMENT OF STATISTICS AND ACTUARIAL SCIENCE

STAT3951 Topics on Advanced Actuarial Modelling (Spring 2024)


Example Class 1 Solutions

1. In a large study on the transition of insured persons among various states of a
multiple state model, you are given:

• The available states are: Healthy (0); Disabled (1); Dead (2); Lapsed (3).
• Healthy individuals can become disabled, die or have their policies lapsed.
Disabled individuals can become healthy or die. Individuals that are dead or
have their policies lapsed cannot return to the system.
• Exposure and transition data are as follows:

\[
v_{50}^{00} = 17355, \quad v_{50}^{11} = 553,
\]
\[
d_{50}^{01} = 293, \quad d_{50}^{02} = 18, \quad d_{50}^{03} = 196, \quad d_{50}^{10} = 166, \quad d_{50}^{12} = 17.
\]

The subscripts of the above quantities indicate the age of the insured.
(a) Estimate $\mu_{50}^{01}$, $\mu_{50}^{02}$, $\mu_{50}^{03}$, $\mu_{50}^{10}$, $\mu_{50}^{12}$.
(b) Provide a 95% confidence interval for each of the quantities in part (a).
(c) It is of interest to analyze whether the force of mortality for disabled persons
is equal to 25 times that for healthy persons. Determine the answer based on
a 95% confidence interval for the quantity $\mu_{50}^{12} - 25\mu_{50}^{02}$.

Solution:

(a) The quantities are estimated as follows:


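\[
\hat\mu_{50}^{01} = \frac{d_{50}^{01}}{v_{50}^{00}} = 0.01688; \quad
\hat\mu_{50}^{02} = \frac{d_{50}^{02}}{v_{50}^{00}} = 0.001037; \quad
\hat\mu_{50}^{03} = \frac{d_{50}^{03}}{v_{50}^{00}} = 0.01129;
\]
\[
\hat\mu_{50}^{10} = \frac{d_{50}^{10}}{v_{50}^{11}} = 0.3002; \quad
\hat\mu_{50}^{12} = \frac{d_{50}^{12}}{v_{50}^{11}} = 0.03074.
\]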

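(b) The upper 0.025-quantile of the standard normal distribution is 1.96. Each estimator
is approximately normal with estimated standard error $\hat\mu/\sqrt{d}$. Hence,
the CIs are:

\[
\mu_{50}^{01}: \quad \hat\mu_{50}^{01} \pm 1.96 \times \frac{\hat\mu_{50}^{01}}{\sqrt{d_{50}^{01}}} = [0.01495, 0.01882];
\]
\[
\mu_{50}^{02}: \quad \hat\mu_{50}^{02} \pm 1.96 \times \frac{\hat\mu_{50}^{02}}{\sqrt{d_{50}^{02}}} = [0.000558, 0.001516];
\]
\[
\mu_{50}^{03}: \quad \hat\mu_{50}^{03} \pm 1.96 \times \frac{\hat\mu_{50}^{03}}{\sqrt{d_{50}^{03}}} = [0.00971, 0.01287];
\]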

STAT3951 Example Class 1 1 Spring 2024


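\[
\mu_{50}^{10}: \quad \hat\mu_{50}^{10} \pm 1.96 \times \frac{\hat\mu_{50}^{10}}{\sqrt{d_{50}^{10}}} = [0.2545, 0.3458];
\]
\[
\mu_{50}^{12}: \quad \hat\mu_{50}^{12} \pm 1.96 \times \frac{\hat\mu_{50}^{12}}{\sqrt{d_{50}^{12}}} = [0.01613, 0.04535].
\]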
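The estimates in (a) and the intervals in (b) can be reproduced with a few lines of Python. This is a minimal sketch, not the course's own code; the dictionary layout and the helper name `estimate_with_ci` are assumptions made for illustration.

```python
import math

# Exposure by originating state and transition counts at age 50 (problem data).
exposure = {"0": 17355.0, "1": 553.0}
transitions = {"01": 293, "02": 18, "03": 196, "10": 166, "12": 17}


def estimate_with_ci(d, v, z=1.96):
    """Return (mu_hat, lower, upper) with mu_hat = d / v and SE = mu_hat / sqrt(d)."""
    mu_hat = d / v
    se = mu_hat / math.sqrt(d)
    return mu_hat, mu_hat - z * se, mu_hat + z * se


# The first character of each key is the originating state, which selects the exposure.
results = {key: estimate_with_ci(d, exposure[key[0]]) for key, d in transitions.items()}

for key, (mu_hat, lower, upper) in results.items():
    print(f"mu^{key}_50: {mu_hat:.6f}   95% CI [{lower:.6f}, {upper:.6f}]")
```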

(c) Note that $\hat\mu_{50}^{12} \overset{\cdot}{\sim} N(0.03074, 5.559 \times 10^{-5})$,
$\hat\mu_{50}^{02} \overset{\cdot}{\sim} N(0.001037, 5.976 \times 10^{-8})$,
and $\hat\mu_{50}^{12}$ is asymptotically independent of $\hat\mu_{50}^{02}$. Thus, the quantity
$\hat\mu_{50}^{12} - 25\hat\mu_{50}^{02}$ is approximately normal with mean

\[
0.03074 - 0.001037 \times 25 = 0.004812
\]

and variance

\[
5.559 \times 10^{-5} + 25^2 \times 5.976 \times 10^{-8} = 9.294 \times 10^{-5}.
\]

A 95% CI for $\mu_{50}^{12} - 25\mu_{50}^{02}$ is given by
$0.004812 \pm 1.96 \times \sqrt{9.294 \times 10^{-5}} = [-0.014, 0.0237]$.
Since this CI includes zero, we do not have enough evidence
to reject the claim that $\mu_{50}^{12} = 25\mu_{50}^{02}$.
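The arithmetic in (c) can be checked as follows. This is a small sketch under the same assumptions as the solution (asymptotic normality and independence of the two estimators); all variable names are illustrative.

```python
import math

v0, v1 = 17355.0, 553.0   # exposures in states 0 (healthy) and 1 (disabled)
d02, d12 = 18, 17         # deaths from the healthy and the disabled state

mu02, mu12 = d02 / v0, d12 / v1
var02, var12 = mu02**2 / d02, mu12**2 / d12   # Var(mu_hat) is approximated by mu_hat^2 / d

k = 25
diff = mu12 - k * mu02
var_diff = var12 + k**2 * var02               # no covariance term: independence assumed
half_width = 1.96 * math.sqrt(var_diff)
ci = (diff - half_width, diff + half_width)
contains_zero = ci[0] <= 0.0 <= ci[1]

print(f"estimate {diff:.6f}, variance {var_diff:.3e}, 95% CI [{ci[0]:.4f}, {ci[1]:.4f}]")
print("CI contains zero:", contains_zero)
```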

2. Information on the exposure, the force of mortality extracted from a standard table, and
the observed deaths at 10 ages is available below:

Age $x$          30      35      40      45      50
$E_x^c$         726   1,224   1,472   1,369     985
$\mu_x^s$    0.0008  0.0011  0.0016  0.0037  0.0065
$d_x$             0       2       2       1       5

Age $x$          55      60      65      70      75
$E_x^c$         776     643     390     311     223
$\mu_x^s$    0.0093  0.0112  0.0186  0.0335  0.0875
$d_x$            10       9       6      15      16

(a) Calculate the standardized deviations (𝑧𝑥 ’s) and conduct a chi-squared test at
the 5% significance level to check whether the observed mortality follows that
of the standard table.
(b) Explain why it may not be desirable to conduct a standardized deviations test
based on five intervals on the real line, like the one in the example in the lecture
notes.
(c) Conduct a standardized deviations test at the 5% significance level based on
the interval (−2/3, 2/3), to check whether the deviations are roughly normally
distributed.
(d) Can you find any other abnormalities by inspecting the values of the 𝑧𝑥 ’s?

Solution:

(a) Using the formula $z_x = (d_x - E_x^c \mu_x^s)/\sqrt{E_x^c \mu_x^s}$, we calculate the values as
follows:

Age $x$       30      35      40      45      50
$z_x$     −0.762   0.563  −0.231  −1.806  −0.554

Age $x$       55      60      65      70      75
$z_x$      1.036   0.670  −0.466   1.419  −0.795

The test statistic is $X = z_{30}^2 + z_{35}^2 + \cdots + z_{75}^2 = 8.908$, which is smaller than the
0.95 quantile of the $\chi_{10}^2$ distribution (18.307). Hence, we do not reject $H_0$, and
conclude that the observed mortality is consistent with the rates in the standard
table.
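The standardized deviations and the chi-squared statistic can be recomputed directly from the table. The sketch below uses only the standard library, so the critical value 18.307 is copied from the solution rather than computed; the list names are illustrative.

```python
import math

# Data for ages 30, 35, ..., 75 (from the table in the question).
exposure = [726, 1224, 1472, 1369, 985, 776, 643, 390, 311, 223]
mu_std = [0.0008, 0.0011, 0.0016, 0.0037, 0.0065, 0.0093, 0.0112, 0.0186, 0.0335, 0.0875]
deaths = [0, 2, 2, 1, 5, 10, 9, 6, 15, 16]

# Expected deaths under the standard table, then z_x = (d_x - E mu) / sqrt(E mu).
expected = [e * m for e, m in zip(exposure, mu_std)]
z = [(d - ex) / math.sqrt(ex) for d, ex in zip(deaths, expected)]

chi2_stat = sum(zi**2 for zi in z)
CRITICAL_95_DF10 = 18.307   # 0.95 quantile of chi-squared(10), as quoted in the solution
reject = chi2_stat > CRITICAL_95_DF10

print([round(zi, 3) for zi in z])
print(f"chi-squared statistic = {chi2_stat:.3f}, reject H0: {reject}")
```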
(b) There are only 10 values of 𝑥 available; the expected counts in the five intervals
will be too small for the chi-squared test to be valid.
(c) 4 out of the 10 standardized deviations are within the interval, and the rest
are outside the interval. To check whether this is consistent with the 𝑌 ∼
Bin(10, 0.5) distribution, we calculate the p-value (i.e., the probability of
observing something more extreme) as
\[
\sum_{k=0}^{4} P(Y = k) + \sum_{m=6}^{10} P(Y = m) = 1 - P(Y = 5) = 1 - \binom{10}{5} 0.5^{10} = 0.754.
\]

This is much larger than 0.05; we do not have enough evidence to reject the
null hypothesis, and we conclude that it is consistent with the Bin(10, 0.5)
distribution.
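The count and the two-sided binomial p-value in (c) can be verified as below. This sketch takes "more extreme" to mean at least as far from the expected count of 5 as the observed count, which reproduces the sum in the solution; the rounded $z_x$ values are copied from part (a).

```python
from math import comb

z = [-0.762, 0.563, -0.231, -1.806, -0.554, 1.036, 0.670, -0.466, 1.419, -0.795]
n = len(z)

# Number of standardized deviations falling inside (-2/3, 2/3).
inside = sum(1 for zi in z if abs(zi) < 2 / 3)

# Bin(n, 0.5) probabilities; sum those at least as far from n/2 as the observed count.
pmf = [comb(n, k) * 0.5**n for k in range(n + 1)]
distance = abs(inside - n / 2)
p_value = sum(p for k, p in enumerate(pmf) if abs(k - n / 2) >= distance)

print(f"{inside} of {n} inside the interval, p-value = {p_value:.3f}")
```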
(d) There are 4 positive and 6 negative deviations, and thus they are not biased
in one direction. There is one value above 1.645 in absolute value; this is
consistent with the 10% probability for a standard normal random variable to
be below −1.645 or above 1.645.

3. The following table records 9 measurements of pasture yield (responses $X_i$),
observed respectively at 9 fixed growing times (covariates $t_i$), where

\[
X_i = u_{\beta_1,\beta_2}(t_i) + \epsilon_i,
\]

and the errors $\epsilon_1, \ldots, \epsilon_n$ are independent $N(0, \nu)$.

$i$                       1      2      3      4      5      6      7      8      9
Growing time, $t_i$       9     14     21     28     42     57     63     70     79
Pasture yield, $X_i$   8.93  10.80  18.59  22.33  39.35  56.11  61.73  64.62  67.08

Two different choices are proposed for the regression function $u_{\beta_1,\beta_2}(t)$ as follows:
(A) $u_{\beta_1,\beta_2}(t) = \beta_1 + \beta_2 t$
(B) $u_{\beta_1,\beta_2}(t) = \beta_1 / (1 + 14 e^{-\beta_2 t})$
Answer the following questions for each of the above regression functions (A) and
(B).

(i) Calculate the maximum likelihood estimates $(\hat\beta_1, \hat\beta_2, \hat\nu)$ of $(\beta_1, \beta_2, \nu)$.
(ii) Let $I(\beta_1, \beta_2, \nu)$ denote the Fisher information matrix. Calculate $I(\beta_1, \beta_2, \nu)^{-1}$.
(iii) Plot $X_i$ against $t_i$ on a graph. Plot also the fitted regression function $u_{\hat\beta_1,\hat\beta_2}(t)$
on the same graph.
(iv) We are interested in estimating the expected pasture yields at growing times 0
and 80, i.e., $u_{\beta_1,\beta_2}(0)$ and $u_{\beta_1,\beta_2}(80)$, respectively. Calculate their maximum
likelihood estimates.
(v) Report the standard errors of the maximum likelihood estimators found in (iv).
(vi) Which of the two regression functions, (A) or (B), would you prefer for modelling
the pasture yield data? Explain.

Solution:

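(i) For function A,

\[
X_i = \beta_1 + \beta_2 t_i + \epsilon_i.
\]

Hence the likelihood function $L$ and the log-likelihood function $S$ (up to an additive constant) are

\[
L(\mathbf{X} \mid \beta_1, \beta_2, \nu) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\nu}} \exp\left( -\frac{(X_i - \beta_1 - \beta_2 t_i)^2}{2\nu} \right),
\]
\[
S(\beta_1, \beta_2, \nu) = -\frac{n}{2} \ln \nu - \sum_{i=1}^{n} \frac{(X_i - \beta_1 - \beta_2 t_i)^2}{2\nu}.
\]

Then we set the partial derivatives of $S$ to zero:

\[
\frac{\partial}{\partial \beta_1} S(\beta_1, \beta_2, \nu) = \frac{1}{\nu} \sum_{i=1}^{n} (X_i - \beta_1 - \beta_2 t_i) = 0,
\]
\[
\frac{\partial}{\partial \beta_2} S(\beta_1, \beta_2, \nu) = \frac{1}{\nu} \sum_{i=1}^{n} t_i (X_i - \beta_1 - \beta_2 t_i) = 0,
\]
\[
\frac{\partial}{\partial \nu} S(\beta_1, \beta_2, \nu) = -\frac{n}{2\nu} + \frac{1}{2\nu^2} \sum_{i=1}^{n} (X_i - \beta_1 - \beta_2 t_i)^2 = 0.
\]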

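The MLEs are

\[
\hat\beta_2 = \frac{\sum_{i=1}^{n} t_i \sum_{j=1}^{n} X_j - n \sum_{i=1}^{n} t_i X_i}{\left( \sum_{i=1}^{n} t_i \right)^2 - n \sum_{i=1}^{n} t_i^2} = 0.9265,
\]
\[
\hat\beta_1 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat\beta_2 t_i) = -0.592,
\]
\[
\hat\nu = \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat\beta_1 - \hat\beta_2 t_i)^2 = 8.392.
\]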
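The closed-form estimates for function A can be evaluated directly from the data table. The sketch below is an illustration, not the notebook referred to in the solutions; it uses the slope formula with both numerator and denominator multiplied by −1, which gives the same value.

```python
t = [9, 14, 21, 28, 42, 57, 63, 70, 79]
x = [8.93, 10.80, 18.59, 22.33, 39.35, 56.11, 61.73, 64.62, 67.08]
n = len(t)

sum_t, sum_x = sum(t), sum(x)
sum_tt = sum(ti * ti for ti in t)
sum_tx = sum(ti * xi for ti, xi in zip(t, x))

# Slope and intercept solve the two score equations for beta_1 and beta_2.
beta2_hat = (n * sum_tx - sum_t * sum_x) / (n * sum_tt - sum_t**2)
beta1_hat = (sum_x - beta2_hat * sum_t) / n

# The MLE of the variance divides the residual sum of squares by n (not n - 2).
nu_hat = sum((xi - beta1_hat - beta2_hat * ti) ** 2 for ti, xi in zip(t, x)) / n

print(f"beta1_hat = {beta1_hat:.4f}, beta2_hat = {beta2_hat:.4f}, nu_hat = {nu_hat:.3f}")
```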

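For function B,

\[
S(\beta_1, \beta_2, \nu) = -\frac{n}{2} \ln \nu - \sum_{i=1}^{n} \frac{1}{2\nu} \left( X_i - \frac{\beta_1}{1 + 14 e^{-\beta_2 t_i}} \right)^2.
\]

Setting the partial derivatives of $S$ to zero, we have the following equations:

\[
\frac{\partial}{\partial \beta_1} S(\beta_1, \beta_2, \nu) = \frac{1}{\nu} \sum_{i=1}^{n} (1 + 14 e^{-\beta_2 t_i})^{-1} \left( X_i - \frac{\beta_1}{1 + 14 e^{-\beta_2 t_i}} \right) = 0,
\]
\[
\frac{\partial}{\partial \beta_2} S(\beta_1, \beta_2, \nu) = \frac{1}{\nu} \sum_{i=1}^{n} \frac{14 \beta_1 t_i e^{-\beta_2 t_i}}{(1 + 14 e^{-\beta_2 t_i})^2} \left( X_i - \frac{\beta_1}{1 + 14 e^{-\beta_2 t_i}} \right) = 0,
\]
\[
\frac{\partial}{\partial \nu} S(\beta_1, \beta_2, \nu) = -\frac{n}{2\nu} + \sum_{i=1}^{n} \frac{1}{2\nu^2} \left( X_i - \frac{\beta_1}{1 + 14 e^{-\beta_2 t_i}} \right)^2 = 0.
\]

Solving the above equations numerically, we have $\hat\beta_1 = 72.28575$, $\hat\beta_2 = 0.0680184$, $\hat\nu = 0.903530$.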
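The solutions do not say which numerical method was used. One possible approach, sketched below, is Gauss-Newton iteration with step halving on the residual sum of squares: maximizing $S$ over $(\beta_1, \beta_2)$ is a least-squares problem, and $\hat\nu$ is then the mean squared residual. The starting values (70, 0.07) and all function names are assumptions made for this illustration.

```python
import math

t = [9, 14, 21, 28, 42, 57, 63, 70, 79]
x = [8.93, 10.80, 18.59, 22.33, 39.35, 56.11, 61.73, 64.62, 67.08]


def mean_B(b1, b2, ti):
    return b1 / (1.0 + 14.0 * math.exp(-b2 * ti))


def rss(b1, b2):
    return sum((xi - mean_B(b1, b2, ti)) ** 2 for ti, xi in zip(t, x))


def gauss_newton(b1, b2, max_iter=100, tol=1e-12):
    for _ in range(max_iter):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for ti, xi in zip(t, x):
            e = 14.0 * math.exp(-b2 * ti)
            d = 1.0 + e
            j1 = 1.0 / d                  # d(mean)/d(beta_1)
            j2 = b1 * ti * e / d**2       # d(mean)/d(beta_2)
            r = xi - b1 / d               # residual
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        det = a11 * a22 - a12 * a12
        s1 = (a22 * g1 - a12 * g2) / det  # solve (J^T J) s = J^T r
        s2 = (a11 * g2 - a12 * g1) / det
        step, current = 1.0, rss(b1, b2)
        while rss(b1 + step * s1, b2 + step * s2) > current and step > 1e-8:
            step /= 2.0                   # halve the step until the fit does not worsen
        b1, b2 = b1 + step * s1, b2 + step * s2
        if abs(step * s1) + abs(step * s2) < tol:
            break
    return b1, b2


beta1_hat, beta2_hat = gauss_newton(70.0, 0.07)   # assumed starting values
nu_hat = rss(beta1_hat, beta2_hat) / len(t)
print(f"beta1_hat = {beta1_hat:.5f}, beta2_hat = {beta2_hat:.7f}, nu_hat = {nu_hat:.6f}")
```

A library optimizer would serve equally well here; the hand-written iteration just keeps the example free of external dependencies.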
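(ii) Let $\boldsymbol\beta = (\beta_1, \beta_2)^\top$ and write $\mu_{\boldsymbol\beta}(t)$ for the regression function $u_{\beta_1,\beta_2}(t)$. Then

\[
\frac{\partial}{\partial \boldsymbol\beta} \ln f(\mathbf{X} \mid \boldsymbol\beta, \nu) = \nu^{-1} \sum_{i=1}^{n} (X_i - \mu_{\boldsymbol\beta}(t_i)) \frac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta},
\]
\[
\frac{\partial}{\partial \nu} \ln f(\mathbf{X} \mid \boldsymbol\beta, \nu) = -\frac{n}{2\nu} + \frac{1}{2\nu^2} \sum_{i=1}^{n} (X_i - \mu_{\boldsymbol\beta}(t_i))^2,
\]
\[
\frac{\partial^2}{\partial \boldsymbol\beta \, \partial \boldsymbol\beta^\top} \ln f(\mathbf{X} \mid \boldsymbol\beta, \nu) = \nu^{-1} \sum_{i=1}^{n} \left[ -\frac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta} \frac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta^\top} + (X_i - \mu_{\boldsymbol\beta}(t_i)) \frac{\partial^2 \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta \, \partial \boldsymbol\beta^\top} \right],
\]
\[
\frac{\partial^2}{\partial \boldsymbol\beta \, \partial \nu} \ln f(\mathbf{X} \mid \boldsymbol\beta, \nu) = -\frac{1}{\nu^2} \sum_{i=1}^{n} (X_i - \mu_{\boldsymbol\beta}(t_i)) \frac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta},
\]
\[
\frac{\partial^2}{\partial \nu^2} \ln f(\mathbf{X} \mid \boldsymbol\beta, \nu) = \frac{n}{2\nu^2} - \frac{1}{\nu^3} \sum_{i=1}^{n} (X_i - \mu_{\boldsymbol\beta}(t_i))^2.
\]

Taking expectations of the negated second derivatives, with $E[X_i - \mu_{\boldsymbol\beta}(t_i)] = 0$ and
$E[(X_i - \mu_{\boldsymbol\beta}(t_i))^2] = \nu$, the Fisher information matrix is

\[
I(\boldsymbol\beta, \nu) = \begin{pmatrix} \nu^{-1} \sum_{i=1}^{n} \dfrac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta} \dfrac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta^\top} & 0 \\ 0 & \dfrac{n}{2\nu^2} \end{pmatrix}
\]

and its inverse is

\[
I(\boldsymbol\beta, \nu)^{-1} = \begin{pmatrix} \nu \left( \sum_{i=1}^{n} \dfrac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta} \dfrac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta^\top} \right)^{-1} & 0 \\ 0 & \dfrac{2\nu^2}{n} \end{pmatrix}.
\]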
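For function A, $\mu_{\boldsymbol\beta}(t) = \beta_1 + \beta_2 t$, so

\[
\frac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta} \frac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta^\top} = \begin{pmatrix} 1 & t_i \\ t_i & t_i^2 \end{pmatrix}.
\]

Evaluating at the MLE, we have

\[
I(\boldsymbol\beta, \nu)^{-1} = \begin{pmatrix} \begin{pmatrix} \dfrac{n}{\nu} & \dfrac{\sum_{i=1}^{n} t_i}{\nu} \\ \dfrac{\sum_{i=1}^{n} t_i}{\nu} & \dfrac{\sum_{i=1}^{n} t_i^2}{\nu} \end{pmatrix}^{-1} & 0 \\ 0 & \dfrac{2\nu^2}{n} \end{pmatrix} = \begin{pmatrix} 3.79 & -0.06705 & 0 \\ -0.06705 & 0.001576 & 0 \\ 0 & 0 & 15.65 \end{pmatrix}.
\]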
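For function A the inverse Fisher information has a closed form, since the $\boldsymbol\beta$ block is $\nu$ times the inverse of a 2×2 matrix of sums. The sketch below evaluates it at $\hat\nu = 8.392$; the variable names are illustrative.

```python
t = [9, 14, 21, 28, 42, 57, 63, 70, 79]
n = len(t)
nu_hat = 8.392   # MLE of the error variance for function A

sum_t = sum(t)
sum_tt = sum(ti * ti for ti in t)

# Inverse of [[n, sum_t], [sum_t, sum_tt]] scaled by nu_hat gives the beta block.
det = n * sum_tt - sum_t**2
cov_beta = [
    [nu_hat * sum_tt / det, -nu_hat * sum_t / det],
    [-nu_hat * sum_t / det, nu_hat * n / det],
]
var_nu = 2.0 * nu_hat**2 / n   # the (3, 3) entry of the inverse information

for row in cov_beta:
    print([f"{v:.6f}" for v in row])
print(f"var_nu = {var_nu:.2f}")
```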


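For function B, $\mu_{\boldsymbol\beta}(t_i) = \dfrac{\beta_1}{1 + 14 e^{-\beta_2 t_i}}$, so

\[
\frac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta} \frac{\partial \mu_{\boldsymbol\beta}(t_i)}{\partial \boldsymbol\beta^\top} = \begin{pmatrix} (1 + 14 e^{-\beta_2 t_i})^{-2} & (1 + 14 e^{-\beta_2 t_i})^{-1} \cdot \dfrac{14 \beta_1 t_i e^{-\beta_2 t_i}}{(1 + 14 e^{-\beta_2 t_i})^2} \\ (1 + 14 e^{-\beta_2 t_i})^{-1} \cdot \dfrac{14 \beta_1 t_i e^{-\beta_2 t_i}}{(1 + 14 e^{-\beta_2 t_i})^2} & \left( \dfrac{14 \beta_1 t_i e^{-\beta_2 t_i}}{(1 + 14 e^{-\beta_2 t_i})^2} \right)^2 \end{pmatrix}.
\]

Hence the inverse of the Fisher information matrix, evaluated at the MLE, is

\[
I(\boldsymbol\beta, \nu)^{-1} = \begin{pmatrix} 1.5428 & -0.001817 & 0 \\ -0.001817 & 2.573 \times 10^{-6} & 0 \\ 0 & 0 & 0.1814 \end{pmatrix},
\]

where the numerical computations can be found in the notebook provided.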


(iii) Please see the attached Jupyter notebook
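(iv) The maximum likelihood estimates are

\[
\hat\mu_{\boldsymbol\beta}^{A}(0) = \hat\beta_1 + \hat\beta_2 \cdot 0 = -0.592,
\]
\[
\hat\mu_{\boldsymbol\beta}^{A}(80) = \hat\beta_1 + \hat\beta_2 \cdot 80 = 73.53,
\]
\[
\hat\mu_{\boldsymbol\beta}^{B}(0) = \frac{\hat\beta_1}{1 + 14 e^{-\hat\beta_2 \cdot 0}} = 4.819,
\]
\[
\hat\mu_{\boldsymbol\beta}^{B}(80) = \frac{\hat\beta_1}{1 + 14 e^{-\hat\beta_2 \cdot 80}} = 68.15146.
\]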
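By the invariance property of maximum likelihood, these values come from plugging the fitted parameters into each regression function. A minimal sketch, with the fitted values typed in as constants and function names chosen for illustration:

```python
import math

BETA_A = (-0.592, 0.9265)          # (intercept, slope) fitted for function A
BETA_B = (72.28575, 0.0680184)     # (beta_1, beta_2) fitted for function B


def mu_A(time):
    return BETA_A[0] + BETA_A[1] * time


def mu_B(time):
    return BETA_B[0] / (1.0 + 14.0 * math.exp(-BETA_B[1] * time))


for time in (0, 80):
    print(f"t = {time:2d}: function A gives {mu_A(time):.3f}, function B gives {mu_B(time):.3f}")
```

Function A predicts a negative yield at time 0, while function B stays positive by construction.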

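(v) Let $T_A = \mu_{\hat{\boldsymbol\beta}}^{A}(a)$ be the maximum likelihood estimator of $\mu_{\boldsymbol\beta}^{A}(a)$ and
$T_B = \mu_{\hat{\boldsymbol\beta}}^{B}(b)$ be the maximum likelihood estimator of $\mu_{\boldsymbol\beta}^{B}(b)$. Then, by the delta method,

\[
\mathrm{Var}(T_A)\big|_{a=0} \approx \begin{pmatrix} \dfrac{\partial \mu^A}{\partial \beta_1} & \dfrac{\partial \mu^A}{\partial \beta_2} & \dfrac{\partial \mu^A}{\partial \nu} \end{pmatrix} I_A(\boldsymbol\beta, \nu)^{-1} \begin{pmatrix} \dfrac{\partial \mu^A}{\partial \beta_1} \\ \dfrac{\partial \mu^A}{\partial \beta_2} \\ \dfrac{\partial \mu^A}{\partial \nu} \end{pmatrix}
= \left[ \begin{pmatrix} 1 & a & 0 \end{pmatrix} I_A(\boldsymbol\beta, \nu)^{-1} \begin{pmatrix} 1 \\ a \\ 0 \end{pmatrix} \right]_{a=0} = 3.786,
\]

which is the $(1,1)$ entry of $I_A(\boldsymbol\beta, \nu)^{-1}$. Hence the SD at $a = 0$ is 1.946. Similarly,
$\mathrm{Var}(T_A)|_{a=80} = 3.1416$ and the SD at $a = 80$ is 1.7724. Note that we have

\[
\frac{\partial \mu^B}{\partial \beta_1} = \frac{1}{1 + 14 e^{-\beta_2 t}}, \qquad \frac{\partial \mu^B}{\partial \beta_2} = \frac{14 \beta_1 t e^{-\beta_2 t}}{(1 + 14 e^{-\beta_2 t})^2}.
\]

Hence we have

\[
\mathrm{Var}(T_B)\big|_{b=0} \approx \begin{pmatrix} \dfrac{\partial \mu^B}{\partial \beta_1} & \dfrac{\partial \mu^B}{\partial \beta_2} & \dfrac{\partial \mu^B}{\partial \nu} \end{pmatrix} I_B(\boldsymbol\beta, \nu)^{-1} \begin{pmatrix} \dfrac{\partial \mu^B}{\partial \beta_1} \\ \dfrac{\partial \mu^B}{\partial \beta_2} \\ \dfrac{\partial \mu^B}{\partial \nu} \end{pmatrix}
= \begin{pmatrix} \dfrac{1}{1 + 14} & 0 & 0 \end{pmatrix} I_B(\boldsymbol\beta, \nu)^{-1} \begin{pmatrix} \dfrac{1}{1 + 14} \\ 0 \\ 0 \end{pmatrix} = 0.00686.
\]

Hence the SD at $b = 0$ is 0.0828. Similarly, $\mathrm{Var}(T_B)|_{b=80} = 0.5530$ and the
SD at $b = 80$ is 0.7437.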
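The delta-method standard errors for both functions can be computed from the gradients of the regression functions. In this sketch the $\nu$ component is dropped because neither regression function depends on $\nu$ and the inverse information is block diagonal; the fitted values are typed in as constants and the helper names are assumptions.

```python
import math

t = [9, 14, 21, 28, 42, 57, 63, 70, 79]
B1, B2 = 72.28575, 0.0680184       # fitted parameters of function B


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def beta_covariance(gradients, nu):
    """Beta block of the inverse Fisher information: nu * (sum_i g_i g_i^T)^(-1)."""
    s11 = sum(g[0] * g[0] for g in gradients)
    s12 = sum(g[0] * g[1] for g in gradients)
    s22 = sum(g[1] * g[1] for g in gradients)
    inv = inverse_2x2([[s11, s12], [s12, s22]])
    return [[nu * v for v in row] for row in inv]


def delta_method_se(grad, cov):
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(2) for j in range(2))
    return math.sqrt(var)


def grad_A(time):
    return (1.0, float(time))


def grad_B(time):
    e = 14.0 * math.exp(-B2 * time)
    d = 1.0 + e
    return (1.0 / d, B1 * time * e / d**2)


cov_A = beta_covariance([grad_A(ti) for ti in t], nu=8.392)
cov_B = beta_covariance([grad_B(ti) for ti in t], nu=0.903530)

se_A = {a: delta_method_se(grad_A(a), cov_A) for a in (0, 80)}
se_B = {b: delta_method_se(grad_B(b), cov_B) for b in (0, 80)}

print("function A:", {k: round(v, 4) for k, v in se_A.items()})
print("function B:", {k: round(v, 4) for k, v in se_B.items()})
```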
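(vi) Function B may be the better choice: it gives smaller standard errors when
estimating the expected pasture yield, and it fits the raw data better, as measured by
the mean squared prediction error $\frac{1}{n} \sum_{i} (X_i - \mu_{\hat{\boldsymbol\beta}}(t_i))^2$, which equals
$\hat\nu$ (8.392 for function A against 0.9035 for function B).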
