GLM Slides 5 Continuous Response
In practice, there are many situations where the response is continuous but the
normal distribution does not fit:
response can only take nonnegative values, Y ≥ 0
conditional probability that Y ≥ k increases when k increases
variance is not constant
Examples:
insurance claims
time to failure (of an experiment or a machine)
amount of rainfall
The distributions that may suit such scenarios are the gamma, lognormal, or
inverse Gaussian distributions
Coefficient of variation
$CV = \sqrt{DY} / EY$
is a dimensionless measure of variability (often given in %)
allows comparing the variability of variables measured on different scales
is also called relative standard deviation (RSD)
if µ → 0 then CV → ∞, i.e. it is sensitive to values of µ
CV is often used in situations where the r.v. of interest is (related to) exponential
Then $\sqrt{DY} = EY \Rightarrow CV = 1$
CV < 1 (Erlang distribution) – small relative variability
CV > 1 (hyperexp. dist. = mixture of exponentials) – large relative variability
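The three cases above can be checked by simulation. The following is only an illustrative sketch: the helper name `cv`, the sample size, and all shape, rate, and mixing values are assumptions chosen for the example, not values from the slides.

```r
# Sample coefficient of variation: standard deviation relative to the mean
cv <- function(x) sd(x) / mean(x)

set.seed(1)
n <- 1e5

# Exponential: sqrt(DY) = EY, so the CV should be close to 1
y_exp <- rexp(n, rate = 2)

# Erlang = gamma with integer shape k; theoretical CV = 1 / sqrt(k) = 0.5 for k = 4
y_erlang <- rgamma(n, shape = 4, rate = 2)

# Hyperexponential: 50/50 mixture of two exponentials with different rates
slow <- rbinom(n, size = 1, prob = 0.5) == 1
y_hyper <- ifelse(slow, rexp(n, rate = 0.5), rexp(n, rate = 5))

cv_values <- c(exponential = cv(y_exp),
               erlang      = cv(y_erlang),
               hyperexp    = cv(y_hyper))
print(round(cv_values, 2))
```

The simulated values should land near 1 for the exponential, below 1 for the Erlang, and above 1 for the mixture.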
GLM (MTMS.01.011) Lecture 5 3 / 35
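Coefficient of variation for known distributions
Distribution                 EY                  DY                   CV
$N(\mu, \sigma^2)$           $\mu$               $\sigma^2$           $\sigma/\mu$
$B(n, \pi)$                  $n\pi$              $n\pi(1-\pi)$        $\sqrt{(1-\pi)/(n\pi)}$
$Po(\lambda)$                $\lambda$           $\lambda$            $1/\sqrt{\lambda}$
$\Gamma(\alpha, \lambda)$    $\alpha/\lambda$    $\alpha/\lambda^2$   $1/\sqrt{\alpha}$
$Exp(\lambda)$               $1/\lambda$         $1/\lambda^2$        $1$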
In other words, let us assume that CV = α, which also means that there is a
linear relation between the std. deviation and the mean:
$$\sqrt{DY} = \alpha \cdot EY$$
Properties:
if $\lambda = \nu/\mu = 1/2$ then we have the $\chi^2_{2\nu}$-distribution
if ν = 1 then we have the exponential distribution
if 0 < ν ≤ 1, the density is monotone decreasing, otherwise unimodal and
stretched to the right
if $Y_i \sim \Gamma(\nu_i, \lambda)$ are independent, their sum has a gamma distribution with
shape $\sum_{i=1}^{n} \nu_i$ and rate λ
if ν → ∞, the gamma distribution converges to the normal distribution
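Some of the properties above can be verified numerically in base R. This is a sketch under assumed values: the shape, rate, evaluation points, and simulation size are arbitrary choices for illustration.

```r
nu <- 2.5                     # arbitrary shape value
q  <- c(0.5, 1, 2, 5, 10)     # arbitrary evaluation points

# Shape nu with rate 1/2 coincides with the chi-squared distribution, 2 * nu d.f.
chisq_match <- all.equal(pgamma(q, shape = nu, rate = 0.5),
                         pchisq(q, df = 2 * nu))

# Shape 1 reduces to the exponential distribution with the same rate
exp_match <- all.equal(dgamma(q, shape = 1, rate = 0.7),
                       dexp(q, rate = 0.7))

# Sum of independent gammas with a common rate: compare simulated moments
# with those of a gamma with shape sum(shapes) and the same rate
set.seed(4)
shapes <- c(0.5, 1.5, 2)
rate   <- 2
total  <- rowSums(sapply(shapes, function(s) rgamma(1e5, shape = s, rate = rate)))

simulated <- c(mean = mean(total), var = var(total))
theory    <- c(mean = sum(shapes) / rate, var = sum(shapes) / rate^2)

print(c(chisq = isTRUE(chisq_match), exponential = isTRUE(exp_match)))
print(rbind(simulated, theory))
```

Matching the first two moments does not prove the sum is gamma distributed; it is only a quick sanity check of the stated shape and rate.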
Lindsey (1995). Data from Liege, Belgium 1984, the duration of marriages
(n = 1699)
Marriage is sometimes thought of as having three periods of different relationships
through which a couple goes, so that gamma distribution with ν = 3 might be
suitable
Empirical mean is $\bar{y} = 13.85$ years and variance is $s_y^2 = 75.9$
Parameter estimates: $\hat{\nu} = \bar{y}^2 / s_y^2 = 2.53$; $\hat{\mu} = 13.85$
Histogram (next slide) shows a good fit
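The parameter estimates above follow from matching moments, using EY = µ and DY = µ²/ν for the gamma distribution. A minimal sketch reproducing the arithmetic from the two summary statistics (the variable names are illustrative; the raw data are not used here):

```r
y_bar <- 13.85   # empirical mean duration in years, from the slide
s2_y  <- 75.9    # empirical variance, from the slide

# Moment matching: mu = EY and nu = (EY)^2 / DY
mu_hat   <- y_bar
nu_hat   <- y_bar^2 / s2_y
rate_hat <- nu_hat / mu_hat   # implied rate, lambda = nu / mu

print(round(c(mu = mu_hat, nu = nu_hat, rate = rate_hat), 3))
```

The estimated shape rounds to 2.53, somewhat below the value ν = 3 suggested by the three-period interpretation.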
$$g(\mu_i) = \frac{1}{\mu_i}$$
Why?
Note that by construction we have a restriction µi > 0, which also implies ηi > 0
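The restriction can be seen directly from the inverse link functions stored in R's `Gamma` family objects. A small sketch, with hypothetical values of the linear predictor:

```r
fam_inv <- Gamma(link = "inverse")
fam_log <- Gamma(link = "log")

eta <- c(-0.5, 0.25, 2)   # hypothetical linear predictor values

# Inverse link: mu = 1 / eta, so a negative eta gives an invalid negative mean
mu_inv <- fam_inv$linkinv(eta)

# Log link: mu = exp(eta) is positive for every real eta
mu_log <- fam_log$linkinv(eta)

print(rbind(eta = eta, inverse = mu_inv, log = mu_log))
```

This is one reason the log link is a common alternative for gamma models: it keeps the fitted means positive without constraining the linear predictor.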
Bliss (1970) estimated a hyperbolic model using gamma distribution and log-link
Results (x = log10 u, see next slide for the explanation of this transform):
$$R^2_{RSS} = 1 - \frac{RSS(\hat{\mu})}{RSS(\bar{\mu})}$$
M. Mittlböck, H. Heinzl (2002). Measures of explained variation in Gamma regression models. Commun. Statist. – Simulat. and Comput., 31(1), 67-73.
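As a sketch of how this measure could be computed for a fitted gamma GLM, the example below uses simulated data. The data-generating values are assumptions for illustration, and reading RSS(µ̄) as the residual sum of squares around the overall mean (the intercept-only fit) is also an assumption.

```r
set.seed(2)
n  <- 500
x  <- runif(n)
mu <- exp(1 + 2 * x)                       # assumed true mean structure
y  <- rgamma(n, shape = 5, rate = 5 / mu)  # gamma response with mean mu

fit <- glm(y ~ x, family = Gamma(link = "log"))

rss_model <- sum((y - fitted(fit))^2)  # RSS around the fitted means
rss_null  <- sum((y - mean(y))^2)      # RSS around the overall mean

r2_rss <- 1 - rss_model / rss_null
print(r2_rss)
```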
$$\hat{\phi} = \hat{\nu}^{-1} = \frac{\chi^2}{n - p}$$
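The estimator above can be computed by hand from a fitted model, taking χ² to be the Pearson statistic. The sketch below uses simulated data with an assumed true shape ν = 2 and compares the result with the dispersion reported by `summary()`.

```r
set.seed(3)
n  <- 500
x  <- runif(n)
nu <- 2                                      # assumed true shape, so phi = 0.5
mu <- exp(1 + x)
y  <- rgamma(n, shape = nu, rate = nu / mu)

fit <- glm(y ~ x, family = Gamma(link = "log"))

# Pearson chi-squared: sum of (y - mu_hat)^2 / V(mu_hat), with V(mu) = mu^2 for gamma
mu_hat <- fitted(fit)
chi2   <- sum((y - mu_hat)^2 / mu_hat^2)

p       <- length(coef(fit))
phi_hat <- chi2 / (n - p)
nu_hat  <- 1 / phi_hat

print(c(phi_hat = phi_hat, nu_hat = nu_hat,
        summary_dispersion = summary(fit)$dispersion))
```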
Question
Does lognormal distribution belong to the exponential family?
Answer:
Based on the definition of the natural one-parameter exponential family, the
lognormal distribution does not belong to this family
Mean: $EY = \mu$
Variance: $DY = \frac{\mu^3}{\lambda}$, i.e. the variance is proportional to the mean cubed
Coefficient of variation: $CV = \sqrt{\mu/\lambda}$
NB! The name can be misleading: "inverse" means that while the Gaussian
describes a Brownian motion’s level at a fixed time, the inverse Gaussian describes
the distribution of the time a Brownian motion with positive drift takes to reach a
fixed positive level.
IG has a sharper peak and heavier tails than the lognormal, thus it is used
in areas related to more extreme events
insurance
financial mathematics
meteorology (wind energy applications)
Its hazard function is ∩-shaped as for lognormal and Weibull distributions, thus IG
is also used in
survival analysis
risk analysis (e.g. analysis of noise effects)
IG was first mentioned by Schrödinger (1915) and Smoluchowski (1915); the name
"inverse Gaussian" was proposed by Tweedie (1941)
The same class of distributions is discussed by Wald (1947). If µ = 1, the inverse
Gaussian is called Wald’s distribution
Canonical link: $g(\mu_i) = -\frac{1}{2\mu_i^2}$ (not used very often in this exact form)
Default link in most statistical packages: squared inverse, $g(\mu_i) = \frac{1}{\mu_i^2}$,
which implies $\mu_i = \frac{1}{\sqrt{\eta_i}}$
Other possibilities:
log-link g(µi ) = ln µi – often used, especially when squared inverse has
convergence or negativity issues
identity g(µi ) = µi – always a simple choice, problems if ηi < 0
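The link choices listed above can be inspected through R's `inverse.gaussian` family objects. A brief sketch, with hypothetical values of the linear predictor:

```r
eta <- c(0.25, 1, 4)   # hypothetical positive linear predictor values

fam_default <- inverse.gaussian()                   # squared inverse link
fam_log     <- inverse.gaussian(link = "log")
fam_id      <- inverse.gaussian(link = "identity")

default_link <- fam_default$link          # name of the default link
mu_default   <- fam_default$linkinv(eta)  # 1 / sqrt(eta)
mu_log       <- fam_log$linkinv(eta)      # exp(eta)
mu_id        <- fam_id$linkinv(eta)       # eta itself

# The negativity issue: a negative eta has no valid mean under the default link
mu_negative <- suppressWarnings(fam_default$linkinv(-1))

print(default_link)
print(rbind(eta = eta, squared_inverse = mu_default, log = mu_log, identity = mu_id))
print(mu_negative)
```

Under the default link a negative linear predictor produces `NaN`, whereas the log link maps any real value to a positive mean.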
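Thus the deviance is (as $\phi = \frac{1}{\lambda}$)
$$
\begin{aligned}
D &= -\frac{2}{\lambda}\,\{l(\hat{\mu}; y, \lambda) - l(y; y, \lambda)\} \\
  &= 2 \sum_i \left\{ \left( \frac{y_i}{2\hat{\mu}_i^2} - \frac{1}{\hat{\mu}_i} \right) - \left( \frac{y_i}{2 y_i^2} - \frac{1}{y_i} \right) \right\} \\
  &= \sum_i \left( \frac{y_i}{\hat{\mu}_i^2} - \frac{2}{\hat{\mu}_i} + \frac{1}{y_i} \right) = \sum_i \frac{(y_i - \hat{\mu}_i)^2}{y_i \hat{\mu}_i^2}
\end{aligned}
$$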
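The algebra in the deviance derivation can be checked numerically. The sketch below evaluates the expanded sum and the closed form on hypothetical values of y and µ̂, then compares both with the unit deviances in R's `inverse.gaussian` family object.

```r
y      <- c(0.8, 1.5, 2.3, 4.1, 0.6)   # hypothetical positive responses
mu_hat <- c(1.0, 1.4, 2.0, 3.5, 0.9)   # hypothetical fitted means

# Expanded form: 2 * sum of the difference of the two bracketed terms
d_expanded <- 2 * sum((y / (2 * mu_hat^2) - 1 / mu_hat) -
                      (y / (2 * y^2) - 1 / y))

# Closed form: sum of (y - mu)^2 / (y * mu^2)
d_closed <- sum((y - mu_hat)^2 / (y * mu_hat^2))

# Unit deviances from the family object, with prior weights equal to 1
d_family <- sum(inverse.gaussian()$dev.resids(y, mu_hat, wt = rep(1, length(y))))

print(c(expanded = d_expanded, closed = d_closed, family = d_family))
```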
Claim amount
> m1 = glm(claimcst0 ~ factor(agecat) + gender + area,
           data = claims, family = inverse.gaussian(link = "log"))
> summary(m1)
...
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.70839 0.09853 78.230 < 2e-16 ***
factor(agecat)2 -0.15845 0.10349 -1.531 0.125836
...
factor(agecat)6 -0.31865 0.12076 -2.639 0.008349 **
genderM 0.15283 0.05119 2.986 0.002846 **
areaB -0.02976 0.07287 -0.408 0.682977
...
areaF 0.35539 0.13049 2.723 0.006485 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
...
AIC: 77162
How to interpret the results?
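One possible reading, given the log link: each coefficient acts multiplicatively on the expected claim amount, so exp(coefficient) is the ratio of expected claims relative to the reference level. The sketch below uses only estimates copied from the output above. The reference levels (the factor levels absent from the output) are an inference, not something stated on the slide.

```r
# Estimates copied from the summary(m1) output (log link)
est <- c("(Intercept)"     = 7.70839,
         "factor(agecat)6" = -0.31865,
         "genderM"         = 0.15283,
         "areaF"           = 0.35539)

# exp(coefficient) = multiplicative effect on the expected claim amount
mult <- exp(est)

# Percentage change relative to the reference level, for non-intercept terms
pct_change <- 100 * (mult[-1] - 1)

print(round(mult, 3))
print(round(pct_change, 1))
```

For example, the expected claim amount for males is roughly 16.5% higher than for the reference gender, holding age category and area fixed. Terms with large p-values, such as `factor(agecat)2` and `areaB`, do not show a clear difference from their reference levels.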
Example. Results (2)
> m2 = glm(claimcst0~factor(agecat)+gender+area,
data=claims, family="Gamma")
> m3 = glm(log(claimcst0)~factor(agecat)+gender+area,
data=claims, family="gaussian")
> m1$aic
[1] 77162.32
> m2$aic
[1] 79331.75
> m3$aic
[1] 14707.79
$$l(\ln y; \mu, \sigma) = -n \ln\left(\sigma\sqrt{2\pi}\right) - \sum_i \frac{(\ln y_i - \mu_i)^2}{2\sigma^2}$$
AIC = −2 · log-likelihood + 2p
$$\Rightarrow AIC_{LN} = AIC_N + 2 \sum_i \ln(y_i)$$
We just need to calculate $2 \sum_i \ln(y_i)$ and add it to the AIC found for the normal
model
> sum(log(claims$claimcst0))
[1] 31489.81
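The comparison can be completed with simple arithmetic on the values printed above. The second half of the sketch checks the adjustment on simulated data, using the fact that the lognormal density of y equals the normal density of ln y divided by y. The simulated sample size and coefficients are assumptions for illustration.

```r
# Values copied from the R output above
aic_ig     <- 77162.32   # m1: inverse Gaussian, log link
aic_gamma  <- 79331.75   # m2: gamma
aic_normal <- 14707.79   # m3: normal model for log(claimcst0)
sum_log_y  <- 31489.81   # sum(log(claims$claimcst0))

# Lognormal AIC on the original scale of the response
aic_lognormal <- aic_normal + 2 * sum_log_y
ranking <- sort(c(inverse_gaussian = aic_ig,
                  lognormal        = aic_lognormal,
                  gamma            = aic_gamma))
print(ranking)

# Check of the adjustment on simulated lognormal data
set.seed(5)
x <- runif(200)
y <- exp(1 + 0.5 * x + rnorm(200, sd = 0.3))
fit <- glm(log(y) ~ x, family = gaussian)

sigma_ml   <- sqrt(deviance(fit) / length(y))   # ML estimate of sigma, as used by AIC()
loglik_ln  <- sum(dlnorm(y, meanlog = fitted(fit), sdlog = sigma_ml, log = TRUE))
aic_direct <- -2 * loglik_ln + 2 * (length(coef(fit)) + 1)   # +1 counts sigma
aic_shift  <- AIC(fit) + 2 * sum(log(y))
print(c(direct = aic_direct, shifted = aic_shift))
```

On the claims data the adjusted lognormal AIC is 77687.41, which places it between the inverse Gaussian model (lowest AIC) and the gamma model.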