st3054 Slides PDF
Survival models
Lifetime distribution functions
Cox regression
Eric Wolsztynski
[email protected]
Department of Statistics
School of Mathematical Sciences
University College Cork, Ireland
2014-2015
Version 1.0
Acknowledgment
These lecture notes follow and adapt a section of the Institute and
Faculty of Actuaries CT4 notes, in respect of the exemption
programme in place for ST3054 and ST6004. However this
document does not reproduce the CT4 notes fully nor exactly, and
also presents notions, developments and examples not found in
those notes.
These notes also use a large part of former ST3054 notes written
by Dr Kingshuk Roy Choudhury and Dr Tony Fitzgerald for a
previous course syllabus.
For any comment or query about this document, please contact
[email protected]
ST3054 - ST6004
Tutorials / Practicals:
Module Objective:
To develop techniques for the analysis of survival data
Module Content:
1) Parametric models of survival, use of life tables, types of
censoring, hazard functions
2) Non-parametric estimation of hazard and survival functions,
Kaplan-Meier and Nelson-Aalen estimators
3) Proportional hazards model with covariates
Use of software
Learning Outcomes
Related material
Co-requisite: ST3053
Outline
I Introduction
II Survival models (Ch7 of CT4 notes 2013)
III Lifetime distribution functions (Ch8 of CT4)
IV The Cox regression model (Ch9 of CT4)
Introduction
At Age    1926    2006
0         57.4    76.8
10        55.2    67.2
20        46.4    57.5
35        34.4    43.3
55        19.1    24.8
65        12.8    16.6
75         7.7     9.8

At Age    1926    2006
0         57.9    81.6
10        54.9    72.0
20        46.4    62.1
35        34.7    47.4
55        19.6    28.5
65        13.4    19.8
75         8.4    12.1
Rank   Country              L.E.
1      Monaco               89.68
2      Macau                84.43
3      Japan                83.91
4      Singapore            83.75
5      San Marino           83.07
6      Andorra              82.50
7      Guernsey             82.24
8      Hong Kong            82.12
9      Australia            81.90
10     Italy                81.86

Rank   Country              L.E.
212    Mozambique           52.02
213    Lesotho              51.86
214    Zimbabwe             51.82
215    Somalia              50.80
216    Central Afr. Rep.    50.48
217    Afghanistan          49.72
218    Swaziland            49.42
219    South Africa         49.41
220    Guinea-Bissau        49.11
221    Chad                 48.69
Lifestyle factors
Machine reliability
Predicting survival
Types of data
Aggregate model
x = attained age
Select model
Section I
Survival models
Future lifetime
Definition:
$F(t) = P(T \le t)$
Definition ($0 \le x \le \omega$):
$F_x(t) = P(T_x \le t)$
Examples:
$F_{30}(50)$ denotes the probability that a 30-year-old dies before his/her 80th birthday
Notation: $q_x$, ${}_tp_x$, $p_x$
In particular we have
${}_tq_x = F_x(t)$
${}_tp_x = 1 - {}_tq_x = S_x(t)$
x     lx      dx      px        qx
90    9,253   2,035   0.78006   0.21994
91    7,218   1,711   0.76297   0.23703
92    5,507   1,403   0.74515   0.25485
93    4,104   1,122   0.72659   0.27341
94    2,982     873   0.70730   0.29270
95    2,109     660   0.68728   0.31272
96    1,449     483   0.66652   0.33348
97      966     343   0.64503   0.35497
98      623     235   0.62281   0.37719
99      388     155   0.59985   0.40015
From   To   Probability
90     91   $p_{90} = 0.78006$
91     92   $p_{91} = 0.76297$
92     93   $p_{92} = 0.74515$
93     94   $p_{93} = 0.72659$
94     95   $p_{94} = 0.70730$
Thus ${}_5p_{90} = p_{90}\,p_{91}\,p_{92}\,p_{93}\,p_{94}$
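${}_5p_{90} = {}_3p_{90}\,{}_2p_{93}$
${}_5p_{90} = {}_2p_{90}\,{}_3p_{92}$
${}_5p_{90} = {}_1p_{90}\,{}_4p_{91}$
In general,
${}_{s+t}p_x = {}_sp_x\,{}_tp_{x+s}$
${}_{s+t}p_x = {}_tp_x\,{}_sp_{x+t}$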
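The multiplication rule can be checked numerically with the one-year probabilities quoted for ages 90 to 94. The following is a minimal R sketch; the helper name `tpx` and the variable names are illustrative and not part of the notes.

```r
# One-year survival probabilities p_90, ..., p_94 from the life table extract
p <- c("90" = 0.78006, "91" = 0.76297, "92" = 0.74515,
       "93" = 0.72659, "94" = 0.70730)

# t_p_x: probability that a life aged x survives t more years,
# computed as a product of consecutive one-year probabilities
tpx <- function(x, t) {
  if (t == 0) return(1)
  prod(p[as.character(x:(x + t - 1))])
}

p5_90   <- tpx(90, 5)                 # 5_p_90
split_a <- tpx(90, 3) * tpx(93, 2)    # 3_p_90 * 2_p_93
split_b <- tpx(90, 2) * tpx(92, 3)    # 2_p_90 * 3_p_92
print(c(p5_90, split_a, split_b))
```

All three values agree (about 0.228), whichever intermediate age is used to split the five-year period.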
${}_tp_x = \frac{{}_{x+t}p_0}{{}_xp_0}$
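Since ${}_tp_x = \frac{{}_{x+t}p_0}{{}_xp_0}$, we have
${}_{s+t}p_x = \frac{{}_{x+s+t}p_0}{{}_xp_0} = \frac{{}_{x+s}p_0}{{}_xp_0} \cdot \frac{{}_{x+s+t}p_0}{{}_{x+s}p_0} = {}_sp_x\,{}_tp_{x+s}$
Similarly,
${}_{s+t}p_x = {}_tp_x\,{}_sp_{x+t}$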
${}_hq_x \approx h\,\mu_x$
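$\mu_x = \lim_{h \to 0^+} \frac{1}{h}\,\frac{P(x < T \le x + h)}{P(T > x)} = \frac{f(x)}{S(x)} = -\frac{dS(x)/dx}{S(x)}$
Inversion: since $\mu_x = -\frac{d}{dx}\log S(x)$, we have
$S(x) = \exp\left\{-\int_0^x \mu_s\,ds\right\} = \exp[-\Lambda(x)]$, with $\Lambda(x) = \int_0^x \mu_s\,ds$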
[Figure: hazard estimates against age (years)]
[Figure: survival estimates against age (years)]
[Figure: density against age (years)]
$\mu_{x+t} = \lim_{h \to 0^+} \frac{1}{h}\,P(T \le x + t + h \mid T > x + t) = \lim_{h \to 0^+} \frac{1}{h}\,P(T_x \le t + h \mid T_x > t)$   (2)
$f_x(t) = \lim_{h \to 0^+} \frac{P(T \le x + t + h) - P(T \le x + t)}{S(x)\,h}$
$f_x(t) = \frac{f(x+t)}{S(x)} \quad (0 \le t < \omega - x)$
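$\frac{d}{dt}S_x(t) = \frac{d}{dt}\,\frac{S(x+t)}{S(x)} = \frac{1}{S(x)}\,\frac{d}{dt}S(x+t)$
and since
$\frac{d}{dt}S(x+t) = \frac{d}{dt}\int_{x+t}^{\infty} f(u)\,du = -f(x+t)$
we have
$f_x(t) = -\frac{d}{dt}S_x(t) = \frac{f(x+t)}{S(x)} = \frac{S(x+t)}{S(x)} \cdot \frac{f(x+t)}{S(x+t)} = {}_tp_x\,\mu_{x+t}$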
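Summary
Notation: $T_x$, ${}_tq_x$, $q_x$, ${}_tp_x$, $p_x$
Probabilistic notation
$\mu_x h \approx P(T \le x + h \mid T > x)$
$\mu_x = \frac{f(x)}{S(x)}$ or $f(x) = \mu_x S(x)$
Actuarial notation
$\mu_x h \approx {}_hq_x$
$f_x(t) = \mu_{x+t}\,{}_tp_x$
${}_{s+t}p_x = {}_sp_x\,{}_tp_{x+s}$ ($\forall s, t > 0$)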
$d_x = l_x - l_{x+1}$
$p_x = \frac{l_{x+1}}{l_x}$
$q_x = 1 - p_x = 1 - \frac{l_{x+1}}{l_x} = \frac{l_x - l_{x+1}}{l_x} = \frac{d_x}{l_x}$
${}_tp_x = \frac{l_{x+t}}{l_x}$
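These life table relationships can be applied directly to a column of $l_x$ values. The R sketch below uses the ten $l_x$ values from the earlier extract and assumes they correspond to ages 90 to 99; the object names are illustrative.

```r
# l_x values from the life table extract (ages 90..99 assumed)
age <- 90:99
lx  <- c(9253, 7218, 5507, 4104, 2982, 2109, 1449, 966, 623, 388)

# d_x = l_x - l_{x+1}; l_100 is not tabulated here, so the last d_x is NA
dx <- c(-diff(lx), NA)
qx <- dx / lx          # q_x = d_x / l_x
px <- 1 - qx           # p_x = l_{x+1} / l_x

# t_p_x = l_{x+t} / l_x for integer t inside the tabulated range
tpx <- function(x, t) lx[age == x + t] / lx[age == x]

life_table <- data.frame(age, lx, dx, px = round(px, 5), qx = round(qx, 5))
print(life_table)
print(tpx(90, 5))      # 5_p_90 = l_95 / l_90
```

The computed rates can differ from the tabulated ones in the fifth decimal place, presumably because the published $l_x$ values are rounded to whole lives.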
S(x)       lx
1.00000    100,000
0.97408     97,408
0.97259     97,259
0.97160     97,160
0.97082     97,082
...        ...
0.00001          1
0.00000          0
Survival probabilities:
${}_tp_x = \frac{l_{x+t}}{l_x}$
lx
88,792
87,805
${}_tq_x = t\,q_x$ for $0 \le t \le 1$.
${}_tq_x = 1 - \frac{l_{x+t}}{l_x} = 1 - \frac{(1 - t)\,l_x + t\,l_{x+1}}{l_x} = \frac{t\,l_x - t\,l_{x+1}}{l_x} = t(1 - p_x) = t\,q_x$
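The linear interpolation of $l_{x+t}$ between integer ages can be illustrated numerically. This R sketch uses $l_{90} = 9{,}253$ and $l_{91} = 7{,}218$ from the earlier life table extract (age labels assumed); the function name `tq_udd` is hypothetical.

```r
# Linear interpolation between integer ages:
# l_{x+t} = (1 - t) * l_x + t * l_{x+1} for 0 <= t <= 1
l90 <- 9253
l91 <- 7218
q90 <- 1 - l91 / l90

tq_udd <- function(t) {
  l_xt <- (1 - t) * l90 + t * l91    # interpolated l_{x+t}
  1 - l_xt / l90                     # t_q_x = 1 - l_{x+t} / l_x
}

t <- c(0, 0.25, 0.5, 1)
print(cbind(t, interpolated = tq_udd(t), t_times_q = t * q90))
```

The two columns coincide, which is the statement ${}_tq_x = t\,q_x$ under this assumption.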
$m_x = \frac{d_x}{\int_0^1 l_{x+t}\,dt} = \frac{q_x}{\int_0^1 {}_tp_x\,dt}$
$m_x = \frac{q_x}{\int_0^1 {}_tp_x\,dt} = \frac{\int_0^1 {}_tp_x\,\mu_{x+t}\,dt}{\int_0^1 {}_tp_x\,dt}$
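$\mathring{e}_x = \int_0^{\omega - x} t\,{}_tp_x\,\mu_{x+t}\,dt$
$= -\int_0^{\omega - x} t\,\frac{\partial}{\partial t}\,{}_tp_x\,dt$
$= -\Big[t\,{}_tp_x\Big]_0^{\omega - x} + \int_0^{\omega - x} {}_tp_x\,dt$   (integration by parts)
$= \int_0^{\omega - x} {}_tp_x\,dt$
(using ${}_tp_x\,\mu_{x+t} = f_x(t) = \partial\,{}_tq_x / \partial t = -\partial\,{}_tp_x / \partial t$ and $\Big[t\,{}_tp_x\Big]_0^{\omega - x} = 0$)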
$P(K_x = k) = {}_kp_x\,q_{x+k}$
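$e_x = \sum_{k=0}^{[\omega - x]} k\,{}_kp_x\,q_{x+k}$
$= {}_1p_x\,q_{x+1}$
$+ {}_2p_x\,q_{x+2} + {}_2p_x\,q_{x+2}$
$+ {}_3p_x\,q_{x+3} + {}_3p_x\,q_{x+3} + {}_3p_x\,q_{x+3}$
$+ \dots$
$= \sum_{k=1}^{[\omega - x]} \sum_{j=k}^{[\omega - x]} {}_jp_x\,q_{x+j} = \sum_{k=1}^{[\omega - x]} {}_kp_x$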
Probability
${}_1p_x\,q_{x+1}$ (survives one year and then dies in the next year)
${}_2p_x\,q_{x+2}$
...
${}_kp_x\,q_{x+k}$
...
$\sum_k {}_kp_x\,q_{x+k}$ ($k = 1$ to $[\omega - x]$)
Relationship between $\mathring{e}_x$ and $e_x$
Considering the two formulae
$\mathring{e}_x = \int_0^{\omega - x} {}_tp_x\,dt$ and $e_x = \sum_{k=1}^{[\omega - x]} {}_kp_x$
we have approximately $\mathring{e}_x \approx e_x + \frac{1}{2}$.
Define $J_x = T_x - K_x$ to be the random lifetime after the highest integer age to which a life aged $x$ survives. Approximately, $E[J_x] = 1/2$ (assuming deaths occur uniformly within each year of age), and since $E[T_x] = E[K_x] + E[J_x]$, we have
$\mathring{e}_x \approx e_x + 1/2$.
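The approximation can be checked numerically for a specific mortality law. The R sketch below assumes a Gompertz force of mortality $\mu_x = Bc^x$ with hypothetical parameter values (not taken from the notes), and truncates both the integral and the sum at 80 years, beyond which survival is negligible for these values.

```r
# Hypothetical Gompertz parameters and age (illustration only)
B <- 5e-5
c_par <- 1.1
x <- 60

# t_p_x under Gompertz: exp(-B c^x (c^t - 1) / log(c))
tpx <- function(t) exp(-B * c_par^x * (c_par^t - 1) / log(c_par))

# Complete expectation: integral of t_p_x over t (truncated at 80 years)
e_complete <- integrate(tpx, lower = 0, upper = 80)$value
# Curtate expectation: sum of k_p_x over k = 1, 2, ...
e_curtate <- sum(tpx(1:80))

print(c(complete = e_complete, curtate = e_curtate,
        difference = e_complete - e_curtate))
```

The difference is close to, but not exactly, one half, since deaths are not exactly uniform within each year of age under this model.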
Why is $e_{81} \ne e_{80} - 1$?
How would you write $e_{81}$ in terms of $e_{80}$ and $p_x$?
$\mathrm{Var}[K_x] = \sum_{k=0}^{[\omega - x]} k^2\,{}_kp_x\,q_{x+k} - e_x^2$
Angola, Zambia
Afghanistan, Malawi
Nigeria, Rwanda, South Africa, Zimbabwe
Cameroon, Ethiopia, Uganda
Bangladesh, Ghana, Haiti, Kenya, Russia
Botswana, Burma, Guyana, Pakistan, Yemen
Brazil, Guatemala, India
Barbados, China, Serbia
Australia, Japan, New Zealand, USA,
most Western European countries
Average life expectancy at birth for males (2009, CIA World Factbook and IFA 2011)
A formula for ${}_tq_x$
${}_tq_x = \int_0^t {}_sp_x\,\mu_{x+s}\,ds$
This follows from the relationship $f_x(t) = {}_tp_x\,\mu_{x+t}$. For each time $s \in [0, t]$, the integrand is the product of
(i) ${}_sp_x$, the probability of surviving to age $x + s$
(ii) $\mu_{x+s}\,ds$, which is approximately equal to ${}_{ds}q_{x+s}$, the probability of dying just after age $x + s$
The corresponding events are mutually exclusive, so their probabilities are added up (or, in the limit, integrated). This result allows us to derive an important relationship between ${}_tp_x$ and $\mu_x$.
A formula for ${}_tp_x$
${}_tp_x = \exp\left\{-\int_0^t \mu_{x+s}\,ds + c\right\}$, where the constant $c$ is 0
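Proof: This follows from
$\frac{\partial}{\partial s}\,{}_sp_x = -\frac{\partial}{\partial s}\,{}_sq_x = -f_x(s) = -{}_sp_x\,\mu_{x+s}$
Note that
$\frac{\partial}{\partial s}\log {}_sp_x = \frac{\frac{\partial}{\partial s}\,{}_sp_x}{{}_sp_x} = -\mu_{x+s}$
hence, for some constant of integration $c$ (which is 0),
$\int_0^t \frac{\partial}{\partial s}\log {}_sp_x\,ds = -\int_0^t \mu_{x+s}\,ds + c$
Since ${}_0p_x = 1$ we have $\Big[\log {}_sp_x\Big]_0^t = \log {}_tp_x$ and
${}_tp_x = \exp\left\{-\int_0^t \mu_{x+s}\,ds + c\right\} = \exp\left\{-\int_0^t \mu_{x+s}\,ds\right\}$
(since $e^0 = 1$, we use $c = 0$)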
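Both integral formulas can be verified numerically for a given force of mortality. The R sketch below assumes a Gompertz hazard $\mu_{x+s} = Bc^{x+s}$ with hypothetical parameter values; it evaluates ${}_tp_x$ from the exponential formula and ${}_tq_x$ from the integral formula, and checks that they sum to one.

```r
# Hypothetical Gompertz force of mortality mu_{x+s} = B * c^(x+s)
B <- 5e-5
c_par <- 1.1
x <- 60
mu <- function(s) B * c_par^(x + s)

# t_p_x = exp(-integral_0^t mu_{x+s} ds), integral evaluated numerically
tpx <- function(t) {
  sapply(t, function(u) exp(-integrate(mu, lower = 0, upper = u)$value))
}

# t_q_x = integral_0^t s_p_x * mu_{x+s} ds
tqx <- function(t) {
  integrate(function(s) tpx(s) * mu(s), lower = 0, upper = t)$value
}

t <- 10
print(c(tpx = tpx(t), tqx = tqx(t), total = tpx(t) + tqx(t)))
```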
${}_tq_x = \int_0^t {}_sp_x\,\mu_{x+s}\,ds$
${}_tp_x = \exp\left\{-\int_0^t \mu_{x+s}\,ds + c\right\}$, with $c = 0$
Parametric models?
[Figure: survival estimates against age (years)]
[Figure: density against age (years)]
${}_tp_x = S_x(t) = e^{-\int_0^t \mu\,ds} = e^{-\mu t}$
and
${}_tq_x = 1 - {}_tp_x = 1 - e^{-\mu t}$
$\mathrm{Var}[T] = \frac{1}{\mu^2}$
Since $\mu_t = -\frac{d}{dt}\log[S(t)]$ we have
$\mu_t = \frac{d}{dt}\left[\alpha t^{\beta}\right] = \alpha\beta\,t^{\beta - 1}$
$\mu_t = Bc^t = Be^{t\log(c)}$
$S(t) = e^{\frac{B}{\log(c)}(1 - c^t)}$
[Figure: hazard estimates (logarithmic scale) against age (years)]
$\mu_t = A + Bc^t$
$S(t) = e^{\frac{B}{\log(c)}(1 - c^t) - At}$
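The Makeham hazard and survival function translate directly into code, with Gompertz recovered as the special case $A = 0$. The R sketch below uses hypothetical parameter values and illustrative function names, and compares the closed-form $S(t)$ with a numerical integration of the hazard.

```r
# Hypothetical parameter values, for illustration only
A <- 5e-4
B <- 5e-5
c_par <- 1.1

# Makeham force of mortality: mu_t = A + B c^t (Gompertz when A = 0)
mu_makeham <- function(t, A, B, c_par) A + B * c_par^t

# Closed form: S(t) = exp( B (1 - c^t) / log(c) - A t )
S_makeham <- function(t, A, B, c_par) {
  exp(B * (1 - c_par^t) / log(c_par) - A * t)
}

# Check against exp(-integral_0^t mu_s ds) at t = 70
t0 <- 70
H_num <- integrate(function(s) mu_makeham(s, A, B, c_par), 0, t0)$value
print(c(closed_form = S_makeham(t0, A, B, c_par), numerical = exp(-H_num)))
```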
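Survival probabilities
Survival probabilities ${}_tp_x$ can be found using
${}_tp_x = \exp\left\{-\int_0^t \mu_{x+s}\,ds\right\}$
For the Gompertz law, with $g = \exp\{-B/\log(c)\}$, we have
${}_tp_x = g^{c^x(c^t - 1)}$
For the Makeham law, with $g = \exp\{-B/\log(c)\}$ and $s = e^{-A}$, we have
${}_tp_x = s^t\,g^{c^x(c^t - 1)}$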
Proof ...?
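One possible way to complete the proof, sketched under the assumption that the Makeham law is $\mu_x = A + Bc^x$ and that the constants are defined as $g = \exp\{-B/\log c\}$ and $s = e^{-A}$ (the integration variable is written $u$ here to avoid a clash with the constant $s$):

```latex
\begin{align*}
\int_0^t \mu_{x+u}\,du
  &= \int_0^t \left(A + Bc^{x+u}\right)du
   = At + \frac{B\,c^x\,(c^t - 1)}{\log c},\\
{}_tp_x
  &= \exp\left\{-At - \frac{B}{\log c}\,c^x\,(c^t - 1)\right\}
   = \left(e^{-A}\right)^{t}\left(e^{-B/\log c}\right)^{c^x (c^t - 1)}
   = s^t\,g^{c^x (c^t - 1)}.
\end{align*}
```

Setting $A = 0$ (so that $s = 1$) gives the Gompertz result ${}_tp_x = g^{c^x(c^t - 1)}$.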
Gompertz-Makeham family
The force of mortality can be modeled using one of the Gompertz-Makeham curves. This family of functions is of the form
$GM(r, s) = \alpha_1 + \alpha_2 t + \dots + \alpha_r t^{r-1} + \exp\left\{\alpha_{r+1} + \alpha_{r+2}\,t + \dots + \alpha_{r+s}\,t^{s-1}\right\}$
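Exponential:
$f(t) = \mu e^{-\mu t}$, $F(t) = 1 - e^{-\mu t}$, $S(t) = e^{-\mu t}$, $\mu_t = \mu$
Weibull:
$f(t) = \alpha\beta\,t^{\beta-1} e^{-\alpha t^{\beta}}$, $F(t) = 1 - e^{-\alpha t^{\beta}}$, $S(t) = e^{-\alpha t^{\beta}}$, $\mu_t = \alpha\beta\,t^{\beta-1}$
Gompertz:
$f(t) = Bc^t\,e^{\frac{B(1 - c^t)}{\log(c)}}$, $F(t) = 1 - e^{\frac{B(1 - c^t)}{\log(c)}}$, $S(t) = e^{\frac{B(1 - c^t)}{\log(c)}}$, $\mu_t = Bc^t$
Makeham:
$f(t) = (A + Bc^t)\,e^{\frac{B(1 - c^t)}{\log(c)} - At}$, $F(t) = 1 - e^{\frac{B(1 - c^t)}{\log(c)} - At}$, $S(t) = e^{\frac{B(1 - c^t)}{\log(c)} - At}$, $\mu_t = A + Bc^t$
Log-logistic (parameter symbols assumed to be $\lambda$ and $\gamma$):
$f(t) = \frac{\lambda\gamma\,t^{\gamma-1}}{(1 + \lambda t^{\gamma})^2}$, $F(t) = 1 - \frac{1}{1 + \lambda t^{\gamma}}$, $S(t) = \frac{1}{1 + \lambda t^{\gamma}}$, $\mu_t = \frac{\lambda\gamma\,t^{\gamma-1}}{1 + \lambda t^{\gamma}}$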
Statistical inference
Censoring
The Kaplan-Meier (product-limit) model
Parametric estimation of the survival function
Section II
Estimating lifetime distribution
functions
Introduction
Overview
Parametric MLE
Non-parametric MLE
Simple experiment:
Parametric approach:
Example
Time   Died   Alive   SDF
5      1      9       0.9
10     1      8       0.8
15     1      7       0.7
20     1      4       0.4
20     1      4       0.4
20     1      4       0.4
35     1      3       0.3
40     0      3       0.3
40     0      3       0.3
40     0      3       0.3
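The SDF column can be reproduced as the proportion of subjects not yet dead at each time. A minimal R sketch, with illustrative names; it relies on the fact that in this example the only censoring happens at the final time, 40.

```r
# Ten subjects: seven observed deaths, three still alive at time 40
time <- c(5, 10, 15, 20, 20, 20, 35, 40, 40, 40)
died <- c(1, 1, 1, 1, 1, 1, 1, 0, 0, 0)

# Empirical survival function: proportion of subjects not dead by time t
S_emp <- function(t) sapply(t, function(u) mean(!(died == 1 & time <= u)))

sdf <- data.frame(time = time, died = died,
                  alive = S_emp(time) * length(time), sdf = S_emp(time))
print(sdf)
```

The three tied deaths at time 20 produce a single drop of 0.3, from 0.7 to 0.4.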
[Figure: empirical survival function against time for the example data]
Note step down of 0.1
[Figure: empirical survival against analysis time]
Cohorts
No entrants after t = 0
Follow-up time
[Figure: follow-up time for subjects A and B, with calendar dates 1/1/91 and 1/1/92 and analysis times 0, 365 and 730]
II.2 Censoring
Censoring
If inference is based on data with shorter times:
Censoring mechanisms
Data are censored if we do not know the exact values of each
observation but we do have information about the value of each
observation in relation to one or more bounds (e.g. we know that a
person was still alive at age 20 at end of investigation).
Most common censoring assumptions (not all mutually exclusive):
Right censoring
Left censoring
Interval censoring
Random censoring
Type I censoring
Type II censoring
Right censoring
Ex: end of mortality study before all lives observed have died
Left censoring
Data are left censored if we cannot know when entry into the
state we wish to observe took place
Truncation
Interval censoring
Random censoring
Type I censoring
Examples of right censoring mechanisms:
Type II censoring
Note: patient age may be important but not the sole determinant,
and is usually treated as an explanatory variable in a multivariate
regression model (cf. next section). Ex: measure mortality
amongst patients suffering from a virulent tropical disease.
Then $c_0 + c_1 + \dots + c_k = n - m$
Day   Event
3     Rat 4 dies from effects of drug
4     Rat 13 dies from effects of drug
6     Rat 7 gnaws through bars of cage and escapes
11    Rats 6 and 9 die from effects of drug
17    Rat 1 killed by other rats
21    Rat 10 dies from effects of drug
24    Rat 8 freed during raid by animal liberation activists
25    Rat 12 accidentally freed by journalist reporting earlier raid
26    Rat 5 dies from effects of drug
30    Investigation closes. Remaining rats hold street party.
[Figure: timeline of the rat study by day, marking the death times $t_1 = 3$, $t_2 = 4$, $t_3 = 11$ (2 rats), $t_4 = 21$, $t_5 = 26$ and the censoring times 6, 17, 24, 25 and 30 (5 rats)]
$L \propto \prod_{j=1}^{k} \lambda_j^{d_j}\,(1 - \lambda_j)^{n_j - d_j}$
$\hat\lambda_j = \frac{d_j}{n_j} \quad (1 \le j \le k)$
Probability
0.10
0.30
0.25
0.20
0.15
$\hat S(t) = \prod_{t_j \le t} (1 - \hat\lambda_j)$
To compute $\hat S(t)$, we multiply the survival probabilities within each of the intervals up to and including duration $t$. The survival probability at time $t_j$ is estimated by
$1 - \hat\lambda_j = \frac{n_j - d_j}{n_j} = \frac{\text{number of survivors}}{\text{number at risk}}$
So the probability of survival at time $t$ is estimated by
$\hat S(t) = \prod_{t_j \le t} \frac{n_j - d_j}{n_j}$
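The product-limit calculation can be carried out by hand for the rat study described earlier. The R sketch below assumes 15 rats in total, with deaths from the drug at days 3, 4, 11 (two rats), 21 and 26, and all other exits (including the five rats alive at day 30) treated as censored; the variable names are illustrative.

```r
# Rat study: observed times and status (1 = died from drug, 0 = censored)
time   <- c(3, 4, 6, 11, 11, 17, 21, 24, 25, 26, 30, 30, 30, 30, 30)
status <- c(1, 1, 0,  1,  1,  0,  1,  0,  0,  1,  0,  0,  0,  0,  0)

tj <- sort(unique(time[status == 1]))                       # distinct death times
dj <- sapply(tj, function(u) sum(time == u & status == 1))  # deaths at t_j
nj <- sapply(tj, function(u) sum(time >= u))                # at risk just before t_j

lambda_hat <- dj / nj               # estimated discrete hazard at t_j
S_hat <- cumprod(1 - lambda_hat)    # Kaplan-Meier estimate from t_j onwards

print(data.frame(tj, dj, nj, lambda_hat = round(lambda_hat, 4),
                 S_hat = round(S_hat, 4), F_hat = round(1 - S_hat, 4)))
```

Censored rats contribute to the risk sets $n_j$ up to the time they leave the study, but never to the death counts $d_j$.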
$t_j$   $d_j$   $n_j$   $\hat\lambda_j = d_j/n_j$   $1 - \hat\lambda_j$   $1 - \prod_{k=1}^{j}(1 - \hat\lambda_k)$
3       1       15      0.0667                      0.9333                0.0667
4       1       14      0.0714                      0.9286                0.1333
11      2       12      0.1667                      0.8333                0.2778
21      1       9       0.1111                      0.8889                0.3580
26      1       6       0.1667                      0.8333                0.4650

$\hat F(t) =$
0        for $0 \le t < 3$
0.0667   for $3 \le t < 4$
0.1333   for $4 \le t < 11$
0.2778   for $11 \le t < 21$
0.3580   for $21 \le t < 26$
0.4650   for $t \ge 26$
A graphical approach
$t$                 $\hat S(t)$
$0 \le t < 3$       1
$3 \le t < 4$       14/15
$4 \le t < 11$      14/15 × 13/14
$11 \le t < 21$     13/15 × 10/12
...                 ...
[Figure: Kaplan-Meier estimate of the survival probability against time, with ticks at 3, 4, 11, 21, 26 and 30]
$\mathrm{Var}[\hat S(t)] \approx \hat S(t)^2 \sum_{t_j \le t} \frac{d_j}{n_j (n_j - d_j)}$
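As an illustration, the variance approximation (Greenwood's formula) can be evaluated for the rat example. The R sketch below takes the death times, death counts and risk sets of that example as given (15 rats assumed); the names are illustrative.

```r
# Death times, deaths and numbers at risk from the rat study
tj <- c(3, 4, 11, 21, 26)
dj <- c(1, 1, 2, 1, 1)
nj <- c(15, 14, 12, 9, 6)

S_hat <- cumprod(1 - dj / nj)                       # Kaplan-Meier estimate
var_S <- S_hat^2 * cumsum(dj / (nj * (nj - dj)))    # Greenwood's formula
se_S  <- sqrt(var_S)                                # standard error of S_hat

print(data.frame(tj, S_hat = round(S_hat, 4), se = round(se_S, 4)))
```

At the first death time, before any censoring has occurred, the formula reduces to the binomial variance $\hat S(1 - \hat S)/n$.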
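$\left[ S(x)^{1/\theta},\ S(x)^{\theta} \right]$, where $\theta = \exp\left\{ \dfrac{z_{1-\alpha/2} \sqrt{\mathrm{Var}[S(x)]}}{S(x) \log(S(x))} \right\}$
This CI is not symmetric about S(t)
Bands can be constructed by adjusting the confidence level
[Figure: S(t) (0.0 to 1.0) against time (0 to 150)]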
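As a rough illustration of the interval $[S(x)^{1/\theta}, S(x)^{\theta}]$, the sketch below evaluates it in base R. The function name `loglog_ci` and the input values are hypothetical; the variance would normally come from a variance estimate of the survival curve.

```r
# Log-log transformed confidence interval for a survival probability.
# S: point estimate of S(x); var_S: estimated variance of that estimate.
loglog_ci <- function(S, var_S, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)               # normal quantile z_{1 - alpha/2}
  theta <- exp(z * sqrt(var_S) / (S * log(S)))  # theta < 1 because log(S) < 0
  c(lower = S^(1 / theta), upper = S^theta)
}

# Hypothetical inputs
S0 <- 13 / 15
ci <- loglog_ci(S = S0, var_S = 0.0077)
print(ci)
```

Because the limits are obtained by raising $S(x)$ to a power, they always stay inside (0, 1), and the lower limit sits further from the estimate than the upper one when $S(x)$ is close to 1.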
$\hat\Lambda_t = \sum_{t_j \le t} \dfrac{d_j}{n_j}$
$S(t) = e^{-\hat\Lambda_t}$
$F(t) = 1 - e^{-\hat\Lambda_t}$
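Reading the relation above as $S(t) = \exp(-\hat\Lambda_t)$, with $\hat\Lambda_t = \sum_{t_j \le t} d_j / n_j$ the Nelson-Aalen cumulative hazard (an assumption about the lost symbols), the two estimators can be compared on illustrative counts in base R:

```r
# Illustrative deaths and numbers at risk at the ordered death times
d <- c(1, 1, 2, 1, 1)
n <- c(15, 14, 12, 9, 6)

Lambda_hat <- cumsum(d / n)      # Nelson-Aalen cumulative hazard
S_na <- exp(-Lambda_hat)         # survival estimate derived from Nelson-Aalen
S_km <- cumprod(1 - d / n)       # Kaplan-Meier estimate, for comparison

print(round(cbind(Lambda_hat, S_na, S_km), 4))
```

Since $e^{-x} \ge 1 - x$, the Nelson-Aalen based survival estimate is never below the Kaplan-Meier estimate; the two are close when each $d_j / n_j$ is small.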
Survival estimation in R
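> library(survival)
> leukemia   # AML data from the survival package: time, status, x
> # Kaplan-Meier fit by group x, using the censoring indicator
> leuk.km <- survfit(Surv(time, status) ~ x, data=leukemia)
> # Same fit with censoring ignored (every time treated as a death)
> leuk.km.ncs <- survfit(Surv(time) ~ x, data=leukemia)
> plot(leuk.km[1], conf.int=FALSE, xlab="t", ylab="S(t)", bty="n")
> lines(leuk.km.ncs[1], lty=4)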
[Figure: Survival estimation in R. S(t) (0.0 to 1.0) against t (0 to 150); legend: Censoring, No censoring]
KM vs NA
[Figure: KM vs NA. S(t) (0.0 to 1.0) against t (0 to 150); legend: Kaplan-Meier, Nelson-Aalen]
Let us denote:
- $K$ the number of categories
- $d_{k,(i)}$ the number of deaths in group $k$ at ordered time $t_{(i)}$
- $d_{(i)} = \sum_{k=1}^{K} d_{k,(i)}$ the total number of deaths at time $t_{(i)}$
- $n_{k,(i)}$ the number of members of group $k$ at risk at $t_{(i)}$
- $n_{(i)}$ the total number of members at risk right before $t_{(i)}$
Hypotheses:
$H_0$: $S_1(t) = S_2(t)$
$H_1$: $S_1(t) \ne S_2(t)$
Assumptions:
$\dfrac{\left[ \sum_{i=1}^{m} w_i \left( d_{1,(i)} - e_{1,(i)} \right) \right]^2}{\sum_{i=1}^{m} w_i^2 \, v_{(i)}}$
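A weighted statistic of this form can be computed by hand in base R. The sketch below makes several assumptions that are not confirmed by the slide: weights $w_i = 1$ (the log-rank case), the usual hypergeometric expressions $e_{1,(i)} = n_{1,(i)} d_{(i)} / n_{(i)}$ and $v_{(i)} = n_{1,(i)} n_{2,(i)} d_{(i)} (n_{(i)} - d_{(i)}) / (n_{(i)}^2 (n_{(i)} - 1))$, and a chi-squared reference distribution with 1 degree of freedom. The data are hypothetical.

```r
# Hypothetical two-group data: observed time, event indicator (1 = death), group
time   <- c(3, 5, 7, 9, 12, 4, 6, 8, 10, 15)
status <- c(1, 1, 0, 1, 1,  1, 0, 1, 1,  0)
group  <- rep(c(1, 2), each = 5)

t_ord <- sort(unique(time[status == 1]))   # ordered death times t_(i)
w <- rep(1, length(t_ord))                 # w_i = 1 gives the log-rank test

terms <- sapply(t_ord, function(t) {
  n_all <- sum(time >= t)                                # n_(i)
  n_1   <- sum(time >= t & group == 1)                   # n_1,(i)
  d_all <- sum(time == t & status == 1)                  # d_(i)
  d_1   <- sum(time == t & status == 1 & group == 1)     # d_1,(i)
  e_1   <- n_1 * d_all / n_all                           # expected deaths, group 1
  v     <- if (n_all > 1) {
    n_1 * (n_all - n_1) * d_all * (n_all - d_all) / (n_all^2 * (n_all - 1))
  } else 0                                               # no variance if one life left
  c(d_1 = d_1, e_1 = e_1, v = v)
})

Z2 <- sum(w * (terms["d_1", ] - terms["e_1", ]))^2 / sum(w^2 * terms["v", ])
p_value <- pchisq(Z2, df = 1, lower.tail = FALSE)
print(c(statistic = Z2, p_value = p_value))
```

Other choices of `w` (for example weights depending on the number at risk) give other members of the same family of rank tests.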
n(i)
censored lives
$\sum_{i=1}^{n} t_i$
$\prod_{i=1}^{n} f(t_i)^{\delta_i} \, S(t_i)^{1 - \delta_i} = \prod_{i=1}^{n} h(t_i)^{\delta_i} \, S(t_i)$
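$\log(L(\lambda)) = \log(\lambda) \sum_{i=1}^{n} \delta_i - \lambda \sum_{i=1}^{n} t_i$
and
$\hat\lambda = \dfrac{\sum_{i=1}^{n} \delta_i}{\sum_{i=1}^{n} t_i}$, with $\dfrac{\partial^2 \log(L(\lambda))}{\partial \lambda^2} = -\dfrac{\sum_{i=1}^{n} \delta_i}{\lambda^2} < 0$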
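For a constant hazard $\lambda$ with right censoring, the log-likelihood is $\log L(\lambda) = \log(\lambda) \sum_i \delta_i - \lambda \sum_i t_i$, maximised at $\hat\lambda = \sum_i \delta_i / \sum_i t_i$. The base R sketch below checks the closed form against a numerical maximisation; the data and variable names are hypothetical.

```r
# Hypothetical observed times and event indicators (1 = death, 0 = censored)
t_obs <- c(2.1, 3.4, 0.7, 5.0, 1.8, 4.2)
delta <- c(1, 0, 1, 0, 1, 1)

# Exponential log-likelihood with right censoring
loglik <- function(lambda) sum(delta) * log(lambda) - lambda * sum(t_obs)

lambda_closed  <- sum(delta) / sum(t_obs)   # deaths / total time at risk
lambda_numeric <- optimize(loglik, interval = c(1e-6, 10), maximum = TRUE)$maximum

print(c(closed_form = lambda_closed, numeric = lambda_numeric))
```

The estimate is simply the number of observed deaths divided by the total observed exposure time, so censored lives contribute exposure but no deaths.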
$= S_x(m) = \exp\left( -\sum_{j=0}^{m-1} \lambda_{x+j} \right)$
Date of surgery       Date of death
1 April 2001          1 August 2005
1 April 2001          1 October 2001
1 May 2001            1 March 2002
1 September 2001      1 August 2003
1 October 2001        1 August 2002
Date of surgery
1 February 2001
1 March 2001
1 April 2001
1 June 2001
1 September 2001
1 September 2001
1 November 2001
Date of surgery
1 February 2001
1 June 2001
1 September 2001
Modelling approach
The Cox model
Cox regression
Model selection
Section III
The Cox regression model
Effect of covariates
$\log(\lambda_i) = \beta x_i$
$\lambda_1(t) = e^{\alpha}$, $\lambda_2(t) = e^{\alpha} e^{\beta}$
$\dfrac{\lambda_2(t)}{\lambda_1(t)} = e^{\beta}$
Proportional hazards
Definition: Given a set of p-dimensional vectors of covariates
$\{x_i\}_{i=1}^{n}$, the Cox model (1972) defines the hazard function as
$\lambda(t; x_i) = \lambda_0(t) \exp(x_i^T \beta)$
where $\beta$ is a p-dimensional vector of regression parameters, and
$\lambda_0(t)$ is the baseline hazard.
Under the Cox model, the hazards of two different lives are
proportional at all times:
$\dfrac{\lambda(t; x_1)}{\lambda(t; x_2)} = \dfrac{\exp(x_1^T \beta)}{\exp(x_2^T \beta)}$
[Figure: hazard against time]
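The constant hazard ratio between these two lives is:
$\dfrac{\lambda(t; x_1)}{\lambda(t; x_2)} = \dfrac{\exp(x_1^T \beta)}{\exp(x_2^T \beta)} = \dfrac{\exp\left( \sum_{j=1}^{p} \beta_j x_{1,j} \right)}{\exp\left( \sum_{j=1}^{p} \beta_j x_{2,j} \right)} = \exp\left( \sum_{j=1}^{p} \beta_j (x_{1,j} - x_{2,j}) \right)$
[Figure: hazard against time]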
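The hazard ratio between two lives depends only on the difference in their covariates, not on time or on the baseline hazard. A small base R check, using made-up coefficients `beta` and two hypothetical covariate vectors (age, height in cm, weight in kg):

```r
# Hypothetical regression coefficients and covariate vectors
beta <- c(0.03, -0.01, 0.02)
x1 <- c(62, 168, 85)
x2 <- c(55, 175, 70)

# Ratio of the two exp(x^T beta) terms; the baseline hazard cancels out
hr_direct <- exp(sum(beta * x1)) / exp(sum(beta * x2))

# Equivalent form using the covariate differences
hr_diff <- exp(sum(beta * (x1 - x2)))

print(c(direct = hr_direct, from_differences = hr_diff))
```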
If the $j$th covariate $x_{i,j}$ takes positive values only, then $\beta_j > 0$
implies a positive correlation between $x_{i,j}$ and the hazard rate
(hazard increases with $x_{i,j}$)
Ex: the covariates for a given life are (62, 168, 85)
(age at start of study, height in cm, weight in kg)
$\sum_{j \in R_i} e^{\beta x_j}$
Partial likelihood:
$L(\beta) = \prod_{j=1}^{k} \dfrac{e^{\beta x_j}}{\sum_{i \in R_j} e^{\beta x_i}}$
$\prod_{j=1}^{k} \dfrac{\lambda(t, x_j)}{\sum_{i \in R_j} \lambda(t, x_i)}$
Two-sample example
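Partial likelihood:
$L_p = \dfrac{e^{\beta}}{3e^{0} + 2e^{\beta}} \times \dfrac{e^{0}}{e^{0} + 2e^{\beta}} \times \dfrac{e^{0}}{e^{0} + e^{\beta}} = \dfrac{e^{\beta}}{3e^{0} + 2e^{\beta}} \times \dfrac{1}{1 + 2e^{\beta}} \times \dfrac{1}{1 + e^{\beta}}$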
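Taking the partial likelihood of this example as $L_p(\beta) = e^{\beta} / [(3 + 2e^{\beta})(1 + 2e^{\beta})(1 + e^{\beta})]$, the maximum partial likelihood estimate can be found numerically. This is a sketch in base R; the function name `log_Lp` and the search interval are arbitrary choices.

```r
# Log partial likelihood of the two-sample example
log_Lp <- function(beta) {
  beta - log(3 + 2 * exp(beta)) - log(1 + 2 * exp(beta)) - log(1 + exp(beta))
}

# log_Lp is concave in beta, so a one-dimensional search finds its maximum
fit <- optimize(log_Lp, interval = c(-5, 5), maximum = TRUE)
beta_hat <- fit$maximum

print(c(beta_hat = beta_hat, hazard_ratio = exp(beta_hat)))
```

A negative estimate here corresponds to a hazard ratio below 1 for the group coded with covariate value 1.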
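The model must also allow for multiple events at one time
point $t_j$ (i.e. $d_j > 1$)
$L_p(\beta) = \prod_{i} \dfrac{\exp\left( \beta \sum_{j \in D_{(i)}} x_j(y_{(i)}) \right)}{\sum_{\text{combinations } D_{(i)} \text{ from } R_{(i)}} \exp\left( \beta \sum_{j \in D_{(i)}} x_j(y_{(i)}) \right)}$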
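$\lambda_0(18) e^{\beta x_3} \, \lambda_0(18) e^{\beta x_4} \ \cdots \ = \dfrac{e^{\beta \cdot 0 + \beta \cdot 1}}{e^{\beta \cdot 0 + \beta \cdot 1} + e^{\beta \cdot 1 + \beta \cdot 1} + e^{\beta \cdot 0 + \beta \cdot 1}}$
Thus,
$L_p = \dfrac{e^{\beta \cdot 0}}{3e^{\beta \cdot 0} + 2e^{\beta \cdot 1}} \times \dfrac{e^{\beta \cdot 0 + \beta \cdot 1}}{e^{\beta \cdot 0 + \beta \cdot 1} + e^{\beta \cdot 1 + \beta \cdot 1} + e^{\beta \cdot 0 + \beta \cdot 1}} = \dfrac{1}{3 + 2e^{\beta}} \times \dfrac{e^{\beta}}{2e^{\beta} + e^{2\beta}}$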
$\prod_{j=1}^{k} \dfrac{e^{\beta s_j}}{\left( \sum_{i \in R_j} e^{\beta x_i} \right)^{d_j}}$
$\dfrac{\partial^2 \log L(\beta)}{\partial \beta_i \, \partial \beta_j}$
wbc     ag        time
2300    present   65
750     present   156
4300    present   100
2600    present   134
6000    present   16
            coef     exp(coef)   se(coef)   z       p
agpresent   -1.069   0.343       0.429      -2.49   0.0130 *
log(wbc)    0.368    1.444       0.136      2.70    0.0069 **

            exp(coef)   exp(-coef)   lower .95   upper .95
agpresent   0.343       2.913        0.148       0.796
log(wbc)    1.444       0.692        1.106       1.886
Effect of covariates
One would then fit two Cox models with different sets of
covariates (having p and p + 1 covariates)
Model selection
           loglik    Chisq    Df   Pr(>|Chi|)
NULL       -85.054
ag         -80.924   8.2612   1    0.004050 **
log(wbc)   -77.234   7.3799   1    0.006596 **
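The Chisq and Pr(>|Chi|) columns follow from the log-likelihoods alone: each statistic is twice the increase in log-likelihood when one covariate is added, compared with a chi-squared distribution with 1 degree of freedom. A short base R check using the tabulated values (variable names are illustrative):

```r
# Partial log-likelihoods of the nested models, as tabulated above
loglik <- c(NULL_model = -85.054, ag = -80.924, log_wbc = -77.234)

# Likelihood ratio statistics for adding each covariate in turn
chisq <- 2 * diff(loglik)

# p-values from the chi-squared distribution with 1 degree of freedom
p_val <- pchisq(chisq, df = 1, lower.tail = FALSE)

print(round(cbind(chisq, p_val), 4))
```

Small differences in the last digits relative to the table come from the log-likelihoods being rounded to three decimals.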
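library(survival); library(MASS)   # the leuk data set is in MASS
leuk.cox2 <- coxph(Surv(time) ~ strata(ag) + log(wbc), data=leuk)
plot(survfit(leuk.cox2), fun="cumhaz", log=TRUE, lty=c(1,4))
[Figure: cumulative hazard on a log scale (0.05 to 5.00) against time (0 to 150), one curve per ag stratum]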
Residual diagnostics
Martingale residuals or deviance residuals:
$r_i^M = \delta_i - \exp(\hat\beta^T x_i) \, \hat H_0(t_i)$
$r_i^D = \mathrm{sign}(r_i^M) \sqrt{ -2 \left( r_i^M + \delta_i \log(\delta_i - r_i^M) \right) }$
scatter.smooth(leuk$wbc,resid(leuk.cox))
plot(1:length(leuk$time), resid(leuk.cox,type="deviance"))
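The deviance residual is a transformation of the martingale residual that makes its distribution more symmetric. Assuming the form $r_i^D = \mathrm{sign}(r_i^M) \sqrt{-2 (r_i^M + \delta_i \log(\delta_i - r_i^M))}$, the mapping can be written directly in base R; the function name and the input values below are hypothetical.

```r
# Deviance residual from a martingale residual r_m and event indicator delta
deviance_resid <- function(r_m, delta) {
  # delta * log(delta - r_m), taken as 0 for censored lives (delta = 0)
  log_term <- ifelse(delta == 0, 0, delta * log(delta - r_m))
  sign(r_m) * sqrt(-2 * (r_m + log_term))
}

# Hypothetical martingale residuals (always below 1) and event indicators
r_m   <- c(0.5, -0.5, -1.2, 0.9)
delta <- c(1, 0, 1, 1)

r_d <- deviance_resid(r_m, delta)
print(round(cbind(r_m, delta, r_d), 4))
```

In practice these values are obtained from a fitted model with `resid(..., type="deviance")`, as in the calls above.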
[Figure: martingale residuals against log(wbc) (left panel); deviance residuals against index (right panel)]
References
[1] CT4 course notes, The Actuarial Education Company, Institute and Faculty of Actuaries, 2011
[2] E. L. Kaplan and P. Meier, Nonparametric Estimation from Incomplete Observations, J. Am. Stat. Assoc., 53:457-481, 1958
[3] D. P. Harrington and T. R. Fleming, A class of rank test procedures for censored survival data, Biometrika, 69:553-566, 1982
[4] S. R. Deshmukh, Actuarial Statistics: An Introduction Using R, Universities Press
[5] J. D. Gibbons and S. Chakraborti, Nonparametric Statistical Inference, 4th Edition, Dekker
[6] Alex R. Cook's ST3242 lecture notes, National University of Singapore, http://courses.nus.edu.sg/course/stacar/internet/st3242/st3242.html
[7] B. S. Everitt and T. Hothorn, A Handbook of Statistical Analyses Using R, Second Edition, Chapman & Hall, 2010
[8] M. J. Crawley, Statistics: an Introduction Using R, Wiley, 2005
For any comments or queries about this document, please contact [email protected]