Lecture 16: Parametric Survival Models

James J. Dignam

Department of Public Health Sciences


University of Chicago

J. Dignam (UChicago) Lecture 16 Mar. 5, 2020 1 / 49


Regression models for survival data

We want to relate predictors (characteristics, factors such as treatment) to survival times. The best way to do this mathematically involves associating predictors with the hazard function.
There are several constructs for how survival times are influenced by predictors. We will focus on two of these:
1 Proportional hazards model
2 Accelerated failure time model
We begin with the proportional hazards model, as it is (by far) the most common approach.


Proportional hazards model

This model is based on the assumption of proportional hazards, introduced earlier:
Consider the situation where the hazard at a particular time depends on the values of $p$ explanatory variables, $X_1, X_2, \ldots, X_p$.
Denote the $i$th subject's values of the explanatory variables by $x_i = (x_{1i}, x_{2i}, \ldots, x_{pi})$.
Let $h_0(t)$ be the baseline hazard function: the hazard function for an individual for whom $x_i = 0$.
The hazard function for the $i$th individual is given by

$h_i(t) = \psi(x_i) h_0(t)$ (1)

where the relative hazard, $\psi(x_i)$, is a function of the values of the explanatory variables for the $i$th individual. $\psi(x_i)$ can be interpreted as the hazard at time $t$ for an individual with explanatory variables $x_i$ relative to the hazard for an individual with explanatory variables $x = 0$.


Proportional hazards (continued)
Since the relative hazard, $\psi(x_i)$, must be non-negative, it is convenient to model it as $\exp(\eta_i)$, where $\eta_i$ is a linear combination of the $p$ explanatory variables:

$\eta_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi}$ (2)

so that $\eta_i = \sum_{j=1}^{p} \beta_j x_{ji}$. In matrix notation, $\eta_i = \beta^T x_i$.
The general proportional hazards model then becomes

$h_i(t) = \exp(\beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi}) h_0(t)$ (3)

or, equivalently,

$\log\{h_i(t)/h_0(t)\} = \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi}$ (4)

That is, the proportional hazards model may be regarded as a linear model for the (natural) logarithm of the hazard ratio.

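A short derivation from (3), added here as an illustration rather than taken from the slides, shows why the model is called "proportional". For two individuals $i$ and $j$, the baseline hazard cancels, so the ratio of their hazards does not depend on $t$:

```latex
\frac{h_i(t)}{h_j(t)}
  = \frac{\exp(\beta^T x_i)\, h_0(t)}{\exp(\beta^T x_j)\, h_0(t)}
  = \exp\{\beta^T (x_i - x_j)\}
```

In particular, if the two individuals differ only by one unit in the $k$th explanatory variable, the ratio reduces to $\exp(\beta_k)$.
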
Proportional hazards (continued)

If we specify a parametric functional form for the baseline hazard function $h_0(t)$, we obtain a parametric proportional hazards model.
Later, we will see that $h_0(t)$ can be left completely unspecified (Cox proportional hazards model).


Proportional hazards model for the comparison of two
groups

Suppose that we want to compare two groups of survival times: Group I vs. Group II. Let $X$ be the group indicator, 1 if Group II, 0 otherwise.
Under the proportional hazards model, the hazard of death at time $t$ for the $i$th individual is given by
$h_i(t) = e^{\beta x_i} h_0(t)$ (5)
Consequently, the hazard at time $t$ for an individual in Group I is $h_0(t)$, and that for an individual in Group II is $h_1(t) = e^{\beta} h_0(t)$.
The hazard ratio is $h_1(t)/h_0(t) = e^{\beta}$.


A parametric survival distribution: Weibull

Weibull distribution:

$\lambda(t) = \lambda\gamma t^{\gamma-1}, \quad \gamma > 0, \ \lambda > 0$
$\Lambda(t) = \lambda t^{\gamma}$
$\log\{\Lambda(t)\} = \log(\lambda) + \gamma \log(t)$
$S(t) = \exp(-\lambda t^{\gamma})$

The Weibull is an extension of the simple exponential model, which has only the scale parameter $\lambda$. The parameter $\gamma$ is called the shape parameter. Some notes:
If $\gamma = 1$, the distribution is exponential (constant hazard)
If $\gamma > 1$, the hazard is increasing over time; if $\gamma < 1$, it is decreasing
In Stata, $\gamma = p$
If the Weibull model holds, the log cumulative hazard $\log\{\Lambda(t)\}$ should be linear in $\log(t)$ with intercept $\log(\lambda)$ and slope $\gamma$
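The Weibull formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original Stata session; the function names and the parameter values (`lam = 0.01`, `gamma = 1.5`) are hypothetical and chosen only for illustration. It confirms that the log cumulative hazard has slope $\gamma$ in $\log(t)$ and that the hazard is constant when $\gamma = 1$.

```python
import math

def weibull_hazard(t, lam, gamma):
    """Hazard: lam * gamma * t**(gamma - 1)."""
    return lam * gamma * t ** (gamma - 1)

def weibull_cumhaz(t, lam, gamma):
    """Cumulative hazard: lam * t**gamma."""
    return lam * t ** gamma

def weibull_survival(t, lam, gamma):
    """Survivor function: exp(-lam * t**gamma)."""
    return math.exp(-weibull_cumhaz(t, lam, gamma))

def loglog_slope(t1, t2, lam, gamma):
    """Slope of log cumulative hazard against log time between t1 and t2."""
    rise = (math.log(weibull_cumhaz(t2, lam, gamma))
            - math.log(weibull_cumhaz(t1, lam, gamma)))
    return rise / (math.log(t2) - math.log(t1))

# Hypothetical parameter values, chosen only for illustration.
lam, gamma = 0.01, 1.5

# The slope of log cumulative hazard in log time should equal gamma.
print(loglog_slope(2.0, 50.0, lam, gamma))

# With gamma = 1 the hazard no longer depends on t (exponential model).
print(weibull_hazard(5.0, lam, 1.0), weibull_hazard(50.0, lam, 1.0))
```
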


Weibull proportional hazards model for the
comparison of two groups

We now make the additional assumption that the survival times for the individuals in Group I have a Weibull distribution $W(\lambda, \gamma)$, so that

$h_0(t) = \lambda\gamma t^{\gamma-1}$ (6)

Then the hazard function for those in Group II is $e^{\beta} h_0(t)$, that is,

$e^{\beta}\lambda\gamma t^{\gamma-1}$ (7)

which is the hazard function of a Weibull distribution $W(e^{\beta}\lambda, \gamma)$.
Because multiplying a Weibull hazard by a constant gives another Weibull hazard, this distribution has the proportional hazards property (not all survival distributions do).


The log-cumulative hazard plot

When a single sample of survival times has a Weibull distribution $W(\lambda, \gamma)$, then $\log\{H_0(t)\} = \log\lambda + \gamma\log(t)$.
It then follows that if the survival times in a second group have a $W(e^{\beta}\lambda, \gamma)$ distribution, then $\log\{H_1(t)\} = (\beta + \log\lambda) + \gamma\log(t)$.
Thus, if the assumptions of proportional hazards and Weibull survival times are tenable, a plot of the estimated log-cumulative hazard function (estimated using Kaplan-Meier, for instance) against the logarithm of the survival time for individuals in the two groups should give two approximately parallel straight lines.


The log-cumulative hazard plot (continued)

If the two lines in a log-cumulative hazard plot are essentially straight but not parallel, the shape parameter $\gamma$ (which governs how the hazard changes over time) differs between the two groups, and the hazards are no longer proportional.

If the lines are not particularly straight, the Weibull model may not be appropriate. However, if the two curves can be taken to be parallel, the proportional hazards assumption is still plausible, and a proportional hazards model with a different baseline hazard (or even one that leaves $h_0(t)$ unspecified) may be appropriate.


Example: Prognosis in women with breast cancer by
tumor marker staining

Table 1: Survival times (in months) of women with tumors that were negatively or positively stained with Helix pomatia agglutinin (HPA). Censored times are labeled with an asterisk. From the Collett book (Leathem & Brooks, Lancet 1987).

Negative staining      Positive staining
       23                 5        68
       47                 8        71
       69                10        76*
       70*               13       105*
       71*               18       107*
      100*               24       109*
      101*               26       113
      148                26       116*
      181                31       118
      198*               35       143
      208*               40       154*
      212*               41       162*
      224*               48       188*
                         50       212*
                         59       217*
                         61       225*


Example: Prognosis for women with breast cancer
(continued)
stain codes: 1 = negative staining, 2 = positive staining

. use prognosis_breast_cancer.dta

. stset time status

. sts graph, by(stain)

[Figure: Kaplan-Meier survival estimates by stain group (stain = 1, stain = 2); x-axis: analysis time, 0 to 250; y-axis: survival probability, 0.00 to 1.00]


Example: Prognosis for women with breast cancer
(continued)

. sts generate survf = s, by(stain)

. generate cumhazard = -log(survf)

. generate lcumhazard = log(cumhazard)

. generate ltime = log(time)

. graph twoway scatter lcumhazard ltime if status==1 & stain==1 ///
>     || scatter lcumhazard ltime if status==1 & stain==2 ///
>     , ///
>     legend(order(1 "negative stain" 2 "positive stain"))
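The Stata steps above (Kaplan-Meier estimate, then $-\log$, then $\log$ again) can be sketched in Python as a cross-check. This is an illustrative re-implementation, not the code used in the lecture; the data are typed in from Table 1, and the function names `kaplan_meier` and `log_cumhaz_points` are hypothetical.

```python
import math

# (time, status) pairs typed in from Table 1; status 1 = death, 0 = censored.
negative = [(23, 1), (47, 1), (69, 1), (70, 0), (71, 0), (100, 0), (101, 0),
            (148, 1), (181, 1), (198, 0), (208, 0), (212, 0), (224, 0)]
positive = [(5, 1), (8, 1), (10, 1), (13, 1), (18, 1), (24, 1), (26, 1),
            (26, 1), (31, 1), (35, 1), (40, 1), (41, 1), (48, 1), (50, 1),
            (59, 1), (61, 1), (68, 1), (71, 1), (76, 0), (105, 0), (107, 0),
            (109, 0), (113, 1), (116, 0), (118, 1), (143, 1), (154, 0),
            (162, 0), (188, 0), (212, 0), (217, 0), (225, 0)]

def kaplan_meier(data):
    """Return [(t, S(t))] at each distinct death time."""
    surv, out = 1.0, []
    for t in sorted({time for time, died in data if died == 1}):
        at_risk = sum(1 for time, _ in data if time >= t)
        deaths = sum(1 for time, died in data if time == t and died == 1)
        surv *= 1 - deaths / at_risk
        out.append((t, surv))
    return out

def log_cumhaz_points(data):
    """Points (log t, log(-log S(t))) for the log-cumulative hazard plot."""
    return [(math.log(t), math.log(-math.log(s))) for t, s in kaplan_meier(data)]

km_neg = dict(kaplan_meier(negative))
km_pos = dict(kaplan_meier(positive))
print(round(km_neg[181], 4), round(km_pos[143], 4))
for ltime, lcumhaz in log_cumhaz_points(negative):
    print(round(ltime, 3), round(lcumhaz, 3))
```

The survivor estimates agree with the `sts list` output shown later in the lecture (for example 0.5128 at 181 months for negative staining and 0.2953 at 143 months for positive staining).
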


Prognosis for women with breast cancer (continued)

[Figure: log-cumulative hazard (lcumhazard, about -4 to 0) plotted against log time (ltime, 2 to 5) for the negative-stain and positive-stain groups]


Prognosis for women with breast cancer: default form
. streg i.stain, dist(weibull) nolog

         failure _d:  status
   analysis time _t:  time

Weibull regression -- log relative-hazard form

No. of subjects =           45                  Number of obs    =          45
No. of failures =           26
Time at risk    =         4331
                                                LR chi2(1)       =        4.14
Log likelihood  =   -60.883962                  Prob > chi2      =      0.0418

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     2.stain |   2.545372   1.271665     1.87   0.061     .9560751    6.776579
       _cons |   .0041365   .0037257    -6.09   0.000     .0007079    .0241707
-------------+----------------------------------------------------------------
       /ln_p |  -.0646417   .1673746    -0.39   0.699    -.3926898    .2634064
-------------+----------------------------------------------------------------
           p |   .9374033   .1568975                      .6752382    1.301355
         1/p |   1.066777   .1785513                      .7684296    1.480959
------------------------------------------------------------------------------

This is the default output produced by Stata, with coefficients expressed as hazard ratios. This form gives estimates of $e^{\beta}$, $\lambda$, and $\gamma$.
The option "nohr" expresses coefficients on the log relative-hazard scale and gives estimates of $\beta$, $\log(\lambda)$, and $\gamma$.
NOTE: In the Stata output, _cons = $\lambda$ and p = $\gamma$.


Prognosis for women with breast cancer (continued)

. display log(2.545372)
.93427681

$\hat\beta = 0.93427681$, $\hat\lambda = 0.0041365$, $\hat\gamma = 0.9374033$

The hazard function for the group with negative staining is estimated to be $\hat h_0(t) = \hat\lambda\hat\gamma t^{\hat\gamma-1}$, and the hazard function for the group with positive staining is estimated to be $\hat h_1(t) = e^{\hat\beta}\hat\lambda\hat\gamma t^{\hat\gamma-1}$.
Since $e^{\hat\beta} = 2.55$, a woman in the HPA-positive group has about two and a half times the risk of death at any given time, compared to a woman whose tumor was HPA negative.
$e^{\hat\beta} = 2.55 > 1$, and the 95% CI for $e^{\beta}$, [0.96, 6.78], only just includes unity. This suggests that women with positively stained tumors have a poorer prognosis than those whose tumors were negatively stained.
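The confidence interval for the hazard ratio can be reproduced from the reported estimate and standard error. The sketch below is an illustrative Python check, assuming (as the numbers suggest) that the interval is formed on the log scale and that the reported standard error of the hazard ratio is the delta-method value $e^{\hat\beta} \cdot \mathrm{se}(\hat\beta)$; the variable names are hypothetical.

```python
import math

# Values copied from the Weibull regression output (2.stain row).
hr, se_hr = 2.545372, 1.271665

beta = math.log(hr)      # log hazard ratio
se_beta = se_hr / hr     # assumed delta-method relation: se(HR) = HR * se(beta)
z_stat = beta / se_beta  # Wald z statistic

z_crit = 1.959964        # 97.5th percentile of the standard normal
ci_low = math.exp(beta - z_crit * se_beta)
ci_high = math.exp(beta + z_crit * se_beta)

print(round(z_stat, 2), round(ci_low, 4), round(ci_high, 4))
```

Under these assumptions the sketch recovers the reported z of 1.87 and an interval of roughly 0.956 to 6.777, which is why the lower limit sits just below 1.
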


Prognosis for women with breast cancer (continued)

Quantiles
The median and other percentiles of the survival time distributions in the two groups can be estimated from the values of $\hat\beta$, $\hat\lambda$ and $\hat\gamma$.
The estimated $p$th percentile for those with negative staining is given by

$\hat t_{p/100} = \left\{ \frac{1}{\hat\lambda} \log\left( \frac{100}{100-p} \right) \right\}^{1/\hat\gamma}$ (8)

The estimated $p$th percentile for those with positive staining is given by

$\hat t_{p/100} = \left\{ \frac{1}{e^{\hat\beta}\hat\lambda} \log\left( \frac{100}{100-p} \right) \right\}^{1/\hat\gamma}$ (9)
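Equations (8) and (9) can be evaluated directly with the estimates from the Weibull fit. The following Python sketch is an illustration only (the function name `weibull_percentile` is hypothetical); it computes the fitted medians for the two staining groups.

```python
import math

# Estimates from the Weibull proportional hazards fit.
lam, gamma, beta = 0.0041365, 0.9374033, 0.93427681

def weibull_percentile(p, scale, shape):
    """p-th percentile of a Weibull W(scale, shape): solves S(t) = 1 - p/100."""
    return (math.log(100 / (100 - p)) / scale) ** (1 / shape)

median_neg = weibull_percentile(50, lam, gamma)                   # equation (8)
median_pos = weibull_percentile(50, math.exp(beta) * lam, gamma)  # equation (9)

print(round(median_neg, 1), round(median_pos, 1))
print(round(median_pos / median_neg, 4))
```

The fitted medians come out at roughly 236 months (negative staining) and 87 months (positive staining). The first of these lies beyond the longest observed time in that group, so it is an extrapolation of the fitted model. The ratio of the two medians is $e^{-\hat\beta/\hat\gamma}$, about 0.37.
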


The Weibull proportional hazards model

But wait . . . if $\gamma$ is not different from 1.0 (here $\hat\gamma = 0.9374033$), then we have an exponential survival model.
A simpler model may be preferred if it fits adequately.
The exponential model is nested within the Weibull, so we can use an LR test in addition to the CI on the output.


Fit Exponential Model for Prognosis of women with
breast cancer

. streg i.stain, dist(exponential) nolog

         failure _d:  status
   analysis time _t:  time

Exponential regression -- log relative-hazard form

No. of subjects =           45                  Number of obs    =          45
No. of failures =           26
Time at risk    =         4331
                                                LR chi2(1)       =        4.36
Log likelihood  =   -60.960708                  Prob > chi2      =      0.0369

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     2.stain |   2.589922    1.28878     1.91   0.056     .9766015    6.868405
       _cons |   .0030266   .0013536   -12.97   0.000     .0012598    .0072716
------------------------------------------------------------------------------
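The likelihood ratio test of the exponential model against the Weibull model can be computed by hand from the two log likelihoods reported in the outputs. The Python sketch below is illustrative and not part of the lecture; it relies on the fact that the upper tail of a $\chi^2_1$ distribution at $x$ equals `erfc(sqrt(x / 2))`.

```python
import math

# Log likelihoods copied from the two streg outputs.
ll_weibull = -60.883962
ll_exponential = -60.960708

# LR statistic: the exponential model is the Weibull with gamma fixed at 1.
lr = -2 * (ll_exponential - ll_weibull)

# Upper tail of chi-square with 1 degree of freedom.
p_value = math.erfc(math.sqrt(lr / 2))

print(round(lr, 6), round(p_value, 3))
```

The statistic is about 0.15 on 1 degree of freedom, giving a p-value near 0.70. This is in line with the Wald test of /ln_p in the Weibull output (p = 0.699), so there is no evidence against the simpler exponential model.
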


Fit Exponential Model for Breast Cancer Data

Note that the stain effect (HR = 2.59) is not much different from the Weibull estimate (HR = 2.55).
The scale parameter (rate of failure) is not much different either, at 0.0030266. What is this number and how does it relate to staining?

   analysis time _t:  time

         |               incidence       no. of    |------ Survival time -----|
stain    | time at risk     rate        subjects        25%       50%       75%
---------+---------------------------------------------------------------------
       1 |         1652   .0030266            13        148         .         .
       2 |         2679   .0078387            32         26        61         .
---------+---------------------------------------------------------------------
   total |         4331   .0060032            45         40       113         .

. * Note _cons term in model is failure rate at stain = 1

. display .0030266 * 2.59
.00783889

Note that the simple incidence rates are reproduced by the model parameter $\lambda$ and the hazard ratio for stain.

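The incidence rates in the table are simply deaths divided by total time at risk, which is the maximum likelihood estimate of the rate under an exponential model. The short Python sketch below illustrates this; the death counts (5 and 21) are not printed in the output above and were counted here from the uncensored times in Table 1.

```python
# Deaths counted from the uncensored times in Table 1 (an assumption of this
# sketch: 5 in the negative group, 21 in the positive group, 26 in total).
deaths = {"negative": 5, "positive": 21}

# Time at risk (months) as reported in the summary table.
time_at_risk = {"negative": 1652, "positive": 2679}

rates = {group: deaths[group] / time_at_risk[group] for group in deaths}
rate_ratio = rates["positive"] / rates["negative"]

print(round(rates["negative"], 7), round(rates["positive"], 7))
print(round(rate_ratio, 6))
```

The two rates match the incidence rates in the table, and their ratio matches the hazard ratio of 2.589922 from the exponential regression.
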
Fit Exponential Model for Breast Cancer Data
Plotting the curves
. stcurve, survival at1(stain=1) at2(stain=2)

[Figure: fitted survival curves from the exponential regression for stain=1 and stain=2; x-axis: analysis time, 0 to 250; y-axis: survival, 0 to 1]

Compare this to the KM curve

The Weibull Proportional Hazards Model

More generally, when there are $p$ explanatory variables $X_1, \ldots, X_p$, under the proportional hazards model the hazard of death at time $t$ for the $i$th individual is

$h_i(t) = \exp(\beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi}) h_0(t)$ (10)

If the baseline hazard function $h_0(t)$ is specified as a Weibull model with scale parameter $\lambda$ and shape parameter $\gamma$, i.e., $h_0(t) = \lambda\gamma t^{\gamma-1}$, the hazard function for the $i$th individual in the study is then given by

$h_i(t) = \exp(\beta^T x_i)\lambda\gamma t^{\gamma-1}$ (11)

From the form of (11), the survival time of the $i$th individual in the study has a Weibull distribution $W(\exp(\beta^T x_i)\lambda, \gamma)$.
The survivor function corresponding to the hazard function in (11) is

$S_i(t) = \exp\{-\exp(\beta^T x_i)\lambda t^{\gamma}\}$ (12)


Example: Multiple Covariates

Treatment of hypernephroma
In a study carried out at the University of Oklahoma Health Sciences Center, data were obtained on the survival times of 36 patients with a malignant tumor in the kidney, or hypernephroma.

Of particular interest is whether the survival time of the patients depends on their age at the time of diagnosis and on whether or not they had received a nephrectomy, or surgical removal of the kidney.


Treatment of hypernephroma

. use treatment_of_hypernephroma.dta
. list in 1/10

     +--------------------------------+
     | nephre~y   age   time   status |
     |--------------------------------|
  1. |        0     1      9        1 |
  2. |        0     1      6        1 |
  3. |        0     1     21        1 |
  4. |        0     2     15        1 |
  5. |        0     2      8        1 |
     |--------------------------------|
  6. |        0     2     17        1 |
  7. |        0     3     12        1 |
  8. |        1     1    104        0 |
  9. |        1     1      9        1 |
 10. |        1     1     56        1 |
     +--------------------------------+

time is recorded in months.
nephrectomy codes: 0 = no nephrectomy, 1 = nephrectomy
age codes: 1 for age <60, 2 for age 60-70, 3 for age >70


Treatment of hypernephroma

Let Age_2 be the indicator for age being in the range 60-70, and Age_3 be the indicator for age being >70. Then a Weibull proportional hazards model could be specified as

$h(t) = \exp(\beta_1 \cdot \text{nephrectomy} + \beta_2 \cdot \text{Age\_2} + \beta_3 \cdot \text{Age\_3}) h_0(t)$ (13)

where $h_0(t) = \lambda\gamma t^{\gamma-1}$.

$h_0(t)$ is the hazard for a subject who did not receive a nephrectomy and was aged <60 at the time of diagnosis.
$\exp(\beta_1)$ is the hazard ratio comparing receiving a nephrectomy to receiving no nephrectomy, adjusting for age. (Equivalently, $\beta_1$ is the increase in log hazard for a 1 unit increase in nephrectomy (1 vs. 0), adjusting for age.)
$\exp(\beta_2)$ is the hazard ratio comparing age group 60-70 to age group <60, adjusting for nephrectomy status.


Treatment of hypernephroma
. streg nephrectomy i.age, dist(weibull) nolog

         failure _d:  status
   analysis time _t:  time

Weibull regression -- log relative-hazard form

No. of subjects =           36                  Number of obs    =          36
No. of failures =           32
Time at risk    =         1340
                                                LR chi2(3)       =       17.13
Log likelihood  =    -43.87881                  Prob > chi2      =      0.0007

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 nephrectomy |   .1919802   .1006891    -3.15   0.002     .0686785    .5366509
             |
         age |
          2  |   1.085593   .4363889     0.20   0.838     .4937408    2.386903
          3  |   5.218136   3.088109     2.79   0.005     1.635956    16.64406
             |
       _cons |   .0170522   .0131496    -5.28   0.000     .0037617    .0772992
-------------+----------------------------------------------------------------
       /ln_p |   .3438972   .1411602     2.44   0.015     .0672284     .620566
-------------+----------------------------------------------------------------
           p |   1.410434    .199097                       1.06954    1.859981
         1/p |   .7090018   .1000828                        .53764    .9349817
------------------------------------------------------------------------------


Treatment of hypernephroma

$\exp(\hat\beta_1) = 0.1919802$, with 95% CI for $\exp(\beta_1)$ of [0.0686785, 0.5366509]. The CI lies entirely below 1, meaning that nephrectomy substantially reduces the hazard of death at any given time, controlling for age.
$\exp(\hat\beta_2) = 1.085593$, with 95% CI for $\exp(\beta_2)$ of [0.4937408, 2.386903]. There is no evidence that the hazard of death is different for age group 60-70 compared to age group <60, controlling for nephrectomy status.
$\exp(\hat\beta_3) = 5.218136$, with 95% CI for $\exp(\beta_3)$ of [1.635956, 16.64406], indicating strong evidence that the mortality hazard at any given time is higher for age group >70 compared to age group <60, controlling for nephrectomy status.
$\hat\lambda = 0.0170522$, $\hat\gamma = 1.410434$; thus the hazard at time $t$ for a subject who did not receive a nephrectomy and was aged <60 at diagnosis is estimated to be $\hat\lambda\hat\gamma t^{\hat\gamma-1}$.

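Because the model is multiplicative, the hazard for any covariate pattern is the baseline hazard times the product of the relevant hazard ratios. The Python sketch below illustrates this with the estimates from the fitted model; the function names `relative_hazard` and `hazard` are hypothetical, and the sketch is not part of the original analysis.

```python
# Hazard ratios and Weibull parameters copied from the streg output.
HR_NEPHRECTOMY = 0.1919802
HR_AGE = {1: 1.0, 2: 1.085593, 3: 5.218136}  # age group 1 (<60) is the reference
LAM, GAMMA = 0.0170522, 1.410434

def relative_hazard(nephrectomy, age_group):
    """Hazard relative to a patient with no nephrectomy and age <60."""
    return (HR_NEPHRECTOMY ** nephrectomy) * HR_AGE[age_group]

def hazard(t, nephrectomy, age_group):
    """Fitted hazard at time t (months) under the Weibull PH model (13)."""
    return relative_hazard(nephrectomy, age_group) * LAM * GAMMA * t ** (GAMMA - 1)

for nephrectomy in (0, 1):
    for age_group in (1, 2, 3):
        print(nephrectomy, age_group, round(relative_hazard(nephrectomy, age_group), 4))
```

For example, under this additive (no interaction) model a patient aged >70 who received a nephrectomy has an estimated relative hazard of about 0.192 x 5.218, which is close to 1: the two effects roughly cancel relative to the baseline group.
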
Comparing alternative Weibull models

We can use the likelihood ratio test (difference in deviance) to compare two nested models.
Suppose the smaller model contains $p$ explanatory variables and the larger model contains $k$ extra explanatory variables; the maximized likelihoods under the smaller and larger models are $\hat L_s$ and $\hat L_l$, respectively.
Under $H_0$ (the smaller model is correct), $-2(\log\hat L_s - \log\hat L_l)$ has an approximate $\chi^2_k$ distribution.
In a given model, we can also examine individual coefficients and tests on these to determine which factors are significant predictors of failure.


Treatment of hypernephroma: with interaction terms
. xi: streg i.age*nephrectomy, dist(weibull) nolog
i.age             _Iage_1-3           (naturally coded; _Iage_1 omitted)
i.age*nephrec~y   _IageXnephr_#       (coded as above)

         failure _d:  status
   analysis time _t:  time

Weibull regression -- log relative-hazard form

No. of subjects =           36                  Number of obs    =          36
No. of failures =           32
Time at risk    =         1340
                                                LR chi2(5)       =       21.82
Log likelihood  =   -41.532133                  Prob > chi2      =      0.0006

-------------------------------------------------------------------------------
           _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
      _Iage_2 |   .9183851   .7500492    -0.10   0.917     .1852869    4.552028
      _Iage_3 |   1.121983   1.297394     0.10   0.921     .1163344    10.82093
  nephrectomy |   .0875388   .0632624    -3.37   0.001     .0212351    .3608657
_IageXnephr_2 |   1.128947   1.061025     0.13   0.897     .1789303    7.123004
_IageXnephr_3 |   12.65381   16.80464     1.91   0.056     .9371311    170.8609
        _cons |   .0187577   .0159487    -4.68   0.000     .0035436    .0992918
--------------+----------------------------------------------------------------
        /ln_p |   .4407051   .1457169     3.02   0.002     .1551052    .7263049
--------------+----------------------------------------------------------------
            p |   1.553802   .2264153                      1.167781    2.067427
          1/p |   .6435825   .0937809                       .483693    .8563251
-------------------------------------------------------------------------------


Treatment of hypernephroma

Comparing the two nested models, without vs. with the interaction between age and nephrectomy:

$-2(\log\hat L_s - \log\hat L_l) = -2 \times \{-43.87881 - (-41.532133)\} = 4.693354$ (14)

. di chi2tail(2, 4.693354)
.0956866
The interaction is weak and not really needed.
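The same test can be reproduced without Stata. The Python sketch below is illustrative; it uses the closed form for the upper tail of a $\chi^2$ distribution with 2 degrees of freedom, $P(\chi^2_2 > x) = e^{-x/2}$, which applies here because the interaction adds $k = 2$ parameters.

```python
import math

# Log likelihoods from the models without and with the interaction terms.
ll_small = -43.87881
ll_large = -41.532133

lr = -2 * (ll_small - ll_large)

# Upper tail of chi-square with 2 degrees of freedom has a closed form.
p_value = math.exp(-lr / 2)

print(round(lr, 6), round(p_value, 7))
```

This agrees with the `chi2tail(2, 4.693354)` result of .0956866 shown above.
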


Alternative Model: The Accelerated Failure Time
Regression Model

The accelerated failure time model is a parametric model that provides an alternative to the commonly used proportional hazards model.
Unlike a proportional hazards model, which assumes that the effect of a covariate is to multiply the hazard by some constant, an accelerated failure time model assumes that the effect of a covariate is to accelerate or decelerate the time to event by some constant.
Compared with the proportional hazards model, the accelerated failure time model may provide a more straightforward way to think about covariates in relation to survival time, although it is less widely used and less familiar.


Accelerated failure time model for the comparison of
two groups (continued)

Suppose that we want to compare two treatments: a standard treatment, S, and a new treatment, N.
Under the accelerated failure time model, the survival time of an individual on the new treatment is taken to be a multiple, $\phi$, of the survival time for an individual on the standard treatment. Thus the effect of the new treatment is to "speed up" or "slow down" the passage of time.
Under this assumption, the probability that an individual on the new treatment survives beyond time $t$ is the probability that an individual on the standard treatment survives beyond time $t/\phi$.


Accelerated failure time model for the comparison of
two groups

Let $S_S(t)$ and $S_N(t)$ be the survivor functions for individuals in the two treatment groups. Then the accelerated failure time model specifies that

$S_N(t) = S_S(t/\phi)$ (15)

$\phi$ is often called the acceleration factor.
One interpretation of this model is that the survival time of an individual on the new treatment is $\phi$ times the survival time that the individual would have experienced under the standard treatment.
$\phi < 1$ corresponds to an acceleration in the time to death of an individual assigned to the new treatment, relative to an individual on the standard treatment.
$\phi > 1$ corresponds to a deceleration in the time to death of an individual assigned to the new treatment, relative to an individual on the standard treatment.

Accelerated failure time model for the comparison of
two groups (continued)

One can also interpret this model in terms of percentiles of the survival times of patients on the new and standard treatments.
Consider the median survival times of patients on the new and standard treatments, $t^N_{.50}$ and $t^S_{.50}$ say. By definition, $S_N(t^N_{.50}) = S_S(t^S_{.50}) = 0.5$.
Under the accelerated failure time model, $S_N(t^N_{.50}) = S_S(t^N_{.50}/\phi)$.
Therefore, $t^N_{.50}/\phi = t^S_{.50} \Leftrightarrow t^N_{.50} = \phi\, t^S_{.50}$.

In other words, under the accelerated failure time model, the median survival time of a patient on the new treatment is $\phi$ times that of a patient on the standard treatment.
The same argument can be used for any percentile of the survival time distribution: $t^N_{p/100} = \phi\, t^S_{p/100}$.
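The percentile argument can be illustrated numerically. The Python sketch below is a toy example, not from the lecture: the Weibull survivor function for the standard treatment, its parameters, and the value $\phi = 0.5$ are all hypothetical. It uses the relation stated above, that the survival time under the new treatment is $\phi$ times the time under the standard treatment, so that $P(T_N > t) = P(T_S > t/\phi)$.

```python
import math

# Hypothetical Weibull survivor function for the standard treatment.
LAM, GAMMA = 0.01, 1.2

def s_standard(t):
    return math.exp(-LAM * t ** GAMMA)

def s_new(t, phi):
    """If T_new = phi * T_standard, then P(T_new > t) = P(T_standard > t / phi)."""
    return s_standard(t / phi)

def standard_percentile(p):
    """Time by which p percent of standard-treatment patients have died."""
    return (math.log(100 / (100 - p)) / LAM) ** (1 / GAMMA)

phi = 0.5  # hypothetical: the new treatment halves survival times

for p in (25, 50, 75):
    t_s = standard_percentile(p)
    t_n = phi * t_s  # claimed percentile under the new treatment
    # The last column should equal 1 - p/100 if t_n really is the p-th percentile.
    print(p, round(t_s, 2), round(t_n, 2), round(s_new(t_n, phi), 4))
```

Every percentile is scaled by the same factor $\phi$, which is what the percentile-percentile plot described next exploits.
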


Accelerated failure time model for the comparison of
two groups (continued)

Let $X$ be an indicator variable that takes the value 0 for an individual in the standard treatment group and 1 for an individual in the new treatment group. Let $S_0(t)$ be the baseline survivor function (the survivor function for subjects with $x = 0$). Under the accelerated failure time model,

$S_i(t) = S_0(t/\phi^{x_i})$ (16)

Since $\phi$ must be positive, it is convenient to set $\phi = e^{\beta}$; the model then becomes

$S_i(t) = S_0(e^{-\beta x_i} t)$ (17)

and so

$h_i(t) = e^{-\beta x_i} h_0(t e^{-\beta x_i})$ (18)


Assessing the validity of an accelerated failure time
model

Consider the accelerated failure time model for individuals in the two treatment groups. As shown before, the model says that the relationship between the $p$th percentiles of the survival times of patients on the new and standard treatments is given by

$t^N_{p/100} = e^{\beta} t^S_{p/100}$ (19)

This means that a plot of the quantity $\hat t^N_{p/100}$ against $\hat t^S_{p/100}$, for suitably chosen values of $p$, should give a straight line through the origin if the accelerated failure time model is appropriate. The slope of this line will be an estimate of $e^{\beta}$.
This type of plot is called a percentile-percentile plot, also known as the quantile-quantile plot or Q-Q plot.


Example: Prognosis for women with breast cancer
(continued)

Let $X$ be the indicator for staining status: 1 for positive staining and 0 for negative staining.
Let $S_0(t)$ be the baseline survivor function; in this example, it is the survivor function for the negative staining group.
Under the accelerated failure time model,

$S_i(t) = S_0(t e^{-\beta x_i})$ (20)


Prognosis for women with breast cancer

. use prognosis_breast_cancer.dta

. stset time status

. sts list, by(stain)

         failure _d:  status
   analysis time _t:  time

Time  Beg. Total  Fail  Net Lost  Survivor Function  Std. Error  [95% Conf. Int.]
-------------------------------------------------------------------------------
stain=1
23 13 1 0 0.9231 0.0739 0.5664 0.9888
47 12 1 0 0.8462 0.1001 0.5122 0.9591
69 11 1 0 0.7692 0.1169 0.4421 0.9191
70 10 0 1 0.7692 0.1169 0.4421 0.9191
71 9 0 1 0.7692 0.1169 0.4421 0.9191
100 8 0 1 0.7692 0.1169 0.4421 0.9191
101 7 0 1 0.7692 0.1169 0.4421 0.9191
148 6 1 0 0.6410 0.1522 0.2818 0.8555
181 5 1 0 0.5128 0.1673 0.1756 0.7738
198 4 0 1 0.5128 0.1673 0.1756 0.7738
208 3 0 1 0.5128 0.1673 0.1756 0.7738
212 2 0 1 0.5128 0.1673 0.1756 0.7738
224 1 0 1 0.5128 0.1673 0.1756 0.7738

stain=2
5 32 1 0 0.9688 0.0308 0.7982 0.9955
8 31 1 0 0.9375 0.0428 0.7725 0.9840
10 30 1 0 0.9063 0.0515 0.7369 0.9688
13 29 1 0 0.8750 0.0585 0.7004 0.9512
18 28 1 0 0.8438 0.0642 0.6646 0.9318
24 27 1 0 0.8125 0.0690 0.6295 0.9111
26 26 2 0 0.7500 0.0765 0.5618 0.8663
31 24 1 0 0.7188 0.0795 0.5291 0.8426
35 23 1 0 0.6875 0.0819 0.4971 0.8180
40 22 1 0 0.6563 0.0840 0.4658 0.7927
41 21 1 0 0.6250 0.0856 0.4352 0.7668
48 20 1 0 0.5938 0.0868 0.4052 0.7402
50 19 1 0 0.5625 0.0877 0.3759 0.7130
59 18 1 0 0.5313 0.0882 0.3471 0.6852
61 17 1 0 0.5000 0.0884 0.3190 0.6567
68 16 1 0 0.4688 0.0882 0.2915 0.6277
71 15 1 0 0.4375 0.0877 0.2646 0.5981
76 14 0 1 0.4375 0.0877 0.2646 0.5981
105 13 0 1 0.4375 0.0877 0.2646 0.5981
107 12 0 1 0.4375 0.0877 0.2646 0.5981
109 11 0 1 0.4375 0.0877 0.2646 0.5981
113 10 1 0 0.3937 0.0892 0.2230 0.5605
116 9 0 1 0.3937 0.0892 0.2230 0.5605
118 8 1 0 0.3445 0.0906 0.1776 0.5184
143 7 1 0 0.2953 0.0900 0.1366 0.4736
154 6 0 1 0.2953 0.0900 0.1366 0.4736
162 5 0 1 0.2953 0.0900 0.1366 0.4736
188 4 0 1 0.2953 0.0900 0.1366 0.4736
212 3 0 1 0.2953 0.0900 0.1366 0.4736
217 2 0 1 0.2953 0.0900 0.1366 0.4736
225 1 0 1 0.2953 0.0900 0.1366 0.4736
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
. * recode for modeling later
. replace stain = stain-1
(45 real changes made)


Prognosis for women with breast cancer

Table 2: Estimated percentiles of the distributions of survival times for women with tumors that were positively or negatively stained

Percentile   Negative staining   Positive staining
    10              47                  13
    20              69                  26
    30             148                  35
    40             181                  48
    50               -                  61
    60               -                 113
    70               -                 143
    80               -                   -


Prognosis for women with breast cancer

. clear

. input Percentile_negative_stain Percentile_positive_stain

     P~negat~n  P~posit~n
  1. 47 13
  2. 69 26
  3. 148 35
  4. 181 48
  5. end

. graph twoway (scatter Percentile_positive_stain Percentile_negative_stain) ///
>     (lfit Percentile_positive_stain Percentile_negative_stain) ///
>     , xlabel(0(50)200) ylabel(0(20)60)


Prognosis for women with breast cancer: the percentile-percentile plot

[Figure: percentiles for positive staining (Percentile_positive_stain, 0 to 60) plotted against percentiles for negative staining (Percentile_negative_stain, 0 to 200), with fitted line]

The points fall on a reasonably straight line roughly through the origin, suggesting that the accelerated failure time model would not be inappropriate. (However, this conclusion must be regarded with some caution due to the limited number of points in the graph.)
The slope of the line (a rough estimate of $e^{\beta}$) is smaller than 1, suggesting that for women whose tumors were positively stained, the disease process is speeded up relative to those whose tumors were negatively stained. E.g., the median survival time for women with HPA-positive tumors is shorter than the median survival time for women with HPA-negative tumors.

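A rough slope can be computed from the four percentile pairs in Table 2. The Python sketch below is illustrative and deliberately differs from the Stata `lfit` line, which includes an intercept: here the line is forced through the origin, as the accelerated failure time model implies, using the least-squares slope $\sum xy / \sum x^2$.

```python
# Percentile pairs (10th to 40th) from Table 2.
negative = [47, 69, 148, 181]
positive = [13, 26, 35, 48]

# Least-squares slope of a line constrained to pass through the origin.
slope = (sum(x * y for x, y in zip(negative, positive))
         / sum(x * x for x in negative))

print(round(slope, 3))
```

With only four points this slope (about 0.26) is a crude estimate of $e^{\beta}$, but it is well below 1, in agreement with the conclusion drawn from the plot.
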
Prognosis for women with breast cancer : Weibull
accelerated failure time model

Let’s further assume that the survival times in the negative


staining group have a Weibull distribution W (λ, γ), so that

S 0 (t ) = exp(−λt γ ) (21)

Under the accelerated failure time model,

S i (t ) = S 0 (t e βxi ) = exp(−λ(e (βxi ) )γ t γ ) (22)

which is a survivor function for a Weibull distribution


W (λ(e (βxi ) )γ , γ). The Weibull distribution has the accelerated failure
time property.
Weibull distribution is the only distribution that has both
proportional hazard property and accelerated failure time property.



Prognosis for women with breast cancer

. streg stain, distribution(weibull) nolog time
...
Weibull regression -- accelerated failure-time form

No. of subjects =           45                  Number of obs    =          45
No. of failures =           26
Time at risk    =         4331
                                                LR chi2(1)       =        4.14
Log likelihood  =   -60.883962                  Prob > chi2      =      0.0418

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       stain |  -.9966647   .5440936    -1.83   0.067    -2.063069    .0697391
       _cons |   5.854364   .4988778    11.74   0.000     4.876581    6.832146
-------------+----------------------------------------------------------------
       /ln_p |  -.0646417   .1673746    -0.39   0.699    -.3926898    .2634064
-------------+----------------------------------------------------------------
           p |   .9374033   .1568975                      .6752382    1.301355
         1/p |   1.066777   .1785513                      .7684296    1.480959
------------------------------------------------------------------------------

p gives the estimate of γ
stain gives the estimate of β
_cons gives the estimate of −log(λ)/γ
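The mapping between `_cons` and $\lambda$ can be checked against the form in which Stata's streg parameterizes the Weibull model in the accelerated failure-time metric. In the sketch below, $\beta_0$ is a label introduced here for the `_cons` coefficient and $p$ is Stata's shape parameter.

```latex
\begin{align*}
S(t \mid x) &= \exp\left\{-e^{-p(\beta_0 + \beta x)}\, t^{p}\right\} \\
S_0(t) = S(t \mid x = 0) &= \exp\left\{-e^{-p\beta_0}\, t^{p}\right\}
  = \exp(-\lambda t^{\gamma}) \\
\Rightarrow\quad \gamma = p, \quad \lambda &= e^{-\gamma \beta_0},
\quad \beta_0 = -\frac{\log \lambda}{\gamma}
\end{align*}
```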
Prognosis for women with breast cancer
$\hat{\beta} = -0.9966647$, $\hat{\gamma} = 0.9374033$
. di exp(-0.9374033*(5.854364))
.00413652

$\hat{\lambda} = 0.00413652$
The estimated $S(t)$ for the negative staining group (baseline):
$$\hat{S}_0(t) = \exp(-\hat{\lambda} t^{\hat{\gamma}}) \qquad (23)$$
In Stata's log-time parameterization, $\hat{S}_i(t) = \hat{S}_0(t \exp(-\hat{\beta} x_i))$, thus the estimated survivor function for the
positive staining group ($x_i = 1$) is given by
$$\hat{S}_{pos}(t) = \hat{S}_0(t e^{0.9966647}) = \hat{S}_0(t / 0.3691085) \qquad (24)$$

$e^{\hat{\beta}} = 0.37$, with a 95% CI $(e^{-2.063069}, e^{0.0697391}) = (0.13, 1.07)$.

$\hat{t}^{\,pos}_{p/100} = 0.37\, \hat{t}^{\,neg}_{p/100}$. Comparing a positive staining subject to a
negative staining subject (baseline), the median (or any other
percentile) survival time for a positive staining subject will be
about 37% of that for a negative staining subject.
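The quantities on this slide can be reproduced from the reported estimates with a few `display` commands. This is a sketch only: the scalar names are hypothetical, the numbers are copied from the streg output, and the hazard-ratio line assumes the Weibull relation (log hazard ratio = −γβ) in the log-time convention.

```stata
* Sketch: recover slide quantities from the reported streg estimates.
* Scalar names are hypothetical; values are copied from the output above.
scalar b_stain = -0.9966647
scalar b_cons  =  5.854364
scalar p_hat   =  0.9374033

* lambda-hat, using _cons = -log(lambda)/gamma
scalar lam = exp(-p_hat * b_cons)
display lam

* time ratio and its 95% confidence limits
display exp(b_stain)
display exp(-2.063069) "  " exp(0.0697391)

* model-based medians: solve exp(-lambda * t^gamma) = 0.5 for t
scalar med_neg = (ln(2)/lam)^(1/p_hat)
scalar med_pos = exp(b_stain) * med_neg
display med_neg "  " med_pos

* implied hazard ratio (positive vs negative), assuming log HR = -gamma*beta
display exp(-p_hat * b_stain)
```

The first two displayed values should agree with the slide (0.00413652 and 0.3691085); the medians are model-based extrapolations and should be read with the same caution as the fit itself.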
Prognosis for women with breast cancer

Stata can also report the results directly on the time-ratio scale (option tr):


. streg stain1, distribution(weibull) nolog time tr

         failure _d:  status
   analysis time _t:  time

Weibull regression -- accelerated failure-time form

No. of subjects =           45                  Number of obs    =          45
No. of failures =           26
Time at risk    =         4331
                                                LR chi2(1)       =        4.14
Log likelihood  =   -60.883962                  Prob > chi2      =      0.0418

------------------------------------------------------------------------------
          _t | Time Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      stain1 |   .3691085   .2008296    -1.83   0.067     .1270635    1.072228
       _cons |    348.753   173.9851    11.74   0.000     131.1814    927.1788
-------------+----------------------------------------------------------------
       /ln_p |  -.0646417   .1673746    -0.39   0.699    -.3926898    .2634064
-------------+----------------------------------------------------------------
           p |   .9374033   .1568975                      .6752382    1.301355
         1/p |   1.066777   .1785513                      .7684296    1.480959
------------------------------------------------------------------------------



Is the survival time at each percentile of the HLA-positive survival curve
about 0.37 times the corresponding time on the HLA-negative survival
curve? Check this on the KM plot.
[Figure: Kaplan-Meier survival estimates by group (stain1 = 0 and stain1 = 1); x-axis: analysis time (0 to 250); y-axis: estimated survival (0.00 to 1.00).]
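One way to carry out this check is to redraw the Kaplan-Meier curves and list the estimated percentiles by group. The sketch assumes the data are already `stset` with failure variable `status` and analysis time `time`, as the streg header indicates, and that the grouping variable is `stain1`.

```stata
* Sketch: Kaplan-Meier curves and percentile survival times by staining group.
* Assumes the data are already stset (failure: status, analysis time: time).
sts graph, by(stain1)

* Reports the 25%, 50%, and 75% survival times in each group, where estimable;
* the ratio positive/negative at each percentile can be compared with 0.37.
stsum, by(stain1)
```

With heavy censoring some percentiles may not be estimable in one or both groups, so the comparison may be possible only at the earlier percentiles.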



The General Accelerated Failure Time Model

The accelerated failure time model for comparing two groups can be generalized to the situation where there are p
explanatory variables $X_1, \ldots, X_p$,

$$S_i(t) = S_0\left(t \exp\{-(\beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi})\}\right) \qquad (25)$$

where $S_0(t)$ is the baseline survivor function (for an individual for whom all $x_{ji} = 0$). The minus sign matches the
log-time convention used by Stata's streg, under which $e^{\beta_j}$ is the time ratio for a one-unit increase in $x_j$.
Different models for $S_0(t)$ give different parametric accelerated
failure time models. Commonly used parametric models are the
Weibull, log-logistic, and lognormal models.
A positive β for an explanatory variable means that larger values are
associated with longer survival time, controlling for the other
variables in the model; a negative β means that larger values are
associated with shorter survival time, controlling for the other
variables in the model.
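To illustrate how the choice of $S_0(t)$ changes the fitted model, the alternatives named above could be fitted to the same data and compared informally. This is a sketch under the assumption that the data are still `stset` and that `stain` is the only covariate; comparing by AIC/BIC is a choice made here for illustration and is not taken from the lecture.

```stata
* Sketch: fit three parametric AFT models and compare information criteria.
* Assumes the data are stset and -stain- is the covariate, as in the earlier fit.
streg stain, distribution(weibull) time tr nolog
estat ic

streg stain, distribution(loglogistic) tr nolog
estat ic

streg stain, distribution(lognormal) tr nolog
estat ic
```

The log-logistic and lognormal models are fitted only in the accelerated failure-time metric, so the `time` option is needed only for the Weibull fit.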
Summary

Parametric Survival Regression Models


These models provide a way to incorporate covariates and to
estimate survival quantities efficiently, provided that the model fit
is adequate.
Next: a flexible semi-parametric model (it happens to have PH form, but
flexible extensions are available).

