Abstract

This review introduces methods of analyzing data arising from studies where the
response variable is the length of time taken to reach a certain end-point, often
death. The Kaplan-Meier methods, log rank test and Cox's proportional hazards
model are described.
Keywords: Cox's proportional-hazards model, cumulative hazard function H(t),
hazard ratio, Kaplan-Meier method, log rank test, survival function S(t)
Introduction
Survival times are data that measure follow-up time from a defined starting point
to the occurrence of a given event, for example the time from the beginning to the
end of a remission period or the time from the diagnosis of a disease to death.
Standard statistical techniques cannot usually be applied because the underlying
distribution is rarely Normal and the data are often 'censored'. A survival time is
described as censored when there is a follow-up time but the event has not yet
occurred or is not known to have occurred. For example, if remission time is being
studied and the patient is still in remission at the end of the study, then that
patient's remission time would be censored. If a patient for some reason drops out
of a study before the end of the study period, then that patient's follow-up time
would also be considered to be censored.
The hypothetical data set given in Table 1 will be used for illustrative
purposes in this review. For this data set the event is the death of the patient, and so
the censored data are those where the outcome is survived or unknown.

Table 1
Survival time, age and outcome for a group of patients diagnosed with a disease
and receiving one of two treatments
Estimating the survival curve using the Kaplan-Meier method
In analyzing survival data, two functions that are dependent on time are of
particular interest: the survival function and the hazard function. The survival
function S(t) is defined as the probability of surviving at least to time t. The hazard
function h(t) is the conditional probability of dying at time t having survived to that
time.
The graph of S(t) against t is called the survival curve. The Kaplan-Meier method
can be used to estimate this curve from the observed survival times without the
assumption of an underlying probability distribution. The method is based on the
basic idea that the probability of surviving k or more periods from entering the
study is a product of the k observed survival rates for each period (i.e. the
cumulative proportion surviving), given by the following:
$$S(k) = p_1 \times p_2 \times p_3 \times \dots \times p_k$$

Here, $p_1$ is the proportion surviving the first period, $p_2$ is the proportion surviving
beyond the second period conditional on having survived up to the second period,
and so on. The proportion surviving period i having survived up to period i is given
by:

$$p_i = \frac{r_i - d_i}{r_i}$$

where $r_i$ is the number alive at the beginning of the period and $d_i$ the number of
deaths within the period.
To illustrate the method the data for the patients receiving treatment 2 from Table 1
will be used. The survival times, including the censored values (indicated
by + in Table 2), must be ordered in increasing duration. If a censored time
has the same value as an uncensored time, then the uncensored should precede the
censored. The calculations are shown in Table 2. Where there is a censored
time the proportion surviving will be 1. This does not alter the cumulative
proportion surviving, and so these calculations can be omitted from the table. For
more detailed explanation, see Swinscow and Campbell [1].
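
The product of conditional survival proportions described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calculation behind Table 2: the function name `kaplan_meier` and the follow-up times are hypothetical, and a patient censored at the same time as a death is counted as still at risk at that time, in line with the ordering rule given in the text.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, cumulative proportion surviving)] at each death time.

    times  -- follow-up time for each patient
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve = []
    surviving = 1.0
    for t in sorted(deaths):
        r = sum(1 for u in times if u >= t)  # r_i: alive at start of period
        d = deaths[t]                        # d_i: deaths within the period
        surviving *= (r - d) / r             # multiply by p_i = (r_i - d_i) / r_i
        curve.append((t, surviving))
    return curve


# Hypothetical data (NOT the values from Table 1); 0 marks a censored time.
times = [1, 2, 2, 3, 5, 5, 8, 10]
events = [1, 1, 0, 1, 1, 0, 0, 1]

for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Censored times contribute no row of their own, which mirrors the remark that their proportion surviving is 1, but they still reduce the number at risk for later periods.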

Table 2
Calculations for the Kaplan-Meier estimate of the survival function for the
treatment 2 data from Table 1
Plotting the cumulative proportion surviving against the survival times gives the
stepped survival curve shown in Fig. 1.

Figure 1
Plot of the survival curve for treatment 2.
This method is found in most statistical packages. Figure 2 is the output
from a statistical package used to compare the survival curves for the two
treatment groups for the data given in Table 1.

Figure 2
Survival curves for the two treatment groups for the data in Table 1.
It can be seen that patients on treatment 1 appear to have a higher survival rate than
those on treatment 2. The graph can be used to estimate the median survival time
because this is the time with probability of survival of 0.5. The median survival
time for those on treatment 2 appears to be 5 days versus about 37 days on
treatment 1.
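
Reading the median off a stepped curve can also be done programmatically. The sketch below assumes the curve is a list of (time, cumulative proportion surviving) pairs in time order. The name `median_survival` and the curve values are hypothetical and are not taken from Figure 2.

```python
def median_survival(curve):
    """Return the first time at which estimated survival is 0.5 or less.

    curve -- list of (time, cumulative proportion surviving), in time order
    Returns None if the curve never falls to 0.5 (median not reached).
    """
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical step curve, for illustration only.
curve = [(1, 0.875), (2, 0.75), (3, 0.6), (5, 0.45), (10, 0.0)]
print(median_survival(curve))
```

When heavy censoring keeps the curve above 0.5 throughout follow-up, the median cannot be estimated, which is why the function returns `None` in that case.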
Comparing survival curves of two groups using the log rank test
Comparison of two survival curves can be done using a statistical hypothesis test
called the log rank test. It is used to test the null hypothesis that there is no
difference between the population survival curves (i.e. the probability of an event
occurring at any time point is the same for each population). The test statistic is
calculated as follows:

$$\chi^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2}$$

where $O_1$ and $O_2$ are the total numbers of observed events in groups 1 and 2,
respectively, and $E_1$ and $E_2$ the total numbers of expected events.
The total expected number of events for a group is the sum of the expected number
of events at the time of each event. The expected number of events at the time of
an event can be calculated as the risk for death at that time multiplied by the
number alive in the group. Under the null hypothesis, the risk of death (number of
deaths/number alive) can be calculated from the combined data for both groups.
Table 3 shows the calculation of the expected number of deaths for
treatment group 2 for the example data. For example, at the beginning of day 4
when the third death (event 3) takes place, there are 13 patients still alive. One
dies, giving a risk for death of 1/13 = 0.077. Six of the 13 patients are from
treatment group 2, and therefore the expected number of deaths is given by 6 ×
0.077 = 0.46 at event 3. The total expected number of events for group 2 is
calculated as:

Table 3
Calculations for the log-rank test to compare treatments for the data in Table 1

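$$E_2 = \sum_i \frac{d_i}{r_i} \times r_{2i}$$

where $d_i$ and $r_i$ are the number of deaths and the number alive in both groups
combined at the time of event i, and $r_{2i}$ is the number alive from group 2 at the
time of event i. $E_1$ can be calculated as n - $E_2$, where n is the total number of
events.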
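
The expected-event calculation can be sketched as a sum over event times. The first call below reproduces the event 3 figure quoted in the text (one death among 13 alive, six of them in group 2). The function name `expected_events` and the remaining rows are hypothetical and do not come from Table 3.

```python
def expected_events(rows):
    """Sum (deaths / number alive) * number alive in the group, over events.

    rows -- iterable of (d_i, r_i, r_gi): deaths and number alive in both
            groups combined, and number alive in the group of interest
    """
    return sum(d / r * r_g for d, r, r_g in rows)


# Event 3 from the text: 1 death among 13 alive, 6 of them in group 2.
print(round(expected_events([(1, 13, 6)]), 2))  # 0.46

# Hypothetical rows for a made-up data set (NOT Table 3).
rows = [(1, 10, 5), (1, 9, 4), (2, 8, 4), (1, 5, 2)]
e2 = expected_events(rows)
n = sum(d for d, _, _ in rows)  # total number of events in both groups
e1 = n - e2                     # expected events in the other group
print(round(e1, 2), round(e2, 2))
```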
The test statistic is compared with a $\chi^2$ distribution with 1 degree of freedom. It is a
simplified version of a statistic that is often calculated in statistical packages [2].
For the data in Table 1, the total number of expected deaths for treatment
group 2 is calculated as 2.92 and the total number of observed deaths is 10, giving
a total number of expected deaths for treatment group 1 of 10 - 2.92 = 7.08. The
value of the test statistic is therefore calculated as follows, with the observed
numbers of deaths $O_1$ and $O_2$ taken from Table 1:

$$\chi^2 = \frac{(O_1 - 7.08)^2}{7.08} + \frac{(O_2 - 2.92)^2}{2.92}$$

This gives a P value of 0.032, which indicates a significant difference between the
population survival curves.
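
The arithmetic behind this P value can be checked with the standard library alone. In the sketch below the expected counts 7.08 and 2.92 come from the text, but the observed counts of 4 and 6 deaths are an assumption: they are not stated in the surviving text and were chosen as the only whole-number split of the 10 deaths that reproduces the reported P value. The function names are hypothetical.

```python
import math


def log_rank_statistic(o1, e1, o2, e2):
    """Simplified log rank statistic: sum of (O - E)^2 / E over both groups."""
    return (o1 - e1) ** 2 / e1 + (o2 - e2) ** 2 / e2


def chi2_upper_tail_1df(x):
    """P(X > x) for a chi-squared variable with 1 degree of freedom.

    With 1 degree of freedom X is the square of a standard Normal variable,
    so the upper tail equals erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2))


# Observed counts (4, 6) are ASSUMED; expected counts are from the text.
stat = log_rank_statistic(4, 7.08, 6, 2.92)
p_value = chi2_upper_tail_1df(stat)
print(round(stat, 2), round(p_value, 3))
```

Under these assumed counts the statistic is about 4.59, which corresponds to the P value of 0.032 quoted above.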
An assumption for the log rank test is that of proportional hazards. This is
discussed below. Small departures from this assumption, however, do not
invalidate the test.
Cox's proportional hazards model (Cox regression)
The log rank test is used to test whether there is a difference between the survival
times of different groups but it does not allow other explanatory variables to be
taken into account.
Cox's proportional hazards model is analogous to a multiple regression model and
enables the difference between survival times of particular groups of patients to be
tested while allowing for other factors. In this model, the response (dependent)
variable is the 'hazard'. The hazard is the probability of dying (or experiencing the
event in question) given that patients have survived up to a given point in time, or
the risk for death at that moment.
In Cox's model no assumption is made about the probability distribution of the
hazard. However, it is assumed that if the risk for dying at a particular point in time
in one group is, say, twice that in the other group, then at any other time it will still
be twice that in the other group. In other words, the hazard ratio does not depend
on time.
The model can be written as:

$$h(t) = h_0(t) \times \exp(b_1 x_1 + b_2 x_2 + \dots + b_p x_p)$$

where h(t) is the hazard at time t; $x_1, x_2, \dots, x_p$ are the explanatory variables; and $h_0(t)$
is the baseline hazard when all the explanatory variables are zero. The coefficients
$b_1, b_2, \dots, b_p$ are estimated from the data using a statistical package.
Because hazard measures the instantaneous risk for death, it is difficult to illustrate
it from sample data. Instead, the cumulative hazard function H(t) can be examined.
This can be obtained from the cumulative survival function S(t) as follows:
H(t) = -ln S(t)
The estimated cumulative hazard function for the example data given in Table 1
is shown in Table 4.
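
The transformation H(t) = -ln S(t) is simple to apply to an estimated survival curve. The sketch below uses a hypothetical curve rather than the values in Table 4, and it skips any point where the estimated survival is 0, because the logarithm is undefined there. The name `cumulative_hazard` is an illustrative choice.

```python
import math


def cumulative_hazard(curve):
    """Convert (time, S(t)) pairs to (time, H(t)) pairs using H(t) = -ln S(t).

    Points with S(t) = 0 are dropped because -ln(0) is infinite.
    """
    return [(t, -math.log(s)) for t, s in curve if s > 0]


# Hypothetical survival estimates, for illustration only.
curve = [(1, 0.875), (2, 0.75), (3, 0.6), (5, 0.45), (10, 0.0)]
for t, h in cumulative_hazard(curve):
    print(t, round(h, 3))
```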

Table 4
Cumulative hazard functions (logarithmic scale) for the example data
The assumption that the hazard ratio stays constant over time can be
inspected by looking at a graph showing the logarithm of the estimated cumulative
hazard function. The assumption is equivalent to assuming that the difference
between the logarithms of the hazards for the two treatments does not change with
time, or equally that the difference between the logarithms of the cumulative
hazard functions is constant. Figure 3 is the graph for the example data.
The lines for the two treatments are roughly parallel, suggesting that the
proportional hazards assumption is reasonable in this case. A more formal test of
the assumption is possible (see Armitage and coworkers [2]). Note that, in this
graph, the time scale was also logarithmically transformed. This was to make the
comparison clearer between the two treatments, but it does not affect the vertical
positioning of the lines.
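
The reason parallel lines indicate proportional hazards can be shown in a short derivation. This sketch treats the hazard as a continuous function of time and writes c for the constant hazard ratio between treatments 1 and 2; the symbols $h_1$, $h_2$, $H_1$, $H_2$ and c are introduced here for illustration only.

```latex
% Assume the hazards are proportional with constant ratio c:
h_1(t) = c \, h_2(t) \quad \text{for all } t
% Integrating the hazard from 0 to t gives the cumulative hazard:
H_1(t) = \int_0^t h_1(u)\,du = c \int_0^t h_2(u)\,du = c \, H_2(t)
% Taking logarithms, the vertical gap between the two curves is constant:
\ln H_1(t) - \ln H_2(t) = \ln c
```

Because the gap ln c does not involve t, transforming the time axis changes the shape of both lines but not the distance between them.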

Figure 3
Cumulative hazard functions for the example data.
Cox's regression was applied to the example data using treatment and age as
explanatory variables. The output is shown in Table 5.

Table 5
Application of Cox's regression to the example data, using treatment and age as
explanatory variables
The P values indicate that the difference between treatments was bordering on
statistical significance, whereas there was strong evidence that age was associated
with length of survival. The coefficient for treatment, -1.887, is the logarithm of
the hazard ratio for a patient given treatment 1 compared with a patient given
treatment 2 of the same age. The exponential (antilog) of this value is 0.152,
indicating that a person receiving treatment 1 is 0.152 times as likely to die at any
time as a patient receiving treatment 2; that is, the risk associated with treatment 1
appears to be much lower. However, the confidence interval contains 1, indicating
that there may be no difference in risk associated with the two treatments.
Using the Kaplan-Meier (log rank) test, the P value for the difference between
treatments was 0.032, whereas using Cox's regression, and including age as an
explanatory variable, the corresponding P value was 0.052. This is not a substantial
change and still suggests that a difference between treatments is likely. In this case
age is clearly an important explanatory variable and should be included in the
analysis.
The exponential of the coefficient for age, 1.247, indicates that a patient 1 year
older than another patient, both being given the same treatment, has an increased
risk for dying, by a factor of 1.247. Note that, in this case, the confidence interval
does not contain 1, indicating the statistical significance of age.
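
The link between a Cox coefficient and its hazard ratio is a single exponentiation, as the sketch below illustrates with the two figures quoted in the text (the treatment coefficient of -1.887 and the age hazard ratio of 1.247). The helper `hazard_ratio_for_age_gap` is a hypothetical name, and the five-year comparison is an added illustration that assumes the model's log-linear effect of age holds across that gap.

```python
import math

# Treatment: the coefficient is the log hazard ratio (value from the text).
coef_treatment = -1.887
hr_treatment = math.exp(coef_treatment)
print(round(hr_treatment, 3))  # 0.152

# Age: the text gives the hazard ratio per year; the coefficient is its log.
hr_age_per_year = 1.247
coef_age = math.log(hr_age_per_year)


def hazard_ratio_for_age_gap(years):
    """Hazard ratio between two patients on the same treatment whose ages
    differ by `years`, assuming the effect of age is log-linear."""
    return math.exp(coef_age * years)


print(round(hazard_ratio_for_age_gap(1), 3))  # 1.247
print(round(hazard_ratio_for_age_gap(5), 2))
```

Because the coefficients add on the log scale, hazard ratios multiply: a five-year age gap corresponds to 1.247 raised to the fifth power, not five times 1.247.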
Further models for survival data, allowing for different assumptions, are discussed
by Kirkwood and Sterne [3].
An example from the literature
Dupont and coworkers [4] investigated the survival of patients with bronchiectasis
according to age and use of long-term oxygen therapy. The Kaplan-Meier curves
and results of the log rank tests shown in Fig. 4 indicate that there is a
significant difference between the survival curves in each case.

Figure 4
The Kaplan-Meier estimates of survival for (a) age > 65 years or ≤ 65 years,
and (b) long-term oxygen therapy (LTOT) before intensive care unit admission
(yes/no). The P values are for the log rank test.
The authors also applied Cox's proportional hazards analysis and obtained the
results given in Table 6. These results indicate that both age and long-term
oxygen therapy have a significant effect on survival. The estimated risk ratio for
age, for example, suggests that the risk for death for patients over the age of 65
years is 2.7 times greater than that for those below 65 years.

Table 6
Results of Cox's proportional hazards analysis for the patients with bronchiectasis
Assumptions and limitations
The log rank test and Cox's proportional hazards model assume that the hazard
ratio is constant over time. Care must be taken to check this assumption.
Conclusion
Survival analysis provides special techniques that are required to compare the risks
for death (or of some other event) associated with different treatments or groups,
where the risk changes over time. In measuring survival time, the start and end-
points must be clearly defined and the censored observations noted. Only the most
commonly used techniques are introduced in this review. Kaplan-Meier provides a
method for estimating the survival curve, the log rank test provides a statistical
comparison of two groups, and Cox's proportional hazards model allows additional
covariates to be included. Both of the latter two methods assume that the hazard
ratio comparing two groups is constant over time.
Competing interests
None declared.
References
1. Swinscow TDV, Campbell MJ. Statistics at Square One. London: BMJ
Books; 2002.
2. Armitage P, Berry G, Matthews JNS. Statistical Methods in Medical
Research. 4th ed. Oxford, UK: Blackwell Science; 2002.
3. Kirkwood BR, Sterne JAC. Essential Medical Statistics. 2nd ed. Oxford, UK:
Blackwell Science Ltd; 2003.
4. Dupont M, Gacouin A, Lena H, Lavoue S, Brinchault G, Delaval P, Thomas
R. Survival of patients with bronchiectasis after the first ICU stay for
respiratory failure. Chest. 2004;125:1815-1820. doi:
10.1378/chest.125.5.1815.
