
Survival Analysis Part2 Applied Clinical Data Analysis


KJA

Statistical Round
pISSN 2005-6419 • eISSN 2005-7563

Survival analysis: part II – applied clinical data analysis

Korean Journal of Anesthesiology

Junyong In¹ and Dong Kyu Lee²
Department of Anesthesiology and Pain Medicine, ¹Dongguk University Ilsan Hospital, Goyang, ²Guro Hospital, Korea University School of Medicine, Seoul, Korea

As a follow-up to a previous article, this review provides several in-depth concepts regarding survival analysis. In addition, code for specific survival analyses is listed to enhance the understanding of such analyses and to provide an applicable survival analysis method. The proportional hazard assumption is an important concept in survival analysis, and validating it is crucial. For this purpose, a graphical analysis method and a goodness-of-fit test are introduced along with detailed code and examples. When the proportional hazard assumption is violated, extended models of Cox regression are required. Simplified concepts of a stratified Cox proportional hazard model and time-dependent Cox regression are also described. The source code for an actual analysis using an available statistical package, together with a detailed interpretation of the results, can enable readers to carry out survival analysis with their own data. To enhance the statistical power of survival analysis, an evaluation of the basic assumptions and of the interaction between variables and time is important. In doing so, survival analysis can provide reliable scientific results with a high level of confidence.

Keywords: Cox regression; Extended Cox regression; Goodness of fit test; Log minus log plot; Proportional hazard assumption; Schoenfeld residual; Stratified Cox regression; Survival analysis; Time-dependent coefficient; Time-dependent Cox regression.

Introduction

The previous article ‘Survival analysis: Part I – analysis of time-to-event’ introduced the basic concepts of survival analysis [1]. To narrow the gap between clinical case data and statistical analysis, this article presents several extended forms of the Cox proportional hazards (CPH) model in series.

The most important aspect of the CPH model is the proportional hazard assumption during the observation period. The hazard of an event occurring during an observation does not always remain constant, and the hazard ratio cannot always be maintained at a constant level. This is the main obstacle for clinical data analysis using a CPH model.

The basic concepts required to understand and interpret the results of a survival analysis were covered in the previous article [1]. Part II, described herein, focuses on analytical methods that apply clinical data and on coping with problems that can occur during an analysis. These include methods for validating the proportional hazard assumption with clinical data and several extended Cox models that overcome a violated proportional hazard assumption. This article also includes the R code used for estimating several Cox models based on clinical data. For those familiar with statistical analysis, the R code can easily be extended to other Cox model estimations.1)

Corresponding author: Dong Kyu Lee, M.D., Ph.D.
Department of Anesthesiology and Pain Medicine, Guro Hospital, Korea University School of Medicine, 148 Gurodong-ro, Guro-gu, Seoul 08308, Korea
Tel: +82-2-2626-3237, Fax: +82-2-2626-1438
Email: [email protected]
ORCID: https://orcid.org/0000-0002-4068-2363

Received: May 2, 2019.
Revised: May 13, 2019.
Accepted: May 16, 2019.

Korean J Anesthesiol 2019 October 72(5): 441-457
https://doi.org/10.4097/kja.19183

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright ⓒ The Korean Society of Anesthesiologists, 2019
Online access in http://ekja.org
Survival analysis part II VOL. 72, NO. 5, October 2019

Proportional Hazard Assumption

Refer to the previous article [1] for a description of the diagnostic methods applied to a CPH model. Here, we consider only the proportional hazard assumption. A hazard is defined as the probability of an event occurring at a time point (t). The survival function of a CPH model is an exponential function, and the hazard ratio (λ) is constant during an observation; thus, the survival function is defined in the exponential form of the hazard ratio at a time point (equation 1) [1].

$$s(t) = \exp(-\lambda t) \qquad \text{(equation 1)}$$

s(t): survival function based on the CPH model
t: specific time point
λ: hazard ratio

To estimate the hazard ratio, which is included in the survival function, a hazard function (h) is required. It contains a specific explanatory variable (X), which indicates a specific treatment or exposure to a specific circumstance. At time point t, the hazard function of the control group is defined as the baseline hazard function (h0(t)), and the hazard function of the treatment group is defined as the baseline hazard function combined with a certain function of the explanatory variable (X). The hazard ratio is the ratio of the hazard function of the treatment group to that of the control group (equation 2) [2].

$$h_C(t) = h_0(t)$$
$$h_T(t) = h_0(t) \times \exp(\beta X)$$
$$\lambda = \frac{h_T(t)}{h_C(t)} = \frac{h_0(t) \times \exp(\beta X)}{h_0(t)} = \exp(\beta X) \qquad \text{(equation 2)}$$

hC(t): hazard function of the control group
hT(t): hazard function of the treatment group
λ: hazard ratio
h0(t): baseline hazard function at time t
t: specific time point
X: explanatory variable
β: coefficient for X

As shown in equation 2, the CPH model conducts the analysis under the assumption of a constant hazard ratio for the explanatory variable, which is not affected by time [3]. If the hazard ratio remains constant, the hazards of each group at any time point remain at a distance and never meet graphically during an observation. However, curves that never meet do not guarantee that the proportional hazard assumption is satisfied. In a clinical setting, one hazard could remain lower or higher than the others while their ratio is not constant, because the treatment effect may vary owing to various factors. Therefore, we need a statistical method to demonstrate the satisfaction or violation of the proportional hazard assumption.

Validation of Proportional Hazard Assumption

There are three representative methods for validating the proportional hazard assumption. One is a graphical approach, another uses a goodness of fit (GOF) test, and the last applies a time-dependent covariate [4,5].

Graphical analysis for validation of proportional hazard assumption

As mentioned in the previous article, a log minus log plot (LML plot) is one of the most frequently used methods for validating the proportional hazard assumption [1]. The log transformation is applied twice during the mathematical process for estimating the survival function. The first log transformation results in negative values because the probability values from the survival function lie between 0 and 1, and such values must be made positive before a second log transformation can be conducted. The name of the LML plot reflects this process. A survival function is the exponential form of a hazard ratio, and the hazard ratio is composed of hazard functions, which are an exponential form of an explanatory variable. As a result of the LML transformation, the survival function is converted into a linear functional form, and the difference in the explanatory variable creates a distance on the y-axis at each time point. Ultimately, survival functions that are log transformed twice become parallel during the observation period. Consequently, two curves on an LML plot also become parallel, which indicates that the hazard ratio remains constant during the observation period [4].

Validating the proportional hazard assumption with an LML plot carries a risk of subjective judgment because the method is based on a visual check. It is recommended that the interpretation be as conservative as possible, concluding a violation only under strong evidence, such as curves that cross each other or apparently meet. A continuous explanatory variable should be converted into a categorical variable of two or three levels to produce an LML plot. When doing so, the data thin out, and a different result can be reached depending on the criteria used for dividing the variable [5].

1) Sample data (Survival2_PONV.csv) and the R console output of the entire code are provided as supplemental information. Refer to the online help or R statistical textbooks for detailed explanations of the arguments. The included R code covers the process beginning with the survival analysis introduced in [1]. A detailed description of a violation of the proportional hazard assumption is provided in [14].
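The log minus log transformation described in this section can be written out explicitly. The following is only a sketch in standard CPH notation; it assumes a baseline survival function $S_0(t)$ for the control group (a symbol not used in the original text) and reuses the coefficient $\beta$ and explanatory variable $X$ of equation 2.

```latex
\begin{align*}
S(t \mid X) &= S_0(t)^{\exp(\beta X)} \\
-\log S(t \mid X) &= \exp(\beta X)\,\bigl(-\log S_0(t)\bigr) \\
\log\bigl(-\log S(t \mid X)\bigr) &= \beta X + \log\bigl(-\log S_0(t)\bigr)
\end{align*}
```

Under this form, the curves for X = 0 and X = 1 differ by the constant β at every time point, which is why parallel LML curves are read as support for a constant hazard ratio.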

442 Online access in http://ekja.org


KOREAN J ANESTHESIOL In and Lee

R codes for Kaplan–Meier survival analysis under the assumption of a proportional hazard

The sample data ‘Survival2_PONV.csv’ contains imaginary data from 104 patients regarding the first onset time of postoperative nausea and vomiting (PONV). All patients received one of two types of antiemetics (Drug A or B). The columns represent the patient number (No), type of antiemetic (Antiemetics), age (Age), body weight (Wt), amount of opioid used during anesthesia (Inopioid), the first PONV onset time (Time), and whether PONV occurred (PONV). To load such data into R software 3.5.2 (R Development Core Team, Vienna, Austria, 2018), the following code can be used. In this code, the location of the CSV file on the hard drive is ‘d:\’, and users should modify the path as needed. This code displays the first several lines of data (Table 1).

# Read data
PONV.raw <- read.csv("d:/Survival2_PONV.csv",
    header = TRUE, sep = ","
)
# Check imported data
head(PONV.raw)

Table 1. First Three Data Imported as PONV.raw
Result of command “head(PONV.raw)”

     No   Antiemetics   Age   Wt     Inopioid   Time   PONV
1    1    0             48    78.5   0          4      0
2    3    0             54    88.3   100        21     0
3    4    0             22    49.4   0          14     0
⋮    ⋮    ⋮             ⋮     ⋮      ⋮          ⋮      ⋮

From the left, each column contains each coded variable: The first column has a number automatically generated by R, variable ‘No’ is a coded number in the original data, ‘Antiemetics’ has a value of 0 for Drug A and 1 for Drug B, ‘Age’ and ‘Wt’ are the actual patients’ age and body weight, ‘Inopioid’ is the amount of opioid used during surgery, ‘Time’ indicates the onset time of postoperative nausea and vomiting (PONV), and ‘PONV’ is coded as 1 when the patient experienced PONV.

To conduct a survival analysis using R, two R packages are required, ‘survival’2) and ‘survminer’.3) When these packages are not supplied by default, they can be installed manually with the command ‘install.packages(“package name”)’. These packages are then loaded.

2) Terry M. survival: Survival Analysis. R package version 2.42-4. 2018. https://github.com/therneau/survival.
3) Alboukadel K, Marcin K, Przemyslaw B, Scheipl F. survminer: Drawing Survival Curves using 'ggplot2', R package version 0.4.3. 2018. http://www.sthda.com/english/rpkgs/survminer/.

#Load Package: survival, survminer
library(survival)
library(survminer)

Then, a Kaplan–Meier survival analysis is applied. The following code covers a Kaplan–Meier analysis, a comparison of PONV between groups using a log-rank test, and the LML plot introduced in part I of this article [1]. Small modifications of this code can enable a survival analysis with the user’s own data.

### Kaplan-Meier Estimation (KME)

#Add survival object
PONV.raw$Survobj <- with(PONV.raw,
    Surv(Time, PONV == 1)
)
head(PONV.raw)

## Single KME. The log-log confidence interval is preferred.
km.one <- survfit(Survobj ~ 1, data = PONV.raw,
    conf.type = "log-log")
# Result of KME
km.one
# Survival table
summary(km.one)
# Survival curve
ggsurvplot(km.one, data = PONV.raw,
    conf.int = TRUE,
    palette = "grey",
    surv.median.line = "hv",
    break.time.by = 4,
    censor = TRUE,
    legend = "none",
    xlab = "Time (hour)",
    risk.table = TRUE,
    tables.height = 0.2,
    tables.theme = theme_cleantable(),
    risk.table.y.text = FALSE
)

R applies a Kaplan–Meier analysis using the new variable ‘Survobj’. The results of the Kaplan–Meier analysis and a survival table are presented in Table 2. Out of 104 patients, 63 patients suffered from PONV, and the median onset time was 10 h. A graphical presentation is also possible (Fig. 1). Here, ‘ggsurvplot’ produces survival curves and accepts many arguments; fine-tuning these argument options helps to draw intuitive graphs.

The next code estimates the survival curves according to the two antiemetics and conducts a log-rank test.

### KME by Antiemetics
km.antiemetics <- survfit(Survobj ~ Antiemetics,
    data = PONV.raw,
    conf.type = "log-log"
)
# Result of KME by Antiemetics
km.antiemetics


Table 2. Results of Kaplan–Meier Estimation and Survival Table

Call: survfit(formula = Survobj ~ 1, data = PONV.raw, conf.type = "log-log")

n     Events   Median   0.95LCL   0.95UCL
104   63       10       7         16

Call: survfit(formula = Survobj ~ 1, data = PONV.raw, conf.type = "log-log")

Time   n.risk   n.event   Survival   std.err   Lower 95% CI   Upper 95% CI
1      104      8         0.923      0.0261    0.852          0.961
2      96       7         0.856      0.0345    0.772          0.910
3      89       3         0.827      0.0371    0.739          0.887
⋮      ⋮        ⋮         ⋮          ⋮         ⋮              ⋮

n: total number of cases, Events: number of patients who experienced PONV, Median: median survival time, 0.95LCL: lower limit of 95% confidence interval, 0.95UCL: upper limit of 95% confidence interval, n.risk: number at risk, n.event: number of events, Survival: survival rate, std.err: standard error of survival rate, Lower/upper 95% CI: lower/upper limits of 95% confidence interval.
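The ‘Survival’ and ‘std.err’ columns of Table 2 can be traced back to the ‘n.risk’ and ‘n.event’ columns. The following is a minimal sketch in base R, assuming only the three rows shown in Table 2; the variable names are illustrative, and the Greenwood formula is used here as the usual standard error of a Kaplan–Meier estimate rather than as a statement about the internals of ‘survfit’.

```r
# Rows copied from the first three lines of the survival table in Table 2
n.risk  <- c(104, 96, 89)   # patients still at risk just before each time
n.event <- c(8, 7, 3)       # PONV events observed at each time

# Kaplan-Meier estimate: cumulative product of the conditional survival
surv <- cumprod(1 - n.event / n.risk)

# Greenwood standard error of the survival estimate
std.err <- surv * sqrt(cumsum(n.event / (n.risk * (n.risk - n.event))))

print(round(surv, 3))
print(round(std.err, 4))
```

With these inputs, the rounded values agree with the ‘Survival’ and ‘std.err’ columns reported for times 1 to 3 h.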

# Survival table of KME by Antiemetics
summary(km.antiemetics)

# KM estimation, log-rank test
survdiff(formula = Surv(Time, PONV == 1) ~ Antiemetics,
    data = PONV.raw
)

# Survival curve of KME by Antiemetics
ggsurvplot(km.antiemetics, data = PONV.raw,
    fun = "pct", pval = TRUE,
    conf.int = TRUE, surv.median.line = "hv",
    linetype = "strata", palette = "grey",
    xlab = "Time (hour)",
    legend.title = "Antiemetics",
    legend.labs = c("Drug A", "Drug B"),
    legend = c(.1, .2), break.time.by = 4,
    risk.table = TRUE,
    tables.height = 0.2,
    tables.theme = theme_cleantable(),
    risk.table.y.text.col = TRUE,
    risk.table.y.text = TRUE
)

Table 3 and Fig. 2 show the results of this code. Antiemetics are coded as 0 for Drug A and 1 for Drug B; thus, ‘Antiemetics = 0’ and ‘Antiemetics = 1’ represent Drugs A and B, respectively, in the Table and Figure.

According to the log-rank test, the survival functions of the two antiemetics are statistically different (P = 0.009), and the median PONV-free time is 13 and 6 h for Drugs A and B, respectively.

The log-rank test is also based on the proportional hazard assumption, and an LML plot can be used to validate this assumption. The code for this process is as follows, and the output graph is shown in Fig. 3.4,5)

# LML plot
plot(survfit(Surv(Time, PONV == 1) ~ Antiemetics,
    data = PONV.raw), fun = "cloglog")

4) There are several ways to draw an LML plot in R; ‘plot.survfit’ with the argument ‘fun = “cloglog”’ provides an LML plot with a log-scaled x-axis. Most statistical references describe a log-scaled x-axis LML plot, whereas others describe a standard linear-scaled x-axis LML plot. A non-log-scaled LML plot can be created with the following R code.

# Non-log scaled LML plot
ponvsurv <- Surv(PONV.raw$Time, PONV.raw$PONV)
NLML.fun <- function(p) { return(log(-log(p))) }
plot(survfit(ponvsurv ~ PONV.raw$Antiemetics), fun = NLML.fun)

5) R package ‘survival’ version 2.44-1 (updated in March 2019) has an error with an x value of 1 when the x-axis is log scaled. Versions before 2.44-1 work properly.

Fig. 1. Kaplan–Meier curve of overall survival status with sample data. A 95% confidence interval (estimated from a log hazard) is presented in the shadowed area. The dashed line indicates the median survival time.
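What the LML plot displays can also be checked numerically. The sketch below is base R only and assumes the survival estimates at 1 to 4 h printed for each drug in Table 3; the helper name ‘lml’ and the data frame are illustrative and not part of the original code.

```r
# Survival estimates at 1-4 h, copied from the survival tables in Table 3
time   <- 1:4
surv.a <- c(0.980, 0.941, 0.922, 0.882)  # Antiemetics = 0 (Drug A)
surv.b <- c(0.868, 0.774, 0.736, 0.623)  # Antiemetics = 1 (Drug B)

# Log minus log (complementary log-log) transformation of a survival value
lml <- function(s) log(-log(s))

lml.table <- data.frame(
  time     = time,
  lml.a    = lml(surv.a),
  lml.b    = lml(surv.b),
  distance = lml(surv.b) - lml(surv.a)  # vertical gap between the two curves
)
print(round(lml.table, 3))
```

The ‘distance’ column is the vertical gap that the eye judges on the plot; a gap that stays roughly constant over the whole observation period is what parallel curves look like in numbers, and only four early time points are used here, so this is an illustration rather than a validation.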


Table 3. Results of Kaplan–Meier Estimation between Antiemetics, Survival Tables of Two Antiemetics, and Comparison Results of Log-rank Test

Call: survfit(formula = Survobj ~ Antiemetics, data = PONV.raw, conf.type = "log-log")

                  n    Events   Median   0.95LCL   0.95UCL
Antiemetics = 0   51   25       13       9         NA
Antiemetics = 1   53   38       6        4         10

Call: survfit(formula = Survobj ~ Antiemetics, data = PONV.raw, conf.type = "log-log")

Antiemetics = 0

Time   n.risk   n.event   Survival   std.err   Lower 95% CI   Upper 95% CI
1      51       1         0.980      0.0194    0.869          0.997
2      50       2         0.941      0.0329    0.829          0.981
3      48       1         0.922      0.0376    0.804          0.970
4      47       2         0.882      0.0451    0.757          0.945
⋮      ⋮        ⋮         ⋮          ⋮         ⋮              ⋮

Antiemetics = 1

Time   n.risk   n.event   Survival   std.err   Lower 95% CI   Upper 95% CI
1      53       7         0.868      0.0465    0.743          0.935
2      46       5         0.774      0.0575    0.636          0.865
3      41       2         0.736      0.0606    0.595          0.834
4      39       6         0.623      0.0666    0.478          0.738
⋮      ⋮        ⋮         ⋮          ⋮         ⋮              ⋮

Call: survdiff(formula = Surv(Time, PONV == 1) ~ Antiemetics, data = PONV.raw)

                  N    Observed   Expected
Antiemetics = 0   51   25         34.9
Antiemetics = 1   53   38         28.1

Chisq = 6.8 on 1 degrees of freedom, P = 0.009

Antiemetics = 0 and 1 indicate Drugs A and B, respectively. Because the variable ‘Antiemetics’ is coded as 0 for drug A and 1 for drug B, the R output only describes these as ‘Antiemetics = 0 and 1’. n: total number of cases, Events: number of patients who experienced postoperative nausea and vomiting, Median: median survival time, 0.95LCL: lower limit of 95% confidence interval, 0.95UCL: upper limit of 95% confidence interval, n.risk: number at risk, n.event: number of events, Survival: survival rate, std.err: standard error of survival rate, Lower/upper 95% CI: lower/upper limits of 95% confidence interval, Chisq: chi-squared statistics.
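The last block of Table 3 can be related to the reported P value with a few lines of base R. This is a sketch under stated assumptions: the observed and expected counts and the statistic are copied from Table 3, the variable names are illustrative, and the hand formula sum((O − E)²/E) is a textbook approximation, whereas, to our understanding, ‘survdiff’ derives its statistic from the variance of O − E, so the two need not coincide.

```r
# Observed and expected event counts copied from the survdiff output in Table 3
observed <- c(A = 25, B = 38)
expected <- c(A = 34.9, B = 28.1)

# P value of the reported statistic (Chisq = 6.8 on 1 degree of freedom)
p.reported <- pchisq(6.8, df = 1, lower.tail = FALSE)

# Hand approximation of the log-rank statistic: sum of (O - E)^2 / E
chisq.approx <- sum((observed - expected)^2 / expected)

print(round(p.reported, 3))
print(round(chisq.approx, 2))
```

The approximation is useful mainly as a plausibility check of the direction and rough size of the difference between the two groups.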

The goodness of fit test (GOF test)

The second method for validating the proportional hazard assumption is a GOF test between the observed and estimated survival function values. This provides a P value and hence is a more objective method than a visual check [5].

A Schoenfeld residual test is a representative GOF test for validating the proportional hazard assumption [6–8]. A Schoenfeld residual is the difference between the explanatory variable observed in the real world and the value estimated using a CPH model, for patients who experience an event. Thus, Schoenfeld residuals are calculated for all explanatory variables included in the model. If the CPH model includes two explanatory variables, two Schoenfeld residuals are produced for one patient at a time.6)

6) A Schoenfeld residual is defined only for patients who experienced the event. It is the difference between the observed value of the explanatory variable at a specific time and the expected value of the explanatory variable (covariate) at that time, which is an average over the risk set at that time point weighted by the likelihood of the event.

Because the hazard ratio is constant during the observation period (the proportional hazard assumption), Schoenfeld residuals are independent of time. A violation of the proportional hazard assumption may be suspected when the Schoenfeld residual plot shows a relationship with time. In addition, a Schoenfeld residual test is possible under the null hypothesis that ‘there is no correlation between the Schoenfeld residuals and ranked event time’.7)

7) Some statistical software provides a method using scaled Schoenfeld residuals. Under specific circumstances, these two results are different, although they mostly produce similar results. Please refer to the following: Grambsch, P.M. and Therneau, T.M. 1994. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 81: 515-526.

Schoenfeld residual tests cannot be used to validate the proportional hazard assumption in a Kaplan–Meier estimation because they are based on values estimated using the CPH model. A Schoenfeld residual test also has a weakness in terms of the statistical hypothesis testing process. Null hypothesis significance testing is designed to reject a null hypothesis of ‘no difference’: when the null hypothesis is rejected at a given significance level, the alternative hypothesis, that a difference exists, is accepted with an error probability no greater than that level. A Schoenfeld residual test determines whether the proportional hazard assumption is violated based on the probability of the correlation statistics. A probability higher than the significance level is read as satisfaction of the proportional hazard assumption; that is, the conclusion rests on failing to reject the null hypothesis, and failing to reject a hypothesis is not sufficient evidence that it is true. Furthermore, the P value depends on the sample size: a large sample will produce high significance with a minimal violation of the assumption, whereas an apparent violation may be insignificant with a small sample. Although a Schoenfeld residual test is more objective than an LML plot, using the two methods together is recommended owing to the problems listed above [4,5,7].

Fig. 2. Kaplan–Meier curves of two antiemetics with sample data. The P value is estimated based on a log-rank test. A 95% confidence interval (estimated from a log hazard) is presented in the shadowed area. The dashed lines indicate the median survival times of groups taking Drugs A and B. Drug A is coded as ‘Antiemetics = 0’ and Drug B is coded as ‘Antiemetics = 1’ in the original data.

Fig. 3. Log minus log plot of Kaplan–Meier estimation with log-rank test between two antiemetics. The two curves do not meet during the observation period, indicating the satisfaction of the proportional hazard assumption. The log-time scale is shown in the x-axis.

R codes for the Cox proportional hazard regression model and GOF test

To estimate a CPH model, the libraries used in the Kaplan–Meier analysis are also required. After importing the data and loading the required libraries, the CPH model can be estimated according to the antiemetics using the following code.

# Univariate Cox proportional hazard model
# for a single covariate
cph.antiemetics <- coxph(Surv(Time, PONV == 1) ~ Antiemetics,
    data = PONV.raw
)
summary(cph.antiemetics)

Table 4 summarizes the results. The hazard of PONV is 1.947-fold higher (95% CI, 1.174–3.229, P = 0.010) in the Drug B group than in the Drug A group.

Survival2_PONV.csv has four covariates. A multivariate analysis is possible using these covariates with a CPH model. Multivariate analysis can estimate the most compatible model, including the significant covariates, through regression diagnostic statistics. Although variable selection methods remain controversial [9], bidirectional stepwise selection is applied in this example.

# Multivariate Cox regression
cph.full <- coxph(Surv(Time, PONV == 1)
    ~ Antiemetics + Age + Wt + Inopioid,
    data = PONV.raw
)
summary(cph.full)

# Variables selection
cph.selection <- step(
    coxph(Surv(Time, PONV == 1)
        ~ Antiemetics + Age + Wt + Inopioid,
        data = PONV.raw),
    direction = "both"
)
summary(cph.selection)

# Final model selected
cph.selected <- coxph(Surv(Time, PONV == 1)
    ~ Antiemetics + Inopioid,
    data = PONV.raw
)
summary(cph.selected)
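The Schoenfeld residual described in footnote 6 (observed covariate value minus a weighted average over the risk set) can be illustrated without any package. The sketch below uses hypothetical toy data, not Survival2_PONV.csv; the function name ‘schoenfeld.by.hand’ and the value of ‘beta’ are assumptions for illustration, a single covariate is used, and tied event times are not handled.

```r
# Hypothetical toy data (not taken from Survival2_PONV.csv)
time  <- c(2, 3, 5, 7, 8, 11)   # follow-up time
event <- c(1, 1, 0, 1, 1, 0)    # 1 = event, 0 = censored
x     <- c(1, 0, 1, 1, 0, 0)    # one binary covariate

# Schoenfeld residual for each event: observed covariate value minus the
# risk-set average of the covariate, weighted by exp(beta * x)
schoenfeld.by.hand <- function(time, event, x, beta) {
  event.rows <- which(event == 1)
  sapply(event.rows, function(i) {
    at.risk <- time >= time[i]        # risk set at this event time
    w <- exp(beta * x[at.risk])       # relative hazard of each member
    x[i] - sum(w * x[at.risk]) / sum(w)
  })
}

# beta is an assumed value here; in practice it is the fitted coefficient
res <- schoenfeld.by.hand(time, event, x, beta = 0.5)
print(round(res, 3))
```

One residual is returned per event, which is why censored patients do not appear in a Schoenfeld residual plot; plotting such residuals against event time is the visual counterpart of the test described above.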


Table 4. Results of the Cox Proportional Hazard Model Estimation Using Antiemetics with Sample Data

Call: coxph(formula = Surv(Time, PONV == 1) ~ Antiemetics, data = PONV.raw)

n = 104, number of events = 63

              coef     exp(coef)   se(coef)   z       Pr(>|z|)
Antiemetics   0.6664   1.9471      0.2581     2.582   0.00983 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

              exp(coef)   exp(-coef)   Lower .95   Upper .95
Antiemetics   1.947       0.5136       1.174       3.229

Concordance = 0.615 (se = 0.032)
Rsquare = 0.064 (max possible = 0.993)
Likelihood ratio test = 6.85 on 1 df, P = 0.009
Wald test = 6.67 on 1 df, P = 0.01
Score (logrank) test = 6.91 on 1 df, P = 0.009

‘Antiemetics’ is coded as 0 for Drug A or 1 for Drug B in the original data. coef: the value of coefficient, exp(coef): exponential value of coefficient, se(coef): standard error of coefficient, z: z-statistics, Pr(>|z|): P value of given z-statistics, Signif. codes: codes for significance marking.
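The columns of Table 4 are linked by simple arithmetic, which can help when reading Cox regression output. The following base R sketch assumes only the ‘coef’ and ‘se(coef)’ values printed in Table 4; the variable names are illustrative, and the interval is the usual Wald interval on the log hazard scale.

```r
# Coefficient and standard error copied from Table 4
coef.antiemetics <- 0.6664
se.antiemetics   <- 0.2581

hazard.ratio <- exp(coef.antiemetics)                # exp(coef)
z.value      <- coef.antiemetics / se.antiemetics    # Wald z statistic
p.value      <- 2 * pnorm(-abs(z.value))             # two-sided P value
ci.95        <- exp(coef.antiemetics +
                    c(-1, 1) * qnorm(0.975) * se.antiemetics)

print(round(c(HR = hazard.ratio, lower = ci.95[1], upper = ci.95[2]), 3))
print(round(c(z = z.value, p = p.value), 4))
```

Because the inputs are already rounded to four decimals, the results can differ from the table in the last digit.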

Table 5. Multivariate Cox Proportional Hazard Model with Sample Data

Call: coxph(formula = Surv(Time, PONV == 1) ~ Antiemetics + Inopioid, data = PONV.raw)

n = 104, number of events = 63

              coef       exp(coef)   se(coef)   z       Pr(>|z|)
Antiemetics   0.703650   2.021116    0.258971   2.717   0.00659 **
Inopioid      0.012740   1.012821    0.002417   5.271   1.35e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

              exp(coef)   exp(-coef)   Lower .95   Upper .95
Antiemetics   2.021       0.4948       1.217       3.358
Inopioid      1.013       0.9873       1.008       1.018

Concordance = 0.694 (se = 0.03)
Rsquare = 0.284 (max possible = 0.993)
Likelihood ratio test = 34.69 on 2 df, P = 3e-08
Wald test = 34.09 on 2 df, P = 4e-08
Score (logrank) test = 38.29 on 2 df, P = 5e-09

‘Antiemetics’ is coded as 0 for Drug A or 1 for Drug B in the original data. ‘Inopioid’ is the amount of opioid used during surgery. coef: the value of coefficient, exp(coef): exponential value of coefficient, se(coef): standard error of coefficient, z: z-statistics, Pr(>|z|): P value of given z-statistics, Signif. codes: codes for significance marking.
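For a continuous covariate such as ‘Inopioid’, the hazard ratio in Table 5 refers to a one-unit increase, so it looks small even when the effect over a realistic dose range is large. The sketch below assumes the coefficient printed in Table 5; the helper ‘hr.per.units’ is hypothetical, and the dose steps of 50 and 100 units are illustrative (100 is one of the ‘Inopioid’ values shown in Table 1).

```r
# Coefficient for 'Inopioid' copied from Table 5 (log hazard ratio per one unit)
coef.inopioid <- 0.012740

# Hazard ratio for an increase of k units is exp(k * coefficient)
hr.per.units <- function(k) exp(k * coef.inopioid)

print(round(hr.per.units(c(1, 50, 100)), 3))
```

This rescaling assumes that the log hazard is linear in ‘Inopioid’ and that the proportional hazard assumption holds for this covariate.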

After examining the full model including all covariates for the final model (Fig. 4).8)
(summary(cph.full)), the most compatible model is confirmed
through a covariate selection (summary(cph.selection)), and a # Survival curves of the Cox PH model
# grouped by Antiemetics
clean result is finally obtained (summary(cph.selected)). Table 5 new.cph.antiemetics <-with (PONV.raw
shows the final model. According to the result, the PONV incre-
ment is estimated as 2.021-fold (95% CI, 1.217–3.358, P = 0.007)
based on the antiemetics, and 1.013-fold (95% CI, 1.008–1.018, 8)
The command ‘ggadjustedcurves’ included in the ‘survminer’ library
P < 0.001) based on intraoperative opioid usage. easily produces the survival curves of the CPH model. Unfortunately, this
command still has minor functional errors such as in printing the 95% CI
The next code draws survival curves against the antiemetics
or labelling, and a somewhat complex ‘ggsurvplot’ is used in this example.

Online access in http://ekja.org 447


Survival analysis part II VOL. 72, NO. 5, October 2019

Fig. 5. LML plot of Cox proportional hazards model based on antie­


metics with sample data.
Fig. 4. Survival curves of antiemetics estimated using the Cox pro­
portional hazards regression model. a solid black line indicates Drug A
(Antiemetics = 0) and a solid grey line indicates Drug B (Antiemetic = Table 6. Results of the Schoenfeld Residual Test
1). Dashed lines present a 95% CI range. Drug A is coded as ‘Antiemetics =
0’ and drug B is coded as ‘Antiemetics = 1’ in the original data. Results of ‘print(sf.residual)’

rho chisq P
,data.frame(Antiemetics = c(0, 1),
Inopioid = c(0,0) Antiemetics −0.275 4.5 0.0340
)) Inopioid 0.307 5.36 0.0206
new.cph.antiemetics.fit <- survfit(cph.selected GLOBAL NA 10.26 0.0059
, newdata = new.cph.antiemetics ‘Antiemetics’ is coded as 0 for Drug A or 1 for Drug B in the original
) data. ‘Inopioid’ is the amount of opioid used during surgery. rho: Spear­
ggsurvplot(new.cph.antiemetics.fit man’s ρ statistics, chisq: chi-squared statistics, P: P value.
, data = PONV.raw
, conf.int = TRUE
, conf.int.style = "step" “Inopioid” is a continuous type of variable, and an LML plot
, censor = FALSE using this variable is impossible to achieve without a categorical
, palette = "grey"
, break.time.by = 4 transformation.
, linetype = "solid" A Schoenfeld residual test is shown below. Here, ‘cox.zph’ in-
, axes.offset = FALSE cluded in the ‘survminer’ library enables this test. The results are
, xlab = "Time (hour)"
listed in Table 6, and graphical output is shown in Fig. 6.
, legend = c(0.1, 0.15)
, legend.labs = c("Drug A", "Drug B")
, legend.title = "Antiemetics") # Schoenfeld residuals test
sf.residual <- cox.zph(cph.selected)
print(sf.residual) # display the results
The R code for an LML plot is described above. For cate- par (mfrow = c(2,1))
gorical variables, an LML plot provides an easy to interpret and plot(sf.residual[1]) # plot curves
intuitive validation method for a proportional hazard assump- abline (h = coef(cph.selected)[1]
, lty = "dotted", lwd = 1)
tion.9) Validation of the proportional hazard assumption of the plot(sf.residual[2])
antiemetics, which is a categorical variable, is possible using an abline (h = coef(cph.selected)[2]
LML plot. (Fig. 5) , lty = "dotted", lwd = 1)

#LML for CoxPH The P value in Table 6 indicates the significance probability
plot (survfit(coxph(Surv(Time, PONV == 1)
of the Schoenfeld residual test for the antiemetics and intraop-
~ strata(Antiemetics)
, data = PONV.raw erative opioid used, and such values indicate a violation of the
) proportional hazard assumption. A positive increment of the
), Schoenfeld residual curve for ‘Inopioid’ is shown in Fig. 6. The
fun = "cloglog"
) curve for the antiemetics gradually changes toward a negative

The proportional hazard assumption of the antiemetics is not 9)


In R, categorical variables should be treated as a stratum for comparison
violated according to the graphs shown in Fig. 5. The covariate using an LML plot of the CPH model.

448 Online access in http://ekja.org


KOREAN J ANESTHESIOL In and Lee

Fig. 6. Schoenfeld residual plot with


‘Antiemetics’ and ‘Inopioid’. Dotted hori­­
zon­tal lines indicate the estimated coef­
ficient values of these covariates.

value over time, but not continuously. In this way, a Schoenfeld residual test provides more objective results than an LML plot, which is strictly conservative.

Adding a time-dependent covariate

To validate a proportional hazard assumption in a CPH model, a time-dependent covariate is intentionally added into the estimated model. This covariate can be made using a time-independent variable and time, or a function of time. For example, the process compares two models, namely, a CPH model that assumes the proportional hazard assumption has not been violated, and another model in which a combined covariate of the explanatory variable and time (or a function of time) is incorporated into the estimated CPH model. A likelihood ratio test or Wald statistic is used for the comparison. This type of method has certain advantages, including a simultaneous comparison with multiple covariates and various time functions; note that the results may change depending on the covariates and types of functions selected [5,10,11].10)

10) Because various application methods and their variations are available, they are not discussed in detail herein.

Cox Proportional Hazard Regression Models with Time-dependent Covariates

Covariates violating the proportional hazard assumption in a CPH model should be adequately adjusted. This section introduces a stratification and time-dependent Cox regression to deal with covariates violating the proportional hazard assumption.

Stratified Cox proportional hazard model

To fit the CPH model with variables violating the proportional hazard assumption, one method is to apply a stratified CPH model. This method makes one integrated result from the results of each stratum containing a categorical variable classified based on a certain criterion. Unlike the Mantel–Haenszel method, which is based on the sample size of each stratum, stratification in the CPH model sets a different baseline hazard corresponding to each stratum, and a statistical estimation is then applied to achieve common coefficients for the remaining explanatory variables except for the stratified variables.11) This provides a hazard ratio of the controlled effects of variables violating the proportional hazard assumption [12].

A stratified CPH model can be applied to control the variables violating a constant hazard assumption, as well as to control the confounding factors that influence the results with little or no clinical significance. Stratification always requires categorical variables, and conversion into categorical variables is required for continuous variables. Under this situation, care should be taken because the sample size of each stratum is reduced (data thinned out) and the information held by the continuous variable is simplified. Therefore, conversion into a categorical variable should use as small a number of strata as possible, set ranges that have a clinical or scientific meaning, and maintain a balance among the strata [12].

11) This is a non-interaction stratified CPH model. Several survival functions are estimated through stratification, and if the explanatory variables have interactions with each other, the coefficients at each stratum may be different. In this case, fitting an interaction model between the explanatory variables and applying a likelihood ratio test provide clues to judge whether there is an interaction between explanatory variables. That is, if two or more variables are included in the model, it is necessary to check whether an interaction between them exists.
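Written out, the stratified model described above takes the following standard form. This is a sketch in generic notation: the symbols g, G, h_{0g}, X and β are introduced here for illustration and do not appear in the article.

```latex
h_g(t \mid X) = h_{0g}(t)\,\exp\left(\beta_1 X_1 + \cdots + \beta_p X_p\right),
\qquad g = 1, \ldots, G
```

Each stratum g has its own unspecified baseline hazard h_{0g}(t), whereas the coefficients β_1, ..., β_p are common to all strata. This is why no coefficient or hazard ratio is reported for the stratifying variable itself.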



Survival analysis part II VOL. 72, NO. 5, October 2019

R codes for stratified Cox proportional hazard model

In the previous CPH modeling, the variable ‘Inopioid’ violated the constant hazard assumption based on the Schoenfeld residual test (Fig. 6). Here, ‘Inopioid’ is a continuous variable that records the dose of intraoperatively used opioid. To apply stratified CPH modeling, continuous variables should be converted into categorical variables. For convenience, the following code converts ‘Inopioid’ into a categorical variable coded as 0 when an opioid was not used and as 1 when it was used.

##### Stratified Cox regression
### Add categorical variables from Inopioid
PONV.raw <- transform(PONV.raw,
                      Inopioid_c = ifelse(
                        Inopioid == 0, 0, 1))
head (PONV.raw)

According to this, the categorical variable ‘Inopioid_c’ is recorded as 0 or 1 and is newly added to the dataset (Table 7).

Table 7. PONV.raw with a New Categorical Variable ‘Inopioid_c’ Added from the Variable ‘Inopioid’
Results of ‘head (PONV.raw)’

     No   Antiemetics   Age   Wt     Inopioid   Time   PONV   Survobj   Inopioid_c
1    1    0             48    78.5   0          4      0      4+        0
2    3    0             54    88.3   100        21     0      21+       1
3    4    0             22    49.4   0          14     0      14+       0
︙   ︙   ︙            ︙    ︙     ︙         ︙     ︙     ︙        ︙

‘Survobj’ is a variable created by an R command during the process of a Kaplan–Meier estimate, and indicates a survival object. ‘Inopioid_c’ is a newly created categorical variable based on ‘Inopioid’, which is coded as 0 for an opioid not used or 1 for an opioid used during operation. From left, each column contains each coded variable: The first column has a number automatically generated by R, the variable ‘No’ is a coded number in the original data, ‘Antiemetics’ has a value of 0 for Drug A and 1 for Drug B, ‘Age’ and ‘Wt’ are the actual patients’ age and body weight, ‘Inopioid’ is the amount of opioid used during surgery, ‘Time’ indicates the onset time of postoperative nausea and vomiting, and ‘PONV’ is coded as 1 when the patient experienced postoperative nausea and vomiting.

Next, the code for a stratified CPH model is as follows:

### Stratified Cox proportional hazard modeling
# coxph(), survfit() and cox.zph() are from the 'survival' package;
# ggsurvplot() is from the 'survminer' package
cph.strata <- coxph (Surv(Time, PONV == 1)
                     ~ Antiemetics + strata(Inopioid_c)
                     , data = PONV.raw)

summary (cph.strata)

# Survival curves of the two strata
ggsurvplot(survfit(cph.strata)
           , data = PONV.raw
           , risk.table = TRUE
           , palette = c("black","black")
           , linetype = c("solid","dashed")
           )

# LML plot
par( mfrow = c(1,1))
plot (survfit(cph.strata)
      , fun = "cloglog"
      , main = "Antiemetics"
      )

# Schoenfeld residual test and plot
sf.residual.strata <- cox.zph(cph.strata)

print(sf.residual.strata)
plot(sf.residual.strata)
abline (h = coef(cph.strata)
        , lty = "dotted"
        , lwd = 1)

Table 8. Results of Stratified Cox Proportional Hazard Model. Stratification with ‘Inopioid_c’

Call: coxph(formula = Surv(Time, PONV == 1) ~ Antiemetics + strata(Inopioid_c), data = PONV.raw)

n = 104, number of events = 63

              coef     exp(coef)   se(coef)   z       Pr(>|z|)
Antiemetics   0.7282   2.0714      0.2625     2.774   0.00553 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

              exp(coef)   exp(-coef)   Lower .95   Upper .95
Antiemetics   2.071       0.4828       1.238       3.465

Concordance = 0.634 (se = 0.034)
Rsquare = 0.074 (max possible = 0.979)
Likelihood ratio test = 7.96 on 1 df, P = 0.005
Wald test = 7.7 on 1 df, P = 0.006
Score (logrank) test = 8.03 on 1 df, P = 0.005

‘Antiemetics’ is coded as 0 for Drug A or 1 for Drug B in the original data. ‘Inopioid_c’ is a newly created categorical variable based on ‘Inopioid’, which is coded as 0 for an opioid not used or 1 for an opioid used during operation. coef: the value of coefficient, exp(coef): exponential value of coefficient, se(coef): standard error of coefficient, z: z-statistics, Pr(>|z|): P value of given z-statistics, Signif. codes: codes for significance marking.

Fig. 7. Examples of the stratified Cox proportional hazard model and corresponding LML plot. (A) Survival curves of estimated stratified Cox proportional hazard model. Stratification is achieved using the categorical variable ‘Inopioid_c’. (B) Log-minus-log plot for evaluation of proportional hazard assumption against two antiemetics. Note that parallelism is not assured below 2 h, whereas the overall curves are roughly parallel without crossing.

The stratified CPH code above outputs a stratified CPH model by controlling ‘Inopioid_c’ (Table 8). The command ‘ggsurvplot’ provides survival curves of two strata, and the last ‘plot’ command prints the LML plot (Fig. 7). The Schoenfeld residual test using a ‘cox.zph’ command reveals that ‘Antiemetics’ violates the proportional hazard assumption (rho = −0.265, chisq = 4.26, P = 0.039, shown in Fig. 8). It is possible to obtain an adequate CPH model by stratifying both ‘Inopioid’ and ‘Antiemetics’, although the interpretations may be complex because it is difficult to integrate the comparison results among all strata.

Fig. 8. Schoenfeld residual test for the stratified Cox proportional hazard model. For the covariate ‘Antiemetics’, the probability was estimated as 0.039, and a violation of the proportional hazard assumption was strongly suggested under the controlled covariate ‘Inopioid’ (the dotted horizontal line shows the estimated coefficient of ‘Antiemetics’).

Time-dependent Cox regression

Most clinical situations change over time, and the variables affected by a specific treatment also change even when the treatment remains constant during the observation period [13]. For example, consider an analgesic having a toxic effect on the hepatobiliary function for patients with chronic pain. A periodic liver function test will be crucial, and all laboratory results will vary for every follow-up time. The administration dose may also vary according to the laboratory results or analgesic effects. Moreover, the laboratory results may not be valid after the patients are censored or after an event occurs. These variables are common in clinical practice, and the existence of time-dependent variables should be considered and checked before starting the data collection for survival analysis. If an adequate measurement method is developed, a time-dependent covariate Cox regression will be possible. Another type of time-dependent variable is a covariate with a time-dependent coefficient [14]. If the analgesic mentioned above produces a level of tolerance, its effect decreases over time. This indicates that the risk of breakthrough pain occurrence may be higher as time passes, which apparently violates the proportional hazard assumption. In this case, the effect of the analgesic can be included in the survival function, expressed as a covariate whose coefficient is a function of time.

As mentioned above, a time-dependent covariate is incorporated into the analysis as a single value according to the repeated observation intervals. For example, a patient under analgesic medication takes an initial liver function test, the result of which is 40 IU/L, followed by 100 IU/L after four weeks with continued pain and 130 IU/L at eight weeks with pain, whereas at 12 weeks after analgesic administration, the pain has subsided and the medication is discontinued without a further laboratory test. The laboratory data input for the time-dependent covariate are 40 until week 4 without an event, 100 from week 4 to week 8 without an event, and 130 from week 8 to week 12, with an event occurring at week 12.

Clinical studies in the area of anesthesiology often include variables related to the response or effect of a specific treatment or medication. Depending on the characteristics and measurement methods of the variables, once a specific treatment or medication is applied, its effects gradually decrease over time or are delayed until the onset time. When the effects of a treatment or medication change over time, the coefficient of these effects can be expressed as a function of time, and for Cox regression, a step function is frequently applied. A step function is a method applying different coefficient values to different time intervals. A Cox regression can thus be established and output the integrated results [15]. In addition, a continuous parametric function for a time-dependent coefficient can be used for analysis instead of a step function [14].

R code for time-dependent coefficient Cox regression model: step function

As shown in Fig. 6, the Schoenfeld residuals of ‘Antiemetics’ and ‘Inopioid’ turn from positive to negative or vice versa at approximately 3 and 6 h. The data are arbitrarily separated using these time points.

tdc <- survSplit (Surv(Time, PONV) ~.
                  , data = PONV.raw
                  , cut=c(3, 6)
                  , episode = "tgroup"
                  , id = "id"
                  )

head(tdc)

The command ‘survSplit’ separates the patient data according to the established time intervals, where the covariate value assigned to each interval is the value measured at its left side (start time, ‘tstart’), and ‘Time’, the end of the interval, becomes the start of the next interval. That is, each interval is open at the left and closed at the right, (tstart, Time], and if an event occurs during the interval, the survival function is estimated using the variables measured at the left side of the interval (Table 9). Although the data appear to be duplicated at the end of one interval and the start of the next, no problem occurs because the divided time intervals do not overlap. It is possible to apply a Cox regression and GOF test with these separated data.

# Fitting Cox regression
fit.tdc <- coxph(Surv(tstart,Time, PONV)
                 ~ Antiemetics:strata(tgroup)
                 + Inopioid
                 , data = tdc)

summary(fit.tdc)

# GOF test
sf.tdc <- cox.zph(fit.tdc)
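The continuous alternative to a step function mentioned in the text, and the model comparison described under "Adding a time-dependent covariate", could be sketched with the `tt()` facility of the ‘survival’ package. This is only an illustration under stated assumptions: the data are simulated (the data frame `toy` and its columns `trt`, `time`, `status` are hypothetical and unrelated to PONV.raw), and the time function log(t) is an arbitrary choice.

```r
library(survival)

# Simulated data (hypothetical): one binary covariate under proportional hazards
set.seed(1)
n    <- 300
trt  <- rbinom(n, 1, 0.5)
time <- rexp(n, rate = 0.10 * exp(0.8 * trt))
cens <- runif(n, 5, 30)
toy  <- data.frame(trt    = trt,
                   status = as.numeric(time <= cens),
                   time   = pmin(time, cens))

# Model 1: constant coefficient (proportional hazards assumed)
fit.ph <- coxph(Surv(time, status) ~ trt, data = toy)

# Model 2: adds the covariate-by-time term trt * log(t) through tt()
fit.tt <- coxph(Surv(time, status) ~ trt + tt(trt), data = toy,
                tt = function(x, t, ...) x * log(t))

# Likelihood ratio test for the added term (1 degree of freedom)
lr   <- 2 * (fit.tt$loglik[2] - fit.ph$loglik[2])
p.lr <- pchisq(lr, df = 1, lower.tail = FALSE)

# Wald test of the same term (second coefficient of Model 2)
p.wald <- summary(fit.tt)$coefficients[2, "Pr(>|z|)"]

c(LR = lr, P.LR = p.lr, P.Wald = p.wald)
```

A small P value for the added term would suggest that the coefficient changes with time. As the article notes, the conclusion can depend on which function of time is chosen.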

Table 9. Data Divided by survSplit Function

     No   Antiemetics   Age   Wt     Inopioid   Survobj   Inopioid_c   id   tstart   Time   PONV   tgroup
1    1    0             48    78.5   0          4+        0            1    0        3      0      1
2    1    0             48    78.5   0          4+        0            1    3        4      0      2
3    3    0             54    88.3   100        21+       1            2    0        3      0      1
4    3    0             54    88.3   100        21+       1            2    3        6      0      2
5    3    0             54    88.3   100        21+       1            2    6        21     0      3
︙   ︙   ︙            ︙    ︙     ︙         ︙        ︙           ︙   ︙       ︙     ︙     ︙

All personal data are separated according to a preset time period (at 3 and 6 h). The same ‘id’ number indicates the same person. For example, data with id = 1 are separated into two time periods. The first period starts from time = 0 (tstart = 0) and ends at 3 (Time = 3) and PONV does not occur. The second period runs from 3 to 4 (the observation is prematurely ended before 6) and PONV does not occur. The same time period is indicated as tgroup (time group) in the last column. The other variables are the same as in Table 7.
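The same (start, stop] layout is what a time-dependent covariate requires. As a minimal sketch, the liver function example given in the text (40 IU/L at baseline, 100 IU/L at week 4, 130 IU/L at week 8, event at week 12) could be entered as follows. The data frame `lft.long` and its column names are hypothetical, and a real analysis would need many patients.

```r
library(survival)

# One hypothetical patient, one row per observation interval (weeks)
lft.long <- data.frame(
  id     = c(1, 1, 1),
  tstart = c(0, 4, 8),       # start of each interval
  tstop  = c(4, 8, 12),      # end of each interval
  event  = c(0, 0, 1),       # the event occurs at the end of the last interval
  lft    = c(40, 100, 130)   # liver function value measured at the interval start
)
lft.long

# Counting-process survival object built from the three intervals
lft.surv <- Surv(lft.long$tstart, lft.long$tstop, lft.long$event)
lft.surv

# With a full data set in this layout (here called 'all.patients', a hypothetical
# name), the model would be fitted as:
# coxph(Surv(tstart, tstop, event) ~ lft, data = all.patients)
```

Each row carries the covariate value that was valid during its interval, so the same patient contributes several rows but is never at risk twice at the same time.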


print (sf.tdc)

# Schoenfeld residual plot for each of the four coefficients
par(mfrow=c(2,2))
plot(sf.tdc[1])
abline (h = coef(fit.tdc)[1], lty = "dotted")
plot(sf.tdc[2])
abline (h = coef(fit.tdc)[2], lty = "dotted")
plot(sf.tdc[3])
abline (h = coef(fit.tdc)[3], lty = "dotted")
plot(sf.tdc[4])
abline (h = coef(fit.tdc)[4], lty = "dotted")

Table 10 shows the estimated Cox regression and GOF test results, and Fig. 9 presents a plot of the Schoenfeld residuals. The risk of PONV increases 1.0126-fold (95% CI: 1.0078–1.017, P < 0.001) per unit of intraoperative opioid. For the antiemetics, the group taking drug B showed an increased PONV risk of 3.6545-fold (95% CI: 1.2024–11.107, P = 0.022) until 3 h post-operation and 3.8969-fold (95% CI: 1.4020–10.831, P = 0.009) from 3 to 6 h post-operation, with no significant difference shown from 6 h until the end of the observation (risk ratio = 0.9382, 95% CI: 0.4242–2.075, P = 0.875). The results of the Schoenfeld residual test (Table 10 and Fig. 9) indicate no violation of the proportional hazard assumption for the three ‘Antiemetics’ terms or for the model as a whole (global P = 0.267), although ‘Inopioid’ taken alone still shows P = 0.023. Because these results are reported separately for each time period, they do not provide a single desired outcome, and it is necessary to combine the results.

# Combined results
# Note: the length-6 vectors below are recycled by data.frame() to 12 rows
combine.tdc <- data.frame(tstart = rep(c(0,3,6), 2)
                          , Time = rep(c(3,6, 24), 2)
                          , PONV = rep(0,12)
                          , tgroup= rep(1:3,4)
                          , trt = rep(1,12)
                          , prior= rep(0,12)
                          , Antiemetics = rep(c(0,1), each = 6)
                          , Inopioid = rep (c(0,1), each = 3)
                          , parameter = rep(0:1, each = 6)
                          )
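The step-function model summarised in Table 10 can be written as follows. This is a sketch: the symbols h_0, γ and β(t) are notation introduced here for illustration, and the cut points 3 and 6 h are those used in the ‘survSplit’ call.

```latex
h(t \mid x) = h_0(t)\,\exp\{\gamma\,\mathrm{Inopioid} + \beta(t)\,\mathrm{Antiemetics}\},
\qquad
\beta(t) =
\begin{cases}
\beta_1, & 0 < t \le 3 \\
\beta_2, & 3 < t \le 6 \\
\beta_3, & t > 6
\end{cases}
```

With the estimates in Table 10 (β_1 = 1.296, β_2 = 1.360, β_3 = −0.064), the hazard ratio of Drug B versus Drug A is exp(β(t)), that is, about 3.65, 3.90 and 0.94 in the three periods, which matches the values reported in the text.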

Table 10. Results of Time-dependent Coefficient Cox Regression Using Step Function and Schoenfeld Residual Test

Call: coxph(formula = Surv(tstart, Time, PONV) ~ Antiemetics:strata(tgroup) + Inopioid, data = tdc)

n = 250, number of events = 63

                                      coef        exp(coef)   se(coef)   z        Pr(>|z|)
Inopioid                              0.012477    1.012556    0.002413   5.172    2.32E-07 ***
Antiemetics:strata(tgroup)tgroup=1    1.295949    3.654464    0.567181   2.285    0.02232 *
Antiemetics:strata(tgroup)tgroup=2    1.360185    3.896914    0.521567   2.608    0.00911 **
Antiemetics:strata(tgroup)tgroup=3    −0.063743   0.938247    0.404993   −0.157   0.87494
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                                      exp(coef)   exp(-coef)   Lower .95   Upper .95
Inopioid                              1.0126      0.9876       1.0078      1.017
Antiemetics:strata(tgroup)tgroup=1    3.6545      0.2736       1.2024      11.107
Antiemetics:strata(tgroup)tgroup=2    3.8969      0.2566       1.4020      10.831
Antiemetics:strata(tgroup)tgroup=3    0.9382      1.0658       0.4242      2.075

Concordance = 0.67 (se = 0.031)
Rsquare = 0.152 (max possible = 0.874)
Likelihood ratio test = 41.35 on 4 df, P = 2e-08
Wald test = 38.92 on 4 df, P = 7e-08
Score (logrank) test = 44.61 on 4 df, P = 5e-09

Results of Schoenfeld residual test

                                      rho        chisq      P
Inopioid                              0.29948    5.150327   0.0232
Antiemetics:strata(tgroup)tgroup=1    −0.02755   0.047411   0.8276
Antiemetics:strata(tgroup)tgroup=2    −0.00368   0.000845   0.9768
Antiemetics:strata(tgroup)tgroup=3    0.02486    0.038692   0.8441
GLOBAL                                NA         5.199691   0.2674

‘Antiemetics’ is coded as 0 for Drug A or 1 for Drug B in the original data. ‘Inopioid’ is the amount of opioid used during surgery. The split time periods are presented as Antiemetics:strata(tgroup)tgroup=1 for the time period from 0 to 3, Antiemetics:strata(tgroup)tgroup=2 for the time period from 3 to 6, and Antiemetics:strata(tgroup)tgroup=3 for the time period from 6 to the end of the observation. coef: the value of coefficient, exp(coef): exponential value of coefficient, se(coef): standard error of coefficient, z: z-statistics, Pr(>|z|): P value of given z-statistics, Signif. codes: codes for significance marking.
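The columns of Table 10 are linked by simple arithmetic, which the following base-R sketch reproduces for the first ‘Antiemetics’ row. The two input values are copied from the table; the variable names are chosen here for illustration only.

```r
# Values copied from Table 10, row 'Antiemetics:strata(tgroup)tgroup=1'
coef.b <- 1.295949   # coef
se.b   <- 0.567181   # se(coef)

hr <- exp(coef.b)                                     # exp(coef), the hazard ratio
ci <- exp(coef.b + c(-1, 1) * qnorm(0.975) * se.b)    # Lower .95 and Upper .95
z  <- coef.b / se.b                                   # z statistic
p  <- 2 * pnorm(-abs(z))                              # two-sided Pr(>|z|)

round(c(HR = hr, lower = ci[1], upper = ci[2], z = z, P = p), 4)
```

The confidence interval is symmetric on the coefficient (log hazard ratio) scale and becomes asymmetric after exponentiation, which is why the upper limit lies much farther from the hazard ratio than the lower limit.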


Fig. 9. Schoenfeld residual graphs of time-dependent coefficient Cox regression.

combine.tdc

# Survival curves predicted from the time-dependent coefficient model
cfit.tdc <- survfit(fit.tdc
                    , newdata = combine.tdc
                    , id = parameter
                    )

cfit.tdc

# Kaplan-Meier estimate for comparison
km <- survfit(Surv(Time, PONV) ~Antiemetics
              , data = PONV.raw
              )

summary (km)
km

# Both models in a single plot
par( mfrow = c(1,1))
plot(km, xmax= 24, col="Black"
     , lty = c("solid","dashed"), lwd=2
     , xlab="Postoperative hours"
     , ylab="PONV free"
     )
lines(cfit.tdc, col="Grey"
      , lty= c("solid","dashed"), lwd=2)
legend (x = 0.15, y = 0.25
        , c("Drug A, Kaplan-Meier estimation"
            , "Drug B, Kaplan-Meier estimation"
            , "Drug A, Cox regression with time-dependent coefficient"
            , "Drug B, Cox regression with time-dependent coefficient"
            )
        , col = c("black", "black", "grey", "grey")
        , lty = c("solid", "dashed", "solid", "dashed")
        )

To compare the results from two antiemetics, the data divided by ‘survSplit’ are combined to enable an interpretation (combine.tdc). The results are shown in Table 11. The survival model considering the time-dependent coefficient increases the sample size because the data of one patient are separated at the established time points. Note that the median survival times in this model are 31 and 16 h, and the median survival times from the Kaplan–Meier analysis are 13 and 6 h. Plotting these two models into a single graph enables a visual comparison (Fig. 10). Here, although ‘ggsurvplot’ provides comprehensive graphs, it cannot draw two graphs simultaneously. Another graphics software is required to make a single graph from these graphs (Fig. 11).

Table 11. Comparison of Kaplan–Meier Analysis and Survival Analysis with Time-dependent Coefficient

Kaplan–Meier analysis
                    n     Events   Median   0.95LCL   0.95UCL
Antiemetics = 0     51    25       13       10        NA
Antiemetics = 1     53    38       6        5         12

Survival analysis with time-dependent coefficient
                    n     Events   Median   0.95LCL   0.95UCL
0                   104   126      31       17        40
1                   104   126      16       10        26

Proportional hazard assumed Kaplan–Meier analysis results are presented in the upper part of the table. Note that this result is the same as in Table 3. The lower part of this table presents the results of a Cox regression with a time-dependent coefficient. The median survival is different from the proportional hazard assumed analysis. Antiemetics = 0 and 1 indicate Drugs A and B, respectively. n: total number of cases, Events: number of patients who experienced postoperative nausea and vomiting, 0.95LCL: lower limit of 95% confidence interval, 0.95UCL: upper limit of 95% confidence interval.

Fig. 10. Graphical comparison between survival models of Kaplan–Meier and Cox regression with time-dependent coefficient. Black curves indicate the model fitted using a Kaplan–Meier analysis, and the gray curves are from a Cox regression with a time-dependent coefficient. The solid lines indicate Antiemetics = 0 (Drug A), and the dashed lines indicate Antiemetics = 1 (Drug B).

## plot using ggsurvplot

ggsurvplot ( km, data = PONV.raw,
             fun = "pct", pval = TRUE,
             conf.int = TRUE, surv.median.line = "hv",
             linetype = "strata", palette = "grey",
             legend.title = "Antiemetics",
             legend.labs = c("Drug A", "Drug B"),
             legend = c(.1, .2), break.time.by = 4,
             xlab = "Time (hour)",
             risk.table = TRUE, tables.height = 0.2,
             tables.theme = theme_cleantable(),
             risk.table.y.text.col = TRUE,
             risk.table.y.text = TRUE
             )
ggsurvplot ( cfit.tdc, data = PONV.raw,
             fun = "pct",
             conf.int = TRUE, surv.median.line = "hv",
             linetype = "strata", palette = "grey",
             legend.title = "Antiemetics",
             legend.labs = c("Drug A", "Drug B"),
             legend = c(.1, .2), break.time.by = 4,
             xlab = "Time (hour)",
             risk.table = TRUE, tables.height = 0.2,
             tables.theme = theme_cleantable(),
             risk.table.y.text.col = TRUE,
             risk.table.y.text = TRUE
             )

Conclusions

Clinical studies in the area of anesthesiology have rarely presented statistical results using survival analysis. In recent years, studies on the survival or recurrence of cancer according to the anesthetics have been actively published [16–18]. Survival analysis has the power to present clear and comprehensive results based on studies on pain control or the effects of medications. Previous articles have focused on the basic concepts of survival analysis and interpretations of the published results [1], and the present article covered the process of conducting a survival analysis using clinical data, finding errors, and achieving adequate results. Although this article does not include all existing survival analysis methods, it introduced several R codes to enable an intermediate level of survival analysis for clinical data in the field of anesthesiology.12)

Some clinical papers dealing with a survival analysis have presented statistical results without considering a proportional hazard assumption or an interaction between the covariates and time. The power of a log-rank test, which is commonly used to compare two groups, tends to decrease when a proportional hazard assumption is violated and can generate an incorrect result [19,20]. An investigation into the reporting of survival analysis results in leading medical journals indicated that the use of survival analysis has significantly increased, although several problems still exist, including descriptions regarding the censoring, sample size calculation, constant proportional hazard ratio assumption validations, and GOF testing [21]. Like most statistical analyses, survival analysis requires some essential assumptions. In a Kaplan–Meier analysis, the likelihood of an event of interest and censoring occurring should be independent from each other, and the survival probabilities of patients who participated in earlier and later studies should be similar. A log-rank test requires the assumptions described above as well as the proportional hazard assumption [22]. A CPH model requires a proportional hazard assumption, independence between the survival times among different patients, and a multiplicative relationship between the predictors and hazard [23].

12) A clustered event time analysis and an accelerated failure time analysis are often applied as survival analysis methods in clinical studies. A clustered event time analysis is similar to a stratified CPH model, and has certain advantages when each stratum has insufficient event cases. It has two types of processes: one is a marginal approach that estimates the survival function through an overall cluster from the pooled effect of each stratum, and the other is a conditional approach that estimates the survival function from the heterogeneity between clusters. An accelerated failure time analysis estimates the model similarly to a linear regression based on a Weibull distribution or log-logistic distribution. Unlike a CPH model that continuously maintains the risk ratio of the covariates, this model assumes that the disease process can be accelerated or decelerated over time.


Fig. 11. Cox regression model with the time-dependent coefficient. Survival curves of Kaplan–Meier analysis (A) and time-dependent coefficient (B) using ‘ggsurvplot’ command. Gray solid lines indicate Antiemetics = 0 (Drug A), and the black dashed lines indicate Antiemetics = 1 (Drug B). The results of the survival analysis are changed when considering the constant hazard ratio assumption.

When reporting or interpreting the results of survival analysis, it is important to identify the underlying assumptions that correspond to the statistical analysis, and it is necessary to verify that the assumptions are reasonable and well maintained. Statistical results with violated assumptions cause deviated decisions because of an increased probability of error. Survival analysis will be a powerful tool to achieve a scientific conclusion when an appropriate method is chosen with regard to the nature of the variables, the relationship with time, and other basic assumptions.

Conflicts of Interest

No potential conflict of interest relevant to this article was reported.

Author Contributions

Junyong In (Conceptualization; Writing–original draft; Writing–review & editing)
Dong Kyu Lee (Conceptualization; Writing–original draft; Writing–review & editing)

ORCID

Junyong In, https://orcid.org/0000-0001-7403-4287
Dong Kyu Lee, https://orcid.org/0000-0002-4068-2363

Supplementary Materials

Further details are presented in the online version of this article (Available from https://doi.org/10.4097/kja.19183).

References
1. In J, Lee DK. Survival analysis: Part I - analysis of time-to-event. Korean J Anesthesiol 2018; 71: 182-91.
2. Clark TG, Bradburn MJ, Love SB, Altman DG. Survival analysis part I: basic concepts and first analyses. Br J Cancer 2003; 89: 232-8.
3. Bewick V, Cheek L, Ball J. Statistics review 12: survival analysis. Crit Care 2004; 8: 389-94.
4. Hancock MJ, Maher CG, Costa Lda C, Williams CM. A guide to survival analysis for manual therapy clinicians and researchers. Man Ther
2014; 19: 511-6.
5. Kleinbaum D, Klein M. Evaluating the proportional hazards assumption. In: Survival Analysis. A Self-Learning Text. 2nd ed. New York,
Springer Science+Business Media, Inc. 2005, pp 131-72.
6. Schoenfeld D. Partial residuals for the proportional hazards model. Biometrika 1982; 69: 238-41.
7. Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 1994; 81: 515-26.
8. Abeysekera W, Sooriyarachchi R. Use of Schoenfeld’s global test to test the proportional hazards assumption in the Cox proportional
hazards model: an application to a clinical study. J Natl Sci Found 2009; 37: 41-51.


9. Ekman A. Variable selection for the Cox proportional hazards model: A simulation study comparing the stepwise, lasso and bootstrap
approach [Master's thesis]. [Umeå]: Umeå University; 2017. 50 p. Available from http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-130521
10. Prashant Narayan KC. Extension of Cox PH Model When Hazards are Non-Proportional Applied to Residential Treatment for Drug Abuse.
[Master's thesis]. [Mankato (MN)]: Minnesota State University; 2016. 51 p. Available from https://cornerstone.lib.mnsu.edu/etds/661/
11. Collett D. Testing the assumption of proportional hazards. In: Modelling Survival Data in Medical Research. 2nd ed. Boca Raton, Chapman
& Hall/CRC. 2003, pp 141-7.
12. Kleinbaum D, Klein M. The stratified Cox procedure. In: Survival analysis. A Self-learning Text. 2nd ed. New York, Springer
Science+Business Media, Inc. 2005, pp 173-210.
13. Collett D. Time-dependent variables. In: Modelling Survival Data in Medical Research. 2nd ed. Boca Raton, Chapman & Hall/CRC. 2003,
pp 251-72.
14. Zhang Z, Reinikainen J, Adeleke KA, Pieterse ME, Groothuis-Oudshoorn CG. Time-varying covariates and coefficients in Cox regression
models. Ann Transl Med 2018; 6: 121.
15. Thomas L, Reyes EM. Tutorial: survival estimation for Cox regression models with time-varying coefficients using SAS and R. J Stat Softw
2014; 61: 1-23.
16. Wigmore TJ, Mohammed K, Jhanji S. Long-term survival for patients undergoing volatile versus IV anesthesia for cancer surgery: a
retrospective analysis. Anesthesiology 2016; 124: 69-79.
17. Tsui BC, Rashiq S, Schopflocher D, Murtha A, Broemling S, Pillay J, et al. Epidural anesthesia and cancer recurrence rates after radical
prostatectomy. Can J Anaesth 2010; 57: 107-12.
18. Biki B, Mascha E, Moriarty DC, Fitzpatrick JM, Sessler DI, Buggy DJ. Anesthetic technique for radical prostatectomy surgery affects cancer
recurrence: a retrospective analysis. Anesthesiology 2008; 109: 180-7.
19. Qiu P, Sheng J. A two‐stage procedure for comparing hazard rate functions. J R Stat Soc Series B Stat Methodol 2008; 70: 191-208.
20. Li H, Han D, Hou Y, Chen H, Chen Z. Statistical inference methods for two crossing survival curves: a comparison of methods. PLoS One
2015; 10: e0116774.
21. Abraira V, Muriel A, Emparanza JI, Pijoan JI, Royuela A, Plana MN, et al. Reporting quality of survival analyses in medical journals still
needs improvement. A minimal requirements proposal. J Clin Epidemiol 2013; 66: 1340-6.
22. Goel MK, Khanna P, Kishore J. Understanding survival analysis: Kaplan-Meier estimate. Int J Ayurveda Res 2010; 1: 274-8.
23. George B, Seals S, Aban I. Survival analysis and regression models. J Nucl Cardiol 2014; 21: 686-94.
