Propensity Score Matching Methods For The
by
August 2019
Abstract

Observational studies are often used to investigate the effects of treatments on a specific outcome. In many observational studies, the event of interest can be of recurrent type, which means that subjects may experience the event of interest more than once during their follow-up. The lack of random allocation of treatments to subjects in observational studies may induce selection bias, leading to systematic differences in observed and unobserved baseline characteristics between treated and untreated subjects. Propensity score matching is a popular technique to address this issue. It is based on the estimation of the conditional probability of treatment assignment given the measured baseline characteristics. The use of the propensity score in the analysis of observational studies with recurrent event outcomes has not been well developed. In this study, we consider three matching methods, called propensity score matching, covariate matching and history matching, and compare their accuracy in estimating treatment effects on recurrent event rates through Monte Carlo simulation studies. We consider various scenarios under the settings of time-fixed and time-dependent treatment indicators. A synthetic data set is analyzed to illustrate the methods discussed in the thesis.
To My Family
Acknowledgements
Statement of contribution
Dr. Candemir Cigsar proposed the research question that was investigated throughout
this thesis. The overall study was jointly designed by Dr. Candemir Cigsar and
Yasin Khadem Charvadeh. The algorithms were implemented, the simulation study
was conducted and the manuscript was drafted by Yasin Khadem Charvadeh. Dr.
Candemir Cigsar supervised the study and contributed to the final manuscript.
Table of contents

Title page
Abstract
Acknowledgements
Statement of contribution
Table of contents
List of tables
List of figures

1 Introduction
1.1 Propensity Score Methods and Recurrent Events in Observational Studies
1.1.1 Recurrent Events in Observational Studies
1.2 Literature Review
1.3 The Goal of The Thesis
2.1.1 Covariates
2.2 Fundamental Models
2.2.1 Poisson Processes
2.2.2 Renewal Processes
2.3 Propensity Score Matching
2.4 Simulation Procedures for Recurrent Event Processes
2.5 Construction of the Likelihood Function
Bibliography
List of tables

4.6 Empirical estimates (E.E.'s) resulting from the matching methods in the first scenario when α3 = 0.9 and α6 = 0.55 (m=1000).
4.7 Theoretical estimates (T.E.'s) and empirical estimates (E.E.'s) resulting from the matching methods in the second scenario (m=1000).
4.8 Theoretical estimates (T.E.'s) and empirical estimates (E.E.'s) resulting from the matching methods in the second scenario when α3 = 0.9 and α6 = 0.55 (m=1000).
4.9 Empirical estimates (E.E.'s) resulting from the matching methods in the first scenario (m=500).
4.10 Empirical estimates (E.E.'s) resulting from the matching methods in the first scenario when α3 = 0.9 and α6 = 0.55 (m=500).
4.11 Theoretical estimates (T.E.'s) and empirical estimates (E.E.'s) resulting from the matching methods in the second scenario (m=500).
4.12 Theoretical estimates (T.E.'s) and empirical estimates (E.E.'s) resulting from the matching methods in the second scenario when α3 = 0.9 and α6 = 0.55 (m=500).
List of figures
Chapter 1
Introduction
In many observational studies, the event of interest is of recurrent type, which means that subjects may experience an event more than once during their lifetimes. It is usually assumed that the recurrent events are generated by a process under a random mechanism. Data obtained from such processes are called recurrent event data (Cook and Lawless, 2007). An important objective of analyzing recurrent event data is to investigate the effects of treatments and other explanatory covariates on event occurrences. Study designs define how data are collected in recurrent event studies. Recurrent event data can be obtained through randomized clinical trials or through observational studies in epidemiology. In the absence of randomization, statistical methods to analyze recurrent event data may suffer from selection bias. Specific methods and approaches therefore need to be developed and utilized to analyze data from such recurrent event studies, because causal inference cannot be made directly when treatments are not randomly allocated to subjects.
Analysis of recurrent event data obtained from randomized controlled trials has been well discussed. A survey of statistical methods for such data can be found in Cook and Lawless (2007, Section 8.4). In the case of observational studies, as discussed by Smith and Schaubel (2015), statistical methods have been developed for settings where the objective of a study is to describe the event generation process. However, there is an important gap in the literature on establishing cause-and-effect relations between treatments and event occurrences, in particular when the goal of a study is to compare the effectiveness of treatments. Important challenges involved in the analysis of
recurrent events include dependencies between event occurrences in the same subject,
various censoring mechanisms leading to incomplete data and unexplained hetero-
geneity in some characteristics of subjects in a population. Examples of recurrent
events include occurrence of asthma attacks in infants, infections in renal transplant
patients and insurance claims for policy holders. Cook and Lawless (2007) present
several examples of recurrent event data.
A key function in modeling of recurrent events is called the intensity function
of a recurrent event process (Cook and Lawless, 2007, p. 10). We mathematically
define the intensity function in the next chapter. The intensity function of a recurrent
event process is very flexible and can be extended to deal with regression problems in
recurrent event studies. Therefore, we consider intensity based regression models for
recurrent event processes. As discussed in Section 1.3, our main objective in this thesis is to investigate some important matching methods to deal with selection bias in observational studies when individuals are subject to recurrent events. Therefore,
we consider simple recurrent event models. We discuss more complicated models and
how to extend the methods discussed to deal with them in the final chapter of the
thesis.
of this method is the high dimensionality of the matching problem that leads to a
larger bias due to the fact that many individuals remain unmatched. This issue can
be addressed by using the PSM, which substantially reduces the dimensionality of the
problem (Rosenbaum and Rubin, 1985a). It should be noted that Rosenbaum and
Rubin (1985b), Rubin and Thomas (1996) and Rubin (2001) have found that match-
ing on the linear propensity score can be particularly effective in terms of reducing
bias.
Several PSM methods have been proposed, such as nearest neighbor matching, caliper and radius matching, stratification and interval matching, and kernel and local linear matching. Nearest neighbor matching, proposed by Rubin (1973a), is one of the most straightforward and common matching methods, in which we choose an individual from the control or comparison group as the match for a treated individual. In this method, the two matched individuals should have very close propensity scores. Althauser and Rubin (1970), Cochran and Rubin (1973), Rubin (1973a) and Raynor Jr (1983) investigated another matching method, called the
caliper matching, which is a variant of the nearest neighbor matching method. This
method avoids bad matches, a problem common to the nearest neighbor matching, by
imposing a tolerance level on the maximum propensity score distance (i.e., a caliper).
In this case an individual from the comparison group is considered as a match for
the treated individual if it lies within the caliper. Rosenbaum and Rubin (1985b)
discuss the choice of an appropriate caliper width by using results from Cochran and
Rubin (1973). Smith and Todd (2005) noted that a drawback of the caliper matching
method is that it could be difficult to know a priori what choice for the tolerance
level would be reasonable. Dehejia and Wahba (2002) proposed a variant of caliper
matching called the radius matching. In the radius matching, it is possible to use
all comparison group members together with nearest neighbors within each caliper,
which in return allows us to use more units as matches, and avoid bad matches.
The work by Rosenbaum and Rubin (1984) is the first solid study on stratification and interval matching based on propensity scores. In this version of propensity score matching, the common support of the propensity score is partitioned into a set of intervals. The idea behind this method is to calculate the mean difference in outcomes between treated and control observations falling within each stratum; this mean difference is called the impact within the interval. A weighted average of the interval impact estimates then provides an overall impact estimate.
Kernel and local linear matching are nonparametric matching estimators that use
weighted averages of all or, depending on the choice of the kernel function, nearly all
individuals in the control group for each observation of the treated group to construct
the counterfactual outcome. In this case, the allocation of the weights is based on the
propensity score, which means that the closer the propensity score of an individual
in the control group to that of the treated individual, the higher the weight would
be. The main research on kernel and local linear matching was done by Heckman et al. (1997, 1998).
Robins et al. (2000) proposed the class of marginal structural models, which is
a new class of causal models allowing for improved adjustment of time-dependent
confounders when there exist time-varying exposures or treatments. They showed
that the parameters of a marginal structural model can be consistently estimated
using a new class of estimators called the inverse probability of treatment weighted
estimators. Stuart (2010) provided a detailed structure and guidance for researchers
interested in using matching methods. Cottone et al. (2019) and Vansteelandt and
Daniel (2014) investigated the efficiency and performance of regression adjustment for
propensity scores to estimate the average treatment effect in observational studies.
Recurrent events have been of interest for researchers for a long time. The recent
history of the statistical analysis of recurrent events through stochastic processes in
medical sciences goes back almost forty years. For example, Byar (1980) investigated the effect of instillations of thiotepa on bladder tumors, which could recur during the first two years after transurethral resection. Following Byar's work, Gail et al. (1980) were concerned with the comparison of episodic illness data arising from two
treatment groups. Lawless and Nadeau (1995) analyzed data on automobile warranty
claims, and improved the method discussed by Nelson (1988) for estimating the cumu-
lative mean function of identically distributed processes of recurrent events. Lawless
and Nadeau (1995) proposed a robust estimation method based on rate functions of
recurrent event processes. Their method can be used with regression models under
certain conditions. The gist of their research is that they used point estimates based
on Poisson models and developed robust variance estimates, which are still valid even
if the assumed model is not a Poisson process.
Over the past two decades, many methodologies such as marginal and conditional
methods have been developed to analyze multivariate survival data of recurrent events
(Prentice et al., 1981; Andersen and Gill, 1982; Wei et al., 1989; Lee et al., 1992; Pepe and Cai, 1993; Lin et al., 2000). Moreover, there has been interest in comparing these conditional and marginal methods, as found in the works of Cook and Lawless (2002), Cai and Schaubel (2004) and Kelly and Lim (2000).
Liang et al. (1993) discussed an approach for estimating parameters in a proportional
hazards regression (Cox, 1972) type of specification for the recurrent event processes
with external covariates. Liang et al. (1995) provided a survey of models and methods
for analyzing multivariate failure time data including frailty and marginal models for
recurrent events. Lin et al. (2000) proposed a semi-parametric regression for the
mean and rate functions of recurrent events providing rigorous justification through
modern empirical process theory. An important assumption of the above methods
is independent censoring, sometimes known as conditionally independent censoring; see Cook and Lawless (2007, Section 2.6) for more details.
Lawless et al. (1997) studied the mean and rate functions of recurrent events among
survivors at certain time points. They suggested joint rate/mean function models for
recurrent and terminal events by modeling marginal distribution of failure times and
the rate function for the recurrent events conditional on the failure time. The objec-
tive of their paper was to present fairly simple methods for assessing the effects of
treatments or covariates on recurrent event rates when other terminal events inducing
the dependent censoring are present. Chen and Cook (2004) described methods for testing for differences in mean functions between treatment groups when each event process is ultimately terminated by death. They showed that methods based on the assumption that the recurrent event process is independently terminated, as under regular censoring, may not be valid. Many different models and methods for the statistical analysis of recurrent events and their special cases can be found in the books by Daley and Vere-Jones (2003) and Cook and Lawless (2007) and the references given therein.
The propensity score matching (PSM) is a statistical matching technique that attempts to estimate the effect of a treatment, policy or another type of intervention in the analysis of observational data by accounting for the covariates that predict receipt of the treatment (Rosenbaum and Rubin, 1983). In other words, researchers intend to mimic the properties of randomized experimental designs with propensity score (PS) techniques by trying to make the treatment and control groups similar on covariates that are believed to interfere with the correct estimation of the treatment effect. After applying the PSM, the only difference between the treatment and control groups would, in theory, be the treatment (Rosenbaum and Rubin, 1983).
The use of PSM in univariate survival analysis has recently been studied, but there has been little research on the estimation of treatment effects in the presence of recurrent events. In this thesis, we consider observational studies in which individuals are subject to recurrent events and receive a certain type of treatment according to some of their characteristics. Furthermore, we investigate relationships between explanatory factors and an outcome, and their incorporation in modeling PSM for recurrent events. We consider simple recurrent event models to investigate the effects of different matching methods in more detail. This allows us to avoid the complexity added by the event generation models. We focus on a simple "treated" (i.e., treatment) versus "untreated" (i.e., control) groups case. In some settings, we discuss situations where individuals switch from an existing treatment to a new treatment regimen.
We consider three matching methods: (i) propensity score matching, (ii) covariate matching, and (iii) history matching. Among these methods, history matching is a relatively new matching method that can be applied only in event history settings, which include recurrent events as a special case. This technique has not been extensively discussed in the literature. To our knowledge, it has only been applied in a restricted setting by Smith and Schaubel (2015) and Smith et al. (2018). We discuss this technique in two different settings. The first setting includes a time-fixed treatment assigned after the start of the follow-up of individuals in the study. In the second setting, we consider a time-varying treatment, in the sense that the treatment is assigned at the start of follow-up and may change at some point during follow-up. In each setting, we conducted simulation studies with various scenarios. The studies and results are explained in Chapters 3 and 4. In Chapter 5, we present an illustrative analysis of a synthetic data set generated to mimic data sets obtained from studies of recurrent epileptic seizures in adults.
Our main objective in this thesis is to investigate the effects of these three different matching methods on the accuracy of the estimation of treatment effects in observational studies with recurrent events. The novelty of the study is the use of history information to match the treated and untreated subjects in the cohort. In other words, we use the information obtained from the past event occurrences experienced by individuals in observational studies to balance the baseline characteristics between treated and untreated groups in a cohort of individuals. Furthermore, we compare the accuracy of the history matching method in the estimation of treatment effects with that of two popular matching methods.
Chapter 2
In a prospective study, the selected individuals are longitudinally followed and the events of interest occurring during their follow-up are recorded, while in a retrospective study the data are available for analysis prior to the study design. Cook and Lawless (2007) provide many examples of recurrent event data arising from various research fields.
In this section, we introduce the notation frequently used in the remaining parts of
the thesis and some fundamental concepts. Recurrent event data are usually analyzed
under the point process framework, where a process may undergo some sort of events
repeatedly over time. Rigorous probabilistic treatment of point processes can be found
in point process textbooks; e.g., in Daley and Vere-Jones (2003, 2007). We adopt the standard counting process notation given by Cook and Lawless (2007).
A stochastic process {W (t); t ∈ T } is a family of random variables that is indexed
by the element t in the index set T . In this thesis, the index t denotes the time
so that t ≥ 0. Therefore, W (t) is a random variable representing the observable
value of w(t) at time t, where t ∈ T . A stochastic process is called a discrete-time
process if the set T is finite or countable; otherwise, it is a continuous-time process
(Daley and Vere-Jones, 2003). A point process is a probabilistic model for random
scatterings of points on some space S often assumed to be a subset of Rd for some
d > 0. Oftentimes, point processes describe the time or space occurrences of random
events, in which the occurrences are revealed one-by-one as time evolves (Jacobsen,
2006). A counting process, denoted by {N(t); t ≥ 0}, is a stochastic process where N(t) represents the cumulative number of events occurring over the time interval (0, t], with the following properties: N(0) = 0; N(t) is a non-negative integer; and if s ≤ t, then N(s) ≤ N(t). If s < t, the notation N(s, t) represents the number of event occurrences in the interval (s, t]; that is, N(s, t) = N(t) − N(s).
Two important and commonly used counting processes are Poisson processes (PPs)
and renewal processes (RPs). In many studies, the interest is in modeling either the
mean or rate functions of a counting process. The mean function of a counting process {N(t); t ≥ 0} is defined as

\[ \mu(t) = E\{N(t)\}, \quad t \ge 0, \qquad (2.1) \]

and the associated rate function ρ(t) is the derivative of the mean function; that is,

\[ \rho(t) = \mu'(t) = \frac{d}{dt}\,\mu(t), \qquad (2.2) \]

where we assume that the expectation in (2.1) and the derivative in (2.2) exist.
Let T be a non-negative and continuous random variable. The cumulative distribution function (c.d.f.) and probability density function (p.d.f.) of the random variable T are defined as F(t) = Pr(T ≤ t) and f(t) = (d/dt)F(t), respectively. The complement of the c.d.f. is called the survival function S(t), which gives the probability that an event has not occurred up to time t. Thus, we have

\[ S(t) = \Pr(T \ge t) = 1 - F(t) = \int_t^{\infty} f(x)\, dx, \quad t \ge 0. \qquad (2.3) \]
The hazard function of the random variable T is defined as

\[ h(t) = \lim_{\Delta t \to 0} \frac{\Pr\{t \le T < t + \Delta t \mid T \ge t\}}{\Delta t} = \frac{f(t)}{S(t)}, \quad t \ge 0. \qquad (2.4) \]

The intensity function of a counting process {N(t); t ≥ 0}, which gives the instantaneous probability of an event at time t conditional on the history of the process, is mathematically defined as

\[ \lambda(t \mid H(t)) = \lim_{\Delta t \to 0} \frac{\Pr\{\Delta N(t) = 1 \mid H(t)\}}{\Delta t}, \qquad (2.5) \]

where ∆N(t) = N((t + ∆t)−) − N(t−) represents the number of events in the interval [t, t + ∆t). Note that the history H(t) = {N(s) : 0 ≤ s < t} records all information on event occurrences of the counting process {N(t); t ≥ 0} over the time interval [0, t),
which includes event occurrence times over [0, t). The intensity function is important
in the analysis of recurrent events because it completely defines an orderly counting
process (Cook and Lawless, 2007, p. 10). Therefore, we use intensity functions to
generate event times of recurrent event processes in our simulation studies in this
thesis. Details of the use of the intensity function in simulation studies can be found
in Section 2.4 in this chapter.
2.1.1 Covariates
Covariates play an important role in modeling recurrent events and estimating propen-
sity scores (PSs). In the case of recurrent events, covariates may affect the probabilistic characteristics, that is, the intensity function, of a counting process. In propensity score matching (PSM), covariates are crucial mainly because they are used to match individuals in a control group with individuals from a treatment group.
Covariates can be observed or unobserved, and are basically classified as external or internal (Kalbfleisch and Prentice, 2002, pp. 197-200). An internal covariate is one whose change over time is related to the behavior of the individual, meaning that any change in the covariate reflects the condition of the individual. Examples of internal covariates include disease complications, blood
the individual. Examples of internal covariates include disease complications, blood
pressure, etc. In contrast, an external covariate is one whose value is external to the
individual. In other words, individuals under study cannot affect the value of exter-
nal covariates, but external covariates may cause some specific change in an individual's
physical or mental health. For example, levels of air or water pollution can be clas-
sified as external covariates. Furthermore, a covariate is called time-dependent if its
value changes over time or called time-fixed otherwise. Note that fixed covariates are
naturally external.
We use the notation x or z to represent the value of a fixed covariate, and x(t) or z(t) to represent the value of a time-varying covariate at time t. The intensity function of a recurrent event process with covariates can then be written as λ(t | H(t)), where H(t) = {N(s), x(u); 0 ≤ s < t, 0 ≤ u ≤ t}. Note that the history H(t) includes information on the counting process {N(t); t ≥ 0} over [0, t), but information on covariates over [0, t], which means that the value of the covariate process x(t) is known in the intensity function at time t. More discussion on the extended history
functions to include covariates can be found in Daley and Vere-Jones (2003) and Cook
and Lawless (2007).
2.2 Fundamental Models

In this section, we describe the basic families of models for recurrent event processes, such as PPs and RPs, that will be used in subsequent chapters for describing and analyzing data.
2.2.1 Poisson Processes

The Poisson process (PP) is one of the most widely used counting processes. It is usually used in scenarios where we count the occurrences of events that appear to happen at a certain rate, but completely at random; that is, without a particular structure. For example, suppose that, from past data, we know that heart attacks happen to an individual at a rate of two per year, but that, other than this information, the timings of heart attacks seem to be completely random. In such a case, the PP might be a good model for making inference on the rate of heart attacks.
In modeling recurrent event processes, a PP describes a situation in which events occur randomly in such a way that the numbers of events in non-overlapping time intervals are independent. PPs are also suitable when there are external covariates that affect the occurrence of events. It is worth mentioning that PPs and other models based on counts are appropriate for incidental events, whose occurrence does not change the process itself.
There are various equivalent ways of defining a PP. One way of defining a PP is
through its intensity function (Cook and Lawless, 2007, Section 2.1.1). Let {N (t); t ≥
0} be a counting process with the intensity function λ(t|H(t)). Then, {N (t); t ≥ 0}
is called a PP if the intensity function is of the form

\[ \lambda(t \mid H(t)) = \rho(t), \quad t \ge 0. \qquad (2.6) \]

It is obvious that the Poisson process intensity function (2.6) does not depend on
the history of the process H(t), meaning that in the absence of covariates, intensity
is specified only by t. This fact is a result of the independent increment property of
the PPs, which shows that the PPs possess the Markov property (Cook and Lawless,
2007, p. 32). For the special case, in which ρ(t) = ρ > 0 (i.e. a positive constant), the
process {N (t); t ≥ 0} is called a homogeneous Poisson process (HPP); otherwise, it is
called a non-homogeneous Poisson process (NHPP). It should be noted that a HPP
{N(t); t ≥ 0} with the rate function ρ > 0 has the following properties:

• N(0) = 0,

• the process has independent increments, and

• N(s, t) has a Poisson distribution with mean ρ(t − s), for 0 ≤ s < t.

A proof of this result can be found in Daley and Vere-Jones (2003). For any PP {N(t); t ≥ 0}, the following results can be obtained from the intensity function given in (2.6) (Daley and Vere-Jones, 2003):
(i) N(0) = 0.
(ii) N (s, t) has a Poisson distribution with mean µ(s, t) = µ(t) − µ(s), for 0 ≤ s < t.
(iii) Let (s1 , t1 ] and (s2 , t2 ] be any two non-overlapping intervals, then N (s1 , t1 ) and
N (s2 , t2 ) are independent random variables.
The following result is the key to generating realizations of a NHPP through Monte Carlo simulation. A proof of it can be found in Daley and Vere-Jones (2003).

Proposition 2.2.1. Let {N(t); t ≥ 0} be a NHPP with the mean function µ(t). Then, {N∗(s); s ≥ 0} is a HPP with the rate function ρ∗(s) = 1 if we define s = µ(t) and N∗(s) = N(µ−1(s)).
Therefore, by generating event times of a HPP with rate function ρ∗ (s) = 1, we can
consequently generate event times of a NHPP using the relation t = µ−1 (s).
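To make this concrete, the following Python sketch (our own illustration, not code from the thesis; the helper name simulate_nhpp is hypothetical) generates NHPP event times by simulating a unit-rate HPP on the transformed scale and mapping back through t = µ−1(s):

```python
import numpy as np

def simulate_nhpp(mu_inverse, tau, rng=None):
    # Gap times of a unit-rate HPP are exponential(1); cumulating them gives s,
    # and t = mu^{-1}(s) yields the NHPP event times (Proposition 2.2.1).
    rng = np.random.default_rng() if rng is None else rng
    times, s = [], 0.0
    while True:
        s += rng.exponential(1.0)
        t = mu_inverse(s)
        if t > tau:
            return np.array(times)
        times.append(t)

# Example: rho(t) = 2t, so mu(t) = t^2 and mu^{-1}(s) = sqrt(s).
event_times = simulate_nhpp(np.sqrt, tau=10.0)
```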
The external covariates affecting the event occurrence rate can be easily incor-
porated in PP models through the intensity function (2.6). These covariates can be
involved in a PP by redefining the history of the associated intensity (i.e., rate) func-
tion to include covariate information. As discussed above, the intensity function of a
PP at time t depends only on t, and is not a function of the past of the process; i.e., the
history H(t). Covariates in PPs can be included in the intensity function as follows.
Let x(t) be a p-dimensional vector of time-fixed and/or time-varying covariates. We define z(t) = (z1(t), . . ., zq(t))′, q ≥ p, as a q-dimensional vector of covariates whose elements include x(t), as well as functions of t if the model depends on them. The intensity function can then be defined as

\[ \lambda(t \mid H(t)) = \rho(t \mid x(t)) = \rho_0(t) \exp\{z'(t)\, \beta\}, \qquad (2.7) \]
where β is a q-dimensional vector of parameters and ρ0 (t) is called the baseline rate
function of the process {N (t); t ≥ 0}. The model (2.7) is usually called the multiplica-
tive model, in which the effect of covariates z(t) on the rate function is assumed to be
of log-linear form. The multiplicative model is the most common family of regression
models for recurrent events. Therefore, we consider only the multiplicative models in
this thesis. However, if the log-linearity assumption of the multiplicative model (2.7)
is not valid, additive or time transform models can be used for regression in recurrent
events as well. The multiplicative model (2.7) is fully parametric if the rate function
including both the baseline rate function and the exponential function is determined
parametrically. If the baseline rate function is free of parameters but the exponential
function in (2.7) is parametrically specified, the model is semi-parametric, which is
sometimes called the the Andersen-Gill model (Cook and Lawless, 2007).
It should be noted that the intensity function (2.7) can represent a PP if and only
if the covariates are external. The model including internal covariates is no longer
a PP, but can be specified as a general intensity-based process. These models are
useful in particular when there is a need for modeling the past of a process. Since
we only focus on PPs in this thesis, we do not consider the general intensity-based
models. However, the methods discussed in the following chapters can be extended
to deal with such models. The general intensity-based models are discussed by Cook
and Lawless (2007, Chapter 5).
Poisson models are useful in some settings and applications, but the main drawback
of using them to model recurrent events is that usually real-life data sets are overdis-
persed and exhibit variability in the number of event occurrences beyond the amount
predicted by Poisson models. This situation usually occurs whenever there is het-
erogeneity among subjects due to some unmeasured factors or subject specific effects
that influence event rates (Cook and Lawless, 2007, p. 35). Such a heterogeneity
is called the unexplained heterogeneity. In such situations, even after conditioning
on observed covariates, V ar{N (t)} appears to be substantially larger than E{N (t)}.
Since under a Poisson model the mean and the variance of N (t) need to be equal, the
use of Poisson models is therefore no longer plausible when unexplained variability is
present in a given data set.
This issue can be addressed by incorporating unobservable random effects. To
explain this, we now consider a cohort of m individuals, and introduce the index i
to denote the ith individual process, where i = 1, . . ., m. Following the notation
given in Cook and Lawless (2007, Section 2.2.3), we let ui denote the unobserved
random effect for the ith individual, i = 1, . . ., m. For simplicity we assume that z i
denotes a p-dimensional vector of time-fixed covariates. The results in this section
can be extended to the external time-varying covariates case as well. Conditional
on covariates zi and the random effect ui, the mixed Poisson model of the process {Ni(t); t ≥ 0} is then given with the intensity function

\[ \lambda_i(t \mid z_i, u_i) = u_i\, \rho_0(t) \exp(z_i' \beta), \quad t \ge 0, \qquad (2.8) \]

where the ui are i.i.d. random variables following a distribution function G(u) with a finite mean. It should be noted that, even though the model given in (2.8) is a Poisson process for a given value of ui, the marginal process {Ni(t); t ≥ 0} is not a PP in
general.
We may assume without loss of generality that E(ui ) = 1 and V ar(ui ) = ϕ, where
ϕ > 0. Any c.d.f. under these assumptions can be used to model the random effects ui .
The most commonly used distribution for the ui is however the gamma distribution
as it would make the multiplicative mixed Poisson model (2.8) mathematically more
convenient to work with. In this case, the ui have a gamma distribution with mean 1
and variance ϕ, and the p.d.f. of the form
−1 −1
uϕ exp(−u/ϕ)
g(u; ϕ) = , u > 0. (2.9)
ϕϕ−1 Γ(ϕ−1 )
Let µi (s, t) denote the expected number of events in {Ni (t); t ≥ 0} over the time
interval (s, t], where 0 < s < t; that is, µi (s, t) = E{Ni (s, t)} = E{Ni (t) − Ni (s)}.
Then, by definition, for i = 1, . . ., m,

\[ \mu_i(s, t) = \int_s^t \rho_0(v) \exp(z_i' \beta)\, dv = \mu_0(s, t) \exp(z_i' \beta). \qquad (2.10) \]

Given zi and ui, the random variable Ni(s, t) follows a Poisson distribution with mean ∫_s^t ρ(v | zi, ui) dv = ui µi(s, t). Note that, given only zi, the distribution
of Ni (s, t) is no longer Poisson but is negative binomial with probability function of
the form

\[ \Pr(N_i(s,t) = n \mid z_i) = \int_0^\infty \frac{[u\,\mu_i(s,t)]^n}{n!} \exp\{-u\,\mu_i(s,t)\}\, g(u; \phi)\, du = \frac{\Gamma(n + \phi^{-1})}{\Gamma(\phi^{-1})\, n!} \cdot \frac{[\phi\, \mu_i(s,t)]^n}{[1 + \phi\, \mu_i(s,t)]^{n + \phi^{-1}}}, \quad n = 0, 1, 2, \ldots \qquad (2.11) \]
Note that the limit as ϕ → 0 gives the Poisson distribution (Cook and Lawless, 2007,
p. 36). Therefore, the model converges to a Poisson process in the limit when ϕ → 0.
However, the case ϕ > 0 represents overdispersion for the Poisson model, and the
process becomes a negative binomial process, for which the intensity function at time t can be expressed as

\[ \lambda_i(t \mid H_i(t)) = \frac{1 + \phi\, N_i(t^-)}{1 + \phi\, \mu_i(t)}\, \rho_i(t), \quad t \ge 0, \qquad (2.12) \]
(Cook and Lawless, 2007, p. 37). The level of overdispersion in the observed event counts is determined by the parameter ϕ. A high value of ϕ represents more pronounced overdispersion (i.e., unexplained heterogeneity) in the event counts across individual processes. For this reason, the parameter ϕ is sometimes called the
heterogeneity parameter of the mixed Poisson process.
We now present the expressions for the marginal mean and variance of Ni(s, t) based on the random effects model (2.8). It is easy to see that the marginal mean is given by

\[ E\{N_i(s,t)\} = \mu_i(s,t), \qquad (2.13) \]

and the marginal variance is

\[ \mathrm{Var}\{N_i(s,t)\} = \mu_i(s,t) + \phi\, \mu_i(s,t)^2. \qquad (2.14) \]

Moreover, the marginal covariance for event counts over non-overlapping intervals (s1, t1] and (s2, t2] can be written as

\[ \mathrm{Cov}\{N_i(s_1,t_1),\, N_i(s_2,t_2)\} = \phi\, \mu_i(s_1,t_1)\, \mu_i(s_2,t_2). \qquad (2.15) \]
It is worth mentioning that relationships (2.13), (2.14) and (2.15) hold for any distri-
bution function for the ui .
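As a quick numerical illustration of (2.13) and (2.14) under the gamma frailty model (a sketch with illustrative parameter values only, not code from the thesis), one can simulate mixed Poisson counts and compare the sample moments with the theoretical ones:

```python
import numpy as np

# u_i ~ gamma with mean 1 and variance phi; N_i | u_i ~ Poisson(u_i * mu).
rng = np.random.default_rng(1)
phi, mu, m = 0.5, 4.0, 200_000

u = rng.gamma(shape=1.0 / phi, scale=phi, size=m)   # E(u) = 1, Var(u) = phi
counts = rng.poisson(u * mu)                        # mixed Poisson counts

print(counts.mean(), mu)                   # close to mu             (2.13)
print(counts.var(), mu + phi * mu ** 2)    # close to mu + phi*mu^2  (2.14)
```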
2.2.2 Renewal Processes

A renewal process (RP) is a stochastic process model for recurrent events that randomly occur in time and are subject to some sort of "renewal" after each event occurrence. As defined in this section, RPs have very strict conditions by definition, which limits their use in many applications. However, they can be modified to build
more realistic models. In this section, we introduce only some basic RP models and
a few extensions of them. More details on RPs can be found in Daley and Vere-Jones
(2003) and Cook and Lawless (2007).
Let Tj , j = 1, 2, . . ., be the occurrence time of the jth event, which is usually
called the jth arrival time, of the counting process {N (t); t ≥ 0} with the associated
intensity function λ(t | H(t)), and let T0 = 0. Then, Wj = Tj − Tj−1 , j = 1, 2, . . ., is
called the jth gap time; that is, the time between the (j − 1)st and jth events. RPs
are defined as stochastic processes in which the gap times between successive events
are independent and identically distributed. The definition of the RPs is analogous
to the case where the intensity function (2.5) is of the form

\[ \lambda(t \mid H(t)) = h\big(t - T_{N(t^-)}\big), \qquad (2.16) \]

where t − T_{N(t−)} is called the backward recurrence time, that is, the elapsed time since the most recent event before time t, and h(·) is the hazard function of the gap times Wj as defined in (2.4).
The distribution of counts N (s, t) in a RP is often of interest. When the Wj
are exponentially distributed, the corresponding counting process {N (t); t ≥ 0} is
equivalent to a HPP, and thus N (s, t) follows a Poisson distribution with the mean
µ(s, t). It is however not easy to obtain the distribution of counts in other cases. The
following relation can be useful to obtain the distribution of N(t) in some cases:

\[ \mu(t) = E\{N(t)\} = \sum_{n=1}^{\infty} F_n(t), \qquad (2.18) \]

where Fn(t) denotes the c.d.f. of the nth arrival time Tn.
Fixed covariates z can be incorporated in a RP in several ways, including (i) the proportional hazards model, where the hazard function of Wj given z is

\[ h(w \mid z) = h_0(w) \exp(z' \beta), \quad w > 0, \qquad (2.19) \]

and (ii) the accelerated failure time model, where the hazard function of Wj given z is

\[ h(w \mid z) = h_0\big(w \exp(z' \beta)\big) \exp(z' \beta), \quad w > 0. \qquad (2.20) \]
In (2.19) and (2.20), the function h0 (w) is called the baseline hazard function of Wj .
If external time-varying covariates are of interest, they can be included in a RP with the intensity function

\[ \lambda(t \mid H(t)) = h_0\big(t - T_{N(t^-)}\big) \exp\{z'(t)\, \beta\}. \qquad (2.21) \]

This can be done in a similar way to the case where we incorporate time-varying covariates into the hazard function of Wj; that is,

\[ h(w \mid z(t)) = h_0(w) \exp\{z'(t)\, \beta\}, \qquad (2.22) \]

where t = w + T_{N(t−)}. Since we mainly focus on PPs, we do not discuss RPs in detail. More information on regression models for recurrent events and beyond can be found in Cook and Lawless (2007, Chapter 4).
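As a small illustration, the following Python sketch (our own code; the Weibull gap-time distribution is an assumed example, not a model used in the thesis) simulates a RP by cumulating i.i.d. gap times until the end of follow-up; shape = 1 reduces to exponential gap times, i.e., a HPP:

```python
import numpy as np

def simulate_rp(shape, scale, tau, rng=None):
    # Gap times W_j are i.i.d. Weibull(shape, scale); event times are their
    # cumulative sums, which is exactly the renewal property.
    rng = np.random.default_rng() if rng is None else rng
    times, t = [], 0.0
    while True:
        t += scale * rng.weibull(shape)   # draw the next gap time
        if t > tau:
            return np.array(times)
        times.append(t)

event_times = simulate_rp(shape=1.5, scale=2.0, tau=20.0)
```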
2.3 Propensity Score Matching

In this section, we first discuss treatment evaluation with some examples, and then introduce the propensity score (PS) methodology. Propensity scoring is used to properly analyze data obtained from observational studies. In such studies, researchers
do not conduct randomized controlled trials to make causal inference, instead some
pretreatment characteristics of individuals are used to find the propensity score.
In many fields of study, the primary goal is to evaluate the effectiveness of a program, which typically means comparing the effects of that program on the outcome of interest with the effects of another program or a placebo. Examples of treatment evaluation include the effect of a new medicine on epileptic seizures, the effect of training programs on job performance, and the effect of government programs targeted to help schools on student performance. Note that in these studies, unlike lab experiments, individuals decide whether or not to participate in the program. Since individuals who decide to participate differ in various characteristics from individuals who do not participate, it is statistically imprudent to directly compare the outcome of interest. Therefore, we need to balance the observed and unobserved outcome-related covariates between treatment and control groups and then compare their outcomes. Below are the assumptions and the procedure required for conducting the propensity score matching (PSM) technique initiated by Rosenbaum and Rubin (1983).
Let y0 and y1 be the potential outcomes for the control group and treatment group,
respectively. There exists a set x of observable covariates such that after controlling
for these covariates, the potential outcomes are independent of treatment assignment;
that is, in notation,

y0, y1 ⊥ Trt | x.

The propensity score is then defined as the conditional probability of receiving the treatment given the observed covariates; that is,

p(x) = Pr(Trt = 1 | x). (2.23)

A further assumption, known as the common support condition, requires that 0 < p(x) < 1. The assumption of common support ensures that there is sufficient overlap in the
characteristics of treated and untreated units to find adequate matches. When these
assumptions are satisfied, the treatment assignment is said to be strongly ignorable
in the terminology of Rosenbaum and Rubin (1983).
The procedure for estimating the effect of a treatment can be divided into three steps:

• Step 1: Estimate the propensity score for each unit, typically with a binary outcome model such as a logit or probit model.

• Step 2: Choose a matching algorithm that will use the estimated PSs to match untreated units with treated units.
• Step 3: Estimate the effect of the treatment with the matched sample and
calculate standard errors.
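To illustrate the three steps, the following Python sketch (our own illustration on synthetic data; variable names and data-generating values are hypothetical) estimates PSs with logistic regression, matches by nearest neighbor, and compares outcomes:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
m = 2_000
x = rng.normal(size=(m, 3))                               # baseline covariates
trt = rng.binomial(1, 1 / (1 + np.exp(-x @ np.array([0.8, 0.5, -0.4]))))
y = rng.poisson(np.exp(0.3 * x[:, 0] - 0.5 * trt))        # event-count outcome

# Step 1: estimate the propensity scores with a logit model.
ps = LogisticRegression().fit(x, trt).predict_proba(x)[:, 1]

# Step 2: nearest neighbor matching (with replacement) on the estimated PS.
treated, control = np.where(trt == 1)[0], np.where(trt == 0)[0]
nearest = np.abs(ps[control][None, :] - ps[treated][:, None]).argmin(axis=1)
matches = control[nearest]

# Step 3: estimate the treatment effect from the matched sample.
print("ATET estimate:", y[treated].mean() - y[matches].mean())
```

Standard errors in Step 3 additionally need to account for the matched nature of the sample, e.g., via bootstrap or matched-pair variance estimators.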
A binary outcome model is usually employed to estimate the PS given in (2.23) for each
subject under a study. Logit and probit models are the commonly used binary outcome
models in developing PS methods. These models are used to estimate the probability
of receiving a treatment conditional on the observed pretreatment measurements. It
is essential that a flexible functional form be used to allow for possible nonlinearities
in the participation model.
After defining a suitable binary outcome model and estimating the propensity
scores for each subject, we need to apply a matching algorithm to match subjects
in the treatment group with subjects in the control group so that we may be able
to calculate the treatment effect in an observational study. Note that here our goal is to find a match or matches for each subject in the treatment group, not for the subjects in the control group. Figure 2.1 gives a visual representation of how the PSM method works. In this figure, the y-axis shows the estimated propensity scores
of four individuals in the treated group and six individuals in the control group.

Figure 2.1: Predicted probabilities or propensity scores of subjects in the treated and control groups.
There are many matching methods available for different situations including kernel
matching, nearest neighbor, radius (or caliper) and stratification, which are briefly
explained below by using the example given in Figure 2.1.
In the kernel matching method, each subject from the treatment group is matched
with the weighted average of all the control subjects. In this matching method, we
need to weight each individual in the control group based on their PS, where the individual with the PS closest to that of the treated subject gets the highest weight, and so on. In other words, the weights are inversely proportional to the distance between the treatment and control group PSs. Figure 2.2 shows how the kernel matching method works. In this method, the weight for treated subject i and control subject j, denoted by w(i, j), is defined by

\[ w(i, j) = \frac{K\!\left(\dfrac{p_j - p_i}{h}\right)}{\sum_{j=1}^{n_C} K\!\left(\dfrac{p_j - p_i}{h}\right)}, \qquad (2.24) \]
where K(·) is a prespecified kernel function, which is in fact a weighting function used in non-parametric estimation techniques, h is a bandwidth parameter and nC denotes the number of individuals in the matched control group. A difficulty with the kernel matching method is selecting an appropriate bandwidth parameter, which directly affects the bias and variance (Imbens, 2004).
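As an illustration of (2.24), the following Python sketch (our own code, assuming a Gaussian kernel and made-up propensity scores) computes the weights that the controls receive for a single treated subject:

```python
import numpy as np

def kernel_weights(p_i, p_controls, h):
    # Gaussian kernel K((p_j - p_i) / h), normalized over the controls as in (2.24)
    k = np.exp(-0.5 * ((p_controls - p_i) / h) ** 2)
    return k / k.sum()

w = kernel_weights(0.42, np.array([0.30, 0.40, 0.45, 0.70]), h=0.1)
# Controls with PSs closest to 0.42 receive the largest weights.
```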
Another matching method is the nearest neighbor matching, in which we match a subject from the treatment group with the subject from the control group whose PS is closest in value to that of the treated subject. It should be noted that, although not common, PSM methods can be applied with replacement; that is, if we are using PSM with replacement, it is possible to use an untreated individual more than once as a match. The nearest neighbor matching method is easy to implement and understand. However, one of the major issues involved in this matching method is that it may result in some bad matches if the PSs of the matched subjects are far from each other. Let pi and pj be the PSs of two observations from the treatment and control groups, respectively; then

min ∥pi − pj∥ (2.25)

determines the match, where ∥·∥ denotes the absolute-value norm. Figure 2.3 illustrates the nearest neighbor matching method.
In the radius matching method, we specify a certain radius and choose all the control observations that fall within it. In this method, matches are
based on the inequality
∥pi − pj ∥ < r, (2.26)
where r is a pre-specified radius. Figure 2.4 illustrates the radius matching method.
As shown in this figure, all the control subjects that fall inside the circle can be used
as matches for the selected treated individual. The main advantage of using radius
matching is that it is possible to use all the observations in the control group, which
results in an increase in the estimation precision. In the case of having poor matches
when PSs are not close enough, we can use the radius (or caliper) matching method
as an alternative to nearest neighbor method (Rosenbaum and Rubin, 1985b).
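A radius match per (2.26) can be expressed in a few lines (our own sketch with illustrative values):

```python
import numpy as np

p_i, r = 0.42, 0.05                                    # treated subject's PS and radius
p_controls = np.array([0.30, 0.40, 0.45, 0.70])        # PSs of the control subjects
matches = np.where(np.abs(p_controls - p_i) < r)[0]    # controls inside the caliper
```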
Finally, for the stratification matching method, we divide the observations into blocks based on the estimated PSs; for observations that fall in a certain block, we use the individuals in the matching block, and the treatment effect is estimated as the average of within-stratum effects. Theoretical and empirical results indicate that the popular version of stratification via estimated propensity scores, based on within-stratum sample mean differences and a fixed number of strata, may lead to biased inference due to residual confounding, and this bias leads to more misleading results as the sample size increases; therefore, caution must be taken in stratifying on quintiles (Lunceford and Davidian, 2004).
After choosing an appropriate matching method and defining matches, we need to calculate the effect of the treatment. The common way to calculate the treatment effect is through the following formula:

ATE = E(y1 − y0), (2.27)

where ATE stands for the average treatment effect. The ATE is suitable for randomized experiments, where there are usually only small differences between observations in the treatment and control groups. In observational studies, we therefore need to calculate the average treatment effect on the treated (ATET), which is the difference between the outcomes of the treated observations and the outcomes the same observations would have had if they were not treated; that is, in notation,

ATET = E(y1 | Trt = 1) − E(y0 | Trt = 1). (2.28)

The second term in (2.28) cannot be calculated, as it is not possible to observe the outcome y0 for observations that receive the treatment (Trt = 1). In this situation, we can apply PSM, with which we can estimate the treatment effect by comparing the outcomes of the matched control subjects with the outcomes of the matched treated subjects:

ATET = E(Y1 | p(x), Trt = 1) − E(Y0 | p(x), Trt = 0). (2.29)

In a matched sample, this can be estimated by

ÂTET = (1/nTrt) Σi∈Trt { y1i − Σj w(i, j) y0j }, (2.30)

where the w(i, j) represent the weights and nTrt denotes the number of individuals in the matched treated group. Note that, if no weighting methods are used, then the matched control subjects receive equal weights.
An important property of the propensity score is that, conditional on p(x), the covariates are balanced between the treatment and control groups; that is,

Trt ⊥ x | p(x).
This property is known as the balancing condition, and it is testable (Senn, 1994). Balancing tests consider whether the estimated propensity score adequately balances characteristics between the treatment and control group units.
2.4 Simulation Procedures for Recurrent Event Processes

For a recurrent event process with intensity function λ(t | H(t)) observed over [τ0, τ], the probability density of the outcome "n events, at times t1 < t2 < · · · < tn" is

\[ \prod_{j=1}^{n} \lambda(t_j \mid H(t_j)) \exp\left\{ -\int_{\tau_0}^{\tau} \lambda(u \mid H(u))\, du \right\}. \qquad (2.31) \]
A derivation of the above result can be found in Cook and Lawless (2007, Section
2.1). The differences between successive event times Tj generated by the counting process {N(t); t ≥ 0} are the waiting times Wj = Tj − Tj−1, j = 1, 2, . . ., where T0 = 0. The survival function of Wj, the waiting time between the (j − 1)st and jth events, conditional on Tj−1 = tj−1 and H(tj−1), is given by (Cook and Lawless, 2007, Section 2.1)

\[ \Pr\{W_j > w \mid T_{j-1} = t_{j-1}, H(t_{j-1})\} = \exp\left\{ -\int_{t_{j-1}}^{t_{j-1}+w} \lambda(u \mid H(u))\, du \right\}. \qquad (2.32) \]
Using the result given in (2.32) and the fact that any continuous and strictly increasing c.d.f., evaluated at its own random variable, yields a uniformly distributed random variable, it can be shown that the random variables

\[ E_j = \int_{t_{j-1}}^{t_{j-1}+W_j} \lambda(u \mid H(u))\, du, \quad j = 1, 2, \ldots, \qquad (2.33) \]

are independent exponential random variables with mean 1.
Using the result in (2.35), we can simulate a HPP, which can be used for simulating
a NHPP as explained in Proposition 2.2.1. This simulation method is useful when
(2.33) cannot be easily solved.
The following steps elaborate the computer simulation procedure for generating event times from a given intensity function over the time interval [0, τ]:

1. Set j = 1 and t0 = 0.

2. Generate a random value from the exponential distribution with mean 1.

3. Replace Ej in (2.33) with the generated value obtained from the second step.

4. Solve the equation Ej = ∫_{tj−1}^{tj−1+Wj} λ(u | H(u)) du numerically in order to find the waiting time Wj.

5. Set Tj = tj−1 + Wj.

6. If Tj is less than the upper bound τ, then set j = j + 1, tj−1 = Tj−1 and return to the second step. Otherwise, break the loop; the calculated values t1, t2, . . ., tj−1 are the recurrent event times.
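A minimal Python sketch of this algorithm follows (our own illustration; the function names are hypothetical, and the intensity is assumed to be a deterministic function of t, as for a NHPP):

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def simulate_events(intensity, tau, rng=None, w_max=1e3):
    rng = np.random.default_rng() if rng is None else rng
    times, t_prev = [], 0.0
    while True:
        e = rng.exponential(1.0)                # Step 2: E_j ~ exponential(1)

        def g(w):                               # Steps 3-4: E_j minus the cumulative intensity
            return e - quad(intensity, t_prev, t_prev + w)[0]

        if g(w_max) > 0:                        # cumulative intensity never reaches E_j
            return np.array(times)
        w_j = brentq(g, 0.0, w_max)             # Step 4: solve for the waiting time W_j
        t_j = t_prev + w_j                      # Step 5: T_j = t_{j-1} + W_j
        if t_j > tau:                           # Step 6: stop at the end of follow-up
            return np.array(times)
        times.append(t_j)
        t_prev = t_j

# Example: NHPP with intensity rho(t) = 0.5 + 0.1 t over [0, 10].
events = simulate_events(lambda t: 0.5 + 0.1 * t, tau=10.0)
```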
If there are external covariates that are of interest, the intensity function λ(t | H(t))
can be extended with covariates in the above algorithm. For a more detailed explanation regarding simulation methods, refer to Cook and Lawless (2007, pp. 44-45 and Problem 2.2).
2.5 Construction of the Likelihood Function

Suppose that there are m independent counting processes under observation. The ith process, i = 1, . . ., m, is observed over the observation window [τi0, τi], where τi0 and τi are, respectively, the start and end of follow-up of the ith process. Let ti1 < ti2 < · · · < tini, i = 1, . . ., m, denote the ni event times experienced by the ith process. Then, the contribution of the ith process to the likelihood function L(θ) can be expressed as

\[ L_i(\theta) = \prod_{j=1}^{n_i} \lambda_i(t_{ij} \mid H_i(t_{ij})) \exp\left\{ -\int_{\tau_{i0}}^{\tau_i} \lambda_i(u \mid H_i(u))\, du \right\}, \qquad (2.36) \]
where θ is a parameter vector specifying the intensity function. The likelihood function for the m independent processes is the product of such terms; that is,

\[ L(\theta) = \prod_{i=1}^m L_i(\theta) = \prod_{i=1}^m \prod_{j=1}^{n_i} \lambda_i(t_{ij} \mid H_i(t_{ij})) \exp\left\{ -\int_{\tau_{i0}}^{\tau_i} \lambda_i(u \mid H_i(u))\, du \right\}. \qquad (2.37) \]
The derivation of the above likelihood function can be found in Cook and Lawless
(2007, Section 2.6). In the case of mixed Poisson processes with random effects, where the random effects ui follow a gamma distribution with mean 1 and variance ϕ, the likelihood function for the m independent processes is of the form

\[ L(\theta, \phi) = \prod_{i=1}^m \left\{ \prod_{j=1}^{n_i} \frac{\rho_i(t_{ij})}{\mu_i(\tau_i)} \right\} \frac{\Gamma(n_i + \phi^{-1})\, (\phi\, \mu_i(\tau_i))^{n_i}}{\Gamma(\phi^{-1})\, (1 + \phi\, \mu_i(\tau_i))^{n_i + \phi^{-1}}}, \qquad (2.38) \]

where µi(t) = ∫_{τi0}^t ρi(s) ds. This result is given in Cook and Lawless (2007, p. 36).
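To illustrate how (2.37) is used for estimation, the following Python sketch (our own illustration with simulated data, not code from the thesis) maximizes the log-likelihood of a HPP with multiplicative covariates, for which log Li(θ) = ni log ρi − ρi (τi − τi0):

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(theta, n_events, z, duration):
    # rho_i = rho0 * exp(z_i' beta); log L_i = n_i log(rho_i) - rho_i * duration_i
    log_rho0, beta = theta[0], theta[1:]
    log_rate = log_rho0 + z @ beta
    return -np.sum(n_events * log_rate - np.exp(log_rate) * duration)

rng = np.random.default_rng(3)
z = rng.normal(size=(500, 2))                       # time-fixed covariates z_i
duration = np.full(500, 10.0)                       # tau_i - tau_i0
n_events = rng.poisson(0.3 * np.exp(z @ np.array([0.5, -0.25])) * duration)

fit = minimize(neg_loglik, x0=np.zeros(3), args=(n_events, z, duration))
print(np.exp(fit.x[0]), fit.x[1:])                  # estimates of rho0 and beta
```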
In studies where the subjects are intermittently observed or temporarily cease to be at risk, it is useful to denote when an individual or process is under observation and at risk of an event. This can be done with the at-risk indicator Y(t). For example, if the ith subject is observed over the interval [τi0, τi] and is at risk of having an event over the observation window, the at-risk indicator is Yi(t) = I(τi0 ≤ t ≤ τi).
Sometimes it is more convenient to write down the likelihood function by using the at-risk indicator Y(t). Following the notation given by Cook and Lawless (2007, Section 2.6), the observed part of the counting process {N(t); t ≥ 0}, called the observable process, can be written as N̄(t) = ∫_{τ0}^t Y(u) dN(u), with the intensity function

\[ \bar\lambda(t \mid \bar H(t)) = \lim_{\Delta t \to 0} \frac{\Pr\{\Delta \bar N(t) = 1 \mid \bar H(t)\}}{\Delta t}, \qquad (2.39) \]

where H̄(t) = {N̄(s), Y(u); τ0 ≤ s < t, τ0 ≤ u ≤ t} is the history of the observable process. If ∆N(t) and Y(t) are conditionally independent given H(t), then λ̄(t | H̄(t)) = Y(t) λ(t | H(t)) (Cook and Lawless, 2007, Section 2.6), and the complete likelihood function for the m independent processes can be written as
\[ L(\theta) = \prod_{i=1}^m L_i(\theta) = \prod_{i=1}^m \prod_{j=1}^{n_i} \lambda_i(t_{ij} \mid H_i(t_{ij})) \exp\left\{ -\int_0^\infty Y_i(u)\, \lambda_i(u \mid H_i(u))\, du \right\}. \qquad (2.40) \]
The likelihood function (2.40) is valid not only for the case where an individual process is intermittently observed, but also when the start and end of follow-up times are random stopping times (Cook and Lawless, 2007, Section 2.6).
Chapter 3
Estimation of Time-Fixed
Treatment Effects
In this section, we introduce the models used in our Monte Carlo simulation studies to examine the bias arising from different PSM models and from the history matching (HM) and covariate matching (CM) methods. Our discussion includes a detailed explanation of the methods used for establishing causal connections based on the conditions of the occurrence of an effect. In Chapter 2, we reviewed some widely used models for analyzing and describing recurrent events, such as renewal processes (RPs) and Poisson processes (PPs). PPs can be divided into two general classes: (i) homogeneous Poisson processes (HPPs) and (ii) non-homogeneous Poisson processes (NHPPs). For the sake of simplicity in interpretation, we choose
simple processes under two settings. In the first setting, we use a HPP to generate
event times so that there is no overdispersion involved in the data generation, while in
the second setting our model construction is based on the presence of overdispersion.
The major goal of our Monte Carlo simulation study is to determine the impact
of different matching methods in the estimation of treatment effects. Therefore, as
mentioned above, we consider three matching methods listed below:
1. PSM: In this matching method, we use seven different models to obtain propensity scores and match the subjects to balance the observed covariates between treated and untreated subjects.

2. CM: This is the most basic matching method, in which we try to find subjects with similar values on outcome-related covariates. Unlike the PSM method, in the CM method we match on each of the pre-treatment measurements separately.

3. HM: This method is based on the rate of events observed in the past of individuals. In the HM method, we use the previous number of events experienced by each subject prior to the experimental treatment initiation to match treated and untreated subjects.
Each of the matching methods mentioned above has its own advantages and disadvantages. When there are only a few covariates on which subjects need to be matched, CM is one of the most powerful matching techniques, as it allows us to match the subjects on covariates directly so that we can find the best matched subjects. Methods based on PSM can be more practical than CM when there are many covariates involved in the matching process. In such cases, it might be technically hard to use CM to match the subjects on each of the covariates separately because of the high dimensionality of the covariates. As a result, we may end up with too many treated subjects being excluded from the study. In contrast, by applying the PSM method, it is possible to summarize all covariates in a single value (i.e., the estimated propensity score) and use it for matching.

On the other hand, the HM method is simple to implement, as it does not require the explanatory variables to be known. This makes a study much easier, since researchers no longer need to identify the key covariates used for matching subjects. The HM method can be powerful in cases where the history provides sufficient information to match the subjects on it. This condition may require subjects to experience a sufficient number of observed events in a fixed follow-up period before the experimental treatment assignment. If this is not possible, the history data can be extended with additional information on subjects, such as some explanatory variables measured at baseline. We briefly discuss this issue later in this section. Note that the history information used in the matching process may vary, in the sense that one may want to match the subjects on something other than the rate or number of events observed in their past. For example, it is also possible to match the subjects based on the gap times between successive events experienced by subjects prior to the treatment assignment.
We now introduce the setup of our simulation study. In order to represent a general case, we consider different types of explanatory variables. More specifically, we use ten binary variables, a continuous variable and a count variable. The association of these explanatory variables with the outcome or treatment selection can be strong, moderate or weak. We let x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12 represent the explanatory variables used in the simulations. Table 3.1 presents their association with the treatment selection and outcome.
The nine variables x1, x2, x4, x5, x7, x8, x10, x11, x12 in Table 3.1 are associated with the treatment selection, and the variables x1, x2, x3, x4, x5, x6, x11 are associated with the outcome. The variable x9 is associated with neither the treatment selection nor the outcome. In epidemiological terminology, the five variables x1, x2, x4, x5, x11 are sometimes referred to as true confounders, which means that they are associated with both the treatment selection and the outcome (Rothman et al., 2008). The other two covariates
Table 3.2: The levels of association between the explanatory variables and outcome / treatment selection.

                       Outcome         Treatment
Strong Association     x1, x3, x4      x1, x4, x7, x10
Moderate Association   x2, x5, x6      x2, x5, x8, x12
Weak Association       x11             x11
The outcome of interest could be the rate of event occurrences if the follow-up times of individuals vary. Let x denote the vector of selected covariates given in Table 3.1. For convenience, we consider {Ni(t); t ≥ 0} continuously observed over the interval [0, τ] for all i = 1, 2, . . ., m. Note that we take τi = τ for all i = 1, 2, . . ., m for the sake of simplicity in interpreting the results. However, the results can be extended to the case in which the τi values vary as well. When τi = τ for all individuals, we can equivalently focus on the expected number of events over the interval [s, τ]
for the random effects model, where ui follows a gamma distribution with mean 1 and variance ϕ. The lower limits s of the integrals given in (3.1) and (3.2) represent the time of the experimental treatment initiation, which is equal to 5 years in our study.
We represent the outcome of the matching methods in two forms: the theoretical estimate (T.E.) and the empirical estimate (E.E.) of the treatment effect. The theoretical estimate can be obtained by calculating

E{N1(τ) | x} / E{N0(τ) | x},

where N1(τ) and N0(τ) correspond to the treated and untreated matched subjects, respectively. The empirical estimate is the total number of post-treatment events for the matched treated subjects divided by the total number of post-treatment events for the matched untreated subjects.
We consider the following propensity score models, each differing in the choice of explanatory variables entering the model:

• PS 1: This model contains all variables associated with the treatment selection.

• PS 3: This model includes all the true confounding variables, that is, those associated with both the treatment selection and the outcome.

• PS 4: This model includes all the true confounding variables and the previous number of events experienced by each subject prior to the treatment selection.

• PS 5: In this model, we obtain propensity scores using the true confounders with an additional adjustment for a variable representing the history of the subjects.

• PS 7: All observed and unobserved variables associated with the outcome are included in the model.
• CM 1: In this case, we match the subjects on the variables that are associated
with the treatment selection.
• CM 2: All the variables associated with the treatment selection as well as pre-
vious number of events experienced by each subject prior to the treatment se-
lection are considered for matching subjects.
Note that in CM we do not match the subjects based on their propensity scores.
Instead, we directly match them on the binary covariates and the history. In other
words, we use exact matching for the binary covariates and for the number of events
that occurred before the treatment assignment. For the continuous and count
variables, we apply a caliper of width 0.2 of the standard deviation of the
corresponding covariate. Finally, for HM, we consider an untreated subject as a match
for a treated subject if it has the same pre-treatment number of events as the treated
subject.
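The following sketch illustrates the CM and HM matching rules just described, as greedy 1:1 matching (the data layout and function names are our own assumptions, not the thesis implementation):

```python
import numpy as np

def cm_match(treated, controls, binary_cols, cont_cols, caliper_sd=0.2):
    """Greedy 1:1 covariate matching: exact on the binary columns (and on
    the pre-treatment event count, when included), caliper of `caliper_sd`
    standard deviations on continuous/count columns. `treated` and
    `controls` are dicts of 1-D arrays keyed by column name (a
    hypothetical layout, not the thesis implementation)."""
    n_trt = len(next(iter(treated.values())))
    n_ctl = len(next(iter(controls.values())))
    used = np.zeros(n_ctl, dtype=bool)
    calipers = {c: caliper_sd * np.std(np.concatenate([treated[c], controls[c]]))
                for c in cont_cols}
    pairs = []
    for i in range(n_trt):
        eligible = ~used
        for c in binary_cols:                       # exact matching
            eligible &= controls[c] == treated[c][i]
        for c in cont_cols:                         # caliper matching
            eligible &= np.abs(controls[c] - treated[c][i]) <= calipers[c]
        idx = np.flatnonzero(eligible)
        if idx.size:                                # take the first eligible control
            pairs.append((i, int(idx[0])))
            used[idx[0]] = True
    return pairs

def hm_match(treated_counts, control_counts):
    """History matching: pair subjects with identical pre-treatment counts."""
    return cm_match({"n_pre": np.asarray(treated_counts)},
                    {"n_pre": np.asarray(control_counts)},
                    binary_cols=["n_pre"], cont_cols=[])
```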
Table 3.3: Coefficients used to obtain the propensity scores.

  β0,trt   β1       β2       β3       β4       β5       β6
  -3.5     log(5)   log(2)   log(5)   log(2)   log(5)   log(2)

Table 3.4: Parameters used in the event intensity models.

  ρ0    βtrt     α1      α2      α3      α4      α5      α6
  0.3   -1.099   0.389   0.148   0.389   0.389   0.148   0.148
The treatment status of each subject was drawn from a Bernoulli distribution with
success probability pi,trt, where the propensity score pi,trt can be obtained by using
the logistic regression model

logit(pi,trt) = β0,trt + β1 x1 + β2 x2 + β3 x4 + β4 x5 + β5 x7 + β6 x8 + log(0.1) x10 + log(1.03) x11 + log(0.45) x12.   (3.4)

The values of the parameters in the model (3.4) are given in Table 3.3.
• Step 3: We generated event times for m individual processes over 10 years of
follow-up using the Poisson model

λi(t | x) = ρ0 exp{βtrt Trti + α1 x1 + α2 x2 + α3 x3 + α4 x4 + α5 x5 + α6 x6 + log(1.05) x11},   (3.5)

where ρ0 indicates the baseline rate function and Trti is a binary variable defining
whether the ith subject receives the treatment or not. The procedures used to
generate events in this setup are given in Section 2.4. In our case, none of the subjects
received the experimental treatment during their first 5 years of the follow-up period.
The values of the parameters in the model (3.5) are given in Table 3.4.
• Step 4: Total number of events experienced by each subject during the first five
years (i.e., during the pre-treatment period) is recorded. This information is
used in the HM method.
• Step 5: We matched the subjects in the treatment group with subjects in the
control group using the proposed methods and models as previously discussed
in this section.
• Step 6: For the matched sample obtained from the previous step, we calculate
the mean of the empirical and theoretical estimates of the treatment effect
resulting from all matched subjects.
• Step 7: We repeat Steps 1 to 6 B(= 1000) times, and the Monte Carlo estimate
of the treatment effect is obtained by averaging over the 1000 estimates resulting
from the simulated data sets.
We next give the steps used to generate data in the presence of overdispersion.
Some of the steps below are the same as the ones given above, but for the sake of
completeness we report them again.
• Steps 1∗ and 2∗: Covariates were generated and the treatment status was
assigned as in Steps 1 and 2, with the propensity score for each subject obtained
from the logistic regression model

logit(pi,trt) = β0,trt + β1 x1 + β2 x2 + β3 x4 + β4 x5 + β5 x7 + β6 x8 + log(0.1) x10 + log(1.03) x11 + log(0.45) x12.   (3.7)
• Step 3∗: We then generated event times for each subject using the random effect
Poisson model

λi(t | x, ui) = ui ρ0 exp{βtrt Trti + α1 x1 + α2 x2 + α3 x3 + α4 x4 + α5 x5 + α6 x6 + log(1.05) x11},   (3.8)

where ui follows a gamma distribution with mean 1 and variance ϕ(= 0.3 and
0.6). Note that Trti (i = 1, ..., m) equals zero during the first 5 years of each
subject's follow-up period. The parameters used in formulas (3.7) and (3.8) are
the same as those used in formulas (3.4) and (3.5).
• Step 4∗: We recorded the total number of events that each subject experienced
prior to the time of the experimental treatment initiation, and then used that for
HM and for improving the performance of the other matching methods.
• Step 5∗: We matched the subjects in the treatment group with subjects in the
control group as in Step 5.
• Step 6∗: Using the matched sample obtained from the previous step, we calculate
the mean of the theoretical and empirical estimates of the treatment effect.
• Step 7∗: We repeat Steps 1∗ to 6∗ B(= 1000) times, each of size m, and finally
the Monte Carlo estimate of the treatment effect is obtained by averaging over
the 1000 estimates resulting from the simulated data sets.
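The following sketch compiles one pass of these steps for a single subject under the stated parameter values (the function and variable names are ours; the thesis code is not reproduced). It uses the fact that, given the covariates and frailty, the rate is constant in time, so the event count over an interval is Poisson with mean rate × length:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_subject(lp_trt, lp_out, beta_trt=-1.099, rho0=0.3,
                     phi=0.3, tau=10.0, t_trt=5.0):
    """One subject under models (3.7)-(3.8); a sketch, not the thesis code.

    lp_trt : covariate part of the linear predictor in (3.7), i.e. the
        terms added to the intercept -3.5.
    lp_out : covariate part of the exponent in (3.8).
    """
    # Treatment assignment from the logistic model (3.7).
    p_trt = 1.0 / (1.0 + np.exp(-(-3.5 + lp_trt)))
    trt = rng.random() < p_trt
    # Gamma frailty with mean 1 and variance phi: shape 1/phi, scale phi.
    u = rng.gamma(shape=1.0 / phi, scale=phi)
    # Counts before/after the treatment time t_trt; Trt_i is zero on [0, 5).
    rate_pre = u * rho0 * np.exp(lp_out)
    rate_post = rate_pre * (np.exp(beta_trt) if trt else 1.0)
    n_pre = rng.poisson(rate_pre * t_trt)            # history used by HM
    n_post = rng.poisson(rate_post * (tau - t_trt))  # outcome window [s, tau]
    return trt, n_pre, n_post
```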
The estimates obtained from Step 7 and Step 7∗ are compared to exp{βtrt} =
0.33, where βtrt is the true treatment effect. The results of the Monte Carlo
simulations are reported in Tables 3.5 – 3.8. Table 3.6 and Table 3.8 represent the
results of the matching methods when the covariates x3 and x6 are strongly associated
with the outcome. In this case, we set α3 = 0.9 and α6 = 0.55 to see how well the
proposed matching methods work. We next summarize the results of the simulation
studies.
Our primary goal in using PSM, HM and CM is to balance all outcome-related
covariates involved in the process to obtain an accurate treatment effect measure. It
is important to note that in a real-life situation there may exist some unobserved
outcome-related covariates not measured due to a lack of sufficient understanding of
the process. Unlike randomization, PSM methods do not guarantee the balance of
unmeasured covariates (Rubin and Thomas, 2000). As a result, a bias in the estimate
of the treatment effect may occur. Tables 3.5 – 3.8 indicate the accuracy of the
suggested matching methods in estimating a treatment effect under various settings.
We summarize our findings as follows. First, we found that CM and HM resulted
in the least biased estimators of the treatment effect, while matching on propensity
scores resulted in a more pronounced degree of bias. For example, in Table 3.5 the
empirical estimate of PS 4 in the absence of overdispersion is 18 per cent more biased
than the result of CM 4. In particular, the propensity score model PS 1, which
includes all covariates associated with the treatment assignment, produced the most
biased results. In contrast, including only the confounders x1, x2, x4, x5 and x11 in
the propensity score model PS 3 resulted in greater precision in the estimation of the
treatment effect. This result supports the fact that the goal of propensity score
methods is to efficiently balance the outcome-related covariates between treated and
untreated subjects, not to predict the probability of receiving the treatment
(Brookhart et al., 2006). The results of our simulation study reveal that if variables
unrelated to the outcome but related to the exposure are added to the propensity
score model, the bias might be more pronounced as a result of poorly balanced
matches or a decreased number of matched subjects. This statement is supported by
the estimates resulting from PS 1 and PS 6. For example, in Table 3.5, the theoretical
estimate resulting from the model PS 1 is equal to 0.4631, which is more biased than
0.3824, the theoretical estimate resulting from PS 3.

Table 3.5: Theoretical estimates (T.E.'s) and empirical estimates (E.E.'s) of the treatment effect resulted from the matching methods (m = 1000).
Another finding of our simulation study is that matching on the history of the
subjects, or simply adding the history to the matching model, significantly reduces
the bias in the estimates. For example, in Table 3.5 the estimates resulting from the
model PS 5 and from HM support this idea, as they are close to the true value 0.33.
This is because the history of a subject is a direct result of the observed and
unobserved covariates. Therefore, using it to match the subjects not only increases
the precision but also accounts for unobserved covariates in some cases. Moreover, it
is worth mentioning that, based on our findings in Table 3.5 and Table 3.6, the HM
method is quite robust to changes that complicate the estimation of the treatment
effect. In particular, it can be seen in Table 3.5 that, when we incorporate
overdispersion in the model, the results of HM do not deviate noticeably from the
target value 0.33. Similarly, in Table 3.6, when we increase the effects of the
unobserved covariates x3 and x6, the resulting estimates are more precise than those
of the models that do not include the history in the matching process.
Another interesting result is that propensity score matching using the models PS 2
and PS 4 did not increase the precision in the estimation of the treatment effect,
even though we included the history of the process in the model. The reason is that
propensity score matching does not perform exact matching on the history of the
process; it only balances it on average, so we may end up with matches that have
different histories.
Tables 3.5 and 3.6 show that propensity score matching using the model PS 7
does not improve the results as expected. This model includes all the observed and
unobserved outcome-related covariates. Although the covariates are balanced on
average after matching, the results are still biased. We conducted a Monte Carlo
simulation study to identify the root cause of this result, and found that balancing
some key covariates only on average is inadequate when they have a profound impact
on the event rate. In our study, the covariate x11 can make a noticeable difference if
matched subjects differ on this covariate. We recommend that researchers
thoughtfully identify the pivotal covariates and make the necessary adjustments
before conducting PSM.
Tables 3.7 and 3.8 represent the results of the simulation studies when the
population size is reduced to 500. In this case, we observed that reducing the
population size increases the bias in the estimate of the treatment effect arising from
the use of PSM compared with the bias in the estimates when the population size is
1000. For example, in Table 3.8, the empirical estimate of the model PS 1 in the
absence of overdispersion is equal to 0.6415, which is 3.9 per cent more biased than
the corresponding result given in Table 3.6. Furthermore, we observe that reducing
the sample size does not greatly affect the results obtained by CM and HM. This
result is expected because, in our settings, CM and HM produce good matched
samples in which matched units share similar covariates. As a result, even if the size
of the matched samples decreases, the estimates are not affected.
Many recommendations have been made for researchers who use propensity score
methods to make causal inference in observational studies. One of the key points in
using propensity score methods is to include all important true and potential
confounding variables in a propensity score model. Any failure to do so may result in
the excluded variables being imbalanced between treated and untreated subjects,
which eventually may lead to a biased estimate of the treatment effect (Austin et al.,
2007). In real-life situations it is common to have unobserved or unobservable
covariates. In such cases, efficient and capable approaches to address this issue are
needed. In the case where the subjects under study provide a relatively good history,
we suggest that this information be included in the model, as it significantly reduces
the bias in the estimation of the treatment effects.
Another interesting conclusion that can be drawn from the simulation study is that,
if the data are overdispersed, regular propensity score matching will result in biased
estimates. Overdispersion has been a challenging problem for researchers, since it is
hard to pinpoint the real reason why data are overdispersed. Our findings show that
the greater the overdispersion, the worse the estimates. Based on the results given in
Tables 3.5 – 3.8, it can be concluded that this issue can be addressed by using HM or
PSM methods with some adjustment for the history of the potential matched
subjects, such as the propensity score model PS 5.
Chapter 4
Estimation of Time-Varying
Treatment Effects
In this chapter, we introduce the models and methods used in the simulation study.
Our primary goal is to examine the effectiveness of three different matching methods
in various settings. The compared matching methods are propensity score matching
(PSM), covariate matching (CM) and history matching (HM). We consider a
data-generating process in the presence and absence of overdispersion. In the absence
of overdispersion, we generate event times for individuals from a homogeneous Poisson
process (HPP). We use a mixed Poisson model to generate event times for individuals
when overdispersion is present.
We now introduce the setup of our simulation study. Similar to the setup given in
the previous chapter, we consider twelve explanatory variables, among which ten are
binary, one is continuous and one is a count variable. We let x1, x2, x3, x4, x5, x6,
x7, x8, x9, x10, x11, x12 denote the values of the explanatory variables. The variables
x1, x2, x3, x4, x5, x6, x7, x8, x9, x12 are binary, and the variables x10 and x11 are
continuous and count, respectively. We assign different levels of association with the
outcome and treatment to these variables so that we can assess the effectiveness of
the different matching methods in a more general case. A strong association between
a variable and either the outcome or the treatment selection means that the presence
of that variable has a profound impact on the rate of the event occurrences or on the
likelihood of selecting the treatment. We consider the strength of association between
a variable and either the outcome or the treatment selection as moderate if the
presence of that variable is not as impactful as a covariate with a strong association,
but helps to predict the probability of treatment selection or increases the event rate
to a reasonable degree. Finally, the association between a variable and either the
outcome or the treatment selection is defined as weak if the presence of that variable
barely helps in predicting the dependent variable or has a slight effect on the event
rate that is ignorable in many situations. The presence or absence of association of
the explanatory variables with the outcome or treatment selection is presented in
Table 4.1.
As shown in Table 4.1, the variables x1, x2, x4, x5, x7, x8, x10, x11, x12 are
associated with the treatment selection, and the variables x1, x2, x3, x4, x5, x6, x11
are associated with the outcome. The variables x1, x2, x4, x5, x11 are called true
confounders, as they are associated with both the treatment selection and the
outcome. In the current study, the two variables x3 and x6 are considered potential
confounders. We assume that these two variables are unobserved, but we include
them in some of the PSM models to indicate the degree of bias resulting from
excluding them in other models.
The levels of association of the explanatory variables are described in Table 4.2.
The covariates x1, x4, x7 and x10 are strongly associated with the treatment
selection, the covariates x2, x5, x8 and x12 are moderately associated with the
treatment selection, and the covariate x11 is weakly associated with the treatment
selection. We also consider different levels of association between the outcome and
the covariates. This allows us to understand the strength of the different matching
methods in balancing different types of covariates with different levels of association.
In our simulation study, the association between the covariates x1, x3, x4 and the
outcome variable is strong, the covariates x2, x5 and x6 are moderately associated
with the outcome variable, and the covariate x11 is weakly associated with the
outcome variable. We next briefly explain the first scenario of our simulation study.
Scenario 1: In the first scenario, subjects can change their treatment during their
follow-up times. We assume that the standard treatment is available at the beginning
of the follow-up time. Individuals either choose to receive the standard treatment or
not. Selection of the new treatment depends on whether the individual has received
the standard treatment and on the availability of the new treatment before the end
of the follow-up time. We assume a Bernoulli distribution with success parameter
0.3 to indicate whether the new treatment is available for an individual, and then
generate the time of the new treatment initiation using a Weibull distribution with
shape parameter 2.3 and scale parameter 5.5. The parameters of the Weibull
distribution are chosen so that we have a reasonable follow-up time before and after
the new treatment initiation. We generate the event times for individuals who receive
both treatments (treated group) as well as for those who do not receive any treatment
(control group), and then use the aforementioned matching methods to estimate and
compare the effects of the standard and new treatments.
Figure 4.1 shows the event histories of two matched individuals. In the top line,
the individual receives the standard treatment at time T S (the time of the standard
treatment initiation), and then switches to the new treatment at time T N (the time
of the new treatment initiation). In the bottom line, the individual does not receive
any treatment.
In order to evaluate the effectiveness of the matching methods in estimating the
efficacy of the new treatment over the standard treatment, we present the outcome
in two different forms, namely theoretical and empirical estimates of the standard
treatment effect compared to the new treatment effect. Let E(N1(tN)) and E(N0(tN))
denote the expected numbers of events for a matched pair of treated and control
individuals over the time interval [0, T N), respectively. Moreover, let E(N1(tN, τ))
and E(N0(tN, τ)) denote the expected numbers of events for the same matched treated
and control individuals over the time interval [T N, τ], respectively. Then, the
theoretical estimate (T.E.) of the matched sample can be obtained by calculating

[E(N1(tN)) / E(N0(tN))] / [E(N1(tN, τ)) / E(N0(tN, τ))]   (4.1)

for all the matched individuals in the matched sample and taking their average.
The empirical estimate (E.E.) can be obtained by following a similar idea. Let
EM1(tN) and EM0(tN) denote the observed numbers of events for a matched pair of
treated and control individuals over the time interval [0, T N), respectively.
Furthermore, let EM1(tN, τ) and EM0(tN, τ) denote the observed numbers of events
for the same matched treated and control individuals over the time interval [T N, τ],
respectively. Then, we can estimate the effect of the standard treatment by
calculating EM1(tN)/EM0(tN) for all matched individuals in a matched sample and
taking their average. The effect of the new treatment can be estimated by calculating
EM1(tN, τ)/EM0(tN, τ) for all the matched individuals and taking their average.
Finally, the empirical estimate of the standard treatment effect compared to the new
treatment effect for the matched sample can be obtained by dividing the resulting
estimate of the standard treatment by the resulting estimate of the new treatment.
We next explain the second scenario of our simulation study, in which there are two
different treatments available for subjects to receive.
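A minimal sketch of this empirical-estimate calculation for one matched sample follows (the array names are our assumptions; pairs with zero counts would need special handling in practice):

```python
import numpy as np

def scenario1_empirical_estimate(em1_pre, em0_pre, em1_post, em0_post):
    """E.E. of the standard treatment effect relative to the new one.

    em1_pre[k], em0_pre[k]  : events of pair k's treated / control member
                              on [0, T^N);
    em1_post[k], em0_post[k]: events of the same pair on [T^N, tau].
    """
    em1_pre = np.asarray(em1_pre, dtype=float)
    em0_pre = np.asarray(em0_pre, dtype=float)
    em1_post = np.asarray(em1_post, dtype=float)
    em0_post = np.asarray(em0_post, dtype=float)
    standard = np.mean(em1_pre / em0_pre)    # averaged pairwise ratios
    new = np.mean(em1_post / em0_post)
    return standard / new                    # target: exp(beta_S - beta_N)
```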
Scenario 2: In this scenario, we assume that there exist two treatments, Treatment A
and Treatment B. We assign individuals to either a treatment or a control group
using a binary distribution. The treatment group consists of individuals who receive
Treatment A, and the control group consists of individuals who receive Treatment B.
The time to receive Treatment A or B is generated from a Weibull distribution with
shape parameter 2.1 and scale parameter 3.2. We generate the event times for those
who receive Treatment A as well as for those who receive Treatment B. Then, we use
the PSM, CM and HM methods to estimate and compare the effects of Treatments A
and B.
Let ttrt.A and ttrt.B denote the times of receiving Treatment A and Treatment B,
respectively. We use E(NA(ttrt.A, τ)) and E(NB(ttrt.B, τ)) to denote the expected
numbers of events for a matched pair of treated and control individuals over the time
intervals [ttrt.A, τ) and [ttrt.B, τ), respectively. Then, the theoretical estimate of the
effect of Treatment A compared to the effect of Treatment B can be written as

E(NA(ttrt.A, τ)) / E(NB(ttrt.B, τ)).   (4.2)

In order to calculate the empirical estimate, the expected numbers of events in
formula (4.2) are replaced by the corresponding observed numbers of events.
We propose the following PSM and CM models, each differing in the choice of
variables entering the model:
• PSM 1: This model contains all variables associated with the treatment selection.
• PSM 2: This model contains all variables associated with the treatment selection
as well as the rate of events experienced by each subject prior to the treatment
selection.
• PSM 3: This model includes all the true confounding variables that are associated
with both the treatment selection and outcome.
• PSM 4: This model includes all the true confounding variables and the rate of
events experienced by each subject prior to the treatment selection.
• PSM 5: In this model, we obtain propensity scores using the true confounders
with an additional adjustment for a variable representing the history of the
subjects.
• PSM 6: All twelve variables are included in the propensity score model.
• PSM 7: All observed and unobserved variables associated with the outcome are
included in the model.
• CM 1: In this case, we match the subjects on the variables that are associated
with the treatment selection.
• CM 2: All the variables associated with the treatment selection, as well as the
rate of events experienced by each subject prior to the treatment selection, are
considered for matching subjects.
• CM 4: We match the subjects on the true confounders and the rate of events
experienced by each of them prior to the treatment selection.
We now present the steps used for the data-generating process in the absence of
overdispersion for the first scenario.
Table 4.3: Coefficients used to obtain the propensity scores.

  β0,treatment   β1       β2       β3       β4       β5       β6
  -3.5           log(5)   log(2)   log(5)   log(2)   log(5)   log(2)

Table 4.4: Parameters used in the event intensity models.

  ρ0    βS       βN      βA       βB      α1      α2      α3      α4      α5      α6
  0.3   -1.099   -1.15   -1.099   -1.25   0.389   0.148   0.389   0.389   0.148   0.148
• Step 1: We considered m(= 500 and 1000) independent subjects. For each of
them, the binary covariates x1 – x9 and x12 were generated from independent
Bernoulli distributions with parameters 0.5 and 0.92, respectively. The remaining
two covariates, x10 and x11, which represent the continuous and count variables,
were generated from the standard Normal distribution and the negative binomial
distribution NB(r = 60, p = 0.56), respectively.
• Step 2: We then assigned the standard treatment to subjects by drawing the
treatment indicator from a Bernoulli distribution with success probability
pi,treatment (model (4.3)), where the propensity score for each of the subjects is
obtained by the logistic regression model

logit(pi,treatment) = β0,treatment + β1 x1 + β2 x2 + β3 x4 + β4 x5 + β5 x7 + β6 x8 + log(0.1) x10 + log(1.03) x11 + log(0.45) x12.   (4.4)

The values of the parameters in the model (4.4) are given in Table 4.3.
• Step 3: We assigned the new treatment to those subjects who received the
standard treatment by drawing a binary indicator from a Bernoulli distribution
with success probability 0.3 (model (4.5)).
• Step 4: The time of the new treatment initiation was generated from a Weibull
distribution with shape parameter 2.3 and scale parameter 5.5.
• Step 5: The event times over the time intervals [0, TiN) and [TiN, τ] for those
individuals who received both the standard treatment and the new treatment
were generated by using the following intensity functions:

λi(t | x) = ρ0 exp{βS Trti,S + α1 x1 + α2 x2 + α3 x3 + α4 x4 + α5 x5 + α6 x6 + log(1.05) x11},   (4.6)

and

λi(t | x) = ρ0 exp{βN Trti,N + α1 x1 + α2 x2 + α3 x3 + α4 x4 + α5 x5 + α6 x6 + log(1.05) x11},   (4.7)

respectively.
• Step 6: The event times over the time interval [0, τ] for those individuals who did
not receive any treatment were generated using the following intensity function:

λi(t | x) = ρ0 exp{α1 x1 + α2 x2 + α3 x3 + α4 x4 + α5 x5 + α6 x6 + log(1.05) x11},   (4.8)

where ρ0 in formulas (4.6), (4.7) and (4.8) indicates the baseline rate function.
We set τ = 10 for all the individuals under study. The procedures used to
generate events in this setup are given in Section 2.4. The values of the
parameters in the models (4.6), (4.7) and (4.8) are given in Table 4.4.
• Step 7: We matched the subjects in the treatment group with subjects in the
control group using some of the PSM and CM models.
• Step 8: For the matched sample obtained from the previous step, we calculate
the theoretical estimate (4.1) and the empirical estimate.
• Step 9: We repeat Steps 1 to 8 B(= 1000) times. Finally, the Monte Carlo
estimate of the compared treatment effects is obtained by averaging over the
1000 means resulting from the simulated data sets.
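For the event-time generation in Steps 5 and 6, the following sketch exploits the standard fact that, conditional on the covariates, each segment is a homogeneous Poisson process: draw the segment's count, then place the events uniformly (the function names and data layout are ours):

```python
import numpy as np

rng = np.random.default_rng(2)

def hpp_times(rate, t_start, t_end):
    """Event times of a homogeneous Poisson process on [t_start, t_end):
    Poisson count, then order statistics of uniform draws."""
    n = rng.poisson(rate * (t_end - t_start))
    return np.sort(rng.uniform(t_start, t_end, size=n))

def treated_subject_events(lp_out, t_n, tau=10.0, rho0=0.3,
                           beta_s=-1.099, beta_n=-1.15):
    """Scenario 1, treated subject: rate (4.6) on [0, T^N), rate (4.7)
    on [T^N, tau]. `lp_out` is the covariate part of the exponent."""
    base = rho0 * np.exp(lp_out)
    return np.concatenate([
        hpp_times(base * np.exp(beta_s), 0.0, t_n),   # model (4.6)
        hpp_times(base * np.exp(beta_n), t_n, tau),   # model (4.7)
    ])

def control_subject_events(lp_out, tau=10.0, rho0=0.3):
    """Scenario 1, control subject: rate (4.8) on [0, tau]."""
    return hpp_times(rho0 * np.exp(lp_out), 0.0, tau)
```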
The steps used to generate data in the presence of overdispersion in the first
scenario are given below.
• Step 1: We considered m(= 500 and 1000) independent subjects. The binary
covariates x1 – x9 and x12 were generated for each of the subjects from
independent Bernoulli distributions with parameters 0.5 and 0.92, respectively.
The remaining two covariates, x10 and x11, which represent the continuous and
count variables, were generated from the standard Normal distribution and the
negative binomial distribution NB(r = 60, p = 0.56), respectively.
• Step 2: We then assigned the standard treatment to subjects by drawing the
treatment indicator from a Bernoulli distribution with success probability
pi,treatment (model (4.9)), where the propensity score for each of the subjects
can be obtained by

logit(pi,treatment) = β0,treatment + β1 x1 + β2 x2 + β3 x4 + β4 x5 + β5 x7 + β6 x8 + log(0.1) x10 + log(1.03) x11 + log(0.45) x12.   (4.10)
• Step 3: The new treatment was then assigned to those subjects who received the
standard treatment by drawing a binary indicator from a Bernoulli distribution
with success probability 0.3 (model (4.11)).
• Step 4: The time of the new treatment initiation was generated from a Weibull
distribution with shape parameter 2.3 and scale parameter 5.5.
• Step 5: The event times over the time intervals [0, TiN) and [TiN, τ] for those
individuals who received both the standard and new treatments were generated
by using the following intensity functions:

λi(t | x, ui) = ui ρ0 exp{βS Trti,S + α1 x1 + α2 x2 + α3 x3 + α4 x4 + α5 x5 + α6 x6 + log(1.05) x11},   (4.12)

and

λi(t | x, ui) = ui ρ0 exp{βN Trti,N + α1 x1 + α2 x2 + α3 x3 + α4 x4 + α5 x5 + α6 x6 + log(1.05) x11},   (4.13)

respectively.
• Step 6: The event times over the time interval [0, τ] for those individuals who did
not receive any treatment were generated using the following intensity function:

λi(t | x, ui) = ui ρ0 exp{α1 x1 + α2 x2 + α3 x3 + α4 x4 + α5 x5 + α6 x6 + log(1.05) x11},   (4.14)

where ui in (4.12), (4.13) and (4.14) follows a gamma distribution with mean 1
and variance ϕ(= 0.3 and 0.6). Note that τ = 10 for all the subjects under
study.
• Step 7: We matched the subjects in the treatment group with subjects in the
control group using some of the PSM and CM models.
• Step 8: For the matched sample obtained from the previous step, we calculate
the theoretical estimate (4.1) and the empirical estimate.
• Step 9: We repeat Steps 1 to 8 B(= 1000) times. Finally, the Monte Carlo
estimate of the compared treatment effects is obtained by averaging over the
1000 means resulting from the simulated data sets.
It should be noted that, in the first scenario of our simulation study, we cannot
use the HM method, or any other model that includes pre-treatment event rates,
due to the lack of an available pre-treatment history of individuals. We now give the
steps required for the data-generating process in the absence of overdispersion for the
second scenario.
• Step 1∗: As in Step 1, we consider m(= 500 and 1000) independent subjects. We
generate the ten binary covariates x1 – x9 and x12 for each of the m subjects.
The nine covariates x1 – x9 are drawn from independent Bernoulli distributions,
each with parameter 0.5. The other covariate, x12, was drawn from a Bernoulli
distribution with success probability 0.92. The continuous and count covariates,
x10 and x11, were generated from the standard Normal distribution and the
negative binomial distribution NB(r = 60, p = 0.56), respectively.
• Step 2∗: A treatment status was generated for each subject from a binary model,
where the propensity score is obtained by the logistic regression model

logit(pi,treatment) = β0,treatment + β1 x1 + β2 x2 + β3 x4 + β4 x5 + β5 x7 + β6 x8 + log(0.1) x10 + log(1.03) x11 + log(0.45) x12.   (4.16)

Note that, here, if Trti = 1 then the subject receives Treatment A; otherwise,
Treatment B. The values of the parameters in the model (4.16) are given in
Table 4.3.
• Step 4∗: The time of the treatment assignment was generated from a Weibull
distribution with shape parameter 2.1 and scale parameter 3.2.
• Step 5∗: We generated pre-treatment event times for all subjects using the
following intensity function:

λi(t | x) = ρ0 exp{α1 x1 + α2 x2 + α3 x3 + α4 x4 + α5 x5 + α6 x6 + log(1.05) x11}.   (4.17)

• Step 6∗: Post-treatment event times were generated by adding the treatment
terms βA Trti,A and βB Trti,B to the exponent of (4.17) (model (4.18)). Note
that at most one of Trti,A and Trti,B can be equal to one. The values of the
parameters in formulas (4.17) and (4.18) are given in Table 4.4.
• Step 7∗: We recorded the rate of the events that each subject experienced prior
to the time of the treatment initiation, and then used that for HM, CM and
PSM.
• Step 8∗: We next used the proposed matching methods and models to match
the subjects so that we can estimate the treatment effects.
• Step 9∗: Using the matched sample obtained from the previous step, we calculate
the theoretical estimate (4.2) and the empirical estimate for each of the matched
individuals in the matched sample, and then calculate the mean of the resulting
estimates.
• Step 10∗: We repeat Steps 1∗ to 9∗ B(= 1000) times, each of size m, and finally
the Monte Carlo estimate of the compared treatment effects is obtained by
averaging over the 1000 means resulting from the simulated data sets.
In the case of overdispersed data, the steps of the data-generating process are
given below.
• Step 1∗: We consider m(= 500 and 1000) independent subjects and generate the
ten binary covariates x1 – x9 and x12 for each of the m subjects. The nine
covariates x1 – x9 are drawn from independent Bernoulli distributions, each with
parameter 0.5. The other covariate, x12, was drawn from a Bernoulli distribution
with success probability 0.92. The continuous and count covariates, x10 and x11,
were generated from the standard Normal distribution and the negative binomial
distribution NB(r = 60, p = 0.56), respectively.
• Step 2∗: A treatment status was generated for each of the m subjects by using a
binary model, where the propensity score for each subject is obtained by the
logistic regression model

logit(pi,treatment) = β0,treatment + β1 x1 + β2 x2 + β3 x4 + β4 x5 + β5 x7 + β6 x8 + log(0.1) x10 + log(1.03) x11 + log(0.45) x12.   (4.20)

Note that, here, if Trti = 1 then the subject receives Treatment A; otherwise,
Treatment B.
• Step 4∗ : The time of treatment assignment was generated from a Weibull dis-
tribution with the shape parameter 2.1 and the scale parameter 3.2.
• Step 5∗: We generated pre-treatment event times for all subjects using the
following intensity function:

λi(t | x, ui) = ui ρ0 exp{α1 x1 + α2 x2 + α3 x3 + α4 x4 + α5 x5 + α6 x6 + log(1.05) x11}.   (4.21)

• Step 6∗: Post-treatment event times were generated using the intensity function

λi(t | x, ui) = ui ρ0 exp{βA Trti,A + βB Trti,B + α1 x1 + α2 x2 + α3 x3 + α4 x4 + α5 x5 + α6 x6 + log(1.05) x11}.   (4.22)

Note that at most one of Trti,A and Trti,B can be equal to one, and ui follows
a gamma distribution with mean 1 and variance ϕ(= 0.3 and 0.6).
• Step 7∗: We recorded the rate of the events that each subject experienced prior
to the time of the treatment initiation, and then used that for HM, CM and
PSM.
• Step 8∗: We next used the proposed matching methods and models to match
the subjects so that we can estimate the treatment effects.
• Step 9∗: Using the matched sample obtained from the previous step, we calculate
the theoretical estimate (4.2) and the empirical estimate for each of the matched
individuals in the matched sample, and then calculate the mean of the resulting
estimates.
• Step 10∗: We repeat Steps 1∗ to 9∗ B(= 1000) times, each of size m, and finally
the Monte Carlo estimate of the compared treatment effects is obtained by
averaging over the 1000 means resulting from the simulated data sets.
The estimates obtained from Step 9 and Step 10∗ are compared to exp{βS − βN} =
1.053 and exp{βA − βB} = 1.16, respectively, where βS, βN, βA and βB are the true
treatment effects. The results of the Monte Carlo simulations are reported in
Tables 4.5 – 4.12. Tables 4.6, 4.8, 4.10 and 4.12 represent the results of the matching
methods when the covariates x3 and x6 are strongly associated with the outcome. In
this case, we set α3 = 0.9 and α6 = 0.55 to see how well the proposed matching
methods work. Note that we did not report the T.E. for the first scenario, since
formula (4.1) yields the true value in all the settings. We next summarize the results
of the simulation studies.
In Chapter 3, our main goal was to assess the strength of the PSM, CM and HM
methods in eliminating the bias in the estimation of a time-fixed treatment effect in
observational studies. In this chapter, we assumed that individuals under study can
change their treatment or choose to receive a different treatment, instead of receiving
no treatment as in Chapter 3. We used the same matching techniques here to
estimate and compare the treatment effects. We summarize our findings as follows.
Table 4.5: Empirical estimates (E.E.'s) resulted from the matching methods in the first scenario (m=1000).
Table 4.6: Empirical estimates (E.E.'s) resulted from the matching methods in the first scenario when α3 = 0.9 and α6 = 0.55 (m=1000).
Table 4.10: Empirical estimates (E.E.'s) resulted from the matching methods in the first scenario when α3 = 0.9 and α6 = 0.55 (m=500).

First, based on the results given in Tables 4.5 and 4.6, we found that, except for the
model CM 1, the matching models resulted in estimates with a very small degree of
bias. For example, in Table 4.5 the estimate resulting from the model PSM 3 in the
presence of overdispersion (ϕ = 0.6) is equal to 1.0712, which is close to the true
value, 1.053. The reason why the different matching models produced precise
estimates in the first scenario of our study is that we used a multiplicative intensity
function to generate event times and estimate the treatment effects. As a result, the
bias arising from a matching model cancels out when we compare the estimated
treatment effects. It is important to note that the size of the matched sample should
be large enough to be able to eliminate the bias when comparing the treatment
effects. For example, in the model CM 1, where we match the individuals on the
covariates that are associated with the treatment selection, the estimates are
relatively biased due to the small number of matched observations.
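The cancellation argument can be made explicit. For a pair that is exactly matched on the covariate value x (an idealization; real matches are only approximate), the common factor ρ0 exp{α′x} cancels in each ratio:

```latex
% Why the comparison is unbiased under exact matching and the
% multiplicative models (4.6)-(4.8); a sketch under idealized matching.
\begin{align*}
\frac{E\{N_1(t^N)\}}{E\{N_0(t^N)\}}
  &= \frac{t^N \rho_0\, e^{\beta_S} e^{\boldsymbol{\alpha}'\boldsymbol{x}}}
          {t^N \rho_0\, e^{\boldsymbol{\alpha}'\boldsymbol{x}}} = e^{\beta_S},
\qquad
\frac{E\{N_1(t^N,\tau)\}}{E\{N_0(t^N,\tau)\}} = e^{\beta_N}, \\[4pt]
\text{so the ratio of ratios equals } & \;
e^{\beta_S} / e^{\beta_N} = e^{\beta_S-\beta_N} \approx 1.053 .
\end{align*}
```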
Second, we demonstrated that failure to include some important confounders in
the PSM or CM models can result in a higher degree of bias. For example, in Table
4.8 the results of the model PSM 1 are more biased than the results of PSM 7. It can
be concluded that including the covariates that are solely associated with the
treatment selection does not improve the accuracy of the estimates and in some cases
may even increase the bias due to a decreased number of matched subjects. For
example, in Table 4.7 the estimate under the model PSM 1 in the absence of
overdispersion is equal to 1.6249, while the estimate under the model PSM 3 equals
1.3353, which is much closer to the true value, 1.16. The same conclusion can be
made based on the estimates resulting from the model PSM 6 in Table 4.7. Based on
this, we can conclude that, in developing a PSM model, true confounders play the
most crucial role.
Third, we observed that among all the PSM models, the model PSM 5 has the
lowest degree of bias, which led us to the conclusion that, besides other confounding
variables, matching individuals on their history balances out noise caused by
unmeasured covariates. Furthermore, it is worth mentioning that in the models in
which the history of the potential matched subjects is adjusted, the estimates are
relatively robust regardless of the degree of overdispersion. For example, in Tables 4.7
and 4.8 the estimates resulting from the models PSM 5, CM 2, CM 4 and HM
support this statement. On the other hand, the other models resulted in more biased
estimates when the degree of overdispersion increased. We recommend that, in the
case where researchers cannot measure or fail to measure some important covariates,
including the history in the matching model can account for those unmeasured
covariates to some degree, depending on how informative the history is.
Fourth, when we decreased the population size to 500, the degree of bias increased
in all the settings. In particular, the CM and PSM models resulted in a greater
degree of bias. For example, in Table 4.11 the model CM 1 in the absence of
overdispersion resulted in an estimate with a bias of 0.322, while the same model in
Table 4.7 resulted in an estimate with a bias of 0.247. The reason for this result can
be linked to the fact that CM is generally useful when there are a few explanatory
variables in the matching model; in particular, if the population size is small, it may
result in a small matched sample and therefore lead to a more biased estimate. It is
worth noting that the results of the CM method could be worse if the number of
treated subjects in the population is small. In this case, the HM method also resulted
in more biased estimates compared with the corresponding estimates when m = 1000.
For example, in Table 4.12, the empirical estimate resulting from the HM method in
the presence of overdispersion (ϕ = 0.3) is equal to 1.4991, which is 3.5 per cent more
biased than the corresponding estimate in Table 4.8. The reason why the HM method
did not show a noticeable increase in bias is that in the HM method we use only one
variable to match the individuals, and therefore it is easier to find comparison unit(s)
for treated units.
Chapter 5
In this chapter, we consider a real-world example from an epilepsy study to
illustrate the matching methods used in the previous chapters. To this end, we
generated a synthetic data set based on the information obtained from recently
published literature on recurrent epileptic seizures in adults. The generated data set
includes a variety of explanatory variables that help to assess the capability of the
applied matching methods in detail. Our primary goal in this chapter is to determine
how well propensity score matching (PSM) works when count and continuous
explanatory variables are available along with binary explanatory variables in a given
data set.
Seizures are caused by abnormal activities in the brain due to a central nervous
system (neurological) disorder. Anyone can develop epilepsy; it affects both males
and females of all races, ethnic backgrounds and ages. The onset of epilepsy is most
common in children and older adults, but the condition can occur at any age. For the
majority of people with epilepsy, there are a few ways to control seizures, including
treatment with medication and surgery. In many scientific and research papers, it
has been shown that patients with epilepsy may undergo several seizure attacks in
a week, month or year (e.g., Moran et al. (2004); Hoppe et al. (2007); and Viteva
(2014)). This allows us to have informative histories for the patients under study,
which in turn makes it possible for us to apply the history matching (HM) method.
There are many things that make seizures more likely for some people with epilepsy.
These are often called triggers. Below, we mention some of the seizure triggers that
have been reported by people with epilepsy:
1. Sleep deprivation.
2. Stress.
3. Drinking alcohol.
4. Using recreational drugs.
5. Living in big cities with excessive noise.
The reasons why sleep deprivation can trigger seizures are not clearly known, but
seizure specialists believe that changes in the brain's electrical and hormonal activity
occurring during sleep can be related to why lack of sleep can provoke seizures. Stress
is another trigger, because the areas of the brain responding to stress overlap with
the areas important for seizures. Moreover, stress also causes sleep disorders, which
may provoke seizures. Other triggers, such as drinking alcohol, using recreational
drugs and living in mega-cities with excessive noise, may negatively affect brain
activities and cause stress, which can eventually lead to seizure attacks. We use these
triggers as explanatory variables that affect the outcome of interest, that is, a seizure
attack. In addition to these variables, we consider two other explanatory variables,
age and gender, which play an important role in having seizure attacks.
Similar to the second scenario considered in Chapter 4, in this study we assume
that the follow-up time for each of the individuals under study is 10 years, and that
there are two different treatments, Treatment A and Treatment B. Individuals under
study can only choose to receive one of these treatments. Our secondary goal is to
estimate and compare the effects of these two treatments.
In this section, we develop the propensity score models using some observed baseline
covariates. The covariates that we use in the PSM models are gender, stress, living in
urban and industrial areas, age and years of schooling. Table 5.1 shows the association
between the explanatory variables and the outcome or treatment selection.
Table 5.1: The presence or absence of association of explanatory variables with the
treatment selection and the outcome considered in data generation.

                                Associated with trt. selection   Not associated with trt. selection
  Associated with outcome       St., G., U.I., Age               H.S., Al., R.D.
  Not associated with outcome   Sc.                              -

  (St. = Stress; G. = Gender; U.I. = Urban and Industrial Areas; H.S. = Hours of
  Sleep; Al. = Alcohol; R.D. = Recreational Drugs; Sc. = Years of Schooling.)
In Table 5.1, the variables G., U.I., Al. and R.D. are binary, and Age, H.S., Sc. and
St. are the continuous and count variables. The following PSM models are used for
estimating the propensity scores of the individuals under study.
• PSM 1: This model contains all variables associated with the treatment selec-
tion.
• PSM 2: This model includes all the true confounding variables that are associ-
ated with both the treatment selection and outcome.
• PSM 3: In this model, we obtain propensity scores using the true confounders,
followed by an additional adjustment for a variable representing the history of
the subjects.
• PSM 4: All explanatory variables are included in the propensity score model.
Based on the findings in Austin (2009b), we apply caliper matching, where calipers
of width 0.2 of the standard deviation of the logit of the estimated propensity scores
are used. Austin (2009b) showed that matching on the logit of the propensity score,
using calipers of width 0.2 of the standard deviation of the logit of the propensity
score, tended to have superior performance for estimating treatment effects compared
with competing methods used in the medical literature. For PSM 3, we adjust the
history of the potential matched individuals so that they have the same number of
pre-treatment events.
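The following sketch illustrates greedy 1:1 caliper matching on the logit of the propensity score with the caliper width recommended by Austin (2009b); the implementation details and names are our own assumptions, not the thesis code:

```python
import numpy as np

def psm_caliper_match(logit_ps_trt, logit_ps_ctl, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbour matching on the logit of the estimated
    propensity score, with a caliper of `caliper_sd` standard deviations of
    the pooled logits; a sketch, not the thesis implementation."""
    logit_ps_trt = np.asarray(logit_ps_trt, dtype=float)
    logit_ps_ctl = np.asarray(logit_ps_ctl, dtype=float)
    caliper = caliper_sd * np.std(np.concatenate([logit_ps_trt, logit_ps_ctl]))
    used = np.zeros(logit_ps_ctl.size, dtype=bool)
    pairs = []
    for i in np.argsort(logit_ps_trt):     # process treated subjects in PS order
        d = np.abs(logit_ps_ctl - logit_ps_trt[i])
        d[used] = np.inf                   # each control matched at most once
        j = int(np.argmin(d))
        if d[j] <= caliper:                # accept only matches within caliper
            pairs.append((int(i), j))
            used[j] = True
    return pairs
```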
For CM, we consider the following cases:
• CM 1: In this case, we match the subjects on the variables that are associated
with the treatment selection.
In the CM method, we use exact matching for the binary covariates. For the
continuous and count variables, we apply a caliper of width 0.2 of the standard
deviation of the associated covariate. In order to match the subjects on their
histories, we find the subjects with the same number of pre-treatment events.
Since the pre-treatment follow-up time is the same for all the individuals, we
used the number of events experienced by individuals prior to the time of treatment
initiation to represent the history of the individuals. Therefore, in the HM method
we considered an untreated individual as a match for a treated individual if they
have the same number of pre-treatment events.
We now present the steps used for the data-generating process in the absence and
presence of overdispersion.

Table 5.2: Coefficients used to obtain the propensity scores.

  β0,treatment   β1       β2         β3       β4          β5
  -0.6           log(5)   log(1.2)   log(5)   log(1.18)   log(0.5)

Table 5.3: Parameters used in the intensity functions.

  ρ0     βA      βB      α1     α2      α3    α4    α5    α6     α7
  1.25   -1.25   -1.75   0.15   -0.08   0.2   0.2   0.2   0.05   0.05
• We generated a treatment status for each of the individuals by using a binary
model, where the propensity score is obtained by the logistic regression model

logit(pi,treatment) = β0,treatment + β1 G.i + β2 St.i + β3 U.I.i + β4 Agei + β5 Sc.i.   (5.2)

Note that, here, if Trti = 1 then the individual receives Treatment A; otherwise,
Treatment B. The values of the parameters in the logistic regression model (5.2)
are given in Table 5.2.
• We generated pre-treatment event times for the first two years of the follow-up
time for all subjects using the following intensity function:

λi(t | x) = ρ0 exp{α1 St.i + α2 Sl.i + α3 U.I.i + α4 Al.i + α5 Dr.i + α6 Agei + α7 G.i}, i = 1, 2, ..., 20,000.   (5.3)

• Post-treatment event times were generated from the intensity function (5.4),
which adds the treatment terms βA Trti,A and βB Trti,B to the exponent of
(5.3). The parameters used in the intensity functions (5.3) and (5.4) and their
values are given in Table 5.3.
• We matched the individuals using the proposed matching methods and models.
• For the matched sample obtained from the previous step, we compared the
effects of the two treatments in two forms:
1. Theoretical estimate:

T.E. = [E(NA(t))/(τ − 2)] / [E(NB(t))/(τ − 2)],   (5.5)

where E(NA(t)) and E(NB(t)) represent the expected numbers of post-treatment
events for the matched individuals in the treatment and control groups,
respectively, and τ = 10 represents the end of the follow-up time.
2. Empirical estimate: In order to calculate the empirical estimate, the expected
numbers of events in the T.E. given in (5.5) are replaced with the corresponding
observed numbers of events.
• Finally, the estimates resulting from the previous step were compared to
exp{βA − βB} = 1.65.
The steps required for generating data in the presence of overdispersion are the same
as the steps given above. The only difference is that, instead of the intensity functions
(5.3) and (5.4), we used the following event-generating models, respectively:

λi(t | x, ui) = ui ρ0 exp{α1 St.i + α2 Sl.i + α3 U.I.i + α4 Al.i + α5 Dr.i + α6 Agei + α7 G.i}, i = 1, 2, ..., 20,000,   (5.6)

and

λi(t | x, ui) = ui ρ0 exp{βA Trti,A + βB Trti,B + α1 St.i + α2 Sl.i + α3 U.I.i + α4 Al.i + α5 Dr.i + α6 Agei + α7 G.i},   (5.7)

where the random effect ui follows a gamma distribution with mean 1 and variance
ϕ = 0.3, representing a moderate amount of heterogeneity commonly seen in such
studies.

Table 5.4: Estimates resulted from different matching methods and models.
Based on the results given in Table 5.4, it can be concluded that, in our settings,
PSM resulted in a higher degree of bias compared with CM and HM. The reason for
this is that PSM failed to perfectly balance some key covariates between treated and
untreated subjects. For example, the variables Age and Stress have profound effects
on the event rate and may result in a higher degree of bias if they are not sufficiently
balanced between the treatment and control groups.
One of the commonly used numerical balance diagnostics is the standardized mean
difference. The standardized mean difference can be used to compare the balance in
baseline covariates between treated and untreated units in a matched sample. It can
also be used to evaluate the propensity score balance. The concept of and formulas
for the standardized mean difference have been thoroughly discussed by Austin
(2009a) and Stuart (2010). There is no consensus as to what value of a standardized
difference denotes important residual imbalance between treated and untreated
subjects in the matched sample (Austin, 2009a). However, it is recommended that
the standardized mean difference be close to zero for propensity scores (Austin,
2011b), and that, for continuous covariates, the standardized mean difference be less
than 0.25 standard deviation units (Stuart (2010); Clearinghouse (2014)). Finally,
the standardized mean difference for categorical covariates should be less than 0.10
(Austin, 2009a). It should be noted that, for any type of covariate, the closer the
standardized mean difference is to zero, the better the matched sample. Therefore,
regardless of the above recommendations, researchers should carefully identify the
key covariates that are prognostically important before applying the matching
method.

The standardized mean differences for each of the covariates in the PSM models
are reported in Tables 5.5 and 5.6. A few conclusions can be made based on the
absolute value of the standardized mean difference.
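For reference, the standard formulas (as given by Austin, 2009a) can be computed as in the following minimal sketch:

```python
import numpy as np

def smd_continuous(x_trt, x_ctl):
    """Standardized mean difference for a continuous covariate:
    difference in sample means over the pooled standard deviation."""
    x_trt = np.asarray(x_trt, dtype=float)
    x_ctl = np.asarray(x_ctl, dtype=float)
    pooled_sd = np.sqrt((np.var(x_trt, ddof=1) + np.var(x_ctl, ddof=1)) / 2.0)
    return (x_trt.mean() - x_ctl.mean()) / pooled_sd

def smd_binary(p_trt, p_ctl):
    """Standardized difference for a binary covariate, computed from the
    group prevalences p_trt and p_ctl."""
    denom = np.sqrt((p_trt * (1.0 - p_trt) + p_ctl * (1.0 - p_ctl)) / 2.0)
    return (p_trt - p_ctl) / denom
```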
Chapter 6

Unlike randomized controlled trials, observational studies do not include
randomization as a design principle. The main goal of observational studies is to
develop cause-and-effect types of relationships between explanatory variables and
outcome variables when a randomized controlled trial is not feasible. Many studies
have compared these two important classes of designs, with their advantages and
disadvantages.
In this study, we mainly focused on observational studies and discussed the
estimation of treatment effects for recurrent events. To this end, we considered the
PSM, CM and HM methods for matching in observational studies. It should be
noted that the use of the PSM and CM methods has been relatively well documented
when the outcome of interest is not of recurrent type. As noted by Smith and
Schaubel (2015), the matching methods have not been discussed in detail, especially
when there is an interest in assessing the treatment effects. To fill this gap, we
considered the commonly used PSM and CM methods, as well as a relatively new
matching method called HM, which is applicable in recurrent event studies. HM has
been applied by Smith and Schaubel (2015) in a recurrent event setting. To our
knowledge, it has not been discussed extensively by others. In order to draw more
general conclusions, we used the PSM, CM and HM techniques in various settings
commonly seen in epidemiological studies. We considered time-fixed treatment effects
in Chapter 3 and time-varying treatment effects in Chapter 4 through Monte Carlo
simulations. In the simulations, we focused on the bias in the estimation of the
treatment effects and did not discuss the variance, because most matching procedures
in observational studies are population based and the standard errors are usually
negligible. The results of our study can be summarized as follows.
First, we demonstrated that HM, or any other model in which the history is
adjusted, provides the best matched sample. When an outcome-related covariate was
omitted from the matching model, we showed that including the history in that model
can greatly decrease the bias due to the excluded covariate. This result was expected,
since the history is generated by all measured and unmeasured outcome-related
covariates and can therefore be used as an alternative to them. It should be noted
that the use of the history as an alternative approach to dealing with unmeasured or
unobserved covariates should not be regarded as a panacea, as this conclusion entirely
depends on how informative the history is. Furthermore, we observed that the
estimates resulting from HM, or from the models in which the histories of the
potential matched subjects are adjusted, are relatively robust to overdispersion.
Second, we demonstrated that covariates that are associated with the treatment
selection but not with the outcome should not be included in the PSM or CM models.
Their inclusion can potentially increase the number of bad matches and, in most
cases, decrease the size of the matched sample, which eventually leads to a higher
degree of bias. Based on this point, it can be concluded that, in developing and
applying the PSM and CM methods, one should only include true and potential
confounders in the matching model. This helps to maximize the number of matched
subjects, which in turn increases the accuracy of the estimation of the treatment
effects.
Third, CM may result in precise estimates if there are only a few covariates on
which subjects need to be matched. In contrast, if there are too many covariates,
models based on CM may result in many treated subjects being excluded from the
matched sample. As a result, the bias may increase. In such cases, methods based on
PSM are preferred, as they summarize all covariates in a single quantity.
Furthermore, we showed in Chapter 4 that it is better to avoid models based on CM
when the population size is small.
Fourth, if a confounding variable has a noticeable impact on the outcome, then it
is better to follow the idea of the randomized block design and adjust for that variable
between treated and untreated subjects prior to conducting any matching method.
This helps to improve the performance of the matching methods by reducing the bias
that could result from balancing the key prognostic covariates only on average.
Fifth, based on the simulation studies presented in Chapters 3 and 4, it can be
concluded that, in our settings, true confounders are more important than potential
confounders in developing PSM models. For example, in Table 3.5 the results
obtained from the model PS 7 are more biased than the results of the model PS 3.
The model PS 7 includes all true and potential confounders, while the model PS 3
includes only the true confounders. This result may run counter to intuition for
many, as the goal of an observational matching analysis is to balance the true and
potential confounders between the treatment and control groups, yet the results of
the model PS 7 are unexpectedly more biased despite the fact that all the
outcome-related covariates are included in the model. This issue was addressed by
using the model PS 5, in which the potential confounders are replaced by the history,
but future work should thoroughly examine the reasons why the model PS 7 did not
produce more precise estimates.
In this section, we suggest some future extensions of the work in this thesis. With
this future work, we aim to address some of the shortcomings of the current
simulations and to extend the ideas explored here to several other relevant situations
in causal inference for recurrent events data.

First, we showed in Chapters 3 and 4 that matching subjects on their history can
help increase the accuracy of the estimation of treatment effects. This is only true
when the history of the subjects under study is reasonably informative, so that it can
be used as an alternative to other outcome-related covariates. Therefore, it would be
interesting to establish criteria for when one can use the history of subjects, or the
HM method, in the estimation of treatment effects in recurrent events settings.
Moreover, for the HM method, the pre-treatment gap times between successive events
can be used as the history of the subjects instead of the pre-treatment number of
events.
Second, all covariates as well as their effects considered in this thesis are
time-fixed. As future work, we aim to consider the situation where covariates and
their effects vary over time. Such an extension would be very useful, because
time-varying covariates and their effects are of interest in many epidemiology and
public health studies with recurrent events. For example, in recurrent events analysis
the occurrence of a new event usually depends on the previous event occurrences.
Therefore, more complex recurrent event models based on event intensity functions
can be considered. Furthermore, our study in this thesis can be extended to the case
where the effect of a treatment changes over time, or where the estimation of a new
treatment effect in the presence of old treatment effects is of interest. For example, in
Chapter 4 we assumed that the effect of the standard treatment does not interfere
with the effect of the new treatment. This assumption is not realistic in many
real-life situations.
Third, throughout the thesis we considered only multiplicative intensity functions
to generate event times. It would be useful to redevelop and evaluate the performance
of the matching methods when the intensity function is of additive form. The
intensity function can also be generalized to include a trend component in the
baseline rate function.
Fourth, in this study we used three different matching methods to estimate the
treatment effects. There are other matching techniques, such as stratification on the
propensity score, inverse probability of treatment weighting using the propensity
score, Mahalanobis distance matching and coarsened exact matching, which can be
applied in future studies.