
Introduction to Causal Inference

Potential Outcomes and Counterfactuals

Mahbub Latif

Institute of Statistical Research and Training (ISRT)


University of Dhaka, Dhaka 1000, Bangladesh
([email protected])

October 2024

Plan

Chapter 1: Potential Outcomes and Counterfactuals


Why study causation
Individual causal effects
Average causal effects
Measures of causal effect
Causation versus association
Causal estimands



Introduction

• Question 1
Do charter schools increase the test scores of elementary school students?
• If so, how large are the gains compared to those that could be realized by
implementing alternative educational reforms?



Introduction

• Question 2
Does obtaining a college degree increase an individual’s labor market earnings?
• If so, is this particular effect large relative to the earnings gains that could be
achieved only through on-the-job training?



Introduction

• These (Questions 1 and 2) are simple cause-and-effect questions of the form: Does 𝐴 cause 𝑌 ?
• If 𝐴 causes 𝑌 , how large is the effect of 𝐴 on 𝑌 ?
• Is the size of this effect large relative to the effects of other causes of 𝑌 ?



Introduction

• Simple cause-and-effect questions are the motivation for much research in the social,
demographic, and health sciences
• Definitive answers to cause-and-effect questions may not always be possible to
formulate, given the constraints that researchers face in collecting data and
evaluating alternative explanations.
• Over the past four decades, a counterfactual model (also known as the
Neyman-Rubin Causal model) of causality has been developed and refined, and as a
result, a unified framework for the prosecution of causal questions is now available.



Model for associational inference (Holland, 1986)

• Let 𝑈 be the population of units; a variable is a real-valued function defined on every unit 𝑢 ∈ 𝑈
• Let 𝑌 (𝑢) be the value of the variable 𝑌 for unit 𝑢 ∈ 𝑈 ; the objective is to understand the variability of 𝑌 over the units of 𝑈
• Associational inference concerns the distribution of 𝑌 across the values of another variable 𝑆 (say) defined on 𝑈
• The joint distribution of 𝑌 and 𝑆 over 𝑈 is defined by 𝑃 (𝑌 = 𝑦, 𝑆 = 𝑠), the proportion of units 𝑢 ∈ 𝑈 for which 𝑌 (𝑢) = 𝑦 and 𝑆(𝑢) = 𝑠



Model for associational inference

• An associational parameter is defined by the joint distribution of 𝑌 and 𝑆 , e.g., the conditional probability

    𝑃 (𝑌 = 𝑦 | 𝑆 = 𝑠) = 𝑃 (𝑌 = 𝑦, 𝑆 = 𝑠) / 𝑃 (𝑆 = 𝑠)

• A typical associational parameter is defined by the parameters of the regression of 𝑌 on 𝑆 , i.e., 𝐸(𝑌 | 𝑆 = 𝑠)
• Associational inference consists of making statistical inferences (estimates, tests, etc.) about associational parameters relating 𝑌 and 𝑆 , based on data gathered on 𝑌 and 𝑆 from units in 𝑈
• In associational inference, the role of time is only to define the population of units or to specify the operational meaning of a particular variable; this differs from causal inference



Why study causation



Simpson’s paradox

• Named after Edward Simpson (born 1922)


• The paradox refers to the existence of data in which statistical association that holds
for an entire population is reversed in every subpopulation



[Figures (four slides): scatter plots of cholesterol versus exercise time (in hours), shown for the whole sample and separately by age group (20, 25, 30, 40, 50) with an overall fit; they illustrate how the association seen in the entire population can be reversed within every age subgroup (Simpson's paradox)]


Simpson’s paradox (Simpson, 1951)

• A group of sick patients were given the option to try a new drug
• Among those who tried the new drug, a lower percentage recovered than among those who did not

• However, when we partition the patients by gender:
  • A higher percentage of men taking the drug recovered than of men not taking the drug
  • A higher percentage of women taking the drug recovered than of women not taking the drug

• General conclusion
  • The drug appears to help both men and women, but to hurt the general population



Example

• Recovery rates of 700 patients who were given access to the drug were recorded
• A total of 350 patients chose to take the drug, and the remaining 350 did not

gender Drug No drug


Men 81/87 = .93 recovered 234/270 = .87 recovered
Women 192/263 = .73 recovered 55/80 = .69 recovered

Combined 273/350 = .78 recovered 289/350 = .83 recovered

• Conclusions based on association measures could be misleading



Study question

What is wrong with the following claims?

• “Data show that income and marriage have a high positive correlation. Therefore,
your earnings will increase if you get married.”
• “Data show that as the number of fires increases, so does the number of firefighters.
Therefore, to reduce fires, you should reduce the number of firefighters.”
• “Data show that people who hurry tend to be late to their meetings. Don’t hurry, or
you will be late.”



Study question

• There are two treatments (A and B) used on kidney stones


• Doctors are more likely to use Treatment A on large (and therefore, more severe)
stones and Treatment B on small stones.
• Should a patient who does not know the size of his or her stone examine the general
population data or the stone size-specific data when determining which treatment will
be more effective?



Probability and statistics

• Variables, event, conditional probability, independence, conditional independence


• Probability distributions, law of total probability, Bayes rule
• Expectation, conditional expectation, regression



Monty Hall’s problem

• The host of a popular TV game show, Monty shows a contestant three doors – 𝐴, 𝐵,
and 𝐶 – a new car is behind one of the doors, and the other two doors have goats.
• If the contestant guesses correctly, the car is his; otherwise, he gets a goat.
• Suppose the contestant guesses 𝐴 at random and then Monty, who is forbidden from
revealing where the car is, opens Door 𝐶 , which, of course, has a goat behind it.
• He tells the contestant that he can now switch to Door 𝐵, or stick with Door 𝐴.
• Whichever the contestant picks, he will get what’s behind it.

Is the contestant better off sticking with Door 𝐴, or switching to Door 𝐵?



Monty Hall’s problem

• Let each door have an equal chance of having the car behind it, i.e.,

𝑃 (car-in-A) = 𝑃 (car-in-B) = 𝑃 (car-in-C) = 1/3

• Initially the contestant picked door 𝐴 and Monty opened door 𝐶 ; then we have
𝑃 (open-C | car-in-A) = 1/2
𝑃 (open-C | car-in-B) = 1
𝑃 (open-C | car-in-C) = 0
• Then one can show that switching to door 𝐵 is the better choice to win the car, i.e.

𝑃 (car-in-B | open-C) > 𝑃 (car-in-A | open-C)
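• Applying Bayes' rule with these ingredients completes the argument: 𝑃 (open-C) = 1/2 × 1/3 + 1 × 1/3 + 0 × 1/3 = 1/2, so

    𝑃 (car-in-B | open-C) = 𝑃 (open-C | car-in-B) 𝑃 (car-in-B) / 𝑃 (open-C) = (1 × 1/3)/(1/2) = 2/3
    𝑃 (car-in-A | open-C) = (1/2 × 1/3)/(1/2) = 1/3

  Switching therefore doubles the probability of winning the car.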



Individual causal effects



The Potential Outcome model (The Neyman-Rubin Causal Model)

• Basic setup of the potential outcome model


• Unit: a person or object to which a treatment is assigned
• Target population: a well-defined population of units whose outcomes are going to be
compared
• Data: a sample of units from the target population
• Treatment: an intervention



The Potential Outcome model

• Suppose that each individual in a population of interest can be exposed to two


alternative states of a cause, e.g., taking the drug or not taking it
• Each individual in the population of interest has a potential outcome under each
treatment state, even though each individual can be observed in only one treatment
state at any point in time



The Potential Outcome model

• For the example with the causal effect of having a college degree rather than only a
high school diploma on subsequent earnings:
• adults who have completed only high school diplomas have theoretical what-if earnings
under the state “have a college degree,” and
• adults who have completed college degrees have theoretical what-if earnings under the
state “have only a high school diploma”

• These what-if potential outcomes are counterfactual in the sense that they exist in
theory but are not observed



Potential outcome

• The dichotomous treatment variable 𝐴 has two levels:

    𝐴 = 1 for treated, 𝐴 = 0 for untreated

• The dichotomous outcome variable 𝑌 has two levels:

    𝑌 = 1 for death, 𝑌 = 0 for survival

• 𝐴 and 𝑌 are random variables



Potential outcome

• Patient Zeus
  • On January 1, Zeus got a new heart after waiting for a heart transplant, and five days later he died
  • We know somehow that “had Zeus not received a heart transplant on January 1, he would have been alive five days later” (for Zeus, 𝑌 𝑎=0 = 0 even though he was treated; the treatment reversed his outcome)
• Patient Hera
  • On January 1, Hera got a new heart after waiting for a heart transplant, and five days later she was alive
  • We know somehow that “had Hera not received a heart transplant on January 1, she would still have been alive five days later” (for Hera, 𝑌 𝑎=0 = 0 as well)



Notations for actual data (associational inference)

• 𝐴 = 1 → patient received new heart, 𝐴 = 0 otherwise


• 𝑌 = 1 → patient died five days later, 𝑌 = 0 otherwise

patient 𝐴 𝑌   (here 𝑌 is the actual outcome, not a counterfactual)

Zeus 1 1
Hera 1 0



Notations for ideal data (causal inference)

• Counterfactuals
  • 𝑌 𝑎=1 → the outcome that would have been observed under 𝑎 = 1
  • 𝑌 𝑎=0 → the outcome that would have been observed under 𝑎 = 0

patient 𝐴 𝑌 𝑎=0 𝑌 𝑎=1

Zeus 1 0 1
Hera 1 0 0

• (Zeus was treated and died, so 𝑌 𝑎=1 = 1; had he not been treated he would have survived, so 𝑌 𝑎=0 = 0. Hera would have been alive with or without treatment, so both of her counterfactuals are 0)
• Both 𝑌 𝑎=1 and 𝑌 𝑎=0 are random variables, as they can take different values for different individuals
• The notations 𝑌 (1) and 𝑌 (0) are also used in the literature for 𝑌 𝑎=1 and 𝑌 𝑎=0 , respectively
Associational and causal measures

• The role of the treatment 𝐴 measured on a unit 𝑢 ∈ 𝑈 differs between the associational and causal inference models
• In associational inference, 𝐴(𝑢) is a characteristic of 𝑢 ; in causal inference, 𝐴(𝑢) indicates exposure of 𝑢 to a specific cause
• The role of time is important in causal inference, and a unit is exposed to a cause
that must occur at some specific time or within a specific time period
• Variables are now divided into two classes: pre-exposure and post-exposure
• Response variable falls in the class of post-exposure



Individual causal effects

• The treatment 𝐴 has a causal effect on the outcome of individual 𝑖 if

    𝑌𝑖𝑎=1 ≠ 𝑌𝑖𝑎=0 ,  and the individual causal effect is Δ𝑖 = 𝑌𝑖𝑎=1 − 𝑌𝑖𝑎=0

• For Zeus the heart transplant has a causal effect (treated, he died; untreated, he would have survived), but for Hera the treatment does not have a causal effect (she would have been alive either way)



Potential outcomes

• The variables 𝑌 𝑎=1 and 𝑌 𝑎=0 are known as potential outcomes or counterfactual outcomes because only one of them will ultimately be realized
• Potential outcome: the outcome that would have been observed for an individual under a possible treatment value 𝑎
• Counterfactual outcome: the outcome corresponding to the “counter to the fact” situation, i.e., the outcome that would have been observed under a treatment value that the individual did not actually receive
• The term potential outcomes was first used by Jerzy Neyman (1923) in the context of randomized experiments



Consistency

• 𝐴𝑖 is the set of treatments to which unit 𝑖 can be exposed, and 𝐴𝑖 = 𝐴 because the set of treatments is the same for all units

• Consistency of counterfactual outcomes

    If 𝐴𝑖 = 𝑎 then 𝑌𝑖𝑎 = 𝑌𝑖𝐴 = 𝑌𝑖

• For each individual, the observed outcome 𝑌 is the counterfactual outcome under the treatment value that the individual actually experienced
• The counterfactual outcome 𝑌 𝑎 under treatment 𝑎 is therefore factual for some individuals and counterfactual for others
• Consistency is a component of Rubin's “stable unit treatment value assumption” (SUTVA)


Stable Unit Treatment Value Assumption (SUTVA)

• The potential outcomes of any unit do not vary with the treatments assigned to other units (no interference), i.e., 𝑌𝑖 = 𝑌𝑖 (𝐴𝑖 ) does not depend on 𝐴𝑖′ for 𝑖′ ≠ 𝑖
• There are no different versions of each treatment level that lead to different potential outcomes (no hidden variations of treatment)
• Under SUTVA, for a treatment with two levels, the array of counterfactuals for a population of size 𝑁 has dimension 𝑁 × 2; without the no-interference assumption it would be 𝑁 × 2^𝑁 , because each unit's outcome could depend on the full treatment assignment of all 𝑁 units (e.g., with 𝑁 = 2 each unit already has 2² = 4 potential outcomes)



The fundamental problem of causal inference

• Individual causal effects cannot be determined because only one counterfactual outcome is generally observed for each individual
• Causal inference is a missing data problem

• There are two general solutions to the fundamental problem of causal inference
  • the scientific solution and the statistical solution



Table 1: Available data from a follow-up study

patient 𝐴 𝑌 𝑌 𝑎=0 𝑌 𝑎=1


Ian 1 1 ? ?
Jim 0 0 ? ?
Ken 1 0 ? ?
Leo 0 1 ? ?
Mike 1 1 ? ?
Nick 0 0 ? ?



Table 2: Available data from a follow-up study

patient 𝐴 𝑌 𝑌 𝑎=0 𝑌 𝑎=1

Ian 1 1 ? 1
Jim 0 0 0 ?
Ken 1 0 ? 0
Leo 0 1 1 ?
Mike 1 1 ? 1
Nick 0 0 0 ?

• Only one of the two counterfactuals is observed for each patient: 𝐴 indicates which column ( 𝑌 𝑎=1 or 𝑌 𝑎=0 ) is observed, and 𝑌 gives the value that goes there


More notations

Let 𝑁 be the population size; for the 𝑖th individual we can define the observed and missing outcomes

    𝑌𝑖obs = 𝑌𝑖 (𝐴𝑖 ) = 𝑌𝑖 (0) if 𝐴𝑖 = 0 and 𝑌𝑖 (1) if 𝐴𝑖 = 1
    𝑌𝑖miss = 𝑌𝑖 (1 − 𝐴𝑖 ) = 𝑌𝑖 (1) if 𝐴𝑖 = 0 and 𝑌𝑖 (0) if 𝐴𝑖 = 1

• Observed response

    𝑌𝑖obs = 𝑌𝑖 (𝐴𝑖 ) = 𝑌𝑖 (1)𝐴𝑖 + 𝑌𝑖 (0)(1 − 𝐴𝑖 ) = 𝑌𝑖 (0) + Δ𝑖 𝐴𝑖

• (For example, plugging 𝐴𝑖 = 0 into 𝑌𝑖miss gives 𝑌𝑖 (1 − 𝐴𝑖 ) = 𝑌𝑖 (1))
• Potential outcomes are hypothetical and are assumed to exist before the treatment is assigned
• The observed outcome does not exist until it is assessed, which is after the treatment is assigned
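• A minimal R sketch of this bookkeeping for the six patients of Table 2 (the object names are illustrative):

    # Observed treatment A and outcome Y for the six patients of Table 2
    patient <- c("Ian", "Jim", "Ken", "Leo", "Mike", "Nick")
    A <- c(1, 0, 1, 0, 1, 0)
    Y <- c(1, 0, 0, 1, 1, 0)

    # A tells us which counterfactual column the observed Y fills in;
    # the other column stays missing (NA)
    Y1 <- ifelse(A == 1, Y, NA)   # Y^{a=1}: observed only for the treated
    Y0 <- ifelse(A == 0, Y, NA)   # Y^{a=0}: observed only for the untreated
    data.frame(patient, A, Y, Y0, Y1)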
Average causal effects



Average causal effects

• Three pieces of information are needed to define an individual causal effect


• an outcome of interest
• the actions (𝑎 = 0 and 𝑎 = 1) to be compared
• the individual whose counterfactuals (𝑌 𝑎=0 and 𝑌 𝑎=1 ) are to be compared



Average causal effects

• Three pieces of information are needed to define an average causal effect in a population of individuals
  • an outcome of interest
  • the actions (𝑎 = 0 and 𝑎 = 1) to be compared
  • a well-defined population of individuals whose counterfactuals (𝑌 𝑎=0 and 𝑌 𝑎=1 ) are to be compared

• (The only change from the individual causal effect is that a specific, well-defined group of people is required rather than a single individual)



An example of a population

• Consider a population of 20 subjects for which all the counterfactuals (𝑌 𝑎=0 and
𝑌 𝑎=1 ) are known

subjects 𝑌 𝑎=0 𝑌 𝑎=1 subjects 𝑌 𝑎=0 𝑌 𝑎=1


Rheia 0 1 Leto 0 1
Kronos 1 0 Ares 1 1
Demeter 0 0 Athena 1 1
Hades 0 0 Hephaestus 0 1
Hestia 0 0 Aphrodite 0 1
Poseidon 1 0 Cyclope 0 1
Hera 0 0 Persephone 1 1
Zeus 0 1 Hermes 1 0
Artemis 1 1 Hebe 1 0
Apollo 1 0 Dionysus 1 0

Average causal effects: Notations

• Pr(𝑌 𝑎=1 = 1) = 𝐸(𝑌 𝑎=1 ) → proportion of individuals who would have developed
the outcome 𝑌 had everybody in the population been treated
• Pr(𝑌 𝑎=0 = 1) = 𝐸(𝑌 𝑎=0 ) → proportion of individuals who would have developed
the outcome 𝑌 had everybody in the population been untreated
• Average causal effects in the population
• The treatment 𝐴 has an average causal effect on the outcome of interest 𝑌 if

Pr(𝑌 𝑎=1 = 1) ≠ Pr(𝑌 𝑎=0 = 1)



Average causal effects

• The outcome 𝑌 need not be dichotomous
• In general, treatment 𝐴 has an average causal effect on the outcome of interest 𝑌 if

    𝐸(𝑌 𝑎=1 ) ≠ 𝐸(𝑌 𝑎=0 ) ,  and the difference 𝜏 = 𝐸(𝑌 𝑎=1 ) − 𝐸(𝑌 𝑎=0 ) is known as the average treatment effect (ATE)

• (Earlier, Δ𝑖 denoted the individual causal effect; 𝜏 denotes the average effect)
• When the treatment is not dichotomous, the causal effect can be defined as a contrast of any functional (e.g., mean, median, hazard, cdf, …) of the distributions of counterfactuals under different treatment values


Average causal effects

• For our example population,

    𝑃 (𝑌 𝑎=1 = 1) = 10/20  and  𝑃 (𝑌 𝑎=0 = 1) = 10/20

• Since 𝑃 (𝑌 𝑎=1 = 1) = 𝑃 (𝑌 𝑎=0 = 1), the treatment 𝐴 does not have an average causal effect on 𝑌 in our hypothetical population



Average causal effects
• The absence of average causal effect does not imply the absence of individual causal
effects
• For our population, the treatment 𝐴 has an individual causal effect on 12 members of
the population

subjects 𝑌 𝑎=0 𝑌 𝑎=1 subjects 𝑌 𝑎=0 𝑌 𝑎=1


Rheia 0 1 Kronos 1 0
Zeus 0 1 Poseidon 1 0
Leto 0 1 Apollo 1 0
Hephaestus 0 1 Hermes 1 0
Aphrodite 0 1 Hebe 1 0
Cyclope 0 1 Dionysus 1 0

• Of the 12 subjects, six were harmed by the treatment 𝐴, and the others benefited
from it.
Average causal effects

• The average causal effect is always equal to the average of the individual causal effects:

    𝐸(𝑌 𝑎=1 ) − 𝐸(𝑌 𝑎=0 ) = 𝐸(𝑌 𝑎=1 − 𝑌 𝑎=0 )

• When there is no causal effect for any individual in the population (i.e., 𝑌 𝑎=1 = 𝑌 𝑎=0 for all individuals), we say that the sharp causal null hypothesis is true; every individual then has the same outcome under either treatment, so there is no difference of any kind
• The average causal effect can sometimes be identified from the data, even if individual causal effects cannot
• From now on, “average causal effect” is referred to as “causal effect”



Measures of causal effect



Measures of causal effect

Causal risk difference

    𝑃 (𝑌 𝑎=1 = 1) − 𝑃 (𝑌 𝑎=0 = 1)

Causal risk ratio

    𝑃 (𝑌 𝑎=1 = 1) / 𝑃 (𝑌 𝑎=0 = 1)

Causal odds ratio

    [𝑃 (𝑌 𝑎=1 = 1)/𝑃 (𝑌 𝑎=1 = 0)] / [𝑃 (𝑌 𝑎=0 = 1)/𝑃 (𝑌 𝑎=0 = 0)]



Measures of causal effect

• The causal risk difference can be written as 𝑃 (𝑌 𝑎=1 = 1) − 𝑃 (𝑌 𝑎=0 = 1) = 𝐸(𝑌 𝑎=1 ) − 𝐸(𝑌 𝑎=0 ) = 𝐸(𝑌 𝑎=1 − 𝑌 𝑎=0 ), i.e., the average of the individual causal effects
• For difference measures, the causal null value is zero; for ratio measures, it is one
• The causal risk difference in the population is the average of the individual causal effects, but the causal risk ratio in the population is not the average of the individual causal effects
• Exercise: obtain the causal risk difference, risk ratio, and odds ratio for our hypothetical population (an R sketch follows below)
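• A minimal R sketch for this exercise, encoding the counterfactuals of the 20-subject population listed earlier (object names are illustrative):

    # Counterfactual outcomes for the 20 subjects:
    # left column of the earlier table first (Rheia ... Apollo), then the right column (Leto ... Dionysus)
    Y0 <- c(0, 1, 0, 0, 0, 1, 0, 0, 1, 1,  0, 1, 1, 0, 0, 0, 1, 1, 1, 1)
    Y1 <- c(1, 0, 0, 0, 0, 0, 0, 1, 1, 0,  1, 1, 1, 1, 1, 1, 1, 0, 0, 0)

    p1 <- mean(Y1)   # P(Y^{a=1} = 1) = 10/20
    p0 <- mean(Y0)   # P(Y^{a=0} = 1) = 10/20
    c(causal_risk_difference = p1 - p0,                              # 0
      causal_risk_ratio      = p1 / p0,                              # 1
      causal_odds_ratio      = (p1 / (1 - p1)) / (p0 / (1 - p0)))    # 1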



Number needed to treat

• Consider a population with 100 million individuals


• 20 million would die within five years if treated (𝑎 = 1)
• 30 million would die within five years if untreated (𝑎 = 0)

• Causal risk difference

𝑃 (𝑌 𝑎=1 = 1) − 𝑃 (𝑌 𝑎=0 = 1) = −.10

• If one treats 100 million patients, there will be 10 million fewer deaths than if one does
not treat those 100 million patients
• One needs to treat ten patients to save one life



Number needed to treat

• The number needed to treat (𝑁 𝑁 𝑇 ) is the average number of individuals that need to receive treatment to reduce the number of cases by one:

    𝑁 𝑁 𝑇 = −1 / [𝑃 (𝑌 𝑎=1 = 1) − 𝑃 (𝑌 𝑎=0 = 1)]

• For our example, 𝑁 𝑁 𝑇 = −1/(−0.10) = 10



Causation versus association



Causation versus association

• Data obtained from actual studies don’t have information on counterfactuals, but the
labels of assigned treatments and the observed outcomes for all individuals are
available



Causation versus association

• Assigned treatment 𝐴 and observed outcome 𝑌 (via consistency)

subjects 𝑌 𝑎=0 𝑌 𝑎=1 𝐴 𝑌 subjects 𝑌 𝑎=0 𝑌 𝑎=1 𝐴 𝑌


Rheia 0 1 0 0 Leto 0 1 0 0
Kronos 1 0 0 1 Ares 1 1 1 1
Demeter 0 0 0 0 Athena 1 1 1 1
Hades 0 0 0 0 Hephaestus 0 1 1 1
Hestia 0 0 1 0 Aphrodite 0 1 1 1
Poseidon 1 0 1 0 Cyclope 0 1 1 1
Hera 0 0 1 0 Persephone 1 1 1 1
Zeus 0 1 1 1 Hermes 1 0 1 0
Artemis 1 1 0 1 Hebe 1 0 1 0
Apollo 1 0 0 1 Dionysus 1 0 1 0



Causation versus association
• From actual studies only the information on 𝐴 and 𝑌 are available to identify the
causal effects

subjects 𝐴 𝑌 subjects 𝐴 𝑌
Rheia 0 0 Leto 0 0
Kronos 0 1 Ares 1 1
Demeter 0 0 Athena 1 1
Hades 0 0 Hephaestus 1 1
Hestia 1 0 Aphrodite 1 1
Poseidon 1 0 Cyclope 1 1
Hera 1 0 Persephone 1 1
Zeus 1 1 Hermes 1 0
Artemis 0 1 Hebe 1 0
Apollo 0 1 Dionysus 1 0

• Some information on counterfactuals is missing


More notations

Risk in treated
• 𝑃 (𝑌 = 1 | 𝐴 = 1) → proportion of individuals who developed the outcome 𝑌 among
those who received the treatment (i.e. 𝐴 = 1)

Risk in untreated
• 𝑃 (𝑌 = 1 | 𝐴 = 0) → proportion of individuals who developed the outcome 𝑌 among
those who received no treatment (i.e. 𝐴 = 0)



Causation versus association

• Treatment 𝐴 and outcome 𝑌 are associated if

𝑃 (𝑌 = 1 |𝐴 = 1) ≠ 𝑃 (𝑌 = 1 |𝐴 = 0)

• If 𝐴 and 𝑌 are not associated, they are independent

𝑃 (𝑌 = 1 |𝐴 = 1) = 𝑃 (𝑌 = 1 |𝐴 = 0)
• (Whether treated or not, the probability of the outcome is the same, so the treatment and the outcome are independent)
• Independence is represented by 𝐴 ⊥⊥ 𝑌 or 𝑌 ⊥⊥ 𝐴



Measures of association

• Risk difference

    𝑃 (𝑌 = 1 | 𝐴 = 1) − 𝑃 (𝑌 = 1 | 𝐴 = 0)

• Risk ratio

    𝑃 (𝑌 = 1 | 𝐴 = 1) / 𝑃 (𝑌 = 1 | 𝐴 = 0)

• Odds ratio

    [𝑃 (𝑌 = 1 | 𝐴 = 1)/𝑃 (𝑌 = 0 | 𝐴 = 1)] / [𝑃 (𝑌 = 1 | 𝐴 = 0)/𝑃 (𝑌 = 0 | 𝐴 = 0)]



Our hypothetical data

• (Under the null of no association, 𝐴 and 𝑌 are independent and 𝑅𝐷 = 0, 𝑅𝑅 = 1, 𝑂𝑅 = 1; under the alternative they are associated and 𝑅𝐷 ≠ 0, 𝑅𝑅 ≠ 1, 𝑂𝑅 ≠ 1)
• Risk difference

    𝑃 (𝑌 = 1 | 𝐴 = 1) − 𝑃 (𝑌 = 1 | 𝐴 = 0) = 7/13 − 3/7 = 0.110

• Risk ratio

    𝑃 (𝑌 = 1 | 𝐴 = 1) / 𝑃 (𝑌 = 1 | 𝐴 = 0) = (7/13)/(3/7) = 1.256

• Odds ratio

    [𝑃 (𝑌 = 1 | 𝐴 = 1)/𝑃 (𝑌 = 0 | 𝐴 = 1)] / [𝑃 (𝑌 = 1 | 𝐴 = 0)/𝑃 (𝑌 = 0 | 𝐴 = 0)] = [(7/13)/(6/13)] / [(3/7)/(4/7)] = 1.556

• Since 𝑅𝐷 ≠ 0, 𝑅𝑅 ≠ 1, and 𝑂𝑅 ≠ 1, the treatment 𝐴 and the outcome 𝑌 are associated (these numbers are reproduced in the sketch below)
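• A minimal R sketch that reproduces these associational measures from the observed data (𝐴, 𝑌 ) of the 20 subjects; object names are illustrative:

    # Observed treatment A and outcome Y, left column of the earlier table first, then the right column
    A <- c(0, 0, 0, 0, 1, 1, 1, 1, 0, 0,  0, 1, 1, 1, 1, 1, 1, 1, 1, 1)
    Y <- c(0, 1, 0, 0, 0, 0, 0, 1, 1, 1,  0, 1, 1, 1, 1, 1, 1, 0, 0, 0)

    r1 <- mean(Y[A == 1])   # P(Y = 1 | A = 1) = 7/13
    r0 <- mean(Y[A == 0])   # P(Y = 1 | A = 0) = 3/7
    c(risk_difference = r1 - r0,                              # ~0.110
      risk_ratio      = r1 / r0,                              # ~1.256
      odds_ratio      = (r1 / (1 - r1)) / (r0 / (1 - r0)))    # ~1.556

• Contrasted with the causal measures obtained from the counterfactuals (risk difference 0, risk ratio 1, odds ratio 1), the associational measures suggest an effect even though the average causal effect is null.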



Association is not causation

• Association compares the risks in two disjoint subsets of the population, determined by the individuals' actual treatment values
• Causation compares the risks in the entire population under two different treatment values
• Association measures are defined using conditional probabilities (the population is split into the subgroups 𝐴 = 1 and 𝐴 = 0, and we condition on them)
• Measures of causal effect are defined using marginal probabilities (the whole population is considered under each treatment value, so there is nothing to condition on)



Causation versus association



Causation versus correlation



Causal estimands



Causal estimands

• An estimand is the parameter that represents the causal effect of interest; it is a function of the counterfactuals
• Estimands are usually formulated based on causal assumptions and on considerations of the study design
• Consider a population of size 𝑁 with 𝐴𝑖 = 𝐴 for all 𝑖 , i.e., all units of the population receive their treatment from the same set of treatment levels 𝐴



Causal estimands

• The individual treatment effect (ITE) for unit 𝑖

    𝜏𝑖 = 𝑌𝑖 (1) − 𝑌𝑖 (0)

• The conditional average treatment effect (CATE)

    𝜏 (ℓ) = 𝐸[𝑌𝑖 (1) − 𝑌𝑖 (0) | 𝐿 = ℓ]

• The average treatment effect (ATE)

    𝜏 = 𝐸[𝑌𝑖 (1) − 𝑌𝑖 (0)] = 𝐸𝐿 [𝜏 (ℓ)]

• The average treatment effects for the treated units (ATT) and for the control units (ATC)

    𝜏ATT = 𝐸[𝑌𝑖 (1) − 𝑌𝑖 (0) | 𝐴𝑖 = 1]  and  𝜏ATC = 𝐸[𝑌𝑖 (1) − 𝑌𝑖 (0) | 𝐴𝑖 = 0]

  (an R sketch computing these quantities from simulated counterfactuals follows below)
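• A minimal R sketch of these estimands, assuming simulated counterfactuals with a binary covariate 𝐿 and a treatment assignment that depends on 𝐿; the data-generating choices below are purely illustrative:

    set.seed(1)
    N  <- 10000
    L  <- rbinom(N, 1, 0.4)                 # binary covariate
    Y0 <- rnorm(N, mean = L)                # potential outcome under a = 0
    Y1 <- Y0 + 1 + 0.5 * L                  # potential outcome under a = 1
    A  <- rbinom(N, 1, plogis(-0.5 + L))    # treatment assignment depends on L

    tau_i <- Y1 - Y0                        # individual treatment effects (ITE)
    ATE   <- mean(tau_i)                    # approx. 1 + 0.5 * P(L = 1) = 1.2
    CATE  <- tapply(tau_i, L, mean)         # tau(l) for l = 0 and l = 1
    ATT   <- mean(tau_i[A == 1])            # average effect among the treated
    ATC   <- mean(tau_i[A == 0])            # average effect among the controls
    c(ATE = ATE, ATT = ATT, ATC = ATC)
    CATE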



Causal estimands

• Causal effects defined by potential outcomes, not model parameters
• Average difference of the pair of potential outcomes, averaged over the entire population:

    𝜏fs = (1/𝑁) ∑_{𝑖=1}^{𝑁} (𝑌𝑖 (1) − 𝑌𝑖 (0)) = (1/𝑁) ∑_{𝑖=1}^{𝑁} (𝑌𝑖𝑎=1 − 𝑌𝑖𝑎=0 )

• Average treatment effect for a subpopulation, e.g., for females:

    𝜏fs (𝑓) = (1/𝑁 (𝑓)) ∑_{𝑖∶𝑋𝑖 =𝑓} (𝑌𝑖 (1) − 𝑌𝑖 (0))

  • 𝑋𝑖 ∈ {𝑓, 𝑚} is a covariate, where 𝑓 is for female and 𝑚 is for male
  • 𝑁 (𝑓) = ∑_{𝑖=1}^{𝑁} 𝐼(𝑋𝑖 = 𝑓) is the number of females in the population
Causal estimands

• Average treatment effect for the treated, i.e., those who were exposed to the treatment:

    𝜏fs,t = (1/𝑁𝑡 ) ∑_{𝑖∶𝐴𝑖 =1} (𝑌𝑖 (1) − 𝑌𝑖 (0))

  • 𝑁𝑡 = ∑_{𝑖=1}^{𝑁} 𝐼(𝐴𝑖 = 1) is the number of units exposed to the treatment

• Interest is in the average effect of the job-training program on hourly wages, averaged over only those who would have been employed irrespective of the level of treatment:

    𝜏fs,pos = (1/𝑁pos ) ∑_{𝑖∶𝑌𝑖 (1)>0, 𝑌𝑖 (0)>0} (𝑌𝑖 (1) − 𝑌𝑖 (0))

  • 𝑁pos = ∑_{𝑖=1}^{𝑁} 𝐼(𝑌𝑖 (1) > 0, 𝑌𝑖 (0) > 0)



Causal estimands

• In general, a causal estimand is expressed as a row-exchangeable¹ function of all potential outcomes for all units, all treatment assignments, and pre-treatment variables:

    𝜏 = 𝜏 (𝑌 (0), 𝑌 (1), 𝑋 , 𝐴 )

  • 𝑌 (0) and 𝑌 (1) are 𝑁 -component column vectors of potential outcomes
  • 𝐴 is the 𝑁 -component column vector of treatment assignments
  • 𝑋 is an 𝑁 × 𝐾 matrix of pre-treatment variables (covariates)

¹ a function that remains invariant under any permutation of its rows
Key concepts

• Counterfactuals/potential outcomes
• Individual causal effect
• Average causal effect
• Effect measure
• Association and related measures



Key concepts

• Causal inference requires information on counterfactual outcomes, but real-world


(observed) data only contain partial information on counterfactual outcomes
• The question of interest is under what conditions real-world data can be used for
causal inference



Holland (1986)

• §1: Introduction
• §2: Model for associational inference
• §3: Rubin’s model for causal inference
• §4: Some special cases of causal inference
• §5: Comments on selected philosophers
• §6: Comments from a few statisticians
• §7: What can be a cause?
• §8: Comments on causal inference in various disciplines
• §9: Summary



Homework (The Kidney Stone Data)

• Charig et al. (1986) discussed a study about treatments for kidney stones
• 𝑍 represents the treatment (𝑍 = 1 for open surgical procedure and 𝑍 = 0 for small
puncture procedure)
• Outcome 𝑌 is binary (1 for success and 0 for failure)
• Data on another variable 𝑋 , the size of the stone, is also available (0 for small and 1 for
large stones)
• Can you show Simpson’s paradox using this data set?
• Explain the paradox after examining the association between (𝑋 and 𝑍 ) and (𝑋 and 𝑌 )



Kidney stone data

X Z Y n
0 1 1 81
0 1 0 6
0 0 1 234
0 0 0 36
1 1 1 192
1 1 0 71
1 0 1 55
1 0 0 25
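• A minimal R sketch of the requested check, built directly from the table above (object names are illustrative):

    # Kidney stone data: X = stone size (1 = large), Z = treatment (1 = open surgery), Y = success
    kidney <- data.frame(
      X = c(0, 0, 0, 0, 1, 1, 1, 1),
      Z = c(1, 1, 0, 0, 1, 1, 0, 0),
      Y = c(1, 0, 1, 0, 1, 0, 1, 0),
      n = c(81, 6, 234, 36, 192, 71, 55, 25)
    )

    # Success rate of each treatment within each stone size: Z = 1 does better in both strata
    with(kidney, tapply(n * Y, list(size = X, trt = Z), sum) /
                 tapply(n,     list(size = X, trt = Z), sum))

    # Overall success rate by treatment: the ordering reverses (Simpson's paradox)
    with(kidney, tapply(n * Y, Z, sum) / tapply(n, Z, sum))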



Homework (Berkeley Admissions)

• Bickel et al. (1975) discussed sex bias in graduate admissions at UC Berkeley
• The data can be loaded in R from datasets::UCBAdmissions
• Examine whether there is a significant difference in graduate admission rates between males and females
• Examine the association between admission and sex within each department, and draw a conclusion (a sketch follows below)
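• A minimal R sketch for this homework; it assumes the standard layout of datasets::UCBAdmissions (a 2 × 2 × 6 table with dimensions Admit, Gender, and Dept):

    # Marginal admission counts and rates by gender (aggregating over departments)
    overall <- apply(UCBAdmissions, c(1, 2), sum)      # Admit x Gender table
    prop.table(overall, margin = 2)                    # admission rate within each gender
    chisq.test(overall)                                # test of the marginal difference

    # Admission rates by gender within each department
    prop.table(UCBAdmissions, margin = c(2, 3))["Admitted", , ]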



Homework (Generating counterfactuals and estimating the treatment effect)

• Generate counterfactuals corresponding to a binary treatment according to the following:

    𝑌 (0) ∼ 𝑁 (0, 1), 𝜏 = −0.5 + 𝑌 (0), and 𝑌 (1) = 𝑌 (0) + 𝜏

• Determine the treatment effect of 𝐴 = 𝐼(𝜏 ≥ 0) by

    𝐸(𝑌 | 𝐴 = 1) − 𝐸(𝑌 | 𝐴 = 0)

• Determine the effect of a randomly assigned treatment (a sketch follows below)
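• A minimal R sketch of this homework; the sample size and seed are arbitrary choices:

    set.seed(2024)
    n   <- 100000
    Y0  <- rnorm(n)                      # Y(0) ~ N(0, 1)
    tau <- -0.5 + Y0                     # individual treatment effects
    Y1  <- Y0 + tau                      # Y(1) = Y(0) + tau

    # Treatment chosen as A = I(tau >= 0)
    A     <- as.integer(tau >= 0)
    Y_obs <- Y1 * A + Y0 * (1 - A)       # observed outcome (consistency)
    mean(Y_obs[A == 1]) - mean(Y_obs[A == 0])    # naive contrast, far from the ATE

    # Randomly assigned treatment
    A_r     <- rbinom(n, 1, 0.5)
    Y_obs_r <- Y1 * A_r + Y0 * (1 - A_r)
    mean(Y_obs_r[A_r == 1]) - mean(Y_obs_r[A_r == 0])   # close to the true ATE, mean(tau) = -0.5

• The naive contrast is biased because units select treatment based on their own 𝜏; random assignment removes that selection and recovers the ATE of −0.5.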



Acknowledgments

• Class notes of the workshop “Introduction to Causal Inference”, which was held at
the Harvard School of Public Health during June 4-8, 2018
• Hernán MA and Robins JM (2020). “Causal Inference: What If”. Boca Raton: Chapman & Hall/CRC.
• Imbens G and Rubin D (2015). “Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction”. Cambridge University Press.



References I

Bickel, P. J., Hammel, E. A., and O’Connell, J. W. (1975). Sex bias in graduate
admissions: Data from Berkeley: Measuring bias is harder than is usually assumed,
and the evidence is sometimes contrary to expectation. Science, 187(4175):398–404.
Charig, C. R., Webb, D. R., Payne, S. R., and Wickham, J. E. (1986). Comparison of
treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and
extracorporeal shockwave lithotripsy. Br Med J (Clin Res Ed), 292(6524):879–882.
Holland, P. W. (1986). Statistics and causal inference. Journal of the American
Statistical Association, 81(396):945–960.
Simpson, E. H. (1951). The interpretation of interaction in contingency tables. Journal
of the Royal Statistical Society: Series B (Methodological), 13(2):238–241.
