2A.3 Lecture Slides 20: LDV 1
Lecture 20:
Limited Dependent Variables: Probit model
Oleg I. Kitov
[email protected]
Lecture outline
Binary outcome variable
I Previously, we worked with continuous outcome variables Yi ∈ R.
I Now, we would like to model a binary outcome variable Yi ∈ {0, 1}.
I Yi is a Bernoulli random variable: an event either happens or it doesn’t.
I Yi ∼ Ber (pi ) where pi = P (Yi = 1) and 1 − pi = P (Yi = 0).
I Note that pi can vary across individuals i = 1, . . . , n.
I Recall that E [Yi ] = pi and Var (Yi ) = pi (1 − pi ), heteroskedasticity?
I Example: suppose Yi is the vote by individual i in the Brexit referendum:
      Yi = 1 if i voted Brexit,   Yi = 0 if i voted Remain,
  where the explanatory variables and coefficients are collected into the vectors
      Xi = (1, inci, educi, femi)′,   β = (β0, β1, β2, β3)′
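I A quick numerical check of the Bernoulli facts above (the pi values are hypothetical, simulated in Python): the variance pi(1 − pi) changes with pi, which is the heteroskedasticity flagged earlier.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical individual-specific probabilities p_i = P(Y_i = 1).
for p_i in (0.2, 0.5, 0.8):
    y = rng.binomial(1, p_i, size=100_000)   # many Bernoulli(p_i) draws
    print(f"p_i = {p_i:.1f}: sample mean = {y.mean():.3f} (theory {p_i:.3f}), "
          f"sample var = {y.var():.3f} (theory {p_i * (1 - p_i):.3f})")
```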
Model for conditional Yi
I It does not make much sense to model a binary outcome Yi directly.
I Instead, we want to model conditional probability pi = P (Yi = 1 | Xi ).
I We know the linear model for the latent variable Yi∗ = Xi′β + ui.
I We know that the error term ui is a random variable.
I Link between conditional probability pi and the linear model for Yi∗, under the usual convention that Yi = 1 whenever Yi∗ > 0:
      pi = P(Yi = 1 | Xi) = P(Yi∗ > 0 | Xi) = P(ui > −Xi′β | Xi)
I In the linear model, the marginal effect of income is simply the coefficient on income:
      MEinc = ∂Yi / ∂inci = β1
I The marginal effect is equal to β1 and is constant for all individuals i.
Probit model
I Assume the errors are standard normal, ui ∼ iid N (0, 1), with pdf and cdf:
      φ(x) = (1/√(2π)) e^(−x²/2),      Φ(x) = ∫_{−∞}^{x} (1/√(2π)) e^(−s²/2) ds
I Probit model: the model for P(Yi = 1 | Xi) when ui ∼ iid N(0, 1):
      P(Yi = 1 | Xi) = Φ(Xi′β) = Φ(β0 + β1 inci + β2 educi + β3 femi)
I Since Φ (x ) ∈ [0, 1] for any x ∈ R, it must be the case that for any values
of explanatory variables inci , educi , femi , model predictions are in [0, 1]:
      P̂(Yi = 1 | inci, educi, femi) = Φ(β̂0 + β̂1 inci + β̂2 educi + β̂3 femi)
Probit model: computing predicted probabilities
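I A minimal sketch of this computation in Python (the coefficient and covariate values below are hypothetical, not estimates reported in these slides): the predicted probability is Φ evaluated at the fitted index.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical ML estimates (beta0_hat, beta1_hat, beta2_hat, beta3_hat).
beta_hat = np.array([-1.0, 0.02, -0.15, 0.10])

# Hypothetical individual: Xi = (1, inc_i, educ_i, fem_i).
x_i = np.array([1.0, 30.0, 12.0, 1.0])

# Predicted probability P_hat(Yi = 1 | Xi) = Phi(Xi' beta_hat).
index = x_i @ beta_hat
p_hat = norm.cdf(index)
print(f"fitted index = {index:.3f}, predicted probability = {p_hat:.3f}")
```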
Probit model: marginal (partial) effects [1/5]
I Probit population model for pi = P(Yi = 1 | Xi) is
      pi = Φ(β0 + β1 inci + β2 educi + β3 femi)
I The marginal (partial) effect of income is the derivative of this probability with respect to inci:
      MEinc(Xi) = ∂P(Yi = 1 | Xi) / ∂inci
                = ∂Φ(β0 + β1 inci + β2 educi + β3 femi) / ∂inci
                = β1 φ(β0 + β1 inci + β2 educi + β3 femi)
I Notice that the marginal effect of income now depends on the level of income.
I If the coefficients are estimated with ML, the predicted marginal effect is
      MEinc(Xi) = β̂1 φ(β̂0 + β̂1 inci + β̂2 educi + β̂3 femi)
I Unlike in the linear model, where the marginal effect of income is just β1,
  in the probit model MEinc(Xi) ≠ β1, but it is proportional to β1.
I If β1 > 0, the probability of Yi = 1 increases with income, and a larger β1 implies a steeper increase.
I In probit, MEinc (Xi ) is a function of all of the explanatory variables.
I Marginal effects vary across individuals i = 1, . . . , n.
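I A sketch of these individual-level marginal effects in Python, using hypothetical estimates and hypothetical individuals: β̂1 φ(Xi′β̂) varies across i, unlike the constant marginal effect β1 of the linear model.

```python
import numpy as np
from scipy.stats import norm

beta_hat = np.array([-1.0, 0.02, -0.15, 0.10])   # hypothetical estimates

# Hypothetical individuals; rows are Xi = (1, inc_i, educ_i, fem_i).
X = np.array([[1.0, 15.0, 10.0, 0.0],
              [1.0, 30.0, 12.0, 1.0],
              [1.0, 60.0, 16.0, 0.0]])

# ME_inc(Xi) = beta1_hat * phi(Xi' beta_hat): one marginal effect per individual.
me_inc = beta_hat[1] * norm.pdf(X @ beta_hat)
print(me_inc)   # differs across rows, each proportional to beta1_hat
```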
Probit model: marginal (partial) effects [3/5]
Probit model: marginal (partial) effects [4/5]
Probit model: marginal (partial) effects [5/5]
I Marginal (partial) effect (evaluated) at the averages (MEA): the marginal effect of income evaluated at the average values of all explanatory variables, collected in the vector X̄ = (1, average inc, average educ, average fem)′:
      MEAinc = β̂1 φ(X̄′β̂)
I Note that for a categorical variable such as fem, no individual can have femi equal to the sample average of fem: that average lies in (0, 1), since it is the proportion of females in the sample.
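I A sketch of the MEA in Python with hypothetical estimates and hypothetical sample averages (including a mean of fem that is a proportion, not a value any individual takes):

```python
import numpy as np
from scipy.stats import norm

beta_hat = np.array([-1.0, 0.02, -0.15, 0.10])   # hypothetical estimates
x_bar = np.array([1.0, 28.0, 11.5, 0.52])        # (1, mean inc, mean educ, mean fem)

# MEA of income: beta1_hat * phi(x_bar' beta_hat).
mea_inc = beta_hat[1] * norm.pdf(x_bar @ beta_hat)
print(f"marginal effect of income at the averages: {mea_inc:.4f}")
```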
Estimating probit model [1/5]
      pi = P(Yi = 1 | Xi) = Φ(Xi′β)
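I The coefficients are estimated by maximum likelihood; the criterion is the standard probit log-likelihood ℓ(β) = ∑i [Yi log Φ(Xi′β) + (1 − Yi) log(1 − Φ(Xi′β))]. A minimal Python sketch of this function (the argument names are illustrative):

```python
import numpy as np
from scipy.stats import norm

def probit_negloglik(beta, y, X):
    """Negative probit log-likelihood:
    -sum_i [ y_i log Phi(Xi'beta) + (1 - y_i) log(1 - Phi(Xi'beta)) ]."""
    index = X @ beta
    # logcdf/logsf give log Phi and log(1 - Phi) in a numerically stable way.
    return -np.sum(y * norm.logcdf(index) + (1 - y) * norm.logsf(index))
```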
Estimating probit model [3/5]
I Recall that the derivative of the cdf is the pdf, Φ′(Xi′β) = φ(Xi′β).
I The maximum likelihood estimator β̂ satisfies the first order condition:
      ∑_{i=1}^{n}  [ φ(Xi′β̂) / ( Φ(Xi′β̂)(1 − Φ(Xi′β̂)) ) ] (Yi − Φ(Xi′β̂)) Xi = 0.
Estimating probit model [4/5]
I Note that Φ(Xi′β̂) is the predicted probability P̂(Yi = 1 | Xi).
I In the first-order condition the following term is the predicted residual:
      ûi = Yi − Φ(Xi′β̂) = Yi − P̂(Yi = 1 | Xi)
I Note that the first term can be interpreted as a weight:
      ωi = φ(Xi′β̂) / [ Φ(Xi′β̂)(1 − Φ(Xi′β̂)) ].
I The first-order condition of the MLE can therefore be written as a weighted sum of the products of the prediction errors and the explanatory variables:
      ∑_{i=1}^{n}  [ φ(Xi′β̂) / ( Φ(Xi′β̂)(1 − Φ(Xi′β̂)) ) ] (Yi − Φ(Xi′β̂)) Xi  =  ∑_{i=1}^{n} ωi ûi Xi = 0.
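I A sketch in Python: simulate data from a probit model with known coefficients, maximise the log-likelihood numerically, and check that the weighted first-order condition above is (numerically) zero at β̂. All names and numbers are illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

# Simulated data: latent Yi* = Xi'beta + ui with ui ~ N(0, 1), Yi = 1{Yi* > 0}.
n, beta_true = 5_000, np.array([-0.5, 0.8, -0.4])
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = (X @ beta_true + rng.normal(size=n) > 0).astype(float)

def negloglik(beta):
    idx = X @ beta
    return -np.sum(y * norm.logcdf(idx) + (1 - y) * norm.logsf(idx))

beta_hat = minimize(negloglik, np.zeros(3), method="BFGS").x

# Weights, residuals, and the first-order condition sum_i w_i * u_hat_i * Xi.
idx = X @ beta_hat
w = norm.pdf(idx) / (norm.cdf(idx) * (1.0 - norm.cdf(idx)))
u_hat = y - norm.cdf(idx)
print(beta_hat)           # close to beta_true
print(X.T @ (w * u_hat))  # each component approximately zero at the MLE
```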
Estimating probit model [5/5]
Testing hypotheses about probit coefficients [1/2]
I Consider the estimated probit model for the Brexit vote:
      P̂(Yi = 1 | inci, educi, femi) = Φ(β̂0 + β̂1 inci + β̂2 educi + β̂3 femi)
I Let lu and lr denote the maximised log-likelihoods of the unrestricted and restricted models. To test a single restriction (e.g. H0: β3 = 0), the likelihood ratio statistic is asymptotically χ²(1)-distributed:
      LR = −2(lr − lu) ∼ χ²(1)
I To test two restrictions jointly (e.g. H0: β2 = β3 = 0), it is asymptotically χ²(2)-distributed:
      LR = −2(lr − lu) ∼ χ²(2)
I In general, for q restrictions, the likelihood ratio test statistic follows a χ²(q) distribution asymptotically:
      LR = −2(lr − lu) ∼ χ²(q)
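I A minimal sketch of the LR computation in Python (the log-likelihood values and q are hypothetical):

```python
from scipy.stats import chi2

l_u, l_r = -612.4, -619.8   # hypothetical unrestricted / restricted log-likelihoods
q = 2                       # number of restrictions under H0

LR = -2.0 * (l_r - l_u)     # likelihood ratio statistic
p_value = chi2.sf(LR, df=q) # asymptotic chi-squared(q) p-value
print(f"LR = {LR:.2f}, p-value = {p_value:.4f}")
```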