The Stata Journal (2011)

11, Number 3, pp. 368–385

Impact of interventions on discrete outcomes: Maximum likelihood estimation of the
binary choice models with binary endogenous regressors

Michael Lokshin
The World Bank
Washington, DC
[email protected]

Zurab Sajaia
The World Bank
Washington, DC
[email protected]

Abstract. In this article, we describe the switch_probit command, which implements
the maximum likelihood method to fit the model of the binary choice with
binary endogenous regressors.
Keywords: st0233, switch_probit, endogenous variables, maximum likelihood,
limited-dependent variables, binary choice models, impact evaluation, marginal
treatment effect

1 Introduction
In this article, we describe the implementation of a maximum likelihood (ML) estimator
of the parameters of binary choice models with endogenous regressors. In these models,
a switching equation sorts individuals over two different states (with only one regime or
outcome observed for each individual). The econometric problem of fitting a model with
endogenous switching and binary endogenous regressors arises in a variety of settings,
including the modeling of the effects of fertility and migration on female labor force
participation (LFP), the modeling of housing demand, and the modeling of markets in
disequilibrium. For example,

• The paper by Aakvik, Heckman, and Vytlacil (2000) formulates an econometric
framework for studying the impact of interventions on discrete outcomes when
responses to treatment vary among observationally identical persons. Using this
framework, the paper evaluates the effect of Norwegian Vocational Rehabilitation
training programs on employment outcomes for women. The results demonstrate
the positive effect of the training programs on employment after controlling for
observable characteristics of the applicants. However, after controlling for unob-
served characteristics of the applicants, the average treatment effect became neg-
ative, indicating that participants in the program have lower employment rates
than nonparticipants.

© 2011 StataCorp LP st0233

• Carrasco (2001) estimates the causal effect of fertility on LFP of females in the
United States using the 1986–1989 rounds of the University of Michigan Panel
Study of Income Dynamics (PSID). The paper finds that the probability of LFP in
women falls more in the model that accounts for endogenous fertility than in the
model with exogenous fertility. The paper points to a downward bias induced by
the exogeneity assumption of children variables that introduces a spurious positive
correlation between fertility and LFP decisions.

• Lokshin and Glinskaya (2009) assess the impact of male migration on the labor
market behavior of females in Nepal. The results indicate that male migration has
a negative impact on the level of LFP by the women left behind. The paper finds
evidence of substantial heterogeneity (based both on observable and unobservable
characteristics) in the impact of male migration.

Models with endogenous switching can be fit one branch (selection equation and
outcome equation) at a time (two heckprob estimations) or by simultaneous ML
estimations (see [R] biprobit and [R] heckprob). However, both of these methods are
inefficient, and the biprobit command is restrictive in that it assumes an equality of
coefficients in the outcome equations for both treatment regimes. In addition, these
approaches require potentially cumbersome adjustments to derive consistent standard
errors. The switch_probit command, on the other hand, implements the full-information
ML method to simultaneously estimate the binary selection and the binary outcome
parts of the model to yield consistent standard errors of the estimates. This approach
relies on an assumption of joint normality of the error terms in the selection and outcome
equations. The switch_probit command also derives the average treatment effects
(the average effects of treatment on the treated and on the untreated) and the marginal
treatment effects.
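
To make the branch-by-branch alternative concrete, the following is a minimal sketch of
the two separate heckprob fits. It assumes the example dataset and variable names used
in section 5 (works, migrant, pmigrants, and the listed covariates). The indicator stays
and the global macros xvars and zvars are hypothetical helpers introduced here for
illustration only; they are not part of switch_probit.

```stata
* Sketch only: fit each regime separately with heckprob (official Stata).
* Assumes the section 5 example data; stays, xvars, and zvars are illustrative names.
use switch_probit_example, clear

global xvars age age2 wedu_2-wedu_5 hhsize hhsize2 reg_*
global zvars $xvars pmigrants

* Regime 1: the outcome is treated as observed only when the husband migrates
heckprob works $xvars, select(migrant = $zvars)

* Regime 0: the outcome is treated as observed only when the husband stays
generate byte stays = 1 - migrant
heckprob works $xvars, select(stays = $zvars)
```

Because the second fit models selection into staying rather than migrating, the rho it
reports has the opposite sign from the correlation between the migration error and the
regime 0 outcome error. The two fits are estimated independently of each other, which
is the source of the inefficiency noted above.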

2 Model
Consider a model that describes the behavior of an agent with two binary outcome
equations and a criterion function $T_i$ that determines which regime the agent faces.
$T_i$ can be interpreted as a treatment. A motivating example is the effect of husband
migration on wife’s LFP. Here the treatment (migration of a husband) and the outcome
(whether the wife works outside the home) can take one of the two potential values:

$$
T_i = \begin{cases} 1 & \text{if } \gamma Z_i + \mu_i > 0 \\ 0 & \text{if } \gamma Z_i + \mu_i \le 0 \end{cases} \tag{1}
$$

$$
y^*_{1i} = \beta_1 X_{1i} + \varepsilon_{1i}, \qquad y_{1i} = I(y^*_{1i} > 0) \tag{2}
$$

$$
y^*_{0i} = \beta_0 X_{0i} + \varepsilon_{0i}, \qquad y_{0i} = I(y^*_{0i} > 0) \tag{3}
$$

Observed $y_i$ is defined as

$$
y_i = \begin{cases} y_{1i} & \text{if } T_i = 1 \\ y_{0i} & \text{if } T_i = 0 \end{cases}
$$

where $y^*_{1i}$ and $y^*_{0i}$ are the latent variables (wife’s propensity for LFP) that determine
the observed binary outcomes $y_1$ and $y_0$ (whether the wife works or not); $X_1$ and $X_0$
are vectors of weakly exogenous variables; $Z$ is a vector of variables that determines a
switch between the regimes; $\beta_1$, $\beta_0$, and $\gamma$ are vectors of parameters; and $\mu_i$,
$\varepsilon_{1i}$, and $\varepsilon_{0i}$ are the error terms. Assume that $\mu_i$, $\varepsilon_{1i}$, and $\varepsilon_{0i}$ are jointly
normally distributed, with a mean-zero vector and correlation matrix

$$
\Omega = \begin{pmatrix} 1 & \rho_0 & \rho_1 \\ & 1 & \rho_{10} \\ & & 1 \end{pmatrix} \tag{4}
$$

where $\rho_0$ and $\rho_1$ are the correlations between $\varepsilon_0$, $\mu$ and $\varepsilon_1$, $\mu$, and $\rho_{10}$ is the
correlation between $\varepsilon_0$ and $\varepsilon_1$.
Because $y_{1i}$ and $y_{0i}$ are never observed simultaneously, the joint distribution of
$(\varepsilon_0, \varepsilon_1)$ is not identified, and consequently, $\rho_{10}$ cannot be estimated. We assume that
$\rho_{10} = 1$ ($\gamma$ is estimable only up to a scalar factor). This model is identified by
nonlinearities of its functional form. The log-likelihood function for the simultaneous system
of equations [(1)–(3)] is

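$$
\begin{aligned}
\ln(\mathcal{L}) ={}& \sum_{T_i \neq 0,\, y_i \neq 0} w_i \ln\{\Phi_2(X_{1i}\beta_1,\, Z_i\gamma,\, \rho_1)\} \\
&+ \sum_{T_i \neq 0,\, y_i = 0} w_i \ln\{\Phi_2(-X_{1i}\beta_1,\, Z_i\gamma,\, -\rho_1)\} \\
&+ \sum_{T_i = 0,\, y_i \neq 0} w_i \ln\{\Phi_2(X_{0i}\beta_0,\, -Z_i\gamma,\, -\rho_0)\} \\
&+ \sum_{T_i = 0,\, y_i = 0} w_i \ln\{\Phi_2(-X_{0i}\beta_0,\, -Z_i\gamma,\, \rho_0)\}
\end{aligned}
$$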

where $\Phi_2$ is the cumulative function of a bivariate normal distribution and $w_i$ is an
optional weight for observation $i$. To ensure that estimated $\rho_1$, $\rho_0$ are bounded between
−1 and 1, the ML directly estimates atanh(ρ):

$$
\operatorname{atanh} \rho_j = \frac{1}{2} \ln\left(\frac{1 + \rho_j}{1 - \rho_j}\right), \qquad j = 0, 1 \tag{5}
$$
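
As an illustration of how the four terms of the log likelihood map onto the data, the
following sketch recomputes the unweighted log likelihood by hand after a fit. It
assumes the section 5 example (outcome works, treatment migrant), that switch_probit
has just been run without weights, and that the predict statistics xb1, xb0, and zb and
the saved scalars e(rho1) and e(rho0) behave as documented in sections 3.3 and 4. The
names xb1_hat, xb0_hat, zg_hat, lnf, s_rho1, and s_rho0 are illustrative.

```stata
* Sketch: rebuild the log likelihood term by term after switch_probit
* (assumes an unweighted fit of the section 5 model is in memory)
predict double xb1_hat if e(sample), xb1
predict double xb0_hat if e(sample), xb0
predict double zg_hat  if e(sample), zb
scalar s_rho1 = e(rho1)
scalar s_rho0 = e(rho0)

* One contribution per observation, chosen by its (T, y) cell
generate double lnf = .
replace lnf = ln(binormal( xb1_hat,  zg_hat,  scalar(s_rho1))) if migrant == 1 & works == 1
replace lnf = ln(binormal(-xb1_hat,  zg_hat, -scalar(s_rho1))) if migrant == 1 & works == 0
replace lnf = ln(binormal( xb0_hat, -zg_hat, -scalar(s_rho0))) if migrant == 0 & works == 1
replace lnf = ln(binormal(-xb0_hat, -zg_hat,  scalar(s_rho0))) if migrant == 0 & works == 0

* The sum should be close to e(ll) if the assumptions above hold
summarize lnf if e(sample), meanonly
display "hand-computed ll = " r(sum) "   e(ll) = " e(ll)
```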

After estimating the model’s parameters, the following statistics can be calculated
(Aakvik, Heckman, and Vytlacil 2000):

• The effect of the treatment on the treated, or the expected effect of the treatment
on individuals with observed characteristics x who participated in the program
(TT):¹

$$
\mathrm{TT}(x) = \Pr(y_1 = 1 \mid T = 1, X = x) - \Pr(y_0 = 1 \mid T = 1, X = x)
= \frac{\Phi_2(X_1\beta_1,\, Z\gamma,\, \rho_1) - \Phi_2(X_0\beta_0,\, Z\gamma,\, \rho_0)}{F(Z\gamma)} \tag{6}
$$

where F is a cumulative function of the univariate normal distribution.

• The effect of the treatment on the untreated (TU), which is the expected effect of
the treatment on individuals with observed characteristics x who did not participate
in the program:

$$
\mathrm{TU}(x) = \Pr(y_1 = 1 \mid T = 0, X = x) - \Pr(y_0 = 1 \mid T = 0, X = x)
= \frac{\Phi_2(X_1\beta_1,\, -Z\gamma,\, -\rho_1) - \Phi_2(X_0\beta_0,\, -Z\gamma,\, -\rho_0)}{F(-Z\gamma)} \tag{7}
$$

• The treatment effect (TE), which is the expected effect of the treatment for the
person with observed characteristics x randomly drawn from the population:

$$
\mathrm{TE}(x) = \Pr(y_1 = 1 \mid X = x) - \Pr(y_0 = 1 \mid X = x) = F(X_1\beta_1) - F(X_0\beta_0) \tag{8}
$$

• The marginal treatment effect (MTE), which is the effect of the treatment on
individuals with observed characteristics x and unobserved characteristics $\bar{\mu}$:

$$
\mathrm{MTE}(x, \bar{\mu}) = \Pr(y_1 = 1 \mid X = x, \mu = \bar{\mu}) - \Pr(y_0 = 1 \mid X = x, \mu = \bar{\mu})
= F\left(\frac{X_1\beta_1 + \rho_1\bar{\mu}}{\sqrt{1 - \rho_1^2}}\right) - F\left(\frac{X_0\beta_0 + \rho_0\bar{\mu}}{\sqrt{1 - \rho_0^2}}\right) \tag{9}
$$

• The average treatment effects (ATT, ATU, and ATE) for the corresponding subgroups
of the population, which can be calculated by averaging (6) through (8)
over the observations in the subgroups. For example, the average treatment effect
on the treated (ATT), which is the mean effect of the treatment on those who
actually participated in the program, is

$$
\mathrm{ATT} = \frac{1}{N_T} \sum_{i=1}^{N_T} \mathrm{TT}(x_i)
$$

where $N_T$ is the number of observations with T = 1 (number of treated individuals).

1. The treatment effect statistics are defined only for the cases when the exogenous variables in (2)
and (3) are the same, in other words, when $X_0 = X_1$. When $X_0 \neq X_1$, the treatment effect
statistics are calculated based on the vector of explanatory variables, which is a union of variables
in $X_0$ and $X_1$. The coefficients corresponding to the variables that were not initially included in
the sets of explanatory variables for either of the equations are set to zero.
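
To make formulas (6) through (9) concrete, here is a minimal numerical sketch in Stata
for a single hypothetical individual. The index values and correlations below are
made-up illustrations, not estimates from this article; binormal() and normal() are
Stata's built-in bivariate and univariate normal cumulative distribution functions.

```stata
* Hypothetical inputs for one individual (illustrative values only):
* xb1 = X1*beta1, xb0 = X0*beta0, zg = Z*gamma, mu_bar = unobservable for the MTE
scalar xb1    = 0.30
scalar xb0    = 0.10
scalar zg     = -0.50
scalar rho1   = -0.20
scalar rho0   = -0.60
scalar mu_bar = 0.00

* Equation (6): effect of the treatment on the treated
scalar TT = (binormal(xb1, zg, rho1) - binormal(xb0, zg, rho0)) / normal(zg)

* Equation (7): effect of the treatment on the untreated
scalar TU = (binormal(xb1, -zg, -rho1) - binormal(xb0, -zg, -rho0)) / normal(-zg)

* Equation (8): treatment effect for a randomly drawn individual
scalar TE = normal(xb1) - normal(xb0)

* Equation (9): marginal treatment effect at mu_bar
scalar MTE = normal((xb1 + rho1*mu_bar)/sqrt(1 - rho1^2)) - normal((xb0 + rho0*mu_bar)/sqrt(1 - rho0^2))

display "TT = " TT "  TU = " TU "  TE = " TE "  MTE = " MTE
```

A useful consistency check on the signs is that normal(zg)*TT + normal(-zg)*TU equals
TE, because the treated and untreated groups together make up the population.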

2.1 Additional methods and formulas

The probability of being treated (the probability for a husband to migrate):

$$
\Pr(T = 1 \mid z) = F(z\gamma) \tag{10}
$$

The probability of being treated and having a positive outcome (the probability for
a husband to migrate and for his wife to work):

$$
\Pr(T = 1, y = 1 \mid X = x) = \Phi_2(Z\gamma,\, X_1\beta_1,\, \rho_1) \tag{11}
$$

The probability of being treated and having a zero outcome (the probability for a
husband to migrate and for his wife not to work):

$$
\Pr(T = 1, y = 0 \mid X = x) = \Phi_2(Z\gamma,\, -X_1\beta_1,\, -\rho_1) \tag{12}
$$

The probability of not being treated and having a positive outcome (the probability
for a husband not to migrate and for his wife to work):

$$
\Pr(T = 0, y = 1 \mid X = x) = \Phi_2(-Z\gamma,\, X_0\beta_0,\, -\rho_0) \tag{13}
$$

The probability of not being treated and having a zero outcome (the probability
for a husband not to migrate and for his wife not to work):

$$
\Pr(T = 0, y = 0 \mid X = x) = \Phi_2(-Z\gamma,\, -X_0\beta_0,\, \rho_0) \tag{14}
$$

The probability of having a positive outcome conditional on being treated (the
probability for a wife of a migrant to work, conditional on her husband being a migrant):

$$
\Pr(y = 1 \mid T = 1, X = x) = \frac{\Phi_2(Z\gamma,\, X_1\beta_1,\, \rho_1)}{F(Z\gamma)} \tag{15}
$$

The probability of having a positive outcome conditional on not being treated (the
probability for a wife of a nonmigrant to work, conditional on her husband being a
nonmigrant):

$$
\Pr(y = 1 \mid T = 0, X = x) = \frac{\Phi_2(-Z\gamma,\, X_0\beta_0,\, -\rho_0)}{F(-Z\gamma)} \tag{16}
$$
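
A quick way to check the sign conventions of the four joint probabilities (11) through
(14) is to confirm that they sum to one. The sketch below does this in Stata for
hypothetical index values and correlations (all numbers are illustrative assumptions);
binormal() is Stata's built-in bivariate normal cumulative distribution function.

```stata
* Hypothetical inputs (illustrative values only)
scalar zg   = 0.25
scalar xb1  = 0.40
scalar xb0  = -0.10
scalar rho1 = -0.20
scalar rho0 = -0.60

* Joint probabilities of the four (T, y) cells; the (T = 0, y = 0) cell uses
* +rho0, matching the last term of the log-likelihood function in section 2
scalar p11 = binormal( zg,  xb1,  rho1)
scalar p10 = binormal( zg, -xb1, -rho1)
scalar p01 = binormal(-zg,  xb0, -rho0)
scalar p00 = binormal(-zg, -xb0,  rho0)

* The treated cells should add up to Pr(T = 1) = normal(zg),
* and all four cells should add up to 1
scalar p_treated = p11 + p10
scalar p_all     = p11 + p10 + p01 + p00
display "p11 + p10 = " p_treated "   normal(zg) = " normal(zg)
display "sum of all four cells = " p_all
```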

3 The switch_probit command

3.1 Syntax
switch_probit is implemented as a d2 ML evaluator that calculates the overall log
likelihood along with its first and second derivatives. The command allows for weights,
clustering, robust standard errors, and the full set of options associated with Stata’s ML
procedures. The generic syntax² for the command is as follows:

switch_probit (depvar1 varlist1) (depvar0 varlist0) [if] [in] [weight],
        select(depvar_s varlist_s) [options]

pweights, fweights, and iweights are allowed; see [U] 11.1.6 weight.
depvar1 is a binary outcome variable in regime 1. varlist1 is a vector of explanatory
variables in the equation explaining outcome in regime 1 [equation (2)]. depvar0 and
varlist0 are, correspondingly, the binary outcome variable and the set of explanatory
variables in regime 0 [equation (3)]. depvar_s is a binary dependent variable in selection
equation (1), and varlist_s is a set of explanatory variables in selection equation (1).
In cases when the explanatory variables in the binary outcome equations are the same
and there is only one dependent variable, only one equation needs to be specified.
Alternatively, when the exogenous variables are different in outcome equations [(2) and
(3)] and the dependent variables are different between the two outcome equations, both
equations must be specified.

3.2 Options
select(depvar_s varlist_s) gives the specification of switching equation (1) for $T_i$.
varlist_s might include the set of instruments that help identify the model. It is an
integral part of the switch_probit estimation and is required. A full specification of
explanatory variables is required for the selection equation (1); in other words, both
instruments and exogenous variables must be specified in varlist_s. If there are no
instrumental variables in the model, the model will be identified by nonlinearities.
noconstant suppresses the constant terms.
offset1(varname), offset0(varname), and offset_s(varname) include variables in
each equation with coefficients constrained to 1.
For more information, see [R] estimation options.
constraints(numlist | matname) applies linear constraints to the fitted model.
collinear keeps collinear variables in the equations. By default, only noncollinear
explanatory variables are used.

2. The syntax of switch_probit is similar to the syntax of the movestay command (Lokshin and Sajaia
2004).

robust specifies that the Huber/White/sandwich estimator of the variance is to be
used in place of the conventional ML variance estimator. robust combined with
cluster() further allows observations that are not independent within cluster
(although they must be independent between clusters). If you specify pweights, then
robust is implied. See [U] 20.16 Obtaining robust variance estimates.
cluster(varname) specifies that the observations are independent across groups (clus-
ters) but not necessarily within groups. varname specifies to which group each
observation belongs; for example, cluster(personid) refers to data with repeated
observations on individuals. Specifying cluster() affects the estimated standard
errors and variance–covariance matrix of the estimators but not the estimated coef-
ficients. cluster() can be used with pweights to produce estimates for unstratified
cluster-sampled data. Specifying cluster() implies robust.
level(#) specifies the confidence level, as a percentage, for confidence intervals. The
default is level(95) or as set by set level; see [U] 20.7 Specifying the width
of confidence intervals.
noskip specifies that a full ML model with only a constant for the regression equation
be fit. This model is not displayed but is used as the base model to compute a
likelihood-ratio test for the model test statistic displayed in the estimation header.
By default, the overall model test statistic is an asymptotically equivalent Wald test
that all the parameters in the regression equation are zero (except the constant).
For many models, this option can substantially increase estimation time.
maximize_options control the maximization process; see [R] maximize. With the
possible exception of iterate(0) and trace, you should specify these options only if the
model is unstable. The maximization uses the difficult option by default. This
option need not be specified.
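
As a hedged illustration of how these options combine, the call below repeats the
section 5 specification with clustered standard errors and 90% confidence intervals.
The cluster variable ward_id is hypothetical (the example dataset may not contain a
variable with this name), and the /// line continuations assume the commands are run
from a do-file.

```stata
* Sketch: section 5 model with the cluster() and level() options described above.
* ward_id is a hypothetical cluster identifier; replace it with your own variable.
use switch_probit_example, clear
switch_probit works age age2 wedu_2-wedu_5 hhsize hhsize2 reg_*, ///
    select(migrant age age2 wedu_2-wedu_5 hhsize hhsize2 reg_* pmigrants) ///
    cluster(ward_id) level(90)

* Inspect the estimated correlations and the number of clusters saved in e()
display "rho1 = " e(rho1) "   rho0 = " e(rho0) "   clusters = " e(N_clust)
```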

3.3 Saved results

switch_probit saves the following in e() (* indicates saved parameters specific for
switch_probit):

Scalars
    e(N)             number of observations
    e(k)             number of parameters
    e(k_eq)          number of equations in e(b)
    e(k_eq_model)    number of equations in overall model test
    e(k_aux)         number of auxiliary parameters
    e(k_dv)          number of dependent variables
    e(df_m)          model degrees of freedom
    e(ll)            log likelihood
    e(ll_0)          log likelihood, constant-only model (noskip only)
    e(N_clust)       number of clusters
    e(chi2_c)        value of the χ2 test on the equality of both correlation
                     coefficients (ρ1, ρ0) to 0
    e(p_c)*          probability of rejecting the χ2 test
    e(p)             significance of comparison test
    e(rho1)*         estimated coefficient of correlation between the error terms of
                     the selection equation and the outcome equations in regime 1
    e(rho0)*         estimated coefficient of correlation between the error terms of
                     the selection equation and the outcome equations in regime 0
    e(rank)          rank of e(V)
    e(rank0)         rank of e(V) for constant-only model
    e(ic)            number of iterations
    e(rc)            return code
    e(converged)     1 if converged, 0 otherwise

Macros
    e(cmd)           switch_probit
    e(depvar)        name of dependent variable
    e(wtype)         weight type
    e(wexp)          weight expression
    e(title)         title in estimation output
    e(clustvar)      name of cluster variable
    e(offset1)       offset for selection equation
    e(offset2)       offset for equation 1
    e(offset3)       offset for equation 0
    e(chi2type)      Wald or LR; type of model χ2 test
    e(chi2_ct)       Wald or LR; type of model χ2 test corresponding to e(chi2_c)
    e(vce)           vcetype specified in vce()
    e(vcetype)       title used to label Std. Err.
    e(opt)           type of optimization
    e(ml_method)     type of ml method
    e(user)          name of likelihood-evaluator program
    e(technique)     maximization technique
    e(crittype)      optimization criterion
    e(properties)    b V
    e(predict)       program used to implement predict

Matrices
    e(b)             coefficient vector
    e(ilog)          iteration log (up to 20 iterations)
    e(gradient)      gradient vector
    e(V)             variance–covariance matrix of the estimators

Functions
    e(sample)        marks estimation sample

4 Postestimation
The predict command can follow switch_probit to calculate the predictive statistics.
The statistics could be both in and out of the sample; type “predict ... if e(sample)
...” to generate statistics for observations in the estimated sample only.

predict [type] newvar [if] [in], statistic

One of the following statistics may be specified with the predict command after
switch_probit:
p11, the default, calculates the probability of being treated and having a positive out-
come [equation (11)].
p10 calculates the probability of being treated and having a zero outcome [equation
(12)].
p01 calculates the probability of not being treated and having a positive outcome [equa-
tion (13)].
p00 calculates the probability of not being treated and having a zero outcome [equation
(14)].
psel calculates the probability of being treated [equation (10)].
pcond1 calculates the probability of a positive outcome conditional on being treated
[equation (15)].
pcond0 calculates the probability of a positive outcome conditional on not being treated
[equation (16)].
zb calculates the probit linear prediction for the selection equation.
xb1 calculates the linear prediction based on the coefficients of the outcome equation in
regime 1.
xb0 calculates the linear prediction based on the coefficients of the outcome equation in
regime 0.
stdpsel calculates the standard error of the linear prediction of the selection equation.
stdp1 calculates the standard error of the linear prediction of regime 1.
stdp0 calculates the standard error of the linear prediction of regime 0.
tt calculates the treatment effect on the treated [equation (6)].
tu calculates the treatment effect on the untreated [equation (7)].
te calculates the treatment effect [equation (8)].
mte calculates the marginal treatment effect [equation (9)].
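
The statistics listed above can be combined as in the following sketch, which assumes
that the section 5 model has just been fit and that migrant is the treatment indicator.
The new variable names (p_treat, p_y1_t1, p_y1_t0, te_hat, tu_hat) are illustrative.

```stata
* Sketch: common predictions after switch_probit (section 5 model assumed in memory)
predict double p_treat if e(sample), psel
predict double p_y1_t1 if e(sample), pcond1
predict double p_y1_t0 if e(sample), pcond0
predict double te_hat  if e(sample), te
predict double tu_hat  if e(sample), tu

* Average treatment effect (ATE) over the estimation sample
summarize te_hat if e(sample)

* Average treatment effect on the untreated (ATU): average of TU over T = 0
summarize tu_hat if e(sample) & migrant == 0
```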

5 Example
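We illustrate the use of the switch_probit command by looking at the problem of
estimating the impact of a husband’s migration on wife’s LFP. A typical empirical
specification for such a model might be the following:

$$
M_i^* = Z_i\gamma + \mu_i, \qquad M_i = I(M_i^* > 0) = I(Z_i\gamma + \mu_i > 0) \tag{17}
$$

$$
\mathrm{LFP}^*_{i1} = X_i\beta_1 + \varepsilon_{1i}, \qquad \mathrm{LFP}^*_{i0} = X_i\beta_0 + \varepsilon_{0i} \tag{18}
$$

$$
W_i = \begin{cases}
I(\mathrm{LFP}^*_{i1} > 0) = I(X_i\beta_1 + \varepsilon_{1i} > 0) & \text{if } M_i = 1 \\
I(\mathrm{LFP}^*_{i0} > 0) = I(X_i\beta_0 + \varepsilon_{0i} > 0) & \text{otherwise}
\end{cases} \tag{19}
$$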

Here $M_i^*$ is a latent continuous variable that determines the propensity of a husband
to migrate; $\mathrm{LFP}^*_{i1}$ and $\mathrm{LFP}^*_{i0}$ are the latent continuous propensities of a wife to work
outside the home if her husband migrates (subscript 1) or stays home (subscript 0); $Z_i$ is
a vector of characteristics that influence the migration decision; and $X_i$ is a vector of
characteristics that are thought to influence the wife’s LFP decision. $\beta_1$, $\beta_0$, and $\gamma$ are
vectors of parameters, and $\mu_i$, $\varepsilon_{1i}$, and $\varepsilon_{0i}$ are the disturbance terms. The observed wife’s
LFP, $W_i$, is a dichotomous realization of latent variable $\mathrm{LFP}^*_{i1}$ if a husband migrates and
of latent variable $\mathrm{LFP}^*_{i0}$ if he does not migrate.
The assumption that is often made in this type of model is that the wife’s decision
to participate in the labor market is endogenous to her husband’s migration decision.
Some unobserved characteristics that influence the probability of a husband to migrate
could also influence the decision of his wife to work or not. Neglecting these selectivity
effects is likely to produce biased estimates of the impact of the husband’s migration on
the wife’s LFP. The simultaneous ML estimation of (17), (18), and (19) with the proper
instrumentation of the migration decision might correct such a bias.
The data for this example are a nonrandom subsample of the data from the
2004 round of the Nepal Living Standards Survey (see, for example, Lokshin and Glinskaya
[2009]). The migration indicator migrant takes on value 1 if the husband migrates and
0 if he stays in the native country. The dependent variables in the wife’s LFP equations,
(19), are binary indicators of whether a wife works if her husband migrates (works_1)
or whether she works if her husband stays (works_0). The set of exogenous variables in
the LFP regressions (19) includes such wife’s characteristics as her age, age-squared,
educational dummies (wedu_2–wedu_5), and regional dummies (reg_2–reg_6). The omitted
category for educational dummies is “illiterate”, and higher-index dummies correspond
to higher levels of wife’s education. In addition to these variables, the migration
equation (17) includes an instrument, pmigrants, to improve identification. The proportion
of migrants in a ward is believed to influence the husband’s migration decision but not
to affect the wife’s LFP decision.
The ML estimation of this specification using the switch_probit command on
switch_probit_example.dta is shown below:

. use switch_probit_example
. switch_probit works age age2 wedu_2-wedu_5 hhsize hhsize2 reg_*,
> select(migrant age age2 wedu_2-wedu_5 hhsize hhsize2 reg_* pmigrants)

Fitting probit model for migrant=1:

(output omitted )
Fitting full model:
Iteration 0: log likelihood = -5631.5963
(output omitted )
Iteration 6: log likelihood = -5588.5707
Switching probit model Number of obs = 5426
Wald chi2(14) = 319.25
Log likelihood = -5588.5707 Prob > chi2 = 0.0000

Coef. Std. Err. z P>|z| [95% Conf. Interval]

migrant
age -.053828 .0101656 -5.30 0.000 -.0737521 -.0339039
age2 .0738756 .0133015 5.55 0.000 .0478051 .0999461
wedu_2 .0451867 .0639763 0.71 0.480 -.0802046 .170578
wedu_3 .1416557 .0674213 2.10 0.036 .0095124 .2737991
wedu_4 .197108 .0611319 3.22 0.001 .0772916 .3169243
wedu_5 .0142019 .0994006 0.14 0.886 -.1806197 .2090234
hhsize -.0556154 .0140021 -3.97 0.000 -.0830591 -.0281718
hhsize2 .0013728 .000646 2.13 0.034 .0001068 .0026389
reg_2 .7102527 .082358 8.62 0.000 .548834 .8716713
reg_3 .873601 .0915923 9.54 0.000 .6940835 1.053119
reg_4 .6790904 .0865635 7.84 0.000 .509429 .8487518
reg_5 .7782603 .0931983 8.35 0.000 .5955949 .9609257
reg_6 .9390181 .0849033 11.06 0.000 .7726107 1.105425
pmigrants .9877826 .1358246 7.27 0.000 .7215712 1.253994
_cons -.2400678 .2142056 -1.12 0.262 -.659903 .1797675

works_1
age .1251623 .0247011 5.07 0.000 .0767491 .1735755
age2 -.1696758 .0327141 -5.19 0.000 -.2337943 -.1055572
wedu_2 -.2971512 .154475 -1.92 0.054 -.5999166 .0056142
wedu_3 -.0435156 .1531251 -0.28 0.776 -.3436352 .2566041
wedu_4 -.0806875 .141102 -0.57 0.567 -.3572423 .1958673
wedu_5 .4507575 .2118544 2.13 0.033 .0355305 .8659846
hhsize -.1056211 .0539037 -1.96 0.050 -.2112704 .0000283
hhsize2 .0022055 .0033594 0.66 0.511 -.0043789 .0087898
reg_2 -.2725182 .2955016 -0.92 0.356 -.8516906 .3066542
reg_3 -.7467022 .3581774 -2.08 0.037 -1.448717 -.0446874
reg_4 -.4249299 .2932972 -1.45 0.147 -.9997818 .1499221
reg_5 -.3665319 .3253127 -1.13 0.260 -1.004133 .2710693
reg_6 -.1816081 .3456906 -0.53 0.599 -.8591491 .495933
_cons -2.041429 .7257811 -2.81 0.005 -3.463934 -.618924

works_0
age .0888403 .0130782 6.79 0.000 .0632076 .1144731
age2 -.1230907 .0174413 -7.06 0.000 -.157275 -.0889063
wedu_2 .1174843 .077252 1.52 0.128 -.0339267 .2688954
wedu_3 .0595546 .0853855 0.70 0.486 -.1077979 .2269072
wedu_4 .0320592 .0765806 0.42 0.675 -.1180359 .1821544
wedu_5 .2318881 .0982554 2.36 0.018 .039311 .4244652
hhsize -.0530284 .0204413 -2.59 0.009 -.0930925 -.0129642
hhsize2 .0011186 .0009241 1.21 0.226 -.0006926 .0029297
reg_2 -.5203729 .0806754 -6.45 0.000 -.6784939 -.362252
reg_3 -1.036087 .0916373 -11.31 0.000 -1.215693 -.8564816
reg_4 -.7069518 .082838 -8.53 0.000 -.8693112 -.5445923
reg_5 -1.03612 .1017284 -10.19 0.000 -1.235504 -.8367363
reg_6 -.5616403 .0946533 -5.93 0.000 -.7471574 -.3761232
_cons -1.64381 .2658748 -6.18 0.000 -2.164915 -1.122705

/athrho1 -.231402 .3879283 -.9917276 .5289235

/athrho0 -.8448182 .5563851 -1.935313 .2456764

rho1 -.2273583 .3678756 -.758098 .4845578

rho0 -.6883527 .2927534 -.9591607 .2408502

LR test of indep. eqns. (rho1=rho0=0):chi2(2) = 4.96 Prob > chi2 = 0.0838

The results of the husband’s migration equation are reported in the section of the
output headed “migrant”. The results of the wife’s LFP equation in the regime where
her husband migrates are reported in the “works_1” section, and the results of the wife’s
LFP equation in the regime where her husband stays are reported in the “works_0” section.
The parameters /athrho1 and /athrho0 are ancillary parameters used in the ML
procedure. /athrho1 and /athrho0 are the transformations of the correlation coefficients
as in (5).
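
As a quick check of this transformation, the hyperbolic tangent inverts (5), so applying
Stata's built-in tanh() to the reported /athrho1 and /athrho0 estimates should reproduce
the rho1 and rho0 rows of the output up to rounding. The sketch uses only the point
estimates printed in the output above.

```stata
* Invert equation (5): rho = tanh(atanh(rho))
display %9.7f tanh(-.231402)
display %9.7f tanh(-.8448182)

* Forward direction: atanh() of the reported rho1 should return /athrho1
display %9.7f atanh(-.2273583)
```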
The correlation coefficients rho1 and rho0 are both negative but are significant only
for the correlation between the error terms in the equation determining the husband’s
migration and the wife’s LFP equation if her husband stays home.
The likelihood-ratio test for joint independence of the equations is reported in the
last line of the output. The test rejects the H0 that ρ0 = ρ1 = 0 at the 10% level:
Prob > χ2 = 0.08.
We can now derive the effect of a husband’s migration on his wife’s LFP by
interpreting migration as a treatment, (6), using the predict command:

. predict tt, tt
. summarize tt if (migrant == 1)
Variable Obs Mean Std. Dev. Min Max

tt 1694 .1160357 .0705859 .0046564 .4448425

Women living in migrant-sending households had 11.6 percentage points or about
90% lower probability of LFP compared with the counterfactual scenario of women living
in nonsending households.

6 Monte Carlo simulation study

In this section, we provide a brief discussion of the sensitivity of our estimator to model
identification and to the assumptions about the distribution of the error terms. To
illustrate the properties of our estimator, we conduct Monte Carlo simulations across a
range of model specifications.

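The baseline data-generating process of (1) through (4) has the following form:

$$
\begin{aligned}
T_i &= 1 \ \text{if } \gamma_1 x_{1i} + \gamma_2 x_{2i} + \gamma_3 x_{3i} + \gamma_4 z_i + \mu_i > 0, \qquad T_i = 0 \ \text{otherwise} \\
y^*_{1i} &= \beta_{11} x_{1i} + \beta_{12} x_{2i} + \beta_{13} x_{3i} + \varepsilon_{1i}, \qquad y_{1i} = I(y^*_{1i} > 0) \\
y^*_{0i} &= \beta_{01} x_{1i} + \beta_{02} x_{2i} + \beta_{03} x_{3i} + \varepsilon_{0i}, \qquad y_{0i} = I(y^*_{0i} > 0)
\end{aligned} \tag{20}
$$

We generate $x_1$, $x_2$, $x_3$, and $z$ as independent standard normal random variables.
The true values of coefficients β and γ are shown in the top panel of table 1.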
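
A minimal Stata sketch of one draw from a data-generating process of the form (20) is
given below. It assumes coefficient levels equal to the ratios in the top panel of
table 1 (that is, a coefficient of 1 on x1 in every equation) and uses hypothetical
error correlations (ρ1 = 0.5, ρ0 = −0.3, ρ10 = 0), because the values used in the
article are not reported here. The seed and the variable names are likewise illustrative.

```stata
* One simulated sample from a DGP of the form (20); correlations are hypothetical
clear
set seed 20110301
set obs 10000

* Error correlation matrix, ordered (mu, e0, e1) as in (4)
matrix C = (1, -0.3, 0.5 \ -0.3, 1, 0 \ 0.5, 0, 1)
drawnorm mu e0 e1, corr(C)

* Independent standard normal regressors and instrument
generate double x1 = rnormal()
generate double x2 = rnormal()
generate double x3 = rnormal()
generate double z  = rnormal()

* Selection and potential outcomes (coefficient levels are an assumption)
generate byte T  = (x1 + x2 + x3 + z + mu > 0)
generate byte y1 = (x1 + 2*x2 + 3*x3 + e1 > 0)
generate byte y0 = (x1 + 2*x2 + 3*x3 + e0 > 0)

* Only the outcome of the realized regime is observed
generate byte y = cond(T, y1, y0)

* Instrument-identified specification: z enters the selection equation only
switch_probit y x1 x2 x3, select(T x1 x2 x3 z)
```

Adding z to the outcome varlist as well would give the specification that is identified
through nonlinearities only.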

Table 1. Monte Carlo simulations for different model specifications

        Selection equation         Binary outcome T = 1       Binary outcome T = 0
         Coeff.    Std.    % test   Coeff.    Std.    % test   Coeff.    Std.    % test
                   Dev. rejection             Dev. rejection             Dev. rejection
True coefficients
x1/x1     1.000                      1.000                      1.000
x2/x1     1.000                      2.000                      2.000
x3/x1     1.000                      3.000                      3.000
z/x1      1.000
Normally distributed errors; model is identified through the instrument
x1/x1     1.000                      1.000                      1.000
x2/x1     0.989   0.037      5.12    1.978   0.062      5.24    1.975   0.137      5.81
x3/x1     1.031   0.041      5.03    2.960   0.100      3.92    2.991   0.220      5.73
z/x1      0.987   0.034      4.81
Normally distributed errors; model is identified through nonlinearities only
x1/x1     1.000                      1.000                      1.000
x2/x1     0.988   0.037      7.15    1.981   0.072     98.63    2.031   0.171     100.0
x3/x1     1.030   0.041     19.51    2.966   0.116     100.0    3.103   0.318     100.0
z/x1      0.989   0.034      99.7   −0.011   0.046     100.0   −0.058   0.141     100.0
Nonnormal errors; model is identified through the instrument
x1/x1     1.000                      1.000                      1.000
x2/x1     0.989   0.031      3.77    2.087   0.098      7.45    1.949   0.075      4.49
x3/x1     1.031   0.033      5.32    3.239   0.146     11.72    3.015   0.117      5.36
z/x1      0.996   0.028      7.71
Nonnormal errors; model is identified through nonlinearities only
x1/x1     1.000                      1.000                      1.000
x2/x1     0.991   0.031      4.95    2.069   0.193    100.00    2.320   0.260      4.34
x3/x1     1.032   0.033      5.36    3.199   0.401    100.00    3.781   0.494      6.73
z/x1      0.997   0.029      8.43    0.021   0.192     98.87   −0.406   0.239     11.28

We conduct Monte Carlo simulations for four scenarios on samples of 10,000 ob-
servations with 1,000 repetitions. In all simulations, we show the ratio of γi to βi
(i = 1, . . . , 3) to ensure the comparability of the estimation results across different
model specifications.
In the first scenario, shocks μi , ε1i , and ε0i are generated as standard trivariate
normal; the instrument z is excluded from the outcome equations. The estimates of the
ratios correspond well to the true coefficients.
In the second scenario, with the same error distribution, we add instrument z into the
outcome equations. In this specification, the model is identified through nonlinearities
of the functional form. The estimated ratios of coefficients are still close to the true
ratios. The coefficient on the instrumental variable is insignificant in both outcome
equations. The standard errors of the estimates in the outcome equations are larger
compared with the instrument-identified specification. Note, however, that in this
specification Wald tests at the 5% level reject the true null hypothesis for the
outcome-equation parameters in nearly all repetitions (table 1) rather than in the
nominal 5% of cases. The weak identification offered by functional-form identification
makes the large-sample properties of the estimator worthless in this case.
In the third scenario, the errors are χ2 distributed and instrument z is excluded from
the outcome equations. The estimated coefficients are now further away from the true
coefficients compared with the first scenario.
Comparable results are observed in the fourth scenario with nonnormal errors, al-
though the precision of the estimates deteriorates significantly in this case. The simu-
lation based on different functional forms for the nonnormal distribution of the shocks
in (20) produces similar estimates.
The results of our simulations indicate that the estimator described in this paper is
relatively robust in terms of identification of the model. These findings are consistent
with conclusions of Wilde (2000) that “in recursive multiple-equation probit models with
endogenous dummy regressors no exclusion restrictions for the exogenous variables are
needed if there is sufficient variation in the data”.
We also evaluate the performance of our estimator in terms of predicting the ATE
and ATT effects. The data-generating process in these simulations is similar to the data-
generating process described in (20), but in addition to presenting the results based on
10,000 observations, we generate ATT and ATE for the sample sizes ranging from 200 to
30,000 observations.
Figure 1 shows the results of Monte Carlo simulations of ATE and ATT for the spec-
ification with normally distributed error terms and 1,000 repetitions. The simulations
demonstrate a good performance of the ML algorithm described in this paper when the
errors in (1), (2), and (3) are jointly normally distributed. Even for smaller sample
sizes, the method produces efficient and unbiased estimates of ATE and ATT effects.

[Figure 1 appears here. Panels: ATE (left) and ATT (right). Each panel plots the true effect, the mean estimated effect, and the 95% confidence interval against the number of observations (in thousands, from .2 to 30). Normally distributed errors: True ATE = −.102, True ATT = −.443.]

Figure 1. The results of Monte Carlo simulations (1,000 repetitions) of ATE and ATT
effects; specification with normally distributed errors

Figure 2 presents the results of Monte Carlo simulations of ATE and ATT for the
specification where the error terms are nonnormally distributed. The violation of the
normality assumption results in biased estimates for both ATE and ATT effects. The
bias is larger for estimations based on smaller sample sizes.

[Figure 2 appears here. Panels: ATE (left) and ATT (right). Each panel plots the true effect, the mean estimated effect, and the 95% confidence interval against the number of observations (in thousands, from .2 to 30). Errors distributed nonnormally: True ATE = −.175, True ATT = −.336.]

Figure 2. The results of Monte Carlo simulations (1,000 repetitions) of ATE and ATT
effects; specification with nonnormally distributed errors

Our final simulation results examine the validity of confidence intervals depending
on the assumptions about the joint distribution of the error terms in (20). Figure 3
shows the coverage rates for ATE and ATT effects. The coverage rates are constructed
for the specifications with normally (solid line) and nonnormally distributed error terms
(dotted line) in (20). The simulations are based on bootstrap estimates of the
confidence intervals for ATE and ATT effects, with 1,000 bootstrap replications in each
of 1,000 Monte Carlo repetitions (that is, 1,000,000 ML model estimations for each
sample size). The size α confidence interval is reported as the interval between the
α/2 and 1 − α/2 quantiles of the simulated draws of ATE and ATT.
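The percentile rule in the last sentence can be sketched as follows. This is a minimal Python illustration, not the article's Stata code. The function name `percentile_interval`, the linear interpolation between order statistics, and the simulated draws (normal, centered at −0.102 with a made-up spread of 0.02) are assumptions introduced here.

```python
import random

def percentile_interval(draws, alpha=0.05):
    """Return the (alpha/2, 1 - alpha/2) empirical quantiles of `draws`."""
    ordered = sorted(draws)
    n = len(ordered)

    def quantile(p):
        # Linear interpolation between adjacent order statistics.
        pos = p * (n - 1)
        lower = int(pos)
        upper = min(lower + 1, n - 1)
        return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

    return quantile(alpha / 2), quantile(1 - alpha / 2)

# Hypothetical bootstrap draws of an effect estimate (illustration only).
rng = random.Random(0)
draws = [rng.gauss(-0.102, 0.02) for _ in range(1000)]
low, high = percentile_interval(draws)
print(round(low, 3), round(high, 3))
```

Other quantile conventions (for example, taking the nearest order statistic instead of interpolating) give slightly different endpoints when the number of draws is small.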

[Figure 3 appears here. Panels: ATE (left) and ATT (right). Each panel plots the coverage rate for the 95% confidence interval (vertical axis, 0 to 1) against the number of observations (in thousands, from .2 to 30) for the Normal and Nonnormal specifications.]

Figure 3. Coverage rates for ATE and ATT effects

The left panel of figure 3 shows the coverage rates for 95% confidence intervals of
ATE for samples of different sizes. The coverage rates for the confidence intervals
estimated based on normally distributed errors are close to the nominal 95%. The
coverage rates for the nonnormal specification demonstrate undercoverage that increases
with the sample size. For example, while the coverage rates for samples of up to 2,000
observations are close to the nominal 95%, the coverage rates drop to about 54% for
the simulations based on the sample with 30,000 observations. This decline in the
coverage rates for the nonnormal specification is consistent with the estimation bias
and the narrower confidence intervals shown in figure 2.
The right panel of figure 3 presents the coverage rates for 95% confidence intervals of
ATT estimates. As with ATE, the coverage rates for ATT estimated under normality
assumptions are close to nominal. For the simulations based on nonnormal errors, the
undercoverage is more severe than for ATE: the 95% coverage rates decline rapidly
from about 90% for the small samples to 0% for the samples of 15,000 observations
and larger. Again, these results are consistent with the bias and pattern of confidence
intervals for ATT shown in the right panel of figure 2.
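Why a fixed bias combined with shrinking intervals drives coverage toward zero can be seen in a stylized calculation. The sketch below is in Python and rests on assumptions not made in the article: the estimator is normally distributed around the true value plus a constant bias b, with standard error s/√n, and the interval is the symmetric normal interval rather than the bootstrap percentile interval used in the simulations. Under those assumptions the coverage is Φ(z − b√n/s) − Φ(−z − b√n/s), where z is the normal critical value. The bias of 0.02 and scale of 1.0 are hypothetical and are not calibrated to the article's results.

```python
from math import sqrt
from statistics import NormalDist

def coverage_with_bias(bias, sd, n, level=0.95):
    """Coverage of a symmetric normal interval when the estimator is biased.

    Assumes estimate ~ Normal(truth + bias, sd**2 / n) and an interval of
    estimate +/- z * sd / sqrt(n); both are simplifying assumptions.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    shift = bias * sqrt(n) / sd               # bias measured in standard errors
    return std_normal.cdf(z - shift) - std_normal.cdf(-z - shift)

# Hypothetical bias and scale, evaluated over a range of sample sizes.
for n in (200, 2000, 15000, 30000):
    print(n, round(coverage_with_bias(bias=0.02, sd=1.0, n=n), 3))
```

With zero bias the function returns the nominal level at every sample size. With any nonzero bias the shift grows with √n, so coverage falls as the sample grows, which matches the qualitative pattern reported for the nonnormal specification.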

7 Conclusion
This article describes a Stata implementation of an ML estimator for the parameters
of a binary response model with endogenous switching. The switch_probit command
extends the set of Stata ML algorithms for estimation of the models with endogenous
switching (for example, movestay by Lokshin and Sajaia [2004]). We think that the
ability of the new command to produce estimates of the treatment impact for different
population subgroups could be useful in applied studies of impact evaluation.
The results of our Monte Carlo simulations indicate that while the estimator performs
well under the assumption of normally distributed error terms, it produces biased
estimates if the normality assumptions are violated. Researchers who suspect that the
normality assumptions are unlikely to hold might want to use semiparametric or
nonparametric methods of estimation for such models.

8 References
Aakvik, A., J. J. Heckman, and E. J. Vytlacil. 2000. Treatment effects for discrete
outcomes when responses to treatment vary among observationally identical persons:
An application to Norwegian vocational rehabilitation programs. NBER Technical
Working Paper No. 262. http://www.nber.org/papers/t0262.

Carrasco, R. 2001. Binary choice with binary endogenous regressors in panel data:
Estimating the effect of fertility on female labor participation. Journal of Business
and Economic Statistics 19: 385–394.

Lokshin, M., and E. Glinskaya. 2009. The effect of male migration on employment
patterns of women in Nepal. World Bank Economic Review 23: 481–507.

Lokshin, M., and Z. Sajaia. 2004. Maximum likelihood estimation of endogenous
switching regression models. Stata Journal 4: 282–289.

Wilde, J. 2000. Identification of multiple equation probit models with endogenous
dummy regressors. Economics Letters 69: 309–312.

About the authors

Michael Lokshin is an adviser in the Research Department of the World Bank.
Zurab Sajaia is an economist in the Research Department of the World Bank.
