
Probit and Logit Models

Ani Katchova

© 2013 by Ani Katchova. All rights reserved.


Probit and Logit Models Overview

 Examples of probit and logit models


 Binary dependent variable
 Linear regression model, probit, and logit models functional forms and properties
 Model coefficients and interpretations
 Marginal effects (and odds ratios) and interpretations
 Goodness of fit statistics (percent correctly predicted and pseudo R-squared)
 Choice between probit and logit
 Economic models that lead to use of probit and logit models
Probit and Logit Models (Binary Outcome Models)

Binary outcome examples

 Consumer economics: whether a consumer makes a purchase or not.


 Labor economics: whether an individual participates in the labor market or not.
 Agricultural economics: whether or not a farmer adopts or uses organic practices,
marketing/production contracts, etc.

Binary outcome dependent variable

 The decision/choice is whether or not to have, do, use, or adopt.


 The dependent variable is a binary response
 It takes on two values: 0 and 1.
$y = \begin{cases} 0 & \text{if no} \\ 1 & \text{if yes} \end{cases}$
Binary outcome models

 Binary outcome models are among the most used in applied economics.
 A look at the OLS model:
 Binary outcome models estimate the probability that y=1 as a function of the independent
variables.
$p = \Pr[y = 1 \mid x] = F(x'\beta)$

 There are three different models depending on the functional form of $F(x'\beta)$.

Regression model (linear probability model)

 In the linear probability model, $F(x'\beta) = x'\beta$.

$p = \Pr[y = 1 \mid x] = x'\beta$

 A problem with the regression model is that the predicted probabilities are not constrained to
lie between 0 and 1.
 For this reason, we do not use the regression model with binary outcome data.
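
To make the problem concrete, here is a minimal Python sketch that fits a one-regressor linear probability model by OLS and inspects the range of the fitted values. The data are made up for illustration (an assumption, not taken from the lecture).

```python
# Illustrative data (an assumption): y switches from 0 to 1 once x is positive.
xs = [(i - 30) / 10 for i in range(61)]      # x from -3.0 to 3.0 in steps of 0.1
ys = [1 if x > 0 else 0 for x in xs]         # binary outcome

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# OLS with one regressor: slope = cov(x, y) / var(x), intercept = y_bar - slope * x_bar
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * x for x in xs]  # the "predicted probabilities"
print(min(fitted), max(fitted))               # the range extends below 0 and above 1
```

With this sample the fitted line leaves the unit interval at both ends of the x range, which is the limitation described above.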
Logit model

 For the logit model, $F(x'\beta)$ is the cdf of the logistic distribution.

$F(x'\beta) = \Lambda(x'\beta) = \dfrac{\exp(x'\beta)}{1 + \exp(x'\beta)}$

 The predicted probabilities are limited between 0 and 1.

Probit model

 For the probit model, $F(x'\beta)$ is the cdf of the standard normal distribution.

$F(x'\beta) = \Phi(x'\beta) = \int_{-\infty}^{x'\beta} \phi(z)\,dz$

 The predicted probabilities are limited between 0 and 1.
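
The two functional forms can be sketched in Python using only the standard library. The function names `logistic_cdf` and `normal_cdf` are illustrative choices, not names from the lecture.

```python
import math

def logistic_cdf(z):
    """Lambda(z) = exp(z) / (1 + exp(z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def normal_cdf(z):
    """Phi(z), the standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Both functions map any value of the index x'b into a probability
for z in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(z, round(logistic_cdf(z), 4), round(normal_cdf(z), 4))
```

Both functions are increasing and equal 0.5 at zero. The logistic cdf has somewhat heavier tails than the normal cdf.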


Model coefficients

 Probit and logit models are estimated using the maximum likelihood method.
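
As a rough illustration of maximum likelihood estimation, the following Python sketch fits a logit model with an intercept and one regressor by Newton-Raphson. The optimizer choice, the function name `fit_logit`, and the data are assumptions made for this example; statistical packages use more general routines.

```python
import math

def fit_logit(xs, ys, max_iter=50, tol=1e-10):
    """Maximize the logit log-likelihood for p_i = Lambda(b0 + b1 * x_i) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Score (gradient of the log-likelihood): sum of (y_i - p_i) * (1, x_i)
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Negative Hessian: sum of p_i * (1 - p_i) * (1, x_i)(1, x_i)'
        h00 = sum(p * (1 - p) for p in ps)
        h01 = sum(p * (1 - p) * x for x, p in zip(xs, ps))
        h11 = sum(p * (1 - p) * x * x for x, p in zip(xs, ps))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negative Hessian) * step = score
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + s0, b1 + s1
        if abs(s0) + abs(s1) < tol:
            break
    return b0, b1

# Made-up, non-separable data (for illustration only)
xs = [-2, -1, 0, 1, 2, -2, -1, 0, 1, 2]
ys = [0, 0, 0, 1, 1, 0, 1, 1, 1, 0]
b0, b1 = fit_logit(xs, ys)
print(b0, b1)
```

At the solution the score is numerically zero, which is the first-order condition for the maximum of the log-likelihood.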

Interpretation of coefficients

 A positive (negative) coefficient means that an increase in x increases (decreases) the
likelihood that y=1, that is, it makes that outcome more (less) likely.
 We interpret the sign of the coefficient but not the magnitude. The magnitude cannot be
interpreted using the coefficient because different models have different scales of coefficients.

Comparison of coefficients

 Coefficients differ among models because of the functional form of the F function.

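$\beta_{Logit} \simeq 4\,\beta_{OLS}$

$\beta_{Probit} \simeq 2.5\,\beta_{OLS}$

$\beta_{Logit} \simeq 1.6\,\beta_{Probit}$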

 We should not compare the magnitude of the coefficients among different models.
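
One possible way to see where rule-of-thumb factors of roughly 4, 2.5, and 1.6 can come from is to compare the slopes of the three F functions at $x'\beta = 0$. If the models imply similar changes in probability there, the coefficients must differ by the inverse ratio of those slopes. This is an interpretive sketch, not a derivation given in the lecture.

```python
import math

logistic_pdf_at_0 = 0.25                            # Lambda'(0) = Lambda(0) * (1 - Lambda(0))
normal_pdf_at_0 = 1.0 / math.sqrt(2.0 * math.pi)    # phi(0), about 0.3989
# For the linear probability model the slope of F is 1 everywhere.

logit_vs_ols = 1.0 / logistic_pdf_at_0              # exactly 4
probit_vs_ols = 1.0 / normal_pdf_at_0               # about 2.51
logit_vs_probit = normal_pdf_at_0 / logistic_pdf_at_0   # about 1.60
print(logit_vs_ols, round(probit_vs_ols, 2), round(logit_vs_probit, 2))
```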
Marginal effects

 When estimating probit and logit models, it is common to report the marginal effects after
reporting the coefficients.
 The marginal effects reflect the change in the probability of y=1 given a 1 unit change in an
independent variable x.

Marginal effects for the regression model

 For the OLS regression model, the marginal effects are the coefficients and they do not depend
on x.
$\partial p / \partial x_j = \beta_j$

 The index j refers to the jth independent variable.


 [When we use the index i, it refers to the ith observation.]
Marginal effects for the binary models (probit and logit)

 For the logit and probit models, the marginal effects are calculated as:

$\partial p / \partial x_j = F'(x'\beta)\,\beta_j$

 The marginal effects depend on x, so we need to estimate the marginal effects at a specific
value of x (typically the means).
 Coefficients and marginal effects have the same signs because $F'(x'\beta) > 0$.

Marginal effects for the logit model

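$\partial p / \partial x_j = \Lambda(x'\beta)\,[1 - \Lambda(x'\beta)]\,\beta_j = \dfrac{\exp(x'\beta)}{[1 + \exp(x'\beta)]^2}\,\beta_j$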

Marginal effects for the probit model

$\partial p / \partial x_j = \phi(x'\beta)\,\beta_j$
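
The logit and probit marginal effect formulas can be evaluated at any value of the index $x'\beta$. The Python sketch below uses hypothetical coefficient values, and the function names are illustrative.

```python
import math

def logit_marginal_effect(xb, beta_j):
    """Lambda(x'b) * (1 - Lambda(x'b)) * beta_j"""
    lam = 1.0 / (1.0 + math.exp(-xb))
    return lam * (1.0 - lam) * beta_j

def probit_marginal_effect(xb, beta_j):
    """phi(x'b) * beta_j, where phi is the standard normal pdf"""
    phi = math.exp(-0.5 * xb * xb) / math.sqrt(2.0 * math.pi)
    return phi * beta_j

# Hypothetical index values x'b and coefficients beta_j
for xb in (-2.0, 0.0, 2.0):
    print(xb, logit_marginal_effect(xb, 0.8), probit_marginal_effect(xb, 0.5))
```

The effect is largest at $x'\beta = 0$ and shrinks toward zero in both tails, which is why the value of x at which it is evaluated matters.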
Estimating marginal effects
Marginal effects at the mean

 The marginal effects are estimated for the average person in the sample, $\bar{x}$.
$\partial p / \partial x_j = F'(\bar{x}'\beta)\,\beta_j$

 Most papers report marginal effects at the mean.


 A problem is that there may not be such a person in the sample.

Average marginal effects

 The marginal effects are estimated as the average of the individual marginal effects.

$\partial p / \partial x_j = \dfrac{\sum_{i=1}^{n} F'(x_i'\beta)}{n}\,\beta_j$

 This is a better approach to estimating marginal effects, but many papers still report marginal
effects at the mean.
 In practice, the two ways to estimate marginal effects produce almost identical results most of
the time.
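
The two calculations can be compared side by side in a short Python sketch for a logit model. The coefficients and the small sample are hypothetical values chosen only to show the mechanics.

```python
import math

def logistic_pdf(z):
    """F'(z) for the logit model: Lambda(z) * (1 - Lambda(z))."""
    lam = 1.0 / (1.0 + math.exp(-z))
    return lam * (1.0 - lam)

# Hypothetical logit estimates (intercept b0, slope b1) and a made-up sample of x
b0, b1 = -0.5, 0.8
xs = [-2.0, -1.0, 0.0, 0.5, 1.0, 1.5, 3.0]

x_bar = sum(xs) / len(xs)
# Marginal effect at the mean: F'(x_bar'b) * b1
mem = logistic_pdf(b0 + b1 * x_bar) * b1
# Average marginal effect: average of F'(x_i'b) over all observations, times b1
ame = sum(logistic_pdf(b0 + b1 * x) for x in xs) / len(xs) * b1
print(round(mem, 4), round(ame, 4))
```

Because F' is nonlinear, the two numbers need not coincide. In a tiny, widely spread sample like this one the gap can be larger than is typical in applied work.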
Partial effects for discrete variables

 Predict the probabilities for the two discrete values of a variable and take the difference:
$\Delta p = F(x'\beta \mid x_j + 1) - F(x'\beta \mid x_j)$
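
For a dummy variable, the difference in predicted probabilities can be sketched in Python as follows. The logit coefficients, the dummy `d`, and the value at which the other regressor is held are all hypothetical.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logit estimates: intercept, coefficient on a continuous x, coefficient on a dummy d
b0, b_x, b_d = -1.0, 0.5, 0.9
x_value = 1.0   # value at which the other regressor is held (for example, its mean)

p_d1 = logistic_cdf(b0 + b_x * x_value + b_d * 1)   # predicted probability with d = 1
p_d0 = logistic_cdf(b0 + b_x * x_value + b_d * 0)   # predicted probability with d = 0
partial_effect = p_d1 - p_d0
print(round(partial_effect, 4))
```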

Interpretation of marginal effects

 An increase in x increases (decreases) the probability that y=1 by the marginal effect, expressed
in percentage points (for example, a marginal effect of 0.05 is a change of 5 percentage points).
o For dummy independent variables, the marginal effect is expressed in comparison to the
base category (x=0).
o For continuous independent variables, the marginal effect is expressed for a one-unit
change in x.
 We interpret both the sign and the magnitude of the marginal effects.
 The probit and logit models produce almost identical marginal effects.
Odds ratio/relative risk for the logit model

 The odds ratio or relative risk, p/(1-p) (often simply called the odds), measures the probability
that y=1 relative to the probability that y=0.

$p = \dfrac{\exp(x'\beta)}{1 + \exp(x'\beta)}$

$\dfrac{p}{1 - p} = \exp(x'\beta)$

 An odds ratio of 2 means that the outcome y=1 is twice as likely as the outcome of y=0.
 Odds ratios are estimated with the logistic model.
 Reporting marginal effects instead of odds ratios is more popular in economics.
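
A quick numerical check of the relationship between the logit probability and p/(1-p), written as a Python sketch with a hypothetical value of the index $x'\beta$:

```python
import math

def odds(p):
    """p / (1 - p): the probability that y=1 relative to the probability that y=0."""
    return p / (1.0 - p)

xb = 0.7                                    # hypothetical value of the index x'b
p = math.exp(xb) / (1.0 + math.exp(xb))     # logit probability that y=1
print(odds(p), math.exp(xb))                # the two agree up to rounding
print(odds(2.0 / 3.0))                      # p = 2/3: y=1 is twice as likely as y=0
```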
Predicted probabilities and goodness of fit measures

 After estimating the models, we can predict the probability that y=1 for each observation.

$\hat{p} = \Pr[y = 1 \mid x] = F(x'\hat{\beta})$

 For the regression model, the predicted probabilities are not limited between 0 and 1.
 For the logit and probit models, the predicted probabilities are limited between 0 and 1.
 The predicted probability indicates the likelihood of y=1. If the predicted probability is greater
than 0.5 we predict that y=1, otherwise y=0.
Goodness of fit measures
Percent correctly predicted values

 If the predicted probability is greater than 0.5 we can predict that y=1, otherwise y=0.
 We can create the following table:

                    Actual y=1    Actual y=0
Predicted yhat=1    True          False
Predicted yhat=0    False         True

 We have four cases of 0/1: two of them are correct predictions and two of them are wrong
predictions.
 The percent correctly predicted is the number of correct (true) predictions as a share of all
predictions.
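
The table and the percent correctly predicted can be computed with a few lines of Python. The outcomes and predicted probabilities below are made up, and the function name is illustrative.

```python
def percent_correctly_predicted(y_actual, p_hat, cutoff=0.5):
    """Classify y_hat = 1 when p_hat > cutoff and tabulate predictions against actual outcomes."""
    table = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}   # keys: (predicted, actual)
    for y, p in zip(y_actual, p_hat):
        y_hat = 1 if p > cutoff else 0
        table[(y_hat, y)] += 1
    correct = table[(1, 1)] + table[(0, 0)]
    return table, 100.0 * correct / len(y_actual)

# Made-up outcomes and predicted probabilities
y_actual = [1, 0, 1, 1, 0, 0, 1, 0]
p_hat = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3]
table, pct = percent_correctly_predicted(y_actual, p_hat)
print(table, pct)
```

Here six of the eight observations are classified correctly, so the percent correctly predicted is 75.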
Pseudo R-squared (McFadden R-squared)

 The pseudo R-squared is calculated as:

$R^2 = 1 - L_{ur} / L_r$

 It compares the unrestricted log-likelihood $L_{ur}$ for the model we are estimating and the
restricted log-likelihood $L_r$ with only an intercept.
 If the independent variables have no explanatory power, the restricted model will be the same
as the unrestricted model and R-squared will be 0.
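
A minimal Python sketch of the calculation follows. It assumes the restricted (intercept-only) model predicts the sample mean of y for every observation, which is what an intercept-only logit or probit fit yields. The predicted probabilities are made up rather than estimated.

```python
import math

def log_likelihood(y_actual, p_hat):
    """Binary-outcome log-likelihood: sum of y*ln(p) + (1-y)*ln(1-p)."""
    return sum(y * math.log(p) + (1 - y) * math.log(1.0 - p)
               for y, p in zip(y_actual, p_hat))

def mcfadden_r2(y_actual, p_hat):
    """1 - L_ur / L_r, with L_r from an intercept-only model (p = sample mean of y)."""
    y_bar = sum(y_actual) / len(y_actual)
    l_ur = log_likelihood(y_actual, p_hat)
    l_r = log_likelihood(y_actual, [y_bar] * len(y_actual))
    return 1.0 - l_ur / l_r

y_actual = [1, 0, 1, 1, 0, 0, 1, 0]
p_hat = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3]   # made-up predicted probabilities
print(mcfadden_r2(y_actual, p_hat))
```

Passing the sample mean as every predicted probability returns 0, matching the case in which the independent variables have no explanatory power.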
Discussion about binary outcome models

Choice between the logit and probit model

 The choice depends on the data generating process, which is unknown.


 The models produce almost identical results (different coefficients but similar marginal
effects).
 The choice is up to you.

Coding of the dependent variable

 If we reverse the categories 0 and 1, the signs of the coefficients are reversed (positive become
negative and vice versa) but the magnitudes are the same.

Latent variable models

 A latent variable y* is a variable that is incompletely observed. Latent variables can be
introduced into binary outcome models in two ways: index functions and random utility
models.
Index function models

 The latent variable is an index of the unobserved propensity for the event to occur.
 Index models are used in two step models, which will be covered later in class.
o Example: We cannot observe how much people want to work, only if they work or not.

$y = \begin{cases} 1 & \text{if } y^* > 0 \\ 0 & \text{if } y^* \le 0 \end{cases}$
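
The index function idea can be illustrated with a small simulation in Python. It assumes, purely for illustration, that y* is a linear index plus a standard normal error, which is the assumption that leads to the probit model. The coefficient values are hypothetical.

```python
import math
import random

rng = random.Random(0)           # fixed seed so the run is reproducible
b0, b1 = 0.3, 1.0                # hypothetical index coefficients
x = 0.5                          # a single value of the regressor, for simplicity
n = 100_000

# y* = b0 + b1*x + e with standard normal e; we observe only y = 1 if y* > 0, else 0
ys = [1 if b0 + b1 * x + rng.gauss(0.0, 1.0) > 0 else 0 for _ in range(n)]

share_y1 = sum(ys) / n
probit_prob = 0.5 * (1.0 + math.erf((b0 + b1 * x) / math.sqrt(2.0)))   # Phi(x'b)
print(share_y1, probit_prob)
```

The simulated share of y=1 should be close to $\Phi(x'\beta)$, even though y* itself is never observed.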

Random utility models

 The latent variable is the difference between the utility if the event occurs and the utility if it does not.
 They are often a result of individual choice.
o Example: a consumer chooses one product or another depending on which utility is
higher.
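
A random utility choice can also be simulated. The Python sketch below assumes each product's utility is a fixed component plus an independent type I extreme value (Gumbel) error, an assumption not stated in the lecture under which the choice probability takes the logit form. The utility values are hypothetical.

```python
import math
import random

rng = random.Random(1)           # fixed seed so the run is reproducible
v1, v0 = 1.0, 0.4                # hypothetical systematic utilities of the two products
n = 100_000

def gumbel(generator):
    """Draw from the standard type I extreme value distribution by inverting its cdf."""
    u = generator.random()
    while u == 0.0:              # avoid log(0); u is then strictly between 0 and 1
        u = generator.random()
    return -math.log(-math.log(u))

# The consumer chooses product 1 (y = 1) when its total utility is higher
ys = [1 if v1 + gumbel(rng) > v0 + gumbel(rng) else 0 for _ in range(n)]

share_y1 = sum(ys) / n
logit_prob = 1.0 / (1.0 + math.exp(-(v1 - v0)))   # Lambda(v1 - v0)
print(share_y1, logit_prob)
```

Only the difference in utilities matters for the choice, which is why that difference plays the role of the latent variable.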
