Econometria Avanzada: Generalized Linear Models
Carlos Castro
Motivation: Binary Choice Models
Example: Biometrics Literature
Motivation: Polychotomous
\[
y_i =
\begin{cases}
1 & \text{Person $i$: high school education} \\
2 & \text{Person $i$: incomplete college education} \\
3 & \text{Person $i$: complete college education} \\
4 & \text{Person $i$: graduate education}
\end{cases}
\]
Linear Models.
\[
y_i =
\begin{cases}
1, & \text{if person $i$ is a member of organized labor (a union)} \\
0, & \text{otherwise}
\end{cases}
\]
Linear Models.
\[
y_i = x_i\beta + v_i
\]
where $x_i$ is a vector that collects socio-economic characteristics of the employee, such as education, work experience, etc.
The linear probability model has a number of shortcomings:
\[
\begin{aligned}
\operatorname{prob}(y_i = 1) &= F(x_i, \beta) \\
\operatorname{prob}(y_i = 0) &= 1 - F(x_i, \beta)
\end{aligned}
\]
Linear Model.
\[
\begin{aligned}
E[y \mid x] &= F(x, \beta) \\
y &= E[y \mid x] + \bigl[ y - E[y \mid x] \bigr] \\
  &= \beta' x + \epsilon
\end{aligned}
\]
Linear Models.
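\[
\begin{aligned}
&\beta' x + \epsilon = 0 \text{ or } 1 \\
&\epsilon = -\beta' x \quad \text{with probability } 1 - F, \text{ or} \\
&\epsilon = 1 - \beta' x \quad \text{with probability } F \\
&\Longrightarrow \quad \operatorname{var}[\epsilon \mid x] = \beta' x \, (1 - \beta' x)
\end{aligned}
\]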
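The variance expression can be checked directly from the two-point distribution of the error. The sketch below assumes the linear probability model, so that F = β′x, and writes p for that value; the helper name `lpm_error_moments` is illustrative and not part of the slides.

```python
def lpm_error_moments(p):
    """Mean and variance of the LPM error when F = beta'x = p.

    The error equals -p with probability 1 - p
    and 1 - p with probability p.
    """
    outcomes = [(-p, 1.0 - p), (1.0 - p, p)]  # (value, probability)
    mean = sum(value * prob for value, prob in outcomes)
    var = sum(prob * (value - mean) ** 2 for value, prob in outcomes)
    return mean, var


for p in (0.1, 0.5, 0.9):
    mean, var = lpm_error_moments(p)
    # The last two printed columns agree: var = p * (1 - p), which depends on x.
    print(p, mean, var, p * (1.0 - p))
```

Because the variance moves with β′x, the error term of the linear model cannot have constant variance across observations.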
Generalized Linear Models.
GLM: Distribution y.
\[
l(y, \theta) = c(\theta)\, h(y)\, \exp\left\{ \sum_{j=1}^{r} Q_j(\theta)\, T_j(y) \right\}
\]
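As an illustration that is not taken from the slides, the Bernoulli distribution behind binary choice models fits this form with r = 1. Writing p for prob(y = 1):

```latex
\[
l(y, p) = p^{y} (1 - p)^{1 - y}
        = (1 - p) \exp\left\{ y \ln \frac{p}{1 - p} \right\},
\qquad y \in \{0, 1\},
\]
\[
c(\theta) = 1 - p, \qquad h(y) = 1, \qquad
Q_1(\theta) = \ln \frac{p}{1 - p}, \qquad T_1(y) = y .
\]
```

Here $Q_1(\theta)$ is the log-odds of p, the transformation that the logit model sets equal to a linear function of the predictors.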
GLM: Link function.
We typically want to relate the parameters of the distribution to various predictors. We do so by modelling a transformation of the mean $\mu_i$, which will be some function of $Q(\theta)$:
\[
\begin{aligned}
E(y_i) &= \mu_i \\
\varphi^{-1}(\mu_i) &= x_i\beta
\end{aligned}
\]
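For a binary outcome, two common choices of the transformation are the logit and the probit, both of which are treated later in these slides. The following sketch uses only the Python standard library; the function names are illustrative, and each `*_link` function plays the role of the transformation of the mean while each `*_inverse` function maps the linear predictor back to a mean.

```python
import math
from statistics import NormalDist

_std_normal = NormalDist()  # mean 0, standard deviation 1


def logit_link(mu):
    """Map a mean in (0, 1) to the linear predictor x_i * beta."""
    return math.log(mu / (1.0 - mu))


def logit_inverse(eta):
    """Map the linear predictor back to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def probit_link(mu):
    """Standard normal quantile of the mean."""
    return _std_normal.inv_cdf(mu)


def probit_inverse(eta):
    """Standard normal CDF of the linear predictor."""
    return _std_normal.cdf(eta)


eta = 0.3  # hypothetical value of the linear predictor
for link, inverse in ((logit_link, logit_inverse), (probit_link, probit_inverse)):
    mu = inverse(eta)
    # Applying the link to the mean recovers the linear predictor.
    print(link.__name__, mu, link(mu))
```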
Predictors.
Model Specification.
\[
\operatorname{prob}(y_i = 1) = x_i\beta
\]
It is possible to use any continuous cumulative distribution function defined on $\mathbb{R}$.
Probit Model: Specification.
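\[
\operatorname{prob}(y_i = 1) = \Phi(x_i\beta)
= \int_{-\infty}^{x_i\beta} \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{z^2}{2} \right\} dz
\]
The use of the standard normal distribution function $\Phi(x_i\beta)$ restricts the predicted probability to the interval $[0, 1]$, such that it approaches 0 as $x_i\beta \to -\infty$ and 1 as $x_i\beta \to +\infty$.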
Probit: Specification.
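\[
\begin{aligned}
\operatorname{prob}(y_i = 1)
&= \operatorname{prob}\left( \frac{\epsilon_i}{\sigma} > \frac{-x_i\beta}{\sigma} \right) \\
&= \operatorname{prob}\left( \frac{\epsilon_i}{\sigma} < \frac{x_i\beta}{\sigma} \right) \\
&= \Phi\left( \frac{x_i\beta}{\sigma} \right)
\end{aligned}
\]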
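This chain of equalities can be checked by simulation. The sketch below assumes the latent-variable setup that the derivation appears to rely on but that the slide does not spell out: y = 1 whenever x_i β + ε_i > 0, with ε_i drawn from a normal distribution with mean 0 and standard deviation σ. The values of `xb` and `sigma` are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_share_of_ones(xb, sigma, n_draws=100_000, seed=0):
    """Share of draws with y = 1 when y = 1 if xb + eps > 0, eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    ones = sum(1 for _ in range(n_draws) if xb + rng.gauss(0.0, sigma) > 0)
    return ones / n_draws


xb, sigma = 0.8, 2.0  # hypothetical index value and error scale
simulated = simulate_share_of_ones(xb, sigma)
theoretical = NormalDist().cdf(xb / sigma)  # standard normal CDF at xb / sigma
print(simulated, theoretical)
```

The two printed numbers should be close; the simulated share depends on the index and the error scale only through the ratio x_i β / σ.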
Probit: Estimation.
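\[
\begin{aligned}
\operatorname{prob}(y_i = 1) &= \Phi\left( \frac{x_i\beta}{\sigma} \right) \\
\operatorname{prob}(y_i = 0) &= 1 - \operatorname{prob}(y_i = 1) = 1 - \Phi\left( \frac{x_i\beta}{\sigma} \right)
\end{aligned}
\]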
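Since the $y_i$ are iid, the likelihood function is the product of the probabilities of the individual observations. Order the sample so that observations $1, \dots, m$ have $y_i = 0$ and the remaining $n - m$ observations $m + 1, \dots, n$ have $y_i = 1$. Then
\[
L = \prod_{i=1}^{m} \operatorname{prob}(y_i = 0) \prod_{i=m+1}^{n} \operatorname{prob}(y_i = 1)
\]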
Probit: Estimation.
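\[
\ln L = \sum_{i=1}^{n} \Bigl\{ y_i \ln F(x_i\beta) + (1 - y_i) \ln\bigl[ 1 - F(x_i\beta) \bigr] \Bigr\}
\]
\[
\frac{\partial \ln L}{\partial \beta}
= \sum_{i=1}^{n} y_i \frac{f(\cdot)}{F(\cdot)}\, x_i
- \sum_{i=1}^{n} (1 - y_i) \frac{f(\cdot)}{1 - F(\cdot)}\, x_i = 0
\]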
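The log-likelihood and its first-order condition translate almost line by line into code. The sketch below takes F to be the standard normal CDF and f its density (the probit case, with the error scale normalized to 1, which is an assumption here); the data and the trial value of `beta` are hypothetical.

```python
import math
from statistics import NormalDist

_N = NormalDist()  # standard normal: F = _N.cdf, f = _N.pdf


def probit_loglik(beta, X, y):
    """ln L = sum of y_i ln F(x_i b) + (1 - y_i) ln[1 - F(x_i b)]."""
    total = 0.0
    for xi, yi in zip(X, y):
        F = _N.cdf(sum(b * v for b, v in zip(beta, xi)))
        total += yi * math.log(F) + (1 - yi) * math.log(1.0 - F)
    return total


def probit_score(beta, X, y):
    """Gradient of ln L: sum of [y_i f/F - (1 - y_i) f/(1 - F)] x_i."""
    grad = [0.0] * len(beta)
    for xi, yi in zip(X, y):
        index = sum(b * v for b, v in zip(beta, xi))
        F, f = _N.cdf(index), _N.pdf(index)
        weight = yi * f / F - (1 - yi) * f / (1.0 - F)
        for k, v in enumerate(xi):
            grad[k] += weight * v
    return grad


# Hypothetical toy data: each row holds a constant and one regressor.
X = [[1.0, 0.2], [1.0, 1.5], [1.0, -0.7], [1.0, 2.1], [1.0, 0.9]]
y = [0, 1, 0, 1, 0]
beta = [0.1, 0.4]  # trial value, not an estimate
print(probit_loglik(beta, X, y), probit_score(beta, X, y))
```

At the maximum likelihood estimate every element of the score is zero; at an arbitrary trial value such as the one above it generally is not.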
Probit: Estimation.
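The log-likelihood must be maximized numerically, since the first-order conditions are nonlinear in $\beta$. The most common numerical method is the Newton-Raphson algorithm:
\[
\hat\beta_{t+1} = \hat\beta_t
- \left[ \left. \frac{\partial^2 \ln L}{\partial\beta\, \partial\beta'} \right|_{\hat\beta_t} \right]^{-1}
\left[ \left. \frac{\partial \ln L}{\partial\beta} \right|_{\hat\beta_t} \right]
\]
At convergence, the resulting estimator $\hat\beta$ is consistent, asymptotically efficient and asymptotically normally distributed.
The asymptotic covariance matrix is estimated by:
\[
- \left[ \left. \frac{\partial^2 \ln L}{\partial\beta\, \partial\beta'} \right|_{\hat\beta} \right]^{-1}
\]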
Logit: Specification.
\[
\operatorname{prob}(y_i = 1) = \Lambda(x_i\beta) = \frac{\exp\{x_i\beta\}}{1 + \exp\{x_i\beta\}}
\]
The probit and logit models are the two most commonly used. The choice between them is sometimes based on the data at hand. Both distributions are symmetric. However, the tails of the logistic density function are heavier (wider) than those of the standard normal. As a result, the two models are most likely to differ in the probabilities they assign to extreme values.
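The difference in the tails can be seen numerically. The sketch below compares left-tail probabilities of the two distributions; rescaling the logistic to unit variance (its variance is π²/3) is an assumption made here so that the comparison is like for like, and is not something the slides prescribe.

```python
import math
from statistics import NormalDist


def logistic_cdf(t):
    return 1.0 / (1.0 + math.exp(-t))


normal = NormalDist()
# The standard logistic has variance pi^2 / 3; rescale so both have unit variance.
scale = math.pi / math.sqrt(3.0)

for t in (-1.0, -2.0, -3.0, -4.0):
    normal_tail = normal.cdf(t)
    logistic_tail = logistic_cdf(t * scale)
    # The ratio grows as t moves further into the tail.
    print(t, normal_tail, logistic_tail, logistic_tail / normal_tail)
```

Near the center the two distributions assign similar probabilities, which is why the models tend to give similar fitted probabilities except for observations with extreme index values.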
Logit: Estimation.
The logit model is also estimated by maximum likelihood.
First-order condition for the log-likelihood:
\[
\frac{\partial \ln L}{\partial \beta} = \sum_{i=1}^{n} (y_i - \Lambda_i)\, x_i = 0
\]
where $\Lambda_i = \Lambda(x_i\beta)$.
Logit: Estimation.
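This equation must be solved numerically, since it is nonlinear in $\beta$. The most common numerical method is the Newton-Raphson algorithm:
\[
\hat\beta_{t+1} = \hat\beta_t
- \left[ \left. \frac{\partial^2 \ln L}{\partial\beta\, \partial\beta'} \right|_{\hat\beta_t} \right]^{-1}
\left[ \left. \frac{\partial \ln L}{\partial\beta} \right|_{\hat\beta_t} \right]
\]
At convergence, the resulting estimator $\hat\beta$ is consistent, asymptotically efficient and asymptotically normally distributed.
The asymptotic covariance matrix is estimated by:
\[
- \left[ \left. \frac{\partial^2 \ln L}{\partial\beta\, \partial\beta'} \right|_{\hat\beta} \right]^{-1}
\]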
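The iteration can be sketched for a logit with a constant and a single regressor, which keeps the Hessian 2 x 2 and avoids any matrix library. The Hessian used below, minus the sum of Λ_i (1 − Λ_i) x_i x_i′, is the standard result for the logit log-likelihood and is not shown on the slides; the data, starting values and function names are hypothetical.

```python
import math


def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))


def score_and_hessian(b0, b1, x, y):
    """Score and Hessian of the logit log-likelihood for the index b0 + b1 * x_i."""
    g0 = g1 = 0.0
    h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = logistic(b0 + b1 * xi)
        w = p * (1.0 - p)
        g0 += yi - p             # score: sum of (y_i - Lambda_i) x_i
        g1 += (yi - p) * xi
        h00 -= w                 # Hessian: -sum of Lambda_i (1 - Lambda_i) x_i x_i'
        h01 -= w * xi
        h11 -= w * xi * xi
    return (g0, g1), (h00, h01, h11)


def invert_symmetric_2x2(h00, h01, h11):
    det = h00 * h11 - h01 * h01
    return h11 / det, -h01 / det, h00 / det


def fit_logit_newton(x, y, tol=1e-10, max_iter=50):
    b0 = b1 = 0.0  # starting values
    for _ in range(max_iter):
        (g0, g1), hess = score_and_hessian(b0, b1, x, y)
        i00, i01, i11 = invert_symmetric_2x2(*hess)
        step0 = i00 * g0 + i01 * g1
        step1 = i01 * g0 + i11 * g1
        b0, b1 = b0 - step0, b1 - step1  # beta_{t+1} = beta_t - H^{-1} g
        if max(abs(step0), abs(step1)) < tol:
            break
    # Covariance estimate: minus the inverse Hessian at the final iterate.
    _, hess = score_and_hessian(b0, b1, x, y)
    i00, i01, i11 = invert_symmetric_2x2(*hess)
    cov = ((-i00, -i01), (-i01, -i11))
    return (b0, b1), cov


# Hypothetical data; the outcomes overlap in x, so the maximum exists.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]
(b0_hat, b1_hat), cov = fit_logit_newton(x, y)
print(b0_hat, b1_hat, cov)
```

The sketch has no safeguards such as step halving, and it would fail on perfectly separated data, where the likelihood has no finite maximum.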
Marginal Effects.
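The parameters of models with discrete dependent variables, like those of any nonlinear regression model, are not necessarily the marginal effects that one is accustomed to analyzing. In general,
\[
\begin{aligned}
\frac{\partial E[y \mid x]}{\partial x}
&= \left\{ \frac{dF(\beta' x)}{d(\beta' x)} \right\} \beta \\
&= f(\beta' x)\, \beta
\end{aligned}
\]
where $f(\cdot)$ is the density function corresponding to the distribution function $F(\cdot)$.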
Marginal Effects.
\[
\frac{\partial E[y \mid x]}{\partial x} = \phi(\beta' x)\, \beta
\]
where $\phi(t)$ is the standard normal density.
Marginal Effects.
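\[
\begin{aligned}
\frac{d\Lambda[\beta' x]}{d(\beta' x)}
&= \frac{\exp\{\beta' x\}}{(1 + \exp\{\beta' x\})^2} \\
&= \Lambda(\beta' x)\, [1 - \Lambda(\beta' x)]
\end{aligned}
\]
\[
\frac{\partial E[y \mid x]}{\partial x} = \bigl( \Lambda(\beta' x)\, [1 - \Lambda(\beta' x)] \bigr)\, \beta
\]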
It is obvious that these values will vary with the values of x. In
interpreting the estimated model, it will be useful to calculate this value
at, say, the means of the regressors.
Goodness-of-fit.
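McFadden $R^2$:
\[
R^2_{\text{McFadden}} = 1 - \frac{\ln L_1}{\ln L_0}
\]
where $\ln L_1$ is the maximized log-likelihood of the fitted model and $\ln L_0$ is the log-likelihood of a model containing only a constant.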
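A minimal sketch of the calculation, assuming fitted probabilities are already available from a probit or logit fit. For the constant-only model the maximum likelihood fit assigns every observation the sample share of ones, which is what `p_bar` holds; the data and fitted values are hypothetical.

```python
import math


def bernoulli_loglik(y, probs):
    """Sum of y_i ln p_i + (1 - y_i) ln(1 - p_i)."""
    return sum(yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
               for yi, p in zip(y, probs))


def mcfadden_r2(y, fitted_probs):
    """1 - ln L1 / ln L0, where L0 comes from a constant-only model."""
    p_bar = sum(y) / len(y)  # constant-only fit: the sample share of ones
    loglik_model = bernoulli_loglik(y, fitted_probs)
    loglik_null = bernoulli_loglik(y, [p_bar] * len(y))
    return 1.0 - loglik_model / loglik_null


y = [0, 0, 1, 0, 1, 1]
fitted = [0.2, 0.3, 0.6, 0.4, 0.7, 0.9]  # hypothetical fitted probabilities
print(mcfadden_r2(y, fitted))
```

A model whose fitted probabilities all equal the sample share of ones gets a value of 0, and the measure moves toward 1 as the fitted probabilities approach the observed outcomes.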