Introduction to Regression Models

The document provides an overview of various regression models used for different types of dependent variables, including Linear Probability Model, Logit Model, Multinomial Logit, Ordered Choice Model, Count Data Models, and Survival Analysis Models. Each model is described with its formula, advantages, and disadvantages. The document highlights the appropriate contexts for using these models based on the nature of the dependent variable.

Uploaded by

Bersabeh
1. Linear Probability Model (LPM)


 A Linear Probability Model (LPM) is a regression model used for binary
dependent variables (0/1).
 It uses Ordinary Least Squares (OLS) to estimate the probability of an event
occurring.
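  Formula: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \epsilon$, so that
$P(Y = 1 \mid X) = E[Y \mid X] = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$
(the error term belongs to the equation for Y, not to the conditional probability).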
 Pros:
o Easy to interpret coefficients.
o Simple to estimate.
 Cons:
o Predicted probabilities can be < 0 or > 1, which is not realistic.
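o Errors are heteroskedastic by construction, since Var(Y | X) = p(1 - p) depends on
X, so the usual OLS standard errors are invalid unless robust standard errors are used.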
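
The out-of-range problem is easy to see numerically. The following is a minimal sketch, assuming a single covariate and a small invented data set (the values of `x` and `y` and the helper name `ols_simple` are illustrative, not from the original notes). It fits the LPM with the closed-form OLS formulas and inspects the fitted probabilities.

```python
def ols_simple(x, y):
    """Closed-form OLS for one regressor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical toy data: x is one covariate, y is a binary (0/1) outcome.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = ols_simple(x, y)

# Fitted values are the LPM's predicted probabilities P(Y = 1 | x).
fitted = [b0 + b1 * xi for xi in x]
out_of_range = [p for p in fitted if p < 0 or p > 1]
print(b0, b1, out_of_range)
```

With this toy data the smallest and largest values of `x` receive fitted probabilities below 0 and above 1, which is the weakness listed under Cons.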

2. Logit Model
 A logistic regression model is used when the dependent variable is binary (0 or 1).
  It models the log-odds of the event, ln(p / (1 - p)), as a linear function of the
covariates and is estimated by maximum likelihood.
  Formula: $P(Y = 1 \mid X) = \dfrac{e^{\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k}}{1 + e^{\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k}}$
 Pros:
o Predicted probabilities always lie between 0 and 1.
o Works well with categorical and continuous variables.
 Cons:
o Coefficients are changes in log-odds, so they are less intuitive to interpret than
LPM coefficients.
o Effects on the probability itself require computing marginal effects, which vary
with the values of X.
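
To make the marginal-effect point concrete, here is a minimal sketch of the logit probability and its derivative with respect to one covariate, which is beta_j * p * (1 - p). The coefficient values and the function names `logit_prob` and `marginal_effect` are assumptions chosen for illustration; in practice the coefficients would come from maximum likelihood estimation.

```python
import math

def logit_prob(beta, x):
    """P(Y = 1 | x) for a logit model; beta[0] is the intercept."""
    z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    # 1 / (1 + e^(-z)) is algebraically the same as e^z / (1 + e^z).
    return 1.0 / (1.0 + math.exp(-z))

def marginal_effect(beta, x, j):
    """Derivative of P(Y = 1 | x) with respect to covariate j (0-based)."""
    p = logit_prob(beta, x)
    return beta[j + 1] * p * (1.0 - p)

beta = [-1.0, 0.5]                      # hypothetical intercept and slope
p_mid = logit_prob(beta, [2.0])         # linear index is 0 here
me_mid = marginal_effect(beta, [2.0], 0)
me_far = marginal_effect(beta, [8.0], 0)
odds_ratio = math.exp(beta[1])          # multiplicative change in odds per unit of x
print(p_mid, me_mid, me_far, odds_ratio)
```

The same coefficient gives a smaller marginal effect far from p = 0.5, which is why a single logit coefficient cannot be read as a change in probability.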

3. Multinomial Logit (MNL) & Multinomial Probit


 Used when the dependent variable has more than two unordered categories (e.g.,
choosing among 3 job types).
  Multinomial Logit (MNL): Assumes Independence of Irrelevant Alternatives (IIA),
meaning that the odds of choosing one category over another do not depend on which
other alternatives are available.
 Multinomial Probit: Does not assume IIA and allows for correlated error terms but
is computationally intensive.
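
The IIA property can be checked numerically. The sketch below assumes hypothetical systematic utilities for three job types (the names and numbers are invented for illustration) and computes multinomial logit choice probabilities with and without the third alternative.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit choice probabilities from a dict of utilities."""
    m = max(utilities.values())          # subtract the max for numerical stability
    expv = {k: math.exp(v - m) for k, v in utilities.items()}
    total = sum(expv.values())
    return {k: e / total for k, e in expv.items()}

# Hypothetical utilities for a choice among job types.
two = mnl_probs({"clerical": 1.0, "manual": 0.5})
three = mnl_probs({"clerical": 1.0, "manual": 0.5, "professional": 0.8})

# Odds of clerical versus manual, before and after adding a third option.
odds_two = two["clerical"] / two["manual"]
odds_three = three["clerical"] / three["manual"]
print(odds_two, odds_three)
```

Both odds equal exp(1.0 - 0.5), so adding the third alternative lowers each probability but leaves their ratio unchanged. That is exactly what IIA imposes, and it is the restriction the multinomial probit relaxes.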

4. Ordered Choice Model (Ordered Logit/Probit)


 Used when the dependent variable has more than two ordered categories (e.g.,
survey responses: poor, fair, good, excellent).
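  Ordered Logit/Probit:
o Assumes that there is an underlying latent (unobserved) continuous variable
determining the choice.
o Uses estimated threshold values (cutpoints): the observed category is the one
whose pair of adjacent thresholds the latent variable falls between.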
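
The threshold mechanism can be sketched as follows for the ordered logit case, where P(Y <= j) is the logistic CDF evaluated at (cutpoint_j minus the linear index). The cutpoints, the index value `xb`, and the function names are assumptions made for illustration only.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def ordered_logit_probs(xb, cutpoints):
    """Category probabilities given a linear index xb and increasing cutpoints."""
    # Cumulative probabilities P(Y <= j); the last category closes at 1.
    cumulative = [logistic_cdf(c - xb) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)   # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs

labels = ["poor", "fair", "good", "excellent"]
cutpoints = [-1.0, 0.0, 1.5]         # hypothetical thresholds (3 for 4 categories)

probs_low = ordered_logit_probs(0.4, cutpoints)
probs_high = ordered_logit_probs(2.0, cutpoints)
for label, p_low, p_high in zip(labels, probs_low, probs_high):
    print(label, round(p_low, 3), round(p_high, 3))
```

Raising the linear index shifts probability mass toward the higher categories while the thresholds stay fixed.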

5. Count Data Models


 Used when the dependent variable represents count values (0,1,2,3,...) (e.g., number
of doctor visits).
 Common models:
o Poisson Regression: Assumes that the mean and variance are equal (Poisson
distribution).
o Negative Binomial Regression: Used when there is overdispersion (variance
> mean).
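
A quick way to see whether the Poisson assumption is plausible is to compare the sample variance with the sample mean. The sketch below uses invented doctor-visit counts and hypothetical helper names; note that this raw ratio ignores covariates, so it is only a rough first check, not a formal overdispersion test.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson distribution with mean (and variance) lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; about 1 under a Poisson model."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean

# Hypothetical number of doctor visits for ten people.
visits = [0, 0, 1, 0, 2, 0, 7, 1, 0, 9]

ratio = dispersion_ratio(visits)
mean_visits = sum(visits) / len(visits)
p_zero_poisson = poisson_pmf(0, mean_visits)    # share of zeros a Poisson would imply
share_zero_observed = visits.count(0) / len(visits)
print(ratio, p_zero_poisson, share_zero_observed)
```

Here the variance is several times the mean and there are more zeros than a Poisson with the same mean would imply, which is the situation where the negative binomial model is the usual alternative.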

6. Survival Analysis Models


 Used when analyzing time until an event occurs (e.g., time until failure of a
machine, time until a patient recovers).
 Common models:
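o Kaplan-Meier estimator: Non-parametric method for estimating the survival
function, which handles right-censored observations.
o Cox Proportional Hazards Model: Semi-parametric model for the effect of
covariates on the hazard rate; it assumes hazard ratios are constant over time.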
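
The Kaplan-Meier estimator multiplies, at each event time, the previous survival estimate by (1 - events / number at risk). A minimal sketch follows, assuming invented durations and a 1/0 flag for observed versus right-censored cases; the function name `kaplan_meier` and the data are illustrative.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times: observed durations; events: 1 if the event occurred, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose duration ends at time t (events and censored).
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve

# Hypothetical machine lifetimes; 0 marks a machine still running when observation ended.
times = [2, 3, 3, 5, 8, 8, 9]
events = [1, 1, 0, 1, 0, 1, 1]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 4))
```

Censored observations never reduce the survival estimate directly; they only shrink the risk set for later event times.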
