Econometrics Theory Note

Page: 54
  2.1: What is the conditional expectation function or the population regression function?
  2.3: What is the role of the stochastic error term Ui in regression analysis? What is the difference between the stochastic error term & the residual, U^i?
  2.4: Why do we need regression analysis? Why not simply use the mean value of the regressand as its best value?
  2.5: What do we mean by a linear regression model?
Chapter 03:
  01: Definition of Ordinary Least Squares
  02: The classical linear regression model: The seven assumptions underlying the method of least squares
  03: Properties of Least-Squares Estimators: The Gauss-Markov Theorem
  04: The Zero Null Hypothesis and the "2-t" Rule of Thumb
    The "2-t" Rule of Thumb
  05: Normality Test
    Three Common Types of Normality Tests
  06: Polynomial Regression Models
Chapter 09:
  01: Nature of Dummy Variables
  02: ANOVA Models
  03: Caution in the Use of Dummy Variables
  04: What Happens if the Dependent Variable is a Dummy Variable?
  05: Detection of Multicollinearity
Chapter 11:
  01: The nature of Heteroscedasticity
  02: Detection of Heteroscedasticity
Chapter: Autocorrelation
  01. What is the nature of Autocorrelation?
  02: Based on the Durbin-Watson d statistic, how would you distinguish "pure" autocorrelation from specification bias?
Page: 54
2.1: What is the conditional expectation function or the population regression function?
Answer:
The Conditional Expectation Function (CEF), also known as the Population Regression
Function (PRF), represents the expected value of a dependent variable Y given specific values
of independent variable(s) X. Mathematically, it is expressed as:
E(Y∣X) = f(X)
This function shows how the average value of Y changes with X, capturing the systematic (or
predictable) relationship between them in the population. It does not imply causation, only
association. In the context of linear regression, the PRF is often written as:
E(Y∣X) = β0+β1X
Here, β0 and β1 are the population parameters. The actual observed values deviate from this
expected value due to random error. The CEF/PRF is a foundational concept in econometrics and
statistics, serving as the theoretical basis for estimating relationships using sample data.
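As an illustration (not part of the original note), the short Python sketch below simulates data at a few X values with made-up parameters and shows that the group-wise sample means of Y approximate the population regression function E(Y|X) = β0 + β1X.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population parameters: E(Y|X) = 17 + 0.6*X
beta0, beta1 = 17.0, 0.6

# Simulate many observations at a few X levels (made-up values)
x_levels = np.array([80, 100, 120, 140, 160])
x = np.repeat(x_levels, 200)                        # 200 draws per X value
y = beta0 + beta1 * x + rng.normal(0, 10, x.size)   # add a random error U

# The conditional sample mean of Y at each X approximates E(Y|X)
for xv in x_levels:
    cond_mean = y[x == xv].mean()
    print(f"X = {xv}: sample E(Y|X) = {cond_mean:6.2f}, true PRF = {beta0 + beta1 * xv:6.2f}")
```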
2.3: What is the role of the stochastic error term Ui in regression analysis? What is the
difference between the stochastic error term & the residual, U^i?
Answer:
In regression analysis, the stochastic error term Ui captures all unobserved factors that
influence the dependent variable Yi but are not included in the model. It accounts for
randomness, measurement errors, omitted variables, and model imperfections. The true
regression model is:
Yi=β0+β1Xi+Ui
The residual U^i, on the other hand, is the difference between the observed value Yi and the
predicted value Y^i from the estimated regression line:
U^i=Yi−Y^i
Key difference: the stochastic error term Ui is an unobservable population quantity, while the residual U^i is observable because it is computed from the estimated sample regression line. Residuals therefore serve as estimates of the error terms, helping assess model fit and diagnose assumptions like homoscedasticity and independence.
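A minimal simulation (my own illustration, with assumed parameter values) makes the distinction concrete: the true errors Ui are generated but would be unknown in practice, while the residuals U^i from an OLS fit are their observable sample counterparts.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# True (normally unknown) population model: Y = 2 + 0.5*X + U
x = rng.uniform(0, 10, 100)
u_true = rng.normal(0, 1, 100)          # stochastic error term Ui (unobservable in practice)
y = 2.0 + 0.5 * x + u_true

# Estimate the model by OLS; the residuals are the observable counterpart of Ui
X = sm.add_constant(x)
res = sm.OLS(y, X).fit()
u_hat = res.resid                        # residuals U^i = Yi - Y^i

print("estimated coefficients:", res.params)
print("corr(true errors, residuals):", np.corrcoef(u_true, u_hat)[0, 1])
```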
2.4: Why do we need regression analysis? Why not simply use the mean value of the regressand as its best value?
Answer:
While the mean of the regressand (dependent variable Y) is a simple and unbiased estimate of
central tendency, it ignores the influence of explanatory variables. Regression analysis goes
further by modeling how Y changes with one or more independent variables X. This allows us
to:
• explain how much of the variation in Y is associated with variation in X,
• predict the value of Y for given values of X, and
• test hypotheses about the direction and strength of the relationship.
Using just the mean assumes all individuals or observations are alike, which is rarely true in real-
world data. Regression analysis provides a more accurate, informative, and flexible approach to
understanding and predicting outcomes based on observed patterns.
2.5: What do we mean by a linear regression model?
Answer:
A linear regression model is a statistical tool used to examine the relationship between a
dependent variable Y and one or more independent variables X. The model assumes that the
relationship between Y and X is linear, meaning it can be represented by a straight line. The
simple linear regression model is:
Yi=β0+β1Xi+Ui
Here,
• β0 is the intercept,
• β1 is the slope coefficient showing the effect of X on Y,
• Ui is the stochastic error term capturing unobserved factors.
In multiple linear regression, more independent variables are included. The model helps
quantify relationships, make predictions, and test economic or scientific theories. "Linear" refers
to linearity in parameters, not necessarily in variables (e.g., Y=β0 + β1 ln(X) + U is still linear in
parameters).
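To illustrate the "linear in parameters" point, the hypothetical sketch below (simulated data, assumed coefficient values) fits Y = β0 + β1 ln(X) + U by ordinary OLS: the model is nonlinear in X but still linear in β0 and β1.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)

# Simulated data from Y = 1 + 3*ln(X) + U (nonlinear in X, linear in the parameters)
x = rng.uniform(1, 50, 200)
y = 1.0 + 3.0 * np.log(x) + rng.normal(0, 0.5, 200)

# OLS on the transformed regressor ln(X) works because linearity in parameters holds
X = sm.add_constant(np.log(x))
res = sm.OLS(y, X).fit()
print(res.params)   # estimates should be close to the assumed values 1 and 3
```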
Chapter 03:
01: Definition of Ordinary Least Squares
Answer:
Ordinary Least Squares (OLS) is a statistical method used to estimate the parameters of a
linear regression model. It works by minimizing the sum of the squared differences between
the observed values of the dependent variable Yi and the predicted values Y^i from the
regression line. The OLS regression equation is:
Yi = β0 + β1Xi + Ui
and the OLS estimates β^0 and β^1 are the values that minimize
∑U^i² = ∑(Yi − Y^i)² = ∑(Yi − β^0 − β^1Xi)²
This method ensures the best-fitting line in the least squares sense. OLS estimators are BLUE
(Best Linear Unbiased Estimators) under the classical assumptions, such as linearity, no perfect
multicollinearity, and homoscedasticity. It is widely used for its simplicity, interpretability, and
effectiveness in modeling linear relationships.
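As a sketch (my own, with simulated data and assumed true values), the slope and intercept that minimize the sum of squared residuals can be computed from the textbook closed-form expressions and checked against statsmodels:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 50)
y = 4.0 + 1.5 * x + rng.normal(0, 2, 50)   # simulated data, assumed true values 4 and 1.5

# Closed-form OLS estimates that minimize the residual sum of squares
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# The same estimates from statsmodels, for comparison
res = sm.OLS(y, sm.add_constant(x)).fit()
print("by hand:    ", b0, b1)
print("statsmodels:", res.params[0], res.params[1])
```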
02: The classical linear regression model: The seven assumptions underlying the method
of least squares
Answer:
The Classical Linear Regression Model (CLRM) relies on seven key assumptions to ensure that the
Ordinary Least Squares (OLS) estimators are BLUE—Best Linear Unbiased Estimators. These
assumptions are:
1. Linearity in Parameters:
The model must be linear in the coefficients, like:
• Yi=β0+β1Xi+Ui
• It means the effect of X on Y is added up in a straight-line form.
2. Random Sampling:
The data should be collected randomly so that each observation is independent and
representative of the population.
3. No Perfect Multicollinearity:
The independent variables (X’s) should not be perfectly correlated. If they are, the model
can’t separate their individual effects.
4. Zero Conditional Mean:
The error term Ui should have an average of zero given any value of X:
• E(Ui∣Xi) = 0
• This means no systematic error is left out of the model.
5. Homoscedasticity:
The variance of the errors should be constant for all values of X:
• Var(Ui∣Xi) = σ²
• No matter the value of X, the spread of errors stays the same.
6. No Autocorrelation:
The errors should not be related to each other. That is:
• Cov (Ui, Uj) = 0 for i≠j
7. Normality of Errors (for inference):
The error term Ui should be normally distributed (especially for small samples). This
helps in valid hypothesis testing and confidence intervals.
These assumptions ensure OLS gives us reliable, unbiased, and efficient estimates.
03: Properties of Least-Squares Estimators: The Gauss-Markov Theorem
Answer:
The Gauss-Markov Theorem is a fundamental result in linear regression. It states that under the
Classical Linear Regression Model (CLRM) assumptions (excluding normality), the Ordinary
Least Squares (OLS) estimators are the Best Linear Unbiased Estimators (BLUE).
• Best: They have the lowest variance among all linear and unbiased estimators.
• Linear: The estimators are linear functions of the observed data (Y-values).
• Unbiased: On average, the estimators give the true values of the population parameters.
So, if the assumptions like linearity, random sampling, zero conditional mean, no perfect
multicollinearity, and homoscedasticity hold, then the OLS estimators have the smallest variance among all linear unbiased estimators of the population parameters.
This makes OLS a powerful and reliable method for estimating regression coefficients. However,
if the assumptions are violated (e.g., errors are heteroscedastic or autocorrelated), OLS is still
unbiased but no longer the most efficient—it won’t have the minimum variance.
The Gauss-Markov Theorem does not require normality of the error terms. Normality is only
needed for hypothesis testing (t-tests, F-tests) and constructing confidence intervals.
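The theorem itself is an analytical result, but a small Monte Carlo sketch (illustrative only, with assumed parameter values) can show the "unbiased" part: across many simulated samples, the OLS slope estimates average out to the true slope.

```python
import numpy as np

rng = np.random.default_rng(4)
beta0, beta1 = 2.0, 0.8            # assumed true parameters
n, reps = 50, 5000

slopes = np.empty(reps)
for r in range(reps):
    x = rng.uniform(0, 10, n)
    y = beta0 + beta1 * x + rng.normal(0, 1, n)   # homoscedastic, independent errors
    # OLS slope for this sample
    slopes[r] = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

print("true slope:", beta1)
print("average OLS slope over", reps, "samples:", slopes.mean())
```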
04: The Zero Null Hypothesis and the "2-t" Rule of Thumb
Answer:
In regression analysis, the Zero Null Hypothesis is used to check if an independent variable has
no real effect on the dependent variable. It assumes the true value of the coefficient is zero. For
example, H0: β1 = 0 (X has no effect on Y), tested against H1: β1 ≠ 0.
To test this, we look at the t-value provided in the regression output and use the “2-t” rule of
thumb:
If the absolute value of the t-statistic is greater than 2, we reject the null hypothesis. This
means the variable is likely significant and affects the dependent variable.
If the t-value is less than or equal to 2, we fail to reject the null, meaning the variable might
not have a meaningful effect.
This rule is a quick and easy way to interpret regression results, especially when working with
large samples. However, for small samples or more accurate testing, it's better to use the exact
critical values from the t-distribution table.
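As a hedged illustration (simulated data, made-up variable names), the t-values reported by statsmodels can be compared against 2 in absolute value to apply the rule:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                   # x2 has no true effect on y by construction
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))
res = sm.OLS(y, X).fit()

# Apply the "2-t" rule of thumb to each coefficient
for name, t in zip(["const", "x1", "x2"], res.tvalues):
    verdict = "reject H0: beta = 0" if abs(t) > 2 else "fail to reject H0"
    print(f"{name}: t = {t:6.2f} -> {verdict}")
```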
05: Normality Test
Answer:
A normality test is used in statistics to check whether a set of data, especially regression
residuals, follows a normal distribution (bell-shaped curve). In regression analysis, this is
important because many statistical tests (like t-tests and F-tests) assume that the error terms are
normally distributed. While normality is not essential for OLS estimation, it is crucial for making
valid inferences, especially in small samples.
The purpose of a normality test is to verify the assumption that the residuals (the differences
between actual and predicted values) come from a normally distributed population. If this
assumption is violated, the test results might be misleading.
Three Common Types of Normality Tests:
1. Histogram of Residuals
This is a graphical method. By plotting a histogram of residuals, we can visually check if
the shape resembles a normal distribution. A bell-shaped curve indicates normality.
2. Normal Probability Plot (NPP)
Another graphical device where observed residuals are plotted against expected values
under normality. If the points fall roughly along a straight line, the data is likely normal.
3. Jarque–Bera Test
A formal statistical test that uses skewness and kurtosis to test normality. If the test
statistic is large (with a small p-value), we reject the assumption of normality.
These tools help decide whether the regression results can be reliably used for inference.
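As a sketch (not from the original note), the Jarque-Bera test can be applied to OLS residuals with statsmodels; the data below are simulated with normal errors, so the test should usually not reject normality.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import jarque_bera

rng = np.random.default_rng(6)
x = rng.uniform(0, 10, 200)
y = 3.0 + 1.2 * x + rng.normal(0, 1, 200)   # normal errors by construction

res = sm.OLS(y, sm.add_constant(x)).fit()

# Jarque-Bera test on the residuals: H0 = residuals are normally distributed
jb_stat, jb_pvalue, skew, kurtosis = jarque_bera(res.resid)
print(f"JB = {jb_stat:.3f}, p-value = {jb_pvalue:.3f}, skew = {skew:.3f}, kurtosis = {kurtosis:.3f}")
if jb_pvalue < 0.05:
    print("Reject normality of residuals at the 5% level.")
else:
    print("No evidence against normality at the 5% level.")
```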
06: Polynomial Regression Models
Answer:
Polynomial regression models are an extension of classical linear regression that allow for
nonlinear relationships between the dependent variable Y and the independent variable X.
These models are widely used in econometrics, especially in analyzing cost and production
functions.
A classic example is the U-shaped marginal cost (MC) curve, where MC varies with the level of output. The relationship between output and marginal cost is nonlinear: MC first decreases and then increases as output rises. Such a pattern can be captured by a quadratic (second-degree) polynomial model:
Yi = β0 + β1Xi + β2Xi² + Ui
This equation fits a parabola to the data, capturing the downward and then upward trend in MC. The coefficient β2 determines the curve's shape: if it is positive, the parabola opens upward, producing the U shape.
Polynomial models can go beyond quadratic by including higher powers of X, such as X^3, to
capture more complex patterns. These models help economists and researchers better understand
and predict real-world nonlinear relationships.
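A quadratic model of this kind can be estimated by OLS simply by including both X and X² as regressors, as in the illustrative sketch below (simulated U-shaped cost data with assumed coefficients):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

# Simulated U-shaped marginal cost: MC = 50 - 8*Q + 0.5*Q^2 + error (assumed values)
q = rng.uniform(1, 20, 150)
mc = 50 - 8 * q + 0.5 * q**2 + rng.normal(0, 3, 150)

# Quadratic regression: include both Q and Q^2 as regressors
X = sm.add_constant(np.column_stack([q, q**2]))
res = sm.OLS(mc, X).fit()
print(res.params)   # a positive estimate on Q^2 implies the parabola opens upward (U shape)
```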
Chapter 09:
01: Nature of Dummy Variables
Answer:
Dummy variables are artificial variables created to represent qualitative (categorical) data in
regression models. While regression analysis typically requires numerical input, many important
variables—like gender, region, education level, or season—are non-numeric. Dummy variables
solve this issue by converting categories into binary values (0 or 1).
For example, to include gender in a regression model, we can define a dummy variable:
Di = 1 if the individual is male, and Di = 0 if the individual is female.
This allows the regression to estimate how being male (relative to female) affects the dependent
variable. If there are more than two categories (e.g., four regions), we use (k – 1) dummy
variables to avoid the dummy variable trap, which refers to perfect multicollinearity.
In short, dummy variables make it possible to incorporate qualitative factors into quantitative
models and interpret their effects meaningfully.
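For illustration (hypothetical variable names and simulated data with an assumed effect size), a gender dummy can be constructed and included in an OLS regression as follows:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 200

male = rng.integers(0, 2, n)            # dummy variable: 1 = male, 0 = female
exper = rng.uniform(0, 20, n)           # a quantitative control variable
# Simulated wages: males earn 5 units more on average (assumed effect)
wage = 20 + 5 * male + 0.8 * exper + rng.normal(0, 2, n)

X = sm.add_constant(np.column_stack([male, exper]))
res = sm.OLS(wage, X).fit()
# The coefficient on the dummy estimates the male/female wage gap, holding exper fixed
print(res.params)
```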
02: ANOVA Models
Answer:
ANOVA (Analysis of Variance) models are used to test whether there are statistically
significant differences between the means of multiple groups. In regression analysis, ANOVA
helps determine how well the independent variables explain the variation in the dependent
variable.
In a simple linear regression, ANOVA divides the total variation in the dependent variable into
two parts: the explained sum of squares (ESS), which is the variation accounted for by the regression, and the residual sum of squares (RSS), which is the variation left unexplained.
The main test statistic used in ANOVA is the F-test, which checks whether the regression model
as a whole is significant. A large F-value with a low p-value suggests that at least one
explanatory variable has a meaningful effect on the dependent variable.
ANOVA is especially useful when working with categorical variables (e.g., dummy variables)
and in comparing multiple group means simultaneously within a regression framework.
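As an illustration (simulated data, with one deliberately irrelevant regressor), the overall F-statistic and its p-value are reported directly by statsmodels after an OLS fit:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 120
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 + 0.0 * x2 + rng.normal(size=n)   # x2 deliberately has no effect

res = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

# Overall F-test: H0 = all slope coefficients are jointly zero
print("F =", res.fvalue, "p-value =", res.f_pvalue)
print("R-squared =", res.rsquared)
```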
04: What Happens if the Dependent Variable is a Dummy Variable?
Answer:
When the dependent variable is a dummy variable, it means the outcome is binary, taking
only two values (e.g., 1 = success, 0 = failure). In such cases, linear regression is not
appropriate, because it can predict values outside the 0–1 range and assumes constant variance,
which violates key assumptions.
In summary, if the dependent variable is a dummy, use logit or probit models for better
accuracy and valid interpretation of probabilities, instead of regular linear regression.
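A minimal logit sketch (simulated binary outcome with assumed coefficients) shows the kind of model recommended when the regressand is a dummy; predicted probabilities stay within the 0-1 range.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 500
x = rng.normal(size=n)

# Simulated binary outcome: P(y = 1) follows a logistic function of x (assumed values)
p = 1 / (1 + np.exp(-(-0.5 + 1.5 * x)))
y = rng.binomial(1, p)

X = sm.add_constant(x)
logit_res = sm.Logit(y, X).fit(disp=0)   # disp=0 silences the optimizer output
print(logit_res.params)                  # coefficients are on the log-odds scale
print(logit_res.predict(X)[:5])          # predicted probabilities lie strictly between 0 and 1
```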
05: Detection of Multicollinearity
Answer:
Multicollinearity arises when independent variables in a regression model are highly correlated, making it difficult to estimate their individual effects. It can be detected using the following methods:
1. A high R² accompanied by few statistically significant t-ratios on the individual coefficients.
2. High pairwise correlations among the explanatory variables.
3. Variance Inflation Factors (VIF): a VIF well above 10 for a regressor is commonly taken as a sign of serious multicollinearity (see the sketch below).
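The short sketch below (simulated, deliberately collinear regressors; variable names are made up) computes VIFs using statsmodels as one of the possible checks listed above.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(11)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)   # x2 nearly duplicates x1 -> near-perfect collinearity
x3 = rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2, x3]))

# VIF for each regressor (column 0 is the constant, so start at 1)
for i, name in enumerate(["x1", "x2", "x3"], start=1):
    print(name, "VIF =", round(variance_inflation_factor(X, i), 1))
```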
Chapter 11:
01: The nature of Heteroscedasticity
Answer:
Heteroscedasticity refers to the situation where the variance of the error terms is not constant
across observations in a regression model.
It commonly arises in cross-sectional data, where the spread of residuals changes with the level
of an independent variable. For example, higher-income individuals might show more variability
in spending than lower-income ones, causing unequal error variance. This violates the classical
assumption of homoscedasticity, where error terms are expected to have the same variance.
While heteroscedasticity does not make OLS estimators biased, it does make them inefficient
and results in incorrect standard errors, affecting the reliability of hypothesis tests. As a
result, t-statistics, confidence intervals, and p-values may no longer be valid.
It can be visually detected through residual plots showing a fan or funnel shape and confirmed
using formal tests like Breusch-Pagan or White’s test. To fix heteroscedasticity, one might use
robust standard errors, weighted least squares (WLS), or transform the variables (e.g., taking
logarithms).
02: Detection of Heteroscedasticity
Answer:
Detecting heteroscedasticity is important for ensuring the validity of regression results. It can be
identified using the following methods:
1. Informal Method:
Often based on prior knowledge or data patterns. For example, in income or expenditure
data, variability tends to increase with higher values, suggesting possible
heteroscedasticity.
2. Graphical Method:
Plotting the residuals against the predicted values or an independent variable. If the
residuals show a funnel-shaped or fan-shaped pattern (narrow at one end and wide at
the other), this indicates non-constant variance.
3. Formal Method – Park Test:
A statistical test where the log of the squared residuals is regressed on the log of an
independent variable. If the coefficient of the explanatory variable is statistically
significant, it suggests the presence of heteroscedasticity.
These methods help assess whether the assumption of constant error variance is violated, which
is crucial for making accurate inferences in regression analysis.
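As an illustrative complement to the methods above (simulated heteroscedastic expenditure data, assumed parameters), the Breusch-Pagan test from statsmodels formalizes the graphical check, and heteroscedasticity-robust standard errors are shown as one possible remedy.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(12)
n = 300
income = rng.uniform(1, 10, n)
# Error variance grows with income -> heteroscedasticity by construction
u = rng.normal(0, 0.5 * income)
spending = 1.0 + 0.6 * income + u

X = sm.add_constant(income)
res = sm.OLS(spending, X).fit()

# Breusch-Pagan test: H0 = constant error variance (homoscedasticity)
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, X)
print("Breusch-Pagan LM p-value:", lm_pvalue)

# One remedy: heteroscedasticity-robust (White/HC1) standard errors
robust = sm.OLS(spending, X).fit(cov_type="HC1")
print("ordinary SEs:", res.bse)
print("robust SEs:  ", robust.bse)
```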
Chapter: Autocorrelation
01. What is the nature of Autocorrelation?
Answer:
Autocorrelation refers to the situation where the error terms in a regression model are
correlated with each other across time or observations.
It typically arises in time series data, where the value of an error term in one period may be
influenced by its value in a previous period. This violates the classical regression assumption that
error terms should be independent of one another. When autocorrelation is present, the OLS
estimators remain unbiased but become inefficient, and their standard errors may be
underestimated, leading to misleading t-tests and confidence intervals.
02: Based on the Durbin-Watson d statistic, how would you distinguish "pure"
autocorrelation from specification bias?
Answer:
The Durbin-Watson (d) statistic is commonly used to detect first-order autocorrelation in
regression residuals. Its value ranges from 0 to 4:
• d ≈ 2 suggests no autocorrelation,
• d substantially below 2 (toward 0) indicates positive autocorrelation,
• d substantially above 2 (toward 4) indicates negative autocorrelation.
To distinguish pure autocorrelation from specification bias, the key is to evaluate model
correctness:
• If the model is correctly specified (right functional form, all relevant variables included)
and the Durbin-Watson test shows significant autocorrelation, it is likely pure
autocorrelation due to time-based relationships in the data.
• If the model is mis-specified (e.g., missing variables, wrong functional form), the
residuals may appear autocorrelated, not due to true autocorrelation but due to
specification bias.
In such cases, checking residual plots, performing Ramsey RESET tests, or re-estimating the
model with better specification can help confirm whether the issue is pure autocorrelation or
arises from model misspecification.
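An illustrative computation of the d statistic (simulated AR(1) errors with an assumed coefficient of 0.7) using statsmodels:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(13)
n = 200
x = rng.uniform(0, 10, n)

# Build AR(1) errors: u_t = 0.7 * u_{t-1} + e_t (positive autocorrelation by construction)
e = rng.normal(0, 1, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + e[t]

y = 1.0 + 0.5 * x + u
res = sm.OLS(y, sm.add_constant(x)).fit()

d = durbin_watson(res.resid)
print("Durbin-Watson d =", round(d, 2))   # values well below 2 suggest positive autocorrelation
```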