Econometrics I

Econometrics is the application of mathematics and statistical methods to analyze economic data, combining economic theory with empirical verification. It has two main branches: theoretical econometrics, which develops methods for measuring economic relationships, and applied econometrics, which uses these methods to study specific economic fields. Key concepts include regression analysis, types of data (cross-section, time series, panel), and the assumptions of classical linear regression models, which are essential for accurate estimation and forecasting in economics.


ECONOMETRICS

INTRODUCTION TO ECONOMETRICS
• The term Econometrics combines two Greek words, “oikonomia”
(economy) and “metron” (measure).
• Its literal meaning is ‘measurement in Economics’.
• It is a branch of economics that combines mathematics, statistics and
economic theory, or the application of mathematical and statistical
methods to the analysis of economic data.
INTRODUCTION TO ECONOMETRICS
• Econometrics is the statistical and mathematical analysis of economic
relationships: the science of empirically verifying economic theories, a
set of tools used for forecasting future values of economic variables,
and the science of making quantitative policy recommendations in
government.
• Econometrics was recognised as a branch of Economics only in the
1930s, with the foundation of the Econometric Society by Ragnar
Frisch and Irving Fisher.
• The term Econometrics was first used by Paweł Ciompa as early as
1910, but Ragnar Frisch is credited with coining the term and
establishing it as a subject.
• In his history of Econometrics, R. J. Epstein viewed Henry Moore (1869–
1958) as the father of modern Econometrics because of his attempt in
1911 to provide statistical evidence for the marginal productivity theory.
Classification of Econometrics
• Theoretical econometrics is concerned with the development of
appropriate methods for measuring economic relationships specified
by econometric models.
• In applied econometrics we use the tools of theoretical econometrics to
study some special field(s) of economics and business, such as the
production function, investment function, demand and supply
functions, portfolio theory, etc.
Uses of Econometrics
• Testing economic theories
• Formulating and evaluating economic policy
• Predicting macroeconomic variables
• Finding macroeconomic relationships
• Finding microeconomic relationships
• Finance
• Private and public sector decision making
Types of Data
1. Cross Section Data: a one-dimensional micro data set collected at a single
point in time, ignoring differences in time (firms, households, regions,
countries, etc.)
2. Time Series Data: macro data collected for the same entity at different
points in time (tick by tick, daily, weekly, monthly, yearly, etc.)
3. Panel Data: time series over different cross-section units; data collected
for different entities or subjects (people, countries, the same set of stocks)
at different points in time.
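The three data types above can be sketched with small Python structures. All entity names and numbers below are invented for illustration:

```python
# Hypothetical illustration of the three data types (all values invented).

# Cross-section: many entities, one time point (e.g. sales in 2020).
cross_section = {"firm_A": 120, "firm_B": 90, "firm_C": 143}

# Time series: one entity, many time points (e.g. firm_A sales by year).
time_series = {2018: 100, 2019: 110, 2020: 120}

# Panel: many entities observed at many time points.
panel = {
    ("firm_A", 2019): 110, ("firm_A", 2020): 120,
    ("firm_B", 2019): 90,  ("firm_B", 2020): 95,
}

# A panel combines both dimensions: entities x time periods.
entities = {e for e, _ in panel}
years = {t for _, t in panel}
print(len(entities), len(years))  # 2 entities observed in 2 years
```

A panel is therefore indexed by an (entity, time) pair, which is why it contains both cross-sectional and time-series variation.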
Regression and Causation
• The term regression was coined by Francis Galton.
• In correlation analysis, the primary objective is to measure the
strength or degree of linear association between two random
variables.
• Regression analysis tries to estimate or predict the average
value of one variable (the dependent variable) on the basis of the
fixed values of other variables (the independent variables).
• In the Econometrics literature, the dependent and explanatory
variables go by various names: explained and explanatory,
regressand and regressor, predictand and predictor, endogenous
and exogenous, etc.
Simple Linear Regression (Two Variable)
• Consider a hypothetical example;
• Total population is 60 families
• Y = weekly family consumption expenditure
• X = weekly disposable family income
• The 60 families were divided into 10 groups of families with approximately the
same level of income:
80,100,120,140,160,180,200,220,240,260
• The detailed table is given in the next page.
• From the table we can understand that average weekly expenditure in each income group
is;
65,77,89,101,113,125,137,149,161,173
• We have 10 fixed values of X, and the corresponding Y values against each of the X
values.
• We call these mean values Conditional Expected Values, E(Y/X), the expected
value of Y for a given value of X.
• If we add the weekly consumption expenditure of all 60 families and divide
this by 60, i.e., (∑Y/N) = 7272/60 = 121.20, we get the Unconditional Mean, or
expected weekly consumption expenditure, E(Y).
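The distinction between conditional and unconditional means can be sketched in Python with a small invented sample (not the 60-family table itself):

```python
from collections import defaultdict

# Hypothetical (invented) income-expenditure pairs: (X, Y).
data = [(80, 55), (80, 65), (80, 75),
        (100, 70), (100, 84)]

# Conditional means E(Y/X): average Y within each income group.
groups = defaultdict(list)
for x, y in data:
    groups[x].append(y)
cond_means = {x: sum(ys) / len(ys) for x, ys in groups.items()}

# Unconditional mean E(Y): average Y over all families, pooled.
uncond_mean = sum(y for _, y in data) / len(data)

print(cond_means)   # {80: 65.0, 100: 77.0}
print(uncond_mean)  # 69.8
```

The conditional means differ by income group, while the unconditional mean pools every family into a single average, exactly as in the 60-family example.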
• The above diagram shows Conditional distribution of expenditure for various levels of
income.
• There are considerable variations in weekly consumption expenditure in each income
group.
• Despite the variability in consumption expenditure within each income group,
the average weekly consumption increases as income increases.
• If we join these conditional mean values, we obtain what is known as the
Population Regression Line (PRL), or Population Regression Curve
(PRC).
• More simply, it is the regression of Y on X.
• A population regression curve is simply the locus of the conditional
means of the dependent variable for the fixed values of the explanatory
variable.
• The figure above shows that, for each X, there are Y values spread
around the conditional mean of those Y values.
• These Y values are distributed symmetrically around their respective
means.
• The regression line passes through these conditional mean values.
Population Regression Function (PRF)
• Each conditional mean E(Y/Xi) is a function of Xi (the value of X):
E(Y/Xi) = f(Xi)
• The expected value of Y for a given X is functionally related to Xi.
• It shows how the average response of Y varies with X.
• In our example we know that consumption is a linear function of
income, and hence we have a linear population regression function;
E(Y/Xi) = f(Xi) = β0 + β1Xi
• In regression, we estimate the PRF, i.e., we estimate the values of the
unknowns β0 and β1 on the basis of observations of Y and X.
• Note that Yi = β0 + β1Xi² is non-linear in the variable X but still linear in
the parameters; a function such as Yi = β0 + β1²Xi, which is non-linear in
the parameters, is a non-linear regression function.
Stochastic Specification of PRF
• Yi = E(Y/Xi)+ui
• Yi = β0 + β1Xi+ui
• ui = Yi – E(Y/Xi)
• Yi = E(Y/Xi)+ui has two components: a fixed component and a random component.
• E(Y/Xi) is the mean consumption expenditure of all the families with the
same level of income; this is the systematic, or deterministic, component.
• ui is purely random: it is a proxy for all the omitted and neglected variables
that may affect Y but are not included in the regression model; this is the
non-systematic component.
• ui is known as the stochastic disturbance or stochastic error term.
Sample Regression Function (SRF)
• The SRF estimates the PRF from sample data.
• We take two random samples from our earlier example of income and
consumption, and plot them in a diagram.
• An estimator, also known as a (sample) statistic, is simply a rule, formula or
method that tells how to estimate the population parameter from the
information provided by the sample at hand. A particular numerical value
obtained by the estimator in an application is known as an estimate.
• So, the Population Regression Function is Yi = β1 + β2Xi + ui,
• and the Sample Regression Function is Yi = β̂1 + β̂2Xi + ûi.
• The primary objective of regression analysis is to estimate the PRF on the basis
of the SRF.
Estimation of Regression Equation
• There are various methods to estimate the values of the unknown
parameters, such as:
• the Ordinary Least Squares (OLS) method
• the Maximum Likelihood method
• the Method of Moments
• Ordinary Least Squares is the most popular and widely used
method of estimation.
Ordinary Least Squares
• OLS was introduced by Carl Friedrich Gauss, a German mathematician.
• Our population regression function (PRF) is Yi = β1 + β2Xi +ui
• The PRF is not directly observable, so we estimate it from the SRF.
• Yi = β̂1 + β̂2Xi +ûi
= Ŷi + ûi
where Ŷi is the estimated (conditional mean) value of Yi
• ûi = Yi − Ŷi
= Yi − β̂1 − β̂2Xi which shows that the ûi (the residuals) are simply the
differences between the actual and estimated Y values.
• Now given ‘n’ pairs of observations on Y and X, we would like to
determine the SRF in such a manner that it is as close as possible to the
actual Y.
• For this we have to choose the SRF in such a way that the sum of the
residuals, i.e., ∑ûi = ∑(Yi − Ŷi), is as small as possible.
• In the figure above we can see that the algebraic sum of the ûi can be small
(even zero) even though the individual ûi are widely scattered about the SRF.
• The method of least squares chooses β̂1 and β̂2 in such a manner that, for a given
sample or set of data, ∑ûi² is as small as possible.
• For a given sample, the method of least squares provides unique estimates of
β1 and β2 that give the smallest possible value of ∑ûi².
• The estimators obtained through this method are known as the least-squares
estimators.
• We select the values of β̂1 and β̂2 so that the errors are as small as possible.
• If we minimised the raw sum of residuals instead, positive errors would exactly
balance negative errors, and ∑ûi would always be zero.
• We overcome this by adopting the Least Squares Criterion: minimising the sum of
squared residuals (RSS).
• Squaring the residuals does two things:
• 1) It avoids the possibility that large positive residuals and large
negative residuals offset each other and still lead to a small or
even zero sum of residuals.
• 2) It implicitly assigns a larger weight to numerically large residuals,
regardless of whether they are positive or negative.
• The smaller the sum of squared residuals (RSS), the better the fit; if it
were zero we would have a perfect-fit line, although that is practically
impossible.
RSS = û1² + û2² + û3² + û4² + ... + ûn²
• Because we minimise the squared residuals, the method is called
Ordinary Least Squares (OLS).
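The two-variable OLS estimator can be sketched directly from the textbook formulas, β̂2 = ∑(Xi − X̄)(Yi − Ȳ)/∑(Xi − X̄)² and β̂1 = Ȳ − β̂2X̄. The data below reuse the conditional means from the income-consumption example (65, 77, 89, ... at incomes 80, 100, 120, ...), which happen to lie exactly on a line:

```python
# Minimal two-variable OLS, a sketch of Yi = b1 + b2*Xi + ui.

def ols(x, y):
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: covariance of X and Y divided by variance of X.
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
         / sum((xi - x_bar) ** 2 for xi in x)
    # Intercept: the fitted line passes through (x_bar, y_bar).
    b1 = y_bar - b2 * x_bar
    return b1, b2

x = [80, 100, 120, 140, 160]   # weekly income
y = [65, 77, 89, 101, 113]     # mean weekly consumption

b1, b2 = ols(x, y)
residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
print(b1, b2)                     # 17.0 0.6
print(round(sum(residuals), 10))  # residuals sum to 0 (intercept included)
```

The fit confirms the property used in the text: with an intercept in the model, the OLS residuals always sum to zero, which is exactly why we minimise ∑ûi² rather than ∑ûi.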
Interpretation of a Linear Regression Equation
Yi = β̂1 + β̂2Xi +ûi
• Suppose Y and X are expressed in natural units (no log or other
transformation).
• A one-unit increase in X is associated with a β̂2-unit increase in Y.
• The constant β̂1 gives the value of Y when X = 0; it may or may not have
a plausible meaning, depending on the context.
Assumptions of Classical
Linear Regression Model
Assumptions of Classical Linear Regression Model
• The Gaussian, standard, or classical linear regression model
(CLRM), which is the cornerstone of most econometric theory,
makes 10 assumptions.
• It is classical in the sense that it was first developed by Gauss in
1821 and has since served as a norm or standard against which
regression models that do not satisfy the Gaussian assumptions may
be compared.
1) The Model is Linear in Parameters
• Yi = β1 + β2Xi + ui
• The parameters β1 and β2 appear with power one and are not
multiplied or divided by any other parameter.
• Linearity implies that a one-unit change in X has the same effect on Y
irrespective of the initial value of X.
• OLS is a method for estimating models that are linear in this sense.
2) Non-Stochasticity of Xi
• The values of the explanatory variable X must vary in the sample but are fixed
in repeated sampling.
• It is important to understand the concept of “fixed values in repeated
sampling,” which can be explained in terms of the table above. Consider the
various Y populations corresponding to the levels of income shown in the
table. Keeping the value of income X fixed, say, at $80, we draw a family at
random and observe its weekly family consumption expenditure Y as, say, $60.
Still keeping X at $80, we draw another family at random and observe its Y
value as $75. In each of these drawings (i.e., repeated sampling), the value of X
is fixed at $80. We can repeat this process for all the X values.
3) Zero Mean for Disturbances
• The conditional mean of the random error terms ui for any given value of
Xi is equal to zero.
• The expected value of the disturbance term in any observation should be
zero. Sometimes it will be positive, sometimes it will be negative, but it
should not have a systematic tendency in either direction.
• E(ui/Xi) = 0
4) The Disturbance Term is Homoscedastic
• Homoscedastic means equal variance or equal spread.
• The word comes from the Greek verb skedannymi, which means to
disperse or scatter.
• The conditional population variances of the random error terms ui
corresponding to each population value of Xi are equal to the same
finite positive constant, σ².
• Once we have generated the sample, the disturbance term will be
greater in some observations and smaller in others, but there should
be no a priori reason for it to be more erratic in some observations
than in others.
5) Non-autocorrelated Errors
• The disturbance terms are not subject to autocorrelation: there should be no
systematic association between the values of the disturbance term in any two
observations.
• Consider the random error terms ui and uj (i ≠ j) corresponding to two different
population values Xi and Xj, where Xi ≠ Xj. This assumption states that ui and uj
have zero conditional covariance.
• For example, just because the disturbance term is large and positive in one
observation, there should be no tendency for it to be large and positive in the
next, and vice versa.
• The values of the disturbance term should be completely independent of one
another.
• This means that, for given values of X, the deviations of any two Y values from
their mean values do not exhibit any systematic pattern.
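One simple way to illustrate this assumption is to compute the first-order sample autocorrelation of a disturbance series, i.e. the correlation between each disturbance and the next one. For simulated white noise (which satisfies the assumption by construction) this correlation should be near zero. This is only an illustrative sketch, not a formal test such as Durbin-Watson:

```python
import random

random.seed(0)
# Simulate 1000 iid N(0, 1) disturbances: non-autocorrelated by construction.
u = [random.gauss(0, 1) for _ in range(1000)]

def corr(a, b):
    # Pearson correlation of two equal-length lists.
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

# First-order autocorrelation: u_t against u_{t+1}.
r1 = corr(u[:-1], u[1:])
print(round(r1, 3))  # near zero for non-autocorrelated errors
```

If the errors were autocorrelated (for example, generated as u_t = 0.8·u_{t-1} + e_t), this correlation would instead be large and positive.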
6) Zero Covariance Between ui and Xi
• The disturbance term u and explanatory variable X are
uncorrelated.
• We assumed that X and u have separate influence on Y. But if X and
u are correlated, it is not possible to assess their individual effects on
Y.
• Thus, if X and u are positively correlated, X increases when u
increases and it decreases when u decreases. Similarly, if X and u are
negatively correlated, X increases when u decreases and it decreases
when u increases. In either case, it is difficult to isolate the influence
of X and u on Y.
7) The Number of Observations must be Greater than the Number of
Parameters to be Estimated
• The number of observations n must be greater than the number of
parameters to be estimated.
• Suppose we have only one pair of observations on Y and X.
• From this single observation there is no way to estimate the two
unknowns, β1 and β2.
• We need at least two pairs of observations to estimate the two
unknowns.
8) Variability in X values
• The X values in a given sample must not all be the same.
• If there is very little variation in family income, we will not be able to
explain much of the variation in the consumption expenditure.
• Variation in both Y and X is essential to use regression analysis as a
research tool. In short, the variables must vary.
9) Regression Model is Correctly Specified
• The regression model is correctly specified.
• There is no specification bias or error in the model used in empirical
analysis.
• There are no errors such as omitting important variables from the
model, choosing the wrong functional form, or making wrong
stochastic assumptions about the variables of the model.
10) No Perfect Multicollinearity
• There is no perfect linear relationship among the explanatory
variables in a linear regression model.
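Why perfect multicollinearity is fatal can be shown with a small sketch: when one regressor is an exact linear function of another, the X′X matrix of the OLS normal equations is singular, so no unique estimates exist. The data here are invented and the intercept is omitted for brevity:

```python
# Two perfectly collinear regressors: x2 = 2 * x1.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0 * v for v in x1]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

# The 2x2 X'X matrix for regressors x1 and x2.
xtx = [[dot(x1, x1), dot(x1, x2)],
       [dot(x2, x1), dot(x2, x2)]]

# A singular matrix has determinant zero and cannot be inverted,
# so the normal equations (X'X)b = X'y have no unique solution.
det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
print(det)  # 0.0
```

With only approximate (imperfect) collinearity the determinant is non-zero but close to zero, which is why near-multicollinearity inflates the variances of the estimates rather than making estimation impossible.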
Why We Use OLS Coefficient Estimator?
• Under the assumptions of the Classical Linear Regression Model, the
OLS coefficient estimators have several desirable statistical properties,
which are summarised by the Gauss-Markov Theorem.
Gauss-Markov Theorem
• Given the assumptions of the classical linear regression model, the
least-squares estimates possess some ideal or optimum properties.
• These properties are contained in the well-known Gauss–Markov
Theorem.
• An estimator, such as the OLS estimator, is said to be a best linear
unbiased estimator (BLUE) if:
• It is linear.
• It is unbiased, i.e. its average or expected value, E(β̂2), is equal to
the true value, β2.
• It has minimum variance; an unbiased estimator with the least
variance is known as an efficient estimator.
• Gauss–Markov Theorem: Given the assumptions of the
CLRM, the least-squares estimators, in the class of linear
unbiased estimators, have minimum variance; that is,
they are BLUE.
• If two estimators are both linear and unbiased, one
would choose the estimator with the smaller variance
because it is more likely to be close to β2 than the
alternative estimator. In short, one would choose the BLUE
estimator.
• We will not find a linear unbiased estimator whose variance
is smaller than that of the OLS estimator.
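The unbiasedness part of the theorem can be illustrated by a small Monte Carlo sketch: draw many repeated samples from a hypothetical model Yi = 1 + 2·Xi + ui (with X fixed across samples, as assumption 2 requires), estimate the slope by OLS each time, and average the estimates. The model, sample size, and number of replications below are all invented for illustration:

```python
import random

random.seed(42)
x = list(range(1, 51))           # X values fixed in repeated sampling
x_bar = sum(x) / len(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)

def ols_slope(y):
    # Textbook OLS slope: sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2)
    y_bar = sum(y) / len(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

estimates = []
for _ in range(2000):            # 2000 repeated samples
    # True model: Yi = 1 + 2*Xi + ui, ui ~ N(0, 1).
    y = [1 + 2 * xi + random.gauss(0, 1) for xi in x]
    estimates.append(ols_slope(y))

mean_b2 = sum(estimates) / len(estimates)
print(round(mean_b2, 2))         # close to the true slope of 2
```

The average of the slope estimates across repeated samples is very close to the true β2 = 2, which is what E(β̂2) = β2 (unbiasedness) means in practice.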
