
Statistical Inference Lecture-8

Regression analysis is a statistical process used to estimate relationships between variables. Simple linear regression uses one independent variable to predict an outcome, while multiple regression uses two or more independent variables. Regression finds the line of "best fit" to describe the relationship between a dependent variable and independent variables. The regression equation includes an intercept (a), slope (b), and error term (u) to account for randomness. Coefficients a and b are estimated using formulas, and their significance can be tested using t-statistics compared to critical values from t-tables.


Lec-14

Regression Analysis
Presented by Dr. Muhammad Khalid Sohail
Regression
• In statistics, regression analysis is a statistical process
for estimating the relationships among variables, with the
focus on the relationship between a dependent variable and
one or more independent variables.
• The two basic types of regression are simple linear
regression and multiple linear regression.
• Simple linear regression uses one independent
variable to explain and/or predict the outcome Y,
while
• multiple regression uses two or more independent
variables to predict the outcome Y (the dependent
variable).
• Regression is concerned with describing and evaluating the relationship between a
given variable and one or more other variables. More specifically, regression is an
attempt to explain movements in a variable by reference to movements in one or more
other variables.
• Denote the dependent variable by y and the independent variable(s) by x1, x2, ... , xk
where there are k independent variables.

• Some alternative names for the y and x variables:


y                     x
dependent variable    independent variables
regressand            regressors
effect variable       causal variables
explained variable    explanatory variables

• Note that there can be many x variables but we will limit ourselves to the case where
there is only one x variable to start with. In our set-up, there is only one y variable.
Regression is different from Correlation

• If we say y and x are correlated, it means that we are treating y and x in a completely symmetrical
way.

• In regression, we treat the dependent variable (y) and the independent variable(s) (x’s) very
differently. The y variable is assumed to be random or “stochastic” in some way, i.e. to have a
probability distribution. The x variables are, however, assumed to have fixed (“non-stochastic”)
values in repeated samples.

Simple Regression

• For simplicity, say k=1. This is the situation where y depends on only one x
variable.

• Examples of the kind of relationship that may be of interest include:


• Sales (Y) depend upon Advertising (X)
• Consumption (Y) depends upon Income (X)
• How asset returns vary with their level of market risk
• Measuring the long-term relationship between stock prices
and dividends.
• Constructing an optimal hedge ratio

Finding a Line of Best Fit

• We can use the general equation for a straight line,


y=a+bx
to get the line that best “fits” the data.

• However, this equation (y=a+bx) is completely deterministic.

• Is this realistic? No. So what we do is to add a random disturbance term, u into the equation.
yt = α + βxt + ut
where t = 1, 2, ..., T indexes the observations.
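The difference between the deterministic line and the stochastic model can be seen in a minimal sketch; the parameter values (α = 2, β = 0.5) and the sample size are invented for illustration:

```python
# Contrast the deterministic line y = a + b*x with the stochastic
# model yt = alpha + beta*xt + ut. Parameter values are made up.
import random

random.seed(0)
alpha, beta = 2.0, 0.5                 # assumed "true" intercept and slope
x = [float(t) for t in range(1, 11)]   # t = 1, ..., 10

y_deterministic = [alpha + beta * xt for xt in x]                    # no randomness
y_stochastic = [alpha + beta * xt + random.gauss(0, 1) for xt in x]  # adds ut

# The disturbance ut is what makes observed data scatter around the line.
```

With the disturbance term added, the observed points no longer lie exactly on the line, which is why we must estimate α and β from data rather than read them off directly.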

Why do we include a Disturbance term?

• The disturbance term can capture a number of features:

- We always leave out some determinants of yt (omitted variables)


- There may be errors in the measurement of yt that cannot be
modelled.
- Random outside influences on yt which we cannot model

ˆb nXiYi Xi Yi
Regression
nXi Xi 
2
• Regression Model 2

• yt =  +  xt + u t   Xi XYi Y
• a=alpha=intercept 
 Xi X 2

• b= beta=slope
aˆ Y bˆX
• u=error term
• After calculating a^ a and b^ we can
predict any value of Y^, by the
following formula
• Y^=a^+b^X
nXiYi Xi Yi
bˆ 
nXi2 Xi 
2


X XY Y
i i

X X
2
i

aˆ Y bˆX
nXiYi Xi Yi
bˆ 
nXi2 Xi 
Regression

• Significance of â and b̂:
• If |t(â)| > 2, â is significant.
• If |t(b̂)| > 2, b̂ is significant.
• OR, compare against the critical value (CV) from the t-table, using

  s² = Σût² / (T − 2)

  SE(α̂) = s·√( Σxt² / (T·Σ(xt − x̄)²) ) = s·√( Σxt² / (T·Σxt² − T²·x̄²) )

  SE(β̂) = s·√( 1 / Σ(xt − x̄)² ) = s·√( 1 / (Σxt² − T·x̄²) )

• If |t(â)| > CV, â is significant.
• If |t(b̂)| > CV, b̂ is significant.
• CV = critical value (from the t-table).
• Say N = 10, α = 0.10.
• H0: α = 0 and β = 0
• H1: α ≠ 0 and β ≠ 0
• It is a two-tailed test, so α/2 = 0.10/2 = 0.05 and the table value is
t(0.05, N−2) = t(0.05, 8) ≈ 1.86.
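The significance test can be sketched end to end in a few lines; the data are invented (5 observations, so T − 2 = 3 degrees of freedom, with two-tailed table value t(0.05, 3) = 2.353, unlike the lecture's N = 10 example):

```python
# t-tests for a^ and b^ on made-up data, using the SE formulas above.
from math import sqrt

X = [1, 2, 3, 4, 5]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
T = len(X)
x_bar, y_bar = sum(X) / T, sum(Y) / T

# OLS estimates (deviations-from-means form of the formulas).
Sxx = sum((x - x_bar) ** 2 for x in X)
b_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / Sxx
a_hat = y_bar - b_hat * x_bar

# Residual standard error: s^2 = Sum(u^2) / (T - 2)
residuals = [y - (a_hat + b_hat * x) for x, y in zip(X, Y)]
s = sqrt(sum(u ** 2 for u in residuals) / (T - 2))

# Standard errors, then t-statistics.
se_a = s * sqrt(sum(x ** 2 for x in X) / (T * Sxx))
se_b = s * sqrt(1 / Sxx)
t_a = abs(a_hat / se_a)
t_b = abs(b_hat / se_b)

cv = 2.353  # t-table value t(0.05, 3), two-tailed alpha = 0.10
print(f"t(a^) = {t_a:.2f}: {'significant' if t_a > cv else 'not significant'}")
print(f"t(b^) = {t_b:.2f}: {'significant' if t_b > cv else 'not significant'}")
```

With these data the slope is clearly significant while the intercept is not, which illustrates that each coefficient gets its own t-test against the same critical value.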
Regression
• Try to solve some regression questions from the book.
