How To Do Linear Regression With Excel

Linear regression is used to model the ratio of passenger traffic to population (PAX/POP) as a function of per capita personal income (PCPI), employment, and national average yield. The regression results show that PCPI and national average yield are positively correlated with PAX/POP, while employment is negatively correlated. The regression model has a high goodness of fit, as indicated by R-Square and Adjusted R-Square values close to 1, and both the overall regression and the individual coefficients are statistically significant.


Linear Regression with Excel

Brief Tutorial for the Forecasting Tutorial, MA4880
First, please make sure that you have the Analysis ToolPak add-in installed and enabled in Excel (in Excel for Windows: File > Options > Add-ins > Manage: Excel Add-ins > Go, then check Analysis ToolPak).
Copy the Forecast Data and LAX Master Plan Equation

On the Data tab, open Data Analysis and, from the list of analysis tools, select Regression.


Let’s Do Regression
You are asked to specify the input data.
• “Y Range” refers to the Dependent variable.
• “X Range” refers to the Independent variables.
For this analysis, we assume that PAX/(2*POP) is a function of PCPI, Employment, and Nat’l Avg Yield.
So Column H is the Dependent variable, and Columns B-D are the Independent variables.
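
For readers who want to cross-check the Excel results outside Excel, here is a minimal sketch of the same model setup in Python with pandas and statsmodels. It assumes, hypothetically, that the workbook data have been exported to a CSV file named forecast_data.csv with columns PCPI, EMP, YIELD and PAX_2POP (the PAX/(2*POP) values from Column H); none of these file or column names come from the tutorial.

import pandas as pd
import statsmodels.api as sm

data = pd.read_csv("forecast_data.csv")              # hypothetical export of the workbook data

y = data["PAX_2POP"]                                 # dependent variable (Column H)
X = sm.add_constant(data[["PCPI", "EMP", "YIELD"]])  # independent variables (Columns B-D) plus an intercept

model = sm.OLS(y, X).fit()
print(model.summary())                               # statistics comparable to Excel's regression output sheet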

To specify the input range, click the button next to the field to go to the range specification screen.
Let’s Do Regression
Here, you can select the data range you want to use as the Dependent or Independent variables. (This has to be done one range at a time.)
• For the Dependent variable, all data must be in a single column.
• For the Independent variables, the data can be in multiple columns, and they will appear as “X Variable 1, 2, … n” in the output worksheet.
• The Dependent and Independent ranges must contain the same number of rows, so please construct your table carefully.
Let’s Do Regression
When you press OK, the regression results will be shown on a new worksheet. The statistics useful for our purpose are highlighted.

This example shows R^2 = 0.941, along with the coefficients for the 3 independent variables and the intercept.
Regression output and model
• Residuals: Actual (or observed) values minus the predicted (estimated using the regression equation) values of the explained variable.
• Multiple R: The multiple correlation coefficient, which measures the correlation between the actual (observed) and the predicted (estimated using the regression equation) values of the explained variable. It lies between 0 and 1.
• R-Square: The square of the multiple correlation coefficient, also called the coefficient of determination. It measures the goodness of fit of the regression (i.e., the proportion of the variability in the explained variable that is explained by the regression). R-Square lies between 0 and 1, and the closer it is to 1, the higher the goodness of fit.
• Adjusted R-Square: The R-Square adjusted to take into account the number of explanatory variables in the regression (or the degrees of freedom). The Adjusted R-Square can be negative and is always less than or equal to R-Square.
• Standard Error: The standard error of the regression. It measures the average error (residual) per observation, corrected for the number of variables included in the regression.
• df: Degrees of freedom of the regression (equal to the number of explanatory variables), of the residuals, and of the total (equal to the number of observations minus one).
• SS: Sum of squares (squared deviations from the mean) of the regression, of the residuals, and of the total.
• MS: Mean square of the regression, of the residuals, and of the total. It is equal to the corresponding SS divided by the corresponding df.
• F: The F statistic, a measure of the overall significance of the regression. It is equal to the ratio of two mean squares: the mean square of the regression divided by the mean square of the residuals. For the regression to be significant, the F value must be greater than a critical value determined by the regression and residual degrees of freedom and the F distribution. (A short numerical sketch of how SS, df, MS, R-Square and F fit together is given after this list.)
• Significance F: The probability of obtaining such an F value if the regression were not significant, i.e., if all the slope coefficients were equal to zero. The closer it is to zero, the more significant the regression.
• Intercept: The estimated constant term in the regression equation. It is the predicted value of the explained variable when all explanatory variables are zero, and it can also be interpreted as capturing the mean effect on the explained variable of the variables excluded from the regression.
• Coefficients: The estimated coefficients of the explanatory variables in the regression equation.
• t-Stat: The value of the Student t statistic used to assess the significance of a coefficient. As a rule of thumb, for a coefficient to be significant (not equal to zero), the absolute value of its t-Stat must be about 2 or more.
• P-value: The probability of obtaining such a t-Stat if the corresponding intercept or coefficient were actually equal to zero; the smaller the P-value, the more significant the coefficient.
• Lower 95% and Upper 95%: The bounds of a 95% confidence interval for each coefficient, i.e., there is a 95% chance that the true coefficient falls between these limits. The interval for a coefficient should not cross 0; if it does, the coefficient could plausibly be 0 (not significant). If you change the confidence level in the Regression window, these values will change.
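
The following short Python/numpy sketch (not part of the original tutorial) shows how SS, df, MS, R-Square, Adjusted R-Square and F fit together, given observed values y, predicted values y_hat, and k explanatory variables. The function name and arguments are illustrative.

import numpy as np

def regression_stats(y, y_hat, k):
    # y: observed values of the explained variable; y_hat: values predicted
    # by the regression equation; k: number of explanatory variables.
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n = len(y)
    ss_total = np.sum((y - y.mean()) ** 2)                   # SS (Total)
    ss_resid = np.sum((y - y_hat) ** 2)                      # SS (Residual)
    ss_reg = ss_total - ss_resid                             # SS (Regression)
    df_reg, df_resid = k, n - k - 1                          # df (Regression), df (Residual)
    ms_reg = ss_reg / df_reg                                 # MS (Regression)
    ms_resid = ss_resid / df_resid                           # MS (Residual)
    r_square = ss_reg / ss_total                             # R-Square
    adj_r_square = 1 - (1 - r_square) * (n - 1) / df_resid   # Adjusted R-Square
    f_stat = ms_reg / ms_resid                               # F = MS(Regression) / MS(Residual)
    return r_square, adj_r_square, f_stat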
Regression output and model
From this summary, the coefficients of the regression are:
Constant = -3014.9
PCPI = 0.233
EMP = -0.075
YIELD = 6876.10

This turns the original equation into:

PAX/(2*POP) = -3014.9 + 0.233*PCPI - 0.075*EMP + 6876.10*YIELD
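
As an illustration only (the function name below is made up, and the tutorial does not state the units or scaling of the input variables), the fitted equation can be applied directly in Python to produce a forecast of PAX for given values of PCPI, EMP, YIELD and POP:

def forecast_pax(pcpi, emp, yield_, pop):
    # Fitted regression: PAX/(2*POP) = -3014.9 + 0.233*PCPI - 0.075*EMP + 6876.10*YIELD.
    # Input values must be scaled the same way as the data in the workbook.
    pax_per_2pop = -3014.9 + 0.233 * pcpi - 0.075 * emp + 6876.10 * yield_
    return 2 * pop * pax_per_2pop  # solve the left-hand side for PAX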

The R-Square and the Adjusted R-Square are both close to one, signaling a high goodness of fit.
The Significance F is very close to zero, implying that the regression is highly significant overall.
The absolute value of the t-Stat for every coefficient is well above 2, except for the intercept, implying that the coefficients of PCPI, EMP, and YIELD are highly significant.
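
Continuing the hypothetical Python sketch from earlier (same assumed file and column names), these significance measures can also be read directly from the fitted statsmodels result:

import pandas as pd
import statsmodels.api as sm

data = pd.read_csv("forecast_data.csv")    # hypothetical export, as in the earlier sketch
model = sm.OLS(data["PAX_2POP"], sm.add_constant(data[["PCPI", "EMP", "YIELD"]])).fit()

print(model.rsquared, model.rsquared_adj)  # R-Square and Adjusted R-Square
print(model.fvalue, model.f_pvalue)        # F and Significance F
print(model.tvalues)                       # t-Stat for the intercept and each coefficient
print(model.pvalues)                       # P-value for the intercept and each coefficient
print(model.conf_int(alpha=0.05))          # Lower 95% and Upper 95% for each coefficient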
