Stat Q4M3

Uploaded by

ishamreyalmadin
Copyright
© All Rights Reserved
STATISTICS & PROBABILITY

REGRESSION ANALYSIS
Regression analysis is the statistical method used to determine
the structure of a relationship between two variables (simple linear
regression) or three or more variables (multiple regression).

According to the Harvard Business School Online course
Business Analytics, regression is used for two primary purposes:

1. To study the magnitude and structure of the relationship between
variables
2. To forecast a variable based on its relationship with another variable

Both of these insights can inform strategic business decisions.


Lesson 1. Calculation & Interpretation of the Slope and Y-intercept of a Regression Line

The regression line is also called the line of best fit. It lets
us interpret trends in the data and make predictions based on
that data; prediction is discussed further in the next lesson.
Before doing a regression analysis, check that the following
assumptions hold:
a. A relationship exists between the variables; and
b. The relationship has been tested and found to be significant.
Both conditions must be met first; otherwise, a regression analysis
would be pointless.
A scatterplot is one way of illustrating a line of best fit. The figure
below shows a scatterplot of data on two variables. Notice that
several lines can be drawn on the graph near the points, but only one
of them is the line of best fit. Best fit means that the sum of the
squares of the vertical distances from each point to the line is at
a minimum.
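The idea of "best fit" can be checked numerically. The sketch below is only an illustration: the five data points and the three candidate lines are made up for this example and do not come from the lesson. It computes the sum of squared vertical distances for each candidate line and picks the smallest.

```python
def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances from each point to the line y' = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Three candidate lines, each given as (y-intercept a, slope b).
candidates = {
    "line A": (2.2, 0.6),
    "line B": (1.0, 1.0),
    "line C": (3.0, 0.3),
}

sse = {name: sum_squared_errors(xs, ys, a, b) for name, (a, b) in candidates.items()}
best = min(sse, key=sse.get)

for name, value in sse.items():
    print(f"{name}: sum of squared distances = {value:.2f}")
print("Smallest sum of squares:", best)
```

Among these three candidates, the line with the smallest sum is the closest to the line of best fit; the true regression line minimizes this sum over all possible lines.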
The Equation of a Regression Line
Recall from algebra that the equation of a line is given by
y = mx + b, where m stands for the slope and b for the y-intercept.
Similarly, the equation of a regression line is given by y' = a + bx,
where b is the slope and a is the y-intercept. Note that b plays a
different role in the two equations.
The corresponding formulas for the y-intercept a and the slope b
are as follows:

a = [(Σy)(Σx²) − (Σx)(Σxy)] / [n(Σx²) − (Σx)²]
b = [n(Σxy) − (Σx)(Σy)] / [n(Σx²) − (Σx)²]

where n is the number of data pairs


The rounding rule for both a and b is up to three decimal places.
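As a rough sketch, the standard least-squares formulas for a and b can be computed step by step as shown below. The function name regression_line and the sample data are assumptions made for this illustration; they are not the data from the lesson's exercises.

```python
def regression_line(xs, ys):
    """Return (a, b) for the line y' = a + b*x, rounded to three decimal places.

    Uses the summation formulas:
      a = [(Σy)(Σx²) − (Σx)(Σxy)] / [n(Σx²) − (Σx)²]
      b = [n(Σxy) − (Σx)(Σy)]     / [n(Σx²) − (Σx)²]
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("All x values are equal; the slope is undefined.")

    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return round(a, 3), round(b, 3)

# Hypothetical data, invented for illustration only.
a, b = regression_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"y' = {a} + {b}x")
```

Both formulas share the same denominator, so it is computed once and reused.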
Answer this in your notebook.
Given the data below, find the equation of the regression line and
provide an interpretation of the results.
Solution:

Hence, the equation of the regression line y' = a + bx is
y' = 79.078 + 0.945x, where the slope is 0.945 and the y-intercept is
79.078. The y-intercept is the value of y' when x = 0. That is, it is
the value at the point where the line intersects the y-axis.
Interpretation

Marginal change is the magnitude of the change in one variable when
the other variable changes by exactly one unit. In the problem, the
value of the slope b, which is 0.945, is the marginal change. This
means that for every one-hour increase in x, the number of study
hours, the value of y, the grade, increases by 0.945 units on
average. Similarly, the value of the y-intercept a is 79.078. This
means that the predicted grade of a student with zero hours of
study is 79.078.
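The interpretation can be verified with a short calculation. The sketch below uses the equation y' = 79.078 + 0.945x from the example; the function name predicted_grade and the choice of 5 and 6 study hours are assumptions made for illustration.

```python
def predicted_grade(hours):
    """Predicted grade from the example's regression line y' = 79.078 + 0.945x."""
    return 79.078 + 0.945 * hours

# Marginal change: the difference in predicted grade for one extra hour of study.
marginal_change = predicted_grade(6) - predicted_grade(5)

# Y-intercept: the predicted grade at zero hours of study.
grade_at_zero = predicted_grade(0)

print(f"Change per extra study hour: {marginal_change:.3f}")
print(f"Predicted grade at 0 hours: {grade_at_zero:.3f}")
```

The difference between any two predictions one hour apart equals the slope, which is why the slope is read as the marginal change.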
Answer this in your notebook.
Lesson 2. Solving Problems Involving Regression Analysis
Below are sample data on the top-achieving students of a school,
giving their number of study hours (x) and their score in the math final
exam (y). Find the equation of the regression line and predict the value of
the dependent variable when the value of the independent variable is 14.

Before we proceed with the computation, remember that in a regression
analysis the data must be correlated and the correlation must be
significant. For the sake of this discussion, let us assume that these
requirements have been met.
Solution: We need to solve for the values of the y-intercept a and
the slope b.
Hence, the equation of the regression line y’= a + bx is
y'=75.667+1.583x where the slope is 1.583 and the y-intercept is
75.667.
Interpretation
In the regression line equation, the slope b is 1.583, which means
that for every one-hour increase in x, the number of study hours,
the value of y, the score, increases by 1.583 units on average.
Similarly, the value of the y-intercept a is 75.667. This means that
the predicted score of a student with zero hours of study is 75.667.
Now, since our main objective is to predict the value of y when the
value of x is 14, we will now use our newfound equation. We will
replace x with 14.

y'=75.667+1.583x
y'=75.667+1.583(14)
y'=75.667+22.162
y'=97.829

Hence, if a student studies for 14 hours, his/her expected score in
the math exam would be 97.829.
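The substitution above can be reproduced in a few lines. This is a minimal sketch using the equation y' = 75.667 + 1.583x from the solution; the function name predict_score is an assumption made for illustration.

```python
def predict_score(hours, a=75.667, b=1.583):
    """Predicted exam score from y' = a + b*x, rounded to three decimal places."""
    return round(a + b * hours, 3)

score_at_14 = predict_score(14)
print(f"Predicted score for 14 study hours: {score_at_14}")
```

The same function can be reused for other values of x by changing the argument, or for another regression line by passing different values of a and b.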
Answer this in your notebook.
The data below show the ages of students (x) in a certain school and the
corresponding number of them who have smartphones (y). Find the equation of
the regression line and predict the number of students with smartphones at
age 20. Assume that the variables are correlated and that the correlation is
significant.
Quiz next meeting
