Maths 2nd

Uploaded by

V. Rakesh
Copyright
© All Rights Reserved

How to find a least squares regression line

Often the questions we ask require us to make accurate predictions on how one factor affects
an outcome. If a teacher is asked to work out how time spent writing an essay affects essay
grades, it’s easy to look at a graph of time spent writing essays and essay grades and say “Hey, people
who spend more time on their essays are getting better grades.” What is much harder (and
realistically, pretty impossible) to do by eye is to try and predict what score someone will get in an
essay based on how long they spent on it. Sure, there are other factors at play like how good the
student is at that particular class, but we’re going to ignore confounding factors like this for now and
work through a simple example.

Our teacher already knows there is a positive relationship between how much time was spent on an
essay and the grade the essay gets, but we’re going to need some data to demonstrate this
properly.

Least squares regression line example

Suppose we wanted to estimate a score for someone who had spent exactly 2.35 hours on an essay.
I’m sure most of us have experience in drawing lines of best fit, where we line up a ruler, think “this
seems about right”, and draw some lines from the X to the Y axis. In a room full of people, you’ll
notice that no two lines of best fit turn out exactly the same. What we need to answer this question is
the best best fit line.
Through the magic of least squares regression, and with a few simple equations, we can calculate a
predictive model that can let us estimate grades far more accurately than by sight alone. Regression
analysis is an extremely powerful analytical tool used within economics and science. There are a
number of popular statistical programs that can construct complicated regression models for a variety
of needs. A simpler model such as this requires nothing more than some data, and maybe a
calculator. It’s worth noting at this point that this method is intended for continuous data.

Least squares regression equations

The premise of a regression model is to examine the impact of one or more independent variables (in
this case time spent writing an essay) on a dependent variable of interest (in this case essay grades).
Linear regression analyses such as these are based on a simple equation:

Y = a + bX

Y – Essay Grade
a – Intercept
b – Coefficient
X – Time spent on Essay
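
As a rough illustration of how this equation behaves, here is a minimal Python sketch. The function name predict_grade and the coefficient values 40.0 and 5.0 are assumptions made up for the example; they are not taken from the essay data.

```python
def predict_grade(hours, intercept, slope):
    """Return the grade predicted by the line Y = a + bX."""
    return intercept + slope * hours


# Hypothetical coefficients, chosen only for illustration.
a, b = 40.0, 5.0

grade_at_zero = predict_grade(0, a, b)  # no time spent: the intercept a
grade_at_one = predict_grade(1, a, b)   # one extra hour adds the slope b

print(grade_at_zero, grade_at_one)
```

With zero hours the function returns the intercept, and each additional hour raises the prediction by the coefficient.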

There are a couple of key takeaways from the above equation. First of all, the intercept (a) is the essay
grade we expect to get when the time spent on essays is zero. You can imagine you can jot down a
few key bullet points while spending only a minute on an essay and still get a few points here and
there. Every essay will have at least this score according to our model. On top of that, every hour we
spend on our essays (X) leads to an increase of b in the grade the essay gets. We can work
out b through the following, slightly scary equation:

b = ∑(x - x̄)(y - ȳ) / ∑(x - x̄)^2

Here x̄ and ȳ are the means of the time spent and the essay grades.

But we’re getting ahead of ourselves. To calculate b, and make sense of that creepy equation, we’re
going to need to know the values for our data:

How do you calculate a least squares regression line by hand?


When calculating least squares regressions by hand, the first step is to find the means of the
dependent and independent variables. We do this because of an interesting quirk within linear
regression lines - the line will always cross the point where the two means intersect. We can think of
this as an anchor point, as we know that the regression line in our essay grade data will always
cross (4.72, 64.45), the mean time spent and the mean grade.

The second step is to calculate the difference between each value and the mean value for both the
dependent and the independent variable. In this case this means we subtract 64.45 from each essay
grade and 4.72 from each time data point. We then multiply each pair of differences together, and
also square each time difference.
You should notice that as some scores are lower than the mean score, we end up with negative
values. By squaring the time differences, we end up with a non-negative measure of deviation from the
mean regardless of whether the values are more or less than the mean.
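
The bookkeeping in this step can be sketched in Python. The hours and grades below are hypothetical values invented for the example (they are not the data behind the means 4.72 and 64.45), and the variable names are assumptions.

```python
from statistics import fmean

# Hypothetical data, invented for illustration only.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
grades = [52.0, 55.0, 61.0, 64.0, 68.0]

mean_hours = fmean(hours)
mean_grades = fmean(grades)

# One row per essay: (x - mean_x, y - mean_y, their product, (x - mean_x) squared)
rows = []
for x, y in zip(hours, grades):
    dx = x - mean_hours
    dy = y - mean_grades
    rows.append((dx, dy, dx * dy, dx ** 2))

for row in rows:
    print(row)

# Column totals used later to work out the slope b.
sum_products = sum(row[2] for row in rows)
sum_squares = sum(row[3] for row in rows)
print(sum_products, sum_squares)
```

Some of the differences come out negative, but every squared difference is zero or positive.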

Let's remind ourselves of the equation we need to calculate b.

b = ∑(x - x̄)(y - ȳ) / ∑(x - x̄)^2

The symbol sigma (∑) tells us we need to add all the relevant values together.

If we do this for the table above, we get the following results:

∑(x - x̄)(y - ȳ) = 611.36

And

∑(x - x̄)^2 = 94.18

Slotting in the information from the above table into a calculator allows us to calculate b, which is step
one of two to unlock the predictive power of our shiny new model:

b = 611.36 / 94.18 = 6.49
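
For anyone who prefers code to a calculator, the same division can be checked with a few lines of Python. The two totals are the ones quoted in the text; the variable names are assumptions.

```python
sum_products = 611.36  # sum of (x - mean_x) * (y - mean_y), quoted in the text
sum_squares = 94.18    # sum of (x - mean_x) ** 2, quoted in the text

# Slope of the least squares line.
b = sum_products / sum_squares
print(round(b, 2))  # prints 6.49
```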

The final step is to calculate the intercept, which we can do using the initial regression equation with
the values of essay grade and time spent set as their respective means, along with our newly calculated
coefficient.

64.45 = a + 6.49 * 4.72
We can then solve this for a:

64.45 = a + 30.63
a = 64.45 - 30.63
a = 33.82

Now we have all the information needed for our equation and are free to slot in values as we see fit. If
we wanted to know the predicted grade of someone who spends 2.35 hours on their essay, all we
need to do is swap that in for X.

y = 33.82 + 6.49 * X
y = 33.82 + (6.49 * 2.35)
y = 49.07
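
The intercept and prediction steps can also be sketched in Python. This sketch uses only the two means and the slope quoted in the text, and it recomputes the intercept from them rather than hard-coding it, so the printed figures follow directly from those three numbers. The function name predict_grade is an assumption.

```python
mean_hours = 4.72   # mean time spent, quoted in the text
mean_grade = 64.45  # mean essay grade, quoted in the text
b = 6.49            # slope, quoted in the text

# The line passes through the point of means, so a = mean_y - b * mean_x.
a = mean_grade - b * mean_hours


def predict_grade(hours):
    """Predicted essay grade for a given number of hours."""
    return a + b * hours


predicted = predict_grade(2.35)
print(round(a, 2), round(predicted, 2))
```

Plugging the mean time back into predict_grade returns the mean grade, which is a quick way to check the intercept.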

Drawing a least squares regression line by hand

If we wanted to draw a line of best fit, we could calculate the estimated grade for a series of time
values and then connect them with a ruler. As we mentioned before, this line should pass through the
point given by the mean time spent on the essay and the mean grade received.
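
To tie the steps together, here is a minimal end-to-end sketch in Python that fits the line and produces a few points to plot. The data are hypothetical values invented for the example, and fit_line is an assumed helper name, not part of any library.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line through the data."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sum_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_squares = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_products / sum_squares
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data, invented for illustration only.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
grades = [52.0, 55.0, 61.0, 64.0, 68.0]

a, b = fit_line(hours, grades)

# Two points are enough to rule a straight line; the middle one, taken at the
# mean time, should land on the mean grade and serves as a check.
line_points = [(x, a + b * x) for x in (min(hours), fmean(hours), max(hours))]
print(line_points)
```

Marking these points on the graph and joining them with a ruler gives the regression line.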
And there we have it! A perfect* predictive model that will make our teachers’ lives a lot easier.
