Simple Linear Regression
𝑦 = 𝑎 + 𝑏𝑥
𝑦 = 50 + 5𝑥
Equation of a linear relationship
To plot a straight line, we need to know two points that lie on that
line. We can find two points on a line by assigning any two values
to x and then calculating the corresponding values of y. For the
equation y = 50 + 5x:
These two points are then plotted. By joining these two points, we
obtain the line representing the equation y = 50 + 5x.
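The two-point method can be sketched in a few lines of Python. The x values 0 and 10 are chosen here purely for illustration and are not taken from the slides, and the helper name `line` is hypothetical. Any two distinct x values would give the same line.

```python
def line(x, a=50, b=5):
    """Return y = a + b*x; the defaults give the line y = 50 + 5x."""
    return a + b * x

# Two illustrative x values (an assumption; any two distinct values work).
x1, x2 = 0, 10
p1 = (x1, line(x1))
p2 = (x2, line(x2))

# Slope of the segment joining the two points; it should equal b.
slope = (p2[1] - p1[1]) / (p2[0] - p1[0])

print("points:", p1, p2)
print("slope of the joining segment:", slope)
```

Joining the two printed points gives the line, and the slope of that segment recovers the coefficient of x in the equation.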
Note that in the figure, the line intersects the y (vertical) axis at 50.
Consequently, 50 is called the y-intercept. The y-intercept is given by the
constant term in the equation. It is the value of y when x is zero.
y = A + Bx
Model (1)
Simple Linear Regression Analysis
In Model (1), A gives the value of y for x = 0, and B gives the change in y due to
a one-unit change in x.
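A small numeric check of this interpretation, as a sketch only: the values A = 50 and B = 5 are borrowed from the earlier line y = 50 + 5x for illustration, and the function name `model_1` is hypothetical.

```python
# Illustrative coefficients (an assumption, reused from y = 50 + 5x).
A, B = 50, 5

def model_1(x):
    """Deterministic linear model y = A + B*x."""
    return A + B * x

# A is the value of y at x = 0.
y_at_zero = model_1(0)

# B is the change in y when x increases by one unit, wherever we start.
change_from_7_to_8 = model_1(8) - model_1(7)
change_from_20_to_21 = model_1(21) - model_1(20)

print(y_at_zero, change_from_7_to_8, change_from_20_to_21)
```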
y = A + Bx + ε
Model (2)
The random error term ε is included in the model to represent the following
two phenomena:
ŷ = a + bx
Estimates of A and B
In the model ŷ = a + bx, a and b, which are
calculated using sample data, are called the
estimates of A and B, respectively.
Scatter Diagram
ε: the random error. It is the difference between the actual value of y and
the predicted value of y.
e: the symbol used for the random error when we work with sample data,
e = y − ŷ. e is an estimator of ε.
$\mathrm{SSE} = \sum e^2 = \sum (y - \hat{y})^2$
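To make the error sum of squares concrete, here is a minimal sketch that computes SSE for two candidate lines on a tiny data set. The four data points and both pairs of coefficients are hypothetical values made up for illustration, and the function name `sse` is not from the slides.

```python
# Hypothetical sample data (not from the slides).
x = [1, 2, 3, 4]
y = [3, 5, 6, 10]

def sse(a, b):
    """Sum of squared errors, SSE = sum((y - y_hat)^2), for y_hat = a + b*x."""
    y_hat = [a + b * xi for xi in x]             # predicted values
    e = [yi - yh for yi, yh in zip(y, y_hat)]    # errors e = y - y_hat
    return sum(ei ** 2 for ei in e)

# Two candidate lines (assumed coefficients, for comparison only).
sse_line_1 = sse(0.5, 2.2)
sse_line_2 = sse(0.0, 2.4)

print(sse_line_1, sse_line_2)
```

The line with the smaller SSE fits these points more closely. For this data set the first candidate has the smaller value.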
Least Squares Line
The least squares line for simple linear regression is the best-fitting
straight line: the one that minimizes the sum of the squared differences
between the observed values of y and the corresponding values predicted by
the regression line, that is, the SSE.
To solve for b (the slope), we use the formula
$b = \dfrac{SS_{xy}}{SS_{xx}}$
where
$SS_{xy} = \sum xy - \dfrac{(\sum x)(\sum y)}{n}$ and $SS_{xx} = \sum x^2 - \dfrac{(\sum x)^2}{n}$
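The slope formula can be sketched in plain Python as follows. The data are hypothetical and are not the household data from Table 13.1. The intercept is computed with the standard formula a = ȳ − b·x̄, which does not appear in this section and is included here as an assumption about how the line is completed. The function name `least_squares` is illustrative.

```python
def least_squares(x, y):
    """Return (a, b) for the least squares line y_hat = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)

    # SSxy = sum(xy) - (sum x)(sum y) / n
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    # SSxx = sum(x^2) - (sum x)^2 / n
    ss_xx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n

    b = ss_xy / ss_xx                  # slope
    a = sum_y / n - b * (sum_x / n)    # intercept: y_bar - b * x_bar (standard formula, assumed here)
    return a, b

# Hypothetical sample data (not from the slides).
x = [1, 2, 3, 4]
y = [3, 5, 6, 10]

a, b = least_squares(x, y)
print(f"y_hat = {a:.2f} + {b:.2f}x")
```

Working from the sums directly mirrors the hand calculation, so each intermediate quantity can be checked against a worksheet.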
We can also state that, on average, all households with a monthly income of
$6100 spend about $1690.75 per month on food.
In our data on seven households,
there is one household whose
income is $6100. The actual food
expenditure for that household is
$1600 (see Table 13.1). The
difference between the actual and
predicted values gives the error of
prediction. Thus, the error of
prediction for this household, which is
shown in Figure 13.7, is
e = y − ŷ = 16 − 16.9075 = −0.9075 (in hundreds of dollars), that is, −$90.75
Therefore, the error of prediction is -$90.75.
The negative error indicates that the predicted
value of y is greater than the actual value of y.
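The arithmetic for this household can be checked with a short sketch. It assumes, as the figures 16 and 16.9075 suggest, that income and food expenditure are recorded in hundreds of dollars. The variable names are illustrative.

```python
# Values for the household with income $6100, in hundreds of dollars
# (the unit is inferred from the numbers used in the calculation).
y_actual = 16.0         # actual food expenditure ($1600)
y_predicted = 16.9075   # predicted food expenditure ($1690.75)

e = y_actual - y_predicted   # error of prediction, in hundreds of dollars
e_dollars = e * 100          # the same error expressed in dollars

# A negative error means the prediction is above the actual value.
overpredicted = e < 0

print(round(e, 4), round(e_dollars, 2), overpredicted)
```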