
Machine Learning Models:

Linear Regression
Week 6 - Lecture 12
COSC 202 Data Science and AI

Menatalla Abououf
Fall 2024
Outline – Week 6
• Introduction to Machine Learning
1. What is Machine Learning?
2. Why Machine Learning?
3. What is a Model?
• Types of Machine Learning
• Types of supervised machine learning and its framework
• Linear Regression
• Understanding model parameters
• Train-test split
• Evaluating regression models
Recall - Types of Machine Learning
In this course, we will focus on Supervised and Unsupervised learning. We will
discuss different concepts using the following models:

• Supervised
  • Regression: Linear Regression
  • Classification & regression: Decision Tree, Random Forest
• Unsupervised
  • Clustering: K-means
Supervised Learning: Regression
• Goal: to predict a quantitative value, i.e. a continuous value, based on
input data.

• Examples of Algorithms:
• Linear Regression
• Decision Trees
• Random Forest

Supervised Learning: Linear Regression
• Linear regression is a linear model: it predicts numerical values.

• E.g. a model that assumes a linear relationship between an input (x) and an
output (y):

ŷ(x) = β₀ + β₁x

• For a linear relationship, there should be a correlation between x and y
(either +ve or -ve).
• The model's parameters are called the coefficient and the bias: in the
familiar form y = mx + b, the slope m is the coefficient and the
y-intercept b is the bias.

from sklearn.linear_model import LinearRegression

# Initialize the Linear Regression model
model = LinearRegression()
# Train the model using the training data
model.fit(X_train, y_train)
Supervised Learning: Linear Regression
Example: Imagine we are
predicting distance travelled (y)
from speed (x). Our linear
regression model would be:

Distance = m(speed) + c

Where:
m: Coefficient
c: y-intercept
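To make this concrete, here is a minimal scikit-learn sketch fitting the
speed/distance model; the speed and distance values below are invented for
illustration, not data from the lecture:

import numpy as np
from sklearn.linear_model import LinearRegression

# Invented example data: speed (km/h) and distance travelled (km)
speed = np.array([[10], [20], [30], [40], [50]])  # the feature must be 2-D
distance = np.array([15, 30, 45, 60, 75])

model = LinearRegression()
model.fit(speed, distance)

print(model.coef_[0])    # m: the coefficient (slope)
print(model.intercept_)  # c: the y-intercept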
Supervised Learning: Linear Regression
• As the speed increases, distance
also increases, hence the variables
have a positive relationship
(+ve correlation).

• What will be the equation if the
distance is constant?
(Speed and time are then the changing variables.)
Supervised Learning: Linear Regression

As the speed increases, time decreases, hence the variables
have a negative relationship (-ve correlation).
Supervised Learning: Linear Regression
• Now suppose we have this dataset. Try to plot it.

• The error is the difference between the predicted value (ŷ) and the actual
value of y in the table; use absolute values when calculating errors.
• The model keeps trying candidate equations (y = x, y = x + 1, etc.) and
totals the absolute errors for each: one candidate line gives a total of 8,
another a total of 5. Whichever line gets the least total error will be the
best fit. A small sketch after this slide makes the idea concrete.
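A tiny sketch of the candidate-line idea, using an invented set of points
(the lecture's actual table isn't reproduced here):

import numpy as np

# Invented example points standing in for the lecture's table
x = np.array([1, 2, 3, 4])
y = np.array([3, 3, 5, 7])

# Two candidate lines: y = x and y = x + 1
for label, y_pred in (("y = x", x), ("y = x + 1", x + 1)):
    total_error = np.sum(np.abs(y - y_pred))
    print(f"{label}: total absolute error = {total_error}")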


Supervised Learning: Linear Regression
• Now calculate the mean:

Supervised Learning: Linear Regression
• Find the best fit using the Least Square method: an optimization method for
simple linear regression (1 input & 1 output) that computes the best-fit
line directly, instead of trying candidate lines multiple times to get the
least error as on the previous slides.
• The least-squares line uses the means of x and y and yields a smaller total
error than the candidate lines tried earlier, so it is the best fit. A
worked sketch of the method follows below.
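The slide leaves the formulas implicit; for simple linear regression the
least-squares solution has the closed form
β₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² and β₀ = ȳ − β₁x̄.
A minimal numpy sketch with invented data:

import numpy as np

# Invented example data
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 6])

x_mean, y_mean = x.mean(), y.mean()

# Closed-form least squares for simple linear regression
beta1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
beta0 = y_mean - beta1 * x_mean

print(f"best fit: y = {beta1:.2f}x + {beta0:.2f}")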
Supervised Learning: Linear Regression
• Let's try to predict the values of y when x = {1, 2, 3, 4, 5, 6}.

• We make sure the best fit line has the least error between the data points
and the regression line: the predicted values ŷ come from the regression
equation ŷ(x), while the actual values y come from the table; the actual
values are basically the labels (from testing).
• We have lots of ways to minimize this error: Sum of Squared errors, Sum of
Absolute errors.
Train-Test Split
• Recall: our goal is to build a model that will generalize well on unseen data.
• We cannot guarantee this if we train on our entire dataset → split up the
dataset into training data and testing data.
• Training data: A portion of the dataset used to train the model, where the
model learns patterns from the data and adjusts its parameters.
• Testing data: A portion of the dataset used to evaluate the model's
performance after training. The model does not see this data during training,
and it's used to assess how well the model generalizes to unseen data. The
labels are used for evaluation (to check errors), but they are hidden from
the model during prediction.
Train-Test Split
• The dataset consists of features (X1, X2, X3) and a label (y).
• Training data (e.g. 80%): X_train and y_train. The split could also be
70%/30%.
• Testing data (e.g. 20%): X_test and y_test. We compare the values predicted
from X_test (yp) against the y_test values.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

• test_size=0.2 means 20% of the data is used for testing; sklearn
automatically knows that the remaining 80% is for training.
• random_state determines which shuffled combination of the data is chosen.
In most cases, we shuffle the data before choosing the training and testing
sets, unless order is important. For example, if we're predicting weather,
the time of the data matters; with shuffled data, the model will think it
doesn't matter whether it's day or night when trained, which is not what we
want.
• Each shuffled set yields a different fitted model: one shuffle might give
y = x while another gives y = 2x. That's why we fix random_state, so we
stay consistent with what is used for training/testing, as the sketch below
demonstrates.
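A small sketch showing that different random_state values select different
train/test splits and therefore slightly different fitted models; the data
is synthetic, invented for illustration:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic example data: y is roughly 2x plus noise
rng = np.random.default_rng(0)
X = np.arange(100, dtype=float).reshape(-1, 1)
y = 2 * X.ravel() + rng.normal(0, 5, size=100)

# Two different shuffles -> two slightly different fitted models
for seed in (1, 2):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=seed)
    model = LinearRegression().fit(X_tr, y_tr)
    print(f"random_state={seed}: slope={model.coef_[0]:.3f}, "
          f"intercept={model.intercept_:.3f}")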
Model Evaluation for Regression Models
• The loss function is a way to measure how well a model's predictions align
with the actual data, by comparing the values predicted from X_test against
y_test.

• We will focus on the following evaluation metrics (loss functions):
• Mean absolute error
• Mean squared error
• Root mean squared error

from sklearn.metrics import mean_squared_error, mean_absolute_error

# Predict target values for the test data
y_pred = model.predict(X_test)
# Evaluate the model's performance
mae = mean_absolute_error(y_test, y_pred)  # Mean Absolute Error
mse = mean_squared_error(y_test, y_pred)  # Mean Squared Error
Mean Absolute Error (MAE)
• Mean Absolute Error (MAE): the magnitude of difference between the
prediction of an observation and the true value of that observation,
averaged out across the dataset.
• Also known as L1 Loss.

MAE = (1/n) Σᵢ |yᵢ − ŷᵢ|

where yᵢ is the actual value and ŷᵢ is the predicted value (from testing).
Example
Imagine you are a data scientist working for a startup that sells handmade crafts
online.
The company recently implemented a new pricing algorithm to predict the price
of crafts based on various features like size, material, and complexity.
You want to evaluate the performance of this new algorithm. Here's a set of actual
prices of crafts and the prices predicted by the algorithm:

Actual price ($):    25  15  20  30  40
Predicted price ($): 28  14  22  29  38
Example Solution
Solution:
Here, n = 5 (5 items)
MAE = 1/5 * (|25-28| + |15-14| + |20-22| + |30-29| + |40-38|)
    = 1/5 * (3 + 1 + 2 + 1 + 2) = 1/5 * 9 = 1.8

• From the above result, on average the predicted prices are off by $1.8 from
the actual prices.
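The hand calculation can be verified with scikit-learn; a minimal sketch
using the prices from the example:

from sklearn.metrics import mean_absolute_error

actual = [25, 15, 20, 30, 40]
predicted = [28, 14, 22, 29, 38]

# Matches the hand calculation: (3 + 1 + 2 + 1 + 2) / 5 = 1.8
print(mean_absolute_error(actual, predicted))  # 1.8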
MAE Advantages and Limitations
Advantages:
• MAE treats all errors equally, minimizing the impact of outliers on the loss
function (big errors are treated like small errors).
• It has the same unit as the target variable, making it easy to interpret:
an MAE of 2 tells us that, on average, our predictions are off by two units
from the actual values.
Limitations:
• May not fully reflect the impact of large errors because it doesn't
emphasize them. For example, a model with a few significant prediction
errors can still have a low MAE.
Mean Squared Error (MSE) & Root Mean Squared Error (RMSE)
• The Mean Square Error (MSE) is measured as the difference between the
model's predictions and ground truth, squared and averaged out across the
dataset. Because the errors are squared, bigger errors have a higher impact.
• The root mean squared error (RMSE) is the square root of the MSE.
• Also known as L2 loss.

MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²

RMSE = √( (1/n) Σᵢ (yᵢ − ŷᵢ)² )

where yᵢ is the actual value and ŷᵢ is the predicted value. Square-rooting
brings the metric back to the same unit as the target variable (instead of a
squared unit), making it easier to compare.
Example
The table shows the actual monthly temperature
in Fahrenheit and the predicted temperature.
Calculate the mean squared error and the root
mean squared error.

Example
Solution
1. Calculate the error and the squared error
between the actual and the predicted
values (the squared errors below come from the table).
2. Calculate the average of the squared errors:

MSE = (16 + 9 + 4 + 25 + 9 + 4 + 1 + 0 + 16 + 9 + 9 + 4) / 12

MSE = 8.83

RMSE = √8.83 ≈ 2.97
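A minimal numpy sketch reproducing the calculation from the squared errors
in the table (with the raw actual and predicted temperatures, sklearn's
mean_squared_error would give the same MSE):

import numpy as np

# Squared errors from the worked example
squared_errors = np.array([16, 9, 4, 25, 9, 4, 1, 0, 16, 9, 9, 4])

mse = squared_errors.mean()
rmse = np.sqrt(mse)

print(f"MSE  = {mse:.2f}")   # 8.83
print(f"RMSE = {rmse:.2f}")  # 2.97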
MSE Advantages and Limitations
Advantages:
• It offers faster convergence in scenarios where the error values are
relatively small and consistent because it amplifies small errors
(squared).
• It penalizes large errors more than small ones, reflecting the true
accuracy of the model better.
Limitations:
• It is sensitive to outliers, meaning that extreme errors can skew it.
• It has a different unit than the target variable, making it harder to
compare.
• Note: RMSE takes the square root of MSE, which means it brings the error
metric back to the same unit as the target variable.

MAE vs MSE: Which One to Choose?

Short answer: It depends on your goal and the chosen model.

Scenario                            Preferred loss function
Simpler and more intuitive model?   MAE (has the same unit as the target)
Robust to outliers?                 MAE (treats all errors equally, so large
                                    errors/outliers do not affect it heavily)
Care more about small errors?       MAE (treats all errors equally)
Care more about large errors?       MSE (penalizes larger errors more)
Data is relatively clean?           MSE (converges faster when there are no
                                    outliers and values are within an
                                    acceptable range)
Polynomial Regression
• Polynomial Regression is an extension of Linear Regression, allowing us to
model more complex (non-linear) relationships by introducing polynomial
features.
• The model equation becomes:
ŷ = β₀ + β₁x + β₂x² + … + βₙxⁿ, where n is the degree of the polynomial.
• This can be used for 1 feature having a non-linear relationship (all terms
are the same x, raised to different powers), and it can be used for more
than 1 feature too; see the sketch below.
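A minimal scikit-learn sketch of polynomial regression on invented
one-feature data: PolynomialFeatures generates the x and x² terms, and an
ordinary LinearRegression then fits their coefficients:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Invented example data with a non-linear (quadratic) relationship
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 6, 12, 20, 30])  # exactly y = x^2 + x

# Expand the single feature into [x, x^2]
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)

# Fit a linear regression on the polynomial features
model = LinearRegression()
model.fit(X_poly, y)

print(model.coef_, model.intercept_)  # ~[1, 1] and ~0 for this data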


Multiple Linear Regression

Can linear regression model the relationship between multiple variables?

Multiple linear regression is used to estimate the relationship between two or
more independent variables (features x₁, x₂, …, xₙ) and one dependent variable.

ŷ = β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ + ε

ŷ: the label
β₀: the y-intercept (the value of ŷ when all other parameters are 0)
β₁: the regression coefficient of the first independent variable (feature) x₁
ε: the error term (captures the variation in output that cannot be explained by the linear
combination of the independent variables)
Multiple Linear Regression
Can linear regression model the relationship between multiple variables?
Multiple linear regression is used to estimate the relationship between two or
more independent variables and one dependent variable.

ŷ = β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ + ε   (multiple features x₁, x₂, …, xₙ)

Example (predicting the label from two features):

Math Score = 17.62 + 1.68 × Study Time + 0.26 × IQ Score

• A bigger coefficient means a more valuable feature; a feature with a very
small coefficient is not valuable and may be better to remove.
• Another example: predicting distance d (with time constant) from features
such as speed, wind, and traffic: d = β₀ + β₁(speed) + β₂(wind) + β₃(traffic).
Speed is the most important feature for predicting d, so it will have a very
large coefficient, making the other features comparatively unimportant for
predicting the label. A sketch of inspecting fitted coefficients follows
below.

https://medium.com/@joshibhagyesh29/understanding-simple-linear-regression-vs-multiple-linear-regression-a-guide-with-examples-c3e8945e4830
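A sketch of fitting a multiple linear regression and reading off the
coefficients, using an invented two-feature dataset in the spirit of the
study-time/IQ example (the numbers are illustrative, not the lecture's data):

import numpy as np
from sklearn.linear_model import LinearRegression

# Invented example data: columns are [study time (hours), IQ score]
X = np.array([
    [2, 100],
    [4, 110],
    [6, 105],
    [8, 120],
    [10, 115],
])
y = np.array([45, 60, 65, 85, 90])  # math scores (illustrative)

model = LinearRegression()
model.fit(X, y)

# One coefficient per feature; a larger magnitude means more influence
for name, coef in zip(["Study Time", "IQ Score"], model.coef_):
    print(f"{name}: {coef:.2f}")
print(f"Intercept: {model.intercept_:.2f}")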


Lessons Learned
• The machine learning framework we learned applies to a variety of supervised machine
learning algorithms. This includes:
• Understanding what a model is
• Understanding model parameters
• Train-test split
• Evaluating regression models
• Next, we will be introduced to a different ML model. The concepts we will learn will also
apply to most ML models, including linear regression, and cover:
• Understanding hyperparameters
• Overfitting and underfitting
• Regularization
• Cross-validation
• Evaluating classification models

Life Cycle of Data Science – Modeling &
Validation
1. Selecting Appropriate Machine Learning Model
2. Splitting Data (split into training & testing)
3. Model Training
4. Model Validation
5. Model Evaluation
6. Final Model Selection

Recommended Reading
• Artificial Intelligence with Python, by Alberto Artasanchez and Prateek Joshi.
Publisher: Packt Publishing Ltd, 2nd Edition, 2020. ISBN-10: 183921953X.
ISBN-13: 978-1839219535. - Pages 117-122

• Data Science from Scratch: First Principles with Python, by Joel Grus.
Publisher: O'Reilly Media; 2nd Edition, 2019. ISBN-10: 1492041130. ISBN-13:
978-1492041139 – Pages 173 - 175

• Try it yourself:
• https://www.kaggle.com/code/ybifoundation/simple-linear-regression/notebook

