09_Regression


Regression

Rubén Sánchez Corcuera
[email protected]

Objectives

■ We will see how to apply regression using scikit-learn
■ We will review how to evaluate regression models

What is a regression?

■ A regression is a statistical technique that relates a dependent variable to one or more independent (explanatory) variables.
■ A regression model is able to show whether changes observed in the dependent variable are associated with changes in one or more of the explanatory variables.
■ It does this by essentially fitting a best-fit line and seeing how the data is dispersed around this line.
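As a minimal sketch of the "best-fit line" idea, the snippet below fits a line with scikit-learn's `LinearRegression` and looks at the residuals, i.e. how the points are dispersed around that line. The synthetic data (true slope 3, intercept 2, Gaussian noise) is an assumption made purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (assumed for illustration): y = 3x + 2 plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))                    # one explanatory variable
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1.0, size=100)   # dependent variable

# Fit the best-fit line by ordinary least squares.
model = LinearRegression().fit(X, y)
slope = model.coef_[0]
intercept = model.intercept_

# Residuals describe how the data is dispersed around the fitted line.
residuals = y - model.predict(X)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, residual std={residuals.std():.2f}")
```

The fitted slope and intercept should land close to the values used to generate the data, and the residual spread should be close to the noise level.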

Regression in sklearn (our fav library)

■ There are multiple methods for regression supported in sklearn:
● Linear regression
● Logistic regression (despite its name, this is a classification model)
● Generalized linear regression
● Quantile regression
● Polynomial regression
● Nearest Neighbour regression
● Support Vector Regression
    ■ LinearSVR
    ■ SVR
    ■ NuSVR
● SGD Regression
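These estimators share the same `fit` / `predict` / `score` interface, so swapping one method for another usually takes a one-line change. The sketch below tries a few of them on a synthetic noisy sine curve; the data, the hyperparameters, and the choice of estimators are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

# Synthetic non-linear data (assumed for illustration): a noisy sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, size=200)

# A few of the regression methods listed above, all with the same interface.
estimators = {
    "Linear regression": LinearRegression(),
    "Nearest Neighbour regression": KNeighborsRegressor(n_neighbors=5),
    "Support Vector Regression": SVR(),
    "SGD regression": SGDRegressor(random_state=0),
}

scores = {}
for name, estimator in estimators.items():
    estimator.fit(X, y)
    # For regressors, score() returns R-squared (here on the training data).
    scores[name] = estimator.score(X, y)
    print(f"{name}: R2 = {scores[name]:.3f}")
```

Scores on the training data are optimistic; in practice the comparison should be made on held-out data.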


Regression in sklearn (our fav library)

■ There are multiple methods for regression supported in sklearn:
● Gaussian Process Regression
● Decision Trees Regression

Robust regression in sklearn

■ Robust regression aims to fit a regression model in the presence of corrupt data: either outliers, or error in the model.
■ Scikit-learn provides 3 robust regression estimators: RANSAC, Theil Sen and HuberRegressor.
● HuberRegressor should be faster than RANSAC and Theil Sen unless the number of samples is very large, i.e. n_samples >> n_features.
● HuberRegressor should be more robust than RANSAC and Theil Sen on default parameters.
● RANSAC is faster than Theil Sen and scales much better with the number of samples.
● RANSAC will deal better with large outliers in the y direction (most common situation).
● Theil Sen will cope better with medium-size outliers in the X direction, but this property will disappear in high-dimensional settings.
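A rough sketch of the three robust estimators next to ordinary least squares, on synthetic data with large outliers in the y direction. The data (true slope 2, intercept 1), the outlier placement (the 20 samples with the largest x get +50 added to y, so that the outliers tilt the least-squares line) and the default hyperparameters are all assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import (
    HuberRegressor,
    LinearRegression,
    RANSACRegressor,
    TheilSenRegressor,
)

# Synthetic data (assumed for illustration): y = 2x + 1 plus small noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(0, 0.5, size=200)

# Corrupt 10% of the targets with large outliers in the y direction.
outlier_idx = np.argsort(X[:, 0])[-20:]
y[outlier_idx] += 50.0

models = {
    "OLS": LinearRegression(),
    "Huber": HuberRegressor(),
    "RANSAC": RANSACRegressor(random_state=0),
    "Theil-Sen": TheilSenRegressor(random_state=0),
}

slopes = {}
for name, model in models.items():
    model.fit(X, y)
    # RANSAC wraps a base estimator; the final fitted model is in estimator_.
    fitted = model.estimator_ if name == "RANSAC" else model
    slopes[name] = fitted.coef_[0]
    print(f"{name}: slope = {slopes[name]:.2f} (true slope 2.00)")
```

How each estimator ranks depends on the kind and share of outliers, so this single run should not be read as a general benchmark.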
Evaluating Regression Models

■ Why can’t we use accuracy to evaluate our regression models?
● We have a continuous target variable.
● If we evaluate accuracy for each one of the data points we will obtain awful results, because a prediction almost never matches a continuous value exactly.
■ We need other types of metrics to properly evaluate our models.
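To make the problem with accuracy concrete, the sketch below computes an "exact match" accuracy by hand for a regression model that actually fits the data well. The synthetic data and the hand-rolled accuracy definition are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (assumed for illustration): a nearly linear relationship.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.5, size=200)

y_pred = LinearRegression().fit(X, y).predict(X)

# "Accuracy" in the classification sense: fraction of exactly correct predictions.
exact_match_accuracy = np.mean(y_pred == y)

# The typical size of the errors tells a very different story.
mean_abs_error = np.mean(np.abs(y - y_pred))

print(f"exact-match accuracy: {exact_match_accuracy:.3f}")
print(f"mean absolute error:  {mean_abs_error:.3f}")
```

The model is off by a small amount on almost every point, which accuracy counts as a complete miss; error-based metrics measure how far off it is instead.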


Mean Absolute Error (MAE)

■ One of the most used metrics
■ We try to calculate the difference between the predicted values and the actual ones.
■ If $\hat{y}_i$ is the predicted value and $y_i$ the expected one, the error would be
    $e_i = |y_i - \hat{y}_i|$
■ As it would not be useful to present it as the total error, we calculate the mean:
    $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$

Mean Absolute Percentage Error (MAPE)

■ When the target variable has a single dimension, some users tend to normalize it, whereas others don't.
■ The value of MAE will vary between normalized and non-normalized approaches.
■ Defining the error as a percentage variation from the actual values solves these situations:
    $\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n} \left|\frac{y_i - \hat{y}_i}{y_i}\right|$
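A small worked example of MAE and MAPE, computed by hand with NumPy and checked against `sklearn.metrics`. The toy values and the rescaling factor of 1000 are assumptions for illustration. Note that scikit-learn reports MAPE as a fraction rather than a percentage, so multiply by 100 if a percentage is wanted.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error

# Toy values (assumed for illustration).
y_true = np.array([100.0, 200.0, 400.0, 500.0])
y_pred = np.array([110.0, 180.0, 400.0, 450.0])

# Per-sample absolute errors, then their mean (MAE).
abs_errors = np.abs(y_true - y_pred)
mae = abs_errors.mean()

# Errors relative to the actual values, then their mean (MAPE as a fraction).
mape = np.mean(abs_errors / np.abs(y_true))

# Rescaling the target changes MAE but leaves MAPE untouched.
scale = 1000.0
mae_scaled = mean_absolute_error(y_true / scale, y_pred / scale)
mape_scaled = mean_absolute_percentage_error(y_true / scale, y_pred / scale)

print(f"MAE:  original={mae:.3f}, rescaled={mae_scaled:.3f}")
print(f"MAPE: original={mape:.3f}, rescaled={mape_scaled:.3f}")
```

MAPE is undefined when an actual value is zero and becomes unstable when actual values are close to zero, so it suits targets that stay well away from zero.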

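Root Mean Squared Error (RMSE)

■ RMSE is another widely used metric for regression models.
■ It is similar to the Mean Squared Error (MSE), but the result is square-rooted:
    $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$

R-squared (R2)

■ R-squared measures to what extent the variance of the dependent variable is explained by the explanatory variables:
    $R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$
● It is also known as the Coefficient of Determination.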
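A short sketch of RMSE and R-squared on toy values (assumed for illustration). RMSE is taken as the square root of `mean_squared_error`, and R-squared is computed from the residual and total sums of squares, then compared with `r2_score`.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Toy values (assumed for illustration).
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 9.0])

# RMSE: square root of the mean squared error, in the units of the target.
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)

# R-squared: 1 - (residual sum of squares / total sum of squares).
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"MSE={mse:.3f}, RMSE={rmse:.3f}, R2={r2:.3f}")
```

Because the errors are squared before averaging, RMSE penalizes large errors more heavily than MAE does.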


Adjusted R-squared (Adjusted R2)

■ An overfitted model can have a high R-squared. The adjusted R-squared measure helps with this problem by penalizing the number of explanatory variables:
    $R^2_{adj} = 1 - (1 - R^2)\frac{n - 1}{n - p - 1}$
    where $n$ is the number of samples and $p$ the number of explanatory variables.
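The sketch below illustrates why adjusted R-squared helps: adding pure-noise features never lowers the training R-squared of a least-squares fit, while the adjusted value pays a penalty for each extra feature. `adjusted_r2` is a hypothetical helper written for this example (scikit-learn's `score` only returns plain R-squared), and the synthetic data is an assumption for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def adjusted_r2(r2, n_samples, n_features):
    """Hypothetical helper: adjusted R-squared from R-squared, n and p."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Synthetic data (assumed for illustration): one useful feature, 20 noise features.
rng = np.random.default_rng(0)
n = 50
X_useful = rng.normal(size=(n, 1))
y = 2.0 * X_useful[:, 0] + rng.normal(0, 1.0, size=n)
X_noisy = np.hstack([X_useful, rng.normal(size=(n, 20))])

# Training R-squared with and without the noise features.
r2_small = LinearRegression().fit(X_useful, y).score(X_useful, y)
r2_big = LinearRegression().fit(X_noisy, y).score(X_noisy, y)

adj_small = adjusted_r2(r2_small, n, X_useful.shape[1])
adj_big = adjusted_r2(r2_big, n, X_noisy.shape[1])

print(f" 1 feature:  R2={r2_small:.3f}, adjusted R2={adj_small:.3f}")
print(f"21 features: R2={r2_big:.3f}, adjusted R2={adj_big:.3f}")
```

Adjusted R-squared is still computed on the training data here, so it complements rather than replaces evaluation on a held-out set.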

Exercise

■ Let’s try this with a quick exercise.

Do you have any questions?
[email protected]

Thanks!
