Regression: Machine Learning Course - CS-433

This document provides an overview of regression in machine learning. It defines regression as relating input variables to an output variable to either predict new outputs or understand the effect of inputs on outputs. The document notes that regression data consists of pairs of inputs and outputs and discusses common regression examples and goals. It also defines the regression function and provides additional context about correlation versus causation and terminology.


Machine Learning Course - CS-433

Regression

Sept 21, 2021

Minor changes by Martin Jaggi 2021, 2020, 2019, 2018, 2017, 2016
© Mohammad Emtiyaz Khan 2015
Last updated on: September 20, 2021

What is regression?
Regression relates input variables to an output variable, in order to predict outputs for new inputs and/or to understand the effect of the inputs on the output.

Dataset for regression


In regression, the data consists of pairs $(\mathbf{x}_n, y_n)$, where $y_n$ is the $n$-th output and $\mathbf{x}_n$ is a vector of $D$ inputs. The number of pairs $N$ is the data size and $D$ is the dimensionality.
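
As a minimal sketch of this layout, the snippet below stores a toy dataset as a list of N input vectors and a list of N outputs. The numbers and the meaning of the columns (height, age, weight) are made up for illustration and are not taken from the course material.

```python
# Hypothetical toy dataset: N = 4 pairs (x_n, y_n), each x_n holding D = 2 inputs.
# Illustrative, made-up values: inputs are (height in cm, age in years),
# the output is weight in kg.
X = [
    [160.0, 25.0],
    [170.0, 32.0],
    [175.0, 41.0],
    [182.0, 29.0],
]
y = [55.0, 68.0, 74.0, 80.0]

N = len(X)     # data size: number of (x_n, y_n) pairs
D = len(X[0])  # dimensionality: number of inputs per vector

# Every input vector has D entries, and there is exactly one output per input.
assert all(len(x_n) == D for x_n in X)
assert len(y) == N

for n, (x_n, y_n) in enumerate(zip(X, y), start=1):
    print(f"n={n}: x_n={x_n}, y_n={y_n}")
```

Keeping inputs and outputs in two parallel containers of length N is only one convention; any structure that pairs each input vector with its output would do.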

Examples of regression

(a) Height is correlated with weight. Taken from “Machine Learning for Hackers”.
(b) Do rich people vote for Republicans? Taken from Avi Feller et al. 2013, Red state/blue state in 2012 elections.
(c) How does advertisement in TV, radio, and newspaper affect sales? Taken from the book “An Introduction to Statistical Learning”.

Two goals of regression


In prediction, we wish to predict the output for a new input vector, e.g. what is the weight of a person who is 170 cm tall?

In interpretation, we wish to understand the effect of the inputs on the output, e.g. are taller people heavier too?

The regression function


For both goals, we need to find a function $f$ that approximates the output “well enough” given the inputs:

$$y_n \approx f(\mathbf{x}_n), \quad \text{for all } n$$
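
To make the idea concrete, here is a small sketch in which f is assumed to be a straight line, f(x) = w0 + w1 * x, fitted by ordinary least squares. Both the choice of a linear f and the fitting method are assumptions made for illustration; the notes above do not prescribe either. The height and weight values are made up, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares fit of f(x) = w0 + w1 * x to one-dimensional data."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    w1 = s_xy / s_xx           # slope
    w0 = y_mean - w1 * x_mean  # intercept
    return w0, w1


# Illustrative, made-up data: heights in cm and weights in kg.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weights_kg = [52.0, 58.0, 67.0, 73.0, 80.0]

w0, w1 = fit_line(heights_cm, weights_kg)


def f(x):
    """The fitted regression function."""
    return w0 + w1 * x


# Prediction: estimated weight of a person who is 170 cm tall.
predicted_weight = f(170.0)

# Interpretation: a positive slope means taller people in this sample
# tend to be heavier.
taller_is_heavier = w1 > 0

# "Well enough" can be judged from the residuals y_n - f(x_n).
residuals = [y_n - f(x_n) for x_n, y_n in zip(heights_cm, weights_kg)]

print(f"f(x) = {w0:.2f} + {w1:.2f} * x")
print(f"predicted weight at 170 cm: {predicted_weight:.1f} kg")
print(f"largest absolute residual: {max(abs(r) for r in residuals):.2f} kg")
```

The same fitted function serves both goals: evaluating `f` at a new input is prediction, while reading the sign and size of `w1` is interpretation.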


Additional Notes
Correlation ≠ Causation
Regression finds correlation, not a causal relationship, so interpret your results with caution.

This image is taken from www.venganza.org. You can see many more examples on the Spurious correlations page.
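
One way a correlation can arise without causation is a hidden common cause. The simulation below is a hedged sketch of that single mechanism, not a description of the examples referenced above: two synthetic variables `a` and `b` are both driven by a hidden variable `z`, and neither influences the other. All names and noise levels are illustrative assumptions.

```python
import math
import random


def pearson(us, vs):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(us)
    u_mean = sum(us) / n
    v_mean = sum(vs) / n
    cov = sum((u - u_mean) * (v - v_mean) for u, v in zip(us, vs))
    var_u = sum((u - u_mean) ** 2 for u in us)
    var_v = sum((v - v_mean) ** 2 for v in vs)
    return cov / math.sqrt(var_u * var_v)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples = 1000

# Hidden common cause z; a and b each depend on z plus independent noise.
z = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
a = [z_i + rng.gauss(0.0, 0.5) for z_i in z]
b = [z_i + rng.gauss(0.0, 0.5) for z_i in z]

# a and b are strongly correlated even though neither causes the other.
r_ab = pearson(a, b)

# Once the common cause is subtracted out, only independent noise remains.
r_noise = pearson(
    [a_i - z_i for a_i, z_i in zip(a, z)],
    [b_i - z_i for b_i, z_i in zip(b, z)],
)

print(f"correlation of a and b: {r_ab:.2f}")
print(f"correlation after removing z: {r_noise:.2f}")
```

A regression of `b` on `a` would find a clear relationship in this data, yet intervening on `a` would leave `b` unchanged, which is why the fitted relationship should not be read as causal.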

Machine Learning Jargon for Regression


Input variables are also known as features, covariates, independent variables, explanatory variables, exogenous variables, predictors, regressors.
Output variables are also known as targets, labels, responses, outcomes, dependent variables, endogenous variables, measured variables, regressands.

Prediction vs Interpretation
Some questions to think about: are these prediction tasks or interpretation tasks?
1. What is the life expectancy of a person who has been smoking for 10 years?
2. Does smoking cause cancer?
3. When the number of packs a smoker smokes per day doubles, does their life span get cut in half?
4. A massive earthquake will occur in California within the next 30 years.
5. More than 300 bird species in North America could see their habitat reduced by half or more by 2080.
