unit 4 regression

The document discusses various types of machine learning techniques, including supervised, unsupervised, and reinforcement learning, along with their applications and examples. It also covers the evaluation of models through training, test, and evaluation sets, and introduces cross-validation as a method for model assessment. Additionally, the document explains linear and nonlinear regression methods, including the least squares criterion for estimating regression coefficients.

Unit 4

Regression technique
Types of Learning
 In general, machine learning algorithms can be classified into
three types.

• Supervised learning
• Unsupervised learning
• Reinforcement learning
Supervised learning
 A training set of examples with the correct responses (targets) is
provided and, based on this training set, the algorithm
generalises to respond correctly to all possible inputs. This is
also called learning from exemplars. Supervised learning is the
machine learning task of learning a function that maps an input
to an output based on example input-output pairs.
Example supervised learning
 Consider the following data regarding patients entering a clinic.
The data consists of the gender and age of the patients and each
patient is labeled as “healthy” or “sick”.
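The clinic example above can be sketched as a tiny supervised learner. The records and the 1-nearest-neighbour rule below are illustrative assumptions, not from the slides; the point is only that labeled (input, target) pairs drive the prediction.

```python
# Hypothetical clinic data: (gender, age) -> label. Values are made up.
train = [(("M", 48), "sick"), (("M", 67), "sick"),
         (("F", 53), "healthy"), (("M", 49), "sick"),
         (("F", 32), "healthy"), (("M", 34), "healthy"),
         (("F", 21), "healthy")]

def distance(a, b):
    # Crude distance: age difference plus a penalty when genders differ.
    return abs(a[1] - b[1]) + (0 if a[0] == b[0] else 10)

def predict(x):
    # 1-nearest-neighbour: return the label of the closest training example.
    return min(train, key=lambda pair: distance(pair[0], x))[1]

print(predict(("M", 50)))
```

The learner generalises from the labeled examples to a new, unseen patient.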
Unsupervised learning
 Unsupervised learning is a type of machine learning algorithm
used to draw inferences from datasets consisting of input data
without labeled responses. In unsupervised learning, the
observations carry no classification or category labels; there are
no output values, so no function is estimated.
 Consider the following data regarding patients entering a clinic.
The data consists of the gender and age of the patients.
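With no labels, an algorithm can only group similar patients together. A minimal sketch, assuming made-up ages and a from-scratch one-dimensional k-means with k = 2 (the clustering method is my choice; the slides do not name one):

```python
# Hypothetical unlabelled clinic data: ages only. No labels are used anywhere.
ages = [21, 25, 28, 34, 48, 53, 60, 67]

def kmeans_1d(xs, iters=10):
    # Minimal k-means with k=2 on a single feature.
    c1, c2 = min(xs), max(xs)                     # initialise centroids at the extremes
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)  # recompute centroids
    return sorted(g1), sorted(g2)

young, old = kmeans_1d(ages)
print(young, old)   # two age clusters emerge without any labels
```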
Reinforcement learning
 This is somewhere between supervised and unsupervised
learning.
 Reinforcement learning is the problem of getting an agent to act
in the world so as to maximize its rewards.
 The algorithm gets told when the answer is wrong, but does not
get told how to correct it. It has to explore and try out different
possibilities until it works out how to get the answer right.
Reinforcement learning is sometimes called learning with a critic
because of this monitor, which scores the answer but does not
suggest improvements.
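A minimal sketch of "learning with a critic", using an epsilon-greedy two-armed bandit (my choice of example; the payout probabilities are made up). The reward only scores each choice; it never says which arm should have been pulled, so the agent must explore:

```python
import random

random.seed(0)
true_payout = {"A": 0.2, "B": 0.8}           # hidden from the agent
estimates = {"A": 0.0, "B": 0.0}             # agent's running reward estimates
counts = {"A": 0, "B": 0}

for step in range(2000):
    if random.random() < 0.1:                # explore occasionally
        arm = random.choice(["A", "B"])
    else:                                    # otherwise exploit the best estimate
        arm = max(estimates, key=estimates.get)
    reward = 1 if random.random() < true_payout[arm] else 0   # the critic's score
    counts[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm] # incremental mean

print(estimates)   # the agent's estimate for the better arm ends up higher
```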
Evaluating Models
 To train and evaluate models, data are often divided into three
sets: the training set, the test set, and the evaluation set
 Training Set
 is used to build the initial model
 may need to “enrich the data” to get enough of the special cases
 Test Set
 is used to adjust the initial model
 models can be tweaked to be less tied to idiosyncrasies of the training data and
adapted into a more general model
 the idea is to prevent “over-training” (i.e., finding patterns where none exist).
 Evaluation Set
 is used to evaluate the model performance
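The three-way split above can be sketched as follows; the 60/20/20 proportions are a common convention, not something the slides prescribe:

```python
import random

records = list(range(100))                  # stand-in for 100 data records
random.seed(42)
random.shuffle(records)                     # randomise before splitting

n = len(records)
train = records[: int(0.6 * n)]             # build the initial model
test = records[int(0.6 * n): int(0.8 * n)]  # tweak the model, guard against over-training
evaluation = records[int(0.8 * n):]         # final performance measurement only

print(len(train), len(test), len(evaluation))
```

The three sets are disjoint, so the evaluation set stays untouched until the very end.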

Test and Evaluation Sets
 Reading too much into the training set (overfitting)
 common problem with most data mining algorithms
 resulting model works well on the training set but performs poorly on unseen
data
 test set can be used to “tweak” the initial model, and to remove unnecessary
inputs or features

 Evaluation Set is used for final performance evaluation


 Insufficient data to divide into three disjoint sets?
 In such cases, validation techniques can play a major role:
 Cross Validation
 Bootstrap Validation

Cross Validation
 Cross validation is a heuristic that works as follows
 randomly divide the data into n folds, each with approximately the same
number of records
 create n models using the same algorithm and training parameters; each model
is trained with n−1 folds of the data and tested on the remaining fold
 can be used to find the best algorithm and its optimal training parameters
 Steps in Cross Validation
 1. Divide the available data into a training set and an evaluation set
 2. Split the training data into n folds
 3. Select an algorithm and training parameters
 4. Train and test n models using the n train-test splits
 5. Repeat steps 2 to 4 using different algorithms / parameters and compare
model accuracies
 6. Select the best model
 7. Use all the training data to train the model
 8. Assess the final model using the evaluation set
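Steps 2 and 4 above can be sketched in plain Python: split the training data into n folds and form the n train-test index splits, each holding out one fold (round-robin fold assignment is my choice; any roughly equal partition works):

```python
def kfold_splits(indices, n_folds):
    # Assign indices to folds round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for held_out in range(n_folds):
        test_fold = folds[held_out]
        # Train on the other n-1 folds, flattened into one list.
        train_folds = [i for f, fold in enumerate(folds) if f != held_out
                       for i in fold]
        splits.append((train_folds, test_fold))
    return splits

splits = kfold_splits(list(range(10)), 5)
for train_idx, test_idx in splits:
    print("held out:", test_idx, "train on:", train_idx)
```

Each record appears in exactly one test fold, so every record is used for both training and testing across the n models.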
Example – 5 Fold Cross Validation

Linear Regression
 Linear regression: involves a response variable y and a single predictor
variable x:  y = w0 + w1 x
 w0 (y-intercept) and w1 (slope) are regression coefficients
 Method of least squares: estimates the best-fitting straight line

        w1 = Σi=1..|D| (xi − x̄)(yi − ȳ) / Σi=1..|D| (xi − x̄)²

        w0 = ȳ − w1 x̄
 Multiple linear regression: involves more than one predictor variable


 Training data is of the form (X1, y1), (X2, y2),…, (X|D|, y|D|)
 Ex. For 2-D data, we may have: y = w0 + w1 x1+ w2 x2
 Solvable by extension of least square method
 Many nonlinear functions can be transformed into the above
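The least-squares formulas for w1 and w0 above can be computed directly; the (x, y) values below are made up for illustration, chosen to lie near y = 2x + 1:

```python
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 9.0, 10.8]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# w1 = sum (xi - x_bar)(yi - y_bar) / sum (xi - x_bar)^2
w1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
# w0 = y_bar - w1 * x_bar
w0 = y_bar - w1 * x_bar

print(w1, w0)   # slope near 2, intercept near 1
```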
Nonlinear Regression
 Some nonlinear models can be modeled by a polynomial
function
 A polynomial regression model can be transformed into a linear
regression model. For example,

        y = w0 + w1 x + w2 x² + w3 x³

is convertible to linear form with the new variables x2 = x², x3 = x³:

        y = w0 + w1 x + w2 x2 + w3 x3

 Other functions, such as the power function, can also be
transformed to a linear model
 Some models are intractably nonlinear (e.g., a sum of exponential
terms); least squares estimates are still possible, but only through
extensive computation on more complex functions
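The variable substitution above can be sketched as follows. The data are generated from a known cubic (an illustrative assumption), and the cubic is fitted as a *linear* model in the new variables x2 = x² and x3 = x³ using NumPy's linear least squares:

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1 + 2 * x - 0.5 * x**2 + 0.25 * x**3     # true w0=1, w1=2, w2=-0.5, w3=0.25

# Columns [1, x, x2, x3] turn the cubic into a linear model in four variables.
X = np.column_stack([np.ones_like(x), x, x**2, x**3])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary linear least squares
print(np.round(coef, 3))
```

Because the model is linear in the transformed variables, plain least squares recovers all four coefficients.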
More on Linear Models
 Population linear regression:  y = β0 + β1 x + ε

(Figure: the population regression line with intercept β0 and slope β1; for a
given xi, the random error εi is the gap between the observed value of y and
the predicted value of y.)
Estimated Regression Model

 The sample regression line provides an estimate of the population
regression line:

        ŷi = w0 + w1 x

where ŷi is the estimated (or predicted) y value, w0 is the estimate of the
regression intercept, w1 is the estimate of the regression slope, and x is
the independent variable.
 The individual random error terms ei have a mean of zero


Least Squares Criterion
 w0 and w1 are obtained by minimizing the sum of the squared residuals:

        Σ e² = Σ (y − ŷ)² = Σ (y − (w0 + w1 x))²

 The formulas for w1 and w0 are:

        w1 = Σ (x − x̄)(y − ȳ) / Σ (x − x̄)²

        w0 = ȳ − w1 x̄
General Form of Linear Functions

Slide thanks to Greg Shakhnarovich (Brown Univ., 2006)


Least Squares Generalization
 Simple least squares:
Determine the linear coefficients α, β that minimize the sum of
squared errors (SSE).
Use standard (multivariate) differential calculus:
 differentiate SSE with respect to α, β
 find the zeros of each partial derivative
 solve for α, β
 One dimension:

        SSE = Σj=1..N (yj − (α + β xj))²        (N = number of samples)

        β = cov[x, y] / var[x]        α = ȳ − β x̄        (x̄, ȳ = means of training x, y)

        ŷt = α + β xt   for a test sample xt
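The covariance/variance form of the solution can be checked numerically; the data below are made up. Note that `np.cov` defaults to the unbiased (N−1) estimate while `np.var` defaults to the biased (N) one, so `ddof` is set explicitly to keep them consistent:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.2, 4.1, 6.1, 7.9, 10.2])

# beta = cov[x, y] / var[x], with matching ddof in numerator and denominator.
beta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
alpha = y.mean() - beta * x.mean()

y_hat = alpha + beta * 2.5        # prediction for a test sample x_t = 2.5
print(beta, alpha, y_hat)
```

The same (beta, alpha) would come out of the summation formulas on the earlier slide; the two forms are algebraically identical.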
mtcars dataset example

The remaining slides walk through a demonstration on the mtcars dataset
(a built-in R dataset): a plot of the data, a fitted regression model, a
training/test split, and a multivariate regression model. (Screenshots
omitted.)
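The slides' multivariate demonstration uses R; a minimal Python sketch of the same idea follows, solving y = w0 + w1·x1 + w2·x2 by the normal equations (the extension of least squares mentioned earlier). The data are synthetic, generated from known weights so the fit can be checked, not taken from mtcars:

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
y = 3 + 2 * x1 - 1 * x2                      # true w0=3, w1=2, w2=-1

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(x1), x1, x2])
# Normal equations: (X'X) w = X'y.
w = np.linalg.solve(X.T @ X, X.T @ y)
print(np.round(w, 3))
```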
