3.8 Supervised Learning With Python A

Uploaded by

Spry Cylinder

Supervised Learning with Python

Engr. Elisa G. Eleazar

School of Chemical, Biological, and Materials Engineering and Sciences

DS100: APPLIED DATA SCIENCE 1

Outline
SUPERVISED LEARNING IN PYTHON
• Classification
• Regression

Module 3.8: Learning Outcomes
1. Define Machine Learning and differentiate the types
2. Differentiate Classification from Regression
3. Write Python code for Classification and Regression problems


Supervised Learning
MACHINE LEARNING
• the science and art of giving computers the ability to learn to make decisions from data without being explicitly programmed
• Supervised Learning: uses labeled data (ex: learning to predict whether an email is spam or not)
• Unsupervised Learning: uses unlabeled data (ex: clustering Wikipedia entries into categories)
• Reinforcement Learning: machines or software agents interact with an environment

SUPERVISED LEARNING
• the aim is to build a model that is able to predict the target variable given the predictor variables
• Independent Variable: also called features or predictor variables
• Dependent Variable: also called the target or response variable
• Classification: the target variable consists of categories
• Regression: the target is a continuous variable


Predictor variables: sepal length, sepal width, petal length, petal width
Target variable: species

     sepal length (cm)   sepal width (cm)   petal length (cm)   petal width (cm)   species
0    5.1                 3.5                1.4                 0.2                setosa
1    4.9                 3.0                1.4                 0.2                setosa
2    4.7                 3.2                1.3                 0.2                setosa
3    4.6                 3.1                1.5                 0.2                setosa
4    5.0                 3.6                1.4                 0.2                setosa
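
A table like the one above can be reproduced with a short sketch. It assumes the rows come from the iris dataset bundled with scikit-learn; the slides do not show how the data was loaded, so the loading call and the names X, y and df are illustrative choices.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# Predictor variables (features): one column per measurement.
X = pd.DataFrame(iris.data, columns=iris.feature_names)

# Target variable: map the integer class codes to species names.
y = pd.Series(iris.target_names[iris.target], name="species")

# Combine predictors and target into one table and show the first rows.
df = X.copy()
df["species"] = y
print(df.head())
```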


Supervised Learning
Python Packages for Machine Learning


Classification
DATA PRE-PROCESSING


Classification
EXPLORATORY DATA ANALYSIS


Classification
VISUAL EXPLORATORY DATA ANALYSIS


Classification
CLASSIFICATION
• process of building a model that is able to predict the categorical target variable given the predictor variables
• Training (labeled) data → Label

K-NEAREST NEIGHBOR
• algorithm that predicts the label of a data point by taking the majority vote of the ‘k’ closest labeled data points

Classification: red if k=3; green if k=5
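
The majority vote can be sketched in plain Python. This is a minimal illustration, not the implementation used in the course: the function name knn_predict and the six 2-D points are hypothetical, chosen so that the predicted label changes from red at k=3 to green at k=5, as in the caption above.

```python
from collections import Counter
import math

def knn_predict(points, labels, query, k):
    """Return the majority label among the k labeled points closest to query."""
    # Rank the labeled points by Euclidean distance to the query point.
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    # Count the labels of the k nearest points and take the most common one.
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled points (illustration only).
points = [(1.0, 0.0), (0.0, 1.2), (-1.5, 0.0), (0.0, -2.0), (2.2, 0.0), (5.0, 5.0)]
labels = ["red", "red", "green", "green", "green", "red"]
query = (0.0, 0.0)

print(knn_predict(points, labels, query, k=3))  # two red votes, one green
print(knn_predict(points, labels, query, k=5))  # two red votes, three green
```

Odd values of k avoid tied votes when there are only two classes.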

Classification
MODEL BUILDING

• .fit() fits (trains) the model on the labeled data
• .predict() predicts the label of an unlabeled data point
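
The two calls can be sketched with scikit-learn's KNeighborsClassifier. The original code on this slide is not preserved, so the dataset (iris), the value n_neighbors=6 and the variable names are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Build the classifier; the number of neighbors is an illustrative choice.
knn = KNeighborsClassifier(n_neighbors=6)

# .fit() stores the labeled training data inside the model.
knn.fit(X, y)

# .predict() expects a 2-D array: one row per unlabeled observation.
X_new = [[5.1, 3.5, 1.4, 0.2]]
pred = knn.predict(X_new)
print(iris.target_names[pred])
```

The new observation here has the same measurements as a setosa flower, so the prediction is expected to be that species.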


Classification
MODEL BUILDING

Requirements for the use of scikit-learn

• data must be a NumPy array or a pandas DataFrame
• features must take numeric (continuous) values, so categorical features have to be encoded first
• there must be no missing values
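
These requirements can be checked before fitting a model. The sketch below uses a small hypothetical DataFrame (the column names and values are made up for illustration) and drops incomplete rows, which is only one of several possible remedies.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value (illustration only).
df = pd.DataFrame({
    "feature_a": [1.0, 2.5, np.nan, 4.0],
    "feature_b": [0.2, 0.4, 0.6, 0.8],
})

# Check that every column is numeric.
all_numeric = all(pd.api.types.is_numeric_dtype(dtype) for dtype in df.dtypes)

# Count missing values across the whole table.
n_missing = int(df.isna().sum().sum())
print(all_numeric, n_missing)

# Simplest remedy: drop rows with missing values (imputation is another option).
clean = df.dropna()
print(clean.shape)
```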


Classification
MEASURING MODEL PERFORMANCE

ACCURACY
• commonly used metric for measuring model performance in classification problems
• number of correct predictions divided by the total number of data points
• normally done by splitting data into training set and test set
  • fit/train the classifier on the training set
  • make predictions on the test set
  • compare predictions with the known labels

train_test_split() randomly splits the data

Arguments:
• feature data
• targets/labels
• test size

Results (4 arrays):
• training data
• test data
• training labels
• test labels
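
The workflow above can be sketched as follows. The dataset (iris), the 30% test size, the random seed and the number of neighbors are assumptions for illustration; the original slide code is not preserved.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Four arrays come back in this order:
# training data, test data, training labels, test labels.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=21, stratify=y
)

# Fit on the training set, predict on the test set.
knn = KNeighborsClassifier(n_neighbors=8)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Accuracy = correct predictions / total number of test points.
accuracy = (y_pred == y_test).mean()
print(accuracy, knn.score(X_test, y_test))
```

Here stratify=y keeps the class proportions the same in both sets, and knn.score() returns the same accuracy as the manual calculation.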


Classification
MODEL COMPLEXITY

Model Complexity Curve

Smaller k → more complex model → can lead to overfitting

Larger k → smoother decision boundary → less complex model
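
The numbers behind a model complexity curve can be computed by fitting the classifier for a range of k values and recording training and test accuracy. This is a sketch under assumed settings (iris data, k from 1 to 15, a 30% test split); the plotted curve from the slide is not preserved.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=21, stratify=y
)

train_acc, test_acc = {}, {}
for k in range(1, 16):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    # Accuracy on data the model has seen versus data it has not seen.
    train_acc[k] = knn.score(X_train, y_train)
    test_acc[k] = knn.score(X_test, y_test)

for k in train_acc:
    print(f"k={k:2d}  train={train_acc[k]:.3f}  test={test_acc[k]:.3f}")
```

At k=1 every training point is its own nearest neighbor, so training accuracy is perfect; a large gap between training and test accuracy is the usual sign of overfitting.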


Regression
DATA PRE-PROCESSING

CRIM: per capita crime rate
NX: nitric oxide concentration
RM: average number of rooms per dwelling
MEDV: median value of owner-occupied homes in thousands of dollars (target variable)


Regression
VISUAL EXPLORATORY DATA ANALYSIS


Regression
MODEL BUILDING: LINEAR REGRESSION AND VALIDATION: R^2
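
Linear regression and its R^2 score can be sketched as below. Because the original slide code and data file are not preserved, the example uses synthetic data generated on the spot (a straight line plus noise); all names and numbers are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (illustration only): y = 3x + 5 plus Gaussian noise.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 5.0 + rng.normal(0, 1.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit the regression line on the training set.
reg = LinearRegression()
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)

# R^2 = 1 - (residual sum of squares) / (total sum of squares).
ss_res = np.sum((y_test - y_pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot
print(r2_manual, reg.score(X_test, y_test))
```

For regressors, .score() returns R^2, so the manual value and the built-in value agree.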

