
03 Classification Handout

This document summarizes a lecture on linear classification. It introduces classification problems and discusses modeling classification as regression by assigning categorical labels numerical values. It describes how linear classifiers use a decision boundary to separate classes and discusses learning classifiers by minimizing loss functions. Key concepts covered include decision boundaries, loss functions, and metrics for evaluating classification models like recall and precision.


CSC 411: Lecture 03: Linear Classification

Richard Zemel, Raquel Urtasun and Sanja Fidler

University of Toronto

Zemel, Urtasun, Fidler (UofT) CSC 411: 03-Classification 1 / 24


Examples of Problems

What digit is this?


How can I predict this? What are my input features?
Regression

What do all these problems have in common?

Categorical outputs, called labels


(eg, yes/no, dog/cat/person/other)

Assigning each input vector to one of a finite number of labels is called


classification

Binary classification: two possible labels (eg, yes/no, 0/1, cat/dog)

Multi-class classification: multiple possible labels

We will first look at binary problems, and discuss multi-class problems later
in class



Today

Linear Classification (binary)


Key Concepts:
- Classification as regression
- Decision boundary
- Loss functions
- Metrics to evaluate classification



Classification vs Regression

We are interested in mapping the input x ∈ X to a label t ∈ Y


In regression typically Y = ℝ
Now Y is categorical



Classification as Regression

Can we do this task using what we have learned in previous lectures?


Simple hack: Ignore that the output is categorical!
Suppose we have a binary problem, t ∈ {−1, 1}
Assuming the standard model used for (linear) regression
y (x) = f (x, w) = wT x

How can we obtain w?


Use least squares, w = (XT X)−1 XT t. How are X and t constructed?
Which loss are we minimizing? Does it make sense?
ℓ_square (w, t) = (1/N) Σ_{n=1}^{N} ( t^(n) − wT x^(n) )^2

How do I compute a label for a new example? Let’s see an example
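As a sketch of this "simple hack" (the toy 1D data and the use of NumPy here are my additions, not from the lecture):

```python
import numpy as np

# Toy binary data: 1-D inputs with labels t in {-1, +1}.
# The first column of X is the constant bias feature.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.5], [1.0, 2.5]])
t = np.array([-1.0, -1.0, 1.0, 1.0])

# Least-squares solution w = (X^T X)^{-1} X^T t, computed via lstsq for stability.
w, *_ = np.linalg.lstsq(X, t, rcond=None)

# Label a new example by thresholding the regression output at zero.
x_new = np.array([1.0, 0.8])
y_new = np.sign(w @ x_new)
print(y_new)  # 1.0 -> the point falls on the positive side
```

Fitting the regression gives real-valued outputs; the sign of the output supplies the label for a new example.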


Classification as Regression

A 1D example (input x is one-dimensional):

The colors indicate labels (a blue plus denotes that t (i) is from the first
class, red circle that t (i) is from the second class)
Figure from G. Shakhnarovich



Decision Rules

Our classifier has the form


f (x, w) = w0 + wT x

A reasonable decision rule is


y =  1   if f (x, w) ≥ 0
    −1   otherwise

How can I mathematically write this rule?


y (x) = sign(w0 + wT x)
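A minimal sketch of this decision rule in code (the 2-D weight values here are hypothetical, chosen only for illustration):

```python
import numpy as np

def predict(x, w, w0):
    """Linear classifier: return +1 if w0 + w.x >= 0, else -1."""
    return 1 if w0 + np.dot(w, x) >= 0 else -1

# Hypothetical weights: the boundary is the line x1 + x2 - 1 = 0.
w, w0 = np.array([1.0, 1.0]), -1.0
print(predict(np.array([2.0, 0.5]), w, w0))  # 1  (positive side of the line)
print(predict(np.array([0.0, 0.0]), w, w0))  # -1 (negative side of the line)
```

Note the rule assigns +1 exactly on the boundary, matching the "≥ 0" case above.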

What does this function look like?


Decision Rules
A 1D example:

[Figure: 1D plot of w0 + wT x against x, crossing zero at the threshold; ŷ = +1 where the function is ≥ 0, ŷ = −1 where it is negative. Figure from G. Shakhnarovich]

How can I mathematically write this rule?

y (x) = sign(w0 + wT x)

This specifies a linear classifier: it has a linear boundary (hyperplane)

w0 + wT x = 0

which separates the space into two "half-spaces"



Example in 1D

The linear classifier has a linear boundary (hyperplane)

w0 + wT x = 0

which separates the space into two "half-spaces"


In 1D this is simply a threshold



Example in 2D

The linear classifier has a linear boundary (hyperplane)


w0 + wT x = 0
which separates the space into two "half-spaces"
In 2D this is a line
Example in 3D

The linear classifier has a linear boundary (hyperplane)


w0 + wT x = 0
which separates the space into two "half-spaces"
In 3D this is a plane
What about higher-dimensional spaces?
Geometry

wT x = 0 is a line passing through the origin and orthogonal to w


wT x + w0 = 0 shifts it by w0

Figure from G. Shakhnarovich
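One useful consequence of this geometry (a standard fact, not stated explicitly on the slide) is that the signed distance from a point x to the boundary is (w0 + wT x) / ‖w‖; a sketch with hypothetical weights:

```python
import numpy as np

def signed_distance(x, w, w0):
    """Signed distance from x to the hyperplane w0 + w.x = 0.

    Positive on the side w points toward, negative on the other side.
    """
    return (w0 + np.dot(w, x)) / np.linalg.norm(w)

# Hypothetical boundary 3*x1 + 4*x2 - 5 = 0, so ||w|| = 5.
w, w0 = np.array([3.0, 4.0]), -5.0
print(signed_distance(np.array([1.0, 2.0]), w, w0))  # (3 + 8 - 5) / 5 = 1.2
```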



Learning Linear Classifiers

Learning consists of estimating a "good" decision boundary


We need to find w (direction) and w0 (location) of the boundary
What does “good” mean?
Is this boundary good?

We need a criterion that tells us how to select the parameters


Do you know any?



Loss functions

Classifying using a linear decision boundary reduces the data dimension to 1

y (x) = sign(w0 + wT x)

What is the cost of being wrong?


Loss function: L(y, t) is the loss incurred for predicting y when the correct
answer is t
For medical diagnosis: for a diabetes screening test, is it better to have false
positives or false negatives?
For movie ratings: the "truth" is that Alice thinks E.T. is worthy of a 4.
How bad is it to predict a 5? How about a 2?



Loss functions

A possible loss to minimize is the zero/one loss


L(y (x), t) =  0   if y (x) = t
               1   if y (x) ≠ t

Is this minimization easy to do? Why?
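A sketch of the zero/one loss on toy arrays (my own, not from the lecture). Averaged over a dataset it is just the error rate; because it is piecewise constant, its gradient is zero almost everywhere, which is what makes direct minimization hard:

```python
import numpy as np

def zero_one_loss(y, t):
    """0 if the prediction matches the target, 1 otherwise."""
    return 0 if y == t else 1

# Over a batch, the average 0-1 loss is the error rate.
y = np.array([1, -1, 1, 1])   # predictions
t = np.array([1, 1, 1, -1])   # targets
print(np.mean(y != t))  # 0.5 -> two of the four predictions are wrong
```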



Other Loss functions

Zero/one loss for a classifier


L0−1 (y (x), t) =  0   if y (x) = t
                   1   if y (x) ≠ t

Asymmetric Binary Loss

LABL (y (x), t) =  α   if y (x) = 1 ∧ t = 0
                   β   if y (x) = 0 ∧ t = 1
                   0   if y (x) = t

Squared (quadratic) loss


Lsquared (y (x), t) = (t − y (x))2

Absolute Error
Labsolute (y (x), t) = |t − y (x)|
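A quick comparison of these losses on a hypothetical example (the rating values and the α = 1, β = 10 weights are illustrative choices, not from the lecture):

```python
# Predicted vs. true movie rating under the squared and absolute losses.
t, y = 4.0, 2.0
print((t - y) ** 2)  # squared loss: 4.0
print(abs(t - y))    # absolute loss: 2.0

# Asymmetric binary loss with alpha=1 (false positive) and beta=10
# (false negative), e.g. a screening test where missing a case is
# ten times worse than a false alarm.
def abl(y, t, alpha=1.0, beta=10.0):
    if y == t:
        return 0.0
    return alpha if (y == 1 and t == 0) else beta

print(abl(1, 0))  # 1.0  -> false positive
print(abl(0, 1))  # 10.0 -> false negative
```

The squared loss penalizes the 2-point error four times as much as the absolute loss penalizes it relative to a 1-point error, which matters when ratings far from the truth should be punished harshly.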
More Complex Loss Functions

What if the movie predictions are used for rankings? Now the predicted
ratings don’t matter, just the order that they imply.
In what order does Alice prefer E.T., Amelie and Titanic?
Possibilities:
- 0-1 loss on the winner
- Permutation distance
- Accuracy of top K movies



Can we always separate the classes?

If we can separate the classes, the problem is linearly separable



Can we always separate the classes?

Causes of imperfect separation:


Model is too simple
Noise in the inputs (i.e., data attributes)
Simple features that do not account for all variations
Errors in data targets (mis-labelings)

Should we make the model complex enough to have perfect separation in the
training data?



Metrics
How to evaluate how good my classifier is? How is it doing on dog vs no-dog?



Metrics

How to evaluate how good my classifier is?


Recall: the fraction of relevant instances that are retrieved

R = TP / (TP + FN) = TP / (all ground-truth instances)

Precision: the fraction of retrieved instances that are relevant

P = TP / (TP + FP) = TP / (all predicted positives)

F1 score: harmonic mean of precision and recall

F1 = 2 · (P · R) / (P + R)
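A sketch computing these metrics on hypothetical predictions (the arrays are my own toy example):

```python
import numpy as np

# Hypothetical predictions and ground truth for a binary detector (1 = positive).
y = np.array([1, 1, 0, 1, 0, 0])  # predicted labels
t = np.array([1, 0, 0, 1, 1, 0])  # true labels

TP = np.sum((y == 1) & (t == 1))  # true positives:  2
FP = np.sum((y == 1) & (t == 0))  # false positives: 1
FN = np.sum((y == 0) & (t == 1))  # false negatives: 1

P = TP / (TP + FP)        # precision = 2/3
R = TP / (TP + FN)        # recall    = 2/3
F1 = 2 * P * R / (P + R)  # harmonic mean; here also 2/3 since P = R
print(P, R, F1)
```

When precision and recall are equal, the F1 score equals both; otherwise it sits closer to the smaller of the two, which is why it penalizes lopsided classifiers.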



More on Metrics
How to evaluate how good my classifier is?
Precision: the fraction of retrieved instances that are relevant
Recall: the fraction of relevant instances that are retrieved

Precision-recall curve

Average Precision (AP): area under the precision-recall curve



Metrics vs Loss

Metrics on a dataset are what we care about (performance)


We typically cannot directly optimize for the metrics
Our loss function should reflect the problem we are solving. We then hope it
will yield models that will do well on our dataset

