
UET

Since 2004

ĐẠI HỌC CÔNG NGHỆ, ĐHQGHN


VNU-University of Engineering and Technology

INT3405 - Machine Learning


Lecture 4: Classification (P1)
Duc-Trong Le & Viet-Cuong Ta

Hanoi, 09/2023
Recap: Key Issues in Machine Learning
● What are good hypothesis spaces? (the part we choose)
○ Which spaces have been useful in practical applications and why?
● What algorithms can work with these spaces? (the part we optimize)
○ Are there general design principles for machine learning algorithms?
● How can we find the best hypothesis in an efficient way?
○ How to find the optimal solution efficiently (“optimization” question)
● How can we optimize accuracy on future data?
○ Known as the “overfitting” problem (i.e., “generalization” theory)
● How can we have confidence in the results?
○ How much training data is required to find an accurate hypothesis? (“statistical” question)
● Are some learning problems computationally intractable? (“computational” question)
● How can we formulate application problems as machine learning problems? (“engineering”
question)
Recap: Model Representation
[Figure: Training Set → Learning Algorithm → Hypothesis h; input x (size of house) → h → estimated house price y, with a fitted line in the x–y plot]

How do we represent h?
Linear regression with one variable:
“Univariate Linear Regression”

How do we choose the parameters?
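A minimal sketch of this model in code, assuming the standard univariate form hθ(x) = θ0 + θ1·x and the mean-squared-error cost (the exact equations are in the slide's figures):

```python
# Sketch of univariate linear regression (assumed form: h(x) = theta0 + theta1 * x).
def h(theta0, theta1, x):
    """Hypothesis: predicted house price for size x."""
    return theta0 + theta1 * x

def mse_cost(theta0, theta1, xs, ys):
    """Mean-squared-error cost J(theta0, theta1) over the training set."""
    m = len(xs)
    return sum((h(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
```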


Recap: Gradient Descent for Optimization



Recap: Gradient Descent Example

[Figure: the hypothesis for fixed parameters (a function of x), alongside the cost as a function of the parameters]

How fast does gradient descent converge to the global optimum?
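A minimal batch gradient-descent sketch for the univariate case above, assuming the usual simultaneous update with learning rate α; how fast it converges depends on α and on the shape of the cost surface:

```python
def gradient_descent(xs, ys, alpha=0.01, iters=1000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x with the MSE cost."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iters):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                              # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m   # dJ/dtheta1
        theta0 -= alpha * grad0                              # simultaneous update
        theta1 -= alpha * grad1
    return theta0, theta1
```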



Normal Equation (3)
● Matrix-vector formulation

● Analytical solution
Takes O(mn² + n³) time
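The analytical solution referred to here is presumably the standard normal equation θ = (XᵀX)⁻¹Xᵀy; a minimal NumPy sketch, assuming X is an m×n design matrix that already contains a bias column:

```python
import numpy as np

def normal_equation(X, y):
    """Closed-form least-squares solution theta = (X^T X)^{-1} X^T y.

    X: (m, n) design matrix (bias column assumed included), y: (m,) targets.
    Forming X^T X and solving the n x n system is the O(mn^2 + n^3) cost.
    np.linalg.solve is used instead of an explicit inverse for numerical stability.
    """
    return np.linalg.solve(X.T @ X, X.T @ y)
```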



Outline
● Bayesian Learning
○ Bayes Theorem
○ MAP learning vs. MLE learning
● Probabilistic Generative Models
○ Naïve Bayes Classifier
● Discriminative Models
○ Logistic Regression
○ K-Nearest Neighbors



Bayes Theorem
● Bayes Theorem:  P(h|D) = P(D|h) · P(h) / P(D)
  (posterior = likelihood × prior / probability of the data)

Thomas Bayes (1702–1761)

○ P(h) = prior probability of hypothesis h


○ P(D) = prior probability of training data D
○ P(h|D) = conditional probability of h given D (Posterior)
○ P(D|h) = conditional probability of D given h (Likelihood)
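A toy numeric illustration of the theorem (the priors and likelihoods below are hypothetical, chosen only to show the arithmetic):

```python
# Hypothetical numbers: two hypotheses h1, h2 with assumed priors and likelihoods.
prior = {"h1": 0.3, "h2": 0.7}
likelihood = {"h1": 0.9, "h2": 0.2}   # P(D | h)

evidence = sum(likelihood[h] * prior[h] for h in prior)               # P(D)
posterior = {h: likelihood[h] * prior[h] / evidence for h in prior}   # P(h | D)
print(posterior)   # h1 ≈ 0.66, h2 ≈ 0.34
```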



Maximum A Posteriori Learning (MAP)
● Maximum a posteriori (MAP) learning
○ Find the most probable hypothesis given the training data by maximizing the posterior probability.

The prior encodes knowledge/preference.
MAP Learning
● For each hypothesis h in H, calculate the posterior prob.

● Output the hypothesis h with the highest posterior prob.

● Comments:
○ Computationally intensive
○ Gives a standard for judging the performance of learning algorithms
○ Choosing P(h) reflects our prior knowledge about the learning task



Maximum-Likelihood Estimation (MLE)

● Maximum Likelihood Estimation (MLE) learning


○ Assume each hypothesis is equally probable a priori

○ Maximize the likelihood of the training data
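A toy sketch contrasting the two criteria on a hypothetical coin-flip task (the hypotheses, data, and prior below are made up for illustration):

```python
# Hypothetical example: estimate a coin's head-probability from observed flips.
data = ["H", "H", "H", "T"]                     # observed flips
hypotheses = [0.3, 0.5, 0.7]                    # candidate values of P(heads)
prior = {0.3: 0.2, 0.5: 0.6, 0.7: 0.2}          # assumed prior, favouring a fair coin

def likelihood(p, flips):
    """P(D | h): product of per-flip probabilities."""
    out = 1.0
    for f in flips:
        out *= p if f == "H" else (1 - p)
    return out

h_mle = max(hypotheses, key=lambda p: likelihood(p, data))             # argmax P(D|h)
h_map = max(hypotheses, key=lambda p: likelihood(p, data) * prior[p])  # argmax P(D|h) P(h)
print(h_mle, h_map)   # 0.7 0.5 — MLE follows the data alone, MAP is pulled toward the prior
```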



Relationship between MLE Learning
and Least-Squared Error Learning (1)
● Consider

● Assume

● We want to learn the parameters of f(x)


● Linear regression minimizes the MSE objective (cost function)
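The derivation behind this relationship is standard; a sketch, assuming additive Gaussian noise y_i = f_w(x_i) + ε_i with ε_i ~ N(0, σ²) (the symbols f_w and σ are assumed here, not taken from the extracted text):

```latex
% Sketch: MLE under Gaussian noise reduces to least squares.
\begin{align*}
p(y_i \mid x_i, w) &= \frac{1}{\sqrt{2\pi}\,\sigma}
  \exp\!\Big(-\frac{(y_i - f_w(x_i))^2}{2\sigma^2}\Big) \\
\log \prod_{i=1}^{m} p(y_i \mid x_i, w)
  &= -\frac{1}{2\sigma^2}\sum_{i=1}^{m} \big(y_i - f_w(x_i)\big)^2 + \text{const} \\
\hat{w}_{\mathrm{MLE}} &= \arg\max_w \log p(D \mid w)
  = \arg\min_w \sum_{i=1}^{m} \big(y_i - f_w(x_i)\big)^2
\end{align*}
```

So under this noise model, MLE learning and least-squared-error learning pick the same parameters.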



Relationship between MLE Learning
and Least-Squared Error Learning (2)



Probabilistic Generative Models (1)
• Classify instance x into one of K classes

p(Ck | x) ∝ p(x | Ck) · p(Ck), where p(x | Ck) is the density function for class Ck and p(Ck) is the class prior



Probabilistic Generative Models (2)
• Classification decision

• The key is to estimate the parameters



Probabilistic Generative Models (3)
● Given training data
● We have closed-form solutions:
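The closed-form solutions themselves are given as equations on the slide (not reproduced in this text); a sketch of the usual maximum-likelihood estimates, assuming Gaussian class-conditional densities with a shared covariance matrix:

```python
import numpy as np

def fit_gaussian_generative(X, y, num_classes):
    """MLE estimates for a generative model with Gaussian class-conditionals.

    X: (m, d) data, y: (m,) integer labels in {0, ..., num_classes-1}.
    Returns class priors, per-class means, and a shared covariance matrix.
    """
    m, d = X.shape
    priors = np.array([(y == k).mean() for k in range(num_classes)])        # P(C_k)
    means = np.array([X[y == k].mean(axis=0) for k in range(num_classes)])  # mu_k
    cov = np.zeros((d, d))
    for k in range(num_classes):
        diff = X[y == k] - means[k]
        cov += diff.T @ diff
    cov /= m                                                                # shared Sigma
    return priors, means, cov
```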



Probabilistic Generative Models (4)

[Figure: class-conditional densities and the resulting posterior probability]



Curse of Dimensionality
● One challenge of learning with high-dimensional data is having too few data samples
● Suppose 5 samples is considered enough in 1-D; covering a d-dimensional space at the same density takes 5^d points:
– 1D : 5 points
– 2D : 25 points
– 3D : 125 points
– 10D : 9,765,625 points



Naïve Bayes Classifier (1)
• Hard to estimate the class-conditional density p(x | Ck) for high-dimensional data x
• Conditional independence assumption
• All attributes are conditionally independent given the class
• Naïve Bayes approximation: the joint density factorizes into per-dimension (1-D) distributions



Naïve Bayes Classifier (2)
● Text categorization
x: the word histogram of a document
● Bag of words assumption:
○ Assume position doesn’t matter
● Conditional independence: p(x | Ck) factorizes over words, with each word's factor raised to the number of times that word occurs in document x



Parameter Estimation
● Learning by Maximum Likelihood Estimation
○ Simply count the frequencies in the data

○ Create a mega-document for topic k by concatenating all the docs in this topic
○ Compute frequency of w in the mega-document



Problem with Maximum Likelihood
● What if a new word (e.g., a word newly coined on the internet) appears in a test document but never appears in the training data? Its maximum-likelihood probability is zero, which zeroes out the whole product.

● Smoothing
○ Avoid zero probabilities
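To make the preceding description concrete, here is a compact sketch of a bag-of-words Naïve Bayes classifier with add-one (Laplace) smoothing; the function names and data layout are illustrative, not from the lecture:

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """docs: list of token lists; labels: list of class ids.
    Returns log class priors and per-class log word probabilities (add-one smoothed)."""
    classes = set(labels)
    vocab = {w for d in docs for w in d}
    log_prior, log_pw = {}, {}
    for c in classes:
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(w for d in class_docs for w in d)        # "mega-document" counts
        total = sum(counts.values())
        log_pw[c] = {w: math.log((counts[w] + 1) / (total + len(vocab))) for w in vocab}
    return log_prior, log_pw, vocab

def classify_nb(doc, log_prior, log_pw, vocab):
    """argmax_c log P(c) + sum over word occurrences of log P(w | c); unseen words are skipped."""
    scores = {c: log_prior[c] + sum(log_pw[c][w] for w in doc if w in vocab)
              for c in log_prior}
    return max(scores, key=scores.get)
```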



Naïve Bayes Classifier (3)
• Bad approximation
• Good classification accuracy
[Figure: text categorization results on 20 Newsgroups]



Naïve Bayes Classifier (4)

Naïve Bayes Classifier:



Example: “Play Tennis” (1)
● Based on the examples in the table, classify the following datum x:
x = (Outl=Sunny, Temp=Cool, Hum=High, Wind=Strong)



Example: “Play Tennis” (2)



The Independence Assumption
● Makes computation possible
● Yields optimal classifiers when satisfied
● Fairly good empirical results
● But is seldom satisfied in practice, as attributes (variables) are
often correlated
● Attempts to overcome this limitation:
○ Bayesian networks, which combine Bayesian reasoning with causal relationships
between attributes



Decision Boundary of Naïve Bayes (1)
● Consider text categorization of two classes
● The ratio of the two posteriors determines the decision

Linear decision boundary


Decision Boundary of Naïve Bayes (2)
● Consider two-class classification
● Gaussian density function
● Shared covariance matrix

Linear decision boundary
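A sketch of why this boundary is linear, using the Gaussian densities and shared covariance Σ just mentioned (μ1, μ2, w, b are the usual symbols, assumed here rather than taken from the slide):

```latex
% Log-odds for two Gaussian classes with shared covariance Sigma.
\begin{align*}
\log \frac{p(C_1 \mid x)}{p(C_2 \mid x)}
  &= \log \frac{p(x \mid C_1)\,p(C_1)}{p(x \mid C_2)\,p(C_2)} \\
  &= (\mu_1 - \mu_2)^{\top} \Sigma^{-1} x
     - \tfrac{1}{2}\mu_1^{\top}\Sigma^{-1}\mu_1
     + \tfrac{1}{2}\mu_2^{\top}\Sigma^{-1}\mu_2
     + \log \frac{p(C_1)}{p(C_2)} \\
  &= w^{\top} x + b
  \qquad \text{(the quadratic terms } x^{\top}\Sigma^{-1}x \text{ cancel)}
\end{align*}
```

Setting the log-odds to zero therefore gives a hyperplane, i.e., a linear decision boundary.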



Decision Boundary
• Generative models essentially create linear decision boundaries
• Why not directly model the linear decision boundary?

Outline
● Bayesian Learning
○ Bayes Theorem
○ MAP learning vs. MLE learning
● Probabilistic Generative Models
○ Naïve Bayes Classifier
● Discriminative Models
○ Logistic Regression
○ K-Nearest Neighbors



Discriminative Models: Logistic Regression
• Generative models often lead to a linear decision boundary
• Linear discriminative model
• Directly model the linear decision boundary

• w is the parameter vector to be learned



Logistic Regression



Logistic Sigmoid Function
● The logistic (sigmoid) function: σ(z) = 1 / (1 + e^(−z))



Logistic Regression
• Given training data
• Likelihood function (or the Log-Likelihood)

• Learn parameter w by Maximum Likelihood Estimation (MLE)



Convex Objective Functions

[Figure: convex loss functions, plotted for y = 1 and for y = −1]



Logistic Regression
• Convex objective function with a global optimum

• No closed-form solution
• Gradient Descent

(the gradient term involves the classification error)
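A minimal sketch of batch gradient descent for logistic regression, assuming labels y ∈ {+1, −1} as in the examples that follow; the learning rate and function names are illustrative, not the course's reference implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, alpha=0.1, iters=1000):
    """X: (m, d) features (include a constant column for a bias term), y: (m,) labels in {+1, -1}.
    Maximizes the average log-likelihood (1/m) sum_i log sigmoid(y_i * w^T x_i) by gradient ascent."""
    m, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        margins = y * (X @ w)                       # y_i * w^T x_i
        grad = X.T @ (y * sigmoid(-margins)) / m    # gradient of the average log-likelihood
        w += alpha * grad                           # ascent step (no closed-form solution)
    return w

def predict(X, w):
    return np.where(X @ w >= 0, 1, -1)
```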



Example: Heart Disease (1)
• Input feature x: age group id
  1: 25-29    2: 30-34    3: 35-39    4: 40-44
  5: 45-49    6: 50-54    7: 55-59    8: 60-64
• Output y: whether the person has heart disease
  • y = 1: has heart disease
  • y = -1: no heart disease



Example: Heart Disease (2)



Example: Text Categorization (1)
● Learn to classify text into two categories
● Input d:
• a document, represented by a word histogram
● Output y = ±1:
• +1 for a political document
• -1 for a non-political document



Example: Text Categorization (2)
• Training data



Example: Text Categorization (3)

• Dataset: Reuters-21578
• Classification accuracy
• Naïve Bayes: 77%
• Logistic regression: 88%



Multi-class Logistic Regression
• How do we extend the logistic regression model to multi-class classification?



Conditional Exponential Model (1)
• Consider K classes
• Define

• where Z(x) is the normalization factor (the partition function)

• Need to learn the class weight vectors (the w's)
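A minimal sketch of this posterior as a softmax over K linear scores, with Z(x) as the normalizer; the matrix layout and the stability trick are illustrative details, not from the slide:

```python
import numpy as np

def softmax_posterior(x, W):
    """P(y = k | x) = exp(w_k^T x) / Z(x), with Z(x) = sum_j exp(w_j^T x).

    x: (d,) feature vector, W: (K, d) one weight vector per class."""
    scores = W @ x
    scores -= scores.max()                 # subtract max for numerical stability (ratios unchanged)
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()   # the denominator plays the role of Z(x)
```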



Conditional Exponential Model (2)
• Learn weights w’s by maximum likelihood estimation

• Modified Conditional Exponential Model



Logistic Regression versus Naïve Bayes
• Both produce linear decision boundaries
• Naïve Bayes:

• Logistic regression: learn weights by MLE

•Both can be viewed as modeling p(x|y)


• Naïve Bayes: independence assumption
• Logistic regression: assume an exponential family distribution for
p(x|y) (a broad assumption)
K-Nearest Neighbor
Main idea: classify a data point based on how its neighbors are classified



K-Nearest Neighbor

Complexity: O(ndk) (n training points, d dimensions, k neighbors)
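A brute-force sketch of the classifier (distances to all n training points in d dimensions, then a vote among the k nearest, which is what the quoted cost reflects); names are illustrative:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points (Euclidean distance)."""
    dists = np.linalg.norm(X_train - x, axis=1)   # distance to every training point
    nearest = np.argsort(dists)[:k]               # indices of the k closest points
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]
```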
Discriminative versus Generative
Discriminative Models
● Model P(y|x) directly
Pros
● Usually better performance (with small training data)
● Robust to noisy data
Cons
● Slow convergence (e.g., LR by gradient descent)
● Expensive computation

Generative Models
• Model P(x|y) directly
Pros
• Usually fast convergence
• Cheap computation (easier to learn, e.g., NB)
Cons
• Sensitive to noisy data
• Usually performs worse (with small training data)



Summary
● Bayesian Learning
○ Bayes Theorem
○ MAP learning vs. MLE learning
● Probabilistic Generative Models
○ Naïve Bayes Classifier
● Discriminative Models
○ Logistic Regression
○ K-Nearest Neighbors



UET
Since 2004

ĐẠI HỌC CÔNG NGHỆ, ĐHQGHN


VNU-University of Engineering and Technology

Thank you
Email me
[email protected]
