
Principles of Machine Learning

DSA 5105 • Lecture 8

Soufiane Hayou
Department of Mathematics
So far
Until now, we have focused on supervised learning
• Datasets come in input-label pairs
• Goal is to learn their relationship for prediction (the oracle function)

For the next few lectures, we are going to look at a variety of unsupervised learning methodologies.

As always, we start with the simplest linear cases and proceed from there.
Unsupervised Learning Overview
Supervised Learning

Supervised learning is about learning to make predictions

[Diagram: the oracle maps input images to labels ("Cat", "Dog"); a predictive model imitates this mapping]

Our goal: Using data, learn a predictive model that approximates the oracle

Unsupervised Learning

Unsupervised learning is where we do not have label information

[Diagram: the same input images, but the oracle labels ("Cat", "Dog") are not available]

Example goal: learn some task-agnostic patterns from the input data
Examples of Unsupervised Learning
Tasks: Dimensionality Reduction

https://media.geeksforgeeks.org/wp-content/uploads/Dimensionality_Reduction_1.jpg
Examples of Unsupervised Learning
Tasks: Clustering

https://upload.wikimedia.org/wikipedia/commons/thumb/c/c8/Cluster-2.svg/1200px-Cluster-2.svg.png
Examples of Unsupervised Learning
Tasks: Density Estimation

By ‫ طاها‬- Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=24309466


Examples of Unsupervised Learning
Tasks: Generative Models

http://www.lherranz.org/wp-content/uploads/2018/07/blog_generativesampling.png
Why unsupervised learning?
• Labelled data is expensive to collect
• Labelled data may be impossible to obtain
• Different application scenarios
Principal Component Analysis
Review: Eigenvalues and Eigenvectors
• For a square matrix A, an eigenvector v with associated eigenvalue λ satisfies
  A v = λ v
• We say A is diagonalizable if there exists a diagonal D (matrix of eigenvalues)
  and an invertible P (columns = eigenvectors) such that A = P D P^{-1}
• A is symmetric if A = A^T. Q is orthogonal if Q^T Q = Q Q^T = I
• Well-known result: if A is symmetric then it is diagonalizable by orthogonal
  matrices, i.e. A = Q Λ Q^T

Columns of Q are orthonormal: q_i^T q_j = 1 if i = j, and 0 otherwise. In fact,
{q_1, ..., q_d} is an orthonormal basis for R^d. Moreover, the eigenvalues are real.

Watch this! https://www.youtube.com/watch?v=PFDu9oVAE-g&t=453s


Review: Eigenvalues and Eigenvectors
• A symmetric matrix A is
  • Positive semi-definite if x^T A x ≥ 0 for all x
  • Positive definite if x^T A x > 0 for all x ≠ 0
• Suppose A is symmetric positive definite. Then, WLOG we will
  order its eigenvalues
  λ_1 ≥ λ_2 ≥ ... ≥ λ_d > 0
  and u_1, ..., u_d are the corresponding orthonormal eigenvectors.
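
For concreteness, here is a small NumPy check of these facts (not part of the slides; matrix sizes and seed are arbitrary): it builds a symmetric positive semi-definite matrix and verifies the orthogonal eigendecomposition.

import numpy as np

# Build a random symmetric positive semi-definite matrix A = B^T B
rng = np.random.default_rng(0)
B = rng.normal(size=(5, 3))
A = B.T @ B                        # 3x3, symmetric PSD

# eigh is specialized for symmetric matrices: real eigenvalues, orthonormal eigenvectors
eigvals, Q = np.linalg.eigh(A)     # eigenvalues returned in ascending order

# Verify A = Q diag(eigvals) Q^T, that Q is orthogonal, and that eigenvalues are non-negative
assert np.allclose(A, Q @ np.diag(eigvals) @ Q.T)
assert np.allclose(Q.T @ Q, np.eye(3))
assert np.all(eigvals >= -1e-12)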


Motivating PCA: Shoe Sizes
Capturing the Variation?
Although there are two dimensions to the data, there is really one
effective dimension! How do we uncover this dimension?
A Dynamic Visualization
Two Formulations
• Find the direction that captures the most variance
• Find the direction that minimizes projection error
Derivation of PCA
(Maximize Variance)
Derivation of PCA
(Minimize Error)
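
The derivations themselves are on the slide images; as a sketch (assuming the data matrix X is centred, with sample covariance Σ = (1/n) X^T X), the two formulations reduce to the same eigenvalue problem:

% Maximize variance: the first principal direction solves
\max_{\|u\| = 1} \; u^\top \Sigma u
% whose optimum satisfies \Sigma u = \lambda u, so u_1 is the top eigenvector of \Sigma.

% Minimize error: projecting onto the span of orthonormal columns U_m, the reconstruction error
\frac{1}{n} \sum_{i=1}^{n} \left\| x_i - U_m U_m^\top x_i \right\|^2
  = \operatorname{tr}(\Sigma) - \operatorname{tr}\!\left( U_m^\top \Sigma U_m \right)
% is minimized by the same choice: U_m = [u_1, \dots, u_m], the top m eigenvectors.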
The PCA Algorithm
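
The algorithm steps appear on the slide image; a minimal NumPy sketch of the standard procedure (centre, eigendecompose the covariance, project) might look like this, with function names such as pca_fit being illustrative:

import numpy as np

def pca_fit(X, m):
    """Return the data mean, the top-m principal directions, and sorted eigenvalues.
    X: (n, d) data matrix; m: embedding dimension."""
    mean = X.mean(axis=0)
    Xc = X - mean                            # centre the data
    cov = Xc.T @ Xc / X.shape[0]             # sample covariance, (d, d)
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # re-sort descending
    U_m = eigvecs[:, order[:m]]              # top-m eigenvectors, (d, m)
    return mean, U_m, eigvals[order]

def pca_transform(X, mean, U_m):
    """Principal component scores Z_m = (X - mean) U_m."""
    return (X - mean) @ U_m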
Simple Example
Choosing The Embedding Dimension
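
One common criterion (not necessarily the one used on the slide) is the fraction of variance explained by the first m components; continuing the sketch above:

# X: (n, d) data matrix, as in the sketch above
mean, U, sorted_eigvals = pca_fit(X, m=X.shape[1])        # keep all d components
explained = np.cumsum(sorted_eigvals) / np.sum(sorted_eigvals)
m = int(np.searchsorted(explained, 0.95) + 1)             # smallest m explaining 95% of variance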
PCA in Feature Space (Example)
PCA in Feature Space
We define a vector of feature maps φ(x) = (φ_1(x), ..., φ_k(x))

Form the design matrix Φ whose i-th row is φ(x_i)

Perform PCA on the transformed dataset!
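
As an illustration (the slide's specific feature maps are not in this export), a quadratic feature map on 2-D inputs followed by the same PCA routine might look like:

# Hypothetical quadratic feature map; the lecture's actual choice may differ
def feature_map(X):
    """Map (n, 2) inputs to (n, 5) features: x1, x2, x1^2, x2^2, x1*x2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1**2, x2**2, x1 * x2])

Phi = feature_map(X)                       # design matrix in feature space
mean, U_m, eigvals = pca_fit(Phi, m=2)     # reuse the PCA sketch above
Z = pca_transform(Phi, mean, U_m)          # nonlinear principal component scores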


PCA in Feature Space
PCA in Feature Space (Example)
Define Feature Maps
PCA as a Form of Whitening
Recall: Principal component scores are given by Z_m = X U_m (with X centred)

Define the transformation W = Z_m Λ_m^{-1/2} = X U_m Λ_m^{-1/2}

Then, (1/n) W^T W = I !

In other words, W has uncorrelated, unit-variance features. This is known as a PCA whitening transform.
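
A quick numerical check of this claim, continuing the earlier sketch (assumes all eigenvalues are strictly positive):

# Whitening check: scores scaled by 1/sqrt(eigenvalue) have identity covariance
mean, U, eigvals = pca_fit(X, m=X.shape[1])
Z = pca_transform(X, mean, U)                  # principal component scores
W = Z / np.sqrt(eigvals)                       # whitened features
cov_W = W.T @ W / X.shape[0]
assert np.allclose(cov_W, np.eye(X.shape[1]), atol=1e-8)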
Example: Iris Dataset
Autoencoders
PCA as Compression Algorithm

Encoder:  Z_m = X U_m             (latent representation)
Decoder:  X_decomp = Z_m U_m^T    (reconstruction)
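
Continuing the NumPy sketch above, compression and reconstruction (the choice m = 2 is arbitrary):

mean, U_m, _ = pca_fit(X, m=2)
Z_m = pca_transform(X, mean, U_m)        # encode: (n, m) latent scores
X_decomp = Z_m @ U_m.T + mean            # decode: reconstruct in the original space
mse = np.mean((X - X_decomp) ** 2)       # reconstruction error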
Autoencoders
In this sense, the autoencoder is a nonlinear counterpart of PCA-based compression!

PCA:  Encoder  Z_m = X U_m          →  Latent Z_m  →  Decoder  X* = Z_m U_m^T
AE:   Encoder  Z_m = T_enc(X; θ)    →  Latent Z_m  →  Decoder  X* = T_dec(Z_m; φ)
Neural Network Autoencoders
How do we pick the encoding T_enc and the decoding T_dec?

One choice: use universal approximators, e.g. neural networks!

Here T_enc(·; θ) and T_dec(·; φ) are neural networks with trainable parameters θ and φ.
Neural Network Autoencoders
Given a dataset {x_i}, i = 1, ..., n, we solve the empirical risk minimization

  min over θ, φ of  (1/n) Σ_i || x_i − T_dec(T_enc(x_i; θ); φ) ||^2

to minimize the distance between each x_i and its reconstruction.

The empirical risk minimization uses the inputs as labels!
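
A minimal sketch of such an autoencoder in PyTorch (the architecture, layer sizes, and training settings are illustrative, not taken from the lecture):

import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Illustrative fully-connected autoencoder: d -> m -> d."""
    def __init__(self, d, m):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, m))
        self.decoder = nn.Sequential(nn.Linear(m, 32), nn.ReLU(), nn.Linear(32, d))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train(model, X, epochs=100, lr=1e-3):
    """X: (n, d) float tensor. The inputs also serve as the targets."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), X)   # compare reconstruction to the input itself
        loss.backward()
        opt.step()
    return model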


Demo: PCA and Autoencoders
Summary
PCA fits an ellipsoid to data. Two interpretations:
• Maximize variance
• Minimize error

PCA is useful for:
• Dimensionality reduction
• Feature extraction / clustering
• Data whitening

Viewed as a reconstruction algorithm, the autoencoder is a nonlinear analogue of PCA.
