Module 12.01: Unsupervised Learning

Unsupervised Learning
Reference Books

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 112). New York: Springer.

Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The elements of statistical learning: Data mining, inference, and prediction (2nd ed.). New York: Springer.

Johnson, R. A., & Wichern, D. W. (2002). Applied multivariate statistical analysis (5th ed.). Prentice Hall.
Unsupervised Learning

• Unsupervised learning is a class of algorithms that learn patterns from unlabeled data.
• Unsupervised learning is more subjective than supervised
learning, as there is no simple goal for the analysis, such as
prediction of a response.
• We will discuss two unsupervised learning methods:
1. Principal components analysis
2. Clustering
Principal Components Analysis

• PCA produces a low-dimensional representation of a dataset. It finds a sequence of linear combinations of the variables that have maximal variance and are mutually uncorrelated.
• Apart from producing derived variables for use in supervised
learning problems, PCA also serves as a tool for data
visualization.
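As a concrete illustration, here is a minimal sketch of obtaining a two-dimensional PCA representation of a data set. It assumes the scikit-learn and NumPy libraries, which the slides do not prescribe; the data here are purely synthetic.

```python
# Minimal sketch: project a toy data set onto its first two
# principal components, for use as derived variables or for plotting.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))      # toy data: 100 observations, 5 features

pca = PCA(n_components=2)          # keep only the first two components
scores = pca.fit_transform(X)      # n x 2 matrix of principal component scores

print(scores.shape)                # (100, 2): the low-dimensional representation
print(pca.components_.shape)       # (2, 5): the two loading vectors
```

The two score columns can be plotted directly or passed to a supervised method as derived features.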
Principal Components Analysis: details

• The first principal component of a set of features $X_1, X_2, \dots, X_p$ is the normalized linear combination of the features
$$Z_1 = \phi_{11} X_1 + \phi_{21} X_2 + \dots + \phi_{p1} X_p$$
that has the largest variance. By normalized, we mean that $\sum_{j=1}^{p} \phi_{j1}^2 = 1$.
• We refer to the elements $\phi_{11}, \dots, \phi_{p1}$ as the loadings of the first principal component; together, the loadings make up the principal component loading vector $\phi_1 = (\phi_{11}\ \phi_{21}\ \cdots\ \phi_{p1})^T$.
PCA: example

[Figure: scatter plot of ad spending (vertical axis, 0 to 35) versus population (horizontal axis, 10 to 70) for 100 cities.]

The population size (pop) and ad spending (ad) for 100 different cities are shown as purple circles. The green solid line indicates the first principal component direction, and the blue dashed line indicates the second principal component direction.
Computation of Principal Components

• Suppose we have an $n \times p$ data set $\mathbf{X}$.
• Assume each variable in $\mathbf{X}$ has been centered to have mean zero.
• We look for the linear combination of the sample feature values of the form
$$z_{i1} = \phi_{11} x_{i1} + \phi_{21} x_{i2} + \dots + \phi_{p1} x_{ip} \qquad (1)$$
for $i = 1, \dots, n$ that has the largest sample variance, subject to the constraint that $\sum_{j=1}^{p} \phi_{j1}^2 = 1$.
• Since each of the $x_{ij}$ has mean zero, so does $z_{i1}$. Hence the sample variance of the $z_{i1}$ can be written as $\frac{1}{n}\sum_{i=1}^{n} z_{i1}^2$.
• Plugging in (1), the first principal component loading vector solves the optimization problem
$$\underset{\phi_{11},\dots,\phi_{p1}}{\text{maximize}} \; \frac{1}{n}\sum_{i=1}^{n}\Big(\sum_{j=1}^{p}\phi_{j1}x_{ij}\Big)^{2} \quad \text{subject to} \quad \sum_{j=1}^{p}\phi_{j1}^{2}=1.$$

• This problem can be solved via a singular-value decomposition of the matrix $\mathbf{X}$ (a code sketch follows at the end of this slide).
• We refer to $Z_1$ as the first principal component, with realized values $z_{11}, \dots, z_{n1}$.
• The second principal component is the linear combination of $X_1, \dots, X_p$ that has maximal variance among all linear combinations that are uncorrelated with $Z_1$.
• The second principal component scores $z_{12}, z_{22}, \dots, z_{n2}$ take the form
$$z_{i2} = \phi_{12} x_{i1} + \phi_{22} x_{i2} + \dots + \phi_{p2} x_{ip},$$
where $\phi_2$ is the second principal component loading vector, with elements $\phi_{12}, \phi_{22}, \dots, \phi_{p2}$.
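A minimal NumPy sketch of this computation (assuming the data matrix has already been centered): the loading vectors are the right singular vectors of X, and the scores follow directly from the SVD.

```python
# Sketch: principal components via the singular-value decomposition
# of a centered data matrix X, as described on this slide.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))
X = X - X.mean(axis=0)                   # center: each column now has mean zero

U, d, Vt = np.linalg.svd(X, full_matrices=False)
loadings = Vt.T                          # columns are phi_1, phi_2, ...
scores = U * d                           # columns are the realized z values

z1, z2 = scores[:, 0], scores[:, 1]
print(z1.var())                          # largest achievable sample variance
print(np.corrcoef(z1, z2)[0, 1])         # ~0: Z1 and Z2 are uncorrelated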
Illustration

• USArrests data: For each of the fifty states in the United States, the data set contains the number of arrests per 100,000 residents for each of three crimes: Assault, Murder, and Other. We also record UrbanPop (the percent of the population in each state living in urban areas).
• The principal component score vectors have length n = 50, and the principal component loading vectors have length p = 4.
• PCA was performed after standardizing each variable to have mean zero and standard deviation one.
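The analysis on this slide can be reproduced with a short script. The sketch below assumes statsmodels (to fetch the R USArrests data set over the network) and scikit-learn; neither library is prescribed by the slides, and any local CSV copy of the data would work as well.

```python
# Sketch: PCA on the USArrests data after standardizing each variable
# to mean zero and standard deviation one.
from statsmodels.datasets import get_rdataset   # downloads the R data set
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

usarrests = get_rdataset("USArrests").data      # 50 states x 4 variables
X = StandardScaler().fit_transform(usarrests)   # standardize each column

pca = PCA()
scores = pca.fit_transform(X)
print(scores.shape)                  # (50, 4): score vectors have length n = 50
print(pca.components_.shape)         # (4, 4): loading vectors have length p = 4
```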
USArrests data: PCA plot

[Figure: biplot of the first two principal components of the USArrests data. State names mark each state's pair of scores; the horizontal axis is the First Principal Component and the vertical axis is the Second Principal Component, each spanning roughly −3 to 3.]

Figure details

The first two principal components for the USArrests data.

• The blue state names represent the scores for the first two principal components.
• The orange arrows indicate the first two principal component loading vectors (with axes on the top and right). For example, the loading for Other on the first component is 0.54, and its loading on the second principal component is 0.17 [the word Other is centered at the point (0.54, 0.17)].
• This figure is known as a biplot, because it displays both the principal component scores and the principal component loadings.
Figure details

• The first loading vector places approximately equal weight on Assault, Murder, and Other.
• This indicates that the first PC roughly measures the overall level of crime.
• The second loading vector places most of its weight on UrbanPop.
• The second PC therefore measures the level of urbanization.
• The crime-related variables are correlated with one another (a high murder rate is associated with a high assault rate).
• The UrbanPop variable is less correlated with the other three.
How to Determine Principal Components

Let $\Sigma$ be the covariance matrix of the random vector $\mathbf{X} = (X_1, X_2, \dots, X_p)^T$.

Let $\Sigma$ have eigenvalue–eigenvector pairs
$$(\lambda_1, e_1), (\lambda_2, e_2), \dots, (\lambda_p, e_p), \qquad \text{where } \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0.$$

The $i$th PC is given by
$$Y_i = e_i^T \mathbf{X} = e_{i1} X_1 + e_{i2} X_2 + \dots + e_{ip} X_p, \qquad i = 1, 2, \dots, p,$$

with the following properties:
$$\operatorname{Var}(Y_i) = e_i^T \Sigma\, e_i = \lambda_i, \qquad \operatorname{Cov}(Y_i, Y_k) = e_i^T \Sigma\, e_k = 0 \ \text{ for } i \neq k.$$
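A NumPy sketch of this recipe, using a small made-up covariance matrix (the matrix is purely illustrative; it is not from the slides):

```python
# Sketch: principal components from the eigen-decomposition of a
# covariance matrix Sigma, following the properties above.
import numpy as np

Sigma = np.array([[4.0, 2.0, 0.0],         # illustrative covariance matrix
                  [2.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(Sigma)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]          # reorder so lambda_1 >= ... >= lambda_p
lam, E = eigvals[order], eigvecs[:, order]

for i in range(len(lam)):
    print(f"Y_{i+1}: Var = {lam[i]:.3f}, loadings = {E[:, i].round(3)}")

print(lam.sum(), np.trace(Sigma))          # total variance equals trace(Sigma)
```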
Another Interpretation of Principal Components

[Figure: a cloud of observations plotted against the first two principal components; the horizontal axis is the first principal component and the vertical axis is the second, each ranging from −1.0 to 1.0.]
Another Interpretation of Principal Components

• The first principal component loading vector has a very special property: it defines the line in p-dimensional space that is closest to the n observations (using average squared Euclidean distance as a measure of closeness); see the numerical check at the end of this slide.
• The notion of principal components as the dimensions that
are closest to the n observations extends beyond just the
first principal component.
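The "closest line" property can be verified numerically. This sketch (a toy experiment, not from the slides) compares the average squared distance from the observations to the first principal component direction against the distance to a random unit direction:

```python
# Sketch: the first loading vector spans the line with the smallest
# average squared Euclidean distance to the observations.
import numpy as np

rng = np.random.default_rng(3)
X = rng.multivariate_normal([0, 0, 0], np.diag([5.0, 2.0, 0.5]), size=300)
X = X - X.mean(axis=0)

_, _, Vt = np.linalg.svd(X, full_matrices=False)
phi1 = Vt[0]                             # first principal component direction

def avg_sq_dist(X, v):
    """Average squared distance from the rows of X to the line spanned by v."""
    proj = np.outer(X @ v, v)            # orthogonal projections onto the line
    return np.mean(np.sum((X - proj) ** 2, axis=1))

u = rng.normal(size=3)
u /= np.linalg.norm(u)                   # a random competing unit direction

print(avg_sq_dist(X, phi1))              # the smallest achievable value
print(avg_sq_dist(X, u))                 # larger for any other direction
```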
Scaling of the variables
• If the variables are in different units, scaling each to have
standard deviation equal to one is recommended.
• The variances of Murder, Other, Assault, and UrbanPop are 18.97, 87.73, 6945.16, and 209.5, respectively.
• If the variables are in the same units, scaling is not mandatory.
[Figure: two biplots of the first two principal components of the USArrests data, with variables scaled to unit standard deviation (left, "Scaled") and left unscaled (right, "Unscaled"). In the unscaled plot, Assault dominates the first principal component.]
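The effect of scaling is easy to demonstrate on synthetic data. This sketch (toy data, scikit-learn assumed) shows the first loading vector with and without standardization when one variable has a far larger variance than the others:

```python
# Sketch: without scaling, the highest-variance variable dominates
# the first loading vector; after scaling, the weights even out.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
t = rng.normal(size=(50, 1))                      # shared latent factor
noise = rng.normal(size=(50, 3)) * 0.5
X = (t + noise) * np.array([1.0, 3.0, 80.0])      # same signal, very different scales

pca = PCA(n_components=1)
pca.fit(X)
print(pca.components_[0].round(3))       # unscaled: weight piles onto column 3

pca.fit(StandardScaler().fit_transform(X))
print(pca.components_[0].round(3))       # scaled: weights are far more balanced
```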
Proportion of Variance Explained

• To understand the strength of each component, we measure the proportion of variance explained (PVE) by each one.
• The total variance present in a data set (assuming the variables have been centered to have mean zero) is defined as
$$\sum_{j=1}^{p} \operatorname{Var}(X_j) = \sum_{j=1}^{p} \frac{1}{n} \sum_{i=1}^{n} x_{ij}^2,$$
and the variance explained by the $m$th principal component is
$$\frac{1}{n} \sum_{i=1}^{n} z_{im}^2.$$
• Therefore, the PVE of the $m$th principal component is given by the positive quantity between 0 and 1
$$\text{PVE}_m = \frac{\sum_{i=1}^{n} z_{im}^2}{\sum_{j=1}^{p} \sum_{i=1}^{n} x_{ij}^2}.$$
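The PVE formula translates directly into code; a minimal NumPy sketch on toy centered data:

```python
# Sketch: proportion of variance explained, computed from the scores
# z_im and the centered data exactly as in the formula above.
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 4)) @ rng.normal(size=(4, 4))   # correlated toy data
X = X - X.mean(axis=0)                                    # center the variables

U, d, Vt = np.linalg.svd(X, full_matrices=False)
Z = U * d                                 # all principal component scores

total_var = np.sum(X ** 2)                # sum_j sum_i x_ij^2
pve = np.sum(Z ** 2, axis=0) / total_var  # PVE of each component

print(pve.round(3))                       # each value lies in (0, 1]
print(pve.sum())                          # 1.0: the PVEs sum to one
```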
Scree Plots

Left: the proportion of variance explained by each of the four principal components in the USArrests data.
Right: the cumulative proportion of variance explained by the four principal components in the USArrests data.
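A scree plot like the one described can be drawn with matplotlib. The PVE values below are approximately those reported for the USArrests data in James et al. (2013); treat them as illustrative.

```python
# Sketch: scree plot (left) and cumulative PVE plot (right).
import numpy as np
import matplotlib.pyplot as plt

pve = np.array([0.62, 0.25, 0.09, 0.04])   # approximate USArrests PVEs

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(range(1, 5), pve, "o-")
ax1.set_xlabel("Principal Component")
ax1.set_ylabel("Prop. Variance Explained")
ax2.plot(range(1, 5), np.cumsum(pve), "o-")
ax2.set_xlabel("Principal Component")
ax2.set_ylabel("Cumulative PVE")
plt.tight_layout()
plt.show()
```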
Example

Suppose the random variables $X_1, X_2, \dots, X_p$ have the covariance matrix $\Sigma$.

Determine the principal components.
