
Principal Component Analysis

Chris Ding
Department of Computer Science and Engineering
University of Texas at Arlington
PCA is the procedure of finding the intrinsic dimensions of the data

1. Data analysis
2. Data reduction
3. Data visualization

Represent high-dimensional data in a low-dimensional space


High-dimensional data

Examples: gene expression, face images, handwritten digits


Example…
Application of feature reduction

• Face recognition
• Handwritten digit recognition
• Text mining
• Image retrieval
• Microarray data analysis
• Protein classification
Use PCA to approximate an image (a data matrix)

[Figure: a 112 × 92 face image and its PCA approximations with k = 10, 20, 30, and 40 components, compared with the original.]
Use PCA to approximate a set of images

[Figure: a set of face images and their PCA approximations with k = 1, 2, 4, and 6 components, compared with the originals.]
Display the characters in 2-dim space

$$\tilde{x} = G^T x = \begin{bmatrix} a_1^T x \\ a_2^T x \end{bmatrix}$$
Intrinsic dimensions of the data
Samples of children: hours of study, hours on the internet, vs. their age

[Figure: 3-D scatter plot with axes: hours on study / homework, hours on internet, children's age.]
Intrinsic dimensions of the data
Samples of children: hours of study, hours on the internet, vs. their age

The data lie in a subspace (the intrinsic dimensions).

[Figure: the same 3-D scatter plot, with the subspace containing the data highlighted.]
PCA is the procedure of finding the intrinsic dimensions of the data
• Find the lines that best represent the data
• PCA is a rotation of the space to the proper directions (the principal directions)
Geometric picture of principal components (PCs)

• The 1st PC $z_1$ is a minimum-distance fit to a line in the $x$ space.
• The 2nd PC $z_2$ is a minimum-distance fit to a line in the plane perpendicular to the 1st PC.

PCs are a series of linear least-squares fits to a sample, each orthogonal to all the previous ones.
PCA represents the data: the closer the data lie to a linear subspace, the more accurate the representation.
PCA Step 0: move the coordinate origin to the data center
This is equivalent to centering the data.

[Figure: the 3-D scatter plot (hours on study / homework, hours on internet, children's age) with the origin moved to the data mean.]
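A tiny sketch of this step in NumPy (not part of the original slides; the numbers in `X` are made up, one sample per row):

```python
import numpy as np

X = np.array([[2.0, 14.0,  8.0],   # hours of study, hours on internet, age (made-up values)
              [1.0, 20.0, 10.0],
              [3.0,  9.0, 12.0]])

Xc = X - X.mean(axis=0)            # Step 0: move the origin to the data center
print(Xc.mean(axis=0))             # each column now has (numerically) zero mean
```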
PCA Step 1: find a line that best represents the data

[Figure: a sequence of slides showing different candidate lines through the centered 3-D data (hours on study / homework, hours on internet, children's age).]
PCA Step 1: find a line that best represents the data

[Figure: the same 3-D data with a candidate line and each point's projection error onto the line.]

Which error to minimize? Minimize the sum of squared projection errors.
PCA Step 1: find the line that best represents the data
Fitting the data to a curve (a straight line, the simplest curve).

[Figure: the best-fit line through the 3-D data (hours on study / homework, hours on internet, children's age).]

Minimizing the sum of squared projection errors gives the 1st principal direction.
PCA directions are eigenvectors of the covariance matrix
Repeating this process to find the 2nd, 3rd, … lines that best fit the remaining variation gives $u_2, u_3, \ldots, u_k$.
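As a numerical illustration of this equivalence (not from the slides; the synthetic data and names are illustrative), the following sketch checks that the unit direction minimizing the sum of squared projection errors coincides with the top eigenvector of the covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])  # 200 samples, 2 dims
X = X - X.mean(axis=0)                                              # center the data

S = X.T @ X / len(X)                  # covariance matrix
w, V = np.linalg.eigh(S)              # eigenvalues in ascending order
u1 = V[:, -1]                         # eigenvector of the largest eigenvalue

# Brute-force search over unit directions for the one with the smallest
# sum of squared projection errors (distance from each point to the line).
angles = np.linspace(0.0, np.pi, 1000)
dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)           # candidate unit directions
errors = (X ** 2).sum() - ((X @ dirs.T) ** 2).sum(axis=0)           # Pythagoras: total - projected
best = dirs[np.argmin(errors)]

print(abs(best @ u1))   # close to 1: the best-fit line is the top eigenvector direction
```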
Intrinsic dimensions of the data
Samples of children: hours of study, hours on the internet, vs. their age

[Figure: the same 3-D scatter plot of the children data.]
PCA from maximum variance
PCA from maximum spread-out

PCA represents the data: the closer the data lie to a linear subspace, the more accurate the representation.

[Figure: the same data with one direction of smaller variance and one of larger variance marked.]

Larger spread-out = larger variance.
What is Principal Component Analysis?

• Principal component analysis (PCA)
  – Reduces the dimensionality of a data set by finding a new set of variables, smaller than the original set of variables
  – Retains most of the sample's information
  – Useful for the compression and classification of data

• By information we mean the variation present in the sample, given by the correlations between the original variables.
  – The new variables, called principal components (PCs), are uncorrelated and are ordered by the fraction of the total information each retains.

Principal Component as maximum variance


Let $x = (x_1, x_2, \ldots, x_p)^T$ be a vector random variable in $p$ dimensions/variables, with coordinate axes $(e_1, e_2, \ldots, e_p)$.

Given $n$ observations/samples of $x$:
$$x_1, x_2, \ldots, x_n \in \mathbb{R}^p.$$

The first principal component: define a scalar random variable as a linear combination of the dimensions,
$$z_1 = a_1^T x = \sum_{j=1}^{p} a_{j1} x_j, \qquad a_1 = (a_{11}, a_{21}, \ldots, a_{p1})^T,$$
and choose $a_1$ so that $\mathrm{var}[z_1]$ is maximized.
Principal Component as maximum variance
Because
$$\mathrm{var}[z_1] = E\big[(z_1 - \bar{z}_1)^2\big] = \frac{1}{n}\sum_{i=1}^{n}\big(a_1^T x_i - a_1^T \bar{x}\big)^2 = \frac{1}{n}\sum_{i=1}^{n} a_1^T (x_i - \bar{x})(x_i - \bar{x})^T a_1 = a_1^T S a_1,$$
where
$$S = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^T$$
is the covariance matrix and $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the mean.

In the following, we assume the data are centered: $\bar{x} = 0$.
Principal Component as maximum variance

To find $a_1$ that maximizes $\mathrm{var}[z_1]$ subject to $a_1^T a_1 = 1$, let $\lambda$ be a Lagrange multiplier:
$$L = a_1^T S a_1 - \lambda (a_1^T a_1 - 1)$$
$$\frac{\partial L}{\partial a_1} = S a_1 - \lambda a_1 = 0 \quad\Longrightarrow\quad S a_1 = \lambda a_1$$
("eigen" is German for "own/self": the operator, here a matrix, maps the vector to a multiple of itself.)

Therefore $a_1$ is an eigenvector of $S$ corresponding to the largest eigenvalue $\lambda = \lambda_1$.
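A minimal NumPy check of this result (not part of the slides; the synthetic data and variable names are illustrative): compute $S$, take its top eigenvector as $a_1$, and verify that the variance of $z_1 = a_1^T x$ equals the largest eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))  # n = 500 samples in p = 5 dims
Xc = X - X.mean(axis=0)                                  # center: x_bar = 0

S = Xc.T @ Xc / len(Xc)           # covariance matrix S
evals, evecs = np.linalg.eigh(S)  # eigenvalues in ascending order
a1, lam1 = evecs[:, -1], evals[-1]

z1 = Xc @ a1                      # scores of the first principal component
print(np.isclose(z1.var(), lam1))       # var[z1] equals the largest eigenvalue
print(np.allclose(S @ a1, lam1 * a1))   # S a1 = lambda1 a1
```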


Algebraic derivation of PCs

To find the next coefficient vector $a_2$ maximizing $\mathrm{var}[z_2]$
subject to $\mathrm{cov}[z_2, z_1] = 0$ (uncorrelated) and $a_2^T a_2 = 1$:

First note that $\mathrm{cov}[z_2, z_1] = a_1^T S a_2 = \lambda_1 a_1^T a_2$.

Then let $\lambda$ and $\phi$ be Lagrange multipliers, and maximize
$$L = a_2^T S a_2 - \lambda (a_2^T a_2 - 1) - \phi\, a_2^T a_1.$$
Algebraic derivation of PCs

$$L = a_2^T S a_2 - \lambda (a_2^T a_2 - 1) - \phi\, a_2^T a_1$$
$$\frac{\partial L}{\partial a_2} = S a_2 - \lambda a_2 - \phi a_1 = 0 \quad\Longrightarrow\quad \phi = 0$$
$$S a_2 = \lambda a_2 \qquad\text{and}\qquad \lambda = a_2^T S a_2$$
Algebraic derivation of PCs

We find that $a_2$ is also an eigenvector of $S$, whose eigenvalue $\lambda = \lambda_2$ is the second largest.

In general,
$$\mathrm{var}[z_k] = a_k^T S a_k = \lambda_k.$$

• The $k$-th largest eigenvalue of $S$ is the variance of the $k$-th PC.
• The $k$-th PC $z_k$ retains the $k$-th greatest fraction of the variation in the sample.
Projection to PCA subspace
• Main steps for computing the PCA subspace
  – Form the covariance matrix $S$.
  – Compute its eigenvectors $\{a_i\}_{i=1}^{p}$ (elsewhere also written $u_1, u_2, \ldots, u_k$).
  – The PCA subspace is spanned by the first $d$ eigenvectors $\{a_i\}_{i=1}^{d}$.
  – The transformation $G$ is given by
$$G = [a_1, a_2, \ldots, a_d], \qquad \tilde{x} = G^T x = \begin{bmatrix} a_1^T x \\ a_2^T x \\ \vdots \\ a_d^T x \end{bmatrix}$$

$$x \in \mathbb{R}^p \;\longrightarrow\; \tilde{x} = G^T x \in \text{PCA subspace}$$
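A minimal sketch of these steps (not from the slides; the function `pca_subspace` and its names are illustrative), assuming the data matrix stores one sample per column as on the following slides:

```python
import numpy as np

def pca_subspace(X, d):
    """Return G = [a1, ..., ad] and the projected data G^T X.

    X : (p, n) data matrix, one sample per column.
    d : number of principal directions to keep.
    """
    Xc = X - X.mean(axis=1, keepdims=True)   # center the data
    S = Xc @ Xc.T / X.shape[1]               # covariance matrix (p x p)
    evals, evecs = np.linalg.eigh(S)         # ascending eigenvalues
    G = evecs[:, ::-1][:, :d]                # first d eigenvectors (largest eigenvalues)
    return G, G.T @ Xc                       # projection onto the PCA subspace

# Example: project 3-dimensional samples onto a 2-dimensional PCA subspace.
rng = np.random.default_rng(2)
X = rng.normal(size=(3, 100))
G, Y = pca_subspace(X, d=2)
print(G.shape, Y.shape)   # (3, 2) (2, 100)
```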
Algebraic derivation of PCs

Assume $\bar{x} = 0$ and form the matrix
$$X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{p \times n}, \qquad \text{then } S = \frac{1}{n} X X^T.$$

Obtain the eigenvectors of $S$ by computing the SVD of $X$:
$$X = U \Sigma V^T, \qquad \text{so } U^T X = \Sigma V^T.$$
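A quick numerical check of this relation (a sketch, not from the slides): the eigenvalues of $S$ are the squared singular values of $X$ divided by $n$, and the projected data satisfy $U^T X = \Sigma V^T$.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(4, 50))
X = X - X.mean(axis=1, keepdims=True)      # assume x_bar = 0 (centered columns)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
S = X @ X.T / X.shape[1]

# Eigenvalues of S are the squared singular values divided by n.
print(np.allclose(np.sort(np.linalg.eigvalsh(S)), np.sort(s**2 / X.shape[1])))

# The PCA projections of the data: U^T X = Sigma V^T.
print(np.allclose(U.T @ X, np.diag(s) @ Vt))
```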
Homework:

After you
1. compute the covariance matrix $S$, and
2. obtain the first $k$ eigenvectors of $S$ as $(u_1, \ldots, u_k)$,

show that you can obtain $(v_1, \ldots, v_k)$ by matrix–vector multiplications alone; there is no need to compute the eigenvectors of the kernel (Gram) matrix.
Reduction and Reconstruction

• Dimension reduction: $X \in \mathbb{R}^{p \times n} \;\rightarrow\; Y = G^T X \in \mathbb{R}^{d \times n}$, with $G \in \mathbb{R}^{p \times d}$ (so $G^T \in \mathbb{R}^{d \times p}$).
• Reconstruction: $Y = G^T X \in \mathbb{R}^{d \times n} \;\rightarrow\; \hat{X} = G (G^T X) \in \mathbb{R}^{p \times n}$.
Optimality property of PCA
Main theoretical result:
The matrix $G$ consisting of the first $d$ eigenvectors of the covariance matrix $S$ solves the following minimization problem:
$$\min_{G \in \mathbb{R}^{p \times d}} \; \| X - G(G^T X) \|_F^2 \quad \text{subject to } G^T G = I_d,$$
where $\| X - \hat{X} \|_F^2$ is the reconstruction error.

The PCA projection minimizes the reconstruction error among all linear projections of size $d$.
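A small sketch illustrating this optimality numerically (not from the slides; the random data are illustrative): the PCA basis yields a smaller Frobenius reconstruction error than a random orthonormal basis of the same size.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(10, 200))
X = X - X.mean(axis=1, keepdims=True)
d = 3

# PCA basis: first d eigenvectors of the covariance matrix.
evals, evecs = np.linalg.eigh(X @ X.T / X.shape[1])
G_pca = evecs[:, ::-1][:, :d]

# A random orthonormal basis of the same size, for comparison.
G_rand, _ = np.linalg.qr(rng.normal(size=(10, d)))

def recon_error(G):
    return np.linalg.norm(X - G @ (G.T @ X), 'fro') ** 2

print(recon_error(G_pca) <= recon_error(G_rand))   # True: PCA minimizes the error
```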
Applications of PCA

• Eigenfaces for recognition. Turk and Pentland, 1991.
• Principal Component Analysis for clustering gene expression data. Yeung and Ruzzo, 2001.
• Probabilistic Disease Classification of Expression-Dependent Proteomic Data from Mass Spectrometry of Human Serum. Lilien, 2003.
Outline of lecture

• What is feature reduction?
• Why feature reduction?
• Feature reduction algorithms
• Principal Component Analysis
• Nonlinear PCA using Kernels
Motivation

Linear projections will not detect the pattern.
Nonlinear PCA using Kernels

• Traditional PCA applies a linear transformation
  – May not be effective for nonlinear data

• Solution: apply a nonlinear transformation to a potentially very high-dimensional feature space:
$$\phi : x \rightarrow \phi(x)$$

• Computational efficiency: apply the kernel trick.
  – Requires that PCA can be rewritten in terms of dot products:
$$K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j) \quad \text{(more on kernels later)}$$
Nonlinear PCA using Kernels

Rewrite PCA in terms of dot products.

Assume the data have been centered, i.e., $\sum_i x_i = 0$.

The covariance matrix $S$ can be written as $S = \frac{1}{n}\sum_i x_i x_i^T$.

Let $v$ be an eigenvector of $S$ corresponding to a nonzero eigenvalue $\lambda$:
$$S v = \frac{1}{n}\sum_i x_i x_i^T v = \lambda v \quad\Longrightarrow\quad v = \frac{1}{\lambda n}\sum_i (x_i^T v)\, x_i.$$

Eigenvectors of $S$ lie in the space spanned by all the data points.
Nonlinear PCA using Kernels
$$S v = \frac{1}{n}\sum_i x_i x_i^T v = \lambda v \quad\Longrightarrow\quad v = \frac{1}{\lambda n}\sum_i (x_i^T v)\, x_i$$

The covariance matrix can be written in matrix form:
$$S = \frac{1}{n} X X^T, \qquad X = [x_1, x_2, \ldots, x_n].$$

Write $v = \sum_i \alpha_i x_i = X \alpha$. Then
$$S v = \frac{1}{n} X X^T X \alpha = \lambda X \alpha$$
$$\frac{1}{n} (X^T X)(X^T X)\alpha = \lambda (X^T X)\alpha$$
$$\frac{1}{n} (X^T X)\alpha = \lambda \alpha. \qquad \text{Any benefits?}$$
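One benefit, sketched numerically (not from the slides; the random data are illustrative): when $p \gg n$, the $n \times n$ Gram matrix $X^T X$ is far smaller than the $p \times p$ covariance matrix, yet its nonzero eigenvalues coincide with those of $S = \frac{1}{n} X X^T$.

```python
import numpy as np

rng = np.random.default_rng(5)
p, n = 1000, 20                       # many more dimensions than samples
X = rng.normal(size=(p, n))
X = X - X.mean(axis=1, keepdims=True)

cov_eig  = np.linalg.eigvalsh(X @ X.T / n)    # p x p covariance matrix
gram_eig = np.linalg.eigvalsh(X.T @ X / n)    # n x n Gram matrix

# The n largest eigenvalues of the covariance matrix match the Gram eigenvalues.
print(np.allclose(np.sort(cov_eig)[-n:], np.sort(gram_eig)))
```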
Nonlinear PCA using Kernels

Now consider the feature space $\phi : x \rightarrow \phi(x)$:
$$S^{\phi} = \frac{1}{n} X^{\phi} (X^{\phi})^T, \qquad X^{\phi} = [\phi(x_1), \phi(x_2), \ldots, \phi(x_n)].$$

Write $v = \sum_i \alpha_i \phi(x_i) = X^{\phi}\alpha$; as before,
$$\frac{1}{n} (X^{\phi})^T X^{\phi}\, \alpha = \lambda \alpha.$$

The $(i,j)$-th entry of $(X^{\phi})^T X^{\phi}$ is $\phi(x_i) \cdot \phi(x_j)$.

Apply the kernel trick: $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$.

$K$ is called the kernel matrix, and the eigenproblem becomes $\frac{1}{n} K \alpha = \lambda \alpha$.
Nonlinear PCA using Kernels

• Projection of a test point $x$ onto $v$:
$$\phi(x) \cdot v = \phi(x) \cdot \sum_i \alpha_i \phi(x_i) = \sum_i \alpha_i\, \phi(x) \cdot \phi(x_i) = \sum_i \alpha_i K(x, x_i).$$

The explicit mapping $\phi$ is not required here.
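A minimal kernel-PCA sketch along these lines (not part of the slides; it assumes an RBF kernel and, for brevity, skips the usual centering of $K$ in feature space, which the slides sidestep by assuming $\sum_i \phi(x_i) = 0$):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """K(a, b) = exp(-gamma * ||a - b||^2), computed for all pairs of rows."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_pca(X, k, gamma=1.0):
    """Return alphas for the first k kernel-PCA directions.

    X : (n, p) data, one sample per row.  This sketch does not center K
    in feature space (the slides assume sum_i phi(x_i) = 0).
    """
    K = rbf_kernel(X, X, gamma)
    evals, evecs = np.linalg.eigh(K / len(X))    # solve (1/n) K alpha = lambda alpha
    order = np.argsort(evals)[::-1][:k]
    return evecs[:, order]

def project(x_new, X, alphas, gamma=1.0):
    """Projection of test points onto each v: sum_i alpha_i K(x, x_i)."""
    return rbf_kernel(x_new, X, gamma) @ alphas

rng = np.random.default_rng(6)
X = rng.normal(size=(50, 2))
alphas = kernel_pca(X, k=2)
print(project(X[:3], X, alphas).shape)   # (3, 2)
```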


Reference

• Principal Component Analysis. I. T. Jolliffe.
• Kernel Principal Component Analysis. Schölkopf et al.
• Geometric Methods for Feature Extraction and Dimensional Reduction. Burges.
Principal component analysis (PCA) = K-means clustering

Move the data points of each cluster to its cluster center (assuming each cluster is roughly spherical).
These K cluster centers span the PCA subspace!
(This can be proved rigorously.)

in p-dim space

One early major advance using matrix analysis
(Zha, He, Ding, et al., NIPS 2000)
(Ding & He, ICML 2004)
PCA ⇔ k-means clustering

- Move every data point to its cluster center
- The K cluster centers span a cluster subspace ((k−1)-dimensional)
- Cluster subspace = PCA subspace (the first k−1 PCA directions)

in p-dim space

One early major advance on PCA and K-means (Zha, He, Ding, et al., NIPS 2000)
(Ding & He, ICML 2004)
The solution of K-means is represented by cluster indicators: the indicator matrix $H$ has one block of 1's per cluster, for clusters of sizes $n_1, n_2, \ldots, n_k$.

We actually use scaled indicators $Q$, in which each cluster's column is scaled by $1/\sqrt{n_j}$, so that
$$Q^T Q = I.$$
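A small sketch of these scaled indicators (not from the slides; the labels are made up, and the $1/\sqrt{n_j}$ scaling is the one that makes the columns orthonormal):

```python
import numpy as np

labels = np.array([0, 0, 1, 1, 1, 2])    # cluster assignment of 6 points, k = 3
n, k = len(labels), labels.max() + 1

H = np.zeros((n, k))
H[np.arange(n), labels] = 1.0            # unscaled cluster indicators

sizes = H.sum(axis=0)                    # n_1, ..., n_k
Q = H / np.sqrt(sizes)                   # scale each column by 1/sqrt(n_j)

print(np.allclose(Q.T @ Q, np.eye(k)))   # True: Q^T Q = I
```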
