

Classification techniques
for
Hand-Written Digit Recognition

Venkat Raghavan N. S., Saneej B. C., and Karteek Popuri

Department of Chemical and Materials Engineering


University of Alberta, Canada.

Thursday, June 1, 2006 CPC group Seminar


Introduction

Objective: to recognise images of handwritten digits using classification methods for multivariate data.

• Optical Character Recognition (OCR): predict the label of each image using the classification function learned from training data.
• OCR is essentially a classification task on multivariate data:
  - Pixel values → variables
  - Each type of character → a class


Handwritten Digit data

• 16 x 16 (= 256 pixel) grey-scale images of the digits 0-9, with pixel values x_ij
  - Each image is a vector X_i = [x_i1, x_i2, ..., x_i256]
  - Each label y_i ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
• 9298 labelled samples
  - Training set: ~1000 images
  - Test set: the remaining images
  - Randomly selected from the full database
• Basic idea: correctly identify the digit given an image

[Figure: a sample 16 x 16 digit image.]


Dimension reduction - PCA

• PCA was done on the mean-centred images
• The eigenvectors of the 256 x 256 covariance matrix Σ are called the eigendigits (256-dimensional)
• The larger an eigenvalue, the more important is that eigendigit
• The ith PC of an image X is y_i = e_i' X

[Figures: the average digit image, and the leading eigendigits shown as 16 x 16 images.]
PCA (continued)

• Based on the eigenvalues, the first 64 PCs were found to be significant
• Variance captured: ~92.74%
• Any image can be represented by its PCs: Y = [y_1 y_2 ... y_64]
• Reduced data matrix with 64 variables: Y is a 1000 x 64 matrix
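The PCA pipeline above can be sketched in a few lines of numpy. This is an illustrative reconstruction, not the authors' code: `images` is a random stand-in for the real digit matrix, and the 64-PC cut-off follows the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the real 16x16 digit data: n samples of 256 pixel values.
images = rng.random((100, 256))

# 1. Mean-centre the images (subtract the "average digit").
mean_digit = images.mean(axis=0)
centred = images - mean_digit

# 2. Eigendecomposition of the 256x256 covariance matrix Sigma.
sigma = np.cov(centred, rowvar=False)          # 256 x 256
eigvals, eigvecs = np.linalg.eigh(sigma)       # ascending order
order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 3. Keep the first 64 eigendigits and project: y_i = e_i' x.
k = 64
Y = centred @ eigvecs[:, :k]                   # n x 64 reduced data matrix

# Fraction of total variance captured by the first k PCs.
captured = eigvals[:k].sum() / eigvals.sum()
```

On the real data the slide reports that these 64 components capture about 92.74% of the variance.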
Interpreting the PCs as Image Features

• The eigenvectors are a rotation of the original axes to more meaningful directions. The PCs are the projections of the data onto these new axes.
• Image reconstruction:
  - The original image can be reconstructed by projecting the PCs back onto the old axes.
  - Using only the most significant PCs gives a reconstructed image that is close to the original.
  - These features can be used for further investigations, e.g. classification!
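The reconstruction idea can be sketched as follows (an illustrative numpy example on synthetic stand-in data, not the authors' code): projecting onto all 256 eigendigits recovers the image exactly, while the leading 64 give only an approximation.

```python
import numpy as np

rng = np.random.default_rng(1)
images = rng.random((300, 256))                # stand-in for 16x16 digits
mean_digit = images.mean(axis=0)
centred = images - mean_digit

# Eigendigits from the covariance matrix, largest eigenvalue first.
eigvals, eigvecs = np.linalg.eigh(np.cov(centred, rowvar=False))
eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]

def reconstruct(x, k):
    """Project an image onto the first k eigendigits and back onto the old axes."""
    E = eigvecs[:, :k]
    return mean_digit + E @ (E.T @ (x - mean_digit))

x = images[0]
# All 256 PCs reconstruct the image exactly; 64 PCs only approximately.
err_64  = np.linalg.norm(x - reconstruct(x, 64))
err_256 = np.linalg.norm(x - reconstruct(x, 256))
```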


Normality test on PCs

[Figure: Q-Q plots of sample data versus the standard normal for principal components 1, 3, 5, 10, 20, 30, 40, 50 and 60.]
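The Q-Q comparison behind these plots can be sketched with the standard library alone. This is an illustrative example on a synthetic sample standing in for one PC's scores, not the authors' code: empirical quantiles of the standardised sample are paired with standard-normal quantiles.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(42)
nd = NormalDist()

# Stand-in sample playing the role of one principal component's scores.
sample = [random.gauss(0.0, 1.0) for _ in range(500)]
n = len(sample)

# Empirical quantiles: the sorted, standardised sample values.
m, s = mean(sample), stdev(sample)
empirical = sorted((x - m) / s for x in sample)

# Theoretical quantiles of the standard normal at positions (i + 0.5) / n.
theoretical = [nd.inv_cdf((i + 0.5) / n) for i in range(n)]

# For normally distributed data the Q-Q points fall near the line y = x.
max_dev = max(abs(e - t) for e, t in zip(empirical, theoretical))
```

Plotting `empirical` against `theoretical` reproduces one panel of the figure.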


Classification

• Principal components are used as the features of the images
• LDA assumes multivariate normality of the feature groups and a common covariance matrix
• The Fisher discriminant procedure assumes only a common covariance matrix


Classification (contd.)

• Equal cost of misclassification is assumed
• Misclassification error rates:
  - APER, based on the training data
  - AER, on the validation data
  (both averaged over several random samplings of training and validation data from the full data set)
• Error rates using different numbers of PCs were compared


Performing LDA

• The prior probability of each class was taken as the frequency of that class in the data
• Equality of the covariance matrices:
  - A strong assumption
  - Error rates were used to check the validity of the assumption
  - S_pooled was used as the common covariance matrix


LDA Results

• APER

  No of PCs | 256 | 150 | 64
  APER %    | 1.8 | 4   | 6.4

• AER

  No of PCs | 256   | 150   | 64
  AER %     | 13.63 | 10.91 | 10.087

• APER underestimates the AER
• Using 64 PCs is better than using 150/256 PCs!
  - The PCs with lower eigenvalues tend to capture the noise in the data.


Fisher Discriminants

• Uses equal prior probabilities and equal covariances
• The number of discriminants r can be at most 9 (r <= 9)
• When all discriminants are used (r = 9), Fisher is equivalent to LDA (verified by the error rates)
• Error rates with different r were compared
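A minimal numpy sketch of the Fisher discriminant construction, on toy three-class data rather than the digit PCs (all names illustrative): the discriminant directions are the leading eigenvectors of Sw^-1 Sb, and at most g - 1 of them carry information.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-class data standing in for the PC features of three digit classes.
X = np.vstack([rng.normal(m, 1.0, size=(50, 4)) for m in (0.0, 3.0, 6.0)])
y = np.repeat([0, 1, 2], 50)

classes = np.unique(y)
grand_mean = X.mean(axis=0)

# Within-class and between-class scatter matrices.
Sw = sum(sum(np.outer(d, d) for d in X[y == c] - X[y == c].mean(axis=0))
         for c in classes)
Sb = sum(np.sum(y == c) * np.outer(X[y == c].mean(axis=0) - grand_mean,
                                   X[y == c].mean(axis=0) - grand_mean)
         for c in classes)

# Fisher discriminants: leading eigenvectors of Sw^-1 Sb.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
order = np.argsort(eigvals.real)[::-1]
r = 2                                  # at most g - 1 = 2 discriminants here
W = eigvecs.real[:, order[:r]]

Z = X @ W                              # data in discriminant coordinates
```

With 10 digit classes the same construction yields at most 9 discriminants, matching the slide's bound r <= 9.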


Fisher Discriminant Results

r = 2 discriminants

  No of PCs | 256 | 150  | 64
  APER %    | 32  | 34.5 | 37.4
  AER %     | 45  | 42   | 40

• Both the AER and the APER are very high


Fisher Discriminant Results

r = 7 discriminants

  No of PCs | 256  | 150  | 64
  APER %    | 3.2  | 4.8  | 7.9
  AER %     | 14.1 | 12.4 | 10.8

• Considerable improvement in AER and APER
• Performance is close to LDA
• Using 64 PCs is better
Fisher Discriminant Results

r = 9 (all) discriminants

  No of PCs | 256   | 150   | 64
  APER %    | 1.6   | 4.3   | 6.4
  AER %     | 13.21 | 10.55 | 9.86

• No significant performance gain over r = 7
• Error rates are ~ those of LDA (as expected!)
Nearest Neighbour Classifier

• Finds the nearest neighbour in the training set to the test image and assigns its label to the test image
• No assumption about the distribution of the data
• Euclidean distance is used to find the nearest neighbour

[Figure: a test point between two clusters is assigned to class 2, the class of its nearest neighbour.]


K-Nearest Neighbour Classifier (KNN)

• Compute the k nearest neighbours and assign the class by majority vote

[Figure: with k = 3, a test point is assigned to class 1 (2 votes) over class 2 (1 vote).]
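The voting rule can be sketched as follows (illustrative toy data, not the digit features):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training points nearest to x (Euclidean)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Tiny illustration: two clusters standing in for two digit classes.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [3.0, 3.0], [3.1, 2.9], [2.9, 3.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])

label = knn_predict(X_train, y_train, np.array([0.3, 0.3]), k=3)
```

With k = 1 this reduces to the plain nearest-neighbour classifier of the previous slide.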


1-NN Classification Results

  No of PCs | 256  | 150  | 64
  AER %     | 7.09 | 7.01 | 6.45

• Test error rates improve compared to LDA and Fisher
• Using 64 PCs gives better results
• Using higher k does not improve the recognition rate


Misclassification in NN

Confusion matrix (rows: actual digit, columns: recognised as):

      0     1    2    3    4    5    6    7    8    9
0  1376     0    4    2    0    5   12    2    0    0
1     0  1113    1    0    1    0    2    0    2    0
2    22     9  728   17    4    4    6   16   18    2
3     4     0    4  690    2   26    0    4    6    3
4     3    15    9    0  687    0    7    2    4   32
5     9     3   12   37    5  517   32    0   23    9
6    10     3    5    0    3    2  714    0    3    2
7     0     6    1    0   19    0    0  657    1   20
8     8    11    1   26    7    7    8    5  547   13
9     6     1    2    0   23    0    0   32    0  664

• Euclidean distances between transformed images of the same class can be very high
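A confusion matrix of this form can be tallied as follows (a minimal sketch with made-up labels, not the slide's actual predictions):

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes=10):
    """cm[i, j] = number of samples of actual class i recognised as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return cm

# Made-up labels: one 0 recognised as 6, one 9 as 4, one 4 as 9.
actual    = np.array([0, 0, 1, 9, 9, 4])
predicted = np.array([0, 6, 1, 9, 4, 9])
cm = confusion_matrix(actual, predicted)

# Off-diagonal entries are misclassifications; the error rate is their share.
error_rate = 1.0 - np.trace(cm) / cm.sum()
```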
Issues in NN

• Expensive: to determine the nearest neighbour of a test image, the distance to all N training examples must be computed
• Storage requirements: all training data must be stored


Euclidean-NN method inefficient

• Storing all possible instances (positions, sizes, angles, thicknesses, writing styles, ...) is impractical


Euclidean distance metric fails

[Figure: a pattern to be classified, shown next to prototype A and prototype B.]

• Prototype B seems more similar than prototype A according to Euclidean distance
• The digit "9" is misclassified as "4"
• A possible solution is to use a distance metric invariant to irrelevant transformations
Effect of a Transformation

[Figure: in pixel space, a transformation s(X, α) traces a curve through the image X; for small α it is approximated by X + α·T along the tangent direction T.]

The set of all transformed versions of an image X is

  S_X = { y | there exists α for which y = s(X, α) }


Tangent Distance

[Figure: two manifolds S_P and S_E in pixel space. The Euclidean distance is measured between the points P and E; the tangent distance is the distance between S_P and S_E.]


Images in tangent plane

[Figure: for each transformation (rotation, scaling, thickness, X translation, diagonal deformation, axis deformation, Y translation), the images P + α·T for α = -2, -1, 0, 1, 2.]


Implementation

• The vectors tangent to the manifold S_X form the hyperplane T_X tangent to S_X
• The tangent distance D(E, P) is found by minimising the distance between T_E and T_P
• The images are smoothed with a Gaussian of σ = 1
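The smoothing step might be implemented as a separable Gaussian convolution. The slides give only σ = 1, so this numpy sketch is an assumption about the implementation; it blurs an impulse image to show the effect:

```python
import numpy as np

def gaussian_kernel(sigma=1.0, radius=3):
    """1-D Gaussian kernel, normalised to sum to 1."""
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def smooth(img, sigma=1.0):
    """Separable Gaussian smoothing: convolve each row, then each column."""
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

# A single bright pixel spreads into a small Gaussian blob.
img = np.zeros((16, 16))
img[8, 8] = 1.0
blurred = smooth(img, sigma=1.0)
```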


Implementation (contd.)

The equations of T_P and T_E are given by

  E'(α_E) = E + L_E α_E        P'(α_P) = P + L_P α_P

where

  L_X = [ ∂s(X, α)/∂α_1 , ... , ∂s(X, α)/∂α_m ]  evaluated at α = 0

and the tangent distance is

  D(E, P) = min over α_E, α_P of || E'(α_E) - P'(α_P) ||²


Implementation (contd.)

Setting the derivatives to zero:

  ∂D(E, P)/∂α_E = 2 (E'(α_E) - P'(α_P))' L_E = 0

  ∂D(E, P)/∂α_P = 2 (P'(α_P) - E'(α_E))' L_P = 0

Solving for α_P and α_E, we can calculate D(E, P), the tangent distance between the two patterns E and P.
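A minimal sketch of the tangent-distance computation. The real implementation uses the seven transformations listed earlier and Gaussian-smoothed images; this illustration uses a single horizontal-translation tangent built by finite differences, and solves the zero-gradient equations as one least-squares problem. Because P here is exactly a one-pixel shift of E, its tangent distance is numerically zero while its Euclidean distance is not.

```python
import numpy as np

def x_translation_tangent(img):
    """Finite-difference tangent vector for a one-pixel horizontal shift."""
    shifted = np.roll(img.reshape(16, 16), 1, axis=1)
    return shifted.reshape(-1) - img

def tangent_distance(E, P):
    """Two-sided tangent distance with a single transformation parameter."""
    LE = x_translation_tangent(E)[:, None]   # columns = tangent vectors
    LP = x_translation_tangent(P)[:, None]
    # Minimise ||(E + LE aE) - (P + LP aP)||^2: a least-squares problem in
    # the stacked parameters [aE; aP], equivalent to the zero-gradient system.
    L = np.hstack([LE, -LP])
    a, *_ = np.linalg.lstsq(L, P - E, rcond=None)
    residual = (E - P) + L @ a
    return np.linalg.norm(residual)

rng = np.random.default_rng(0)
E = rng.random(256)
P = np.roll(E.reshape(16, 16), 1, axis=1).reshape(-1)  # E shifted by 1 pixel

d_euclid  = np.linalg.norm(E - P)
d_tangent = tangent_distance(E, P)   # <= the Euclidean distance by construction
```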


Tangent Distance method Results

• USPS data set: 1000 training examples and 7000 test examples
• The misclassification error rate using 3-NN is 3.26%
• The time taken is 9967.94 s


References

• T. Hastie, R. Tibshirani and J. Friedman, "The Elements of Statistical Learning: Data Mining, Inference and Prediction".
• R. A. Johnson and D. W. Wichern, "Applied Multivariate Statistical Analysis".
• http://www.robots.ox.ac.uk/~dclaus/
• P. Y. Simard and Y. A. Le Cun, "Transformation Invariance in Pattern Recognition: Tangent Distance and Tangent Propagation".
