
Unit – V

Pattern Recognition
Dr.K.Sampath Kumar
SCSE/GU
Introduction to Pattern Recognition
The applications of Pattern Recognition can be found everywhere. Examples include:
• disease categorization and prediction of survival rates for patients with a specific disease
• fingerprint verification
• face recognition
• iris discrimination
• chromosome shape discrimination
• optical character recognition
• texture discrimination
• speech recognition
• Pattern recognition is a branch of machine learning that focuses on the recognition of patterns and regularities in data.

• Pattern recognition systems are in many cases trained from labeled "training" data (supervised learning), but when no labeled data are available, other algorithms can be used to discover previously unknown patterns (unsupervised learning).

• Machine learning is the common term for supervised learning methods and originates from artificial intelligence, whereas KDD and data mining have a larger focus on unsupervised methods and a stronger connection to business use.

• Pattern recognition has its origins in engineering, and the term is popular in the context of computer vision: a leading computer vision conference is named the Conference on Computer Vision and Pattern Recognition.
Pattern Recognition?

"The assignment of a physical object or event to one of several pre-specified categories" -- Duda & Hart

• A pattern is an object, process, or event.
• A class (or category) is a set of patterns that share common attributes (features), usually from the same information source.
• During recognition (or classification), classes are assigned to the objects.
• A classifier is a machine that performs this task.
Pattern Recognition Phases
• Preprocessing
• Classification

A Complete PR System
Problem Formulation:
Input object → Preprocessing → Measurements → Features → Classification → Class label
Basic ingredients:
• Measurement space (e.g., image intensity, pressure)
• Features (e.g., corners, spectral energy)
• Classifier (soft and hard)
• Decision boundary
• Training sample
• Probability of error
A Pattern Recognition Paradigm
[Example slides: texture discrimination, shape discrimination, optical character recognition, and face recognition & discrimination ("Are they from the same person?")]

Statistical Pattern Recognition
Outline
• Basic Probability Theory
• Bayesian Decision Theory
Probability theory
Probability is a mathematical model to help us study
physical systems in an ‘average’ sense

Kinds of probability
• Classical: ratio of favorable outcomes to total outcomes
  $P(E) = \frac{N_E}{N}$
• Relative frequency: measure of frequency of occurrence
  $P(E) = \lim_{N \to \infty} \frac{N_E}{N}$
• Axiomatic theory of probability

Probability Theory
• Conditional probability: the probability of B given A is
  $P(B \mid A) = \frac{P(AB)}{P(A)}, \quad P(A) \neq 0$
• Since $P(AB) = P(BA)$,
  $P(A \mid B) P(B) = P(B \mid A) P(A)$
  or
  $P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}$   (Bayes' theorem)
• Unconditional (total) probability: let $A_1, A_2, \ldots, A_C$ be mutually exclusive events such that $\bigcup_{i=1}^{C} A_i = \Omega$; then for any event B,
  $P(B) = \sum_{i=1}^{C} P(B \mid A_i) P(A_i)$
  and therefore
  $P(A_j \mid B) = \frac{P(B \mid A_j) P(A_j)}{\sum_{i=1}^{C} P(B \mid A_i) P(A_i)}$
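As a quick illustration of Bayes' theorem with the total-probability denominator, here is a minimal Python sketch; the priors and likelihoods are made-up numbers for illustration, not values from the slides.

```python
# Minimal sketch of Bayes' theorem via the law of total probability.
# The priors and likelihoods below are illustrative, made-up numbers.

priors = [0.5, 0.3, 0.2]        # P(A_i): mutually exclusive, exhaustive
likelihoods = [0.9, 0.5, 0.1]   # P(B | A_i)

# Total probability: P(B) = sum_i P(B | A_i) P(A_i)
p_b = sum(l * p for l, p in zip(likelihoods, priors))

# Bayes' theorem: P(A_j | B) = P(B | A_j) P(A_j) / P(B)
posteriors = [l * p / p_b for l, p in zip(likelihoods, priors)]

print(f"P(B) = {p_b:.3f}")
for j, post in enumerate(posteriors, start=1):
    print(f"P(A_{j} | B) = {post:.3f}")
```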
Random variables
• Expected value: $E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx$
• Conditional expectation: $E[X \mid B] = \int_{-\infty}^{\infty} x f_{X \mid B}(x)\,dx$
• Moments: $m_{ij} = E[(X - E[X])^i (Y - E[Y])^j]$
• Variance: $\mathrm{Var}[X] = E[(X - E[X])^2]$
• Covariance: $\mathrm{cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]$
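A minimal NumPy sketch of the sample analogues of these quantities; the distributions and sample size are arbitrary choices for illustration.

```python
# Estimate mean, variance, and covariance from samples (sample analogues
# of the population definitions above). Data are illustrative draws.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=10_000)
y = 0.5 * x + rng.normal(size=10_000)

print("E[X]      ~", x.mean())             # expected value
print("Var[X]    ~", x.var())              # E[(X - E[X])^2]
print("cov(X, Y) ~", np.cov(x, y)[0, 1])   # E[(X-E[X])(Y-E[Y])]
print("check     ~", (x * y).mean() - x.mean() * y.mean())  # E[XY]-E[X]E[Y]
```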
Joint Random Variables
• X and Y are random variables defined on the same sample space $\Omega$.
• The joint distribution function is
  $F_{XY}(x, y) = P(X \le x, Y \le y)$
• The joint probability density function is
  $f_{XY}(x, y) = \frac{\partial^2}{\partial x\, \partial y} F_{XY}(x, y)$
  so that
  $F_{XY}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{XY}(u, v)\, dv\, du$
Marginal Density Functions
Since $(X \le x) = (X \le x) \cap (Y \le \infty)$,
  $F_X(x) = P(X \le x) = P(X \le x, Y \le \infty) = F_{XY}(x, \infty)$
Differentiating,
  $f_X(x) = \frac{d}{dx} F_X(x) = \frac{d}{dx} F_{XY}(x, \infty) = \frac{d}{dx} \int_{-\infty}^{x} \left[ \int_{-\infty}^{\infty} f_{XY}(u, y)\, dy \right] du$
so that
  $f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\, dy$
Similarly,
  $f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx$
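The discrete analogue of marginalization replaces the integral with a sum. A minimal sketch with an illustrative joint probability table:

```python
# Marginalising a discrete joint distribution: the finite analogue of
# f_X(x) = integral of f_XY(x, y) over y. The table is illustrative.
import numpy as np

# p_xy[i, j] = P(X = x_i, Y = y_j)
p_xy = np.array([[0.10, 0.20],
                 [0.30, 0.15],
                 [0.05, 0.20]])

p_x = p_xy.sum(axis=1)   # sum over y -> marginal of X
p_y = p_xy.sum(axis=0)   # sum over x -> marginal of Y

print("P(X):", p_x)      # [0.30, 0.45, 0.25]
print("P(Y):", p_y)      # [0.45, 0.55]
```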
Bayesian Decision Theory
• Consider C classes $w_1, \ldots, w_C$, with a priori probabilities $P(w_1), \ldots, P(w_C)$, assumed known.
• To minimize the error probability, with no extra information, we would assign a pattern to class $w_j$ if
  $P(w_j) > P(w_k), \quad k = 1, \ldots, C;\ k \neq j$
Bayesian Decision Theory
• If we have an observation vector $\mathbf{x}$, considered to be a random variable whose distribution is given by $p(\mathbf{x} \mid w)$, then assign $\mathbf{x}$ to class $w_j$ if
  $P(w_j \mid \mathbf{x}) > P(w_k \mid \mathbf{x}), \quad k = 1, \ldots, C;\ k \neq j$   (MAP rule)
  or
  $\frac{p(\mathbf{x} \mid w_j) P(w_j)}{p(\mathbf{x})} > \frac{p(\mathbf{x} \mid w_k) P(w_k)}{p(\mathbf{x})}$
  or
  $p(\mathbf{x} \mid w_j) P(w_j) > p(\mathbf{x} \mid w_k) P(w_k), \quad k = 1, \ldots, C;\ k \neq j$
• For the 2-class case, the decision rule is the likelihood ratio test:
  $L(\mathbf{x}) = \frac{p(\mathbf{x} \mid w_1)}{p(\mathbf{x} \mid w_2)} > \frac{P(w_2)}{P(w_1)} \Rightarrow \mathbf{x} \in w_1$
Bayesian Decision Theory: Likelihood Ratio Test
Example: $p(x \mid w_1) = N(0, 1)$, $p(x \mid w_2) = 0.6\,N(1, 1) + 0.4\,N(-1, 2)$, $P(w_1) = P(w_2) = 0.5$.
[Figure: top panel plots $p(x \mid w_1) P(w_1)$ and $p(x \mid w_2) P(w_2)$ for $x \in [-4, 4]$; bottom panel plots $L(x)$ against the threshold $P(w_2)/P(w_1) = 1$.]
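A minimal Python sketch of this likelihood ratio test; it assumes $N(m, s)$ denotes a normal with mean $m$ and standard deviation $s$ (the slide does not say whether the 2 in $N(-1, 2)$ is a variance or a standard deviation), and the test points are arbitrary.

```python
# Likelihood ratio test L(x) = p(x|w1)/p(x|w2) vs. P(w2)/P(w1).
from math import exp, pi, sqrt

def normal_pdf(x, mean, std):
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))

def p1(x):                      # p(x | w1) = N(0, 1)
    return normal_pdf(x, 0.0, 1.0)

def p2(x):                      # p(x | w2) = 0.6 N(1, 1) + 0.4 N(-1, 2)
    return 0.6 * normal_pdf(x, 1.0, 1.0) + 0.4 * normal_pdf(x, -1.0, 2.0)

P1 = P2 = 0.5                   # equal priors
threshold = P2 / P1             # = 1

for x in [-2.0, 0.0, 2.0]:
    L = p1(x) / p2(x)           # likelihood ratio L(x)
    decision = "w1" if L > threshold else "w2"
    print(f"x = {x:+.1f}  L(x) = {L:.3f}  -> {decision}")
```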
Probability of error
The probability of error given $\mathbf{x}$ is
  $P(e \mid \mathbf{x}) = \sum_{i=1,\, i \neq j}^{C} P(w_i \mid \mathbf{x}) = 1 - P(w_j \mid \mathbf{x})$
which is minimized when $P(w_j \mid \mathbf{x})$ is maximum.
The average probability of error is
  $P(e) = \int P(e \mid \mathbf{x})\, p(\mathbf{x})\, d\mathbf{x}$
By ensuring that $P(e \mid \mathbf{x})$ is minimum for every $\mathbf{x}$, the integral is made as small as possible.
Conditional Risk & Bayes' Risk
• Loss: a measure of the cost of making an error.
  $\lambda(a_i \mid w_j)$ = cost of assigning a pattern $\mathbf{x}$ to $w_i$ when $\mathbf{x} \in w_j$
• Conditional risk: the overall risk in choosing action $a_i$ is
  $R(a_i \mid \mathbf{x}) = \sum_{j=1}^{C} \lambda(a_i \mid w_j)\, P(w_j \mid \mathbf{x})$
  Assuming the zero-one loss
  $\lambda(a_i \mid w_j) = \begin{cases} 0, & i = j \\ 1, & i \neq j \end{cases}$
  this becomes
  $R(a_i \mid \mathbf{x}) = \sum_{j \neq i} P(w_j \mid \mathbf{x}) = 1 - P(w_i \mid \mathbf{x})$
To minimize the average probability of error, choose the $i$ that maximizes the posterior probability $P(w_i \mid \mathbf{x})$. If the action is chosen so that the overall risk is minimized for every $\mathbf{x}$, the resulting minimum overall risk is called the Bayes' risk.
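A minimal sketch of the minimum-conditional-risk rule; the posteriors and the loss matrix are illustrative values, with rows of `loss` indexing actions $a_i$ and columns indexing true classes $w_j$.

```python
# Minimum-conditional-risk decision: pick the action a_i minimising
# R(a_i | x) = sum_j lambda(a_i | w_j) P(w_j | x).
import numpy as np

posteriors = np.array([0.2, 0.7, 0.1])   # P(w_j | x) for C = 3 classes

# Zero-one loss: lambda(a_i | w_j) = 0 if i == j else 1
loss = np.ones((3, 3)) - np.eye(3)

risks = loss @ posteriors                 # = 1 - P(w_i | x) under 0-1 loss

best_action = int(np.argmin(risks))
print("R(a_i | x) =", risks)              # [0.8, 0.3, 0.9]
print("choose action", best_action)       # 1, same as argmax posterior
```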
Bayes decision rule - Reject option
• Partition the sample space into two regions, where $t$ is a threshold:
  $R = \{\mathbf{x} : 1 - \max_i P(w_i \mid \mathbf{x}) > t\}$   (reject region)
  $A = \{\mathbf{x} : 1 - \max_i P(w_i \mid \mathbf{x}) \le t\}$   (accept region)
• $R$ is empty when $t \ge \frac{C - 1}{C}$, since the largest posterior can never be smaller than $1/C$.
• The probability of rejection is
  $r(t) = \int_R p(\mathbf{x})\, d\mathbf{x}$
• The error rate is
  $e(t) = \int_A \left(1 - \max_i P(w_i \mid \mathbf{x})\right) p(\mathbf{x})\, d\mathbf{x}$
[Figure: two-class example plotting $P(w_1 \mid x)$ and $P(w_2 \mid x)$ for $x \in [-4, 4]$ with the level $1 - t$; the reject region $R$ lies between the accept regions $A$, where both posteriors fall below $1 - t$.]
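A minimal sketch of the reject rule for a single observation; the posterior vectors and threshold below are illustrative, and the hypothetical helper `decide_with_reject` is not from the slides.

```python
# Reject option: refuse to classify when the best posterior is not
# confident enough, i.e. when 1 - max_i P(w_i | x) > t.
import numpy as np

def decide_with_reject(posteriors, t):
    """Return a class index, or None to reject."""
    posteriors = np.asarray(posteriors)
    best = int(np.argmax(posteriors))
    if 1.0 - posteriors[best] > t:   # x falls in the reject region R
        return None
    return best                      # x falls in the accept region A

print(decide_with_reject([0.50, 0.45, 0.05], t=0.3))  # None (rejected)
print(decide_with_reject([0.85, 0.10, 0.05], t=0.3))  # 0 (accepted)
```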
