Pattern Recognition Linear Classifier by Zaheer Ahmad
Zaheer Ahmad PhD Scholar [email protected] Department of Computer Science University of Peshawar
Agenda
Pattern Recognition
Features and Patterns; Classifier Approaches; Design Cycle
Linear Classification
Linear Discriminant Functions; Linear Separability; Fisher Discriminant Functions; Support Vector Machines (SVMs)
Pattern recognition is "the process of giving names to observations x" (Schürmann). Pattern recognition is concerned with answering the question "What is this?" (Morse).
Applications of PR
Image processing, computer vision, speech recognition, data mining, automated target recognition, optical character recognition, seismic analysis, man and machine diagnostics, fingerprint identification, industrial inspection, financial forecasting, medical diagnosis, ECG signal analysis
Terminology
Recognition: during recognition (or classification), given objects are assigned to prescribed classes. Classification is the problem of identifying which of a set of categories (sub-populations) a new observation belongs to, on the basis of a training set of data containing observations (or instances) whose category membership is known. An algorithm that implements classification, especially in a concrete implementation, is known as a classifier; a classifier is a machine which performs classification.
Features
A feature is any distinctive aspect, quality, or characteristic of an object. Features may be symbolic (e.g., color) or numeric (e.g., height). The combination of d features is a d-dimensional column vector called a feature vector. The d-dimensional space defined by the feature vector is called the feature space.
Objects are represented as points in feature space; the result is a scatter plot
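A minimal sketch of this idea (the two classes and their feature values are assumed for illustration): each row is a feature vector, and plotting the vectors as points gives the scatter plot.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Hypothetical 2-D feature vectors (e.g., height, weight) for two classes
class_a = rng.normal(loc=[5.0, 60.0], scale=[0.3, 5.0], size=(50, 2))
class_b = rng.normal(loc=[6.0, 80.0], scale=[0.3, 5.0], size=(50, 2))

# Each object is a point in feature space
plt.scatter(class_a[:, 0], class_a[:, 1], label="class A")
plt.scatter(class_b[:, 0], class_b[:, 1], label="class B")
plt.xlabel("feature 1")
plt.ylabel("feature 2")
plt.legend()
plt.show()
```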
Pattern Classes
A pattern class (or category) is a set of patterns sharing common attributes and usually originating from the same source; that is, a class is a set of objects having some important properties in common.
Decision Boundary/Surface
A line or curve separating the classes is a decision boundary. The equation g(x) = 0 defines the decision surface that separates points assigned to category 1 from points assigned to category 2. When g(x) = wᵀx + w0 is linear, the decision surface is a hyperplane. If x1 and x2 are both on the hyperplane, then wᵀx1 + w0 = wᵀx2 + w0 = 0, so wᵀ(x1 − x2) = 0: the weight vector w is normal to the hyperplane.
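A numerical check of this property (the weight vector, bias, and the two points are assumptions for illustration):

```python
import numpy as np

w = np.array([2.0, -1.0])   # assumed weight vector
w0 = 0.5                    # assumed bias

def g(x):
    return w @ x + w0       # linear discriminant g(x) = w^T x + w0

# Two points chosen to lie on the decision surface g(x) = 0
x1 = np.array([0.0, 0.5])
x2 = np.array([1.0, 2.5])
print(g(x1), g(x2))    # both 0.0: x1 and x2 are on the hyperplane
print(w @ (x1 - x2))   # 0.0: w is orthogonal to the hyperplane
```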
Decision Boundary
Slope-intercept form of a straight line: the equation of a line with slope m and y-intercept b can be written as y = mx + b.
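To connect the two forms: in 2-D, a hyperplane w1·x + w2·y + w0 = 0 with w2 ≠ 0 can be solved for y, giving the slope-intercept form. A small sketch (coefficients assumed for illustration):

```python
# w1*x + w2*y + w0 = 0  =>  y = (-w1/w2)*x + (-w0/w2) = m*x + b
w1, w2, w0 = 2.0, -1.0, 0.5   # assumed hyperplane coefficients
m = -w1 / w2                  # slope
b = -w0 / w2                  # intercept
print(f"y = {m}x + {b}")      # y = 2.0x + 0.5
```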
The task of a classifier is to partition feature space into class-labeled decision regions. Borders between decision regions are called decision boundaries. Classifying a feature vector consists of determining which decision region it belongs to and assigning it to that class.
Classifiers
Neural: very attractive since it requires minimal a priori knowledge; with enough layers and neurons, ANNs can create any complex decision region. Syntactic: patterns are classified based on measures of structural similarity; knowledge is represented by means of formal grammars or relational descriptions (graphs); used not only for classification but also for description. Typically, syntactic approaches formulate hierarchical descriptions of complex patterns built up from simpler subpatterns.
Design Cycle
Model choice: statistical, neural, and structural approaches; parameter settings. Requires basic prior knowledge.
Training: given a feature set and a blank model, adapt the model to explain the data (supervised, unsupervised, or reinforcement learning).
Evaluation: how well does the trained model do? Overfitting vs. generalization, as sketched below.
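A minimal training-and-evaluation sketch using scikit-learn (the synthetic dataset and the choice of a perceptron are assumptions for illustration); comparing train and test accuracy is one simple check of generalization:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split

# Assumed synthetic 2-D dataset with two classes
X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           n_informative=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

# Training: adapt the model to explain the training data (supervised)
clf = Perceptron().fit(X_train, y_train)

# Evaluation: accuracy on held-out data measures generalization
print("train accuracy:", clf.score(X_train, y_train))
print("test accuracy: ", clf.score(X_test, y_test))
```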
Linear Classification
Classification in which the decision boundary in the feature (input) space is linear: the input space is split by hyperplanes into regions, each with an assigned class.
Linear Separability
If a hyperplanar decision boundary exists that correctly classifies all the training samples for a c = 2 class problem, the samples are said to be linearly separable.
Such a boundary can be written as g(x) = wᵀx + w0 = 0, where w is the weight vector and w0 is the bias (or threshold weight).
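A minimal sketch of classifying by the sign of g(x) (the weights, bias, and test points are assumptions for illustration):

```python
import numpy as np

def linear_classifier(x, w, w0):
    """Assign class 1 if g(x) = w^T x + w0 > 0, otherwise class 2."""
    return 1 if w @ x + w0 > 0 else 2

w = np.array([1.0, -2.0])   # assumed weight vector
w0 = 0.25                   # assumed bias
print(linear_classifier(np.array([3.0, 1.0]), w, w0))  # g = 1.25  -> class 1
print(linear_classifier(np.array([0.0, 1.0]), w, w0))  # g = -1.75 -> class 2
```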
Linear Classifiers
A linear classifier is a mapping that partitions feature space using a linear function (a straight line in 2-D, or a hyperplane in general). It is one of the simplest classifiers we can imagine: separate the two classes using a straight line in feature space.
Fisher Linear Discriminant
Classes well separated in D-dimensional space may strongly overlap when projected onto one dimension. Adjust the components of the weight vector w and select the projection that maximizes class separation.
When projected onto the line joining the class means, the classes are not well separated.
Fisher chooses a direction that makes the projected classes much tighter, even though their projected means are less far apart.
Project each sample onto a line: $y = \mathbf{w}^T \mathbf{x}$.

The separation of the projected class means suggests choosing $\mathbf{w} \propto \mathbf{m}_2 - \mathbf{m}_1$.

Within-class scatter of the projected samples from class $C_k$:
$$s_k^2 = \sum_{n \in C_k} (y_n - m_k)^2, \qquad k = 1, 2$$

Fisher criterion (between-class over within-class scatter):
$$J(\mathbf{w}) = \frac{(m_2 - m_1)^2}{s_1^2 + s_2^2} = \frac{\mathbf{w}^T \mathbf{S}_B \mathbf{w}}{\mathbf{w}^T \mathbf{S}_W \mathbf{w}}$$

where
$$\mathbf{S}_B = (\mathbf{m}_2 - \mathbf{m}_1)(\mathbf{m}_2 - \mathbf{m}_1)^T$$
$$\mathbf{S}_W = \sum_{n \in C_1} (\mathbf{x}_n - \mathbf{m}_1)(\mathbf{x}_n - \mathbf{m}_1)^T + \sum_{n \in C_2} (\mathbf{x}_n - \mathbf{m}_2)(\mathbf{x}_n - \mathbf{m}_2)^T$$

Optimal solution: $\mathbf{w} \propto \mathbf{S}_W^{-1}(\mathbf{m}_2 - \mathbf{m}_1)$.
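A direct implementation of this solution (the two Gaussian clouds are assumed test data): compute the class means, accumulate the within-class scatter matrix, and solve S_W w = m2 − m1.

```python
import numpy as np

def fisher_direction(X1, X2):
    """Fisher discriminant direction w ∝ S_W^{-1} (m2 - m1)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter: sum of outer products of centered samples
    S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    w = np.linalg.solve(S_W, m2 - m1)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
X1 = rng.normal([0.0, 0.0], [1.0, 0.3], size=(100, 2))  # assumed class 1
X2 = rng.normal([2.0, 1.0], [1.0, 0.3], size=(100, 2))  # assumed class 2
w = fisher_direction(X1, X2)
y1, y2 = X1 @ w, X2 @ w  # 1-D projections y = w^T x
print("projected means:", y1.mean(), y2.mean())
```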
Separating Hyperplane
[Figure: two classes, labeled $y_i = +1$ and $y_i = -1$, in the $(x_1, x_2)$ plane.]
A separating hyperplane: $\mathbf{w} \cdot \mathbf{x} + b = 0$.
Separating Hyperplanes
[Figure: candidate hyperplanes between the two classes ($y_i = +1$, $y_i = -1$); the maximum-margin hyperplane gives good generalization.]
The SVM idea is to maximize the distance between the hyperplane and the closest sample points. In the optimal hyperplane, the closest samples $\mathbf{x}_i$ (the support vectors) lie at equal distance on either side, and this margin is as large as possible, as sketched below.
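A minimal linear-SVM sketch with scikit-learn (the synthetic data and the large C, approximating a hard margin, are assumptions for illustration); the margin width is 2/||w||:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 2)),
               rng.normal(+2.0, 1.0, size=(50, 2))])
y = np.array([-1] * 50 + [+1] * 50)

clf = SVC(kernel="linear", C=1e3).fit(X, y)  # large C ~ hard margin
w, b = clf.coef_[0], clf.intercept_[0]
print("hyperplane:", w, b)                   # w . x + b = 0
print("margin width:", 2.0 / np.linalg.norm(w))
print("number of support vectors:", len(clf.support_vectors_))
```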