Feature Extraction
Abstract—Facial expression recognition has numerous applications, including psychological research, improved human-computer interaction, and sign language translation. A novel facial expression recognition system based on hybrid face regions (HFR) is investigated. The expression recognition system is fully automatic and consists of the following modules: face detection, facial detection, feature extraction, optimal feature selection, and classification. The features are extracted from both the whole face image and face regions (eyes and mouth) using log Gabor filters. Then, the most discriminative features are selected based on a mutual information criterion. The system can automatically recognize six expressions: anger, disgust, fear, happiness, sadness, and surprise. The selected features are classified using the naïve Bayesian (NB) classifier. The proposed method has been extensively assessed using the Cohn-Kanade and JAFFE databases. The experiments have highlighted the efficiency of the proposed HFR method in enhancing the classification rate.

Index Terms—Facial expression recognition, Gabor filters, Face regions, Human computer interaction, Feature extraction

I. INTRODUCTION

Over the last decade, interest in human-computer interaction (HCI) systems has grown. Automated facial expression recognition is an important task in HCI systems that include emotion processing. Humans are capable of producing thousands of facial actions during communication that vary in complexity, intensity, and meaning. Emotion or intention is often communicated by subtle changes in one or several discrete features. The addition or absence of one or more facial actions may alter the interpretation of an expression. In addition, some facial expressions may have a similar gross morphology but indicate different meanings at different expression intensities.

Automatic facial expression analysis is a flourishing area of research in computer science. Problems that have been tackled previously include the tracking of facial expressions in static images and video sequences, the transfer of expressions to novel faces, the repurposing of a person's expression to a virtual model, and the recognition of facial expressions. These recognition tasks have focused on the classification of emotional expressions [1], the classification of complex mental states [2], or the automatic recognition of FACS action units [3].

A problem frequently encountered in each of these tasks is that of partial occlusion. Occlusions can introduce errors into the predicted expression or result in an incorrect expression being transferred to a virtual head. One type of partial occlusion is a temporary occlusion, caused by a part of the face being obscured momentarily by an object or by a person moving their head so that not all features of the face can be seen by the camera. Another type is a systematic occlusion, which can be caused by a person wearing something such as a head-mounted display that hides the features of the upper half of the face. These types of occlusions are potentially more damaging since they result in whole features relevant to judging facial expression being obscured.

Plenty of work has been done on facial expression recognition [15], [17], [19], [21]. In this study, we investigate which part of the face contains the most discriminative information for a facial expression recognition system and propose a hybrid face region method for feature extraction. Automatic classification of facial expressions consists of two stages: feature extraction and feature classification. Feature extraction is extremely important to the whole classification process. If inadequate features are used, even the best classifier could fail to achieve accurate recognition. In most cases of facial expression classification, the process of feature extraction yields a prohibitively large number of features, and subsequently a smaller subset of features needs to be selected according to some optimality criteria.

The Gabor wavelet feature representation showed high performance in the recognition of facial actions from image sequences. Although Gabor wavelet facial feature representations have been widely adopted [6], [7], [20], it is computationally expensive to convolve the face images with multi-level banks of Gabor filters in order to extract the scale and orientation coefficients. Furthermore, Gabor wavelet analysis suffers from two major limitations: the maximum bandwidth of a Gabor filter is limited to approximately one octave, and Gabor filters are not optimal when the objective is to achieve broad spectral information with maximum spatial localization. These drawbacks can be overcome by using the logarithmic form of the Gabor filters in the process of feature extraction. The log Gabor filters are known to provide excellent simultaneous localization of spatial and frequency information; however, the dimensionality of the resulting data is high. The dimensionality reduction can be achieved by selecting a small subset of the log Gabor features based on specified optimality criteria.

The dimensionality reduction can be achieved by selecting the more informative features using feature selection and data reduction methods such as principal component analysis (PCA), independent component analysis (ICA), mutual information (MI), etc. [10], [11], [12]. In this
Digital Object Identifier 10.4316/AECE.2009.03012
Advances in Electrical and Computer Engineering Volume 9, Number 3, 2009
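The log Gabor filtering discussed in the introduction can be illustrated with a minimal frequency-domain sketch. It is not the authors' implementation: the 3 scales and 6 orientations follow the values reported in the experiments, while the function names and the parameters `f_max`, `scale_step`, `sigma_ratio`, and `sigma_theta` are hypothetical choices made only for illustration. The sketch assumes a square grey-level image.

```python
import numpy as np

def log_gabor_bank(size, n_scales=3, n_orient=6, f_max=0.25,
                   scale_step=2.0, sigma_ratio=0.65, sigma_theta=0.5):
    """Bank of log Gabor transfer functions on a size x size FFT grid.

    All numeric defaults except n_scales and n_orient are assumptions.
    """
    freqs = np.fft.fftfreq(size)                 # cycles per pixel
    fx, fy = np.meshgrid(freqs, freqs)
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0                           # avoid log(0) at the DC term
    theta = np.arctan2(fy, fx)
    bank = []
    for s in range(n_scales):
        f0 = f_max / scale_step ** s             # centre frequency of this scale
        # Gaussian on a logarithmic frequency axis (radial component)
        radial = np.exp(-np.log(radius / f0) ** 2 / (2 * np.log(sigma_ratio) ** 2))
        radial[0, 0] = 0.0                       # a log Gabor filter has no DC response
        for o in range(n_orient):
            angle = o * np.pi / n_orient
            # wrapped angular distance to the filter orientation
            d = np.arctan2(np.sin(theta - angle), np.cos(theta - angle))
            angular = np.exp(-d ** 2 / (2 * sigma_theta ** 2))
            bank.append(radial * angular)
    return np.stack(bank)

def log_gabor_features(image, bank):
    """Magnitude responses of a square image to every filter in the bank."""
    spectrum = np.fft.fft2(image)
    return np.abs(np.fft.ifft2(spectrum[None, :, :] * bank, axes=(-2, -1)))
```

Each image yields one magnitude map per scale and orientation, which is why the resulting feature arrays become large and a selection step is needed afterwards.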
$$\vec{f}_{HFR} = \vec{f}_{eyes} \cup \vec{f}_{mouth} \cup \vec{f}_{face} \qquad (4)$$

This process results in a prohibitively large number of feature arrays. For large training and testing sets, the computations are highly impractical. In order to improve the computational efficiency, it is critical to reduce the feature dimensions. This is achieved using a feature selection process [5], [10], [25], [26].

V. FEATURE SELECTION

An optimal subset of features is selected on the basis of the mutual information (MI) criterion [10], [25]. The mutual information measures the information shared by two random variables, say X and Y, and is given as:

$$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)} \qquad (5)$$

where p(x) is the probability density function (pdf), defined as p(x) = Pr{X=x}, and p(x,y) is the joint pdf, defined as p(x,y) = Pr{X=x and Y=y}. The MI can also be expressed in terms of the entropy:

$$I(X;Y) = H(X) - H(X \mid Y) \qquad (6)$$

where H(X) is the entropy of a random variable X, given as:

$$H(X) = -\sum_{x \in X} p(x) \log p(x) \qquad (7)$$

H(X|Y) in Equation 6 is the conditional entropy, given as:

$$H(X \mid Y) = -\sum_{x \in X} \sum_{y \in Y} p(x,y) \log p(x \mid y) \qquad (8)$$

The mutual information feature selection (MIFS) algorithm described in [10] is applied to perform the feature selection. In this approach, starting from the empty set, the best available feature vectors are added one by one to the selected feature set until the size of the set reaches the desired value N_S. The subset S of feature vectors is selected by simultaneously maximizing the mutual information between the selected feature vectors in S and the class labels C, and minimizing the mutual information between the selected feature vectors within S:

$$I(C; f_i \mid S) = I(C; f_i) - \beta \sum_{s_k \in S} I(f_i; s_k) \qquad (9)$$

$$C = \arg\max_{c} \Big\{ p(c) \prod_{j=1}^{k} p(f_j \mid c) \Big\} \qquad (10)$$

where p(c) = (number of samples in class c) / (total number of samples), p(f_j | c) are the conditional tables (or conditional densities) learned in training from examples, and k is the length of the feature vector. Despite the independence assumption, NB has been shown to have very good classification performance for many real data sets, on par with many more sophisticated classifiers.

VII. EXPERIMENTS AND RESULTS

In this study, we used the JAFFE (JF) and Cohn-Kanade (C-K) databases to train and test the facial expression recognition system. Each test was performed 3 times using randomly selected testing and training sets, and an average result was calculated. Training was done for six expressions (C=6). The subjects represented in the training set were not included in the testing set of images, thus ensuring a person-independent classification of facial expressions. Automatic face detection, facial detection, and face region detection were used, and the faces were also scaled. The tested images were classified using log Gabor filters for feature extraction and the naïve Bayesian classifier.

We extracted the features for different scales and orientations and tested them using the naïve Bayesian classifier to choose the best scale and orientation for the log Gabor filters. Figure 8 illustrates the recognition rate for five different scales and eight orientations using the C-K database. As a result, we chose the log Gabor filters with 3 scales and 6 orientations, which gave the maximum accuracy, for our experiments.
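The greedy selection of Section V (Equations 5 and 9) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of [10] as implemented by the authors: the mutual information is estimated from histograms of features quantised into equal-width bins, and the function names, `beta`, and `n_bins` are hypothetical.

```python
import numpy as np

def mutual_information(x, y):
    """Histogram estimate of I(X;Y) in nats for two discrete 1-D arrays (Eq. 5)."""
    x_vals, x_idx = np.unique(np.asarray(x).ravel(), return_inverse=True)
    y_vals, y_idx = np.unique(np.asarray(y).ravel(), return_inverse=True)
    joint = np.zeros((x_vals.size, y_vals.size))
    np.add.at(joint, (x_idx, y_idx), 1.0)        # joint counts
    joint /= joint.sum()                         # joint probabilities p(x, y)
    px = joint.sum(axis=1, keepdims=True)        # marginal p(x)
    py = joint.sum(axis=0, keepdims=True)        # marginal p(y)
    nz = joint > 0                               # skip empty cells (0 log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def mifs(features, labels, n_select, beta=0.5, n_bins=8):
    """Greedy MIFS (Eq. 9); features has shape (n_samples, n_features)."""
    # Assumption: continuous features are quantised into equal-width bins.
    binned = np.empty(features.shape, dtype=int)
    for j in range(features.shape[1]):
        edges = np.histogram_bin_edges(features[:, j], bins=n_bins)
        binned[:, j] = np.digitize(features[:, j], edges[1:-1])
    # I(C; f_i) for every candidate feature
    relevance = np.array([mutual_information(binned[:, j], labels)
                          for j in range(binned.shape[1])])
    redundancy = np.zeros(binned.shape[1])       # running sum of I(f_i; s_k)
    selected, remaining = [], list(range(binned.shape[1]))
    while remaining and len(selected) < n_select:
        scores = relevance[remaining] - beta * redundancy[remaining]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
        for j in remaining:
            redundancy[j] += mutual_information(binned[:, j], binned[:, best])
    return selected
```

Keeping a running redundancy sum means each pairwise mutual information is computed only once, so the cost grows with the number of selected features times the number of candidates.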
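The decision rule of Equation 10 can be sketched with conditional tables over discretised features. The sketch makes several assumptions that are not taken from the paper: features are already quantised to integers in the range 0 to `n_values - 1`, the tables use Laplace smoothing (`alpha`), and the product is evaluated as a sum of logarithms to avoid numerical underflow for long feature vectors. The class name is hypothetical.

```python
import numpy as np

class DiscreteNaiveBayes:
    """Naive Bayes over integer-coded features (illustrative sketch of Eq. 10)."""

    def fit(self, X, y, n_values, alpha=1.0):
        X, y = np.asarray(X), np.asarray(y)
        self.classes_ = np.unique(y)
        # log p(c): fraction of training samples in each class
        self.log_prior_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        # counts[c, j, v] = alpha + number of class-c samples with f_j = v
        counts = np.full((self.classes_.size, X.shape[1], n_values), alpha)
        for ci, c in enumerate(self.classes_):
            Xc = X[y == c]
            for j in range(X.shape[1]):
                counts[ci, j] += np.bincount(Xc[:, j], minlength=n_values)
        # log p(f_j = v | c), normalised over the values v
        self.log_table_ = np.log(counts / counts.sum(axis=2, keepdims=True))
        return self

    def predict(self, X):
        X = np.asarray(X)
        cols = np.arange(X.shape[1])
        # scores[i, c] = log p(c) + sum_j log p(f_j | c)
        per_class = [self.log_table_[ci][cols, X].sum(axis=1)
                     for ci in range(self.classes_.size)]
        scores = self.log_prior_ + np.stack(per_class, axis=1)
        return self.classes_[np.argmax(scores, axis=1)]
```

Taking the argmax of the log score selects the same class as the argmax of the product in Equation 10, because the logarithm is monotonic.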