Face Recognition Using Wavelet PCA and Neural Networks

Proceedings of the First International Conference on Modeling, Simulation and Applied Optimization, Sharjah, U.A.E., February 1-3, 2005

ABSTRACT

This work presents a method to increase face recognition accuracy using a combination of wavelet transform, PCA, and neural networks. Preprocessing, feature extraction, and classification rules are three crucial issues for face recognition, and this paper presents a hybrid approach that addresses all three. For the preprocessing and feature extraction steps, we apply a combination of the wavelet transform and PCA. During the classification stage, a multilayer perceptron (MLP) neural network is employed to achieve a robust decision in the presence of wide facial variations; we also evaluated an RBF neural network, but the results show that the MLP outperforms the RBF. The computational load of the proposed method is greatly reduced compared with the original PCA-based method, and its accuracy is improved.

1. INTRODUCTION

Over the past few years, user authentication has become increasingly important because security control is required everywhere. Traditionally, ID cards and passwords have been popular for authentication, although they are neither very reliable nor convenient. Recently, biometric authentication technologies based on voice, iris, fingerprint, palm print, face, etc. have been playing a crucial role and attracting intense interest from many researchers. Among them, face recognition is an attractive alternative because authentication can be completed hands-free, without interrupting user activities. A face recognition system is also economical, given the low cost of cameras and computers, and is widely applicable to identity authentication, access control, surveillance, etc. Over the past 20 years, extensive research on various aspects of face recognition by humans and machines [1, 2, 12, 20, 21, 22, 23, 28] has been conducted by psychophysicists, neuroscientists, and engineering scientists. Psychophysicists and neuroscientists have studied issues such as the uniqueness of faces, how infants perceive faces, and the organization of face memory, while engineering scientists have designed and developed face recognition algorithms. This paper continues the engineering line of work on face recognition by machine.

Automatic face recognition by computer can be divided into two approaches [1, 2], namely content-based and face-based. In the content-based approach, recognition is based on the relationships between human facial features such as the eyes, mouth, nose, profile silhouettes, and face boundary [3, 4, 5, 6]. The success of this approach relies heavily on accurate feature extraction, which is difficult: every human face has similar facial features, so a small deviation in the extraction may introduce a large classification error.

The face-based approach [7, 8, 5, 9] attempts to capture and define the face as a whole. The face is treated as a two-dimensional pattern of intensity variation, and faces are matched by identifying their underlying statistical regularities. Principal Component Analysis (PCA) [7, 8, 10, 20, 24] has proven to be an effective face-based approach. Sirovich and Kirby [10] first proposed using the Karhunen-Loeve (KL) transform to represent human faces; in their method, faces are represented by a linear combination of weighted eigenvectors, known as eigenfaces. Turk and Pentland [8] developed a face recognition system using PCA.

However, common PCA-based methods suffer from two limitations, namely poor discriminatory power and a large computational load. It is well known that PCA gives a very good representation of faces: given two images of the same person, the similarity measured under the PCA representation is very high. Yet, given two images of different persons, the measured similarity is still high, which means that the PCA representation has poor discriminatory power. Swets and Weng [11] also observed this drawback of the PCA approach and improved the discriminability of PCA by adding Linear Discriminant Analysis (LDA); but to get a precise result, a large number of samples per class is required. On the other hand, O'Toole et al. [12] proposed a different approach for selecting the eigenfaces. They pointed out that the eigenvectors with the largest eigenvalues are not the best for distinguishing face images, and demonstrated that although a low-dimensional representation is not optimal for recognizing a human face, it gives good results in identifying physical categories of faces, such as gender and race. However, O'Toole et al. did not address in depth the criteria for selecting eigenvectors for recognition.

The second problem with PCA-based methods is the high computational load of finding the eigenvectors. The computational complexity is O(d^2), where d is the number of pixels in the training images, with a typical value of 128x128. This computational cost is beyond the power of most existing computers. Fortunately, from matrix theory we know that if the number of training images N is smaller than d, the computational complexity is reduced to O(N^2). Yet if N increases, the computational load still grows in cubic order.

In view of the limitations of the existing PCA-based approach, we propose a new way of using PCA: applying PCA on a wavelet subband for feature extraction. In the proposed method, an image is decomposed into a number of subbands with different frequency components using the wavelet transform. The results in Table 1 show that three-level wavelet decomposition gives good performance in face recognition. A mid-range frequency subband image with resolution 16x16 is selected to compute the representational bases. The proposed method thus works on a lower resolution, 16x16, instead of the original image resolution of 128x128. It therefore reduces the computational complexity significantly whenever the number of training images is larger than 16x16, which is expected to be the case in many real-world applications. Moreover, experimental results demonstrate that applying PCA on the WT sub-image with mid-range frequency improves the recognition accuracy.

2. PCA

Let X = {X_n, n = 1, ..., N} be the ensemble of training-image vectors. The ensemble mean is

    E(X) = (1/N) Σ_{n=1}^{N} X_n                               (1)

After subtracting the average from each element of X, we get a modified (mean-centred) ensemble of vectors X' = {X'_n, n = 1, ..., N} with

    X'_n = X_n − E(X)                                          (2)

The auto-covariance matrix M for the ensemble X' is defined by

    M = cov(X') = E(X' ⊗ X')                                   (3)

where M is a d^2 x d^2 matrix with elements

    M(i, j) = (1/N) Σ_n X'_n(i) X'_n(j),  1 ≤ i, j ≤ d^2       (4)

It is well known from matrix theory that the matrix M is positive definite (or semi-definite) and has only real non-negative eigenvalues [13]. The eigenvectors of M form an orthonormal basis for R^{d x d}, called the K-L basis. Since the auto-covariance matrix of the K-L eigenvectors is diagonal, the coordinates of the vectors in the K-L basis are uncorrelated.
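The eigenface computation of Eqs. (1)-(4), together with the N x N reduction used when N is smaller than the number of pixels, can be sketched as follows. This is a minimal NumPy sketch under our own naming (`eigenfaces`, `project`), not the paper's code:

```python
import numpy as np

def eigenfaces(images, k):
    """Compute k eigenfaces from a stack of images of shape (N, d, d).

    Implements Eqs. (1)-(4): mean subtraction and the auto-covariance
    matrix, using the standard N x N trick so the d^2 x d^2 matrix M is
    never formed explicitly.
    """
    N = images.shape[0]
    X = images.reshape(N, -1).astype(float)   # row-concatenated vectors
    mean = X.mean(axis=0)                     # E(X), Eq. (1)
    Xc = X - mean                             # centred ensemble, Eq. (2)

    # M = (1/N) Xc^T Xc is d^2 x d^2 (Eqs. (3)-(4)); instead diagonalize
    # the small N x N matrix (1/N) Xc Xc^T, which shares its nonzero
    # eigenvalues. If v is its eigenvector, Xc^T v is an eigenvector of M.
    S = Xc @ Xc.T / N
    vals, vecs = np.linalg.eigh(S)            # ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]        # keep the k largest
    basis = Xc.T @ vecs[:, order]             # lift back to pixel space
    basis /= np.linalg.norm(basis, axis=0)    # orthonormal K-L basis
    return mean, basis                        # basis shape: (d*d, k)

def project(image, mean, basis):
    """Feature vector: weights of the image on the eigenfaces."""
    return basis.T @ (image.reshape(-1) - mean)
```

Recognition then amounts to comparing these weight vectors between unknown and reference faces.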
2.2. Eigenfaces

Basically, an eigenface is an eigenvector obtained from PCA. In face recognition, each training image is transformed into a vector by row concatenation, and the covariance matrix is constructed from the set of training images. This idea was first proposed by Sirovich and Kirby [10]; later, Turk and Pentland [8] developed a face recognition system using PCA. The significant features (eigenvectors associated with large eigenvalues) are called eigenfaces. The projection operation characterizes a face image by a weighted sum of eigenfaces, and recognition is performed by comparing the weights of each eigenface between the unknown and reference faces.

2.3. Limitation

PCA has been widely adopted in human face recognition and face detection since 1987. In spite of its popularity, however, PCA suffers from two major limitations: poor discriminatory power and a large computational load. It is well known that PCA gives a very good approximation of a face image; however, in eigenspace, each class is closely packed. Moghaddam et al. [14] plotted the three largest eigen coefficients of each class and found that they overlap each other, which shows that PCA has poor discriminatory power.

3. MLP Neural Network

The multilayer perceptron (MLP) neural network is a good tool for classification [15, 16]. Neural networks (NN), and the multilayer perceptron in particular, are very fast means of classifying complex objects and can approximate almost any regularity between input and output. The NN weights are adjusted by a supervised training procedure called back propagation (BP). Back propagation is a kind of gradient descent method, which searches for an acceptable local minimum in the NN weight space in order to achieve minimal error, where the error is defined as the root mean square of the difference between the real and desired outputs of the NN. During the training procedure, the MLP builds separation hypersurfaces in the input space. After training, the MLP can successfully apply the acquired skills to previously unseen samples; it has good extrapolative and interpolative abilities.

A typical architecture has a number of layers following one another [15, 16]. An MLP with one layer can build linear hypersurfaces, an MLP with two layers can build convex hypersurfaces, and an MLP with three layers can build hypersurfaces of any shape. We consider the chain of MLP neurons shown in Figure 1.

Figure 1: Scheme of chain of nodes considered in the back propagation algorithm.

A neuron is the basic element of any artificial neural network (ANN). It works as:

    h_j = Σ_k w_jk^(i) x_k                                     (5)

where x_k are the input signals and w_jk^(i) are the weights of the synaptic connections between neurons of layers i and i + 1. The output signal of the j-th neuron is y_j = g(h_j), where the activation function g(x) is either a threshold function or a sigmoid-type function like

    g(x) = 1 / (1 + e^(-x))                                    (6)

In the case of a threshold function and, say, two classes, the perceptron assigns the vector x_i to the first class if Σ_j w_ij^(2) h_j ≥ 0, and to the second class otherwise. Such a scheme admits the following geometric interpretation: the hyperplane given by the equation Σ_j w_ij^(2) h_j = 0 divides the space into two halfspaces corresponding to the classes in question. If there are more than two classes, several dividing hyperplanes are defined during the training process. To an input vector of classified features X_i, the MLP assigns a corresponding output vector Y_i. The transformation X_i ⇒ Y_i is completely described by the matrix of synaptic weights, which is found as the solution of a concrete problem. Given a training sample as a set of pairs of vectors {{X_i^(m)}, {Z_i^(m)}}, MLP training is accomplished by minimizing the so-called energy function

    E = Σ_m Σ_i (Y_i^(m) − Z_i^(m))^2 ⇒ min                    (7)

with the weights w_jk^(i) as the minimization parameters. This error back propagation (EBP) method is usually realized by gradient descent. The number of units (neurons) in the input layer equals the number of image pixels, the number of units in the hidden layer is unknown and is determined by trial and error (see Table 2), and the number of output units equals the number of classes (the number of different persons in the database).

4. Proposed Method

A wavelet-based PCA method is developed to overcome the limitations of the original PCA method; furthermore, we utilize a neural network to carry out the classification of faces. We adopt a multilayer perceptron architecture fed by the reduced input units, i.e., the feature vectors generated by the combination of wavelet transform and PCA. The utilized MLP consists of one hidden layer of 25 units and 15 output units (see Table 2). To solve the first problem of PCA, we propose using a particular frequency band of the face image for PCA; the second limitation is dealt with by using a lower resolution image. The combination of the new wavelet-based PCA and the neural network is illustrated in Figure 2.
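As a concrete illustration of Eqs. (5)-(7), the sketch below implements a single-hidden-layer MLP with sigmoid activations trained by gradient descent on the squared-error energy. It is a minimal sketch under our own naming (`MLP`, `train_step`); biases and the actual 25:15 topology of the proposed system are omitted for brevity:

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation, Eq. (6): g(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + np.exp(-x))

class MLP:
    """Single-hidden-layer perceptron trained by back propagation,
    minimizing the energy E = sum_m sum_i (Y_i - Z_i)^2 of Eq. (7)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (n_hidden, n_in))
        self.W2 = rng.normal(0.0, 0.5, (n_out, n_hidden))

    def forward(self, x):
        # Eq. (5): h_j = sum_k w_jk x_k, then y_j = g(h_j)
        self.h = sigmoid(self.W1 @ x)
        self.y = sigmoid(self.W2 @ self.h)
        return self.y

    def train_step(self, x, z, lr=0.5):
        """One gradient-descent update on the energy for pattern (x, z)."""
        y = self.forward(x)
        # delta rule through the sigmoids (constant factor absorbed in lr)
        d2 = (y - z) * y * (1.0 - y)
        d1 = (self.W2.T @ d2) * self.h * (1.0 - self.h)
        self.W2 -= lr * np.outer(d2, self.h)
        self.W1 -= lr * np.outer(d1, x)
        return float(np.sum((y - z) ** 2))
```

In the proposed system, the input layer would hold the wavelet-PCA feature vector, the hidden layer is sized by trial and error (Table 2), and each output unit corresponds to one person.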
In the recognition stage, the unknown image is transformed according to the representational basis identified in the training stage.

There are three significant steps in the training stage. In the first step, the wavelet transform (WT) is applied to decompose the reference images, and the 16x16-pixel sub-images obtained by three-level wavelet decomposition are selected. In the next step, Principal Component Analysis (PCA) is performed to obtain the eigenvalues and the sub-space projection. Finally, the feature vectors of the reference images obtained in the previous steps are used to train the neural network using the back propagation algorithm. Processing in the recognition stage is similar to the training stage, except that the recognition stage also incorporates steps to match the unknown input images with the reference images in the database by the neural network. When an unknown face image is presented to the recognition stage, WT and PCA are applied to transform it into the representational basis identified in the training stage, and the classification is achieved by the trained neural network.

Table 2. Percentage of correct classification on test set.

Wavelet Principal Component | Net Topology | Best  | Average
1-15                        | 15:25:15     | 88.37 | 86.56
1-25                        | 25:30:15     | 90.35 | 89.23
1-35                        | 35:30:15     | 89.78 | 87.24
1-45                        | 45:25:15     | 88.92 | 87.67

The wavelet transform is used for image decomposition because:

• By decomposing an image using WT, the resolution of the sub-images is reduced; in turn, the computational complexity is reduced dramatically by operating on a lower resolution image. Harmon [17] demonstrated that an image with resolution 16x16 is sufficient for recognizing a human face. Compared with the original image resolution of 128x128, the size of the sub-image is reduced by a factor of 64.

• Under WT, images are decomposed into subbands corresponding to different frequency ranges. These subbands readily meet the input requirements of the next major step, and thus minimize the computational overhead of the proposed system.

• Wavelet decomposition provides local information in both the space domain and the frequency domain, while the Fourier decomposition supports only global information in the frequency domain.

Throughout this paper (see Table 3), the well-known Daubechies wavelet D4 [18, 19] is adopted; its four coefficients are:

    h0 =  0.48296291314453
    h1 =  0.83651630373781
    h2 =  0.22414386804201
    h3 = -0.12940952255126

An image is decomposed into four subbands as shown in Figure 3. The band LL is a coarser approximation of the original image, while the bands LH and HL record the changes of the image along the horizontal and vertical directions, respectively.
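A one-level decomposition with the D4 coefficients above can be sketched as follows. This is a minimal NumPy sketch assuming periodic boundary handling (the paper does not state its border treatment), with our own function names:

```python
import numpy as np

# Daubechies D4 filter taps from the text (note the negative sign on h3)
H = np.array([0.48296291314453, 0.83651630373781,
              0.22414386804201, -0.12940952255126])   # low-pass
G = np.array([H[3], -H[2], H[1], -H[0]])              # matching high-pass

def dwt1d(rows, f):
    """Filter every row with f (periodic extension), then downsample by 2."""
    out = np.zeros((rows.shape[0], rows.shape[1] // 2))
    for k in range(len(f)):
        out += f[k] * np.roll(rows, -k, axis=1)[:, ::2]
    return out

def dwt2d(img):
    """One decomposition level: LL, LH, HL, HH subbands at half resolution."""
    lo, hi = dwt1d(img, H), dwt1d(img, G)         # filter along rows
    LL, LH = dwt1d(lo.T, H).T, dwt1d(lo.T, G).T   # then along columns
    HL, HH = dwt1d(hi.T, H).T, dwt1d(hi.T, G).T
    return LL, LH, HL, HH
```

Applying `dwt2d` three times to the LL band of a 128x128 image yields the 16x16 subbands used by the proposed method.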
5. Experimental Result

To evaluate the performance of the proposed method, we used the face-image database of Yale University [29]. This database consists of a total of 165 images (15 persons, males and females, with 11 images per person). The images vary in illumination and facial expression, and some subjects wear glasses. All of the images have a resolution of 160x121. Since these dimensions are not powers of 2, the wavelet transform cannot be applied effectively; to solve this problem, we crop the images to 91x91 and then resize them to 128x128. We use the leaving-one-out strategy [30] for our algorithm: 164 images are used for training and the remaining image for recognition. We applied this strategy 150 times for the training and recognition of all images.

In this work, the resolution of the images is reduced from 128x128 to 16x16 using the third level of wavelet decomposition. Briefly, the following steps are used for recognition with dimensionality reduction:

• Step 1. The combination of wavelet transform and PCA is used to reduce the input space.
• Step 2. The image vectors are normalized.
• Step 3. The training set contains 8 samples per subject.
• Step 4. The testing set contains 3 samples per subject, which were not introduced in the training phase.
• Step 5. The MLP neural network structure with the reduced input units (15), one hidden layer of 25 units, and 15 output units is used.

Table 3. Training time, testing time, and recognition rate for different wavelets.

Wavelet                           | Training Time (s) | Testing Time (s) | Recognition Rate (%) | Size of Subband Image
Daubechies wavelet Daub(2)        | 221               | 5.00             | 79.2                 | 16x16
Daubechies wavelet Daub(4)        | 242               | 5.16             | 84.4                 | 16x16
Daubechies wavelet Daub(6)        | 289               | 5.32             | 84.5                 | 16x16
Daubechies wavelet Daub(8)        | 313               | 5.78             | 83.9                 | 16x16
Biorthogonal wavelet Wspline(4,4) | 351               | 5.26             | 82.1                 | 16x16
Battle-Lemarie wavelet Lemarie(4) | 245               | 5.13             | 83.6                 | 16x16

The results (listed in Table 4) show that the combination of wavelet transform and PCA outperforms PCA, DWT, and DCT. They also show that the MLP outperforms RBF neural networks [27], NN [25], and NFL [26].

Table 4. Performance comparison of recognition rate.

Classifier Type             | Coefficients Type | Recognition Rate (%)
MLP/BP NN (proposed method) | WT+PCA            | 90.35
MLP/BP NN                   | WT                | 89.45
MLP/BP NN                   | LDA               | 89.24
MLP/BP NN                   | PCA               | 81.27
Nearest Feature Line (NFL)  | WT                | 89.95
Nearest Feature Space (NFS) | WT                | 90.20
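The leaving-one-out evaluation described above can be sketched generically as follows. All names here are ours, and `nn_train`/`nn_classify` are a hypothetical 1-nearest-neighbour stand-in for the trained MLP classifier:

```python
import numpy as np

def leave_one_out(features, labels, train_fn, classify_fn):
    """Leaving-one-out strategy [30]: hold out each sample in turn,
    train on the rest, classify the held-out sample, and report the
    overall recognition rate."""
    n = len(labels)
    correct = 0
    for i in range(n):
        keep = np.arange(n) != i
        model = train_fn(features[keep], labels[keep])
        correct += int(classify_fn(model, features[i]) == labels[i])
    return correct / n

# Hypothetical stand-in classifier: 1-nearest-neighbour on feature vectors.
def nn_train(F, y):
    return F, y

def nn_classify(model, f):
    F, y = model
    return y[np.argmin(np.linalg.norm(F - f, axis=1))]
```

With 165 Yale images, each pass trains on 164 feature vectors and classifies the held-out one, as in the experiment above.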
6. Conclusions

This paper presents a hybrid approach for face recognition that handles three issues together. For the preprocessing and feature extraction stages, we apply a combination of the wavelet transform and PCA; during the classification phase, an MLP neural network is employed for robust decisions in the presence of wide facial variations. The experiments conducted on the Yale database show that the combination of wavelet transform, PCA, and MLP exhibits the most favorable performance, having the lowest overall training time, the least redundant data, and the highest recognition rate compared with similar previously introduced methods.

Compared with existing hybrid methods, our proposed method enjoys a low computational load in both the training and recognition stages; its high precision is another advantage.

7. References

[1]. R. Chellappa, C. L. Wilson and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, Vol. 83, No. 5, 705-740, May 1995.
[2]. G. Chow and X. Li, "Towards a system for automatic facial feature detection," Pattern Recognition, Vol. 26, No. 12, 1739-1755, 1993.
[3]. F. Goudail, E. Lange, T. Iwamoto, K. Kyuma and N. Otsu, "Face recognition system using local autocorrelations and multiscale integration," IEEE Trans. PAMI, Vol. 18, No. 10, 1024-1028, 1996.
[4]. K. M. Lam and H. Yan, "Locating and extracting the eye in human face images," Pattern Recognition, Vol. 29, No. 5, 771-779, 1996.
[5]. D. Valentin, H. Abdi, A. J. O'Toole and G. W. Cottrell, "Connectionist models of face processing: A survey," Pattern Recognition, Vol. 27, 1209-1230, 1994.
[6]. A. L. Yuille, P. W. Hallinan and D. S. Cohen, "Feature extraction from faces using deformable templates," Int. J. of Computer Vision, Vol. 8, No. 2, 99-111, 1992.
[7]. M. Kirby and L. Sirovich, "Application of the Karhunen-Loeve procedure for the characterization of human faces," IEEE Trans. PAMI, Vol. 12, 103-108, 1990.
[8]. M. Turk and A. Pentland, "Eigenfaces for recognition," J. Cognitive Neuroscience, Vol. 3, 71-86, 1991.
[9]. M. V. Wickerhauser, "Large-rank approximate component analysis with wavelets for signal feature discrimination and the inversion of complicated maps," J. Chemical Information and Computer Sciences, Vol. 34, No. 5, 1036-1046, 1994.
[10]. L. Sirovich and M. Kirby, "Low-dimensional procedure for the characterization of human faces," J. Opt. Soc. Am. A, Vol. 4, No. 3, 519-524, 1987.
[11]. D. L. Swets and J. J. Weng, "Using discriminant eigenfeatures for image retrieval," IEEE Trans. PAMI, Vol. 18, No. 8, 831-836, 1996.
[12]. A. J. O'Toole, H. Abdi, K. A. Deffenbacher and D. Valentin, "A low-dimensional representation of faces in the higher dimensions of the space," J. Opt. Soc. Am. A, Vol. 10, 405-411, 1993.
[13]. A. K. Jain, "Fundamentals of Digital Image Processing," pp. 163-175, Prentice Hall, 1989.
[14]. B. Moghaddam, W. Wahid and A. Pentland, "Beyond eigenfaces: Probabilistic matching for face recognition," Proceedings of Face and Gesture Recognition, pp. 30-35, 1998.
[15]. A. I. Wasserman, "Neural Computing: Theory and Practice," New York: Van Nostrand Reinhold, 1989.
[16]. V. Golovko and V. Gladyschuk, "Recirculation neural network training for image processing," Advanced Computer Systems, pp. 73-78, 1999.
[17]. L. Harmon, "The recognition of faces," Scientific American, Vol. 229, 71-82, 1973.
[18]. I. Daubechies, "Ten Lectures on Wavelets," CBMS-NSF Series in Applied Mathematics, Vol. 61, SIAM Press, Philadelphia, 1992.
[19]. I. Daubechies, "The wavelet transform, time-frequency localization and signal analysis," IEEE Trans. Information Theory, Vol. 36, No. 5, 961-1005, 1990.
[20]. A. Pentland, B. Moghaddam and T. Starner, "View-based and modular eigenspaces for face recognition," Proc. IEEE Conf. Computer Vision and Pattern Recognition, Seattle, June 1994, 84-91.
[21]. H. A. Rowley, S. Baluja and T. Kanade, "Neural network-based face detection," IEEE Transactions on PAMI, Vol. 20, No. 1, 23-38, 1998.
[22]. E. M.-Tzanakou, E. Uyeda, R. Ray, A. Sharma, R. Ramanujan and J. Dong, "Comparison of neural network algorithms for face recognition," Simulation, Vol. 64, No. 1, 15-27, 1995.
[23]. D. Valentin, H. Abdi and A. J. O'Toole, "Principal component and neural network analyses of face images: Explorations into the nature of information available for classifying faces by sex," in C. Dowling, F. S. Roberts and P. Theuns (Eds.), Progress in Mathematical Psychology, Hillsdale: Erlbaum (in press, 1996).
[24]. K. Fukunaga, "Introduction to Statistical Pattern Recognition," Academic Press.
[25]. T. M. Cover and P. E. Hart, "Nearest neighbor pattern classification," IEEE Trans. Information Theory, Vol. 13, pp. 21-27, Jan. 1967.
[26]. S. Z. Li and J. Lu, "Face recognition using the nearest feature line method," IEEE Trans. Neural Networks, Vol. 10, No. 2, pp. 439-443, Mar. 1999.
[27]. A. Jonathan Howell and H. Buxton, "Face recognition using radial basis function neural networks," Proceedings of the British Machine Vision Conference, pages 445-464, Edinburgh, 1996, BMVA Press.
[28]. Y. Meyer, "Wavelets: Algorithms and Applications," SIAM Press, Philadelphia, 1993.
[29]. Yale face database: http://cvc.yale.edu/projects/yalefaces/yalefaces.html
[30]. R. Duda and P. Hart, "Pattern Classification and Scene Analysis," Wiley, New York, 1973.