Design and Implementation of an FPGA-based Realtime Face Recognition System Using Spatial Correlation Function
Abstract—This paper presents a simple and efficient design of a face recognition system in which a feature extraction algorithm based on the principle of spatial cross-correlation is employed. In the feature extraction process, instead of processing the entire image at a time, only a pair of rows or columns of the image is considered, which makes the algorithm very efficient and low-cost. Considering the cross-correlations between the pairs, a unique 1-D signature of a 2-D face image is obtained which represents the variation in the face geometry along the vertical or horizontal direction. It is shown that the resulting vertical and horizontal features provide high compactness within a class and separation between classes. It is found that the proposed design can provide a satisfactory recognition performance for different standard databases. The results of the hardware implementation in terms of resources used and processing speed are also presented.

Keywords—Cross-Correlation, Spatial feature extraction, Face geometry, Classification, Face recognition, Verilog, FPGA

I. INTRODUCTION

Facial appearance serves as a natural and effective means for recognition of a person by another. There are two main biometric techniques: intrusive and nonintrusive. Face recognition systems that employ a nonintrusive biometric approach are increasingly in demand for defense, security and commercial applications, as they protect both safety and privacy during the process [1]-[3]. Such systems generate a large amount of raw data in real time, so a significant amount of processing power is a prerequisite for this sort of application. In addition, complex feature extraction algorithms make the hardware more complex and expensive. Several methods have been proposed for the implementation of face recognition systems, such as subspace analysis methods [4]-[5], Linear Discriminant Analysis (LDA), 3D modeling [7], Hidden Markov models [8], Bayesian analysis [9] and elastic graph matching [10].

Face recognition is a high dimensional pattern recognition problem, and the intrinsic dimensionality of the face space is much lower than the dimensionality of the raw image space. This fact is the starting point of the use of subspace methods such as Principal Component Analysis and Independent Component Analysis, which are considered to be very successful face recognition algorithms. However, they suffer from two limitations. First, the discriminative power of these methods decreases when the size of the training images increases [11]. Second, the computational load and memory requirement for calculating eigenvectors increase dramatically for large databases [11]. Although elastic graph matching is a precise face recognition approach, its computation is time consuming and usually cannot satisfy practical demands [10]. In the 3D modeling approach, existing sensors are not ideally suited for 3D based person identification, as they are not dynamic and often cannot provide reliable data from incompliant subjects, wearing glasses, in arbitrary conditions. Although 3D data is not subject to changes in illumination, it is affected by other artifacts such as changes in expression and holes in the data caused by imaging effects [7]. Linear Discriminant Analysis (LDA) has been widely used for face recognition [12]; however, being a linear technique, it may not perform well when severe non-linearity is involved. Hidden Markov model based face recognition leads to a large dimension of the observation vectors and hence to a high computational complexity of the training and detection/recognition systems [8]. Bayesian subspace analysis has been successfully applied to face recognition, but the direct application of the algorithm is much more computationally intensive [9]. All these methods achieve impressive recognition accuracy at the cost of computationally complex hardware designs. Researchers are therefore still looking forward to designing a system with high recognition accuracy and a cost-effective hardware design. This is the main motivation behind our current research work.

In this paper, we intend to adopt the holistic approach for precisely capturing the variations in the whole image. In order to capture variations in face geometry along the vertical and horizontal directions, a cross-correlation operation is performed on pairs of consecutive rows and columns of the image data. A simple minimum distance based Euclidean classifier is used for the template matching process. The proposed feature extraction method is computationally efficient and cost effective for real-life implementation.
II. DERIVATION OF PROPOSED ALGORITHM AND CLASSIFIER

A. Cross-correlation based signature:

Face geometry varies in different regions of a human face. Compared to the entire face image, certain portions such as the eyes, nose and lips carry more information. These high informative zones can be used to identify a person due to the existence of low similarity in these zones compared to other zones [13]. Hence the idea introduced here is to exploit the discriminative features of the high informative zones of a face image for pattern classification. One effective way to measure the similarity of two vectors is the cross-correlation. For two given finite dimensional real sequences x[n], n = 0, 1, 2, 3, ..., N and y[n], n = 0, 1, 2, 3, ..., M with N >= M, the m-th lag of the cross-correlation can be estimated as

p(m) = \frac{1}{N} \sum_{n=0}^{N-|m|} x[n+|m|]\, y[n], \quad m \ge 0   (1)

where m >= 0. As an image can be treated as a 2-D array of pixel intensity values, each row and column can be treated separately as a sequence. In order to measure the similarity in a portion of the image, one may compute the cross-correlation between two rows of the image using equation (1), considering x[n] and y[n] as the arrays of pixel intensities. One simple way to comment on the similarity of two sequences is to consider the zero-lag value of the cross-correlation, i.e., p(0). Use of lags other than the zero lag may be avoided, as it unnecessarily increases the computational burden without providing any significant benefit in improving feature quality [14]. If two rows are considered for cross-correlation and the similarity between them is very low (e.g., rows from the forehead), the cross-correlation coefficient at the zero lag would be high in magnitude, and vice versa. Thus, the value of the cross-correlation coefficient at the zero lag between two rows varies in accordance with the degree of similarity between these two rows. Considering the zero-lag cross-correlation values of consecutive pairs of rows, the vertical feature of a face image is extracted, and similarly the horizontal feature is extracted considering consecutive pairs of columns. Concatenation of these two features provides the proposed concatenated feature vector, which is ultimately used for face recognition. Only a matrix transpose, a multiplier, a sequential adder and a divider are enough to implement the feature extraction algorithm. As the matrix is considered as a whole, a matrix transpose module is needed.
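As an illustration, a minimal software sketch of this feature extraction step is given below. It assumes 8-bit grayscale images held as NumPy arrays; the function name is our own, and the normalization by the row/column length is our reading of equation (1) at zero lag, consistent with the 92 and 112 divisors described later for the divider module.

```python
import numpy as np

def correlation_signature(img):
    """Zero-lag cross-correlation signature of a grayscale face image.

    For every pair of consecutive rows, the zero-lag cross-correlation
    (a length-normalized dot product) gives one sample of the vertical
    feature; the same operation on consecutive columns gives the
    horizontal feature. The two are concatenated into a 1-D signature.
    """
    img = img.astype(np.float64)
    rows, cols = img.shape

    # Vertical feature: one value per consecutive pair of rows.
    vertical = np.array([np.dot(img[r], img[r + 1]) / cols
                         for r in range(rows - 1)])

    # Horizontal feature: transpose and repeat, i.e. consecutive
    # pairs of columns of the original image.
    t = img.T
    horizontal = np.array([np.dot(t[c], t[c + 1]) / rows
                           for c in range(cols - 1)])

    return np.concatenate([vertical, horizontal])
```

Under these assumptions, a 112x92 ORL image yields a signature of length (112-1) + (92-1) = 202 values, far smaller than the 10,304 raw pixels.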
B. Classifier:

In order to test the recognition accuracy of the proposed extracted feature, different classifiers can be used. However, in view of keeping the overall computational complexity very low, a distance-based classifier is employed. In the training phase, the feature template is generated using the concatenated feature, which is obtained from the zero-lag value of the cross-correlation function. Then the template matching process is carried out using the distance-based Euclidean classifier. If the feature vector of the test image of a person is F_t and the feature vector of person j in pose k from the training image database is F_{j,k}, where the length of each feature vector is L, then the Euclidean minimum-distance based error between these two vectors is defined as

e_{j,k} = \sum_{i=1}^{L} \big( F_t(i) - F_{j,k}(i) \big)^2   (2)

In the face recognition system, the classifier is designed using a sequential adder and a multiplier.
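A minimal sketch of this matching step is shown below; the dictionary-based template store and the function name are illustrative choices, not the paper's hardware structure. The identity whose template gives the smallest accumulated squared difference, per equation (2), is reported.

```python
import numpy as np

def classify(test_feature, templates):
    """Minimum-distance classification against stored feature templates.

    `templates` maps a person identifier to a list of training feature
    vectors (one per pose). The squared-error sum of equation (2) is
    evaluated for every template and the identity of the closest
    template is returned together with its error.
    """
    best_id, best_err = None, np.inf
    for person, vectors in templates.items():
        for v in vectors:
            err = np.sum((test_feature - v) ** 2)  # equation (2)
            if err < best_err:
                best_id, best_err = person, err
    return best_id, best_err
```

Because the square root is monotonic, comparing squared errors gives the same decision as comparing true Euclidean distances, which is why the hardware classifier only needs an adder and a multiplier.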
III. HARDWARE ARCHITECTURE

The architecture of the proposed design is shown using a flow chart in Figure 1. It is divided into four parts: feature extraction, centroid calculation, template database and classifier. All four sections are controlled by a 3-bit controller.

[Figure 1: Architecture of the Proposed Face Recognition System]

The computation of the zero-lag cross-correlation value is a dot-multiplication of two vectors. Suppose two vectors are given by

u = [u(1), u(2), ..., u(K)]^T
v = [v(1), v(2), ..., v(N)]^T

where it is implicitly understood that

u(k) = 0 for k = -\infty, ..., -1, 0 and k = K+1, K+2, ..., \infty
v(n) = 0 for n = -\infty, ..., -1, 0 and n = N+1, N+2, ..., \infty.

The cross-correlation can then be described mathematically as

C(w) = \sum_{n} u(w+n)\, v^*(n), \quad w = -N+1, -N+2, ..., -1, 0, 1, ..., K-2, K-1

where it is known that

C(w) = 0 for w = -\infty, ..., -N-1, -N and w = K, K+1, ..., \infty,

and * indicates the complex conjugate (of the signal v). This means that there are N + K - 1 cross-correlation values C(w) to compute. Although implicitly given by zero, it should be ensured that all needed u-values are directly available, including the values which are zero.

Two sequential adder modules have been employed; the one used in the feature extraction algorithm can handle 16-bit inputs, while the other module handles only 8-bit inputs. The structure of the sequential adder consists of one 16-bit full adder and a feedback path. At first, it splits the 32-bit input stream into two 16-bit inputs. The sum of the two 16-bit inputs drives the feedback path, which feeds the output back to the next input for further addition in each clock cycle; that is, the output is added to the next incoming 16-bit input in every cycle. In the beginning the input is added to zero, since the module is in reset mode, and the result is saved temporarily in an output register for feedback purposes. Thus every output of the sequential adder is added to the next incoming input. The results of the sequential adder do not exceed 24 bits.
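The accumulation behaviour described above can be captured by a small cycle-level software reference model, a sketch intended only for verification against the hardware; the class name, the reset method and the explicit 24-bit masking are our assumptions rather than details given in the paper.

```python
class SequentialAdderModel:
    """Cycle-by-cycle reference model of the accumulating adder.

    Each call to `clock` represents one clock cycle: the new 16-bit
    input is added to the registered previous output, and the running
    sum is kept within 24 bits as stated for the hardware.
    """

    WIDTH_IN = 16
    WIDTH_OUT = 24

    def __init__(self):
        self.acc = 0  # register cleared while in reset mode

    def reset(self):
        self.acc = 0

    def clock(self, data_in):
        data_in &= (1 << self.WIDTH_IN) - 1                  # 16-bit input
        self.acc = (self.acc + data_in) & ((1 << self.WIDTH_OUT) - 1)
        return self.acc


# Example: accumulate four input words.
adder = SequentialAdderModel()
for word in (0x1234, 0x00FF, 0xFFFF, 0x0001):
    total = adder.clock(word)
print(hex(total))  # prints 0x11333
```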
D. Divider:

Like the sequential adder, two divider modules have been employed: one for feature extraction and the other for centroid calculation. The only difference between these two modules is the divisor. The first module's divisor is 92 or 112 (decimal), depending on which feature vector is being extracted, while the other divider module's divisor is 10. The module normalizes the output by taking the most significant 8 bits of the quotient; the input width is 24 bits and the output width is 8 bits.
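A corresponding software sketch of the divider and its normalization step is given below. It reflects our reading of the text; in particular, the width of the quotient register from which the top 8 bits are taken is an assumption and is left as an explicit parameter.

```python
def divide_and_normalize(value, divisor, quotient_width=24):
    """Reference model of the divider/normalizer stage.

    `value` is the 24-bit accumulator output and `divisor` is 92, 112 or
    10, as described in the text. The integer quotient is formed first,
    then the most significant 8 bits of the assumed fixed-width quotient
    register are taken as the normalized 8-bit output.
    """
    value &= (1 << 24) - 1                           # constrain to 24 bits
    quotient = value // divisor                      # integer division
    normalized = (quotient >> (quotient_width - 8)) & 0xFF
    return normalized


# Example: centroid path, divide-by-10 of an accumulated sum.
print(divide_and_normalize(0x5A3C10, 10))
```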
The mean of each sequence has been removed, which is done by subtracting the mean of the sequence from its values. The Euclidean distance based classifier has been employed in order to test the recognition accuracy. The recognition accuracy has been evaluated for three different cases: the concatenated feature vector, the horizontal feature vector only, and the vertical feature vector only. The experiments were performed following the leave-one-out cross-validation rule.

Figure 3 shows images of a single person with different facial expressions. Figure 4 shows the concatenated feature vector formed from the vertical and horizontal feature vectors and also demonstrates the class compactness of the vertical and horizontal features. Figure 5 provides a better visualization of the effectiveness of this simple yet effective algorithm. It is clear that the horizontal feature vector is less effective at uniquely representing an individual than the vertical feature vector. Using the combined feature vector leads to an impressive 98.3% accuracy.

All the comparison results for the proposed face recognition hardware are validated with the ORL database of 400 images of 40 subjects, each a gray-level image of size 112x92. Input pixels are treated as bit streams and limited to the range 0 to 255. The maximum clock frequency was found to be 184.7 MHz on a Xilinx Virtex-7 FPGA.
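Returning to the accuracy experiments above, the leave-one-out protocol could be scripted as in the sketch below, reusing the correlation_signature and classify sketches given earlier. Loading of the ORL images and labels is assumed to happen elsewhere, since the paper does not describe file handling.

```python
def leave_one_out_accuracy(images, labels):
    """Leave-one-out evaluation of the correlation-signature classifier.

    `images` is a list of 2-D grayscale arrays and `labels` holds the
    matching identity of each image. Every image is classified in turn
    against templates built from all remaining images.
    """
    features = [correlation_signature(img) for img in images]
    correct = 0
    for i, test in enumerate(features):
        templates = {}
        for j, feat in enumerate(features):
            if j != i:
                templates.setdefault(labels[j], []).append(feat)
        predicted, _ = classify(test, templates)
        if predicted == labels[i]:
            correct += 1
    return correct / len(features)
```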
TABLE 2: HARDWARE IMPLEMENTATION COMPARISONS

Technology          Reference design   Register Bits (%Utilization)   Total LUTs (%Utilization)   Max. Clock Freq. (MHz)
Xilinx Spartan-2E   [15]               -                              292                         50
                    Proposed Design    82                             263                         44.6
Xilinx Spartan-2E   [16]               652                            830                         50
                    Proposed Design    82                             263                         44.6
Xilinx Spartan-3    [17]               128                            256                         -
                    Proposed Design    87                             294                         51.8

V. CONCLUSION

In this paper, our objective is to develop cost-effective feature extraction hardware which can provide high face recognition accuracy. In this regard, we adopt the holistic approach for precisely capturing the variations in the whole image. A zero-lag cross-correlation operation is performed on pairs of consecutive rows and columns of the image data to extract the vertical and horizontal feature vectors. Matrix transpose, multiplier, sequential adder and divider are the main modules used to design the proposed face recognition hardware. The recognition accuracy of the proposed system has been verified. At the same time, the proposed hardware design has been synthesized using Synopsys Synplify Premier for different FPGA devices to determine which platform would be better for implementation in terms of area utilization and maximum clock frequency. The design best suits the Xilinx Virtex-7 XC7VX485T. In addition, the synthesis results of the proposed hardware design have been compared with other related synthesized results, and the results are satisfactory. Testing an image using the proposed system provides a decision with an accuracy of 92.7%. All the comparison results are validated with the ORL database of 400 images of 40 subjects, each a gray-level image of size 112x92.
REFERENCES

[1]. Z. M. Hafed and M. D. Levine, "Face recognition using the discrete cosine transform," International Journal of Computer Vision, vol. 43, no. 3, pp. 167-188, 2001.
[2]. F. M. de S. Matos, L. V. Batista, and J. v. d. Poel, "Face recognition using DCT coefficients selection," ACM Symposium on Applied Computing, 2008, pp. 1753-1757.
[3]. W. Zhao, R. Chellappa, A. Rosenfeld, and P. J. Phillips, "Face recognition: A literature survey," ACM Computing Surveys (CSUR), vol. 35, no. 4, pp. 399-458, 2003.
[4]. M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
[5]. M. S. Bartlett, J. R. Movellan, and T. J. Sejnowski, "Face recognition by Independent Component Analysis," IEEE Transactions on Neural Networks, vol. 13, no. 6, pp. 1450-1464, 2002.
[6]. H. Gupta, A. K. Agarwal, T. Pruthi, C. Shekher, and R. Chellappa, "An Experimental Evaluation of Linear and Kernel-Based Methods for Face Recognition," Applications of Computer Vision, 2002, pp. 13-18.
[7]. J. Kittler, A. Hilton, M. Hamouz, and J. Illingworth, "3D Assisted Face Recognition: A Survey of 3D Imaging, Modeling and Recognition Approaches," Computer Vision and Pattern Recognition - Workshops, 2005, p. 114.
[8]. A. V. Nefian and M. H. Hayes III, "Face detection and recognition using Hidden Markov Models," International Conference on Image Processing, 1998, vol. 1, pp. 141-145.
[9]. Niu Liping, Zheng Yanbin, Li Xin Yuan, and Dou Yuqiang, "Bayesian Face Recognition Using 2DPCA," Information Technology and Application, 2009, vol. 2, pp. 567-570.
[10]. Yun-feng Li, "A Face Recognition System Using Support Vector Machines and Elastic Graph Matching," Artificial Intelligence and Computational Intelligence, 2009, pp. 3-6.
[11]. N. Shams, I. Hosseini, M. S. Sadri, and E. Azarnasab, "Low Cost FPGA-Based Highly Accurate Face Recognition System using Combined Wavelets with Subspace Methods," IEEE International Conference on Image Processing, 2006, pp. 2077-2080.
[12]. P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, 1997.
[13]. M. S. U. Sarwar, A. Sharin, M. R. Khan, H. Imtiaz, and S. A. Fattah, "A Face Recognition Scheme Based on Spatial Correlation Function," TENCON 2010 - 2010 IEEE Region 10 Conference, 2010, pp. 671-674.
[14]. S. A. Fattah, M. R. Khan, A. Sharin, and H. Imtiaz, "A face recognition scheme based on spectral domain cross-correlation function," TENCON 2011 - 2011 IEEE Region 10 Conference, 2011, pp. 10-13.
[15]. "The ORL database of faces," 2007. [Online]. Available: www.cl.cam.ac.uk/Research/DTG/attarchive/pub/data/
[16]. F. Yang and M. Paindavoine, "Implementation of an RBF Neural Network on Embedded Systems: Real-Time Face Tracking and Identity Verification," IEEE Transactions on Neural Networks, vol. 14, no. 5, pp. 1162-1175, 2003.
[17]. E. A. Abdel-Ghaffar, M. E. Allam, H. Mansour, and M. A. Abo-Alsoud, "A Secure Face Recognition System," International Conference on Computer Engineering & Systems, 2008, pp. 95-100.
[18]. S. A. Dawwd and B. S. Mahmood, "A Reconfigurable Interconnected Filter for Face Recognition Based on Convolution Neural Network," 4th International Design and Test Workshop (IDT), 2009, pp. 1-6.