A Deep Face Identification Network Enhanced by Facial Attributes Prediction
matrix direct product, there are no learnable parameters at this layer and, consequently, the chances of overfitting at this layer are low. Furthermore, we argue that, due to the existing correlation between facial attribute features and face features, the output neurons of the fusion layer are simple to interpret and semantically meaningful (i.e., the manifold they lie on is not complex, just high dimensional). Therefore, it is simple for the following layers of the network to decode such meaningful information. Assume that v and u are the feature vectors of the attributes and the face, respectively. The Kronecker product of these two vectors is defined as follows:

$$
u \otimes v =
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}
\otimes
\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_m \end{bmatrix}
=
\begin{bmatrix} u_1 v_1 \\ u_1 v_2 \\ \vdots \\ u_1 v_m \\ u_2 v_1 \\ u_2 v_2 \\ \vdots \\ u_n v_m \end{bmatrix}
\qquad (1)
$$
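As a concrete illustration of this fusion (a minimal sketch with assumed feature dimensions, not the authors' released code), the Kronecker product of the two feature vectors can be computed directly:

```python
import numpy as np

# Assumed dimensions for illustration; the text does not fix them here.
u = np.random.randn(256)  # face features, e.g. from the last pooling layer of net@1
v = np.random.randn(32)   # attribute features from the prediction branch (pre-softmax)

# np.kron stacks u_i * v for every i, exactly the column vector in (1).
fused = np.kron(u, v)     # shape: (256 * 32,)

# The fusion itself has no learnable parameters: it is a fixed bilinear
# expansion, which is why it adds no overfitting risk of its own.
assert fused.shape == (u.size * v.size,)
```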
4. Training our CNN architecture

In this section, we describe how we train our model. Thousands of images are needed to train such a deep model. For this reason, we initialize the net@1 parameters with a CNN pre-trained on the ImageNet dataset and then fine-tune it as a classifier on the CASIA-WebFace dataset. CASIA-WebFace contains 10,575 subjects and 494,414 images. As far as we know, this is the largest publicly available face image dataset, second only to the private Facebook dataset.

The proposed deep network is a succession of two cascaded networks. net@1 is constructed from 16 layers of convolutional operations on the inputs, intertwined with ReLU non-linearities and five pooling operations. The weights in each convolutional layer form a sequence of 4-d tensors, W ∈ ℝ^{l×c×p×q}, where l, c, p and q are the dimensions of the weights along the filter, channel, and spatial width and height axes, respectively. For notational simplicity, we denote all the weights in net@1 with W1 and the weights in net@2 with W2. W2 is separated into two groups, W2,1 and W2,2, representing all the weights in the first and second branches, respectively.

L1 and L2, given in (2) and (3), are the loss functions designed to perform the attribute prediction and face identification tasks, respectively. We use cross entropy as the network loss function. T, C and X = {x_i}_{i=1}^N denote the number of facial attributes used in the model, the number of classes, and the training samples, respectively. L_i^0 and L_i^j represent the face label and the facial attribute label for attribute j of training sample i, respectively. The functions f and g are the outputs of the network for the attribute prediction and face identification tasks, respectively, and f' and g' are the soft-max functions applied to the f and g outputs. The loss functions in (2) and (3) show how the two branches of net@2 communicate information and update their learning parameters together: the f function (attribute prediction output) takes W1 and W2,1 as input, while the g function (face identification output) takes W1, W2,2 and f as input. Therefore, both attribute prediction and face identification use W1 as shared parameters, and, furthermore, the attribute prediction parameters and W2,2 are both used for face identification.

We use the Adam optimizer [15] to minimize our network's loss functions. Adam is a robust and well-adapted optimizer that can be applied to a variety of non-convex optimization problems in deep neural networks. All Adam parameters are initialized with the values suggested by its authors, and we set the learning rate to 0.001. Each optimization iteration mainly consists of two steps: the first calculates the gradients of the loss functions with respect to the model parameters, and the second updates the biased first-moment estimate and then the model parameters, successively.
$$
L_1(W_1, W_{2,1}, X) = -\sum_{j=1}^{T}\sum_{i=1}^{N} L_i^j \,\log\!\big(f'(f(L_i^j \mid x_i, W_1, W_{2,1}))\big) \qquad (2)
$$

$$
L_2(W_1, W_{2,1}, W_{2,2}, X) = -\sum_{i=1}^{N}\sum_{k=1}^{C} L_{ik}^0 \,\log\!\big(g'(g(L_{ik}^0 \mid x_i, W_1, W_{2,2}, f(x_i, W_1, W_{2,1})))\big) \qquad (3)
$$
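As a rough sketch of how (2) and (3) might be minimized jointly (our illustration under assumed names and shapes such as `model`, `attr_labels` and `id_labels`, with an unweighted sum of the two losses; this is not the authors' released code), one Adam training step in TensorFlow could look like:

```python
import tensorflow as tf

# Adam with the stated learning rate; other hyperparameters keep the
# defaults suggested by its authors [15].
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)       # T binary attributes, loss (2)
cce = tf.keras.losses.CategoricalCrossentropy(from_logits=True)  # C identity classes, loss (3)

def train_step(model, x, attr_labels, id_labels):
    # model(x) is assumed to return (attr_logits, id_logits), where the
    # identification branch internally consumes the attribute output f.
    with tf.GradientTape() as tape:
        attr_logits, id_logits = model(x, training=True)
        l1 = bce(attr_labels, attr_logits)  # attribute prediction loss
        l2 = cce(id_labels, id_logits)      # face identification loss
        loss = l1 + l2                      # both losses update the shared W1
    # Step 1: gradients of the losses w.r.t. all model parameters.
    grads = tape.gradient(loss, model.trainable_variables)
    # Step 2: Adam updates its moment estimates, then the parameters.
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return l1, l2
```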
5. Experiment
Figure 2. First and second rows are image samples in the CelebA dataset; third and fourth rows are samples of aligned face images in the MegaFace dataset.

We conducted experiments for two different cases to examine whether our model improves overall performance on the identification and prediction tasks. In the first case, we train and test the model to perform the two tasks separately, in isolation, while in the second case we employ our model to train both tasks jointly. In the second case, however, we predict facial attributes assuming that such information is not available during the testing phase; the outputs of the attribute prediction branch, taken before the soft-max operation, are then fused with the last pooling layer of net@1 by using the Kronecker product. We fuse the face modality with those facial attributes, such as gender and face shape, which remain the same in all images of a person. Experimental results show that our model increases overall performance in face identification as well as attribute prediction in comparison to the first case. We performed our experiments on two GeForce GTX TITAN X 12GB GPUs. We ran our model for 100 epochs using batch normalization (i.e., shifting inputs to zero mean and unit variance) after each convolutional and fully connected layer, before the non-linearity. Batch normalization potentially helps to achieve faster learning as well as higher overall accuracy. Furthermore, batch normalization allows us to use a higher learning rate, which potentially provides another boost in speed. We used TensorFlow to implement our network. The batch size in all experiments is fixed to 128.
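To illustrate the layer ordering just described (a sketch only; the filter count and kernel size are our assumptions), one convolutional block with batch normalization applied before the non-linearity looks like this in Keras:

```python
from tensorflow.keras import layers

def conv_bn_relu(x, filters):
    # Convolution -> batch normalization -> ReLU, matching the ordering
    # described above (BN before the non-linearity).
    x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)  # normalize to zero mean, unit variance
    return layers.ReLU()(x)
```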
5.1. Datasets

We conducted our experiments on the CelebA dataset [19] for facial attribute prediction, and on MegaFace [14], a widely used and well-known face dataset, for face identification.

CelebA is a large-scale, richly annotated face attribute dataset containing more than 200K celebrity images, each of which is annotated with 40 facial attributes. CelebA has about ten thousand identities, with twenty images per identity on average. The dataset is also annotated with five landmarks, and can serve as training and testing sets for facial attribute prediction, face detection, and landmark (or facial part) localization. To compare our method fairly with the other methods, we use the same setup they have used: images of 8,000 identities for training and the remaining 1,000 identities for testing. The train and test sets are available here.¹

¹ http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html

MegaFace is a publicly available and very challenging dataset used for evaluating the performance of face recognition algorithms with up to a million distractors (i.e., up to a million people who are not in the test set). MegaFace contains 1M images of 690K individuals with unconstrained pose, expression, lighting, and exposure. MegaFace captures many different subjects rather than many images of a small number of subjects. The gallery set of MegaFace is collected from a subset of Flickr [31]. The probe set used in the challenge consists of two databases, Facescrub [21] and FG-NET [9]. FG-NET contains 975 images of 82 individuals, each with several images spanning ages from 0 to 69.
The Facescrub dataset contains more than 100K face images of 530 people. The MegaFace challenge evaluates the performance of face recognition algorithms while increasing the number of distractors (going from 10 to 1M) in the gallery set. Training set size is important, since it has been shown that face recognition algorithms trained on larger sets tend to perform better at scale. In order to evaluate face recognition algorithms fairly, the MegaFace challenge has two protocols, with large or small training sets. If a training set has more than 0.5M images and 20K subjects, it is considered large; otherwise, it is considered small. We use a small training set, with 0.44M images and 10K subjects. The probe set in our experiments is Facescrub.

5.2. Evaluation metrics

We evaluate the face identification performance of our model on the MegaFace dataset, and the facial attribute prediction performance on the CelebA dataset. The MegaFace dataset is not annotated with facial attributes. Our model, however, predicts facial attributes and then uses them for face identification. To conduct experiments on MegaFace, we restore the model parameters trained on the CelebA dataset, which is annotated with facial attributes as well as identities, and then fine-tune those parameters on MegaFace for the objective of face identification. Our model predicts facial attributes from the first branch of our architecture and employs this auxiliary modality for face identification.
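A minimal sketch of this restore-then-fine-tune step (assuming a Keras model and a hypothetical checkpoint path; the exact training scripts are not given in the text):

```python
import tensorflow as tf

def restore_for_finetuning(model: tf.keras.Model, ckpt_path: str):
    # Restore the parameters learned on CelebA from a hypothetical
    # checkpoint, then return a fresh Adam optimizer for fine-tuning
    # on MegaFace identities.
    ckpt = tf.train.Checkpoint(model=model)
    ckpt.restore(ckpt_path).expect_partial()  # optimizer slots need not match
    return tf.keras.optimizers.Adam(learning_rate=0.001)
```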
Face Identification: We calculate the similarity between each image in the gallery set and a given image from the probe set, and then rank the gallery images based on the obtained similarities. In face identification, the gallery set should contain at least one image of the same identity. We evaluate our model by using rank-1 identification accuracy as well as Cumulative Match Characteristic (CMC) curves. CMC is a rank-based metric indicating the probability that the correct gallery image is found among the top k most similar images from the gallery set.
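For concreteness, rank-1 accuracy and a CMC curve can be computed from a probe-gallery similarity matrix as in the following sketch (variable names and shapes are our assumptions):

```python
import numpy as np

def cmc_curve(similarity, probe_ids, gallery_ids, max_rank=10):
    # similarity: (num_probes, num_gallery) similarity scores.
    # Assumes every probe identity appears in the gallery, as required above.
    order = np.argsort(-similarity, axis=1)   # most similar gallery images first
    ranked_ids = gallery_ids[order]           # gallery identities in ranked order
    hits = ranked_ids == probe_ids[:, None]   # correct-identity mask
    first_hit = hits.argmax(axis=1)           # rank of the first correct match
    # CMC[k-1] = fraction of probes whose match appears in the top k.
    return np.array([(first_hit < k).mean() for k in range(1, max_rank + 1)])

# Rank-1 identification accuracy is the first point of the curve:
# rank1 = cmc_curve(sim, probe_ids, gallery_ids)[0]
```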
Facial Attribute Prediction: We leverage identity facial attributes as an auxiliary modality for improving face identification performance. Identity facial attributes are invariant attributes which remain the same across different images of a person. For example, gender and the shapes of the nose and lips remain the same in different images of a person, whereas attributes such as glasses, mustaches, or beards may or may not be present. We discard such attributes in our model because we look for robust, invariant facial attributes. The identity facial attributes in the CelebA dataset are: narrow eyes, big nose, pointy nose, chubby, double chin, high cheekbones, male, bald, big lips and oval face. We evaluate our attribute predictor by using the accuracy metric.

5.3. Methods for Comparisons

Attribute Prediction: We compare our method with several competitive algorithms, including FaceTracer [17], PANDA [37], ANet+LNet [19] and MT-RBM-PCA [8]. FaceTracer extracts handcrafted features, including color histograms and HOG, from functional face image regions and then concatenates these features to train an SVM classifier for predicting attributes; the functional regions are determined by using ground truth landmarks. PANDA was mainly proposed for body attribute prediction, creating an ensemble of several CNNs, each of which extracts features from a well-aligned human part using poselets; all of the extracted features are then concatenated to train an SVM for body attribute prediction. For our case, however, it is simple to adjust this method for facial attribute prediction such that the face part is aligned using landmark points. In the ANet+LNet method, images of the first 8,000 identities (roughly 162K images) are employed for pre-training and face localization, and images of the next 1,000 identities (roughly 20K images) are used to train an SVM classifier. We use the same testing and training sets to conduct our experiment. Table 1 shows the improvement in identity facial attribute prediction once the model trains both tasks jointly. The results show that joint training contributes most for the attributes of gender, bald, narrow eyes, big lips, big nose, oval face, young, high cheekbones and chubby, respectively.

Face Identification: We compare our method with the existing face identification methods reported on the official MegaFace website². We primarily compare with publicly released methods, for which the details are known: Google FaceNet [24], Center Loss [34], Lightened CNN [35], LBP [1] and the Joint Bayes model [6]. There are several other methods from commercial companies, such as FaceAll, NTechLAB, SIAT MMLAB, BareBonesFR and 3DiVi, the details of which are not yet known to the community. Therefore, we cannot compare these methods with ours fairly; however, we report them to provide a comprehensive list of references on the MegaFace dataset. Fig. 3 shows CMC curves for the different methods; our model covers a larger area under the curve in comparison to the other methods. We also compare our model's performance when it trains facial attribute prediction and face identification jointly versus separately; the results show that our face identifier benefits from joint training. We also compare the performance of the algorithms by rank-1 identification accuracy; Table 2 compares the face identification models on the MegaFace dataset.

² http://megaface.cs.washington.edu/results/facescrub.html
Attribute         FaceTracer   PANDA   LNets+ANet   MT-RBM-PCA   Ours-S   Ours-J
Bald                  89         96        98           98        96.16    98.93
Big Lips              64         67        68           69        69.25    71.69
Big Nose              74         75        78           81        82.35    84.67
Chubby                86         86        91           95        94.22    95.27
High Cheekbones       84         86        88           83        86.61    87.79
Male                  91         97        98           90        95.65    98.61
Narrow Eyes           82         84        81           86        85.45    87.90
Oval Face             64         65        66           73        74.49    75.94
Young                 80         84        87           81        87.12    88.54

Table 1. Comparing attribute prediction models on the CelebA dataset (accuracy, %).
Figure 4. Examples of class activation maps generated from the attribute predictor part of our model. Each row indicates the nose attribute, mouth attribute, eyes attribute and head attribute, respectively. We observe that the highlighted regions are activated by the class activation map algorithm.

Inspired by the work in [39] on class activation maps, we interpret the prediction decisions made by our proposed architecture. Fig. 4 shows the class activation maps for predicting big nose, big lips, narrow eyes and bald, respectively. We can see that our model is triggered by different semantic regions of the image for different predictions. Fig. 4 also shows that, owing to the GAP layer, our model learns to localize the common visual patterns for the same facial attribute. Furthermore, the deep features obtained from our attribute predictor branch can also be used for generic facial attribute localization in any given image, without using any extra information such as a bounding box.
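A brief sketch of the class activation map computation in the spirit of [39] (our illustration; array shapes and names are assumptions, not the authors' code):

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, attr_index):
    # feature_maps: (H, W, K) activations of the last conv layer before GAP.
    # fc_weights:   (K, num_attrs) weights of the classifier after GAP.
    # The CAM for one attribute is the weighted sum of the K feature maps.
    cam = feature_maps @ fc_weights[:, attr_index]  # (H, W)
    cam = np.maximum(cam, 0)                        # keep positive evidence
    return cam / (cam.max() + 1e-8)                 # normalize for display
```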
6. Conclusion

In this paper, we proposed an end-to-end deep network to predict facial attributes and identify face images simultaneously, with better performance. Our model trains these two tasks jointly through a shared CNN feature space, and also fuses the predicted identity attribute modality with the face modality features to improve face identification performance. The model increases both face recognition and facial attribute prediction performance in comparison to the case when the model is trained separately. Experimental results show the superiority of the model in comparison to current face identification models. The model also predicts identity facial attributes.
References

[1] T. Ahonen, A. Hadid, and M. Pietikainen. Face description with local binary patterns: Application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(12):2037–2041, 2006.
[2] T. Berg and P. N. Belhumeur. POOF: Part-based one-vs.-one features for fine-grained categorization, face verification, and attribute estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 955–962, 2013.
[3] L. Best-Rowden, H. Han, C. Otto, B. F. Klare, and A. K. Jain. Unconstrained face recognition: Identifying a person of interest from a media collection. IEEE Transactions on Information Forensics and Security, 9(12):2144–2157, 2014.
[4] L. Bourdev, S. Maji, and J. Malik. Describing people: A poselet-based approach to attribute classification. In 2011 IEEE International Conference on Computer Vision (ICCV), pages 1543–1550. IEEE, 2011.
[5] L. D. Bourdev. Pose-aligned networks for deep attribute modeling, July 26 2016. US Patent 9,400,925.
[6] D. Chen, X. Cao, L. Wang, F. Wen, and J. Sun. Bayesian face revisited: A joint formulation. In European Conference on Computer Vision, pages 566–579. Springer, 2012.
[7] J. Chung, D. Lee, Y. Seo, and C. D. Yoo. Deep attribute networks. In Deep Learning and Unsupervised Feature Learning NIPS Workshop, volume 3, 2012.
[8] M. Ehrlich, T. J. Shields, T. Almaev, and M. R. Amer. Facial attributes classification using multi-task representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 47–55, 2016.
[9] Y. Fu, T. M. Hospedales, T. Xiang, S. Gong, and Y. Yao. Interestingness prediction by robust learning to rank. In European Conference on Computer Vision, pages 488–503. Springer, 2014.
[10] A. Graham. Kronecker Products and Matrix Calculus: With Applications. Mathematics and its Applications. 1981.
[11] M. Guillaumin, J. Verbeek, and C. Schmid. Is that you? Metric learning approaches for face identification. In 2009 IEEE 12th International Conference on Computer Vision, pages 498–505. IEEE, 2009.
[12] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
[13] M. M. Kalayeh, B. Gong, and M. Shah. Improving facial attribute prediction using semantic segmentation. arXiv preprint arXiv:1704.08740, 2017.
[14] I. Kemelmacher-Shlizerman, S. M. Seitz, D. Miller, and E. Brossard. The MegaFace benchmark: 1 million faces for recognition at scale. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4873–4882, 2016.
[15] D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[16] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
[17] N. Kumar, P. Belhumeur, and S. Nayar. FaceTracer: A search engine for large collections of images with faces. In European Conference on Computer Vision, pages 340–353. Springer, 2008.
[18] N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar. Attribute and simile classifiers for face verification. In 2009 IEEE 12th International Conference on Computer Vision, pages 365–372. IEEE, 2009.
[19] Z. Liu, P. Luo, X. Wang, and X. Tang. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision, pages 3730–3738, 2015.
[20] P. Luo, X. Wang, and X. Tang. A deep sum-product architecture for robust facial attributes analysis. In Proceedings of the IEEE International Conference on Computer Vision, pages 2864–2871, 2013.
[21] H.-W. Ng and S. Winkler. A data-driven approach to cleaning large face datasets. In 2014 IEEE International Conference on Image Processing (ICIP), pages 343–347. IEEE, 2014.
[22] O. M. Parkhi, A. Vedaldi, A. Zisserman, et al. Deep face recognition. In BMVC, volume 1, page 6, 2015.
[23] R. Ranjan, S. Sankaranarayanan, C. D. Castillo, and R. Chellappa. An all-in-one convolutional neural network for face analysis. In 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pages 17–24. IEEE, 2017.
[24] F. Schroff, D. Kalenichenko, and J. Philbin. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 815–823, 2015.
[25] W. R. Schwartz, H. Guo, and L. S. Davis. A robust and scalable approach to face identification. In European Conference on Computer Vision, pages 476–489. Springer, 2010.
[26] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[27] Y. Sun, D. Liang, X. Wang, and X. Tang. DeepID3: Face recognition with very deep neural networks. arXiv preprint arXiv:1502.00873, 2015.
[28] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1–9, 2015.
[29] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. Web-scale training for face identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2746–2754, 2015.
[30] V. Talreja, M. C. Valenti, and N. M. Nasrabadi. Multibiometric secure system based on deep learning. arXiv preprint arXiv:1708.02314, 2017.
[31] B. Thomee, D. A. Shamma, G. Friedland, B. Elizalde, K. Ni, D. Poland, D. Borth, and L.-J. Li. The new data and new challenges in multimedia research. arXiv preprint arXiv:1503.01817, 1(8), 2015.
[32] R. Torfason, E. Agustsson, R. Rothe, and R. Timofte. From face images and attributes to attributes. In Asian Conference on Computer Vision, pages 313–329. Springer, 2016.
[33] Z. Wang, K. He, Y. Fu, R. Feng, Y.-G. Jiang, and X. Xue. Multi-task deep neural network for joint face recognition and facial attribute prediction. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, pages 365–374. ACM, 2017.
[34] Y. Wen, K. Zhang, Z. Li, and Y. Qiao. A discriminative feature learning approach for deep face recognition. In European Conference on Computer Vision, pages 499–515. Springer, 2016.
[35] X. Wu, R. He, and Z. Sun. A lightened CNN for deep face representation. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), volume 4, 2015.
[36] Y. He, L. Chen, and J. Chen. Multi-task relative attribute prediction by incorporating local context and global style information. In R. C. Wilson, E. R. Hancock, and W. A. P. Smith, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 131.1–131.12. BMVA Press, September 2016.
[37] N. Zhang, M. Paluri, M. Ranzato, T. Darrell, and L. Bourdev. PANDA: Pose aligned networks for deep attribute modeling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1637–1644, 2014.
[38] Y. Zhong, J. Sullivan, and H. Li. Face attribute prediction using off-the-shelf CNN features. In 2016 International Conference on Biometrics (ICB), pages 1–7. IEEE, 2016.
[39] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning deep features for discriminative localization. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2921–2929. IEEE, 2016.