Robust Attentive Deep Neural Network for Detecting GAN-Generated Faces
ABSTRACT Generative Adversarial Network (GAN) based techniques can generate and synthesize real-
istic faces that cause profound social concerns and security problems. Existing methods for detecting
GAN-generated faces can perform well on limited public datasets. However, images from existing datasets
do not represent real-world scenarios well enough in terms of view variations and data distributions, where
real faces largely outnumber synthetic ones. State-of-the-art methods generalize poorly to such real-world
problems and lack interpretable detection results. The performance of existing GAN-face detection
models degrades further when facing imbalanced data. To address these shortcomings, we propose
a robust, attentive, end-to-end framework that spots GAN-generated faces by analyzing eye inconsistencies.
Our model automatically learns to identify inconsistent eye components by localizing and comparing artifacts
between eyes. After the iris regions are extracted by Mask R-CNN, we design a Residual Attention Network
(RAN) to examine the consistency between the corneal specular highlights of the two eyes. Our method can
effectively learn from imbalanced data using a joint loss function combining the traditional cross-entropy
loss with a relaxation of the ROC-AUC loss via Wilcoxon-Mann-Whitney (WMW) statistics. Comprehensive
evaluations on a newly created FFHQ-GAN dataset in both balanced and imbalanced scenarios demonstrate
the superiority of our method.
INDEX TERMS GAN-generated face, fake face detection, iris detection, corneal specular highlights,
residual attention network, data imbalance, AUC maximization, WMW statistics, FFHQ-GAN dataset.
FIGURE 2. The proposed architecture for GAN-generated face detection. We first use DLib [27] to detect faces and localize eyes, and use Mask R-CNN [19]
to segment out the iris regions. A Residual Attention Network (RAN) then performs binary classification on the extracted iris pair to determine if the face
is real or fake. The training is carried out using a joint loss combining the Binary Cross-Entropy (BCE) loss and the ROC-AUC loss with WMW relaxation to
better handle the learning from imbalanced data (see text).
A. GAN-GENERATED FACE DETECTION
GAN-generated face detection methods can be organized into two categories.

Data-driven methods [28]–[32] mostly train a deep neural network model to distinguish real and GAN-generated faces. These deep learning (DL) based methods work well in many scenarios, as they can better learn representations in a high-dimensional feature space instead of from raw image pixels.

Physical and physiological methods look for signal traces, artifacts, or inconsistencies left by the GAN synthesizers. These methods are explainable in nature. Simple cues such as color differences are used in [33], [34] to distinguish GAN images from real ones. However, those methods are no longer effective as the GAN methods advance. More sophisticated methods [11], [35] leverage fingerprints or abstract signal-level traces of the noise residuals to differentiate GAN-generated faces. Many works [36]–[38] identify GAN images by recognizing the specific artifacts produced by the GAN upsampling process. In [39], the distribution of facial landmarks is analyzed to distinguish GAN-generated faces. Inconsistent head poses are detected to expose fake videos in [10]. The work of [16] identifies GAN-generated faces as well as deepfake face manipulations by inspecting visual artifacts. Our prior work [18] detects inconsistencies of the corneal specular highlights between the left and right eyes to expose GAN-generated faces.

B. ATTENTION MECHANISM
Since the seminal work of [40] in machine translation, the attention mechanism has been widely used in many applications to improve the performance of deep learning models by focusing on the most relevant parts of the features in a flexible manner. Class Activation Mapping (CAM) [41] and Grad-CAM [42] are widely used in many computer vision tasks [43]. However, in these works, attention is only used to visualize model predictions by showing significant portions of the images. On the other hand, integrating the attention mechanism into the network design is shown to be effective in boosting performance, as the network can be guided by the attention to focus on relevant regions during training [44]. Channel attention [45] can automatically learn to focus on important channels by analyzing the relationships between channels. SENet [46] embeds the channel attention mechanism into residual blocks, and its effectiveness is shown on large-scale image classification. The attention mechanism is also used in [47] to distinguish important channels in the network to improve the representation capability. The ideas of channel attention and spatial attention are combined jointly in [48], [49] to improve network performance significantly. The Residual Attention Network in [20] combines the residual unit [50] with the attention mechanism by stacking residual attention blocks to improve performance and reduce model complexity.

C. IMBALANCED DATA LEARNING
Learning from imbalanced data has been widely studied in machine learning [51]–[54] and computer vision [55], [56]. Earlier solutions for imbalanced data learning are mainly based on sampling design, e.g., oversampling the minority classes, undersampling the majority classes, and weighted sampling [57]. These sampling-based methods come with their own drawbacks. For example, undersampling may ignore important samples, and oversampling may lead to overfitting.

Data augmentation provides an alternative solution to alleviate data imbalance issues. For image recognition, image mirroring, rotation, color adjustment, etc. are simple methods to augment data samples [58]. However, data augmentation can only partly address data imbalance, as the original dataset must be diverse enough that a sufficient amount of representative samples can be produced through augmentation.

III. METHOD
We next describe the proposed GAN-generated face detection framework. Given an input face image, facial landmarks are first localized using DLib [27], and Mask R-CNN [19] is used to segment out the left and right iris regions of the eyes (§ III-A). We adopt a residual attention-based network [20] to perform binary classification on the iris regions of interest to determine if the input image is real or fake (§ III-B).
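To make the pipeline concrete before detailing each component, the following is a minimal test-time sketch. The dlib calls (get_frontal_face_detector, shape_predictor) are standard dlib APIs, but iris_segmenter (standing in for the trained Mask R-CNN of [19]) and ran (the trained Residual Attention Network of § III-B) are placeholders, and the eye-crop margin is our assumption; this is a sketch, not the released implementation.

```python
import dlib
import numpy as np
import torch

# Standard DLib face detector and 68-point landmark predictor.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

# 68-point landmark indices for the two eyes.
LEFT_EYE, RIGHT_EYE = range(36, 42), range(42, 48)

def crop_eye(image: np.ndarray, shape, idxs, margin: int = 10) -> np.ndarray:
    """Crop an eye region around the DLib landmarks of one eye."""
    pts = np.array([(shape.part(i).x, shape.part(i).y) for i in idxs])
    x0, y0 = pts.min(axis=0) - margin
    x1, y1 = pts.max(axis=0) + margin
    return image[max(y0, 0):y1, max(x0, 0):x1]

def detect_gan_face(image: np.ndarray, iris_segmenter, ran) -> float:
    """Return the RAN score: likelihood that the face is GAN-generated."""
    face = detector(image)[0]          # assume one face per image
    shape = predictor(image, face)     # 68 facial landmarks
    irises = []
    for idxs in (LEFT_EYE, RIGHT_EYE):
        eye = crop_eye(image, shape, idxs)
        iris = iris_segmenter(eye)     # Mask R-CNN iris crop (placeholder),
        irises.append(torch.as_tensor(iris))  # resized to 96x96 as in § IV-B
    pair = torch.stack(irises)         # the extracted iris pair
    return ran(pair.unsqueeze(0)).item()  # RAN ends in a Sigmoid layer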
FIGURE 3. Details of our Attention Module from the RAN in Figure 2.
Our design is inspired by the residual attention network of [20].
FIGURE 5. The proposed pipeline for training the Residual Attention Network (RAN) on possibly imbalanced data for GAN-generated face
classification. The extracted iris pairs are passed as input to the RAN. A robust loss function derived from maximizing the AUC of ROC is
optimized in the training of the RAN. See details in § III-D.
As suggested in [20], the mixed attention yields the best performance. Thus, we use the Sigmoid function 1/(1 + exp(−f_{s,c})) to learn the mixed attention for each channel and each spatial location, where s ranges over all spatial positions and c ranges over all channels of f. The proposed Residual Attention Network (RAN) is constructed by stacking multiple Attention Modules, as shown on the right side of Figure 2. Table 1 provides details of the architecture. Although the attention module plays an important role in classification, a simple stacking of attention modules may reduce performance. To this end, we adopt a simple solution by adding the attention map onto the original feature map. This combination allows attention modules to be stacked like a ResNet [50] and improves the performance [20]. Given an input image, the RAN outputs a prediction score from the last Sigmoid layer as an indication of the likelihood of the input image being a GAN-generated image.
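For concreteness, the following is a minimal PyTorch sketch of one such Attention Module. It is a simplified stand-in rather than the exact architecture of Table 1: the mask branch of [20] is a down/up-sampling hourglass, reduced here to two convolutions, and the layer widths are placeholders.

```python
import torch
import torch.nn as nn

class AttentionModule(nn.Module):
    """Simplified residual attention block in the spirit of [20]."""

    def __init__(self, channels: int):
        super().__init__()
        # Trunk branch: plain feature transformation (simplified here).
        self.trunk = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        # Mask branch producing f_{s,c}; the real design uses a
        # down/up-sampling hourglass, which we omit for brevity.
        self.mask = nn.Sequential(
            nn.Conv2d(channels, channels, 1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = self.trunk(x)
        # Mixed attention: Sigmoid 1 / (1 + exp(-f_{s,c})) for every
        # channel c and spatial position s.
        m = torch.sigmoid(self.mask(x))
        # Residual combination (1 + M(x)) * T(x): the attended feature
        # map is added onto the trunk output, so stacked modules behave
        # like a ResNet.
        return (1.0 + m) * t
```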
C. AUC OF ROC FOR CLASSIFICATION EVALUATION
Most classification loss measures, including the popular cross-entropy loss, are ineffective in addressing the issue of data imbalance. The resulting models can produce accurate but rather biased predictions that do not work well in practice. It is desirable to address data imbalance directly by specifically designing a suitable loss function.

Since the area under the curve (AUC) of a receiver operating characteristic (ROC) curve [26], [66] is a robust evaluation metric for both balanced and imbalanced data, we would like to directly maximize the AUC to handle imbalanced situations. The AUC is widely used in binary classification problems. We next briefly review the definition of AUC, and then motivate how we incorporate a loss term that directly maximizes the AUC performance. Given a labeled dataset {(x_i, y_i)}_{i=1}^M, where each data sample x_i ∈ R^d and each corresponding label y_i ∈ {−1, +1}, we define the set of indices of positive instances as P = {i | y_i = +1}. Similarly, the set of indices of negative instances is N = {i | y_i = −1}. Let g_w : R^d → R be a parametric prediction function with parameter w ∈ R^m; g_w(x_i) represents the prediction score of the i-th sample, where i ∈ {1, · · · , M}. For simplicity, we assume g_w(x_i) ≠ g_w(x_j) for i ≠ j (ties can be broken in any consistent way).

Given a threshold λ, the number of negative examples with prediction scores larger than λ is the false positives (FP), and the number of positive examples with prediction scores greater than or equal to λ is the true positives (TP). From the FP and TP, we can calculate the false positive rate (FPR) and the true positive rate (TPR) as follows:
$$\mathrm{FPR} = \frac{\sum_{i \in N} \mathbb{I}[g_w(x_i) > \lambda]}{|N|}, \qquad \mathrm{TPR} = \frac{\sum_{i \in P} \mathbb{I}[g_w(x_i) \ge \lambda]}{|P|},$$
where I[a] is an indicator function with I[a] = 1 if a is true and 0 otherwise. The receiver operating characteristic (ROC) is a plot of FPR versus TPR obtained by setting different decision thresholds λ ∈ (−∞, ∞). Based on this definition, the ROC is a curve confined to [0, 1] × [0, 1] connecting the point (0, 0) to (1, 1). The value of AUC corresponds to the area enclosed by the ROC curve.
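The following NumPy sketch evaluates FPR and TPR at a threshold and computes AUC directly from these definitions, using the pairwise ranking identity that § III-D introduces below; it is illustrative only, with our own function names and label convention (y ∈ {−1, +1}).

```python
import numpy as np

def fpr_tpr(scores: np.ndarray, labels: np.ndarray, lam: float):
    """FPR and TPR at threshold lam, per the definitions above."""
    pos, neg = scores[labels == +1], scores[labels == -1]
    fpr = float(np.mean(neg > lam))   # negatives scored above lam
    tpr = float(np.mean(pos >= lam))  # positives scored at or above lam
    return fpr, tpr

def auc_pairwise(scores: np.ndarray, labels: np.ndarray) -> float:
    """AUC as the fraction of correctly ranked positive-negative pairs
    (the WMW identity used in § III-D below)."""
    pos, neg = scores[labels == +1], scores[labels == -1]
    return float(np.mean(pos[:, None] > neg[None, :]))

# Worked example with |P| = 2 and |N| = 3: one of the six pairs is
# mis-ranked (0.7 vs. 0.8), so AUC = 5/6, regardless of class imbalance.
scores = np.array([0.9, 0.8, 0.7, 0.3, 0.2])
labels = np.array([+1, -1, +1, -1, -1])
print(fpr_tpr(scores, labels, lam=0.5))  # (0.333..., 1.0)
print(auc_pairwise(scores, labels))      # 0.8333...
```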
TABLE 2. Details of the FFHQ-GAN dataset regarding its balanced (-b) and imbalanced (-imb) subsets.

TABLE 3. Results on the FFHQ-GAN dataset regarding its balanced (-b) and imbalanced (-imb) subsets.

D. WMW AUC RELAXATION FOR LOSS DESIGN
The computation of an AUC score as the area under a ROC curve cannot be directly used in a loss function due to its discrete nature. Following the Wilcoxon-Mann-Whitney (WMW) statistic [25], the AUC can be equivalently written as
$$\mathrm{AUC} = \frac{1}{|P||N|} \sum_{i \in P} \sum_{j \in N} \mathbb{I}[g_w(x_i) > g_w(x_j)].$$
Therefore, the corresponding AUC loss (risk) can be defined as
$$L_{\mathrm{AUC}} = 1 - \mathrm{AUC} = \frac{1}{|P||N|} \sum_{i \in P} \sum_{j \in N} \mathbb{I}[g_w(x_i) < g_w(x_j)]. \tag{2}$$
Obviously, L_AUC takes values in [0, 1]. It is the fraction of positive-negative pairs whose prediction scores are ranked incorrectly, i.e., the prediction score of a negative sample is larger than the prediction score of a positive sample. If all prediction scores from the positive samples are larger than any prediction score from the negative samples, then L_AUC = 0, which indicates a perfect classifier. Furthermore, L_AUC is independent of the threshold λ: it depends only on the prediction scores g_w(x). In other words, only the predictor g_w affects the value of L_AUC. Therefore, we aim to learn a classifier g_w that minimizes Eq. (2).

Although we can calculate L_AUC by comparing the prediction scores of the positive and negative samples in each pair, the formulation in Eq. (2) is non-differentiable due to the discrete indicator. It is therefore desirable to find a differentiable approximation of L_AUC. Inspired by the work in [25], we use an approximation of L_AUC that can be directly applied in our objective function to minimize the AUC loss along with our imbalanced training procedure. Specifically, the differentiable approximation of L_AUC is
$$L_{\mathrm{AUC}} = \frac{1}{|P||N|} \sum_{i \in P} \sum_{j \in N} R(g_w(x_i), g_w(x_j)), \tag{3}$$
with
$$R(g_w(x_i), g_w(x_j)) = \begin{cases} \left(-(g_w(x_i) - g_w(x_j) - \gamma)\right)^p, & g_w(x_i) - g_w(x_j) < \gamma, \\ 0, & \text{otherwise}, \end{cases} \tag{4}$$
where γ ∈ (0, 1] and p > 1 are two hyperparameters.
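For concreteness, a minimal PyTorch sketch of the relaxation in Eqs. (3)-(4) follows; the function and variable names are ours and not from a released implementation. Note that (−(d − γ))^p for d < γ is exactly clamp(γ − d, min=0)^p.

```python
import torch

def wmw_auc_loss(scores_pos: torch.Tensor, scores_neg: torch.Tensor,
                 gamma: float = 0.4, p: float = 2.0) -> torch.Tensor:
    """Differentiable relaxation of 1 - AUC, following Eqs. (3)-(4).

    For every positive/negative score pair, a penalty
    (-(g(x_i) - g(x_j) - gamma))^p is charged whenever the positive
    score fails to beat the negative score by the margin gamma.
    gamma = 0.4 and p = 2 are the paper's balanced-data settings.
    """
    # All pairwise differences g_w(x_i) - g_w(x_j), shape |P| x |N|.
    diff = scores_pos[:, None] - scores_neg[None, :]
    penalty = torch.clamp(gamma - diff, min=0.0) ** p
    return penalty.mean()  # averages over the |P||N| pairs
```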
Loss for the Proposed Residual Attention Network: We use a joint loss function comprising the conventional binary cross-entropy (BCE) loss L_BCE and the AUC loss L_AUC in a weighted sum:
$$L = \alpha L_{\mathrm{BCE}} + (1 - \alpha) L_{\mathrm{AUC}}, \tag{5}$$
where α ∈ [0, 1] is a scaling factor that balances the weights of the BCE loss and the AUC loss.
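A corresponding sketch of the joint objective in Eq. (5), reusing wmw_auc_loss from above. The 0/1 label convention (fake = 1), the degenerate-batch fallback, and the commented training loop (with the Adam settings reported in § IV-B) are our assumptions, not the released code.

```python
import torch
import torch.nn.functional as F

def joint_loss(scores: torch.Tensor, labels: torch.Tensor,
               alpha: float = 0.2, gamma: float = 0.4,
               p: float = 2.0) -> torch.Tensor:
    """Joint objective of Eq. (5): L = alpha*L_BCE + (1 - alpha)*L_AUC.

    `scores` are the RAN's Sigmoid outputs in (0, 1); `labels` are 0/1.
    alpha = 0.2 and gamma = 0.4 are the paper's balanced-data settings.
    """
    bce = F.binary_cross_entropy(scores, labels.float())
    pos, neg = scores[labels == 1], scores[labels == 0]
    if len(pos) == 0 or len(neg) == 0:
        return bce  # degenerate batch with one class: fall back to BCE
    auc = wmw_auc_loss(pos, neg, gamma, p)
    return alpha * bce + (1.0 - alpha) * auc

# Hypothetical training step with the paper's optimizer settings
# (Adam, lr = 0.001, batch size 128), assuming a model `ran` and a
# `loader` yielding (iris_pair, label) batches exist:
# optimizer = torch.optim.Adam(ran.parameters(), lr=1e-3)
# for pairs, labels in loader:
#     loss = joint_loss(ran(pairs).squeeze(1), labels)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
```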
IV. EXPERIMENT
For the experimental evaluation of the proposed method and comparison against the state-of-the-art methods, we first introduce the newly constructed FFHQ-GAN dataset (§ IV-A). Implementation details of the proposed method are provided in § IV-B. Performance evaluation on the FFHQ-GAN balanced and imbalanced subsets is presented in § IV-C. Ablation studies are provided in § IV-D. Finally, qualitative results are shown in § IV-E.

A. THE NEW FFHQ-GAN DATASET
We collect real human face images from the Flickr-Faces-HQ (FFHQ) dataset [3]. GAN-generated face images are created using StyleGAN2 [4] via http://thispersondoesnotexist.com, where the image resolution is 1024 × 1024 pixels. We randomly select 5,000 real face images from FFHQ and 5,000 GAN-generated face images. After iris detection, we discard those images where the iris of either eye is not detected. This results in 3,739 real faces (with iris pairs) and 3,748 fake faces (with iris pairs), which constitute our new FFHQ-GAN dataset. The split ratio of training to testing is 8:2.

To enable a thorough evaluation of the model in both balanced and imbalanced data scenarios, we sample the FFHQ-GAN dataset to form an imbalanced subset; the statistics of the subsets are provided in Table 2.
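As an illustration only, the following sketch builds balanced and imbalanced splits in the spirit described above. The actual subset sizes are those of Table 2 (not reproduced in this extraction), so the 10% fake fraction below is hypothetical; we assume the imbalanced subset keeps all real iris pairs and subsamples the fake ones, consistent with real faces outnumbering synthetic ones in the wild.

```python
import random

def make_splits(real_pairs: list, fake_pairs: list,
                fake_fraction: float = 1.0, train_ratio: float = 0.8,
                seed: int = 0):
    """Build an FFHQ-GAN-style split as (train, test) lists of (x, y).

    fake_fraction = 1.0 reproduces the balanced subset; a value < 1.0
    subsamples the fake class to mimic real-world imbalance.
    """
    rng = random.Random(seed)
    fakes = rng.sample(fake_pairs, int(len(fake_pairs) * fake_fraction))
    data = [(x, 0) for x in real_pairs] + [(x, 1) for x in fakes]
    rng.shuffle(data)
    cut = int(len(data) * train_ratio)  # the paper's 8:2 train/test split
    return data[:cut], data[cut:]

# Balanced: all 3,739 real and 3,748 fake iris pairs.
# train_b, test_b = make_splits(real_pairs, fake_pairs, fake_fraction=1.0)
# Imbalanced (hypothetical ratio): keep only 10% of the fakes.
# train_i, test_i = make_splits(real_pairs, fake_pairs, fake_fraction=0.1)
```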
B. IMPLEMENTATION DETAILS
We implemented our method in PyTorch [67]. Experiments are conducted on a workstation with two NVIDIA GeForce 1080Ti GPUs.

For iris detection, Mask R-CNN is trained using the datasets from [60], [61]. For each training eye image, the outer boundary mask of each iris is obtained using the method of [60] with default hyper-parameter settings. These masks are used to generate the iris bounding boxes and the corresponding masks for training, using the default settings in [19].

In the test stage, given an input face image, we first use the face detector and landmark extractor of DLib [27] to crop out the eye regions. Each cropped eye region is fed to Mask R-CNN for localizing the iris bounding box and segmentation mask. This process is repeated for both the left and right eyes to obtain the iris pairs as the input for our Residual Attention Network. We resize all iris pairs to a fixed size of 96 × 96 for training and testing to ensure that the whole pipeline works well.

Table 1 describes the details of our Residual Attention Network (RAN), where the Attention Module (AM) detailed in Figure 3 is repeatedly stacked three times. The network is trained using the Adam optimizer [68] with a learning rate of 0.001 and a batch size of 128. Training is terminated at 100 epochs for balanced data and 2,000 epochs for imbalanced data.

Hyper-Parameters: We set p = 2 in Eq. (4), with γ = 0.4 for the balanced dataset and γ = 0.6 for imbalanced data. For the experiments on the balanced dataset, α in Eq. (5) is set to 0.2. For the experiments on the imbalanced dataset, α is set to 0.4. These hyperparameters yield the best performance.

C. EVALUATION ON THE FFHQ-GAN DATASET
We report evaluation of GAN-generated face detection on the FFHQ-GAN dataset in terms of Accuracy (ACC), Precision (P), Recall (R), F1 score (F1), the area under the curve (AUC)
FIGURE 6. Performance comparison of the proposed method with ResNet with BCE loss, Xception with BCE loss, and RAN with BCE loss.
FIGURE 7. Confusion Matrix on the FFHQ-GAN (left) balanced and (right) imbalanced datasets.
FIGURE 10. Examples of detected GAN-generated faces and their corresponding iris regions and the attention maps produced from our method. These
examples show that our method can detect a wide range of face images, including those with tilted or side views where both irises are visible.
…on a balanced dataset and test on an imbalanced dataset. The obtained performance difference is similar to that of training/testing on the balanced dataset. This result suggests the importance of training the model on an imbalanced dataset if the model is expected to perform detection on imbalanced data.

…images and attends to the highlight parts for the fake images. Figure 10 shows additional examples of GAN-generated faces with the extracted iris pairs and corresponding attention maps. The visualization also provides an intuitive approach for human beings to identify GAN-generated faces by comparing their iris regions.
[2] T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," 2017, arXiv:1710.10196.
[3] T. Karras, S. Laine, and T. Aila, "A style-based generator architecture for generative adversarial networks," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 4401–4410.
[4] T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen, and T. Aila, "Analyzing and improving the image quality of StyleGAN," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2020, pp. 8110–8119.
[5] T. Karras, M. Aittala, S. Laine, E. Härkönen, J. Hellsten, J. Lehtinen, and T. Aila, "Alias-free generative adversarial networks," in Proc. NeurIPS, 2021, pp. 1–12.
[6] A Spy Reportedly Used an AI-Generated Profile Picture to Connect With Sources on LinkedIn. Accessed: Jun. 13, 2019. [Online]. Available: https://bit.ly/35BU215
[7] A High School Student Created a Fake 2020 US Candidate. Twitter Verified it. Accessed: Feb. 28, 2020. [Online]. Available: https://www.cnn.com/2020/02/28/tech/fake-twitter-candidate-2020/index.html
[8] How Fake Faces are Being Weaponized Online. Accessed: Feb. 20, 2020. [Online]. Available: https://www.cnn.com/2020/02/20/tech/fake-faces-deepfake/index.html
[9] These Faces are not Real. Accessed: Jul. 15, 2020. [Online]. Available: https://graphics.reuters.com/CYBER-DEEPFAKE/ACTIVIST/nmovajgnxpa/index.html
[10] X. Yang, Y. Li, and S. Lyu, "Exposing deep fakes using inconsistent head poses," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2019, pp. 8261–8265.
[11] F. Marra, D. Gragnaniello, L. Verdoliva, and G. Poggi, "Do GANs leave artificial fingerprints?" in Proc. IEEE Conf. Multimedia Inf. Process. Retr. (MIPR), Mar. 2019, pp. 506–511.
[12] H. Mo, B. Chen, and W. Luo, "Fake faces identification via convolutional neural network," in Proc. 6th ACM Workshop Inf. Hiding Multimedia Secur., Jun. 2018, pp. 43–47.
[13] N.-T. Do, I.-S. Na, and S.-H. Kim, "Forensics face detection from GANs using convolutional neural network," in Proc. ISITC, 2018, pp. 1–5.
[14] R. Wang, F. Juefei-Xu, L. Ma, X. Xie, Y. Huang, J. Wang, and Y. Liu, "FakeSpotter: A simple yet robust baseline for spotting AI-synthesized fake faces," 2019, arXiv:1909.06122.
[15] B. Chen, X. Ju, B. Xiao, W. Ding, Y. Zheng, and V. H. C. de Albuquerque, "Locally GAN-generated face detection based on an improved Xception," Inf. Sci., vol. 572, pp. 16–28, Sep. 2021.
[16] F. Matern, C. Riess, and M. Stamminger, "Exploiting visual artifacts to expose deepfakes and face manipulations," in Proc. IEEE Winter Appl. Comput. Vis. Workshops (WACVW), Jan. 2019, pp. 83–92.
[17] H. Guo, S. Hu, X. Wang, M.-C. Chang, and S. Lyu, "Eyes tell all: Irregular pupil shapes reveal GAN-generated faces," in Proc. ICASSP, 2022, pp. 1–6.
[18] S. Hu, Y. Li, and S. Lyu, "Exposing GAN-generated faces using inconsistent corneal specular highlights," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Jun. 2021, pp. 2500–2504.
[19] K. He, G. Gkioxari, P. Dollar, and R. Girshick, "Mask R-CNN," in Proc. ICCV, 2017, pp. 2961–2969.
[20] F. Wang, M. Jiang, C. Qian, S. Yang, C. Li, H. Zhang, X. Wang, and X. Tang, "Residual attention network for image classification," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 3156–3164.
[21] K. P. Murphy, Machine Learning: A Probabilistic Perspective. Cambridge, MA, USA: MIT Press, 2012.
[22] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2015, pp. 1–9.
[23] M. Fadaee, A. Bisazza, and C. Monz, "Data augmentation for low-resource neural machine translation," 2017, arXiv:1705.00440.
[24] F. Provost, T. Fawcett, and R. Kohavi, "The case against accuracy estimation for comparing induction algorithms," in Proc. ICML, vol. 98, 1998, pp. 445–453.
[25] L. Yan, R. H. Dodier, M. Mozer, and R. H. Wolniewicz, "Optimizing classifier performance via an approximation to the Wilcoxon-Mann-Whitney statistic," in Proc. ICML, 2003, pp. 848–855.
[26] S. Lyu and Y. Ying, "A univariate bound of area under ROC," in Proc. UAI, 2018, pp. 1–10.
[27] D. E. King, "Dlib-ml: A machine learning toolkit," J. Mach. Learn. Res., vol. 10, pp. 1755–1758, Jan. 2009.
[28] F. Marra, C. Saltori, G. Boato, and L. Verdoliva, "Incremental learning for the detection and classification of GAN-generated images," in Proc. IEEE Int. Workshop Inf. Forensics Secur. (WIFS), Dec. 2019, pp. 1–6.
[29] M. Goebel, L. Nataraj, T. Nanjundaswamy, T. M. Mohammed, S. Chandrasekaran, and B. S. Manjunath, "Detection, attribution and localization of GAN generated images," 2020, arXiv:2007.10466.
[30] S.-Y. Wang, O. Wang, R. Zhang, A. Owens, and A. A. Efros, "CNN-generated images are surprisingly easy to spot... for now," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), vol. 7, Jun. 2020, pp. 1–10.
[31] Z. Liu, X. Qi, and P. H. S. Torr, "Global texture enhancement for fake face detection in the wild," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2020, pp. 8060–8069.
[32] N. Hulzebosch, S. Ibrahimi, and M. Worring, "Detecting CNN-generated facial images in real-world scenarios," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2020, pp. 642–643.
[33] S. McCloskey and M. Albright, "Detecting GAN-generated imagery using color cues," 2018, arXiv:1812.08247.
[34] H. Li, B. Li, S. Tan, and J. Huang, "Identification of deep network generated images using disparities in color components," 2018, arXiv:1808.07276.
[35] N. Yu, L. Davis, and M. Fritz, "Attributing fake images to GANs: Learning and analyzing GAN fingerprints," in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2019, pp. 7556–7566.
[36] X. Zhang, S. Karaman, and S.-F. Chang, "Detecting and simulating artifacts in GAN fake images," in Proc. IEEE Int. Workshop Inf. Forensics Secur. (WIFS), Dec. 2019, pp. 1–6.
[37] J. Frank, T. Eisenhofer, L. Schönherr, A. Fischer, D. Kolossa, and T. Holz, "Leveraging frequency analysis for deep fake image recognition," 2020, arXiv:2003.08685.
[38] R. Durall, M. Keuper, and J. Keuper, "Watch your up-convolution: CNN based generative deep neural networks are failing to reproduce spectral distributions," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2020, pp. 7890–7899.
[39] X. Yang, Y. Li, H. Qi, and S. Lyu, "Exposing GAN-synthesized faces using landmark locations," in Proc. ACM Workshop Inf. Hiding Multimedia Secur., Jul. 2019, pp. 113–118.
[40] D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," 2014, arXiv:1409.0473.
[41] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, "Learning deep features for discriminative localization," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 2921–2929.
[42] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, "Grad-CAM: Visual explanations from deep networks via gradient-based localization," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 618–626.
[43] A. Chattopadhay, A. Sarkar, P. Howlader, and V. N. Balasubramanian, "Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks," in Proc. IEEE Winter Conf. Appl. Comput. Vis. (WACV), Mar. 2018, pp. 839–847.
[44] S. Kardakis, I. Perikos, F. Grivokostopoulou, and I. Hatzilygeroudis, "Examining attention mechanisms in deep learning models for sentiment analysis," Appl. Sci., vol. 11, no. 9, p. 3883, Apr. 2021.
[45] S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, "CBAM: Convolutional block attention module," in Proc. ECCV, 2018, pp. 3–19.
[46] J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 7132–7141.
[47] Y. Zhang, K. Li, K. Li, L. Wang, B. Zhong, and Y. Fu, "Image super-resolution using very deep residual channel attention networks," in Proc. ECCV, 2018, pp. 286–301.
[48] L. Chen, H. Zhang, J. Xiao, L. Nie, J. Shao, W. Liu, and T.-S. Chua, "SCA-CNN: Spatial and channel-wise attention in convolutional networks for image captioning," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 5659–5667.
[49] S. Woo, J. Park, J. Y. Lee, and I. S. Kweon, "CBAM: Convolutional block attention module," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2018, pp. 3–19.
[50] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 770–778.
[51] Y. Yang and Z. Xu, "Rethinking the value of labels for improving class-imbalanced learning," 2020, arXiv:2006.07529.
[52] K. Cao, C. Wei, A. Gaidon, N. Arechiga, and T. Ma, "Learning imbalanced datasets with label-distribution-aware margin loss," in Proc. NeurIPS, 2019, pp. 1567–1578.
[53] S. Hu, Y. Ying, and S. Lyu, "Learning by minimizing the sum of ranked range," in Proc. Adv. Neural Inf. Process. Syst., vol. 33, 2020, pp. 21013–21023.
[54] S. Hu, Y. Ying, X. Wang, and S. Lyu, "Sum of ranked range loss for supervised learning," 2021, arXiv:2106.03300.
[55] Y. Wang, W. Gan, J. Yang, W. Wu, and J. Yan, "Dynamic curriculum learning for imbalanced data classification," in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2019, pp. 5017–5026.
[56] C. Huang, Y. Li, C. C. Loy, and X. Tang, "Learning deep representation for imbalanced classification," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 5375–5384.
[57] H. He and E. A. Garcia, "Learning from imbalanced data," IEEE Trans. Knowl. Data Eng., vol. 21, no. 9, pp. 1263–1284, Sep. 2009.
[58] C. Shorten and T. M. Khoshgoftaar, "A survey on image data augmentation for deep learning," J. Big Data, vol. 6, no. 1, pp. 1–48, Dec. 2019.
[59] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," 2015, arXiv:1506.01497.
[60] C. Wang, J. Muhammad, Y. Wang, Z. He, and Z. Sun, "Towards complete and accurate iris segmentation using deep multi-task attention network for non-cooperative iris recognition," IEEE Trans. Inf. Forensics Security, vol. 15, pp. 2944–2959, 2020.
[61] C. Wang et al., "NIR iris challenge evaluation in non-cooperative environments: Segmentation and localization," in Proc. IEEE Int. Joint Conf. Biometrics (IJCB), Aug. 2021, pp. 1–10.
[62] L.-C. Chen, Y. Yang, J. Wang, W. Xu, and A. L. Yuille, "Attention to scale: Scale-aware semantic image segmentation," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 3640–3649.
[63] M. Jaderberg, K. Simonyan, A. Zisserman, and K. Kavukcuoglu, "Spatial transformer networks," in Proc. NIPS, vol. 28, 2015, pp. 2017–2025.
[64] Q. Jin, Z. Meng, C. Sun, H. Cui, and R. Su, "RA-UNet: A hybrid deep attention-aware network to extract liver and tumor in CT scans," Frontiers Bioeng. Biotechnol., vol. 8, p. 1471, Dec. 2020.
[65] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Proc. Med. Image Comput. Comput.-Assist. Intervent. (MICCAI). Springer, 2015, pp. 234–241.
[66] C. Cortes and M. Mohri, "AUC optimization vs. error rate minimization," in Proc. Adv. Neural Inf. Process. Syst., vol. 16, 2003, pp. 313–320.
[67] A. Paszke et al., "PyTorch: An imperative style, high-performance deep learning library," in Proc. Adv. Neural Inf. Process. Syst., vol. 32, Dec. 2019, pp. 8026–8037.
[68] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," in Proc. ICLR, 2015, pp. 1–15.
[69] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 1251–1258.

HUI GUO is currently pursuing the Ph.D. degree with the University at Albany, State University of New York. Her research interests include digital media forensics, computer vision, and deep learning.

XIN WANG (Senior Member, IEEE) received the Ph.D. degree in computer science from the University at Albany, State University of New York, Albany, NY, USA, in 2015. He is currently a Senior Machine Learning Scientist at Keya Medical, Seattle, WA, USA. His research interests include artificial intelligence, machine learning, reinforcement learning, medical image computing, computer vision, and media forensics.

MING-CHING CHANG (Senior Member, IEEE) received the B.S. degree in civil engineering and the M.S. degree in computer science and information engineering (CSIE) from the National Taiwan University, in 1996 and 1998, respectively, and the Ph.D. degree from the Laboratory for Engineering Man/Machine Systems (LEMS), School of Engineering, Brown University, in 2008. He was an Assistant Researcher at Mechanical Industry Research Labs, Industrial Technology Research Institute (ITRI), Taiwan, from 1996 to 1998. From 2008 to 2016, he was a Computer Scientist at the GE Global Research Center. From 2016 to 2018, he was with the Department of Electrical and Computer Engineering. He is currently an Assistant Professor at the Department of Computer Science, College of Engineering and Applied Sciences (CEAS), University at Albany, State University of New York (SUNY). His research projects are funded by GE Global Research, IARPA, DARPA, NIJ, VA, and UAlbany. He has authored more than 100 peer-reviewed journal and conference publications, seven U.S. patents, and 15 disclosures. His research interests include video analytics, computer vision, image processing, and artificial intelligence. He is a member of ACM. He was a recipient of the IEEE Advanced Video and Signal-based Surveillance (AVSS) 2011 Best Paper Award - Runner-Up, the IEEE Workshop on the Applications of Computer Vision (WACV) 2012 Best Student Paper Award, the GE Belief - Stay Lean and Go Fast Management Award in 2015, and the IEEE Smart World NVIDIA AI City Challenge 2017 Honorary Mention Award. He serves as the Co-Chair for the Annual AI City Challenge CVPR 2018–2021 Workshop, the Co-Chair for the IEEE Low-Power Computer Vision (LPCV) Annual Contest and Workshop 2019–2021, the Program Chair for the IEEE Advanced Video and Signal-Based Surveillance (AVSS) 2019, the Co-Chair for the IWT4S 2017–2019, the Area Chair for IEEE ICIP (2017, 2019–2021) and ICME (2021), and the TPC Chair for the IEEE MIPR 2022.