
Comparative Analysis of Deepfake Detection Models on Diverse GAN-Generated Images


Original Scientific Paper

Medha Wyawahare*, Vrinda Parkhi, Siddharth Bhorge, Mayank Jha, Milind Rane, Narendra Muhal

Vishwakarma Institute of Technology, Department of Electronics and Telecommunication, Pune, Maharashtra, India

*Corresponding author

Abstract – Advances in artificial intelligence have led to the evolution of various deepfake generation methods. This in turn spreads fake information, which needs to be restricted. Deepfake detection methods offer a solution to this problem. However, a detection method that gives the best results on one set of deepfake images (generated by a particular generation method) fails to detect another set of deepfake images (generated by another method). In this work, various deepfake detection methods were tested for their suitability to detect deepfake images generated by various generation methods.
We used VGG16, ResNet50, VGG19, and MobileNetV2 for deepfake detection and pre-trained StyleGAN2, StyleGAN3, and ProGAN models for fake generation. The training dataset comprised 200,000 images, 50% of which were real and 50% fake. The best-performing deepfake detection model was VGG19, with more than 96% accuracy on StyleGAN2-, StyleGAN3-, and ProGAN-generated fakes.

Keywords: CNN, GAN, VGG19, StyleGAN3, Deepfake

Received: March 22, 2024; Received in revised form: August 20, 2024; Accepted: August 21, 2024

1. INTRODUCTION

The deepfake image synthesis and detection field has attracted significant research due to the convergence of computer vision and artificial intelligence. This multidisciplinary area, which attracts academicians, researchers and business experts, focuses on the automated creation and recognition of modified visual content. Deepfake image generation and detection have practical repercussions in a variety of fields, from digital forensics and content verification to maintaining user confidence in computer-human interactions. Furthermore, it has the power to fundamentally alter how society interacts with visual information. The creation of models that can not only create but also recognize real images from altered ones is at the core of this endeavor. Similar to how image captioning seeks to describe scenes, the main goal in this case is to create material that fools or imitates reality. This technology offers inventive ways to create digital content but also presents difficulties that call for strict safeguards against misuse and false information.

Modern deep learning techniques serve as inspiration for the architecture supporting deepfake image production and detection. Modern methods usually

Volume 16, Number 1, 2025 9


use the encoder-decoder paradigm, which is appropriate for both facets of this topic. To encode the source's unique features, the encoder must convert them into compact feature vectors. The decoder then makes use of these vectors to create images or determine their legitimacy. The core of the encoder component is a convolutional neural network (CNN). The use of well-known CNN architectures like VGG, ResNet, and MobileNet is common in the encoder. The model's capacity to spot subtle patterns in real and altered images is aided by this larger viewpoint.

The use of four specialized detection models demonstrates our commitment to excellence in deepfake image production and detection. Every model has been painstakingly designed to handle particular aspects of deepfake identification, improving the model's overall accuracy and adaptability. We explore the world of Generative Adversarial Networks (GANs), a powerful method for producing deepfake content, as a complement to these detection attempts. We seek to advance the authenticity and realism of the created images by utilizing adversarial training, adding to the continuing arms race between creation and detection.

In this paper, we conduct a comprehensive evaluation of eight different CNN models, such as VGG16, ResNet50, VGG19, and Xception, among others, for deepfake detection. Initially, we train these CNN models on the OpenForensics dataset, a widely used benchmark dataset in the field of deepfake detection. To evaluate these models' performance and generalizability, we test them using a recently created dataset that contains a wide variety of deepfakes. Furthermore, to enhance the robustness of our evaluation, we augment the OpenForensics dataset by incorporating our own generated data, thereby expanding the dataset's diversity and realism. Subsequently, we retrain the CNN models on this augmented dataset, leveraging the enriched data to improve the models' performance. Finally, we rigorously evaluate the trained models by testing them on three distinct sets of GAN-generated data: ProGAN, StyleGAN2, and StyleGAN3 [1]. The newly generated dataset from these GANs is available on Kaggle for public access. With this approach, we hope to shed light on how well CNN models perform in identifying deepfakes across a variety of datasets and GAN architectures.

2. LITERATURE REVIEW

In 2019, Yadav et al. [2] put forth that deepfake images are man-made media, especially edited videos or photos, produced by cutting-edge machine learning algorithms that can accurately replicate real human expressions and activities. They examine many strategies, from conventional GAN-based techniques to more complex variants, like conditional GANs and cycle-consistent GANs. To create extremely realistic facial forgeries, the proposed deepfake generation model uses a conditional GAN architecture, where the generator is conditioned on both the input and the target identity. To limit the potential misuse of deepfakes, they added watermarks to the deepfakes. Sanjana et al. [3] gave a thorough analysis of the current deepfake detection methods to stop the spread of false information and protect the integrity of multimedia content. Detection techniques like CNNs and GANs can spot deepfake face swapping, in which one person's face is swapped out for the face of another. To increase the effectiveness of deepfake detection, transfer learning techniques that use pre-trained models for related tasks (e.g., facial recognition) are used.

Malik et al. [4] provided a thorough analysis of the various techniques and procedures employed for deepfake detection. They look at a variety of strategies, computer vision approaches, and deep learning-based solutions in particular. The survey examines the advantages and disadvantages of various detection techniques and covers both text-based and video-based deepfakes. Rana et al. [5] and Paul et al. [6] show significant progress in developing robust deepfake detectors capable of fending off ever-more complex manipulation techniques using GANs for adversarial training. The study produces encouraging results in audio-based deepfake detection using recurrent neural networks, concentrating on minute acoustic artifacts created during speech synthesis to distinguish altered audio from real recordings.

Nguyen et al. [7] investigated various deep learning architectures, such as GANs, autoencoders, and others, that are used to produce deepfake content. To maintain visual integrity, autoencoders, a form of unsupervised deep learning model, have been used for deepfake production. By training on small samples of recently emerging deepfake content, one-shot learning algorithms have demonstrated potential for identifying unique deepfake variants. Datasets like FaceForensics++ and the Deepfake Detection Challenge (DFDC) dataset have been significant in advancing research and testing performance in the deepfake detection field. Shen et al. [8] investigated the technical aspects of how GANs are utilized to create deepfakes, including training the networks, selecting suitable datasets, and fine-tuning the models to produce realistic results. To improve convergence and generation quality, a Wasserstein GAN variant with a deep convolutional architecture is used to train the generator network. They use the Structural Similarity Index (SSIM) metric and Peak Signal-to-Noise Ratio (PSNR) to objectively analyze the similarity between the created content and the ground truth data and thereby assess the performance of their deepfake generation. Giudice et al. [9] outline a technique to spot abnormalities in the Discrete Cosine Transform (DCT) domain of GAN-generated images. This work focuses on preventing deepfakes. The DCT is frequently used for image compression, including JPEG encoding, and GANs

10 International Journal of Electrical and Computer Engineering Systems


also employ it for creation. By examining anomalies in the DCT coefficients of images created using GANs, the authors of this research take a fresh approach to deepfake identification. They distinguish between real content and altered content by using statistical metrics and machine learning classifiers to identify specific DCT artifacts connected to deepfake images.

Shad et al. [10] conducted a comparative analysis of the performance of eight CNN architectures. They used images from the Flickr dataset for training the models. Fake images for training were generated using StyleGAN. They evaluated these models on five different evaluation metrics, such as accuracy, precision, recall, etc. VGGFace and ResNet50 performed best with an accuracy of 99% and 97%, respectively. Saxena et al. [11] gave a thorough introduction to GANs, noting the difficulties in training and using them, suggesting different ways to solve these problems, and providing suggestions for possible future research paths. The study contributes to a deeper knowledge of GANs and acts as an invaluable resource for scholars and practitioners in the field of artificial intelligence by carefully examining existing research and methodologies.

Ali et al. [12] tested the generalization of fake face detection methods. Two types of fake face detection methods were tested in their paper. The first is texture-based Local Binary Patterns (LBP), and the second uses different CNN architectures such as AlexNet, VGG19, ResNet50, etc. These methods are tested on known and unknown data, and the results show that their performance drops for the unknowns. These results indicate the lack of transferability of the learned classifiers to the general face-forgery classification cases. Patashnik et al. [13] proposed StyleCLIP, a powerful framework for manipulating images generated by StyleGAN using natural language descriptions. By aligning the CLIP model's textual embeddings with StyleGAN's latent space, users can apply targeted modifications to generated images simply by providing descriptive text. StyleCLIP allows users to create diverse and specific visual outputs, offering an exciting approach to interactive and intuitive image synthesis and manipulation.

Kumar et al. [14] reviewed various techniques for implementing and detecting deepfake images, focusing on Deep Convolution-based GAN models. A comparative analysis of the proposed GAN model with existing models is performed using parameters such as Inception Score (IS) and Fréchet Inception Distance (FID). Deepfake images present a substantial threat to biometric security and facilitate counterfeiting and fraudulent activities.

Tiwari et al. [15] discuss the use of GANs in creating highly realistic deepfakes and their role in both generating and detecting fake content through discriminator networks. CNNs are highlighted for their effectiveness in classification and detecting subtle anomalies in images and videos, making them a primary method for deepfake detection. RNNs and LSTMs are noted for their capability to handle sequential data, making them suitable for analyzing video content and identifying temporal inconsistencies indicative of deepfakes. Recent advancements in attention mechanisms and transformers show promise for improving deepfake detection accuracy through sophisticated feature extraction and analysis. The authors evaluated deepfake detection models using the Inception Score (IS) and Fréchet Inception Distance (FID) to quantify the quality of the generated data and the effectiveness of detection algorithms.

Nowroozi et al. [16] describe the application of GAN-based CNN models for deepfake detection, highlighting their effectiveness in distinguishing real from artificial faces. The effectiveness of the CNN models, which are Cross-Co-Net and Co-Net, was compared to alternative approaches. They showed superior accuracy, which underscores the robustness of combining GAN-generated data with CNNs for deepfake detection.

Sharma et al. [17] presented the effectiveness of GANs for deepfake detection, leveraging a GAN-based CNN model. Using an Indian actors dataset, the study demonstrates how GANs may be used to expand training datasets, hence improving the robustness of the model. The suggested approach outperforms existing techniques and demonstrates its potential for useful applications in digital forensics and image recognition. Sergi et al. [18] investigated the human ability to identify deepfakes created using the StyleGAN2 algorithm. Three intervention tactics were tested for their efficacy in detecting deepfakes through an online poll that attracted 280 participants. Following the evaluation of twenty images, the participants' accuracy scores ranged from 60% to 64% depending on the situation, indicating that deepfake images produced by StyleGAN2 are difficult for humans to detect. Notably, interventions did not significantly improve detection accuracy. The findings highlight the difficulty in detecting deepfake images and underscore the urgent need for enhanced detection methods and public awareness.

3. RESEARCH GAP

There has been a lot of intensive research and development in this field in the last few years as a result of the rise of Artificial Intelligence (AI) and Deep Learning (DL) technologies. In the literature that we reviewed, there are several gaps in the current state of deepfake detection research. While the majority of research to date has focused on GAN-based methods and specific designs, there are noticeably few comprehensive comparative studies that look at a larger variety of GAN variants. Moreover, reliance on established datasets, such as FaceForensics++ and DFDC, restricts the understanding of model effectiveness across diverse data sources, indicating a need for research that examines models' performance on more varied and less curated datasets.



Furthermore, the challenge of generalization persists, with many models demonstrating effectiveness on known data but struggling to maintain accuracy in the face of new deepfake techniques or unknown data. The field lacks sufficient exploration of the robustness of detection models against adversarial attacks, highlighting a critical gap in ensuring the reliability of detection methods in real-world scenarios. The scarcity of work comparing different detection models on fakes generated using different GANs is evident.

Organized research to identify the most robust detection algorithm, capable of performing well on all types of deepfakes across various GAN architectures, is essential. The lack of established evaluation metrics and benchmarks makes it difficult to compare detection models consistently, which emphasizes the necessity of research projects targeted at creating accurate and consistent evaluations of model performance.

4. PROBLEM STATEMENT

The objectives of our research are to:

• Generate deepfake images using three different GANs, so that we have diverse fake images to test our detection methods.
• Train eight different CNN models on the dataset to detect fake images; these will be our detection models.
• Compare the performance of the deepfake detection models when they are tested on the diverse fake images generated by different GANs, in order to suggest the best-performing deepfake detection method.

5. METHODOLOGY

Fig. 3 shows the general preprocessing and detection flow that the model goes through.

5.1. Dataset

The dataset comprised both real and fake images. Fig. 1 shows sample real and fake images, along with the fake images generated using ProGAN, StyleGAN2, and StyleGAN3.

Fig. 1. Dataset containing (a) Real, Fake from (b) OpenForensics, (c) ProGAN, (d) StyleGAN2, and (e) StyleGAN3. [19]

We used the OpenForensics dataset [19], which is an open dataset and contains approximately 200,000 images. It was split in the ratio 70:20:10 (70% training, 20% validation, and 10% testing). The number of images used for training, validation, and testing is shown in Table 1.

A dataset of 15,000 fake images was generated from the three pre-trained GAN models, five thousand from each. We added 5,000 and 1,400 fake generated images from each GAN model for training, testing and validation. This increased the robustness, diversity, and overall quality of the dataset before it was used for training.

Table 1. The Dataset Utilized

Datasets   Number of images
           Train    Validation   Test
Real       70001    19787        5413
Fake       70001    19641        5492
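
As a rough illustration of the 70:20:10 split described above, the sketch below shuffles a list of image paths and partitions it into training, validation, and test subsets. The function name `split_dataset`, the fixed seed, and the placeholder file names are assumptions made for this example; the paper does not state how its split was implemented.

```python
import random


def split_dataset(paths, ratios=(0.7, 0.2, 0.1), seed=42):
    """Shuffle paths and split them into train/validation/test lists.

    Hypothetical helper: the default ratios follow the 70:20:10 split
    stated in the text; the seed is an arbitrary choice that only makes
    the shuffle reproducible.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder forms the test set
    return train, val, test


# Placeholder file names standing in for real and fake image paths.
real = [f"real_{i}.jpg" for i in range(1000)]
fake = [f"fake_{i}.jpg" for i in range(1000)]

# Splitting each class separately keeps every subset class-balanced.
real_train, real_val, real_test = split_dataset(real)
fake_train, fake_val, fake_test = split_dataset(fake)
print(len(real_train), len(real_val), len(real_test))
```

Splitting the real and fake lists independently is one simple way to keep the 50:50 class balance in every subset; a stratified split from a library would serve the same purpose.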

5.2. Generation of Deepfakes

Generative Adversarial Networks (GANs) are mostly used to generate fake media. A GAN consists of two parts. The first is the generator, which generates the fakes. It starts with a random vector and keeps improving until it generates an image of the desired quality. The second is the discriminator, which determines whether the data produced by the generator is real or fake based on real training data. If the discriminator correctly classifies the generator's fake as fake, then the generator updates its model weights to generate better fakes, and
if the discriminator fails to recognize the generator's fake, then the discriminator updates its model weights to better distinguish between real and fake. Both the generator and discriminator keep updating their models in a loop until the generator can generate fake images good enough to fool the discriminator. Fig. 2 depicts the GAN architecture and shows how the two parts work together, as described above.

Fig. 2. Block diagram of GAN

For generating fakes, we used three pre-trained GAN models: StyleGAN2, StyleGAN3, and ProGAN. A total of 5,000 test images were generated with each GAN model to test the detection methods. ProGAN, short for Progressive GAN, was trained on the 'CelebA' dataset and produced images with a resolution of 128x128 pixels [20]. Its progressive training approach starts with low-resolution images and gradually increases the resolution, allowing it to capture fine details as it progresses.

In contrast, StyleGAN2 and StyleGAN3 are both high-resolution GANs. The 256x256 models we used were trained on the 'FFHQ' dataset [21], which contains human faces, and they are known for generating exceptionally high-quality images. Since StyleGAN3 is an advancement over StyleGAN2, it generated the best fake images of them all.

Algorithm 1 outlines the process of generating fake images using pre-trained GAN models. Initially, the algorithm loads the pre-trained GAN model from a specified file and extracts the generator network responsible for image generation. Afterward, it sets parameters such as the number of fakes to generate and the truncation factor for controlling quality. Through a loop, the algorithm generates each fake image by creating a random latent space vector, feeding it into the generator network, and converting the output into a recognizable image format. These generated fake images are then saved to a designated directory. By systematically iterating through these steps, the algorithm efficiently produces a set of fake images, leveraging the capabilities of pre-trained GAN models.

Algorithm 1 - Deepfake generation using GAN

Input:
• Pretrained model
• Truncation factor (truncation_psi) or latent dimension for controlling quality
Output:
• Fake generated images stored in specified directory.
Load Pretrained GAN Model:
• Load the pre-trained GAN model from the specified file.
• Extract the generator network (G) from the loaded model.
Create Output Directory
Generate fake images:
• Loop for each image:
a. Generate a random latent space vector using torch.randn.
b. Generate an image using the generator network (G) with the specific latent space and conditioning.
c. Convert the PyTorch tensor to a PIL image.
d. Save the generated image in the output directory with a unique filename.
End

5.3. Detection of Deepfakes

CNN models are frequently used for detection tasks and usually use an encoder-decoder design. The CNN encoder creates a condensed feature vector after processing the inputs. The desired output is then produced by a CNN decoder using this feature representation. In this system, CNN models are trained on the datasets and compared by accuracy.

The goal is to ascertain the relative performance of each model in identifying deepfake content. Models such as ResNet50V2, DenseNet121, VGG16, VGG19, InceptionV3, InceptionResNetV2, Xception, and MobileNetV2 are examined in greater detail. In this manner, we can learn about their distinct advantages and disadvantages in terms of spotting deepfakes.

We can choose the model or combination of models that works best for our deepfake detection task by evaluating each model's accuracy independently. By using this technique, we can improve the deepfake detection system and make it more dependable and capable of handling the rapidly changing deepfake technology landscape.

The basis for constructing and optimizing the eight different CNNs is our training dataset, which consists of more than 140,000 images.

Hyperparameter and Training Settings:
• Learning Rate: 0.0001
• Activation Method: ReLU
• Optimizer: Adam optimizer
• Batch Size: 64
• Number of Epochs: 10

The model was trained with a learning rate of 0.0001 and the Adam optimizer, which combines the benefits of AdaGrad and RMSProp. With a batch size of 64 to fit GPU memory, the model was trained for 10 epochs to balance training time and performance.

After the training phase, we use a testing dataset of about 10,000 images to thoroughly evaluate the model's performance. Each model is tasked with determining whether a given image is real or fake throughout this review. For each test image, the model predicts whether it is real or fake, and the accuracy of the model is then calculated.

Algorithm 2 outlines the steps involved in building, training, and evaluating a deepfake detection model using a generic CNN architecture. The flexibility of using a CNN allows for customization based on specific requirements and facilitates the development of an effective deepfake detection system.

Algorithm 2: Deepfake Detection using CNN

Input:
• Datasets for training, validation, and testing (real and fake images)
• Hyperparameters for the CNN model
• Number of training iterations
Output:
• Trained CNN model
Start
• Import necessary libraries and modules.
• Set the base path for the dataset.
Prepare the Dataset
• Load and preprocess the training, validation, and testing datasets.
• Visualize a sample of images from the training set.
Build the Model
• Define the architecture for the CNN model.
• Compile the model using the Adam optimizer and categorical cross-entropy loss.
Define Callback for Evaluation
• Create a custom callback (Prediction Callback) to evaluate the model on the validation set after each epoch.
Train the Model
• Set the number of training steps and validation steps based on batch size and dataset size.
• Train the model using the training and validation datasets.
• Utilize the custom callback for evaluating the model's performance on the validation set.
Save the Model
• Save the trained CNN model for future use.
End

Fig. 3 shows the block diagram of our detection model. The dataset is first preprocessed, and the model is then trained on it using the different CNN architectures. After training, the model is exported and tested on the test dataset, which consists of 10,000 images containing both real and fake, and its accuracy is calculated.

Fig. 3. Block diagram of detection model

Our approach to deepfake detection follows a structured and well-thought-out flow. It all starts with the data collection of both real and fake images that form the basis of our system. To make this data useful, we take a step called preprocessing, where we divide it
into three key parts: the training set, the validation set, more images to the training dataset created by each of
and the test set. We distribute them in a balanced way, the three GAN models—Style_GAN_3, Style_GAN_2,
with 70% for training, 20% for validation, and 10% for and ProGAN to improve accuracy.
testing. CNNs are an effective technique that we utilize
The objective of this addition was to ensure that the
to train the model using the training dataset.
model was exposed to better quality fake images and a
We used exported models for deepfake detection wider variety of fake images by adding more diversity
in the testing set. This indicated the true effectiveness and balance to the dataset. So, to address the initial low
of our deepfake detecting algorithm. It served as per- accuracy rates, the newly generated fake images were
formance evaluation of the system, demonstrating its subsequently included in the training and validation
dependability and efficiency in exposing misleading sets of the dataset.
material across a range of contexts.
After making this modification, the model's per-
formance was significantly improved. Across all three
6. RESULT AND DISCUSSION
GAN datasets, the re-trained models showed a notable
Training and testing of the models was done on cloud increase in accuracy after being trained on better fake
infrastructure. It featured dual Intel Xeon Silver 4114 images. Experiments with different activation strate-
CPUs with 40 cores, 128GB of DDR4-2666 ECC Memory, gies and learning rates were conducted to achieve
and an Nvidia Tesla V100 GPU with 32GB VRAM. With even better results. ReLU activation and 0.0001 learn-
4TB of HDD storage and a 100 Mb/s Ethernet interface, ing rate were found to work best for the model.
it was well-equipped for demanding Deep learning Fig.4. shows the graph of loss in training and loss in
tasks. validation vs the epochs. Graphs of all the eight models
A dataset containing 200,000 distinct real and fake have been included in the figure. Optimal configura-
images were used to train the model. Random images tions of 64 batch sizes and 10 epochs were determined
were fed into the testing process to determine whether through systematic testing for all CNN models.
or not they were real. In Fig.4 it can be observed that VGG16 has the best
Despite initial success with the OpenForensics da- training-validation loss curve as it has good training
taset, testing on deepfake images from StyleGAN2, loss convergence and the validation loss also converg-
StyleGAN3, and ProGAN revealed underwhelming accuracies. Table 2 shows the obtained accuracies, which ranged from 20% to 50% and are noticeably low across multiple CNN architectures. These results indicate how well the models could distinguish between real and fake images.

Table 2. Comparison of Detection Accuracies of CNN models tested on various GANs

Models         Style_GAN_2_FFHQ_256   Style_GAN_3_FFHQ_256   ProGAN_CelebA_128
VGG16          29.620%                21.342%                35.305%
VGG19          35.343%                26.355%                42.397%
DenseNet121    30.650%                23.165%                38.525%
MobileNetV2    29.420%                20.270%                33.447%

In the initial stage, when we evaluated the performance, we realized that the model needed to be trained on all of the datasets, including ProGAN, StyleGAN2, and StyleGAN3. Then a calculated choice was made to add

es close to training loss with little fluctuation. Some of the models' validation curves were smooth, with little fluctuation and good convergence, while others showed spikes and variation relative to the training loss. This indicates how the models performed on unseen data compared with seen data and helps determine which model works best on unseen data. Some models do not have a good training-validation graph, either because the model has not generalized well (that is, it has performed poorly on unseen data) or because the validation data sometimes differs in quality from the training data. The models with better validation graphs have generalized well on unseen data.

Table 3 summarizes the accuracy of different CNNs on the difficult task of detecting manipulated images, as part of our thorough review of deepfake detection approaches.

Table 3. Comparison of Detection Accuracy for Various GAN Models Using Different Convolutional Neural Networks (CNNs)

Models              Style_GAN_2_FFHQ_256   Style_GAN_3_FFHQ_256   ProGAN_CelebA_128
VGG16               95.038%                94.901%                94.987%
VGG19               97.983%                96.744%                97.397%
DenseNet121         95.650%                95.045%                95.525%
InceptionResNetV2   96.495%                96.380%                96.447%
InceptionV3         97.052%                96.668%                96.831%
MobileNetV2         95.707%                95.592%                95.659%
ResNet50V2          95.150%                93.230%                95.304%
Xception            96.956%                96.908%                96.898%

In Table 3, three different deepfake datasets are used to thoroughly evaluate each CNN's performance in differentiating between real and fake content: Style_GAN_2_FFHQ_256, Style_GAN_3_FFHQ_256, and ProGAN_CelebA_128. The table reports the accuracy, in percent, achieved by the models under consideration: VGG16, VGG19, DenseNet121, InceptionResNetV2, InceptionV3, MobileNetV2, ResNet50V2, and Xception.

VGG19 emerges as a strong contender, with the highest accuracy of 97.983% on the Style_GAN_2_FFHQ_256 dataset and 97.397% on ProGAN_CelebA_128. Xception demonstrates its robustness on the Style_GAN_3_FFHQ_256 dataset by attaining the highest accuracy of 96.908%. InceptionResNetV2 and InceptionV3 achieved accuracy above 96% on all of the datasets and were consistent, as were the other models.

Overall, the accuracy of all the models ranged between about 93% and 98%. The models were least accurate on StyleGAN3 images because these were of the best quality among the images created by the three GANs. ProGAN and StyleGAN2 images were detected with comparatively greater accuracy.
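The comparisons drawn from Table 3 can be reproduced with a few lines of Python. The sketch below copies the accuracies from Table 3 and computes the best model per dataset, each model's mean accuracy, the mean accuracy per dataset, and the overall range. The function names are illustrative, and the unweighted mean across the three datasets is an assumption: the paper does not state how "overall" accuracy is aggregated.

```python
# Accuracies (%) copied from Table 3.
# Column order: Style_GAN_2_FFHQ_256, Style_GAN_3_FFHQ_256, ProGAN_CelebA_128.
DATASETS = ("Style_GAN_2_FFHQ_256", "Style_GAN_3_FFHQ_256", "ProGAN_CelebA_128")
TABLE3 = {
    "VGG16": (95.038, 94.901, 94.987),
    "VGG19": (97.983, 96.744, 97.397),
    "DenseNet121": (95.650, 95.045, 95.525),
    "InceptionResNetV2": (96.495, 96.380, 96.447),
    "InceptionV3": (97.052, 96.668, 96.831),
    "MobileNetV2": (95.707, 95.592, 95.659),
    "ResNet50V2": (95.150, 93.230, 95.304),
    "Xception": (96.956, 96.908, 96.898),
}


def best_per_dataset(table):
    """Return {dataset: (model, accuracy)} for the top model on each dataset."""
    result = {}
    for i, dataset in enumerate(DATASETS):
        model = max(table, key=lambda m: table[m][i])
        result[dataset] = (model, table[model][i])
    return result


def mean_per_model(table):
    """Unweighted mean over the three datasets (an assumed aggregation)."""
    return {model: sum(row) / len(row) for model, row in table.items()}


def mean_per_dataset(table):
    """Mean accuracy of all models on each dataset."""
    return {
        dataset: sum(row[i] for row in table.values()) / len(table)
        for i, dataset in enumerate(DATASETS)
    }


def accuracy_range(table):
    """Smallest and largest accuracy anywhere in the table."""
    values = [acc for row in table.values() for acc in row]
    return min(values), max(values)


if __name__ == "__main__":
    for dataset, (model, acc) in best_per_dataset(TABLE3).items():
        print(f"{dataset}: best model is {model} at {acc:.3f}%")
    means = mean_per_model(TABLE3)
    print("Highest mean accuracy:", max(means, key=means.get))
    print("Accuracy range:", accuracy_range(TABLE3))
```

Under this aggregation, VGG19 has the highest mean accuracy, Xception leads on Style_GAN_3_FFHQ_256, and Style_GAN_3_FFHQ_256 has the lowest mean accuracy across models, with ResNet50V2 at 93.230% marking the lower end of the range.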

Fig. 4. Loss in training and validation for (a) VGG16, VGG19, and DenseNet121, (b) InceptionResNetV2, InceptionV3, and MobileNetV2, and (c) ResNet50V2 and Xception
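The qualitative reading of loss curves such as those in Fig. 4 (smooth versus spiky validation loss, and how close it stays to training loss) can be made more concrete with two simple numbers. The following is a minimal sketch, not the procedure used in this study: the function names and the per-epoch loss values are hypothetical and serve only to illustrate the idea.

```python
def generalization_gap(train_loss, val_loss, last_n=3):
    """Mean of (validation loss - training loss) over the final `last_n` epochs.

    A small gap suggests the model generalizes to unseen data; a large
    positive gap suggests it does not.
    """
    pairs = list(zip(train_loss, val_loss))[-last_n:]
    return sum(v - t for t, v in pairs) / len(pairs)


def fluctuation(val_loss):
    """Mean absolute epoch-to-epoch change in validation loss.

    Assumes at least two epochs. Larger values correspond to spikier curves.
    """
    steps = [abs(b - a) for a, b in zip(val_loss, val_loss[1:])]
    return sum(steps) / len(steps)


if __name__ == "__main__":
    # Hypothetical per-epoch losses, NOT measurements from the paper.
    train = [0.60, 0.40, 0.30, 0.22, 0.18]
    smooth_val = [0.62, 0.43, 0.33, 0.25, 0.21]  # tracks training loss closely
    spiky_val = [0.70, 0.35, 0.80, 0.30, 0.65]   # large swings between epochs

    for name, val in (("smooth", smooth_val), ("spiky", spiky_val)):
        gap = generalization_gap(train, val)
        fluct = fluctuation(val)
        print(f"{name}: gap={gap:.3f}, fluctuation={fluct:.3f}")
```

In practice, the two lists would come from the per-epoch training and validation losses recorded during training; a model with both a small gap and low fluctuation matches the "smooth, well-converged" curves described in the text.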

16 International Journal of Electrical and Computer Engineering Systems


7. CONCLUSION

Our study shows that different models work well in different situations for spotting deepfake images. We tested eight CNN models against fake images from three GANs: StyleGAN2, StyleGAN3, and ProGAN. VGG19 and VGG16 perform well in some cases, while InceptionV3 and Xception are consistently good, giving an accuracy above 96.6% for all three GANs. The best-performing model, however, is VGG19, since it has the best overall accuracy across the three GANs. Based on the performance of the CNN models, our study therefore concludes that VGG19 is the better alternative for detecting deepfake images coming from various sources.

With more powerful GPUs and CPUs, we can generate and detect deepfakes more efficiently. Advanced systems enable the use of models like EfficientNet, a highly effective CNN architecture, further enhancing our deepfake detection capabilities.

With the rise of artificial intelligence, the quality of deepfake images is only going to increase, making their detection a continuing research topic. Our goal was to find a model that works well on fake images generated through diverse sources, making it a reliable tool for countering ever-evolving deepfake creation.
works (GANs) challenges, solutions, and future
8. REFERENCES:
directions", ACM Computing Surveys, Vol. 54, No.
[1] M. Kumar, N. Muhal, “Fake Face s Generated From 3, 2021, pp. 1-42.
Different GANs”, https://www.kaggle.com/data-
[12] A. Khodabakhsh, R. Ramachandra, K. Raja, P. Was-
sets/mayankjha146025/fake-face-s-generated-
nik, C. Busch, "Fake face detection methods: Can
from-different-gans (accessed: 2024)
they be generalized?", Proceedings of the Interna-
[2] Y. Digvijay, S. Salmani, "Deepfake: A survey on facial tional conference of the biometrics special inter-
forgery technique using the generative adversarial est group, Darmstadt, Germany, 26-28 September
network", Proceedings of the International Confer- 2018, pp. 1-6.
ence on Intelligent Computing and Control Sys-
[13] O. Patashnik, Z. Wu, E. Shechtman, D. Cohen-Or, D.
tems, Madurai, India, 15-17 May 2019, pp. 852-857.
Lischinski, "StyleCLIP: Text-Driven Manipulation of
[3] S. Sanjan, P. Thushara, P. C. Karthik, M. P. A. Vijayan, StyleGAN Imagery", Proceedings of the IEEE/CVF
A. Wilson, “Review of Deepfake Detection Tech- International Conference on Computer Vision,
niques”, International Journal of Engineering Re- Montreal, QC, Canada, 10-17 October 2021, pp.
search & Technology, Vol. 10, No. 5, 2021, pp. 813- 2065-2074.
816.
[14] M. Kumar, H.K. Sharma, "A GAN-based model of
[4] A. Malik, M. Kuribayashi, S. M. Abdullahi, A. N. deepfake detection in social media", Procedia
Khan, "DeepFake detection for human faces and Computer Science, Vol. 218, 2023, pp. 2153-2162.
videos: A survey", IEEE Access, Vol. 10, 2022, pp.
[15] A. Tiwari, R. Dave, M. Vanamala. "Leveraging deep
18757-18775.
learning approaches for deepfake detection: A
[5] M. S. Rana, M. N. Nobi, B. Murali, A. H. Sung, "Deep- review", Proceedings of the 7th International Con-
fake Detection: A Systematic Literature Review", ference on Intelligent Systems, Metaheuristics &
IEEE Access, Vol. 10, 2022, pp. 25494-25513. Swarm Intelligence, 2023, pp. 12-19. 2023.

[6] O. A. Paul, "Deepfakes Generated by Generative [16] E. Nowroozi, Y. Mekdad. "Detecting high-quality
Adversarial Networks", Georgia Southern Univer- GAN-generated faces using neural networks", Big

Volume 16, Number 1, 2025 17


Data Analytics and Intelligent Systems for Cyber “Openforensics: Multi-face Forgery Detection
Threat Intelligence, River Publishers, 2023, pp. and Segmentation In-the-wild Dataset [V.1.0.0]”,
235-252. https://doi.org/10.5281/zenodo.5528418 (ac-
cessed: 2023)
[17] S. Preeti, M. Kumar, H. K. Sharma. "Robust GAN-
Based CNN Model as Generative AI Application for [20] CelebA-Dataset, “Large-scale CelebFaces Attri-
Deepfake Detection", EAI Endorsed Transactions butes (CelebA) Dataset”, https://mmlab.ie.cuhk.
on Internet of Things, Vol. 10, 2024. edu.hk/projects/CelebA.html (accessed: 2024)

[18] B. D. Sergi, S. D. Johnson, B. Kleinberg, "Testing hu- [21] FFHQ, “NVlabs/ffhq-dataset: Flickr-Faces-HQ Data-
man ability to detect ‘deepfake’ s of human faces", set (FFHQ)”, https://github.com/NVlabs/ffhq-data-
Journal of Cybersecurity, Vol. 9, No. 1, 2023. set (accessed: 2024)

[19] T. N. Le, H. H. Nguyen, J. Yamagishi, I. Echizen,

18 International Journal of Electrical and Computer Engineering Systems

You might also like