CAPSTONE PROJECT

Deep Fake Detection of Images Using a Deep Learning Model
Review III

Under the Guidance of Dr. Agilandeeswari L. (SCORE)

Submitted By
Raj Ratnesh Singh (20BIT0028)
Abstract
• Deepfakes are here to stay, and they have permanently changed our perception of reality. A major challenge
in developing forensic methods that separate real from fake images and videos is that once new detection
approaches are published or shared via open access, the flaws they exploit are quickly corrected in the next
iteration of deepfake generation methods.

• One approach to detection is to look for visual artifacts left by the generation process. However, these
artifacts can be difficult to spot, and forgers keep getting better at creating fake images and videos that are
indistinguishable from real ones. Another approach is to use machine learning algorithms to identify patterns
that are associated with fakery. These algorithms can be trained on a dataset of real and fake images and
videos, and they can then be used to classify new images and videos as real or fake.

• However, these algorithms can also be fooled by deepfakes that are designed to evade detection. Even
models with accuracy as high as 97% are not enough. As in the medical domain, it is the cases that are
missed that represent the larger problem: a 3% error rate across the billions of images on Google or
Facebook platforms would represent an immense loss of trust from users of these interfaces.

• The development of effective deepfake detection methods is a challenging task, and it is important to be
aware of the limitations of current methods. Our results show that state-of-the-art CNNs can now distinguish
between fake and real data with minimal misclassification. However, detecting the few cases that are still
misclassified remains a critical area of research.
Problem Definition

1) Rapid Spread of Deep Fakes: The proliferation of deep fake technology has raised significant
concerns regarding its misuse for spreading misinformation, disinformation, and cybercrime. The
ability to create realistic-looking images or videos of individuals without their consent poses a
serious threat to privacy, security, and democratic processes.

2) Lack of Effective Detection Mechanisms: Traditional methods of detecting deep fakes, such as
manual inspection or basic image analysis techniques, are not sufficient due to the high level of
realism achieved by modern deep learning models. There is an urgent need for automated,
efficient, and accurate detection mechanisms that can identify deep fakes in real time.

3) Challenges in Differentiating Real from Fake: The challenge lies in distinguishing between
genuine content and deep fakes, especially when the latter are highly sophisticated and visually
indistinguishable from the real ones. This requires advanced machine learning models capable of
analyzing subtle differences in texture, lighting, and other visual cues that are often overlooked
by human observers.

4) Need for Continuous Adaptation: As deep fake technology continues to evolve, so does the
sophistication of the generated content. This necessitates the development of detection models
that can adapt to new techniques and variations in deep fake creation, ensuring their
effectiveness over time.
Objectives
1) The project aims to discover the distorted truth of deepfakes by developing methods for
detecting and analyzing them. The project will also explore the potential for deepfakes to be used
for harmful purposes, such as spreading misinformation and propaganda.

2) The project will reduce the abuse and misleading of ordinary people on the World Wide
Web.

3) The project will distinguish and classify the images as deepfake or real.

4) The project will also compare and contrast its results with other similar deepfake detection algorithms.
Scope

The scope of the project is to develop a deepfake detection system that can be used to distinguish between
real and fake images and videos. The system will be based on a combination of computer vision and
machine learning techniques. The system will be evaluated on a dataset of real and fake images and videos.

The project will be divided into the following phases:

Data collection: The first phase of the project will involve collecting a dataset of real and fake images and
videos. The dataset will be used to train the deepfake detection system.

Model development: The second phase of the project will involve developing the deepfake detection
system. The system will be based on a combination of computer vision and machine learning techniques.

Evaluation: The third phase of the project will involve evaluating the deepfake detection system. The
system will be evaluated on a dataset of real and fake images.

The project uses two existing CNN frameworks, VGGFace and DenseNet, to solve the issue of
differentiating between real and deepfake images. VGGFace is a convolutional neural network that was
originally designed for face recognition. DenseNet is a deep convolutional neural network that was
originally designed for image classification.
Literature Survey

DeepFake Source Detection via Interpreting Residuals with Biological Signals (2020): Extracts PPG cells from real and
fake videos and reaches 93.69% accuracy on deepfake videos. This work looks for generator signatures in deepfakes,
while previously reported work looks for signatures in real videos. A holistic system combining these two perspectives
could be developed, with a jointly trained model that detects signatures in both authentic and fake videos.

Deep Learning-Based Methodology for Video Deepfake Detection Using XGBoost (2020): Combines a CNN with XGBoost
and reports 90.73% accuracy, 93.5% specificity, 85.3% sensitivity, and 85.39% recall. The methodology employs the
YOLO face detector to extract face areas from video frames. Because deepfake video creation techniques develop
continuously, more effort is needed to improve the existing detection methods.

Deepfakes Detection Using Human Eye Blinking Pattern (2019): Deep Vision reaches 87.5% accuracy. A limitation of the
study is that blinking is also correlated with mental illness and dopamine activity, which reduces accuracy.

Deepfakes Creation and Detection Using Deep Learning (2021): A CNN reaches 85% accuracy. To further improve the
performance of deepfake detectors, the focus should be on datasets with difficult conditions.
Proposed Architecture
Project Flow
1) Test dataset: The project starts with a test dataset, which is the initial data used for the model development process.
2) Preprocessing: The test dataset undergoes preprocessing steps, including cropping the dataset to specific dimensions
(224x224px) and saving the images for further processing.
3) Data splitting: The preprocessed dataset is split into training and validation sets, which are essential for training and
evaluating the machine learning models.
4) Ensemble Models: Multiple models are ensembled together, including a DenseNet classifier, a deepfake detection model,
and a VGGFace16 model.
5) Model Training: The ensembled models are trained using the training dataset. During this stage, the trained models are
exported for further use.
6) Inference: The trained models are loaded and used for inference tasks, such as classifying input data as real or fake.
7) Real/fake: The final output of the project is a classification of the input data as either real or fake, based on the predictions
made by the ensemble of trained models.
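
The final ensemble step above can be sketched as follows. The slides do not say how the individual predictions are combined, so this is a minimal sketch that assumes soft voting (averaging the sigmoid outputs) and assumes that each model outputs the probability that the image is fake. The function name ensemble_predict and the 0.5 threshold are illustrative choices, not part of the project code.

```python
from statistics import mean


def ensemble_predict(model_probs, threshold=0.5):
    """Average per-model fake probabilities and threshold the mean.

    model_probs: one probability per model, assumed to mean P(fake).
    Returns the predicted label and the averaged score.
    """
    if not model_probs:
        raise ValueError("need at least one model probability")
    avg = mean(model_probs)
    label = "fake" if avg >= threshold else "real"
    return label, avg


# Hypothetical outputs of three models for a single image.
label, score = ensemble_predict([0.91, 0.78, 0.85])
print(label, round(score, 3))
```

Averaging probabilities rather than counting votes keeps the confidence of each model in play, which matters when only two or three models are combined.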
Methodology

1. Literature Review: Conduct a thorough review of existing research on deep fake detection, focusing on
methodologies, challenges, and state-of-the-art solutions.
2. Data Collection: Gather a comprehensive dataset of deep fakes and genuine content for training and testing the
detection model. Ensure the dataset includes various types of deep fakes (e.g., face swaps and GAN-generated faces)
and covers different scenarios.
3. Preprocessing: Apply preprocessing techniques to the collected data, including resizing, normalization, and
augmentation to enhance the model's performance and generalization capabilities.
4. Model Selection: Choose suitable deep learning models for the task, considering factors like accuracy,
computational efficiency, and adaptability to new deep fake techniques. Models like DenseNet and VGGFace are
strong candidates due to their proven performance in image classification and facial recognition tasks.
5. Model Training: Train the selected models on the preprocessed dataset, adjusting hyperparameters to optimize
performance. Utilize techniques like cross-validation to ensure the model's robustness and prevent overfitting.
6. Evaluation Metrics: Define appropriate evaluation metrics to assess the model's performance, including accuracy,
precision, recall, and F1 score. Consider using metrics that are particularly relevant to the detection task, such as
the Area Under the Receiver Operating Characteristic Curve (AUC-ROC).
7. Performance Analysis: Analyze the model's performance on the test dataset, identifying strengths and weaknesses.
Compare the results with those of other state-of-the-art models to highlight the unique advantages of the proposed approach.

8. Feature Analysis: Investigate the features that the model uses to distinguish between deep fakes and genuine content.
This can provide insights into the subtle differences that the model is able to detect, which may not be immediately
apparent to human observers.

9. Adaptation to New Techniques: Design strategies for adapting the model to new deep fake techniques as they emerge.
This could involve continuous training on updated datasets or incorporating transfer learning techniques to leverage pre-
trained models.

10. User Interface Development: Develop a user-friendly interface for the detection system, allowing users to upload
images or videos for analysis. Include features for batch processing and real-time detection capabilities.

11. Security Measures: Implement security measures to protect against potential misuse of the detection system, such as
access controls and encryption for data transmission.

12. Documentation and Reporting: Document the entire process, from data collection to model deployment, in a clear and
comprehensive manner. Prepare reports detailing the methodology, findings, and recommendations for future work.

13. Ethical Considerations: Address ethical implications of deep fake detection, including privacy concerns and the potential
for misuse of the technology. Explore ways to mitigate these risks while still achieving the project's objectives.
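
Step 5 above mentions cross-validation as a guard against overfitting. A minimal sketch of k-fold index splitting is shown below, using only the standard library. The function name k_fold_indices, the choice of k = 5, and the fixed seed are assumptions made for illustration; the slides do not state which cross-validation scheme was used.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


for fold_number, (train, val) in enumerate(k_fold_indices(10, k=5)):
    print(fold_number, sorted(val))
```

Each sample lands in the validation fold exactly once, so the averaged validation score uses every image without ever validating on training data.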
Design and Module Description

[1] Module 1 – Dataset Gathering:


Finding a quality dataset was one of the biggest challenges, since it could determine the end result of the research. The
dataset that I used consists of 140,000 images. To avoid training bias in the model, I used a 50% real and 50% fake
split: all 70k real faces from the Flickr dataset collected by Nvidia, and 70k fake faces sampled from the 1 Million
Fake Faces dataset (generated by StyleGAN).
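
A class-balanced split of this dataset can be sketched as follows. The training and validation sizes (100,000 and 20,000) are the ones reported under Module 3; treating the remaining 20,000 images as a held-out test set is an assumption, as are the function name balanced_split and the label encoding (0 for real, 1 for fake). The demo uses integers in place of image paths and a scaled-down dataset.

```python
import random


def balanced_split(real_items, fake_items, n_train, n_val, seed=0):
    """Split two classes into train/val/test while keeping each split 50/50."""
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    train_per_class = n_train // 2
    val_per_class = n_val // 2
    for label, items in ((0, real_items), (1, fake_items)):
        items = list(items)
        rng.shuffle(items)
        train_end = train_per_class
        val_end = train_per_class + val_per_class
        splits["train"] += [(x, label) for x in items[:train_end]]
        splits["val"] += [(x, label) for x in items[train_end:val_end]]
        # Whatever is left over is held out (assumed here to be the test set).
        splits["test"] += [(x, label) for x in items[val_end:]]
    return splits


# Scaled-down demo: 70 real and 70 fake items instead of 70k each.
demo = balanced_split(range(70), range(70, 140), n_train=100, n_val=20)
print({name: len(items) for name, items in demo.items()})
```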

[2] Module 2 – Pre-processing:
Image Cropping:
The images are first cropped to 224 x 224 px. This reduces the size of each file, which can make the training process
more efficient. The cropping is done using the Image.crop method of the Pillow library. Image.crop takes a single
argument, a box given as a 4-tuple of integers (left, upper, right, lower): the first two values are the top-left
corner of the crop and the last two are the bottom-right corner.
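
A minimal sketch of this cropping step is shown below. The slides do not say which region of the image is kept, so a center crop is assumed here. The helper names center_crop_box and crop_image are hypothetical. Pillow is imported inside crop_image so that the box computation can be used and checked without it.

```python
def center_crop_box(width, height, size=224):
    """Return the (left, upper, right, lower) box of a centered size x size crop."""
    if width < size or height < size:
        raise ValueError("image is smaller than the requested crop")
    left = (width - size) // 2
    upper = (height - size) // 2
    return (left, upper, left + size, upper + size)


def crop_image(path, size=224):
    """Open an image file and return its centered size x size crop."""
    from PIL import Image  # Pillow, imported lazily

    with Image.open(path) as img:
        width, height = img.size
        # Image.crop takes one 4-tuple: (left, upper, right, lower).
        return img.crop(center_crop_box(width, height, size))


print(center_crop_box(1024, 1024))
```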

Face Detection:
Once the images have been cropped, the face is detected using dlib, a popular library for face detection. The detector
is created with dlib.get_frontal_face_detector() and is called with two arguments: the image in which to detect faces,
and the number of times to upsample the image before detection. Upsampling makes smaller faces detectable at the cost
of extra computation.
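
The face detection step can be sketched as below. This is an illustrative sketch, not the project code: the helper names clamp_box and detect_faces are hypothetical, and clamping the detected rectangle to the image bounds is an added assumption (dlib can return boxes that extend slightly past the image edge). dlib is imported inside detect_faces so the clamping helper has no dependency.

```python
def clamp_box(left, top, right, bottom, width, height):
    """Clip a bounding box so that it lies inside a width x height image."""
    return (max(0, left), max(0, top), min(width, right), min(height, bottom))


def detect_faces(image_path, upsample=1):
    """Return clamped (left, top, right, bottom) boxes for each detected face."""
    import dlib  # imported lazily

    detector = dlib.get_frontal_face_detector()
    image = dlib.load_rgb_image(image_path)  # NumPy array, shape (H, W, 3)
    height, width = image.shape[:2]
    # The second argument is the number of times to upsample the image.
    rectangles = detector(image, upsample)
    return [
        clamp_box(r.left(), r.top(), r.right(), r.bottom(), width, height)
        for r in rectangles
    ]


print(clamp_box(-5, 10, 300, 200, 224, 224))
```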
[3] Module 3 – Training Model:

Model 1: DenseNet
For this model, we wanted to observe the accuracy of the DenseNet-121 model found in Keras, slightly modified by using
a dense layer as the last layer. This model contains 4 dense blocks of densely connected layers, each made up of
batch normalization (BN), ReLU activation, and 3 x 3 convolution. Between each pair of dense blocks, the model also
contains a transition layer consisting of a 1 x 1 convolution and a 2 x 2 average pooling layer. After the last dense
block, we added the custom dense layer with sigmoid activation. For training we used 100,000 images in the training set
and 20,000 images in the validation set.
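
A minimal Keras sketch of this model is shown below. The slides specify DenseNet-121 with a final sigmoid dense layer. Everything else here is an assumption made for illustration: ImageNet initial weights, global average pooling before the dense layer, the Adam optimizer, and the name build_densenet_classifier. TensorFlow is imported inside the function so that the definition itself loads without it.

```python
INPUT_SHAPE = (224, 224, 3)  # matches the 224 x 224 px crops from pre-processing


def build_densenet_classifier(weights="imagenet"):
    """Build DenseNet-121 with a single sigmoid output for real/fake."""
    import tensorflow as tf  # imported lazily

    base = tf.keras.applications.DenseNet121(
        include_top=False,      # drop the 1000-class ImageNet head
        weights=weights,        # assumption: start from ImageNet weights
        input_shape=INPUT_SHAPE,
        pooling="avg",          # assumption: global average pooling
    )
    output = tf.keras.layers.Dense(1, activation="sigmoid")(base.output)
    model = tf.keras.Model(inputs=base.input, outputs=output)
    model.compile(
        optimizer="adam",       # assumption: optimizer is not stated in the slides
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling build_densenet_classifier(weights=None) builds the same architecture with random initial weights, which avoids downloading the pretrained file.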

Model 2: VGGFace
This model consists of five layer blocks, with each block made up of convolutional and max pooling layers. The first
and second blocks each consist of two 3 x 3 convolutional layers followed by a max pooling layer. The third, fourth,
and fifth blocks each consist of three 3 x 3 convolutional layers followed by a max pooling layer. All convolutional
layers use the ReLU activation function. Because VGGFace uses pretrained weights, we had to fine-tune it for our
task. Fine-tuning was done by adding a dense layer after the five layer blocks, which gave us the facial features.
Finally, we also added a dense layer as the output layer with sigmoid activation.
CONSTRAINTS, ALTERNATIVES AND TRADEOFFS

1. Constraints –

• Integration with Non-AI Manipulations: The project must address not only AI-generated deepfakes but also non-AI
manipulations like shallow fakes and miscontextualized content to ensure comprehensive detection capabilities.

• Global Accessibility: Ensuring that the detection system is accessible to a wide range of users, including journalists,
fact-checkers, and civil society organizations, is crucial. This requires careful consideration of licensing and distribution
strategies.

• Real-World Trends: The datasets and scoring rules for deepfake detection must accurately reflect real-world instances
of synthetic media to develop effective detection tools. This necessitates continuous updates and refinement of the
dataset and models.

• Adversarial Dynamics: The development and deployment of deepfake detection systems involve trade-offs between
open access to models and the need to deter adversaries from generating synthetic media. This highlights the
importance of responsible disclosure mechanisms and the potential for a hybrid approach to openness.
2. Alternatives –

• Responsible Disclosure Mechanisms: Implementing responsible disclosure mechanisms that limit the release
of synthetic media detection technology can ensure that the detection capabilities of those working to prevent
misinformation outpace the tactics deployed by those generating malicious, synthetic content.

• Incremental Release of Detection Models: A strategy that involves the incremental release of detection
models, starting with the most basic and gradually introducing more sophisticated models, could balance the
need for openness with the goal of maintaining a detector advantage over adversaries. This approach would
provide the eventual benefits of openness and increased access, while the phased release would offer some
detector advantage over those trying to generate evasive, adversarial synthetic content.
3. Tradeoffs –
• Open Access vs. Adversarial Deterrence: There is a tension between the call for increased access to synthetic media
detection models and the need to prevent adversaries from learning how to evade detection methods. This tradeoff
underscores the importance of responsible disclosure mechanisms and the potential for a hybrid approach to
openness.

• Explainability vs. Technical Expertise: The need for the results of synthetic media detection to be understandable to
non-technical experts, such as journalists and policymakers, poses a challenge. This suggests a need for research on
explanations of inauthenticity and synthetic content, potentially including UX/UI improvements for detection models.

• Multistakeholder Input: Leveraging global multistakeholder input when developing synthetic media detection
challenges is beneficial but requires substantial time and energy. This highlights the importance of careful planning
and resource allocation for meaningful input.
Output and Test Results

Figure: DenseNet output and test results.

Figure: VGGFace output and test results.
COST ANALYSIS

• Model Training and Development: All the libraries and frameworks that have been used in the
development of the deep fake detection system are free.

• Data Collection and Preprocessing: Since the dataset is taken from Kaggle, there is no cost
associated with it.

• Model Development and Training: Training used the free tier of compute provided by
Google Colab and Kaggle.
Performance Metrics

Accuracy: The proportion of all predictions that were correct.

Positive Predictive Value or Precision: The proportion of predicted positive cases that are actually positive.

Negative Predictive Value: The proportion of predicted negative cases that are actually negative.

Sensitivity or Recall: The proportion of actual positive cases that are correctly identified.

Specificity: The proportion of actual negative cases that are correctly identified.

Rates: Four rates are derived from the confusion matrix: the true positive rate (TPR), false positive rate (FPR),
true negative rate (TNR), and false negative rate (FNR).

F1 score: The harmonic mean of precision and recall.
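
The metrics above can all be computed from the four confusion matrix counts, as in the sketch below. The function name classification_metrics and the example counts are illustrative only and are not results from this project. Ratios with a zero denominator are returned as 0.0 by convention.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts."""

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,              # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
        "recall": recall,                    # sensitivity, TPR
        "specificity": ratio(tn, tn + fp),   # TNR
        "fpr": ratio(fp, fp + tn),
        "fnr": ratio(fn, fn + tp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Made-up counts, with "fake" treated as the positive class.
for name, value in classification_metrics(tp=8, fp=2, tn=9, fn=1).items():
    print(f"{name}: {value:.3f}")
```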


Test Results for Densenet and VGGface model:

Figure: ROC AUC score of the DenseNet model.

Figure: ROC AUC score of the VGGFace model.
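
For reference, the ROC AUC score equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition. The function name roc_auc and the sample data are illustrative, not project results, and the pairwise loop is only practical for small inputs.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```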
COMPARATIVE ANALYSIS WITH EXISTING SYSTEMS

• DenseNet: Achieved an accuracy, precision, and recall of 0.97, indicating a highly accurate detection rate for deepfakes.
DenseNet's dense connections facilitate the capture of intricate patterns, which is beneficial for detecting subtle
manipulations in deepfake videos or images.
• VGGFace: Demonstrated exceptional performance with an accuracy, precision, and recall of 0.99. VGGFace's architecture
is optimized for face recognition, making it particularly effective in identifying deepfake facial manipulations.
• UADFV: The model achieved an accuracy, precision, and recall of 0.85, suggesting moderate performance. This could be
attributed to the specific characteristics of the UADFV dataset, which might present unique challenges for detection.
• FakeAPP: With an accuracy, precision, and recall of 0.88, this model shows good performance in detecting deepfakes
created using the FakeApp technique. This suggests that the model is well-suited for detecting manipulations made with
similar tools.
• DF-TIMIT: Achieved an accuracy, precision, and recall of 0.92, indicating high effectiveness in detecting deepfakes within
the DF-TIMIT dataset. This dataset's focus on TIMIT speakers might have contributed to the model's success.
• FaceForensics++: This model achieved an accuracy, precision, and recall of 0.94, showcasing very high performance in
detecting deepfakes. The FaceForensics++ dataset's comprehensiveness likely played a significant role in this
achievement.
• DeepFakes, Face2Face, FaceSwap, NeuralTextures: These models achieved an accuracy, precision, and recall of 0.96,
demonstrating excellent performance across different deepfake techniques. This broad coverage of deepfake types
suggests a versatile and effective detection capability.
Final Results for DenseNet and VGGFace models:

Figure: Prediction of real/fake image by the VGGFace model.

Figure: Prediction of real/fake image by the VGGFace model.

Figure: Prediction of real/fake image by the ensemble model.


COMPARATIVE ANALYSIS WITH EXISTING SYSTEMS

• The analysis reveals that DenseNet and VGGFace stand out for their near-perfect performance across all metrics, indicating
their exceptional suitability for deepfake detection tasks. DenseNet's dense connections enable it to capture complex
patterns effectively, while VGGFace's architecture is tailored for face recognition, making it highly effective for detecting
deepfake facial manipulations.
• Models trained on specific datasets like UADFV and FakeAPP show varying levels of performance, suggesting that the
nature of the dataset can significantly influence detection capabilities. For instance, the UADFV dataset's unique
characteristics might pose specific challenges, affecting the model's performance.
• The models trained on the FaceForensics++ dataset and the combined dataset of DeepFakes, Face2Face, FaceSwap, and
NeuralTextures exhibit high performance, likely due to the comprehensive nature of these datasets, which cover a wide
range of deepfake techniques and scenarios.
• Overall, the results highlight the effectiveness of deep learning algorithms in detecting deepfakes, with DenseNet and
VGGFace emerging as particularly strong performers. The findings underscore the importance of selecting appropriate
models and datasets for deepfake detection tasks, as the choice can significantly impact the detection's accuracy and
reliability.
CONCLUSION AND FUTURE WORK

• Future work on deepfake detection would include unsupervised methods such as autoencoders to explore whether real
and fake images cluster separately. Clustering algorithms group similar data points together, and autoencoders are
neural networks that learn a compressed representation of data. Clustering in that compressed space could help to
identify patterns in the data that are not visible to the human eye. This could be useful for detecting deepfakes,
which are often created by manipulating images and videos in ways that are not easily detectable by humans.
• Another future work would be to add transparency and interpretability to our models by use of CNN visualization
methods. CNN visualization methods are a type of technique that can be used to understand how a neural network
works. This could be useful for understanding how deepfake detection models work, and for identifying potential
weaknesses in these models.
• Overall, deepfake detection remains a complex and challenging problem. However, the development of effective
deepfake detection methods is an important task, as deepfakes can have a negative impact on people's lives and can be
used to spread misinformation and propaganda. It is important to develop methods for detecting deepfakes, and to raise
awareness about the dangers of deepfakes.
Thank You
