Research Article
Keywords: Machine Learning, AI-Generated Data, Deepfake, LLMs, Generative AI, Knowledge Graphs,
BERT, CNN, Multi-Modal Detection
DOI: https://doi.org/10.21203/rs.3.rs-3500331/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
As the use of AI-generated content grows, it becomes imperative to recognize which data is human-developed and which is not. AI-generated content spans several modalities, such as deepfake videos, text, and images. This article proposes techniques that can be used to tackle this problem. To identify deepfakes, we can take advantage of characteristics that make us human: blood circulation, human emotions, and eye movements, among other things. Text recognition takes a different approach, using a combination of large language models, style analysis, and visual analysis. Generated images can be recognized by analyzing their pixels for common colors and patterns. It can be concluded that AI-generated content is the future, and it is difficult to develop a model that can identify it with 100% accuracy. Validating content before disseminating it and using it responsibly are the steps we should take to adapt to a world where both types of content coexist.
Statements and Declarations
Competing Interests
The authors have no competing interests to declare that are relevant to the content
of this article.
Authors' Contributions
All authors contributed to the study conception and design. Material preparation,
data collection and analysis were performed by Vismay Vora, Jenil Savla and Deevya
Mehta. The first draft of the manuscript was written by them and all authors com-
mented on previous versions of the manuscript. All authors read and approved the
final manuscript.
Funding
No funding was received for conducting this study.
1 Introduction
The contemporary landscape is marked by a remarkable transition into the realm of
technology. The forefront of this transformation is adorned by Smart Machines and
Artificial Intelligence (AI), with ChatGPT and BingAI garnering considerable atten-
tion. These Large Language Model (LLM) based chatbots have driven a surge in daily
content generation. The rapid integration of these models into various applications
underscores their utility. The ability to elicit curated responses with a single prompt has
bestowed unprecedented empowerment upon individuals, facilitating effortless content
generation. Gone are the days when content creation entailed laborious processes of
scouring diverse resources, gathering information, and amalgamating data to construct
coherent narratives. Traditional hurdles in content acquisition have receded, render-
ing generated content readily accessible to all.
However, this newfound capability also brings forth its own set of challenges. The
imperative for responsible content generation and utilization becomes evident. As users
harness AI to produce information, concerns over the authenticity and credibility of
such content emerge. Instances of content generated by AI infiltrating platforms like
StackOverflow have prompted temporary bans, as the influx of responses may initially
appear accurate but subsequently prove erroneous upon scrutiny. These AI models
can be tailored by users, thereby facilitating the dissemination of misleading or false
information, essentially enabling the propagation of self-inserted misinformation.
This privilege is sometimes exploited by students, who employ AI-generated content
to craft essays and undertake cognitive tasks, thereby negating the essence of the
assignments. Consequently, there exists a pressing need to establish mechanisms for
discerning AI-generated content. This paper delves into the methods of identifying
AI-generated text, images, and deepfake content, advocating for curated and filtered
content generation and distribution.
2 Related Work
The initial modality encompasses videos, wherein Generative AI techniques are har-
nessed to produce synthetic content, often derived from textual inputs. Within this
context, a distinct subset emerges: deepfake videos. A deepfake is a form of synthetic
media produced by deep learning algorithms. It is created by substituting the
visage of an individual in an existing image or video with the features or likeness of
another individual. The applications of deepfakes span a spectrum of intentions, rang-
ing from legitimate to malevolent. These encompass entertainment, education,
propaganda, the spread of misinformation, harassment, blackmail,
and more. Broadly, deepfakes can be categorized into two principal classes: Respon-
sible Deepfakes, serving benign intentions, and Malicious Deepfakes, often entailing
harmful objectives.
• Responsible Deepfakes:
Responsible deepfakes are used for masking identities in video conversations, the
metaverse, and event participation in general. There is an increasing demand for
solutions that allow people to engage in video calls, virtual meetings, and virtual
settings while protecting their privacy and anonymity. In this context, research
into clever masking techniques that allow facial expressions to be exhibited without
exposing identity is critical. The field of responsible deepfakes entails creating
powerful algorithms that can change facial features in real-time while protecting
privacy and preserving the authenticity of facial expressions. While this field is
concerned with safeguarding privacy and anonymity, the major goal is to enable
individuals to communicate in virtual places without jeopardizing their identity.
• Malicious Deepfakes:
Malicious deepfakes are used for spreading misinformation and disinformation, as
well as propaganda and fake news. Deepfake technology is increasingly being utilised
for malevolent reasons, such as creating and disseminating deceptive content, fake
news, and propaganda. These deepfakes can have serious societal ramifications, such
as misinformation, loss of public trust, and even harmful outcomes. Detecting and
countering such deepfake videos is critical for preserving information integrity and
safeguarding individuals from manipulation.
While responsible deepfakes prioritise privacy and do not involve deepfake detec-
tion, irresponsible or malicious deepfakes necessitate the development of powerful
deepfake detection techniques to counteract disinformation dissemination and protect
individuals from being tricked. In this research paper, the emphasis is on developing
effective deepfake detection systems that can detect and flag hostile deepfakes, hence
reducing the dissemination of misleading information and protecting the validity of
media content.
Research paper [1] synthesized 112 pertinent publications from 2018 to 2020 that
presented a variety of methodologies. They are grouped into four buckets: deep learning
approaches, classic machine learning methods, statistical techniques, and blockchain-
based strategies. The work [2] investigates novel approaches for detecting face-focused
manipulations generated with GANs and VAEs. A convolutional EfficientNet B0, used to
extract features, is integrated with ViTs, achieving performance equivalent to some
very recent ViT approaches. A basic inference process based on a simple voting scheme
for dealing with many faces in a single video shot has been described. A 0.951 AUC
and an 88.0% F1 score were attained by the best model, very close to the state of the art
on the DFDC Challenge [3].
Lastly, [4] suggests utilizing an ensemble of multiple CNNs to detect face swaps
and manipulation in video footage. Using attention layers and siamese training,
the suggested approach combines several models derived from a base network
(EfficientNetB4). The method is evaluated on openly accessible datasets containing over
119,000 clips, and it showed promising results in recognizing deepfakes and other facial
modification techniques.
One prevalent manifestation of generated content is text, and the advancement of
Large Language Models is perpetuating this trend. Research within this domain is
actively underway. A literature review on detecting AI-generated text unveils a spec-
trum of methodologies. In Article [9], published in the MIT Technology Review, the
discourse revolves around diverse approaches. These include repurposing Large
Language Models, retrained on human-written text, to distinguish it from generated
text. The introduction of watermarks during generation is suggested as an effective
demarcation between AI-generated and human-generated text. A pivotal deduction
from the article is that human judgment remains a potent means of detecting generated
text, as the presentation might not align with a general reader’s preferences.
Delving into the realm of research, Article [10] investigates the direct utilization
of generative AI models for detecting AI-generated content. However, this approach’s
accuracy hinges on shared training data between both systems.
In a nuanced evaluation, Article [11] conducts a differential analysis of AI-generated
content, focusing on scientific context. A distinctive framework is formulated, incor-
porating features encompassing syntax, semantics, pragmatics, and writing style to
unveil subtler errors. This contributes to refining AI models for improved content
quality, addressing ethical and security concerns.
The traceability of AI-generated text is a focal point in Article [13]. The article
delves into techniques employed to obscure AI-generated content, including paraphras-
ing tools. The exploration also entails a comparison between the detection model and a
random classifier. The study underscores the vulnerability of watermarking and intro-
duces the concept of spoofing attacks, whereby text is manipulated to contain hidden
signatures, evading detection.
Article [14] elucidates the utilization of watermarks as a segregation strategy
for distinct text types. Employing invisible text spans, detectable only via specific
algorithms, the study assesses its efficacy using models from the Open Pretrained
Transformer family. Watermarks are endorsed as robust and secure markers for
differentiation.
Images constitute the third modality. [15] systematically investigates the detection
of CNN-generated images by exploiting a systematic deficiency of such images:
their failure to reproduce the decay characteristics of the high-frequency Fourier
spectrum. However, the study shows that the disparities in those decay attributes
are not intrinsic to CNN-based generative models.
A subsequent investigation creates an artificial dataset that resembles CIFAR-10
[17], containing sophisticated optical characteristics such as lifelike reflections,
and yields the CIFAKE dataset [16]. This dataset poses a binary classification task
to distinguish between real and AI-generated photos, and a simple neural network
design is proposed for performing this classification.
Notwithstanding these attempts, there remains substantial room for advancement in
the field of AI-generated content detection research. The current study proposes a
method to detect deepfake content and is primarily concerned with the detection and
classification of AI-generated material in the text and image modalities with the use of
knowledge graphs. This research provides a feature-based model training methodology
for text detection as well as a neural network-based image detection method. The
following section goes over these approaches in depth.
3 Methodology
While the proposed approach for deepfake detection has been elucidated in subsec-
tion A, the technique for detecting AI-generated content for both text and images is
explained in two separate sections. Subsection B presents a method to identify text
produced by Large Language Models (LLMs) using a language model that relies on
knowledge-based features. Subsection C describes a technique to recognize images cre-
ated by artificial intelligence using a convolutional neural network with multiple layers.
Subsection D briefly proposes how knowledge graphs can be used to combine detection
for the text and image modalities.
To extract PPG signals from the facial region of the input frame, the detector utilizes a PPG
extraction module. Subsequently, these PPG signals are fed into a PPG classification
module, which discerns their authenticity. Moreover, the detector incorporates an eye
gaze estimation module to predict the eye gaze direction of the face captured in the
input frame. The direction of the eye gazing is then sent into an eye gaze classification
module, which predicts whether the direction of the eye gaze is congruent with the
face orientation or not. The detector combines the outputs of the PPG classification
module with the eye gaze classification module to determine if the input frame is real
or fake.
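The text above leaves the exact fusion rule unspecified. Below is a minimal sketch of one plausible combination, assuming each module emits an authenticity score in [0, 1]; the function name, inputs, and equal weighting are illustrative assumptions rather than the paper's stated method.

def classify_frame(ppg_authentic_prob: float,
                   gaze_consistent_prob: float,
                   threshold: float = 0.5) -> str:
    """Fuse PPG and eye-gaze module outputs into a real/fake verdict.

    Both inputs are assumed to lie in [0, 1], where higher means
    "more authentic"; the equal-weight average is an illustrative
    choice, and a learned fusion layer would also fit the description.
    """
    fused_score = 0.5 * (ppg_authentic_prob + gaze_consistent_prob)
    return "real" if fused_score >= threshold else "fake"

print(classify_frame(0.92, 0.88))  # real: both biological cues check out
print(classify_frame(0.35, 0.41))  # fake: both cues look synthetic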
The deepfake generator is designed to generate realistic and diversified synthetic
images while preserving the biological signals of the source image. To generate a syn-
thesised image, the generator employs a mix synthesis module that learns to combine
and blend numerous source images. The mix synthesis module is divided into two sub-
modules: composition and style. To establish a coherent framework for the synthesised
image, the composition module learns to choose and align diverse parts from several
source images. The style module learns to transfer and harmonise the source images’
colour, texture, and lighting to match the target image. The generator additionally
employs a residual interpretation module, which learns to comprehend and adjust
residuals between the synthesised image and the target image by utilising biological
signals such as PPG, facial expressions, and head pose. To make the synthesised
image more realistic or diversified, the residual interpretation module can boost or
decrease particular biological signals.
The proposed technique also has potential applications for privacy enhancement
and social media anonymization. For example, it can be used to create privacy-
enhancing deepfakes that can replace or mask the identity of a person in a video while
preserving their biological signals and expressions. This can help users to protect their
personal information and avoid unwanted exposure or harassment online.
Fig. 1: Flowchart for Proposed Deepfake Detection Approach
The dataset is then partitioned into an 80:20 ratio for training and validation purposes,
respectively. The training set is employed to train the model, while the validation set
is used to assess the model’s performance and derive key evaluation metrics.
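A minimal sketch of this partition, assuming generic feature and label arrays; the placeholder data, random seed, and stratification option are illustrative, not taken from the paper.

import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder features and binary labels standing in for the real dataset.
X = np.random.rand(1000, 8)              # 1000 samples, 8 features each
y = np.random.randint(0, 2, size=1000)   # 0 = human-made, 1 = AI-generated

# 80:20 train/validation partition, as described above.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.20, random_state=42, stratify=y
)
print(len(X_train), len(X_val))  # 800 200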
Algorithm 1 Feature Extraction
1: Input: Text
2: Output: Extracted Features
3: {Vocabulary-based features}
4: Tokenize and preprocess text
5: Calculate vocabulary size and lexical richness
6: {Syntactic features}
7: POS tag tokens and count nouns and verbs
8: {Semantic features}
9: Sentiment analysis to compute average sentiment score
10: {Stylistic features}
11: Calculate average sentence length and punctuation count
12: return features
Tokenization and preprocessing retain exclusively alphabetic words that are not
stopwords. This aids in computing lexical
richness by calculating the ratio of unique words to the total word count. Syntactic
features involve quantifying the count of specific Parts of Speech (POS) tags such as
nouns and verbs. The SentimentIntensityAnalyzer module from the nltk library serves
to extract semantic scores for each sentence in the text. Furthermore, integration
with the knowledge graph yields knowledge-driven features that encapsulate structural
attributes, semantic associations, and domain-specific insights. Additionally, stylistic
features encompass average sentence length and punctuation count, contributing to
the assessment of text style.
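As a concrete illustration of Algorithm 1, the sketch below computes these features with nltk. The preprocessing details are assumptions where the text is silent, and the knowledge-graph features are omitted since they depend on the constructed graph.

import string
import nltk
from nltk.corpus import stopwords
from nltk.sentiment import SentimentIntensityAnalyzer

# One-time resource downloads (quietly skipped if already present).
for pkg in ("punkt", "stopwords", "vader_lexicon",
            "averaged_perceptron_tagger"):
    nltk.download(pkg, quiet=True)

def extract_features(text: str) -> dict:
    sentences = nltk.sent_tokenize(text)
    tokens = nltk.word_tokenize(text.lower())
    stop = set(stopwords.words("english"))

    # Vocabulary-based features: alphabetic, non-stopword tokens only.
    words = [t for t in tokens if t.isalpha() and t not in stop]
    vocab_size = len(set(words))
    lexical_richness = vocab_size / len(words) if words else 0.0

    # Syntactic features: counts of noun and verb POS tags.
    tags = nltk.pos_tag(words)
    noun_count = sum(1 for _, tag in tags if tag.startswith("NN"))
    verb_count = sum(1 for _, tag in tags if tag.startswith("VB"))

    # Semantic feature: mean compound sentiment score over sentences.
    sia = SentimentIntensityAnalyzer()
    avg_sentiment = (sum(sia.polarity_scores(s)["compound"]
                         for s in sentences) / len(sentences)
                     if sentences else 0.0)

    # Stylistic features: average sentence length and punctuation count.
    avg_sentence_length = len(tokens) / len(sentences) if sentences else 0.0
    punctuation_count = sum(1 for ch in text if ch in string.punctuation)

    return {"vocab_size": vocab_size,
            "lexical_richness": lexical_richness,
            "noun_count": noun_count,
            "verb_count": verb_count,
            "avg_sentiment": avg_sentiment,
            "avg_sentence_length": avg_sentence_length,
            "punctuation_count": punctuation_count}

print(extract_features("The model writes fluent text. Humans write differently!"))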
In this research, the potency of a pre-trained Large Language Model (LLM)
– BERT (Bidirectional Encoder Representations from Transformers) – is harnessed
for detecting AI-generated text. Renowned for its efficacy, BERT is pre-trained on
extensive corpora, endowing it with nuanced understandings of word and sentence
representations, thus encompassing intricate syntactic and semantic relationships.
Contextualized word embeddings, intrinsic to BERT, further enrich its capacity
to comprehend content intricacies. The amalgamation of knowledge-driven features
derived from the knowledge graph and the learned representations of LLMs heightens
accuracy by capitalizing on their synergistic strengths. Accordingly, BERT emerged
as the model of choice for this experimental classification task.
Training the BERT model commences with dataset retrieval from a CSV file,
followed by partitioning into training and testing subsets. Tokenization via the
BERT tokenizer ensues, initializing the BERT model for subsequent steps. Tokenizer
encodings materialize into TensorFlow Dataset objects, fueling training and testing
endeavors. Fine-tuning transpires on the training dataset, with accuracy serving as
the evaluation metric on the test dataset.
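A condensed sketch of this pipeline using the Hugging Face transformers library; the checkpoint name, sequence length, learning rate, epoch count, and toy data are illustrative assumptions (the paper loads its dataset from a CSV file instead).

import tensorflow as tf
from transformers import BertTokenizer, TFBertForSequenceClassification

# Toy stand-ins for the training split read from the CSV file.
texts = ["an AI-written paragraph ...", "a human-written paragraph ..."]
labels = [1, 0]  # 1 = AI-generated, 0 = human-written

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = TFBertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Tokenizer encodings materialize into a tf.data.Dataset, as described.
enc = tokenizer(texts, truncation=True, padding=True, max_length=128,
                return_tensors="tf")
train_ds = tf.data.Dataset.from_tensor_slices((dict(enc), labels)).batch(8)

# Fine-tune, with accuracy as the evaluation metric.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"])
model.fit(train_ds, epochs=2)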
Each batch undergoes processing before the model’s weights are updated. This judicious
approach expedites training iterations and facilitates expedient model evaluation.
Algorithm 3 Knowledge Graph Construction for AI-Generated Images
Input: CIFAKE Dataset (60,000 AI-generated images, 60,000 real photos)
Output: Knowledge Graph representing relationships and attributes of images
1: Construct Nodes:
2: Extract high-level visual features using CNN or pre-trained models (e.g., VGG16).
3: Encode images as nodes with attributes like image features, metadata, and class
labels.
4: Establish Relationships:
5: Create relationships between nodes based on semantic similarities.
6: Determine relationships through visual similarity or shared attributes.
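A toy rendition of Algorithm 3, using VGG16 features and a networkx graph; the cosine-similarity threshold, the pooling choice, and the random placeholder images are illustrative assumptions.

import numpy as np
import networkx as nx
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

# Pre-trained VGG16 as a fixed feature extractor (classifier head removed).
extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")

def build_image_kg(images, labels, sim_threshold=0.9):
    """images: (N, 32, 32, 3) array; labels: 'real' / 'fake' per image."""
    feats = extractor.predict(preprocess_input(images.astype("float32")),
                              verbose=0)
    feats /= np.linalg.norm(feats, axis=1, keepdims=True)

    kg = nx.Graph()
    for i, (feat, lab) in enumerate(zip(feats, labels)):
        kg.add_node(i, features=feat, class_label=lab)  # node + attributes

    # Relate nodes whose visual features are sufficiently similar.
    sims = feats @ feats.T
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            if sims[i, j] >= sim_threshold:
                kg.add_edge(i, j, weight=float(sims[i, j]))
    return kg

images = np.random.randint(0, 255, (16, 32, 32, 3), dtype=np.uint8)
labels = ["fake" if i % 2 else "real" for i in range(16)]
kg = build_image_kg(images, labels)
print(kg.number_of_nodes(), kg.number_of_edges())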
The architecture culminates with an output layer that includes a Dense layer
with 1 unit and a sigmoid activation function. This specific configuration generates
prediction probabilities tailored to the binary classification task, offering indications
as to whether an image is AI-generated or not.
Model Compilation: The final model is compiled employing the Adam optimizer,
a binary cross-entropy loss function, and evaluation metrics encompassing accuracy,
precision, and recall.
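A minimal Keras sketch of this configuration; the convolutional layers shown are assumed placeholders for the stack described earlier, while the Dense sigmoid output and the compilation settings follow the text directly.

import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),          # CIFAKE-sized RGB input
    layers.Conv2D(32, 3, activation="relu"),  # assumed placeholder stack
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    # Output layer from the text: one unit with sigmoid activation,
    # yielding the probability that an image is AI-generated.
    layers.Dense(1, activation="sigmoid"),
])

# Compilation as described: Adam optimizer, binary cross-entropy loss,
# and accuracy/precision/recall as evaluation metrics.
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy",
                       tf.keras.metrics.Precision(),
                       tf.keras.metrics.Recall()])
model.summary()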
Fig. 2: Architecture of Model for AI-Generated Image Detection
By integrating the knowledge graphs constructed for AI-generated images and
AI-generated text, we create a multi-modal knowledge graph that captures the interde-
pendencies and correlations between different modalities. This integration allows us to
leverage the strengths of both text and image analysis techniques, some of which have
been described above, leading to improved multi-modal AI-generated content detec-
tion. The combined knowledge graph facilitates cross-modal reasoning and provides a
holistic understanding of the data.
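One minimal way to realize this integration with networkx; the node identifiers, attributes, and the shared-label linking rule are hypothetical, since the cross-modal linking criterion is left open above.

import networkx as nx

# Tiny stand-ins for the text and image knowledge graphs built earlier.
text_kg = nx.Graph()
text_kg.add_node("t0", modality="text", class_label="ai")
text_kg.add_node("t1", modality="text", class_label="real")

image_kg = nx.Graph()
image_kg.add_node("i0", modality="image", class_label="ai")
image_kg.add_node("i1", modality="image", class_label="real")

# Union keeps both node sets; cross-modal edges then link nodes whose
# class labels agree (an assumed linking rule for illustration).
multimodal_kg = nx.union(text_kg, image_kg)
for t, t_attrs in text_kg.nodes(data=True):
    for i, i_attrs in image_kg.nodes(data=True):
        if t_attrs["class_label"] == i_attrs["class_label"]:
            multimodal_kg.add_edge(t, i, relation="same_class")

print(list(multimodal_kg.edges()))  # [('t0', 'i0'), ('t1', 'i1')]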
4 Results
4.1 Text Detection
Fig. 3: Accuracy Comparison Plot
The accuracy comparison graph (Figure 3) presents a box plot depicting the
achieved accuracy for each model. The assessment of these models was conducted
using identical datasets to ensure uniformity. Notably, the BERT model exhibited the
highest accuracy of 0.8999. The Random Forest Classifier followed with an accuracy
of 0.7867, whereas the SVM model recorded an accuracy of 0.7334.
Fig. 7: ROC Curve
Fig. 8: F1 Score Plot
Figure 8 illustrates the F1 score plot, delineating the equilibrium between precision
and recall for each model. The F1 score for the BERT model was the highest among the
models, signifying its adeptness at achieving a harmonious balance between precision
and recall. Conversely, both SVM and the Random Forest models exhibited marginally
lower F1 scores, implying prospective compromises between precision and recall within
their predictive outcomes.
4.2 Image Detection
The outcomes for the CNN model employed in AI-generated image detection are pre-
sented as follows:
The CNN-ANN model underwent training and evaluation using the CIFAKE dataset
to identify AI-generated images. The results obtained showcase its promising perfor-
mance, clearly distinguishing between AI-generated and real-world images. The model
achieved a minimal loss value of 0.1929, demonstrating its capability to minimize the
disparity between predicted and actual labels after 10 epochs of training. On the val-
idation dataset, the model exhibited an accuracy of 93.55%, signifying a remarkably
high rate of accurate predictions overall.
To perform a comprehensive comparative analysis of the CNN model in conjunction
with its contemporaries, namely the Support Vector Machine (SVM) and Random
Forest (RF) models, all three were subjected to training using the CIFAKE dataset.
Visualizing the results, the bar plot presented in Figure 10 offers an illuminating
comparison of achieved accuracy levels.
Significantly, the CNN model stands out by attaining an impressive accuracy of
93.55%. This performance surpasses both the SVM and RF models, which achieved
accuracy rates of 84.84% and 83.41%, respectively. This disparity underscores the
exceptional discriminative prowess of the CNN model in distinguishing between AI-
generated and authentic images.
The confusion matrices for each of the models are displayed in the annotated
figures 11, 12, and 13, while a tabulated overview comparing the performance of these
confusion matrices is presented in the subsequent table. In this context, "Positive"
denotes an AI-generated image, whereas "Negative" signifies a real image.
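For reference, metrics of this kind follow directly from the predicted and true labels under that convention; a short scikit-learn sketch with toy label vectors (the numbers are illustrative, not the paper's results).

from sklearn.metrics import confusion_matrix, precision_score, recall_score

# 1 = AI-generated ("Positive"), 0 = real ("Negative"), as defined above.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print("precision:", precision_score(y_true, y_pred))
print("recall:", recall_score(y_true, y_pred))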
Fig. 11: Confusion Matrix (CNN Model)
Fig. 12: Confusion Matrix (SVM Model)
Fig. 13: Confusion Matrix (Random Forest Model)
Drawing from these findings, it can be inferred that the CNN model demonstrates
superior performance compared to both the SVM and Random Forest models in effec-
tively distinguishing between AI-generated and real images. The CNN model attained
the highest instances of accurate predictions (18718) along with the lowest occurrences
of incorrect predictions (1292), underscoring its exceptional performance.
The ROC curve, displayed in Figure 14, provides a visual representation of the
models’ performance. This curve adeptly highlights the intricate equilibrium between
the true positive rate and the false positive rate. The metric of the area under the
curve (AUC) serves as a robust indicator of the models’ overall efficacy, with elevated
values correlating to enhanced class distinction. Specifically, within this context, the
CNN model demonstrated an AUC of 0.9359. In comparison, the SVM and Random
Forest models secured AUC values of 0.9267 and 0.9132, respectively.
The Precision-Recall curve, juxtaposing the three models, is visualized in Figure 15.
Precision, gauging the ratio of accurately predicted AI-generated images to all pro-
jected AI-generated images, stood at 0.9469 for the CNN model. This commendable
precision score underlines the model’s efficacy in curtailing false positives, thereby
mitigating the likelihood of misidentifying genuine images as AI-generated. In addi-
tion, the model achieved a recall rate of 0.9122, signifying the proportion of correctly
identified AI-generated images out of the entire set of actual AI-generated images.
This notable recall value suggests the model’s proficiency in minimizing false
negatives, ensuring the detection of a significant majority of AI-generated images
within the dataset.
Fig. 15: Precision Recall Curve
5 Conclusion
The proliferation of AI-generated content has ushered in a surge in the velocity and
quantum of information available online. In this context, the imperative to differen-
tiate human-developed data from its AI-generated counterparts becomes paramount.
The detection of AI-generated text harnesses a fusion of large language models, style
assessment, and visual scrutiny to discern text fashioned by AI models themselves.
Conversely, AI-generated images can be discerned
through meticulous pixel analysis, with deviations in color, pattern, and structure
serving as telltale signs of synthetic content. Detecting deepfake content necessitates
consideration of naturally occurring biological cues often overlooked during synthetic
content creation. Cues encompassing eye movements, blood pulsations, and quintessen-
tial human expressions emerge as discernible parameters. Consolidating the insights
presented in this research, it is evident that AI-generated content is poised to domi-
nate the future landscape. Yet, the creation of a model that achieves absolute precision
in identifying such content remains a formidable challenge. Although countermea-
sures such as watermarking and cryptographic signatures will evolve, the potency of
systems generating synthetic content will concurrently heighten. Hence, content val-
idation prior to dissemination and responsible usage emerge as pivotal strategies as
society adjusts to the coexistence of both content paradigms.
6 Future Scope
The realm of generative artificial intelligence is rapidly advancing, offering numerous
potential directions for expanding this research. One avenue involves working with
larger and more diverse datasets, delving into the scalability and architecture of these
models. The success of each model is tightly intertwined with the datasets it is
trained upon. Another avenue to explore is integrating the audio modality, followed by
the incorporation of videos or deepfakes. A proposed approach for the latter has been
briefly mentioned. Additionally, fusing all these diverse models to create a cross-modal
detector could yield more potent and resilient detection outcomes, achieved through the
exploration of novel techniques. Furthermore, a prospective avenue is the development
of a real-time detection system. Such a system could offer significant value by ensuring
prompt content assessment. Finally, while the current models demonstrate predictions
with high accuracy, their intricate architecture might pose challenges in terms of inter-
pretation. Strategies such as attention mechanisms or visualizing learned traits could
greatly enhance the models’ interpretability, shedding light on the underlying reasons
behind their decisions.
References
[1] M. S. Rana, M. N. Nobi, B. Murali and A. H. Sung, "Deepfake Detection: A
Systematic Literature Review," IEEE Access, vol. 10, pp. 25494-25513, 2022,
doi: 10.1109/ACCESS.2022.3154404.
[2] D. A. Coccomini et al., "Combining EfficientNet and Vision Transformers for
Video Deepfake Detection," Image Analysis and Processing - ICIAP 2022: 21st
International Conference, Lecce, Italy, May 23-27, 2022, Proceedings, Part III,
Cham: Springer International Publishing, 2022.
[4] N. Bonettini et al., "Video Face Manipulation Detection Through Ensemble of
CNNs," 2020 25th International Conference on Pattern Recognition (ICPR),
IEEE, 2021.
[5] S. Mathews, S. Trivedi, A. House et al., "An Explainable Deepfake Detection
Framework on a Novel Unconstrained Dataset," Complex Intell. Syst., 2023.
[6] D. Dagar and D. K. Vishwakarma, "A Literature Review and Perspectives in
Deepfakes: Generation, Detection, and Applications," Int. J. Multimed. Info.
Retr., vol. 11, pp. 219-289, 2022.
[7] Y. Ma, J. Liu and F. Yi, "Is This Abstract Generated by AI? A Research for the
Gap Between AI-Generated Scientific Text and Human-Written Scientific Text,"
arXiv preprint arXiv:2301.10416, 2023.
[11] Y. Ma, J. Liu, F. Yi, Q. Cheng, Y. Huang, W. Lu and X. Liu, "AI vs. Human -
Differentiation Analysis of Scientific Content Generation," arXiv:2301.10416,
2023.
[12] ML Olympiad - Detect ChatGPT Answers, Kaggle contest,
https://www.kaggle.com/competitions/ml-olympiad-detect-chatgpt-answers/data
[13] V. S. Sadasivan, A. Kumar, S. Balasubramanian, W. Wang and S. Feizi, "Can
AI-Generated Text be Reliably Detected?," arXiv:2303.11156, 2023.
[14] J. Kirchenbauer, J. Geiping, Y. Wen, J. Katz, I. Miers and T. Goldstein, "A
Watermark for Large Language Models," arXiv:2301.10226, 2023.
[16] J. J. Bird and A. Lotfi, "CIFAKE: Image Classification and Explainable
Identification of AI-Generated Synthetic Images," arXiv preprint
arXiv:2303.14126, 2023.