Fake News Documentation
With so much information and news in circulation, one question arises: is a given piece of news or information true or fake? Fake news is commonly distributed with an intent to mislead or to create a bias for political or monetary gain. Consider an example: in the recent elections in India, there has been much discussion about the credibility of different news reports favoring certain candidates and the political motives behind them. Given this growing interest, exposing fake news is paramount in preventing its negative impact on people and society.

The World Wide Web contains data in various formats such as documents, videos, and audio. News distributed online in an unstructured form (such as news, articles, videos, and audio) is relatively hard to detect and classify, as this strictly requires human expertise. However, computational techniques such as natural language processing (NLP) can be used to identify irregularities that distinguish a content

The algorithms used by fake news detection systems include machine learning algorithms such as Logistic Regression, Random Forests, Decision Trees, Support Vector Machines, Stochastic Gradient Descent, and so on. A simple method of fake news detection based on one of the AI algorithms, the Naive Bayes classifier, helps to examine how this particular method works for this problem with a manually labeled (fake or real) dataset and to support the idea of using machine learning to detect fake news.

II LITERATURE REVIEW

[1]Paper Name: - Evaluating Machine Learning algorithms for Fake News Detection.
Author: - Shloka Gilda.
In this article, the author introduced the concept of the importance of NLP in detecting incorrect information. They used term frequency-inverse document frequency (TF-IDF) of bigrams and probabilistic context-free grammar
IMPACT FACTOR 6.228 WWW.IJASRET.COM DOI: 10.51319/2456-0774.2021.6.0067 310
|| Volume 6 || Issue 6 || June 2021 || ISSN (Online) 2456-0774
INTERNATIONAL JOURNAL OF ADVANCE SCIENTIFIC RESEARCH
AND ENGINEERING TRENDS
detection. Shloka Gilda introduced the concept of the importance of NLP in detecting incorrect information. They used a bigram count vectorizer and Probabilistic Context-Free Grammar (PCFG) to detect deception. They evaluated the data set with more than one class of algorithms to find a better model. The count vectorizer of bigrams, fed directly into a stochastic gradient descent model, identifies non-credible sources with an accuracy of 71.2%.

[2]Paper Name: - Fake News Detection on Social Media: A Data Mining Perspective.
Author: - Kai Shu, Amy Sliva, Suhang Wang, Jiliang Tang and Huan Liu.
This paper presents a data mining perspective on detecting fake news on social media, including the characterization of fake news in psychology and social theories. The article looks at two main factors responsible for the widespread acceptance of fake messages by users: naive realism and confirmation bias. It proposes a general two-phase data mining framework that includes 1) feature extraction and 2) modeling, analyzing data sets, and confusion matrix for detecting fake news.

[3]Paper Name: - Media Rich Fake News Detection: A Survey.
Author: - Shivam B. Parikh and Pradeep K. Atrey.
News on social networking sites is presented mainly in three ways. Text: the (multilingual) text is analyzed with the help of computational linguistics, which focuses semantically and systematically on the creation of the text. Since most publications are in the form of text, a lot of work has been done on analyzing them. Multimedia: several forms of media are integrated into a single post. This can include audio, video, images, and graphics. This is visually appealing and captures the viewer's attention regardless of the text. Hyperlinks: these allow the author of the post to refer to various sources and thus gain the trust of viewers. In practice, references are made to other social media websites, and screenshots are inserted.

[4]Paper Name: - Fake News Detection using Naive Bayes classifier.
Author: - Mykhailo Granik and Volodymyr Mesyura.
This article describes a simple method of fake news detection based on one of the artificial intelligence algorithms called the Naive Bayes classifier. The goal of the research is to examine how this particular method works for this problem with a manually labeled (fake or real) dataset and to support the idea of using machine learning to detect fake news. The difference between this article and articles on similar topics is that this article is extensively based on a Naive Bayes classifier, which is used for the classification of fake news and real news. In addition, the developed system was tested on a relatively new data set, which provided the opportunity to evaluate its performance against the most recent data.

III PROPOSED METHODOLOGY

This project aims to find a way to use Natural Language Processing (NLP) to identify and classify fake news articles. The main objective is to detect fake news, which is a classic text classification problem. We will gather our data, preprocess the text, and convert our articles into features for use in supervised models. We will use a Passive-Aggressive classifier, training on the data set and testing on news articles.
In this project, we will be using Python and the scikit-learn library. Python has a great set of libraries and plugins that can be used in machine learning. The scikit-learn library is a strong resource for machine learning: it makes almost all types of machine learning algorithms easily available in Python, so a simple and quick evaluation of ML algorithms is possible. We used Flask to deploy the model, with HTML, CSS, and JavaScript for the front end.

IV SYSTEM DESIGN

V IMPLEMENTATION

1. Data Collection :
The first step is data collection. The machine learning approach used in this project is supervised learning. Learning is said to be supervised when the model is trained on a data set that contains both input and output parameters. To train the model, we took the dataset from kaggle.com. The size of the dataset is 20000*5, meaning it has 20000 news articles and 5 attributes.
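
The pipeline outlined in the proposed methodology (convert articles into TF-IDF features, then train a Passive-Aggressive classifier on labeled data) can be sketched as follows. This is a minimal, dependency-free illustration under stated assumptions, not the project's actual code: the four toy headlines, their labels, and every function name are hypothetical, and in practice the hand-written functions would likely be replaced by scikit-learn's `TfidfVectorizer` and `PassiveAggressiveClassifier`.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in docs."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

def tfidf(doc, idf):
    """Map one document to an L2-normalised sparse TF-IDF vector (a dict)."""
    counts = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def score(weights, x):
    """Dot product between the weight vector and a sparse feature vector."""
    return sum(weights.get(t, 0.0) * v for t, v in x.items())

def train_passive_aggressive(vectors, labels, C=1.0, epochs=5):
    """PA-I rule: stay passive when the margin is at least 1, otherwise
    move the weights just far enough (capped by C) to correct the example."""
    weights = {}
    for _ in range(epochs):
        for x, y in zip(vectors, labels):  # y is +1 (real) or -1 (fake)
            loss = max(0.0, 1.0 - y * score(weights, x))
            sq_norm = sum(v * v for v in x.values())
            if loss > 0.0 and sq_norm > 0.0:
                tau = min(C, loss / sq_norm)
                for t, v in x.items():
                    weights[t] = weights.get(t, 0.0) + tau * y * v
    return weights

def predict(weights, x):
    """Return +1 (real) or -1 (fake)."""
    return 1 if score(weights, x) >= 0 else -1

# Hypothetical toy data standing in for the labeled Kaggle dataset.
train_docs = [
    "government confirms budget figures in official report",
    "court publishes ruling after public hearing",
    "shocking miracle cure doctors hate revealed",
    "secret plot exposed you will not believe",
]
train_labels = [1, 1, -1, -1]

idf = fit_idf(train_docs)
weights = train_passive_aggressive([tfidf(d, idf) for d in train_docs], train_labels)

for headline in ["official report confirms ruling", "shocking secret cure exposed"]:
    label = predict(weights, tfidf(headline, idf))
    print(headline, "->", "real" if label == 1 else "fake")
```

The update rule is what gives the classifier its name: correctly classified examples with a sufficient margin leave the weights untouched, while misclassified ones trigger a correction whose size is bounded by `C`.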
RESEARCH PAPER – 2
Received November 10, 2021, accepted November 16, 2021, date of publication November 18, 2021, date of current version November 30, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3129329
ABSTRACT
A prominent issue of the present time is that organizations from different domains are struggling to obtain effective solutions for detecting online fake news. It is quite challenging to distinguish fake information on the internet, as it is often written to deceive users. Compared with many machine learning techniques, deep learning-based techniques are capable of detecting fake news more accurately. Previous review papers were based on data mining and machine learning techniques, scarcely exploring the deep learning techniques for fake news detection. In particular, emerging deep learning-based approaches such as Attention, Generative Adversarial Networks, and Bidirectional Encoder Representations from Transformers are absent from previous surveys. This study attempts to investigate advanced and state-of-the-art fake news detection mechanisms thoroughly. We begin by highlighting the consequences of fake news. Then, we proceed with a discussion of the datasets used in previous research and their NLP techniques. A comprehensive overview of deep learning-based techniques is presented to organize representative methods into various categories. The prominent evaluation metrics in fake news detection are also discussed. Finally, we suggest further recommendations to improve fake news detection mechanisms in future research directions.
INDEX TERMS Natural language processing, machine learning, deep learning, fake news.
The associate editor coordinating the review of this manuscript and approving it for publication was Sergio Consoli.
I. INTRODUCTION
The Internet has changed interaction and communication ways through low cost, simple access, and fast information dissemination. Therefore, social media and online portals have become more popular for news searches and reading for many people rather than traditional newspapers. Social media harms society by influencing major events even though it has become a powerful means of information. Especially after the presidential election of the U.S. in 2016, the issue of online false news has gained more popularity [1], [2]. According to Zhang and Ghorbani [3], voters might be easily controlled by deceptive political statements and claims. Inspection shows that false news or lies propagate more quickly through humans than original information and cause tremendous effects [4].
The terms rumor and fake news are closely interrelated. Fake news or disinformation is intentionally created. On the other hand, rumors are unconfirmed and questionable information that is spread without the aim to deceive [15]. On social media sites, spreaders’ intentions might be difficult to determine. As a result, any false or incorrect information is typically branded as misinformation on the Internet. Distinguishing real and fake information is challenging. However, many approaches have been adopted to address this issue. Various machine learning (ML) methods have been used to detect false information spread online in the case of knowledge verification [16], natural language processing (NLP) [16]–[18] and sentiment analysis [19]. Early research concentrated on leveraging textual information derived from the article’s content, such as statistical text features [20] and emotional information [21]–[23].
TABLE 1. A comparison of existing surveys based on fake news detection.
Deep learning (DL) has recently become an emerging technology among the research community and has proven to be more effective in recognizing fake news than traditional ML methods. DL has some particular advantages over ML, such as a) automated feature extraction, b) light dependence on data pre-processing, c) the ability to extract high-dimensional features, and d) better accuracy. Further, the current wide availability of data and programming frameworks has boosted the usage and robustness of DL-based approaches. Hence, in the last five years, numerous articles
have been published on fake news detection, mostly based on DL strategies [24]. An enthusiastic effort has been made to review the current literature to compare the extensive amount of DL-based fake news detection research efforts.
A number of research works have been published on the survey of fake news detection [5], [25], [26]. Our investigation reveals that existing studies do not provide a thorough overview of deep learning-based architectures for detecting fake news. The existing survey papers mostly cover the ML strategies in detecting fake news, scarcely exploring the DL strategies [3], [9], [10]. We provide a complete list of NLP techniques as well as describe their benefits and drawbacks. In this survey, we performed an in-depth analysis of current DL-based studies. Table 1 provides a brief overview of the existing survey papers and our research contributions. The present study aims to address the previous research’s weaknesses and strengths by conducting a systematic survey on fake news detection. First, we divide existing fake news detection research into two main categories: (1) Natural Language Processing (NLP) and (2) Deep Learning (DL). We discuss NLP techniques such as data pre-processing, data vectorizing, and feature extraction. Second, we analyze the fake news detection architectures based on different DL architectures. Finally, we discuss the evaluation metrics used in fake news detection. Figure 1 depicts an overall taxonomy of fake news detection approaches. We also include Table 2, which lists the acronyms used throughout the survey to assist researchers when encountering issues due to acronyms.
TABLE 2. The table contains the acronyms used in this survey.
The rest of the paper is organized as follows. Section II highlights the consequences of fake news. Section III describes the datasets used. Section IV explains the Natural Language Processing techniques in fake news detection. Section V contains an in-depth analysis of deep learning strategies. Section VI presents the evaluation metrics used in previous studies. Section VII narrates the challenges and future research direction. Finally, Section VIII concludes the paper.
II. FAKE NEWS CONSEQUENCES
There has always been fake news since the beginning of human civilization. However, the spread of fake news is increased by modern technologies and the transformation of the global media landscape. Fake news may have major consequences for social, political, and economic environments. Fake information and fake news have various faces. As information molds our view of the world, fake news has a huge impact. We make critical decisions based on information. By obtaining information, we develop an impression about a situation or people. We cannot make good decisions if we find fake, false, distorted, or fabricated information on the Internet. The primary impacts of fake news are as follows:
Impact on Innocent People: Rumors can have a major impact on specific people. These people may be harassed on social media. They may also face insults and threats that may have real-life consequences. People must not believe invalid information on social media or judge a person by it.
Impact on Health: The number of people searching for health-related news on the Internet is continuously increasing. Fake news in health has a potential impact on people’s lives [36]. Therefore, this is one of the major challenges today. Misinformation about health has had a tremendous impact in the last year [37]. Social media platforms have made some policy changes to ban or limit the spread of health misinformation as they face pressure from doctors, lawmakers, and health advocates.
Financial Impact: Fake news is currently a crucial problem in industries and the business world. Dishonest businessmen spread fake news or reviews to raise their profits. Fake information can cause stock prices to fall. It can ruin the reputation of a business. Fake news also has an impact on customer expectations. Fake news can create an unethical business mentality.
Democratic Impact: The media has discussed the fake news phenomenon significantly because fake news played a vital role in the last American presidential election. This is a major democratic problem. We must stop spreading fake news as it has a real impact.
III. BENCHMARK DATASET
In this section, we discuss the datasets used in various studies. For both training and testing, benchmark datasets
TABLE 3. The table provides details of publicly available datasets and corresponding URLs.
FIGURE 2. A pie chart of the benchmark datasets used in the studies of fake news
detection.
dataset and they reported an accuracy of 93.50% which is the highest, utilizing the
same dataset for fake news detection.
A pie chart of the benchmark datasets used is given in Figure 2.
IV. NATURAL LANGUAGE PROCESSING
Natural Language Processing (NLP) is an area of machine learning concerned with the capability of a computer to understand, analyze, manipulate, and potentially generate human language. The NLP technique consists of data pre-processing and word embedding. By utilizing deep learning techniques, NLP has seen some colossal advancements in recent years [41]. Natural language must be transformed into a mathematical structure before machines can make sense of it. In Sections IV-A, IV-B, and IV-C, NLP techniques are discussed.
A. DATA PRE-PROCESSING
Data pre-processing is used to represent complex structures with attributes, binarize attributes, convert discrete attributes, and manage missing and unknown attributes.
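
The operations named above (handling missing attributes, binarizing an attribute, and converting a discrete attribute) can be illustrated with a short sketch. It is only an example under stated assumptions: the record fields `shares` and `source`, the threshold of 100, and the function name are hypothetical and do not come from any dataset discussed in this survey.

```python
def preprocess(records, threshold=100):
    """Turn raw attribute dicts into numeric feature rows.

    Steps: fill missing numeric values with the mean of the observed ones,
    binarize the numeric attribute against a threshold, and one-hot encode
    the discrete attribute (missing values fall into an "unknown" bucket).
    """
    observed = [r["shares"] for r in records if r.get("shares") is not None]
    mean_shares = sum(observed) / len(observed) if observed else 0.0

    categories = sorted({r.get("source") or "unknown" for r in records})

    rows = []
    for r in records:
        shares = r.get("shares")
        if shares is None:                 # manage a missing attribute
            shares = mean_shares
        widely_shared = 1 if shares >= threshold else 0   # binarize
        source = r.get("source") or "unknown"             # manage an unknown value
        one_hot = [1 if source == c else 0 for c in categories]  # discrete -> numeric
        rows.append([widely_shared] + one_hot)
    return categories, rows

# Hypothetical records with one missing share count and one missing source.
records = [
    {"shares": 250, "source": "blog"},
    {"shares": None, "source": "newspaper"},
    {"shares": 20, "source": None},
]

categories, rows = preprocess(records)
print(categories)
for row in rows:
    print(row)
```

Each output row holds the binarized share flag followed by one column per source category, which is the kind of purely numeric structure a downstream model expects.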
FIGURE 4. A nested pie chart illustrating the percentage of published articles and popular models each year.
FIGURE 5. The diagram illustrates the general deep learning-based architecture that was used in most studies.
cutting-edge artificial neural networks. Therefore, we provide Figure 4, which shows the percentage of DL-based fake news detection papers with used classifiers in recent years.
After inspecting previous studies, we found a general framework for deep learning-based fake news detection. The first step was to collect a dataset or create one. Most studies have used news articles collected from publicly available datasets. The pre-processing technique was applied after collecting the dataset to feed the data into a neural network [42], [96], [131]. Word2vec and GloVe word embedding methods have mostly been used in previous studies to map words into vectors [41], [78], [80]. We present an overall process for fake news identification with deep learning in Figure 5 based on various studies [40], [42], [61].
148 DL-based studies were examined to provide a detailed description of these architectures: CNN in Section V-A, RNN in Section V-B, Graph Neural Network in Section V-C, Generative Adversarial Network in Section V-D, Attention Mechanism in Section V-E, Bidirectional Encoder Representations from Transformers in Section V-F, and Ensemble Approach in Section V-G.
A. CONVOLUTIONAL NEURAL NETWORK (CNN)
A few deep learning models have been introduced to handle ambiguous detection issues. CNNs and RNNs are the most interesting models [77]. Researchers are trying to boost the performance of the fake news detector with CNN by taking its power of extracting features well and better classification process [132]. However, CNNs are also gaining popularity in the NLP technique too. It is utilized for mapping the features of n-gram patterns. The CNN is similar to a multilayer perceptron (MLP) as it is an unsupervised multilayer feed-forward neural network [45]. The CNN consists of an input layer, an output layer, and a sequence of hidden layers. CNNs are mostly used for picture recognition and classification. Neural networks with 100 or more hidden layers have been reported in recent studies. Backward-propagation and forward-propagation algorithms are utilized in neural networks. These algorithms are used to train neural networks by updating the weights of each layer. The gradient (derivative) of the cost function is utilized to update the weights. When the sigmoid activation function is applied, the value of the gradient decreases per layer. This lengthens the training time. This problem is called the vanishing-gradient problem. A deeper CNN or a direct connection in dense solves this problem. Compared to a normal CNN, a deeper CNN is also less vulnerable to overfitting [67]. Kaliyar et al. [40] proposed a model FNDNet (deep CNN), which is designed to learn the discriminatory features for fake news detection using multiple hidden layers. The model is less prone to overfitting but takes a longer time to train. The convolutional layer, pooling layer, and
FIGURE 6. The figure shows the architecture of CNN. Here, an input picture of a snowflake is given to the CNN picture classifier. The input goes through a series of convolution layers, pooling layer, fully connected layers, and classifies the object based on learned features.
1) CONVOLUTION LAYER
CNNs work very well with image classification and computer vision because of the convolution operation, and their ability to extract features from inputs for better representation makes them very efficient. These properties make CNNs powerful in sequence processing [131]. Fernández-Reyes and Shinde [77] proposed a CNN architecture called StackedCNN (2-dimensional convolution layers, rather than 1-dimensional convolutions). It is proven that a fusion of pre-trained word embeddings with 2-dimensional convolutional layers helps in finding patterns in text data, but the performance of the StackedCNN is poor compared to state-of-the-art CNN. Another study by Li et al. [132] adopted a novel approach with multilevel CNN (MCNN) and Sensitive word’s weight calculating method (TFW). MCNN-TFW successfully captured semantic information from the article text content. For this reason, it outperforms the compared methods, including CNN. Their work did not consider latent-based features. Alsaeedi and Al-Sarem [45] added more convolution layers, and it has an impact on the proposed model performance. According to the results, the model’s performance is lowered by about 0.014.
2) POOLING LAYER
A pooling operation that chooses the greatest component from each patch of each feature map covered by the filter is called max pooling. A pooling layer is a new layer attached to the convolutional layer. Its purpose is to continuously diminish the spatial size of the representation in order to decrease the number of parameters and the calculation inside the network. The pooling layer operates autonomously on each feature map. Max pooling or average pooling is the most commonly used function in fake news detection. Alsaeedi and Al-Sarem [45] adjusted the hyperparameter settings in a CNN. They found the best parameter settings that gave an improvement in the model’s performance. The recommended CNN model performs best when the number of units in the dense layer is set to 100, the number of filters is set to 100, and the window size is set to 5. The GlobalMaxPooling1D method achieved the highest scores, showing that it works well for fake news detection when compared to other pooling methods [45].
3) REGULARIZATION LAYER
The most crucial problem of classification is to reduce the training and test errors of the classifier. Another common issue is the over-fitting problem (the space between training and testing errors is huge). Overfitting makes it difficult to generalize the model as it becomes more applicable (overfit) to the training set. Regularization is a solution to the overfitting problem. Regularization is applied to the model to lessen the problem of overfitting and decrease the error of generalization, but not the error of training [45]. The dropout regularization method is mostly used for fake news detection [133]. Other methods such as early stopping and weight penalties were not used in previous studies on fake news detection. Dropout avoids overfitting by gradually filtering out neurons. Eventually, all weights are calculated as an average so that the weight is not too high for a single neuron.
B. RECURRENT NEURAL NETWORK (RNN)
The RNN is a type of neural network. In RNN, nodes are sequentially connected to construct a directed graph. The output from the earlier step serves as the input to the current step. RNNs are effective in time and sequence-based predictions. RNN is less compatible with features compared to CNN. RNNs are suitable for studying sequential texts and expressions. However, it cannot process very long sequences when tanh or ReLU is used as an activation function.
The backward-propagation algorithm is utilized in the RNN for training. While training the neural networks, it is required to take tiny steps frequently in the way of the negative error derivative concerning network weights to establish
a minimum error function. The size of the gradients becomes tiny for each consequent layer. Thus, the RNN suffers from a vanishing gradient issue in the bottom layers of the network. We can deal with the vanishing gradient problem by using three solutions: (1) using the rectified linear unit (ReLU) activation function, (2) using the RMSProp optimization algorithm, and (3) using a different network architecture such as long short-term memory networks (LSTM) or gated recurrent unit (GRU). So previous studies focused on LSTM and GRU rather than the state-of-the-art RNN [80], [96], [134]. Bugueño et al. [80] proposed a model based on RNN for propagation tree classification. The authors used RNN for sequence analysis. The number of epochs was set as 200, which is relatively high in comparison to their training examples. To predict fake news articles, authors have proposed distinctive RNN models, specifically LSTM, GRU, tanhRNN, unidirectional LSTM-RNN, and vanilla RNN. RNNs, and in particular LSTM, are especially successful in processing sequential data (human language) and capturing significant features from diverse data sources. Further, in Sections V-B1 and V-B2, we discuss LSTM and GRU.
FIGURE 7. The figure shows an architecture of basic RNN with n sequential layers. x represents the inputs and y represents the output generated by the RNN.
1) LONG SHORT-TERM MEMORY (LSTM)
LSTM models are front runners in NLP problems. LSTM is an artificial recurrent neural network framework used in deep learning. LSTM is an advanced variation of RNN [41]. RNNs are not capable of learning long-term dependencies because back-propagation in recurrent networks takes a while, particularly for the evolving backflow of error. However, LSTM can keep ‘‘Short Term Memories’’ for ‘‘Long periods.’’ The LSTM is made up of three gates (an input gate, an output gate, and a forget gate) and a cell. Through a combination of the three gates, it calculates the hidden state. The cell can recall values over a large time interval. For this reason, a word near the beginning of the content can impact the output for a word later in the sentence [67]. LSTM is an exceptionally viable solution for addressing the vanishing gradient issue. Bahad et al. [61] proposed an RNN model that suffers from the vanishing gradient issue. To tackle this issue, they implemented an LSTM-RNN. But still, LSTM could not solve the vanishing gradient issue completely. The LSTM-RNN model had a higher precision compared to the initial state-of-the-art CNN. Asghar et al. [135] proposed bidirectional LSTM (Bi-LSTM) with CNN for rumor detection. The model preserves the sequence information in both directions. The Bi-LSTM layer is effective in remembering long-term dependency. Even though the BiLSTM-CNN beat the other models, the suggested approach is computationally expensive.
A study by Ruchansky et al. [123] suggested a model called CSI, which comprises three modules, Capture, Score, and Integrate. The capture module extracts features from the article, and the score module extracts features from the user. Then by integrating article and user-based features, the CSI model performs the prediction for fake news detection. The CSI model has fewer parameters than other RNN-based models. Another study by Sahoo and Gupta [136] proposed an approach with both user profile and news content features for detecting false news on Facebook. The authors used LSTM to identify fake news, and a set of new features are extracted by Facebook crawling and Facebook API. It requires more time to train and test the suggested model. Liao et al. [137] proposed a novel model called fake news detection multi-task learning (FDML). The model explores the influence of topic labels for fake news while also using contextual news information to improve detection performance on short false news. The FDML model, in particular, is made up of representation learning and multi-task learning components that train both the false news detection task and the news topic categorization task at the same time. However, the performance of the model decreases without the author’s information.
2) GATED RECURRENT UNIT (GRU)
In terms of structure and capabilities, GRU is comparatively simpler and more efficient than LSTM. This is because there are only two gates, namely reset and update. The GRU manages the information flow in the same manner as the LSTM unit does, but without the use of a memory unit. It exposes the entire hidden content with no control whatsoever. When it comes to learning long-term dependencies, the quality of GRU is way better than LSTM. Hence, it is a promising candidate for NLP applications [41]. GRUs are more straightforward as well as much more efficient compared to LSTM. GRU is still in its early stages; thus, we are seeing it being used lately to identify false news. GRU is a newer algorithm with a performance comparable to that of LSTM but greater computational efficiency. Li et al. [134] used a deep bidirectional GRU neural network (two-layer bidirectional GRU) as a rumor detection model. The model suffers from slow convergence. S and Chitturi [41] showed that it is difficult to determine whether one of the gated RNNs (LSTM, GRU) is more successful, and they are usually chosen on the basis of the available computing resources. Girgis et al. [96] experimented with CNN, LSTM, Vanilla, and GRU. Vanilla suffers from a gradient vanishing problem, but GRU solves this issue. Though GRU is said to be the best outcome of their studies, it takes more training time. A bidirectional GRU was utilized by Singhania et al. [87] for word-by-word annotation. With preceding and subsequent words, it captures the word’s meaning within the sentence. A study by Shu et al. [100] proposed a sentence-comment co-attention subnetwork model named dEFEND (Explainable fake news detection) utilizing news content and user comments for fake news detection. The authors considered textual information with bidirectional GRU (Bi-GRU) to achieve better performance. Moreover, the model has a low learning efficiency.
C. GRAPH NEURAL NETWORK (GNN)
A Graph Neural Network is a form of neural network that operates on the graph structure directly. Node classification is a common application of GNN. Essentially, every node in the network has a label, and the network predicts the labels of the nodes without using the ground truth. The network extends recursive neural networks by processing a broader class of graphs, including cyclic, directed, and undirected graphs, and it can handle node-focused applications without requiring any pre-processing steps [138]. GNN captures global structural features from graphs or trees better than the deep-learning models discussed above [139]. GNNs are prone to noise in the datasets. Adding a little amount of noise to the graph via node perturbation or edge deletion and addition has an antagonistic effect on the GNN output. Graph convolutional network (GCN) is considered one of the basic graph neural network variants.
A study by Huang et al. [140] claimed to be the first that experimented using a rich structure of user behavior for rumor detection. The user encoder uses graph convolutional networks (GCN) to learn a representation of the user from a graph created by user behavioral information. The authors used two recursive neural networks based on tree structure: bottom-up RvNN encoder and top-down RvNN encoder. The tree structure is shown in Figure 8. The proposed model performed worse for the non-rumor class because user behavior information brings some interference in non-rumor detection.
Another study by Bian et al. [139] proposed top-down GCN and bottom-up GCN using a novel method DropEdge [141] for reducing over-fitting of GCNs. In addition, a root feature enhancement operation is utilized to improve the performance of rumor detection. Although it performed well on three datasets (Weibo, Twitter15, Twitter16), the outliers in the dataset affected the models’ performance.
On the other hand, GCNs incur a significant memory footprint in storing the complete adjacency matrix. Furthermore, GCNs are transductive, which implies that inferred nodes must be present at the training time, and they do not guarantee generalizable representations [142]. Wu et al. [143] proposed an algorithm of representation learning with a gated graph neural network named PGNN (propagation graph neural network). The suggested technique can incorporate
structural and textual features into high-level representations by propagating information among neighbor nodes throughout the propagation network. To obtain considerable performance improvements, they also added an attention mechanism. The propagation graph is built using the who-replies-to-whom structure, but the follower-followee and forward relationships are omitted. Zhang et al. [144] presented a simplified aggregation graph neural network (SAGNN) based on efficient aggregation layers. Experiments on publicly accessible Twitter datasets show that the proposed network outperforms state-of-the-art graph convolutional networks while considerably lowering computational costs.
FIGURE 8. This figure illustrates the propagation tree structure encoder taken from Huang et al. [140].
D. GENERATIVE ADVERSARIAL NETWORK (GAN)
Generative Adversarial Networks (GANs) are deep learning-based generative models. The GAN model architecture consists of two sub-models: a generator model for creating new instances and a discriminator model for determining whether the produced examples are genuine or fake, that is, generated by the generator model. Existing adversarial networks are often employed to create images that may be matched to observed samples using a minimax game framework [44]. The generator model produces new images from the features learned from the training data that resemble the original image. The discriminator model predicts whether the generated image is fake or real. GANs are extremely successful in generative modeling and are used to train discriminators in a semi-supervised context to assist in eliminating human participation in data labeling. Furthermore, GANs are useful when the data have imbalanced classes or underrepresented samples. GANs produce synthetic data only if they are based on continuous numbers. GANs are therefore not directly applicable to NLP data, because all NLP data are based on discrete values such as words, letters, or bytes [145]. To train GANs for text data, novel techniques are required.
A study by Hiriyannaiah et al. [145] proposed sequence GAN (SeqGAN), which is a GAN architecture that overcomes the problem of gradient descent in GANs for discrete outputs by employing a reinforcement learning (RL)-based approach and Monte Carlo search. The authors provide actual news content to the GAN. Then a classifier based on Google’s BERT model was trained to identify the real samples from the samples generated by the GAN. The architecture of SeqGAN is provided in Figure 9.
FIGURE 9. A basic SeqGAN architecture. The figure is taken from Hiriyannaiah et al. [145].
The principle of adversarial learning originated in generative adversarial networks. The adversarial learning concept has produced outstanding results in a wide range of topics, including information retrieval [146], text classification [147], and network embedding [148]. The unique problem for detecting fake news is the recognition of false news on recently emergent events on social media. To solve this problem, Wang et al. [44] suggested an end-to-end architecture called event adversarial neural network (EANN). This architecture is used to extract event-invariant characteristics and, therefore, aids in the identification of false news on newly incoming events. It is made up of three major components: a multimodal feature extractor, a fake news detector, and an event discriminator. Another study by Le et al. [149] introduced Malcom, which generates malicious comments that fooled five popular fake news detectors (CSI, dEFEND, etc.) into classifying fake news as real news with 94% and 90% attack success rates. The authors showed that existing methods are not resilient against potential attacks. Though the model performed well, it is not evaluated using defense mechanisms, namely adversarial learning.
E. ATTENTION MECHANISM BASED
The attention-related approach is another notable advancement. In deep neural networks, the attention mechanism is an effort to implement the same behavior of selectively focusing on a few important items while ignoring others. Attention is a bridge that connects the encoder and decoder, which provides information to the decoder from each of the encoder’s hidden states. Using this framework, the model selectively concentrates on the valuable components from the input. Thus the model will be able to discover the associations among them. This allows the model to deal with lengthy input sentences more effectively. Unlike RNNs or CNNs, attention mechanisms maintain word dependencies in a sentence despite the distance between them. The primary downside of the attention mechanism is that it adds additional weight parameters to the model, which might lengthen the training time, especially if the model’s input data are long sequences.
A study by Long [150] proposed attention-based LSTM with speaker profile features, and their experimental findings suggest that employing speaker profiles can help enhance fake news identification. Recently, attention techniques have been used to efficiently extract information related to a mini query (article headline) from a long text (news content) [47], [87]. A study by Singhania et al. [87] used an automated detector through a three-level hierarchical attention network (3HAN). Three levels exist in 3HAN, one for words, one for sentences, and one for the headline. Because of its three levels of attention, 3HAN assigns different weights to different sections of an article. In contrast to other deep learning models, 3HAN yields understandable results. While 3HAN only uses textual information, a study by Jin et al. [47] used image features, including social context and text features, as well as attention on RNN (att-RNN). Another study used RNNs with a soft-attention mechanism to filter out unique linguistic features [151]. However, this method is based on distinct domain and community features without any external evidence. Thus, it provides a restricted context for credibility analysis.
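The attention idea described in Section E (score each encoder hidden state against a query, normalize the scores into weights, and pass a weighted summary to the decoder) can be illustrated with a minimal sketch. This is plain dot-product soft attention, not the method of any particular study surveyed here; the function names `softmax` and `attend` and the toy two-dimensional states are assumptions made for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, encoder_states):
    """Dot-product soft attention over a list of encoder hidden states.

    Returns (weights, context), where context is the weighted sum of states.
    """
    # One similarity score per encoder state
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Weighted sum of the states, computed one dimension at a time
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Toy example (hypothetical values): the query is most similar to the second state.
states = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
weights, context = attend([0.0, 1.0], states)
```

Because every state receives a weight regardless of its position, a distant but relevant word can still dominate the context vector, which is the property the text contrasts with RNNs and CNNs. Trained models add learned projection matrices around this step, which is where the extra weight parameters mentioned above come from.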
To overcome the shortcomings of previous works, Aloshban [152] proposed an automatic fake news classification through self-attention (ACT). Their principle is inspired by the fact that claim texts are fairly short and hence cannot be used for classification efficiently. Their suggested framework makes use of mutual interactions between a claim and many supporting responses. The LSTM neural network was applied to the article input. The outcome of the final step of LSTM may not completely reflect the semantics of the article. Connecting all vector representations of words in the text will lead to a massive vector dimension. Therefore, the internal connection between the articles’ words can be ignored. As a result, employing the self-attention function on the LSTM model extracts key parts of the article through several feature vectors. Their strategy is heavily reliant on self-attention and an article representation matrix. Graph-aware co-attention networks (GCAN) is an innovative approach for detecting fake news [153]. The authors predict if a source tweet article is false based just on its brief text content and user retweet sequence, as well as user profiles. Given the chronology of its retweeters, GCAN can determine whether a short-text tweet is fraudulent. However, this model is not suitable for long text as it is difficult to find the relationship between a long tweet and retweet propagation.
F. BIDIRECTIONAL ENCODER REPRESENTATIONS FROM TRANSFORMERS (BERT)
BERT is a deep learning model that has shown cutting-edge results across a wide variety of natural language processing applications. BERT incorporates pre-training language representations developed by Google. BERT is a sophisticated pre-trained word-embedding model built on a transformer-encoder architecture [89]. The BERT method is distinctive in its capacity to identify and capture contextual meaning in a sentence or text [90]. The main restriction of conventional language models is that they are unidirectional, which restricts the architectures that could be utilized during pre-training. The BERT model eliminates unidirectional limitations by using a masked language model (MLM). BERT employs the next sentence prediction (NSP) task in addition to the masked language model to jointly pre-train text-pair representations. BERT consists of two stages: pre-training and fine-tuning. During pre-training, the model is trained on unlabeled data using a variety of pre-training tasks. For fine-tuning, the BERT model is first initialized with the pre-trained parameters, and then all of the parameters are fine-tuned using labeled data from the downstream tasks. The architecture of the BERT model is shown in Figure 10.
FIGURE 10. The BERT architecture taken from Devlin et al. [89].
The data utilized in the BERT model are generic data gathered from Wikipedia and the Book Corpus. While these data contain a wide range of information, specific information on individual domains is still lacking. To overcome this problem, a study by Jwa et al. [75] incorporated news data in the pre-training phase to boost fake news identification skills. When compared to the state-of-the-art model stackLSTM, the proposed model named exBAKE (BERT with extra unlabeled news corpora) outperformed by a 0.137 F1-score. Ding et al. [154] discovered that including mental features such as a speaker’s credit history at the language level might considerably improve BERT model performance. The history feature helps further the relationship’s construction between the event and the person in reality. But these studies did not consider any pre-processing methods.
Zhang et al. [91] presented a BERT-based domain-adaptation neural network for multimodal false news detection (BDANN). BDANN is made up of three major components: a multimodal feature extractor, a domain classifier, and a false news detector. The pre-trained BERT model was used to extract text features, whereas the pre-trained VGG-19 model was used to extract image features in the multimodal feature extractor. The extracted features are then concatenated and sent to the detector to differentiate between fake and real news. Moreover, the existence of noisy images in the Weibo dataset has affected the BDANN results. Kaliyar et al. [92] proposed a BERT-based deep convolutional approach (fakeBERT) for fake news detection. The fakeBERT is a combination of different parallel blocks of a one-dimensional deep convolutional neural network (1d-CNN) with different kernel sizes and filters and the BERT. Different filters can extract convenient information from the training dataset. The combination of BERT with 1d-CNN can deal with both large-scale structure and unstructured text. Therefore, the combination is beneficial in dealing with ambiguity.
G. ENSEMBLE APPROACH
Ensemble approaches are strategies that generate several models and combine them to achieve better results. Ensemble models typically yield more precise solutions than a single model does. An ensemble reduces the spread, or dispersion, of the predictions and of model performance. Ensembling can be applied to supervised and unsupervised learning activities [86]. Many researchers have used an ensemble approach to boost their performance [42], [133]. Agarwal and Dixit [63] combined two datasets, namely, Liar and Kaggle, to evaluate the performance of LSTM and achieved an accuracy of 97%. They also used various models like CNN, LSTM, SVM, naive Bayes (NB), and k-nearest neighbour (KNN) for building an ensemble model. The authors showed an average accuracy score of their used algorithms but did not show the accuracy of their ensemble model, which is a limitation of their work.
Often the CNN-LSTM ensemble approach has been used in previous DL-based studies. Kaliyar [67] used an ensemble of CNN and LSTM, and the accuracy was slightly lower than that of the state-of-the-art CNN model. However, the precision and recall were effectively improved. Asghar et al. [135] obtained an increase in the efficiency of their model by using Bi-LSTM. The Bi-LSTM retains knowledge from both former and upcoming contexts before passing its output to the CNN model. Even though CNN and RNN typically require huge datasets to function successfully, Ajao et al. [133] trained LSTM-CNN with a smaller dataset. The abovementioned works considered just text-based features for fake news classification, whereas the addition of new features may generate a more significant result. While most studies used CNN with LSTM, a study by Amine et al. [131] merged two convolutional neural networks to integrate metadata with text. They illustrate that integrating metadata with text will result in substantial improvements in fine-grained fake news detection. Furthermore, when tested on real-world datasets, this approach shows improvements compared to the text-only deep learning model. Moving further, Kumar et al. [86] employed an attention layer. It assists the CNN + LSTM model in learning to pay attention to particular regions of input sequences rather than the full series of input sequences. Utilizing the attention mechanism with CNN+LSTM was reported to be efficient by a small margin. Result analysis of DL-based studies is presented in Table 7.
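The ensemble idea discussed in this part of the survey, combining the outputs of several models into one decision, can be sketched in two common forms: hard voting over predicted labels and soft voting over predicted probabilities. The sketch below is illustrative only and does not reproduce the ensembles of the cited studies; the function names, the 0.5 threshold, and the three example model outputs are assumptions.

```python
from collections import Counter

def majority_vote(predictions):
    """Hard voting: return the label predicted by most models.

    Ties are broken in favor of the label that appears first in `predictions`.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

def average_probability(fake_probs, threshold=0.5):
    """Soft voting: average each model's P(fake) and apply a threshold."""
    mean = sum(fake_probs) / len(fake_probs)
    return ("fake" if mean >= threshold else "real"), mean

# Hypothetical outputs of three base models (for example a CNN, an LSTM, and an SVM)
hard_label = majority_vote(["fake", "real", "fake"])
soft_label, mean_prob = average_probability([0.9, 0.4, 0.7])
```

Averaging smooths out the disagreement of individual models, which is one way to read the claim that an ensemble reduces the dispersion of predictions. Soft voting assumes the base models produce comparable, reasonably calibrated probabilities; when they do not, hard voting is the safer choice.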
TABLE 6. Strengths and limitations of popular existing studies, with references and the classifiers used.
whole two-dimensional field under the entire ROC curve. The FPR can be defined as in Equation (5).
$$\mathrm{FPR} = \frac{\mathrm{FalsePositive}}{\mathrm{FalsePositive} + \mathrm{TrueNegative}} \tag{5}$$
VII. CHALLENGES AND RESEARCH DIRECTION
Despite the fact that numerous studies have been conducted on the identification of fake news, there is always space for future advancement and investigation. In the context of recognizing fake news, we highlight challenges and several unique exploration areas for future studies. Although DL-based methods provide higher accuracy compared to the other methods, there is scope to make them more acceptable.
• The feature and classifier selection greatly influences the efficiency of the model. Previous studies did not place a high priority on the selection of features and classifiers. Researchers should focus on determining which classifier is most suitable for particular features. The long textual features require the use of sequence models (RNNs), but limited research works have taken this into account. We believe that studies that concentrate on the selection of features and classifiers might potentially improve performance.
• The feature engineering concept is not common in deep learning-based studies. News content and headline features are the widely used features in fake news detection, but several other features such as user behavior [154], user profile, and social network behavior need to be explored. Political or religious bias in profile features and lexical, syntactic, and statistical-based features can increase the detection rate. A fusion of deeply hidden text features with other statistical features may result in a better outcome.
• Propagation-based studies are scarce in this domain [117]. Network-based patterns of news propagation are a piece of information that has not been comprehensively utilized for fake news detection [159]. Thus, we suggest considering news propagation for fake news identification. Metadata and additional information can increase the robustness and reduce the noise of a single textual claim, but they must be handled with caution.
• Studies focused only on text data for fake news detection, whereas fake news is generated in sophisticated ways, with text or images that have been purposefully altered [95]. Only a few studies have used image features [109], [110]. Thus, we recommend the use of visual data (videos and images). An examination with video and image features is a research area that could lead to a stronger and more robust system.
• Studies that use a fusion of features are scarce in this domain [160]. Combining information from multiple sources may be extremely beneficial in detecting whether Internet articles are fake [95]. We suggest utilizing multi-model-based approaches with recent pre-trained word embeddings. Many other hidden features may have a great impact on fake news detection. Hence we encourage researchers to investigate hidden features.
• Fake news detection models that learn from newly emerging web articles in real time could enhance detection results. Another promising future work is the use of a transfer-learning approach for training a neural network with online data streams.
• More data covering a larger number of fake news items should be released, since the lack of data is the major problem in fake news classification. We assume that more training data will improve model performance.
• Datasets focused on news content are publicly available. On the other hand, datasets based on different textual features are limited. Thus research utilizing additional textual features is scarce.
• Instead of a simple classifier, using an ensemble method produces better results [49]. Constructing an ensemble model with DL and ML algorithms, in which an LSTM processes the original article while auxiliary features pass through a second model, can yield better results [41]. A simpler GRU model performs better than an LSTM [80]. Therefore, we recommend combining GRU and CNNs to obtain the best results.
• Many researchers have achieved high accuracy by using CNN, LSTM, and ensemble models [42], [64]. SeqGAN and Deep Belief Network (DBN) were not explored in this domain. We encourage researchers to experiment with these models.
• Transformers have replaced RNN models such as LSTM as the model of choice for NLP tasks. BERT has been used in the identification of fake news, but Generative Pre-trained Transformer (GPT) has not been used in this domain. We suggest using GPT by fine-tuning it on fake news detection tasks.
• Existing algorithms make critical decisions without providing precise information about the reasoning that results in specific decisions, predictions, recommendations, or actions [161]. Explainable Artificial Intelligence (XAI) is a study field that tries to make the outcomes of AI systems more understandable to humans [162]. XAI can be a valuable approach to start making progress in this area.
VIII. CONCLUSION
Fake news is escalating as social media is growing. Researchers are also trying their best to find solutions to keep society safe from fake news. This survey covers the overall analysis of fake news classification by discussing major studies. A thorough understanding of recent approaches in fake news detection is essential because advanced frameworks are the front-runners in this domain. Thus, we analyzed fake news identification methods based on NLP and advanced DL strategies. We presented a taxonomy of fake news detection approaches. We explored different NLP techniques and DL architectures and provided their strengths and shortcomings. We have explored diverse assessment measurements. We have given a short description of the experimental findings of previous studies. We also briefly outlined possible directions for future research in this field. Fake news identification will remain an active research field for some time with the emergence of novel deep learning network architectures. There are fewer chances of inaccurate results using deep learning-based models. We strongly believe that this review will assist researchers in fake news detection to gain a better, concise perspective of existing problems, solutions, and future directions.
ACKNOWLEDGMENT
The authors would like to thank the Advanced Machine Learning (AML) Lab for resource sharing and precious opinions.
REFERENCES
[1] H. Allcott and M. Gentzkow, “Social media and fake news in the 2016 election,” J. Econ. Perspect., vol. 31, no. 2, pp. 211–236, 2017.
[2] T. Rasool, W. H. Butt, A. Shaukat, and M. U. Akram, “Multi-label fake news detection using multi-layered supervised learning,” in Proc. 11th Int. Conf. Comput. Autom. Eng., 2019, pp. 73–77.
[3] X. Zhang and A. A. Ghorbani, “An overview of online fake news: Characterization, detection, and discussion,” Inf. Process. Manage., vol. 57, no. 2, Mar. 2020, Art. no. 102025. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0306457318306794
[4] Abdullah-All-Tanvir, E. M. Mahir, S. Akhter, and M. R. Huq, “Detecting fake news using machine learning and deep learning algorithms,” in Proc. 7th Int. Conf. Smart Comput. Commun. (ICSCC), Jun. 2019, pp. 1–5.
[5] K. Shu, A. Sliva, S. Wang, J. Tang, and H. Liu, “Fake news detection on social media: A data mining perspective,” ACM SIGKDD Explorations Newslett., vol. 19, no. 1, pp. 22–36, 2017.
[6] R. Oshikawa, J. Qian, and W. Y. Wang, “A survey on natural language processing for fake news detection,” 2018, arXiv:1811.00770.
[7] S. B. Parikh and P. K. Atrey, “Media-rich fake news detection: A survey,” in Proc. IEEE Conf. Multimedia Inf. Process. Retr. (MIPR), Apr. 2018, pp. 436–441.
[8] A. Habib, M. Z. Asghar, A. Khan, A. Habib, and A. Khan, “False information detection in online content and its role in decision making: A systematic literature review,” Social Netw. Anal. Mining, vol. 9, no. 1, pp. 1–20, Dec. 2019.
[9] M. K. Elhadad, K. F. Li, and F. Gebali, “Fake news detection on social media: A systematic survey,” in Proc. IEEE Pacific Rim Conf. Commun., Comput. Signal Process. (PACRIM), Aug. 2019, pp. 1–8.
[10] A. Bondielli and F. Marcelloni, “A survey on fake news and rumour detection techniques,” Inf. Sci., vol. 497, pp. 38–55, Sep. 2019. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0020025519304372
[11] P. Meel and D. K. Vishwakarma, “Fake news, rumor, information pollution in social media and web: A contemporary survey of state-of-the-arts, challenges and opportunities,” Expert Syst. Appl., vol. 153, Sep. 2020, Art. no. 112986.
[12] K. Sharma, F. Qian, H. Jiang, N. Ruchansky, M. Zhang, and Y. Liu, “Combating fake news: A survey on identification and mitigation techniques,” ACM Trans. Intell. Syst. Technol., vol. 10, no. 3, pp. 1–42, May 2019.
[13] X. Zhou and R. Zafarani, “A survey of fake news: Fundamental theories, detection methods, and opportunities,” ACM Comput. Surv., vol. 53, no. 5, pp. 1–40, 2020.
[14] B. Collins, D. T. Hoang, N. T. Nguyen, and D. Hwang, “Trends in combating fake news on social media—A survey,” J. Inf. Telecommun., vol. 5, no. 2, pp. 247–266, 2021.
[15] A. Zubiaga, A. Aker, K. Bontcheva, M. Liakata, and R. Procter, “Detection and resolution of rumours in social media: A survey,” ACM Comput. Surveys, vol. 51, no. 2, pp. 1–36, Jun. 2018.
[16] M. D. Ibrishimova and K. F. Li, “A machine learning approach to fake news detection using knowledge verification and natural language processing,” in Proc. Int. Conf. Intell. Netw. Collaborative Syst. Cham, Switzerland: Springer, 2019, pp. 223–234.
[17] H. Ahmed, I. Traore, and S. Saad, “Detecting opinion spams and fake news using text classification,” Secur. Privacy, vol. 1, no. 1, p. e9, Jan. 2018.
[18] H. Ahmed, I. Traore, and S. Saad, “Detection of online fake news using N-gram analysis and machine learning techniques,” in Proc. Int. Conf. Intell., Secure, Dependable Syst. Distrib. Cloud Environ. Switzerland: Springer, 2017, pp. 127–138.
[19] B. Bhutani, N. Rastogi, P. Sehgal, and A. Purwar, “Fake news detection using sentiment analysis,” in Proc. 12th Int. Conf. Contemp. Comput. (IC), Aug. 2019, pp. 1–5.
[20] C. Castillo, M. Mendoza, and B. Poblete, “Information credibility on Twitter,” in Proc. 20th Int. Conf. World Wide Web, Mar. 2011, pp. 675–684, doi: 10.1145/1963405.1963500.
[21] O. Ajao, D. Bhowmik, and S. Zargari, ‘‘Sentiment aware fake news [40] R. K. Kaliyar, A. Goswami, P. Narang, and S. Sinha, ‘‘FNDNet— A
detection on online social networks,’’ in Proc. IEEE Int. Conf. Acoust., deep convolutional neural network for fake news detection,’’ Cognit. Syst.
Speech Signal Process. (ICASSP), May 2019, pp. 2507–2511. Res., vol. 61, pp. 32–44, Jun. 2020.[Online]. Available:
[22] B. Ghanem, P. Rosso, and F. Rangel, ‘‘An emotional analysis of false http://www.sciencedirect.com/science/article/pii/S1389041720300085
information in social media and news articles,’’ ACM Trans. Internet [41] S. Deepak and B. Chitturi, ‘‘Deep neural approach to Fake-News
Technol., vol. 20, no. 2, pp. 1–18, May 2020. identification,’’ Proc. Comput. Sci., vol. 167, pp. 2236–2243, Jan. 2020.
[23] A. Giachanou, P. Rosso, and F. Crestani, ‘‘Leveraging emotional [Online]. Available: http://www.sciencedirect.
signals for credibility detection,’’ in Proc. 42nd Int. ACM SIGIR Conf. com/science/article/pii/S1877050920307420
Res. Develop. Inf. Retr., Jul. 2019, pp. 877–880. [42] M. Umer, Z. Imtiaz, S. Ullah, A. Mehmood, G. S. Choi, and B.-W.
[24] D. Khattar, J. S. Goud, M. Gupta, and V. Varma, ‘‘MVAE: On, ‘‘Fake news stance detection using deep learning architecture
Multimodal variational autoencoder for fake news detection,’’ in Proc. (CNNLSTM),’’ IEEE Access, vol. 8, pp. 156695–156706, 2020.
World Wide Web Conf., May 2019, pp. 2915–2921. [43] N. Aslam, I. U. Khan, F. S. Alotaibi, L. A. Aldaej, and A. K.
[25] N. J. Conroy, V. L. Rubin, and Y. Chen, ‘‘Automatic deception Aldubaikil, ‘‘Fake detect: A deep learning ensemble model for fake news
detection: Methods for finding fake news,’’ in Proc. 78th ASIST Annu. detection,’’ Complexity, vol. 2021, pp. 1–8, Apr. 2021.
Meeting, Inf. Sci. Impact, Res. Community, vol. 52, no. 1, pp. 1–4, 2015. [44] Y. Wang, F. Ma, Z. Jin, Y. Yuan, G. Xun, K. Jha, L. Su, and J. Gao,
[26] A. R. Pathak, A. Mahajan, K. Singh, A. Patil, and A. Nair, ‘‘Analysis ‘‘EANN: Event adversarial neural networks for multi-modal fake news
of techniques for rumor detection in social media,’’ Proc. Comput. Sci., detection,’’ in Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discovery Data
vol. 167, pp. 2286–2296, Jan. 2020. Mining, Jul. 2018, pp. 849–857.
[27] J. Ma, W. Gao, P. Mitra, S. Kwon, B. J. Jansen, K.-F. Wong, and M. [45] A. Alsaeedi and M. Al-Sarem, ‘‘Detecting rumors on social media
Cha, ‘‘Detecting rumors from microblogs with recurrent neural based on a CNN deep learning technique,’’ Arabian J. Sci. Eng., vol. 45,
networks,’’ in Proc. 25th Int. Joint Conf. Artif. Intell. (IJCAI). Res. no. 12, pp. 1–32, 2020.
Collection School Comput. Inf. Syst., 2016, pp. 3818–3824. [46] A. Thota, P. Tilak, S. Ahluwalia, and N. Lohia, ‘‘Fake news
[28] J. Ma, W. Gao, and K.-F. Wong, ‘‘Detect rumors in microblog posts detection: A deep learning approach,’’ SMU Data Sci. Rev., vol. 1, no. 3, p.
using propagation structure via kernel learning,’’ in Proc. 55th Annu. 10, 2018.
Meeting Assoc. Comput. Linguistics (ACL). Vancouver, BC, Canada: Res. [47] Z. Jin, J. Cao, H. Guo, Y. Zhang, and J. Luo, ‘‘Multimodal fusion
Collection School Comput. Inf. Syst., Jul./Aug. 2017, pp. 708–717. with recurrent neural networks for rumor detection on microblogs,’’ in
[29] W. Y. Wang, ‘‘‘Liar, liar pants on fire’: A new benchmark dataset for Proc. 25th ACM Int. Conf. Multimedia, Oct. 2017, pp. 795–816.
fake newsdetection,’’inProc.55thAnnu.MeetingAssoc.Comput.Linguistics, [48] R. R. Mandical, N. Mamatha, N. Shivakumar, R. Monica, and A. N.
Vancouver, BC, Canada, Jul. 2017, pp. 422–426. [Online]. Available: Krishna, ‘‘Identification of fake news using machine learning,’’ in Proc.
https://www.aclweb.org/anthology/P17-2067 IEEE Int. Conf. Electron., Comput. Commun. Technol. (CONECCT), Jul.
[30] A. Zubiaga, M. Liakata, and R. Procter, ‘‘Learning reporting 2020, pp. 1–6.
dynamics during breaking news for rumour detection in social media,’’ [49] S. S. Jadhav and S. D. Thepade, ‘‘Fake news identification and
2016, arXiv:1610.07363. classification using DSSM and improved recurrent neural network
[31] K. Shu, D. Mahudeswaran, S. Wang, D. Lee, and H. Liu, classifier,’’ Appl. Artif. Intell., vol. 33, no. 12, pp. 1058–1068, Oct. 2019,
‘‘FakeNewsNet: A data repository with news content, social context, and doi: 10.1080/08839514.2019.1661579.
spatiotemporal information for studying fake news on social media,’’ Big [50] A. S. K. Shu, D. M. K. Shu, L. G. M. Mittal, L. G. M. Mittal, and M.
Data, vol. 8, no. 3, pp. 171–188, Jun. 2020. M. J. K. Sethi, ‘‘Fake news detection using a blend of neural networks:
[32] M. Amjad, G. Sidorov, A. Zhila, H. Gómez-Adorno, I. Voronkov, An application of deep learning,’’ Social Netw. Comput. Sci., vol. 1, no. 3,
and A. Gelbukh, ‘‘‘Bend the truth’: Benchmark dataset for fake news pp. 1–9, Jan. 1970. [Online]. Available: https://link.springer.
detection in Urdu language and its evaluation,’’ J. Intell. Fuzzy Syst., vol. com/article/10.1007/s42979-020-00165-4
39, no. 2, pp. 2457–2469, 2020. [51] A. P. S. Bali, M. Fernandes, S. Choubey, and M. Goel,
[33] E. Tacchini, G. Ballarin, M. L. Della Vedova, S. Moret, and L. de ‘‘Comparative performance of machine learning algorithms for fake
Alfaro, ‘‘Some like it hoax: Automated fake news detection in social news detection,’’ in Proc. Int. Conf. Adv. Comput. Data Sci. Switzerland:
networks,’’ 2017, arXiv:1704.07506. Springer, 2019, pp. 420–430.
[34] C. Boididou, S. Papadopoulos, and M. Zampoglou, ‘‘Detection and [52] A. Rusli, J. C. Young, and N. M. S. Iswari, ‘‘Identifying fake news in
visualization of misleading content,’’ Int. J. Multimedia Inf. Retr., vol. 7, Indonesian via supervised binary text classification,’’ in Proc. IEEE Int.
no. 1, pp. 71–86, 2018. Conf. Ind. 4.0, Artif. Intell., Commun. Technol. (IAICT), Jul. 2020, pp. 86–
90.
[35] J. Golbeck, M. Mauriello, B. Auxier, K. H. Bhanushali, C. Bonk, M.
A. Bouzaghrane,C.Buntain,R.Chanduka,P.Cheakalos,J.B.Everett, and W. [53] V. Tiwari, R. G. Lennon, and T. Dowling, ‘‘Not everything you read
Falak, ‘‘Fake news vs satire: A dataset and analysis,’’ in Proc. 10th ACM is true! Fake news detection using machine learning algorithms,’’ in Proc.
Conf. Web Sci., 2018, pp. 17–21. 31st Irish Signals Syst. Conf. (ISSC), Jun. 2020, pp. 1–4.
[36] P. M. Waszak, W. Kasprzycka-Waszak, and A. Kubanek, ‘‘The [54] A. Verma, V. Mittal, and S. Dawn, ‘‘FIND: Fake information and
spread of medical fake news in social media—The pilot quantitative news detections using deep learning,’’ in Proc. 12th Int. Conf. Contemp.
study,’’ Health Policy Technol., vol. 7, no. 2, pp. 115–118, Jun. 2018. Comput. (IC), Aug. 2019, pp. 1–7.
[37] (2020). The Year of Fake News Covid Related Scams and [55] M. Z. Hossain, M. A. Rahman, M. S. Islam, and S. Kar,
Ransomware. Accessed: Mar. 12, 2021. [Online]. Available: https://www. ‘‘BanFakeNews: A dataset for detecting fake news in Bangla,’’ in Proc.
12th Lang. Resour. Eval. Conf. Marseille, France: European Language
prnewswire.com/news-releases/2020-the-year-of-fake-news-covidrelated-scams-
Resources Association, May 2020, pp. 2862–2871. [Online]. Available:
and-ransomware-301180568
https://www.aclweb.org/anthology/2020.lrec-1.349
[38] K. Shu, D. Mahudeswaran, S. Wang, D. Lee, and H. Liu,
‘‘FakeNewsNet: A data repository with news content, social context and
[56] P. Savyan and S. M. S. Bhanu, ‘‘UbCadet: Detection of
compromised accounts in Twitter based on user behavioural profiling,’’
spatialtemporal information for studying fake news on social media,’’
Multimedia Tools Appl., vol. 79, pp. 1–37, Jul. 2020.
2018, arXiv:1809.01286.
[39] Y.-C. Ahn and C.-S. Jeong, ‘‘Natural language contents evaluation [57] J. Kapusta and J. Obonya, ‘‘Improvement of misleading and fake
news classification for flective languages by morphological group
system for detecting fake news using deep learning,’’ in Proc. 16th Int.
analysis,’’ in Informatics, vol. 7, no. 1. Switzerland: Multidisciplinary
Joint Conf. Comput. Sci. Softw. Eng. (JCSSE), Jul. 2019, pp. 289–292.
Digital Publishing Institute, 2020, p. 4.
[58] S. Hakak, M. Alazab, S. Khan, T. R. Gadekallu, P. K. R. IberoAmer. Conf. Artif. Intell., Nov. 2018, pp. 206–216. [Online].
Maddikunta, and W. Z. Khan, ‘‘An ensemble machine learning approach Available: https://link.springer.com/chapter/10.1007/978-3-030-03928-
through effective feature extraction to classify fake news,’’ Future Gener. 8_17
Comput. Syst., vol. 117, pp. 47–58, Apr. 2021. [Online]. Available: [78] C. K. Hiramath and G. C. Deshpande, ‘‘Fake news detection using
https://www.sciencedirect.com/science/article/pii/S0167739X20330466 deep learning techniques,’’ in Proc. 1st Int. Conf. Adv. Inf. Technol.
[59] M. G. Hussain, M. Rashidul Hasan, M. Rahman, J. Protim, and S. (ICAIT), Jul. 2019, pp. 411–415.
A. Hasan, ‘‘Detection of Bangla fake news using MNB and SVM [79] A. P. B. Veyseh, M. T. Thai, T. H. Nguyen, and D. Dou, ‘‘Rumor
classifier,’’ in Proc. Int. Conf. Comput., Electron. Commun. Eng. detection in social networks via deep contextual modeling,’’ in Proc.
(iCCECE), Aug. 2020, pp. 81–85. IEEE/ACM Int. Conf. Adv. Social Netw. Anal. Mining, Aug. 2019, pp. 113–
[60] G. Gravanis, A. Vakali, K. Diamantaras, and P. Karadais, ‘‘Behind 120.
the cues: A benchmarking study for fake news detection,’’ Expert Syst. [80] M. Bugueño, G. Sepulveda, and M. Mendoza, ‘‘An empirical
Appl., vol. 128, pp. 201–213, Aug. 2019. analysis of rumor detection on microblogs with recurrent neural
[61] P. Bahad, P. Saxena, and R. Kamal, ‘‘Fake news detection using bi- networks,’’ in Proc. Int. Conf. Hum.-Comput. Interact., Jul. 2019, pp. 293–
directional LSTM-recurrent neural network,’’ Proc. Comput. Sci., vol. 310. [Online].
165, pp. 74–82, Jan. 2019. [Online]. Available: Available: https://link.springer.com/chapter/10.1007/978-3-030-219024_21
http://www.sciencedirect.com/science/article/pii/S1877050920300806
[81] E. Providel and M. Mendoza, ‘‘Using deep learning to detect rumors
[62] E. Qawasmeh, M. Tawalbeh, and M. Abdullah, ‘‘Automatic in Twitter,’’ in Proc. Int. Conf. Hum.-Comput. Interact. Switzerland:
identification of fake news using deep learning,’’ in Proc. 6th Int. Conf. Springer, 2020, pp. 321–334.
Social Netw. Anal., Manage. Secur. (SNAMS), Oct. 2019, pp. 383–388.
[82] Q. Le and T. Mikolov, ‘‘Distributed representations of sentences and
[63] A. Agarwal and A. Dixit, ‘‘Fake news detection: An ensemble documents,’’ in Proc. Int. Conf. Mach. Learn., 2014, pp. 1188–1196.
learning approach,’’ in Proc. 4th Int. Conf. Intell. Comput. Control Syst.
(ICICCS), May 2020, pp. 1178–1183. [83] S. Sangamnerkar, R. Srinivasan, M. R. Christhuraj, and R.
Sukumaran, ‘‘An ensemble technique to detect fabricated news article
[64] S. M. Padnekar, G. S. Kumar, and P. Deepak, ‘‘BiLSTM- using machine learning and natural language processing techniques,’’ in
autoencoder architecture for stance prediction,’’ in Proc. Int. Conf. Data Proc. Int. Conf.
Sci. Eng. (ICDSE), Dec. 2020, pp. 1–5.
Emerg. Technol. (INCET), Jun. 2020, pp. 1–7.
[65] M. Granik and V. Mesyura, ‘‘Fake news detection using naive Bayes
classifier,’’ in Proc. IEEE 1st Ukraine Conf. Electr. Comput. Eng. (UKRCON), [84] S. Helmstetter and H. Paulheim, ‘‘Weakly supervised learning for
May 2017, pp. 900–903. fake news detection on Twitter,’’ in Proc. IEEE/ACM Int. Conf. Adv.
Social Netw. Anal. Mining (ASONAM), Aug. 2018, pp. 274–277.
[66] A. Jain and A. Kasbe, ‘‘Fake news detection,’’ in Proc. IEEE Int. Students’
Conf. Electr., Electron. Comput. Sci. (SCEECS), 2018, pp. 1–5. [85] J. Pennington, R. Socher, and C. Manning, ‘‘GloVe: Global vectors
for word representation,’’ in Proc. Conf. Empirical Methods Natural
[67] R. K. Kaliyar, ‘‘Fake news detection using a deep neural network,’’ Lang. Process. (EMNLP), 2014, pp. 1532–1543.
in Proc. 4th Int. Conf. Comput. Commun. Autom. (ICCCA), Dec. 2018, pp.
1–7. [86] S. Kumar, R. Asthana, S. Upadhyay, N. Upreti, and M. Akbar, ‘‘Fake news
detection using deep learning models: A novel approach,’’ Trans. Emerg.
[68] G. Bhatt, A. Sharma, S. Sharma, A. Nagpal, B. Raman, and A. Telecommun. Technol., vol. 31, no. 2, p. e3767, Feb. 2020. [Online].
Mittal, ‘‘Combining neural, statistical and external features for fake Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/ett.3767
news stance identification,’’ in Proc. Companion The Web Conf. Web
Conf. (WWW), 2018, pp. 1353–1357, doi: 10.1145/3184558.3191577. [87] S. Singhania, N. Fernandez, and S. Rao, ‘‘3HAN: A deep neural
network for fake news detection,’’ in Proc. Int. Conf. Neural Inf. Process.
[69] F. A. Ozbay and B. Alatas, ‘‘Fake news detection within online social Switzerland: Springer, 2017, pp. 572–581.
media using supervised artificial intelligence algorithms,’’ Phys. A, Stat.
Mech. Appl., vol. 540, Feb. 2020, Art. no. 123174. [88] J. A. Nasir, O. S. Khan, and I. Varlamis, ‘‘Fake news detection: A
hybrid CNN-RNN based deep learning approach,’’ Int. J. Inf. Manage.
[70] B. Al-Ahmad, A. M. Al-Zoubi, R. A. Khurma, and I. Aljarah, ‘‘An Data Insights, vol. 1, no. 1, Apr. 2021, Art. no. 100007.
evolutionary fake news detection method for COVID-19 pandemic
information,’’ Symmetry, vol. 13, no. 6, p. 1091, Jun. 2021. [89] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, ‘‘BERT: Pre-
training of deep bidirectional transformers for language understanding,’’
[71] S. Shabani and M. Sokhn, ‘‘Hybrid machine-crowd approach for 2018, arXiv:1810.04805.
fake news detection,’’ in Proc. IEEE 4th Int. Conf. Collaboration Internet
Comput. (CIC), Oct. 2018, pp. 299–306. [90] S. Kula, M. Choraś, and R. Kozik, ‘‘Application of the bert-based
architecture in fake news detection,’’ in Proc. Comput. Intell. Secur. Inf.
[72] C. M. M. Kotteti, X. Dong, N. Li, and L. Qian, ‘‘Fake news detection Syst. Conf. Switzerland: Springer, 2019, pp. 239–249.
enhancement with data imputation,’’ in Proc. IEEE 16th Int. Conf.
Dependable, Autonomic Secure Comput., 16th Int. Conf. Pervasive Intell. [91] T. Zhang, D. Wang, H. Chen, Z. Zeng, W. Guo, C. Miao, and L. Cui,
Comput., 4th Int. Conf. Big Data Intell. Comput. Cyber Sci. Technol. Congr. ‘‘BDANN: BERT-based domain adaptation neural network for
(DASC/PiCom/DataCom/CyberSciTech), Aug. 2018, pp. 187–192. multimodal fake news detection,’’ in Proc. Int. Joint Conf. Neural Netw.
(IJCNN), Jul. 2020, pp. 1–8.
[73] X. Zhou, A. Jain, V. V. Phoha, and R. Zafarani, ‘‘Fake news early
detection: A theory-driven model,’’ Digit. Threats, Res. Pract., vol. 1, no. [92] R. K. Kaliyar, A. Goswami, and P. Narang, ‘‘FakeBERT: Fake news
2, pp. 1–25, Jul. 2020. detection in social media with a BERT-based deep learning approach,’’
Multimedia Tools Appl., vol. 80, no. 8, pp. 11765–11788, Mar. 2021.
[74] P. H. A. Faustini and T. F. Covões, ‘‘Fake news detection in multiple
platforms and languages,’’ Expert Syst. Appl., vol. 158, Nov. 2020, Art. no. [93] W. Shishah, ‘‘Fake news detection using BERT model with joint
113503. learning,’’ Arabian J. Sci. Eng., vol. 46, pp. 1–13, Jun. 2021.
[75] H. Jwa, D. Oh, K. Park, J. Kang, and H. Lim, ‘‘ExBAKE: [94] H. Yuan, J. Zheng, Q. Ye, Y. Qian, and Y. Zhang, ‘‘Improving fake
Automatic fake news detection model based on bidirectional encoder news detection with domain-adversarial and graph-attention neural
representations from transformers (BERT),’’ Appl. Sci., vol. 9, no. 19, p. network,’’ Decis. Support Syst., vol. 151, Dec. 2021, Art. no. 113633.
4062, Sep. 2019. [95] A. Giachanou, G. Zhang, and P. Rosso, ‘‘Multimodal multi-image
[76] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, fake news detection,’’ in Proc. IEEE 7th Int. Conf. Data Sci. Adv. Anal.
‘‘Distributed representations of words and phrases and their (DSAA), Oct. 2020, pp. 647–654.
compositionality,’’ in Proc. Adv. Neural Inf. Process. Syst., 2013, pp. 3111– [96] S. Girgis, E. Amer, and M. Gadallah, ‘‘Deep learning algorithms for
3119. detecting fake news in online text,’’ in Proc. 13th Int. Conf. Comput. Eng.
[77] F. C. Fernández-Reyes and S. Shinde, ‘‘Evaluating deep neural Syst. (ICCES), Dec. 2018, pp. 93–97.
networks for automatic fake news detection in political domain,’’ in Proc.
[97] H. Reddy, N. Raj, M. Gala, and A. Basava, ‘‘Text-mining-based fake [116] T. Saikh, A. Anand, A. Ekbal, and P. Bhattacharyya, ‘‘A
news detection using ensemble methods,’’ Int. J. Autom. Comput., vol. 17, novel approach towards fake news detection: Deep learning augmented
pp. 1–12, Apr. 2020. with textual entailment features,’’ in Proc. Int. Conf. Appl. Natural Lang.
[98] K. Shu, S. Wang, and H. Liu, ‘‘Understanding user profiles on social Inf. Syst. Switzerland: Springer, 2019, pp. 345–358.
media for fake news detection,’’ in Proc. IEEE Conf. Multimedia Inf. [117] L. Wu and H. Liu, ‘‘Tracing fake-news footprints:
Process. Retr. (MIPR), Apr. 2018, pp. 430–435. Characterizing social media messages by how they propagate,’’ in Proc.
[99] M. L. Della Vedova, E. Tacchini, S. Moret, G. Ballarin, M. DiPierro, 11th ACM Int. Conf. Web Search Data Mining, Feb. 2018, pp. 637–645.
and L. de Alfaro, ‘‘Automatic online fake news detection combining [118] K. Shu, S. Wang, and H. Liu, ‘‘Beyond news contents:
content and social signals,’’ in Proc. 22nd Conf. Open Innov. Assoc. The role of social context for fake news detection,’’ in Proc. 12th ACM
(FRUCT), May 2018, pp. 272–279. Int. Conf. Web Search Data Mining, Jan. 2019, pp. 312–320.
[100] K. Shu, L. Cui, S. Wang, D. Lee, and H. Liu, [119] F. Monti, F. Frasca, D. Eynard, D. Mannion, and M. M.
‘‘DEFEND: Explainable fake news detection,’’ in Proc. 25th ACM Bronstein, ‘‘Fake news detection on social media using geometric deep
SIGKDD Int. Conf. Knowl. Discovery Data Mining, Jul. 2019, pp. 395–405. learning,’’ 2019, arXiv:1902.06673.
[101] M. Potthast, J. Kiesel, K. Reinartz, J. Bevendorff, and [120] M. Albahar, ‘‘A hybrid model for fake news detection:
B. Stein, ‘‘A stylometric inquiry into hyperpartisan and fake news,’’ 2017, Leveraging news content and user comments in fake news,’’ IET Inf.
arXiv:1702.05638. Secur., vol. 15, no. 2, pp. 169–177, Mar. 2021.
[102] X. Zhang, J. Cao, X. Li, Q. Sheng, L. Zhong, and K. [121] B. Al Asaad and M. Erascu, ‘‘A tool for fake news
Shu, ‘‘Mining dual emotion for fake news detection,’’ 2019, detection,’’ in Proc. 20th Int. Symp. Symbolic Numeric Algorithms Sci.
arXiv:1903.01728. Comput. (SYNASC), Sep. 2018, pp. 379–386.
[103] S. Hosseinimotlagh and E. E. Papalexakis, [122] S. Aphiwongsophon and P. Chongstitvatana, ‘‘Detecting
‘‘Unsupervised contentbased identification of fake news articles with fake news with machine learning method,’’ in Proc. 15th Int. Conf. Electr.
tensor decomposition ensembles,’’ in Proc. Workshop Misinformation Eng., Electron., Comput., Telecommun. Inf. Technol. (ECTI-CON), Jul.
Misbehavior Mining Web (MIS), 2018, pp. 1–8. 2018, pp. 528–531.
[104] R. K. Kaliyar, A. Goswami, and P. Narang, ‘‘DeepFakE: [123] N. Ruchansky, S. Seo, and Y. Liu, ‘‘CSI: A hybrid deep
Improving fake news detection using tensor decomposition-based deep model for fake news detection,’’ in Proc. ACM Conf. Inf. Knowl. Manage.,
neural network,’’ J. Supercomput., vol. 77, no. 2, pp. 1015–1037, Feb. New York, NY, USA, Nov. 2017, pp. 797–806, doi:
2021. 10.1145/3132847.3132877.
[105] R. K. Kaliyar, A. Goswami, and P. Narang, [124] Y. Yang, L. Zheng, J. Zhang, Q. Cui, Z. Li, and P. S. Yu,
‘‘EchoFakeD: Improving fake news detection in social media with an ‘‘TICNN: Convolutional neural networks for fake news detection,’’
efficient deep neural network,’’ Neural Comput. Appl., vol. 33, pp. 1–17, CoRR, vol. abs/1806.00749, pp. 1–11, Jun. 2018.
Jan. 2021. [125] T. O’Shea and J. Hoydis, ‘‘An introduction to deep
[106] M. Dong, L. Yao, X. Wang, B. Benatallah, Q. Z. Sheng, learning for the physical layer,’’ IEEE Trans. Cogn. Commun. Netw., vol.
and H. Huang, ‘‘DUAL: A deep unified attention model with latent 3, no. 4, pp. 563–575, Dec. 2017.
relation representations for fake news detection,’’ in Proc. Int. Conf. Web [126] G. Aceto, D. Ciuonzo, A. Montieri, and A. Pescapé,
Inf. Syst. Eng.
‘‘Mobile encrypted traffic classification using deep learning:
Switzerland: Springer, 2018, pp. 199–209. Experimental evaluation, lessons learned, and challenges,’’ IEEE Trans.
[107] J. Zhang, B. Dong, and P. S. Yu, ‘‘FakeDetector: Netw. Service Manag., vol. 16, no. 2, pp. 445–458, Feb. 2019.
Effective fake news detection with deep diffusive neural network,’’ in [127] P. Yildirim and D. Birant, ‘‘The relative performance of
Proc. IEEE 36th Int. Conf. Data Eng. (ICDE), Apr. 2020, pp. 1826–1829. deep learning and ensemble learning for textile object classification,’’ in
[108] H. Karimi, P. Roy, S. Saba-Sadiya, and J. Tang, ‘‘Multi- Proc. 3rd Int. Conf. Comput. Sci. Eng. (UBMK), Sep. 2018, pp. 22–26.
source multi-class fake news detection,’’ in Proc. 27th Int. Conf. Comput. [128] D. Shen, G. Wu, and H. Suk, ‘‘Deep learning in medical
Linguistics, 2018, pp. 1546–1557. image analysis,’’ Annu. Rev. Biomed. Eng., vol. 19, pp. 221–248, Jun.
[109] D. Mangal and D. K. Sharma, ‘‘Fake news detection 2017.
with integration of embedded text cues and image features,’’ in Proc. 8th [129] M. Veres and M. Moussa, ‘‘Deep learning for intelligent
Int. Conf. Rel., INFOCOM Technol. Optim., Trends Future Directions transportation systems: A survey of emerging trends,’’ IEEE Trans.
(ICRITO), Jun. 2020, pp. 68–72. Intell. Transp. Syst., vol. 21, no. 8, pp. 3152–3168, Aug. 2020.
[110] P. Qi, J. Cao, T. Yang, J. Guo, and J. Li, ‘‘Exploiting [130] U. Kamath, J. Liu, and J. Whitaker, Deep Learning for
multi-domain visual information for fake news detection,’’ in Proc. IEEE NLP and Speech Recognition, vol. 84. Switzerland: Springer, 2019.
Int. Conf. Data Mining (ICDM), Nov. 2019, pp. 518–527.
[131] B. M. Amine, A. Drif, and S. Giordano, ‘‘Merging deep
[111] K. Shu, X. Zhou, S. Wang, R. Zafarani, and H. Liu, learning model for fake news detection,’’ in Proc. Int. Conf. Adv. Electr.
‘‘The role of user profiles for fake news detection,’’ in Proc. IEEE/ACM Eng. (ICAEE), Nov. 2019, pp. 1–4.
Int. Conf. Adv. Social Netw. Anal. Mining, Aug. 2019, pp. 436–439.
[132] Q. Li, Q. Hu, Y. Lu, Y. Yang, and J. Cheng, ‘‘Multi-level
[112] H. Guo, J. Cao, Y. Zhang, J. Guo, and J. Li, ‘‘Rumor word features based on CNN for fake news detection in cultural
detection with hierarchical social attention network,’’ in Proc. 27th ACM communication,’’ Pers. Ubiquitous Comput., vol. 24, no. 2, pp. 1–14, 2019.
Int. Conf. Inf. Knowl. Manage., Oct. 2018, pp. 943–951.
[133] O. Ajao, D. Bhowmik, and S. Zargari, ‘‘Fake news
[113] J. C. S. Reis, A. Correia, F. Murai, A. Veloso, and F. identification on Twitter with hybrid CNN and RNN models,’’ in Proc.
Benevenuto, ‘‘Explainable machine learning for fake news detection,’’ in 9th Int. Conf. Social Media Soc., New York, NY, USA, Jul. 2018, pp. 226–
Proc. 10th ACM Conf. Web Sci. (WebSci), New York, NY, USA, 2019, pp. 230, doi: 10.1145/3217804.3217917.
17–26, doi: 10.1145/3292522.3326027.
[134] L. Li, G. Cai, and N. Chen, ‘‘A rumor events detection
[114] J. Kim, B. Tabibian, A. Oh, B. Schölkopf, and M. method based on deep bidirectional GRU neural network,’’ in Proc.
Gomez-Rodriguez, ‘‘Leveraging the crowd to detect and reduce the IEEE 3rd Int. Conf. Image, Vis. Comput., Jun. 2018, pp. 755–759.
spread of fake news and misinformation,’’ in Proc. 11th ACM Int. Conf.
[135] M. Z. Asghar, A. Habib, A. Habib, A. Khan, R. Ali, and
Web Search Data Mining, Feb. 2018, pp. 324–332.
A. Khattak, ‘‘Exploring deep neural networks for rumor detection,’’ J.
[115] K. Popat, S. Mukherjee, A. Yates, and G. Weikum, Ambient Intell. Humanized Comput., vol. 12, no. 4, pp. 1–19, 2019.
‘‘DeClarE: Debunking fake news and false claims using evidence-aware
deep learning,’’ 2018, arXiv:1809.06416.
[136] S. R. Sahoo and B. B. Gupta, ‘‘Multiple features based [153] Y.-J. Lu and C.-T. Li, ‘‘GCAN: Graph-aware co-
approach for automatic fake news detection on social networks using attention networks for explainable fake news detection on social media,’’
deep learning,’’ Appl. Soft Comput., vol. 100, Mar. 2021, Art. no. 106983. 2020, arXiv:2004.11648.
[137] Q. Liao, H. Chai, H. Han, X. Zhang, X. Wang, W. Xia, [154] J. Ding, Y. Hu, and H. Chang, ‘‘BERT-based mental
and model, a better fake news detector,’’ in Proc. 6th Int. Conf. Comput. Artif.
Y. Ding, ‘‘An integrated multi-task model for fake news detection,’’ IEEE Trans. Intell., New York, NY, USA, Apr. 2020, pp. 396–400, doi:
Knowl. Data Eng., early access, Jan. 28, 2021, doi: 10.1109/TKDE.2021.3054993. 10.1145/3404555.3404607.
[138] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and [155] L. Wu, Y. Rao, H. Yu, Y. Wang, and A. Nazir, ‘‘False
G. Monfardini, ‘‘The graph neural network model,’’ IEEE Trans. Neural information detection on social media via a hybrid deep model,’’ in Proc.
Netw., vol. 20, no. 1, pp. 61–80, Jan. 2008. Int. Conf. Social Inform., Sep. 2018, pp. 323–333, doi: 10.1007/978-3-030-
011598_31.
[139] T. Bian, X. Xiao, T. Xu, P. Zhao, W. Huang, Y. Rong,
and A. Huang, ‘‘Rumor detection on social media with bi-directional [156] A. Choudhary and A. Arora, ‘‘Linguistic feature based
graph convolutional networks,’’ in Proc. AAAI Conf. Artif. Intell., 2020, learning model for fake news detection and classification,’’ Expert Syst.
vol. 34, no. 1, pp. 549–556. Appl., vol. 169, May 2021, Art. no. 114171.
[140] Q. Huang, C. Zhou, J. Wu, M. Wang, and B. Wang, [157] D. K. Vishwakarma, D. Varshney, and A. Yadav,
‘‘Deep structure learning for rumor detection on Twitter,’’ in Proc. Int. ‘‘Detection and veracity analysis of fake news via scrapping and
Joint Conf. Neural Netw. (IJCNN), Jul. 2019, pp. 1–8. authenticating the web search,’’ Cognit. Syst. Res., vol. 58, pp. 217–229,
Dec. 2019.
[141] Y. Rong, W. Huang, T. Xu, and J. Huang, ‘‘DropEdge:
Towards deep graph convolutional networks on node classification,’’ [158] Z. Jin, J. Cao, Y. Zhang, and J. Luo, ‘‘News verification
2019, arXiv:1907.10903. by exploiting conflicting social viewpoints in microblogs,’’ in Proc. 13th
AAAI Conf. Artif. Intell. (AAAI), 2016, pp. 2972–2978.
[142] Y. Ren, B. Wang, J. Zhang, and Y. Chang, ‘‘Adversarial
active learning based heterogeneous graph neural network for fake news [159] X. Zhou and R. Zafarani, ‘‘Fake news detection: An
detection,’’ in Proc. IEEE Int. Conf. Data Mining (ICDM), Nov. 2020, pp. interdisciplinary research,’’ in Proc. Companion World Wide Web Conf.,
452–461. May 2019, p. 1292.
[143] Z. Wu, D. Pi, J. Chen, M. Xie, and J. Cao, ‘‘Rumor [160] R. Kumari and A. Ekbal, ‘‘AMFB: Attention based
detection based on propagation graph neural network with attention multimodal factorized bilinear pooling for multimodal fake news
mechanism,’’ Expert Syst. Appl., vol. 158, Nov. 2020, Art. no. 113595. detection,’’ Expert Syst. Appl., vol. 184, Dec. 2021, Art. no. 115412.
[Online]. Available: [161] A. Nascita, A. Montieri, G. Aceto, D. Ciuonzo, V.
http://www.sciencedirect.com/science/article/pii/S095741742030419X Persico, and A. Pescape, ‘‘XAI meets mobile traffic classification:
Understanding and improving multimodal deep learning architectures,’’
[144] L. Zhang, J. Li, B. Zhou, and Y. Jia, ‘‘Rumor detection IEEE Trans. Netw. Service Manage., early access, Jul. 19, 2021, doi:
based on SAGNN: Simplified aggregation graph neural networks,’’ 10.1109/TNSM.2021.3098157.
Mach. Learn. Knowl. Extraction, vol. 3, no. 1, pp. 84–94, Jan. 2021.
[Online]. Available: https://www.mdpi.com/2504-4990/3/1/5 [162] A. Adadi and M. Berrada, ‘‘Peeking inside the black-
box: A survey on explainable artificial intelligence (XAI),’’ IEEE Access,
[145] S. Hiriyannaiah, A. Srinivas, G. K. Shetty, G. Siddesh, vol. 6, pp. 52138–52160, 2018.
and K. Srinivasa, ‘‘A computationally intelligent agent for detecting fake
news using generative adversarial networks,’’ in Hybrid Computational M. F. MRIDHA (Senior Member, IEEE) received the
Intelligence: Challenges and Applications. Amsterdam, The Netherlands: Ph.D. degree in AI/ML from Jahangirnagar
Elsevier, 2020, p. 69. University, in 2017. He joined as a Lecturer at the
Department of Computer Science and Engineering,
[146] J. Wang, L. Yu, W. Zhang, Y. Gong, Y. Xu, B. Wang, P. Stamford University Bangladesh, in June 2007. He
Zhang, and D. Zhang, ‘‘IRGAN: A minimax game for unifying generative was promoted as a Senior Lecturer at the
and discriminative information retrieval models,’’ in Proc. 40th Int. ACM Department of Computer Science and Engineering,
SIGIR Conf. Res. Develop. Inf. Retr., Aug. 2017, pp. 515–524. in October 2010, and promoted as an
[147] Y. Li and J. Ye, ‘‘Learning adversarial networks for AssistantProfessorattheDepartmentofComputer
semi-supervised text classification via policy gradient,’’ in Proc. 24th Science and Engineering, in October 2011. Then,
ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Jul. 2018, pp.
he joined as an Assistant Professor at UAP, in May 2012. He worked as a CSE
1715–1723.
Department Faculty Member at the University of Asia Pacific and a Graduate
[148] B. Hu, Y. Fang, and C. Shi, ‘‘Adversarial learning on Coordinator, from 2012 to 2019. He is currently working as an Associate
heterogeneous information networks,’’ in Proc. 25th ACM SIGKDD Int. Professor with the Department of Computer Science and Engineering,
Conf. Knowl. Discovery Data Mining, Jul. 2019, pp. 120–129. Bangladesh University of Business and Technology. His research experience,
[149] T. Le, S. Wang, and D. Lee, ‘‘MALCOM: Generating within both academia and industry, results in over 80 journals and conference
malicious comments to attack neural fake news detection models,’’ in publications. For more than ten years, he has been with the masters and
Proc. IEEE Int. Conf. Data Mining (ICDM), Nov. 2020, pp. 282–291. undergraduate students as a supervisor of their thesis work. His research
interests include artificial intelligence (AI), machine learning, deep learning,
[150] Y. Long, Q. Lu, R. Xiang, M. Li, and C.-R. Huang, natural language processing (NLP), and big data analysis. He has served as a
‘‘Fake news detection through multi-perspective speaker profiles,’’ in program committee member for several international conferences/workshops.
Proc. 8th Int. Joint Conf. Natural Lang. Process., vol. 2. Taipei, Taiwan: He served as an associate editor for several journals.
Asian Fed. Natural Lang. Process., Nov. 2017, pp. 252–256. [Online].
Available: https://aclanthology.org/I17-2043/ ASHFIA JANNAT KEYA was born in Dhaka,
Bangladesh. She received the B.Sc. degree in
[151] T. Chen, X. Li, H. Yin, and J. Zhang, ‘‘Call attention to
computer science and engineering from the
rumors: Deep attention based recurrent neural networks for early rumor
Bangladesh University of Business and Technology
detection,’’ in Proc. Pacific–Asia Conf. Knowl. Discovery Data Mining.
(BUBT), in 2021. She is currently working as a
Switzerland:
Research Assistant with the Department of CSE,
Springer, 2018, pp. 40–52. BUBT. She also works as a Researcher with
[152] N. Aloshban, ‘‘ACT: Automatic fake news classification the Advanced Machine Learning Lab. Her research
through self-attention,’’ in Proc. 12th ACM Conf. Web Sci., Jul. 2020, pp. interests include deep learning, natural language
115–124. processing (NLP), and computer vision. She has
experience working in C++, Python, Keras, TensorFlow, Sklearn, NumPy,
Pandas, and Matplotlib.
MD. ABDUL HAMID was born in Sonatola,
Pabna, Bangladesh. He received the Bachelor of
Engineering degree in computer and information
engineering from the International Islamic
University Malaysia (IIUM), in 2001, and the
combined master’s and Ph.D. degree from the
Computer Engineering Department, Kyung Hee
University, South Korea, in August 2009, majoring in
information communication. His education spans
different countries in the world. From 1989
to 1995, he completed his high school and college education at the Rajshahi Cadet College,
Bangladesh. He has been in the teaching profession throughout his life, which
also spans over different parts of the globe. From 2002 to 2004, he was a
Lecturer with the Computer Science and Engineering Department, Asian
University of Bangladesh, Dhaka, Bangladesh. From 2009 to 2012, he was an
Assistant Professor with the Department of Information and Communications
Engineering, Hankuk University of Foreign Studies (HUFS), South Korea. From
2012 to 2013, he was an Assistant Professor with the Department of Computer
Science and Engineering, Green University of Bangladesh. From 2013 to 2016,
he was an Assistant Professor with the Department of Computer Engineering,
Taibah University, Madinah, Saudi Arabia. From 2016 to 2017, he was an
Associate Professor with the Department of Computer Science, Faculty of
Science and Information Technology, American International University-
Bangladesh, Dhaka. From 2017 to 2019, he was an Associate Professor and a
Professor with the Department of Computer Science and Engineering,
University of Asia Pacific, Dhaka. Since 2019, he has been a Professor with the
Department of Information Technology, King Abdulaziz University, Jeddah,
Saudi Arabia. His research interests include network/cyber-security, natural
language processing, machine learning, wireless communications, and
networking protocols.
MUHAMMAD MOSTAFA MONOWAR received the
B.Sc. degree in computer science and information
technology from the Islamic University of Technology
(IUT), Bangladesh, in 2003, and the Ph.D. degree in
computer engineering from Kyung Hee University,
South Korea, in 2011. He worked as a Faculty
Member at the Department of Computer Science and
Engineering, University of Chittagong, Bangladesh.
He is currently working as an Associate Professor at
the Department
of Information Technology, King Abdulaziz
University, Saudi Arabia. His research interests include wireless networks,
mostly ad-hoc, sensor, and
mesh networks, including routing protocols, MAC mechanisms, IP and transport
layer issues, cross-layer design, and QoS provisioning, security and privacy
issues, and natural language processing. He has served as a program committee
member for several international conferences/workshops. He served as an editor
for a couple of books published by CRC Press and Taylor & Francis Group. He
also served as a guest editor for several journals.
MD. SAIFUR RAHMAN is currently working as an
Assistant Professor at the Department of Computer
Science and Engineering, Bangladesh University of
Business and Technology. He has expertise in
software development and has developed numerous
management systems. He has been a successful
Director of the International Collegiate
Programming Contest (ICPC), Dhaka Regional
Contest, in 2014. Apart from the collaboration and
development domain, his skills
cover theoretical background in computer
engineering sectors. His research interests include system design and artificial
intelligence-based systems.
He received coach awards in ICPC Dhaka Regional Contests.
RESEARCH PAPER -3
FAKE NEWS DETECTION USING MACHINE
LEARNING
Associate Prof. DR. Mahendra Sharma*1, Assistant Prof. Mrs. Laveena Sehgal*2, Blaghul Rizwan*3, Md
Zamin Zafar*4
*1 Associate Professor, Department of Information
Technology, IIMT College Of Engineering, Greater Noida, Uttar Pradesh India.
*2 Assistant Professor, Department of Information
Technology, IIMT College Of Engineering, Greater Noida, Uttar Pradesh India.
*3 Student, Department of Information
Technology, IIMT College Of Engineering, Greater Noida, Uttar Pradesh India.
*4 Student, Department of Information
Technology, IIMT College Of Engineering, Greater Noida, Uttar Pradesh India.
ABSTRACT
Fake news on social media and other media is spreading widely and is a matter of serious concern because of its ability
to cause social and national damage with destructive impacts. A lot of research is already focused on detecting it.
This paper analyzes the research related to fake news detection and explores traditional machine learning
models to choose the best one, in order to create a product model with a supervised machine learning algorithm that can
classify news as true or fake, using tools such as Python's scikit-learn and NLP for textual analysis. This process involves
feature extraction and vectorization; we propose using the Python scikit-learn library to perform tokenization and feature
extraction of text data, because this library contains useful tools such as CountVectorizer and TfidfVectorizer. Then, we will
apply feature selection methods to experiment and choose the best-fit features to obtain the highest precision, according to
the confusion matrix results.
INTRODUCTION
Fake news contains misleading information that can be checked against facts. It may sustain lies about a certain startup in a country or
exaggerate the cost of certain services in a country, which may cause unrest in some countries, as in the Arab Spring. There are
organizations, such as the House of Commons and the CrossCheck project, that try to deal with the issue by confirming that authors are
accountable. However, their scope is limited because they depend on manual detection by humans; in a world where millions of
articles are either removed or published every minute, manual checking is not feasible. A solution could
be the development of a system that provides a credible automated index score, or rating, for the credibility of different
publishers and news contexts.
This paper proposes a methodology to create a model that will detect whether an article is authentic or fake based on its words,
phrases, sources and titles, by applying supervised machine learning algorithms to an annotated (labeled) dataset that has been
manually classified and verified. Then, feature selection methods are applied to experiment and choose the best-fit
features to obtain the highest precision, according to the confusion matrix results. We propose to create the model using different
classification algorithms. The product model will be tested on unseen data, the results will be plotted, and accordingly, the
product will be a model that detects and classifies fake articles and can be integrated with any system for future use.
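As an illustration of the evaluation step described above, the following minimal sketch computes a binary confusion matrix and the precision derived from it. The function names, the label strings "fake" and "real", and the toy predictions are assumptions made for this example only; they are not taken from the paper's experiments.

```python
def confusion_counts(y_true, y_pred, positive="fake"):
    """Return (tp, fp, fn, tn), treating `positive` as the class of interest."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # fake article correctly flagged
        elif pred == positive:
            fp += 1          # real article wrongly flagged as fake
        elif truth == positive:
            fn += 1          # fake article that was missed
        else:
            tn += 1          # real article correctly passed
    return tp, fp, fn, tn


def precision(tp, fp):
    """Precision = TP / (TP + FP); defined as 0.0 when nothing was flagged."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical labels and predictions, for illustration only.
y_true = ["fake", "fake", "real", "real", "fake"]
y_pred = ["fake", "real", "real", "fake", "fake"]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn, precision(tp, fp))
```

Comparing this precision value across candidate feature sets is one way to pick the best-fit features the text refers to.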
RELATED WORK
1. Social Media and Fake News
Social media includes websites and programs that are devoted to forums, social websites, microblogging, social
bookmarking and wikis. On the other hand, some researchers consider fake news to be the result of accidental issues such as
educational shock or unwitting actions, like what happened in the Nepal earthquake case. In 2020, there was widespread fake
news concerning health that put global health at risk. In early February 2020, the WHO released a warning that
the COVID-19 outbreak had caused a massive ‘infodemic’, a surge of real and fake news that includes a lot of
misinformation.
3. Data mining
Data mining techniques are categorized into two main methods: supervised and unsupervised. The supervised
method uses training data to predict hidden activities. Unsupervised data mining attempts to
recognize hidden patterns in the data without training data, that is, without pairs of inputs and category labels.
Examples of unsupervised data mining include cluster mining and association rules.
Tree-based learning algorithms are widely used for predictive models built with supervised learning methods to achieve high
accuracy. They are good at mapping non-linear relationships. They solve classification and regression problems quite well
and are also referred to as CART (classification and regression trees).
6. Random Forest
Random forests are built on the concept of training many decision trees, each of which produces a
separate result. The results predicted by this large number of decision trees are then aggregated by the random forest. To
ensure variation among the decision trees, the random forest randomly selects a subset of features for each tree.
Random Forest works best when its decision trees are uncorrelated. If applied to similar trees, the
overall result will be more or less the same as that of a single decision tree. Uncorrelated decision trees can be obtained by
bootstrapping and feature randomness.
Random Forest Pseudo-code
To make n classifiers:
For i = 1 to n do
    Sample the training data T randomly with replacement to produce Ti
    Build a root node containing Ti,
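The two sources of randomness named above (bootstrapping and feature randomness) and the final vote can be sketched in plain Python. This is a minimal illustration rather than the paper's implementation: the helper names, the toy rows, and the choice of majority voting as the aggregation rule are assumptions, and the tree-building step itself is left out.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows at random with replacement (the Ti of the pseudo-code)."""
    return [rng.choice(rows) for _ in rows]

def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices for one tree (feature randomness)."""
    return sorted(rng.sample(range(n_features), k))

def majority_vote(predictions):
    """Combine the per-tree class predictions into one answer (assumed aggregation rule)."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)
# Toy (document id, label) pairs; hypothetical data for illustration only.
training_rows = [("doc%d" % i, i % 2) for i in range(10)]

sample = bootstrap_sample(training_rows, rng)            # one Ti
features = random_feature_subset(n_features=8, k=3, rng=rng)
forest_vote = majority_vote([1, 0, 1, 1, 0])             # votes from five hypothetical trees
```

Because each tree sees a different bootstrap sample and a different feature subset, the trees tend to make different errors, which is what makes the vote more reliable than any single tree.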
SVM Pseudo-Code
F[0...N-1]: a feature set with N features, sorted by information gain in decreasing order
accuracy(i): accuracy of the prediction model based on SVM with the F[0...i] feature set
low = 0
high = N-1
value = accuracy(N-1)
IG_RFE_SVM(F[0...N-1], value, low, high) {
    If (high <= low)
        Return F[0...N-1] and value
    mid = (low + high) / 2
    value_2 = accuracy(mid)
    If (value_2 >= value)
        Return IG_RFE_SVM(F[0...mid], value_2, low, mid)
    Else (value_2 < value)
        Return IG_RFE_SVM(F[0...high], value, mid, high)
}
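The pseudo-code above is a binary search over how many of the top-ranked features to keep. A minimal iterative sketch follows. The accuracy function is supplied by the caller (here a hypothetical toy curve, not an SVM), and the sketch advances `low` to `mid + 1` in the else branch, which is an assumption made so that the search always terminates.

```python
def select_prefix(n_features, accuracy):
    """Binary search for a shorter feature prefix F[0..i] that does not lose accuracy.

    accuracy(i) must return the model accuracy when trained on F[0..i].
    """
    low, high = 0, n_features - 1
    best_index, best_value = high, accuracy(high)
    while low < high:
        mid = (low + high) // 2
        value_2 = accuracy(mid)
        if value_2 >= best_value:
            # The shorter prefix is at least as good: keep it and search lower.
            best_index, best_value, high = mid, value_2, mid
        else:
            # Assumption: step past mid so the interval always shrinks.
            low = mid + 1
    return best_index, best_value

def toy_accuracy(i):
    """Hypothetical curve: accuracy rises up to 4 features (index 3), then stays flat."""
    return min(i + 1, 4) / 5

index, value = select_prefix(10, toy_accuracy)
```

With the toy curve, the search settles on index 3, the shortest prefix that reaches the plateau.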
8. Naive Bayes
This algorithm is based on Bayes' theorem under the assumption of independence among predictors, and it is used in many machine
learning problems. Simply put, Naïve Bayes assumes that one feature of a class has nothing to do with any other. For
example, a fruit will be classified as an apple when it is red, round, and about 3 inches in diameter.
Even if these features depend on each other or on other features,
Naïve Bayes assumes that each of them contributes independently to the evidence that the fruit is an apple.
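Naïve Bayes Equation
$$P(c \mid x) = \frac{P(x \mid c)\,P(c)}{P(x)}$$
$$P(c \mid X) \propto P(x_1 \mid c) \times P(x_2 \mid c) \times \cdots \times P(x_n \mid c) \times P(c)$$
Where: P(c | x) is the posterior probability of class c given predictor x, P(c) is the prior probability of the class,
P(x | c) is the likelihood of the predictor given the class, and P(x) is the prior probability of the predictor.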
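As a worked illustration of the product rule above, the following sketch trains a tiny word-count Naive Bayes classifier in plain Python. The toy headlines are invented, the function names are hypothetical, and the add-one (Laplace) smoothing is an assumption that the text does not mention; logarithms are used so that the product of small probabilities becomes a sum.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (tokens, label). Returns class priors, per-class word counts, vocabulary."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for tokens, _ in docs for w in tokens}
    priors = {c: n / len(docs) for c, n in class_counts.items()}
    return priors, word_counts, vocab

def predict_nb(tokens, priors, word_counts, vocab):
    """Return the class c maximizing log P(c) + sum_i log P(x_i | c)."""
    scores = {}
    for c, prior in priors.items():
        total = sum(word_counts[c].values())
        score = math.log(prior)
        for w in tokens:
            # Add-one smoothing (an assumption) avoids log(0) for unseen words.
            score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)

# Invented toy headlines, for illustration only.
toy_docs = [
    ("shocking secret cure revealed".split(), "fake"),
    ("miracle cure shocking doctors".split(), "fake"),
    ("parliament passes budget bill".split(), "real"),
    ("central bank holds interest rate".split(), "real"),
]
model = train_nb(toy_docs)
label = predict_nb("shocking miracle cure".split(), *model)
```

Each word contributes its own factor P(x_i | c) regardless of the other words, which is exactly the independence assumption described in the text.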
[Figure: processing pipeline with the stages Database, Pre-processing, and Feature selection.]
The main goal is to apply a set of classification algorithms to obtain a classification model that can be used as a scanner
for fake news, and to embed the model in a Python application that is used to discover
fake news data. Appropriate refactorings have also been performed on the Python code to produce optimized code.
The classification algorithms applied in this model are K-Nearest Neighbors, Linear Regression, XGBoost, Naïve Bayes, Decision
Tree, Random Forest, and Support Vector Machine. Each of these algorithms is made as accurate as possible; their results are then
combined by averaging and compared.
As shown in the figure, the data set is applied to different algorithms in order to detect fake news. The accuracy of the
results obtained is analyzed to reach the final result.
[Figure: the data set is passed to Linear Regression, Random Forest, XGBoost, Naive Bayes, K-Nearest Neighbors (K-NN), and
Decision Tree; the outputs are averaged (AVG = sum of accuracy / N), and the class is set to 1 if class >= 1, else 0.]
In the process of model creation, the approach to detecting political fake news is as follows. The first step is collecting a political
news data set (the Liar data set is adopted for the model). Preprocessing is then performed through rough noise removal. The next step is to
apply NLTK (Natural Language Toolkit) to perform POS tagging, after which features are selected. The data set is then split and
the ML algorithms are applied. Figure 2 shows that after NLTK is applied, the data set is successfully preprocessed in the
system, and a message is generated for applying the algorithms to the training portion. The system responds once Naïve Bayes and Random
Forest have been applied, and the model is created with a response message. Testing is performed on the test data set and the results
are verified; the next step is to monitor the precision for acceptance. The model is then applied to unseen data selected by the
user. The full data set is built with half of the data being fake and half being real articles, making the model's baseline accuracy 50%.
A random selection of 80% of the data from the fake and real sets is used as the training set, leaving the
remaining 20% as a testing set for when the model is complete. Text data requires preprocessing before a
classifier can be applied to it, so we clean the noise and use Stanford NLP (natural language processing) for POS (part of speech)
processing and tokenization of words; we must then encode the resulting data as integers and floating point values so that it is
accepted as input by the ML algorithms. This process results in feature extraction and vectorization; the research uses the
Python scikit-learn library to perform tokenization and feature extraction on text data, because the library contains useful
tools such as CountVectorizer and TfidfVectorizer. Data is presented graphically with a confusion matrix.
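The 80/20 split and the encoding of text as integer counts can be sketched without any external library. The code below is a plain-Python stand-in for what scikit-learn's CountVectorizer does, not that library's API; the function names and the toy article lists are hypothetical.

```python
import random
from collections import Counter

def split_80_20(items, seed=0):
    """Shuffle a copy of the items and return (80% training part, 20% testing part)."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def count_vectorize(texts):
    """Encode each text as a row of word counts over a shared, sorted vocabulary."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    rows = []
    for t in texts:
        counts = Counter(t.lower().split())
        rows.append([counts.get(w, 0) for w in vocab])
    return vocab, rows

# Hypothetical balanced data set: half fake (label 1), half real (label 0).
fake = [("fake article %d" % i, 1) for i in range(10)]
real = [("real article %d" % i, 0) for i in range(10)]

fake_train, fake_test = split_80_20(fake)
real_train, real_test = split_80_20(real)
train, test = fake_train + real_train, fake_test + real_test

vocab, rows = count_vectorize(["Fake news news", "real news"])
```

Splitting each class separately keeps the 50/50 balance described above in both the training and the testing set.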
[Figure: class diagram with the classes Data set, Model, User, and Preprocess, linked by aggregation, association, dependency,
and composition relationships. Operations shown include preprocess(), create(), transform(), Train(), Test(), Verify,
views_result(), update(), predict(), apply noise removal(), apply POS(), apply CountVectorizer, and Apply TfidfVectorizer().]
RESULT
The scope of this project covers the political news data of a dataset known as the Liar dataset, a new
benchmark dataset for fake news detection whose items are labeled as fake or trustworthy news. We have performed our analysis on the “Liar” dataset.
The results of the analysis of the dataset using the six algorithms are depicted using confusion matrices. The
algorithms used for detection are:
• XGBoost.
• Random Forest.
• Naïve Bayes.
• K-Nearest Neighbors (KNN).
• SVM.
The confusion matrix is obtained automatically by Python code using the scikit-learn library when the
algorithm code is run on the Anaconda platform.
CONCLUSION
The research in this paper focuses on detecting fake news by reviewing it in two stages: characterization and disclosure.
In the first stage, the basic concepts and principles of fake news in social media are highlighted. In the discovery stage,
the current methods for detecting fake news using different supervised learning algorithms are reviewed.
The fake news detection approaches presented in this paper, which are based on text analysis, use models built on
speech characteristics and predictive models that do not fit the other current models.
From the aforementioned research summary and system analysis, we concluded that most of the research papers used Naïve Bayes
algorithms, with a prediction precision between 70% and 76%; they mostly use qualitative analysis based on sentiment
analysis, titles, and word frequency repetitions. In our approach we propose to add another aspect to these methodologies,
namely POS textual analysis. It is a quantitative approach that depends on adding numeric statistical values as features; we expect
that increasing these features and using random forest will further improve the precision results. The features we
propose to add to our dataset are total words (tokens), total unique words (types), Type/Token Ratio (TTR), number of
sentences, average sentence length (ASL), number of characters, average word length (AWL), nouns, prepositions, adjectives,
etc.
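Most of the proposed quantitative features can be computed directly from the raw text. The sketch below is a minimal illustration with a hypothetical function name and simple regular-expression tokenization; the POS-based counts (nouns, prepositions, adjectives) are omitted because they need a part-of-speech tagger.

```python
import re

def stylometric_features(text):
    """Compute simple counting features of a text (tokenization rules are an assumption)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    types = set(tokens)
    return {
        "total_words": len(tokens),                                    # tokens
        "unique_words": len(types),                                    # types
        "ttr": len(types) / len(tokens) if tokens else 0.0,            # Type/Token Ratio
        "sentences": len(sentences),
        "asl": len(tokens) / len(sentences) if sentences else 0.0,     # average sentence length
        "characters": len(text),
        "awl": sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0,  # average word length
    }

features = stylometric_features("The cat sat. The dog ran!")
```

Each value is a plain number, so the resulting dictionary can be appended to the word-frequency features as extra columns.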
RESEARCH PAPER -4
Journal on Interactive Systems, 2023, 14:1, doi: 10.5753/jis.2023.3020
This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
Fake news (i.e., false news created to have a high capacity for dissemination and malicious intentions) is
a problem of great interest to society today since it has achieved unprecedented political, economic, and
social impacts. Taking advantage of modern digital communication and information technologies, fake news is
widely propagated through social media, where its use is intentional and challenging to identify. In order to
mitigate the damage caused by fake news, researchers have been seeking the development of automated
mechanisms to detect them, such as algorithms based on machine learning as well as the datasets employed in
this development. This research aims to analyze the machine learning algorithms and datasets used in training
to identify fake news published in the literature. It is exploratory research with a qualitative approach, which
uses a research protocol to identify studies with the intention of analyzing them. As a result, we have the
algorithms Stacking Method, Bidirectional Recurrent Neural Network (BiRNN), and Convolutional Neural
Network (CNN), with 99.9%, 99.8%, and 99.8% accuracy, respectively. Although these accuracies are high,
most of the research employed datasets in controlled environments (e.g., Kaggle) or without information
updated in real-time (from social networks). Still, only a few studies have been applied in social network
environments, where the most significant dissemination of disinformation occurs nowadays. Kaggle was the
platform identified with the most frequently used datasets, being succeeded by Weibo, FNC-1, COVID-19
Fake News, and Twitter. Future research should extend beyond news about politics,
the area that was the primary motivator for the growth of research from 2017, and should explore the use of hybrid methods
for identifying fake news.
Article) Algorithm – Authors: Accuracy
18) CNN+RNN: 97.2%
19) DistilBERT: 97.2%
20) IARNet: 96.9%
21) Stance Extraction and Reasoning Network (SERN) – Xie et al. (2021): 96.6%
22) Grey Wolf Optimization (GWO) – Ozbay and Alatas (2019): 96.5%
23) SMHA-CNN – Fang et al. (2019): 95.5%
24) Kaliyar et al. (2020b)
25) Faustini and Covões (2020)
26) Varshney and Vishwakarma (2020)
30) SemSeq4FD: 92.6%
31) EchoFakeD: 92.3%
33) BiLSTM – Sharma, Garg and Shrivastava (2021): 91.5%
34) SVM-RNN-GRUs (bidirectional) – Albahar (2021): 91.2%
35) BiLSTM-RNN – Bahad, Saxena and Kamal (2020): 91.1%
36) GRU-LSTM-CNN – Torgheh et al. (2021): 90.8%
37) Random Forest – Lakshmanarao, Swathi and Kiran (2019): 90.7%
38) MCNN-TFV – Li et al. (2020): 90.1%
39) Bi-LSTM-GRU-dense – Aslam et al. (2021): 89.8%
40) CNN + BiLSTM – Kumar et al. (2020): 88.8%
41) BERT – Lin et al. (2020): 88.7%
42) DeepFakE – Kaliyar et al. (2020b): 88.6%
43) Knowledge-driven Multimodal Graph Convolutional Networks (KMGCN) – Wang et al. (2020): 88.6%
44) Bayesian inference algorithm – Najar et al. (2019): 87.9%
45) AERNN – Chen et al. (2018): 87.6%
46) SVM – Alanazi and Khan (2020): 87.1%
47) Gaussian Naive Bayes – Mugdha et al. (2020): 87.0%
48) Gradient Boosting – Kaliyar et al. (2019): 86.0%
49) LSTM HAN – Ajao et al. (2019): 86.0%
54) LSTM+GRU (Recurrent Neural Networks) – Barua et al. (2019): 82.6%
55) MVAE (Multimodal Variational Autoencoder) – Khattar et al. (2016): 82.4%
56) GTUT – Gangireddy et al. (2020): 80.0%
57) K-Nearest Neighbor – Kesarwani et al. (2020): 79.0%
58) AA-HGNN (Adversarial Active Learning based Graph Neural Network) – Ren et al. (2020): 73.5%
59) Random Forest – Jardaneh et al. (2019): 76.0%
60) Genetic algorithms – Al-Ahmad et al. (2021): 75.4%
61) SSLNews – Konkobo et al. (2020): 72.3%
Ten studies with deep learning employed the Convolutional Neural Network (CNN) algorithm (Table 3). Three used a typical
CNN (articles 3, 8, and 10), and the other seven used a hybrid model, as follows: CNN-RNN (articles 11 and 18), CNN with
marginloss (article 14), CNN-LSTM (article 16), SHMA-CNN (article 23), GRU-LSTM-CNN (article 36), and CNN-BiLSTM
(article 40). Goldani, Momtazi, and Safabakhsh (2021) achieved the highest accuracy of the CNN approaches, reaching 99.8%
(article 3). Through the articles analyzed (Table 2), science's growth in identifying fake news is underlined (Figure 2).
Figure 2. Evolution over the years of the number of publications of algorithms for fake news identification. Source: The authors.
Accuracy greater than 90% was only achieved in 2019 by Lakshmanarao, Swathi, and Kiran (2019) (article 37).
Supporting Almeida et al. (2021), Goldani, Momtazi, and Safabakhsh (2021) and Kumar, Anurag, and Pratik (2021)
cite the 2016 US presidential elections as the biggest motivator for research applied to the identification of
disinformation or fake news.
The word cloud of the most recurrent keywords (Figure 3) in the analyzed articles showed that the terms “fake news
detection”, “fake news”, “deep learning”, “machine learning”, and “feature-extraction”, with 24, 15, 16, 13 and
12 occurrences, respectively, were the words with the highest recurrence in the 61 articles analyzed by this
research.
Figure 3. Word cloud of most recurrent keywords in articles. Source: The authors.
Regarding RQ2 (Which datasets are used?), the investigation reveals that many datasets were being used to
develop fake news identification methods. Table 4 presents them with the respective amount of use in the publications.
Table 4. Datasets used in the analyzed studies of fake news detection. Source: The authors.
Dataset Amount Frequency
Kaggle 39 63.9%
Weibo 6 9.8%
FNC-1 3 4.9%
COVID-19 Fake News 2 3.3%
Twitter 2 3.3%
NewsFN 1 1.6%
Bengali Language 1 1.6%
btvlifestyle 1 1.6%
Slovak language 1 1.6%
Fake vs Satire 1 1.6%
fake news Amharic 1 1.6%
LUN 1 1.6%
Fakeddit 1 1.6%
Facebook 1 1.6%
Total 61 100.0%
25)Random Forest (RF) – Faustini and 50)XGBoost – Lin et al. (2019) Kaggle
Kaggle
Covões (2020)
51)FND-Bidirectional LSTM concatenated FNC-1
26)Random Forest – Varshney and Kaggle – Qawasmeh et al. (2019)
Vishwakarma (2020)
Kaggle 52)Multi-domain Visual Neural Network Weibo
27)S-HAN – Ahuja and Kumar (2020) btvlifestyle (MVNN) – Qi et al. (2019)
28)LSTM – 53)CROWDSOURCING – Shabani and Sokhn Fake vs
Kaggle Satire
Ivancov, Sarnovsk and Maslej-kre (2018)
(2021)
Kaggle 54)LSTM+GRU (Recurrent Neural Networks) Kaggle
29)WELFake – Verma et al. (2021) – Barua et al. (2019)
Slovak language
30)SemSeq4FD – Wang et al. (2021) 55)MVAE (Multimodal Variational Weibo
Autoencoder) – Khattar et al. (2016)
31)EchoFakeD – Kumar, Anurag and Kaggle
56)GTUT – Gangireddy et al. (2020) Kaggle
Pratik (2021)
LUN
32)CARMN – Song et al. (2021) 57)K-Nearest – Neighbor Kesarwani et al. Kaggle
Kaggle (2020)
33)BiLSTM – Sharma, Garg and
Shrivastava (2021) 58)AA-HGNN (Adversarial Active Learning
Weibo based Graph Neural Network) – Ren et al. Kaggle
34)SVM-RNN-GRUs bidirecionais – (2020)
Albahar (2021) Kaggle
59)Random Forest – Jardaneh et al. (2019) Twitter
35)BiLSTM-RNN – Bahad, Saxena and COVID-
Kaggle 60)Algoritmos genéticos – Al-Ahmad et al.
Kamal (2020) 19 Fake
(2021)
News
36)GRU-LSTM-CNN – Torgheh et al.
(2021) Kaggle 61)SSLNews – Konkobo et al. (2020) Kaggle
37)Random Forest – Lakshmanarao, Swathi
and Kiran (2019) Twitter
38)MCNN-TFV – Li et al. (2020)
FNC-1
39)Bi-LSTM-GRU-dense – Aslam et al. (2021)
NewsFN
40)CNN + BiLSTM – Kumar et al. (2020)
Kaggle
Kaggle
Regarding RQ3 (What are the top recommendations for future studies?), some perspectives are presented, such as
using other languages for the dataset. For the authors, the accuracy of identifying fake news depends not only
on the algorithm but also on the dataset language. Therefore, Jiang et al. (2021) and Ahuja and Kumar (2020) recommend
extending studies by applying research to datasets in other languages.
Another research suggestion is classifying fake news using a scoring model, such as a credibility rate. The
scoring model is justified by the difficulty in classifying information as only false or true (Agarwal et al., 2020),
given the inherent complexity of human language and other aspects, such as the area of news (e.g., economics or
politics).
The combination of models generating hybrid models is a recommendation highlighted by Jiang et al. (2020),
Pardameanm, and Pardede (2021), and Kaliyar, Goswamim and Narang (2021). Another recommendation was to use
algorithms based on deep learning in future research and understand how this technique can help identify fake news
(Bahad, Saxena & Kamal, 2020).
Despite much research being directed toward textual information, Song et al. (2021) and Varshney and Vishwakarma
(2020) recommended research on the exploitation of visual information in search of fake news. Goel et al. (2021)
highlight the relevance of further expanding research on fake news in other areas, given that many were restricted to the
identification of fake news in datasets exclusive to political news.
Fang et al. (2019) recommended a better understanding of how the classifier detects fake news. Such an understanding makes it
possible to modify or replace features so that the detection method does not rely on very specific semantics of fake news, which could
be exploited to cause misclassification (e.g., false positives, false negatives).
For Albahar (2021), the challenge is great, and researchers need to devote more attention to understanding the patterns
of news structures and what is considered false in the digital universe. For the researcher, fake digital news continues to
acquire new formats, making it difficult to distinguish fake news embedded in long news.
Finally, some limitations of the performed analysis threaten this study's validity. Although the number of
publications retrieved from the scientific repositories is substantial, the method employed does not intend to be exhaustive.
Accuracy comparison between different algorithms and datasets provides a limited view of the matter. A more accurate
comparison between algorithms requires a controlled environment and advanced statistical analysis (e.g., n-fold cross-validation,
paired t-tests). The classification of the algorithms also varies according to different authors' perspectives
and theoretical backgrounds, especially regarding mixed approaches, called hybrid methods.
Conclusion
This research intended to investigate the computational techniques and datasets used in fake news identification, analyzing
the accuracy reported in scientific literature. For this, three questions were investigated. Regarding the accuracy of the
main algorithms used to identify fake news (RQ1), the top three approaches are as follows: the Stacking Method, with
99.9% accuracy, Bidirectional Recurrent Neural Network (BiRNN), with 99.8%, and the Convolutional Neural Network
(CNN), also with 99.8%.
The most popular technique was CNN, being used in ten studies. The scientific evolution in the past years for fake
news identification is remarkable. An accuracy superior to 90% was reached only in 2019, with the 21 highest accuracies,
above 96.6%, dating between 2020 and 2021.
Regarding the datasets used for the identification of fake news (RQ2), Kaggle has a more significant predominance,
probably due to its popularity and contemplating several datasets on its platform for studies of artificial intelligence. After
Kaggle, Weibo (i.e., a Chinese microblog similar to Twitter), FNC-1, COVID-19 Fake News, and Twitter were found and
are presented in order considering the highest number of occurrences in the analyzed studies.
The top recommendations for future research in fake news identification (RQ3) are pointed out as follows:
I. INTRODUCTION
Information is significant for human dynamics and affects life practices. In earlier days, daily news
or information was presented through print media such as newspapers and electronic media such as television
and radio. The data from these publishing technologies is more credible, as it is either self-screened or
vetted by specialists [1]. These days, individuals are presented with
an extreme amount of data through various sources, particularly with the prominence of the internet
and web-based media platforms. The ease of internet access has caused the hazardous growth of a
wide range of falsehoods, such as malicious discussion, deception, fabrications, fake news, and spam
reviews, which diffuse quickly and widely through society. Misinformation on online
social media has become a global problem for public trust and society, as social media has become an essential
mode of communication and networking.
Nowadays, online social platforms and blogs contain a significant amount of fake and fabricated
news, negatively affecting society [2]. This news is embellished with dubious facts and misleading
information, causing interpersonal anxiety and detrimental social panic. This unreliable information
destroys people's trust and adversely influences the economy and major political processes, such as
the stock market, elections, etc. The proliferation of fake and fabricated news is generally detected
manually by human verification. This manual fact-checking process is subjective in nature, laborious,
time-consuming, and inefficient. In recent years, automatic systems based on machine learning and
natural language processing algorithms have been utilized to tackle the issue of fake news detection
[3], [4]. With the advancement of technology and artificial intelligence, these automatic systems
efficiently restrain misleading and false news propagation. Thus, these techniques have created deep
interest among researchers in detecting fake news for a better future endeavor.
This paper presents a fake news detection and classification system built with different types of
machine learning techniques. The open-source fake news dataset used by the proposed fake news
detection system contains the authors, titles, and main text of various articles.
Initially, the dataset is preprocessed using conventional techniques (regex, tokenization, stopword
removal, and lemmatization), and then NLP techniques (count vectorizer and TF-IDF vectorizer) are applied. The
major contributions of this work are as follows:
• In this paper, an automatic fake news detection system has been developed using various machine
learning and natural language processing algorithms. This work uses logistic regression, decision
tree, naive bayes, and SVM machine learning techniques.
• Additionally, Long Short-Term Memory (LSTM), deep learning model and natural language
processing algorithm, Bidirectional Encoder Representations from Transformers (BERT) are also
implemented.
• Next, the efficiency of all the machine learning and natural language processing models are
compared in terms of classification accuracy, precision, recall, F-1 score, and ROC curve.
• Finally, the performance of the proposed fake news detection system is compared with previous
relevant works in terms of classification accuracy. The novelty of this work is the use of a
BERT-based NLP model for detecting fake news.
The rest of the paper is organized as follows. In Section III, the proposed system has been
discussed with appropriate equations. The actual results of the research have been shown in Section IV.
Lastly, Section V concludes the paper with some directions for the future improvement of this work.
III. METHODOLOGY
In this section, we have discussed the methodology of our work in great detail. We have explained all
the regular machine learning, neural network and NLP methods that we have used in our dataset.
A. Dataset
In this work, an open-source fake news dataset from Kaggle [10] has been used. The public dataset was
created by web scraping of different search engines. Because fake news and agenda-driven content appear
constantly, the data was curated with the help of automated data science technologies. It
was posted to the data science community as a challenge to use the data to implement an efficient fake
news detection architecture. This specific dataset has been used in this work because
it covers a wide variety of news portals and social sites. The dataset comprises
26,000 unique sample documents and has been used successfully in some papers to identify fake news
[11], [12]. The original dataset has four columns, viz. id, title, author, text. The id column represents a
particular numerical label for a news article; the title holds the heading of a news article; the author
column contains the information about the writer of the news item; and finally, under the text column,
the text of the report has been described. The training dataset has the label column, which marks the
news item as potentially unreliable or reliable. It is worth mentioning that, the dataset has 20,822
unique values in the text column.
B. Data Preprocessing
We need to transform the text data using preprocessing techniques, NLP, tokenization, and
lemmatization before feeding it to the ML and DL models [13]. Data preprocessing helps to
remove noise and inconsistency from the data, which increases the performance and efficiency of the
model. In this work, we have used traditional techniques (regex, tokenization, stopword removal, and
lemmatization), NLP techniques, and TF-IDF for data preprocessing. The implemented data
preprocessing techniques are explained briefly in the subsequent paragraphs.
1) Regex
We use regex to remove punctuation from the text data. Sentences often contain extra
punctuation such as exclamation marks. We use regex to remove this additional punctuation to make the
dataset noise-free.
Regular expressions describe regular languages, which form a subset of the context-free languages.
2) Tokenization
Tokenization is a preprocessing step that breaks sentences into words [14].
3) Stopwords
We use the English stopwords library in our preprocessing because our data is in English.
Removing stopwords reduces noise, makes the model faster and
more efficient, and saves memory space.
4) Lemmatization
Lemmatization is used to transform the words into root words. We can resolve data ambiguity and
inflection with lemmatization.
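The four preprocessing steps described so far (regex cleaning, tokenization, stopword removal, lemmatization) can be chained as in the sketch below. It uses only the standard library: the stopword set is a tiny illustrative list and the lemma lookup table is hypothetical, standing in for the full English stopword list and a real lemmatizer.

```python
import re

# Tiny illustrative stopword list; a real pipeline would use a full English list.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "in", "to", "and"}
# Hypothetical lookup table standing in for a real lemmatizer.
LEMMAS = {"elections": "election", "were": "be", "claims": "claim", "rigged": "rig"}

def preprocess(text):
    """Apply regex cleaning, tokenization, stopword removal, and lemmatization in order."""
    text = re.sub(r"[^\w\s]", " ", text.lower())          # 1) regex: strip punctuation
    tokens = text.split()                                  # 2) tokenization on whitespace
    tokens = [t for t in tokens if t not in STOPWORDS]     # 3) stopword removal
    return [LEMMAS.get(t, t) for t in tokens]              # 4) lemmatization by lookup

tokens = preprocess("BREAKING!!! The elections were rigged, claims a blogger...")
```

The output is a list of normalized words, which is the form expected by the vectorization step that follows.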
5) NLP technique
NLP techniques have been applied to convert the texts into meaningful numbers to feed these numbers
into our proposed machine learning algorithm.
6) Bag of words
The bag of words technique converts texts into machine-understandable numbers. With TF-IDF weighting, this is expressed as:
$$\text{TF-IDF} = TF_{t,d} \cdot IDF_t \qquad (1)$$
where $t$ is a term, and $d$ denotes the document. The term frequency is:
$$TF = \frac{q}{\text{Number of terms in the document}} \qquad (2)$$
where $q$ is the number of times the term $t$ appears in the document $d$. $IDF$ denotes inverse
document frequency, which indicates the importance of a particular term. IDF is calculated as:
$$IDF = \log\frac{1+n}{1+df_{d,t}} + 1 \qquad (3)$$
where $n$ means the number of documents and the denominator indicates the document frequency of the
term $t$.
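Equations (1) to (3) can be checked on a toy corpus. The sketch below assumes that Eq. (3) is the smoothed form IDF = log((1 + n) / (1 + df)) + 1 and that the logarithm is natural; the function names and the three-document corpus are invented for illustration.

```python
import math

def tf(term, doc_tokens):
    """Eq. (2): occurrences of the term divided by the number of terms in the document."""
    return doc_tokens.count(term) / len(doc_tokens)

def idf(term, corpus):
    """Eq. (3), assumed smoothed form with natural log: log((1 + n) / (1 + df)) + 1."""
    n = len(corpus)
    df = sum(1 for doc in corpus if term in doc)
    return math.log((1 + n) / (1 + df)) + 1

def tf_idf(term, doc_tokens, corpus):
    """Eq. (1): product of term frequency and inverse document frequency."""
    return tf(term, doc_tokens) * idf(term, corpus)

# Invented toy corpus of three tokenized documents.
corpus = [
    ["fake", "news", "spreads", "fast"],
    ["news", "report"],
    ["weather", "report"],
]
score_fake = tf_idf("fake", corpus[0], corpus)   # term found in one document
score_news = tf_idf("news", corpus[0], corpus)   # term found in two documents
```

Both terms occur once in the first document, but "fake" receives the higher weight because it appears in fewer documents overall.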
C. Machine Learning Algorithms
To detect and classify real and fake news, we have used different machine learning algorithms: logistic
regression, naive bayes, decision tree, and support vector machine.
1) Logistic regression
Logistic regression is a statistical ML classification model [15]. The proposed system is fundamentally
a binary classification problem. Logistic regression is used to model the
probability of a binary event, such as true/false, reliable/unreliable, or win/lose. Hence, the
logistic model is one of the most appropriate models for the fake news detection system. The condition
on the logistic model's prediction is:
$$0 \le h_\theta(x) \le 1 \qquad (4)$$
The logistic regression hypothesis is expressed as:
$$h_\theta(x) = g(\theta^T x) \qquad (5)$$
where $g$ is the sigmoid function,
$$g(z) = \frac{1}{1 + e^{-z}} \qquad (6)$$
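A short numeric sketch of Eqs. (4) to (6) follows. The weight vector and feature vector are hypothetical values chosen for illustration, and the 0.5 decision threshold is an assumption that the text does not state.

```python
import math

def sigmoid(z):
    """Eq. (6): g(z) = 1 / (1 + e^(-z)), which maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(theta, x):
    """Eq. (5): h_theta(x) = g(theta^T x)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

theta = [-1.0, 2.0, 0.5]   # hypothetical weights; the first one acts as the bias
x = [1.0, 0.8, 0.2]        # hypothetical features; the leading 1.0 pairs with the bias
p = predict_proba(theta, x)        # theta^T x = -1.0 + 1.6 + 0.1 = 0.7
is_fake = p >= 0.5                 # assumed decision threshold
```

Because the sigmoid output always lies strictly between 0 and 1, the condition in Eq. (4) holds for every input.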
outcomes that allow flipping the state around straightforwardly [16]. A conditional probability is a
probability that incident X will happen provided information Y. The typical notation for this is $P(X \mid Y)$.
We can use the naive Bayes rule to compute this probability when we only have the probability of
the opposite result and the two components separately.
Algorithm 1 briefly explains the building steps of the decision tree classification technique. The first
approach is subtree replacement, which refers to replacing nodes in a decision tree with leaves to reduce
the number of tests along a certain path. In most cases, subtree raising has a minor influence on
decision tree models. Usually, there is no accurate method to forecast an option's usefulness. However,
turning it off may be advisable if the induction operation takes longer than expected, because
subtree raising is computationally complex. Next, the entropy of the current state and of its corresponding
attributes is determined. The attribute with the maximum information gain is then
selected and removed from the candidate set. This process continues until all features have been exhausted or the decision
tree consists entirely of leaf nodes.
4) Support Vector Machine (SVM)
SVM, which is also known as a support vector network, is a supervised learning method [20].
SVMs are trained using data that has previously been divided into two groups [21]. The
model is built once training is complete. The goal of the support vector machine
technique is to decide to which group any new data point belongs and to maximize the margin between the classes [22].
The final goal of the SVM is to locate a hyperplane that divides the data into two parts. As the Radial Basis
Function (RBF) is suitable for large systems like a collection of media articles, it was chosen as the
kernel for this proposed system. On two samples $x$ and $x'$, the radial basis function is expressed as:
$$K(x, x') = \exp\left(-\frac{\lVert x - x' \rVert^2}{2\sigma^2}\right) \qquad (9)$$
where $\lVert x - x' \rVert^2$ denotes the squared Euclidean distance and $\sigma$ is a free parameter.
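Eq. (9) can be evaluated directly. In the sketch below the function name is hypothetical and the default sigma = 1.0 is an arbitrary illustrative choice, since the text does not report the value used.

```python
import math

def rbf_kernel(x, x_prime, sigma=1.0):
    """Eq. (9): K(x, x') = exp(-||x - x'||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))   # squared Euclidean distance
    return math.exp(-sq_dist / (2 * sigma ** 2))

same = rbf_kernel([1.0, 2.0], [1.0, 2.0])     # identical samples
near = rbf_kernel([0.0, 0.0], [1.0, 1.0])     # squared distance 2, so exp(-1)
far = rbf_kernel([0.0, 0.0], [3.0, 4.0])      # squared distance 25, close to 0
```

The kernel equals 1 for identical samples and decays toward 0 as they move apart, with sigma controlling how quickly the similarity falls off.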
For our classification, we used an LSTM model with an input layer that takes the input titles and article
body and an embedding layer that turns every word into a 300-dimensional vector. As there are 256 features,
this layer produces a 256×300 matrix. The weights we obtain from matrix multiplication will be in
the output matrix, which will generate a vector for every word. These vectors are fed into an
LSTM, whose output is passed to a fully connected dense layer, resulting in a single final output.
Table I shows the model layers and parameters, which were trained on batches of size 256.
TABLE I. LAYERS AND PARAMETERS OF THE PROPOSED LSTM MODEL
Layer Output Shape Number of Parameters
Input (None, 256) 0
Embedding (None, 256, 300) 60,974,100
Spatial Dropout (None, 256, 300) 0
Bidirectional (None, 256) 439,296
Dense (None, 64) 16,448
Dropout (None, 64) 0
Total parameters: 61,429,909
Trainable parameters: 61,429,909
Non-trainable parameters: 0
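The parameter counts in Table I can be reproduced with the standard formulas for embedding, LSTM, and dense layers. Several values below are inferred rather than stated in the text: the vocabulary size (60,974,100 / 300), the 128 LSTM units per direction (half of the 256-wide bidirectional output), and a final one-unit output layer that is not listed in the table but accounts for the remaining 65 parameters of the reported total.

```python
def embedding_params(vocab_size, dim):
    """One dim-sized vector per vocabulary entry."""
    return vocab_size * dim

def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * ((input_dim + units) * units + units)

def dense_params(inputs, units):
    """Weight matrix plus one bias per unit."""
    return inputs * units + units

embedding = embedding_params(203_247, 300)   # vocabulary size inferred from 60,974,100 / 300
bilstm = 2 * lstm_params(300, 128)           # two directions; 128 units each is an assumption
dense = dense_params(256, 64)
output = dense_params(64, 1)                 # assumed final layer, not listed in Table I
total = embedding + bilstm + dense + output
```

Under these assumptions the per-layer counts match the table rows, and the sum matches the reported total of 61,429,909 parameters.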
According to Fig. 4, the area under the curve (AUC) score of the ROC curve of the proposed logistic
regression algorithm is 0.79. The rest of the performance metrics for the logistic regression model are
demonstrated in Table III. The proposed logistic regression model's precision, recall, and F1-score are
74%, 72%, and 73%, respectively.
TABLE III. LOGISTIC REGRESSION MODEL’S PERFORMANCE METRICS
Precision Recall F1-score
0 (Not Fake) 0.74 0.84 0.78
1 (Fake) 0.74 0.61 0.67
Accuracy 0.74
Weighted Average 0.74 0.74 0.73
B. Performance of Naive Bayes Model
The confusion matrix for the naive bayes model of the proposed system has been shown in Fig. 5. The
authentic news class has 830 right predictions and 202 wrong predictions from the total 1032 test
samples. So, the accuracy for real news prediction is 80%, and for the fake news class, it has a
significant number of wrong classifications similar to the logistic regression model. Finally, the
accuracy for fake news is 66%, and the overall accuracy is 74%.
The true and false positive rates of the proposed naive bayes approach are depicted in Fig. 6. According
to Fig. 6, the naive bayes model has an ROC AUC score of 0.79. In Table IV, the rest of the
performance metrics for the naive bayes model are demonstrated. The precision, recall, and F1-score of
the proposed naive bayes model are 74%, 73%, and 73%, respectively.
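The per-class figures quoted above follow from confusion-matrix counts. The sketch below recomputes the real-news accuracy from the 830 right and 202 wrong predictions given in the text, and defines the usual precision, recall, and F1 formulas; the counts passed to `classification_metrics` are a made-up example, since the full confusion matrix is not reproduced here.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (precision, recall, f1, accuracy) from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# From the text: 830 right and 202 wrong predictions out of 1032 real-news samples.
real_news_accuracy = 830 / (830 + 202)

# Made-up counts, used only to show the formulas in action.
precision, recall, f1, accuracy = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

The real-news value rounds to the 80% reported above; the same function applies to the other models once their confusion-matrix counts are known.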
According to Fig. 8, the ROC AUC score of the proposed decision tree algorithm is 0.89. Table V
reports the remaining performance metrics for the decision tree model. The precision, recall,
and F1-score of the proposed decision tree model are 90%, 89%, and 89%, respectively.
According to Fig. 10, the ROC AUC score of the proposed SVM algorithm is 0.83. Table VI
reports the remaining performance metrics for the SVM model.
TABLE VI. SVM MODEL ACCURACY METRICS
                  Precision  Recall  F1-score
0 (Not Fake)      0.78       0.82    0.80
1 (Fake)          0.75       0.70    0.72
Accuracy          0.77
Weighted Average  0.77       0.77    0.77
E. Performance of LSTM Model
Fig. 11 illustrates the confusion matrix for the deep learning-based LSTM model of the proposed
system. The real news class has 1920 correct predictions and 157 incorrect predictions, so the
accuracy for real news prediction is 92%. For the fake news class, prediction is significantly
improved compared to the other ML techniques, and the overall accuracy of the LSTM technique is 95%.
The total number of test samples for each class differs from the ML approaches because the improved
preprocessing used for the NLP methods reduces the number of samples that are removed.
Table VII shows that the other performance metrics of the LSTM model are also better than those of
the ML models. The precision, recall, and F1-score of the proposed LSTM model are 94%, 95%, and 94%,
respectively.
TABLE VII. PERFORMANCE METRICS OF THE LSTM APPROACH
                  Precision  Recall  F1-score
0 (Not Fake)      0.98       0.92    0.95
1 (Fake)          0.91       0.97    0.94
Accuracy          0.95
Weighted Average  0.95       0.95    0.95
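The F1-score column in these tables is the harmonic mean of precision and recall. As a quick check, the sketch below recomputes the per-class F1-scores of Table VII from the reported precision and recall values (the function name is illustrative).

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class (precision, recall) pairs as reported in Table VII
table_vii = {"0 (Not Fake)": (0.98, 0.92), "1 (Fake)": (0.91, 0.97)}
for label, (p, r) in table_vii.items():
    print(label, round(f1_from_pr(p, r), 2))
```

Because the table values are already rounded to two decimals, recomputed F1-scores for other tables can differ from the printed ones by 0.01.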
Figure 13. Accuracy and loss vs. epochs graphs of BERT framework.
Fig. 13 shows the accuracy and loss of BERT with respect to the training epoch. The validation
accuracy of the BERT model starts at 97% in the initial stage of training and changes little:
after three epochs, it has increased by only 1%, reaching 98%.
Assessing the accuracy and credibility of information and news available on the internet is critical
nowadays. Various online platforms have been found to play a significant role in disseminating
misleading information and spreading fake news that serves harmful purposes and benefits certain
parties. Because of the sheer volume of data spread and shared on the internet, there is a
growing demand for automated false news identification systems that are accurate and efficient. This
paper proposes an automatic fake news detection system that uses classical machine learning,
deep learning, and natural language processing techniques. Several preprocessing and feature
extraction methods, such as regex cleaning, tokenization, stopword removal, lemmatization, and TF-IDF,
were used to prepare the data in the proposed system. Next, several models (logistic regression,
decision tree, naive Bayes, support vector machine, long short-term memory, and bidirectional encoder
representations from transformers) were employed to classify the fabricated news. The machine
learning models logistic regression, decision tree, naive Bayes, and SVM achieved accuracies of
73.75%, 89.66%, 74.19%, and 76.65%, respectively. Substantially better performance was achieved by
the LSTM neural network and the NLP-based BERT technique. In the future, the proposed system can be
extended to detect more specific categories of false news, e.g., religious, political, and
COVID-19-related. The word2vec approach can be applied to deal with and classify image and
video-related visual datasets. News data in diverse languages can be used to identify false news
from different nations and countries. A further extension of this work would be to employ
attention-based deep learning approaches.
CONFLICT OF INTEREST
The authors declare no conflict of interest.
AUTHOR CONTRIBUTIONS
M. E. H. Rafi proposed the research idea; N. N. Prachi and M. Habibullah conducted the research;
E. Alam and R. Khan analyzed the data; N. N. Prachi, M. Habibullah and M. E. H. Rafi wrote the paper;
R. Khan helped to draft the final manuscript; all authors approved the final version.
REFERENCES
[1] J. Strömbäck, Y. Tsfati, H. Boomgaarden, et al., “News media trust and its impact on media use: Toward a framework
for future research,” Annals of the International Communication Association, vol. 44, pp. 139-156, 2020.
[2] E. Mitchelstein and P. J. Boczkowski, “Online news consumption research: An assessment of past work and an agenda
for the future,” New Media & Society, vol. 12, pp. 1085-1102, 2010.
[3] P. Henrique, A. Faustini and T. F. Covões, “Fake news detection in multiple platforms and languages,” Expert Systems
with
Applications, vol. 158, pp. 1-9, 2020.
[4] F. A. Ozbay and B. Alatas, “Fake news detection within online social media using supervised artificial intelligence
algorithms,” Physica A: Statistical Mechanics and its Applications, vol. 540, pp. 1-19, 2020.
[5] M. Umer, “Fake news stance detection using deep learning architecture (CNN-LSTM),” IEEE Access, vol. 8, pp.
156695156706, 2020.
[6] T. Jiang, J. P. Li, A. U. Haq, et al., “A novel stacking approach for accurate detection of fake news,” IEEE Access, vol.
9, pp. 2262622639, 2021.
[7] S. I. Manzoor, J. Singla, and Nikita, “Fake news detection using machine learning approaches: A systematic review,” in
Proc. International Conference on Trends in Electronics and Informatics, 2019, pp. 230-234.
[8] A. Jain, A. Shakya, H. Khatter, et al., “A smart system for fake news detection using machine learning,” in Proc.
International Conference on Issues and Challenges in Intelligent Computing Techniques, 2019, pp. 1-4.
[9] I. Ahmad, M. Yousaf, S. Yousaf, et al., “Fake news detection using machine learning ensemble methods,” Complexity,
pp. 1-11, 2020.
[10] UTK machine learning club. (July 2017). Fake news, version 1. [Online]. Available: https://www.kaggle.com/c/fake-
news/data
[11] H. Ali, M. S. Khan, A. AlGhadhban, et al., “All your fake detector are belong to us: Evaluating adversarial robustness
of fake-news detectors under black-box settings,” IEEE Access, vol. 9, pp. 8167881692, 2021.
[12] I. K, Sastrawan, I. P. A. Bayupati, and D. M. S. Arsa, “Detection of fake news using deep learning CNN-RNN based
methods,” ICT Express, pp. 1-13, 2021.
[13] Y. A. Solangi, Z. A. Solangi, S. Aarain, et al., “Review on Natural Language Processing (NLP) and its toolkits for
opinion mining and sentiment analysis,” in Proc. International Conference on Engineering Technologies and Applied
Sciences, 2018, pp. 1-4.
[14] G. Kim and S. H. Lee, “Comparison of Korean preprocessing performance according to Tokenizer in NMT transformer
model,” Journal of Advances in Information Technology, vol. 11, pp. 228232, 2020.
[15] T. Daghistani and R. Alshammari, “Comparison of statistical logistic regression and random forest machine learning
techniques in predicting diabetes,” Journal of Advances in Information Technology, vol. 11, pp. 78-83, 2020.
[16] W. He, Y. He, B. Li, et al., “A naive-Bayes-based fault diagnosis approach for analog circuit by using image-oriented
feature extraction and selection technique,” IEEE Access, vol. 8, pp. 50655079, 2020.
[17] Q. Xue, Y. Zhu, and J. Wang, “Joint distribution estimation and naïve bayes classification under local differential
privacy,” IEEE Transactions on Emerging Topics in Computing, vol. 9, pp. 20532063, 2021.
[18] H. A. Maddah, “Decision trees based performance analysis for influence of sensitizers characteristics in dye-sensitized
solar cells,” Journal of Advances in Information Technology, vol. 13, pp. 271276, 2022.
[19] I. D. Mienye, Y. Sun, and Z. Wang, “Prediction performance of improved decision tree-based algorithms: A review,”
Procedia Manufacturing, vol. 35, pp. 698-703, 2019.
[20] J. A. C. Moreano and N. B. L. S. Palomino, “Global facial recognition using gabor wavelet, support vector machines
and 3D face models,” Journal of Advances in Information Technology, vol. 11, pp. 143-148, 2020.
[21] A. B. Gumelar, A. Yogatama, D. P. Adi, et al., “Forward feature selection for toxic speech classification using support
vector machine and random forest,” International Journal of Artificial Intelligence, vol. 11, pp. 717-726, 2022.
[22] J. Cervantes, F. Garcaí-Lamont, L. Rodrgíuez, et al., “A comprehensive survey on support vector machine
classification:
Applications, challenges and trends,” Neurocomputing, vol. 408, pp. 189-215, 2020.
[23] I. Benchaji, S. Douzi, and B. E. Ouahidi, “Credit card fraud detection model based on LSTM recurrent neural
networks,” Journal of Advances in Information Technology, vol. 12, pp. 113118, 2021.
[24] N. Yadav and A. K. Singh, “Bi-directional encoder representation of transformer model for sequential music
recommender system,” in Proc. Forum for Information Retrieval Evaluation, 2020, pp. 4953.
[25] S. Ni, J. Li, and H. Y. Kao, “MVAN: Multi-view attention networks for fake news detection on social media,” IEEE
Access, vol. 9, pp. 106907-106917, 2021.
Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution License
(CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly
cited, the use is noncommercial and no modifications or adaptations are made.
Noshin Nirvana Prachi obtained her bachelor's degree in computer science and engineering in July 2021 from North South
University, Bangladesh. Noshin was born in Dhaka, Bangladesh. One of her research works, on a deep learning-based speaker
recognition system, was published at the Interdisciplinary Research in Technology and Management (IRTM) conference. She is
working on data science, machine learning, computer vision and software engineering.
Md. Habibullah completed his B.Sc. degree in computer science and engineering in 2021 from North South University,
Bangladesh's electrical and computer engineering department. Recently he has published a manuscript on a deep learning-
based speaker recognition system at an IEEE conference. Currently, he is doing research on data science, machine learning,
cryptography and cyber security.
Md. Emanul Haque Rafi received his bachelor of science degree in computer science and engineering from the electrical
and computer engineering department of North South University, Bangladesh. Emanul was born in Dhaka, the capital city of
Bangladesh. His primary research interests include data science and management, machine learning, deep learning, and
natural language processing.
Evan Alam has a bachelor’s degree in computer science and engineering from the electrical and computer engineering
department of North South University, Bangladesh. He was an active member of the Computer & Engineering Club of North
South University during his undergraduate study. Currently, his primary research interests are computer vision, data science,
machine learning, and computer network security.
Riasat Khan received a B.Sc. degree in Electrical and Electronic Engineering from the Islamic University of Technology,
Bangladesh, in 2010. He completed his M.Sc. and Ph.D. degrees in Electrical Engineering from New Mexico State
University, Las Cruces, USA, in 2018. Currently, Dr. Khan is working as an Assistant Professor in the Department of
Electrical and Computer Engineering at North South University, Dhaka, Bangladesh. His research interests include
biomedical engineering, cardiac electrophysiology and computational bioelectromagnetics.
CODE
Fake news detection using ML
Sushwanth Reddy 17STUCHH010063
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
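from sklearn import feature_extraction, linear_model, model_selection, preprocessing
from sklearn import metrics  # needed for metrics.confusion_matrix below
from sklearn.naive_bayes import MultinomialNB  # needed for the Naive Bayes pipeline below
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline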
Read datasets
fake = pd.read_csv("data/Fake.csv")
true = pd.read_csv("data/True.csv")
fake.shape
(23481, 4)
true.shape
(21417, 4)
# Concatenate dataframes
data = pd.concat([fake, true]).reset_index(drop = True)
data.shape
(44898, 5)
# Shuffle the data
from sklearn.utils import shuffle
data = shuffle(data)
data = data.reset_index(drop=True)
   text                                                subject       target
0  BRUSSELS (Reuters) - The European Commission s...  worldnews     true
1  Remember during the effort to get Obamacare pa...  politics      fake
2  WASHINGTON (Reuters) - A senior European Union...  politicsNews  true
3  WASHINGTON (Reuters) - Hurricane Harvey devast...  politicsNews  true
4  These people are sick and evil. They will stop...  politics      fake
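# Convert to lowercase
data['text'] = data['text'].str.lower()
# Remove punctuation (punctuation_removal is defined in a cell that is not shown here)
import string
data['text'] = data['text'].apply(punctuation_removal)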
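The definition of punctuation_removal does not appear in this listing. A minimal sketch of what such a helper could look like, assuming it simply strips ASCII punctuation characters; the body is an assumption, and only the function name comes from the code above:

```python
import string


def punctuation_removal(text):
    """Return text with every ASCII punctuation character removed (assumed behavior)."""
    return text.translate(str.maketrans("", "", string.punctuation))


# Beginning of the first row shown in the data preview
sample = "BRUSSELS (Reuters) - The European Commission"
cleaned = punctuation_removal(sample.lower())
print(cleaned.split())
```

Removing characters rather than replacing them with spaces can leave double spaces (here where " - " used to be), which is harmless for whitespace-based tokenization.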
# Check
data.head()
   text                                                subject       target
0  brussels reuters the european commission said...   worldnews     true
1  remember during the effort to get obamacare pa...  politics      fake
2  washington reuters a senior european union of...   politicsNews  true
3  washington reuters hurricane harvey devastate...   politicsNews  true
4  these people are sick and evil they will stop ...  politics      fake
politicsNews    11272
worldnews       10145
Name: text, dtype: int64
plt.figure(figsize=(10,7))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
# Word cloud for real news
from wordcloud import WordCloud
plt.figure(figsize=(10,7))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
# Most frequent words counter
# (Code adapted from https://www.kaggle.com/rodolfoluna/fake-news-detector)
from nltk import tokenize
token_space = tokenize.WhitespaceTokenizer()
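# Signature reconstructed from the call sites below; the default title and colormap are assumptions.
def plot_confusion_matrix(cm, classes, normalize=False,
                          title='Confusion matrix', cmap=plt.cm.Blues):
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)
    if normalize:
        # Divide each row by its sum so that entries become per-class fractions
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')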
Naive Bayes
dct = dict()
NB_classifier = MultinomialNB()
pipe = Pipeline([('vect', CountVectorizer()),
                 ('tfidf', TfidfTransformer()),
                 ('model', NB_classifier)])
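To make the first two pipeline stages less opaque, the sketch below computes TF-IDF weights in plain Python using the smoothed IDF formula, idf = ln((1 + n) / (1 + df)) + 1, followed by L2 normalization, which are the documented defaults of scikit-learn's TfidfTransformer. It is an illustration rather than a drop-in replacement: tokenization here is a simple lowercase whitespace split, whereas CountVectorizer uses its own token pattern, and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vectors, idf): one L2-normalized {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # raw term counts (the CountVectorizer step)
        raw = {t: c * idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        vectors.append({t: w / norm for t, w in raw.items()} if norm else {})
    return vectors, idf


# Hypothetical documents for illustration only
docs = ["fake news spreads fast", "real news reports facts"]
vectors, idf = tfidf(docs)
print(idf["news"], idf["fake"])
```

A term that occurs in every document ("news") gets the minimum IDF of 1.0, while rarer terms get larger weights, so they dominate the vector that the classifier sees.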
# Accuracy
prediction = model.predict(X_test)
print("accuracy: {}%".format(round(accuracy_score(y_test, prediction)*100,2)))
dct['Logistic Regression'] = round(accuracy_score(y_test, prediction)*100,2)
accuracy: 98.84%
cm = metrics.confusion_matrix(y_test, prediction)
plot_confusion_matrix(cm, classes=['Fake', 'Real'])
# Accuracy
prediction = model.predict(X_test)
print("accuracy: {}%".format(round(accuracy_score(y_test, prediction)*100,2)))
dct['Decision Tree'] = round(accuracy_score(y_test, prediction)*100,2)
accuracy: 99.58%
cm = metrics.confusion_matrix(y_test, prediction)
plot_confusion_matrix(cm, classes=['Fake', 'Real'])
Confusion matrix, without normalization
Random Forest