

Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2019.DOI

Rumour Detection based on Graph Convolutional Neural Net
NA BAI1, FANRONG MENG1,2, XIAOBIN RUI1, ZHIXIAO WANG1,2
1 School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
2 Mine Digitization Engineering Research Center of the Ministry of Education of the People's Republic of China, Xuzhou 221116, China
Corresponding authors: Fanrong Meng (e-mail: [email protected]); Zhixiao Wang (e-mail: [email protected]).
This work is supported by the National Key Research and Development Plan (No. 2016YFC0600908) and the National Natural Science Foundation of China (No. 61876186, No. 61977061).

ABSTRACT Rumor detection is an important research topic in social networks, and many rumor detection models have been proposed in recent years. For the rumor detection task, the structural information in a conversation can be used to extract effective features. However, many existing rumor detection models focus on local structural features, while the global structural features between the source tweet and its replies are not effectively used. To make full use of global structural features and content information, we propose a Source-Replies relation Graph (SR-graph) for each conversation, in which every node denotes a tweet, its node feature is its weighted word vectors, and edges denote the interactions between tweets. Based on SR-graphs, we propose an Ensemble Graph Convolutional Neural Net with a Nodes Proportion Allocation Mechanism (EGCN) for the rumor detection task. In the experiments, we first verify that the extracted structural features are effective, and then we show the effects of different word-embedding dimensions on multiple test indices. Moreover, we show that our proposed EGCN model is comparable to or even better than the current state-of-the-art machine learning models.

INDEX TERMS Rumour detection, graph convolutional neural networks, word-vector embedding

I. INTRODUCTION
For the first time in history, people consume more news from social media than from traditional sources (e.g., television, newspapers); people tend to believe the information from the media, which makes them vulnerable to rumors and fake news [1]. This phenomenon makes rumors spread quickly. Rumors are defined as information that is unverified at the time of its posting; unverified rumors cause anxiety and panic to varying degrees [2, 3], and it is difficult for the public to distinguish verified information from rumors [4]. Therefore, an effective automatic rumor detection system is extremely important.

In recent years, there has been growing interest in studying machine learning methods for automatic rumor detection. Zhao et al. [5] assume that rumors will cause Twitter users to question the veracity of tweets. This method focuses on content information, but not all rumors provoke inquiry tweets. To make full use of content information, scholars have proposed content-based rumor detection models, such as Random Forests (RFs) [6], TF-IDF based models [7], and deep learning models. These content features include syntactic, lexical, and semantic features. Although these methods can extract effective content features, scholars realize that the exclusive use of content features is not satisfactory for rumor detection tasks, and these rumor detection models cannot reflect the structural information of rumor propagation. To further enhance the performance of rumor detection models, scholars introduce structural features into their models. In structure-based rumor detection models, such as Multi-features Support Vector Machines (SVMs) [8], the structural information is expressed as context features, and scholars combine context features, such as user-based features [9] and network-based features [10], with the content features in their machine learning models. However, these context features are always local, and the local structural features and content features are extracted by different methods, which makes training difficult. Therefore, we aim to design a uniform deep learning framework that learns both the content features and the global structural features end-to-end for the rumor detection task.

From the perspective of local structures, the degrees of replies are similar based on our observations, while the degree of the source tweet is large. From the perspective of global structures, conversations are organized as graphs or trees.


In this paper, both the global structure and the local structure are treated as important features for the rumor detection task. We aim to model both the global and the local structural information in a uniform frame and then propose a deep ensemble neural network to learn these two kinds of features from conversations. In the proposed model, the structural information is expressed in Source-Replies relation graphs (SR-graphs). In an SR-graph, a node denotes a tweet, its node feature is its weighted word vectors, and edges denote the interactions between tweets. The word vectors express the content information and are trained with the Word2Vec model [11]. However, when the structure of an SR-graph is simple, the global structural features of rumors and non-rumors may be indistinguishable. In order to extract distinguishable features from conversations of different lengths, we propose a Nodes Proportion Allocation Mechanism (NPAM) to build an ensemble deep neural network for different conversations. Generally, the complexity of an SR-graph is proportional to its number of nodes: if an SR-graph has few nodes, its structure is usually simple, and most simple SR-graphs have similar global structures. Therefore, for simple SR-graphs, the text features and local structural features are more important for rumor detection, while the global structural features are secondary. Conversely, for complex SR-graphs, the structures of rumors and non-rumors are probably distinguishable, and the global structural features and the text features are both important for rumor detection. In order to model this phenomenon effectively, the proposed NPAM ensembles two neural networks, a Text CNN (TCNN) and a GCN, while the rumor detection task is treated as a classification task. Assuming that the number of nodes in the current SR-graph is N and the number of nodes in the maximal SR-graph is M, the contribution rate of the TCNN and the GCN for classification is defined by N/M. We call the resulting model an Ensemble Graph Convolutional Neural Net with Nodes Proportion Allocation Mechanism (EGCN). In the experiments, we first verify that the global structural features are effective for rumor detection, and then the effects of different word-embedding dimensions on multiple test indices are studied. Moreover, we show that our proposed EGCN model is comparable to or even better than current state-of-the-art models.

Our main contributions can be summarized as follows:
1) To learn both the global and the local structural information, we construct an SR-graph for every conversation. In an SR-graph, a node denotes a tweet, and the node feature is its weighted word vectors.
2) To build an effective deep learning model for the rumor detection task, we propose an EGCN model based on a Nodes Proportion Allocation Mechanism (NPAM). Based on the NPAM, the text features, local structural features, and global structural features can be learned by an ensemble deep neural network.
3) To obtain a satisfactory EGCN model for the rumor detection task, we explore the optimal dimension of the word vectors. Based on experiments, we conclude that the proposed EGCN is optimal in terms of F1 scores when the dimension of the word vectors is 25.

The remainder of this paper is organized as follows. Section 2 introduces the related work, including rumor detection methods that use content features and context features. Section 3 details the proposed model. In Section 4, experimental results prove the feasibility of the proposed model. The last section provides conclusions and an outlook.

II. RELATED WORK
For rumor detection, Zhao et al. [5] assume that rumors will cause Twitter users to question the veracity of tweets. This method is based on content information, but not all rumors provoke inquiry tweets. Based on this research, many scholars focus on the content features of rumors. Zubiaga et al. propose an alternative approach that learns context from breaking news to determine whether a tweet constitutes a rumor [6]. Tolosi et al. distinguish rumors by analyzing the characteristics of different events; however, the features change dramatically across events [7]. McCreadie et al. study the feasibility of using a crowdsourcing platform to identify rumors and non-rumors in social media [8]. Bhattacharjee et al. regard rumor detection as a text classification task [9]. They propose a novel approach of feature construction by reweighting the TF-IDF score of some particular terms according to the label information of the training data, and they show that their model reaches performance comparable to an LSTM with GloVe word embeddings for rumor detection on the PHEME datasets. Although these methods can extract effective content features, scholars realize that such rumor detection models cannot reflect the structural information of rumor propagation, and the exclusive use of content features is not sufficient for the rumor detection task. To further enhance the performance of rumor detection models, scholars consider structural information and introduce structural features into their models. In these models, the structural information is expressed as context features, which are usually local. Context features are extracted by considering relevant information of the social media tweet or fake news [10]. Wu et al. introduce random walk kernels between tweets into a Kernel Support Vector Machine (KSVM) and combine both the content-based kernels and the random walk kernels for rumor detection [12]. Pamungkas et al. use the Jaccard similarity between every tweet and its source as context features [13]. Although these local structural features are useful, they do not make full use of the global structural features for the rumor detection task. Therefore, this paper suggests that structural information can be exploited further for the rumor detection task, and we pay attention to the global and local structural features in every conversation in our research.

III. THE PROPOSED MODEL
Global structural information means that the interaction between all tweets in a conversation is considered.

As we observed, if a source tweet constitutes a rumor in the end, its replies tend to follow the source or the former comment. To learn the structural information of each conversation, we propose a source-replies relation graph (SR-graph) for every conversation. In an SR-graph, a node denotes a tweet, and the node feature is its corresponding word vectors. Based on SR-graphs, we propose an Ensemble Graph Convolutional Neural Net with a Nodes Proportion Allocation Mechanism (EGCN).

A. SOURCE-REPLIES RELATION GRAPHS
In a conversation, the source tweet and its replies are correlated, and this correlation is not only local but also global. As we observe, if a source tweet constitutes a rumor in the end, the global structure of its conversation is approximately linear or starlike. The local structure shows the connections and importance of the current tweet, and a reply always follows the source or the former tweet. Therefore, both the global structure and the local structure can be treated as important features for the rumor detection task, and we try to model both the global and the local structural information in a uniform graph structure. The most commonly used graph structures are Gaussian diffusion kernels and random walk kernels. However, these two kernels do not express the adjacency relation directly. To obtain the global structural information directly and to extract the global and local features, we build an SR-graph for every conversation and transform the rumor detection task into a graph classification task. In an SR-graph, the adjacency matrix reflects the global structure of a conversation: if a tweet replies to another tweet, their corresponding nodes are connected. The node features in an SR-graph are described by weighted word vectors, in which the weight of a node is the proportion of its degree. We use the weights to reflect the local structure of a conversation and train word vectors with the Word2Vec model. Except for emoticons, special characters, and websites, all other text is transformed into word vectors. As part of the input data for the neural networks, every word vector is regularized to [0, 1]. In the experiments, the effect of different word vector lengths is studied. A diagrammatic sketch of a source-replies relation graph is shown in Fig. 1.

FIGURE 1. A diagrammatic sketch of a source-replies relation graph with 6 replies.
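To make the construction concrete, the following minimal sketch builds the adjacency matrix and degree-weighted node features for one conversation. It is our own illustration, not the authors' code: the function name, the reply-pair input format, and the assumption of pre-trained Word2Vec vectors are ours.

```python
import numpy as np

def build_sr_graph(tweets, reply_to, word_vectors, dim=25):
    """tweets: list of token lists, one per tweet (index 0 = source tweet).
    reply_to: list of (child, parent) index pairs, one per reply relation.
    word_vectors: dict mapping token -> np.ndarray of length dim."""
    n = len(tweets)
    A = np.zeros((n, n))
    for child, parent in reply_to:          # one edge per reply relation
        A[child, parent] = A[parent, child] = 1.0

    # Raw node feature: average of the word vectors of the tweet's tokens.
    X = np.zeros((n, dim))
    for i, tokens in enumerate(tweets):
        vecs = [word_vectors[t] for t in tokens if t in word_vectors]
        if vecs:
            X[i] = np.mean(vecs, axis=0)

    # Regularize the features to [0, 1], as described above.
    X = (X - X.min()) / (X.max() - X.min() + 1e-8)

    # Weight each node by the proportion of its degree (local structure).
    degrees = A.sum(axis=1)
    weights = degrees / (degrees.sum() + 1e-8)
    X = X * weights[:, None]
    return A, X
```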
Based on SR-graphs and the weighted word vectors, we train deep neural networks for the rumor detection task. Graph Convolutional Nets (GCNs) have been proposed for learning complex graphs [14-20]; spectral domain-based GCNs are more suitable for node classification tasks [15-17], whereas spatial domain-based GCNs are more suitable for graph classification tasks [18-20]. In the rumor detection task, a source tweet and its replies can be expressed as an SR-graph, which can be learned by a spatial-based GCN.

B. AN ENSEMBLE GRAPH CONVOLUTIONAL NEURAL NET
Based on SR-graphs and the weighted word vectors, we transform the rumor detection task into a binary classification task. Compared with traditional classification problems, key semantic structures and the relation between the source and its replies are especially important for rumor detection. For every tweet and its replies, we build the corresponding word vectors and construct an SR-graph. We propose a Nodes Proportion Allocation Mechanism (NPAM) to build an ensemble deep neural network for different conversations; the ensemble deep neural network contains a Text CNN and a GCN. Assuming that the number of nodes in the current SR-graph is N and the number of nodes in the maximal SR-graph is M, the contribution rate of the TCNN and the GCN for classification is N/M. We call the resulting deep neural network an Ensemble Graph Convolutional Net with Nodes Proportion Allocation Mechanism (EGCN). The structure of an EGCN is shown in Fig. 2.

FIGURE 2. The structure of an EGCN: the word vectors of a conversation are fed to a Text CNN, the word-vector-embedded SR-graph is fed to a spatial-based GCN (graph convolution layers, a SortPooling layer, and 1-dimension convolution layers), and the Nodes Proportion Allocation Mechanism combines the two branches before the full-connection layer and the output layer.

As Fig. 2 shows, assuming that the input of an EGCN is a conversation, the corresponding word vectors and the SR-graph of this conversation are built. The SR-graph and the corresponding word vectors are passed to a Text CNN and a GCN, and their feature outputs are combined in proportion to the rate N/M.

Specifically, assuming that the feature output of the Text CNN is P_T and the feature output of the GCN is P_G, the total feature output of the EGCN is

y = \frac{N}{M} P_G + \left(1 - \frac{N}{M}\right) P_T \qquad (1)

The spatial-based GCN has 4 graph convolution layers, a SortPooling layer, and 3 1-dimension convolution layers. The output of the proposed EGCN is an ensemble of the Text CNN and the spatial-based GCN.
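As a rough sketch of how Formula (1) combines the two branches, the snippet below weights the GCN and Text CNN outputs by N/M and 1 - N/M before a shared fully-connected layer. This is our own PyTorch-style illustration under assumed module interfaces, not the authors' released code.

```python
import torch
import torch.nn as nn

class NPAMEnsemble(nn.Module):
    """Hypothetical sketch of the Nodes Proportion Allocation Mechanism."""
    def __init__(self, text_cnn, gcn, feat_dim, num_classes=2, max_nodes=346):
        super().__init__()
        self.text_cnn = text_cnn      # any module mapping word vectors -> feat_dim
        self.gcn = gcn                # any module mapping (A, X) -> feat_dim
        self.max_nodes = max_nodes    # M: size of the largest SR-graph (346 in Table 4)
        self.fc = nn.Linear(feat_dim, num_classes)

    def forward(self, word_vecs, A, X):
        n = X.size(0)                           # N: nodes in the current SR-graph
        ratio = n / self.max_nodes              # contribution rate N/M
        p_t = self.text_cnn(word_vecs)          # text / local features
        p_g = self.gcn(A, X)                    # global structural features
        y = ratio * p_g + (1.0 - ratio) * p_t   # Formula (1)
        return self.fc(y)
```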
The convolution operation in the Text CNN model can be expressed as follows:

Y = \sum_{i=1} W(i) \ast x + b(i) \qquad (2)

where W(i) denotes the i-th convolution kernel, which can be optimized by the BP algorithm. After the convolution operations, a ReLU activation function is used:

Y_{conv} = \mathrm{ReLU}(Y) = \max(0, Y) \qquad (3)

After the convolution layers, features such as keywords are extracted, and then higher-level features are extracted by the pooling layers. The pooled features can be expressed as Formula (4):

Y_{pool} = P_{max}(Y_{conv}) \qquad (4)

where P_{max} denotes the max-pooling operation. The features extracted by the pooling layers are then passed to a full-connection layer.
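A minimal sketch of the convolution, ReLU, and max-pooling steps in Formulas (2)-(4), written with standard PyTorch layers; the filter size and count are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn

class TextCNNBranch(nn.Module):
    """Sketch of the Text CNN branch: Conv -> ReLU -> max-pool -> flatten."""
    def __init__(self, embed_dim=25, num_filters=32, kernel_size=3):
        super().__init__()
        # Input shape: (batch, embed_dim, sequence_length)
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size)   # Formula (2)
        self.relu = nn.ReLU()                                        # Formula (3)
        self.pool = nn.AdaptiveMaxPool1d(1)                          # Formula (4)

    def forward(self, x):
        y = self.relu(self.conv(x))
        return self.pool(y).squeeze(-1)   # (batch, num_filters), passed to the FC layer
```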
In the GCN part of the EGCN, given a graph A and its node features X, the graph convolution layer takes the following form:

Z = f\left(\tilde{D}^{-1}\tilde{A}XW\right) \qquad (5)

where \tilde{A} = A + I is the adjacency matrix of the graph with self-loops, \tilde{D} is its diagonal degree matrix with \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}, W is a matrix of trainable graph parameters, f is a nonlinear activation function, and Z is the output activation matrix. We stack multiple graph convolution layers as follows:

Z^{t+1} = f\left(\tilde{D}^{-1}\tilde{A}Z^{t}W^{t}\right) \qquad (6)

After several spatial graph convolution layers, a SortPooling layer [21] is used to sort the feature descriptors, each of which represents a vertex. The SortPooling operation defines a sequence of nodes in the graph. Then, the output of the SortPooling layer is passed to the full-connection layer.
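The graph convolution of Formulas (5)-(6) can be written in a few lines. The sketch below is our own dense-matrix paraphrase (with tanh assumed as the nonlinearity f), not the authors' implementation, and the SortPooling step is only indicated by its top-k sorting idea from [21].

```python
import torch
import torch.nn as nn

class GraphConvLayer(nn.Module):
    """One layer of Z^{t+1} = f(D^{-1} A Z^t W^t) with self-loops added."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)   # W^t

    def forward(self, A, Z):
        A_tilde = A + torch.eye(A.size(0), device=A.device)    # A + I
        D_inv = torch.diag(1.0 / A_tilde.sum(dim=1))            # D^{-1}
        return torch.tanh(D_inv @ A_tilde @ self.weight(Z))     # Formula (6)

def sort_pooling(Z, k):
    """Crude SortPooling sketch: order vertices by their last feature
    channel and keep the first k rows (zero-padded if the graph is small)."""
    order = torch.argsort(Z[:, -1], descending=True)
    Z = Z[order]
    if Z.size(0) < k:
        Z = torch.cat([Z, Z.new_zeros(k - Z.size(0), Z.size(1))])
    return Z[:k]
```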
IV. EXPERIMENTS
A. EXPERIMENTAL SETUP
1) Datasets
In this paper, we use the PHEME dataset, which contains Twitter posts collected during breaking news. The five breaking news events in PHEME are as follows [6]:
* Charlie Hebdo: two brothers forced their way into the offices of the French satirical weekly newspaper Charlie Hebdo, killing 11 people and wounding 11 more, on January 7, 2015.
* Ferguson: citizens of Ferguson in Missouri, USA, protested after the fatal shooting of an 18-year-old African American, Michael Brown, by a white police officer on August 9, 2014.
* Gencrash: a passenger plane from Barcelona to Dusseldorf crashed in the French Alps on March 24, 2015, killing all passengers and crew. The plane was ultimately found to have been deliberately crashed by the co-pilot.
* Ottawa Shooting: shootings occurred on Ottawa's Parliament Hill, resulting in the death of a Canadian soldier on October 22, 2014.
* Sydney Siege: a gunman held hostage ten customers and eight employees of a Lindt chocolate cafe located at Martin Place in Sydney on December 15, 2014.

The final dataset contains 5,802 annotated tweets, of which 1,972 were classified as rumors and 3,830 as non-rumors. These annotations are distributed differently across the five events, as shown in Table 1.

TABLE 1. Distribution of categories for the five events in the dataset PHEME.

Event             Rumors         Non-rumors      Total
Charlie Hebdo     458 (22.0%)    1621 (78.0%)    2079
Ferguson          284 (24.8%)    859 (75.2%)     1143
Gencrash          238 (50.7%)    231 (49.3%)     469
Ottawa Shooting   470 (52.8%)    420 (47.2%)     890
Sydney siege      522 (42.8%)    699 (57.2%)     1221
Total             1972 (34.0%)   3830 (66.0%)    5802

The source data are structured as follows. Each event has a directory with two subfolders: rumors and non-rumors. These two folders contain folders named with a tweet ID. The source tweet itself can be found in the source-tweet directory of the tweet in question, and the reactions directory contains the set of tweets responding to that source tweet.
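Given the directory layout just described, one conversation can be loaded roughly as follows. This is a sketch under the assumptions that each tweet is stored as a JSON file with a "text" field and that the label folders follow the spelling of a given PHEME release; field names and spellings may differ in practice.

```python
import json
from pathlib import Path

def load_conversations(event_dir):
    """Yield (label, source_text, reply_texts) for one PHEME event directory."""
    for label in ("rumours", "non-rumours"):       # spelling varies by release
        base = Path(event_dir) / label
        if not base.is_dir():
            continue
        for conv in base.iterdir():
            if not conv.is_dir():
                continue
            src_files = list((conv / "source-tweet").glob("*.json"))
            if not src_files:
                continue
            source = json.loads(src_files[0].read_text())["text"]
            replies = [json.loads(p.read_text())["text"]
                       for p in sorted((conv / "reactions").glob("*.json"))]
            yield label, source, replies
```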
2) Evaluation Measures
Accuracy is often treated as a suitable evaluation measure for a classifier. In this paper, we also introduce three other indices: Precision, Recall, and F1. The evaluation indicators are defined from the following confusion matrix:

         Positive               Negative
True     True Positive (TP)     True Negative (TN)
False    False Positive (FP)    False Negative (FN)

Precision = \frac{TP}{TP + FP} \qquad (7)

Recall = \frac{TP}{TP + FN} \qquad (8)

F1 = \frac{2 \times Precision \times Recall}{Precision + Recall} \qquad (9)

where TP and TN refer to the numbers of instances that are correctly classified, and FP and FN are the numbers of instances that are incorrectly classified.
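For reference, the four indices can be computed directly from the confusion-matrix counts; the small helper below is our own illustration, not part of the paper.

```python
def classification_indices(tp, fp, tn, fn):
    """Accuracy, Precision, Recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0     # Formula (7)
    recall = tp / (tp + fn) if tp + fn else 0.0        # Formula (8)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)              # Formula (9)
    return accuracy, precision, recall, f1
```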
3) Baseline Models
SVM (Content + Context). Support Vector Machines with the cost coefficient selected via nested cross-validation.
Random Forest (Content). Random forest is a classifier that uses multiple trees to train and predict samples.
Naive Bayes (Content). The naive Bayesian method is a classification method based on Bayes' theorem and the assumption of conditional independence between features.
Maximum Entropy (Content). Maximum entropy states that the probability distribution that best represents the current state of knowledge is the one with the largest entropy, given precisely stated prior data.
CRF (Content + Social). A Conditional Random Field is a discriminative probability model; this baseline uses two types of features: content-based features and social features [6].
Text-CNN (Content). A convolutional neural network designed for text data.
Zhao et al., 2015. The classification method proposed by Zhao et al. [7].
TF-IDFB. A novel approach of feature construction that reweights the TF-IDF score of some particular terms taking into account the label information of the training data [8].
PGNN (Content + Structural). In a PGNN, the adjacency relation is transformed into indicator functions in its defined graph convolution to avoid directly using adjacency matrices [22].

B. EXPERIMENTAL RESULTS AND DISCUSSION
In the experiments, the EGCN contains a Text CNN and a GCN. The Text CNN has 3 layers, and the GCN contains 4 graph convolution layers, a SortPooling layer, and 3 1-dimension convolution layers. The Adam method is used, and the initial learning rate is 1e-5. The structure of source tweets and their replies is organized as SR-graphs, and different SR-graphs have different numbers of nodes. Both the local structural information and the global structural information are treated as important features for the rumor detection task in this paper. Our experiments are divided into two parts. First, we verify the validity of the global structural features with the GCN model in the proposed EGCN. Then we explore the effect of word vectors of different dimensions and show that our proposed EGCN model is comparable to or even better than the current state-of-the-art models.

1) Effective GCNs based on global structure features
To verify that the extracted structural features of our SR-graphs are effective, in our first experiment we test the F1 scores of the proposed graph structure without embedding word vectors; the node features are their degrees rather than the weighted word vectors. The test F1 scores are shown in Table 2.

TABLE 2. Test F1 scores of the five events in the dataset PHEME.

Events            Learning rate   Filters     F1 scores
Charlie Hebdo     1e-5            32-16-32    0.793
Ferguson          1e-5            32-16-32    0.730
Gencrash          1e-5            32-16-32    0.553
Ottawa shooting   1e-5            32-16-32    0.551
Sydney siege      1e-5            32-16-32    0.650

Compared with traditional machine learning models, the spatial-based GCN can recognize the structural information of different rumor categories. To show the improvement of the proposed GCN over the traditional machine learning methods, we compare our experimental results with the commonly used rumor detection models; the test F1 scores are shown in Table 3.

TABLE 3. Test F1 scores of the GCN and traditional machine learning methods on PHEME.

                  Charlie Hebdo   Ferguson   Gencrash   Ottawa shooting   Sydney siege
SVM               0.332           0.313      0.483      0.459             0.458
Random Forest     0.209           0.169      0.055      0.097             0.114
Naive Bayes       0.361           0.381      0.643      0.145             0.590
MaxEnt            0.330           0.295      0.458      0.454             0.427
Text-CNN          0.764           0.702      0.613      0.622             0.579
LSTM              0.824           0.775      0.582      0.703             0.624
Zhao et al. [6]   0.094           0.127      0.108      0.109             0.127
GCN               0.793           0.730      0.553      0.551             0.650

As noted in Tables 2 and 3, the SR-graphs in the rumor dataset are helpful for the classification task, which means that both the local and the global structure are helpful for rumor detection.

2) Exploring the structures of conversations
To improve the classification results of the model, we use weighted word vectors to replace the degrees in the node features and introduce a node threshold into the model. If a graph contains few nodes, the structure of the source tweet and the replies is simple, and a Text CNN may be effective for the rumor detection task. According to the SR-graphs in the datasets, the degree of each node can be calculated. When the number of nodes in an SR-graph is greater than 10, the SR-graphs of most conversations become relatively complex, and this complexity is reflected in the fact that different nodes in one complex SR-graph have different degrees. When the node number of an SR-graph is less than 10, the SR-graphs of some conversations are still complex, and the degrees of most nodes are different; however, the SR-graphs of other conversations are relatively simple, which is reflected in the fact that the degrees of all nodes except the source node are close. When the node number of an SR-graph is less than 5, most SR-graphs are simple. Each event has its own node distribution, and the node distributions are shown in Table 4.

TABLE 4. Node distributions of the five events in the dataset PHEME.

Event             max   min   rumors   non-rumors
Charlie Hebdo     346   1     458      1621
Ferguson          289   1     284      859
Gencrash          77    1     238      231
Ottawa shooting   108   1     470      420
Sydney siege      342   1     522      699

As noted in Table 4, the node distributions of different events are unbalanced. The second column shows the maximum number of nodes in the graphs of the corresponding event. The third column shows the minimum number of nodes, and
the last 2 columns show the distribution of rumors and non-rumors.
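To make the size-based distinction concrete, conversations can be grouped by the node counts of their SR-graphs with a few lines of Python. The thresholds 5 and 10 follow the discussion above; the helper itself is our own illustration.

```python
from collections import Counter

def bucket_by_size(node_counts):
    """Group SR-graphs into the size ranges discussed above."""
    def bucket(n):
        if n <= 5:
            return "simple (N <= 5)"
        if n <= 10:
            return "mixed (5 < N <= 10)"
        return "complex (N > 10)"
    return Counter(bucket(n) for n in node_counts)

# Example: bucket_by_size([1, 3, 8, 42]) ->
# Counter({'simple (N <= 5)': 2, 'mixed (5 < N <= 10)': 1, 'complex (N > 10)': 1})
```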
3) Exploring the Optimal EGCN
To further explore the effects of word vectors of different dimensions and construct a satisfactory EGCN model, we test 4 different indices on the PHEME dataset based on word vectors of different dimensions.

TABLE 5. Distribution of conversation sizes (number of nodes N) for the five events in the dataset PHEME.

Event             max   min   N ∈ (0, 10]   N ∈ (10, ∞)
Charlie Hebdo     346   1     687           1392
Ferguson          289   1     379           764
Gencrash          77    1     299           170
Ottawa shooting   108   1     387           503
Sydney siege      342   1     328           893

Next, we test the classification indices, aiming to find an optimal dimension of the word vectors. The test classification indices are shown in Table 6. As Table 6 shows, word vectors of different dimensions lead to different test indices. When the dimension of the word vectors is 25, the average test indices are satisfactory. After excluding meaningless words, the length of most tweets is relatively short, and for commonly used words a low-dimension word vector can achieve satisfactory results. Therefore, we use weighted 25-dimension word vectors as the node features in SR-graphs, and the structure of the proposed EGCN model is then fixed. Table 7 shows the final test indices of the EGCN in our experiments.

To further illustrate the effectiveness of the proposed EGCN, we compare the indices of the EGCN with the commonly used rumor detection models. In reference [22], a PGNN is proposed for a four-class rumor detection task. A PGNN is a kind of GCN, and the adjacency relation in a PGNN is transformed into indicator functions in the graph convolution to avoid directly using adjacency matrices. The F1 scores and other indices are shown in Table 8.

In Table 8, there are 5 events and 15 indices. R.F denotes the Random Forest algorithm, and N.B is the Naive Bayes method. We use bold black fonts to mark the optimal solutions and bold blue fonts to mark the suboptimal solutions. As Table 8 shows, the proposed EGCN achieves at least one best result in every event and almost all of the optimal or suboptimal solutions. Overall, the EGCN obtains the 6 best results and 2 suboptimal solutions on the PHEME dataset and performs better than the other models. For comparison, we use an LSTM and a Text CNN, which are commonly used for classification, and the results show that the proposed EGCN performs better than these two commonly used models. The proposed EGCN uses both text features and structural features for classification, and the experiments verify that the extracted text features and structural features are effective for the rumor detection task. Moreover, we also add a state-of-the-art graph neural network, the PGNN; a PGNN is a kind of GCN, and the adjacency relation in a PGNN is transformed into indicator functions in the graph convolution to avoid directly using adjacency matrices. The experiments show that the proposed EGCN uses the structural information and the text information more effectively and performs better than the PGNN on most indices.

V. CONCLUSIONS AND OUTLOOKS
For the rumor detection task, we propose a deep neural network that transforms the rumor detection problem into a classification problem. To obtain satisfactory classification results, we train word vectors based on the Word2Vec model and propose an SR-graph for every source tweet and its replies. Based on the SR-graphs and the corresponding word vectors, we train an EGCN model that achieves comparable or even better results than state-of-the-art machine learning models. In the experiments, we find that the word vectors are very important for the final performance of the proposed EGCN model, and we use an existing word-embedding model to train the word vectors in this paper. However, we suppose that a word-embedding model designed for Twitter datasets might work better than the existing models. Therefore, we will design an unsupervised word-embedding neural network for Twitter in our next study.

VI. ACKNOWLEDGMENT
This work is supported by the National Key Research and Development Plan (No. 2016YFC0600908) and the National Natural Science Foundation of China under Grants No. 61876186 and No. 61977061.

TABLE 6. The test results of different dimensional word vectors in the dataset PHEME.

Events            Embedding dimension   Number of filters   Accuracy   Precision   Recall   F1
Charlie Hebdo 50 32-16-32 0.798 0.822 0.934 0.875
Charlie Hebdo 25 32-16-32 0.841 0.882 0.911 0.896
Charlie Hebdo 10 32-16-32 0.793 0.801 0.982 0.883
Charlie Hebdo 5 32-16-32 0.817 0.821 0.944 0.878
Ferguson 50 32-16-32 0.747 0.774 0.921 0.841
Ferguson 25 32-16-32 0.809 0.821 0.956 0.883
Ferguson 10 32-16-32 0.756 0.833 0.815 0.824
Ferguson 5 32-16-32 0.739 0.743 0.929 0.825
Gencrash 50 32-16-32 0.617 0.581 0.720 0.643
Gencrash 25 32-16-32 0.638 0.800 0.667 0.727
Gencrash 10 32-16-32 0.681 0.704 0.731 0.717
Gencrash 5 32-16-32 0.638 0.591 0.650 0.619
Ottawa shooting 50 32-16-32 0.730 0.707 0.547 0.617
Ottawa shooting 25 32-16-32 0.685 0.725 0.690 0.707
Ottawa shooting 10 32-16-32 0.663 0.543 0.532 0.538
Ottawa shooting 5 32-16-32 0.708 0.745 0.583 0.654
Sydney siege 50 32-16-32 0.723 0.711 0.738 0.724
Sydney siege 25 32-16-32 0.683 0.702 0.835 0.763
Sydney siege 10 32-16-32 0.650 0.649 0.632 0.640
Sydney siege 5 32-16-32 0.602 0.605 0.645 0.624

TABLE 7. The best test results of the proposed EGCN in the dataset PHEME.

Events            Embedding dimension   Accuracy   Precision   Recall   F1
Charlie Hebdo 25 0.841 0.882 0.911 0.896
Ferguson 25 0.809 0.821 0.956 0.883
Gencrash 25 0.638 0.800 0.667 0.727
Ottawa shooting 25 0.685 0.725 0.690 0.707
Sydney siege 25 0.683 0.702 0.835 0.763

TABLE 8. Test results (Precision P, Recall R, and F1, in %) of the EGCN and the baseline models on the five events in the dataset PHEME.

              Charlie Hebdo        Ferguson             Gencrash             Ottawa shooting      Sydney siege
              P     R     F1       P     R     F1       P     R     F1       P     R     F1       P     R     F1
SVM 23.9 54.6 33.2 24.0 45.1 31.3 46.3 50.4 48.3 49.6 42.8 45.9 43.5 48.5 45.8
R. F 21.5 20.3 20.9 25.4 12.7 16.9 43.8 2.9 5.5 55.6 5.3 9.7 46.6 6.5 11.4
N. B 22.3 96.1 36.1 24.8 82.0 38.1 50.6 88.2 64.3 43.6 8.7 14.5 42.6 96.2 59.0
MaxEnt 23.9 53.5 33.0 24.5 37.0 29.5 47.5 44.1 45.8 51.2 40.9 45.4 42.5 42.9 42.7
Zhao.[6] 63.6 5.7 9.4 35.5 7.7 12.7 26.8 5.9 10.8 65.1 6.0 10.9 42.9 7.5 12.7
CRF [7] 54.5 76.2 63.6 56.6 39.4 46.5 74.3 66.8 70.4 84.1 58.5 69.0 76.4 38.5 51.2
TF-IDF [8] 86.8 90.6 89.2 78.0 96.3 86.4 51.7 95.2 69.2 56.1 93.3 70.2 64.7 93.2 75.9
Text CNN [8] 78.2 74.7 76.4 77.1 64.4 70.2 66.7 56.7 61.3 76.2 52.5 62.2 74.7 47.3 57.9
LSTM 83.2 81.6 82.4 73.3 82.2 77.5 67.1 51.4 58.2 77.0 64.7 70.3 75.5 53.2 62.4
PGNN 84.7 97.6 90.7 76.7 98.5 86.3 84.6 62.5 71.9 65.9 75.8 70.5 65.8 87.6 75.1
EGCN 88.1 91.1 89.6 82.1 95.6 88.3 80.0 66.7 72.7 72.5 69.0 70.7 70.2 83.5 76.3

REFERENCES
[1] L. Wong, J. Burkell, Motivations for Sharing News on Social Media, SMSociety, 2017, vol. 57, pp. 1-5.
[2] B. Fang, Y. Jia, Y. Han, et al., A survey of social network and information dissemination analysis, Chinese Science Bulletin, 2014, vol. 59, no. 32, pp. 4163-4172.
[3] A. Zubiaga, A. Aker, K. Bontcheva, et al., Detection and Resolution of Rumors in Social Media: A Survey, ACM Comput. Surv., 2018, vol. 51, no. 2, pp. 32:1-32.
[4] P. Tolmie, R. Procter, D. Randall, et al., Supporting the use of user generated content in journalistic practice, Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 2017, pp. 3632-3644.
[5] Z. Zhao, P. Resnick, Q. Mei, Enquiring minds: Early detection of rumors in social media from enquiry posts, Proceedings of the 24th International Conference on World Wide Web, 2015, pp. 1395-1405.


[6] A. Zubiaga, M. Liakata, R. Procter, Exploiting Context for Rumour Detection in Social Media, International Conference on Social Informatics, 2017.
[7] U. Bhattacharjee, P. Srijith, S. Maunendra, Term Specific TF-IDF Boosting for Detection of Rumours in Social Networks, Proceedings of the 11th International Conference on Communication Systems and Networks, 2019.
[8] L. Tolosi, A. Tagarev, G. Georgiev, An analysis of event-agnostic features for rumour classification in Twitter, Proceedings of the ICWSM Workshop on Social Media in the Newsroom, 2016.
[9] R. McCreadie, C. Macdonald, I. Ounis, Crowdsourced rumour identification during emergencies, Proceedings of the 24th International Conference on World Wide Web, 2015, pp. 965-970.
[10] A. Bondielli, F. Marcelloni, A survey on fake news and rumour detection techniques, Information Sciences, 2019, vol. 497, pp. 38-55.
[11] T. Mikolov, I. Sutskever, K. Chen, et al., Distributed Representations of Words and Phrases and their Compositionality, Proceedings of the 26th Advances in Neural Information Processing Systems, 2013, vol. 26, pp. 3111-3119.
[12] W. Ke, Y. Song, K. Zhu, False rumors detection on Sina Weibo by propagation structures, IEEE International Conference on Data Engineering, 2015.
[13] E. Pamungkas, V. Basile, V. Patti, Stance Classification for Rumour Analysis in Twitter: Exploiting Affective Information and Conversation Structure, arXiv:1901.01911, 2019.
[14] T. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, International Conference on Learning Representations, 2017.
[15] J. Bruna, W. Zaremba, A. Szlam, et al., Spectral Networks and Locally Connected Networks on Graphs, International Conference on Learning Representations, 2014.
[16] M. Henaff, J. Bruna, Y. LeCun, Deep Convolutional Networks on Graph-Structured Data, Proceedings of the 28th Advances in Neural Information Processing Systems, 2015.
[17] M. Defferrard, X. Bresson, P. Vandergheynst, Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering, Proceedings of the 29th Advances in Neural Information Processing Systems, 2016.
[18] J. Atwood, D. Towsley, Diffusion-Convolutional Neural Networks, Proceedings of the 30th Advances in Neural Information Processing Systems, 2016.
[19] L. Yao, C. Mao, Y. Luo, Graph Convolutional Networks for Text Classification, Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 2019.
[20] J. Masci, D. Boscaini, M. Bronstein, et al., Geodesic convolutional neural networks on Riemannian manifolds, IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[21] M. Zhang, Z. Cui, M. Neumann, et al., An end-to-end deep learning architecture for graph classification, Proceedings of the 32nd AAAI Conference on Artificial Intelligence, 2018.
[22] Z. Wu, D. Pi, J. Chen, et al., Rumor detection based on propagation graph neural network with attention mechanism, Expert Systems with Applications, 2020.

