Earth Science Informatics (2023) 16:329–343

https://doi.org/10.1007/s12145-023-00940-w

RESEARCH

Detection of hateful twitter users with graph convolutional network model
Anıl Utku1   · Umit Can1   · Serpil Aslan2 

Received: 20 November 2022 / Accepted: 12 January 2023 / Published online: 19 January 2023
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023

Abstract
Today, hate speech is widespread and persistent in various forms on social networking platforms, targeting different minority groups. These attacks can be based on factors such as race, religion, gender, and physical disability. Considering the number of people and their interactions, social networks are the most important channels through which these discourses spread. A social network can be viewed as a set of nodes and edges and is therefore well suited to a graph representation. The multidimensional structure of social networks carries social network data from Euclidean space to non-Euclidean space, where a graph structure represents the data effectively. In this respect, solving the hate speech problem with graph-based methods in a complex dimensional space can produce better results. In this study, a method based on the Graph Convolutional Network (GCN) model, which is rarely used in this field, was proposed for the detection of hateful Twitter users in social networks. Well-known machine learning methods were used as baselines to measure the performance of this method. According to the results obtained, the proposed GCN model gave the most successful result.

Keywords  Hate speech detection · Graph convolutional network · Deep learning · Machine learning

Communicated by: H. Babaie

* Umit Can
Anıl Utku
Serpil Aslan

1 Computer Engineering Department, Munzur University, 62000 Tunceli, Turkey
2 Software Engineering Department, Turgut Özal University, 44210 Malatya, Turkey

Introduction

Due to the circumstances of the age we live in, social media has a significant role in our lives. On social media platforms, which have a huge impact on society, users can freely share their ideas, feelings, and thoughts. This freedom has also brought the sharing of hateful and disturbing content (Bölücü and Canbay 2021). Although there is no general definition of Hate Speech (HS), it is a social media problem that can be considered a criminal attempt, seen in all scenarios where social media users intentionally or unintentionally post messages or comments that may insult an individual or a segment of society (MacAvaney et al. 2019). It can also be defined as language that encourages, incites, or increases violence against certain groups based on certain characteristics. HS should not be confused with freedom of expression. This is because, although social media is interpreted as a space for sharing thoughts freely when its purpose is considered, freedom of expression and press must be absolute. Posts encouraging violence in social media environments are considered violent crimes today (Gitari et al. 2015). Detection of such posts is very important for public authorities and social media networks.

The history of social networks is as old as the human species. The transition to an agricultural society and growing urban civilizations made the networks between human communities larger and more complex (Knoke and Yang 2019). While social networks were mainly the field of social sciences at first, the study of them has now become an interdisciplinary research field with the contributions of Sociology, Social Psychology, Anthropology, Physics, Mathematics, Computer Science, and other disciplines. In a study conducted in 1978 to understand the social network structure among scientists
working in various departments at a famous university in the USA, the social network formed among the astronomy department itself is shown in Fig. 1. In the seventies, technological possibilities limited the analysis of large amounts of data and therefore of large-scale networks. Later, with the internet revolution, technology and communication opportunities increased enormously. This revolution enabled the establishment of large-scale social networks in which millions of people interact socially online. From this point on, the online network structures that we call Online Social Networks (OSNs) were formed.

Fig. 1  The social network structure of the statistics department (Friedkin 1978)

Social networking environments provide new possibilities for presenting personal expressions, creating communities with common interests, collaboration, and sharing (Murray 2008). OSN sites are web-based services that allow users to create public or semi-public profiles, add the people they are connected with to their friend lists, and visit and review the profiles of the people on their lists. Social networks attract millions of users thanks to features that allow sharing of a wide variety of content and profile information such as photos, videos, and texts. Therefore, the use of social networks among internet users is spreading daily. For example, the number of users of Facebook, the popular OSN platform, approached three billion, and YouTube reached two and a half billion users as of the beginning of 2022 (Most popular social networks worldwide as of January 2022, ranked by the number of monthly active users 2022).

Twitter is one of today's most popular social media platforms; it prioritizes sharing meaningful content, and people use it to share their feelings and opinions about certain topics (Nielsen 2011). Data on Twitter is huge and unstructured. For these reasons, it is difficult for people to analyze Twitter data and obtain subjective information. Recently, sentiment analysis was applied in various fields such as commercial, political, and social media. Sentiment analysis methods, as in many social media platforms, are the most important tools that enable us to obtain the feelings and opinions of individuals from Twitter data. Sentiment analysis can help organizations such as companies and even governments gain insight into people's views on their decisions. Sentiment analysis and opinion mining led to an increase in research and techniques, especially in the field of Natural Language Processing (NLP), in the last decade. Although such work mostly investigates the feelings and thoughts of users about a certain product or event, research was carried out recently to detect many situations that may negatively affect social and ethical issues, such as fake news, extremism, and violence in social media environments. HS, also called toxic hate speech, is a phenomenon that can easily spread in social media environments. In recent years, social media tools became frequently used tools for the dissemination of both authorized information and misinformation. HS is a prime example of how online media's excellent dissemination capability turns into both opportunities and challenges. For this reason, HS detection in social media has recently become the focus of attention of many researchers in the field of NLP. Developing automatic tools can be one of the most critical measures to prevent dangerous tendencies such as the escalation of violence and hatred.

Social media platforms and media tools should provide proven, unbiased information that is constructive and positive instead of providing false information that will divide society and create negative emotions such as fear, anxiety, and stress. From this perspective, the purpose of HS detection is to distinguish hateful users from non-hateful users. Many studies have examined the analysis of such shared information, and HS detection is a widely researched task with many challenges (Poletto et al. 2021). The majority of studies in the literature analyzed social network posts using various methods to determine whether a post contains hate speech or not. In this study, in addition to analyzing Twitter posts, hateful users were detected by using users' features such as tweeting frequency, number of followers, number of favorites, and retweets. Therefore, a graph-based GCN embedding approach was proposed to detect hateful users on Twitter. The GCN method allows us to analyze Twitter data more comprehensively than traditional machine learning models. Here, Decision Tree (DT), Linear Regression (LR), Naïve Bayes (NB), Random Forest (RF), Support Vector Machine (SVM), Multilayer Perceptron (MLP), and Bidirectional Long-Short Term Memory (Bi-LSTM) methods were used to measure the performance of the proposed GCN method. GCNs are neural networks that can be applied directly to graphs and provide an easy way to perform node-level, edge-level, and graph-level prediction tasks. Unlike CNNs, GCNs can operate on irregular, graph-structured data. Experimental results demonstrate the success of the proposed model. The results obtained from this study will make valuable contributions to the detection of HS discourses that trigger feelings and thoughts of hatred, violence, fear, and aggression in the public in social media environments. It can
also help government agencies and social media platform providers plan early to take action against toxic HS.

Motivation and contribution

With the developing technology, the views expressed on Twitter, one of today's most important social media platforms, often cause controversy. Different opinions and controversial content directly affect interactions between communities. Recently, controversial content was replaced by offensive content and hate speech. Automatically detecting users who post such content, which can cause social unrest, is challenging. Most of the existing approaches in the literature focus only on the content of hateful messages (Mulki et al. 2019; Albadi et al. 2018; Roy et al. 2020). To overcome this limitation, we propose a novel GCN-based approach for hateful user detection that leverages Twitter data to use both the message content and the structured network features of Twitter users.

The main contributions of the present study are summarized as follows:

• In the proposed model, the ability of GCNs to encode relational data is exploited to create a network embedding vector model that combines tweet content and Twitter user information. Since the generated network has both semantic and structural properties, it contains more comprehensive information for HS detection compared to classical models.
• With the GCN network model, each node collects the feature information of its neighboring nodes in the convolution layer. Then, a combined feature set is obtained. New results are obtained by applying non-linear transformations to the obtained features. Also, a three-stage GCN was used to extract more meaningful features in the proposed model. In the proposed GCN network model, the output of each convolutional layer feeds the next convolutional layer as input. This way, similar features between nodes are captured while transferring information between layers. The obtained information provides a significant advantage when identifying hateful users.
• Experiments show the positive effect of combining textual features and structural information in a graph, in terms of performance and accuracy.

Related work

Considering the effect of online social media environments on the spread of HS, the amount of research and the number of techniques produced in the literature are increasing day by day. There are many studies in this field, many datasets used, and different techniques proposed. Most of the work done to detect HS is based on Text Mining techniques. Poletto et al. (2021) presented a detailed analysis of the methods proposed in the literature for the detection of HS. The authors analyzed the proposed models, dictionaries, and resources used in different categories. Many studies analyze methods for detecting HS in this way (Al-Hassan and Al-Dossari 2019; Naseem et al. 2021; Cao et al. 2020).

The studies proposed in the literature generally range from traditional machine learning models based on dictionary-based approaches to new methods based on deep learning, such as convolutional neural networks (CNN) and recurrent neural networks (RNN) using attention mechanisms (Badjatiya et al. 2017; Gröndahl et al. 2018). There are also various HS detection studies in the literature that were developed using machine learning algorithms for different languages. Rangel et al. (2021) tried to identify possible spreaders of hate in online environments by analyzing the posts of social media phenomena. To this end, they presented hate speech corpora for the English and Spanish languages. In their study, Abro et al. (2020) proposed machine-learning approaches using datasets with three different class labels for HS detection. The authors achieved the best performance in the experimental results with the bi-gram and support vector machine (SVM) classifier. They also emphasized that they will use different feature engineering techniques and artificial intelligence algorithms in their future studies to achieve higher performance. Gitari et al. (2015) proposed a new dictionary-based approach for HS detection in their study. The dictionary, which the authors created with a rule-based approach, focuses on the extraction of subjective features. Experimental results revealed the benefit of adding semantic and topic-based features to the proposed dictionary-based approach. Nobata et al. (2016) proposed a machine learning-based approach for detecting abusive language on online platforms, which analyzes different levels of abusive language over time. In addition, the authors created a corpus of user comments for abusive language. A new supervised attention approach based on keyword extraction and a word graph built on BERT's attention mechanism was proposed for HS detection (Sarracén and Rosso 2021). When extracting keywords, the authors proposed a model that uses the differences between hateful and non-hateful texts and the harmonic mean of the words' relative frequencies. Roy et al. (2020) proposed a deep learning-based approach based on the deep convolutional neural network model (DCNN) for HS detection using a shared Twitter dataset in English. In the proposed model, the GloVe word embedding approach is used for word representation. MacAvaney et al. (2019) proposed a new approach based on SVM. In their analysis, the authors argued that SVM would be simpler and easier to interpret than neural network methods involving attention mechanisms. Garain and Basu (2019) used the RNN-based
Bi-LSTM deep learning approach in the model they developed for HS detection using a Twitter dataset on women and refugees. The main disadvantages of the proposed model are that the authors used a single classifier in the experimental results and did not use any data other than that single dataset. Zimmerman et al. (2018) proposed a new approach in their study, in which they combined ten CNNs with different initial weights into an ensemble to maximize the power of CNNs to improve HS detection.

In Table 1, the results of the proposed GCN-based model were compared in terms of the accuracy metric with the studies conducted in the literature to detect hate speech on Twitter. It can be observed that the proposed approach displayed a higher classification performance compared to other proposed approaches in the current literature.

Graph Neural Networks (GNNs) are deep learning-based approaches used to manipulate data that can be represented as graphs. The most basic part of GNNs is a graph. GNNs became a powerful tool for many important applications as graphs became more common and rich with information and neural networks became more popular and capable. In various studies, researchers used the GNN graph structure, which is effectively used in the solution of non-regular, multidimensional, and complex problems in non-Euclidean data space. Zhang et al. (2019) categorized the first studies with GNNs in the literature as Recurrent Graph Neural Networks (RecGNNs). However, the high computational cost of these models emerged as an important issue. To solve this problem, the researchers used a new approach called convolutional GNNs (GCN), which can be applied to graphs. While developing this method, the success of CNNs in image processing has been a source of inspiration. GCNs are divided into spectral-based and spatial-based. In the spectral-based approach, the models are based on spectral graph theory and the signals in a graph are filtered using the eigendecomposition of the graph Laplacian. In the spatial-based approach, on the other hand, convolution operations are performed directly on graphs, and graph convolution is represented as a combination of feature information from node neighbors (Zhang et al. 2019). Later, Wu et al. (2020) proposed a new taxonomy. Accordingly, GNNs were divided into four groups: recurrent GNNs (RecGNNs), convolutional GNNs (ConvGNNs), graph autoencoders (GAEs), and spatial–temporal GNNs (STGNNs).

Today, GNNs are used especially for the analysis of social networks. In particular, Graph Autoencoders (GAEs) are unsupervised learning frameworks that encode nodes and graph structures in social networks into a hidden vector space. To this end, they reconstruct the original graph inputs from embedding vectors. In doing this, a graph taken as input is first converted to a low-dimensional vector via an encoder. Then, the vector generated in the previous step is converted back to the original input via a decoder. In this way, the information loss between the input graph and the output graph is minimized (Wu et al. 2020). Peng et al. (2018) proposed a graph-based deep learning approach that converts the input text into graphs. Then, they performed text classification using GCNs. Mishra et al. (2019) proposed a GCN model that captures the linguistic behavior of social media users for abusive language detection on online social media platforms. This proposed model is one of the first approaches in the literature to use GCNs for the detection of HS.
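
The encoder-decoder idea behind graph autoencoders can be illustrated with a minimal NumPy sketch. It is not taken from any of the cited works: the one-layer graph-convolutional encoder, the inner-product decoder, the cross-entropy reconstruction loss, and all function names (`normalize_adjacency`, `encode`, `decode`, `reconstruction_loss`) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, with self-loops added."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def encode(A, X, W):
    """Assumed one-layer encoder: low-dimensional node embeddings Z = A_norm X W."""
    return normalize_adjacency(A) @ X @ W

def decode(Z):
    """Assumed inner-product decoder: edge probabilities sigmoid(Z Z^T)."""
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))

def reconstruction_loss(A, A_rec, eps=1e-9):
    """Binary cross-entropy between the input graph and its reconstruction."""
    return float(-np.mean(A * np.log(A_rec + eps)
                          + (1.0 - A) * np.log(1.0 - A_rec + eps)))

# Toy graph with 4 nodes (hypothetical data); identity features, random weights.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.eye(4)
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 2))   # embeds each node in 2 dimensions

Z = encode(A, X, W)                      # input graph -> low-dimensional vectors
A_rec = decode(Z)                        # vectors -> reconstructed graph
loss = reconstruction_loss(A, A_rec)     # quantity a trained GAE would minimize
```

In a real graph autoencoder, `W` would be learned by gradient descent on `loss`; the sketch only shows how the input graph, the embedding, and the reconstruction relate to each other.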

Table 1  Comparison of the proposed GCN-based model with other models in the literature

Model                                 Accuracy   Reported by
The Proposed GCN based Model          0.930      Ours
NB                                    0.903      Mulki et al. (2019)
SVM                                   0.832
GRU-based RNN                         0.790      Albadi et al. (2018)
Sentiment-based, Semantic Unigram     0.754      Watanabe et al. (2018)
BiLSTM                                0.752      Naseem et al. (2021)
Logistic Regression                   0.750      Dukic and Krzic (2021)
SVM                                   0.790      Abro et al. (2020)
AdaBoost                              0.780
Rule based                            0.734      Gitari et al. (2015)
Voted Ensemble Classifier             0.890      Burnap and Williams (2015)
2CNN                                  0.920      Roy et al. (2020)
Multi-view SVM                        0.803      MacAvaney et al. (2019)
BiLSTM                                0.703      Garain and Basu (2019)
GraphSage                             0.90       Ma et al. (2019)
GCN with additional conv layer        0.814      Bölücü and Canbay (2021)
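
The spectral-based and spatial-based views of graph convolution described in this section can be contrasted in a short NumPy sketch. It is a sketch under stated assumptions rather than the implementation of any cited model: the function names, the toy path graph, and the low-pass response 1 − λ/2 are all chosen here for illustration.

```python
import numpy as np

def normalized_laplacian(A):
    """Symmetrically normalized Laplacian I - D^{-1/2} A D^{-1/2}."""
    deg = A.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    return np.eye(A.shape[0]) - A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def spectral_filter(A, x, response):
    """Spectral view: transform x with the Laplacian eigenvectors, scale, transform back."""
    lam, U = np.linalg.eigh(normalized_laplacian(A))  # Laplacian = U diag(lam) U^T
    x_hat = U.T @ x                                   # graph Fourier transform of x
    return U @ (response(lam) * x_hat)                # inverse transform of filtered spectrum

def neighbor_mean(A, x):
    """Spatial view: average each node's value with the values of its neighbours."""
    A_self = A + np.eye(A.shape[0])
    return (A_self @ x) / A_self.sum(axis=1)

# Toy path graph 0 - 1 - 2 - 3 with a signal concentrated on node 0 (hypothetical data).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
x = np.array([1.0, 0.0, 0.0, 0.0])

L = normalized_laplacian(A)
smoothed = spectral_filter(A, x, lambda lam: 1.0 - lam / 2.0)  # assumed low-pass response
averaged = neighbor_mean(A, x)
```

With this particular response, the spectral filter equals multiplying `x` by `I - L/2`, which only mixes each node with its direct neighbours. This is one way to see why simple spectral filters and spatial neighbourhood aggregation are closely related.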

Base models

In this study, the proposed GCN model and effective machine learning methods were used for the detection of hateful users. In this section, these methods are briefly introduced.

• DT: This model is an example of a decision structure with a tree shape, and its classes are determined via induction from known sample data (Song and Ying 2015). DTs are among the most widely used classification and regression models. The reason for this is that decision tree methods are easy to interpret and can be easily integrated with other systems. They therefore yield understandable and reliable rules. DTs are built top-down from the training data, from general to specific. The tree starts with a root node containing all the data in the sample (Safavian and Landgrebe 1991). The internal nodes of the DT represent the terms and the leaf nodes represent the classes.
• LR: It is a method based on statistics, which is used to reveal the cause-and-effect relationship between a dependent variable and one or more independent variables. The regression model associates the dependent variable with the independent variable or variables through a function (Luu et al. 2021). The LR method can use a single independent variable or several. When more than one independent variable is used, it is called multiple linear regression.
• NB: It is a simple probability-based classification method based on Bayes' theorem (Rish 2001). The NB method is a simple yet powerful algorithm for predictive modeling. Among the classification methods, its features such as multi-class support, categorical prediction support, prediction speed, memory usage, and interoperability are superior to other methods. It has successful applications in areas such as software defect prediction, health care, cybersecurity, and education (Wickramasinghe and Kalutarage 2021), and it is one of the most widely used classification and prediction algorithms, especially in the signal and image processing fields.
• RF: This effective algorithm is one of the ensemble learning methods. Ensemble learning methods aim to improve results by combining different models. The RF algorithm is obtained from a collection of multiple decision trees (Breiman 2001). The RF algorithm offers a fast, flexible and powerful structure for analyzing high-dimensional data (Antoniadis et al. 2021). It is an unusually capable algorithm that can handle thousands of variables without deleting them or degrading their accuracy. RF was successfully applied in many fields such as chemoinformatics, ecology, 3D object recognition, bioinformatics, spam detection, credit card fraud detection, text classification, and event prediction (Biau and Scornet 2016).
• SVM: This method is one of the widely used supervised machine learning techniques for classification and regression and was developed by Vapnik (Cortes and Vapnik 1995). SVM is a learning method developed in the field of statistical learning theory. SVM first transfers the data to a higher dimension where it can be separated linearly. The main purpose of SVM is to find a function in a multidimensional space that can separate the training data with known class labels. SVM was applied in various fields such as image classification, speech detection, text classification, cyberbullying detection, and face detection (Pradhan 2012).
• MLP: It is a feed-forward neural network with one or more layers between the input and output layers. In this model, the input layer is the layer where the incoming information is received and directed to the hidden layer for the purpose of learning. There can be one or more hidden layers, where the learning process is performed, and the layer where the information output is provided is called the output layer (Okwu and Tartibu 2021). Each neuron in a layer is connected to each neuron in the adjacent layers (Area and Mesra 2012). Although there is no fixed rule for the number of hidden layers and the number of neurons in each hidden layer, they are two important factors that affect the quality of training.
• Bi-LSTM: This model is very successful in classification and prediction tasks. Bi-LSTM can learn long-term dependencies using the past and future data present in the time series. To capture past and future information, the Bi-LSTM architecture has a forward and a backward layer made up of LSTM units. In doing so, it does not take into account unnecessary contextual information (Zhang et al. 2020; Jang et al. 2020).
• GCN: This is a multilayer neural network model running on a graph. GCN is a strong version of the GNN and was derived from Graph Spectral Theory (Chung 1997). GCN works on vectors and matrices like classical neural network models. A vector of numeric values reflecting pertinent information about nodes and their connections is produced by GCN from structured graph data. The term "graph embedding" refers to this vector representation. Machine learning frequently uses this technique to turn complex information into a structure that can be differentiated and learned. Building on the CNN model, GCN aims to take the concept of convolution beyond the simple two dimensions. Convolution takes a small sub-segment of the image and applies a convolution function
to it, producing a new part. Here, the central node collects information from its neighbors and itself to generate a new value. GCN uses the properties of neighboring nodes to create an embedded node structure (Kipf and Welling 2017). GCN maps nodes into a d-dimensional embedding space so that similar nodes in the graph are placed close together. The aim is to map the nodes so that similarity in the embedding space approximates similarity in the network.

Methodology and materials

Description of the model

With the increased use of the Internet and the spread of social media platforms, the Web content produced is also increasing rapidly. Social media platforms generally allow users who are connected or like-minded and have common interests to interact. However, posts that some users perceive as usual or entertaining may be disturbing for other users. In addition, social media platforms provide users with the opportunity to share all kinds of thoughts. Accordingly, expressions containing hate speech are also increasing.

Traditional machine learning and deep learning models need a suitable structure to model the relationships and interactions between users on social media platforms. However, modeling user interactions with a graph structure also reveals hidden connections. For this reason, the use of graph-based models comes to the fore in studies using social media data. In this study, a GCN-based detection system was proposed to detect hateful users. The developed graph structure aims to model user interactions and connections more accurately. The proposed model was extensively compared with DT, LR, NB, RF, SVM, MLP, and Bi-LSTM using accuracy, precision, recall, and F1-score.

Dataset

In this study, a dataset created by researchers at Universidade Federal de Minas Gerais in Brazil and made publicly available on Kaggle was used. The dataset consists of 100,386 Twitter users and their activities.

For each user, there are features such as tweeting frequency, number of followers, number of favourites, and number of hashtags. In addition, dictionary analysis derived using the last 200 tweets of each user enabled the acquisition of language content-related features. The Empath tool was used to analyse each user's vocabulary for categories such as love, violence, community, warmth, ridicule, independence, envy, and politics, and to assign numerical values indicating the user's agreement with each category.

The dataset contains 204 attributes to characterize each user as hateful or normal. There are important relationships between Twitter users in the dataset. If a user retweeted another user, those users are considered linked. About 5000 of the users in the dataset are labeled as hateful or normal. The users_anon_neighborhood.csv file included in the dataset contains the number of people tweeted for the 1-hop neighbourhood as well as various attributes for each user. The first 5 lines and the first 10 attributes of this file are shown in Fig. 2.

The distribution of users in the dataset is 95415 other, 4427 normal and 544 hateful. Of the 1039 node features in the dataset, only 206 attributes based on the user's attributes and the tweet dictionary are used. Since there are 206 features in the dataset, Fig. 3 presents a heatmap showing the relationships among a sample consisting of the first 10 features.

The features used are normalized. The dataset is divided into 80% for the training set and 20% for the test set. The training data is further divided into 90% for training and 10% for validation. After the split, the shape of the train set was (1211, 204) and the shape of the test set was (303, 204). The adjacency matrix showing the relationships between the users in the dataset is shown in Fig. 4.

Because so many users follow each user, only a sample of the adjacency matrix is shown: Fig. 4 presents the adjacency matrix for the first 5 users in the dataset.
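
The preprocessing described above could be sketched as follows. This is a hedged illustration on toy data, not the authors' code: the text does not state which normalization was applied, so min-max scaling is an assumption, as are the undirected treatment of retweet links, the fixed random seed, and the helper names `build_adjacency`, `min_max_normalize`, and `split_indices`.

```python
import numpy as np

def build_adjacency(n_users, retweet_edges):
    """Adjacency matrix in which two users are linked if one retweeted the other.
    Treating the link as undirected is an assumption of this sketch."""
    A = np.zeros((n_users, n_users), dtype=np.float32)
    for src, dst in retweet_edges:
        A[src, dst] = 1.0
        A[dst, src] = 1.0
    return A

def min_max_normalize(X):
    """Scale every feature column to [0, 1]; constant columns are mapped to 0."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span

def split_indices(n, test_frac=0.2, val_frac=0.1, seed=0):
    """80/20 train/test split, then 90/10 train/validation split of the training part."""
    idx = np.random.default_rng(seed).permutation(n)
    n_test = int(round(n * test_frac))
    test, rest = idx[:n_test], idx[n_test:]
    n_val = int(round(len(rest) * val_frac))
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test

# Toy stand-in for the real data: 10 users, 4 features, 3 retweet links (all hypothetical).
rng = np.random.default_rng(0)
X = min_max_normalize(rng.normal(size=(10, 4)))
A = build_adjacency(10, [(0, 1), (1, 2), (3, 4)])
train_idx, val_idx, test_idx = split_indices(10)
```

The index arrays can then be used to select rows of `X` and the matching labels, while `A` stays whole so that the graph model still sees every connection.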

Fig. 2  The first 5 lines and the first 10 attributes of users_anon_neighborhood.csv file

Fig. 3  Heatmap of the first ten features in the dataset

Fig. 4  The adjacency matrix

Proposed GCN based model

The GNN model is a type of machine learning algorithm capable of extracting significant information from graphs and producing useful predictions. GNNs emerged as a potent tool for numerous critical applications as graphs became more prevalent and information-rich and as neural networks gained popularity and capability. It is a widely used model today, especially for the analysis of social networks. The basic structure of a GNN is shown in Fig. 5. In the GNN structure, the input feeds the hidden nodes of the graph and the data gains a graph-structured representation. Then an output graph is produced from this structure.

Fig. 5  The basic structure of the GNN model

GCN is a powerful version of GNN. Unlike CNN, the GCN method deals with irregular graphs, and the data in these graphs depend on graph Laplacian matrices (Guo et al. 2021). On these graphs, it performs the filtering operation in the frequency domain. Figure 6 shows a basic diagram of the GCN structure. Here, each convolutional layer computes a hidden representation of each node by collecting the feature information of the neighboring nodes of that node. A non-linear transformation is applied to these aggregated features and a result is obtained. After multiple layers are stacked, nodes receive information from increasingly distant nodes, yielding a hidden representation.

Fig. 6  GCN architecture with a multi-layer structure

The basic theoretical representation of the GCN structure is given below. Here, $G = (V, \mathcal{E}, A)$ represents a graph, where $V$ and $\mathcal{E}$ denote the sets of nodes and edges, with $|V| = n$ and $|\mathcal{E}| = m$, and $A$ is the adjacency matrix. If there is an edge between node $i$ and node $j$, $A(i,j)$ indicates the weight of the edge; otherwise $A(i,j) = 0$. For unweighted graphs, $A(i,j) = 1$ is defined for every edge. The diagonal matrix $D$ is the degree matrix, where $D(i,i) = \sum_{j=1}^{n} A(i,j)$. $L$ is the Laplacian matrix of $A$ and is calculated as $L = D - A$. The symmetrically normalized Laplacian matrix is expressed as $\tilde{L} = I - D^{-1/2} A D^{-1/2}$, where $I$ represents the identity matrix. A graph signal is represented as a vector $x \in \mathbb{R}^{n}$, and the signal value at node $i$ is expressed by $x(i)$. The node attributes are considered graph signals, and the attribute matrix of a graph is represented by $X \in \mathbb{R}^{n \times d}$. The columns of $X$ are the $d$ signals of the graph (Zhang et al. 2019).

Graph Fourier transform

By analogy with the Fourier transform of a 1-D signal $f$, the Laplacian matrix $L$ acts as the Laplace operator on a graph. From this perspective, an eigenvector of $L$ is analogous to the complex exponential at a certain frequency. Here, the normalized matrix $\tilde{L}$ is used as the graph Laplace operator. The eigenvalue decomposition of $\tilde{L}$ is written as $\tilde{L} = U \Lambda U^{T}$. The $l$th column of $U$ is the eigenvector $u_l$ and $\Lambda(l,l)$ is the corresponding eigenvalue $\lambda_l$. From here, the Fourier transform of a graph signal $x$ is calculated as follows (Zhang et al. 2019):

$$\hat{x}(\lambda_l) = \sum_{i=1}^{n} x(i)\, u_l^{*}(i) \tag{1}$$

Graph filtering

Graph filtering is performed on graph signals. A signal can be filtered in the vertex (node) domain or in the spectral domain of a graph.

a) Frequency filtering: Because the structure of graphs is generally irregular, graph convolution in the vertex domain of a graph is not as straightforward as classical signal convolution in the time domain. Spectral graph convolution is therefore defined as (Zhang et al. 2019):

$$(x *_{G} y)(i) = \sum_{l=1}^{n} \hat{x}(\lambda_l)\, \hat{y}(\lambda_l)\, u_l(i) \tag{2}$$

Here the product $\hat{x}(\lambda_l)\,\hat{y}(\lambda_l)$ specifies filtering in the spectral domain. Eq. 2 thus gives the filtering of a signal $x$ on the graph $G$ with the filter $y$.

b) Vertex filtering: The vertex filtering of the signal $x$ at a node $i$ is as follows:

$$x_{out}(i) = w_{i,i}\, x(i) + \sum_{j \in N(i,K)} w_{i,j}\, x(j) \tag{3}$$

$N(i, K)$ represents the K-hop neighborhood of node $i$, and $w_{i,j}$ are the combination weights used.

Spectral graph convolutional networks

GCN is divided into two models in the literature as spectral-based and spatial-based. Spectral methods can be thought
of as frequency filtering methods. The first spectral GCN model was influenced by the classic CNN structure (Bruna et al. 2013; Zhang et al. 2019):

$$X^{p+1}(:, j) = \sigma\left( \sum_{i=1}^{d_p} V \begin{bmatrix} (\theta_{i,j}^{p})(1) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & (\theta_{i,j}^{p})(n) \end{bmatrix} V^{T} X^{p}(:, i) \right), \quad \forall j = 1, \ldots, d_{p+1} \tag{4}$$

$X^{p}$ is the input feature map of the $p$th layer, a matrix of dimension $n \times d_p$. $X^{p}(:, i)$ and $X^{p+1}(:, j)$ are the $i$th dimension of the input feature map and the $j$th dimension of the output feature map, respectively. $\theta_{i,j}^{p}$ is a vector of learnable filter parameters in layer $p$. Each column of $V$ is an eigenvector of $L$, and $\sigma(\cdot)$ denotes the activation function.

Spatial graph convolutional

Spatial graph convolution performs the convolution directly in the vertex domain. During this process, a node recursively collects information from its neighboring nodes so that the state of the node is updated. Spatial graph models include the CNN-based model, the propagation-based model, and general framework models. The CNN-based model has successful applications in areas such as image classification and object recognition, and it works well on grid-structured data. The propagation-based model collects and propagates the information of neighboring nodes in the vertex domain. In this model, the graph convolution for node $u$ in the $p$th layer has the following representation (Zhang et al. 2019):

$$X_{N(u)}^{p} = X^{p}(u, :) + \sum_{v \in N(u)} X^{p}(v, :) \tag{5}$$

$$X^{p+1}(u, :) = \sigma\left( X_{N(u)}^{p}\, \theta_{|N(u)|}^{p} \right) \tag{6}$$

Here $\theta_{|N(u)|}^{p}$ is the weight matrix in layer $p$ for nodes with $|N(u)|$ neighbors.

In this study, we consider $u$ and $v$ as two nodes in the graph, and $X_u$ and $X_v$ are their feature vectors. The encoder functions Enc(u) and Enc(v), which convert the feature vectors to $Z_u$ and $Z_v$, are defined. The similarity between the $Z_u$ and $Z_v$ vectors must then be determined using a metric. GCN applies the neural network to each node vector and obtains a learned node vector. Learning is done per edge by performing a similar operation for the edges. GCN uses pooling to collect information from the edges and forward it to the nodes for prediction. Each item to be pooled is embedded, and the embeddings are combined into a matrix. The resulting matrix is aggregated using the encoder function. The encoder function handles local network neighbors, aggregate information, and multi-layer computation. A graph is created by determining nodes and neighborhoods. After the graph is created using the neighborhood information, the aggregation is performed with the help of functions such as sum, average, and maximum using neural networks. In GCN, convolution is the operation of collecting and processing the information of an element's neighbors to update the value of the graph elements. GCN uses forward propagation. In GCN, the inputs are the feature vectors of the nodes. The neural network performs aggregation by taking the feature vectors and then transferring them to the next layer.

Because classical machine learning models cannot exploit the graph structure during training, they use neither the approximately 95,000 users who are not labeled as hateful or normal nor the relationships between users. It should be noted that hateful users are careful not to use words that clearly indicate hate speech in order not to reveal their identity. In addition, users who retweet a hateful person's tweets may develop different behavioral patterns. Information about such behavior is hidden in the relationships between users. Therefore, the proposed GCN model uses the user properties and the relationships among all users in the dataset, including unannotated users. In this study, the StellarGraph library was used to create the GCN model.

The proposed model bases its prediction for a node on the properties of the node as well as the properties of its neighbours. The underlying observation is that a hateful user is more likely to be connected with other hateful users. While the GCN model trains a node during the training phase, it receives information from its neighbors through a graph convolutional layer. This graph layer computes embeddings by sampling and aggregating features from the local neighbourhoods of nodes. During training, GCN learns functions that can generate hidden representations for nodes that are not present in the network. Figure 7 shows the proposed model architecture.

As seen in Fig. 7, the proposed model consists of a three-stage GCN. The output of the last GCN layer becomes the input to a fully connected layer with ReLU activation. The output layer has a softmax function to determine class labels based on probabilities. Categorical cross-entropy was used as the loss function. The Adam optimizer was used in all layers. GridSearchCV is used to
determine pool size, learning rate, layer size, and the number of epochs. In this way, 20 epochs, layer sizes of [32, 16], and a learning rate of 0.005 were selected. GCN layers consist of 1024 nodes. Fivefold cross-validation was used for model testing.

Fig. 7  The proposed model architecture

In the implementation phase of the GCN model, a StellarGraph object was created to prepare the data for processing with GCN. Using the TensorFlow Keras library, the softmax layer was defined to generate the prediction outputs. Adam was used as the optimizer and binary cross-entropy was used for the loss. In order to obtain more successful classification results, hyperparameters such as the number of epochs and the learning rate were optimized.

Performance evaluation metrics

Classification algorithms aim to classify binary or multiclass categorical values. Accuracy, precision, recall, and F1-score are the basic metrics used to evaluate the errors that occur and to quantify classification performance. These evaluation metrics are calculated from the values in the confusion matrix. The confusion matrix is used to interpret the outcomes of classification models by cross-tabulating actual and predicted values. The confusion matrix is shown in Table 2.

Table 2  Confusion matrix

                                   Actual values
                                   Positive (1)   Negative (0)
Predicted values   Positive (1)    TP             FP
                   Negative (0)    FN             TN

Table 2 shows the confusion matrix of the output of a model set up for binary classification. True positives (TP) are instances in which the true value and the predicted value are both 1. True negatives (TN) are instances in which both the true value and the predicted value are 0. Instances in which the true value is 0 but the predicted value is 1 are false positives (FP). Instances in which the true value is 1 but the predicted value is 0 are false negatives (FN).

According to Eq. 7, accuracy is determined by dividing the number of samples correctly classified by the model by the total number of samples.

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{7}$$

Precision is the proportion of positively predicted values that are actually positive. Equation 8 is used to calculate the precision.

$$\text{Precision} = \frac{TP}{TP + FP} \tag{8}$$

Recall is the proportion of actual positives that the model predicts as positive. Equation 9 is used to calculate recall.

$$\text{Recall} = \frac{TP}{TP + FN} \tag{9}$$

The F1-score is the harmonic mean of precision and recall. It is a measure of how well the classifier is performing and is often used to compare classifiers. The F1-score is calculated as in Eq. 10.

$$\text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{10}$$

The experimental results

In this study, a deep learning model based on GCN was proposed to detect hateful users. The results obtained
from the DT, LR, NB, RF, SVM, MLP, and Bi-LSTM models were used as baselines to evaluate the success of the results obtained with the GCN model. The accuracy, precision, recall, and F1-score achieved by each model were compared. To determine the parameters of the deep learning and machine learning models, parameter analyses were conducted using GridSearchCV. Cross-validation was employed in the applied models to mitigate overfitting and improve the quality of the models built. Cross-validation allows the performance of the model to be tested before it encounters high error rates on an as-yet-unseen dataset. Cross-validation was performed with k = 5. The confusion matrix results of the DT, LR, NB, RF, SVM, MLP, and Bi-LSTM algorithms and of the proposed GCN algorithm are given in the figures below, in that order; they show how well these methods detect hateful users.

As seen in Fig. 8, the number of normal users correctly classified by the DT algorithm is 255 and the number of hateful users correctly classified is 1. In total, it classified 256 users correctly and misclassified 47 users. In the same figure, the number of normal users correctly classified by the LR algorithm is 258, and the number of hateful users correctly classified is 4. In total, it classified 262 users correctly and misclassified 41 users.

Fig. 8  Confusion matrix results of DT and LR algorithms

Figure 9 shows the results of the NB and RF algorithms. According to these results, the NB algorithm correctly classified 246 normal users and 1 hateful user. The RF algorithm correctly classified 262 normal users and 8 hateful users. In total, the NB algorithm classified 247 users correctly and misclassified 56 users. The RF algorithm classified 270 users correctly and misclassified 33 users.

Fig. 9  Confusion matrix results of NB and RF algorithms

Figure 10 demonstrates the confusion matrix results of the SVM and MLP algorithms. In light of these results, it is seen that the SVM algorithm correctly classifies 264 normal
users and 10 hateful users. It is seen that the MLP algorithm correctly classifies 263 normal users and 9 hateful users. In total, the SVM algorithm classified 274 users correctly and 29 users incorrectly. The MLP algorithm classified 272 users correctly and 31 users incorrectly.

Fig. 10  Confusion matrix results of SVM and MLP algorithms

Figure 11 shows the results of the Bi-LSTM and GCN models. According to these results, the Bi-LSTM correctly classified 265 normal and 12 hateful users. In total, the Bi-LSTM classified 277 users correctly and misclassified 26 users. The GCN method correctly classified 268 normal users and 14 hateful users. In total, it classified 282 users correctly and misclassified 21 users.

Fig. 11  Confusion matrix results of Bi-LSTM and GCN

Table 3 and Fig. 12 display comparative experimental results for the DT, LR, NB, RF, SVM, MLP, Bi-LSTM, and GCN models according to accuracy, precision, recall, and F1-score values.

Table 3  Comparative experimental results

Model      Accuracy   Precision   Recall   F1-score
DT         0.844      0.906       0.916    0.910
LR         0.864      0.914       0.931    0.922
NB         0.785      0.772       0.981    0.864
RF         0.894      1.000       0.893    0.943
SVM        0.907      1.000       0.905    0.950
MLP        0.897      0.951       0.934    0.942
Bi-LSTM    0.914      0.992       0.916    0.952
GCN        0.930      1.000       0.927    0.962

As demonstrated in Table 3 and Fig. 12, the proposed GCN-based model achieved the highest accuracy and F1-score among the compared models. For the proposed model, the accuracy is 0.930, the precision is 1.000, the recall is 0.927, and the F1-score is 0.962.

The proposed GCN model therefore outperformed the other models in detecting hateful users. Ranked by accuracy, it is followed by Bi-LSTM, SVM, MLP, RF, LR, DT, and NB, in that order.

The training and validation accuracy graph of the proposed model is depicted in Fig. 13a, and the training and validation loss graph is depicted in Fig. 13b.
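
As a quick arithmetic check of Eqs. 7–10, the sketch below computes the four metrics from confusion-matrix counts. It is an illustration rather than the authors' code: the function name `classification_metrics` is hypothetical, and the counts are the ones reported above for the GCN model (268 normal and 14 hateful users classified correctly, 21 users misclassified). The text does not state how the 21 errors split into false positives and false negatives, so the split used here (normal users as the positive class, all errors counted as false negatives) is an assumption, chosen because it is consistent with the reported precision of 1.000.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, F1-score) following Eqs. 7-10."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Counts reported for the GCN model: 268 + 14 users correct, 21 misclassified.
# ASSUMPTION: normal users are the positive class and all 21 errors are
# false negatives; the paper does not give this breakdown.
gcn = classification_metrics(tp=268, fp=0, fn=21, tn=14)
for name, value in zip(("accuracy", "precision", "recall", "f1"), gcn):
    print(f"{name}: {value:.4f}")
```

Under this assumption the values come out close to those in Table 3 (accuracy about 0.93, recall about 0.927, F1-score about 0.962). Other splits of the 21 errors would leave the accuracy unchanged but alter precision and recall.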

Fig. 12  Comparative experimental results
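
A ROC curve, as used to evaluate the proposed model, is traced by computing the true positive rate (TPR) and the false positive rate (FPR) at successive decision thresholds. The following is a minimal sketch of that computation under stated assumptions, not the procedure used in the study: the function name `roc_points` is hypothetical, and the labels and scores are made-up values for illustration only.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Use each distinct score, from highest to lowest, as the threshold:
    # a sample is predicted positive when its score is >= the threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical labels (1 = positive class) and predicted scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
for fpr, tpr in roc_points(y_true, scores):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

A curve that reaches a TPR near 1 while the FPR is still small corresponds to the desirable behaviour: most positives are found before many negatives are misclassified.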


Fig. 13  Training and validation accuracy/loss graphs

The ROC curve of the proposed model is shown in Fig. 14. The ROC curve shows the effectiveness of a model by plotting the True Positive Rate (TPR) against the False Positive Rate (FPR). It shows the values that the FPR takes as the TPR increases, that is, as the TPR converges to 1. Convergence of the TPR to 1 is desirable, and the FPR is expected to remain low as the TPR converges to 1.

Fig. 14  The ROC curve of the proposed model

Conclusions

Today, behavior patterns that include hate speech are common on social networking platforms. For instance, hate speech concerning politics, women, foreigners, immigrants, sexual identity, beliefs and sects, disabled people, and various diseases creates the multidimensional structure of HS. Hate speech is widespread against various segments on online platforms such as Twitter, Facebook, and Instagram, and attacks are organized against individuals or groups targeted through posts. It was observed that these attacks increase the risk of violent tendencies after a certain period of time and that the victim groups are exposed to violent attacks. In other words, hate speech, with its repetitive content and potential for violence, reduces respect and tolerance towards groups that differ in terms of characteristics such as religion, language, race, or sexual orientation, and thus enables the group that creates the discourse to create a new identity value. In societies that are constantly exposed to hate speech, certain stereotypes about minorities begin to form in the public memory, which causes groups that are
associated with negative traits to become victims of outbursts of hate in the future. Since social networking platforms allow interaction between users, the hate speech produced becomes widespread and commonplace much more effectively; in this way, it can come to be taken for granted and turn into hate crimes over time.

In order to solve the HS problem, intelligent systems with automatic HS detection in social networks are needed. There are many studies in the literature on HS detection, but most of them are automatic HS detection systems built with text mining methods (especially NLP-based methods) that analyze users' comments and shares on social networks. In these studies, apart from the shared content, user features and the relations of these users with each other in a network were not taken into account. In our study, however, besides the analysis of Twitter posts containing hate speech, many user features such as tweeting frequency, number of followers, number of favorites, and retweeting were used. Hateful Twitter user detection was thus performed with the GCN model. In this method, each user was considered a node, and just as in a social network structure, these nodes learn from each other, and HS is detected in the graph structure. This study is one of the few studies using GCN in this area, and good results were obtained compared to popular methods. The GCN method achieved 0.930 accuracy, 1.000 precision, 0.927 recall, and a 0.962 F1-score. As a result of this study, a model is proposed that detects users who share HS content on social media platforms in a fast, effective, and automatic way. Thus, users who engage in insults, profanity, humiliation, and harassment can easily be identified. If a user shares a message that contains hate speech, the post may be stopped from publishing, or such behavior may be prevented from happening again. Most importantly, the automatic HS detection system will reduce the unforeseen margin of error and increase time and cost efficiency.

Authors contributions  All the authors contributed to the study's conception, design, data analysis, and writing of the original manuscript. Anıl Utku carried out data curation, methodology, and software development. Umit Can and Serpil Aslan contributed to the original draft's conceptualization, validation, supervision, rewriting, and editing. All authors read and approved the final manuscript draft.

Data availability  Program codes are available on request, and the datasets generated and/or analyzed during the current study are available in the Kaggle repository at the following persistent web link: https://www.kaggle.com/datasets/manoelribeiro/hateful-users-on-twitter

Declarations

Ethics approval  This manuscript has not been published or presented elsewhere in part or in entirety and is not under consideration by another journal. We have read and understood your journal's policies, and we believe that neither the manuscript nor the study violates any of these.

Consent to participate  Not applicable.

Consent to publish  Not applicable.

Competing interests  The authors have no relevant financial or non-financial interests to disclose.

References

Abro S, Shaikh S, Khand ZH, Zafar A, Khan S, Mujtaba G (2020) Automatic hate speech detection using machine learning: a comparative study. Int J Adv Comput Sci Appl 11(8)
Al-Hassan A, Al-Dossari H (2019) Detection of hate speech in social networks: a survey on multilingual corpus. In: 6th international conference on computer science and information technology, vol 10, pp 10–5121. https://doi.org/10.5121/csit.2019.90208
Albadi N, Kurdi M, Mishra S (2018) Are they our brothers? analysis and detection of religious hate speech in the arabic twittersphere. In: 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp 69–76. https://doi.org/10.1109/ASONAM.2018.8508247
Antoniadis A, Lambert-Lacroix S, Poggi JM (2021) Random forests for global sensitivity analysis: a selective review. Reliab Eng Syst Saf 206:107312. https://doi.org/10.1016/j.ress.2020.107312
Area S, Mesra R (2012) Analysis of Bayes, neural network and tree classifier of classification technique in data mining using WEKA. Comput Sci Inf Technol. https://doi.org/10.5121/csit.2012.2236
Badjatiya P, Gupta S, Gupta M, Varma V (2017) Deep learning for hate speech detection in tweets. In: Proceedings of the 26th international conference on world wide web companion, pp 759–760. https://doi.org/10.1145/3041021.3054223
Biau G, Scornet E (2016) A random forest guided tour. Test 25(2):197–227. https://doi.org/10.1007/s11749-016-0481-7
Bölücü N, Canbay P (2021) Hate speech and offensive content identification with graph convolutional networks. In: Forum for information retrieval evaluation (working notes) (FIRE), CEUR-WS.org
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Bruna J, Zaremba W, Szlam A, LeCun Y (2013) Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203
Burnap P, Williams ML (2015) Cyber hate speech on twitter: an application of machine classification and statistical modeling for policy and decision making. Policy Internet 7(2):223–242. https://doi.org/10.1002/poi3.85
Cao R, Lee RK, Hoang TA (2020) DeepHate: hate speech detection via multi-faceted text representations. In: 12th ACM conference on web science, pp 11–20. https://doi.org/10.1145/3394231.3397890
Chung FR (1997) Spectral graph theory. American Mathematical Soc.
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297. https://doi.org/10.1007/BF00994018
Dukic D, Krzic AS (2021) Detection of hate speech spreaders with BERT. In: CLEF (Working Notes), pp 1910–1919
Friedkin NE (1978) University social structure and social networks among scientists. Am J Sociol 3(6):1444–1465
Garain A, Basu A (2019) The titans at SemEval-2019 task 5: detection of hate speech against immigrants and women in twitter. In: Proceedings of the 13th international workshop on semantic evaluation, pp 494–497. https://doi.org/10.18653/v1/S19-2088


Gitari ND, Zuping Z, Damien H, Long J (2015) A lexicon-based approach for hate speech detection. Int J Multimedia Ubiquitous Eng 10(4):215–230
Gröndahl T, Pajola L, Juuti M, Conti M, Asokan N (2018) All you need is "love": evading hate speech detection. In: Proceedings of the 11th ACM workshop on artificial intelligence and security, pp 2–12. https://doi.org/10.1145/3270101.3270103
Guo K, Hu Y, Sun Y, Qian S, Gao J, Yin B (2021) Hierarchical graph convolution network for traffic forecasting. In: Proceedings of the AAAI conference on artificial intelligence, vol 35, no 1, pp 151–159
Jang B, Kim M, Harerimana G, Kang SU, Kim JW (2020) Bi-LSTM model to increase accuracy in text classification: combining Word2vec CNN and attention mechanism. Appl Sci 10(17):5841. https://doi.org/10.3390/app10175841
Knoke D, Yang S (2019) Social network analysis. California, USA
Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations
Luu QH, Lau MF, Ng SP, Chen TY (2021) Testing multiple linear regression systems with metamorphic testing. J Syst Softw 182:111062. https://doi.org/10.1016/j.jss.2021.111062
Ma J, Nervo G, Zheng J (2019) Improving detection of hateful users on twitter using attention and ReFex. CS224W, Stanford, CA
MacAvaney S, Yao HR, Yang E, Russell K, Goharian N, Frieder O (2019) Hate speech detection: challenges and solutions. PloS one 14(8):e0221152. https://doi.org/10.1371/journal.pone.0221152
Mishra P, Del Tredici M, Yannakoudakis H, Shutova E (2019) Abusive language detection with graph convolutional networks. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1, pp 2145–2150
Most popular social networks worldwide as of January 2022, ranked by number of monthly active users. Available online: https://www.statista.com/statistics/272014/global-social-networks-ranked-by-number-of-users/. Accessed 25 Sept 2022
Mulki H, Haddad H, Ali CB, Alshabani H (2019) L-hsab: A levantine twitter dataset for hate speech and abusive language. In: Proceedings of the third workshop on abusive language online, pp 111–118. https://doi.org/10.18653/v1/W19-3512
Murray C (2008) Schools and social networking: Fear or education. Synergy Perspect: Local 6(1):8–12
Naseem U, Razzak I, Eklund PW (2021) A survey of pre-processing techniques to improve short-text quality: a case study on hate speech detection on twitter. Multimedia Tools Appl 80(28):35239–35266. https://doi.org/10.1007/s11042-020-10082-6
Nielsen FÅ (2011) A new ANEW: evaluation of a word list for sentiment analysis in microblogs. arXiv preprint arXiv:1103.2903
Nobata C, Tetreault J, Thomas A, Mehdad Y, Chang Y (2016) Abusive language detection in online user content. In: Proceedings of the 25th international conference on world wide web, pp 145–153
Okwu MO, Tartibu LK (2021) Artificial neural network. In: Metaheuristic optimization: nature-inspired algorithms swarm and computational intelligence, theory and applications. Springer, Cham, pp 133–145. https://doi.org/10.1007/978-3-030-61111-8_14
Pradhan A (2012) Support vector machine-a survey. Int J Emerging Technol Adv Eng 2(8):82–85
Peng H, Li J, He Y, Liu Y, Bao M, Wang L, Song Y, Yang Q (2018) Large-scale hierarchical text classification with recursively regularized deep graph-cnn. In: Proceedings of the 2018 world wide web conference, pp 1063–1072. https://doi.org/10.1145/3178876.3186005
Poletto F, Basile V, Sanguinetti M, Bosco C, Patti V (2021) Resources and benchmark corpora for hate speech detection: a systematic review. Lang Resour Eval 55(2):477–523. https://doi.org/10.1007/s10579-020-09502-8
Rangel F, De la Peña Sarracén GL, Chulvi B, Fersini E, Rosso P (2021) Profiling Hate Speech Spreaders on Twitter Task at PAN 2021. In: CLEF (Working Notes), pp 1772–1789
Rish I (2001) An empirical study of the naive Bayes classifier. In: IJCAI 2001 workshop on empirical methods in artificial intelligence, vol 3, no 22, pp 41–46
Roy PK, Tripathy AK, Das TK, Gao XZ (2020) A framework for hate speech detection using deep convolutional neural network. IEEE Access 8:204951–204962. https://doi.org/10.1109/ACCESS.2020.3037073
Safavian SR, Landgrebe D (1991) A survey of decision tree classifier methodology. IEEE Trans Syst Man Cybern 21(3):660–674. https://doi.org/10.1109/21.97458
Sarracén GL, Rosso P (2021) Offensive keyword extraction based on the attention mechanism of BERT and the eigenvector centrality using a graph representation. Pers Ubiquitous Comput 27:1–3. https://doi.org/10.1007/s00779-021-01605-5
Song YY, Ying LU (2015) Decision tree methods: applications for classification and prediction. Shanghai Arch Psychiatry 27(2):130. https://doi.org/10.11919/j.issn.1002-0829.215044
Watanabe H, Bouazizi M, Ohtsuki T (2018) Hate speech on twitter: a pragmatic approach to collect hateful and offensive expressions and perform hate speech detection. IEEE Access 6:13825–13835. https://doi.org/10.1109/ACCESS.2018.2806394
Wickramasinghe I, Kalutarage H (2021) Naive Bayes: applications, variations and vulnerabilities: a review of literature with code snippets for implementation. Soft Comput 25(3):2277–2293. https://doi.org/10.1007/s00500-020-05297-6
Wu Z, Pan S, Chen F, Long G, Zhang C, Philip SY (2020) A comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst 32(1):4–24. https://doi.org/10.1109/TNNLS.2020.2978386
Zhang S, Tong H, Xu J, Maciejewski R (2019) Graph convolutional networks: a comprehensive review. Comput Social Networks 6(1):1–23. https://doi.org/10.1186/s40649-019-0069-y
Zhang B, Zhang H, Zhao G, Lian J (2020) Constructing a PM2.5 concentration prediction model by combining auto-encoder with Bi-LSTM neural networks. Environ Modell Softw 124:104600. https://doi.org/10.1016/j.envsoft.2019.104600
Zimmerman S, Kruschwitz U, Fox C (2018) Improving hate speech detection with deep learning ensembles. In: Proceedings of the eleventh international conference on language resources and evaluation, pp 2546–2553

Publisher's note  Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
