Hate Speech and Offensive Language Detection and Blocking on Social Media Platforms Using Feature Engineering Techniques and Machine Learning Algorithms: A Comparative Study
Mwayi Malemia (21251377011)
Guide: Dr. Glorindall
Project Report Submitted
Hate speech, in relation to social media, is a kind of writing that disparages its target and is likely to cause harm or danger to the victim. It is bias-motivated, hostile, malicious speech aimed at a person or a group of people because of some of their actual or perceived innate characteristics [12]. It is a kind of speech that demonstrates a clear intention to be hurtful, to incite harm, or to promote hatred. The environment of social media provides a particularly fertile ground for the creation, sharing, and exchange of hate messages against a perceived enemy group [5].
However, identifying and removing hate speech content has proved to be labor-intensive and time-consuming. Owing to these concerns and the prevalence of hate speech content on the internet, there is a strong motivation to implement an automated hate speech detection system. Automatic detection of hate speech has proved to be a challenging task because of disagreements over how hate speech should be defined. Detection of hate speech and offensive language is considered an emerging application among the research problems associated with the domain of Natural Language Processing [13].
Despite the extensive amount of work that researchers have done so far, it remains difficult to compare the performance of these approaches in categorizing the hate speech content that floods social media. To my knowledge, based on the literature reviewed so far, existing studies lack a comparative analysis of different feature engineering techniques and ML algorithms.
Main Objective
This paper discusses the ways in which machine learning and feature engineering techniques can be used to control hate speech and abusive language on social media, using a given dataset.
Specific Objective
This study is significant because it contributes to resolving the problem at hand by comparing three feature engineering techniques and eight Machine Learning classifiers on a standard hate speech dataset with three distinct classes. The study has practical value and therefore serves as a baseline for new researchers in the domain of automatic hate speech detection on social media platforms.
Research Question
Which combination of the three feature engineering techniques and eight Machine Learning classifiers performs best on a standard hate speech dataset?
Artificial intelligence (AI) is "an area of computer science that emphasizes the creation of intelligent machines that work and react like humans" (Andrew Ng, 2015).
Machine learning (ML) is "the science of getting computers to learn and act like humans do, and improve their learning over time in autonomous fashion, by feeding them data and information in the form of observations and real-world interactions" (Arthur Samuel, 2016).
Natural Language Processing (NLP) is a sub-field that takes inspiration from the areas of Artificial Intelligence and Linguistics. It enables computers/machines to process and analyze large amounts of human language data, such as speech or text.
Parihar et al., 2021 [18] observe that hate speech detection is a very difficult task and continues to be a societal problem. There is a very fine line between what is hate speech and what is not. For example, satire might be considered a possible threat even though it is not actually hate speech. The annotation and collection of data for building a hate speech detection model is thus a very troublesome task. As discussed, this problem can be addressed by narrowing down the criteria for annotation. Similarly, there is a need to focus research on code-mixed languages and regional languages as well. Language models and deep learning models have shown promising results in hate speech classification. For tackling unbalanced data, upsampling and downsampling techniques based on language models should be researched. The challenges discussed above must be tackled with more research in the domain so that the internet becomes more inclusive, welcoming, and free from hate.
Mahibha et al. [13] concluded that, from the output obtained from the different models, it could be inferred that deep learning models outperform machine learning models on the offensive language classification problem for the dataset provided by HASOC@FIRE-2021 for Task 1, associated with code-mixed Tamil. Among the deep learning models, transformer-based models made more accurate predictions than recurrent models; hence, more scope for transformer-based models could be identified for research on Dravidian languages, and specifically for hate and offensive language research.
Zeerak Waseem et al. [20] classified hate speech on Twitter. In their research, they employed a character n-gram feature engineering technique to generate numeric vectors. The authors fed the generated numeric vectors to an LR classifier and obtained an overall F-score of 73%. Chikashi Nobata et al. [6], meanwhile, used an ML-based approach to detect abusive language in online user content. In their research, the authors employed a character n-gram feature representation technique to represent the features, which they fed to an SVM classifier. The results showed that the classifier obtained an overall F-score of 77%. Shervin Malmasi et al. [14] used an ML-based approach to classify hate speech in social media. In their research, the authors employed character 4-gram feature engineering techniques to generate numeric features, which they fed to an SVM classifier. The authors reported a maximum accuracy of 78%.
Al-Hassan and Al-Dossari, 2019 [1] note that Arab regions, and the world at large, are now more aware of the problem of hate spreading through social networks, and that many countries are working hard to regulate and counter such speech. This attention raised the need for automating the detection of hate speech. In their paper, the authors analyzed the concept of hate speech, and specifically "cyber hate", which is conducted through social media and the internet sphere. Moreover, they differentiated between different anti-social behaviors, including cyberbullying, abusive and offensive language, radicalization, and hate speech. They then presented a comprehensive study of how text mining can be used in social networks, and investigated some challenges that can guide the implementation of an Arabic hate speech detection model. In addition, their recommendations help in drawing a road map and a blueprint for a future model. Their future work will include incorporating the latest deep learning architectures to build a model capable of detecting and classifying Arabic hate speech on Twitter into distinct classes. A dataset will be collected from Twitter, and to intensify the training of their neural network they will include data from additional platforms, e.g. Facebook, as it is the most used platform in the Arab region.
Research Design
This section explains the methodology adopted in order to categorize tweets into three distinct classes, namely, "hate speech, offensive but not hate speech, and neither hate speech nor offensive speech". Fig. 1 illustrates the comprehensive research methodology implemented in this research. As the figure shows, the methodology comprises six key steps that are carried out before the results can be concluded: (1) data collection, (2) data preprocessing, (3) feature engineering, (4) data splitting, (5) classification model construction, and (6) classification model evaluation. Each step is explained in detail below.
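As an illustration only, the following minimal sketch strings the six steps together in Python with scikit-learn; the file name, column names, and the particular classifier are hypothetical stand-ins, not the exact artifacts of this study.

```python
# A minimal sketch of the six-step methodology, assuming scikit-learn.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import classification_report

# (1) Data collection: load the labelled tweets (hypothetical file name).
df = pd.read_csv("crowdflower_tweets.csv")

# (2) Preprocessing: lowercasing stands in for the full cleaning pipeline.
texts = df["tweet"].str.lower()   # hypothetical column names
labels = df["class"]

# (3) Feature engineering: bigram TF-IDF, one of the three techniques compared.
features = TfidfVectorizer(ngram_range=(2, 2)).fit_transform(texts)

# (4) Data splitting: 80% training, 20% testing.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=42, stratify=labels)

# (5) Model construction: an SVM, the best performer reported in this study.
clf = LinearSVC().fit(X_train, y_train)

# (6) Model evaluation: precision, recall, and F-measure per class.
print(classification_report(y_test, clf.predict(X_test)))
```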
Data Collection
For the purpose of collecting research data, this study used the publicly available open-source CrowdFlower dataset. CrowdFlower provided this dataset as open source; they compiled and labelled the data, making it very user friendly. The tweets are labeled into three distinct classes, namely, hate speech, not offensive, and offensive but not hate speech. The dataset contains 14,509 tweets. Of these, 16% of the tweets belong to the hate speech class, 50% belong to the not offensive class, and the remaining 33% are in the offensive but not hate speech class. The details of this distribution are also shown in the class-wise distribution table in the Data Splitting section below.
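As a quick sanity check on the distribution described above, the following sketch loads the dataset and prints the class proportions; the file and column names are assumptions, not the dataset's actual field names.

```python
# A sketch of inspecting the class distribution of the CrowdFlower data.
import pandas as pd

df = pd.read_csv("crowdflower_tweets.csv")        # hypothetical file name
print(len(df))                                    # expected: 14,509 tweets
print(df["class"].value_counts(normalize=True))   # ~16% / 50% / 33% split
```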
Feature Engineering
ML algorithms cannot learn classification rules directly from raw text; they need numerical features. This is why feature engineering is considered one of the key steps in text classification. At this step, key features are extracted from the raw text and then represented in numerical form. This study therefore used three different feature engineering techniques: n-gram with TFIDF, Word2vec, and Doc2vec, sketched below.
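A minimal sketch of the three techniques, assuming scikit-learn and gensim; the toy corpus and parameter values (n-gram range, vector sizes) are illustrative, not the study's exact settings.

```python
# Three ways to turn raw tweets into numeric feature vectors.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from gensim.models import Word2Vec
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

tweets = ["you are awful", "have a nice day"]     # toy corpus
tokens = [t.split() for t in tweets]

# (a) n-gram with TFIDF: unigrams and bigrams weighted by TF-IDF.
tfidf_matrix = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(tweets)

# (b) Word2vec: per-word vectors, averaged here to get one vector per tweet.
w2v = Word2Vec(tokens, vector_size=100, min_count=1)
w2v_vectors = [np.mean([w2v.wv[w] for w in t], axis=0) for t in tokens]

# (c) Doc2vec: learns a single vector per tweet directly.
tagged = [TaggedDocument(words=t, tags=[i]) for i, t in enumerate(tokens)]
d2v = Doc2Vec(tagged, vector_size=100, min_count=1)
d2v_vectors = [d2v.dv[i] for i in range(len(tweets))]
```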
Data Splitting
The table below illustrates the class-wise distribution and the results after splitting the overall dataset, showing the number of instances used in the training set and the number used in the test set. To split the preprocessed data, this study used an 80-20 ratio, meaning that 80% of the instances were used as training data while 20% were reserved as test data. The whole idea is to ensure that the classification models are trained to learn the classification rules and then evaluated on unseen data.
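A minimal sketch of the 80-20 split, assuming the `features` matrix and `labels` from the earlier pipeline sketch; the stratified option (an assumption here, not stated by the study) keeps the class proportions similar in both sets.

```python
# 80-20 split of the preprocessed data into training and test sets.
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.20, random_state=42, stratify=labels)
```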
Classifier Evaluation
At this stage, the classes of unlabeled text are predicted by the constructed classifier. On the test set, the text is labeled into three distinct classes, namely, (0) hate speech, (1) offensive but not hate speech, and (2) neither hate speech nor offensive speech. The counts of True Negatives (TN), False Positives (FP), False Negatives (FN), and True Positives (TP) are calculated in order to evaluate classifier performance.
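A sketch of this evaluation step, assuming the trained classifier `clf` and the test split from the sketches above; the per-class TP, FP, FN, and TN counts are read off the 3x3 confusion matrix.

```python
# Evaluate the classifier on the held-out test set.
from sklearn.metrics import confusion_matrix, classification_report

y_pred = clf.predict(X_test)

# precision = TP / (TP + FP), recall = TP / (TP + FN),
# F-measure = 2 * precision * recall / (precision + recall)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
```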
Sampling Procedure
For the purposes of this study, a probability sampling technique was adopted, in which every member of the larger population has an equal chance of selection, based on the theory of probability. This sampling method considers every member of the population and forms samples through a fixed process.
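A toy sketch of probability sampling in this sense, assuming pandas and the hypothetical file name used earlier: every tweet in the population has an equal chance of appearing in the sample.

```python
# Simple random sampling: each row has an equal selection probability.
import pandas as pd

population = pd.read_csv("crowdflower_tweets.csv")     # hypothetical file name
sample = population.sample(frac=0.2, random_state=42)  # 20% random sample
```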
Sample Size
The CrowdFlower dataset has a population of 14,509 tweets. It is an open-source dataset provided freely by CrowdFlower for the purposes of research and other studies.
Sampling Area
Social media users on the World Wide Web, drawn from the CrowdFlower dataset.
Graphs
Interpretation
This segment explains the overall results of the 24 studies conducted in this research. Graphs A to D show (A) the precision, (B) the recall, (C) the F-measure, and (D) the accuracy of all 24 studies, respectively. The performance of the classification techniques under each of the different feature representations is displayed graphically. The ML algorithms with MLP and KNN registered the lowest results: a precision of 0.56, a recall of 0.58, an F-measure of 0.48, and an accuracy of 58%. The SVM classifier using the TFIDF feature representation with bigrams registered the highest results: a precision of 0.78, a recall of 0.80, an F-measure of 0.78, and an accuracy of 80%. Regarding feature representation, the best performance was registered with the bigram feature, compared to Word2vec and Doc2vec; however, further analysis revealed only a marginal difference between the results obtained with bigrams and with Doc2vec. The SVM classifier registered the best performance among the text classification models, outperforming all eight classifiers. Nevertheless, the AdaBoost and RF classifiers, while below SVM, performed better than LR, DT, NB, KNN, and MLP.
Feature Engineering
In this study, three distinct feature extraction techniques were deployed and their performance evaluated over a standard dataset: bigram with TFIDF, Word2vec, and Doc2vec. The findings reveal bigram with TFIDF as the best-performing technique, while Word2vec and Doc2vec performed comparatively worse. The bigram with TFIDF technique preserves the sequence of words, unlike Word2vec and Doc2vec, which is probably the reason it performed better than the rest. Numerous research studies have shown that the TFIDF representation technique is better than binary and term frequency representations (Mujtaba et al., 2018 [16]). The likely cause of the lower performance of Word2vec is that it is unable to handle out-of-vocabulary (OOV) words, especially in the domain of Twitter data.
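To make the OOV limitation concrete, here is a toy illustration (the tokens are invented): a Word2vec model trained on a small vocabulary simply has no vector for an unseen Twitter-style token.

```python
# A Word2vec model only knows the words it was trained on.
from gensim.models import Word2Vec

model = Word2Vec([["have", "a", "nice", "day"]], vector_size=50, min_count=1)
print("nice" in model.wv.key_to_index)   # True: in vocabulary
print("gr8" in model.wv.key_to_index)    # False: OOV, no vector available
```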
Machine Learning
Previous research has shown that no single Machine Learning (ML) algorithm produces the best results on all kinds of datasets. It is against this background that several different ML algorithms were deployed in order to determine the best performer on the given dataset. SVM performs data separation using a margin-based threshold function rather than depending on the number of features; this study revealed SVM and AdaBoost as the best classifiers partly for this reason. This shows that SVM is independent of the number of features present in the data (Hornik et al., 2013 [8]). The kernel functions in SVM give it the ability to perform much better on non-linear data in addition to working with linear data. The adaptive algorithm in AdaBoost enables the model to learn the classification rules with much of its attention focused on decreasing the training error; this is the reason why AdaBoost performed better than most of the other ML algorithms. The study also revealed that the SVM and AdaBoost classifiers had the best results; on the second tier, RF and LR performed better than NB, DT, KNN, and MLP, which are placed on the third tier. The absence of informative features in RF, which results in incorrect predictions, could be the reason for its lower performance. It is possible that the performance of LR is lower because its decision surface is linear in nature and cannot handle nonlinear data adequately (Eftekhar et al., 2005 [5]). The poor performance of the MLP classifier is likely due to insufficient training data, which is why it is considered a complex "black box" (Singh and Shahid Husain, 2014 [19]). KNN had the worst performance because it is a lazy learning algorithm and does not work adequately on noisy data (Bhatia et al., 2010 [3]). Therefore, according to this study, KNN has proved to be unsuitable for detecting hate speech tweets.
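As a sketch of how the eight classifiers discussed above can be compared under a single feature representation, the following assumes the train/test split from the earlier sketches; the hyperparameters are scikit-learn defaults, not the tuned values behind the results reported here.

```python
# Fit the eight classifiers on one feature set and compare weighted F-scores.
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score

classifiers = {
    "SVM": SVC(kernel="rbf"),            # kernel SVM handles non-linear data
    "AdaBoost": AdaBoostClassifier(),
    "RF": RandomForestClassifier(),
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(),
    "NB": MultinomialNB(),               # suits non-negative TF-IDF features
    "KNN": KNeighborsClassifier(),
    "MLP": MLPClassifier(max_iter=300),
}
for name, model in classifiers.items():  # assumes X_train/X_test from earlier
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(name, round(f1_score(y_test, pred, average="weighted"), 2))
```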