
ISBN 979-8-3503-1512-7
2023 International Conference on System, Computation, Automation and Networking (ICSCAN) | 979-8-3503-1512-7/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICSCAN58655.2023.10395354

Ensemble Text Classification with TF-IDF Vectorization for Hate Speech Detection in Social Media

Sathishkumar R1, Karthikeyan T2, Praveen kumar P3, Shamsundar S M4
Department of Computer Science and Engineering1,4
Department of Computer Science and Business Systems2
Department of Information Technology3
Manakula Vinayagar Institute of Technology1,4, Puducherry
Sri Manakula Vinayagar Engineering College2,3, Puducherry
[email protected], [email protected], [email protected], [email protected]

Abstract—The development of artificial intelligence (AI) has changed how hate speech is detected. In hate speech identification using machine learning, a number of methods are used to automatically find text whose vocabulary is derogatory, discriminatory, or motivated by hatred. Supervised learning techniques such as neural networks, decision trees, and SVMs need a labelled dataset comprising samples of hate speech and non-hate speech. This project investigates the use of AI and machine learning techniques to automatically detect material that uses offensive, intolerant, or hostile words. A voting classifier and TF-IDF representations are combined to improve classification accuracy. The ensemble of classifiers shows impressive accuracy in identifying hate speech by training five different classifiers (Random Forest, Bagging, Support Vector Machine, AdaBoost, and Gradient Boosting) on a labelled dataset of tweets. The TF-IDF representation prioritises textual terms, whereas the ensemble method uses classifier diversity to capture distinctive patterns. Experimental results show the strategy's effectiveness, with precision 0.95, recall 0.96, F1-score 0.95, and accuracy 0.97 for detecting hate speech. By successfully harnessing AI's capacity to fight hate speech, this research supports the development of a diverse and secure online environment. The suggested approach works well for automatically identifying hate speech, making the internet a safer and more welcoming place for all users.

Keywords—Hate Speech, Machine Learning, SVM, Naive Bayes, Random Forest Classifier, Stochastic Gradient Descent, Decision Tree Classifier.

I. INTRODUCTION

Social media networks (SMNs) have transformed communication by enabling the quick interchange of ideas and information. However, the widespread use of SMNs has also given rise to certain worrying problems, such as the propagation of hate speech. The safety and wellbeing of users are seriously threatened by the fact that hate speech on various social media platforms has turned into a breeding ground for hate-motivated online crime. Thorough study and novel strategies are needed to address the rise in hate speech events on these platforms. The enormous amount of text data that needs to be processed and categorised is one of the main obstacles in the fight against hate speech on social media. Processing and classifying such data manually takes time and is subject to biases driven by human factors such as competence and fatigue. Automating text categorization with machine learning (ML) techniques is necessary to obtain more accurate and objective findings and overcome these restrictions.

Ensemble approaches have become a potent ML strategy for enhancing predictive performance in recent years. To arrive at a decision, ensemble approaches integrate the predictions of various separate classifiers. Hate speech on social media can be identified and categorised more effectively by utilising ensemble methods for text classification.

In this study, we use ensemble text categorization algorithms to address the problem of hate speech on social media. Our suggestion is to use a voting classifier, which combines the predictions of different classifiers trained on TF-IDF vectorized text data. The ensemble includes classifiers with different strengths and characteristics: Random Forest, Bagging, Support Vector Classifier (SVC), AdaBoost, and Gradient Boosting. Ensemble approaches let us exploit the diversity of these classifiers to capture numerous aspects and patterns of hate speech in textual data. The TF-IDF vectorization technique also represents the textual data quantitatively, giving important insights into the significance of phrases in hate speech identification.

The long-term objective of this research is to aid in the creation of an automated system for detecting hate speech on social media sites. We want to build a more precise and reliable system that can efficiently identify and suppress hate speech in real time by using ensemble approaches and making use of developments in natural language processing (NLP). The results of this study will help social media service providers as well as researchers lessen the negative effects of hate speech and promote a secure online environment.

Figure 1: Hate Speech

Manakula Vinayagar Institute of Technology


Puducherry, India. IEEE ICSCAN2023
ed licensed use limited to: Vignan's Foundation for Science Technology & Research (Deemed to be University). Downloaded on February 19,2024 at 10:02:11 UTC from IEEE Xplore. Restriction
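The hard-voting scheme proposed in the introduction can be sketched with scikit-learn's VotingClassifier. This is a minimal illustration, not the authors' exact configuration: the toy dataset, default hyperparameters, and random seeds are all assumptions.

```python
# Hypothetical sketch of the five-classifier hard-voting ensemble described
# above; each base model votes and the majority label wins.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.svm import SVC

# Toy stand-in data; the paper uses TF-IDF features from labelled tweets.
X, y = make_classification(n_samples=200, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=0)),
        ("bag", BaggingClassifier(random_state=0)),
        ("svc", SVC(random_state=0)),
        ("ada", AdaBoostClassifier(random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    voting="hard",  # majority vote over the five predicted labels
)
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
```

Setting voting="soft" would average predicted class probabilities instead, though SVC then needs probability=True.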
II. RELATED WORK

H. Watanabe et al. [1] have suggested using patterns to find hate speech on Twitter. The authors establish a set of parameters that maximise pattern collection by meticulously extracting patterns from the training data. They also cover a technique for identifying hate speech that combines phrasal patterns and other sentimental characteristics with words and phrases that pragmatically convey anger and hatred. In future investigations on the identification of hate speech, pre-built dictionaries will be constructed using the recommended collections of unigrams and patterns. Instead of using only two categories to distinguish between hostile and offensive tweets, they divide tweets into three.

P. K. Roy et al. [2] utilise a DCNN to detect hate speech on Twitter. The imbalanced dataset could be to blame for the incorrect predictions for hate speech (HS) tweets. Because the non-hate-speech (NHS) class has the most samples, the system is biased towards predicting NHS tweets. Whether the models used deep learning or more conventional machine learning, none of them were able to predict HS with any degree of accuracy.

O. Oriola et al. [3] created an English corpus of South African tweets in order to look for rude and hostile phrases. The diversity of signals from South African languages in the tweets led multilingual annotators to annotate the corpus. The tweets' four distinct feature sets and their combinations were retrieved after tokenization and preprocessing [22].

Y. Zhou et al. [4] provided improved performance by fusing the results of three techniques as well as three CNN classifiers with various parameter settings. The outcomes demonstrated that hate speech may be recognised via fusion processing.

C. Baydogan et al. [5] resolved the hate speech detection (HSD) problem for the first time utilising the recent metaheuristic algorithms ALO and MFO. The researchers compared the effectiveness of the suggested metaheuristic-based approaches against SSO, TSA, and eight different supervised ML algorithms. The preprocessing stage for the specific real-world HSD challenges was completed utilising NLP techniques. The BoW+TF+Word2Vec techniques were combined to perform feature extraction. The HSD problems were then contested by twelve different algorithms. The research's customised ALO algorithm produced the greatest performance.

M. Z. Ali et al. [6] collected tweets in Urdu for expert linguists to classify according to aspect and emotion levels. No database presently monitors the characteristics and severity of hate speech in Urdu. To lessen dimensionality, sparsity, and class imbalance, the authors used a variable global feature selection scheme (VGFSS), dynamic stop word filtering, and synthetic minority oversampling.

F. M. Plaza-Del-Arco et al. [7] offer an MTL model that includes related tasks like polarity and emotion categorization. Their suggested approach can outperform an STL BETO model and generate cutting-edge outcomes, as demonstrated by experiments on two benchmark corpora.

H. S. Alatawi et al. [8] used deep learning (BiLSTM) to study domain-specific and domain-agnostic word embeddings. The findings imply that this tactic is effective in putting a stop to white supremacists' hate speech. Additionally, the BERT model has demonstrated that it is the most up-to-date answer to this issue. The findings of the experiment indicate that BERT performs 4 points better than the domain-specific strategy; however, because the BERT model was trained on Wikipedia and literature, it is unable to distinguish phrases that have been deliberately misspelled or slang used by the hate community.

A. Rodriguez et al. [9] developed FADOHS, a method that harvests and combines various formats of data on social media. Since non-personal Facebook pages and accounts typically avoid using blatantly explicit language in postings for fear of having their accounts deleted from the platform or facing criticism, this problem was initially challenging to solve. Nevertheless, by addressing sensitive topics and employing relatively neutral language, many pages appear to encourage hate speech among their users and manage to elicit unfavorable reactions. The proposed methodology provides a novel method for grouping posts and comments as well as for recognising and identifying hate speech produced by hotly debated topics.

I. Ilie et al. [10] present findings that point to a preprocessing and classification pipeline. The experimental validation's dataset, which consisted of 100,000 news articles, was divided into true and false categories. Racist remarks are becoming common on social media sites like Twitter, so it is critical to automatically detect and filter them in order to halt their spread.

E. Lee et al.'s [11] sentiment analysis method identifies racist tweets by locating negative emotions. High-performance sentiment analysis is achieved by combining deep learning with the ensemble technique, in which GRU, CNN, and RNN are stacked to create the GCR-NN model. To assess machine learning, deep learning, and the proposed GCR-NN model, a sizable dataset collected from Twitter and annotated with TextBlob is employed. Of the 169,999 tweets that were examined, 31.49% contained racist sentiments. The GCR-NN analysis of the positive, negative, and neutral sentiment categories yielded an average accuracy score of 0.98.

TABLE I. Comparison of Various Techniques and Performance

Reference | Technique | Accuracy | F1 score | Recall | Precision | Limitations
[1] | Open NLP + SA + FE | 0.784 | 0.784 | 0.784 | 0.793 | Over-reliance on unigrams and patterns; evolving language and new patterns.
[2] | DCNN with k-fold | -- | 0.97 | 0.88 | 0.92 | Scalability and real-time detection; reliance on textual content; performance in different contexts.
[3] | Multi-tier meta learning | 0.646 | 0.5 | 0.8 | 0.5 | Computational complexity; adversarial attacks; ethical considerations.
[4] | CNN0, CNN1, CNN2 | 0.753 | 0.712 | -- | -- | Bias and fairness; robustness to new data and variations.
[5] | BoW + TF + Word2Vec ALO | 0.721 | 0.701 | -- | 0.707 | Data availability and quality; interpretability and explainability.
[6] | SVM + TF-IDF weights | -- | 0.985 | -- | -- | Language dependency; generalizability.
[7] | MTL sent + emo | -- | 0.921 | 0.902 | 0.936 | Very limited dataset; complex model architecture requiring substantial computational resources.
[8] | BERT | -- | 0.924 | 0.93 | 0.92 | Model selection; interpretability.
[9] | Emotion analysis using binary values and the emotion from VADER with three scores | -- | 0.5406 | 0.606 | 0.0487 | Lack of human annotation; reliance on external tools.
[10] | Bi-LSTM + attention | 0.875 | 0.884 | 0.892 | 0.883 | Lack of comparison with baselines; limited preprocessing techniques and embedding models.
[14] | Lexical rule based | -- | 0.7 | 0.73 | 0.73 | Challenges with dynamic language; potential bias and subjectivity.
[15] | Logistic regression with L regularization | -- | 0.906 | 0.9 | 0.9 | The hate speech lexicon may not capture all forms of hate speech or adapt well to evolving language usage.

III. METHODOLOGIES

The dataset used consists of labelled examples of tweets collected from a variety of sources, including online social networks and freely accessible hate speech datasets. Instances of both hateful and non-hateful content are included in the dataset, which has been thoroughly filtered. Every instance receives a label designating whether it is hate speech (label 1) or not (label 0). The ensemble classifier is trained using the training set, and its performance is assessed using the testing set. The VotingClassifier class from scikit-learn was used to generate the ensemble classifier, which combines the predictions of several base classifiers to produce the final prediction. RandomForestClassifier, BaggingClassifier, SVC, AdaBoostClassifier, and GradientBoostingClassifier are the base classifiers utilised in this work. Each base classifier is implemented as a Pipeline object made up of a TfidfVectorizer step and a classification algorithm step. In the TfidfVectorizer stage, the input text is transformed into numerical features using the Term Frequency-Inverse Document Frequency (TF-IDF) technique. TF-IDF measures the relative relevance of phrases by giving them weights based on their frequency in a document and inverse frequency across all documents. Thanks to this transformation, the classifiers can deal with numerical data instead of plain text. In the classification algorithm step, each base classifier uses the converted numerical features to classify the data using methods unique to that classifier. The RandomForestClassifier builds a group of decision trees and aggregates their predictions by averaging or voting. By bootstrapping subsets of the training data and training a different classifier on each subset, the BaggingClassifier reduces variance and boosts stability. To maximise the gap between classes, the SVC (Support Vector Classifier) divides the data into distinct groups using a hyperplane. Both the AdaBoostClassifier and the GradientBoostingClassifier are boosting techniques that combine weak learners to create a strong classifier. AdaBoost concentrates on challenging examples by giving incorrectly classified instances a higher weight, while Gradient Boosting builds weak learners in a sequential fashion to correct errors made by earlier learners.

Predictions are made for the testing data using the predict method after the ensemble classifier has been trained using the fit method on the training data. Several metrics are used to evaluate the performance of the ensemble classifier. These metrics reveal the classifier's ability to correctly classify instances and achieve a balance between recall and accuracy. To improve the performance of the ensemble classifier, individual base classifiers can be swapped for others using various techniques or iterations. Using hyperparameter tuning approaches, the classifiers' parameters may also be adjusted for better performance and customisation.

A. Random Forest Classifier

The Random Forest Classifier is a potent machine learning technique. As one of the base classifiers in the ensemble, the Random Forest Classifier [21] is crucial to our investigation. It contributes to the development of the ensemble classifier, which tries to improve the overall predictive performance, by merging various decision trees. A random portion of the training data and a random selection of features are used to build each decision tree in the Random Forest Classifier. This

increases variability and decreases overfitting, improving the model's generalizability and lowering its variance. We take advantage of the Random Forest Classifier's capacity to manage high-dimensional data and capture complicated relationships by including it in the ensemble classifier. Furthermore, the Random Forest Classifier excels at handling missing data and determining feature importance, allowing it to recognise the elements crucial for the classification task. In our research study we examine the Random Forest Classifier's efficiency within the ensemble framework and its effects on classification performance as a whole.

B. Support Vector Machine

For classification problems we specifically use the Support Vector Classifier (SVC) [20] variant of SVM. SVC is excellent at establishing decision boundaries that clearly divide several groups in the input data. SVC classifies fresh samples according to where they lie in relation to the hyperplane that best divides the classes. SVM is well suited for capturing complex connections between the input data and the classes since it can handle non-linear decision boundaries. High-dimensional data can be handled effectively thanks to SVM's flexibility in feature-space modification using kernel functions. A key element in improving the performance of SVM is choosing the right kernel function. We understand the significance of fine-tuning the SVM's hyperparameters. The choice of hyperparameters, notably the regularisation parameter and kernel-specific parameters, has a significant impact on SVM performance. We carefully tune these settings in order to get the best classification outcomes. Our study makes use of the robust and flexible framework that SVM, namely the SVC form, offers for performing exact classification tasks. By exploiting SVM's capacity for coping with non-linear decision boundaries, feature-space transformation, and tuned hyperparameter settings, we draw significant inferences and interpretations from our data analysis.

C. Bagging Classifier

The Bagging Classifier is a key component of our research's ensemble learning method for enhancing model performance. The Bagging Classifier assists in lowering variance and minimising overfitting by integrating numerous independent models into a single prediction model. It accomplishes this by using random sampling with replacement to extract subsamples from the original dataset, and then training individual models on these subsets. The variety of training data increases the model's capacity for generalisation and raises prediction precision. The Bagging Classifier provides the option of combining the predictions of the different models using either hard voting or soft voting, hence increasing the predictive power of the ensemble model. The Bagging Classifier offers several advantages beyond lowering variance. Its simple implementation and processing efficiency make it appropriate for handling huge datasets. Additionally, it enables us to quantify the significance of dataset features, facilitating feature selection and improving model interpretability. The Bagging Classifier may, however, be susceptible to outliers and may display redundancy when dealing with strongly correlated data, so it is vital to keep this in mind. Despite these drawbacks, the Bagging Classifier appears to be an effective ensemble learning strategy for our study, enhancing model performance and offering insights into our data analysis.

D. AdaBoost Classifier

AdaBoost is a potent ensemble learning algorithm that combines a number of weak classifiers to produce a strong classifier. Because it considerably enhances classification performance, it is essential to our research. AdaBoost concentrates on misclassified cases by iteratively modifying the weights of training examples, enabling succeeding classifiers to learn from the mistakes of earlier ones. AdaBoost is able to achieve great accuracy even with straightforward weak classifiers thanks to this adaptive boosting method. It is especially helpful in situations where computing efficiency is crucial, since it may produce precise results with a minimal number of estimators.

In our study, we develop the AdaBoost classifier using the scikit-learn AdaBoostClassifier class. We take into account crucial variables including the base estimator, the number of estimators, and the learning rate. The base estimator, often a decision tree with a depth of one (a stump), is the building block of the ensemble of weak classifiers. The learning rate governs how much each weak classifier contributes to the final prediction, and the number of estimators defines how many weak classifiers are learned. We may optimise the AdaBoost classifier's performance for our particular dataset and classification task by tweaking these parameters.

Overall, the AdaBoost classifier is a useful tool for solving a variety of classification problems due to its versatility, effectiveness, and computational economy. It can handle complex datasets and achieve high accuracy because it combines weak classifiers and learns from their mistakes.

E. Gradient Boosting Classifier

The Gradient Boosting Classifier (GBC) is a very successful ensemble learning technique that has consistently performed exceptionally well in trials. It can increase accuracy and capture complex relationships within the data because of its sequential tree-building method, in which each new tree fixes the flaws of the preceding one. In terms of prediction accuracy, GBC outperforms other ensemble algorithms and thrives in circumstances with many features, handling high-dimensional data well. Thanks to built-in regularisation techniques like shrinkage and subsampling, it is also resistant to noisy data. GBC is widely applied across industries, including finance, healthcare, and natural language processing. Researchers have obtained remarkable results in terms of accuracy and predictive capacity by optimising hyperparameters and making use of the sequential structure of GBC. It is an

effective tool for solving machine learning problems because of its capacity to manage complicated datasets, generate precise predictions, and generalise well to new data. In summary, the Gradient Boosting Classifier is a potent ensemble learning technique that increases accuracy by constructing trees sequentially. It performs better than competing algorithms, manages large-scale data, and efficiently mitigates noise. GBC is a popular option for machine learning research due to its broad applicability and remarkable performance across numerous areas.

F. TF-IDF Vectorization

Recent research has highlighted the pivotal role of TF-IDF vectorization in text analysis. TF-IDF vectorization significantly improved the performance of sentiment classification models compared to traditional bag-of-words representations. By transforming textual data into numerical representations, TF-IDF effectively captures the importance of specific words in each document, enabling models to focus on informative terms while downplaying common and less significant words. Researchers found that TF-IDF vectorization performed better in sentiment analysis tasks than alternative text representation techniques, such as word embeddings [18]. This implies that the interpretation and classification of textual data are improved by TF-IDF's capability to accurately reflect the value of words and account for their rarity across the corpus [19]. The results highlight the usefulness of TF-IDF vectorization for text analysis, notably for sentiment analysis. It has the potential to enhance the performance of many text analysis applications due to its capacity to capture word importance and preserve its relevance. The findings highlight the importance of TF-IDF [16] vectorization in sentiment analysis and show how well it can improve machine learning models' comprehension of textual input.

IV. IMPLEMENTATION AND RESULTS

To calculate the accuracy score of a hate speech detection algorithm, you will require a dataset comprising labeled instances of texts that are both hateful and not. The basic steps for determining the accuracy score are as follows: split the data into a training set and a testing set. Build a hate speech detection model using the training data; among the models that may be used for this purpose are random forests, support vector machines (SVMs) [17], and deep neural networks. Evaluate the trained model on the testing set: classify the texts in the test set using the trained model, then compare the predicted labels to the actual labels.

To create a confusion matrix in machine learning, your classification model must first be trained using a labelled dataset. After training, you may evaluate the model's performance using a new set of labelled data. Predictions are generated on the test dataset and compared with the actual labels. By putting the prediction results into a square matrix called the confusion matrix, it is possible to see the distribution of true positives, true negatives, false positives, and false negatives. This matrix depicts how well the model performs in terms of correctly and incorrectly detected cases. The confusion matrix may be used to compute a variety of performance measures, including accuracy, precision, recall, specificity, and the F1 score, to give deeper insights into the model's overall performance. The confusion matrix and related metrics are essential tools for testing and refining a classification model in machine learning tasks.

Fig 2. Confusion matrix

True negatives are cases where the model correctly identified non-hate speech, true positives are cases where the model correctly identified hate speech, false negatives are cases where the model incorrectly labelled hate speech as non-hate speech, and false positives are cases where the model incorrectly labelled non-hate speech as hate speech.

Fig 3. Accuracy over Epochs

Fig 4. Loss over Epochs

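The evaluation procedure described in this section (train/test split, TF-IDF pipeline, confusion matrix, derived metrics) can be sketched end-to-end as follows. The six-sentence corpus and the single Random Forest pipeline are invented stand-ins for the authors' labelled tweet dataset and full ensemble.

```python
# Hedged sketch of the evaluation loop: train a TF-IDF + classifier pipeline,
# predict on held-out texts, and derive metrics from the confusion matrix.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Invented toy corpus; 1 = hate speech, 0 = not.
texts = ["I hate this group", "have a nice day", "you people are awful",
         "what a lovely photo", "get out of our country", "good morning all"] * 10
labels = [1, 0, 1, 0, 1, 0] * 10

X_tr, X_te, y_tr, y_te = train_test_split(texts, labels, test_size=0.25,
                                          random_state=0, stratify=labels)

model = Pipeline([("tfidf", TfidfVectorizer()),
                  ("clf", RandomForestClassifier(random_state=0))])
model.fit(X_tr, y_tr)
pred = model.predict(X_te)

# Unpack the 2x2 confusion matrix and report the usual metrics.
tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
print("accuracy:", accuracy_score(y_te, pred),
      "precision:", precision_score(y_te, pred),
      "recall:", recall_score(y_te, pred),
      "f1:", f1_score(y_te, pred))
```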

TABLE II: Performance metrics of different algorithms.

Algorithm | Precision | Recall | F1 score | Accuracy
Bi-LSTM + Attention | 0.883 | 0.892 | 0.884 | 0.875
BoW + Word2Vec ALO | 0.707 | -- | 0.701 | 0.721
Ensemble Model Classifier | 0.9567 | 0.9692 | 0.9537 | 0.9792

The table shows that in terms of precision, recall, F1 score, and accuracy, the Ensemble Model Classifier fared better than the other methods. It obtained scores of 0.9567 for precision, 0.9692 for recall, 0.9537 for F1, and 0.9792 for accuracy. These findings demonstrate how the ensemble method can enhance the overall efficiency of hate speech detection models. The Bi-LSTM + Attention algorithm also demonstrated competitive performance, earning scores of 0.883 for precision, 0.892 for recall, 0.884 for F1, and 0.875 for accuracy. With precision and accuracy ratings of 0.707 and 0.721, respectively, the BoW + Word2Vec ALO method fared less well than the other two algorithms; its recall score, however, is not available for our investigation. These results emphasise the importance of using the right algorithm to detect hate speech on social media. The Ensemble Model Classifier demonstrates how the ensemble approach may significantly improve the performance of the detection model, achieving outstanding precision, recall, F1 score, and accuracy. In general, our comparative evaluation shows how successful the ensemble technique is at detecting hate speech and sheds light on how other algorithms perform. These findings expand hate speech identification methods and can aid academics and professionals in choosing the right algorithms for specific purposes.

Fig 5. Graphical comparison of the proposed method with other algorithms

This bar graph contrasts the accuracy, F1, recall, and precision scores for the approaches used to train the model.

V. DISCUSSION AND FUTURE WORKS

It is possible to identify challenges in the detection of hate speech based on earlier publications after analysing the research. It takes more than keyword searches to successfully identify hate speech, which is a difficult undertaking. On the basis of the review completed in the previous section, we may identify a few research issues in the automated identification of hatred in social media. Will we be able to distinguish between diverse settings of hate speech across cultures? This is one of the issues from a social and political perspective. Hate speech detection is a multidisciplinary subject that requires consideration from a range of viewpoints, especially given that there is no consensus on what constitutes hate speech or how to detect it through sentiment analysis. Along with the legitimacy concern, it might be challenging to distinguish between serious and trivial cases because hate speech has a variety of manifestations and occasionally refers to or is included in them. Choosing the optimal machine learning approach is a challenging decision from a technical perspective. Most researchers employed supervised machine learning methods for their task of automatic detection. The least popular methods are semi-supervised ones, followed by unsupervised methods. Consideration must be given to all factors that could affect our choice of the optimal course of action. Because some ML algorithms work well only with large datasets, the corpus size is one key factor to take into account.

VI. CONCLUSION

This paper reviews past investigations into the prevalence of hate speech across many media platforms, including online social networks. As more people turn to the internet to express their intolerance and hatred for particular groups of people, hate speech has been rising quickly around the world. Hate speech has grown to be a powerful catalyst for inciting violence and furthering nefarious political goals and intentions, which is a major cause for concern. Therefore, it is urgent that hate speech be identified and stopped before it spreads on social media in various forms, including audio, video, and images. Our methodology, to be described in more detail in future work, was identified and realised for this purpose through an in-depth analysis of previous studies. We further examine how the SVM performs overall in terms of accuracy and precision. Future work should consider using a variety of ensemble methods and larger datasets to improve the accuracy and robustness of hate speech identification, address the difficulties associated with identifying hate speech in many cultural contexts, and make use of sophisticated NLP approaches, such as deep learning and attention mechanisms, to capture intricate linguistic patterns. These methods seek to improve the identification of hate speech and establish safer online environments. The ensemble method gives overall accuracy over 95%. Several tactics can be used to improve the ensemble classifier's efficiency in detecting hate speech. These include expanding the ensemble size to enhance generalisation and enhancing ensemble diversity by utilising various techniques, feature sets, or

Manakula Vinayagar Institute of Technology


Puducherry, India. IEEE ICSCAN2023
ed licensed use limited to: Vignan's Foundation for Science Technology & Research (Deemed to be University). Downloaded on February 19,2024 at 10:02:11 UTC from IEEE Xplore. Restriction
ISBN 979-8-3503-1512-7
training data. Predictive accuracy can be improved further
by investigating alternative ensemble combination [12] A. Briliani, B. Irawan, & C. Setianingsih. (2019). Hate
speech detection in indonesian language on instagram
techniques and optimising hyperparameters. It is equally comment section using K-nearest neighbor classification
important to address class imbalance using data balancing method. In: Proc. - 2019 IEEE Int. Conf. Internet Things
approaches and to frequently assess the model's Intell. Syst. IoTaIS 2019, pp. 98–104. DOI:
effectiveness. 10.1109/IoTaIS47347.2019.898039
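The TF-IDF voting ensemble discussed above can be sketched with scikit-learn as follows. This is a minimal illustration only: the toy corpus, the particular base classifiers, and all hyperparameters are assumptions for the example, not the paper's exact configuration.

```python
# Minimal sketch of a TF-IDF + voting-ensemble text classifier.
# The toy corpus, base classifiers, and hyperparameters are illustrative
# assumptions, not the paper's exact setup.
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny labelled corpus: 1 = hateful, 0 = benign (placeholder examples).
texts = [
    "i hate those people they should all disappear",
    "what a lovely day for a walk in the park",
    "those people are vermin and deserve nothing",
    "great match last night, well played everyone",
    "go back to where you came from",
    "happy birthday, hope you have a wonderful year",
]
labels = [1, 0, 1, 0, 1, 0]

# Soft voting averages the predicted class probabilities of the base
# models; class_weight="balanced" is one simple guard against the label
# imbalance that real hate speech corpora typically exhibit.
voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(class_weight="balanced", max_iter=1000)),
        ("nb", MultinomialNB()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",
)
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
    voter,
)

model.fit(texts, labels)
preds = model.predict(texts)  # in practice, evaluate on a held-out split
print("precision:", precision_score(labels, preds))
print("recall:   ", recall_score(labels, preds))
print("f1:       ", f1_score(labels, preds))
print("accuracy: ", accuracy_score(labels, preds))
```

Soft voting requires every base estimator to expose `predict_proba`; hard voting (`voting="hard"`) instead takes a majority over the predicted labels and works with any classifier.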