An Optimized Crossover Framework for Social Media Sentiment Analysis

Surender Singh Samant, Vijay Singh, Arun Chauhan & Jagadish Dasarahalli Narasimaiah

Cybernetics and Systems, DOI: 10.1080/01969722.2022.2146849
ABSTRACT
Using deep learning, a new sentiment analysis model is designed in our article. The projected technique includes four phases: "pre-processing, feature extraction, feature selection, and sentiment classification." The original input data is first pre-processed by stemming, stop-word removal, and tokenization. The features "Bigram-BoW (B-BoW), Threshold Term Frequency-Inverse Document Frequency (T-TFIDF), Unigram, and N-Gram" are then extracted from the pre-processed data. The best of these features are then chosen by the Self-Improved Honey Badger Algorithm (SI-HBA), a conceptual improvement of the basic Honey Badger Algorithm (HBA). Review classification is then conducted via the proposed optimized crossover framework, constructed by hybridizing an optimized Bi-Long Short-Term Memory (Bi-LSTM) network with a Deep Belief Network (DBN) trained via transfer learning. The SI-HBA model's optimally chosen features are used to train the hybrid classifier within the optimized crossover architecture. SI-HBA is also used to fine-tune the weights of the Bi-LSTM classifier to improve the classification performance on the gathered reviews. The final results indicate whether the reviews are predominantly positive, negative, or neutral. The proposed sentiment classification model is then validated by a comparison analysis.

KEYWORDS
B-BoW; DBN with transfer learning; optimized Bi-LSTM; sentiment analysis; SI-HBA; social media; T-TFIDF
Introduction
The number of people using the internet has dramatically increased recently, with the number of users doubling every year (Alfrjani, Osman, and Cosma 2019; Khan, Qamar, and Bashir 2016; Wadawadagi and Pagi 2019; Ruz, Henríquez, and Mascareño 2020). This growth has led to a rise in the popularity of user-generated content on particular goods and services. After the emergence of social media, people are free to express their
language dataset and is appropriate for social media articles. But in this case,
the polarity of the emotions was not taken into account. The suggested
method can also be improved by using web-based learning algorithms and
up-to-date preparatory data (Khan, Qamar, and Bashir 2016). A BabelNet-based, knowledge-enhanced meta-classifier has also been proposed; its meta-learning technique helped acquire the most appropriate outcomes across the domain and achieved high performance (Kastrati, Imran, and Kurti 2020; Yin et al. 2020). Although many "supervised learning techniques" for sentiment categorization have been put forth, obtaining labeled corpora for each domain is incredibly difficult and is a major barrier to their practical use. Conversely, universally applicable assumption techniques don't need any labeled information, although their significance level isn't very high (Aziz and Starkey 2020; Gao et al. 2019; Deng et al. 2019; Zhang et al. 2019; Gu, Xu, and Luo 2020). Therefore, improvements to deep learning methods can be used to tackle these challenges. Since the public is free to voice their opinions or discuss any topic on blogs, online social networks, e-commerce websites, forums, etc., the development of web technologies has recently fueled the growth in user-generated data (Chan et al. 2022).
This work's main contributions are described in the sections that follow.
Literature Review
Alfrjani, Osman, and Cosma (2019) introduced a “Hybrid Semantic
Knowledgebase-Machine Learning technique” for mining views and sum-
marizing general opinions on a multi-point scale at the domain feature
level. This "Hybrid Semantic Knowledgebase-Machine Learning" technique
was shown to be suitable for expanding the database of semantic features
with higher-level precision since it improved the recall and accuracy of the
retrieved domain characteristics in a trial evaluation.
Khan, Qamar, and Bashir (2016) suggested a semi-supervised system for
sentiment analysis based on "Multi-objective model selection (MOMS)". To
increase classification accuracy and learn feature weights, they used a MOMS
procedure and a support vector machine. Compared to past studies on this issue, the proposed method obtains better results.
Wadawadagi and Pagi (2019) studied the use of lin-
guistic kernels in "subjectivity identification, opinion extraction, and polar-
ity classification" along with the addition of a hierarchical framework
centered on the Document object model (DOM) tree. The "opinion polar-
ity" in the returning emotional material was categorized in the last stage
using the fine-grained linguistic kernels.
Ruz, Henríquez, and Mascareño (2020) discussed the focus of senti-
ment analysis during big events, including social movements or natural dis-
asters. By using the Bayes factor technique to dynamically modify the
quantity of edges provided by the training examples in the Bayesian net-
work classifier, they were able to produce a more realistic network. Given a
high number of training instances, the results demonstrated the benefit of
employing the Bayes factor measure and its superior prediction results
when compared to SVM and RF.
Nagamanjula and Pethalakshmi (2020) developed a novel method for identifying a vocabulary set specific to Twitter Sentiment Analysis (TSA).
They demonstrated that the "Twitter Specific Lexicon Set (TSLS)" was minimal
and, more crucially, portable across domains. A set of vectorized tweets are pro-
vided by this method, which machine learning algorithms can use as input.
Kang, Ahn, and Lee (2018) designed a text classification sentiment ana-
lysis method based on text-based hidden Markov models (TextHMMs).
The proposed model did not employ a predetermined lexicon sentiment,
but rather a word arrangement in training texts. They have studied textual
patterns to build ensemble TextHMMs. Moreover, when determining the
"hidden variables in TextHMMs," the semantic cluster data was taken into
consideration. They demonstrated, in studies using a benchmark dataset, that this method was superior to several usual strategies and that it could potentially classify latent sentiments.
Step 1- Initially, the collected raw data $D_{inp}$ is pre-processed via tokenization, stop-word removal, and stemming. The pre-processed data is denoted as $D_{pre}$.

Step 2- Then, from the pre-processed data $D_{pre}$, the features Bigram-BoW (B-BoW) $g_{pBoW}$, Threshold-TFIDF (T-TFIDF) $g_{pTFIDF}$, Unigram $g_{uni}$, and N-Gram $g_{ngram}$ are extracted. These features are together denoted as $G = g_{pBoW} + g_{pTFIDF} + g_{uni} + g_{ngram}$.

Step 3- From the extracted features $G$, the most optimal features are picked by utilizing the SI-HBA. This SI-HBA model is a conceptual improvement of the standard HBA. The selected optimal features are denoted as $G_{opt}$.
Step 4- As the last step, the review classification is carried out using the suggested optimized crossover framework, created by fusing an optimized Bi-LSTM with a DBN trained using transfer learning. The DBN and optimized Bi-LSTM are trained using the best feature set selected by the SI-HBA model. The classification performance on the acquired reviews is enhanced by fine-tuning the weights of the Bi-LSTM classifier using the self-improved Honey Badger Algorithm (SI-HBA). The final results show whether the gathered reviews indicate a positive, negative, or neutral sentiment.
Pre-Processing
The cleansing and prepping of the text for categorization is known as pre-
processing the data. Noise and ambiguous sections such as HTML tags, JavaScript, and ads are common in electronic writings. Furthermore, most
phrases in the language have very little bearing on the overall direction of
the document. Retaining such phrases increases the problem’s complexity,
making categorization increasingly complicated as each phrase in text is
considered a uni-dimensional construct. The premise behind appropriately
pre-processing information is that reducing distortion in the text would
increase the effectiveness of the classification and quicken the categoriza-
tion, allowing for real-time sentiment analysis. Tokenization, stop-word removal, and stemming are used in our article to perform the pre-processing. Figure 2 diagrammatically depicts this phase.
The pre-processing phase’s steps are illustrated below:
Step 1- Input data initially enters the tokenization phase, where the tokens $D_{token}$ are extracted from $D_{inp}$. In any NLP pipeline, tokenization is indeed the preliminary stage, and it has a significant impact on the remainder of the pipeline.
Step 2- From $D_{token}$, the stop words are removed. Stop words are often screened out before a natural language is processed. These are the most frequent words in any language ("articles, prepositions, pronouns, conjunctions, and so on"), and they don't add anything to the text. Stop words in English include "the," "a," "an," "so," and "what." The data acquired after stop-word removal is denoted as $D_{stop}$.
Step 3- Then, stemming is carried out upon $D_{stop}$. Whenever 'fluff' letters (not words) are eliminated from a word, the "stem form" is gathered. The terms 'play,' 'playing,' and 'plays,' for example, imply the same thing; rather than keeping them as separate terms, we can group them under the stem 'play.' The outcome acquired after stemming is the pre-processed data, denoted as $D_{pre}$.
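A minimal sketch of these three steps is given below, assuming the NLTK library; the article does not name a specific toolkit, so the tokenizer, stop-word list, and Porter stemmer used here are illustrative choices only.

```python
# Illustrative pre-processing sketch (assumed NLTK toolkit, not from the paper).
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)      # tokenizer model
nltk.download("stopwords", quiet=True)  # English stop-word list

def preprocess(d_inp: str) -> list[str]:
    d_token = word_tokenize(d_inp.lower())                            # Step 1: D_token
    stops = set(stopwords.words("english"))
    d_stop = [w for w in d_token if w.isalpha() and w not in stops]   # Step 2: D_stop
    stemmer = PorterStemmer()
    return [stemmer.stem(w) for w in d_stop]                          # Step 3: D_pre

print(preprocess("The players are playing so well in the game!"))
# -> ['player', 'play', 'well', 'game']
```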
Feature Extraction
From the pre-processed data $D_{pre}$, the features Bigram-BoW (B-BoW),
Threshold Term Frequency-Inverse Document Frequency (T-TFIDF), Unigram,
and N-Gram are extracted. This phase is diagrammatically shown in Figure 3.
Bigram-Bag-of-Words
The Bag of Words (BoW; Chen, Yap, and Chau 2011) model is the sim-
plest form of text representation in numbers. The Bag-of-Words counts the
total occurrence of the most frequently utilized words in the document.
However, the Bag-of-Words suffers from drawbacks such as: "(a) If the new sentences contain new words, then our vocabulary size would increase and thereby, the length of the vectors would increase too and (b) Additionally, the vectors would also contain many 0s, thereby resulting in a sparse matrix (which is what we would like to avoid)". In this study, a Bigram-BoW (B-BoW) model is therefore proposed. The B-BoW first builds a list of all the words in $D_{pre}$. Then, the words in each document are scored based on their frequency using bigrams (instead of the unigrams used in the standard BoW). The frequency of each word in $D_{pre}$ is computed, and the positive weight is computed as per Eq. (1).
$$B = \frac{\sum_{i=1}^{K} C_i\, t_i}{C_t} \qquad (1)$$
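As an illustration of the bigram scoring idea (not the exact positive-weight computation of Eq. (1)), the following sketch builds a bigram bag-of-words with scikit-learn's CountVectorizer:

```python
# Bigram bag-of-words sketch using scikit-learn; the exact weighting of
# Eq. (1) is not reproduced here.
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the movie was great", "the plot was weak", "great movie great cast"]

bigram_bow = CountVectorizer(ngram_range=(2, 2))  # score bigrams instead of unigrams
g_bow = bigram_bow.fit_transform(docs)

print(bigram_bow.get_feature_names_out())  # the bigram vocabulary
print(g_bow.toarray())                     # per-document bigram counts
```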
Threshold Term Frequency-Inverse Document Frequency (T-TFIDF)
Here, $tf_t$ and $idf_t$ denote the term frequency and inverse document frequency, respectively. In addition, $Thr$ indicates the threshold value, which is fixed between $[-1, +1]$: the most extreme negative is represented by $-1$, and the most extreme positive by $+1$. Further, $d$ and $t$ denote the count of files in the corpus and the count of documents containing the words, respectively. The extracted feature is denoted as $g_{pTFIDF}$.
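One possible reading of the T-TFIDF feature is sketched below: standard TF-IDF scores are computed and entries below a threshold are zeroed out. The numeric threshold used here is an assumption for illustration, not the paper's polarity-scaled $Thr$.

```python
# Threshold-TFIDF sketch: plain TF-IDF followed by a magnitude threshold.
# The threshold value is an assumed stand-in for the paper's Thr in [-1, +1].
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["the movie was great", "the plot was weak", "great movie great cast"]
thr = 0.4  # assumed threshold

vec = TfidfVectorizer()
scores = vec.fit_transform(docs).toarray()

g_tfidf = np.where(scores >= thr, scores, 0.0)  # keep only scores clearing Thr
print(vec.get_feature_names_out())
print(g_tfidf.round(2))
```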
Unigram
The features based on unigrams are retrieved using a dictionary created using user-defined parameters. It is assumed that a term's appearance is independent of the word before it. The extracted unigram based feature from $D_{pre}$ is denoted as $g_{uni}$.
N-Gram
Text N-grams are commonly employed in text mining and natural language
processing. An n-gram is a continuous series of n components from a given
sample of text. The number of n-grams in a sentence may be calculated mathematically using Eq. (3):

$$N_{grams} = X - (N - 1) \qquad (3)$$

Here, $X$ denotes the count of words in the sentence and $N$ is the n-gram order (i.e., for unigram $N = 1$, for bigram $N = 2$, and so on). The extracted N-gram based feature from $D_{pre}$ is denoted as $g_{ngram}$. The extracted features are together denoted as $G = g_{pBoW} + g_{pTFIDF} + g_{uni} + g_{ngram}$.
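A short sketch of plain n-gram extraction, which also verifies the count relation of Eq. (3), is given below:

```python
# N-gram extraction sketch; a sentence of X words yields X - (N - 1) n-grams.
def ngrams(words: list[str], n: int) -> list[tuple[str, ...]]:
    # Slide a window of width n across the token list.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

words = "the movie was surprisingly good".split()  # X = 5
for n in (1, 2, 3):
    grams = ngrams(words, n)
    print(n, len(grams), grams)  # len(grams) == X - (n - 1)
```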
Feature Selection
SI-HBA
The development of a new SI-HBA model resolves the optimization
issue that is the focus of the current research. This SI-HBA model was
created using the conventional HBA (Hashim et al. 2022) model, which
is based on the feeding habits of honey badgers. The "digging phase"
and the "honey phase" are the two main phases that the suggested SI-
HBA model covers. The input to SI-HBA is the weights of the Bi-LSTM, $W_{1,2,\dots,Q}$, and the extracted features $G$. The solution encoding is shown in Figure 4.
Initialization: the population of solutions is initialized, where $N$ denotes the population size and $D$ the dimension, as shown in Eq. (4) and Eq. (5), respectively.

$$P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1D} \\ p_{21} & p_{22} & \cdots & p_{2D} \\ \vdots & \vdots & \ddots & \vdots \\ p_{N1} & p_{N2} & \cdots & p_{ND} \end{bmatrix} \qquad (4)$$
Step 1- Using Eq. (6), determine the search agent's fitness. The main goal of this study is to reduce categorization errors as much as possible; this may be expressed mathematically as in Eq. (6).
Step 2- The best position is saved as $p_{prey}$ and its fitness is assigned to $fit_{prey}$.
Step 3- Check the termination criterion: while $itr \leq max_{itr}$. If this condition is satisfied, then proceed to Step 4.
Step 4- Use the newly projected equations (Eq. (7) and Eq. (8)) to update the density factor $\alpha$. The density factor governs time-varying randomness to guarantee a successful transition from exploration to exploitation; it is a lowering factor that decreases with iteration, limiting randomness over time.

$$\alpha = C \cdot \exp\!\left(\frac{-itr \cdot rad}{max_{itr}}\right) \qquad (7)$$
Here, a new variable $rad$ is introduced to avoid the solution getting trapped in local optima:

$$rad = \frac{Q}{2\pi} \qquad (8)$$
Here, $Q$ is the circumference of the prey, $rad$ is the radius between the honey badger (search agent) and the prey (target: optimal weight or optimal feature), and the constant $C = 2$.
Step 5- For $i = 1$ to $N$, compute the intensity of the solutions $I_i$ using the newly projected (proposed) expression given in Eq. (9)-Eq. (11). The prey's concentration strength and the distance between it and the honey badger determine the intensity. $I_i$ is the prey's scent strength; if the smell is strong, the motion will be quick, and vice versa, according to the Inverse Square Law.

$$I_i = rand_2 \times \frac{S}{4\pi D_i^2} \times Mean \qquad (9)$$
Here, $rand_2$ is a randomly generated number between 0 and 1. The mean is computed over the gathered solutions in order to steer the search toward the best answers; as a result, the solutions' convergence speed improves.

$$S = (p_{itr} - p_{itr+1})^2 \qquad (10)$$

$$D_i = p_{prey} - p_i \qquad (11)$$

The distance between the prey and the $i$-th search agent is denoted by $D_i$, while the source (concentration) strength is denoted by $S$.
Step 6- Generate a random number $r$ between 0 and 1.
Step 7- If $r < 0.5$, then update the position $p_{new}$ in the digging phase using Eq. (12). During the digging phase, a honey badger adopts a cardioid-shaped motion. Here, $p_{prey}$ denotes the prey's position, which is the best search agent so far, and $flag$ changes the search direction.
Step 8- Else, if $r \geq 0.5$, then update the position $p_{new}$ using the honey phase as shown in Eq. (13). The scenario in which a honey badger follows a honeyguide bird to a beehive is recreated here.
The selected optimal feature set is denoted as $G_{opt}$. The feature selection phase is shown in Figure 5.
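A compact sketch of the SI-HBA search loop on a toy minimization problem is given below. It follows the standard HBA of Hashim et al. (2022), with the two modifications described above folded in: the density factor of Eq. (7) includes the $rad$ term of Eq. (8), and the intensity of Eq. (9) is scaled by the population's mean fitness. Since Eqs. (12) and (13) are not reproduced in the text, the digging and honey updates follow the standard HBA forms, and the value of $Q$, the search bounds, and $\beta$ are assumed constants.

```python
# SI-HBA sketch: standard HBA skeleton with the paper's modified density
# factor (Eqs. (7)-(8)) and mean-scaled intensity (Eq. (9)). Digging/honey
# updates use the standard HBA forms; Q, beta, and bounds are assumed values.
import numpy as np

def si_hba(fitness, dim=5, n=20, max_itr=100, beta=6.0, C=2.0, Q=1.0):
    rng = np.random.default_rng(0)
    pop = rng.uniform(-5, 5, (n, dim))                 # Eq. (4): initial population
    fit = np.array([fitness(p) for p in pop])
    best = fit.argmin()
    prey, prey_fit = pop[best].copy(), fit[best]       # best search agent so far

    rad = Q / (2 * np.pi)                              # Eq. (8)
    for itr in range(1, max_itr + 1):
        alpha = C * np.exp(-itr * rad / max_itr)       # Eq. (7): decreasing density factor
        mean_fit = fit.mean()
        for i in range(n):
            d = prey - pop[i]                                         # Eq. (11)
            s = np.sum((pop[i] - pop[(i + 1) % n]) ** 2)              # Eq. (10)
            I = rng.random() * s / (4 * np.pi * (d @ d + 1e-12)) * mean_fit  # Eq. (9)
            flag = 1.0 if rng.random() <= 0.5 else -1.0               # search direction
            r3, r4, r5 = rng.random(3)
            if rng.random() < 0.5:                                    # digging phase
                new = (prey + flag * beta * I * prey + flag * r3 * alpha * d
                       * abs(np.cos(2 * np.pi * r4) * (1 - np.cos(2 * np.pi * r5))))
            else:                                                     # honey phase
                new = prey + flag * r3 * alpha * d
            f_new = fitness(new)
            if f_new < fit[i]:                         # greedy replacement
                pop[i], fit[i] = new, f_new
                if f_new < prey_fit:
                    prey, prey_fit = new.copy(), f_new
    return prey, prey_fit

best_pos, best_cost = si_hba(lambda x: float(np.sum(x ** 2)))  # toy sphere objective
print(best_cost)
```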
Bi-LSTM
A particular kind of artificial RNN utilized in deep learning is the Bi-
LSTM (Rajeyyagari 2020). The LSTM’s advantage is that it can analyze a
stream of data as opposed to just one data point. A typical Bi-LSTM unit
consists of a cell, an input gate, an output gate, and a forget gate. The three
gates control how information enters and leaves the cell, and the cell
retains values for an extended length of time. Memory blocks are a special type of unit found in the bidirectional recurrent LSTM's hidden layer. Self-connections in the memory cells store the network's temporal state within the memory blocks. The gates also function as multiplicative units for managing data flow. Two of the gates in the Bi-LSTM architecture are the input gate and the output gate. The output gate regulates the activity flow in the unused portions of the network, while the input
gate regulates the input activation flows into the memory cell. There is also
a forget gate in the memory block. In order to adaptively forget or reset
the memory of the cell, the forget gate scales the internal state of the cell
before adding it to the cell (as input) via the cell's self-recurrent link. $E$ and $H$ represent the hidden and cell states of the LSTM, respectively, and the network is trained with $G_{opt}$. In addition, for the time instant $t$, $For_t$, $Inp_t$, and $out_t$ stand for the forget gate, input gate, and output gate, respectively. The Bi-LSTM model is mathematically modeled as per Eq. (14)-Eq. (17).
$$Inp_t = \sigma\!\left(We_{Inp}\, G_t^{opt} + R_{Inp} \cdot H_{t-1} + B_{Inp}\right) \qquad (14)$$

$$For_t = \sigma\!\left(We_{For}\, G_t^{opt} + R_{For} \cdot H_{t-1} + B_{For}\right) \qquad (15)$$

$$C_t = \sigma\!\left(We_{C}\, G_t^{opt} + R_{C} \cdot H_{t-1} + B_{C}\right) \qquad (16)$$

$$out_t = \sigma\!\left(We_{out}\, G_t^{opt} + R_{out} \cdot H_{t-1} + B_{out}\right) \qquad (17)$$
The notation $\sigma$ denotes the gate's activation function (sigmoid). In addition, $We_{Inp}$, $We_{For}$, $We_C$, and $We_{out}$ denote the input weight matrices, and $R_{Inp}$, $R_{For}$, $R_C$, and $R_{out}$ are the recurrent weight matrices. Further, $G_t^{opt}$ is the input and $H_{t-1}$ is the output at the previous time step $t-1$. Moreover, $B_{Inp}$, $B_{For}$, $B_C$, and $B_{out}$ denote the bias vectors.
The outcome acquired from the Bi-LSTM is denoted as $O_{BiLSTM}$.
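A minimal Bi-LSTM classifier sketch in Keras is shown below. The layer sizes, the embedding layer, and the optimizer are illustrative assumptions, and the SI-HBA weight tuning described above is not reproduced here.

```python
# Bi-LSTM classifier sketch (Keras); dimensions are assumed, and the
# SI-HBA fine-tuning of the Bi-LSTM weights is not shown.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, seq_len, n_classes = 5000, 100, 3  # assumed sizes; 3 = pos/neg/neutral

model = keras.Sequential([
    layers.Embedding(vocab_size, 64),
    layers.Bidirectional(layers.LSTM(64)),        # forward and backward passes
    layers.Dense(n_classes, activation="softmax")
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Toy usage with random token ids standing in for the selected features G_opt.
x = np.random.randint(0, vocab_size, (32, seq_len))
y = np.random.randint(0, n_classes, (32,))
model.fit(x, y, epochs=1, verbose=0)
```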
Deep Belief Network with Transfer Learning
The following are the steps taken in the proposed transfer learning based Deep Belief Network (DBN):
Pre-train the network after initializing the basic DBN structure with a single hidden layer.
Transfer learning is used to move the knowledge of the existing neurons and hidden layers to the newly inserted neurons and hidden layers.
Once the parameters are established, the joint probability distribution of $(vis, hid)$ can be determined in terms of the energy function using Eq. (19) and Eq. (20), respectively.

$$prob(vis, hid) = \frac{1}{Z} e^{-E(vis, hid)} \qquad (19)$$

$$Z = \sum_{vis,\, hid} e^{-E(vis, hid)} \qquad (20)$$
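For a small enough RBM, the partition function $Z$ of Eq. (20) can be enumerated exactly, which makes Eq. (19) directly computable. The following toy sketch (with random stand-in parameters) illustrates this:

```python
# Toy numeric illustration of Eqs. (19)-(20) for a tiny RBM, with random
# stand-in parameters; real DBN pre-training would learn these layer by layer.
import itertools
import numpy as np

rng = np.random.default_rng(1)
n_vis, n_hid = 3, 2
W = rng.normal(0, 0.1, (n_vis, n_hid))  # visible-hidden weights
b = rng.normal(0, 0.1, n_vis)           # visible bias
c = rng.normal(0, 0.1, n_hid)           # hidden bias

def energy(v, h):
    # Standard RBM energy: E(vis, hid) = -vis.b - hid.c - vis.W.hid
    return -(v @ b + h @ c + v @ W @ h)

# Eq. (20): Z sums exp(-E) over every binary (vis, hid) configuration.
states = [np.array(s) for s in itertools.product([0, 1], repeat=n_vis + n_hid)]
Z = sum(np.exp(-energy(s[:n_vis], s[n_vis:])) for s in states)

v, h = np.array([1, 0, 1]), np.array([1, 0])
print(np.exp(-energy(v, h)) / Z)  # Eq. (19): joint probability prob(vis, hid)
```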
Dataset Description
The dataset for assessment is taken from: https://analyticsdrift.com/top-15-datasets-for-sentiment-analysis-with-significant-citations/. Thirty percent of the gathered data is used for testing, and the remaining seventy percent is used for training. The Yelp'13 and IMDB reviews with "segment-level polarity labels" (positive/neutral/negative) were combined to create the SPOT sentiment analysis dataset, which contains 197 reviews. Annotations have been gathered at two levels of granularity: sentences and Elementary Discourse Units (EDUs). In total, 2,591 reviews are considered for this study. The dataset is well suited for testing approaches aimed at segment-level and fine-grained sentiment prediction.
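The 70/30 split can be reproduced with a standard utility; the snippet below is a sketch assuming the reviews and labels are already loaded into Python lists (the label coding is an assumption):

```python
# 70/30 train-test split sketch; data and label coding are placeholders.
from sklearn.model_selection import train_test_split

reviews = ["great film", "terrible plot", "it was okay", "loved it"]  # placeholder reviews
labels = [2, 0, 1, 2]  # assumed coding: 0 = negative, 1 = neutral, 2 = positive

x_train, x_test, y_train, y_test = train_test_split(
    reviews, labels, test_size=0.30, random_state=42)  # 30% test, 70% train
```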
Table 2. Performance analysis of the projected sentiment analysis model with the optimized crossover classifier framework (CCF): Accuracy.

| Percentage (%) | SVM | RF | MLP | DT | DCNN (Jianqiang, Xiaolin, and Xuejun 2018) | GA (Iqbal et al. 2019) | DOA + CCF | DOXA + CCF | SSOA + CCF | BOA + CCF | HBA + CCF | SI-HBA + CCF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 60 | 77.5501 | 79.28002 | 86.37899 | 85.6866 | 86.24184 | 84.71725 | 81.37901 | 82.71071 | 81.22962 | 80.41529 | 87.19045 | 89.407 |
| 70 | 86.42203 | 84.81105 | 88.22619 | 87.12184 | 87.36478 | 85.64309 | 83.79588 | 86.72254 | 84.56748 | 85.03198 | 89.10113 | 91.062 |
| 80 | 88.78922 | 87.48918 | 88.36221 | 89.05861 | 87.49772 | 88.83073 | 86.32398 | 88.57575 | 89.41953 | 89.39895 | 89.42981 | 91.82806 |
| 90 | 91.79274 | 91.29498 | 92.06983 | 91.95122 | 87.6092 | 91.43474 | 89.35776 | 91.56226 | 92.0797 | 92.06983 | 92.11916 | 93.54201 |
Table 3. Performance analysis of the projected sentiment analysis model with the optimized crossover classifier framework (CCF): Precision.

| Percentage (%) | SVM | RF | MLP | DT | DCNN (Jianqiang, Xiaolin, and Xuejun 2018) | GA (Iqbal et al. 2019) | DOA + CCF | DOXA + CCF | SSOA + CCF | BOA + CCF | HBA + CCF | SI-HBA + CCF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 60 | 76.55309 | 75.67735 | 86.09524 | 84.89108 | 72.61913 | 82.29053 | 81.8116 | 80.1628 | 79.02274 | 77.9492 | 88.28955 | 90.33242 |
| 70 | 86.34398 | 78.14407 | 87.44712 | 86.22608 | 86.55079 | 83.84125 | 82.68269 | 81.58098 | 80.18023 | 79.28348 | 88.4889 | 90.95782 |
| 80 | 88.12229 | 87.07446 | 88.45157 | 87.84766 | 86.65414 | 87.80949 | 87.49627 | 87.84232 | 85.91789 | 88.24308 | 90.58501 | 92.10971 |
| 90 | 90.50345 | 89.40628 | 90.031 | 90.36564 | 87.96859 | 90.53813 | 90.30596 | 89.64969 | 90.35713 | 90.34009 | 92.33536 | 93.96676 |
Table 4. Performance analysis of the projected sentiment analysis model with the optimized crossover classifier framework (CCF): F1-score.

| Percentage (%) | SVM | RF | MLP | DT | DCNN (Jianqiang, Xiaolin, and Xuejun 2018) | GA (Iqbal et al. 2019) | DOA + CCF | DOXA + CCF | SSOA + CCF | BOA + CCF | HBA + CCF | SI-HBA + CCF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 60 | 75.67568 | 78.14371 | 85.69158 | 84.48282 | 86.12615 | 83.63879 | 81.87609 | 80.16492 | 79.02176 | 77.95276 | 88.06303 | 90.44404 |
| 70 | 76.4992 | 85.99196 | 87.25889 | 86.03081 | 87.24895 | 86.79919 | 82.68003 | 81.58285 | 80.1509 | 79.28311 | 88.27502 | 90.74501 |
| 80 | 87.54668 | 88.55807 | 87.50559 | 86.84115 | 87.41319 | 89.80916 | 87.452 | 87.83297 | 85.72281 | 86.18824 | 90.36066 | 91.9837 |
| 90 | 89.77078 | 92.05005 | 89.24815 | 90.01946 | 88.03171 | 92.17497 | 90.29453 | 89.56859 | 90.35123 | 90.33235 | 92.28726 | 93.90249 |
Table 5. Performance analysis of the projected sentiment analysis model with the optimized crossover classifier framework (CCF): MCC.

| Percentage (%) | SVM | RF | MLP | DT | DCNN (Jianqiang, Xiaolin, and Xuejun 2018) | GA (Iqbal et al. 2019) | DOA + CCF | DOXA + CCF | SSOA + CCF | BOA + CCF | HBA + CCF | SI-HBA + CCF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 60 | 75.73362 | 72.54056 | 75.64814 | 74.27443 | 76.33068 | 74.18818 | 75.53932 | 72.98386 | 72.06794 | 77.26195 | 79.0119 | 81.02263 |
| 70 | 80.4513 | 77.85612 | 79.326 | 80.99051 | 80.34718 | 80.53436 | 81.58991 | 80.02691 | 81.71374 | 81.67248 | 81.73436 | 83.52166 |
| 80 | 86.49555 | 85.48857 | 87.0571 | 86.81664 | 80.7728 | 85.77107 | 87.0571 | 87.15718 | 87.07713 | 86.03018 | 87.19718 | 90.09583 |
| 90 | 91.84366 | 91.3741 | 92.23413 | 91.99994 | 81.55799 | 91.68726 | 92.1756 | 90.76053 | 91.47058 | 92.23413 | 92.23413 | 93.6981 |
Ablation Study
The anticipated model has been evaluated in four distinct configurations: (a) the proposed model with conventional BoW + Threshold-TFIDF (T-TFIDF), Unigram, and N-Gram + optimal feature selection + optimized crossover classifier framework; (b) the proposed model with conventional TF-IDF + Bigram-BoW (B-BoW), Unigram, and N-Gram + optimal feature selection + optimized crossover classifier framework; (c) the proposed model without optimal feature selection, using feature extraction (Bigram-BoW (B-BoW), Threshold-TFIDF (T-TFIDF), Unigram, and N-Gram) + optimized crossover classifier framework; and (d) the full proposed model with feature extraction (Bigram-BoW (B-BoW), Threshold-TFIDF (T-TFIDF), Unigram, and N-Gram) + optimal feature selection + optimized crossover classifier framework. With the standard BoW model, the standard TF-IDF model, and without optimal feature selection, the projected model only reaches accuracy levels of 89.2%, 90.3%, and 87.5%, respectively, in the acquired outcomes. But when the improved BoW, the improved TF-IDF, and optimal feature selection are used, the projected model improves dramatically. Table 6 displays the results that were obtained.
Figure 7. Analysis on training (a) Accuracy, (b) Precision, (c) F-Measure, and (d) MCC.
As Figure 7 shows, the precision, F-measure, and MCC of the proposed model hold the greatest values, demonstrating the improvement of the suggested method.
Convergence Analysis
Using the new SI-HBA model, the optimization problem in the sentiment classification model was resolved. The SI-HBA model is used to select the best features and to adjust the Bi-LSTM's weights in order to increase classification accuracy. The fitness function behind the SI-HBA model minimizes classification errors; it should therefore record the minimal cost function to exhibit the best convergence behavior. Interestingly, owing to the improvements made to the HBA model (including the updated density factor), the estimated scheme adapts to changes in the iteration count and has consistently returned the lowest cost function. Moreover, at the 100th iteration, the HBA model records a cost function of 1.09, whereas SI-HBA records an enhanced convergence of 1.07. Thus, it is clear that with self-improvement within the standard optimization, the solutions' convergence rate can be boosted. The findings are presented in Figure 8.
Conclusion
This study created a new sentiment analysis model based on deep learning. Here, the SI-HBA model's optimally chosen features are used to train the hybrid classifier within an optimized crossover framework. By means of SI-HBA, the weights of the Bi-LSTM classifier are adjusted to improve the classification results on the gathered reviews. The final results indicate whether the reviews are mostly positive, negative, or neutral. The proposed sentiment classification model is then validated by a comparison analysis. At the 90th learning percentage (LP), the predicted strategy had the maximum accuracy, 93.54%. Moreover, at the 80th LP, the planned technique achieved the highest sentiment classification accuracy of 91.82806, which is superior to SVM = 88.78922, RF = 87.48918, MLP = 88.36221, DT = 89.05861, DCNN (Jianqiang, Xiaolin, and Xuejun 2018) = 87.49772, GA (Iqbal et al. 2019) = 88.83073, DOA = 86.32398, DOXA = 88.57575, SSOA = 89.41953, BOA = 89.39895, and HBA = 89.42981. Designing a new optimized crossover classifier architecture for sentiment classification is the main factor contributing to the projected model's improved accuracy in sentiment classification.
References
Alattar, F., and K. Shaalan. 2021. Using artificial intelligence to understand what causes
sentiment changes on social media. IEEE Access 9:61756–67. doi:10.1109/ACCESS.2021.
3073657.
Alfrjani, R., T. Osman, and G. Cosma. 2019. A hybrid semantic knowledgebase-machine
learning approach for opinion mining. Data & Knowledge Engineering 121:88–108. doi:
10.1016/j.datak.2019.05.002.
Ali, F., K. S. Kwak, and Y. G. Kim. 2016. Opinion mining based on fuzzy domain ontology
and Support Vector Machine: A proposal to automate online review classification.
Applied Soft Computing 47:235–50. doi:10.1016/j.asoc.2016.06.003.
Araque, O., J. F. Sanchez-Rada, and C. A. Iglesias. 2022. GSITK: A sentiment analysis
framework for agile replication and development. SoftwareX 17:100921. doi:10.1016/j.
softx.2021.100921.
Aydin, C. R., and T. Güngör. 2020. Combination of recursive and recurrent neural networks for aspect-based sentiment analysis using inter-aspect relations. IEEE Access 8:77820–32. doi:10.1109/ACCESS.2020.2990306.
Ayesha, H. 2021. Race detection using mutated salp swarm optimization algorithm based
DBN from face shape features. Multimedia Research 4 (2):879–905.
Aziz, A. A., and A. Starkey. 2020. Predicting supervise machine learning performances for
sentiment analysis using contextual-based approaches. IEEE Access 8:17722–33. doi:10.
1109/ACCESS.2019.2958702.
Chan, J. Y. L., K. T. Bea, S. M. H. Leow, S. W. Phoong, and W. K. Cheng. 2022. State of
the art: a review of sentiment analysis based on sequential transfer learning. Artificial
Intelligence Review:1–32.
Chen, T., K. H. Yap, and L. P. Chau. 2011. From universal bag-of-words to adaptive bag-
of-phrases for mobile scene recognition. In 2011 18th IEEE International Conference on
Image Processing (825–8). IEEE.
Dalaorao, G. A., A. M. Sison, and R. P. Medina. 2019. Integrating collocation as tf-idf
enhancement to improve classification accuracy. In 2019 IEEE 13th International
Conference on Telecommunication Systems, Services, and Applications (TSSA) (pp.
282–285). IEEE. doi:10.1109/TSSA48701.2019.8985458.
Deng, D., L. Jing, J. Yu, and S. Sun. 2019. Sparse self-attention LSTM for sentiment lexicon construction. IEEE/ACM Transactions on Audio, Speech, and Language Processing 27 (11):1777–90.
Gao, Y., J. Liu, P. Li, and D. Zhou. 2019. CE-HEAT: an aspect-level sentiment classification
approach with collaborative extraction hierarchical attention network. IEEE Access 7:
168548–56. doi:10.1109/ACCESS.2019.2954590.
Gu, T., G. Xu, and J. Luo. 2020. Sentiment analysis via deep multichannel neural networks
with variational information bottleneck. IEEE Access 8:121014–21. doi:10.1109/ACCESS.
2020.3006569.
Hashim, F. A., E. H. Houssein, K. Hussain, M. S. Mabrouk, and W. Al-Atabany. 2022.
Honey Badger Algorithm: New metaheuristic algorithm for solving optimization prob-
lems. Mathematics and Computers in Simulation 192:84–110. doi:10.1016/j.matcom.2021.
08.013.
Iqbal, F., J. M. Hashmi, B. C. Fung, R. Batool, A. M. Khattak, S. Aleem, and P. C. Hung.
2019. A hybrid framework for sentiment analysis using genetic algorithm based feature
reduction. IEEE Access 7:14637–52. doi:10.1109/ACCESS.2019.2892852.
Januario, B. A., A. E. D. O. Carosia, A. E. A. da Silva, and G. P. Coelho. 2022. Sentiment
analysis applied to news from the Brazilian stock market. IEEE Latin America
Transactions 20 (3):512–8. doi:10.1109/TLA.2022.9667151.
Jianqiang, Z., G. Xiaolin, and Z. Xuejun. 2018. Deep convolution neural networks for twit-
ter sentiment analysis. IEEE Access 6:23253–60. doi:10.1109/ACCESS.2017.2776930.
Kang, M., J. Ahn, and K. Lee. 2018. Opinion mining using ensemble text hidden Markov
models for text classification. Expert Systems with Applications 94:218–27. doi:10.1016/j.
eswa.2017.07.019.
Kastrati, Z., A. S. Imran, and A. Kurti. 2020. Weakly supervised framework for aspect-
based sentiment analysis on students’ reviews of MOOCs. IEEE Access 8:106799–810.
doi:10.1109/ACCESS.2020.3000739.
Khan, F. H., U. Qamar, and S. Bashir. 2016. Multi-objective model selection (MOMS)-
based semi-supervised framework for sentiment analysis. Cognitive Computation 8 (4):
614–28. doi:10.1007/s12559-016-9386-8.
Li, Z., R. Li, and G. Jin. 2020. Sentiment analysis of danmaku videos based on naïve Bayes and sentiment dictionary. IEEE Access 8:75073–84. doi:10.1109/ACCESS.2020.2986582.
Liang, H., U. Ganeshbabu, and T. Thorne. 2020. A dynamic Bayesian network approach
for analysing topic-sentiment evolution. IEEE Access 8:54164–74. doi:10.1109/ACCESS.
2020.2979012.
Mehanna, Y. S., and M. B. Mahmuddin. 2021. A semantic conceptualization using tagged
bag-of-concepts for sentiment analysis. IEEE Access 9:118736–56. doi:10.1109/ACCESS.
2021.3107237.
Nagamanjula, R., and A. Pethalakshmi. 2020. A novel framework based on bi-objective
optimization and LAN2FIS for Twitter sentiment analysis. Social Network Analysis and
Mining 10 (1):1–16. doi:10.1007/s13278-020-00648-5.
Obiedat, R., D. Al-Darras, E. Alzaghoul, and O. Harfoushi. 2021. Arabic aspect-based senti-
ment analysis: A systematic literature review. IEEE Access 9:152628–45. doi:10.1109/
ACCESS.2021.3127140.
Pandey, A. C., D. S. Rajpoot, and M. Saraswat. 2017. Twitter sentiment analysis using
hybrid cuckoo search method. Information Processing & Management 53 (4):764–79. doi:
10.1016/j.ipm.2017.02.004.
Phan, H. T., V. C. Tran, N. T. Nguyen, and D. Hwang. 2020. Improving the performance
of sentiment analysis of tweets containing fuzzy sentiment using the feature ensemble
model. IEEE Access 8:14630–41. doi:10.1109/ACCESS.2019.2963702.
Rajeyyagari, S. 2020. Automatic speaker diarization using deep LSTM in audio lecturing of
e-Khool platform. Journal of Networking and Communication Systems 3 (4):17–25.
Ruz, G. A., P. A. Henríquez, and A. Mascareño. 2020. Sentiment analysis of Twitter data during critical events through Bayesian networks classifiers. Future Generation Computer Systems 106:92–104. doi:10.1016/j.future.2020.01.005.
Sehar, U., S. Kanwal, K. Dashtipur, U. Mir, U. Abbasi, and F. Khan. 2021. Urdu sentiment
analysis via multimodal data mining based on deep learning algorithms. IEEE Access 9:
153072–82. doi:10.1109/ACCESS.2021.3122025.
Silva, H., E. Andrade, D. Araújo, and J. Dantas. 2022. Sentiment analysis of tweets related to SUS before and during COVID-19 pandemic. IEEE Latin America Transactions 20 (1):6–13. doi:10.1109/TLA.2022.9662168.
Smetanin, S. 2020. The applications of sentiment analysis for Russian language texts:
Current challenges and future perspectives. IEEE Access 8:110693–719. doi:10.1109/
ACCESS.2020.3002215.
Valdivia, A., M. V. Luzón, and F. Herrera. 2017. Neutrality in the sentiment analysis problem based on fuzzy majority. In 2017 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) (1–6). IEEE.
Wadawadagi, R. S., and V. B. Pagi. 2019. A multi-layer approach to opinion polarity classi-
fication using augmented semantic tree kernels. Journal of Experimental & Theoretical
Artificial Intelligence 31 (3):349–67. doi:10.1080/0952813X.2018.1549108.
Wang, Y., G. Huang, J. Li, H. Li, Y. Zhou, and H. Jiang. 2021. Refined global word embed-
dings based on sentiment concept for sentiment analysis. IEEE Access 9:37075–85. doi:
10.1109/ACCESS.2021.3062654.
Wang, L., J. Niu, and S. Yu. 2020. SentiDiff: combining textual information and sentiment
diffusion patterns for Twitter sentiment analysis. IEEE Transactions on Knowledge and
Data Engineering 32 (10):2026–39. doi:10.1109/TKDE.2019.2913641.
Wu, J., K. Lu, S. Su, and S. Wang. 2019. Chinese micro-blog sentiment analysis based on
multiple sentiment dictionaries and semantic rule sets. IEEE Access 7:183924–39. doi:10.
1109/ACCESS.2019.2960655.
Xu, G., Z. Yu, H. Yao, F. Li, Y. Meng, and X. Wu. 2019. Chinese text sentiment analysis
based on extended sentiment dictionary. IEEE Access 7:43749–62. doi:10.1109/ACCESS.
2019.2907772.
Yang, L., Y. Li, J. Wang, and R. S. Sherratt. 2020. Sentiment analysis for e-commerce product reviews in Chinese based on sentiment lexicon and deep learning. IEEE Access 8:23522–30. doi:10.1109/ACCESS.
Yin, F., Y. Wang, J. Liu, and L. Lin. 2020. The construction of sentiment lexicon based on
context-dependent part-of-speech chunks for semantic disambiguation. IEEE Access 8:
63359–67. doi:10.1109/ACCESS.2020.2984284.
Zhai, G., Y. Yang, H. Wang, and S. Du. 2020. Multi-attention fusion modeling for senti-
ment analysis of educational big data. Big Data Mining and Analytics 3 (4):311–9. doi:10.
26599/BDMA.2020.9020024.
Zhang, B., D. Xu, H. Zhang, and M. Li. 2019. STCS lexicon: spectral-clustering-based
topic-specific Chinese sentiment lexicon construction for social networks. IEEE
Transactions on Computational Social Systems 6 (6):1180–9. doi:10.1109/TCSS.2019.
2941344.
Zhou, J., S. Jin, and X. Huang. 2020. ADeCNN: An improved model for aspect-level senti-
ment analysis based on deformable CNN and attention. IEEE Access 8:132970–9. doi:10.
1109/ACCESS.2020.3010802.