Sentiment Analysis of Customers Reviews Using A Hybrid Evolutionary SVM-Based Approach in An Imbalanced Data Distribution
3, 2022.
Digital Object Identifier 10.1109/ACCESS.2022.3149482
Jordan
5 Research Centre for Information and Communications Technologies of the University of Granada (CITIC-UGR), University of Granada, 1, 18010 Granada, Spain
ABSTRACT Online media has an increasing presence on the restaurants’ activities through social media
websites, coinciding with an increase in customers’ reviews of these restaurants. These reviews become
the main source of information for both customers and decision-makers in this field. Any customer who
is seeking such places will check their reviews first, which usually affect their final choice. In addition,
customers’ experiences can be enhanced by utilizing other customers’ suggestions. Consequently, customers’
reviews can influence the success of a restaurant business since they are considered the final judgment of the overall
quality of any restaurant. Thus, decision-makers need to analyze their customers’ underlying sentiments in
order to meet their expectations and improve the restaurants’ services, in terms of food quality, ambiance,
price range, and customer service. The number of reviews available for various products and services
has dramatically increased these days and so has the need for automated methods to collect and analyze
these reviews. Sentiment Analysis (SA) is a field of machine learning that helps analyze and predict the
sentiments underlying these reviews. Usually, SA for customers’ reviews faces the challenge of imbalanced datasets,
as the majority of these sentiments fall into either supporters or resistors of the product or service. This work
proposes a hybrid approach by combining the Support Vector Machine (SVM) algorithm with Particle Swarm
Optimization (PSO) and different oversampling techniques to handle the imbalanced data problem. SVM is
applied as a machine learning classification technique to predict the sentiments of reviews by optimizing the
dataset, which contains different reviews of several restaurants in Jordan. Data were collected from Jeeran,
a well-known social network for Arabic reviews. A PSO technique is used to optimize the weights of the
features, and four different oversampling techniques, namely, the Synthetic Minority Oversampling
Technique (SMOTE), SVM-SMOTE, Adaptive Synthetic Sampling (ADASYN) and borderline-SMOTE,
were examined to produce an optimized dataset and solve the imbalance problem of the dataset. This study
shows that the proposed PSO-SVM approach produces the best results compared to different classification
techniques in terms of accuracy, F-measure, G-mean and Area Under the Curve (AUC), for different versions
of the datasets.
INDEX TERMS Sentiment analysis, SVM, PSO, SMOTE, oversampling, feature extraction, features
weighting.
The associate editor coordinating the review of this manuscript and approving it for publication was Alberto Cano.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 10, 2022
R. Obiedat et al.: Sentiment Analysis of Customers’ Reviews Using Hybrid Evolutionary SVM-Based Approach
I. INTRODUCTION
The popularity of social media websites has witnessed tremendous growth in the last few years [1]. Social media sites have grown not only in terms of volume but also in their importance to different aspects of life, including business, politics, and education [2]. Nowadays, all businesses are offering their products and services online. These sites allow consumers to share their experiences and recommendations about these businesses’ products, places, and services on
different platforms such as TripAdvisor, Yelp, Facebook and Jeeran [3].
Online reviews represent the electronic version of word of mouth (WOM), which is an important aspect of traditional marketing. While WOM is restricted to family, friends or close people, online reviews have a worldwide reach [4]. Many websites allow users to rate and review different products and services. These reviews become the main source of information for potential customers who are seeking such products [5]. A survey conducted by BrightLocal (2020) found that 79% of customers trust online reviews as much as personal recommendations [6].
These days, whenever a customer wants to buy a new product online, he or she will consider what other people think about it, how they rate it, and their feedback and comments about the product before making a purchase [7]. According to a BrightLocal survey in 2020, 87% of consumers had checked online reviews of local businesses [6]. These reviews may affect the customer’s final choice since people trust customers’ reviews more than advertisements produced by a company. Furthermore, customers’ experiences can be enhanced by utilizing other customers’ suggestions [3].
Due to the widespread availability of social websites and applications, the number of reviews available for various products has dramatically increased [8], and so has the need for automated methods to collect and analyze these reviews [9]. These methods are essential to speed up and improve the quality of the decision-making process [10].
Sentiment Analysis (SA) can be used to deduce users’ feelings about various topics by processing their implicit attitudes and analyzing the underlying sentiments hidden in their comments [11]. Sharif et al. [12] defined SA as "analyzing people’s sentiments, opinions, appraisals, attitudes, evaluations and emotions towards such entities as organizations, products, services, individuals, topics, issues, events and their attributes, as presented online via text, video and other means of communication." Sentiment analysis is also referred to as opinion mining based on natural language processing, text analysis, and computational techniques [13]. It can be applied at the document level, sentence level, or aspect level [14]. It aims to classify customers’ attitudes towards a product or service as expressed in the comments, reviews, and posts as positive, negative, or neutral comments [15]. Two main SA approaches can be followed, namely, machine learning approaches and the lexicon-based approach [16]. Different machine learning algorithms are used to evaluate results in the sentiment field; the most common ones are Naïve Bayes (NB), SVM, Logistic Regression (LR), Random Forest (RF), and K-Nearest Neighbors (k-NN) [13].
Sentiment analysis is essential for every business as it can be used to improve the decisions of customers, business owners, and service providers [17]. SA is used by business owners to enhance their businesses’ image and increase their success [3] since it helps decision-makers improve the quality of their products and services based on their customers’ reviews; thus, the business can provide more praiseworthy services. This leads to higher customer satisfaction and more sales and revenues for the business [12]. On the other hand, customers can utilize these reviews in making thoughtful decisions based on previous customers’ experiences [17].
Recently, it has been noticed that almost all restaurants have a presence in the online social world. Restaurants are becoming increasingly present on different social websites, and so are customers’ reviews of these restaurants [18]. Online restaurant reviews are considered a rich source of information that helps attract new customers. Checking reviews by locals and tourists before visiting restaurants has become a trend [17]. This is supported by a 2020 BrightLocal survey revealing that 93% of consumers check a restaurant’s reviews before visiting it [6]. Consequently, customers’ reviews can influence the success of a restaurant business [19]. It was found that the more positive comments a restaurant receives, the more customers visit its web pages and physical locations, which leads to more popularity and success [4]. In contrast, negative comments lead to the loss of trustworthiness of the restaurant and reduced revenue [17]. According to the 2020 BrightLocal survey, 94% of consumers are more likely to buy from a business if it has received positive reviews, while 92% are less likely to use it if it has been given bad reviews [6]. People tend to post reviews when they have either a strong positive or strong negative experience (generally, the number of positive reviews exceeds the number of negative ones) [4].
Customers’ reviews and opinions are considered the final judgment of the overall quality of any restaurant. Thus, owners need to analyze their customers’ underlying sentiments so they can meet their expectations and offer customized services in terms of food quality, ambiance, price, and customer service [18].
Many studies have followed Machine Learning (ML) approaches for restaurant sentiment analysis. A study done by Zahoor et al. [3] used NB Classifier, logistic regression, SVM, and RF methods to analyze customers’ sentiments about restaurants in Karachi. The study annotated 4000 reviews from a well-known Pakistani Facebook community called SWOT’S. Random forest gained the highest performance, with an accuracy of 95%. Another study conducted by Sharif et al. [19] classified customers’ reviews for 1000 restaurants (written in Bengali) into positive and negative classes using three machine learning algorithms, namely, Decision Tree (DT), RF, and multinomial NB. The results showed that the multinomial NB method achieved the best results, with 80.48% accuracy.
Furthermore, sentiment analysis can be used to build a recommender system in different fields including the restaurant industry. Asani et al. [11], for example, collected people’s sentiments from the TripAdvisor website and built a customized restaurant recommender system based on people’s opinions and food preferences. The recommender system suggests restaurants according to users’ preferences, thus helping them to choose the best option and make an informed decision. Choosing the best restaurant among many unknown
options is an important decision, especially for tourists and travelers.
Few studies have followed an evolutionary approach in the restaurant field. Govindarajan [20], for example, applied a hybrid classification method based on restaurant reviews found on Yelp. The study used NB, SVM, and Genetic Algorithm (GA) and then compared their performances with the proposed hybrid model built by coupling all three classification methods. Another study conducted by Somantri et al. [21] proposed a hybrid model for restaurant culinary food reviews in Indonesia. The study confirmed the efficiency of PSO, as it used a hybrid model based on Particle Swarm Optimization (PSO) and Information Gain (PSO-IG) with four different classification algorithms, namely SVM, NB, DT, and K-NN. The best results were achieved using the proposed PSO-IG method with the NB classifier. The main limitation of this work was that it ignored the imbalanced nature of the dataset, as positive reviews were significantly more common than negative reviews. All aforementioned studies were applied to English sentiment; to the best of our knowledge, our work is the first to use such a recent evolutionary approach to explore Arabic sentiments.
This work uses the Jeeran website to collect people’s comments about different restaurants in Jordan. The Jeeran website is a social platform on which customers can post their reviews about more than 300,000 different places, including shopping centers, cafes, restaurants, or doctors’ offices. People use Jeeran to find the best places and services in their cities and avoid bad experiences. Customers post thousands of comments on different social platforms every day, mainly in the Arabic language.
This study is conducted on Arabic sentiment since it is the fifth-most widely spoken language in the world and the first language of more than 422 million people [22]. Moreover, about 185 million web users are Arabic speakers [23]. The Arabic language is a more challenging language to study than English for many reasons. Firstly, Arabic has a dialectal variety; people often post their comments in dialectal Arabic rather than Modern Standard Arabic, thus requiring more complex preprocessing [24]. Another reason is the morphology of the Arabic language, meaning the same word may have a different meaning, even if it has the same root. Also, suffixes, affixes, and prefixes added to the same word may carry essential information [23]. Moreover, the richness of synonyms in the Arabic language plays a key role in its complexity. Furthermore, the same word may have different meanings according to the context, and a word can fit into more than one lexical category [22].
In addition, the problem of imbalanced datasets is very common since the majority of the sentiments of customers fall into either the supporter or resistor category. Parameter tuning of the oversampling technique is another challenge. Moreover, a large number of features is generated by tokenizing the sentiments of the customers. Thus, leading methodologies can be applied to feature analysis to achieve the best outcome from the classification process. Optimization techniques are well-suited for feature analysis and parameter tuning, making the classification process more reliable. Another challenge relates to the choice of tokenization method used to form the dataset for the classification process, and it can be recognized through experimental practice.
This research proposes an evolutionary approach to analyzing people’s sentiments regarding restaurants’ reviews in the Arabic language. Furthermore, this work follows an evolutionary hybrid approach by combining the PSO evolutionary algorithm with different oversampling techniques and the SVM algorithm to automatically detect the sentiment in the customers’ comments. Four different oversampling techniques are applied to handle the problem of imbalance in the dataset. Additionally, the applied evolutionary algorithm helps reduce the effort and time needed to tune the parameters and optimize the classification by finding the best feature weights and best k value for the oversampling technique, thereby resulting in better performance measures.
This work collects the reviews for almost 3000 restaurants from the Jeeran website. After the data preparation process, four different versions of the dataset are presented using different tokenization methods. The initial individual of the study is created from random weights and a random k value for the oversampling parameter. The weighted oversampled data is then classified using the SVM classification technique and the results are evaluated using G-mean. A Particle Swarm Optimizer (PSO) evolutionary algorithm is then used to optimize the values of the individual and achieve a better G-mean. Finally, the proposed approach is compared with different standard and powerful classification models, including SVM, XGBoost, DT, RF, NB, k-NN, and LR based on Accuracy, F1P, F1N, G-mean, and AUC evaluation measures.
The main contributions of the study can be summarized as follows:
• Collecting the dataset from the Jeeran website with approximately 3000 restaurant reviews. The dataset is cleaned, labeled, formatted, and stemmed.
• Oversampling the dataset using four different oversampling techniques to solve the imbalance problem.
• Applying the PSO optimization technique to find the best weights for the dataset features and the best k value for each oversampling technique, then applying the SVM classification technique to the oversampled and weighted dataset to find the sentiments of the restaurant reviews.
The remainder of the paper is divided as follows: Section II presents a review of the literature on restaurants’ sentiment analysis. Section III introduces the backgrounds of different methods and concepts that have been used. The proposed approach is described in Section IV. Section V discusses the experiments and provides the results achieved by the proposed approach and other models. Finally, the conclusion and future directions are offered in Section VI.
II. RELATED WORK
Sentiment analysis (SA) is one of the most studied research areas combining natural language processing, data mining,
and web mining. Owing to its importance to business and community, SA research has spread into management and social sciences as discussed by Liu [25].
As explained by Tubishat et al. [26], SA, which is also referred to as opinion mining, is a text-classification field in which people’s opinions, evaluations, attitudes, moods, and emotions regarding a service or product are analyzed to detect orientations. It is conducted computationally by using natural language processing, linguistics, or text analysis to detect the feelings expressed within informal text posted online. Recently, due to the popularity of social networks and online review websites, people tend to check a restaurant’s reviews before visiting it. As a result, customers’ impressions have become a vital factor influencing the success of restaurants; the interest among decision-makers toward customers’ experience about services provided has also increased as stated by Sharif et al. [19]. SA has been applied to online reviews about restaurants in the literature. For instance, Gan et al. [5] studied the attributes representing consumers’ reviews of restaurants. This study found that the attributes derived from previous studies such as food, service, ambiance, and price were not enough to affect restaurants’ ratings and that context should be added as a significant attribute. Meanwhile, Aye and Aung [1] proposed a Myanmar language resource for lexicon-based SA as a solution to language-specific problems since most studies have considered the English language for SA. Restaurant review data were used, but informal expressions were not addressed.
Since online booking websites gained substantial interest recently, and since people now check hundreds of reviews before making any booking decisions, Agüero-Torales et al. [27] proposed a cloud-based software tool to analyze data from the TripAdvisor website by conducting SA on them in the province of Granada.
The SA task was accomplished by using various datasets, such as the Yelp dataset, to examine the approaches proposed by other researchers as explained by Hegde et al. [28] and made public for research and academic studies. The Zomato Restaurant Dataset is derived from the online multinational restaurant aggregator in which reviews are posted alongside information, menus, and delivery options. Also, Taneja et al. [29] discussed that Zomato is a very rich database that includes information on more than 20,000 restaurants. The Zomato API enables users to access the most up-to-date content and generate information about nearby restaurants. Furthermore, SemEval Datasets are high-quality annotated datasets generated through a series of international workshops; different versions of these datasets (e.g., SemEval-2015, SemEval-2016) have been used in the literature as stated by Khan et al. [30].
SA is conducted through three main approaches, namely, the lexicon, machine learning (ML), and hybrid approaches. However, the application of lexicon-based SA has several drawbacks. Firstly, it requires a massive number of linguistic resources. Secondly, a predefined list of polarity annotation is required, and it differs based on the language used. Thirdly, most written reviews use informal words that are not likely to be included in the lexicon. As a result, ML is a preferable alternative since it includes highly illustrative and flexible models, as stated by Alaei et al. [31].
Various ML algorithms have been used to conduct SA in the restaurant domain. NB, LR, and DT algorithms were applied by Hassan et al. [32] to conduct SA on three different datasets, namely, the Yelp dataset, IMDB dataset, and Arabic qaym.com restaurant reviews dataset. Performance was measured in terms of accuracy and recall. NB and LR recorded the best results. Similarly, NB, SVM, multilayer perceptron, DT, k-NN, and fuzzy logic were applied by Kumar and Jaiswal [33] on data extracted from Twitter and Tumblr, which are widely used micro-blogging social networks. A comparative analysis of performance is presented in terms of precision, recall, and accuracy. Besides, a deep learning model called DOC-ABSADeepL was proposed by Zuheros et al. [34] and applied on the TripAdvisor dataset for restaurants to categorize the aspects included in an expert review while also extracting opinions and criteria. The tripR-2020 dataset was built, manually annotated, and released before being used in the same study.
Several implementations have considered the Arabic language for conducting SA using ML. For instance, in Al Omari et al. [35], logistic regression (LR) was applied on data extracted from reviews (including restaurant reviews) posted on Google and Zomato about public services in Lebanon. Several ML algorithms, namely, KNN, NB, SVM, LR, and RF, were applied for SA by Alharbi and Qamar [36] to assess customers’ reviews about restaurants and cafes in the Qassim region of Saudi Arabia. Performance was measured based on accuracy, recall, and F-measure, with the best performances for these measures produced by SVM, LR, and RF, respectively.
The class imbalance problem has been considered in some sentiment analysis studies, such as Qiu et al. [37], in which a heuristic re-sampling algorithm was applied as a solution to the imbalanced data encountered while training the proposed model. Pongthanoo and Songpan [38] proposed a technique which combines Information Gain (IG) with SMOTE to improve the performance accuracy of ML classifiers when applied on an imbalanced dataset. A GAN-SMOTE architecture was proposed by Scott and Plested [39] in which generative adversarial networks (GANs) were merged with SMOTE in order to up-sample their dataset and generate convincing synthetic examples of it for improving the performance accuracy of their experiments. The SMOTE technique was also applied by Nguyen et al. [40] on restaurant data crawled from Foody, an online community people use to search and comment on food in Vietnam, to overcome the class imbalance challenge for the supervised learning method to be applied afterward.
Evolutionary algorithms (EAs) were applied during the pre-processing step to select the most effective attributes for building the model. For example, a hybrid SA framework applying the GA feature reduction approach was proposed by Iqbal et al. [41] in which both principal
component analysis (PCA) and latent semantic analy- Taking velocity as v and position as x, both are the same as
sis (LSA) were applied for comparison purposes with the counter increases by unity for iterations. Equation 1 describes
proposed GA-based approach, which outperformed them. the velocity update.
Support vector machine (SVM) techniques have been
vt+1 t t t t t
id = vid + c1 r1 (pid − xid ) + c2 r2 (pgd − xid ) (1)
applied as a classifier after the feature-selection step using
an EA. The study by Kumar and Khorwal [42] applied a where in a D-dimensional search space, D-dimensional vec-
support vector machine (SVM) for classification after the tor is denoted as xit = (xi1 t , x t , . . . , x t )T represents ith
i2 iD
feature-selection step using the firefly algorithm to find the particle of the swarm at t time step. Velocity is represented as
optimum subset of features. Experiments were conducted on vti = (vti1 , vti2 , . . . , vtiD )T . The best-visited position previously
four datasets (two English and two Hindi), and a genetic algo- of the ith particle of swarm at t time step is denoted as pti =
rithm (GA) was applied for comparison purposes. The whale (pti1 , pti2 , . . . , ptiD )T and index of best particle in the swarm
optimization algorithm (WOA), which is one of the recent is referred to as g, while c1 , c2 are constants representing
metaheuristic algorithms introduced by Tubishat et al. [43] cognitive and social scaling parameters, r1 , r2 are random
to solve the problem of falling in local optima, was improved numbers in the range [0, 1].
by including elite opposition-based learning (EOBL) in the The position is updated as in equation 2, where d is the
initialization step and including evolutionary operators, such dimension and i is the particle index. Algorithm 1 illustrates
as mutation, crossover, and selection operators, at the end of the pseudo-code of PSO.
each iteration. Along with the filter feature selection tech- t+1
xid t
= xid + vt+1
id (2)
nique, information gain (IG) was considered using SVM. The
proposed improvements were validated using four Arabic
datasets for SA while six ML and two DL algorithms were Algorithm 1 Pseudo-Code of Particle Swarm Optimizer
applied. 1 Create and Initialize a D-dimensional swarm, S and
Generally, we noticed that the majority of the previous corresponding velocity vectors
works that proposed machine learning techniques for sen- 2 for (t = 1 to the maximum bound on the number of
timent analysis in the Arabic language context focused on iterations) do
applying classical supervised classification techniques like 3 for(i = 1 to S) do
SVM, NB and Decision trees. However, there are less works 4 for(d = 1 to D) do
tried to solve the class imbalance challenge in the data dis- 5 Apply the velocity update equation
tribution. Therefore, in this work, we are going to follow a 6 Apply position update equation
different line of research by integrating Evolutionary Algo- 7 end
rithms (EA) with different variants of the SMOTE oversam- 8 Compute fitness of updated position
pling techniques and SVM. The goal of this integration is 9 If needed, Update historical information for pbest and
to overcome the class imbalance challenge and at the same gbest
time improving the classification power for the targeted SA 10 end
problem in the Arabic language context. 11 Terminate if gbest meets problem requirements
12 end
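The update rules of Equations 1 and 2 and the loop structure of Algorithm 1 can be illustrated with a minimal sketch. It is not the implementation used in this study: the function name `pso_minimize`, the default values of `c1` and `c2`, the search bounds, and the velocity clamp `v_max` are assumptions added only to make the example self-contained and numerically stable, and the fitness is minimized here, whereas a measure such as G-mean would be maximized (for example by minimizing its negative).

```python
import random


def pso_minimize(fitness, dim, n_particles=20, n_iter=50,
                 c1=2.0, c2=2.0, bounds=(-1.0, 1.0), v_max=0.5, seed=0):
    """Basic PSO (no inertia weight), following Equations 1 and 2."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]                  # best position seen by each particle
    pbest_fit = [fitness(p) for p in x]
    g = min(range(n_particles), key=pbest_fit.__getitem__)  # index of gbest
    history = [pbest_fit[g]]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Equation 1: cognitive term (pbest) plus social term (gbest).
                v[i][d] += (c1 * r1 * (pbest[i][d] - x[i][d])
                            + c2 * r2 * (pbest[g][d] - x[i][d]))
                # Assumption (not part of Equation 1): clamp the velocity.
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                # Equation 2, with positions kept inside the assumed bounds.
                x[i][d] = max(lo, min(hi, x[i][d] + v[i][d]))
            f = fitness(x[i])
            if f < pbest_fit[i]:               # update pbest, then gbest if needed
                pbest[i], pbest_fit[i] = x[i][:], f
                if f < pbest_fit[g]:
                    g = i
        history.append(pbest_fit[g])
    return pbest[g][:], pbest_fit[g], history


def sphere(p):
    """Toy objective used only for illustration."""
    return sum(t * t for t in p)


best_x, best_f, history = pso_minimize(sphere, dim=3)
```

Because gbest is replaced only when a strictly better position is found, the recorded best fitness never gets worse from one iteration to the next.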
III. PRELIMINARIES
A. PARTICLE SWARM OPTIMIZATION B. SUPPORT VECTOR MACHINE
Particle swarm optimization (PSO) is a swarm intelligence The support vector machine (SVM) algorithm is a supervised
algorithm developed for solving nonlinear problems within classifier that is applied widely to solve classification and
various sciences and engineering domains; it was derived regression problems. It was designed as an improvement to
from the flocking of birds or schooling of fish, as stated the support vector classifier, which has been introduced as
by [44]. an enhancement to the maximal margin classifier, which is
PSO is a search algorithm that uses swarm intelligence to restricted to dealing with simple linearly separable data [47].
find solutions as explained in [45] and [46]. It generates a In high-dimensional vector spaces where the feature space
random search result by analyzing a set of potential solutions is well-divided, SVM generates linearly separated hyper-
(called a swarm); every single potential solution is referred to planes. These planes partition the data points belonging to
as a particle. Particles normally rely on two kinds of learning the two classes into distinct regions. The optimal hyperplane
while moving: cognitive learning and social learning. The for- is always the one that maximizes the distance between the
mer refers to the process of learning from other particles (the nearest training data points and the feature space [48].
result is stored as gbest), while the latter refers to the process Significant misclassification can appear because of linear
of storing the best solution that was visited by the particle classification since data points belonging to distinct classes
(stored pbest). Velocity is used to determine the magnitude are rarely clearly distinguished. SVM can handle such com-
and direction of a particle. It refers to changing position rate monly faced cases, as it maps the feature space into a
with respect to time—that is, an iteration in the case of PSO. higher-dimensional space where non-linear data points are
transformed into linearly separable points. Therefore, are The process starts by setting up N, which is the total
transformed into linearly separable points. Therefore, data amount of oversampling (presented as an integer value). This
points that belong to different classes have clearer separation is done either by approximating a 1:1 class distribution or
boundaries [49]. using the wrapper process to discover the class distribution.
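The mapping to a higher-dimensional space is performed through a kernel function, commonly a linear, polynomial, or radial basis function (RBF) kernel. As a small illustration, the three forms can be written directly from their standard definitions; the function names and the default values of `d` and `gamma` below are assumptions chosen for the example, not settings taken from this study.

```python
import math


def linear_kernel(a, b):
    """Inner product of two observation vectors."""
    return sum(x * y for x, y in zip(a, b))


def polynomial_kernel(a, b, d=2):
    """Polynomial kernel of degree d: (1 + <a, b>)^d."""
    return (1 + linear_kernel(a, b)) ** d


def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


a, b = [1.0, 2.0], [3.0, 4.0]
k_lin = linear_kernel(a, b)        # 1*3 + 2*4 = 11
k_poly = polynomial_kernel(a, b)   # (1 + 11)^2 = 144
k_rbf = rbf_kernel(a, b)           # exp(-0.5 * 8) = exp(-4)
```

The RBF value depends only on the distance between the two vectors, so identical vectors always give 1 and distant vectors approach 0.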
The kernel function is used to generate the high- Then, an iterative set of steps are carried out, starting with
dimensional space by enlarging the original space non- randomly selecting a minority class instance training set.
linearly. There are several forms of kernel functions, namely After that, the K nearest neighbors are obtained, with the
linear kernels, polynomial kernels, and radial basis function value of k set to 5 by default. Finally, new instances are
kernels. Equations 3, 4, and 5 describe them respectively, computed by selecting N K-NN instances.
where k () denotes the kernel function and the product of the The final step is obtaining the difference between the fea-
two observation vectors xi and xi0 represents its outcome. The ture vector of the sample being processed and every single
two vectors’ product is referred to as ϕ (xi ).ϕ xi0 , where ϕ is
selected neighbor. Afterward, this difference is multiplied by
the transformed feature space [50]. a random number between 0 and 1; the result is then added to
the previous feature vector. This results in the selection of a
p
X 0 random point within the line segment of features. Eventually,
$$k(x_i, x_{i'}) = \sum_{j=1}^{p} x_{ij} x_{i'j} \tag{3}$$
$$k(x_i, x_{i'}) = \Bigl(1 + \sum_{j=1}^{p} x_{ij} x_{i'j}\Bigr)^{d} \tag{4}$$
$$k(x_i, x_{i'}) = \exp\Bigl(-\gamma \sum_{j=1}^{p} (x_{ij} - x_{i'j})^{2}\Bigr) \tag{5}$$
C. OVERSAMPLING TECHNIQUES
In classification problems, an unequally distributed target class label is a commonly encountered situation. Such data are referred to as an imbalanced dataset, which affects the training process of the data mining model: training is conducted mainly on the majority class, causing bias in class predictions, because the minority class, holding few instances, may be treated as noise or outliers. As a result, imbalanced datasets pose serious challenges by affecting classifiers' performance, as explained by [51].
Hence, solving data imbalance issues is vital and should be conducted as a preliminary step before classification, as was done in [52]. Various balancing techniques are applied in this regard. They can be categorized into oversampling techniques, such as SMOTE and adaptive synthetic sampling (ADASYN), and undersampling techniques, such as edited nearest neighbors, random undersampling, and Tomek links. The former refers to the artificial creation of minority class points in the dataset, while the latter removes majority class instances from the dataset.
1) SMOTE
The synthetic minority oversampling technique (SMOTE) [53] is an oversampling technique that is widely applied in data mining to handle imbalanced datasets, as explained by [54]. It operates in the "feature space" instead of the "data space," as it does not replicate minority class instances but instead introduces synthetic instances by randomly choosing a minority class instance and interpolating between it and its k-nearest neighbors. The k-nearest neighbors (KNN) method generates instances within the dataset by considering other instances near them, since it is naturally applied to find the closest neighbors of a specific point [55].
one of the nominated attributes is selected. The entire process is summarized in Algorithm 2.
Algorithm 2 Pseudo-Code of SMOTE Algorithm
function SMOTE(T, N, k)
    Input: T, N, k: number of minority class examples, amount of oversampling (in percent), number of nearest neighbors
    Output: (N/100) * T synthetic minority class samples
    Variables:
        Sample: array for original minority class samples
        newindex: keeps a count of the number of synthetic samples generated, initialized to 0
        Synthetic: array for synthetic samples
    if N < 100 then
        Randomize the T minority class samples
        T = (N/100) * T
        N = 100
    end
    N = (int)(N/100)    (the amount of SMOTE is assumed to be in integral multiples of 100)
    for i = 1 to T do
        Compute k nearest neighbors for i, and save the indices in the nnarray
        POPULATE(N, i, nnarray)
    end
end function
2) SVM-SMOTE
SVM-SMOTE is a variant of the SMOTE algorithm that deploys an SVM classifier to select the samples used for generating new synthetic samples. The borderline area is approximated by the support vectors obtained after training an SVM on the original training dataset. The algorithm emphasizes data separation, as it synthesizes data far from class overlap. New instances are synthesized randomly along the lines joining each minority class support vector with its nearest neighbors [56].
3) ADAPTIVE SYNTHETIC SAMPLING (ADASYN)
Adaptive synthetic sampling (ADASYN) is another variation of SMOTE that differs from focusing on neighbors or
borderlines. Instead, it focuses on data density and creates synthetic data accordingly [52].
The amount of synthetic data generated is inversely related to minority class density. This means that more synthetic data are generated in regions of the feature space where minority class density is low, while few (or no) such data are generated where minority class density is high [57].
4) BORDERLINE-SMOTE
As the name implies, borderline-SMOTE is a version of SMOTE that differs in functionality. Instead of creating synthetic data randomly from nearby data, borderline-SMOTE first identifies each class's borderline. Instances on or close to the borderline are more likely to be misclassified than those far from it and, therefore, are more important for the classification task [58].
In borderline-SMOTE, all minority class instances are divided into three groups: noise instances, which are rare, incorrect, and located in areas of majority class instances; danger instances, which are located on class boundaries and overlap with the majority class; and safe instances, which represent the minority class.
D. FEATURE EXTRACTION
Feature extraction (FE) is defined as the process by which an initial set of raw data is reduced in dimensionality into groups that can be processed easily, as described by [59].
According to [60], FE is one of the most commonly applied techniques for reducing the dimensionality of data, as high-dimensional data are mapped into a low-dimensional set of potential features. These techniques extract only informative features, the use of which can significantly improve the performance of ML models.
1) N-GRAM
Text features for a supervised ML classifier can be formed using N-grams, as stated in [61], where a given text is represented by sequences of n consecutive tokens. If the value of n is 1, it is called a unigram; if the value of n is 2, it is called a bigram; if the value of n is 3, it is called a trigram. For example, if the sentence "Jordan is better country" is considered and n = 2, then it will produce "Jordan is," "is better," and "better country."
2) BAG-OF-WORDS
A very popular technique for FE, as stated in [62], uses a matrix in which columns represent words and the cell values hold a weight measure such as term frequency or term frequency-inverse document frequency. Meanwhile, in [63], it was discussed that bag-of-words (BoW) represents documents as vectors of words from a vocabulary. In other words, the BoW model represents every single document from the corpus in the form of vectors, along with the frequency of each word from a certain vocabulary. It has been used in SA applications as a robust technique despite its simplicity.
IV. METHODOLOGY
The methodology of this paper is described in this section. Three phases are presented in detail: data description and collection, data preparation and labeling, and the proposed approach.
A. DATA DESCRIPTION AND COLLECTION
The datasets utilized in this work describe customers' reviews of various restaurants in Jordan. Data were collected from Jeeran, a well-known social network for Arabic reviews. Since 2010, the Jeeran website has provided a platform for comparing and reviewing the best places and services in the Arab world, including cafes, hotels, restaurants, and public services. Reviews of these types can provide useful feedback to those who make decisions about the quality of service and food, prices, and other aspects related to the ambiance of these places.
Approximately 3000 restaurant reviews have been compiled from the Jeeran website. The collection process is performed using a custom C# script that gathers reviews easily and simultaneously from different pages of several restaurants' websites. These collected reviews are stored as text files for further analysis. Table 1 shows the details of the datasets.
TABLE 1. Details of the datasets.
B. DATA PREPARATION AND LABELING
The collected dataset is cleaned, labeled, formatted, and stemmed [64], [65]. The cleaning process is performed on the dataset by removing symbols and special characters [66], [67].
To label the reviews, we implemented a crowdsourcing website, where all reviews are uploaded and arranged for labeling. We invited more than 290 individuals to annotate the reviews; 10 reviews were assigned to each person to be labeled positive or negative. The reviewers were asked to read each review carefully and label it according to their understanding of the sentiment of that review. Two options, negative and positive, were available under each review for the reviewers to select. Consequently, the class of the review was assigned based on the majority of reviewers' selections.
Thereafter, each review was stored in its own file, and all files of the reviews were collected and stored according to the class type. A CSV file was created from all reviews, including their content in one column and their class label in the second.
The dataset comprises 2150 positive reviews and 640 negative reviews.
After dataset labeling, the formatting process is started. First, stop words (for example, the Arabic words that translate to I'm, so, that, then, very, this, and may) are removed. Stop words are deleted since they contribute little to the meaning of the text. After that, non-Arabic letters and emoticons are eliminated through a normalization process. Several useless features are eliminated using text normalization and stop word removal, which reduces the overall number of extracted features and enhances the feature selection process.
Finally, stemming is applied using the Lovins stemmer to remove duplicate words and supplementary letters. Then, N-gram and BoW feature extraction methods are applied for text tokenization, generating four different versions of the data (Table 2). As observed in the table, a different number of features is generated by each combination of feature extraction and stemming. The numbers of features are 3439, 8985, 14,233, and 2916 for the 1-gram, 2-gram, 3-gram, and BoW techniques, respectively.
TABLE 2. Different versions of tokenized datasets.
C. PROPOSED APPROACH
Figure 1 represents the approach proposed in this paper and shows the steps applied to conduct the experiments and achieve the obtained results. After the preparation of the data, the dataset is split into training and testing parts. Then, evolutionary classification is applied to the training part. The optimized weights and k value resulting from this process are then applied to the testing data.
Specifically, the training process, shown in the left part of Figure 1, begins with the creation of an initial individual consisting of random weights and a random k value for the oversampling parameter. The random weights of the initial individual are applied to the training data to generate weighted feature values for each instance, thus producing the weighted training data. Figure 2 shows the weighting process, in which the weight part of the individual is multiplied by each instance to generate the weighted dataset. Next, the weighted training data are passed to the oversampling technique, with the k parameter set to the random value obtained from the initial individual, to generate oversampled and weighted training data. The data are then classified using the SVM classification technique and evaluated by a fitness function defined in terms of G-mean. PSO is then used to optimize the values of the individual to obtain a better G-mean value, following the same process applied to the initial individual. Multiple iterations are performed to optimize the classification and find better weights and a better k value. After the final iteration, the best individual is kept for the testing process.
The testing process, shown in the right part of Figure 1, uses the optimized individual to generate weighted, oversampled testing data. That is, the testing process generates the weighted testing data based on the weights of the individual and performs the oversampling process based on the k value of the individual. The classification of the data by SVM is then evaluated using the evaluation measures considered in this paper to assess the performance of the proposed approach.
V. EXPERIMENTS AND RESULTS
The experiments and their results are described and analyzed in detail in this section. Several stages are involved in this phase: the experimental setup, the evaluation measures, the results of PSO-SVM with different oversampling techniques, a comparison with standard classification algorithms, a comparison against approaches from recent studies, and finally a feature importance analysis.
A. EXPERIMENTAL SETUP
In this study, the experiments were conducted on a PC running Windows 10 with a 2.40-GHz Intel Core i7 and 16 GB of RAM. Further, the scikit-learn library and EvoloPy framework were used to run tests in the Python environment. Our proposed approach settings were 100 iterations, a population size of 100, and 30 runs.
B. EVALUATION MEASURES
The performance of our model was evaluated using the accuracy, F1-score (positive), F1-score (negative), g-mean, and AUC measures. Accuracy reflects the overall classification quality of the model; it is calculated as the ratio of correctly classified instances, that is, true positives (TPs) and true negatives (TNs), to all instances, which also include false positives (FPs) and false negatives (FNs), as given in Equation 6.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{6}$$
The F-measure is the harmonic mean of precision and recall. It is calculated separately for the positive class (Equation 7) and the negative class (Equation 8):
$$F1_P = 2 \times \frac{\text{precision}_P \times \text{recall}_P}{\text{precision}_P + \text{recall}_P} \tag{7}$$
$$F1_N = 2 \times \frac{\text{precision}_N \times \text{recall}_N}{\text{precision}_N + \text{recall}_N} \tag{8}$$
G-mean is a measure of the balance of the classification performance between the two classes. It is calculated as the square root of the product of the recall of the negative class ($REC_N$) and the recall of the positive class ($REC_P$):
$$\text{G-mean} = \sqrt{REC_N \times REC_P} \tag{9}$$
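As a minimal sketch of Equations 6 to 9, the measures can be computed directly from the four confusion-matrix counts. The function name `evaluation_measures`, the dictionary keys, and the example counts are assumptions made for illustration; AUC is omitted because it requires classifier scores rather than counts.

```python
import math


def evaluation_measures(tp, tn, fp, fn):
    """Compute accuracy, per-class F1, and G-mean from binary confusion counts."""

    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is zero.
        return numerator / denominator if denominator else 0.0

    precision_p = ratio(tp, tp + fp)
    recall_p = ratio(tp, tp + fn)
    precision_n = ratio(tn, tn + fn)
    recall_n = ratio(tn, tn + fp)

    return {
        # Equation 6
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        # Equation 7
        "f1_p": ratio(2 * precision_p * recall_p, precision_p + recall_p),
        # Equation 8
        "f1_n": ratio(2 * precision_n * recall_n, precision_n + recall_n),
        # Equation 9
        "g_mean": math.sqrt(recall_n * recall_p),
    }


# Hypothetical counts for an imbalanced test set (100 positive, 50 negative reviews).
scores = evaluation_measures(tp=90, tn=30, fp=20, fn=10)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

With these hypothetical counts the accuracy is 0.80 while the G-mean is about 0.73, which illustrates why G-mean is the more informative measure when the recall of the minority class lags behind that of the majority class.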
FIGURE 2. Illustration of the proposed weighting process utilized in the proposed PSO-SVM approach.
TABLE 3. Accuracy, F-measure, and G-mean results for the proposed PSO-SVM for all datasets with four different oversampling techniques.
TABLE 4. Accuracy, F-measure, G-mean, and AUC results for Data 1 for the proposed PSO-SVM against other algorithms.
TABLE 5. Accuracy, F-measure, G-mean, and AUC results for Data 2 for the proposed PSO-SVM against other algorithms.
TABLE 6. Accuracy, F-measure, G-mean, and AUC results for Data 3 for the proposed PSO-SVM against other algorithms.
TABLE 7. Accuracy, F-measure, G-mean, and AUC results for Data 4 for the proposed PSO-SVM against other algorithms.
ADASYN, and borderline-SMOTE. Table 3 shows the results of the datasets in terms of accuracy, F-measure, and g-mean. All datasets are presented in ascending order, from Data 1 to Data 4.
As shown in the first part of the table, the highest results for Data 1 were obtained by SVM-PSO+BorderlineSMOTE: 0.897, 0.939, 0.650, and 0.800 in terms of accuracy, F1P, F1N, and g-mean, respectively.
As for Data 2, the SVM-PSO+ADASYN outperforms the other algorithms in terms of accuracy and F1P. Meanwhile, for F1N, the best results were obtained by SVM-PSO+BorderlineSMOTE; for g-mean, the best results were obtained by SVM-PSO+SMOTE.
In the third part (Data 3), the best accuracy and F1P were achieved by SVM-PSO+SVMSMOTE. For F1N, SVM-PSO+SVMSMOTE and SVM-PSO+BorderlineSMOTE both acquired the highest result of 0.662. Meanwhile, the best g-mean was obtained by SVM-PSO+BorderlineSMOTE.
Further, the SVM-PSO+SVMSMOTE performed better than the other algorithms for Data 4 in terms of accuracy, F1P, and F1N (0.8856, 0.9327, and 0.6190, respectively).
However, in terms of g-mean, the SVM-PSO+BorderlineSMOTE achieved the highest result (0.7844).
It is worth mentioning that the PSO-SVM yielded the best g-mean results; g-mean is considered the main measurement in this study since the datasets are imbalanced. It is essential to ensure that the classification approach makes quality predictions for both the majority class and the other classes.
D. STANDARD CLASSIFICATION MODELS COMPARISON
In this subsection, the experiments focused on comparing the proposed PSO-SVM with different standard classification models, including SVM, XGBoost, DT, RF, NB, k-NN, and LR. All these standard classification models were also combined with an oversampling technique. Table 4 shows that the SVM-PSO+BorderlineSMOTE outperformed all other algorithms on Data 1 in terms of all measures, while the standard SVM ranked second and the LR placed third.
As shown in Table 5, the highest results were acquired by SVM-PSO+SMOTE in all measures; the NB was the second-best algorithm in terms of accuracy and F1P, with values of 0.86 and 0.92, respectively. As for F1N, g-mean, and AUC, the LR obtained the second-best results of 0.57, 0.78, and 0.79, respectively. Meanwhile, the standard SVM yielded values of 0.85, 0.91, 0.56, 0.77, and 0.78 in terms of accuracy, F1P, F1N, g-mean, and AUC, respectively.
The results for Data 3 show similar outcomes, with the SVM-PSO+BorderlineSMOTE emerging as the superior algorithm, with results of 0.89, 0.93, 0.66, and 0.81 in terms of accuracy, F1P, F1N, g-mean and AUC, respectively. The NB, LR, and standard SVM placed second, third, and fourth, respectively.
For Data 4, the SVM-PSO+BorderlineSMOTE also outperformed the other algorithms, with values of 0.88, 0.93, 0.61, 0.78, and 0.79 for accuracy, F1P, F1N, g-mean, and AUC, respectively. However, on this dataset, the NB's performance decreased in terms of accuracy, F1P, F1N, g-mean, and AUC when compared with the other algorithms. Moreover, the LR achieved the second-best results for all measures, while the RF ranked third in terms of accuracy and F1P, XGBoost ranked third for F1N and g-mean, and the standard SVM ranked third in terms of AUC.
In order to demonstrate the performance of the proposed approach, two additional comparisons were conducted. In the first comparison, the proposed approach is compared against other recent studies in the literature [68]. The techniques used in these studies were TF-IDF-Bidirectional-LSTM and TF-IDF-GBDT. As can be seen in Table 8, the results illustrate that the proposed approach (SVM-PSO+BSMOTE) outperforms the TF-IDF-Bidirectional-LSTM technique in all performance measurements. Moreover, our approach (3-gram-SVM-PSO+BSMOTE) achieved the highest results in terms of F1N, g-mean, and AUC, while TF-IDF-GBDT obtained the best result for accuracy and F1P. Since this work aims to solve the imbalance problem, g-mean and AUC are considered more important than the other measures.
TABLE 8. Accuracy, F-measure, G-mean, and AUC results for the proposed PSO-SVM against recent studies.
Additionally, in the second comparison, the proposed approach was compared with other algorithms based on the Opinion Corpus for Lebanese Arabic Reviews (OCLAR) dataset that was published in the UCI repository (Table 9). The results support our research findings and demonstrate the superiority of our proposed approach, since the SVM-PSO+BSMOTE achieved the best results in all measures.
TABLE 9. Accuracy, F-measure, G-mean, and AUC results for OCLAR dataset for the proposed PSO-SVM against other algorithms.
In summary, the PSO-SVM with BorderlineSMOTE and SMOTE obtained the best results in all measures for all datasets, followed by NB, LR, and standard SVM (Tables 4, 5, 6, 7, and 9). In comparison with recent studies, the proposed PSO-SVM outperforms the other algorithms in terms of F1N, g-mean, and AUC.
E. DISCUSSION
Four different versions of the dataset were presented in this study. Three of the versions were created using the N-gram method, while the fourth was created by using the bag-of-words method. The different data versions produced different numbers of features (Table 2).
In the first instance, the datasets were examined using the proposed PSO-SVM (Figure 3). According to the figure, Data 2 showed the highest results for accuracy and F1P, while Data 3 presented the highest results in terms of F1N and g-mean.
Overall, the classification performance of all datasets improved as the number of features increased. Accordingly, the use of many features (N-grams) has a positive influence on classification performance. The second part of the analysis of the results revealed how poorly the bag-of-words method performed compared to other methods.
Additionally, Figure 4 illustrates the g-mean results for each algorithm across all datasets. The results clearly show that the classification performance of the standard SVM was improved by using PSO. By using feature weighting and optimizing the SVM parameters, the PSO-SVM achieved better results. Thus, the PSO improved the results of the standard method by 0.01, 0.03, 0.05, and 0.07 for Data 1,
Data 2, Data 3, and Data 4. We also noticed that the proposed PSO-SVM outperformed the other algorithms in all datasets. On the other hand, the k-NN algorithm obtained the worst results due to the complexity of the data in terms of instances and dimensions.
FIGURE 3. Comparison of results for all datasets for the proposed PSO-SVM.
FIGURE 4. G-mean results of all datasets for all algorithms.
VI. CONCLUSION
Sentiment analysis has witnessed increased interest in the academic field in the last few years. Many people post reviews of different services and products. The analysis of customers' attitudes and feedback is essential for all businesses, including restaurants. Thus, this research proposed a new hybrid evolutionary technique that aims to analyze people's sentiment towards various restaurants across Jordan. The data were collected from a popular social network, namely Jeeran. The proposed approach consisted of collecting more than 3000 restaurant reviews and labeling them using the crowdsourcing technique. Oversampling techniques were then applied to solve the problem of imbalanced data in the dataset. We produced four versions of the collected dataset using different tokenization methods, including 1-Gram, 2-Gram, 3-Gram, and bag-of-words. Further, we implemented a hybrid optimization technique comprising PSO and SVM to find the best weights while also finding the k values of four different oversampling techniques to predict the sentiments of reviews. The study demonstrates that the proposed PSO-SVM approach is effective and outperforms the other approaches in
REFERENCES
[1] Y. M. Aye and S. S. Aung, "Senti-lexicon and analysis for restaurant reviews of Myanmar text," Int. J. Adv. Eng., Manage. Sci., vol. 4, no. 5, Jan. 2018, Art. no. 240004.
[2] P. P. Rokade and A. K. D, "Business intelligence analytics using sentiment analysis: A survey," Int. J. Electr. Comput. Eng., vol. 9, no. 1, p. 613, Feb. 2019.
[3] K. Zahoor, N. Z. Bawany, and S. Hamid, "Sentiment analysis and classification of restaurant reviews using machine learning," in Proc. 21st Int. Arab Conf. Inf. Technol. (ACIT), Nov. 2020, pp. 1–6.
[4] M. Nakayama and Y. Wan, "The cultural impact on social commerce: A sentiment analysis on yelp ethnic restaurant reviews," Inf. Manage., vol. 56, no. 2, pp. 271–279, Mar. 2019.
[5] Q. Gan, B. H. Ferns, Y. Yu, and L. Jin, "A text mining and multidimensional sentiment analysis of online restaurant reviews," J. Quality Assurance Hospitality Tourism, vol. 18, no. 4, pp. 465–492, Oct. 2017.
[6] R. Murphy. (Dec. 9 2020). Local Consumer Review Survey 2020. BrightLocal. Accessed: Nov. 5, 2021. [Online]. Available: https://www.brightlocal.com/research/local-consumer-review-survey/
[7] R. Feldman, "Techniques and applications for sentiment analysis," Commun. ACM, vol. 56, no. 4, pp. 82–89, 2013.
[8] H. Kang, S. J. Yoo, and D. Han, "Senti-lexicon and improved Naïve Bayes algorithms for sentiment analysis of restaurant reviews," Expert Syst. Appl., vol. 39, no. 5, pp. 6000–6010, 2012.
[9] L. Li, L. Yang, and Y. Zeng, "Improving sentiment classification of restaurant reviews with attention-based bi-GRU neural network," Symmetry, vol. 13, no. 8, p. 1517, Aug. 2021.
[10] O. Oueslati, A. I. S. Khalil, and H. Ounelli, "Sentiment analysis for helpful reviews prediction," Int. J. Adv. Trends Comput. Sci. Eng., vol. 7, no. 3, pp. 34–40, Jun. 2018.
[11] E. Asani, H. Vahdat-Nejad, and J. Sadri, "Restaurant recommender system based on sentiment analysis," Mach. Learn. with Appl., vol. 6, Dec. 2021, Art. no. 100114.
[12] N. M. Sharef, H. M. Zin, and S. Nadali, "Overview and future opportunities of sentiment analysis approaches for big data," J. Comput. Sci., vol. 12, no. 3, pp. 153–168, Mar. 2016.
[13] B. Yu, J. Zhou, Y. Zhang, and Y. Cao, "Identifying restaurant features via sentiment analysis on yelp reviews," 2017, arXiv:1709.08698.
[14] G. Beigi, X. Hu, R. Maciejewski, and H. Liu, "An overview of sentiment analysis in social media and its applications in disaster relief," in Sentiment Analysis and Ontology Engineering. 2016, pp. 313–340.
[15] O. Harfoushi, D. Hasan, and R. Obiedat, "Sentiment analysis algorithms through azure machine learning: Analysis and comparison," Modern Appl. Sci., vol. 12, no. 7, p. 49, Jun. 2018.
[16] S. Gao, J. Hao, and Y. Fu, "The application and comparison of web services for sentiment analysis in tourism," in Proc. 12th Int. Conf. Service Syst. Service Manage. (ICSSSM), Jun. 2015, pp. 1–6.
[17] E. Hossain, O. Sharif, M. M. Hoque, and I. H. Sarker, "SentiLSTM: A deep learning approach for sentiment analysis of restaurant reviews," 2020, arXiv:2011.09684.
[18] N. Hossain, M. R. Bhuiyan, Z. N. Tumpa, and S. A. Hossain, "Sentiment analysis of restaurant reviews using combined CNN-LSTM," in Proc. 11th Int. Conf. Comput., Commun. Netw. Technol. (ICCCNT), Jul. 2020, pp. 1–5.
[19] O. Sharif, M. M. Hoque, and E. Hossain, "Sentiment analysis of Bengali texts on online restaurant reviews using multinomial Naïve Bayes," in Proc. 1st Int. Conf. Adv. Sci., Eng. Robot. Technol. (ICASERT), May 2019, pp. 1–6.
[20] M. Govindarajan, "Sentiment analysis of restaurant reviews using hybrid classification method," Int. J. Soft Comput. Artif. Intell., vol. 2, no. 1, pp. 17–23, 2014.
[21] O. Somantri, D. A. Kurnia, D. Sudrajat, N. Rahaningsih, O. Nurdiawan, and L. P. Wanti, "A hybrid method based on particle swarm optimization for restaurant culinary food reviews," in Proc. 4th Int. Conf. Informat. Comput. (ICIC), Oct. 2019, pp. 1–5.
[22] M. K. Saad and W. M. Ashour, "Osac: Open source Arabic corpora," in Proc. 6th ArchEng Int. Symp., EEECS, vol. 10, 2010, pp. 1–6.
[23] O. Oueslati, E. Cambria, M. B. HajHmida, and H. Ounelli, "A review of sentiment analysis research in Arabic language," Future Gener. Comput. Syst., vol. 112, pp. 408–430, Nov. 2020.
[24] A. Ghallab, A. Mohsen, and Y. Ali, "Arabic sentiment analysis: A systematic literature review," Appl. Comput. Intell. Soft Comput., vol. 2020, pp. 1–21, Jan. 2020.
[25] B. Liu, "Many facets of sentiment analysis," in A Practical Guide to Sentiment Analysis. Cham, Switzerland: Springer, 2017, pp. 11–39.
[26] M. Tubishat, N. Idris, and M. A. M. Abushariah, "Implicit aspect extraction in sentiment analysis: Review, taxonomy, oppportunities, and open challenges," Inf. Process. Manage., vol. 54, no. 4, pp. 545–563, 2018.
[27] M. M. Agüero-Torales, M. J. Cobo, E. Herrera-Viedma, and A. G. López-Herrera, "A cloud-based tool for sentiment analysis in reviews about restaurants on TripAdvisor," Proc. Comput. Sci., vol. 162, pp. 392–399, Jan. 2019.
[28] S. Hegde, S. Satyappanavar, and S. Setty, "Restaurant setup business analysis using yelp dataset," in Proc. Int. Conf. Adv. Comput., Commun. Informat. (ICACCI), Sep. 2017, pp. 2342–2348.
[29] A. Taneja, P. Gupta, A. Garg, A. Bansal, K. P. Grewal, and A. Arora, "Social graph based location recommendation using users' behavior: By locating the best route and dining in best restaurant," in Proc. 4th Int. Conf. Parallel, Distrib. Grid Comput. (PDGC), 2016, pp. 488–494.
[30] M. U. Khan, A. R. Javed, M. Ihsan, and U. Tariq, "A novel category detection of social media reviews in the restaurant industry," Multimedia Syst., vol. 6, pp. 1–14, Oct. 2020.
[31] A. R. Alaei, S. Becken, and B. Stantic, "Sentiment analysis in tourism: Capitalizing on big data," J. Travel Res., vol. 58, no. 2, pp. 175–191, Feb. 2019.
[32] A. K. A. Hassan and A. B. A. Abdulwahhab, "Reviews sentiment analysis for collaborative recommender system," Kurdistan J. Appl. Res., vol. 2, no. 3, pp. 87–91, Aug. 2017.
[33] A. Kumar and A. Jaiswal, "Empirical study of Twitter and Tumblr for sentiment analysis using soft computing techniques," in Proc. World Congr. Eng. Comput. Sci., vol. 1, 2017, pp. 1–5.
[34] C. Zuheros, E. Martínez-Cámara, E. Herrera-Viedma, and F. Herrera, "Sentiment analysis based multi-person multi-criteria decision making methodology using natural language processing and deep learning for smarter decision aid. Case study of restaurant choice using TripAdvisor reviews," Inf. Fusion, vol. 68, pp. 22–36, Apr. 2021.
[35] M. Al Omari, M. Al-Hajj, N. Hammami, and A. Sabra, "Sentiment classifier: Logistic regression for Arabic services' reviews in Lebanon," in Proc. Int. Conf. Comput. Inf. Sci. (ICCIS), Apr. 2019, pp. 1–5.
[36] L. M. Alharbi and A. M. Qamar, "Arabic sentiment analysis of eateries' reviews: Qassim region case study," in Proc. Nat. Comput. Colleges Conf. (NCCC), Mar. 2021, pp. 1–6.
[37] J. Qiu, C. Liu, Y. Li, and Z. Lin, "Leveraging sentiment analysis at the aspects level to predict ratings of reviews," Inf. Sci., vols. 451–452, pp. 295–309, Jul. 2018.
[38] P. Pongthanoo and W. Songpan, "Feature selection and reduction based on SMOTE and information gain for sentiment mining," in Proc. 5th Int. Conf. Comput. Commun. Syst. (ICCCS), May 2020, pp. 109–114.
[39] M. Scott and J. Plested, "Gan-smote: A generative adversarial network approach to synthetic minority oversampling," Aust. J. Intell. Inf. Process. Syst., vol. 15, no. 2, pp. 29–35, 2019.
[40] M.-H. Nguyen, T. M. Nguyen, D. Van Thin, and N. L.-T. Nguyen, "A corpus for aspect-based sentiment analysis in Vietnamese," in Proc. 11th Int. Conf. Knowl. Syst. Eng. (KSE), Oct. 2019, pp. 1–5.
[41] F. Iqbal, J. M. Hashmi, B. C. M. Fung, R. Batool, A. M. Khattak, S. Aleem, and P. C. K. Hung, "A hybrid framework for sentiment analysis using genetic algorithm based feature reduction," IEEE Access, vol. 7, pp. 14637–14652, 2019.
[42] A. Kumar and R. Khorwal, "Firefly algorithm for feature selection in sentiment analysis," in Computational Intelligence in Data Mining. Singapore: Springer, 2017, pp. 693–703.
[43] M. Tubishat, M. A. M. Abushariah, N. Idris, and I. Aljarah, "Improved whale optimization algorithm for feature selection in Arabic sentiment analysis," Int. J. Speech Technol., vol. 49, no. 5, pp. 1688–1707, May 2019.
[44] B. Chopard and M. Tomassini, "Particle swarm optimization," in An Introduction to Metaheuristics for Optimization. Cham, Switzerland: Springer, 2018, pp. 97–102.
[45] J. C. Bansal, "Particle swarm optimization," in Evolutionary and Swarm Intelligence Algorithms. Dhahran, Saudi Arabia: Springer, 2019, pp. 11–23.
[46] S. Sengupta, S. Basak, and R. A. Peters, II, "Particle swarm optimization: A survey of historical and recent developments with hybridization perspectives," Mach. Learn. Knowl. Extraction, vol. 1, no. 1, pp. 157–191, 2019.
[47] A.-Z. Ala'M, A. A. Heidari, M. Habib, H. Faris, I. Aljarah, and M. A. Hassonah, "Salp chain-based optimization of support vector machines and feature weighting for medical diagnostic information systems," in Evolutionary Machine Learning Techniques. Singapore: Springer, 2020, pp. 11–34.
[48] J. Yousif and M. Al-Risi, "Part of speech tagger for Arabic text based support vector machines: A review," ICTACT J. Soft Comput., vol. 9, no. 2, pp. 1–7, Jan. 2019.
[49] A. Apsemidis and S. Psarakis, "Support vector machines: A review and applications in statistical process monitoring," Data Anal. Appl., Comput., Classification, Financial, Stat. Stochastic Methods, vol. 5, pp. 123–144, Apr. 2020.
[50] J. Nalepa and M. Kawulok, "Selecting training sets for support vector machines: A review," Artif. Intell. Rev., vol. 52, pp. 857–900, Jan. 2019.
[51] A. Gosain and S. Sardana, "Handling class imbalance problem using oversampling techniques: A review," in Proc. Int. Conf. Adv. Comput., Commun. Informat. (ICACCI), Sep. 2017, pp. 79–85.
[52] A. Fernández, S. Garcia, F. Herrera, and N. V. Chawla, "SMOTE for learning from imbalanced data: Progress and challenges, marking the 15-year anniversary," J. Artif. Intell. Res., vol. 61, pp. 863–905, Apr. 2018.
[53] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, "Smote: Synthetic minority over-sampling technique," J. Artif. Intell. Res., vol. 16, pp. 321–357, Jun. 2002.
[54] D. Elreedy and A. F. Atiya, "A comprehensive analysis of synthetic minority oversampling technique (SMOTE) for handling class imbalance," Inf. Sci., vol. 505, pp. 32–64, Dec. 2019.
[55] R. Qaddoura, H. Faris, and I. Aljarah, "An efficient clustering algorithm based on the k-nearest neighbors with an indexing ratio," Int. J. Mach. Learn. Cybern., vol. 11, no. 3, pp. 675–714, Mar. 2020.
[56] X. Zheng, SMOTE Variants for Imbalanced Binary Classification: Heart Disease Prediction. Los Angeles, CA, USA: Univ. California, 2020.
[57] A. Alhudhaif, "A novel multi-class imbalanced EEG signals classification based on the adaptive synthetic sampling (ADASYN) approach," PeerJ Comput. Sci., vol. 7, p. e523, May 2021.
[58] J. Zhang and X. Li, "Phishing detection method based on borderline-smote deep belief network," in Proc. Int. Conf. Secur., Privacy Anonymity Comput., Commun. Storage. Cham, Switzerland: Springer, 2017, pp. 45–53.
[59] R. M. D'Addio, M. A. Domingues, and M. G. Manzato, "Exploiting feature extraction techniques on users' reviews for movies recommendation," J. Brazilian Comput. Soc., vol. 23, no. 1, pp. 1–16, Dec. 2017.
[60] A. Madasu, "A study of feature extraction techniques for sentiment analysis," 2019, arXiv:1906.01573.
[61] R. Ahuja, A. Chug, S. Kohli, S. Gupta, and P. Ahuja, "The impact of features extraction on the sentiment analysis," Proc. Comput. Sci., vol. 152, pp. 341–348, Jan. 2019.
[62] K. Kumar, B. S. Harish, and H. K. Darshan, "Sentiment analysis on IMDb movie reviews using hybrid feature extraction method," Int. J. Interact. Multimedia Artif. Intell., vol. 5, no. 5, p. 109, 2019.
[63] J. A. García-Díaz, M. Cánovas-García, and R. Valencia-García, "Ontology-driven aspect-based sentiment analysis classification: An infodemiological case study regarding infectious diseases in Latin America," Future Gener. Comput. Syst., vol. 112, pp. 641–657, Nov. 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0167739X2030892X
[64] A. M. Al-Zoubi, J. Alqatawna, H. Faris, and M. A. Hassonah, "Spam profiles detection on social networks using computational intelligence methods: The effect of the lingual context," J. Inf. Sci., vol. 47, no. 1, pp. 58–81, Feb. 2021.
[65] S. Srinivasan, V. Ravi, M. Alazab, S. Ketha, A.-Z. Ala'M, and S. K. Padannayil, "Spam emails detection based on distributed word embedding with deep learning," in Machine Intelligence and Big Data Analytics for Cybersecurity Applications. Cham, Switzerland: Springer, 2021, pp. 161–189.
[66] M. Habib, H. Faris, M. A. Hassonah, J. Alqatawna, A. F. Sheta, and A.-Z. Ala'M, "Automatic email spam detection using genetic programming with smote," in Proc. 5th HCT Inf. Technol. Trends (ITT), Nov. 2018, pp. 185–190.
[67] H. Faris, J. Alqatawna, A. Z. Ala’M, and I. Aljarah, ‘‘Improving email spam detection using content based feature engineering approach,’’ in Proc. IEEE Jordan Conf. Appl. Electr. Eng. Comput. Technol. (AEECT), Oct. 2017, pp. 1–6.
[68] Y. Luo and X. Xu, ‘‘Comparative study of deep learning models for analyzing online restaurant reviews in the era of the COVID-19 pandemic,’’ Int. J. Hospitality Manage., vol. 94, Apr. 2021, Art. no. 102849.

RUBA OBIEDAT received the B.Sc. degree in computer science from The University of Jordan, in 2003, the M.Sc. degree in information systems from DePaul University, in 2007, and the Ph.D. degree in e-business from the University of Salento, Lecce, Italy, in 2010. Since 2014, she has been an Associate Professor with the Department of Information Technology, The University of Jordan. Her research interests include data mining, machine learning, business intelligence, sentiment analysis, and e-business. She was awarded a full-time, competition-based Ph.D. Scholarship from the Italian Ministry of Education and Research to pursue her Ph.D. degree.

LAILA AL-QAISI received the bachelor’s degree from the King Abdullah II School for Information Technology, The University of Jordan, in 2008, the master’s degree in information technology management from the University of Sunderland, in 2012, and the master’s degree in web intelligence from The University of Jordan, in 2017. She is currently a Lecturer at the Network Computer and Information Systems Department, Information Technology Faculty, The World Islamic Sciences and Education University. Her research interests center on the web and its enormous data, including artificial intelligence, machine learning, sentiment analysis, big data analytics, churn prediction, data warehouses, data mining, and cloud computing security.

OSAMA HARFOUSHI received the B.Sc. degree in CIS from the Jordan University of Science and Technology, Jordan, in 2003, the M.Sc. degree in e-business from the University of Huddersfield, U.K., and the Ph.D. degree in mobile learning from the University of Bradford, U.K. He is currently a Full Professor with the King Abdullah II School of Information Technology, Information Technology Department, The University of Jordan. His research interests include cloud computing, e-business, and business data mining.