Sentiment Analysis of Customers Reviews Using A Hybrid Evolutionary SVM-Based Approach in An Imbalanced Data Distribution

This document presents a study that proposes a hybrid approach combining Support Vector Machine (SVM) and Particle Swarm Optimization (PSO) with oversampling techniques to handle imbalanced data when performing sentiment analysis on customers' reviews of restaurants. The study collected reviews of various restaurants in Jordan from Jeeran, a social media network. PSO was used to optimize the feature weights for SVM classification, and four oversampling techniques (SMOTE, SVM-SMOTE, ADASYN, and borderline-SMOTE) were examined. The results showed that the proposed PSO-SVM approach produced the best results in terms of accuracy, F-measure, G-mean, and AUC for different versions of the imbalanced dataset.


Received January 22, 2022, accepted February 2, 2022, date of publication February 7, 2022, date of current version March 3, 2022.
Digital Object Identifier 10.1109/ACCESS.2022.3149482

Sentiment Analysis of Customers’ Reviews Using a Hybrid Evolutionary SVM-Based Approach in an Imbalanced Data Distribution
RUBA OBIEDAT 1, RANEEM QADDOURA 2, ALA’ M. AL-ZOUBI 1,3, LAILA AL-QAISI 4, OSAMA HARFOUSHI 1, MO’ATH ALREFAI 1, AND HOSSAM FARIS 1,2,5

1 King Abdullah II School for Information Technology, The University of Jordan, Amman 11942, Jordan
2 School of Computing and Informatics, Al Hussein Technical University, Amman 11831, Jordan
3 School of Science, Technology and Engineering, University of Granada, 18010 Granada, Spain
4 Information Systems and Networks Department, Faculty of Information Technology, The World Islamic Sciences and Education University, Amman 11947, Jordan
5 Research Centre for Information and Communications Technologies of the University of Granada (CITIC-UGR), University of Granada, 1, 18010 Granada, Spain

Corresponding author: Hossam Faris ([email protected])

ABSTRACT Online media plays a growing role in restaurants’ activities through social media websites,
coinciding with an increase in customers’ reviews of these restaurants. These reviews have become
the main source of information for both customers and decision-makers in this field. Any customer who
is seeking such places will check their reviews first, which usually affects their final choice. In addition,
customers’ experiences can be enhanced by utilizing other customers’ suggestions. Consequently, customers’
reviews can influence the success of a restaurant business, since they are considered the final judgment of
the overall quality of any restaurant. Thus, decision-makers need to analyze their customers’ underlying
sentiments in order to meet their expectations and improve the restaurants’ services in terms of food quality,
ambiance, price range, and customer service. The number of reviews available for various products and
services has increased dramatically, and so has the need for automated methods to collect and analyze
these reviews. Sentiment Analysis (SA) is a field of machine learning that helps analyze and predict the
sentiments underlying these reviews. SA of customers’ reviews usually faces the challenge of imbalanced
datasets, as the majority of these sentiments fall into either supporters or resistors of the product or service.
This work proposes a hybrid approach that combines the Support Vector Machine (SVM) algorithm with
Particle Swarm Optimization (PSO) and different oversampling techniques to handle the imbalanced data
problem. SVM is applied as a machine learning classification technique to predict the sentiments of reviews
using the optimized dataset, which contains reviews of several restaurants in Jordan. Data were collected
from Jeeran, a well-known social network for Arabic reviews. A PSO technique is used to optimize the
weights of the features, and four oversampling techniques, namely, the Synthetic Minority Oversampling
Technique (SMOTE), SVM-SMOTE, Adaptive Synthetic Sampling (ADASYN), and borderline-SMOTE,
were examined to produce an optimized dataset and solve the imbalance problem of the dataset. This study
shows that the proposed PSO-SVM approach produces the best results compared to different classification
techniques in terms of accuracy, F-measure, G-mean, and Area Under the Curve (AUC) for different versions
of the datasets.
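
The abstract evaluates classifiers with accuracy, F-measure, and G-mean because accuracy alone can look good on imbalanced data. The following is a minimal sketch of that effect using the commonly used textbook definitions (G-mean as the square root of sensitivity times specificity); the function name `binary_metrics` and the review counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Return accuracy, F-measure and G-mean from binary confusion counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return {"accuracy": accuracy, "f_measure": f_measure, "g_mean": g_mean}


# Hypothetical imbalanced test set: 90 positive and 10 negative reviews,
# scored for a trivial classifier that labels every review as positive.
always_positive = binary_metrics(tp=90, fn=0, fp=10, tn=0)
print(always_positive)
```

In this hypothetical case accuracy and F-measure stay high while G-mean drops to zero, because no negative review is recognized. That is the kind of failure that motivates reporting G-mean and AUC alongside accuracy.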

INDEX TERMS Sentiment analysis, SVM, PSO, SMOTE, oversampling, feature extraction, features
weighting.
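
The oversampling techniques named above share one basic idea: new minority-class samples are synthesized by interpolating between existing minority samples and their nearest minority neighbours. The sketch below illustrates that idea only, in plain Python; it is not the authors' implementation, it omits the refinements of SVM-SMOTE, ADASYN, and borderline-SMOTE, and the function name `smote_like`, its parameters, and the toy points are assumptions made for illustration.

```python
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Other minority samples, sorted by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class: four feature vectors on the corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like(minority, n_new=6)
print(new_points)
```

Every synthetic point lies on a segment between two real minority samples, so the minority region is filled in rather than simply duplicated. In practice a maintained library implementation would normally be preferred over a hand-written loop like this one.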

The associate editor coordinating the review of this manuscript and approving it for publication was Alberto Cano.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/ VOLUME 10, 2022
R. Obiedat et al.: Sentiment Analysis of Customers’ Reviews Using Hybrid Evolutionary SVM-Based Approach

I. INTRODUCTION
The popularity of social media websites has witnessed tremendous growth in the last few years [1]. Social media sites have grown not only in terms of volume but also in their importance to different aspects of life, including business, politics, and education [2]. Nowadays, all businesses are offering their products and services online. These sites allow consumers to share their experiences and recommendations about these businesses’ products, places, and services on different platforms such as TripAdvisor, Yelp, Facebook and Jeeran [3].

Online reviews represent the electronic version of word of mouth (WOM), which is an important aspect of traditional marketing. While WOM is restricted to family, friends or close people, online reviews have a worldwide reach [4]. Many websites allow users to rate and review different products and services. These reviews become the main source of information for potential customers who are seeking such products [5]. A survey conducted by BrightLocal (2020) found that 79% of customers trust online reviews as much as personal recommendations [6].

These days, whenever a customer wants to buy a new product online, he or she will consider what other people think about it, how they rate it, and their feedback and comments about the product before making a purchase [7]. According to a BrightLocal survey in 2020, 87% of consumers had checked online reviews of local businesses [6]. These reviews may affect the customer’s final choice since people trust customers’ reviews more than advertisements produced by a company. Furthermore, customers’ experiences can be enhanced by utilizing other customers’ suggestions [3].

Due to the widespread availability of social websites and applications, the number of reviews available for various products has dramatically increased [8], and so has the need for automated methods to collect and analyze these reviews [9]. These methods are essential to speed up and improve the quality of the decision-making process [10].

Sentiment Analysis (SA) can be used to deduce users’ feelings about various topics by processing their implicit attitudes and analyzing the underlying sentiments hidden in their comments [11]. Sharif et al. [12] defined SA as “analyzing people’s sentiments, opinions, appraisals, attitudes, evaluations and emotions towards such entities as organizations, products, services, individuals, topics, issues, events and their attributes, as presented online via text, video and other means of communication.” Sentiment analysis is also referred to as opinion mining based on natural language processing, text analysis, and computational techniques [13]. It can be applied at the document level, sentence level, or aspect level [14]. It aims to classify customers’ attitudes towards a product or service as expressed in the comments, reviews, and posts as positive, negative, or neutral comments [15]. Two main SA approaches can be followed, namely, machine learning approaches and the lexicon-based approach [16]. Different machine learning algorithms are used to evaluate results in the sentiment field; the most common ones are Naïve Bayes (NB), SVM, Logistic Regression (LR), Random Forest (RF), and K-Nearest Neighbors (k-NN) [13].

Sentiment analysis is essential for every business as it can be used to improve the decisions of customers, business owners, and service providers [17]. SA is used by business owners to enhance their businesses’ image and increase their success [3] since it helps decision-makers improve the quality of their products and services based on their customers’ reviews; thus, the business can provide more praiseworthy services. This leads to higher customer satisfaction and more sales and revenues for the business [12]. On the other hand, customers can utilize these reviews in making thoughtful decisions based on previous customers’ experiences [17].

Recently, it has been noticed that almost all restaurants have a presence in the online social world. Restaurants are becoming increasingly present on different social websites, and so are customers’ reviews of these restaurants [18]. Online restaurant reviews are considered a rich source of information that helps attract new customers. Checking reviews by locals and tourists before visiting restaurants has become a trend [17]. This is supported by the 2020 BrightLocal survey, which revealed that 93% of consumers check a restaurant’s reviews before visiting it [6]. Consequently, customers’ reviews can influence the success of a restaurant business [19]. It was found that the more positive comments a restaurant receives, the more customers visit its web pages and physical locations, which leads to more popularity and success [4]. In contrast, negative comments lead to the loss of trustworthiness of the restaurant and reduced revenue [17]. According to the 2020 BrightLocal survey, 94% of consumers are more likely to buy from a business if it has received positive reviews, while 92% are less likely to use it if it has been given bad reviews [6]. People tend to post reviews when they have either a strongly positive or strongly negative experience (generally, the number of positive reviews exceeds the number of negative ones) [4].

Customers’ reviews and opinions are considered the final judgment of the overall quality of any restaurant. Thus, owners need to analyze their customers’ underlying sentiments so they can meet their expectations and offer customized services in terms of food quality, ambiance, price, and customer service [18].

Many studies have followed Machine Learning (ML) approaches for restaurant sentiment analysis. A study done by Zahoor et al. [3] used NB Classifier, logistic regression, SVM, and RF methods to analyze customers’ sentiments about restaurants in Karachi. The study annotated 4000 reviews from a well-known Pakistani Facebook community called SWOT’S. Random forest gained the highest performance, with an accuracy of 95%. Another study conducted by Sharif et al. [19] classified customers’ reviews for 1000 restaurants (written in Bengali) into positive and negative classes using three machine learning algorithms, namely, Decision Tree (DT), RF, and multinomial NB. The results showed that the multinomial NB method achieved the best results, with 80.48% accuracy.

Furthermore, sentiment analysis can be used to build a recommender system in different fields including the restaurant industry. Asani et al. [11], for example, collected people’s sentiments from the TripAdvisor website and built a customized restaurant recommender system based on people’s opinions and food preferences. The recommender system suggests restaurants according to users’ preferences, thus helping them to choose the best option and make an informed decision. Choosing the best restaurant among many unknown