Classification of Customer Feedbacks Using Sentiment Analysis Towards Mobile Banking Applications
Corresponding Author:
Nurazzah Abd Rahman
Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA
40450 Shah Alam, Selangor, Malaysia
Email: [email protected]
1. INTRODUCTION
A customer can be defined as a receiver, consumer or buyer of a company’s products and services.
Opinions and feedback given by customers after experiencing services and products are beneficial for
regulating business operations so that they fit customers’ needs. In this era of the internet and the
digital world, a vast amount of knowledge and information is readily available at one’s fingertips. The
advancement of technology is pushing the banking industry towards mobile banking. The
transformation of banking transactions from paper-based to electronic payment is one example of how
the banking industry has evolved. In the banking industry, the prospect of revenue growth and
operational efficiency is essential for any business to stay relevant and survive. High competition
between banks is one of the factors that leads to this migration.
Nowadays, customers prefer to use mobile phones for all activities, including obtaining services.
One reason is that owning a mobile phone is possible regardless of social status. Internet
network services that are readily available at low cost have subsequently contributed to the rise in
mobile phone users. This is where the need for mobile banking applications emerged. Mobile banking is
defined as the capability to perform financial transactions via a mobile device [1]. Thus, customers can
access their respective bank accounts and perform transactions anytime and anywhere without going to a
physical bank.
In Malaysia, a number of mobile banking applications are available, namely Maybank My,
Commerce International Merchant Bankers (CIMB) Clicks, PB Engage, Rashid Hussein Bank (RHB) Now
and many more. With a wide range of feedback and reviews available, sentiment analysis is
important for a company to classify those comments and improve its products and services. In this
competitive world, building a strong relationship with the customer has become an important strategy.
Banks must compete to provide the best customer satisfaction by introducing innovative strategies [2].
Ranjan et al. [3] stated that businesses spend a huge amount of money and time on brand monitoring and
gathering real-time customer feedback. Dissatisfaction with the services provided affects business
reputation and leads to customer attrition, which consequently reduces the business’s revenue. Customer
attrition is an important issue not only in the banking industry but also for insurance companies, mobile
service providers and many others [4]. Prompt action should be taken to retain existing customers and
attract potential customers to do business with the bank. Analytics is undeniably important for a
business if it can be rationally applied to monitoring customer behaviour towards business operations [5].
The aim of this research is to classify customers’ feedback on mobile banking applications by
using machine learning (ML) techniques. Three objectives are defined: first, to classify the customers’
feedback based on sentiment polarity score; second, to identify keywords related to customers’ feedback
on their mobile banking experience; and lastly, to evaluate the performance of sentiment analysis
classifiers by using performance metrics. This research focuses on the classification of large-scale
customer feedback on mobile banking applications in Malaysia. The customer feedback data are
extracted for six (6) banking institutions in Malaysia from the reviews posted on their official
mobile banking application platforms. The ML algorithms naïve Bayes (NB) and support vector machine
(SVM) are used for sentiment classification, and their accuracy is compared. For NB, the variants used
are multinomial naïve Bayes (MNB) and Bernoulli naïve Bayes (BNB), while for SVM a linear kernel is used
(linear support vector machine, LSVM). The findings of this research are significant as they provide
insight for banking institutions into users’ reactions to their mobile banking applications.
According to [2], it is vital for banks to collect customer feedback from various banking services. With the
results from this research, improvements could be made to produce an application that suits
customers’ needs by enhancing the related aspects in order to compete with other mobile applications in
the banking industry. In addition, this research is useful in providing information for public users on
the performance of the mobile banking applications available. The quality of service provided through
mobile banking could be a reason for a customer to do business with a bank.
Furthermore, a good mobile banking platform would be a factor in retaining customers’ savings and
financial transactions, and thus in building a strong bond between the banking institution and the customer.
Research on sentiment analysis of customer feedback and reviews has been widely covered in
various fields by past researchers. Hasan et al. [6] conducted an ML-based sentiment analysis focusing on
user tweets about politics in Pakistan. The tweets were written in Urdu and translated to English. In that
study, the accuracy of NB and SVM was compared. The results show that word sense disambiguation
(W-WSD) had the highest accuracy with the NB classifier at 79.00%, followed by 76.00% and
54.75% for TextBlob and SentiWordNet. With SVM, TextBlob had the highest accuracy in comparison with
W-WSD and SentiWordNet, at 62.67%, 62.33% and 53.33% respectively.
Lien [7] analysed reviews from bank customers in Norway. The author defined the polarity
proportion based on the 1-5 star ratings given by the bank customers. The reviews were gathered from three
sources: the banks’ review sites, social media and discussion forums. The ML algorithms Gaussian
naïve Bayes (GNB), LSVM and maximum entropy (ME) were used, and the models were evaluated with 5-fold
cross validation. The results show that ME achieved the highest accuracy. Rana and Singh [8] used LSVM and
NB to classify film user reviews and detect opinion. The accuracy of each algorithm was computed for each
movie genre, namely action, adventure, drama and romance. In the experiment, LSVM showed a higher
accuracy than NB in all of these genres, with the highest accuracy for the drama genre.
Furthermore, Ayo et al. [9] used RapidMiner to combine tweets and Facebook comments from five major
banks in Nigeria. The results are divided into two parts: sentiment analysis and clustering
analysis. The sentiment analysis compared the most negative value (MNV) and most positive
value (MPV) of tweets and comments between banks. Kumar and Dabas [10] proposed a social media
complaint workflow automation tool that uses sentiment analysis on social media to respond actively to
complaints. In that study, three NB variants (MNB, BNB and GNB) and SVM were used.
The classifiers were evaluated on social media posts about HDFC Bank India.
The results show that the MNB classifier achieved approximately 83% and 75% accuracy in
sentiment classification and department classification respectively. Altrabsheh et al. [11] analysed
real-time student feedback with sentiment analysis. Data on students’ feedback, opinions and feelings
about lecture sessions were collected. The techniques used in the experiments were NB, complement naïve
Bayes (CNB), ME and SVM with three types of kernel, namely linear, radial basis and polynomial. The
results show that SVM scored highly in precision, recall and F-score. Shi and Li [12] used a sentiment
analysis model for online hotel reviews. The study used SVM for sentiment classification with unigram
features carrying frequency and term frequency-inverse document frequency (TF-IDF) information. The
results show that TF-IDF is more efficient. Go et al. [13] used ML techniques to categorize Twitter posts
as either positive or negative. The supervised classifiers used were NB, ME and SVM, of which NB had the
highest result of 84.2% compared to the other two classifiers.
From the literature, SVM and NB are the techniques most commonly used in sentiment analysis
research, and the studies reviewed above show how frequently they are applied to customer
feedback. With this motivation, this paper proposes to use the NB and SVM methods and compare the results
to find the best model for sentiment analysis of customer feedback on mobile banking applications.
2. RESEARCH METHOD
Based on the cross-industry standard process for data mining (CRISP-DM) method [14], this research
proposes a research model as shown in Figure 1. The processes involved are data collection, data
pre-processing, model development, model evaluation and lastly results deployment.
In the first process, data on customers’ feedback on mobile banking applications were
collected from the Google Play Store for Maybank, CIMB, Public Bank, RHB Bank, Hong Leong Bank and AM
Bank. The reviews were extracted on 7 March 2020, focusing on the first 100 pages of the user review
section. The authors of the reviews are bank customers, and the reviews were written in a mix of
languages consisting of Malay, Chinese and English. Three attributes are used, namely ‘Rating’, which
contains the rating given by the user, ‘Descriptions’, which contains the user’s review of the
application, and ‘Bank Type’, which identifies the bank for each review.
Pre-processing improves sentiment analysis by cleaning undesirable elements from the data,
which increases accuracy and reduces errors in the processed outcomes. The user
reviews contain a great amount of vague information that needs to be eliminated. Shekhawat [15] stated that
the data cleaning process is important for computing the sentiment score so that the machine can easily
understand the text. Pre-processing involves demojization, which transforms emojis into their textual
equivalents [16], [17], and noise removal for text normalization [18]. The reviews captured are in
multiple languages, namely Malay, Chinese and English. Since Malaysia is a multi-racial country, it is
normal for people to give reviews and feedback in their preferred language. For this research,
translation is applied to convert the reviews into English. Desai and Narvekar [19] stated that spelling
errors are produced unintentionally due to human error. The spelling correction process is important to
prevent the system from ignoring important words in the reviews. Tokenization is used to divide the text
of documents into a series of words or a sequence of tokens [20], [21]. Stopword elimination is
implemented to enhance system performance while reducing the amount of text [22], [23]. Lemmatization is
the technique of reducing related words to a common root form [8], [22]. An example of lemmatization can
be seen in word variations such as “feature”, “featuring”, “features” and “featured”, all of which
belong to the root word “feature”.
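The later pre-processing steps (noise removal, tokenization, stopword elimination and lemmatization) can be illustrated with a minimal sketch. It assumes reviews that have already been translated to English, and it uses a tiny hypothetical stopword list and lemma dictionary, since the paper does not list the resources it used; demojization, translation and spelling correction are left out.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only; the paper
# does not specify its stopword list or lemmatization dictionary.
STOPWORDS = {"the", "is", "a", "an", "and", "to", "of", "this", "it", "are"}
LEMMAS = {"featuring": "feature", "features": "feature", "featured": "feature"}


def preprocess(review: str) -> list[str]:
    """Lowercase, remove noise, tokenize, drop stopwords and lemmatize."""
    text = review.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # noise removal: keep letters only
    tokens = text.split()  # whitespace tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]  # stopword elimination
    return [LEMMAS.get(t, t) for t in tokens]  # dictionary-based lemmatization


print(preprocess("The app is featuring new features, and it featured a login!"))
```

A real pipeline would normally rely on a full stopword list and a proper lemmatizer rather than hand-written dictionaries; the sketch only shows the order of the steps.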
Classification of customer feedbacks using sentiment analysis towards … (Nurazzah Abd Rahman)
1582 ISSN: 2252-8938
A feature in language processing refers to textual data that has been converted to a numeric vector [24].
In the data extraction phase, a bag of words needs to be converted into a vector model. According to [18],
there are three ways of converting terms into vectors, namely term frequency, term occurrences and
TF-IDF. TF-IDF is a weighting factor that can be used to replace binary and word count
representations [25]. The conversion of terms to vectors assigns a lower weight to irrelevant terms and a
higher weight to relevant terms, with vector values of 0 and 1 respectively. In this research, TF-IDF is
used as the weighting scheme to create the word vector [26]–[28].
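As a rough illustration of the weighting idea, the sketch below computes TF-IDF for a few tokenized reviews. The paper does not state which TF-IDF formula was used, so the textbook form (relative term frequency multiplied by log(N / document frequency)) is assumed here; library implementations often add smoothing and normalization. The function name and the toy reviews are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed formula: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy tokenized reviews (hypothetical data).
docs = [["app", "login", "fail"], ["app", "good"], ["app", "fast", "good"]]
vectors = tf_idf(docs)
print(vectors[0])
```

In this toy corpus "app" appears in every review, so its weight is zero, while "login" appears in only one review and receives a higher weight.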
In model development, the clean dataset is used to detect whether the sentiment of each customer review is
positive, neutral or negative. To provide valuable insight from the emotions and opinions stated,
the TextBlob sentiment analyser is used to determine the polarity and subjectivity scores of each review.
The polarity score [29] is a value from -1 to 1 assigned based on the words used, where a negative
score represents a negative statement, a positive score represents a positive statement and a value of
zero indicates a neutral statement. The subjectivity score, on the other hand, indicates whether the
review is subjective or objective. It ranges from 0 to 1; the closer the score is to 1, the more
subjective the text is. Reviews with a polarity score greater than 0 are classified as positive
sentiment, those with a score less than 0 as negative sentiment and those with a score equal to zero as
neutral sentiment. The polarity score is determined from the reviews given by users under the
“Descriptions” column. According to [30], a survey of user reviews posted on the Google Play Store, user
reviews act as an important source of knowledge for developers as they provide extensive information
about issues and improvements that can be made to the application. Noei [31] stated that among the
important pieces of information hidden in user reviews are users’ expectations and concerns, feature
requests, bug reports, and guidelines for planning a future release.
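The thresholding rule described above can be written as a small function. This is a sketch with a hypothetical function name and made-up scores; in the study the score would come from TextBlob (for example `TextBlob(text).sentiment.polarity`), which is not called here so that the example stays self-contained.

```python
def label_from_polarity(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to one of three sentiment classes."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


# Hypothetical polarity scores for three reviews.
scores = [0.6, -0.25, 0.0]
labels = [label_from_polarity(s) for s in scores]
print(labels)
```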
Sentiment analysis consists of three polarity classes, which are positive, negative and neutral. The
sentiment polarity is set based on the user review instead of the rating score because star ratings lack
a clear definition and are interpreted inconsistently. The star rating given might differ from the
review written. Once the polarity scores have been determined, the sentiment datasets are evaluated. The
evaluation is based on the concept of the confusion matrix. The classifiers used are SVM and NB. For NB,
the variants used are MNB and BNB, while for SVM a linear kernel is used (LSVM). Because the sentiment
classification has three classes, the confusion matrix is extended as shown in Table 1.
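To make the three-class evaluation concrete, the following sketch derives accuracy and per-class precision, recall and F1 from a 3x3 confusion matrix. It assumes the rows hold the actual classes and the columns the predicted classes; the counts are hypothetical and are not the paper's results.

```python
LABELS = ["negative", "neutral", "positive"]


def per_class_metrics(cm):
    """cm[i][j] = number of reviews with actual class i predicted as class j."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    report = {"accuracy": correct / total}
    for k, label in enumerate(LABELS):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(len(cm)))  # column sum
        actual_k = sum(cm[k])  # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Hypothetical counts for illustration only.
cm = [[8, 1, 1],
      [2, 6, 2],
      [0, 1, 9]]
report = per_class_metrics(cm)
print(report["accuracy"], report["positive"])
```

Precision divides the diagonal entry by its column total (everything predicted as that class), whereas recall divides it by its row total (everything that actually belongs to that class).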
can be assumed that the user is having an issue when trying to log in to their account. It can also be said
that the negative sentiments are closely related to the performance of the application in performing
online transactions.
In this section, the accuracy, precision, recall and F1 score are compared. Table 3 shows the
confusion matrix for MNB. It shows that, out of 5,017 total reviews, 846 negative sentiments were
predicted correctly, and the overall accuracy of the MNB classifier is 83.16%. Figure 5 shows the
precision, recall and F-measure for each sentiment class using the weighted average. The negative and
neutral classes have the highest recall values, meaning that a higher proportion of the actual negative
and neutral sentiments were correctly predicted. The positive class, on the other hand, shows high
precision, which is defined as the proportion of texts correctly predicted as positive over the total
number of texts predicted as positive.
Table 4 shows the confusion matrix for BNB. Out of 5,017 total reviews, 729 negative, 947 neutral and
2,577 positive sentiments were predicted correctly. BNB achieves an accuracy of 84.77%, while its
precision, recall and F1-score are shown in Figure 6. As with MNB, a higher proportion of the actual
negative and neutral sentiments were correctly predicted. The positive class shows the highest
precision.
Furthermore, the confusion matrix for LSVM is shown in Table 5. Out of 5,017 total reviews,
988 negative, 1,164 neutral and 2,723 positive sentiments were predicted correctly, giving an accuracy
of 97.17% for the LSVM classifier. Figure 7 shows the precision, recall and F-measure for each sentiment
class. LSVM shows a slightly different result from both NB classifiers, with the negative class having a
high recall value. On the other hand, the neutral and positive classes show high precision values of
98.23% and 98.27% respectively. The results show that LSVM is the best technique, with the highest
values of accuracy, precision, recall and F1-score, for predicting the sentiment of customer feedback on
mobile banking applications. For the results obtained above, the positive and negative word clouds
indicate the most frequent words that appear in the feedback given by customers.
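As a check on the figures above, accuracy is the share of reviews that lie on the diagonal of the confusion matrix. Using only the correctly predicted counts reported for BNB and LSVM (the text gives just the negative count for MNB, so it is left out), the stated accuracies can be reproduced:

```latex
\[
\text{Accuracy} = \frac{\text{correctly predicted reviews}}{\text{total reviews}}
\]
\[
\text{Accuracy}_{\text{BNB}} = \frac{729 + 947 + 2577}{5017} = \frac{4253}{5017} \approx 84.77\%
\]
\[
\text{Accuracy}_{\text{LSVM}} = \frac{988 + 1164 + 2723}{5017} = \frac{4875}{5017} \approx 97.17\%
\]
```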
4. CONCLUSION
This research has potential for improvement in future sentiment analysis studies. To obtain more valuable
information from mobile banking application reviews, additional features may be implemented to give a
more comprehensive solution for determining the sentiment of user reviews. As the current research
considers only the reviews given by users, a hybrid study that analyses the sentiment behind both the
user review and the rating score could be done by using the mean of the star rating and the numeric
polarity score. This would provide valuable insight from the emotions and opinions written by users.
Future work should also evaluate performance using deep learning models such as convolutional neural
network (CNN), recurrent neural network (RNN) and long short-term memory (LSTM). Deep learning models are
worth exploring as they may provide deeper analysis of sentiment and more accurate sentiment detection.
ACKNOWLEDGEMENTS
This research is based upon work supported by the Faculty of Computer and Mathematical Sciences,
Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia.
REFERENCES
[1] K. Drexelius and M. Herzig, “Mobile banking and mobile brokerage–successful applications of mobile business?,” International
management & Consulting, vol. 16, no. 2, pp. 20–23, 2001.
[2] L. A. R. Mr and M. Kalhoro, “Customer satisfaction in commercial bank of Sindh Province a case study of Bank AL Falah,”
Library Philosophy and Practice (e-journal) Qurat-ul-Ain Abro, vol. 2331, 2019, Accessed: Jun. 16, 2022. [Online]. Available:
https://digitalcommons.unl.edu/libphilprac/2331.
[3] S. Ranjan, S. Sood, and V. Verma, “Twitter sentiment analysis of real-time customer experience feedback for predicting growth
of Indian Telecom Companies,” in 2018 4th International Conference on Computing Sciences (ICCS), Aug. 2018, pp. 166–174,
doi: 10.1109/iccs.2018.00035.
[4] A. Rombel, “CRM shifts to data mining to keep customers,” Global Finance, vol. 15, no. 11, pp. 97–98, 2001.
[5] P. Tanwar and P. Rai, “A proposed system for opinion mining using machine learning, NLP and classifiers,” IAES International
Journal of Artificial Intelligence (IJ-AI), vol. 9, no. 4, pp. 726–733, Dec. 2020, doi: 10.11591/ijai.v9.i4.pp726-733.
[6] A. Hasan, S. Moin, A. Karim, and S. Shamshirband, “Machine learning-based sentiment analysis for Twitter accounts,”
Mathematical and Computational Applications, vol. 23, no. 1, p. 11, Feb. 2018, doi: 10.3390/mca23010011.
[7] A. Lien, “Brand sentiment analysis of the Norwegian banking sector,” Norwegian University of Science and Technology (NTNU),
2017.
[8] S. Rana and A. Singh, “Comparative analysis of sentiment orientation using SVM and Naive Bayes techniques,” in 2016 2nd
International Conference on Next Generation Computing Technologies (NGCT), Oct. 2016, pp. 106–111, doi:
10.1109/ngct.2016.7877399.
[9] C. K. Ayo, A. A. Ezenwoke, and I. T. Afolabi, “Competitive analysis of social media data in the banking industry,” International
Journal of Internet Marketing and Advertising, vol. 11, no. 3, p. 183, 2017, doi: 10.1504/ijima.2017.10006719.
[10] A. Kumar and V. Dabas, “A social media complaint workflow automation tool using sentiment intelligence,” in Proceedings of
the world congress on engineering, 2016, vol. 1.
[11] N. Altrabsheh, M. Cocea, and S. Fallahkhair, “Sentiment analysis: towards a tool for analysing real-time students feedback,” in
2014 IEEE 26th International Conference on Tools with Artificial Intelligence, Nov. 2014, pp. 419–423, doi:
10.1109/ictai.2014.70.
[12] H.-X. Shi and X.-J. Li, “A sentiment analysis model for hotel reviews based on supervised learning,” in 2011 International
Conference on Machine Learning and Cybernetics, Jul. 2011, pp. 950–954, doi: 10.1109/icmlc.2011.6016866.
[13] A. Go, R. Bhayani, and L. Huang, “Twitter sentiment classification using distant supervision,” Stanford, 2009.
[14] M. A. Burhanuddin, R. Ismail, N. Izzaimah, A. A.-J. Mohammed, and N. Zainol, “Analysis of mobile service providers
performance using Naive Bayes data mining technique,” International Journal of Electrical and Computer Engineering (IJECE),
vol. 8, no. 6, pp. 5153–5161, Dec. 2018, doi: 10.11591/ijece.v8i6.pp5153-5161.
[15] B. S. Shekhawat, “Sentiment classification of current public opinion on BREXIT: Naïve Bayes classifier model vs Python’s
TextBlob approach,” School of Computing National College of Ireland, Dublin, 2019.
[16] A. Malte and S. Sonawane, “Effective distributed representation of code-mixed text,” Dec. 2019, doi:
10.1109/indicon47234.2019.9028960.
[17] M. Bojkovský and M. Pikuliak, “STUFIIT at SemEval-2019 Task 5: multilingual hate speech detection on Twitter with MUSE
and ELMo embeddings,” 2019, doi: 10.18653/v1/s19-2082.
[18] V. Kalra and R. Aggarwal, “Importance of text data preprocessing & implementation in RapidMiner,” in Proceedings of the First
International Conference on Information Technology and Knowledge Management, Jan. 2018, pp. 71–75, doi:
10.15439/2017km46.
[19] N. Desai and M. Narvekar, “Normalization of noisy text data,” Procedia Computer Science, vol. 45, pp. 127–132, 2015, doi:
10.1016/j.procs.2015.03.104.
[20] P. Tripathi, S. K. Vishwakarma, and A. Lala, “Sentiment analysis of English Tweets using Rapid Miner,” in 2015 International
Conference on Computational Intelligence and Communication Networks (CICN), Dec. 2015, pp. 668–672, doi:
10.1109/cicn.2015.137.
[21] H. AL-Rubaiee, R. Qiu, and D. Li, “Analysis of the relationship between Saudi twitter posts and the Saudi stock market,” in 2015
IEEE Seventh International Conference on Intelligent Computing and Information Systems (ICICIS), Dec. 2015, pp. 660–665,
doi: 10.1109/intelcis.2015.7397193.
[22] T. Verma, R. Renu, and D. Gaur, “Tokenization and filtering process in RapidMiner,” International Journal of Applied
Information Systems, vol. 7, no. 2, pp. 16–18, Apr. 2014, doi: 10.5120/ijais14-451139.
[23] F. Magliani et al., “A comparison between preprocessing techniques for sentiment analysis in Twitter,” 2016.
[24] M. M. J. Soumik, S. S. M. Farhavi, F. Eva, T. Sinha, and M. S. Alam, “Employing machine learning techniques on sentiment
analysis of Google Play Store Bangla reviews,” in 2019 22nd International Conference on Computer and Information Technology
(ICCIT), Dec. 2019, pp. 18–20, doi: 10.1109/iccit48885.2019.9038348.
[25] N. A. Rahman, F. I. M. R. Syamil, and S. B. bin Rodzman, “Development of mobile application for Malay translated hadith
search engine,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 20, no. 2, pp. 932–938, Nov. 2020, doi:
10.11591/ijeecs.v20.i2.pp932-938.
[26] S. R. M. Najib, N. A. Rahman, N. K. Ismail, N. Alias, Z. M. Nor, and M. N. Alias, “Comparative study of machine learning
approach on Malay translated hadith text classification based on sanad,” MATEC Web of Conferences, vol. 135, p. 66, Nov. 2017,
doi: 10.1051/matecconf/201713500066.
[27] N. S. Shaeeali, A. Mohamed, and S. Mutalib, “Customer reviews analytics on food delivery services in social media: a review,”
IAES International Journal of Artificial Intelligence (IJ-AI), vol. 9, no. 4, pp. 691–699, Dec. 2020, doi: 10.11591/ijai.v9.i4.pp691-699.
[28] N. S. A. A. Bakar, R. A. Rahmat, and U. F. Othman, “Polarity classification tool for sentiment analysis in Malay language,” IAES
International Journal of Artificial Intelligence (IJ-AI), vol. 8, no. 3, pp. 259–263, Dec. 2019, doi: 10.11591/ijai.v8.i3.pp259-263.
[29] N. Seman and N. A. Razmi, “Machine learning-based technique for big data sentiments extraction,” IAES International Journal of
Artificial Intelligence (IJ-AI), vol. 9, no. 3, pp. 473–479, Sep. 2020, doi: 10.11591/ijai.v9.i3.pp473-479.
[30] E. Noei and K. Lyons, “A survey of utilizing user-reviews posted on google play store,” in Proceedings of the 29th Annual
International Conference on Computer Science and Software Engineering, 2019, pp. 54–63, doi: 10.1145/1122445.1122456.
[31] E. Noei, “Succeeding in mobile application markets,” Queen’s University Kingston, Ontario, Canada, 2018.