Sentiment_Analysis_on_E-commerce_Product_using_Mac
The best result was obtained using unigram features and the SVM method, with 80.87% accuracy [7].

Billy works on sentiment analysis using the Naive Bayes algorithm with different numbers of classes and different amounts of training data. The datasets are Bahasa Indonesia sentiments classified into 3 classes (positive, neutral, negative) and 5 classes (very positive, positive, neutral, negative, very negative). The dataset was split using two different proportions of training data, 80% and 90% [6]. The highest accuracy achieved was 77.78%, obtained with 3 classes and 90% training data [8].

Muljono's work uses Twitter users' opinions about the service of marketplaces. The dataset was retrieved by crawling opinions from Twitter using the Twitter API. The collected data comprised 1200 opinions in Bahasa Indonesia [9]. After pre-processing, the data were weighted using TF-IDF and classified using the Naive Bayes algorithm. The accuracy achieved was 93.3%.

Sudheer also works on real-time sentiment analysis of tweets about e-commerce websites. The collected tweets comprise 50000 about Amazon, 25000 about eBay, and 25000 about Alibaba. The work focuses on comparing accuracy across different classifiers, feature selection methods, and datasets. The algorithms used are Naive Bayes, Maximum Entropy, and Decision Tree. The feature selection methods used are document frequency and part-of-speech tags. The best results came from the Amazon dataset, and the Naive Bayes classifier outperformed the other classifiers most of the time [10].
| No. | Title | Authors | Problem | Method | Result |
|---|---|---|---|---|---|
| 1 | Analisa Sentimen Pelanggan Tokopedia Menggunakan Algoritma Naive Bayes Berdasarkan Review Pelanggan | Ai Nurhayatul Kamilah | Product quality has not been conveyed properly to customers | Using Naive Bayes Classification, Tokopedia product review dataset translated to English, 200 data training, 20 data testing | Accuracy 77% |
| 2 | Support Vector Machine Classifier for Sentiment Analysis of Feedback Marketplace with a Comparison Features at Aspect Level | Hario Laskito Ardi, Eko Sediyono, Retno Kusumaningrum | Compare TF-IDF, n-gram characteristics analysis to get the best classification results. | Using Support Vector Machine | The unigram character analysis model and the Support Vector Machine classification are the best models with an accuracy value of 80.87%. |
| 3 | Sistem Analisis Sentimen pada Ulasan Produk Menggunakan Metode Naive Bayes | Billy Gunawan, Helen Sasty Pratiwi, Enda Esyudha Pratama | Difficulty to read all of the reviews and opinions because the data too much | Classification with Naive Bayes Algorithm. 4 testing, classified into 3 classes and 5 classes with 2 different amount data training | Accuracy 77.78% (3 class and 90% data training), 73.89% (3 class and 80% data training), 59.33% (5 class and 90% data training), 52.66% (5 class and 80% data training) |
| 4 | Analisa Sentimen Untuk Penilaian Pelayanan Situs Belanja Online Menggunakan Algoritma Naive Bayes | Muljono, Dian Putri Artanti, Abdul Syukur, Adi Prihandono, De Rosal I. Moses Setiadi | Consumer use social media to express their opinion about the services of online marketplace | Using Naive Bayes Classification Algorithm, 1200 dataset from twitter. | Accuracy 93.3% |
| 5 | Real Time Sentiment Analysis of E-Commerce Websites Using Machine Learning Algorithms | Prof. K. Sudheer, Dr. B. Valarmathi | The e-commerce websites only maintain positive rating | 50000 datasets from amazon tweets, 25000 dataset from eBay tweets, 25000 dataset from Alibaba tweets | The best accuracy 92% obtained from amazon using document frequency feature selection |
III. METHODOLOGY
Sentiment analysis is a computational methodology for identifying and extracting the sentiment content of text, speech, or databases. It also characterizes emotions, subjective impressions, and opinions [11]. The main goal of this paper is to classify sentiments in e-commerce product reviews with the best possible performance. Sentiments are classified as either positive or negative. Fig. 1 shows the steps of the proposed work.

Fig. 1. Research Methodology
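The paper evaluates this two-class task using accuracy, precision, recall, and F-measure computed from a confusion matrix. The following is a minimal sketch of that calculation, assuming the standard definitions of these metrics (the paper's own formulas do not appear in this section). The labels are hypothetical and the helper names `confusion_counts` and `evaluate` are illustrative only; none of them come from the paper.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="positive"):
    """Count true/false positives and negatives for a two-class task."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(tp, fp, fn, tn):
    """Derive the four metrics from confusion-matrix counts."""
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        # F-measure is the harmonic mean of precision and recall.
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical labels for eight reviews (not taken from the paper's dataset).
actual = ["positive", "positive", "positive", "positive",
          "negative", "negative", "negative", "negative"]
predicted = ["positive", "positive", "positive", "negative",
             "positive", "negative", "negative", "negative"]

tp, fp, fn, tn = confusion_counts(actual, predicted)
scores = evaluate(tp, fp, fn, tn)
print(tp, fp, fn, tn)
print(scores)
```

With these hypothetical labels there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so all four metrics evaluate to 0.75.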
Table-II shows the confusion matrix used to calculate accuracy, precision, recall, and F-measure. It compares the confusion matrices of sentiment analysis using TF-IDF alone and sentiment analysis using the combination of TF-IDF and Backward Elimination, for five different machine learning methods.

Table-III: Results Comparison Without Feature Selection

Table-IV: Results Comparison With Feature Selection

Fig 5. Results Comparison

Table-III and Table-IV show the comparison of accuracy, precision, recall, and F-measure between sentiment analysis using TF-IDF alone and sentiment analysis using the combination of TF-IDF and Backward Elimination, for five different machine learning methods.

The highest accuracy for classifying sentiments in this research is 85.97%. It is achieved by using SVM with the combination of TF-IDF and Backward Elimination. Although Backward Elimination feature selection increases the process runtime, it yields better accuracy, precision, recall, and F-measure for all classifiers used in this paper. Therefore, Backward Elimination feature selection met the expectations of this research.

V. CONCLUSION
The objective of this research is to optimize sentiment analysis performance by using a feature selection strategy. Product reviews from Tokopedia, Shopee, and Bukalapak were used as the dataset, while TF-IDF feature extraction, Backward Elimination feature selection, and the SVM, Naive Bayes, Decision Tree, K-NN, and Random Forest classifiers were used to analyse the sentiment of the reviews. The best accuracy, 85.97%, is achieved by using TF-IDF and Backward Elimination with SVM, an increase of 7.91% after applying feature selection. The results show that Backward Elimination improved accuracy, precision, recall, and F-measure for all classifiers used in this research, compared with sentiment analysis that did not use any feature selection. The concern in using Backward Elimination feature selection is the longer runtime as the dataset gets bigger. Overall, it can be concluded that feature selection can be used to optimize the performance of two-class classification in sentiment analysis on e-commerce product reviews. For future work, it is recommended to use larger datasets and to compare against other feature selection methods.

ACKNOWLEDGMENT
The authors would like to take this opportunity to express their deepest gratitude to all those who have helped in completing this study, especially to Bina Nusantara University for supporting this research project.

REFERENCES
1. N. Kristiadi, "E-Commerce, Manfaat, dan Keuntungannya," 15 August 2017. [Online]. Available: https://www.kompasiana.com/novikristiadi/5992634e93be2508e06c5402/e-commerce-manfaat-dan-keuntungannya. [Accessed 13 November 2019].
2. Aseanup, "Top 10 e-commerce sites in Indonesia 2019," 6 November 2019. [Online]. Available: https://aseanup.com/top-e-commerce-sites-indonesia/. [Accessed 13 November 2019].
3. R. Liang and J.-q. Wang, "A Linguistic Intuitionistic Cloud Decision Support Model with Sentiment Analysis for Product Selection in E-commerce," International Journal of Fuzzy System, 2019.
4. T. U. Haque, N. N. Saber and F. M. Shah, "Sentiment Analysis on Large Scale Amazon Product Reviews," IEEE International Conference on Innovative Research and Development, 2018.
5. R. Safrin, K. Sharmila, T. Subangi and E. Vimal, "Sentiment Analysis on Online Product Review," International Research Journal of Engineering and Technology (IRJET), vol. 4, no. 4, pp. 2381-2388, 2017.
6. A. N. Kamilah, "Analisa Sentimen Pelanggan Tokopedia Menggunakan Algoritma Naive Bayes Berdasarkan Review Pelanggan," Simki-Techsain, vol. 1, no. 6, pp. 1-13, 2017.
7. H. L. Adi, E. Sediyono and R. Kusumaningrum, "Support Vector Machine Classifier for Sentiment Analysis of Feedback Marketplace with a Comparison Features at Aspect Level".
Retrieval Number: F7889038620/2020©BEIESP
DOI:10.35940/ijrte.F7889.038620
Journal Website: www.ijrte.org
Published By: Blue Eyes Intelligence Engineering & Sciences Publication
Sentiment Analysis on E-commerce Product Review using Machine Learning and Combination of TF-IDF and
Backward Elimination