To Find Out The Quality and Popularity of A Product by Using User Comments
AYESHA SAJJAD
ADNAN AHMED
WALEED AMIR
INTRODUCTION
User reviews on social platforms strongly influence a product's reputation. Other customers read
them before deciding to purchase, and organisations can also benefit from them by identifying
which product parameters satisfy customers and which do not. Because of the huge volume of user
reviews across different platforms, it is challenging for organisations to identify which
parameters satisfy their customers [1].
Text mining is the process of examining large amounts of unstructured data (e.g. user
reviews) and converting it into structured data, so that the emotions and behaviours of
reviewers can be observed.
Therefore, in this project we will design a model that uses text mining and sentiment analysis
to classify reviews, identifying which features of a product customers found good, bad or
neutral, and how popular the product is.
OBJECTIVES
OUTCOME
A software component that includes all the given functionality.
An evaluation of the built model against a dataset developed from user comments.
A complete software application that includes this component and presents all
results visually.
Benefit of the project:
Organizations can evaluate what customers want, which products were successful and which
products need further enhancement. They can also shape their market strategies to reach the
maximum number of customers.
By assessing user reviews, organizations will be able to identify which features a product is
lacking and can work on those features in future to satisfy customers.
BACKGROUND/LITERATURE REVIEW
Zhang and Hua [1] compared two methods, Naïve Bayes and Support Vector Machine (SVM), for
analysing user reviews, to find out which predicts user behaviour from reviews more accurately.
They concluded that the Naïve Bayes algorithm is more effective than SVM. They also reported
that the reviews were short, averaging 17 words; the shortest review had only 1 word and the
longest up to 6000 words. Moreover, they stated that the text length of reviews follows a
power-law distribution, and that the accuracy of sentiment polarity classification rises as
the word count decreases.
Chrystal and Joseph [2] used a structured Support Vector Machine to perform text mining on
reviews of electronic gadgets. They developed a model to analyse the performance and
flexibility of the structured SVM, building a confusion matrix to measure how well it
predicted and classified text documents. The model had four modules: pre-processing,
learning, classification and evaluation. Their system achieved an overall accuracy of 80.4%.
Jack and Tsai [3] worked only on Amazon reviews. They found that high-quality reviews are
those that subjectively comment on several product features. Their paper reviews a method of
applying text mining techniques to compare and highlight the top customer opinions of a product.
The research focused primarily on understanding what was really important to users, what
positively or negatively affected product reviews, and what specifically users chose as
highlights or pain points when reviewing laptops and tablets. Their model applied text
mining to understand consumer feedback about purchased products. They concluded that
crowdsourced data in the form of online reviews can inform a company about how customers
think about and react to its products, what is most important to them and what is most urgent
to fix; it is a method of feedback to manufacturers.
According to Wahyudi and Kristiyanti [4], the Support Vector Machine is weak at selecting
appropriate parameters or features. They therefore combined it with a feature selection method,
Particle Swarm Optimization (PSO), to increase the classification accuracy of the SVM. Their
data set consisted of 100 positive and 100 negative smartphone reviews and 4 words related to
product sentiment, namely bad, fail, good and premium. The data set was pre-processed in
3 steps: tokenization, stop-word removal and stemming. Sentiment analysis using SVM alone
reached an accuracy of 82.00%; with the addition of PSO, it reached 94.50%.
PROJECT METHODOLOGY
Feature Selection & Extraction
It is further divided into:
1) Tokenization: A text document is a collection of statements. This step divides the
statements into words by splitting on blank spaces, commas etc.
2) Stop word removal: This step removes stop words such as 'a', 'is', 'of', 'an' and
so on. Any word that appears in the stop word list is removed from the documents.
3) Stemming: Stemming is the process of identifying the root of a word; for example,
presented, presenting and presentation are all converted to the root word present. The most
commonly used stemming algorithm is Porter's algorithm [5].
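The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's implementation: the stop word list and the suffix list are small assumed examples, and toy_stem is a hypothetical suffix-stripping stand-in, not Porter's algorithm.

```python
import re

# Assumed, deliberately tiny stop word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "is", "of", "the", "and", "was", "it", "this", "to"}

# Assumed suffix list for a toy stemmer. This is NOT Porter's algorithm.
SUFFIXES = ("ation", "ing", "ed", "s")


def tokenize(text):
    """Split text into lowercase word tokens, dropping spaces and punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Drop every token that appears in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def toy_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [toy_stem(t) for t in remove_stop_words(tokenize(text))]


review = "The battery is presented, presenting a presentation of cameras."
print(preprocess(review))
```

With this toy stemmer, presented, presenting and presentation all reduce to present, which mirrors the example given in step 3.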
Feature Weighting
Feature weighting will be done using techniques such as Term Frequency (TF) and Term
Frequency-Inverse Document Frequency (TF-IDF).
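As a rough illustration of these weights, the sketch below computes TF as a term's relative frequency within a review and IDF as log(N / df), where N is the number of reviews and df is the number of reviews containing the term. This is one common variant among several, chosen here as an assumption; the function names and the two sample reviews are hypothetical.

```python
import math
from collections import Counter


def term_frequency(tokens):
    """Relative frequency of each term within one document."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequency(documents):
    """idf(t) = log(N / df(t)), where df(t) counts documents containing t."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(documents):
    """Return one {term: weight} dictionary per document."""
    idf = inverse_document_frequency(documents)
    return [
        {term: tf * idf[term] for term, tf in term_frequency(doc).items()}
        for doc in documents
    ]


# Hypothetical, already pre-processed reviews.
reviews = [["good", "battery"], ["bad", "battery"]]
for weights in tf_idf(reviews):
    print(weights)
```

In this example "battery" appears in every review, so its TF-IDF weight is zero, while "good" and "bad" keep a positive weight because each is specific to one review.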
Machine Learning & Classification
Text classification is the task of sorting a set of documents into categories from a predefined
set of categories. It assigns a label to each document and is based on supervised learning.
Classification techniques such as the Nearest Neighbor classifier, Naïve Bayesian classifier,
Decision Tree and Support Vector Machine can be used to categorize text [5].
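To make the idea of supervised text classification concrete, here is a minimal sketch of one of the listed techniques, a multinomial Naïve Bayes classifier with add-one smoothing. The class name, the training reviews and their labels are all hypothetical, and the proposal does not commit to this classifier or to this implementation.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
        self.vocabulary = {t for c in self.word_counts.values() for t in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled training reviews (already pre-processed).
train_docs = [
    ["good", "battery"], ["great", "screen"],
    ["bad", "battery"], ["poor", "camera"],
    ["average", "screen"],
]
train_labels = ["positive", "positive", "negative", "negative", "neutral"]

model = NaiveBayesClassifier().fit(train_docs, train_labels)
print(model.predict(["good", "screen"]))
print(model.predict(["bad", "camera"]))
```

Working in log space avoids numerical underflow on long reviews, and the smoothing keeps a single unseen word from forcing a class probability to zero.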
Validation & Evaluation
Validation and evaluation will then be carried out, identifying which class each
review/opinion falls into.
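One possible way to carry out this step is to compare predicted classes with the true classes in a confusion matrix, as in the work of Chrystal and Joseph [2], and to derive accuracy, precision and recall from it. The sketch below assumes the three classes named in the introduction; the function names and sample labels are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """matrix[i][j] = number of reviews of class labels[i] predicted as labels[j]."""
    pair_counts = Counter(zip(actual, predicted))
    return [[pair_counts[(a, p)] for p in labels] for a in labels]


def accuracy(actual, predicted):
    """Fraction of reviews whose predicted class equals the true class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def precision_recall(matrix, index):
    """Precision and recall for the class at position `index` in the matrix."""
    true_positive = matrix[index][index]
    predicted_total = sum(row[index] for row in matrix)  # column sum
    actual_total = sum(matrix[index])                    # row sum
    precision = true_positive / predicted_total if predicted_total else 0.0
    recall = true_positive / actual_total if actual_total else 0.0
    return precision, recall


# Hypothetical true and predicted classes for five reviews.
labels = ["positive", "negative", "neutral"]
actual = ["positive", "negative", "neutral", "positive", "negative"]
predicted = ["positive", "negative", "positive", "positive", "neutral"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy(actual, predicted))
print("positive precision/recall:", precision_recall(matrix, 0))
```

Rows correspond to true classes and columns to predicted classes, so off-diagonal entries show which classes the model confuses with each other.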
Figure 1- Block Diagram
PROJECT SCHEDULE
KEY MILESTONES
Key Milestones of the Project with dates
S. No | Elapsed time since start of the project | Milestone                                          | Deliverable
1     | 20th January 2020                       | Preparation & submission of proposal               | 24th Jan 2020
2     | 8th Feb 2020                            | Literature survey & Study/Understanding of project | 20th Feb 2020
3     | 21st Feb 2020                           | Prototype and design                               | 25th April 2020
4     | 26th April 2020                         | Experiment                                         | December 2020
GANTT CHART
Weeks 1 to 18 (Spring 2020). Activities shown in the chart:
Title Submission (2-1-2020)
Preparation & Submission of Proposal (24-1-2020)
Literature Survey & Study/Understanding Project (20-2-2017)
Prototype and Design (25-3-2020)
Experiment (till final exam)
Mid Viva / Progress
Analysis
Writing Report
Viva
The chart also marks the Midterm Exam Week, Study Weeks and Exam Weeks.
REFERENCES
[1] Lin Zhang, Kun Hua, Honggang Wang, Guanqun Qian, Li Zhang, "Sentiment Analysis on
Reviews of Mobile Users", The 11th International Conference on Mobile Systems and
Pervasive Computing, 2014. Online. Available at:
https://www.sciencedirect.com/science/article/pii/S1877050914008680
[2] Jincy B. Chrystal and Stephy Joseph, "Text Mining and Classification of Product Reviews
Using Structured Support Vector Machine", 2015. Online. Available at:
https://www.researchgate.net/publication/300665247_Text_Mining_and_Classification_of_Product_Reviews_Using_Structured_Support_Vector_Machine
[3] L. Jack and Y.D. Tsai, "Using Text Mining of Amazon Reviews to Explore User-Defined
Product Highlights and Issues", November 20, 2015. Online. Available at:
https://www.researchgate.net/publication/284188657_Using_Text_Mining_of_Amazon_Reviews_to_Explore_User-Defined_Product_Highlights_and_Issues
[4] Mochamad Wahyudi, Dinar Ajeng Kristiyanti, "Sentiment Analysis of Smartphone Product
Review Using Support Vector Machine Algorithm-Based Particle Swarm Optimization",
Journal of Theoretical and Applied Information Technology, Vol. 91 No. 1, 2016. Online.
Available at:
https://www.academia.edu/33152344/SENTIMENT_ANALYSIS_OF_SMARTPHONE_PRODUCT_REVIEW_USING_SUPPORT_VECTOR_MACHINE_ALGORITHM-BASED_PARTICLE_SWARM_OPTIMIZATION
[5] Yugandhara Bapurao Dasri, Bhagyashree Vyankatrao Barde, Nalwade Prakash Shivajirao,
Anant Madhavrao Bainwad, "Text Mining Framework, Methods and Techniques",
IOSR Journal of Computer Engineering (IOSR-JCE), Vol. 19 ver. II, 2017. Online. Available at:
https://www.semanticscholar.org/paper/Mining-Framework-%2C-Methods-and-Techniques-Dasri-Barde/22496c8251735204fcf66cceb0feedb946a68e25