Project Report Final
CHAPTER 1
INTRODUCTION
This project addresses the problem of product genuineness in online shopping. Shoppers often
struggle to tell whether a product is genuine, and to judge quality they have to browse through
all the user reviews, which is tiring. With this project we are trying to create a platform where
customers can identify the genuineness of a product from its user reviews.
To start with, let’s pin down exactly what we are trying to do. The challenge is to predict the
genuineness of a product from its existing user reviews. We give each product a rating from
1 to 10, with 1 the least genuine and 10 the most genuine.
To capture public opinion, we collected review data from Amazon and used it as our test data.
We also built a training dataset from the reviews available on Amazon. The data is
pre-processed and irrelevant characters are filtered out. The data is then clustered by the
negativity and positivity of the comments. After that, the Naive Bayes classifier is used to score
the reviews on a scale of 0 to 10 (where 5 is average), which gives the rating of an individual
product. The findings are presented as a website. To start small, only a few product categories
are covered: laptops, cameras and mobile phones.
Client (online buyers): Our client wants to check the genuineness of the products available on
E-Commerce platforms. The client buys a product without actually seeing it, so ensuring its
genuineness is a worry in itself. At present, the only way to judge the genuineness of a product
is to read the reviews left by existing users, and reading every available review is tiring. This
platform removes that exasperating work by analysing a product and its reviews and giving the
product a rating derived from the existing user reviews. With this rating it is easy to see
whether a product is genuine without having to read all the reviews. With this study, clients
can also understand which features influence the genuineness of a product. If the rating is
good, they can be more confident that they are getting a genuine product.
Sentiment analysis is a process that finds opinions and emotions in texts, comments and other
natural-language sources. The opinions and emotions are captured using natural language
processing. As the amount of data keeps growing, natural language processing is becoming
more popular.
People who shop online nowadays are more inclined to leave their opinions and criticisms
online. These opinions come from a wide range of users, including people who know the product
well technically and people who have no idea about its technicalities. From these comments,
both technical and non-technical buyers get a good idea of the product, and new buyers can
assess a product even before buying it. Positive, negative and neutral reviews are all helpful
to other buyers in some way, so analysing these comments is important and effective.
For these reasons, we decided to collect these reviews and analyse them to create a rating
system. The system takes all the reviews, processes them using the Naive Bayes algorithm, and
performs sentiment analysis on them to get the polarity, which in turn leads to the rating of a
product.
Supervised learning means training the algorithm on labelled examples so that it can predict
reasonably accurately on future datasets. In supervised learning, training datasets are given
together with the desired output. From the information it has received, the algorithm then
predicts the probability for unknown attributes.
Bayes' theorem (Bayes' rule) is used in probability theory to find the probability of an event.
It uses prior knowledge of the conditions related to an event to calculate the probability of
that event occurring. Bayes' theorem is based on conditional probability:
$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
where P(A) and P(B) are the probabilities of A and B without regard to each other, called
the marginal probabilities. P(B|A) is the probability of B occurring given that A has occurred.
Finally, the answer, P(A|B), is the conditional probability of A occurring given that B is true,
provided P(B) ≠ 0.
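As a small illustration of the rule, the sketch below computes a posterior probability for a review being positive given that it contains a particular word. The function name and all the numbers are made up for the example and are not taken from the project's data; P(B) is expanded with the law of total probability.

```python
def posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Return P(A|B) by Bayes' theorem, with P(B) expanded by total probability."""
    evidence = (likelihood_b_given_a * prior_a
                + likelihood_b_given_not_a * (1.0 - prior_a))
    if evidence == 0:
        raise ValueError("P(B) must be non-zero")
    return likelihood_b_given_a * prior_a / evidence


# Hypothetical numbers: 60% of reviews are positive, and the word "great"
# appears in 10% of positive reviews and 2% of negative reviews.
p_positive_given_great = posterior(0.6, 0.10, 0.02)
print(round(p_positive_given_great, 3))  # 0.882
```

Under these assumed numbers, seeing the word raises the probability of a positive review from 0.6 to about 0.88.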
When working on a dataset with millions of records, the Naїve Bayes approach is recommended.
The Naїve Bayes algorithm uses Bayes' theorem with strong independence assumptions. Bayes'
theorem works on conditional probability, which is the probability that something will happen
given that something else has already happened. The algorithm predicts a probability for each
class, that is, the probability that a given record or data point belongs to a particular class.
If there are m possible classes A={a1,a2,...,am} for reviews T={t1,t2,...,tn}, then using
Bayes' rule we can predict the probability that a review t belongs to a class a_i:
$$P(a_i \mid t) = \frac{P(t \mid a_i)\, P(a_i)}{P(t)}$$
Naive Bayes assumes that each term or word w_k occurs independently. If t_k is the frequency of
word w_k in the review and n_d is the number of unique words, the equation becomes
$$P(a_i \mid t) = \frac{P(a_i) \prod_{k=1}^{n_d} P(w_k \mid a_i)^{t_k}}{P(t)}$$
In Figure 1, the basic workflow of the Naive Bayes classifier is shown. For each attribute it
traverses each node and finds the probability of being in a specific class. If it finds all the
values of an attribute, it goes to the next attribute. If it does not get the values, it goes to
different nodes and checks again.
The Naive Bayes pseudocode works as follows. First it extracts the vocabulary and sets up a
counter for each attribute. For each attribute, it goes to the node and checks whether the
attribute belongs to the class. Each text is tokenized into words. The probability of each
tokenized word is then measured, each word is scored, and the score is returned.
Both the training and the testing algorithms are presented below in the form of pseudo code:
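The original pseudocode listing is not reproduced in this text. As a stand-in, the following is a minimal Python sketch of multinomial Naive Bayes training and testing, assuming reviews are already tokenized and labelled; the function names, the add-one smoothing in the likelihoods and the toy reviews are illustrative assumptions, not the project's actual PHP implementation.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_reviews):
    """Train on an iterable of (tokens, class_label) pairs."""
    class_doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in labelled_reviews:
        class_doc_counts[label] += 1          # documents per class
        word_counts[label].update(tokens)     # word frequencies per class
        vocabulary.update(tokens)             # extract the vocabulary

    total_docs = sum(class_doc_counts.values())
    log_prior = {c: math.log(n / total_docs) for c, n in class_doc_counts.items()}

    log_likelihood = {}
    for c in class_doc_counts:
        # Add-one (Laplace) smoothing so unseen words never get probability 0.
        denominator = sum(word_counts[c].values()) + len(vocabulary)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + 1) / denominator) for w in vocabulary
        }
    return log_prior, log_likelihood, vocabulary


def classify(tokens, model):
    """Return the class with the highest log-probability score for the tokens."""
    log_prior, log_likelihood, vocabulary = model
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for w in tokens:
            if w in vocabulary:               # words never seen in training are skipped
                score += log_likelihood[c][w]
        scores[c] = score
    return max(scores, key=scores.get)


# Toy training data, invented for illustration only.
model = train_naive_bayes([
    (["good", "great", "camera"], "pos"),
    (["excellent", "battery"], "pos"),
    (["bad", "poor", "screen"], "neg"),
    (["terrible", "battery"], "neg"),
])
print(classify(["great", "battery"], model))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.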
CHAPTER 2
LITERATURE SURVEY
In this chapter, we briefly review some existing Sentiment analysis programs for Social
platforms. Also, we take a look at the past scenario (without sentiment analysis). Many
Sentiment analysis platforms have been designed for one or more social platforms.
Precursors to sentiment analysis include the General Inquirer, which provided hints towards
quantifying patterns in text and, separately, psychological research that examined a person’s
psychological state based on analysis of their verbal behavior. The General Inquirer is a unique
set of procedures for identifying, in a useful and meaningful way, recurrent patterns within the
rich variety of man’s written and spoken communications. It provides a flexible common
referent for testing the hypotheses of different investigators. The system is programmed to
accept actual text, look up words and phrases in dictionaries, assign descriptors, check for
specified descriptor patterns, count occurrences, and retrieve sentences with specified
characteristics.
Subsequently, the method described in a patent by Volcani and Fogel looked especially at
sentiment and identified individual words and phrases in text with respect to different
emotional scales. A current system based on their work, called EffectCheck, presents synonyms
that can be used to increase or decrease the level of evoked emotion in each scale.
Work by Turney took a purely polar view of sentiment, from positive to negative, and applied
different methods for detecting the polarity of product reviews and movie reviews.
This work is at the document level. It is an early and influential paper presenting an
unsupervised approach to review classification. There are three basic ideas introduced:
One key idea is to score the polarity of a review based on the total polarity of the phrases in it.
A second idea is to use patterns of part of speech tags to pick out phrases that are likely to be
The first steps towards bringing together various approaches (learning, lexical,
knowledge-based, etc.) were taken at the 2004 AAAI Spring Symposium, where linguists,
computer scientists and other researchers first aligned interests and proposed shared tasks and
benchmark data sets for the systematic computational research on affect, appeal, subjectivity
and sentiment in text.
Existing approaches to sentiment analysis can be grouped into three main categories:
Knowledge-based techniques
Statistical methods and
Hybrid approaches
Statistical methods leverage elements from machine learning such as latent semantic analysis,
support vector machines, “bag of words”, “Pointwise Mutual Information” for Semantic
Orientation, and deep learning. More sophisticated methods try to detect the holder of a
sentiment (i.e., the person who maintains the affective state) and the target (i.e., the entity
about which the affect is felt). To mine the opinion in context and get the feature about which
the speaker has opined, the grammatical relationships of words are used. Grammatical
dependency relations are obtained by deep parsing of the text.
Hybrid approaches leverage both machine learning and elements from knowledge
representation such as ontologies and semantic networks in order to detect semantics that are
expressed in a subtle manner, e.g., through the analysis of concepts that do not explicitly
convey relevant information, but which are implicitly linked to other concepts that do so.
Knowledge-based techniques: Also assign arbitrary words a probable “affinity” to particular emotions. Knowledge is priority.
Statistical methods: More sophisticated methods try to detect the holder of a sentiment and the target. Statistical values are prior.
Hybrid approaches: It adds both features of Knowledge-Based and Statistical methods. Both values are considered.
Open source tools as well as a range of free and paid sentiment analysis tools deploy machine
learning, statistics, and natural language processing techniques to automate sentiment analysis
on large collections of texts, including web pages, online news, internet discussion groups,
online reviews, web blogs, and social media.
A human analysis component is required in sentiment analysis, as automated systems are not
able to analyze the historical tendencies of the individual commenter or the platform, and
comments are often classified incorrectly in their expressed sentiment.
The structure of sentiments and topics is often complex. The problem of sentiment analysis is
non-monotonic with respect to sentence extension and stop-word substitution. To address this
issue, a number of rule-based and reasoning-based approaches have been applied to sentiment
analysis, including defeasible logic programming. A number of tree traversal rules are also
applied to the syntactic parse tree to extract the topicality of sentiment in an open-domain
setting.
Sentiment analysis on E-commerce platforms is now a popular technique used by many
developers. Many sentiment analysis tools are being developed and used on many platforms.
Some of the works that are closest to ours are discussed in this section.
ReviewMeta.com is a free web tool that analyzes millions of reviews and helps you decide
which ones to trust. Simply copy and paste any Amazon product URL into the search bar on
ReviewMeta.com for a full analysis. Its Chrome extension streamlines this process by
providing an adjusted rating for each product based only on the most trustworthy
reviews, and displays it directly at the top of your browser. ReviewMeta.com is completely
independent of Amazon and Bodybuilding.com. It is not a replacement for reading
reviews, but an Amazon review checker tool that analyzes reviews and helps improve your
shopping experience. The review analysis does not guarantee whether or not fake reviews are
present; it simply shows some detailed statistics and makes an educated guess.
Simply browse Amazon as normal. When viewing a product, the extension will show the
adjusted rating and a color based on the authenticity of the reviews. To see the detailed
report, click the icon. It will help you weed out the biased or fake reviews and leave you with
the most honest feedback.
Amazon.com/.ca/.co.uk/etc: The extension simply reads the current URL to figure out which
product you are viewing so it can show the corresponding data that ReviewMeta has on that
product.
ReviewMeta.com: Simply tells the ReviewMeta website to hide the notification about
installing the extension.
With Fakespot, the user copies the product or business link from the browser's URL box,
pastes it into the Fakespot Analyzer tool and clicks Analyze Reviews. The tool analyzes the
reviews and reviewers of the product or business.
Metacritic began as a simple idea back in the summer of 1999: a single score could summarize the many
entertainment reviews available for a movie or a video game. Metacritic's three founding
members found a more constructive but less profitable use of time by launching the site in
January 2001 and Metacritic has evolved over the last decade to reflect their experience
distilling many critics' voices into the single Metascore, a weighted average of the most
respected critics writing reviews online and in print.
Metacritic's mission is to help consumers make an informed decision about how to spend their
time and money on entertainment. They believe that multiple opinions are better than one, user
voices can be as important as critics, and opinions must be scored to be easy to use.
Their Metascore system is unique and merits its own explanation page.
Creating their proprietary Metascores is a complicated process. They carefully curate a large
group of the world’s most respected critics, assign scores to their reviews, and apply a
weighted average to summarize the range of their opinions. The result is a single number that
captures the essence of critical opinion in one Metascore. Each movie, game, television show
and album featured on Metacritic gets a Metascore once at least four critics' reviews have been
collected.
Metascore is a weighted average because Metacritic assigns more importance, or weight, to
some critics and publications than others, based on their quality and overall stature. In
addition, for music and movies, the resulting scores are normalized (akin to "grading on a
curve" in college), which prevents scores from clumping together.
At the property level, the same sophisticated analytics used to create the Meta-Reviews can be
used to identify strengths and weaknesses impacting your ratings and reviews.
CHAPTER 3
SYSTEM DESIGN
3.1 USE CASE DIAGRAM
● The user enters the required values on the website. The input contains values like the
category of item (phone, laptop, camera) and the model from that category.
● This input is used to get the reviews from the database.
● A classifier model based on the Naive Bayes algorithm analyses the reviews of the
product chosen by the user and gives a polarity for each review and a rating.
● The output is then displayed to the user through the website interface.
● The output displayed to the user is a rating of the genuineness of the product.
● The state diagram describes the behavior of the system. It contains a finite number of states
that show the working of the system.
● User Interface: Here all the information is made visible to the user. It takes input from
the user and feeds it to the backend process.
● Classifier Model: The input from the user is given to the classifier. The classifier classifies
the reviews and gives a rating for the genuineness of the product.
● Print Rating: This gets the output from the classifier and passes it to the
interface to display it to users.
● Print Description: Once the user gives the input, it is passed to the database to get
the description of the selected product. This is shown on the webpage.
● The website is launched. It displays all the descriptions and reviews to the user and gets
the input from the user.
● Check whether any changes have been made to the comments.
● If no changes are made, the default values are used for the analysis.
● If a change is made, click Evaluate again to recompute the ratings.
● After evaluation, the rating is calculated automatically.
CHAPTER 4
ARCHITECTURE OF SENTIMENT ANALYSER
The basic idea of our project is to gather data from E-Commerce platforms and run a sentiment
analysis on the gathered data to obtain the polarity of each review of a product.
The polarities of the product's reviews are then averaged and converted to a scale of 10,
which is the rating of the product.
4.1 PROCESS
The processes in the proposed model are:
1. The first step is the collection of data from different E-commerce platforms.
2. The second step is the preprocessing of the gathered data to a supervised form.
3. The third step is to build a list of positive and negative words.
4. The fourth step is to collect data to be tested from Amazon comments.
5. The fifth step is to cluster different attributes of the product.
6. The sixth step is tokenizing and parts of speech tagging.
7. The seventh step is to do a sentiment analysis on the data to get the polarity.
8. The eighth step is to get the average polarity and generate a rating based on the polarity.
9. The final step is to design a front end website to present the findings.
For gathering data, we used the automation tool ParseHub to collect the comments on a
particular product from Amazon. The steps followed to extract data using ParseHub are:
{"keyword":["canon","nikon"]}
6. Open the command menu, select the advanced menu and add the Loop tool.
7. In the “for each” textbox enter a name (item) and in the “In” textbox enter the name of
the list (keyword).
8. Click on + button next to Loop and choose Begin New Entry from Advanced Option.
9. Click on + button next to Begin New entry and add Select command.
10. Click on Amazon search bar to select it and change input type to “expression”.
13. Add another Select command and click plus next to it.
16. Change the mode from select mode to browse mode for ease of searching.
18. A Select command will be automatically added. Scroll through the page to select all
products.
20. Scrape the price, reviews and description of all the products.
22. Click on plus next to Begin New Entry and choose Click command.
23. In the text box write "details" and click Create New Template.
27. Choose the Select command and click on the product description to extract it as well.
The collected data is in CSV format and needs to be pre-processed into a form suitable for
supervised learning, which means it cannot contain punctuation or additional symbols.
Pre-processing the data means removing stop words and punctuation and tokenizing the
sentences. We use a program to process the CSV file and remove unwanted characters,
numbers and spaces. We split the data wherever a full stop is found.
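The cleaning step described above can be sketched in Python as follows. This is an illustrative version under stated assumptions, not the project's own program: the function names and the `review` column name are hypothetical, and lower-casing is an added convenience that the text does not mention.

```python
import csv
import re


def preprocess_review(text):
    """Split a raw review on full stops and strip digits, punctuation and extra spaces."""
    sentences = []
    for raw_sentence in text.split("."):
        cleaned = re.sub(r"[^A-Za-z\s]", " ", raw_sentence)     # drop digits and symbols
        cleaned = re.sub(r"\s+", " ", cleaned).strip().lower()  # collapse whitespace
        if cleaned:
            sentences.append(cleaned)
    return sentences


def preprocess_csv(file_obj, column="review"):
    """Clean every row of a CSV file object; `column` is an assumed header name."""
    return [preprocess_review(row[column]) for row in csv.DictReader(file_obj)]


print(preprocess_review("Great camera. Battery lasts 2 days!"))
```

For a real export, `preprocess_csv(open("reviews.csv", newline=""))` would be the typical call, where `reviews.csv` is a placeholder file name.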
We added a list of positive and negative words alongside the word list provided by Python
NLTK. Stop words can be removed easily using NLTK, which ships with stop word lists
for many languages. To get English stop words, you can use this code:
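The original listing is not reproduced in this text. A minimal sketch of the usual NLTK call is given below, assuming the `stopwords` corpus has been installed with `nltk.download("stopwords")`. The helper names and the small fallback set are illustrative additions so that the example still runs when NLTK or its corpus is unavailable.

```python
def load_english_stop_words():
    """Return NLTK's English stop words as a set, or a tiny fallback set."""
    try:
        from nltk.corpus import stopwords
        return set(stopwords.words("english"))
    except (ImportError, LookupError):
        # Fallback for illustration only; it is NOT the full NLTK list.
        return {"a", "an", "the", "is", "are", "this", "that", "it", "and", "of", "to"}


def remove_stop_words(tokens, stop_words):
    """Keep only the tokens that are not stop words (case-insensitive)."""
    return [token for token in tokens if token.lower() not in stop_words]


stop_words = load_english_stop_words()
print(remove_stop_words(["This", "is", "a", "good", "camera"], stop_words))
```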
We used the data collected from Amazon and manually added some data from other sites.
We manually classified the training dataset into positive data and negative data. The training
data is stored as data.pos and data.neg. This data is pre-processed and stored in the right
format, which makes the training process easy.
After reading a comment, the classifier first tokenizes it into words, splitting on commas (,),
full stops (.), spaces and any other punctuation. Once the stop words are removed, parts of
speech tagging helps indicate the subjectivity of the comment better. Parts of speech tagging is
done with the pre-trained POS tagger that ships with Python NLTK.
Sentiment Analysis is the process of detecting the contextual polarity of text. In other words, it
determines whether a piece of writing is positive, negative or neutral. After preprocessing the
data, we did sentiment analysis on the datasets. For sentiment analysis we used the Naive
Bayes Classifier algorithm in PHP.
This algorithm is used for predicting the probability of words being in a particular class (neg
or pos). It was chosen because both the training and the classifying steps are easy.
Preprocessed data is given as input to train the classifier, and the resulting model is applied
to the test data to generate a positive, negative or neutral sentiment.
4.7.1 SentimentAnalyzerTest:
This is the main class of the sentiment analyzer. It contains all the functions essential to
sentiment analysis. All the methods and declarations are in a class file named
SentimentAnalyzer.class.php. This class initializes all the working functions of the analyzer
module with their parameters and specifies their return values:
_construct()
trainAnalyzer()
analyseSentence()
analyseDocument()
4.7.2 _construct():
This function declares the constructor method of the class and initializes an array
named arrBayesDifference. arrBayesDifference is an array of numbers from -1 to 1.5 in
increments of 0.1.
4.7.3 splitSentence():
This function performs a regular expression match. It checks for the pattern \w, that is, only
letters, digits and underscores, omitting white space and punctuation, in the subject $word.
The text that matches the full pattern is contained in $matches.
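A Python analogue of this word-splitting step is sketched below. It assumes the pattern is applied as `\w+` (one or more word characters) and that all matches are collected, which is how such a splitter is usually written; the project's PHP source is not shown here, so the exact pattern is an assumption.

```python
import re


def split_sentence(sentence):
    """Return the runs of letters, digits and underscores found in the sentence."""
    return re.findall(r"\w+", sentence)


print(split_sentence("Good camera, works_well 24x7!"))
# ['Good', 'camera', 'works_well', '24x7']
```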
4.7.4 insertTestData():
This function takes test data, processes it and passes it to the classifier.
It throws an exception if a sentiment type other than the predefined types is
encountered.
In this portion the test data is split into words and the occurrences of each word are counted.
If the sentiment of a word matches the type of the test data, its count is incremented by 1;
otherwise it is set to 0.
4.7.5 analyseSentence():
analyseSentence is the main part of the whole classifier. It analyses a sentence and assigns a
polarity to it.
This portion sets up the sentiment score as an array of positive and negative values. The
sentence is split into words, which are stored as words.
Laplace correction is used in this portion. To keep the sentiment value from becoming zero,
1 is added to each value before multiplying.
This function returns the sentiment polarity of a given sentence together with its polarity
values. It works only for a single sentence. It splits the sentence using the function
splitSentence, counts the words in it and finally finds the average polarity and polarity score
from the individual word values.
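To make the Laplace correction concrete, here is a short worked example. It assumes the standard add-one form, in which the vocabulary size is also added to the denominator; the text above only states that 1 is added before multiplying, so the denominator term and all the numbers are illustrative assumptions.

```latex
% Add-one (Laplace) estimate of the probability of word w in class c,
% where N_c is the total number of words seen in class c and |V| is the vocabulary size.
\[
  \hat{P}(w \mid c) = \frac{\operatorname{count}(w, c) + 1}{N_c + |V|}
\]
% Hypothetical example: the word never occurs in the positive class,
% with N_pos = 1000 and |V| = 500.
\[
  \hat{P}(w \mid \text{pos}) = \frac{0 + 1}{1000 + 500} = \frac{1}{1500} \approx 6.7 \times 10^{-4}
\]
```

Without the correction the estimate would be 0, and a single unseen word would force the product of word probabilities for the whole sentence to 0.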
4.7.6 analyseDocument():
analyseDocument serves the same purpose as analyseSentence, applied to a whole document: it
evaluates the polarity and sentiment score of the document.
The rating is calculated by averaging the polarity (positive, negative and neutral) of all the
reviews of a product. As these polarities are fractional values, we convert them to decimals,
round them down using floor and show the rating. Neutral reviews are considered to have a
polarity of 0 and do not affect the average polarity count, so they are left out to make our
system more efficient. We save the number of positive and negative comments and average
them to find the rating on a scale of 1 to 10. This is how the polarity of a single product is
calculated.
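One possible reading of this rating step is sketched below. The exact formula is not given in the text, so this version makes explicit assumptions: each review carries a label `pos`, `neg` or `neu`, neutral reviews are ignored, the rating is the floored share of positive reviews scaled to 10, and the result is clamped to a minimum of 1 to match the stated 1 to 10 scale.

```python
from collections import Counter


def product_rating(polarities):
    """Return a 1-10 rating from per-review labels, or None if none are decisive."""
    counts = Counter(polarities)
    decided = counts["pos"] + counts["neg"]    # neutral reviews are left out
    if decided == 0:
        return None
    score = (10 * counts["pos"]) // decided    # integer division acts as floor
    return max(1, score)                       # assumed lower bound of the scale


print(product_rating(["pos", "pos", "neg", "neu"]))  # 6
```

Integer arithmetic is used instead of a floating-point average so that the floor is exact.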
This initializes a connection to the database server with the given credentials. The
mysqli_connect command is used to establish the connection to the database. If the connection
is not established, a connection failed message is shown. mysqli_select_db is used to select
the database commodity_dataset.
The structure of the four tables used in this database is:
camera:
This table stores camera information such as brand, model name, model, type, camera
resolution, display, warranty, video recording, review, status and score.
filtered_input:
This table is used to filter the inputs and store them for all three categories.
laptop:
This table stores information about a laptop such as its brand, model name, screen size,
price, screen resolution, RAM, hard disk capacity, processor, graphics, battery, color, weight,
warranty, review, status and score.
phone:
This table contains information about a phone such as its brand, model name, resolution,
RAM, price, front camera, back camera, processor, review, battery capacity, status and score.
We use a website to present our analysis results to users. The website runs on a local server
provided by XAMPP. It has separate pages for the different categories, and clicking a category
on the home page redirects to its page.
We use Bootstrap for styling our webpages. Bootstrap is a free and open source CSS
framework for developing responsive websites. Free Bootstrap templates are available for easy
styling of websites. The home page interface of the project is styled with the Bootstrap
framework. It displays the different item categories as buttons, and pressing a button
redirects to the corresponding category PHP page.
For the home page we have home.php, which displays all the available categories.
Clicking on a category redirects to its respective page: camera.php for cameras,
laptop.php for laptops and phone.php for phones.
The design is done in a stepwise, procedural manner. Data is collected and then
pre-processed. Next the classifier model is constructed and trained. Using this model
the test data is classified. After this process is completed, a web interface is designed for
presenting the analysis results.
CHAPTER 5
RESULT AND ANALYSIS
The result of the sentiment analysis is presented as a website run on a local server. XAMPP is
used for this purpose. XAMPP is a free and open-source cross-platform web server solution
stack package developed by Apache Friends, consisting mainly of the Apache HTTP Server,
MariaDB database, and interpreters for scripts written in the PHP and Perl programming
languages.
To run our project, start the XAMPP Apache server and MySQL, then open Google
Chrome and type the following in the URL bar: http://localhost/Project_Dynamic_Facet/home.php
Camera:
Selecting Camera redirects to the following page:
http://localhost/Project_Dynamic%20Facet/camera.php
In the drop-down box we can select the required criterion, and in the input box next to it we
can type the value.
On clicking Submit, the list of cameras matching the specified criteria is displayed.
We can select a product, which redirects to the product page:
http://localhost/Project_Dynamic%20Facet/camera2.php?model=Canon%20EOS%20100D%20SLR
Every comment can be evaluated individually, and the overall average is given at the top,
below the description of the product.
Laptop:
Selecting Laptop redirects to the following page:
http://localhost/Project_Dynamic%20Facet/laptop.php
Since this section has many products, the Show drop-down box is used to limit the number of
entries displayed.
In the Search input box we can look for any particular brand or property of a laptop.
We can select a product, which redirects to the product page:
http://localhost/Project_Dynamic%20Facet/laptop2.php?model=Toshiba%20C50-A%20P0011%20Satellite%20Laptop
Every comment can be evaluated individually, and the overall average is given at the top,
below the description of the product.
Mobile phone:
Selecting Mobile phone redirects to the following page:
http://localhost/Project_Dynamic%20Facet/phone.php
In the drop-down box we can select the required criterion, and in the input box next to it we
can type the value.
On clicking Submit, the list of phones matching the specified criteria is displayed.
We can select a product, which redirects to the product page:
http://localhost/Project_Dynamic%20Facet/phone2.php?model=Apple%20iPhone%206
Every comment can be evaluated individually, and the overall average is given at the top,
below the description of the product.
The final results are displayed along with the description and all the reviews. When evaluating
a review that particular review’s score will be given at the top along with the overall rating and
polarity.
Sentiment analysis is done using the Naive Bayes algorithm. The accuracy of this algorithm can
be calculated as
Accuracy = (a+d)/(a+b+c+d)
where a and d are the counts of correctly classified reviews (the diagonal of the confusion
matrix) and b and c are the counts of misclassified reviews.
A confusion matrix is a table that is often used to describe the performance of a classification
model (or "classifier") on a set of test data for which the true values are known. The confusion
matrix itself is relatively simple to understand, but the related terminology can be confusing.
The accuracy of the Naive Bayes classifier used in our sentiment analysis is calculated to
determine how well the analyser works. The accuracy of the system was calculated using 600
test records.
Total Observations in Table: 600
Accuracy=(241+233)/600
Model accuracy = 0.790
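The accuracy computation can be checked with a few lines of Python. The function below is a sketch of the formula Accuracy = (a+d)/(a+b+c+d); only the diagonal counts (241 and 233) and the total (600) come from the report, so the reported figure is recomputed from those directly, and the small matrix in the second call is an invented example.

```python
def accuracy(a, b, c, d):
    """Share of correct predictions: a and d are the diagonal of the confusion matrix."""
    total = a + b + c + d
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (a + d) / total


# Figures from the report: 241 + 233 correct out of 600 test records.
reported_accuracy = (241 + 233) / 600
print(round(reported_accuracy, 3))

# Invented 2x2 matrix, only to show the four-argument form.
print(accuracy(3, 1, 1, 5))
```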
Naive Bayes has some characteristics which make it different from other algorithms. It is very
simple, easy to implement and fast. If the Naive Bayes conditional independence assumption
holds, then it will converge quicker than discriminative models like logistic regression.
Even if the Naive Bayes assumption doesn’t hold, it often works well in practice. It needs less
training data. It is highly scalable: it scales linearly with the number of predictors and data
points. It can be used for both binary and multi-class classification problems. It can also make
probabilistic predictions. It handles continuous and discrete data. It is not sensitive to
irrelevant features.
CHAPTER 6
ADVANTAGES
● Free to Use:
The project is developed as open source to address customers' problems. It is available free
of cost to customers.
CHAPTER 7
DISADVANTAGES
CHAPTER 8
APPLICATIONS
When experimenting with machine learning and big data, you may identify data sets that
contain streams of text that contain customer reviews, or social media posts where customers
(or potential customers) are talking about a product, brand or service that you offer. Mining
such data to determine how people feel about your product, brand, or service, is called
Sentiment Analysis. The applications of sentiment analysis in business are plentiful. Gaining
greater business value from sentiment analysis depends on what tool you use and how well you
use it to your advantage.
When embedded in E-commerce platforms, this application can help improve the
working of the platforms in many ways. It improves the platform’s reliability and can improve
marketing and profitability. The main aspects of using it in the platforms are:
❏ Reputation management
It can also be called brand monitoring. We all know how much a good reputation means these
days, when the majority of us check social media reviews as well as review sites before making
a purchase decision. People no longer decide to eat out without checking the reviews of a
place beforehand. The same applies to buying things online, or researching tools used
daily at work.
Negative reviews put people off, and how they are handled can define your future as a
business. You could ignore them (strongly discouraged), act rudely and make the situation even
worse, or apologise for whatever caused a person to write a negative opinion and do what is
best to make up for it.
But we have to be aware of those opinions in the first place. That’s where social media
monitoring combined with sentiment analysis comes in! While some say it’s just a fad or
something that only big businesses can use, it is believed that a social media monitoring tool
not only will help you manage your reputation, but also prevent your customers from turning to
your competitors and earn you money they could spend elsewhere.
A brand is not defined only by the product it manufactures or the services it provides. The
name and fame that build a brand depend largely on its online marketing, social campaigns,
content marketing and customer support services. Sentiment analysis in business helps to
quantify how present and potential customers perceive all these factors. Knowing where the
negative sentiment lies, you can develop more appealing branding techniques and marketing
strategies and move a brand from torpid to terrific. Sentiment analysis can help make that
transition quickly.
❏ Customer support
Social media are a channel of communication with your customers these days. Whenever customers
are unhappy about something related to a product, whether or not the product is at fault, they
will call it out on Facebook, Twitter or Instagram.
Such mentions appear in the dashboard in flashing red, and they should be answered as soon as
they appear.
People nowadays expect brands to respond on social media almost immediately. If you are not
quick enough, they may well move on to your competitors instead of waiting for your reply.
A business lives on the satisfaction of its customers. A customer's experience can be
positive, negative or neutral. In this internet-savvy era, that experience becomes the text of
social media posts and online feedback. The tone of this text can be detected and categorised
according to the sentiment it carries. This shows what is working with regard to products,
services and customer support, and what needs improvement.
Getting a positive response to your product is not always enough. The customer support system
of your company should always be impeccable no matter how phenomenal your services are.
❏ Competitor monitoring
Chances are that some of your competitors are getting bad press online. You can step in, as
long as you are aware of those negative mentions. The point is not to exploit aggressively
whatever they have neglected, but joining a conversation when a competitor does not bother to
reply to its mentions can be helpful.
This need not put the competitor in a bad light; sometimes it is perfectly fine to offer a
helping hand. It solves the problem of the person asking and also shows a proactive approach,
indicating that you have your ear to the ground in your industry. It is still difficult for
the vast majority of tools to evaluate precisely what is a negative, a neutral or a positive
statement. At present they are not advanced enough to deal reliably with sarcasm or with the
context of some discussions.
Insight-rich information removes guesswork and enables timely decisions. With sentiment data
about your established and new products, it is easier to estimate your customer retention
rate. Based on the results of sentiment analysis of reviews, you can adjust to the current
market situation and serve your customers better. Overall, automated insights let you make
decisions quickly. Business intelligence is about staying dynamic, and sentiment data gives
you that freedom. If you develop a big idea, you can test it before bringing it to life. This
is known as concept testing. Whether it is a new product, a campaign or a new logo, put it to
concept testing and analyse the sentiment attached to it.
CHAPTER 9
SUMMARY
In this project, sentiment analysis for E-commerce platforms was developed and discussed.
Sentiment analysis is an emerging technology that helps to mine opinions from large bodies of
text. The project was implemented with a Naive Bayes classifier, which applies Bayes' theorem
to conditional probabilities, assuming that words are conditionally independent given the
sentiment class. The sentiment of a word is determined by the ratio of its number of
occurrences to the total number of words, computed separately for the positive and the
negative dataset. The word is assigned the sentiment of the dataset in which this ratio is
larger.
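The word-level rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the function names word_ratios and word_sentiment, the "neutral" label for ties and unseen words, and the toy reviews are all invented for the example.

```python
from collections import Counter

def word_ratios(documents):
    """Return each word's share of all words in a list of tokenised reviews."""
    counts = Counter(word for doc in documents for word in doc)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def word_sentiment(word, pos_ratios, neg_ratios):
    """Label a word by whichever dataset gives it the larger ratio."""
    p = pos_ratios.get(word, 0.0)
    n = neg_ratios.get(word, 0.0)
    if p > n:
        return "positive"
    if n > p:
        return "negative"
    return "neutral"  # assumption: ties and unseen words are treated as neutral

# Toy data, invented for illustration only
positive_reviews = [["great", "phone", "great", "camera"], ["good", "battery"]]
negative_reviews = [["bad", "phone", "poor", "battery"], ["bad", "screen"]]

pos_ratios = word_ratios(positive_reviews)
neg_ratios = word_ratios(negative_reviews)

for w in ("great", "bad", "phone"):
    print(w, word_sentiment(w, pos_ratios, neg_ratios))
```

Words that occur equally often in both datasets, such as "phone" in the toy data, carry no signal under this rule, which is why the sketch returns a neutral label for them.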
Related work is also discussed in this report, including applications that perform similar
functions, such as ReviewMeta, Fakespot, Metacritic and TrustYou.
The working of the system is shown using different UML diagrams. The process and the code are
also discussed in this report.
The project collects data from Amazon using ParseHub. This data is preprocessed and stored in
formats that make it easy to analyse. A classifier is built from the training dataset, and
this classifier is then used to analyse the test data.
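The train-then-classify step can be sketched as follows. This is a minimal illustration rather than the project's actual code: the NaiveBayes class, the tokenize helper, the use of add-one (Laplace) smoothing and the four toy reviews are all assumptions made for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lower-case a review and keep only alphabetic tokens (assumed preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (hypothetical helper class)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word occurrences
        self.doc_counts = Counter()  # label -> number of training reviews
        self.vocab = set()

    def train(self, labelled_reviews):
        for text, label in labelled_reviews:
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)

    def log_scores(self, text):
        """Return log P(label) + sum of log P(word | label) for every label."""
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                # Add-one smoothing keeps unseen words from zeroing the product.
                score += math.log((counts[tok] + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)

# Toy training set, invented for illustration only
training_data = [
    ("Great phone, excellent camera!", "positive"),
    ("Good battery and great screen", "positive"),
    ("Bad phone, poor battery", "negative"),
    ("Terrible screen, bad camera", "negative"),
]

classifier = NaiveBayes()
classifier.train(training_data)
print(classifier.classify("great camera"))
print(classifier.classify("bad battery"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied for long reviews.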
The reviews from Amazon are displayed in a web page interface where the user can evaluate the
rating of each comment. The average of these ratings is then computed and displayed as the
total rating of the product.
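The averaging step can be sketched as below. The report does not state how a per-comment rating is derived from the classifier output, so the linear mapping from a positive-class probability to a 0 to 10 rating is an assumption, as are the function names review_rating and product_rating and the sample probabilities.

```python
def review_rating(positive_probability):
    """Map a positive-class probability in [0, 1] to a 0-10 rating (assumed linear)."""
    if not 0.0 <= positive_probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return 10.0 * positive_probability

def product_rating(positive_probabilities):
    """Average the per-review ratings; return None when there are no reviews."""
    ratings = [review_rating(p) for p in positive_probabilities]
    if not ratings:
        return None
    return round(sum(ratings) / len(ratings), 1)

# Hypothetical classifier outputs for three reviews of one product
print(product_rating([0.9, 0.5, 0.1]))
```

Returning None for a product without reviews keeps "no data" distinct from a genuinely low rating.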
CHAPTER 10
CONCLUSION
CHAPTER 11
FUTURE ENHANCEMENT
In recent years sentiment analysis has been democratised: it is now offered as a service.
Companies such as Microsoft and IBM, as well as smaller emerging companies, offer REST APIs
that integrate easily with existing software applications. For example, using the publicly
available Sentiment Analysis REST API from a small start-up called Social Opinion, we pass
the text “this phone is awesome” to the following URL:
http://api.socialopinion.co.uk/api/sentiment/?text=phone%25awesome&token=00000
REST API response after passing a text to Social Opinion in sentiment analysis
In the response, we can see the text has been identified as expressing positive emotion, with a
64% probability of that being true.
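A request URL of this kind can be assembled with Python's standard library, as in the sketch below. The endpoint and the text and token parameter names are taken from the URL quoted in this report; whether the service is still available and how it expects the text to be encoded are not verified here, and build_sentiment_url is a hypothetical helper name. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import quote, urlencode

# Endpoint as quoted in this report; its current availability is not verified.
BASE_URL = "http://api.socialopinion.co.uk/api/sentiment/"

def build_sentiment_url(text, token):
    """Build the query URL, percent-encoding the review text and the token."""
    query = urlencode({"text": text, "token": token}, quote_via=quote)
    return f"{BASE_URL}?{query}"

# "00000" is the placeholder token used in the report, not a real credential.
print(build_sentiment_url("this phone is awesome", "00000"))
```

Percent-encoding the text keeps spaces and punctuation in a review from breaking the query string.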
Sentiment analysis is more than just a social analytics tool; it is also an interesting field
of study. It is still being studied, although not at great length, because of its intricacies:
some of its tasks remain too complicated for machines. Understanding sarcasm, hyperbole, and
positive or negative feelings is difficult for machines, which lack feelings. Algorithms have
not been able to predict the feelings expressed by people with more than 60% accuracy. Yet
despite these limitations the field is growing at a great pace in many industries. Companies
want to bring sentiment analysis tools into areas such as customer feedback, marketing, CRM
and e-commerce.
REFERENCES
4. Pablo Gamallo, Marcos Garcia (2014), “Citius: A Naïve Bayes Strategy for Sentiment
6. Sayantani Ghosh, Sudipta Roy, Samir K. Bandyopadhyay (2012),
“A tutorial review on Text Mining Algorithm - Vol. 1, Issue 4,” 2278 – 1021, 11.
8. Alec Go, Richa Bhayani, Lei Huang (2016), “Twitter Sentiment Classification using
Distant Supervision”, 34632156, 6.
10. Callen Rain (2014), “Sentiment Analysis in Amazon Reviews Using Probabilistic
12. Wenyuan Dai, Gui-Rong Xue, Qiang Yang, Yong Yu (2007), “Transferring Naive