Final Project Report
On
Using Python Language for Sentiment Analysis of Restaurant
Reviews
Dept. of
Electrical and Electronic Engineering
Faculty of Engineering and Technology
Begum Rokeya University, Rangpur
Submitted by
Md. Iftik Arman Emon
ID No: 1716017
Reg.No:000010701
Session: 2017-2018
Supervised by
Iffat Ara Badhan
Lecturer, Dept. of EEE
Begum Rokeya University, Rangpur
ACKNOWLEDGEMENT
A project was included in the final-year syllabus of the Department of Electrical and
Electronic Engineering at Begum Rokeya University, Rangpur. My respected supervisor,
Iffat Ara Badhan, gave the necessary instructions to complete this project successfully,
and I am grateful to her for this. I also thank all of my friends who have helped me run
the project in various ways.
Author
……………………………
CERTIFICATE
This is to certify that Md. Iftik Arman Emon, ID number 1716017, Reg. number 000010701,
session 2017-2018, has successfully finished the project titled "Using Python Language
for Sentiment Analysis of Restaurant Reviews". The project was carried out under my
supervision and guidance in order to complete the criteria for the Bachelor of
Engineering in Electrical and Electronic Engineering degree. To the best of my knowledge
and belief, the project report contains the candidate's original work, for which he
conducted adequate investigation, and it can be noted as a unique idea.
…………………………
Iffat Ara Badhan
Lecturer, Department of Electrical and
Electronic Engineering
Begum Rokeya University, Rangpur
DECLARATION
The project report "Using Python Language for Sentiment Analysis of Restaurant Reviews" is
based on my personal work, completed throughout the course of my studies under the
supervision of Iffat Ara Badhan. I, the undersigned, solemnly declare that the claims
made and judgments reached are the results of my own study. I further certify that:
1. The work contained in the report is original and has been done by me under the
general supervision of my supervisor.
2. I have followed the guidelines provided by the university in writing the report.
3. Whenever I have used materials from other sources, I have given due credit to them
in the text of the report and provided their details in the references.
………………………………..
Md. Iftik Arman Emon
ID no:1716017
Reg no: 000010701
Session: 2017-2018
Department of Electrical and Electronic Engineering
Begum Rokeya University, Rangpur
LIST OF CONTENTS
ABSTRACT
CHAPTER 1. INTRODUCTION
1.1. Introduction
1.2. Related Work
1.3. Objective of Project
3.1. Flask==1.1.1
3.2. gunicorn==19.9.0
3.3. itsdangerous==1.1.0
3.4. Jinja2==2.10.1
3.5. MarkupSafe==1.1.1
3.6. Werkzeug==0.15.5
3.7. numpy>=1.9.2
3.8. scipy>=0.15.1
3.9. scikit-learn>=0.18
3.10. matplotlib>=1.4.3
3.11. pandas>=0.19
4.1. SVM
4.2. NAIVE BAYES
4.3. LOGISTIC REGRESSION
4.4. LSTM
4.5. BERT
CHAPTER 5. METHODOLOGY
5.1. Methodology
5.2. Classification
Multinomial Naïve Bayes
Random Forest
Decision Tree
Support Vector Machine
5.3. Achievement Rating Assessment
5.4. Predicting a Class
CHAPTER 7. REFERENCES
Abstract
In the last ten years, the Internet's development has generated vast amounts of data
across all industries. These innovations have given people new avenues for expressing their
ideas on anything through tweets, blog entries, online forums, status updates, etc. Sentiment
analysis is the technique of computationally identifying and classifying opinions stated in a
text, particularly to ascertain whether the writer has a positive, negative, or neutral attitude
towards a given topic. Any firm should be very interested in client feedback. Therefore, in
this paper, we use Python-based classification systems to analyze customer reviews of
restaurants. This study's major topics are the use of several classification algorithms and an
evaluation of their effectiveness. According to the simulation findings, the highest accuracy
is achieved by SGD at 69.23%.
Keywords: LR (Logistic Regression model), DT (Decision Tree model), RF (Random Forest
model), MNB (Multinomial Naïve Bayes model), KNN (K-Nearest Neighbors model), Linear
SVM (Linear Support Vector Machine model), SGD (Stochastic Gradient Descent model)
Introduction
The exponential rise in Internet usage has sparked an enormous amount of online activity,
including blog posts, video calls, conferences, monitoring, and other e-commerce and online
transactions. This makes it necessary to quickly collect, convert, load, and analyze vast
amounts of diverse, unstructured data [1]. Numerous discussion boards,
blogs, social networks, e-commerce websites, news articles, and other online resources
provide a place for opinion expression that can be used to gauge the opinions of the general
population. Sentiment analysis aids in identifying, extracting, and categorizing ideas,
sentiments, and attitudes conveyed in textual input on many issues [2]. Additionally, it aids in
reaching objectives such as tracking public opinion on political movements, gauging
customer happiness, forecasting movie sales, and ascertaining critics' viewpoints. To extract
key information about a certain product (including variables such a digital camera, a
computer, books, or films), sentiment analysis can be used to categorize online evaluations of
merchandise from retailers like eBay and Flipkart. The method of sentiment analysis is
frequently used to track how the public's views on a political candidate are evolving by
looking at online discussion boards [3]. Since it may be utilized for research into trends or
consumer preferences, monitoring the mood of bloggers is likewise becoming a highly
sought-after research area. Sentiment analysis is turning out to be absolutely essential in the
area of opinion spam. Opinion spam describes criminal practices that aim to deceive readers,
such as writing fraudulent reviews (also known as shilling). It might be seen as an automated
sentiment analysis system giving some target entities unwarranted good assessments in an
effort to advance the entities. It can also mean giving a falsely adverse review of another
organization in an effort to harm its reputation. Studying the reviews and looking at the
sentiment scores is the major objective of sentiment analysis. People primarily rely on user-
generated content while making decisions. Before making a purchase, the user can determine
via sentiment analysis whether the product's information is satisfactory or not. Companies
and advertising agencies use this analysis data to find out more about their products or
services so they can more effectively satisfy client wants. Sentiment analysis is typically
performed at several levels, ranging from coarse to fine. A document's overall sentiment is
determined by coarse-level analysis, while attribute-level sentiment analysis is the
focus of fine-level analysis [4]. Sentence-level sentiment evaluation lies in between
these two.
To identify keywords expressing opinions, natural language processing techniques are
employed [7]. In contrast, supervised machine learning methods use a dataset that is
initially labeled by a human to learn whether a review is favorable, unfavorable, or
neither [6]. In the lexicon-based technique, the polarity is determined by matching
opinion terms from a sentiment lexicon with the data. Scores are then assigned to the
opinion words according to the favorable or unfavorable connotations of the dictionary
terms [2]. The present research examines patron perceptions of a restaurant's service.
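The lexicon-based matching described above can be sketched in a few lines of Python. The tiny lexicon below is purely illustrative (invented for this sketch), not a real sentiment dictionary:

```python
# Minimal lexicon-based polarity scoring (illustrative lexicon, not a real one)
LEXICON = {"good": 1, "great": 1, "tasty": 1, "bad": -1, "terrible": -1, "slow": -1}

def lexicon_score(review: str) -> str:
    """Sum the polarities of known opinion words and map the total to a label."""
    score = sum(LEXICON.get(word, 0) for word in review.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(lexicon_score("The food was tasty and the service was great"))  # positive
print(lexicon_score("Terrible food and slow service"))                # negative
```

Real systems use curated lexicons with graded scores rather than a hand-written dictionary.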
[Figure: workflow diagram showing segregation of the training and testing data, classification using the training data set, performance analysis of the classification using test data, and prediction of the class of a new set of reviews using the best classifier]
Working principle:
The natural language processing (NLP) approach of sentiment analysis, commonly referred to
as opinion mining, is used to ascertain the sentiment or emotional tone of a document.
Sentiment analysis can be used to examine customer input in the context of restaurant
reviews in order to ascertain if the sentiment indicated in the review is positive, negative, or
neutral.
Here is a summary of how Python and machine learning are used to perform sentiment
analysis on restaurant reviews:
Data Collection:
The dataset for this study was created from comments made about various restaurants.
The data was prepared from foodpanda and other restaurant sites in the Rangpur division.
The dataset contains 600 reviews in seven columns: the first column contains the SL No,
the second the Restaurant Name, the third the Reviewer, the fourth the Location Name,
the fifth the Cuisine, the sixth the Rating, and the seventh the Sentiment. The reviews
are classified into two categories, positive and negative, based on ratings in the range
0-5: positive reviews have ratings of 3-5 out of 5 and negative reviews have ratings of
1-2. In total the dataset contains 348 positive reviews and 252 negative reviews. After
cleaning, 214 short reviews are removed, leaving 386 reviews, of which 190 are positive
and 196 are negative.
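The rating-to-label rule described above (ratings of 3-5 are positive, 1-2 negative) can be written directly in Python; `label_from_rating` is a hypothetical helper name used only for this sketch:

```python
# Map a 1-5 rating to a sentiment label, following the dataset's rule:
# ratings of 3-5 are labeled positive, ratings of 1-2 negative.
def label_from_rating(rating: int) -> str:
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    return "positive" if rating >= 3 else "negative"

print(label_from_rating(4))  # positive
print(label_from_rating(2))  # negative
```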
Prepping of Information
The dataset is in Excel format, and we have used the restaurant reviews in this Excel
file to train the model. Since all the algorithms we are working with are supervised
learning, we classified the dataset beforehand to train our algorithms. We import the
dataset using pandas in Python.
Preprocessing is the most important phase in determining a text's sentiment. In our
approach, the preprocessing is broken down into three basic phases. The first stage is to
remove the punctuation from the sentences. Special characters such as exclamation marks,
quotes, etc. are eliminated using a suitable regular-expression pattern. The resulting
data consists only of alphabetical characters.
The second step is to get rid of the stop-words. Stop-words are words in the English
language that do not carry emotion or sentiment but are used as links or articles.
Examples of stop-words include "and", "with", "of", and "the". Stop-words are found and
removed from the dataset using NLP techniques such as lexical analysis, syntactic
analysis, semantic analysis, discourse integration, and pragmatic analysis. The semantic
analysis step would usually delete negation words like "not". However, when it comes to
opinion mining, the word "not" matters. For instance, consider the review "Crust is not
good". By removing the stop-word "not", this sentence becomes "crust good", and a
negative opinion becomes a positive opinion. To prevent this from happening, we have
changed the semantic evaluation stage in the NLP pipeline to ensure that these
stop-words are not eliminated. The third step is to calculate the sentiment of all the
data imported into the Excel sheet. The Python libraries used for this work are given
below:
Flask==1.1.1
gunicorn==19.9.0
itsdangerous==1.1.0
Jinja2==2.10.1
MarkupSafe==1.1.1
Werkzeug==0.15.5
numpy>=1.9.2
scipy>=0.15.1
scikit-learn>=0.18
matplotlib>=1.4.3
pandas>=0.19
Flask 1.1.1: A well-liked Python web framework called Flask makes it simple and
requires little boilerplate code to create online applications. It is a simple and
adaptable framework that adheres to the WSGI (Web Server Gateway Interface)
standard and is frequently utilized to develop RESTful APIs and web services.
Pip, the Python package manager, can be used to install Flask. Run the following
command after opening your command-line interface:
pip install Flask==1.1.1
Gunicorn 19.9.0: A well-liked WSGI (Web Server Gateway Interface) HTTP server
called Gunicorn (Green Unicorn) is frequently used to deliver Python web
applications. Flask, Django, Pyramid, and other web frameworks are just a few of the
ones that it is made to operate well with. Gunicorn is a good option for hosting
production-ready web apps because of its simplicity, performance, and scalability.
The Python package manager, pip, is used to install Gunicorn. Run the following
command after opening your command-line interface:
pip install gunicorn==19.9.0
pandas>=0.19: A potent Python package for data analysis and manipulation is called
pandas. For handling and analyzing structured data, it offers data structures like Series
(1-dimensional labelled arrays) and Data Frame (2-dimensional labelled data tables).
Pandas is frequently used for data preprocessing, exploration, and cleaning activities
in data science, machine learning, finance, and numerous other fields.
Use pip, the Python package manager, to install pandas with a version equal to or
higher than 0.19. Run the following command after opening your command-line
interface:
pip install "pandas>=0.19"
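For illustration, a few rows shaped like the seven-column dataset described in the Data Collection section can be built and inspected with pandas; every value below is invented:

```python
import pandas as pd

# Toy rows mimicking the report's dataset layout (all values invented)
df = pd.DataFrame({
    "SL No": [1, 2, 3],
    "Restaurant Name": ["A", "B", "C"],
    "Reviewer": ["r1", "r2", "r3"],
    "Location Name": ["Rangpur", "Rangpur", "Rangpur"],
    "Cuisine": ["Bengali", "Fast food", "Chinese"],
    "Rating": [5, 2, 4],
    "Sentiment": ["positive", "negative", "positive"],
})

# Count reviews per sentiment class
counts = df["Sentiment"].value_counts()
print(counts["positive"], counts["negative"])  # 2 1
```

A real run would instead read the Excel file, e.g. with `pd.read_excel`.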
The preprocessing stages are a crucial step in obtaining clear and unambiguous information
so that the class of a review can be more precisely predicted later.
II. Create a function to handle case folding and any optional further text
preprocessing. After removing any non-alphanumeric characters with a
regular expression, the text is converted to lowercase.
import re

def preprocess_text(text):
    # Remove non-alphanumeric characters and replace with spaces
    processed_text = re.sub(r'[^a-zA-Z0-9\s]', ' ', text)
    # Convert to lowercase
    processed_text = processed_text.lower()
    return processed_text
Output
this is an example sentence with mixed cases
Symbol Removal: the stage in which punctuation (period (.), comma (,), question
mark (?), exclamation point (!)), explicit characters (&, %, $, #, @ and other
symbols), and numbers (0, 1, 2, ... 9) are removed.
Example:
Input Function
import re

# Example text with symbols
text_with_symbols = "Hello, this is an example text with some !@#$%^&*()_+ symbols."
# Strip everything except letters, digits, and whitespace
cleaned = re.sub(r'[^a-zA-Z0-9\s]', '', text_with_symbols)
# Collapse the runs of whitespace left behind
cleaned = re.sub(r'\s+', ' ', cleaned).strip()
print(cleaned)
Output
Hello this is an example text with some symbols
Python has a number of modules that offer stop-word lists for many languages.
The Natural Language Toolkit (NLTK) is one of the most widely used libraries for
NLP activities. If you haven't previously, install the NLTK library before using
stopwords in Python.
Install the NLTK library using:
pip install nltk
Download the stop-words data for the English language after installing NLTK:
import nltk
nltk.download('stopwords')
Example:
Input
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

def remove_stopwords(sentence):
    stop_words = set(stopwords.words('english'))
    word_tokens = word_tokenize(sentence)
    filtered_sentence = [word for word in word_tokens if word.lower() not in stop_words]
    return ' '.join(filtered_sentence)

print(remove_stopwords("This is an example sentence with common stopwords."))
Output
example sentence common stopwords .
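As discussed in the preprocessing section, negation words such as "not" must survive stop-word removal so that "Crust is not good" does not collapse into "crust good". A minimal sketch, using a small illustrative stop-word list instead of NLTK's full list:

```python
# Small illustrative stop-word list; negation words are deliberately kept out
STOP_WORDS = {"is", "the", "a", "an", "and", "of", "with"}
NEGATIONS = {"not", "no", "never"}

def remove_stopwords_keep_negation(sentence: str) -> str:
    """Drop stop-words but always keep negation words."""
    kept = [w for w in sentence.lower().split()
            if w in NEGATIONS or w not in STOP_WORDS]
    return " ".join(kept)

print(remove_stopwords_keep_negation("Crust is not good"))  # crust not good
```

With NLTK, the same effect can be achieved by subtracting the negation set from `stopwords.words('english')` before filtering.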
Data cleaning: In the pipeline of data preprocessing, data cleansing is a vital stage.
The process of identifying and correcting defects, contradictions, and errors in a data
set is necessary to enhance its quality and get it ready for additional research.
I. Treatment of Missing Values:
Datasets frequently have missing values, which can cause issues during
analysis. Using pandas, a well-liked Python data manipulation toolkit, you
can deal with missing values.
Example:
import pandas as pd

# Drop rows that contain missing values (fillna() can impute them instead)
df = df.dropna()
II. Duplicate Removal: Repeated rows can bias the analysis and should be dropped.
# Remove duplicates
df = df.drop_duplicates()
III. Data Type Conversion: For analysis, make sure columns have the
appropriate data types. You can change the data type using pandas.
# Convert a column to a numeric type
df['column_name'] = pd.to_numeric(df['column_name'])
IV. Outlier Removal: Rows identified as outliers can be filtered out.
# Remove rows whose values appear in a precomputed list of outliers
df = df[~df['column_name'].isin(outliers)]
V. Text Cleaning: You can lowercase text, remove punctuation, and strip
stop-words using various methods.
import re
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

def clean_text(text):
    text = text.lower()
    text = re.sub(r'[^\w\s]', '', text)
    stop_words = set(stopwords.words('english'))
    words = word_tokenize(text)
    cleaned_words = [word for word in words if word not in stop_words]
    return ' '.join(cleaned_words)

df['text_column'] = df['text_column'].apply(clean_text)
VI. Feature Scaling: You may want to use feature scaling to bring numerical
features that are on various scales into a similar range.
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
df[['numerical_column1', 'numerical_column2']] = scaler.fit_transform(
    df[['numerical_column1', 'numerical_column2']])
SVM Algorithm
One effective machine learning approach for sentiment analysis is called Support Vector
Machines (SVM). Finding the sentiment or attitude communicated in a text (such as whether
it is favourable, negative, or neutral) is the aim of sentiment analysis. Based on the
features extracted from the text, SVM can be used to categorize text data into various
sentiment groups.
The following actions to perform sentiment analysis in Python using SVM:
1. Data preprocessing: Cleanse and preprocess the text data to prepare the dataset. This
includes operations like erasing punctuation, changing the text's case to lowercase,
and eliminating stop words.
2. Feature Extraction: Convert the preprocessed text data into numerical features that
SVM can use. Term Frequency-Inverse Document Frequency (TF-IDF) representation is one
such technique.
3. Training the SVM Model: Split your dataset into a training set and a testing set
before starting to train the SVM model. The SVM model is then trained on the training
set using the extracted features.
4. Evaluating the Model: Using the testing set, assess the trained SVM model's
performance.
Here is a sample Python program that uses the scikit-learn module and the TF-IDF
representation:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

data = pd.DataFrame({
    'text': ['I love this product!', 'This is terrible.', 'It is okay.'],
    'sentiment': ['positive', 'negative', 'neutral']
})
# Data preprocessing (optional, you can add more steps based on your needs)
data['text'] = data['text'].str.lower()
# Train/test split
X_train, X_test, y_train, y_test = train_test_split(
    data['text'], data['sentiment'], test_size=0.33, random_state=42)
# TF-IDF vectorization
tfidf_vectorizer = TfidfVectorizer()
X_train_tfidf = tfidf_vectorizer.fit_transform(X_train)
X_test_tfidf = tfidf_vectorizer.transform(X_test)
# Train the SVM classifier on the TF-IDF features
svm_clf = LinearSVC()
svm_clf.fit(X_train_tfidf, y_train)
Naïve Bayes
Another well-liked machine learning algorithm frequently used for sentiment analysis
tasks is Naive Bayes. This probabilistic technique, based on Bayes' theorem, is effective
with text data and high-dimensional feature spaces.
Similar to the SVM method, you can use Naive Bayes to perform sentiment analysis
in Python.
1. Data Pre-Processing: Similar to the last example, prepare the dataset by
cleaning and preparing the text data.
2. Feature Extraction: Create numerical characteristics from the text data that
has been preprocessed. Similar to the SVM method, we can utilise the Bag-of-
Words or TF-IDF representation for Naive Bayes.
3. Training the Naïve Bayes Model: Train the Naive Bayes model using the
features that were retrieved after dividing the dataset into a training set and a
testing set.
4. Evaluating the Model: Utilizing the testing set, assess the trained Naive
Bayes model's performance.
Here is an example Python program that uses the TF-IDF representation and the
scikit-learn library:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Data preprocessing (optional, you can add more steps based on your needs)
data['text'] = data['text'].str.lower()
# TF-IDF vectorization
tfidf_vectorizer = TfidfVectorizer()
X_train_tfidf = tfidf_vectorizer.fit_transform(X_train)
X_test_tfidf = tfidf_vectorizer.transform(X_test)
# Train the Naive Bayes classifier on the TF-IDF features
nb_clf = MultinomialNB()
nb_clf.fit(X_train_tfidf, y_train)
Logistic Regression
Logistic regression is a linear model that estimates class probabilities; it can be
trained on the same TF-IDF features as the previous classifiers.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Data preprocessing (optional, you can add more steps based on your needs)
data['text'] = data['text'].str.lower()
# TF-IDF vectorization
tfidf_vectorizer = TfidfVectorizer()
X_train_tfidf = tfidf_vectorizer.fit_transform(X_train)
X_test_tfidf = tfidf_vectorizer.transform(X_test)
# Train the logistic regression classifier
lr_clf = LogisticRegression()
lr_clf.fit(X_train_tfidf, y_train)
Although we've just used a tiny sample dataset in this example, you should use a larger
dataset to improve the performance of your model. To increase the model's accuracy, you
might also wish to experiment with other preprocessing methods and hyperparameter
tuning.
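Hyperparameter tuning of the kind mentioned above can be sketched with scikit-learn's GridSearchCV. The training texts and the parameter grid here are invented toy values, not the settings used in this report:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy training data; a real run would use the full review dataset
texts = ["great food", "awful service", "loved it",
         "bad taste", "really good", "not good at all"]
labels = [1, 0, 1, 0, 1, 0]

pipeline = Pipeline([("tfidf", TfidfVectorizer()),
                     ("clf", LogisticRegression())])
# Search over a couple of regularization strengths (illustrative grid)
grid = GridSearchCV(pipeline, {"clf__C": [0.1, 1.0, 10.0]}, cv=2)
grid.fit(texts, labels)
print(grid.best_params_)
```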
LSTM
It is effective to use Long Short-Term Memory (LSTM) networks for sentiment analysis,
particularly when working with textual information that is sequential. LSTMs are a
variety of recurrent neural network (RNN) that is particularly good at capturing
long-term dependencies in sequences.
I'll show you how to use the Keras library, a high-level neural networks API built on top of
TensorFlow, to do sentiment analysis using LSTM in Python in this example.
Example:
# Importing necessary libraries
import pandas as pd
from keras.models import Sequential
from keras.layers import LSTM, Dense, Embedding
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
data = pd.DataFrame({
'text': ['I love this product!', 'This is terrible.', 'It is okay.'],
'sentiment': ['positive', 'negative', 'neutral']
})
# Data preprocessing (optional, you can add more steps based on your needs)
data['text'] = data['text'].str.lower()
# Tokenization
tokenizer = Tokenizer()
tokenizer.fit_on_texts(data['text'])
vocab_size = len(tokenizer.word_index) + 1
X = tokenizer.texts_to_sequences(data['text'])
X = pad_sequences(X)
max_length = X.shape[1]
embedding_dim = 50  # size of the learned word-embedding vectors

# Build the LSTM model
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim,
                    input_length=max_length))
model.add(LSTM(128))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
Although we've just used a tiny sample dataset in this example, you should utilise a larger
dataset to improve the performance of your model. The text data is transformed into
numerical sequences using the tokenizer, and all of the sequences are made to be the same
length using pad_sequences before being fed into the LSTM.
BERT
import torch
from torch.nn.functional import softmax

# `model` and `inputs` are assumed to be a pretrained BERT sequence-classification
# model and its tokenized batch of reviews (e.g. from the Hugging Face
# transformers library), prepared beforehand.
# Make predictions
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    probabilities = softmax(logits, dim=1)
    predicted_labels = torch.argmax(probabilities, dim=1)
We utilized a tiny sample dataset for this example; for better model performance, you
should use your own larger dataset. Running BERT models on a CPU can be slow because
they require a lot of processing, so if a GPU is available, consider using it for
improved performance.
Model Training: The preprocessed and feature-extracted data are then used
to train the chosen model. A training set and a testing set are created from the
dataset in order to assess the effectiveness of the model.
Model Evaluation: The model is assessed using the testing set once it has
been trained to determine its accuracy and other performance measures
including precision, recall, and F1-score. How well the model can predict
sentiment on unobserved data is determined by the evaluation.
Sentiment Prediction: Once trained and assessed, the model can be used to
predict the tone of fresh restaurant reviews. The model outputs a
sentiment label (such as positive, negative, or neutral) from input text data
that has been preprocessed and feature-extracted.
Deployment: Finally, the trained sentiment analysis model can be used to
analyse sentiment in real time. It may be incorporated into restaurant
management systems to track patron comments and offer invaluable insights
for enhancing the patron experience.
METHODOLOGY
Data collection and preprocessing, model training, and model evaluation are all steps in
the approach for using the Python language for sentiment analysis of restaurant reviews.
The steps for performing sentiment analysis on restaurant reviews are listed below:
Data Collection
Data Processing
Data Labeling
Data Splitting
Feature Extraction
Model Training
Model Evaluation
Hyperparameter Tuning
Sentiment Prediction
Deployment
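The pipeline steps above can be chained with scikit-learn's Pipeline class. This sketch uses TF-IDF features with the SGD classifier (the model the abstract reports as most accurate); the four labeled reviews are invented stand-ins for the real dataset:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Toy labeled reviews standing in for the real dataset
reviews = ["the food was great", "terrible service",
           "really tasty dishes", "bad food and slow service"]
labels = ["positive", "negative", "positive", "negative"]

# TF-IDF feature extraction followed by an SGD classifier
model = Pipeline([("tfidf", TfidfVectorizer()),
                  ("clf", SGDClassifier(random_state=0))])
model.fit(reviews, labels)
print(model.predict(["the dishes were great"]))
```

In a real deployment the fitted pipeline would be trained on the full review dataset and called on each incoming review.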
Classification
After dividing the dataset into its component parts, we teach the algorithm how to
classify the data by feeding it training data. Numerous classification techniques have
been used, including Naive Bayes, decision trees, random forests, and the support vector
machine (SVM) classifier. The conditional probability model is the foundation of the
Naive Bayes method. The Naive Bayes classifier assumes feature independence and gives
the probability when the data instance to be classified is expressed as a vector
x = (x1, ..., xn) of n distinct features.
p(Ck | x1, ..., xn) ∝ p(Ck) ∏i p(xi | Ck)
Here, Ck represents the k-th class label.
The decision tree classifier uses a number of conditions and questions to create a tree
structure in which the leaf nodes correspond to the necessary classifications. Entropy is
calculated in order to choose between the tree's roots.
H = −∑ p(x) log p(x)
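The entropy formula can be evaluated numerically; for an even two-class split p = (0.5, 0.5) the base-2 entropy is exactly 1 bit, while a pure node has zero entropy:

```python
import math

def entropy(probabilities):
    """H = -sum(p * log2(p)) over the non-zero class probabilities."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))  # 1.0
```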
In order to categorize the provided data, the SVM classifier creates a hyperplane
between the set of points x as
w · x − b = 0
Here, w is the normal vector to the hyperplane.
Each classification algorithm has advantages and disadvantages, and the nature of the
dataset affects how well it performs. For each method, the ratio of training data to
test data is varied and tested to raise efficiency.
Multinomial Naïve Bayes: This is the most popular classification technique in the text
mining industry. Natural Language Processing (NLP) uses it frequently because of its
excellent performance. The algorithm is based on Bayes' theorem. Given a text instance
N to be classified and a class M from the set of potential outcomes, Bayes' theorem
calculates the probability P(M|N).
The formula is given below:
P(M|N) = P(M) * P(N|M)/P(N)
Where,
P(N) = prior probability of N
P(M) = prior probability of class M
P(N|M) = probability of predictor N given class M
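Plugging toy numbers into the formula makes it concrete; the probabilities below are invented purely for illustration:

```python
# Bayes' theorem with invented toy values:
# P(M|N) = P(M) * P(N|M) / P(N)
p_m = 0.6          # prior probability of class M
p_n_given_m = 0.2  # probability of predictor N given class M
p_n = 0.3          # prior probability of N

p_m_given_n = p_m * p_n_given_m / p_n
print(p_m_given_n)
```

which gives P(M|N) ≈ 0.4.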
Random Forest: Random Forest is a classifier that uses a number of decision trees built
on various subsets of a dataset to increase the predictive accuracy on the dataset. A
random forest is made up of an assortment of decision trees, each starting from a
different initial split. Rather than selecting the most advantageous split from the full
list of features, the algorithm selects a random subset of the variables.
Decision Tree: Decision trees, a kind of classification method, are part of the
supervised learning technique. A decision tree uses both internal and leaf nodes to make
decisions. The objective of a decision tree is to characterise an item by creating a set
of true/false statements, choosing splits that reduce entropy.
The false acceptance rate (FAR), false rejection rate (FRR), and accuracy of a
classifier are computed from the confusion-matrix counts as:
FAR = FP / (FP + TN)
FRR = FN / (FN + TP)
Accuracy = (TP + TN) / (TP + TN + FP + FN)
We can calculate the precision, recall, and F1-score of the performance evaluation for
such algorithms using the following equations.
a) Precision: Precision refers to the positive predictive value, i.e. the fraction of
predicted positives that are truly positive:
Precision = TP / (TP + FP)
b) Recall: Recall is the fraction of actual positives that are correctly identified:
Recall = TP / (TP + FN)
c) F1-score: The F1-score is the harmonic mean of precision and recall:
F1 = 2 × Precision × Recall / (Precision + Recall)
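These metric formulas can be checked against a small confusion-matrix example; the counts below are invented:

```python
# Invented confusion-matrix counts
TP, TN, FP, FN = 40, 35, 10, 15

accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision)  # 0.75 0.8
```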
Predicting a Class
The chosen algorithm can be utilized to predict the class of a fresh dataset when it is
received. The machine can generate the most appropriate class because it has already
learned the characteristics of the dataset. Because we used restaurant reviews, when a
new client submits a review it is added to our dataset and fed to the algorithm, which
decides whether the evaluation of the restaurant is favorable or unfavorable.
Dataset Summary (most frequent words):
Word      Count
was       123
good      85
food      49
very      33
but       32
I         27
so        26
to        22
quality   21
not       21
Performance for unigram features:
Classifier   Accuracy   Precision   Recall   F1-score
LR           76.92      76.84       76.98    76.86
[Figure: ROC curve analysis for unigram features]
Performance for bigram features:
Classifier   Accuracy   Precision   Recall   F1-score
LR           73.08      74.24       73.81    73.04
[Figure: ROC curve analysis for bigram features]
Performance for trigram features:
Classifier   Accuracy   Precision   Recall   F1-score
LR           78.21      79.49       78.97    78.17
[Figure: ROC curve analysis for trigram features]
Conclusion
For Bigram:
Highest Accuracy achieved by LR at 73.08%
Highest F1-Score achieved by LR at 73.04%
Highest Precision Score achieved by LR at 74.24%
Highest Recall Score achieved by LR at 73.81%
For Trigram:
Highest Accuracy achieved by LR at 78.21%
Highest F1-Score achieved by LR at 78.17%
Highest Precision Score achieved by LR at 79.49%
Highest Recall Score achieved by LR at 78.97%
References
[1] K. Ravi and V. Ravi, “A survey on opinion mining and sentiment analysis: Tasks,
approaches and applications,” Knowledge-Based Syst., vol. 89, pp. 14–46, Nov. 2015,
doi: 10.1016/j.knosys.2015.06.015.
[2] V. A. and S. S. Sonawane, “Sentiment Analysis of Twitter Data: A Survey of
Techniques,” Int. J. Comput. Appl., vol. 139, no. 11, pp. 5–15, Apr. 2016, doi:
10.5120/ijca2016908625.
[3] S. Schrauwen, “Machine Learning Approaches To Sentiment Analysis Using the
Dutch Netlog Corpus,” 2010.
[4] M. S. Neethu and R. Rajasree, “Sentiment analysis in twitter using machine learning
techniques,” in 2013 Fourth International Conference on Computing,
Communications and Networking Technologies (ICCCNT), IEEE, Jul. 2013, pp. 1–5.
doi: 10.1109/ICCCNT.2013.6726818.
[5] A. P. Jain and P. Dandannavar, “Application of machine learning techniques to
sentiment analysis,” in 2016 2nd International Conference on Applied and Theoretical
Computing and Communication Technology (iCATccT), IEEE, 2016, pp. 628–632.
doi: 10.1109/ICATCCT.2016.7912076.
[6] G. Gautam and D. Yadav, “Sentiment analysis of twitter data using machine learning
approaches and semantic analysis,” in 2014 Seventh International Conference on
Contemporary Computing (IC3), IEEE, Aug. 2014, pp. 437–442. doi:
10.1109/IC3.2014.6897213.
[7] W. Medhat, A. Hassan, and H. Korashy, “Sentiment analysis algorithms and
applications: A survey,” Ain Shams Eng. J., vol. 5, no. 4, pp. 1093–1113, Dec. 2014,
doi: 10.1016/j.asej.2014.04.011.
[8] R. Liu, R. Xiong, and L. Song, “A sentiment classification method for Chinese
document,” in 2010 5th International Conference on Computer Science & Education,
IEEE, Aug. 2010, pp. 918–922. doi: 10.1109/ICCSE.2010.5593462.
[9] L. Ramachandran and E. F. Gehringer, “Automated Assessment of Review Quality
Using Latent Semantic Analysis,” in 2011 IEEE 11th International Conference on
Advanced Learning Technologies, IEEE, Jul. 2011, pp. 136–138. doi:
10.1109/ICALT.2011.46.
[10] B. Pang and L. Lee, “Opinion Mining and Sentiment Analysis,” Found. Trends Inf.
Retr., vol. 2, no. 1–2, pp. 1–135, 2008, doi: 10.1561/1500000011.