Major Project Presentation
on
“Sentiment Analysis of Amazon Product Reviews using Natural Language
Processing”
Presented by:
Shuhel Shabana Ferdhousy (20RISTCS012)
Suhail Akhtar Mazarbhuiya (20RISTCS014)
Tanbeer Ahmed Laskar (20RISTCS015)
Trishna Das (20RISTCS017)
• In product reviews, sentiment is a key indicator of consumer satisfaction, product
efficacy, and brand perception.
• Moreover, sentiment analysis plays a crucial role in enhancing the customer experience.
By identifying areas of dissatisfaction or concern, businesses can proactively address
customer issues, improving overall satisfaction and loyalty.
LITERATURE SURVEY
[1] Khan, M. and Srivastava, A., 2024. “A Review on Sentiment Analysis of Twitter Data
Using Machine Learning Techniques”. International Journal of Engineering and
Management Research, 14(1), pp.186-195.
• The paper explores sentiment analysis of Twitter data using machine learning techniques
to extract valuable insights from user-generated content for organizations and
governments.
• The datasets used in the studies included evaluations of public perception, sentiment in
Spanish tweets, sentiment categorization using CNN and LSTM, text semantics analysis
in Hindi and Kannada, and aspect-based sentiment analysis of Indonesian cinema
reviews.
• The paper reports an accuracy of 95.5%.
LITERATURE SURVEY (cont…)
[2] Mahalakshmi, V., et al. "Twitter sentiment analysis using conditional generative
adversarial network." International Journal of Cognitive Computing in Engineering 5
(2024): 161-169
• The paper introduces a novel approach using Conditional Generative Adversarial
Network (CGAN) for Twitter sentiment analysis, outperforming existing methods.
• The paper mentions that about 20% of the dataset was used for validation, while the
remaining 80% was utilized for training the sentiment analysis model using the CGAN
network.
• The proposed CGAN approach achieved an accuracy of 93.33% for sentiment analysis on
Twitter data.
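The 80/20 division described for [2] can be illustrated with a minimal sketch. The function name `train_validation_split`, the fixed seed, and the sample tweets are assumptions made for illustration; the paper's own splitting procedure is not given in these slides.

```python
import random


def train_validation_split(samples, validation_fraction=0.2, seed=42):
    """Shuffle a copy of `samples` and return (train, validation) lists."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical labelled tweets as (text, label) pairs.
tweets = [(f"tweet {i}", i % 2) for i in range(10)]
train, validation = train_validation_split(tweets)
print(len(train), len(validation))  # 8 2
```

Shuffling before the cut avoids a validation set made up only of the first rows of the file, which matters when the data is stored in label or time order.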
LITERATURE SURVEY (cont…)
[3] Jacob N, Viswanatham VM. “Sentiment Analysis using Improved Atom Search Optimizer
with a Simulated Annealing and ReLU based Gated Recurrent Unit”. IEEE Access. 2024
Mar 7.
• The research paper proposes a novel approach using ReLU-GRU for sentiment analysis
on Twitter, achieving high accuracy rates on COVID-19 and Sentiment-140 datasets
through feature extraction, selection, and classification methods.
• The paper utilized two datasets: the COVID-19 dataset containing tweets from Indian
users during the lockdown period and the Sentiment-140 dataset comprising 1.6 million
tweets classified as positive, negative, or neutral.
• The paper achieved high accuracy rates, with the ReLU-GRU model demonstrating
accuracy percentages of 97.62% for joy, 96.88% for fear, 97.99% for sadness, and
98.99% for anger sentiment factors on the COVID-19 dataset.
LITERATURE SURVEY (cont…)
[4] Aljuhani, Sara Ashour, and Norah Saleh Alghamdi. "A comparison of sentiment analysis
methods on Amazon reviews of Mobile Phones." International Journal of Advanced
Computer Science and Applications 10.6 (2023).
• This paper presents a comprehensive analysis of sentiment analysis methods on mobile
phone reviews.
• The methodology followed a systematic process to extract features, apply machine
learning algorithms, and classify reviews based on sentiment analysis techniques.
• Reported accuracy:
Balanced dataset: CNN achieved the best accuracy of 92.72%, and Logistic
Regression (LR) achieved 80.00%.
Unbalanced dataset: CNN achieved 79.60%, and Logistic Regression (LR)
achieved 80.00%.
LITERATURE SURVEY (cont…)
[5] Wahyudi, Mochamad, and Dinar Ajeng Kristiyanti. "Sentiment analysis of smartphone
product review using support vector machine algorithm-based particle swarm optimization."
(2022): 189-201.
• The paper analyzes smartphone product reviews from www.gsmarena.com, preprocessing
100 positive and 100 negative reviews with tokenization, stopword removal, and
stemming.
• Support Vector Machine (SVM) and Particle Swarm Optimization (PSO) are employed
for sentiment analysis, with PSO enhancing feature selection for improved classification
accuracy.
• Evaluation using 10 Fold Cross Validation shows SVM achieving 82.00% accuracy,
while SVM-based PSO achieves a higher accuracy rate of 94.50%.
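The 10-fold cross-validation used in [5] can be sketched as below. This is an illustrative outline, not the paper's code: `k_fold_indices`, `cross_validate`, and the memorising placeholder classifier are hypothetical names, and a real experiment would plug in an SVM training routine instead.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(samples, labels, train_fn, predict_fn, k=10):
    """Return the mean accuracy over k folds."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Placeholder data and classifier (assumptions): 200 one-word "reviews" and a
# model that simply memorises the label seen for each word during training.
samples = ["good", "bad"] * 100
labels = [1, 0] * 100
mean_accuracy = cross_validate(
    samples, labels,
    train_fn=lambda xs, ys: dict(zip(xs, ys)),
    predict_fn=lambda model, x: model.get(x),
)
print(mean_accuracy)  # 1.0
```

Each review is tested exactly once and trained on nine times. In practice the data would be shuffled or stratified before folding so that every fold holds both classes.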
LITERATURE SURVEY (cont…)
[6] AlQahtani, Arwa SM. "Product sentiment analysis for amazon reviews." International
Journal of Computer Science & Information Technology (IJCSIT) Vol 13 (2021).
• This paper used the Amazon dataset extracted via Prompt Cloud, which had a total of
413,840 reviews labelled with ratings of 1 to 5 stars.
• The proposed methodology had the following steps: Data Collecting, Data Cleansing and
Pre-Processing, Feature Extraction, Model Training and Evaluation.
• In the feature extraction phase, Bag-of-Words, Term Frequency-Inverse Document
Frequency (TF-IDF) and GloVe algorithms were used.
• For model training, Naïve Bayes, Logistic Regression, Random Forest, Bidirectional
Long Short-Term Memory and BERT algorithms were used.
• After evaluation of the models, it was found that the model using BERT gave the best
accuracy of 94% for multiclass (positive, negative and neutral) classification, and an
accuracy of 95% for binary (positive and negative) classification.
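To make the Bag-of-Words and TF-IDF features mentioned for [6] concrete, here is a minimal sketch. It uses the plain textbook weighting tf × log(N / df); library implementations usually apply smoothed variants, and the function names and sample documents are assumptions for illustration.

```python
import math
from collections import Counter


def bag_of_words(tokens):
    """Count how often each token occurs in one document."""
    return Counter(tokens)


def tf_idf(documents):
    """Return one {term: weight} dict per document (documents are token lists)."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = bag_of_words(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already tokenized reviews.
docs = [["great", "phone"], ["bad", "phone"], ["phone", "battery"]]
weights = tf_idf(docs)
print(weights[0])
```

The word "phone" occurs in every document, so its weight is zero, while "great" and "bad" keep a positive weight. This is why TF-IDF tends to highlight sentiment-bearing words over product nouns shared by all reviews.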
LITERATURE SURVEY (cont…)
[7] Jagdale, Rajkumar S., Vishal S. Shirsat, and Sachin N. Deshmukh. "Sentiment analysis
on product reviews using machine learning techniques." Cognitive Informatics and Soft
Computing: Proceeding of CISC 2017. Springer Singapore, 2019.
• The paper showcases the successful application of machine learning algorithms in
accurately categorizing product reviews as positive or negative, highlighting the
significance of sentiment analysis for business strategy improvement.
• The review discusses the use of machine learning algorithms like SVM and NB, hybrid
approaches, sentiment lexicons, and preprocessing tasks employed for sentiment
classification in product reviews.
• The paper utilizes a diverse dataset from Amazon, including reviews of various products
structured in JSON format.
• Experimental results show high accuracy rates, with Naïve Bayes achieving 98.17%
accuracy and Support Vector Machine achieving 93.54% accuracy for camera reviews,
indicating the effectiveness of the methodology.
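Reviews stored as JSON, as described for [7], could be turned into labelled examples along the following lines. This is a sketch under stated assumptions: the field names `reviewText` and `overall`, the one-record-per-line layout, and the star thresholds are illustrative choices and are not taken from the paper.

```python
import json


def rating_to_label(stars):
    """Map a star rating to a binary label; 3-star reviews are treated as unclear."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return None


def load_reviews(lines):
    """Yield (text, label) pairs from JSON records, one record per line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        label = rating_to_label(record["overall"])  # assumed field name
        if label is not None:
            yield record["reviewText"], label  # assumed field name


# Hypothetical sample records.
sample_lines = [
    '{"reviewText": "Works great", "overall": 5.0}',
    '{"reviewText": "Stopped working", "overall": 1.0}',
    '{"reviewText": "It is okay", "overall": 3.0}',
]
reviews = list(load_reviews(sample_lines))
print(reviews)
# [('Works great', 'positive'), ('Stopped working', 'negative')]
```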
LITERATURE SURVEY (cont…)
[8] Farooqui NA, Ritika AS, Saini A. “Sentiment analysis of twitter accounts using natural
language processing”. International Journal of Engineering and Advanced Technology.
2019;8(3):473-9.
• The paper conducts sentiment analysis on Twitter data through various preprocessing
steps like emotion tagging and POS tagging to create feature vectors for analysis.
• The paper employs NLP methods for sentiment analysis, including feature vector and
plain sentiment text mining, utilizing sentiment analyzers like SentiWordNet and
machine learning models such as SVM and Neural Network.
• The paper collects data from Twitter via the streaming API, focusing on political parties'
tweets for sentiment analysis, particularly in the context of presidential elections.
• Evaluation in the paper includes measuring positive/negative scores, mean absolute error
(MAE), and classification into positive, negative, and neutral sentiments, with a
reported accuracy of 95%.
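Mean absolute error, one of the measures named for [8], is the average absolute difference between predicted and reference scores. A minimal sketch follows; the sample sentiment scores are made up for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical polarity scores in the range [-1, 1].
reference = [1.0, -1.0, 0.0, 1.0]
predicted = [0.5, -1.0, 0.5, 0.0]
mae = mean_absolute_error(reference, predicted)
print(mae)  # 0.5
```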
LITERATURE SURVEY (cont…)
Sl. No. | Title | Dataset | Methodology | Accuracy
1 | A Review on Sentiment Analysis of Twitter Data Using Machine Learning Techniques | 5 datasets are used | ML (Naive Bayes, SVM, Logistic Regression), DL (RNN, LSTM, CNN) | 95.5%
4 | A comparison of sentiment analysis methods on Amazon reviews of Mobile Phones | 6 datasets are used | CNN, Logistic Regression | 92.72%
LITERATURE SURVEY (cont…)
Sl. No. | Title | Dataset | Methodology | Accuracy
5 | Sentiment analysis of smartphone product review using support vector machine algorithm-based particle swarm optimization | 200 (100 positive and 100 negative) | SVM, SVM-based PSO | 94.5%
6 | Product sentiment analysis for amazon reviews | 400,000 customer reviews | Bag-of-Words, Term Frequency-Inverse Document Frequency (TF-IDF), GloVe | 95%
7 | Sentiment analysis on product reviews using machine learning techniques | 6 datasets are used | SVM, Naïve Bayes | 98%
8 | Sentiment analysis of twitter accounts using natural language processing | 1400 tweets | Tweepy API, TextBlob, SentiWordNet, SVM | 95%
• There is a pressing need for more sophisticated sentiment analysis methods that can
effectively capture the diverse range of sentiments conveyed through written
communication, especially within the context of product reviews.
• By analyzing sentiments expressed in product reviews, businesses can identify areas for
improvement in the customer experience and implement targeted interventions to
address consumer concerns effectively.
OBJECTIVES
• To implement web scraping to gather diverse product reviews and ratings, facilitating
real-time insights.
• To select the best models after evaluation and build a hybrid model that analyzes
consumer sentiment in product reviews.
• To deploy the classification model for front-end integration, enabling users to submit
reviews and receive predictions.
METHODOLOGY
Algorithm
• Step 1: To collect data from various sources.
   Download dataset from Kaggle.
   Collect data using Web Scraping.
• Step 2: To preprocess the collected data.
• Step 6: To select the best models and build a hybrid classification model.
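Step 2 (preprocessing) can be sketched as follows. The slides do not list the exact preprocessing operations used in this project, so the sketch assumes the three named in reference [5]: tokenization, stopword removal, and stemming. The stopword list and the suffix-stripping stemmer are deliberately tiny stand-ins, not a full implementation.

```python
import re

# Small illustrative stopword list (an assumption, not the project's list).
STOPWORDS = {"a", "an", "the", "is", "it", "this", "and", "of", "to", "was", "i", "very"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def simple_stem(token):
    """Strip one common suffix if at least three characters remain."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[:-len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop stopwords, then stem the remaining tokens."""
    return [simple_stem(t) for t in tokenize(text) if t not in STOPWORDS]


tokens = preprocess("This phone is AMAZING, the battery lasted two days!")
print(tokens)
# ['phone', 'amaz', 'battery', 'last', 'two', 'day']
```

A production pipeline would more likely rely on an established stemmer or lemmatizer; the crude rule here only shows where stemming sits in the sequence.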
Fig 1: Methodology Flowchart (boxes: Data Preprocessing; Data Transformation; Dataset; Model Comparison, Training and Testing; Building Interface)
METHODOLOGY (cont…)
Dataset (Downloaded)
• Result:
IMPLEMENTATION (cont…)
Objective 2: To compare the different classification models and evaluate them.
• Code:
RESULT
Output of web scraping:
RESULT (cont…)
Hybrid model 0.98 0.87 0.86 0.87 0.86
FUTURE SCOPE
Advanced NLP Techniques: Future improvements in models such as BERT
could make sentiment analysis more accurate by capturing finer nuances
of language. Combining text with images and videos could give a fuller
picture of customer sentiment.
• Successfully gathered diverse Amazon product reviews and ratings using web
scraping techniques.
• Preprocessed and cleaned the collected data to ensure high-quality inputs for
model training.
• Compared multiple classification models; upon evaluation, the best-performing
models were:
Random Forest: Training accuracy = 99%, Testing accuracy = 94%
SVM: Training accuracy = 98%, Testing accuracy = 94%
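The slides do not state how the hybrid model combines its member models. One common option is majority voting, sketched below as an assumption. `HybridClassifier`, `majority_vote`, and the keyword-based placeholder models are hypothetical; in the project the members would be the trained Random Forest and SVM classifiers.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label; ties go to the earliest model's prediction."""
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


class HybridClassifier:
    """Combine several classifiers (callables mapping text to a label) by voting."""

    def __init__(self, classifiers):
        self.classifiers = list(classifiers)

    def predict(self, text):
        return majority_vote([clf(text) for clf in self.classifiers])


def keyword_model(positive_words):
    """Placeholder for a trained model: positive if any keyword occurs in the text."""
    return lambda text: (
        "positive" if any(w in text.lower() for w in positive_words) else "negative"
    )


hybrid = HybridClassifier([
    keyword_model({"great", "love"}),
    keyword_model({"great", "good"}),
    keyword_model({"excellent"}),
])
print(hybrid.predict("Great phone, love it"))   # positive
print(hybrid.predict("Battery died in a day"))  # negative
```

With only two members, as in Random Forest plus SVM, ties are possible, so averaging predicted probabilities may be a better fit than hard voting.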
REFERENCES
[1] Khan, M. and Srivastava, A., 2024. “A Review on Sentiment Analysis of Twitter Data
Using Machine Learning Techniques”. International Journal of Engineering and
Management Research, 14(1), pp.186-195.
[2] Mahalakshmi, V., et al. "Twitter sentiment analysis using conditional generative
adversarial network." International Journal of Cognitive Computing in Engineering 5
(2024): 161-169
[3] Jacob N, Viswanatham VM. “Sentiment Analysis using Improved Atom Search Optimizer
with a Simulated Annealing and ReLU based Gated Recurrent Unit”. IEEE Access. 2024
Mar 7.
[4] Aljuhani, Sara Ashour, and Norah Saleh Alghamdi. "A comparison of sentiment analysis
methods on Amazon reviews of Mobile Phones." International Journal of Advanced
Computer Science and Applications 10.6 (2023).
[5] Wahyudi, Mochamad, and Dinar Ajeng Kristiyanti. "Sentiment analysis of smartphone
product review using support vector machine algorithm-based particle swarm optimization."
(2022): 189-201.
[6] AlQahtani, Arwa SM. "Product sentiment analysis for amazon reviews." International
Journal of Computer Science & Information Technology (IJCSIT) Vol 13 (2021).
[7] Jagdale, Rajkumar S., Vishal S. Shirsat, and Sachin N. Deshmukh. "Sentiment analysis
on product reviews using machine learning techniques." Cognitive Informatics and Soft
Computing: Proceeding of CISC 2017. Springer Singapore, 2019.
[8] Farooqui NA, Ritika AS, Saini A. “Sentiment analysis of twitter accounts using natural
language processing”. International Journal of Engineering and Advanced Technology.
2019;8(3):473-9.
THANK YOU!