SET Final Project-2

The document discusses the development of an AI-powered product review analysis system that utilizes natural language processing (NLP) to extract insights from customer reviews. It aims to identify 'gain points' and 'pain points' to enhance product development, marketing, and customer service, while also analyzing competitor products. The methodology includes web scraping, sentiment analysis using LSTM and Seq2Seq models, and data preprocessing techniques to ensure accurate sentiment extraction and summarization.

Uploaded by

justpics.tanvi
Copyright
© All Rights Reserved

Decoding Reviews: AI-Powered Product Review Analysis

Apurva Wankhade
Computer Science and Engineering (Artificial Intelligence and Machine Learning)
Vellore Institute of Technology
Vellore, India
[email protected]

Tanvi Paigude
Computer Science and Engineering (Artificial Intelligence and Machine Learning)
Vellore Institute of Technology
Vellore, India
[email protected]

Dr. Anny Leema A
School of Computer Science and Engineering (SCOPE)
Vellore Institute of Technology
Vellore, India
[email protected]

Abstract— Artificial Intelligence (AI)-powered product review analysis is a growing field that uses AI to extract valuable insights from customer reviews of a variety of products. These insights are useful for improving product development and enhancement, marketing, and customer service. The technique can also identify customer-specific gain points, such as key features and benefits, by using natural language processing (NLP) to extract keywords and phrases from reviews. Once the gain points have been identified, they can be used to improve product development and marketing. AI can likewise extract customers' pain points from the negative parts of their reviews. To that end, this work develops a product-specific focused crawler whose first objective is to search the web and extract customer reviews from the most trusted websites. Subsequently, it extracts the positive or negative sentiments on specific features of the product concerned from the gain points and pain points respectively. As a side benefit, it can also serve as a tool for analyzing competitors’ products to understand their pain and gain points. The approach can uncover the most discussed aspects of products, enabling businesses to better understand customer perceptions and preferences and to make better decisions, thereby saving time and money.

Keywords—Sentiment Analysis, Web Scraping, Natural Language Processing, LSTM

I. INTRODUCTION
In today's competitive market, understanding customer sentiments is vital for businesses. Artificial Intelligence (AI) and Natural Language Processing (NLP) offer a solution known as AI-powered product review analysis. This field extracts insights from customer reviews to enhance product development, marketing, and customer service. It identifies "gain points" (positive aspects) and "pain points" (negative aspects) using NLP. To achieve this, we are developing a product-specific web crawler to gather trusted customer reviews. This data not only helps improve our own products but also provides a competitive edge through the analysis of rival products. By deciphering what customers love and dislike, businesses can make informed decisions, saving time and resources. In summary, AI-powered review analysis changes how we understand customer preferences and supports smarter business choices.

II. PROBLEM STATEMENT
AI-powered product review analysis aims to extract insights from customer feedback for product enhancement. A product-specific crawler gathers reviews from trusted sites, and their sentiments are analyzed to identify gain and pain points. The system also assesses competitors' products, revealing their strengths and weaknesses. This approach offers businesses a nuanced understanding of customer preferences, fostering informed decisions and resource optimization.

III. LITERATURE REVIEW
The advent of Artificial Intelligence (AI) has initiated transformative changes across diverse industries, particularly within the business sector. One pivotal application of AI in the business landscape is the analysis of product reviews, which has gained significant attention from researchers. These reviews contain valuable customer insights, and several scholars have emphasized the importance of extracting and summarizing customer opinions from them. For instance, Htay SS et al. [7] propose techniques that leverage linguistic rules and part-of-speech tagging for this purpose, aiding in distilling valuable insights from unstructured data. Similarly, C. Chauhan et al. [8] emphasize the significance of sentiment analysis in understanding customer sentiments on platforms like Amazon and Flipkart, highlighting its role in refining products and marketing strategies through web scraping and polarity analysis. Additionally, Zhang, Z. et al. [5] introduce an approach that combines sentiment analysis and the intuitionistic fuzzy TODIM method for product selection based on online reviews, enhancing the decision-making process.

In sentiment analysis, researchers have developed various techniques to enhance opinion mining. L. Yang et al. [1] present the SLCABG model, which integrates sentiment lexicons, Convolutional Neural Networks (CNN), and a Bidirectional Gated Recurrent Unit (BiGRU) to extract sentiment features and context from reviews, allowing for a more nuanced understanding of the sentiments they express. Other scholars, such as W. Zhao et al. [2], introduce weakly-supervised deep embedding (WDE) learning frameworks that employ Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) architectures for
sentiment analysis, each offering unique advantages and trade-offs. These deep learning methods enhance sentiment analysis accuracy and robustness. Furthermore, SenticNet 2, introduced in [6], associates semantics and sentics with common-sense concepts, offering a comprehensive approach that enhances opinion mining and sentiment analysis. The applications of sentiment analysis extend beyond product reviews. S. A. A. Shah et al. [3] propose a novel bidirectional LSTM with a CNN model for detecting e-commerce entities, achieving impressive accuracy rates. This underscores the versatility of sentiment analysis. Various applications and ongoing challenges underscore the evolving landscape of sentiment analysis in AI-driven business insights. The study reveals connections between sentiment review structures and these challenges, emphasizing domain dependence as a crucial factor. Efficient preprocessing of online reviews is also highlighted as a critical step, as discussed by Kavanagh et al. [10].

Deep learning methods are increasingly prevalent in sentiment analysis, as seen in the work of M. S. Parvez et al. [9], who propose that web wrappers and machine learning approaches yield fast and accurate results for structured and unstructured HTML pages. Perwej et al. [4] apply machine learning methodologies for sentiment analysis on data collected through web scraping, revealing sentiment trends within web data. This innovative methodology enhances sentiment analysis accuracy by effectively capturing hierarchical relationships between words. However, it comes with challenges related to computational complexity and data availability. Zhao et al. [11] combine deep embedding models with weak supervision to tackle limited labeled data, potentially improving the accuracy and efficiency of sentiment analysis, particularly in the domain of product reviews. S. G. Kim et al. [12] employ text mining techniques to gain insights from consumer reviews of cosmetic products, pinpointing specific attributes that influence consumer decisions. These studies collectively contribute to the ever-evolving landscape of sentiment analysis in AI-driven business insights.

IV. PROPOSED SYSTEM

Fig. 1 Proposed System Architecture

V. METHODOLOGY
Review extraction with sentiment analysis categorizes opinions from various sources. This helps in understanding customer sentiment toward products or topics and is valuable to businesses and researchers for decision-making and insight generation.

A. Data Collection
The paper presents a methodology to aggregate reviews from multiple e-commerce platforms, such as Amazon and Flipkart, into a unified dataset. Using tools like BeautifulSoup or Scrapy, it parses HTML to extract essential review attributes and handles dynamic content loading. To ensure ethical scraping, the method employs rate limiting and politeness measures that prevent server overload. The collected data can be stored in various formats for further analysis or integrated directly into databases.

The workflow, depicted in Fig. 2, outlines the systematic approach from web scraping to data storage. It involves identifying review elements within the HTML, navigating pagination for comprehensive data collection, and implementing measures to avoid server overload. Through techniques like rate limiting and politeness, the scraper controls its request frequency, ensuring ethical data extraction. The collected data can then be saved in different formats or integrated directly into databases for subsequent analysis.

B. Data Preprocessing
a) Tokenization: Tokenization is the process of breaking a sentence into individual units, such as symbols, keywords, and phrases, which are known as tokens. During tokenization, some characters, such as exclamation marks and semicolons, are removed.

b) Removing Stop Words: Stop words are words in a sentence that are not required in any stage of text mining, so they are usually removed to increase the efficiency of the analysis. Stop-word lists differ according to language and country.

c) POS tagging: POS tagging is the process of assigning a part of speech to each word. The tags include noun, pronoun, verb, adjective, and their subcategories.
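The tokenization and stop-word removal steps of Section V-B can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' implementation: the stop-word list, the regular expression, and the sample review are assumptions chosen for the example. POS tagging is omitted because it needs a trained tagger, which an NLP toolkit would normally supply.

```python
import re

# Illustrative stop-word list (an assumption); real lists are
# language-specific and much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "it", "this",
              "and", "but", "of", "to", "in"}

def tokenize(text):
    """Lowercase the text and split it into word tokens,
    dropping punctuation such as '!' and ';'."""
    return re.findall(r"[a-z0-9']+", text.lower())

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop-word list."""
    return [tok for tok in tokens if tok not in stop_words]

# Hypothetical review text used only for demonstration.
review = "The sound quality is great; but the battery is weak!"
tokens = tokenize(review)
content_tokens = remove_stop_words(tokens)

print(tokens)
print(content_tokens)
```

Note that the word "not" is deliberately left out of the illustrative stop-word list: removing it would erase negations such as "not wrong", which matter for sentiment extraction.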

Fig. 2 Work Flow Diagram
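The collection workflow of Section V-A (parse review elements, walk the pages, and limit the request rate) can be sketched as follows. The paper names BeautifulSoup or Scrapy; this sketch swaps in Python's standard-library `html.parser` so that it runs without extra dependencies. The CSS class names `review-title` and `review-body`, the page contents, and the `fetch` callback are all hypothetical. A real crawler would fetch pages over HTTP and would also need to respect each site's terms and robots.txt.

```python
import time
from html.parser import HTMLParser

class ReviewParser(HTMLParser):
    """Collects text from elements whose class attribute is
    'review-title' or 'review-body' (hypothetical class names)."""
    FIELDS = {"review-title": "title", "review-body": "description"}

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._field = None  # field currently being filled, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class", "")
        if css_class in self.FIELDS:
            self._field = self.FIELDS[css_class]
            if self._field == "title":
                # Each title starts a new review record.
                self.reviews.append({"title": "", "description": ""})

    def handle_data(self, data):
        if self._field and self.reviews:
            self.reviews[-1][self._field] += data.strip()

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current field.
        self._field = None

def scrape_reviews(page_urls, fetch, min_delay=1.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Fetch each page in turn, waiting at least min_delay seconds
    between consecutive requests (simple rate limiting)."""
    all_reviews = []
    last_request = None
    for url in page_urls:
        if last_request is not None:
            wait = min_delay - (clock() - last_request)
            if wait > 0:
                sleep(wait)
        last_request = clock()
        parser = ReviewParser()
        parser.feed(fetch(url))
        all_reviews.extend(parser.reviews)
    return all_reviews

# Stand-in for paginated review pages; a real fetch would issue HTTP requests.
PAGES = {
    "page1": '<div class="review-title">Great sound</div>'
             '<p class="review-body">Clear bass.</p>',
    "page2": '<div class="review-title">Weak battery</div>'
             '<p class="review-body">Dies in two hours.</p>',
}
reviews = scrape_reviews(PAGES, fetch=PAGES.get, min_delay=0.01)
print(reviews)
```

Passing `fetch`, `sleep`, and `clock` as parameters keeps the rate-limiting logic testable without network access. The resulting list of dictionaries can then be written to CSV, JSON, or a database, as the text describes.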


C. Sentence Summarization
The paper advocates Seq2Seq models for text summarization because they can handle sentences of varying lengths. These models use an encoder-decoder architecture, with GRU or LSTM layers in the encoder. The encoder encodes the input into a context vector, which the decoder uses to generate a summary sentence. Teacher forcing aids learning during training. The quality of the generated summaries is assessed with evaluation metrics such as BLEU or with human assessment.

Fig. 3 Seq2Seq Architecture

D. Sentence Sentiment Extraction
Assessing product qualities in online e-commerce is complex because of the varied expressions used by reviewers. Identifying nuanced sentiments, especially those expressed through negative prefixes, is challenging yet crucial for accurate analysis. For instance, phrases like "not wrong" may convey a less negative sentiment despite containing negative terms. To address this, a Negative Polarity Identification algorithm is employed to pinpoint such expressions, enhancing sentiment analysis accuracy.

In this approach, the Seq2Seq model generates summaries, which are then fed into an LSTM model to extract sentiments from reviews. The LSTM uses an embedding layer to convert word representations into continuous vectors, learning the mapping from the training data. With a sequential structure, LSTM networks incorporate gates (input, output, forget) to regulate information flow within and between memory cells. At each time step, these gates manage input, retention of historical information, and output, contributing to nuanced sentiment analysis and product evaluation in e-commerce settings.

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \qquad \text{Eqn (1)}$$

In Eqn (1), the forget gate $f_t$ is computed from the current input $x_t$ and the hidden output $h_{t-1}$ of the previous time step. The value of the forget gate lies between 0 and 1. When $f_t = 0$, the previous value is forgotten in the calculation; when $f_t = 1$, the forget gate keeps the previous information. The architecture of the LSTM model is depicted in Fig. 4.

The model is trained using a suitable loss function, typically categorical cross-entropy for multi-class sentiment analysis or binary cross-entropy for binary sentiment analysis. The training data consists of labelled reviews with their corresponding sentiment labels. Gradient descent optimization algorithms such as Adam or RMSprop can be used to update the model's parameters during training. Once trained, the model can be used for sentiment analysis on new, unlabelled product reviews. The user passes the review text through the trained model, which outputs the predicted sentiment label along with a short sentiment description.

Fig. 4 LSTM Model Architecture

VI. RESULTS
The conducted market research shows the gain and pain points for the product from the customer reviews. In this section, the experimental results are gathered and discussed.

Fig. 5 Extracted reviews constituting the dataset

The dataset consists of 150 reviews extracted for a headset product. The dataset is preprocessed to retain only the required columns, or features: the title of the review and the review description.

To begin with, the reviews are fed to the Seq2Seq model, which is responsible for text summarization. Variable-length input reviews are condensed into a fixed-size context vector. Word by word, the recurrent layers process the input sequence, capturing details about the context and content of the review. This is then passed on to the LSTM model for sentiment analysis.

Input sequences pass through an embedding layer, which converts discrete word representations into continuous vector representations. Once trained, the model is ready for sentiment analysis on new, unlabeled product reviews by passing the review text through the model, which outputs the predicted sentiment label.
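The LSTM stage described above relies on the forget gate of Eqn (1). As a small numeric sketch of that gate, the code below evaluates the forget gate for a two-cell memory and shows the two limiting cases from the text: a gate near 1 keeps the previous cell state, and a gate near 0 discards it. All weights, biases, and state values are made-up toy numbers, not parameters from the trained model.

```python
import math

def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forget_gate(W_f, b_f, h_prev, x_t):
    """Eqn (1): f_t = sigmoid(W_f . [h_prev, x_t] + b_f),
    returning one gate value per memory cell."""
    concat = h_prev + x_t  # list concatenation forms [h_{t-1}, x_t]
    return [
        sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
        for row, b in zip(W_f, b_f)
    ]

# Toy values (assumptions for illustration only).
h_prev = [0.5, -0.2]          # previous hidden output h_{t-1}
x_t = [1.0, 0.0, -1.0]        # current input embedding x_t
c_prev = [2.0, -1.0]          # previous cell state
W_f = [[0.1] * 5, [-0.3] * 5]  # one weight row per memory cell

# A large positive bias pushes the gate toward 1 (keep),
# a large negative bias pushes it toward 0 (forget).
f_keep = forget_gate(W_f, [10.0, 10.0], h_prev, x_t)
f_drop = forget_gate(W_f, [-10.0, -10.0], h_prev, x_t)

# The gate scales the previous cell state element-wise.
kept = [f * c for f, c in zip(f_keep, c_prev)]
dropped = [f * c for f, c in zip(f_drop, c_prev)]

print(f_keep, kept)
print(f_drop, dropped)
```

In a full LSTM cell the scaled state is further combined with the input gate's contribution; only the forget-gate term is shown here.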
For example, for the given reviews of a product, the aggregated gain points and pain points, along with the overall summary of the product, are obtained as:

{
  "gain_count": 26,
  "pain_count": 2,
  "summary": "Customers generally appreciate the product, mentioning its good qualities. However, there are also some concerns mentioned. Overall, the product seems to have a positive reception."
}

Fig. 6 Gain points, pain points and summary for the product

Here, the gain count for the product is 26, whereas the pain count is 2. This means that, on average, customers found more positive features in the product than negative ones. The 'summary' field describes how customers perceived the product: in spite of some drawbacks, many of its features are still liked by customers.

VII. CONCLUSION AND FUTURE SCOPE
The paper proposes using LSTM and Seq2Seq models for sentiment analysis and review extraction from customer reviews, to help businesses identify the qualities that end users like and dislike. To capture information from both past and future context, one can use a bidirectional LSTM. This type of LSTM processes the input sequence in both forward and backward directions, providing a richer representation of the input sequence. Furthermore, the study can be extended to draw sentiments from emotions and from the star ratings that customers give the product. Sarcasm and puns in customer reviews can also be handled using pattern extraction and sentiment analysis.

ACKNOWLEDGMENT
We would like to extend our sincere gratitude to Dr. Anny Leema for igniting the spark in us to work on this intriguing area of research and for guiding us through the research work with her knowledgeable insights.

REFERENCES
[1] L. Yang, Y. Li, J. Wang, and R. S. Sherratt, "Sentiment Analysis for E-Commerce Product Reviews in Chinese Based on Sentiment Lexicon and Deep Learning," in IEEE Access, vol. 8, pp. 23522-23530, 2020, doi: 10.1109/ACCESS.2020.2969854.
[2] W. Zhao et al., "Weakly-Supervised Deep Embedding for Product Review Sentiment Analysis," in IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 1, pp. 185-197, 1 Jan. 2018, doi: 10.1109/TKDE.2017.2756658.
[3] S. A. A. Shah, M. Ali Masood, and A. Yasin, "Dark Web: E-Commerce Information Extraction Based on Name Entity Recognition Using Bidirectional-LSTM," in IEEE Access, vol. 10, pp. 99633-99645, 2022, doi: 10.1109/ACCESS.2022.3206539.
[4] Perwej, Dr. Yusuf & Divya, Km & Rastogi, Dr & Yadav, Puneet. (2022). Sentimental Analysis on Web Scraping Using Machine Learning Method. Journal of Information and Computational Science. Volume 12. 10.12733/JICS.2022/V12I08.535569.67004.
[5] Zhang, Z., Guo, J., Zhang, H. et al. Product selection based on sentiment analysis of online reviews: an intuitionistic fuzzy TODIM method. Complex Intell. Syst. 8, 3349–3362 (2022).
[6] SenticNet 2: A semantic and affective resource for opinion mining and sentiment analysis, Proceedings of the 25th International Florida Artificial Intelligence Research Society Conference, FLAIRS-25.
[7] Htay SS, Lynn KT. Extracting product features and opinion words using pattern knowledge in customer reviews. ScientificWorldJournal. 2013 Dec 26;2013:394758. doi: 10.1155/2013/394758. PMID: 24459430; PMCID: PMC3888732.
[8] C. Chauhan and S. Sehgal, "Sentiment analysis on product reviews," 2017 International Conference on Computing, Communication and Automation (ICCCA), Greater Noida, India, 2017, pp. 26-31, doi: 10.1109/CCAA.2017.8229825.
[9] M. S. Parvez, K. S. A. Tasneem, S. S. Rajendra and K. R. Bodke, "Analysis Of Different Web Data Extraction Techniques," 2018 International Conference on Smart City and Emerging Technology (ICSCET), Mumbai, India, 2018, pp. 1-7, doi: 10.1109/ICSCET.2018.8537333.
[10] Kavanagh, James, Greenhow, Keith and Jordanous, Anna (2023) Assessing the Effects of Lemmatisation and Spell Checking on Sentiment Analysis of Online Reviews. In: 17th IEEE International Conference on SEMANTIC COMPUTING (ICSC), 1-3 Feb 2023, Laguna Hills, USA.
[11] Zhao, Wei & Guan, Ziyu & Chen, Long & He, Xiaofei & Cai, Deng & Wang, Beidou & Wang, Quan. (2017). Weakly-Supervised Deep Embedding for Product Review Sentiment Analysis. IEEE Transactions on Knowledge and Data Engineering. PP. 1-1. 10.1109/TKDE.2017.2756658.
[12] Kim, S. G., & Kang, J. (2018). Analyzing the discriminative attributes of products using text mining focused on cosmetic reviews. Information Processing & Management, 54(6), 938-957.
