
Sujan dulal

The document outlines an individual coursework assignment for a module on Artificial Intelligence, focusing on the development of a sentiment analysis system using machine learning techniques, particularly the Naive Bayes algorithm. It discusses the importance of sentiment analysis in understanding customer feedback and public sentiment, while acknowledging the challenges faced during the research process. The coursework aims to create a robust sentiment analysis system that can provide valuable insights across various domains.


Module Code & Module Title

CS6004NT- Artificial Intelligence

Assessment Weightage & Type

Individual Coursework (75%)

Year and Semester

2024-25 Autumn

Student Name: Sujan Dulal

London Met ID: 22072225

College ID: NP04CP4A220129

Assignment Submission Date: January 22

Submitted To: Zishan Siddique

I confirm that I understand my coursework needs to be submitted online via Google Classroom under the
relevant module page before the deadline for my assignment to be accepted and marked. I am fully aware
that late submissions will be treated as non-submission and a mark of zero will be awarded.
Acknowledgement
I initially felt excited to work on this coursework but struggled with time management as the
deadline approached, especially while conducting research and gathering information.
However, the support from my friends and module teacher, Zishan Siddique, was invaluable.
My friends motivated me with their insights and understanding, while Mr. Siddique's expertise,
guidance, and patience greatly enhanced my understanding of the subject and helped me
achieve the best possible outcome. I am deeply grateful for their contributions and support
throughout this journey.
Abstract
The purpose of this research is to create a sentiment analysis system that can accurately
identify the sentiment of text in different contexts. To accomplish this, the use of machine
learning techniques, specifically the Naive Bayes algorithm, is proposed to train a sentiment
classifier using a large and diverse dataset of text samples that are labelled with their
corresponding sentiment. Sentiment analysis is a method to understand the opinions and
emotions expressed in text data such as customer feedback and social media posts.
However, building an accurate and robust sentiment analysis system is still a challenging
task due to the complexity and variability of natural language. As the coursework progresses,
the researcher anticipates facing several difficulties and obstacles. To overcome these, it is
crucial to thoroughly examine different materials and use a trial-and-error method to gain a
deeper understanding of any issues that may arise. The outcome of this coursework will lead
to the development of a sentiment analysis system that can be applied in various domains,
providing valuable insights into the opinions and emotions expressed in text data.

Table of Contents
1 Introduction
1.1 Introduction to Artificial Intelligence
1.2 Introduction to Sentiment Analysis System
1.3 Explanation on AI concept used
1.4 Problem Statement
2 Background
2.1 Research work done method
2.1.1 Fine-grained sentiment Analysis
2.1.2 Aspect-Based Sentiment Analysis
2.1.3 Emotional Detection
2.2 Review and analysis of existing work in the problem domain
3 Solution
3.1 Explanation of the AI algorithms used
3.1.1 Naïve Bayes
3.1.2 Recurrent Neural Networks (RNN)
3.1.3 Logistic Regression
3.2 Pseudocode of the solution
3.3 Diagrammatical Solution of the problem
3.4 Development Process
3.4.1 Tools and Techniques
3.4.2 Libraries and Frameworks
3.4.3 Dataset
3.5 Code Implementation and Result
3.6 Importing Libraries
3.7 Loading Dataset
3.8 Splitting Dataset
3.9 Implementing Naïve Bayes Algorithm
3.10 Implementing Logistic Regression
3.11 Implementing RNN
3.12 Plotting and Visualizing Chart for Each model
3.13 Training Data Comparison on Pie Chart
3.14 Model Result with Test Data
3.15 Accuracy for Each Model
3.16 Metrics for Each Model
4 Conclusion
4.1 Results
4.1.1 Naïve Bayes
4.1.2 Logistic Regression
4.1.3 Recurrent Neural Network (RNN)
4.1.4 Key Observations
4.2 Analysis of work done
4.3 How the solution addresses real problems
5 References

Figure 1 of sentiment analysis
Figure 2 of supervised learning
Figure 3 of unsupervised learning
Figure 4 of fine-grained sentiment analysis
Figure 5 of aspect-based sentiment analysis
Figure 6 of emotional detection in sentiment analysis
Figure 7 flowchart of RNN
Figure 8 flowchart of naive Bayes
Figure 9 of flowchart logistic regression
Figure 10 of python logo
Figure 11 of Jupyter notebook
Figure 12 of code implementation
Figure 13 of loading dataset
Figure 14 of splitting dataset
Figure 15 of implementing naive bayes algorithm
Figure 16 of Implementing Logistic Regression
Figure 17 Implementing RNN
Figure 18 of visualizing chart code
Figure 19 data comparison through naïve bayes
Figure 20 data comparison through Logistic regression
Figure 21 of RNN
Figure 22 testing the data
Figure 23: Accuracy result
Figure 24: Accuracy comparison chart
Figure 25: Metrices for each model

Table 1 of analysis work done


Artificial Intelligence CU6051

1 Introduction

1.1 Introduction to Artificial Intelligence


Artificial Intelligence (AI) is the simulation of human cognitive functions by computer
systems, including expert systems, natural language processing, speech recognition, and
machine vision. It is used in many applications, such as self-driving cars, customer service
chatbots, personalized recommendations and medical diagnosis. (craig, 2020)
The field of technology is advancing rapidly, and we are constantly seeing new innovations
emerge. Artificial intelligence is one of the most rapidly developing areas of computer
science, with the potential to bring about significant advancements in technology through the
development of intelligent machines. Artificial intelligence is now prevalent in many aspects
of our lives, and it is being applied in areas such as self-driving cars, chess,
theorem proving, music performance, and painting. AI is a versatile and exciting field of
computer science that holds a bright future. One of the main characteristics of AI is the
ability to mimic human behaviour and make machines function like humans. (javatpoint,
2020)
The aim of Artificial Intelligence is to develop computer capabilities that enhance human life
by facilitating learning and problem-solving related to human understanding. It's an
implementation that makes computers more intelligent, rather than a standalone system.
The subject of Artificial Intelligence is becoming increasingly prevalent in the media, with
new advancements and potential uses being discussed frequently. Artificial Intelligence, or
AI, encompasses a wide range of technologies that allow machines to learn from
experience and perform activities like those of humans. Opinions on the current and future
use of AI vary greatly, from hopeful to cynical. The portrayal of AI in popular culture often
misrepresents the technology and its capabilities, leading to unrealistic expectations. It is
important to understand that AI encompasses a variety of technologies that enable robots to
learn in an "intelligent" manner, rather than a single entity. (innoplexus, n.d.)

1.2 Introduction to Sentiment Analysis System


Sentiment analysis is a method to identify emotions, opinions, and attitudes in text data.
Businesses use it to analyse customer feedback, social media posts and more in order to
understand public sentiment about a brand, product, or topic. These insights can inform
data-driven decisions about product offerings, marketing strategies, and customer service. It
helps companies understand customer perception and improve accordingly. (geeksforgeeks, 2024)

7|Page Sujan Dulal



Machine learning enhances sentiment analysis by automating text analytics operations.
Models are trained on large datasets of text, such as customer feedback, using supervised
and unsupervised approaches so that they can identify the sentiment of new text. NLP
techniques are also used to assign weighted sentiment ratings to entities, topics, and
categories within text, allowing for a more nuanced understanding and better data-driven
decisions for businesses. (Anon., n.d.)

Sentiment analysis is becoming increasingly important as a tool for understanding and
monitoring public sentiment, as people are expressing their opinions and emotions more
freely than ever before. Brands can use this data to gain insight into what their customers
like and dislike, by automatically analysing feedback such as survey responses and social
media conversations. With this information, companies can make data-driven decisions to
improve their products and services to better meet the needs of their customers. With the
help of Sentiment Analysis, businesses can quickly understand how their products, services,
and brand are being perceived by customers and make improvements accordingly, which
can help them to improve customer satisfaction and increase brand loyalty. To understand
people's positive or negative reactions using textual data, the topic "Sentiment Analysis of
Text" was selected as the best solution. In short, it serves as an important tool for
understanding customers and making improvements accordingly. (aimtechnologies, n.d.)

Figure 1 of sentiment analysis

1.3 Explanation on AI concept used


Natural Language Processing (NLP): NLP is a field of computer science and AI that allows
computers to understand and interpret human language using a range of techniques and
models. It helps computers understand the meaning, context, and purpose of text or speech,
much as humans do. (stryker, n.d.)


Machine Learning (ML): Machine learning is a branch of AI that enables computer systems
to improve and adapt without explicit programming. It uses algorithms to analyze data and
make predictions and decisions, the goal being to allow computer systems to learn and
improve automatically. Its purpose is to identify patterns and to update predictions and
decisions as new data becomes available, thus making the model more capable over time.
(jannade, n.d.) There are mainly two types of machine learning:
• Supervised Machine Learning

Supervised learning is a type of machine learning where the model is trained on labelled data
to make predictions. The model learns input-output relationships from the labelled data
during training and is then evaluated on separate test data to gauge its accuracy. The goal is
to apply the learned relationships to make predictions on new, unseen data.
(geeksforgeeks, n.d.)

Figure 2 of supervised learning

• Unsupervised Machine Learning

Unsupervised learning is a branch of machine learning where the model is trained on
unlabelled data to discover hidden patterns and relationships. It aims to find the underlying
structure of the data, organise it into groups, and represent it in a compressed format, without
human supervision. Unlike supervised learning, it does not rely on pre-labelled data, so it can
discover patterns that were previously unknown. However, the patterns it finds may not be as
accurate as the predictions of a supervised model. (ibm, n.d.)


Figure 3 of unsupervised learning

1.4 Problem Statement


Sentiment analysis combines natural language processing (NLP) and machine learning to
determine whether a text is positive, negative, or neutral. Rule-based and automated
sentiment analysis are the two basic methods. In today's businesses, data is highly helpful for
identifying various difficulties, and analysing it helps to plan the next steps for improving the
business. Public opinion and customer feedback on a company's goods, services, and brand
are among the most crucial inputs for any corporation. Given the vast amount of customer
feedback, however, it is hard to tell by manual reading whether products and services are in
high demand or whether clients are dissatisfied with them. There are several online
marketplaces these days, including Amazon and eBay, and educational marketplaces like
Coursera and Udemy. They have thousands of users and offer thousands of brands, goods,
and learning opportunities. Customers review businesses and products based on their
personal experiences, and thousands of reviews are produced each year. People cannot
realistically judge whether each piece of feedback is positive or negative when it is buried
among thousands of others. Reviews and comments are crucial for assessing the
effectiveness of a specific product or service, and tracking them aids future business
decisions. Sentiment analysis may be the best overall solution for this issue because it can
locate and extract subjective information that enables businesses to understand the social
sentiment about their brands and items.


2 Background

2.1 Research work done method

2.1.1 Fine-grained sentiment Analysis

Fine-grained sentiment analysis is a method that can offer more detailed information on
emotions and attitudes in text data than a simple binary classification. It determines specific
emotions and their intensity in customer feedback, social media posts and more. Fine-grained
sentiment analysis can provide more accurate and nuanced insights, such as specific
emotions like excitement or disappointment and their level of intensity. It is particularly useful
in applications where a detailed understanding of customer sentiment is required.
(analyticsvidhya, 2020)

Figure 4 of fine-grained sentiment analysis

2.1.2 Aspect-Based Sentiment Analysis

Aspect-based sentiment analysis (ABSA) is a method of examining user sentiment that
focuses on identifying and extracting specific aspects, features, or subjects in a piece of text,
and then assigning a sentiment score to each of them. It is particularly
useful for analysing consumer reviews and feedback, as it allows for a more detailed
understanding of customer sentiment towards specific aspects of a product or service
(valleywood, n.d.). ABSA can be used to improve marketing and market research initiatives
by identifying areas of concern and suggesting ways to improve product features in the
future. It is a specific machine learning task in the domain of natural language processing
and is different from other methods of sentiment analysis, such as document-based
sentiment analysis which provides a more general picture of sentiment, by providing more in-
depth insights and granular information from the data. (repustate, n.d.). Here is what Aspect-
based sentiment analysis can extract:

a. Sentiments: positive or negative feelings regarding a particular aspect.

b. Aspects: the category, feature, or topic being discussed.


Figure 5 of aspect-based sentiment analysis

2.1.3 Emotional Detection

Emotion detection is a field of study that aims to identify and understand the emotions that
people express through various channels of communication. It is a rapidly growing area of
research, and its applications are wide-ranging, from facilitating communication between
robots and humans to enhancing decision-making processes. Various machine learning
models have been developed for the task of extracting emotions from text. One of the most
popular methods is the lexicon-based approach, which uses a collection of words that
express emotions, commonly called a lexicon. This approach is widely used by emotion
detection systems. (thecleverprogrammer, 2020)

However, some advanced systems use more sophisticated machine learning techniques
that are better equipped to understand the nuances of human emotions and the different
ways in which they are expressed. This is because people express their emotions in many
ways, and lexicons may not always be adequate for proper emotion recognition. For
example, a sentence like "This product is going to kill me" can express fear and panic,
whereas "This product is killing it for me" carries a positive connotation. The word "kill" is
used differently in each case, so a lexicon-based approach could lead to improper emotion
recognition. In the lexicon-based approach, each word is labelled as positive or negative, the
positive and negative words in the text are counted, and the counts are combined
mathematically to determine the total sentiment score.

The sentiment score (StSc) is often calculated using the following formula:

$$\text{StSc} = \frac{\text{number of positive words} - \text{number of negative words}}{\text{total number of words}}$$


The text is categorized as negative if the sentiment score is negative. Accordingly, a positive
score indicates a positive text, and a score of zero designates a neutral text. (Bessa,
2023)
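
The scoring rule above can be sketched in a few lines of Python. This is only an illustration and not the coursework's implementation: the two word sets are tiny made-up lexicons, and the names `sentiment_score` and `label` are hypothetical.

```python
import re

# Tiny illustrative lexicons (assumptions, not a real sentiment lexicon).
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "terrible", "sad"}


def sentiment_score(text):
    """Return (positive - negative) / total words, or 0.0 for empty text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive = sum(1 for w in words if w in POSITIVE_WORDS)
    negative = sum(1 for w in words if w in NEGATIVE_WORDS)
    return (positive - negative) / len(words)


def label(score):
    """Map the sign of the score to a sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For instance, `label(sentiment_score("I love this great phone"))` counts two positive words out of five and yields a positive label. A lexicon this small would misread many real sentences, which is the limitation discussed above.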

Figure 6 of emotional detection in sentiment analysis

2.2 Review and analysis of existing work in the problem domain

• Google is a solid example of how sentiment analysis software helps improve products.
Consider the Chrome web browser as an illustration: the development team for Google
Chrome constantly monitors both direct user feedback and indirect feedback
published in open sources, most notably blogs.
• KFC is a good illustration of brand tracking with sentiment analysis. KFC used
sentiment analysis for brand monitoring and advertising. By combining sentiment
analysis with social network monitoring and campaign management, it connects
individuals with the brand and eventually gets them to associate with the product.

• Netflix uses sentiment analysis to understand customer opinions by analyzing social
media posts and feedback. This helps the platform gauge how users feel about its
content, features, and overall experience. By processing this data, Netflix refines its
content recommendations to better align with user preferences and ensures a more
engaging and personalized viewing experience. Additionally, sentiment analysis
allows Netflix to identify potential issues or trends in user feedback, enabling timely
improvements to maintain customer satisfaction and enhance its competitive edge in
the streaming industry.


• YouTube has a vast number of user comments under videos. Sentiment analysis can
help creators and businesses understand the general sentiment of their audience,
that is, whether viewers feel positively, negatively, or neutrally about their content.

• Itahari International College uses sentiment analysis to analyse feedback from
students about courses, teachers, or the overall academic experience. Identifying
negative sentiment in feedback about a particular course could help
administrators address issues such as outdated materials or ineffective teaching
methods.

3 Solution
Text sentiment analysis helps solve real-world problems by understanding how people feel
through their written words. It has many useful applications:
• Companies use it to read customer reviews and make their services better.

• Doctors can spot signs of depression by analysing patients’ writings. (oh, 2024)

• During emergencies, it helps find people who need food or medical help by checking
social media posts. (kumar, n.d.)
• Schools use student feedback to improve teaching.

By understanding people's feelings from text, we can make healthcare, education,
emergency response, and government work better for everyone.

3.1 Explanation of the AI algorithms used

3.1.1 Naïve Bayes

The Naive Bayes classification method is a popular technique for identifying patterns in data
and making predictions. It is based on Bayes' Theorem, which states that the probability of an
event can be updated using prior knowledge of conditions that might be relevant to that
event. It is a probabilistic approach that treats every feature as independent of the others
given the class. The algorithm calculates the probability of each class and chooses the one
with the highest probability. It is commonly used for a variety of tasks, and it has been
particularly effective in solving problems related to natural language processing (NLP).
(Gamal, 2020)

$$P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B)}$$


P(A|B) – posterior

P(B|A) – likelihood

P(A) – prior

P(B) – evidence
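
To make the four terms concrete, here is a minimal word-count Naive Bayes sketch in plain Python. It is an illustration under stated assumptions rather than the implementation used in this coursework: the training sentences are made up, the function names are hypothetical, and add-one (Laplace) smoothing is assumed so that unseen word and class combinations do not produce a zero probability. The evidence P(B) is the same for every class, so it is left out when comparing classes.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels):
    """Count words per class and documents per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(documents, labels):
        words = text.lower().split()
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict(text, class_counts, word_counts, vocabulary):
    """Return the class with the highest log posterior (ignoring the constant P(B))."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_counts.items():
        # log prior: log P(A)
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # log likelihood log P(B|A) with add-one (Laplace) smoothing
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training examples, for illustration only.
docs = ["great product love it", "terrible product hate it",
        "love the great service", "hate the terrible delay"]
labels = ["positive", "negative", "positive", "negative"]
model = train_naive_bayes(docs, labels)
```

Calling `predict("love this great phone", *model)` compares the two classes and picks the one whose prior and word likelihoods give the larger product. Working in log space avoids multiplying many small probabilities together.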

3.1.2 Recurrent Neural Networks (RNN)

Recurrent Neural Networks (RNNs) are considered highly suitable for sentiment analysis
because they are specifically designed to process sequential data, such as text, where the
order of words plays a critical role. By remembering the context from earlier words while
analysing the subsequent ones, RNNs can maintain continuity, which is essential for
understanding the flow of sentiment in a sentence. This ability allows them to handle
complex sentences and capture dependencies between distant words, although gated
variants such as LSTMs tend to handle very long-range dependencies more reliably. This
makes them particularly useful for tasks where understanding relationships between words
is necessary to derive accurate sentiment predictions. (Patel, 2019)
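
The idea of carrying context from earlier words can be sketched as the forward pass of a simple (Elman) RNN. This is a minimal illustration in plain Python, not the TensorFlow model built later in the coursework. The function name `rnn_forward`, the weight names, and the example numbers are all hypothetical, and no training is shown.

```python
import math


def rnn_forward(inputs, w_xh, w_hh, b_h):
    """Run a simple RNN over a sequence and return every hidden state.

    inputs: list of input vectors (one per word)
    w_xh:   hidden_size x input_size weight matrix
    w_hh:   hidden_size x hidden_size weight matrix
    b_h:    bias vector of length hidden_size
    """
    hidden = [0.0] * len(b_h)  # initial hidden state
    states = []
    for x in inputs:
        new_hidden = []
        for i in range(len(b_h)):
            total = b_h[i]
            total += sum(w_xh[i][j] * x[j] for j in range(len(x)))
            # the previous hidden state feeds back in: this is the "memory"
            total += sum(w_hh[i][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
        states.append(hidden)
    return states


# Three "words" as one-hot vectors; the first and third are the same word.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
example_w_xh = [[0.5, -0.5], [0.3, 0.8]]
example_w_hh = [[0.1, 0.2], [0.0, 0.4]]
example_states = rnn_forward(sequence, example_w_xh, example_w_hh, [0.0, 0.0])
```

The first and third inputs are identical, yet their hidden states differ because the third one also depends on what came before it. That dependence on earlier words is what lets an RNN take word order into account.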

3.1.3 Logistic Regression


Sentiment analysis using logistic regression works by first turning tweets into numbers. This
is done by creating a list of all the unique words in the tweets (called a vocabulary). Each
tweet is then turned into a list of 1's and 0's, where a 1 means a word from the vocabulary is
in the tweet, and a 0 means it's not. Most of these lists will have a lot of 0's because each
tweet only contains a small number of words. Then, the logistic regression model is trained
to recognize patterns between these lists of numbers and the tweet's sentiment (positive or
negative). Once the model is trained, it can predict whether new tweets are positive or
negative by turning them into the same kind of number list and checking what the model
learned. (peterfoy, 2017) The logistic function is defined as:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

where z is the weighted sum of the input features.


3.2 Pseudocode of the solution


A step-by-step explanation of an algorithm is referred to as pseudocode. Pseudocode is not
written in any programming language; rather, it employs plain English text because it is
meant to be read by people rather than machines. (geeksforgeeks, 2024)
Pseudocode
IMPORT necessary modules and libraries

READ the dataset from the file 'Dataset.txt' using the pandas library

ASSIGN 'Label' and 'Reviews' as the column names

PRINT the dataset using the print function

INSPECT the missing values by using the isnull() and sum() functions to count them

SHOW the first five rows of the labelled dataframe using the head() function

SHOW the length of the dataset by using the len() function

COUNT the total number of negative and positive labels by using the value_counts() function

ASSIGN the Reviews to variable 'A' and Label to variable 'B'

CONVERT all uppercase text to lowercase and store it in a list lower_text

REMOVE punctuation from the text using regular expressions and store it in a list clean_text

REMOVE extra spaces from the text and store it in final_text

SPLIT the prepared data into a training group and a testing group

TRAIN a basic text classification model on the training data

PREDICT the output using the predict function on the test data

PRINT the accuracy score

TAKE a new input from the user, process it the same way as the training data, predict the sentiment, and print it
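
The cleaning and splitting steps of the pseudocode could look roughly like this in Python. It is a hedged sketch that uses only the standard library. The function names, the 25% test ratio, and the fixed seed are assumptions made for illustration, and the regular expression also drops digits and non-English letters, which the pseudocode does not specify.

```python
import random
import re


def clean_text(text):
    """Lowercase, replace anything that is not a-z or whitespace, collapse spaces."""
    lower_text = text.lower()
    no_punctuation = re.sub(r"[^a-z\s]", " ", lower_text)
    return re.sub(r"\s+", " ", no_punctuation).strip()


def split_dataset(reviews, labels, test_ratio=0.25, seed=42):
    """Shuffle the paired data and split it into training and testing groups."""
    pairs = list(zip(reviews, labels))
    random.Random(seed).shuffle(pairs)
    test_size = max(1, int(len(pairs) * test_ratio))
    return pairs[test_size:], pairs[:test_size]
```

The same `clean_text` function should be applied to any new input from the user before prediction, so that it is processed in the same way as the training data.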


3.3 Diagrammatical Solution of the problem


A flowchart is a visual representation of a process, algorithm, or problem-solving method. It
shows the steps, order, and flow of data in a system, making it easy to understand and to
identify inefficiencies. It helps improve and streamline an information processing system by
showing the big picture and its fundamental components. (conceptdarw, n.d.)


Figure 7 flowchart of RNN


Figure 8 flowchart of naive Bayes


Figure 9 of flowchart logistic regression


3.4 Development Process


The process of creating the sentiment analysis system began with the acquisition of a
dataset of text labelled with sentiment. The data was then pre-processed by
eliminating stop words, splitting the text into individual words (tokenization) and further
standardizing it to fit the analysis.

Next, the text was transformed into numbers using the TF-IDF technique so that the data
could be used in machine learning algorithms. Naïve Bayes, Logistic Regression, and
RNN models were developed and trained using this processed data.

The performance of each model was evaluated using basic assessment tools such as
accuracy and confusion matrix. Finally, the results were presented in the form of graphs to
ensure that they could be easily understood. This approach was instrumental in developing
a process that can now be used to determine the sentiment of text.
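
The evaluation step described above (accuracy and confusion matrix) can be sketched as follows. This is a minimal standard-library illustration with made-up labels, not the report's own evaluation code or results; the function names and the `LABELS` ordering are assumptions chosen for the example.

```python
from collections import Counter

# Assumed class order for the rows and columns of the matrix.
LABELS = ["Positive", "Negative", "Neutral"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Hypothetical labels, for illustration only.
y_true = ["Positive", "Negative", "Neutral", "Positive", "Negative", "Neutral"]
y_pred = ["Positive", "Negative", "Positive", "Positive", "Neutral", "Neutral"]

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
for label, row in zip(LABELS, confusion_matrix(y_true, y_pred)):
    print(label, row)
```

Reading the matrix row by row shows which classes a model confuses with each other, which a single accuracy figure hides.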

3.4.1 Tools and Techniques

Programming Language: Python

Python is an interpreted, general-purpose language widely used for machine learning and data
analysis because of its simple syntax and extensive library support.

Figure 10 of python logo

Jupyter Notebook

An interactive coding environment suited to developing and demonstrating prototypes, in
which code, results and Markdown notes are kept together in cells.


Figure 11 of Jupyter notebook

3.4.2 Libraries and Frameworks

NumPy

It is used for analysing high-dimensional data and for mathematical calculations on
multi-dimensional arrays. (naik, 2024)

Pandas

Allows for easy data operations such as selection, cleaning and structuring data by
converting data sets into data frames for further analysis. (geeksforgeeks, n.d.)

Matplotlib and Seaborn

Applied to data visualization where data patterns are represented by plots, histograms and
heat maps. (geeksforgeeks, 2025)

Scikit-Learn

A machine learning library containing Naïve Bayes and Logistic Regression algorithms, and
tools for data pre-processing, feature extraction and assessment. (domino, n.d.)

TensorFlow

It is helpful in applying deep learning models such as RNNs or LSTMs, which are well suited
to processing text sequences. (banoula, 2024)

NLTK

Provides text preprocessing features such as tokenization of text documents, removal of
stop words, stemming and lemmatization. (sidak, 2023)


spaCy/TextBlob

Advanced NLP packages that provide entity recognition and built-in sentiment analysis.
(gichere, 2023)

TF-IDF Vectorizer

Transforms text data into numeric features, giving higher weight to words that are
distinctive for a document and lower weight to words that are frequent across all
documents. (kilmen, 2022)
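
The weighting idea behind TF-IDF can be illustrated with a small worked example. The sketch below uses the textbook form tf × log(N / df) and only the standard library; it is an assumption-laden illustration, not the report's code. Scikit-Learn's TfidfVectorizer uses a smoothed IDF and L2 normalisation by default, so its numbers would differ. The function name `tf_idf` and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            # Term frequency (share of the document) times inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample sentences.
docs = ["the phone is great", "the phone is terrible", "the battery is great"]
for doc, w in zip(docs, tf_idf(docs)):
    print(doc, {term: round(value, 3) for term, value in w.items()})
```

In this example "the" and "is" occur in every sentence and receive a weight of zero, while "terrible" and "battery", which occur in only one sentence each, receive the highest weights.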

3.4.3 Dataset

A labelled dataset of Twitter sentiment data was used to train and test the
machine learning models. Each tweet is pre-labelled as positive, negative
or neutral, which makes the dataset suitable for developing and testing a sentiment
analysis system.

3.5 Code Implementation and Result


I trained my models to classify tweets as Positive, Negative, or Neutral. After training, I
tested them with my own custom text to see how well they worked. The models
returned a sentiment for each custom input, showing that they can be applied to new,
unseen text; their measured performance is reported in Section 4.1.

3.6 Importing Libraries

Figure 12 of code implementation


3.7 Loading Dataset

Figure 13 of loading dataset

3.8 Splitting Dataset

Figure 14 of splitting dataset
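
The splitting code itself is only available as a figure, so the following is a minimal sketch of a shuffled train/test split rather than the original implementation. The function name `split_dataset`, the 80/20 ratio and the seed are assumptions; the variable names A (reviews) and B (labels) follow the pseudocode in the earlier section.

```python
import random


def split_dataset(A, B, test_ratio=0.2, seed=42):
    """Shuffle the row indices once, then cut them into train and test parts."""
    if len(A) != len(B):
        raise ValueError("A and B must have the same length")
    indices = list(range(len(A)))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_test = round(len(indices) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    A_train = [A[i] for i in train_idx]
    A_test = [A[i] for i in test_idx]
    B_train = [B[i] for i in train_idx]
    B_test = [B[i] for i in test_idx]
    return A_train, A_test, B_train, B_test


# Hypothetical data, for illustration only.
reviews = [f"review {i}" for i in range(10)]
sentiments = ["Positive" if i % 2 == 0 else "Negative" for i in range(10)]

A_train, A_test, B_train, B_test = split_dataset(reviews, sentiments)
print(len(A_train), "training rows,", len(A_test), "testing rows")
```

Shuffling indices rather than the lists themselves keeps each review paired with its label.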

3.9 Implementing Naïve Bayes Algorithm

Figure 15 of implementing naive bayes algorithm
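
As the implementation is shown only as a figure, the sketch below illustrates how a multinomial Naïve Bayes text classifier works, written from scratch with the standard library. It is not the report's code, which uses Scikit-Learn; the class name `NaiveBayesText`, the Laplace (add-one) smoothing and the tiny training set are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over whitespace-separated words."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # word counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Start from the log prior, then add one log likelihood per word.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                # Add-one smoothing avoids zero probabilities.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, for illustration only.
train_texts = [
    "i love this phone", "great battery and great screen",
    "i hate this phone", "terrible battery life",
    "the phone arrived today", "it is a phone",
]
train_labels = ["Positive", "Positive", "Negative", "Negative", "Neutral", "Neutral"]

model = NaiveBayesText().fit(train_texts, train_labels)
print(model.predict("i love the great screen"))
print(model.predict("terrible battery"))
```

Working in log probabilities avoids numeric underflow when many small word probabilities are multiplied together.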


3.10 Implementing Logistic Regression

Figure 16 of Implementing Logistic Regression

3.11 Implementing RNN

Figure 17 Implementing RNN


3.12 Plotting and Visualizing Chart for Each model

Figure 18 of visualizing chart code


3.13 Training Data Comparison on Pie Chart

Figure 19 data comparison through naïve bayes


Figure 20 data comparison through Logistic regression


Figure 21 of RNN

3.14 Model Result with Test Data


Figure 22 testing the data


3.15 Accuracy for Each Model

Figure 23: Accuracy result


Figure 24: Accuracy comparison chart


3.16 Metrics for Each Model

Figure 25: Metrics for each model


4 Conclusion
Sentiment analysis helps figure out if a text is positive, negative, or neutral. Businesses use
it to see how customers feel about their products or services. If customers are unhappy,
businesses can improve and make them happier. It also helps marketers understand if their
ads or products are liked or disliked by customers.

This report has described how machine learning methods can be used to implement
sentiment analysis. It has also introduced the topic of sentiment analysis and reviewed
the methods used to approach it in different problem domains.

4.1 Results
4.1.1 Naïve Bayes

• Accuracy: 0.67

• Precision, Recall, and F1-Score (Class-wise):

  ◦ Positive: Precision=0.66, Recall=0.68, F1-Score=0.67
  ◦ Negative: Precision=0.65, Recall=0.81, F1-Score=0.72
  ◦ Neutral: Precision=0.73, Recall=0.47, F1-Score=0.57

• Macro Average:

  ◦ Precision=0.68, Recall=0.65, F1-Score=0.65

4.1.2 Logistic Regression

• Accuracy: 0.68

• Precision, Recall, and F1-Score (Class-wise):

  ◦ Positive: Precision=0.65, Recall=0.70, F1-Score=0.67
  ◦ Negative: Precision=0.72, Recall=0.74, F1-Score=0.73
  ◦ Neutral: Precision=0.67, Recall=0.58, F1-Score=0.62

• Macro Average:

  ◦ Precision=0.68, Recall=0.67, F1-Score=0.68

4.1.3 Recurrent Neural Network (RNN)

• Accuracy: 0.52

• Precision, Recall, and F1-Score (Class-wise):

  ◦ Positive: Precision=0.44, Recall=0.89, F1-Score=0.59
  ◦ Negative: Precision=0.69, Recall=0.60, F1-Score=0.65
  ◦ Neutral: Precision=0.00, Recall=0.00, F1-Score=0.00

• Macro Average:

  ◦ Precision=0.38, Recall=0.50, F1-Score=0.41

4.1.4 Key Observations

• Best Accuracy: Logistic Regression achieved the highest accuracy (0.68), narrowly ahead
of Naïve Bayes (0.67); the RNN trailed at 0.52.

• Class-wise Performance:

  ◦ For Naïve Bayes, the Negative class had the highest recall (0.81) and F1-score (0.72).
  ◦ For Logistic Regression, the Negative class showed the best overall balance, with
    F1-score=0.73.
  ◦ For the RNN, the Positive class achieved high recall (0.89) at low precision (0.44),
    while the Neutral class scored 0.00 on precision, recall and F1-score, meaning the
    model never correctly predicted that category.

• Macro Averages: Logistic Regression had the best macro-averaged Recall (0.67) and
F1-score (0.68), and tied with Naïve Bayes on macro-averaged Precision (0.68).

Logistic Regression is the most balanced model in terms of both accuracy and macro
averages, making it the most suitable for this dataset. However, the performance of all
models for the Neutral class indicates potential room for improvement in handling this
category.
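
The macro averages reported above are the unweighted means of the three class-wise values. The short check below recomputes them from the rounded class-wise figures listed in Sections 4.1.1 to 4.1.3; the dictionary layout and the function name `macro_average` are choices made for this illustration.

```python
# Class-wise (precision, recall, F1) values as reported in Sections 4.1.1 to 4.1.3.
reported = {
    "Naive Bayes": {
        "Positive": (0.66, 0.68, 0.67),
        "Negative": (0.65, 0.81, 0.72),
        "Neutral": (0.73, 0.47, 0.57),
    },
    "Logistic Regression": {
        "Positive": (0.65, 0.70, 0.67),
        "Negative": (0.72, 0.74, 0.73),
        "Neutral": (0.67, 0.58, 0.62),
    },
    "RNN": {
        "Positive": (0.44, 0.89, 0.59),
        "Negative": (0.69, 0.60, 0.65),
        "Neutral": (0.00, 0.00, 0.00),
    },
}


def macro_average(per_class):
    """Unweighted mean of each metric across classes, rounded to 2 decimals."""
    columns = zip(*per_class.values())  # (precisions), (recalls), (f1 scores)
    return tuple(round(sum(col) / len(col), 2) for col in columns)


for model_name, per_class in reported.items():
    precision, recall, f1 = macro_average(per_class)
    print(f"{model_name}: precision={precision}, recall={recall}, f1={f1}")
```

Recomputing from rounded inputs matches the reported macro averages except for the Logistic Regression F1-score, which comes out at 0.67 rather than 0.68. A difference of 0.01 is consistent with the reported value having been averaged before the class-wise scores were rounded.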

4.2 Analysis of work done

S.N.  Task                                             Status
1.    Research on Artificial Intelligence              Completed
2.    Research on chosen topic (Sentiment Analysis)    Completed
3.    Research on AI concept used                      Completed
4.    Problem statement of chosen topic                Completed
5.    Research on work done method                     Completed
6.    Similar system case study                        Completed
7.    Review and analysis of existing work             Completed
8.    Research on solution of selected algorithm       Completed
9.    Flowchart and Pseudocode                         Completed
10.   Documenting the development Process              Completed
11.   Final Report                                     Completed
12.   Presentation Slides                              Completed

Table 1 Analysis of work done

4.3 How the solution addresses real problems


Sentiment analysis is a technique used to determine the emotional tone of a piece of text,
whether it is positive, negative, or neutral. Companies face several challenges in achieving
accurate sentiment analysis, as it requires a sophisticated understanding of human emotions
and language. However, as data science continues to advance, sentiment analysis software
is becoming better at addressing these challenges. Sentiment analysis has been used in
market research and competitor analysis, where it allows a company to examine customer
reviews of its products, compare them with reviews of competing products, or analyse
sentiment across international markets. The technique can also be used to monitor public
opinion on a product by analysing tweets, Facebook posts, or product reviews over time.

Gathering public opinion and feedback from customers is crucial for any service to function
well. Surveys are often used to gain insights on how people think and feel about a particular
service. Sentiment analysis can be used within these surveys to evaluate the effectiveness
of services, measure their impact on people and identify areas that require improvement. In
short, sentiment analysis can provide the information needed to adjust and improve
these services.

5 References
aimtechnologies, n.d. aimtechnologies. [Online]
Available at: https://www.aimtechnologies.co/why-sentiment-analysis-is-important-the-powerof-emotions-in-data/
[Accessed 22 12 2024].

analyticsvidhya, 2020. analyticsvidhya. [Online]
Available at: https://www.analyticsvidhya.com/blog/2020/11/fine-grained-sentiment-analysisof-smartphone-review/
[Accessed 23 12 2024].

Anon., n.d. lexalytics. [Online]
Available at: https://www.lexalytics.com/technology/sentiment-analysis/#machinelearningsentiment
[Accessed 22 12 2024].

Bessa, A., 2023. knime. [Online]
Available at: https://www.knime.com/blog/lexicon-based-sentiment-analysis
[Accessed 23 12 2024].

conceptdraw, n.d. conceptdraw. [Online]
Available at: https://www.conceptdraw.com/How-To-Guide/flowchart-definition
[Accessed 24 12 2024].

craig, l., 2020. techtarget. [Online]
Available at: Artificial Intelligence (AI) is the simulation of human cognitive functions by
computer systems, including expert systems, natural language processing, speech
recognition, and machine vision. It is used in various industries such as self-driving cars, cus
[Accessed 22 12 2024].

Gamal, B., 2020. medium. [Online]
Available at: https://medium.com/analytics-vidhya/na%C3%AFve-bayes-algorithm-5bf31e9032a2
[Accessed 23 12 2024].

geeksforgeeks, 2024. geeksforgeeks. [Online]
Available at: https://www.geeksforgeeks.org/what-is-sentiment-analysis/
[Accessed 22 12 2024].

geeksforgeeks, 2024. geeksforgeeks. [Online]
Available at: https://www.geeksforgeeks.org/how-to-write-a-pseudo-code/
[Accessed 23 12 2024].

geeksforgeeks, n.d. geeksforgeeks. [Online]
Available at: https://www.geeksforgeeks.org/supervised-machine-learning/
[Accessed 22 12 2024].

ibm, n.d. ibm. [Online]
Available at: https://www.ibm.com/think/topics/unsupervised-learning
[Accessed 22 12 2024].

innoplexus, n.d. [Online]
Available at: https://www.innoplexus.com/blog/how-artificial-intelligence-works/
[Accessed 22 12 2024].
jannade, v., n.d. spiceworks. [Online]
Available at: https://www.spiceworks.com/tech/artificial-intelligence/articles/what-is-ml/
[Accessed 22 12 2024].

javatpoint, 2020. javatpoint. [Online]
Available at: https://www.javatpoint.com/artificial-intelligence-ai
[Accessed 22 12 2024].

kumar, r., n.d. [Online]
Available at: https://www.semanticscholar.org/paper/Sentiment-analysis-from-social-mediain-crisis-Singh-Kumar/8edd2dbf4fade631ac6746c6c310052d4ee8a8a3
[Accessed 24 12 2024].

oh, j., 2024. pubmed central. [Online]
Available at: https://pmc.ncbi.nlm.nih.gov/articles/PMC10794729/
[Accessed 24 12 2024].

Patel, A., 2019. SSRN. [Online]
Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3349572
[Accessed 23 12 2024].

peterfoy, 2017. Q. [Online]
Available at: https://blog.mlq.ai/nlp-sentiment-analysis-logistic-regression/#:~:text=In%20the%20context%20of%20sentiment,1%20and%20negative%20as%200.
[Accessed 23 12 2024].

repustate, n.d. repustate. [Online]
Available at: https://www.repustate.com/blog/aspect-based-sentiment-analysis/
[Accessed 23 12 2024].

stryker, j., n.d. ibm. [Online]
Available at: https://www.ibm.com/think/topics/natural-language-processing
[Accessed 22 12 2024].

thecleverprogrammer, 2020. thecleverprogrammer. [Online]
Available at: https://thecleverprogrammer.com/2020/08/21/emotion-detection-model-withmachine-learning/
[Accessed 23 12 2024].

valleywood, n.d. medium. [Online]
Available at: https://medium.com/nerd-for-tech/aspect-based-sentiment-analysis5ac9f15cc678
[Accessed 23 12 2024].

banoula, M., 2024. simplilearn. [Online]
Available at: https://www.simplilearn.com/tutorials/deep-learning-tutorial/what-is-tensorflow
[Accessed 20 1 2025].

domino, n.d. domino. [Online]
Available at: https://domino.ai/data-science-dictionary/sklearn
[Accessed 20 1 2025].

geeksforgeeks, 2025. geeksforgeeks. [Online]
Available at: https://www.geeksforgeeks.org/data-visualization-with-python-seaborn/
[Accessed 20 1 2025].

geeksforgeeks, n.d. geeksforgeeks. [Online]
Available at: https://www.geeksforgeeks.org/python-data-analysis-using-pandas/
[Accessed 20 1 2025].

gichere, f., 2023. medium. [Online]
Available at: https://becominghuman.ai/sentiment-analysis-of-app-reviews-a-comparison-of-bert-spacy-textblob-and-nltk-9016054d54dc
[Accessed 20 1 2025].

kilmen, s., 2022. okan. [Online]
Available at: https://okan.cloud/posts/2022-01-16-text-vectorization-using-python-tf-idf/
[Accessed 20 1 2025].

naik, s., 2024. olibr. [Online]
Available at: https://olibr.com/blog/what-is-numpy-what-are-its-features-and-applications/
[Accessed 20 1 2025].

sidak, k., 2023. codefinity. [Online]
Available at: https://codefinity.com/blog/A-Comprehensive-Guide-to-Text-Preprocessing-with-NLTK
[Accessed 20 1 2025].
