Sem Project
Submitted by
BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE ENGINEERING
R.M.D. ENGINEERING
COLLEGE
(An Autonomous Institution)
KAVARAIPETTAI - 601206
APRIL 2024
BONAFIDE CERTIFICATE
Certified that this project report titled “EMOTIONAL CHAT BOT”, is a bonafide
SIGNATURE SIGNATURE
INTERNAL EXAMINER
ACKNOWLEDGEMENT
TABLE OF CONTENTS
ABSTRACT
1. INTRODUCTION
2. LITERATURE REVIEW
3. METHODOLOGY
4. RESULT
5. CONCLUSION
REFERENCES
CHAPTER 1: INTRODUCTION
1. Introduction
The development of textual conversational agents, or chatbots, has gained
tremendous traction in both academia and industry in recent years. Chatbots
are now widely used as agents that communicate with humans in services such
as booking assistance and customer service, and as personal companions.
Emotion recognition is an active research topic in the field of Human-
Computer Interaction. It has potentially wide applications, such as
interfaces with robots, banking, call centers, car on-board systems and
computer games. Many chatbot systems have been created for various purposes,
but very few of them are designed to gather feedback about the system and the
organization indirectly from the user. This is the motivation for the
Emotionally Intelligent College Enquiry Chatbot System in our research work.
The proposed idea consists of developing an intelligent chatbot system for
college enquiry purposes using the web-based chatbot agent platform
DialogFlow, generating a valid response to the user, and retrieving the
conversation history from Google log files for sentiment and emotion
classification of the user, in order to determine his or her interest in
taking admission to the college.
Human sentiment can be measured as a score between -1 and 1, which is also
called the polarity of the text. For detailed emotion recognition beyond
Positive, Negative or Neutral sentiment, we have used a Twitter dataset which
contains tweets with corresponding emotions. The dataset has 40,000 rows, and
three classifiers are trained on it: Multinomial Naive Bayes, Logistic
Regression and K-Nearest Neighbors. The models are evaluated on their
accuracy, F1 score and confusion matrices, and the most accurate one is
selected. Furthermore, emotion classification is carried out on the
conversation history of users: each query is classified into one of seven
emotional classes (neutral, happy, excited, satisfied, not satisfied, boredom
and disgust). In the results dataset, each query is attached with a polarity,
an analyzed sentiment and the corresponding emotion, giving more detail on
the affective state of the user.
Journal of University of Shanghai for Science and Technology ISSN: 1007-6735
Volume 23, Issue 6, June - 2021 Page -69
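The polarity scale described above can be illustrated with a small sketch
that maps a score in the range -1 to 1 onto the three coarse sentiment
labels. The function name sentiment_label and the choice of zero as the
dividing point are assumptions made for this example; the report does not
state which thresholds were used.

```python
def sentiment_label(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to a coarse sentiment label.

    Assumption: scores above zero are Positive, below zero are Negative,
    and exactly zero is Neutral. The report does not give its thresholds.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie between -1 and 1")
    if polarity > 0:
        return "Positive"
    if polarity < 0:
        return "Negative"
    return "Neutral"
```

A query scored at 0.6 would be labelled Positive under this assumption, and
one scored at -0.2 would be labelled Negative.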
CHAPTER 2: LITERATURE REVIEW
2. Literature Review
Ms. Ch. Lavanya Susanna, R. Pratyusha, P. Swathi, P. Rishi
Krishna, V. Sai Pradeep, “COLLEGE ENQUIRY CHATBOT”,
International Research Journal of Engineering and Technology
(IRJET), 2020
This paper proposes to develop an algorithm that will be used to identify
answers to user-submitted questions, to develop a database where all the
related data will be stored, and to develop a web interface. As a result, the
user will not waste a lot of time searching for the appropriate notices.
Gustavo Assunção, Paulo Menezes, Institute of Systems and
Robotics, Coimbra, Portugal, “Speaker Awareness for Speech
Emotion Recognition” (2020)
In this paper, the authors evaluate a large-scale machine learning model for
the classification of emotional states. The model was trained for speaker
identification, and the aim is to verify that speech emotion recognition
(SER) improves when some of the speaker’s emotional prosody cues are
considered.
CHAPTER 3: METHODOLOGY
3. Methodology
Machine-based emotion classification of users has gained a lot of attention
in recent times, especially from social media companies for business uses
such as ad recommendations. The modules used in this methodology are chatbot
design, conversation history retrieval, data pre-processing, feature
engineering, sentiment analysis, emotion classification and exploratory data
analysis. The proposed methodology is shown in Figure 2.
3.1 Data Pre-processing
Data preprocessing is a technique in which we transform the raw data into a
useful, clean and efficient format for training the models. In our module, we
transform the raw conversation history downloaded from Google log files into
clean data. Initially it contained 366 rows and 171 columns, of which only 2
columns, ‘textPayload’ and ‘timestamp’, were extracted for our purposes. All
the rows containing null values in the ‘textPayload’ column are dropped, and
the data frame is then ready for the sentiment analysis part. In addition, we
loaded a preprocessed Twitter emotion dataset for the training phase (for
training the model for emotion classification), as shown in Figure 2. It
contained 40,000 rows and 5 columns, from which the 2 columns ‘emotion’ and
‘content’ are retrieved for further use. The tweets are cleaned using regular
expressions that remove mentions, hashtags, retweets, hyperlinks and
punctuation marks. Stopwords are removed from the Twitter emotion data using
the NLTK toolkit. Lemmatization, the process of converting words to their
base form, is then applied to all the words in the ‘content’ column to
increase the accuracy of the model.
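The cleaning steps above can be sketched as follows. This is a minimal
illustration using only the Python standard library, not the project's actual
code. The function names drop_empty_payloads and clean_tweet, the regular
expressions, and the tiny STOPWORDS set are assumptions made for the example.
The report uses NLTK's stop-word list, and lemmatization is left out here
because it depends on NLTK's downloadable corpora.

```python
import re
import string

# Hypothetical stand-in for NLTK's English stop-word list (assumption).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "and", "of", "in", "for"}


def drop_empty_payloads(rows):
    """Keep only 'textPayload' and 'timestamp'; drop rows with a null payload."""
    return [
        {"textPayload": row["textPayload"], "timestamp": row.get("timestamp")}
        for row in rows
        if row.get("textPayload") is not None
    ]


def clean_tweet(text):
    """Remove retweet markers, links, mentions, hashtags, punctuation, stopwords."""
    text = text.lower()
    text = re.sub(r"\brt\b", " ", text)                   # retweet marker
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # hyperlinks
    text = re.sub(r"[@#]\w+", " ", text)                  # mentions and hashtags
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)
```

Links are removed before punctuation is stripped so that a URL is deleted as
a whole rather than being broken into leftover fragments.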
3.4 Multinomial Naive Bayes Classification
Multinomial Naive Bayes is a probabilistic learning method based on Bayes’
theorem. It predicts the tag of a text (in our case, the ‘content’ column
data) by computing the probability of each tag for a given sample and then
giving the tag with the highest probability as output. Bayes’ theorem
calculates the probability of an event based on prior knowledge of conditions
related to that event; here, the probability of an emotion is calculated from
the occurrence of the text’s words with that emotion in the training dataset.
It is based on the following equation.
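For reference, Bayes’ theorem in its standard form is given here, writing A
for the predictor (the words of the text) and B for the class (the emotion),
the same letters used in the classification steps of this section. Under the
naive independence assumption the likelihood factorizes over the individual
words; the symbols w_i and n (the i-th word and the number of words in the
text) are introduced here for illustration only.

```latex
P(B \mid A) = \frac{P(A \mid B)\, P(B)}{P(A)},
\qquad
P(A \mid B) = \prod_{i=1}^{n} P(w_i \mid B)
```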
4) The features of the test data are compared with those of the training data
in the following steps to classify the emotion in the test data:
1. Naive Bayes creates a frequency table of the training data, listing the
count of every word against the corresponding emotion.
2. It finds the probability of each word for each emotion and creates a
likelihood table.
3. It assumes that the effect of a predictor (A) on a given class (B) is
independent of the values of the other predictors.
4. It computes the posterior probability for each emotion using Bayes’
theorem.
5. The emotion with the highest posterior probability is taken as the outcome
for the text.
6. The classified emotion of the string is returned.
5) Repeat steps 1 to 4 until the whole training set has been processed.
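The classification steps listed above can be sketched in plain Python. This
is an illustrative sketch rather than the project's implementation. The
function names train and classify and the toy sentences are hypothetical.
Log-probabilities are used to avoid numerical underflow, and add-one
(Laplace) smoothing is an added assumption so that a word never seen with one
emotion does not force that emotion's probability to zero.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Build the frequency table from (text, emotion) pairs."""
    word_counts = defaultdict(Counter)  # emotion -> word -> count
    class_counts = Counter()            # emotion -> number of training texts
    vocab = set()
    for text, emotion in samples:
        words = text.split()
        word_counts[emotion].update(words)
        class_counts[emotion] += 1
        vocab.update(words)
    return word_counts, class_counts, vocab


def classify(text, word_counts, class_counts, vocab):
    """Return the emotion with the highest posterior probability."""
    total_texts = sum(class_counts.values())
    best_emotion, best_score = None, -math.inf
    for emotion in class_counts:
        # Log prior P(B) for this emotion.
        score = math.log(class_counts[emotion] / total_texts)
        total_words = sum(word_counts[emotion].values())
        for word in text.split():
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Smoothed likelihood P(word | emotion); smoothing is an assumption.
            likelihood = (word_counts[emotion][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_emotion, best_score = emotion, score
    return best_emotion


samples = [
    ("love this college", "happy"),
    ("great campus great staff", "happy"),
    ("fees are too high", "not satisfied"),
    ("bad hostel food", "not satisfied"),
]
model = train(samples)
print(classify("great college", *model))
print(classify("bad food", *model))
```

Because the independence assumption lets the likelihood be written as a
product over words, the posterior for each emotion reduces to a sum of
per-word log terms plus the log prior, which is what the inner loop
accumulates.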
CHAPTER 4: RESULT
The results obtained from our proposed system are explained below. The
overall performance of the system is evaluated by accuracy and F1 score.
Accuracy is calculated as the total number of correctly classified emotions
divided by the total number of emotions:
\text{Accuracy} = \frac{\text{number of correctly classified emotions}}{\text{total number of emotions}} \qquad (2)
The accuracy of logistic regression is 71.5% and that of k-NN is 59%; the
maximum accuracy, 73.8%, is achieved by Naive Bayes. The confusion matrix is
evaluated for the Naive Bayes model, and a heatmap of that confusion matrix
is generated, displayed in percentages.
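The evaluation measures named above can be sketched in a few lines of Python.
The labels and predictions below are made-up toy values, not results from the
project, and the function names accuracy, confusion_matrix and f1_score are
hypothetical helpers written for this example. The F1 score is computed per
emotion as the harmonic mean of precision and recall.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Correctly classified emotions divided by the total number of emotions."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true emotions, columns are predicted emotions."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def f1_score(y_true, y_pred, label):
    """Harmonic mean of precision and recall for a single emotion."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data for illustration only (assumption).
y_true = ["happy", "happy", "neutral", "disgust"]
y_pred = ["happy", "neutral", "neutral", "disgust"]
labels = ["happy", "neutral", "disgust"]

print(accuracy(y_true, y_pred))
print(confusion_matrix(y_true, y_pred, labels))
print(f1_score(y_true, y_pred, "happy"))
```

Dividing each row of the confusion matrix by its row total gives the
percentage view that a heatmap would display.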
Acknowledgements
I would like to take this opportunity to thank my internal guide, Prof.
Jameer Kotwal, for giving me all the help and guidance I needed. I am
grateful for their kind support, and their valuable suggestions were very
helpful. I am also grateful to Prof. Archana Chaugule, Head of the Computer
Engineering Department, Pimpri Chinchwad College of Engineering and Research,
for her indispensable support and suggestions. Finally, my special thanks go
to all the staff members for providing various resources for the project,
such as a laboratory with all the needed software platforms and a continuous
Internet connection.
REFERENCES
[1] Ms. Ch. Lavanya Susanna, R. Pratyusha, P. Swathi, P. Rishi Krishna, V. Sai
Pradeep, “COLLEGE ENQUIRY CHATBOT”, International Research Journal of
Engineering and Technology (IRJET), 2020.
[2] Gustavo Assunção, Paulo Menezes, Fernando Perdigao, “Speaker Awareness
for Speech Emotion Recognition”, Special Focus Paper, 2020.
[3] “Intelligent ChatBot System using Artificial Intelligence and Deep
Learning”, International Research Journal of Engineering and Technology
(IRJET), 2020.
[4] Chat-Bot”, International Research Journal of Engineering and Technology
(IRJET), 2018.
[5] Harimi, A. Shahzadi, A.R. Ahmadyfard and Kh. Yaghmaie, Department of
Electrical Engineering and Robotics, Shahrood University of Technology, Iran,
09 February 2013.