FOA Project Report: Basic Conversational Chatbot - Robo
Introduction
Chatbots are software programs designed to
simulate human conversation through voice or text
interactions. They are increasingly used across
various industries for customer service,
automation, and user engagement. This project
focuses on creating a basic chatbot capable of
holding a simple conversation with users using
natural language processing (NLP) techniques.
Our chatbot, "Robo," responds to users based on
their input by analyzing similarities between their
queries and the given dataset of conversational
text. We implemented the chatbot using Python
and several libraries for text processing and
vectorization.
Project Overview
Technology Stack
o Python, with the nltk and sklearn libraries for
text processing and machine learning.
o TF-IDF Vectorization (TfidfVectorizer) to convert
text into a numerical format.
o Cosine Similarity to compare the user's input
against the dataset sentences and
generate an appropriate response based on the
dataset.
Dataset
The chatbot is built using a dataset that consists of
conversational dialogues covering everyday topics
like greetings, the weather, school, and social
activities. The dataset was formatted as plain text,
with each dialogue split into individual sentences.
Sample Dataset Content:
hi, how are you doing?
i'm fine. how about yourself?
i'm pretty good. thanks for asking.
The dataset includes a variety of casual
conversations that simulate real-life scenarios,
allowing the chatbot to provide relevant responses
based on user input (dialogs).
Implementation Details
Code Breakdown
Importing Libraries:
We used Python's nltk and sklearn libraries for
natural language processing and machine
learning. The code begins by importing
necessary packages such as TfidfVectorizer for
vectorization and cosine_similarity for
comparing input text to the dataset.
Tokenization and Preprocessing:
The text is split into sentences using NLTK's
sent_tokenize() function and into words using
word_tokenize(). We applied lemmatization to
reduce words to their base forms using
WordNetLemmatizer. Special characters and
punctuation were removed to ensure clean
input data.
Lemmatization and Keyword Matching:
A function LemNormalize() was created to
preprocess user input by tokenizing,
lowercasing, and lemmatizing it. We defined a
set of greeting keywords (e.g., "hello", "hi",
"hey") and set the bot to respond with
predefined replies when these words were
detected.
Response Generation:
For non-greeting inputs, the chatbot appends
the user’s input to the list of tokenized
sentences. Using TfidfVectorizer, the chatbot
converts the input into numerical vectors and
calculates cosine similarity between the input
and the dataset sentences. The chatbot then
responds with the sentence having the highest
similarity score.
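The retrieval step could be sketched like this; it is a simplified version that vectorizes raw sentences with English stop-word removal, whereas the report's version presumably passes LemNormalize as the tokenizer:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def response(user_input, sent_tokens):
    """Return the dataset sentence most similar to the user's input."""
    sent_tokens.append(user_input)            # temporarily add the query
    tfidf = TfidfVectorizer(stop_words='english').fit_transform(sent_tokens)
    # Similarity of the query (last row) against every dataset sentence
    scores = cosine_similarity(tfidf[-1], tfidf[:-1]).flatten()
    sent_tokens.pop()                         # restore the dataset
    best = scores.argmax()
    if scores[best] == 0:                     # no overlap with any sentence
        return "I am sorry! I don't understand you"
    return sent_tokens[best]
```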
Main Features
Handling Greetings:
The chatbot recognizes greetings like "hello" or
"hi" and responds with a random greeting from
a list of predefined options.
Conversation Simulation:
For more complex inputs, the chatbot uses
cosine similarity to retrieve the most
contextually relevant response based on the
dataset.
Challenges Faced
Some of the challenges encountered during the
project included:
Dataset Preprocessing:
Cleaning and tokenizing the text dataset
required careful handling of punctuation and
special characters to ensure accurate text
comparison.
Response Generation Accuracy:
The chatbot relies on cosine similarity and TF-
IDF scores, which work well for simple
conversation but may not always generate
responses that sound natural for more complex
inputs. Future work could involve integrating
more advanced NLP models, such as
transformer-based models like GPT.
Error Handling:
When the chatbot does not find a relevant
match for the user input, it returns a default
response ("I am sorry! I don't understand
you"). Improving this fallback response
mechanism can be an area for further
development.
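One direction for improving the fallback, sketched below, is a confidence threshold on the best similarity score, so near-zero matches also trigger a more helpful message (the 0.2 cutoff is an illustrative value, not something the report specifies):

```python
def respond_or_fallback(scores, sentences, threshold=0.2):
    """Pick the best-matching sentence, or return a helpful fallback
    message when even the best cosine-similarity score is below
    `threshold` (an untuned, illustrative default)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best] < threshold:
        return ("I am sorry! I don't understand you. "
                "Try rephrasing, or ask me about greetings, "
                "the weather, or school.")
    return sentences[best]
```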
Conclusion
In this project, we successfully built a basic
conversational chatbot that simulates human-like
conversation using natural language processing
techniques. Although the chatbot is basic, it
demonstrates the foundational principles of how
modern chatbots work.
Future Enhancements:
Expanding the dataset to include more diverse
conversations.
Incorporating advanced NLP techniques like
deep learning for improved response
generation.
Enhancing the error handling to make the
chatbot more interactive when it fails to
understand user input.
References
1. NLTK Documentation:
NLTK is a leading platform for building Python
programs to work with human language data. It
provides tools for tokenizing, parsing,
classification, and semantic reasoning.
NLTK Documentation link: https://www.nltk.org/
2. Scikit-learn Documentation:
Scikit-learn is a machine learning library in
Python, providing efficient tools for data mining
and data analysis. It was used in this project
for vectorization (TF-IDF) and calculating cosine
similarity between text samples.
Scikit-learn Documentation link: https://scikit-learn.org/stable/
3. TF-IDF Explained:
TF-IDF (Term Frequency-Inverse Document
Frequency) is a numerical statistic that is
intended to reflect how important a word is to
a document in a collection or corpus. It is often
used as a weighting factor in information
retrieval and text mining.
TF-IDF Article: https://en.wikipedia.org/wiki/Tf%E2%80%93idf
4. Cosine Similarity Explained:
Cosine similarity is a metric used to determine
how similar two vectors are by measuring the
cosine of the angle between them. It is widely
used in natural language processing to
compare documents or sentences.
Cosine Similarity Article: https://en.wikipedia.org/wiki/Cosine_similarity
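As a quick sanity check of this definition, cosine similarity can be computed directly from the dot product and the two vector norms:

```python
import math

def cosine(a, b):
    """cos(theta) = (a . b) / (|a| * |b|) for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction score 1.0 regardless of length, and orthogonal vectors score 0.0, which is why it suits TF-IDF document comparison.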