Chatbots
1 Abstract
Chatbots, or conversational agents, are the next important technological leap in the field of conversational services, enabling a device to communicate with a user by receiving user requests in natural language. The device uses various fields of artificial intelligence and machine learning to reply to the user with machine-driven responses. This Systematic Literature Review presents a detailed study of a number of recent chatbot systems/papers developed in numerous domains. These recent papers have been reviewed with special attention to the kind of data given to those systems and the domains in which these systems were developed, among other parameters, in order to understand the recent trends in the development of chatbot systems. Throughout the course of the literature review, well-known databases in the field were explored to include relevant research up until the year 2022. Several studies were evaluated for this review, and 10 of them were examined in thorough detail.
2 Keywords
Chatbot testing; algorithm testing; NLP; ambiguity; verification; validation;
machine learning
3 Introduction
Increased computing power paved the way for new technological advances. Ar-
tificial intelligence has played a very important role in the advancement of these
technologies. One of the most important applications of artificial intelligence
is natural language processing. Natural language processing is a method of
making machines and computers understand human language [12]. Artificial intelligence (AI) is increasingly integrated into our daily lives through the creation and analysis of intelligent software agents (so-called intelligent agents). Intelligent agents can perform a variety of tasks, from simple routine work to advanced operations. Chatbots are a classic example of an AI system and one of the most basic and comprehensive examples of intelligent human-computer interaction (HCI). A chatbot is a computer program that reacts to text and speech like an intelligent entity and, through natural language processing (NLP), understands one or more human languages [13]. A chatbot is defined as "a computer program designed to simulate a conversation with a human user, especially over the Internet." The main purpose of chatbots is to enable computers to exchange natural language with humans.
The remainder of this paper is organized as follows: Section 4 discusses related work, Section 5 traces the evolution of chatbots, Sections 6 and 7 cover chatbot technologies and algorithms, Section 8 presents the literature review, and conclusions are summarized in Section 9.
4 Related Work
A chatbot can be defined as a computer program that emulates a conversation with a human using text messages, navigation buttons, or simulated voice to provide a particular service, most often within a messenger application [2]. Previous literature on various aspects of chatbots has focused on chatbot design and implementation, history and background, evaluation methods, and applications of chatbots in specific fields. In particular, the authors of [5] conducted a systematic literature search and quantitative research related to chatbots. Finally, they expressed concern about the amount of published material and stressed the importance of interdisciplinarity. In [17], the author compared the features and technical requirements of the 11 most common chatbot application systems. Research conducted by [1] included an analysis of literature discussing the history, technology, and applications of chatbots. Tracing the historical development from generative ideas to the present, the authors highlighted potential shortcomings at each point. After presenting a comprehensive classification scheme, the authors discussed key implementation technologies. Finally, they covered common architectures for modern chatbots and the leading platforms for creating them. The authors concluded that more research is needed on existing chatbot platforms and the ethical issues associated with chatbots. This study [6] examined recent advances in chatbots that use artificial intelligence and natural language processing; the main challenges, the limitations of current work, and recommendations for future research are all clearly highlighted. The authors of [20] proposed a Robust Answer-Driven Assistant (RADA) built on a chatbot framework. It consists of an ensemble of entity recognition, entity prediction, question answering models, and dialogue systems. A quantitative experiment was conducted that compared RADA against the prior art on the Web Questions dataset, and the results demonstrate the efficacy of RADA compared to other methods under the F1 score metric. In-depth research has been conducted on recent chatbot systems/papers developed in various fields. In order to understand recent trends in the development of chatbot systems, particular attention was paid to aspects such as the kind of knowledge given to these systems and the domains in which these systems were developed; all of the selected articles were reviewed with this in mind. The authors of [4] described the AI concepts needed to build an intelligent conversational agent based on deep learning models and also presented a functional architecture for building an intelligent health support chatbot. The review in [22] notes that deep learning and reinforcement learning architectures are the most commonly used techniques to understand user queries and generate appropriate responses. It also points out that the Twitter dataset (open domain) was the most common dataset used for evaluation, followed by the Airline Travel Information Systems (ATIS) dataset (closed domain) and the Ubuntu Dialogue Corpus (technical support). The same SLR shows that open domain, airline travel, and technical support are the most common domains for chatbots. Additionally, the most commonly used metrics to evaluate chatbot performance (in order of popularity) turned out to be accuracy, F1 score, BLEU (Bilingual Evaluation Understudy), recall, human rating, and precision.
5 Evolution of Chatbot
5.1 First Chatbot introduced
In 1950, Alan Turing asked the question "Can machines think?" Turing conceived of the problem as an "imitation game" (now called the Turing Test) in which an "interviewer" asks human and machine subjects questions about a human-related matter; if the interviewer cannot tell the human and the machine apart, the machine can be said to think. In 1966, Joseph Weizenbaum at MIT created the first chatbot that arguably came close to mimicking a human: ELIZA. Based on the input, ELIZA would identify keywords and match those keywords against a set of pre-programmed rules to trigger appropriate responses. Since ELIZA, progress has been made in the development of increasingly intelligent chatbots. In 1972, Kenneth Colby at Stanford created PARRY, a bot that pretended to be a paranoid schizophrenic. In 1995, Richard Wallace created A.L.I.C.E., a significantly more complex bot that generated responses by matching sample inputs against (input, output) pairs stored in documents in a knowledge base. These documents were written in Artificial Intelligence Markup Language (AIML), an extension of XML that is still in use. ALICE is a three-time winner of the Loebner Prize, a competition held each year that runs a form of the Turing Test and recognizes the most intelligent chatbot.[?]
Advances in machine learning technology and natural language processing tools, combined with the availability of computational power, have led to new frameworks and algorithms for implementing "sophisticated" chatbots without relying on rules or pattern recognition techniques, and have promoted the commercial use of chatbots.[?]
to ensure the semantic framework is complete and unambiguous.
• Data Source: Stores and retrieves the information and data used by the dialog manager. A chatbot's data source can be internal or external. Internal data sources can be used in AIML (Artificial Intelligence Markup Language) templates or rule structures to understand user requests and respond accordingly. A chatbot can build a database from scratch or use an existing database with its domains and capabilities.
• Response Generator: Once the set of candidate responses is created after the action is taken, the appropriate response is provided. Based on how responses are generated, chatbots can be categorized into two types of models: retrieval-based models and generative models.
6 Chatbot Technologies
This section briefly discusses the techniques involved, which consist of three primary phases: pre-processing (NLP), processing (NLU), and generation (NLG), in the following subsections.
6.1.1 Embedding
After a preprocessing step, the text data needs to be converted into numeric
form so that the computer can understand the meaning of the words or phrases.
This concept is called embedding (encoding or vectorization). There are several types of embedding, e.g., character embeddings, word embeddings, and sentence embeddings. Among these, word embedding is the most commonly used; it is a language modeling technique used to map words to vectors of real numbers, representing words or sentences in a multidimensional vector space.
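To make this concrete, the following minimal Python sketch (not taken from any of the reviewed systems) uses a hand-crafted toy embedding table and cosine similarity to show how semantically related words end up close together in the vector space; real chatbots would learn such vectors from large corpora.

```python
# Minimal illustration: words mapped to hand-crafted 3-dimensional vectors,
# with cosine similarity showing how related words sit close together in
# the vector space. Real embeddings are learned, not hand-assigned.
import numpy as np

embeddings = {
    "king":  np.array([0.80, 0.65, 0.10]),
    "queen": np.array([0.78, 0.70, 0.12]),
    "apple": np.array([0.10, 0.05, 0.90]),
}

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors (1.0 = same direction)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # much lower
```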
After the user's intent has been identified, it is necessary to extract entities (named or domain-specific) that serve as the arguments or constraints of that intent, a task also known as slot filling. Slot filling is primarily concerned with extracting the relevant information from the user query; this task, also called entity extraction, is required by the system to process the query further. For example, in a restaurant reservation scenario, given the sentence "Are there any French restaurants in downtown Toronto?" as an input, the task is to correctly output, or fill, the following slots: (cuisine: French) and (location: downtown Toronto). Slot filling is generally considered a sequence tagging problem; each relevant word token is tagged with the respective slot name using the BIO (Begin, Inside, Outside) convention or slot-value pair techniques. It is commonly used in task-oriented chatbots. Several studies have been conducted on slot filling to accurately obtain meaningful semantic chunks from text, using biLSTMs, CNNs, or combinations with other approaches.
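As an illustration of the BIO convention described above, the following Python sketch hand-assigns tags to the restaurant example and decodes them into slot-value pairs. In a real system a trained tagger (e.g. a biLSTM) would predict the tags; the decoding helper here is purely illustrative.

```python
# Illustrative only: BIO tags for the restaurant example, hand-assigned
# rather than predicted by a model, plus a simple decoder into slots.
tokens = ["Are", "there", "any", "French", "restaurants", "in",
          "downtown", "Toronto", "?"]
tags   = ["O",   "O",     "O",   "B-cuisine", "O",        "O",
          "B-location", "I-location", "O"]

def extract_slots(tokens, tags):
    """Collect B-/I- tagged spans into (slot_name, value) pairs."""
    slots, name, words = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if name:
                slots.append((name, " ".join(words)))
            name, words = tag[2:], [token]
        elif tag.startswith("I-") and name:
            words.append(token)
        else:
            if name:
                slots.append((name, " ".join(words)))
            name, words = None, []
    if name:
        slots.append((name, " ".join(words)))
    return slots

print(extract_slots(tokens, tags))
# [('cuisine', 'French'), ('location', 'downtown Toronto')]
```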
6.2.3 Joint task of intent identification and slot filling (Joint model)
Unfortunately, a pipelined approach allows errors to propagate, as each module depends on the previous module's output. To address this issue, some researchers combined the two tasks (detecting the user's intent and labeling the slots) into joint intent detection (ID) and slot filling (SF) models that handle both tasks in a text concurrently. The authors of [8] employ an end-to-end hierarchical multi-task model based on sequence-to-sequence learning with an encoder-decoder architecture. This model extracts relevant data from user utterances and represents it hierarchically through CNNs and RNNs (biLSTM and biGRU models). The representations learned from these models are shared between intent detection and slot filling. To handle the sequence labeling task of slot filling, the authors employ a probabilistic CRF classifier in place of the traditional softmax classifier at the output layer. The CRF accurately captures label dependencies, vastly improving the success rate of the slot filling task and, in turn, the overall system performance on all datasets. One of the limitations of joint models is that they perform considerably well on only one of the two tasks at a time, owing to the varied values of the trade-off parameter between the loss functions of intent detection and slot filling. This parameter defines whether the model is best trained for intent detection or for slot filling, making such models impractical when both ID and SF must perform well. Therefore, to overcome this issue, [27] proposed combining intent detection and slot filling into a single sequence-labeling task, using an attention-based encoder-decoder model with a fresh tag scheme. In this model, intent information is provided to the slot-filling task, and a single word may carry different tags, which can optimize performance on both tasks.
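The following PyTorch sketch illustrates the general shape of such a joint model: a shared biLSTM encoder feeding an utterance-level intent head and a token-level slot-tagging head. Layer sizes, the toy inputs, and the use of a plain linear/softmax slot head instead of a CRF are illustrative assumptions, not the exact architecture of [8] or [27].

```python
# Hedged sketch of a joint intent-detection / slot-filling model:
# a shared biLSTM encoder with two heads (illustrative sizes only).
import torch
import torch.nn as nn

class JointIntentSlotModel(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim,
                 num_intents, num_slot_tags):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        # Intent head: sentence-level classification from the final states.
        self.intent_head = nn.Linear(2 * hidden_dim, num_intents)
        # Slot head: token-level tagging over the encoder outputs.
        self.slot_head = nn.Linear(2 * hidden_dim, num_slot_tags)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)             # (B, T, E)
        outputs, (hidden, _) = self.encoder(embedded)    # outputs: (B, T, 2H)
        sentence_repr = torch.cat([hidden[0], hidden[1]], dim=-1)  # (B, 2H)
        intent_logits = self.intent_head(sentence_repr)  # (B, num_intents)
        slot_logits = self.slot_head(outputs)            # (B, T, num_slot_tags)
        return intent_logits, slot_logits

# Toy forward pass with random token ids.
model = JointIntentSlotModel(vocab_size=100, embed_dim=32, hidden_dim=64,
                             num_intents=5, num_slot_tags=9)
batch = torch.randint(0, 100, (2, 8))          # 2 utterances, 8 tokens each
intent_logits, slot_logits = model(batch)
print(intent_logits.shape, slot_logits.shape)  # (2, 5) and (2, 8, 9)
```

During training, the intent and slot losses would typically be combined as loss = intent_loss + λ · slot_loss, with λ playing the role of the trade-off parameter discussed above.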
The generated text must be something the user can understand. Indeed, it must consider the history of the conversation and the user's context (NLU). There are two types of response generation approaches, namely, the retrieval-based (rule-based) approach and the generative-based approach. However, some studies combine these approaches to leverage both and enhance the chatbot's response generation.
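As a concrete illustration of the retrieval-based approach (assumed, not drawn from any specific reviewed system), the sketch below uses scikit-learn's TF-IDF vectorizer and cosine similarity to pick the stored answer whose question best matches the user query; the tiny question-answer knowledge base is invented for the example.

```python
# Retrieval-based response generation sketch: return the canned answer whose
# stored question is most similar to the user query under TF-IDF cosine
# similarity. The knowledge base below is a toy example.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

knowledge_base = {
    "what are the exam dates": "The final exam is on June 12.",
    "when are office hours": "Office hours are Tuesdays 2-4 pm.",
    "how do i register for the course": "Registration is open on the portal.",
}

questions = list(knowledge_base)
vectorizer = TfidfVectorizer()
question_vectors = vectorizer.fit_transform(questions)

def retrieve_response(user_query):
    query_vector = vectorizer.transform([user_query])
    scores = cosine_similarity(query_vector, question_vectors)[0]
    return knowledge_base[questions[scores.argmax()]]

print(retrieve_response("when is the exam?"))  # "The final exam is on June 12."
```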
7 Algorithms
7.1 Naïve Bayes:
The Naive Bayes algorithm attempts to classify text into specific categories, allowing chatbots to identify user intent and narrow the possible range of responses. Intent identification is one of the first and most important steps in a chatbot conversation, so it is essential that this algorithm works well. The algorithm is based on word commonalities; essentially, a particular word should carry more weight for a particular category [24]. The easiest way to test this algorithm is to use k-fold cross-validation. This involves training the chatbot with specific inputs and their corresponding classifications and using a test set to assess how often the chatbot classifies specific inputs correctly. The confusion matrix, accuracy, precision, and recall can be used to evaluate algorithm performance. One problem with the Naive Bayes algorithm is that it uses a "bag of words" approach: it treats the words of a sentence as an unordered collection and selects the most significant ones to determine the input's class. This means that it does not take word order into account, which can be a problem because certain rearrangements of the words can change the meaning of an input and hence its class. To overcome this, techniques such as n-grams can be used to preserve word order. Additionally, Explainable AI (XAI) [19] is a new technique used to "explain" machine learning and deep learning models and understand the rationale behind their decisions.
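A minimal sketch of this evaluation setup, assuming scikit-learn and an invented toy intent dataset: a Naive Bayes classifier over a bag of words extended with bigrams (one way to partially preserve word order), scored with k-fold cross-validation.

```python
# Naive Bayes intent classifier evaluated with k-fold cross-validation.
# The six-example intent dataset is invented purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

texts = [
    "book a table for two", "reserve a table tonight", "i want to book a restaurant",
    "what is the weather today", "will it rain tomorrow", "weather forecast please",
]
intents = ["booking", "booking", "booking", "weather", "weather", "weather"]

# ngram_range=(1, 2) adds bigrams so some word order is preserved,
# addressing the bag-of-words limitation noted above.
classifier = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())

# 3-fold cross-validation: train on two folds, test on the held-out fold.
scores = cross_val_score(classifier, texts, intents, cv=3)
print("accuracy per fold:", scores)
```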
7.3 Deep Neural Network:
Inspired by the human brain, neural networks consist of layers of interconnected
artificial neurons that communicate with each other. These neurons learn
features from data and work together to produce meaningful output. Neural
networks are data-intensive and require large amounts of data to learn patterns
and trends in the data. The algorithm can be tested to see whether the chatbot generates valid responses to input, maintains the flow of conversation, responds to user needs, and recognizes the characteristics of human language, and whether it can reasonably be deployed. This may mean that an
adaptation of the Turing test could be a suitable test method. The problem
with neural networks is the lack of explainability. Determining which neuron in
the network contributed to a prediction, or which neuron processed a particular feature, is not easy. In addition, neural networks tend to rely on the availability
of vast amounts of data to allow enough iterations of the learning process to
ensure that its responses are as valid as possible. Such large amounts of data are
not typically seen in chatbot environments. This has led to the use of efficient
algorithms such as LSTMs and RNNs that work well with text data.
A first-order model uses single words as states, while a third-order model uses groups of three words. As a result, higher-order chains more accurately represent the training data and have less variance, whereas lower-order chains are more random and produce more variable output. The performance of a Markov chain-based chatbot can be tested using grammar parsing, output analysis, and user feedback tests. Markov chains work by building statistically plausible and realistic responses. However, the probability and randomness involved in combining different Markov chains can lead to situations where the output does not make sense. Such situations should be watched for, and the algorithm retrained, to prevent the chatbot from producing unreadable output.
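The following short Python sketch (with an invented toy corpus) illustrates a second-order Markov chain of the kind described above: states are pairs of words, and the next word is sampled from the continuations observed in the training text.

```python
# Second-order Markov chain text generator; the training corpus is a toy
# example, and a higher ORDER would track the training data more closely.
import random
from collections import defaultdict

corpus = ("the chatbot answers questions about the course . "
          "the chatbot answers questions about exam dates . "
          "the chatbot helps students with the course .").split()

ORDER = 2  # higher order -> closer to the training data, less random output
chain = defaultdict(list)
for i in range(len(corpus) - ORDER):
    state = tuple(corpus[i:i + ORDER])
    chain[state].append(corpus[i + ORDER])

def generate(length=10, seed=("the", "chatbot")):
    state, output = seed, list(seed)
    for _ in range(length):
        choices = chain.get(state)
        if not choices:          # dead end: no continuation seen in training
            break
        output.append(random.choice(choices))
        state = tuple(output[-ORDER:])
    return " ".join(output)

print(generate())
```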
8 Literature Review
Initial findings were collected through a preliminary literature review to delineate and map the field of chatbots in education. One of the insights is that the last two years have seen a lot of activity in the booming space around educational chatbots. Based on this preliminary search experience, search terms, queries, and filters were constructed for the actual structured literature search, following guidelines for reporting systematic reviews and meta-analyses. From this point on, the SLR process, consisting of the three phases of planning, conducting, and reporting the literature review, begins. The following subsections describe each stage of the SLR.
8.1.2 Search Criteria
Based on the findings from the initial related work search, we derived the fol-
lowing search query: (Education OR Educational OR Learning OR Learner
OR Student OR Teaching OR School OR University OR Pedagogical) AND
Chatbot. It combines education-related keywords with the “chatbot” keyword.
Since chatbots are related to other technologies, the initial literature search
also considered keywords such as “pedagogical agents,” “dialogue systems,” or
“bots” when composing the search query. However, these increased the number
of irrelevant results significantly and were therefore excluded from the query in
later searches.
8.2.1 Analysis
We selected five databases (Scopus, Web of Science (WoS), ScienceDirect, IEEE Xplore, and ACM Digital Library) to identify articles for this study. Initially, our search string found over 3000 publications across these databases. The number of publication results from each database fluctuated due to the different strategies that the search engines used to find relevant articles. We then applied filters to refine the results, narrowing the search results to articles that matched our primary goals and research questions.
To analyze the identified publications and derive results according to the re-
search questions, full texts were coded, considering for each publication the ob-
jectives for implementing chatbots (RQ1), pedagogical roles of chatbots (RQ2),
their mentoring roles (RQ3), adaptation of chatbots (RQ4), as well as their
implementation domains in education (RQ5) as separate sets of codes. The codes for RQ2 (Pedagogical Roles) were adapted and refined in terms of their level of abstraction from an initial set of only two codes: 1) a code for chatbots in the learning role and 2) a code for chatbots in a service-oriented role. After coding a larger set of publications, it became clear that the code for service-oriented chatbots needed to be further distinguished, because it grouped, for example, automation activities together with activities related to self-regulated learning and thus could not be separated sharply enough from the learning role. After refining the code set in the next iteration into a learning role, an assistance role, and a mentoring role, it was possible to ensure the separation of the individual codes. This analysis led to different types of chatbots built on different platforms.
8.3 Results
In this section, we present and discuss the results of this literature review in
response to the identified RQs.
8.3.2 Pedagogical Roles
Regarding RQ2, it is important to consider the use of chatbots in terms of their
intended educational role. After analyzing the selected articles, we were able to
identify three different educational roles: learning, assisting, and mentoring. The supporting-learning role (Learning) uses chatbots as an educational tool to deliver content and skills, typically through conversational tasks [7]. Alternatively, learning can be supported by additional offers on top of classroom instruction. Conversations with such chatbots are meant to motivate students to look up vocabulary, check grammar, and build confidence in a foreign language. In the assisting role (Assisting), the chatbot's activities can be summarized as simplifying everyday student life, that is, relieving students of some or all of their routine tasks. This can be achieved by making information more readily available or by using chatbots to automate and simplify processes [7]. An example of this is the chatbot from [7], which answers common questions about the course, such as exam dates and office hours.
8.3.4 Domain
For RQ5, we identified many chatbot domains. These can be loosely divided into three domain categories (DC) based on their educational role: learner chatbots, assistance chatbots, and mentor chatbots. The Learner Chatbots category can be classified into seven domains: 1) language learning, 2) learning to code, 3) learning communication skills, 4) learning educational techniques, 5) learning cultural heritage, 6) learning law, and 7) learning mathematics. The Assisting Chatbots domain category covers chatbots that assist in educational roles and can be divided into four domains: 1) administrative assistance, 2) campus assistance, 3) course assistance, and 4) library assistance. An example of this can be seen in [10], where the student registration process shifts entirely into a conversation with a chatbot.
9 Conclusion
In this systematic literature review, we explored the current landscape of chatbots in education. We analyzed the different publications and domains of chatbots and grouped them, based on their pedagogical roles, into four domain categories. These pedagogical roles are the supporting-learning role (learning), the assisting role (assisting), and the mentoring role (mentoring). By focusing on objectives for implementing chatbots, we identified four main objectives: 1) skill enhancement, 2) efficiency of education, 3) students' motivation, and 4) availability of education. We concentrated on the relations between pedagogical roles and objectives for implementing chatbots and identified three main relations: 1) chatbots that improve skills and motivate students by supporting learning and teaching activities, 2) chatbots that make education more efficient by providing relevant administrative and logistical information to learners, and 3) chatbots that support multiple goals by mentoring students. We focused on chatbots incorporating the mentoring role and found that these chatbots are mostly concerned with three mentoring topics: 1) self-regulated learning, 2) life skills, and 3) learning skills, and three mentoring styles: 1) scaffolding, 2) recommending, and 3) informing.
References
[1] Eleni Adamopoulou and Lefteris Moussiades. An overview of chatbot tech-
nology. In IFIP International Conference on Artificial Intelligence Appli-
cations and Innovations, pages 373–383. Springer, 2020.
[2] Moneerh Aleedy, Hadil Shaiba, and Marija Bezbradica. Generating and
analyzing chatbot responses using natural language processing. Interna-
tional Journal of Advanced Computer Science and Applications, 10(9):60–
68, 2019.
[3] Eslam Amer, Ahmed Hazem, Omar Farouk, Albert Louca, Youssef Mo-
hamed, and Michel Ashraf. A proposed chatbot framework for covid-19.
In 2021 International Mobile, Intelligent, and Ubiquitous Computing Con-
ference (MIUCC), pages 263–268. IEEE, 2021.
[4] Soufyane Ayanouz, Boudhir Anouar Abdelhakim, and Mohammed
Benhmed. A smart chatbot architecture based nlp and machine learning for
health care assistance. In Proceedings of the 3rd International Conference
on Networking, Information Systems & Security, pages 1–6, 2020.
[5] Andréia Ana Bernardini, Arildo A Sônego, and Eliane Pozzebon. Chatbots:
An analysis of the state of art of literature. In Anais do I Workshop on
Advanced Virtual Environments and Education, pages 1–6. SBC, 2018.
[6] Guendalina Caldarini, Sardar Jaf, and Kenneth McGarry. A literature
survey of recent advances in chatbots. Information, 13(1):41, 2022.
[7] Ed de Quincey, Chris Briggs, Theocharis Kyriacou, and Richard Waller.
Student centred design of a learning analytics system. In Proceedings of
the 9th international conference on learning analytics & knowledge, pages
353–362, 2019.
[8] Mauajama Firdaus, Ankit Kumar, Asif Ekbal, and Pushpak Bhat-
tacharyya. A multi-task hierarchical approach for intent detection and
slot filling. Knowledge-Based Systems, 183:104846, 2019.
[9] Silvia Gabrielli, Silvia Rizzi, Sara Carbone, Valeria Donisi, et al. A chatbot-
based coaching intervention for adolescents to promote life skills: pilot
study. JMIR Human Factors, 7(1):e16762, 2020.
[10] Lukáš Galko, Jaroslav Porubän, and Jakub Senko. Improving the user expe-
rience of electronic university enrollment. In 2018 16th International Con-
ference on Emerging eLearning Technologies and Applications (ICETA),
pages 179–184. IEEE, 2018.
[14] Michael Pin-Chuan Lin and Daniel Chang. Enhancing post-secondary writ-
ers’ writing skills with a chatbot. Journal of Educational Technology &
Society, 23(1):78–92, 2020.
[15] Milja Milenkovic. The future is now - 37 fascinating chatbot statistics. Published 30 October 2019.
[16] Tatwadarshi P Nagarhalli, Vinod Vaze, and NK Rana. A review of cur-
rent trends in the development of chatbot systems. In 2020 6th Interna-
tional Conference on Advanced Computing and Communication Systems
(ICACCS), pages 706–710. IEEE, 2020.
[19] Wojciech Samek, Thomas Wiegand, and Klaus-Robert Müller. Explain-
able artificial intelligence: Understanding, visualizing and interpreting deep
learning models. arXiv preprint arXiv:1708.08296, 2017.
[20] Japa Sai Sharath and Rekabdar Banafsheh. Conversational question an-
swering over knowledge base using chat-bot framework. In 2021 IEEE 15th
International Conference on Semantic Computing (ICSC), pages 84–85.
IEEE, 2021.
[21] Heung-Yeung Shum, Xiao-dong He, and Di Li. From eliza to xiaoice: chal-
lenges and opportunities with social chatbots. Frontiers of Information
Technology & Electronic Engineering, 19(1):10–26, 2018.
[22] Sinarwati Mohamad Suhaili, Naomie Salim, and Mohamad Nazim Jambli.
Service chatbots: A systematic review. Expert Systems with Applications,
184:115461, 2021.
[23] Oanh Thi Tran and Tho Chi Luong. Understanding what the users say in
chatbots: A case study for the vietnamese language. Engineering Applica-
tions of Artificial Intelligence, 87:103322, 2020.
[24] V Vijayaraghavan, Jack Brian Cooper, et al. Algorithm inspection for
chatbot performance evaluation. Procedia Computer Science, 171:2267–
2274, 2020.
[25] Kyle Williams. Zero shot intent classification using long-short term memory
networks. In INTERSPEECH, 2019.
[26] Ziang Xiao, Michelle X Zhou, and Wai-Tat Fu. Who should be my team-
mates: Using a conversational agent to understand individuals and help
teaming. In Proceedings of the 24th International Conference on Intelli-
gent User Interfaces, pages 437–447, 2019.
[27] Cong Xu, Qing Li, Dezheng Zhang, Jiarui Cui, Zhenqi Sun, and Hao Zhou.
A model with length-variable attention for spoken language understanding.
Neurocomputing, 379:197–202, 2020.
Table 2: Comparison between various platforms