
Context-Aware Retrieval-Augmented Generation Based Conversational AI

DOMAIN: Artificial Intelligence - Deep Learning

Team Members:

Tanmay Odapally - 21BD1A053A

Harsh Rathi - 21BD1A052H

PVSS Pranav - 21BD1A053K

Sai Sujan Shamarthi - 21BD1A053N


Table of Contents
• Abstract
• Introduction
• Literature Survey
• Existing System
• Proposed System
• Hardware and Software Requirements
• Architecture
• UML Diagram
• Code Snippet
• Implementation
• References
• Conclusion
Abstract

In today's fast-paced environment, there's an increasing need for quick, accurate, and contextually relevant information.
Traditional Conversational Agents often fall short in providing truly helpful responses, especially when dealing with complex
scenarios. Our project is an innovative AI Conversational Agent developed to revolutionise conversations.

The Conversational Agent addresses this challenge by combining advanced AI technologies to create a more intelligent,
adaptive, and context-aware Conversational Agent. At its core, it utilizes the Retrieval-Augmented Generation (RAG)
approach, built on a large language model (Mistral 7B). This system is designed not only to answer questions but also to
understand the context of conversations, learn from interactions, and continuously improve its performance.

Additionally, a continuous learning mechanism via user feedback ensures the bot evolves and refines its responses over
time. The system also allows for document uploads, enabling the Conversational Agent to provide insights related to user-
specific content, further personalising the conversation.
Introduction
• Our project enhances conversational agents by using advanced AI that provides accurate, context-aware responses and improves with new data and feedback.

• Current chatbots struggle with understanding context and improving over time.

• Our project uses advanced techniques and feedback to help chatbots give better responses and continuously learn from user interactions.
What is a Conversational Agent?

1. Digital Assistant
A Conversational Agent is a computer program that simulates conversations with people, acting as a digital assistant to help answer questions and perform tasks.

2. Conversational AI
Conversational Agents leverage natural language processing and machine learning to engage in human-like dialogues, providing a more intuitive and interactive user experience.

3. Business Benefits
Chatbots are important for businesses as they can provide quick answers, save time, and improve customer service by automating routine interactions.
Understanding RAG

1. Retrieve: The RAG model first retrieves information relevant to the user's query from a large dataset or database, such as documents, articles, or other sources.

2. Augment: The retrieved information is then used to augment the input query or context, enhancing the model's understanding and its ability to generate more accurate and relevant responses.

3. Generate: Finally, the augmented input is passed through a generative model, such as a transformer, which produces the final response based on both the original input and the retrieved information.
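The sketch below illustrates this retrieve-augment-generate flow in Python. It is a minimal, illustrative example rather than the project's actual code: it assumes the sentence-transformers, faiss, and transformers packages, a tiny in-memory document list, and the mistralai/Mistral-7B-Instruct-v0.1 checkpoint as the generator.

```python
# Minimal retrieve-augment-generate sketch (illustrative; assumes the packages named above)
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import pipeline

documents = [
    "RAG combines a retrieval step with a generative language model.",
    "Mistral 7B uses grouped-query attention for faster inference.",
]

# 1. Retrieve: embed the documents and build a vector index over them
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = embedder.encode(documents, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_vectors.shape[1])
index.add(np.asarray(doc_vectors, dtype="float32"))

generator = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.1")

def answer(query, k=2):
    query_vec = embedder.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(query_vec, dtype="float32"), k)
    # 2. Augment: prepend the retrieved passages to the user's query
    context = "\n".join(documents[i] for i in ids[0])
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    # 3. Generate: produce the final response from the augmented prompt
    return generator(prompt, max_new_tokens=128)[0]["generated_text"]

print(answer("What attention mechanism does Mistral 7B use?"))
```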
Contextual Awareness

Storing Context: Our chatbot utilizes a memory buffer to store the conversation history, enabling it to recall previous interactions and provide contextually relevant responses.

Retrieving Context: During response generation, the chatbot accesses this memory buffer to understand the current conversation context, tailoring its answers to the ongoing discussion.

Continuous Learning: The memory buffer is dynamic and adapts as the conversation progresses, allowing the chatbot to continuously learn and refine its responses based on the latest context.
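As a rough sketch of such a memory buffer (plain Python, not the project's implementation): a fixed-size deque keeps the most recent turns and flattens them into a prompt prefix before each generation step. The class and method names are illustrative.

```python
from collections import deque

class ConversationMemory:
    """Rolling buffer of recent turns used as context for the next response."""

    def __init__(self, max_turns=10):
        # Oldest turns are discarded automatically once the buffer is full
        self.turns = deque(maxlen=max_turns)

    def add(self, user_msg, bot_msg):
        self.turns.append((user_msg, bot_msg))

    def as_context(self):
        # Flatten the stored turns into a prompt prefix for the generator
        return "\n".join(f"User: {u}\nAssistant: {b}" for u, b in self.turns)

memory = ConversationMemory(max_turns=5)
memory.add("What is RAG?", "RAG combines retrieval with generation.")
prompt = memory.as_context() + "\nUser: How do you keep track of context?\nAssistant:"
```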
Feedback Mechanism
Positive Feedback
When the chatbot provides a good answer, the user can give a
"thumbs up" to let the system know it was helpful.

Negative Feedback
If the chatbot's response is inaccurate or unhelpful, the user can
provide a "thumbs down" to indicate the need for improvement.

Continuous Learning
The feedback mechanism allows the chatbot to learn from its
mistakes and successes, continuously improving its responses over
time.
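One way such thumbs-up/thumbs-down signals could be captured is to append each event to a log that is later mined for weak answers. The file name and record schema below are illustrative assumptions, not the project's actual format.

```python
import json
import time

FEEDBACK_LOG = "feedback.jsonl"  # hypothetical append-only log file

def record_feedback(query, response, thumbs_up):
    """Store one feedback event so poorly rated answers can be reviewed later."""
    entry = {
        "timestamp": time.time(),
        "query": query,
        "response": response,
        "helpful": thumbs_up,
    }
    with open(FEEDBACK_LOG, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

# Example: a user clicks "thumbs down" on an unhelpful answer
record_feedback("Who won the 2024 final?", "I'm not sure.", thumbs_up=False)
```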
Literature Survey
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.
• The paper shows that combining pre-trained models with external memory improves knowledge access for complex tasks.
• The Retrieval-Augmented Generation (RAG) models achieve top results in NLP by integrating both stored and retrieved
knowledge.
• Author: Lewis, Patrick, et al.
• Date Published: 12 Apr 2021.
• This study was published as an arXiv preprint.
• Link: https://arxiv.org/pdf/2005.11401
Lost in the Middle: How Language Models Use Long Contexts
• The paper reveals that recent language models struggle to effectively use information in the middle of long input
contexts, showing a U-shaped performance curve with accuracy highest at the beginning or end.
• The study highlights strong positional biases in language models, limiting their effectiveness in tasks requiring long-
context processing, and suggests new evaluation methods to improve model robustness.
• Author: Liu, Nelson F., et al.
• Date Published: 6 Jul 2023.
• This study was published as an arXiv preprint.
• Link: https://cs.stanford.edu/~nfliu/papers/lost-in-the-middle.arxiv2023.pdf
Mistral 7B

• Mistral 7B v0.1 is a 7-billion-parameter language model designed for high performance and efficiency, surpassing Llama 2 13B on all tested benchmarks and Llama 1 34B in reasoning, math, and code tasks.
• It uses grouped-query attention (GQA) for faster inference and sliding window attention (SWA) for handling long
sequences with lower costs.
• Author: Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, et al.
• Date Published: 10 October 2023.
• This study was published as an arXiv preprint.
• Link: https://arxiv.org/pdf/2310.06825
Feedback-Based Self-Learning in Large-Scale Conversational AI Agents

• The paper introduces a system that improves conversational AI by automatically learning from user interactions, reducing
the need for manual data labelling.
• Using this self-learning approach, the system significantly cuts error rates and enhances performance by analysing and
correcting common mistakes.
• Author: Pragaash Ponnusamy, Alireza Roshan Ghias, Chenlei Guo, Ruhi Sarikaya
• Date Published: 6 Nov 2019.
• This study was published as an arXiv preprint.
• Link: https://arxiv.org/pdf/1911.02557
Existing System

Traditional Chatbots
Traditional language models generate responses based on their internal knowledge, which may be outdated or too general, limiting their ability to provide accurate and contextually relevant answers.

Limited and potentially outdated knowledge:
Traditional chatbots rely on static internal knowledge that can quickly become outdated. This limits their ability to provide current information or discuss recent events accurately.

Lack of contextual understanding:
These systems often struggle to grasp the full context of a conversation, leading to responses that may be factually correct but irrelevant or inappropriate for the specific situation.

Inconsistency in long conversations:
Maintaining coherence over extended dialogues is challenging for traditional chatbots. They may contradict themselves or lose track of previously discussed information.

Inability to learn or adapt:
Most traditional chatbots can't learn from interactions or update their knowledge base, making them inflexible and unable to improve over time.
Proposed System

RAG-based Chatbots
RAG-based chatbots combine the strengths of retrieval systems (access to up-to-date and specific information) with the generative capabilities of language models, leading to more accurate and contextually relevant answers.

Up-to-date information:
RAG-based chatbots can access and utilise the most current information from their retrieval systems, allowing them to provide answers based on the latest data available.

Improved accuracy:
By combining retrieved information with generative capabilities, these chatbots can produce more accurate responses, especially for factual queries or questions requiring specific knowledge.

Enhanced contextual relevance:
The retrieval component allows the system to find information that's most relevant to the user's query, leading to more contextually appropriate responses.

Broader knowledge base:
RAG systems can potentially access a much larger pool of information than what can be encoded in a traditional language model's parameters.
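To make the broader-knowledge-base point concrete, the sketch below indexes a user-supplied document into a ChromaDB collection so its chunks can be retrieved at query time. It is illustrative only: it assumes the chromadb package, a hypothetical uploaded_report.txt file, and arbitrary chunk sizes.

```python
# Sketch: adding a user-uploaded document to the retrieval store (illustrative)
import chromadb

def chunk(text, size=500, overlap=50):
    """Split raw text into overlapping chunks so retrieval can target specific passages."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

client = chromadb.Client()                          # in-memory store; persistent clients also exist
collection = client.create_collection("user_docs")

document_text = open("uploaded_report.txt", encoding="utf-8").read()  # hypothetical upload
chunks = chunk(document_text)
collection.add(
    documents=chunks,
    ids=[f"chunk-{i}" for i in range(len(chunks))],  # Chroma embeds chunks with its default model
)

# At query time, the closest chunks are retrieved and passed to the LLM as context
results = collection.query(query_texts=["What does the report conclude?"], n_results=3)
print(results["documents"][0])
```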
Hardware Requirements

• Processor: Minimum Intel i5 or AMD Ryzen 5
• RAM: At least 8 GB (16 GB recommended for large-scale deployments)
• Storage: Minimum 256 GB SSD (for fast data access and model storage)
• Graphics Card: A dedicated GPU such as a V100 with 30 GB of VRAM is required for inferencing
Software Requirements

• Operating System: Windows 10/11, macOS, or Linux
• Programming Languages: Python 3.8 or above
• Libraries and Frameworks:
  • NLP & ML: PyTorch or TensorFlow, Hugging Face Transformers, LangChain
  • Retrieval Systems: FAISS or ChromaDB
  • Development: Jupyter Notebook, VS Code
• Others: Streamlit (for deployment), Docker (optional, for containerized environments), Git (for version control)
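Since Streamlit is listed for deployment, a minimal front-end sketch is shown below, covering chat input, document upload, and the thumbs-up/thumbs-down buttons. It is illustrative only; rag_answer is a placeholder for the retrieval pipeline and would be replaced by the project's actual function.

```python
# app.py - minimal Streamlit front-end sketch (run with: streamlit run app.py)
import streamlit as st

def rag_answer(query: str) -> str:
    # Placeholder for the retrieve-augment-generate pipeline sketched earlier
    return f"(generated answer to: {query})"

st.title("Context-Aware RAG Chatbot")

# Document upload: uploaded text is kept in session state so it can be indexed later
uploaded = st.file_uploader("Upload a document", type=["txt"])
if uploaded is not None:
    if "docs" not in st.session_state:
        st.session_state["docs"] = []
    st.session_state["docs"].append(uploaded.read().decode("utf-8", errors="ignore"))
    st.success("Document added to the knowledge base.")

query = st.chat_input("Ask a question")
if query:
    st.chat_message("user").write(query)
    response = rag_answer(query)
    st.chat_message("assistant").write(response)

    # Feedback buttons feeding the continuous-learning loop
    helpful, not_helpful = st.columns(2)
    helpful.button("Thumbs up")
    not_helpful.button("Thumbs down")
```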
Architecture
Use Case Diagram
Class Diagram
Sequence Diagram
Code Snippet
User Interface
Future Scope

• Expansion to Multimodal Capabilities
• Multilingual Support
• Integration with External Knowledge Sources
• Improved Contextual Memory
• Advanced Personalisation
• AI Ethics and Bias Mitigation
• Interactive Analytics and Insights
• Scalability for Enterprise
Conclusion
We successfully implemented the following:

• RAG Approach with Mistral LLM
• Innovative AI Technologies
• Contextual Understanding and Continuous Learning
• Feedback Mechanism
• Document Upload Feature
• Revolutionising Conversations
References
1. Sriram Veturi, Saurabh Vaichal, Reshma Lal Jagadheesh, Nafis Irtiza Tripto, Nian Yan, "RAG based Question-Answering for Contextual Response Prediction System", 6 Sep 2024.
https://arxiv.org/abs/2409.03708

2. Radhakrishnan, Akhilesh, "Retrieval Is All You Need: Developing an AI Powered Chatbot with RAG in Azure", 28 Aug 2024.
https://www.diva-portal.org/smash/record.jsf?pid=diva2%3A1892804&dswid=6129

3. Nick Alonso, Tomás Figliolia, Anthony Ndirango, Beren Millidge, "Toward Conversational Agents with Context and Time Sensitive Long-term Memory", Jun 2024.
https://arxiv.org/pdf/2406.00057

4. Uday Allu, Biddwan Ahmed, Vishesh Tripathi, "Beyond Extraction: Contextualising Tabular Data for Efficient Summarisation by Language Models", Feb 2024.
https://arxiv.org/pdf/2401.02333

5. Junfeng Liu, Zhuocheng Mei, Kewen Peng, and Ranga Raju Vatsavai, "Context Retrieval via Normalized Contextual Latent Interaction for Conversational Agent", Dec 2023.
https://arxiv.org/pdf/2312.00774

6. Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang, "Lost in the Middle: How Language Models Use Long Contexts", July 2023.
https://cs.stanford.edu/~nfliu/papers/lost-in-the-middle.arxiv2023.pdf

7. Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela, "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks", May 2020.
https://arxiv.org/pdf/2005.11401

8. Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, Quoc V. Le, "Towards a Human-like Open-Domain Chatbot", Feb 2020.
https://arxiv.org/pdf/2001.09977

9. Moneerh Aleedy, Hadil Shaiba, Marija Bezbradica, "Generating and Analyzing Chatbot Responses using Natural Language Processing", Jan 2019.
https://thesai.org/Publications/ViewPaper?Volume=10&Issue=9&Code=IJACSA&SerialNo=10

10. Pragaash Ponnusamy, Alireza Roshan Ghias, Chenlei Guo, Ruhi Sarikaya, "Feedback-Based Self-Learning in Large-Scale Conversational AI Agents", Nov 2019.
https://arxiv.org/pdf/1911.02557
ANY QUESTIONS?
Thank You
