Context Aware Retrieval Augmented Generation Based Conversational AI
Team Members:
In today's fast-paced environment, there's an increasing need for quick, accurate, and contextually relevant information.
Traditional Conversational Agents often fall short in providing truly helpful responses, especially when dealing with complex
scenarios. Our project is an innovative AI Conversational Agent developed to revolutionise conversations.
The Conversational Agent addresses this challenge by combining advanced AI technologies to create a more intelligent,
adaptive, and context-aware assistant. At its core, it uses the Retrieval-Augmented Generation (RAG) approach built on a
large language model (LLM). This system is designed not only to answer questions but also to understand the context of
conversations, learn from interactions, and continuously improve its performance.
Additionally, a continuous learning mechanism via user feedback ensures the bot evolves and refines its responses over
time. The system also allows for document uploads, enabling the Conversational Agent to provide insights related to
user-specific content, further personalising the conversation.
Introduction
• Our project enhances conversational agents by using advanced
AI that provides accurate, context-aware responses and
improves with new data and feedback.
Business Benefits
Chatbots are important for businesses as they can provide quick
answers, save time, and improve customer service by automating
routine interactions.
Understanding RAG
The RAG pipeline works in three stages:
1. Retrieve: the model first retrieves information relevant to the user's query from a large dataset or database, such as documents, articles, or other sources.
2. Augment: the retrieved information is then used to augment the input query or context, enhancing the model's understanding and its ability to generate more accurate and relevant responses.
3. Generate: the augmented input is passed through a generative model, such as a transformer, which produces the final response based on both the original input and the retrieved information.
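The three stages can be sketched as plain functions. This is a minimal illustration, not the project's actual implementation: the word-overlap retriever stands in for a real vector store, and `generate` is a placeholder for the LLM call.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (stand-in for a vector store)."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def augment(query: str, passages: list[str]) -> str:
    """Combine the retrieved passages with the original query into one prompt."""
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

def generate(prompt: str) -> str:
    """Placeholder for the generative model call (an LLM in the real system)."""
    n_lines = prompt.count("\n") + 1
    return f"[LLM response conditioned on a {n_lines}-line prompt]"

docs = ["RAG retrieves documents relevant to the query.",
        "Transformers generate text token by token."]
query = "How does RAG retrieve documents?"
prompt = augment(query, retrieve(query, docs, k=1))
print(generate(prompt))
```

A production retriever would embed the query and documents and rank by vector similarity, but the retrieve-augment-generate flow is the same.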
Contextual Awareness
• Storing Context
• Retrieving Context
• Continuous Learning
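Storing and retrieving conversational context can be sketched with a simple rolling memory. The class below is an illustrative assumption, not the project's actual context store: it keeps the most recent turns and renders them as a context string for the next prompt.

```python
from collections import deque

class ConversationMemory:
    """Stores recent dialogue turns and retrieves them as context (sketch only)."""

    def __init__(self, max_turns: int = 5):
        # Oldest turns drop off automatically once max_turns is reached.
        self.turns = deque(maxlen=max_turns)

    def store(self, user_msg: str, bot_msg: str) -> None:
        self.turns.append((user_msg, bot_msg))

    def retrieve(self) -> str:
        return "\n".join(f"User: {u}\nBot: {b}" for u, b in self.turns)

mem = ConversationMemory(max_turns=2)
mem.store("Hi", "Hello!")
mem.store("What is RAG?", "Retrieval-Augmented Generation.")
mem.store("Thanks", "You're welcome.")
print(mem.retrieve())  # only the 2 most recent turns survive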
Negative Feedback
If the chatbot's response is inaccurate or unhelpful, the user can
provide a "thumbs down" to indicate the need for improvement.
Continuous Learning
The feedback mechanism allows the chatbot to learn from its
mistakes and successes, continuously improving its responses over
time.
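The thumbs-up/thumbs-down mechanism can be sketched as a small feedback store that flags responses receiving repeated negative feedback for review or retraining. Class and method names here are illustrative assumptions, not the project's actual code.

```python
from collections import defaultdict

class FeedbackStore:
    """Tracks thumbs-down counts per response so poor answers can be flagged (sketch)."""

    def __init__(self, flag_threshold: int = 2):
        self.downs = defaultdict(int)   # response_id -> thumbs-down count
        self.flag_threshold = flag_threshold

    def record(self, response_id: str, thumbs_up: bool) -> None:
        if not thumbs_up:
            self.downs[response_id] += 1

    def needs_review(self) -> list[str]:
        # Responses at or above the threshold are candidates for correction.
        return [r for r, n in self.downs.items() if n >= self.flag_threshold]

fb = FeedbackStore(flag_threshold=2)
fb.record("r1", thumbs_up=False)
fb.record("r1", thumbs_up=False)
fb.record("r2", thumbs_up=True)
print(fb.needs_review())  # ['r1']
```

Flagged responses could then feed a correction loop, as in the feedback-based self-learning paper surveyed below.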
Literature Survey
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.
• The paper shows that combining pre-trained models with external memory improves knowledge access for complex tasks.
• The Retrieval-Augmented Generation (RAG) models achieve top results in NLP by integrating both stored and retrieved
knowledge.
• Author: Lewis, Patrick, et al.
• Date Published: 12 Apr 2021.
• This study was published as an arXiv preprint.
• Link: https://arxiv.org/pdf/2005.11401
Lost in the Middle: How Language Models Use Long Contexts
• The paper reveals that recent language models struggle to effectively use information in the middle of long input
contexts, showing a U-shaped performance curve with accuracy highest at the beginning or end.
• The study highlights strong positional biases in language models, limiting their effectiveness in tasks requiring long-context processing, and suggests new evaluation methods to improve model robustness.
• Author: Liu, Nelson F., et al.
• Date Published: 6 Jul 2023.
• This study was published as an arXiv preprint.
• Link: https://cs.stanford.edu/~nfliu/papers/lost-in-the-middle.arxiv2023.pdf
Mistral 7B
• Mistral 7B v0.1 is a 7-billion-parameter language model designed for high performance and efficiency, surpassing Llama 2
13B on all tested benchmarks and Llama 1 34B in reasoning, math, and code tasks.
• It uses grouped-query attention (GQA) for faster inference and sliding window attention (SWA) for handling long
sequences with lower costs.
• Author: Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, et al.
• Date Published: 10 October 2023.
• This study was published as an arXiv preprint.
• Link: https://arxiv.org/pdf/2310.06825
Feedback-Based Self-Learning in Large-Scale Conversational AI Agents
• The paper introduces a system that improves conversational AI by automatically learning from user interactions, reducing
the need for manual data labelling.
• Using this self-learning approach, the system significantly cuts error rates and enhances performance by analysing and
correcting common mistakes.
• Author: Pragaash Ponnusamy, Alireza Roshan Ghias, Chenlei Guo, Ruhi Sarikaya
• Date Published: 6 Nov 2019.
• This study was published as an arXiv preprint.
• Link: https://arxiv.org/pdf/1911.02557
Existing System
Traditional Chatbots
Traditional language models generate responses based on their internal knowledge, which may be outdated or too general, limiting their ability to provide accurate
and contextually relevant answers.
RAG-Based Chatbots
Up-to-date information: RAG-based chatbots can access and utilise the most current information from their retrieval
systems, allowing them to provide answers based on the latest data available.
Improved accuracy: By combining retrieved information with generative capabilities, these chatbots can produce more
accurate responses, especially for factual queries or questions requiring specific knowledge.
Enhanced contextual relevance:
Graphics Card
• A dedicated GPU, such as an NVIDIA V100 with 30 GB of VRAM, is required for inference.
Software Requirements
Others
• Streamlit (for deployment)
• Docker (optional, for containerized environments)
• Git (for version control)
Architecture
Use Case Diagram
Class Diagram
Sequence Diagram
Code Snippet
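As an illustrative sketch of how the document-upload feature described earlier might prepare a user's file for retrieval, the helper below splits text into overlapping chunks for indexing. The function name, chunk size, and overlap are assumptions, not the project's actual parameters.

```python
def chunk_document(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split an uploaded document into overlapping character chunks for indexing (sketch)."""
    chunks = []
    step = chunk_size - overlap  # consecutive chunks share `overlap` characters
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

text = "".join(str(i % 10) for i in range(500))
chunks = chunk_document(text)
print(len(chunks))  # 3
```

Overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk; real pipelines often chunk on sentence or token boundaries instead of raw characters.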
User Interface
Future Scope
References
2. Radhakrishnan, Akhilesh, "Retrieval Is All You Need: Developing an AI Powered Chatbot with RAG in Azure", 28 Aug 2024.
https://www.diva-portal.org/smash/record.jsf?pid=diva2%3A1892804&dswid=6129
3. Nick Alonso, Tomás Figliolia, Anthony Ndirango, Beren Millidge, "Toward Conversational Agents with Context and Time Sensitive Long-term Memory", Jun 2024.
https://arxiv.org/pdf/2406.00057
4. Uday Allu, Biddwan Ahmed, Vishesh Tripathi, "Beyond Extraction: Contextualising Tabular Data for Efficient Summarisation by Language Models", Feb 2024.
https://arxiv.org/pdf/2401.02333
5. Junfeng Liu, Zhuocheng Mei, Kewen Peng, and Ranga Raju Vatsavai, "Context Retrieval via Normalized Contextual Latent Interaction for Conversational Agent", Dec 2023.
https://arxiv.org/pdf/2312.00774
6. Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang, "Lost in the Middle: How Language Models Use Long Contexts", July 2023.
https://cs.stanford.edu/~nfliu/papers/lost-in-the-middle.arxiv2023.pdf
7. Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel,
Douwe Kiela, "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks", May 2020.
https://arxiv.org/pdf/2005.11401
8. Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, Quoc V. Le, "Towards a
Human-like Open-Domain Chatbot", Feb 2020.
https://arxiv.org/pdf/2001.09977
9. Moneerh Aleedy, Hadil Shaiba, Marija Bezbradica, "Generating and Analyzing Chatbot Responses using Natural Language Processing" , Jan 2019.
https://thesai.org/Publications/ViewPaper?Volume=10&Issue=9&Code=IJACSA&SerialNo=10
10. Pragaash Ponnusamy, Alireza Roshan Ghias, Chenlei Guo, Ruhi Sarikaya, "Feedback-Based Self-Learning in Large-Scale Conversational AI Agents", Nov 2019.
https://arxiv.org/pdf/1911.02557
ANY QUESTIONS?
Thank You