
GENERATIVE AI WORKSHOP
What is Gen AI?
Generative AI (Generative Artificial Intelligence)
refers to a class of AI models designed to create
new content, such as text, images, music, code, or
even videos, rather than just analyzing or
classifying data. These models learn from vast
datasets and generate human-like outputs based
on patterns they recognize.
DL Models: Discriminative AI vs Generative AI

Discriminative AI:
1. Focuses on classifying or distinguishing between different types of data.
2. Learns the decision boundary between different classes (P(y|x)).
3. Typical tasks: image classification, fraud detection, sentiment analysis.
4. Examples: Logistic Regression, SVM, Random Forest, BERT.
5. Produces labels or class predictions (e.g., "Spam" or "Not Spam").
6. "Is this email spam or not?"

Generative AI:
1. Focuses on generating new data that resembles real data.
2. Learns the underlying probability distribution of the data (P(x)) to create new samples.
3. Typical tasks: text generation, image synthesis, music composition.
4. Examples: GPT, Stable Diffusion, GANs, DALL·E.
5. Produces new content (e.g., a new paragraph, image, or video).
6. "Write a new email in a professional tone."
Large Language Models
(LLMs)
LLMs (Large Language Models) are artificial intelligence models
trained on massive amounts of text data to understand, generate, and
manipulate human-like language. They use deep learning, particularly
transformer architectures, to perform tasks like answering questions,
summarizing text, generating code, and more.

BERT, GPT, XLM, T5, Megatron, and M2M are some examples.


Model Kwargs
(Keyword Arguments)

✅ "Kwargs" (keyword arguments) allow fine-tuning of model behavior by passing parameters dynamically.
✅ Used in LLMs, embedding models, and vector stores to customize responses, performance, and retrieval.
✅ Help control output length, temperature, retrieval depth, and more.
Parameter | Description | Example Value
temperature | Controls randomness in the response (lower = more deterministic) | 0.2 (factual) / 0.8 (creative)
max_tokens | Limits response length | 512
top_k | Selects the top-k most relevant documents | 5
retriever.search_kwargs | Filters retrieved documents | {"k": 3} (retrieve top 3)
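A minimal sketch of these kwargs in code, assuming the langchain-openai and langchain-chroma packages and an OpenAI API key in the environment (the model name and sample texts are illustrative):

```python
# Sketch: passing kwargs to an LLM and a retriever in LangChain.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma

# LLM kwargs: low temperature for factual output, capped response length.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.2, max_tokens=512)

# Vector-store retrieval kwargs: return only the top 3 most similar chunks.
vectorstore = Chroma.from_texts(
    ["RAG pairs retrieval with generation.", "Kwargs tune model behavior."],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

print(llm.invoke("In one line, what does temperature control?").content)
```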
Prompt Engineering: The Art of Talking to AI
Optimizing AI Responses for Accuracy & Relevance

Definition:
The practice of designing input prompts
to guide an AI model’s response.
Why It Matters:
LLMs respond based on how the question
is framed.
Better prompts → Better responses.
Types of Prompts

1️⃣ Zero-shot Prompting


2️⃣ Few-shot Prompting
3️⃣ Chain-of-Thought (CoT) Prompting
4️⃣ Role-based Prompting
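Illustrative prompt strings for each type (all examples are made up):

```python
# 1. Zero-shot: ask directly, with no examples.
zero_shot = "Classify the sentiment of this review as positive or negative: 'Great battery life.'"

# 2. Few-shot: show a couple of worked examples first.
few_shot = (
    "Review: 'Loved it!' -> positive\n"
    "Review: 'Waste of money.' -> negative\n"
    "Review: 'Great battery life.' ->"
)

# 3. Chain-of-Thought: ask the model to reason step by step.
cot = "A train travels 120 km in 2 hours. What is its speed? Think step by step."

# 4. Role-based: assign the model a persona.
role_based = "You are a senior Python developer. Review this function for bugs: ..."
```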
Prompt Engineering in RAG
Why is prompt design crucial for RAG?
Helps the model better utilize retrieved data.
Reduces hallucinations by guiding the AI to cite sources.
Example:
❌ Weak Prompt: "What is the latest AI policy update?"
✅ Strong Prompt: "Based on the retrieved government
document, summarize the key changes in AI policy for
2024."
AI Frameworks

LangChain: a framework designed to help developers build applications powered by large language models (LLMs) like OpenAI's GPT, Claude, or open-source models like LLaMA.

LangFlow: a low-code UI for LangChain, making it easier to create, visualize, and experiment with LangChain workflows using a drag-and-drop interface.

LangGraph: a framework built on LangChain that enables graph-based workflows for LLMs.
LangChain
Key Features:
Chains: Sequences of LLM calls or logic (e.g.,
multi-step reasoning).
Retrieval: Enhances responses using external
knowledge sources like vector databases.
Agents: Uses LLMs to decide which
tools/functions to call dynamically.
Memory: Enables stateful conversations across
multiple interactions.
Prompt Template
A prompt template is a structured way of designing prompts, often with placeholders for dynamic inputs. This is useful when working with LLMs (large language models), RAG-based systems, or automation in AI applications.

Use cases in AI:

1. LLMs (Large Language Models):
Designing structured prompts for ChatGPT, OpenAI, or any language model to improve response quality.
2. RAG-based systems:
In Retrieval-Augmented Generation, retrieved data (from a knowledge base) is inserted into the template before passing it to the model.
3. AI automation:
Used in applications like automated report generation, email assistants, or customer service bots where the input keeps changing.
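A minimal sketch using LangChain's PromptTemplate, assuming langchain-core is installed (the template text and inputs are illustrative):

```python
from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    "Based on the following context, answer the question.\n"
    "Context: {context}\n"
    "Question: {question}"
)

# Placeholders are filled at runtime, e.g. with retrieved RAG context.
prompt = template.format(
    context="AI policy document, 2024 edition...",
    question="What changed in 2024?",
)
print(prompt)
```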
Agents
An Agent is an AI system that dynamically decides
which actions to take based on user input. Unlike
basic prompt-response models, agents can reason,
plan, and interact with multiple tools to solve tasks.
🧠 How Do Agents Work?
1. Receive Input – The agent takes in a user's query.
2. Decide an Action – It determines what tool or
approach to use (retrieval, calculations, API
calls, etc.).
3. Execute the Action – Calls an external API,
searches documents, or retrieves data from
memory.
4. Process Results – Uses the gathered data to
generate a final response.
🔹 Example: Instead of just answering "What is the
weather today?" using a fixed knowledge base, an
agent can call an API to get live weather updates
and then respond.
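A toy, framework-free sketch of this loop; the tools and keyword-based routing below are illustrative stand-ins, not a real LangChain agent (a real agent would let the LLM choose the tool):

```python
def get_weather(city: str) -> str:
    # Stand-in for a live weather API call.
    return f"Sunny, 24°C in {city}"

def calculator(expr: str) -> str:
    return str(eval(expr))  # demo only; never eval untrusted input

TOOLS = {"weather": get_weather, "calc": calculator}

def agent(query: str) -> str:
    # 1. Receive input  2. Decide an action (naive keyword routing here).
    if "weather" in query.lower():
        result = TOOLS["weather"]("Delhi")   # 3. Execute the action
    elif any(ch.isdigit() for ch in query):
        result = TOOLS["calc"]("2 + 2")
    else:
        result = "No tool needed."
    # 4. Process results into a final response.
    return f"Answer: {result}"

print(agent("What is the weather today?"))
```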
Chain
Chains in LangChain are designed to
structure multi-step interactions between a
user and an AI model. Instead of making a
single LLM call, chains allow for sequential or
parallel execution of multiple components
to achieve a more complex task.
e.g., LLMChain, SequentialChain, MultiQueryChain.
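A minimal two-step chain sketch using LangChain's pipe (LCEL) syntax, assuming langchain-core and langchain-openai with an API key set; the model name and prompts are illustrative:

```python
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

# Step 1 summarizes; step 2 translates the summary.
summarize = PromptTemplate.from_template("Summarize in one sentence: {text}") | llm | StrOutputParser()
translate = PromptTemplate.from_template("Translate to French: {summary}") | llm | StrOutputParser()

summary = summarize.invoke({"text": "Chains compose multiple LLM calls into one workflow."})
print(translate.invoke({"summary": summary}))
```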
Document Loader
A Document Loader in LangChain is responsible for
loading and preprocessing data from different
sources (PDFs, text files, web pages, databases, etc.)
before passing them to the LLM or a retrieval system.

Flow:
1. Load Data – Fetch documents from a source
(PDF, database, API, etc.).
2. Preprocess & Split – Convert into chunks for
better retrieval.
3. Embed & Store – Convert text into embeddings
(vector database).
4. Retrieve & Generate – Use RAG to fetch relevant
data and answer queries.
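A sketch of steps 1-2 of this flow, assuming langchain-community, langchain-text-splitters, pypdf, and a local report.pdf (all names are illustrative):

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

docs = PyPDFLoader("report.pdf").load()          # 1. Load data
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)          # 2. Preprocess & split
# 3-4. The chunks would next be embedded into a vector store (see ChromaDB
#      below) and retrieved at query time.
print(f"{len(chunks)} chunks ready for embedding")
```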
Memory
Memory Type | Use Case | Strengths | Weaknesses
ConversationBufferMemory | Full conversation history | Keeps everything | High token usage
ConversationBufferWindowMemory | Limited history (last k interactions) | Reduces token usage | Forgets older context
ConversationSummaryMemory | Summarized history | Condensed storage | May lose details
ConversationKGMemory | Fact storage | Structured knowledge | Not ideal for casual chats
VectorStoreMemory | Large-scale knowledge storage | Efficient for retrieval | Requires a vector database
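A minimal ConversationBufferMemory sketch, assuming the classic langchain package (newer releases may steer toward other memory APIs):

```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.save_context({"input": "Hi, I'm Asha."}, {"output": "Hello Asha!"})
memory.save_context({"input": "What's my name?"}, {"output": "You told me it's Asha."})

# The full history is kept verbatim, hence the high token usage noted above.
print(memory.load_memory_variables({}))
```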
Hugging Face
Hugging Face is a leading platform for open-source AI models, including LLMs, transformers,
embeddings, and datasets. It provides tools like:
🤗 Transformers (for NLP models)
🤗 Datasets (for data processing)
🤗 Spaces (for hosting AI apps)
🤗 Model Hub (for accessing pre-trained models)
LangChain integrates Hugging Face models for:
1. LLMs (HuggingFaceHub, Transformers)
2. Embeddings (Sentence Transformers, BERT, etc.)
3. Vector Databases (Chroma, Pinecone, etc.)
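A sketch of generating embeddings with sentence-transformers from the Hugging Face ecosystem; all-MiniLM-L6-v2 is a commonly used open model:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(["What is RAG?", "Retrieval-Augmented Generation explained"])
print(vectors.shape)  # (2, 384): two sentences, 384-dimensional vectors
```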
RAG
(Retrieval Augmented Generation)
Retrieval-Augmented Generation (RAG) is an advanced
AI technique that combines a language model (LLM)
with an external knowledge retrieval system to
generate more accurate, fact-based, and up-to-date
responses.
Instead of relying only on pre-trained knowledge, RAG
fetches relevant information from external sources
(e.g., vector databases, documents, APIs) before
generating a response.
Fine Tuning vs RAG
Fine-Tuning:
Adjusting the model’s weights by training it on new
data
Requires labeled datasets for supervised learning

RAG:
Enhancing responses by retrieving external
information at query time
Uses external knowledge bases (documents, APIs,
databases)
How does RAG work?
Retrieval: fetch relevant context from an external knowledge source.
Generation: the LLM produces an answer grounded in the retrieved context.
RAG Pipeline Architecture
User Query →
Embedding Model (Vectorization) →
Retrieval from Vector Database →
Augmentation (Appending Context) →
LLM Response Generation
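A skeletal view of these stages in code; every function here is a labeled placeholder standing in for a real embedding model, vector database, and LLM:

```python
def embed(text):                 # placeholder embedding model
    return [float(len(text))]

def vector_db_search(vec, k):    # placeholder vector-database lookup
    return ["(retrieved context)"]

def llm_generate(prompt):        # placeholder LLM call
    return f"(LLM answer based on: {prompt})"

def rag_answer(query: str) -> str:
    q_vec = vector = embed(query)                     # 1. vectorize the query
    docs = vector_db_search(q_vec, k=3)               # 2. retrieve context
    prompt = f"Context: {docs}\nQuestion: {query}"    # 3. augment the prompt
    return llm_generate(prompt)                       # 4. generate the response

print(rag_answer("How does RAG work?"))
```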
Embedding Model
An embedding model converts text, images,
or other data into high-dimensional vectors
(numerical representations). These vectors
capture semantic meaning, enabling
efficient similarity search in RAG (Retrieval-
Augmented Generation).
Vectorization in RAG
Vectorization is the process of converting
text, images, or other data into numerical
vectors for machine learning, similarity
search, and retrieval in RAG (Retrieval-
Augmented Generation) systems.
Vector

Feature | KING | QUEEN | MAN | FOX
Power | 1 | 0.9 | 0.1 | 0
Male | 1 | 0 | 1 | 0.5
Female | 0 | 1 | 0 | 0.5
Hardwork | 0.2 | 0.1 | 1 | 1

ChromaDB
ChromaDB is an open-source vector
database designed for storing and retrieving
high-dimensional embeddings efficiently. It
is widely used in Retrieval-Augmented
Generation (RAG) systems to enable fast,
accurate, and scalable retrieval of relevant
documents.
Other examples: Pinecone, FAISS
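A minimal ChromaDB sketch, assuming the chromadb and sentence-transformers packages (documents and IDs are illustrative):

```python
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.Client()                 # in-memory; use PersistentClient for disk
collection = client.create_collection("docs")

texts = ["RAG combines retrieval with generation.",
         "ChromaDB stores high-dimensional embeddings."]
collection.add(ids=["d1", "d2"], documents=texts,
               embeddings=embedder.encode(texts).tolist())

hits = collection.query(
    query_embeddings=embedder.encode(["How does RAG work?"]).tolist(),
    n_results=1,
)
print(hits["documents"][0][0])  # the closest stored document
```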
Similarity Search
In Retrieval-Augmented Generation (RAG),
similarity search is used to find the most
relevant information from a vector
database. This helps in fetching documents
or knowledge chunks that match a user’s
query.
Types of Similarity Search

1️⃣ Cosine Similarity
Measures the cosine of the angle between two vectors. (Text similarity, embeddings)

2️⃣ Dot Product Similarity
Measures both the magnitude and direction of vectors. (Ranking, attention mechanisms)
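Both measures computed directly with NumPy on toy vectors (the values are illustrative):

```python
import numpy as np

a = np.array([1.0, 0.9, 0.1])   # e.g. a toy embedding of "king"
b = np.array([0.9, 1.0, 0.0])   # e.g. a toy embedding of "queen"

dot = np.dot(a, b)                                      # dot-product similarity
cosine = dot / (np.linalg.norm(a) * np.linalg.norm(b))  # angle-based, length-invariant
print(f"dot={dot:.3f}, cosine={cosine:.3f}")
```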
Retrieval
Retrieval is the process of finding and
fetching relevant information from a
database or knowledge source based on a
user's query. In Retrieval-Augmented
Generation (RAG), this step ensures that an
LLM gets accurate, contextually relevant
data before generating a response.
Tech Stack for RAG
LLMs: OpenAI GPT, LLaMA, Mistral,
HuggingFace
Vector Databases: ChromaDB, Pinecone,
FAISS, Weaviate
Embeddings: OpenAI Embeddings, Sentence
Transformers, BGE
Frameworks: LangChain, LlamaIndex
Document Loaders: PDFs, APIs, Notion,
Google Drive
LLM Response
In Retrieval-Augmented Generation (RAG),
an LLM (Large Language Model) response is
the final output generated after retrieving
relevant information. The response is
contextually enhanced by the retrieved
data, making it more factual and relevant.
[Diagram: RAG flow. The user query is turned into a vector by an embedding model; the dataset (PDF) is embedded the same way and stored in ChromaDB; a similarity search retrieves the closest documents, which ground the LLM response.]
Applications of RAG
AI-Powered Chatbots (Customer
Support, FAQs)
Legal & Medical Assistants (Retrieving
case laws, medical guidelines)
Enterprise Search (AI-powered
document retrieval)
Education & Research (Summarizing
academic papers)
Conclusion
Summary
🚀 Key Takeaways from Retrieval-Augmented
Generation (RAG)
✅ Enhances LLMs by retrieving relevant, factual data
before generating responses.
✅ Uses Vector Databases like ChromaDB, FAISS,
Pinecone to store and retrieve embeddings.
✅ Reduces Hallucinations by grounding responses in
real-world knowledge.
✅ Combines Retrieval + Generation for accurate,
context-aware answers.
✅ Widely Used in chatbots, search engines, and
enterprise AI solutions.
THANK YOU!
