GenAI Workshop
What is GenAI?
Generative AI (Generative Artificial Intelligence)
refers to a class of AI models designed to create
new content, such as text, images, music, code, or
even videos, rather than just analyzing or
classifying data. These models learn from vast
datasets and generate human-like outputs based
on patterns they recognize.
Diagram: DL models split into Discriminative AI vs. Generative AI.
retriever.search_kwargs — filters retrieved documents; e.g. {"k": 3} retrieves the top 3 matches (see the sketch below).
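A minimal sketch of where this parameter fits, assuming the LangChain API (module paths vary across LangChain versions; the sample texts and embedding model are illustrative):

```python
from langchain_community.vectorstores import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

# Build a tiny in-memory vector store (texts are illustrative).
vectorstore = Chroma.from_texts(
    ["RAG retrieves context before generating.",
     "Prompt engineering shapes model output.",
     "Embeddings map text to vectors."],
    embedding=HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2"),
)

# search_kwargs={"k": 3} tells the retriever to return the top 3 matches.
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
docs = retriever.invoke("What is RAG?")
```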
Prompt Engineering: The Art of Talking to AI
Optimizing AI Responses for Accuracy & Relevance
Definition:
The practice of designing input prompts
to guide an AI model’s response.
Why It Matters:
LLMs respond based on how the question
is framed.
Better prompts → Better responses.
Types of Prompts
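Two common styles, sketched below: a bare zero-shot ask versus a structured prompt that adds a role, constraints, and an output format (the prompt text is illustrative):

```python
# Zero-shot: the model gets no role, format, or constraints.
vague_prompt = "Tell me about RAG."

# Structured: same request, but framed for accuracy and relevance.
engineered_prompt = """You are a teaching assistant at an AI workshop.
Explain Retrieval-Augmented Generation (RAG) to a beginner.
- Use at most 3 sentences.
- Give one concrete example.
- End with a one-line summary."""
```

The second prompt typically yields a tighter, more on-topic answer because the model knows the audience, length, and format expected.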
Flow:
1. Load Data – Fetch documents from a source
(PDF, database, API, etc.).
2. Preprocess & Split – Convert into chunks for
better retrieval.
3. Embed & Store – Convert text into embeddings
(vector database).
4. Retrieve & Generate – Use RAG to fetch relevant
data and answer queries.
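The four steps above, sketched with LangChain (a sketch under assumed package layouts, which shift between LangChain versions; the file path and model name are placeholders):

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# 1. Load data from a source (here, a placeholder PDF path).
docs = PyPDFLoader("notes.pdf").load()

# 2. Preprocess & split into overlapping chunks for better retrieval.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# 3. Embed & store the chunks in a vector database.
db = Chroma.from_documents(chunks, HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"))

# 4. Retrieve relevant chunks; a RAG chain would pass these to an LLM.
relevant = db.as_retriever(search_kwargs={"k": 3}).invoke("Summarize the key points")
```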
Memory
| Memory Type | Use Case | Strengths | Weaknesses |
| --- | --- | --- | --- |
| ConversationBufferMemory | Full conversation history | Keeps everything | High token usage |
| ConversationBufferWindowMemory | Limited history (last k interactions) | Reduces token usage | Forgets older context |
| ConversationKGMemory | Fact storage | Structured knowledge | Not ideal for casual chats |
| VectorStoreMemory | Large-scale knowledge storage | Efficient for retrieval | Requires a vector database |
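For instance, windowed memory can be tried with the classic langchain.memory interface (deprecated in recent LangChain releases, but it matches the table above):

```python
from langchain.memory import ConversationBufferWindowMemory

# Keep only the last k=2 exchanges to cap token usage.
memory = ConversationBufferWindowMemory(k=2, return_messages=True)
memory.save_context({"input": "Hi"}, {"output": "Hello! How can I help?"})
memory.save_context({"input": "What is RAG?"}, {"output": "Retrieval-Augmented Generation."})
memory.save_context({"input": "And embeddings?"}, {"output": "Vector representations of text."})

# Only the last 2 exchanges survive; the greeting has been forgotten.
print(memory.load_memory_variables({}))
```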
Hugging Face
Hugging Face is a leading platform for open-source AI models, including LLMs, transformers,
embeddings, and datasets. It provides tools like:
🤗 Transformers (for NLP models)
🤗 Datasets (for data processing)
🤗 Spaces (for hosting AI apps)
🤗 Model Hub (for accessing pre-trained models)
LangChain integrates Hugging Face models for:
1. LLMs (HuggingFaceHub, Transformers)
2. Embeddings (Sentence Transformers, BERT, etc.)
3. Vector Databases (Chroma, Pinecone, etc.)
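A small sketch of the first two integration points, using the Hugging Face libraries directly (gpt2 and all-MiniLM-L6-v2 are example models, fetched from the Model Hub on first run):

```python
from transformers import pipeline
from sentence_transformers import SentenceTransformer

# 🤗 Transformers: a text-generation pipeline around a Model Hub LLM.
generator = pipeline("text-generation", model="gpt2")
print(generator("Generative AI is", max_new_tokens=20)[0]["generated_text"])

# 🤗 Sentence Transformers: embeddings for similarity search / RAG.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
vectors = embedder.encode(["Hugging Face hosts open models.",
                           "LangChain wires them together."])
print(vectors.shape)  # (2, 384): two 384-dimensional embeddings
```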
RAG
(Retrieval-Augmented Generation)
Retrieval-Augmented Generation (RAG) is an advanced
AI technique that combines a large language model (LLM)
with an external knowledge retrieval system to
generate more accurate, fact-based, and up-to-date
responses.
Instead of relying only on pre-trained knowledge, RAG
fetches relevant information from external sources
(e.g., vector databases, documents, APIs) before
generating a response.
Fine-Tuning vs RAG
Fine-Tuning:
Adjusting the model’s weights by training it on new
data
Requires labeled datasets for supervised learning
RAG:
Enhancing responses by retrieving external
information at query time
Uses external knowledge bases (documents, APIs,
databases)
How does RAG work?
Retrieval – fetch the most relevant context from an external source (e.g., a vector database).
Generation – the LLM composes an answer grounded in the retrieved context.
RAG Pipeline Architecture
User Query →
Embedding Model (Vectorization) →
Retrieval from Vector Database →
Augmentation (Appending Context) →
LLM Response Generation
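The same pipeline, sketched end to end with the chromadb client (a sketch: Chroma's default embedding function handles vectorization, and the final LLM call is left as a placeholder prompt):

```python
import chromadb

# In-memory Chroma client; the default embedding function vectorizes text.
client = chromadb.Client()
collection = client.create_collection("rag_demo")

# Index a few documents (ids and texts are illustrative).
collection.add(
    ids=["1", "2", "3"],
    documents=["RAG fetches external context at query time.",
               "Fine-tuning changes a model's weights.",
               "Embeddings enable similarity search."],
)

# Retrieval: embed the user query and fetch the closest documents.
results = collection.query(query_texts=["How does RAG stay factual?"], n_results=2)

# Augmentation: append the retrieved context to the prompt for the LLM.
context = "\n".join(results["documents"][0])
prompt = f"Answer using this context:\n{context}\n\nQuestion: How does RAG stay factual?"
```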
Embedding Model
An embedding model converts text, images,
or other data into high-dimensional vectors
(numerical representations). These vectors
capture semantic meaning, enabling
efficient similarity search in RAG (Retrieval-
Augmented Generation).
Vectorization in RAG
Vectorization is the process of converting
text, images, or other data into numerical
vectors for machine learning, similarity
search, and retrieval in RAG (Retrieval-
Augmented Generation) systems.
Vector (toy example — each row is one embedding dimension):

| | KING | QUEEN | MAN | FOX |
| --- | --- | --- | --- | --- |
| Male | 1 | 0 | 1 | 0.5 |
| Female | 0 | 1 | 0 | 0.5 |
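The toy table above hand-picks two dimensions; a real embedding model learns hundreds. A short sketch with sentence-transformers (all-MiniLM-L6-v2 is an example model; exact similarity scores will vary):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional vectors
vecs = model.encode(["king", "queen", "man", "fox"])

# Cosine similarity: semantically related words land closer together.
print(util.cos_sim(vecs[0], vecs[1]))  # king vs queen: relatively high
print(util.cos_sim(vecs[0], vecs[3]))  # king vs fox: lower
```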
Diagram: User Query → Embedding Model → Vector → Chroma DB (similarity search).