1 | © Copyright 8/16/23 Zilliz
Stephen Batifol | Zilliz
A Beginner's Guide to Building a RAG App Using Milvus
Speaker
Stephen Batifol
Developer Advocate, Zilliz
stephen.batifol@zilliz.com
https://www.linkedin.com/in/stephen-batifol/
https://twitter.com/stephenbtl
RAG
(Retrieval Augmented Generation)
Basic Idea
Use RAG to force the LLM to work with your data
by injecting it via a vector database like Milvus
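A minimal sketch of what "injecting your data" can mean in practice: retrieved chunks are pasted into the prompt ahead of the question. The function name `build_rag_prompt` and the prompt wording are assumptions made for illustration, not something shown in the talk.

```python
def build_rag_prompt(question, retrieved_chunks):
    """Assemble a prompt that asks the LLM to answer from the supplied context."""
    # Number each chunk so the model (and a reader) can tell them apart.
    context = "\n\n".join(
        f"[{i + 1}] {chunk}" for i, chunk in enumerate(retrieved_chunks)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical chunks, standing in for results returned by a vector database.
chunks = ["Milvus is an open source vector database.", "Zilliz maintains Milvus."]
prompt = build_rag_prompt("Who maintains Milvus?", chunks)
```

The resulting string is what would be sent to the LLM in place of the bare question.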
Vector DB for RAG
Vector databases find the parts of your data that are semantically
similar to a query, so that they can be injected into the prompt
Considerations include: scale, performance, and flexibility
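As a rough illustration of retrieval by semantic similarity, the sketch below ranks stored texts by cosine similarity to a query vector. The three-dimensional vectors are made up for the example (real embedding models produce hundreds of dimensions), and a vector database such as Milvus would use an index rather than this brute-force scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k stored texts whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical embeddings, chosen by hand for this example.
STORE = [
    ("Milvus is a vector database", [0.9, 0.1, 0.0]),
    ("Ollama runs LLMs locally", [0.1, 0.9, 0.1]),
    ("Vector search finds similar embeddings", [0.8, 0.2, 0.1]),
]

results = top_k([1.0, 0.0, 0.0], STORE, k=2)
```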
LLMs are Stochastic
LLMs predict future tokens (à la RNNs)
• “Milvus is the world’s most popular vector ___”
• {“database”: 0.86, “search”: 0.11, “embedding”: 0.01, …}
Downside: outdated input data can cause hallucination
• Plausible-sounding but factually incorrect responses
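The next-token distribution above can be sketched as weighted sampling. The three probabilities come from the slide; the `"<other>"` entry is an assumption added so the weights sum to 1, and the helper names are hypothetical.

```python
import random

# "<other>" lumps together every token the slide elides with "…".
next_token_probs = {
    "database": 0.86,
    "search": 0.11,
    "embedding": 0.01,
    "<other>": 0.02,
}


def sample_next_token(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def greedy_next_token(probs):
    """Always pick the single most likely token."""
    return max(probs, key=probs.get)


rng = random.Random(0)  # fixed seed so repeated runs draw the same samples
samples = [sample_next_token(next_token_probs, rng) for _ in range(1000)]
```

Sampling is why the same prompt can yield different completions, while greedy decoding always returns the most likely token.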
Basic RAG Architecture
01 Tech Stack
Tech Stack
LangChain
• Framework for building LLM applications
• Focus on retrieving data and integrating with LLMs
• Loading the data
• Chunking and chunk overlap
• Integrations with most popular tools
Ollama
• Run quantized LLMs Locally
• Embeddings Models
Milvus
1. Cloud Native, Distributed System Architecture
2. True Separation of Concerns
3. Scalable Index Creation Strategy with 512 MB Segments
Embeddings Models
02 Embeddings
Examining Embeddings
Picking a model
What to embed
Metadata
Embeddings Strategies
Level 1: Embedding Chunks Directly
Level 2: Embedding Sub and Super Chunks
Level 3: Incorporating Chunking and Non-Chunking Metadata
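One possible reading of Level 2, sketched under assumptions: small sub chunks are what gets embedded and matched, and each one keeps a pointer to the larger super chunk that would be handed to the LLM as context. The sizes and the names `split_fixed` and `build_sub_super_index` are hypothetical.

```python
def split_fixed(text, size):
    """Cut text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def build_sub_super_index(document, super_size=200, sub_size=50):
    """Return (super_chunks, sub_records); each sub record points at its parent."""
    super_chunks = split_fixed(document, super_size)
    sub_records = []
    for parent_id, parent in enumerate(super_chunks):
        for sub in split_fixed(parent, sub_size):
            # In a real pipeline the sub chunk text would be embedded here.
            sub_records.append({"text": sub, "parent_id": parent_id})
    return super_chunks, sub_records


# Synthetic 450-character document, used only to exercise the splitter.
document = "".join(str(i % 10) for i in range(450))
super_chunks, sub_records = build_sub_super_index(document)

# After a sub chunk matches a query, look up its parent for fuller context.
matched = sub_records[5]
context_for_llm = super_chunks[matched["parent_id"]]
```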
Metadata Examples
Chunking
- Paragraph position
- Section header
- Larger paragraph
- Sentence Number
- …
Non-Chunking
- Author
- Publisher
- Organization
- Role Based Access Control
- …
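To make the two metadata families concrete, here is a small sketch of a stored record carrying both kinds, plus a role-based filter applied before results are returned. The field names and the `visible_to` helper are illustrative assumptions, not a Milvus schema.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkRecord:
    text: str
    vector: list
    # Chunking metadata: where the chunk sits in its source document.
    section_header: str
    paragraph_position: int
    # Non-chunking metadata: facts about the document as a whole.
    author: str
    allowed_roles: set = field(default_factory=set)


def visible_to(records, role):
    """Keep only the records that the given role is allowed to read."""
    return [record for record in records if role in record.allowed_roles]


# Hypothetical records with made-up two-dimensional vectors.
records = [
    ChunkRecord("Q3 revenue outlook", [0.1, 0.2], "Finance", 0, "CFO", {"finance"}),
    ChunkRecord("Office opening hours", [0.3, 0.4], "General", 1, "HR", {"finance", "staff"}),
]

staff_view = visible_to(records, "staff")
```

In a vector database this kind of filter would typically be expressed as a condition on scalar fields stored alongside the vector.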
What your data looks like
Text:
“preferences of customers and prospective customers with respect to remote or hybrid
working, as a result of the COVID-19 pandemic, leading to a parallel delay, or potentially
permanent change, in receiving the corresponding revenue; •our projected financial
information, anticipated growth rate, and market opportunity; •our ability to maintain the
listing of our Class A Common Stock and Warrants on the NYSE; •our public securities’
potential liquidity and trading;”
Vector:
[-0.09975282847881317, -0.02853492833673954, -0.047886092215776443,
0.012315821833908558, -0.004004416521638632, 0.08756010979413986,
0.013248161412775517, 0.010704956017434597, -0.06194952502846718,
0.021150749176740646, 0.02453230880200863, 0.03979797288775444,
-0.032914288341999054, -0.011855324730277061, ...]
Takeaway:
Your embeddings strategy depends on your accuracy, cost, and use case needs
03 Chunking
Chunking Considerations
Chunk Size
Chunk Overlap
Character Splitters
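Chunk size and overlap can be illustrated with a plain sliding window over characters. This is a simplified sketch: the function `chunk_text` is hypothetical, and library splitters such as those in LangChain usually also try to break on separators like paragraphs or sentences rather than at arbitrary character offsets.

```python
def chunk_text(text, chunk_size, chunk_overlap=0):
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Synthetic 300-character text, used only to show the window arithmetic.
text = "".join(chr(ord("a") + i % 26) for i in range(300))
no_overlap = chunk_text(text, chunk_size=50, chunk_overlap=0)
with_overlap = chunk_text(text, chunk_size=128, chunk_overlap=20)
```

With overlap, the tail of one chunk is repeated at the head of the next, which helps keep a sentence that straddles a boundary retrievable from either side.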
Examples
Chunk Size=50, Overlap=0
Examples
Chunk Size=128, Overlap=20
Examples
Chunk Size=256, Overlap=50
Examples
SemanticChunker
How Does Your Data Look?
Conversation Data
Documentation Data
Lecture or Q/A Data
Takeaway:
Your chunking strategy depends on what your data looks like and what you need from it.
Demo!
Questions?
Give Milvus a Star! Chat with me on Discord!
Milvus Architecture
[Diagram. Components shown: SDK; Load Balancer; Access Layer (Proxy); Coordinator Service (Root, Query, Data, Index); Worker Nodes (Query Node, Data Node, Index Node); Meta Storage (etcd); Message Storage (Log Broker); Object Storage (Minio / S3 / AzureBlob) holding Log Snapshot, Delta File, and Index File. Flows labeled: DDL/DCL, DML, Notification, Control Signal.]

Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...
Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...
Safe Software
 
Establish Visibility and Manage Risk in the Supply Chain with Anchore SBOM
Establish Visibility and Manage Risk in the Supply Chain with Anchore SBOMEstablish Visibility and Manage Risk in the Supply Chain with Anchore SBOM
Establish Visibility and Manage Risk in the Supply Chain with Anchore SBOM
Anchore
 
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
Edge AI and Vision Alliance
 
ISOIEC 42005 Revolutionalises AI Impact Assessment.pptx
ISOIEC 42005 Revolutionalises AI Impact Assessment.pptxISOIEC 42005 Revolutionalises AI Impact Assessment.pptx
ISOIEC 42005 Revolutionalises AI Impact Assessment.pptx
AyilurRamnath1
 
Azure vs AWS Which Cloud Platform Is Best for Your Business in 2025
Azure vs AWS  Which Cloud Platform Is Best for Your Business in 2025Azure vs AWS  Which Cloud Platform Is Best for Your Business in 2025
Azure vs AWS Which Cloud Platform Is Best for Your Business in 2025
Infrassist Technologies Pvt. Ltd.
 
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
vertical-cnc-processing-centers-drillteq-v-200-en.pdfvertical-cnc-processing-centers-drillteq-v-200-en.pdf
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
AmirStern2
 
Data Virtualization: Bringing the Power of FME to Any Application
Data Virtualization: Bringing the Power of FME to Any ApplicationData Virtualization: Bringing the Power of FME to Any Application
Data Virtualization: Bringing the Power of FME to Any Application
Safe Software
 
7 Salesforce Data Cloud Best Practices.pdf
7 Salesforce Data Cloud Best Practices.pdf7 Salesforce Data Cloud Best Practices.pdf
7 Salesforce Data Cloud Best Practices.pdf
Minuscule Technologies
 
Ad

A Beginners Guide to Building a RAG App Using Open Source Milvus

  • 1. © Copyright 8/16/23 Zilliz. Stephen Batifol | Zilliz. A Beginners Guide to Building a RAG App Using Milvus
  • 2. Speaker: Stephen Batifol, Developer Advocate, Zilliz. stephen.batifol@zilliz.com https://www.linkedin.com/in/stephen-batifol/ https://twitter.com/stephenbtl
  • 3. RAG (Retrieval Augmented Generation)
  • 4. Basic Idea: Use RAG to force the LLM to work with your data by injecting it via a vector database like Milvus.
  • 5. Vector DB for RAG: Vector databases provide the ability to inject your data via semantic similarity. Considerations include scale, performance, and flexibility.
  • 6. LLMs are Stochastic: LLMs predict future tokens (à la RNNs). Example: “Milvus is the world’s most popular vector ___” gives {“database”: 0.86, “search”: 0.11, “embedding”: 0.01, …}. Downside: outdated input data can cause hallucinations, that is, plausible-sounding but factually incorrect responses.
  • 7. Basic RAG Architecture
  • 8. 01 Tech Stack
  • 9. Tech Stack
  • 10. LangChain: Framework for building LLM applications; focus on retrieving data and integrating with LLMs; loading the data; chunk & chunk overlap; integrations with most popular tools.
  • 11. Ollama: Run quantized LLMs locally; embeddings models.
  • 12. Milvus: 1. Cloud Native, Distributed System Architecture 2. True Separation of Concerns 3. Scalable Index Creation Strategy with 512 MB Segments
  • 13. Embeddings Models
  • 14. 02 Embeddings
  • 15. Examining Embeddings: Picking a model; what to embed; metadata.
  • 16. Embeddings Strategies: Level 1: Embedding Chunks Directly. Level 2: Embedding Sub and Super Chunks. Level 3: Incorporating Chunking and Non-Chunking Metadata.
  • 17. Metadata Examples: Chunking: paragraph position, section header, larger paragraph, sentence number, … Non-Chunking: author, publisher, organization, role-based access control, …
  • 18. What your data looks like: Text: “preferences of customers and prospective customers with respect to remote or hybrid working, as a result of the COVID-19 pandemic, leading to a parallel delay, or potentially permanent change, in receiving the corresponding revenue; • our projected financial information, anticipated growth rate, and market opportunity; • our ability to maintain the listing of our Class A Common Stock and Warrants on the NYSE; • our public securities’ potential liquidity and trading;” Vector: [-0.09975282847881317, -0.02853492833673954, -0.047886092215776443, 0.012315821833908558, -0.004004416521638632, 0.08756010979413986, 0.013248161412775517, 0.010704956017434597, -0.06194952502846718, 0.021150749176740646, 0.02453230880200863, 0.03979797288775444, -0.032914288341999054, -0.011855324730277061, ...]
  • 19. Takeaway: Your embeddings strategy depends on your accuracy, cost, and use case needs.
  • 20. 03 Chunking
  • 21. Chunking Considerations: Chunk size; chunk overlap; character splitters.
  • 22. Examples: Chunk Size=50, Overlap=0
  • 23. Examples: Chunk Size=128, Overlap=20
  • 24. Examples: Chunk Size=256, Overlap=50
  • 25. Examples: SemanticChunker
  • 26. How Does Your Data Look? Conversation data; documentation data; lecture or Q/A data.
  • 27. Takeaway: Your chunking strategy depends on what your data looks like and what you need from it.
  • 28. Demo!
  • 29. Questions? Give Milvus a Star! Chat with me on Discord!
  • 30. Milvus Architecture (diagram labels): SDK; Load Balancer; Access Layer: Proxy; Coordinator Service: Root, Query, Data, Index; Worker Node: Query Node, Data Node, Index Node; Meta Storage: etcd; Message Storage: Log Broker; Object Storage: Minio / S3 / AzureBlob (Log Snapshot, Delta File, Index File). Flow labels: DDL/DCL, DML, NOTIFICATION, CONTROL SIGNAL.
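
The chunk size and chunk overlap settings from the chunking slides (22 to 24) can be illustrated with a minimal sketch. This is a plain-Python stand-in written for illustration, not the splitter used in the talk or any LangChain API; the function name `chunk_text` and the sample text are assumptions. It splits by characters only and ignores sentence or paragraph boundaries.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Split text into fixed-size character chunks (hypothetical helper).

    Consecutive chunks share `chunk_overlap` characters, so context that
    straddles a chunk boundary appears in both neighbours.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


if __name__ == "__main__":
    # Made-up sample text; any document string works here.
    sample = "Milvus is a vector database used for retrieval in RAG apps. " * 20

    # The three settings shown on the example slides.
    for size, overlap in [(50, 0), (128, 20), (256, 50)]:
        pieces = chunk_text(sample, size, overlap)
        print(f"size={size} overlap={overlap} -> {len(pieces)} chunks")
```

Smaller chunks give more, narrower pieces to embed; a larger overlap repeats more text across neighbouring chunks, which costs storage but reduces the chance that a relevant passage is cut in half.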