A Beginners Guide to Building a RAG App Using Open Source Milvus

May 2, 2024

This talk shows how to build a RAG application using Milvus. Retrieval-augmented generation (RAG) is a technique for improving the accuracy and reliability of generative AI models with facts fetched from external sources.

1 | © Copyright 8/16/23 Zilliz
Stephen Batifol | Zilliz
A Beginners Guide to Building
a RAG App Using Milvus

Speaker
Stephen Batifol
Developer Advocate, Zilliz
stephen.batifol@zilliz.com
https://www.linkedin.com/in/stephen-batifol/
https://twitter.com/stephenbtl

RAG
(Retrieval Augmented Generation)

Basic Idea
Use RAG to force the LLM to work with your data
by injecting it via a vector database like Milvus

Vector DB for RAG
Vector Databases provide the ability to inject your data via
semantic similarity
Considerations include: scale, performance, and flexibility
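To make "semantic similarity" concrete, here is a minimal sketch of what a vector database does at query time: rank stored vectors by how close they are to the query vector. The three-dimensional vectors, the sample texts, and the `top_k` helper are made up for illustration. A real deployment would use embeddings from a model and an index inside Milvus rather than a linear scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, docs, k=2):
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Toy "embeddings" (hypothetical values, not produced by a real model).
docs = [
    ("Milvus is a vector database", [0.9, 0.1, 0.0]),
    ("Bananas are yellow", [0.0, 0.2, 0.9]),
    ("Vector search finds similar items", [0.8, 0.3, 0.1]),
]

results = top_k([1.0, 0.2, 0.0], docs, k=2)
print(results)
```

The linear scan is fine for a handful of vectors. The scale and performance considerations on this slide are the reason a dedicated vector database is used instead.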

LLMs are Stochastic
LLMs predict future tokens (à la RNNs)
• “Milvus is the world’s most popular vector ___”
• {“database”: 0.86, “search”: 0.11, “embedding”: 0.01, …}
Downside: outdated input data can cause hallucination
• Plausible-sounding but factually incorrect responses
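The next-token distribution on this slide can be turned into a small sketch of greedy versus sampled decoding. Only the three tokens listed on the slide are used, and the elided remainder of the distribution is ignored. `random.choices` treats the values as relative weights, so they do not need to sum to 1. The function names are illustrative.

```python
import random

# Probabilities copied from the slide; the rest of the vocabulary is omitted.
next_token_probs = {"database": 0.86, "search": 0.11, "embedding": 0.01}

def greedy(probs):
    """Always pick the most likely token."""
    return max(probs, key=probs.get)

def sample(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so runs are repeatable
greedy_token = greedy(next_token_probs)
samples = [sample(next_token_probs, rng) for _ in range(1000)]

print(greedy_token)
print({t: samples.count(t) for t in next_token_probs})
```

Sampling is why the same prompt can produce different completions. The model picks a plausible continuation, not a verified fact.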

Basic RAG Architecture
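The diagram for this slide did not survive extraction, so the following is only a rough sketch of the usual retrieve, augment, generate flow, not the speaker's architecture. The retriever here ranks by shared words as a stand-in for a vector search. `retrieve`, `build_prompt`, and the sample store are hypothetical names and data, and the LLM call itself is left out.

```python
import re

def tokenize(text):
    """Lowercase word set, punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, store, k=1):
    """Stand-in for vector search: rank documents by words shared with the question."""
    q_words = tokenize(question)
    ranked = sorted(store, key=lambda doc: len(q_words & tokenize(doc)), reverse=True)
    return ranked[:k]

def build_prompt(question, contexts):
    """Augment the question with the retrieved context before calling the LLM."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}\n"
    )

store = [
    "Milvus is an open source vector database.",
    "Ollama runs quantized LLMs locally.",
    "LangChain is a framework for building LLM applications.",
]

question = "What is Milvus?"
contexts = retrieve(question, store, k=1)
prompt = build_prompt(question, contexts)
print(prompt)  # this string would be sent to the LLM
```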


Takeaway:
Your chunking strategy depends on what your data looks like and what you need from it.

Demo!

Questions?
Give Milvus a Star! Chat with me on Discord!

Milvus Architecture
[Diagram components]
• SDK, Load Balancer
• Access Layer: Proxy
• Coordinator Service: Root, Query, Data, Index
• Worker Node: Query Node, Data Node, Index Node
• Meta Storage: etcd
• Message Storage: Log Broker
• Object Storage: Minio / S3 / AzureBlob (Log Snapshot, Delta File, Index File)
• Flows: DDL/DCL, DML, NOTIFICATION, CONTROL SIGNAL

01 Tech Stack

Tech Stack

LangChain
• Framework for building LLM Applications
• Focus on retrieving data and integrating with LLMs
• Loading the Data
• Chunk & Chunk Overlap
• Integrations with most popular tools

Ollama
• Run quantized LLMs Locally
• Embeddings Models

Milvus
1. Cloud Native, Distributed System Architecture
2. True Separation of Concerns
3. Scalable Index Creation Strategy with 512 MB Segments

Embeddings Models

02 Embeddings

Examining Embeddings
Picking a model
What to embed
Metadata

Embeddings Strategies
Level 1: Embedding Chunks Directly
Level 2: Embedding Sub and Super Chunks
Level 3: Incorporating Chunking and Non-Chunking Metadata

Metadata Examples
Chunking
- Paragraph position
- Section header
- Larger paragraph
- Sentence Number
- …
Non-Chunking
- Author
- Publisher
- Organization
- Role Based Access Control
- …
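One way to picture the two kinds of metadata is as extra fields stored next to each chunk's text and vector. The sketch below is an assumption about how such a record might look, not a Milvus schema. The `Chunk` class, the field names, the sample values, and the `visible_to` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Chunk:
    text: str
    vector: List[float]
    # Chunking metadata: where the chunk sits inside its document.
    chunk_meta: Dict[str, object] = field(default_factory=dict)
    # Non-chunking metadata: facts about the document as a whole.
    doc_meta: Dict[str, object] = field(default_factory=dict)

def visible_to(chunks, role):
    """Keep only chunks whose document-level metadata allows the given role."""
    return [c for c in chunks if role in c.doc_meta.get("allowed_roles", [])]

chunks = [
    Chunk(
        text="our projected financial information, anticipated growth rate",
        vector=[0.1, 0.2],
        chunk_meta={"section_header": "Risk Factors", "paragraph_position": 3},
        doc_meta={"author": "Finance Team", "allowed_roles": ["analyst", "admin"]},
    ),
    Chunk(
        text="internal planning notes",
        vector=[0.3, 0.1],
        chunk_meta={"section_header": "Planning", "paragraph_position": 1},
        doc_meta={"author": "Operations", "allowed_roles": ["admin"]},
    ),
]

analyst_view = visible_to(chunks, "analyst")
print([c.chunk_meta["section_header"] for c in analyst_view])
```

Filtering on metadata before or alongside the similarity search is what lets role-based access control apply to retrieved context.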

What your data looks like
Text:
“preferences of customers and prospective customers with respect to remote or hybrid working, as a result of the COVID-19 pandemic, leading to a parallel delay, or potentially permanent change, in receiving the corresponding revenue; • our projected financial information, anticipated growth rate, and market opportunity; • our ability to maintain the listing of our Class A Common Stock and Warrants on the NYSE; • our public securities’ potential liquidity and trading;”
Vector:
[-0.09975282847881317, -0.02853492833673954, -0.047886092215776443, 0.012315821833908558, -0.004004416521638632, 0.08756010979413986, 0.013248161412775517, 0.010704956017434597, -0.06194952502846718, 0.021150749176740646, 0.02453230880200863, 0.03979797288775444, -0.032914288341999054, -0.011855324730277061, ...]

Takeaway:
Your embeddings strategy depends on your accuracy, cost, and use case needs

03 Chunking

Chunking Considerations
Chunk Size
Chunk Overlap
Character Splitters

Examples
Chunk Size=50, Overlap=0

Examples
Chunk Size=128, Overlap=20

Examples
Chunk Size=256, Overlap=50
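The chunk size and overlap settings from these example slides can be tried out with a minimal fixed-size splitter. This sketch assumes sizes are measured in characters and slices blindly at character offsets. It is not LangChain's splitter, which also tries to break on separators. `chunk_text` and the sample text are illustrative.

```python
def chunk_text(text, chunk_size, overlap=0):
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the final window is not just leftover overlap.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]

sample = ("Milvus stores vectors. " * 30)[:600]

# The three settings shown on the example slides.
for size, overlap in [(50, 0), (128, 20), (256, 50)]:
    chunks = chunk_text(sample, size, overlap)
    print(f"size={size} overlap={overlap} -> {len(chunks)} chunks")
```

Smaller chunks give more precise matches but less context per hit. Overlap repeats some text so a sentence cut at a boundary still appears whole in at least one chunk.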

Examples
SemanticChunker

How Does Your Data Look?
• Conversation Data
• Documentation Data
• Lecture or Q/A Data


How Milvus allows you to run Full Text SearchZilliz

How to Optimize Your Embedding Model Selection and Development through TDA Cl...Zilliz

About this webinar: Embedding models are a crucial layer in vector database applications, yet figuring out which embedding model is best for your dataset has been a notoriously difficult task. However, an efficient and intuitive approach for many use cases can be produced through Topological Data Analysis (TDA) on your evaluation dataset. Identifying patterns of weak performing behavior in your model is made easy and scalable through a table that reveals the performance of different semantic categories of queries being made to your vector database. Topics covered: - Risks and limitations of current evaluation approaches for embeddings - Compare embedding models on your own dataset using Navigable TDA clusters - ML lifecycle case studies in ecommerce: model selection, fine-tuning, and post-deployment

Keeping Data Fresh: Mastering Updates in Vector DatabasesZilliz

Managing and extracting value from unstructured data has become a critical challenge as the volume of data continues to grow. This virtual event brings together industry experts to explore the latest techniques in Retrieval Augmented Generation (RAG) and vector databases. Discover how RAG systems are revolutionizing natural language processing by seamlessly integrating information retrieval techniques, enabling more accurate and contextual language generation. Gain practical insights into building and optimizing these applications. This session will also cover how vector databases like Milvus, play a key role in RAG and working with unstructured data. Learn proven strategies for maintaining data freshness, accuracy, and efficiency, ensuring your organization stays ahead of the curve.

Milvus 2.5: Full-Text Search, More Powerful Metadata Filtering, and more!Zilliz

Vector Databases for Enhanced ClassificationZilliz

What will you learn? In this webinar, we dive into the use of Milvus as a high-performance vector database tailored for handling large-scale document collections, focusing on European Commission and Parliament acts. Our approach shifts from traditional RAG-based classification to a hybrid search method, leveraging K-Nearest Neighbor (KNN) for pinpointing top documents relevant to classification tasks. This session is ideal for those aiming to refine classification accuracy by leveraging vector-based indexing and hybrid retrieval in vast datasets. Topics covered: KNN and Sparse Search Integration: How KNN retrieval combined with sparse search helps extract top documents aligned with classification needs. Versatile Embeddings for Multilingual and Multi-Domain Applications: The BGE M3-Embedding model is designed to provide robust, high-quality embeddings across multiple languages and domains, making it adaptable for diverse tasks in multilingual and cross-functional environments. Real-World Application: Step-by-step demonstration using European legislative acts to showcase KNN-driven retrieval and classification workflows.