RAG systems are widely discussed, but most treatments stick to the basics. In this talk, Stephen will show you how to build an Agentic RAG system using LangChain and Milvus.
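To make the retrieval step concrete, here is a dependency-free sketch of "find the documents most similar to a query"; the talk's actual stack (LangChain + Milvus) replaces this toy bag-of-words scoring with learned embeddings and a vector index:

```python
from collections import Counter
from math import sqrt

def bow(text):
    """Bag-of-words vector as a Counter of lowercase tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """Return the top-k documents most similar to the query."""
    q = bow(query)
    return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

docs = [
    "Milvus is a vector database for similarity search",
    "LangChain orchestrates LLM pipelines",
    "Bananas are rich in potassium",
]
top = retrieve("vector database search", docs, k=1)
```

The retrieved passages are then stuffed into the LLM prompt as grounding context; that second half is where the agentic behavior comes in.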
A deep-dive into Agentic AI covering the following topics:
Links to the corresponding articles on Medium. The article on Long-term #Memory Management is now 3rd among the 10 most-read stories on #AgenticAI on Medium :-)
- Agentic AI Platform Reference #Architecture
https://medium.com/datadriveninvestor/ai-agent-platform-reference-architecture-0be5b19d0eba
- AI Agents #Marketplace & Discovery for #MultiAgentSystems
https://medium.com/ai-advances/ai-agents-marketplace-discovery-for-multi-agent-systems-27a31b6b1ca6
- Personalizing #UX for Agentic AI
https://medium.com/ai-advances/personalized-ux-for-agentic-ai-ab132f2eeb03
- Agent #Observability
https://medium.com/ai-advances/stateful-and-responsible-ai-agents-7af386268554
- Long-term #Memory Management
https://medium.com/ai-advances/long-term-memory-for-agentic-ai-systems-4ae9b37c6c0f
- Agentic AI Scenarios:
Agentic #RAGs: extending RAGs to #SQL Databases
https://medium.com/ai-advances/agentic-rags-extending-rags-to-sql-databases-1509b25ca3e7
#ReinforcementLearning Agents for #ControlSystems
https://medium.com/ai-advances/reinforcement-learning-agents-for-industrial-control-systems-b917b513f0c4
- Responsible AI Agents
#Privacy Risks of LLMs
https://medium.com/ai-advances/privacy-risks-of-large-language-models-llms-5c0f96dccc56
Responsible #LLMOps in Towards Data Science
https://towardsdatascience.com/responsible-llmops-985cd1af3639
What exactly are AI Agents, and how do they operate? How do they compare to and interact with LLMs and functionalities such as function calling, chain-of-thought processing, assistants, tools, or actions? In this talk, I delve into the unique features of Agentic AI, including perception, state estimation, goal setting, planning, and action selection & execution. We will define various levels of Agentic AI and form a map to help navigate this emerging landscape. By categorizing current agent-based or agent-related solutions with practical examples, we'll provide an overview of the current state of Agentic AI.
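The perception / state-estimation / planning / action cycle described above can be caricatured in a few lines (all names here are illustrative, not from the talk):

```python
def run_agent(observations, goal):
    """Toy agent loop: perceive, update state, check the goal, act."""
    state = set()
    actions = []
    for obs in observations:              # perception
        state.add(obs)                    # state estimation
        if goal in state:                 # goal check
            break
        actions.append(f"explore:{obs}")  # planning + action selection & execution
    return actions, goal in state

acts, done = run_agent(["door", "key", "treasure"], goal="treasure")
```

Real agentic systems replace each commented step with an LLM call, a tool invocation, or a learned policy, but the control flow is recognizably this loop.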
Collaborative Agents with Tools & Knowledge (Graphs) using LangGraph & LangChain (Tilores)
Discover how enterprise AI systems can harness the power of multi-agent architectures through LangGraph, a framework that masterfully balances control with autonomy. Learn how to overcome common LLM application challenges using LangGraph's powerful features for parallel execution, multi-agent orchestration, and human-in-the-loop interactions. As part of the thriving LangChain ecosystem that powers over 100,000 applications, LangGraph offers developers practical approaches to building sophisticated AI systems with robust feedback loops. Perfect for developers and architects seeking to advance their enterprise AI solutions using state-of-the-art multi-agent architectures.
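As a rough, dependency-free sketch of the graph-of-nodes pattern LangGraph builds on (an illustration of the idea, not LangGraph's actual API): nodes transform a shared state dict and edges name the next node.

```python
def build_graph(nodes, edges, entry):
    """Tiny stand-in for a state-graph runner: run nodes over a shared
    state dict, following edges until the sentinel 'END' is reached."""
    def invoke(state):
        current = entry
        while current != "END":
            state = nodes[current](state)
            current = edges[current]
        return state
    return invoke

def draft(state):
    return {**state, "draft": f"answer to {state['question']}"}

def review(state):
    return {**state, "approved": "answer" in state["draft"]}

app = build_graph(
    nodes={"draft": draft, "review": review},
    edges={"draft": "review", "review": "END"},
    entry="draft",
)
result = app({"question": "what is LangGraph?"})
```

LangGraph adds what this toy omits: conditional edges, parallel branches, checkpointing, and human-in-the-loop interrupts.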
Generative AI Fundamentals and Large Language Models (AdventureWorld5)
Thank you for the detailed review of the protein bars. I'm glad to hear you and your family are enjoying them as a healthy snack and meal replacement option. A couple of suggestions based on your feedback:
- For future orders, you may want to check the expiration dates to help avoid any dried out bars towards the end of the box. Freshness is key to maintaining the moist texture.
- When introducing someone new to the bars, selecting one in-person if possible allows checking the flexibility as an indicator it's moist inside. This could help avoid a disappointing first impression from a dry sample.
- Storing opened boxes in an airtight container in the fridge may help extend the freshness even further when you can't
Pitch Deck Teardown: Wilco's $7 million Seed deck (Haje Jan Kamps)
Wilco is a platform that allows software engineers to practice their hands-on skills through interactive "quests" that simulate real-life work scenarios. Engineers join a "fantasy company" and work through challenges using common tools from their actual jobs. This helps engineers improve both technical skills and soft skills. It also provides managers visibility into their engineers' progress. The platform is customizable and connects to common tools. Wilco aims to address gaps in engineers' development in a scalable way that other options like internal programs and bootcamps cannot. It is seeking funding to further develop its product and community.
Gearing up for the new Future of Work: Embracing Agentic AI (DianaGray10)
As we look toward the future of AI in 2025, Agentic AI is poised to revolutionize how we interact with intelligent systems, offering unprecedented autonomy, adaptability, and decision-making capabilities. This event will dive deep into what Agentic AI means, its key characteristics, and the significant value it can bring to businesses and industries. Attendees will gain insights into the emerging trends, challenges, and opportunities that come with this transformative technology, and learn how to prepare to support and implement Agentic AI solutions in the coming years.
Join us as we explore:
🔍 What is Agentic AI? Understanding its core principles and how it differs from traditional AI systems.
💡 The Value Proposition: How Agentic AI can streamline processes, drive innovation, and support smarter, more autonomous decision-making in various industries.
🚀 The Road Ahead for 2025: What businesses and professionals can do today to prepare for the shift toward more autonomous, intelligent systems.
🛠️ Supporting Agentic AI Solutions: Practical insights into how to integrate and manage Agentic AI solutions effectively within organizations.
Who Should Attend:
This event is ideal for anyone interested in the future of AI and automation, as well as those looking to understand and leverage Agentic AI for business growth and operational efficiency.
🤖 AI/Automation Professionals: Learn how to future-proof your skills and understand the upcoming wave of AI-driven transformation.
💼 Business Leaders and Decision-Makers: Gain strategic insights into the potential of Agentic AI for operational improvements, innovation, and competitive advantage.
🖥️ Technology and Solutions Architects: Understand the technical landscape of Agentic AI and how to build, integrate, and scale these advanced systems.
👨💻 Developers and Engineers: Learn about the latest advancements in AI technology and how to start working with Agentic AI solutions.
🌱 AI Enthusiasts and Innovators: For anyone curious about the future of AI and its potential to drive change in industries ranging from finance to healthcare and beyond.
Introduction to Open Source RAG and RAG Evaluation (Zilliz)
You’ve heard good data matters in Machine Learning, but does it matter for Generative AI applications? Corporate data often differs significantly from the general Internet data used to train most foundation models. Join me for a demo on building an open source RAG (Retrieval Augmented Generation) stack using Milvus vector database for Retrieval, LangChain, Llama 3 with Ollama, Ragas RAG Eval, and optional Zilliz cloud, OpenAI.
Plantee Seed Pitch Deck for TC Pitch Deck Teardown (Haje Jan Kamps)
The document is a pitch deck for Plantee Innovations, which is developing smart indoor gardening devices. They are raising $1.4 million to fund mass production and reach $1.7 million in revenue by 2025. Their flagship product is an all-in-one smart indoor greenhouse that monitors conditions like light, water, and temperature to provide ideal care for plants. They have market validation from a successful Kickstarter campaign and aim to address the large market of people who struggle to keep houseplants alive.
GENERATIVE AI, THE FUTURE OF PRODUCTIVITY (Andre Muscat)
Discuss the impact and opportunity of using Generative AI to support your development and creative teams
* Explore business challenges in content creation
* Cost-per-unit of different types of content
* Use AI to reduce cost-per-unit
* New partnerships being formed that will have a material impact on the way we search and engage with content
Part 4 of a 9 Part Research Series named "What matters in AI" published on www.andremuscat.com
This document summarizes a presentation given by Professor Pekka Abrahamsson on how ChatGPT and AI-assisted coding are profoundly changing software engineering. The presentation covers several key points:
- ChatGPT and AI tools like Copilot are beginning to be adopted in software engineering to provide code snippets, answers to technical questions, and assist with debugging, but issues around code ownership, reliability, and security need to be addressed.
- Early studies show potential benefits of ChatGPT for tasks like software testing education, code quality improvement, and requirements elicitation, but more research is still needed.
- Prompt engineering techniques can help maximize the usefulness of ChatGPT for software engineering tasks. Overall, AI
OpenAI is an AI research company dedicated to developing safe and beneficial artificial intelligence. Their mission is to ensure AI benefits humanity. OpenAI conducts research across various AI domains and develops technologies like ChatGPT, a large language model capable of answering questions and generating human-like responses. The company also offers developers access to its models and tools through an API.
Generative AI models, such as ChatGPT and Stable Diffusion, can create new and original content like text, images, video, audio, or other data from simple prompts, as well as handle complex dialogs and reason about problems with or without images. These models are disrupting traditional technologies, from search and content creation to automation and problem solving, and are fundamentally shaping the future user interface to computing devices. Generative AI can apply broadly across industries, providing significant enhancements for utility, productivity, and entertainment. As generative AI adoption grows at record-setting speeds and computing demands increase, on-device and hybrid processing are more important than ever. Just like traditional computing evolved from mainframes to today’s mix of cloud and edge devices, AI processing will be distributed between them for AI to scale and reach its full potential.
In this presentation you’ll learn about:
- Why on-device AI is key
- Full-stack AI optimizations to make on-device AI possible and efficient
- Advanced techniques like quantization, distillation, and speculative decoding
- How generative AI models can be run on device and examples of some running now
- Qualcomm Technologies’ role in scaling on-device generative AI
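Of the techniques listed, quantization is the simplest to illustrate. A toy 8-bit affine quantization of a weight vector (a sketch of the idea, not Qualcomm's actual stack, which operates on full tensors with hardware-aware calibration):

```python
def quantize_int8(weights):
    """Affine (asymmetric) 8-bit quantization: map floats onto 0..255."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0          # guard against constant inputs
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float weights from the quantized values."""
    return [v * scale + lo for v in q]

w = [-1.0, -0.5, 0.0, 0.5, 1.0]
q, scale, lo = quantize_int8(w)
restored = dequantize(q, scale, lo)
```

Storing `q` in one byte per weight instead of four (float32) is where the 4x memory and bandwidth savings that make on-device LLMs practical come from.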
Exploring Opportunities in the Generative AI Value Chain.pdf (Dung Hoang)
The article "Exploring Opportunities in the Generative AI Value Chain" by McKinsey & Company's QuantumBlack provides insights into the value created by generative artificial intelligence (AI) and its potential applications.
Leveraging Generative AI & Best Practices (DianaGray10)
In this event we will cover:
- What Generative AI is and how it is being used for the future of work.
- Best practices for developing and deploying generative AI-based models in production.
- The future of Generative AI: how it is expected to evolve in the coming years.
1) Artificial intelligence is the science and engineering of making intelligent machines that can perceive and take actions to maximize their success.
2) Early AI programs included the Logic Theorist which solved math theorems, and programs for playing checkers that learned from experience.
3) Recent advances in data, computing power, and techniques like machine learning, deep learning and neural networks have greatly expanded what AI can accomplish, with applications including computer vision, speech recognition, translation and more.
4) While current AI is specialized or "weak," the goal is to develop "strong" or general human-level AI that can perform any intellectual task, but this poses risks that must be addressed to ensure such systems remain
This talk overviews my background as a female data scientist, introduces many types of generative AI, discusses potential use cases, highlights the need for representation in generative AI, and showcases a few tools that currently exist.
This document discusses the rise of conversational AI and how digital agents can represent brands. It notes that generative AI enables new types of interactions that are more helpful than traditional chatbots. Digital agents can automate work by having natural conversations to complete tasks on behalf of users. The document provides examples of how a sales digital agent could assist a user before, during, and after a client meeting. It outlines six key ingredients for building effective digital agents, including prompting, context, proprietary knowledge, voice, reasoning, and code generation. The challenge for brands is to design unique digital agents that embody their values and approach in order to benefit from the changes brought by conversational AI.
The document discusses how generative AI can be used to scale content operations by reducing the time it takes to generate content. It explains that generative AI learns from natural language models and can generate new text or ideas based on prompts provided by users. While generative AI has benefits like speeding up content creation and ideation, it also has limitations such as not being able to conduct original research or ensure quality. The document provides examples of how generative AI can be used for tasks like generating ideas, simplifying complex text, creating visuals, and more. It also discusses challenges like bias in AI models and the low risk of plagiarism.
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s... (Mihai Criveti)
Mihai is the Principal Architect for Platform Engineering and Technology Solutions at IBM, responsible for Cloud Native and AI Solutions. He is a Red Hat Certified Architect, CKA/CKS, a leader in the IBM Open Innovation community, and an advocate for open source development. Mihai is driving the development of Retrieval Augmented Generation platforms and solutions for Generative AI at IBM that leverage WatsonX, vector databases, LangChain, HuggingFace, and open source AI models.
Mihai will share lessons learned building Retrieval Augmented Generation, or “Chat with Documents” platforms and APIs that scale, and deploy on Kubernetes. His talk will cover use cases for Generative AI, limitations of Large Language Models, use of RAG, Vector Databases and Fine Tuning to overcome model limitations and build solutions that connect to your data and provide content grounding, limit hallucinations and form the basis of explainable AI. In terms of technology, he will cover LLAMA2, HuggingFace TGIS, SentenceTransformers embedding models using Python, LangChain, and Weaviate and ChromaDB vector databases. He’ll also share tips on writing code using LLM, including building an agent for Ansible and containers.
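One recurring building block in such "Chat with Documents" pipelines is splitting text into overlapping chunks before embedding. A minimal sketch (the size and overlap values are illustrative defaults, not from the talk):

```python
def chunk(text, size=200, overlap=50):
    """Split text into overlapping character chunks for embedding.
    Overlap keeps sentences that straddle a boundary retrievable
    from at least one chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

pieces = chunk("word " * 100, size=120, overlap=30)
```

Production stacks typically chunk on sentence or token boundaries rather than raw characters, but the size/overlap trade-off is the same.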
Scaling factors for Large Language Model Architectures:
• Vector Database: consider sharding and High Availability
• Fine Tuning: collecting data to be used for fine tuning
• Governance and Model Benchmarking: how are you testing your model performance over time, with different prompts, one-shot, and various parameters
• Chain of Reasoning and Agents
• Caching embeddings and responses
• Personalization and Conversational Memory Database
• Streaming Responses and optimizing performance. A fine-tuned 13B model may perform better than a poor 70B one!
• Calling 3rd-party functions or APIs for reasoning or other types of data (e.g., LLMs are terrible at reasoning and prediction; consider calling other models)
• Fallback techniques: fall back to a different model, or default answers
• API scaling techniques, rate limiting, etc.
• Async, streaming and parallelization, multiprocessing, GPU acceleration (including embeddings), generating your API using OpenAPI, etc.
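Two of the cheaper wins on the list above, caching embeddings and falling back across models, can be sketched together (the embedding function and model callables here are placeholders, not a real provider API):

```python
from functools import lru_cache

@lru_cache(maxsize=4096)
def embed(text):
    """Placeholder embedding: letter counts. In practice this would
    call a real embedding model; the cache avoids recomputing it."""
    return tuple(text.lower().count(c) for c in "abcdefghij")

def generate_with_fallback(prompt, models, default="Sorry, try again later."):
    """Try each model in order; return a canned answer if all fail."""
    for model in models:
        try:
            return model(prompt)
        except Exception:
            continue
    return default

def flaky(prompt):
    raise RuntimeError("rate limited")   # simulates a 429 from a provider

def stable(prompt):
    return f"echo: {prompt}"

answer = generate_with_fallback("hello", [flaky, stable])
```

The same shape scales up: the cache becomes a shared store like Redis, and the fallback chain becomes "primary model, cheaper model, canned default".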
This document provides a 50-hour roadmap for building large language model (LLM) applications. It introduces key concepts like text-based and image-based generative AI models, encoder-decoder models, attention mechanisms, and transformers. It then covers topics like intro to image generation, generative AI applications, embeddings, attention mechanisms, transformers, vector databases, semantic search, prompt engineering, fine-tuning foundation models, orchestration frameworks, autonomous agents, bias and fairness, and recommended LLM application projects. The document recommends several hands-on exercises and lists upcoming bootcamp dates and locations for learning to build LLM applications.
The AI Index Report 2023 provides the following key highlights from its research and development chapter:
1. The US and China have the most cross-country AI research collaborations, though the rate of growth has slowed in recent years.
2. Global AI research output has more than doubled since 2010, led by areas like machine learning, computer vision and pattern recognition.
3. China now leads in total AI research publications, while the US still leads in conference and repository citations but these leads are decreasing.
4. Industry now produces far more significant AI models than academia, as building state-of-the-art systems requires greater resources that industry can provide.
5. Large language models
Presenting the landscape of AI/ML in 2023 with a quick summary of the last 10 years of progress, the current situation, and a look at what is happening behind the scenes.
An overview of the most important AI capabilities in marketing, advertising and content creation. I made this presentation to inform, educate and inspire people in the creative industries to familiarise themselves with the incredible toolsets that are already here and in development. I also explain how generative AI works and explore some possible new roles and business models for agencies. Hope you enjoy it!
This document provides information about a bootcamp to build applications using Large Language Models (LLMs). The bootcamp consists of 11 modules covering topics such as introduction to generative AI, text analytics techniques, neural network models for natural language processing, transformer models, embedding retrieval, semantic search, prompt engineering, fine-tuning LLMs, orchestration frameworks, the LangChain application platform, and a final project to build a custom LLM application. The bootcamp will be held in various locations and dates between September 2023 and January 2024.
A non-technical overview of Large Language Models, exploring their potential, limitations, and customization for specific challenges. While this deck was tailored with an audience from the financial industry in mind, its content remains broadly applicable.
(Note: Discover a slightly updated version of this deck at slideshare.net/LoicMerckel/introduction-to-llms.)
The Future of AI is Generative, not Discriminative, 5/26/2021 (Steve Omohundro)
The deep learning AI revolution has been sweeping the world for a decade now. Deep neural nets are routinely used for tasks like translation, fraud detection, and image classification. PwC estimates that they will create $15.7 trillion/year of value by 2030. But most current networks are "discriminative" in that they directly map inputs to predictions. This type of model requires lots of training examples, doesn't generalize well outside of its training set, creates inscrutable representations, is subject to adversarial examples, and makes knowledge transfer difficult. People, in contrast, can learn from just a few examples, generalize far beyond their experience, and can easily transfer and reuse knowledge. In recent years, new kinds of "generative" AI models have begun to exhibit these desirable human characteristics. They represent the causal generative processes by which the data is created and can be compositional, compact, and directly interpretable. Generative AI systems that assist people can model their needs and desires and interact with empathy. Their adaptability to changing circumstances will likely be required by rapidly changing AI-driven business and social systems. Generative AI will be the engine of future AI innovation.
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT (Anant Corporation)
This document provides an agenda for a full-day bootcamp on large language models (LLMs) like GPT-3. The bootcamp will cover fundamentals of machine learning and neural networks, the transformer architecture, how LLMs work, and popular LLMs beyond ChatGPT. The agenda includes sessions on LLM strategy and theory, design patterns for LLMs, no-code/code stacks for LLMs, and building a custom chatbot with an LLM and your own data.
* "Responsible AI Leadership: A Global Summit on Generative AI"
* April 2023 guide for experts and policymakers
* Developing and governing generative AI systems
* 100+ thought leaders and practitioners participated
* Recommendations for responsible development, open innovation & social progress
* 30 action-oriented recommendations aim to help navigate AI complexities
Using LLM Agents with Llama 3.2, LangGraph and Milvus (Zilliz)
We explore Agentic RAG (internet search, hallucination checking, answer correction). Don't miss this deep dive into one of the hottest topics in AI today!
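The "check for hallucinations" step can be crudely approximated with a lexical grounding test, sketched below; production pipelines typically use an LLM grader instead, and the threshold and word-length cutoff here are assumptions for illustration:

```python
def grounded(answer, context, threshold=0.6):
    """Flag an answer as grounded if enough of its content words
    (longer than 3 characters) appear in the retrieved context.
    A crude lexical proxy for a real hallucination check."""
    ans = {w for w in answer.lower().split() if len(w) > 3}
    ctx = set(context.lower().split())
    if not ans:
        return True
    return len(ans & ctx) / len(ans) >= threshold

context = "milvus is an open source vector database built for scale"
ok = grounded("milvus is a vector database", context)
bad = grounded("milvus was founded on mars in 1850", context)
```

When the check fails, the agent loops back: re-retrieve, search the internet, or regenerate the answer.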
Building an Agentic RAG locally with Ollama and Milvus (Zilliz)
With the rise of open-source LLMs like Llama, Mistral, Gemma, and more, it has become apparent that LLMs can be useful even when run locally. In this talk, we will see how to deploy an Agentic Retrieval Augmented Generation (RAG) setup using Ollama, with Milvus as the vector database, on your laptop. That way, you can also avoid being rate-limited by OpenAI like I have been in the past.
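A minimal sketch of the routing decision such an agentic setup makes, send the question to the local vector store or straight to the LLM (the topic list and names are illustrative; a real agent would let the model itself decide):

```python
def route(question, kb_topics):
    """Toy router for an agentic RAG: questions mentioning a known
    knowledge-base topic go to retrieval, everything else goes
    directly to generation."""
    q = question.lower()
    return "retrieve" if any(topic in q for topic in kb_topics) else "generate"

topics = {"milvus", "ollama", "vector"}
decision = route("How do I configure Milvus on my laptop?", topics)
```

In the Ollama + Milvus setup, "retrieve" triggers a similarity search before generation, while "generate" calls the local model directly, keeping everything off rate-limited cloud APIs.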
GENERATIVE AI, THE FUTURE OF PRODUCTIVITYAndre Muscat
Discuss the impact and opportunity of using Generative AI to support your development and creative teams
* Explore business challenges in content creation
* Cost-per-unit of different types of content
* Use AI to reduce cost-per-unit
* New partnerships being formed that will have a material impact on the way we search and engage with content
Part 4 of a 9 Part Research Series named "What matters in AI" published on www.andremuscat.com
This document summarizes a presentation given by Professor Pekka Abrahamsson on how ChatGPT and AI-assisted coding is profoundly changing software engineering. The presentation covers several key points:
- ChatGPT and AI tools like Copilot are beginning to be adopted in software engineering to provide code snippets, answers to technical questions, and assist with debugging, but issues around code ownership, reliability, and security need to be addressed.
- Early studies show potential benefits of ChatGPT for tasks like software testing education, code quality improvement, and requirements elicitation, but more research is still needed.
- Prompt engineering techniques can help maximize the usefulness of ChatGPT for software engineering tasks. Overall, AI
OpenAI is an AI research company dedicated to developing safe and beneficial artificial intelligence. Their mission is to ensure AI benefits humanity. OpenAI conducts research across various AI domains and develops technologies like ChatGPT, a large language model capable of answering questions and generating human-like responses. The company also offers developers access to its models and tools through an API.
Generative AI models, such as ChatGPT and Stable Diffusion, can create new and original content like text, images, video, audio, or other data from simple prompts, as well as handle complex dialogs and reason about problems with or without images. These models are disrupting traditional technologies, from search and content creation to automation and problem solving, and are fundamentally shaping the future user interface to computing devices. Generative AI can apply broadly across industries, providing significant enhancements for utility, productivity, and entertainment. As generative AI adoption grows at record-setting speeds and computing demands increase, on-device and hybrid processing are more important than ever. Just like traditional computing evolved from mainframes to today’s mix of cloud and edge devices, AI processing will be distributed between them for AI to scale and reach its full potential.
In this presentation you’ll learn about:
- Why on-device AI is key
- Full-stack AI optimizations to make on-device AI possible and efficient
- Advanced techniques like quantization, distillation, and speculative decoding
- How generative AI models can be run on device and examples of some running now
- Qualcomm Technologies’ role in scaling on-device generative AI
Exploring Opportunities in the Generative AI Value Chain.pdfDung Hoang
The article "Exploring Opportunities in the Generative AI Value Chain" by McKinsey & Company's QuantumBlack provides insights into the value created by generative artificial intelligence (AI) and its potential applications.
Leveraging Generative AI & Best practicesDianaGray10
In this event we will cover:
- What is Generative AI and how it is being for future of work.
- Best practices for developing and deploying generative AI based models in productions.
- Future of Generative AI, how generative AI is expected to evolve in the coming years.
1) Artificial intelligence is the science and engineering of making intelligent machines that can perceive and take actions to maximize their success.
2) Early AI programs included the Logic Theorist which solved math theorems, and programs for playing checkers that learned from experience.
3) Recent advances in data, computing power, and techniques like machine learning, deep learning and neural networks have greatly expanded what AI can accomplish, with applications including computer vision, speech recognition, translation and more.
4) While current AI is specialized or "weak," the goal is to develop "strong" or general human-level AI that can perform any intellectual task, but this poses risks that must be addressed to ensure such systems remain
This talk overviews my background as a female data scientist, introduces many types of generative AI, discusses potential use cases, highlights the need for representation in generative AI, and showcases a few tools that currently exist.
This document discusses the rise of conversational AI and how digital agents can represent brands. It notes that generative AI enables new types of interactions that are more helpful than traditional chatbots. Digital agents can automate work by having natural conversations to complete tasks on behalf of users. The document provides examples of how a sales digital agent could assist a user before, during, and after a client meeting. It outlines six key ingredients for building effective digital agents, including prompting, context, proprietary knowledge, voice, reasoning, and code generation. The challenge for brands is to design unique digital agents that embody their values and approach in order to benefit from the changes brought by conversational AI.
The document discusses how generative AI can be used to scale content operations by reducing the time it takes to generate content. It explains that generative AI learns from natural language models and can generate new text or ideas based on prompts provided by users. While generative AI has benefits like speeding up content creation and ideation, it also has limitations such as not being able to conduct original research or ensure quality. The document provides examples of how generative AI can be used for tasks like generating ideas, simplifying complex text, creating visuals, and more. It also discusses challenges like bias in AI models and the low risk of plagiarism.
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Mihai Criveti
Mihai is the Principal Architect for Platform Engineering and Technology Solutions at IBM, responsible for Cloud Native and AI Solutions. He is a Red Hat Certified Architect, CKA/CKS, a leader in the IBM Open Innovation community, and advocate for open source development. Mihai is driving the development of Retrieval Augmentation Generation platforms, and solutions for Generative AI at IBM that leverage WatsonX, Vector databases, LangChain, HuggingFace and open source AI models.
Mihai will share lessons learned building Retrieval Augmented Generation, or “Chat with Documents” platforms and APIs that scale, and deploy on Kubernetes. His talk will cover use cases for Generative AI, limitations of Large Language Models, use of RAG, Vector Databases and Fine Tuning to overcome model limitations and build solutions that connect to your data and provide content grounding, limit hallucinations and form the basis of explainable AI. In terms of technology, he will cover LLAMA2, HuggingFace TGIS, SentenceTransformers embedding models using Python, LangChain, and Weaviate and ChromaDB vector databases. He’ll also share tips on writing code using LLM, including building an agent for Ansible and containers.
Scaling factors for Large Language Model Architectures:
• Vector Database: consider sharding and High Availability
• Fine Tuning: collecting data to be used for fine tuning
• Governance and Model Benchmarking: how are you testing your model performance
over time, with different prompts, one-shot, and various parameters
• Chain of Reasoning and Agents
• Caching embeddings and responses
• Personalization and Conversational Memory Database
• Streaming Responses and optimizing performance. A fine tuned 13B model may
perform better than a poor 70B one!
• Calling 3rd party functions or APIs for reasoning or other type of data (ex: LLMs are
terrible at reasoning and prediction, consider calling other models)
• Fallback techniques: fallback to a different model, or default answers
• API scaling techniques, rate limiting, etc.
• Async, streaming and parallelization, multiprocessing, GPU acceleration (including
embeddings), generating your API using OpenAPI, etc.
This document provides a 50-hour roadmap for building large language model (LLM) applications. It introduces key concepts like text-based and image-based generative AI models, encoder-decoder models, attention mechanisms, and transformers. It then covers topics like intro to image generation, generative AI applications, embeddings, attention mechanisms, transformers, vector databases, semantic search, prompt engineering, fine-tuning foundation models, orchestration frameworks, autonomous agents, bias and fairness, and recommended LLM application projects. The document recommends several hands-on exercises and lists upcoming bootcamp dates and locations for learning to build LLM applications.
The AI Index Report 2023 provides the following key highlights from its research and development chapter:
1. The US and China have the most cross-country AI research collaborations, though the rate of growth has slowed in recent years.
2. Global AI research output has more than doubled since 2010, led by areas like machine learning, computer vision and pattern recognition.
3. China now leads in total AI research publications, while the US still leads in conference and repository citations but these leads are decreasing.
4. Industry now produces far more significant AI models than academia, as building state-of-the-art systems requires greater resources that industry can provide.
5. Large language models
Presenting the landscape of AI/ML in 2023 by introducing a quick summary of the last 10 years of its progress, current situation, and looking at things happening behind the scene.
An overview of the most important AI capabilities in marketing, advertising, and content creation. I made this presentation to inform, educate, and inspire people in the creative industries to familiarise themselves with the incredible toolsets that are already here and in development. I also explain how generative AI works and explore some possible new roles and business models for agencies. Hope you enjoy it!
This document provides information about a bootcamp to build applications using Large Language Models (LLMs). The bootcamp consists of 11 modules covering topics such as introduction to generative AI, text analytics techniques, neural network models for natural language processing, transformer models, embedding retrieval, semantic search, prompt engineering, fine-tuning LLMs, orchestration frameworks, the LangChain application platform, and a final project to build a custom LLM application. The bootcamp will be held in various locations and dates between September 2023 and January 2024.
A non-technical overview of Large Language Models, exploring their potential, limitations, and customization for specific challenges. While this deck is tailored for an audience from the financial industry, its content remains broadly applicable.
(Note: Discover a slightly updated version of this deck at slideshare.net/LoicMerckel/introduction-to-llms.)
The Future of AI is Generative, not Discriminative (5/26/2021) - Steve Omohundro
The deep learning AI revolution has been sweeping the world for a decade now. Deep neural nets are routinely used for tasks like translation, fraud detection, and image classification. PwC estimates that they will create $15.7 trillion/year of value by 2030. But most current networks are "discriminative" in that they directly map inputs to predictions. This type of model requires lots of training examples, doesn't generalize well outside of its training set, creates inscrutable representations, is subject to adversarial examples, and makes knowledge transfer difficult. People, in contrast, can learn from just a few examples, generalize far beyond their experience, and can easily transfer and reuse knowledge. In recent years, new kinds of "generative" AI models have begun to exhibit these desirable human characteristics. They represent the causal generative processes by which the data is created and can be compositional, compact, and directly interpretable. Generative AI systems that assist people can model their needs and desires and interact with empathy. Their adaptability to changing circumstances will likely be required by rapidly changing AI-driven business and social systems. Generative AI will be the engine of future AI innovation.
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT - Anant Corporation
This document provides an agenda for a full-day bootcamp on large language models (LLMs) like GPT-3. The bootcamp will cover fundamentals of machine learning and neural networks, the transformer architecture, how LLMs work, and popular LLMs beyond ChatGPT. The agenda includes sessions on LLM strategy and theory, design patterns for LLMs, no-code/code stacks for LLMs, and building a custom chatbot with an LLM and your own data.
* "Responsible AI Leadership: A Global Summit on Generative AI"
* April 2023 guide for experts and policymakers on developing and governing generative AI systems
* 100+ thought leaders and practitioners participated
* 30 action-oriented recommendations for responsible development, open innovation, and social progress
* Aims to help navigate AI complexities
Using LLM Agents with Llama 3.2, LangGraph and Milvus - Zilliz
We explore Agentic RAG (internet search, hallucination checks, answer correction). Don't miss this deep dive into one of the hottest topics in AI today!
Building an Agentic RAG locally with Ollama and Milvus - Zilliz
With the rise of open-source LLMs like Llama, Mistral, Gemma, and more, it has become apparent that LLMs can be useful even when run locally. In this talk, we will see how to deploy an Agentic Retrieval Augmented Generation (RAG) setup using Ollama, with Milvus as the vector database, on your laptop. That way, you can also avoid being rate-limited by OpenAI, as I have been in the past.
LangGraph GraphRAG agent with Llama 3.1 and GPT-4o
Let's build an advanced RAG system with a GraphRAG agent that runs a combination of Llama 3.1 and GPT-4o; for Llama 3.1 we will use Ollama. The idea is to use GPT-4o for advanced tasks, like generating the Neo4j query, and Llama 3.1 for the rest.
Multi-agent Systems with Mistral AI, Milvus and Llama-agents - Zilliz
Agentic systems are on the rise, helping developers create intelligent, autonomous systems. LLMs are becoming more and more capable of following diverse sets of instructions, making them ideal for managing these agents. This advancement opens up numerous possibilities for handling complex tasks with minimal human intervention in so many areas. In this talk, we will see how to build agents using llama-agents. We’ll also explore how combining different LLMs can enable various actions. For simpler tasks, we'll use Mistral Nemo, a smaller and more cost-effective model, and Mistral Large for orchestrating different agents.
Multi-agent Systems with Mistral AI, Milvus and Llama-agents - Zilliz
With the recent release of Llama Agents, we can now build agents that are async first and run as their own service. During this webinar, Stephen will show you how to build an Agentic RAG System using Llama Agents and Milvus.
2024-10-28 All Things Open - Advanced Retrieval Augmented Generation (RAG) Techniques - Timothy Spann
https://2024.allthingsopen.org/sessions/advanced-retrieval-augmented-generation-rag-techniques
In 2023, we saw many simple retrieval augmented generation (RAG) examples being built. However, most of these examples and frameworks built around them simplified the process too much. Businesses were unable to derive value from their implementations. That’s because there are many other techniques involved in tuning a basic RAG app to work for you. In this talk we will cover three of the techniques you need to understand and leverage to build better RAG: chunking, embedding model choice, and metadata structuring.
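Of those three techniques, chunking is the simplest to show in code. A minimal fixed-size chunker with overlap might look like this (the sizes are illustrative defaults, not recommendations from the talk):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks.

    The overlap keeps sentences that straddle a boundary retrievable
    from both neighboring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

print(len(chunk_text("a" * 500)))  # starts at 0, 150, 300, 450 -> 4 chunks
```

Real pipelines usually chunk by tokens or sentences rather than characters, but the size/overlap trade-off is the same.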
17-October-2024 NYC AI Camp - Step-by-Step RAG 101 - Timothy Spann
https://github.com/tspannhw/AIM-BecomingAnAIEngineer
https://github.com/tspannhw/AIM-Ghosts
AIM - Becoming An AI Engineer
Step 1 - Start off local
Download Python (or use your local install)
https://www.python.org/downloads/
Create a virtual environment
https://docs.python.org/3/library/venv.html
python3.11 -m venv yourenv
source yourenv/bin/activate
Use Pip
https://pip.pypa.io/en/stable/installation/
Setup a .env file for environment variables
Download Jupyter Lab
https://jupyter.org/
Run your notebook
jupyter lab --ip="0.0.0.0" --port=8881 --allow-root
Running on a Mac or Linux machine is optimal.
Setup environment variables
source .env
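A minimal .env file for the steps above might look like this; the variable names are illustrative, so use whichever keys your notebooks actually read, and never commit real secrets:

```shell
# .env - illustrative environment variables for local RAG notebooks
export MILVUS_URI="http://localhost:19530"
export OPENAI_API_KEY="sk-..."
export JUPYTER_PORT="8881"
```

Running `source .env` exports these into the current shell so Jupyter and its child processes inherit them.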
Alternatives
Download Conda
https://docs.conda.io/projects/conda/en/latest/index.html
https://colab.research.google.com/
Other languages: Java, .Net, Go, NodeJS
Other notebooks to try
https://zilliz.com/learn/milvus-notebooks
https://github.com/milvus-io/bootcamp/blob/master/bootcamp/tutorials/quickstart/build_RAG_with_milvus.ipynb
References
Guides
https://zilliz.com/learn
HuggingFace
https://zilliz.com/learn/effortless-ai-workflows-a-beginners-guide-to-hugging-face-and-pymilvus
Milvus
https://zilliz.com/milvus-downloads
https://milvus.io/docs/quickstart.md
LangChain
https://zilliz.com/learn/LangChain
Notebook display
https://ipywidgets.readthedocs.io/en/stable/user_install.html
References
https://medium.com/@zilliz_learn/function-calling-with-ollama-llama-3-2-and-milvus-ac2bc2122538
https://github.com/milvus-io/bootcamp/tree/master/bootcamp/RAG/advanced_rag
https://zilliz.com/learn/Retrieval-Augmented-Generation
https://zilliz.com/blog/scale-search-with-milvus-handle-massive-datasets-with-ease
https://zilliz.com/learn/generative-ai
https://zilliz.com/learn/what-are-binary-vector-embedding
https://zilliz.com/learn/choosing-right-vector-index-for-your-project
2024 Dec 05 - PyData Global - Tutorial: It's In The Air Tonight - Timothy Spann
https://pydata.org/global2024/schedule
Tim Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://global2024.pydata.org/cfp/talk/L9JXKS/
It's in the Air Tonight. Sensor Data in RAG
12-05, 18:30–20:00 (UTC), General Track
Today we will learn how to build an application around sensor data, REST feeds, weather data, traffic cameras, and vector data. We will write a simple Python application to collect various structured, semi-structured, and unstructured data. We will process, enrich, augment, and vectorize this data and insert it into a vector database to be used for semantic hybrid search and filtering. We will then build a Jupyter notebook to analyze, query, and return this data.
Along the way we will learn the basics of vector databases and Milvus. While building it we will see the practical reasons we choose which indexes make sense, what to vectorize, and how to query multiple vectors, even when one is an image and one is text. We will see why we do filtering. We will then use our vector database of air quality readings to feed our LLM and get proper answers to air quality questions. I will show you all the steps to build a RAG application with Milvus, LangChain, Ollama, Python, and air quality reports. Finally, after the demos, I will answer questions and provide the source code and additional resources, including articles.
Goal of this Application
In this application, we will build an advanced data model and use it for ingest and various search options. For this notebook portion, we will
1️⃣ Ingest Data Fields, Enrich Data With Lookups, and Format:
Learn to ingest data from sources including JSON and images, and format and transform it to optimize hybrid searches. This is done inside the streetcams.py application.
2️⃣ Store Data into Milvus:
Learn to store data into Milvus, an efficient vector database designed for high-speed similarity searches and AI applications. In this step we optimize the data model with scalar and multiple vector fields -- one for text and one for the camera image. We do this in the streetcams.py application.
3️⃣ Use Open Source Models for Data Queries in a Hybrid Multi-Modal, Multi-Vector Search:
Discover how to use scalars and multiple vectors to query data stored in Milvus and re-rank the final results in this notebook.
4️⃣ Display resulting text and images:
Build a quick output for validation and checking in this notebook.
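Step 3's re-ranking can be illustrated with a simple weighted score fusion in plain Python. The scores and weights below are toy values; in the actual notebook, Milvus computes the per-vector similarities:

```python
def fuse_scores(text_scores: dict[str, float],
                image_scores: dict[str, float],
                text_weight: float = 0.6) -> list[tuple[str, float]]:
    """Re-rank candidates by a weighted sum of per-vector similarity scores."""
    ids = set(text_scores) | set(image_scores)
    fused = {
        i: text_weight * text_scores.get(i, 0.0)
           + (1 - text_weight) * image_scores.get(i, 0.0)
        for i in ids
    }
    # highest fused score first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# toy similarity scores for three camera records
text = {"cam1": 0.9, "cam2": 0.4, "cam3": 0.7}
image = {"cam1": 0.2, "cam2": 0.95, "cam3": 0.6}
print(fuse_scores(text, image))
```

Weighted fusion is only one option; reciprocal rank fusion is a common alternative when the raw scores are not on comparable scales.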
Timothy Spann
Tim Spann is a Principal. He works with Apache Kafka, Apache Pulsar, Apache Flink, Flink SQL, Milvus, Generative AI, HuggingFace, Python, Java, Apache NiFi, Apache Spark, Big Data, IoT, Cloud, AI/DL, Machine Learning, and Deep Learning. Tim has over ten years of experience with the IoT, big data, distributed computing, messaging, streaming technologies, and Java programming. Previously, he was a Principal Developer Advocate at Zilliz, Principal Developer Advocate at cldra
MultiModal RAG using vLLM and Pixtral - Stephen Batifol - Zilliz
While text-based RAG systems have been everywhere in the last year and a half, there is so much more than text data. Images, audio, and documents often need to be processed together to provide meaningful insights, yet most RAG implementations focus solely on text.
In this talk, we'll explore the architecture that makes it possible to run such a system and demonstrate how to build one using Milvus, LlamaIndex, and vLLM for deploying open-source LLMs on your own infrastructure.
Through a live demo, we'll showcase a real-world application processing both images and text queries :D
A Beginner's Guide to Building a RAG App Using Open Source Milvus - Zilliz
We will showcase how you can build a RAG using Milvus. Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.
Retrieval-Augmented Generation (RAG) has emerged as a cornerstone technology powering the latest wave of Generative AI applications, from sophisticated question-answering systems to advanced semantic search engines. As RAG's popularity has grown, we've witnessed a proliferation of methods promising to enhance the traditional RAG pipeline. These innovations include query rewriting, intelligent routing, and result reranking—but how do we measure their real impact on application performance?
Join us for an informational webinar where we'll explore robust evaluation frameworks, including LLM-as-a-Judge methodologies, industry-standard benchmarking datasets, and innovative synthetic data generation techniques. By the end of this session, you'll master practical approaches to evaluate and optimize RAG systems, equipped with the knowledge to implement these tools effectively in your own applications.
Building Production Ready Search Pipelines with Spark and Milvus - Zilliz
Read more: https://zilliz.com/blog/building-production-ready-search-pipelines-spark-milvus
Spark is a widely used ETL tool for processing, indexing, and ingesting data to the serving stack for search. Milvus is a production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to the Milvus vector database for search serving.
Building RAG with self-deployed Milvus vector database and Snowpark Container Services - Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
06-NOV-2024 AI Alliance NYC - Intro to Data Prep Kit and Open Source RAG - Timothy Spann
Open source toolkit
Helps with data prep
Handles documents + code
Many ready to use modules out of the box
Python
Develop on laptop, scale on clusters
https://medium.com/@tspann
The document discusses Retrieval Augmented Generation (RAG), a technique to improve responses from large language models by providing additional context from external knowledge sources. It outlines challenges with current language models, including inconsistent responses and a lack of understanding. As a solution, it proposes augmenting models with RAG and additional context. It then provides an example of implementing a RAG pipeline to power a question answering system for Munich Airport, describing the components needed and hosting options for large language models.
LLMs are powerful tools for generating responses but have limitations without access to up-to-date and proprietary information. A retrieval augmented generation (RAG) workflow enables LLMs to provide more accurate answers by incorporating a vector database with proprietary data and using text embedding models to retrieve and rank relevant information to augment the LLM's response. Running RAG locally on RTX GPUs provides benefits like low latency, data privacy, and no server costs compared to cloud solutions.
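The retrieve-then-augment workflow described above can be condensed into a toy sketch; the hand-made three-dimensional "embeddings" stand in for a real embedding model, and the assembled prompt would be sent to an LLM:

```python
import math

# toy knowledge base: (text, embedding) pairs; a real system would embed
# documents with a text-embedding model and store them in a vector database
DOCS = [
    ("Milvus is an open-source vector database.", [0.9, 0.1, 0.0]),
    ("BM25 is a lexical ranking function.",       [0.1, 0.9, 0.0]),
    ("Ollama runs LLMs locally.",                 [0.0, 0.1, 0.9]),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, k=1):
    """Rank documents by cosine similarity to the query embedding."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, query_vec):
    """Augment the question with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query_vec))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(build_prompt("What is Milvus?", [1.0, 0.0, 0.0]))
```

A vector database replaces the linear scan here with an approximate nearest-neighbor index, which is what keeps retrieval fast at scale.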
Supercharge Spark: Unleashing Big Data Potential with Milvus for RAG systems - Zilliz
Apache Spark dominates the big data processing world, but efficient vector similarity search on massive datasets remains a bottleneck. This talk will show how you can seamlessly integrate Milvus with Spark to unlock the true power of vector similarity search.
We'll explore how Milvus integrates with Spark, enabling efficient vector search within Spark workflows. Real-world applications showcasing the combined power of Spark and Milvus in tackling complex similarity search challenges will be presented. Finally, we'll shed light on the significant performance gains achieved through this integration.
Whether you're dealing with recommendation systems, image retrieval, or any other application requiring vector similarity search, this talk will equip you with the knowledge to leverage Spark and Milvus to their maximum potential.
Join us on this exploration of how Spark and Milvus can enhance your big data processing capabilities with fast similarity search even at scale!
06-18-2024 Princeton Meetup - Introduction to Milvus - Timothy Spann
[email protected]
https://www.linkedin.com/in/timothyspann/
https://x.com/paasdev
https://github.com/tspannhw
https://github.com/milvus-io/milvus
Get Milvused!
https://milvus.io/
Read my Newsletter every week!
https://github.com/tspannhw/FLiPStackWeekly/blob/main/142-17June2024.md
For more cool Unstructured Data, AI and Vector Database videos check out the Milvus vector database videos here
https://www.youtube.com/@MilvusVectorDatabase/videos
Unstructured Data Meetups -
https://www.meetup.com/unstructured-data-meetup-new-york/
https://lu.ma/calendar/manage/cal-VNT79trvj0jS8S7
https://www.meetup.com/pro/unstructureddata/
https://zilliz.com/community/unstructured-data-meetup
https://zilliz.com/event
Twitter/X: https://x.com/milvusio https://x.com/paasdev
LinkedIn: https://www.linkedin.com/company/zilliz/ https://www.linkedin.com/in/timothyspann/
GitHub: https://github.com/milvus-io/milvus https://github.com/tspannhw
Invitation to join Discord: https://discord.com/invite/FjCMmaJng6
Blogs: https://milvusio.medium.com/ https://www.opensourcevectordb.cloud/ https://medium.com/@tspann
Expand LLMs' knowledge by incorporating external data sources into your AI applications.
Zilliz Cloud Monthly Technical Review: May 2025 - Zilliz
About this webinar
Join our monthly demo for a technical overview of Zilliz Cloud, a highly scalable and performant vector database service for AI applications
Topics covered
- Zilliz Cloud's scalable architecture
- Key features of the developer-friendly UI
- Security best practices and data privacy
- Highlights from recent product releases
This webinar is an excellent opportunity for developers to learn about Zilliz Cloud's capabilities and how it can support their AI projects. Register now to join our community and stay up-to-date with the latest vector database technology.
Smarter RAG Pipelines: Scaling Search with Milvus and Feast - Zilliz
About this webinar
Learn how Milvus and Feast can be used together to scale vector search and easily declare views for retrieval using open source. We’ll demonstrate how to integrate Milvus with Feast to build a customized RAG pipeline.
Topics Covered
- Leverage Feast for dynamic metadata and document storage and retrieval, ensuring that the correct data is always available at inference time
- Learn how to integrate Feast with Milvus to support vector-based retrieval in RAG systems
- Use Milvus for fast, high-dimensional similarity search, enhancing the retrieval phase of your RAG model
Hands-on Tutorial: Building an Agent to Reason about Private Data with OpenAI... - Zilliz
In this tutorial, we build an agent from scratch to reason over the Milvus documentation and Discord server history. We demonstrate fundamental agentic concepts such as long-term memory, tool use, reflection, conditional execution flow, and reasoning models. Our agent’s design is informed by recent open-source attempts to reproduce Deep Research.
Agentic AI in Action: Real-Time Vision, Memory & Autonomy with Browser Use & Milvus - Zilliz
About this webinar
Discover how to integrate Vision Language Models with Browser Use and Milvus to create an agentic system capable of real-time visual and textual analysis. Ideal for developers who want to learn how to use Agents that can see, take action, and remember what they saw.
This Session Will:
- Demonstrate a workflow where Browser Use extracts dynamic web data while Milvus stores and retrieves it, so you can always come back to what the agent saw.
- Showcase practical use cases, such as querying live web content with AI agents that reason over historical and visual data.
- Explore balancing autonomy and control in agentic systems, including challenges like hallucination mitigation and performance optimization.
Webinar - Zilliz Cloud Monthly Demo - March 2025 - Zilliz
Join our monthly demo for a technical overview of Zilliz Cloud, a highly scalable and performant vector database service for AI applications
Topics covered
- Zilliz Cloud's scalable architecture
- Key features of the developer-friendly UI
- Security best practices and data privacy
- Highlights from recent product releases
- This webinar is an excellent opportunity for developers to learn about Zilliz Cloud's capabilities and how it can support their AI projects. Register now to join our community and stay up-to-date with the latest vector database technology.
What Makes "Deep Research"? A Dive into AI Agents - Zilliz
About this webinar:
Unless you live under a rock, you will have heard about OpenAI’s release of Deep Research on Feb 2, 2025. This new product promises to revolutionize how we answer questions requiring the synthesis of large amounts of diverse information. But how does this technology work, and why is Deep Research a noticeable improvement over previous attempts? In this webinar, we will examine the concepts underpinning modern agents using our basic clone, Deep Searcher, as an example.
Topics covered:
Tool use
Structured output
Reflection
Reasoning models
Planning
Types of agentic memory
Combining Lexical and Semantic Search with Milvus 2.5 - Zilliz
In short, lexical search is a way to search your documents based on the keywords they contain, in contrast to semantic search, which compares the similarity of embeddings. We’ll be covering:
Why, when, and how should you use lexical search
What is the BM25 distance metric
How exactly does Milvus 2.5 implement lexical search
How to build an improved hybrid lexical + semantic search with Milvus 2.5
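The BM25 metric mentioned above can be sketched with the standard formula in stdlib Python (k1 and b use common defaults; the toy corpus is illustrative, and Milvus 2.5's built-in implementation is of course more sophisticated):

```python
import math
from collections import Counter

def bm25_scores(query: list[str], docs: list[list[str]],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each tokenized document against a tokenized query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # document frequency: in how many docs each term appears
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [["milvus", "vector", "database"],
        ["lexical", "search", "with", "bm25"],
        ["semantic", "search", "with", "embeddings"]]
print(bm25_scores(["lexical", "search"], docs))
```

The term-frequency saturation (k1) and length normalization (b) are what distinguish BM25 from plain TF-IDF, and they are why it remains a strong lexical baseline next to embedding-based retrieval.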
Bedrock Data Automation (Preview): Simplifying Unstructured Data Processing - Zilliz
Bedrock Data Automation (BDA) is a cloud-based service that simplifies the process of extracting valuable insights from unstructured content—such as documents, images, video, and audio. Come learn how BDA leverages generative AI to automate the transformation of multi-modal data into structured formats, enabling developers to build applications and automate complex workflows with greater speed and accuracy.
Deploying a Multimodal RAG System Using Open Source Milvus, LlamaIndex, and vLLM - Zilliz
About this webinar:
While text-based RAG systems have been everywhere in the last year and a half, there is so much more than text data. Images, audio, and documents often need to be processed together to provide meaningful insights, yet most RAG implementations focus solely on text. Think about automated visual inspection systems understanding manufacturing logs and production line images, or robotics systems correlating sensor data with visual feedback. These multimodal scenarios demand RAG systems that go beyond text-only processing.
In this talk, we'll walk through how to build a Multimodal RAG system that helps solve this problem. We'll explore the architecture that makes it possible to run such a system and demonstrate how to build one using Milvus, LlamaIndex, and vLLM for deploying open-source LLMs on your infrastructure.
Through a live demo, we'll showcase a real-world application processing both images and text queries. Whether you're looking to reduce API costs, maintain data privacy, or gain more control over your AI infrastructure, this session will provide you with actionable insights to implement Multimodal RAG in your organization.
Topics covered:
- vLLM and self hosting LLMs
- Multimodal RAG Demo: a real-world application processing both images and text queries
February Product Demo: Discover the Power of Zilliz Cloud - Zilliz
Join our monthly demo for a technical overview of Zilliz Cloud, a highly scalable and performant vector database service for AI applications
Topics covered
- Zilliz Cloud's scalable architecture
- Key features of the developer-friendly UI
- Security best practices and data privacy
- Highlights from recent product releases
This webinar is an excellent opportunity for developers to learn about Zilliz Cloud's capabilities and how it can support their AI projects. Register now to join our community and stay up-to-date with the latest vector database technology.
Full Text Search with Milvus 2.5 - UD Meetup Berlin Jan 23 - Zilliz
"Milvus 2.5 adds native full text search capabilities, seamlessly combining term-based matching with vector similarity in a single system. This feature automatically handles text-to-vector conversion and real-time BM25 scoring, eliminating the complexity of manual embedding generation and external processing pipelines.
Through a live demo, we'll showcase how easy we make it to use Full Text search now :D"
Building the Next-Gen Apps with Multimodal Retrieval using Twelve Labs & Milvus - Zilliz
"This session dives deep into the power of Multimodal Retrieval, a revolutionizing approach that enhances personalization by seamlessly integrating diverse data sources for more intuitive product interactions. Explore the foundational concepts of Multimodal Embedding and Any-to-Any Search, and learn how to leverage these technologies to build next-generation products. Discover how to seamlessly integrate the Twelve Labs Embed API and Milvus into your projects.
Through live demos, you’ll see how Fashion Product Search is redefined with deeper insights into the architecture, and discover how this approach is revolutionizing user interactions, especially with bots. We’ll also explore real world case studies that demonstrate the ease and power of building multimodal apps."
"Explore the transformative potential of Voice AI in customer interaction analysis powered by LLMs. Learn how Gemini 2.0 enables transcription, summarization, and actionable insight extraction to streamline ticket resolution and enhance customer experiences.
This session delves into the architecture and practical applications of LLM-powered systems, showcasing how they revolutionize customer support workflows through real-world examples and insights"
Accelerate AI Agents with Multimodal RAG powered by Friendli Endpoints and Milvus - Zilliz
AI agents are transforming industries, especially with recent vision-language models like Llama 3.2 Vision that enable AI agents to go beyond text-based understanding by integrating multimodal capabilities. Building such advanced AI agents can feel complex, but FriendliAI simplifies the process by offering end-to-end solutions, from creating your own custom models to deploying them in production. In this webinar, we'll learn about the AI developer workflow from model fine-tuning to inference serving. We'll also work through building a simple AI agent with advanced multimodal RAG capabilities using Friendli Serverless Endpoints and Milvus DB. This session is ideal for those looking to learn more about large language model inference serving, start building AI agents with RAG capabilities, or explore multimodal RAG queries in greater depth.
1 Table = 1000 Words? Foundation Models for Tabular Data - Zilliz
Tables form the backbone of modern data storage, powering everything from relational databases to enterprise systems. Yet despite their ubiquity, we've barely scratched the surface of their potential. While Deep Learning has revolutionized our ability to process text and images, its impact on tabular data has been surprisingly limited. This gap is now being bridged through groundbreaking research in multimodal modeling, particularly with innovations like the TableGPT2 model. In this talk, we'll explore how these new multimodal foundation models are trained to understand tabular data, and demonstrate practical ways to unlock hidden value in your organization's data assets.
How Milvus allows you to run Full Text Search - Zilliz
Milvus 2.5 adds native full text search capabilities, seamlessly combining term-based matching with vector similarity in a single system. This feature automatically handles text-to-vector conversion and real-time BM25 scoring, eliminating the complexity of manual embedding generation and external processing pipelines.
How to Optimize Your Embedding Model Selection and Development through TDA Cl... - Zilliz
About this webinar:
Embedding models are a crucial layer in vector database applications, yet figuring out which embedding model is best for your dataset has been a notoriously difficult task. For many use cases, however, an efficient and intuitive approach is Topological Data Analysis (TDA) on your evaluation dataset. Identifying patterns of weak-performing behavior in your model becomes easy and scalable through a table that reveals the performance of different semantic categories of queries made to your vector database.
Topics covered:
- Risks and limitations of current evaluation approaches for embeddings
- Compare embedding models on your own dataset using Navigable TDA clusters
- ML lifecycle case studies in ecommerce: model selection, fine-tuning, and post-deployment
Milvus: Scaling Vector Data Solutions for Gen AI - Zilliz
Milvus, an LF AI project, is an open-source vector database built to power Gen AI solutions. 80% of the data in the world is unstructured, and vector databases help you get valuable insights from it. With this in mind, we built Milvus as a distributed system on top of other open-source solutions, including MinIO and Kafka, to support vector collections that exceed billion-scale. This session will dive deeply into the architecture decisions that make this cloud-native vector database seamlessly scale horizontally, provide users with tunable consistency, orchestrate in-memory and on-disk indexing, and support scalable search strategies.
Keeping Data Fresh: Mastering Updates in Vector Databases - Zilliz
Managing and extracting value from unstructured data has become a critical challenge as the volume of data continues to grow. This virtual event brings together industry experts to explore the latest techniques in Retrieval Augmented Generation (RAG) and vector databases.
Discover how RAG systems are revolutionizing natural language processing by seamlessly integrating information retrieval techniques, enabling more accurate and contextual language generation. Gain practical insights into building and optimizing these applications.
This session will also cover how vector databases like Milvus play a key role in RAG and in working with unstructured data. Learn proven strategies for maintaining data freshness, accuracy, and efficiency, ensuring your organization stays ahead of the curve.
Milvus 2.5: Full-Text Search, More Powerful Metadata Filtering, and more! (Zilliz)
Milvus 2.5 introduces native full-text search capabilities, seamlessly combining term-based matching with vector similarity in a single system. This feature automatically handles text-to-vector conversion and real-time BM25 scoring, eliminating the complexity of manual embedding generation and external processing pipelines.
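The BM25 scoring mentioned above can be sketched in pure Python to show what the engine computes internally. This is a minimal textbook illustration of the classic Okapi BM25 formula, not Milvus's actual implementation; documents are assumed to be pre-tokenized lists of terms.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query with classic BM25."""
    avgdl = sum(len(d) for d in corpus) / len(corpus)  # average doc length
    N = len(corpus)
    tf = Counter(doc)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)           # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)    # smoothed IDF
        f = tf[term]                                       # term frequency in doc
        score += idf * (f * (k1 + 1)) / (f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [["vector", "database"], ["full", "text", "search"], ["vector", "search"]]
s_hit = bm25_score(["vector"], corpus[0], corpus)   # doc contains the term
s_miss = bm25_score(["vector"], corpus[1], corpus)  # doc does not
```

Documents containing the query term score above zero, while documents without it score exactly zero; in a hybrid system this term-based score is combined with vector similarity.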
AI Agents in Logistics and Supply Chain Applications: Benefits and Implementation (Christine Shepherd)
AI agents are reshaping logistics and supply chain operations by enabling automation, predictive insights, and real-time decision-making across key functions such as demand forecasting, inventory management, procurement, transportation, and warehouse operations. Powered by technologies like machine learning, NLP, computer vision, and robotic process automation, these agents deliver significant benefits including cost reduction, improved efficiency, greater visibility, and enhanced adaptability to market changes. While practical use cases show measurable gains in areas like dynamic routing and real-time inventory tracking, successful implementation requires careful integration with existing systems, quality data, and strategic scaling. Despite challenges such as data integration and change management, AI agents offer a strong competitive edge, with widespread industry adoption expected by 2025.
Jeremy Millul - A Talented Software Developer (Jeremy Millul)
Jeremy Millul is a talented software developer based in NYC, known for leading impactful projects such as a Community Engagement Platform and a Hiking Trail Finder. Using React, MongoDB, and geolocation tools, Jeremy delivers intuitive applications that foster engagement and usability. A graduate of NYU’s Computer Science program, he brings creativity and technical expertise to every project, ensuring seamless user experiences and meaningful results in software development.
Mastering AI Workflows with FME - Peak of Data & AI 2025 (Safe Software)
Harness the full potential of AI with FME: from creating high-quality training data to optimizing models and utilizing results, FME supports every step of your AI workflow. Seamlessly integrate a wide range of models, including those for data enhancement, forecasting, image and object recognition, and large language models. Customize AI models to meet your exact needs with FME's powerful tools for training, optimization, and seamless integration.
Domino IQ: What to Expect, First Steps and Use Cases (panagenda)
Webinar Recording: https://www.panagenda.com/webinars/domino-iq-was-sie-erwartet-erste-schritte-und-anwendungsfalle/
HCL Domino iQ Server: from idea portal to implemented feature. Discover what it is, what it is not, and explore the opportunities and challenges it presents.
Key takeaways
- What Large Language Models (LLMs) are and how they relate to Domino iQ
- Key prerequisites for deploying the Domino iQ server
- A step-by-step guide to setting up your Domino iQ server
- Share and discuss thoughts and ideas to maximize the potential of Domino iQ
Down the Rabbit Hole: Solving 5 Training Roadblocks (Rustici Software)
Feeling stuck in the Matrix of your training technologies? You’re not alone. Managing your training catalog, wrangling LMSs and delivering content across different tools and audiences can feel like dodging digital bullets. At some point, you hit a fork in the road: Keep patching things up as issues pop up… or follow the rabbit hole to the root of the problems.
Good news, we’ve already been down that rabbit hole. Peter Overton and Cameron Gray of Rustici Software are here to share what we found. In this webinar, we’ll break down 5 training roadblocks in delivery and management and show you how they’re easier to fix than you might think.
FME Beyond Data Processing: Creating a Dartboard Accuracy App (Safe Software)
At Nordend, we want to push the boundaries of FME and explore its potential for more creative applications. In our office, we have a dartboard, and while improving our dart-throwing skills was an option, we took a different approach: What if we could use FME to calculate where we should aim to achieve the highest possible score, based on our accuracy?
Using FME's Geometry User parameter, we designed a custom solution. When launching the FME Flow app, the map is now a dartboard. The centre of the map is always fixed on the same area of the world, where we pinned a PNG picture of a dartboard as a basemap through a self-created WMS. This visual setup allowed us to draw polygons, each with three points, where our darts landed, using the Geometry parameter. These polygons get processed through an FME workspace, which translates the coordinates from the map into exact X and Y positions on the dartboard.
With this accurate data, we calculate all sorts of statistics: rolling averages, best scores, and even standard deviations. The results get displayed on a dashboard in FME Flow, giving us insights into how we could maximize our scores, based purely on where we actually tend to throw. Join us for a live demonstration of the app!
The takeaway? FME isn't just a powerful data processing tool; with a bit of imagination, it can be used for far more creative and unconventional applications. This project demonstrates that the only limit to what FME can do is the creativity you bring to it.
Your startup on AWS - How to architect and maintain a Lean and Mean account J... (angelo60207)
Prevent infrastructure costs from becoming a significant line item on your startup’s budget! Serial entrepreneur and software architect Angelo Mandato will share his experience with AWS Activate (startup credits from AWS) and knowledge on how to architect a lean and mean AWS account ideal for budget minded and bootstrapped startups. In this session you will learn how to manage a production ready AWS account capable of scaling as your startup grows for less than $100/month before credits. We will discuss AWS Budgets, Cost Explorer, architect priorities, and the importance of having flexible, optimized Infrastructure as Code. We will wrap everything up discussing opportunities where to save with AWS services such as S3, EC2, Load Balancers, Lambda Functions, RDS, and many others.
Establish Visibility and Manage Risk in the Supply Chain with Anchore SBOM (Anchore)
Over 70% of any given software application is made up of open source software (most likely not even consumed from the original source), yet only 15% of organizations feel confident in their risk management practices.
With the newly announced Anchore SBOM feature, teams can start safely consuming OSS while mitigating security and compliance risks. Learn how to import SBOMs in industry-standard formats (SPDX, CycloneDX, Syft), validate their integrity, and proactively address vulnerabilities within your software ecosystem.
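SBOM formats such as CycloneDX are plain JSON, so basic inspection needs only the standard library. The sketch below is a hypothetical illustration with a hand-written minimal document, not Anchore's import workflow; real SBOMs are generated by tools such as Syft and contain far more fields.

```python
import json

# Minimal hand-written CycloneDX document for illustration only.
sbom_json = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "log4j-core", "version": "2.14.1"},
    {"type": "library", "name": "requests", "version": "2.31.0"}
  ]
}
"""

def list_components(raw):
    """Return (name, version) pairs from a CycloneDX SBOM string."""
    bom = json.loads(raw)
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX document")
    return [(c["name"], c["version"]) for c in bom.get("components", [])]

components = list_components(sbom_json)
```

An inventory like `components` is the starting point for matching package versions against vulnerability databases, which is the core of proactively addressing risk in consumed OSS.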
Interested in leveling up your JavaScript skills? Join us for our Introduction to TypeScript workshop.
Learn how TypeScript can improve your code with dynamic typing, better tooling, and cleaner architecture. Whether you're a beginner or have some experience with JavaScript, this session will give you a solid foundation in TypeScript and how to integrate it into your projects.
Workshop content:
- What is TypeScript?
- What is the problem with JavaScript?
- Why TypeScript is the solution
- Coding demo
DevOps in the Modern Era - Thoughtfully Critical Podcast (Chris Wahl)
https://youtu.be/735hP_01WV0
My journey through the world of DevOps! From the early days of breaking down silos between developers and operations to the current complexities of cloud-native environments. I'll talk about my personal experiences, the challenges we faced, and how the role of a DevOps engineer has evolved.
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S... (Anish Kumar)
Presented by: Anish Kumar
LinkedIn: https://www.linkedin.com/in/anishkumar/
This lightning talk dives into real-world GenAI projects that scaled from prototype to production using Databricks’ fully managed tools. Facing cost and time constraints, we leveraged four key Databricks features—Workflows, Model Serving, Serverless Compute, and Notebooks—to build an AI inference pipeline processing millions of documents (text and audiobooks).
This approach enables rapid experimentation, easy tuning of GenAI prompts and compute settings, seamless data iteration and efficient quality testing—allowing Data Scientists and Engineers to collaborate effectively. Learn how to design modular, parameterized notebooks that run concurrently, manage dependencies and accelerate AI-driven insights.
Whether you're optimizing AI inference, automating complex data workflows or architecting next-gen serverless AI systems, this session delivers actionable strategies to maximize performance while keeping costs low.
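The pattern of running parameterized jobs concurrently can be sketched with the standard library alone. This is a generic illustration of the idea, not the Databricks Workflows API: `run_inference_job` is a hypothetical stand-in for one parameterized notebook run, and a thread pool stands in for the platform's scheduler.

```python
from concurrent.futures import ThreadPoolExecutor

def run_inference_job(params):
    """Hypothetical stand-in for one parameterized notebook run.

    A real job would call a model-serving endpoint with the batch of
    documents and the prompt given in `params`.
    """
    return {"batch": params["batch"], "prompt": params["prompt"], "status": "ok"}

# One parameter set per concurrent run: same logic, different inputs.
param_grid = [{"batch": i, "prompt": "summarize"} for i in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_inference_job, param_grid))
```

Keeping the job body fixed and varying only the parameter dictionaries is what makes the notebooks modular and lets prompt or compute settings be tuned without touching the pipeline code.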
Boosting MySQL with Vector Search - THE VECTOR SEARCH CONFERENCE 2025 (Alkin Tezuysal)
As the demand for vector databases and Generative AI continues to rise, integrating vector storage and search capabilities into traditional databases has become increasingly important. This session introduces the *MyVector Plugin*, a project that brings native vector storage and similarity search to MySQL. Unlike PostgreSQL, which offers interfaces for adding new data types and index methods, MySQL lacks such extensibility. However, by utilizing MySQL's server component plugin and UDF, the *MyVector Plugin* successfully adds a fully functional vector search feature within the existing MySQL + InnoDB infrastructure, eliminating the need for a separate vector database. The session explains the technical aspects of integrating vector support into MySQL, the challenges posed by its architecture, and real-world use cases that showcase the advantages of combining vector search with MySQL's robust features. Attendees will leave with practical insights on how to add vector search capabilities to their MySQL systems.
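Conceptually, the similarity search such a plugin performs is a nearest-neighbor query over stored vectors. The sketch below shows the brute-force version of that computation in pure Python; it is a hypothetical illustration of what the scoring does, not the MyVector Plugin's UDF code, and `rows` stands in for (id, vector) rows in an InnoDB table.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def knn(query, rows, k=2):
    """Brute-force top-k by cosine similarity over (id, vector) rows."""
    scored = sorted(rows, key=lambda r: cosine(query, r[1]), reverse=True)
    return [row_id for row_id, _ in scored[:k]]

rows = [(1, [1.0, 0.0]), (2, [0.0, 1.0]), (3, [0.9, 0.1])]
top = knn([1.0, 0.0], rows, k=2)
```

A production plugin replaces the linear scan with an approximate index, but the ranking contract is the same: return the ids whose vectors are most similar to the query.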
In this talk, Elliott explores how developers can embrace AI not as a threat, but as a collaborative partner.
We’ll examine the shift from routine coding to creative leadership, highlighting the new developer superpowers of vision, integration, and innovation.
We'll touch on security, legacy code, and the future of democratized development.
Whether you're AI-curious or already a prompt engineer, this session will help you find your rhythm in the new dance of modern development.
Improving Developer Productivity With DORA, SPACE, and DevEx (Justin Reock)
Ready to measure and improve developer productivity in your organization?
Join Justin Reock, Deputy CTO at DX, for an interactive session where you'll learn actionable strategies to measure and increase engineering performance.
Leave this session equipped with a comprehensive understanding of developer productivity and a roadmap to create a high-performing engineering team in your company.
3. 27K+ GitHub Stars | 25M+ Downloads | 250+ Contributors | 2,600+ Forks
Milvus is an open-source vector database for GenAI projects. pip install on your laptop, plug into popular AI dev tools, and push to production with a single line of code.
● Easy Setup: pip-install to start coding in a notebook within seconds.
● Reusable Code: write once, and deploy with one line of code into the production environment.
● Integration: plug into OpenAI, LangChain, LlamaIndex, and many more.
● Feature-rich: dense & sparse embeddings, filtering, reranking and beyond.
10. 🦜🔗 LangChain
● Framework for building LLM Applications
● Focus on retrieving data and integrating with LLMs
● Integrations with most popular AI tools
11. 🦜🕸 LangGraph by LangChain
● Build stateful apps with LLMs and multi-agent workflows
● Cycles and Branching
● Human-in-the-Loop
● Persistence
16. General Ideas
● Routing: Adaptive RAG
○ Route questions to different retrieval approaches
● Fallback: Corrective RAG
○ Fall back to web search if docs are not relevant to the query
● Self-Correction: Self-RAG
○ Try to fix answers that contain hallucinations or don't address the question
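The three ideas above (routing, fallback, and self-correction) can be sketched as one control loop in plain Python. This is a hypothetical skeleton, not the LangGraph implementation: the `retrieve`, `web_search`, `generate`, and `grade` callables are assumed stand-ins for graph nodes, and a single retriever stands in for full adaptive routing.

```python
def agentic_rag(question, retrieve, web_search, generate, grade):
    """Skeleton of an agentic RAG loop: route, fall back, self-correct."""
    # Routing (Adaptive RAG): pick a retrieval approach for the question.
    docs = retrieve(question)
    # Fallback (Corrective RAG): use web search when no relevant docs exist.
    if not docs:
        docs = web_search(question)
    # Self-correction (Self-RAG): regenerate if the answer is not grounded.
    answer = generate(question, docs)
    if not grade(answer, docs):
        # A real system would revise the prompt or re-retrieve here.
        answer = generate(question, docs)
    return answer

# Stub components for illustration.
docs_db = {"milvus": ["Milvus is a vector database."]}
retrieve = lambda q: docs_db.get(q, [])
web_search = lambda q: [f"web result for {q}"]
generate = lambda q, d: f"answer from {d[0]}"
grade = lambda a, d: True

local = agentic_rag("milvus", retrieve, web_search, generate, grade)
fallback = agentic_rag("unknown", retrieve, web_search, generate, grade)
```

In LangGraph these branches become conditional edges in a stateful graph; the skeleton only shows the decision points each edge encodes.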
19. Milvus Architecture
[Architecture diagram: an access layer (SDK, load balancer, proxies) issues DDL/DCL and DML requests to the coordinator service (root, query, data, and index coordinators) backed by etcd meta storage; a log broker (message storage) carries data, notifications, and control signals to worker nodes (query, data, and index nodes); object storage (MinIO / S3 / Azure Blob) holds log snapshots, delta files, and index files.]