RAG systems are widely discussed, but most treatments stick to the basics. In this talk, Stephen will show you how to build an Agentic RAG system using LangChain and Milvus.
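To make the retrieval step concrete, here is a dependency-free sketch of "find the documents most similar to a query"; the talk's actual stack (LangChain + Milvus) replaces this toy bag-of-words scoring with learned embeddings and a vector index:

```python
from collections import Counter
from math import sqrt

def bow(text):
    """Bag-of-words vector as a Counter of lowercase tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """Return the top-k documents most similar to the query."""
    q = bow(query)
    return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

docs = [
    "Milvus is a vector database for similarity search",
    "LangChain orchestrates LLM pipelines",
    "Bananas are rich in potassium",
]
top = retrieve("vector database search", docs, k=1)
```

The retrieved passages are then stuffed into the LLM prompt as grounding context; that second half is where the agentic behavior comes in.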
A deep-dive into Agentic AI covering the following topics:
Links to the corresponding articles on Medium. The article on Long-term #Memory Management is now 3rd among the 10 most-read stories on #AgenticAI on Medium :-)
- Agentic AI Platform Reference #Architecture
https://medium.com/datadriveninvestor/ai-agent-platform-reference-architecture-0be5b19d0eba
- AI Agents #Marketplace & Discovery for #MultiAgentSystems
https://medium.com/ai-advances/ai-agents-marketplace-discovery-for-multi-agent-systems-27a31b6b1ca6
- Personalizing #UX for Agentic AI
https://medium.com/ai-advances/personalized-ux-for-agentic-ai-ab132f2eeb03
- Agent #Observability
https://medium.com/ai-advances/stateful-and-responsible-ai-agents-7af386268554
- Long-term #Memory Management
https://medium.com/ai-advances/long-term-memory-for-agentic-ai-systems-4ae9b37c6c0f
- Agentic AI Scenarios:
Agentic #RAGs: extending RAGs to #SQL Databases
https://medium.com/ai-advances/agentic-rags-extending-rags-to-sql-databases-1509b25ca3e7
#ReinforcementLearning Agents for #ControlSystems
https://medium.com/ai-advances/reinforcement-learning-agents-for-industrial-control-systems-b917b513f0c4
- Responsible AI Agents
#Privacy Risks of LLMs
https://medium.com/ai-advances/privacy-risks-of-large-language-models-llms-5c0f96dccc56
Responsible #LLMOps in Towards Data Science
https://towardsdatascience.com/responsible-llmops-985cd1af3639
What exactly are AI Agents, and how do they operate? How do they compare to and interact with LLMs and functionalities such as function calling, chain-of-thought processing, assistants, tools, or actions? In this talk, I delve into the unique features of Agentic AI, including perception, state estimation, goal setting, planning, and action selection & execution. We will define various levels of Agentic AI and form a map to help navigate this emerging landscape. By categorizing current agent-based or agent-related solutions with practical examples, we'll provide an overview of the current state of Agentic AI.
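The perception / state-estimation / planning / action cycle described above can be caricatured in a few lines (all names here are illustrative, not from the talk):

```python
def run_agent(observations, goal):
    """Toy agent loop: perceive, update state, check the goal, act."""
    state = set()
    actions = []
    for obs in observations:              # perception
        state.add(obs)                    # state estimation
        if goal in state:                 # goal check
            break
        actions.append(f"explore:{obs}")  # planning + action selection & execution
    return actions, goal in state

acts, done = run_agent(["door", "key", "treasure"], goal="treasure")
```

Real agentic systems replace each commented step with an LLM call, a tool invocation, or a learned policy, but the control flow is recognizably this loop.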
Collaborative Agents with Tools & Knowledge (Graphs) using LangGraph & LangChain (Tilores)
Discover how enterprise AI systems can harness the power of multi-agent architectures through LangGraph, a framework that masterfully balances control with autonomy. Learn how to overcome common LLM application challenges using LangGraph's powerful features for parallel execution, multi-agent orchestration, and human-in-the-loop interactions. As part of the thriving LangChain ecosystem that powers over 100,000 applications, LangGraph offers developers practical approaches to building sophisticated AI systems with robust feedback loops. Perfect for developers and architects seeking to advance their enterprise AI solutions using state-of-the-art multi-agent architectures.
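As a rough, dependency-free sketch of the graph-of-nodes pattern LangGraph builds on (an illustration of the idea, not LangGraph's actual API): nodes transform a shared state dict and edges name the next node.

```python
def build_graph(nodes, edges, entry):
    """Tiny stand-in for a state-graph runner: run nodes over a shared
    state dict, following edges until the sentinel 'END' is reached."""
    def invoke(state):
        current = entry
        while current != "END":
            state = nodes[current](state)
            current = edges[current]
        return state
    return invoke

def draft(state):
    return {**state, "draft": f"answer to {state['question']}"}

def review(state):
    return {**state, "approved": "answer" in state["draft"]}

app = build_graph(
    nodes={"draft": draft, "review": review},
    edges={"draft": "review", "review": "END"},
    entry="draft",
)
result = app({"question": "what is LangGraph?"})
```

LangGraph adds what this toy omits: conditional edges, parallel branches, checkpointing, and human-in-the-loop interrupts.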
Generative AI Fundamentals and Large Language Models (AdventureWorld5)
Thank you for the detailed review of the protein bars. I'm glad to hear you and your family are enjoying them as a healthy snack and meal replacement option. A couple of suggestions based on your feedback:
- For future orders, you may want to check the expiration dates to help avoid any dried out bars towards the end of the box. Freshness is key to maintaining the moist texture.
- When introducing someone new to the bars, selecting one in-person if possible allows checking the flexibility as an indicator it's moist inside. This could help avoid a disappointing first impression from a dry sample.
- Storing opened boxes in an airtight container in the fridge may help extend the freshness even further when you can't
Pitch Deck Teardown: Wilco's $7 million Seed deck (Haje Jan Kamps)
Wilco is a platform that allows software engineers to practice their hands-on skills through interactive "quests" that simulate real-life work scenarios. Engineers join a "fantasy company" and work through challenges using common tools from their actual jobs. This helps engineers improve both technical skills and soft skills. It also provides managers visibility into their engineers' progress. The platform is customizable and connects to common tools. Wilco aims to address gaps in engineers' development in a scalable way that other options like internal programs and bootcamps cannot. It is seeking funding to further develop its product and community.
Gearing up for the new Future of Work: Embracing Agentic AI (DianaGray10)
As we look toward the future of AI in 2025, Agentic AI is poised to revolutionize how we interact with intelligent systems, offering unprecedented autonomy, adaptability, and decision-making capabilities. This event will dive deep into what Agentic AI means, its key characteristics, and the significant value it can bring to businesses and industries. Attendees will gain insights into the emerging trends, challenges, and opportunities that come with this transformative technology, and learn how to prepare to support and implement Agentic AI solutions in the coming years.
Join us as we explore:
🔍 What is Agentic AI? Understanding its core principles and how it differs from traditional AI systems.
💡 The Value Proposition: How Agentic AI can streamline processes, drive innovation, and support smarter, more autonomous decision-making in various industries.
🚀 The Road Ahead for 2025: What businesses and professionals can do today to prepare for the shift toward more autonomous, intelligent systems.
🛠️ Supporting Agentic AI Solutions: Practical insights into how to integrate and manage Agentic AI solutions effectively within organizations.
Who Should Attend:
This event is ideal for anyone interested in the future of AI and automation, as well as those looking to understand and leverage Agentic AI for business growth and operational efficiency.
🤖 AI/Automation Professionals: Learn how to future-proof your skills and understand the upcoming wave of AI-driven transformation.
💼 Business Leaders and Decision-Makers: Gain strategic insights into the potential of Agentic AI for operational improvements, innovation, and competitive advantage.
🖥️ Technology and Solutions Architects: Understand the technical landscape of Agentic AI and how to build, integrate, and scale these advanced systems.
👨💻 Developers and Engineers: Learn about the latest advancements in AI technology and how to start working with Agentic AI solutions.
🌱 AI Enthusiasts and Innovators: For anyone curious about the future of AI and its potential to drive change in industries ranging from finance to healthcare and beyond.
Introduction to Open Source RAG and RAG Evaluation (Zilliz)
You’ve heard good data matters in Machine Learning, but does it matter for Generative AI applications? Corporate data often differs significantly from the general Internet data used to train most foundation models. Join me for a demo on building an open source RAG (Retrieval Augmented Generation) stack using Milvus vector database for Retrieval, LangChain, Llama 3 with Ollama, Ragas RAG Eval, and optional Zilliz cloud, OpenAI.
Plantee Seed Pitch Deck for TC Pitch Deck Teardown (Haje Jan Kamps)
The document is a pitch deck for Plantee Innovations, which is developing smart indoor gardening devices. They are raising $1.4 million to fund mass production and reach $1.7 million in revenue by 2025. Their flagship product is an all-in-one smart indoor greenhouse that monitors conditions like light, water, and temperature to provide ideal care for plants. They have market validation from a successful Kickstarter campaign and aim to address the large market of people who struggle to keep houseplants alive.
GENERATIVE AI, THE FUTURE OF PRODUCTIVITY (Andre Muscat)
Discuss the impact and opportunity of using Generative AI to support your development and creative teams
* Explore business challenges in content creation
* Cost-per-unit of different types of content
* Use AI to reduce cost-per-unit
* New partnerships being formed that will have a material impact on the way we search and engage with content
Part 4 of a 9 Part Research Series named "What matters in AI" published on www.andremuscat.com
This document summarizes a presentation given by Professor Pekka Abrahamsson on how ChatGPT and AI-assisted coding are profoundly changing software engineering. The presentation covers several key points:
- ChatGPT and AI tools like Copilot are beginning to be adopted in software engineering to provide code snippets, answers to technical questions, and assist with debugging, but issues around code ownership, reliability, and security need to be addressed.
- Early studies show potential benefits of ChatGPT for tasks like software testing education, code quality improvement, and requirements elicitation, but more research is still needed.
- Prompt engineering techniques can help maximize the usefulness of ChatGPT for software engineering tasks. Overall, AI
OpenAI is an AI research company dedicated to developing safe and beneficial artificial intelligence. Their mission is to ensure AI benefits humanity. OpenAI conducts research across various AI domains and develops technologies like ChatGPT, a large language model capable of answering questions and generating human-like responses. The company also offers developers access to its models and tools through an API.
Generative AI models, such as ChatGPT and Stable Diffusion, can create new and original content like text, images, video, audio, or other data from simple prompts, as well as handle complex dialogs and reason about problems with or without images. These models are disrupting traditional technologies, from search and content creation to automation and problem solving, and are fundamentally shaping the future user interface to computing devices. Generative AI can apply broadly across industries, providing significant enhancements for utility, productivity, and entertainment. As generative AI adoption grows at record-setting speeds and computing demands increase, on-device and hybrid processing are more important than ever. Just like traditional computing evolved from mainframes to today’s mix of cloud and edge devices, AI processing will be distributed between them for AI to scale and reach its full potential.
In this presentation you’ll learn about:
- Why on-device AI is key
- Full-stack AI optimizations to make on-device AI possible and efficient
- Advanced techniques like quantization, distillation, and speculative decoding
- How generative AI models can be run on device and examples of some running now
- Qualcomm Technologies’ role in scaling on-device generative AI
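Of the techniques listed, quantization is the simplest to illustrate. A toy 8-bit affine quantization of a weight vector (a sketch of the idea, not Qualcomm's actual stack, which operates on full tensors with hardware-aware calibration):

```python
def quantize_int8(weights):
    """Affine (asymmetric) 8-bit quantization: map floats onto 0..255."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0          # guard against constant inputs
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float weights from the quantized values."""
    return [v * scale + lo for v in q]

w = [-1.0, -0.5, 0.0, 0.5, 1.0]
q, scale, lo = quantize_int8(w)
restored = dequantize(q, scale, lo)
```

Storing `q` in one byte per weight instead of four (float32) is where the 4x memory and bandwidth savings that make on-device LLMs practical come from.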
Exploring Opportunities in the Generative AI Value Chain.pdf (Dung Hoang)
The article "Exploring Opportunities in the Generative AI Value Chain" by McKinsey & Company's QuantumBlack provides insights into the value created by generative artificial intelligence (AI) and its potential applications.
Leveraging Generative AI & Best Practices (DianaGray10)
In this event we will cover:
- What Generative AI is and how it is being used for the future of work.
- Best practices for developing and deploying generative AI-based models in production.
- The future of Generative AI: how it is expected to evolve in the coming years.
1) Artificial intelligence is the science and engineering of making intelligent machines that can perceive and take actions to maximize their success.
2) Early AI programs included the Logic Theorist which solved math theorems, and programs for playing checkers that learned from experience.
3) Recent advances in data, computing power, and techniques like machine learning, deep learning and neural networks have greatly expanded what AI can accomplish, with applications including computer vision, speech recognition, translation and more.
4) While current AI is specialized or "weak," the goal is to develop "strong" or general human-level AI that can perform any intellectual task, but this poses risks that must be addressed to ensure such systems remain
This talk overviews my background as a female data scientist, introduces many types of generative AI, discusses potential use cases, highlights the need for representation in generative AI, and showcases a few tools that currently exist.
This document discusses the rise of conversational AI and how digital agents can represent brands. It notes that generative AI enables new types of interactions that are more helpful than traditional chatbots. Digital agents can automate work by having natural conversations to complete tasks on behalf of users. The document provides examples of how a sales digital agent could assist a user before, during, and after a client meeting. It outlines six key ingredients for building effective digital agents, including prompting, context, proprietary knowledge, voice, reasoning, and code generation. The challenge for brands is to design unique digital agents that embody their values and approach in order to benefit from the changes brought by conversational AI.
The document discusses how generative AI can be used to scale content operations by reducing the time it takes to generate content. It explains that generative AI learns from natural language models and can generate new text or ideas based on prompts provided by users. While generative AI has benefits like speeding up content creation and ideation, it also has limitations such as not being able to conduct original research or ensure quality. The document provides examples of how generative AI can be used for tasks like generating ideas, simplifying complex text, creating visuals, and more. It also discusses challenges like bias in AI models and the low risk of plagiarism.
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s... (Mihai Criveti)
Mihai is the Principal Architect for Platform Engineering and Technology Solutions at IBM, responsible for Cloud Native and AI Solutions. He is a Red Hat Certified Architect, CKA/CKS, a leader in the IBM Open Innovation community, and an advocate for open source development. Mihai is driving the development of Retrieval Augmented Generation platforms and solutions for Generative AI at IBM that leverage WatsonX, vector databases, LangChain, HuggingFace, and open source AI models.
Mihai will share lessons learned building Retrieval Augmented Generation, or “Chat with Documents” platforms and APIs that scale, and deploy on Kubernetes. His talk will cover use cases for Generative AI, limitations of Large Language Models, use of RAG, Vector Databases and Fine Tuning to overcome model limitations and build solutions that connect to your data and provide content grounding, limit hallucinations and form the basis of explainable AI. In terms of technology, he will cover LLAMA2, HuggingFace TGIS, SentenceTransformers embedding models using Python, LangChain, and Weaviate and ChromaDB vector databases. He’ll also share tips on writing code using LLM, including building an agent for Ansible and containers.
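One recurring building block in such "Chat with Documents" pipelines is splitting text into overlapping chunks before embedding. A minimal sketch (the size and overlap values are illustrative defaults, not from the talk):

```python
def chunk(text, size=200, overlap=50):
    """Split text into overlapping character chunks for embedding.
    Overlap keeps sentences that straddle a boundary retrievable
    from at least one chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

pieces = chunk("word " * 100, size=120, overlap=30)
```

Production stacks typically chunk on sentence or token boundaries rather than raw characters, but the size/overlap trade-off is the same.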
Scaling factors for Large Language Model Architectures:
• Vector Database: consider sharding and High Availability
• Fine Tuning: collecting data to be used for fine tuning
• Governance and Model Benchmarking: how are you testing your model performance over time, with different prompts, one-shot, and various parameters
• Chain of Reasoning and Agents
• Caching embeddings and responses
• Personalization and Conversational Memory Database
• Streaming Responses and optimizing performance. A fine-tuned 13B model may perform better than a poor 70B one!
• Calling 3rd-party functions or APIs for reasoning or other types of data (e.g., LLMs are terrible at reasoning and prediction; consider calling other models)
• Fallback techniques: fall back to a different model, or default answers
• API scaling techniques, rate limiting, etc.
• Async, streaming and parallelization, multiprocessing, GPU acceleration (including embeddings), generating your API using OpenAPI, etc.
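Two of the cheaper wins on the list above, caching embeddings and falling back across models, can be sketched together (the embedding function and model callables here are placeholders, not a real provider API):

```python
from functools import lru_cache

@lru_cache(maxsize=4096)
def embed(text):
    """Placeholder embedding: letter counts. In practice this would
    call a real embedding model; the cache avoids recomputing it."""
    return tuple(text.lower().count(c) for c in "abcdefghij")

def generate_with_fallback(prompt, models, default="Sorry, try again later."):
    """Try each model in order; return a canned answer if all fail."""
    for model in models:
        try:
            return model(prompt)
        except Exception:
            continue
    return default

def flaky(prompt):
    raise RuntimeError("rate limited")   # simulates a 429 from a provider

def stable(prompt):
    return f"echo: {prompt}"

answer = generate_with_fallback("hello", [flaky, stable])
```

The same shape scales up: the cache becomes a shared store like Redis, and the fallback chain becomes "primary model, cheaper model, canned default".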
This document provides a 50-hour roadmap for building large language model (LLM) applications. It introduces key concepts like text-based and image-based generative AI models, encoder-decoder models, attention mechanisms, and transformers. It then covers topics like intro to image generation, generative AI applications, embeddings, attention mechanisms, transformers, vector databases, semantic search, prompt engineering, fine-tuning foundation models, orchestration frameworks, autonomous agents, bias and fairness, and recommended LLM application projects. The document recommends several hands-on exercises and lists upcoming bootcamp dates and locations for learning to build LLM applications.
The AI Index Report 2023 provides the following key highlights from its research and development chapter:
1. The US and China have the most cross-country AI research collaborations, though the rate of growth has slowed in recent years.
2. Global AI research output has more than doubled since 2010, led by areas like machine learning, computer vision and pattern recognition.
3. China now leads in total AI research publications, while the US still leads in conference and repository citations but these leads are decreasing.
4. Industry now produces far more significant AI models than academia, as building state-of-the-art systems requires greater resources that industry can provide.
5. Large language models
Presenting the landscape of AI/ML in 2023 with a quick summary of the last 10 years of progress, the current situation, and a look at what is happening behind the scenes.
An overview of the most important AI capabilities in marketing, advertising and content creation. I made this presentation to inform, educate and inspire people in the creative industries to familiarise themselves with the incredible toolsets that are already here and in development. I also explain how generative AI works and explore some possible new roles and business models for agencies. Hope you enjoy it!
This document provides information about a bootcamp to build applications using Large Language Models (LLMs). The bootcamp consists of 11 modules covering topics such as introduction to generative AI, text analytics techniques, neural network models for natural language processing, transformer models, embedding retrieval, semantic search, prompt engineering, fine-tuning LLMs, orchestration frameworks, the LangChain application platform, and a final project to build a custom LLM application. The bootcamp will be held in various locations and dates between September 2023 and January 2024.
A non-technical overview of Large Language Models, exploring their potential, limitations, and customization for specific challenges. While this deck was tailored with an audience from the financial industry in mind, its content remains broadly applicable.
(Note: Discover a slightly updated version of this deck at slideshare.net/LoicMerckel/introduction-to-llms.)
The Future of AI is Generative, not Discriminative, 5/26/2021 (Steve Omohundro)
The deep learning AI revolution has been sweeping the world for a decade now. Deep neural nets are routinely used for tasks like translation, fraud detection, and image classification. PwC estimates that they will create $15.7 trillion/year of value by 2030. But most current networks are "discriminative" in that they directly map inputs to predictions. This type of model requires lots of training examples, doesn't generalize well outside of its training set, creates inscrutable representations, is subject to adversarial examples, and makes knowledge transfer difficult. People, in contrast, can learn from just a few examples, generalize far beyond their experience, and can easily transfer and reuse knowledge. In recent years, new kinds of "generative" AI models have begun to exhibit these desirable human characteristics. They represent the causal generative processes by which the data is created and can be compositional, compact, and directly interpretable. Generative AI systems that assist people can model their needs and desires and interact with empathy. Their adaptability to changing circumstances will likely be required by rapidly changing AI-driven business and social systems. Generative AI will be the engine of future AI innovation.
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT (Anant Corporation)
This document provides an agenda for a full-day bootcamp on large language models (LLMs) like GPT-3. The bootcamp will cover fundamentals of machine learning and neural networks, the transformer architecture, how LLMs work, and popular LLMs beyond ChatGPT. The agenda includes sessions on LLM strategy and theory, design patterns for LLMs, no-code/code stacks for LLMs, and building a custom chatbot with an LLM and your own data.
* "Responsible AI Leadership: A Global Summit on Generative AI"
* April 2023 guide for experts and policymakers
* Developing and governing generative AI systems
* 100+ thought leaders and practitioners participated
* Recommendations for responsible development, open innovation & social progress
* 30 action-oriented recommendations aim to help navigate AI complexities
Using LLM Agents with Llama 3.2, LangGraph and Milvus (Zilliz)
We explore Agentic RAG (internet search, hallucination checking, answer correction). Don't miss this deep dive into one of the hottest topics in AI today!
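The "check for hallucinations" step can be crudely approximated with a lexical grounding test, sketched below; production pipelines typically use an LLM grader instead, and the threshold and word-length cutoff here are assumptions for illustration:

```python
def grounded(answer, context, threshold=0.6):
    """Flag an answer as grounded if enough of its content words
    (longer than 3 characters) appear in the retrieved context.
    A crude lexical proxy for a real hallucination check."""
    ans = {w for w in answer.lower().split() if len(w) > 3}
    ctx = set(context.lower().split())
    if not ans:
        return True
    return len(ans & ctx) / len(ans) >= threshold

context = "milvus is an open source vector database built for scale"
ok = grounded("milvus is a vector database", context)
bad = grounded("milvus was founded on mars in 1850", context)
```

When the check fails, the agent loops back: re-retrieve, search the internet, or regenerate the answer.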
Building an Agentic RAG locally with Ollama and Milvus (Zilliz)
With the rise of open-source LLMs like Llama, Mistral, Gemma, and more, it has become apparent that LLMs can be useful even when run locally. In this talk, we will see how to deploy an Agentic Retrieval Augmented Generation (RAG) setup using Ollama, with Milvus as the vector database, on your laptop. That way, you can also avoid being rate-limited by OpenAI like I have been in the past.
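A minimal sketch of the routing decision such an agentic setup makes, send the question to the local vector store or straight to the LLM (the topic list and names are illustrative; a real agent would let the model itself decide):

```python
def route(question, kb_topics):
    """Toy router for an agentic RAG: questions mentioning a known
    knowledge-base topic go to retrieval, everything else goes
    directly to generation."""
    q = question.lower()
    return "retrieve" if any(topic in q for topic in kb_topics) else "generate"

topics = {"milvus", "ollama", "vector"}
decision = route("How do I configure Milvus on my laptop?", topics)
```

In the Ollama + Milvus setup, "retrieve" triggers a similarity search before generation, while "generate" calls the local model directly, keeping everything off rate-limited cloud APIs.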
GENERATIVE AI, THE FUTURE OF PRODUCTIVITYAndre Muscat
Discuss the impact and opportunity of using Generative AI to support your development and creative teams
* Explore business challenges in content creation
* Cost-per-unit of different types of content
* Use AI to reduce cost-per-unit
* New partnerships being formed that will have a material impact on the way we search and engage with content
Part 4 of a 9 Part Research Series named "What matters in AI" published on www.andremuscat.com
This document summarizes a presentation given by Professor Pekka Abrahamsson on how ChatGPT and AI-assisted coding is profoundly changing software engineering. The presentation covers several key points:
- ChatGPT and AI tools like Copilot are beginning to be adopted in software engineering to provide code snippets, answers to technical questions, and assist with debugging, but issues around code ownership, reliability, and security need to be addressed.
- Early studies show potential benefits of ChatGPT for tasks like software testing education, code quality improvement, and requirements elicitation, but more research is still needed.
- Prompt engineering techniques can help maximize the usefulness of ChatGPT for software engineering tasks. Overall, AI
OpenAI is an AI research company dedicated to developing safe and beneficial artificial intelligence. Their mission is to ensure AI benefits humanity. OpenAI conducts research across various AI domains and develops technologies like ChatGPT, a large language model capable of answering questions and generating human-like responses. The company also offers developers access to its models and tools through an API.
Generative AI models, such as ChatGPT and Stable Diffusion, can create new and original content like text, images, video, audio, or other data from simple prompts, as well as handle complex dialogs and reason about problems with or without images. These models are disrupting traditional technologies, from search and content creation to automation and problem solving, and are fundamentally shaping the future user interface to computing devices. Generative AI can apply broadly across industries, providing significant enhancements for utility, productivity, and entertainment. As generative AI adoption grows at record-setting speeds and computing demands increase, on-device and hybrid processing are more important than ever. Just like traditional computing evolved from mainframes to today’s mix of cloud and edge devices, AI processing will be distributed between them for AI to scale and reach its full potential.
In this presentation you’ll learn about:
- Why on-device AI is key
- Full-stack AI optimizations to make on-device AI possible and efficient
- Advanced techniques like quantization, distillation, and speculative decoding
- How generative AI models can be run on device and examples of some running now
- Qualcomm Technologies’ role in scaling on-device generative AI
Exploring Opportunities in the Generative AI Value Chain.pdfDung Hoang
The article "Exploring Opportunities in the Generative AI Value Chain" by McKinsey & Company's QuantumBlack provides insights into the value created by generative artificial intelligence (AI) and its potential applications.
Leveraging Generative AI & Best practicesDianaGray10
In this event we will cover:
- What is Generative AI and how it is being for future of work.
- Best practices for developing and deploying generative AI based models in productions.
- Future of Generative AI, how generative AI is expected to evolve in the coming years.
1) Artificial intelligence is the science and engineering of making intelligent machines that can perceive and take actions to maximize their success.
2) Early AI programs included the Logic Theorist which solved math theorems, and programs for playing checkers that learned from experience.
3) Recent advances in data, computing power, and techniques like machine learning, deep learning and neural networks have greatly expanded what AI can accomplish, with applications including computer vision, speech recognition, translation and more.
4) While current AI is specialized or "weak," the goal is to develop "strong" or general human-level AI that can perform any intellectual task, but this poses risks that must be addressed to ensure such systems remain
This talk overviews my background as a female data scientist, introduces many types of generative AI, discusses potential use cases, highlights the need for representation in generative AI, and showcases a few tools that currently exist.
This document discusses the rise of conversational AI and how digital agents can represent brands. It notes that generative AI enables new types of interactions that are more helpful than traditional chatbots. Digital agents can automate work by having natural conversations to complete tasks on behalf of users. The document provides examples of how a sales digital agent could assist a user before, during, and after a client meeting. It outlines six key ingredients for building effective digital agents, including prompting, context, proprietary knowledge, voice, reasoning, and code generation. The challenge for brands is to design unique digital agents that embody their values and approach in order to benefit from the changes brought by conversational AI.
The document discusses how generative AI can be used to scale content operations by reducing the time it takes to generate content. It explains that generative AI learns from natural language models and can generate new text or ideas based on prompts provided by users. While generative AI has benefits like speeding up content creation and ideation, it also has limitations such as not being able to conduct original research or ensure quality. The document provides examples of how generative AI can be used for tasks like generating ideas, simplifying complex text, creating visuals, and more. It also discusses challenges like bias in AI models and the low risk of plagiarism.
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Mihai Criveti
Mihai is the Principal Architect for Platform Engineering and Technology Solutions at IBM, responsible for Cloud Native and AI Solutions. He is a Red Hat Certified Architect, CKA/CKS, a leader in the IBM Open Innovation community, and advocate for open source development. Mihai is driving the development of Retrieval Augmentation Generation platforms, and solutions for Generative AI at IBM that leverage WatsonX, Vector databases, LangChain, HuggingFace and open source AI models.
Mihai will share lessons learned building Retrieval Augmented Generation, or “Chat with Documents” platforms and APIs that scale, and deploy on Kubernetes. His talk will cover use cases for Generative AI, limitations of Large Language Models, use of RAG, Vector Databases and Fine Tuning to overcome model limitations and build solutions that connect to your data and provide content grounding, limit hallucinations and form the basis of explainable AI. In terms of technology, he will cover LLAMA2, HuggingFace TGIS, SentenceTransformers embedding models using Python, LangChain, and Weaviate and ChromaDB vector databases. He’ll also share tips on writing code using LLM, including building an agent for Ansible and containers.
Scaling factors for Large Language Model Architectures:
• Vector Database: consider sharding and High Availability
• Fine Tuning: collecting data to be used for fine tuning
• Governance and Model Benchmarking: how are you testing your model performance
over time, with different prompts, one-shot, and various parameters
• Chain of Reasoning and Agents
• Caching embeddings and responses
• Personalization and Conversational Memory Database
• Streaming Responses and optimizing performance. A fine tuned 13B model may
perform better than a poor 70B one!
• Calling 3rd party functions or APIs for reasoning or other type of data (ex: LLMs are
terrible at reasoning and prediction, consider calling other models)
• Fallback techniques: fallback to a different model, or default answers
• API scaling techniques, rate limiting, etc.
• Async, streaming and parallelization, multiprocessing, GPU acceleration (including
embeddings), generating your API using OpenAPI, etc.
This document provides a 50-hour roadmap for building large language model (LLM) applications. It introduces key concepts like text-based and image-based generative AI models, encoder-decoder models, attention mechanisms, and transformers. It then covers topics like intro to image generation, generative AI applications, embeddings, attention mechanisms, transformers, vector databases, semantic search, prompt engineering, fine-tuning foundation models, orchestration frameworks, autonomous agents, bias and fairness, and recommended LLM application projects. The document recommends several hands-on exercises and lists upcoming bootcamp dates and locations for learning to build LLM applications.
The AI Index Report 2023 provides the following key highlights from its research and development chapter:
1. The US and China have the most cross-country AI research collaborations, though the rate of growth has slowed in recent years.
2. Global AI research output has more than doubled since 2010, led by areas like machine learning, computer vision and pattern recognition.
3. China now leads in total AI research publications, while the US still leads in conference and repository citations but these leads are decreasing.
4. Industry now produces far more significant AI models than academia, as building state-of-the-art systems requires greater resources that industry can provide.
5. Large language models
Presenting the landscape of AI/ML in 2023 by introducing a quick summary of the last 10 years of its progress, current situation, and looking at things happening behind the scene.
An overview of the most important AI capabilities in marketing, advertising, and content creation. I made this presentation to inform, educate, and inspire people in the creative industries to familiarise themselves with the incredible toolsets that are already here and in development. I also explain how generative AI works and explore some possible new roles and business models for agencies. Hope you enjoy it!
This document provides information about a bootcamp to build applications using Large Language Models (LLMs). The bootcamp consists of 11 modules covering topics such as introduction to generative AI, text analytics techniques, neural network models for natural language processing, transformer models, embedding retrieval, semantic search, prompt engineering, fine-tuning LLMs, orchestration frameworks, the LangChain application platform, and a final project to build a custom LLM application. The bootcamp will be held in various locations and dates between September 2023 and January 2024.
A non-technical overview of Large Language Models, exploring their potential, limitations, and customization for specific challenges. While this deck is tailored for an audience from the financial industry, its content remains broadly applicable.
(Note: Discover a slightly updated version of this deck at slideshare.net/LoicMerckel/introduction-to-llms.)
The Future of AI is Generative, not Discriminative (5/26/2021) - Steve Omohundro
The deep learning AI revolution has been sweeping the world for a decade now. Deep neural nets are routinely used for tasks like translation, fraud detection, and image classification. PwC estimates that they will create $15.7 trillion/year of value by 2030. But most current networks are "discriminative" in that they directly map inputs to predictions. This type of model requires lots of training examples, doesn't generalize well outside of its training set, creates inscrutable representations, is subject to adversarial examples, and makes knowledge transfer difficult. People, in contrast, can learn from just a few examples, generalize far beyond their experience, and can easily transfer and reuse knowledge. In recent years, new kinds of "generative" AI models have begun to exhibit these desirable human characteristics. They represent the causal generative processes by which the data is created and can be compositional, compact, and directly interpretable. Generative AI systems that assist people can model their needs and desires and interact with empathy. Their adaptability to changing circumstances will likely be required by rapidly changing AI-driven business and social systems. Generative AI will be the engine of future AI innovation.
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT - Anant Corporation
This document provides an agenda for a full-day bootcamp on large language models (LLMs) like GPT-3. The bootcamp will cover fundamentals of machine learning and neural networks, the transformer architecture, how LLMs work, and popular LLMs beyond ChatGPT. The agenda includes sessions on LLM strategy and theory, design patterns for LLMs, no-code/code stacks for LLMs, and building a custom chatbot with an LLM and your own data.
* "Responsible AI Leadership: A Global Summit on Generative AI"
* April 2023 guide for experts and policymakers on developing and governing generative AI systems
* 100+ thought leaders and practitioners participated
* 30 action-oriented recommendations for responsible development, open innovation, and social progress
* Aims to help navigate AI complexities
Using LLM Agents with Llama 3.2, LangGraph and Milvus - Zilliz
We explore Agentic RAG (internet search, hallucination checks, answer correction). Don't miss this deep dive into one of the hottest topics in AI today!
Building an Agentic RAG locally with Ollama and Milvus - Zilliz
With the rise of open-source LLMs like Llama, Mistral, Gemma, and more, it has become apparent that LLMs can be useful even when run locally. In this talk, we will see how to deploy an Agentic Retrieval Augmented Generation (RAG) setup using Ollama, with Milvus as the vector database, on your laptop. That way, you can also avoid being rate-limited by OpenAI, as I have been in the past.
LangGraph GraphRAG agent with Llama 3.1 and GPT-4o
Let's build an advanced RAG system with a GraphRAG agent that runs a combination of Llama 3.1 and GPT-4o; for Llama 3.1 we will use Ollama. The idea is to use GPT-4o for advanced tasks, like generating the Neo4j query, and Llama 3.1 for the rest.
Multi-agent Systems with Mistral AI, Milvus and Llama-agents - Zilliz
Agentic systems are on the rise, helping developers create intelligent, autonomous systems. LLMs are becoming more and more capable of following diverse sets of instructions, making them ideal for managing these agents. This advancement opens up numerous possibilities for handling complex tasks with minimal human intervention in so many areas. In this talk, we will see how to build agents using llama-agents. We’ll also explore how combining different LLMs can enable various actions. For simpler tasks, we'll use Mistral Nemo, a smaller and more cost-effective model, and Mistral Large for orchestrating different agents.
Multi-agent Systems with Mistral AI, Milvus and Llama-agents - Zilliz
With the recent release of Llama Agents, we can now build agents that are async first and run as their own service. During this webinar, Stephen will show you how to build an Agentic RAG System using Llama Agents and Milvus.
2024-10-28 All Things Open - Advanced Retrieval Augmented Generation (RAG) Techniques - Timothy Spann
https://2024.allthingsopen.org/sessions/advanced-retrieval-augmented-generation-rag-techniques
In 2023, we saw many simple retrieval augmented generation (RAG) examples being built. However, most of these examples and frameworks built around them simplified the process too much. Businesses were unable to derive value from their implementations. That’s because there are many other techniques involved in tuning a basic RAG app to work for you. In this talk we will cover three of the techniques you need to understand and leverage to build better RAG: chunking, embedding model choice, and metadata structuring.
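Of those three techniques, chunking is the simplest to show in code. A minimal fixed-size chunker with overlap might look like this (the sizes are illustrative defaults, not recommendations from the talk):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks.

    The overlap keeps sentences that straddle a boundary retrievable
    from both neighboring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

print(len(chunk_text("a" * 500)))  # starts at 0, 150, 300, 450 -> 4 chunks
```

Real pipelines usually chunk by tokens or sentences rather than characters, but the size/overlap trade-off is the same.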
17-October-2024 NYC AI Camp - Step-by-Step RAG 101 - Timothy Spann
https://github.com/tspannhw/AIM-BecomingAnAIEngineer
https://github.com/tspannhw/AIM-Ghosts
AIM - Becoming An AI Engineer
Step 1 - Start off local
Download Python (or use your local install)
https://www.python.org/downloads/
Create a virtual environment
https://docs.python.org/3/library/venv.html
python3.11 -m venv yourenv
source yourenv/bin/activate
Use Pip
https://pip.pypa.io/en/stable/installation/
Setup a .env file for environment variables
Download Jupyter Lab
https://jupyter.org/
Run your notebook
jupyter lab --ip="0.0.0.0" --port=8881 --allow-root
Running on a Mac or Linux machine is optimal.
Setup environment variables
source .env
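A minimal .env file for the steps above might look like this; the variable names are illustrative, so use whichever keys your notebooks actually read, and never commit real secrets:

```shell
# .env - illustrative environment variables for local RAG notebooks
export MILVUS_URI="http://localhost:19530"
export OPENAI_API_KEY="sk-..."
export JUPYTER_PORT="8881"
```

Running `source .env` exports these into the current shell so Jupyter and its child processes inherit them.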
Alternatives
Download Conda
https://docs.conda.io/projects/conda/en/latest/index.html
https://colab.research.google.com/
Other languages: Java, .Net, Go, NodeJS
Other notebooks to try
https://zilliz.com/learn/milvus-notebooks
https://github.com/milvus-io/bootcamp/blob/master/bootcamp/tutorials/quickstart/build_RAG_with_milvus.ipynb
References
Guides
https://zilliz.com/learn
HuggingFace
https://zilliz.com/learn/effortless-ai-workflows-a-beginners-guide-to-hugging-face-and-pymilvus
Milvus
https://zilliz.com/milvus-downloads
https://milvus.io/docs/quickstart.md
LangChain
https://zilliz.com/learn/LangChain
Notebook display
https://ipywidgets.readthedocs.io/en/stable/user_install.html
References
https://medium.com/@zilliz_learn/function-calling-with-ollama-llama-3-2-and-milvus-ac2bc2122538
https://github.com/milvus-io/bootcamp/tree/master/bootcamp/RAG/advanced_rag
https://zilliz.com/learn/Retrieval-Augmented-Generation
https://zilliz.com/blog/scale-search-with-milvus-handle-massive-datasets-with-ease
https://zilliz.com/learn/generative-ai
https://zilliz.com/learn/what-are-binary-vector-embedding
https://zilliz.com/learn/choosing-right-vector-index-for-your-project
2024 Dec 05 - PyData Global - Tutorial: It's In The Air Tonight - Timothy Spann
https://pydata.org/global2024/schedule
Tim Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://global2024.pydata.org/cfp/talk/L9JXKS/
It's in the Air Tonight. Sensor Data in RAG
12-05, 18:30–20:00 (UTC), General Track
Today we will learn how to build an application around sensor data, REST feeds, weather data, traffic cameras, and vector data. We will write a simple Python application to collect various structured, semi-structured, and unstructured data. We will process, enrich, augment, and vectorize this data and insert it into a vector database to be used for semantic hybrid search and filtering. We will then build a Jupyter notebook to analyze, query, and return this data.
Along the way we will learn the basics of vector databases and Milvus. While building it we will see the practical reasons we choose which indexes make sense, what to vectorize, and how to query multiple vectors, even when one is an image and one is text. We will see why we do filtering. We will then use our vector database of air quality readings to feed our LLM and get proper answers to air quality questions. I will show you all the steps to build a RAG application with Milvus, LangChain, Ollama, Python, and air quality reports. Finally, after the demos, I will answer questions and provide the source code and additional resources, including articles.
Goal of this Application
In this application, we will build an advanced data model and use it for ingest and various search options. For this notebook portion, we will
1️⃣ Ingest Data Fields, Enrich Data With Lookups, and Format:
Learn to ingest data from sources including JSON and images, and format and transform it to optimize hybrid searches. This is done inside the streetcams.py application.
2️⃣ Store Data into Milvus:
Learn to store data into Milvus, an efficient vector database designed for high-speed similarity searches and AI applications. In this step we optimize the data model with scalar and multiple vector fields -- one for text and one for the camera image. We do this in the streetcams.py application.
3️⃣ Use Open Source Models for Data Queries in a Hybrid Multi-Modal, Multi-Vector Search:
Discover how to use scalars and multiple vectors to query data stored in Milvus and re-rank the final results in this notebook.
4️⃣ Display resulting text and images:
Build a quick output for validation and checking in this notebook.
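Step 3's re-ranking can be illustrated with a simple weighted score fusion in plain Python. The scores and weights below are toy values; in the actual notebook, Milvus computes the per-vector similarities:

```python
def fuse_scores(text_scores: dict[str, float],
                image_scores: dict[str, float],
                text_weight: float = 0.6) -> list[tuple[str, float]]:
    """Re-rank candidates by a weighted sum of per-vector similarity scores."""
    ids = set(text_scores) | set(image_scores)
    fused = {
        i: text_weight * text_scores.get(i, 0.0)
           + (1 - text_weight) * image_scores.get(i, 0.0)
        for i in ids
    }
    # highest fused score first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# toy similarity scores for three camera records
text = {"cam1": 0.9, "cam2": 0.4, "cam3": 0.7}
image = {"cam1": 0.2, "cam2": 0.95, "cam3": 0.6}
print(fuse_scores(text, image))
```

Weighted fusion is only one option; reciprocal rank fusion is a common alternative when the raw scores are not on comparable scales.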
Timothy Spann
Tim Spann is a Principal. He works with Apache Kafka, Apache Pulsar, Apache Flink, Flink SQL, Milvus, Generative AI, HuggingFace, Python, Java, Apache NiFi, Apache Spark, Big Data, IoT, Cloud, AI/DL, Machine Learning, and Deep Learning. Tim has over ten years of experience with the IoT, big data, distributed computing, messaging, streaming technologies, and Java programming. Previously, he was a Principal Developer Advocate at Zilliz, Principal Developer Advocate at cldra
MultiModal RAG using vLLM and Pixtral - Stephen Batifol - Zilliz
While text-based RAG systems have been everywhere in the last year and a half, there is so much more than text data. Images, audio, and documents often need to be processed together to provide meaningful insights, yet most RAG implementations focus solely on text.
In this talk, we'll explore the architecture that makes it possible to run such a system and demonstrate how to build one using Milvus, LlamaIndex, and vLLM for deploying open-source LLMs on your own infrastructure.
Through a live demo, we'll showcase a real-world application processing both images and text queries :D
A Beginner's Guide to Building a RAG App Using Open Source Milvus - Zilliz
We will showcase how you can build a RAG using Milvus. Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.
Retrieval-Augmented Generation (RAG) has emerged as a cornerstone technology powering the latest wave of Generative AI applications, from sophisticated question-answering systems to advanced semantic search engines. As RAG's popularity has grown, we've witnessed a proliferation of methods promising to enhance the traditional RAG pipeline. These innovations include query rewriting, intelligent routing, and result reranking—but how do we measure their real impact on application performance?
Join us for an informational webinar where we'll explore robust evaluation frameworks, including LLM-as-a-Judge methodologies, industry-standard benchmarking datasets, and innovative synthetic data generation techniques. By the end of this session, you'll master practical approaches to evaluate and optimize RAG systems, equipped with the knowledge to implement these tools effectively in your own applications.
Building Production Ready Search Pipelines with Spark and Milvus - Zilliz
Read more: https://zilliz.com/blog/building-production-ready-search-pipelines-spark-milvus
Spark is a widely used ETL tool for processing, indexing, and ingesting data to the serving stack for search. Milvus is a production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to the Milvus vector database for search serving.
Building RAG with self-deployed Milvus vector database and Snowpark Container Services - Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
06-NOV-2024 AI Alliance NYC - Intro to Data Prep Kit and Open Source RAG - Timothy Spann
Open source toolkit
Helps with data prep
Handles documents + code
Many ready to use modules out of the box
Python
Develop on laptop, scale on clusters
https://medium.com/@tspann
The document discusses Retrieval Augmented Generation (RAG), a technique to improve responses from large language models by providing additional context from external knowledge sources. It outlines challenges with current language models, including inconsistent responses and a lack of understanding. As a solution, it proposes augmenting models with RAG and additional context. It then provides an example of implementing a RAG pipeline to power a question answering system for Munich Airport, describing the components needed and hosting options for large language models.
LLMs are powerful tools for generating responses but have limitations without access to up-to-date and proprietary information. A retrieval augmented generation (RAG) workflow enables LLMs to provide more accurate answers by incorporating a vector database with proprietary data and using text embedding models to retrieve and rank relevant information to augment the LLM's response. Running RAG locally on RTX GPUs provides benefits like low latency, data privacy, and no server costs compared to cloud solutions.
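The retrieve-then-augment workflow described above can be condensed into a toy sketch; the hand-made three-dimensional "embeddings" stand in for a real embedding model, and the assembled prompt would be sent to an LLM:

```python
import math

# toy knowledge base: (text, embedding) pairs; a real system would embed
# documents with a text-embedding model and store them in a vector database
DOCS = [
    ("Milvus is an open-source vector database.", [0.9, 0.1, 0.0]),
    ("BM25 is a lexical ranking function.",       [0.1, 0.9, 0.0]),
    ("Ollama runs LLMs locally.",                 [0.0, 0.1, 0.9]),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, k=1):
    """Rank documents by cosine similarity to the query embedding."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, query_vec):
    """Augment the question with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query_vec))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(build_prompt("What is Milvus?", [1.0, 0.0, 0.0]))
```

A vector database replaces the linear scan here with an approximate nearest-neighbor index, which is what keeps retrieval fast at scale.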
Supercharge Spark: Unleashing Big Data Potential with Milvus for RAG systems - Zilliz
Apache Spark dominates the big data processing world, but efficient vector similarity search on massive datasets remains a bottleneck. This talk will show how you can seamlessly integrate Milvus with Spark to unlock the true power of vector similarity search.
We'll explore how Milvus integrates with Spark, enabling efficient vector search within Spark workflows. Real-world applications showcasing the combined power of Spark and Milvus in tackling complex similarity search challenges will be presented. Finally, we'll shed light on the significant performance gains achieved through this integration.
Whether you're dealing with recommendation systems, image retrieval, or any other application requiring vector similarity search, this talk will equip you with the knowledge to leverage Spark and Milvus to their maximum potential.
Join us on this exploration of how Spark and Milvus can enhance your big data processing capabilities with fast similarity search even at scale!
06-18-2024 Princeton Meetup - Introduction to Milvus - Timothy Spann
[email protected]
https://www.linkedin.com/in/timothyspann/
https://x.com/paasdev
https://github.com/tspannhw
https://github.com/milvus-io/milvus
Get Milvused!
https://milvus.io/
Read my Newsletter every week!
https://github.com/tspannhw/FLiPStackWeekly/blob/main/142-17June2024.md
For more cool Unstructured Data, AI and Vector Database videos check out the Milvus vector database videos here
https://www.youtube.com/@MilvusVectorDatabase/videos
Unstructured Data Meetups -
https://www.meetup.com/unstructured-data-meetup-new-york/
https://lu.ma/calendar/manage/cal-VNT79trvj0jS8S7
https://www.meetup.com/pro/unstructureddata/
https://zilliz.com/community/unstructured-data-meetup
https://zilliz.com/event
Twitter/X: https://x.com/milvusio https://x.com/paasdev
LinkedIn: https://www.linkedin.com/company/zilliz/ https://www.linkedin.com/in/timothyspann/
GitHub: https://github.com/milvus-io/milvus https://github.com/tspannhw
Invitation to join Discord: https://discord.com/invite/FjCMmaJng6
Blogs: https://milvusio.medium.com/ https://www.opensourcevectordb.cloud/ https://medium.com/@tspann
Expand LLMs' knowledge by incorporating external data sources into your AI applications.
Zilliz Cloud Monthly Technical Review: May 2025 - Zilliz
About this webinar
Join our monthly demo for a technical overview of Zilliz Cloud, a highly scalable and performant vector database service for AI applications
Topics covered
- Zilliz Cloud's scalable architecture
- Key features of the developer-friendly UI
- Security best practices and data privacy
- Highlights from recent product releases
This webinar is an excellent opportunity for developers to learn about Zilliz Cloud's capabilities and how it can support their AI projects. Register now to join our community and stay up-to-date with the latest vector database technology.
Smarter RAG Pipelines: Scaling Search with Milvus and Feast - Zilliz
About this webinar
Learn how Milvus and Feast can be used together to scale vector search and easily declare views for retrieval using open source. We’ll demonstrate how to integrate Milvus with Feast to build a customized RAG pipeline.
Topics Covered
- Leverage Feast for dynamic metadata and document storage and retrieval, ensuring that the correct data is always available at inference time
- Learn how to integrate Feast with Milvus to support vector-based retrieval in RAG systems
- Use Milvus for fast, high-dimensional similarity search, enhancing the retrieval phase of your RAG model
Hands-on Tutorial: Building an Agent to Reason about Private Data with OpenAI... - Zilliz
In this tutorial, we build an agent from scratch to reason over the Milvus documentation and Discord server history. We demonstrate fundamental agentic concepts such as long-term memory, tool use, reflection, conditional execution flow, and reasoning models. Our agent’s design is informed by recent open-source attempts to reproduce Deep Research.
Agentic AI in Action: Real-Time Vision, Memory & Autonomy with Browser Use & Milvus - Zilliz
About this webinar
Discover how to integrate Vision Language Models with Browser Use and Milvus to create an agentic system capable of real-time visual and textual analysis. Ideal for developers who want to learn how to use Agents that can see, take action, and remember what they saw.
This Session Will:
- Demonstrate a workflow where Browser Use extracts dynamic web data while Milvus stores and retrieves it, so you can always come back to what the agent saw.
- Showcase practical use cases, such as querying live web content with AI agents that reason over historical and visual data.
- Explore balancing autonomy and control in agentic systems, including challenges like hallucination mitigation and performance optimization.
Webinar - Zilliz Cloud Monthly Demo - March 2025 - Zilliz
Join our monthly demo for a technical overview of Zilliz Cloud, a highly scalable and performant vector database service for AI applications
Topics covered
- Zilliz Cloud's scalable architecture
- Key features of the developer-friendly UI
- Security best practices and data privacy
- Highlights from recent product releases
- This webinar is an excellent opportunity for developers to learn about Zilliz Cloud's capabilities and how it can support their AI projects. Register now to join our community and stay up-to-date with the latest vector database technology.
What Makes "Deep Research"? A Dive into AI Agents - Zilliz
About this webinar:
Unless you live under a rock, you will have heard about OpenAI’s release of Deep Research on Feb 2, 2025. This new product promises to revolutionize how we answer questions requiring the synthesis of large amounts of diverse information. But how does this technology work, and why is Deep Research a noticeable improvement over previous attempts? In this webinar, we will examine the concepts underpinning modern agents using our basic clone, Deep Searcher, as an example.
Topics covered:
Tool use
Structured output
Reflection
Reasoning models
Planning
Types of agentic memory
Combining Lexical and Semantic Search with Milvus 2.5 - Zilliz
In short, lexical search is a way to search your documents based on the keywords they contain, in contrast to semantic search, which compares the similarity of embeddings. We’ll be covering:
Why, when, and how should you use lexical search
What is the BM25 distance metric
How exactly does Milvus 2.5 implement lexical search
How to build an improved hybrid lexical + semantic search with Milvus 2.5
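The BM25 metric mentioned above can be sketched with the standard formula in stdlib Python (k1 and b use common defaults; the toy corpus is illustrative, and Milvus 2.5's built-in implementation is of course more sophisticated):

```python
import math
from collections import Counter

def bm25_scores(query: list[str], docs: list[list[str]],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each tokenized document against a tokenized query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # document frequency: in how many docs each term appears
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [["milvus", "vector", "database"],
        ["lexical", "search", "with", "bm25"],
        ["semantic", "search", "with", "embeddings"]]
print(bm25_scores(["lexical", "search"], docs))
```

The term-frequency saturation (k1) and length normalization (b) are what distinguish BM25 from plain TF-IDF, and they are why it remains a strong lexical baseline next to embedding-based retrieval.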
Bedrock Data Automation (Preview): Simplifying Unstructured Data Processing - Zilliz
Bedrock Data Automation (BDA) is a cloud-based service that simplifies the process of extracting valuable insights from unstructured content—such as documents, images, video, and audio. Come learn how BDA leverages generative AI to automate the transformation of multi-modal data into structured formats, enabling developers to build applications and automate complex workflows with greater speed and accuracy.
Deploying a Multimodal RAG System Using Open Source Milvus, LlamaIndex, and vLLM - Zilliz
About this webinar:
While text-based RAG systems have been everywhere in the last year and a half, there is so much more than text data. Images, audio, and documents often need to be processed together to provide meaningful insights, yet most RAG implementations focus solely on text. Think about automated visual inspection systems understanding manufacturing logs and production line images, or robotics systems correlating sensor data with visual feedback. These multimodal scenarios demand RAG systems that go beyond text-only processing.
In this talk, we'll walk through how to build a Multimodal RAG system that helps solve this problem. We'll explore the architecture that makes it possible to run such a system and demonstrate how to build one using Milvus, LlamaIndex, and vLLM for deploying open-source LLMs on your infrastructure.
Through a live demo, we'll showcase a real-world application processing both images and text queries. Whether you're looking to reduce API costs, maintain data privacy, or gain more control over your AI infrastructure, this session will provide you with actionable insights to implement Multimodal RAG in your organization.
Topics covered:
- vLLM and self hosting LLMs
- Multimodal RAG Demo: a real-world application processing both images and text queries
February Product Demo: Discover the Power of Zilliz Cloud - Zilliz
Join our monthly demo for a technical overview of Zilliz Cloud, a highly scalable and performant vector database service for AI applications
Topics covered
- Zilliz Cloud's scalable architecture
- Key features of the developer-friendly UI
- Security best practices and data privacy
- Highlights from recent product releases
This webinar is an excellent opportunity for developers to learn about Zilliz Cloud's capabilities and how it can support their AI projects. Register now to join our community and stay up-to-date with the latest vector database technology.
Full Text Search with Milvus 2.5 - UD Meetup Berlin Jan 23 - Zilliz
"Milvus 2.5 adds native full text search capabilities, seamlessly combining term-based matching with vector similarity in a single system. This feature automatically handles text-to-vector conversion and real-time BM25 scoring, eliminating the complexity of manual embedding generation and external processing pipelines.
Through a live demo, we'll showcase how easy we make it to use Full Text search now :D"
Building the Next-Gen Apps with Multimodal Retrieval using Twelve Labs & Milvus - Zilliz
"This session dives deep into the power of Multimodal Retrieval, a revolutionizing approach that enhances personalization by seamlessly integrating diverse data sources for more intuitive product interactions. Explore the foundational concepts of Multimodal Embedding and Any-to-Any Search, and learn how to leverage these technologies to build next-generation products. Discover how to seamlessly integrate the Twelve Labs Embed API and Milvus into your projects.
Through live demos, you’ll see how Fashion Product Search is redefined with deeper insights into the architecture, and discover how this approach is revolutionizing user interactions, especially with bots. We’ll also explore real world case studies that demonstrate the ease and power of building multimodal apps."
"Explore the transformative potential of Voice AI in customer interaction analysis powered by LLMs. Learn how Gemini 2.0 enables transcription, summarization, and actionable insight extraction to streamline ticket resolution and enhance customer experiences.
This session delves into the architecture and practical applications of LLM-powered systems, showcasing how they revolutionize customer support workflows through real-world examples and insights"
Accelerate AI Agents with Multimodal RAG powered by Friendli Endpoints and Milvus - Zilliz
AI agents are transforming industries, especially with recent vision-language models like Llama 3.2 Vision that enable AI agents to go beyond text-based understanding by integrating multimodal capabilities. Building such advanced AI agents can feel complex, but FriendliAI simplifies the process by offering end-to-end solutions, from creating your own custom models to deploying them in production. In this webinar, we'll learn about the AI developer workflow from model fine-tuning to inference serving. We'll also work through building a simple AI agent with advanced multimodal RAG capabilities using Friendli Serverless Endpoints and Milvus DB. This session is ideal for those looking to learn more about large language model inference serving, start building AI agents with RAG capabilities, or explore multimodal RAG queries in greater depth.
1 Table = 1000 Words? Foundation Models for Tabular Data - Zilliz
Tables form the backbone of modern data storage, powering everything from relational databases to enterprise systems. Yet despite their ubiquity, we've barely scratched the surface of their potential. While Deep Learning has revolutionized our ability to process text and images, its impact on tabular data has been surprisingly limited. This gap is now being bridged through groundbreaking research in multimodal modeling, particularly with innovations like the TableGPT2 model. In this talk, we'll explore how these new multimodal foundation models are trained to understand tabular data, and demonstrate practical ways to unlock hidden value in your organization's data assets.
How Milvus allows you to run Full Text Search - Zilliz
Milvus 2.5 adds native full text search capabilities, seamlessly combining term-based matching with vector similarity in a single system. This feature automatically handles text-to-vector conversion and real-time BM25 scoring, eliminating the complexity of manual embedding generation and external processing pipelines.
How to Optimize Your Embedding Model Selection and Development through TDA Cl... - Zilliz
About this webinar:
Embedding models are a crucial layer in vector database applications, yet figuring out which embedding model is best for your dataset has been a notoriously difficult task. For many use cases, however, an efficient and intuitive approach is Topological Data Analysis (TDA) on your evaluation dataset. Identifying patterns of weak-performing behavior in your model becomes easy and scalable through a table that reveals the performance of different semantic categories of queries made to your vector database.
Topics covered:
- Risks and limitations of current evaluation approaches for embeddings
- Compare embedding models on your own dataset using Navigable TDA clusters
- ML lifecycle case studies in ecommerce: model selection, fine-tuning, and post-deployment
Milvus: Scaling Vector Data Solutions for Gen AI - Zilliz
Milvus, an LF AI project, is an open-source vector database built to power Gen AI solutions. 80% of the data in the world is unstructured, and vector databases help you get valuable insights from it. With this in mind, we built Milvus as a distributed system on top of other open-source solutions, including MinIO and Kafka, to support vector collections that exceed billion-scale. This session will dive deeply into the architecture decisions that make this cloud-native vector database seamlessly scale horizontally, provide users with tunable consistency, orchestrate in-memory and on-disk indexing, and support scalable search strategies.
Keeping Data Fresh: Mastering Updates in Vector Databases - Zilliz
Managing and extracting value from unstructured data has become a critical challenge as the volume of data continues to grow. This virtual event brings together industry experts to explore the latest techniques in Retrieval Augmented Generation (RAG) and vector databases.
Discover how RAG systems are revolutionizing natural language processing by seamlessly integrating information retrieval techniques, enabling more accurate and contextual language generation. Gain practical insights into building and optimizing these applications.
This session will also cover how vector databases like Milvus play a key role in RAG and in working with unstructured data. Learn proven strategies for maintaining data freshness, accuracy, and efficiency, ensuring your organization stays ahead of the curve.
Milvus 2.5: Full-Text Search, More Powerful Metadata Filtering, and more! (Zilliz)
Milvus 2.5 introduces native full-text search capabilities, seamlessly combining term-based matching with vector similarity in a single system. This feature automatically handles text-to-vector conversion and real-time BM25 scoring, eliminating the complexity of manual embedding generation and external processing pipelines.
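The BM25 scoring mentioned above can be sketched in pure Python to show what the engine computes internally. This is a minimal textbook illustration of the classic Okapi BM25 formula, not Milvus's actual implementation; documents are assumed to be pre-tokenized lists of terms.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query with classic BM25."""
    avgdl = sum(len(d) for d in corpus) / len(corpus)  # average doc length
    N = len(corpus)
    tf = Counter(doc)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)           # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)    # smoothed IDF
        f = tf[term]                                       # term frequency in doc
        score += idf * (f * (k1 + 1)) / (f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [["vector", "database"], ["full", "text", "search"], ["vector", "search"]]
s_hit = bm25_score(["vector"], corpus[0], corpus)   # doc contains the term
s_miss = bm25_score(["vector"], corpus[1], corpus)  # doc does not
```

Documents containing the query term score above zero, while documents without it score exactly zero; in a hybrid system this term-based score is combined with vector similarity.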
AI Agents in Logistics and Supply Chain Applications: Benefits and Implementation (Christine Shepherd)
AI agents are reshaping logistics and supply chain operations by enabling automation, predictive insights, and real-time decision-making across key functions such as demand forecasting, inventory management, procurement, transportation, and warehouse operations. Powered by technologies like machine learning, NLP, computer vision, and robotic process automation, these agents deliver significant benefits including cost reduction, improved efficiency, greater visibility, and enhanced adaptability to market changes. While practical use cases show measurable gains in areas like dynamic routing and real-time inventory tracking, successful implementation requires careful integration with existing systems, quality data, and strategic scaling. Despite challenges such as data integration and change management, AI agents offer a strong competitive edge, with widespread industry adoption expected by 2025.
Jeremy Millul - A Talented Software Developer (Jeremy Millul)
Jeremy Millul is a talented software developer based in NYC, known for leading impactful projects such as a Community Engagement Platform and a Hiking Trail Finder. Using React, MongoDB, and geolocation tools, Jeremy delivers intuitive applications that foster engagement and usability. A graduate of NYU’s Computer Science program, he brings creativity and technical expertise to every project, ensuring seamless user experiences and meaningful results in software development.
Mastering AI Workflows with FME - Peak of Data & AI 2025 (Safe Software)
Harness the full potential of AI with FME: from creating high-quality training data to optimizing models and utilizing results, FME supports every step of your AI workflow. Seamlessly integrate a wide range of models, including those for data enhancement, forecasting, image and object recognition, and large language models. Customize AI models to meet your exact needs with FME's powerful tools for training, optimization, and seamless integration.
Domino IQ: What to Expect, First Steps and Use Cases (panagenda)
Webinar Recording: https://www.panagenda.com/webinars/domino-iq-was-sie-erwartet-erste-schritte-und-anwendungsfalle/
HCL Domino iQ Server: from idea portal to implemented feature. Discover what it is, what it is not, and explore the opportunities and challenges it presents.
Key takeaways
- What Large Language Models (LLMs) are and how they relate to Domino iQ
- Key prerequisites for deploying the Domino iQ server
- A step-by-step guide to setting up your Domino iQ server
- Share and discuss thoughts and ideas to maximize the potential of Domino iQ
Down the Rabbit Hole: Solving 5 Training Roadblocks (Rustici Software)
Feeling stuck in the Matrix of your training technologies? You’re not alone. Managing your training catalog, wrangling LMSs and delivering content across different tools and audiences can feel like dodging digital bullets. At some point, you hit a fork in the road: Keep patching things up as issues pop up… or follow the rabbit hole to the root of the problems.
Good news, we’ve already been down that rabbit hole. Peter Overton and Cameron Gray of Rustici Software are here to share what we found. In this webinar, we’ll break down 5 training roadblocks in delivery and management and show you how they’re easier to fix than you might think.
FME Beyond Data Processing: Creating a Dartboard Accuracy App (Safe Software)
At Nordend, we want to push the boundaries of FME and explore its potential for more creative applications. In our office, we have a dartboard, and while improving our dart-throwing skills was an option, we took a different approach: What if we could use FME to calculate where we should aim to achieve the highest possible score, based on our accuracy?
Using FME's Geometry User parameter, we designed a custom solution. When launching the FME Flow app, the map is now a dartboard. The centre of the map is always fixed on the same area of the world, where we pinned a PNG picture of a dartboard as a basemap through a self-created WMS. This visual setup allowed us to draw polygons, each with three points, where our darts landed, using the Geometry parameter. These polygons get processed through an FME workspace, which translates the coordinates from the map into exact X and Y positions on the dartboard.
With this accurate data, we calculate all sorts of statistics: rolling averages, best scores, and even standard deviations. The results get displayed on a dashboard in FME Flow, giving us insights into how we could maximize our scores, based purely on where we actually tend to throw. Join us for a live demonstration of the app!
The takeaway? FME isn't just a powerful data processing tool; with a bit of imagination, it can be used for far more creative and unconventional applications. This project demonstrates that the only limit to what FME can do is the creativity you bring to it.
Your startup on AWS - How to architect and maintain a Lean and Mean account J... (angelo60207)
Prevent infrastructure costs from becoming a significant line item on your startup’s budget! Serial entrepreneur and software architect Angelo Mandato will share his experience with AWS Activate (startup credits from AWS) and knowledge on how to architect a lean and mean AWS account ideal for budget minded and bootstrapped startups. In this session you will learn how to manage a production ready AWS account capable of scaling as your startup grows for less than $100/month before credits. We will discuss AWS Budgets, Cost Explorer, architect priorities, and the importance of having flexible, optimized Infrastructure as Code. We will wrap everything up discussing opportunities where to save with AWS services such as S3, EC2, Load Balancers, Lambda Functions, RDS, and many others.
Establish Visibility and Manage Risk in the Supply Chain with Anchore SBOM (Anchore)
Over 70% of any given software application is made up of open source software (most likely not even consumed from the original source), yet only 15% of organizations feel confident in their risk management practices.
With the newly announced Anchore SBOM feature, teams can start safely consuming OSS while mitigating security and compliance risks. Learn how to import SBOMs in industry-standard formats (SPDX, CycloneDX, Syft), validate their integrity, and proactively address vulnerabilities within your software ecosystem.
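SBOM formats such as CycloneDX are plain JSON, so basic inspection needs only the standard library. The sketch below is a hypothetical illustration with a hand-written minimal document, not Anchore's import workflow; real SBOMs are generated by tools such as Syft and contain far more fields.

```python
import json

# Minimal hand-written CycloneDX document for illustration only.
sbom_json = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "log4j-core", "version": "2.14.1"},
    {"type": "library", "name": "requests", "version": "2.31.0"}
  ]
}
"""

def list_components(raw):
    """Return (name, version) pairs from a CycloneDX SBOM string."""
    bom = json.loads(raw)
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX document")
    return [(c["name"], c["version"]) for c in bom.get("components", [])]

components = list_components(sbom_json)
```

An inventory like `components` is the starting point for matching package versions against vulnerability databases, which is the core of proactively addressing risk in consumed OSS.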
Interested in leveling up your JavaScript skills? Join us for our Introduction to TypeScript workshop.
Learn how TypeScript can improve your code with dynamic typing, better tooling, and cleaner architecture. Whether you're a beginner or have some experience with JavaScript, this session will give you a solid foundation in TypeScript and how to integrate it into your projects.
Workshop content:
- What is TypeScript?
- What is the problem with JavaScript?
- Why TypeScript is the solution
- Coding demo
DevOps in the Modern Era - Thoughtfully Critical Podcast (Chris Wahl)
https://youtu.be/735hP_01WV0
My journey through the world of DevOps! From the early days of breaking down silos between developers and operations to the current complexities of cloud-native environments. I'll talk about my personal experiences, the challenges we faced, and how the role of a DevOps engineer has evolved.
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S... (Anish Kumar)
Presented by: Anish Kumar
LinkedIn: https://www.linkedin.com/in/anishkumar/
This lightning talk dives into real-world GenAI projects that scaled from prototype to production using Databricks’ fully managed tools. Facing cost and time constraints, we leveraged four key Databricks features—Workflows, Model Serving, Serverless Compute, and Notebooks—to build an AI inference pipeline processing millions of documents (text and audiobooks).
This approach enables rapid experimentation, easy tuning of GenAI prompts and compute settings, seamless data iteration and efficient quality testing—allowing Data Scientists and Engineers to collaborate effectively. Learn how to design modular, parameterized notebooks that run concurrently, manage dependencies and accelerate AI-driven insights.
Whether you're optimizing AI inference, automating complex data workflows or architecting next-gen serverless AI systems, this session delivers actionable strategies to maximize performance while keeping costs low.
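The pattern of running parameterized jobs concurrently can be sketched with the standard library alone. This is a generic illustration of the idea, not the Databricks Workflows API: `run_inference_job` is a hypothetical stand-in for one parameterized notebook run, and a thread pool stands in for the platform's scheduler.

```python
from concurrent.futures import ThreadPoolExecutor

def run_inference_job(params):
    """Hypothetical stand-in for one parameterized notebook run.

    A real job would call a model-serving endpoint with the batch of
    documents and the prompt given in `params`.
    """
    return {"batch": params["batch"], "prompt": params["prompt"], "status": "ok"}

# One parameter set per concurrent run: same logic, different inputs.
param_grid = [{"batch": i, "prompt": "summarize"} for i in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_inference_job, param_grid))
```

Keeping the job body fixed and varying only the parameter dictionaries is what makes the notebooks modular and lets prompt or compute settings be tuned without touching the pipeline code.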
Boosting MySQL with Vector Search - THE VECTOR SEARCH CONFERENCE 2025 (Alkin Tezuysal)
As the demand for vector databases and Generative AI continues to rise, integrating vector storage and search capabilities into traditional databases has become increasingly important. This session introduces the *MyVector Plugin*, a project that brings native vector storage and similarity search to MySQL. Unlike PostgreSQL, which offers interfaces for adding new data types and index methods, MySQL lacks such extensibility. However, by utilizing MySQL's server component plugin and UDF, the *MyVector Plugin* successfully adds a fully functional vector search feature within the existing MySQL + InnoDB infrastructure, eliminating the need for a separate vector database. The session explains the technical aspects of integrating vector support into MySQL, the challenges posed by its architecture, and real-world use cases that showcase the advantages of combining vector search with MySQL's robust features. Attendees will leave with practical insights on how to add vector search capabilities to their MySQL systems.
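Conceptually, the similarity search such a plugin performs is a nearest-neighbor query over stored vectors. The sketch below shows the brute-force version of that computation in pure Python; it is a hypothetical illustration of what the scoring does, not the MyVector Plugin's UDF code, and `rows` stands in for (id, vector) rows in an InnoDB table.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def knn(query, rows, k=2):
    """Brute-force top-k by cosine similarity over (id, vector) rows."""
    scored = sorted(rows, key=lambda r: cosine(query, r[1]), reverse=True)
    return [row_id for row_id, _ in scored[:k]]

rows = [(1, [1.0, 0.0]), (2, [0.0, 1.0]), (3, [0.9, 0.1])]
top = knn([1.0, 0.0], rows, k=2)
```

A production plugin replaces the linear scan with an approximate index, but the ranking contract is the same: return the ids whose vectors are most similar to the query.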
In this talk, Elliott explores how developers can embrace AI not as a threat, but as a collaborative partner.
We’ll examine the shift from routine coding to creative leadership, highlighting the new developer superpowers of vision, integration, and innovation.
We'll touch on security, legacy code, and the future of democratized development.
Whether you're AI-curious or already a prompt engineer, this session will help you find your rhythm in the new dance of modern development.
Improving Developer Productivity With DORA, SPACE, and DevEx (Justin Reock)
Ready to measure and improve developer productivity in your organization?
Join Justin Reock, Deputy CTO at DX, for an interactive session where you'll learn actionable strategies to measure and increase engineering performance.
Leave this session equipped with a comprehensive understanding of developer productivity and a roadmap to create a high-performing engineering team in your company.
3. 27K+ GitHub Stars | 25M+ Downloads | 250+ Contributors | 2,600+ Forks
Milvus is an open-source vector database for GenAI projects. pip install on your laptop, plug into popular AI dev tools, and push to production with a single line of code.
● Easy Setup: pip-install to start coding in a notebook within seconds.
● Reusable Code: write once, and deploy with one line of code into the production environment.
● Integration: plug into OpenAI, LangChain, LlamaIndex, and many more.
● Feature-rich: dense & sparse embeddings, filtering, reranking and beyond.
10. 🦜🔗 LangChain
● Framework for building LLM Applications
● Focus on retrieving data and integrating with LLMs
● Integrations with most popular AI tools
11. 🦜🕸 LangGraph by LangChain
● Build stateful apps with LLMs and multi-agent workflows
● Cycles and Branching
● Human-in-the-Loop
● Persistence
16. General Ideas
● Routing: Adaptive RAG
○ Route questions to different retrieval approaches
● Fallback: Corrective RAG
○ Fall back to web search if docs are not relevant to the query
● Self-Correction: Self-RAG
○ Try to fix answers that contain hallucinations or don't address the question
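The three ideas above (routing, fallback, and self-correction) can be sketched as one control loop in plain Python. This is a hypothetical skeleton, not the LangGraph implementation: the `retrieve`, `web_search`, `generate`, and `grade` callables are assumed stand-ins for graph nodes, and a single retriever stands in for full adaptive routing.

```python
def agentic_rag(question, retrieve, web_search, generate, grade):
    """Skeleton of an agentic RAG loop: route, fall back, self-correct."""
    # Routing (Adaptive RAG): pick a retrieval approach for the question.
    docs = retrieve(question)
    # Fallback (Corrective RAG): use web search when no relevant docs exist.
    if not docs:
        docs = web_search(question)
    # Self-correction (Self-RAG): regenerate if the answer is not grounded.
    answer = generate(question, docs)
    if not grade(answer, docs):
        # A real system would revise the prompt or re-retrieve here.
        answer = generate(question, docs)
    return answer

# Stub components for illustration.
docs_db = {"milvus": ["Milvus is a vector database."]}
retrieve = lambda q: docs_db.get(q, [])
web_search = lambda q: [f"web result for {q}"]
generate = lambda q, d: f"answer from {d[0]}"
grade = lambda a, d: True

local = agentic_rag("milvus", retrieve, web_search, generate, grade)
fallback = agentic_rag("unknown", retrieve, web_search, generate, grade)
```

In LangGraph these branches become conditional edges in a stateful graph; the skeleton only shows the decision points each edge encodes.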
19. Milvus Architecture
[Architecture diagram: an access layer (SDK, load balancer, proxies) issues DDL/DCL and DML requests to the coordinator service (root, query, data, and index coordinators) backed by etcd meta storage; a log broker (message storage) carries data, notifications, and control signals to worker nodes (query, data, and index nodes); object storage (MinIO / S3 / Azure Blob) holds log snapshots, delta files, and index files.]