This document discusses using Apache Airflow to build and deploy applications powered by large language models (LLMs). It outlines how Airflow helps with ingesting data from various sources, running data pipelines on a schedule or ad hoc, handling retries and dependencies, and monitoring models at scale. The document then presents a real use case: an LLM answers questions by drawing on documentation that has been ingested through an embedding model into a vector database queried by prompts. Airflow is well suited to orchestrating the data pipelines and feedback loops involved in such an application.


Building and deploying LLM applications with Apache Airflow

Julian LaNeve, Senior Product Manager @ Astronomer
Kaxil Naik, Apache Airflow Committer & PMC Member; Director of Eng @ Astronomer
Agenda

■ Why Airflow should be at the centre of LLMOps
■ Real use case & reference architecture
■ Next steps: community collaboration


Generative AI: A Creative New World

A powerful new class of large language models is making it possible for machines to write, code, draw, and create with credible and sometimes superhuman results.
Normally, for ML, you need to:

Ingest Data → Train Model → Prediction

…but now you can hit a pre-trained model instead of training your own:

Ingest (Less) Data → Pre-trained Model → Prediction
Going from “Idea to Production” with LLM apps involves solving a lot of data engineering problems:

■ Ingestion from several sources
■ Day 2 operations on data pipelines
■ Data preparation
■ Data privacy
■ Data freshness
■ Model deployment & monitoring
■ Scaling models
■ Experimentation & fine-tuning
■ Feedback loops
Typical Architecture for a Q&A Use Case Using an LLM

Document Loading → Splitting → Storage → Retrieval → Output

■ Document Loading: ingest from URLs, PDFs, databases, and legacy data stores
■ Splitting: break documents into smaller splits
■ Storage: embed the splits and store them in a vectorstore
■ Retrieval: find the splits relevant to a user’s <Question> via a vector query
■ Output: combine the relevant splits into a prompt and call the LLM to produce the <Answer>

Source: https://python.langchain.com/docs/use_cases/question_answering/
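To make the flow concrete, here is a minimal sketch of the loading → splitting → storage → retrieval → output pipeline in LangChain. It assumes the langchain, openai, beautifulsoup4, and faiss-cpu packages and an OPENAI_API_KEY in the environment; the URL and the in-memory FAISS store are stand-ins, and any other loader or vectorstore would slot in the same way:

```python
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import WebBaseLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

# Loading: pull raw documents (a single URL as a stand-in for URLs/PDFs/databases).
docs = WebBaseLoader("https://airflow.apache.org/docs/").load()

# Splitting: break documents into overlapping chunks ("splits").
splits = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

# Storage: embed the splits and index them in a vectorstore.
vectorstore = FAISS.from_documents(splits, OpenAIEmbeddings())

# Retrieval + Output: fetch the splits relevant to a question and let the LLM answer.
qa = RetrievalQA.from_chain_type(llm=ChatOpenAI(), retriever=vectorstore.as_retriever())
print(qa.run("How do I define a DAG?"))
```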
Airflow is a Natural Fit…

■ Python Native: the language of data scientists and ML engineers.
■ Common Interface: between Data Engineering, Data Science, ML Engineering, and Operations.
■ Document Parsing: decorator and pythonic interfaces for standard LLM tools.
■ Monitoring & Alerting: built-in features for logging, monitoring, and alerting to external systems.
■ Extensible: standardize custom operators and templates for common DS tasks across the organization.
■ Ingestion: extract and load data into vector databases and other destinations.
■ Pluggable Compute: GPUs, Kubernetes, EC2, VMs, etc.
■ Data Agnostic: but data aware.
■ Day 2 Ops: handle retries, dependencies, and all other day 2 ops associated with data pipelines.
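As a small illustration of the Day 2 features called out above, here is a hedged sketch of the retry and alerting knobs available on any task (the task body is a placeholder, and email alerting assumes SMTP is configured for the deployment):

```python
from datetime import timedelta

from airflow.decorators import task


@task(
    retries=3,                         # re-run automatically on flaky libraries/APIs
    retry_delay=timedelta(minutes=5),  # back off between attempts
    email_on_failure=True,             # alert via Airflow's email integration
)
def call_external_service():
    # Placeholder for a call to an LLM API or other external system.
    ...
```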
Let’s Talk About a Real Use Case

Problem Statement:

We have customers, employees, and community members who ask questions about our product, with answers that exist across several sources of documentation.

How do we provide an easy interface for folks to get their questions answered without adding further strain to the team?
Data Ingestion, Processing, and Embedding

Sources: GitHub issues, Docs (.md files), and Slack messages
🦜🔗 LangChain: pre-process and split into chunks → embed chunks → write to Weaviate

■ Airflow gives a framework to load data from APIs & other sources into LangChain
■ LangChain helps pre-process and split documents into smaller chunks depending on content type
■ After content is split into chunks, each chunk is embedded into vectors (semantic representations)
■ Those vectors are written to Weaviate for later retrieval
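A minimal sketch of this ingestion pipeline as an Airflow DAG, assuming the apache-airflow, langchain, openai, and weaviate-client (v3) packages; the document contents, the local Weaviate URL, and the "DocChunk" class name are hypothetical stand-ins for Ask Astro’s real sources and schema:

```python
from datetime import datetime

import weaviate
from airflow.decorators import dag, task
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter


@dag(schedule="@daily", start_date=datetime(2023, 10, 1), catchup=False)
def ingest_docs():
    @task
    def extract() -> list[str]:
        # Stand-in for loading GitHub issues, .md docs, and Slack messages.
        return ["# Example doc\nAirflow is a platform for orchestrating pipelines."]

    @task
    def split(docs: list[str]) -> list[str]:
        # LangChain pre-processes and splits each document into overlapping chunks.
        splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
        return [chunk for doc in docs for chunk in splitter.split_text(doc)]

    @task
    def embed_and_write(chunks: list[str]) -> None:
        # Embed each chunk into a vector, then write text + vector to Weaviate.
        vectors = OpenAIEmbeddings().embed_documents(chunks)
        client = weaviate.Client("http://localhost:8080")  # assumed local instance
        with client.batch as batch:
            for text, vector in zip(chunks, vectors):
                batch.add_data_object(
                    {"content": text}, class_name="DocChunk", vector=vector
                )

    embed_and_write(split(extract()))


ingest_docs()
```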
Prompt Orchestration and Answering

A user asks a question through the web app or the Slack bot; both use the same API. The original prompt is reworded three ways (🦜🔗 LangChain) to get more related documents, the vector DB is searched with each prompt, and the retrieved docs are combined for a final LLM call that produces the answer.

■ Original prompt gets reworded 3x using gpt-3.5-turbo
■ Answer is generated by combining docs from each prompt and making a gpt-4 call
■ State is stored in Firestore and prompt tracing is done through LangSmith
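A hedged sketch of that answering flow, reusing the Weaviate instance and "DocChunk" class assumed above (near-text search additionally assumes a vectorizer module is enabled on the Weaviate side); the rewording and answering prompts are illustrative, not Ask Astro’s actual prompts:

```python
import weaviate
from langchain.chat_models import ChatOpenAI

fast_llm = ChatOpenAI(model_name="gpt-3.5-turbo")  # used for the 3 rewordings
strong_llm = ChatOpenAI(model_name="gpt-4")        # used for the final answer
client = weaviate.Client("http://localhost:8080")  # assumed local instance


def answer(question: str) -> str:
    # 1. Reword the original prompt 3x to widen retrieval coverage.
    rewordings = [
        fast_llm.predict(f"Rephrase this question (variant {i + 1}): {question}")
        for i in range(3)
    ]
    # 2. Search the vector DB with the original question and each rewording.
    docs: list[str] = []
    for query in [question, *rewordings]:
        result = (
            client.query.get("DocChunk", ["content"])
            .with_near_text({"concepts": [query]})
            .with_limit(4)
            .do()
        )
        docs += [d["content"] for d in result["data"]["Get"]["DocChunk"]]
    # 3. Combine the retrieved docs and make one gpt-4 call for the final answer.
    context = "\n\n".join(dict.fromkeys(docs))  # de-duplicate while keeping order
    return strong_llm.predict(
        f"Answer the question using only this context:\n{context}\n\nQ: {question}"
    )
```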
LLM & Product Feedback Loops

■ When a user rates an answer, the feedback gets stored in Firestore and LangSmith for later use
■ On a schedule, Airflow DAGs fetch new runs (input, output, and user feedback) and process them asynchronously, using 🦜🔗 LangChain to classify each Q&A according to helpfulness, relevance, and publicness
■ If an answer is good, it gets written to the vector DB (Weaviate) and can be used as a source for future answers
■ Good answers are also marked to show on the Ask Astro homepage, where the UI displays the most recent good prompts
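A sketch of what the scheduled feedback DAG can look like, assuming the langsmith and langchain packages; the "ask-astro" LangSmith project name and the YES/NO grading prompt are assumptions, and the final write-back of good answers to Weaviate is elided:

```python
from datetime import datetime

from airflow.decorators import dag, task
from langchain.chat_models import ChatOpenAI
from langsmith import Client


@dag(schedule="@hourly", start_date=datetime(2023, 10, 1), catchup=False)
def feedback_loop():
    @task
    def fetch_new_runs() -> list[dict]:
        # Pull recently traced runs (input and output) from LangSmith.
        client = Client()
        return [
            {"question": run.inputs.get("question"), "answer": run.outputs.get("answer")}
            for run in client.list_runs(project_name="ask-astro")  # assumed project
            if run.inputs and run.outputs
        ]

    @task
    def classify(runs: list[dict]) -> list[dict]:
        # Grade each Q&A on helpfulness, relevance, and publicness with an LLM.
        llm = ChatOpenAI(model_name="gpt-3.5-turbo")
        good = []
        for run in runs:
            verdict = llm.predict(
                "Is this answer helpful, relevant, and safe to show publicly? "
                f"Reply YES or NO.\nQ: {run['question']}\nA: {run['answer']}"
            )
            if verdict.strip().upper().startswith("YES"):
                good.append(run)
        return good  # good answers would then be written back to Weaviate

    classify(fetch_new_runs())


feedback_loop()
```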


Running this in production meant:

■ Experimenting with different sources of data to ingest
■ Running the pipelines on a schedule and ad-hoc
■ Running the same workloads with variable chunking strategies
■ Needing to retry tasks due to finicky Python libraries and unreliable external services
■ Giving different parts of the workload variable compute
■ Creating standard interfaces to interact with external systems
…all of which is what Airflow’s great at!
ask.astronomer.io

github.com/astronomer/ask-astro
a16z’s Emerging LLM App Stack

Gray boxes show key components of the stack, with leading tools / systems listed. Arrows show the flow of data through the stack: contextual data provided by app developers to condition LLM outputs; prompts and few-shot examples that are sent to the LLM; queries submitted by users; and output returned to users.

■ Data Pipelines: Databricks, Airflow, Unstructured, etc.
■ Embedding Model: OpenAI, Cohere, Hugging Face
■ Vector Database: Pinecone, Weaviate, Chroma, pgvector
■ Playground: OpenAI, nat.dev, Humanloop
■ Orchestration: Python/DIY, LangChain, LlamaIndex, ChatGPT
■ APIs/Plugins: Serp, Wolfram, Zapier, etc.
■ LLM Cache: Redis, SQLite, GPTCache
■ Logging/LLMOps: Weights & Biases, MLflow, PromptLayer, Helicone
■ Validation: Guardrails, Rebuff, Guidance, LMQL
■ LLM APIs and Hosting: Proprietary API (OpenAI, Anthropic); Open API (Hugging Face, Replicate); Cloud Provider (AWS, GCP, Azure, Coreweave); Opinionated Cloud (Databricks, Anyscale, Mosaic, Modal, Runpod)
■ App Hosting: Vercel, Steamship, Streamlit, Modal
AskAstro has a few parts of this… (the same stack diagram as above)
…but there’s even more to consider. Airflow is foundational to best practices for all of this.

Data Governance
■ How do you account for private data?
■ How do you provide transparency into data lineage?

Fine-Tuning
■ Does it improve results?
■ How much does it cost?

Feedback Loops
■ Semantic cache for correct responses
■ Ranking sources by accuracy and weighting them accordingly
■ Prompt clustering: what are people asking?
Thanks to the AskAstro team: Philippe Gagnon and Michael Gregory
Community Collaboration: Providers, Interfaces, and Patterns & Use Cases

Providers
■ What are all the providers the ecosystem needs? (e.g. pgvector)

Interfaces
■ What’s the interface that feels right for LLMOps?
Patterns & Use Cases
What are the best practices for building pipelines for LLM apps?

■ Do you use one task to ingest and write?
■ Can you use dynamic task mapping to break it out? (see the sketch below)
■ Do you write to disk?
■ Can you store embedding values in XComs?
■ How do you reconcile Airflow orchestration with prompt orchestration?
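On the dynamic task mapping question, a small illustrative sketch: a list of sources computed at runtime fans out into one mapped task instance per source, each with its own retries, logs, and (optionally) compute. The source names are hypothetical:

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule=None, start_date=datetime(2023, 10, 1), catchup=False)
def mapped_ingest():
    @task
    def list_sources() -> list[str]:
        # Could just as easily be read from a config or an external API.
        return ["github_issues", "docs_md", "slack_messages"]

    @task
    def ingest(source: str) -> str:
        # Each source runs as its own task instance rather than one big task.
        print(f"Ingesting {source}")
        return source

    # .expand() maps ingest() over whatever list_sources() returns at runtime.
    ingest.expand(source=list_sources())


mapped_ingest()
```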
Let’s do this all in the open source!
