Google Search Tips
Deploying LLM applications with Apache Airflow
Julian LaNeve
Senior Product Manager @ Astronomer
Kaxil Naik
Apache Airflow Committer & PMC Member
Director of Eng @ Astronomer
Agenda
Less Data
Going from “Idea to Production” with LLM Apps involves
solving a lot of data engineering problems:
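[Diagram: URLs, PDFs, and a Database are loaded and split into Splits, which are stored in a Vectorstore. A Query (<Question>) retrieves the Relevant Splits, which go into a Prompt for the LLM to produce the <Answer>.]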
Source: https://python.langchain.com/docs/use_cases/question_answering/
Airflow is a Natural Fit…
■ Airflow gives a framework to load data from APIs & other sources into LangChain
■ LangChain helps pre-process and split documents into smaller chunks depending on content type
■ After content is split into chunks, each chunk is embedded into vectors (semantic representations)
■ Those vectors are written to Weaviate for later retrieval
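The split, embed, and store steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Ask Astro code: `split_into_chunks`, `toy_embed`, and `ingest` are hypothetical names, the hashed bag-of-words vector stands in for a real embedding model, and a Python list stands in for Weaviate.

```python
import hashlib
import math


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def ingest(doc_id: str, text: str, store: list[dict]) -> int:
    """Chunk a document, embed each chunk, and append records to `store`.

    `store` is a plain list standing in for a vector database.
    """
    chunks = split_into_chunks(text)
    for n, chunk in enumerate(chunks):
        store.append(
            {"doc_id": doc_id, "chunk": n, "text": chunk, "vector": toy_embed(chunk)}
        )
    return len(chunks)
```

In an Airflow deployment each of these functions would typically become its own task, so a failed embedding call can be retried without reloading the source data.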
Prompt Orchestration and Answering
[Diagram: Web App / Slack Bot → 🦜🔗 LangChain: reword to get more related documents (Rewording 1 to Rewording 3) → Vector DB search with prompts → combine docs and make final LLM call to answer.]
Users can interact with UI or Slack Bot; they both use the same API
■ Original prompt gets reworded 3x using gpt-3.5-turbo
■ Answer is generated by combining docs from each prompt and making a gpt-4 call
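The reword, search, and combine flow above can be sketched as a single function. This is an illustration only: `answer_question` and the `fake_*` helpers are hypothetical, the callables stand in for the gpt-3.5-turbo rewording call, the vector DB search, and the final gpt-4 call, and searching the original prompt alongside the rewordings is an assumption.

```python
from typing import Callable


def answer_question(
    question: str,
    reword: Callable[[str], list[str]],
    search: Callable[[str], list[str]],
    generate: Callable[[str, list[str]], str],
) -> str:
    """Reword the question, search with every prompt, then answer once."""
    # Assumption: the original question is searched along with its rewordings.
    prompts = [question] + reword(question)
    docs: list[str] = []
    seen: set[str] = set()
    for prompt in prompts:
        for doc in search(prompt):
            if doc not in seen:  # keep each document once, in first-seen order
                seen.add(doc)
                docs.append(doc)
    return generate(question, docs)


# Stand-ins so the sketch runs without any API keys.
def fake_reword(question: str) -> list[str]:
    return [f"{question} (variant {i})" for i in range(1, 4)]


def fake_search(prompt: str) -> list[str]:
    return ["doc-a", "doc-b"] if "variant 1" in prompt else ["doc-b", "doc-c"]


def fake_generate(question: str, docs: list[str]) -> str:
    return f"{len(docs)} sources: " + ", ".join(docs)
```

Passing the three steps in as callables keeps the orchestration testable without network calls; deduplicating before the final call avoids sending the same document to the model several times.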
On schedule
🔗 homepage
When a user submits feedback, it gets stored in Firestore and LangSmith for later use
■ Airflow DAGs process feedback async to evaluate answers on helpfulness, relevance, and publicness
■ If answer is good, it gets stored in Weaviate and can be used as a source for future questions
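A batch job for this feedback loop might look like the sketch below. The slides do not say how the three scores are produced or what counts as "good", so the `Feedback` record, the 0 to 1 score range, and the 0.8 threshold are all assumptions, and a Python list stands in for Weaviate.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """One answered question with evaluation scores (assumed to be in [0, 1])."""
    question: str
    answer: str
    helpfulness: float
    relevance: float
    publicness: float


def is_good(fb: Feedback, threshold: float = 0.8) -> bool:
    """An answer is promoted only if every score clears the threshold."""
    return min(fb.helpfulness, fb.relevance, fb.publicness) >= threshold


def promote_good_answers(
    batch: list[Feedback], vector_store: list[dict], threshold: float = 0.8
) -> int:
    """Append good question/answer pairs to `vector_store`; return how many."""
    promoted = 0
    for fb in batch:
        if is_good(fb, threshold):
            vector_store.append(
                {"text": f"Q: {fb.question}\nA: {fb.answer}", "source": "feedback"}
            )
            promoted += 1
    return promoted
```

Taking the minimum of the three scores means a helpful answer that leaks private detail (low publicness) is never reused as a source.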
github.com/astronomer/ask-astro
a16z’s Emerging LLM App Stack
[Diagram: Contextual data → Data Pipelines (Databricks, Airflow, Unstructured, etc.) → Embedding Model (OpenAI, Cohere, Hugging Face) → Vector Database (Pinecone, Weaviate, Chroma, pgvector)]
Legend: Gray boxes show key components of the stack, with leading tools / systems listed. Arrows show the flow of data through the stack.
Data Governance
■ How do you account for private data?
■ How do you provide transparency into data lineage?
Fine Tuning
■ Does it improve results?
■ How much does it cost?
Feedback Loops
■ Semantic cache for correct responses
■ Scoring sources on accuracy and ranking them accordingly
■ Prompt clustering: what are people asking?
Airflow is foundational to best practices for all of this.
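The "semantic cache for correct responses" idea can be illustrated with a small in-memory cache keyed by question embeddings. This is a sketch, not the speakers' implementation: `SemanticCache`, the cosine-similarity lookup, and the 0.95 threshold are assumptions, and the vectors would come from whatever embedding model the app already uses.

```python
import math
from typing import Optional


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Return a stored answer when a new question is close enough to an old one."""

    def __init__(self, threshold: float = 0.95) -> None:
        self.threshold = threshold
        self._entries: list[tuple[list[float], str]] = []

    def put(self, question_vector: list[float], answer: str) -> None:
        self._entries.append((question_vector, answer))

    def get(self, question_vector: list[float]) -> Optional[str]:
        best_score, best_answer = 0.0, None
        for vector, answer in self._entries:
            score = cosine(question_vector, vector)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None
```

A linear scan is fine for a sketch; at scale the same lookup would be a nearest-neighbor query against the vector database.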
Thanks to the AskAstro Team:
Providers
Interfaces
Patterns and Use Cases
What are all the providers the ecosystem needs?
pgvector
What’s the interface that feels right for LLMOps?
Patterns
What are the best practices for building pipelines for
■ Can you use dynamic task mapping to break it out?
■ Do you write to disk?
■ Can you store embedding values in XComs?
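One possible answer to the disk versus XCom question, offered as a sketch rather than the speakers' recommendation: by default Airflow keeps XCom values in its metadata database and intends them for small pieces of data, so an upstream step can write the vectors to a file and pass only the path downstream. The two functions below are plain Python stand-ins for Airflow tasks; the names and the toy two-number vectors are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def embed_and_save(chunks: list[str], out_dir: Optional[str] = None) -> str:
    """Stand-in for an upstream task: write vectors to disk, return the path.

    Only the returned string would travel between tasks, not the vectors.
    """
    # Toy vectors (chunk length, space count) in place of real embeddings.
    vectors = [[float(len(c)), float(c.count(" "))] for c in chunks]
    out_path = Path(out_dir or tempfile.mkdtemp()) / "embeddings.json"
    out_path.write_text(json.dumps(vectors))
    return str(out_path)


def load_and_count(path: str) -> int:
    """Stand-in for a downstream task: read the vectors back from the path."""
    vectors = json.loads(Path(path).read_text())
    return len(vectors)
```

Local disk only works when both tasks run on the same worker; with distributed executors the same pattern would point at shared or object storage instead.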