Fine-tuning Large Language Models with Declarative ML Orchestration
Shivay Lamba
Developer Relations Engineer, Couchbase
@howdevelop
Outline
● 🤔 Why Fine-tune LLMs?
● 🔧 How to Fine-tune LLMs
● 🔀 Why declarative ML orchestration?
● 🚀 Fine-tuning LLMs with Flyte
Flyte
Why Fine-tune LLMs?
[Diagram: Input / Prompt → LLM → Output / Response]
Prompt Engineering
Holding the model weights fixed and updating the input prompt to obtain the desired output.
[Diagram: Input / Prompt → LLM 🔧 → Output / Response]
Fine-tuning
Updating model weights using a specific data distribution to obtain the desired behavior from the model.
[Flowchart: when to fine-tune. Decision nodes, each with yes/no branches: 👀 data privacy? · get desired output? · tried really hard? · ML skills? · get desired output? Action nodes: ✍ prompt engineering · 🚀 ship it! · 🔧 fine-tuning · ☁ low-code fine-tuning · 💻 “high”-code fine-tuning · ⏱ wait for R&D to try again… · downstream application…]
How to Fine-tune LLMs
Types of Fine-tuning
📖 Continued Pre-training (CPT)
● Choose a pretrained model
● Create “pile of tokens” dataset (minimum quality: low)
● Pick an optimization method: mixed precision, ZeRO
● Result: a text completer
👉 Supervised Fine-Tuning (SFT)
● Choose a pretrained model
● Create “prompt-response” dataset (minimum quality: high)
● Pick an optimization method: mixed precision, ZeRO, PEFT
● Result: a prompt responder
🔄 RL from Human Feedback (RLHF)
● Choose an SFT model
● Create “prompt-multiresponse” dataset with human preferences (minimum quality: high)
● Train a reward model (RM), initialized from SFT
● Train a policy on the reward model, initialized from SFT, with a loss based on the RM
● Result: a prompt responder
Focus of this talk
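As a rough illustration of how the three dataset types differ, the sketch below shows one record of each shape. The class names, field names (`text`, `prompt`, `response`, `chosen`, `rejected`) and the prompt template are assumptions made for this example, not a format prescribed by the talk; the preference record is simplified to two responses.

```python
from dataclasses import dataclass


@dataclass
class CPTRecord:
    """Continued pre-training: raw text with no prompt/response structure."""
    text: str


@dataclass
class SFTRecord:
    """Supervised fine-tuning: one prompt paired with one good response."""
    prompt: str
    response: str


@dataclass
class PreferenceRecord:
    """RLHF: one prompt, competing responses, and a human preference."""
    prompt: str
    chosen: str
    rejected: str


def to_training_text(record: SFTRecord) -> str:
    """Join prompt and response into one string for next-token training.

    The template is hypothetical; real projects pick their own.
    """
    return f"### Prompt:\n{record.prompt}\n\n### Response:\n{record.response}"


cpt = CPTRecord(text="Flyte is a workflow orchestrator ...")
sft = SFTRecord(prompt="What is Flyte?", response="A workflow orchestrator.")
pref = PreferenceRecord(
    prompt="What is Flyte?",
    chosen="A workflow orchestrator for data and ML.",
    rejected="A kind of bird.",
)
print(to_training_text(sft))
```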
Methods to train large models
● Gradient Accumulation (Source: https://aman.ai/primers/ai/grad-accum-checkpoint/)
● Model Parallelism (Source: https://xiandong79.github.io/Intro-Distributed-Deep-Learning)
Methods to train models faster
● Optimizers (Source: https://www.fast.ai/posts/2018-07-02-adam-weight-decay.html)
● Data Parallelism (Source: https://xiandong79.github.io/Intro-Distributed-Deep-Learning)
● Schedulers (Source: https://github.com/sgugger/Deep-Learning/blob/master/Cyclical%20LR%20and%20momentums.ipynb)
Zero Redundancy Optimizer (ZeRO)
ZeRO combines ideas from data and model parallelism: it shards model states
(optimizer states, gradients, and, at its highest stage, the model weights themselves)
across the workers of a distributed system, and runs the forward and backward
passes layer by layer, gathering each layer's weights only when they are needed.
Source: https://www.microsoft.com/en-us/research/blog/zero-deepspeed-new-system-optimizations-enable-training-models-with-over-100-billion-parameters/
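To make the memory savings from sharding concrete, here is a back-of-the-envelope sketch. It assumes mixed-precision training with Adam, where each parameter costs roughly 2 bytes of fp16 weights, 2 bytes of fp16 gradients, and 12 bytes of fp32 optimizer state. The function name and the 7.5B-parameter, 64-GPU example are chosen for illustration, and activations and temporary buffers are ignored.

```python
def zero_memory_gb(num_params: float, num_gpus: int, stage: int) -> float:
    """Approximate per-GPU memory (GB) for model states under ZeRO.

    Assumes mixed-precision Adam: 2 bytes of weights + 2 bytes of gradients
    + 12 bytes of optimizer state per parameter. Stage 0 means no sharding.
    """
    weights, grads, optim = 2.0, 2.0, 12.0  # bytes per parameter
    if stage >= 1:  # stage 1 shards the optimizer states
        optim /= num_gpus
    if stage >= 2:  # stage 2 also shards the gradients
        grads /= num_gpus
    if stage >= 3:  # stage 3 also shards the weights
        weights /= num_gpus
    return num_params * (weights + grads + optim) / 1e9


for stage in range(4):
    print(f"stage {stage}: {zero_memory_gb(7.5e9, 64, stage):.1f} GB per GPU")
```

The point of the arithmetic is that the optimizer state dominates, so even the first stage already removes most of the per-GPU footprint.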
Intro to 8-bit Quantization
Quantization reduces memory requirements but trades off precision.
You can fit larger models into your GPUs, but it can lead to training instability.
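The memory-versus-precision trade-off can be seen in a few lines. The sketch below uses simple absmax (symmetric) quantization with a single shared scale, which is an assumption made for illustration; libraries such as bitsandbytes use more elaborate schemes. The function names are hypothetical.

```python
def quantize_absmax(values):
    """Map floats to int8 codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [round(v / scale) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the int8 codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0, 0.003]
codes, scale = quantize_absmax(weights)
restored = dequantize(codes, scale)
errors = [abs(w - r) for w, r in zip(weights, restored)]

print(codes)        # each code fits in one byte instead of four
print(max(errors))  # the rounding error is the precision that was traded away
```

Note how the small weight 0.003 collapses to the code 0: values much smaller than the largest weight lose most of their information, which is one reason quantized training can become unstable.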
Low-Rank Adaptation (LoRA)
LoRA (Low-Rank Adaptation) is a technique for fine-tuning large models, such as
LLMs or Stable Diffusion, without retraining all of their weights, saving both
time and computational resources.
The key idea is to freeze the pretrained weights and learn a small low-rank
update for selected weight matrices: instead of changing a weight matrix W
directly, LoRA trains two much smaller matrices A and B whose product BA is
added to W. Only A and B are updated during training.
This contrasts with full fine-tuning, where all of the model's weights are
updated, requiring substantial computational power and time.
Source: https://sebastianraschka.com/blog/2023/llm-finetuning-lora.html
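A minimal sketch of the low-rank idea, in plain Python so that it runs without any ML library. Here W is a frozen d x k weight matrix, and A (r x k) and B (d x r) are the small trainable matrices; the helper names, the tiny dimensions, and the `alpha` scaling value are assumptions for illustration only.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_param_counts(d, k, r):
    """Return (full fine-tuning params, LoRA params) for one d x k matrix."""
    return d * k, r * (d + k)


def lora_forward(W, A, B, x, alpha, r):
    """Compute h = W x + (alpha / r) * B (A x); W itself is never modified."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + (alpha / r) * u for b, u in zip(base, update)]


d, k, r, alpha = 4, 3, 2, 4
rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(d)]  # frozen, d x k
A = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(r)]  # trainable, r x k
B = [[0.0] * r for _ in range(d)]                            # trainable, d x r, starts at zero
x = [1.0, 2.0, 3.0]

# With B initialised to zero, the adapted model starts out identical to the base model.
print(lora_forward(W, A, B, x, alpha, r) == matvec(W, x))

# Trainable parameters for a 4096 x 4096 matrix: full fine-tuning vs. rank-8 LoRA.
print(lora_param_counts(4096, 4096, 8))
```

For a 4096 x 4096 matrix at rank 8, the adapter trains well under 1% of the parameters that full fine-tuning would touch, which is where the time and memory savings come from.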
Why Declarative ML Orchestration?
Orchestrators coordinate the logical flow of computations needed
to get data from its raw state 🧱 into a desired state 🏠
Orchestrators help you reason about:
📦 The units of computation you need for your workload
🔀 How data flows between those units
✅ What the types and state of your data are at any given point
🐳 What dependencies each unit relies on to do its computation
🌳 What resources each unit has available to it
Flyte is a production-grade orchestrator
that unifies data, ML, and analytics stacks.
[Diagram: 💻 Create Tasks and Workflows → Package & Register (with ⚙ Config) → Compiled Workflow on the Flyte Cluster → Workflow Execution in a 🐳 Container inside a K8s Pod on the Kubernetes Cluster]
Tasks
The smallest unit of work in Flyte.
● 🐳 Containerized
● Strongly Typed
● Versioned
[Diagram: inputs → Task → outputs]
Workflows
Compositions of Tasks to achieve complex computations.
● Strongly Typed
● Data Flow is a 1st Class Citizen
● Versioned
[Diagram: a Workflow with its own inputs and outputs, wrapping several Tasks, where the outputs of one Task feed the inputs of the next]
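A minimal sketch of how tasks compose into a workflow, using flytekit's `task` and `workflow` decorators. The task names and logic are made up for illustration. The `except ImportError` branch is a stand-in added for this example so the snippet still runs as plain Python when flytekit is not installed; it is not part of Flyte.

```python
try:
    from flytekit import task, workflow
except ImportError:
    # Stand-in decorators so the sketch runs without flytekit installed.
    def task(fn):
        return fn

    def workflow(fn):
        return fn


@task
def add_one(x: int) -> int:
    """A task: a typed unit of work."""
    return x + 1


@task
def double(x: int) -> int:
    return x * 2


@workflow
def pipeline(x: int) -> int:
    # Tasks are called with keyword arguments; the output of one feeds the next.
    return double(x=add_one(x=x))


print(pipeline(x=3))
```

Running the workflow function locally like this is handy for quick iteration before the tasks are packaged and registered on a cluster.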
Projects and Domains
Logical groupings of tasks and
workflows for built-in
multi-tenancy and isolation.
Domains: Development, Staging, Production
Projects: Data ETL, Classification Models, Forecasting Models
Type Safety
Get errors about your execution
graph at compile-time, even before
executing your code
[Diagram: Input: int → Create Dataset (Input: int, Output: DataFrame) → Train Model (Input: List[Dict], Output: Model) → Output: Model. The DataFrame output feeding the List[Dict] input is flagged as Incompatible Types.]
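The idea of catching a type mismatch before any task runs can be illustrated with the standard library alone. This is a simplified sketch of the concept, not how Flyte implements its checks: `check_chain`, the placeholder `DataFrame` and `Model` classes, and the exact-equality comparison of annotations are all assumptions made for the example.

```python
from typing import Dict, List, get_type_hints


class DataFrame:
    """Placeholder type for the sketch."""


class Model:
    """Placeholder type for the sketch."""


def create_dataset(n: int) -> DataFrame:
    return DataFrame()


def train_model(data: List[Dict]) -> Model:
    return Model()


def check_chain(*steps):
    """Raise TypeError if a step's return type differs from the next step's input type."""
    for producer, consumer in zip(steps, steps[1:]):
        produced = get_type_hints(producer)["return"]
        hints = get_type_hints(consumer)
        expected = next(t for name, t in hints.items() if name != "return")
        if produced != expected:
            raise TypeError(
                f"{producer.__name__} returns {produced}, "
                f"but {consumer.__name__} expects {expected}"
            )


try:
    # Only the signatures are inspected; neither function body has run yet.
    check_chain(create_dataset, train_model)
except TypeError as err:
    print(f"Incompatible types: {err}")
```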
Declarative Infrastructure
Declaratively provision ephemeral clusters, CPU/GPU, and memory resources.
[Diagram: a Workflow whose tasks run on a Spark Cluster, a Ray Cluster, and GPUs; the Flyte Backend handles Setup and Teardown for each]
Declarative Dependencies
Specify your package dependencies as code.
Abstracted Data Persistence
Don’t worry about how data is serialized/deserialized as your execution graph runs.
[Diagram: the tasks of a Workflow read and write Raw Data in a Raw Data Store; within the Flyte Access Boundary, the Metadata Store holds only Metadata and pointers to that raw data]
Fine-tuning LLMs with Flyte
Fine-tuning Workflows
● CPT Workflow: 📖 Wikipedia · 🔧 RedPajama-3B (ZeRO w/ DeepSpeed) · 🤗 Publish to HF Hub · 📊 Evaluate Model · 🔮 Interactive Inference · 8-bit quantize · 🤗 Publish to HF Hub
● SFT Workflow: 📖 Alpaca · 🔧 RedPajama-7B (8-bit LoRA) · 🤗 Publish to HF Hub
Demo with Flyte
● CPT Workflow: 📖 Wikipedia · 🔧 RedPajama-3B (ZeRO w/ DeepSpeed) · 🤗 Publish to HF Hub · 📊 Evaluate Model · 🔮 Interactive Inference · 8-bit quantize · 🤗 Publish to HF Hub
Resources
Flyte Website: https://flyte.org/
Flyte Docs: https://docs.flyte.org/en/latest/
LLM Fine-tuning Repo: https://github.com/unionai-oss/llm-fine-tuning
LLM Evaluation Notebook: https://go.flyte.org/tmls-2023-llm-eval
LLM Inference Notebook: https://go.flyte.org/tmls-2023-llm-inference
8-bit Quantization Notebook: https://go.flyte.org/tmls-2023-8bit-quantization
Basic LoRA Notebook: https://go.flyte.org/tmls-2023-lora
Summary
👉 If you care about building applications with LLMs, start off with prompt engineering. Only if you can’t get the desired behavior after a lot of effort should you embark on fine-tuning. If you care about research, go ahead and do fine-tuning!
👉 The space of architectures, optimizers, and fine-tuning techniques is exploding, and not just in LLMs. These range from data type representations (8-bit quantization) to distributed training setups (ZeRO) and parameter-efficiency tricks (LoRA).
👉 Flyte provides an orchestration platform that lets you write Python code to compose workflows for modern ML workloads and reason about resource requirements, infrastructure, and data flow in a declarative way.
👉 Flyte integrates well with cutting-edge ML tools like transformers, peft, bitsandbytes, deepspeed, and pytorch elastic, as well as the rest of the Python data / ML ecosystem (pandas, pandera, numpy, etc.), while giving you reliable and reproducible workflows.
Thank you for attending
Connect with me here:
X: @HowDevelop
Github: @ShivayLamba
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
Fine-Tuning Large Language Models with Declarative ML Orchestration - Shivay Lamba, Couchbase

  • 1. Fine-tuning Large Language Models with Declarative ML Orchestration. Shivay Lamba, Developer Relations Engineer, Couchbase (@howdevelop)
  • 2. Outline ● 🤔 Why Fine-tune LLMs? ● 🔧 How to Fine-tune LLMs ● 🔀 Why declarative ML orchestration? ● 🚀 Fine-tuning LLMs with Flyte
  • 5. Prompt Engineering: holding the model weights fixed and updating the input prompt to obtain the desired output. [Diagram: Input / Prompt → LLM → Output / Response]
  • 6. Fine-tuning: updating model weights using a specific data distribution to obtain the desired behavior from the model. [Diagram: Input / Prompt → LLM → Output / Response]
  • 7. [Flowchart. Decision nodes: 👀 data privacy?; get desired output?; tried really hard?; ML skills? Action nodes: ✍ prompt engineering; 🔧 fine-tuning (☁ low-code fine-tuning or 💻 “high”-code fine-tuning); 🚀 ship it! to a downstream application; ⏱ wait for R&D to try again…]
  • 9. Types of Fine-tuning
      ● 📖 Continued Pre-training (CPT): choose a pretrained model; create a “pile of tokens” dataset (minimum quality: low); pick an optimization method (mixed precision, ZeRO). Result: a text completer.
      ● 👉 Supervised Fine Tuning (SFT): choose a pretrained model; create a “prompt-response” dataset (minimum quality: high); pick an optimization method (mixed precision, ZeRO, PEFT). Result: a prompt responder.
      ● 🔄 RL from Human Feedback (RLHF): choose an SFT model; create a “prompt-multiresponse” dataset with human preferences (minimum quality: high); train a reward model (RM), initialized from SFT; train a policy on the reward model, initialized from SFT, with a loss based on the RM. Result: a prompt responder.
      ● Slide annotation: “Focus of this talk”.
  • 10. Methods to train large models: Gradient Accumulation (source: https://aman.ai/primers/ai/grad-accum-checkpoint/) and Model Parallelism (source: https://xiandong79.github.io/Intro-Distributed-Deep-Learning)
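Gradient accumulation, named on the slide above, can be illustrated with a small pure-Python sketch. It is not taken from the talk: the toy model (y ≈ w * x), the data, and the function names are all assumptions made for illustration. The point it shows is that summing suitably scaled micro-batch gradients reproduces the full-batch gradient, so a large effective batch can be processed in pieces that fit in memory.

```python
# Minimal sketch: gradient accumulation for 1-D linear regression (y ≈ w * x).
# All names and data here are illustrative, not taken from the talk.

def grad_mse(w, batch):
    """Gradient of the mean squared error over `batch` with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, batch, micro_batch_size):
    """Accumulate micro-batch gradients into the full-batch gradient."""
    total = 0.0
    for start in range(0, len(batch), micro_batch_size):
        micro = batch[start:start + micro_batch_size]
        # Weight each micro-batch by its share of the full batch so the
        # accumulated sum equals the full-batch mean gradient.
        total += grad_mse(w, micro) * len(micro) / len(batch)
    return total


data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
full = grad_mse(0.5, data)
accum = accumulated_grad(0.5, data, micro_batch_size=2)
print(abs(full - accum) < 1e-9)
```

In a real training loop the optimizer step would run once after the last micro-batch, rather than after each one.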
  • 11. Methods to train models faster: Optimizers (source: https://www.fast.ai/posts/2018-07-02-adam-weight-decay.html), Data Parallelism (source: https://xiandong79.github.io/Intro-Distributed-Deep-Learning), and Schedulers (source: https://github.com/sgugger/Deep-Learning/blob/master/Cyclical%20LR%20and%20momentums.ipynb)
  • 12. Zero Redundancy Optimization (ZeRO): ZeRO takes ideas from data and model parallelism, sharding model weights across workers in a distributed system and proceeding through the forward and backward passes in a layer-wise fashion. Source: https://www.microsoft.com/en-us/research/blog/zero-deepspeed-new-system-optimizations-enable-training-models-with-over-100-billion-parameters/
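To get a feel for what sharding buys, the sketch below estimates model-state memory per worker. The slide gives no numbers, so the byte counts are an assumption: they follow the accounting commonly quoted from the ZeRO paper for mixed-precision Adam (2 bytes of fp16 weights, 2 bytes of fp16 gradients, and 12 bytes of fp32 optimizer state per parameter). Activation memory is ignored, and the function name is hypothetical.

```python
# Assumed bytes of model state per parameter for mixed-precision Adam.
FP16_PARAM_BYTES = 2
FP16_GRAD_BYTES = 2
OPTIMIZER_BYTES = 12  # fp32 copy of weights + Adam momentum + Adam variance


def model_state_bytes_per_worker(num_params, num_workers, stage):
    """Model-state memory per worker; stage 0 means plain data parallelism."""
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    params = FP16_PARAM_BYTES * num_params
    grads = FP16_GRAD_BYTES * num_params
    optim = OPTIMIZER_BYTES * num_params
    if stage >= 1:
        optim /= num_workers   # stage 1 shards the optimizer states
    if stage >= 2:
        grads /= num_workers   # stage 2 also shards the gradients
    if stage >= 3:
        params /= num_workers  # stage 3 also shards the weights themselves
    return params + grads + optim


for stage in range(4):
    gb = model_state_bytes_per_worker(7_500_000_000, 64, stage) / 1e9
    print(f"stage {stage}: {gb:.1f} GB per worker")
```

Under these assumptions, only the last stage divides the whole model state by the number of workers; the earlier stages keep a full copy of the weights on every worker.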
  • 13. Intro to 8-bit Quantization: quantization reduces memory requirements but trades off precision. You can fit larger models into your GPUs, but it can lead to training instability.
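The memory-for-precision trade can be seen in a few lines. The sketch below uses absmax quantization, one common 8-bit scheme; it is an assumption that this matches what the talk's notebook does, and the function names and sample weights are illustrative only.

```python
def quantize_absmax(values):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    largest = max(abs(v) for v in values)
    scale = largest / 127 if largest > 0 else 1.0
    # Clamp after rounding to guard against floating-point overshoot.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats; the rounding error is the lost precision."""
    return [q * scale for q in quantized]


weights = [0.12, -0.5, 0.33, 1.27]
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)
worst_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, worst_error <= scale / 2 + 1e-12)
```

Each value now needs one byte plus a shared scale, and the reconstruction error is bounded by half a quantization step. A single large outlier inflates the scale and coarsens every other value, which is one way low-precision training can become unstable.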
  • 14. Low-Rank Adaptation (LoRA): LoRA is a technique that lets us fine-tune large models like Stable Diffusion without retraining them entirely, saving both time and computational resources. The key idea is to keep the pretrained weights frozen and train only a small set of low-rank matrices whose product is added to selected weight matrices. This contrasts with traditional fine-tuning, where most or all of the model's weights are updated, requiring substantial computational power and time. Source: https://sebastianraschka.com/blog/2023/llm-finetuning-lora.html
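A small sketch of the low-rank idea, in plain Python lists rather than a tensor library. It assumes the usual LoRA formulation in which a frozen weight matrix is augmented by the product of two small trainable matrices, scaled by alpha / rank; the function names, the row-vector convention, and the toy numbers are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_forward(x, w_frozen, lora_a, lora_b, alpha, rank):
    """Compute x @ W + (alpha / rank) * (x @ A) @ B with W held fixed.

    Shapes: x (n, d_in), w_frozen (d_in, d_out), lora_a (d_in, rank),
    lora_b (rank, d_out). Only lora_a and lora_b would be trained.
    """
    base = matmul(x, w_frozen)
    update = matmul(matmul(x, lora_a), lora_b)
    scale = alpha / rank
    return [[b + scale * u for b, u in zip(b_row, u_row)]
            for b_row, u_row in zip(base, update)]


def trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank matrices."""
    return rank * (d_in + d_out)


x = [[1.0, 2.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
a = [[0.5], [0.5]]
b_zero = [[0.0, 0.0]]  # a zero matrix leaves the base model's output unchanged
print(lora_forward(x, w, a, b_zero, alpha=2, rank=1))
print(trainable_params(4096, 4096, 8), "vs", 4096 * 4096)
```

For a 4096 by 4096 layer at rank 8, the adapter has 65,536 trainable parameters against roughly 16.8 million in the frozen matrix, which is where the savings come from.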
  • 15. Why Declarative ML Orchestration?
  • 16. Orchestrators coordinate the logical flow of computations needed to get data from its raw state 🧱 into a desired state 🏠
  • 17. Orchestrators help you reason about:
      ● 📦 The units of computation you need for your workload
      ● 🔀 How data flows between those units
      ● ✅ What the types and state of your data are at any given point
      ● 🐳 What dependencies each unit relies on to do its computation
      ● 🌳 What resources each unit has available to it
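The first two points, units of computation and the data flow between them, can be sketched with the standard library alone. This is a toy stand-in, not how Flyte works internally: the task names, the `DEPENDS_ON` mapping, and the `run` helper are all hypothetical.

```python
from graphlib import TopologicalSorter


def load_raw():
    return [3, 1, 2]


def clean(raw):
    return sorted(raw)


def summarize(cleaned):
    return {"n": len(cleaned), "max": cleaned[-1]}


TASKS = {"load_raw": load_raw, "clean": clean, "summarize": summarize}
# Each task maps to the tasks whose outputs it consumes, in argument order.
DEPENDS_ON = {"load_raw": [], "clean": ["load_raw"], "summarize": ["clean"]}


def run(tasks, depends_on):
    """Run every task after its dependencies, passing outputs along the edges."""
    results = {}
    for name in TopologicalSorter(depends_on).static_order():
        inputs = [results[dep] for dep in depends_on[name]]
        results[name] = tasks[name](*inputs)
    return results


print(run(TASKS, DEPENDS_ON)["summarize"])
```

The graph is declared as data, so the runner (rather than the task code) decides execution order. A production orchestrator adds the remaining points on the slide: typed interfaces, per-task dependencies, and per-task resources.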
  • 18. Flyte is a production-grade orchestrator that unifies data, ML, and analytics stacks.
  • 19. [Architecture diagram. Labels: 💻 Create Tasks and Workflows; Package & Register; Container 🐳; Compiled Workflow; Config ⚙; Flyte Cluster; Kubernetes Cluster; Workflow Execution; K8s Pod]
  • 20. Tasks: the smallest unit of work in Flyte. Tasks are 🐳 containerized, strongly typed, and versioned. [Diagram: inputs → Task → outputs]
  • 21. Workflows: compositions of Tasks to achieve complex computations. Workflows are strongly typed and versioned, and data flow is a first-class citizen. [Diagram: a Workflow with its own inputs and outputs, containing several Tasks that each have inputs and outputs]
  • 22. Projects and Domains: logical groupings of tasks and workflows for built-in multi-tenancy and isolation. Domains: Development, Staging, Production. Projects: Data ETL, Classification Models, Forecasting Models.
  • 23. Type Safety: get errors about your execution graph at compile time, even before executing your code. [Diagram: a workflow with Input: int and Output: Model. Create Dataset (Input: int, Output: DataFrame) feeds Train Model (Input: List[Dict], Output: Model); the connection is flagged as Incompatible Types]
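The mismatch on this slide can be reproduced with ordinary Python type hints. The sketch below is only an illustration of checking a graph edge before anything runs: it compares annotations for exact equality, which is a simplification and not Flyte's actual type engine. `DataFrame`, `Model`, `check_edge`, and the task functions are hypothetical stand-ins.

```python
from typing import Dict, List, get_type_hints


class DataFrame:
    """Placeholder for a tabular dataset type."""


class Model:
    """Placeholder for a trained model type."""


def create_dataset(n_rows: int) -> DataFrame:
    return DataFrame()


def train_model(records: List[Dict]) -> Model:
    return Model()


def evaluate(model: Model) -> float:
    return 0.0


def check_edge(upstream, downstream, param):
    """Raise TypeError if upstream's output type differs from downstream's input."""
    produced = get_type_hints(upstream)["return"]
    expected = get_type_hints(downstream)[param]
    if produced != expected:
        raise TypeError(
            f"{upstream.__name__} returns {produced!r} but "
            f"{downstream.__name__}({param}) expects {expected!r}"
        )


check_edge(train_model, evaluate, "model")  # compatible, passes silently
try:
    check_edge(create_dataset, train_model, "records")
except TypeError as err:
    print(f"rejected before running anything: {err}")
```

Neither `create_dataset` nor `train_model` is ever called here; the error comes purely from inspecting the declared interfaces, which is the property the slide highlights.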
  • 24. Declarative Infrastructure: declaratively provisions ephemeral cluster, CPU/GPU, and memory resources. [Diagram: a Flyte Workflow whose tasks use a Spark cluster, a Ray cluster, and GPUs; the Flyte Backend handles setup and teardown for each]
  • 25. Declarative Dependencies: specify your package dependencies as code.
  • 26. Abstracted Data Persistence: don’t worry about how data is serialized/deserialized as your execution graph runs. [Diagram labels: Workflow tasks with inputs and outputs; Raw Data Store holding raw data; Metadata Store holding metadata and pointers; Flyte Access Boundary]
  • 28. Fine-tuning Workflows
      ● CPT Workflow: 📖 Wikipedia; 🔧 RedPajama-3B, ZeRO w/ DeepSpeed; 🤗 Publish to HF Hub; 📖 8bit quantize; 🤗 Publish to HF Hub; 📊 Evaluate Model; 🔮 Interactive Inference
      ● SFT Workflow: 📖 Alpaca; 🔧 RedPajama-7B, 8-bit LoRA; 🤗 Publish to HF Hub
  • 29. Demo with Flyte: CPT Workflow (📖 Wikipedia; 🔧 RedPajama-3B, ZeRO w/ DeepSpeed; 🤗 Publish to HF Hub; 📖 8bit quantize; 🤗 Publish to HF Hub; 📊 Evaluate Model; 🔮 Interactive Inference)
  • 30. Flyte Resources
      ● Flyte Website: https://flyte.org/
      ● Flyte Docs: https://docs.flyte.org/en/latest/
      ● LLM Fine-tuning Repo: https://github.com/unionai-oss/llm-fine-tuning
      ● LLM Evaluation Notebook: https://go.flyte.org/tmls-2023-llm-eval
      ● LLM Inference Notebook: https://go.flyte.org/tmls-2023-llm-inference
      ● 8-bit Quantization Notebook: https://go.flyte.org/tmls-2023-8bit-quantization
      ● Basic LoRA Notebook: https://go.flyte.org/tmls-2023-lora
  • 31. Summary
      👉 If you care about building applications from LLMs, start off with prompt engineering. Only if you can’t get the desired behavior after a lot of effort should you embark on fine-tuning. If you care about research, go ahead and do fine-tuning!
      👉 The space of architectures, optimizers, and fine-tuning techniques is exploding, not just in LLMs. These include data type representations (8-bit quantization), distributed training setups (ZeRO), and parameter efficiency tricks (LoRA).
      👉 Flyte provides an orchestration platform that lets you write Python code to compose workflows for modern ML workloads, so you can reason about resource requirements, infrastructure, and data flow in a declarative way.
      👉 Flyte integrates well with cutting-edge ML tools like transformers, peft, bitsandbytes, deepspeed, and pytorch elastic, and with the entire Python data / ML ecosystem of libraries (pandas, pandera, numpy, etc.), while giving you reliable and reproducible workflows.
  • 32. Thank you for attending. Connect with me here: X: @HowDevelop, GitHub: @ShivayLamba