SlideShare a Scribd company logo
Presented By: Divyansh Jain
Software Consultant
Knoldus Inc
Apache Airflow
Agenda
What is Airflow?
How it works?
Pros & Cons
UI Screenshots
Demo
Why use Airflow ?
c
Airflow is a platform to
programmatically author, schedule
and monitor workflows (a.k.a DAGs or
Directed Acyclic Graph)
What is Apache
Airflow?
Incubating
- Created @ Airbnb in 2015 by Maxime
Beauchemin
- 6100+ Forks
- 16k+ Github Stars
- 1k+ Contributors
- 150+ companies officially using it
- More than 8k committers
What is Airflow?
Pipelines are configured as
code, allowing for dynamic
pipeline generation
A platform to schedule
data pipelines
Easy to define our own
operators, executors
It’s all about DAGs
A platform to monitor and
control data pipelines
Why to use Airflow ?
When would you use a Workflow Scheduler like
Airflow?
- ETL Pipelines
- Machine Learning Projects
- Predictive Data Pipelines such as Fraud Detection etc.
- Job Scheduling
Why to use Airflow ?
What should a Workflow do well?
- Schedule your workflow plan
- Handle task failures
- Monitor Performance
Very Flexible
- Rich User Interface
- Easily Extensible
- Allow communication between task
- Backfill Control
- Efficient CLI
Sample DAGs
How the Job initiates?
Web Server
Scheduler
Worker Worker
Metadata
User
- A user manages/ schedules
DAGs using Airflow
- Airflow webserver stores
scheduling metadata in
metadata DB
- Scheduler picks up new
schedule and distributes work
over Celery or RabbitMQ
- Workers picks up Airflow Task
Note- Celery is used to manage
node and Redis or RabbitMQ for
communication
Pros
Easy-to-use user interface - Independent scheduler Active open-source
community
● can easily view task logs
● can easily view hierarchy,
statuses, and code runs
● allows you to easily
change task statuses,
rerun historical tasks
● comes with its own
scheduler
● supports multiple DAGs
● chatroom itself is very
active
● newbies can get their
questions answered within
a span of few hours.
Cons
Task optimization - No direct dealing with
- tasks
Lesser flexibility
● sometimes unclear of
how to organize multiple
tasks into the massive
pipeline
● doesn't deal with data sets
or files as inputs of tasks
directly
● If database is lost, harder
to restore
● Workers cannot start the
task independently or pick
tasks flexibly.
DAG View
Tree View
Graph View
Code View
Thank You !
Ad

More Related Content

What's hot (20)

Airflow introduction
Airflow introductionAirflow introduction
Airflow introduction
Chandler Huang
 
Apache Airflow overview
Apache Airflow overviewApache Airflow overview
Apache Airflow overview
NikolayGrishchenkov
 
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow management
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow managementIntro to Airflow: Goodbye Cron, Welcome scheduled workflow management
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow management
Burasakorn Sabyeying
 
Airflow - a data flow engine
Airflow - a data flow engineAirflow - a data flow engine
Airflow - a data flow engine
Walter Liu
 
Airflow 101
Airflow 101Airflow 101
Airflow 101
SaarBergerbest
 
Building an analytics workflow using Apache Airflow
Building an analytics workflow using Apache AirflowBuilding an analytics workflow using Apache Airflow
Building an analytics workflow using Apache Airflow
Yohei Onishi
 
Building Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache AirflowBuilding Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache Airflow
Sid Anand
 
Building a Data Pipeline using Apache Airflow (on AWS / GCP)
Building a Data Pipeline using Apache Airflow (on AWS / GCP)Building a Data Pipeline using Apache Airflow (on AWS / GCP)
Building a Data Pipeline using Apache Airflow (on AWS / GCP)
Yohei Onishi
 
Apache Airflow Introduction
Apache Airflow IntroductionApache Airflow Introduction
Apache Airflow Introduction
Liangjun Jiang
 
Apache Airflow Architecture
Apache Airflow ArchitectureApache Airflow Architecture
Apache Airflow Architecture
Gerard Toonstra
 
Apache Airflow
Apache AirflowApache Airflow
Apache Airflow
Knoldus Inc.
 
Introduction to Apache Airflow
Introduction to Apache AirflowIntroduction to Apache Airflow
Introduction to Apache Airflow
mutt_data
 
Airflow tutorials hands_on
Airflow tutorials hands_onAirflow tutorials hands_on
Airflow tutorials hands_on
pko89403
 
Orchestrating workflows Apache Airflow on GCP & AWS
Orchestrating workflows Apache Airflow on GCP & AWSOrchestrating workflows Apache Airflow on GCP & AWS
Orchestrating workflows Apache Airflow on GCP & AWS
Derrick Qin
 
Server monitoring using grafana and prometheus
Server monitoring using grafana and prometheusServer monitoring using grafana and prometheus
Server monitoring using grafana and prometheus
Celine George
 
Running Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on HadoopRunning Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on Hadoop
clairvoyantllc
 
Grafana
GrafanaGrafana
Grafana
NoelMc Grath
 
From airflow to google cloud composer
From airflow to google cloud composerFrom airflow to google cloud composer
From airflow to google cloud composer
Bruce Kuo
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introduction
Rico Chen
 
Airflow Best Practises & Roadmap to Airflow 2.0
Airflow Best Practises & Roadmap to Airflow 2.0Airflow Best Practises & Roadmap to Airflow 2.0
Airflow Best Practises & Roadmap to Airflow 2.0
Kaxil Naik
 
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow management
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow managementIntro to Airflow: Goodbye Cron, Welcome scheduled workflow management
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow management
Burasakorn Sabyeying
 
Airflow - a data flow engine
Airflow - a data flow engineAirflow - a data flow engine
Airflow - a data flow engine
Walter Liu
 
Building an analytics workflow using Apache Airflow
Building an analytics workflow using Apache AirflowBuilding an analytics workflow using Apache Airflow
Building an analytics workflow using Apache Airflow
Yohei Onishi
 
Building Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache AirflowBuilding Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache Airflow
Sid Anand
 
Building a Data Pipeline using Apache Airflow (on AWS / GCP)
Building a Data Pipeline using Apache Airflow (on AWS / GCP)Building a Data Pipeline using Apache Airflow (on AWS / GCP)
Building a Data Pipeline using Apache Airflow (on AWS / GCP)
Yohei Onishi
 
Apache Airflow Introduction
Apache Airflow IntroductionApache Airflow Introduction
Apache Airflow Introduction
Liangjun Jiang
 
Apache Airflow Architecture
Apache Airflow ArchitectureApache Airflow Architecture
Apache Airflow Architecture
Gerard Toonstra
 
Introduction to Apache Airflow
Introduction to Apache AirflowIntroduction to Apache Airflow
Introduction to Apache Airflow
mutt_data
 
Airflow tutorials hands_on
Airflow tutorials hands_onAirflow tutorials hands_on
Airflow tutorials hands_on
pko89403
 
Orchestrating workflows Apache Airflow on GCP & AWS
Orchestrating workflows Apache Airflow on GCP & AWSOrchestrating workflows Apache Airflow on GCP & AWS
Orchestrating workflows Apache Airflow on GCP & AWS
Derrick Qin
 
Server monitoring using grafana and prometheus
Server monitoring using grafana and prometheusServer monitoring using grafana and prometheus
Server monitoring using grafana and prometheus
Celine George
 
Running Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on HadoopRunning Airflow Workflows as ETL Processes on Hadoop
Running Airflow Workflows as ETL Processes on Hadoop
clairvoyantllc
 
From airflow to google cloud composer
From airflow to google cloud composerFrom airflow to google cloud composer
From airflow to google cloud composer
Bruce Kuo
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introduction
Rico Chen
 
Airflow Best Practises & Roadmap to Airflow 2.0
Airflow Best Practises & Roadmap to Airflow 2.0Airflow Best Practises & Roadmap to Airflow 2.0
Airflow Best Practises & Roadmap to Airflow 2.0
Kaxil Naik
 

Similar to Apache Airflow (20)

How to pinpoint and fix sources of performance problems in your SAP BusinessO...
How to pinpoint and fix sources of performance problems in your SAP BusinessO...How to pinpoint and fix sources of performance problems in your SAP BusinessO...
How to pinpoint and fix sources of performance problems in your SAP BusinessO...
Xoomworks Business Intelligence
 
DataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptxDataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptx
John J Zhao
 
adaidoadaoap9dapdadadjoadjoajdoiajodiaoiao
adaidoadaoap9dapdadadjoadjoajdoiajodiaoiaoadaidoadaoap9dapdadadjoadjoajdoiajodiaoiao
adaidoadaoap9dapdadadjoadjoajdoiajodiaoiao
lyvanlinh519
 
Sap basis online training classes
Sap basis online training classesSap basis online training classes
Sap basis online training classes
sapehsit
 
How I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowHow I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with Airflow
PyData
 
Apache Airflow presentation by GenPPT.pptx
Apache Airflow presentation by GenPPT.pptxApache Airflow presentation by GenPPT.pptx
Apache Airflow presentation by GenPPT.pptx
VikasTomar93
 
Airflow techtonic template
Airflow   techtonic templateAirflow   techtonic template
Airflow techtonic template
Sampath Kumar
 
Sap basis training demo basis online training in usa,uk and india
Sap basis training demo  basis online training in usa,uk and indiaSap basis training demo  basis online training in usa,uk and india
Sap basis training demo basis online training in usa,uk and india
magnificsmile
 
Sap basis training demo basis online training in usa,uk and india
Sap basis training demo  basis online training in usa,uk and indiaSap basis training demo  basis online training in usa,uk and india
Sap basis training demo basis online training in usa,uk and india
magnifics
 
Airflow at lyft
Airflow at lyftAirflow at lyft
Airflow at lyft
Tao Feng
 
Serverless - DevOps Lessons Learned From Production
Serverless - DevOps Lessons Learned From ProductionServerless - DevOps Lessons Learned From Production
Serverless - DevOps Lessons Learned From Production
Steve Hogg
 
Analysing and Troubleshooting Performance Issues in SAP BusinessObjects BI Re...
Analysing and Troubleshooting Performance Issues in SAP BusinessObjects BI Re...Analysing and Troubleshooting Performance Issues in SAP BusinessObjects BI Re...
Analysing and Troubleshooting Performance Issues in SAP BusinessObjects BI Re...
BI Brainz
 
Building Automated Data Pipelines with Airflow.pdf
Building Automated Data Pipelines with Airflow.pdfBuilding Automated Data Pipelines with Airflow.pdf
Building Automated Data Pipelines with Airflow.pdf
abhaykm804
 
Tordatasci meetup-precima-retail-analytics-201901
Tordatasci meetup-precima-retail-analytics-201901Tordatasci meetup-precima-retail-analytics-201901
Tordatasci meetup-precima-retail-analytics-201901
WeCloudData
 
Delivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ DevicesDelivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ Devices
Databricks
 
Serverless Computing & Automation - GCP
Serverless Computing & Automation -  GCPServerless Computing & Automation -  GCP
Serverless Computing & Automation - GCP
abiguimeleroy
 
Dataflow.pptx
Dataflow.pptxDataflow.pptx
Dataflow.pptx
Sadeka Islam
 
How I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowHow I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with Airflow
Laura Lorenz
 
airflow web UI and CLI.pptx
airflow web UI and CLI.pptxairflow web UI and CLI.pptx
airflow web UI and CLI.pptx
VIJAYAPRABAP
 
SAP BASIS Simplified Learning with End to End
SAP BASIS Simplified Learning with End to EndSAP BASIS Simplified Learning with End to End
SAP BASIS Simplified Learning with End to End
nagaraj2004811
 
How to pinpoint and fix sources of performance problems in your SAP BusinessO...
How to pinpoint and fix sources of performance problems in your SAP BusinessO...How to pinpoint and fix sources of performance problems in your SAP BusinessO...
How to pinpoint and fix sources of performance problems in your SAP BusinessO...
Xoomworks Business Intelligence
 
DataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptxDataPipelineApacheAirflow.pptx
DataPipelineApacheAirflow.pptx
John J Zhao
 
adaidoadaoap9dapdadadjoadjoajdoiajodiaoiao
adaidoadaoap9dapdadadjoadjoajdoiajodiaoiaoadaidoadaoap9dapdadadjoadjoajdoiajodiaoiao
adaidoadaoap9dapdadadjoadjoajdoiajodiaoiao
lyvanlinh519
 
Sap basis online training classes
Sap basis online training classesSap basis online training classes
Sap basis online training classes
sapehsit
 
How I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowHow I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with Airflow
PyData
 
Apache Airflow presentation by GenPPT.pptx
Apache Airflow presentation by GenPPT.pptxApache Airflow presentation by GenPPT.pptx
Apache Airflow presentation by GenPPT.pptx
VikasTomar93
 
Airflow techtonic template
Airflow   techtonic templateAirflow   techtonic template
Airflow techtonic template
Sampath Kumar
 
Sap basis training demo basis online training in usa,uk and india
Sap basis training demo  basis online training in usa,uk and indiaSap basis training demo  basis online training in usa,uk and india
Sap basis training demo basis online training in usa,uk and india
magnificsmile
 
Sap basis training demo basis online training in usa,uk and india
Sap basis training demo  basis online training in usa,uk and indiaSap basis training demo  basis online training in usa,uk and india
Sap basis training demo basis online training in usa,uk and india
magnifics
 
Airflow at lyft
Airflow at lyftAirflow at lyft
Airflow at lyft
Tao Feng
 
Serverless - DevOps Lessons Learned From Production
Serverless - DevOps Lessons Learned From ProductionServerless - DevOps Lessons Learned From Production
Serverless - DevOps Lessons Learned From Production
Steve Hogg
 
Analysing and Troubleshooting Performance Issues in SAP BusinessObjects BI Re...
Analysing and Troubleshooting Performance Issues in SAP BusinessObjects BI Re...Analysing and Troubleshooting Performance Issues in SAP BusinessObjects BI Re...
Analysing and Troubleshooting Performance Issues in SAP BusinessObjects BI Re...
BI Brainz
 
Building Automated Data Pipelines with Airflow.pdf
Building Automated Data Pipelines with Airflow.pdfBuilding Automated Data Pipelines with Airflow.pdf
Building Automated Data Pipelines with Airflow.pdf
abhaykm804
 
Tordatasci meetup-precima-retail-analytics-201901
Tordatasci meetup-precima-retail-analytics-201901Tordatasci meetup-precima-retail-analytics-201901
Tordatasci meetup-precima-retail-analytics-201901
WeCloudData
 
Delivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ DevicesDelivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ Devices
Databricks
 
Serverless Computing & Automation - GCP
Serverless Computing & Automation -  GCPServerless Computing & Automation -  GCP
Serverless Computing & Automation - GCP
abiguimeleroy
 
How I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowHow I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with Airflow
Laura Lorenz
 
airflow web UI and CLI.pptx
airflow web UI and CLI.pptxairflow web UI and CLI.pptx
airflow web UI and CLI.pptx
VIJAYAPRABAP
 
SAP BASIS Simplified Learning with End to End
SAP BASIS Simplified Learning with End to EndSAP BASIS Simplified Learning with End to End
SAP BASIS Simplified Learning with End to End
nagaraj2004811
 
Ad

More from Knoldus Inc. (20)

Angular Hydration Presentation (FrontEnd)
Angular Hydration Presentation (FrontEnd)Angular Hydration Presentation (FrontEnd)
Angular Hydration Presentation (FrontEnd)
Knoldus Inc.
 
Optimizing Test Execution: Heuristic Algorithm for Self-Healing
Optimizing Test Execution: Heuristic Algorithm for Self-HealingOptimizing Test Execution: Heuristic Algorithm for Self-Healing
Optimizing Test Execution: Heuristic Algorithm for Self-Healing
Knoldus Inc.
 
Self-Healing Test Automation Framework - Healenium
Self-Healing Test Automation Framework - HealeniumSelf-Healing Test Automation Framework - Healenium
Self-Healing Test Automation Framework - Healenium
Knoldus Inc.
 
Kanban Metrics Presentation (Project Management)
Kanban Metrics Presentation (Project Management)Kanban Metrics Presentation (Project Management)
Kanban Metrics Presentation (Project Management)
Knoldus Inc.
 
Java 17 features and implementation.pptx
Java 17 features and implementation.pptxJava 17 features and implementation.pptx
Java 17 features and implementation.pptx
Knoldus Inc.
 
Chaos Mesh Introducing Chaos in Kubernetes
Chaos Mesh Introducing Chaos in KubernetesChaos Mesh Introducing Chaos in Kubernetes
Chaos Mesh Introducing Chaos in Kubernetes
Knoldus Inc.
 
GraalVM - A Step Ahead of JVM Presentation
GraalVM - A Step Ahead of JVM PresentationGraalVM - A Step Ahead of JVM Presentation
GraalVM - A Step Ahead of JVM Presentation
Knoldus Inc.
 
Nomad by HashiCorp Presentation (DevOps)
Nomad by HashiCorp Presentation (DevOps)Nomad by HashiCorp Presentation (DevOps)
Nomad by HashiCorp Presentation (DevOps)
Knoldus Inc.
 
Nomad by HashiCorp Presentation (DevOps)
Nomad by HashiCorp Presentation (DevOps)Nomad by HashiCorp Presentation (DevOps)
Nomad by HashiCorp Presentation (DevOps)
Knoldus Inc.
 
DAPR - Distributed Application Runtime Presentation
DAPR - Distributed Application Runtime PresentationDAPR - Distributed Application Runtime Presentation
DAPR - Distributed Application Runtime Presentation
Knoldus Inc.
 
Introduction to Azure Virtual WAN Presentation
Introduction to Azure Virtual WAN PresentationIntroduction to Azure Virtual WAN Presentation
Introduction to Azure Virtual WAN Presentation
Knoldus Inc.
 
Introduction to Argo Rollouts Presentation
Introduction to Argo Rollouts PresentationIntroduction to Argo Rollouts Presentation
Introduction to Argo Rollouts Presentation
Knoldus Inc.
 
Intro to Azure Container App Presentation
Intro to Azure Container App PresentationIntro to Azure Container App Presentation
Intro to Azure Container App Presentation
Knoldus Inc.
 
Insights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability ExcellenceInsights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability Excellence
Knoldus Inc.
 
Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)
Knoldus Inc.
 
Code Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis FrameworkCode Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis Framework
Knoldus Inc.
 
AWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS PresentationAWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS Presentation
Knoldus Inc.
 
Amazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and AuthorizationAmazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and Authorization
Knoldus Inc.
 
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web DevelopmentZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
Knoldus Inc.
 
Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.
Knoldus Inc.
 
Angular Hydration Presentation (FrontEnd)
Angular Hydration Presentation (FrontEnd)Angular Hydration Presentation (FrontEnd)
Angular Hydration Presentation (FrontEnd)
Knoldus Inc.
 
Optimizing Test Execution: Heuristic Algorithm for Self-Healing
Optimizing Test Execution: Heuristic Algorithm for Self-HealingOptimizing Test Execution: Heuristic Algorithm for Self-Healing
Optimizing Test Execution: Heuristic Algorithm for Self-Healing
Knoldus Inc.
 
Self-Healing Test Automation Framework - Healenium
Self-Healing Test Automation Framework - HealeniumSelf-Healing Test Automation Framework - Healenium
Self-Healing Test Automation Framework - Healenium
Knoldus Inc.
 
Kanban Metrics Presentation (Project Management)
Kanban Metrics Presentation (Project Management)Kanban Metrics Presentation (Project Management)
Kanban Metrics Presentation (Project Management)
Knoldus Inc.
 
Java 17 features and implementation.pptx
Java 17 features and implementation.pptxJava 17 features and implementation.pptx
Java 17 features and implementation.pptx
Knoldus Inc.
 
Chaos Mesh Introducing Chaos in Kubernetes
Chaos Mesh Introducing Chaos in KubernetesChaos Mesh Introducing Chaos in Kubernetes
Chaos Mesh Introducing Chaos in Kubernetes
Knoldus Inc.
 
GraalVM - A Step Ahead of JVM Presentation
GraalVM - A Step Ahead of JVM PresentationGraalVM - A Step Ahead of JVM Presentation
GraalVM - A Step Ahead of JVM Presentation
Knoldus Inc.
 
Nomad by HashiCorp Presentation (DevOps)
Nomad by HashiCorp Presentation (DevOps)Nomad by HashiCorp Presentation (DevOps)
Nomad by HashiCorp Presentation (DevOps)
Knoldus Inc.
 
Nomad by HashiCorp Presentation (DevOps)
Nomad by HashiCorp Presentation (DevOps)Nomad by HashiCorp Presentation (DevOps)
Nomad by HashiCorp Presentation (DevOps)
Knoldus Inc.
 
DAPR - Distributed Application Runtime Presentation
DAPR - Distributed Application Runtime PresentationDAPR - Distributed Application Runtime Presentation
DAPR - Distributed Application Runtime Presentation
Knoldus Inc.
 
Introduction to Azure Virtual WAN Presentation
Introduction to Azure Virtual WAN PresentationIntroduction to Azure Virtual WAN Presentation
Introduction to Azure Virtual WAN Presentation
Knoldus Inc.
 
Introduction to Argo Rollouts Presentation
Introduction to Argo Rollouts PresentationIntroduction to Argo Rollouts Presentation
Introduction to Argo Rollouts Presentation
Knoldus Inc.
 
Intro to Azure Container App Presentation
Intro to Azure Container App PresentationIntro to Azure Container App Presentation
Intro to Azure Container App Presentation
Knoldus Inc.
 
Insights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability ExcellenceInsights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability Excellence
Knoldus Inc.
 
Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)
Knoldus Inc.
 
Code Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis FrameworkCode Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis Framework
Knoldus Inc.
 
AWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS PresentationAWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS Presentation
Knoldus Inc.
 
Amazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and AuthorizationAmazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and Authorization
Knoldus Inc.
 
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web DevelopmentZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
Knoldus Inc.
 
Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.
Knoldus Inc.
 
Ad

Recently uploaded (20)

Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 

Apache Airflow

  • 1. Presented By: Divyansh Jain Software Consultant Knoldus Inc Apache Airflow
  • 2. Agenda What is Airflow? How it works? Pros & Cons UI Screenshots Demo Why use Airflow ?
  • 3. c Airflow is a platform to programmatically author, schedule and monitor workflows (a.k.a DAGs or Directed Acyclic Graph) What is Apache Airflow?
  • 4. Incubating - Created @ Airbnb in 2015 by Maxime Beauchemin - 6100+ Forks - 16k+ Github Stars - 1k+ Contributors - 150+ companies officially using it - More than 8k committers
  • 5. What is Airflow? Pipelines are configured as code, allowing for dynamic pipeline generation A platform to schedule data pipelines Easy to define our own operators, executors It’s all about DAGs A platform to monitor and control data pipelines
  • 6. Why to use Airflow ? When would you use a Workflow Scheduler like Airflow? - ETL Pipelines - Machine Learning Projects - Predictive Data Pipelines such as Fraud Detection etc. - Job Scheduling
  • 7. Why to use Airflow ? What should a Workflow do well? - Schedule your workflow plan - Handle task failures - Monitor Performance
  • 8. Very Flexible - Rich User Interface - Easily Extensible - Allow communication between task - Backfill Control - Efficient CLI
  • 10. How the Job initiates? Web Server Scheduler Worker Worker Metadata User - A user manages/ schedules DAGs using Airflow - Airflow webserver stores scheduling metadata in metadata DB - Scheduler picks up new schedule and distributes work over Celery or RabbitMQ - Workers picks up Airflow Task Note- Celery is used to manage node and Redis or RabbitMQ for communication
  • 11. Pros Easy-to-use user interface - Independent scheduler Active open-source community ● can easily view task logs ● can easily view hierarchy, statuses, and code runs ● allows you to easily change task statuses, rerun historical tasks ● comes with its own scheduler ● supports multiple DAGs ● chatroom itself is very active ● newbies can get their questions answered within a span of few hours.
  • 12. Cons Task optimization - No direct dealing with - tasks Lesser flexibility ● sometimes unclear of how to organize multiple tasks into the massive pipeline ● doesn't deal with data sets or files as inputs of tasks directly ● If database is lost, harder to restore ● Workers cannot start the task independently or pick tasks flexibly.