It's a Breeze to develop Apache Airflow (Apache Con Berlin)
Polidea
Airflow is a platform to programmatically author,
schedule and monitor workflows.
Dynamic/Elegant
Extensible
Scalable
Polidea
● Developing Airflow is (was!) hard
● The road taken to developer productivity
● Improving the first-time experience for developers
● Focus on teamwork
Polidea
Hi!
Principal Software Engineer @Polidea
Apache Airflow PMC member
Certified GCP Architect
ex-Googler, ex-CTO, ex-choir member
Polidea
[Infographic: talents, projects delivered, users of our apps, business through referrals]
Polidea
● 100+ operators
● 18+ GCP services
● Oozie-To-Airflow
Polidea
● 1 Apache Airflow committer, 1 PMC member
● Documentation improvements
● Breeze - improved development environment
● Py2 -> Py3
● Pylint compatibility
● CI environment reimplemented
● Operator scaffolding
Polidea
● Multiple backends: postgres, mysql, sqlite
● Multiple Python versions: (2.7), 3.5, 3.6, 3.7
● Multiple executors: Local/Sequential/Kubernetes
● Automated static code analysis
● Automated documentation building
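The size of this test matrix can be illustrated with a short sketch. The values come from the list above (Python 2.7 is left out because the slide shows it in parentheses). Enumerating the full Cartesian product is an assumption made for illustration; the actual CI configuration may run only a subset, and `test_matrix` is a hypothetical helper name.

```python
from itertools import product

# Values taken from the slide; the full cross product is an illustrative assumption.
BACKENDS = ["postgres", "mysql", "sqlite"]
PYTHON_VERSIONS = ["3.5", "3.6", "3.7"]
EXECUTORS = ["Local", "Sequential", "Kubernetes"]


def test_matrix():
    """Return every (backend, python, executor) combination."""
    return list(product(BACKENDS, PYTHON_VERSIONS, EXECUTORS))


if __name__ == "__main__":
    combos = test_matrix()
    print(f"{len(combos)} combinations")
    for backend, python, executor in combos[:3]:
        print(backend, python, executor)
```

Even this small matrix multiplies quickly, which is one reason a slow per-environment setup hurts so much.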
Polidea
● Long time to set it up
● Frustrating experience for new developers
● High friction and a steep learning curve for the Airflow development environment
● Slow iteration speed
● Complicated development environment
Polidea
● Scripts only designed for CI, not local environment
● Dependencies installed every time you start the environment
● Always full database reset
● Minutes to run one test
● No guidance on how to iterate on tests
Polidea
● Focus on developer productivity
● Faster development cycle
● Decrease developer frustration
● Improve the teamwork
● Easy for ad-hoc contributors to code & test
Polidea
● AIP-10: Multi-layered and multi-stage official Airflow image
● AIP-7: Simplified Development Workflow
● AIP-26: Production-ready Airflow Docker Image and helm chart
● AIP-23: Migrate out of Travis CI
● AIP-4: Support for System Tests for external systems
Polidea
● Local virtualenv
● Own Travis CI fork
● Docker compose (Travis CI equivalent)
Polidea
● Total time: 7 minutes
● Running one test only
● Failure at the end (!)
● Re-run - 10-20 seconds for DB
● Re-enter - same time (!)
● No bash history
Polidea
● Docker images built from master automatically (DockerHub)
● Local images use cached images
● Tests and static checks run using Docker Compose/Docker environment
● Can be run on Kubernetes Cluster (Docker-In-Docker)
● Independent of the CI system
● Base to build production image
Polidea
● works out-of-the-box
● initializes DB when needed
● environment variables set
● sub-second test overhead
● ipdb debugging
● verbose output
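The "initializes DB when needed" idea, as opposed to a full reset on every start, can be sketched as follows. This is a minimal illustration using SQLite from the Python standard library; the marker table and the `ensure_initialized` helper are hypothetical and are not how Airflow or Breeze actually detect an initialized database.

```python
import sqlite3

# Hypothetical marker table; Airflow's real schema and checks differ.
MARKER_TABLE = "schema_marker"


def ensure_initialized(conn):
    """Create the schema only if it is missing. Return True if work was done."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (MARKER_TABLE,),
    ).fetchone()
    if row is not None:
        return False  # already initialized, skip the expensive reset
    conn.execute(f"CREATE TABLE {MARKER_TABLE} (version INTEGER NOT NULL)")
    conn.execute(f"INSERT INTO {MARKER_TABLE} (version) VALUES (1)")
    conn.commit()
    return True


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(ensure_initialized(conn))  # first call creates the schema
    print(ensure_initialized(conn))  # second call finds it and does nothing
```

Checking first and initializing only on a miss is what removes the repeated setup cost from each test run.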
Polidea
● entering the environment: ./breeze --backend sqlite --python 3.5
● last-used environment: ./breeze
● automated image management
● autocomplete of options
● sub-second test execution overhead
● host sources mounted to Docker container
● ports forwarded
● hints for ad-hoc developers
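The two invocations above can be assembled programmatically, for example from a wrapper script. The sketch below uses only the `--backend` and `--python` flags shown on the slide; `breeze_command` is a hypothetical helper, and the script is assumed to be run from the root of an Airflow checkout where `./breeze` lives.

```python
def breeze_command(backend=None, python=None):
    """Build the argument list for entering the Breeze environment.

    With no arguments the bare ./breeze call is returned, which per the
    slide re-enters the last-used environment.
    """
    cmd = ["./breeze"]
    if backend is not None:
        cmd += ["--backend", backend]
    if python is not None:
        cmd += ["--python", python]
    return cmd


if __name__ == "__main__":
    print(" ".join(breeze_command("sqlite", "3.5")))
    print(" ".join(breeze_command()))
```

The resulting list could be passed to `subprocess.run` to launch the environment; that step is left out here because it needs Docker and the Airflow sources.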
Polidea
● run-tests tests.core<TAB><TAB> autocomplete
● bash history across sessions
● run static checks with Breeze
● build documentation with Breeze
● run licence checks with Breeze
● easy debugging (including debugging with IDE)
● pre-commit checks
Polidea
● Docker image management
● Run-tests with DB initialisation
● Travis CI integration
● Run all tasks (docs/static/licence check ...)
● Pre-commit checks
● Comprehensive documentation - Google Season of Docs YAY!
Polidea
● easy to use
○ pre-commit install
○ pre-commit run
○ pre-commit run mypy
○ pre-commit run --all-files
● run only for changed files (fast)
● catches errors early
● makes efficient use of committers' time
● promotes good practices
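Why running only on changed files is fast can be shown with a small sketch. pre-commit hooks can declare a `files` regular expression, and a hook only receives the changed paths that match it. The code below imitates that selection step; it is not pre-commit's implementation, and the hook table (including the `yamllint` entry) is a hypothetical example.

```python
import re

# Hypothetical hook table: hook id -> regex of file paths it applies to.
HOOKS = {
    "mypy": r"\.py$",
    "yamllint": r"\.ya?ml$",
}


def files_per_hook(changed_files, hooks=HOOKS):
    """Map each hook id to the changed files its pattern matches."""
    selected = {}
    for hook_id, pattern in hooks.items():
        matches = [path for path in changed_files if re.search(pattern, path)]
        if matches:
            selected[hook_id] = matches  # hooks with no matching files are skipped
    return selected


if __name__ == "__main__":
    changed = ["airflow/models.py", "README.md"]
    for hook_id, paths in files_per_hook(changed).items():
        print(hook_id, paths)
```

With a typical commit touching a handful of files, most hooks either see a short file list or are skipped entirely, whereas `pre-commit run --all-files` checks the whole repository.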
Polidea
● Production-ready Apache Airflow official image
● Simplifications (fewer images, easier scripts)
● Migrating out of Travis CI
○ GitLab CI (only CI) or GitHub Actions
○ Kubernetes Cluster on Google Kubernetes Engine (Thanks Google!)
● Automation of Performance Tests
● Automation of Release Tests
Polidea
hello@polidea.com