Getting Deep on Orchestration
Jeff
Nickoloff
Engineer, Author, and
Consultant
• Author of Docker in Action
• Blog about Docker at https://medium.com/on-docker
• Professional engineering and containerization consultant
• Work with startups and Fortune 100 companies
• Training
• http://allingeek.com
• Heaps of orchestration experience at Amazon
About me
What is
Orchestration?
Before Time
Examples
Abstractions
OSS
Agenda
Picking Apart
Platforms
Abstractions
Architecture
Components
Failures
Demo: Entropy
About failure
Architecture
Break stuff
Orchestration
What is it anyway?
Remember what it was like to SCP deployment artifacts?
How about just checking out from SVN right into prod?
Perl wizardry…
How about the luxurious experience of deploying via app server?
<2005
… and the following years of pain
“The first time you encounter real orchestration -
having lived in its absence - you may experience
the urge to laugh, or cry, or both.”
— Jeff Nickoloff
Getting Deep on Orchestration - Nickoloff - DockerCon16
• Amazon Apollo (CodeDeploy)
• Deployment as an entity
• Amazon Marketplace Feeds and Reports
• Clustering, resource aware scheduling, state management
It was like living in a different world and there was no going back for anyone.
How many proprietary orchestration platforms do you think there are?
Proprietary Orchestration
An orchestration platform is:
a system that provides control of high-level abstractions which
imply certain deployment and lifecycle management semantics.
Abstractions:
• Force multipliers for communication
• Often required to cope with complexity or variability
An Observational Description
In the configuration management OSS world we have:
• Puppet, Chef, Ansible, Packer/Terraform, etc
Cloud specific
• CloudFormation, Elastic Beanstalk, ECS, etc
In the container platform OSS world we have:
• Swarm, UCP, Kubernetes, Mesosphere, Rancher, etc
OSS Orchestration
Reading this list gives me the feels…
Container Orchestration
• Docker + Libnetwork + Swarm + Compose + etcd / consul / zk

• Docker + Flannel + Systemd + etcd + kubelet + kube (api, scheduler,
controller-manager, proxy, DNS)

• Docker (sometimes) + Mesos + Marathon + Calico + zk
Breaking down Swarm, Kubernetes, and Mesosphere
Force multiplying ideas: higher level examples
• Container (an isolated context for processes)
• Composition / Pod (containers w/ shared lifecycle)
• Service (long running event handler)
• Replication Controller (managed scale)
• Job (process with a linear lifecycle)
• Deployment (pushing and starting containers)
Abstractions
What features should we expect?
• Frontend
• Ingest descriptions of desired state
• Provide system visibility (and integration)
• Backend
• Manage a compute resource pool
• Manage logical networks, routes, and names
• Service discovery and load balancing
• Manage storage attachment, distribution, and replication
• Manage containers and maintain the desired state
• AA infrastructure integration
Architecture
How do you control the system?
• One or a collection of APIs
• Command Line Program
• Web or Native GUI
What do these interfaces interact with on the
backend?
An Interface
Where do they maintain the authoritative state?
Databases provide accounting for entities and their state.
Centralized or decentralized and commonly provide:
• KV semantics
• Distributed locks
• HA with strong consistency (Paxos / Raft)
• Record observation (watches)
• Update / Delete semantics with fencing tokens
System of Record
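A minimal in-memory sketch of the properties listed on this slide: KV semantics, watches, and versioned update / delete. It is an illustration only, not how etcd, Consul, or ZooKeeper are implemented. The names `SystemOfRecord`, `Record`, and `StaleWriteError` are hypothetical, and the per-record version check only plays the role of a fencing token here. A real system of record would also replicate this state through a consensus protocol such as Paxos or Raft.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Record:
    value: str
    version: int  # store-wide revision at the time of the write


class StaleWriteError(Exception):
    """Raised when a writer presents an out-of-date version."""


# A watcher receives the key and the new record (None means deleted).
Watcher = Callable[[str, Optional[Record]], None]


class SystemOfRecord:
    """Hypothetical single-process stand-in for a distributed KV store."""

    def __init__(self) -> None:
        self._data: Dict[str, Record] = {}
        self._watchers: Dict[str, List[Watcher]] = {}
        self._revision = 0  # only ever increases

    def get(self, key: str) -> Optional[Record]:
        return self._data.get(key)

    def watch(self, key: str, callback: Watcher) -> None:
        self._watchers.setdefault(key, []).append(callback)

    def put(self, key: str, value: str, expected_version: int = 0) -> Record:
        # expected_version=0 means "create only if the key is absent".
        self._check(key, expected_version)
        self._revision += 1
        record = Record(value, self._revision)
        self._data[key] = record
        self._notify(key, record)
        return record

    def delete(self, key: str, expected_version: int) -> None:
        self._check(key, expected_version)
        if self._data.pop(key, None) is not None:
            self._notify(key, None)

    def _check(self, key: str, expected_version: int) -> None:
        current = self._data.get(key)
        current_version = current.version if current else 0
        if expected_version != current_version:
            raise StaleWriteError(
                f"{key}: expected version {expected_version}, "
                f"found {current_version}"
            )

    def _notify(self, key: str, record: Optional[Record]) -> None:
        for callback in self._watchers.get(key, []):
            callback(key, record)
```

A writer that read an old version cannot overwrite newer state, and every accepted change reaches the watchers, which is the feedback the control loops rely on.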
Control loops are the platform’s automata.
• React to state changes on notification or polling
• Calculate deltas with desired state and apply
• Might require leadership election for HA
Agents / Control Loops
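The "calculate deltas with desired state and apply" step can be sketched as a single reconciliation pass. This is a simplified illustration, assuming state is just a replica count per service name. The function names are hypothetical, and a real loop would call a container engine instead of mutating a dictionary.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str]  # (verb, service name), e.g. ("start", "web")


def calculate_delta(desired: Dict[str, int], observed: Dict[str, int]) -> List[Action]:
    """Return the start/stop actions that move observed toward desired."""
    actions: List[Action] = []
    for service in sorted(set(desired) | set(observed)):
        diff = desired.get(service, 0) - observed.get(service, 0)
        verb = "start" if diff > 0 else "stop"
        actions.extend([(verb, service)] * abs(diff))
    return actions


def apply(actions: List[Action], observed: Dict[str, int]) -> None:
    """Pretend to run each action by updating the observed counts in place."""
    for verb, service in actions:
        count = observed.get(service, 0) + (1 if verb == "start" else -1)
        if count:
            observed[service] = count
        else:
            observed.pop(service, None)


def reconcile_once(desired: Dict[str, int], observed: Dict[str, int]) -> List[Action]:
    """One pass of the loop; triggered by a notification or a poll timer."""
    actions = calculate_delta(desired, observed)
    apply(actions, observed)
    return actions


if __name__ == "__main__":
    desired = {"web": 3, "worker": 1}
    observed = {"web": 1, "cache": 2}
    print(reconcile_once(desired, observed))
    print(observed == desired)
```

Running the pass a second time with no intervening change yields no actions, which is the property that makes it safe to trigger the loop on every notification or poll.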
How do they maintain the desired state of the system?
Event driven agents that coordinate changes in the system
• Container (re)scheduler
• Cluster node registrar
• Service node registrar
• Local supervisor / init
• Service controller
• Job controller
Agents / Control Loops
Event driven agents that coordinate changes in the system
• State Observation
• Feedback for control loops
• Entity Lifecycle Graph
• Registration and Discovery
• Route to an IP
• Engine in a cluster
• Replica of a service
• Endpoint of a service
Patterns
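One common way to implement the registration and discovery pattern is heartbeat registration with a time-to-live. The sketch below is a minimal illustration of that idea, not the mechanism of any particular platform. `Registry`, its methods, and the addresses are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
from typing import Dict, List


class Registry:
    """Hypothetical registry: endpoints stay discoverable while they heartbeat."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        # service name -> endpoint -> timestamp of the last heartbeat
        self._last_seen: Dict[str, Dict[str, float]] = {}

    def heartbeat(self, service: str, endpoint: str, now: float) -> None:
        """Register the endpoint, or refresh its registration."""
        self._last_seen.setdefault(service, {})[endpoint] = now

    def discover(self, service: str, now: float) -> List[str]:
        """Return endpoints whose last heartbeat is within the TTL."""
        entries = self._last_seen.get(service, {})
        return sorted(e for e, seen in entries.items() if now - seen <= self._ttl)


if __name__ == "__main__":
    registry = Registry(ttl_seconds=10)
    registry.heartbeat("web", "10.0.0.1:80", now=0)
    registry.heartbeat("web", "10.0.0.2:80", now=5)
    print(registry.discover("web", now=8))
    print(registry.discover("web", now=12))
```

The same shape covers each item in the list: an engine registering with a cluster, a replica registering with a service, or an endpoint registering for routing. An entry that stops heartbeating simply ages out.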
What Breaks?
How am I going to fix it?
• Still need high-low entropy service discovery
and routing
• This is becoming less of a problem
• Bugs/failures you’d experience without
orchestration or clustering
Common Issues
• System of Record is a big SPoF
• Need durable distributed DB
• “Something isn’t showing up!”
• Everything uses the same
registration/discovery pattern
Clustering Issues
• N machines and M containers on an
overlay or virtual network
• Container attachment / detachment
• Node attachment / detachment
• Heartbeats / node monitoring
Networking Stress
• Upgrade mechanics
• Careful dance?
• Is rollback possible?
• Blue/Green?
Platform Lifecycle
The Future
Where do we take these
systems from here?
Maybe a few new abstractions?
• Feed (Job with an input document)
• Report (Job with an output document)
• Request (Job with input and output documents)
• Cron Job (run a periodic job)
• User / User collection maybe?
• Failure Injection
Extensions
Putting your money where your mouth is…
Build confidence in complex distributed systems by injecting
realistic failures and comparing operations against a steady state.
• http://principlesofchaos.org
• Failure mode and effects analysis
• Netflix / SimianArmy
Not much in the container space yet…
Failure Injection
An orchestration abstraction for failure injection
Features:
• Probabilistic failure injection policies
• Failure modes
• Latency, partition, GC pause, etc
• Applied to existing containers filtered by label
• An event stream
• Notifications
• Integrates with the Docker API
Project: Entropy (PoC)
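A rough sketch of what a probabilistic policy applied to label-filtered containers could look like. This is not Entropy's implementation and it makes no Docker API calls. `Container`, `Policy`, `matches`, and `select_targets` are hypothetical names, and the container list stands in for whatever the engine would report.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Container:
    name: str
    labels: Dict[str, str]


@dataclass
class Policy:
    failure: str              # failure mode, e.g. "latency" or "partition"
    probability: float        # chance of injecting per matching container
    criteria: Dict[str, str]  # labels a container must carry to be eligible


def matches(container: Container, criteria: Dict[str, str]) -> bool:
    """True when the container carries every label in the criteria."""
    return all(container.labels.get(k) == v for k, v in criteria.items())


def select_targets(policy: Policy, containers: List[Container],
                   rng: random.Random) -> List[str]:
    """Pick the matching containers that receive the failure this round."""
    return [
        c.name
        for c in containers
        if matches(c, policy.criteria) and rng.random() < policy.probability
    ]


if __name__ == "__main__":
    containers = [
        Container("ping-1", {"service": "PingGoogle"}),
        Container("web-1", {"service": "web"}),
        Container("ping-2", {"service": "PingGoogle"}),
    ]
    policy = Policy("latency", 0.5, {"service": "PingGoogle"})
    # Each matching container is selected independently with probability 0.5.
    print(select_targets(policy, containers, random.Random()))
```

Passing the random source in as an argument keeps each round reproducible in tests. Repeating the selection on a timer would give the recurring injection behavior, and emitting each selection as an event would feed the event stream and notifications.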
Demo Slide
$ entropy create \
--failure latency \
--frequency 10 \
--probability .50 \
--image allingeek/gremlins \
--criteria service=PingGoogle
Thank you!
Check out:
github.com/buildertools/entropy
Twitter: @allingeek

Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 

Getting Deep on Orchestration - Nickoloff - DockerCon16

  • 1. Getting Deep on Orchestration Jeff Nickoloff Engineer, Author, and Consultant
  • 2. • Author of Docker in Action • Blog about Docker at https://medium.com/on-docker • Professional engineering and containerization consultant • Work with startups and Fortune 100 companies • Training • http://allingeek.com • Heaps of orchestration experience at Amazon About me
  • 3. What is Orchestration? Before Time Examples Abstractions OSS Agenda Picking Apart Platforms Abstractions Architecture Components Failures Demo: Entropy About failure Architecture Break stuff
  • 5. Remember what it was like to SCP deployment artifacts? How about just checking out from SVN right into prod? Perl wizardry… How about the luxurious experience of deploying via app server? <2005 … and the following years of pain
  • 6. “The first time you encounter real orchestration - having lived in its absence - you may experience the urge to laugh, or cry, or both.” — Jeff Nickoloff
  • 8. • Amazon Apollo (Code Deploy) • Deployment as an entity • Amazon Marketplace Feeds and Reports • Clustering, resource aware scheduling, state management It was like living in a different world and there was no going back for anyone. How many proprietary orchestration platforms do you think there are? Proprietary Orchestration
  • 9. An orchestration platform is: a system that provides control of high-level abstractions which imply certain deployment and lifecycle management semantics. Abstractions: • Force multipliers for communication • Often required to cope with complexity or variability An Observational Description
  • 10. In the configuration management OSS world we have: • Puppet, Chef, Ansible, Packer/Terraform, etc Cloud specific • CloudFormation, ElasticBeanstalk, ECS, etc In the container platform OSS world we have: • Swarm, UCP, Kubernetes, Mesosphere, Rancher, etc OSS Orchestration Reading this list gives me the feels…
  • 12. Container Orchestration • Docker + Libnetwork + Swarm + Compose + etcd / consul / zk • Docker + Flannel + Systemd + etcd + kubelet + kube (api, scheduler, controller-manager, proxy, DNS) • Docker (sometimes) + Mesos + Marathon + Calico + zk Breaking down Swarm, Kubernetes, and Mesosphere
  • 13. Force multiplying ideas: higher level examples • Container (an isolated context for processes) • Composition / Pod (containers w/ shared lifecycle) • Service (long running event handler) • Replication Controller (managed scale) • Job (process with a linear lifecycle) • Deployment (pushing and starting containers) Abstractions
  • 14. What features should we expect? • Frontend • Ingest descriptions of desired state • Provide system visibility (and integration) • Backend • Manage a compute resource pool • Manage logical networks, routes, and names • Service discovery and load balancing • Manage storage attachment, distribution, and replication • Manage containers and maintain the desired state • AA infrastructure integration Architecture
  • 15. How do you control the system? • One or a collection of APIs • Command Line Program • Web or Native GUI What do these interfaces interact with on the backend? An Interface
  • 16. Where do they maintain the authoritative state? Databases provide accounting for entities and their state. Centralized or decentralized and commonly provide: • KV semantics • Distributed locks • HA with strong consistency (Paxos / Raft) • Record observation (watches) • Update / Delete semantics with fencing tokens System of Record
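The "update / delete semantics with fencing tokens" point on slide 16 can be illustrated with a minimal sketch. It assumes an in-memory store and integer tokens handed out by some lock service; `FencedStore`, `put`, and `get` are hypothetical names chosen for illustration, not the API of etcd, Consul, or ZooKeeper.

```python
class FencedStore:
    """In-memory stand-in for a system of record (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}
        self._highest_token = {}  # newest fencing token seen per key

    def put(self, key, value, token):
        """Apply a write only if its token is not older than the newest one seen."""
        if token < self._highest_token.get(key, 0):
            # A stale lock holder (for example, one resuming after a long pause)
            # is fenced off.
            return False
        self._highest_token[key] = token
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


store = FencedStore()
store.put("service/web/leader", "node-a", token=33)
store.put("service/web/leader", "node-b", token=34)
# node-a wakes up still believing it holds the lock; its write is rejected.
stale_write_accepted = store.put("service/web/leader", "node-a", token=33)
```

The store, rather than the client, enforces the ordering, so a paused or partitioned writer cannot overwrite newer state.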
  • 17. Control loops are the platform’s automata. • React to state changes on notification or polling • Calculate deltas with desired state and apply • Might require leadership election for HA Agents / Control Loops How do they maintain the desired state of the system?
  • 18. Event driven agents that coordinate changes in the system • Container (re)scheduler • Cluster node registrar • Service node registrar • Local supervisor / init • Service controller • Job controller Agents / Control Loops
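The control loop described on slides 17 and 18 (observe state, calculate deltas against the desired state, apply them) might be sketched as follows. This is an illustration under simplifying assumptions: state is a plain dict of service name to replica count, and `plan` and `reconcile` are hypothetical names, not functions from Swarm, Kubernetes, or Mesos.

```python
def plan(desired, observed):
    """Return the actions needed to move observed replica counts to the desired ones."""
    actions = []
    for service in sorted(set(desired) | set(observed)):
        delta = desired.get(service, 0) - observed.get(service, 0)
        if delta > 0:
            actions.append(("start", service, delta))
        elif delta < 0:
            actions.append(("stop", service, -delta))
    return actions


def reconcile(desired, observed):
    """Apply the planned actions to the observed state (simulated in memory)."""
    for action, service, count in plan(desired, observed):
        change = count if action == "start" else -count
        observed[service] = observed.get(service, 0) + change
    return observed


desired = {"web": 3, "api": 1}
observed = {"web": 1, "old": 2}
actions = plan(desired, observed)
reconcile(desired, observed)
```

A real agent would run this on every notification or polling interval, and in an HA setup only the elected leader would apply the actions.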
  • 19. Event driven agents that coordinate changes in the system • State Observation • Feedback for control loops • Entity Lifecycle Graph • Registration and Discovery • Route to an IP • Engine in a cluster • Replica of a service • Endpoint of a service Patterns
  • 21. What Breaks? How am I going to fix it?
  • 22. • Still need high-low entropy service discovery and routing • This is becoming less of a problem • Bugs/failures you’d experience without orchestration or clustering Common Issues
  • 23. • System of Record is a big SPoF • Need durable distributed DB • “Something isn’t showing up!” • Everything uses the same registration/discovery pattern Clustering Issues
  • 24. • N machines and M containers on an overlay or virtual network • Container attachment / detachment • Node attachment / detachment • Heartbeats / node monitoring Networking Stress
  • 25. • Upgrade mechanics • Careful dance? • Is rollback possible? • Blue/Green? Platform Lifecycle
  • 27. The Future Where do we take these systems from here?
  • 28. Maybe a few new abstractions? • Feed (Job with an input document) • Report (Job with an output document) • Request (Job with input and output documents) • Cron Job (run a periodic job) • User / User collection maybe? • Failure Injection Extensions
  • 29. Putting your money where your mouth is… Build confidence in complex distributed systems by injecting realistic failures and comparing operations against a steady state. • http://principlesofchaos.org • Failure mode and effects analysis • Netflix / SimianArmy Not much in the container space yet… Failure Injection
  • 30. An orchestration abstraction for failure injection Features: • Probabilistic failure injection policies • Failure modes • Latency, partition, GC pause, etc • Applied to existing containers filtered by label • An event stream • Notifications • Integrates with the Docker API Project: Entropy (PoC)
  • 31. Demo Slide $ entropy create --failure latency --frequency 10 --probability .50 --image allingeek/gremlins --criteria service=PingGoogle
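The idea behind a probabilistic policy applied to containers filtered by label can be sketched in a few lines. This is not Entropy's implementation: the container records are plain dicts rather than Docker API objects, and `select_targets` is a hypothetical helper. Only the `key=value` criteria format and the probability value mirror the demo command above.

```python
import random


def select_targets(containers, criteria, probability, rng):
    """Pick containers whose label matches `criteria`, each with the given probability."""
    key, _, value = criteria.partition("=")
    matching = [c for c in containers if c["labels"].get(key) == value]
    # Each matching container is chosen independently on every policy tick.
    return [c["name"] for c in matching if rng.random() < probability]


containers = [
    {"name": "ping-1", "labels": {"service": "PingGoogle"}},
    {"name": "ping-2", "labels": {"service": "PingGoogle"}},
    {"name": "db-1", "labels": {"service": "Database"}},
]
rng = random.Random(7)  # seeded so a run can be repeated
targets = select_targets(containers, "service=PingGoogle", 0.50, rng)
```

Running the selection once per period (the demo's `--frequency`) would give each labelled container a 50% chance of receiving the failure on each tick; applying the failure itself, such as added latency, is left out here.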