Pull, don’t push!
Architectures for monitoring and configuration in a microservices era
Julian Dunn, Director of Product Marketing, Chef
@julian_dunn
Fletcher Nichol, Senior Software Development Engineer, Chef
@fnichol
Pull, Don't Push! Sensu Summit 2018 Talk
• Modular, self-contained, pre-fabricated components
• Neighbors share components
• Complex shares services as a whole
Orchestration
An ordered set of operations
Across a set of independent machines
Connected to an orchestrator only via a network.
Humans acting on Microsoft Visio acting on machines
Humans acting on code acting on machines
An ordered set of operations
Defined in code
Across a set of independent machines
Connected to an orchestrator only via a network.
mylaptop:~$ ./disable-load-balancer.sh
mylaptop:~$ ssh db01 do-database-migration.sh
mylaptop:~$ for i in app01 app02; do
> ssh $i do-deployment.sh
> done
mylaptop:~$ ./enable-load-balancer.sh
Problems with Orchestration
• Resilience: Deployment, Operational
• Scalability: Technical, Cognitive
Deployment Resilience
for i in app01 app02 app03; do
  do-deploy.sh --server "$i"
done
Deployment Resilience
for i in app01 app02 app03; do
  # stop at the first server whose deploy fails
  if ! do-deploy.sh --server "$i"; then
    failed=$i
    break
  fi
done
# what goes down here?
# roll back $failed?
# roll back all others?
# ignore it?
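One possible answer to the questions on this slide, as a rough sketch only. It assumes a "roll back everything already deployed, newest first" policy, which is just one of the options listed. `deploy_one` and `rollback_one` are hypothetical stand-ins for `do-deploy.sh` and a rollback script that the talk does not show; the stub deploy fails on app02 so the failure path runs.

```shell
#!/bin/sh
# Hypothetical stand-ins for do-deploy.sh and a rollback script (not from the talk).
# deploy_one fails for app02 so that the failure path is exercised.
deploy_one() { [ "$1" != "app02" ]; }
rollback_one() { rolled_back="$rolled_back $1"; }

deployed=""      # servers that deployed successfully, newest first
failed=""        # first server whose deploy failed
rolled_back=""   # servers rolled back, in the order it happened

for i in app01 app02 app03; do
  if deploy_one "$i"; then
    deployed="$i $deployed"
  else
    failed=$i
    break
  fi
done

# Policy (an assumption): undo every successful deploy, newest first.
if [ -n "$failed" ]; then
  for i in $deployed; do
    rollback_one "$i"
  done
  echo "deploy failed on $failed; rolled back:$rolled_back"
fi
```

Even this small policy leaves open what happens if a rollback itself fails or the network drops mid-loop, which is the kind of growing failure handling the slide is pointing at.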
Operational Resilience
Orchestration Backplane – must be up at all times!
Application Plane – delegated resilience to the backplane
Operational Resilience
Orchestration Backplane
Application Plane
Orchestration Backplane
Cognitive Scalability
Technical Scalability
[Chart: computing eras in order: Mainframes, Time Sharing, Client/Server, Web 1.0, Web 2.0, Cloud, Internet of Things, Edge. Axis labels: Time; Centralized to Distributed.]
The Future Is Distributed
Distributed Devices Need Distributed Management
• Adaptive Learning
• Configuration Updates
• Software Updates
Distributed, Autonomous Systems
Make progress towards promised desired state
Expose interfaces to allow others to verify promises
Can promise to take certain behaviors in the face of failure of others
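The first two properties can be pictured with a toy loop. This is only an illustration, not how Habitat or Sensu implement it: the "state" is a plain number held in shell variables, and the names `converge_step` and `verify` are made up for the sketch.

```shell
#!/bin/sh
# Illustrative only: desired and actual state are plain version numbers.
desired=3
actual=1

# One convergence step: move actual state one unit toward desired state.
converge_step() {
  if [ "$actual" -lt "$desired" ]; then
    actual=$((actual + 1))
  fi
}

# Interface that others can call to verify the promise is being kept.
verify() { [ "$actual" -eq "$desired" ]; }

steps=0
until verify; do
  converge_step
  steps=$((steps + 1))
done
echo "converged to $actual after $steps steps"
```

The actor decides for itself when to take a step; an outside party only needs `verify`, not control of the loop.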
The Design of Sensu and The Design of Habitat
The Design of Sensu vs. Traditional “Monitoring”
[Diagram, traditional: the Nagios master (1) polls (orchestrates) Agent 1 and Agent 2, which then (2) run checks.]
[Diagram, Sensu: Agent 1 and Agent 2 (1) run checks, then (2) post data to the Sensu Backend.]
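A minimal sketch of the agent-driven flow, under loud assumptions: the "backend" is just a temp file that agents append to, `check_disk` and `agent_run` are invented names, and none of this is Sensu's actual API. The exit value 2 for a failed check follows the common Nagios-style convention (0 = OK, 2 = critical).

```shell
#!/bin/sh
# The "backend" is only a file that agents append to; a real backend
# would be a network service. All names here are illustrative.
BACKEND_LOG=$(mktemp)

# A check is any command; its exit status is the result (0 = OK).
check_disk() { [ "$1" -lt 90 ]; }

# Agent side: run the check locally, then report the outcome itself.
agent_run() {
  agent=$1
  disk_used_pct=$2
  if check_disk "$disk_used_pct"; then
    result=0
  else
    result=2   # critical, Nagios-style convention
  fi
  echo "$agent check_disk $result" >> "$BACKEND_LOG"
}

agent_run agent1 42
agent_run agent2 95
cat "$BACKEND_LOG"
```

Nothing in the sketch requires the backend to know the agents in advance or to reach them over the network; each agent decides when to run and where to report.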
Habitat supervisor in a nutshell
• Network-connected supervision system
• Like systemd+consul/etcd (process supervision with lifecycle hooks + shared state for reactive realtime change management)
• Eventually-consistent global state using SWIM masterless (peer-to-peer) membership protocol
[Diagram: three sensu-backend services, each running under its own hab-sup, form the service group backend.default. A sensu-agent running under hab-sup in the service group agent.default is started with --bind sensu:backend.default.]
Resolve symbol “sensu” in configs to properties of service group backend.default
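To make the bind idea concrete, here is a toy resolution step. Everything in it is an assumption made for illustration: the member addresses are made up, `group_members` is a hypothetical lookup, and the `@name@` placeholder syntax is invented for this sketch. Habitat's real templating is not shown on the slide and is not reproduced here.

```shell
#!/bin/sh
# Hypothetical data: members of service group backend.default as the
# supervisor ring might report them. Addresses are made up.
group_members() {
  case $1 in
    backend.default) echo "10.0.0.1 10.0.0.2 10.0.0.3" ;;
  esac
}

# Split a bind of the form name:group, as in "--bind sensu:backend.default".
bind="sensu:backend.default"
bind_name=${bind%%:*}
bind_group=${bind#*:}

# Render: replace the placeholder @sensu@ with the group's member list.
# The @name@ placeholder syntax is invented for this sketch.
template='backend-url = @sensu@'
members=$(group_members "$bind_group")
rendered=$(printf '%s\n' "$template" | sed "s/@$bind_name@/$members/")
echo "$rendered"
```

The agent's config only ever mentions the symbol; which machines sit behind it is looked up at render time, so the group can change without the config template changing.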
Let’s See it in Action!
Demo: Sensu running under Habitat
• Modern architectures demand a choreographed rather than an orchestrated approach
• At scale, fleet management and cognitive complexity are the biggest problems
• Habitat and Sensu are both examples of edge-centric, autonomous actor systems, and they work well together
😺

Editor's Notes

  • #2: Fletcher and I were part of the original team that launched Habitat by Chef in 2016; I was the product manager and Fletcher was one of the lead engineers. We both have technical backgrounds, except that we do different jobs now. Fletcher’s computer boots into Linux and mine boots into PowerPoint.
  • #3: So this is a talk about architecture and systems design, and if we’re going to talk about architecture maybe a good way to think about good architecture is via, well, actual architecture. One of the most famous buildings in the world is the Habitat 67 complex in Montreal, built, as you can see, for Expo 67, which was Canada’s 100th anniversary. Shout out, by the way, to the Canadians in the room, including Sean Porter, Sensu’s CTO; Fletcher and I are both Canadians so we have to make a pitch for the Great White North anytime we're up here. Universal health care! One year of paid maternity leave! Super-hot prime minister! OK, that's enough of that. Anyway, Habitat 67 was such an iconic building that Canada Post put it on the stamp for Canada’s 150th anniversary last year.
  • #4: Here’s another picture, in its full glory. They probably would have used actual shipping containers today, but remember, TEU (standardized) containerization didn’t arrive until the late 1960s. But the components were standardized, as you can see from the middle versus the right.
      • One unit’s roof is the neighbor’s garden
      • Shopping, schools, and common services are built into the ground floor of each complex
    These things sound a lot like software architectural principles:
      • Every component is responsible for its own resiliency (like Bezos’ infamous memo)
      • Components declare peer-to-peer level dependencies
      • All components share a base substrate of services and management (e.g. deployment, monitoring, observability)
  • #5: The Habitat 67 complex is actually quite large
  • #6: I wanted to put the big pictures up of Habitat 67 because, well, architecture starts to look a lot like architecture, right? These are visual diagrams (probably several years old) of microservice architectures at Amazon and Netflix. When you have complex systems this big, there are architectural patterns you’ll need to put in place to deal with them, because when you get to something big and complex, your issue isn’t adding more to it; your issue becomes how to manage it. Today’s talk is really about how you design complex systems so that you can _manage_ them. It’s better to design systems with these characteristics built in up front than to try to bolt them on later.
  • #7: Which brings me to the patterns of management for complex systems. Traditionally, and in many scenarios still, we try to manage things using a centralized approach, which I call “orchestration”. So does everyone else, unfortunately, so let me define what I mean by this.
  • #9: IBM Cloud Orchestrator, HP Operations Orchestration, VMware vRealize Orchestrator
  • #11: But since I’m in the orchestration track I’d better try to define it so that I actually have a talk, right? Here is the definition I'll be using for the rest of the talk. And then I’m still going to tell you how and why that breaks down.
  • #12: This is a trivial example of orchestration. Last year I said I at least hope you’re doing your orchestration in code, if you’re doing orchestration, because this is pretty awful. And as you can see, it causes downtime because you need to wait for the previous thing to complete before you can proceed with the next one. You can add more fancy error checking and branching to orchestration to try and handle no-downtime deploys, but that orchestration gets really complicated – more complexity means more error conditions means more things that need to be handled.
  • #13: Resilience (Deployment, Operational); Scalability (Technical, Cognitive)
  • #14: Treating machines all connected via an unreliable network as an atomic unit to which updates must be applied in full, or not at all. This *used* to work when you had a small fleet and/or your network was mostly reliable (e.g. on a LAN); not so good in a cloud.
  • #15: An atomic set that is assumed to succeed as a whole or not. What happens when it doesn't? A lot of complexity in failure conditions that need to be encapsulated and dealt with. Or more commonly, the approach is to drop this all off on the operator's lap and have them deal with it.
  • #16: Modern orchestration systems try to get around this fundamental issue by creating more disposability and just throwing away larger and larger parts of the infrastructure. The theory goes, let’s get the exact right “new” setup first, and then cut over to it. The problem is that while this mostly works, it is an incredibly complicated and slow way to make changes – you’re saying that for every config change or deployment I have to stand up a whole new production environment and cut over everything to it? For example, how do I do things like quiesce writes to a database? I think this creates more complexity even though the interfaces seem really attractive.
  • #17: Orchestration systems treat application components as dumb entities to be scheduled. Those entities don’t know about each other except through the orchestration system. This means that if components fail, they depend on the orchestration backplane (and here I’m picking on Kubernetes again) to manage their lifecycle. They also depend on the orchestration backplane to tell them where the other entities are (like where the database server is, if I’m the app server). The apps themselves are deliberately kept in the dark about their execution context.
  • #18: Now remember, we’re running in the cloud now – a place where machines and networks can go down at any time. And we’re trying to build reliable applications on top of that unreliable fabric.
  • #19: Now who does such a system design benefit? It only benefits the person or organization that is running the orchestration backplane, that is, if it’s external to the unreliable vagaries of the “cloud”. In other words, if it’s, say, a hosted service provided by your cloud vendor. Kubernetes and other orchestration systems soften you up for that approach so that when you run into the inherent resilience limitations, you outsource. Therefore I believe Google never intended that you run a Kubernetes cluster on your own, but rather that you buy it from someone (hopefully them) as a managed service. And don’t get me wrong, it’s an amazing business model, and, if you can offer your developers an experience on top of all this that’s just “push a container and it runs”, then that’s great. This is why there has been this Cambrian explosion of hosted Kubernetes solutions: these vendors know that this architectural model locks you into building applications on their platform. When your app is operationally dumb and the backplane is operationally smart, they have your money forever.
  • #20: I don’t have that much to say about this one other than that orchestration systems or operations become really difficult to understand the more entities you’re trying to address. In particular because an orchestration activity (“play”) is intended to run to completion, atomically, trying to debug failures halfway through and figure out what to do is really hard. When things go wrong, it’s easier for the human brain to try and understand a small part of the system – where the fault is – rather than the entire global state. We know this with computer programming (“locality of reference”) and that’s why we have techniques like “information hiding” (i.e. abstracting logic).
  • #22: We used to show this slide as part of old Opscode training materials when I first started at Chef. I’m sure you’ve seen slides like this before, where we talk about the # of nodes running applications, etc, and how they grow over time. While this is all true, I think these graphs neglect one key thing, which is not that the *quantity* of machines increases over time, but the fact that systems as a whole tend towards becoming more *distributed*. By "distributed" I mean that more of the computing runs at the "edge" if you will and not in a centralized way.
  • #23: It’s not a straight line, though. <Talk through the build> Cloud: ML, databases, etc.; now starting to centralize more stuff into the cloud. The more that our systems become distributed, the less a centralized approach makes sense. This is true not only for data processing (why can’t it happen at the edge?), but also for configuration updates and even software upgrades.
  • #24: https://medium.com/@timanglade/how-hbos-silicon-valley-built-not-hotdog-with-mobile-tensorflow-keras-react-native-ef03260747f3 Built with TensorFlow, Keras, and React Native. The first version was centralized: too much latency. So the final version runs an entire neural network on your phone.
  • #25: Nike HyperAdapt shoe
      • Number of devices continues to increase
      • Machine Learning, Analytics, AI
      • Latency becomes currency
      • At-scale problems will re-emerge just like they did with Client/Server and the Web
      • Distributed devices need distributed management
  • #26: Sounds a lot like where we started, with convergent configuration management and this guy, right? Everything old is new again.
  • #29: Using SWIM rather than something like Raft, because SWIM is masterless
  • #30: This slide will be a build to show some of Habitat’s terminology, specifically:
      • Service group
          • Contains one or more entities that share a configuration template and run the same workload
          • Leaders and followers are in the same group
          • Has a name
      • Supervisors are responsible for [re-]writing the configuration of the workload and restarting the process, possibly in coordination with other supervisors in that group
      • Supervisors have a REST interface that allows you to modify their config (inject new configs as rumors into the network and they will be propagated; you can use any authorized supervisor as an entrypoint, it doesn’t have to be in the group we care about)
      • External service groups can be subscribed to the configuration of this service group using binding
    Talk about the communication protocol across the fleet: SWIM membership protocol/failure detector, with a gossip layer on top for distributed consensus. We get asked a lot of questions about the protocol: it is an implementation of SWIM+Infection+Suspicion for membership, and a ZeroMQ-based, newscast-inspired gossip protocol.
    Goals:
      • Eventually consistent. Over a long enough time horizon, every living member will converge on the same state.
      • Reasonably efficient. The protocol avoids any back-chatter; messages are sent but never confirmed.
      • Reliable. As a building block, it should be safe and reliable to use.
  • #31: Config changes: injected into any peer, ACL is checked, and if accepted, gossiped around the network. No SPOF.
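
A toy run of rumor propagation, to show why any peer can serve as the entry point. This is a sketch under stated assumptions, not Habitat's protocol: there are 8 numbered peers, the rumor is injected at peer 3, and each informed peer tells exactly one other peer per round. Real SWIM-style gossip picks targets at random; a fixed per-round offset is used here only so the run is reproducible. ACL checks and failure detection are left out.

```shell
#!/bin/sh
# Deterministic toy gossip: 8 peers, rumor injected at peer 3.
n=8
informed=" 3 "   # space-delimited set of peers that hold the rumor
round=0

# Count the words passed as arguments (used to size the informed set).
count() { echo $#; }

while [ "$(count $informed)" -lt "$n" ]; do
  round=$((round + 1))
  next=$informed
  # Every peer informed at the start of the round tells one other peer.
  for p in $informed; do
    target=$(( (p + round) % n ))
    case $next in
      *" $target "*) ;;              # target already has the rumor
      *) next="$next$target " ;;     # target learns the rumor
    esac
  done
  informed=$next
done
echo "all $n peers informed after $round rounds"
```

No peer is special in this loop: starting from a different peer changes which peers hear the rumor first, but there is no single node whose loss stops propagation.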