Planes, Raft, and Pods
A Tour of Distributed Systems Within Kubernetes
@boluptuous
?
“Open-source platform for automating
deployment, scaling, and operations of
application containers across clusters of
hosts, providing container-centric
infrastructure”
- Kubernetes Documentation
???????
Flexible platform for running
containerized apps!
● Autoscaling
● Rolling Deploys
● Secret Management
● Load Balancing
● Auto-Recovery from Failures
How does Kubernetes
leverage distributed
systems?
What is a container?
Cgroups + namespaces
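To make that concrete, here's a minimal, Linux-only Go sketch of the namespace half of the trick: launch a process inside fresh UTS, PID, and mount namespaces. Cgroup limits (memory, CPU) would be applied separately by writing to /sys/fs/cgroup. This is an illustration, not how any real container runtime is structured.

package main

import (
    "os"
    "os/exec"
    "syscall"
)

func main() {
    cmd := exec.Command("/bin/sh")
    cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
    // New UTS (hostname), PID, and mount namespaces isolate the
    // child process from the rest of the host.
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
    }
    if err := cmd.Run(); err != nil {
        panic(err)
    }
}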
Pod = 1 or more containers
Deployments manage pods
Kubernetes Architecture!
etcd!
Why etcd?
etcd is designed for “large scale distributed
systems… that never tolerate split brain
behavior and are willing to sacrifice
availability” to achieve it
- etcd Documentation
Simple interface hides
complex problems
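For example, the Go client makes etcd look like a plain key/value map. A hedged sketch (the endpoint and key below are made up, and the import path is the one in use around the time of this talk):

package main

import (
    "context"
    "fmt"
    "time"

    "github.com/coreos/etcd/clientv3"
)

func main() {
    cli, err := clientv3.New(clientv3.Config{
        Endpoints:   []string{"localhost:2379"},
        DialTimeout: 5 * time.Second,
    })
    if err != nil {
        panic(err)
    }
    defer cli.Close()

    ctx, cancel := context.WithTimeout(context.Background(), time.Second)
    defer cancel()

    // Looks like a simple map write...
    if _, err := cli.Put(ctx, "/registry/pods/default/hello", "spec"); err != nil {
        panic(err)
    }
    // ...but under the hood every Put is a Raft proposal that must be
    // replicated to a majority of the cluster before it is acknowledged.
    resp, err := cli.Get(ctx, "/registry/pods/default/hello")
    if err != nil {
        panic(err)
    }
    for _, kv := range resp.Kvs {
        fmt.Printf("%s -> %s\n", kv.Key, kv.Value)
    }
}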
Let’s look at a Not Raft system
New Scenario!
Consensus requires
coordination
Raft = a consensus algorithm
for managing a replicated log
Elected leader is put in
charge of managing the log
Three States!
● Leader
● Follower
● Candidate
One leader per term
Leader sends heartbeat
messages
What happens if a follower
doesn’t get a heartbeat?
Election time!
In the game of Raft leadership elections, you win or you lose.
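A hedged Go sketch of the follower's side of this: wait a randomized timeout for a heartbeat, and if none arrives, increment the term and stand as a candidate. All names here (node, heartbeatCh) are illustrative, not from any real implementation.

package main

import (
    "fmt"
    "math/rand"
    "time"
)

type state int

const (
    follower state = iota
    candidate
    leader
)

type node struct {
    state       state
    term        int
    heartbeatCh chan struct{} // signaled when a heartbeat arrives
}

func (n *node) run() {
    for n.state == follower {
        // Randomized timeouts make it unlikely that two followers
        // start competing elections at exactly the same moment.
        timeout := time.Duration(150+rand.Intn(150)) * time.Millisecond
        select {
        case <-n.heartbeatCh:
            // Leader is alive; reset the timer and keep following.
        case <-time.After(timeout):
            n.term++
            n.state = candidate
            fmt.Printf("no heartbeat: starting election for term %d\n", n.term)
            // A real candidate would now send RequestVote RPCs and
            // become leader on receiving a majority of votes.
        }
    }
}

func main() {
    n := &node{heartbeatCh: make(chan struct{})}
    go n.run()
    time.Sleep(500 * time.Millisecond) // no heartbeats ever arrive
}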
1. Write goes to the leader
2. Leader appends the command to its log
3. Leader tells the other servers via RPC to append it to their logs (followers say no if they're behind)
4. Once a majority have appended, the leader commits
5. Leader tells the other nodes the last committed entry in subsequent messages
6. Nodes commit (this commit rule is sketched below)
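A hedged Go sketch of step 4's commit rule: the leader advances its commit index once a majority of the cluster (leader included) holds an entry. This is illustrative pseudocode, not etcd's actual implementation.

package main

import "fmt"

type entry struct {
    term    int
    command string
}

type leaderState struct {
    log         []entry
    commitIndex int   // highest log index known to be committed
    matchIndex  []int // per follower: highest index replicated there
}

// maybeCommit advances commitIndex to the highest index that a
// majority of the cluster has replicated.
func (l *leaderState) maybeCommit() {
    for idx := len(l.log); idx > l.commitIndex; idx-- {
        replicas := 1 // the leader itself has the entry
        for _, m := range l.matchIndex {
            if m >= idx {
                replicas++
            }
        }
        if replicas > (len(l.matchIndex)+1)/2 {
            l.commitIndex = idx
            // Followers learn the new commitIndex in the next
            // AppendEntries (heartbeat) message and apply up to it.
            break
        }
    }
}

func main() {
    l := &leaderState{
        log:        []entry{{1, "set x=1"}, {1, "set x=2"}},
        matchIndex: []int{2, 1, 0, 0}, // four followers, five nodes total
    }
    l.maybeCommit()
    fmt.Println("committed through index", l.commitIndex) // 1: leader plus two followers have entry 1
}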
Solves problems in our bad system
Consistency and partition-tolerance
are achieved by requiring a
majority of nodes to act
Further Raft Reading
● The Raft Paper
● The Secret Lives of Data
(Raft Visualization)
Controller = a loop that watches cluster
state and makes changes to keep it
in the desired state
Replica set controller makes sure a
given number of pods is running
at all times
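A hedged Go sketch of that reconciliation loop: compare the observed pod count with the desired count, then create or delete pods to close the gap. The helper names (observedPods, desiredReplicas, createPod, deletePod) are placeholders, not real client-go calls.

package main

import (
    "fmt"
    "time"
)

var cluster = []string{"pod-a"} // stand-in for actual cluster state

func observedPods() int    { return len(cluster) }
func desiredReplicas() int { return 3 }
func createPod()           { cluster = append(cluster, fmt.Sprintf("pod-%d", len(cluster))) }
func deletePod()           { cluster = cluster[:len(cluster)-1] }

func main() {
    for i := 0; i < 5; i++ { // a real controller loops forever
        switch got, want := observedPods(), desiredReplicas(); {
        case got < want:
            createPod()
            fmt.Println("scaled up to", observedPods())
        case got > want:
            deletePod()
            fmt.Println("scaled down to", observedPods())
        default:
            fmt.Println("state reconciled")
        }
        time.Sleep(10 * time.Millisecond) // real controllers use watches, not polling
    }
}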
Deployment controller manages the
whole deployment process of your app
Scheduler watches for unscheduled pods and
assigns them to a given node
The Scheduling Algorithm
1. Filter out nodes that aren't desired or aren't a good fit
2. Rank the remaining nodes
3. Pick the top-ranked node
Step 1: Filter Against Predicates
● HostName
● MatchNodeSelector
● PodFitsHostPort
● PodFitsResources
● CheckNodeMemoryPressure
● CheckNodeDiskPressure
Ranking applies a series of
weighted priority functions
that return a score from 0 to
10 (least to most desirable)
Functions are run against
each node, their scores are
summed, and the node with the
highest total is the winner!
Some Ranking Functions
● LeastRequestedPriority
● BalancedResourceAllocation
● SelectorSpreadPriority
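Putting the two phases together, a hedged Go sketch of filter-then-rank scheduling. The predicate and priority functions below are simplified stand-ins for PodFitsResources, LeastRequestedPriority, and BalancedResourceAllocation, not the real implementations.

package main

import "fmt"

type node struct {
    name    string
    freeCPU int // percent of CPU still unrequested
    freeMem int // percent of memory still unrequested
}

// Predicate: does the node have room for a pod requesting 10/10?
func podFitsResources(n node) bool { return n.freeCPU >= 10 && n.freeMem >= 10 }

// Priority: favor the least-requested node, scored 0-10.
func leastRequested(n node) int { return (n.freeCPU + n.freeMem) / 20 }

// Priority: favor balanced CPU/memory usage, scored 0-10.
func balanced(n node) int {
    d := n.freeCPU - n.freeMem
    if d < 0 {
        d = -d
    }
    return 10 - d/10
}

func schedule(nodes []node) (best string, bestScore int) {
    for _, n := range nodes {
        if !podFitsResources(n) { // step 1: filter against predicates
            continue
        }
        score := leastRequested(n) + balanced(n) // step 2: rank (both weights 1)
        if score > bestScore {                   // step 3: pick the winner
            best, bestScore = n.name, score
        }
    }
    return best, bestScore
}

func main() {
    nodes := []node{
        {"node-1", 5, 90}, // filtered out: not enough free CPU
        {"node-2", 50, 60},
        {"node-3", 80, 70},
    }
    name, score := schedule(nodes)
    fmt.Printf("scheduling onto %s (score %d)\n", name, score)
}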
What happens when we
submit a deployment to
Kubernetes?
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: hello-world-deployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: tutum/hello-world
        ports:
        - containerPort: 80
How do we submit our deployment?
kubectl! (e.g. kubectl create -f deployment.yaml)
What We Expect
1. We create the deployment
2. The deployment creates a replica set
3. The replica set creates three pods
4. The scheduler schedules those three pods
5. The kubelet runs the scheduled pods
What actually happens...
These are REAL events
involvedObject:
  apiVersion: extensions
  kind: Deployment
  name: hello-world-deployment
  resourceVersion: "1097"
  uid: c8806e58-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:41Z
message: Scaled up replica set hello-world-deployment-3877114392 to 3
reason: ScalingReplicaSet
source:
  component: deployment-controller
type: Normal

involvedObject:
  apiVersion: extensions
  kind: ReplicaSet
  name: hello-world-deployment-3877114392
  namespace: default
  resourceVersion: "1098"
  uid: c8811ca5-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:41Z
message: 'Created pod: hello-world-deployment-3877114392-jk3ps'
reason: SuccessfulCreate
source:
  component: replicaset-controller
type: Normal

involvedObject:
  apiVersion: extensions
  kind: ReplicaSet
  name: hello-world-deployment-3877114392
  namespace: default
  resourceVersion: "1098"
  uid: c8811ca5-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:41Z
message: 'Created pod: hello-world-deployment-3877114392-nt62j'
reason: SuccessfulCreate
source:
  component: replicaset-controller
type: Normal

involvedObject:
  apiVersion: v1
  kind: Pod
  name: hello-world-deployment-3877114392-nt62j
  namespace: default
  resourceVersion: "1102"
  uid: c8833786-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:41Z
message: Successfully assigned hello-world-deployment-3877114392-nt62j to minikube
reason: Scheduled
source:
  component: default-scheduler
type: Normal

involvedObject:
  apiVersion: extensions
  kind: ReplicaSet
  name: hello-world-deployment-3877114392
  namespace: default
  resourceVersion: "1098"
  uid: c8811ca5-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:41Z
message: 'Created pod: hello-world-deployment-3877114392-c71lp'
reason: SuccessfulCreate
source:
  component: replicaset-controller
type: Normal

involvedObject:
  apiVersion: v1
  kind: Pod
  name: hello-world-deployment-3877114392-jk3ps
  namespace: default
  resourceVersion: "1103"
  uid: c88336ab-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:41Z
message: Successfully assigned hello-world-deployment-3877114392-jk3ps to minikube
reason: Scheduled
source:
  component: default-scheduler
type: Normal

involvedObject:
  apiVersion: v1
  kind: Pod
  name: hello-world-deployment-3877114392-c71lp
  namespace: default
  resourceVersion: "1104"
  uid: c8833ea2-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:41Z
message: Successfully assigned hello-world-deployment-3877114392-c71lp to minikube
reason: Scheduled
source:
  component: default-scheduler
type: Normal

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-c71lp
  namespace: default
  resourceVersion: "1111"
  uid: c8833ea2-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:42Z
message: pulling image "tutum/hello-world"
reason: Pulling
source:
  component: kubelet
  host: minikube
type: Normal

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-jk3ps
  namespace: default
  resourceVersion: "1109"
  uid: c88336ab-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:42Z
message: pulling image "tutum/hello-world"
reason: Pulling
source:
  component: kubelet
  host: minikube
type: Normal

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-nt62j
  namespace: default
  resourceVersion: "1106"
  uid: c8833786-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:42Z
message: pulling image "tutum/hello-world"
reason: Pulling
source:
  component: kubelet
  host: minikube
type: Normal

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-c71lp
  namespace: default
  resourceVersion: "1111"
  uid: c8833ea2-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:44Z
message: Successfully pulled image "tutum/hello-world"
reason: Pulled
source:
  component: kubelet
  host: minikube
type: Normal

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-c71lp
  namespace: default
  resourceVersion: "1111"
  uid: c8833ea2-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:44Z
message: Created container with id c71c2605bcb7ab52e0c4fc7e08545664c628dd8eb5ceea20eff5ccff4afb865d
reason: Created
source:
  component: kubelet
  host: minikube

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-c71lp
  namespace: default
  resourceVersion: "1111"
  uid: c8833ea2-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:44Z
message: Started container with id c71c2605bcb7ab52e0c4fc7e08545664c628dd8eb5ceea20eff5ccff4afb865d
reason: Started
source:
  component: kubelet
  host: minikube

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-jk3ps
  namespace: default
  resourceVersion: "1109"
  uid: c88336ab-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:45Z
message: Successfully pulled image "tutum/hello-world"
reason: Pulled
source:
  component: kubelet
  host: minikube

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-jk3ps
  namespace: default
  resourceVersion: "1109"
  uid: c88336ab-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:45Z
message: Created container with id 26cc7eff24538a09647a8a595d606c1988ca802a74d1930fdcb801aafc624075
reason: Created
source:
  component: kubelet
  host: minikube

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-jk3ps
  namespace: default
  resourceVersion: "1109"
  uid: c88336ab-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:45Z
message: Started container with id 26cc7eff24538a09647a8a595d606c1988ca802a74d1930fdcb801aafc624075
reason: Started
source:
  component: kubelet
  host: minikube

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-nt62j
  namespace: default
  resourceVersion: "1106"
  uid: c8833786-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:46Z
message: Successfully pulled image "tutum/hello-world"
reason: Pulled
source:
  component: kubelet
  host: minikube

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-nt62j
  namespace: default
  resourceVersion: "1106"
  uid: c8833786-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:46Z
message: Created container with id ef8a53303a35c539d7a16f2d633d7f7f6f70d42c6dc7f629da771a49a95a0c25
reason: Created
source:
  component: kubelet
  host: minikube

involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{hello-world}
  kind: Pod
  name: hello-world-deployment-3877114392-nt62j
  namespace: default
  resourceVersion: "1106"
  uid: c8833786-5371-11e7-9d84-0800278f5909
kind: Event
lastTimestamp: 2017-06-17T15:29:46Z
message: Started container with id ef8a53303a35c539d7a16f2d633d7f7f6f70d42c6dc7f629da771a49a95a0c25
reason: Started
source:
  component: kubelet
  host: minikube
We’ve got a running app...
...but no way to talk to it.
We’ll add a service!
kind: Service
apiVersion: v1
metadata:
  name: hello-world-service
spec:
  selector:
    app: hello-world
  ports:
  - protocol: TCP
    port: 80
  type: NodePort
We did it!
Things We’ve Done
● Looked at Kubernetes components
● Showed how Kubernetes handles distributed state
● Dove into how controllers reconcile state and the scheduler places pods
● Traced a deployment through the system
Questions?

Editor's Notes

  • #6: https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/
  • #9: These are pretty cool things, and most talks about Kubernetes concentrate on them. I'd like to do something a little different. Let's all agree that Kubernetes can do these things and that they're cool. I want to peek behind the curtain.
  • #12: https://jvns.ca/blog/2016/10/10/what-even-is-a-container/ Cgroups - specify limits for how much memory and CPU a process can use Namespace - stop process from interfering with other processes Containerized app examples - nginx deployment, microservice, Redis instance
  • #13: 1 or more containers that share a unique IP guaranteed to have all containers running on the same host can share disk volumes (ex. App that logs to file, sidecar to forward logs)
  • #14: Declarative way of managing pods What containers we want to run and how many instances Use replica sets under the hood - make sure # running in cluster is = # desired
  • #15: Etcd - distributed k/v store that serves as Kubernetes database API server - controls access to etcd - interacted with via REST calls Controller manager - background threads that handle routine cluster tasks such as making sure a deployment has enough pods deployed and triggers the scheduling of new ones if needed scheduler - schedules unscheduled pods, whenever there’s an unscheduled pod, the scheduler determines where the appropriate home for it is. These components collectively make up the control plane and run on the master hosts in the cluster. Kubelet - watches for pods that have been assigned to node and runs them, constantly polls API server + local config, cAdvisor - metrics collector Kube proxy - handles some networking stuff
  • #17: https://www.comp.nus.edu.sg/~gilbert/pubs/BrewersConjecture-SigAct.pdf Consistency - always reading the most recently written value Availability - every request to a non-failing node receives a response Partition-tolerance - system can handle the dropping of messages between nodes Etcd is consistent/partition-tolerant If in a 5 node cluster, 3 nodes take a dive, the other 2 stop responding to requests until the quorum is restored
  • #18: If your goal is to have a system for running containerized applications, you need something that plays nice with clustered systems
  • #19: CAP theorem reference - etcd is going to sacrifice availability to preserve consistency Clarification -> Etcd unavailability doesn’t take down k8s, just the ability to mutate
  • #20: Typical CRUD operations through rest calls or command line tool - etcdctl How do the nodes agree on what a value is for a given key? If not done smartly, things can go awry pretty quickly How does etcd agree on the value for a given key while still upholding its consistency guarantee?
  • #21: Method of achieving distributed consensus - having multiple distributed servers agree on what the value is for a given key in a fault tolerant manner Trivial way to solve - the value is always 3
  • #22: Typical CRUD operations through rest calls or command line tool - etcdctl How do the nodes agree on what a value is for a given key? If not done smartly, things can go awry pretty quickly How does etcd agree on the value for a given key while still upholding its consistency guarantee?
  • #23: Imagine a system that does NOT implement the Raft algorithm Three nodes, A, B, C that each store a single value When a client wants to read a value it contacts any node and gets its value When a client wants to write a value, it contacts any node, and that node tells all the other nodes of the new value
  • #26: Now contemplate what happens if three clients update the value in three different ways at the same time. At time t=0, 3 clients write 3 different values to our 3 nodes simultaneously
  • #27: At time t=5, after all updates have been applied, what’s the value, assuming no other writes have occurred? No idea! In fact, it’s possible for each node to have a different value depending on what order messages were received in
  • #31: All of the previous messages have been lost forever After the network recovers, the value at C is still going to be M until someone updates the value.
  • #32: Seemingly simple systems can fail in all sorts of Fun ways when exposed to concurrent operations In order for consensus to be achieved, we're going to require greater coordination between our nodes This is where Raft comes into play
  • #33: Replicated log - series of commands executed in order by a state machine We want each log to have the same commands in the same order so that each machine has the same state
  • #34: Accepts entries from clients, replicates them on other nodes, says when it’s safe to actually apply them If leader fails, a new one is elected
  • #35: Followers just hang out, respond to requests from leaders Leader handles all the requests and does all of the fun coordination Candidate is the state used to elect a new leader
  • #36: Part of dealing with ordered statements is dealing with time Raft uses an incrementing integer timestamp called a term, which is tied to an election Terms last until there's a new leader Serves as a logical clock and aids in detecting obsolete info (like stale leaders)
  • #37: Leaders send heartbeat messages to the follower nodes, saying “Hey I’m still alive”
  • #38: Follower increments term and declares itself as a candidate Sends an RPC request out for votes to all of the other servers
  • #39: Servers vote for the first candidate who asks for their votes, assuming the candidate is at least as up to date as the voter. If the voter has entries that the candidate doesn’t, then the voter isn’t going to vote for that candidate. If the candidate gets a majority of the votes, it wins! It sends out a new set of heartbeat messages to notify others of its victory, and life goes on
  • #41: Simultaneous write scenario solved by requirement that all writes go to leader Leader receives each write, and each write is a new entry with a new index -- can’t occur simultaneously Network partition scenario - followers reject requests if they’re behind and will eventually be caught up also, requests go to the leader and our dead node can’t become leader because it’s behind - effect is minimized
  • #42: Elections require a majority of the nodes to agree to elect a leader All writes require a majority of the nodes to replicate the transaction to be successful If we lose more than half the nodes, then we’re unable to do anything
  • #43: https://raft.github.io/ http://thesecretlivesofdata.com/raft/
  • #45: Kubernetes actually suggests you don't interact with replica sets directly; instead we use the deployment controller to manage them. A replica set knows which pods to manage through labels: we define labels on the pods we create and tell the replica set which labels it should look for so that it knows which pods to manage
  • #46: Provides a declarative way of managing pods and uses that to roll out your desired changes in a friendly way Handles rolling deploys, rolling back your application, and autoscaling it out as well
  • #47: What it’s looking for specifically are pods without a node name Constantly querying the API server (and by extension, etcd) looking for those unscheduled pods
  • #49: HostName - compares the node’s name to the node name specified in the pod spec (if any), excludes every node that doesn’t match, lets us schedule directly to that node, assuming it passes other predicates MatchNodeSelector - variant of HostName, Kubernetes let you provide label selectors to associate resources using labels, this predicate checks that the node selector supplied in the pod spec is valid for a given node Scheduler also checks the resources of each node PodFitsHostPort - whether or not the host port (a hard-coded port specified in the pod spec) is available for that node, if not available -> filtered out PodFitsResources - another resource-aware check, pods can request a given amount of CPU and memory; this predicate checks whether the node is capable of satisfying that request CheckNodeMemoryPressure and CheckNodeDiskPressure won’t schedule onto nodes whose memory usage or disk usage is too high Also volume checks - make sure that we’re not using more than allowed (for cloud providers) or there’s no conflicts on volume claims
  • #51: Ties are broken by randomly picking a winner
  • #52: Most obvious strategy is put our pod on the least used node - LeastRequestedPriority - calculates how much CPU + memory (equally weighted and added together) would be left after scheduling pod to that node - helps with balancing resource usage across cluster BalancedResourceAllocation - attempts to prevent nodes from being largely weighted towards CPU or memory usage - try to avoid nodes with 95% CPU usage and 5% memory usage Checks to see how the pod affects the balance of resource usage, favoring nodes that would have CPU utilization closer to memory utilization after scheduling Last one I want to look at is SelectorSpreadPriority Lose a great chunk of the benefits of having multiple copies of an application if they're all sitting on the same node. If the node goes down, you'll lose every copy of the app that's running on the node (insert metaphor about eggs and baskets) This function minimizes the number of pods from the same service or replica set on the same node, causing our currently-being-scheduled pods to favor nodes without their managed siblings
  • #54: Specify we want three instances We’re going to label it with the tag ‘app:hello-world’ Going to launch the hello-world container, when we hit port 80 it’ll return a hello world!
  • #55: We use kubectl to actually get our app running in Kubernetes Kubectl is how we interact with the cluster - can view resources as well as modify them Gives a wonderful view of what's going on with our cluster We can check the status of our pods, our deployments, and importantly for us right now, use it to construct a timeline of events
  • #58: Though they’ve been formatted to fit the screen
  • #80: A service defines a logical set of pods and a policy by which to access them - load balancer, port exposed on the host, external IP
  • #81: Defines a service that “selects” pods with the app tag with a value of hello-world Exposes the service as a port on the node Can't do a load balancer because that requires a cloud provider, otherwise we could create an Amazon ELB or Google Cloud's equivalent We'll also create this via kubectl