
Containerization and

Micro Services

CCA3010

Module 2
Syllabus
Working with Kubernetes: Cluster Architecture, the costs of self-hosting
Kubernetes, Managed Kubernetes services, Turnkey Kubernetes solutions,
Kubernetes installers, Clusterless Container Services, Deployments of Kubernetes,
Pods, Replica Sets, Maintaining Desired State, The Kubernetes Scheduler, Resource
Manifest in YAML format, Kubernetes Package Manager, Kubernetes Volume
Management, Submitting Jobs to Kubernetes.
Architecture of Kubernetes
• Node
• Cluster
• Control Plane (Master)
• API Server
• etcd
• Scheduler
• Controller Manager
• Node Components
• Kubelet
• Container runtime
• Kube Proxy
• Pods
• Replication Controller/Replica Set
• Service
• Volume
• Namespace
• Label and Selector
• Config Map and Secret
Architecture of Kubernetes
• We create a manifest (.yml)

• Apply it to the cluster (to the master) to bring the cluster to the desired state.

• Pods run on nodes, which are controlled by the master.

Architecture of Kubernetes
Master Node

• A Kubernetes cluster contains containers running on bare metal, VM instances, cloud
instances, or any mix of these.

• Kubernetes designates one or more of these machines as masters and all others as workers.

• The master runs a set of K8s processes that ensure the smooth functioning of the cluster.
These processes are collectively called the "Control Plane".

• There can be multiple masters for high availability.

• The master runs the control plane to keep the cluster running smoothly.


Architecture of Kubernetes
Components of Control Plane (Master)
• 1. Kube-API Server (for all communication)
• The API server interacts directly with the user (i.e., we apply a .yml or JSON manifest to the kube-apiserver).
• The kube-apiserver is designed to scale automatically as per load.
• The kube-apiserver is the front end of the control plane.

• 2. Controller Manager - makes sure that the actual state of the cluster matches the desired state.

• There are two possible choices for the controller manager:

• If K8s runs on a cloud, it will be the cloud-controller-manager.
• If K8s runs outside a cloud, it will be the kube-controller-manager.
Architecture of Kubernetes
• 3. Kube-Scheduler - When a user makes a request for the creation and management of pods,
the kube-scheduler acts on these requests.

• Handles pod creation and placement.

• The kube-scheduler matches/assigns a node on which to create and run each pod.

• The scheduler watches for newly created pods that have no node assigned; for every pod it
discovers, the scheduler becomes responsible for finding the best node for that pod to run
on.
• The scheduler reads hardware configuration from configuration files and schedules pods
onto nodes accordingly.
Architecture of Kubernetes
• 4. etcd
• Stores metadata and the status of the cluster.
• etcd is a consistent and highly available key-value store.
• Source of truth for cluster state (info about the state of the cluster).
• etcd has the following features:
• Fully replicated - the entire state is available on every node in the cluster.
• Secure - implements automatic TLS with optional client certificate authentication.
• Fast - benchmarked at 10,000 writes per second.

• Components on masters that run controllers:

• Node controller - checks with the cloud provider to determine whether a node has been deleted in the
cloud after it stops responding.
• Route controller - responsible for setting up network routes on your cloud.
• Service controller - responsible for creating load balancers on your cloud for Services of type
LoadBalancer.
• Volume controller - creates, attaches, and mounts volumes, interacting with the cloud provider to
orchestrate volumes.
Architecture of Kubernetes
Node
• A node runs 3 important pieces of software/processes:

• 1. Kubelet
• Agent running on the node.
• Listens to the Kubernetes master (e.g., for pod creation requests).
• Uses port 10255.
• Sends success/failure reports to the master.
• 2. Container engine
• Works with the kubelet.
• Pulls images.
• Starts/stops containers.
• Exposes containers on the ports specified in the manifest.
Architecture of Kubernetes
• 3. Kube-proxy

• Runs on each node and maintains the network rules that route traffic to pods.

• Each pod gets its own unique (dynamic) IP address; kube-proxy's rules ensure that traffic addressed
to a Service reaches the right pod IPs.

These 3 components collectively make up a "node".


Architecture of Kubernetes
Pods
• The smallest deployable unit in Kubernetes.
• A pod is a group of one or more containers that are deployed together on the same host.
• A cluster is a group of nodes.
• A cluster has at least one worker node and one master node.
• In Kubernetes, the unit of control is the pod, not the container.
• A pod consists of one or more tightly coupled containers.
• Pods run on nodes, which are controlled by the master.
• Kubernetes only knows about pods; it does not manage individual containers.
• You cannot start a container without a pod.
• One pod usually contains one container.
Architecture of Kubernetes
Multi-container Pods
• Share the same network namespace (and can share IPC/memory space)
• Connect to each other using localhost:<container port>
• Share access to the same volumes
• Containers within a pod are deployed in an all-or-nothing manner
• The entire pod is hosted on the same node (the scheduler decides which node)
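The points above can be sketched as a minimal two-container pod (the names and images are illustrative, not from the source): one container writes content into a shared emptyDir volume, while an nginx container serves it; the two can also reach each other over localhost.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-container-pod        # hypothetical name
spec:
  volumes:
  - name: shared-data
    emptyDir: {}                 # scratch space shared by both containers
  containers:
  - name: web
    image: nginx:latest
    ports:
    - containerPort: 80
    volumeMounts:
    - name: shared-data
      mountPath: /usr/share/nginx/html
  - name: content-writer
    image: busybox:latest
    command: ["sh", "-c", "echo hello > /data/index.html && sleep 3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
```

Because both containers share the pod's network namespace, the writer container could also reach the web server at localhost:80.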
Architecture of Kubernetes
Pod Limitations

• No auto-healing or auto-scaling on their own

• If a pod crashes, nothing restarts it unless a controller (such as a ReplicaSet) manages it
Types of Kubernetes
1.Self-Hosted Kubernetes:
As discussed earlier, this involves manually setting up and managing all the components of a
Kubernetes cluster, including the control plane and worker nodes, on your own infrastructure.
2.Managed Kubernetes Services:
Cloud providers offer managed Kubernetes services that abstract much of the operational
complexity. Examples include:
1.Amazon EKS (Elastic Kubernetes Service): Managed Kubernetes service on AWS.
2.Google Kubernetes Engine (GKE): Managed Kubernetes service on Google Cloud.
3.Azure Kubernetes Service (AKS): Managed Kubernetes service on Microsoft Azure.
3.On-Premises Kubernetes:
Deploying Kubernetes in an on-premises environment, typically within a private data center.
This allows organizations to have full control over their infrastructure but requires managing the
hardware and networking components.
Types of Kubernetes
4. Bare-Metal Kubernetes:
Running Kubernetes directly on physical servers without an underlying virtualization layer. This
approach is chosen when organizations want to maximize resource utilization and avoid the overhead
of virtualization.
The costs of self-hosting Kubernetes
• Self-hosting Kubernetes can involve various costs, both direct and indirect. It's essential to
consider these factors when planning to deploy and maintain a Kubernetes cluster on your own
infrastructure. Here are some key cost considerations:

1.Hardware Costs:
1. Servers/Nodes: You'll need physical or virtual machines to act as Kubernetes nodes. The
number and specifications of these nodes will depend on your workload and performance
requirements.
2. Storage: Depending on your storage needs, you may incur costs for local or network-attached
storage.

2.Networking Costs:
1. Bandwidth: Data transfer between nodes, as well as communication with external services,
may incur network costs. Ensure your network infrastructure can handle the traffic.
The costs of self-hosting Kubernetes
3. Software Costs:
1. Kubernetes Distribution: Some distributions may have licensing costs or support fees.
Examples include Red Hat OpenShift, VMware Tanzu, and others.
2. Container Runtimes: Consider the container runtime you use (Docker, containerd, etc.) and
any associated costs.
4. Monitoring and Logging:
Implementing monitoring and logging solutions can incur costs. Tools like Grafana, ELK stack,
etc., might have associated expenses.
5. Security:
Security tools, such as vulnerability scanners or network security solutions, may have costs
associated with them.
6. Human Resources:
Employing or training staff to manage, monitor, and troubleshoot the Kubernetes cluster will
contribute to the overall cost.
The costs of self-hosting Kubernetes
7. Training and Certification:
If your team needs training or certification to effectively manage Kubernetes, consider the
associated costs.
8. Backup and Disaster Recovery:
Implementing backup and disaster recovery solutions may have costs, including storage costs for
backups.
9. Upgrades and Maintenance:
Ongoing maintenance, updates, and upgrades might require time and resources.
10. Power and Cooling:
Running and cooling the physical infrastructure or virtualization hosts will contribute to
operational costs.
11. Facility Costs:
If you are using a data center, there will be associated costs for space, power, and other facilities.
The costs of self-hosting Kubernetes
12. Scaling Costs:
As your workload grows, you may need to scale your infrastructure, which incurs additional
costs.
13. Legal and Compliance:
Costs associated with ensuring compliance with data protection laws, industry regulations, etc.

In some cases, using managed Kubernetes services from cloud providers might be a cost-effective
alternative, especially for smaller organizations or those without dedicated infrastructure expertise.
Managed Kubernetes Services
Managed Kubernetes services are offerings provided by cloud service providers to simplify the
deployment, management, and scaling of Kubernetes clusters. These services abstract much of the
operational complexity, allowing users to focus more on deploying and managing applications rather
than the underlying infrastructure.
1.Amazon EKS (Elastic Kubernetes Service):
Amazon EKS is a fully managed Kubernetes service provided by Amazon Web Services (AWS).
It simplifies the process of deploying, managing, and scaling containerized applications using
Kubernetes. EKS integrates with other AWS services and provides features such as automatic
updates and seamless integration with AWS Identity and Access Management (IAM).
2.Google Kubernetes Engine (GKE):
Google Kubernetes Engine is a managed Kubernetes service offered by Google Cloud. GKE
provides features like automated scaling, monitoring, and seamless integration with other Google
Cloud services. It is optimized for Google Cloud's infrastructure and supports features such as
auto-upgrades.
Managed Kubernetes Services
3. Azure Kubernetes Service (AKS):
Azure Kubernetes Service is a managed Kubernetes offering from Microsoft Azure. It simplifies
the deployment and management of containerized applications using Kubernetes. AKS integrates
with Azure services and provides features like auto-scaling, monitoring, and seamless integration
with Azure Active Directory.
4. IBM Cloud Kubernetes Service:
IBM Cloud offers a managed Kubernetes service that allows users to deploy, manage, and scale
containerized applications using Kubernetes. It supports features like automated updates,
monitoring, and integration with other IBM Cloud services.
5. Oracle Cloud Infrastructure Container Engine for Kubernetes (OKE):
Oracle Cloud provides a managed Kubernetes service known as Oracle Cloud Infrastructure
Container Engine for Kubernetes. It enables users to deploy, manage, and scale containerized
applications on Oracle Cloud Infrastructure. OKE supports features like automated updates,
monitoring, and integration with Oracle Cloud services.
These managed Kubernetes services aim to reduce the operational overhead associated with
running Kubernetes clusters.
Turnkey Kubernetes solutions
• Turnkey Kubernetes solutions are pre-packaged and easy-to-deploy Kubernetes distributions that
streamline the process of setting up and managing Kubernetes clusters.
• These solutions are designed to simplify the complexities of deploying and maintaining
Kubernetes, making it more accessible to users who may not have extensive expertise in
Kubernetes administration.
• Turnkey solutions often come with integrated tools, configurations, and automation to expedite the
deployment process.
• Popular turnkey Kubernetes solutions:
• Rancher
• KubeSphere
• K3s
• MicroK8s (developed by Canonical)
• Minikube: a tool for running Kubernetes clusters locally on a developer's machine. While not suitable
for production use, Minikube provides a quick and convenient way for developers to set up and experiment with
Kubernetes in their local environment.
• KubeVirt
• OpenShift (developed by Red Hat)
Kubernetes installers
1. kubeadm:
Kubeadm is a command-line tool that helps you bootstrap Kubernetes clusters.
It is part of the official Kubernetes project and is widely used for setting up clusters manually.
Kubeadm is suitable for users who want more control over the installation process.

2. Minikube:
Minikube is a tool that allows you to run a single-node Kubernetes cluster on your local machine.
It's useful for development and testing purposes, providing a lightweight and easy-to-use Kubernetes
environment.

3. Kops (Kubernetes Operations):


Kops is a command-line tool for creating, upgrading, and managing production-grade Kubernetes
clusters, primarily on AWS (Amazon Web Services).
Support for other providers, such as GCP (Google Cloud Platform) and Azure, exists but is less
mature.
Kubernetes installers
4. kubespray:
Kubespray is an Ansible-based Kubernetes deployment tool.
It allows you to deploy a production-ready Kubernetes cluster on various platforms, including bare metal,
virtual machines, or cloud providers.
Kubespray supports multiple Linux distributions and cloud platforms.
5. RKE (Rancher Kubernetes Engine):
RKE is a Kubernetes distribution that is part of the Rancher ecosystem.
It is an installer that simplifies the deployment of Kubernetes clusters, offering a declarative configuration
approach.
6. EKS (Amazon Elastic Kubernetes Service):
If you are using AWS, you can use EKS, a managed Kubernetes service that simplifies the deployment,
management, and scaling of Kubernetes clusters on AWS.
7. GKE (Google Kubernetes Engine):
Similar to EKS, GKE is a managed Kubernetes service provided by Google Cloud, making it easy to
deploy and manage Kubernetes clusters on Google Cloud Platform.
Choose the installer that best fits your requirements, infrastructure, and level of control needed for your
Kubernetes cluster.
Clusterless Container Services
A clusterless environment refers to a computing environment where the traditional concept of a
cluster is minimized or not present. In a clusterless setup, there is typically less emphasis on
managing a group of interconnected nodes working together as a single entity. Instead, the focus may
be on more lightweight, serverless, or decentralized approaches.

1.AWS Fargate:
AWS Fargate is a serverless container orchestration service provided by Amazon Web Services
(AWS). With Fargate, you don't need to manage the underlying infrastructure or clusters. You
can run containers directly without provisioning or configuring virtual machines.

2.Azure Container Instances (ACI):


Azure Container Instances is a serverless container service on Microsoft Azure. It allows you to
run containers without managing the underlying infrastructure. ACI abstracts away the cluster
management aspect, making it a clusterless option for running containers.
Clusterless Container Services
3. Google Cloud Run:
Google Cloud Run is a fully managed compute platform that automatically scales your
containerized applications. It abstracts away the infrastructure and allows you to deploy and run
containerized applications without managing clusters. It is suitable for stateless, HTTP-driven
containers.
4. Docker Swarm (Mode):
Docker Swarm is a native clustering and orchestration solution for Docker. While it does involve
managing nodes in a cluster, it's generally considered simpler and more lightweight than
solutions like Kubernetes.
Deployments of Kubernetes
1. Choose a Deployment Method
• Manual Deployment with kubeadm
• Minikube for Local Development
• Managed Kubernetes Services

2. Prepare Infrastructure
• Ensure that the infrastructure meets the Kubernetes system requirements.
• Allocate sufficient resources for nodes, including CPU, memory, and storage.

3. Install Container runtime


• Choose a container runtime like Docker, containerd, or CRI-O.
• Install the container runtime on all nodes in the cluster.
Deployments of Kubernetes
4. Configure kubectl:
• Configure kubectl on your local machine to communicate with the Kubernetes cluster.
• Set the kubeconfig file with the necessary credentials.

5. Join Worker Nodes:


• For manual deployments, execute the kubeadm join command on each worker node.
• For automated tools, follow their specific procedures to add worker nodes.

6. Network Setup:
• Choose a networking solution compatible with your Kubernetes deployment.
• Popular choices include Calico, Flannel, and Weave.
Deployments of Kubernetes
7. Monitor and Maintain:
• Implement monitoring solutions like Grafana.
• Regularly update and patch Kubernetes components and worker nodes.

8. Scale and Manage Workloads:


• Use kubectl commands or manifest files to deploy applications and services.
• Scale deployments, manage rolling updates, and handle pod autoscaling as needed.
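Step 8 can be illustrated with a few kubectl commands. The deployment name (web), manifest file, and image tag below are hypothetical placeholders:

```shell
# Deploy an application from a manifest file
kubectl apply -f web-deployment.yaml

# Scale the deployment to 5 replicas
kubectl scale deployment web --replicas=5

# Perform a rolling update to a new image version
kubectl set image deployment/web web=nginx:1.25

# Enable pod autoscaling between 2 and 10 replicas at 80% CPU
kubectl autoscale deployment web --min=2 --max=10 --cpu-percent=80
```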
Pods fundamentals
• When a pod is created, it is scheduled to run on a node in the cluster.

• The pod remains on that node until its process is terminated:

• the pod is deleted, the pod is evicted for lack of resources, or the node fails.

• If a pod is scheduled to a node that fails, or if the scheduling operation itself fails, the pod is
deleted.

• If a node dies, the pods scheduled to that node are scheduled for deletion after a timeout.

• A given pod (UID) is never rescheduled to a new node; instead, it is replaced by an identical
pod, with even the same name if desired, but with a new UID.

• A volume in a pod exists as long as that pod (with that UID) exists. If the pod is deleted for any
reason, the volume is also destroyed and created anew on the new pod.

• A controller can create and manage multiple pods, handling replication and rollout, and providing
self-healing capabilities.
Replica Sets
• A ReplicaSet is a higher-level abstraction in Kubernetes that ensures a specified number of replicas (copies)
of a pod are running at all times. It helps maintain the desired number of identical pod instances to ensure
high availability and scalability.

1. Purpose:
The primary purpose of a ReplicaSet is to maintain a specified number of pod replicas running at all times.
If any of the pods fail or are deleted, the ReplicaSet automatically creates new ones to replace them,
ensuring the desired number is always met.
2. Selectors:
ReplicaSets use label selectors to identify and manage the pods they are responsible for. When creating a
ReplicaSet, you define a set of labels, and the ReplicaSet ensures that the pods it manages have these
labels.
3. Pod Template:
A ReplicaSet includes a pod template that specifies the characteristics of the pods it should create and
maintain. This template includes details such as the container image, resource requirements, environment
variables, and any other settings necessary for the pod.
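Putting the selector and pod template together, a minimal ReplicaSet manifest might look like the following (the name, labels, and image are illustrative):

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-replicaset        # hypothetical name
spec:
  replicas: 3                 # desired number of pod copies
  selector:
    matchLabels:
      app: web                # must match the pod template's labels
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
```

Saving this as your-replicaset.yaml and applying it with kubectl asks Kubernetes to keep three identical pods running at all times.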
Replica Sets

kubectl apply -f your-replicaset.yaml


Maintaining Desired State
• Maintaining the desired state is a fundamental concept in Kubernetes, and it's achieved through the use of
various resources and controllers.
1. Declarative Configuration:
In Kubernetes, you define the desired state of your system using declarative configuration files. These files
specify how many replicas of your application should be running, what images to use, what ports to
expose, etc.
2. Kubernetes API Server:
The Kubernetes API server is the central control plane component. It receives and processes API requests,
including the declarative configuration files provided by users.
3. Controllers:
Controllers are control loops that continuously work to bring the current state of the system closer to the
desired state. They monitor the state of objects in the cluster and make adjustments as needed.
4. Example Controllers:
1. ReplicaSet Controller: Ensures the specified number of pod replicas is running.
2. Deployment Controller: Manages rolling updates and rollbacks of applications.
3. StatefulSet Controller: Ensures stable network identities for stateful applications.
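As a concrete, illustrative example of declarative configuration, the following Deployment declares that three replicas of an nginx pod should always exist; the Deployment controller then continuously reconciles the cluster toward that state (names and image tag are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment        # hypothetical name
spec:
  replicas: 3                 # desired state: three identical pods
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.25     # which image to use
        ports:
        - containerPort: 80   # which port to expose
```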
Maintaining Desired State
• Desired State Persistence:
The desired state is persisted in the etcd key-value store, which is a distributed database used by Kubernetes
to store configuration data, state, and metadata.
• kubectl:
The kubectl command-line tool is a common way to interact with the Kubernetes API server. Users can
apply or update their configuration files using kubectl apply -f <file.yaml>. This triggers the controllers to
take action and reconcile the state.

• Pod Termination and Replacement:


When a pod becomes unhealthy or needs to be updated, the controller (e.g., ReplicaSet or Deployment)
terminates the existing pod and replaces it with a new one, ensuring the desired number of replicas is
maintained.
• Scalability:
• If the desired state specifies scaling (e.g., increasing the number of replicas), controllers ensure the
new replicas are created and added to the cluster.
• By following this declarative model and utilizing controllers, Kubernetes provides a robust mechanism
for maintaining the desired state of applications and infrastructure in a cluster, automating much of the
operational complexity.
The Kubernetes Scheduler
The Kubernetes Scheduler is a crucial component of the Kubernetes control plane responsible for making
decisions about where and when to run pods
1. Node Selection:
When a pod is created or needs to be rescheduled due to a failure or scale-up event, the Scheduler
determines the optimal node for running the pod. It takes into account factors such as resource
requirements, affinity/anti-affinity rules, and other constraints.
2. Declarative Model:
Kubernetes follows a declarative model, meaning users declare their desired state in configuration files,
and the Scheduler works to bring the current state of the cluster in line with this declared state.
3. Filtering and Scoring:
The Scheduler uses a two-step process involving filtering and scoring. In the filtering phase, nodes that
don't meet the pod's requirements (e.g., available resources) are excluded. The remaining nodes are then
scored based on various criteria, and the node with the highest score is chosen.
4. Node Affinity and Anti-Affinity:
Node affinity and anti-affinity rules allow users to influence the scheduling decisions based on node labels.
This helps ensure that pods are scheduled onto nodes with specific characteristics or avoid nodes with
certain characteristics.
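A node affinity rule of the kind described above can be expressed in a pod spec like this (the disktype label key and ssd value are illustrative; any node label works):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-pod          # hypothetical name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype     # node label the scheduler must match
            operator: In
            values:
            - ssd
  containers:
  - name: nginx
    image: nginx:latest
```

With this rule, the scheduler only considers nodes labeled disktype=ssd in its filtering phase.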
The Kubernetes Scheduler
5. Resource Constraints:
The Scheduler considers resource requirements and constraints when making scheduling decisions. It
looks at CPU and memory requests and limits to ensure that nodes have sufficient resources to
accommodate the pod.
6. Pod Priority and Preemption:
Kubernetes allows users to assign priorities to pods. The Scheduler takes these priorities into account when
making scheduling decisions. If resources become scarce, higher priority pods may be scheduled at the
expense of lower priority ones.
7. Interactions with Controllers:
The Scheduler works in conjunction with other controllers, such as ReplicaSet and Deployment
controllers, to ensure the desired number of pod replicas are maintained across the cluster.
8. Optimization for Fault Tolerance:
The Scheduler takes fault tolerance into account, avoiding scheduling multiple replicas of the same pod on
the same node (anti-affinity) to improve resiliency.
• The Kubernetes Scheduler plays a vital role in the automatic and efficient allocation of resources in a cluster,
allowing users to focus on defining their applications' desired state without having to manually specify where
each pod should run.
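The resource constraints the scheduler considers (point 5 above) come from each container's requests and limits in the pod spec. A sketch with illustrative values:

```yaml
containers:
- name: app
  image: nginx:latest
  resources:
    requests:
      cpu: "250m"       # scheduler reserves at least a quarter of a CPU core
      memory: "128Mi"
    limits:
      cpu: "500m"       # container is throttled above half a core
      memory: "256Mi"   # container is killed if it exceeds this
```

The scheduler filters out any node whose free capacity cannot cover the requests; limits are enforced at runtime by the node.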
Resource Manifest in YAML format
A resource manifest in YAML format is a configuration file that defines a Kubernetes resource, such as a
Deployment, Pod, Service, or any other object in the Kubernetes ecosystem.
Below is an example of a simple Pod resource manifest in YAML format:

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: nginx-container
    image: nginx:latest
    ports:
    - containerPort: 80
Resource Manifest in YAML format
•apiVersion: Specifies the API version of the Kubernetes resource being defined. In this case, it's a Pod using
the "v1" version.

•kind: Specifies the type of resource being defined. Here, it's a Pod.

•metadata: Contains metadata such as the name of the resource.

•spec: Specifies the desired state of the resource. In the case of a Pod, it includes information about the
containers to run.

In this example, the manifest defines a Pod named "example-pod" running a single container named
"nginx-container" based on the "nginx:latest" Docker image. The container exposes port 80.

You can apply this manifest using the kubectl apply command. For instance:

kubectl apply -f pod-manifest.yaml


Submitting Jobs to Kubernetes.
• Submitting jobs to Kubernetes involves creating and managing a Job resource. A Job in Kubernetes is a
controller that creates one or more pods and ensures that a specified number of them successfully terminate.
Jobs are useful for running tasks, batch processes, or computations that are expected to complete and then
exit.
1. Create a Job YAML Manifest:
• Create a YAML manifest file describing the Job you want to run, and save it (e.g., as job.yaml).
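A minimal sketch of such a Job manifest, using the example-job name that the commands that follow refer to (the image and command are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  template:
    spec:
      containers:
      - name: worker
        image: busybox:latest
        command: ["sh", "-c", "echo processing batch task && sleep 10"]
      restartPolicy: Never     # Jobs require Never or OnFailure
  backoffLimit: 4              # retries before the Job is marked failed
```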
Submitting Jobs to Kubernetes.
2. Apply the Job Manifest:
• Use the ‘kubectl apply’ command to submit the job manifest to your Kubernetes cluster
kubectl apply -f job.yaml
This will create the Job and its associated pod(s) in the cluster.
3. Monitor Job Execution:
• You can monitor the execution of the Job using the following commands:
• To view the status of the Job: kubectl get job example-job
• To view the logs of a specific pod: kubectl logs <pod-name>

4. CleanUp (Optional):
If needed, you can delete the Job and its associated resources once it has completed:
kubectl delete job example-job
Kubernetes Volume Management
1. EmptyDir Volume:
•An EmptyDir volume is created when a pod is assigned to a node and exists as long as the pod is running on
that node. It is initially empty but can be used to share files between containers within the same pod.
volumes:
- name: temp-volume
  emptyDir: {}

2. HostPath Volume:
•HostPath allows a pod to use a file or directory from the host machine's filesystem. It is often used for sharing
files between the host and the pod.

volumes:
- name: host-volume
  hostPath:
    path: /path/on/host
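The volumes fragments above only declare a volume; a container must also mount it via volumeMounts. A complete, illustrative pod using the emptyDir volume might look like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo            # hypothetical name
spec:
  volumes:
  - name: temp-volume
    emptyDir: {}
  containers:
  - name: app
    image: busybox:latest
    command: ["sh", "-c", "echo data > /cache/file && sleep 3600"]
    volumeMounts:
    - name: temp-volume        # must match the volume name declared above
      mountPath: /cache        # where the volume appears inside the container
```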
Steps to configure master and node
To connect a master node and a worker node in Kubernetes, you need to follow several steps.

1. Set Up a Kubernetes Cluster: Ensure that you have set up a Kubernetes cluster. This typically involves
installing Kubernetes on both the master and worker nodes.

2. Configure Master Node:

• Install Kubernetes on the master node.
• Initialize the Kubernetes cluster using kubeadm init.
• Once initialized, configure kubectl to communicate with the master node. You can typically do this by
copying the kubeconfig file generated during kubeadm init to the appropriate location
(~/.kube/config).

3. Join Worker Node to the Cluster:

• On the worker node, install Kubernetes dependencies.
• Join the worker node to the Kubernetes cluster using the kubeadm join command. You'll need the
token generated during kubeadm init on the master node.
• After executing the kubeadm join command, the worker node should be connected to the master
node.
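Steps 2 and 3 typically boil down to commands like these (the pod network CIDR is a common example value; the API server address, token, and hash are placeholders printed by kubeadm init, not real values):

```shell
# On the master node: initialize the control plane
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# Configure kubectl for the current user
mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# On each worker node: join using the token printed by kubeadm init
sudo kubeadm join <master-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>
```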
Steps to configure master and node

4. Verify Node Connection:

• Use kubectl get nodes on the master node to verify that both the master and worker nodes are
connected and listed.
• Ensure that both nodes have the status "Ready", indicating they are healthy and ready to accept
workloads.

5. Test Deployment:
• Deploy a simple application or workload to the Kubernetes cluster to ensure that it runs successfully
across both the master and worker nodes.
• Monitor the deployment to confirm that it's distributed across the cluster.

6. Additional Configuration (Optional):

• You may need to configure networking, storage, and other resources depending on your specific
requirements. This might involve installing networking plugins like Calico, Flannel, or others.
