DevOps for Cloud

Agenda for today’s class

• Custom Image and Container creation


• Container Orchestration
• What is it?
• Why do we need it?
• History of Container Orchestration Tools

• Kubernetes Cluster Architecture


• Minikube
• Working with Kubernetes Objects
• pods,
• replicasets,
• deployment
• Hands-on exercises using Minikube / cloud platforms

BITS Pilani, Pilani Campus


Creating a custom docker image
➢ A Dockerfile specifies all the steps that need to be performed to build the image so the container behaves as required once created.

➢ To create a Dockerfile, use the command below:
   touch Dockerfile

➢ Format of lines in a Dockerfile:
   <instruction> <argument>
   Ex: FROM ubuntu

➢ Here the instruction is FROM and the argument is ubuntu.

(Figure: Docker host running containers from images, with images pulled from Docker Hub)
Creating a custom docker image
- Dockerfile
we can use # for adding comments

# Use an official Python runtime as a base image


FROM python:3.9-slim
# Set the working directory in the container
WORKDIR /app
# Copy the current directory contents into the container at /app
COPY . /app
# Install any needed packages
RUN pip install --no-cache-dir Flask
# Make port 5000 available to the world outside this container
EXPOSE 5000
# Define environment variable
ENV FLASK_APP=app.py
# Run app.py when the container launches
CMD ["python", "app.py"]
The need for Orchestration
Both Linux Containers and Docker Containers
– Isolate the application from the host
– Are not easily scalable

BITS Pilani, Pilani Campus


Why Do We Need Container
Orchestration?
➢ Imagine you’re running a website.
➢ Initially, you have one server, and everything works fine.
➢ As your website grows, you add more servers, and instead of running
everything on the server directly, you start using containers (like Docker) to
package your apps and their dependencies.

➢ Challenges arise when you manage multiple containers:


➢ Scaling:
➢ What if a lot of users are using your app?
➢ You need to spin up more containers.
➢ Resilience:
➢ What if a container crashes?
➢ You need a system to restart it.
➢ Distribution:
➢ You may want to run containers across multiple servers.
➢ Management:
➢ Keeping track of all your running containers, scaling, updating them, etc.
Container Orchestration

Container orchestration
automates the
deployment,
management, scaling,
and networking of
containers.
Container Orchestration
Tools

BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956


History of Container
Orchestration Tools - 1
➢ As containerized apps became more common, orchestration tools evolved
to help manage them:

➢ 2013 - Docker Containers


➢ Docker - Containers were popularized with Docker, but Docker alone
could not handle complex systems
➢ Ex: scaling or restarting crashed containers.

➢ 2014 - Mesos and Marathon


➢ Mesos was used by companies like Twitter for resource management, and
➢ Marathon helped manage containers on Mesos.

➢ 2015: Docker Swarm


➢ Docker’s native orchestration tool aimed at solving container
management but lacked advanced features.

BITS Pilani, Pilani Campus


History of Container
Orchestration Tools - 2
➢ 2014-2015: Kubernetes (K8s)
➢ Developed by Google, based on its
internal system (Borg), it became the
most popular solution for managing
containers.
➢ It’s open-source and supports
➢ advanced scheduling
➢ scaling
➢ self-healing and more.

➢ 2018:
➢ Kubernetes became the de-facto
industry standard, overshadowing most
alternatives.

BITS Pilani, Pilani Campus


Docker Swarm
➢ Docker Swarm is an orchestration management
tool that runs on Docker applications.

➢ While Docker is great for running containers on a


single machine, Swarm turns several Docker hosts
into a cluster, coordinating their efforts.

➢ Docker Swarm allows you to manage multiple


containers on multiple hosts as a single system.

➢ Each node of a Docker Swarm is a Docker


daemon, and all Docker daemons interact using
the Docker API.

BITS Pilani, Pilani Campus


Docker Swarm

➢ Each container within the Swarm can be deployed and


accessed by nodes of the same cluster.

BITS Pilani, Pilani Campus


Kubernetes
• It is an open-source container
orchestration tool that was originally
developed and designed by engineers at
Google.

• Kubernetes orchestration allows you to


build application services that span
multiple containers, schedule containers
across a cluster, scale those containers,
and manage their health over time.

BITS Pilani, Pilani Campus


Kubernetes Cluster
Architecture - Details
➢ Kubernetes uses a master-worker architecture:

➢ Master Node: The control plane responsible for managing the


cluster.
➢ API Server: Interacts with Kubernetes users. Validates requests, authenticates
users.
➢ Controller Manager: Ensures that the desired state of the system is maintained.
➢ Scheduler: Assigns workloads to worker nodes.
➢ etcd: A key-value store for cluster data.

➢ Worker Nodes: The machines that run your applications.


➢ Container Runtime (via CRI, the Container Runtime Interface): e.g., containerd or Docker.
➢ Kubelet: Manages containers on the node by talking to the master.
➢ Kube Proxy: Handles network traffic between pods and exposes pods (your
applications) to end users.

BITS Pilani, Pilani Campus


Kubernetes Cluster
Architecture

BITS Pilani, Pilani Campus




Minikube Setup on Windows
-1
➢ Minikube is a tool that lets you run Kubernetes locally, perfect for learning and
testing.

Step 1: Install Virtualization Software:


➢ We need a hypervisor like Hyper-V or Oracle VirtualBox.
➢ Minikube requires virtualization to run the Kubernetes cluster locally
➢ To check if virtualization is enabled, you can go to the Task Manager > Performance
tab > CPU. There should be a section for Virtualization: Enabled.
➢ To enable Hyper-V, open PowerShell (Admin), run the following command, and
restart the computer after that:
➢ Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All
➢ If using VirtualBox instead, verify its installation using VBoxManage --version

Step 2: Install kubectl (Kubernetes Command-line Tool)

➢ Download kubectl by following the steps at
https://kubernetes.io/docs/tasks/tools/install-kubectl-windows/
➢ Add kubectl.exe to your system path and verify using:
➢ kubectl version --client

➢ After Step 3 on the next slide, we will have a single-node Kubernetes cluster running locally!

BITS Pilani, Pilani Campus


Minikube Setup on Windows
-2
Step 3: Start Minikube
➢ Open PowerShell as Administrator.
➢ To start a Minikube cluster, run the following command:
➢ minikube start --driver=hyperv or
➢ minikube start --driver=virtualbox
➢ Minikube will download the necessary Kubernetes components and start
the cluster.
➢ Verify Minikube is running by checking the status:
➢ minikube status

➢ Once Minikube starts, verify your cluster is running by checking


the node status:
➢ kubectl get nodes
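
Besides kubectl, the same cluster can be queried programmatically. A minimal sketch using the official Kubernetes Python client (an assumption, not part of the slides; requires pip install kubernetes and the kubeconfig written by minikube start):

# list_nodes.py - rough equivalent of "kubectl get nodes"
from kubernetes import client, config

config.load_kube_config()            # reads ~/.kube/config created by Minikube
v1 = client.CoreV1Api()

for node in v1.list_node().items:
    print(node.metadata.name, node.status.node_info.kubelet_version)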

BITS Pilani, Pilani Campus


Kubernetes Objects

1. Pod

2. ReplicaSet

3. Deployment

4. Service

5. Job

6. ConfigMap

7. Secret

8. Namespace

9. Volume etc

BITS Pilani, Pilani Campus


kubectl commands - 1

1) client and server versions


kubectl version

➢ lists the versions of client and server

kubectl version --client for getting the client version only

kubectl version (run against a reachable cluster) also reports the server version; there is no separate --server output flag

2) help
kubectl help
➢ to get commands help
BITS Pilani, Pilani Campus
kubectl commands - 2

3) nodes
# to get the list of nodes running
kubectl get nodes

# to get list of multiple components


kubectl get pods,svc

# for more verbose info use


kubectl get nodes -o wide

BITS Pilani, Pilani Campus


kubectl commands - 3

4) yaml file configuration

# to create yaml file for given config


kubectl run nginx --image=nginx --dry-run=client
-o yaml > nginx-pod.yaml

# to create many at once from many config files


kubectl create -f pod1-def.yaml,pod2-def.yaml

# to delete many at once


kubectl delete pods pod1 pod2

BITS Pilani, Pilani Campus


What is a Pod?

➢ A Pod is the smallest deployable unit


in Kubernetes.

➢ A pod represents a single instance of


an application.

➢ A pod can contain one or more


containers.

➢ A pod is defined in a YAML file.

BITS Pilani, Pilani Campus


Pod definition
apiVersion: v1 #(version of kubernetes api to be used to create this object)

kind: Pod (type of object being created)

metadata: (key value pairs)


name: myapp-pod api versions for various objects
labels: (any key value pairs)
app: myapp kind version
spec: (specifies whats inside the component)
Pod v1
containers: ReplicaSet apps/v1
- name: nginx-container
image: nginx ConfigMap v1
Job v1
Deployment apps/v1
Service v1

BITS Pilani, Pilani Campus


Pod definition - example

Let us create a pod named nginx-pod running a container named


nginx-container based on nginx image.

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: nginx-container
    image: nginx
    ports:
    - containerPort: 80

BITS Pilani, Pilani Campus


Running a Pod

1) directly using image

syntax: kubectl run <pod_name> --image=<image_name>

runs a pod by pulling the specified image from the Docker registry

Ex: kubectl run redis-pod --image=redis

2) using a pod-def yaml file

syntax: kubectl create/apply -f <pod-defn_yaml_file>

kubectl create -f pod-def.yml (first time)

kubectl apply -f pod-def.yml (after updates/changes)

BITS Pilani, Pilani Campus


ReplicaSet

➢ Pods are not self-healing, meaning if


one crashes, it won’t restart on its own.

➢ ReplicaSet ensures that a specified


number of identical pods are always
running.

➢ If a pod fails, replicaSet automatically


replaces it.

➢ ReplicaSet is used for High Availability


of the applications.

➢ Monitors the pods based on the selector

BITS Pilani, Pilani Campus


ReplicaSet definition

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
spec:                    # contains the template of the pod to be replicated
  template:
    <your_pod-def>
  replicas: 3
  selector:
    matchLabels:
      tier: frontend

Notes:
======
1) the selector matchLabels is used to match the pods running in the Kubernetes
cluster that carry the label tier: frontend; this ReplicaSet manages all of them
2) the pod template defined inside the ReplicaSet must always carry labels that
match the selector!

BITS Pilani, Pilani Campus


Running a ReplicaSet

1) create replicaSet using the config yml file

syntax: kubectl create/apply -f repset-defn.yml

2) to get the list of replicaset running

kubectl get replicaset or kubectl get rs

3) to delete a replicaset

kubectl delete replicaset <replicaset_name>

4) to update a replicaset

kubectl edit replicaset <replicaset_name>

kubectl replace -f repset-defn.yml

BITS Pilani, Pilani Campus


Running a ReplicaSet

5) to scale a replicaset to change replicas

kubectl scale --replicas=6 -f repset-defn.yml

6) to get more details about a replicaset

kubectl describe replicaset <replicaset_name>

kubectl explain replicaset

BITS Pilani, Pilani Campus


Agenda for today’s class
➢ Working with Kubernetes Objects
➢ Pods,
➢ Replicasets
➢ Deployment
➢ Config map
➢ Secrets
➢ Services

➢ Hands-on exercises deploying Custom frontend App


Deployment on k8s
➢ Clusterless Container services
➢ AWS Fargate Example

➢ Configuration management - Helm


➢ Example of using Helm for mysql

BITS Pilani, Pilani Campus


Deployment
➢ Wrapper on ReplicaSets

➢ 1 ReplicaSet would be created for 1 Deployment.

➢ Provides declarative updates to applications for rolling updates and rollbacks.

➢ Ex:
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 3

➢ When using the RollingUpdate strategy, Kubernetes


gradually replaces old Pods with new ones, instead of
stopping all Pods at once.

➢ This ensures high availability of your application


during updates and zero downtime upgrades!

BITS Pilani, Pilani Campus


Deployment vs ReplicaSet vs
pod
➢ Let’s say your app is running with v1, and you
want to upgrade to v2.

➢ You can create a Deployment for v1 that


initially runs 5 replicas.

➢ When you want to update the app to v2, you


update the Deployment configuration.

➢ If your Deployment has 5 replicas and


maxSurge is set to 3, Kubernetes can create up
to 8 Pods (5 + 3) during the update to ensure
that v1 Pods are gradually replaced by v2
Pods, without downtime.

➢ v1 v1 v1 v1 v1
➢ v1 v1 v1 v1 v1 v2 v2 v2
➢ v2 v2 v2 v1 v1 v2 v2
➢ v2 v2 v2 v2 v2

BITS Pilani, Pilani Campus


Deployment definition
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-dep
spec:                    # contains the template of the pod to be replicated
  template:
    <your_pod-def>
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 3

Notes:
1) the selector matchLabels is used to match the pods running in the Kubernetes cluster that
carry the label tier: frontend; this Deployment manages all of them
2) the pod template defined inside the Deployment must always carry labels that match the selector!

BITS Pilani, Pilani Campus


Custom frontend App
Deployment on k8s
1) Create frontend folder

2) Create index.html in it

3) Create a Dockerfile

4) Build the Docker Image

5) Run and test the Docker Container Locally

6) Push the image to docker registry


(public – dockerhub or private – container
registries)

7) Deploy the Container to Kubernetes


a. create Configmaps, Secrets
b. create Deployment
c. create Service to expose pod to
external world

BITS Pilani, Pilani Campus


Steps 1-2) Create front end
folder and index.html
<!DOCTYPE html>
<html>
<head>
<title>Front end app</title>
</head>

<body>
<h1>Welcome to Hello World Frontend!</h1>
<p>Version: <span id="app-version"></span></p>
<p>API key: <span id="api-key"></span></p>
</body>
</html>

BITS Pilani, Pilani Campus


Steps 3-5) Dockerfile, building
and running the custom image
Create a Dockerfile:
# Use the official Nginx image as the base image
FROM nginx:alpine

# Copy the HTML file to the Nginx web server directory


COPY ./index.html /usr/share/nginx/html/index.html

# Copy the entrypoint script


COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

# Expose port 80
EXPOSE 80

# Set the entrypoint to run the script


ENTRYPOINT ["/entrypoint.sh"]

Build the Docker Image:


docker build -t frontend-app .

Run and test the Docker Container Locally:


docker run -d -p 8080:80 frontend-app

BITS Pilani, Pilani Campus


ConfigMap
➢ ConfigMap is an API object that allows you to store non-
sensitive configuration data as key-value pairs.

➢ It can be consumed by pods or other objects.

➢ Ex:

frontend-app-config.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: frontend-app-config
data:
  APP_TITLE: "Hello World Frontend"
  APP_VERSION: "1.0"

Deploy the ConfigMap:
kubectl apply -f frontend-app-config.yaml

Verify the deployment:
kubectl get configmap
or
kubectl get cm
kubectl get cm

BITS Pilani, Pilani Campus


Secret
➢ Secret is an API object that allows you to store sensitive
configuration data as key-value pairs.

➢ It can be consumed by pods or other objects.

➢ Ex: to generate the base64-encoded value for API_KEY

➢ echo -n 'my api key pass' | base64

app-secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
data:
  API_KEY: cGFzc3dvcmQ=   # base64-encoded value for "password"

Deploy the Secret:
kubectl apply -f app-secret.yaml

Verify the deployment:
kubectl get secret

BITS Pilani, Pilani Campus


Deploy the Container to
Kubernetes
frontend-app-dep.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-app-dep
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend-app
  template:
    metadata:
      labels:
        app: frontend-app
    spec:
      containers:
      - name: frontend-app-container
        image:            # <your pushed frontend-app image>
        ports:
        - containerPort: 80
        env:
        - name: APP_TITLE
          valueFrom:
            configMapKeyRef:
              name: frontend-app-config
              key: APP_TITLE
        - name: APP_VERSION
          valueFrom:
            configMapKeyRef:
              name: frontend-app-config
              key: APP_VERSION
        - name: API_KEY
          valueFrom:
            secretKeyRef:
              name: app-secret
              key: API_KEY

Deploy the pod:
kubectl apply -f frontend-app-dep.yaml

Verify the deployment:
kubectl get pods

BITS Pilani, Pilani Campus


Entrypoint
entrypoint.sh

#!/bin/sh
# Replace the placeholder for APP_VERSION in index.html
sed -i "s/<span id=\"app-version\"><\/span>/<span id=\"app-version\">$APP_VERSION<\/span>/" /usr/share/nginx/html/index.html

# Replace the placeholder for API_KEY in index.html
sed -i "s/<span id=\"api-key\"><\/span>/<span id=\"api-key\">$API_KEY<\/span>/" /usr/share/nginx/html/index.html

# Start Nginx in the foreground
nginx -g "daemon off;"

Note: Decoding the password if needed:

echo "c2VjdXJlX3Bhc3N3b3Jk" | base64 --decode

BITS Pilani, Pilani Campus


Clusterless Container services
➢ Clusterless container services refer to container platforms or frameworks
that allow users to run and manage containers without directly managing
the underlying cluster infrastructure.

➢ Infrastructure management means taking care of nodes, networking, and


scaling.

➢ These services abstract away the complexities of orchestrating and managing


a cluster like Kubernetes does.

➢ Ex:
➢ AWS Fargate (for ECS and EKS)
➢ Azure Container Instances (ACI)
➢ Google Cloud Run

➢ Benefits:
➢ No Infrastructure Management: You don't need to manage servers, clusters, or
nodes.
➢ Automatic Scaling: Services can automatically scale containers up or down based
on demand.
➢ Cost-Efficiency: Pay only for the compute resources you use, avoiding over-
provisioning.

BITS Pilani, Pilani Campus


AWS Fargate Example
1) Create a Docker image:
This can be any simple application, for example, a Python web app.

2) Define ECS Task:


➢ You define a task with a container definition in Amazon ECS.
➢ CPU: 256 CPU units (0.25 vCPU)
➢ Memory: 512 MiB
➢ Container Image: Your image from ECR (my-python-app)
➢ Port Mappings: Expose port 80 (or any other)
➢ Any other config like configmap , secrets etc

3) Create a Service:
➢ In ECS, define a Fargate service that runs your task:
➢ Desired number of tasks: 1
➢ Networking: VPC, Subnet, and Security Groups
➢ Auto-scaling options, if required.

4) Run the Service:


➢ AWS Fargate will launch your container without you needing to manage any
infrastructure (like EC2 instances).

5) Access the Application:


➢ Once the service is up and running, AWS Fargate will assign a load balancer or public
IP (depending on your setup) so you can access your application.
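
The task and service definitions above can also be created as code. A minimal boto3 sketch under stated assumptions (the cluster name, image URI, subnet and security-group IDs are placeholders; an execution role is usually also required to pull from ECR):

# fargate_deploy.py - register a Fargate task definition and create a service (sketch)
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

task = ecs.register_task_definition(
    family="my-python-app",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",      # 0.25 vCPU
    memory="512",   # MiB
    containerDefinitions=[{
        "name": "my-python-app",
        "image": "<account>.dkr.ecr.us-east-1.amazonaws.com/my-python-app:latest",
        "portMappings": [{"containerPort": 80}],
    }],
)

ecs.create_service(
    cluster="my-cluster",                              # placeholder ECS cluster
    serviceName="my-python-app-svc",
    taskDefinition=task["taskDefinition"]["taskDefinitionArn"],
    desiredCount=1,
    launchType="FARGATE",
    networkConfiguration={"awsvpcConfiguration": {
        "subnets": ["subnet-xxxxxxxx"],
        "securityGroups": ["sg-xxxxxxxx"],
        "assignPublicIp": "ENABLED",
    }},
)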
BITS Pilani, Pilani Campus
What is Helm?
➢ Helm is a package manager for Kubernetes, designed to simplify the process of
deploying, managing, and maintaining Kubernetes applications.

➢ Helm enables us to define, install, and upgrade complex Kubernetes applications using
Helm charts.

➢ Helm charts are collections of pre-configured Kubernetes resources.

➢ Key Use cases:

1) Manage Configuration:
Use a single configuration file (values.yaml) to manage parameters like the number of
replicas, environment variables, secrets, and more.

2) Deploy Applications:
Deploy the same application across different environments (dev, staging, production) with
a single command, while applying environment-specific configuration.

3) Version Control:
Version Helm charts, allowing you to roll back to previous versions if necessary.

BITS Pilani, Pilani Campus


Configuration management
with Helm Example
1) Install Helm using Chocolatey on Windows:
Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol =
[System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object
System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))

choco install kubernetes-helm

2) Create a Helm Chart:


helm create mysql-chart

3) Add Helm Stable Repository:

helm repo add bitnami https://charts.bitnami.com/bitnami


helm repo update

4) Install MySQL Chart:


➢ We can deploy MySQL using the official Bitnami Helm chart:

helm install mysql-chart-release1 bitnami/mysql

➢ This command deploys MySQL with default settings.

➢ Helm will create a Kubernetes deployment, service, and persistent volume claim for the
database

BITS Pilani, Pilani Campus


Configuration management
with Helm Example
5) Customize MySQL Configuration:

➢ To customize the deployment, we create a values.yaml file with desired configuration


(such as root password, database name, storage settings).

➢ For example for mysql:


values.yaml

mysql:
  rootPassword: "root_password"
service:
  type: ClusterIP
persistence:
  enabled: true
  accessMode: ReadWriteOnce
  size: 1Gi

6) Applying the Custom Configuration:

helm install mysql-chart-release1 mysql-chart --values values.yaml

BITS Pilani, Pilani Campus


Agenda for today’s class

➢ Kubernetes Services
➢ Traditional CD
➢ Anatomy of a Deployment Pipeline
➢ Deployment Pipeline Practices
➢ Human-free deployments
➢ Environment-based release patterns

➢ GitOps CD
➢ What is GitOps
➢ Developer benefits of GitOps
➢ Operational benefits of GitOps
➢ GitOps Continuous Integration (CI) - CI stages
➢ GitOps Continuous Delivery (CD) - CD stages
➢ Declarative vs Imperative object management in Kubernetes
➢ Kubernetes with GitOps

BITS Pilani, Pilani Campus


Service
➢ The IP address associated with a pod may
change whenever the pod crashes and a
new one replaces it.

➢ To ensure we have static name to refer


to a deployment, we have a service.

➢ Service acts as a load balancer and


proxy.

use cases:
➢ 1) to expose one pod to another pod via an
endpoint (service name, Ex: redis-db for the
redis database pod) in the k8s cluster.

➢ 2) to expose a pod to external clients
outside the k8s cluster

BITS Pilani, Pilani Campus


Deploy the Container to
Kubernetes - Service
frontend-app-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: frontend-app-svc
spec:
  type: ClusterIP
  selector:
    app: frontend-app
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80

BITS Pilani, Pilani Campus


Service types
➢ Kubernetes provides several types of
Services to facilitate communication
between Pods and external users.

➢ Each Service type serves a different


purpose in managing access and load
balancing.

➢ Here are the primary types of


Kubernetes Services:

➢ ClusterIP,
➢ NodePort,
➢ LoadBalancer and
➢ ExternalName

BITS Pilani, Pilani Campus


Service types - ClusterIP
➢ Default Service Type.

➢ Exposes the Service on a cluster-internal IP.

➢ Only accessible within the cluster and not accessible


from outside.

➢ Useful for internal communication between Pods.

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx-app
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: ClusterIP

BITS Pilani, Pilani Campus


Service types - NodePort
➢ Exposes the service externally on a
specific static port on all nodes in the
cluster.

➢ Accessible from outside the cluster by


<NodeIP>:<NodePort>.

➢ Routes traffic to the appropriate Pod(s)


within the cluster.

➢ Useful for simple external access to


Services for testing or when you want
direct access from outside the cluster.

BITS Pilani, Pilani Campus


Service definition example
for Nodeport Service Type
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  type: NodePort
  ports:
  - port: 80          # service port
    targetPort: 8080  # pod port
    nodePort: 30008   # node port
  selector:           # links the service to the pods it will route traffic to
    name: redis-pod

Notes:
1) the selector is used to match the pods running in the Kubernetes cluster that carry the
label name: redis-pod; the service routes traffic to all of them
2) the labels on the pods must always match the service's selector!

BITS Pilani, Pilani Campus


Service types - LoadBalancer
➢ Exposes the Service externally using a cloud
provider’s load balancer.

➢ Automatically assigns a public IP address


for the Service.

➢ Routes traffic to the corresponding


NodePort or ClusterIP Service.

➢ Useful for production environments where


you need reliable external access.

BITS Pilani, Pilani Campus


Service type – LoadBalancer Example
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx-app
  ports:
  - protocol: TCP
    port: 80         # Exposed port
    targetPort: 80   # Port inside the Pod
  type: LoadBalancer # External load balancer, exposes the service to the internet

BITS Pilani, Pilani Campus


Traditional CD - Continuous
Deployment
➢ Traditional CD automates the entire release process after code
has passed the build and test phases.

➢ Anatomy of a Deployment Pipeline:


➢ A deployment pipeline automates the workflow from code
commit to production.

➢ It includes stages like:

➢ Commit Stage: Running unit tests on the latest commit.


➢ Automated Test Stage: Running integration and end-to-end
tests.
➢ Staging Environment: A replica of the production environment
to test changes.
➢ Production Deployment: Final stage, deploying to live users.

BITS Pilani, Pilani Campus


Example: Jenkins pipeline for
a simple web app
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh 'mvn clean install'
            }
        }
        stage('Test') {
            steps {
                sh 'mvn test'
            }
        }
        stage('Deploy to Staging') {
            steps {
                sh 'kubectl apply -f k8s/staging.yaml'
            }
        }
        stage('Deploy to Production') {
            steps {
                input message: 'Deploy to production?', ok: 'Deploy'
                sh 'kubectl apply -f k8s/production.yaml'
            }
        }
    }
}

BITS Pilani, Pilani Campus


Jenkins pipeline using docker
image
1) Pull the Jenkins Docker Image and Start Jenkins

docker pull jenkins/jenkins:lts


docker run -d --name jenkins_container -p 8080:8080 -p 50000:50000 jenkins/jenkins:lts
8080 -> Jenkins UI
50000 -> Jenkins agent

2) Access Jenkins by going to http://localhost:8080 in your browser.

3) Jenkins will ask for an initial administrator password. You can find this in the Docker logs:

docker exec jenkins_container cat /var/jenkins_home/secrets/initialAdminPassword

4) Install Pipeline Plugins:

Go to Manage Jenkins → Manage Plugins.


In the Available tab, search for Pipeline and install the Pipeline plugin.

5) Set Up a Simple Jenkins Pipeline

a) Create a New Pipeline Job


b) Go to Jenkins UI → New Item.
c) Enter the job name (e.g., MyPipeline) and select Pipeline as the job type.
d) Click OK.
e) You can use a Jenkinsfile (Pipeline as Code) or define the pipeline directly in the Jenkins UI.

BITS Pilani, Pilani Campus


Environment-based Release Patterns
➢ These patterns manage deployments across environments:

1) Blue-Green Deployment: Runs two environments, switching traffic between them.

➢ Blue: The current application version is running in this environment.


➢ Green: The new application version is running in this environment.
➢ Test the green environment.
➢ Once the green environment is ready, redirect live application traffic to the green
environment.
➢ Deprecate the blue environment

2) Canary Release: Incrementally deploying to users.

➢ Releases new software to a small group of users before rolling it out to the entire
user base.
➢ This gradual approach allows for more control over the release process and
immediate feedback from users.
➢ If issues are detected, the rollout can be stopped without affecting the majority of
users

3) Feature Toggles: Enabling/disabling features without redeploying code.

BITS Pilani, Pilani Campus


Deployment Pipeline Practices
➢ The Best practices for deployment pipeline include:

1) Automating Everything:
➢ Each step, from build to deploy, should be automated.
➢ Human-free deployments aim for full automation.
➢ Ensures no human intervention is required for successful
deployments to production.

2) Frequent Releases: Encourages smaller, frequent deployments to reduce


risk.

3) Canary Releases: Deploying to a subset of users before full rollout.

BITS Pilani, Pilani Campus


Blue-Green Deployment in Kubernetes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app-green
  template:
    metadata:
      labels:
        app: web-app-green
    spec:
      containers:
      - name: web-container
        image: web-app:v2
---
apiVersion: v1
kind: Service
metadata:
  name: web-app-service
spec:
  selector:
    app: web-app-green
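
The actual blue-to-green cut-over is just a change of the Service selector. A minimal sketch of that switch with the Kubernetes Python client (an illustration, not part of the original slides; assumes both Deployments and the Service exist in the default namespace):

# switch_to_green.py - point web-app-service at the green pods
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Patch only the selector; new traffic now goes to the v2 (green) pods
v1.patch_namespaced_service(
    name="web-app-service",
    namespace="default",
    body={"spec": {"selector": {"app": "web-app-green"}}},
)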

BITS Pilani, Pilani Campus


GitOps CD
➢ GitOps CD leverages Git as the single source of truth for both infrastructure and
applications, automating deployments by syncing Git changes with the desired state
of Kubernetes clusters.

➢ Operational Benefits of GitOps:

1) Version Control: All deployment configurations are versioned in Git.

2) Auditing and Compliance: Git logs provide a detailed history of changes.

3) Reproducibility: Clusters can be easily recreated by applying Git configurations

4) Declarative vs. Imperative Object Management in Kubernetes:

1) Declarative Management: You declare the desired state in files (e.g., YAML), and
Kubernetes ensures the actual state matches.

2) Imperative Management: You directly issue commands to change the current state.

Ex: kubectl create deployment my-app --image=nginx --replicas=3

5) Kubernetes with GitOps: Tools like ArgoCD or Flux are used to sync the Kubernetes
cluster state with the Git repository.

BITS Pilani, Pilani Campus


Agenda for today’s class

➢ IaC basics and principles


➢ What is IAC?
➢ Brief History of IAC
➢ Types of IAC
➢ Examples
➢ Key IAC Principles
➢ IaC workflow
➢ Serverless IaC - Tools
➢ Automated deployment using AWS Serverless
Application Model (SAM)
➢ Release pipeline for Serverless deployment

BITS Pilani, Pilani Campus


What is IAC?
➢ Infrastructure as Code (IaC) is the process of managing and
provisioning infrastructure through machine-readable
scripts or code, instead of manual hardware configurations
or interactive configuration tools.

➢ IaC automates the setup and management of


infrastructure components such as servers, networks, and
databases, reducing the need for manual intervention.

➢ Some popular IaC tools include Terraform , AWS


CloudFormation, Ansible, Chef etc

BITS Pilani, Pilani Campus


History of IAC
Early 2000s:
➢ The concept of configuration management tools began to emerge
with the rise of virtualization.
➢ Puppet was developed in 2005 to automate the management of
system configurations.
➢ Chef was created by Opscode in 2009, furthering the concept of
IaC by allowing developers to define infrastructure using Ruby
DSL (Domain-Specific Language).
➢ Ansible was released in 2012, emphasizing simplicity and ease of
use in configuration management.
➢ It allowed users to describe their infrastructure in YAML files
without requiring an agent on managed nodes.

BITS Pilani, Pilani Campus


History of IAC
2010-2020:
➢ Terraform, developed by HashiCorp in 2014, was introduced as a tool to
provide a declarative way to manage infrastructure across multiple cloud
providers.
➢ It allowed users to define their infrastructure using configuration files,
making it easier to manage complex deployments.
➢ AWS CloudFormation, an existing service since 2011, gained significant
traction as an IaC tool, enabling users to create and manage AWS
resources using templates.

2020s and beyond:


➢ The adoption of IaC practices grew significantly, fueled by the rise of
container orchestration tools (like Kubernetes) and serverless architectures,
prompting further innovations in IaC tooling and practices.

BITS Pilani, Pilani Campus


Types of IAC
1) Declarative IaC:
➢ Focuses on defining the desired end state of the
infrastructure.
➢ The system figures out how to achieve that state.
➢ We describe what the infrastructure should look like.
➢ Ex: Terraform, AWS CloudFormation.

2) Imperative IaC:
➢ Focuses on the steps needed to reach the desired state.
➢ We explicitly describe how to achieve the target
configuration.
➢ Ex: Ansible, Chef.

BITS Pilani, Pilani Campus


Terraform vs Chef - provisioning a
Nginx server on AWS EC2 Example
Chef Recipe (webserver.rb):

# Chef doesn't directly provision infrastructure (like creating EC2 instances)

# Update the package manager cache
execute "update_apt" do
  command "apt-get update"
  action :run
end

# Install the Nginx package
package "nginx" do
  action :install
end

# Ensure Nginx service is started and enabled to run on boot
service "nginx" do
  action [:enable, :start]
end

Terraform Code (main.tf):

# Define the provider (AWS)
provider "aws" {
  region = "us-east-1"
}

# Provision a basic EC2 instance
resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0" # Ubuntu AMI
  instance_type = "t2.micro"

  # User data script to install and start Nginx
  user_data = <<-EOF
    #!/bin/bash
    apt-get update
    apt-get install -y nginx
    systemctl start nginx
  EOF

  tags = {
    Name = "nginx-web-server"
  }
}

BITS Pilani, Pilani Campus


Key IAC Principles
1) Idempotency:
➢ Running the same code multiple times should yield the
same result.
➢ This ensures consistency in infrastructure state.

2) Version Control:
➢ Like application source code, IaC scripts are managed in
version control system like Git.
➢ This enables us to backup, track and revert changes.

BITS Pilani, Pilani Campus


Key IAC Principles
3) Automation:
➢ IaC enables full automation of provisioning and scaling
infrastructure using CI/CD pipelines.

4) Consistency:
➢ Infrastructure is consistent across environments Ex: dev,
staging, production
➢ This eliminates configuration drift.

BITS Pilani, Pilani Campus


IAC Workflow – part1
1) Write IaC Code:
Define the desired infrastructure (Ex: compute instances,
databases, networks) using tools like Terraform, AWS
CloudFormation, or AWS SAM.

2) Version Control:
Store the IaC code in a Git repository for tracking changes,
rollback, and collaboration.

3) CI Pipeline:
a) Linting: Ensure code follows best practices.
b) Testing: Test infrastructure in a sandbox
environment to validate provisioning.

BITS Pilani, Pilani Campus


IAC Workflow – part2
4) CD Pipeline:
a) Review Changes: IaC tools usually provide a plan of
changes before applying them.
b) Apply the Changes: Execute the IaC code to
create/update the infrastructure.

5) Monitor and Audit:


➢ Use monitoring tools to track infrastructure and detect
drifts.

BITS Pilani, Pilani Campus


Terraform
➢ Terraform is an open-source Infrastructure as Code (IaC)
tool developed by HashiCorp.

➢ It enables users to define, provision, and manage


infrastructure resources (such as servers, databases,
networks etc) across multiple cloud providers and on-
premises environments in a declarative configuration
language called HashiCorp Configuration Language (HCL).

➢ The Configuration Files are written in a file with extension


as .tf to indicate it is a terraform script.

BITS Pilani, Pilani Campus


Terraform - Setup
Step 1: Install Terraform
➢ Download the appropriate Terraform binary from
https://www.terraform.io/, specifically
https://developer.hashicorp.com/terraform/install?product_intent=terraform#windows for Windows.
➢ Extract the binary and add it to your system’s PATH so
that you can run Terraform commands from the terminal.
➢ To confirm installation, run:
➢ terraform -v

Step 2: Create Terraform Configuration Files


➢ Create a file named main.tf

BITS Pilani, Pilani Campus


Terraform - Providers
➢ In Terraform, providers are plugins that allow Terraform to
interact with different infrastructure platforms and services.

➢ Providers are essential because they define what resources


Terraform can manage and how it interacts with those resources,
typically via APIs provided by cloud platforms or other service
vendors.

➢ Each provider in Terraform has its own set of resource types and
data sources it can manage.

➢ Ex1: The AWS provider manages resources like EC2 instances, S3 buckets,
and IAM roles.

➢ Ex2: The Azure provider manages resources like Virtual Machines,


Resource Groups, and Storage Accounts.

BITS Pilani, Pilani Campus


Terraform – Provider
Syntax and example
syntax:

provider "<cloud_provider_name>" {
  # Configuration options
}

example:

provider "aws" {
  region = "ap-south-1"
}

BITS Pilani, Pilani Campus


Terraform - Resource
➢ In Terraform, a resource is a fundamental component that represents a
specific piece of infrastructure or service managed by a provider.

➢ Resources define what you want to create, modify, or delete in your


infrastructure, such as a virtual machine, storage bucket, load balancer,
or database.

➢ Each resource has 3 things:


1) Resource Type: Defined by the provider
Ex: aws_instance, google_compute_instance.

2) Resource Name: A unique name you assign within your configuration to


identify the resource.

3) Arguments and Attributes: Parameters and settings that specify


properties for the resource, such as the machine type, size, or region.

BITS Pilani, Pilani Campus


Terraform – Resource Syntax
and example
syntax:
resource "<provider>_<resource_type>" "<resource_name>" {
  # Arguments for the resource configuration
  attribute1 = "value1"
  attribute2 = "value2"
  # ...additional configuration
}

example:
resource "aws_instance" "web_server" { # "web_server" is the resource name in tf
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"
  tags = {
    Name = "MyWebServer" # resource name in aws
  }
}
BITS Pilani, Pilani Campus
Terraform –
Workflow and Commands
1) Initialize: terraform init
➢ Run terraform init to set up plugins required to interact with providers (e.g., AWS,
Azure).

2) Lint: terraform fmt


➢ Check code with terraform fmt for formatting issues

3) Plan: terraform plan


➢ Run terraform plan to see what changes Terraform will make to achieve the desired
state.
➢ This is like a dry run and shows what resources will be created, modified, or
destroyed.

4) Apply: terraform apply


Run terraform apply to provision the defined resources.

5) Destroy: terraform destroy


➢ Run terraform destroy to remove all resources defined in the configuration file,
which is helpful for cleaning up temporary or test environments.

BITS Pilani, Pilani Campus


Terraform - AWS example
➢ Let us try to create, update, and destroy an AWS S3
bucket using Terraform.

➢ Can you recall the high level steps to be done?

BITS Pilani, Pilani Campus


Terraform - AWS S3 example
1) Create a main.tf

# Configure the AWS provider
provider "aws" {
  region = "us-west-2" # Specify the AWS region
}

# Create an S3 bucket
resource "aws_s3_bucket" "ssd-s3-bucket-test1" {
  bucket = "ssd-s3-bucket-test1"
}
2) terraform fmt
3) terraform init
4) terraform plan
5) terraform apply
6) terraform destroy

BITS Pilani, Pilani Campus


Serverless IaC - Tools
Several tools are specifically designed for Serverless IaC:
1) AWS CloudFormation:
➢ AWS’s native IaC tool.
➢ Can be used to manage AWS infrastructure, including serverless resources
like Lambda and API Gateway.

2) AWS Serverless Application Model (SAM):


➢ Built on CloudFormation, SAM simplifies the definition of serverless
resources like Lambda functions and API Gateway by using a shorthand
syntax.

3) Terraform:
➢ Supports managing AWS Lambda functions and other serverless
components.

4) Serverless Framework: An open-source framework that simplifies the


deployment of serverless applications across multiple cloud providers (AWS,
Azure, Google Cloud).
BITS Pilani, Pilani Campus
Automated deployment using AWS Serverless
Application Model (SAM) example

Create a template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  MyLambdaFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.lambdaHandler
      Runtime: python3.8
      CodeUri: ./
      Events:
        ApiEvent:
          Type: Api
          Properties:
            Path: /greet   # endpoint
            Method: GET

sam build
sam deploy --guided
curl https://<api-id>.execute-api.<region>.amazonaws.com/Prod/greet
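
The template's Handler: app.lambdaHandler refers to a function named lambdaHandler in app.py. A minimal sketch of that handler for the /greet endpoint (the response body is illustrative):

# app.py - handler referenced by the SAM template above
import json

def lambdaHandler(event, context):
    # API Gateway proxy integration expects a statusCode and a string body
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "Hello from Lambda!"}),
    }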

BITS Pilani, Pilani Campus


Release pipeline for Serverless deployment

BITS Pilani, Pilani Campus


Release pipeline for Serverless
deployment – AWS Example
➢ Create an AWS and Terraform Backend
➢ Create a DynamoDB table
➢ Create an S3 bucket

➢ Create Repository and AWS Terraform files


➢ Create AWS Codecommit
➢ Create Terraform files

➢ Automate CI/CD deployment to AWS with Terraform


➢ Create the CI/CD Pipeline using AWS CodePipeline
➢ Create a CodeBuild Role using AWS IAM.
➢ Deploy resources with CodePipeline

➢ More details can be found at


https://www.ioconnectservices.com/insight/aws-with-terraform/

BITS Pilani, Pilani Campus


Agenda for today’s class
➢ Security in the DevOps lifecycle
➢ Delivering Secure Software through Continuous Delivery
➢ Continuous Delivery using AWS CodePipeline
➢ Injecting Security into DevOps
➢ Difference between DevSecOps vs DevOpsSec vs SecDevOps
➢ Authentication vs Authorization
➢ AWS IAM
➢ Security as Code
➢ AWS IAM
➢ Compliance as Code
➢ AWS Config

BITS Pilani, Pilani Campus


Delivering Secure Software through
Continuous Delivery
➢ Continuous Delivery (CD) emphasizes automation, fast feedback loops, and frequent
releases, which are challenging when integrating security due to the complexity and
the potential to slow down the pipeline.

Secure software delivery can be achieved by:

➢ Automated Testing: Integration of security-focused automated tests in the pipeline,


including static analysis (SAST), dynamic analysis (DAST), and dependency scanning.

➢ SAST examples: SonarQube, Fortify SCA, ESLint


➢ DAST examples: OWASP ZAP, Burp Suite

➢ Secured Environments: Using infrastructure-as-code (IaC) to create reproducible and


isolated environments, ensuring that configurations are secure and consistent across
development, testing, and production.

➢ Release Gates: Establishing security checks as mandatory gates in the pipeline, such as
requiring code scanning or penetration testing before deployment.

BITS Pilani, Pilani Campus


Fortify Scan Issue – Example - SQL Injection

MERN stack example:
React – frontend – login page which takes username and password as input from the user: <username>, <password>
Express + Node.js – APIs / endpoints – JS
M – MySQL db

Query template:
select *
from auth
where username = <>
and password = <>;

Typical flow: user = "abc", password = "xyz"
select *
from auth
where username = "abc"
and password = "xyz";

SQL Injection: user = "abc", password = "w\" or 1=1"
select *
from auth
where username = "abc"
and password = "w" or 1=1;
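
The injection works because the password string is concatenated into the SQL text. The usual fix (not shown on the slide) is a parameterized query; a minimal Python sketch, assuming the mysql-connector-python driver and placeholder connection details:

# sql_injection_fix.py - parameterized query keeps "w" or 1=1 a plain literal
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="app",
                               password="***", database="appdb")
cursor = conn.cursor()

username, password = "abc", 'w" or 1=1'   # attacker-style input

# Placeholders are sent separately from the SQL text, so the input cannot rewrite the query
cursor.execute(
    "select * from auth where username = %s and password = %s",
    (username, password),
)
print(cursor.fetchall())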

BITS Pilani, Pilani Campus


Continuous Delivery using AWS
CodePipeline
➢ Step 1: Set Up Your CI/CD Pipeline:
➢ Create a codepipeline and connect your source repository (ex: AWS CodeCommit,
GitHub, etc.) where your application code is stored.

➢ Step 2: Add Build and Test Stages:


➢ Add a Build stage that compiles your code and runs tests.
➢ Ensure that you include tests that must pass before a release can proceed (e.g., unit
tests, integration tests).

➢ Step 3: Implement the Release Gate:


➢ Determine the conditions that must be met for a release to proceed.
➢ This could include:
➢ Successful completion of previous stages (build, tests).
➢ Compliance checks (e.g., adherence to coding standards, security policies).
➢ Manual approval from specific users or groups.

➢ Step 4: Configure Notifications and Monitoring:


➢ CloudWatch Alarms/SNS Notifications:

BITS Pilani, Pilani Campus


Security and Compliance Challenges
and Constraints in DevOps
1) Speed vs. Security: DevOps pushes for rapid releases, often compromising security
reviews that could slow the pipeline.

2) Lack of Security Knowledge: DevOps teams may lack deep security knowledge,
making it hard to implement best practices.

3) Toolchain Complexity: Integrating various tools can expose vulnerabilities in the


toolchain itself.

4) Compliance Requirements: For regulated industries (like finance or healthcare),


compliance frameworks such as HIPAA, GDPR, and SOC2 require stringent
controls, making it challenging to maintain a fast-paced CI/CD process.

➢ To overcome these challenges, DevOps teams integrate Security as Code (i.e., using
code to define security policies), enhance their security skills.

➢ Apply automation to achieve both compliance and security without compromising on


speed.

BITS Pilani, Pilani Campus


Injecting Security into DevOps
1) Shift Security Left: Moving security checks earlier
in the development process, such as in code reviews
and continuous integration.

2) Security Tools in CI/CD: Integrating tools like


static code analysis, vulnerability scanning, and
dependency management within the CI/CD pipeline.

3) Secrets Management: Ensuring that sensitive


information (e.g., API keys, passwords) is managed
securely, using tools like AWS Secrets Manager or
HashiCorp Vault.

4) Real-time Monitoring: Implementing logging,


monitoring, and alerting to detect any security
issues in real time.

5) Automation: Automating security tasks where


possible, reducing the risk of human error while
ensuring continuous security testing.
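
For the secrets-management point above, a minimal sketch of fetching a secret at runtime from AWS Secrets Manager with boto3 (the secret name and region are placeholders):

# read_secret.py - fetch an API key instead of hardcoding it in code or config
import boto3

sm = boto3.client("secretsmanager", region_name="us-east-1")
secret = sm.get_secret_value(SecretId="prod/my-app/api-key")   # placeholder name
api_key = secret["SecretString"]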

BITS Pilani, Pilani Campus


DevSecOps vs DevOpsSec vs
SecDevOps

1) DevOpsSec:
➢ Starts with DevOps practices, with security added as a layer on top, often later in the pipeline.
➢ This approach is more reactive, security as DevOps practices mature

2) SecDevOps:
➢ Emphasizes security as the core foundation, followed by development and operational practices.
➢ This approach places security as the primary focus, even if it impacts speed.

3) DevSecOps:
➢ Integrates security at each stage of DevOps, promoting security as everyone’s responsibility.
➢ Security is treated as a part of development and operations from the start.
➢ In practical terms, DevSecOps is widely adopted due to its balanced approach. Security is embedded into each phase
without overshadowing the goals of DevOps, unlike SecDevOps, which might prioritize security over other objectives

BITS Pilani, Pilani Campus


AWS - IAM

➢ IAM stands for Identity and Access Management.

➢ It is a web service that helps you securely control access to


AWS resources for your users.

➢ Terminology:

➢ IAM Users: root (admin) user , other (non-admin) users

➢ IAM Groups: to manage permissions for multiple users more


efficiently.

➢ IAM Policies: to enforce security best practices and restrict


access.

➢ IAM Roles: to grant permissions to AWS services.


AWS – IAM Users , Policies, Groups

AWS Policies – AWS managed vs custom


AWS – IAM Roles
Authentication
Definition:

process of verifying the identity of a user or system attempting to


access a resource.

Purpose:

It ensures that the entity trying to access the resource is who it


claims to be.

Methods:

Common authentication methods include passwords, biometric


authentication (fingerprint, facial recognition), security tokens,
and multi-factor authentication (MFA).
Authorization
Definition:

process of determining what actions an authenticated user or system


is allowed to perform on a resource.

Purpose:

It enforces access control policies to ensure that only authorized


entities can perform specific actions.

Methods:

Authorization is typically implemented using permissions, roles, and


policies that define the level of access granted to authenticated
entities.
Authentication vs Authorization
Authentication:
Users authenticate using their username and password or access keys
to verify their identity.

Authorization:
After authentication, AWS IAM determines what actions the
authenticated user is allowed to perform based on the policies attached
to their user or group.

Authentication verifies identity, while authorization determines access
rights based on that identity.

Example:
Authentication - A user logging into an application using a username
and password.

Authorization - After logging in, a user is allowed to view, edit, or


delete specific resources based on their permissions
AWS - IAM Best Practices

➢ Principle of Least Privilege:


Assign only the permissions required to perform a task.

➢ Regular Audits:
Regularly review IAM policies and permissions to ensure they are
up-to-date and secure.
Security as Code
➢ Security as Code is a DevOps practice that leverages IaC and security tools to
automate security practices:

1) Static Application Security Testing (SAST): Tools like SonarQube analyze code
for vulnerabilities during development.

2) Dynamic Application Security Testing (DAST): Tools like OWASP ZAP or Burp
Suite run tests against live applications, simulating attacks to detect
vulnerabilities.

3) Dependency Scanning: Tools such as Snyk or Dependabot check for


vulnerabilities in libraries and dependencies.

4) Infrastructure Security: Using IaC tools (like Terraform) with security best
practices baked in (e.g., ensuring IAM roles have the least privilege).

BITS Pilani, Pilani Campus


Security as Code using AWS IAM
➢ AWS Identity and Access Management (IAM) plays a crucial role in implementing
security as code (SaC) practices by providing robust tools and features for
managing permissions, roles, and policies in a programmable manner:

1) Fine-Grained Access Control: IAM allows you to create detailed policies that
define what actions are allowed or denied on specific AWS resources.

2) Infrastructure as Code (IaC) Integration: Tools like AWS CloudFormation or


Terraform enable you to define IAM roles, policies, and permissions as code. This
means you can version control your security configurations alongside your
application code, promoting consistency and ease of management.

3) Role-Based Access Control (RBAC): IAM supports the creation of roles that can
be assumed by AWS services (e.g., EC2, Lambda).

This allows you to securely grant permissions to applications and services without
hardcoding credentials, enhancing security in automated deployments.
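
A minimal sketch of expressing a least-privilege IAM policy as code with boto3 (policy and bucket names are placeholders; in practice the same policy would usually live in CloudFormation or Terraform under version control):

# iam_policy_as_code.py - create a read-only S3 policy
import json
import boto3

iam = boto3.client("iam")

policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::my-app-bucket",      # placeholder bucket
            "arn:aws:s3:::my-app-bucket/*",
        ],
    }],
}

iam.create_policy(
    PolicyName="my-app-s3-read-only",
    PolicyDocument=json.dumps(policy_document),
)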

BITS Pilani, Pilani Campus


Compliance as Code

➢ Compliance as Code applies IaC concepts to ensure compliance requirements are met
automatically through code:
1) Policy-as-Code: Define and enforce policies within IaC. For example, using tools like
Open Policy Agent (OPA) and AWS Config to enforce rules such as all storage
buckets must be encrypted.

2) Automated Auditing: Compliance frameworks (e.g., CIS benchmarks for cloud


services) are enforced programmatically, ensuring configurations comply with
policies and automatically flagging any deviations.

3) Automated Documentation and Reporting: Compliance checks can generate audit-


ready reports, reducing the burden of manual documentation and ensuring up-to-
date compliance.

BITS Pilani, Pilani Campus


Compliance as Code using AWS Config

➢ We can use AWS Config to monitor AWS


resources continuously.

➢ We can define rules to evaluate whether your


AWS resource configurations comply with your
defined policies.

➢ AWS Config can trigger notifications or


automated remediation when resources are non-
compliant.

➢ Managed Rules: AWS Config offers managed rules


for common compliance frameworks.

➢ Ex: AWS Foundations Benchmark.


➢ We can enable these rules to check compliance
automatically.
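
A minimal sketch of enabling one such managed rule as code with boto3 (assumes AWS Config is already recording in the account; the rule name is a placeholder, the source identifier is the managed rule that checks S3 bucket encryption):

# config_rule_as_code.py - enforce "all S3 buckets must be encrypted"
import boto3

config_client = boto3.client("config")

config_client.put_config_rule(
    ConfigRule={
        "ConfigRuleName": "s3-bucket-encryption-enabled",
        "Source": {
            "Owner": "AWS",
            "SourceIdentifier": "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED",
        },
    }
)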

BITS Pilani, Pilani Campus


Agenda for today’s class
Observability and Continuous Monitoring
 Need for observability and continuous monitoring
 What is Observability and Continuous Monitoring?
 Three Pillars of Observability – Logs, Metrics and Traces
 Components of a monitoring stack
 Four Golden Signals of Continuous Monitoring

 Tools for Observability and Monitoring


 Prometheus Setup on Laptop
 Prometheus Service types

BITS Pilani, Pilani Campus


Devops lifecycle - Observability
and Continuous Monitoring

BITS Pilani, Pilani Campus


Need for Continuous Monitoring

(Figure: Organization → App deployed on k8s → Customers)

 Would you find the live issues first or your Customers?

 As systems scale, complexity increases, making it harder to identify and


diagnose issues!

BITS Pilani, Pilani Campus


Observability and Continuous
Monitoring – Real World Example
 Electronic vital sign monitors have been
common in hospitals for more than 40 years

 Small sensors attached to your body carry


information to the monitor.

 Some sensors are patches that stick to your


skin, while others may be clipped on one of
your fingers

 If one of your vital signs rises or falls outside


healthy levels, the monitor will sound a
warning. This usually involves a beeping noise
and a flashing color. Many will highlight the
problem reading in some way.

BITS Pilani, Pilani Campus


What is Observability and
Continuous Monitoring
Continuous Monitoring:
 These signals are called metrics, and the monitoring system helps in tracking
them and alerts the hospital staff whenever some metric is outside the
normal range.
 Continuous monitoring is the ongoing process of collecting, analyzing, and
alerting on data to ensure systems' health and reliability in real-time.

Observability:
 The doctor rushes to the patient where the device is beeping and asks the
nurse to provide health records to diagnose further.
 Observability in DevOps is about understanding the internal state of systems
by gathering telemetry data like metrics, logs, and traces.

BITS Pilani, Pilani Campus


Observability and Continuous
Monitoring – Real World Example2
 Lets say you are using a banking
application for checking your
account balance.

 You get the result after 10 mins,


would that be cool?

 You would instead stop using such


app and give a bad rating and
switch to another app!

(Figure: Client → Backend Server → Database)

 Can banks identify this? – need to


deploy Continuous Monitoring
tools!

 Can they know why the query is


running slow? - need
observability tools!

BITS Pilani, Pilani Campus


Observability and
Continuous Monitoring
 Continuous monitoring is the ongoing process of collecting, analyzing, and
alerting on data to ensure systems’ health and reliability in real-time.

 Observability in DevOps is about understanding the internal state of


systems by gathering metrics that reflect the same.

BITS Pilani, Pilani Campus


Three Pillars of Observability
 The three main pillars of observability are Logs, Metrics, and Traces.

 Each pillar provides a different type of insight into the system's health and
performance:

1) Logs:
 Logs are timestamped records of discrete events that occur within an
application or system.

 They provide detailed context about what happened at a particular time.

 Logs are essential for understanding specific events and errors in detail,
especially when debugging or conducting root-cause analysis.

 Ex1: An error log detailing a failed transaction.


 Ex2: An informational log that records a user’s login activity.

BITS Pilani, Pilani Campus


Types and Levels of Logs

1) System Logs: OS-level events like


kernel messages.

2) Application Logs: Events specific to


application functions.

3) Audit Logs: Track access to


resources and user actions for
compliance.

BITS Pilani, Pilani Campus


Three Pillars of Observability
2) Metrics
 Metrics are numerical representations of data measured over intervals of
time.

 They typically represent system states or resource utilization, like CPU load,
memory usage, or request rate.

 Metrics provide real-time and historical data to identify trends, detect


anomalies, and monitor overall system health and performance.

 Ex1: CPU usage percentage over time.


 Ex2: Number of active users or HTTP requests per second.
 100 active users 2:00PM
 200 active users 3:00PM
 Time Series Data

BITS Pilani, Pilani Campus


Metrics and Its Types
 Metrics measure system performance and are critical for
observability. Common metric types include:

1) System Metrics: CPU, memory, network usage.

2) Application Metrics: Request count, latency, errors.

3) Business Metrics: Conversion rates, revenue, etc.

 What metrics should we use?

BITS Pilani, Pilani Campus


Classification of Metrics
 Metrics are classified as counters, gauges, histograms, and
summaries:

1) Counters: Monotonically increasing values, like a request count.

2) Gauges: Variable metrics that can increase or decrease, such as


memory usage.

3) Histograms: Represent the distribution of values, such as response


times.

4) Summaries: Similar to histograms but can be pre-aggregated.
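
These four classes map directly onto metric objects in Prometheus client libraries. A minimal sketch with the Python prometheus_client package (metric names, values, and the port are illustrative):

# metric_types.py - one example of each metric class, exposed on /metrics
import random
import time

from prometheus_client import Counter, Gauge, Histogram, Summary, start_http_server

REQUESTS = Counter("http_requests_total", "Total HTTP requests")              # counter
IN_PROGRESS = Gauge("inprogress_requests", "Requests currently in flight")    # gauge
LATENCY = Histogram("request_latency_seconds", "Request latency in seconds")  # histogram
PAYLOAD = Summary("payload_size_bytes", "Size of request payloads")           # summary

start_http_server(8000)   # Prometheus can now scrape http://localhost:8000/metrics

while True:
    REQUESTS.inc()
    IN_PROGRESS.set(random.randint(0, 10))
    LATENCY.observe(random.random())
    PAYLOAD.observe(random.randint(100, 1000))
    time.sleep(1)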

BITS Pilani, Pilani Campus


Three Pillars of Observability
3) Traces

 Traces are representations of the flow of a request as it traverses through


various services in a distributed system.

 They record details of each operation in a transaction.

 Traces help visualize the path of a request through multiple services, making
it easier to detect latency, failures, and bottlenecks across the entire request
flow.

 Ex1: A trace showing the execution time for each microservice in a single
user transaction.

 Ex2: A trace highlighting where a request failed in a service chain.

BITS Pilani, Pilani Campus


Components of a Monitoring
Stack
 An effective monitoring stack
typically consists of:

1) Data Collection: Tools like


Prometheus for metrics, Fluentd for
logs.

2) Storage: Long-term data storage,


such as InfluxDB or TimescaleDB, for
historical metrics.

3) Visualization: Tools like Grafana


provide visual dashboards for real-time
monitoring.

4) Alerting: Systems like Alertmanager


and PagerDuty notify relevant teams
when certain thresholds are breached.

BITS Pilani, Pilani Campus


Setting Up Alerts and Alert
Management
 Alerts notify teams when system behavior deviates from
expected parameters.

 Effective alerting helps prevent alert fatigue and ensures


critical issues are noticed promptly.

 Alert Types:

1) Threshold-based: Triggered when metrics exceed set


limits.

2) Anomaly-based: Triggered when unusual patterns are


detected.

BITS Pilani, Pilani Campus


4 Golden Signals of
Continuous Monitoring
 Google came up with 4 Golden Signals that are a standard
approach to monitoring the health of distributed systems and
microservices.

 These Golden Signals help focus on critical metrics to prevent


service degradation, ensuring a more systematic response to issues.

BITS Pilani, Pilani Campus


4 Golden Signals of
Continuous Monitoring
1) Latency: Time between request sent by
client and response sent by server.

2) Traffic: The demand on the system, often


measured in requests per second.

3) Errors: The rate of requests that fail.


Ex: keep track of http status codes – 500.

4) Saturation: The system’s resource


capacity like CPU, RAM, Disk usage. High
saturation indicates that resources may be
nearing full usage.

BITS Pilani, Pilani Campus


Tools for Observability and
Monitoring
 AWS CloudWatch:

 Monitors AWS resources (e.g., EC2, S3) and custom metrics.

 Logs events and alarms for infrastructure health monitoring.

 Prometheus:

 Open-source monitoring and alerting toolkit for collecting metrics.

 It scrapes metrics from instrumented jobs and stores them for analysis.

 Grafana:

 Visualization tool that integrates with Prometheus and other data sources.

 Allows users to create dashboards for real-time insights into system metrics.

BITS Pilani, Pilani Campus


Prometheus Setup
 Download minikube for windows from
https://minikube.sigs.k8s.io/docs/start/?arch=%2Fwindows%2Fx86-64%2Fstable%2F.exe+download

 Start the minikube cluster


 minikube start --driver=docker

 Install helm using choco


 choco install kubernetes-helm

 Add helm chart for prometheus


 helm repo add prometheus-community
https://prometheus-community.github.io/helm-charts
 helm repo update

BITS Pilani, Pilani Campus


Prometheus Setup
 Create a namespace in Kubernetes for Prometheus services:
 kubectl create namespace prometheus

 Installing Prometheus using Helm-charts


 helm install prometheus prometheus-community/prometheus -n
prometheus

 List all pods running in Prometheus namespace


 kubectl get pods -n prometheus

 Forwarding service port 80 to host port 9090 (laptop)


 kubectl port-forward svc/prometheus-server -n prometheus
9090:80

BITS Pilani, Pilani Campus


Prometheus Architecture - 1

BITS Pilani, Pilani Campus


Prometheus Service types
1)Prometheus Server

 Prometheus’s direct scraping of service metrics (e.g.,


CPU, memory, and disk usage) allows teams to
respond quickly to potential issues.

 Ideal for Long-lived, persistent services.

 Ex1: long running batch jobs.

 Ex2: custom applications need to be instrumented


to create metrics.

 To use Prometheus with Python, the most popular


library is prometheus-client.
 pip install prometheus-client

BITS Pilani, Pilani Campus


Prometheus Service types
2) Pushgateway:

 Allows ephemeral or batch jobs to push metrics that would


otherwise be missed between scrapes.

 Ideal for short-lived services, batch jobs, or any workload


where instances exist only briefly, as these types of jobs
might end before Prometheus can scrape them.

 Ex: Build pipelines can push job metrics (e.g., build time,
status) to the Pushgateway when triggered by CI/CD tools
like Jenkins.

 Example timeline (Prometheus scrape interval = 5 seconds):
 T0 -> Prometheus scrapes, finds nothing
 T1 -> a Jenkins job starts and runs for 2 seconds
 T2 -> the job ends before the next scrape, so Prometheus alone cannot
track it; instead the job pushes its metrics via the Pushgateway
(see the sketch below).
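 A minimal sketch of a short-lived job pushing its metrics with the prometheus_client library, assuming a Pushgateway is reachable at localhost:9091 (the job and metric names are illustrative):

from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()

# Record when the batch job last finished successfully
last_success = Gauge(
    "job_last_success_unixtime",
    "Last time the batch job finished successfully",
    registry=registry,
)
last_success.set_to_current_time()

# Push the metric so Prometheus can later scrape it from the Pushgateway
push_to_gateway("localhost:9091", job="nightly_batch", registry=registry)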

BITS Pilani, Pilani Campus


Prometheus Service types
3) Service Discovery

 In modern architectures like


microservices and Kubernetes, services
are constantly scaling up and down,
moving across nodes, or being replaced,
making static configuration impractical.

 Service Discovery automates the process


of tracking these changes, ensuring that
Prometheus always has an updated list
of targets (services and instances) to
monitor without manual intervention.

 Ex: myapp -> the current IP of its pod is discovered automatically.

BITS Pilani, Pilani Campus


Prometheus Service types
4) Alertmanager

 Timely notifications to the DevOps team can ensure faster response, reducing
risks to service availability and data safety.

 Ex: For healthcare applications, Alertmanager sends alerts for high-priority


incidents, like API response errors in patient-data handling services.

5) Federation

 Each region has a dedicated Prometheus instance, and Federation


consolidates essential metrics (e.g., response time, error rates) into a global
view.

 Ex: Aggregation in Large Enterprise Environments where there are


Distributed services across regions.
BITS Pilani, Pilani Campus
Prometheus Service types
6) Exporters

 Not all applications or systems expose metrics in the format that Prometheus requires.

 Exporters serve as a bridge, translating native system metrics into a
format that Prometheus understands (a minimal Python exporter sketch
follows this list).

 Examples of Common Prometheus Exporters:


 a) Node Exporter: Collects and exposes system metrics (CPU, memory, disk I/O) from
Linux and other Unix-based systems.

 b) Database Exporters: Specialized exporters for databases, like MySQL Exporter ,


PostgreSQL Exporter, MongoDB Exporter etc.

7) Web UI

 The Prometheus web UI is a built-in user interface that allows users to explore and
interact with the metrics data collected by Prometheus.

 It offers several functionalities, including running queries, visualizing metrics, checking


alert statuses, and monitoring the health of Prometheus components
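 A minimal sketch of the custom exporter mentioned above under Exporters, written with the prometheus_client library (the port and metric name are illustrative); it exposes a /metrics endpoint that the Prometheus server can scrape:

import random
import time
from prometheus_client import Gauge, start_http_server

# Hypothetical metric exposed by this exporter
QUEUE_DEPTH = Gauge("myapp_queue_depth", "Current depth of the work queue")

if __name__ == "__main__":
    start_http_server(8001)  # serves http://localhost:8001/metrics
    while True:
        QUEUE_DEPTH.set(random.randint(0, 50))  # stand-in for a real measurement
        time.sleep(5)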

BITS Pilani, Pilani Campus


Prometheus Architecture - 2

BITS Pilani, Pilani Campus


Agenda for today’s class
Observability and Continuous Monitoring
➢ Prometheus Client
➢ Setup in Python
➢ Metric Types
➢ Demo
➢ Monitoring in Kubernetes
➢ Metrics in Kubernetes – cluster health, deployment, container, application, runtime
➢ Analysing and alerting on Metrics

MLOps
➢ What is a ML model?
➢ Introducing MLOps – definition, challenges, scale
➢ People of MLOps
➢ Key MLOps features
➢ Containerization of ML Models end to end
➢ Deploying to Production - CI/CD pipelines, ML artifacts, strategies, containerization
➢ MLOps in Practice: Marketing Recommendation Engines case study

BITS Pilani, Pilani Campus


Prometheus Client Setup in
Python
➢ Prometheus server can scrape service metrics exposed on /metrics endpoint.

➢ Custom applications need to be instrumented to create metrics.

➢ To use Prometheus with Python, the most popular library is prometheus-


client.
➢ pip install prometheus_client

Metric Types:

➢ In the prometheus_client Python library, the primary metric types are


Counter, Histogram, and Gauge, each serving different purposes for
monitoring and instrumentation.

BITS Pilani, Pilani Campus


Prometheus Client - Metric
Types - Counter
1. Counter:
➢ A Counter is a cumulative metric that only increases and doesn’t decrease (except when reset to
zero).
➢ It is used to count discrete events like number of requests served, tasks completed, or errors
occurred etc.

➢ syntax:
Counter(metric name, help text, labelnames=(), namespace='', subsystem='', unit='')

➢ Ex:
from prometheus_client import Counter

# Create a counter to track requests


request_counter = Counter('http_requests_total', 'Total number of HTTP requests')

# Increment the counter by 1


request_counter.inc()

# Increment the counter by a specific value


request_counter.inc(5)

BITS Pilani, Pilani Campus


Prometheus Client - Metric
Types - Histogram
2. Histogram:
➢ A Histogram measures the distribution of a set of continuous values, such as request durations or
response sizes.
➢ It divides these values into configurable buckets and tracks counts for each bucket.
➢ Includes default buckets, but custom buckets can be defined.

➢ syntax:
Histogram(metric name, help text, labelnames=(), namespace='', subsystem='', unit='', buckets=(0.005, 0.01,
0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10))

➢ Ex:
from prometheus_client import Histogram

request_duration = Histogram('http_request_duration_seconds',
                             'Histogram of HTTP request durations in seconds',
                             labelnames=['method', 'endpoint'],
                             buckets=(0.1, 0.5, 1.0, 2.5, 5.0))

# Record a duration of 1.5 seconds for a GET request to the /home endpoint
request_duration.labels(method='GET', endpoint='/home').observe(1.5)

BITS Pilani, Pilani Campus


Prometheus Client - Metric
Types - Gauge
3. Gauge:
➢ A Gauge is a metric that represents a single numerical value that can arbitrarily go up or down.
➢ Can be used to track values that can increase and decrease, such as current memory usage,
number of active threads, or CPU utilization.

➢ syntax:
Gauge(metric name, help text, labelnames=(), namespace='', subsystem='', unit='')

➢ Ex:
from prometheus_client import Gauge

# Create a gauge to track memory usage


memory_usage = Gauge('memory_usage_bytes', 'Current memory usage in bytes')

# Set the gauge to a specific value


memory_usage.set(12345678)

# Increment the gauge


memory_usage.inc(5000)

# Decrement the gauge


memory_usage.dec(2000)

BITS Pilani, Pilani Campus


Instrumentation Example -
Prometheus Client
1) Create a new python file named main.py and define a fastapi app and metrics

import time

import psutil
from prometheus_client import Counter, Gauge, Histogram, generate_latest
from fastapi import FastAPI, Request, Response

app = FastAPI()

# metrics definitions

# Traffic (request count)


REQUEST_COUNT = Counter("fastapi_request_count", "Total number of requests")

# Latency (request latency)


REQUEST_LATENCY = Histogram("fastapi_request_latency_seconds", "Request latency")

# Errors (count of error responses)


ERROR_COUNT = Counter("fastapi_error_count", "Total number of error responses")

# Saturation (e.g., CPU utilization as a Gauge)


CPU_UTILIZATION = Gauge("fastapi_cpu_utilization", "CPU utilization in percentage")

BITS Pilani, Pilani Campus


Instrumentation Example -
Prometheus Client
2) Define Middleware to collect Prometheus metrics for each request.

➢ Add prometheus_middleware to process metrics for every request.


➢ This ensures metrics are collected regardless of which endpoint is accessed.

@app.middleware("http")
async def prometheus_middleware(request: Request, call_next):

# Start timer
start_time = time.time()

# Process the request


response = await call_next(request)

# Update metrics
REQUEST_COUNT.inc() # Increment request count
if response.status_code >= 500:
ERROR_COUNT.inc() # Increment error count for 5xx responses

latency = time.time() - start_time

REQUEST_LATENCY.observe(latency) # Record latency


CPU_UTILIZATION.set(psutil.cpu_percent(interval=None)) # Update CPU utilization

return response

BITS Pilani, Pilani Campus


Instrumentation Example -
Prometheus Client
3) Define metrics end point which is by default: /metrics

@app.get("/metrics")
async def metrics():

return Response(content=generate_latest(), media_type="text/plain")

4) Create Prometheus's configuration file (prometheus.yml)

global:
scrape_interval: 5s # How often to scrape targets by default

scrape_configs:
- job_name: "fastapi_app"
static_configs:
- targets: ["localhost:8000"]

5) Start Prometheus server as

prometheus --config.file=prometheus.yml
Prometheus will periodically scrape metrics from http://localhost:8000/metrics

6) Access App and Metrics:

Application root: http://localhost:8000/greet

Prometheus metrics: http://localhost:8000/metrics

BITS Pilani, Pilani Campus


Monitoring in Kubernetes
➢ Kubernetes introduces unique challenges for monitoring due to its
distributed nature.

➢ In Kubernetes, observability focuses on metrics at multiple levels:

1) Cluster Health: Monitor the control plane, nodes, and network


components.

2) Deployment Health: Check replica counts, rollout status, and service


availability.

3) Container Health: Focus on container resource usage (CPU, memory).

4) Application Health: Service-level metrics, error rates, response times.

BITS Pilani, Pilani Campus


Metrics in Kubernetes
➢ Kubernetes metrics are generally categorized as follows:

1) Cluster Health Metrics: Availability and performance of nodes and control


plane components (API server, etcd).

2) Deployment Metrics: Status of Pods and Deployments, including replica


counts and rollout status.

3) Container Metrics: CPU and memory usage, restarts, and liveness/readiness


probe results.

4) Application Metrics: Business and service metrics related to application


functionality.

5) Runtime Metrics: Resource usage over time to detect anomalies and trends.

BITS Pilani, Pilani Campus


Example Workflow: Observability
and Monitoring Setup
➢ Collect Metrics:

➢ Set up Prometheus on a Kubernetes cluster to scrape metrics from the


Kubernetes API server, containers, and applications.

➢ Use various service types from Prometheus for various types of services we are
running.

➢ Store and Visualize Metrics:

➢ Integrate Prometheus with Grafana for visualizing cluster health metrics.

➢ Set up Grafana dashboards for Golden Signals (latency, traffic, errors,


saturation).

BITS Pilani, Pilani Campus


Example Workflow: Observability
and Monitoring Setup

➢ Alerting:

➢ Define Prometheus alert rules for critical metrics (e.g., high latency
or error rates).

➢ Configure Alertmanager to send notifications based on these rules.

➢ Continuous Monitoring:

➢ Use CloudWatch for monitoring AWS resources if running


Kubernetes on EKS.

➢ Set up CloudWatch alarms to notify based on specific criteria (e.g.,


node failures).

BITS Pilani, Pilani Campus


What is a Machine Learning
Model?
➢ An ML model is a mathematical representation that learns patterns from
historical data to make predictions or decisions.

➢ It’s created through training, where algorithms use labeled data to identify
features that contribute to accurate predictions.

BITS Pilani, Pilani Campus


What is a Machine Learning
Model?
Traditional Algo:
i/p: x = [1,2,3] – given to the system
Program: y = x + 2 – given to the system
o/p: y = [3,4,5] – the system gives this out

ML Algo (e.g., linear regression):
i/p: x = [1,2,3] and o/p: y = [3,4,5] – given to the system
The system gives out the form: y = mx + c
Training data: [1,3], [2,4], [3,5]
Test data: [8, 10] => 10
The machine learns and comes up with m = 1, c = 2
(or an approximate fit such as y = x + 1.8)
Predict: x = 4 => y = 6
BITS Pilani, Pilani Campus
Key components of an ML
model
➢ Algorithm: The underlying mathematical framework used to build the
model (e.g., linear regression, decision trees).

➢ Features: Input variables that the model uses to make predictions.

➢ Parameters: Values that the model optimizes during training to fit the
data.

➢ Hyperparameters: Configurations set before training to influence the


learning process (e.g., learning rate, batch size).
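➢ A small hedged sketch of the distinction between hyperparameters and parameters, assuming scikit-learn:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# C and max_iter are hyperparameters: chosen before training
model = LogisticRegression(C=0.5, max_iter=500)
model.fit(X, y)

# coef_ and intercept_ are parameters: values learned from the data
print("Learned coefficients:", model.coef_)
print("Learned intercept:", model.intercept_)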

BITS Pilani, Pilani Campus


ML Artifacts
➢ Models: The trained ML models themselves.

➢ Datasets: The data used for training and validation.

➢ Code Artifacts: Scripts and code for preprocessing, training,


and inference.

➢ Metadata: Information about experiments, hyperparameters,


and training configurations.

BITS Pilani, Pilani Campus


MLOps
➢ MLOps stands for Machine Learning Operations.

➢ MLOps is a set of practices combining machine learning, DevOps, and data


engineering to automate and enhance the deployment, monitoring, and
lifecycle management of ML models in production.

➢ It focuses on integrating ML systems with software engineering principles


to enable seamless collaboration between data scientists, ML engineers,
and DevOps teams.

BITS Pilani, Pilani Campus


Agenda for today’s class
MLOps
➢ Introducing MLOps – definition, challenges, scale
➢ People of MLOps
➢ Key MLOps features
➢ Containerization of ML Models end to end
➢ DVC

DataOps
➢ What is DataOps?
➢ Data Pipeline – Environment and Orchestration
➢ Organizing for DataOps – Team Structure

Deploying to Production - CI/CD pipelines


MLOps in Practice: Marketing Recommendation Engines case study

Multi Cloud IaC - Example


End Sem Question Paper format and Topics
BITS Pilani, Pilani Campus
MLOps
➢ MLOps stands for Machine Learning Operations.

➢ MLOps is a set of practices combining machine learning, DevOps, and data


engineering to automate and enhance the deployment, monitoring, and
lifecycle management of ML models in production.

➢ It focuses on integrating ML systems with software engineering principles


to enable seamless collaboration between data scientists, ML engineers,
and DevOps teams.

BITS Pilani, Pilani Campus


Challenges in MLOps
➢ Complex Workflows: ML involves diverse workflows, including data
processing, model training, testing, and deployment.

➢ Model and Data Drift: Over time, models can lose accuracy due to changes
in data distribution (data drift) or evolving business contexts (concept
drift).

➢ Scalability: Scaling model training, deployment, and serving is challenging,


especially with large datasets. Large-scale MLOps implementations involve
multiple pipelines for feature engineering, model training, and real-time
inference serving.

➢ Collaboration: MLOps requires coordination among multiple teams with


different expertise.

➢ Monitoring: Continuous monitoring of ML models is necessary to ensure


they perform as expected over time.

BITS Pilani, Pilani Campus


People of MLOps
➢ The key stakeholders in MLOps are:

1) Product Managers: They define the business requirements, monitoring


model performance and ensuring alignment with business goals

2) Data Scientists: They develop and experiment with ML models, focusing on


model accuracy and feature engineering.

3) ML Engineers: They handle model optimization, deployment, and ensure


models can scale and meet latency requirements.

4) Data Engineers: They prepare and manage the data infrastructure,


ensuring high-quality and consistent data for ML models.

5) DevOps Engineers: They ensure the infrastructure is automated, scalable,


and secure, enabling the seamless deployment of ML pipelines.

BITS Pilani, Pilani Campus


Key MLOps Features
➢ Automated ML Pipelines: Automating data preprocessing, model training,
and deployment pipelines to streamline the workflow.

➢ Continuous Integration/Continuous Deployment (CI/CD): Automated


workflows for integrating code changes, testing, and deploying ML models.

➢ Model Versioning and Tracking: Version control for ML models, data, and
experiments for reproducibility and traceability.

➢ Monitoring and Retraining: Tools to monitor model performance in


production and trigger retraining when performance degrades.

➢ Data and Model Governance: Ensuring compliance with data privacy and
governance policies, managing data lineage, and access control.

➢ Experiment Management: Tracking experiments to optimize


hyperparameters, algorithms, and preprocessing steps efficiently.

BITS Pilani, Pilani Campus


Types of ML Algorithms
1) Supervised Learning ( y is given)
Ex: predicting revenue for next year using historical data

2) Unsupervised Learning ( y is not given)


Ex: Customer segmentation using clustering – find patterns in the data.
For instance, for a product like a t-shirt, what sizes should be offered?
Clustering customers into K = 3 groups (G1, G2, G3) suggests the sizes
S, M, L.

3) Reinforcement Learning
Ex: self driving car

BITS Pilani, Pilani Campus


Containerization of ML Models
➢ Docker and Kubernetes are commonly used to package ML models with
dependencies, ensuring that they run consistently across environments.

➢ Containerization helps in scaling models across multiple nodes and


allows easy rollback in case of model failure.

➢ The following are the steps of containerizing and deploying an ML algo.

1) Train the Model


2) Save the Model
3) Create a Prediction Script (ML algo)
4) Create a Dockerfile
5) Build and Run the Docker Container
6) Optimize the Container
7) Deploy the container
8) Monitor and Scale

BITS Pilani, Pilani Campus


Containerization of ML Models
1) Train the Model:
➢ Train model using any ML library (e.g., TensorFlow, PyTorch, scikit-learn).

2) Save the Model:


➢ Export the trained model in a suitable format:
➢ Ex:
TensorFlow: .h5 or SavedModel format.
PyTorch: .pth or .pt.
Scikit-learn: Use joblib or pickle.

Code:
import joblib
model = ... # Train your model
joblib.dump(model, 'model.pkl')
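➢ A more complete, hedged sketch of steps 1 and 2, assuming scikit-learn and a synthetic three-feature dataset so that it lines up with the prediction script in the next step:

import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Step 1: train the model on a synthetic 3-feature dataset
X, y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))

# Step 2: save the trained model so it can be packaged into the container
joblib.dump(model, "model.pkl")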

BITS Pilani, Pilani Campus


Containerization of ML Models
3) Create a Prediction Script (ML algo)
➢ Create a Python script to load the model and expose an API for predictions.
➢ Use a framework like FastAPI or Flask.

from fastapi import FastAPI, HTTPException
import joblib
import numpy as np
from pydantic import BaseModel

app = FastAPI()

# Load the model
model = joblib.load("model.pkl")

# Define the input schema
class InputData(BaseModel):
    feature1: float
    feature2: float
    feature3: float

@app.post("/predict")
def predict(data: InputData):
    try:
        # Convert input to a NumPy array
        input_data = np.array([[data.feature1, data.feature2, data.feature3]])
        prediction = model.predict(input_data)
        return {"prediction": prediction.tolist()}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

BITS Pilani, Pilani Campus


Containerization of ML Models
4) Create a Dockerfile
➢ Create a dockerfile that defines the container's environment and how to
run the application.

# Use an official Python runtime as the base image


FROM python:3.9-slim

# Set the working directory


WORKDIR /app

# Copy the local files to the container


COPY . .

# Install dependencies
RUN pip install --no-cache-dir fastapi uvicorn joblib numpy scikit-learn

# Expose the API port


EXPOSE 8000

# Command to run the application


CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

BITS Pilani, Pilani Campus


Containerization of ML Models
5) Build and Run the Docker Container

Build the Docker Image:


docker build -t ml-model-api .

Run the Container:


docker run -d -p 8000:8000 ml-model-api

Test the API:


Use a tool like curl or Postman to test the API:

curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"feature1": 1.5, "feature2": 2.3, "feature3": 3.7}'

BITS Pilani, Pilani Campus


Containerization of ML Models
6) Optimize the Container

a) Reduce Image Size: Use a lightweight base image like python:3.9-alpine.

b) Use Multi-stage Builds: Split the Dockerfile into build and runtime stages
to exclude unnecessary build dependencies.

c) Add a .dockerignore File: Exclude unnecessary files (e.g., training data,


logs) from the image.

*.pyc
__pycache__/
*.log
data/

BITS Pilani, Pilani Campus


Containerization of ML Models
7) Deploy the Container

➢ Deploy the container to a production environment:

a) Docker Compose: Orchestrate multi-container deployments.

b) Kubernetes: For scalable, production-grade deployments.

c) Cloud Services:
➢ Use AWS ECS, Azure AKS, or Google Kubernetes Engine (GKE) for
container orchestration.
➢ Use AWS SageMaker or Google AI Platform for model hosting.

BITS Pilani, Pilani Campus


Containerization of ML Models
8) Monitor and Scale

a) Monitoring:
➢ Integrate monitoring tools like Prometheus and Grafana to track
container metrics.
➢ Use application performance monitoring (APM) tools for API response
time and error tracking.

b) Scaling: Use horizontal scaling (multiple replicas) with Kubernetes or a


cloud service to handle high traffic.

BITS Pilani, Pilani Campus


Key components of MLOps
CI/CD Pipeline
1) Continuous Integration (CI): Incorporates code versioning, testing, and
experiment tracking, ensuring that model code is reproducible and
compliant with quality standards.

a) Version Control for Models and Data:


➢ Implementing version control (using tools like Git + DVC) ensures that
both models and datasets are tracked over time.

➢ This enables the team to reproduce experiments, compare different


versions of the models, and manage changes in datasets, which is crucial
for maintaining model reliability and reproducibility.

➢ This contributes to model stability and continuous improvement.

➢ This is a well known devops practice called Data Versioning and Lineage
Tracking

BITS Pilani, Pilani Campus


DVC
➢ DVC (Data Version Control) is an open-source version control
system specifically designed for machine learning (ML) projects.

➢ It complements Git by handling large datasets, model artifacts,


and other ML-specific assets that Git struggles with.

1. Initialize a Project with Git and DVC:


# Initialize Git and DVC
git init
dvc init

# Add DVC to Git tracking


git add .dvc .gitignore
git commit -m "Initialize Git and DVC"

BITS Pilani, Pilani Campus


DVC
2. Add a Dataset
dvc add data/train.csv

# This creates a .dvc file pointing to the dataset


git add data/train.csv.dvc
git commit -m "Add dataset to DVC"

➢ DVC moves data/train.csv to a cache directory. A .dvc file is


created, which contains metadata (like hash, size, etc.).

➢ The dataset itself is excluded from Git tracking (avoiding Git's size
limits).

BITS Pilani, Pilani Campus


DVC
3. Push Dataset to Remote Storage:

➢ We can Configure a remote storage (e.g., AWS S3, Google Drive,


or a local server).

dvc remote add -d myremote s3://mybucket/path

dvc push

➢ The dataset (data/train.csv) is uploaded to the remote storage.

➢ The .dvc file stays in the Git repo, acting as a pointer
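➢ A hedged sketch of reading that versioned dataset from Python via dvc.api (the repository URL and revision are placeholders; assumes dvc and pandas are installed):

import dvc.api
import pandas as pd
from io import StringIO

# Read the dataset version that the .dvc pointer refers to
data = dvc.api.read(
    "data/train.csv",
    repo="https://github.com/example/ml-project",  # placeholder repo URL
    rev="v1.0",                                    # tag, branch, or commit
)
df = pd.read_csv(StringIO(data))
print(df.head())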

BITS Pilani, Pilani Campus


Key components of MLOps
CI/CD Pipeline
2) Automated Model Testing and Validation:

➢ Establishing automated testing pipelines (unit tests, integration tests,


and performance validation) ensures that models meet performance
and accuracy standards before deployment.

 By automatically validating model performance, you can avoid errors in
production that could negatively affect user experience or lead to
system downtime (a minimal test sketch follows this list).

3) Continuous Deployment (CD):

➢ Ensures automated deployment of models, with proper checks to


transition models from development to production seamlessly.
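➢ A minimal sketch of an automated model validation test that a CI pipeline could run with pytest (the accuracy threshold, file name, and dataset are illustrative; assumes scikit-learn and joblib):

import joblib
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

ACCURACY_THRESHOLD = 0.85  # illustrative quality gate

def test_model_meets_accuracy_threshold():
    # Load the candidate model produced by the training stage
    model = joblib.load("model.pkl")

    # Hold-out data; a real pipeline would use a versioned test set
    X, y = make_classification(n_samples=500, n_features=3, n_informative=3,
                               n_redundant=0, random_state=42)
    _, X_test, _, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    accuracy = model.score(X_test, y_test)
    assert accuracy >= ACCURACY_THRESHOLD, f"accuracy {accuracy:.2f} below gate"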

BITS Pilani, Pilani Campus


DataOps
➢ DataOps stands for Data Operations.

➢ DataOps is a collaborative data management practice focused on


improving the quality, speed, and reliability of data-related operations.

➢ It applies principles of Agile, DevOps, and lean manufacturing to data


analytics and engineering, enabling organizations to deliver data-driven
insights quickly and efficiently.

BITS Pilani, Pilani Campus


Key components of DataOps
CI/CD Pipeline
1. Automated Data Validation:
➢ Implement scripts or tools to validate incoming data for missing values,
schema mismatches, or outliers.

 Ensures the input data to the ML algorithm is clean and consistent
(see the sketch after this list).

 Example Tools: Talend, Apache Griffin, Deequ.

2. Data Lineage and Provenance:


➢ Track the origin and transformations of data to ensure transparency
and reliability.

➢ Helps in debugging issues and understanding the data flow through the
pipeline.

➢ Example Tools: Informatica Enterprise Data Catalog, Microsoft Purview

BITS Pilani, Pilani Campus


Key components of DataOps
CI/CD Pipeline
3. Incremental Data Processing:
➢ Process only the new or changed data instead of reprocessing the entire
dataset.

➢ Improves efficiency and ensures that only new data is used for periodic
model retraining, reducing computational overhead.

➢ Example Tools:

1) Apache Kafka - Supports incremental processing by allowing consumers


to read only new data.

2) Delta Lake - Handles incremental data processing with features like


MERGE INTO for upserts.
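➢ A hedged sketch of watermark-based incremental processing in plain Python and pandas (paths and column names are illustrative); only rows newer than the last processed timestamp are handled on each run:

import json
import pathlib
import pandas as pd

STATE_FILE = pathlib.Path("state.json")  # remembers the high-water mark

def load_watermark():
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())["last_processed"]
    return "1970-01-01T00:00:00"

def process_increment(events_path="data/events.csv"):
    df = pd.read_csv(events_path, parse_dates=["event_time"])

    # Keep only records newer than the watermark
    new_rows = df[df["event_time"] > pd.Timestamp(load_watermark())]
    if new_rows.empty:
        return

    # ... transform / load new_rows here ...

    # Advance the watermark so the next run skips these rows
    STATE_FILE.write_text(json.dumps(
        {"last_processed": new_rows["event_time"].max().isoformat()}))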

BITS Pilani, Pilani Campus


MLOps vs DataOps
Definition
 MLOps: Practices for deploying, monitoring, and maintaining ML models
in production.
 DataOps: Practices for managing data workflows, ensuring data quality,
and facilitating collaboration.

Versioning
 MLOps: machine learning models, their parameters, hyperparameters, and
the associated code.
 DataOps: datasets, their schemas, transformations, and data pipelines.

Key Components
 MLOps: Model versioning, retraining, deployment, monitoring, CI/CD for
ML pipelines.
 DataOps: Data versioning, data quality checks, lineage, and pipeline
automation.

Tools
 MLOps: MLflow, Kubeflow, TFX, SageMaker, Azure ML.
 DataOps: DVC, Apache NiFi, Talend, Great Expectations, dbt, Airflow.

BITS Pilani, Pilani Campus


MLOps in Practice:
Deployment Strategies
Shadow Deployment:
➢ Running the new model in the background without affecting the
production model, used for testing performance.

➢ new version passively receives a copy of real user requests to process in


parallel with the current version.

Canary Deployment:
➢ Gradually rolling out the model to a small percentage of users before a
full rollout.

A/B Testing:
➢ Deploying multiple models to compare their effectiveness based on real-
world feedback.

BITS Pilani, Pilani Campus


MLOps in Practice:
Deployment Strategies
Blue Green Deployment:
➢ One environment (Blue) runs the current production version, while the
new version is deployed to the other environment (Green).

➢ Traffic is then switched from Blue to Green once the new version is
validated.

Distributed Model Training and Inference:


➢ Using distributed computing frameworks (e.g., TensorFlow, PyTorch)
allows training the model on large datasets across multiple machines.

➢ For inference, deploying the model in a distributed manner across


multiple instances or microservices ensures that the system can process
high volumes of data in real time without performance degradation.

BITS Pilani, Pilani Campus


MLOps in Practice: Marketing
Recommendation Engines Case Study
➢ A marketing recommendation engine suggests products or content to
users based on their preferences and behaviors.

➢ Using MLOps principles can significantly improve the efficiency and


scalability of this use case.

➢ Steps in the MLOps Pipeline for a Recommendation Engine:

1) Data Collection and Processing:


➢ Collect data on user behavior, product views, and purchases.
➢ Preprocess the data using tools like Apache Spark or Pandas.

2) Model Training and Experimentation:


➢ Train collaborative filtering or neural network-based recommendation
models.
➢ Track experiments, hyperparameters, and results using tools like
MLflow or Weights & Biases.

BITS Pilani, Pilani Campus


MLOps in Practice: Marketing
Recommendation Engines Case Study
3) Model Validation and Testing:
➢ Evaluate models using validation metrics like mean squared
error or mean average precision.
➢ Conduct A/B testing to assess model effectiveness in real-world
scenarios.

4) Continuous Integration and Deployment:


➢ Package the model using Docker and deploy it to a staging
environment.
➢ Run tests for data quality, model accuracy, and latency
requirements.
➢ Use Kubernetes for scalable deployment and model
management.

BITS Pilani, Pilani Campus


MLOps in Practice: Marketing
Recommendation Engines Case Study
5) Monitoring and Retraining:
➢ Continuously monitor the model’s performance using tools like
Prometheus and Grafana.
➢ Set up alerts for model drift and trigger automatic retraining
pipelines using AWS Sagemaker or Google AI Platform.

6) Feedback and Improvement:


➢ Collect feedback from marketing teams and customers to refine
the model.
➢ Integrate user feedback into retraining datasets to improve
personalization.

BITS Pilani, Pilani Campus


Multi Cloud IaC - Example
➢ Let us create the following infrastructure using terraform:

1) A Serverless Lambda Function on GCP

2) A Virtual Machine on Azure

3) Integrate Lambda (GCP) and VM (Azure)

BITS Pilani, Pilani Campus


1) A Serverless Lambda
Function on GCP
➢ Google Cloud offers Cloud Functions for serverless functions, which are similar
to AWS Lambda.

➢ Official documentation of google terraform providers -


https://github.com/hashicorp/terraform-provider-google

High Level Steps:


1) Set up GCP provider
2) Storage Bucket Creation
3) Upload Source Code
4) Deploy Cloud Function
5) HTTP Trigger Setup

Step 1: Set up GCP provider:


provider "google" {
project = "project-id"
region = "us-east1"
}

BITS Pilani, Pilani Campus


1) A Serverless Lambda
Function on GCP
main.py
from fastapi import FastAPI
from mangum import Mangum

app = FastAPI()
@app.get("/")
def greet():
return {"message": "Hello, FastAPI on Google Cloud Functions!"}

# ASGI-compatible entry point for Google Cloud Functions


handler = Mangum(app)

requirements.txt:
fastapi==0.95.1
mangum==0.14.0
uvicorn==0.22.0
functions-framework==3.3.0

BITS Pilani, Pilani Campus


1) A Serverless Lambda
Function on GCP
Step 2: Create a Google Cloud Storage bucket to store the source code of the
Cloud Function:
resource "google_storage_bucket" "ssd_gcp_bucket" {
name = "function-source-bucket"
location = "US"
}
This resource creates a Google Cloud Storage bucket named ssd_gcp_bucket to store the
source code of the Cloud Function.

Step 3: Create a Google Cloud Storage bucket object for the source code zip of the Cloud
Function:
resource "google_storage_bucket_object" “ssd_gcp_lambda_code_object" {
bucket = "source.zip"
source = "path_to_source.zip"
}

This resource uploads source.zip into the bucket ssd_gcp_bucket.

BITS Pilani, Pilani Campus


1) A Serverless Lambda
Function on GCP
➢ Mangum converts the FastAPI app into an ASGI-compatible handler that
Google Cloud Functions can invoke.

➢ ASGI frameworks like FastAPI, Starlette, and Django with Channels are asynchronous
and optimized for modern web applications.

➢ However, platforms like AWS Lambda and Google Cloud Functions expect specific
synchronous HTTP interfaces. Mangum bridges this gap, enabling ASGI applications to
run seamlessly in these serverless environments.

➢ Create a source.zip that contains


├── main.py # FastAPI application code
├── requirements.txt # Dependencies

BITS Pilani, Pilani Campus


1) A Serverless Lambda
Function on GCP
Step 4: Create a Google Cloud Function:
resource "google_cloudfunctions_function" “ssd_gcp_lambda_function" {
name = " ssd_gcp_lambda_function "
runtime = "python311"
entry_point = "app“ # fastapi entry point

source_archive_bucket = google_storage_bucket.ssd_gcp_bucket
source_archive_object = google_storage_bucket_object. ssd_gcp_lambda_code_object.bucket

trigger_http = true
}

➢ A public HTTPS endpoint is automatically generated for the function


Ex: https://REGION-PROJECT_ID.cloudfunctions.net/ssd_gcp_lambda_function

BITS Pilani, Pilani Campus


2) A Virtual Machine on
Azure
➢ Azure provides the azurerm provider for provisioning virtual machines.

➢ Link: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs

Step 1: Set up Azure provider


provider "azurerm" {
features {}
}

Step 2: Create the Virtual Machine


resource "azurerm_linux_virtual_machine" “ssd_ma_vm" {
name = " ssd_ma_vm "
location = "East US"

admin_username = "adminuser"
admin_password = "AdminPassword123!"
network_interface_ids = [azurerm_network_interface.nic.id]}}

BITS Pilani, Pilani Campus


3) Integrate Lambda (GCP)
and VM (Azure)
➢ On the Azure VM, we can write a script to make an HTTP request to the Cloud Function URL.

Step1: Create a script to invoke google cloud function:


invoker.sh
#!/bin/bash
curl -X GET "https://REGION-PROJECT_ID.cloudfunctions.net/ssd_gcp_lambda_function"

➢ We can provision this script on the VM using Terraform's azurerm_linux_virtual_machine_extension to


run the script upon VM creation.

Step 2: Run the script on the VM using a custom script extension:


resource "azurerm_virtual_machine_extension" "init_script" {
name = "init-script"
virtual_machine_id = azurerm_linux_virtual_machine.ssd_ma_vm.id
publisher = "Microsoft.Azure.Extensions"
type = "CustomScript"
type_handler_version = "2.0"

settings = jsonencode({
script = filebase64("./invoker.sh")
})
}

➢ This will run the script on the Azure VM after it is provisioned, which will call the GCP Cloud Function.

BITS Pilani, Pilani Campus


Cloud native applications:-
Now, let me provide a detailed explanation of cloud native applications, referencing the
visual elements in the diagram:

1. Definition:
Cloud native applications are software applications designed specifically to take advantage
of cloud computing frameworks, which typically consist of microservices packaged in
containers, deployed as part of a continuous delivery process, and managed on elastic
infrastructure through agile DevOps processes and tools.

2. Key Characteristics:

a) Microservices Architecture:
- As shown in the top-left of the diagram, cloud native apps are built using a microservices
architecture.
- Each microservice (Service A, B, C) is a small, independent module focused on a specific
business capability.
- This approach allows for easier development, testing, and maintenance of individual
components.

b) Containerization:
- Illustrated in the top-right, containerization is a key aspect of cloud native applications.
- Containers package the application code along with its dependencies, ensuring
consistency across different environments.
- Popular containerization technologies include Docker and containerd.

c) Orchestration:
- Shown in the middle-left, orchestration tools like Kubernetes manage the deployment,
scaling, and operation of application containers.
- They handle tasks such as load balancing, service discovery, and rolling updates.

d) CI/CD Pipeline:
- The middle-right of the diagram shows a typical Continuous Integration/Continuous
Deployment (CI/CD) pipeline.
- This automated process includes stages for coding, building, testing, and deploying
applications.
- It enables frequent and reliable releases of application changes.

e) Scalability:
- Depicted in the bottom-left, cloud native apps are designed to scale horizontally.
- They can quickly spawn new instances to handle increased load and scale down when
demand decreases.

f) Observability:
- The bottom-right shows the three pillars of observability: logging, metrics, and tracing.
- These provide deep insights into application behavior and performance in distributed
systems.

3. Benefits of Cloud Native Applications:


- Agility: Faster development and deployment cycles.
- Scalability: Easy to scale up or down based on demand.
- Resilience: Built-in fault tolerance and self-healing capabilities.
- Portability: Can run consistently across different cloud environments.
- Cost-efficiency: Pay only for the resources you use.

4. Challenges:
- Complexity: Managing distributed systems can be challenging.
- Security: Increased attack surface due to distributed nature.
- Monitoring: Requires sophisticated observability tools and practices.
- Cultural shift: Requires changes in development and operational practices.

5. Technologies and Tools:


- Containers: Docker, containerd
- Orchestration: Kubernetes, Docker Swarm
- Service Mesh: Istio, Linkerd
- CI/CD: Jenkins, GitLab CI, GitHub Actions
- Monitoring: Prometheus, Grafana, ELK stack

6. Best Practices:
- Design for failure: Assume any component can fail and design accordingly.
- Implement automated testing and deployment.
- Use declarative rather than imperative programming.
- Implement proper logging and monitoring from the start.
- Embrace infrastructure as code for consistent environments.

Cloud native applications represent a significant shift in how we build and deploy software,
offering numerous benefits in terms of scalability, resilience, and agility. However, they also
come with their own set of challenges that organizations need to address to fully leverage
their potential.
Now, let me explain cloud native applications from a DevOps perspective, referencing the
diagram:
1. DevOps Philosophy in Cloud Native: DevOps in cloud native environments focuses
on automating and integrating the processes between software development and IT
teams. The goal is to build, test, and release software faster and more reliably in the
cloud.
2. The DevOps Lifecycle for Cloud Native Apps: a) Plan:
o DevOps engineers collaborate with product managers and developers to plan
features and sprints.
o Tools: Jira, Trello for project management and tracking.
b) Code:
o Developers write code for microservices, while DevOps engineers focus on
infrastructure-as-code and automation scripts.
o Tools: Git for version control, GitHub or GitLab for collaboration.
c) Build:
o Automated build processes create container images from the source code.
o Tools: Docker for containerization, Maven or Gradle for build automation.
d) Test:
o Automated testing includes unit tests, integration tests, and end-to-end tests.
o Tools: Selenium for UI testing, JUnit or PyTest for unit testing, Postman for
API testing.
e) Deploy:
o Automated deployment to cloud environments, often using container
orchestration.
o Tools: Kubernetes for orchestration, Helm for Kubernetes package
management, Terraform for infrastructure provisioning.
f) Operate:
o Continuous operation and maintenance of the deployed applications.
o Tools: Prometheus for monitoring, ELK stack (Elasticsearch, Logstash,
Kibana) for logging.
g) Monitor:
o Constant monitoring of application performance and user experience.
o Tools: Grafana for visualization, Datadog for full-stack monitoring.
h) Feedback:
o Gathering and analyzing user feedback and system performance data to inform
the next iteration.
3. Key DevOps Practices for Cloud Native: a) Infrastructure as Code (IaC):
o Define and manage infrastructure using code (e.g., Terraform, AWS
CloudFormation).
o Ensures consistency and repeatability in cloud environments.
b) Continuous Integration/Continuous Deployment (CI/CD):
o Automate the build, test, and deployment processes.
o Tools: Jenkins, GitLab CI, GitHub Actions.
c) Microservices Architecture:
o Build applications as a set of small, loosely coupled services.
o Enables independent development, deployment, and scaling of components.
d) Containerization:
o Package applications and dependencies into containers.
o Ensures consistency across different environments and simplifies deployment.
e) Container Orchestration:
o Use tools like Kubernetes to manage, scale, and maintain containerized
applications.
f) Automated Monitoring and Logging:
o Implement comprehensive monitoring and centralized logging from the start.
o Crucial for maintaining visibility in distributed cloud native systems.
g) Security as Code:
o Integrate security practices into the DevOps workflow (DevSecOps).
o Implement automated security testing and compliance checks.
4. Cloud Platforms: DevOps engineers work with various cloud platforms such as AWS,
Google Cloud Platform (GCP), and Microsoft Azure. Each platform offers specific
tools and services that integrate into the DevOps workflow.
5. Challenges and Considerations:
o Managing complexity in distributed systems.
o Ensuring security in a more exposed environment.
o Optimizing costs in pay-as-you-go cloud models.
o Maintaining performance and reliability at scale.
o Keeping up with rapidly evolving cloud technologies and best practices.
6. Best Practices:
o Embrace automation at every stage of the lifecycle.
o Implement robust monitoring and alerting systems.
o Use blue-green deployments or canary releases for safer updates.
o Regularly audit and optimize cloud resource usage.
o Foster a culture of continuous learning and improvement.
From a DevOps perspective, cloud native applications require a shift in mindset and
practices. The focus is on automation, continuous delivery, and leveraging cloud services to
build scalable, resilient, and efficient applications. DevOps engineers play a crucial role in
bridging development and operations, ensuring smooth deployment and operation of cloud
native applications throughout their lifecycle.

Hypervisors and containers:-


1. Virtualization: Virtualization, shown in the top-left of the diagram, is a technology
that allows you to create multiple simulated environments or dedicated resources from
a single physical hardware system. It enables better utilization of physical hardware
resources and provides isolation between different virtual instances.
2. Hypervisors: A hypervisor, also known as a Virtual Machine Monitor (VMM), is
software that creates and runs virtual machines (VMs). Hypervisors are classified into
two types: a) Type 1 Hypervisor (Bare Metal):
o Illustrated in the top-right of the diagram.
o Runs directly on the host's hardware to control the hardware and manage guest
operating systems.
o Examples: VMware ESXi, Microsoft Hyper-V, Xen.
o Advantages: Better performance, more secure, more efficient.
o Use cases: Enterprise servers, data centers.
b) Type 2 Hypervisor (Hosted):
o Shown in the middle-left of the diagram.
o Runs on a conventional operating system just like other computer programs.
o Examples: VMware Workstation, Oracle VirtualBox, Parallels Desktop for
Mac.
o Advantages: Easier to set up and manage, more flexible.
o Use cases: Personal use, development and testing environments.
3. Containers: Containers, depicted in the middle-right of the diagram, are a lightweight
form of virtualization that operates at the operating system level. They package
application code and dependencies together, ensuring consistent operation across
different computing environments. Key characteristics of containers:
o Share the host system's kernel.
o Provide process-level isolation.
o Lightweight and fast to start up.
o Examples: Docker, containerd, CRI-O.
4. Comparison between VMs and Containers: The bottom of the diagram shows a
comparison table between VMs and containers: a) Isolation:
o VMs: Provide strong isolation as each VM has its own OS and kernel.
o Containers: Offer process-level isolation, sharing the host OS kernel.
b) Performance:
o VMs: Have some overhead due to running a full OS.
o Containers: Provide near-native performance due to sharing the host OS
kernel.
c) Resource Usage:
o VMs: More resource-intensive, as each VM needs its own OS.
o Containers: Lightweight, as they share the host OS.
5. Similarities between Hypervisors and Container Engines:
o Both provide isolation for running applications.
o Both allow multiple instances to run on a single physical machine.
o Both improve hardware utilization and efficiency.
6. Differences between Hypervisors and Container Engines:
o Level of abstraction: Hypervisors virtualize hardware, while container engines
virtualize at the OS level.
o Resource overhead: VMs have more overhead due to running full OS
instances.
o Isolation: VMs provide stronger isolation but at the cost of higher resource
usage.
o Portability: Containers are generally more portable across different
environments.
7. Use Cases:
o Hypervisors (VMs):
▪ Running applications that require different OS environments.
▪ Legacy applications that need a specific OS version.
▪ Scenarios requiring strong isolation between instances.
o Containers:
▪ Microservices architecture.
▪ DevOps and CI/CD pipelines.
▪ Cloud-native applications.
▪ Scenarios where fast deployment and scaling are crucial.
8. Hybrid Approaches: Some modern approaches combine aspects of both VMs and
containers:
o Nested virtualization: Running containers inside VMs.
o Kubernetes: Orchestrating containers, sometimes running on VMs in cloud
environments.
9. Trends and Future Directions:
o Increased use of containers in cloud-native applications.
o Development of more secure container runtimes.
o Integration of AI/ML for better resource management in virtualized
environments.
o Continued importance of VMs for certain use cases, especially in enterprise
environments.
Both virtualization through hypervisors and containerization have their place in modern IT
infrastructure. The choice between them depends on specific use cases, security requirements,
performance needs, and the overall architecture of the application or system being deployed.

Cloud native computing foundation:-


1. Overview of CNCF: The Cloud Native Computing Foundation is part of the Linux
Foundation and serves as a vendor-neutral home for many critical components of the
global technology infrastructure. It fosters innovation by hosting and nurturing cloud
native open source projects.
2. Key Technical Concepts: a) Cloud Native: An approach to building and running
applications that exploits the advantages of the cloud computing delivery model. b)
Microservices: An architectural style that structures an application as a collection of
loosely coupled services. c) Containers: Lightweight, portable, and consistent
software environments for applications. d) Dynamic Orchestration: Automated
arrangement, coordination, and management of computer systems, middleware, and
services.
3. CNCF Project Lifecycle: As shown in the diagram, CNCF projects go through
different stages: a) Sandbox: Early-stage projects. b) Incubating: Projects with
growing adoption and maturity. c) Graduated: Stable projects with high adoption and
a sustainable ecosystem.
4. Key CNCF Graduated Projects: a) Kubernetes: Container orchestration platform. b)
Prometheus: Monitoring and alerting toolkit. c) Envoy: Cloud-native
high-performance edge/middle/service proxy. d) Containerd: Industry-standard
container runtime. e) Fluentd: Unified logging layer. f) CoreDNS: DNS
server/forwarder. g) etcd: Distributed key-value store.
5. Notable Incubating Projects: a) Argo: Workflow engine for Kubernetes. b)
Buildpacks: Transforming source code into container images. c) Cilium: eBPF-based
Networking, Observability, and Security. d) Helm: Package manager for Kubernetes.
e) Linkerd: Service mesh for Kubernetes.
6. CNCF Landscape Categories: The CNCF landscape categorizes cloud native
technologies into several areas: a) Orchestration & Management b) Runtime c) App
Definition & Development d) Observability & Analysis e) Platform f) Provisioning g)
Serverless
7. Technical Standards and Specifications: CNCF hosts several important specifications:
a) Open Container Initiative (OCI): Standardizes container formats and runtimes. b)
Container Network Interface (CNI): Specification for configuring network interfaces
in Linux containers. c) Service Mesh Interface (SMI): Standard interface for service
meshes on Kubernetes.
8. Cloud Native Trail Map: CNCF provides a "trail map" for adopting cloud native
technologies, suggesting this order: a) Containerization b) CI/CD c) Orchestration d)
Observability & Analysis e) Service Proxy, Discovery & Mesh f) Networking, Policy
& Security g) Distributed Database & Storage h) Streaming & Messaging i) Container
Registry & Runtime
9. Technical Enablers: a) DevOps: Combining software development and IT operations.
b) GitOps: Using Git as a single source of truth for declarative infrastructure and
applications. c) Infrastructure as Code (IaC): Managing and provisioning
infrastructure through code instead of manual processes.
10. CNCF Technical Oversight Committee (TOC): The TOC is responsible for defining
and maintaining the technical vision for the CNCF, approving new projects, and
creating working groups to investigate areas of interest.
11. CNCF Certifications: CNCF offers various technical certifications: a) Certified
Kubernetes Administrator (CKA) b) Certified Kubernetes Application Developer
(CKAD) c) Certified Kubernetes Security Specialist (CKS)
12. CNCF Technical Initiatives: a) Cloud Native Network Functions (CNF) Testbed:
Provides a neutral test environment for telecom implementations. b) CNCF CI:
Continuous integration testing for CNCF projects. c) CNCF Community
Infrastructure Lab: Provides free access to state-of-the-art computing resources for
open source developers.
13. Emerging Trends in CNCF: a) Edge Computing: Extending cloud capabilities to the
edge of the network. b) eBPF: Technology for safe, performant networking and
observability. c) WebAssembly: Portable binary instruction format for stackable VMs.
d) Artificial Intelligence and Machine Learning integration with cloud native
technologies.
The CNCF ecosystem is vast and constantly evolving, with new projects and technologies
emerging regularly. It plays a crucial role in shaping the future of cloud computing and
distributed systems.

Docker Container:-
1. What is Docker? Docker is an open-source platform that automates the deployment,
scaling, and management of applications using containerization technology. It allows
you to package an application with all of its dependencies into a standardized unit for
software development and deployment.
2. Docker Architecture: a) Docker Client: The command-line interface (CLI) that allows
users to interact with Docker. b) Docker Daemon: The background service running on
the host that manages building, running, and distributing Docker containers. c)
Docker Registry: A repository for Docker images, which can be public (like Docker
Hub) or private.
3. Key Concepts: a) Docker Image: A read-only template with instructions for creating a
Docker container. b) Docker Container: A runnable instance of an image, which can
be started, stopped, moved, or deleted. c) Dockerfile: A text file that contains
instructions to build a Docker image. d) Docker Compose: A tool for defining and
running multi-container Docker applications.
4. Container Structure: A Docker container typically consists of:
o Application code
o Runtime environment
o System tools and libraries
o Settings
5. Docker Lifecycle: a) Build: Create a Docker image from a Dockerfile. b) Ship: Push
the image to a registry for storage and sharing. c) Run: Create and run containers from
the image.
6. Key Docker Commands:
o docker build: Build an image from a Dockerfile
o docker run: Create and start a container
o docker pull: Pull an image from a registry
o docker push: Push an image to a registry
o docker ps: List running containers
o docker stop: Stop a running container
o docker rm: Remove a container
o docker rmi: Remove an image
7. Advantages of Docker:
o Consistency across development, testing, and production environments
o Lightweight and fast compared to traditional VMs
o Efficient use of system resources
o Rapid deployment and scaling
o Isolation and security
8. Docker Networking: Docker provides several networking options:
o Bridge: Default network driver
o Host: Remove network isolation between container and host
o Overlay: Connect multiple Docker daemons
o Macvlan: Assign a MAC address to a container
9. Docker Storage:
o Volumes: Preferred mechanism for persisting data generated by and used by
Docker containers
o Bind Mounts: Map a host file or directory to a container file or directory
o tmpfs Mounts: Stored in the host system's memory only
10. Docker Security:
o Namespaces: Provide isolation for running processes
o Control Groups (cgroups): Limit application resources
o Docker Content Trust: Ensures integrity of container images
o Security scanning: Identify vulnerabilities in container images
11. Docker Swarm: Docker's native clustering and orchestration solution, allowing you to
create and manage a swarm of Docker nodes.
12. Dockerfile Best Practices:
o Use official base images
o Minimize the number of layers
o Leverage build cache
o Use multi-stage builds for smaller final images
o Don't install unnecessary packages
o Use .dockerignore file
13. Docker Compose: Allows defining and running multi-container Docker applications.
Key features:
o Define services in a YAML file
o Start all services with a single command
o Manage the lifecycle of services
14. Docker in CI/CD: Docker is widely used in Continuous Integration and Continuous
Deployment pipelines:
o Consistent environments for testing
o Easy integration with CI/CD tools (Jenkins, GitLab CI, GitHub Actions)
o Simplified deployment process
15. Limitations and Considerations:
o Containers share the host OS kernel, which can be a security concern
o Complex applications may require additional orchestration tools (like
Kubernetes)
o Learning curve for teams new to containerization
16. Docker Alternatives: While Docker is the most popular, there are alternatives:
o containerd
o CRI-O
o Podman
o LXC (Linux Containers)
17. Future Trends:
o Increased adoption of Docker in edge computing
o Integration with serverless architectures
o Enhanced security features
o Improved support for IoT and embedded systems
Docker has revolutionized the way applications are developed, shipped, and run, making it an
essential tool in modern software development and DevOps practices. Its ecosystem
continues to grow and evolve, addressing the complex needs of today's distributed systems
and cloud-native applications.
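To tie together several of the concepts above (Compose, networking, and volumes), here is a minimal docker-compose.yml sketch. The service names, image tags, ports, and volume names are illustrative assumptions, not part of any specific project:

version: "3.9"
services:
  web:
    build: .                     # assumes a Dockerfile in the current directory
    ports:
      - "8080:80"                # host:container port mapping
    depends_on:
      - cache
    networks:
      - app-net
  cache:
    image: redis:alpine          # official image pulled from Docker Hub
    volumes:
      - cache-data:/data         # named volume for persistent data
    networks:
      - app-net
volumes:
  cache-data:
networks:
  app-net:
    driver: bridge               # default bridge network driver

Running docker compose up -d with this file would build the web image, create the network and volume, and start both services together.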

Maven:-
Maven is a popular build automation and project management tool used primarily for Java
projects. Key features of Maven include:
1. Project Object Model (POM): Maven uses an XML file called pom.xml to describe
the project structure, dependencies, and build process.
2. Convention over Configuration: Maven follows a standard directory structure and
build lifecycle, reducing the need for extensive configuration.
3. Dependency Management: Maven can automatically download and manage project
dependencies from remote repositories.
4. Build Lifecycle: Maven has a defined build lifecycle with phases such as validate, compile, test, package, verify, and deploy.
5. Plugins: Maven's functionality can be extended through plugins, which can be bound
to different phases of the build lifecycle.

Gradle:-
Gradle is another popular build automation tool that can be used for multiple programming
languages. Here are some key features of Gradle:
1. Flexibility: Gradle uses a Groovy or Kotlin-based DSL (Domain Specific Language)
for describing builds, offering more flexibility than XML-based systems.
2. Performance: Gradle is designed to be fast, with features like incremental builds and
build cache.
3. Multi-project Builds: Gradle excels at handling complex, multi-project builds.
4. Dependency Management: Like Maven, Gradle can manage project dependencies
efficiently.
5. Plugins: Gradle has a rich ecosystem of plugins that extend its functionality.
6. Build Phases: Gradle has three main build phases: Initialization, Configuration, and Execution. During these phases, it constructs and then executes a task graph.

### DevOps Practices with Git:


- **Trunk-Based Development**: Developers merge small, frequent updates to a core
"trunk" or main branch
- **Feature Branching**: Create a new branch for each feature or bug fix
- **Pull Requests**: Code review process before merging changes
- **Automated Testing**: Trigger automated tests on each commit or pull request
- **Continuous Integration**: Automatically build and test code changes
- **Continuous Delivery**: Automatically deploy code changes to staging or production
environments
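A minimal CI workflow sketch that wires several of these practices together (automated testing and continuous integration on every commit or pull request). GitHub Actions syntax is used here as one example; the branch name and test command are assumptions:

name: ci
on:
  push:
    branches: [main]        # builds on the trunk/main branch
  pull_request:             # validate feature branches before merge
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # fetch the repository
      - name: Run automated tests
        run: make test              # assumed project-specific test command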

## 4. Git in Cloud Environments

When working with cloud environments, Git becomes even more powerful and integral to the
development and deployment process.

### Key Aspects:


1. **Infrastructure as Code (IaC)**:
- Store infrastructure definitions in Git repositories
- Use tools like Terraform, CloudFormation, or Ansible with Git

2. **Cloud-Native CI/CD**:
- Cloud providers offer Git-integrated CI/CD services (e.g., AWS CodePipeline, Azure
DevOps, Google Cloud Build)
- Automatically deploy to cloud services on Git push

3. **Serverless Deployments**:
- Deploy serverless functions directly from Git repositories
- Examples: AWS Lambda, Azure Functions, Google Cloud Functions

4. **Container Orchestration**:
- Store Kubernetes manifests or Helm charts in Git
- Implement GitOps practices for Kubernetes deployments
5. **Secrets Management**:
- Use Git hooks or CI/CD pipelines to inject secrets from secure stores
- Never store sensitive information directly in Git repositories

6. **Multi-Cloud Strategies**:
- Use Git branching strategies to manage deployments across multiple cloud providers
- Implement cloud-agnostic IaC practices
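As a concrete illustration of the GitOps practice mentioned above, here is a sketch of an Argo CD Application manifest that keeps a cluster in sync with Kubernetes manifests stored in Git; the repository URL, path, and namespaces are placeholders:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/org/app-manifests.git   # placeholder Git repository
    targetRevision: main
    path: k8s/                                            # directory containing the manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: web-app
  syncPolicy:
    automated:
      prune: true      # remove resources that were deleted from Git
      selfHeal: true   # revert manual drift back to the Git-defined state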

## 5. Best Practices for Git in Cloud DevOps

1. **Use .gitignore**: Prevent sensitive or unnecessary files from being committed


2. **Implement Git Hooks**: Automate tasks like code formatting or running tests before
commits
3. **Sign Commits**: Use GPG to sign commits for enhanced security
4. **Use Meaningful Commit Messages**: Follow conventions like Conventional Commits
5. **Regular Backups**: Regularly backup Git repositories, including all branches and tags
6. **Monitor Git Activity**: Use tools to monitor repository activity and detect anomalies
7. **Implement Branch Protection**: Protect important branches (e.g., main) with required
reviews and status checks
8. **Use Git LFS for Large Files**: Efficiently handle large files in cloud deployments
9. **Automate Environment Provisioning**: Use Git webhooks to trigger environment
provisioning in the cloud
10. **Implement GitOps**: Use Git as the single source of truth for both application code
and infrastructure

By leveraging Git effectively in cloud DevOps practices, teams can achieve faster, more
reliable, and more secure software delivery pipelines.

1. Developer Workstations: Where developers write code and make local commits.
2. Feature Branches: Developers push their code to feature branches for collaboration
and review.
3. Main Branch: The primary branch (often called 'main' or 'master') where approved
code is merged.
4. CI/CD Pipeline: Automated processes for building, testing, and deploying code.
5. Cloud Deployment: The final stage where code is deployed to cloud infrastructure.
The code flows through the system as follows:
● Developers push code to feature branches.
● Pull requests are created to merge feature branches into the main branch.
● Once approved and merged, the CI/CD pipeline is triggered.
● The pipeline automatically deploys the code to the cloud environment.
This workflow enables continuous integration and continuous deployment, key practices in
DevOps. It allows for rapid, frequent, and reliable software releases.
GIT workflows:-
1. Centralized Workflow
The centralized workflow is the simplest and most traditional model for version control. In this workflow, there is a single central repository that serves as the "single source of truth" for the project.
Key points:
● All developers clone from and push to this central repository
● Typically uses a single main branch (often called 'master' or 'main')
● Suitable for small teams or projects with infrequent updates
DevOps perspective for cloud:
● Easy to set up and manage in cloud environments
● Can be integrated with CI/CD pipelines for automated testing and deployment
● May lead to bottlenecks as all changes go through a single point
● Requires careful coordination to avoid conflicts, especially in larger teams

2. Master Workflow (also known as Trunk-Based Development)


The master workflow is a development model where all developers work directly on a single
branch, usually called 'master' or 'main'. This approach emphasizes continuous integration
and small, frequent updates.
Key points:
● All work is done on the main branch
● Requires developers to integrate their changes frequently (at least daily)
● Relies heavily on automated testing and CI/CD practices
● Feature flags are often used to manage incomplete features in production
DevOps perspective for cloud:
● Promotes rapid iteration and deployment
● Requires robust automated testing and monitoring
● Well-suited for cloud-native applications and microservices architectures
● Facilitates continuous deployment practices

3. Feature Workflow (also known as Git Flow or Feature Branch Workflow)


The feature workflow is a branching model where each new feature is developed in its own
dedicated branch. This allows for parallel development of multiple features and provides
isolation for each feature's code.
Key points:
● Main branch (master/main) always contains production-ready code
● Develop branch serves as an integration branch for features
● Feature branches are created for each new feature or bug fix
● Pull requests and code reviews are typically used before merging
DevOps perspective for cloud:
● Allows for more controlled and isolated development
● Facilitates code reviews and quality control
● Can be integrated with CI/CD to test each feature branch
● May require more complex branch management and merging strategies

TDD, FDD, BDD:-


Test-Driven Development (TDD)
Overview
TDD is a software development approach where tests are written before the actual code. It
follows a cyclical process known as the "Red-Green-Refactor" cycle.
Process
1. Red: Write a failing test for a new feature or function.
2. Green: Write the minimal code necessary to pass the test.
3. Refactor: Improve the code while ensuring all tests still pass.
Benefits
● Encourages simple design and helps prevent over-engineering.
● Ensures a comprehensive test suite is built alongside the codebase.
● Provides documentation through tests, making it easier for new developers to
understand the code.
Example Workflow
1. Write a Test: Define the expected outcome.
2. Run the Test: Observe it fails (red).
3. Write Code: Implement just enough code to pass the test (green).
4. Run All Tests: Confirm the new test passes and existing tests remain unaffected.
5. Refactor: Clean up the code as needed.

Feature-Driven Development (FDD)


Overview
FDD is an agile methodology that focuses on developing features in a systematic way. It
emphasizes collaboration, iterative development, and delivering tangible results.
Process
1. Develop an Overall Model: Create a high-level model of the system.
2. Build a Features List: Break down the system into a list of features.
3. Plan by Feature: Create a plan for implementing features.
4. Design by Feature: Develop detailed designs for features.
5. Build by Feature: Implement the features, ensuring to test each one thoroughly.
Benefits
● Encourages team collaboration and communication.
● Provides clear deliverables with an emphasis on features.
● Allows for iterative progress, making it easy to adapt to changing requirements.
Example Workflow
1. Overall Model: Understand and model the system.
2. Feature List: Create a comprehensive list of features.
3. Planning and Design: Plan and design features.
4. Implementation: Build the features and ensure quality through testing.
Behavior-Driven Development (BDD)
Overview
BDD extends TDD by focusing on the behavior of an application from the user's perspective.
It encourages collaboration among developers, QA, and non-technical stakeholders.
Process
1. Write Scenarios: Use natural language to define expected behavior.
2. Automate Tests: Create automated tests based on the scenarios.
3. Develop Code: Implement the necessary code to pass the tests.
4. Refactor: Improve the codebase while ensuring tests pass.
Benefits
● Enhances communication among team members by using a common language.
● Helps ensure the final product meets user needs and expectations.
● Encourages a shared understanding of requirements.
Example Workflow
1. Define Behavior: Write user stories or scenarios.
2. Automate Tests: Use tools like Cucumber or SpecFlow.
3. Develop: Write the code that fulfills the defined behaviors.
4. Refactor: Improve code quality while maintaining test coverage.

Summary
● TDD focuses on writing tests first, ensuring each piece of code is validated before
moving on.
● FDD emphasizes delivering features systematically, breaking down development into
manageable parts.
● BDD bridges the gap between technical and non-technical stakeholders, ensuring the
software behaves as intended from a user perspective.
Each methodology has its strengths and can be adapted based on the project’s requirements
and team dynamics.

IAC, security in devops lifecycle:-

Infrastructure as Code (IaC) and Security in the DevOps Lifecycle


What is Infrastructure as Code (IaC)?

Infrastructure as Code (IaC) is a fundamental practice in DevOps that allows the


management and provisioning of infrastructure using code and automation tools, rather than
through manual configuration or interaction with physical hardware. This approach aligns
infrastructure management with the principles of software development, making it repeatable,
versioned, and scalable.

IaC enables teams to define and manage infrastructure components such as servers,
networking, storage, and services in a declarative manner (where the desired end state is
defined) or an imperative manner (where specific steps to reach the end state are defined).
The code used in IaC can be version-controlled and shared, similar to application code, and is
executed using automation tools like Terraform, Ansible, and CloudFormation.

Key Concepts in Infrastructure as Code (IaC)

1. Declarative vs. Imperative IaC:

○ Declarative IaC: The user specifies what the infrastructure should look like,
and the IaC tool figures out how to achieve it. Example tools: Terraform,
Kubernetes, CloudFormation.
○ Imperative IaC: The user specifies how to achieve the infrastructure
configuration, including all the steps. Example tools: Ansible (though it can
also be declarative depending on how it is used).
2. Version Control:

○ Infrastructure code is stored in version control systems (e.g., Git). This allows
teams to track changes over time, roll back to previous configurations, and
ensure consistency across environments.
3. Automation:

○ IaC emphasizes automation. Infrastructure provisioning, management, and


scaling tasks are automated, reducing manual intervention and minimizing
errors.
4. Idempotency:

○ IaC tools are idempotent, meaning that running the same script multiple times
results in the same outcome, regardless of the number of executions. This
ensures predictable results and avoids unintended changes.
5. Immutable Infrastructure:

○ In IaC, instead of modifying running instances, the infrastructure is replaced


when updates or changes are needed. This practice ensures that the system is
always running a clean, known version of the configuration.
6. Popular IaC Tools:

○ Terraform: A cloud-agnostic tool that allows infrastructure management


across multiple providers (e.g., AWS, GCP, Azure).
○ AWS CloudFormation: AWS-specific tool for defining cloud infrastructure
using JSON or YAML templates.
○ Ansible: A configuration management tool that can be used for infrastructure
provisioning and application deployment.
○ Puppet and Chef: Used for configuration management but also support
infrastructure as code.
○ Kubernetes: Manages containerized applications and provides infrastructure
as code for managing clusters.
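To make the declarative style concrete, below is a minimal AWS CloudFormation template (one of the tools listed above) that declares the desired end state of an encrypted S3 bucket; the bucket name is a placeholder and must be globally unique:

AWSTemplateFormatVersion: '2010-09-09'
Description: Minimal declarative IaC example
Resources:
  AppDataBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: example-app-data-bucket      # placeholder name
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256             # encrypt objects at rest by default

CloudFormation compares this declared state with what already exists and works out the create or update steps itself, which is the essence of declarative IaC.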

Security in the DevOps Lifecycle

Security in the DevOps lifecycle (often called DevSecOps) integrates security practices
directly into the DevOps workflow. The goal of DevSecOps is to ensure that security is a
shared responsibility throughout the entire DevOps lifecycle, from planning and development
to deployment and maintenance.

Key Security Aspects in the DevOps Lifecycle

1. Shift Left Security:

○ Security should be integrated early in the development process (known as


"shifting left"). This means addressing security risks during the planning and
development phases rather than waiting until the deployment phase.
2. Continuous Integration and Continuous Delivery (CI/CD):

○ CI/CD pipelines automate the process of integrating new code into a shared
repository and deploying it to production. Security tests and controls can be
built into these pipelines to ensure that vulnerabilities are detected and
remediated early.
3. Security as Code:

○ Similar to infrastructure as code, security practices should be codified. This


can include automated security testing, vulnerability scanning, policy
enforcement, and code review processes. Security configurations, such as
firewalls, network policies, and access controls, should also be managed
through code.
4. Automated Security Testing:

○ Static Application Security Testing (SAST): Tools that analyze the source
code for vulnerabilities without executing the code (e.g., Checkmarx,
Veracode).
○ Dynamic Application Security Testing (DAST): Tools that test a running
application for vulnerabilities (e.g., OWASP ZAP, Acunetix).
○ Software Composition Analysis (SCA): Scans for vulnerabilities in
third-party libraries and dependencies (e.g., WhiteSource, Snyk).
○ Interactive Application Security Testing (IAST): Monitors application
behavior during runtime to identify vulnerabilities in a live environment.
5. Secrets Management:

○ Secrets management refers to securely storing, managing, and accessing


sensitive data such as API keys, passwords, tokens, and certificates. In the
DevOps lifecycle, secrets management tools (e.g., HashiCorp Vault, AWS
Secrets Manager, CyberArk) should be used to prevent hardcoding credentials
into the codebase.
6. Compliance Automation:

○ DevSecOps automates compliance checks and audits, ensuring that


infrastructure and applications meet industry regulations (e.g., GDPR, HIPAA,
PCI-DSS). Tools like Chef InSpec and OpenSCAP help automate security
compliance checks.
7. Security Monitoring and Incident Response:

○ Monitoring systems must be in place to detect anomalous activities, breaches,


or misconfigurations. Tools like Prometheus, Grafana, and Splunk can be used
to collect and analyze logs for security events.
○ An effective incident response plan is essential to respond to security threats
quickly and minimize damage.
8. Container and Kubernetes Security:

○ Containers and orchestrators like Kubernetes present unique security


challenges, such as ensuring secure container images, scanning for
vulnerabilities in containerized applications, and securing Kubernetes clusters.
Tools like Aqua Security, Twistlock (now part of Palo Alto Networks), and
Falco can be used to address these issues.
9. Network Security:

○ Security controls should be applied to the network infrastructure to prevent


unauthorized access to resources. Network policies in tools like Kubernetes
(Network Policies), as well as firewalls, VPNs, and Virtual Private Cloud
(VPC) configurations, should be automated and tested regularly.
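As one example of security expressed as code (see the network security point above), here is a sketch of a Kubernetes NetworkPolicy that only allows traffic from frontend pods to backend pods; the labels, namespace, and port are assumptions:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend           # the policy applies to backend pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend  # only frontend pods may connect
      ports:
        - protocol: TCP
          port: 8080         # assumed application port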

Integrating Security with IaC

When IaC is combined with security principles, it ensures that infrastructure is not only
provisioned efficiently but is also secured from the outset.
1. Security in Infrastructure Code:

○ Security considerations should be part of the IaC scripts. For example, tools
like Terraform can integrate security-related checks (e.g., ensuring that
security groups in AWS are properly configured, and ensuring that storage is
encrypted).
2. Automated Security Testing for IaC:

○ Tools like Terraform's tflint and Checkov can be used to analyze IaC code
for security vulnerabilities or misconfigurations before the infrastructure is
provisioned.
3. Compliance as Code:

○ Compliance checks can be incorporated directly into IaC code, ensuring that
every environment meets specific regulatory requirements.
4. Immutable Infrastructure with Security:

○ In an immutable infrastructure model, when an application or system needs to


be updated, a new version is created, and the old version is destroyed. This
reduces the chances of unauthorized changes or configuration drift. Security
configurations are part of the initial code and remain consistent throughout the
lifecycle.
5. Managing Secrets in IaC:

○ Secret management can be integrated into IaC. For example, when


provisioning cloud resources, sensitive data (like API keys or access
credentials) should be pulled from a secure vault rather than hardcoded into
the IaC scripts.
6. Secure Deployment Pipelines:

○ Pipeline Security: Ensure that CI/CD pipelines themselves are secure. This
includes using encrypted channels for communication and ensuring that the
CI/CD server has proper access control and is not a point of attack.
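The automated IaC scanning described above can be wired into the pipeline itself. Below is a sketch of a CI job (GitHub Actions syntax) that runs Checkov against the repository before any infrastructure is provisioned; the trigger and directory layout are assumptions:

name: iac-security-scan
on: [pull_request]
jobs:
  checkov:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Scan IaC for misconfigurations
        run: |
          pip install checkov
          checkov -d .        # scan all IaC files in the repository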

Best Practices for Security in DevOps and IaC

1. Automate Security Scanning:

○ Use automated tools for continuous scanning of code, infrastructure


configurations, and containers to identify vulnerabilities and misconfigurations
early.
2. Principle of Least Privilege (PoLP):
○ Ensure that only necessary permissions are granted to each user, process, or
service. Use role-based access control (RBAC) to enforce this.
3. Apply Patching and Updates:

○ Automate patching of both application code and infrastructure components. In


a DevOps environment, patches should be tested and deployed quickly.
4. Monitor and Audit:

○ Implement continuous monitoring for anomalies, failed login attempts,


unauthorized changes, and other security incidents. Enable logging and audit
trails to track who made changes to the infrastructure and code.
5. Regular Penetration Testing:

○ Conduct regular penetration tests and vulnerability assessments on both


infrastructure and application layers to identify security weaknesses.
6. Enforce Strong Authentication:

○ Use multi-factor authentication (MFA) for accessing infrastructure and


repositories. Ensure that strong authentication practices are enforced for all
team members.
7. Backup and Disaster Recovery Plans:

○ Have a disaster recovery plan in place, including automated backup and


restoration processes, to recover quickly from any security breaches or
failures.

Integrating Infrastructure as Code (IaC) with DevSecOps practices brings multiple


benefits, including increased automation, consistency, and security in the deployment and
management of infrastructure. Security should be considered at every step of the DevOps
lifecycle, from planning and coding to deployment and maintenance. By embedding security
practices directly into the pipeline, and using tools like Terraform, CloudFormation,
Kubernetes, and various security testing tools, teams can create secure, scalable, and resilient
infrastructures that are compliant with regulatory standards.

MLOPS & DATAOPS:-

MLOps and DataOps: Overview and Key Concepts

In today’s rapidly evolving world of data-driven technologies, both MLOps and DataOps
play crucial roles in enabling organizations to streamline their data management and machine
learning workflows. These practices aim to automate, improve collaboration, and ensure the
smooth delivery of data and machine learning models to production environments.

Let's explore each of these concepts in detail.


MLOps (Machine Learning Operations)

MLOps is a set of practices and tools designed to automate and streamline the deployment,
monitoring, and management of machine learning (ML) models in production. It is
essentially the DevOps equivalent for machine learning, focusing on the lifecycle of ML
models—from development to deployment and ongoing maintenance.

Key Components of MLOps:

1. Model Development:

○ MLOps starts with the data scientists' work on feature engineering, data
preprocessing, model training, and validation. The models are built and tested,
often using notebooks or scripts.
2. Version Control:

○ Just like in traditional software development, version control is essential for


managing model versions, training datasets, and code. Tools like Git or DVC
(Data Version Control) are used to track changes in code and data.
3. Model Training and Evaluation:

○ Once the data scientists have developed a model, it undergoes training, where
various machine learning algorithms are used. MLOps ensures the
repeatability and reproducibility of model training by automating the training
pipelines and scaling the process across multiple machines.
4. Model Deployment:

○ Model deployment is the process of putting a machine learning model into a


production environment where it can start providing predictions. MLOps
ensures that the model deployment process is automated, robust, and scalable,
often involving tools like Kubernetes or Docker to manage containerized
environments.
5. Continuous Integration and Continuous Deployment (CI/CD):

○ MLOps integrates CI/CD pipelines specifically for ML models, enabling


automated testing, validation, and deployment of models. This allows teams to
continuously integrate new models and updates into production without
manual intervention.
6. Model Monitoring:

○ After deployment, it’s crucial to monitor the performance of models in


production. MLOps provides tools to track model drift, where the
performance of a model degrades over time as real-world data changes.
Automated alerts and retraining pipelines are set up to address this drift.
7. Model Governance and Compliance:

○ MLOps helps enforce governance policies, ensuring that models are


interpretable, auditable, and compliant with regulations. This is especially
important in regulated industries such as finance, healthcare, and insurance.
8. Collaboration:

○ MLOps facilitates better collaboration between data scientists, engineers, and


operations teams by using a shared infrastructure and standardized tools for
model development, testing, and deployment.
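A sketch of how the CI/CD component above might look for a model: a pipeline (GitHub Actions syntax) that retrains and registers a model whenever the training code changes. The file paths, scripts, and registry step are assumptions for illustration only:

name: train-and-register-model
on:
  push:
    paths:
      - "training/**"                                     # retrain when training code or config changes
jobs:
  train:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: pip install -r training/requirements.txt     # assumed requirements file
      - name: Train model
        run: python training/train.py                     # assumed training entry point
      - name: Register model
        run: python training/register.py                  # assumed script that logs the model (e.g., via MLflow)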

MLOps Tools:

● Kubeflow: An open-source platform that supports end-to-end ML workflows on


Kubernetes.
● MLflow: A tool for managing the ML lifecycle, including experimentation,
reproducibility, and deployment.
● TensorFlow Extended (TFX): A production-ready ML pipeline framework.
● Docker and Kubernetes: For containerization and orchestration of ML model
environments.
● DVC (Data Version Control): For managing versions of machine learning datasets
and models.

Benefits of MLOps:

● Automation: Automates the entire lifecycle of model development and deployment.


● Scalability: Makes it easier to scale machine learning workflows and manage large
datasets and models.
● Collaboration: Enables collaboration between data scientists and
DevOps/engineering teams.
● Continuous Delivery: Promotes continuous delivery of model updates to production
environments.

DataOps (Data Operations)

DataOps refers to the set of practices, tools, and technologies used to streamline the data
pipeline, ensuring faster, more efficient, and error-free data movement across the
organization. It is similar to DevOps but focuses on the flow of data within an organization
rather than software or ML models.

Key Components of DataOps:

1. Data Pipeline Automation:

○ DataOps focuses on automating the movement of data from sources


(databases, APIs, sensors, etc.) through ETL (Extract, Transform, Load)
processes and into data storage solutions. This includes the automation of data
cleaning, validation, and transformation.
2. Data Integration:

○ A key element of DataOps is integrating data from disparate sources, such as


cloud storage, on-premise systems, and third-party applications, ensuring data
is accessible, reliable, and consistent.
3. Collaboration and Communication:

○ DataOps promotes collaboration between teams that manage the data pipeline,
including data engineers, data analysts, and data scientists. It provides a
unified view of data across different teams and systems.
4. Data Quality:

○ Ensuring that the data is accurate, consistent, and reliable is a critical


component of DataOps. Automated data validation and testing are
incorporated into the pipeline to maintain high data quality throughout the
process.
5. Data Security and Compliance:

○ DataOps practices enforce proper security and compliance protocols (e.g.,


encryption, anonymization, access control) to protect sensitive data and meet
regulatory standards.
6. Monitoring and Observability:

○ Just as DevOps involves monitoring software applications, DataOps includes


monitoring the health and performance of data pipelines, ensuring that data
flows efficiently, and identifying any bottlenecks or issues quickly.
7. Continuous Integration and Continuous Delivery (CI/CD):

○ DataOps uses CI/CD principles for continuous integration of new data


sources, and for automating the deployment of data changes and updates. This
includes testing, validating, and deploying new data pipelines.
8. Version Control for Data:

○ DataOps involves version control not just for code, but also for datasets and
data pipelines. This ensures that data transformations are reproducible, and
previous data states can be accessed for auditing or rollback purposes.

DataOps Tools:

● Apache Airflow: A platform used to programmatically author, schedule, and monitor


workflows.
● Talend: A tool for data integration, preparation, and management.
● dbt (Data Build Tool): A framework for transforming raw data into valuable insights.
● Great Expectations: A tool for testing and validating data quality and ensuring that
data meets predefined expectations.
● Apache Kafka: A distributed event streaming platform used for real-time data
processing.

Benefits of DataOps:

● Faster Data Delivery: By automating the data pipeline, DataOps ensures faster and
more reliable delivery of data to stakeholders.
● Improved Collaboration: DataOps breaks down silos and improves collaboration
between different teams (data engineers, scientists, analysts).
● Data Quality and Governance: Helps ensure that data is accurate, consistent, and
compliant with regulatory standards.
● Scalability: Enables organizations to scale their data infrastructure and processes to
handle larger volumes of data and more complex workflows.

MLOps vs. DataOps: Key Differences and Similarities

Similarities:

● Both MLOps and DataOps emphasize automation and collaboration across teams.
● Both aim to improve the speed and efficiency of delivering reliable and consistent
results (models or data) to stakeholders.
● Continuous integration and deployment (CI/CD) are key components of both
practices, enabling continuous updates and improvements.

Differences:

● Focus Area: MLOps primarily deals with machine learning models and their
deployment, while DataOps is focused on the flow, transformation, and governance of
data.
● End Goal: The end goal of MLOps is to deploy machine learning models into
production environments with reliable monitoring and maintenance, whereas the goal
of DataOps is to ensure that data is clean, accurate, and flows seamlessly across
various systems and processes.

Integrating MLOps and DataOps

In a data-driven organization, MLOps and DataOps should work hand in hand:

● DataOps ensures that the data pipeline is efficient and the data is high quality,
which is critical for training reliable machine learning models in MLOps.
● MLOps ensures that the models are deployed and maintained based on the data
processed by the DataOps pipelines.
An integrated approach where data quality and governance (DataOps) complement model
development and deployment (MLOps) can lead to faster, more reliable, and more scalable
solutions for businesses working with machine learning and big data.

Conclusion

Both MLOps and DataOps are integral to building and maintaining efficient, scalable, and
automated workflows for managing data and machine learning models. By focusing on
automation, collaboration, monitoring, and quality control, these practices enable
organizations to improve the speed, reliability, and performance of their data-driven
operations.

● MLOps is crucial for managing the lifecycle of machine learning models, including
development, deployment, and monitoring.
● DataOps focuses on the seamless flow, quality, and governance of data across the
organization, ensuring that data is available, accurate, and secure for downstream
tasks, including ML model training and analytics.

Together, they form the foundation of a data-driven DevOps approach, where both data and
machine learning workflows are automated, secure, and scalable.

Observability, Continuous Monitoring, and Infrastructure as Code (IaC)

In the context of modern software development, observability and continuous monitoring


are essential practices that help ensure systems are reliable, scalable, and secure. When paired
with Infrastructure as Code (IaC), these practices enable automated management and
monitoring of infrastructure and applications in a way that fosters proactive problem
resolution, continuous improvement, and efficient resource management. Let’s delve into
these concepts in detail.

Observability

Observability refers to the ability to understand the internal state of a system based on the
external outputs or signals it provides. In other words, it’s about how well you can observe
and diagnose what's happening in a system in real time. Observability is crucial for
monitoring the health, performance, and security of applications and infrastructure.

Key Pillars of Observability:

1. Metrics:

○ Metrics are quantitative data points that track the health and performance of a
system. Examples of metrics include CPU utilization, memory usage, response
times, and throughput.
○ Example: In a web application, metrics like the number of requests per second
or error rates give an indication of performance and potential issues.
2. Logs:

○ Logs are detailed, timestamped records that provide insights into events and
transactions within the system. Logs can be generated by applications, servers,
databases, and other infrastructure components.
○ Example: A 500 error log entry in a web server log file can help identify a
failure in the system that needs attention.
3. Traces:

○ Tracing provides a way to track the flow of a request through a distributed


system. Distributed tracing allows teams to visualize and analyze how a
request moves across multiple services, helping to pinpoint where
performance bottlenecks or errors occur.
○ Example: In a microservices architecture, tracing helps track the request from
the frontend service through various backend services and databases to
identify latencies or errors.
4. Alerts:

○ Alerts are triggered based on predefined thresholds for metrics, logs, or traces.
Alerts notify teams of critical issues that require immediate attention, such as
high error rates or system downtimes.
○ Example: An alert might be triggered if CPU utilization exceeds 90%,
indicating a potential performance bottleneck.
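A minimal Prometheus alerting rule illustrating the metrics and alerts pillars described above; the metric, threshold, and labels are assumptions based on a typical node_exporter setup:

groups:
  - name: node-alerts
    rules:
      - alert: HighCpuUtilization
        expr: avg(rate(node_cpu_seconds_total{mode!="idle"}[5m])) > 0.9   # >90% average CPU
        for: 5m                                                           # sustained for 5 minutes
        labels:
          severity: warning
        annotations:
          summary: "Average CPU utilization above 90% for 5 minutes"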

Why is Observability Important?

● Proactive Problem Resolution: Observability allows for early detection of


performance issues, errors, and security threats before they affect users.
● Root Cause Analysis: When a problem occurs, having comprehensive observability
data (metrics, logs, traces) allows teams to perform root cause analysis and fix issues
faster.
● Improved Customer Experience: By quickly identifying and resolving issues,
organizations can provide a better experience for end users.
● Collaboration: Observability enables different teams (development, operations,
security) to work together to solve issues and improve system performance.

Tools for Observability:

● Prometheus: An open-source monitoring and alerting toolkit designed for reliability


and scalability, especially in cloud-native environments.
● Grafana: An open-source analytics and monitoring platform that integrates with
Prometheus, Elasticsearch, and other data sources to visualize metrics.
● ELK Stack (Elasticsearch, Logstash, Kibana): A set of tools for searching,
analyzing, and visualizing log data in real time.
● Datadog: A SaaS platform for monitoring, security, and analytics that provides
full-stack observability across applications, infrastructure, and networks.
● Jaeger: An open-source distributed tracing system that helps track requests as they
move across microservices.
● New Relic: A cloud-based platform that offers real-time performance monitoring and
tracing for applications and infrastructure.

Continuous Monitoring

Continuous Monitoring refers to the practice of continuously observing the health and
performance of an application, infrastructure, and its components to detect and respond to
issues promptly. It involves real-time collection of data, automated analysis, and alerting to
ensure that systems are operating efficiently.

Key Characteristics of Continuous Monitoring:

1. Real-Time Monitoring:

○ Continuous monitoring is about having up-to-date insights into system


performance. This requires tools and techniques that provide real-time data on
metrics, logs, traces, and errors.
○ Example: Monitoring a cloud-based infrastructure to ensure that resource
usage doesn’t exceed available capacity.
2. Automated Alerts:

○ Continuous monitoring is closely tied to automated alerting systems that


inform teams of anomalies, performance degradation, or failures. Alerts
should be triggered automatically when systems behave abnormally, without
requiring manual intervention.
○ Example: Sending an alert when an API response time exceeds a certain
threshold, indicating a potential bottleneck.
3. Anomaly Detection:

○ Anomaly detection involves identifying patterns that deviate from the normal
operation of a system. Machine learning models can be used to automatically
learn the baseline behavior of applications and flag unusual patterns.
○ Example: Identifying sudden spikes in traffic or resource usage that could
indicate a performance issue or a security breach.
4. Continuous Feedback Loop:

○ Continuous monitoring feeds directly into feedback loops that help


development teams improve their code, systems, and infrastructure over time.
It provides actionable insights for optimization and problem resolution.
○ Example: Using performance data to optimize application code and improve
system architecture.
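Real-time metric collection, as described above, is typically configured declaratively. A minimal Prometheus scrape configuration is sketched below; the job name, target address, and interval are placeholders:

scrape_configs:
  - job_name: web-app
    scrape_interval: 15s               # how often metrics are pulled
    static_configs:
      - targets: ["web-app:8080"]      # placeholder host:port exposing a /metrics endpoint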
Benefits of Continuous Monitoring:

● Faster Issue Resolution: By continuously monitoring systems and automating alerts,


issues can be detected and addressed before they impact users.
● Optimization: Continuous monitoring helps identify underused or overused
resources, enabling teams to optimize infrastructure and reduce costs.
● Security: Continuous monitoring of security events, logs, and network traffic helps in
detecting breaches or vulnerabilities early.
● System Health: It ensures that all systems are functioning properly, which is critical
for maintaining high availability and reliability.

Continuous Monitoring Tools:

● Nagios: A widely-used tool for monitoring the availability and performance of


systems, networks, and infrastructure.
● Zabbix: An open-source monitoring tool for tracking the health and performance of
applications and infrastructure.
● AppDynamics: Provides end-to-end application performance monitoring with
real-time insights into business transactions and user interactions.
● Splunk: A platform for analyzing machine data, which is commonly used for
monitoring logs and metrics.
● Dynatrace: A monitoring tool that uses AI to provide insights into application
performance and user experience.

Infrastructure as Code (IaC)

Infrastructure as Code (IaC) is the practice of managing and provisioning computing


infrastructure through code, instead of manual processes. IaC allows teams to define the
desired infrastructure state (e.g., servers, networks, databases) using code and automation
tools. This practice is integral to modern DevOps and helps in managing cloud environments
at scale.

Key Benefits of IaC:

1. Automation:

○ IaC automates the setup and configuration of infrastructure, reducing manual


intervention and human error. This makes the process repeatable, consistent,
and efficient.
2. Consistency:

○ By defining infrastructure in code, the same configurations are applied across


multiple environments (development, staging, production), ensuring
consistency and reducing configuration drift.
3. Scalability:
○ IaC enables rapid scaling of infrastructure by automating the provisioning of
new resources when demand increases.
4. Version Control:

○ IaC files (such as configuration scripts or templates) are stored in version


control systems (e.g., Git). This allows teams to track changes, roll back to
previous versions, and collaborate efficiently.
5. Disaster Recovery:

○ If infrastructure is lost (e.g., due to a data center failure), IaC allows you to
recreate it from the code, facilitating faster disaster recovery.

Key Concepts in IaC:

1. Declarative vs. Imperative:

○ Declarative IaC: Describes what the end state of the infrastructure should be,
and the tool ensures that the infrastructure matches this state. (e.g., Terraform,
CloudFormation)
○ Imperative IaC: Specifies how to achieve a particular infrastructure state,
often with step-by-step instructions. (e.g., Ansible, Chef, Puppet)
2. Infrastructure as Code Tools:

○ Terraform: An open-source IaC tool that is cloud-agnostic, allowing


infrastructure management across multiple cloud platforms (AWS, Azure,
GCP).
○ AWS CloudFormation: An IaC tool specifically for managing AWS
resources using JSON or YAML templates.
○ Ansible: A configuration management tool that is also used for infrastructure
provisioning and application deployment.
○ Puppet and Chef: Popular configuration management tools used for
automating infrastructure provisioning.
3. Configuration Management:

○ Configuration management ensures that software configurations, such as


application settings or security policies, are consistently applied across all
environments.
○ Tools like Ansible and Chef automate the application of these configurations.

IaC and Observability:

IaC and observability go hand-in-hand. With IaC, infrastructure can be provisioned


automatically, ensuring consistency across environments. Once the infrastructure is in place,
observability tools can continuously monitor the health and performance of the systems
deployed, ensuring that any issues are caught early and addressed quickly.

Best Practices for IaC:


● Version Control: Store infrastructure code in version control systems like Git to track
changes and enable collaboration.
● Modularization: Break down infrastructure code into reusable modules to improve
maintainability and reduce duplication.
● Testing: Test infrastructure code in isolated environments to catch errors before
deployment.
● Documentation: Ensure that IaC is well-documented to aid in understanding and
troubleshooting.

Conclusion

● Observability enables teams to monitor and understand the state of their systems in
real time, providing insights into performance, security, and user experience.
● Continuous Monitoring enhances observability by continuously tracking the health
of applications and infrastructure, ensuring that issues are detected and resolved
proactively.
● Infrastructure as Code (IaC) automates and streamlines the process of managing
infrastructure, ensuring consistency, scalability, and faster recovery in case of failures.

Together, these practices help in building and maintaining highly reliable, efficient, and
secure systems, ensuring that applications and infrastructure are always running smoothly.
Integrating IaC with observability and continuous monitoring creates a powerful, automated,
and responsive DevOps pipeline.
Container Orchestration: Kubernetes

Introduction to Container Orchestration

Container orchestration automates the deployment, management, scaling, and


networking of containers. It is necessary for handling complex containerized
applications, ensuring scalability, resilience, and efficient management across multiple
servers.

Kubernetes Overview

● What is Kubernetes? Kubernetes (K8s) is an open-source container orchestration tool originally developed by Google and now maintained by the Cloud Native Computing Foundation (CNCF). It manages application services that span multiple containers and keeps them healthy over time.

● Features:

○ Advanced scheduling
○ Scaling
○ Self-healing
○ Networking

Kubernetes Architecture

Kubernetes follows a master-worker architecture:

1. Master Node (Control Plane):

○ API Server: Handles requests from users and validates them.


○ Controller Manager: Maintains the desired state of the cluster.
○ Scheduler: Assigns workloads to worker nodes.
○ etcd: Stores cluster data in a distributed key-value store.
2. Worker Node:

○ Container Runtime: Runs the containers (e.g., containerd or Docker); the kubelet talks to it through the Container Runtime Interface (CRI).
○ Kubelet: Communicates with the control plane and manages the containers running on its node.
○ Kube Proxy: Maintains network rules on the node and routes traffic between pods and Services, exposing pods to clients.

Working with Kubernetes Objects

1. Pod:

○ Smallest deployable unit in Kubernetes.


○ Represents a single application instance.
○ Can contain one or more containers.
○ Defined using YAML configuration.

Example YAML:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
    - name: nginx-container
      image: nginx
      ports:
        - containerPort: 80

2. ReplicaSet:

○ Ensures a specified number of identical pods are always running.


○ Provides high availability and self-healing capabilities.

Example YAML:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
        - name: nginx-container
          image: nginx

3. Deployment:

○ Manages ReplicaSets and Pods.


○ Enables rolling updates and rollbacks.
○ Provides declarative updates to applications.
4. Service:

○ Exposes Pods to the network.


○ Provides load balancing.
5. Other Objects:

○ Job: Executes tasks until completion.


○ ConfigMap: Stores configuration data.
○ Secret: Stores sensitive data.
○ Namespace: Segregates resources.
○ Volume: Provides persistent storage for containers.
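For completeness alongside the Pod and ReplicaSet examples above, a minimal Deployment manifest is sketched below (the image tag is an assumption); it produces the same three nginx replicas but adds rolling-update and rollback support:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
        - name: nginx-container
          image: nginx:1.21    # assumed tag; changing it and re-applying triggers a rolling update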

Common Kubernetes Commands

1. Version Information:

○ Get client/server versions:


kubectl version
○ Client-only:
kubectl version --client
2. Node and Pod Management:

○ List nodes:
kubectl get nodes
○ List pods and services:
kubectl get pods,svc
3. Pod Operations:

○ Run a pod:
kubectl run nginx-pod --image=nginx
○ Create from YAML:
kubectl create -f pod-def.yml
○ Update:
kubectl apply -f pod-def.yml
4. ReplicaSet Operations:

○ Create:
kubectl create -f repset-def.yml
○ Scale:
kubectl scale --replicas=6 -f repset-def.yml
5. General Help:

○ Get help for commands:


kubectl help

Hands-On Practice with Minikube

● Minikube sets up a local Kubernetes cluster, ideal for learning.


○ Install virtualization software like VirtualBox.
○ Start Minikube:
minikube start --driver=virtualbox
○ Verify cluster status:
kubectl get nodes

Possible Scenarios:-
The following scenarios provide expanded and detailed material, with examples, YAML configurations, and explanations. You can copy and print this directly for your open-book exam.

Scenario 1: Deploying a Simple Web Application

Task: Deploy a Pod named web-app-pod running an NGINX container


on port 80.

YAML Configuration:

apiVersion: v1
kind: Pod
metadata:
  name: web-app-pod
  labels:
    app: web-app
spec:
  containers:
    - name: nginx-container
      image: nginx
      ports:
        - containerPort: 80

Commands:
Create the Pod:
kubectl apply -f pod-definition.yaml

Verify:
kubectl get pods

kubectl describe pod web-app-pod

Scenario 2: Ensuring High Availability

Task: Use a ReplicaSet to ensure three replicas of a Pod are


running.

YAML Configuration:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-app-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: nginx-container
          image: nginx
          ports:
            - containerPort: 80

Commands:

Create the ReplicaSet:


kubectl apply -f replicaset-definition.yaml

Verify the ReplicaSet and Pods:


kubectl get replicaset

kubectl get pods

Scenario 3: Rolling Updates

Task: Update a Deployment to use a new version of the NGINX


image.

YAML Configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: nginx-container
          image: nginx:1.20

Commands:

Apply the Deployment:


kubectl apply -f deployment-definition.yaml

Check rollout status:


kubectl rollout status deployment/web-app-deployment

Scenario 4: Managing Secrets

Task: Store sensitive data securely using a Secret.

YAML Configuration:

apiVersion: v1
kind: Secret
metadata:
  name: api-keys
type: Opaque
data:
  db-password: bXlTZWNyZXRQYXNzd29yZCE=   # Base64 encoded value of "mySecretPassword!"

Accessing Secret in a Pod:

apiVersion: v1
kind: Pod
metadata:
  name: secret-demo
spec:
  containers:
    - name: app-container
      image: busybox
      command: ["sh", "-c", "echo $(DB_PASSWORD) && sleep 3600"]
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: api-keys
              key: db-password
Scenario 5: Exposing Your Application

Task: Create a Service to expose a Pod on port 30001.

YAML Configuration:

apiVersion: v1
kind: Service
metadata:
  name: web-app-service
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30001
  selector:
    app: web-app

Commands:

Apply the Service:


kubectl apply -f service-definition.yaml

Access the application via <Node-IP>:30001.


Scenario 6: Handling Workloads

Task: Run a one-time data processing job.

YAML Configuration:

apiVersion: batch/v1
kind: Job
metadata:
  name: data-processor
spec:
  template:
    spec:
      containers:
        - name: processor
          image: data-processor:latest
          command: ["python", "process.py"]
      restartPolicy: OnFailure

Scenario 7: Namespace Segregation

Commands:

Create namespaces:
kubectl create namespace frontend

kubectl create namespace backend


Deploy in frontend namespace:
kubectl apply -f pod-definition.yaml -n frontend

kubectl get pods -n frontend

Scenario 8: Scaling Applications

Commands:

Scale ReplicaSet manually:


kubectl scale --replicas=10 -f replicaset-definition.yaml

Use HPA:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app-deployment
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
Scenario 9: Debugging a Pod

Commands:

Check Pod status:


kubectl get pods

View logs:
kubectl logs <pod-name>

Describe the Pod:


kubectl describe pod <pod-name>

Scenario 10: Persistent Data Storage

YAML Configuration:

PersistentVolume:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-storage
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/storage"

PersistentVolumeClaim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

Pod mounting PVC:

apiVersion: v1
kind: Pod
metadata:
  name: storage-pod
spec:
  containers:
    - name: app-container
      image: busybox
      command: ["sh", "-c", "sleep 3600"]   # keep the container running so the mount can be inspected
      volumeMounts:
        - mountPath: "/data"
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: pvc-storage

Pods

Definition:

● A Pod is the smallest deployable unit in Kubernetes.


● It represents a single instance of a running process in the
cluster.

Scenario: Deploying a Simple Pod

Question: Deploy a Pod named simple-pod with an NGINX container.

YAML Configuration:

apiVersion: v1
kind: Pod
metadata:
  name: simple-pod
  labels:
    app: nginx-app
spec:
  containers:
    - name: nginx-container
      image: nginx:latest
      ports:
        - containerPort: 80

Commands:

Create the Pod:


kubectl apply -f pod.yaml

Verify:
kubectl get pods

kubectl describe pod simple-pod

ReplicaSets

Definition:

● ReplicaSets ensure a specified number of replicas


(identical Pods) are running at all times.
● Self-healing: Recreates Pods if they fail or are deleted.

Scenario: Ensuring High Availability

Question: Create a ReplicaSet with 3 replicas for the nginx


application.

YAML Configuration:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-app
  template:
    metadata:
      labels:
        app: nginx-app
    spec:
      containers:
        - name: nginx-container
          image: nginx:latest
          ports:
            - containerPort: 80

Commands:

Apply the ReplicaSet:


kubectl apply -f replicaset.yaml

View ReplicaSet and Pods:


kubectl get replicaset

kubectl get pods


Deployments

Definition:

● A Deployment manages ReplicaSets and Pods.


● Supports rolling updates and rollbacks for seamless
updates.

Scenario: Rolling Updates for an Application

Question: Create a Deployment for the nginx application with 3


replicas and enable rolling updates.

YAML Configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: nginx-app
    spec:
      containers:
        - name: nginx-container
          image: nginx:1.19
          ports:
            - containerPort: 80

Commands:

Apply Deployment:
kubectl apply -f deployment.yaml

Rollout Status:
kubectl rollout status deployment/nginx-deployment

ConfigMaps

Definition:

● Stores non-sensitive configuration data as key-value pairs.


● Used by Pods or other Kubernetes objects to externalize
configuration.

Scenario: Using a ConfigMap for Environment Variables


Question: Create a ConfigMap for application settings and use it
in a Pod.

YAML Configuration:

ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_NAME: "MyApp"
  APP_ENV: "Production"

Pod referencing ConfigMap:

apiVersion: v1
kind: Pod
metadata:
  name: configmap-pod
spec:
  containers:
    - name: app-container
      image: busybox
      command: ["sh", "-c", "echo $(APP_NAME) in $(APP_ENV); sleep 3600"]
      env:
        - name: APP_NAME
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: APP_NAME
        - name: APP_ENV
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: APP_ENV

Commands:

Apply ConfigMap and Pod:


kubectl apply -f configmap.yaml

kubectl apply -f pod.yaml

Secrets

Definition:

● Stores sensitive information (e.g., passwords, tokens)


securely as base64-encoded key-value pairs.

Scenario: Securely Storing an API Key

Question: Create a Secret to store an API key and use it in a


Pod.
YAML Configuration:

Secret:

apiVersion: v1
kind: Secret
metadata:
  name: api-secret
type: Opaque
data:
  API_KEY: bXlUZXN0QXBpS2V5   # Base64 of "myTestApiKey"

Pod referencing Secret:

apiVersion: v1
kind: Pod
metadata:
  name: secret-pod
spec:
  containers:
    - name: app-container
      image: busybox
      command: ["sh", "-c", "echo $(API_KEY); sleep 3600"]
      env:
        - name: API_KEY
          valueFrom:
            secretKeyRef:
              name: api-secret
              key: API_KEY

Commands:

Apply Secret and Pod:


kubectl apply -f secret.yaml

kubectl apply -f pod.yaml

Services

Definition:

● Exposes a set of Pods as a network service.


● Types: ClusterIP (default), NodePort, LoadBalancer,
ExternalName.

Scenario: Exposing a Pod via NodePort

Question: Expose a Pod using a NodePort service on port 30001.

YAML Configuration:

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  selector:
    app: nginx-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 30001

Commands:

Apply the Service:


kubectl apply -f service.yaml

Access the application: Use <Node-IP>:30001 in your browser.
