Dev Ops for Cloud
Not easily scalable
Container orchestration
automates the
deployment,
management, scaling,
and networking of
containers.
Container Orchestration Tools
➢ 2018:
➢ Kubernetes became the de-facto
industry standard, overshadowing most
alternatives.
1. Pod
2. ReplicaSet
3. Deployment
4. Service
5. Job
6. ConfigMap
7. Secret
8. Namespace
9. Volume, etc.
2) help
# to get help on kubectl commands
kubectl help
kubectl commands - 2
3) nodes
# to get the list of nodes running
kubectl get nodes
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
    - name: nginx-container
      image: nginx
      ports:
        - containerPort: 80
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaSet
spec:                  # contains the template of the pod to be replicated
  template:
    <your_pod-def>     # metadata + spec of the pod, as in the Pod example above
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
Notes:
======
1) the selector's matchLabels is used to match pods running in the Kubernetes
cluster that carry the label tier: frontend, and this ReplicaSet will manage all of
them
2) the labels of the pod defined inside the ReplicaSet template must always match the
label selector!
3) to delete a ReplicaSet (see the command sketch below)
4) to update a ReplicaSet (see the command sketch below)
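A minimal command sketch for these two operations, assuming the ReplicaSet above was created from repset-def.yml:
# delete the replicaset (the pods it manages are deleted with it)
kubectl delete replicaset nginx-replicaSet
# update: edit the YAML, then replace it, or scale it directly
kubectl replace -f repset-def.yml
kubectl scale --replicas=6 -f repset-def.yml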
➢ Ex:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 3
➢ Rolling update progression (replicas move from v1 to v2, with up to 3 extra pods created at a time):
➢ v1 v1 v1 v1 v1
➢ v1 v1 v1 v1 v1 v2 v2 v2
➢ v2 v2 v2 v1 v1 v2 v2
➢ v2 v2 v2 v2 v2
2) Create index.html in it
3) Create a Dockerfile
<body>
<h1>Welcome to Hello World Frontend!</h1>
<p>Version: <span id="app-version"></span></p>
<p>API key: <span id="api-key"></span></p>
</body>
</html>
# Expose port 80
EXPOSE 80
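A minimal Dockerfile sketch around that instruction, assuming an nginx base image and that index.html sits in the build context:
FROM nginx:alpine
# Copy the static page into nginx's web root
COPY index.html /usr/share/nginx/html/index.html
# Expose port 80
EXPOSE 80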
➢ Ex:
kubectl get cm
➢ Ex:
➢ AWS Fargate (for ECS and EKS)
➢ Azure Container Instances (ACI)
➢ Google Cloud Run
➢ Benefits:
➢ No Infrastructure Management: You don't need to manage servers, clusters, or
nodes.
➢ Automatic Scaling: Services can automatically scale containers up or down based
on demand.
➢ Cost-Efficiency: Pay only for the compute resources you use, avoiding over-
provisioning.
3) Create a Service:
➢ In ECS, define a Fargate service that runs your task:
➢ Desired number of tasks: 1
➢ Networking: VPC, Subnet, and Security Groups
➢ Auto-scaling options, if required.
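A hedged AWS CLI sketch of such a service definition (cluster, task, subnet, and security-group names are placeholders):
aws ecs create-service \
  --cluster my-cluster \
  --service-name frontend-svc \
  --task-definition my-task:1 \
  --desired-count 1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc],securityGroups=[sg-abc],assignPublicIp=ENABLED}"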
➢ Helm enables us to define, install, and upgrade complex Kubernetes applications using
Helm charts.
1) Manage Configuration:
Use a single configuration file (values.yaml) to manage parameters like the number of
replicas, environment variables, secrets, and more.
2) Deploy Applications:
Deploy the same application across different environments (dev, staging, production) with
a single command, while applying environment-specific configuration.
3) Version Control:
Version Helm charts, allowing you to roll back to previous versions if necessary.
➢ Helm will create a Kubernetes deployment, service, and persistent volume claim for the
database
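A sketch of the corresponding Helm workflow; the release name my-db and the Bitnami PostgreSQL chart are illustrative choices:
# add a chart repository and install a database chart with custom values
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-db bitnami/postgresql -f values.yaml
# upgrade later, or roll back to a previous revision
helm upgrade my-db bitnami/postgresql -f values.yaml
helm rollback my-db 1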
➢ Kubernetes Services
➢ Traditional CD
➢ Anatomy of a Deployment Pipeline
➢ Deployment Pipeline Practices
➢ Human-free deployments
➢ Environment-based release patterns
➢ GitOps CD
➢ What is GitOps
➢ Developer benefits of GitOps
➢ Operational benefits of GitOps
➢ GitOps Continuous Integration (CI) - CI stages
➢ GitOps Continuous Delivery (CD) - CD stages
➢ Declarative vs Imperative object management in Kubernetes
➢ Kubernetes with GitOps
use cases:
➢ 1) to expose one pod to another pod via an
endpoint (the service name, e.g., redis-db for the
Redis database pod) in the k8s cluster.
➢ ClusterIP,
➢ NodePort,
➢ LoadBalancer and
➢ ExternalName
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: ClusterIP
Notes:
1) the service's selector matches pods running in the Kubernetes cluster that carry the
label app: nginx-app, and the service routes traffic to all of them
2) the labels of the target pods must always match the service's selector!
3) Jenkins will ask for an initial administrator password. You can find this in the Docker logs:
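A sketch of retrieving it, assuming the Jenkins container is named jenkins:
# the password is printed in the logs on first start
docker logs jenkins
# or read it directly from the container filesystem
docker exec jenkins cat /var/jenkins_home/secrets/initialAdminPassword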
➢ Releases new software to a small group of users before rolling it out to the entire
user base.
➢ This gradual approach allows for more control over the release process and
immediate feedback from users.
➢ If issues are detected, the rollout can be stopped without affecting the majority of
users
1) Automating Everything:
➢ Each step, from build to deploy, should be automated.
➢ Human-free deployments aim for full automation.
➢ Ensures no human intervention is required for successful
deployments to production.
1) Declarative Management: You declare the desired state in files (e.g., YAML), and
Kubernetes ensures the actual state matches.
2) Imperative Management: You directly issue commands to change the current state.
5) Kubernetes with GitOps: Tools like ArgoCD or Flux are used to sync the Kubernetes
cluster state with the Git repository.
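To make the declarative/imperative distinction concrete, a minimal kubectl sketch (the manifest name is assumed):
# Imperative: issue commands that directly change the current state
kubectl create deployment nginx --image=nginx
kubectl scale deployment nginx --replicas=3
# Declarative: describe the desired state in YAML and let Kubernetes reconcile
kubectl apply -f nginx-deployment.yaml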
2) Imperative IaC:
➢ Focuses on the steps needed to reach the desired state.
➢ We explicitly describe how to achieve the target
configuration.
➢ Ex: Ansible, Chef.
2) Version Control:
➢ Like application source code, IaC scripts are managed in
version control system like Git.
➢ This enables us to backup, track and revert changes.
4) Consistency:
➢ Infrastructure is consistent across environments Ex: dev,
staging, production
➢ This eliminates configuration drift.
2) Version Control:
Store the IaC code in a Git repository for tracking changes,
rollback, and collaboration.
3) CI Pipeline:
a) Linting: Ensure code follows best practices.
b) Testing: Test infrastructure in a sandbox
environment to validate provisioning.
➢ Each provider in Terraform has its own set of resource types and
data sources it can manage.
➢ Ex1: The AWS provider manages resources like EC2 instances, S3 buckets,
and IAM roles.
provider "<cloud_provider_name>" {
  # Configuration options
}
example:
provider "aws" {
  region = "ap-south-1"
}
3) Terraform:
➢ Supports managing AWS Lambda functions and other serverless
components.
Create a template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  MyLambdaFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.lambdaHandler
      Runtime: python3.8
      CodeUri: ./
      Events:
        ApiEvent:
          Type: Api
          Properties:
            Path: /greet   # endpoint
            Method: GET
sam build
sam deploy --guided
curl https://<api-id>.execute-api.<region>.amazonaws.com/Prod/greet
➢ Release Gates: Establishing security checks as mandatory gates in the pipeline, such as
requiring code scanning or penetration testing before deployment.
MERN example (login flow):
React – frontend – a login page that takes username and password as input from the user: <username>, <password>
M – MySQL DB (for illustration; in MERN the M is actually MongoDB)
select *
from auth
where username = <username>
and password = <password>;
➢ Concatenating user input directly into the query like this is the classic SQL injection risk that security practices must catch.
2) Lack of Security Knowledge: DevOps teams may lack deep security knowledge,
making it hard to implement best practices.
➢ To overcome these challenges, DevOps teams integrate Security as Code (i.e., using
code to define security policies) and enhance their security skills.
1) DevOpsSec:
➢ Starts with DevOps practices, with security added as a layer on top, often later in the pipeline.
➢ This approach is more reactive, adding security as DevOps practices mature
2) SecDevOps:
➢ Emphasizes security as the core foundation, followed by development and operational practices.
➢ This approach places security as the primary focus, even if it impacts speed.
3) DevSecOps:
➢ Integrates security at each stage of DevOps, promoting security as everyone’s responsibility.
➢ Security is treated as a part of development and operations from the start.
➢ In practical terms, DevSecOps is widely adopted due to its balanced approach. Security is embedded into each phase
without overshadowing the goals of DevOps, unlike SecDevOps, which might prioritize security over other objectives
Authorization:
After authentication, AWS IAM determines what actions the
authenticated user is allowed to perform based on the policies attached
to their user or group.
Example:
Authentication - A user logging into an application using a username
and password.
➢ Regular Audits:
Regularly review IAM policies and permissions to ensure they are
up-to-date and secure.
Security as Code
➢ Security as Code is a DevOps practice that leverages IaC and security tools to
automate security practices:
1) Static Application Security Testing (SAST): Tools like SonarQube analyze code
for vulnerabilities during development.
2) Dynamic Application Security Testing (DAST): Tools like OWASP ZAP or Burp
Suite run tests against live applications, simulating attacks to detect
vulnerabilities.
4) Infrastructure Security: Using IaC tools (like Terraform) with security best
practices baked in (e.g., ensuring IAM roles have the least privilege).
1) Fine-Grained Access Control: IAM allows you to create detailed policies that
define what actions are allowed or denied on specific AWS resources.
3) Role-Based Access Control (RBAC): IAM supports the creation of roles that can
be assumed by AWS services (e.g., EC2, Lambda).
This allows you to securely grant permissions to applications and services without
hardcoding credentials, enhancing security in automated deployments.
➢ Compliance as Code applies IaC concepts to ensure compliance requirements are met
automatically through code:
1) Policy-as-Code: Define and enforce policies within IaC. For example, using tools like
Open Policy Agent (OPA) and AWS Config to enforce rules such as all storage
buckets must be encrypted.
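A hedged Terraform sketch of the encryption rule itself (resource names are placeholders); a policy tool such as OPA or AWS Config would then check that every bucket carries such a configuration:
resource "aws_s3_bucket" "data" {
  bucket = "my-encrypted-bucket"
}
# enforce server-side encryption on the bucket by default
resource "aws_s3_bucket_server_side_encryption_configuration" "data" {
  bucket = aws_s3_bucket.data.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}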
These signals are called metrics, and the monitoring system helps in tracking them and
alerts the hospital staff whenever some metric is outside the normal range. The doctor
rushes to the patient whose device is beeping and asks the nurse to provide health
records to further diagnose.
Continuous monitoring is the ongoing process of collecting, analyzing, and alerting on
data to ensure systems' health and reliability in real time. Observability in DevOps is
about understanding the internal state of systems by gathering telemetry data like
metrics, logs, and traces.
Each pillar provides a different type of insight into the system's health and
performance:
1) Logs:
Logs are timestamped records of discrete events that occur within an
application or system.
Logs are essential for understanding specific events and errors in detail,
especially when debugging or conducting root-cause analysis.
2) Metrics:
Metrics typically represent system states or resource utilization, like CPU load,
memory usage, or request rate.
3) Traces:
Traces help visualize the path of a request through multiple services, making
it easier to detect latency, failures, and bottlenecks across the entire request
flow.
Ex1: A trace showing the execution time for each microservice in a single
user transaction.
Prometheus:
It scrapes metrics from instrumented jobs and stores them for analysis.
Grafana:
Visualization tool that integrates with Prometheus and other data sources.
Allows users to create dashboards for real-time insights into system metrics.
Ex: Build pipelines can push job metrics (e.g., build time,
status) to the Pushgateway when triggered by CI/CD tools
like Jenkins.
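A minimal sketch of such a push, assuming a Pushgateway listening on localhost:9091 and an illustrative metric name:
# push a build-duration metric under the job label "jenkins_build"
echo "build_duration_seconds 42" | curl --data-binary @- http://localhost:9091/metrics/job/jenkins_build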
Timely notifications to the DevOps team can ensure faster response, reducing
risks to service availability and data safety.
5) Federation
6) Exporters
Not all applications or systems expose metrics in the format that Prometheus requires.
Exporters serve as a bridge, translating native system metrics into a format that
Prometheus understands.
7) Web UI
The Prometheus web UI is a built-in user interface that allows users to explore and
interact with the metrics data collected by Prometheus.
MLOps
➢ What is a ML model?
➢ Introducing MLOps – definition, challenges, scale
➢ People of MLOps
➢ Key MLOps features
➢ Containerization of ML Models end to end
➢ Deploying to Production - CI/CD pipelines, ML artifacts, strategies, containerization
➢ MLOps in Practice: Marketing Recommendation Engines case study
Metric Types:
➢ syntax:
Counter(metric_name, help_text, labelnames=(), namespace='', subsystem='', unit='')
➢ Ex:
from prometheus_client import Counter
REQUEST_COUNT = Counter('http_requests_total', 'Total HTTP requests received')
➢ syntax:
Histogram(metric_name, help_text, labelnames=(), namespace='', subsystem='', unit='', buckets=(0.005, 0.01,
0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10))
➢ Ex:
from prometheus_client import Histogram
REQUEST_LATENCY = Histogram('http_request_latency_seconds', 'HTTP request latency in seconds')
➢ syntax:
Gauge(metric_name, help_text, labelnames=(), namespace='', subsystem='', unit='')
➢ Ex:
from prometheus_client import Gauge
IN_PROGRESS = Gauge('http_requests_in_progress', 'Number of HTTP requests currently being handled')
from fastapi import FastAPI, Request
from fastapi.responses import Response
from prometheus_client import Counter, Histogram, generate_latest, CONTENT_TYPE_LATEST
import time

app = FastAPI()
# metrics definitions (as above)
REQUEST_COUNT = Counter('http_requests_total', 'Total HTTP requests received')
ERROR_COUNT = Counter('http_errors_total', 'Total HTTP 5xx responses')
REQUEST_LATENCY = Histogram('http_request_latency_seconds', 'HTTP request latency in seconds')

@app.middleware("http")
async def prometheus_middleware(request: Request, call_next):
    # Start timer
    start_time = time.time()
    response = await call_next(request)
    # Update metrics
    REQUEST_COUNT.inc()  # Increment request count
    REQUEST_LATENCY.observe(time.time() - start_time)  # Record request latency
    if response.status_code >= 500:
        ERROR_COUNT.inc()  # Increment error count for 5xx responses
    return response

@app.get("/metrics")
async def metrics():
    # Expose the collected metrics in Prometheus text format
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
global:
  scrape_interval: 5s   # How often to scrape targets by default
scrape_configs:
  - job_name: "fastapi_app"
    static_configs:
      - targets: ["localhost:8000"]
prometheus --config.file=prometheus.yml
Prometheus will periodically scrape metrics from http://localhost:8000/metrics
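The scraped data can then be queried with PromQL; for example, using the counter defined earlier:
# per-second request rate averaged over the last 5 minutes
rate(http_requests_total[5m])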
5) Runtime Metrics: Resource usage over time to detect anomalies and trends.
➢ Use the appropriate Prometheus scrape configuration (jobs, exporters, service
discovery) for the various types of services we are running.
➢ Alerting:
➢ Define Prometheus alert rules for critical metrics (e.g., high latency
or error rates).
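A minimal alert-rule sketch, assuming the error counter defined earlier and an illustrative threshold:
groups:
  - name: fastapi-alerts
    rules:
      - alert: HighErrorRate
        expr: rate(http_errors_total[5m]) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High 5xx error rate on fastapi_app"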
➢ Continuous Monitoring:
➢ It’s created through training, where algorithms use labeled data to identify
features that contribute to accurate predictions.
Machine learns
and comes up with roughly
m = 1, c = 2 (the fitted line: y = 1.05x + 1.8)
Predict:
x = 4 → y ≈ 6
Key components of an ML model
➢ Algorithm: The underlying mathematical framework used to build the
model (e.g., linear regression, decision trees).
➢ Parameters: Values that the model optimizes during training to fit the
data.
DataOps
➢ What is DataOps?
➢ Data Pipeline – Environment and Orchestration
➢ Organizing for DataOps – Team Structure
➢ Model and Data Drift: Over time, models can lose accuracy due to changes
in data distribution (data drift) or evolving business contexts (concept
drift).
➢ Model Versioning and Tracking: Version control for ML models, data, and
experiments for reproducibility and traceability.
➢ Data and Model Governance: Ensuring compliance with data privacy and
governance policies, managing data lineage, and access control.
3) Reinforcement Learning
Ex: self-driving car
Code:
import joblib
model = ...  # Train your model
joblib.dump(model, 'model.pkl')
# reload it later with: model = joblib.load('model.pkl')
# Install dependencies
RUN pip install --no-cache-dir fastapi uvicorn joblib numpy scikit-learn
b) Use Multi-stage Builds: Split the Dockerfile into build and runtime stages
to exclude unnecessary build dependencies.
*.pyc
__pycache__/
*.log
data/
c) Cloud Services:
➢ Use AWS ECS, Azure AKS, or Google Kubernetes Engine (GKE) for
container orchestration.
➢ Use AWS SageMaker or Google AI Platform for model hosting.
a) Monitoring:
➢ Integrate monitoring tools like Prometheus and Grafana to track
container metrics.
➢ Use application performance monitoring (APM) tools for API response
time and error tracking.
➢ This is a well-known DevOps practice called Data Versioning and Lineage
Tracking
➢ The dataset itself is excluded from Git tracking (avoiding Git's size
limits).
dvc push
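In context, a sketch of the usual DVC flow around that push, assuming the dataset lives in data/ and a remote is already configured:
# track the dataset with DVC (creates data.dvc; the data itself stays out of Git)
dvc add data/
git add data.dvc .gitignore
git commit -m "Track dataset v1 with DVC"
# upload the data to the configured remote
dvc push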
➢ Helps in debugging issues and understanding the data flow through the
pipeline.
➢ Improves efficiency and ensures that only new data is used for periodic
model retraining, reducing computational overhead.
➢ Example Tools:
MLOps: MLflow, Kubeflow, TFX, SageMaker, Azure ML.
DataOps: DVC, Apache NiFi, Talend, Great Expectations, dbt, Airflow.
Canary Deployment:
➢ Gradually rolling out the model to a small percentage of users before a
full rollout.
A/B Testing:
➢ Deploying multiple models to compare their effectiveness based on real-
world feedback.
Blue-Green Deployment:
➢ Two identical environments run side by side: Blue (the current version) and Green (the new one).
➢ Traffic is then switched from Blue to Green once the new version is
validated.
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def greet():
    return {"message": "Hello, FastAPI on Google Cloud Functions!"}
requirements.txt:
fastapi==0.95.1
mangum==0.14.0
uvicorn==0.22.0
functions-framework==3.3.0
Step 3: Create a Google Cloud Storage bucket object for the source code zip of the Cloud
Function:
resource "google_storage_bucket_object" "ssd_gcp_lambda_code_object" {
  name   = "source.zip"
  bucket = google_storage_bucket.ssd_gcp_bucket.name
  source = "path_to_source.zip"
}
➢ ASGI frameworks like FastAPI, Starlette, and Django with Channels are asynchronous
and optimized for modern web applications.
➢ However, platforms like AWS Lambda and Google Cloud Functions expect specific
synchronous HTTP interfaces. Mangum bridges this gap, enabling ASGI applications to
run seamlessly in these serverless environments.
source_archive_bucket = google_storage_bucket.ssd_gcp_bucket.name
source_archive_object = google_storage_bucket_object.ssd_gcp_lambda_code_object.name
trigger_http = true
}
➢ Link: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
➢ This will run the script on the Azure VM after it is provisioned, which will call the GCP Cloud Function.
1. Definition:
Cloud native applications are software applications designed specifically to take advantage
of cloud computing frameworks, which typically consist of microservices packaged in
containers, deployed as part of a continuous delivery process, and managed on elastic
infrastructure through agile DevOps processes and tools.
2. Key Characteristics:
a) Microservices Architecture:
- As shown in the top-left of the diagram, cloud native apps are built using a microservices
architecture.
- Each microservice (Service A, B, C) is a small, independent module focused on a specific
business capability.
- This approach allows for easier development, testing, and maintenance of individual
components.
b) Containerization:
- Illustrated in the top-right, containerization is a key aspect of cloud native applications.
- Containers package the application code along with its dependencies, ensuring
consistency across different environments.
- Popular containerization technologies include Docker and containerd.
c) Orchestration:
- Shown in the middle-left, orchestration tools like Kubernetes manage the deployment,
scaling, and operation of application containers.
- They handle tasks such as load balancing, service discovery, and rolling updates.
d) CI/CD Pipeline:
- The middle-right of the diagram shows a typical Continuous Integration/Continuous
Deployment (CI/CD) pipeline.
- This automated process includes stages for coding, building, testing, and deploying
applications.
- It enables frequent and reliable releases of application changes.
e) Scalability:
- Depicted in the bottom-left, cloud native apps are designed to scale horizontally.
- They can quickly spawn new instances to handle increased load and scale down when
demand decreases.
f) Observability:
- The bottom-right shows the three pillars of observability: logging, metrics, and tracing.
- These provide deep insights into application behavior and performance in distributed
systems.
4. Challenges:
- Complexity: Managing distributed systems can be challenging.
- Security: Increased attack surface due to distributed nature.
- Monitoring: Requires sophisticated observability tools and practices.
- Cultural shift: Requires changes in development and operational practices.
6. Best Practices:
- Design for failure: Assume any component can fail and design accordingly.
- Implement automated testing and deployment.
- Use declarative rather than imperative programming.
- Implement proper logging and monitoring from the start.
- Embrace infrastructure as code for consistent environments.
Cloud native applications represent a significant shift in how we build and deploy software,
offering numerous benefits in terms of scalability, resilience, and agility. However, they also
come with their own set of challenges that organizations need to address to fully leverage
their potential.
Now, let me explain cloud native applications from a DevOps perspective, referencing the
diagram:
1. DevOps Philosophy in Cloud Native: DevOps in cloud native environments focuses
on automating and integrating the processes between software development and IT
teams. The goal is to build, test, and release software faster and more reliably in the
cloud.
2. The DevOps Lifecycle for Cloud Native Apps: a) Plan:
o DevOps engineers collaborate with product managers and developers to plan
features and sprints.
o Tools: Jira, Trello for project management and tracking.
b) Code:
o Developers write code for microservices, while DevOps engineers focus on
infrastructure-as-code and automation scripts.
o Tools: Git for version control, GitHub or GitLab for collaboration.
c) Build:
o Automated build processes create container images from the source code.
o Tools: Docker for containerization, Maven or Gradle for build automation.
d) Test:
o Automated testing includes unit tests, integration tests, and end-to-end tests.
o Tools: Selenium for UI testing, JUnit or PyTest for unit testing, Postman for
API testing.
e) Deploy:
o Automated deployment to cloud environments, often using container
orchestration.
o Tools: Kubernetes for orchestration, Helm for Kubernetes package
management, Terraform for infrastructure provisioning.
f) Operate:
o Continuous operation and maintenance of the deployed applications.
o Tools: Prometheus for monitoring, ELK stack (Elasticsearch, Logstash,
Kibana) for logging.
g) Monitor:
o Constant monitoring of application performance and user experience.
o Tools: Grafana for visualization, Datadog for full-stack monitoring.
h) Feedback:
o Gathering and analyzing user feedback and system performance data to inform
the next iteration.
3. Key DevOps Practices for Cloud Native: a) Infrastructure as Code (IaC):
o Define and manage infrastructure using code (e.g., Terraform, AWS
CloudFormation).
o Ensures consistency and repeatability in cloud environments.
b) Continuous Integration/Continuous Deployment (CI/CD):
o Automate the build, test, and deployment processes.
o Tools: Jenkins, GitLab CI, GitHub Actions.
c) Microservices Architecture:
o Build applications as a set of small, loosely coupled services.
o Enables independent development, deployment, and scaling of components.
d) Containerization:
o Package applications and dependencies into containers.
o Ensures consistency across different environments and simplifies deployment.
e) Container Orchestration:
o Use tools like Kubernetes to manage, scale, and maintain containerized
applications.
f) Automated Monitoring and Logging:
o Implement comprehensive monitoring and centralized logging from the start.
o Crucial for maintaining visibility in distributed cloud native systems.
g) Security as Code:
o Integrate security practices into the DevOps workflow (DevSecOps).
o Implement automated security testing and compliance checks.
4. Cloud Platforms: DevOps engineers work with various cloud platforms such as AWS,
Google Cloud Platform (GCP), and Microsoft Azure. Each platform offers specific
tools and services that integrate into the DevOps workflow.
5. Challenges and Considerations:
o Managing complexity in distributed systems.
o Ensuring security in a more exposed environment.
o Optimizing costs in pay-as-you-go cloud models.
o Maintaining performance and reliability at scale.
o Keeping up with rapidly evolving cloud technologies and best practices.
6. Best Practices:
o Embrace automation at every stage of the lifecycle.
o Implement robust monitoring and alerting systems.
o Use blue-green deployments or canary releases for safer updates.
o Regularly audit and optimize cloud resource usage.
o Foster a culture of continuous learning and improvement.
From a DevOps perspective, cloud native applications require a shift in mindset and
practices. The focus is on automation, continuous delivery, and leveraging cloud services to
build scalable, resilient, and efficient applications. DevOps engineers play a crucial role in
bridging development and operations, ensuring smooth deployment and operation of cloud
native applications throughout their lifecycle.
Docker Container:-
1. What is Docker? Docker is an open-source platform that automates the deployment,
scaling, and management of applications using containerization technology. It allows
you to package an application with all of its dependencies into a standardized unit for
software development and deployment.
2. Docker Architecture: a) Docker Client: The command-line interface (CLI) that allows
users to interact with Docker. b) Docker Daemon: The background service running on
the host that manages building, running, and distributing Docker containers. c)
Docker Registry: A repository for Docker images, which can be public (like Docker
Hub) or private.
3. Key Concepts: a) Docker Image: A read-only template with instructions for creating a
Docker container. b) Docker Container: A runnable instance of an image, which can
be started, stopped, moved, or deleted. c) Dockerfile: A text file that contains
instructions to build a Docker image. d) Docker Compose: A tool for defining and
running multi-container Docker applications.
4. Container Structure: A Docker container typically consists of:
o Application code
o Runtime environment
o System tools and libraries
o Settings
5. Docker Lifecycle: a) Build: Create a Docker image from a Dockerfile. b) Ship: Push
the image to a registry for storage and sharing. c) Run: Create and run containers from
the image.
6. Key Docker Commands:
o docker build: Build an image from a Dockerfile
o docker run: Create and start a container
o docker pull: Pull an image from a registry
o docker push: Push an image to a registry
o docker ps: List running containers
o docker stop: Stop a running container
o docker rm: Remove a container
o docker rmi: Remove an image
7. Advantages of Docker:
o Consistency across development, testing, and production environments
o Lightweight and fast compared to traditional VMs
o Efficient use of system resources
o Rapid deployment and scaling
o Isolation and security
8. Docker Networking: Docker provides several networking options:
o Bridge: Default network driver
o Host: Remove network isolation between container and host
o Overlay: Connect multiple Docker daemons
o Macvlan: Assign a MAC address to a container
9. Docker Storage:
o Volumes: Preferred mechanism for persisting data generated by and used by
Docker containers
o Bind Mounts: Map a host file or directory to a container file or directory
o tmpfs Mounts: Stored in the host system's memory only
10. Docker Security:
o Namespaces: Provide isolation for running processes
o Control Groups (cgroups): Limit application resources
o Docker Content Trust: Ensures integrity of container images
o Security scanning: Identify vulnerabilities in container images
11. Docker Swarm: Docker's native clustering and orchestration solution, allowing you to
create and manage a swarm of Docker nodes.
12. Dockerfile Best Practices:
o Use official base images
o Minimize the number of layers
o Leverage build cache
o Use multi-stage builds for smaller final images
o Don't install unnecessary packages
o Use .dockerignore file
13. Docker Compose: Allows defining and running multi-container Docker applications.
Key features:
o Define services in a YAML file
o Start all services with a single command
o Manage the lifecycle of services
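For instance, a minimal docker-compose.yml sketch illustrating these features (images and ports are illustrative):
version: "3.8"
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
  cache:
    image: redis:7
Running docker compose up -d then starts both services with a single command.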
14. Docker in CI/CD: Docker is widely used in Continuous Integration and Continuous
Deployment pipelines:
o Consistent environments for testing
o Easy integration with CI/CD tools (Jenkins, GitLab CI, GitHub Actions)
o Simplified deployment process
15. Limitations and Considerations:
o Containers share the host OS kernel, which can be a security concern
o Complex applications may require additional orchestration tools (like
Kubernetes)
o Learning curve for teams new to containerization
16. Docker Alternatives: While Docker is the most popular, there are alternatives:
o containerd
o CRI-O
o Podman
o LXC (Linux Containers)
17. Future Trends:
o Increased adoption of Docker in edge computing
o Integration with serverless architectures
o Enhanced security features
o Improved support for IoT and embedded systems
Docker has revolutionized the way applications are developed, shipped, and run, making it an
essential tool in modern software development and DevOps practices. Its ecosystem
continues to grow and evolve, addressing the complex needs of today's distributed systems
and cloud-native applications.
Maven:-
Maven is a popular build automation and project management tool used primarily for Java
projects. Key features of Maven include:
1. Project Object Model (POM): Maven uses an XML file called pom.xml to describe
the project structure, dependencies, and build process.
2. Convention over Configuration: Maven follows a standard directory structure and
build lifecycle, reducing the need for extensive configuration.
3. Dependency Management: Maven can automatically download and manage project
dependencies from remote repositories.
4. Build Lifecycle: As shown in the diagram, Maven has a defined build lifecycle with
phases like validate, compile, test, package, verify, and deploy.
5. Plugins: Maven's functionality can be extended through plugins, which can be bound
to different phases of the build lifecycle.
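For example, lifecycle phases are invoked from the command line, and a later phase runs all earlier ones:
# compile, run tests, and package the project into target/
mvn clean package
# run the lifecycle through installing the artifact into the local repository
mvn clean install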
Gradle:-
Gradle is another popular build automation tool that can be used for multiple programming
languages. Here are some key features of Gradle:
1. Flexibility: Gradle uses a Groovy or Kotlin-based DSL (Domain Specific Language)
for describing builds, offering more flexibility than XML-based systems.
2. Performance: Gradle is designed to be fast, with features like incremental builds and
build cache.
3. Multi-project Builds: Gradle excels at handling complex, multi-project builds.
4. Dependency Management: Like Maven, Gradle can manage project dependencies
efficiently.
5. Plugins: Gradle has a rich ecosystem of plugins that extend its functionality.
6. Build Phases: As shown in the diagram, Gradle has three main build phases:
Initialization, Configuration, and Execution. During these phases, it constructs and
executes a task graph.
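A couple of illustrative invocations (task names assume the Java plugin):
# run the standard build task graph: compile, test, assemble
gradle build
# list the tasks Gradle has configured for this project
gradle tasks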
When working with cloud environments, Git becomes even more powerful and integral to the
development and deployment process.
2. **Cloud-Native CI/CD**:
- Cloud providers offer Git-integrated CI/CD services (e.g., AWS CodePipeline, Azure
DevOps, Google Cloud Build)
- Automatically deploy to cloud services on Git push
3. **Serverless Deployments**:
- Deploy serverless functions directly from Git repositories
- Examples: AWS Lambda, Azure Functions, Google Cloud Functions
4. **Container Orchestration**:
- Store Kubernetes manifests or Helm charts in Git
- Implement GitOps practices for Kubernetes deployments
5. **Secrets Management**:
- Use Git hooks or CI/CD pipelines to inject secrets from secure stores
- Never store sensitive information directly in Git repositories
6. **Multi-Cloud Strategies**:
- Use Git branching strategies to manage deployments across multiple cloud providers
- Implement cloud-agnostic IaC practices
By leveraging Git effectively in cloud DevOps practices, teams can achieve faster, more
reliable, and more secure software delivery pipelines.
1. Developer Workstations: Where developers write code and make local commits.
2. Feature Branches: Developers push their code to feature branches for collaboration
and review.
3. Main Branch: The primary branch (often called 'main' or 'master') where approved
code is merged.
4. CI/CD Pipeline: Automated processes for building, testing, and deploying code.
5. Cloud Deployment: The final stage where code is deployed to cloud infrastructure.
The arrows show the flow of code through the system:
● Developers push code to feature branches.
● Pull requests are created to merge feature branches into the main branch.
● Once approved and merged, the CI/CD pipeline is triggered.
● The pipeline automatically deploys the code to the cloud environment.
This workflow enables continuous integration and continuous deployment, key practices in
DevOps. It allows for rapid, frequent, and reliable software releases.
GIT workflows:-
1. Centralized Workflow
The centralized workflow is the simplest and most traditional model for version control. In
this workflow, there's a single central repository that serves as the "single point of truth" for
the project.
Key points:
● All developers clone from and push to this central repository
● Typically uses a single main branch (often called 'master' or 'main')
● Suitable for small teams or projects with infrequent updates
DevOps perspective for cloud:
● Easy to set up and manage in cloud environments
● Can be integrated with CI/CD pipelines for automated testing and deployment
● May lead to bottlenecks as all changes go through a single point
● Requires careful coordination to avoid conflicts, especially in larger teams
Summary
● TDD focuses on writing tests first, ensuring each piece of code is validated before
moving on.
● FDD emphasizes delivering features systematically, breaking down development into
manageable parts.
● BDD bridges the gap between technical and non-technical stakeholders, ensuring the
software behaves as intended from a user perspective.
Each methodology has its strengths and can be adapted based on the project’s requirements
and team dynamics.
IaC enables teams to define and manage infrastructure components such as servers,
networking, storage, and services in a declarative manner (where the desired end state is
defined) or an imperative manner (where specific steps to reach the end state are defined).
The code used in IaC can be version-controlled and shared, similar to application code, and is
executed using automation tools like Terraform, Ansible, and CloudFormation.
○ Declarative IaC: The user specifies what the infrastructure should look like,
and the IaC tool figures out how to achieve it. Example tools: Terraform,
Kubernetes, CloudFormation.
○ Imperative IaC: The user specifies how to achieve the infrastructure
configuration, including all the steps. Example tools: Ansible (though it can
also be declarative depending on how it is used).
2. Version Control:
○ Infrastructure code is stored in version control systems (e.g., Git). This allows
teams to track changes over time, roll back to previous configurations, and
ensure consistency across environments.
3. Automation:
○ IaC tools are idempotent, meaning that running the same script multiple times
results in the same outcome, regardless of the number of executions. This
ensures predictable results and avoids unintended changes.
5. Immutable Infrastructure:
Security in the DevOps lifecycle (often called DevSecOps) integrates security practices
directly into the DevOps workflow. The goal of DevSecOps is to ensure that security is a
shared responsibility throughout the entire DevOps lifecycle, from planning and development
to deployment and maintenance.
○ CI/CD pipelines automate the process of integrating new code into a shared
repository and deploying it to production. Security tests and controls can be
built into these pipelines to ensure that vulnerabilities are detected and
remediated early.
3. Security as Code:
○ Static Application Security Testing (SAST): Tools that analyze the source
code for vulnerabilities without executing the code (e.g., Checkmarx,
Veracode).
○ Dynamic Application Security Testing (DAST): Tools that test a running
application for vulnerabilities (e.g., OWASP ZAP, Acunetix).
○ Software Composition Analysis (SCA): Scans for vulnerabilities in
third-party libraries and dependencies (e.g., WhiteSource, Snyk).
○ Interactive Application Security Testing (IAST): Monitors application
behavior during runtime to identify vulnerabilities in a live environment.
5. Secrets Management:
When IaC is combined with security principles, it ensures that infrastructure is not only
provisioned efficiently but is also secured from the outset.
1. Security in Infrastructure Code:
○ Security considerations should be part of the IaC scripts. For example, tools
like Terraform can integrate security-related checks (e.g., ensuring that
security groups in AWS are properly configured, and ensuring that storage is
encrypted).
2. Automated Security Testing for IaC:
○ Tools like Terraform's tflint and Checkov can be used to analyze IaC code
for security vulnerabilities or misconfigurations before the infrastructure is
provisioned.
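An illustrative scan, assuming the Terraform code lives in ./terraform:
# lint the Terraform code for errors and deprecated syntax
cd terraform && tflint
# scan the directory for security misconfigurations before provisioning
checkov -d ./terraform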
3. Compliance as Code:
○ Compliance checks can be incorporated directly into IaC code, ensuring that
every environment meets specific regulatory requirements.
4. Immutable Infrastructure with Security:
○ Pipeline Security: Ensure that CI/CD pipelines themselves are secure. This
includes using encrypted channels for communication and ensuring that the
CI/CD server has proper access control and is not a point of attack.
In today’s rapidly evolving world of data-driven technologies, both MLOps and DataOps
play crucial roles in enabling organizations to streamline their data management and machine
learning workflows. These practices aim to automate, improve collaboration, and ensure the
smooth delivery of data and machine learning models to production environments.
MLOps is a set of practices and tools designed to automate and streamline the deployment,
monitoring, and management of machine learning (ML) models in production. It is
essentially the DevOps equivalent for machine learning, focusing on the lifecycle of ML
models—from development to deployment and ongoing maintenance.
1. Model Development:
○ MLOps starts with the data scientists' work on feature engineering, data
preprocessing, model training, and validation. The models are built and tested,
often using notebooks or scripts.
2. Version Control:
○ Once the data scientists have developed a model, it undergoes training, where
various machine learning algorithms are used. MLOps ensures the
repeatability and reproducibility of model training by automating the training
pipelines and scaling the process across multiple machines.
4. Model Deployment:
MLOps Tools:
Benefits of MLOps:
DataOps refers to the set of practices, tools, and technologies used to streamline the data
pipeline, ensuring faster, more efficient, and error-free data movement across the
organization. It is similar to DevOps but focuses on the flow of data within an organization
rather than software or ML models.
○ DataOps promotes collaboration between teams that manage the data pipeline,
including data engineers, data analysts, and data scientists. It provides a
unified view of data across different teams and systems.
4. Data Quality:
○ DataOps involves version control not just for code, but also for datasets and
data pipelines. This ensures that data transformations are reproducible, and
previous data states can be accessed for auditing or rollback purposes.
DataOps Tools:
Benefits of DataOps:
● Faster Data Delivery: By automating the data pipeline, DataOps ensures faster and
more reliable delivery of data to stakeholders.
● Improved Collaboration: DataOps breaks down silos and improves collaboration
between different teams (data engineers, scientists, analysts).
● Data Quality and Governance: Helps ensure that data is accurate, consistent, and
compliant with regulatory standards.
● Scalability: Enables organizations to scale their data infrastructure and processes to
handle larger volumes of data and more complex workflows.
Similarities:
● Both MLOps and DataOps emphasize automation and collaboration across teams.
● Both aim to improve the speed and efficiency of delivering reliable and consistent
results (models or data) to stakeholders.
● Continuous integration and deployment (CI/CD) are key components of both
practices, enabling continuous updates and improvements.
Differences:
● Focus Area: MLOps primarily deals with machine learning models and their
deployment, while DataOps is focused on the flow, transformation, and governance of
data.
● End Goal: The end goal of MLOps is to deploy machine learning models into
production environments with reliable monitoring and maintenance, whereas the goal
of DataOps is to ensure that data is clean, accurate, and flows seamlessly across
various systems and processes.
● DataOps ensures that the data pipeline is efficient and the data is high quality,
which is critical for training reliable machine learning models in MLOps.
● MLOps ensures that the models are deployed and maintained based on the data
processed by the DataOps pipelines.
An integrated approach where data quality and governance (DataOps) complement model
development and deployment (MLOps) can lead to faster, more reliable, and more scalable
solutions for businesses working with machine learning and big data.
Conclusion
Both MLOps and DataOps are integral to building and maintaining efficient, scalable, and
automated workflows for managing data and machine learning models. By focusing on
automation, collaboration, monitoring, and quality control, these practices enable
organizations to improve the speed, reliability, and performance of their data-driven
operations.
● MLOps is crucial for managing the lifecycle of machine learning models, including
development, deployment, and monitoring.
● DataOps focuses on the seamless flow, quality, and governance of data across the
organization, ensuring that data is available, accurate, and secure for downstream
tasks, including ML model training and analytics.
Together, they form the foundation of a data-driven DevOps approach, where both data and
machine learning workflows are automated, secure, and scalable.
Observability
Observability refers to the ability to understand the internal state of a system based on the
external outputs or signals it provides. In other words, it’s about how well you can observe
and diagnose what's happening in a system in real time. Observability is crucial for
monitoring the health, performance, and security of applications and infrastructure.
1. Metrics:
○ Metrics are quantitative data points that track the health and performance of a
system. Examples of metrics include CPU utilization, memory usage, response
times, and throughput.
○ Example: In a web application, metrics like the number of requests per second
or error rates give an indication of performance and potential issues.
2. Logs:
○ Logs are detailed, timestamped records that provide insights into events and
transactions within the system. Logs can be generated by applications, servers,
databases, and other infrastructure components.
○ Example: A 500 error log entry in a web server log file can help identify a
failure in the system that needs attention.
3. Traces:
○ Traces follow the path of a request as it moves through multiple services,
helping pinpoint where latency or failures occur.
4. Alerts:
○ Alerts are triggered based on predefined thresholds for metrics, logs, or traces.
Alerts notify teams of critical issues that require immediate attention, such as
high error rates or system downtimes.
○ Example: An alert might be triggered if CPU utilization exceeds 90%,
indicating a potential performance bottleneck.
Continuous Monitoring
Continuous Monitoring refers to the practice of continuously observing the health and
performance of an application, infrastructure, and its components to detect and respond to
issues promptly. It involves real-time collection of data, automated analysis, and alerting to
ensure that systems are operating efficiently.
1. Real-Time Monitoring:
○ Telemetry is collected and analyzed as it is produced, so problems surface as they happen.
2. Anomaly Detection:
○ Anomaly detection involves identifying patterns that deviate from the normal
operation of a system. Machine learning models can be used to automatically
operation of a system. Machine learning models can be used to automatically
learn the baseline behavior of applications and flag unusual patterns.
○ Example: Identifying sudden spikes in traffic or resource usage that could
indicate a performance issue or a security breach.
4. Continuous Feedback Loop:
1. Automation:
○ If infrastructure is lost (e.g., due to a data center failure), IaC allows you to
recreate it from the code, facilitating faster disaster recovery.
○ Declarative IaC: Describes what the end state of the infrastructure should be,
and the tool ensures that the infrastructure matches this state. (e.g., Terraform,
CloudFormation)
○ Imperative IaC: Specifies how to achieve a particular infrastructure state,
often with step-by-step instructions. (e.g., Ansible, Chef, Puppet)
2. Infrastructure as Code Tools:
Conclusion
● Observability enables teams to monitor and understand the state of their systems in
real time, providing insights into performance, security, and user experience.
● Continuous Monitoring enhances observability by continuously tracking the health
of applications and infrastructure, ensuring that issues are detected and resolved
proactively.
● Infrastructure as Code (IaC) automates and streamlines the process of managing
infrastructure, ensuring consistency, scalability, and faster recovery in case of failures.
Together, these practices help in building and maintaining highly reliable, efficient, and
secure systems, ensuring that applications and infrastructure are always running smoothly.
Integrating IaC with observability and continuous monitoring creates a powerful, automated,
and responsive DevOps pipeline.
Container Orchestration: Kubernetes
Kubernetes Overview
● Features:
○ Advanced scheduling
○ Scaling
○ Self-healing
○ Networking
Kubernetes Architecture
1. Pod:
Example YAML:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
    - name: nginx-container
      image: nginx
      ports:
        - containerPort: 80
2. ReplicaSet:
Example YAML:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaSet
spec:
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
        - name: nginx-container
          image: nginx
3. Deployment:
1. Version Information:
○ Check client and server versions:
kubectl version
2. Listing Resources:
○ List nodes:
kubectl get nodes
○ List pods and services:
kubectl get pods,svc
3. Pod Operations:
○ Run a pod:
kubectl run nginx-pod --image=nginx
○ Create from YAML:
kubectl create -f pod-def.yml
○ Update:
kubectl apply -f pod-def.yml
4. ReplicaSet Operations:
○ Create:
kubectl create -f repset-def.yml
○ Scale:
kubectl scale --replicas=6 -f repset-def.yml
5. General Help:
kubectl --help
Possible Scenarios:-
Here's expanded and detailed material for each scenario, with
examples, YAML configurations, and explanations. You can copy
and print this directly for your open-book exam.
Scenario 1: Running a Simple Web Pod
YAML Configuration:
apiVersion: v1
kind: Pod
metadata:
  name: web-app-pod
  labels:
    app: web-app
spec:
  containers:
    - name: nginx-container
      image: nginx
      ports:
        - containerPort: 80
Commands:
Create the Pod:
kubectl apply -f pod-definition.yaml
Verify:
kubectl get pods
Scenario 2: Replicating the Application with a ReplicaSet
YAML Configuration:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-app-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: nginx-container
          image: nginx
          ports:
            - containerPort: 80
Commands:
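Likely commands, following the pattern of the other scenarios (the file name is assumed):
Create the ReplicaSet:
kubectl apply -f replicaset-definition.yaml
Verify:
kubectl get replicaset
kubectl get pods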
Scenario 3: Rolling Out Updates with a Deployment
YAML Configuration:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: nginx-container
          image: nginx:1.20
Commands:
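Likely commands for a rolling update, following the other scenarios (file name and new image tag are assumed):
Apply the Deployment:
kubectl apply -f deployment-definition.yaml
Update the image and watch the rollout:
kubectl set image deployment/web-app-deployment nginx-container=nginx:1.21
kubectl rollout status deployment/web-app-deployment
Roll back if needed:
kubectl rollout undo deployment/web-app-deployment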
Scenario 4: Storing Sensitive Data in Secrets
YAML Configuration:
apiVersion: v1
kind: Secret
metadata:
  name: api-keys
type: Opaque
data:
  db-password: <base64-encoded value>   # actual value elided in the original

Pod consuming the Secret:
apiVersion: v1
kind: Pod
metadata:
  name: secret-demo
spec:
  containers:
    - name: app-container
      image: busybox
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: api-keys
              key: db-password
Scenario 5: Exposing Your Application
YAML Configuration:
apiVersion: v1
kind: Service
metadata:
  name: web-app-service
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30001
  selector:
    app: web-app
Commands:
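Likely commands (file name assumed); the app is then reachable on <node-ip>:30001:
Create the Service:
kubectl apply -f service-definition.yaml
Verify:
kubectl get svc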
Scenario 6: Running a Batch Job
YAML Configuration:
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processor
spec:
  template:
    spec:
      containers:
        - name: processor
          image: data-processor:latest
      restartPolicy: OnFailure
Commands:
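Likely commands (file name assumed):
Create the Job and check it:
kubectl apply -f job-definition.yaml
kubectl get jobs
kubectl logs job/data-processor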
Scenario 7: Organizing with Namespaces
Commands:
Create namespaces:
kubectl create namespace frontend
Scenario 8: Autoscaling an Application
YAML Configuration (HorizontalPodAutoscaler):
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app-deployment
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
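Likely commands to apply and verify the autoscaler (file name assumed):
kubectl apply -f hpa.yaml
kubectl get hpa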
Scenario 9: Debugging a Pod
Commands:
View logs:
kubectl logs <pod-name>
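Other standard debugging commands that fit this scenario:
Inspect events and state:
kubectl describe pod <pod-name>
Open a shell inside the pod:
kubectl exec -it <pod-name> -- /bin/sh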
Scenario 10: Persistent Storage with PV and PVC
YAML Configuration:
PersistentVolume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-storage
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/storage"
PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
Pod using the claim:
apiVersion: v1
kind: Pod
metadata:
  name: storage-pod
spec:
  containers:
    - name: app-container
      image: busybox
      volumeMounts:
        - mountPath: "/data"
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: pvc-storage
Pods
Definition:
A Pod is the smallest deployable unit in Kubernetes: one or more containers that share networking and storage.
YAML Configuration:
apiVersion: v1
kind: Pod
metadata:
  name: simple-pod
  labels:
    app: nginx-app
spec:
  containers:
    - name: nginx-container
      image: nginx:latest
      ports:
        - containerPort: 80
Commands:
Verify:
kubectl get pods
ReplicaSets
Definition:
A ReplicaSet ensures that a specified number of identical pod replicas is running at all times.
YAML Configuration:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-app
  template:
    metadata:
      labels:
        app: nginx-app
    spec:
      containers:
        - name: nginx-container
          image: nginx:latest
          ports:
            - containerPort: 80
Commands:
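Likely, following the pattern of the other sections (file name assumed):
Apply and verify:
kubectl apply -f replicaset.yaml
kubectl get rs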
Deployments
Definition:
A Deployment manages ReplicaSets and provides declarative rolling updates and rollbacks for pods.
YAML Configuration:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: nginx-app
    spec:
      containers:
        - name: nginx-container
          image: nginx:1.19
          ports:
            - containerPort: 80
Commands:
Apply Deployment:
kubectl apply -f deployment.yaml
Rollout Status:
kubectl rollout status deployment/nginx-deployment
ConfigMaps
Definition:
A ConfigMap stores non-confidential configuration as key-value pairs, decoupled from the container image.
YAML Configuration:
ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_NAME: "MyApp"
  APP_ENV: "Production"

Pod consuming the ConfigMap:
apiVersion: v1
kind: Pod
metadata:
  name: configmap-pod
spec:
  containers:
    - name: app-container
      image: busybox
      env:
        - name: APP_NAME
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: APP_NAME
        - name: APP_ENV
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: APP_ENV
Commands:
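Likely, following the pattern of the other sections (file name assumed):
Apply and verify:
kubectl apply -f configmap.yaml
kubectl get cm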
Secrets
Definition:
A Secret stores sensitive data such as passwords, tokens, and API keys (base64-encoded) for injection into pods.
YAML Configuration:
Secret:
apiVersion: v1
kind: Secret
metadata:
  name: api-secret
type: Opaque
data:
  API_KEY: <base64-encoded value>   # actual value elided in the original

Pod consuming the Secret:
apiVersion: v1
kind: Pod
metadata:
  name: secret-pod
spec:
  containers:
    - name: app-container
      image: busybox
      env:
        - name: API_KEY
          valueFrom:
            secretKeyRef:
              name: api-secret
              key: API_KEY
Commands:
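Likely, following the pattern of the other sections (file name assumed):
Apply and verify:
kubectl apply -f secret.yaml
kubectl get secrets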
Services
Definition:
A Service provides a stable endpoint that exposes a set of pods and load-balances traffic across them.
YAML Configuration:
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  selector:
    app: nginx-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 30001
Commands:
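Likely, following the pattern of the other sections (file name assumed):
Apply and verify:
kubectl apply -f service.yaml
kubectl get svc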