Version 1.0
SRE 101
Introduction to Site Reliability Engineering
Hussain Mansoor
Topics / Agenda
1. Why & What is DevOps?
2. SRE
3. Relation between SRE & DevOps
4. SRE Details
What is DevOps? - AWS
DevOps is the combination of cultural philosophies, practices, and tools that increases
an organization's ability to deliver applications and services at high velocity: evolving
and improving products at a faster pace than organizations using traditional software
development and infrastructure management processes. (Source: AWS)
Why DevOps?
● Align on the mindset and activities that speed up software delivery
● Reduce human errors
● Consistency (because code)
● Reduce manual effort
How to DevOps?
Generally, via DevOps principles:
● Have CI/CD practices
● Shift Left
● Continuous Improvements
● Remove Silos
● Automate
● Shared Responsibilities
● Autonomous Teams
SRE Guiding Principles
● You can’t improve what you can’t measure
○ SLI, SLO, Error Budget
● Embracing Risk
● Eliminate Toil
● Implementation-agnostic monitoring
● Automate
● Simplicity*
Relation between SRE & DevOps
Agile Manifesto: Scrum, Kanban, Lean, XP
DevOps: SRE, Systems Engineer, Platform Engineer, Automation Engineer, Cloud x Engineer
SRE vs DevOps
● Non-competing
● "class SRE implements interface DevOps" (https://goo.gl/CKv3tV)
● SRE is part of the whole DevOps umbrella
○ SRE defines the practices which DevOps suggests
○ And more
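The "class SRE implements interface DevOps" analogy can be read almost literally. The sketch below is only an illustration: the method names and the mapping of principles to practices are assumptions chosen for this example, not an official definition of either discipline.

```python
from abc import ABC, abstractmethod


class DevOps(ABC):
    """Hypothetical 'interface': principles that DevOps suggests."""

    @abstractmethod
    def measure(self) -> str: ...

    @abstractmethod
    def automate(self) -> str: ...

    @abstractmethod
    def share_responsibility(self) -> str: ...


class SRE(DevOps):
    """One concrete implementation: SRE names a practice for each principle."""

    def measure(self) -> str:
        return "SLIs, SLOs, and error budgets"

    def automate(self) -> str:
        return "eliminate toil through automation"

    def share_responsibility(self) -> str:
        return "incident management and postmortems"

    # "And more": a practice that is not part of the interface itself.
    def embrace_risk(self) -> str:
        return "spend the error budget on releases"


if __name__ == "__main__":
    team = SRE()
    for principle in ("measure", "automate", "share_responsibility"):
        print(f"{principle}: {getattr(team, principle)()}")
```

The abstract class cannot be instantiated on its own, which mirrors the point of the slide: DevOps describes what should happen, and SRE is one way of actually doing it.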
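SRE Details
SLI (Service Level Indicator): a measured signal of service health, e.g. availability, throughput, error rate
SLO (Service Level Objective): the target value for an SLI, e.g. 99% availability
Error Budget: the amount of error that your service can accumulate over a
certain period of time, i.e. 100% minus the SLO (1% for a 99% SLO).
It is the tolerance of user happiness.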
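The relationship between SLI, SLO, and error budget can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a request-based availability SLI and a 30-day window; the `Slo` class, the function names, and the sample numbers are hypothetical and only serve the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    target: float      # e.g. 0.99 for "99% availability"
    window_days: int   # length of the evaluation window (assumed: 30)


def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: fraction of requests served successfully."""
    if total_requests == 0:
        return 1.0  # no traffic, so nothing failed
    return good_requests / total_requests


def error_budget_minutes(slo: Slo) -> float:
    """Time-based budget: minutes of full downtime the SLO tolerates."""
    window_minutes = slo.window_days * 24 * 60
    return (1.0 - slo.target) * window_minutes


def budget_remaining(slo: Slo, good_requests: int, total_requests: int) -> float:
    """Fraction of the request-based budget still unspent (negative if overspent)."""
    allowed_failures = (1.0 - slo.target) * total_requests
    actual_failures = total_requests - good_requests
    if allowed_failures == 0:
        return 1.0 if actual_failures == 0 else 0.0
    return 1.0 - actual_failures / allowed_failures


if __name__ == "__main__":
    slo = Slo(target=0.99, window_days=30)
    print(f"SLI: {availability_sli(99_500, 100_000):.4f}")
    print(f"Budget: about {error_budget_minutes(slo):.0f} minutes per window")
    print(f"Remaining: {budget_remaining(slo, 99_500, 100_000):.0%}")
```

With a 99% target over 30 days, the budget works out to roughly 432 minutes (7.2 hours) of downtime. In the sample data, 500 failed requests out of 100,000 use up about half of the 1,000 failures the SLO allows.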
SRE Practices
● Remove toil
● Define criticalities (system, downtime, unavailability)
● System design (DR, multi- or poly-cloud, multi-region deployments)
● Observability
● Chaos engineering
● SREs are the ONLY people who can touch the production environment
● Track MTTR (mean time to recovery) and MTBF (mean time between failures)
● Incident management
● Postmortems
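MTTR and MTBF are simple averages over an incident log. The sketch below assumes each incident is recorded as a (start, end) pair of timestamps inside a known observation window; the function names and the sample incidents are made up for illustration.

```python
from datetime import datetime, timedelta

Incident = tuple[datetime, datetime]  # (outage start, outage end)


def total_downtime(incidents: list[Incident]) -> timedelta:
    return sum((end - start for start, end in incidents), timedelta())


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time to recovery: average outage duration."""
    return total_downtime(incidents) / len(incidents)


def mtbf(incidents: list[Incident], window_start: datetime,
         window_end: datetime) -> timedelta:
    """Mean time between failures: total uptime divided by number of failures."""
    uptime = (window_end - window_start) - total_downtime(incidents)
    return uptime / len(incidents)


def availability(mtbf_value: timedelta, mttr_value: timedelta) -> float:
    """Steady-state availability implied by the two averages."""
    return mtbf_value / (mtbf_value + mttr_value)


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    end = start + timedelta(days=30)
    log = [
        (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 11, 0)),   # 1 hour
        (datetime(2024, 1, 20, 2, 0), datetime(2024, 1, 20, 5, 0)),   # 3 hours
    ]
    print("MTTR:", mttr(log))
    print("MTBF:", mtbf(log, start, end))
    print(f"Availability: {availability(mtbf(log, start, end), mttr(log)):.4f}")
```

Both functions assume at least one incident in the log; an empty log would need separate handling, since the averages are undefined in that case.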
Editor's Notes

  • #7: We define toil as mundane, repetitive operational work providing no enduring value, which scales linearly with service growth. A complex system that works necessarily evolved from a simple system that works. The "Simplicity" chapter of the SRE book goes into this topic in detail.