Building Site Reliability
Engineering:
A Crash Course
Amin Astaneh, Acquia Inc.
Who am I?
● Senior Manager, SRE at Acquia
● Was on the Operations Team from Dec
2010 to Nov 2015
● Built and led the Site Reliability
Engineering Team
Agenda
● What is SRE?
● Why Do SRE?
● Acquia, Pre-SRE
● How Acquia Does SRE
● Building an SRE Competency
● How to Hire SREs?
● 1-Year Retrospective
What is SRE?
“What happens when a software engineer is tasked with what used to be called
operations.”
- Ben Treynor, Google
What is SRE?
SRE takes the manual processes associated with Operations and replaces them with automation using software engineering.
What is SRE?
SREs also use a set of methodologies and best practices that help engineering
teams create a mature and sustainable process for service ownership.
How Does This Relate to DevOps?
DevOps is a set of values, tools, and processes that allow teams to best deliver
value to the customer.
Therefore, SRE can be considered a specific implementation of DevOps.
SRE Practices
(according to Google)
1) Hire only coders.
2) Have SLO(s) for your service.
What are SLOs?
● SLIs: Service Level Indicators (What to Measure)
● SLOs: Service Level Objectives (Targets for Measurements)
● SLAs: Service Level Agreements (Consequences for Missing Targets)
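As a rough illustration of how these terms relate, the sketch below computes an availability SLI from request counts and compares it with an SLO target. The function names and the numbers are made up for the example; they are not from the talk.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: the measured fraction of requests that were served successfully."""
    if total_requests == 0:
        return 1.0  # assumption: a window with no traffic counts as fully available
    return good_requests / total_requests


def meets_slo(sli: float, slo_target: float) -> bool:
    """SLO: the target that the measured SLI is compared against."""
    return sli >= slo_target


# Hypothetical numbers for one reporting window.
sli = availability_sli(good_requests=999_500, total_requests=1_000_000)
print(f"SLI = {sli:.4%}, meets 99.9% SLO: {meets_slo(sli, 0.999)}")
```

An SLA would sit on top of this: a contract that states what happens when the target is missed.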
3) Measure and report performance
against the SLO(s).
4) Use Error Budgets and gate launches
on them.
5) Have a common staffing pool for SRE
and developers.
6) Cap SRE operational load at 50%.
7) Have excess Ops work overflow to the
Dev Team.
8) Share 5% of Ops work with the Dev
Team.
9) On-call teams should have at least
eight people at one location, or six people
at each of multiple locations.
10) Aim for a maximum of two events per
on-call shift.
11) Do a postmortem for every event.
12) Postmortems are blameless and
focus on process and technology, not
people.
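Practices 9 and 10 are simple enough to check mechanically. The sketch below is one possible way to do that; the function names, site names, and sample rotation are hypothetical.

```python
def staffing_ok(people_per_location: dict[str, int]) -> bool:
    """Practice 9: at least 8 people at a single site, or at least 6 at each of several."""
    if not people_per_location:
        return False
    if len(people_per_location) == 1:
        return next(iter(people_per_location.values())) >= 8
    return all(count >= 6 for count in people_per_location.values())


def overloaded_shifts(events_per_shift: dict[str, int], max_events: int = 2) -> list[str]:
    """Practice 10: return the shifts that exceeded the event target."""
    return [shift for shift, events in events_per_shift.items() if events > max_events]


# Hypothetical rotation data.
print(staffing_ok({"site-a": 6, "site-b": 5}))
print(overloaded_shifts({"mon": 1, "tue": 4, "wed": 2}))
```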
Why Do SRE?
Scale
Improve Employees’ Quality of Life
REDUCE COST
Acquia, Pre-SRE
A Crash Course in Building Site Reliability
Things We Tried First
● Implemented Kanban for Ops to make work visible and maximize throughput
● Did ‘Tier 2 Sprints’ to build automation for the team
● Generated team metrics to influence decision-making
“People Metrics: How to Use Team Data to Produce Positive Change”
https://events.drupal.org/dublin2016/sessions/people-metrics
How Acquia Does SRE
Acquia SRE was commissioned as the driving force of our DevOps Initiative,
which has the following core values:
● Eliminate Toil
● No Capes
● Deliver With Empathy
● Own Your Service
● Own Your Business
● Own Customer Success
Acquia SRE vs Google SRE
● We embed engineers on teams, rather than build teams that run services on
behalf of engineers
● The entire engineering team (plus the SRE) is expected to ‘own their service’,
with the SRE providing leadership on how to best handle those
responsibilities
● The SRE identifies risk as part of their day-to-day and brings improvement
opportunities directly to the Product Manager for prioritization
Acquia SRE vs Google SRE
● We evaluate with Engineering and Product what the most critical projects are
on a quarterly basis, and allocate the team to best meet the present need
● We still reserve the right to remove engineers if an engagement becomes
untenable, though it has not yet been necessary
● We have a heavy focus on time tracking to aid in toil reduction
8) Share 5% of Ops work with the Dev
Team.
8) Ops work IS the responsibility of the
Dev Team.
Building an SRE Competency
Get Management Buy-In
SRE Won’t Work Without Two Things
● Authority to stop releases when the error budget has been
exhausted
● Authority to overflow operational work to the dev team
when operational load > 50%
This authority must be granted by whoever leads the engineering and product efforts.
DO NOT CONTINUE UNLESS YOU HAVE THESE!
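The first authority can be expressed as a release gate. A minimal sketch follows, assuming a request-based availability SLO; `error_budget_remaining` and `release_allowed` are illustrative names, not part of any Acquia tooling.

```python
def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent (0 or below means exhausted)."""
    # Rounded to whole requests so the boundary case is not skewed by float error.
    allowed_failures = round((1.0 - slo_target) * total)
    if failed == 0:
        return 1.0
    if allowed_failures == 0:
        return float("-inf")  # failures occurred but the SLO allows none
    return 1.0 - failed / allowed_failures


def release_allowed(slo_target: float, total: int, failed: int) -> bool:
    """Gate launches: block releases once the budget is exhausted."""
    return error_budget_remaining(slo_target, total, failed) > 0


# Hypothetical window: 99.9% SLO, one million requests.
for failed in (400, 1000, 1200):
    print(failed, release_allowed(0.999, 1_000_000, failed))
```

A deploy pipeline could call `release_allowed` before each launch and refuse to proceed when it returns `False`.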
How Do You Get Buy-In?
Establish a Sense of Urgency!
https://events.drupal.org/baltimore2017/sessions/%C2%A1viva-la-revoluci%C3%B3n-how-start-devops-transformation-your-workplace
Automatically Measure Toil
SRE Operational Load Dashboard
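One way to measure toil automatically is to derive it from time-tracking entries. The sketch below assumes each entry carries an engineer, a category, and hours; the `TimeEntry` shape and the category names are hypothetical, not the schema behind the dashboard shown in the talk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TimeEntry:
    engineer: str
    category: str  # assumption: either "toil" or "project"
    hours: float


def operational_load(entries: list[TimeEntry]) -> dict[str, float]:
    """Return each engineer's toil hours as a fraction of all tracked hours."""
    toil: dict[str, float] = defaultdict(float)
    total: dict[str, float] = defaultdict(float)
    for entry in entries:
        total[entry.engineer] += entry.hours
        if entry.category == "toil":
            toil[entry.engineer] += entry.hours
    return {name: toil[name] / hours for name, hours in total.items() if hours > 0}


def over_cap(load: dict[str, float], cap: float = 0.5) -> list[str]:
    """Engineers whose operational load exceeds the cap, sorted by name."""
    return sorted(name for name, fraction in load.items() if fraction > cap)


# Hypothetical week of tracked time.
entries = [
    TimeEntry("ana", "toil", 26), TimeEntry("ana", "project", 14),
    TimeEntry("raj", "toil", 10), TimeEntry("raj", "project", 30),
]
load = operational_load(entries)
print(load, over_cap(load))
```

Anyone returned by `over_cap` is above the 50% cap, which is the signal to overflow operational work to the dev team.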
Operational Responsibility Assessment
● Based on the Capability Maturity Model (https://en.wikipedia.org/wiki/Capability_Maturity_Model)
● Evaluates the following responsibilities:
○ Routine Tasks
○ Emergency Response
○ Monitoring and Metrics
○ Capacity Planning
○ Change Management
○ New Product Introduction and Removal
○ Service Deploy and Decommissioning
○ Performance and Efficiency
○ Information Security
Operational Responsibility Assessment
Each responsibility is scored from 1-5:
1. Initial: Chaotic. Processes are undocumented, ad hoc, and require individual heroics.
2. Repeatable: Processes are documented sufficiently that they can be repeated with the same
results.
3. Defined: Roles and responsibilities for the process are defined and
confirmed.
4. Managed: The process is quantitatively managed in accordance with agreed-upon
metrics.
5. Optimizing: Process management includes deliberate process optimization and improvement.
Operational Responsibility Assessment
● Assess your services often! (we suggest quarterly)
● Take findings/risks and create tasks for improvement
● Publish your results and share them with your organization
● Do not tie Operational Responsibility Assessment (ORA) results to KPIs, incentives, etc.
READ APPENDIX A!
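An assessment like this fits in a small data structure. The sketch below stores one score per responsibility, validates the 1-5 range, and lists the lowest-scoring areas as candidates for improvement tasks. The scores shown are invented for illustration.

```python
LEVELS = {1: "Initial", 2: "Repeatable", 3: "Defined", 4: "Managed", 5: "Optimizing"}

RESPONSIBILITIES = [
    "Routine Tasks",
    "Emergency Response",
    "Monitoring and Metrics",
    "Capacity Planning",
    "Change Management",
    "New Product Introduction and Removal",
    "Service Deploy and Decommissioning",
    "Performance and Efficiency",
    "Information Security",
]


def validate(scores: dict[str, int]) -> None:
    """Raise ValueError unless every responsibility has a score from 1 to 5."""
    missing = [r for r in RESPONSIBILITIES if r not in scores]
    if missing:
        raise ValueError(f"unscored responsibilities: {missing}")
    for name, score in scores.items():
        if score not in LEVELS:
            raise ValueError(f"{name}: score must be 1-5, got {score}")


def weakest(scores: dict[str, int], count: int = 3) -> list[tuple[str, str]]:
    """Return the lowest-scoring responsibilities with their level names."""
    validate(scores)
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [(name, LEVELS[score]) for name, score in ranked[:count]]


# Hypothetical assessment: everything "Defined" except two weaker areas.
scores = dict.fromkeys(RESPONSIBILITIES, 3)
scores["Capacity Planning"] = 1
scores["Information Security"] = 2
print(weakest(scores, count=2))
```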
Blameless Post Mortems
● Document timeline of the incident
● With the team, determine:
○ What went well
○ What didn’t go well (process failures, technical root cause)
○ What was lucky (or circumstantial)
● For each thing that didn’t go well or was circumstantial:
○ File an action item to address it
○ Make sure each item has clear acceptance criteria/requirements (grooming)
○ Make sure each item has a clear level of effort (sizing)
○ Prioritize in the backlog based on relative risk
● Openly share the post-mortem with the rest of the company
● Review with the team periodically
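The action-item steps above can be modeled directly. In this sketch an item is only ready for the backlog once it has acceptance criteria and a size, and the backlog is ordered by relative risk. The `ActionItem` fields and the 1-5 risk scale are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionItem:
    title: str
    acceptance_criteria: str = ""         # filled in during grooming
    effort_points: Optional[int] = None   # filled in during sizing
    risk: int = 1                         # assumption: 1 (low) to 5 (high)

    def ready(self) -> bool:
        """An item is ready once it has been groomed and sized."""
        return bool(self.acceptance_criteria.strip()) and self.effort_points is not None


def prioritize(items: list[ActionItem]) -> list[ActionItem]:
    """Order ready items by relative risk, highest first."""
    return sorted((i for i in items if i.ready()), key=lambda i: i.risk, reverse=True)


# Hypothetical items filed after an incident.
items = [
    ActionItem("Add disk-usage alert", "Alert fires at 80% full", 2, risk=3),
    ActionItem("Document failover runbook", "Runbook reviewed by on-call", 3, risk=5),
    ActionItem("Investigate slow queries"),  # not yet groomed or sized
]
print([item.title for item in prioritize(items)])
```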
Launch Readiness Criteria
What is Launch Readiness Criteria?
● A set of guidelines that represent the minimum standard of what a new
product launch requires from an operational standpoint
● Expressed in terms of the Operational Responsibility Assessment
● Intended to address the major forms of risk without introducing needless
roadblocks into the product launch process
● A living document that is continuously maintained and kept relevant
● Inspired by: https://landing.google.com/sre/book/chapters/reliable-product-launches.html
Example LRC Checklist Items
LRC Enablement
Example Service Pages
Example Service Dashboard
Example Code
Example Operational Runbooks
Example Post Mortem/RCA Template
Create an Onboarding Process
● Implement an Incident Response Process
○ On-Call Rotation
○ Documentation for stakeholders on how to get help
○ Fundamentals: production access credentials, runbooks
● Perform/Publish an Operational Responsibility Assessment
● Define/Publish Service Level Objectives
● Create Monitoring/Alerting against SLOs
● Create Dashboards For SLO performance and remaining error budget
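For a time-based availability SLO, the "remaining error budget" figure on such a dashboard is plain arithmetic. The sketch below assumes a 30-day window; the function names are illustrative.

```python
def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """Total downtime the SLO permits over the window."""
    return (1.0 - slo_target) * window_days * 24 * 60


def remaining_budget_minutes(slo_target: float, downtime_minutes: float,
                             window_days: int = 30) -> float:
    """Budget left after the downtime observed so far (negative if overspent)."""
    return allowed_downtime_minutes(slo_target, window_days) - downtime_minutes


# Hypothetical service: 99.9% target, 30 minutes of downtime so far this window.
print(round(allowed_downtime_minutes(0.999), 1))
print(round(remaining_budget_minutes(0.999, downtime_minutes=30), 1))
```

A 99.9% target over 30 days allows about 43.2 minutes of downtime, so 30 minutes of observed downtime leaves about 13.2 minutes.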
Weekly Office Hours
How To Hire SREs?
Hire Software Developers
Hire Operations People
What Makes a Good SRE?
● It’s complicated
● You want someone with the ability to contribute to a software engineering
project...
● ...who is also motivated by operational concerns and understands the subject matter
(Linux, TCP/IP, monitoring, performance, config management...)
● Is willing to be on-call
● Knows agile practices well enough to use them to suggest improvements
● ‘SRE Temperament’: can communicate their opinions in a way
that is persuasive and data-driven
Selling Points for Prospective SREs
● Toil is capped at 50%, which means at least 50% project work at all times!
● Authority to stop the flow of releases when the service is too unreliable
● There is on-call, but responsibility is shared with the whole team
● Root causes of outages are tracked, prioritized, and addressed
These Create A Work Environment That Respects The SRE
1-Year Retrospective
What Went Well
● Launch Readiness Criteria is now a corporate standard
● Teams are independently performing their own blameless post mortems
● Teams are independently performing their own ORAs
● SRE influenced a grassroots reorg of Cloud Engineering around service-oriented architecture (SOA)
● More and more teams are taking an active role in on-call responsibilities
● Weekly Office Hours has been an effective tool for sharing ideas
What Didn’t Go Well
● We struggled with getting SLOs and error budgets established for all services
● We didn’t get Launch Readiness out the door fast enough for new services
Current Improvements
● SRE engagements now require the onboarding process to be completed before any other
work can take place:
○ Establish an Incident Response Process
○ Perform an Operational Responsibility Assessment
○ Define Service Level Objectives
○ Establish Monitoring and Alerting Against SLOs
○ Create Dashboards Displaying SLOs and Error Budgets
● Operational Stories must be prioritized in proportion to the SRE
presence on an engineering team.
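One possible reading of the proportionality rule: if SREs make up a given share of the team, at least that share of each sprint's capacity goes to operational stories. The sketch below encodes that reading; it is an interpretation for illustration, not Acquia's documented formula.

```python
import math


def operational_points(sre_count: int, team_size: int, sprint_capacity: int) -> int:
    """Minimum story points to reserve for operational stories in a sprint.

    Assumption: team_size counts the embedded SREs as team members.
    """
    if team_size <= 0 or not 0 <= sre_count <= team_size:
        raise ValueError("sre_count must be between 0 and team_size")
    return math.ceil(sprint_capacity * sre_count / team_size)


# Hypothetical team: one SRE among six engineers, 30 points per sprint.
print(operational_points(sre_count=1, team_size=6, sprint_capacity=30))
```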
“When we were in Ops, it was simple, because our purpose was to simply address the incident.
Our purpose now is to address the problems of the business.
We are the vehicle of change. That’s hard work, but we can do it.”
Questions?
Amin Astaneh
T: @aastaneh
M: amin.astaneh@acquia.com
ADDO_2020-Driving-Digital-Transformation-through-CloudOps-and-SRE.pdfADDO_2020-Driving-Digital-Transformation-through-CloudOps-and-SRE.pdf
ADDO_2020-Driving-Digital-Transformation-through-CloudOps-and-SRE.pdf
Phil Johnson
 
Keys to Successful Cohabitation: Governance and Autonomous Teams
Keys to Successful Cohabitation: Governance and Autonomous TeamsKeys to Successful Cohabitation: Governance and Autonomous Teams
Keys to Successful Cohabitation: Governance and Autonomous Teams
DevOps.com
 
Improving software quality for the future of connected vehicles
Improving software quality for the future of connected vehiclesImproving software quality for the future of connected vehicles
Improving software quality for the future of connected vehicles
Devon Bleibtrey
 
DevOps Primer : Presented by Uday Kumar
DevOps Primer : Presented by Uday KumarDevOps Primer : Presented by Uday Kumar
DevOps Primer : Presented by Uday Kumar
oGuild .
 
Ad

A Crash Course in Building Site Reliability

  • 1. Building Site Reliability Engineering: A Crash Course Amin Astaneh, Acquia Inc.
  • 2. Who am I? ● Senior Manager, SRE at Acquia ● Was in Operations Team from Dec 2010 - Nov 2015 ● Built and led the Site Reliability Engineering Team
  • 3. Agenda ● What is SRE? ● Why Do SRE? ● Acquia, Pre-SRE ● How Acquia Does SRE ● Building an SRE Competency ● How to Hire SREs? ● 1-Year Retrospective
  • 5. What is SRE? “What happens when a software engineer is tasked with what used to be called operations.” - Ben Treynor, Google
  • 6. What is SRE? SRE takes the manual processes associated with Operations..
  • 7. What is SRE? ..and replaces them with automation using software engineering.
  • 8. What is SRE? They also use a set of methodologies and best practices that help engineering teams create a mature and sustainable process for service ownership.
  • 9. How Does This Relate to DevOps? DevOps is a set of values, tools, and processes that allow teams to best deliver value to the customer. Therefore, SRE can be considered a specific implementation of DevOps.
  • 12. 2) Have SLO(s) for your service.
  • 13. What are SLOs? ● SLI: Service Level Indicators (What to Measure) ● SLOs: Service Level Objectives (Targets for Measurements) ● SLAs: Service Level Agreements (Consequences for Missing Targets)
  • 14. 3) Measure and report performance against the SLO(s).
  • 15. 4) Use Error Budgets and gate launches on them.
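A minimal sketch of how an error budget could gate launches, assuming the SLI is a simple request success ratio. The `SloWindow` class, its field names, and the example numbers are hypothetical illustrations, not something taken from the talk or from Google's tooling.

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    """Request counts for one SLO measurement window (hypothetical shape)."""

    slo_target: float      # e.g. 0.999 means 99.9% of requests should succeed
    total_requests: int    # requests observed in the window
    failed_requests: int   # requests that violated the SLI

    def error_budget(self) -> float:
        """Number of failed requests the SLO tolerates in this window."""
        return (1.0 - self.slo_target) * self.total_requests

    def budget_remaining(self) -> float:
        """Fraction of the budget still unspent (negative when overspent)."""
        budget = self.error_budget()
        if budget <= 0:
            # A 100% target (or an empty window) leaves no budget to spend.
            return 0.0
        return 1.0 - self.failed_requests / budget

    def launch_allowed(self) -> bool:
        """Gate: allow launches only while some error budget remains."""
        return self.budget_remaining() > 0.0


healthy = SloWindow(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
exhausted = SloWindow(slo_target=0.999, total_requests=1_000_000, failed_requests=1_500)
print(healthy.launch_allowed(), exhausted.launch_allowed())
```

With a 99.9% target over one million requests, roughly 1,000 failures are tolerated; the first window has spent about 40% of that budget, while the second has overspent it and would block launches under this rule.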
  • 16. 5) Have a common staffing pool for SRE and developers.
  • 17. 6) Cap SRE operational load at 50%.
  • 18. 7) Have excess Ops work overflow to the Dev Team.
  • 19. 8) Share 5% of Ops work with the Dev Team.
  • 20. 9) Oncall teams should have at least eight people at one location, or 6 people at each of multiple locations.
  • 21. 10) Aim for a maximum of two events per oncall shift.
  • 22. 11) Do a postmortem for every event.
  • 23. 12) Postmortems are blameless and focus on process and technology, not people.
  • 25. Scale
  • 33. Things We Tried First ● Implemented Kanban for Ops to make work visible and maximize throughput ● Did ‘Tier 2 Sprints’ to build automation for the team ● Generated team metrics to influence decision-making “People Metrics: How to Use Team Data to Produce Positive Change” https://events.drupal.org/dublin2016/sessions/people-metrics
  • 35. How Acquia Does SRE Acquia SRE was commissioned as the driving force of our DevOps Initiative, which has the following core values: ● Eliminate Toil ● No Capes ● Deliver With Empathy ● Own Your Service ● Own Your Business ● Own Customer Success
  • 36. Acquia SRE vs Google SRE ● We embed engineers on teams, rather than build teams that run services on behalf of engineers ● The entire engineering team (plus the SRE) is expected to ‘own their service’, with the SRE providing leadership on how to best handle those responsibilities ● The SRE identifies risk as part of their day-to-day and brings improvement opportunities directly to the Product Manager for prioritization
  • 37. Acquia SRE vs Google SRE ● We evaluate with Engineering and Product what the most critical projects are on a quarterly basis, and allocate the team to best meet the present need ● We still reserve the right to remove engineers if an engagement becomes untenable, though it has not yet been necessary ● We have a heavy focus on time tracking to aid in toil reduction
  • 38. 8) Share 5% of Ops work with the Dev Team.
  • 39. 8) Share 5% of Ops work with the Dev Team.
  • 40. 8) Ops work IS the responsibility of the Dev Team.
  • 45. Building an SRE Competency
  • 47. SRE Won’t Work Without Two Things ● Authority to stop releases when the error budget has been exhausted ● Authority to overflow operational work to the dev team when operational load > 50% This authority must be granted by the leaders of the engineering/product efforts. DO NOT CONTINUE UNLESS YOU HAVE THESE!
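The 50% operational-load rule can be checked from time-tracking data. The sketch below assumes hours are logged per category and that some categories count as operational work; the category names, the function names, and the sample hours are hypothetical.

```python
def operational_load(hours_by_category: dict[str, float],
                     ops_categories: set[str]) -> float:
    """Return the share of tracked hours spent on operational work."""
    total = sum(hours_by_category.values())
    if total == 0:
        return 0.0  # nothing tracked yet, so no measurable load
    ops = sum(hours for category, hours in hours_by_category.items()
              if category in ops_categories)
    return ops / total


def should_overflow(load: float, cap: float = 0.5) -> bool:
    """True when operational load exceeds the cap and work should overflow to Dev."""
    return load > cap


# Hypothetical week of tracked time for one SRE.
week = {"tickets": 14.0, "oncall": 10.0, "project": 16.0}
load = operational_load(week, ops_categories={"tickets", "oncall"})
print(f"operational load: {load:.0%}, overflow: {should_overflow(load)}")
```

Here 24 of 40 tracked hours are operational, so the load is above the cap and the excess would be handed to the dev team under the rule above.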
  • 48. How Do You Get Buy-In?
  • 49. Establish a Sense of Urgency! https://events.drupal.org/baltimore2017/sessions/%C2%A1viva-la-revoluci%C3%B3n-how-start-devops-transformation-your-workplace
  • 51. SRE Operational Load Dashboard
  • 53. Operational Responsibility Assessment ● Based on the Capability Maturity Model (https://en.wikipedia.org/wiki/Capability_Maturity_Model) ● Evaluates the following responsibilities: ○ Routine Tasks ○ Emergency Response ○ Monitoring and Metrics ○ Capacity Planning ○ Change Management ○ New Product Introduction and Removal ○ Service Deploy and Decommissioning ○ Performance and Efficiency ○ Information Security
  • 54. Operational Responsibility Assessment Each responsibility is scored from 1-5: 1. Initial: Chaotic. Processes are undocumented, ad hoc, and require individual heroics. 2. Repeatable: Documented sufficiently so they can be repeated with the same results. 3. Defined: Roles and responsibilities for the process are defined and confirmed. 4. Managed: The process is quantitatively managed in accordance with agreed-upon metrics. 5. Optimizing: Process management includes deliberate process
  • 56. Operational Responsibility Assessment ● Assess your services often! (we suggest quarterly) ● Take findings/risks and create tasks for improvement ● Publish your results and share them with your organization ● Do not tie ORA results to KPIs, incentives, etc
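One way to turn an Operational Responsibility Assessment into improvement tasks is to record a 1-5 score per responsibility and list the weakest ones first. This is a sketch only: the function name, the threshold of 3, and the sample scores are assumptions for illustration, not part of the assessment as presented.

```python
MATURITY_LEVELS = {
    1: "Initial",
    2: "Repeatable",
    3: "Defined",
    4: "Managed",
    5: "Optimizing",
}


def weakest_responsibilities(scores: dict[str, int],
                             threshold: int = 3) -> list[tuple[str, str]]:
    """Return (responsibility, level name) pairs scoring below the threshold,
    lowest score first, as candidates for improvement tasks."""
    for name, score in scores.items():
        if score not in MATURITY_LEVELS:
            raise ValueError(f"{name}: score must be 1-5, got {score}")
    below = [(name, score) for name, score in scores.items() if score < threshold]
    below.sort(key=lambda item: item[1])  # stable sort keeps input order on ties
    return [(name, MATURITY_LEVELS[score]) for name, score in below]


# Hypothetical quarterly assessment for one service.
quarter = {
    "Routine Tasks": 3,
    "Emergency Response": 2,
    "Monitoring and Metrics": 1,
    "Capacity Planning": 4,
}
for name, level in weakest_responsibilities(quarter):
    print(f"{name}: {level}")
```

Running the same assessment each quarter and comparing the lists gives a simple view of whether the filed tasks are raising maturity over time.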
  • 59. Blameless Post Mortems ● Document timeline of the incident ● With the team, determine: ○ What went well ○ What didn’t go well (process failures, technical root cause) ○ What was lucky (or circumstantial) ● For each thing that didn’t go well or was circumstantial: ○ File an action item to address it ○ Make sure they have clear acceptance criteria/requirements (grooming) ○ Make sure they have a clear level of effort (sizing) ○ Prioritize in the backlog based on relative risk ● Openly share the post-mortem with the rest of the company ● Review with the team periodically
  • 61. What is Launch Readiness Criteria? ● A set of guidelines that represent the minimum standard of what a new product launch requires from an operational standpoint ● Expressed in terms of the Operational Responsibility Assessment ● Intended to address the major forms of risk without introducing needless roadblocks into the product launch process ● A living document that is continuously maintained and kept relevant ● Inspired by: https://landing.google.com/sre/book/chapters/reliable-product-launches.html
  • 70. Create an Onboarding Process ● Implement an Incident Response Process ○ On-Call Rotation ○ Documentation for stakeholders on how to get help ○ Fundamentals: production access credentials, runbooks ● Perform/Publish an Operational Responsibility Assessment ● Define/Publish Service Level Objectives ● Create Monitoring/Alerting against SLOs ● Create Dashboards For SLO performance and remaining error budget
  • 72. How To Hire SREs?
  • 77. What Makes a Good SRE? ● It’s complicated ● You want someone with the ability to contribute to a software engineering project.. ● Yet is motivated by operational concerns and understands the subject matter (Linux, TCP/IP, monitoring, performance, config management..) ● Is willing to be on-call ● Knowledge of agile practices as a method to suggest improvements ● ‘SRE Temperament’: can communicate their opinions on something in a way that is persuasive and data-driven
  • 78. Selling Points for Prospective SREs ● Toil is capped at 50%, which means 50%+ project work at all times! ● Authority to stop the flow of releases when the service is too unreliable ● There is on-call, but responsibility is shared with the whole team ● Root causes of outages are tracked, prioritized, and addressed These Create A Work Environment That Respects The SRE
  • 81. What Went Well ● Launch Readiness Criteria is now a corporate standard ● Teams are independently performing their own blameless post mortems ● Teams are independently performing their own ORAs ● SRE influenced a grassroots reorg of Cloud Engineering around SOA ● More and more teams are taking an active role in on-call responsibilities ● Weekly Office Hours has been an effective tool for sharing ideas
  • 83. What Didn’t Go Well ● We struggled with getting SLOs and error budgets established for all services ● We didn’t get Launch Readiness out the door fast enough for new services
  • 85. Current Improvements ● SRE engagements now require the onboarding process before any other work can take place: ○ Establish Incident Response Process ○ Perform Operational Responsibility Assessment ○ Define Service Level Objectives ○ Establish Monitoring and Alerting Against SLOs ○ Create Dashboards Displaying SLOs and Error Budgets ● Operational Stories are required to be prioritized in proportion to the SRE presence on an engineering team.
  • 86. “When we were in Ops, it was simple, because our purpose was to simply address the incident. Our purpose now is to address the problems of the business. We are the vehicle of change. That’s hard work, but we can do it.”

Editor's Notes

  • #30: Small, well-trained Ops team separate from the dev team
  • #31: Hockey-Stick growth of customers created hockey-stick growth of operational work. In particular, troubleshooting and fixing broken infrastructure in Acquia’s products.
  • #32: Ops became a constraint in service delivery