The RED Method
Patterns for instrumentation & monitoring.
Tom Wilkie
tom@grafana.com @tom_wilkie
github.com/tomwilkie
The RED Method: How to monitor your microservices.
Introduction
Why does this matter?

The USE Method
Utilisation, Saturation, Errors

The RED Method
Request Rate, Errors, Duration

The Four Golden Signals
RED + Saturation
Introduction
The USE Method
For every resource, monitor:

• Utilisation: % time that the resource was busy

• Saturation: amount of work the resource has to do, often queue length

• Errors: the count of error events
http://www.brendangregg.com/usemethod.html
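As a rough illustration of the three definitions above, the sketch below derives utilisation, saturation and errors from two snapshots of a resource's counters. The Sample and USE types, their field names and ComputeUSE are hypothetical and exist only for this example; they are not taken from any exporter or from the talk.

```go
package main

import "fmt"

// Sample is a hypothetical snapshot of one resource's counters.
// The field names are illustrative and not taken from any exporter.
type Sample struct {
	BusySeconds float64 // cumulative seconds the resource was busy
	QueueLength float64 // work items waiting when the sample was taken
	ErrorCount  float64 // cumulative error events
}

// USE holds the three signals for one sampling interval.
type USE struct {
	Utilisation float64 // fraction of the interval the resource was busy (0..1)
	Saturation  float64 // queue length at the end of the interval
	Errors      float64 // error events during the interval
}

// ComputeUSE derives the signals from two samples taken intervalSeconds apart.
// It returns the zero value if the interval is not positive.
func ComputeUSE(prev, cur Sample, intervalSeconds float64) USE {
	if intervalSeconds <= 0 {
		return USE{}
	}
	return USE{
		Utilisation: (cur.BusySeconds - prev.BusySeconds) / intervalSeconds,
		Saturation:  cur.QueueLength,
		Errors:      cur.ErrorCount - prev.ErrorCount,
	}
}

func main() {
	prev := Sample{BusySeconds: 10, QueueLength: 0, ErrorCount: 2}
	cur := Sample{BusySeconds: 16, QueueLength: 3, ErrorCount: 5}
	u := ComputeUSE(prev, cur, 10)
	fmt.Printf("utilisation=%.0f%% saturation=%.0f errors=%.0f\n",
		u.Utilisation*100, u.Saturation, u.Errors)
}
```

Utilisation and errors come from counter deltas, while saturation is read as a point-in-time gauge; that split is an assumption of this sketch and matches how the PromQL examples in the deck use rate() for counters and plain values for gauges such as load average.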
            Utilisation  Saturation  Errors
CPU         ✔            ✔           ✔
Memory      ✔            ✔           ✔
Disk        ✔            ✔           ✔
Network     ✔            ✔           ✖
CPU USE in Prometheus
CPU Utilisation:
1 - avg(rate(node_cpu{job="default/node-exporter",mode="idle"}[1m]))
CPU Saturation:
sum(node_load1{job="default/node-exporter"})
/
sum(node:node_num_cpu:sum)
Memory USE in Prometheus
Memory Utilisation:
1 - sum(
  node_memory_MemFree{job="…"} +
  node_memory_Cached{job="…"} +
  node_memory_Buffers{job="…"}
)
/ sum(node_memory_MemTotal{job="…"})
Memory Saturation:
1e3 * sum(
  rate(node_vmstat_pgpgin{job="…"}[1m]) +
  rate(node_vmstat_pgpgout{job="…"}[1m])
)
Interesting / Hard Cases
• CPU Errors, Memory Errors

• Hard Disk Errors!

• Disk Capacity vs Disk IO

• Network Utilisation

• Interconnects
Demo Time
More Details
• “The USE Method” - Brendan Gregg

• Kubernetes Mixin - https://github.com/grafana/jsonnet-libs
The RED Method
For every service, monitor request:

• Rate - number of requests per second

• Errors - the number of those requests that are failing

• Duration - the amount of time those requests take
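To make the three definitions concrete, here is a minimal sketch that summarises a window of request records into rate, errors and duration. The Request and RED types and the Summarise function are hypothetical names for this example only. Duration is reported as a quantile using the nearest-rank method, which is one reasonable choice and not something the talk prescribes.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Request is a hypothetical record of one served request.
type Request struct {
	DurationSeconds float64
	Failed          bool
}

// RED summarises the requests seen in one window.
type RED struct {
	Rate     float64 // requests per second
	Errors   float64 // failed requests per second
	Duration float64 // q-th quantile of request duration, in seconds
}

// Summarise computes RED over reqs observed during windowSeconds.
// The duration quantile uses the nearest-rank method on sorted durations.
// It returns the zero value for empty input or out-of-range arguments.
func Summarise(reqs []Request, windowSeconds, q float64) RED {
	if len(reqs) == 0 || windowSeconds <= 0 || q <= 0 || q > 1 {
		return RED{}
	}
	durations := make([]float64, 0, len(reqs))
	failed := 0
	for _, r := range reqs {
		durations = append(durations, r.DurationSeconds)
		if r.Failed {
			failed++
		}
	}
	sort.Float64s(durations)
	// Nearest rank: the smallest value with at least q of the data at or below it.
	idx := int(math.Ceil(q*float64(len(durations)))) - 1
	return RED{
		Rate:     float64(len(reqs)) / windowSeconds,
		Errors:   float64(failed) / windowSeconds,
		Duration: durations[idx],
	}
}

func main() {
	reqs := []Request{
		{DurationSeconds: 0.3},
		{DurationSeconds: 0.1},
		{DurationSeconds: 0.4, Failed: true},
		{DurationSeconds: 0.2},
	}
	red := Summarise(reqs, 2, 0.5)
	fmt.Printf("rate=%.1f/s errors=%.1f/s p50=%.1fs\n", red.Rate, red.Errors, red.Duration)
}
```

In a real service these numbers would normally come from a metrics library and be aggregated by the monitoring system; keeping raw request records in memory as done here is only practical for illustration.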
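Prometheus Implementation
import (
	"log"
	"net/http"
	"strconv"

	"github.com/felixge/httpsnoop"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// requestDuration records one observation per request. Its _count series
// gives the request and error rates; its buckets give the duration.
var requestDuration = prometheus.NewHistogramVec(prometheus.HistogramOpts{
	Name:    "request_duration_seconds",
	Help:    "Time (in seconds) spent serving HTTP requests.",
	Buckets: prometheus.DefBuckets,
}, []string{"method", "route", "status_code"})

func init() {
	prometheus.MustRegister(requestDuration)
}

// wrap times the wrapped handler and records its status code and duration.
func wrap(h http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		m := httpsnoop.CaptureMetrics(h, w, r)
		// The raw URL path is used as the route label here. If paths
		// contain IDs, a route template keeps label cardinality low.
		requestDuration.WithLabelValues(r.Method, r.URL.Path,
			strconv.Itoa(m.Code)).Observe(m.Duration.Seconds())
	})
}

func server(addr string) {
	// promhttp.Handler replaces the deprecated prometheus.Handler.
	http.Handle("/metrics", promhttp.Handler())
	http.Handle("/greeter", wrap(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// ...
	})))
	log.Fatal(http.ListenAndServe(addr, nil))
}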
Easy to query
Rate:
sum(rate(request_duration_seconds_count{job="…"}[1m]))
Errors:
sum(rate(request_duration_seconds_count{job="…",
  status_code!~"2.."}[1m]))
Duration:
histogram_quantile(0.99,
  sum(rate(request_duration_seconds_bucket{job="…"}[1m])) by (le))
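The Duration query relies on estimating a quantile from cumulative histogram buckets. The sketch below shows the general idea using linear interpolation inside the bucket that contains the target rank. It is a simplified illustration under stated assumptions (buckets sorted by upper bound, cumulative counts, lowest bucket starting at 0), not the Prometheus source; the Bucket type and Quantile function are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Bucket mirrors one "le" series of a histogram: the cumulative count of
// observations less than or equal to UpperBound.
type Bucket struct {
	UpperBound float64
	Count      float64
}

// Quantile estimates the q-th quantile from cumulative buckets sorted by
// ascending UpperBound. It assumes the lowest bucket starts at 0 and
// returns NaN for empty input, no observations, or q outside [0, 1].
func Quantile(q float64, buckets []Bucket) float64 {
	if len(buckets) == 0 || q < 0 || q > 1 {
		return math.NaN()
	}
	total := buckets[len(buckets)-1].Count
	if total == 0 {
		return math.NaN()
	}
	rank := q * total
	// Find the first bucket whose cumulative count reaches the target rank.
	i := sort.Search(len(buckets), func(i int) bool { return buckets[i].Count >= rank })
	if math.IsInf(buckets[i].UpperBound, 1) {
		// The quantile falls in the +Inf bucket: report the highest finite bound.
		if i == 0 {
			return math.NaN()
		}
		return buckets[i-1].UpperBound
	}
	lower, below := 0.0, 0.0
	if i > 0 {
		lower, below = buckets[i-1].UpperBound, buckets[i-1].Count
	}
	inBucket := buckets[i].Count - below
	if inBucket == 0 {
		return lower
	}
	// Assume observations are spread evenly within the bucket.
	return lower + (buckets[i].UpperBound-lower)*(rank-below)/inBucket
}

func main() {
	buckets := []Bucket{
		{UpperBound: 0.1, Count: 50},
		{UpperBound: 0.5, Count: 90},
		{UpperBound: 1, Count: 100},
		{UpperBound: math.Inf(1), Count: 100},
	}
	fmt.Printf("p99 is roughly %.2fs\n", Quantile(0.99, buckets))
}
```

Because the estimate interpolates within a bucket, its accuracy depends on how the bucket boundaries are chosen relative to the latencies of interest.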
Demo Time
DAG of Services
Latencies & Averages
More Details
• “Monitoring Microservices” - Weaveworks (slides)

• “The RED Method: key metrics for microservices architecture” - Weaveworks

• “Monitoring and Observability with USE and RED” - VividCortex

• “RED Method for Prometheus – 3 Key Metrics for Monitoring” - Rancher Labs

• “Logs and Metrics” - Cindy Sridharan

• “Logging v. instrumentation”, “Go best practices, six years in” - Peter Bourgon
The Four Golden Signals
For each service, monitor:

• Latency - time taken to serve a request
• Traffic - how much demand is placed on your system
• Errors - rate of requests that are failing
• Saturation - how “full” your service is
Demo Time
More Details
• “The Four Golden Signals” - The Google SRE Book

• “How to Monitor the SRE Golden Signals” - Steve Mushero
Summary
Thanks!
Questions?
Tom Wilkie VP Product, Grafana Labs
Previously: Kausal, Weaveworks, Google, Acunu, XenSource
Twitter: @tom_wilkie Email: tom@grafana.com
What is Grafana Cloud?
Grafana Cloud is a hosted and fully managed SaaS metrics
platform that helps Ops and Dev teams using Grafana
to understand the behavior of their applications and
infrastructure.
Grafana Cloud allows users to provision and manage
the best open source observability tools - Grafana and
Prometheus - all through a simple UI and single API.
Grafana Cloud:
Your complete, fully managed, hosted metrics platform.
Store, visualize and alert without the headache of scaling or managing
your own monitoring stack.
