Optimizing Observability Spend: Metrics
Eric D. Schabell
Director Evangelism
@ericschabell{@fosstodon.org}
Aug 2023
DataOps Day 2023
chronosphere.io
Observability…
Cloud Native
Observability at Scale
“It’s remarkable how common this situation is,
where an organization is paying more for their
observability data, than they do for their
production infrastructure.”
Data volume
Experiment:
- Hello World app on a 4-node Kubernetes cluster with Tracing,
End User Metrics (EUM), Logs, Metrics (containers / nodes)
- 30 days == +450 GB
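To put the experiment's figure in daily terms, here is a minimal arithmetic sketch. It uses only the 450 GB and 30 days from the slide. The yearly projection assumes the volume stays flat, which is an illustrative assumption and not a claim from the talk.

```python
# Figures from the slide: more than 450 GB collected over 30 days.
TOTAL_GB = 450
DAYS = 30


def daily_volume_gb(total_gb: float, days: int) -> float:
    """Average telemetry volume per day."""
    return total_gb / days


def projected_yearly_gb(total_gb: float, days: int, days_per_year: int = 365) -> float:
    """Linear projection; assumes the daily volume never changes (an assumption)."""
    return daily_volume_gb(total_gb, days) * days_per_year


print(daily_volume_gb(TOTAL_GB, DAYS))      # 15.0
print(projected_yearly_gb(TOTAL_GB, DAYS))  # 5475.0
```

Even a Hello World workload lands at roughly 15 GB per day, which is why the cost question comes up so early.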
“If you have to ask,
you can’t afford it…”
Only if we get better
outcomes…
10 hours on average, per week, trying to triage and understand
incidents - a quarter of a 40-hour work week
Know the cost of
observability
metrics data?
Dedicated FinOps
“By 2023, 80% of organizations
using cloud services will
establish a dedicated FinOps
function to automate policy-driven
observability and
optimization of cloud resources
to maximize value.”
-- Source: IDC 2022
Observability Data Optimization Cycle
Centralized Governance - It Starts Here
Centralized Governance
Give teams ownership and control of their metrics to control
cardinality and growth
Quotas - Allocate portions of the licensed persisted write
capacity amongst teams and services
Priorities - Prioritize which data is impacted if over
capacity
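The quota and priority ideas above can be sketched roughly as follows. This is a toy model, not Chronosphere's implementation or configuration format: the `TeamQuota` class, its fields, the team names, and the "lower number is kept first" convention are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TeamQuota:
    team: str
    share_pct: int  # hypothetical: percent of licensed write capacity
    priority: int   # hypothetical convention: lower number is kept first


def allocate(capacity: int, quotas: list[TeamQuota]) -> dict[str, int]:
    """Quotas: split the licensed write capacity across teams by share."""
    total = sum(q.share_pct for q in quotas)
    return {q.team: capacity * q.share_pct // total for q in quotas}


def admit(capacity: int, demand: dict[str, int], quotas: list[TeamQuota]) -> dict[str, int]:
    """Priorities: when over capacity, admit traffic in priority order
    until the capacity is used up; lower-priority data is impacted first."""
    remaining = capacity
    admitted: dict[str, int] = {}
    for q in sorted(quotas, key=lambda q: q.priority):
        take = min(demand.get(q.team, 0), remaining)
        admitted[q.team] = take
        remaining -= take
    return admitted


quotas = [
    TeamQuota("checkout", 50, 1),
    TeamQuota("search", 30, 2),
    TeamQuota("batch", 20, 3),
]
print(allocate(1000, quotas))
print(admit(1000, {"checkout": 600, "search": 500, "batch": 300}, quotas))
```

In this sketch the two mechanisms are kept separate on purpose: `allocate` answers "how much may each team write", while `admit` answers "whose data is dropped first when the total is exceeded".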
Analyze Data
Analyze
Understand the value of the observability data to identify what is
useful and what is waste
Metrics Traffic Analyzer - Provides a real-time view of incoming
metrics grouped by label, and their relative frequency
Metrics Usage Analyzer - View all metrics in Chronosphere
ranked from least used to most used to understand the value
each metric delivers
Trace Analyzer - Provides a real-time view of incoming traces
grouped by tag and their relative frequency
The Metrics Traffic Analyzer helps
users:
● Understand metrics traffic
patterns and scale
● Break down biggest and
smallest contributors to traffic
scale (by metric name, label,
application, etc)
● Troubleshoot cardinality
spikes
Metrics Traffic Analyzer
Real-time view of incoming metrics
View Live or Pause to investigate specific metrics and their labels
View traffic before it is stored to help make decisions about
traffic shape before you pay for it
Break down traffic by metric name and label
Labels
Metrics
Troubleshoot high-cardinality metrics & labels
Metrics with ‘instance’ label
‘instance’ label is on 100% of metrics and has 62 unique values
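The two numbers in the callout above (the share of metrics carrying a label, and how many unique values it has) can be computed with a few lines of code. This is a minimal sketch over an in-memory sample, not the Metrics Traffic Analyzer itself. The series are represented as Prometheus-style label dictionaries, and the sample data is made up.

```python
from collections import defaultdict


def label_stats(series: list[dict[str, str]]) -> dict[str, tuple[float, int]]:
    """For each label, return (percent of series carrying it, unique value count)."""
    values = defaultdict(set)
    count = defaultdict(int)
    for labels in series:
        for name, value in labels.items():
            if name == "__name__":  # the metric name is not a regular label
                continue
            values[name].add(value)
            count[name] += 1
    total = len(series)
    return {name: (100.0 * count[name] / total, len(values[name])) for name in values}


# Made-up sample traffic for illustration only.
sample = [
    {"__name__": "http_requests_total", "instance": "10.0.0.1:9100", "path": "/a"},
    {"__name__": "http_requests_total", "instance": "10.0.0.2:9100", "path": "/a"},
    {"__name__": "up", "instance": "10.0.0.1:9100"},
]
for name, (pct, unique) in sorted(label_stats(sample).items()):
    print(f"{name}: on {pct:.0f}% of series, {unique} unique values")
```

A label that appears on every series and has many unique values multiplies the series count across the board, which makes it a natural first suspect during a cardinality spike.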
Metrics Usage Analyzer
The Metrics Usage Analyzer
allows users to:
● Understand the value
each metric delivers
● Identify unused and
underutilized metrics
● Know if a metric is being
used, where, and by
whom
● Help make better
shaping decisions
What is and is not valuable?
Default
sort is
Least
Valuable
Click for more
Usage Details
Resolving uncertainty about value
Where is it being
used?
Select 14 or 30
days
How much is it
used?
Keeping it in context
Utility Score + DPPS (data points per second)
Where and how
much it’s used
Scenario - Low Utility Score but High Ingest
Low Utility Score,
but high DPPS.
How is it being
used?
Let’s take a look at
the Usage Details
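The scenario above is essentially a filter over two columns. The sketch below shows that filter on made-up data. The `MetricUsage` class, the score scale, the thresholds, and the metric names are hypothetical placeholders and do not reflect how the product computes its Utility Score.

```python
from dataclasses import dataclass


@dataclass
class MetricUsage:
    name: str
    utility_score: float  # placeholder scale, not the product's real scoring
    dpps: float           # data points per second ingested for this metric


def review_candidates(
    metrics: list[MetricUsage], max_utility: float, min_dpps: float
) -> list[MetricUsage]:
    """Return metrics with low utility but high ingest, most expensive first."""
    hits = [m for m in metrics if m.utility_score <= max_utility and m.dpps >= min_dpps]
    return sorted(hits, key=lambda m: m.dpps, reverse=True)


# Made-up inventory for illustration only.
inventory = [
    MetricUsage("http_requests_total", utility_score=90, dpps=1200),
    MetricUsage("debug_cache_events", utility_score=3, dpps=4500),
    MetricUsage("legacy_queue_depth", utility_score=5, dpps=800),
    MetricUsage("rarely_scraped_gauge", utility_score=2, dpps=4),
]
for m in review_candidates(inventory, max_utility=10, min_dpps=500):
    print(m.name, m.dpps)
```

The output is a review list, not a drop list: as the next slides show, a metric with no dashboard or alert references may still be queried by a few people who rely on it.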
Scenario - No references, but some executions
Not being used in
Dashboard, Alerts,
etc… but two
users. Who are
they?
Underutilized metric discovered!
Two of our top
SREs!
What are they
using it for?
Should others be
using it as well?
Refine - Shape and Transform Data
Refine
After understanding the cost and value of data, we enable
you to take action without touching source code or
redeploying.
We do this by allowing you to aggregate or
downsample data, remove high-cardinality labels, or
drop non-valuable data. This is done in real time at
ingest (streaming), so there is no delay in alerts and
no need to store raw data.
The result is reduced cost and improved performance
without alert or query impact.
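As a rough illustration of removing a high-cardinality label through aggregation, the sketch below drops one label and sums the values of the series that become identical. It is a toy batch function over made-up samples, not Chronosphere's shaping rule syntax or its streaming engine. Summing is only one possible aggregation and suits additive values such as request counts.

```python
from collections import defaultdict

Labels = dict[str, str]
Key = tuple[tuple[str, str], ...]


def drop_label_and_sum(samples: list[tuple[Labels, float]], label: str) -> dict[Key, float]:
    """Remove `label` from every sample and sum values that now share a label set."""
    out: dict[Key, float] = defaultdict(float)
    for labels, value in samples:
        kept = tuple(sorted((k, v) for k, v in labels.items() if k != label))
        out[kept] += value
    return dict(out)


# Made-up samples: three input series collapse into two after dropping 'instance'.
samples = [
    ({"service": "api", "instance": "10.0.0.1:9100"}, 3.0),
    ({"service": "api", "instance": "10.0.0.2:9100"}, 4.0),
    ({"service": "web", "instance": "10.0.0.3:9100"}, 5.0),
]
aggregated = drop_label_and_sum(samples, "instance")
for key, value in sorted(aggregated.items()):
    print(dict(key), value)
```

The stored series count now scales with the number of services instead of the number of instances, while per-service dashboards and alerts can still be answered from the aggregate.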
Operate
The Control Plane has built-in capabilities to ensure queries
perform optimally and require no user intervention, while reducing
idle time and improving engineer productivity
Query Accelerator - Automatically ensures every possible
dashboard is fast and performant – no manual optimizations
needed.
Query Scheduler - Automatically ensures that query resources are
shared fairly so that no single user or group of users can crowd out others.
Shaping Rules UI - Understand current shaping rules
configuration and value. Preview new policies before they are
implemented.
Operate - Continuously Adjust for Efficiency
Why Optimize Observability Spend?
The need is real
● According to a study by ESG, 69% of companies are concerned
with the rate of their observability data growth
● When able to control and optimize their data, companies are:
○ Expanding visibility and coverage
○ Increasing instrumentation of customer
experience to improve business
outcomes
○ Freeing up observability team time to
tackle strategic projects
Customer Impact
50% data volume reduction
90% reduction in on-call pages
80% data volume reduction
8x query latency improvement
98% data volume reduction
8x MTTD improvement
"With Chronosphere, we were able to
not only significantly improve
reliability and performance of our
observability solution, but we've also
saved millions of dollars a year. With
the Chronosphere Control Plane, we're
reducing our observability data
volumes by more than 80%."
Yash Kumaraswamy, Senior Staff Engineer, Robinhood
Learn More
Resources
● Introducing: The Observability Data Optimization Cycle
● Metrics Usage Analyzer: Understand the value of each metric in your system
● How cloud native workloads affect cardinality over time
● Metrics Quotas: Protect yourself from cardinality explosions and budget overruns
Case Studies
● Why DoorDash Needed True Cloud Native Monitoring
● Top FinTech company chooses Chronosphere observability for industry-leading
reliability and performance
Talk to an Observability expert at Chronosphere
○ Schedule a conversation
Questions?
Eric D. Schabell
Director Evangelism
@ericschabell{@fosstodon.org}
Aug 2023
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Quantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur MorganQuantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Ad

Optimizing Observability Spend: Metrics

  • 1. Optimizing Observability Spend: Metrics. Eric D. Schabell, Director Evangelism, @ericschabell{@fosstodon.org}. Aug 2023, DataOps Day 2023.
  • 4. “It’s remarkable how common this situation is, where an organization is paying more for their observability data, than they do for their production infrastructure.”
  • 5. Data volume experiment: a Hello World app on a 4-node Kubernetes cluster with tracing, End User Metrics (EUM), logs, and metrics (containers / nodes) generated more than 450 GB of data in 30 days.
  • 6. “If you have to ask, you can’t afford it…” Only if we get better outcomes…
  • 7. 10 hours per week, on average, spent trying to triage and understand incidents: a quarter of a 40-hour work week.
  • 8. Do you know the cost of your observability metrics data?
  • 9. Dedicated FinOps: “By 2023, 80% of organizations using cloud services will establish a dedicated FinOps function to automate policy-driven observability and optimization of cloud resources to maximize value.” -- Source: IDC 2022
  • 11. Centralized Governance - It Starts Here. Give teams ownership and control of their metrics to control cardinality and growth. ● Quotas - Allocate portions of the licensed persisted write capacity amongst teams and services ● Priorities - Prioritize which data is impacted if over capacity
  • 12. Analyze Data. Understand the value of the observability data to identify what is useful and what is waste. ● Metrics Traffic Analyzer - Provides a real-time view of incoming metrics grouped by label, and their relative frequency ● Metrics Usage Analyzer - View all metrics in Chronosphere ranked from least used to most used to understand the value each metric delivers ● Trace Analyzer - Provides a real-time view of incoming traces grouped by tag, and their relative frequency
  • 13. Metrics Traffic Analyzer. The Metrics Traffic Analyzer helps users: ● Understand metrics traffic patterns and scale ● Break down the biggest and smallest contributors to traffic scale (by metric name, label, application, etc.) ● Troubleshoot cardinality spikes
  • 14. Real-time view of incoming metrics. View Live or Pause to investigate specific metrics and their labels. View traffic before it is stored to help make decisions about traffic shape before you pay for it.
  • 15. Break down traffic by metric name and label. Callouts: Labels; Metrics.
  • 16. Troubleshoot high-cardinality metrics and labels. Metrics with the ‘instance’ label: the ‘instance’ label is on 100% of metrics and has 62 unique values.
  • 17. Metrics Usage Analyzer. The Metrics Usage Analyzer allows users to: ● Understand the value each metric delivers ● Identify unused and underutilized metrics ● Know if a metric is being used, where, and by whom ● Make better shaping decisions
  • 18. What is and is not valuable? Default sort is Least Valuable. Click for more Usage Details.
  • 19. Resolving uncertainty about value. Where is it being used? Select 14 or 30 days. How much is it used?
  • 20. Keeping it in context. Utility Score + DPPS (data points per second). Where and how much it’s used.
  • 21. Scenario - Low Utility Score but High Ingest. Low Utility Score, but high DPPS. How is it being used? Let’s take a look at the Usage Details.
  • 22. Scenario - No references, but some executions. Not being used in dashboards, alerts, etc., but it has two users. Who are they?
  • 23. Underutilized metric discovered! Two of our top SREs! What are they using it for? Should others be using it as well?
  • 24. Refine - Shape and Transform Data. After understanding the cost and value of data, we enable you to take action without touching source code or redeploying. We do this by allowing you to aggregate or downsample data, remove high-cardinality labels, or drop non-valuable data. This is done in real time at ingest (streaming), meaning no delay in alerts and no need to store raw data. The result is reduced cost and improved performance without alert or query impact.
  • 25. Operate - Continuously Adjust for Efficiency. The Control Plane has built-in capabilities to ensure queries perform optimally and require no user intervention, while reducing idle time and improving engineer productivity. ● Query Accelerator - Automatically ensures every possible dashboard is fast and performant; no manual optimizations needed ● Query Scheduler - Automatically ensures that query resources are fairly shared so one user, or group of users, can’t crowd out others ● Shaping Rules UI - Understand current shaping rules configuration and value. Preview new policies before they are implemented
  • 26. Why Optimize Observability Spend? The need is real ● According to a study by ESG, 69% of companies are concerned with the rate of their observability data growth ● When able to control and optimize their data, companies are: ○ Expanding visibility and coverage ○ Increasing instrumentation of customer experience to improve business outcomes ○ Freeing up observability team time to tackle strategic projects
  • 27. Customer Impact ● 50% data volume reduction ● 90% reduction in on-call pages ● 80% data volume reduction ● 8x query latency improvement ● 98% data volume reduction ● 8x MTTD improvement
  • 28. "With Chronosphere, we were able to not only significantly improve reliability and performance of our observability solution, but we've also saved millions of dollars a year. With the Chronosphere Control Plane, we're reducing our observability data volumes by more than 80%." Yash Kumaraswamy, Senior Staff Engineer, Robinhood
  • 29. Learn More. Resources: ● Introducing: The Observability Data Optimization Cycle ● Metrics Usage Analyzer: Understand the value of each metric in your system ● How cloud native workloads affect cardinality over time ● Metrics Quotas: Protect yourself from cardinality explosions and budget overruns. Case Studies: ● Why DoorDash Needed True Cloud Native Monitoring ● Top FinTech company chooses Chronosphere observability for industry-leading reliability and performance. Talk to an Observability expert at Chronosphere: ○ Schedule a conversation
  • 30. Questions? Eric D. Schabell, Director Evangelism, @ericschabell{@fosstodon.org}. Aug 2023.
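
Slides 16 and 24 talk about counting the unique values of a label and about removing high-cardinality labels at ingest. The following is a minimal sketch of that idea in plain Python, not Chronosphere's implementation or API. The metric name `http_requests_total`, the `service` label, the host names, and every function name are hypothetical and chosen only for illustration; the figure of 62 unique `instance` values is taken from slide 16. The sketch only counts distinct series. A real aggregation rule would also combine the sample values (for example by summing them).

```python
from collections import defaultdict


def label_cardinality(series):
    """Return {label name: number of unique values} across all series."""
    values = defaultdict(set)
    for _name, labels in series:
        for key, value in labels.items():
            values[key].add(value)
    return {key: len(vals) for key, vals in values.items()}


def drop_label(series, label):
    """Remove one label and return the set of distinct series that remain."""
    remaining = set()
    for name, labels in series:
        kept = tuple(sorted((k, v) for k, v in labels.items() if k != label))
        remaining.add((name, kept))
    return remaining


def build_sample(instances=62, services=("api", "web")):
    """Build hypothetical series: one per (instance, service) pair."""
    return [
        ("http_requests_total", {"instance": f"host-{i}", "service": svc})
        for i in range(instances)
        for svc in services
    ]


series = build_sample()
cardinality = label_cardinality(series)
reduced = drop_label(series, "instance")

print(cardinality)                      # {'instance': 62, 'service': 2}
print(len(series), "->", len(reduced))  # 124 -> 2
```

In this toy data set, removing the `instance` label collapses 124 series into 2, which shows why a single label that appears on every metric can dominate the stored series count.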