Building an Observability Platform in 389 Difficult Steps
Who am I?
David Worth
Sr. SRE and Engineering Manager at Strava
Previously Sr. Engineer and Engineering Manager at
DigitalOcean in Compute
Building an Observability
Platform in 389 Difficult Steps
As you may recall - I’ve talked about
the Venn Euler Diagram of
Observability before:
Logging Distributed Tracing
Error Handling Instrumentation
Error Logs
Error Rates &
Timing Information
Error Rates
Call Traces and
outcomes
Let’s get started!
Warning.
This won’t be easy.
This is an Investment in Capabilities.
Those capabilities will pay dividends
in operations, customer value, and
business continuity.
Observability Considerations
A brief survey.
Observability
Considerations
“Build, Buy, or Operate”
For each of these services there
are standard engineering
tradeoffs around building your
own, operating an OSS service,
and paying a 3rd party.
Observability
Considerations
“Stack”
The observability space is
extremely polyglot - even in fairly
homogeneous ecosystems like
Prometheus (Golang) clients will
be “stack” dependent.
Are you comfortable relying on
tools written in a language your
team may have limited familiarity
with?
Observability
Considerations
Where to keep the data?
You have a few options for these
services:
Cloud Provider managed
3rd Party managed
On-Prem externally-managed
On-Prem self-managed
Observability
Considerations
Retention vs. GDPR / CCPA / ...
If you are bound by privacy
compliance, ensure your retention
of any controlled data (PII) does not
exceed what regulations allow.
If using a 3rd party - how do they
ensure compliance?
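A retention limit can be enforced mechanically rather than by convention. The sketch below is a minimal illustration, assuming each record carries a timezone-aware `ts` field and that a 30-day window applies. Both the field name and the window are hypothetical; the real values should come from your DPO and the regulations you are bound by.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention window; the real value must come from your
# compliance requirements, not from this example.
MAX_RETENTION = timedelta(days=30)


def purge_expired(records, now=None, max_age=MAX_RETENTION):
    """Return only the records that are still inside the retention window.

    Each record is assumed to be a dict with a timezone-aware `ts` datetime.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - max_age
    return [r for r in records if r["ts"] >= cutoff]
```

Running something like this on a schedule, against every store that holds controlled data, is one way to keep retention from drifting past the limit.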
Observability
Considerations
Data Recording vs. Regulations
Never record passwords,
password hashes, CC#s or CVVs
during a transaction, in logs or
otherwise, for PCI/HIPAA/etc.
If you use a 3rd party for logging,
etc. and you have ever
accidentally logged any of those
things, how can you remediate
that?
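One way to reduce the chance of logging these values at all is to scrub every event before it leaves the process. This is a minimal sketch, assuming flat dictionary events; the key names in `SENSITIVE_KEYS` and the card-number pattern are illustrative assumptions, and the pattern is a heuristic rather than a compliance-grade detector.

```python
import re

# Hypothetical list of field names that must never reach a log sink.
SENSITIVE_KEYS = {"password", "password_hash", "cc_number", "cvv"}

# Rough pattern for 13-19 digit card numbers, optionally separated by
# spaces or dashes. It is a heuristic, not a PCI-grade detector.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def redact(event):
    """Return a copy of a flat log event with sensitive data masked."""
    clean = {}
    for key, value in event.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, str):
            clean[key] = CARD_PATTERN.sub("[REDACTED]", value)
        else:
            clean[key] = value
    return clean
```

Scrubbing at the source does not answer the remediation question for data that has already been shipped, but it narrows how often that question comes up.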
Do you have a Data Protection Officer (DPO)?
Your DPO can help you, and your engineers, navigate the complex
requirements that apply not just to observability platforms but also to what
you provide your customers and internal customers, such as data
analysts and business development teams.
Find one.
What are we actually talking about?
What even is an
“Observability
Platform”?
An “Observability Platform” is a
set of shared tools your
organization uses to understand
the state of your system at any
given time, with some historical
context, to diagnose and improve
the product.
Bare Metal +
Applications or
Services
VM +
Applications or
Services
Container
Orchestration
Container +
Application
Serverless
Function
Data Sources Platform
Exception
Handling
Logging
Metrics
Tracing
Specialized Domains:
DBs, etc.
APM
Sinks
Humans 👥
Robots 🤖
Engineers
Product Owners
Analysts
Finance Team
Chat Bots
AIML
Alerting
Other Sources
● Remote Clients: Mobile (Native) Applications and Browsers
● Short lived batch jobs (cron?)
● Long lived but inconsistently run batch jobs (Spark!)
● Networking Devices (routers, switches, firewalls, load-balancers, etc.)
● IoT Devices
Let’s actually build some observability
tools!
Let’s start with what we all have ...
Bare Metal +
Applications or
Services
VM +
Applications or
Services
Container
Orchestration
Container +
Application
Serverless
Function
Let’s start with what you have:
Hosts / Containers / Functions
each of which produces some or many
Errors / Logs / Metrics / API calls
Data Sources
Now let’s talk about what you can do with them:
Aggregate Errors / Logs / Metrics / API Call information
Into
A Unified Observability Platform
Errors!
We do! We do have Errors!
panic: really bad error
goroutine 1 [running]:
main.main()
$GOPATH/src/github.com/daveworth/foobar/main.go:14 +0x7b
exit status 2
Exception Handling Pipeline
Application
Exception 💥
+
Context:
Inputs (Query Parameters)
Request ID *
Environment Variables
Stack Trace
Exception
Handler Client
Exception
Handler Service
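The pipeline above can be sketched in a few lines: catch the exception, attach the context, and hand the bundle to a client. This is an illustrative sketch only. The function names, the `SAFE_ENV_VARS` allow-list, and the `send` callable are assumptions made for the example and are not the API of any of the services named in this talk.

```python
import os
import traceback

# Hypothetical allow-list: only these environment variables are attached,
# so that secrets in the environment are not shipped by accident.
SAFE_ENV_VARS = ("APP_ENV", "APP_VERSION")


def build_exception_event(exc, request_id, params):
    """Bundle an exception with inputs, request ID, environment, and stack trace."""
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "request_id": request_id,
        "inputs": dict(params),
        "environment": {k: os.environ[k] for k in SAFE_ENV_VARS if k in os.environ},
        "stack_trace": traceback.format_exception(type(exc), exc, exc.__traceback__),
    }


def report(event, send=print):
    """Hand the event to an exception handler client.

    `send` stands in for a real client; it defaults to printing the event.
    """
    send(event)
```

The Request ID is the field that lets you join this event to the logs and traces produced by the same request.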
Exception Handling Services
Sentry
Airbrake
Rollbar
Managed or On-Premise
Bugsnag
OK. We also have (lots of) Logs!
Aug 21 18:34:39 openvpn-access-server-sfo3 systemd[1]: Starting Daily apt
download activities...
Aug 21 18:34:39 openvpn-access-server-sfo3 systemd[1]: Started Daily apt download
activities.
Aug 21 19:17:01 openvpn-access-server-sfo3 CRON[21214]: (root) CMD ( cd / &&
run-parts --report /etc/cron.hourly)
Aug 21 20:17:01 openvpn-access-server-sfo3 CRON[21309]: (root) CMD ( cd / &&
run-parts --report /etc/cron.hourly)
Aug 21 21:17:01 openvpn-access-server-sfo3 CRON[21352]: (root) CMD ( cd / &&
Source(s) Log Collector
UI /
Visualization
👥 / 🤖
(Overly Simplified) Logging Pipeline Overview
Source
(A More Realistic) Logging Pipeline Overview
Filter Log Collector Log Aggregator
Broker
Ad-Hoc
Stream Query
Indexer
and
Query
UI /
Visualization 👥 / 🤖 /
🕵‍♀
Bare Metal +
Applications or
Services
VM +
Applications or
Services
Container
Orchestration
Container +
Application
Serverless
Function
Log Sources Storage / Access
Log Aggregators
ElasticSearch
Loki
Redis
Kafka
Kinesis
Brokers
Log Collectors
Collectors - can “push” to either Brokers or Log Storage
Logspout
FluentD / FluentBit
Filebeat
Promtail
rsyslog
Log Collectors are often also Log Aggregators
Aggregators - pull from Brokers or systems and push
to Log Storage
Logstash
FluentD / FluentBit
Promtail
To get the most out of a (Centralized) Logging
Platform you need to Structure your logs in such a way
that they can be best consumed by your Sinks.
A standard format is MITRE’s Common Event
Expression (CEE), which is represented as JSON. JSON is
well supported by essentially every programming
language and has the advantage of being both human
and robot parsable. Emitting Structured Logs means
humans, programs written by humans, and Centralized
Logging Platforms can consume logs on equal footing.
Not every 3rd party application you run in your
ecosystem will support your logging format - you may
have to ingest suboptimal logs or write transformers
for them.
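A transformer for unstructured logs can be quite small. The sketch below assumes classic syslog lines like the ones shown earlier in this talk; the regular expression and the output field names (`ts`, `host`, `program`, `pid`, `msg`) are choices made for this example, not a standard.

```python
import json
import re

# Matches classic syslog lines such as:
#   Aug 21 18:34:39 host systemd[1]: Started something.
SYSLOG_LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\[:\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<msg>.*)$"
)


def syslog_to_json(line):
    """Convert one syslog line to a JSON string, or return None if it does not parse."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    if fields["pid"] is not None:
        fields["pid"] = int(fields["pid"])
    return json.dumps(fields)
```

Lines that do not parse return `None` so the caller can decide whether to drop them or forward them unmodified.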
Logging Aside - Wire Format
{
"level": "error",
"ts": 1598044449.8620532,
"caller": "zappings/main.go:47",
"msg": "This is an ERROR message",
...
}
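Emitting logs in this shape does not require a special library. As a minimal sketch using only the Python standard library, the formatter below produces the same four fields shown above; the class and function names are hypothetical, and a shared platform library would likely add more fields such as a request ID.

```python
import json
import logging


class JSONFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname.lower(),
            "ts": record.created,
            "caller": f"{record.filename}:{record.lineno}",
            "msg": record.getMessage(),
        })


def get_logger(name, stream=None):
    """Build a logger that emits structured JSON to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JSONFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

Calling `get_logger("myservice").error("This is an ERROR message")` writes one JSON line that a Log Collector can forward without any further parsing.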
Log Storage and Usage
Where are the logs actually going?
Storage / Access
ElasticSearch
Loki
UI / Visualization
Kibana
Grafana
Sinks
Humans 👥
Engineers
Product Owners
Analysts
Finance Team
Robots 🤖
Chat Bots
AIML
Alerting
Legal 🕵‍♀
Audit
Log Processing and Collection
How do we get (logs) there?
Log Processing Infrastructure
Storage / Access
Log Aggregators
ElasticSearch
Loki
Redis
Kafka
Kinesis
Brokers
Log Collectors
Log Collector Deployment Patterns - Hosts
Bare Metal +
Applications or
Services
VM +
Applications or
Services
Container
Orchestration
Log Collector
running as a
service locally
on the host
Log Processing
Infrastructure
Log Collector Deployment Patterns - Containers
Container
Orchestration
Container +
Application
Log Collector
running as a
service locally
Log Processing
Infrastructure
Logging
Sidecar
Container +
Application
Logging
Sidecar
Container +
Application
Logging
Sidecar
Log Collector Deployment Patterns - Functions
Log Processing
Infrastructure
Log Collector
running as a
separate service
Serverless
Function
Cloud Provider
native logging
OK. What about Metrics?
# A weird metric from before the epoch:
something_weird{problem="division by zero"} +Inf -3982045
# A histogram, which has a pretty complex representation in the text format:
# HELP http_request_duration_seconds A histogram of the request duration.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.05"} 24054
http_request_duration_seconds_bucket{le="0.1"} 33444
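The histogram lines above are cumulative: each `le` bucket counts every observation less than or equal to its bound. The sketch below is a toy illustration of that idea, not the Prometheus client library; the `Histogram` class and its methods are hypothetical names for this example.

```python
import bisect


class Histogram:
    """A minimal histogram that renders cumulative buckets in a text format."""

    def __init__(self, name, bounds):
        self.name = name
        self.bounds = sorted(bounds)
        # One counter per finite bound, plus one for the implicit +Inf bucket.
        self.counts = [0] * (len(self.bounds) + 1)
        self.total = 0.0

    def observe(self, value):
        # Find the first bucket whose upper bound is >= value.
        self.counts[bisect.bisect_left(self.bounds, value)] += 1
        self.total += value

    def render(self):
        lines = [f"# TYPE {self.name} histogram"]
        running = 0
        for bound, count in zip(self.bounds + [float("inf")], self.counts):
            running += count
            label = "+Inf" if bound == float("inf") else repr(bound)
            lines.append(f'{self.name}_bucket{{le="{label}"}} {running}')
        lines.append(f"{self.name}_sum {self.total}")
        lines.append(f"{self.name}_count {running}")
        return "\n".join(lines)
```

In practice you would use the official client for your stack; the point here is only that a “Timer” is usually a histogram of durations underneath.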
Database
Accesses
Source
Application Metrics
“Timers”
Counters
Gauges
“Instruments”
External
API Calls
RPCs
Critical
Code Paths
Runtime
Metrics
Host
Metrics
Source 👥 / 🤖
(Overly Simplified) Metrics Pipeline Overview
“Timers”
Counters
Gauges
“Instruments”
Metrics
Collector
UI / Visualization
Source
(A More Realistic) Metrics Pipeline Overview
“Timers”
Counters
Gauges
“Instruments”
Metrics
Collector
👥 / 🤖
UI / Visualization
Metrics
Aggregator
Long-Term Metrics
Storage
Bare Metal +
Applications or
Services
VM +
Applications or
Services
Container
Orchestration
Container +
Application
Serverless
Function
Metrics Sources
StatsD
Prometheus
Push-Gateway
Prometheus
Metrics Collectors
Query / UI
Graphite Grafana Prometheus
Metrics Aggregators /
Long Term Storage
Graphite
Thanos Cortex
Metrics Storage and Usage
Where are we going?
Query / UI
Graphite
Grafana Prometheus
Metrics Aggregators &
Long Term Storage
Graphite
Thanos Cortex
Sinks
Humans 👥
Engineers
Product Owners
Analysts
Finance Team
Robots 🤖
Chat Bots
AIML
Alerting
Metric Ingest and Collection
How do we get (metrics) there?
Metric Ingest Infrastructure
StatsD
Prometheus
Push-Gateway
Prometheus
Metrics Collectors Metrics Aggregators /
Long Term Storage
Graphite
Thanos Cortex
Metrics Collector Deployment Patterns - Hosts
Bare Metal +
Applications or
Services
VM +
Applications or
Services
Container
Orchestration
Metrics
collectors and
exporters
running locally
on the host
Metrics Ingest
Infrastructure
Metrics Collector Deployment Patterns - Containers
Container
Orchestration
Containerized
Application
Metrics
collectors and
exporters
Metrics Ingest
Infrastructure
Service
Sidecar(s)
Service
Sidecar(s)
Service
Sidecar(s)
Service Discovery
Metrics Collector Deployment Patterns - Functions
Metric Ingest
Infrastructure
Metrics Collector
running as a separate
service ingesting
native metrics
Serverless
Function
Cloud Provider
native metrics
But is it a
“Platform”?
It is only a “Platform” when every
aspect of your business, from
tactical engineering metrics to
business metrics, is enabled “for
free”, i.e. your team does not
have to remember to integrate
with the platform.
This is doubly true if you are talking
about “Tracing”
You had this...
Bare Metal +
Applications or
Services
VM +
Applications or
Services
Container
Orchestration
Container +
Application
Serverless
Function
Data Sources
Now you have this….
Bare Metal +
Applications or
Services
VMs +
Applications or
Services
Container
Orchestration
Containers +
Application
Serverless
Functions
Whoa! WAY too many Data Sources
Bare Metal +
Applications or
Services
VM +
Applications or
Services
Container
Orchestration
Container +
Application
Serverless
Function
Data Sources
Platform
Exception
Handling
Logging
Metrics
Tracing
You need this ...
Sinks
Humans 👥
Robots 🤖
Engineers
Product Owners
Analysts
Finance Team
Chat Bots
AIML
Alerting
to serve them!
And you build this
by ...
Choosing a unified set of
platform tools for each domain
(Exception Handling / Logging /
Metrics / etc…) and building
curated libraries that all of your
applications consume to ensure
they integrate into your platform.
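A curated library can be as simple as one wrapper that every application uses. The sketch below is a hedged illustration of the idea: the `observed` decorator and the in-process `CALLS` and `ERRORS` stores are hypothetical stand-ins for whichever metrics and exception handling tools you choose.

```python
import functools
import logging
import time
from collections import Counter

# Hypothetical in-process stand-ins for the platform's metrics and
# exception backends; a real library would forward to your chosen tools.
CALLS = Counter()
ERRORS = []
log = logging.getLogger("platform")


def observed(func):
    """Wrap a function so it is logged, counted, timed, and error-reported."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        try:
            result = func(*args, **kwargs)
            CALLS[(func.__name__, "ok")] += 1
            return result
        except Exception as exc:
            CALLS[(func.__name__, "error")] += 1
            ERRORS.append((func.__name__, repr(exc)))
            raise
        finally:
            log.info("%s took %.6fs", func.__name__, time.monotonic() - start)

    return wrapper
```

An engineer who writes `@observed` above a handler gets a log line, a counter, a duration, and an error report without having to remember any of the underlying integrations.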
And there just
isn’t one “best”
answer
I have my favorites
… but the answer is you still have
homework to do. I’ve named a
few of my favorites during this
talk - maybe they help?
Let’s briefly talk about tracing...
and why it is a great “forcing function” for
building a true Platform
Tracing only
really works
when
Literally every system in your
entire ecosystem integrates with
it. Every blind spot is magnified in
tracing.
A Distributed Tracing Primer
Start Time
End Time +
Success/Failure
Request - ID: 1234
Cache Miss Service Call (ID: 1234) Render HTML Query Wrapper
Database Query DB Query Hard Calculation Coordinator
Calculation Pipelines
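Each box in the primer is a span: a named unit of work with a start time, an end time, an outcome, and the Request ID that ties it to its siblings. As a minimal sketch, assuming spans are collected in an in-process list (the `span` helper and `FINISHED` list are hypothetical, not a real tracing API):

```python
import time
from contextlib import contextmanager

# Finished spans are appended here; a real tracer would ship them to a collector.
FINISHED = []


@contextmanager
def span(name, request_id, parent=None):
    """Record start time, end time, and success/failure for one unit of work."""
    record = {"name": name, "request_id": request_id, "parent": parent,
              "start": time.time(), "end": None, "ok": None}
    try:
        yield record
        record["ok"] = True
    except Exception:
        record["ok"] = False
        raise
    finally:
        record["end"] = time.time()
        FINISHED.append(record)
```

Nesting `with span("db_query", "1234", parent="request"):` inside `with span("request", "1234"):` reproduces the parent and child structure of the diagram; any service that does not propagate the Request ID simply disappears from the picture.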
Source(s)
(Distributed) Tracing Pipeline Overview
Trace Collector Sampling Trace Storage
Request-ID Aware
UI /
Visualization
👥
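Sampling has to be Request-ID aware, otherwise one service keeps a trace that another service drops and the result has holes. One common way to achieve this is to derive the decision from a hash of the Request ID, so every service reaches the same answer independently. This is a sketch of that technique under the assumption that Request IDs are strings; `should_sample` is a hypothetical helper, not part of any named tool.

```python
import hashlib


def should_sample(request_id, rate):
    """Decide from the request ID alone whether a trace is kept.

    `rate` is the fraction of traces to keep, from 0.0 to 1.0.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the hash as an integer in [0, 2**64)
    # and keep the trace if it falls below the requested fraction.
    return int.from_bytes(digest[:8], "big") < rate * 2**64
```

Because the decision depends only on the ID, no coordination between services is needed to keep or drop a whole trace together.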
Bare Metal +
Applications or
Services
VM +
Applications or
Services
Container
Orchestration
Container +
Application
Serverless
Function
Tracing Sources Trace Collectors Trace Storage
UI / Visualizations
These Spaces Intentionally
Left Blank.
These Spaces Intentionally
Left Blank.
These Spaces Intentionally
Left Blank.
So why is Tracing
the “forcing
function”?
Every single time you have a
Trace with a “blind spot” it
creates red herrings and diverts
attention from the real problem.
So tracing lets you fix it by
touching all of your systems and
eliminating those blind spots by
...
Eliminate those blind spots by ...
… building good and standard Exception Handling libraries.
… logging uniformly, with structure, via those libraries.
… exposing good instrumentation primitives in those libraries.
… adding tracing primitives via those libraries.
Eliminate those blind spots so ...
… all of your engineers “get observability for free”
… and they can get more specialized observability with very little work.
389 Difficult Steps Distilled
Take what you have, process it, and centralize it. Ensure that everyone
has the same tools and the same systems to consume them. Solve the
problems you have today and prepare to solve the ones you will have
soon.
Build a platform.
And if you’ve done that - you have an Observability Platform!
The End.
Thank you!
Questions? 🤔🤔🤔
Doing This Cloud Thing Right – a Lap Around DigitalOcean Products and a Roadm...
DigitalOcean
 
Build Cloud Native Apps With DigitalOcean Kubernetes
Build Cloud Native Apps With DigitalOcean KubernetesBuild Cloud Native Apps With DigitalOcean Kubernetes
Build Cloud Native Apps With DigitalOcean Kubernetes
DigitalOcean
 
Benefits of Managed Databases
Benefits of Managed DatabasesBenefits of Managed Databases
Benefits of Managed Databases
DigitalOcean
 
Increase App Confidence Using CI/CD and Infrastructure As Code
Increase App Confidence Using CI/CD and Infrastructure As CodeIncrease App Confidence Using CI/CD and Infrastructure As Code
Increase App Confidence Using CI/CD and Infrastructure As Code
DigitalOcean
 
Build a Tech Brand During Covid in Emerging Tech Ecosystems
Build a Tech Brand During Covid in Emerging Tech EcosystemsBuild a Tech Brand During Covid in Emerging Tech Ecosystems
Build a Tech Brand During Covid in Emerging Tech Ecosystems
DigitalOcean
 
Sailing Through a Sea of CMS: Build and Extend APIs Faster With Strapi
Sailing Through a Sea of CMS: Build and Extend APIs Faster With StrapiSailing Through a Sea of CMS: Build and Extend APIs Faster With Strapi
Sailing Through a Sea of CMS: Build and Extend APIs Faster With Strapi
DigitalOcean
 
Doing E-commerce Right – Magento on DigitalOcean
Doing E-commerce Right – Magento on DigitalOceanDoing E-commerce Right – Magento on DigitalOcean
Doing E-commerce Right – Magento on DigitalOcean
DigitalOcean
 
Headless E-commerce That People Love
Headless E-commerce That People LoveHeadless E-commerce That People Love
Headless E-commerce That People Love
DigitalOcean
 
The Cloud Hosting Revolution Creates Opportunities for Your Business
The Cloud Hosting Revolution Creates Opportunities for Your BusinessThe Cloud Hosting Revolution Creates Opportunities for Your Business
The Cloud Hosting Revolution Creates Opportunities for Your Business
DigitalOcean
 
Build, Deploy, and Scale Your First Web App Using DigitalOcean App Platform
Build, Deploy, and Scale Your First Web App Using DigitalOcean App PlatformBuild, Deploy, and Scale Your First Web App Using DigitalOcean App Platform
Build, Deploy, and Scale Your First Web App Using DigitalOcean App Platform
DigitalOcean
 
Effective Kubernetes Onboarding
Effective Kubernetes OnboardingEffective Kubernetes Onboarding
Effective Kubernetes Onboarding
DigitalOcean
 
Creating Inclusive Learning Experiences
Creating Inclusive Learning ExperiencesCreating Inclusive Learning Experiences
Creating Inclusive Learning Experiences
DigitalOcean
 
Kubernetes for Beginners
Kubernetes for BeginnersKubernetes for Beginners
Kubernetes for Beginners
DigitalOcean
 
Command-line Your Way to PaaS Productivity With DigitalOcean App Platform
Command-line Your Way to PaaS Productivity With DigitalOcean App PlatformCommand-line Your Way to PaaS Productivity With DigitalOcean App Platform
Command-line Your Way to PaaS Productivity With DigitalOcean App Platform
DigitalOcean
 
Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...
Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...
Escape the Walls of PaaS: Unlock the Power & Flexibility of DigitalOcean App ...
DigitalOcean
 
Kubernetes: Beyond Baby Steps
Kubernetes: Beyond Baby StepsKubernetes: Beyond Baby Steps
Kubernetes: Beyond Baby Steps
DigitalOcean
 
How to Leverage Go for Your Networking Needs
How to Leverage Go for Your Networking NeedsHow to Leverage Go for Your Networking Needs
How to Leverage Go for Your Networking Needs
DigitalOcean
 
Combining Cloud Native & PaaS: Building a Fully Managed Application Platform ...
Combining Cloud Native & PaaS: Building a Fully Managed Application Platform ...Combining Cloud Native & PaaS: Building a Fully Managed Application Platform ...
Combining Cloud Native & PaaS: Building a Fully Managed Application Platform ...
DigitalOcean
 
Secrets to Building & Scaling SRE Teams
Secrets to Building & Scaling SRE TeamsSecrets to Building & Scaling SRE Teams
Secrets to Building & Scaling SRE Teams
DigitalOcean
 
Deploying to DigitalOcean With GitHub Actions
Deploying to DigitalOcean With GitHub ActionsDeploying to DigitalOcean With GitHub Actions
Deploying to DigitalOcean With GitHub Actions
DigitalOcean
 
Doing This Cloud Thing Right – a Lap Around DigitalOcean Products and a Roadm...
Doing This Cloud Thing Right – a Lap Around DigitalOcean Products and a Roadm...Doing This Cloud Thing Right – a Lap Around DigitalOcean Products and a Roadm...
Doing This Cloud Thing Right – a Lap Around DigitalOcean Products and a Roadm...
DigitalOcean
 

Building an Observability Platform in 389 Difficult Steps

  • 2. Who am I? David Worth, Sr. SRE and Engineering Manager at Strava. Previously Sr. Engineer and Engineering Manager at DigitalOcean in Compute.
  • 3. Building an Observability Platform in 383 Difficult Steps
  • 4. As you may recall - I’ve talked about the Venn (Euler) Diagram of Observability before. Sets: Logging, Distributed Tracing, Error Handling, Instrumentation. Overlaps: Error Logs, Error Rates & Timing Information, Error Rates, Call Traces and outcomes.
  • 6. Warning. This won’t be easy. This is an Investment in Capabilities. Those capabilities will pay dividends in operations, customer value, and business continuity.
  • 8. Observability Considerations: “Build, Buy, or Operate”. For each of these services there are standard engineering tradeoffs among building your own, operating an OSS service, and paying a 3rd party.
  • 9. Observability Considerations: “Stack”. The observability space is extremely polyglot - even in fairly homogeneous ecosystems like Prometheus (Golang), clients will be “stack” dependent. Are you comfortable relying on tools written in a language your team may have limited familiarity with?
  • 10. Observability Considerations: Where to keep the data? You have a few options for these services: Cloud Provider managed; 3rd Party managed; On-Prem externally-managed; On-Prem self-managed.
  • 11. Observability Considerations: Retention vs. GDPR / CCPA / ... If you are bound by privacy compliance, ensure your retention of any controlled data (PII) is no longer than the regulations permit. If using a 3rd party - how do they ensure compliance?
  • 12. Observability Considerations: Data Recording vs. Regulations. Never record passwords, password hashes, CC#s, or CVVs during a transaction, in logs or otherwise, for PCI/HIPAA/etc. If using a 3rd party for logging, etc., and you have ever accidentally logged any of those things, how can you remediate that?
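One way to make "never record" harder to violate is to mask sensitive fields inside the shared logging path rather than relying on each call site. The following is a minimal sketch under stated assumptions: the `sensitiveKeys` deny-list, the `Redact` helper, and the field names are all hypothetical and not from the talk.

```go
package main

import (
	"fmt"
	"strings"
)

// sensitiveKeys is a hypothetical deny-list; extend it for your own domain.
var sensitiveKeys = map[string]bool{
	"password":      true,
	"password_hash": true,
	"cc_number":     true,
	"cvv":           true,
}

// Redact returns a copy of fields with the values of sensitive keys masked.
// Key matching is case-insensitive; the input map is not modified.
func Redact(fields map[string]string) map[string]string {
	out := make(map[string]string, len(fields))
	for k, v := range fields {
		if sensitiveKeys[strings.ToLower(k)] {
			out[k] = "[REDACTED]"
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	fields := map[string]string{"user": "alice", "Password": "hunter2", "cvv": "123"}
	// Only the redacted copy should ever reach a log line.
	fmt.Println(Redact(fields))
}
```

A deny-list only catches the keys you thought of, so it complements, rather than replaces, a remediation plan for data that slips through.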
  • 13. Do you have a Data Protection Officer (DPO)? Your DPO can help you, and your engineers, navigate the complex requirements not just of observability platforms but of what you provide to your customers and to internal customers such as data analysts and business development teams. Find one.
  • 14. What are we actually talking about?
  • 15. What even is an “Observability Platform”? An “Observability Platform” is a set of shared tools your organization uses to understand the state of your system at any given time, with some historical context, to diagnose and improve the product.
  • 16. Data Sources: Bare Metal + Applications or Services; VM + Applications or Services; Container Orchestration; Container + Application; Serverless Function. Platform: Exception Handling, Logging, Metrics, Tracing, Specialized Domains (DBs, etc.), APM. Sinks: Humans 👥 (Engineers, Product Owners, Analysts, Finance Team); Robots 🤖 (Chat Bots, AI/ML, Alerting).
  • 17. Other Sources ● Remote Clients: Mobile (Native) Applications and Browsers ● Short lived batch jobs (cron)? ● Long lived but inconsistently run batch jobs (Spark)! ● Networking Devices (routers, switches, firewalls, load-balancers, etc.) ● IoT Devices
  • 18. Let’s actually build some observability tools!
  • 19. Let’s start with what we all have ...
  • 20. Data Sources: Bare Metal + Applications or Services; VM + Applications or Services; Container Orchestration; Container + Application; Serverless Function. Let’s start with what you have: Hosts / Containers / Functions, each of which produces some or many Errors / Logs / Metrics / API calls. Now let’s talk about what you can do with them: aggregate Errors / Logs / Metrics / API call information into a Unified Observability Platform.
  • 21. Errors! We do! We do have Errors! panic: really bad error goroutine 1 [running]: main.main() $GOPATH/src/github.com/daveworth/foobar/main.go:14 +0x7b exit status 2
  • 22. Exception Handling Pipeline: Application; Exception 💥 + Context (Inputs (Query Parameters), Request ID *, Environment Variables, Stack Trace); Exception Handler Client; Exception Handler Service.
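The "Exception + Context" step on this slide can be sketched in Go as a wrapper that recovers a panic and bundles it with the request context before handing it to a client library. This is an illustrative sketch only: the `Report` type and `Capture` function are hypothetical names, and environment variables are deliberately left out here because they often hold secrets.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// Report is a hypothetical payload an exception handler client might send
// to an exception handler service.
type Report struct {
	Message   string
	RequestID string
	Inputs    map[string]string
	Stack     string
}

// Capture runs fn and, if it panics, converts the panic into a Report
// carrying the request context and a stack trace. It returns nil when fn
// completes normally.
func Capture(requestID string, inputs map[string]string, fn func()) (report *Report) {
	defer func() {
		if r := recover(); r != nil {
			report = &Report{
				Message:   fmt.Sprint(r),
				RequestID: requestID,
				Inputs:    inputs,
				Stack:     string(debug.Stack()),
			}
		}
	}()
	fn()
	return nil
}

func main() {
	rep := Capture("req-1234", map[string]string{"q": "shoes"}, func() {
		panic("really bad error")
	})
	if rep != nil {
		fmt.Printf("request=%s error=%q inputs=%v\n", rep.RequestID, rep.Message, rep.Inputs)
	}
}
```

In a real pipeline the `Report` would be serialized and shipped to whichever exception handler service you chose; that transport is omitted here.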
  • 24. OK. We also have (lots of) Logs! Aug 21 18:34:39 openvpn-access-server-sfo3 systemd[1]: Starting Daily apt download activities... Aug 21 18:34:39 openvpn-access-server-sfo3 systemd[1]: Started Daily apt download activities. Aug 21 19:17:01 openvpn-access-server-sfo3 CRON[21214]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly) Aug 21 20:17:01 openvpn-access-server-sfo3 CRON[21309]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly) Aug 21 21:17:01 openvpn-access-server-sfo3 CRON[21352]: (root) CMD ( cd / &&
  • 25. (Overly Simplified) Logging Pipeline Overview. Stages: Source(s), Log Collector, UI / Visualization, 👥 / 🤖
  • 26. (A More Realistic) Logging Pipeline Overview. Stages: Source, Filter, Log Collector, Log Aggregator, Broker, Ad-Hoc Stream Query, Indexer and Query, UI / Visualization, 👥 / 🤖 / 🕵‍♀
  • 27. Log Sources: Bare Metal + Applications or Services; VM + Applications or Services; Container Orchestration; Container + Application; Serverless Function. Log Collectors. Brokers: Redis, Kafka, Kinesis. Log Aggregators. Storage / Access: ElasticSearch, Loki.
  • 28. Collectors can “push” to either Brokers or Log Storage: Logspout, FluentD / FluentBit, Filebeat, Promtail, rsyslog. Log Collectors often are Log Aggregators. Aggregators pull from Brokers or systems and push to Log Storage: Logstash, FluentD / FluentBit, Promtail.
  • 29. Logging Aside - Wire Format. To get the most out of a (Centralized) Logging Platform you need to structure your logs in such a way that they can be best consumed by your Sinks. A standard format is MITRE’s Common Event Expression (CEE), which can be represented as JSON. JSON is well supported by essentially every programming language and has the advantage of being both human and robot parsable. Emitting Structured Logs means humans, programs written by humans, and Centralized Logging Platforms can consume logs on equal footing. Not all 3rd party applications you run in your ecosystem may support your logging format - you may have to ingest suboptimal logs or write transformers for them. Example: { "level": "error", "ts": 1598044449.8620532, "caller": "zappings/main.go:47", "msg": "This is an ERROR message", ... }
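As a rough illustration of "human and robot parsable", the sketch below emits a JSON log line with Go's standard `log/slog` package (Go 1.21 or later) and then reads it back with `encoding/json`. The sample on the slide appears to come from a different logger, so the field names differ (`time` and `level` here rather than `ts`); `EmitError` and `Parse` are hypothetical helper names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
)

// EmitError writes one structured error line into a buffer and returns the
// raw JSON bytes, as a collector would see them on the wire.
func EmitError(msg string, args ...any) []byte {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	logger.Error(msg, args...)
	return buf.Bytes()
}

// Parse decodes a JSON log line into a generic map, standing in for any
// downstream program that consumes the same logs a human would read.
func Parse(line []byte) (map[string]any, error) {
	var rec map[string]any
	err := json.Unmarshal(line, &rec)
	return rec, err
}

func main() {
	line := EmitError("This is an ERROR message", "request_id", "1234")
	fmt.Print(string(line)) // what a human sees

	rec, err := Parse(line) // what a robot sees
	if err != nil {
		panic(err)
	}
	fmt.Println(rec["level"], rec["msg"], rec["request_id"])
}
```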
  • 30. Log Storage and Usage Where are the logs actually going?
  • 31. Storage / Access: ElasticSearch, Loki. UI / Visualization: Kibana, Grafana. Sinks: Humans 👥 (Engineers, Product Owners, Analysts, Finance Team); Robots 🤖 (Chat Bots, AI/ML, Alerting); Legal 🕵‍♀ (Audit).
  • 32. Log Processing and Collection How do we get (logs) there?
  • 33. Log Processing Infrastructure: Log Collectors. Brokers: Redis, Kafka, Kinesis. Log Aggregators. Storage / Access: ElasticSearch, Loki.
  • 34. Log Collector Deployment Patterns - Hosts: Bare Metal + Applications or Services; VM + Applications or Services; Container Orchestration. Log Collector running as a service locally on the host. Log Processing Infrastructure.
  • 35. Log Collector Deployment Patterns - Containers: Container Orchestration; three Container + Application instances, each with a Logging Sidecar. Log Collector running as a service locally. Log Processing Infrastructure.
  • 36. Log Collector Deployment Patterns - Functions: Serverless Function; Cloud Provider native logging; Log Collector running as a separate service; Log Processing Infrastructure.
  • 37. OK  What about Metrics? # A weird metric from before the epoch: something_weird{problem="division by zero"} +Inf -3982045 # A histogram, which has a pretty complex representation in the text format: # HELP http_request_duration_seconds A histogram of the request duration. # TYPE http_request_duration_seconds histogram http_request_duration_seconds_bucket{le="0.05"} 24054 http_request_duration_seconds_bucket{le="0.1"} 33444
  • 39. (Overly Simplified) Metrics Pipeline Overview. Stages: Source with “Instruments” (“Timers”, Counters, Gauges), Metrics Collector, UI / Visualization, 👥 / 🤖
  • 40. (A More Realistic) Metrics Pipeline Overview. Stages: Source with “Instruments” (“Timers”, Counters, Gauges), Metrics Collector, Metrics Aggregator, Long-Term Metrics Storage, UI / Visualization, 👥 / 🤖
  • 41. Metrics Sources: Bare Metal + Applications or Services; VM + Applications or Services; Container Orchestration; Container + Application; Serverless Function. Metrics Collectors: StatsD, Prometheus Push-Gateway, Prometheus. Metrics Aggregators / Long Term Storage: Graphite, Thanos, Cortex. Query / UI: Graphite, Grafana, Prometheus.
  • 42. Metrics Storage and Usage Where are we going?
  • 43. Query / UI: Graphite, Grafana, Prometheus. Metrics Aggregators & Long Term Storage: Graphite, Thanos, Cortex. Sinks: Humans 👥 (Engineers, Product Owners, Analysts, Finance Team); Robots 🤖 (Chat Bots, AI/ML, Alerting).
  • 44. Metric Ingest and Collection How do we get (metrics) there?
  • 45. Metric Ingest Infrastructure. Metrics Collectors: StatsD, Prometheus Push-Gateway, Prometheus. Metrics Aggregators / Long Term Storage: Graphite, Thanos, Cortex.
  • 46. Metrics Collector Deployment Patterns - Hosts: Bare Metal + Applications or Services; VM + Applications or Services; Container Orchestration. Metrics collectors and exporters running locally on the host. Metrics Ingest Infrastructure.
  • 47. Metrics Collector Deployment Patterns - Containers: Container Orchestration; Containerized Application; Service Sidecar(s) (three instances); Service Discovery; Metrics collectors and exporters; Metrics Ingest Infrastructure.
  • 48. Metrics Collector Deployment Patterns - Functions: Serverless Function; Cloud Provider native metrics; Metrics Collector running as a separate service ingesting native metrics; Metric Ingest Infrastructure.
  • 49. But is it a “Platform”? It is only a “Platform” when every aspect of your business, from tactical engineering metrics to business metrics, is enabled “for free”, i.e. your team does not have to remember to integrate with the platform.
  • 50. This is doubly true if you are talking about “Tracing”
  • 51. You had this... Data Sources: Bare Metal + Applications or Services; VM + Applications or Services; Container Orchestration; Container + Application; Serverless Function.
  • 52. Now you have this…. Bare Metal + Applications or Services VMs + Applications or Services Container Orchestration Containers + Application Serverless Functions Whoa! WAY too many Data Sources
  • 53. Data Sources: Bare Metal + Applications or Services; VM + Applications or Services; Container Orchestration; Container + Application; Serverless Function. You need this ... Platform: Exception Handling, Logging, Metrics, Tracing ... to serve them! Sinks: Humans 👥 (Engineers, Product Owners, Analysts, Finance Team); Robots 🤖 (Chat Bots, AI/ML, Alerting).
  • 54. And you build this by ... choosing a unified set of platform tools for each domain (Exception Handling / Logging / Metrics / etc…) and building curated libraries that all of your applications consume to ensure they integrate into your platform.
  • 55. And there just isn’t one “best” answer. I have my favorites … but the answer is you still have homework to do. I’ve named a few of my favorites during this talk - maybe they help?
  • 56. Let’s briefly talk about tracing... and why it is a great “forcing function” for building a true Platform
  • 57. Tracing only really works when literally every system in your entire ecosystem integrates with it. Every blind spot is magnified in tracing.
  • 58. A Distributed Tracing Primer: Start Time, End Time + Success/Failure. Request - ID: 1234; Cache Miss; Service Call (ID: 1234); Render HTML; Query Wrapper; Database Query; DB Query; Hard Calculation; Coordinator; Calculation Pipelines.
  • 59. Source(s) Distributed) Tracing Pipeline Overview Trace Collector Samping Trace Storage Request-ID Aware UI / Visualization 👥
  • 60. Bare Metal + Applications or Services VM  Applications or Services Container Orchestration Container + Application Serverless Function Tracing Sources Trace Collectors Trace Storage UI / Visualizations These Spaces Intentionally Left Blank. These Spaces Intentionally Left Blank. These Spaces Intentionally Left Blank.
  • 61. So why is Tracing the “forcing function”? Every single time you have a Trace with a “blind spot” it creates red-herrings and diverts attention from the real problem. So tracing lets you fix it by touching all of your systems and eliminating those blind spots by ...
  • 63. … building good and standard Exception Handling libraries. … logging uniformly, with structure with those libraries. … exposing good instrumentation primitives in that library. … adding tracing primitives via that library. Eliminate those blinds spots by ...
  • 64. Eliminate those blind spots so … … all of your engineers “get observability for free” … and they can get more specialized observability with very little work.
  • 65. 383 Difficult Steps Distilled: Take what you have, process it, and centralize it. Ensure that everyone has the same tools and the same systems to consume them. Solve the problems you have today and prepare to solve the ones you will have soon. Build a platform.
  • 66. And if you’ve done that - you have an Observability Platform!