@chronosphereio
Querying millions to billions of metrics with M3DB’s index
FOSDEM 2020
@roskilli
Previously M3 tech lead at Uber, creator of M3DB.
CTO at Chronosphere.
Member of OpenMetrics.
Monitoring: what is a metric?
Schema for data you would like to collect and aggregate
Name
● http_requests
Dimensions/Labels
● endpoint (e.g. /api/search)
● status_code (e.g. 500)
● deploy_version_git_sha (e.g. 25149a04c)
Problem
1. Increasing number of regions, containers, and k8s pods, plus tracking of deployed versions (cardinality!)
2. Metrics can have an arbitrary number of dimensions
3. Building a compound index is expensive
Adding more metrics at organizations
1. We have monitoring, it’s awesome, and developers are mostly happy with standardized metrics.
2. Developers put custom metrics on everything and I am deploying tons of applications in something like Kubernetes. Things are OK!
3. Things are way too on fire. We can’t manage this many things anymore. Can everyone just stop, please?
Timeseries
Timeseries from lots of hosts and container pods
ID   Timeseries
1    __name__=cpu_seconds_total, pod=foo-123abc
8    __name__=memory_memfree, pod=foo-123abc
33   __name__=cpu_seconds_total, pod=foo-456def
44   __name__=memory_memfree, pod=foo-456def
45   __name__=cpu_seconds_total, pod=bar-768ghe
58   __name__=memory_memfree, pod=bar-768ghe
… millions, and if you are unfortunate, billions
Aggregate metric cpu_seconds_total
Of the six example timeseries, this query selects IDs 1, 33, and 45 (every series with __name__=cpu_seconds_total).
cpu_seconds_total and pod=foo-(.+)
Of the six example timeseries, this query selects IDs 1 and 33 (the cpu_seconds_total series whose pod matches the regular expression foo-(.+)).
Need high flexibility and speed
1. Any arbitrary set of dimensions/labels can be specified for filtering
2. Ideally, query speed is sub-linear in the number of timeseries
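To make the sub-linear requirement concrete, here is a minimal Go sketch of the naive alternative: visiting every timeseries and testing each label filter against it. The `Series` and `Matcher` types and the `scan` function are hypothetical names used only for illustration; they are not M3 or Prometheus APIs. The cost of `scan` grows linearly with the number of series, which is the behavior an index is meant to avoid.

```go
package main

import (
	"fmt"
	"regexp"
)

// Series is a hypothetical in-memory timeseries: an ID plus its labels.
type Series struct {
	ID     int
	Labels map[string]string
}

// Matcher is a hypothetical label filter: the value of label Name must
// fully match the regular expression Re.
type Matcher struct {
	Name string
	Re   *regexp.Regexp
}

// newMatcher anchors the pattern so it must match the whole label value.
func newMatcher(name, pattern string) Matcher {
	return Matcher{Name: name, Re: regexp.MustCompile("^(?:" + pattern + ")$")}
}

// scan visits every series and keeps those that satisfy all matchers.
// Work is proportional to len(all), i.e. linear in the number of series.
func scan(all []Series, matchers ...Matcher) []int {
	var ids []int
	for _, s := range all {
		ok := true
		for _, m := range matchers {
			v, present := s.Labels[m.Name]
			if !present || !m.Re.MatchString(v) {
				ok = false
				break
			}
		}
		if ok {
			ids = append(ids, s.ID)
		}
	}
	return ids
}

// exampleSeries returns the six series from the slides.
func exampleSeries() []Series {
	mk := func(id int, name, pod string) Series {
		return Series{ID: id, Labels: map[string]string{"__name__": name, "pod": pod}}
	}
	return []Series{
		mk(1, "cpu_seconds_total", "foo-123abc"),
		mk(8, "memory_memfree", "foo-123abc"),
		mk(33, "cpu_seconds_total", "foo-456def"),
		mk(44, "memory_memfree", "foo-456def"),
		mk(45, "cpu_seconds_total", "bar-768ghe"),
		mk(58, "memory_memfree", "bar-768ghe"),
	}
}

func main() {
	ids := scan(exampleSeries(),
		newMatcher("__name__", "cpu_seconds_total"),
		newMatcher("pod", "foo-(.+)"))
	fmt.Println(ids) // [1 33]
}
```

With six series the scan is instant; with billions, every query would touch every series, which motivates the index structures that follow.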
Timeseries column lookup
1. Secondary lookup using prefix ordered table
2. Secondary inverted index

Labels                         Timeseries ID (fingerprint)
__name__=cpu, pod=foo-123abc   1

ID   Column key                     Col value
1    __name__=cpu, pod=foo-123abc   {t=...,v=...} ➡
2    __name__=cpu, pod=foo-456def   {t=...,v=...} ➡
3    __name__=cpu, pod=bar-123abc   {t=...,v=...} ➡

Label      Label value   Timeseries IDs
__name__   cpu           1, 2, 3
pod        foo-123abc    1
pod        foo-456def    2
pod        bar-123abc    3
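The secondary inverted index from this slide can be sketched in Go as a two-level map from label name to label value to a sorted list of timeseries IDs. The `InvertedIndex` type and its `Add` and `Lookup` methods are hypothetical names for illustration, not the API of any particular database.

```go
package main

import (
	"fmt"
	"sort"
)

// InvertedIndex maps label name -> label value -> sorted timeseries IDs.
type InvertedIndex map[string]map[string][]int

// Add records that the series with the given ID carries these labels.
func (ix InvertedIndex) Add(id int, labels map[string]string) {
	for name, value := range labels {
		if ix[name] == nil {
			ix[name] = map[string][]int{}
		}
		ids := append(ix[name][value], id)
		sort.Ints(ids) // keep postings sorted; good enough for a sketch
		ix[name][value] = ids
	}
}

// Lookup returns the IDs of every series that has label name=value.
// Unknown names or values yield an empty result.
func (ix InvertedIndex) Lookup(name, value string) []int {
	return ix[name][value]
}

// buildExample indexes the three series shown on the slide.
func buildExample() InvertedIndex {
	ix := InvertedIndex{}
	ix.Add(1, map[string]string{"__name__": "cpu", "pod": "foo-123abc"})
	ix.Add(2, map[string]string{"__name__": "cpu", "pod": "foo-456def"})
	ix.Add(3, map[string]string{"__name__": "cpu", "pod": "bar-123abc"})
	return ix
}

func main() {
	ix := buildExample()
	fmt.Println(ix.Lookup("__name__", "cpu"))   // [1 2 3]
	fmt.Println(ix.Lookup("pod", "foo-456def")) // [2]
}
```

An exact-match lookup here is two map accesses regardless of how many series exist, which is the property the prefix ordered table cannot offer for arbitrary label combinations.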
Ways to keep timeseries index/data
1. Index and data live separately
The index lookup and the return of timeseries data happen in separate processes, typically with a network request between the two operations.
2. Index and data live together
The index lookup runs next to the timeseries data, and data is sent back directly once it matches the index query.
M3 storage evolution (pre-open release, 2015)
v1
[Diagram: Query nodes, Heavy read cache, Recently read cache, Already Indexed Cache, ElasticSearch, Cassandra]
1. Fetch index (ES)
2. Fetch data (C*)
Scale labels: >100 servers, >1,000 servers
M3 storage evolution (pre-open release, 2016)
v1
[Diagram: Query nodes, Heavy read cache, Recently read cache, Already Indexed Cache, ElasticSearch, Cassandra]
Scale labels: >100 servers, >1,000 servers
M3 storage evolution (pre-open release, 2016)
v2
[Diagram: Query nodes, Heavy read cache, Already Indexed Cache, ElasticSearch, M3DB (data on disk with LRU caches)]
With M3DB, 7x fewer servers than with Cassandra, while increasing the replication factor from RF=2 to RF=3
M3 storage evolution (pre-open release, 2018)
v2
[Diagram: Query nodes, Heavy read cache, Already Indexed Cache, ElasticSearch, M3DB (data on disk with LRU caches)]
M3 storage evolution (open release, 2018)
v4
All read/write caches for data/index now in M3DB nodes
[Diagram: Query nodes, M3DB (data and index on disk with LRU caches)]
Inverted index w/ Prometheus
Label      Label value   Timeseries IDs
__name__   cpu_seconds   1, 33, 45
__name__   mem_free      8, 44, 58
pod        foo-123abc    1, 8
pod        foo-456def    33, 44
pod        bar-123abc    45, 58
https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/index.md
Inverted index w/ Prometheus
ID   Timeseries
1    __name__=cpu_seconds, pod=foo-123abc
8    __name__=mem_free, pod=foo-123abc
33   __name__=cpu_seconds, pod=foo-456def
44   __name__=mem_free, pod=foo-456def
45   __name__=cpu_seconds, pod=bar-123abc
58   __name__=mem_free, pod=bar-123abc
Inverted index w/ Prometheus
● Labels (name and distinct values entries)
● Postings/Timeseries IDs
Inverted index w/ Prometheus
Matching label values
https://github.com/prometheus/prometheus/blob/38d32e06862f6b72700f67043ce574508b5697f0/tsdb/querier.go#L417-L451
    // Fetch every distinct value stored for the label named m.Name.
    vals, err := ix.LabelValues(m.Name)
    ...
    // Keep only the values that satisfy the matcher (e.g. a regular expression).
    var res []string
    for _, val := range vals {
        if m.Matches(val) {
            res = append(res, val)
        }
    }
    ...
    return ix.Postings(m.Name, res...) // Merges postings/timeseries IDs together
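The same approach can be sketched as a self-contained Go program, assuming a toy in-memory index rather than the Prometheus `tsdb` package. The `toyIndex` type, its `labelValues` and `postings` methods, and `postingsForRegexp` are hypothetical; only the shape of the loop mirrors the excerpt. Note that every distinct value of the label is tested against the pattern, so the cost grows with label cardinality.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// toyIndex is a hypothetical stand-in for an index reader:
// label name -> label value -> sorted postings (timeseries IDs).
type toyIndex map[string]map[string][]int

// labelValues returns every distinct value of a label, sorted.
func (ix toyIndex) labelValues(name string) []string {
	vals := make([]string, 0, len(ix[name]))
	for v := range ix[name] {
		vals = append(vals, v)
	}
	sort.Strings(vals)
	return vals
}

// postings merges (unions) the postings of the given label values.
func (ix toyIndex) postings(name string, values ...string) []int {
	seen := map[int]bool{}
	var out []int
	for _, v := range values {
		for _, id := range ix[name][v] {
			if !seen[id] {
				seen[id] = true
				out = append(out, id)
			}
		}
	}
	sort.Ints(out)
	return out
}

// postingsForRegexp tests every distinct label value against the
// pattern, then merges the postings of the values that matched.
func postingsForRegexp(ix toyIndex, name, pattern string) []int {
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	var res []string
	for _, val := range ix.labelValues(name) {
		if re.MatchString(val) {
			res = append(res, val)
		}
	}
	return ix.postings(name, res...)
}

// exampleIndex holds the label values and postings from the slides.
func exampleIndex() toyIndex {
	return toyIndex{
		"__name__": {"cpu_seconds": {1, 33, 45}, "mem_free": {8, 44, 58}},
		"pod":      {"foo-123abc": {1, 8}, "foo-456def": {33, 44}, "bar-123abc": {45, 58}},
	}
}

func main() {
	fmt.Println(postingsForRegexp(exampleIndex(), "pod", "foo-(.+)")) // [1 8 33 44]
}
```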
Inverted index w/ M3
1. The inverted index is more similar to ElasticSearch and Apache Lucene.
2. Instead of storing distinct label values with their associated postings, M3 stores distinct label values in an FST (Finite State Transducer).
3. Instead of storing postings/timeseries IDs as integer sets (one after another), M3 stores them as Roaring Bitmaps (compressed bitmaps) for fast intersection (across thousands of sets).
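Why bitmaps make intersection fast can be shown with a minimal Go sketch. It uses a plain, uncompressed bitset built from `uint64` words, not an actual Roaring Bitmap (which additionally compresses the bitmap); the `bitset` type and its methods are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitset is a plain, uncompressed bitmap: bit i is set when timeseries
// ID i belongs to the postings list. This is a simplification, not a
// Roaring Bitmap.
type bitset []uint64

// newBitset builds a bitset containing the given non-negative IDs.
func newBitset(ids ...int) bitset {
	var b bitset
	for _, id := range ids {
		word := id / 64
		for len(b) <= word {
			b = append(b, 0)
		}
		b[word] |= uint64(1) << uint(id%64)
	}
	return b
}

// and intersects two bitsets one 64-bit word at a time, so a single
// AND operation compares 64 candidate IDs at once.
func (b bitset) and(other bitset) bitset {
	n := len(b)
	if len(other) < n {
		n = len(other)
	}
	out := make(bitset, n)
	for i := 0; i < n; i++ {
		out[i] = b[i] & other[i]
	}
	return out
}

// ids lists the set bits in ascending order.
func (b bitset) ids() []int {
	var out []int
	for i, w := range b {
		for w != 0 {
			out = append(out, i*64+bits.TrailingZeros64(w))
			w &= w - 1 // clear the lowest set bit
		}
	}
	return out
}

func main() {
	cpu := newBitset(1, 33, 45)     // postings for __name__=cpu_seconds
	foo := newBitset(1, 8, 33, 44)  // merged postings for pod=~foo-(.+)
	fmt.Println(cpu.and(foo).ids()) // [1 33]
}
```

Intersecting sorted integer lists needs a comparison per element, while the word-wise AND handles 64 IDs per step; compression is what keeps this practical when IDs are sparse.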
What is an FST?
Like a compressed trie.
Good overview and some examples at https://blog.burntsushi.net/transducers/
Searching a data set of Wikipedia titles is more than 10x faster than grep.
This matters when you have billions of metrics, e.g. Uber with 11 billion metrics.
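The "compressed trie" intuition can be illustrated with a minimal Go sketch of a plain, uncompressed trie keyed by label value. This is not an FST, which is a far more compact structure; the `trieNode` type and its `insert` and `withPrefix` methods are hypothetical names for illustration. The point is that a prefix lookup walks only the matching branch instead of testing every distinct value.

```go
package main

import (
	"fmt"
	"sort"
)

// trieNode is one node of a plain byte-wise trie. A node that ends a
// stored label value carries that value's postings ID.
type trieNode struct {
	children map[byte]*trieNode
	terminal bool
	postings int // hypothetical ID of the postings list for this value
}

func newNode() *trieNode {
	return &trieNode{children: map[byte]*trieNode{}}
}

// insert stores value -> postings, creating nodes along the path.
func (n *trieNode) insert(value string, postings int) {
	cur := n
	for i := 0; i < len(value); i++ {
		next, ok := cur.children[value[i]]
		if !ok {
			next = newNode()
			cur.children[value[i]] = next
		}
		cur = next
	}
	cur.terminal = true
	cur.postings = postings
}

// withPrefix returns the postings IDs of every stored value that starts
// with prefix. It descends len(prefix) nodes, then visits only the
// subtree below that point.
func (n *trieNode) withPrefix(prefix string) []int {
	cur := n
	for i := 0; i < len(prefix); i++ {
		next, ok := cur.children[prefix[i]]
		if !ok {
			return nil
		}
		cur = next
	}
	var out []int
	var walk func(*trieNode)
	walk = func(t *trieNode) {
		if t.terminal {
			out = append(out, t.postings)
		}
		for _, child := range t.children {
			walk(child)
		}
	}
	walk(cur)
	sort.Ints(out) // map iteration order is random, so sort the result
	return out
}

// exampleTrie stores the three pod values used in the slides.
func exampleTrie() *trieNode {
	root := newNode()
	root.insert("foo-123abc", 0)
	root.insert("foo-456def", 1)
	root.insert("bar-123abc", 2)
	return root
}

func main() {
	fmt.Println(exampleTrie().withPrefix("foo-")) // [0 1]
}
```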
Demo
https://github.com/chronosphereiox/high_cardinality_microbenchmark
Disclaimer: This tests only one part of much bigger systems, mainly to support architectural choices, not to measure real-world performance.
Thank you to M3 contributors:
…@chronosphere.io, …@uber.com, …@aiven.io, …@cloudera.com, …@linkedin.com and many other great individuals!
Learn more (release 0.15.0 coming soon):
● Slack https://bit.ly/m3slack
● Mailing list https://groups.google.com/forum/#!forum/m3db
● GitHub https://github.com/m3db/m3
● Documentation https://m3db.io
● Chronosphere contact@chronosphere.io
Thank you, questions? Come say hi
FOSDEM 2020: Querying over millions and billions of metrics with M3DB's index

  • 1. @chronosphereio Querying millions to billions of metrics with M3DB’s index FOSDEM 2020
  • 2. @chronosphereio @roskilli Previously M3 tech lead at Uber, creator of M3DB. CTO at Chronosphere. Member of OpenMetrics.
  • 3. @chronosphereio Monitoring: what is a metric? Schema for data you would like to collect and aggregate. Name ● http_requests Dimensions/Labels ● endpoint (e.g. /api/search) ● status_code (e.g. 500) ● deploy_version_git_sha (e.g. 25149a04c)
  • 4. @chronosphereio Problem 1. Increasing number of regions, containers, k8s pods, tracking deployed version - (cardinality!) 2. Metrics can have arbitrary number of dimensions 3. Building compound index is expensive
  • 5. @chronosphereio Adding more metrics at organizations 1. We have monitoring, it’s awesome, and developers are mostly happy with standardized metrics. 2. Developers put custom metrics on everything and I am deploying tons of applications in something like Kubernetes; things are OK! 3. Things are way too on fire, we can’t manage this many things anymore, can everyone just stop please.
  • 6. @chronosphereio Timeseries Timeseries from lots of hosts and container pods ID Timeseries 1 __name__=cpu_seconds_total, pod=foo-123abc 8 __name__=memory_memfree, pod=foo-123abc 33 __name__=cpu_seconds_total, pod=foo-456def 44 __name__=memory_memfree, pod=foo-456def 45 __name__=cpu_seconds_total, pod=bar-768ghe 58 __name__=memory_memfree, pod=bar-768ghe … millions .. and if you are unfortunate... billions
  • 7. @chronosphereio Aggregate metric cpu_seconds_total Timeseries from lots of hosts and container pods ID Timeseries 1 __name__=cpu_seconds_total, pod=foo-123abc 8 __name__=memory_memfree, pod=foo-123abc 33 __name__=cpu_seconds_total, pod=foo-456def 44 __name__=memory_memfree, pod=foo-456def 45 __name__=cpu_seconds_total, pod=bar-768ghe 58 __name__=memory_memfree, pod=bar-768ghe … millions .. and if you are unfortunate... billions
  • 8. @chronosphereio cpu_seconds_total and pod=foo-(.+) Timeseries from lots of hosts and container pods ID Timeseries 1 __name__=cpu_seconds_total, pod=foo-123abc 8 __name__=memory_memfree, pod=foo-123abc 33 __name__=cpu_seconds_total, pod=foo-456def 44 __name__=memory_memfree, pod=foo-456def 45 __name__=cpu_seconds_total, pod=bar-768ghe 58 __name__=memory_memfree, pod=bar-768ghe … millions .. and if you are unfortunate... billions
  • 9. @chronosphereio Need high flexibility and speed 1. Any arbitrary set of dimensions/labels can be specified for filtering 2. Ideally speed is sub-linear
  • 10. @chronosphereio Timeseries column lookup 1. Secondary lookup using prefix ordered table 2. Secondary inverted index. Labels → Timeseries ID (fingerprint): __name__=cpu, pod=foo-123abc → 1. ID | Column key | Col value: 1 | __name__=cpu, pod=foo-123abc | {t=...,v=...} ➡; 2 | __name__=cpu, pod=foo-456def | {t=...,v=...} ➡; 3 | __name__=cpu, pod=bar-123abc | {t=...,v=...} ➡. Label | Label value | Timeseries IDs: __name__ | cpu | 1, 2, 3; pod | foo-123abc | 1; pod | foo-456def | 2; pod | bar-123abc | 3.
  • 11. @chronosphereio Ways to keep timeseries index/data 1. Index and data live separately: the index lookup and the return of timeseries data happen in different processes, typically with a network request between the two operations. 2. Index and data live together: the lookup happens next to the timeseries data, and data is sent back directly once it matches the index query.
  • 12. @chronosphereio v1 M3 storage evolution (pre-open release, 2015) Cassandra Elastic Search Already Indexed Cache Heavy read cache Query Query Query Query Recently read cache 1. Fetch index (ES) 2. Fetch data (C*) >100 servers >1,000 servers
  • 13. @chronosphereio v1 M3 storage evolution (pre-open release, 2016) Cassandra Elastic Search Already Indexed Cache Heavy read cache Query Query Query Query Recently read cache >100 servers >1,000 servers
  • 14. @chronosphereio v2 M3 storage evolution (pre-open release, 2016) M3DB (data on disk with LRU caches) Elastic Search Already Indexed Cache Heavy read cache Query Query Query Query With M3DB, 7x fewer servers than Cassandra, while increasing replication from RF=2 to RF=3
  • 15. @chronosphereio v2 M3 storage evolution (pre-open release, 2018) M3DB (data on disk with LRU caches) Elastic Search Already Indexed Cache Heavy read cache Query Query Query Query
  • 16. @chronosphereio v4 M3 storage evolution (open release, 2018) All read/write caches for data/index now in M3DB nodes. M3DB (data and index on disk with LRU caches) Query Query Query Query
  • 17. @chronosphereio Inverted index w/ Prometheus __name__: cpu_seconds → Timeseries IDs 1, 33, 45; mem_free → Timeseries IDs 8, 44, 58. pod: foo-123abc → Timeseries IDs 1, 8; foo-456def → Timeseries IDs 33, 44; bar-123abc → Timeseries IDs 45, 58. https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/index.md
  • 18. @chronosphereio Inverted index w/ Prometheus https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/index.md __name__: cpu_seconds → TS IDs 1, 33, 45; mem_free → TS IDs 8, 44, 58. pod: foo-123abc → TS IDs 1, 8; foo-456def → TS IDs 33, 44; bar-123abc → TS IDs 45, 58. ID | Timeseries: 1 | __name__=cpu_seconds, pod=foo-123abc; 8 | __name__=mem_free, pod=foo-123abc; 33 | __name__=cpu_seconds, pod=foo-456def; 44 | __name__=mem_free, pod=foo-456def; 45 | __name__=cpu_seconds, pod=bar-123abc; 58 | __name__=mem_free, pod=bar-123abc
  • 19. @chronosphereio Inverted index w/ Prometheus Labels (name and distinct values entries)
  • 20. @chronosphereio Inverted index w/ Prometheus Postings/Timeseries IDs
  • 21. @chronosphereio Inverted index w/ Prometheus Matching label values https://github.com/prometheus/prometheus/blob/38d32e06862f6b72700f67043ce574508b5697f0/tsdb/querier.go#L417-L451
        vals, err := ix.LabelValues(m.Name)
        ...
        var res []string
        for _, val := range vals {
            if m.Matches(val) {
                res = append(res, val)
            }
        }
        ...
        return ix.Postings(m.Name, res...) // Merges postings/timeseries IDs together
  • 22. @chronosphereio Inverted index w/ M3 1. Inverted index more similar to ElasticSearch & Apache Lucene. 2. Rather than storing distinct label values with associated postings, it stores distinct label values in an FST (Finite State Transducer). 3. Rather than storing postings/timeseries IDs as integer sets (one after another), it stores them as Roaring Bitmaps (compressed bitmaps) for fast intersection (across thousands of sets).
  • 23. @chronosphereio What is an FST? Like a compressed trie. Good overview and some examples at https://blog.burntsushi.net/transducers/ Searching a data set of Wikipedia titles is more than 10x faster than grep. This matters when you have billions of metrics, e.g. Uber with 11 billion metrics.
  • 24. @chronosphereio Demo https://github.com/chronosphereiox/high_cardinality_microbenchmark Disclaimer: This is only testing one part of much bigger systems, mainly to support architectural choices, not for real world performance.
  • 25. @chronosphereio Thank you to M3 contributors: …@chronosphere.io, …@uber.com, …@aiven.io, …@cloudera.com, …@linkedin.com and many other great individuals! Learn more (release 0.15.0 coming soon): ● Slack https://bit.ly/m3slack ● Mailing list https://groups.google.com/forum/#!forum/m3db ● GitHub https://github.com/m3db/m3 ● Documentation https://m3db.io ● Chronosphere [email protected] Thank you, questions? Come say hi
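
The lookup walked through on the inverted-index slides (match a label's distinct values, merge the postings of the matching values, then intersect across labels) can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not Prometheus or M3 code: the `index` type and the `postingsForRegexp`, `union`, `intersect`, and `exampleIndex` names are hypothetical, postings are plain sorted slices rather than Roaring Bitmaps, and label values are scanned linearly rather than through an FST. The data is the example index from the slides, and the query is the `cpu_seconds and pod=foo-(.+)` style filter shown earlier.

```go
package main

import (
	"fmt"
	"regexp"
)

// index maps label name -> label value -> sorted timeseries IDs (postings).
// Hypothetical type, for illustration only.
type index map[string]map[string][]uint64

// union merges two sorted postings lists, dropping duplicates.
func union(a, b []uint64) []uint64 {
	out := make([]uint64, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			out = append(out, a[i])
			i++
		case a[i] > b[j]:
			out = append(out, b[j])
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	out = append(out, a[i:]...)
	return append(out, b[j:]...)
}

// intersect returns the IDs present in both sorted postings lists.
func intersect(a, b []uint64) []uint64 {
	var out []uint64
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

// postingsForRegexp scans every distinct value of a label, keeps the values
// that fully match the pattern, and merges their postings. The scan is
// linear in the number of distinct values.
func (ix index) postingsForRegexp(name, pattern string) ([]uint64, error) {
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, err
	}
	var res []uint64
	for val, postings := range ix[name] {
		if re.MatchString(val) {
			res = union(res, postings)
		}
	}
	return res, nil
}

// exampleIndex builds the small index shown on the slides.
func exampleIndex() index {
	return index{
		"__name__": {
			"cpu_seconds": {1, 33, 45},
			"mem_free":    {8, 44, 58},
		},
		"pod": {
			"foo-123abc": {1, 8},
			"foo-456def": {33, 44},
			"bar-123abc": {45, 58},
		},
	}
}

func main() {
	ix := exampleIndex()
	pods, err := ix.postingsForRegexp("pod", "foo-(.+)")
	if err != nil {
		panic(err)
	}
	// Series named cpu_seconds whose pod label matches foo-(.+).
	fmt.Println(intersect(ix["__name__"]["cpu_seconds"], pods)) // prints [1 33]
}
```

The per-value scan in `postingsForRegexp` is the step an FST is meant to speed up, and the slice-based `union` and `intersect` are the steps compressed bitmaps are meant to speed up, which is the contrast the M3 slide draws.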