December 12-14 2023
The Building Blocks of a
Time-Series Database
Javier Ramirez
Database advocate
@supercoco9
Timestamp problems are hard
https://twitter.com/rshidashrf/status/975566803052134400
https://stackoverflow.com/questions/6841333/why-is-subtracting-these-two-epoch-milli-times-in-year-1927-giving-a-strange-r
1553983200
* This is a date. Which one?
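A minimal Python sketch of why the bare number is ambiguous: the same integer maps to different calendar dates depending on the unit (seconds or milliseconds) and the time zone you assume. The fixed UTC offsets and the helper names are illustrative assumptions, not something the slide specifies.

```python
from datetime import datetime, timedelta, timezone

RAW = 1553983200  # the value shown on the slide
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def as_seconds(raw: int, offset_hours: int = 0) -> str:
    """Read raw as seconds since the Unix epoch, rendered at a fixed UTC offset."""
    tz = timezone(timedelta(hours=offset_hours))
    return (EPOCH + timedelta(seconds=raw)).astimezone(tz).isoformat()


def as_milliseconds(raw: int) -> str:
    """Read raw as milliseconds since the Unix epoch, rendered in UTC."""
    return (EPOCH + timedelta(milliseconds=raw)).isoformat()


if __name__ == "__main__":
    print(as_seconds(RAW))      # 2019-03-30T22:00:00+00:00
    print(as_seconds(RAW, 2))   # 2019-03-31T00:00:00+02:00 (already the next day)
    print(as_seconds(RAW, -5))  # 2019-03-30T17:00:00-05:00
    print(as_milliseconds(RAW)) # 1970-01-18T23:39:43.200000+00:00
```

Storing the unit and the time zone (or always UTC) next to the value removes the guesswork.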
When is this?
1/4/19
* Is this April 19th? January 4th? April 1st?
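The three readings can be spelled out in a short Python sketch. The function name and the assumption that two-digit years belong to the 2000s are illustrative choices, not part of the slide.

```python
from datetime import date


def interpretations(text: str) -> dict[str, date]:
    """Return the calendar dates a slash-separated string could mean."""
    a, b, c = (int(part) for part in text.split("/"))
    century = 2000  # assumption: two-digit years are in the 2000s
    return {
        "month/day/year": date(century + c, a, b),
        "day/month/year": date(century + c, b, a),
        "year/month/day": date(century + a, b, c),
    }


if __name__ == "__main__":
    for layout, value in interpretations("1/4/19").items():
        print(f"{layout}: {value.isoformat()}")
```

An unambiguous format such as ISO 8601 (2019-01-04) avoids having to guess the layout.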
Working with timestamped data in a
database is tricky*
* Especially when running analytics on data that changes over time or arrives at a high rate
If you can use only one
database for everything, go
with PostgreSQL*
* Or any other major, well-supported RDBMS
Some things an RDBMS is not designed for
● Writing data faster than it is read (several million inserts per day or more)
● Aggregations scoped to different time units (per year/minute/microsecond)
● Identifying gaps or missing data for a given interval
● Joining tables by approximate timestamp
● Sparse data (tables with hundreds or thousands of columns)
● Aggregates over billions of records
QuestDB: The building blocks of a fast open-source time-series database
● a factory floor with 500 machines, or
● a fleet with 500 vehicles, or
● 50 trains, with 10 cars each, or
● 500 users with a mobile phone
Sending data every second
How I made my first billion
86,400
* Seconds in one day
604,800
* Seconds in one week
2,628,288
* Seconds in one month. Well, in the average month of 30.42 days anyway
43,200,000 rows a day…….
302,400,000 rows a week….
1,314,144,000 rows a month
How I made my first billion
* See? With streaming data, it is kind of easy to get your first billion data points
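The arithmetic behind those numbers, as a small Python sketch. It assumes 500 sources each sending one row per second, as in the scenarios above; the constant and function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY
# Average month of 30.42 days, kept in integers (3042 hundredths of a day).
SECONDS_PER_MONTH = 3042 * SECONDS_PER_DAY // 100

SOURCES = 500  # machines, vehicles, train cars, or phones


def rows(sources: int, seconds: int) -> int:
    """Rows produced when every source sends one row per second."""
    return sources * seconds


if __name__ == "__main__":
    print(f"{rows(SOURCES, SECONDS_PER_DAY):,} rows a day")
    print(f"{rows(SOURCES, SECONDS_PER_WEEK):,} rows a week")
    print(f"{rows(SOURCES, SECONDS_PER_MONTH):,} rows a month")
```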
Not all data
problems are
the same
Time-series database basics
● Optimised for fast append-only ingestion
● Data lifecycle policies
● Analytics over chunks of time
● Time-based aggregations
● Often power real-time dashboards
QuestDB would like to be known for:
● Performance
○ Also with smaller machines
● Developer Experience
● Proudly Open Source (Apache 2.0)
Fast streaming
ingestion
* You can try ingesting streaming data using https://github.com/javier/questdb-quickstart
QuestDB ingestion and storage layer
● Data is always stored in increasing timestamp order.
● Data partitioned by time units and stored in columnar format.
● No indexes needed. Data is immediately available after writing.
● Predictable ingestion rate, even under demanding workloads (millions/second).
● Built-in event deduplication.
● Optimized data types (Symbol, geohash, ipv4, uuid).
● Row updates and upserts supported.
Lifecycle policies
ALTER TABLE my_table DROP PARTITION LIST '2021-01-01', '2021-01-02';
--Delete days before 2021-01-03
ALTER TABLE my_table DROP PARTITION WHERE timestamp < to_timestamp('2021-01-03', 'yyyy-MM-dd');
ALTER TABLE x DETACH PARTITION LIST '2019-02-01', '2019-02-02';
-- It is also possible to use WHERE clause to define the partition list
ALTER TABLE sensors DETACH PARTITION WHERE timestamp < '2019-02-03T00';
CREATE TABLE my_table (i symbol, ts timestamp) IN VOLUME SECONDARY_VOLUME;
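A retention job could generate the DROP PARTITION LIST statement shown above. The following Python sketch only builds the SQL string; it assumes the table is partitioned by day, that the table name comes from trusted configuration, and the function name and parameters are hypothetical.

```python
from datetime import date, timedelta


def drop_partition_statement(table: str, today: date, keep_days: int, oldest: date) -> str:
    """Build a DROP PARTITION LIST statement for daily partitions older than the retention window.

    Returns an empty string when nothing is old enough to drop.
    """
    cutoff = today - timedelta(days=keep_days)
    days = []
    current = oldest
    while current < cutoff:
        days.append(current.isoformat())
        current += timedelta(days=1)
    if not days:
        return ""
    quoted = ", ".join(f"'{day}'" for day in days)
    return f"ALTER TABLE {table} DROP PARTITION LIST {quoted};"


if __name__ == "__main__":
    print(drop_partition_statement("my_table", date(2021, 1, 10), 7, date(2021, 1, 1)))
```

The WHERE form of the statement achieves the same result without listing partitions one by one.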
Connectivity, protocols, and interfaces
● REST API and web console: Query execution, CSV imports/exports. Basic charts.
● Pgwire: perfect for querying, DDL, and DML. Ingestion supported, up to moderate
throughput. Compatible with any low-level PostgreSQL client or library.
● InfluxDB Line Protocol (ILP): socket-based, ingestion only, very high throughput. Official clients
available for C/C++, Java, Python, Rust, Go, Node.js, and .NET.
● Health/Metrics: HTTP endpoint with Prometheus format
● Integrations with: Apache Kafka, Apache Flink, Apache Spark, Python Pandas, Grafana,
Superset, Telegraf, Redpanda, qStudio, SQLAlchemy, Cube…
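To give a feel for what travels over the ILP socket, here is a rough Python sketch that formats one row as a line-protocol message: table name, comma-separated symbols (tags), a space, comma-separated fields, a space, and a nanosecond timestamp. The helper `to_ilp_line` is hypothetical and skips the escaping of spaces and commas in names and symbol values, so the official clients remain the better choice for real ingestion.

```python
def to_ilp_line(table: str, symbols: dict, fields: dict, timestamp_ns: int) -> str:
    """Format one row as a line-protocol message (simplified, no name escaping)."""

    def fmt(value) -> str:
        # bool must be checked before int because bool is a subclass of int
        if isinstance(value, bool):
            return "t" if value else "f"
        if isinstance(value, int):
            return f"{value}i"  # integers carry an 'i' suffix
        if isinstance(value, float):
            return repr(value)
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'  # strings are double-quoted

    symbol_part = "".join(f",{key}={value}" for key, value in symbols.items())
    field_part = ",".join(f"{key}={fmt(value)}" for key, value in fields.items())
    return f"{table}{symbol_part} {field_part} {timestamp_ns}"


if __name__ == "__main__":
    print(to_ilp_line(
        "trades",
        {"symbol": "BTC-USD", "side": "buy"},
        {"price": 42000.5, "amount": 2},
        1553983200000000000,
    ))
```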
The query engine
QuestDB Query engine internals
● Our Java codebase has zero dependencies. No garbage collection on
the hot path. As close to the hardware as possible.
● We research the latest trends. Our code takes advantage of the
state-of-the-art in CPU, storage design, and data structures.
● We implement our own just-in-time (JIT) compiler to make query execution
as parallel and fast as possible.
● We spend weeks of development to save microseconds or
nanoseconds in many operations.
The query language: SQL with
time-series extensions
LATEST ON … PARTITION BY …
Retrieves the latest entry by timestamp for a given key or combination of keys, for scenarios where multiple
time series are stored in the same table.
SELECT * FROM trades
LATEST ON timestamp PARTITION BY symbol;
Try it live on
https://demo.questdb.io
LATEST ON … PARTITION BY …
Retrieves the latest entry by timestamp for a given key or combination of keys, for scenarios where multiple
time series are stored in the same table.
SELECT * FROM trades
WHERE symbol in ('BTC-USD', 'ETH-USD')
LATEST ON timestamp PARTITION BY symbol, side;
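The semantics of LATEST ON can be modelled in a few lines of Python. This is only a sketch of the result the query returns, not of how QuestDB executes it; the row layout and the function name are assumptions for illustration.

```python
def latest_on(rows, timestamp_key, partition_keys):
    """Keep, for every distinct combination of partition keys, the row with the newest timestamp."""
    latest = {}
    for row in rows:
        key = tuple(row[name] for name in partition_keys)
        if key not in latest or row[timestamp_key] >= latest[key][timestamp_key]:
            latest[key] = row
    return list(latest.values())


if __name__ == "__main__":
    trades = [
        {"ts": 1, "symbol": "BTC-USD", "side": "buy", "price": 100.0},
        {"ts": 2, "symbol": "ETH-USD", "side": "sell", "price": 10.0},
        {"ts": 3, "symbol": "BTC-USD", "side": "buy", "price": 101.0},
        {"ts": 4, "symbol": "BTC-USD", "side": "sell", "price": 102.0},
    ]
    print(latest_on(trades, "ts", ["symbol"]))          # one row per symbol
    print(latest_on(trades, "ts", ["symbol", "side"]))  # one row per symbol and side
```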
SAMPLE BY
Aggregates data in homogeneous time chunks
SELECT
timestamp,
sum(price * amount) / sum(amount) AS vwap_price,
sum(amount) AS volume
FROM trades
WHERE symbol = 'BTC-USD' AND timestamp > dateadd('d', -1, now())
SAMPLE BY 15m ALIGN TO CALENDAR;
SELECT timestamp, min(tempF),
max(tempF), avg(tempF)
FROM weather SAMPLE BY 1M;
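What the first SAMPLE BY query computes can be sketched in Python: each trade is floored to the start of its time bucket, and the volume-weighted average price and total volume are accumulated per bucket. The tuple layout and function name are assumptions, and flooring epoch seconds only approximates calendar alignment in UTC.

```python
from collections import defaultdict


def sample_by_vwap(trades, bucket_seconds):
    """trades: iterable of (epoch_seconds, price, amount).

    Returns a list of (bucket_start, vwap_price, volume), ordered by bucket.
    """
    notional = defaultdict(float)
    volume = defaultdict(float)
    for ts, price, amount in trades:
        bucket = ts - ts % bucket_seconds  # floor to the start of the bucket
        notional[bucket] += price * amount
        volume[bucket] += amount
    # Buckets whose total volume is zero would divide by zero; this sketch assumes positive amounts.
    return [(b, notional[b] / volume[b], volume[b]) for b in sorted(volume)]


if __name__ == "__main__":
    fifteen_minutes = 15 * 60
    trades = [(0, 100.0, 1.0), (600, 200.0, 3.0), (900, 50.0, 2.0)]
    for row in sample_by_vwap(trades, fifteen_minutes):
        print(row)
```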
How do you ask your database to
return which data is not stored?
I am sending data every second or
so. Tell me which devices had a gap
of more than 1.5 seconds between
readings
SAMPLE BY … FILL
Can fill missing time chunks using different strategies (NULL, constant, LINEAR, PREVious value)
SELECT
timestamp,
sum(price * amount) / sum(amount) AS vwap_price,
sum(amount) AS volume
FROM trades
WHERE symbol = 'BTC-USD' AND timestamp > dateadd('d', -1, now())
SAMPLE BY 1s FILL(NULL) ALIGN TO CALENDAR;
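The idea behind FILL can be sketched in Python: walk every expected bucket in the range and, where no data exists, emit either a null or the previous value. Buckets that come back as null are exactly the gaps asked about earlier. The function names and the dict-based input are assumptions for illustration; only the NULL and PREV strategies are modelled.

```python
def fill_buckets(sampled, start, end, step, strategy="null"):
    """sampled: dict mapping bucket_start -> aggregated value, only for buckets that have data.

    Returns one (bucket_start, value) pair for every bucket in [start, end).
    """
    if strategy not in ("null", "prev"):
        raise ValueError(f"unsupported strategy: {strategy}")
    filled = []
    previous = None
    for bucket in range(start, end, step):
        if bucket in sampled:
            previous = sampled[bucket]
            filled.append((bucket, previous))
        elif strategy == "prev":
            filled.append((bucket, previous))  # carry the last seen value forward
        else:
            filled.append((bucket, None))      # mark the gap explicitly
    return filled


def missing_buckets(filled):
    """Bucket starts for which no data was stored."""
    return [bucket for bucket, value in filled if value is None]


if __name__ == "__main__":
    per_second = {0: 5.0, 2: 7.0}  # seconds 1 and 3 have no data
    print(fill_buckets(per_second, 0, 4, 1, "null"))
    print(fill_buckets(per_second, 0, 4, 1, "prev"))
    print(missing_buckets(fill_buckets(per_second, 0, 4, 1, "null")))
```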
WHERE … TIME RANGE
SELECT * from trips WHERE pickup_datetime in '2018';
SELECT * from trips WHERE pickup_datetime in '2018-06';
SELECT * from trips WHERE pickup_datetime in '2018-06-21T23:59';
SELECT * from trips WHERE pickup_datetime in '2018;2M' LIMIT -10;
SELECT * from trips WHERE pickup_datetime in '2018;10s' LIMIT -10;
SELECT * from trips WHERE pickup_datetime in '2018;-3d' LIMIT -10;
SELECT * from trips WHERE pickup_datetime in '2018-06-21T23:59:58;4s;1d;7';
SELECT * from trips WHERE pickup_datetime in '2018-06-21T23:59:58;4s;-1d;7';
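One way to read the last two queries, as a hedged Python sketch: the string names a start timestamp, a duration (4s), a period between repetitions (1d or -1d), and a repetition count (7). The helper below expands that into explicit windows. It reflects my reading of the modifiers and does not reproduce QuestDB's exact boundary handling; the function name is hypothetical.

```python
from datetime import datetime, timedelta


def repeating_windows(start: datetime, duration: timedelta, period: timedelta, count: int):
    """Expand start;duration;period;count into a list of (window_start, window_end) pairs."""
    return [
        (start + i * period, start + i * period + duration)
        for i in range(count)
    ]


if __name__ == "__main__":
    start = datetime(2018, 6, 21, 23, 59, 58)
    # Forward: 4-second windows on seven consecutive days
    for window in repeating_windows(start, timedelta(seconds=4), timedelta(days=1), 7):
        print(window)
    # Backward: the same windows on the seven days ending at the start date
    for window in repeating_windows(start, timedelta(seconds=4), timedelta(days=-1), 7):
        print(window)
```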
What if I have two tables, where
data is (obviously) not sent at the
same exact timestamps and I want
to join by closest matching
timestamp?
ASOF JOIN (LT JOIN and SPLICE JOIN variations)
ASOF JOIN joins two different time series. For each row in the first time series, ASOF JOIN takes from
the second time series the row whose timestamp meets both of the following criteria:
● The timestamp is the closest to the first timestamp.
● The timestamp is prior to or equal to the first timestamp.
WITH trips2016 AS (
SELECT * from trips WHERE pickup_datetime in '2016'
)
SELECT pickup_datetime, timestamp, fare_amount, tempF, windDir
FROM trips2016
ASOF JOIN weather;
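The two criteria can be illustrated with a small Python sketch: for each left-hand row, a binary search finds the last right-hand row whose timestamp is less than or equal to it. This models the result only, assumes both inputs are already sorted by timestamp, and uses made-up data and names.

```python
from bisect import bisect_right


def asof_join(left, right):
    """left, right: lists of (timestamp, payload), each sorted by timestamp.

    Returns (timestamp, left_payload, matching_right_row_or_None) per left row.
    """
    right_timestamps = [ts for ts, _ in right]
    joined = []
    for ts, payload in left:
        idx = bisect_right(right_timestamps, ts) - 1  # last right row with timestamp <= ts
        match = right[idx] if idx >= 0 else None      # no earlier row: nothing to join
        joined.append((ts, payload, match))
    return joined


if __name__ == "__main__":
    trips = [(5, "trip-a"), (10, "trip-b"), (25, "trip-c")]
    weather = [(10, "reading-1"), (20, "reading-2")]
    for row in asof_join(trips, weather):
        print(row)
```

The LT JOIN variation would differ only in requiring a strictly earlier timestamp, which here would mean using `bisect_left` instead of `bisect_right`.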
Some things we are trying out next for performance
● Compression, and exploring data formats like Arrow/Parquet
● Own ingestion protocol
● Second-level partitioning
● Improved vectorization of some operations (group by multiple columns or by expressions)
● Specific join optimizations (index nested loop joins, for example)
QuestDB OSS
Open Source. Self-managed. Suitable for
production workloads.
https://github.com/questdb/questdb
QuestDB Enterprise
Licensed. Self-managed. Enterprise features like
RBAC, compression, replication, TLS on all
protocols, cold storage, K8s operator…
https://questdb.io/enterprise/
QuestDB Cloud
Fully managed, pay per usage environment,
with enterprise-grade features.
https://questdb.io/cloud/
OSA CON | December 12-14 2023
Q&A
● github.com/questdb/questdb
● https://questdb.io
● https://demo.questdb.io
● https://github.com/javier/questdb-quickstart
● https://slack.questdb.io/
Javier Ramirez
@supercoco9
We 💕 contributions
and GitHub ⭐ stars
Processing and analysing streaming data with Python. Pycon Italy 2022
javier ramirez
 
Servicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en AragónServicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en Aragón
javier ramirez
 
Primeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverlessPrimeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverless
javier ramirez
 
How AWS is reinventing the cloud
How AWS is reinventing the cloudHow AWS is reinventing the cloud
How AWS is reinventing the cloud
javier ramirez
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
javier ramirez
 
Getting started with streaming analytics
Getting started with streaming analyticsGetting started with streaming analytics
Getting started with streaming analytics
javier ramirez
 
Getting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipelineGetting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipeline
javier ramirez
 
Getting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep DiveGetting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep Dive
javier ramirez
 
Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)
javier ramirez
 
Monitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWSMonitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWS
javier ramirez
 
Consulta cualquier fuente de datos usando SQL con Amazon Athena y sus consult...
Consulta cualquier fuente de datos usando SQL con Amazon Athena y sus consult...Consulta cualquier fuente de datos usando SQL con Amazon Athena y sus consult...
Consulta cualquier fuente de datos usando SQL con Amazon Athena y sus consult...
javier ramirez
 
Recomendaciones, predicciones y detección de fraude usando servicios de intel...
Recomendaciones, predicciones y detección de fraude usando servicios de intel...Recomendaciones, predicciones y detección de fraude usando servicios de intel...
Recomendaciones, predicciones y detección de fraude usando servicios de intel...
javier ramirez
 
Open Distro for ElasticSearch and how Grimoire is using it. Madrid DevOps Oct...
Open Distro for ElasticSearch and how Grimoire is using it. Madrid DevOps Oct...Open Distro for ElasticSearch and how Grimoire is using it. Madrid DevOps Oct...
Open Distro for ElasticSearch and how Grimoire is using it. Madrid DevOps Oct...
javier ramirez
 
QuestDB: The building blocks of a fast open-source time-series database

  • 1. December 12-14 2023. The Building Blocks of a Time-Series Database. Javier Ramirez, Database advocate, @supercoco9
  • 5. 1553983200 * This is a date. Which one?
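The question on this slide can be made concrete with a minimal sketch. It assumes the number is a Unix epoch value and shows how much the answer depends on the unit; `from_epoch` is a hypothetical helper, not part of any library, and everything is kept in UTC on purpose.

```python
from datetime import datetime, timedelta, timezone

# Reference point for Unix time, pinned to UTC to avoid local-zone surprises.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch(value: int, unit: str) -> datetime:
    """Interpret an integer epoch value in the given unit ("s" or "ms")."""
    if unit == "s":
        return EPOCH + timedelta(seconds=value)
    if unit == "ms":
        return EPOCH + timedelta(milliseconds=value)
    raise ValueError(f"unsupported unit: {unit}")


as_seconds = from_epoch(1553983200, "s")
as_millis = from_epoch(1553983200, "ms")

print(as_seconds.isoformat())  # 2019-03-30T22:00:00+00:00
print(as_millis.isoformat())   # 1970-01-18T23:39:43.200000+00:00
```

The same integer lands in 2019 or in 1970 depending on the assumed unit, and the 2019 reading still needs a time zone before anyone can say what the local wall clock showed.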
  • 6. When is this? 1/4/19 * Is this April 19th? January 4th? April 1st?
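A small sketch of the ambiguity, using only the standard library. The two format strings are assumptions about what the sender might have meant; nothing in the string itself says which one is right.

```python
from datetime import date, datetime

raw = "1/4/19"

# Same text, two common conventions, two different days.
us_style = datetime.strptime(raw, "%m/%d/%y").date()  # month/day/year
eu_style = datetime.strptime(raw, "%d/%m/%y").date()  # day/month/year

print(us_style.isoformat())  # 2019-01-04
print(eu_style.isoformat())  # 2019-04-01

# An ISO 8601 string removes the guesswork when exchanging dates.
unambiguous = date(2019, 4, 1).isoformat()
print(unambiguous)  # 2019-04-01
```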
  • 7. Working with timestamped data in a database is tricky* * especially when running analytics over data that changes over time or arrives at a high rate
  • 8. If you can use only one database for everything, go with PostgreSQL* * Or any other major, well-supported RDBMS
  • 9. Some things an RDBMS is not designed for
      ● Writing data faster than it is read (several million inserts per day and faster)
      ● Aggregations scoped to different time units (per year/minute/microsecond)
      ● Identifying gaps or missing data for a given interval
      ● Joining tables by approximate timestamp
      ● Sparse data (tables with hundreds or thousands of columns)
      ● Aggregates over billions of records
  • 11. How I made my first billion
      ● a factory floor with 500 machines, or
      ● a fleet with 500 vehicles, or
      ● 50 trains, with 10 cars each, or
      ● 500 users with a mobile phone
      Sending data every second
  • 14. 2,628,288 * Seconds in one month. Well, in the average month of 30.42 days anyway
  • 15. How I made my first billion: 43,200,000 rows a day, 302,400,000 rows a week, 1,314,144,000 rows a month. * See? With streaming data, it is fairly easy to reach your first billion data points
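The row counts on this slide follow from 500 senders emitting one row per second. A quick check of the arithmetic, using the 30.42-day average month quoted earlier (the constant names are just for illustration):

```python
SENDERS = 500                               # machines, vehicles, train cars, or phones
SECONDS_PER_DAY = 86_400
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY      # 604,800
# 30.42 days expressed with integers to avoid floating-point rounding.
SECONDS_PER_MONTH = 3042 * SECONDS_PER_DAY // 100  # 2,628,288

rows_per_day = SENDERS * SECONDS_PER_DAY
rows_per_week = SENDERS * SECONDS_PER_WEEK
rows_per_month = SENDERS * SECONDS_PER_MONTH

print(f"{rows_per_day:,} rows a day")      # 43,200,000 rows a day
print(f"{rows_per_week:,} rows a week")    # 302,400,000 rows a week
print(f"{rows_per_month:,} rows a month")  # 1,314,144,000 rows a month
```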
  • 16. Not all data problems are the same
  • 17. Time-series database basics
      ● Optimised for fast append-only ingestion
      ● Data lifecycle policies
      ● Analytics over chunks of time
      ● Time-based aggregations
      ● Often power real-time dashboards
  • 19. QuestDB would like to be known for:
      ● Performance
        ○ Also with smaller machines
      ● Developer Experience
      ● Proudly Open Source (Apache 2.0)
  • 20. Fast streaming ingestion * You can try ingesting streaming data using https://github.com/javier/questdb-quickstart
  • 21. QuestDB ingestion and storage layer
      ● Data always stored by incremental timestamp.
      ● Data partitioned by time units and stored in columnar format.
      ● No indexes needed. Data is immediately available after writing.
      ● Predictable ingestion rate, even under demanding workloads (millions/second).
      ● Built-in event deduplication.
      ● Optimized data types (Symbol, geohash, ipv4, uuid).
      ● Row updates and upserts supported.
  • 22. Lifecycle policies
      ALTER TABLE my_table DROP PARTITION LIST '2021-01-01', '2021-01-02';
      -- Delete days before 2021-01-03
      ALTER TABLE my_table DROP PARTITION WHERE timestamp < to_timestamp('2021-01-03', 'yyyy-MM-dd');
      ALTER TABLE x DETACH PARTITION LIST '2019-02-01', '2019-02-02';
      -- It is also possible to use a WHERE clause to define the partition list
      ALTER TABLE sensors DETACH PARTITION WHERE timestamp < '2019-02-03T00';
      CREATE TABLE my_table (i symbol, ts timestamp) IN VOLUME SECONDARY_VOLUME;
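To illustrate the idea behind `DROP PARTITION WHERE timestamp < ...`, here is a toy model of a retention policy. It assumes daily partitions named `YYYY-MM-DD` and is not how QuestDB implements it; `partitions_to_drop` is a hypothetical helper.

```python
from datetime import date


def partitions_to_drop(partitions: list[str], keep_from: str) -> list[str]:
    """Return the daily partitions that fall strictly before the cutoff day."""
    cutoff = date.fromisoformat(keep_from)
    return [name for name in partitions if date.fromisoformat(name) < cutoff]


existing = ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"]
expired = partitions_to_drop(existing, keep_from="2021-01-03")
remaining = [name for name in existing if name not in expired]

print(expired)    # ['2021-01-01', '2021-01-02']
print(remaining)  # ['2021-01-03', '2021-01-04']
```

Because data is partitioned by time, expiring old data amounts to removing whole partitions rather than deleting rows one by one.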
  • 23. Connectivity, protocols, and interfaces
      ● REST API and web console: query execution, CSV imports/exports, basic charts.
      ● Pgwire: perfect for querying, DDL, and DML. Ingestion supported, up to moderate throughput. Compatible with any low-level PostgreSQL client or library.
      ● InfluxDB Line Protocol (ILP): socket-based, ingestion only, very high throughput. Official clients available for C/C++, Java, Python, Rust, Go, Node.js, and .NET.
      ● Health/Metrics: HTTP endpoint with Prometheus format.
      ● Integrations with: Apache Kafka, Apache Flink, Apache Spark, Python Pandas, Grafana, Superset, Telegraf, Redpanda, qStudio, SQLAlchemy, Cube…
  • 25. QuestDB Query engine internals
      ● Our Java codebase has zero dependencies. No garbage collection on the hot path. As close to the hardware as possible.
      ● We research the latest trends. Our code takes advantage of the state of the art in CPU, storage design, and data structures.
      ● We implement our own just-in-time (JIT) compiler to make query execution as parallel and fast as possible.
      ● We spend weeks of development to save microseconds or nanoseconds in many operations.
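The ILP bullet above describes a text protocol sent over a socket. As a rough sketch of what one message looks like (not the official client API), the function below assembles a single line: table name, comma-separated symbols, fields, and a nanosecond timestamp. `ilp_line` is a hypothetical helper, the table and column names are borrowed from the `trades` examples in this deck, and escaping of spaces or commas inside names is deliberately left out.

```python
def ilp_line(table: str, symbols: dict, fields: dict, ts_ns: int) -> str:
    """Build one line-protocol message (simplified: no escaping of names)."""

    def fmt(value):
        # bool must be checked before int, since bool is a subclass of int.
        if isinstance(value, bool):
            return "t" if value else "f"
        if isinstance(value, int):
            return f"{value}i"  # integers carry an 'i' suffix
        if isinstance(value, float):
            return repr(value)
        return '"' + str(value).replace('"', '\\"') + '"'  # quoted string

    tag_part = "".join(f",{key}={val}" for key, val in symbols.items())
    field_part = ",".join(f"{key}={fmt(val)}" for key, val in fields.items())
    return f"{table}{tag_part} {field_part} {ts_ns}\n"


line = ilp_line(
    "trades",
    {"symbol": "BTC-USD", "side": "buy"},
    {"price": 39269.98, "amount": 0.001},
    1646762637710419000,
)
print(line, end="")
# trades,symbol=BTC-USD,side=buy price=39269.98,amount=0.001 1646762637710419000
```

In practice one of the official clients listed above would build and send these lines, including the escaping and batching omitted here.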
  • 26. The query language: SQL with time-series extensions
  • 27. LATEST ON … PARTITION BY …
      Retrieves the latest entry by timestamp for a given key or combination of keys, for scenarios where multiple time series are stored in the same table.
      SELECT * FROM trades
      LATEST ON timestamp PARTITION BY symbol;
      Try it live on https://demo.questdb.io
  • 28. LATEST ON … PARTITION BY …
      Retrieves the latest entry by timestamp for a given key or combination of keys, for scenarios where multiple time series are stored in the same table.
      SELECT * FROM trades
      WHERE symbol in ('BTC-USD', 'ETH-USD')
      LATEST ON timestamp PARTITION BY symbol, side;
      Try it live on https://demo.questdb.io
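What `LATEST ON … PARTITION BY …` returns can be mimicked in a few lines of plain Python. This is only a model of the semantics over an in-memory list, not a description of how QuestDB executes the query; `latest_on` and the sample rows are made up for illustration.

```python
def latest_on(rows, ts_key, partition_keys):
    """Keep, for each combination of partition keys, the row with the latest timestamp."""
    latest = {}
    for row in rows:
        key = tuple(row[name] for name in partition_keys)
        if key not in latest or row[ts_key] >= latest[key][ts_key]:
            latest[key] = row
    return sorted(latest.values(), key=lambda row: row[ts_key])


trades = [
    {"timestamp": 1, "symbol": "BTC-USD", "side": "buy", "price": 100.0},
    {"timestamp": 2, "symbol": "ETH-USD", "side": "buy", "price": 10.0},
    {"timestamp": 3, "symbol": "BTC-USD", "side": "sell", "price": 101.0},
    {"timestamp": 4, "symbol": "BTC-USD", "side": "buy", "price": 102.0},
]

by_symbol = latest_on(trades, "timestamp", ["symbol"])
by_symbol_side = latest_on(trades, "timestamp", ["symbol", "side"])

print([row["timestamp"] for row in by_symbol])       # [2, 4]
print([row["timestamp"] for row in by_symbol_side])  # [2, 3, 4]
```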
  • 29. SAMPLE BY
      Aggregates data in homogeneous time chunks
      SELECT timestamp, sum(price * amount) / sum(amount) AS vwap_price, sum(amount) AS volume
      FROM trades
      WHERE symbol = 'BTC-USD' AND timestamp > dateadd('d', -1, now())
      SAMPLE BY 15m ALIGN TO CALENDAR;
      SELECT timestamp, min(tempF), max(tempF), avg(tempF)
      FROM weather
      SAMPLE BY 1M;
      Try it live on https://demo.questdb.io
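The first query computes a volume-weighted average price (VWAP) per 15-minute chunk. A minimal sketch of the same aggregation over in-memory tuples follows; it assumes timestamps are epoch seconds and aligns buckets to multiples of the bucket size, which only approximates `ALIGN TO CALENDAR`. `sample_by_vwap` is a hypothetical name.

```python
from collections import defaultdict


def sample_by_vwap(trades, bucket_seconds):
    """trades: iterable of (epoch_seconds, price, amount) tuples.

    Returns (bucket_start, vwap_price, volume) per non-empty bucket, in time order.
    """
    notional = defaultdict(float)  # sum(price * amount) per bucket
    volume = defaultdict(float)    # sum(amount) per bucket
    for ts, price, amount in trades:
        bucket = ts - ts % bucket_seconds  # floor to the start of the bucket
        notional[bucket] += price * amount
        volume[bucket] += amount
    return [(b, notional[b] / volume[b], volume[b]) for b in sorted(volume)]


sample = [(0, 100.0, 1.0), (600, 110.0, 3.0), (900, 200.0, 2.0)]
buckets = sample_by_vwap(sample, bucket_seconds=15 * 60)
print(buckets)  # [(0, 107.5, 4.0), (900, 200.0, 2.0)]
```

Note that buckets with no trades simply do not appear in the result.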
  • 30. How do you ask your database which data is not stored?
  • 31. I am sending data every second or so. Tell me which devices went more than 1.5 seconds without sending any data
  • 32. SAMPLE BY … FILL
      Can fill missing time chunks using different strategies (NULL, constant, LINEAR, PREV for the previous value)
      SELECT timestamp, sum(price * amount) / sum(amount) AS vwap_price, sum(amount) AS volume
      FROM trades
      WHERE symbol = 'BTC-USD' AND timestamp > dateadd('d', -1, now())
      SAMPLE BY 1s FILL(NULL) ALIGN TO CALENDAR;
      Try it live on https://demo.questdb.io
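A toy model of what `FILL` adds on top of `SAMPLE BY`: every bucket in the requested range is emitted, and the empty ones get a placeholder. Only the NULL and PREV strategies are sketched here; `sample_by_fill` and its parameters are hypothetical, and buckets are plain integers standing in for seconds.

```python
def sample_by_fill(points, start, end, step, fill="null"):
    """points: dict mapping bucket_start -> aggregated value.

    Emits one (bucket_start, value) pair for every bucket in [start, end).
    """
    filled = []
    previous = None
    for bucket in range(start, end, step):
        if bucket in points:
            previous = points[bucket]
            filled.append((bucket, previous))
        elif fill == "prev":
            filled.append((bucket, previous))  # carry the last seen value forward
        else:
            filled.append((bucket, None))      # explicit NULL for the missing chunk
    return filled


readings = {0: 20.5, 1: 20.7, 4: 21.0}  # seconds 2 and 3 are missing

with_nulls = sample_by_fill(readings, start=0, end=5, step=1)
with_prev = sample_by_fill(readings, start=0, end=5, step=1, fill="prev")
missing = [bucket for bucket, value in with_nulls if value is None]

print(with_nulls)  # [(0, 20.5), (1, 20.7), (2, None), (3, None), (4, 21.0)]
print(with_prev)   # [(0, 20.5), (1, 20.7), (2, 20.7), (3, 20.7), (4, 21.0)]
print(missing)     # [2, 3]
```

Filling with NULL is what makes gaps queryable: the missing seconds become rows that can be filtered on.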
  • 35. WHERE … TIME RANGE
      SELECT * from trips WHERE pickup_datetime in '2018';
      SELECT * from trips WHERE pickup_datetime in '2018-06';
      SELECT * from trips WHERE pickup_datetime in '2018-06-21T23:59';
      SELECT * from trips WHERE pickup_datetime in '2018;2M' LIMIT -10;
      SELECT * from trips WHERE pickup_datetime in '2018;10s' LIMIT -10;
      SELECT * from trips WHERE pickup_datetime in '2018;-3d' LIMIT -10;
      SELECT * from trips WHERE pickup_datetime in '2018-06-21T23:59:58;4s;1d;7';
      SELECT * from trips WHERE pickup_datetime in '2018-06-21T23:59:58;4s;-1d;7';
      Try it live on https://demo.questdb.io
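The first queries rely on a partial timestamp standing for a whole period: `'2018'` matches the entire year and `'2018-06'` the entire month. A simplified sketch of that expansion is below. It handles only year, month, and day precision, returns a half-open range, and ignores the `;modifier` syntax shown in the later queries; `partial_to_range` is a hypothetical helper, not QuestDB code.

```python
from datetime import datetime, timedelta


def partial_to_range(text: str):
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into a half-open [start, end) range."""
    parts = text.split("-")
    year = int(parts[0])
    if len(parts) == 1:
        return datetime(year, 1, 1), datetime(year + 1, 1, 1)
    month = int(parts[1])
    if len(parts) == 2:
        # Roll over to January of the next year after December.
        end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
        return datetime(year, month, 1), end
    start = datetime(year, month, int(parts[2]))
    return start, start + timedelta(days=1)


year_range = partial_to_range("2018")
month_range = partial_to_range("2018-06")
print(year_range[0], year_range[1])    # 2018-01-01 00:00:00 2019-01-01 00:00:00
print(month_range[0], month_range[1])  # 2018-06-01 00:00:00 2018-07-01 00:00:00
```

A row matches when `start <= pickup_datetime < end`, which is the comparison a plain RDBMS query would have to spell out by hand.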
  • 36. What if I have two tables, where data is (obviously) not sent at the same exact timestamps and I want to join by closest matching timestamp?
  • 37. ASOF JOIN (LT JOIN and SPLICE JOIN variations)
      ASOF JOIN joins two different time series. For each row in the first time series, the ASOF JOIN takes from the second time series a timestamp that meets both of the following criteria:
      ● The timestamp is the closest to the first timestamp.
      ● The timestamp is prior to or equal to the first timestamp.
      WITH trips2018 AS (
        SELECT * from trips WHERE pickup_datetime in '2016'
      )
      SELECT pickup_datetime, timestamp, fare_amount, tempF, windDir
      FROM trips2018 ASOF JOIN weather;
      Try it live on https://demo.questdb.io
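The two criteria above translate directly into a binary search over the second series. The sketch below assumes both inputs are already sorted by timestamp and uses integers as stand-in timestamps; it models the join semantics only, and `asof_join` is a hypothetical function name.

```python
from bisect import bisect_right


def asof_join(left, right):
    """left, right: lists of (timestamp, value) sorted by timestamp.

    For each left row, attach the right row with the closest timestamp
    that is prior to or equal to the left timestamp (None if there is none).
    """
    right_timestamps = [ts for ts, _ in right]
    joined = []
    for ts, value in left:
        # Number of right rows whose timestamp is <= ts.
        position = bisect_right(right_timestamps, ts)
        match = right[position - 1] if position > 0 else None
        joined.append((ts, value, match))
    return joined


trips = [(1, "trip-a"), (5, "trip-b"), (10, "trip-c")]
weather = [(2, "rain"), (10, "sun")]

result = asof_join(trips, weather)
print(result)
# [(1, 'trip-a', None), (5, 'trip-b', (2, 'rain')), (10, 'trip-c', (10, 'sun'))]
```

The first trip has no earlier weather reading, so it joins to nothing, while an exact timestamp match (10) is accepted.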
  • 38. Some things we are trying out next for performance
      ● Compression, and exploring data formats like Arrow/Parquet
      ● Own ingestion protocol
      ● Second-level partitioning
      ● Improved vectorization of some operations (group by multiple columns or by expressions)
      ● Specific join optimizations (index nested loop joins, for example)
  • 39. QuestDB OSS: Open Source. Self-managed. Suitable for production workloads. https://github.com/questdb/questdb
      QuestDB Enterprise: Licensed. Self-managed. Enterprise features like RBAC, compression, replication, TLS on all protocols, cold storage, K8s operator… https://questdb.io/enterprise/
      QuestDB Cloud: Fully managed, pay-per-usage environment, with enterprise-grade features. https://questdb.io/cloud/
  • 40. OSA CON | December 12-14 2023. Q&A
      ● github.com/questdb/questdb
      ● https://questdb.io
      ● https://demo.questdb.io
      ● https://github.com/javier/questdb-quickstart
      ● https://slack.questdb.io/
      Javier Ramirez @supercoco9. We 💕 contributions and GitHub ⭐ stars