Michael DeSa / Software Engineer
InfluxDB 101
© 2018 InfluxData. All rights reserved.
Agenda
By the end of this session, users should be able to...
● Define time-series data
● Describe what InfluxDB is and its relationship to InfluxData
● Explain the InfluxDB data model
● Reason about the impact of the schema in an instance
What is time-series data?
Regular vs Irregular Time-Series
Metrics vs Events
Question: Time-Series?
What is a time-series database (tsdb)?
Time-series Database
● A database where you manage and store time-series data
● Efficiently handles time-series data
● Supports time-based queries
Why couldn’t I just use [insert db]?
But, time-series is not just a database problem
Time-series problems
● Visualizing your data
● Alerting on your data
● Processing your data
● Taking action based on your data
What is InfluxDB/InfluxData?
InfluxData History
● 2012 - Errplane
● 2014 - InfluxDB is born
● 2015 - Transition to InfluxData
○ A platform for time-series data
● 2018 - 2.0 of InfluxData
Why InfluxData
● Easy to get started with
● Aims to solve the entire time-series problem
● Scales well
○ Both horizontally and vertically
InfluxDB Data Model
Canonical Time-Series Line Graph
The Label (measurement)
The Legend (tags)
ticker=A
ticker=AA
ticker=AAPL
market=NASDAQ
market=NYSE
Y-Axis Values
price=177.03
price=32.10
price=35.52
X-Axis Values
Series
stock_price,ticker=A,market=NASDAQ
Data Model
● Measurement
○ High level grouping of data
● Tags
○ Indexed key-value pairs
● Fields
○ Key-value pairs of actual data
● Timestamp
○ Time of the data
● Series
○ A unique combination of measurement+tags
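The data model above can be sketched in a few lines of Python. This is only an illustration, not InfluxDB's internal representation: the `Point` class, `make_point` helper, and the pairing of prices with tickers are assumptions made for the example. It shows that fields and timestamps do not affect which series a point belongs to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """One data point: measurement, tags, fields, and a timestamp."""
    measurement: str
    tags: tuple       # sorted (key, value) pairs; the indexed metadata
    fields: tuple     # (key, value) pairs; the actual data
    timestamp: int    # illustrative integer timestamp

    def series_key(self) -> str:
        """A series is identified by measurement plus tag set only."""
        parts = [self.measurement] + [f"{k}={v}" for k, v in self.tags]
        return ",".join(parts)


def make_point(measurement, tags, fields, timestamp):
    # Sort tags so the same tag set always yields the same series key,
    # regardless of the order in which the tags were supplied.
    return Point(measurement, tuple(sorted(tags.items())),
                 tuple(fields.items()), timestamp)


# Hypothetical points; which price belongs to which ticker is made up.
a = make_point("stock_price", {"ticker": "A", "market": "NASDAQ"}, {"price": 177.03}, 1)
b = make_point("stock_price", {"market": "NASDAQ", "ticker": "A"}, {"price": 32.10}, 2)
c = make_point("stock_price", {"ticker": "AAPL", "market": "NASDAQ"}, {"price": 35.52}, 1)
```

Points `a` and `b` differ in field value and timestamp but share a series; `c` has a different tag value and therefore starts a new series.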
Line Protocol
cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
Measurement: cpu
Tags: host=serverA,num=1,region=west
Fields: idle=1.667,system=2342.2
Timestamp: 1492214400000000000
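To make the four parts of a line-protocol line concrete, here is a minimal parsing sketch. It is deliberately simplified and is not a full parser: it assumes no escaped spaces or commas, no quoted string fields, and only floating-point field values. The function name `parse_line` is hypothetical.

```python
def parse_line(line: str) -> dict:
    """Split one line-protocol line into measurement, tags, fields, timestamp.

    Simplification: assumes no escaping and no string fields, both of
    which real line protocol allows.
    """
    # The three space-separated sections: "measurement,tags", "fields", "timestamp".
    head, field_part, timestamp = line.split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = {key: float(value) for key, value in
              (pair.split("=", 1) for pair in field_part.split(","))}
    return {"measurement": measurement, "tags": tags,
            "fields": fields, "timestamp": int(timestamp)}


point = parse_line(
    "cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000"
)
```

Note that tag values stay strings (`num` is `"1"`, not `1`), while field values carry a numeric type.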
Querying Data
InfluxData Languages
● InfluxQL
○ SQL-like query language
● TICKscript
○ Time-series data processing language
● Flux
○ Next generation functional data scripting language
InfluxQL
> SELECT index, id FROM h2o_quality WHERE time > now() - 1w GROUP BY location
name: h2o_quality
tags: location = coyote_creek
time                  index  id
----                  -----  --
2015-08-18T00:00:00Z  41     1
2015-08-18T00:00:00Z  41     1
name: h2o_quality
tags: location = santa_monica
time                  index  id
----                  -----  --
2015-08-18T00:00:00Z  99     2
2015-08-18T00:06:00Z  56     2
TICKscript
var measurement = 'requests'
var data = stream
    |from()
        .measurement(measurement)
    |where(lambda: "is_up" == TRUE)
    |where(lambda: "my_field" > 10)
    |window()
        .period(5m)
        .every(5m)

// Count number of points in window
data
    |count('value')
        .as('the_count')

// Compute mean of data window
data
    |mean('value')
        .as('the_average')
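What the window, count, and mean steps compute can be sketched in plain Python. This is an illustration of tumbling five-minute windows, not how Kapacitor is implemented; it assumes windows aligned to multiples of 300 seconds and points given as `(unix_seconds, value)` pairs, and the name `window_stats` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

WINDOW_SECONDS = 5 * 60  # mirrors .period(5m) with .every(5m)


def window_stats(points):
    """Group (unix_seconds, value) pairs into 5-minute tumbling windows.

    Returns {window_start: (count, average)} for each non-empty window.
    """
    windows = defaultdict(list)
    for ts, value in points:
        # Round the timestamp down to the start of its window.
        window_start = ts - (ts % WINDOW_SECONDS)
        windows[window_start].append(value)
    return {start: (len(values), mean(values))
            for start, values in sorted(windows.items())}


stats = window_stats([(0, 10.0), (120, 20.0), (299, 30.0), (300, 40.0)])
```

Because the period equals the interval, every point lands in exactly one window; a period longer than the interval would produce overlapping windows instead.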
Flux
// get all data from the telegraf db
from(bucket: "telegraf/autogen")
    // filter that by the last hour
    |> range(start: -1h)
    // filter further by series with a specific measurement and field
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
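The pipe-forward style of the Flux script, where each stage takes the previous stage's output, can be imitated in Python. The sketch below is only an analogy under stated assumptions: rows are plain dictionaries, times are integer seconds, and the helpers `pipe`, `time_range`, and `keep` are hypothetical names, not part of any Flux or InfluxDB library.

```python
from functools import reduce


def pipe(source, *stages):
    """Feed `source` through each stage in order, like Flux's |> operator."""
    return reduce(lambda rows, stage: stage(rows), stages, source)


def time_range(start, stop):
    """Keep rows with start <= _time < stop."""
    return lambda rows: [r for r in rows if start <= r["_time"] < stop]


def keep(fn):
    """Keep rows for which the predicate `fn` is true (stands in for filter)."""
    return lambda rows: [r for r in rows if fn(r)]


rows = [
    {"_time": 1000, "_measurement": "cpu", "_field": "usage_system", "_value": 1.0},
    {"_time": 5000, "_measurement": "cpu", "_field": "usage_system", "_value": 2.0},
    {"_time": 6000, "_measurement": "mem", "_field": "used", "_value": 3.0},
    {"_time": 6500, "_measurement": "cpu", "_field": "usage_user", "_value": 4.0},
]

now = 7200  # pretend "now" so the last hour is 3600..7200
result = pipe(
    rows,
    time_range(now - 3600, now),
    keep(lambda r: r["_measurement"] == "cpu" and r["_field"] == "usage_system"),
)
```

Each stage is an ordinary function, so stages can be reordered, reused, or packaged separately.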
Why Flux?
● Composable
○ Users should be able to take pieces of different scripts and combine them into a single one to solve their own problem
● Extensible
○ Adding new functions and capabilities to Flux should be easy
● Shareable
○ Users should be able to create libraries and packages to solve specific problems
● Flexible
○ Users should be able to use the language for arbitrary data processing
Schema Design
What not to do
Don’t Encode Data into Measurement/Tags
Bad:
cpu.server-5.us-west value=2 1444234982000000000
cpu.server-6.us-west value=4 1444234982000000000
mem-free.server-6.us-west value=2500 1444234982000000000
Good:
cpu,host=server-5,region=us-west value=2 1444234982000000000
cpu,host=server-6,region=us-west value=4 1444234982000000000
mem-free,host=server-6,region=us-west value=2500 1444234982000000
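The Bad-to-Good rewrite above is mechanical when the dotted name has a fixed layout. A minimal sketch, assuming every name has exactly three parts in the order measurement.host.region (the function name `dotted_to_line` is hypothetical):

```python
def dotted_to_line(metric: str, value, timestamp: int) -> str:
    """Turn a dotted metric name into tagged line protocol.

    Assumes exactly three dot-separated parts: measurement.host.region.
    A name with more or fewer parts raises ValueError.
    """
    measurement, host, region = metric.split(".")
    return f"{measurement},host={host},region={region} value={value} {timestamp}"


line = dotted_to_line("cpu.server-5.us-west", 2, 1444234982000000000)
```

With host and region stored as separate tags, queries can filter or group on either one without pattern-matching on the measurement name.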
Don’t Encode Data into Measurement/Tags
Bad:
cpu,server=localhost.us-west value=2 1444234982000000000
cpu,server=localhost.us-east value=3 1444234982000000000
Good:
cpu,host=localhost,region=us-west value=2 1444234982000000000
cpu,host=localhost,region=us-east value=3 1444234982000000000
Don’t Use Tags with Very High Variability
Bad:
response_time,session_id=33254331,request_id=3424347 value=340 14442349820000
Good-ish:
response_time session_id=33254331,request_id=3424347,value=340 14442349820000
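The cost of a high-variability tag shows up as the number of distinct series. The sketch below counts series for a made-up workload (1000 requests spread over 10 sessions, each with a unique `request_id`); the data and the `series_count` helper are assumptions for illustration.

```python
def series_count(points, tag_keys):
    """Count distinct series when only `tag_keys` are stored as tags.

    A series is one unique combination of measurement and tag values.
    """
    return len({(p["measurement"],) + tuple(p[k] for k in tag_keys)
                for p in points})


# Hypothetical workload: 1000 requests, 10 sessions, unique request IDs.
points = [
    {"measurement": "response_time", "session_id": i % 10,
     "request_id": i, "value": 340}
    for i in range(1000)
]

as_tags = series_count(points, ["session_id", "request_id"])  # IDs as tags
as_fields = series_count(points, [])                          # IDs as fields
```

With the IDs as tags, every request creates a new series; with the IDs as fields, all 1000 points share one series.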
Don’t Use Too Few Tags
Bad:
cpu,region=us-west host="server1",value=0.5 1444234986000
cpu,region=us-west host="server2",value=4 1444234982000
cpu,region=us-west host="server2",value=1 1444234982000
Good-ish:
cpu,region=us-west,host=server1 value=0.5 1444234986000
cpu,region=us-west,host=server2 value=4 1444234982000
cpu,region=us-west,host=server2 value=1 1444234982000
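One consequence of too few tags is worth spelling out. In InfluxDB a point is identified by its measurement, tag set, and timestamp, and writing a second point with the same identity merges its fields into the first, with the newer values winning. The toy store below imitates that rule; it is a sketch, not InfluxDB's storage engine, and it assumes two hosts reporting at the same timestamp.

```python
def write(store, measurement, tags, fields, timestamp):
    """Write a point into a dict keyed by (measurement, tag set, timestamp).

    Points with the same key are merged; newer field values overwrite older ones.
    """
    key = (measurement, tuple(sorted(tags.items())), timestamp)
    store.setdefault(key, {}).update(fields)


# host stored as a field: both hosts collapse into one point.
bad = {}
write(bad, "cpu", {"region": "us-west"}, {"host": "server1", "value": 0.5}, 1444234982000)
write(bad, "cpu", {"region": "us-west"}, {"host": "server2", "value": 4}, 1444234982000)

# host stored as a tag: each host keeps its own point.
good = {}
write(good, "cpu", {"region": "us-west", "host": "server1"}, {"value": 0.5}, 1444234982000)
write(good, "cpu", {"region": "us-west", "host": "server2"}, {"value": 4}, 1444234982000)
```

In the first case the reading from server1 is silently replaced; in the second, both readings survive.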
What should I do then?
Designing a Schema
● What dashboards do I need?
● What alerts do I need?
● What kind of reports do I want to generate?
● What type of information do I need readily available when there’s an incident?
Schema Example
● I operate a SaaS application
● There are ~1000 different services
● I want to know the request and error rates for each service
● I want to trigger an alert based on the error rate for each service
● I want to see the services with the highest average request duration
Data Available
● app: Service name, e.g. user_service, auth_service…
● container_id: Container ID of the container running the service
● path: HTTP request path
● method: HTTP method, e.g. GET, POST, DELETE…
● src: Hostname of client making request
● dest: Hostname of server being contacted
● status: HTTP status code associated with the request
● request_id: Unique request identifier
● duration: Duration of request
● bytes_tx: Number of bytes transmitted
● bytes_rx: Number of bytes received
Question
Why would it be a bad idea to make container_id or
request_id a tag?
Answer
Why would it be a bad idea to make container_id or
request_id a tag?
request_id and container_id both have a high cardinality and could result in a large number of series, which impacts memory utilization.
request_id is substantially worse than container_id. In the next few releases we hope to allow for indexing on container_id.
Question
How should we organize our data?
Schema
measurement: latency
tags: app, container_id, path, method, src, dest, status
fields: request_id, duration, bytes_tx, bytes_rx
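As a rough illustration of this schema, the sketch below turns one request record into a line-protocol line for the `latency` measurement. The record values, the `to_line` and `format_field` helpers, and the choice of field types are assumptions; the destination key is spelled `dest` as in the Data Available list. Escaping of spaces, commas, and quotes is omitted, so the sketch only works for values without those characters.

```python
TAG_KEYS = ["app", "container_id", "path", "method", "src", "dest", "status"]
FIELD_KEYS = ["request_id", "duration", "bytes_tx", "bytes_rx"]


def format_field(value) -> str:
    """Render a field value: integers get an 'i' suffix, strings are quoted."""
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    return f'"{value}"'


def to_line(record: dict, timestamp_ns: int) -> str:
    """Build one line-protocol line from a request record (no escaping)."""
    tags = ",".join(f"{key}={record[key]}" for key in TAG_KEYS)
    fields = ",".join(f"{key}={format_field(record[key])}" for key in FIELD_KEYS)
    return f"latency,{tags} {fields} {timestamp_ns}"


# Hypothetical request record.
record = {
    "app": "auth_service", "container_id": "c1", "path": "/login",
    "method": "POST", "src": "client-1", "dest": "server-1", "status": 200,
    "request_id": "r-42", "duration": 12.5, "bytes_tx": 512, "bytes_rx": 128,
}
line = to_line(record, 1444234982000000000)
```

Because `request_id` is a field, it is stored with each point but does not multiply the number of series.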
Request/Error Rate per Service
Top 10 average request duration
> SELECT top(avg_duration, app, 10) FROM (
    SELECT mean(duration) AS avg_duration
    FROM latency
    WHERE time > now() - 1h
    GROUP BY time(1m), *
  )
Request Rate Per Service
> SELECT count(duration)
  FROM latency
  WHERE time > now() - 10m
  GROUP BY app, time(1s) fill(none)
Error Rate Per Service
> SELECT count(duration)
  FROM latency
  WHERE time > now() - 10m AND status != '200'
  GROUP BY app, time(1s) fill(none)
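The two counting queries above return raw request and error counts per service. Dividing one by the other gives an error ratio, which the sketch below computes in Python over an in-memory list. This is an extension for illustration, not something the queries themselves do; the sample requests and the `error_rate_per_app` name are made up, and `status` is compared as a string because tag values are strings.

```python
from collections import Counter


def error_rate_per_app(requests):
    """Return {app: fraction of requests whose status is not '200'}."""
    total = Counter(r["app"] for r in requests)
    errors = Counter(r["app"] for r in requests if r["status"] != "200")
    # Counter returns 0 for apps with no errors, so the ratio is well defined.
    return {app: errors[app] / total[app] for app in total}


# Hypothetical sample of requests.
requests = [
    {"app": "user_service", "status": "200"},
    {"app": "user_service", "status": "500"},
    {"app": "user_service", "status": "200"},
    {"app": "auth_service", "status": "200"},
]
rates = error_rate_per_app(requests)
```

An alert rule could then compare each service's ratio against a threshold chosen by the operator.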
Thank You!
InfluxDB 101 – Concepts and Architecture by Michael DeSa, Software Engineer | InfluxData
