SlideShare a Scribd company logo
1
Miguel Bosin
Support Engineer, @miguelbosin
Hot/Warm
Architecture + Sizing
2
Intro
Int
• Miguel Bosin
– Support engineer
– Joined in 2015
– Interested in techonology
– Passion about support
• Elastic
– Founded in 2012
– Distributed company
– Elasticsearch: What’s it?
– Open-source:
ES,LS,Kibana and Beats
– Commercial:
X-Pack
3
Intro
• Miguel Bosin
– Support engineer
– Joined in 2015
– Interested in techonology
– Passion about support
• Elastic
– Founded in 2012
– Distributed company
– Elasticsearch: What’s it
– Open-source:
ES,LS,Kibana and Beats
– Commercial:
X-Pack
4
What is it?
 Open source
 Distributed-scalable
 Highly available
 Document-oriented (JSON)
 RESTful
 FT search engine with real-
time search and analytics
capabilities
5
Agenda
Elastic overview1
Sizing introduction3
Hot/Warm architecture4
Elasticsearch basic architecture2
6
Elastic current’s products overview
7
Agenda
Elastic overview
Sizing introduction3
Hot/Warm architecture4
Elasticsearch basic architecture
1
2
8
Elasticsearch terminology
 A node is a single Elasticsearch instance, a single JVM
 Multiple nodes can form a cluster
 A cluster or a node can manage multiple indices
 An index is a container for data
 A shard is a single piece of an Elasticsearch index
 A shard is either a primary or a replica
9
Elasticsearch terminology II
10
Elasticsearch terminology III
11
Elasticsearch Architecture: Node roles
Master node:
 coordinates the cluster
 only node able to apply changes to cluster state
 publishes updated cluster state to all nodes
Data node:
 performs indexing
 can allocate shards locally
 knows cluster state
12
Elasticsearch Architecture: Node roles II
Client node:
 does NOT perform indexing or allocate shards locally
 does NOT perform cluster management operations
 knows cluster state
 smart load balancer (load balancing Kibana searches i.e.)
 redirect operations to the nodes that holds the relevant data
 calculate aggregations results
13
Nodes roles are set in the elasticsearch.yml
Elasticsearch Architecture: Node roles III
14
Architecture: node roles
15
Architecture: node roles
16
Architecture special case: dedicated master nodes
17
Dedicated master nodes –Why / minimum_master_nodes
 Indexing and searching data is CPU-, memory-, and I/O-intensive work which can
put pressure on a node’s resources
 Avoiding split brain: 2 current master nodes on the same cluster DATA LOSS
 Set this setting discovery.zen.minimum_master_nodes to the quorum:
(master_eligible_nodes / 2) + 1
18
Agenda
Elastic overview
Sizing introduction
Hot/Warm architecture4
Elasticsearch basic architecture
1
3
2
19
Sizing: general factors (server capacity)
• Disks (SSD vs. HD)
• RAM
-1/2 total RAM for ES
-ES heap size max: 30.5Gb
• # CPU cores
-ES threadpools concept
**1 shard—>gets 1 thread—>1 java process—>1core**
20
Sizing: Elasticsearch factors (logging case)
 Size of shards
 Number of shards on each node
 Retention period of data
 Mapping configuration
 -Which fields are searchable, _source enabled or
not,etc…
 Size (average) of the documents
21
Sizing: Capacity planning test I
 FIRST: testing on a single node with a single index with one shard
and no replica
 THEN: insert as many documents as you can and run some typical
queries
 At some point, queries will start to slow down to a threshold, which
no longer meet your requirements
 This is the ideal number of documents a single shard is able to
hold
 NEXT: Find the ideal number your primary shards (by dividing your
dataset size by the ideal shard size)
 FINALLY: Add replicas for HA and improve the read throughput
22
Sizing: Capacity planning test II
Each experiment tries to accomplish a discreet goal and build upon previous
22
Determine various
disk utilization
1 2 3 4
Determine breaking
point of a shard
Determine
saturation point of a
node
Test desired
configuration on
two node cluster
23
Agenda
Elastic overview
Sizing introduction
Hot/Warm architecture
3
Elasticsearch basic architecture
1
2
4
24
Hot / Warm architecture
When using it?
 Elasticsearch for larger time-data analytics use cases
 Using time-based indices
 Able to run an architecture with 3 different types of nodes
25
Hot / Warm architecture: Type of nodes
Master, Hot and Warm nodes:
 Master nodes: 3 dedicated master nodes
 Hot data nodes: perform all indexing and also hold the most recent daily
(data to be queried most frequently). Powerful machines with SSD storage
 Warm data nodes: handle a large amount of read-only indices that are not
queried frequently. Very large attached spinning disks
26
Hot / Warm architecture: tagging
Which node is doing what?
 ES needs to know which servers contain the hot nodes and which servers
contain the warm nodes
 This can be achieved by assigning arbitrary tags to each server (Hot/Warm)
 Tag the node with node.box_type: xxx in elasticsearch.yml
 OR start a node using ./bin/elasticsearch --node.box_type xxx
27
Hot / Warm architecture: Force Merge API
Optimizing your indices in the Warm Node
 The force merge API allows to force merging of one or more indices
through an API. Optimizes the index for faster search operation
 The merge relates to the number of segments a Lucene index holds within
each shard
 The force merge operation allows to reduce the number of segments by
merging them:
$ curl -XPOST 'http://localhost:9200/my_index/_forcemerge'
28
Hot / Warm architecture: Demo time!!
DEMO
Ad

More Related Content

What's hot (20)

Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearch
pmanvi
 
Intro to elasticsearch
Intro to elasticsearchIntro to elasticsearch
Intro to elasticsearch
Joey Wen
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search
medcl
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
Bo Andersen
 
Cassandra - how to fail?
Cassandra - how to fail?Cassandra - how to fail?
Cassandra - how to fail?
SoftwareMill
 
Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015
Roy Russo
 
Mongo gridfs
Mongo gridfsMongo gridfs
Mongo gridfs
Siva Chakilam
 
London MongoDB User Group April 2011
London MongoDB User Group April 2011London MongoDB User Group April 2011
London MongoDB User Group April 2011
Rainforest QA
 
Small intro to Big Data
Small intro to Big DataSmall intro to Big Data
Small intro to Big Data
SoftwareMill
 
ElasticSearch Getting Started
ElasticSearch Getting StartedElasticSearch Getting Started
ElasticSearch Getting Started
Onuralp Taner
 
Taking Elasticsearch From 0 to 88mph
Taking Elasticsearch From 0 to 88mph Taking Elasticsearch From 0 to 88mph
Taking Elasticsearch From 0 to 88mph
Molly Struve
 
ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014
Roy Russo
 
Bridging Batch and Real-time Systems for Anomaly Detection
Bridging Batch and Real-time Systems for Anomaly DetectionBridging Batch and Real-time Systems for Anomaly Detection
Bridging Batch and Real-time Systems for Anomaly Detection
DataWorks Summit
 
WF ED 540, Class Meeting 2 - Importing & exporting data, 2016
WF ED 540, Class Meeting 2 - Importing & exporting data, 2016WF ED 540, Class Meeting 2 - Importing & exporting data, 2016
WF ED 540, Class Meeting 2 - Importing & exporting data, 2016
Penn State University
 
Firebird database that does not burn your data
Firebird   database that does not burn your dataFirebird   database that does not burn your data
Firebird database that does not burn your data
Ido Kanner
 
Mongo presentation conf
Mongo presentation confMongo presentation conf
Mongo presentation conf
Shridhar Joshi
 
Klevis Mino: MongoDB
Klevis Mino: MongoDBKlevis Mino: MongoDB
Klevis Mino: MongoDB
Carlo Vaccari
 
Firebird Mythbusters
Firebird MythbustersFirebird Mythbusters
Firebird Mythbusters
Mind The Firebird
 
From Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalabilityFrom Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalability
Stéphane Gamard
 
ElasticSearch in action
ElasticSearch in actionElasticSearch in action
ElasticSearch in action
Codemotion
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearch
pmanvi
 
Intro to elasticsearch
Intro to elasticsearchIntro to elasticsearch
Intro to elasticsearch
Joey Wen
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search
medcl
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
Bo Andersen
 
Cassandra - how to fail?
Cassandra - how to fail?Cassandra - how to fail?
Cassandra - how to fail?
SoftwareMill
 
Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015
Roy Russo
 
London MongoDB User Group April 2011
London MongoDB User Group April 2011London MongoDB User Group April 2011
London MongoDB User Group April 2011
Rainforest QA
 
Small intro to Big Data
Small intro to Big DataSmall intro to Big Data
Small intro to Big Data
SoftwareMill
 
ElasticSearch Getting Started
ElasticSearch Getting StartedElasticSearch Getting Started
ElasticSearch Getting Started
Onuralp Taner
 
Taking Elasticsearch From 0 to 88mph
Taking Elasticsearch From 0 to 88mph Taking Elasticsearch From 0 to 88mph
Taking Elasticsearch From 0 to 88mph
Molly Struve
 
ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014
Roy Russo
 
Bridging Batch and Real-time Systems for Anomaly Detection
Bridging Batch and Real-time Systems for Anomaly DetectionBridging Batch and Real-time Systems for Anomaly Detection
Bridging Batch and Real-time Systems for Anomaly Detection
DataWorks Summit
 
WF ED 540, Class Meeting 2 - Importing & exporting data, 2016
WF ED 540, Class Meeting 2 - Importing & exporting data, 2016WF ED 540, Class Meeting 2 - Importing & exporting data, 2016
WF ED 540, Class Meeting 2 - Importing & exporting data, 2016
Penn State University
 
Firebird database that does not burn your data
Firebird   database that does not burn your dataFirebird   database that does not burn your data
Firebird database that does not burn your data
Ido Kanner
 
Mongo presentation conf
Mongo presentation confMongo presentation conf
Mongo presentation conf
Shridhar Joshi
 
Klevis Mino: MongoDB
Klevis Mino: MongoDBKlevis Mino: MongoDB
Klevis Mino: MongoDB
Carlo Vaccari
 
From Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalabilityFrom Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalability
Stéphane Gamard
 
ElasticSearch in action
ElasticSearch in actionElasticSearch in action
ElasticSearch in action
Codemotion
 

Viewers also liked (14)

ElasticSearch 5.x - New Tricks - 2017-02-08 - Elasticsearch Meetup
ElasticSearch 5.x -  New Tricks - 2017-02-08 - Elasticsearch Meetup ElasticSearch 5.x -  New Tricks - 2017-02-08 - Elasticsearch Meetup
ElasticSearch 5.x - New Tricks - 2017-02-08 - Elasticsearch Meetup
Alberto Paro
 
Configuring elasticsearch for performance and scale
Configuring elasticsearch for performance and scaleConfiguring elasticsearch for performance and scale
Configuring elasticsearch for performance and scale
Bharvi Dixit
 
Elasticsearch in Production (London version)
Elasticsearch in Production (London version)Elasticsearch in Production (London version)
Elasticsearch in Production (London version)
foundsearch
 
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
NAVER D2
 
You know, for search. Querying 24 Billion Documents in 900ms
You know, for search. Querying 24 Billion Documents in 900msYou know, for search. Querying 24 Billion Documents in 900ms
You know, for search. Querying 24 Billion Documents in 900ms
Jodok Batlogg
 
Search and analyze your data with elasticsearch
Search and analyze your data with elasticsearchSearch and analyze your data with elasticsearch
Search and analyze your data with elasticsearch
Anton Udovychenko
 
Real-time data analysis using ELK
Real-time data analysis using ELKReal-time data analysis using ELK
Real-time data analysis using ELK
Jettro Coenradie
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & Aggregations
Alaa Elhadba
 
6/18/14 Billing & Payments Engineering Meetup I
6/18/14 Billing & Payments Engineering Meetup I6/18/14 Billing & Payments Engineering Meetup I
6/18/14 Billing & Payments Engineering Meetup I
Mathieu Chauvin
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learned
BeyondTrees
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Spark Summit
 
Elasticsearch in Zalando
Elasticsearch in ZalandoElasticsearch in Zalando
Elasticsearch in Zalando
Alaa Elhadba
 
Data modeling for Elasticsearch
Data modeling for ElasticsearchData modeling for Elasticsearch
Data modeling for Elasticsearch
Florian Hopf
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Rafał Kuć
 
ElasticSearch 5.x - New Tricks - 2017-02-08 - Elasticsearch Meetup
ElasticSearch 5.x -  New Tricks - 2017-02-08 - Elasticsearch Meetup ElasticSearch 5.x -  New Tricks - 2017-02-08 - Elasticsearch Meetup
ElasticSearch 5.x - New Tricks - 2017-02-08 - Elasticsearch Meetup
Alberto Paro
 
Configuring elasticsearch for performance and scale
Configuring elasticsearch for performance and scaleConfiguring elasticsearch for performance and scale
Configuring elasticsearch for performance and scale
Bharvi Dixit
 
Elasticsearch in Production (London version)
Elasticsearch in Production (London version)Elasticsearch in Production (London version)
Elasticsearch in Production (London version)
foundsearch
 
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
NAVER D2
 
You know, for search. Querying 24 Billion Documents in 900ms
You know, for search. Querying 24 Billion Documents in 900msYou know, for search. Querying 24 Billion Documents in 900ms
You know, for search. Querying 24 Billion Documents in 900ms
Jodok Batlogg
 
Search and analyze your data with elasticsearch
Search and analyze your data with elasticsearchSearch and analyze your data with elasticsearch
Search and analyze your data with elasticsearch
Anton Udovychenko
 
Real-time data analysis using ELK
Real-time data analysis using ELKReal-time data analysis using ELK
Real-time data analysis using ELK
Jettro Coenradie
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & Aggregations
Alaa Elhadba
 
6/18/14 Billing & Payments Engineering Meetup I
6/18/14 Billing & Payments Engineering Meetup I6/18/14 Billing & Payments Engineering Meetup I
6/18/14 Billing & Payments Engineering Meetup I
Mathieu Chauvin
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learned
BeyondTrees
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Spark Summit
 
Elasticsearch in Zalando
Elasticsearch in ZalandoElasticsearch in Zalando
Elasticsearch in Zalando
Alaa Elhadba
 
Data modeling for Elasticsearch
Data modeling for ElasticsearchData modeling for Elasticsearch
Data modeling for Elasticsearch
Florian Hopf
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Rafał Kuć
 
Ad

Similar to Elastic meetup june16 (20)

Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
Joe Alex
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stack
Rich Lee
 
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big Data
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big DataABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big Data
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big Data
Hitoshi Sato
 
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Codemotion
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
Splunk
 
Percona FT / TokuDB
Percona FT / TokuDBPercona FT / TokuDB
Percona FT / TokuDB
Vadim Tkachenko
 
Realtime Indexing for Fast Queries on Massive Semi-Structured Data
Realtime Indexing for Fast Queries on Massive Semi-Structured DataRealtime Indexing for Fast Queries on Massive Semi-Structured Data
Realtime Indexing for Fast Queries on Massive Semi-Structured Data
ScyllaDB
 
L6.sp17.pptx
L6.sp17.pptxL6.sp17.pptx
L6.sp17.pptx
SudheerKumar499932
 
Black friday logs - Scaling Elasticsearch
Black friday logs - Scaling ElasticsearchBlack friday logs - Scaling Elasticsearch
Black friday logs - Scaling Elasticsearch
Sylvain Wallez
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache Solr
Rahul Jain
 
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Streamsets Inc.
 
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Rick Bilodeau
 
Webinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with FusionWebinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with Fusion
Lucidworks
 
Relational databases for BigData
Relational databases for BigDataRelational databases for BigData
Relational databases for BigData
Alexander Tokarev
 
Logging, Metrics, and APM: The Operations Trifecta
Logging, Metrics, and APM: The Operations TrifectaLogging, Metrics, and APM: The Operations Trifecta
Logging, Metrics, and APM: The Operations Trifecta
Elasticsearch
 
lecture3-indexconstruction.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
lecture3-indexconstruction.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnlecture3-indexconstruction.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
lecture3-indexconstruction.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
RAtna29
 
Apache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoTApache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoT
jixuan1989
 
S016825 ibm-cos-nola-v1710d
S016825 ibm-cos-nola-v1710dS016825 ibm-cos-nola-v1710d
S016825 ibm-cos-nola-v1710d
Tony Pearson
 
Elastic Stack Introduction
Elastic Stack IntroductionElastic Stack Introduction
Elastic Stack Introduction
Vikram Shinde
 
High Performance With Java
High Performance With JavaHigh Performance With Java
High Performance With Java
malduarte
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
Joe Alex
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stack
Rich Lee
 
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big Data
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big DataABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big Data
ABCI: AI Bridging Cloud Infrastructure for Scalable AI/Big Data
Hitoshi Sato
 
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Codemotion
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
Splunk
 
Realtime Indexing for Fast Queries on Massive Semi-Structured Data
Realtime Indexing for Fast Queries on Massive Semi-Structured DataRealtime Indexing for Fast Queries on Massive Semi-Structured Data
Realtime Indexing for Fast Queries on Massive Semi-Structured Data
ScyllaDB
 
Black friday logs - Scaling Elasticsearch
Black friday logs - Scaling ElasticsearchBlack friday logs - Scaling Elasticsearch
Black friday logs - Scaling Elasticsearch
Sylvain Wallez
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache Solr
Rahul Jain
 
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Streamsets Inc.
 
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Rick Bilodeau
 
Webinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with FusionWebinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with Fusion
Lucidworks
 
Relational databases for BigData
Relational databases for BigDataRelational databases for BigData
Relational databases for BigData
Alexander Tokarev
 
Logging, Metrics, and APM: The Operations Trifecta
Logging, Metrics, and APM: The Operations TrifectaLogging, Metrics, and APM: The Operations Trifecta
Logging, Metrics, and APM: The Operations Trifecta
Elasticsearch
 
lecture3-indexconstruction.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
lecture3-indexconstruction.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnlecture3-indexconstruction.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
lecture3-indexconstruction.pptnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
RAtna29
 
Apache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoTApache IOTDB: a Time Series Database for Industrial IoT
Apache IOTDB: a Time Series Database for Industrial IoT
jixuan1989
 
S016825 ibm-cos-nola-v1710d
S016825 ibm-cos-nola-v1710dS016825 ibm-cos-nola-v1710d
S016825 ibm-cos-nola-v1710d
Tony Pearson
 
Elastic Stack Introduction
Elastic Stack IntroductionElastic Stack Introduction
Elastic Stack Introduction
Vikram Shinde
 
High Performance With Java
High Performance With JavaHigh Performance With Java
High Performance With Java
malduarte
 
Ad

Recently uploaded (20)

kurtlewin theory of motivation -181226082203.pptx
kurtlewin theory of motivation -181226082203.pptxkurtlewin theory of motivation -181226082203.pptx
kurtlewin theory of motivation -181226082203.pptx
TayyabaSiddiqui12
 
A Bot Identification Model and Tool Based on GitHub Activity Sequences
A Bot Identification Model and Tool Based on GitHub Activity SequencesA Bot Identification Model and Tool Based on GitHub Activity Sequences
A Bot Identification Model and Tool Based on GitHub Activity Sequences
natarajan8993
 
Lec 3 - Chapter 2 Carl Jung’s Theory of Personality.pptx
Lec 3 - Chapter 2 Carl Jung’s Theory of Personality.pptxLec 3 - Chapter 2 Carl Jung’s Theory of Personality.pptx
Lec 3 - Chapter 2 Carl Jung’s Theory of Personality.pptx
TayyabaSiddiqui12
 
Approach to diabetes Mellitus, diagnosis
Approach to diabetes Mellitus,  diagnosisApproach to diabetes Mellitus,  diagnosis
Approach to diabetes Mellitus, diagnosis
Mohammed Ahmed Bamashmos
 
The Business Dynamics of Quick Commerce.pdf
The Business Dynamics of Quick Commerce.pdfThe Business Dynamics of Quick Commerce.pdf
The Business Dynamics of Quick Commerce.pdf
RDinuRao
 
Wood Age and Trees of life - talk at Newcastle City Library
Wood Age and Trees of life - talk at Newcastle City LibraryWood Age and Trees of life - talk at Newcastle City Library
Wood Age and Trees of life - talk at Newcastle City Library
Woods for the Trees
 
Setup & Implementation of OutSystems Cloud Connector ODC
Setup & Implementation of OutSystems Cloud Connector ODCSetup & Implementation of OutSystems Cloud Connector ODC
Setup & Implementation of OutSystems Cloud Connector ODC
outsystemspuneusergr
 
Basic.pptxsksdjsdjdvkfvfvfvfvfvfvfvfvfvvvv
Basic.pptxsksdjsdjdvkfvfvfvfvfvfvfvfvfvvvvBasic.pptxsksdjsdjdvkfvfvfvfvfvfvfvfvfvvvv
Basic.pptxsksdjsdjdvkfvfvfvfvfvfvfvfvfvvvv
hkthmrz42n
 
Microsoft Azure Data Fundamentals (DP-900) Exam Dumps & Questions 2025.pdf
Microsoft Azure Data Fundamentals (DP-900) Exam Dumps & Questions 2025.pdfMicrosoft Azure Data Fundamentals (DP-900) Exam Dumps & Questions 2025.pdf
Microsoft Azure Data Fundamentals (DP-900) Exam Dumps & Questions 2025.pdf
MinniePfeiffer
 
Speech 3-A Vision for Tomorrow for GE2025
Speech 3-A Vision for Tomorrow for GE2025Speech 3-A Vision for Tomorrow for GE2025
Speech 3-A Vision for Tomorrow for GE2025
Noraini Yunus
 
Updated treatment of hypothyroidism, causes and symptoms
Updated treatment of hypothyroidism,  causes and symptomsUpdated treatment of hypothyroidism,  causes and symptoms
Updated treatment of hypothyroidism, causes and symptoms
Mohammed Ahmed Bamashmos
 
Key Elements of a Procurement Plan.docx.
Key Elements of a Procurement Plan.docx.Key Elements of a Procurement Plan.docx.
Key Elements of a Procurement Plan.docx.
NeoRakodu
 
THE SEXUAL HARASSMENT OF WOMAN AT WORKPLACE (PREVENTION, PROHIBITION & REDRES...
THE SEXUAL HARASSMENT OF WOMAN AT WORKPLACE (PREVENTION, PROHIBITION & REDRES...THE SEXUAL HARASSMENT OF WOMAN AT WORKPLACE (PREVENTION, PROHIBITION & REDRES...
THE SEXUAL HARASSMENT OF WOMAN AT WORKPLACE (PREVENTION, PROHIBITION & REDRES...
ASHISHKUMAR504404
 
ICSE 2025 Keynote: Software Sustainability and its Engineering: How far have ...
ICSE 2025 Keynote: Software Sustainability and its Engineering: How far have ...ICSE 2025 Keynote: Software Sustainability and its Engineering: How far have ...
ICSE 2025 Keynote: Software Sustainability and its Engineering: How far have ...
patricialago3459
 
Bidding World Conference 2027 - Ghana.pptx
Bidding World Conference 2027 - Ghana.pptxBidding World Conference 2027 - Ghana.pptx
Bidding World Conference 2027 - Ghana.pptx
ISGF - International Scout and Guide Fellowship
 
cardiovascular outcome in trial of new antidiabetic drugs
cardiovascular outcome in trial of new antidiabetic drugscardiovascular outcome in trial of new antidiabetic drugs
cardiovascular outcome in trial of new antidiabetic drugs
Mohammed Ahmed Bamashmos
 
2025-05-04 A New Day Dawns 03 (shared slides).pptx
2025-05-04 A New Day Dawns 03 (shared slides).pptx2025-05-04 A New Day Dawns 03 (shared slides).pptx
2025-05-04 A New Day Dawns 03 (shared slides).pptx
Dale Wells
 
fundamentals of communicationclass notes.pptx
fundamentals of communicationclass notes.pptxfundamentals of communicationclass notes.pptx
fundamentals of communicationclass notes.pptx
Sunkod
 
Besu Shibpur Enquesta 2012 Intra College General Quiz Finals.pptx
Besu Shibpur Enquesta 2012 Intra College General Quiz Finals.pptxBesu Shibpur Enquesta 2012 Intra College General Quiz Finals.pptx
Besu Shibpur Enquesta 2012 Intra College General Quiz Finals.pptx
Rajdeep Chakraborty
 
2. Asexual propagation of fruit crops and .pptx
2. Asexual propagation of fruit crops and .pptx2. Asexual propagation of fruit crops and .pptx
2. Asexual propagation of fruit crops and .pptx
aschenakidawit1
 
kurtlewin theory of motivation -181226082203.pptx
kurtlewin theory of motivation -181226082203.pptxkurtlewin theory of motivation -181226082203.pptx
kurtlewin theory of motivation -181226082203.pptx
TayyabaSiddiqui12
 
A Bot Identification Model and Tool Based on GitHub Activity Sequences
A Bot Identification Model and Tool Based on GitHub Activity SequencesA Bot Identification Model and Tool Based on GitHub Activity Sequences
A Bot Identification Model and Tool Based on GitHub Activity Sequences
natarajan8993
 
Lec 3 - Chapter 2 Carl Jung’s Theory of Personality.pptx
Lec 3 - Chapter 2 Carl Jung’s Theory of Personality.pptxLec 3 - Chapter 2 Carl Jung’s Theory of Personality.pptx
Lec 3 - Chapter 2 Carl Jung’s Theory of Personality.pptx
TayyabaSiddiqui12
 
The Business Dynamics of Quick Commerce.pdf
The Business Dynamics of Quick Commerce.pdfThe Business Dynamics of Quick Commerce.pdf
The Business Dynamics of Quick Commerce.pdf
RDinuRao
 
Wood Age and Trees of life - talk at Newcastle City Library
Wood Age and Trees of life - talk at Newcastle City LibraryWood Age and Trees of life - talk at Newcastle City Library
Wood Age and Trees of life - talk at Newcastle City Library
Woods for the Trees
 
Setup & Implementation of OutSystems Cloud Connector ODC
Setup & Implementation of OutSystems Cloud Connector ODCSetup & Implementation of OutSystems Cloud Connector ODC
Setup & Implementation of OutSystems Cloud Connector ODC
outsystemspuneusergr
 
Basic.pptxsksdjsdjdvkfvfvfvfvfvfvfvfvfvvvv
Basic.pptxsksdjsdjdvkfvfvfvfvfvfvfvfvfvvvvBasic.pptxsksdjsdjdvkfvfvfvfvfvfvfvfvfvvvv
Basic.pptxsksdjsdjdvkfvfvfvfvfvfvfvfvfvvvv
hkthmrz42n
 
Microsoft Azure Data Fundamentals (DP-900) Exam Dumps & Questions 2025.pdf
Microsoft Azure Data Fundamentals (DP-900) Exam Dumps & Questions 2025.pdfMicrosoft Azure Data Fundamentals (DP-900) Exam Dumps & Questions 2025.pdf
Microsoft Azure Data Fundamentals (DP-900) Exam Dumps & Questions 2025.pdf
MinniePfeiffer
 
Speech 3-A Vision for Tomorrow for GE2025
Speech 3-A Vision for Tomorrow for GE2025Speech 3-A Vision for Tomorrow for GE2025
Speech 3-A Vision for Tomorrow for GE2025
Noraini Yunus
 
Updated treatment of hypothyroidism, causes and symptoms
Updated treatment of hypothyroidism,  causes and symptomsUpdated treatment of hypothyroidism,  causes and symptoms
Updated treatment of hypothyroidism, causes and symptoms
Mohammed Ahmed Bamashmos
 
Key Elements of a Procurement Plan.docx.
Key Elements of a Procurement Plan.docx.Key Elements of a Procurement Plan.docx.
Key Elements of a Procurement Plan.docx.
NeoRakodu
 
THE SEXUAL HARASSMENT OF WOMAN AT WORKPLACE (PREVENTION, PROHIBITION & REDRES...
THE SEXUAL HARASSMENT OF WOMAN AT WORKPLACE (PREVENTION, PROHIBITION & REDRES...THE SEXUAL HARASSMENT OF WOMAN AT WORKPLACE (PREVENTION, PROHIBITION & REDRES...
THE SEXUAL HARASSMENT OF WOMAN AT WORKPLACE (PREVENTION, PROHIBITION & REDRES...
ASHISHKUMAR504404
 
ICSE 2025 Keynote: Software Sustainability and its Engineering: How far have ...
ICSE 2025 Keynote: Software Sustainability and its Engineering: How far have ...ICSE 2025 Keynote: Software Sustainability and its Engineering: How far have ...
ICSE 2025 Keynote: Software Sustainability and its Engineering: How far have ...
patricialago3459
 
cardiovascular outcome in trial of new antidiabetic drugs
cardiovascular outcome in trial of new antidiabetic drugscardiovascular outcome in trial of new antidiabetic drugs
cardiovascular outcome in trial of new antidiabetic drugs
Mohammed Ahmed Bamashmos
 
2025-05-04 A New Day Dawns 03 (shared slides).pptx
2025-05-04 A New Day Dawns 03 (shared slides).pptx2025-05-04 A New Day Dawns 03 (shared slides).pptx
2025-05-04 A New Day Dawns 03 (shared slides).pptx
Dale Wells
 
fundamentals of communicationclass notes.pptx
fundamentals of communicationclass notes.pptxfundamentals of communicationclass notes.pptx
fundamentals of communicationclass notes.pptx
Sunkod
 
Besu Shibpur Enquesta 2012 Intra College General Quiz Finals.pptx
Besu Shibpur Enquesta 2012 Intra College General Quiz Finals.pptxBesu Shibpur Enquesta 2012 Intra College General Quiz Finals.pptx
Besu Shibpur Enquesta 2012 Intra College General Quiz Finals.pptx
Rajdeep Chakraborty
 
2. Asexual propagation of fruit crops and .pptx
2. Asexual propagation of fruit crops and .pptx2. Asexual propagation of fruit crops and .pptx
2. Asexual propagation of fruit crops and .pptx
aschenakidawit1
 

Elastic meetup june16

  • 1. 1 Miguel Bosin Support Engineer, @miguelbosin Hot/Warm Architecture + Sizing
  • 2. 2 Intro Int • Miguel Bosin – Support engineer – Joined in 2015 – Interested in techonology – Passion about support • Elastic – Founded in 2012 – Distributed company – Elasticsearch: What’s it? – Open-source: ES,LS,Kibana and Beats – Commercial: X-Pack
  • 3. 3 Intro • Miguel Bosin – Support engineer – Joined in 2015 – Interested in techonology – Passion about support • Elastic – Founded in 2012 – Distributed company – Elasticsearch: What’s it – Open-source: ES,LS,Kibana and Beats – Commercial: X-Pack
  • 4. 4 What is it?  Open source  Distributed-scalable  Highly available  Document-oriented (JSON)  RESTful  FT search engine with real- time search and analytics capabilities
  • 5. 5 Agenda Elastic overview1 Sizing introduction3 Hot/Warm architecture4 Elasticsearch basic architecture2
  • 7. 7 Agenda Elastic overview Sizing introduction3 Hot/Warm architecture4 Elasticsearch basic architecture 1 2
  • 8. 8 Elasticsearch terminology  A node is a single Elasticsearch instance, a single JVM  Multiple nodes can form a cluster  A cluster or a node can manage multiple indices  An index is a container for data  A shard is a single piece of an Elasticsearch index  A shard is either a primary or a replica
  • 11. 11 Elasticsearch Architecture: Node roles Master node:  coordinates the cluster  only node able to apply changes to cluster state  publishes updated cluster state to all nodes Data node:  performs indexing  can allocate shards locally  knows cluster state
  • 12. 12 Elasticsearch Architecture: Node roles II Client node:  does NOT perform indexing or allocate shards locally  does NOT perform cluster management operations  knows cluster state  smart load balancer (load balancing Kibana searches i.e.)  redirect operations to the nodes that holds the relevant data  calculate aggregations results
  • 13. 13 Nodes roles are set in the elasticsearch.yml Elasticsearch Architecture: Node roles III
  • 16. 16 Architecture special case: dedicated master nodes
  • 17. 17 Dedicated master nodes –Why / minimum_master_nodes  Indexing and searching data is CPU-, memory-, and I/O-intensive work which can put pressure on a node’s resources  Avoiding split brain: 2 current master nodes on the same cluster DATA LOSS  Set this setting discovery.zen.minimum_master_nodes to the quorum: (master_eligible_nodes / 2) + 1
  • 18. 18 Agenda Elastic overview Sizing introduction Hot/Warm architecture4 Elasticsearch basic architecture 1 3 2
  • 19. 19 Sizing: general factors (server capacity) • Disks (SSD vs. HD) • RAM -1/2 total RAM for ES -ES heap size max: 30.5Gb • # CPU cores -ES threadpools concept **1 shard—>gets 1 thread—>1 java process—>1core**
  • 20. 20 Sizing: Elasticsearch factors (logging case)  Size of shards  Number of shards on each node  Retention period of data  Mapping configuration  -Which fields are searchable, _source enabled or not,etc…  Size (average) of the documents
  • 21. 21 Sizing: Capacity planning test I  FIRST: testing on a single node with a single index with one shard and no replica  THEN: insert as many documents as you can and run some typical queries  At some point, queries will start to slow down to a threshold, which no longer meet your requirements  This is the ideal number of documents a single shard is able to hold  NEXT: Find the ideal number your primary shards (by dividing your dataset size by the ideal shard size)  FINALLY: Add replicas for HA and improve the read throughput
  • 22. 22 Sizing: Capacity planning test II Each experiment tries to accomplish a discreet goal and build upon previous 22 Determine various disk utilization 1 2 3 4 Determine breaking point of a shard Determine saturation point of a node Test desired configuration on two node cluster
  • 23. 23 Agenda Elastic overview Sizing introduction Hot/Warm architecture 3 Elasticsearch basic architecture 1 2 4
  • 24. 24 Hot / Warm architecture When using it?  Elasticsearch for larger time-data analytics use cases  Using time-based indices  Able to run an architecture with 3 different types of nodes
  • 25. 25 Hot / Warm architecture: Type of nodes Master, Hot and Warm nodes:  Master nodes: 3 dedicated master nodes  Hot data nodes: perform all indexing and also hold the most recent daily (data to be queried most frequently). Powerful machines with SSD storage  Warm data nodes: handle a large amount of read-only indices that are not queried frequently. Very large attached spinning disks
  • 26. 26 Hot / Warm architecture: tagging Which node is doing what?  ES needs to know which servers contain the hot nodes and which servers contain the warm nodes  This can be achieved by assigning arbitrary tags to each server (Hot/Warm)  Tag the node with node.box_type: xxx in elasticsearch.yml  OR start a node using ./bin/elasticsearch --node.box_type xxx
  • 27. 27 Hot / Warm architecture: Force Merge API Optimizing your indices in the Warm Node  The force merge API allows to force merging of one or more indices through an API. Optimizes the index for faster search operation  The merge relates to the number of segments a Lucene index holds within each shard  The force merge operation allows to reduce the number of segments by merging them: $ curl -XPOST 'http://localhost:9200/my_index/_forcemerge'
  • 28. 28 Hot / Warm architecture: Demo time!! DEMO