SlideShare a Scribd company logo
How to ensure Presto
scalability
in multi use case
Kai Sasaki
Treasure Data Inc.
Kai Sasaki (@Lewuathe)
Software Engineer at Treasure Data Inc.
Hadoop/Presto/Spark
Presto In TD
• 150000+ queries / day
• 190+ TB processing / day
• 10+ MB processing / query * sec
• 100+ million processed records / query
Presto In TD
Prestobase
Proxy
PerfectQueue
query
Plazma
data
Presto
TD API
BI Tool
HTTP
How to make it scalable
• Prestobase Proxy
• Node scheduler
• Resource Group
Prestobase proxy
Prestobase proxy
Prestobase proxy aims to provide the
interface especially for BI tools through
JDBC/ODBC and also to replace Prestogres.
Presto In TD
Prestobase
Proxy
PerfectQueue
query
Plazma
data
Presto
TD API
BI Tool
HTTP
Prestobase proxy
• Written in Scala
• Finagle base RPC proxy
• Running as Docker container
• A user of Airframe
• VCR base light-weight test framework
Finagle
Finagle is an extensible RPC system for the JVM,
used to construct high-concurrency servers.
Finagle implements uniform client and server
APIs for several protocols, and is designed for
high performance and concurrency.
see: https://twitter.github.io/finagle/
Finagle
protected val service: Service[Request, Response] =
bind[SomeFilter] andThen
bind[AnotherHandler] andThen
LastFilter andThen
prestoClient
Build request pipeline by binding
filter, handlers with Airframe
Airframe
Airframe is a trait base dependency injection
framework using Scala macro
- https://github.com/wvlet/airframe
Airframe
- Dependency injection tailored Scala
- Tagged binding with wvlet
https://github.com/wvlet/wvlet
- Object lifecycle management
Airframe
val design : Design =
newDesign
.bind[X].toInstance(new X) // Bind type X to a concrete instance
.bind[Y].toSingleton // Bind type Y to a singleton object
.bind[Z].to[ZImpl] // Bind type Z to an instance of ZImpl
import wvlet.airframe._
trait App {
val x = bind[X]
val y = bind[Y]
val z = bind[Z]
// Do something with X, Y, and Z
}
val session = design.newSession
val app : App = session.build[App]
VCR testing framework
Record test suite HTTP interaction to make
test stable and deterministic
see more detail
https://testing.googleblog.com/2016/11/what-test-engineers-do-at-google.html
VCR testing framework
protected val service: Service[Request, Response] =
bind[SomeFilter] andThen
bind[AnotherHandler] andThen
QueryRewriter andThen
bind[RequestVCR] andThen
prestClient
protected val service: Service[Request, Response] =
bind[SomeFilter] andThen
bind[AnotherHandler] andThen
QueryRewriter andThen
bind[NoRecording] andThen
prestClient
On CI
On Production
Prestobase
VCR testing framework
RequestVCRClient
…
…
SQLite
Recording
Prestobase
VCR testing framework
RequestVCRClient
…
…
SQLite
Replaying
Prestobase proxy
Will be open sourced soon
Node Scheduler
Node Scheduler
Submitting query follows…
- Analyze query AST
- Make query logical/physical plan
- Schedule each stage
Node Scheduler
query
stage2 stage1 stage0
task2-0
task2-1
task2-0
task1-0
task1-1
task0-0
Table Scan output
Node Scheduler
NodeScheduler creates NodeSelector that
selects worker nodes on which tasks are
scheduled. NodeSelector picks up worker
nodes when there is available splits.
Node Scheduler in TD
Keeps worker node map that can be
candidate for launching next tasks.
- Ignore min candidates
- Limit by available memory pool
Node Scheduler in TD
Back to normal memory pool usage after task is completed.
Node Scheduler in TD
Challenges
- Smoothing CPU time metric
- Split type awareness
- Avoid problematic worker nodes
Resource Group
Resource Group
Resource Group was introduced since 0.147
→ https://prestodb.io/docs/current/admin/resource-groups.html
Resource Group aims to limit the resource
usage by account/group/query.
Resource Group
rootGroup
general adhoc
softMemoryLimit: 100%
maxQueued : 5000
maxRunning : 1000
softMemoryLimit: 100%
maxQueued : 100
maxRunning : 200
softMemoryLimit: 100%
maxRunning : 1000
Resource Group limits
- maxQueued
- maxRunning
- softMemoryLimit
Following queries will be queued
- softCpuLimit
Impose penalty against max running queries
- hardCpuLimit
Following queries will be queued
Resource Group scheduling
- schedulingPolicy
- fair : FIFO
- weighted : Selected stochastically
- query_priority : Selected according to priority
- schedulingWeight
Resource Group
Every query must be associated to a resource
group. The matching can be done by
configured selector.
{
"user": “bob", "group": "general"
},
{
"source": “.*adhoc.*", "group": "global.adhoc.adhoc_${USER}"
}
Resource Group
rootGroup
general adhoc
softMemoryLimit: 100%
maxQueued : 5000
maxRunning : 1000
softMemoryLimit: 100%
maxQueued : 100
maxRunning : 200
softMemoryLimit: 100%
maxRunning : 1000
Bob’s
query
Bob’s
query …
Resource Group DI
Easily change resource group config behavior
with Guice injection.
- ResourceGroupConfigurationManager
- configure(ResourceGroup, SelectionContext)
- ResourceGroupSelector
- match(Statement, SelectionContext)
SelectionContext
SelectionContext holds the information for associating
submitted query.
- Authenticated
- User
- Source
- Query Priority
Currently available as default
{
"runningQueryIds": ["query1", "query2"],
"accountId": 1,
"children": [{
"memoryUsage": 12345,
"runningQueryIds": [“query1"],
"children": [],
"runningQueries": 1,
"queuedQueries": 0,
"maxRunningQueries": 2,
"resourceId": "general"
}, {
"memoryUsage": 26296,
"runningQueryIds": ["query2"],
"children": [],
"runningQueries": 1,
"queuedQueries": 0,
"maxRunningQueries": 2,
"resourceId": "scheduled"
}],
"runningQueries": 2,
"maxRunningQueries": 30,
}
Queries in parent group
Running query in general
Running query in scheduled
Recap
Distributed system often requires each
component to be stable and scalable. We can
make Presto ecosystem reliable by doing…
- Code modification reliability with DI
- VCR testing
- Multi dimensional resource scheduling
- Resource isolation makes multi-tenant
distributed SQL engine reliable
Ad

More Related Content

What's hot (20)

Bullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query EngineBullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query Engine
DataWorks Summit
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talk
kbajda
 
Presto
PrestoPresto
Presto
Knoldus Inc.
 
Presto in my_use_case
Presto in my_use_casePresto in my_use_case
Presto in my_use_case
wyukawa
 
Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.
Wojciech Biela
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1
Sadayuki Furuhashi
 
Presto+MySQLで分散SQL
Presto+MySQLで分散SQLPresto+MySQLで分散SQL
Presto+MySQLで分散SQL
Sadayuki Furuhashi
 
Presto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @FacebookPresto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @Facebook
Treasure Data, Inc.
 
Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015
Taro L. Saito
 
Presto at Hadoop Summit 2016
Presto at Hadoop Summit 2016Presto at Hadoop Summit 2016
Presto at Hadoop Summit 2016
kbajda
 
Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014
Sadayuki Furuhashi
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on Spark
Vinoth Chandar
 
Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container Era
SATOSHI TAGOMORI
 
User Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDBUser Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDB
Kai Sasaki
 
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on Hadoop
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on HadoopBig Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on Hadoop
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on Hadoop
Gruter
 
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Matt Fuller
 
Natural Language Query and Conversational Interface to Apache Spark
Natural Language Query and Conversational Interface to Apache SparkNatural Language Query and Conversational Interface to Apache Spark
Natural Language Query and Conversational Interface to Apache Spark
Databricks
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby Usage
SATOSHI TAGOMORI
 
Presto Meetup (2015-03-19)
Presto Meetup (2015-03-19)Presto Meetup (2015-03-19)
Presto Meetup (2015-03-19)
Dain Sundstrom
 
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLCHBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
Cloudera, Inc.
 
Bullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query EngineBullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query Engine
DataWorks Summit
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talk
kbajda
 
Presto in my_use_case
Presto in my_use_casePresto in my_use_case
Presto in my_use_case
wyukawa
 
Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.
Wojciech Biela
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1
Sadayuki Furuhashi
 
Presto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @FacebookPresto meetup 2015-03-19 @Facebook
Presto meetup 2015-03-19 @Facebook
Treasure Data, Inc.
 
Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015
Taro L. Saito
 
Presto at Hadoop Summit 2016
Presto at Hadoop Summit 2016Presto at Hadoop Summit 2016
Presto at Hadoop Summit 2016
kbajda
 
Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014
Sadayuki Furuhashi
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on Spark
Vinoth Chandar
 
Distributed Logging Architecture in Container Era
Distributed Logging Architecture in Container EraDistributed Logging Architecture in Container Era
Distributed Logging Architecture in Container Era
SATOSHI TAGOMORI
 
User Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDBUser Defined Partitioning on PlazmaDB
User Defined Partitioning on PlazmaDB
Kai Sasaki
 
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on Hadoop
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on HadoopBig Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on Hadoop
Big Data Camp LA 2014 - Apache Tajo: A Big Data Warehouse System on Hadoop
Gruter
 
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Matt Fuller
 
Natural Language Query and Conversational Interface to Apache Spark
Natural Language Query and Conversational Interface to Apache SparkNatural Language Query and Conversational Interface to Apache Spark
Natural Language Query and Conversational Interface to Apache Spark
Databricks
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby Usage
SATOSHI TAGOMORI
 
Presto Meetup (2015-03-19)
Presto Meetup (2015-03-19)Presto Meetup (2015-03-19)
Presto Meetup (2015-03-19)
Dain Sundstrom
 
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLCHBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
Cloudera, Inc.
 

Viewers also liked (9)

Bypassing Web Application Firewalls and other security filters
Bypassing Web Application Firewalls and other security filtersBypassing Web Application Firewalls and other security filters
Bypassing Web Application Firewalls and other security filters
Netsparker
 
Presto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and FuturePresto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and Future
DataWorks Summit
 
Presto: SQL-on-anything
Presto: SQL-on-anythingPresto: SQL-on-anything
Presto: SQL-on-anything
DataWorks Summit
 
Presto - SQL on anything
Presto  - SQL on anythingPresto  - SQL on anything
Presto - SQL on anything
Grzegorz Kokosiński
 
Facebook Presto presentation
Facebook Presto presentationFacebook Presto presentation
Facebook Presto presentation
Cyanny LIANG
 
Presto: Distributed SQL on Anything - Strata Hadoop 2017 San Jose, CA
Presto: Distributed SQL on Anything -  Strata Hadoop 2017 San Jose, CAPresto: Distributed SQL on Anything -  Strata Hadoop 2017 San Jose, CA
Presto: Distributed SQL on Anything - Strata Hadoop 2017 San Jose, CA
kbajda
 
Presto: Distributed sql query engine
Presto: Distributed sql query engine Presto: Distributed sql query engine
Presto: Distributed sql query engine
kiran palaka
 
Optimizing Presto Connector on Cloud Storage
Optimizing Presto Connector on Cloud StorageOptimizing Presto Connector on Cloud Storage
Optimizing Presto Connector on Cloud Storage
Kai Sasaki
 
Hive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkHive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmark
Dongwon Kim
 
Bypassing Web Application Firewalls and other security filters
Bypassing Web Application Firewalls and other security filtersBypassing Web Application Firewalls and other security filters
Bypassing Web Application Firewalls and other security filters
Netsparker
 
Presto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and FuturePresto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and Future
DataWorks Summit
 
Facebook Presto presentation
Facebook Presto presentationFacebook Presto presentation
Facebook Presto presentation
Cyanny LIANG
 
Presto: Distributed SQL on Anything - Strata Hadoop 2017 San Jose, CA
Presto: Distributed SQL on Anything -  Strata Hadoop 2017 San Jose, CAPresto: Distributed SQL on Anything -  Strata Hadoop 2017 San Jose, CA
Presto: Distributed SQL on Anything - Strata Hadoop 2017 San Jose, CA
kbajda
 
Presto: Distributed sql query engine
Presto: Distributed sql query engine Presto: Distributed sql query engine
Presto: Distributed sql query engine
kiran palaka
 
Optimizing Presto Connector on Cloud Storage
Optimizing Presto Connector on Cloud StorageOptimizing Presto Connector on Cloud Storage
Optimizing Presto Connector on Cloud Storage
Kai Sasaki
 
Hive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkHive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmark
Dongwon Kim
 
Ad

Similar to How to ensure Presto scalability 
in multi use case (20)

GraphConnect 2014 SF: From Zero to Graph in 120: Scale
GraphConnect 2014 SF: From Zero to Graph in 120: ScaleGraphConnect 2014 SF: From Zero to Graph in 120: Scale
GraphConnect 2014 SF: From Zero to Graph in 120: Scale
Neo4j
 
AtlasCamp 2014: Building a Production Ready Connect Add-on
AtlasCamp 2014: Building a Production Ready Connect Add-onAtlasCamp 2014: Building a Production Ready Connect Add-on
AtlasCamp 2014: Building a Production Ready Connect Add-on
Atlassian
 
20151010 my sq-landjavav2a
20151010 my sq-landjavav2a20151010 my sq-landjavav2a
20151010 my sq-landjavav2a
Ivan Ma
 
Delivering High Performance Ecommerce with Magento Commerce Cloud
Delivering High Performance Ecommerce with Magento Commerce CloudDelivering High Performance Ecommerce with Magento Commerce Cloud
Delivering High Performance Ecommerce with Magento Commerce Cloud
Guncha Pental
 
AtlasCamp 2014: Building a Production Ready Connect Add-On
AtlasCamp 2014: Building a Production Ready Connect Add-OnAtlasCamp 2014: Building a Production Ready Connect Add-On
AtlasCamp 2014: Building a Production Ready Connect Add-On
Robin Fernandes
 
Assurer - a pluggable server testing/monitoring framework
Assurer - a pluggable server testing/monitoring frameworkAssurer - a pluggable server testing/monitoring framework
Assurer - a pluggable server testing/monitoring framework
Gosuke Miyashita
 
Microsoft Windows Server AppFabric
Microsoft Windows Server AppFabricMicrosoft Windows Server AppFabric
Microsoft Windows Server AppFabric
Mark Ginnebaugh
 
Behavior Driven Development and Automation Testing Using Cucumber
Behavior Driven Development and Automation Testing Using CucumberBehavior Driven Development and Automation Testing Using Cucumber
Behavior Driven Development and Automation Testing Using Cucumber
KMS Technology
 
Altitude San Francisco 2018: Testing with Fastly Workshop
Altitude San Francisco 2018: Testing with Fastly WorkshopAltitude San Francisco 2018: Testing with Fastly Workshop
Altitude San Francisco 2018: Testing with Fastly Workshop
Fastly
 
GE Predix 新手入门 赵锴 物联网_IoT
GE Predix 新手入门 赵锴 物联网_IoTGE Predix 新手入门 赵锴 物联网_IoT
GE Predix 新手入门 赵锴 物联网_IoT
Kai Zhao
 
Dev309 from asgard to zuul - netflix oss-final
Dev309  from asgard to zuul - netflix oss-finalDev309  from asgard to zuul - netflix oss-final
Dev309 from asgard to zuul - netflix oss-final
Ruslan Meshenberg
 
Structure your Play application with the cake pattern (and test it)
Structure your Play application with the cake pattern (and test it)Structure your Play application with the cake pattern (and test it)
Structure your Play application with the cake pattern (and test it)
yann_s
 
Play Framework: async I/O with Java and Scala
Play Framework: async I/O with Java and ScalaPlay Framework: async I/O with Java and Scala
Play Framework: async I/O with Java and Scala
Yevgeniy Brikman
 
Build A Killer Client For Your REST+JSON API
Build A Killer Client For Your REST+JSON APIBuild A Killer Client For Your REST+JSON API
Build A Killer Client For Your REST+JSON API
Stormpath
 
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
Chris Fregly
 
Using Istio to Secure & Monitor Your Services
Using Istio to Secure & Monitor Your ServicesUsing Istio to Secure & Monitor Your Services
Using Istio to Secure & Monitor Your Services
Alcide
 
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Rackspace Academy
 
High-Performance Hibernate - JDK.io 2018
High-Performance Hibernate - JDK.io 2018High-Performance Hibernate - JDK.io 2018
High-Performance Hibernate - JDK.io 2018
Vlad Mihalcea
 
Where is my cache architectural patterns for caching microservices by example
Where is my cache architectural patterns for caching microservices by exampleWhere is my cache architectural patterns for caching microservices by example
Where is my cache architectural patterns for caching microservices by example
Rafał Leszko
 
Cannibalising The Google App Engine
Cannibalising The  Google  App  EngineCannibalising The  Google  App  Engine
Cannibalising The Google App Engine
catherinewall
 
GraphConnect 2014 SF: From Zero to Graph in 120: Scale
GraphConnect 2014 SF: From Zero to Graph in 120: ScaleGraphConnect 2014 SF: From Zero to Graph in 120: Scale
GraphConnect 2014 SF: From Zero to Graph in 120: Scale
Neo4j
 
AtlasCamp 2014: Building a Production Ready Connect Add-on
AtlasCamp 2014: Building a Production Ready Connect Add-onAtlasCamp 2014: Building a Production Ready Connect Add-on
AtlasCamp 2014: Building a Production Ready Connect Add-on
Atlassian
 
20151010 my sq-landjavav2a
20151010 my sq-landjavav2a20151010 my sq-landjavav2a
20151010 my sq-landjavav2a
Ivan Ma
 
Delivering High Performance Ecommerce with Magento Commerce Cloud
Delivering High Performance Ecommerce with Magento Commerce CloudDelivering High Performance Ecommerce with Magento Commerce Cloud
Delivering High Performance Ecommerce with Magento Commerce Cloud
Guncha Pental
 
AtlasCamp 2014: Building a Production Ready Connect Add-On
AtlasCamp 2014: Building a Production Ready Connect Add-OnAtlasCamp 2014: Building a Production Ready Connect Add-On
AtlasCamp 2014: Building a Production Ready Connect Add-On
Robin Fernandes
 
Assurer - a pluggable server testing/monitoring framework
Assurer - a pluggable server testing/monitoring frameworkAssurer - a pluggable server testing/monitoring framework
Assurer - a pluggable server testing/monitoring framework
Gosuke Miyashita
 
Microsoft Windows Server AppFabric
Microsoft Windows Server AppFabricMicrosoft Windows Server AppFabric
Microsoft Windows Server AppFabric
Mark Ginnebaugh
 
Behavior Driven Development and Automation Testing Using Cucumber
Behavior Driven Development and Automation Testing Using CucumberBehavior Driven Development and Automation Testing Using Cucumber
Behavior Driven Development and Automation Testing Using Cucumber
KMS Technology
 
Altitude San Francisco 2018: Testing with Fastly Workshop
Altitude San Francisco 2018: Testing with Fastly WorkshopAltitude San Francisco 2018: Testing with Fastly Workshop
Altitude San Francisco 2018: Testing with Fastly Workshop
Fastly
 
GE Predix 新手入门 赵锴 物联网_IoT
GE Predix 新手入门 赵锴 物联网_IoTGE Predix 新手入门 赵锴 物联网_IoT
GE Predix 新手入门 赵锴 物联网_IoT
Kai Zhao
 
Dev309 from asgard to zuul - netflix oss-final
Dev309  from asgard to zuul - netflix oss-finalDev309  from asgard to zuul - netflix oss-final
Dev309 from asgard to zuul - netflix oss-final
Ruslan Meshenberg
 
Structure your Play application with the cake pattern (and test it)
Structure your Play application with the cake pattern (and test it)Structure your Play application with the cake pattern (and test it)
Structure your Play application with the cake pattern (and test it)
yann_s
 
Play Framework: async I/O with Java and Scala
Play Framework: async I/O with Java and ScalaPlay Framework: async I/O with Java and Scala
Play Framework: async I/O with Java and Scala
Yevgeniy Brikman
 
Build A Killer Client For Your REST+JSON API
Build A Killer Client For Your REST+JSON APIBuild A Killer Client For Your REST+JSON API
Build A Killer Client For Your REST+JSON API
Stormpath
 
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
Chris Fregly
 
Using Istio to Secure & Monitor Your Services
Using Istio to Secure & Monitor Your ServicesUsing Istio to Secure & Monitor Your Services
Using Istio to Secure & Monitor Your Services
Alcide
 
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Rackspace Academy
 
High-Performance Hibernate - JDK.io 2018
High-Performance Hibernate - JDK.io 2018High-Performance Hibernate - JDK.io 2018
High-Performance Hibernate - JDK.io 2018
Vlad Mihalcea
 
Where is my cache architectural patterns for caching microservices by example
Where is my cache architectural patterns for caching microservices by exampleWhere is my cache architectural patterns for caching microservices by example
Where is my cache architectural patterns for caching microservices by example
Rafał Leszko
 
Cannibalising The Google App Engine
Cannibalising The  Google  App  EngineCannibalising The  Google  App  Engine
Cannibalising The Google App Engine
catherinewall
 
Ad

More from Kai Sasaki (20)

Graviton 2で実現する
コスト効率のよいCDP基盤
Graviton 2で実現する
コスト効率のよいCDP基盤Graviton 2で実現する
コスト効率のよいCDP基盤
Graviton 2で実現する
コスト効率のよいCDP基盤
Kai Sasaki
 
Infrastructure for auto scaling distributed system
Infrastructure for auto scaling distributed systemInfrastructure for auto scaling distributed system
Infrastructure for auto scaling distributed system
Kai Sasaki
 
Continuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData AnalysisContinuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData Analysis
Kai Sasaki
 
Recent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future PrestoRecent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future Presto
Kai Sasaki
 
Real World Storage in Treasure Data
Real World Storage in Treasure DataReal World Storage in Treasure Data
Real World Storage in Treasure Data
Kai Sasaki
 
20180522 infra autoscaling_system
20180522 infra autoscaling_system20180522 infra autoscaling_system
20180522 infra autoscaling_system
Kai Sasaki
 
Deep dive into deeplearn.js
Deep dive into deeplearn.jsDeep dive into deeplearn.js
Deep dive into deeplearn.js
Kai Sasaki
 
Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0
Kai Sasaki
 
Embulk makes Japan visible
Embulk makes Japan visibleEmbulk makes Japan visible
Embulk makes Japan visible
Kai Sasaki
 
Maintainable cloud architecture_of_hadoop
Maintainable cloud architecture_of_hadoopMaintainable cloud architecture_of_hadoop
Maintainable cloud architecture_of_hadoop
Kai Sasaki
 
図でわかるHDFS Erasure Coding
図でわかるHDFS Erasure Coding図でわかるHDFS Erasure Coding
図でわかるHDFS Erasure Coding
Kai Sasaki
 
Spark MLlib code reading ~optimization~
Spark MLlib code reading ~optimization~Spark MLlib code reading ~optimization~
Spark MLlib code reading ~optimization~
Kai Sasaki
 
How I tried MADE
How I tried MADEHow I tried MADE
How I tried MADE
Kai Sasaki
 
Reading kernel org
Reading kernel orgReading kernel org
Reading kernel org
Kai Sasaki
 
Reading drill
Reading drillReading drill
Reading drill
Kai Sasaki
 
Kernel ext4
Kernel ext4Kernel ext4
Kernel ext4
Kai Sasaki
 
Kernel bootstrap
Kernel bootstrapKernel bootstrap
Kernel bootstrap
Kai Sasaki
 
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案
Kai Sasaki
 
Kernel resource
Kernel resourceKernel resource
Kernel resource
Kai Sasaki
 
Kernel overview
Kernel overviewKernel overview
Kernel overview
Kai Sasaki
 
Graviton 2で実現する
コスト効率のよいCDP基盤
Graviton 2で実現する
コスト効率のよいCDP基盤Graviton 2で実現する
コスト効率のよいCDP基盤
Graviton 2で実現する
コスト効率のよいCDP基盤
Kai Sasaki
 
Infrastructure for auto scaling distributed system
Infrastructure for auto scaling distributed systemInfrastructure for auto scaling distributed system
Infrastructure for auto scaling distributed system
Kai Sasaki
 
Continuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData AnalysisContinuous Optimization for Distributed BigData Analysis
Continuous Optimization for Distributed BigData Analysis
Kai Sasaki
 
Recent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future PrestoRecent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future Presto
Kai Sasaki
 
Real World Storage in Treasure Data
Real World Storage in Treasure DataReal World Storage in Treasure Data
Real World Storage in Treasure Data
Kai Sasaki
 
20180522 infra autoscaling_system
20180522 infra autoscaling_system20180522 infra autoscaling_system
20180522 infra autoscaling_system
Kai Sasaki
 
Deep dive into deeplearn.js
Deep dive into deeplearn.jsDeep dive into deeplearn.js
Deep dive into deeplearn.js
Kai Sasaki
 
Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0
Kai Sasaki
 
Embulk makes Japan visible
Embulk makes Japan visibleEmbulk makes Japan visible
Embulk makes Japan visible
Kai Sasaki
 
Maintainable cloud architecture_of_hadoop
Maintainable cloud architecture_of_hadoopMaintainable cloud architecture_of_hadoop
Maintainable cloud architecture_of_hadoop
Kai Sasaki
 
図でわかるHDFS Erasure Coding
図でわかるHDFS Erasure Coding図でわかるHDFS Erasure Coding
図でわかるHDFS Erasure Coding
Kai Sasaki
 
Spark MLlib code reading ~optimization~
Spark MLlib code reading ~optimization~Spark MLlib code reading ~optimization~
Spark MLlib code reading ~optimization~
Kai Sasaki
 
How I tried MADE
How I tried MADEHow I tried MADE
How I tried MADE
Kai Sasaki
 
Reading kernel org
Reading kernel orgReading kernel org
Reading kernel org
Kai Sasaki
 
Kernel bootstrap
Kernel bootstrapKernel bootstrap
Kernel bootstrap
Kai Sasaki
 
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案
HyperLogLogを用いた、異なり数に基づく
 省リソースなk-meansの
k決定アルゴリズムの提案
Kai Sasaki
 
Kernel resource
Kernel resourceKernel resource
Kernel resource
Kai Sasaki
 
Kernel overview
Kernel overviewKernel overview
Kernel overview
Kai Sasaki
 

Recently uploaded (20)

How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
Egor Kaleynik
 
Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...
Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...
Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...
Ranjan Baisak
 
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& ConsiderationsDesigning AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Dinusha Kumarasiri
 
Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025
mu394968
 
Not So Common Memory Leaks in Java Webinar
Not So Common Memory Leaks in Java WebinarNot So Common Memory Leaks in Java Webinar
Not So Common Memory Leaks in Java Webinar
Tier1 app
 
Maxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINKMaxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINK
younisnoman75
 
Adobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest VersionAdobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest Version
kashifyounis067
 
Kubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptxKubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptx
CloudScouts
 
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
F-Secure Freedome VPN 2025 Crack Plus Activation  New VersionF-Secure Freedome VPN 2025 Crack Plus Activation  New Version
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
saimabibi60507
 
Societal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainabilitySocietal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainability
Jordi Cabot
 
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Eric D. Schabell
 
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRYLEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
NidaFarooq10
 
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Orangescrum
 
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software DevelopmentSecure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Shubham Joshi
 
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AIScaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
danshalev
 
How to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud PerformanceHow to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud Performance
ThousandEyes
 
Douwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License codeDouwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License code
aneelaramzan63
 
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
AxisTechnolabs
 
The Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdfThe Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdf
drewplanas10
 
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
Andre Hora
 
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
Egor Kaleynik
 
Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...
Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...
Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...
Ranjan Baisak
 
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& ConsiderationsDesigning AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Dinusha Kumarasiri
 
Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025
mu394968
 
Not So Common Memory Leaks in Java Webinar
Not So Common Memory Leaks in Java WebinarNot So Common Memory Leaks in Java Webinar
Not So Common Memory Leaks in Java Webinar
Tier1 app
 
Maxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINKMaxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINK
younisnoman75
 
Adobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest VersionAdobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest Version
kashifyounis067
 
Kubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptxKubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptx
CloudScouts
 
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
F-Secure Freedome VPN 2025 Crack Plus Activation  New VersionF-Secure Freedome VPN 2025 Crack Plus Activation  New Version
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
saimabibi60507
 
Societal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainabilitySocietal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainability
Jordi Cabot
 
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Eric D. Schabell
 
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRYLEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
NidaFarooq10
 
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Orangescrum
 
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software DevelopmentSecure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Shubham Joshi
 
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AIScaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
danshalev
 
How to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud PerformanceHow to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud Performance
ThousandEyes
 
Douwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License codeDouwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License code
aneelaramzan63
 
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
AxisTechnolabs
 
The Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdfThe Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdf
drewplanas10
 
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
Andre Hora
 

How to ensure Presto scalability 
in multi use case

  • 1. How to ensure Presto scalability in multi use case Kai Sasaki Treasure Data Inc.
  • 2. Kai Sasaki (@Lewuathe) Software Engineer at Treasure Data Inc. Hadoop/Presto/Spark
  • 3. Presto In TD • 150000+ queries / day • 190+ TB processing / day • 10+ MB processing / query * sec • 100+ million processed records / query
  • 5. How to make it scalable • Prestobase Proxy • Node scheduler • Resource Group
  • 7. Prestobase proxy Prestobase proxy aims to provide the interface especially for BI tools through JDBC/ODBC and also to replace Prestogres.
  • 9. Prestobase proxy • Written in Scala • Finagle base RPC proxy • Running as Docker container • A user of Airframe • VCR base light-weight test framework
  • 10. Finagle Finagle is an extensible RPC system for the JVM, used to construct high-concurrency servers. Finagle implements uniform client and server APIs for several protocols, and is designed for high performance and concurrency. see: https://twitter.github.io/finagle/
  • 11. Finagle protected val service: Service[Request, Response] = bind[SomeFilter] andThen bind[AnotherHandler] andThen LastFilter andThen prestoClient Build request pipeline by binding filter, handlers with Airframe
  • 12. Airframe Airframe is a trait base dependency injection framework using Scala macro - https://github.com/wvlet/airframe
  • 13. Airframe - Dependency injection tailored Scala - Tagged binding with wvlet https://github.com/wvlet/wvlet - Object lifecycle management
  • 14. Airframe val design : Design = newDesign .bind[X].toInstance(new X) // Bind type X to a concrete instance .bind[Y].toSingleton // Bind type Y to a singleton object .bind[Z].to[ZImpl] // Bind type Z to an instance of ZImpl import wvlet.airframe._ trait App { val x = bind[X] val y = bind[Y] val z = bind[Z] // Do something with X, Y, and Z } val session = design.newSession val app : App = session.build[App]
  • 15. VCR testing framework Record test suite HTTP interaction to make test stable and deterministic see more detail https://testing.googleblog.com/2016/11/what-test-engineers-do-at-google.html
  • 16. VCR testing framework protected val service: Service[Request, Response] = bind[SomeFilter] andThen bind[AnotherHandler] andThen QueryRewriter andThen bind[RequestVCR] andThen prestClient protected val service: Service[Request, Response] = bind[SomeFilter] andThen bind[AnotherHandler] andThen QueryRewriter andThen bind[NoRecording] andThen prestClient On CI On Production
  • 19. Prestobase proxy Will be open sourced soon
  • 21. Node Scheduler Submitting query follows… - Analyze query AST - Make query logical/physical plan - Schedule each stage
  • 22. Node Scheduler query stage2 stage1 stage0 task2-0 task2-1 task2-0 task1-0 task1-1 task0-0 Table Scan output
  • 23. Node Scheduler NodeScheduler creates NodeSelector that selects worker nodes on which tasks are scheduled. NodeSelector picks up worker nodes when there is available splits.
  • 24. Node Scheduler in TD Keeps worker node map that can be candidate for launching next tasks. - Ignore min candidates - Limit by available memory pool
  • 25. Node Scheduler in TD Back to normal memory pool usage after task is completed.
  • 26. Node Scheduler in TD Challenges - Smoothing CPU time metric - Split type awareness - Avoid problematic worker nodes
  • 28. Resource Group Resource Group was introduced since 0.147 → https://prestodb.io/docs/current/admin/resource-groups.html Resource Group aims to limit the resource usage by account/group/query.
  • 29. Resource Group rootGroup general adhoc softMemoryLimit: 100% maxQueued : 5000 maxRunning : 1000 softMemoryLimit: 100% maxQueued : 100 maxRunning : 200 softMemoryLimit: 100% maxRunning : 1000
  • 30. Resource Group limits - maxQueued - maxRunning - softMemoryLimit Following queries will be queued - softCpuLimit Impose penalty against max running queries - hardCpuLimit Following queries will be queued
  • 31. Resource Group scheduling - schedulingPolicy - fair : FIFO - weighted : Selected stochastically - query_priority : Selected according to priority - schedulingWeight
  • 32. Resource Group Every query must be associated to a resource group. The matching can be done by configured selector. { "user": “bob", "group": "general" }, { "source": “.*adhoc.*", "group": "global.adhoc.adhoc_${USER}" }
  • 33. Resource Group rootGroup general adhoc softMemoryLimit: 100% maxQueued : 5000 maxRunning : 1000 softMemoryLimit: 100% maxQueued : 100 maxRunning : 200 softMemoryLimit: 100% maxRunning : 1000 Bob’s query Bob’s query …
  • 34. Resource Group DI Easily change resource group config behavior with Guice injection. - ResourceGroupConfigurationManager - configure(ResourceGroup, SelectionContext) - ResourceGroupSelector - match(Statement, SelectionContext)
  • 35. SelectionContext SelectionContext holds the information for associating submitted query. - Authenticated - User - Source - Query Priority Currently available as default
  • 36. { "runningQueryIds": ["query1", "query2"], "accountId": 1, "children": [{ "memoryUsage": 12345, "runningQueryIds": [“query1"], "children": [], "runningQueries": 1, "queuedQueries": 0, "maxRunningQueries": 2, "resourceId": "general" }, { "memoryUsage": 26296, "runningQueryIds": ["query2"], "children": [], "runningQueries": 1, "queuedQueries": 0, "maxRunningQueries": 2, "resourceId": "scheduled" }], "runningQueries": 2, "maxRunningQueries": 30, } Queries in parent group Running query in general Running query in scheduled
  • 37. Recap Distributed system often requires each component to be stable and scalable. We can make Presto ecosystem reliable by doing… - Code modification reliability with DI - VCR testing - Multi dimensional resource scheduling - Resource isolation makes multi-tenant distributed SQL engine reliable