SlideShare a Scribd company logo
Enable SQL/JDBC Access to Apache
Geode/GemFire Using Apache Calcite
By Christian Tzolov
@christzolov
1
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Christian Tzolov
Spring Software Engineer at Pivotal
Spring Cloud Data Flow
Apache Committer, Crunch PMC member
Apache {Crunch, HAWQ, Zeppelin, Calcite...}
2
blog.tzolov.net
@christzolov
@tzolov
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
“What happens if an established NoSQL database
decides to implement a reasonably standard SQL;
The only predictable outcome for such an
eventuality is plenty of argument.”
Martin Fowler, P.J.Sadalage, NoSQL Distilled, 2012
3
Disclaimer
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Safe Harbor Statement
The following is intended to outline the general direction of Pivotal's offerings. It is
intended for information purposes only and may not be incorporated into any
contract. Any information regarding pre-release of Pivotal offerings, future updates
or other planned modifications is subject to ongoing evaluation by Pivotal and is
subject to change. This information is provided without warranty or any kind,
express or implied, and is not a commitment to deliver any material, code, or
functionality, and should not be relied upon in making purchasing decisions
regarding Pivotal's offerings. These purchasing decisions should only be based on
features currently available. The development, release, and timing of any features
or functionality described for Pivotal's offerings in this presentation remain at the
sole discretion of Pivotal. Pivotal has no obligation to update forward looking
information in this presentation.
5
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Demo?
In a minute ;)
6
Data Management Systems
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
8
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
One Size Fit All*
“The traditional DBMS architecture has been used to support many
data-centric applications with widely varying characteristics and
requirements… We argue that this concept is no longer applicable to the
database market, and that the commercial world will fracture into a
collection of independent database engines, some of which may be
unified by a common front-end parser.”
Michael Stonebraker, 2005
9
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Collection of Independent Databases
10
How to ingest and keep
consistent across DBs?
How to access the data
across multiple DBs?
Data Federation
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Federation: Unified Query Interface
12
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Federation: Two Mapping Tasks
13
Federation: Data Model Mapping
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Geode Model Mapping
• PDX serialization
• Primitive Types
• Arrays (employ[0])
• Nested Objects
(parent[‘child.childChild…’])
15
Cache
Region 1
Region K
ValKey
v1k1
v2k2
Schema
Table 1
Table K
C
o
l
1
C
o
l
2
C
o
l
N
V(M,1)
R
o
w
M
V(M,2)
V(M,N
)
V(2,1)
R
o
w
2
V(2,2) V(2,N)
V(1,1)
R
o
w
1
V(1,2) V(1,N)
2. Regions into Tables
1. Cache into Schema
3. Key/Value into
Table Row
4. Column Types inferred
from PDX fields
Federation: Query Mapping
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
SQL for NoSQL?
• Extended Relational Algebra - already present in most NoSql data
• Relational Expression Optimization – Desirable but hard to implement
17
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Query = Relational Algebra
18
SELECT b.totalPrice, c.firstName
FROM BookOrder AS b
INNER JOIN BookCustomer AS c ON b.customerNumber = c.customerNumber
WHERE b.totalPrice > 0
…
}
Scan Scan
Join
Filter
Project
BookCustomer [c] BookOrder [b]
(on customerNumber)
(b.totalPrice > 0)
(c.firstName, b.totalPrice)
Scan Scan
Join
Project
BookCustomer [c] BookOrder [b]
(on
customerNumber)
(totalPrice > 0)
(c.firstName, b.totalPrice)
Project
(firstName,
customerNumber)
Filter
(totalPrice,
customerNumber)Project
<<Optimization>>
<<SQLParsing>>
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
“Native” Geode Operations
19
Relational Operation Supported on Geode
SORT/LIMIT YES (no Offset)
PROJECT YES
FILTER YES
AGGREGATE
SUM, AVG, MAX, MIN,
COUNT, GROUP BY,
DISTINCT
JOIN NO
SQL Adapter Framework
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Apache Calcite
• Java Framework
• SQL Query Parser, Validator and Optimizer(s)
• JDBC drivers - local and remote
• Agnostic to data storage and processing
• Powered by Calcite:
21
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Apache Geode Adapter
22
Geode API and OQL
SQL/JDBC
Convert relational expressions into OQL queriesGeode Adapter
(Geode Client)
Geode ServerGeode ServerGeode Server
Data Data Data
Push down all supported expressions to Geode OQL
and fall back to Calcite Enumerable for the rest
Enumerable
Adapter
Apache Calcite Parse SQL into relational expression and optimizes
Calcite-Geode-Embedded(e.g.JDBC.jar)
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Apache Geode Concepts
Cache Server
Cache
Region 1
Region N
ValKey
v1k1
v2k2
Cache - In-memory collection of
Regions
Region - consistent, distributed Map
(key-value),
Partitioned or Replicated
CacheServer – process
connected to the distributed
system with created Cache
ClientLocator
Client –read and modify the content
of the distributed system
Locator – tracks system members and
provides membership information
…
Listeners
FunctionsFunctions – distributed,
concurrent data processing
Listener – event handler.
Registers for one or more events
and notified when they occur
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Key/Value vs. OQL
“to get extreme scale we use GemFire for what it does best: key-value
storage. Then, when we can’t design our way to using what GemFire does
best, we can use OQL and treat it like an object data-base”
Mike Stolz, “Scaling Data Services with Pivotal GemFire®”, 2018
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Query Execution Flow
25
2. Parse SQL, convert to rel. expressions. Validate
and Optimize them
3. Start building a physical plan from the relation
expressions
4. Implement the Geode relations and encode
them as Expression tree
5. Pass the Expression tree to the Interpreter to
generate Java code
6. Generate and Compile a Binder instance that
on ‘bind()’ call runs Geodes’ query method
1. On new SQL query JDBC delegates to Prepare
to prepare the query execution
7. JDBC uses the newly compiled Binder to
perform the query on the Geode Cluster
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Query Mapping - Example
26
SELECT author
FROM BookMaster
WHERE retailCost > 0
LIMIT 1
…
}
SELECT author
FROM BookMaster
WHERE retailCost > 0
LIMIT 1 OFFSET 1
…
}
Demo
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
DB Tools Integration
28
Access Apache Geode /
GemFire over SQL/JDBC
Explore Geode & GemFire
Data with IntelliJ
SQL/Database tool
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Analytics
29
Advanced Apache
Geode Data Analytics
with Apache Zeppelin
over SQL/JDBC
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Data Federation
30
{Geode|Greenplum|...} Data Federation
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Spatial (OpenGIS /PostGIS)
31
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
SQL Stream (in progress...)
32
https://calcite.apache.org/docs/stream.html
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
SQL Stream (2)
33
● How many rows come out for each row in?
● Does each incoming value appear in one total, or more?
● What defines the “window”, the set of rows that contribute to a given output row?
● Is the result a stream or a relation?
● tumbling window (GROUP BY)
● hopping window (multi GROUP BY)
● sliding window (window functions)
● cascading window (window
functions)
Usability?
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Pros & Cons
• Ad-hoc data exploration
• JDBC integration with 3rd
party
tools
• Data Federation, correlate Geode
with other data sources
• SQL Streaming as CQ++
• No-intrusive and extensible
approach
• What happened with SQLFire?
• Geode == Transactional System!
SQL+Geode <> Analytical System!
• Key/Value vs. Full Scan
• Overhead: SQL > OQL > Functions
• Data at Rest (Table) vs. Data at
Motion (Stream)
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Future work
• Add Geode Adapter to Calcite Project: [CALCITE-2059] – VOTE! ;)
• Table Materialization
• SQL Streaming (https://calcite.apache.org/docs/stream.html)
• Pre-defined vs. Inferred Schema
• “Native” JOIN operation implementation
• Geode Indexes and Query statistics as Calcite metadata. Hint Indexes
• Mixing Key/Value and OQL data access
• Explore Geode functions as an alternative to OQL
• Explore TX support
• Performance Benchmark
• Calcite Spatial (https://calcite.apache.org/docs/spatial.html)
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Credits
to Dan Baskette for inspiring this work
and to Roman Shaposhnik for helping
spread the word
37
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
References
38
[1 ] Apache Geode: http://geode.apache.org
[2] Geode Object Query Language (OQL): http://bit.ly/2BfXmNR
[3] Apache Calcite: https://calcite.apache.org
[4] Calcite - supported SQL language: https://calcite.apache.org/docs/reference.html
[5] Apache Geode Adapter for Apache Calcite: https://github.com/tzolov/calcite/tree/geode-1.3
[6] Relational Algebra Operations: http://bit.ly/2zX8cMc
[7] Apache Phoenix - http://phoenix.apache.org
[8] Big Data is Four Different Problems, 2016, M.Stonebraker (video): http://bit.ly/2jDmKpq
[9] Architecture of a Database System, 2007 (J.M. Hellerstein, M. Stonebraker, J. Hamilton):
http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf
[10] One Size Fits All (Michael Stonebraker): http://ieeexplore.ieee.org/document/1410100/?denied
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons
Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Bonus: Geode Adapter Implementation
39
Common interface for all MyAdapter Relation
Expressions. Provides implementation callback
method called as part of physical plan
implementation
…
…
<<instance of>>
<<create>>
Can convert
queries in
Expression
myQuery() implements the call to your DB
It is called by the auto generated code. It must
return an Enumberable instance
Registers all MyAdapter Rules
<<create>>
operands: (RelNode.class,
MyAdapterConvention, EnumerableConvention)
…
<<create on match >>
<<create on match >>
Recursively call the implement on each MyAdapter
Relation Expression
Encode the myQuery(params) call as
Expressions
Learn More. Stay Connected.
Vote for [CALCITE-2059] and don’t miss:
Simplifying Apache Geode with Spring Data
Exploring Data-Driven, Cognitive Capabilities in Pivotal Cloud Foundry
Orchestrating Data Microservices with Spring Cloud Data Flow
40
#springone@s1p
Ad

More Related Content

Similar to Enable SQL/JDBC Access to Apache Geode/GemFire Using Apache Calcite (20)

SpringOnePlatform2017 recap
SpringOnePlatform2017 recapSpringOnePlatform2017 recap
SpringOnePlatform2017 recap
minseok kim
 
Cloud-Native Streaming and Event-Driven Microservices
Cloud-Native Streaming and Event-Driven MicroservicesCloud-Native Streaming and Event-Driven Microservices
Cloud-Native Streaming and Event-Driven Microservices
VMware Tanzu
 
Under the Hood of Reactive Data Access (2/2)
Under the Hood of Reactive Data Access (2/2)Under the Hood of Reactive Data Access (2/2)
Under the Hood of Reactive Data Access (2/2)
VMware Tanzu
 
Springone2gx 2015 Cassandra and Grails
Springone2gx 2015 Cassandra and GrailsSpringone2gx 2015 Cassandra and Grails
Springone2gx 2015 Cassandra and Grails
Jeff Beck
 
Quickly Build Spring Boot Applications to Consume Public Cloud Services
Quickly Build Spring Boot Applications to Consume Public Cloud ServicesQuickly Build Spring Boot Applications to Consume Public Cloud Services
Quickly Build Spring Boot Applications to Consume Public Cloud Services
VMware Tanzu
 
Caching for Microservives - Introduction to Pivotal Cloud Cache
Caching for Microservives - Introduction to Pivotal Cloud CacheCaching for Microservives - Introduction to Pivotal Cloud Cache
Caching for Microservives - Introduction to Pivotal Cloud Cache
VMware Tanzu
 
Consumer Driven Contracts and Your Microservice Architecture
Consumer Driven Contracts and Your Microservice ArchitectureConsumer Driven Contracts and Your Microservice Architecture
Consumer Driven Contracts and Your Microservice Architecture
Marcin Grzejszczak
 
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...
VMware Tanzu
 
YugaByte DB—A Planet-Scale Database for Low Latency Transactional Apps
YugaByte DB—A Planet-Scale Database for Low Latency Transactional AppsYugaByte DB—A Planet-Scale Database for Low Latency Transactional Apps
YugaByte DB—A Planet-Scale Database for Low Latency Transactional Apps
VMware Tanzu
 
Core Spring + Reactive 김민석
Core Spring + Reactive  김민석Core Spring + Reactive  김민석
Core Spring + Reactive 김민석
VMware Tanzu Korea
 
Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...
Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...
Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...
confluent
 
Consumer Driven Contracts and Your Microservice Architecture
Consumer Driven Contracts and Your Microservice ArchitectureConsumer Driven Contracts and Your Microservice Architecture
Consumer Driven Contracts and Your Microservice Architecture
VMware Tanzu
 
Lattice: A Cloud-Native Platform for Your Spring Applications
Lattice: A Cloud-Native Platform for Your Spring ApplicationsLattice: A Cloud-Native Platform for Your Spring Applications
Lattice: A Cloud-Native Platform for Your Spring Applications
Matt Stine
 
Tools to Slay the Fire Breathing Monoliths in Your Enterprise
Tools to Slay the Fire Breathing Monoliths in Your EnterpriseTools to Slay the Fire Breathing Monoliths in Your Enterprise
Tools to Slay the Fire Breathing Monoliths in Your Enterprise
VMware Tanzu
 
The Beginner’s Guide To Spring Cloud
The Beginner’s Guide To Spring CloudThe Beginner’s Guide To Spring Cloud
The Beginner’s Guide To Spring Cloud
VMware Tanzu
 
Building Highly Scalable Spring Applications using In-Memory Data Grids
Building Highly Scalable Spring Applications using In-Memory Data GridsBuilding Highly Scalable Spring Applications using In-Memory Data Grids
Building Highly Scalable Spring Applications using In-Memory Data Grids
John Blum
 
Kubernetes for the Spring Developer
Kubernetes for the Spring DeveloperKubernetes for the Spring Developer
Kubernetes for the Spring Developer
VMware Tanzu
 
P to V to C: The Value of Bringing “Everything” to Containers
P to V to C: The Value of Bringing “Everything” to ContainersP to V to C: The Value of Bringing “Everything” to Containers
P to V to C: The Value of Bringing “Everything” to Containers
VMware Tanzu
 
Federated Queries with HAWQ - SQL on Hadoop and Beyond
Federated Queries with HAWQ - SQL on Hadoop and BeyondFederated Queries with HAWQ - SQL on Hadoop and Beyond
Federated Queries with HAWQ - SQL on Hadoop and Beyond
Christian Tzolov
 
Marcin Grzejszczak - Contract Tests in the Enterprise
Marcin Grzejszczak - Contract Tests in the EnterpriseMarcin Grzejszczak - Contract Tests in the Enterprise
Marcin Grzejszczak - Contract Tests in the Enterprise
SegFaultConf
 
SpringOnePlatform2017 recap
SpringOnePlatform2017 recapSpringOnePlatform2017 recap
SpringOnePlatform2017 recap
minseok kim
 
Cloud-Native Streaming and Event-Driven Microservices
Cloud-Native Streaming and Event-Driven MicroservicesCloud-Native Streaming and Event-Driven Microservices
Cloud-Native Streaming and Event-Driven Microservices
VMware Tanzu
 
Under the Hood of Reactive Data Access (2/2)
Under the Hood of Reactive Data Access (2/2)Under the Hood of Reactive Data Access (2/2)
Under the Hood of Reactive Data Access (2/2)
VMware Tanzu
 
Springone2gx 2015 Cassandra and Grails
Springone2gx 2015 Cassandra and GrailsSpringone2gx 2015 Cassandra and Grails
Springone2gx 2015 Cassandra and Grails
Jeff Beck
 
Quickly Build Spring Boot Applications to Consume Public Cloud Services
Quickly Build Spring Boot Applications to Consume Public Cloud ServicesQuickly Build Spring Boot Applications to Consume Public Cloud Services
Quickly Build Spring Boot Applications to Consume Public Cloud Services
VMware Tanzu
 
Caching for Microservives - Introduction to Pivotal Cloud Cache
Caching for Microservives - Introduction to Pivotal Cloud CacheCaching for Microservives - Introduction to Pivotal Cloud Cache
Caching for Microservives - Introduction to Pivotal Cloud Cache
VMware Tanzu
 
Consumer Driven Contracts and Your Microservice Architecture
Consumer Driven Contracts and Your Microservice ArchitectureConsumer Driven Contracts and Your Microservice Architecture
Consumer Driven Contracts and Your Microservice Architecture
Marcin Grzejszczak
 
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...
VMware Tanzu
 
YugaByte DB—A Planet-Scale Database for Low Latency Transactional Apps
YugaByte DB—A Planet-Scale Database for Low Latency Transactional AppsYugaByte DB—A Planet-Scale Database for Low Latency Transactional Apps
YugaByte DB—A Planet-Scale Database for Low Latency Transactional Apps
VMware Tanzu
 
Core Spring + Reactive 김민석
Core Spring + Reactive  김민석Core Spring + Reactive  김민석
Core Spring + Reactive 김민석
VMware Tanzu Korea
 
Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...
Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...
Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...
confluent
 
Consumer Driven Contracts and Your Microservice Architecture
Consumer Driven Contracts and Your Microservice ArchitectureConsumer Driven Contracts and Your Microservice Architecture
Consumer Driven Contracts and Your Microservice Architecture
VMware Tanzu
 
Lattice: A Cloud-Native Platform for Your Spring Applications
Lattice: A Cloud-Native Platform for Your Spring ApplicationsLattice: A Cloud-Native Platform for Your Spring Applications
Lattice: A Cloud-Native Platform for Your Spring Applications
Matt Stine
 
Tools to Slay the Fire Breathing Monoliths in Your Enterprise
Tools to Slay the Fire Breathing Monoliths in Your EnterpriseTools to Slay the Fire Breathing Monoliths in Your Enterprise
Tools to Slay the Fire Breathing Monoliths in Your Enterprise
VMware Tanzu
 
The Beginner’s Guide To Spring Cloud
The Beginner’s Guide To Spring CloudThe Beginner’s Guide To Spring Cloud
The Beginner’s Guide To Spring Cloud
VMware Tanzu
 
Building Highly Scalable Spring Applications using In-Memory Data Grids
Building Highly Scalable Spring Applications using In-Memory Data GridsBuilding Highly Scalable Spring Applications using In-Memory Data Grids
Building Highly Scalable Spring Applications using In-Memory Data Grids
John Blum
 
Kubernetes for the Spring Developer
Kubernetes for the Spring DeveloperKubernetes for the Spring Developer
Kubernetes for the Spring Developer
VMware Tanzu
 
P to V to C: The Value of Bringing “Everything” to Containers
P to V to C: The Value of Bringing “Everything” to ContainersP to V to C: The Value of Bringing “Everything” to Containers
P to V to C: The Value of Bringing “Everything” to Containers
VMware Tanzu
 
Federated Queries with HAWQ - SQL on Hadoop and Beyond
Federated Queries with HAWQ - SQL on Hadoop and BeyondFederated Queries with HAWQ - SQL on Hadoop and Beyond
Federated Queries with HAWQ - SQL on Hadoop and Beyond
Christian Tzolov
 
Marcin Grzejszczak - Contract Tests in the Enterprise
Marcin Grzejszczak - Contract Tests in the EnterpriseMarcin Grzejszczak - Contract Tests in the Enterprise
Marcin Grzejszczak - Contract Tests in the Enterprise
SegFaultConf
 

Recently uploaded (20)

Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdfMicrosoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
TechSoup
 
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
F-Secure Freedome VPN 2025 Crack Plus Activation  New VersionF-Secure Freedome VPN 2025 Crack Plus Activation  New Version
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
saimabibi60507
 
Expand your AI adoption with AgentExchange
Expand your AI adoption with AgentExchangeExpand your AI adoption with AgentExchange
Expand your AI adoption with AgentExchange
Fexle Services Pvt. Ltd.
 
Microsoft Excel Core Points Training.pptx
Microsoft Excel Core Points Training.pptxMicrosoft Excel Core Points Training.pptx
Microsoft Excel Core Points Training.pptx
Mekonnen
 
How to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud PerformanceHow to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud Performance
ThousandEyes
 
Odoo ERP for Education Management to Streamline Your Education Process
Odoo ERP for Education Management to Streamline Your Education ProcessOdoo ERP for Education Management to Streamline Your Education Process
Odoo ERP for Education Management to Streamline Your Education Process
iVenture Team LLP
 
How can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptxHow can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptx
laravinson24
 
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software DevelopmentSecure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Shubham Joshi
 
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AIScaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
danshalev
 
Societal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainabilitySocietal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainability
Jordi Cabot
 
DVDFab Crack FREE Download Latest Version 2025
DVDFab Crack FREE Download Latest Version 2025DVDFab Crack FREE Download Latest Version 2025
DVDFab Crack FREE Download Latest Version 2025
younisnoman75
 
Tools of the Trade: Linux and SQL - Google Certificate
Tools of the Trade: Linux and SQL - Google CertificateTools of the Trade: Linux and SQL - Google Certificate
Tools of the Trade: Linux and SQL - Google Certificate
VICTOR MAESTRE RAMIREZ
 
Implementing promises with typescripts, step by step
Implementing promises with typescripts, step by stepImplementing promises with typescripts, step by step
Implementing promises with typescripts, step by step
Ran Wahle
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Orangescrum
 
Foundation Models for Time Series : A Survey
Foundation Models for Time Series : A SurveyFoundation Models for Time Series : A Survey
Foundation Models for Time Series : A Survey
jayanthkalyanam1
 
Top 10 Data Cleansing Tools for 2025.pdf
Top 10 Data Cleansing Tools for 2025.pdfTop 10 Data Cleansing Tools for 2025.pdf
Top 10 Data Cleansing Tools for 2025.pdf
AffinityCore
 
🌱 Green Grafana 🌱 Essentials_ Data, Visualizations and Plugins.pdf
🌱 Green Grafana 🌱 Essentials_ Data, Visualizations and Plugins.pdf🌱 Green Grafana 🌱 Essentials_ Data, Visualizations and Plugins.pdf
🌱 Green Grafana 🌱 Essentials_ Data, Visualizations and Plugins.pdf
Imma Valls Bernaus
 
Top 10 Client Portal Software Solutions for 2025.docx
Top 10 Client Portal Software Solutions for 2025.docxTop 10 Client Portal Software Solutions for 2025.docx
Top 10 Client Portal Software Solutions for 2025.docx
Portli
 
Innovative Approaches to Software Dev no good at all
Innovative Approaches to Software Dev no good at allInnovative Approaches to Software Dev no good at all
Innovative Approaches to Software Dev no good at all
ayeshakanwal75
 
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdfMicrosoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
TechSoup
 
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
F-Secure Freedome VPN 2025 Crack Plus Activation  New VersionF-Secure Freedome VPN 2025 Crack Plus Activation  New Version
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
saimabibi60507
 
Expand your AI adoption with AgentExchange
Expand your AI adoption with AgentExchangeExpand your AI adoption with AgentExchange
Expand your AI adoption with AgentExchange
Fexle Services Pvt. Ltd.
 
Microsoft Excel Core Points Training.pptx
Microsoft Excel Core Points Training.pptxMicrosoft Excel Core Points Training.pptx
Microsoft Excel Core Points Training.pptx
Mekonnen
 
How to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud PerformanceHow to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud Performance
ThousandEyes
 
Odoo ERP for Education Management to Streamline Your Education Process
Odoo ERP for Education Management to Streamline Your Education ProcessOdoo ERP for Education Management to Streamline Your Education Process
Odoo ERP for Education Management to Streamline Your Education Process
iVenture Team LLP
 
How can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptxHow can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptx
laravinson24
 
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software DevelopmentSecure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Shubham Joshi
 
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AIScaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
danshalev
 
Societal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainabilitySocietal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainability
Jordi Cabot
 
DVDFab Crack FREE Download Latest Version 2025
DVDFab Crack FREE Download Latest Version 2025DVDFab Crack FREE Download Latest Version 2025
DVDFab Crack FREE Download Latest Version 2025
younisnoman75
 
Tools of the Trade: Linux and SQL - Google Certificate
Tools of the Trade: Linux and SQL - Google CertificateTools of the Trade: Linux and SQL - Google Certificate
Tools of the Trade: Linux and SQL - Google Certificate
VICTOR MAESTRE RAMIREZ
 
Implementing promises with typescripts, step by step
Implementing promises with typescripts, step by stepImplementing promises with typescripts, step by step
Implementing promises with typescripts, step by step
Ran Wahle
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Orangescrum
 
Foundation Models for Time Series : A Survey
Foundation Models for Time Series : A SurveyFoundation Models for Time Series : A Survey
Foundation Models for Time Series : A Survey
jayanthkalyanam1
 
Top 10 Data Cleansing Tools for 2025.pdf
Top 10 Data Cleansing Tools for 2025.pdfTop 10 Data Cleansing Tools for 2025.pdf
Top 10 Data Cleansing Tools for 2025.pdf
AffinityCore
 
🌱 Green Grafana 🌱 Essentials_ Data, Visualizations and Plugins.pdf
🌱 Green Grafana 🌱 Essentials_ Data, Visualizations and Plugins.pdf🌱 Green Grafana 🌱 Essentials_ Data, Visualizations and Plugins.pdf
🌱 Green Grafana 🌱 Essentials_ Data, Visualizations and Plugins.pdf
Imma Valls Bernaus
 
Top 10 Client Portal Software Solutions for 2025.docx
Top 10 Client Portal Software Solutions for 2025.docxTop 10 Client Portal Software Solutions for 2025.docx
Top 10 Client Portal Software Solutions for 2025.docx
Portli
 
Innovative Approaches to Software Dev no good at all
Innovative Approaches to Software Dev no good at allInnovative Approaches to Software Dev no good at all
Innovative Approaches to Software Dev no good at all
ayeshakanwal75
 
Ad

Enable SQL/JDBC Access to Apache Geode/GemFire Using Apache Calcite

  • 1. Enable SQL/JDBC Access to Apache Geode/GemFire Using Apache Calcite By Christian Tzolov @christzolov 1
  • 2. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Christian Tzolov Spring Software Engineer at Pivotal Spring Cloud Data Flow Apache Committer, Crunch PMC member Apache {Crunch, HAWQ, Zeppelin, Calcite...} 2 blog.tzolov.net @christzolov @tzolov
  • 3. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ “What happens if an established NoSQL database decides to implement a reasonably standard SQL; The only predictable outcome for such an eventuality is plenty of argument.” Martin Fowler, P.J.Sadalage, NoSQL Distilled, 2012 3
  • 5. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Safe Harbor Statement The following is intended to outline the general direction of Pivotal's offerings. It is intended for information purposes only and may not be incorporated into any contract. Any information regarding pre-release of Pivotal offerings, future updates or other planned modifications is subject to ongoing evaluation by Pivotal and is subject to change. This information is provided without warranty or any kind, express or implied, and is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions regarding Pivotal's offerings. These purchasing decisions should only be based on features currently available. The development, release, and timing of any features or functionality described for Pivotal's offerings in this presentation remain at the sole discretion of Pivotal. Pivotal has no obligation to update forward looking information in this presentation. 5
  • 6. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Demo? In a minute ;) 6
  • 8. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 8
  • 9. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ One Size Fit All* “The traditional DBMS architecture has been used to support many data-centric applications with widely varying characteristics and requirements… We argue that this concept is no longer applicable to the database market, and that the commercial world will fracture into a collection of independent database engines, some of which may be unified by a common front-end parser.” Michael Stonebraker, 2005 9
  • 10. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Collection of Independent Databases 10 How to ingest and keep consistent across DBs? How to access the data across multiple DBs?
  • 12. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Federation: Unified Query Interface 12
  • 13. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Federation: Two Mapping Tasks 13
  • 15. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Geode Model Mapping • PDX serialization • Primitive Types • Arrays (employ[0]) • Nested Objects (parent[‘child.childChild…’]) 15 Cache Region 1 Region K ValKey v1k1 v2k2 Schema Table 1 Table K C o l 1 C o l 2 C o l N V(M,1) R o w M V(M,2) V(M,N ) V(2,1) R o w 2 V(2,2) V(2,N) V(1,1) R o w 1 V(1,2) V(1,N) 2. Regions into Tables 1. Cache into Schema 3. Key/Value into Table Row 4. Column Types inferred from PDX fields
  • 17. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ SQL for NoSQL? • Extended Relational Algebra - already present in most NoSql data • Relational Expression Optimization – Desirable but hard to implement 17
  • 18. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Query = Relational Algebra 18 SELECT b.totalPrice, c.firstName FROM BookOrder AS b INNER JOIN BookCustomer AS c ON b.customerNumber = c.customerNumber WHERE b.totalPrice > 0 … } Scan Scan Join Filter Project BookCustomer [c] BookOrder [b] (on customerNumber) (b.totalPrice > 0) (c.firstName, b.totalPrice) Scan Scan Join Project BookCustomer [c] BookOrder [b] (on customerNumber) (totalPrice > 0) (c.firstName, b.totalPrice) Project (firstName, customerNumber) Filter (totalPrice, customerNumber)Project <<Optimization>> <<SQLParsing>>
  • 19. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ “Native” Geode Operations 19 Relational Operation Supported on Geode SORT/LIMIT YES (no Offset) PROJECT YES FILTER YES AGGREGATE SUM, AVG, MAX, MIN, COUNT, GROUP BY, DISTINCT JOIN NO
  • 21. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Apache Calcite • Java Framework • SQL Query Parser, Validator and Optimizer(s) • JDBC drivers - local and remote • Agnostic to data storage and processing • Powered by Calcite: 21
  • 22. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Apache Geode Adapter 22 Geode API and OQL SQL/JDBC Convert relational expressions into OQL queriesGeode Adapter (Geode Client) Geode ServerGeode ServerGeode Server Data Data Data Push down all supported expressions to Geode OQL and fall back to Calcite Enumerable for the rest Enumerable Adapter Apache Calcite Parse SQL into relational expression and optimizes Calcite-Geode-Embedded(e.g.JDBC.jar)
  • 23. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Apache Geode Concepts Cache Server Cache Region 1 Region N ValKey v1k1 v2k2 Cache - In-memory collection of Regions Region - consistent, distributed Map (key-value), Partitioned or Replicated CacheServer – process connected to the distributed system with created Cache ClientLocator Client –read and modify the content of the distributed system Locator – tracks system members and provides membership information … Listeners FunctionsFunctions – distributed, concurrent data processing Listener – event handler. Registers for one or more events and notified when they occur
  • 24. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Key/Value vs. OQL “to get extreme scale we use GemFire for what it does best: key-value storage. Then, when we can’t design our way to using what GemFire does best, we can use OQL and treat it like an object data-base” Mike Stolz, “Scaling Data Services with Pivotal GemFire®”, 2018
  • 25. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Query Execution Flow 25 2. Parse SQL, convert to rel. expressions. Validate and Optimize them 3. Start building a physical plan from the relation expressions 4. Implement the Geode relations and encode them as Expression tree 5. Pass the Expression tree to the Interpreter to generate Java code 6. Generate and Compile a Binder instance that on ‘bind()’ call runs Geodes’ query method 1. On new SQL query JDBC delegates to Prepare to prepare the query execution 7. JDBC uses the newly compiled Binder to perform the query on the Geode Cluster
  • 26. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Query Mapping - Example 26 SELECT author FROM BookMaster WHERE retailCost > 0 LIMIT 1 … } SELECT author FROM BookMaster WHERE retailCost > 0 LIMIT 1 OFFSET 1 … }
  • 27. Demo
  • 28. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ DB Tools Integration 28 Access Apache Geode / GemFire over SQL/JDBC Explore Geode & GemFire Data with IntelliJ SQL/Database tool
  • 29. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Analytics 29 Advanced Apache Geode Data Analytics with Apache Zeppelin over SQL/JDBC
  • 30. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Data Federation 30 {Geode|Greenplum|...} Data Federation
  • 31. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Spatial (OpenGIS /PostGIS) 31
  • 32. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ SQL Stream (in progress...) 32 https://calcite.apache.org/docs/stream.html
  • 33. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ SQL Stream (2) 33 ● How many rows come out for each row in? ● Does each incoming value appear in one total, or more? ● What defines the “window”, the set of rows that contribute to a given output row? ● Is the result a stream or a relation? ● tumbling window (GROUP BY) ● hopping window (multi GROUP BY) ● sliding window (window functions) ● cascading window (window functions)
  • 35. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Pros & Cons • Ad-hoc data exploration • JDBC integration with 3rd party tools • Data Federation, correlate Geode with other data sources • SQL Streaming as CQ++ • No-intrusive and extensible approach • What happened with SQLFire? • Geode == Transactional System! SQL+Geode <> Analytical System! • Key/Value vs. Full Scan • Overhead: SQL > OQL > Functions • Data at Rest (Table) vs. Data at Motion (Stream)
  • 36. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Future work • Add Geode Adapter to Calcite Project: [CALCITE-2059] – VOTE! ;) • Table Materialization • SQL Streaming (https://calcite.apache.org/docs/stream.html) • Pre-defined vs. Inferred Schema • “Native” JOIN operation implementation • Geode Indexes and Query statistics as Calcite metadata. Hint Indexes • Mixing Key/Value and OQL data access • Explore Geode functions as an alternative to OQL • Explore TX support • Performance Benchmark • Calcite Spatial (https://calcite.apache.org/docs/spatial.html)
  • 37. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Credits to Dan Baskette for inspiring this work and to Roman Shaposhnik for helping spread the word 37
  • 38. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ References 38 [1 ] Apache Geode: http://geode.apache.org [2] Geode Object Query Language (OQL): http://bit.ly/2BfXmNR [3] Apache Calcite: https://calcite.apache.org [4] Calcite - supported SQL language: https://calcite.apache.org/docs/reference.html [5] Apache Geode Adapter for Apache Calcite: https://github.com/tzolov/calcite/tree/geode-1.3 [6] Relational Algebra Operations: http://bit.ly/2zX8cMc [7] Apache Phoenix - http://phoenix.apache.org [8] Big Data is Four Different Problems, 2016, M.Stonebraker (video): http://bit.ly/2jDmKpq [9] Architecture of a Database System, 2007 (J.M. Hellerstein, M. Stonebraker, J. Hamilton): http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf [10] One Size Fits All (Michael Stonebraker): http://ieeexplore.ieee.org/document/1410100/?denied
  • 39. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Bonus: Geode Adapter Implementation 39 Common interface for all MyAdapter Relation Expressions. Provides implementation callback method called as part of physical plan implementation … … <<instance of>> <<create>> Can convert queries in Expression myQuery() implements the call to your DB It is called by the auto generated code. It must return an Enumberable instance Registers all MyAdapter Rules <<create>> operands: (RelNode.class, MyAdapterConvention, EnumerableConvention) … <<create on match >> <<create on match >> Recursively call the implement on each MyAdapter Relation Expression Encode the myQuery(params) call as Expressions
  • 40. Learn More. Stay Connected. Vote for [CALCITE-2059] and don’t miss: Simplifying Apache Geode with Spring Data Exploring Data-Driven, Cognitive Capabilities in Pivotal Cloud Foundry Orchestrating Data Microservices with Spring Cloud Data Flow 40 #springone@s1p