1
we liberate enterprise data
2
Extending Enterprise Applications with the
Full Power of Hadoop
Tanel Poder
gluent.com
3
Gluent - who we are
Long-term Oracle Database & Data Warehousing guys,
focused on performance & scale.
Speaker:
Tanel Poder
A long-time computer performance geek.
Co-founder & CEO of Gluent.
Tanel also co-authored the Expert
Oracle Exadata book.
Alumni 2009-2016
4
• Super-scalable
• Processing pushed close to data
• Software-defined (open source)
• Commodity hardware
• No SAN storage bottlenecks
• Open data formats
• One data, many engines
Why Hadoop?
Scalable & affordable-at-scale
• Yahoo: multiple 4000+ node
Hadoop clusters
• Facebook: 30 PB Hadoop
cluster (in year 2011!)
2017: Enterprise-ready
• Hadoop is secure …
• … has management tools …
• … and evolving fast
5
One Data, Many Engines!
• Decoupling storage from compute + open data formats =
flexible future-proof data platforms!
HDFS
Parquet ORC XML Avro
Amazon S3
Parquet WebLog
Kudu
Column-store
Impala SQL
Hive SQL
Xyz…
Solr / Search
Spark
MR
Kudu API
libparquet
6
BUT
No complex transactions
No transactional “PL/SQL”
No very complex queries
7
Is Hadoop only for “Big Data”?
8
9
Hadoop for traditional enterprise apps?
New
“Big Data”
applications
Traditional
enterprise
applications
10
How to connect all this data with enterprise applications?
New data
SaaS
IoT
Big Data
Modern data platforms
Core enterprise apps
Running on relational DBs
? ?
11
Hybrid World!
12
Gluent
Oracle
Postgres
SQL
Teradata
IoT & Big
Data
MSSQL
App
X
App
Y
App
Z
Hadoop/RDBMS
connectivity
layer
Open data
formats!
13
• Gluent Data Platform (of course :-)
• No-ETL Data Sync (Data Offload to Hadoop)
• Smart Connector (Transparent Data Query from Hadoop)
• ETL & replication products
• Informatica, Talend, Pentaho, etc etc…
• Oracle GoldenGate, Attunity, DBVisit, etc…
• RDBMS->Hadoop Query products
• Teradata QueryGrid
• Microsoft SQL Server Polybase
• Oracle Big Data SQL
• IBM Big SQL
• Native RDBMS database links & linked servers over ODBC etc…
Hybrid World-related Vendors & Tools
14
• 2-minute demo!
• More technical details at:
• https://vimeo.com/196497024
Gluent Demo
15
Hybrid World Case Studies
16
Case Study 1 – IoT data within existing RDBMS app
17
Securus: Satellite Tracking of People (STOP) VeriTracks Application
http://www.stopllc.com/
Challenge - how to:
• Scale business?
• Offer additional services?
• Add additional data sources?
• Embed predictive & advanced
analytics, machine learning?
• Cut cost at the same time?!
• 150 TB dataset
• Geospatial data
• Kept in Oracle DB
• Growing fast
• Google Maps API
• Near-realtime reaction
• Long-term analytics
18
Securus: Satellite Tracking of People (STOP) VeriTracks Application
19
1. New analytics in existing apps immediately possible
2. Reduced cost
3. Move fast with low risk – don’t rewrite entire apps
• The customer didn’t change a single line of code!
Securus STOP: Summary
20
Database Schema Virtualization
21
Typical Application Story: Monolithic Data Model
A complex business application
running on an RDBMS
Years of application
development & improvement
Upstream & downstream
dependencies
Terabytes of historical data
(usually years of history)
Big queries run for too long or
never complete (or never tried)
Does not scale with modern
demand
Way too expensive
Application rewrite very costly
& risky or virtually impossible
Customers
Products Preferences
Promotions
Prices
RDBMS + SAN
SALES
22
Hybrid Data Virtualization (90/10)
Virtual
(90%)
SALES
(10%)
Customers
Products Preferences
Promotions
Prices
RDBMS + SAN
10%
RDBMS +
SAN
SALES
(90%)
90%
Hadoop
Gluent
Reduce cost,
offload data,
increase
performance
Application still
sees all data:
App code &
architecture
unchanged!
Gluent
Columnar
compression:
6-20x data size
reduction
Automatic data
flow, No ETL
development!
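One way to picture the 90/10 split is a view that unions the rows still held in the database with the rows that were offloaded, so the application keeps querying the same SALES name. The sketch below is only an illustration of that idea using SQLite from the Python standard library. It is not how Gluent implements it, and the table names sales_rdbms and sales_hadoop, the columns, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical stand-ins: sales_rdbms holds the recent ~10% kept in the RDBMS,
# sales_hadoop plays the role of the ~90% of history offloaded to Hadoop.
cur.execute("CREATE TABLE sales_rdbms (sale_id INTEGER, sale_date TEXT, amount REAL)")
cur.execute("CREATE TABLE sales_hadoop (sale_id INTEGER, sale_date TEXT, amount REAL)")

# Nine "historical" rows go to the offloaded side, one recent row stays local.
cur.executemany(
    "INSERT INTO sales_hadoop VALUES (?, ?, ?)",
    [(i, f"2016-{(i % 12) + 1:02d}-01", 10.0) for i in range(1, 10)],
)
cur.execute("INSERT INTO sales_rdbms VALUES (10, '2017-01-01', 10.0)")

# The application still queries SALES; the view hides where each row lives.
cur.execute("""
    CREATE VIEW sales AS
    SELECT sale_id, sale_date, amount FROM sales_rdbms
    UNION ALL
    SELECT sale_id, sale_date, amount FROM sales_hadoop
""")

total_rows, total_amount = cur.execute(
    "SELECT COUNT(*), SUM(amount) FROM sales"
).fetchone()
print(total_rows, total_amount)
```

The query against sales is the same one the application would have run before the split, which is the point of the slide: app code & architecture stay unchanged.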
23
Hybrid Data Virtualization (100/10)
Virtual
(90%)
SALES
(10%)
Customers
Products Preferences
Promotions
Prices
RDBMS + SAN
SALES
(100%)
10%
RDBMS +
SAN
100%
Hadoop
Gluent
Customers
Products Preferences
Promotions
Prices
Gluent
Gluent
New
Analytics &
Apps
Reduce cost and
enable new
analytics on
Hadoop
24
Hybrid Data Virtualization (Big Data/IoT)
Customers
Products
Preferences
Promotions
Prices
RDBMS + SAN
WEB_VISITS
(Hadoop only)
SALES WEB_VISITS
(Virtual)
Gluent
Data & compute
virtualization:
Users query tables
in databases, actual
data & processing
in Hadoop
25
• Call Detail Records
• Only 90 days of history
• Offloaded 89 days
Case Study 2 – Large Telecom
26
Case Study 2 – Large Telecom - Results
27
• Query Elapsed Times Avg 36X Faster in Hybrid Mode
• Average Oracle CPU Reduction 87% in Hybrid Mode
• Storage cost reduction ~100X
• HDFS storage ~10x cheaper than SAN
• 11x compression due to columnar format (ORC)
• 30 days < 2 minutes
• 90 days ~3 minutes
• Enabled Completely New Capabilities
• Application Owner Wanted to Query 1 Year
Case Study 2 – Large Telecom - Results
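The ~100X storage figure appears to come from combining the two sub-bullets. A quick check, assuming the cheaper storage and the columnar compression are independent effects that multiply (the slide does not state this explicitly):

```python
# Figures quoted on the slide.
hdfs_vs_san_cost_factor = 10   # "HDFS storage ~10x cheaper than SAN"
columnar_compression = 11      # "11x compression due to columnar format (ORC)"

# Assumption: the two effects are independent, so the factors multiply.
combined_reduction = hdfs_vs_san_cost_factor * columnar_compression
print(f"Combined storage cost reduction: ~{combined_reduction}x")
```

Under that assumption the product is 110x, which is consistent with the rounded "~100X" on the slide.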
28
Case Study 3 – Multi-Year Reports
29
• Many different (generated) queries running for a few seconds each
• We executed 5,500 APPX queries from AWR history using our tools
• 50% reduction of CPU
Case Study 4 – Thousands of “Short” Queries
[Bar chart: Total CPU… (Before: 7731, After: 3846)]
Schema       CPU Seconds
DATAMART     7731
DATAMART_H   3846
50% CPU
saving with
hybrid query
Average CPU
1.4 sec/exec
before
0.7 sec with
hybrid query
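The numbers on this slide can be cross-checked with a few lines of arithmetic. The sketch assumes the CPU totals cover all 5,500 replayed queries, which the slide implies but does not state outright.

```python
# Totals from the slide's table (CPU seconds per schema).
cpu_before = 7731   # DATAMART
cpu_after = 3846    # DATAMART_H (hybrid)
queries = 5500      # queries replayed from AWR history

# Relative CPU saving and average CPU per execution.
saving = 1 - cpu_after / cpu_before
avg_before = cpu_before / queries
avg_after = cpu_after / queries

print(f"CPU saving: {saving:.1%}")
print(f"Average CPU before: {avg_before:.1f} sec/exec")
print(f"Average CPU after:  {avg_after:.1f} sec/exec")
```

The results round to the 50% saving and the 1.4 and 0.7 sec/exec averages quoted on the slide.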
30
Hybrid
Case Study 5 – EDW Offload
EDW DB
(Oracle)
EDW Apps
EDW Apps
Hadoop
Transparent access
No ETL data sync
EDW DB
(Oracle)
EDW Apps
EDW Apps
Shrink legacy cost
footprint, increase
performance without
re-writing apps
31
Hybrid
Case Study 6 – Access IoT Data in Enterprise Apps
EDW DB
(Oracle)
EDW Apps
EDW Apps
Hadoop
Smart Meter Data
Call Recordings
Transparent access
Transparent access
Hybrid Queries over all
enterprise data
No need to rewrite
existing apps
32
Hybrid
Case Study 7 – data sharing platform (24 DBs)
App 23
App 24
Hadoop
App 1
App 2
Oracle DB
Oracle DB
…
Oracle DB
CDR data
Oracle DB
CDR data
33
Summary
34
• The Hybrid World is not “all-or-nothing”
• Get the best of both worlds (RDBMS+Hadoop)
• No data migration downtime & cutover needed
• No need to re-write your apps to take
advantage of modern data platforms
• No need to write ETL jobs
to sync your data to Hadoop & Cloud
Summary
35
Advisor
Do you want to
assess potential
savings &
opportunities
with Gluent?
https://gluent.com/products/gluent-advisor/
36
http://gluent.com
@gluent
Thanks! + Q&A
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
James Francis Paradigm Asset Management
 
Molecular methods diagnostic and monitoring of infection - Repaired.pptx
Molecular methods diagnostic and monitoring of infection  -  Repaired.pptxMolecular methods diagnostic and monitoring of infection  -  Repaired.pptx
Molecular methods diagnostic and monitoring of infection - Repaired.pptx
7tzn7x5kky
 
FPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptxFPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptx
ssuser4ef83d
 
chapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.pptchapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.ppt
justinebandajbn
 
LLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bertLLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bert
ChadapornK
 
Ad

Gluent Extending Enterprise Applications with Hadoop

  • 2. Extending Enterprise Applications with the Full Power of Hadoop. Tanel Poder, gluent.com
  • 3. Gluent - who we are. Speaker: Tanel Poder. A long-time computer performance geek. Co-founder & CEO of Gluent. Tanel also co-authored the Expert Oracle Exadata book. Long-term Oracle Database & Data Warehousing guys, focused on performance & scale. Alumni 2009-2016
  • 4. Why Hadoop? • Super-scalable • Processing pushed close to data • Software-defined (open source) • Commodity hardware • No SAN storage bottlenecks • Open data formats • One data, many engines. Scalable & affordable-at-scale: • Yahoo: multiple 4000+ node Hadoop clusters • Facebook: 30 PB Hadoop cluster (in year 2011!). 2017: Enterprise-ready: • Hadoop is secure … • … has management tools … • … and evolving fast
  • 5. One Data, Many Engines! • Decoupling storage from compute + open data formats = flexible future-proof data platforms! [Diagram] Storage: HDFS (Parquet, ORC, XML, Avro); Amazon S3 (Parquet, WebLog); Kudu (column-store). Engines: Impala SQL, Hive SQL, Solr / Search, Spark, MR, Xyz… Access libraries: Kudu API, libparquet
  • 6. BUT: No complex transactions. No transactional “PL/SQL”. No very complex queries
  • 7. Is Hadoop only for “Big Data”?
  • 8.
  • 9. Hadoop for traditional enterprise apps? New “Big Data” applications. Traditional enterprise applications
  • 10. How to connect all this data with enterprise applications? New data: SaaS, IoT, Big Data. Modern data platforms. Core enterprise apps running on relational DBs. ? ?
  • 13. Hybrid World-related Vendors & Tools • Gluent Data Platform (of course :-): No-ETL Data Sync (Data Offload to Hadoop); Smart Connector (Transparent Data Query from Hadoop) • ETL & replication products: Informatica, Talend, Pentaho, etc.; Oracle GoldenGate, Attunity, DBVisit, etc. • RDBMS->Hadoop Query products: Teradata QueryGrid; Microsoft SQL Server Polybase; Oracle Big Data SQL; IBM Big SQL • Native RDBMS database links & linked servers over ODBC etc.
  • 14. Gluent Demo • 2-minute demo! • More technical details at: https://vimeo.com/196497024
  • 16. Case Study 1 – IoT data within existing RDBMS app
  • 17. Securus: Satellite Tracking of People (STOP) VeriTracks Application, http://www.stopllc.com/ • 150 TB dataset • Geospatial data • Kept in Oracle DB • Growing fast • Google Maps API • Near-realtime reaction • Long-term analytics. Challenge - how to: • Scale business? • Offer additional services? • Add additional data sources? • Embed predictive & advanced analytics, machine learning? • Cut cost at the same time?!
  • 18. Securus: Satellite Tracking of People (STOP) VeriTracks Application
  • 19. Securus STOP: Summary. 1. New analytics in existing apps immediately possible 2. Reduced cost 3. Move fast with low risk – don’t rewrite entire apps • The customer didn’t change a single line of code!
  • 21. Typical Application Story: Monolithic Data Model • A complex business application running on an RDBMS • Years of application development & improvement • Upstream & downstream dependencies • Terabytes of historical data (usually years of history) • Big queries run for too long or never complete (or never tried) • Does not scale with modern demand • Way too expensive • Application rewrite very costly & risky or virtually impossible. [Diagram: RDBMS + SAN holding SALES, Customers, Products, Preferences, Promotions, Prices]
  • 22. Hybrid Data Virtualization (90/10). Reduce cost, offload data, increase performance. Application still sees all data: app code & architecture unchanged! Columnar compression: 6-20x data size reduction. Automatic data flow, no ETL development! [Diagram: RDBMS + SAN keeps SALES (10%), Customers, Products, Preferences, Promotions, Prices, plus Virtual (90%); Hadoop holds SALES (90%); Gluent connects the two]
  • 23. Hybrid Data Virtualization (100/10). Reduce cost and enable new analytics on Hadoop. [Diagram: RDBMS + SAN keeps SALES (10%), Customers, Products, Preferences, Promotions, Prices, plus Virtual (90%); Hadoop holds SALES (100%), Customers, Products, Preferences, Promotions, Prices for New Analytics & Apps; Gluent connects the two]
  • 24. Hybrid Data Virtualization (Big Data/IoT). Data & compute virtualization: users query tables in databases, actual data & processing in Hadoop. [Diagram: RDBMS + SAN holds SALES, Customers, Products, Preferences, Promotions, Prices and WEB_VISITS (Virtual); WEB_VISITS (Hadoop only) is reached through Gluent]
  • 25. Case Study 2 – Large Telecom • Call Detail Records • Only 90 days of history • Offloaded 89 days
  • 26. Case Study 2 – Large Telecom - Results
  • 27. Case Study 2 – Large Telecom - Results • Query Elapsed Times Avg 36X Faster in Hybrid Mode • Average Oracle CPU Reduction 87% in Hybrid Mode • Storage cost reduction ~100X: HDFS storage ~10x cheaper than SAN; 11x compression due to columnar format (ORC) • 30 days < 2 minutes • 90 days ~3 minutes • Enabled Completely New Capabilities • Application Owner Wanted to Query 1 Year
  • 28. Case Study 3 – Multi-Year Reports
  • 29. Case Study 4 – Thousands of “Short” Queries • Many different (generated) queries running for a few seconds each • We executed 5,500 APPX queries from AWR history using our tools • 50% reduction of CPU. [Chart: Total CPU…, Before vs. After] CPU seconds by schema: DATAMART 7731; DATAMART_H 3846. 50% CPU saving with hybrid query. Average CPU 1.4 sec/exec before, 0.7 sec with hybrid query
  • 30. Hybrid Case Study 5 – EDW Offload. Shrink legacy cost footprint, increase performance without re-writing apps. [Diagram: EDW Apps on EDW DB (Oracle), shown standalone and then with Hadoop attached; labels: Transparent access, No ETL data sync]
  • 31. Hybrid Case Study 6 – Access IoT Data in Enterprise Apps. Hybrid Queries over all enterprise data. No need to rewrite existing apps. [Diagram: EDW Apps on EDW DB (Oracle) with transparent access to Smart Meter Data and Call Recordings in Hadoop]
  • 32. Hybrid Case Study 7 – data sharing platform (24 DBs). [Diagram: App 1, App 2, … App 23, App 24, each on an Oracle DB; Oracle DBs with CDR data; Hadoop]
  • 34. Summary • The Hybrid World is not “all-or-nothing” • Get the best of both worlds (RDBMS+Hadoop) • No data migration downtime & cutover needed • No need to re-write your apps to take advantage of modern data platforms • No need to write ETL jobs to sync your data to Hadoop & Cloud. we liberate enterprise data
  • 35. Advisor. Do you want to assess potential savings & opportunities with Gluent? https://gluent.com/products/gluent-advisor/
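
The headline numbers on slides 27 and 29 can be cross-checked with simple arithmetic. The sketch below is not from the deck: the function and variable names are made up for illustration, and it rests on two assumptions, namely that the ~10x cheaper storage and the 11x compression multiply together, and that the per-execution CPU averages are total CPU seconds divided by the 5,500 queries mentioned on slide 29.

```python
# Figures quoted on slides 27 and 29. All names here are illustrative only.
SAN_TO_HDFS_COST_RATIO = 10    # "HDFS storage ~10x cheaper than SAN"
ORC_COMPRESSION_RATIO = 11     # "11x compression due to columnar format (ORC)"
CPU_SECONDS_BEFORE = 7731      # schema DATAMART
CPU_SECONDS_AFTER = 3846       # schema DATAMART_H
QUERY_COUNT = 5500             # assumed to be the number of executed queries


def storage_cost_reduction(cost_ratio: float, compression_ratio: float) -> float:
    """Combined reduction, assuming the two factors are independent and multiply."""
    return cost_ratio * compression_ratio


def cpu_saving_fraction(before: float, after: float) -> float:
    """Fraction of CPU seconds saved going from `before` to `after`."""
    return (before - after) / before


combined_reduction = storage_cost_reduction(SAN_TO_HDFS_COST_RATIO, ORC_COMPRESSION_RATIO)
cpu_saving = cpu_saving_fraction(CPU_SECONDS_BEFORE, CPU_SECONDS_AFTER)
avg_cpu_before = CPU_SECONDS_BEFORE / QUERY_COUNT
avg_cpu_after = CPU_SECONDS_AFTER / QUERY_COUNT

print(f"Storage cost reduction: {combined_reduction:.0f}x")
print(f"CPU saving: {cpu_saving:.1%}")
print(f"Average CPU per execution: {avg_cpu_before:.1f} s before, {avg_cpu_after:.1f} s after")
```

Under those assumptions the product is 110x, in line with the "~100X" on slide 27, and the CPU totals reproduce the roughly 50% saving and the 1.4 and 0.7 seconds per execution quoted on slide 29.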