SlideShare a Scribd company logo
Hi




Hadoop Meets Exadata
       Presented by: Kerry Osborne

 Oracle Open World – October, 2012
whoami –

Never Worked for Oracle
Worked with Oracle Since 1982 (V2)
Working with Exadata since early 2010
Work for Enkitec (www.enkitec.com)
(Enkitec owns a Half Rack – V2/X2)
(Enkitec owns a Big Data Appliance)
Many Exadata customers and POCs
Exadata Book (recently translated to Chinese)
Hadoop Aficionado



 Blog: kerryosborne.oracle-guy.com
 Twitter: @KerryOracleGuy




                                                2
Top Secret Feature of BDA




                            3
What’s the Point?

Data Volumes are Increasing Rapidly
Cost of Processing / Storing is High
Something’s Gotta Give!




  Besides – managing large quantities of data is what
  we do!




                                                        4
Hadoop Is A Virus




* Stolen from Orbitz

                                           5
Google Trends




                6
Google Trends




                7
Google Trends




                8
Digression #1 - Big Data
Not My Favorite Term
3 or 4 V’s
Value Density
Not the Right Tool for Every Job




                                   9
Disjointed Presentation

            Architecture Comparison
            Integration Discussion
            Case Study ?




                                  10
Traditional RDBMS Architecture
                                                  RAC

w
o                    Cache         (SGA)
r   workers
k


                    dbwr       lgwr        etc…


     Block Mapper
        (ASM)



                             Storage




                                                        11
HDFS/Hadoop Architecture
                                                               HA ?

w
o                                 Job Tracker
                     Name Node
r
k




    datanode      tasktracker    datanode        tasktracker

                 workers                        workers

      Storage                      Storage




                                                                      12
HDFS/Hadoop Architecture
                                                                  HA ?

w
o                                    Job Tracker
r
k
               Block Mapper
                (namenode)



    datanode          tasktracker   datanode        tasktracker

                     workers                       workers

      Storage                         Storage




                                                                         13
Exadata Architecture
                                                           RAC

w
                        workers
o                                   Cache
r
k
              Block Mapper
                 (ASM)



    Storage Node                  Storage Node

                    workers                      workers

      Storage                       Storage




                                                                 14
HDFS/Hadoop Architecture
                                                                  HA ?

w
o                                    Job Tracker
r
k
               Block Mapper
                (namenode)



    datanode          tasktracker   datanode        tasktracker

                     workers                       workers

      Storage                         Storage




                                                                         15
Oracle + Hadoop Integration




                              16
Obligatory Marketing Slide




                             17
Integration Options


Many Ways to Skin the Cat


   •
     Fuse
   •
     Sqoop
   •
     Oracle Big Data Connectors




                                       18
Fuse – External Tables




                         19
Sqoop (SQL-to-Hadoop)


•
  Graduated from Incubator Status in March 2012
•
  Slower (no direct path?)
•
  Quest has a plug-in (oraoop)
•
  Bi-Directional




                                                  20
Oracle Big Data Connectors
 Oracle Loader for Hadoop - OLH

 Oracle Direct Connector for HDFS  - ODCH

 Oracle R Connector for Hadoop – ORHC

 Oracle Data Integrator Application Adapter for Hadoop




Note:

All Connectors are One Way
All sold together for $2K per core list


                                                         21
Oracle Data Integrator Application
Adapter for Hadoop
        ODIAAH ?




                                     22
Oracle R Connector for Hadoop (ORHC)
•
  Provides ability to pull data from Oracle RDBMS
•
  Provides ability to pull data from HDFS
•
  Provides access to local file system
•
  Not really a loader tool
•
  Most useful for analysts




                                                    23
Oracle Loader for Hadoop (OLH)

•
  Implemented as a MapReduce job (oraloader.jar)
•
  Saves CPU on DB Server
•
  Can convert to Oracle datatypes
•
  Can partition data and optionally sort it
•
  Online – direct into Oracle tables
       •
          Can load into Oracle via JDBC or OCI Direct Path
•
  Offline – generate preprocessed files in HDFS (DP format)




                                                              24
Oracle Direct Connector for HDFS  (ODCH)
•
  My Favorite
•
  Uses External Tables
•
  Fastest
•
  12T per hour
•
  Can load DP files preprocessed by OLH
•
  Allows Oracle SQL to query HDFS data
•
  Doesn’t require loading into Oracle
•
  Pretty Cool!

•
    Downside – uses DB CPU’s




                                               25
Exadoop




* Mad Scientist Project



                          26
Exadoop

Unusual Situation!
Half Rack with 4 Spare Storage Servers
Exadata Cells Very Similar to BDA Servers
     slower CPU’s
     less memory
     but same drives (12X3T)
     and IB
     and Flash
4 Cells ≈ Mini BDA! (happy face)




                                            27
Digression #2 - BDA Stuff




                            28
Digression #2 - BDA Stuff




                            29
Digression #2 - BDA Stuff




                            30
Exadoop


Situation

•
  Pilot Underway – but wanted more power
•
  4 Exadata Storage Servers were sitting idle
•
  Suggestion was to Install Hadoop Cluster on them
•
  1st Concern was being able to Reclaim for Exadata
•
  Removing Data Node from HDFS Not a Problem
•
  Adding Storage to ASM Not a Problem
•
  So the Decision Was Made to Move Forward




                                                      31
Exadoop

Set Up

•
  Removed the Internal USB’s
•
  Installed OEL 6.2
•
  Installed CDH3
•
  Loaded Some Data
•
  Set Up ODCH with External Tables




                                     32
Exadoop

Testing

•
  Selecting Data Using External Tables was Not Very Fast
•
  Quickly Determined we had Used Default 1G Network
•
  Reconfigured with IB
•
  Helped But Not as Much as Expected
•
  Using Little CPU on Data Nodes
•
  But a Single Process was Pegging a CPU on the DB
•
  Added Parallelism
•
  No Good, Only One Slave Active
•
  Added Multiple Files to External Table Def. – Bingo!




                                                           33
Exadoop

Testing - Continued

•
  Added Fuse Client
•
  Created External Tables with Fuse
•
  PX seems to work even on single files
•
  Puts additional CPU load on DB server (2T/hr)




                                                  34
Wrap Up

          Right Tool For The Job?

          Maybe

          All the Cool Kids Are Doing It!




                                            35
Questions?
Contact Information          : Kerry Osborne
                       kerry.osborne@enkitec.com
                      kerryosborne.oracle-guy.com
                            www.enkitec.com




                                                    36

More Related Content

What's hot (19)

Introduction to Apache Drill
Introduction to Apache DrillIntroduction to Apache Drill
Introduction to Apache Drill
Swiss Big Data User Group
 
Analyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillAnalyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache Drill
tshiran
 
Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for Hadoop
Mike Pittaro
 
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Uwe Printz
 
Hadoop
HadoopHadoop
Hadoop
Peerapat Asoktummarungsri
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
rebeccatho
 
Introduction to Big Data & Hadoop
Introduction to Big Data & HadoopIntroduction to Big Data & Hadoop
Introduction to Big Data & Hadoop
Edureka!
 
Hadoop demo ppt
Hadoop demo pptHadoop demo ppt
Hadoop demo ppt
Phil Young
 
Architecting the Future of Big Data & Search - Eric Baldeschwieler
Architecting the Future of Big Data & Search - Eric BaldeschwielerArchitecting the Future of Big Data & Search - Eric Baldeschwieler
Architecting the Future of Big Data & Search - Eric Baldeschwieler
lucenerevolution
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
Ran Ziv
 
Hadoop overview
Hadoop overviewHadoop overview
Hadoop overview
Siva Pandeti
 
HBase with MapR
HBase with MapRHBase with MapR
HBase with MapR
Tomer Shiran
 
HDFS Erasure Coding in Action
HDFS Erasure Coding in Action HDFS Erasure Coding in Action
HDFS Erasure Coding in Action
DataWorks Summit/Hadoop Summit
 
Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?
gvernik
 
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability | Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Edureka!
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
EMC
 
HDFS
HDFSHDFS
HDFS
Steve Loughran
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
tcloudcomputing-tw
 
Hadoop online training
Hadoop online training Hadoop online training
Hadoop online training
Keylabs
 
Analyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillAnalyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache Drill
tshiran
 
Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for Hadoop
Mike Pittaro
 
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Uwe Printz
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
rebeccatho
 
Introduction to Big Data & Hadoop
Introduction to Big Data & HadoopIntroduction to Big Data & Hadoop
Introduction to Big Data & Hadoop
Edureka!
 
Hadoop demo ppt
Hadoop demo pptHadoop demo ppt
Hadoop demo ppt
Phil Young
 
Architecting the Future of Big Data & Search - Eric Baldeschwieler
Architecting the Future of Big Data & Search - Eric BaldeschwielerArchitecting the Future of Big Data & Search - Eric Baldeschwieler
Architecting the Future of Big Data & Search - Eric Baldeschwieler
lucenerevolution
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
Ran Ziv
 
Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?
gvernik
 
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability | Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Edureka!
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
EMC
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
tcloudcomputing-tw
 
Hadoop online training
Hadoop online training Hadoop online training
Hadoop online training
Keylabs
 

Similar to Kerry osborne hadoop meets exadata (20)

Introduction to Hadoop Administration
Introduction to Hadoop AdministrationIntroduction to Hadoop Administration
Introduction to Hadoop Administration
Ramesh Pabba - seeking new projects
 
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
cdmaxime
 
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
 Improving Apache Spark by Taking Advantage of Disaggregated Architecture Improving Apache Spark by Taking Advantage of Disaggregated Architecture
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
Databricks
 
Unit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxUnit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptx
Rahul Borate
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
Djamel Zouaoui
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
Rittman Analytics
 
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
MaharajothiP
 
002 Introduction to hadoop v3
002   Introduction to hadoop v3002   Introduction to hadoop v3
002 Introduction to hadoop v3
Dendej Sawarnkatat
 
From oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other toolsFrom oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other tools
Guy Harrison
 
Introduction to Spark - Phoenix Meetup 08-19-2014
Introduction to Spark - Phoenix Meetup 08-19-2014Introduction to Spark - Phoenix Meetup 08-19-2014
Introduction to Spark - Phoenix Meetup 08-19-2014
cdmaxime
 
Introduction to Hadoop and Big Data Processing
Introduction to Hadoop and Big Data ProcessingIntroduction to Hadoop and Big Data Processing
Introduction to Hadoop and Big Data Processing
Sam Ng
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Big Data Architecture and Deployment
Big Data Architecture and DeploymentBig Data Architecture and Deployment
Big Data Architecture and Deployment
Cisco Canada
 
Cisco connect toronto 2015 big data sean mc keown
Cisco connect toronto 2015 big data  sean mc keownCisco connect toronto 2015 big data  sean mc keown
Cisco connect toronto 2015 big data sean mc keown
Cisco Canada
 
Hadoop on Azure, Blue elephants
Hadoop on Azure,  Blue elephantsHadoop on Azure,  Blue elephants
Hadoop on Azure, Blue elephants
Ovidiu Dimulescu
 
Hadoop ppt on the basics and architecture
Hadoop ppt on the basics and architectureHadoop ppt on the basics and architecture
Hadoop ppt on the basics and architecture
saipriyacoool
 
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL DatabasesSQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
OReillyStrata
 
Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
chariorienit
 
Drill njhug -19 feb2013
Drill njhug -19 feb2013Drill njhug -19 feb2013
Drill njhug -19 feb2013
MapR Technologies
 
Introduction to Hadoop Administration
Introduction to Hadoop AdministrationIntroduction to Hadoop Administration
Introduction to Hadoop Administration
Ramesh Pabba - seeking new projects
 
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
cdmaxime
 
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
 Improving Apache Spark by Taking Advantage of Disaggregated Architecture Improving Apache Spark by Taking Advantage of Disaggregated Architecture
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
Databricks
 
Unit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxUnit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptx
Rahul Borate
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
Djamel Zouaoui
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
Rittman Analytics
 
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
MaharajothiP
 
From oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other toolsFrom oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other tools
Guy Harrison
 
Introduction to Spark - Phoenix Meetup 08-19-2014
Introduction to Spark - Phoenix Meetup 08-19-2014Introduction to Spark - Phoenix Meetup 08-19-2014
Introduction to Spark - Phoenix Meetup 08-19-2014
cdmaxime
 
Introduction to Hadoop and Big Data Processing
Introduction to Hadoop and Big Data ProcessingIntroduction to Hadoop and Big Data Processing
Introduction to Hadoop and Big Data Processing
Sam Ng
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Big Data Architecture and Deployment
Big Data Architecture and DeploymentBig Data Architecture and Deployment
Big Data Architecture and Deployment
Cisco Canada
 
Cisco connect toronto 2015 big data sean mc keown
Cisco connect toronto 2015 big data  sean mc keownCisco connect toronto 2015 big data  sean mc keown
Cisco connect toronto 2015 big data sean mc keown
Cisco Canada
 
Hadoop on Azure, Blue elephants
Hadoop on Azure,  Blue elephantsHadoop on Azure,  Blue elephants
Hadoop on Azure, Blue elephants
Ovidiu Dimulescu
 
Hadoop ppt on the basics and architecture
Hadoop ppt on the basics and architectureHadoop ppt on the basics and architecture
Hadoop ppt on the basics and architecture
saipriyacoool
 
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL DatabasesSQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
OReillyStrata
 

More from Enkitec (20)

Using Angular JS in APEX
Using Angular JS in APEXUsing Angular JS in APEX
Using Angular JS in APEX
Enkitec
 
Controlling execution plans 2014
Controlling execution plans   2014Controlling execution plans   2014
Controlling execution plans 2014
Enkitec
 
Engineered Systems: Environment-as-a-Service Demonstration
Engineered Systems: Environment-as-a-Service DemonstrationEngineered Systems: Environment-as-a-Service Demonstration
Engineered Systems: Environment-as-a-Service Demonstration
Enkitec
 
Think Exa!
Think Exa!Think Exa!
Think Exa!
Enkitec
 
In Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneIn Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry Osborne
Enkitec
 
In Search of Plan Stability - Part 1
In Search of Plan Stability - Part 1In Search of Plan Stability - Part 1
In Search of Plan Stability - Part 1
Enkitec
 
Mini Session - Using GDB for Profiling
Mini Session - Using GDB for ProfilingMini Session - Using GDB for Profiling
Mini Session - Using GDB for Profiling
Enkitec
 
Profiling Oracle with GDB
Profiling Oracle with GDBProfiling Oracle with GDB
Profiling Oracle with GDB
Enkitec
 
Oracle Performance Tools of the Trade
Oracle Performance Tools of the TradeOracle Performance Tools of the Trade
Oracle Performance Tools of the Trade
Enkitec
 
Oracle Performance Tuning Fundamentals
Oracle Performance Tuning FundamentalsOracle Performance Tuning Fundamentals
Oracle Performance Tuning Fundamentals
Enkitec
 
SQL Tuning Tools of the Trade
SQL Tuning Tools of the TradeSQL Tuning Tools of the Trade
SQL Tuning Tools of the Trade
Enkitec
 
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan Stability
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan StabilityUsing SQL Plan Management (SPM) to Balance Plan Flexibility and Plan Stability
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan Stability
Enkitec
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture Performance
Enkitec
 
OGG Architecture Performance
OGG Architecture PerformanceOGG Architecture Performance
OGG Architecture Performance
Enkitec
 
APEX Security Primer
APEX Security PrimerAPEX Security Primer
APEX Security Primer
Enkitec
 
How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?
Enkitec
 
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...Understanding how is that adaptive cursor sharing (acs) produces multiple opt...
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...
Enkitec
 
Sql tuning made easier with sqltxplain (sqlt)
Sql tuning made easier with sqltxplain (sqlt)Sql tuning made easier with sqltxplain (sqlt)
Sql tuning made easier with sqltxplain (sqlt)
Enkitec
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writer
Enkitec
 
Fatkulin hotsos 2014
Fatkulin hotsos 2014Fatkulin hotsos 2014
Fatkulin hotsos 2014
Enkitec
 
Using Angular JS in APEX
Using Angular JS in APEXUsing Angular JS in APEX
Using Angular JS in APEX
Enkitec
 
Controlling execution plans 2014
Controlling execution plans   2014Controlling execution plans   2014
Controlling execution plans 2014
Enkitec
 
Engineered Systems: Environment-as-a-Service Demonstration
Engineered Systems: Environment-as-a-Service DemonstrationEngineered Systems: Environment-as-a-Service Demonstration
Engineered Systems: Environment-as-a-Service Demonstration
Enkitec
 
Think Exa!
Think Exa!Think Exa!
Think Exa!
Enkitec
 
In Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneIn Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry Osborne
Enkitec
 
In Search of Plan Stability - Part 1
In Search of Plan Stability - Part 1In Search of Plan Stability - Part 1
In Search of Plan Stability - Part 1
Enkitec
 
Mini Session - Using GDB for Profiling
Mini Session - Using GDB for ProfilingMini Session - Using GDB for Profiling
Mini Session - Using GDB for Profiling
Enkitec
 
Profiling Oracle with GDB
Profiling Oracle with GDBProfiling Oracle with GDB
Profiling Oracle with GDB
Enkitec
 
Oracle Performance Tools of the Trade
Oracle Performance Tools of the TradeOracle Performance Tools of the Trade
Oracle Performance Tools of the Trade
Enkitec
 
Oracle Performance Tuning Fundamentals
Oracle Performance Tuning FundamentalsOracle Performance Tuning Fundamentals
Oracle Performance Tuning Fundamentals
Enkitec
 
SQL Tuning Tools of the Trade
SQL Tuning Tools of the TradeSQL Tuning Tools of the Trade
SQL Tuning Tools of the Trade
Enkitec
 
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan Stability
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan StabilityUsing SQL Plan Management (SPM) to Balance Plan Flexibility and Plan Stability
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan Stability
Enkitec
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture Performance
Enkitec
 
OGG Architecture Performance
OGG Architecture PerformanceOGG Architecture Performance
OGG Architecture Performance
Enkitec
 
APEX Security Primer
APEX Security PrimerAPEX Security Primer
APEX Security Primer
Enkitec
 
How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?
Enkitec
 
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...Understanding how is that adaptive cursor sharing (acs) produces multiple opt...
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...
Enkitec
 
Sql tuning made easier with sqltxplain (sqlt)
Sql tuning made easier with sqltxplain (sqlt)Sql tuning made easier with sqltxplain (sqlt)
Sql tuning made easier with sqltxplain (sqlt)
Enkitec
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writer
Enkitec
 
Fatkulin hotsos 2014
Fatkulin hotsos 2014Fatkulin hotsos 2014
Fatkulin hotsos 2014
Enkitec
 

Recently uploaded (20)

Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 

Kerry osborne hadoop meets exadata

  • 1. Hi Hadoop Meets Exadata Presented by: Kerry Osborne Oracle Open World – October, 2012
  • 2. whoami – Never Worked for Oracle Worked with Oracle Since 1982 (V2) Working with Exadata since early 2010 Work for Enkitec (www.enkitec.com) (Enkitec owns a Half Rack – V2/X2) (Enkitec owns a Big Data Appliance) Many Exadata customers and POCs Exadata Book (recently translated to Chinese) Hadoop Aficionado Blog: kerryosborne.oracle-guy.com Twitter: @KerryOracleGuy 2
  • 4. What’s the Point? Data Volumes are Increasing Rapidly Cost of Processing / Storing is High Something’s Gotta Give! Besides – managing large quantities of data is what we do! 4
  • 5. Hadoop Is A Virus * Stolen from Orbitz 5
  • 9. Digression #1 - Big Data Not My Favorite Term 3 or 4 V’s Value Density Not the Right Tool for Every Job 9
  • 10. Disjointed Presentation Architecture Comparison Integration Discussion Case Study ? 10
  • 11. Traditional RDBMS Architecture RAC w o Cache (SGA) r workers k dbwr lgwr etc… Block Mapper (ASM) Storage 11
  • 12. HDFS/Hadoop Architecture HA ? w o Job Tracker Name Node r k datanode tasktracker datanode tasktracker workers workers Storage Storage 12
  • 13. HDFS/Hadoop Architecture HA ? w o Job Tracker r k Block Mapper (namenode) datanode tasktracker datanode tasktracker workers workers Storage Storage 13
  • 14. Exadata Architecture RAC w workers o Cache r k Block Mapper (ASM) Storage Node Storage Node workers workers Storage Storage 14
  • 15. HDFS/Hadoop Architecture HA ? w o Job Tracker r k Block Mapper (namenode) datanode tasktracker datanode tasktracker workers workers Storage Storage 15
  • 16. Oracle + Hadoop Integration 16
  • 18. Integration Options Many Ways to Skin the Cat • Fuse • Sqoop • Oracle Big Data Connectors 18
  • 19. Fuse – External Tables 19
  • 20. Sqoop (SQL-to-Hadoop) • Graduated from Incubator Status in March 2012 • Slower (no direct path?) • Quest has a plug-in (oraoop) • Bi-Directional 20
  • 21. Oracle Big Data Connectors Oracle Loader for Hadoop - OLH Oracle Direct Connector for HDFS  - ODCH Oracle R Connector for Hadoop – ORHC Oracle Data Integrator Application Adapter for Hadoop Note: All Connectors are One Way All sold together for $2K per core list 21
  • 22. Oracle Data Integrator Application Adapter for Hadoop ODIAAH ? 22
  • 23. Oracle R Connector for Hadoop (ORHC) • Provides ability to pull data from Oracle RDBMS • Provides ability to pull data from HDFS • Provides access to local file system • Not really a loader tool • Most useful for analysts 23
  • 24. Oracle Loader for Hadoop (OLH) • Implemented as a MapReduce job (oraloader.jar) • Saves CPU on DB Server • Can convert to Oracle datatypes • Can partition data and optionally sort it • Online – direct into Oracle tables • Can load into Oracle via JDBC or OCI Direct Path • Offline – generate preprocessed files in HDFS (DP format) 24
  • 25. Oracle Direct Connector for HDFS  (ODCH) • My Favorite • Uses External Tables • Fastest • 12T per hour • Can load DP files preprocessed by OLH • Allows Oracle SQL to query HDFS data • Doesn’t require loading into Oracle • Pretty Cool! • Downside – uses DB CPU’s 25
  • 27. Exadoop Unusual Situation! Half Rack with 4 Spare Storage Servers Exadata Cells Very Similar to BDA Servers slower CPU’s less memory but same drives (12X3T) and IB and Flash 4 Cells ≈ Mini BDA! (happy face) 27
  • 28. Digression #2 - BDA Stuff 28
  • 29. Digression #2 - BDA Stuff 29
  • 30. Digression #2 - BDA Stuff 30
  • 31. Exadoop Situation • Pilot Underway – but wanted more power • 4 Exadata Storage Servers were sitting idle • Suggestion was to Install Hadoop Cluster on them • 1st Concern was being able to Reclaim for Exadata • Removing Data Node from HDFS Not a Problem • Adding Storage to ASM Not a Problem • So the Decision Was Made to Move Forward 31
  • 32. Exadoop Set Up • Removed the Internal USB’s • Installed OEL 6.2 • Installed CDH3 • Loaded Some Data • Set Up ODCH with External Tables 32
  • 33. Exadoop Testing • Selecting Data Using External Tables was Not Very Fast • Quickly Determined we had Used Default 1G Network • Reconfigured with IB • Helped But Not as Much as Expected • Using Little CPU on Data Nodes • But a Single Process was Pegging a CPU on the DB • Added Parallelism • No Good, Only One Slave Active • Added Multiple Files to External Table Def. – Bingo! 33
  • 34. Exadoop Testing - Continued • Added Fuse Client • Created External Tables with Fuse • PX seems to work even on single files • Puts additional CPU load on DB server (2T/hr) 34
  • 35. Wrap Up Right Tool For The Job? Maybe All the Cool Kids Are Doing It! 35
  • 36. Questions? Contact Information : Kerry Osborne [email protected] kerryosborne.oracle-guy.com www.enkitec.com 36

Editor's Notes

  • #17: Many companies that are using Hadoop in a big way still have Oracle databases sitting right next to them. Nokia - I had a meeting with a guy from Nokia a couple of weeks ago. We discussed how they were using Hadoop and he described basically an ETL kind of setup. The HDFS cluster ingests data that is then processed by MR jobs. The aggregated data is then fed into a Relational DB so the analysts could have their way with it. People have preferences for certain tools (BI tools for example). Also, RDBMS’s can be very fast for this type of access is the data is of reasonable size. Not using Flume, ??? Usiing it for many things but positional data from phones was one of the main cases we discussed. Canadian NSA – They have Exadata and Hadoop Cluster – rows of racks of both
  • #27: Use Firefox http://192.168.9.98:7777/pls/apex/f?p=100:2:1849672391763932::NO#
  • #28: Use Firefox http://192.168.9.98:7777/pls/apex/f?p=100:2:1849672391763932::NO#
  • #32: Use Firefox http://192.168.9.98:7777/pls/apex/f?p=100:2:1849672391763932::NO#
  • #36: With all the new options available it will take some serious thought about what architecture makes the most sense for any given problem. I had a conversation 2 weeks ago with the Canadian NSA (CSE) – completely static data set – never updated. Good for Hadoop or for HCC. HCC provides about 10x compression on their data set. So a single Exadata rack which has a raw storage capacity of about half a pedabyte can store over 2 pedabytes with normal redundancy. On the other hand, I had a conversation with Nokia about how they are using Hadoop. They have been heavily investing in the technology for a couple of years. A large part of what they do involves investing data produced by mobile phones. The data is typical mined by MR jobs and aggregated data sets are then loaded into RDBMS’s where analysts can use standard BI tools to do what they do. So they described it as an ETL type process.