Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
We do Hadoop together.
Modern Data Architecture for
Data Transformation and Acquisition
with Oracle® and Apache™
Hadoop®
Quick Housekeeping
Q&A box is available for your questions
Webinar will be recorded for future viewing
Thank you for joining!
Your Presenters
• Jeff Pollock
– Vice President, Product Management, Oracle
– Previously responsible for IBM InfoSphere Information
Integration & Governance products
– Author of “Semantic Web for Dummies” and “Adaptive
Information”
• Tim Hall
– Vice President, Product Management, Hortonworks
– Previously responsible for Oracle’s outbound product
management covering the Business Process
Management Suite, SOA Suite
Today’s Topics
• Drivers for the Modern Data Architecture
• New Analytic Applications for New Types of Data
• Hadoop as the solution for Data Lake
• Hortonworks and Oracle Data Integration teaming up
• Oracle patterns for successful Data Reservoirs
• Oracle Data Integration Strengths in Hadoop
• Oracle Data Governance for Hadoop
• Q&A
Poll: Where are you in your Hadoop journey?
1. Researching our options
2. Currently evaluating some software
3. Deep in a trial
4. In production with a Hadoop cluster
5. What’s Hadoop?
A Data Architecture Under Pressure From New Data
APPLICATIONS
DATA SYSTEM
REPOSITORIES
SOURCES
Existing Sources
(CRM, ERP, Clickstream, Logs)
RDBMS EDW MPP
Business
Analytics
Custom
Applications
Packaged
Applications
Source: IDC
2.8 ZB in 2012
85% from New Data Types
15x Machine Data by 2020
40 ZB by 2020
OLTP, ERP, CRM Systems
Unstructured documents, emails
Clickstream
Server logs
Sentiment, Web Data
Sensor, Machine Data
Geolocation
Hadoop Within An Emerging Modern Data Architecture
OPERATIONS TOOLS
Provision,
Manage &
Monitor
DEV & DATA TOOLS
Build &
Test
DATA SYSTEM
REPOSITORIES
SOURCES
RDBMS EDW MPP
OLTP, ERP,
CRM
Systems
Documents,
Emails
Web Logs,
Click
Streams
Social
Networks
Machine
Generated
Sensor
Data
Geolocation
Data
Governance
& Integration
Security
Operations
Data Access
Data Management
APPLICATIONS
Business
Analytics
Custom
Applications
Packaged
Applications
Clickstream
Capture and analyze
website visitors’ data
trails and optimize
your website
Sensors
Discover patterns in
data streaming
automatically from
remote sensors and
machines
Server Logs
Research logs to
diagnose process
failures and prevent
security breaches
Hadoop Value: New types of data
Sentiment
Understand how
your customers feel
about your brand
and products –
right now
Geographic
Analyze location-
based data to
manage operations
where they occur
Unstructured
Understand patterns
in files across
millions of web
pages, emails, and
documents
New Analytic Applications For New Types Of Data
Manufacturing
• Supplier Consolidation
• Supply Chain and Logistics
• Assembly Line Quality Assurance
• Proactive Maintenance
• Crowdsourced Quality Assurance
Financial Services
• New Account Risk Screens
• Fraud Prevention
• Trading Risk
• Maximize Deposit Spread
• Insurance Underwriting
• Accelerate Loan Processing
Telecom
• Call Detail Records (CDRs)
• Infrastructure Investment
• Next Product to Buy (NPTB)
• Real-time Bandwidth Allocation
• New Product Development
Retail
• 360° View of the Customer
• Analyze Brand Sentiment
• Localized, Personalized Promotions
• Website Optimization
• Optimal Store Layout
Healthcare
• Genomic data for medical trials
• Monitor patient vitals
• Reduce re-admittance rates
• Store medical research data
• Recruit cohorts for pharmaceutical trials
Utilities, Oil & Gas
• Smart meter stream analysis
• Slow oil well decline curves
• Optimize lease bidding
• Compliance reporting
• Proactive equipment repair
• Seismic image processing
Public Sector
• Analyze public sentiment
• Protect critical networks
• Prevent fraud and waste
• Crowdsource reporting for repairs to infrastructure
• Fulfill open records requests
… And Incrementally Delivers A ‘Data Lake’
Data Lake
• An architectural shift in the
data center that uses
Hadoop to deliver deeper
insight across a large,
broad, diverse set of data at
efficient scale
SCALE
SCOPE
A Modern Data Architecture/Data Lake
New Analytic Apps
New types of data
LOB-driven
RDBMS
MPP
EDW
Governance
& Integration
Security
Operations
Data Access
Data Management
The Modern Data Architecture
Oracle Data Integration
• Eliminates need for
separate ETL engine –
and associated H/W,
admin, overhead
• Non-invasive real-time
data staging into Hadoop
• Streamlines development
by providing capability to
separate Logical from
Physical mappings
• Reduces risk and
compliance exposure via
comprehensive data
governance
Oracle & Hortonworks
YARN Ready Partner
Certified on latest release of
Hortonworks Data Platform
Sandbox tutorial
Tutorial for
HWX Sandbox
Coming Soon!
ORCL Sandbox
Here Now!
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Data Integration & Governance
Dynamic Data Movement
– Low impact capture
– Continuous data staging
Data Transformation
– Bulk data movement
– Pushdown data processing
Data Federation
– Virtualized Data Services
Data Quality & Verification
– Fix quality at the source
– Verify data consistency
Metadata Management
– Lineage and Impact Analysis
– Business Glossary Semantics
Data Governance
Foundation
Oracle Data Integrator
(Transformation)
Enterprise Data Quality
(Profile, Cleanse, Match and De-duplicate)
Fast
Load
Oracle GoldenGate
(Movement)
Enterprise Metadata Management & Business Glossary
(Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
Data Service Integrator
(Federation)
GoldenGate Veridata
(Online Data Verification)
ELT Processing
on Hadoop or SQL
Continuous Availability
Comprehensive capabilities for the end-to-end data integration
and governance of all data – including Hadoop based data
Leverage Wide Range of Modern Analytic Styles
How to Succeed With a Big Data Reservoir
Do:
– Directly link to a Line of Business
initiative
– Iterate on short cycles, plan for
small high-value deliverables along
the way
– Use tools, not only custom coded
programs
Do Not:
– Start with a techie-led research
project w/out a biz objective
– Overpromise business results based on
market hype alone
– Assume MapReduce is the answer
to all your technical challenges
DBMS
(on prem or cloud)
Data First
Analytics
Model First
Analytics
Streaming
Analytics
Maximizing benefits:
1. Schema on Read
2. Cheaper Compute
3. Cheaper Storage
3 Core Patterns of Big Data Reservoir Success
DBMS
(on prem or cloud)
Sandbox
ETL Offload
Staging
Deep Data
Storage
Data Sandbox:
– Leader: Line of Business (LoB)
– Value: Faster access to business data, Faster
time to value on Analytics
– Innovation: Schema-on-read empowers
rapid staging and Data Discovery
ETL Offload:
– Leader: Information Technology (IT)
– Value: Cost avoidance on DW/Marts
– Innovation: YARN/Hadoop empowers lower
cost compute and lower cost storage
Deep Data Storage:
– Leader: Risk / Compliance (LoB)
– Core Value: High fidelity aged data
– Innovation: SQL on Hadoop engines enable
very low cost, queryable data access
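The Data Sandbox pattern above leans on schema-on-read. A minimal Python sketch of the idea follows; the raw records, field names, and the `ORDER_SCHEMA` layout are hypothetical and are not taken from any Oracle or Hortonworks product. Raw records are kept exactly as they arrived, and each reader applies its own schema at query time.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple

# Raw events are stored exactly as they arrived; no schema is enforced on write.
RAW_EVENTS = [
    '{"user": "a1", "amount": "19.90", "ts": "2014-06-01"}',
    '{"user": "b2", "amount": "5", "channel": "web"}',
    '{"user": "c3"}',
]

# Hypothetical read-time schema: field name -> (converter, default value).
Schema = Dict[str, Tuple[Any, Any]]

ORDER_SCHEMA: Schema = {
    "user": (str, None),
    "amount": (float, 0.0),
}


def read_with_schema(lines: Iterable[str], schema: Schema) -> Iterator[Dict[str, Any]]:
    """Project each raw record onto the schema chosen by the reader."""
    for line in lines:
        record = json.loads(line)
        yield {
            field: convert(record[field]) if field in record else default
            for field, (convert, default) in schema.items()
        }


rows = list(read_with_schema(RAW_EVENTS, ORDER_SCHEMA))
total = sum(row["amount"] for row in rows)
```

A second reader could apply a different schema (for example one that keeps `channel`) to the same raw lines without reloading or restaging them.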
Oracle Approach to Big Data Integration is Superior
Oracle GoldenGate:
– Non-invasive data capture
– Low-latency data movement
– Full or partial records staging
– Most proven integration tool worldwide
Oracle Data Integrator:
– No ETL engine is required
– Logical design separate from physical
– Deploys in Hadoop or off cluster
– Many options for movement
Metadata & Glossary:
– Search Driven
– Business Friendly
– Huge 3rd Party Support
– Automated Metadata Stitching
Oracle GoldenGate Capabilities for Big Data
HDFS (Files)
HBase (NoSQL)
Hive / Hive Streaming (SQL)
Flume & Storm (Streaming)
Kafka (MPP Pub/Sub)
Spark Streaming (Machine Learning)
Capture Database Transactions and
Deliver to Big Data in Real-Time
Capture
Trail
Route
Deliver
Pump
GoldenGate
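The Capture, Trail, Pump and Deliver stages named above can be pictured with a small conceptual sketch in Python. This is not GoldenGate code or its API: the `ChangeRecord` type, the function names, and the in-memory target are all assumptions made for illustration. It only shows the general change-data-capture idea of reading committed changes in order, keeping them in a trail, and applying the unseen ones to a target.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ChangeRecord:
    """One committed row change read from a source transaction log (hypothetical)."""
    table: str
    op: str          # "I" insert, "U" update, "D" delete
    key: int
    values: Dict[str, object]


def capture(transaction_log: List[ChangeRecord]) -> List[ChangeRecord]:
    """Capture: read committed changes in commit order into a trail."""
    return list(transaction_log)


def pump(trail: List[ChangeRecord], start: int) -> List[ChangeRecord]:
    """Pump: ship the trail records the target has not seen yet."""
    return trail[start:]


def deliver(batch: List[ChangeRecord],
            target: Dict[str, Dict[int, Dict[str, object]]]) -> None:
    """Deliver: apply each change to an in-memory stand-in for the target store."""
    for change in batch:
        rows = target.setdefault(change.table, {})
        if change.op == "D":
            rows.pop(change.key, None)
        else:
            rows.setdefault(change.key, {}).update(change.values)


log = [
    ChangeRecord("orders", "I", 1, {"status": "new", "amount": 40}),
    ChangeRecord("orders", "I", 2, {"status": "new", "amount": 15}),
    ChangeRecord("orders", "U", 1, {"status": "shipped"}),
    ChangeRecord("orders", "D", 2, {}),
]

trail = capture(log)      # trail of captured changes
target: Dict[str, Dict[int, Dict[str, object]]] = {}
checkpoint = 0            # position of the last delivered trail record

batch = pump(trail, checkpoint)
deliver(batch, target)
checkpoint += len(batch)
```

Keeping a checkpoint into the trail is what lets delivery resume after an interruption without rereading the source tables.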
Business Value of the GoldenGate Approach
Continuous Data Staging
– Don’t make the business wait
– CDC is the default, not an add-on
– Least invasive on sources
– Hadoop staging is fresh
Integrated, Native Capture
– Don’t create unnecessary risk
– Keep current with DB patches
– Certainty around licensing
– Proven best performance
Most Widely Proven
– 1000’s of customers
– Most demanding high volume
– Used for High Availability (HA)
– Dependable results
vs.
Batch Data Movement
– Typical ETL vendors all default to batch data
movement in their reference architectures
– Changed Data is an immature add-on
– ETL loading into Hadoop is mainly “batch mode”
Clumsy & Risky Data Capture
– Not in sync with Oracle Database versions
– Some can “talk the talk” but their CDC tech can’t
touch Oracle GoldenGate scale/performance
– Patches and Licensing create business risk
Niche, Low-End
– Some vendors only cover a few platforms
– Some vendors are broad, but don’t scale
– Few vendors have the reliability and dependability
to cover HA use cases
vs.
vs.
…the “Other Vendors”
Oracle Data Integrator (ODI) Capabilities for Big Data
[Diagram labels: Logs, OLTP DB, API/File, NoSQL, Any DW; Flume, SQOOP, OGG; ODI; Hive on MR, Tez, Spark; Pig on MR, Tez, Spark; Spark; Oozie; Hive/HCat, HDFS, HBase; OEDQ (Data Validation & Cleansing); OEMM (Metadata Mgmt & Lineage)]
Map once at the logical level, and then choose which Big Data or
Hadoop framework you want to run in!
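A rough Python sketch of "map once at the logical level" follows. It is not ODI's actual mapping model or code generator: the `LogicalMapping` class, the table and column names, and both renderers are hypothetical. The Pig output assumes the tables are registered in HCatalog so that `HCatLoader` can supply the schema. One engine-neutral mapping is rendered into two physical forms.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class LogicalMapping:
    """Engine-neutral description of a simple filter-and-project mapping."""
    source: str
    target: str
    columns: Dict[str, str]              # target column -> source column
    row_filter: Tuple[str, str, int]     # (source column, comparison operator, value)


def to_hiveql(m: LogicalMapping) -> str:
    """Render the mapping as a HiveQL INSERT ... SELECT statement."""
    select_list = ", ".join(f"{src} AS {tgt}" for tgt, src in m.columns.items())
    col, op, value = m.row_filter
    return (
        f"INSERT INTO TABLE {m.target} "
        f"SELECT {select_list} FROM {m.source} WHERE {col} {op} {value}"
    )


def to_pig(m: LogicalMapping) -> str:
    """Render the same mapping as Pig Latin (assumes HCatalog-registered tables)."""
    generate = ", ".join(f"{src} AS {tgt}" for tgt, src in m.columns.items())
    col, op, value = m.row_filter
    return "\n".join([
        f"src = LOAD '{m.source}' USING org.apache.hive.hcatalog.pig.HCatLoader();",
        f"kept = FILTER src BY {col} {op} {value};",
        f"out = FOREACH kept GENERATE {generate};",
        f"STORE out INTO '{m.target}' USING org.apache.hive.hcatalog.pig.HCatStorer();",
    ])


mapping = LogicalMapping(
    source="orders",
    target="big_orders",
    columns={"id": "order_id", "total": "amount"},
    row_filter=("amount", ">", 100),
)
hive_sql = to_hiveql(mapping)
pig_script = to_pig(mapping)
```

The design point is that the mapping object never mentions an engine; choosing Hive or Pig is a decision made only when the physical code is generated.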
Business Value of ODI: Low Cost and High Dev Efficiency
No ETL engine is
required
Separation of
Logical and
Physical design
Physical exec on
SQL, Hive, Pig, or
Spark
Runtime exec in
Oozie or via ODI
Java Agent
Rich set of pre-
built operators
User defined
functions
Eliminate your ETL Engines and improve Developer efficiency –
now, everybody can be a Big Data developer!
Data Flow Approaches to Big Data Integration
1. Traditional ETL Tools
(execute entirely outside of Hadoop)
2. ETL Tools with Native “on” Hadoop
(require proprietary code on Data Nodes)
3. Manual Coding
(ultimate flexibility, but at a very high cost)
4. ODI Native in Hadoop
(no ETL Engine & no Data Node footprint)
*small ODI Agent may optionally install off cluster or
on Name Node, no dependencies on Data Nodes
[Figure: four Hadoop Cluster diagrams, one per approach. Labels in the figure: ETL, HDFS, Manual Code, Spark, Sqoop, Hive, Pig, ODI, Oozie, GG, and a BEST marker.]
Oracle Metadata Management & Glossary for Big Data
Comprehensive Data Lineage
Business Friendly Navigation
Business & IT Collaboration
Easy to Use, Search Driven
Value of Metadata and Business Glossary
My dashboard
does not match
this report…why?
Where did this data come from?
Where can I find the data I need for analytics?
Which ETL mappings or
BI Reports will be
affected by my column
change?
What systems does
the data flow
through?
TRUSTED DATA IT CERTAINTY
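Questions such as "which reports will be affected by my column change?" and "where did this data come from?" amount to walking a lineage graph downstream or upstream. A minimal Python sketch follows; the graph contents and asset names are invented for illustration and do not reflect how Oracle Enterprise Metadata Management stores or queries lineage.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical lineage graph: each asset maps to the assets that read from it.
LINEAGE: Dict[str, List[str]] = {
    "crm.customers.email": ["etl.load_customers"],
    "etl.load_customers": ["dw.dim_customer.email"],
    "dw.dim_customer.email": ["bi.churn_report", "bi.marketing_dashboard"],
    "erp.orders.amount": ["etl.load_orders"],
    "etl.load_orders": ["dw.fact_orders.amount"],
    "dw.fact_orders.amount": ["bi.marketing_dashboard"],
}


def downstream(asset: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Impact analysis: every asset reachable from `asset` (breadth-first)."""
    seen: Set[str] = set()
    queue = deque([asset])
    while queue:
        for child in graph.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def upstream(asset: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Lineage: every asset that feeds `asset`, found by reversing the edges."""
    reverse: Dict[str, List[str]] = {}
    for parent, children in graph.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    return downstream(asset, reverse)


impacted = downstream("crm.customers.email", LINEAGE)
sources = upstream("bi.marketing_dashboard", LINEAGE)
```

Here `impacted` answers the column-change question for `crm.customers.email`, and `sources` answers the provenance question for the marketing dashboard.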
Big Data Governance Lifecycle Tooling
Operational Data Flows
Business Sources
Quality KPIs Case
Management
Governance Cockpit for Data Stewards & Stakeholders
Exception
Review
Metadata
Management
Business
Glossary
Design Time
Support People and Processes with an end-to-
end tooling capability!
The Data Governance Opportunity with Big Data
Solving business and IT data challenges
…to manage Risk/Compliance
• Records retention
• Rediscovery
• Litigation support
• Data access management
• Information security and protection
Minimize corporate liability through proper
governance of data
…to drive Business Value
• Metadata discovery
• Metadata & glossary cataloging
• Data profiling
• Data cleansing lifecycle
• Data remediation
Maximize opportunity by ensuring trusted
data is easily available for data driven
business processes
Most Heterogeneous, Deep 3rd Party Coverage
Operational Integration (Movement / Transformation)
• Hadoop HBase
• Hadoop Hive/Flume
• HP Enscribe
• HP NonStop
• HP Neoview
• Hypersonic SQL
• IBM DB2 i Series
• IBM DB2 UDB
• IBM DB2 z Series
• IBM Informix
• IBM Netezza
• JMS / MQ
• Microsoft Access
• Microsoft SQL Server
• MySQL
• Pivotal Greenplum
• PostgreSQL
• Salesforce.com
• SAP BW / BI
• SAP ERP / ECC
• SAS
• SQL/MP
• SQL/MX
• Sybase ASE
• Sybase IQ
• Teradata
• Oracle Database
• Oracle Exadata
• Oracle Big Data Appliance
• Oracle TimesTen
• Oracle OLAP
• Oracle Business Intelligence
• Oracle BI Applications
• Oracle E-Business Suite
• Oracle JD Edwards Enterprise One
• Oracle JD Edwards World
• Oracle Fusion Applications
• Oracle Governance Risk and Compliance
• Oracle Fusion AIA
• Oracle Retail Applications
• Oracle Agile BI / DW
• Oracle Agile PLM for Process
• Oracle iFlex FlexCUBE
• Oracle iFlex Mantas
• Oracle Hyperion Applications
• Oracle PeopleSoft
• Oracle Siebel CRM / OnDemand
• Oracle Communications
• Oracle WebLogic Server
• Oracle Coherence Data Grid
• Oracle SOA Suite
• Oracle Enterprise Service Bus
Metadata Harvesting (Glossary, Lineage & Impact Analysis)
• Adaptive
• Altova
• Apache HCatalog
• Apache Hive/HQL
• Borland
• CA ERwin
• Cloudera Impala
• COBOL Copybook
• DataStax
• Embarcadero
• EMC ProActivity
• GentleWare
• Google BigQuery
• Grandite
• Hadapt Hive
• Hortonworks Hive
• IBM Cognos
• IBM DB2
• IBM DataStage
• IBM Discovery
• IBM Federation Server
• IBM Lotus Notes
• IBM Netezza
• IBM Rational Rose
• IBM Rational Architect
• Informatica Metadata Mgr.
• Informatica PowerCenter
• CoSORT
• ISO SQL Standard (DDL)
• MapR Hadoop Hive
• MicroFocus
• Microsoft Access
• Microsoft Office Excel
• Microsoft Visio
• Microsoft SQL Server
• Microsoft SSIS
• Microsoft Visual Studio
• MicroStrategy
• Magic Draw
• OMG CWM Standard
• OMG UML Standard
• Oracle BI Answers
• Oracle BI Enterprise Edition
• Oracle BI Server
• Oracle DAC
• Oracle Data Integrator
• Oracle Data Modeler
• Oracle Database
• Oracle Designer
• Oracle Hyperion Applications
• Oracle Hyperion Essbase
• Oracle Warehouse Builder
• Pivotal Greenplum
• PostgreSQL
• QlikView
• SAP BO Crystal Reports
• SAP BO Designer
• SAP BO Desktop Intelligence
• SAP BO Repository
• SAP BO Data Integrator
• SAP BO Data Steward
• SAP Master Data Management
• SAP Sybase PowerDesigner
• SAP Sybase ASE Database
• SAS Data Integration Studio
• SAS BI Server
• SAS Information Map
• SAS Metadata Management
• SAS OLAP Server
• Select
• Sparx Architect
• Syncsort
• Tableau
• Talend
• Teradata
• Tigris
• Visible
• W3C DTD & XSD Schema
+ open APIs and standards
based meta-model
No other vendor can compare:
• 50+ systems for Operational Integration
• 70+ systems for Metadata Harvesting
Data Governance
Foundation
Differentiated Technical Approach from Oracle
Dynamic Data Movement
– Real-time by default, not ETL
– Least invasive on sources
– Proven best performance
– Native Oracle integration
No ETL Engines
– Take processing to the data;
don’t move the data
– Leverage the data engines for
workloads (Hadoop or SQL)
Most Heterogeneous
– Leverage open source Hadoop,
not proprietary distributions
– Hadoop is the Hub, not ETL tools
– Open metadata standards
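The "No ETL Engines" point above ("take processing to the data") can be contrasted with a pull-and-transform flow in a few lines of Python. This is only a sketch under stated assumptions: an in-memory SQLite database stands in for a Hive or SQL data engine, and the table and column names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # stand-in for a Hive or SQL data engine
conn.execute("CREATE TABLE clicks (page TEXT, visits INTEGER)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?)",
    [("home", 3), ("pricing", 5), ("home", 7), ("docs", 2)],
)

# ETL style: pull every row out of the engine, then aggregate in the tool.
pulled = conn.execute("SELECT page, visits FROM clicks").fetchall()
etl_totals = {}
for page, visits in pulled:
    etl_totals[page] = etl_totals.get(page, 0) + visits

# ELT / pushdown style: the engine aggregates in place and writes the result
# to a target table, so no detail rows pass through an external tool.
conn.execute(
    "CREATE TABLE page_totals AS "
    "SELECT page, SUM(visits) AS visits FROM clicks GROUP BY page"
)
elt_totals = dict(conn.execute("SELECT page, visits FROM page_totals"))
rows_moved_etl = len(pulled)
```

Both paths produce the same totals; the difference is that the first moves every detail row out of the engine, while the second moves only the statement.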
Question & Answer session will be conducted electronically,
using the panel to the right of your screen
About Oracle and Hortonworks
hortonworks.com/partner/oracle/
Get started with Hortonworks Sandbox
hortonworks.com/sandbox
Follow us:
@hortonworks @Oracle
Learn more
Oracle.com/goto/dataintegration
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
Hortonworks
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
POSSCON
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
Hortonworks
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
Hortonworks
 
Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks
Hortonworks
 
Oracle Unified Information Architeture + Analytics by Example
Oracle Unified Information Architeture + Analytics by ExampleOracle Unified Information Architeture + Analytics by Example
Oracle Unified Information Architeture + Analytics by Example
Harald Erb
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
jaxconf
 
Hortonworks kognitio webinar 10 dec 2013
Hortonworks kognitio webinar 10 dec 2013Hortonworks kognitio webinar 10 dec 2013
Hortonworks kognitio webinar 10 dec 2013
Michael Hiskey
 
Modern Data Architecture: In-Memory with Hadoop - the new BI
Modern Data Architecture: In-Memory with Hadoop - the new BIModern Data Architecture: In-Memory with Hadoop - the new BI
Modern Data Architecture: In-Memory with Hadoop - the new BI
Kognitio
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Hortonworks
 
Tame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data IntegrationTame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data Integration
Michael Rainey
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
Hortonworks
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
Cécile Poyet
 

Hortonworks Oracle Big Data Integration

  • 1. Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved We do Hadoop together.
  • 2. Modern Data Architecture for Data Transformation and Acquisition with Oracle® and Apache™ Hadoop®
  • 3. Quick Housekeeping • Q&A box is available for your questions • Webinar will be recorded for future viewing • Thank you for joining!
  • 4. Your Presenters • Jeff Pollock – Vice President, Product Management, Oracle – Previously responsible for IBM InfoSphere Information Integration & Governance products – Author of “Semantic Web for Dummies” and “Adaptive Information” • Tim Hall – Vice President, Product Management, Hortonworks – Previously responsible for Oracle’s outbound product management covering the Business Process Management Suite, SOA Suite
  • 5. Today’s Topics • Drivers for the Modern Data Architecture • New Analytic Applications for New Types of Data • Hadoop as the solution for Data Lake • Hortonworks and Oracle Data Integration teaming up • Oracle patterns for successful Data Reservoirs • Oracle Data Integration Strengths in Hadoop • Oracle Data Governance for Hadoop • Q&A
  • 6. Poll: Where are you in your Hadoop journey? 1. Researching our options 2. Currently evaluating some software 3. Deep in a trial 4. In production with a Hadoop cluster 5. What’s Hadoop?
  • 7. A Data Architecture Under Pressure From New Data • APPLICATIONS – Business Analytics – Custom Applications – Packaged Applications • DATA SYSTEM – REPOSITORIES: RDBMS, EDW, MPP • SOURCES – Existing Sources (CRM, ERP, Clickstream, Logs) – OLTP, ERP, CRM Systems – Unstructured documents, emails – Clickstream – Server logs – Sentiment, Web Data – Sensor, Machine Data – Geolocation • Source: IDC – 2.8 ZB in 2012 – 85% from New Data Types – 15x Machine Data by 2020 – 40 ZB by 2020
  • 8. Hadoop Within An Emerging Modern Data Architecture • OPERATIONS TOOLS – Provision, Manage & Monitor • DEV & DATA TOOLS – Build & Test • APPLICATIONS – Business Analytics – Custom Applications – Packaged Applications • DATA SYSTEM – REPOSITORIES: RDBMS, EDW, MPP – Governance & Integration – Security – Operations – Data Access – Data Management • SOURCES – OLTP, ERP, CRM Systems – Documents, Emails – Web Logs, Click Streams – Social Networks – Machine Generated – Sensor Data – Geolocation Data
  • 9. Hadoop Value: New types of data • Clickstream – Capture and analyze website visitors’ data trails and optimize your website • Sensors – Discover patterns in data streaming automatically from remote sensors and machines • Server Logs – Research logs to diagnose process failures and prevent security breaches • Sentiment – Understand how your customers feel about your brand and products, right now • Geographic – Analyze location-based data to manage operations where they occur • Unstructured – Understand patterns in files across millions of web pages, emails, and documents
  • 10. New Analytic Applications For New Types Of Data • Financial Services – New Account Risk Screens – Fraud Prevention – Trading Risk – Maximize Deposit Spread – Insurance Underwriting – Accelerate Loan Processing • Retail – 360° View of the Customer – Analyze Brand Sentiment – Localized, Personalized Promotions – Website Optimization – Optimal Store Layout • Telecom – Call Detail Records (CDRs) – Infrastructure Investment – Next Product to Buy (NPTB) – Real-time Bandwidth Allocation – New Product Development • Manufacturing – Supplier Consolidation – Supply Chain and Logistics – Assembly Line Quality Assurance – Proactive Maintenance – Crowdsourced Quality Assurance • Healthcare – Genomic data for medical trials – Monitor patient vitals – Reduce re-admittance rates – Store medical research data – Recruit cohorts for pharmaceutical trials • Utilities, Oil & Gas – Smart meter stream analysis – Slow oil well decline curves – Optimize lease bidding – Compliance reporting – Proactive equipment repair – Seismic image processing • Public Sector – Analyze public sentiment – Protect critical networks – Prevent fraud and waste – Crowdsource reporting for repairs to infrastructure – Fulfill open records requests
  • 11. … And Incrementally Delivers A ‘Data Lake’ • Data Lake – An architectural shift in the data center that uses Hadoop to deliver deeper insight across a large, broad, diverse set of data at efficient scale • Diagram labels – SCALE – SCOPE – A Modern Data Architecture/Data Lake – New Analytic Apps – New types of data – LOB-driven – RDBMS, MPP, EDW – Governance & Integration – Security – Operations – Data Access – Data Management
  • 12. The Modern Data Architecture: Oracle Data Integration • Eliminates the need for a separate ETL engine and its associated hardware, administration, and overhead • Non-invasive real-time data staging into Hadoop • Streamlines development by separating Logical from Physical mappings • Reduces risk and compliance exposure via comprehensive data governance
  • 13. Oracle & Hortonworks • YARN Ready Partner • Certified on latest release of Hortonworks Data Platform • Sandbox tutorial – Tutorial for HWX Sandbox Coming Soon! – ORCL Sandbox Here Now!
  • 14. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Data Integration & Governance • Dynamic Data Movement – Low impact capture – Continuous data staging • Data Transformation – Bulk data movement – Pushdown data processing • Data Federation – Virtualized Data Services • Data Quality & Verification – Fix quality at the source – Verify data consistency • Metadata Management – Lineage and Impact Analysis – Business Glossary Semantics • Data Governance Foundation – Oracle Data Integrator (Transformation) – Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate) – Oracle GoldenGate (Movement) – Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance) – Data Service Integrator (Federation) – GoldenGate Veridata (Online Data Verification) • Diagram labels – Fast Load – ELT Processing on Hadoop or SQL – Continuous Availability • Comprehensive capabilities for the end-to-end data integration and governance of all data, including Hadoop based data
  • 15. How to Succeed With a Big Data Reservoir
    Do:
    – Directly link to a Line of Business initiative
    – Iterate on short cycles; plan for small, high-value deliverables along the way
    – Use tools, not only custom-coded programs
    Do Not:
    – Start with a techie-led research project without a business objective
    – Overpromise business results on the market hype alone
    – Assume MapReduce is the answer to all your technical challenges
    [Diagram: DBMS (on prem or cloud). Leverage Wide Range of Modern Analytic Styles: Data First Analytics; Model First Analytics; Streaming Analytics]
  • 16. 3 Core Patterns of Big Data Reservoir Success
    Maximizing benefits:
    1. Schema on Read
    2. Cheaper Compute
    3. Cheaper Storage
    Data Sandbox:
    – Leader: Line of Business (LoB)
    – Value: Faster access to business data, faster time to value on analytics
    – Innovation: Schema-on-read empowers rapid staging and Data Discovery
    ETL Offload:
    – Leader: Information Technology (IT)
    – Value: Cost avoidance on DW/Marts
    – Innovation: YARN/Hadoop empowers lower cost compute and lower cost storage
    Deep Data Storage:
    – Leader: Risk / Compliance (LoB)
    – Core Value: High fidelity aged data
    – Innovation: SQL on Hadoop engines enable very low cost, queryable data access
    [Diagram: DBMS (on prem or cloud); Sandbox; ETL Offload; Staging; Deep Data Storage. Leverage Wide Range of Modern Analytic Styles: Data First Analytics; Model First Analytics; Streaming Analytics]
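The schema-on-read idea behind the Data Sandbox pattern can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration (the sample events, the `CLICK_SCHEMA` and `REFERRAL_SCHEMA` mappings, and the `read_with_schema` helper are hypothetical and not part of any Oracle or Hortonworks product): raw records are stored untouched, and each reader applies its own schema only when it queries them.

```python
import json

# Raw events are kept exactly as they arrived; nothing is validated on write.
RAW_EVENTS = [
    '{"user": "a17", "page": "/pricing", "ms": 420}',
    '{"user": "b02", "page": "/docs", "ms": "95", "referrer": "search"}',
    'not-json-at-all',
]

# A schema is a mapping of field name -> converter, chosen by the reader.
CLICK_SCHEMA = {"user": str, "page": str, "ms": int}
REFERRAL_SCHEMA = {"page": str, "referrer": str}


def read_with_schema(raw_lines, schema):
    """Project each raw record onto the schema at read time.

    Records that cannot be parsed, or that lack a required field,
    are skipped rather than rejected at load time.
    """
    rows = []
    for line in raw_lines:
        try:
            record = json.loads(line)
            rows.append({name: cast(record[name]) for name, cast in schema.items()})
        except (ValueError, KeyError, TypeError):
            continue
    return rows


clicks = read_with_schema(RAW_EVENTS, CLICK_SCHEMA)
referrals = read_with_schema(RAW_EVENTS, REFERRAL_SCHEMA)
```

The same stored lines serve two different readers without being reloaded, which is the property the slide credits for rapid staging and data discovery.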
  • 17. Oracle Approach to Big Data Integration is Superior
    Oracle GoldenGate:
    – Non-invasive data capture
    – Low-latency data movement
    – Full or partial records staging
    – Most proven integration tool worldwide
    Oracle Data Integrator:
    – No ETL engine is required
    – Logical design separate from physical
    – Deploys in Hadoop or off cluster
    – Many options for movement
    Metadata & Glossary:
    – Search Driven
    – Business Friendly
    – Huge 3rd Party Support
    – Automated Metadata Stitching
    [Diagram: DBMS (on prem or cloud); Sandbox; ETL Offload; Staging; Deep Data Storage. Data Governance Foundation: Oracle Data Integrator (Transformation); Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate); Oracle GoldenGate (Movement); Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance); GoldenGate Veridata (Online Data Verification). Leverage Wide Range of Modern Analytic Styles: Data First Analytics; Model First Analytics; Streaming Analytics]
  • 18. Oracle GoldenGate Capabilities for Big Data
    Capture Database Transactions and Deliver to Big Data in Real-Time
    – HDFS (Files)
    – HBase (NoSQL)
    – Hive / Hive Streaming (SQL)
    – Flume & Storm (Streaming)
    – Kafka (MPP Pub/Sub)
    – Spark Streaming (Machine Learning)
    [Diagram: GoldenGate. Labels: Capture; Trail; Route; Deliver; Pump]
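The stage labels on this slide (Capture, Trail, Pump, Deliver) suggest a change-data-capture pipeline. The following is a toy Python sketch of that general pattern, not GoldenGate's actual API or file formats: the `Change` record, the function names, and the in-memory `deque` standing in for a trail are all hypothetical, and the Route stage is folded into the pump for brevity.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Change:
    seq: int   # commit order, assigned at capture time
    op: str    # "insert", "update" or "delete"
    key: str   # primary key of the affected row
    row: dict  # changed columns only


def capture(transaction_log):
    """Capture: turn committed log entries into ordered change records."""
    return [
        Change(seq, op, key, row)
        for seq, (op, key, row) in enumerate(transaction_log, start=1)
    ]


def pump(trail, batch_size):
    """Pump: drain the trail in commit order, a small batch at a time."""
    while trail:
        yield [trail.popleft() for _ in range(min(batch_size, len(trail)))]


def deliver(batch, target):
    """Deliver: apply each change to the target store."""
    for change in batch:
        if change.op == "delete":
            target.pop(change.key, None)
        else:
            # Inserts and updates both merge the changed columns into the row.
            target[change.key] = {**target.get(change.key, {}), **change.row}


def replicate(transaction_log, target, batch_size=2):
    """Run the whole capture -> trail -> pump -> deliver flow once."""
    trail = deque(capture(transaction_log))  # the trail buffers captured changes
    for batch in pump(trail, batch_size):
        deliver(batch, target)
    return target


LOG = [
    ("insert", "1", {"name": "Ann", "city": "Oslo"}),
    ("update", "1", {"city": "Rome"}),
    ("insert", "2", {"name": "Bo"}),
    ("delete", "2", {}),
]
replica = replicate(LOG, {})
```

Because only changes travel through the trail, the target stays current without rescanning the source tables, which is the contrast with batch loading that the deck draws.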
  • 19. Business Value of the GoldenGate Approach
    Oracle GoldenGate:
    Continuous Data Staging
    – Don’t make the business wait
    – CDC is the default, not an add-on
    – Least invasive on sources
    – Hadoop staging is fresh
    Integrated, Native Capture
    – Don’t create unnecessary risk
    – Keep current with DB patches
    – Certainty around licensing
    – Proven best performance
    Most Widely Proven
    – Thousands of customers
    – Most demanding high-volume workloads
    – Used for High Availability (HA)
    – Dependable results
    vs. …the “Other Vendors”:
    Batch Data Movement
    – Typical ETL vendors all default to batch data movement in their reference architectures
    – Changed Data is an immature add-on
    – ETL loading into Hadoop is mainly “batch mode”
    Clumsy & Risky Data Capture
    – Not in sync with Oracle Database versions
    – Some can “talk the talk” but their CDC tech can’t touch Oracle GoldenGate scale/performance
    – Patches and Licensing create business risk
    Niche, Low-End
    – Some vendors only cover a few platforms
    – Some vendors are broad, but don’t scale
    – Few vendors have the reliability and dependability to cover HA use cases
  • 20. Oracle Data Integrator (ODI) Capabilities for Big Data
    Map once at the logical level, and then choose which Big Data or Hadoop framework you want to run in!
    [Diagram labels: Logs; OLTP DB; API/File; NoSQL; Flume; SQOOP; OGG; ODI; Hive on MR, Tez, Spark; Pig on MR, Tez, Spark; Spark; Oozie; Hive/HCat, HDFS, HBase; Any DW; OEDQ (Data Validation & Cleansing); OEMM (Metadata Mgmt & Lineage)]
  • 21. Business Value of ODI: Low Cost and High Dev Efficiency
    – No ETL engine is required
    – Separation of Logical and Physical design
    – Physical exec on SQL, Hive, Pig, or Spark
    – Runtime exec in Oozie or via ODI Java Agent
    – Rich set of pre-built operators
    – User defined functions
    Eliminate your ETL Engines and improve Developer efficiency: now, everybody can be a Big Data developer!
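The separation of logical and physical design can be sketched in a few lines of Python. This is an illustration of the general idea only, not how ODI represents or executes mappings: `LogicalMapping`, `run_on_sql`, `run_in_memory`, and the sample `ORDERS` rows are hypothetical names, and the standard-library `sqlite3` module stands in for whichever SQL engine would do the pushdown work.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalMapping:
    """Engine-independent description: total a measure per group."""
    source: str
    group_by: str
    measure: str


def run_on_sql(mapping, connection):
    """Physical plan 1: generate SQL and push the work down to the engine.

    Identifiers are interpolated directly, so this sketch assumes they come
    from trusted design-time metadata, never from user input.
    """
    sql = (
        f"SELECT {mapping.group_by}, SUM({mapping.measure}) "
        f"FROM {mapping.source} GROUP BY {mapping.group_by}"
    )
    return dict(connection.execute(sql).fetchall())


def run_in_memory(mapping, tables):
    """Physical plan 2: execute the same mapping over plain Python rows."""
    totals = {}
    for row in tables[mapping.source]:
        key = row[mapping.group_by]
        totals[key] = totals.get(key, 0) + row[mapping.measure]
    return totals


ORDERS = [
    {"region": "EU", "amount": 10},
    {"region": "US", "amount": 7},
    {"region": "EU", "amount": 5},
]


def load_sqlite(rows):
    """Create an in-memory SQLite table holding the sample rows."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
    connection.executemany("INSERT INTO orders VALUES (:region, :amount)", rows)
    return connection


mapping = LogicalMapping(source="orders", group_by="region", measure="amount")
sql_result = run_on_sql(mapping, load_sqlite(ORDERS))
memory_result = run_in_memory(mapping, {"orders": ORDERS})
```

The mapping is written once and both executors produce the same totals; swapping the execution target does not touch the logical definition, which is the efficiency argument the slide makes.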
  • 22. Approaches to Big Data Integration
    1. Traditional ETL Tools (execute entirely outside of Hadoop)
    2. ETL Tools with Native “on” Hadoop (require proprietary code on Data Nodes)
    3. Manual Coding (ultimate flexibility, but at a very high cost)
    4. ODI Native in Hadoop (no ETL Engine & no Data Node footprint)
    * A small ODI Agent may optionally install off cluster or on the Name Node; no dependencies on Data Nodes
    [Diagram labels: Hadoop Cluster; HDFS; ETL; Manual Code; Spark; Sqoop; Hive; Pig; ODI; Oozie; GG; Data Flow; BEST]
  • 23. Oracle Metadata Management & Glossary for Big Data
    – Comprehensive Data Lineage
    – Business Friendly Navigation
    – Business & IT Collaboration
    – Easy to Use, Search Driven
  • 24. Value of Metadata and Business Glossary
    – My dashboard does not match this report… why?
    – Where did this data come from?
    – Where can I find the data I need for analytics?
    – Which ETL mappings or BI Reports will be affected by my column change?
    – What systems does the data flow through?
    [Diagram labels: TRUSTED DATA; IT CERTAINTY]
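Questions such as "where did this data come from?" and "what will my column change affect?" are graph traversals over recorded data flows. Here is a minimal Python sketch under stated assumptions: the `FLOWS` edges and all column and report names are invented for illustration, and the helpers are not part of Oracle Enterprise Metadata Management.

```python
from collections import deque

# Hypothetical metadata: each edge points from a source element to the
# elements that are derived from it (downstream direction).
FLOWS = {
    "crm.customers.email": ["staging.customers.email"],
    "staging.customers.email": ["dw.dim_customer.email"],
    "dw.dim_customer.email": ["report.churn_dashboard", "report.campaign_list"],
    "erp.orders.total": ["dw.fact_orders.total"],
    "dw.fact_orders.total": ["report.churn_dashboard"],
}


def reachable(start, edges):
    """Breadth-first search: every node reachable from start, excluding start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in edges.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def invert(edges):
    """Reverse every edge so that traversal runs upstream instead."""
    reverse = {}
    for source, targets in edges.items():
        for target in targets:
            reverse.setdefault(target, []).append(source)
    return reverse


def impact_of(element):
    """Impact analysis: everything downstream that a change could affect."""
    return reachable(element, FLOWS)


def lineage_of(element):
    """Lineage: everything upstream that feeds this element."""
    return reachable(element, invert(FLOWS))
```

For example, `impact_of("staging.customers.email")` lists the warehouse column and both reports that depend on it, while `lineage_of("report.churn_dashboard")` walks back to the CRM and ERP source columns.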
  • 25. Big Data Governance Lifecycle Tooling
    Support People and Processes with an end-to-end tooling capability!
    [Diagram labels: Operational Data Flows; Business Sources; Quality KPIs; Case Management; Governance Cockpit for Data Stewards & Stakeholders; Exception Review; Metadata Management; Business Glossary; Design Time]
  • 26. The Data Governance Opportunity with Big Data
    Solving business and IT data challenges
    …to manage Risk/Compliance
    – Records retention
    – Rediscovery
    – Litigation support
    – Data access management
    – Information security and protection
    Minimize corporate liability through proper governance of data
    …to drive Business Value
    – Metadata discovery
    – Metadata & glossary cataloging
    – Data profiling
    – Data cleansing lifecycle
    – Data remediation
    Maximize opportunity by ensuring trusted data is easily available for data-driven business processes
  • 27. Most Heterogeneous, Deep 3rd Party Coverage
    Operational Integration (Movement / Transformation): Hadoop HBase; Hadoop Hive/Flume; HP Enscribe; HP NonStop; HP Neoview; Hypersonic SQL; IBM DB2 i Series; IBM DB2 UDB; IBM DB2 z Series; IBM Informix; IBM Netezza; JMS / MQ; Microsoft Access; Microsoft SQL Server; MySQL; Pivotal Greenplum; PostgreSQL; Salesforce.com; SAP BW / BI; SAP ERP / ECC; SAS; SQL/MP; SQL/MX; Sybase ASE; Sybase IQ; Teradata
    Metadata Harvesting (Glossary, Lineage & Impact Analysis): Adaptive; Altova; Apache HCatalog; Apache Hive/HQL; Borland; CA ERwin; Cloudera Impala; COBOL Copybook; DataStax; Embarcadero; EMC ProActivity; GentleWare; Google BigQuery; Grandite; Hadapt Hive; Hortonworks Hive; IBM Cognos; IBM DB2; IBM DataStage; IBM Discovery; IBM Federation Server; IBM Lotus Notes; IBM Netezza; IBM Rational Rose; IBM Rational Architect; Informatica Metadata Mgr.; Informatica PowerCenter; CoSORT; ISO SQL Standard (DDL); MapR Hadoop Hive; MicroFocus; Microsoft Access; Microsoft Office Excel; Microsoft Visio; Microsoft SQL Server; Microsoft SSIS; Microsoft Visual Studio; MicroStrategy; Magic Draw; OMG CWM Standard; OMG UML Standard; Oracle BI Answers; Oracle BI Enterprise Edition; Oracle BI Server; Oracle DAC; Oracle Data Integrator; Oracle Data Modeler; Oracle Database; Oracle Designer; Oracle Hyperion Applications; Oracle Hyperion Essbase; Oracle Warehouse Builder; Pivotal Greenplum; PostgreSQL; QlikView; SAP BO Crystal Reports; SAP BO Designer; SAP BO Desktop Intelligence; SAP BO Repository; SAP BO Data Integrator; SAP BO Data Steward; SAP Master Data Management; SAP Sybase PowerDesigner; SAP Sybase ASE Database; SAS Data Integration Studio; SAS BI Server; SAS Information Map; SAS Metadata Management; SAS OLAP Server; Select; Sparx Architect; Syncsort; Tableau; Talend; Teradata; Tigris; Visible; W3C DTD & XSD Schema
    Oracle systems: Oracle Database; Oracle Exadata; Oracle Big Data Appliance; Oracle TimesTen; Oracle OLAP; Oracle Business Intelligence; Oracle BI Applications; Oracle E-Business Suite; Oracle JD Edwards Enterprise One; Oracle JD Edwards World; Oracle Fusion Applications; Oracle Governance Risk and Compliance; Oracle Fusion AIA; Oracle Retail Applications; Oracle Agile BI / DW; Oracle Agile PLM for Process; Oracle iFlex FlexCUBE; Oracle iFlex Mantas; Oracle Hyperion Applications; Oracle PeopleSoft; Oracle Siebel CRM / OnDemand; Oracle Communications; Oracle WebLogic Server; Oracle Coherence Data Grid; Oracle SOA Suite; Oracle Enterprise Service Bus; plus open APIs and a standards-based meta-model
    No other vendor can compare:
    – 50+ systems for Operational Integration
    – 70+ systems for Metadata Harvesting
  • 28. Differentiated Technical Approach from Oracle
    Dynamic Data Movement
    – Real-time by default, not ETL
    – Least invasive on sources
    – Proven best performance
    – Native Oracle integration
    No ETL Engines
    – Take processing to the data; don’t move the data
    – Leverage the data engines for workloads (Hadoop or SQL)
    Most Heterogeneous
    – Leverage open source Hadoop, not proprietary distributions
    – Hadoop is the Hub, not ETL tools
    – Open metadata standards
    Comprehensive capabilities for the end-to-end data integration and governance of all data, including Hadoop-based data
    [Diagram: Data Governance Foundation. Components: Oracle GoldenGate (Movement); Oracle Data Integrator (Transformation); Data Service Integrator (Federation); Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate); GoldenGate Veridata (Online Data Verification); Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance). Labels: Fast Load; ELT Processing on Hadoop or SQL; Continuous Availability]
  • 29. The Question & Answer session will be conducted electronically, using the panel to the right of your screen
    – About Oracle and Hortonworks: hortonworks.com/partner/oracle/
    – Get started with Hortonworks Sandbox: hortonworks.com/sandbox
    – Follow us: @hortonworks @Oracle
    – Learn more: Oracle.com/goto/dataintegration