SlideShare a Scribd company logo
info@rittmanmead.com www.rittmanmead.com @rittmanmead
Unlock the Value in your Big Data Reservoir using
Oracle Big Data Discover
Mark Rittman, CTO, Rittman Mead
IlOUG Tech Days 3, Tel Aviv, May 2016
info@rittmanmead.com www.rittmanmead.com @rittmanmead 2
•Mark Rittman, Co-Founder of Rittman Mead

‣Oracle ACE Director, specialising in Oracle BI&DW

‣14 Years Experience with Oracle Technology

‣Regular columnist for Oracle Magazine

•Author of two Oracle Press Oracle BI books

‣Oracle Business Intelligence Developers Guide

‣Oracle Exalytics Revealed

‣Writer for Rittman Mead Blog :

http://www.rittmanmead.com/blog

•Email : mark.rittman@rittmanmead.com

•Twitter : @markrittman
About the Speaker
info@rittmanmead.com www.rittmanmead.com @rittmanmead 3
•Many customers and organisations are now running initiatives around “big data”

•Some are IT-led and are looking for cost-savings around data warehouse storage + ETL

•Others are “skunkworks” projects in the marketing department that are now scaling-up

•Projects now emerging from pilot exercises

•And design patterns starting to emerge
Many Organisations are Running Big Data Initiatives
info@rittmanmead.com www.rittmanmead.com @rittmanmead 4
•Typical implementation of Hadoop and big data in an analytic context is the “data lake”

•Additional data storage platform with cheap storage, flexible schema support + compute

•Data lands in the data lake or reservoir in raw form, then minimally processed

•Data then accessed directly by “data scientists”, or processed further into DW
Common Big Data Design Pattern : “Data Reservoir”
info@rittmanmead.com www.rittmanmead.com @rittmanmead
Common Project Environment : The “Data Lab”
info@rittmanmead.com www.rittmanmead.com @rittmanmead 6
•Data loaded into the reservoir needs preparation and curation before presenting to users

•Specialist skills typically needed to ingest and understand data - and those staff are scarce

•How do we staff and scale projects as our use of big data matures?
But … Working with Unstructured Textual Data Is Hard
Hold on …
Haven't we heard this story before?
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle Big Data Discovery
info@rittmanmead.com www.rittmanmead.com @rittmanmead 10
•Part of the acquisition of Endeca back in 2012 by
Oracle Corporation

•Based on search technology and concept of
“faceted search”

•Data stored in flexible NoSQL-style in-memory
database called “Endeca Server”

•Added aggregation, text analytics and text
enrichment features for “data discovery”

‣Explore data in raw form, loose connections,
navigate via search rather than hierarchies

‣Useful to find out what is relevant and valuable in
a dataset before formal modeling
What Was Oracle Endeca Information Discovery?
the years pass…
2013
… to 2015
and even more importantly…
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle Big Data Discovery
info@rittmanmead.com www.rittmanmead.com @rittmanmead 16
•A visual front-end to the Hadoop data reservoir, providing end-user access to datasets

•Catalog, profile, analyse and combine schema-on-read datasets across the Hadoop cluster

•Visualize and search datasets to gain insights, potentially load in summary form into DW
Oracle Big Data Discovery
info@rittmanmead.com www.rittmanmead.com @rittmanmead 17
What Does Big Data Discovery Do?
•Provide a visual catalog and search function across data in the data reservoir

•Profile and understand data, relationships, data quality issues

•Apply simple changes, enrichment to incoming data

•Visualize datasets including combinations (joins)
info@rittmanmead.com www.rittmanmead.com @rittmanmead 18
•A catalog and view onto data sitting in Hadoop

•Data wrangling, enrichment and combining

•Visualization and Exploration of that data
Main Big Data Discovery Use-Cases
info@rittmanmead.com www.rittmanmead.com @rittmanmead
•Visual Analyzer, DVCS, DVD are also free-
form data visualisation tools

‣Single-pane-of-glass view onto data

‣Drill, export and visualise your data

•Crucially though these tools work on 

data that’s already structured and “known”

‣Data sources are either the OBIEE RPD, or

files extracted from other structured
systems

•Oracle BDD is for stage before this, when
data is in a “raw” state and landed in
Hadoop

‣See the shape, value and integration points

‣Structure, enrich and transform
Oracle Big Data Discovery vs. Oracle Data Visualisation
info@rittmanmead.com www.rittmanmead.com @rittmanmead 20
•Rittman Mead want to understand drivers and audience for their website

‣What is our most popular content? Who are the most in-demand blog authors?

‣Who are the influencers? What do they read? 

•Three data sources in scope:
Example Scenario : Social Media Analysis
RM Website Logs Twitter Stream Website Posts, Comments etc
info@rittmanmead.com www.rittmanmead.com @rittmanmead 21
•Datasets in Hive have to be ingested into DGraph engine before analysis, transformation

•Can either define an automatic Hive table detector process, or manually upload

•Typically ingests 1m row random sample

‣1m row sample provides > 99% confidence that answer is within 2% of value shown

no matter how big the full dataset (1m, 1b, 1q+)

‣Makes interactivity cheap - representative dataset
Ingesting & Sampling Datasets for the DGraph Engine
info@rittmanmead.com www.rittmanmead.com @rittmanmead 22
•Ingestion process has automatically geo-coded host IP addresses

•Other automatic enrichments run after initial discovery step, based on datatypes, content
Automatic Enrichment of Ingested Datasets
info@rittmanmead.com www.rittmanmead.com @rittmanmead 23
•Data ingest process automatically applies some enrichments - geocoding etc

•Can apply others from Transformation page - simple transformations & Groovy expressions
Data Transformation & Enrichment
info@rittmanmead.com www.rittmanmead.com @rittmanmead 24
•Users can upload their own datasets into BDD, from MS Excel or CSV file

•Uploaded data is first loaded into Hive table, then sampled/ingested as normal
Upload Additional Datasets
1
2
3
info@rittmanmead.com www.rittmanmead.com @rittmanmead 25
•Used to create a dataset based on the intersection (typically) of two datasets

•Not required to just view two or more datasets together - think of this as a JOIN and
SELECT
Join Datasets On Common Attributes
info@rittmanmead.com www.rittmanmead.com @rittmanmead 26
Visualize and Interact With Hadoop Datasets
info@rittmanmead.com www.rittmanmead.com @rittmanmead 27
•BDD Studio dashboards support faceted search across all attributes, refinements

•Auto-filter dashboard contents on selected attribute values - for data discovery

•Fast analysis and summarisation through Endeca Server technology
Faceted Search Across Entire Data Reservoir
Further refinement on

“OBIEE” in post keywords
3
Results now filtered

on two refinements
4
info@rittmanmead.com www.rittmanmead.com @rittmanmead 28
•Transformations within BDD can then be used to create curated fact + dim Hive tables

•Can be used then as a more suitable dataset for use with OBIEE RPD + Visual Analyzer

•Or exported then in to Exadata or Exalytics to combine with main DW datasets
Export Prepared Datasets Back to Hive, for OBIEE + VA
info@rittmanmead.com www.rittmanmead.com @rittmanmead 29
•Users in Visual Analyzer then have

a more structured dataset to use

•Data organised into dimensions, 

facts, hierarchies and attributes

•Can still access Hadoop directly

through Impala or Big Data SQL

•Big Data Discovery though was 

key to initial understanding of data
Further Analyse in Visual Analyzer for Managed Dataset
hang on a moment…
… what about her?
or him?
I’m too
sexy…
How do I turn Christian Berg…
Into Christian Berg-Tierney?
I’m too
sexy…
info@rittmanmead.com www.rittmanmead.com @rittmanmead
BDD 1.2.0 Release Theme : “Developer Productivity”
35
info@rittmanmead.com www.rittmanmead.com @rittmanmead 36
•Useful to create weekly stats out of daily ones,
for example for logistic regression analysis

•Roll up low-level data to higher grains

•Logs -> sessions -> customers

•Intuitive UI helps analysts find the right grains

•Execute at full scale using Spark

•Results can be sampled or indexed in full
New Features to Aggregate and Join Datasets
info@rittmanmead.com www.rittmanmead.com @rittmanmead
•Blend huge datasets right in BDD, UI to support experimentation, preview

•Execute at scale with Spark, results can be sampled or indexed in full

•Useful for joining-back to same dataset to calculate change since last month/week
Join Datasets in BDD (incl. Joinback)
info@rittmanmead.com www.rittmanmead.com @rittmanmead
•Access Python & Spark packages, run
against BDD datasets

•Pandas, Numpy, SparkML, Spark SQL

•Suddenly Big Data Discovery gets useful

•Use BDD’s cataloging and enrichment

•Blend and join datasets

•Use them as datasources into Python

and Spark data science algorithms

•The best of both worlds
Most Significant for Data Scientists : pySpark Shell
info@rittmanmead.com www.rittmanmead.com @rittmanmead
•Use .toSpark and .toDF Spark transformations to turn
BDD datasets into Spark RDDs and Dataframes

•Then access Spark SQL set-based queries and JDBC
query federation to further manipulate + join data

•Use Spark MLlib to access machine-learning features

•Classification

•Clustering

•Collaborative Learning
Access to Spark SQL and Spark MLlib
info@rittmanmead.com www.rittmanmead.com @rittmanmead
•Using BDD + wearable + home IoT datasets to create a quantified-self project

•Use machine learning (via BDD shell) to optimise my life for health and mood
Watch This Space … at KScope’16
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle Big Data Discovery
info@rittmanmead.com www.rittmanmead.com @rittmanmead 42
•Articles on the Rittman Mead Blog

‣http://www.rittmanmead.com/category/oracle-big-data-appliance/

‣http://www.rittmanmead.com/category/big-data/

‣http://www.rittmanmead.com/category/oracle-big-data-discovery/

•Rittman Mead offer consulting, training and managed services for Oracle Big Data

‣Oracle & Cloudera partners

‣http://www.rittmanmead.com/bigdata
Additional Resources
info@rittmanmead.com www.rittmanmead.com @rittmanmead
Unlock the Value in your Big Data Reservoir using
Oracle Big Data Discover
Mark Rittman, CTO, Rittman Mead
IlOUG Tech Days 3, Tel Aviv, May 2016
Ad

More Related Content

What's hot (20)

Hadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data ArchitecturesHadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data Architectures
DataWorks Summit
 
SplunkSummit 2015 - Real World Big Data Architecture
SplunkSummit 2015 -  Real World Big Data ArchitectureSplunkSummit 2015 -  Real World Big Data Architecture
SplunkSummit 2015 - Real World Big Data Architecture
Splunk
 
Big Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data LakesBig Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data Lakes
Denodo
 
Big Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platformBig Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platform
Caserta
 
Cortana Analytics Suite
Cortana Analytics SuiteCortana Analytics Suite
Cortana Analytics Suite
James Serra
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which
DataWorks Summit
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0
DataWorks Summit
 
Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks
DataWorks Summit/Hadoop Summit
 
Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015
DataWorks Summit
 
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Dipti Borkar
 
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big DataMicrosoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Hortonworks
 
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
Mark Rittman
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
James Serra
 
Microsoft cloud big data strategy
Microsoft cloud big data strategyMicrosoft cloud big data strategy
Microsoft cloud big data strategy
James Serra
 
Azure HDInsight
Azure HDInsightAzure HDInsight
Azure HDInsight
Koray Kocabas
 
Big data on Azure for Architects
Big data on Azure for ArchitectsBig data on Azure for Architects
Big data on Azure for Architects
Tomasz Kopacz
 
Big Data with Azure
Big Data with AzureBig Data with Azure
Big Data with Azure
Aaron (Ari) Bornstein
 
Navigating the World of User Data Management and Data Discovery
Navigating the World of User Data Management and Data DiscoveryNavigating the World of User Data Management and Data Discovery
Navigating the World of User Data Management and Data Discovery
DataWorks Summit/Hadoop Summit
 
Data warehouse con azure synapse analytics
Data warehouse con azure synapse analyticsData warehouse con azure synapse analytics
Data warehouse con azure synapse analytics
Eduardo Castro
 
Azure Big data
Azure Big data Azure Big data
Azure Big data
Michel HUBERT
 
Hadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data ArchitecturesHadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data Architectures
DataWorks Summit
 
SplunkSummit 2015 - Real World Big Data Architecture
SplunkSummit 2015 -  Real World Big Data ArchitectureSplunkSummit 2015 -  Real World Big Data Architecture
SplunkSummit 2015 - Real World Big Data Architecture
Splunk
 
Big Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data LakesBig Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data Lakes
Denodo
 
Big Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platformBig Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platform
Caserta
 
Cortana Analytics Suite
Cortana Analytics SuiteCortana Analytics Suite
Cortana Analytics Suite
James Serra
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which
DataWorks Summit
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0
DataWorks Summit
 
Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015
DataWorks Summit
 
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Dipti Borkar
 
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big DataMicrosoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Hortonworks
 
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
Mark Rittman
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
James Serra
 
Microsoft cloud big data strategy
Microsoft cloud big data strategyMicrosoft cloud big data strategy
Microsoft cloud big data strategy
James Serra
 
Big data on Azure for Architects
Big data on Azure for ArchitectsBig data on Azure for Architects
Big data on Azure for Architects
Tomasz Kopacz
 
Navigating the World of User Data Management and Data Discovery
Navigating the World of User Data Management and Data DiscoveryNavigating the World of User Data Management and Data Discovery
Navigating the World of User Data Management and Data Discovery
DataWorks Summit/Hadoop Summit
 
Data warehouse con azure synapse analytics
Data warehouse con azure synapse analyticsData warehouse con azure synapse analytics
Data warehouse con azure synapse analytics
Eduardo Castro
 

Similar to IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle Big Data Discovery (20)

Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
Mark Rittman
 
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
Mark Rittman
 
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsBig Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Mark Rittman
 
Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...
Mark Rittman
 
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business AnalyticsOracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Mark Rittman
 
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Rittman Analytics
 
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Mark Rittman
 
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Mark Rittman
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop : Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Mark Rittman
 
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Mark Rittman
 
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
Practical Tips for Oracle Business Intelligence Applications 11g ImplementationsPractical Tips for Oracle Business Intelligence Applications 11g Implementations
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
Michael Rainey
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
Institute of Contemporary Sciences
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
The Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopThe Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of Hadoop
Inside Analysis
 
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
DATAVERSITY
 
Big Data Strategy for the Relational World
Big Data Strategy for the Relational World Big Data Strategy for the Relational World
Big Data Strategy for the Relational World
Andrew Brust
 
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
Mark Rittman
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter Point
Inside Analysis
 
Using Oracle Big Data Discovey as a Data Scientist's Toolkit
Using Oracle Big Data Discovey as a Data Scientist's ToolkitUsing Oracle Big Data Discovey as a Data Scientist's Toolkit
Using Oracle Big Data Discovey as a Data Scientist's Toolkit
Mark Rittman
 
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
Mark Rittman
 
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
Mark Rittman
 
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsBig Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Mark Rittman
 
Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...
Mark Rittman
 
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business AnalyticsOracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Mark Rittman
 
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Rittman Analytics
 
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Mark Rittman
 
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Mark Rittman
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop : Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Mark Rittman
 
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Mark Rittman
 
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
Practical Tips for Oracle Business Intelligence Applications 11g ImplementationsPractical Tips for Oracle Business Intelligence Applications 11g Implementations
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
Michael Rainey
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
Institute of Contemporary Sciences
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
The Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopThe Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of Hadoop
Inside Analysis
 
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
DATAVERSITY
 
Big Data Strategy for the Relational World
Big Data Strategy for the Relational World Big Data Strategy for the Relational World
Big Data Strategy for the Relational World
Andrew Brust
 
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
Mark Rittman
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter Point
Inside Analysis
 
Using Oracle Big Data Discovey as a Data Scientist's Toolkit
Using Oracle Big Data Discovey as a Data Scientist's ToolkitUsing Oracle Big Data Discovey as a Data Scientist's Toolkit
Using Oracle Big Data Discovey as a Data Scientist's Toolkit
Mark Rittman
 
Ad

More from Mark Rittman (18)

The Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data PlatformsThe Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data Platforms
Mark Rittman
 
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle CloudOTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
Mark Rittman
 
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
Mark Rittman
 
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
Mark Rittman
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Mark Rittman
 
Deploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle CloudDeploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle Cloud
Mark Rittman
 
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Mark Rittman
 
What is Big Data Discovery, and how it complements traditional business anal...
What is Big Data Discovery, and how it complements  traditional business anal...What is Big Data Discovery, and how it complements  traditional business anal...
What is Big Data Discovery, and how it complements traditional business anal...
Mark Rittman
 
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Mark Rittman
 
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Mark Rittman
 
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
Mark Rittman
 
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
Mark Rittman
 
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODIBIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
Mark Rittman
 
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI ProjectsOGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
Mark Rittman
 
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12cUKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
Mark Rittman
 
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Mark Rittman
 
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11gPart 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
Mark Rittman
 
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12cPart 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
Mark Rittman
 
The Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data PlatformsThe Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data Platforms
Mark Rittman
 
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle CloudOTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
Mark Rittman
 
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
Mark Rittman
 
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
Mark Rittman
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Mark Rittman
 
Deploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle CloudDeploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle Cloud
Mark Rittman
 
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Mark Rittman
 
What is Big Data Discovery, and how it complements traditional business anal...
What is Big Data Discovery, and how it complements  traditional business anal...What is Big Data Discovery, and how it complements  traditional business anal...
What is Big Data Discovery, and how it complements traditional business anal...
Mark Rittman
 
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Mark Rittman
 
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Mark Rittman
 
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
Mark Rittman
 
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
Mark Rittman
 
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODIBIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
Mark Rittman
 
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI ProjectsOGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
Mark Rittman
 
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12cUKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
Mark Rittman
 
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Mark Rittman
 
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11gPart 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
Mark Rittman
 
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12cPart 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
Mark Rittman
 
Ad

Recently uploaded (20)

MASAkkjjkttuyrdquesjhjhjfc44dddtions.docx
MASAkkjjkttuyrdquesjhjhjfc44dddtions.docxMASAkkjjkttuyrdquesjhjhjfc44dddtions.docx
MASAkkjjkttuyrdquesjhjhjfc44dddtions.docx
santosh162
 
Calories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptxCalories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptx
TijiLMAHESHWARI
 
computer organization and assembly language.docx
computer organization and assembly language.docxcomputer organization and assembly language.docx
computer organization and assembly language.docx
alisoftwareengineer1
 
KNN_Logistic_Regression_Presentation_Styled.pptx
KNN_Logistic_Regression_Presentation_Styled.pptxKNN_Logistic_Regression_Presentation_Styled.pptx
KNN_Logistic_Regression_Presentation_Styled.pptx
sonujha1980712
 
How iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost FundsHow iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost Funds
ireneschmid345
 
History of Science and Technologyandits source.pptx
History of Science and Technologyandits source.pptxHistory of Science and Technologyandits source.pptx
History of Science and Technologyandits source.pptx
balongcastrojo
 
Data Science Courses in India iim skills
Data Science Courses in India iim skillsData Science Courses in India iim skills
Data Science Courses in India iim skills
dharnathakur29
 
Induction Program of MTAB online session
Induction Program of MTAB online sessionInduction Program of MTAB online session
Induction Program of MTAB online session
LOHITH886892
 
Andhra Pradesh Micro Irrigation Project”
Andhra Pradesh Micro Irrigation Project”Andhra Pradesh Micro Irrigation Project”
Andhra Pradesh Micro Irrigation Project”
vzmcareers
 
brainstorming-techniques-infographics.pptx
brainstorming-techniques-infographics.pptxbrainstorming-techniques-infographics.pptx
brainstorming-techniques-infographics.pptx
maritzacastro321
 
Simple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptxSimple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptx
ssuser2aa19f
 
Defense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptxDefense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptx
Greg Makowski
 
04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story
ccctableauusergroup
 
Cleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdfCleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdf
alcinialbob1234
 
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnTemplate_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
cegiver630
 
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
gmuir1066
 
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptxmd-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
fatimalazaar2004
 
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your CompetitorsAI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
Contify
 
Conic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptxConic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptx
taiwanesechetan
 
PRE-NATAL GRnnnmnnnnmmOWTH seminar[1].pptx
PRE-NATAL GRnnnmnnnnmmOWTH seminar[1].pptxPRE-NATAL GRnnnmnnnnmmOWTH seminar[1].pptx
PRE-NATAL GRnnnmnnnnmmOWTH seminar[1].pptx
JayeshTaneja4
 
MASAkkjjkttuyrdquesjhjhjfc44dddtions.docx
MASAkkjjkttuyrdquesjhjhjfc44dddtions.docxMASAkkjjkttuyrdquesjhjhjfc44dddtions.docx
MASAkkjjkttuyrdquesjhjhjfc44dddtions.docx
santosh162
 
Calories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptxCalories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptx
TijiLMAHESHWARI
 
computer organization and assembly language.docx
computer organization and assembly language.docxcomputer organization and assembly language.docx
computer organization and assembly language.docx
alisoftwareengineer1
 
KNN_Logistic_Regression_Presentation_Styled.pptx
KNN_Logistic_Regression_Presentation_Styled.pptxKNN_Logistic_Regression_Presentation_Styled.pptx
KNN_Logistic_Regression_Presentation_Styled.pptx
sonujha1980712
 
How iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost FundsHow iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost Funds
ireneschmid345
 
History of Science and Technologyandits source.pptx
History of Science and Technologyandits source.pptxHistory of Science and Technologyandits source.pptx
History of Science and Technologyandits source.pptx
balongcastrojo
 
Data Science Courses in India iim skills
Data Science Courses in India iim skillsData Science Courses in India iim skills
Data Science Courses in India iim skills
dharnathakur29
 
Induction Program of MTAB online session
Induction Program of MTAB online sessionInduction Program of MTAB online session
Induction Program of MTAB online session
LOHITH886892
 
Andhra Pradesh Micro Irrigation Project”
Andhra Pradesh Micro Irrigation Project”Andhra Pradesh Micro Irrigation Project”
Andhra Pradesh Micro Irrigation Project”
vzmcareers
 
brainstorming-techniques-infographics.pptx
brainstorming-techniques-infographics.pptxbrainstorming-techniques-infographics.pptx
brainstorming-techniques-infographics.pptx
maritzacastro321
 
Simple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptxSimple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptx
ssuser2aa19f
 
Defense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptxDefense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptx
Greg Makowski
 
04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story
ccctableauusergroup
 
Cleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdfCleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdf
alcinialbob1234
 
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnTemplate_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
cegiver630
 
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
gmuir1066
 
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptxmd-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
fatimalazaar2004
 
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your CompetitorsAI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
Contify
 
Conic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptxConic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptx
taiwanesechetan
 
PRE-NATAL GRnnnmnnnnmmOWTH seminar[1].pptx
PRE-NATAL GRnnnmnnnnmmOWTH seminar[1].pptxPRE-NATAL GRnnnmnnnnmmOWTH seminar[1].pptx
PRE-NATAL GRnnnmnnnnmmOWTH seminar[1].pptx
JayeshTaneja4
 

IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle Big Data Discovery

  • 1. [email protected] www.rittmanmead.com @rittmanmead Unlock the Value in your Big Data Reservoir using Oracle Big Data Discover Mark Rittman, CTO, Rittman Mead IlOUG Tech Days 3, Tel Aviv, May 2016
  • 2. [email protected] www.rittmanmead.com @rittmanmead 2 •Mark Rittman, Co-Founder of Rittman Mead ‣Oracle ACE Director, specialising in Oracle BI&DW ‣14 Years Experience with Oracle Technology ‣Regular columnist for Oracle Magazine •Author of two Oracle Press Oracle BI books ‣Oracle Business Intelligence Developers Guide ‣Oracle Exalytics Revealed ‣Writer for Rittman Mead Blog :
 http://www.rittmanmead.com/blog •Email : [email protected] •Twitter : @markrittman About the Speaker
  • 3. [email protected] www.rittmanmead.com @rittmanmead 3 •Many customers and organisations are now running initiatives around “big data” •Some are IT-led and are looking for cost-savings around data warehouse storage + ETL •Others are “skunkworks” projects in the marketing department that are now scaling-up •Projects now emerging from pilot exercises •And design patterns starting to emerge Many Organisations are Running Big Data Initiatives
  • 4. [email protected] www.rittmanmead.com @rittmanmead 4 •Typical implementation of Hadoop and big data in an analytic context is the “data lake” •Additional data storage platform with cheap storage, flexible schema support + compute •Data lands in the data lake or reservoir in raw form, then minimally processed •Data then accessed directly by “data scientists”, or processed further into DW Common Big Data Design Pattern : “Data Reservoir”
  • 5. [email protected] www.rittmanmead.com @rittmanmead Common Project Environment : The “Data Lab”
  • 6. [email protected] www.rittmanmead.com @rittmanmead 6 •Data loaded into the reservoir needs preparation and curation before presenting to users •Specialist skills typically needed to ingest and understand data - and those staff are scarce •How do we staff and scale projects as our use of big data matures? But … Working with Unstructured Textual Data Is Hard
  • 8. Haven't we heard this story before?
  • 10. [email protected] www.rittmanmead.com @rittmanmead 10 •Part of the acquisition of Endeca back in 2012 by Oracle Corporation •Based on search technology and concept of “faceted search” •Data stored in flexible NoSQL-style in-memory database called “Endeca Server” •Added aggregation, text analytics and text enrichment features for “data discovery” ‣Explore data in raw form, loose connections, navigate via search rather than hierarchies ‣Useful to find out what is relevant and valuable in a dataset before formal modeling What Was Oracle Endeca Information Discovery?
  • 12. 2013
  • 14. and even more importantly…
  • 16. [email protected] www.rittmanmead.com @rittmanmead 16 •A visual front-end to the Hadoop data reservoir, providing end-user access to datasets •Catalog, profile, analyse and combine schema-on-read datasets across the Hadoop cluster •Visualize and search datasets to gain insights, potentially load in summary form into DW Oracle Big Data Discovery
  • 17. [email protected] www.rittmanmead.com @rittmanmead 17 What Does Big Data Discovery Do? •Provide a visual catalog and search function across data in the data reservoir •Profile and understand data, relationships, data quality issues •Apply simple changes, enrichment to incoming data •Visualize datasets including combinations (joins)
  • 18. [email protected] www.rittmanmead.com @rittmanmead 18 •A catalog and view onto data sitting in Hadoop •Data wrangling, enrichment and combining •Visualization and Exploration of that data Main Big Data Discovery Use-Cases
  • 19. [email protected] www.rittmanmead.com @rittmanmead •Visual Analyzer, DVCS, DVD are also free- form data visualisation tools ‣Single-pane-of-glass view onto data ‣Drill, export and visualise your data •Crucially though these tools work on 
 data that’s already structured and “known” ‣Data sources are either the OBIEE RPD, or
 files extracted from other structured systems •Oracle BDD is for stage before this, when data is in a “raw” state and landed in Hadoop ‣See the shape, value and integration points ‣Structure, enrich and transform Oracle Big Data Discovery vs. Oracle Data Visualisation
  • 20. [email protected] www.rittmanmead.com @rittmanmead 20 •Rittman Mead want to understand drivers and audience for their website ‣What is our most popular content? Who are the most in-demand blog authors? ‣Who are the influencers? What do they read? •Three data sources in scope: Example Scenario : Social Media Analysis RM Website Logs Twitter Stream Website Posts, Comments etc
  • 21. [email protected] www.rittmanmead.com @rittmanmead 21 •Datasets in Hive have to be ingested into DGraph engine before analysis, transformation •Can either define an automatic Hive table detector process, or manually upload •Typically ingests 1m row random sample ‣1m row sample provides > 99% confidence that answer is within 2% of value shown
 no matter how big the full dataset (1m, 1b, 1q+) ‣Makes interactivity cheap - representative dataset Ingesting & Sampling Datasets for the DGraph Engine
  • 22. [email protected] www.rittmanmead.com @rittmanmead 22 •Ingestion process has automatically geo-coded host IP addresses •Other automatic enrichments run after initial discovery step, based on datatypes, content Automatic Enrichment of Ingested Datasets
  • 23. [email protected] www.rittmanmead.com @rittmanmead 23 •Data ingest process automatically applies some enrichments - geocoding etc •Can apply others from Transformation page - simple transformations & Groovy expressions Data Transformation & Enrichment
  • 24. [email protected] www.rittmanmead.com @rittmanmead 24 •Users can upload their own datasets into BDD, from MS Excel or CSV file •Uploaded data is first loaded into Hive table, then sampled/ingested as normal Upload Additional Datasets 1 2 3
  • 25. [email protected] www.rittmanmead.com @rittmanmead 25 •Used to create a dataset based on the intersection (typically) of two datasets •Not required to just view two or more datasets together - think of this as a JOIN and SELECT Join Datasets On Common Attributes
  • 26. [email protected] www.rittmanmead.com @rittmanmead 26 Visualize and Interact With Hadoop Datasets
  • 27. [email protected] www.rittmanmead.com @rittmanmead 27 •BDD Studio dashboards support faceted search across all attributes, refinements •Auto-filter dashboard contents on selected attribute values - for data discovery •Fast analysis and summarisation through Endeca Server technology Faceted Search Across Entire Data Reservoir Further refinement on
 “OBIEE” in post keywords 3 Results now filtered
 on two refinements 4
  • 28. [email protected] www.rittmanmead.com @rittmanmead 28 •Transformations within BDD can then be used to create curated fact + dim Hive tables •Can be used then as a more suitable dataset for use with OBIEE RPD + Visual Analyzer •Or exported then in to Exadata or Exalytics to combine with main DW datasets Export Prepared Datasets Back to Hive, for OBIEE + VA
  • 29. [email protected] www.rittmanmead.com @rittmanmead 29 •Users in Visual Analyzer then have
 a more structured dataset to use •Data organised into dimensions, 
 facts, hierarchies and attributes •Can still access Hadoop directly
 through Impala or Big Data SQL •Big Data Discovery though was 
 key to initial understanding of data Further Analyse in Visual Analyzer for Managed Dataset
  • 30. hang on a moment…
  • 33. How do I turn Christian Berg…
  • 35. [email protected] www.rittmanmead.com @rittmanmead BDD 1.2.0 Release Theme : “Developer Productivity” 35
  • 36. [email protected] www.rittmanmead.com @rittmanmead 36 •Useful to create weekly stats out of daily ones, for example for logistic regression analysis •Roll up low-level data to higher grains •Logs -> sessions -> customers •Intuitive UI helps analysts find the right grains •Execute at full scale using Spark •Results can be sampled or indexed in full New Features to Aggregate and Join Datasets
  • 37. [email protected] www.rittmanmead.com @rittmanmead •Blend huge datasets right in BDD, UI to support experimentation, preview •Execute at scale with Spark, results can be sampled or indexed in full •Useful for joining-back to same dataset to calculate change since last month/week Join Datasets in BDD (incl. Joinback)
  • 38. [email protected] www.rittmanmead.com @rittmanmead •Access Python & Spark packages, run against BDD datasets •Pandas, Numpy, SparkML, Spark SQL •Suddenly Big Data Discovery gets useful •Use BDD’s cataloging and enrichment •Blend and join datasets •Use them as datasources into Python
 and Spark data science algorithms •The best of both worlds Most Significant for Data Scientists : pySpark Shell
  • 39. [email protected] www.rittmanmead.com @rittmanmead •Use .toSpark and .toDF Spark transformations to turn BDD datasets into Spark RDDs and Dataframes •Then access Spark SQL set-based queries and JDBC query federation to further manipulate + join data •Use Spark MLlib to access machine-learning features •Classification •Clustering •Collaborative Learning Access to Spark SQL and Spark MLlib
  • 40. [email protected] www.rittmanmead.com @rittmanmead •Using BDD + wearable + home IoT datasets to create a quantified-self project •Use machine learning (via BDD shell) to optimise my life for health and mood Watch This Space … at KScope’16
  • 42. [email protected] www.rittmanmead.com @rittmanmead 42 •Articles on the Rittman Mead Blog ‣http://www.rittmanmead.com/category/oracle-big-data-appliance/ ‣http://www.rittmanmead.com/category/big-data/ ‣http://www.rittmanmead.com/category/oracle-big-data-discovery/ •Rittman Mead offer consulting, training and managed services for Oracle Big Data ‣Oracle & Cloudera partners ‣http://www.rittmanmead.com/bigdata Additional Resources
  • 43. [email protected] www.rittmanmead.com @rittmanmead Unlock the Value in your Big Data Reservoir using Oracle Big Data Discover Mark Rittman, CTO, Rittman Mead IlOUG Tech Days 3, Tel Aviv, May 2016