Hortonworks: We Do Hadoop.
Our mission is to enable your Modern Data Architecture
by Delivering Enterprise Apache Hadoop

Emil A. Siemes
esiemes@hortonworks.com
Solution Engineer
January 2014
Our Mission: Enable your Modern Data Architecture by
Delivering Enterprise Apache Hadoop

Our Commitment
Headquarters: Palo Alto, CA
Employees: 300+ and growing

Open Leadership
Drive innovation in the open exclusively via the
Apache community-driven open source process

Trusted Partners

Enterprise Rigor
Engineer, test and certify Apache Hadoop with
the enterprise in mind

Ecosystem Endorsement
Focus on deep integration with existing data
center technologies and skills

Page 2
A Traditional Approach Under Pressure

[Diagram: three-layer data architecture]
APPLICATIONS: Custom Applications, Business Analytics, Packaged Applications
DATA SYSTEM: RDBMS, EDW, MPP (repositories)
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)

• 2.8 ZB in 2012
• 85% from New Data Types
• 15x Machine Data by 2020
• 40 ZB by 2020
Source: IDC
Page 3
Emerging Modern Data Architecture

[Diagram: three-layer data architecture with tooling on both sides]
APPLICATIONS: Custom Applications, Business Analytics, Packaged Applications
DEV & DATA TOOLS: Build & Test
OPERATIONAL TOOLS: Manage & Monitor
DATA SYSTEM: RDBMS, EDW, MPP (repositories)
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
Page 4
Drivers of Hadoop Adoption

New Business
Applications
From NEW types of
Data (or existing
types for longer)

Page 5
Most Common NEW TYPES OF DATA
1. Sentiment
Understand how your customers feel about your brand and
products – right now

2. Clickstream
Capture and analyze website visitors’ data trails and
optimize your website

3. Sensor/Machine
Discover patterns in data streaming automatically from
remote sensors and machines

4. Geographic
Analyze location-based data to manage operations where
they occur

5. Server Logs
Research logs to diagnose process failures and prevent
security breaches

6. Unstructured (text, video, pictures, etc.)
Understand patterns in files across millions of web pages,
emails, and documents

+ Keep existing
data longer!

Drivers of Hadoop Adoption

New Business Applications

Architectural: A Modern Data Architecture
Complement your existing data systems: the right workload in the
right place

Page 7
Let’s build a Data Lake…
Instructions on:
hadoopwrangler.com

Page 8
HDP Data Lake Solution Architecture

[Architecture diagram; recoverable labels grouped by step]
Manage Steps 1-4: Data Lifecycle with Falcon (data pipeline & flow management)

Step 1: Extract & Load
– Source data: ClickStream Data, Sales Transaction Data, Product Data, Marketing/Inventory, Social Data, EDW
– Ingestion: File, JMS, REST, HTTP, Streaming, NFS, WebHDFS, Sqoop, Flume, Storm

Step 2: Model/Apply Metadata
– HCatalog (table & user-defined metadata)

Step 3: Transform, Aggregate & Materialize
– Hive, Pig

Step 4: Schedule and Orchestrate
– Oozie (batch scheduler)

Use Case Type 1: Materialize & Exchange
– Exchange via Sqoop/Hive to downstream data sources: EDW (Teradata), OLTP, HBase (HBase Client)

Use Case Type 2: Explore/Visualize
– Interactive: Hive Server (Tez/Stinger)
– Query/Analytics/Reporting Tools: Tableau/Excel, Datameer/Platfora/SAP

Data Lake HDP Grid
– Compute & storage: YARN, MR2, Tez (data processing), HDFS, HBase
– YARN Apps: Stream Processing, Real-time Search, MPI
– Other components shown: Mahout, Elastic Search, SAS
– Ambari
– Knox – Perimeter Level Security

Opens up Hadoop to many new use cases
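
The four numbered steps in the architecture can be read as a simple pipeline. Below is a toy sketch in plain Python that mirrors the order of the steps on a few in-memory rows. It does not use any Hadoop API: the function names, the sample data and the column names are all hypothetical stand-ins (roughly, `extract_and_load` for ingestion, `apply_schema` for table metadata, `aggregate` for a transform job, `run_pipeline` for the scheduler).

```python
import csv
from collections import defaultdict

# Hypothetical raw clickstream extract; in the architecture this would land in HDFS.
RAW_CLICKS = "user,page,ms\nu1,/home,120\nu2,/home,80\nu1,/buy,300\n"


def extract_and_load(raw_text):
    # Step 1: land the raw data unchanged, one line per record.
    return raw_text.splitlines()


def apply_schema(lines):
    # Step 2: attach column names and types to the raw lines.
    reader = csv.DictReader(lines)
    return [
        {"user": r["user"], "page": r["page"], "ms": int(r["ms"])}
        for r in reader
    ]


def aggregate(rows):
    # Step 3: transform and aggregate, here total time spent per page.
    totals = defaultdict(int)
    for row in rows:
        totals[row["page"]] += row["ms"]
    return dict(totals)


def run_pipeline(raw_text):
    # Step 4: run the steps in a fixed order, as a scheduler would.
    return aggregate(apply_schema(extract_and_load(raw_text)))


if __name__ == "__main__":
    print(run_pipeline(RAW_CLICKS))
```

Keeping the raw landing step separate from the schema step reflects the idea on the slide that data is loaded first and modeled afterwards.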
Page 9
Hadoop 2: The Introduction of YARN
Store all data in a single place, interact in multiple ways

1st Gen of Hadoop: Single Use System (Batch Apps)
• MapReduce (cluster resource management & data processing)
• HDFS (redundant, reliable storage)

HADOOP 2: Multi Use Data Platform (Batch, Interactive, Online, Streaming, …)
• Standard Query Processing: Hive, Pig
• Batch: MapReduce
• Interactive: Tez
• Online Data Processing: HBase, Accumulo
• Real Time Stream Processing: Storm
• others …
• Efficient Cluster Resource Management & Shared Services (YARN)
• Redundant, Reliable Storage (HDFS)
Page 10
Let’s start simple…
• A solution unifying all data sources of a mobile App
– Allowing analytics over all data in one place
– In real time and long term

• Mobile Apps have multiple channels for data:
– Data created on the handset (e.g. geo location)
– Data created on servers accessed by the mobile app (e.g. app
data, logs)
– Data from backend services (e.g. RDBMS)
– Store data (e.g. iTunes Connect, Google Play)
– Social data (Twitter, App Reviews, etc.)
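
One way to picture "all data in one place" is a common record shape that every channel is converted into before it is stored. The sketch below is a minimal illustration under that assumption; the `Event` class, the two input layouts and all field names are hypothetical and not taken from the iiCaptain app.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Event:
    """One record in a common shape, whatever channel it came from."""
    source: str               # e.g. "handset", "server_log"
    timestamp: datetime       # always UTC
    user_id: Optional[str]
    payload: Dict[str, Any]


def from_handset(raw: Dict[str, Any]) -> Event:
    # Hypothetical handset message: epoch seconds plus a geo position.
    return Event(
        source="handset",
        timestamp=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
        user_id=raw.get("user"),
        payload={"lat": raw["lat"], "lon": raw["lon"]},
    )


def from_server_log(line: str) -> Event:
    # Hypothetical log layout: "<ISO time>Z <user> <method> <path>".
    stamp, user, method, path = line.split()
    ts = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return Event("server_log", ts, user, {"method": method, "path": path})


def unify(events: Iterable[Event]) -> List[Event]:
    # One time-ordered stream over all channels.
    return sorted(events, key=lambda e: e.timestamp)
```

With every channel mapped to the same shape, both real-time and long-term analytics can work on one stream instead of one format per source.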

Page 11
Why Should We Care?
• How much revenue did I make?
(Not as easy to answer as one might think)
• Where are my customers now?
• Can you fulfill requirements from the business like: “Tell me when our
customers are in a coffee shop so we can offer them e.g. Wifi”?
• What are my customers thinking about my app/brand?
• Are the ones complaining really using it (correctly)?
• How can I support marketing activities?
• How can I evaluate local marketing activities?
• Does positive/negative sentiment affect my downloads?
• Will my servers be able to deal with the load in 3 months?
• …
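
The coffee-shop question boils down to a distance check between a handset position and a list of known places. A minimal sketch using the haversine great-circle formula follows; the shop names, coordinates and the 50 m radius are made-up assumptions for illustration only.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_shops(position, shops, radius_m=50.0):
    """Return the names of all shops within radius_m of the (lat, lon) position."""
    lat, lon = position
    return [
        name
        for name, (shop_lat, shop_lon) in shops.items()
        if haversine_m(lat, lon, shop_lat, shop_lon) <= radius_m
    ]


if __name__ == "__main__":
    # Hypothetical shops and customer position.
    shops = {"Cafe A": (48.1372, 11.5756), "Cafe B": (48.2000, 11.6000)}
    print(nearby_shops((48.1373, 11.5756), shops))
```

On a real data set this check would run against a stream of positions, and a spatial index would replace the linear scan over all shops.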

Page 12
Design Goals
• Use as much as we have in our stack as possible
• Minimize dependencies on stacks beyond Hadoop
– Still make it useful and complete

• Make it fit into an 8 GB MacBook/Laptop
• Release early & release often

Page 13
iiCaptain

Page 14
Types Of Data For iiCaptain
• Geo location data
• Store Data
– iTunes Connect, Google Play, Amazon via AppAnnie

• Twitter
• RDBMS (Sqoop)
• Logs

Page 15
iiCaptain’s Data Ocean / Data Lake

Page 16
More Details

Page 17
Analytics

Page 18
SQL Interactive Query & Apache Hive

Apache Hive
• The de facto standard for Hadoop SQL access
• Used by your current data center partners
• Built for batch AND interactive query

Stinger Initiative
Broad, community based effort to deliver the next generation of Apache Hive
• Speed: Improve Hive query performance by 100X to allow for interactive query times (seconds)
• Scale: The only SQL interface to Hadoop designed for queries that scale from TB to PB
• SQL: Support broadest range of SQL semantics for analytic applications against Hadoop

Key Services: Platform, operational and data services essential for the enterprise
Skills: Leverage your existing skills: development, analytics, operations
Integration: Interoperable with existing data center investments

Page 19
Build Process, Shining With Savanna

Page 20
Roadmap
• Servlet Engine in YARN
• Project Savanna: Continuous Delivery end-2-end
• Sentiment Analysis with Flume/Hive and App Reviews
• Knox
• Falcon
• Phoenix

Page 21
HDP 2.0: Enterprise Hadoop Platform

[Platform diagram: Hortonworks Data Platform (HDP)]
OPERATIONAL SERVICES: AMBARI (Cluster Mgmnt), FALCON* (Dataset Mgmnt), OOZIE (Schedule)
DATA SERVICES: FLUME, SQOOP, NFS, WebHDFS (Data Movement, Load & Extract); HIVE & HCATALOG, PIG, HBASE (Data Access)
CORE SERVICES: KNOX*; MAP REDUCE, TEZ (Process); YARN (Resource Management); HDFS (Storage)
Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots
OS/VM, Cloud, Appliance

• The ONLY 100% open source and most current platform
• Integrates full range of enterprise-ready services
• Certified and tested at scale
• Engineered for deep ecosystem interoperability
Page 22
Hortonworks: The Value of “Open” for You
Validate & Try
1. Download the
Hortonworks Sandbox
2. Learn Hadoop using the
technical tutorials

3. Investigate a business
case using the step-by-step business case
scenarios
4. Validate YOUR business
case using your data in
the sandbox

Engage
1. Execute a Business Case
Discovery Workshop with
our architects
2. Build a business case for
Hadoop today

Connect With the Hadoop Community
We employ a large number of Apache project committers & innovators so
that you are represented in the open source community

Avoid Vendor Lock-In
Hortonworks Data Platform remains as close to the open source trunk as
possible and is developed 100% in the open so you are never locked in

The Partners you Rely On, Rely On Hortonworks
We work with partners to deeply integrate Hadoop with data center
technologies so you can leverage existing skills and investments

Certified for the Enterprise
We engineer, test and certify the Hortonworks Data Platform at scale to
ensure the reliability and stability you require for enterprise use

Support from the Experts
We provide the highest quality of support for deploying at scale. You are
supported by hundreds of years of Hadoop experience

Page 23
Quantum Computing Quick Research Guide by Arthur MorganQuantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur Morgan
Arthur Morgan
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Ad

OOP 2014

  • 1. Hortonworks: We Do Hadoop. Our mission is to enable your Modern Data Architecture by Delivering Enterprise Apache Hadoop Emil A. Siemes esiemes@hortonworks.com Solution Engineer January 2014
  • 2. Enable your Modern Data Architecture by Our Mission: Delivering Enterprise Apache Hadoop Our Commitment Headquarters: Palo Alto, CA Employees: 300+ and growing Open Leadership Drive innovation in the open exclusively via the Apache community-driven open source process Trusted Partners Enterprise Rigor Engineer, test and certify Apache Hadoop with the enterprise in mind Ecosystem Endorsement Focus on deep integration with existing data center technologies and skills Page 2
  • 3. APPLICATIONS A Traditional Approach Under Pressure Custom Applications Business Analytics Packaged Applications DATA SYSTEM 2.8 ZB in 2012 85% from New Data Types RDBMS EDW MPP REPOSITORIES 15x Machine Data by 2020 40 ZB by 2020 SOURCES Source: IDC Existing Sources Emerging Sources (CRM, ERP, Clickstream, Logs) (Sensor, Sentiment, Geo, Unstructured) Page 3
  • 4. APPLICATIONS Emerging Modern Data Architecture Custom Applications Business Analytics Packaged Applications DEV & DATA TOOLS SOURCES DATA SYSTEM BUILD & TEST OPERATIONAL TOOLS RDBMS EDW MANAGE & MONITOR MPP REPOSITORIES Existing Sources Emerging Sources (CRM, ERP, Clickstream, Logs) (Sensor, Sentiment, Geo, Unstructured) Page 4
  • 5. Drivers of Hadoop Adoption New Business Applications From NEW types of Data (or existing types for longer) Page 5
  • 6. Most Common NEW TYPES OF DATA 1. Sentiment Understand how your customers feel about your brand and products – right now 2. Clickstream Capture and analyze website visitors’ data trails and optimize your website 3. Sensor/Machine Discover patterns in data streaming automatically from remote sensors and machines 4. Geographic Analyze location-based data to manage operations where they occur 5. Server Logs Research logs to diagnose process failures and prevent security breaches 6. Unstructured (txt, video, pictures, etc.) Understand patterns in files across millions of web pages, emails, and documents + Keep existing data longer!
  • 7. Drivers of Hadoop Adoption Architectural A Modern Data Architecture New Business Applications Complement your existing data systems: the right workload in the right place Page 7
  • 8. Let’s build a Data Lake… Instructions on: hadoopwrangler.com Page 8
  • 9. HDP Data Lake Solution Architecture (diagram). Step 1: Extract & Load. Step 2: Model/Apply Metadata. Step 3: Transform, Aggregate & Materialize. Step 4: Schedule and Orchestrate. Manage Steps 1-4: Data Lifecycle with Falcon (data pipeline & flow management). Source data: ClickStream Data, Sales Transaction Data, Social Data, Product Data, Marketing/Inventory, EDW, OLTP. Ingestion: File, JMS, REST, HTTP, Streaming, Sqoop, Flume, NFS, WebHDFS. Data Lake HDP Grid components: YARN, MR2, Tez, Hive, Pig, HCatalog (table & user-defined metadata), HBase, Storm, Mahout, Elastic Search, SAS, Oozie (batch scheduler), Ambari. Streaming YARN Apps: Stream Processing, Real-time Search, MPI. Use Case Type 1: Materialize & Exchange. Use Case Type 2: Explore/Visualize. Downstream and interactive tools: Sqoop/Hive, EDW (Teradata), HBase Client, Hive Server (Tez/Stinger), Query/Analytics/Reporting Tools, Tableau/Excel, Datameer/Platfora/SAP. Knox – Perimeter Level Security. Opens up Hadoop to many new use cases Page 9
  • 10. Hadoop 2: The Introduction of YARN. Store all data in a single place, interact in multiple ways. 1st Gen of Hadoop: Single Use System (Batch Apps): MapReduce (cluster resource management & data processing) on HDFS (redundant, reliable storage). HADOOP 2: Multi Use Data Platform (Batch, Interactive, Online, Streaming, …): Standard Query Processing (Hive, Pig), Online Data Processing (HBase, Accumulo), Real Time Stream Processing (Storm), Batch (MapReduce), Interactive (Tez), others …, on Efficient Cluster Resource Management & Shared Services (YARN) and Redundant, Reliable Storage (HDFS) Page 10
  • 11. Let’s start simple… • A solution unifying all data sources of a mobile App – Allowing analytics over all data in one place – In real time and long term • Mobile Apps have multiple channels for data: – Data created on the handset (e.g. geo location) – Data created on servers accessed by the mobile app (e.g. app data, logs) – Data from backend services (e.g. RDBMS) – Store data (e.g. iTunes Connect, Google Play) – Social data (Twitter, App Reviews, etc.) Page 11
  • 12. Why Should We Care? • How much revenue did I make? (Not as easy to answer as one might think) • Where are my customers now? • Can you fulfill requirements from the business like: ”Tell me when our customers are in a coffee shop so we can offer them e.g. Wifi” • What are my customers thinking about my app/brand? • Are the ones complaining really using it (correctly)? • How can I support marketing activities? • How can I evaluate local marketing activities? • Does positive/negative sentiment affect my downloads? • Will my servers be able to deal with the load in 3 months? • … Page 12
  • 13. Design Goals • Use as much as we have in our stack as possible • Minimize dependencies on stacks beyond Hadoop – Still make it useful and complete • Make it fit into an 8GB MacBook/Laptop • Release early & release often Page 13
  • 15. Types Of Data For iiCaptain • Geo location data • Store Data • iTunes Connect, Google Play, Amazon via AppAnnie • Twitter • RDBMS (Sqoop) • Logs Page 15
  • 16. iiCaptain’s Data Ocean / Data Lake Page 16
  • 19. SQL Interactive Query & Apache Hive. Apache Hive: • The de facto standard for Hadoop SQL access • Used by your current data center partners • Built for batch AND interactive query. Stinger Initiative: Broad, community based effort to deliver the next generation of Apache Hive. Speed: Improve Hive query performance by 100X to allow for interactive query times (seconds). Scale: The only SQL interface to Hadoop designed for queries that scale from TB to PB. SQL: Support broadest range of SQL semantics for analytic applications against Hadoop. Key Services: Platform, operational and data services essential for the enterprise. Skills: Leverage your existing skills: development, analytics, operations. Integration: Interoperable with existing data center investments Page 19
  • 20. Build Process, Shining With Savanna Page 20
  • 21. Roadmap - Servlet Engine in YARN Project Savanna: Continuous Delivery end-2-end Sentiment Analysis with Flume/Hive and App Reviews Knox Falcon Phoenix Page 21
  • 22. HDP 2.0: Enterprise Hadoop Platform. Hortonworks Data Platform (HDP) (diagram). Operational Services: Ambari (Cluster Mgmnt), Falcon* (Dataset Mgmnt), Oozie (Schedule). Data Services: Flume, Sqoop, NFS, WebHDFS (Data Movement: Load & Extract); Hive & HCatalog, Pig, HBase (Data Access). Core Services: Knox*; MapReduce, Tez (Process); YARN (Resource Management); HDFS (Storage). Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots. Deployment: OS/VM, Cloud, Appliance. • The ONLY 100% open source and most current platform • Integrates full range of enterprise-ready services • Certified and tested at scale • Engineered for deep ecosystem interoperability Page 22
  • 23. Hortonworks: The Value of “Open” for You Validate & Try 1. Download the Hortonworks Sandbox 2. Learn Hadoop using the technical tutorials 3. Investigate a business case using the step-by-step business case scenarios 4. Validate YOUR business case using your data in the sandbox Engage 1. Execute a Business Case Discovery Workshop with our architects 2. Build a business case for Hadoop today Connect With the Hadoop Community We employ a large number of Apache project committers & innovators so that you are represented in the open source community Avoid Vendor Lock-In Hortonworks Data Platform remains as close to the open source trunk as possible and is developed 100% in the open so you are never locked in The Partners you Rely On, Rely On Hortonworks We work with partners to deeply integrate Hadoop with data center technologies so you can leverage existing skills and investments Certified for the Enterprise We engineer, test and certify the Hortonworks Data Platform at scale to ensure the reliability and stability you require for enterprise use Support from the Experts We provide the highest quality of support for deploying at scale. You are supported by hundreds of years of Hadoop experience Page 23

Editor's Notes

  • #2: Hello. Today I’m going to talk to you about Hortonworks and how we deliver an enterprise-ready Hadoop to enable your modern data architecture.
  • #3: Founded just 2.5 years ago by original Hadoop team members at Yahoo, Hortonworks emerged as the leader in open source Hadoop. We are committed to ensuring Hadoop is an enterprise-viable data platform ready for your modern data architecture. Our team is probably the largest assembled team of Hadoop experts and active leaders in the community. We not only make sure Hadoop meets all your enterprise requirements, like operations, reliability and security; it also needs to be packaged and tested, and we do this. It has to work with what you have. Make Hadoop an enterprise data platform. Make the market function. Innovate core platform, data, & operational services. Integrate deeply with enterprise ecosystem. Provide world-class enterprise support. Drive 100% open source software development and releases through the core Apache projects. Address enterprise needs in community projects. Establish Apache foundation projects as “the standard”. Promote open community vs. vendor control / lock-in. Enable the Hadoop market to function. Make it easy for enterprises to deploy at scale. Be the best at enabling deep ecosystem integration. Create a pull market with key strategic partners.
  • #4: On the left-hand side you see the traditional sources of data you have. Databases or web applications may be growing by 8% year over year. While there is also a lot of innovation happening in this space, over the last couple of years we have seen more and more pressure on these traditional systems as new data sources bring massive growth of new data: smartphones, sensors, the Internet of Things, logs. Traditional systems offer no efficient way to store and analyze this data.
  • #5: Then Hadoop entered the scene. What we’re seeing in most organizations is that they bring Hadoop into the datacenter not to replace the existing systems but to augment or support them. They use the right tool at the right place for the right type of data. Hadoop really is the landing spot for the new data sources we discussed before. It provides a way to store and process these new types of data in a very cost-effective manner. Besides being cost effective, it also scales horizontally and linearly, which was a key requirement when it was invented at Yahoo: when you need to index the web, you had better know how to scale and be able to handle the distributed nature of a cluster.
  • #6: Net new analytic applications. How to extract value from the new sources. 60-70% of Hadoop installations are of this type.
  • #24: Make Hadoop an enterprise data platform. Innovate core platform, data, & operational services. Integrate deeply with enterprise ecosystem. Provide world-class enterprise support. Drive 100% open source software development and releases through the core Apache projects. Address enterprise needs in community projects. Establish Apache foundation projects as “the standard”. Promote open community vs. vendor control / lock-in. Enable the Hadoop market to function. Make it easy for enterprises to deploy at scale. Be the best at enabling deep ecosystem integration. Create a pull market with key strategic partners.