SlideShare a Scribd company logo
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
LOWERING THE ENTRY POINT TO GETTING GOING
WITH HADOOP AND OBTAINING BUSINESS VALUE
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
SAS & HADOOP WHO IS SAS?
• Almost 4 decades of Advanced Analytics & DM expertise.
• Validated by Gartner and Forrester Analysts as Leaders in
Advanced Analytics, BI and DM.
• Leader in *17* Gartner’s Magic Quadrants from Data
Management, BI to Advanced Analytics.
• 400 offices, 70,000+ customers, 135 countries with largest
ecosystem of users and partners.
• 38% of Advanced Analytics Market Share (per IDC).
• 25% reinvested into R&D.
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
GARTNER: MAGIC QUADRANT FOR ADVANCED ANALYTIC PLATFORMS
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from SAS. Gartner does not endorse any vendor, product or
service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner's research organization and
should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
Gartner Magic Quadrant for Advanced Analytics Platforms by Gareth Herschel, Alexander Linden, Lisa Kart, 19 February 2015.
Gartner defines advanced
analytics as, "the analysis of all
kinds of data using
sophisticated quantitative
methods (for example,
statistics, descriptive and
predictive data mining,
simulation and optimization) to
produce insights that traditional
approaches to business
intelligence (BI) — such as
query and reporting — are
unlikely to discover."
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
SAS & INTEL
STUDY
HADOOP ADOPTION & CHALLENGES
Research summary: SAS and Intel asked more
than 300 IT-managers from the largest companies
in Denmark, Finland, Norway and Sweden about
the adoption of Big Data analytics and Hadoop.
http://nordichadoopsurvey.com
60% - cited advanced analytics,
data discovery, or as an
analytical lab
22% - would like to
speed up processing
Primary reason for considering Hadoop
Adoption / Obstacles
35% - cited “Resources and Competencies”
Results & Key
Findings
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
CHALLENGE HADOOP SKILLS SHORTAGE
Performing even the simplest tasks in
Hadoop typically requires mastering
disparate tools and writing hundreds of
lines of code.
Fact: There are a limited # of users
with the necessary Hadoop skills
• MapReduce
• Pig Latin
• HiveQL
• HDFS
• Sqoop and Oozie
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
CHALLENGE USER TOOLS ARE NOT BIG DATA ENABLED
Big data brings new requirements:
• Access to HDFS
• Parallel Loads
• New Native file types
• Knowledge of file structures
• New languages & code
• Need to transform data In-cluster
User tools are not engineered to process
data inside Hadoop.
• Tools are not optimized for Hadoop
• Users move data out of Hadoop to do
data management and data quality
• This requires more processing time
• Data is duplicated and more storage is
required
• Users do not use the Hadoop platform
as it was designed
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
BIG DATA
MANAGEMENT
ANALYSTS TAKE
Recommendation
“Use self-service interactive data preparation
tools to enhance analyst productivity.” and
“improve the quality of data”
– Gartner, “Data Preparation Is Not an Afterthought”
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
SAS & HADOOP HOW?
SAS & Hadoop intersect in many ways:
 SAS can treat Hadoop just as any other data source, pulling data
FROM Hadoop, when it is most convenient;
 SAS can work WITH Hadoop, lifting data in a purpose-built
advanced analytics in-memory environment;
 SAS can work directly IN Hadoop, leveraging the distributed
processing capabilities of Hadoop.



Copyright © 2015, SAS Institute Inc. All rights reserv ed.
SAS & HADOOP THE PRAGMATIC APPROACH
Prepare data IN
Hadoop for analytics
Move data FROM
Hadoop into a SAS
environment
Deploy and manage
model score code IN
Hadoop
Lift data IN to memory
for analytics
Model data in-memory
WITH advanced
modeling tools
Use the
right
approach for
what needs
to be done!
Explore data in-memory
WITH data
visualization and
approachable
modelling
MANAGE
DATA
EXPLORE
DATA
DEVELOP
MODELS
DEPLOY&
MONITOR
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
DEMO
• Sector: Online e-commerce shop
• Business Problem: Having difficulty to keep
profitable customers returning to the web site
• Specific aim: Identify the reasons for clients to
abandon their shopping cart and predict those
visitors with a high probability to abandon
• Challenge: Getting big data stored into Hadoop
and accessing it is difficult for analysts to access
from their traditional systems without specific
expertise
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
ENABLING ENTIRE ANALYTICS LIFECYCLE AROUND HADOOP
TEXT
MANAGE
DATA
EXPLORE
DATA
DEVELOP
MODELS
DEPLOY&
MONITOR
• SAS/ACCESS to Hadoop
• SAS Data Loader for Hadoop
• SAS Data Management
• SAS Federation Server
• SAS Event Stream Processing
• SAS Visual Analytics
• SAS In-memory Statistics
• SAS Visual Statistics
• SAS High-Performance
Analytics Products
• SAS Scoring Accelerator for
Hadoop
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
WHY SAS?
SUPPORTING THE ENTIRE
ANALYTICS JOURNEY!
SAS® Visual Analytics
SAS® Visual Statistics
SAS® In-Memory
Statistics
SAS® Enterprise Miner /
SAS® Forecast Server
SAS® Decision Manager /
SAS® Scoring Accelerator
Data exploration,
analysis,
visualization and
approachable
analytics for the
masses
In-depth GUI driven
approachable
modelling
State-of-the-art
interactive
analytics driven
through a
programmatic
interface
Robust production
modelling tools that
provide for
repeatability and
easy
operationalization
Capabilities to
deploy, monitor and
automate analytics
with appropriate
business rules into
operational business
processes
Visualize, explore, interact, explain, understand, democratize
Finalize, Deploy, integrate, execute,
operationalize, industrialize
SAS® Data Loader for Hadoop
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
SAS AND HADOOP SUMMARY
 SAS is the only vendor to work FROM + WITH + IN Hadoop throughout
the analytics lifecycle.
 All three approaches can be combined and coordinated,
complementing each other for each situation.
 Each approach can evolve, mature and/or morph into the other.
 SAS can help realize the value of Hadoop; bring production-analytics to
the platform.
Copyright © 2015, SAS Institute Inc. All rights reserv ed.
THANK YOU
MARK.TORR@SAS.COM
Lets connect on LinkedIn: http://de.linkedin.com/in/marktorr/en
Follow my interests on Twitter: https://twitter.com/mark_torr
Ad

More Related Content

What's hot (20)

The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
Hortonworks
 
How to get started in Big Data without Big Costs - StampedeCon 2016
How to get started in Big Data without Big Costs - StampedeCon 2016How to get started in Big Data without Big Costs - StampedeCon 2016
How to get started in Big Data without Big Costs - StampedeCon 2016
StampedeCon
 
Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success
DataWorks Summit/Hadoop Summit
 
Data Discovery & Lineage in Enterprise Hadoop
Data Discovery & Lineage in Enterprise HadoopData Discovery & Lineage in Enterprise Hadoop
Data Discovery & Lineage in Enterprise Hadoop
DataWorks Summit
 
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseHybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
DataWorks Summit
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake
VMware Tanzu
 
How to build a successful Data Lake
How to build a successful Data LakeHow to build a successful Data Lake
How to build a successful Data Lake
DataWorks Summit/Hadoop Summit
 
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
NoSQLmatters
 
Pentaho big data camp - 5 min
Pentaho   big data camp - 5 minPentaho   big data camp - 5 min
Pentaho big data camp - 5 min
ianfyfe
 
Bay Area Hadoop User Group
Bay Area Hadoop User GroupBay Area Hadoop User Group
Bay Area Hadoop User Group
Pentaho
 
Big Data Discovery
Big Data DiscoveryBig Data Discovery
Big Data Discovery
Harald Erb
 
Beyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AIBeyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AI
DataWorks Summit
 
Operational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresOperational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data Stores
DATAVERSITY
 
Big data/Hadoop/HANA Basics
Big data/Hadoop/HANA BasicsBig data/Hadoop/HANA Basics
Big data/Hadoop/HANA Basics
Global Business Solutions SME
 
Data Management for High Performance Analytics
Data Management for High Performance AnalyticsData Management for High Performance Analytics
Data Management for High Performance Analytics
Mary Snyder
 
Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016
StampedeCon
 
Hadoop in Validated Environment - Data Governance Initiative
Hadoop in Validated Environment - Data Governance InitiativeHadoop in Validated Environment - Data Governance Initiative
Hadoop in Validated Environment - Data Governance Initiative
DataWorks Summit
 
Deploying a Governed Data Lake
Deploying a Governed Data LakeDeploying a Governed Data Lake
Deploying a Governed Data Lake
WaterlineData
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
Hortonworks
 
Partners 2013 LinkedIn Use Cases for Teradata Connectors for Hadoop
Partners 2013 LinkedIn Use Cases for Teradata Connectors for HadoopPartners 2013 LinkedIn Use Cases for Teradata Connectors for Hadoop
Partners 2013 LinkedIn Use Cases for Teradata Connectors for Hadoop
Eric Sun
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
Hortonworks
 
How to get started in Big Data without Big Costs - StampedeCon 2016
How to get started in Big Data without Big Costs - StampedeCon 2016How to get started in Big Data without Big Costs - StampedeCon 2016
How to get started in Big Data without Big Costs - StampedeCon 2016
StampedeCon
 
Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success
DataWorks Summit/Hadoop Summit
 
Data Discovery & Lineage in Enterprise Hadoop
Data Discovery & Lineage in Enterprise HadoopData Discovery & Lineage in Enterprise Hadoop
Data Discovery & Lineage in Enterprise Hadoop
DataWorks Summit
 
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseHybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
DataWorks Summit
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake
VMware Tanzu
 
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
NoSQLmatters
 
Pentaho big data camp - 5 min
Pentaho   big data camp - 5 minPentaho   big data camp - 5 min
Pentaho big data camp - 5 min
ianfyfe
 
Bay Area Hadoop User Group
Bay Area Hadoop User GroupBay Area Hadoop User Group
Bay Area Hadoop User Group
Pentaho
 
Big Data Discovery
Big Data DiscoveryBig Data Discovery
Big Data Discovery
Harald Erb
 
Beyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AIBeyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AI
DataWorks Summit
 
Operational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresOperational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data Stores
DATAVERSITY
 
Data Management for High Performance Analytics
Data Management for High Performance AnalyticsData Management for High Performance Analytics
Data Management for High Performance Analytics
Mary Snyder
 
Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016
StampedeCon
 
Hadoop in Validated Environment - Data Governance Initiative
Hadoop in Validated Environment - Data Governance InitiativeHadoop in Validated Environment - Data Governance Initiative
Hadoop in Validated Environment - Data Governance Initiative
DataWorks Summit
 
Deploying a Governed Data Lake
Deploying a Governed Data LakeDeploying a Governed Data Lake
Deploying a Governed Data Lake
WaterlineData
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
Hortonworks
 
Partners 2013 LinkedIn Use Cases for Teradata Connectors for Hadoop
Partners 2013 LinkedIn Use Cases for Teradata Connectors for HadoopPartners 2013 LinkedIn Use Cases for Teradata Connectors for Hadoop
Partners 2013 LinkedIn Use Cases for Teradata Connectors for Hadoop
Eric Sun
 

Similar to Lowering the entry point to getting going with Hadoop and obtaining business value (20)

Accelerate Your Big Data Analytics Efforts with SAS and Hadoop
Accelerate Your Big Data Analytics Efforts with SAS and HadoopAccelerate Your Big Data Analytics Efforts with SAS and Hadoop
Accelerate Your Big Data Analytics Efforts with SAS and Hadoop
DataWorks Summit
 
Tableau and hadoop
Tableau and hadoopTableau and hadoop
Tableau and hadoop
Craig Jordan
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - Jaspersoft
Hortonworks
 
2015 HortonWorks MDA Roadshow Presentation
2015 HortonWorks MDA Roadshow Presentation2015 HortonWorks MDA Roadshow Presentation
2015 HortonWorks MDA Roadshow Presentation
Felix Liao
 
How Can I Save Time and Build Trust With My Data Preparation.pdf
How Can I Save Time and Build Trust With My Data Preparation.pdfHow Can I Save Time and Build Trust With My Data Preparation.pdf
How Can I Save Time and Build Trust With My Data Preparation.pdf
ArelyMuoz4
 
SAS Data Management for Analytics: potenzia le tue analisi e sostieni l’innov...
SAS Data Management for Analytics: potenzia le tue analisi e sostieni l’innov...SAS Data Management for Analytics: potenzia le tue analisi e sostieni l’innov...
SAS Data Management for Analytics: potenzia le tue analisi e sostieni l’innov...
SAS Italy
 
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR DataExclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
Pentaho
 
Business Visualization: Dashboard & Storyboarding
Business Visualization: Dashboard & StoryboardingBusiness Visualization: Dashboard & Storyboarding
Business Visualization: Dashboard & Storyboarding
NMIMS Global Access School of Continuing Education (NGA-SCE)
 
What's New with SAP BusinessObjects Business Intelligence 4.1?
What's New with SAP BusinessObjects Business Intelligence 4.1?What's New with SAP BusinessObjects Business Intelligence 4.1?
What's New with SAP BusinessObjects Business Intelligence 4.1?
SAP Analytics
 
Leveraging SAP HANA with Apache Hadoop and SAP Analytics
Leveraging SAP HANA with Apache Hadoop and SAP AnalyticsLeveraging SAP HANA with Apache Hadoop and SAP Analytics
Leveraging SAP HANA with Apache Hadoop and SAP Analytics
Method360
 
Big Data for BI - Beyond the Hype - Pentaho
Big Data for BI - Beyond the Hype - PentahoBig Data for BI - Beyond the Hype - Pentaho
Big Data for BI - Beyond the Hype - Pentaho
Subramanian Senthamarai Kannan
 
SAS Visual Analytics
SAS Visual AnalyticsSAS Visual Analytics
SAS Visual Analytics
MarketingArrowECS_CZ
 
Contexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to Production
Contexti
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, Sydney
Sai Paravastu
 
Data & Analytics with CIS & Microsoft Platforms
Data & Analytics with CIS & Microsoft PlatformsData & Analytics with CIS & Microsoft Platforms
Data & Analytics with CIS & Microsoft Platforms
Sonata Software
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
Hortonworks
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
Hortonworks
 
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
DataWorks Summit/Hadoop Summit
 
Ramesh kutumbaka resume
Ramesh kutumbaka resumeRamesh kutumbaka resume
Ramesh kutumbaka resume
Ramesh Kutumbaka
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which
DataWorks Summit
 
Accelerate Your Big Data Analytics Efforts with SAS and Hadoop
Accelerate Your Big Data Analytics Efforts with SAS and HadoopAccelerate Your Big Data Analytics Efforts with SAS and Hadoop
Accelerate Your Big Data Analytics Efforts with SAS and Hadoop
DataWorks Summit
 
Tableau and hadoop
Tableau and hadoopTableau and hadoop
Tableau and hadoop
Craig Jordan
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - Jaspersoft
Hortonworks
 
2015 HortonWorks MDA Roadshow Presentation
2015 HortonWorks MDA Roadshow Presentation2015 HortonWorks MDA Roadshow Presentation
2015 HortonWorks MDA Roadshow Presentation
Felix Liao
 
How Can I Save Time and Build Trust With My Data Preparation.pdf
How Can I Save Time and Build Trust With My Data Preparation.pdfHow Can I Save Time and Build Trust With My Data Preparation.pdf
How Can I Save Time and Build Trust With My Data Preparation.pdf
ArelyMuoz4
 
SAS Data Management for Analytics: potenzia le tue analisi e sostieni l’innov...
SAS Data Management for Analytics: potenzia le tue analisi e sostieni l’innov...SAS Data Management for Analytics: potenzia le tue analisi e sostieni l’innov...
SAS Data Management for Analytics: potenzia le tue analisi e sostieni l’innov...
SAS Italy
 
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR DataExclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
Pentaho
 
What's New with SAP BusinessObjects Business Intelligence 4.1?
What's New with SAP BusinessObjects Business Intelligence 4.1?What's New with SAP BusinessObjects Business Intelligence 4.1?
What's New with SAP BusinessObjects Business Intelligence 4.1?
SAP Analytics
 
Leveraging SAP HANA with Apache Hadoop and SAP Analytics
Leveraging SAP HANA with Apache Hadoop and SAP AnalyticsLeveraging SAP HANA with Apache Hadoop and SAP Analytics
Leveraging SAP HANA with Apache Hadoop and SAP Analytics
Method360
 
Contexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to Production
Contexti
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, Sydney
Sai Paravastu
 
Data & Analytics with CIS & Microsoft Platforms
Data & Analytics with CIS & Microsoft PlatformsData & Analytics with CIS & Microsoft Platforms
Data & Analytics with CIS & Microsoft Platforms
Sonata Software
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
Hortonworks
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
Hortonworks
 
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
DataWorks Summit/Hadoop Summit
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which
DataWorks Summit
 
Ad

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
DataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
DataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
DataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 
Ad

Recently uploaded (20)

Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 

Lowering the entry point to getting going with Hadoop and obtaining business value

  • 1. Copyright © 2015, SAS Institute Inc. All rights reserv ed. LOWERING THE ENTRY POINT TO GETTING GOING WITH HADOOP AND OBTAINING BUSINESS VALUE
  • 2. Copyright © 2015, SAS Institute Inc. All rights reserv ed. SAS & HADOOP WHO IS SAS? • Almost 4 decades of Advanced Analytics & DM expertise. • Validated by Gartner and Forrester Analysts as Leaders in Advanced Analytics, BI and DM. • Leader in *17* Gartner’s Magic Quadrants from Data Management, BI to Advanced Analytics. • 400 offices, 70,000+ customers, 135 countries with largest ecosystem of users and partners. • 38% of Advanced Analytics Market Share (per IDC). • 25% reinvested into R&D.
  • 3. Copyright © 2015, SAS Institute Inc. All rights reserv ed. GARTNER: MAGIC QUADRANT FOR ADVANCED ANALYTIC PLATFORMS This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from SAS. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose. Gartner Magic Quadrant for Advanced Analytics Platforms by Gareth Herschel, Alexander Linden, Lisa Kart, 19 February 2015. Gartner defines advanced analytics as, "the analysis of all kinds of data using sophisticated quantitative methods (for example, statistics, descriptive and predictive data mining, simulation and optimization) to produce insights that traditional approaches to business intelligence (BI) — such as query and reporting — are unlikely to discover."
  • 4. Copyright © 2015, SAS Institute Inc. All rights reserv ed. SAS & INTEL STUDY HADOOP ADOPTION & CHALLENGES Research summary: SAS and Intel asked more than 300 IT-managers from the largest companies in Denmark, Finland, Norway and Sweden about the adoption of Big Data analytics and Hadoop. http://nordichadoopsurvey.com 60% - cited advanced analytics, data discovery, or as an analytical lab 22% - would like to speed up processing Primary reason for considering Hadoop Adoption / Obstacles 35% - cited “Resources and Competencies” Results & Key Findings
  • 5. Copyright © 2015, SAS Institute Inc. All rights reserv ed. CHALLENGE HADOOP SKILLS SHORTAGE Performing even the simplest tasks in Hadoop typically requires mastering disparate tools and writing hundreds of lines of code. Fact: There are a limited # of users with the necessary Hadoop skills • MapReduce • Pig Latin • HiveQL • HDFS • Sqoop and Oozie
  • 6. Copyright © 2015, SAS Institute Inc. All rights reserv ed. CHALLENGE USER TOOLS ARE NOT BIG DATA ENABLED Big data brings new requirements: • Access to HDFS • Parallel Loads • New Native file types • Knowledge of file structures • New languages & code • Need to transform data In-cluster User tools are not engineered to process data inside Hadoop. • Tools are not optimized for Hadoop • Users move data out of Hadoop to do data management and data quality • This requires more processing time • Data is duplicated and more storage is required • Users do not use the Hadoop platform as it was designed
  • 7. Copyright © 2015, SAS Institute Inc. All rights reserv ed. BIG DATA MANAGEMENT ANALYSTS TAKE Recommendation “Use self-service interactive data preparation tools to enhance analyst productivity.” and “improve the quality of data” – Gartner, “Data Preparation Is Not an Afterthought”
  • 8. Copyright © 2015, SAS Institute Inc. All rights reserv ed. SAS & HADOOP HOW? SAS & Hadoop intersect in many ways:  SAS can treat Hadoop just as any other data source, pulling data FROM Hadoop, when it is most convenient;  SAS can work WITH Hadoop, lifting data in a purpose-built advanced analytics in-memory environment;  SAS can work directly IN Hadoop, leveraging the distributed processing capabilities of Hadoop.   
  • 9. Copyright © 2015, SAS Institute Inc. All rights reserv ed. SAS & HADOOP THE PRAGMATIC APPROACH Prepare data IN Hadoop for analytics Move data FROM Hadoop into a SAS environment Deploy and manage model score code IN Hadoop Lift data IN to memory for analytics Model data in-memory WITH advanced modeling tools Use the right approach for what needs to be done! Explore data in-memory WITH data visualization and approachable modelling MANAGE DATA EXPLORE DATA DEVELOP MODELS DEPLOY& MONITOR
  • 10. Copyright © 2015, SAS Institute Inc. All rights reserv ed. DEMO • Sector: Online e-commerce shop • Business Problem: Having difficulty to keep profitable customers returning to the web site • Specific aim: Identify the reasons for clients to abandon their shopping cart and predict those visitors with a high probability to abandon • Challenge: Getting big data stored into Hadoop and accessing it is difficult for analysts to access from their traditional systems without specific expertise
  • 11. Copyright © 2015, SAS Institute Inc. All rights reserv ed. ENABLING ENTIRE ANALYTICS LIFECYCLE AROUND HADOOP TEXT MANAGE DATA EXPLORE DATA DEVELOP MODELS DEPLOY& MONITOR • SAS/ACCESS to Hadoop • SAS Data Loader for Hadoop • SAS Data Management • SAS Federation Server • SAS Event Stream Processing • SAS Visual Analytics • SAS In-memory Statistics • SAS Visual Statistics • SAS High-Performance Analytics Products • SAS Scoring Accelerator for Hadoop
  • 12. Copyright © 2015, SAS Institute Inc. All rights reserv ed. WHY SAS? SUPPORTING THE ENTIRE ANALYTICS JOURNEY! SAS® Visual Analytics SAS® Visual Statistics SAS® In-Memory Statistics SAS® Enterprise Miner / SAS® Forecast Server SAS® Decision Manager / SAS® Scoring Accelerator Data exploration, analysis, visualization and approachable analytics for the masses In-depth GUI driven approachable modelling State-of-the-art interactive analytics driven through a programmatic interface Robust production modelling tools that provide for repeatability and easy operationalization Capabilities to deploy, monitor and automate analytics with appropriate business rules into operational business processes Visualize, explore, interact, explain, understand, democratize Finalize, Deploy, integrate, execute, operationalize, industrialize SAS® Data Loader for Hadoop
  • 13. Copyright © 2015, SAS Institute Inc. All rights reserv ed. SAS AND HADOOP SUMMARY  SAS is the only vendor to work FROM + WITH + IN Hadoop throughout the analytics lifecycle.  All three approaches can be combined and coordinated, complementing each other for each situation.  Each approach can evolve, mature and/or morph into the other.  SAS can help realize the value of Hadoop; bring production-analytics to the platform.
  • 14. Copyright © 2015, SAS Institute Inc. All rights reserv ed. THANK YOU [email protected] Lets connect on LinkedIn: http://de.linkedin.com/in/marktorr/en Follow my interests on Twitter: https://twitter.com/mark_torr