© Cloudera, Inc. All rights reserved.
Enterprise-Ready Data Science:
Scaling, Governance, and Operationalization
Mark Chisam
Senior Solution Engineer
Introducing Cloudera Data Science Workbench
Dr. Daniel Parton
Lead Data Scientist
Operationalizing Data Science for Enterprises
Bardess® is a consulting company focused on designing and implementing data analytics solutions.
We are a team of data and business professionals who ask insightful questions, extend boundaries, and take action.
We transform data into insights and action, every day.
Bardess Data Practices
MANAGEMENT CONSULTING | DATA OPS | DATA SCIENCE | DATA ANALYTICS
• Requirements Discovery
• Strategy + Planning
• Solution Design
• Ingestion + Shaping
• Data Architecture
• Storage + Processing
• Predictive Analytics
• Machine Learning
• Artificial Intelligence
• Visualization
• Data Discovery
• Dev / Ops
AI
MACHINE LEARNING
DATA SCIENCE
ANALYTICS
"BIG DATA"
WHAT IS A DATA SCIENTIST?
THE GOOD NEWS
Data Engineering | Data Science (Exploratory) | Production (Operational)
• Data has never been more plentiful.
• Open source data science and machine learning libraries are rapidly evolving.
• Commodity (and on-demand) compute makes scalable production machine learning affordable.
Reports, Dashboards, Production Data Pipelines, Batch scoring, …
THE BAD NEWS
Data Engineering | Data Science (Exploratory) | Production (Operational)
• Data needs to move across multiple different systems.
• Teams have different, conflicting requests for languages and libraries.
• Most data science is done at small scale, individually, and is difficult to replace.
• Very few models reach production.
THE CHALLENGE
Balance these needs
DATA SCIENCE
• Access to granular data
• Flexibility
• Preferred open source tools
• Elastic provisioning
  • Compute
  • Storage
• Reproducible research
• Path to production
DATA MANAGEMENT
• Security
• Governance
• Standards
• Low maintenance
• Low cost
• Self-service access
THE TYPICAL SOLUTION
“If I can’t use my favorite tools, I’ll…”
• Copy data to my laptop
• Copy data to a data science appliance
• Copy data to a cloud service
Why this is a problem:
• Complicates security
• Breaks data governance
• Adds latency to process
• Makes collaboration more difficult
• Complicates model management and deployment
• Creates infrastructure silos
CLOUDERA DATA SCIENCE WORKBENCH
Accelerate Machine Learning from Research to Production
For data scientists
• Experiment faster: use R, Python, or Scala with on-demand compute and secure CDH data access
• Work together: share reproducible research with your whole team
• Deploy with confidence: get to production repeatably and without recoding
For IT professionals
• Bring data science to the data: give your data science team more freedom while reducing the risk and cost of silos
• Secure by default: leverage common security and governance across workloads
• Run anywhere: on-premises or in the cloud
CASE STUDY
Transforming Business Decision-Making with Machine Learning at Scale
Background:
• Retail client aimed to use clustering to understand their most common types of transactions
• And to find which groups of products tend to be purchased together
• Cloudera cluster, storing 2 billion rows of historical transaction data
• Used CDSW to build custom clustering workflow in Spark and Python
Representative image of clustering
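The goal of finding which groups of products tend to be purchased together can be illustrated with a simple pair count. This is only a sketch in plain Python on hypothetical toy baskets; the deck does not show the client's data or the actual Spark workflow, and the function name `co_purchase_counts` is an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(transactions):
    """Count how often each unordered product pair appears in the same transaction."""
    counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so (a, b) and (b, a) map to the same key.
        items = sorted(set(basket))
        counts.update(combinations(items, 2))
    return counts


# Hypothetical toy baskets; the real transaction data is not part of the deck.
transactions = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "butter"],
]

pair_counts = co_purchase_counts(transactions)

# Show the pairs that co-occur most often.
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```

At the scale described in the case study, the same counting idea would presumably be expressed as a distributed aggregation rather than an in-memory loop.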
CASE STUDY
Transforming Business Decision-Making with Machine Learning at Scale
Result:
• Clusters describe transactions with far more nuance than the simple category-level aggregations that were previously in use
• Identified major trends in certain types of transactions, worth multiples of $100M
• Clusters are transforming how the company thinks about its business, from shop floor to board level
• Clustering workflow is easily maintainable, reproducible, and scalable
Representative image of clustering
CASE STUDY
Transforming Business Decision-Making with Machine Learning at Scale
Benefits of CDSW:
• Easy access to big datasets from Cloudera HDFS
• Access to Spark to apply clustering on entire 2 billion row dataset
• Notebook environment allows data scientists to innovate while staying within secure Cloudera environment
• Collaborative environment enabling organized project structure and collaboration within team of data scientists
Representative image of clustering
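The deck does not say which clustering algorithm was used. As an illustration only, here is a minimal k-means (Lloyd's algorithm) sketch in plain Python, run at toy scale rather than on Spark. The features (basket size, total spend), the sample points, and the starting centroids are all hypothetical.

```python
import math


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Recompute each centroid as the mean of its assigned points."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            # Keep the centroid of an empty cluster where it was.
            new_centroids.append(old)
            continue
        dims = len(old)
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dims))
        )
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the labels stop changing."""
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update(points, labels, centroids)
    return labels, centroids


# Hypothetical features per transaction: (basket size, total spend).
points = [(1, 5.0), (2, 8.0), (1, 6.0), (20, 150.0), (22, 160.0), (18, 140.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

In practice the features would need scaling before distances are computed, and a 2 billion row dataset would call for a distributed implementation rather than in-memory lists.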
LIVE DEMO
Introducing the Data Science Sandbox
Lovan Chetty
VP, Product
SOLUTION
Data Science Workbench
EDH Stack
+ Option for Altus PaaS & More…
Cloud IaaS (Fully-Managed)
+ BYOL options
End to End Management (Cloud>Cluster>Workload)
24x7 Production DevOps
Security, Governance & Compliance
Workload Optimization
Fully-Managed, Complete Cloud Platform for Analytics and Data Science
DevOps Built-In, Cloudera & Cloud IaaS Included. Fast Setup, Ready in Hours.
Fully-Managed Data Science Sandbox as a Service
The Fastest, Most Cost-Effective Way to Expand or Deploy
a Modern Platform for Data Science in the Cloud.
• Ready Now, with No New Resources: 24x7 Production DevOps & Monitoring
• Secure, Enterprise-Ready: Hybrid Gateways, Governance, Compliance
• Simple: All-in-one solutions for agility, flexibility in analytics & tools
• Cost-Effective: ½ TCO, Best price-performance, SLA Optimization
Benefits Fully-Managed Data Science Sandbox
www.cazena.com/cloudera
WHY CLOUD?
Q&A
Q&A - TECHNICAL PANELISTS
Lovan Chetty
VP, Products
lovan@cazena.com
Dr. Daniel Parton
Lead Data Scientist
dparton@bardess.com
Mark Chisam
Senior Solution Engineer
mchisam@cloudera.com
The Data Science Sandbox as a Service
Try it Now with the FastStart Business Value Pilot:
4 Weeks to a Guaranteed Business Outcome.
Philip Duplisey,
Senior Director of Consulting
pduplisey@bardess.com
Bardess.com
Bardess: Data Science &
Management Consulting
Cazena: Fully-Managed
Cloudera Solutions for Azure &
AWS
Cloudera: The Modern
Platform for Data Science and
Analytics.
Sam Berg
VP Sales
sberg@cazena.com
Cazena.com
Tia Watson
Partner Manager
twatson@cloudera.com
Cloudera.com
THANK YOU
Ad

More Related Content

What's hot (20)

Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
Cloudera, Inc.
 
Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloudera, Inc.
 
When SAP alone is not enough
When SAP alone is not enoughWhen SAP alone is not enough
When SAP alone is not enough
Cloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
Cloudera, Inc.
 
Turning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformTurning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data Platform
Cloudera, Inc.
 
Cloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for AnalyticsCloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for Analytics
Cloudera, Inc.
 
Get started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionGet started with Cloudera's cyber solution
Get started with Cloudera's cyber solution
Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Cloudera, Inc.
 
The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)
Cloudera, Inc.
 
Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18
Cloudera, Inc.
 
How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...
Cloudera, Inc.
 
Cloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemachtCloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera, Inc.
 
Cloudera - IoT & Smart Cities
Cloudera - IoT & Smart CitiesCloudera - IoT & Smart Cities
Cloudera - IoT & Smart Cities
Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Cloudera, Inc.
 
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformHow to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
Cloudera, Inc.
 
Big Data Fundamentals
Big Data FundamentalsBig Data Fundamentals
Big Data Fundamentals
Cloudera, Inc.
 
Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1
Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1

Cloudera, Inc.
 
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
Cloudera, Inc.
 
Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloudera, Inc.
 
When SAP alone is not enough
When SAP alone is not enoughWhen SAP alone is not enough
When SAP alone is not enough
Cloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
Cloudera, Inc.
 
Turning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformTurning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data Platform
Cloudera, Inc.
 
Cloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for AnalyticsCloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for Analytics
Cloudera, Inc.
 
Get started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionGet started with Cloudera's cyber solution
Get started with Cloudera's cyber solution
Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Cloudera, Inc.
 
The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)
Cloudera, Inc.
 
Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18
Cloudera, Inc.
 
How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...
Cloudera, Inc.
 
Cloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemachtCloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera, Inc.
 
Cloudera - IoT & Smart Cities
Cloudera - IoT & Smart CitiesCloudera - IoT & Smart Cities
Cloudera - IoT & Smart Cities
Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Cloudera, Inc.
 
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformHow to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
Cloudera, Inc.
 
Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1
Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1

Cloudera, Inc.
 
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Cloudera, Inc.
 

Similar to Introducing the data science sandbox as a service 8.30.18 (20)

Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
Cloudera, Inc.
 
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Stefan Lipp
 
Unlocking data science in the enterprise - with Oracle and Cloudera
Unlocking data science in the enterprise - with Oracle and ClouderaUnlocking data science in the enterprise - with Oracle and Cloudera
Unlocking data science in the enterprise - with Oracle and Cloudera
Cloudera, Inc.
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
Cloudera, Inc.
 
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine Learning
Cloudera, Inc.
 
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
Cloudera, Inc.
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
DataWorks Summit
 
A deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloudA deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloud
Cloudera, Inc.
 
Cloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the CloudCloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the Cloud
GoDataDriven
 
Hadoop security implementationon 20171003
Hadoop security implementationon 20171003Hadoop security implementationon 20171003
Hadoop security implementationon 20171003
lee tracie
 
Security implementation on hadoop
Security implementation on hadoopSecurity implementation on hadoop
Security implementation on hadoop
Wei-Chiu Chuang
 
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondStanding Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Cloudera, Inc.
 
Machine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to ImplementationMachine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to Implementation
DataWorks Summit
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
The Hive
 
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
Cloudera, Inc.
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Cloudera, Inc.
 
Big data journey to the cloud 5.30.18 asher bartch
Big data journey to the cloud 5.30.18   asher bartchBig data journey to the cloud 5.30.18   asher bartch
Big data journey to the cloud 5.30.18 asher bartch
Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
Cloudera, Inc.
 
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Stefan Lipp
 
Unlocking data science in the enterprise - with Oracle and Cloudera
Unlocking data science in the enterprise - with Oracle and ClouderaUnlocking data science in the enterprise - with Oracle and Cloudera
Unlocking data science in the enterprise - with Oracle and Cloudera
Cloudera, Inc.
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
Cloudera, Inc.
 
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine Learning
Cloudera, Inc.
 
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
Cloudera, Inc.
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
DataWorks Summit
 
A deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloudA deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloud
Cloudera, Inc.
 
Cloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the CloudCloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the Cloud
GoDataDriven
 
Hadoop security implementationon 20171003
Hadoop security implementationon 20171003Hadoop security implementationon 20171003
Hadoop security implementationon 20171003
lee tracie
 
Security implementation on hadoop
Security implementation on hadoopSecurity implementation on hadoop
Security implementation on hadoop
Wei-Chiu Chuang
 
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondStanding Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Cloudera, Inc.
 
Machine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to ImplementationMachine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to Implementation
DataWorks Summit
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
The Hive
 
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
Cloudera, Inc.
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Cloudera, Inc.
 
Big data journey to the cloud 5.30.18 asher bartch
Big data journey to the cloud 5.30.18   asher bartchBig data journey to the cloud 5.30.18   asher bartch
Big data journey to the cloud 5.30.18 asher bartch
Cloudera, Inc.
 
Ad

More from Cloudera, Inc. (9)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Cloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
Cloudera, Inc.
 
How Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceHow Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR compliance
Cloudera, Inc.
 
Multi task learning stepping away from narrow expert models 7.11.18
Multi task learning stepping away from narrow expert models 7.11.18Multi task learning stepping away from narrow expert models 7.11.18
Multi task learning stepping away from narrow expert models 7.11.18
Cloudera, Inc.
 
Cloudera training secure your cloudera cluster 7.10.18
Cloudera training secure your cloudera cluster 7.10.18Cloudera training secure your cloudera cluster 7.10.18
Cloudera training secure your cloudera cluster 7.10.18
Cloudera, Inc.
 
The 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: ExposedThe 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: Exposed
Cloudera, Inc.
 
Delivering improved patient outcomes through advanced analytics 6.26.18
Delivering improved patient outcomes through advanced analytics 6.26.18Delivering improved patient outcomes through advanced analytics 6.26.18
Delivering improved patient outcomes through advanced analytics 6.26.18
Cloudera, Inc.
 
Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Cloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
Cloudera, Inc.
 
How Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceHow Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR compliance
Cloudera, Inc.
 
Multi task learning stepping away from narrow expert models 7.11.18
Multi task learning stepping away from narrow expert models 7.11.18Multi task learning stepping away from narrow expert models 7.11.18
Multi task learning stepping away from narrow expert models 7.11.18
Cloudera, Inc.
 
Cloudera training secure your cloudera cluster 7.10.18
Cloudera training secure your cloudera cluster 7.10.18Cloudera training secure your cloudera cluster 7.10.18
Cloudera training secure your cloudera cluster 7.10.18
Cloudera, Inc.
 
The 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: ExposedThe 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: Exposed
Cloudera, Inc.
 
Delivering improved patient outcomes through advanced analytics 6.26.18
Delivering improved patient outcomes through advanced analytics 6.26.18Delivering improved patient outcomes through advanced analytics 6.26.18
Delivering improved patient outcomes through advanced analytics 6.26.18
Cloudera, Inc.
 
Ad


Introducing the data science sandbox as a service 8.30.18

  • 1. © Cloudera, Inc. All rights reserved. Enterprise-Ready Data Science: Scaling, Governance, and Operationalization
  • 2. Mark Chisam, Senior Solution Engineer: Introducing Cloudera Data Science Workbench
  • 9. Dr. Daniel Parton, Lead Data Scientist: Operationalizing Data Science for Enterprises
  • 10. Bardess® is a consulting company focused on designing and implementing data analytics solutions. We are a team of data and business professionals who ask insightful questions, extend boundaries, and take action. We transform data into insights and action, every day.
  • 11. Bardess Data Practices
      • MANAGEMENT CONSULTING: Requirements Discovery, Strategy + Planning, Solution Design
      • DATA OPS: Ingestion + Shaping, Data Architecture, Storage + Processing
      • DATA SCIENCE: Predictive Analytics, Machine Learning, Artificial Intelligence
      • DATA ANALYTICS: Visualization, Data Discovery, Dev / Ops
  • 12. AI, MACHINE LEARNING, DATA SCIENCE, ANALYTICS, "BIG DATA"
  • 13. WHAT IS A DATA SCIENTIST?
  • 14. WHAT IS A DATA SCIENTIST?
  • 15. THE GOOD NEWS
      • Data has never been more plentiful.
      • Open source data science and machine learning libraries are rapidly evolving.
      • Commodity (and on-demand) compute makes scalable production machine learning affordable.
      Diagram labels: Data Engineering; Data Science (Exploratory); Production (Operational); Reports, Dashboards; Production Data Pipelines; Batch scoring; …
  • 16. THE BAD NEWS
      • Data needs to move across multiple different systems.
      • Teams have different, conflicting requests for languages and libraries.
      • Most data science is done at small scale, individually, and is difficult to replace.
      • Very few models reach production.
      Diagram labels: Data Engineering; Data Science (Exploratory); Production (Operational)
  • 17. THE CHALLENGE: Balance these needs
      DATA SCIENCE
      • Access to granular data
      • Flexibility
      • Preferred open source tools
      • Elastic provisioning
          • Compute
          • Storage
      • Reproducible research
      • Path to production
      DATA MANAGEMENT
      • Security
      • Governance
      • Standards
      • Low maintenance
      • Low cost
      • Self-service access
  • 18. THE TYPICAL SOLUTION
      “If I can’t use my favorite tools, I’ll…”
      • Copy data to my laptop
      • Copy data to a data science appliance
      • Copy data to a cloud service
      Why this is a problem:
      • Complicates security
      • Breaks data governance
      • Adds latency to process
      • Makes collaboration more difficult
      • Complicates model management and deployment
      • Creates infrastructure silos
  • 19. CLOUDERA DATA SCIENCE WORKBENCH: Accelerate Machine Learning from Research to Production
      For data scientists
      • Experiment faster: Use R, Python, or Scala with on-demand compute and secure CDH data access
      • Work together: Share reproducible research with your whole team
      • Deploy with confidence: Get to production repeatably and without recoding
      For IT professionals
      • Bring data science to the data: Give your data science team more freedom while reducing the risk and cost of silos
      • Secure by default: Leverage common security and governance across workloads
      • Run anywhere: On-premises or in the cloud
  • 20. CASE STUDY: Transforming Business Decision-Making with Machine Learning at Scale
      Background:
      • Retail client aimed to use clustering to understand their most common types of transactions
      • And to find which groups of products tend to be purchased together
      • Cloudera cluster, storing 2 billion rows of historical transaction data
      • Used CDSW to build custom clustering workflow in Spark and Python
      [Image: representative image of clustering]
  • 21. CASE STUDY: Transforming Business Decision-Making with Machine Learning at Scale
      Result:
      • Clusters describe transactions with far more nuance than the simple category-level aggregations that were previously in use
      • Identified major trends in certain types of transaction, worth multiples of $100M
      • Clusters are transforming how the company thinks about its business, from shop floor to board level
      • Clustering workflow is easily maintainable, reproducible, and scalable
      [Image: representative image of clustering]
  • 22. CASE STUDY: Transforming Business Decision-Making with Machine Learning at Scale
      Benefits of CDSW:
      • Easy access to big datasets from Cloudera HDFS
      • Access to Spark to apply clustering on entire 2 billion row dataset
      • Notebook environment allows data scientists to innovate while staying within secure Cloudera environment
      • Collaborative environment enabling organized project structure and collaboration within team of data scientists
      [Image: representative image of clustering]
  • 23. LIVE DEMO
  • 24. Introducing the Data Science Sandbox. Lovan Chetty, VP, Product
  • 25. SOLUTION: Fully-Managed Data Science Sandbox as a Service
      Fully-Managed, Complete Cloud Platform for Analytics and Data Science. DevOps Built-In, Cloudera & Cloud IaaS Included. Fast Setup, Ready in Hours.
      • Data Science Workbench
      • EDH Stack + Option for Altus PaaS & More…
      • Cloud IaaS (Fully-Managed) + BYOL options
      • End to End Management (Cloud>Cluster>Workload)
      • 24x7 Production DevOps
      • Security, Governance & Compliance
      • Workload Optimization
  • 26. WHY CLOUD? Fully-Managed Data Science Sandbox
      The Fastest, Most Cost-Effective Way to Expand or Deploy a Modern Platform for Data Science in the Cloud.
      Benefits:
      • Ready Now, with No New Resources: 24x7 Production DevOps & Monitoring
      • Secure, Enterprise-Ready: Hybrid Gateways, Governance, Compliance
      • Simple: All-in-one solutions for agility, flexibility in analytics & tools
      • Cost-Effective: ½ TCO, Best price-performance, SLA Optimization
      www.cazena.com/cloudera
  • 27. Q&A
  • 28. Q&A - TECHNICAL PANELISTS
      • Lovan Chetty, VP, Products
      • Dr. Daniel Parton, Lead Data Scientist
      • Mark Chisam, Senior Solution Engineer
  • 29. The Data Science Sandbox as a Service. Try it Now with the FastStart Business Value Pilot: 4 Weeks to a Guaranteed Business Outcome.
      • Bardess: Data Science & Management Consulting. Philip Duplisey, Senior Director of Consulting, Bardess.com
      • Cazena: Fully-Managed Cloudera Solutions for Azure & AWS. Sam Berg, VP Sales, Cazena.com
      • Cloudera: The Modern Platform for Data Science and Analytics. Tia Watson, Partner Manager, Cloudera.com
  • 30. THANK YOU