© Cloudera, Inc. All rights reserved.
Security Implementation on Hadoop
Dr. Wei-Chiu Chuang | Software Engineer
$ whoami
Software Engineer, Cloudera
Apache Hadoop Committer/PMC
Unguarded data stores are the victims
Regulatory Compliance
Organizations that breach GDPR can be fined up to 4% of annual global turnover or €20 million, whichever is higher
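The fine ceiling is a simple arithmetic rule, sketched below. This is a minimal illustration that assumes the higher of the two caps applies (as for the top tier of GDPR fines); the function name `max_gdpr_fine_eur` and the sample turnover figures are hypothetical.

```python
def max_gdpr_fine_eur(annual_global_turnover_eur: float) -> float:
    """Upper bound of a top-tier GDPR fine, in euros.

    Assumption: the cap is the higher of 4% of annual global turnover
    and a flat EUR 20 million.
    """
    four_percent = annual_global_turnover_eur * 4 / 100
    return max(four_percent, 20_000_000.0)


# Hypothetical turnovers: EUR 100 million and EUR 1 billion.
small_company_cap = max_gdpr_fine_eur(100_000_000)
large_company_cap = max_gdpr_fine_eur(1_000_000_000)
```

For the smaller company the flat €20 million cap dominates; for the larger one the 4% term does.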
Security Implementation
Disclaimer
This talk serves as a general guideline for security implementation on Hadoop.
The actual implementation procedures and scope of implementation vary on a case-by-case basis, and should be assessed by Cloudera’s Professional Services team or certified Cloudera SI Partners.
Non-secure #0
Data Free for All
[Diagram: a datacenter with a firewall, Active Directory/KDC, the Hadoop cluster, Cloudera Manager, a gateway node, Cloudera Navigator, and applications.]
High Availability made Easy
Identity Management
Simple Authentication
File group ownership
• AD integration via SSSD or Centrify
Consideration in large enterprises.
System Diagram #0
[Diagram: firewall, Active Directory, two master nodes, three worker nodes, and Cloudera Manager; cluster hosts are labeled SSSD/Centrify.]
Simple authentication = no authentication
Minimal Security #1
Reduce Risk Exposure
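Kerberos
Strong Authentication
KDC
• MIT
• Active Directory (more common)
Principal: user@EXAMPLE.COM (primary: user, realm: EXAMPLE.COM)
[Diagram: a user authenticates as user@EXAMPLE.COM against the KDC of realm EXAMPLE.COM and then accesses Hadoop.]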
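The primary/realm split of a principal name can be sketched as a small parser. This is an illustration only: it also handles the optional instance component that Kerberos service principals use (`primary/instance@REALM`), which the slide does not show, and it ignores the escaping rules real Kerberos libraries implement. The names `Principal` and `parse_principal` and the host `nn1.example.com` are hypothetical.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str
    instance: Optional[str]  # e.g. a hostname for service principals
    realm: str


def parse_principal(name: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its components (simplified)."""
    left, sep, realm = name.rpartition("@")
    if not sep or not left or not realm:
        raise ValueError(f"not a full principal name: {name!r}")
    primary, _, instance = left.partition("/")
    return Principal(primary, instance or None, realm)


user = parse_principal("user@EXAMPLE.COM")
service = parse_principal("hdfs/nn1.example.com@EXAMPLE.COM")
```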
Kerberos
Considerations in large enterprises
Time synchronization
CM Kerberos Wizard
• Configure AD to create a Kerberos principal for the CM server, and to delegate to CM the ability to create and manage Kerberos principals
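Time synchronization matters because Kerberos rejects requests when client and KDC clocks drift too far apart. The check below is a minimal sketch assuming the common default tolerance of five minutes; the actual tolerance is site-configurable, and the names `MAX_CLOCK_SKEW` and `within_skew` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: five minutes is the usual default tolerance; sites may change it.
MAX_CLOCK_SKEW = timedelta(minutes=5)


def within_skew(client_time: datetime, kdc_time: datetime,
                tolerance: timedelta = MAX_CLOCK_SKEW) -> bool:
    """Return True if the two clocks differ by no more than the tolerance."""
    return abs(client_time - kdc_time) <= tolerance


kdc_now = datetime(2017, 10, 3, 12, 0, 0, tzinfo=timezone.utc)
ok_host = within_skew(kdc_now + timedelta(minutes=2), kdc_now)
drifted_host = within_skew(kdc_now - timedelta(minutes=7), kdc_now)
```

Running NTP (or an equivalent) on every cluster host and on the KDC keeps hosts inside this window.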
LDAP Authentication
• LDAP over SSL
Authorization/Access Control
Access Control Lists (ACLs)
• HDFS file ACLs
• YARN job submission ACLs
• HBase ACLs
• Oozie ACLs
Sentry-managed role-based access control (RBAC)
• Hive
• Impala
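The role-based model can be sketched as two lookups: groups are granted roles, and roles carry privileges on objects. This is a simplified illustration of the RBAC idea, not Sentry's API; the group, role, and table names are hypothetical sample data.

```python
# Hypothetical sample data: groups map to roles, roles map to privileges.
GROUP_ROLES = {
    "analysts": {"read_sales"},
    "etl": {"read_sales", "write_sales"},
}
ROLE_PRIVILEGES = {
    "read_sales": {("sales_db.orders", "SELECT")},
    "write_sales": {("sales_db.orders", "INSERT")},
}


def is_allowed(user_groups, obj, action):
    """Return True if any role granted to any of the user's groups
    carries the (object, action) privilege."""
    for group in user_groups:
        for role in GROUP_ROLES.get(group, ()):
            if (obj, action) in ROLE_PRIVILEGES.get(role, ()):
                return True
    return False


analyst_can_read = is_allowed(["analysts"], "sales_db.orders", "SELECT")
analyst_can_write = is_allowed(["analysts"], "sales_db.orders", "INSERT")
```

Privileges attach to roles rather than to individual users, so access changes when group membership changes in the directory.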
Auditing
Backup/Disaster Recovery
Cloudera Backup/Disaster Recovery (BDR)
• A high-performance data replicator
• Copies incremental changes from the source cluster on a specified schedule
Supports
• Kerberos
• Data encryption
• HDFS replication to cloud
Kerberized BDR Best Practice
[Diagram: Cloudera BDR replicates from the Production cluster (realm PROD.EXAMPLE.COM) to the DR cluster (realm DR.EXAMPLE.COM); the two KDCs are linked by a cross-realm trust.]
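In Kerberos, a cross-realm trust is represented by `krbtgt` principals that name both realms: a client in realm A can obtain tickets for services in realm B when `krbtgt/B@A` exists with the same key in both KDCs. The helper below only generates those principal names for the two realms on the slide; the function name is hypothetical, and whether a one-way or two-way trust is required depends on the deployment.

```python
def cross_realm_principals(realm_a: str, realm_b: str, two_way: bool = True):
    """Names of the krbtgt principals that represent a trust between realms.

    krbtgt/B@A lets principals from realm A reach services in realm B.
    """
    names = [f"krbtgt/{realm_b}@{realm_a}"]
    if two_way:
        names.append(f"krbtgt/{realm_a}@{realm_b}")
    return names


trust = cross_realm_principals("PROD.EXAMPLE.COM", "DR.EXAMPLE.COM")
```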
System Diagram #1
[Diagram: firewall, Active Directory/KDC, two master nodes, three worker nodes, Cloudera Manager, and a DR cluster; the cluster is labeled Kerberos and SSSD/Centrify.]
More Security #2
Managed, Secure, Protected
Data In-Transit Encryption
RPC encryption
Data transport encryption
• Supports AES-CTR with key lengths up to 256 bits
HTTP TLS/SSL encryption
• No self-signed certificates in production
[Diagram: an application, two master nodes, and three worker nodes, with links labeled RPC encryption, transport encryption, and TLS/SSL.]
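The three layers map onto a handful of Hadoop configuration properties. The sketch below parses a Hadoop-style configuration XML string and reports which of the expected settings are absent or different. The property names are Apache Hadoop configuration keys as commonly documented and should be verified against the release in use; on a Cloudera Manager cluster they are normally set through CM rather than by editing files. The helper names and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

# Expected values for the layers named on the slide (verify for your release).
EXPECTED = {
    "hadoop.rpc.protection": "privacy",                 # RPC encryption
    "dfs.encrypt.data.transfer": "true",                # data transport encryption
    "dfs.encrypt.data.transfer.cipher.suites": "AES/CTR/NoPadding",
    "dfs.encrypt.data.transfer.cipher.key.bitlength": "256",
    "dfs.http.policy": "HTTPS_ONLY",                    # TLS/SSL for HDFS web endpoints
}


def load_properties(xml_text: str) -> dict:
    """Read <property><name>/<value> pairs from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def missing_settings(conf: dict) -> dict:
    """Return the expected settings that are absent or set to another value."""
    return {k: v for k, v in EXPECTED.items() if conf.get(k) != v}


SAMPLE_XML = """
<configuration>
  <property><name>hadoop.rpc.protection</name><value>privacy</value></property>
  <property><name>dfs.encrypt.data.transfer</name><value>false</value></property>
</configuration>
"""
gaps = missing_settings(load_properties(SAMPLE_XML))
```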
Data At-Rest Encryption
Transparent encryption
Supports any Hadoop application
Encryption Zone
$ hadoop key create mykey
$ hadoop fs -mkdir /zone
$ hdfs crypto -createZone -keyName mykey -path /zone
[Diagram: a directory tree showing /, /tmp, and /zone, with files foo and bar; /zone is marked as the encryption zone.]
Key Management Server Deployment (non-prod)
[Diagram: the HDFS NameNode and the client talk to a Java KeyStore KMS, which keeps keys in a keystore file.]
Separation of duties
• The Encryption Zone Key (EZK) is stored in the KMS server
• The HDFS superuser cannot decrypt files
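The separation of duties follows from envelope encryption: each file gets its own data encryption key (DEK), HDFS stores only the encrypted form (EDEK), and only the KMS can unwrap it with the EZK. The toy model below illustrates that flow under loud assumptions: `toy_cipher` is a SHA-256 keystream XOR standing in for AES-CTR and is not secure, and `ToyKMS`, its methods, and the user names are hypothetical, not the Hadoop KMS API.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 counter keystream. Toy stand-in for AES-CTR; NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


class ToyKMS:
    """Holds EZKs and decides who may unwrap EDEKs; EZKs never leave it."""

    def __init__(self):
        self._ezks = {}         # key name -> EZK bytes
        self._decrypt_acl = {}  # key name -> users allowed to unwrap EDEKs

    def create_key(self, name, allowed_users):
        self._ezks[name] = secrets.token_bytes(32)
        self._decrypt_acl[name] = set(allowed_users)

    def generate_edek(self, name) -> bytes:
        """Create a fresh DEK and return only its encrypted form (EDEK)."""
        dek = secrets.token_bytes(32)
        return toy_cipher(self._ezks[name], dek)

    def decrypt_edek(self, name, edek, user) -> bytes:
        if user not in self._decrypt_acl[name]:
            raise PermissionError(f"{user} may not decrypt keys for {name}")
        return toy_cipher(self._ezks[name], edek)


kms = ToyKMS()
kms.create_key("mykey", allowed_users={"alice"})
edek = kms.generate_edek("mykey")               # stored with the file's metadata
dek = kms.decrypt_edek("mykey", edek, "alice")  # the client unwraps via the KMS
ciphertext = toy_cipher(dek, b"payroll data")   # what lands on disk
```

In this model the storage layer only ever holds `edek` and `ciphertext`; a user the KMS does not authorize, such as `hdfs`, gets a `PermissionError` instead of the DEK.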
Key Management Server/Key Trustee Server Deployment
[Diagram: the HDFS NameNode and the client connect to two (or more) Key Trustee KMS instances; behind a firewall, an active Key Trustee Server synchronizes with a passive Key Trustee Server.]
KMS+KTS+HSM Deployment
[Diagram: the HDFS NameNode and the client connect to two (or more) HSM KMS instances; behind a firewall, an active and a passive Key Trustee Server synchronize, each with a Key HSM service backed by an HSM.]
Encryption Performance
Troubleshooting: Encryption Performance Anomaly
• Configuration
• AES-NI Hardware acceleration
• OpenSSL library
• Entropy
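Two of these checks can be scripted on a Linux host. The sketch below looks for the `aes` CPU flag that signals AES-NI support and reads the kernel's available-entropy counter. It assumes an x86 Linux system with `/proc/cpuinfo` and `/proc/sys/kernel/random/entropy_avail`; the helper names are hypothetical. Whether Hadoop actually loads the OpenSSL native library is a separate check, which `hadoop checknative` reports.

```python
from pathlib import Path


def has_aes_ni(cpuinfo_text: str) -> bool:
    """True if any 'flags' line in /proc/cpuinfo-style text lists the 'aes' flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "aes" in value.split():
            return True
    return False


def read_or_none(path: str):
    """Return the file's text, or None if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


cpuinfo = read_or_none("/proc/cpuinfo")
entropy = read_or_none("/proc/sys/kernel/random/entropy_avail")
report = {
    "aes_ni": has_aes_ni(cpuinfo) if cpuinfo is not None else None,
    "entropy_avail": int(entropy) if entropy and entropy.strip().isdigit() else None,
}
```

A value of `None` in `report` means the check could not be performed on this host, not that the feature is absent.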
Fine-Grained Access Control with Apache Sentry
System Diagram #2
[Diagram: firewall, Active Directory/KDC, two master nodes, three worker nodes, Cloudera Manager, and two KMS instances; two Key Trustee Servers sit behind a second firewall; the cluster is labeled Kerberos and SSSD/Centrify.]
Most Security #3
Secure Data Vault
Data Redaction
Personally Identifiable Information (PII)
• PCI-DSS, HIPAA
Best practice
Passwords
• Store in credential files, not in configuration
Logs and queries
• Redact through Cloudera Manager
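Redaction of logs and queries is typically rule-based: each rule pairs a search pattern with a replacement. The sketch below shows the idea with three illustrative regular expressions; the patterns, replacement strings, and function name are hypothetical and are not Cloudera Manager's rule set.

```python
import re

# Illustrative rules only: (pattern, replacement), applied in order.
RULES = [
    (re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b"), "[REDACTED-CARD]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED-SSN]"),
    (re.compile(r"(password\s*=\s*)\S+", re.IGNORECASE), r"\1[REDACTED]"),
]


def redact(line: str) -> str:
    """Apply every redaction rule to one log line or query string."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line


clean = redact("SELECT * FROM t WHERE card = '4111-1111-1111-1111'")
```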
Full Encryption
Encrypt Data Spills
• MapReduce
• Impala
• Hive
• Flume
OS-level encryption
• Navigator Encrypt
Security Vulnerabilities
Vulnerability Response and Process
[Diagram: vulnerability reports (upstream, internal, external) lead to a fix, which is then published as a CVE and a Cloudera TSB.]
Cloudera Certified Technology
Cloudera Certified Technology Partners
[Figure: partner logos grouped by category: Data Sources; Data Ingest; Process, Refine & Prep; Data Discovery; Advanced Analytics; Connected Machines/Data Sources; Other Data Sources.]
Certification ensures that a product integrates with a secure cluster
Authentication
• Authenticate via Kerberos or LDAP
Authorization
• Work with Apache Sentry for Hive, Impala, Search, and HDFS
Encryption
• Support HDFS transport encryption and at-rest encryption; support SSL/TLS connection encryption
Cloudera SDX
Cloudera Enterprise
The modern platform for machine learning and analytics optimized for the cloud
[Diagram labels: Extensible Services; Core Services; Data Engineering; Operational Database; Analytic Database; Data Science; Data Catalog; Ingest & Replication; Security; Governance; Workload Management; Storage Services: S3, ADLS, HDFS, Kudu.]
Open platform services
Built for multi-function analytics | Optimized for cloud
• Unified security: protects sensitive data with consistent controls, even for transient and recurring workloads
• Consistent governance: enables secure self-service access to all relevant data and increases compliance
• Easy workload management: increases user productivity and boosts job predictability
• Flexible ingest and replication: aggregates a single copy of all data, provides disaster recovery, and eases migration
• Shared catalog: defines and preserves structure and business context of data for new applications and partner solutions
Successful use cases
Cloudera Overview & Financial Services Focus
• 2000+ strong partner ecosystem
• 1600+ employees globally
Strong Focus & Momentum in Financial Services
• 200+ financial services customers
• 19 of the 30 G-SIBs run on Cloudera
• 3 of the Fortune 500 top 5 insurers run on Cloudera
• 5 of the top 6 asset management firms run on Cloudera
Building a Fantastic Customer Experience
• Improved customer experience
• 80 percent reduction in operating costs through a wide range of customer service and operational improvements
• Decrease in cost to service customers while increasing revenue through better service
CUSTOMER 360
FINANCIAL SERVICES
» PREDICTIVE ANALYTICS
» 360 CUSTOMER VIEW
» OPERATIONAL ANALYTICS
Large healthcare provider enables practitioners to recommend at-home actions to prevent hospital visits
• Flexible, automatic data classification for diverse medical ontologies
• Self-service data discovery for real-time, data-driven decisions
Thank you
Wei-Chiu Chuang | weichiu@cloudera.com
More information on Hadoop Security
Books authored by Clouderans
theory-slides-for react for beginners.pptx
sanchezvanessa7896
 
Compiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptxCompiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptx
RushaliDeshmukh2
 
Avnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights FlyerAvnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights Flyer
WillDavies22
 
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
charlesdick1345
 
Introduction to FLUID MECHANICS & KINEMATICS
Introduction to FLUID MECHANICS &  KINEMATICSIntroduction to FLUID MECHANICS &  KINEMATICS
Introduction to FLUID MECHANICS & KINEMATICS
narayanaswamygdas
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
Machine learning project on employee attrition detection using (2).pptx
Machine learning project on employee attrition detection using (2).pptxMachine learning project on employee attrition detection using (2).pptx
Machine learning project on employee attrition detection using (2).pptx
rajeswari89780
 
some basics electrical and electronics knowledge
some basics electrical and electronics knowledgesome basics electrical and electronics knowledge
some basics electrical and electronics knowledge
nguyentrungdo88
 
Compiler Design_Lexical Analysis phase.pptx
Compiler Design_Lexical Analysis phase.pptxCompiler Design_Lexical Analysis phase.pptx
Compiler Design_Lexical Analysis phase.pptx
RushaliDeshmukh2
 
IntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdfIntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdf
Luiz Carneiro
 
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design ThinkingDT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DhruvChotaliya2
 
QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)
rccbatchplant
 
Smart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineeringSmart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineering
rushikeshnavghare94
 
Metal alkyne complexes.pptx in chemistry
Metal alkyne complexes.pptx in chemistryMetal alkyne complexes.pptx in chemistry
Metal alkyne complexes.pptx in chemistry
mee23nu
 
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptxLidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
RishavKumar530754
 
Level 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical SafetyLevel 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical Safety
JoseAlbertoCariasDel
 
Ad

Hadoop security implementation 20171003

  • 1. 1© Cloudera, Inc. All rights reserved. Security Implementation on Hadoop Dr. Wei-Chiu Chuang | Software Engineer
  • 2. $ whoami: Software Engineer, Cloudera; Apache Hadoop Committer/PMC
  • 3. Unguarded data stores are the victims
  • 4. Regulatory Compliance: Organizations can be fined up to 4% of annual global turnover or €20 million for breaching GDPR
  • 5. Security Implementation
  • 6. Disclaimer: This talk serves as a general guideline for security implementation on Hadoop. The actual implementation procedures and scope of implementation vary on a case-by-case basis, and should be assessed by Cloudera’s Professional Services team or certified Cloudera SI Partners.
  • 7. Non-secure #0: Data Free for All
  • 8. [Diagram: Firewall, ActiveDirectory/KDC, Hadoop cluster, Cloudera Manager, Gateway node, Cloudera Navigator, Datacenter, Applications]
  • 9. High Availability made Easy
  • 10. Identity Management: Simple Authentication. File group ownership • AD integration • SSSD or Centrify. Consideration in large enterprises. SSSD via
  • 11. System Diagram #0 [Diagram: Firewall, ActiveDirectory, Master, Worker, Worker, Worker, Cloudera Manager, Master, (SSSD/Centrify)]
  • 12. Simple authentication = no authentication
  • 13. Minimal Security #1: Reduce Risk Exposure
  • 14. Kerberos: Strong Authentication. KDC • MIT • ActiveDirectory (more common). [Diagram: EXAMPLE.COM, KDC, user@EXAMPLE.COM, Hadoop; in the principal user@EXAMPLE.COM, "user" is the primary and "EXAMPLE.COM" is the realm]
  • 15. Kerberos: Consideration in large corporates. Time synchronization. CM Kerberos Wizard • Configure AD to create a Kerberos principal for CM server, and to delegate CM the ability to create/manage Kerberos principals
  • 16. LDAP Authentication * LDAP over SSL
  • 17. Authorization/Access Control. Access Control Lists (ACLs): HDFS File ACL, YARN job submission, HBase ACLs, Oozie ACL. Sentry Managed (RBAC): Hive, Impala
  • 18. Auditing
  • 19. Backup/Disaster Recovery: Cloudera Backup/Disaster Recovery (BDR) • A high performance data replicator • Copies incremental data on the source cluster at specified schedules. Supports • Kerberos • Data encryption • HDFS replication to cloud
  • 20. Kerberized BDR Best Practice [Diagram: Production, DR, Cloudera BDR, PROD.EXAMPLE.COM, Cross-realm trust, KDC, KDC, DR.EXAMPLE.COM]
  • 21. System Diagram #1 [Diagram: Firewall, ActiveDirectory/KDC, Master, Worker, Worker, Worker, Cloudera Manager, Kerberos, Master, (SSSD/Centrify), DR]
  • 22. More Security #2: Managed, Secure, Protected
  • 23. Data In-Transit Encryption: RPC encryption. Data transport encryption • Supports AES CTR, up to 256-bit key length. HTTP TLS/SSL encryption • No self-signed certificates in production. [Diagram: Master, Worker, Worker, Worker, Master, Application, RPC encryption, Transport encryption, TLS/SSL]
  • 24. Data At-Rest Encryption: Transparent encryption. Supports any Hadoop application. Encryption Zone: $ hadoop key create mykey $ hadoop fs -mkdir /zone $ hdfs crypto -createZone -keyName mykey -path /zone [Diagram: /, /tmp, /zone, foo, bar, Encryption zone]
  • 25. Key Management Server Deployment (non-prod) [Diagram: HDFS NameNode, Client, Java Keystore KMS, Keystore file] Separation of duties • Encryption Zone Key (EZK) is stored in KMS server • HDFS super user cannot decrypt files
  • 26. Key Management Server/Key Trustee Server Deployment [Diagram: HDFS NameNode, Client, Key Trustee KMS, Key Trustee KMS, Firewall, Key Trustee Server (Active), Key Trustee Server (Passive), synchronization, (or more)]
  • 27. KMS+KTS+HSM Deployment [Diagram: HDFS NameNode, Client, HSM KMS, HSM KMS, Firewall, Key Trustee Server (Active), Key Trustee Server (Passive), synchronization, Key HSM, (or more), Key HSM, HSM, HSM]
  • 28. Encryption Performance
  • 29. Troubleshooting: Encryption Performance Anomaly • Configuration • AES-NI Hardware acceleration • OpenSSL library • Entropy
  • 30. Fine Grained Access Control with Apache Sentry
  • 31. System Diagram #2 [Diagram: Firewall, ActiveDirectory/KDC, Master, Worker, Worker, Worker, Cloudera Manager, Kerberos, Master, KMS, KMS, Firewall, KeyTrustee, KeyTrustee, (SSSD/Centrify)]
  • 32. Most Security #3: Secure Data Vault
  • 33. Data Redaction: Personally Identifiable Information • PCI-DSS, HIPAA. Best practice: Password • store in credential files, not in configuration. Log, queries • Cloudera Manager
  • 34. Full Encryption: Encrypt Data Spills • MapReduce • Impala • Hive • Flume. OS-level encryption • Navigator Encrypt
  • 35. Security Vulnerabilities
  • 36. Vulnerability Response and Process [Diagram: Vulnerability reports, Upstream, Internal, External, Fix, Publish, CVE, Cloudera TSB]
  • 37. Cloudera Certified Technology
  • 38. Cloudera Certified Technology Partners [Diagram: Data Sources; Data Ingest; Process, Refine & Prep; Data Discovery; Advanced Analytics; Connected Machines/Data sources; Other Data Sources]
  • 39. Certification ensures that a product integrates with a secure cluster • Authentication: authenticate via Kerberos or LDAP • Authorization: handle Apache Sentry with Hive, Impala, Search, HDFS • Encryption: support HDFS transport encryption, at-rest encryption; support SSL/TLS connection encryption
  • 40. Cloudera SDX
  • 41. Cloudera Enterprise: The modern platform for machine learning and analytics optimized for the cloud [Diagram: EXTENSIBLE SERVICES, CORE SERVICES, DATA ENGINEERING, OPERATIONAL DATABASE, ANALYTIC DATABASE, DATA CATALOG, INGEST & REPLICATION, SECURITY, GOVERNANCE, WORKLOAD MANAGEMENT, DATA SCIENCE, S3, ADLS, HDFS, KUDU, STORAGE SERVICES]
  • 42. Open platform services: Built for multi-function analytics | Optimized for cloud • Unified security – protects sensitive data with consistent controls, even for transient and recurring workloads • Consistent governance – enables secure self-service access to all relevant data and increases compliance • Easy workload management – increases user productivity and boosts job predictability • Flexible ingest and replication – aggregates a single copy of all data, provides disaster recovery, and eases migration • Shared catalog – defines and preserves structure and business context of data for new applications and partner solutions
  • 43. Successful use cases
  • 44. Cloudera Overview & Financial Services Focus • 2000+ Strong Partner Ecosystem • 1600+ Employees Globally. Strong Focus & Momentum in Financial Services • 19 of the 30 G-SIBs Run on Cloudera • 3 of the Fortune 500 Top 5 Insurers Run on Cloudera • 5 of the Top 6 Asset Management Firms Run on Cloudera • 200+ Financial Services Customers
  • 45. Building a Fantastic Customer Experience • Improved customer experience • 80 percent reduction in operating costs through a wide range of customer service and operational improvements • Decrease in cost to service customers while increasing revenue through better service. CUSTOMER 360 FINANCIAL SERVICES » PREDICTIVE ANALYTICS » 360 CUSTOMER VIEW » OPERATIONAL ANALYTICS
  • 46. Large healthcare provider enables practitioners to recommend at-home actions to prevent hospital visits • Flexible, automatic data classification for diverse medical ontologies • Self-service data discovery for real-time, data-driven decisions
  • 47. Thank you Wei-Chiu Chuang | [email protected]
  • 48. More information on Hadoop Security
  • 49. Books authored by Clouderans
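
Note on the Kerberos slide (Strong Authentication): the slide labels the two halves of user@EXAMPLE.COM as "primary" and "realm". A minimal sketch of that naming scheme follows, assuming the common primary[/instance]@REALM form. The Principal type, the parse_principal helper, and the host name master1.example.com are hypothetical and only for illustration; the sketch does not handle escaped characters in principal names.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    """Parts of a Kerberos principal name (illustrative type, not a library API)."""
    primary: str
    instance: Optional[str]
    realm: str


def parse_principal(name: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its parts.

    Assumption: no escaped '@' or '/' characters appear in the name.
    """
    local, sep, realm = name.rpartition("@")
    if not sep or not local or not realm:
        raise ValueError(f"not a fully qualified principal: {name!r}")
    primary, _, instance = local.partition("/")
    if not primary:
        raise ValueError(f"missing primary in principal: {name!r}")
    return Principal(primary, instance or None, realm)


# User principal from the slide: primary "user", realm "EXAMPLE.COM".
print(parse_principal("user@EXAMPLE.COM"))
# Service principals usually carry a host as the instance (hypothetical host name).
print(parse_principal("hdfs/master1.example.com@EXAMPLE.COM"))
```

Splitting on the last "@" first keeps the realm intact before the optional instance is separated from the primary.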
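
Note on the Troubleshooting: Encryption Performance Anomaly slide: two of the listed items, AES-NI hardware acceleration and entropy, can be inspected on a Linux host. The following is a rough sketch, assuming an x86 Linux machine where /proc/cpuinfo has a "flags" line and /proc/sys/kernel/random/entropy_avail exists. The helper names are hypothetical, and the sketch only reports what it finds; it does not judge whether a value is acceptable.

```python
from pathlib import Path
from typing import Optional


def has_aes_ni(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "aes" in value.split():
            return True
    return False


def read_text(path: str) -> Optional[str]:
    """Read a file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


cpuinfo = read_text("/proc/cpuinfo")
entropy = read_text("/proc/sys/kernel/random/entropy_avail")

if cpuinfo is None:
    print("AES-NI: unknown (/proc/cpuinfo not readable on this host)")
else:
    print("AES-NI:", "present" if has_aes_ni(cpuinfo) else "not reported")

if entropy is None:
    print("Entropy: unknown (entropy_avail not readable on this host)")
else:
    print("Entropy available:", entropy.strip())
```

For the OpenSSL library item, `hadoop checknative -a` reports whether Hadoop's native code can load OpenSSL on that node.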
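
Note on the Data Redaction slide: redacting logs and queries is typically done with search-and-replace rules based on regular expressions. The sketch below shows the general idea only. The two rules, the RULES list, and the redact function are assumptions made for illustration and are not Cloudera Manager's actual rule set or API.

```python
import re

# Hypothetical redaction rules: (compiled pattern, replacement).
RULES = [
    # 16-digit card-like numbers, optionally grouped by spaces or dashes.
    (re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b"), "XXXX-XXXX-XXXX-XXXX"),
    # password=<value> pairs; the key is kept and the value is masked.
    (re.compile(r"(password\s*=\s*)\S+", re.IGNORECASE), r"\1[REDACTED]"),
]


def redact(line: str) -> str:
    """Apply every rule in order to one log or query line."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line


print(redact("SELECT * FROM payments WHERE card = '4111 1111 1111 1111'"))
print(redact("connecting with user=etl password=hunter2"))
```

Rules like these are only a backstop: as the slide says, passwords belong in credential files rather than in configuration in the first place.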