Securing the Hadoop Ecosystem
ATM (Cloudera) & Tucu (Cloudera)
Hadoop Summit, June 2013
Why is Security Important?
[Figure: pictures labeled ATM and Tucu]
Agenda
• Hadoop Ecosystem Interactions
• Security Concepts
• Authentication
• Authorization
• Confidentiality
• Auditing
• IT Infrastructure Integration
• Deployment Recommendations
Hadoop on its Own
[Diagram: a Hadoop cluster (NN, SNN, JT, and three DN/TT nodes running Map and Reduce tasks) together with HttpFS and three clients: MR client, HDFS client, and WebHdfs client.]
Legend: service users (hdfs, httpfs & mapred); end users; protocols: RPC/data transfer/HTTP
Hadoop and Friends
[Diagram: Hadoop at the center, surrounded by services (Hive Metastore, HBase, Oozie, Hue, Impala, Zookeeper, Flume) and clients (MapRed, Pig, Crunch, Cascading, Sqoop, Hive, HBase, Oozie, Impala, Flume, Zookeeper, WebHdfs, browser). Connections are labeled RPC, HTTP, Thrift, and Avro RPC.]
Legend: service users; end users; protocols: RPCs/data/HTTP/Thrift/Avro-RPC
Authentication / Authorization
• Authentication
  • End users to services, as a user: user credentials
  • Services to services, as a service: service credentials
  • Services to services, on behalf of a user: service credentials + trusted service
  • Job tasks to services, on behalf of a user: job delegation token
• Authorization
  • Data: HDFS, HBase, Hive Metastore, Zookeeper
  • Jobs: who can submit, view or manage jobs (MR, Pig, Oozie, Hue, …)
  • Queries: who can run queries (Impala)
Confidentiality / Auditing
• Confidentiality
  • Data at rest (on disk)
  • Data in transit (on the network)
• Auditing
  • Who accessed (read/write) data
  • Who submitted, managed or viewed a Job or a Query
Authentication Details
• End users to services, as a user
  • CLI & libraries: Kerberos (kinit or keytab)
  • Web UIs: Kerberos SPNEGO & pluggable HTTP auth
• Services to services, as a service
  • Credentials: Kerberos (keytab)
• Services to services, on behalf of a user
  • Proxy-user (after Kerberos for service)
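The proxy-user case can be illustrated with a minimal sketch. It assumes a simplified model of Hadoop's `hadoop.proxyuser.<service>.hosts` and `hadoop.proxyuser.<service>.groups` settings. The `ProxyRule` class, the `may_impersonate` function, and the sample host and group names are hypothetical and are not Hadoop's own implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyRule:
    """Hypothetical model of one service's proxy-user settings."""
    hosts: set = field(default_factory=set)    # hosts the service may call from; "*" means any
    groups: set = field(default_factory=set)   # groups whose members may be impersonated; "*" means any


def may_impersonate(rules, service, remote_host, user_groups):
    """Return True if the already-authenticated `service` may act for a user.

    The service is assumed to have proven its own identity first (Kerberos);
    this function only models the second step, the trust decision.
    """
    rule = rules.get(service)
    if rule is None:
        return False  # the service is not a trusted proxy at all
    host_ok = "*" in rule.hosts or remote_host in rule.hosts
    group_ok = "*" in rule.groups or bool(rule.groups & set(user_groups))
    return host_ok and group_ok


# Illustrative rule: 'oozie' may act for members of 'analysts', but only from its own host.
rules = {"oozie": ProxyRule(hosts={"oozie1.example.com"}, groups={"analysts"})}
```

Under these assumptions, a request from `oozie` on `oozie1.example.com` on behalf of a user in `analysts` is accepted, while the same request from another host, or for a user outside `analysts`, is refused.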
Authorization Details
• HDFS Data
  • File System permissions (Unix-like user/group permissions)
• HBase Data
  • Read/Write Access Control Lists (ACLs) at table level
• Hive Metastore (Hive, Impala)
  • Leverages/proxies HDFS permissions for tables & partitions
• Hive Server (Hive, Impala) (coming)
  • More advanced GRANT/REVOKE with ACLs for tables
• Jobs (Hadoop, Oozie)
  • Job ACLs for Hadoop Scheduler Queues, manage & view jobs
• Zookeeper
  • ACLs at znodes, authenticated & read/write
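As a rough illustration of Unix-like user/group permissions, the sketch below checks a request against a 9-bit mode. The function name `is_allowed` and the sample users, groups, and modes are hypothetical. Superuser bypass and directory traversal checks are deliberately left out.

```python
READ, WRITE, EXECUTE = 4, 2, 1


def is_allowed(user, user_groups, owner, group, mode, wanted):
    """Check the `wanted` bits (READ/WRITE/EXECUTE) against a 9-bit mode.

    Exactly one permission class applies: owner, else group, else other.
    The owner class wins even if the group bits would grant more.
    """
    if user == owner:
        bits = (mode >> 6) & 7
    elif group in user_groups:
        bits = (mode >> 3) & 7
    else:
        bits = mode & 7
    return (bits & wanted) == wanted


# A file owned by atm:eng with mode 640 (rw- r-- ---).
print(is_allowed("atm", ["analysts"], "atm", "eng", 0o640, READ | WRITE))
print(is_allowed("tucu", ["eng"], "atm", "eng", 0o640, WRITE))
```

Because group membership feeds directly into the check, the result is only as trustworthy as the group mapping that supplies `user_groups`.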
Confidentiality Details
• Data in transit
  • RPC: using SASL
  • HDFS data: using SASL
  • HTTP: using SSL (web UIs, shuffle). Requires SSL certs
  • Thrift: not available (Hive Metastore, Impala)
  • Avro-RPC: not available (Flume)
• Data at rest
  • Nothing out of the box
  • Doable by: custom ‘compression’ codec or local file system encryption
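For RPC, confidentiality depends on which SASL quality of protection (QOP) is negotiated. The sketch below assumes the stock Apache Hadoop property `hadoop.rpc.protection` with its three values. The helper name `rpc_is_encrypted` is hypothetical.

```python
# Values of hadoop.rpc.protection and the SASL QOP each one requests.
RPC_PROTECTION_TO_SASL_QOP = {
    "authentication": "auth",    # authentication only
    "integrity": "auth-int",     # adds integrity checks on messages
    "privacy": "auth-conf",      # adds encryption of messages
}


def rpc_is_encrypted(protection):
    """Return True only when the chosen setting encrypts RPC traffic."""
    qop = RPC_PROTECTION_TO_SASL_QOP[protection]
    return qop == "auth-conf"


for value in RPC_PROTECTION_TO_SASL_QOP:
    print(value, "->", "encrypted" if rpc_is_encrypted(value) else "not encrypted")
```

Only the `privacy` setting provides confidentiality on the wire. The other two authenticate the peers but leave the payload readable.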
Auditing Details
• Who accessed (read/write) FS data
  • NN audit log contains all file opens, creates
  • NN audit log contains all metadata ops, e.g. rename, listdir
• Who submitted, managed, or viewed a Job or a Query
  • JT, RM, and Job History Server logs contain history of all jobs run on a cluster
• Who submitted, managed, or viewed a workflow
  • Oozie audit logs contain history of all user requests
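A minimal sketch of pulling "who accessed what" out of the NN audit log. It assumes the common tab-separated `key=value` layout after an `FSNamesystem.audit:` marker; the exact format can differ between Hadoop versions. The sample line is made up for illustration and `parse_audit_line` is a hypothetical helper.

```python
def parse_audit_line(line):
    """Split one audit line into a dict of its key=value fields.

    Returns an empty dict when the line does not look like an audit record.
    """
    _, _, payload = line.partition("FSNamesystem.audit:")
    fields = {}
    for item in payload.strip().split("\t"):
        key, sep, value = item.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Made-up sample line in the assumed layout.
sample = ("2013-06-26 10:15:00,123 INFO FSNamesystem.audit: allowed=true\t"
          "ugi=atm (auth:KERBEROS)\tip=/10.0.0.5\tcmd=open\t"
          "src=/user/atm/data.txt\tdst=null\tperm=null")

record = parse_audit_line(sample)
user = record["ugi"].split()[0]   # drop the "(auth:...)" suffix
print(user, record["cmd"], record["src"])
```

The same approach extends to metadata operations such as rename, where the `dst` field carries the target path.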
Auditing Gaps
• Not all projects have explicit audit logs
  • Audit-like information can be extracted by processing logs
  • E.g.: Impala query logs are distributed across all nodes
• It is difficult to correlate jobs & data access
  • E.g.: Map-Reduce jobs launched by a Pig job
  • E.g.: HDFS data accessed by a Map-Reduce job
IT Integration: Kerberos
• Users don’t want Yet Another Credential
• Corp IT doesn’t want to provision thousands of service principals
• Solution: local KDC + one-way trust
  • Run a KDC (usually MIT Kerberos) in the cluster
  • Put all service principals here
  • Set up one-way trust of central corporate realm by local KDC
  • Normal user credentials can be used to access Hadoop
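The effect of the one-way trust can be sketched by looking at the realm of each principal. The realm names follow the deck's own example (HADOOP.EXAMPLE.COM for the cluster, EXAMPLE.COM for the corporate directory). The helper names are hypothetical, and escaped characters in principal names are not handled.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str
    instance: Optional[str]
    realm: str


def parse_principal(name):
    """Parse 'primary[/instance]@REALM' into its parts."""
    local, sep, realm = name.rpartition("@")
    if not sep or not local or not realm:
        raise ValueError("expected primary[/instance]@REALM, got %r" % name)
    primary, _, instance = local.partition("/")
    return Principal(primary, instance or None, realm)


LOCAL_REALM = "HADOOP.EXAMPLE.COM"   # cluster KDC: holds the service principals
CORPORATE_REALM = "EXAMPLE.COM"      # central directory: holds the user principals


def accepted_by_cluster(name):
    """The local KDC trusts the corporate realm, so both realms are accepted."""
    return parse_principal(name).realm in (LOCAL_REALM, CORPORATE_REALM)


def accepted_by_corporate(name):
    """The trust is one-way: the corporate realm does not trust the cluster realm."""
    return parse_principal(name).realm == CORPORATE_REALM


print(accepted_by_cluster("atm@EXAMPLE.COM"))
print(accepted_by_corporate("hdfs/host1@HADOOP.EXAMPLE.COM"))
```

Corporate users reach Hadoop with their normal credentials, while the cluster's many service principals never need to exist in, or be trusted by, the corporate realm.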
IT Integration: Groups
• Much of Hadoop authorization uses “groups”
  • User ‘atm’ might belong to groups ‘analysts’, ‘eng’, etc.
• Users’ groups are not stored in Hadoop anywhere
  • Refers to external system to determine group membership
  • NN/JT/Oozie/Hive servers all must perform group mapping
• Default plugins for user/group mapping:
  • ShellBasedUnixGroupsMapping – forks/runs ‘/bin/id’
  • JniBasedUnixGroupsMapping – makes a system call
  • LdapGroupsMapping – talks directly to an LDAP server
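The pluggable mapping idea can be sketched as a lookup function that a server swaps in. This is an illustration in Python rather than Hadoop's Java plugins. `shell_groups` mimics the shell-based approach by running `id -Gn`, and the `GroupMapper` class with its small cache is a hypothetical addition for the example.

```python
import subprocess


def shell_groups(user):
    """Shell-based lookup: fork `id -Gn <user>` and split its output.

    Group names containing spaces would be split incorrectly by this sketch.
    """
    result = subprocess.run(["id", "-Gn", user],
                            capture_output=True, text=True, check=True)
    return result.stdout.split()


class GroupMapper:
    """Resolve a user's groups through a pluggable lookup, caching results."""

    def __init__(self, lookup=shell_groups):
        self._lookup = lookup
        self._cache = {}

    def groups(self, user):
        if user not in self._cache:
            self._cache[user] = list(self._lookup(user))
        return self._cache[user]


# A static lookup stands in for an LDAP-backed one in this example.
directory = {"atm": ["analysts", "eng"]}
mapper = GroupMapper(lookup=lambda user: directory.get(user, []))
print(mapper.groups("atm"))
```

Because every server performs this mapping independently, all of them need to resolve the same user to the same groups, which is one argument for a shared backend such as LDAP.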
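IT Integration: Kerberos + LDAP
[Diagram: a Hadoop cluster (NN, JT) with a local KDC holding service principals (hdfs/host1@HADOOP.EXAMPLE.COM, yarn/host2@HADOOP.EXAMPLE.COM, …); a central Active Directory holding user principals (tucu@EXAMPLE.COM, atm@EXAMPLE.COM, …); cross-realm trust between the local KDC and Active Directory; LDAP group mapping for NN and JT.]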
IT Integration: Web Interfaces
• Most web interfaces authenticate using SPNEGO
  • Standard HTTP authentication protocol
  • Used internally by services which communicate over HTTP
  • Most browsers support Kerberos SPNEGO authentication
• Hadoop components which use servlets for web interfaces can plug in a custom filter
  • Integrate with intranet SSO HTTP solution
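At the HTTP level, SPNEGO is a challenge/response exchange over the `WWW-Authenticate` and `Authorization` headers using the `Negotiate` scheme. The sketch below shows only that header handling. Token validation (normally done through GSSAPI/Kerberos) is passed in as a hypothetical `validate_token` callback, and the choice of 403 for a rejected token is an assumption of this sketch. Multi-round negotiation and the server's response token are omitted.

```python
import base64

CHALLENGE = {"WWW-Authenticate": "Negotiate"}


def handle_request(headers, validate_token):
    """Return (status, response_headers, user) for one HTTP request.

    `validate_token` takes the raw token bytes and returns a user name,
    or None if the token is rejected.
    """
    scheme, _, encoded = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "negotiate" or not encoded.strip():
        return 401, CHALLENGE, None      # ask the client to authenticate
    try:
        token = base64.b64decode(encoded.strip(), validate=True)
    except ValueError:                   # malformed base64
        return 401, CHALLENGE, None
    user = validate_token(token)
    if user is None:
        return 403, {}, None             # token was presented but rejected
    return 200, {}, user


# Stand-in validator for the example; a real one would call into GSSAPI.
def fake_validator(token):
    return "atm@EXAMPLE.COM" if token == b"good" else None


print(handle_request({}, fake_validator))
```

A servlet filter that wants to plug in a different scheme (for example an intranet SSO cookie) would replace this header handling while leaving the rest of the web interface untouched.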
Deployment Recommendations
• Security configuration is a PITA
  • Do only what you really need
• Enable cluster security (Kerberos) only if untrusted groups of users are sharing the cluster
  • Otherwise use edge-security to keep outsiders out
• Only enable wire encryption if required
• Only enable web interface authentication if required
Deployment Recommendations
• Secure Hadoop bring-up order
  1. HDFS RPC (including SNN check-pointing)
  2. JobTracker RPC
  3. TaskTrackers RPC & LinuxTaskController
  4. Hadoop web UI
  5. Configure monitoring to work with security
  6. Other services (HBase, Oozie, Hive Metastore, etc.)
  7. Continue with authorization and network encryption if needed
• Recommended: Use an admin/management tool
  • Several inter-related configuration knobs
  • To manage principals/keytabs creation and distribution
  • Automatically configures monitoring for security
Q&A
Thanks
ATM (Cloudera) & Tucu (Cloudera)
Hadoop Summit, June 2013
Security Capabilities
Client | Protocol | Authentication | Proxy User | Authorization | Confidentiality | Auditing
Hadoop HDFS | RPC | Kerberos | Yes | FS permissions | SASL | Yes
Hadoop HDFS | Data Transfer | SASL | No | FS permissions | SASL | No
Hadoop WebHDFS | HTTP | Kerberos SPNEGO plus pluggable | Yes | FS permissions | N/A | Yes
Hadoop MapReduce (Pig, Hive, Sqoop, Crunch, Cascading) | RPC | Kerberos | Yes (requires job config work) | Job & Queue ACLs | SASL | No
Hive Metastore | Thrift | Kerberos | Yes | FS permissions | N/A | Yes
Oozie | HTTP | Kerberos SPNEGO plus pluggable | Yes | Job & Queue ACLs and FS permissions | SSL (HTTPS) | Yes
HBase | RPC/Thrift/HTTP | Kerberos | Yes | table ACLs | SASL | No
Zookeeper | RPC | Kerberos | No | znode ACLs | N/A | No
Impala | Thrift | Kerberos | No | Hive policy file | N/A | No
Hue | HTTP | pluggable | No | Job & Queue ACLs and FS permissions | HTTPS | No
Flume | Avro RPC | N/A | No | N/A | N/A | No
