SlideShare a Scribd company logo
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB on Red Hat by Alejandro González at Big Data Spain 2015
1© Cloudera, Inc. All rights reserved.
Securing Big Data at Rest with
Encryption for Hadoop,
Cassandra and MongoDB on
Red Hat.
Alex Gonzalez| Software Engineer
2© Cloudera, Inc. All rights reserved.
Content
• Important No-SQL players + Hadoop
• Who uses Big Data
• Use Cases
• Encryption Solutions and its demo
• Navigator Encrypt
• Performance
• MongoDB, Hadoop and Cassandra Encryption
3© Cloudera, Inc. All rights reserved.
Is a framework that allows for
the distributed processing of
large data sets across clusters of
computers.
A database with high
availability, linear scalability and
proven fault-tolerance on
commodity hardware or cloud
infrastructure.
A scalable and high-
performance, high availability,
and easy scalability open source
database designed to handle
document-oriented storage.
Important NoSQL players + Hadoop
4© Cloudera, Inc. All rights reserved.
5© Cloudera, Inc. All rights reserved.
6© Cloudera, Inc. All rights reserved.
7© Cloudera, Inc. All rights reserved.
Big Data Application Areas
• Business Intelligence, Analytics & Performance Mgmt
• Advertising, Sales & Marketing
• Advertising Network or Exchange
• Monitoring and Security
• Social
•Education and Training
•Data and Document Management - Financial, Health, etc.
•Music
•Video
•Gaming
8© Cloudera, Inc. All rights reserved.
Open Source Encryption Solutions
dm-crypt
A transparent disk encryption subsystem
eCryptfs
eCryptfs is a POSIX-compliant enterprise cryptographic stacked filesystem for
Linux.
Both are supported at Ubuntu, SLES, RedHat, Debian and CentOS.
Red Hat 7.x and CentOS 7.x are not supporting ecryptfs anymore.
9© Cloudera, Inc. All rights reserved.
eCryptfs & MongoDB demo
10© Cloudera, Inc. All rights reserved.
eCryptfs and dm-crypt cons
• Any access can access the data when the mountpoint is active
• Do not perform key management at all
11© Cloudera, Inc. All rights reserved.
Cloudera Navigator Encrypt
Provides massively scalable, high-performance encryption for sensitive data. It
leverages industry-standard AES-256 encryption and provides a transparent layer
between the application and filesystem.
12© Cloudera, Inc. All rights reserved.
Cloudera Navigator Encrypt Architecture
13© Cloudera, Inc. All rights reserved.
Cloudera Navigator Encrypt and Key Management
14© Cloudera, Inc. All rights reserved.
Navigator Encrypt Performance
Performance cost is ~5% to ~10%
{ nThreads: 32, fileSizeMB: 1000, r: true }
new thread, total running : 1
Not-encrypted: 2380 ops/sec 9 MB/sec Encrypted: 2479 ops/sec 9 MB/sec
Performance cost: 4.15%
new thread, total running : 2
Not-encrypted: 3011 ops/sec 11 MB/sec Encrypted: 3160 ops/sec 11 MB/sec
Performance cost: 4.94%
15© Cloudera, Inc. All rights reserved.
Encrypting MongoDB with Navigator Encrypt
16© Cloudera, Inc. All rights reserved.
Navigator Encrypt Profiles
Navigator Encrypt works differently when creating ACLs for Java processes because
the binary executed is the Java executable and Java can receive different jars.
In that case, you need to specify a profile, which contains all the options that Java
receives when it gets executed. Using that profile, you can set which java
application will access the data.
17© Cloudera, Inc. All rights reserved.
Hadoop Encryption
Navigator Encrypt Profiling - Obtaining the PID
[root@hdfs-2 ~]# ps aux | grep datanode
hdfs 7910 0.5 3.3 1649284 257040 ? Sl 11:41 0:25 /usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_datanode -Xmx1000m -Dhdfs.audit.
logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/hadoop-hdfs -Dhadoop.log.
file=hadoop-cmf-HDFS-1-DATANODE-hdfs-2.vpc.cloudera.com.log.out -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop -
Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native -
Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -Xms1073741824 -Xmx1073741824 -XX:+UseParNewGC -XX:
+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:
OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode
18© Cloudera, Inc. All rights reserved.
Hadoop Encryption
Navigator Encrypt Profiling
[root@hdfs-2 ~]# navencrypt-profile -p 7910
{
"uid":"496",
"comm":"java",
"cmdline":"/usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_datanode -Xmx1000m -Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS
-Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/hadoop-hdfs -Dhadoop.log.file=hadoop-cmf-HDFS-1-DATANODE-hdfs-2.vpc.cloudera.com.log.
out -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.
library.path=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.
preferIPv4Stack=true -server -Xms1073741824 -Xmx1073741824 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:
CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh -Dhadoop.
security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode"
}
[root@hdfs-2 ~]# navencrypt-profile -p 7910 > profile.txt
19© Cloudera, Inc. All rights reserved.
Hadoop Encryption
Adding a Navigator Encrypt ACL
[root@hdfs-2 ~]# navencrypt acl --add --rule="ALLOW @hdfs * /usr/java/jdk1.7.0_67-
cloudera/bin/java" --profile=profile.txt
Type MASTER passphrase:
1 rule(s) were added
20© Cloudera, Inc. All rights reserved.
Hadoop Encryption
Verify Navigator Encrypt ACL
[root@hdfs-2 ~]# navencrypt acl --list --all
Type MASTER passphrase:
# - Type Category Path Profile Process
1 ALLOW @hdfs * YES /usr/java/jdk1.7.0_67-cloudera/bin/java
PROFILE:
{"uid":"496","cmdline":"/usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_datanode -Xmx1000m -Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit.
logger=INFO,RFAS -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/hadoop-hdfs -Dhadoop.log.file=hadoop-cmf-HDFS-1-DATANODE-hdfs-2.vpc.
cloudera.com.log.out -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.
logger=INFO,RFA -Djava.library.path=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml
-Djava.net.preferIPv4Stack=true -server -Xms1073741824 -Xmx1073741824 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:
CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh -Dhadoop.
security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode","comm":"java"}
21© Cloudera, Inc. All rights reserved.
Hadoop Encryption
Stopping the cluster
22© Cloudera, Inc. All rights reserved.
Hadoop Encryption
Stopping the cluster
23© Cloudera, Inc. All rights reserved.
Hadoop Encryption
Navigator Encrypt Data Encryption
root@hdfs-2 ~]# navencrypt-move encrypt @hdfs /data/dfs/dn/current/ /mnt/mountpoint/
Type MASTER passphrase:
Size to encrypt: 12 KB
Moving from: '/data/dfs/dn/current'
Moving to: '/mnt/mountpoint/hdfs/data/dfs/dn/current'
100% [=======================================================>] [ 345 B]
Done.
24© Cloudera, Inc. All rights reserved.
Hadoop Encryption
Starting the Cluster
25© Cloudera, Inc. All rights reserved.
Hadoop Encryption
HDFS Test
[root@hdfs-2 ~]# su - hdfs
[hdfs@hdfs-2 ~]$ touch file.txt
[hdfs@hdfs-2 ~]$ hdfs dfs -mkdir /data/
[hdfs@hdfs-2 ~]$ hdfs dfs -copyFromLocal file.txt /data/file.txt
[hdfs@hdfs-2 ~]$ hdfs dfs -ls /data/
Found 1 items
-rw-r--r-- 2 hdfs supergroup 0 2015-05-20 13:50 /data/file.txt
26© Cloudera, Inc. All rights reserved.
Cassandra Encryption
# ps aux | grep cassandra
root 15109 22.4 27.0 6347932 4143708 pts/0 SLl 00:22 0:08 java -ea -javaagent:
/apache-......
# navencrypt-profile --pid=15109 > cassandra.profile
# navencrypt acl --add --rule="ALLOW @cassandra * /usr/lib/jvm/java-6-
oracle/jre/bin/java" --profile=cassandra.profile
# navencrypt-move encrypt @cassandra /var/lib/cassandra/ /mnt/encrypted-mountpoint
27© Cloudera, Inc. All rights reserved. 27© Cloudera, Inc. All rights reserved.
Thank you
alex.gonzalez@cloudera.com
twitter: @kozlex
Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB on Red Hat by Alejandro González at Big Data Spain 2015
Ad

More Related Content

What's hot (20)

Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010
Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010
Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010
Cloudera, Inc.
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_security
Adam Muise
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, Future
Uwe Printz
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
Shivaji Dutta
 
Deploying Enterprise-grade Security for Hadoop
Deploying Enterprise-grade Security for HadoopDeploying Enterprise-grade Security for Hadoop
Deploying Enterprise-grade Security for Hadoop
Cloudera, Inc.
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
shrey mehrotra
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
Kashif Khan
 
Hadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowHadoop Security Today and Tomorrow
Hadoop Security Today and Tomorrow
DataWorks Summit
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big Data
Rommel Garcia
 
Apache HDFS Extended Attributes and Transparent Encryption
Apache HDFS Extended Attributes and Transparent EncryptionApache HDFS Extended Attributes and Transparent Encryption
Apache HDFS Extended Attributes and Transparent Encryption
Uma Maheswara Rao Gangumalla
 
Kerberos, Token and Hadoop
Kerberos, Token and HadoopKerberos, Token and Hadoop
Kerberos, Token and Hadoop
Kai Zheng
 
Hadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyHadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happy
DataWorks Summit
 
Technical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheCon
Technical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheConTechnical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheCon
Technical tips for secure Apache Hadoop cluster #ApacheConAsia #ApacheCon
Yahoo!デベロッパーネットワーク
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014
Cloudera, Inc.
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the Gate
Steve Loughran
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster Access
Cloudera, Inc.
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox Gateway
DataWorks Summit
 
April 2014 HUG : Apache Sentry
April 2014 HUG : Apache SentryApril 2014 HUG : Apache Sentry
April 2014 HUG : Apache Sentry
Yahoo Developer Network
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Steve Loughran
 
Hadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using KerberosHadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using Kerberos
Sarvesh Meena
 
Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010
Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010
Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010
Cloudera, Inc.
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_security
Adam Muise
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, Future
Uwe Printz
 
Deploying Enterprise-grade Security for Hadoop
Deploying Enterprise-grade Security for HadoopDeploying Enterprise-grade Security for Hadoop
Deploying Enterprise-grade Security for Hadoop
Cloudera, Inc.
 
Hadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowHadoop Security Today and Tomorrow
Hadoop Security Today and Tomorrow
DataWorks Summit
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big Data
Rommel Garcia
 
Apache HDFS Extended Attributes and Transparent Encryption
Apache HDFS Extended Attributes and Transparent EncryptionApache HDFS Extended Attributes and Transparent Encryption
Apache HDFS Extended Attributes and Transparent Encryption
Uma Maheswara Rao Gangumalla
 
Kerberos, Token and Hadoop
Kerberos, Token and HadoopKerberos, Token and Hadoop
Kerberos, Token and Hadoop
Kai Zheng
 
Hadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyHadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happy
DataWorks Summit
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014
Cloudera, Inc.
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the Gate
Steve Loughran
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster Access
Cloudera, Inc.
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox Gateway
DataWorks Summit
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Steve Loughran
 
Hadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using KerberosHadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using Kerberos
Sarvesh Meena
 

Viewers also liked (20)

The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
Big Data Spain
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
Big Data Spain
 
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ... Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Big Data Spain
 
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
Big Data Spain
 
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Big Data Spain
 
Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...
Big Data Spain
 
Location analytics by Marc Planaguma at Big Data Spain 2014
 Location analytics by Marc Planaguma at Big Data Spain 2014 Location analytics by Marc Planaguma at Big Data Spain 2014
Location analytics by Marc Planaguma at Big Data Spain 2014
Big Data Spain
 
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Big Data Spain
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Spain
 
Intro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceIntro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conference
Big Data Spain
 
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...
Cloudera, Inc.
 
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
Big Data Spain
 
Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0
Big Data Spain
 
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Big Data Spain
 
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data Spain
 
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
Big Data Spain
 
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
Big Data Spain
 
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Big Data Spain
 
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Big Data Spain
 
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Big Data Spain
 
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
Big Data Spain
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
Big Data Spain
 
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ... Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Big Data Spain
 
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
Big Data Spain
 
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Big Data Spain
 
Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...
Big Data Spain
 
Location analytics by Marc Planaguma at Big Data Spain 2014
 Location analytics by Marc Planaguma at Big Data Spain 2014 Location analytics by Marc Planaguma at Big Data Spain 2014
Location analytics by Marc Planaguma at Big Data Spain 2014
Big Data Spain
 
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Big Data Spain
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Spain
 
Intro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceIntro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conference
Big Data Spain
 
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...
Cloudera, Inc.
 
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
Big Data Spain
 
Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0
Big Data Spain
 
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Big Data Spain
 
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data Spain
 
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
Big Data Spain
 
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
Big Data Spain
 
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Big Data Spain
 
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Big Data Spain
 
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Big Data Spain
 
Ad

Similar to Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB on Red Hat by Alejandro González at Big Data Spain 2015 (20)

Hadoop security implementationon 20171003
Hadoop security implementationon 20171003Hadoop security implementationon 20171003
Hadoop security implementationon 20171003
lee tracie
 
Security implementation on hadoop
Security implementation on hadoopSecurity implementation on hadoop
Security implementation on hadoop
Wei-Chiu Chuang
 
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4
Timothy Spann
 
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Stefan Lipp
 
Self-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureSelf-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft Azure
Cloudera, Inc.
 
Securing Spark Applications
Securing Spark ApplicationsSecuring Spark Applications
Securing Spark Applications
DataWorks Summit/Hadoop Summit
 
Introduction to Apache NiFi 1.10
Introduction to Apache NiFi 1.10Introduction to Apache NiFi 1.10
Introduction to Apache NiFi 1.10
Timothy Spann
 
MySQL 8.0 - Security Features
MySQL 8.0 - Security FeaturesMySQL 8.0 - Security Features
MySQL 8.0 - Security Features
Harin Vadodaria
 
PartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionPartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC Solution
Timothy Spann
 
Big data journey to the cloud 5.30.18 asher bartch
Big data journey to the cloud 5.30.18   asher bartchBig data journey to the cloud 5.30.18   asher bartch
Big data journey to the cloud 5.30.18 asher bartch
Cloudera, Inc.
 
Apache Hadoop 3
Apache Hadoop 3Apache Hadoop 3
Apache Hadoop 3
Cloudera, Inc.
 
Updates to Apache CloudStack and LINBIT SDS
Updates to Apache CloudStack and LINBIT SDSUpdates to Apache CloudStack and LINBIT SDS
Updates to Apache CloudStack and LINBIT SDS
ShapeBlue
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015
Shravan (Sean) Pabba
 
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...
Neo4j
 
Event Detection Pipelines with Apache Kafka
Event Detection Pipelines with Apache KafkaEvent Detection Pipelines with Apache Kafka
Event Detection Pipelines with Apache Kafka
DataWorks Summit
 
Five Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWSFive Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWS
Cloudera, Inc.
 
Kafka for DBAs
Kafka for DBAsKafka for DBAs
Kafka for DBAs
Gwen (Chen) Shapira
 
Machine Learning in the Enterprise 2019
Machine Learning in the Enterprise 2019   Machine Learning in the Enterprise 2019
Machine Learning in the Enterprise 2019
Timothy Spann
 
Using the FLaNK Stack for edge ai (flink, nifi, kafka, kudu)
Using the FLaNK Stack for edge ai (flink, nifi, kafka, kudu)Using the FLaNK Stack for edge ai (flink, nifi, kafka, kudu)
Using the FLaNK Stack for edge ai (flink, nifi, kafka, kudu)
Timothy Spann
 
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and moreAdvanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
inside-BigData.com
 
Hadoop security implementationon 20171003
Hadoop security implementationon 20171003Hadoop security implementationon 20171003
Hadoop security implementationon 20171003
lee tracie
 
Security implementation on hadoop
Security implementation on hadoopSecurity implementation on hadoop
Security implementation on hadoop
Wei-Chiu Chuang
 
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4
Timothy Spann
 
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Stefan Lipp
 
Self-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureSelf-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft Azure
Cloudera, Inc.
 
Introduction to Apache NiFi 1.10
Introduction to Apache NiFi 1.10Introduction to Apache NiFi 1.10
Introduction to Apache NiFi 1.10
Timothy Spann
 
MySQL 8.0 - Security Features
MySQL 8.0 - Security FeaturesMySQL 8.0 - Security Features
MySQL 8.0 - Security Features
Harin Vadodaria
 
PartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC SolutionPartnerSkillUp_Enable a Streaming CDC Solution
PartnerSkillUp_Enable a Streaming CDC Solution
Timothy Spann
 
Big data journey to the cloud 5.30.18 asher bartch
Big data journey to the cloud 5.30.18   asher bartchBig data journey to the cloud 5.30.18   asher bartch
Big data journey to the cloud 5.30.18 asher bartch
Cloudera, Inc.
 
Updates to Apache CloudStack and LINBIT SDS
Updates to Apache CloudStack and LINBIT SDSUpdates to Apache CloudStack and LINBIT SDS
Updates to Apache CloudStack and LINBIT SDS
ShapeBlue
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015
Shravan (Sean) Pabba
 
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...
Neo4j
 
Event Detection Pipelines with Apache Kafka
Event Detection Pipelines with Apache KafkaEvent Detection Pipelines with Apache Kafka
Event Detection Pipelines with Apache Kafka
DataWorks Summit
 
Five Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWSFive Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWS
Cloudera, Inc.
 
Machine Learning in the Enterprise 2019
Machine Learning in the Enterprise 2019   Machine Learning in the Enterprise 2019
Machine Learning in the Enterprise 2019
Timothy Spann
 
Using the FLaNK Stack for edge ai (flink, nifi, kafka, kudu)
Using the FLaNK Stack for edge ai (flink, nifi, kafka, kudu)Using the FLaNK Stack for edge ai (flink, nifi, kafka, kudu)
Using the FLaNK Stack for edge ai (flink, nifi, kafka, kudu)
Timothy Spann
 
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and moreAdvanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
inside-BigData.com
 
Ad

More from Big Data Spain (20)

Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data Spain
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Big Data Spain
 
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
Big Data Spain
 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Big Data Spain
 
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Big Data Spain
 
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Big Data Spain
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Big Data Spain
 
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Big Data Spain
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...
Big Data Spain
 
Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...
Big Data Spain
 
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Big Data Spain
 
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
Big Data Spain
 
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Big Data Spain
 
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Big Data Spain
 
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Big Data Spain
 
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Big Data Spain
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
Big Data Spain
 
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Big Data Spain
 
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
Big Data Spain
 
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Big Data Spain
 
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data Spain
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Big Data Spain
 
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
Big Data Spain
 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Big Data Spain
 
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Big Data Spain
 
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Big Data Spain
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Big Data Spain
 
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Big Data Spain
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...
Big Data Spain
 
Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...
Big Data Spain
 
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Big Data Spain
 
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
Big Data Spain
 
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Big Data Spain
 
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Big Data Spain
 
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Big Data Spain
 
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Big Data Spain
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
Big Data Spain
 
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Big Data Spain
 
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
Big Data Spain
 
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Big Data Spain
 

Recently uploaded (20)

SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko
Fwdays
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Automation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From AnywhereAutomation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From Anywhere
Lynda Kane
 
Datastucture-Unit 4-Linked List Presentation.pptx
Datastucture-Unit 4-Linked List Presentation.pptxDatastucture-Unit 4-Linked List Presentation.pptx
Datastucture-Unit 4-Linked List Presentation.pptx
kaleeswaric3
 
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
Lynda Kane
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Asthma presentación en inglés abril 2025 pdf
Asthma presentación en inglés abril 2025 pdfAsthma presentación en inglés abril 2025 pdf
Asthma presentación en inglés abril 2025 pdf
VanessaRaudez
 
Learn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step GuideLearn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step Guide
Marcel David
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Buckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug LogsBuckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug Logs
Lynda Kane
 
Leading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael JidaelLeading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael Jidael
Michael Jidael
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko
Fwdays
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Automation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From AnywhereAutomation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From Anywhere
Lynda Kane
 
Datastucture-Unit 4-Linked List Presentation.pptx
Datastucture-Unit 4-Linked List Presentation.pptxDatastucture-Unit 4-Linked List Presentation.pptx
Datastucture-Unit 4-Linked List Presentation.pptx
kaleeswaric3
 
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
Lynda Kane
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Asthma presentación en inglés abril 2025 pdf
Asthma presentación en inglés abril 2025 pdfAsthma presentación en inglés abril 2025 pdf
Asthma presentación en inglés abril 2025 pdf
VanessaRaudez
 
Learn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step GuideLearn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step Guide
Marcel David
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Buckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug LogsBuckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug Logs
Lynda Kane
 
Leading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael JidaelLeading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael Jidael
Michael Jidael
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 

Securing Big Data at rest with encryption for Hadoop, Cassandra and MongoDB on Red Hat by Alejandro González at Big Data Spain 2015

  • 2. 1© Cloudera, Inc. All rights reserved. Securing Big Data at Rest with Encryption for Hadoop, Cassandra and MongoDB on Red Hat. Alex Gonzalez| Software Engineer
  • 3. 2© Cloudera, Inc. All rights reserved. Content • Important No-SQL players + Hadoop • Who uses Big Data • Use Cases • Encryption Solutions and its demo • Navigator Encrypt • Performance • MongoDB, Hadoop and Cassandra Encryption
  • 4. 3© Cloudera, Inc. All rights reserved. Is a framework that allows for the distributed processing of large data sets across clusters of computers. A database with high availability, linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure. A scalable and high- performance, high availability, and easy scalability open source database designed to handle document-oriented storage. Important NoSQL players + Hadoop
  • 5. 4© Cloudera, Inc. All rights reserved.
  • 6. 5© Cloudera, Inc. All rights reserved.
  • 7. 6© Cloudera, Inc. All rights reserved.
  • 8. 7© Cloudera, Inc. All rights reserved. Big Data Application Areas • Business Intelligence, Analytics & Performance Mgmt • Advertising, Sales & Marketing • Advertising Network or Exchange • Monitoring and Security • Social •Education and Training •Data and Document Management - Financial, Health, etc. •Music •Video •Gaming
  • 9. 8© Cloudera, Inc. All rights reserved. Open Source Encryption Solutions dm-crypt A transparent disk encryption subsystem eCryptfs eCryptfs is a POSIX-compliant enterprise cryptographic stacked filesystem for Linux. Both are supported at Ubuntu, SLES, RedHat, Debian and CentOS. Red Hat 7.x and CentOS 7.x are not supporting ecryptfs anymore.
  • 10. 9© Cloudera, Inc. All rights reserved. eCryptfs & MongoDB demo
  • 11. 10© Cloudera, Inc. All rights reserved. eCryptfs and dm-crypt cons • Any access can access the data when the mountpoint is active • Do not perform key management at all
  • 12. 11© Cloudera, Inc. All rights reserved. Cloudera Navigator Encrypt Provides massively scalable, high-performance encryption for sensitive data. It leverages industry-standard AES-256 encryption and provides a transparent layer between the application and filesystem.
  • 13. 12© Cloudera, Inc. All rights reserved. Cloudera Navigator Encrypt Architecture
  • 14. 13© Cloudera, Inc. All rights reserved. Cloudera Navigator Encrypt and Key Management
  • 15. 14© Cloudera, Inc. All rights reserved. Navigator Encrypt Performance Performance cost is ~5% to ~10% { nThreads: 32, fileSizeMB: 1000, r: true } new thread, total running : 1 Not-encrypted: 2380 ops/sec 9 MB/sec Encrypted: 2479 ops/sec 9 MB/sec Performance cost: 4.15% new thread, total running : 2 Not-encrypted: 3011 ops/sec 11 MB/sec Encrypted: 3160 ops/sec 11 MB/sec Performance cost: 4.94%
  • 16. 15© Cloudera, Inc. All rights reserved. Encrypting MongoDB with Navigator Encrypt
  • 17. 16© Cloudera, Inc. All rights reserved. Navigator Encrypt Profiles Navigator Encrypt works differently when creating ACLs for Java processes because the binary executed is the Java executable and Java can receive different jars. In that case, you need to specify a profile, which contains all the options that Java receives when it gets executed. Using that profile, you can set which java application will access the data.
  • 18. 17© Cloudera, Inc. All rights reserved. Hadoop Encryption Navigator Encrypt Profiling - Obtaining the PID [root@hdfs-2 ~]# ps aux | grep datanode hdfs 7910 0.5 3.3 1649284 257040 ? Sl 11:41 0:25 /usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_datanode -Xmx1000m -Dhdfs.audit. logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/hadoop-hdfs -Dhadoop.log. file=hadoop-cmf-HDFS-1-DATANODE-hdfs-2.vpc.cloudera.com.log.out -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop - Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native - Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -Xms1073741824 -Xmx1073741824 -XX:+UseParNewGC -XX: +UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX: OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode
  • 19. 18© Cloudera, Inc. All rights reserved. Hadoop Encryption Navigator Encrypt Profiling [root@hdfs-2 ~]# navencrypt-profile -p 7910 { "uid":"496", "comm":"java", "cmdline":"/usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_datanode -Xmx1000m -Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/hadoop-hdfs -Dhadoop.log.file=hadoop-cmf-HDFS-1-DATANODE-hdfs-2.vpc.cloudera.com.log. out -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava. library.path=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net. preferIPv4Stack=true -server -Xms1073741824 -Xmx1073741824 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX: CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh -Dhadoop. security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode" } [root@hdfs-2 ~]# navencrypt-profile -p 7910 > profile.txt
  • 20. 19© Cloudera, Inc. All rights reserved. Hadoop Encryption Adding a Navigator Encrypt ACL [root@hdfs-2 ~]# navencrypt acl --add --rule="ALLOW @hdfs * /usr/java/jdk1.7.0_67- cloudera/bin/java" --profile=profile.txt Type MASTER passphrase: 1 rule(s) were added
  • 21. 20© Cloudera, Inc. All rights reserved. Hadoop Encryption Verify Navigator Encrypt ACL [root@hdfs-2 ~]# navencrypt acl --list --all Type MASTER passphrase: # - Type Category Path Profile Process 1 ALLOW @hdfs * YES /usr/java/jdk1.7.0_67-cloudera/bin/java PROFILE: {"uid":"496","cmdline":"/usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_datanode -Xmx1000m -Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit. logger=INFO,RFAS -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/hadoop-hdfs -Dhadoop.log.file=hadoop-cmf-HDFS-1-DATANODE-hdfs-2.vpc. cloudera.com.log.out -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root. logger=INFO,RFA -Djava.library.path=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -Xms1073741824 -Xmx1073741824 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX: CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh -Dhadoop. security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode","comm":"java"}
  • 22. 21© Cloudera, Inc. All rights reserved. Hadoop Encryption Stopping the cluster
  • 23. 22© Cloudera, Inc. All rights reserved. Hadoop Encryption Stopping the cluster
  • 24. 23© Cloudera, Inc. All rights reserved. Hadoop Encryption Navigator Encrypt Data Encryption root@hdfs-2 ~]# navencrypt-move encrypt @hdfs /data/dfs/dn/current/ /mnt/mountpoint/ Type MASTER passphrase: Size to encrypt: 12 KB Moving from: '/data/dfs/dn/current' Moving to: '/mnt/mountpoint/hdfs/data/dfs/dn/current' 100% [=======================================================>] [ 345 B] Done.
  • 25. 24© Cloudera, Inc. All rights reserved. Hadoop Encryption Starting the Cluster
  • 26. 25© Cloudera, Inc. All rights reserved. Hadoop Encryption HDFS Test [root@hdfs-2 ~]# su - hdfs [hdfs@hdfs-2 ~]$ touch file.txt [hdfs@hdfs-2 ~]$ hdfs dfs -mkdir /data/ [hdfs@hdfs-2 ~]$ hdfs dfs -copyFromLocal file.txt /data/file.txt [hdfs@hdfs-2 ~]$ hdfs dfs -ls /data/ Found 1 items -rw-r--r-- 2 hdfs supergroup 0 2015-05-20 13:50 /data/file.txt
  • 27. 26© Cloudera, Inc. All rights reserved. Cassandra Encryption # ps aux | grep cassandra root 15109 22.4 27.0 6347932 4143708 pts/0 SLl 00:22 0:08 java -ea -javaagent: /apache-...... # navencrypt-profile --pid=15109 > cassandra.profile # navencrypt acl --add --rule="ALLOW @cassandra * /usr/lib/jvm/java-6- oracle/jre/bin/java" --profile=cassandra.profile # navencrypt-move encrypt @cassandra /var/lib/cassandra/ /mnt/encrypted-mountpoint
  • 28. 27© Cloudera, Inc. All rights reserved. 27© Cloudera, Inc. All rights reserved. Thank you [email protected] twitter: @kozlex