1 © Hortonworks Inc. 2011–2018. All rights reserved
Running Enterprise Workloads in the Cloud
Richard Doktorics
Peter Darvasi
Who are we?
⬢ Peter Darvasi
- Partner Engineer at Hortonworks
- @pdarvasi
⬢ Richard Doktorics
- Software Engineer
- @doktoric
Agenda
⬢ What is Cloudbreak?
⬢ Enterprise checklist for big data in the cloud
⬢ Cloudbreak in da house
⬢ Questions
What is Cloudbreak?
Cloudbreak is a tool for provisioning Hadoop clusters on any cloud infrastructure
Simplified Cluster Provisioning - prescriptive setup, simple automation
⬢ Deploy on Public or Private Clouds
– Dynamically configure and manage clusters on public or private clouds (Amazon Web Services, Microsoft Azure, Google Cloud Platform and OpenStack)
⬢ Automated Scaling
– Seamlessly manage elasticity requirements as cluster workloads change (Ambari Metrics / Prometheus)
⬢ Secured Cluster Access
– Supports configuration defining network boundaries and configuring security groups
⬢ Highly Extensible
– Recipes to run custom commands
– Custom images
Single node deployment
⬢ Cloudbreak Deployer (CBD)
– Written in Go and Bash
– Compiled into a single binary
⬢ Micro-service architecture
– Each service runs in a Docker container
– Each container can be replaced with a custom one
– Services are managed with docker-compose
Enterprise checklist for big data in the cloud
Checklist for enterprises in the cloud
✓ Control and Automation
✓ Cloudy Services
✓ Security
✓ Enterprise-Grade Support
Checklist for enterprises in the cloud
✓ Control and Automation
  ✓ Simple UX
  ✓ Powerful CLI
  ✓ Autoscaling
✓ Cloudy Services
✓ Security
✓ Enterprise-Grade Support
Simplified UX
Create Credential Experience
Built-In Blueprints
Basic and Advanced Cluster Creation Experiences
BASIC ADVANCED
New Network and Security Group Choices
⬢ Network
– Create new Network and new Subnet
– Choose existing Network and existing Subnet
⬢ Security Groups
– Create new SGs
• Choose default SGs (minimal set of ports)
• Create customized
– Choose existing SGs
Powerful CLI
Cloudbreak CLI: Designed for DevOps
“Show cli command” for every request
Auto-scaling
Auto-Scaling
⬢ Alerts: Create metric or time-based alerts for cluster scaling
⬢ Policies: Scaling policies adjust cluster size based on activity and workload alerts
⬢ General Configurations: Boundaries and cooldown period
Auto-Scaling Time-Based Alert
Fire at 10:15 am every day
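A minimal sketch of what a daily time-based alert amounts to, assuming the alert simply fires once per day at a fixed wall-clock time. The function name `next_fire_time` and its defaults are illustrative only; the slide does not show how Cloudbreak stores or evaluates the schedule.

```python
from datetime import datetime, timedelta


def next_fire_time(now: datetime, hour: int = 10, minute: int = 15) -> datetime:
    """Return the next daily fire time strictly after `now` (hypothetical helper)."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed (or is exactly now), so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate
```

Time zones are ignored here; a real schedule would need an explicit time zone so that "10:15 am" is unambiguous.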
Auto-Scaling Metric-Based Alert
Fire after NodeManagers are in CRITICAL state for 10 minutes
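The "in CRITICAL state for 10 minutes" condition can be sketched as a sustained-state check over timestamped samples. This is an illustration under assumptions: the sample format, the function name `should_fire`, and the idea of polling state samples are not taken from the slides.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


def should_fire(
    samples: Iterable[Tuple[datetime, str]],
    now: datetime,
    state: str = "CRITICAL",
    period: timedelta = timedelta(minutes=10),
) -> bool:
    """Return True if the most recent unbroken run of `state` has lasted `period`.

    `samples` are (timestamp, state) pairs sorted oldest first (assumed format).
    """
    since: Optional[datetime] = None
    for timestamp, observed in samples:
        if observed == state:
            if since is None:
                since = timestamp  # start of the current run
        else:
            since = None  # any other state resets the run
    return since is not None and now - since >= period
```

A single non-CRITICAL sample resets the clock, which is one reasonable reading of "for 10 minutes".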
Auto-Scaling Policies
⬢ Define the Scale Adjustment (Node Count/Percentage/Exact size)
⬢ Select the HostGroup (to Scale)
⬢ Select the Alert (which, when fired, executes the Policy)
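The three scale adjustment types can be sketched as a small function that turns the current host group size into a target size. The type names (`NODE_COUNT`, `PERCENTAGE`, `EXACT`) and the round-up behaviour for percentages are assumptions made for this example, not confirmed Cloudbreak semantics.

```python
import math


def target_size(current: int, adjustment_type: str, value: int) -> int:
    """Compute the desired host group size for one policy (illustrative only)."""
    if adjustment_type == "NODE_COUNT":
        # Add (or, with a negative value, remove) a fixed number of nodes.
        return current + value
    if adjustment_type == "PERCENTAGE":
        # Change the size by a percentage of the current size, rounded up (assumed).
        return current + math.ceil(current * value / 100)
    if adjustment_type == "EXACT":
        # Set the size to exactly the given value.
        return value
    raise ValueError(f"unknown adjustment type: {adjustment_type}")
```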
Auto-Scaling General Configurations
⬢ Cooldown Period (time interval between two autoscale events)
⬢ Minimum and Maximum Cluster size (cluster size boundaries)
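A sketch of how a cooldown period and size boundaries might gate a scaling request, assuming the cooldown is measured from the last scaling action and the boundaries clamp the requested size. The function name and the "skip while cooling down" behaviour are illustrative assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional


def apply_general_config(
    desired: int,
    min_size: int,
    max_size: int,
    last_scaling: Optional[datetime],
    now: datetime,
    cooldown: timedelta,
) -> Optional[int]:
    """Return the size to scale to, or None if the cooldown has not elapsed."""
    if last_scaling is not None and now - last_scaling < cooldown:
        return None  # still cooling down: ignore this scaling request
    # Clamp the requested size into the [min_size, max_size] boundaries.
    return max(min_size, min(max_size, desired))
```

Clamping rather than rejecting an out-of-range request is a design choice made for this sketch; a real implementation could equally refuse the request.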
Checklist for enterprises in the cloud
✓ Control and Automation
✓ Cloudy Services
  ✓ Cloud Resources
  ✓ Hortonworks DataFlow
  ✓ Custom Images
✓ Security
✓ Enterprise-Grade Support
Cloud Resources: RDBMS + LDAP
Cloud Resources: RDBMS and LDAP/AD = Dynamic Blueprints
⬢ Background:
– Cluster configuration often includes an external database (for Hive, Ranger, etc.) and LDAP/AD configs
– It’s a challenge to know the different Blueprint configuration choices per service across the stack
⬢ Dynamic Blueprints:
– Ability to manage External Sources (e.g. RDBMS and LDAP/AD) outside of your Blueprint
– Cloudbreak will inject the configurations into your Blueprint
– Simplifies reuse of external cloud resources
– Simplifies your Blueprints: you don’t have to know all the configurations for each component
Dynamic Blueprints: RDBMS/LDAP
⬢ Built-In Components:
– Atlas, Ranger, Hadoop, Hive LLAP, Hive, Ambari, Oozie, Druid, SuperSet
⬢ Flow (start to end):
– Are there JDBC/LDAP properties in the Blueprint for the Component?
• Yes: use the Blueprint as-is, no Component configuration property injection
• No: inject the Component configuration properties
– Perform property variable replacement
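The decision flow above can be sketched in a few lines. This is a simplified illustration, not Cloudbreak's implementation: the Blueprint is modelled as a plain dictionary rather than the real Ambari Blueprint schema, the `{{name}}` placeholder syntax is hypothetical, and running variable replacement on both branches is an assumption about the flowchart.

```python
import copy
from typing import Dict


def apply_external_source(
    blueprint: dict,
    component: str,
    injected: Dict[str, str],
    variables: Dict[str, str],
) -> dict:
    """Inject component properties if missing, then replace variables (sketch)."""
    result = copy.deepcopy(blueprint)  # never mutate the caller's blueprint
    configs = result.setdefault("configurations", {})

    # "Properties in Blueprint for the Component?" No -> inject; Yes -> use as-is.
    if component not in configs:
        configs[component] = dict(injected)

    # Property variable replacement over every string property.
    for props in configs.values():
        for key, value in list(props.items()):
            if isinstance(value, str):
                for name, replacement in variables.items():
                    value = value.replace("{{" + name + "}}", replacement)
                props[key] = value
    return result
```

For example, a registered database host could be supplied as `variables={"rds_host": "db.example.com"}` and referenced from a JDBC URL property.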
At-Motion Workloads: Hortonworks DataFlow
Hortonworks DataFlow in Cloudbreak
⬢ Default blueprint: “Flow Management: Apache NiFi”
HDF 3.1: NiFi, Ambari, Ambari Metrics, ZooKeeper
HDF - cluster creation
HDF - cluster creation
Custom Images
Background: Cloudbreak
1. Cloudbreak creates VM instances using a default base image.
2. Cloudbreak installs Ambari on a VM instance.
3. Cloudbreak instructs Ambari to install an HDP Cluster on other VM instances.
[Diagram: Cloudbreak, a RHEL 7 image, and an HDP Cluster of six HDP Node VMs]
Background: Cloudbreak Default Images
⬢ By default, Cloudbreak uses public base images when creating VM instances.
Cloud                   Standard Image Operating System
AWS                     Amazon Linux 2017
Azure                   CentOS 7.x
Google Cloud Platform   CentOS 7.x
OpenStack               CentOS 7.x
Support for Custom Images provides a way for Cloudbreak users to use their own custom image (instead of the default image) when creating VM instances.
Making a Custom Image: Overview
1. Create the Custom Image
2. Register the Custom Image in Cloudbreak
3. Use the Custom Image when Creating a Cluster
Creating the Image: Code Repository
⬢ Instructions, Packer scripts and Salt states in public GitHub repository
– https://github.com/hortonworks/cloudbreak-images
⬢ An understanding of Packer and Salt is useful
– Packer creates infrastructure
– Packer runs Salt provisioner
⬢ Customers should clone the repository and build on it
Creating the Image: Example Scenarios
⬢ Scenario: For AWS, I don’t want Amazon Linux and instead want RHEL 7
1. Set up the repository and AWS environment
2. Use the repository tools to build a RHEL 7 image
make build-aws-rhel7
⬢ Scenario: I don’t want OpenJDK and instead want Oracle JDK
1. Set up the repository and environment
2. Turn on the optional Oracle state
3. Use the repository tools to build an image
⬢ Scenario (advanced): For AWS, I don’t want Amazon Linux and instead want MY RHEL 7
1. Set up the repository and AWS environment
2. Change the source base image
3. Use the repository tools to build a RHEL 7 image
make build-aws-rhel7
Use the Custom Image: Create Cluster (UI)
⬢ Create Cluster > General Configuration > Advanced
– Choose image catalog
– Adjust the Ambari + HDP repos (if you want)
– Choose image you registered
Pre-Warmed Images
⬢ Prewarmed images: OS + pre-installed Ambari and HDP
– Pros: Cluster installs are faster; no internet connection is needed
– Cons: Cannot change the Ambari or HDP versions; cannot use local repositories
⬢ Base images: OS only
– Pros: Can change the Ambari or HDP versions, or use local repositories
– Cons: Cluster installs take longer
Checklist for enterprises in the cloud
✓ Control and Automation
✓ Cloudy Services
✓ Security
  ✓ Kerberos support
  ✓ LDAP integration
  ✓ Proxy configuration
✓ Enterprise-Grade Support
Cluster Security: Kerberos
What is Kerberos?
⬢ Strongly authenticating and establishing a user’s identity is the basis for secure access in Hadoop. Users need to be able to reliably “identify” themselves and then have that identity propagated throughout the Hadoop cluster.
⬢ Kerberos is the de facto system for authenticating access to distributed services.
Cloudbreak: Support for Enabling Kerberos
⬢ Goal
– Provide a way for Cloudbreak users to create clusters that are Kerberos enabled
⬢ Approach
– Ambari exposes a lot of Kerberos options
– Leverage the Ambari Kerberos options and avoid re-creating the Ambari Kerberos experience
– Pragmatic, prescriptive options on top
Cloudbreak: Enable Kerberos Security
⬢ Create Cluster > Security > Advanced
⬢ [ ] Enable Kerberos Security
Options: Use Existing KDC or Use Test KDC
⬢ Use Test KDC
– Not for production use. For testing and evaluation purposes only.
– Installs and configures an MIT KDC on the master node.
– Configures the cluster to leverage that KDC.
⬢ Use Existing KDC: Basic
– Provide basic information about your existing KDC.
– Ambari Kerberos descriptors are generated automatically.
⬢ Use Existing KDC: Advanced
– Provide basic information about your existing KDC.
– Provide your own Ambari Kerberos descriptors.
Cloudbreak + LDAP/AD
Cloudbreak User AuthN
⬢ Goal: Configure Cloudbreak to authenticate users against an external LDAP/AD
– CloudFoundry UAA (User Account and Authentication Server) is the foundation
https://github.com/cloudfoundry/uaa
⬢ Two parts:
1. Configure Cloudbreak to talk to the external LDAP/AD
2. Configure which group(s) can access Cloudbreak
Step 1: Configure Cloudbreak to talk to LDAP/AD
⬢ On the Cloudbreak host, create:
/var/lib/cloudbreak-deployment/uaa-changes.yml
⬢ Define LDAP profile for users and groups
Step 2: Configure which group(s) can access Cloudbreak
⬢ Configure which group(s) are authorized to access Cloudbreak:
cbd util execute-ldap-mapping [group]
cbd util delete-ldap-mapping [group]
⬢ To authorize users in the “Analysts” group to access Cloudbreak:
cbd util execute-ldap-mapping cn=Analysts,ou=Groups,dc=hortonworks,dc=local
Proxy configuration
Limited Outbound Internet Access
⬢ Handle enterprise scenarios where:
– Outbound internet access is limited (or restricted), and/or
– A proxy is required to obtain internet access
⬢ Outbound destinations, reached through an optional http/https proxy:
– Cloudbreak: Docker Hub, Cloudbreak dependencies, Default Image Catalog
– Cloudbreak and Cluster Hosts: Cloud Provider APIs, HDP or HDF platform repositories
Internet Access via Proxy
⬢ Cloudbreak Proxy Setup
– How does Cloudbreak communicate through a proxy to get to the internet (and to the cluster hosts)?
⬢ Clusters Proxy Setup
– How do the Cluster Hosts communicate through a proxy to get to the internet?
Cloudbreak: Proxy Setup
⬢ Set up the Docker environment to use the proxy
– Modify the Docker service to set HTTP_PROXY and HTTPS_PROXY (and NO_PROXY)
https://docs.docker.com/config/daemon/systemd/#httphttps-proxy
⬢ Set up Cloudbreak to use the proxy in the Profile
HTTP_PROXY_HOST=your-proxy-host
HTTPS_PROXY_HOST=your-proxy-host
PROXY_PORT=your-proxy-port
PROXY_USER=your-proxy-user
PROXY_PASSWORD=your-proxy-password
⬢ Advanced Profile option “HTTPS_PROXYFORCLUSTERCONNECTION=true|false”
– Defaults to “false”
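Host, port, user, and password values like the Profile settings on this slide are commonly combined into a single proxy URL for tools that read HTTP_PROXY-style variables. The sketch below shows one way to do that safely; the helper name `proxy_url` is hypothetical, and whether Cloudbreak assembles such a URL internally is not stated in the slides.

```python
from typing import Optional
from urllib.parse import quote


def proxy_url(
    host: str,
    port: str,
    user: Optional[str] = None,
    password: Optional[str] = None,
    scheme: str = "http",
) -> str:
    """Build scheme://[user[:password]@]host:port from separate settings."""
    auth = ""
    if user:
        # Percent-encode credentials so characters like '@' or ':' stay unambiguous.
        auth = quote(user, safe="")
        if password:
            auth += ":" + quote(password, safe="")
        auth += "@"
    return f"{scheme}://{auth}{host}:{port}"
```

Percent-encoding matters in practice: a password containing `@` would otherwise be read as the start of the host name.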
Cloudbreak: Advanced Proxy Scenarios
⬢ SCENARIO #1: Proxy for internet, not clusters
⬢ SCENARIO #2: Proxy for internet and clusters
Clusters: Register Proxy Configuration
⬢ External Sources > Proxy Configurations
– (optional) if proxy requires authentication
Clusters: Configure Proxy for Cluster Hosts
⬢ Create Cluster > Advanced > External Sources > Configure Proxy
• Configures yum with “proxy” settings
• Configures Ambari Server with “httpProxy” settings
Checklist for enterprises in the cloud
✓ Control and Automation
✓ Cloudy Services
✓ Security
✓ Enterprise-Grade Support
  ✓ SmartSense integration
  ✓ Flex support
Cloudbreak in da house
Main use cases
⬢ Run an internally hosted Cloudbreak service for:
– our CI/CD pipeline
– testing and prototyping HDP and HDF services
– self-service clusters for QE/SE/PS teams
Our technical goals
⬢ Run Cloudbreak in HA (High Availability) mode
– Ability to recover flows in case of node failure
– Avoid master-slave design / leader election problems
⬢ Scale Cloudbreak as we desire
– Distribute each cluster-related flow
– Cannot run 2 flows for the same cluster at the same time (e.g. 2 upscale flows)
– Flow cancellation must be handled
⬢ Scale the Web UI
– Had to introduce a Redis cluster for the session store
⬢ Scale every other service as well
⬢ Find a tool that makes it easy to deploy these services to multiple nodes
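The rule that two flows must not run for the same cluster at the same time can be illustrated with a small registry that admits at most one flow per cluster. This is a single-process sketch with hypothetical names (`FlowRegistry`, `try_start`, `finish`); with several Cloudbreak nodes the same check would presumably need a shared store, which the slides do not describe.

```python
import threading
from typing import Dict


class FlowRegistry:
    """Tracks at most one running flow per cluster (in-memory illustration)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running: Dict[str, str] = {}

    def try_start(self, cluster_id: str, flow_name: str) -> bool:
        """Register a flow; return False if the cluster already has one running."""
        with self._lock:
            if cluster_id in self._running:
                return False
            self._running[cluster_id] = flow_name
            return True

    def finish(self, cluster_id: str) -> None:
        """Mark the cluster's flow as finished or cancelled."""
        with self._lock:
            self._running.pop(cluster_id, None)
```

Flows for different clusters do not block each other, which matches the goal of distributing cluster-related flows across nodes.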
Why Kubernetes?
⬢ Not because it’s fancy...
⬢ Evaluated Kubernetes, Swarm, Mesos, Rancher
⬢ Open source / Active community with hands-on experience
⬢ Many cloud providers already support it
⬢ Lots of tooling behind it / API / CLI / Helm / Ansible / Salt
⬢ Integration with most of the cloud providers
– Provision Load Balancer (GCP, AWS, Azure)
– Use object stores to share data (Ceph, S3, GCP bucket, Azure Storage Account)
– Dynamic volume provisioning / Persistent disk (EBS, Azure Blob)
Thank you

More Related Content

What's hot (20)

PPTX
Connecting the Drops with Apache NiFi & Apache MiNiFi
DataWorks Summit
 
PPTX
Running Enterprise Workloads in the Cloud
DataWorks Summit
 
PPTX
Apache Hadoop YARN: state of the union
DataWorks Summit
 
PPTX
Deep learning on yarn running distributed tensorflow etc on hadoop cluster v3
DataWorks Summit
 
PPTX
Ozone- Object store for Apache Hadoop
Hortonworks
 
PPTX
An Overview on Optimization in Apache Hive: Past, Present Future
DataWorks Summit/Hadoop Summit
 
PPTX
Mission to NARs with Apache NiFi
Hortonworks
 
PDF
An Apache Hive Based Data Warehouse
DataWorks Summit
 
PPTX
Hive ACID Apache BigData 2016
alanfgates
 
PDF
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
DataWorks Summit
 
PPTX
Comparative Performance Analysis of AWS EC2 Instance Types Commonly Used for ...
DataWorks Summit
 
PPTX
Streamline Hadoop DevOps with Apache Ambari
DataWorks Summit/Hadoop Summit
 
PPTX
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
DataWorks Summit
 
PPTX
An Apache Hive Based Data Warehouse
DataWorks Summit
 
PPTX
Row/Column- Level Security in SQL for Apache Spark
DataWorks Summit/Hadoop Summit
 
PPTX
The Future of Apache Ambari
DataWorks Summit
 
PPTX
Enabling ABAC with Accumulo and Ranger integration
DataWorks Summit
 
PDF
Present and future of unified, portable and efficient data processing with Ap...
DataWorks Summit
 
PDF
Ozone and HDFS’s evolution
DataWorks Summit
 
PDF
Meet HBase 2.0 and Phoenix 5.0
DataWorks Summit
 
Connecting the Drops with Apache NiFi & Apache MiNiFi
DataWorks Summit
 
Running Enterprise Workloads in the Cloud
DataWorks Summit
 
Apache Hadoop YARN: state of the union
DataWorks Summit
 
Deep learning on yarn running distributed tensorflow etc on hadoop cluster v3
DataWorks Summit
 
Ozone- Object store for Apache Hadoop
Hortonworks
 
An Overview on Optimization in Apache Hive: Past, Present Future
DataWorks Summit/Hadoop Summit
 
Mission to NARs with Apache NiFi
Hortonworks
 
An Apache Hive Based Data Warehouse
DataWorks Summit
 
Hive ACID Apache BigData 2016
alanfgates
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
DataWorks Summit
 
Comparative Performance Analysis of AWS EC2 Instance Types Commonly Used for ...
DataWorks Summit
 
Streamline Hadoop DevOps with Apache Ambari
DataWorks Summit/Hadoop Summit
 
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
DataWorks Summit
 
An Apache Hive Based Data Warehouse
DataWorks Summit
 
Row/Column- Level Security in SQL for Apache Spark
DataWorks Summit/Hadoop Summit
 
The Future of Apache Ambari
DataWorks Summit
 
Enabling ABAC with Accumulo and Ranger integration
DataWorks Summit
 
Present and future of unified, portable and efficient data processing with Ap...
DataWorks Summit
 
Ozone and HDFS’s evolution
DataWorks Summit
 
Meet HBase 2.0 and Phoenix 5.0
DataWorks Summit
 

Similar to Running Enterprise Workloads in the Cloud (20)

PDF
Data in the Cloud Crash Course
DataWorks Summit
 
PDF
Data in the Cloud Crash Course
DataWorks Summit
 
PDF
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Hortonworks
 
PDF
Hadoop Operations – Past, Present, and Future
DataWorks Summit
 
PDF
Hadoop Operations - Past, Present, and Future
DataWorks Summit
 
PDF
Hadoop Everywhere & Cloudbreak
Sean Roberts
 
PDF
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...
Hortonworks
 
PPTX
Cloudbreak - Technical Deep Dive
DataWorks Summit/Hadoop Summit
 
PPTX
One Click Hadoop Clusters - Anywhere (Using Docker)
DataWorks Summit
 
PPTX
Docker based Hadoop Deployment
Rakesh Saha
 
PPTX
Hadoop on Docker
Rakesh Saha
 
PPTX
DEVNET-1141 Dynamic Dockerized Hadoop Provisioning
Cisco DevNet
 
PPTX
Running Cloudbreak on Kubernetes
Krisztián Horváth
 
PPTX
Running Cloudbreak on Kubernetes
Future of Data Meetup
 
PPTX
Docker based Hadoop provisioning - anywhere
Janos Matyas
 
PPTX
Cloudy with a Chance of Hadoop - Real World Considerations
DataWorks Summit/Hadoop Summit
 
PPTX
Cloudy with a chance of Hadoop - real world considerations
DataWorks Summit
 
PPTX
Hadoop & devOps : better together
Maxime Lanciaux
 
PPTX
Moving towards enterprise ready Hadoop clusters on the cloud
DataWorks Summit/Hadoop Summit
 
PPTX
A First-Hand Look at What's New in HDP 2.3
DataWorks Summit
 
Data in the Cloud Crash Course
DataWorks Summit
 
Data in the Cloud Crash Course
DataWorks Summit
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Hortonworks
 
Hadoop Operations – Past, Present, and Future
DataWorks Summit
 
Hadoop Operations - Past, Present, and Future
DataWorks Summit
 
Hadoop Everywhere & Cloudbreak
Sean Roberts
 
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...
Hortonworks
 
Cloudbreak - Technical Deep Dive
DataWorks Summit/Hadoop Summit
 
One Click Hadoop Clusters - Anywhere (Using Docker)
DataWorks Summit
 
Docker based Hadoop Deployment
Rakesh Saha
 
Hadoop on Docker
Rakesh Saha
 
DEVNET-1141 Dynamic Dockerized Hadoop Provisioning
Cisco DevNet
 
Running Cloudbreak on Kubernetes
Krisztián Horváth
 
Running Cloudbreak on Kubernetes
Future of Data Meetup
 
Docker based Hadoop provisioning - anywhere
Janos Matyas
 
Cloudy with a Chance of Hadoop - Real World Considerations
DataWorks Summit/Hadoop Summit
 
Cloudy with a chance of Hadoop - real world considerations
DataWorks Summit
 
Hadoop & devOps : better together
Maxime Lanciaux
 
Moving towards enterprise ready Hadoop clusters on the cloud
DataWorks Summit/Hadoop Summit
 
A First-Hand Look at What's New in HDP 2.3
DataWorks Summit
 
Ad

More from DataWorks Summit (20)

PPTX
Data Science Crash Course
DataWorks Summit
 
PPTX
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
PPTX
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
PDF
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
PPTX
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
PPTX
Managing the Dewey Decimal System
DataWorks Summit
 
PPTX
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
PPTX
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
PPTX
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
 
PPTX
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
 
PPTX
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
PPTX
Security Framework for Multitenant Architecture
DataWorks Summit
 
PDF
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
 
PPTX
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
PPTX
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
 
PPTX
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
 
PPTX
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
 
PPTX
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
PDF
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
PPTX
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 
Data Science Crash Course
DataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Managing the Dewey Decimal System
DataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
Security Framework for Multitenant Architecture
DataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 
Ad

Recently uploaded (20)

PDF
RAT Builders - How to Catch Them All [DeepSec 2024]
malmoeb
 
PPTX
AVL ( audio, visuals or led ), technology.
Rajeshwri Panchal
 
PPTX
PCU Keynote at IEEE World Congress on Services 250710.pptx
Ramesh Jain
 
PPTX
Farrell_Programming Logic and Design slides_10e_ch02_PowerPoint.pptx
bashnahara11
 
PDF
Basics of Electronics for IOT(actuators ,microcontroller etc..)
arnavmanesh
 
PDF
Make GenAI investments go further with the Dell AI Factory
Principled Technologies
 
PPTX
Machine Learning Benefits Across Industries
SynapseIndia
 
PDF
GDG Cloud Munich - Intro - Luiz Carneiro - #BuildWithAI - July - Abdel.pdf
Luiz Carneiro
 
PDF
MASTERDECK GRAPHSUMMIT SYDNEY (Public).pdf
Neo4j
 
PDF
Trying to figure out MCP by actually building an app from scratch with open s...
Julien SIMON
 
PPTX
Agile Chennai 18-19 July 2025 | Emerging patterns in Agentic AI by Bharani Su...
AgileNetwork
 
PDF
Peak of Data & AI Encore - Real-Time Insights & Scalable Editing with ArcGIS
Safe Software
 
PPTX
IT Runs Better with ThousandEyes AI-driven Assurance
ThousandEyes
 
PDF
Research-Fundamentals-and-Topic-Development.pdf
ayesha butalia
 
PDF
Generative AI vs Predictive AI-The Ultimate Comparison Guide
Lily Clark
 
PPTX
python advanced data structure dictionary with examples python advanced data ...
sprasanna11
 
PDF
Market Insight : ETH Dominance Returns
CIFDAQ
 
PDF
Tea4chat - another LLM Project by Kerem Atam
a0m0rajab1
 
PPTX
Earn Agentblazer Status with Slack Community Patna.pptx
SanjeetMishra29
 
PDF
Researching The Best Chat SDK Providers in 2025
Ray Fields
 
RAT Builders - How to Catch Them All [DeepSec 2024]
malmoeb
 
AVL ( audio, visuals or led ), technology.
Rajeshwri Panchal
 
PCU Keynote at IEEE World Congress on Services 250710.pptx
Ramesh Jain
 
Farrell_Programming Logic and Design slides_10e_ch02_PowerPoint.pptx
bashnahara11
 
Basics of Electronics for IOT(actuators ,microcontroller etc..)
arnavmanesh
 
Make GenAI investments go further with the Dell AI Factory
Principled Technologies
 
Machine Learning Benefits Across Industries
SynapseIndia
 
GDG Cloud Munich - Intro - Luiz Carneiro - #BuildWithAI - July - Abdel.pdf
Luiz Carneiro
 
MASTERDECK GRAPHSUMMIT SYDNEY (Public).pdf
Neo4j
 
Trying to figure out MCP by actually building an app from scratch with open s...
Julien SIMON
 
Agile Chennai 18-19 July 2025 | Emerging patterns in Agentic AI by Bharani Su...
AgileNetwork
 
Peak of Data & AI Encore - Real-Time Insights & Scalable Editing with ArcGIS
Safe Software
 
IT Runs Better with ThousandEyes AI-driven Assurance
ThousandEyes
 
Research-Fundamentals-and-Topic-Development.pdf
ayesha butalia
 
Generative AI vs Predictive AI-The Ultimate Comparison Guide
Lily Clark
 
python advanced data structure dictionary with examples python advanced data ...
sprasanna11
 
Market Insight : ETH Dominance Returns
CIFDAQ
 
Tea4chat - another LLM Project by Kerem Atam
a0m0rajab1
 
Earn Agentblazer Status with Slack Community Patna.pptx
SanjeetMishra29
 
Researching The Best Chat SDK Providers in 2025
Ray Fields
 

Running Enterprise Workloads in the Cloud

  • 1. 1 © Hortonworks Inc. 2011–2018. All rights reserved Running Enterprise Workloads in the Cloud Richard Doktorics Peter Darvasi
  • 2. 2 © Hortonworks Inc. 2011–2018. All rights reserved Who we are? ⬢ Peter Darvasi - Partner Engineer at Hortonworks - @pdarvasi ⬢ Richard Doktorics - Software Engineer - @doktoric
  • 3. 3 © Hortonworks Inc. 2011–2018. All rights reserved Agenda ⬢ What is Cloudbreak? ⬢ Enterprise checklist for big data in the cloud ⬢ Cloudbreak in da house ⬢ Questions
  • 4. 4 © Hortonworks Inc. 2011–2018. All rights reserved What is Cloudbreak?
  • 5. 5 © Hortonworks Inc. 2011–2018. All rights reserved Cloudbreak is a tool for provisioning Hadoop clusters on any cloud infrastructure Simplified Cluster Provisioning - prescriptive setup, simple automation
  • 6. 6 © Hortonworks Inc. 2011–2018. All rights reserved Deploy on Public or Private Clouds Dynamically configure and manage clusters on public or private clouds (Amazon Web Services, Microsoft Azure, Google Cloud Platform and OpenStack) Automated Scaling Seamlessly manage elasticity requirements as cluster workloads change (Ambari Metrics / Prometheus) Secured Cluster Access Supports configuration defining network boundaries and configuring security groups Highly Extensible Recipes to run custom commands Custom images
  • 7. 7 © Hortonworks Inc. 2011–2018. All rights reserved ⬢ Cloudbreak Deployer (CBD) – Written in Go and Bash – Compiled into single binary ⬢ Micro-service architecture – Each service runs in a Docker container – Each container is replaceable with custom ones – Services are handled with docker-compose Single node deployment
  • 8. 8 © Hortonworks Inc. 2011–2018. All rights reserved Enterprise checklist for big data in cloud
  • 9. 9 © Hortonworks Inc. 2011–2018. All rights reserved ✓ Control and Automation ✓ Cloudy Services ✓ Security ✓ Enterprise-Grade Support Checklist for enterprises in the cloud
  • 10. 1 0 © Hortonworks Inc. 2011–2018. All rights reserved ✓ Control and Automation ✓ Cloudy Services ✓ Security ✓ Enterprise-Grade Support Checklist for enterprises in the cloud ✓ Simple UX ✓ Powerful CLI ✓ Autoscaling
  • 11. 1 1 © Hortonworks Inc. 2011–2018. All rights reserved Simplified UX
  • 12. 1 2 © Hortonworks Inc. 2011–2018. All rights reserved Create Credential Experience
  • 13. 1 3 © Hortonworks Inc. 2011–2018. All rights reserved Built-In Blueprints
  • 14. 1 4 © Hortonworks Inc. 2011–2018. All rights reserved Basic and Advanced Cluster Creation Experiences BASIC ADVANCED
  • 15. 1 5 © Hortonworks Inc. 2011–2018. All rights reserved New Network and Security Group Choices ⬢ Network – Create new Network and new Subnet – Choose existing Network and existing Subnet ⬢ Security Groups – Create new SGs • Choose default SGs (minimal set of ports) • Create customized – Choose existing SGs
  • 16. 1 6 © Hortonworks Inc. 2011–2018. All rights reserved Powerful CLI
  • 17. 1 7 © Hortonworks Inc. 2011–2018. All rights reserved Cloudbreak CLI: Designed for DevOps
  • 18. 1 8 © Hortonworks Inc. 2011–2018. All rights reserved “Show cli command” for every request
  • 19. 1 9 © Hortonworks Inc. 2011–2018. All rights reserved Auto-scaling
  • 20. 2 0 © Hortonworks Inc. 2011–2018. All rights reserved Auto-Scaling ⬢ Alerts: Create metric or time-based alerts for cluster scaling ⬢ Policies: Scaling policies adjust cluster size based on activity and workload alerts ⬢ General Configurations: Boundaries and cooldown period
  • 21. 2 1 © Hortonworks Inc. 2011–2018. All rights reserved Auto-Scaling Time-Based Alert Fire at 10:15 am everyday
  • 22. 2 2 © Hortonworks Inc. 2011–2018. All rights reserved Auto-Scaling Metric-Based Alert Fire after NodeManagers are in CRITICAL state for 10 minutes
  • 23. 2 3 © Hortonworks Inc. 2011–2018. All rights reserved Auto-Scaling Policies ⬢ Define the Scale Adjustment (Node Count/Percentage/Exact size) ⬢ Select the HostGroup (to Scale) ⬢ Select Alert (which when fired, executes the Policy)
  • 24. 2 4 © Hortonworks Inc. 2011–2018. All rights reserved Auto-Scaling General Configurations ⬢ Cooldown Period (between scaling actions) ⬢ Minimum and Maximum Cluster size (boundaries) Cluster size boundaries Time Interval between two Autoscale events
  • 25. Checklist for enterprises in the cloud
      ✓ Control and Automation
      ✓ Cloudy Services
        ✓ Cloud Resources
        ✓ Hortonworks DataFlow
        ✓ Custom Images
      ✓ Security
      ✓ Enterprise-Grade Support
  • 26. Cloud Resources: RDBMS + LDAP
  • 27. Cloud Resources: RDBMS and LDAP/AD = Dynamic Blueprints
      ⬢ Background:
        – Cluster configuration often includes an external database (for Hive, Ranger, etc.) and LDAP/AD configs
        – It’s a challenge to know the different Blueprint configuration choices per service across the stack
      ⬢ Dynamic Blueprints:
        – Ability to manage External Sources (e.g. RDBMS and LDAP/AD) outside of your Blueprint
        – Cloudbreak will inject the configurations into your Blueprint
        – Simplifies reuse of external cloud resources
        – Simplifies your Blueprints: you don’t have to know all the configurations for each component
  • 28. Dynamic Blueprints: RDBMS/LDAP
      ⬢ Built-In Components:
        – Atlas, Ranger, Hadoop, Hive LLAP, Hive, Ambari, Oozie, Druid, SuperSet
      ⬢ Decision flow: are JDBC/LDAP properties in the Blueprint for the Component?
        – Yes: use the Blueprint as-is, no Component configuration property injection
        – No: inject Component configuration properties
      ⬢ Perform property variable replacement
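The decision flow on slide 28 can be sketched roughly as follows. This is an illustration only: the function name, the dictionary layout, and the `{{name}}` placeholder syntax are hypothetical, and the sketch assumes that variable replacement runs after both branches, which the slide does not state explicitly.

```python
from typing import Dict

Props = Dict[str, str]


def apply_external_source(blueprint: Dict[str, Props], component: str,
                          injected: Props,
                          variables: Dict[str, str]) -> Dict[str, Props]:
    """Return a copy of the blueprint with external-source settings applied."""
    result = {name: dict(props) for name, props in blueprint.items()}

    # Branch "No": the blueprint has no properties for this component,
    # so the component configuration properties are injected.
    # Branch "Yes": existing properties are kept as-is.
    if not result.get(component):
        result[component] = dict(injected)

    # Property variable replacement (assumed to apply in both branches).
    for name, props in result.items():
        replaced = {}
        for key, value in props.items():
            for var, actual in variables.items():
                value = value.replace("{{" + var + "}}", actual)
            replaced[key] = value
        result[name] = replaced
    return result


rdbms_defaults = {"connection_url": "jdbc:postgresql://{{db_host}}/hive"}
variables = {"db_host": "db.example.internal"}  # hypothetical host name
print(apply_external_source({}, "hive", rdbms_defaults, variables))
```

A blueprint that already carries its own settings for the component passes through the first step untouched, which matches the "use Blueprint as-is" branch.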
  • 29. At-Motion Workloads: Hortonworks DataFlow
  • 30. Hortonworks DataFlow in Cloudbreak
      ⬢ Default blueprint: “Flow Management: Apache NiFi”
        – HDF 3.1: NiFi, Ambari, Ambari Metrics, ZooKeeper
  • 31. HDF - cluster creation
  • 32. HDF - cluster creation
  • 33. Custom Images
  • 34. Background: Cloudbreak
      1. Cloudbreak creates VM instances using default base images.
      2. Cloudbreak installs Ambari on a VM instance.
      3. Cloudbreak instructs Ambari to install an HDP Cluster on other VM instances.
      [Diagram: Cloudbreak provisioning an HDP Cluster made up of HDP Node VMs running RHEL 7]
  • 35. Background: Cloudbreak Default Images
      ⬢ By default, Cloudbreak uses default base public images when creating VM instances.
      ⬢ Standard image operating system per cloud:
        – AWS: Amazon Linux 2017
        – Azure: CentOS 7.x
        – Google Cloud Platform: CentOS 7.x
        – OpenStack: CentOS 7.x
      ⬢ Support for Custom Images provides a way for Cloudbreak users to leverage their own custom image (not the default image) when creating VM instances.
  • 36. Making a Custom Image: Overview
      1. Create the Custom Image
      2. Register the Custom Image in Cloudbreak
      3. Use the Custom Image when Creating a Cluster
  • 37. Creating the Image: Code Repository
      ⬢ Instructions, Packer scripts and Salt states are in a public GitHub repository
        – https://github.com/hortonworks/cloudbreak-images
      ⬢ An understanding of Packer and Salt is useful
        – Packer creates infrastructure
        – Packer runs Salt provisioner
      ⬢ Customers should clone the repository and build on it
  • 38. Creating the Image: Example Scenarios
      ⬢ Scenario: For AWS, I don’t want Amazon Linux and instead want RHEL 7
        – Approach:
          1. Set up the repository and AWS environment
          2. Use the repository tools to build a RHEL 7 image: make build-aws-rhel7
      ⬢ Scenario: I don’t want OpenJDK and instead want Oracle JDK
        – Approach:
          1. Set up the repository and environment
          2. Turn on the Oracle optional state
          3. Use the repository tools to build an image
      ⬢ Scenario: For AWS, I don’t want Amazon Linux and instead want MY RHEL 7 (this is an advanced scenario)
        – Approach:
          1. Set up the repository and AWS environment
          2. Change the source base image
          3. Use the repository tools to build a RHEL 7 image: make build-aws-rhel7
  • 39. Use the Custom Image: Create Cluster (UI)
      ⬢ Create Cluster > General Configuration > Advanced
        – Choose image catalog
        – Choose the image you registered
        – Adjust the Ambari + HDP repos (if you want)
  • 40. Pre-Warmed Images
      ⬢ Prewarmed Images: OS + pre-installed Ambari and HDP
        – Pros: Cluster installs are faster; no internet connection is needed
        – Cons: Cannot change the Ambari or HDP versions; cannot use local repositories
      ⬢ Base Images: OS only
        – Pros: Can change the Ambari or HDP versions, or use local repositories
        – Cons: Cluster installs take longer
  • 41. Checklist for enterprises in the cloud
      ✓ Control and Automation
      ✓ Cloudy Services
      ✓ Security
        ✓ Kerberos support
        ✓ LDAP integration
        ✓ Proxy configuration
      ✓ Enterprise-Grade Support
  • 42. Cluster Security: Kerberos
  • 43. What is Kerberos
      ⬢ Strongly authenticating and establishing a user’s identity is the basis for secure access in Hadoop. Users need to be able to reliably “identify” themselves and then have that identity propagated throughout the Hadoop cluster.
      ⬢ Kerberos is the de facto system for authenticating access to distributed services
  • 44. Cloudbreak: Support for Enabling Kerberos
      ⬢ Goal
        – Provide a way for Cloudbreak users to create clusters that are Kerberos enabled
      ⬢ Approach
        – Ambari exposes a lot of Kerberos options
        – Leverage Ambari Kerberos options and avoid re-creating the Ambari Kerberos experience
        – Pragmatic, prescriptive options on top
  • 45. Cloudbreak: Enable Kerberos Security
      ⬢ Create Cluster > Security > Advanced
      ⬢ [ ] Enable Kerberos Security
  • 46. Options: Use Existing KDC or Use Test KDC
      ⬢ Use Test KDC
        – Not for production use. For testing and evaluation purposes only.
        – Installs and configures an MIT KDC on the master node.
        – Configures the cluster to leverage that KDC.
      ⬢ Use Existing KDC (Basic)
        – Provide basic information about your existing KDC.
        – Ambari Kerberos descriptors are generated automatically.
      ⬢ Use Existing KDC (Advanced)
        – Provide basic information about your existing KDC.
        – Provide your own Ambari Kerberos descriptors.
  • 47. Cloudbreak + LDAP/AD
  • 48. Cloudbreak User AuthN
      ⬢ Goal: Configure Cloudbreak to authenticate users against an external LDAP/AD
        – CloudFoundry UAA (User Account and Authentication Server) is the foundation: https://github.com/cloudfoundry/uaa
      ⬢ Two parts:
        1. Configure Cloudbreak to talk to external LDAP/AD
        2. Configure which group(s) can access Cloudbreak
  • 49. Step 1: Configure Cloudbreak to talk to LDAP/AD
      ⬢ On the Cloudbreak host, create: /var/lib/cloudbreak-deployment/uaa-changes.yml
      ⬢ Define the LDAP profile for users and groups
  • 50. Step 2: Configure which group(s) can access Cloudbreak
      ⬢ Configure which group(s) are authorized to access Cloudbreak:
          cbd util execute-ldap-mapping [group]
          cbd util delete-ldap-mapping [group]
      ⬢ To authorize users in the “Analysts” group to access Cloudbreak:
          cbd util execute-ldap-mapping cn=Analysts,ou=Groups,dc=hortonworks,dc=local
  • 51. Proxy configuration
  • 52. Limited Outbound Internet Access
      ⬢ Handle enterprise scenarios where:
        – Outbound internet access is limited (or restricted), and/or
        – A proxy must be used to obtain internet access
      ⬢ Cloudbreak needs to reach (optionally through an HTTP/S proxy):
        – Docker Hub
        – Cloudbreak dependencies
        – Default Image Catalog
      ⬢ Cloudbreak and Cluster Hosts need to reach (optionally through an HTTP/S proxy):
        – Cloud Provider APIs
        – HDP or HDF platform repositories
  • 53. Internet Access via Proxy
      ⬢ Cloudbreak Proxy Setup: How does Cloudbreak communicate through a proxy to get to the internet (and to the cluster hosts)?
      ⬢ Clusters Proxy Setup: How do the Cluster Hosts communicate through a proxy to get to the internet?
  • 54. Cloudbreak: Proxy Setup
      ⬢ Set up the Docker environment to use the proxy
        – Modify the Docker service to set HTTP_PROXY and HTTPS_PROXY (and NO_PROXY): https://docs.docker.com/config/daemon/systemd/#httphttps-proxy
      ⬢ Set up Cloudbreak to use the proxy in the Profile:
          HTTP_PROXY_HOST=your-proxy-host
          HTTPS_PROXY_HOST=your-proxy-host
          PROXY_PORT=your-proxy-port
          PROXY_USER=your-proxy-user
          PROXY_PASSWORD=your-proxy-password
      ⬢ Advanced Profile option “HTTPS_PROXYFORCLUSTERCONNECTION=true|false”
        – Defaults to “false”
  • 55. Cloudbreak: Advanced Proxy Scenarios
      ⬢ Scenario #1: Proxy for internet, not clusters
      ⬢ Scenario #2: Proxy for internet and clusters
  • 56. Clusters: Register Proxy Configuration
      ⬢ External Sources > Proxy Configurations
        – Credentials are optional; provide them if the proxy requires authentication
  • 57. Clusters: Configure Proxy for Cluster Hosts
      ⬢ Create Cluster > Advanced > External Sources > Configure Proxy
        – Configures yum with “proxy” settings
        – Configures Ambari Server with “httpProxy” settings
  • 58. Checklist for enterprises in the cloud
      ✓ Control and Automation
      ✓ Cloudy Services
      ✓ Security
      ✓ Enterprise-Grade Support
        ✓ SmartSense integration
        ✓ Flex support
  • 59. Cloudbreak in da house
  • 60. Main use cases
      ⬢ Have an internally hosted Cloudbreak service for:
        – our CI/CD pipeline
        – testing and prototyping HDP and HDF services
        – self-service clusters for QE/SE/PS teams
  • 61. Our technical goals
      ⬢ Run Cloudbreak in HA (High Availability) mode
        – Ability to recover flows in case of node failure
        – Avoid master-slave design / leader election problems
      ⬢ Scale Cloudbreak as we desire
        – Distribute each cluster-related flow
        – Cannot run 2 flows for the same cluster at the same time (e.g. 2 upscale flows)
        – Flow cancellation must be handled
      ⬢ Scale the Web UI
        – Had to introduce a Redis cluster for the session store
      ⬢ Scale every other service as well
      ⬢ Find a tool that makes it easy to deploy these services to multiple nodes
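The constraint that two flows must not run for the same cluster at the same time, together with the need to handle cancellation, can be illustrated with a small registry. This is a single-process sketch with hypothetical names, not how Cloudbreak implements it; in a multi-node HA deployment the bookkeeping would have to live in a store shared by all nodes rather than in memory.

```python
import threading
from typing import Dict, Optional


class FlowRegistry:
    """Tracks at most one running flow per cluster (in-memory sketch)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running: Dict[str, str] = {}  # cluster id -> flow name

    def try_start(self, cluster_id: str, flow: str) -> bool:
        """Register the flow unless another one already runs for the cluster."""
        with self._lock:
            if cluster_id in self._running:
                return False
            self._running[cluster_id] = flow
            return True

    def finish(self, cluster_id: str) -> Optional[str]:
        """Release the cluster when a flow completes or is cancelled."""
        with self._lock:
            return self._running.pop(cluster_id, None)


registry = FlowRegistry()
print(registry.try_start("cluster-1", "upscale"))  # first flow is accepted
print(registry.try_start("cluster-1", "upscale"))  # second one is rejected
registry.finish("cluster-1")                       # completion or cancellation
print(registry.try_start("cluster-1", "downscale"))
```

Flows for different clusters do not block each other, which is what allows them to be distributed across nodes.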
  • 62. Why Kubernetes?
      ⬢ Not because it’s fancy...
      ⬢ Evaluated Kubernetes, Swarm, Mesos, Rancher
      ⬢ Open source / Active community with hands-on experience
      ⬢ Many cloud providers already support it
      ⬢ Lots of tooling behind it / API / CLI / Helm / Ansible / Salt
      ⬢ Integration with most of the cloud providers
        – Provision Load Balancer (GCP, AWS, Azure)
        – Use object stores to share data (Ceph, S3, GCP bucket, Azure Storage Account)
        – Dynamic volume provisioning / Persistent disk (EBS, Azure Blob)
  • 63. Thank you