SlideShare a Scribd company logo
Scalable Object Storage with
Apache CloudStack and Apache
Hadoop
April 30 2013
Chiradeep Vittal
@chiradeep
Agenda
• What is CloudStack
• Object Storage for IAAS
• Current Architecture and Limitations
• Requirements for Object Storage
• Object Storage integrations in CloudStack
• HDFS for Object Storage
• Future directions
• History
• Incubating in the Apache
Software Foundation since
April 2012
• Open Source since May
2010
• In production since 2009
– Turnkey platform for delivering
IaaS clouds
– Full featured GUI, end-user API
and admin API
Apache CloudStack
Build your cloud the way the
world’s most successful
clouds are built
How did Amazon build its cloud?
Commodity
Servers
Commodity
Storage
Networking
Open Source Xen Hypervisor
Amazon Orchestration Software
AWS API (EC2, S3, …)
Amazon eCommerce Platform
How can YOU build a cloud?
Servers StorageNetworking
Open Source Xen Hypervisor
Amazon Orchestration Software
AWS API (EC2, S3, …)
Amazon eCommerce Platform
Hypervisor (Xen/KVM/VMW/)
CloudStack Orchestration Software
Optional Portal
CloudStack or AWS API
Secondary Storage
Image
L3/L2 core
DC Edge
End users
Pod Pod Pod Pod
Zone Architecture
Pod
Access Sw
MySQL
CloudStack
Admin/User API
Primary Storage
NFS/ISCSI/FC
Hypervisor (Xen
/VMWare/KVM)
VM
VM
Snapshot
Snapshot
Image
Disk Disk
VM
Cloud-Style Workloads
• Low cost
– Standardized, cookie cutter infrastructure
– Highly automated and efficient
• Application owns availability
– At scale everything breaks
– Focus on MTTR instead of MTBF
Secondary Storage
Image
L3/L2 core
DC Edge
Pod Pod Pod Pod
At scale…everything breaks
Pod
Access Sw
Primary Storage
NFS/ISCSI/FC
Hypervisor (Xen
/VMWare/KVM)
VM
VM
Snapshot
Snapshot
Image
Disk Disk
VM
Region “West”
Zone “West-Alpha”
Zone “West-Beta”
Zone “West-Gamma”
Zone “West-Delta”
Low Latency Backbone
(e.g., SONET ring)
Regions and zones
Region “East”
Region “South”
Internet
Geographic
separation
Region “West”
Low Latency
Secondary Storage in CloudStack 4.0
• NFS server default
– can be mounted by hypervisor
– Easy to obtain, set up and operate
• Problems with NFS:
– Scale: max limits of file systems
• Solution: CloudStack can manage multiple NFS stores (+
complexity)
– Performance
• N hypervisors : 1 storage CPU / 1 network link
– Wide area suitability for cross-region storage
• Chatty protocol
– Lack of replication
Object Storage Technology
Region “West”
Zone “West-Alpha”
Zone “West-Beta”
Zone “West-Gamma”
Zone “West-Delta”
Object Storage in a region
• Replication
• Audit
• Repair
• Maintenance
Region “West”
Object Storage enables reliability
Object Storage Technology
Region “West”
Object Storage also enables other
applications
Object Store
API Servers
• DropBox
• Static Content
• Archival
Object Storage characteristics
• Highly reliable and durable
– 99.9 % availability for AWS S3
– 99.999999999 % durability
• Massive scale
– 1.3 trillion objects stored across 7 AWS regions [Nov 2012 figures]
– Throughput: 830,000 requests per second
• Immutable objects
– Objects cannot be modified, only deleted
• Simple API
– PUT/POST objects, GET objects, DELETE objects
– No seek / no mutation / no POSIX API
• Flat namespace
– Everything stored in buckets.
– Bucket names are unique
– Buckets can only contain objects, not other buckets
• Cheap and getting cheaper
CloudStack S3 API Server
Object Storage Technology
S3
API Servers
MySQL
CloudStack S3 API Server
• Understands AWS S3 REST-style and SOAP API
• Pluggable backend
– Backend storage needs to map simple calls to their
API
• E.g., createContainer, saveObject, loadObject
– Default backend is a POSIX filesystem
– Backend with Caringo Object Store (commercial
vendor) available
– HDFS backend also available
• MySQL storage
– Bucket -> object mapping
– ACLs, bucket policies
Object Store Integration into
CloudStack
• For images and snapshots
• Replacement for NFS secondary storage
Or
Augmentation for NFS secondary storage
• Integrations available with
– Riak CS
– Openstack Swift
• New in 4.2 (upcoming):
– Framework for integrating storage providers
What do we want to build ?
• Open source, ASL licensed object storage
• Scales to at least 1 billion objects
• Reliability and durability on par with S3
• S3 API (or similar, e.g., Google Storage)
• Tooling around maintenance and
operation, specific to object storage
The following slides are a design
discussion
Architecture of Scalable Object
Storage
API Servers
Auth Servers
Object Servers Replicators/Auditors
Object
Lookup
Servers
Why HDFS
• ASF Project (Apache Hadoop)
• Immutable objects, replication
• Reliability, scale and performance
– 200 million objects in 1 cluster [Facebook]
– 100 PB in 1 cluster [Facebook]
• Simple operation
– Just add data nodes
HDFS-based Object Storage
S3 API Servers
S3 Auth Servers
Data nodes
Namenode
pair
HDFS API
BUT
• Name Node Scalability
– 150 bytes RAM / block
– GC issues
• Name Node SPOF
– Being addressed in the community✔
• Cross-zone replication
– Rack-awareness placement ✔
– What if the zones are spread a little further apart?
• Storage for object metadata
– ACLs, policies, timers
Name Node scalability
• 1 billion objects = 3 billion blocks (chunks)
– Average of 5 MB/object = 5 PB (actual), 15
PB (raw)
– 450 GB of RAM per Name Node
• 150b x 3 x 10^9
– 16 TB / node => 1000 Data nodes
• Requires Name Node federation ?
• Or an approach like HAR files
Name Node Federation
Extension: Federated NameNodes are HA pairs
Federation issues
• HA for name nodes
• Namespace shards
– Map object -> name node
• Requires another scalable key-value store
– HBase?
• Rebalancing between name nodes
Replication over lossy/slower links
A. Asynchronous replication
– Use distcp to replicate between clusters
– 6 copies vs. 3
– Master/Slave relationship
• Possibility of loss of data during failover
• Need coordination logic outside of HDFS
B. Synchronous replication
– API server writes to 2 clusters and acks only
when both writes are successful
– Availability compromised when one zone is
down
CAP Theorem
Consistency or Availability during partition
Many nuances
Storage for object metadata
A. Store it in HDFS along with the object
– Reads are expensive (e.g., to check ACL)
– Mutable data, needs layer over HDFS
B. Use another storage system (e.g. HBase)
– Name node federation also requires this.
C. Modify Name Node to store metadata
– High performance
– Not extensible
Object store on HDFS Future
• Viable for small-sized deployments
– Up to 100-200 million objects
– Datacenters close together
• Larger deployments needs development
– No effort ongoing at this time
Conclusion
• CloudStack needs object storage for
“cloud-style” workloads
• Object Storage is not easy
• HDFS comes close but not close enough
• Join the community!
Ad

More Related Content

What's hot (13)

Scaling Pinterest
Scaling PinterestScaling Pinterest
Scaling Pinterest
C4Media
 
HBaseConAsia2018 Track2-3: Bringing MySQL Compatibility to HBase using Databa...
HBaseConAsia2018 Track2-3: Bringing MySQL Compatibility to HBase using Databa...HBaseConAsia2018 Track2-3: Bringing MySQL Compatibility to HBase using Databa...
HBaseConAsia2018 Track2-3: Bringing MySQL Compatibility to HBase using Databa...
Michael Stack
 
Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015
Joydeep Sen Sarma
 
Amazon Athena Hands-On Workshop
Amazon Athena Hands-On WorkshopAmazon Athena Hands-On Workshop
Amazon Athena Hands-On Workshop
DoiT International
 
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
Michael Stack
 
An overview of Amazon Athena
An overview of Amazon AthenaAn overview of Amazon Athena
An overview of Amazon Athena
Julien SIMON
 
Cloud Optimized Big Data
Cloud Optimized Big DataCloud Optimized Big Data
Cloud Optimized Big Data
Joydeep Sen Sarma
 
Getting Started with EC2, S3 and EMR
Getting Started with EC2, S3 and EMRGetting Started with EC2, S3 and EMR
Getting Started with EC2, S3 and EMR
Arun Sirimalla
 
Oracle Databases on AWS - Getting the Best Out of RDS and EC2
Oracle Databases on AWS - Getting the Best Out of RDS and EC2Oracle Databases on AWS - Getting the Best Out of RDS and EC2
Oracle Databases on AWS - Getting the Best Out of RDS and EC2
Maris Elsins
 
Hadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and FutureHadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and Future
Ryan Hennig
 
Autoscaling Spark on AWS EC2 - 11th Spark London meetup
Autoscaling Spark on AWS EC2 - 11th Spark London meetupAutoscaling Spark on AWS EC2 - 11th Spark London meetup
Autoscaling Spark on AWS EC2 - 11th Spark London meetup
Rafal Kwasny
 
CosmosDB for DBAs & Developers
CosmosDB for DBAs & DevelopersCosmosDB for DBAs & Developers
CosmosDB for DBAs & Developers
Niko Neugebauer
 
Presto Fast SQL on Anything
Presto Fast SQL on AnythingPresto Fast SQL on Anything
Presto Fast SQL on Anything
Alluxio, Inc.
 
Scaling Pinterest
Scaling PinterestScaling Pinterest
Scaling Pinterest
C4Media
 
HBaseConAsia2018 Track2-3: Bringing MySQL Compatibility to HBase using Databa...
HBaseConAsia2018 Track2-3: Bringing MySQL Compatibility to HBase using Databa...HBaseConAsia2018 Track2-3: Bringing MySQL Compatibility to HBase using Databa...
HBaseConAsia2018 Track2-3: Bringing MySQL Compatibility to HBase using Databa...
Michael Stack
 
Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015
Joydeep Sen Sarma
 
Amazon Athena Hands-On Workshop
Amazon Athena Hands-On WorkshopAmazon Athena Hands-On Workshop
Amazon Athena Hands-On Workshop
DoiT International
 
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
Michael Stack
 
An overview of Amazon Athena
An overview of Amazon AthenaAn overview of Amazon Athena
An overview of Amazon Athena
Julien SIMON
 
Getting Started with EC2, S3 and EMR
Getting Started with EC2, S3 and EMRGetting Started with EC2, S3 and EMR
Getting Started with EC2, S3 and EMR
Arun Sirimalla
 
Oracle Databases on AWS - Getting the Best Out of RDS and EC2
Oracle Databases on AWS - Getting the Best Out of RDS and EC2Oracle Databases on AWS - Getting the Best Out of RDS and EC2
Oracle Databases on AWS - Getting the Best Out of RDS and EC2
Maris Elsins
 
Hadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and FutureHadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and Future
Ryan Hennig
 
Autoscaling Spark on AWS EC2 - 11th Spark London meetup
Autoscaling Spark on AWS EC2 - 11th Spark London meetupAutoscaling Spark on AWS EC2 - 11th Spark London meetup
Autoscaling Spark on AWS EC2 - 11th Spark London meetup
Rafal Kwasny
 
CosmosDB for DBAs & Developers
CosmosDB for DBAs & DevelopersCosmosDB for DBAs & Developers
CosmosDB for DBAs & Developers
Niko Neugebauer
 
Presto Fast SQL on Anything
Presto Fast SQL on AnythingPresto Fast SQL on Anything
Presto Fast SQL on Anything
Alluxio, Inc.
 

Similar to Scalable Object Storage with Apache CloudStack and Apache Hadoop (20)

Red Hat Storage Server For AWS
Red Hat Storage Server For AWSRed Hat Storage Server For AWS
Red Hat Storage Server For AWS
Red_Hat_Storage
 
HDF Cloud Services
HDF Cloud ServicesHDF Cloud Services
HDF Cloud Services
The HDF-EOS Tools and Information Center
 
HDFCloud Workshop: HDF5 in the Cloud
HDFCloud Workshop: HDF5 in the CloudHDFCloud Workshop: HDF5 in the Cloud
HDFCloud Workshop: HDF5 in the Cloud
The HDF-EOS Tools and Information Center
 
Red Hat Storage Day New York - What's New in Red Hat Ceph Storage
Red Hat Storage Day New York - What's New in Red Hat Ceph StorageRed Hat Storage Day New York - What's New in Red Hat Ceph Storage
Red Hat Storage Day New York - What's New in Red Hat Ceph Storage
Red_Hat_Storage
 
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Cloudian
 
New use cases for Ceph, beyond OpenStack, Luis Rico
New use cases for Ceph, beyond OpenStack, Luis RicoNew use cases for Ceph, beyond OpenStack, Luis Rico
New use cases for Ceph, beyond OpenStack, Luis Rico
Ceph Community
 
Migrating enterprise workloads to AWS
Migrating enterprise workloads to AWSMigrating enterprise workloads to AWS
Migrating enterprise workloads to AWS
Tom Laszewski
 
Hadoop Primer
Hadoop PrimerHadoop Primer
Hadoop Primer
Steve Staso
 
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMFGestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
SUSE Italy
 
Big data on aws
Big data on awsBig data on aws
Big data on aws
Serkan Özal
 
Barcamp Macau 2014 - Introduction to AWS
Barcamp Macau 2014 - Introduction to AWSBarcamp Macau 2014 - Introduction to AWS
Barcamp Macau 2014 - Introduction to AWS
Wong Hoi Sing Edison
 
Orchestration across multiple cloud platforms using Heat
Orchestration across multiple cloud platforms using HeatOrchestration across multiple cloud platforms using Heat
Orchestration across multiple cloud platforms using Heat
CoreStack
 
Big Data, Ingeniería de datos, y Data Lakes en AWS
Big Data, Ingeniería de datos, y Data Lakes en AWSBig Data, Ingeniería de datos, y Data Lakes en AWS
Big Data, Ingeniería de datos, y Data Lakes en AWS
javier ramirez
 
4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...
4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...
4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...
PROIDEA
 
DEVNET-1106 Upcoming Services in OpenStack
DEVNET-1106	Upcoming Services in OpenStackDEVNET-1106	Upcoming Services in OpenStack
DEVNET-1106 Upcoming Services in OpenStack
Cisco DevNet
 
Hadoop and object stores can we do it better
Hadoop and object stores  can we do it betterHadoop and object stores  can we do it better
Hadoop and object stores can we do it better
gvernik
 
Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?
gvernik
 
Managing storage on Prem and in Cloud
Managing storage on Prem and in CloudManaging storage on Prem and in Cloud
Managing storage on Prem and in Cloud
Howard Marks
 
Optimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public CloudOptimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public Cloud
Qubole
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the experts
DataWorks Summit
 
Red Hat Storage Server For AWS
Red Hat Storage Server For AWSRed Hat Storage Server For AWS
Red Hat Storage Server For AWS
Red_Hat_Storage
 
Red Hat Storage Day New York - What's New in Red Hat Ceph Storage
Red Hat Storage Day New York - What's New in Red Hat Ceph StorageRed Hat Storage Day New York - What's New in Red Hat Ceph Storage
Red Hat Storage Day New York - What's New in Red Hat Ceph Storage
Red_Hat_Storage
 
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Cloudian
 
New use cases for Ceph, beyond OpenStack, Luis Rico
New use cases for Ceph, beyond OpenStack, Luis RicoNew use cases for Ceph, beyond OpenStack, Luis Rico
New use cases for Ceph, beyond OpenStack, Luis Rico
Ceph Community
 
Migrating enterprise workloads to AWS
Migrating enterprise workloads to AWSMigrating enterprise workloads to AWS
Migrating enterprise workloads to AWS
Tom Laszewski
 
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMFGestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
SUSE Italy
 
Barcamp Macau 2014 - Introduction to AWS
Barcamp Macau 2014 - Introduction to AWSBarcamp Macau 2014 - Introduction to AWS
Barcamp Macau 2014 - Introduction to AWS
Wong Hoi Sing Edison
 
Orchestration across multiple cloud platforms using Heat
Orchestration across multiple cloud platforms using HeatOrchestration across multiple cloud platforms using Heat
Orchestration across multiple cloud platforms using Heat
CoreStack
 
Big Data, Ingeniería de datos, y Data Lakes en AWS
Big Data, Ingeniería de datos, y Data Lakes en AWSBig Data, Ingeniería de datos, y Data Lakes en AWS
Big Data, Ingeniería de datos, y Data Lakes en AWS
javier ramirez
 
4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...
4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...
4Developers 2018: Przetwarzanie Big Data w oparciu o architekturę Lambda na p...
PROIDEA
 
DEVNET-1106 Upcoming Services in OpenStack
DEVNET-1106	Upcoming Services in OpenStackDEVNET-1106	Upcoming Services in OpenStack
DEVNET-1106 Upcoming Services in OpenStack
Cisco DevNet
 
Hadoop and object stores can we do it better
Hadoop and object stores  can we do it betterHadoop and object stores  can we do it better
Hadoop and object stores can we do it better
gvernik
 
Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?
gvernik
 
Managing storage on Prem and in Cloud
Managing storage on Prem and in CloudManaging storage on Prem and in Cloud
Managing storage on Prem and in Cloud
Howard Marks
 
Optimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public CloudOptimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public Cloud
Qubole
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the experts
DataWorks Summit
 
Ad

More from buildacloud (20)

The Future of SDN in CloudStack by Chiradeep Vittal
The Future of SDN in CloudStack by Chiradeep VittalThe Future of SDN in CloudStack by Chiradeep Vittal
The Future of SDN in CloudStack by Chiradeep Vittal
buildacloud
 
Policy Based SDN Solution for DC and Branch Office by Suresh Boddapati
Policy Based SDN Solution for DC and Branch Office by Suresh BoddapatiPolicy Based SDN Solution for DC and Branch Office by Suresh Boddapati
Policy Based SDN Solution for DC and Branch Office by Suresh Boddapati
buildacloud
 
L4-L7 services for SDN and NVF by Youcef Laribi
L4-L7 services for SDN and NVF by Youcef LaribiL4-L7 services for SDN and NVF by Youcef Laribi
L4-L7 services for SDN and NVF by Youcef Laribi
buildacloud
 
Jenkins, jclouds, CloudStack, and CentOS by David Nalley
Jenkins, jclouds, CloudStack, and CentOS by David NalleyJenkins, jclouds, CloudStack, and CentOS by David Nalley
Jenkins, jclouds, CloudStack, and CentOS by David Nalley
buildacloud
 
Intro to Zenoss by Andrew Kirch
Intro to Zenoss by Andrew KirchIntro to Zenoss by Andrew Kirch
Intro to Zenoss by Andrew Kirch
buildacloud
 
Guaranteeing Storage Performance by Mike Tutkowski
Guaranteeing Storage Performance by Mike TutkowskiGuaranteeing Storage Performance by Mike Tutkowski
Guaranteeing Storage Performance by Mike Tutkowski
buildacloud
 
Cloud Application Blueprints with Apache Brooklyn by Alex Henevald
Cloud Application Blueprints with Apache Brooklyn by Alex HenevaldCloud Application Blueprints with Apache Brooklyn by Alex Henevald
Cloud Application Blueprints with Apache Brooklyn by Alex Henevald
buildacloud
 
Introduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David NalleyIntroduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David Nalley
buildacloud
 
Managing infrastructure with Application Policy by Mike Cohen
Managing infrastructure with Application Policy by Mike CohenManaging infrastructure with Application Policy by Mike Cohen
Managing infrastructure with Application Policy by Mike Cohen
buildacloud
 
Intro to Zenoss by Andrew Kirch
Intro to Zenoss by Andrew KirchIntro to Zenoss by Andrew Kirch
Intro to Zenoss by Andrew Kirch
buildacloud
 
Monitoring CloudStack in context with Converged Infrastructure by Mike Turnlund
Monitoring CloudStack in context with Converged Infrastructure by Mike TurnlundMonitoring CloudStack in context with Converged Infrastructure by Mike Turnlund
Monitoring CloudStack in context with Converged Infrastructure by Mike Turnlund
buildacloud
 
Rest api design by george reese
Rest api design by george reeseRest api design by george reese
Rest api design by george reese
buildacloud
 
Enterprise grade firewall and ssl termination to ac by will stevens
Enterprise grade firewall and ssl termination to ac by will stevensEnterprise grade firewall and ssl termination to ac by will stevens
Enterprise grade firewall and ssl termination to ac by will stevens
buildacloud
 
State of the cloud by reuven cohen
State of the cloud by reuven cohenState of the cloud by reuven cohen
State of the cloud by reuven cohen
buildacloud
 
Securing Your Cloud With the Xen Hypervisor by Russell Pavlicek
Securing Your Cloud With the Xen Hypervisor by Russell PavlicekSecuring Your Cloud With the Xen Hypervisor by Russell Pavlicek
Securing Your Cloud With the Xen Hypervisor by Russell Pavlicek
buildacloud
 
DevCloud - Setup and Demo on Apache CloudStack
DevCloud - Setup and Demo on Apache CloudStack DevCloud - Setup and Demo on Apache CloudStack
DevCloud - Setup and Demo on Apache CloudStack
buildacloud
 
Cloud Network Virtualization with Juniper Contrail
Cloud Network Virtualization with Juniper ContrailCloud Network Virtualization with Juniper Contrail
Cloud Network Virtualization with Juniper Contrail
buildacloud
 
Ian rae panel cloud stack & cloud storage where are we at, and where do we ne...
Ian rae panel cloud stack & cloud storage where are we at, and where do we ne...Ian rae panel cloud stack & cloud storage where are we at, and where do we ne...
Ian rae panel cloud stack & cloud storage where are we at, and where do we ne...
buildacloud
 
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
buildacloud
 
CloudStack University by Sebastien Goasguen
CloudStack University by Sebastien GoasguenCloudStack University by Sebastien Goasguen
CloudStack University by Sebastien Goasguen
buildacloud
 
The Future of SDN in CloudStack by Chiradeep Vittal
The Future of SDN in CloudStack by Chiradeep VittalThe Future of SDN in CloudStack by Chiradeep Vittal
The Future of SDN in CloudStack by Chiradeep Vittal
buildacloud
 
Policy Based SDN Solution for DC and Branch Office by Suresh Boddapati
Policy Based SDN Solution for DC and Branch Office by Suresh BoddapatiPolicy Based SDN Solution for DC and Branch Office by Suresh Boddapati
Policy Based SDN Solution for DC and Branch Office by Suresh Boddapati
buildacloud
 
L4-L7 services for SDN and NVF by Youcef Laribi
L4-L7 services for SDN and NVF by Youcef LaribiL4-L7 services for SDN and NVF by Youcef Laribi
L4-L7 services for SDN and NVF by Youcef Laribi
buildacloud
 
Jenkins, jclouds, CloudStack, and CentOS by David Nalley
Jenkins, jclouds, CloudStack, and CentOS by David NalleyJenkins, jclouds, CloudStack, and CentOS by David Nalley
Jenkins, jclouds, CloudStack, and CentOS by David Nalley
buildacloud
 
Intro to Zenoss by Andrew Kirch
Intro to Zenoss by Andrew KirchIntro to Zenoss by Andrew Kirch
Intro to Zenoss by Andrew Kirch
buildacloud
 
Guaranteeing Storage Performance by Mike Tutkowski
Guaranteeing Storage Performance by Mike TutkowskiGuaranteeing Storage Performance by Mike Tutkowski
Guaranteeing Storage Performance by Mike Tutkowski
buildacloud
 
Cloud Application Blueprints with Apache Brooklyn by Alex Henevald
Cloud Application Blueprints with Apache Brooklyn by Alex HenevaldCloud Application Blueprints with Apache Brooklyn by Alex Henevald
Cloud Application Blueprints with Apache Brooklyn by Alex Henevald
buildacloud
 
Introduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David NalleyIntroduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David Nalley
buildacloud
 
Managing infrastructure with Application Policy by Mike Cohen
Managing infrastructure with Application Policy by Mike CohenManaging infrastructure with Application Policy by Mike Cohen
Managing infrastructure with Application Policy by Mike Cohen
buildacloud
 
Intro to Zenoss by Andrew Kirch
Intro to Zenoss by Andrew KirchIntro to Zenoss by Andrew Kirch
Intro to Zenoss by Andrew Kirch
buildacloud
 
Monitoring CloudStack in context with Converged Infrastructure by Mike Turnlund
Monitoring CloudStack in context with Converged Infrastructure by Mike TurnlundMonitoring CloudStack in context with Converged Infrastructure by Mike Turnlund
Monitoring CloudStack in context with Converged Infrastructure by Mike Turnlund
buildacloud
 
Rest api design by george reese
Rest api design by george reeseRest api design by george reese
Rest api design by george reese
buildacloud
 
Enterprise grade firewall and ssl termination to ac by will stevens
Enterprise grade firewall and ssl termination to ac by will stevensEnterprise grade firewall and ssl termination to ac by will stevens
Enterprise grade firewall and ssl termination to ac by will stevens
buildacloud
 
State of the cloud by reuven cohen
State of the cloud by reuven cohenState of the cloud by reuven cohen
State of the cloud by reuven cohen
buildacloud
 
Securing Your Cloud With the Xen Hypervisor by Russell Pavlicek
Securing Your Cloud With the Xen Hypervisor by Russell PavlicekSecuring Your Cloud With the Xen Hypervisor by Russell Pavlicek
Securing Your Cloud With the Xen Hypervisor by Russell Pavlicek
buildacloud
 
DevCloud - Setup and Demo on Apache CloudStack
DevCloud - Setup and Demo on Apache CloudStack DevCloud - Setup and Demo on Apache CloudStack
DevCloud - Setup and Demo on Apache CloudStack
buildacloud
 
Cloud Network Virtualization with Juniper Contrail
Cloud Network Virtualization with Juniper ContrailCloud Network Virtualization with Juniper Contrail
Cloud Network Virtualization with Juniper Contrail
buildacloud
 
Ian rae panel cloud stack & cloud storage where are we at, and where do we ne...
Ian rae panel cloud stack & cloud storage where are we at, and where do we ne...Ian rae panel cloud stack & cloud storage where are we at, and where do we ne...
Ian rae panel cloud stack & cloud storage where are we at, and where do we ne...
buildacloud
 
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
buildacloud
 
CloudStack University by Sebastien Goasguen
CloudStack University by Sebastien GoasguenCloudStack University by Sebastien Goasguen
CloudStack University by Sebastien Goasguen
buildacloud
 
Ad

Recently uploaded (20)

Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 

Scalable Object Storage with Apache CloudStack and Apache Hadoop

  • 1. Scalable Object Storage with Apache CloudStack and Apache Hadoop April 30 2013 Chiradeep Vittal @chiradeep
  • 2. Agenda • What is CloudStack • Object Storage for IAAS • Current Architecture and Limitations • Requirements for Object Storage • Object Storage integrations in CloudStack • HDFS for Object Storage • Future directions
  • 3. • History • Incubating in the Apache Software Foundation since April 2012 • Open Source since May 2010 • In production since 2009 – Turnkey platform for delivering IaaS clouds – Full featured GUI, end-user API and admin API Apache CloudStack Build your cloud the way the world’s most successful clouds are built
  • 4. How did Amazon build its cloud? Commodity Servers Commodity Storage Networking Open Source Xen Hypervisor Amazon Orchestration Software AWS API (EC2, S3, …) Amazon eCommerce Platform
  • 5. How can YOU build a cloud? Servers StorageNetworking Open Source Xen Hypervisor Amazon Orchestration Software AWS API (EC2, S3, …) Amazon eCommerce Platform Hypervisor (Xen/KVM/VMW/) CloudStack Orchestration Software Optional Portal CloudStack or AWS API
  • 6. Secondary Storage Image L3/L2 core DC Edge End users Pod Pod Pod Pod Zone Architecture Pod Access Sw MySQL CloudStack Admin/User API Primary Storage NFS/ISCSI/FC Hypervisor (Xen /VMWare/KVM) VM VM Snapshot Snapshot Image Disk Disk VM
  • 7. Cloud-Style Workloads • Low cost – Standardized, cookie cutter infrastructure – Highly automated and efficient • Application owns availability – At scale everything breaks – Focus on MTTR instead of MTBF
  • 8. Secondary Storage Image L3/L2 core DC Edge Pod Pod Pod Pod At scale…everything breaks Pod Access Sw Primary Storage NFS/ISCSI/FC Hypervisor (Xen /VMWare/KVM) VM VM Snapshot Snapshot Image Disk Disk VM
  • 9. Region “West” Zone “West-Alpha” Zone “West-Beta” Zone “West-Gamma” Zone “West-Delta” Low Latency Backbone (e.g., SONET ring) Regions and zones
  • 11. Secondary Storage in CloudStack 4.0 • NFS server default – can be mounted by hypervisor – Easy to obtain, set up and operate • Problems with NFS: – Scale: max limits of file systems • Solution: CloudStack can manage multiple NFS stores (+ complexity) – Performance • N hypervisors : 1 storage CPU / 1 network link – Wide area suitability for cross-region storage • Chatty protocol – Lack of replication
  • 12. Object Storage Technology Region “West” Zone “West-Alpha” Zone “West-Beta” Zone “West-Gamma” Zone “West-Delta” Object Storage in a region • Replication • Audit • Repair • Maintenance
  • 13. Region “West” Object Storage enables reliability
  • 14. Object Storage Technology Region “West” Object Storage also enables other applications Object Store API Servers • DropBox • Static Content • Archival
  • 15. Object Storage characteristics • Highly reliable and durable – 99.9 % availability for AWS S3 – 99.999999999 % durability • Massive scale – 1.3 trillion objects stored across 7 AWS regions [Nov 2012 figures] – Throughput: 830,000 requests per second • Immutable objects – Objects cannot be modified, only deleted • Simple API – PUT/POST objects, GET objects, DELETE objects – No seek / no mutation / no POSIX API • Flat namespace – Everything stored in buckets. – Bucket names are unique – Buckets can only contain objects, not other buckets • Cheap and getting cheaper
  • 16. CloudStack S3 API Server Object Storage Technology S3 API Servers MySQL
  • 17. CloudStack S3 API Server • Understands AWS S3 REST-style and SOAP API • Pluggable backend – Backend storage needs to map simple calls to their API • E.g., createContainer, saveObject, loadObject – Default backend is a POSIX filesystem – Backend with Caringo Object Store (commercial vendor) available – HDFS backend also available • MySQL storage – Bucket -> object mapping – ACLs, bucket policies
  • 18. Object Store Integration into CloudStack • For images and snapshots • Replacement for NFS secondary storage Or Augmentation for NFS secondary storage • Integrations available with – Riak CS – Openstack Swift • New in 4.2 (upcoming): – Framework for integrating storage providers
  • 19. What do we want to build ? • Open source, ASL licensed object storage • Scales to at least 1 billion objects • Reliability and durability on par with S3 • S3 API (or similar, e.g., Google Storage) • Tooling around maintenance and operation, specific to object storage
  • 20. The following slides are a design discussion
  • 21. Architecture of Scalable Object Storage API Servers Auth Servers Object Servers Replicators/Auditors Object Lookup Servers
  • 22. Why HDFS • ASF Project (Apache Hadoop) • Immutable objects, replication • Reliability, scale and performance – 200 million objects in 1 cluster [Facebook] – 100 PB in 1 cluster [Facebook] • Simple operation – Just add data nodes
  • 23. HDFS-based Object Storage S3 API Servers S3 Auth Servers Data nodes Namenode pair HDFS API
  • 24. BUT • Name Node Scalability – 150 bytes RAM / block – GC issues • Name Node SPOF – Being addressed in the community✔ • Cross-zone replication – Rack-awareness placement ✔ – What if the zones are spread a little further apart? • Storage for object metadata – ACLs, policies, timers
  • 25. Name Node scalability • 1 billion objects = 3 billion blocks (chunks) – Average of 5 MB/object = 5 PB (actual), 15 PB (raw) – 450 GB of RAM per Name Node • 150b x 3 x 10^9 – 16 TB / node => 1000 Data nodes • Requires Name Node federation ? • Or an approach like HAR files
  • 26. Name Node Federation Extension: Federated NameNodes are HA pairs
  • 27. Federation issues • HA for name nodes • Namespace shards – Map object -> name node • Requires another scalable key-value store – HBase? • Rebalancing between name nodes
  • 28. Replication over lossy/slower links A. Asynchronous replication – Use distcp to replicate between clusters – 6 copies vs. 3 – Master/Slave relationship • Possibility of loss of data during failover • Need coordination logic outside of HDFS B. Synchronous replication – API server writes to 2 clusters and acks only when both writes are successful – Availability compromised when one zone is down
  • 29. CAP Theorem Consistency or Availability during partition Many nuances
  • 30. Storage for object metadata A. Store it in HDFS along with the object – Reads are expensive (e.g., to check ACL) – Mutable data, needs layer over HDFS B. Use another storage system (e.g. HBase) – Name node federation also requires this. C. Modify Name Node to store metadata – High performance – Not extensible
  • 31. Object store on HDFS Future • Viable for small-sized deployments – Up to 100-200 million objects – Datacenters close together • Larger deployments needs development – No effort ongoing at this time
  • 32. Conclusion • CloudStack needs object storage for “cloud-style” workloads • Object Storage is not easy • HDFS comes close but not close enough • Join the community!

Editor's Notes

  • #4: Need a better slide than this
  • #5: Frequently require CCNA , Vmwareceritification, EMC training, etc etc. But they chose commondity systems. And simple networking.Can also sell cheaply since they use their own commerce platform.
  • #6: The key here is the API on top of the infrastructure. This is the disruptive piece for the industry. Forget about CCNA, Vmware cert, now people can programmatically control their infrastructure as well as the VMs on top of it.