An Insider’s Guide to Object Storage

Table of Contents
Introduction
1: The Storage Challenge
2: Solving the Challenge: Object Storage
3: How Does Object Storage Work?
   Flat File System
   Clustered Nodes
   Object vs File Storage
   Retrieving Objects
   Data Protection
   The S3 API
   The Difference Maker: Metadata
4: Object Storage and the Cloud
5: The Financial Benefits of Object Storage
   CAPEX Savings
   OPEX Savings
6: The Technical Benefits of Object Storage
   Limitless Scalability
   Driven by Metadata
   Cloud Native
   Files and Objects
7: Object Storage Use Cases
   File Services
   Data Protection
   Media and Entertainment
   Storage as a Service
   Healthcare
   Video Surveillance
   Artificial Intelligence (AI)
8: Case Study: WGBH
9: Conclusion

Introduction
The data revolution is upon us. Organizations in every industry generate exponentially larger volumes of unstructured data than ever before. They also retain, re-use, and learn from that data to a far greater extent. Add to this the emergence of cloud and IoT, and it becomes apparent that unlimited, affordable storage scalability is an increasingly important component of a company’s long-term success.

Today’s massive amounts of data are generated and stored on-premises, at the edge, and in the cloud, greatly increasing storage management complexity. To meet the needs of this transformation, the way we store and manage unstructured data must change.

That’s where hyperscale object storage comes in.

In this ebook you’ll learn about today’s storage challenges and how object storage provides unique capabilities to address them. You’ll acquire a better sense of how object storage works, and how it can fundamentally redefine data management at scale. Ultimately, you’ll gain peace of mind from knowing the value of object storage for your organization.

Armed with this information, you’ll be much better equipped to take advantage of the data revolution and reap maximum ROI from your storage investment.

1: The Storage Challenge
Each succeeding generation of computing creates greater and greater volumes of data. In 1986 the world’s combined data totaled 2.6 exabytes (EB). In 2016, 16 zettabytes (ZB) of data was generated worldwide (1 ZB = 1,000 EB). By 2025 the amount of usable business data will soar to 163 ZB, according to market research firm IDC.

Figure: data generated worldwide, 1986 to 2026. 1986: 2.6 EB. 2016: 16.1 ZB. 2025: 163 ZB. IDC predicts the total amount of data generated worldwide will skyrocket to more than 163 zettabytes by 2025.

Unstructured data represents the vast majority of this information: 80%, by most accounts.

Scale Capacity Without Complexity
As a result, capacity requirements for unstructured data are growing more than 50% year over year. No wonder that CAPACITY GROWTH is the No. 1 challenge facing most organizations that generate and use vast amounts of data.

Rapid Data Search
Users across every industry increasingly demand that data be easily searchable and instantly accessible. This desire for speed is a direct consequence of our growing dependence on the internet. When the world’s data is instantly available on the web, users expect their own data to be just as searchable and accessible.

Instant Data Access
Businesses today recognize the value of data across all phases of their operation, including archived information. In diverse use cases such as healthcare, security, and media and entertainment, unstructured data has enduring value as we find new ways to analyze and repurpose it. For instance:
• Broadcasters now re-use media files multiple times as their content is formatted for distribution in broadcast, mobile streaming, and web versions.
• Medical firms re-analyze archived records, looking for predictive insights.
• Manufacturers examine machine sensor data, trying to anticipate failures before they occur.
• Vehicle makers now collect millions of miles of driving data, endeavoring to train autonomous cars and trucks.

All of this data has tremendous value, but only if it can be economically stored, efficiently analyzed, and instantly searched. Traditional storage lacks the needed economics and scale to achieve these tasks, and tape media lacks the required access time and search capability. Meeting these objectives is attainable only with object storage.
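To put the growth figures in this chapter in perspective, the short calculation below works out what they imply. It is only a back-of-the-envelope sketch: it assumes growth is a constant percentage each year (the chapter does not say so), and the function names are illustrative.

```python
import math

# Figures quoted in this chapter (IDC): 16.1 ZB in 2016, 163 ZB projected for 2025.
DATA_2016_ZB = 16.1
DATA_2025_ZB = 163.0


def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


def doubling_time_years(annual_growth: float) -> float:
    """Return how long it takes to double at a constant yearly growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


# Assumption: smooth, constant-rate growth between the two quoted data points.
implied_rate = compound_annual_growth(DATA_2016_ZB, DATA_2025_ZB, 2025 - 2016)
print(f"Implied yearly growth, 2016 to 2025: {implied_rate:.1%}")
print(f"Doubling time at 50% yearly growth: {doubling_time_years(0.50):.2f} years")
```

Under that assumption, the worldwide totals imply growth of roughly 29% per year, while unstructured capacity growing at 50% per year would double in about 1.7 years.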

FASTFACT
Unstructured data, also referred to as “file data,” is any information not managed by a
database application, including high-resolution digital media, healthcare records, and
engineering files. It accounts for about 80% of all capacity generated today.

Hyperscale object storage provides the most scalable, most economical solution for meeting storage challenges.

Object storage:
• Strips complexity from the data center
• Addresses an organization’s demands for global data search
• Ensures data durability
• Accommodates new requirements (such as GDPR) regulating where data is stored and how it is managed

2: Solving the Challenge with Object Storage
Data center managers, content creators, broadcasters, researchers, healthcare providers, and software developers all require storage solutions that help them contend with the explosive growth in unstructured data.

Object storage offers unique capabilities to meet these needs. Far from being an incremental improvement over earlier technologies, it is a fundamentally different way of storing data. Here’s why:

Exabyte Scalable
Modular design for non-disruptive growth
Traditional storage systems were designed with an upper limit on capacity. Object storage solves the scale problem with an architecture that eliminates this limitation. But this exabyte-plus scale is only part of the solution. How easily the system grows is just as critical as how big it can get. With object storage, scaling is non-disruptive, so you can add capacity when needed.

Files and Objects Together
Consolidated storage for unstructured data
Files remain the most common means of managing unstructured data. Object storage lets you combine file and object data in a single pool. Unlike NAS, in which the file hierarchy limits growth, object storage stores files in a flat file system that can expand forever.

Geo-Distribution
One storage pool that spans the globe
With the advent of IoT, remote sensing technologies, and low-cost 4K cameras, high-capacity data is now created everywhere. This paradigm shift places new demands on networking and storage technologies. Object storage addresses this challenge with a distributed system in which nodes may be deployed wherever needed. Low-cost, remote storage lets analysis happen where the data is collected, rather than having to load the network with raw information. Whether providing local storage for remote applications or disaster recovery (DR) capability across sites, object storage possesses the needed data management capability to perform these tasks simply and economically.

Cloud Integration
Hybrid cloud and multi-cloud ready
Most organizations today plan to use both cloud and on-prem storage. Analysts predict continued rapid growth for both storage types. Object storage employs the S3 API, so it speaks the language of the cloud. It also incorporates data management features that simplify data placement, so that cloud and on-prem storage become two parts of a single global namespace.

The Benefits of Object Storage

Scalability, Searchability, Geo-Distribution, Rapid Access, Simplicity, Data Durability, Data Protection, Compliance

3: How Does Object Storage Work?
Object storage combines multiple technologies in a single, integrated system to solve the capacity and data management challenge. Here’s a quick look at the key elements.

Flat File System
Every storage system needs a way to index, or locate, information so it can be managed and retrieved. With traditional SAN and NAS technologies, these indexing schemes have built-in limits. NAS file systems use a hierarchy, like branches on a tree. SAN uses direct addressing, like postal addresses. Both have scale limits. Furthermore, with these systems, performance degradation may slow growth before the theoretical limits are reached.

Object storage, on the other hand, employs a flat file structure that has no limits. An “object” includes:
• User data (usually a file)
• A unique ID that is created from the data itself
• User-defined metadata, which can be used to describe that object’s content

Objects are then organized in “buckets,” which are analogous to folders for files. These containers can hold similar or related objects, allowing them to be managed as groups.

Clustered Nodes
Object storage is always a clustered system, never a single device. Any “node” in the cluster can see and retrieve any data. The cluster can be expanded simply by adding more nodes. The nodes themselves are stand-alone devices (typically industry-standard servers running a software-defined storage application) interconnected via either LAN or WAN. Each has a metadata catalog, knows where in the cluster data is stored, and can respond independently to data requests. For enhanced performance, nodes can also work in parallel to accelerate the delivery of large objects.

Retrieving Objects
To retrieve data from an object store, you simply ask for it by its object ID. Objects may be local or at other sites, but because they are in a flat address space, they are retrieved in exactly the same way.
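The object model described in this chapter (user data, an ID derived from the data, user-defined metadata, and buckets over a flat namespace) can be sketched in a few lines. This is a toy, single-process illustration, not how any particular product works: the class and method names are hypothetical, and using a SHA-256 hash as the ID is an assumption made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StoredObject:
    data: bytes                                   # the user data (usually a file)
    metadata: dict = field(default_factory=dict)  # user-defined tags


class FlatObjectStore:
    """Toy in-memory store: one flat ID namespace, with buckets as groupings."""

    def __init__(self) -> None:
        self._objects = {}  # object ID -> StoredObject (no directory hierarchy)
        self._buckets = {}  # bucket name -> set of object IDs

    def put(self, bucket: str, data: bytes, metadata: Optional[dict] = None) -> str:
        # Assumption: the unique ID is a SHA-256 hash of the data itself.
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = StoredObject(data, dict(metadata or {}))
        self._buckets.setdefault(bucket, set()).add(object_id)
        return object_id

    def get(self, object_id: str) -> StoredObject:
        # Retrieval needs only the ID, never a path.
        return self._objects[object_id]

    def list_bucket(self, bucket: str) -> list:
        return sorted(self._buckets.get(bucket, set()))


store = FlatObjectStore()
ticket = store.put("media", b"raw video frames", {"project": "nova"})
print(ticket[:12], store.get(ticket).metadata)
```

Because the ID in this sketch is a content hash, storing identical data twice yields the same ID. A real cluster would also spread objects across nodes, which the sketch leaves out.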

How Does Object Storage Solve the Storage Capacity Problem?


To visualize how object storage works, first visualize traditional storage as a
parking lot. That lot has a limited number of spaces. When it’s full, you need to
find a new parking lot. Furthermore, as it fills up, it will take you longer to find an
empty space.
Now visualize object storage as valet parking. You leave your car with the
attendant who gives you a ticket, a unique ID. He is then free to park your car
anywhere space is available. He keeps a record of where your car is parked, and
is therefore not limited to the space in a single lot.

Object vs File Storage: What’s the Difference?

A file is information written in a specific format. That format is known to an application, and is identified by the file’s suffix. A “.jpg” file format, for example, would be usable by an application that manages images.

An object, on the other hand, is a package that contains information (perhaps a file or a part of a file). There is also metadata that describes the user data. A unique ID identifies the object. The object can be divided into “slices” to be distributed across multiple nodes.

Unlike files, an object can include a large amount of metadata, which enhances search capabilities using sophisticated Google-like tools.

              OBJECT STORAGE                        FILE STORAGE
PERFORMANCE   Performs best for big content         Performs best for smaller files
              and high stream throughput
GEOGRAPHY     Data can be stored across             Data typically stored locally
              multiple regions
SCALABILITY   Can scale infinitely to               Operational limits reached
              exabytes and beyond                   at a few PBs
ANALYTICS     Customizable metadata allows data     Limited number of set
              to be easily organized and retrieved  metadata tags

Data Protection
To guard against component failure, such as a failed hard drive, all storage systems employ data protection. With object storage, the built-in failure protection can do much more. It can guard against whole-device failure, rack failure, or even site failure.

When data is written to object storage, it is protected with either ERASURE CODING, or DATA REPLICATION, or both. Erasure coding breaks the data into smaller segments (“slices”) and writes them to multiple nodes within the cluster. It also writes additional data segments to other nodes for data durability purposes, ensuring that the object can be restored even if nodes fail.

Nodes may be located at one data center, or distributed across multiple data centers to protect from site failure. Configurations always allow for multiple node failures to ensure data durability.

The second option, replication, writes identical copies on multiple nodes to ensure availability. Both erasure coding and replication can operate across sites for DR purposes.

Figure: Data Replication. Writes identical copies on multiple nodes, and can operate across multiple sites (remote site, public cloud).

Figure: Erasure Coding. Writes to multiple nodes within the cluster; also writes to nodes outside the cluster to allow for multiple node failures. Data is striped across all nodes (Site 1, Site 2, Site 3).

The S3 API
The Amazon S3 Application Programming Interface, better known as the S3 API, is the most common way in which data is stored, managed, and retrieved by object stores. Originally created for the Amazon Simple Storage Service (S3), the widely adopted S3 API is now the de facto standard for object storage.

Not All S3 APIs Are Equal
Compared with established file protocols such as NFS, the S3 API is relatively new and rapidly evolving. Among object storage vendors, S3 API compliance varies from below 50% to over 90%. This difference becomes material when an application, or an updated version of that app, fails due to S3 API incompatibility.

Cloudian is the only object storage solution to exclusively support the S3 API. Launched in 2011, Cloudian draws on many years of S3 API development, which translates to the industry’s highest level of compliance.

Employing the S3 API makes an object storage solution flexible and powerful for three reasons:

Standardization
With Cloudian, any object written using the S3 API can be used by other S3-enabled applications and object storage solutions; the existing code works out of the box.

Maturity
The S3 API provides a wide variety of features that meet virtually every need for an object store.

Simplicity
End users planning to deploy object stores can access the plentiful resources of the S3 community, both individuals and companies.
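The idea behind erasure coding can be illustrated with the simplest possible scheme: split the data into k slices and add one XOR parity slice, so that any single lost slice can be rebuilt from the rest. This is a deliberately simplified sketch. Production systems generally use codes that tolerate several simultaneous failures, and nothing here is taken from a specific product; the function names are illustrative.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list:
    """Split data into k equal slices (zero-padded) plus one XOR parity slice."""
    slice_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(slice_len * k, b"\0")
    slices = [padded[i * slice_len:(i + 1) * slice_len] for i in range(k)]
    return slices + [reduce(xor_bytes, slices)]


def recover(slices: list, original_len: int) -> bytes:
    """Rebuild the data when at most one slice (marked None) is missing."""
    missing = [i for i, s in enumerate(slices) if s is None]
    if len(missing) > 1:
        raise ValueError("a single-parity scheme tolerates only one lost slice")
    slices = list(slices)
    if missing:
        # XOR of all surviving slices reproduces the missing one.
        slices[missing[0]] = reduce(xor_bytes, [s for s in slices if s is not None])
    return b"".join(slices[:-1])[:original_len]


payload = b"object storage slices"
nodes = encode(payload, k=4)  # imagine each slice written to a different node
nodes[2] = None               # simulate one failed node
restored = recover(nodes, len(payload))
print(restored == payload)
```

With four data slices and one parity slice, the cluster stores 1.25 times the original data, whereas keeping two or three full replicas stores 2 or 3 times as much. That trade-off is one reason a system might offer both options.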

The Difference Maker: Metadata + Search
User-defined rich metadata tags are another
key attribute of object storage. Each object
includes one or more tags with system
metadata (such as creation date), and user
metadata that can be used to describe
the object’s contents. To find information,
Google-like tools let you search the
metadata for specific attributes.
User metadata can include information about
the content, such as:
• The location where an asset was created
• The project the object was created for
• The specific subject of the data

Search tools like Elasticsearch and Kibana let you both search metadata and also create graphical views of data to categorize information and more easily spot trends.
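As a rough illustration of user metadata in practice, the helpers below attach tags when writing an object through an S3-style client and read them back without fetching the body. The `put_object` and `head_object` calls and their `Bucket`, `Key`, `Body`, and `Metadata` parameters follow the boto3 S3 client; the helper names, bucket, keys, and endpoint URL are made up for the example. The `find_keys` function is a naive client-side filter, shown only as a stand-in for the indexed search that tools like Elasticsearch provide.

```python
def make_s3_client(endpoint_url: str):
    """Build a client for an S3-compatible endpoint (requires the boto3 package)."""
    import boto3  # imported lazily so the helpers below accept any compatible client
    return boto3.client("s3", endpoint_url=endpoint_url)


def put_with_metadata(client, bucket: str, key: str, body: bytes, metadata: dict) -> None:
    """Store an object together with its user-defined metadata tags."""
    client.put_object(Bucket=bucket, Key=key, Body=body, Metadata=metadata)


def read_metadata(client, bucket: str, key: str) -> dict:
    """Fetch only the user metadata, without downloading the object body."""
    return client.head_object(Bucket=bucket, Key=key)["Metadata"]


def find_keys(client, bucket: str, keys: list, **wanted: str) -> list:
    """Naive search: keep the keys whose metadata matches every wanted tag."""
    matches = []
    for key in keys:
        tags = read_metadata(client, bucket, key)
        if all(tags.get(name) == value for name, value in wanted.items()):
            matches.append(key)
    return matches
```

A call might look like `put_with_metadata(client, "media", "clip-001", data, {"location": "boston", "project": "nova"})`, where `client` comes from `make_s3_client("https://s3.example.internal")`, a placeholder address.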

4: Object Storage and the Cloud
Large-scale object storage adoption first occurred in the cloud, and today all major public clouds are built on the technology. Nearly all the web services you use every day, including Facebook, Netflix, and Google, rely on it. Now, the same technology is available for use in your data center: the same APIs and the same limitless scalability.

Cloud integration is built into object storage, so public and private clouds can be merged into a single storage pool where the public cloud acts as another storage tier, giving you new options for data management. Here are several ways that on-prem object storage and the public cloud work together to solve problems.

S3-Compatible Storage Services
With object storage, it’s easy to launch S3-compatible storage services. Many managed service providers (MSPs) and enterprises today offer S3-compatible services in either public or private cloud service models. Cloudian incorporates features that simplify this with multi-tenancy, quality of service controls, billing, and access controls.

Figure: S3-Compatible Storage Services. A data center serving applications and backup data.

Hybrid Cloud: Ideal for DR
A hybrid cloud lets you manage both public and private storage pools as one, with data management tools that manage policy-based tiering or replication among environments. Migrate or replicate data based on file type, frequency of access, file size, or other parameters of your choice.

Hybrid cloud use cases vary. DR is easy to manage and cost-effective with policy-based replication to the cloud. For capacity expansion, data tiering can effectively provide unlimited capacity. Policies maintain the most commonly used data on-prem, while infrequently used data is moved to the cloud.

Two other use cases, data analysis in the cloud and content distribution, let you capitalize on the compute capability and geographic reach of the cloud, while also letting you access data locally for performance.

Figure: Hybrid Cloud: Ideal for DR. Policy-based replication from the data center (applications, recent backup data) to the public cloud, under a single management view.

Multi-Cloud: Multiple Cloud Vendors in One Storage Pool
Multi-cloud lets you merge clouds from multiple vendors in a single management environment. Combine storage from Amazon, Google, and Microsoft, plus private cloud storage, to a single pool with one set of management APIs.

With multi-cloud, your organization can use different clouds for different reasons, letting you take advantage of various features or data center locations. It also allows you to use cloud resources selectively for cloudbursting or other events that call for a temporary use of more storage resources than usual.

Figure: Multi-Cloud: Multiple Vendors in One Storage Pool. Policy-based migration from the data center (applications, recent backup data) to multiple clouds, under a single management view.
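Policy-based tiering and replication, as described in the hybrid cloud section, boils down to a rule evaluated per object. The sketch below shows one way such a rule could look, using the parameters the text mentions (file type, frequency of access, file size). Every name and threshold here is hypothetical and chosen for illustration; it does not describe any vendor's policy engine.

```python
from dataclasses import dataclass


@dataclass
class ObjectStats:
    key: str
    file_type: str               # e.g. "mxf", "jpg"
    size_bytes: int
    days_since_last_access: int


@dataclass
class TieringPolicy:
    # Hypothetical defaults, for illustration only.
    cold_after_days: int = 90
    min_size_bytes: int = 1_000_000
    always_replicate_types: tuple = ("mxf",)


def decide(obj: ObjectStats, policy: TieringPolicy) -> str:
    """Return the action a policy would take for one object."""
    if obj.file_type in policy.always_replicate_types:
        return "replicate-to-cloud"   # keep the on-prem copy, add a DR copy
    is_cold = obj.days_since_last_access >= policy.cold_after_days
    is_large = obj.size_bytes >= policy.min_size_bytes
    if is_cold and is_large:
        return "tier-to-cloud"        # move infrequently used data off-prem
    return "keep-on-prem"             # commonly used data stays local


policy = TieringPolicy()
for stats in (
    ObjectStats("master.mxf", "mxf", 40_000_000_000, 2),
    ObjectStats("scan-2015.tif", "tif", 80_000_000, 400),
    ObjectStats("thumb.jpg", "jpg", 20_000, 400),
):
    print(stats.key, "->", decide(stats, policy))
```

Keeping the decision in a pure function like `decide` makes a policy easy to test against sample objects before it is allowed to move any data.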

5: The Financial Benefits of Object Storage
Cost-effective acquisition and operation is designed into object storage. Built specifically for large-scale data management, object storage delivers the lowest CAPEX and OPEX of any enterprise storage system. And unlike traditional enterprise storage where acquisition costs (measured in cost per terabyte) tend to increase with scale, object storage systems become more efficient and less costly with scale.

CAPEX Savings
Object storage CAPEX benefits often reach 70% savings vs. traditional tier 1 storage. One reason is that it functions on industry-standard hardware. This attribute eliminates the need for proprietary platforms, which keeps both acquisition and maintenance costs low. As your system grows, the open systems model ensures that your costs always remain in line with the industry’s best pricing.

Figure: cost in cents per GB per month, including data protection. Enterprise storage: 11¢. Cloud storage: 0.4 - 2¢, plus access charges. Object storage: 0.5 - 1¢.

Deployment Flexibility
With the choice to deploy as a VM, as bare-metal on the server of your choice, or as vendor-supplied hardware appliances, object storage can be installed in whatever manner makes the most sense for your organization.

Modular Scaling
An object storage cluster expands in a modular fashion, with nodes being added to the cluster as needed. No longer do you need to plan capacity increases months in advance, or pay for storage you’re not using.

OPEX Savings
Traditional storage becomes complex to manage as the number of systems and associated middleware tools grows. Object storage saves by consolidating data to a single system and leveraging built-in management tools such as automated DR between sites. Support costs are reduced as well.

Because object storage runs on standard hardware, it eliminates your need to deploy and maintain proprietary storage servers. In fact, the Cloudian object storage solution can reduce your enterprise storage costs with up to 95% less management overhead, 30% less power/space/cooling, and a highly robust design that ensures maximum productivity with up to 14 nines of data durability.
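The per-gigabyte rates in this chapter's cost figure are easier to compare once scaled to a realistic capacity. The sketch below does that arithmetic for one petabyte. It assumes decimal units (1 PB = 1,000,000 GB), takes the rates as read from the figure, and ignores cloud access charges, which the figure lists separately without a number.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units

# (low, high) cents per GB per month, as read from the chapter's cost figure
RATES_CENTS = {
    "enterprise storage": (11.0, 11.0),
    "cloud storage": (0.4, 2.0),   # access charges not included
    "object storage": (0.5, 1.0),
}


def monthly_cost_usd(capacity_pb: float, cents_per_gb: float) -> float:
    """Return the monthly cost in US dollars for a given capacity and rate."""
    return capacity_pb * GB_PER_PB * cents_per_gb / 100


for name, (low, high) in RATES_CENTS.items():
    low_usd = monthly_cost_usd(1, low)
    high_usd = monthly_cost_usd(1, high)
    print(f"{name}: ${low_usd:,.0f} to ${high_usd:,.0f} per PB per month")
```

At these rates, a petabyte costs $110,000 per month on enterprise storage and between $5,000 and $10,000 per month on object storage. These are running-cost rates, so they are a different measure from the CAPEX savings percentage quoted in the text.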

6: The Technical Benefits of Object Storage
Object storage is not an architectural enhancement, but rather a fundamentally different approach to storage with benefits tailored for large-capacity use cases.

Limitless Scalability
Object storage’s flat file system eliminates the scaling limitations of traditional storage, which in turn allows an object storage system to grow much larger than traditional storage.

Driven by Metadata
Metadata identifies properties of a storage object, and can be customized and updated over time. Google-like search and data visualization tools make information easy to find and analyze.

Cloud Native
Object storage is the storage technology of the cloud. On-prem object storage employs the same language (the S3 API), so the two storage environments can be managed as one for use cases such as DR and on-demand capacity expansion.

Files and Objects
File storage systems face limitations in capacity and cost. Object storage offers limitless scale at lower cost. While not intended for transactional data where operations-per-second is critical, object storage presents an economical option for large pools of less latency-sensitive files.

Meet Cloudian HyperStore
Consolidate your unstructured data, both objects and files, to a single, limitlessly scalable storage pool with Cloudian HyperStore. Available as either standalone software or fully integrated appliances, HyperStore enterprise object storage provides unlimited capacity scalability, intuitive management tools, uncompromising data protection, and the industry’s most compatible S3 API implementation.

Enterprise NAS File Services
Cloudian HyperFile®, a scale-out NAS controller for Cloudian HyperStore, delivers limitlessly scalable enterprise file services, on-prem. Together with Cloudian HyperStore, HyperFile provides a cost-effective solution for your capacity-intensive, less frequently used files. Seamlessly grow from terabytes to petabytes. Independently scale capacity and performance. And get the features of enterprise NAS at one-third the cost.

Figure: Cloudian cluster. Clients connect over S3 and SMB/NFS. HyperFile: scale file throughput with additional nodes. HyperStore: scale capacity without disruption.

7: Object Storage Use Cases
Data Protection
Object storage makes an ideal target for data protection systems from Rubrik, Commvault, Veritas, and Veeam. All are proven compatible with Cloudian and are simple to integrate. DR planning is simple, too, with built-in replication to the cloud or to a remote site.

Media and Entertainment
To accommodate high-resolution formats and rapidly expanding content libraries, object storage provides studios and post-production teams with unlimited capacity and modular growth. Rich metadata dramatically increases the ability to find media assets. Most asset management software supports the S3 API, making object storage a simple plug-in alternative to tape-based archives.

Storage-as-a-Service
For service providers and organizations looking to offer S3-compatible storage services, object storage is the ideal solution. Cloudian provides features such as quality of service controls, multitenancy, and billing to make services easy to manage.

File Services
Files and objects are easily combined in a single pool, delivering scale and cost savings benefits vs. traditional file platforms. Faster than cloud and more cost-effective than enterprise NAS, object storage is the advanced file services option.

Healthcare
Object storage integrates seamlessly with PACS systems and vendor neutral archives (VNA) to provide healthcare professionals with a unified view of all patient data. Object storage is about one-third the cost of the proprietary storage systems commonly employed for healthcare storage.

Video Surveillance
With 4K cameras, higher frame rates, and longer retention periods, security personnel need exponentially more storage than in the past. Object storage is proven with video management software and provides a scalable, affordable solution for video surveillance.

Artificial Intelligence
For rapidly expanding learning databases, you need a storage system that can start small, and grow limitlessly and affordably. Object storage provides the ability to scale on-site and to use the cloud to flex with your needs.

8: Case Study: WGBH
WGBH, the PBS television station based in Boston, is well known for its slate of award-winning programming such as “Nova,” “Frontline,” “Masterpiece,” and “Antiques Roadshow.”

Fifty years of content creation has resulted in an enormous archive, most of which was stored in a tape library or on external hard drives located within a large vault at the WGBH studios. Meanwhile, an ongoing transition to 4K and 8K media created ballooning capacity demand, while the growing rate at which media was created added to the sheer volume of material to be archived. Media retrieval time and data protection were challenges as well.

In the search for a new solution, the WGBH team decided that a hybrid cloud approach would best meet its needs. A hybrid cloud environment combines on-prem and public cloud storage: on-prem for the working copy, public cloud for a DR copy. This solution combines the rapid data access of an on-prem system with the simplicity of cloud-based DR.

Eventually, WGBH’s research led them to object storage, specifically Cloudian’s HyperStore.

HyperStore provides WGBH with fast access to data, limitless capacity, modular and easily managed growth, high density, metadata tagging to facilitate search, and low cost. The initial deployment consisted of a 3PB cluster, housed in three 4U-high appliances. Occupying just 21 inches of rack height, this cluster consumes less than 1/10 the space of the equivalent tapes and library facilities.

“With Cloudian, DR became automatic. We store data to the archive and it’s automatically replicated to the cloud. That’s a lot simpler and more reliable than managing tapes.”
Shane Miner
WGBH Senior Director of Technical Services

9: Conclusion
Hyperscale object storage addresses the capacity storage challenge in ways that traditional NAS and SAN architectures cannot. Object storage is a different kind of storage, not a middleware-enhanced spin on an older technology. As the only storage type to come of age during the cloud era, it provides unique capabilities that let you reshape your storage strategy to meet the needs of a geographically dispersed, cloud-connected enterprise.

Think Big
Consolidate data, organization-wide, to a single, exabyte-scalable data fabric. Cloudian’s modular design makes it easy to grow. Expand capacity and geographic reach simply by adding nodes anywhere you need capacity. Performance scales, too, thanks to the peer-to-peer, shared-nothing architecture.

Cloud Connected
On-prem storage, private cloud, hybrid cloud, or multi-cloud: It’s your choice. Connect seamlessly with the public cloud, then use our integrated tools to replicate or fluidly migrate information. Whether your goal is DR, capacity expansion, or a data archive in the cloud, we make it effortless to integrate on-prem and cloud storage.

Capacity Where You Need It
Built to be distributed, Cloudian can be deployed across sites or across the globe. Place nodes within your data center, at your DR site, or at remote offices, then control them in a single data fabric. All fabric-connected devices work as one, so you can store, find and protect information wherever it resides.

By bringing the flexibility and simplicity of public cloud storage into your data center, you can simplify management and reduce TCO by 70% versus conventional storage systems.

THINK BIG

• Exabyte Scalable
• Geo Distributed
• Cloud Integration

Cloudian, Inc.
177 Bovet Road, Suite 450, San Mateo, CA 94402
Tel: 1.650.227.2380 | Email: [email protected] | www.cloudian.com

©2018 Cloudian, Inc. Cloudian, the Cloudian logo, HyperFile, HyperScale, and HyperStore are registered trademarks or trademarks of Cloudian, Inc. All other trademarks are property of their respective holders.