Demand-aware Erasure Coding for Distributed Storage Systems

ABSTRACT:

Distributed storage systems provide cloud storage services by storing data on
commodity storage servers. Conventionally, data are protected against failures of
such commodity servers by replication. Erasure coding consumes less storage
overhead than replication to tolerate the same number of failures and thus has
been replacing replication in many distributed storage systems.

However, with erasure coding, the overhead of reconstructing data after failures
also increases significantly. Under ever-changing workloads where data
accesses can be highly skewed, it is challenging to deploy erasure coding with
appropriate parameter values to achieve a good trade-off between storage
overhead and reconstruction overhead.
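
As a rough illustration of this trade-off, the sketch below compares 3-way replication with Reed-Solomon codes RS(k, r). It assumes the standard definitions: k data blocks plus r parity blocks tolerate any r failures, the storage overhead is (k + r) / k, and rebuilding one lost block reads k surviving blocks. The function names and the chosen parameters are illustrative and are not taken from the paper.

```python
def replication(copies):
    """Return (storage overhead, failures tolerated, blocks read to rebuild one block)."""
    return float(copies), copies - 1, 1


def reed_solomon(k, r):
    """Same triple for an RS code with k data blocks and r parity blocks."""
    return (k + r) / k, r, k


# Illustrative parameter choices (assumptions, not values from the paper).
schemes = {
    "3-way replication": replication(3),
    "RS(4, 2)": reed_solomon(4, 2),
    "RS(6, 2)": reed_solomon(6, 2),
    "RS(12, 2)": reed_solomon(12, 2),
}

for name, (overhead, tolerated, reads) in schemes.items():
    print(f"{name:18s} overhead={overhead:.2f}x "
          f"tolerates={tolerated} rebuild_reads={reads}")
```

With the number of tolerated failures fixed at two, a larger k lowers the storage overhead but raises the number of blocks read per reconstruction, which is the tension the abstract describes.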

In this paper, we propose Zebra, a framework that encodes data according to their
demand into multiple tiers that deploy erasure codes with different parameter values.
Zebra automatically determines the number of such tiers and dynamically assigns
erasure codes with optimal parameter values to the corresponding tiers.

With Zebra, a flexible trade-off between storage overhead and reconstruction
overhead is achieved with multiple tiers. When demand changes, Zebra adjusts
itself with a marginal amount of network transfer. We demonstrate that Zebra
can work with two representative families of erasure codes in distributed storage
systems, Reed-Solomon codes and local reconstruction codes.
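
To make the difference between the two code families concrete, the sketch below compares an RS code with a local reconstruction code (LRC). It assumes the usual LRC layout: k data blocks split evenly into l local groups, one local parity block per group, and g global parity blocks. Under that assumption, a single lost data block is rebuilt from the k / l surviving blocks of its group (the remaining data blocks plus the local parity). The function names and parameters are examples only.

```python
def rs_cost(k, r):
    """Storage overhead and single-block rebuild reads for RS(k, r)."""
    return {"overhead": (k + r) / k, "rebuild_reads": k}


def lrc_cost(k, l, g):
    """Same metrics for an LRC with l local groups and g global parities."""
    if k % l != 0:
        raise ValueError("k must be divisible by the number of local groups")
    return {"overhead": (k + l + g) / k, "rebuild_reads": k // l}


# Both codes below store 16 blocks for every 12 data blocks.
print("RS(12, 4):    ", rs_cost(12, 4))
print("LRC(12, 2, 2):", lrc_cost(12, 2, 2))
```

At the same storage overhead, the LRC in this example halves the number of blocks read to rebuild one data block, at the price of tolerating fewer arbitrary failure patterns than the RS code.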

EXISTING SYSTEM:

Unlike the above works, where the reconstruction overhead is evaluated
in terms of network traffic or disk I/O, other works create a tree-structured
topology that routes the traffic through the edges of the tree and alleviates the
bottleneck of sending data from existing servers to the replacement server.
The purpose of such works is instead to reduce the time of reconstruction. Such
works can be applied to our framework without affecting the network overhead
during reconstruction, and thus we focus only on network overhead in this paper.
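
The sketch below illustrates why a tree-structured topology can shorten reconstruction without changing the total network traffic. It assumes that each helper server combines the partial results it receives with its own block and forwards a single block-sized message to its parent, which is possible for linear codes such as RS codes. The server names and the two topologies are hypothetical.

```python
from collections import Counter


def transfer_stats(parent):
    """parent maps each helper server to the node it sends its message to.

    Returns (total block-sized messages, largest number of messages
    received by any single node).
    """
    inbound = Counter(parent.values())
    total_blocks = len(parent)          # one block-sized message per edge
    bottleneck = max(inbound.values())  # busiest receiving node
    return total_blocks, bottleneck


k = 6
# Star: every helper sends directly to the replacement server "new".
star = {f"s{i}": "new" for i in range(1, k + 1)}
# Tree: helpers aggregate partial results on the way to "new".
tree = {"s1": "new", "s2": "new",
        "s3": "s1", "s4": "s1",
        "s5": "s2", "s6": "s2"}

print("star:", transfer_stats(star))  # (6, 6)
print("tree:", transfer_stats(tree))  # (6, 2)
```

Both topologies move six blocks in total, but the tree reduces the load on the replacement server's link from six blocks to two.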

PROPOSED SYSTEM:

This scheme is similar to a previously proposed method that implements two tiers
with two other preconfigured erasure codes.

In our simulation, both tiers deploy RS codes or local reconstruction
codes for the purpose of fair comparison. For convenience, we call the two tiers
the hot tier and the cold tier: the hot tier stores hot data with low reconstruction
overhead, while the cold tier provides low storage overhead for the cold data.

In this scheme, under the constraint of the overall storage overhead, we try to
assign as much hot data as possible to the hot tier and store the rest in the cold
tier.
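
A minimal sketch of this two-tier assignment is given below. It assumes a simple greedy rule: start with everything in the cold tier, then promote blocks in decreasing order of demand while the total stored bytes stay within the overall budget. The function name, the block tuples, and the example codes (RS(4, 2) for the hot tier, RS(12, 2) for the cold tier) are assumptions made for illustration, not details taken from the paper.

```python
def assign_tiers(blocks, hot_overhead, cold_overhead, budget):
    """Greedy hot/cold split under an overall storage-overhead budget.

    blocks: list of (name, demand, size) tuples.
    budget: maximum allowed ratio of stored bytes to original bytes.
    Returns (names in the hot tier, names in the cold tier).
    """
    total_size = sum(size for _, _, size in blocks)
    limit = total_size * budget
    stored = total_size * cold_overhead  # everything starts in the cold tier
    hot, cold = [], []
    # Consider the most demanded blocks first.
    for name, demand, size in sorted(blocks, key=lambda b: b[1], reverse=True):
        extra = size * (hot_overhead - cold_overhead)
        if stored + extra <= limit:
            stored += extra
            hot.append(name)
        else:
            cold.append(name)
    return hot, cold


# Hypothetical workload: (name, demand, size), all blocks the same size.
blocks = [("d", 5, 1), ("a", 90, 1), ("f", 1, 1),
          ("b", 50, 1), ("e", 3, 1), ("c", 20, 1)]
hot, cold = assign_tiers(blocks,
                         hot_overhead=6 / 4,     # RS(4, 2)
                         cold_overhead=14 / 12,  # RS(12, 2)
                         budget=1.3)
print("hot tier: ", hot)
print("cold tier:", cold)
```

In this example the budget leaves room to promote only the two most demanded blocks, so the remaining four stay in the cold tier.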
