RAID Storage Explained
June 18, 2007
By George Ou
Since I've been covering a lot of storage technology lately, for both the enterprise and the home, I
thought I should explain what RAID storage is. I won't go into every RAID type under the sun; I just
want to cover the basic types of RAID and explain the benefits and tradeoffs.
RAID was originally defined as Redundant Array of Inexpensive Disks. But RAID setups were traditionally very
expensive, so the "I" came to stand for Independent. Costs have recently come down significantly because
of commoditization, and RAID features are now embedded in most higher-end motherboards. RAID storage
was primarily designed to improve fault tolerance, offer better performance, and allow easier storage
management: it presents multiple hard drives as a single storage volume.
Before we start talking about the different RAID types, I'm going to define some basic concepts.
Mirroring
Data mirroring stores the same data on two hard drives, which provides redundancy and read speed. It's
redundant because if a single drive fails, the other drive still has the data. It's great on read I/O performance and
read throughput because the two drives can independently process two read requests at the same time. With a
well-implemented RAID controller that uses mirroring, the read IOPS and read throughput (for two tasks) can be
twice that of a single drive. Write IOPS and write throughput aren't any faster than a single hard drive because
every write must go to both hard drives, so writes can't be processed independently. The downside to
mirroring is that your usable capacity is only half the total capacity of all your hard drives, so it's expensive.
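
To make the read and write behavior concrete, here is a minimal Python sketch of a two-drive mirror. The MirroredPair class, its methods, and the dictionary-backed "drives" are all made up for illustration; a real controller works on block devices and schedules reads far more cleverly.

```python
class MirroredPair:
    """Toy two-drive mirror; each 'drive' is a dict of block number -> bytes."""

    def __init__(self):
        self.drives = [{}, {}]
        self.failed = [False, False]
        self._next_reader = 0  # round-robin pointer used to spread reads

    def write(self, block, data):
        # Every write must land on each healthy drive, so writes gain no speed.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[block] = data

    def read(self, block):
        # Either healthy drive can serve a read, so alternate between them.
        for _ in range(2):
            i = self._next_reader
            self._next_reader = (i + 1) % 2
            if not self.failed[i]:
                return self.drives[i][block]
        raise IOError("both drives have failed")

    def usable_capacity(self, drive_size):
        # Two drives of drive_size each yield only drive_size of usable space.
        return drive_size
```

Setting failed[0] to True after a write and then calling read() still returns the data from the surviving drive, which is the redundancy described above.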
Copyright ©2007 CNET Networks, Inc. All rights reserved.
For more downloads and a free TechRepublic membership, please visit http://techrepublic.com.com/2001-6240-0.html
Striping
Data striping distributes data across multiple hard drives. Striping scales very well on read and write throughput
for single tasks, but it has less read throughput than data mirroring when processing multiple tasks. A good RAID
controller can produce single-task read/write throughput equal to the combined throughput of all the individual drives.
Striping also produces better read and write IOPS, though it's not as effective on read IOPS as data mirroring.
You also get a large consolidated drive volume equal to the total capacity of all the drives in the RAID array.
Striping is rarely used by itself because it provides zero fault tolerance: a single drive failure takes out not only
the data on that drive but the entire RAID array. Striping is often used in conjunction with data mirroring or
with parity.
RAID Level 0
RAID Level 0 is the cluster-level implementation of data striping, and it is the only RAID type that provides no
fault tolerance. Clusters can vary in size and are user-definable, but they are typically blocks of 64
thousand bytes. The clusters are evenly distributed across multiple hard drives. It's used by people who can
accept losing everything on the array if a single drive fails. This RAID type is sometimes used by video editing
professionals who are using the drive only as a temporary work space. It's also used by some PC enthusiasts
who want maximum throughput and capacity.
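
As a rough illustration of how clusters get distributed, the Python sketch below maps a byte offset in the striped volume to a drive and an offset on that drive. The locate function is hypothetical, and it assumes simple round-robin placement and that "64 thousand bytes" means 64 KB (65,536 bytes); actual controllers may lay data out differently.

```python
CLUSTER_SIZE = 64 * 1024  # assumption: 64 KB clusters (65,536 bytes)


def locate(byte_offset, num_drives, cluster_size=CLUSTER_SIZE):
    """Map an offset in the striped volume to (drive index, offset on that drive)."""
    cluster = byte_offset // cluster_size   # which cluster of the volume
    drive = cluster % num_drives            # clusters rotate across the drives
    row = cluster // num_drives             # how many full stripes come before it
    offset_on_drive = row * cluster_size + byte_offset % cluster_size
    return drive, offset_on_drive
```

With four drives, consecutive clusters land on drives 0, 1, 2, 3, 0, and so on, which is why one large sequential transfer keeps all the drives busy at once.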
RAID Level 1
RAID Level 1 is the pure implementation of data mirroring. In a nutshell, RAID Level 1 gives you fault tolerance
and excellent read throughput and I/O performance, but it cuts your usable capacity in half. This RAID level is
often used in servers for the system partition for enhanced reliability, but PC enthusiasts can also get a nice
performance boost from RAID Level 1. Using multiple independent RAID Level 1 volumes can offer the best
performance for database storage.
RAID Level 2
RAID Level 2 is a bit-level implementation of data striping with an error-correcting code. The bits are evenly
distributed across multiple hard drives, and instead of a single parity drive, several drives in the RAID are
dedicated to storing a Hamming code. The capacity overhead therefore depends on the code used: for example,
a Hamming(7,4) layout needs three check drives for every four data drives. This RAID level is
almost forgotten and is rarely used.
RAID Level 3
RAID Level 3 is a byte-level implementation of data striping with parity. The bytes are evenly distributed across
multiple hard drives, and one of the drives in the RAID is designated to store parity. Out of an array with "N"
number of drives, the total capacity is equal to the sum of "N-1" hard drives. For example, an array with four
equal-size hard drives will have the combined capacity of three hard drives. This RAID level is not commonly
used and is rarely supported.
RAID Level 4
RAID Level 4 is a cluster-level implementation of data striping with parity. Clusters can vary in size and are user-
definable, but they are typically blocks of 64 thousand bytes. The clusters are evenly distributed across multiple
hard drives, and one of the drives in the RAID is designated to store parity. Out of an array with "N" number of
drives, the total capacity is equal to the sum of "N-1" hard drives. For example, an array with eight equal-size hard
drives will have the combined capacity of seven hard drives. This RAID level is not commonly used and is rarely
supported.
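
The reason a single parity drive can stand in for any one failed drive is that parity in these levels is conventionally an XOR across the data drives. The short Python sketch below shows the idea on three tiny "clusters". The xor_blocks helper and the sample data are made up for illustration, and a real array does this per stripe in hardware or in the driver.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives plus one parity drive (illustrative contents only).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Suppose the second data drive dies: XOR the survivors with the parity
# to recompute the missing cluster.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Here rebuilt equals the lost cluster, which is why the array gives up exactly one drive's worth of capacity and survives exactly one failure.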
RAID Level 5
RAID Level 5 is a cluster-level implementation of data striping with DISTRIBUTED parity for enhanced
performance. Clusters can vary in size and are user-definable, but they are typically blocks of 64 thousand bytes.
The clusters and parity are evenly distributed across multiple hard drives, and this provides better performance
than using a single drive for parity. Out of an array with "N" number of drives, the total capacity is equal to the
sum of "N-1" hard drives. For example, an array with seven equal-size hard drives will have the combined
capacity of six hard drives. This is the most common implementation of data striping with parity.
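
To picture what "distributed" parity means, the Python sketch below prints one possible layout in which the parity cluster moves to a different drive on every stripe. The raid5_layout function and its rotation rule are assumptions chosen for illustration; real controllers use several different rotation schemes.

```python
def raid5_layout(num_drives, num_rows):
    """Return rows of cell labels: 'P' for parity, 'D<n>' for data clusters."""
    rows = []
    cluster = 0
    for row in range(num_rows):
        # Assumed rotation: parity starts on the last drive and moves left.
        parity_drive = (num_drives - 1 - row) % num_drives
        cells = []
        for drive in range(num_drives):
            if drive == parity_drive:
                cells.append("P")
            else:
                cells.append(f"D{cluster}")
                cluster += 1
        rows.append(cells)
    return rows


for cells in raid5_layout(4, 4):
    print(" ".join(f"{c:>3}" for c in cells))
```

Because no single drive holds all the parity, parity updates are spread across the array instead of queuing up on one dedicated drive.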
RAID Level 6
RAID Level 6 is a cluster-level implementation of data striping with DUAL distributed parity for enhanced fault
tolerance. It's similar to RAID Level 5, but it uses the equivalent capacity of two hard drives to store parity. RAID
Level 6 has mostly been used in high-end RAID systems, but it's slowly becoming more common as the technology
becomes more commoditized. Dual parity allows any two hard drives in the array to fail without data loss, which is
unique among the basic RAID types. If a drive fails in a RAID Level 5 array, you'd better hope there is a hot spare
that will restore the array to a healthy state within a few hours and that you don't get a second failure during that
recovery time. RAID Level 6 tolerates that second drive failure during recovery and is considered the ultimate RAID
level for fault tolerance. Out of an array with "N" number of drives, the total capacity is equal to the sum of "N-2"
hard drives. For example, an array with eight equal-size hard drives will have the combined capacity of six hard
drives.
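
The capacity rules for the striped and mirrored levels can be collected into one small Python helper. The usable_capacity function and the PARITY_DRIVES table are made up for illustration; they assume equal-size drives, treat RAID Level 1 as a two-drive mirror, and cover only the levels with a fixed parity overhead.

```python
# Drives' worth of capacity given up to parity, per RAID level.
PARITY_DRIVES = {0: 0, 3: 1, 4: 1, 5: 1, 6: 2}


def usable_capacity(level, num_drives, drive_size):
    """Usable capacity of an array of equal-size drives, in drive_size units."""
    if level == 1:
        # Assumption: a plain two-drive mirror holds one drive's worth of data.
        if num_drives != 2:
            raise ValueError("this sketch models RAID 1 as exactly two drives")
        return drive_size
    if level not in PARITY_DRIVES:
        raise ValueError(f"RAID level {level} is not covered by this sketch")
    overhead = PARITY_DRIVES[level]
    if num_drives <= overhead:
        raise ValueError("not enough drives for this RAID level")
    return (num_drives - overhead) * drive_size
```

For example, usable_capacity(5, 7, 500) gives the capacity of six 500 GB drives, and usable_capacity(6, 8, 500) gives the capacity of six as well, matching the examples above.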