RAID Theory: An Overview: Ben Rockwood, Cuddletech
Table of Contents
RAID: What it is; What it does
RAID: The Details
RAID Type: Concatenation
RAID Type: Striping (RAID-0)
RAID Type: Mirroring (RAID-1)
RAID Type: Striping plus Mirroring (RAID-0+1)
RAID Type: RAID-5 (Striping with Parity)
RAID Comparison: RAID0+1 vs RAID5
RAID History: The Lost Brothers
Conclusion: Closing Notes
Just to try and clear things up a bit more, let's see why we don't simply need RAID,
but actually WANT it. Let's say we're building a production NFS server that will be used
to store all of our software. We'll need this system to be extremely stable, because if it
goes down no one can get or submit code. With RAID we could build a single virtual disk
(volume) that would meet our need for 200G of disk. But we also want to make sure that
if disks die we don't go down. So we use a mirror (another set of disks identical to
the first set of disks). If a disk dies we're okay, because the mirror will take over; we
essentially have 2 identical sets of the same data which are constantly kept up to date.
See? Using these 2 simple RAID concepts we've achieved both availability (that's our
mirror saving us from disk crashes) and increased capacity (we've got a whole bunch of
disks working together, which is cheaper than buying a single 200G disk... if you can
find one!).
Okay, enough of the bad examples. Let's look at the different forms of RAID in use today.
Now, do you see the problem with this type of RAID? Because we're writing data linearly
across the disks, if we only have 7G of data on our RAID we're only using the first
disk! The 2 other disks are just sitting there bored and useless. This sucks. We got the
big disk we wanted, but in terms of performance it's no better than a normal disk drive
you can buy off the shelf. There has got to be a better way...
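To make the linear layout concrete, here is a minimal sketch in Python. The function
name concat_locate and the 3 x 9G disk layout are assumptions made up for illustration,
not any volume manager's real API.

```python
GIB = 1024 ** 3

def concat_locate(offset, disk_sizes):
    """Map a byte offset in a concatenated volume to (disk index, offset on that disk)."""
    for index, size in enumerate(disk_sizes):
        if offset < size:
            return index, offset
        offset -= size          # skip past this disk and try the next one
    raise ValueError("offset is beyond the end of the volume")

# Hypothetical Simple RAID built from three 9G disks.
disks = [9 * GIB] * 3
first_7g_ends_on = concat_locate(7 * GIB - 1, disks)[0]
```

With only 7G written, every byte lands on disk index 0; the second disk is not touched
until the offset reaches 9G.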
need a far kooler term for each disk, a term that allows us to visualize our new RAID
better... "column" sounds kool! Alright, so each disk is a "column" and the amount of data
we put on each "column" before moving to the next is our "stripe width".
Let's solidify this. If we're building a RAID-0 with 4 columns, and a stripe width of 128k,
what do I have? It might look something like this:
Look good? So, when we start writing to our new RAID, we'll write the first 128k to the
first column, then the next 128k to the second column, then the next 128k to the third
column, then the next 128k to the fourth column, THEN the next 128k to the first
column, and keep going till all the data is written. See? If we were writing a 1M file
(eight 128k chunks) we'd wrap that one file around all 4 disks twice! Can you see now
where our speed up comes from? SCSI drives can write data at about (depending on what
type of drive and what type of SCSI) 20M/s. On our Striped RAID, with all 4 columns
writing at once, we could be writing at up to 80M/s! Kool huh!?
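The round-robin layout can be sketched in a few lines of Python. The function names are
hypothetical, and the defaults (4 columns, 128k stripe width) are simply the numbers
from the example above; this is an illustration, not how any particular volume manager
does it.

```python
KIB = 1024

def stripe_locate(offset, columns=4, stripe_width=128 * KIB):
    """Map a byte offset in a striped volume to (column, offset within that column)."""
    chunk = offset // stripe_width      # which stripe-width chunk, counting from 0
    column = chunk % columns            # chunks go round-robin across the columns
    row = chunk // columns              # how many times we have wrapped around
    return column, row * stripe_width + offset % stripe_width

def columns_for_write(length, columns=4, stripe_width=128 * KIB):
    """List the column each chunk lands on for a write starting at offset 0."""
    chunks = -(-length // stripe_width)  # ceiling division
    return [i % columns for i in range(chunks)]

one_meg = columns_for_write(1024 * KIB)
```

For a 1M write, one_meg comes out as [0, 1, 2, 3, 0, 1, 2, 3]: eight chunks spread
evenly over the four columns, which is why all four disks can be kept busy at once.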
But, now we've got ANOTHER problem. In a Simple RAID if we had, say, 3 9G disks,
we'd have 27G of space. Now, if I only wrote 9G of data to that RAID and the third disk
died, so what, there is no data on it. (See where I'm going with this?) We'd only be using
one of our three disks in a Simple RAID. BUT, in a Striped RAID, we could write only 10M
of data to the RAID, but if even ONE disk failed, the whole thing would be trash because
we wrote it on ALL of the disks. So, how do we solve this one?
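A small sketch makes the difference in exposure easy to see. The helper below is
hypothetical and assumes writes start at the beginning of the volume; it only reports
which disks end up holding some of the data.

```python
KIB, MIB, GIB = 1024, 1024 ** 2, 1024 ** 3

def disks_holding_data(written, disks, disk_size, stripe_width=None):
    """Return the set of disk indexes that hold some of the first `written` bytes."""
    if written <= 0:
        return set()
    if stripe_width is None:
        # Simple RAID: disks are filled one after another.
        last = min((written - 1) // disk_size, disks - 1)
        return set(range(last + 1))
    # Striped RAID: chunks are dealt round-robin, so a few chunks touch every disk.
    chunks = -(-written // stripe_width)
    return set(range(min(chunks, disks)))

simple = disks_holding_data(9 * GIB, disks=3, disk_size=9 * GIB)
striped = disks_holding_data(10 * MIB, disks=3, disk_size=9 * GIB,
                             stripe_width=128 * KIB)
```

Here simple is {0}, so losing the third disk costs nothing, while striped is {0, 1, 2},
so losing any one disk damages the data.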
There's not much to it. However, there is a new problem! This is expensive... really
expensive. Let's say you wanted a 27G RAID. So you bought 3 9G drives. In order to
mirror it you'll need to buy 3 more 9G drives. If you ever get depressed you'll start
thinking: "You know, I just shelled out $400 for 3 more drives, and I don't even get more
usable space!". Well, in this industry we all get depressed a lot so, they thought of
another kool idea for a RAID...
A mirror is nothing more than another RAID identical to the RAID we're trying to protect.
So when we build a mirror we'll need the mirror to be the same type of RAID as the
original RAID. If the RAID we want to mirror is a Simple RAID, our mirror then will be a
Simple RAID. If we want to mirror a Striped RAID, then we'll want another Striped
RAID to mirror the first. Right? So, if you say to me, we're building a RAID-0+1, I know
that we're going to mirror a Striped RAID, and the mirror itself is going to be striped as
well.
You'll see this term used more often than "RAID-1" simply because a mirror, in and of
itself, isn't useful. Again, it's not really a "RAID" in the sense that we mean to use the
word.
Okay, let's break it down a bit. Let's say we build a RAID-5 out of 4 9G drives. So we'll
have 4 columns, and let's say our stripe width is 128k again. The first three 128k chunks
of data are written to disks one, two AND three, one chunk per disk. At the same time a
little magic number is computed from those three chunks and written to disk four. That
magic number is called the parity. Then, the next three chunks of data are written to
(watch carefully) disks two, three and four, and their parity goes on disk one. The three
chunks after that are written to disks three, four and one, with the parity on disk two.
(See, we wrapped around). And data keeps being written like that. Since one column's
worth of every stripe is parity, our 4 9G drives give us 27G of usable space.
Here's the beauty of it. Every stripe carries enough information to survive the loss of
any one disk in the RAID! Let's look back at our 4 disk RAID. We're working normally,
writing along, and then SNAP! Disk 3 fails! Are we worried? Not particularly. Because
every stripe has parity written alongside its data, the RAID is smart enough to work out
what was on the dead disk from the other 3 disks! Then, once we replace the bad disk with
a new one, the RAID "floods" all the data back onto the disk, rebuilding it from what is
on the other 3 disks! But, you ask, how does the RAID know what the missing data was?
Because of our parity. The parity is just the XOR of the data chunks in the stripe. We
(actually the computer does this automatically) just take the chunks on disks 1, 2 and
4 for each stripe and XOR them together, and out pops the chunk that was on disk 3.
Kool huh?
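Here is a minimal sketch of how XOR parity lets a lost chunk be rebuilt, assuming the
usual scheme of one parity chunk per stripe. The helper name xor_blocks and the tiny
4-byte chunks are made up for illustration; real chunks would be a full stripe width.

```python
def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

# One stripe on a 4-disk RAID-5: three data chunks plus one parity chunk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the disk holding the second chunk died, then rebuild that chunk
# from the two surviving data chunks and the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Because XOR-ing a value with itself cancels out, combining the survivors with the parity
leaves exactly the missing chunk, so rebuilt equals b"BBBB". The same trick works no
matter which single chunk of the stripe is lost.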
Now, as you might expect, this isn't perfect either. Why? Okay, number 1, remember that
parity that saves our butt and makes sure our data is good? Well, as you might expect the
system's CPU has to calculate that, which isn't hard but we're still wasting CPU cycles
for the RAID, which means if the system is really loaded we may need to (eek!) wait. This
is the "performance hit" you'll hear people talk about. Also, every write has to update
the parity as well as the data, which means we're using up I/O bandwidth and not getting a
Now, in the real world, you rarely have much choice, and the way to go is clear. If you're
given 10 9G disks and are told to create a 60G RAID, and you can't buy more disks,
you'll need to either go RAID5, or be unprotected. However, if you've got those same
disks and they only want a 36G RAID you can go RAID0+1, with the only drawback that
they won't have much room to grow. It's all up to you as an admin, but always take
growth into account. Look at what you've got, downtime availability to grow when
needed, budget, performance needs, etc, etc, etc. Welcome to the world of capacity
planning!
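The arithmetic behind that choice can be sketched as a small Python function. The
function name and level labels are hypothetical, and the numbers ignore hot spares and
any space a real volume manager reserves for itself.

```python
def usable_capacity(disks, disk_size, level):
    """Usable size for a few RAID layouts, in the same units as disk_size."""
    if level == "raid0":
        return disks * disk_size            # no protection, every disk holds data
    if level == "raid0+1":
        return (disks // 2) * disk_size     # half the disks hold the mirror
    if level == "raid5":
        return (disks - 1) * disk_size      # one disk's worth goes to parity
    raise ValueError(f"unknown RAID level: {level}")

mirrored = usable_capacity(10, 9, "raid0+1")
parity_protected = usable_capacity(10, 9, "raid5")
```

With 10 9G disks, mirrored works out to 45G and parity_protected to 81G. A 60G volume
therefore only fits with RAID5 (or with no protection at all), while a 36G volume fits
under RAID0+1 with 9G left over for growth.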
A RAID-4 volume is made up of one or more data disks which are striped, and a dedicated
parity disk which maintains the write checksums of the data written on each stripe.
Checksums are just numbers, so they are very small and quick to write. The problem is
that generally when you write data to the volume you write a stripe, then write parity,
then write the stripe, then write parity, so on and so forth, but you have to wait for the
parity to complete writing before writing the next stripe, which is a bottleneck. Add to
that, from the availability side of things, if you lose your parity disk due to failure
you're running with your pants down. Sure you can rebuild the parity disk after
hot-swapping or replacing the disk, but that requires re-computing parity from each
stripe, which is an obviously time consuming proposition. NetApp however worked around
these problems by using the Filer's onboard memory for caching writes, and additionally
an NVRAM as a data backup in case of a power failure. It gets the writes ready in the
cache, then when it's ready it writes out the data stripes first, and then the parity,
because it has already calculated parity and the write pattern in memory. In this way we
take away the bottleneck of the parity disk. This is possible due to the intelligence of
WAFL, the NetApp ONTAP file system.
NetApp has added a new feature in ONTAP 6.5, actually, called RAID-DP, Double Parity.
It's the same RAID-4 system but it employs a second parity disk. This way you can lose 2
disks in a volume and keep running.
(It's unfortunate, but NetApp Filers are the world's fastest NFS servers. But one day Sun
is going to kick their ....... never mind.)
This isn't enough information, though. When you choose a volume manager, the
documentation will almost always talk about RAID types in detail. This course you've just
read is simply a different style of explanation which I've found is better than most others.
Here are some links to other places that explain RAID concepts:
• The RAID Advisory Board (RAB) is a good source of information. For all I can tell the
organization is dead (their phone number is wrong, they haven't done anything in
years) but they still have some good stuff. You can find their page here:
http://www.raid-advisory.com/.
Most notably, they published an excellent book called The RAID Book: A Storage
System Technology Handbook which should be on every storage admin's shelf. Be
aware that the last edition of this book is the 6th Edition; someone published a book
and called it the 7th Edition RAID Book, which was not by the RAB. This book is
hard to find, but generally Amazon can track down a copy.
terson, et al. This is THE paper on RAID. The one that started it all. This paper
should be read, if for no other reason, as a historical document.