Ted on Tech: RAID is Not Just an Ant Spray


When I first started writing this post, it was going to be on Network Attached Storage (NAS). That’s a topic of which every network user should have at least a basic knowledge. But when I started the post, I realized that NAS really doesn’t make a lot of sense beyond having a huge amount of disk space on the network unless you understand R.A.I.D.

That’s because R.A.I.D., not huge amounts of available storage, is the underlying reason that NAS is something all business (and many home) network administrators should understand and implement.

R.A.I.D. is one of those technologies that gets brushed off with “it increases storage space” or “it backs up your data”. It can do either or both, but what R.A.I.D. really consists of are ways that multiple disk drives can be configured.

First off, R.A.I.D. is an acronym. Ask three people what it means, though, and you’ll probably get three different answers. For the purposes of this and future posts, I’m going to go with “Redundant Array of Individual Disks” and leave out the periods—RAID. The two operative words in the acronym as far as we’re concerned are array and redundant. “Array” is important because you can’t RAID just a single drive, you need at least two.

And “Redundant” is important because it’s the entire point of RAID.

Leveling the Field

When you see the term RAID, it’s usually accompanied by a level, such as RAID 0, RAID 1, RAID 5, or RAID 10. That level number describes the way that the array is configured. Setting up a RAID array is usually done at the drive controller (hardware) level with a RAID controller on the computer’s motherboard, or an external RAID controller plugged into an expansion slot.

Sometimes you’ll see the letter jumble “JBOD”. That stands for “Just a Bunch of Disks” and is not really a RAID array, just two or more drives that are spanned by the controller to look like a single contiguous drive. With JBOD, when the first drive fills up, the controller continues the volume onto the next one, and so on. JBOD is not a very good way to set up multiple drives: if one drive fails, it may be difficult or impossible to recover the data from the bad drive. If you have a file written across two drives and one drive goes bad, that file is likely toast.
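To make the spanning idea concrete, here is a minimal sketch in Python of how a spanned volume might map a logical block to a physical drive. The function name jbod_locate and the drive sizes are made up for illustration; a real controller does this in firmware and the details vary.

```python
def jbod_locate(block, drive_sizes):
    """Map a logical block number to (drive index, block offset on that drive).

    drive_sizes is a list of hypothetical drive capacities, in blocks.
    """
    for drive, size in enumerate(drive_sizes):
        if block < size:
            return drive, block
        # Not on this drive: skip past it and keep looking on the next one.
        block -= size
    raise IndexError("block is beyond the end of the spanned volume")


sizes = [100, 250, 50]  # three mismatched drives, sizes in blocks (illustrative)
print(jbod_locate(99, sizes))   # last block of the first drive
print(jbod_locate(100, sizes))  # first block of the second drive
```

Notice that nothing is written twice anywhere. Each block lives on exactly one drive, which is why a single drive failure takes its data with it.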

RAID Level 0, usually just expressed as RAID 0, is an actual array, and requires two or more physical drives. There can be an even or an odd number of drives in a RAID 0 array. Unfortunately, a RAID 0 configuration does not provide redundancy by itself. With RAID 0, the controller writes a block of data on one drive, the next block of data on a second drive, and if there are drives beyond the two, blocks of data will be written sequentially on those until the controller shifts back to the first drive. This essentially converts the drives into a single drive, since even a modest file ends up spread across several drives. The process of writing blocks of data onto sequential drives is called striping, and the big advantage of this type of RAID array is that one drive can be read while another is being written, which speeds up drive access times, sometimes considerably. RAID 0 suffers from the same problem as JBOD: if a drive fails, chances are you’ll lose significant data.
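Striping is easy to picture as simple round-robin arithmetic. The following is a rough Python sketch, not how any particular controller is implemented; raid0_locate is a made-up name, and real arrays stripe in chunks of many kilobytes rather than single blocks.

```python
def raid0_locate(block, num_drives):
    """Map a logical block to (drive index, row on that drive) with round-robin striping."""
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least two drives")
    # Consecutive blocks go to consecutive drives, then wrap back to the first drive.
    return block % num_drives, block // num_drives


# Six consecutive blocks of one file on a three-drive array.
for block in range(6):
    drive, row = raid0_locate(block, 3)
    print(f"logical block {block} -> drive {drive}, row {row}")
```

Because neighboring blocks sit on different drives, the drives can work in parallel. For the same reason, losing any one drive punches holes in nearly every large file.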

RAID 1 and RAID 5 are the two most frequently used arrays in business computing. RAID 1 requires at least two drives, usually a matched pair, and writes the exact same data, at the exact same time, onto both drives. This is called mirroring, and if one of the paired drives fails, the data on the other drive is safe. Sometimes striping and mirroring are combined in a single array, which requires a minimum of four disks. When striped sets are mirrored, the array is called RAID 0+1. When mirrored pairs are striped, the array is called RAID 1+0, or more commonly, RAID 10.
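Mirroring and striping can be nested in either order, and the order affects which double drive failures an array survives. Here is a small Python sketch that counts the survivable cases. It assumes a four-drive array and a simplified model in which a striped set is lost as soon as any one of its drives fails; the function names and the drive numbering are invented for this example.

```python
from itertools import combinations


def raid10_survives(failed):
    """Mirror first, then stripe: drives 0+1 and 2+3 are mirrored pairs."""
    pairs = [{0, 1}, {2, 3}]
    # The array lives as long as no mirrored pair has lost both of its drives.
    return all(not pair <= failed for pair in pairs)


def raid01_survives(failed):
    """Stripe first, then mirror: drives 0+1 and 2+3 are striped sets."""
    stripes = [{0, 1}, {2, 3}]
    # The array lives as long as at least one striped set is completely intact.
    return any(not (stripe & failed) for stripe in stripes)


def count_survivable(survives, drives=4, failures=2):
    """Return (survivable cases, total cases) for the given number of failed drives."""
    cases = [set(c) for c in combinations(range(drives), failures)]
    return sum(survives(c) for c in cases), len(cases)


print("RAID 10 :", count_survivable(raid10_survives))
print("RAID 0+1:", count_survivable(raid01_survives))
```

Under these assumptions, both layouts survive any single drive failure, but the mirror-then-stripe layout survives more of the possible two-drive failures than the stripe-then-mirror layout does.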

RAID 5 is where we really start to get into redundancy, and requires a minimum of three drives in the array. Data is striped across the drives along with parity information, and that parity information is distributed across all of the drives. With a RAID 5 array, if one of the drives fails, the stored parity information can be used to rebuild the missing data onto a replacement drive. RAID 6 extends this by storing two independent sets of parity information, requires four (or more) drives, and provides the ability to rebuild the array if two of the drives fail simultaneously. RAID 5 and RAID 6 provide redundancy of data, and an effective fall-back as long as it’s the drive that fails and not the controller. We’ll look at the controller issue as well as choosing what level of RAID to implement next time when we discuss Network Attached Storage.
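The parity trick is less mysterious than it sounds. RAID 5 parity is commonly computed with XOR, and a few lines of Python show the idea. This is only a sketch with made-up sample data and a hypothetical helper named xor_blocks; a real array also rotates which drive holds the parity for each stripe, which is left out here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a three-drive array: two data blocks plus one parity block.
data = [b"ACCT", b"DATA"]   # illustrative contents of the two data blocks
parity = xor_blocks(data)   # what gets written to the third drive

# Suppose the drive holding the first block dies.
# XOR the surviving data block with the parity block to get the lost block back.
rebuilt = xor_blocks([data[1], parity])
print(rebuilt == data[0])
```

The same calculation works with more drives: XOR everything that survived and the result is whatever was on the missing drive. It only works for one missing drive per stripe, though, which is why RAID 6 keeps a second, independent set of parity.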

Comments (4)
One other point (from a purist and former Oracle Database Administrator): RAID 0+1 is NOT the same as RAID 10. RAID 10 is mirrored, then striped; RAID 0+1 is striped, then mirrored. Therefore, RAID 10 provides better fault tolerance.
Posted by mjevans | Thursday, May 22 2014 at 5:03PM ET
Hi to both of you.

Scott--I'm not going to argue about the "I" in RAID--I've seen it all kinds of ways, and I hope that the message of RAID as a hedge against drive failure is the one that makes it through.

Your comment about RAID 5 rebuild times is valid and applies to RAID 6 as well. But on mission critical data, I personally am willing to put up with an extended rebuild. All of the NAS systems I have in-house are RAID 6, and my mission-critical data, albeit not a huge amount compared to the 40+TB of storage on my home network, is also backed up elsewhere as well.

dgccpa--I'm not really selling RAID as a way to back up, rather as an approach to fault-tolerant disk storage, regardless of what the disk storage is used for.

I'm afraid I'm going to disappoint you in the NAS blog. I've used a number of the open source NAS operating systems, and have no shortage of ATOM-based Mini-ITX Mobos that I play with and that serve well in this use, but building an NAS from scratch, while not difficult, isn't anything I'd advise unless the builder has a good idea of what to do when the home-brew implodes.

My reason for doing these blogs is to hopefully help readers understand how technologies are used beyond just being "gee-whiz" terms in a product push. I'm a big believer in the "Black box" approach--as long as you have reasonable expectations of what's supposed to go in and what you expect to come out. But I'm also the kind of person that likes to pry open the box as well :>)

Thanks to both of you for the feedback.
Posted by tedneedleman | Wednesday, May 21 2014 at 4:14PM ET
First off Ted.....in RAID, I = inexpensive (or independent commonly now).....please be more detailed when writing technical articles.

dgccpa is right - people need to understand RAID isn't about backups, or providing holistic security of data. RAID is basically your insurance plan so that WHEN a drive fails, you have bought yourself a little time to replace it before you lose data.

However, readers of this article should stay away from complications like unRaid and FreeNAS unless you're an experienced geek yourself - but then you'd already know what's being said here, and would be reading more advanced tech publications.

And speaking of time, please make sure, before you recommend RAID 5 to anyone, that they understand the risks of a RAID 5 rebuild on large capacity disks. Buying a software driven NAS box with 8 bays (qnap, synology, drobo, etc) and stuffing in 4TB drives is cool...but the slowdown and rebuild time when a drive fails will startle the bullets of sweat right out of you. Nothing like seeing 37 hours remaining while your network is crawling and your data is unprotected.

The mathematics of it show that during long R5 rebuilds, the odds of a URE (un-recoverable error) on a sector are more than 50/50. (just Google it.) This means you will likely lose data before you can complete a rebuild.

Having built many storage boxes and NAS appliances, and worked with Windows-based and Linux stand-alones, I can say that large drive arrays belong in R10 or at worst R6. Either way, minimum 4 drives when using large disks.

And remember....there's no skimping on storage....do it right, and sleep well.
Posted by Scott H | Wednesday, May 21 2014 at 3:37PM ET
Ted - be careful mixing the terms "RAID" and "backup" - RAID is not a backup, and relying on it as such is setting one's self up for data loss. I prefer to think of RAID as a hedge against a big calamity, instead allowing for reduced performance while an array is rebuilt when a drive fails.

Looking forward to the next article in this series, and I hope you'll discuss the "little engines that could" of the NAS world - unRAID, FreeNAS, etc.
Posted by dgccpa | Wednesday, May 21 2014 at 2:37PM ET