What exactly is RAID and what are its advantages?

Hi,

I've been hearing a lot about setting up Mac Pros/G5s in a "RAID" configuration for better performance, and I was wondering what it's all about and how to do it?

Mac Pro 2ghz/Powerbook G3 400Mhz/iPod Photo 40gb/20gb/Shuffle 512mb, Mac OS X (10.4.7), Mac Pro 2ghz/250gb/160gb/160gb/80gb/1gb - Powerbook: 20GB/512MB.

Posted on Aug 29, 2006 12:32 AM

Aug 29, 2006 1:23 AM in response to Mark Thornton

Does this answer your question: http://www.answers.com/topic/raid-technology

RAID

(Redundant Array of Independent Disks) A disk subsystem that is used to increase performance or provide fault tolerance or both. RAID uses two or more ordinary hard disks and a RAID disk controller. In the past, RAID has also been implemented via software only.

In the late 1980s, the term stood for "redundant array of inexpensive disks," being compared to large, expensive disks at the time. As hard disks became cheaper, the RAID Advisory Board changed "inexpensive" to "independent."

Small and Large

RAID subsystems come in all sizes from desktop units to floor-standing models (see NAS (Network Attached Storage) and SAN (Storage Area Network)). Stand-alone units may include large amounts of cache as well as redundant power supplies. Initially used with servers, RAID is increasingly being retrofitted to desktop PCs by adding a RAID controller and extra IDE or SCSI disks. Newer motherboards often have RAID controllers.

Disk Striping

RAID improves performance by disk striping, which interleaves bytes or groups of bytes across multiple drives, so more than one disk is reading and writing simultaneously.
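As a rough illustration of the striping idea described above, here is a minimal Python sketch. The function names, the in-memory lists standing in for drives, and the 4-byte chunk size are all assumptions made for the example; a real controller works on much larger blocks and on physical disks.

```python
def stripe(data: bytes, num_drives: int, chunk_size: int = 4) -> list:
    """Deal fixed-size chunks of data across the drives in round-robin order."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        drives[chunk_index % num_drives].append(data[offset:offset + chunk_size])
    return drives


def unstripe(drives: list) -> bytes:
    """Reassemble the data by reading one chunk from each drive in turn."""
    out = []
    longest = max(len(d) for d in drives)
    for row in range(longest):
        for d in drives:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


drives = stripe(b"ABCDEFGHIJKLMNOP", num_drives=2, chunk_size=4)
# drive 0 holds ABCD and IJKL, drive 1 holds EFGH and MNOP
original = unstripe(drives)
```

Because consecutive chunks land on different drives, both drives can be busy at the same time, which is where the speed-up comes from.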

Mirroring and Parity

Fault tolerance is achieved by mirroring or parity. Mirroring is 100% duplication of the data on two drives (RAID 1). Parity is used to calculate the data in two drives and store the results on a third (RAID 3 or 5). After a failed drive is replaced, the RAID controller automatically rebuilds the lost data from the other two. RAID systems may have a spare drive (hot spare) ready and waiting to be the replacement for a drive that fails.

The parity calculation is performed in the following manner: a bit from drive 1 is XOR'd with a bit from drive 2, and the result bit is stored on drive 3 (see OR for an explanation of XOR).
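The XOR parity calculation, and the rebuild after a drive failure, can be sketched in a few lines of Python. The byte values and the helper name xor_blocks are made up for the example; it only assumes the three-drive layout described above.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


drive1 = b"\x0f\xf0\xaa"
drive2 = b"\x33\x55\x0f"
parity = xor_blocks(drive1, drive2)  # this is what gets stored on drive 3

# Suppose drive 2 fails: XOR the two surviving drives to recover its contents.
rebuilt = xor_blocks(drive1, parity)
assert rebuilt == drive2
```

The rebuild works because XOR is its own inverse: XORing the parity with either data drive yields the other one.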

RAID Levels


RAID 0 - Speed

Level 0 is disk striping only, which interleaves data across multiple disks for better performance. It does not provide safeguards against failure. RAID 0 is widely used in gaming machines for higher speed.

RAID 1 - Fault Tolerance

Uses disk mirroring, which provides 100% duplication of data. Offers highest reliability, but doubles storage cost. RAID 1 is widely used in business applications.

RAID 2 - Speed

Bits (rather than bytes or groups of bytes) are interleaved across multiple disks. The Connection Machine used this technique, but this is a rare method.

RAID 3 - Speed and Fault Tolerance

Data are striped across three or more drives. Used to achieve the highest data transfer, because all drives operate in parallel. Parity bits are stored on separate, dedicated drives.

RAID 4 - Speed and Fault Tolerance

Similar to Level 3, but manages disks independently rather than in unison. Not often used.

RAID 5 - Speed and Fault Tolerance

Data are striped across three or more drives for performance, and parity bits are used for fault tolerance. The parity bits from two drives are stored on a third drive and are interspersed with user data. RAID 5 is widely used on servers to provide speed and fault tolerance.

RAID 6 - Speed and Fault Tolerance

Highest reliability, but not widely used. Similar to RAID 5, but performs two different parity computations or the same computation on overlapping subsets of the data.

RAID 10 - Speed and Fault Tolerance

A combination of RAID 1 and RAID 0. RAID 0 is used for performance, and RAID 1 is used for fault tolerance.

Aug 29, 2006 2:00 AM in response to Mark Thornton

Basically speaking, RAID as offered by Mac OS X (Disk Utility) has three distinct functions…

1) Large drive sizes. It enables the combination of multiple drives into one large drive. So, if you have 3 x 500GB drives you can combine them into one large 1.5TB drive. This is implemented under a RAID 0 (striping or concatenating) scheme.

2) Redundancy. By mirroring two or more drives it is possible to have all but one drive fail simultaneously and still maintain data operation and integrity. So, if you had 3 x 500GB mirrored drives (resulting in a single 500GB drive) all would be identical, giving you, in effect, 3 original drives. If one or two of those drives failed the remaining drive(s) would remain in operation. This is implemented under a RAID 1 scheme.

3) I/O performance gains. By having your data spread over multiple drives, each of those drives is able to read and write independently and concurrently. So, if you had 2 x 500GB drives under RAID 0 (stripe only - concatenating provides no I/O improvements), half of your data will be on the first drive and the other half on the 2nd drive. For a 100GB file each drive would have 50GB of that file. When it came time to read that file each drive would simultaneously deliver the 50GB piece it had, which would essentially halve read time. The same would apply to writing that 100GB file, as each drive would only have to write half of it.

Under a RAID 1 (mirroring) scheme read speeds would likewise be improved, as you have multiple drives to read from, whereas write speed would not change, as the same data needs to be written to all drives.
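To put numbers on the read and write behaviour just described, here is a small best-case estimate in Python. The 50 MB/s per-drive throughput is a made-up figure for illustration and the function name is hypothetical; the model ignores all controller overhead, so real results will be lower.

```python
def ideal_times(file_gb, drive_mb_per_s, num_drives, scheme):
    """Return best-case (read_seconds, write_seconds) for one large file.

    scheme is "single", "stripe" (RAID 0) or "mirror" (RAID 1).
    """
    single = file_gb * 1000 / drive_mb_per_s  # time for one drive on its own
    if scheme == "single":
        return single, single
    if scheme == "stripe":
        # every drive handles 1/num_drives of the file, for reads and writes
        return single / num_drives, single / num_drives
    if scheme == "mirror":
        # reads can be shared between copies, but every copy must be written in full
        return single / num_drives, single
    raise ValueError("unknown scheme: " + scheme)


# 100GB file, two drives, assumed 50 MB/s each
stripe_read, stripe_write = ideal_times(100, 50, 2, "stripe")
mirror_read, mirror_write = ideal_times(100, 50, 2, "mirror")
```

Under these assumptions the stripe halves both times, while the mirror halves only the read time.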

Please be aware that I/O is not actually improved by the full multiple this suggests, as there are overheads incurred by the RAID controller in managing the RAID implementation. Additionally, under Mac OS X the controller is software based rather than hardware based, so there are performance losses there as well. It is possible, however, that the Mac Pro does have a hardware RAID controller implemented as part of the Intel 5000X chipset and that Disk Utility only controls it. Someone who knows more about the Intel 5000X chipset might want to confirm or deny this.

While you do get across-the-board performance gains, they are minimal at best when booting, due to the large number of very small I/O operations. RAID does best with large-file I/O operations.


Many RAID implementations are in fact a combination of RAID 0 and 1 to combine their effects. So, what you could do with 4 identical 500GB drives is create two RAID 1 mirrored pairs, leaving you with 2 x 500GB disks, and then create a RAID 0 stripe across those two resulting disks to form a single 1TB disk. By doing this you in effect get redundancy for each member in the stripe and also the combined I/O performance gains (2x write and 4x read). This particular implementation is a RAID 10 (or 1 + 0) scheme.
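The capacity arithmetic for the schemes discussed so far can be checked with a short Python sketch. The function name and scheme labels are made up for the example, and it assumes identical drives, with RAID 10 built from mirrored pairs as described above.

```python
def usable_capacity_gb(scheme, drive_gb, num_drives):
    """Usable capacity of an array of identical drives."""
    if scheme == "raid0":
        # striping: every drive contributes its full size
        return drive_gb * num_drives
    if scheme == "raid1":
        # mirroring: all drives hold the same data
        return drive_gb
    if scheme == "raid10":
        # mirrored pairs, then a stripe across the pairs
        if num_drives < 4 or num_drives % 2:
            raise ValueError("RAID 10 here needs an even number of drives, at least 4")
        return drive_gb * (num_drives // 2)
    raise ValueError("unknown scheme: " + scheme)


print(usable_capacity_gb("raid0", 500, 3))   # 3 x 500GB striped
print(usable_capacity_gb("raid1", 500, 3))   # 3 x 500GB mirrored
print(usable_capacity_gb("raid10", 500, 4))  # 4 x 500GB as two mirrored pairs, striped
```

These give 1500, 500 and 1000 GB respectively, matching the 1.5TB, 500GB and 1TB figures quoted in this post.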

For the best RAID implementation it is highly suggested that all drives are the same size, make and model. Additionally, due to the negative effects of RAID, which I'll get into, it is even more highly suggested that you opt for an enterprise/nearline class drive with higher reliability and lower failure rates. Enterprise class drives are designed for heavy use and RAID implementations. Yes, they are more expensive, though often not excessively so, and are sometimes smaller than their consumer/desktop class brethren, but they are well worth the difference.

So what's the downside? The largest downside is with RAID 0 striping. With each member/drive you add to a stripe set you add to the likelihood of a catastrophic failure. For instance, if you had a 4 drive RAID 0 stripe the possibility of a drive failure is 4x that of a single drive. Additionally, if a single drive fails all data is lost, as that one drive has a quarter of each file on your system. Now you can see why enterprise class drives are recommended!

Even if you implemented redundancy through a combined RAID 1 scheme, if what was written to disk is damaged then it is damaged on all disks. Remember, with mirroring, what is written to one drive is written to all drives. RAID 1 really only protects you from a hardware failure, not a software failure.

This does sound bad, doesn't it? Just remember to keep it in perspective… a failure under RAID is more likely but that does not make it imminent. With a 4 drive stripe and an enterprise class drive with a 1 million hour MTBF that still gives an overall 250,000 hour MTBF. The likelihood of a single failure is still extremely low.

When you implement RAID schemes, how you safeguard your data via clones and backups becomes a more important issue. Bear in mind that not safeguarding your data, with or without RAID, is unwise, so if you are doing the right thing there is really nothing extra to do with RAID.
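The MTBF figure above is just the single-drive rating divided by the number of drives in the stripe. Here is a hedged Python sketch of that arithmetic. The one-year failure probability is an extra step not taken in this post, and it assumes drives fail independently at a constant rate (a simple exponential model), which real drives only approximate.

```python
import math


def stripe_mtbf_hours(drive_mtbf_hours, num_drives):
    """Combined MTBF of a stripe set, which fails as soon as any one drive fails."""
    return drive_mtbf_hours / num_drives


def prob_failure_within(hours, mtbf_hours):
    """Chance of at least one failure in the period (assumed exponential model)."""
    return 1 - math.exp(-hours / mtbf_hours)


four_drive = stripe_mtbf_hours(1_000_000, 4)           # 250,000 hours, as quoted above
one_year = prob_failure_within(24 * 365, four_drive)   # a few percent under this model
```

So the stripe is clearly riskier than a single drive, but under these assumptions a failure in any given year is still unlikely.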

I clone every day or two and backup weekly. Even without RAID, if I didn't do this a hardware failure would result in data loss so RAID isn't particularly worse than an independent drive setup.


Would I recommend it? To some yes, and to others no. It really comes down to how you use your Mac Pro. My suggestion is that if you have the drives and the time, just give it a go and see how it feels.

Aug 29, 2006 5:27 AM in response to Mark Thornton

"It sounds a bit complex to me though"
It does sound complex, but in reality implementing it is not.

"just wondered if it was worth doing to improve general performance"
Only you can determine that. I came across this Bare Feats benchmark that shows single-drive performance and RAID 0 performance, using a variety of drives.

Draw your own conclusion, but for me the savings in general don't seem to bear out the cost factor. Using only RAID 0 is a risk as you're writing to two separate drives; if one of the drives "burps" then you have some problems. How high is that risk? I'd say rather low, but it is there, which is why many people mirror the striped disks (a combination of RAID 0 and 1).

I opted not to go with a RAID solution as the improved performance did not outweigh the cost of buying multiple drives. At the moment the stock 250GB drive has more than enough room for my needs. I will be buying another drive, but we're talking about spending 100 bucks as opposed to 300 to 400 for an array of drives.

Mike

Aug 29, 2006 6:09 AM in response to Mark Thornton

"It sounds a bit complex to me though, I just wondered if it was worth doing to improve general performance (not that it's bad of course) but I think I'll continue with a standard setup."

As Michael points out, the implementation side of RAID is in fact exceedingly simple. The difficult part is determining what RAID scheme, if any, you want to implement. Once a decision is made you can in fact have a RAID implementation up and running in less than 3 minutes (2:30 of that is taken up by reading the Disk Utility instructions for the first time).

Aug 29, 2006 6:47 AM in response to Michael Flynn

"How high is that risk, I'd say rather low but it is there which is why many people mirror the striped disks"

In actual fact, if you use enterprise class drives a 2 drive stripe is no more risky than a single consumer/desktop class drive. With 2 drives of a 1 million hour MTBF (the Raptor has a 1.2 million hour MTBF rating) you get a resulting 500K hour rating, which is no worse than many consumer drives and better than many value class drives.

"I opted not to go with a raid solution as the improved performance did not outweigh the cost of buying multiple drives."

Which is the position that many are going to take, and it is the most cost-effective option. You can't argue with that. I too looked hard at the numbers and found that while I spent more, it was more cost effective to go with a RAID implementation.

As of right now I use around 150-200GB which means as a CTO, the minimum drive I would sensibly buy would be a single 500GB at a cost of 200€. Instead I chose a 5 drive system (4 x 300GB Maxtor MaXLine III in RAID 10 + 1 x 300GB Western Digital SE in the optical drive bay for Windows) which results in 100GB more in my main drive, 300GB separate drive for Windows (150GB for Boot Camp and 150GB for Parallels), superior performance and drive redundancy.

All 5 drives cost me 430€ and if you subtract the 70€ for downgrading to the 160GB Apple drive the price difference to the Apple 500GB CTO price was 160€. And that's including having the 160GB drive as a paper-weight. Even if I had downgraded to the 160GB drive and bought a 3rd party 500GB drive the cost would not have been much less, if at all.

For the cost of 1GB RAM it was simply a no-brainer on my part. Is it overkill for what I'm really going to be doing… probably. Given the Mac Pro, though, I felt that going with a single consumer class drive was like using standard octane gas and retreads on a Ferrari.

Of course your mileage will vary greatly depending on your needs.

Aug 29, 2006 7:22 AM in response to infinite vortex

"In actual fact if you use an enterprise class drive a 2 drive stripe is in fact no more risky than a single"

I tend to disagree. In a RAID 0 setting you are relying on two physical devices to write one file, and that is a higher risk than a non-RAID solution. When you make the process more complex, and interleaving two drives as one is more complex than a single drive, you have a higher potential for corruption.

You're splitting the data stream in two and things happen. If they didn't, people wouldn't be looking to do a combination RAID 0+1.

I don't think the risk is high but it is there.

Mike

Aug 29, 2006 8:10 AM in response to Michael Flynn

"I tend to disagree. In a RAID 0 setting you are relying on two physical devices to write one file, and that is a higher risk than a non-RAID solution."

That is assuming that the failure rates are the same between drives. Two enterprise class drives should have a similar failure rate to one consumer class drive. It's like saying that a Rolls Royce is just as likely to fail as a GM. This is why, when going to RAID, it is suggested you change your drive class. For instance, a WD Raptor is going to fail far less than pretty much every other drive that's out there.

Aug 29, 2006 1:32 PM in response to Mark Thornton

Mark:

If you feel in any way uncomfortable with RAID configs then don't use it. RAID configs can either be implemented with software or hardware. Apple's Disk Utility configs the RAID device via software and is less optimal than it being done in hardware - but then it's less expensive also. :-))

RAID essentially provides performance and/or reliability and/or data protection improvements over single disk spindles. The benefits of a RAID config may also consume more of your physical disk space than you can afford. For example, with disk mirroring (RAID 1), if you have 2 HDs then you only have access to the space equivalent of 1 HD.

My advice is to become very well acquainted with the feature before using RAID on your Mac Pro.

Good luck, and I think you've already made the wise choice.

Aug 29, 2006 3:45 PM in response to Nigel-66

Enterprise class drives are designed specifically for server and RAID implementations. They have far lower failure rates than consumer class drives and are also specifically able to withstand 24/7 use without issue. Normally these are SCSI or SAS (Serial Attached SCSI) drives although there are a few SATA drives that match these requirements.

Examples of SATA enterprise drives are the Maxtor MaXLine III/Pro series, the Western Digital RE/RE2 series and the gold standard of performance, the Western Digital Raptors.

As for a 500GB drive choice… you might want to look at the Maxtor MaXLine Pro else the Western Digital RE2.

Aug 29, 2006 3:55 PM in response to Nigel-66

Enterprise. Designed for 24/7 and RAID. Longer burn-in likely, along with a full 5 yr warranty. If you look on StorageReview you'll see that Seagate, Maxtor and WDC as well as Hitachi all have consumer and enterprise drives. The enterprise drives also have a 1.2 million hour MTBF as compared to 600-700K.

In other words, suitable for the pro-sumer or workstation as well as (enterprise) servers now.

I'm sold on WD's line. Their enterprise class drives are the "RE" and "RE2" lines. Maxtor's is the MaXLine III. And Hitachi is more subtle, with the "T"urbo 7K500. Seagate has something like "NL."

From SR front page:
First up will be a roundup-style look at a trio of 500 GB enterprise offerings from the big three American drive manufacturers: Seagate's NL35.2, Maxtor's MaXLine Pro, and Western Digital's Caviar RE2 WD5000.
SATA in the Enterprise

Mac Pro 2GHz 2GB WD Raptor Mac OS X (10.4.7) G4 MDD WD 320 OEM 9600 1.75GB SoftAID 3

Aug 29, 2006 8:31 PM in response to Nigel-66

I'm of the opinion that MTBDL (mean time between data loss) is far more important than MTBF (mean time between failures).

The Mac OS HFS+ file system is pretty robust, but strange things can happen in software, or the hardware can simply malfunction, and corrupt the HD's data structures no matter whether the HD is consumer or enterprise rated.

At my work place it's MTBDL that counts.
