
What's faster? Built in hard drive or Promise Pegasus R4 4TB (4x1TB) RAID System connected with Thunderbolt?

For encoding video etc.


1TB 7200-rpm Serial ATA 3Gb/s hard drive


vs.


http://store.apple.com/us/product/H5184VC/A/Thunderbolt




And what about Thunderbolt drive vs. Solid State?

Posted on Oct 6, 2011 6:37 AM

Question marked as Best reply

Posted on Oct 6, 2011 6:49 AM

Famous_Boi69 wrote:


For encoding video etc.


1TB 7200-rpm Serial ATA 3Gb/s hard drive


vs.


http://store.apple.com/us/product/H5184VC/A/Thunderbolt




And what about Thunderbolt drive vs. Solid State?

Since Thunderbolt extends the internal data bus to the outside, there should be (theoretically) no major difference in data transfer rate between the internal drive and a Thunderbolt-connected one.

http://en.wikipedia.org/wiki/Thunderbolt_%28interface%29
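To put rough numbers on that point, here's a quick Python sketch using the published line rates (the single-drive figure below is my assumption for a typical 7200-rpm disk of the era, not a benchmark):

    # Theoretical link rates converted to MB/s. SATA uses 8b/10b encoding,
    # so only 8 of every 10 bits on the wire are data.
    def sata_mb_per_s(line_rate_gbps):
        return line_rate_gbps * 0.8 * 1000 / 8

    print("SATA II  (3 Gb/s): ~%.0f MB/s" % sata_mb_per_s(3))    # ~300 MB/s
    print("SATA III (6 Gb/s): ~%.0f MB/s" % sata_mb_per_s(6))    # ~600 MB/s
    print("Thunderbolt (10 Gb/s per channel): ~%.0f MB/s raw" % (10 * 1000 / 8.0))

    # A single 7200-rpm disk sustains maybe 100-150 MB/s, so one drive never
    # saturates either link; it takes a multi-drive RAID to make the link matter.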


An SSD is indeed faster than a conventional disk: there are no parts to move before data can be accessed, so the drive can (theoretically) use the full bus rate of the system for data transfer.


A RAID, depending on the RAID level and interface used, can be remarkably faster than a single drive. http://en.wikipedia.org/wiki/Raid_levels


Lupunus

27 replies

Oct 22, 2011 9:30 AM in response to Smphoto74

IMHO, RAID 10 doesn't make the sense that RAID 5 does (but it's easier to implement.) What I especially don't like is losing half the storage... I ran RAID 0 (still do on one system) for data that really needs to be fast (e.g., uncompressed HD video) but only because it was all I could afford. I built a 2TB RAID 0 box for $1200 when the asking price for a 2TB hardware RAID 5 was $8-10K.


Back-up is important no matter what technology you use, but (again, IMHO) Time Machine isn't really meant for the job since it wants to "track changes in your documents" and at least the way I work, I can make 50-100GB of "changes" in a hard day of editing. That's going to create a problem on a 1TB or 2TB TM box...


The thing about very fast disk arrays is that they're hard to measure in the real world because to see the crazy fast stuff you need to have really fast arrays at both ends and a clean path in between. I had exactly that a while back when I had two RAIDs attached to the same system and was backing up a feature film I had just finished cutting. It took about 22 minutes to back up some 318GB of source and work files. If I'd been going to a clean single SATA drive it probably would have taken an hour.


Solid State Drives are a really interesting item. They are terrifyingly fast but pretty small and quite expensive. Yes, they should improve your cut/paste scores, but how fast can you afford to go? The scenario that I like best for SSD is to use it as a system drive with a reasonably high (4-6) spindle RAID5 as main storage. In this scenario on my system (8X 3.2GHz Xeon) it may be possible to get 250MB/sec throughput. Some of the things I do (like transcoding HD video) can currently saturate the 8 processors and become disc-dependent for performance.


But the thing to keep in mind is that in the end, when you raise the bar you're just moving the choke point to a more expensive place. 🙂

Oct 22, 2011 9:07 PM in response to RatVega™

RatVega,


What do you recommend using for backup software, since you don't think TM is up to it? I'm a photographer and I have 3 employees editing images, all from the R4 unit, at any given time. I have TM backing up to a WD 2TB HD (USB) and really haven't had any issues.


I believe (I'd have to double-check) that in Raid 10 the write performance is better than in Raid 5. I do a ton of writing, so that is why I went Raid 10. I really want Raid 0. Lol.

Oct 23, 2011 12:13 AM in response to Smphoto74

If you like TM then stick with it. I tend to think in video terms and (as I mentioned) I can see myself wiping out multi-TB drives at a pretty good clip. In your case, you don't have the render files and versioning issues so a couple of TB will last you a great deal longer.


By all means, do what YOU think is best for your circumstance. I'm just a guy (in a different business) with a little time on the ground and an opinion. What I wanted to give you is an alternative that I think makes sense.


Speed is a relative thing. Once you are going faster than you need to, you start looking at different things. My recommendation for a SATA III RAID 5 is based on a couple of points:

1. SATA III is twice the speed of SATA II (which is what a Mac Pro comes with), and that speed costs you only the price of the RAID controller.

2. With twice the theoretical speed, staying above your desired threshold should be possible without a lot of tuning.

3. Since a RAID 10 is mirrored, your net capacity is half: a fairly steep price when RAID 5 provides 75% in a 4-spindle array, and more if you go with more spindles (see the quick calculation after this list).

4. I assumed you were using RAID 10 because that is what was offered (as I said, it's easier.) This is apparently not true and your opinion of what you need trumps my opinion any time.
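Here's the quick calculation mentioned in point 3, as a minimal Python sketch for a 4 x 1TB box like the Pegasus R4 (standard capacity formulas for each level):

    # Usable capacity of a 4 x 1TB array under each RAID level.
    n, size_tb = 4, 1.0

    raid10_tb = n * size_tb / 2       # every block is mirrored -> 50% usable
    raid5_tb  = (n - 1) * size_tb     # one drive's worth of parity -> (n-1)/n usable

    print("RAID 10: %.1f TB usable (50%%)" % raid10_tb)   # 2.0 TB
    print("RAID 5:  %.1f TB usable (75%%)" % raid5_tb)    # 3.0 TB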


I've had the RAID 0 and I understand the draw... If you're not living on the edge, you're taking up too much room! 😀 (but it's a lot like playing Russian Roulette with a .45 Automatic...)

Oct 23, 2011 12:36 AM in response to RatVega™

Different views are great and a reason why I'm here so thanks for your comments and tips.


I mainly did Raid 10 due to lots of research online about what Raid to use for whatever. Having only 2TB available out of 4TB is fine with me so that's never been an issue. Speed is important...it's a guy thing I think. Lol


I'm seriously thinking of doing a Raid 0 in my R4 unit then doing TM in Raid 1 on my 4TB WD FireWire drive. Granted that is only 2 TB for TM but I never plan to take the R4 over 2TB. I rotate my work out fast and I've always heard drive performance starts to bog down after 1/2 full. I do also keep another copy of the original files in a separate location.


I did buy FCPX and want to get into video a bit. More personal stuff than business. Lol.


Thanks again for your help

Oct 24, 2011 6:29 PM in response to RatVega™

RatVega™ wrote:


IMHO, RAID 10 doesn't make the sense that RAID 5 does (but it's easier to implement.) What I especially don't like is losing half the storage... I ran RAID 0 (still do on one system) for data that really needs to be fast (e.g., uncompressed HD video) but only because it was all I could afford. I built a 2TB RAID 0 box for $1200 when the asking price for a 2TB hardware RAID 5 was $8-10K.


Back-up is important no matter what technology you use, but (again, IMHO) Time Machine isn't really meant for the job since it wants to "track changes in your documents" and at least the way I work, I can make 50-100GB of "changes" in a hard day of editing. That's going to create a problem on a 1TB or 2TB TM box...


The thing about very fast disk arrays is that they're hard to measure in the real world because to see the crazy fast stuff you need to have really fast arrays at both ends and a clean path in between. I had exactly that a while back when I had two RAIDs attached to the same system and was backing up a feature film I had just finished cutting. It took about 22 minutes to back up some 318GB of source and work files. If I'd been going to a clean single SATA drive it probably would have taken an hour.


Solid State Drives are a really interesting item. They are terrifyingly fast but pretty small and quite expensive. Yes, they should improve your cut/paste scores, but how fast can you afford to go? The scenario that I like best for SSD is to use it as a system drive with a reasonably high (4-6) spindle RAID5 as main storage. In this scenario on my system (8X 3.2GHz Xeon) it may be possible to get 250MB/sec throughput. Some of the things I do (like transcoding HD video) can currently saturate the 8 processors and become disc-dependent for performance.


But the thing to keep in mind is that in the end, when you raise the bar you're just moving the choke point to a more expensive place. 🙂

I'd been looking for it and finally found it... this is what I read online, and it's what I based my decision to go Raid 10 instead of 5 on. It's a long read. lol



RAID5 versus RAID10 (or even RAID3 or RAID4)


First let's get on the same page so we're all talking about apples.


What is RAID5?


OK, here is the deal: RAID5 uses ONLY ONE parity drive per stripe, and many RAID5 arrays are 5 drives (4 data and 1 parity, though it is not a single drive holding all of the parity as in RAID 3 & 4, but read on; if your counts are different, adjust the calculations appropriately). If you have 10 drives of say 20GB each for 200GB, RAID5 will use 20% for parity (assuming you set it up as two 5-drive arrays), so you will have 160GB of storage. Now, since RAID10, like mirroring (RAID1), uses 1 (or more) mirror drive for each primary drive, you are using 50% for redundancy, so to get the same 160GB of storage you will need 8 pairs, or sixteen 20GB drives, which is why RAID5 is so popular. This intro is just to put things into perspective.


RAID5 is physically a stripe set like RAID0 but with data recovery included. RAID5 reserves one disk block out of each stripe block for parity data. The parity block contains an error correction code which can correct any error in the RAID5 block; in effect it is used in combination with the remaining data blocks to recreate any single missing block, gone missing because a drive has failed. The innovation of RAID5 over RAID3 & RAID4 is that the parity is distributed on a round-robin basis, so that there can be independent reading of different blocks from the several drives. This is why RAID5 became more popular than RAID3 & RAID4, which must synchronously read the same block from all drives together. So, if Drive2 fails, blocks 1, 2, 4, 5, 6 & 7 are data blocks on this drive and blocks 3 and 8 are parity blocks on this drive. That means the parity on Drive5 will be used to recreate the data block from Drive2 if block 1 is requested before a new drive replaces Drive2, or during the rebuilding of the new Drive2 replacement. Likewise, the parity on Drive1 will be used to repair block 2 and the parity on Drive3 will repair block 4, etc. For block 2 all the data is safely on the remaining drives, but during the rebuilding of Drive2's replacement a new parity block will be calculated from the block 2 data and will be written to Drive2.
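(An illustration added for clarity, not part of the quoted article: the parity mechanics above are just byte-wise XOR, which a few lines of Python can demonstrate. Real controllers work on whole stripes, but the principle is identical.)

    from functools import reduce

    def xor_blocks(blocks):
        # Parity is the byte-wise XOR of all blocks in the stripe.
        return bytes(reduce(lambda a, b: a ^ b, t) for t in zip(*blocks))

    # A 5-drive stripe: 4 data blocks plus 1 parity block.
    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
    parity = xor_blocks(data)

    # If the drive holding data[1] fails, the missing block is rebuilt from
    # the three surviving data blocks plus the parity block.
    rebuilt = xor_blocks([data[0], data[2], data[3], parity])
    assert rebuilt == data[1]   # recovered exactly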


Now, when a disk block is read from the array, the RAID software/firmware calculates which RAID block contains the disk block, which drive the disk block is on, and which drive contains the parity block for that RAID block, and reads ONLY the one data drive. It returns the data block. If you later modify the data block, it recalculates the parity by subtracting the old block and adding in the new version, then in two separate operations it writes the data block followed by the new parity block. To do this it must first read the parity block from whichever drive contains the parity for that stripe block, and reread the unmodified data for the updated block from the original drive. This read-read-write-write is known as the RAID5 write penalty: since these two writes are sequential and synchronous, the write system call cannot return until the reread and both writes complete, for safety, so writing to RAID5 is up to 50% slower than RAID0 for an array of the same capacity. (Some software RAID5's avoid the re-read by keeping an unmodified copy of the original block in memory.)
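(Another added illustration: the read-read-write-write sequence described above. "Subtracting the old block and adding in the new version" is XOR arithmetic, so the new parity is simply old parity XOR old data XOR new data.)

    def update_parity(old_parity, old_block, new_block):
        # new_parity = old_parity ^ old_block ^ new_block, byte-wise
        return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_block, new_block))

    # Stripe A,B,C,D with parity P = A^B^C^D; now rewrite block B as X.
    A, B, C, D, X = b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"XXXX"
    P = bytes(a ^ b ^ c ^ d for a, b, c, d in zip(A, B, C, D))

    # Two reads (old B, old P), then two writes (X, new P): the write penalty.
    P_new = update_parity(P, B, X)
    assert P_new == bytes(a ^ x ^ c ^ d for a, x, c, d in zip(A, X, C, D))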


Now what is RAID10:


RAID10 is one of the possible combinations of RAID1 (mirroring) and RAID0 (striping). There used to be confusion about what RAID01 or RAID10 meant, and different RAID vendors defined them differently. About five years or so ago I proposed the following standard language, which seems to have taken hold: when N mirrored pairs are striped together this is called RAID10, because the mirroring (RAID1) is applied before striping (RAID0). The other option is to create two stripe sets and mirror them one to the other; this is known as RAID01 (because the RAID0 is applied first). In either a RAID01 or RAID10 system each and every disk block is completely duplicated on its drive's mirror. Performance-wise, both RAID01 and RAID10 are functionally equivalent. The difference comes in during recovery, where RAID01 suffers from some of the same problems I will describe affecting RAID5, while RAID10 does not.
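(Added illustration: a hypothetical mapping from a logical block number to physical drives under the "mirror first, then stripe" definition, assuming 4 mirrored pairs; the drive names are made up.)

    def raid10_location(logical_block, n_pairs=4):
        # RAID0 striping picks the mirrored pair; RAID1 puts the block on both halves.
        pair = logical_block % n_pairs
        return ("drive%dA" % pair, "drive%dB" % pair)

    for lb in range(5):
        print(lb, raid10_location(lb))
    # blocks 0-3 land on pairs 0-3; block 4 wraps back around to pair 0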


Now, if a drive in the RAID5 array dies, is removed, or is shut off, data is returned by reading the blocks from the remaining drives and calculating the missing data using the parity, assuming the defunct drive is not the parity block drive for that RAID block. Note that it takes 4 physical reads to replace the missing disk block (for a 5-drive array) for four out of every five disk blocks, leading to a 64% performance degradation until the problem is discovered and a new drive can be mapped in to begin recovery. Performance is degraded further during recovery because all drives are being actively accessed in order to rebuild the replacement drive (see below).


If a drive in the RAID10 array dies, data is returned from its mirror drive in a single read, with only minor performance reduction (6.25% on average for a 4-pair array as a whole) when two non-contiguous blocks are needed from the damaged pair (since the two blocks cannot be read in parallel from both drives), and none otherwise.
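(A back-of-the-envelope model of those two degraded-read cases, using simplified accounting of my own; it lands near, though not exactly on, the article's figures.)

    # Average physical reads per logical read with one drive failed.

    # RAID5, 5 drives: 4 of 5 blocks live on survivors and cost 1 read each;
    # the 1 in 5 that lived on the dead drive needs all 4 remaining drives read.
    raid5_reads = (4 / 5.0) * 1 + (1 / 5.0) * 4    # = 1.6 -> ~60% extra reads

    # RAID10, 4 pairs: every block still has one intact copy, so each read is
    # still a single read; only the surviving half of the broken pair is busier.
    raid10_reads = 1.0

    print("RAID5 degraded:  %.0f%% more physical reads" % ((raid5_reads - 1) * 100))
    print("RAID10 degraded: no read amplification")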


One begins to get an inkling of what is going on and why I dislike RAID5, but, as they say on late-night infomercials, there's more.


What's wrong besides a bit of performance I don't know I'm missing?


OK, so that brings us to the final question of the day, which is: what is the problem with RAID5? It does recover a failed drive, right? So writes are slower; I don't do enough writing to worry about it, and the cache helps a lot also, I've got LOTS of cache! The problem is that despite the improved reliability of modern drives and the improved error correction codes on most drives, and even despite the additional 8 bytes of error correction that EMC puts on every Clariion drive disk block (if you are lucky enough to use EMC systems), it is more than a little possible that a drive will become flaky and begin to return garbage. This is known as partial media failure. Now, SCSI controllers reserve several hundred disk blocks to be remapped to replace fading sectors with unused ones, but if the drive is going these will not last very long and will run out, and SCSI does NOT report correctable errors back to the OS! Therefore you will not know the drive is becoming unstable until it is too late, when there are no more replacement sectors and the drive begins to return garbage. [Note that the recently popular IDE/ATA drives do not (TMK) include bad sector remapping in their hardware, so garbage is returned that much sooner.] When a drive returns garbage, since RAID5 does not EVER check parity on read (RAID3 & RAID4 do, BTW, and both perform better for databases than RAID5 to boot), when you write the garbage sector back, garbage parity will be calculated and your RAID5 integrity is lost! Similarly, if a drive fails and one of the remaining drives is flaky, the replacement will be rebuilt with garbage, also propagating the problem to two blocks instead of just one.


Need more? During recovery, read performance for a RAID5 array is degraded by as much as 80%. Some advanced arrays let you configure the preference more toward recovery or toward performance. However, doing so will increase recovery time and increase the likelihood of losing a second drive in the array before recovery completes, resulting in catastrophic data loss. RAID10, on the other hand, will only be recovering one drive out of 4 or more pairs, with ONLY the performance of reads from the recovering pair degraded, making the performance hit to the array overall only about 20%! Plus there is no parity calculation time used during recovery; it's a straight data copy.


What about that thing about losing a second drive? Well, with RAID10 there is no danger unless the one mirror that is recovering also fails, and that's 80% or more less likely than that any other drive in a RAID5 array will fail! And since most multiple drive failures are caused by undetected manufacturing defects, you can make even this possibility vanishingly small by making sure to mirror every drive with one from a different manufacturer's lot number. ("Oh," you say, "this scenario does not seem likely!" Pooh, we lost 50 drives over two weeks when a batch of 200 IBM drives began to fail. IBM discovered that the single lot of drives would have their spindle bearings freeze after so many hours of operation. Fortunately, due in part to RAID10 and in part to a herculean effort by DG techs and our own people over 2 weeks, no data was lost. HOWEVER, one RAID5 filesystem was a total loss after a second drive failed during recovery. Fortunately everything was on tape.)


Conclusion? For safety and performance, favor RAID10 first, RAID3 second, RAID4 third, and RAID5 last! The original reason for the RAID2-5 specs was that the high cost of disks was making RAID1, mirroring, impractical. That is no longer the case! Drives are commodity priced; even the biggest, fastest drives are cheaper in absolute dollars than drives were then, and cost per MB is a tiny fraction of what it was. Does RAID5 make ANY sense anymore? Obviously, I think not.


To put things into perspective: if a drive costs $1000US (and most are far less expensive than that), then switching from a 4-pair RAID10 array to a 5-drive RAID5 array will save 3 drives, or $3000US. What is the cost of overtime and wear and tear on the technicians, DBAs, managers, and customers of even a recovery scare? What is the cost of reduced performance and possibly reduced customer satisfaction? Finally, what is the cost of lost business if data is unrecoverable? I maintain that the drives are FAR cheaper! Hence my mantra:


NO RAID5! NO RAID5! NO RAID5! NO RAID5! NO RAID5! NO RAID5! NO RAID5!


Art S. Kagel

Nov 4, 2011 10:52 PM in response to RatVega™

I'm putting my Pegasus R4 in Raid 0 this weekend. I also want my OS (Lion) on there as well; not sure if I will make a partition for that or not. My backup will be to a 3TB WD drive via TM, plus another backup to another WD HD using Carbon Copy Cloner set up for daily backups. So 2 backups should be pretty good. Granted, if a drive dies on the R4 it will still suck, but then it would still suck if a drive went out in Raid 10.


I only have a 1TB HD in my iMac, so I figure I can get better performance from my programs if they are installed on the Pegasus R4 (although I hear boot times will still be the same as a regular HD). I will post the speed of the R4 in Raid 0 on Monday.

Dec 4, 2012 9:21 AM in response to Tman123

You're asking the wrong question...


Maybe I've missed something here, but I don't believe that Thunderbolt is available on a Mac Pro. I think I recall seeing an aftermarket card, but that would be gated by PCI speed so it would end up being about the same speed.


Given enough spindles, Thunderbolt should be faster, because the internal SATA bus on a Mac Pro is SATA II and limited to 4 drives. OK, 6 drives on some Mac Pros... On the other hand, T-bolt can be whatever is offered: SATA3, lots of spindles, neon lights, dancing girls... just not in a Mac Pro.


If you really need data dead fast at very high volume for your Mac Pro, you should look at SATA3 or Fibre Channel. It is possible to convert the internal SATA to SATA3, but the ability to boot from the RAID is expensive; one alternative is an SSD boot drive on the internal SATA bus. I honestly know little about Fibre Channel except that I can't afford to go that fast.

Mar 3, 2014 8:41 AM in response to Famous_Boi69

http://www.tomshardware.com/reviews/thunderbolt-performance-z77a-gd80,3205-2.html


According to this article, RAID or not, any external hard drive suffers from an extra layer of translation that an internal drive doesn't have. SATA or eSATA talks the native language, so there's no translation going on before writing or collecting your data. So, let's say you have 1 million 4 KB files to write versus 1 single 4GB file: the internal drive will perform better than a RAID array with the 1 million 4 KB files, while the RAID will perform better with the single 4GB file, because it only has to translate once instead of 1 million times, even though it's the same amount of data. If you want the details, check the link. If you use a Thunderbolt drive with no RAID array, you have a chance of getting much closer to the performance of the internal drive, depending on the drive and what controller it uses. That's mentioned toward the bottom of the article.
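A toy model of that point (the 0.5 ms per-file overhead and 200 MB/s throughput below are invented for illustration, not taken from the article): with a fixed per-file cost, the many-small-files case is dominated by overhead rather than bandwidth.

    def transfer_time_s(n_files, total_bytes, throughput_mb_s, per_file_overhead_ms):
        # fixed per-file translation cost plus raw streaming time
        return n_files * per_file_overhead_ms / 1000.0 + total_bytes / (throughput_mb_s * 1e6)

    GB = 1e9
    print("1,000,000 x 4 KB files: %.0f s" % transfer_time_s(10**6, 4 * GB, 200, 0.5))  # ~520 s
    print("1 x 4 GB file:          %.0f s" % transfer_time_s(1,     4 * GB, 200, 0.5))  # ~20 s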
