Ambiguous RAID failure
Early 2008 Mac Pro with Apple RAID card and 4 x 1TB drives installed.
I've had yet another RAID failure on my system (it has happened several times before), but this time the diagnostic is ambiguous, and I need to be absolutely sure what has happened before I try to recover from it.
Overnight the RAID Utility log reported that Drive 3 had failed, that Raid Set RS1 was now degraded, and that there was no spare available for rebuild.
When I look now in RAID Utility at the status of the drives and of the array, all four drives show "green" (SMART: Verified and Status: Good), and Raid Set RS1 is "Viable (Degraded)". But it shows the drives in Bays 2, 3, and 4 as "Assigned" to Raid Set RS1, while the drive in Bay 1 is not: it shows as "Roaming".
I'm fairly sure that one of the drives really is failing, because I've been having increasingly frequent episodes of freezing and non-responsiveness on the system (spinning beachball). In the past couple of days it got so bad that it was difficult to do anything at all after a restart; the freeze/beachball set in almost immediately. I remember now that I had exactly this symptom in the past, just before a drive failure that RAID Utility reported.
So I guess I need to replace one of the drives, mark it as "Spare" in RAID Utility, and let the array rebuild.
But WHICH drive should I replace? The log says that Drive 3 failed (I'm assuming that "Drive 3" is the drive in "Bay 3"), but now that drive shows as "good"--as do all four drives. It's Drive 1 (i.e. the drive in Bay 1) that's been taken out of the array; drives 2, 3, and 4 are in Raid Set RS1. Is that a red herring? Is it possible that Drive 1 is bad even though the report was about Drive 3? (Drive 1 is the only drive that has never been replaced at any time in the four years since I got this system.)
I think/fear that if I replace Drive 3, I'll blow away the array.
So it seems to me that I should rebuild the array by marking Drive 1 as "Spare" (since it's the only drive that's unassigned), wait for the rebuild to complete, and then replace Drive 3 and rebuild again. Or maybe I should just replace Drive 1 pre-emptively.
I don't know which is right. A rebuild takes a full 72 hours, and it's a nerve-wracking time because the array is vulnerable to a second drive failure throughout, so I would prefer not to have to do it more than once.
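To put rough numbers on that worry, here is a quick back-of-the-envelope sketch. It assumes a 1TB drive is 10**12 bytes and that a rebuild rewrites the whole replacement drive once; both are my assumptions, not anything RAID Utility reports, and the function names are just made up for illustration.

```python
# Assumptions (mine, not from RAID Utility): 1 TB = 10**12 bytes, and a
# rebuild rewrites the entire replacement drive exactly once.
DRIVE_BYTES = 10**12
REBUILD_HOURS = 72  # rebuild time observed on this system


def implied_rebuild_rate_mb_s(drive_bytes=DRIVE_BYTES, hours=REBUILD_HOURS):
    """Average write rate (MB/s) implied by rebuilding one drive in `hours`."""
    return drive_bytes / (hours * 3600) / 1e6


def degraded_hours(rebuilds, hours_per_rebuild=REBUILD_HOURS):
    """Total hours spent rebuilding (array degraded) across back-to-back rebuilds."""
    return rebuilds * hours_per_rebuild


if __name__ == "__main__":
    print(f"implied rebuild rate: {implied_rebuild_rate_mb_s():.2f} MB/s")
    print(f"one rebuild:  {degraded_hours(1)} hours exposed")
    print(f"two rebuilds: {degraded_hours(2)} hours exposed")
```

Under those assumptions, rebuilding onto Drive 1 and then replacing Drive 3 would mean two full rebuild windows back to back rather than one, which is why I want to pick the right drive the first time.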
Can someone please tell me in detail what the safest/most correct way is to proceed in order to recover from this?
Thanks.
Mac Pro (Early 2008), Mac OS X (10.7.4)