Disk Utility: for bad blocks on hard disks, are seven overwrites any more effective than a single pass of zeros?

Question

Level 2

261 points

Disk Utility: for bad blocks on hard disks, are seven overwrites any more effective than a single pass of zeros?

In this topic I'm not interested in security or data remanence (for such things we can turn to e.g. Wilders Security Forums).

I'm interested solely in best practice approaches to dealing with bad blocks on hard disks.

I read potentially conflicting information. Examples:

… 7-way write (not just zero all, it does NOT do a reliable safe job mapping out bad blocks) …

— https://discussions.apple.com/thread/1732448?answerId=8191915022#8191915022 (2008-09-29)

… In theory zero all might find weak or bad blocks but there are better tools …

— https://discussions.apple.com/thread/2362269?answerId=11199777022#11199777022 (2010-03-09)

… substitution will happen on the first re-write with Zeroes. More passes just takes longer.

— https://discussions.apple.com/thread/2507329?answerId=12414270022#12414270022 (2010-10-12)

For bad block purposes alone I can't imagine seven overwrites being any more effective than a single pass of zeros.

Please, can anyone elaborate?

Anecdotally, I did find that a Disk Utility single pass of zeros seemed to make good (good enough for a particular purpose) a disk that was previously unreliable (a disk drive that had been dropped).

Intel and PowerPC desktops and laptops, Intel and G4 Xserves, Xserve RAIDs, Mac OS X (10.6.7), Mac OS X 10.5.8 and 10.6.7, Mac OS X Server 10.5.8 and 10.6.7

Posted on Apr 21, 2011 12:22 PM

Reply

Answer 1

Feb 4, 2013 1:13 PM in response to Graham Perrin

@MrHoffman

As well pointed your answers are, you are not answering the original question, and regarding consumer device hard drives your answers are missleading.

Consumer device hard drives ONLY remap a bad sector on write. That means regardless how many spare capacity the drive has, it will NEVER remap the sector. That means you ALWAYS have a bad file containing a bad sector.

In other words YOU would throw away an otherwise fully functional drive. That might be reasonable in a big enterprise where it is cheaper to replace the drive and let the RAID system take care of it.

However on an iMac or MacBook (Pro) an ordinary user can not replace the drive himself, so on top of the drive costs he has to pay the repair bill (for a drive that likely STILL is in perfect shape, except for the one 'not yet' remaped bad block)

You simply miss the point that the drive can have still one million good reserve blocks, but will never remap the affected block in a particular email or particular song or particular calendar. So as soon as the file affected is READ the machine hangs, all other processes more or less hang at the same moment they try to perform I/O because the process trying to read the bad block is blocking in the kernal. This happens regardless how many free reserve blocks you have, as the bad block never gets reallocated, unless it is written to it. And your email program wont rewrite an email that is 4 years old for you ... because it is not programmed to realize a certain file needs to be rewritten to get rid of a bad block.

@Graham Perrin

You are similar stubborn in not realizing that your original question is awnsered.

A bad block gets remapped on write.

So obviously it happens at the first write.

How do you come to the strange idea that writing several times makes a difference? How do you come to the strange idea that the bytes you write make a difference? Suppose block 1234 is bad. And the blocks 100,000,000 to 100,000,999 are reserve blocks. When you write '********' to block 1234 the hard drive (firmware) will remap it to e.g. 100,000,101. All subsequent writes will go to the same NEW block. So why do you ask if doing it several times will 'improve' this? After all the awnsers here you should have realized: your question makes no sense as soon as you have understood how remapping works (is supposed to work). And no: it does not matter if you write a sequence od zeros, of '0's or of '1's or of 1s or of your social security number or just 'help me I'm hold prisoner in a software forum'.

I would try to find a software that finds which file is affected, then try to read the bad block until you in fact have read it (that works surprisngly often but may take any time from a few mins to hours) ... in other words you need a software that tries to read the file and copies it completely, so even the bad block is read (hopefully) successful. Then write the whole data to a new file and delete the old one (deleting will free the bad block and ar some later time something will be written there and cause a remap).

Writing zeros into the bad block basically only helps if you don't care that the affected file is corrupted afterwards. E.g. in case of a movie the player might crash after trying to display the affected area. E.g. if you know the affected file is a text file, it would make more sense to write a bunch of '-' signs, as they are readable while zero bytes are not (a text file is not supposed to contain zero bytes)

Hope that helped ;)

Reply

Answer 2

Sep 29, 2013 4:42 PM in response to Graham Perrin

I take the following to mean that YES, 7-pass (or 3-pass) may find more bad blocks than a single pass, by the mere fact that some blocks may be, well, "iffy" and need more than one pass to be rejected. Please correct me if I am misinterpreting the following:

[I]f you're going to be committing important data to the drive, you may wish to... exercise the drive, by writing and reading data from as many locations as possible for as much time as you can spare.... [A]ny weak spot will show itself now instead of sometime down the road.

From:

Revive a Hard Drive for Use With Your Mac

Scanning for Bad Blocks

This next step will check every location of the drive and determine that each section can have data written to it, and the correct data read back. In the process of performing this step, the utilities we use will also mark any section that is unable to be written to or read from as a bad block. This prevents the drive from using these areas later....

When Disk Utility uses the Zero Out Data option, it will trigger the drive's built-in Spare Bad Blocks routine as part of the erasure process....

[I]f you're going to be committing important data to the drive, you may wish to run one more test. This is a drive stress test, sometimes referred to as a burn-in. The purpose is to exercise the drive, by writing and reading data from as many locations as possible for as much time as you can spare. The idea is that any weak spot will show itself now instead of sometime down the road.

There are a few ways to perform a stress test, but in all cases, we want the entire volume to be written to and read back.

Stress Test With Disk Utility

When Disk Utility uses the DOE-compliant 3-pass secure erase, it will write two passes of random data and then a single pass of a known data pattern. [Or 7-pass will write over data 7 times.] ... Once the erasure is complete, if Disk Utility shows no errors, you're ready to use the drive knowing it's in great shape.

[1]: http://macs.about.com/od/MacTroubleshootingTips/ss/Reviving-A-Hard-Drive-For-Use -With-Your-Mac.htm

Reply

Answer 3

tedtv

Level 1

10 points

Aug 28, 2016 8:27 AM in response to Graham Perrin

Last post on this was March 2016. Hopefully I get an answer. But I just bought SoftRAID for my Mid 2010 Mac Pro running El Capitan, four internal SATA, and an external JBOD SansDigital RAID box with Sonnetch Tempos Duo. I'm curious about using the SoftRAID Write Zeros to first and last Sectors function. I see many saying it is a waste of time. I'm a video editor so backup is very important. I've been replacing drives the past month and I will replace these with the bad sectors since SoftRAID is recommending they be removed but I'm still curious about the Zeros to Disk function and if SoftRAID has a good reputation as software that is good at repairing drives vs. the Mac Disk Utility. Also, if I engage the Zeros to Disk function will it totally erase the drive or just write to those sectors leaving all my other data alone? On a side note, knocking on wood, these Hitachi, and Samsung, internal drives have served me well. And well past the average 3 year life span of drives. Serious work horses.

Reply

Answer 4

Graham Perrin Author

Level 2

261 points

Apr 25, 2011 9:37 AM in response to MrHoffman

Thanks for another helpful answer.

I should have stated at the outset that I'm thinking of any disk drive that does have (a) functional bad block recovery and (b) sufficient spare blocks.

(Giving an example of just one such disk seems to have caused confusion.)

… tangling together of data remanence and associated overwrite recommendations, and the typical bad block revector-on-write support, and RAID. …

I did, still do, half-suspect a tangling of some sort. https://discussions.apple.com/thread/1732448?answerId=8191915022#8191915022 is archived so I can't reply there FAO The hatter.

Maybe a better way of phrasing my original question … defocusing from Disk Utility, considering the secureErase verb of diskutil:

is a random fill pattern — or the first pass of any type of fill pattern associated with level 2, 3 or 4 — any more likely than a pattern of zeros to trigger spare block substitution?

The October 2010 post by Grant Bennet-Alder seems clear enough re: the effect of a pattern of zeros.

Still, I'd like to leave this question open/unanswered for a while, in case people have anything to add concerning fill patterns for spare block substitution purposes.

Reply

Answer 5

Nov 18, 2013 12:41 PM in response to Graham Perrin

Hi Graham:

There are several things at play here. First, security, the second is mapping out a bad block. Don't confuse the two.

If a contemporary HD has a bad sector in it, one of the "tricks" used to force it to be mapped out is to use the security feature of Disk Utility or other tools to write zeros to the bad block. When the write fails, the controller in the HD remaps the sector to a spare sector, if there are any left. Doing this seven times won't make a difference. It's either going to re-map or it isn't.

I would suggest the 7 pass information you're getting is either from someone that's confusing security with block re-mapping, or they're thinking of old ST-506 based MFM (circa 1980s) HDs. In the latter, the old MFM drives were expensive ($300 for a 10MB drive NOTE: that's an "M", not a "G") and it was worth making an attempt at repairing them. The technique was to go to that bad sector and over write it many times iin hopes of "clearing" it, and if not put the sector into a list of bad sectors. This is really, really old stuff. No one does that anymore. I believe one of the tools that did this was Spinrite. I don't remember for sure. With age comes fog.😉

I recently had to go through a lot of computers that we were originally considering re-selling, but ended up selling or giving them away to staff:

https://discussions.apple.com/thread/5385790?answerId=23798487022#23798487022

I ended up getting Scannerz because it could scan the drives faster than anything else and it could be used for more advanced troubleshooting. In one of the sections of the manual they tell you how to re-map the bad blocks, and they tell you to do it using Disk Utility by zeroing it. They say nothing about doing this 7 times. If a bad block will re-map it will do it on the first try.They also give hints to imply you might be better off getting a new drive.

From a cost standpoint, attempting to fix a hard drive probably isn't worth it anymore. If I have a technician that's getting paid $20/hour and you figure in overhead costs that wage goes up a lot. If I had one of these guys go into troubleshooting stuff like that, the costs could likely run into hundreds of dollars, for what is now likely an inexpensive part. I can get a 250GB new HD here in the U.S. for just over $40. Why spend more money trying to repair one that may end up being unrepairable if you can just get a newer, likely even better drive?

I've seen the "zeroing" trick work. If there are spare sectors and the bad blocks are limited, they will remap. However, Google did a study on HD failures that implies once a drive has detected errors, more are likely to follow. I don't have a link for it, but it's on the web. IIRC the life expectancy of a drive that's showing errors is 6 months or less.

Reply

Answer 6

BobHarris

Level 9

57,161 points

Mar 11, 2016 6:01 PM in response to sheffi

If the volume is encrypted, you just destroy the keys, and forget the data on the disk because with out the keys it is just a bunch of random bits.

If you insist on erasing, then 1 pass of zeros is good enough because even if someone can recover the random bits, without the encryption key there is nothing they can do with those random bits.

All you ever have to do with a FileVault encrypted disk, is ask Disk Utility to reformat the drive putting a new empty file system on it. That will erase the encryption keys, and if you do not have a separate copy, then once the keys are gone the encrypted bits on that drive are useless to everyone.

I am sorry to say Apple...

Apple is not reading this forum. It is just for User-to-User problem solving. If you wish to communicate with Apple, try the Feedback page

<http://www.apple.com/feedback/macosx.html>

Reply

Answer 7

Graham Perrin Author

Level 2

261 points

Dec 14, 2012 7:44 PM in response to Graham Perrin

Now in Ask Different:

For bad and/or marginal blocks on hard disks, can anything be more effective than a single pass of zeros?

Reply

Answer 8

MrHoffman

Level 10

149,931 points

Apr 21, 2011 3:27 PM in response to Graham Perrin

Reformatting or overwriting a disk in an attempt to cause it to repair overt bad blocks is usually a waste of time and effort, in my experience.

By the time the disk is tossing enough visible errors (that can't be overcome using the device's EDC recovery and whatever RAID might be in use), the disk has usually failed; has exceeded its ability to apply whatever replacement sector scheme and error recovery it might possess. I'd replace it.

If the disk was effectively dealing with its inherent errors on read, then you should not be seeing these errors. With the read, you should either get the data back from the disk, or you get the data back with the assistance of the EDC, or you get an error and you get the data from elsewhere within the RAID, or you get an error and presumed-bogus data if the EDC and RAID cannot recover the data.

And rather than the whole-disk overwrite, an attempt to write a bad sector with a typical disk should automatically cause the disk to re-vector the write over to a spare, meaning you shouldn't need to do the wholesale overwrite.

If you're using the overwrite as a disk hardware test, that's another discussion. I'd probably use a different tool, and specifically targeted to disk verification, but certainly setting known patterns and then verifying them is entirely feasible.

To see what the disk thinks is going on with the errors, query the SMART data. There are a few canaries in SMART, particularly including scan errors, as well as the reallocation count, offline reallocation and probational count values. Those tend to point to an impending failure. The rest of SMART probably isn't as predictive, and SMART in general isn't all that good at predicting failures.

And a disk drive that had been dropped? I'd probably expect to swap it.

Reply

Answer 9

Graham Perrin Author

Level 2

261 points

Apr 24, 2011 5:45 AM in response to timg17

I realise the problem, I don't expect a fix.

Focusing again on the subject line: I would like to know whether for bad block purposes, seven overwrites are any more effective than a single pass of zeros.

Reply

Answer 10

MrHoffman

Level 10

149,931 points

Apr 24, 2011 8:39 AM in response to Graham Perrin

Reformatting or overwriting a disk in an attempt to cause it to repair overt bad blocks is usually a waste of time and effort, in my experience. Whether you run zero passes, one pass, a thousand or a billion passes, you (still) either have a disk with effective and functional bad block recovery (and sufficient spare blocks), or one that doesn't.

Multiple passes are an attempt to avoid data exposures through data remanence via overwriting any slightly off-track recording of data (and that's a specialized attack, and requiring specific hardware), and you've explicitly excluded that topic from the discussion here.

Put another way, this seems to be an unfortunate tangling together of data remanence and associated overwrite recommendations, and the typical bad block revector-on-write support, and RAID. All with what looks to be failing hardware. And given the reported industry average of three to six hard errors per terabyte, RAID is one of the few available mechanisms for recovery from failed blocks, for when you encounter a block error.

Run as many passes as you want. I'd swap the disk. By the time disks are overtly tossing multiple (visible) errors, they're not devices I would trust with my data, and not worth further use. (Why "overt"? The typical three to six hard errors usually don't show.)

Reply

Answer 11

Nov 7, 2012 10:55 AM in response to Graham Perrin

Actually, despite all the noise in this thread, Disk utility is able to remove bad blocks from the usable pool on a drive when it is used to erase the a drive and write all zero's.

Backup your data first!!

Now, the point about "your hardware is failing" isn't necessarily wrong, but it isn't necessarily right either. If you have a situation where a drive has bad sectors, and it's not easy to replace it, using disk utility can help.

I have resurrected many hard drives and iPods this way, and MOST of them continue working indefinitely.

Good Luck,

Marty

PS Obviously in any mission critical application you replace the damaged drive ASAP.

Reply

Answer 12

MrHoffman

Level 10

149,931 points

Dec 15, 2012 7:59 AM in response to Graham Perrin

Go talk with the disk vendors. (If they're interested in discussing this, and that's not a certainty.) Or acquire data from a large provider, as Google and CMU did with their technical papers on this topic; here are links to some of the papers.

My comments (above) are from common practice within large and enterprise businesses. Dinking around with a comparatively cheap disk that's tossing visible errors is not an effective use of repair time, and risks reoccurrences.

Disks always toss errors. If they're working reasonably and the error is within the error detection and correction (EDC) available in the drivers and/or firmware, you'll get the data corrected and block transparently flagged for replacement on the next block write. If the error is outside what the EDC can recover, you're rolling in backups.

Can that disk that's tossing visible errors be functional? Sure.

Can the errors be isolated? Certainly.

Can pattern overwrites be used to force revectoring, or to demonstrate flaws in the EDC? Yes. Rolling in a full disk backup can have similar effects.

But my preference is to swap the disk. Preferably swapping for an equal or better grade of disk, as the cheap disks are usually cheap for a reason; shipped with fewer spare blocks, or with less reliable EDC, or other such. Disks are cheap. The cost of the repairs and data loss are more expensive.

Obviously: configure and maintain RAID or Time Machine or some other form of backup, if the data matters.

Reply

Answer 13

MrHoffman

Level 10

149,931 points

Feb 5, 2013 5:55 AM in response to Angelo Schneider

Thank you for your feedback, Angelo Schneider. Your reply refers to rewriting files or reloading data, and to the error recovery sequences which involve sector writes, which can via by rewriting the file or rolling in a copy of the file or through normal operations, and not to erasing or reformatting the drive.

On rereading my replies, I did indicate that; that reloading and rewriting and RAID can resolves issues with bad blocks through rewriting. As you correctly indicate, that's how sectors are revectored.

The original poster was specifically interested in using erasure and reformatting as a step for recovering from bad blocks, which isn't a step I view as being necessary or even useful, and — when folks are resorting to reformatting a drive for bad blocks — is usually an indication that the drive is long past replacement.

As for why the OP is referring to a multi-pass operation, some folks use that as a scrubber or as a way to expose additional failing blocks. In the absence of a disk diagnostic, overwrites are also sometimes used for disk qualifications, and to try to force a disk over into overt failure due to heating or other problems. (Not my preference, given I've had disks that don't read back the same data that was written; one had a two-byte "slip" in what was written, about two thirds through the disk.) Sometimes a RAID build or rebuild is used, too.

Yes, I do replace drives. Quite possibly earlier than some folks choose to. I just replaced a few disks yesterday, though those were well and truly bricked. As for costs of replacements, my data and the data of the folks I know and my time is worth more than the savings that I might accrue from successfully repairing a questionable disk drive.

Again, thanks.

Reply

Answer 14

MrHoffman

Level 10

149,931 points

Sep 29, 2013 5:56 PM in response to worldpoop

Write operations should trigger a block replacement, whether it's an overwrite or just writing application data.

"Rotating-rust" magnetic disks do usually either fail fairly quickly after installation and first use, or will eventially wear out after some amount of usage, so always have backups of your data. If your data is important enough, then RAID (which is not a backup) and have on-site and off-site backups of your data.

Reviving a questionable hard drive is... not a strategy I'd recommend. As I've commented up-thread. More than a few folks have tried that approach over the eons — even back when some half-gigabyte drives weighed 68 Kg and 19" wide and cost ~US$12000 and more — because — well, for any of various reasons — and I cannot recommend that approach.

Once a disk device flakes out, I replace it.

Particularly these days.

Disks are too cheap these days and the data on that disk is too valuable to mess around with a questionable disk device. Once a disk starts throwing visible bad blocks, it's usually on its way to failure. Sure. You might get lucky, and might have encountered an isolated failure. If not and if the disk is more generally failing, then you risk your data (again) and spend more time shuffling the data around.

Reply

Answer 15

Sep 29, 2013 11:06 PM in response to MrHoffman

Hello -Thank you for your response! These are excellent and informed recommendations (truly) about whether or not to revive drives versus purchasing new ones. Basically, in most cases one is better off off buying a new good hard drive. If you care about data, that's good advice.

Meanwhile, like the original poster above, I too have a situation where I seek the best way to unearth bad blocks on an existing hard drive. We are not alone in this big world I'm sure (though it may be lonely!) 🙂 The salient technical question re: manner of practice is this: Does multi-pass make a difference?

On the question of one vs. multi-pass rewrite erase, this author bases advice on an assumption that multi-pass serves as a "burn in" (also called a "stress test") to indeed increase detection of poor blocks. Does that assertion have merit -- it is technially accurate?

Thank you!

Reply