Filesystem corruption in Mac OS X

I have just run an md5sum check on a 300GB partition which I use as a "reference" partition, mounted read-only 99% of the time. On it I keep software, music, e-books, and other reference documents which will not change. Most files have an md5sum checksum attached to them, kept in a separate file.
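For concreteness, the bookkeeping looks roughly like the following minimal sketch. It assumes the GNU coreutils `md5sum` tool (the tool bundled with Mac OS X is `md5`, which has no `-c` verify mode); the file names are illustrative:

```shell
#!/bin/sh
# Sketch: keep a checksum file next to each reference file, verify later.
# Assumes GNU coreutils md5sum; file names are illustrative.
set -e
mkdir -p refdata
printf 'reference data that never changes\n' > refdata/book.txt

# Record the checksum in a separate file alongside the data.
( cd refdata && md5sum book.txt > book.txt.md5 )

# Years later: verify. md5sum -c exits non-zero if the data changed.
( cd refdata && md5sum -c book.txt.md5 )
```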

I run an md5sum check once in a blue moon; I guess I ran the last one some time in the past 5 years. I ran one today and discovered, with great alarm, that 5 files, none of which I have touched for years and years, have failed the md5sum check.

For the files I was able to recover from another source, I found that the failure in every case involved a difference of an entire 4kB sector. In each case the data was read incorrectly with no warning of a bad sector or a bad read, and no indication that the data might be corrupt. Is this more likely to be a hardware failure or a kernel filesystem software failure? Has anybody ever conducted a systematic study of data integrity guarantees in Mac OS X filesystems? What is the hardware error rate of a typical disk? How reliable is the filesystem, and in what circumstances can corruption occur? Is this well understood?

iBook G4, Mac OS X (10.4.11)

Posted on Dec 3, 2007 11:20 AM


Dec 3, 2007 5:42 PM in response to Dr. T

Dr. T wrote:
What is the hardware error rate of a typical disk?


100%

How reliable is the filesystem, and in what circumstances can corruption occur? Is this well understood?


If there were something, anything, wrong with the HFS+ filesystem, or any filesystem you are likely to be using, we would have heard about it a long time ago.

You say you haven't touched this partition in 5 years? That is 2 years past the time for the average hard drive to die. Modern hard drives are designed to be big, fast, and cheap. They don't go in for SMART status or bad sectors like they used to. They just die, and they tend to do it every 3 years.

Dec 3, 2007 9:54 PM in response to etresoft

Well, if a disk crashes, at least you know that the data are compromised (or unavailable). What Jun is asking is whether subtle changes in files over time, which would escape detection except via md5sum or a similar mechanism, are common.

I have to store about a terabyte of data and I too am worried about this.

I have heard HFS+ is more fragile than flat file systems.

Dec 4, 2007 5:44 AM in response to Bill Scott

Bill Scott wrote:
I have heard HFS+ is more fragile than flat file systems.

You mean like the Macintosh File System? With a terabyte of data, you'll at least need an external floppy drive.

Well, if a disk crashes, at least you know that the data are compromised (or unavailable). What Jun is asking is whether subtle changes in files over time, which would escape detection except via md5sum or a similar mechanism, are common.


There is nothing about a particular file system that is going to make that happen. There are complaints about HFS+, but no one has ever said it randomly corrupts data. That would have been noticed at some point in the past 20 years. All HFS+ keeps track of are file names and directory structures. It doesn't care about the file data itself except for handling fragmentation.

The original poster was concerned about a 300 GB partition that hasn't changed in 5 years. However, those read-write heads have had about 400,000 hours of use on them due to the other partitions. The data on that 300 GB of unused platter space may be fine, and it may not be. Regardless, it is a stretch to try to blame it on HFS+.

I have to store about a terabyte of data and I too am worried about this.


Don't buy your hard drives from Best Buy. Buy some decent, industrial drives and put them on a RAID. You'll probably want tape backup too. If you are really worried about HFS+, by all means, use something else. Just don't assume then that you are immune from data corruption.

Dec 4, 2007 6:05 AM in response to Dr. T

{quote:title=etresoft wrote:}
If there were something, anything, wrong with the HFS+ filesystem, or any filesystem you are likely to be using, we would have heard about it a long time ago.
{quote}

I would dispute this. It is not well known, I believe (although I would hope it is at least moderately well known), that when you read a data-grade CD-ROM or DVD-ROM on Mac OS X you will, with some regularity (more than 1 sector per 6GB read, in my experience), get a silently mis-read sector. Yet this is factual.

{quote:title=Dr. T wrote:}
What is the hardware error rate of a typical disk?
{quote}
{quote:title=etresoft wrote:}
100%
{quote}

I am interested in a rate in the form of the number of sectors silently read back differently from the data originally written, per amount of data written. Thanks.

{quote:title=etresoft wrote:}
You say you haven't touched this partition in 5 years? That is 2 years past the time for the average hard drive to die. Modern hard drives are designed to be big, fast, and cheap. They don't go in for SMART status or bad sectors like they used to. They just die, and they tend to do it every 3 years.

The original poster was concerned about a 300 GB partition that hasn't changed in 5 years. However, those read-write heads have had about 400,000 hours of use on them due to the other partitions. The data on that 300 GB of unused platter space may be fine, and it may not be.

Don't buy your hard drives from Best Buy. Buy some decent, industrial drives and put them on a RAID. You'll probably want tape backup too. If you are really worried about HFS+, by all means, use something else. Just don't assume then that you are immune from data corruption.
{quote}

I run the md5sum checksum check only occasionally. The data is occasionally copied from one disk to another, but whenever I do this, I not only check the md5sum checksums, I also take precautions to ensure that all data caches are flushed before running a full compare between the original and the copy. The current disk is less than 2 years old, and the partition is the sole partition on that drive. The disk is a standard Western Digital server-grade drive. You may prefer buying Best Buy disks yourself; I will leave that to you.
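The flush-then-compare step amounts to roughly the following sketch (file names are illustrative; note that `sync` only flushes dirty write buffers and does not empty the read cache, so unmounting and remounting the volume is the thorough way to force a genuine re-read from the platters):

```shell
#!/bin/sh
# Sketch of the copy-then-verify step: flush buffers with sync, then
# do a full byte-for-byte compare. File names are illustrative.
set -e
printf 'reference payload\n' > original.dat
cp original.dat copy.dat

# Flush dirty buffers to disk. (sync does not drop the read cache;
# unmounting and remounting the volume forces a true re-read.)
sync

# cmp exits non-zero at the first differing byte.
cmp original.dat copy.dat && echo "copy verified"
```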

{quote:title=etresoft wrote:}
You mean like the Macintosh File System? With a terabyte of data, you'll at least need an external floppy drive.
{quote}

I thought he meant the sort of professional-grade filesystem used in the Unix world, such as perhaps ZFS or ReiserFS, but I may be mistaken.

Dec 4, 2007 6:25 AM in response to Dr. T

Perhaps 1 sector/6GB is an exaggerated figure. More precisely, I have found the following. I occasionally scan software CDs and keep them on a large disk as it's less hassle to mount the disk images than having to retrieve the individual CDs each time; every time I do so, I make at least 5 different copies and compare them. I have found that at least 1 time in 10, at least one of the 5 copies is different from the other 4.
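The comparison amounts to grouping the copies by checksum: if more than one group appears, at least one read was bad, and the majority group is the likely-good image. A minimal sketch, simulated here with plain files and one deliberately corrupted copy rather than real optical reads (all names are illustrative):

```shell
#!/bin/sh
# Sketch of the "compare N copies" check, simulated with plain files
# instead of real disc reads; file names are illustrative.
set -e
printf 'disc image contents\n' > master.iso
for i in 1 2 3 4 5; do cp master.iso copy$i.iso; done
printf 'corrupted read\n' > copy3.iso   # simulate one silent bad read

# Group the five copies by checksum. More than one group means at
# least one copy disagrees; the majority group is the likely-good one.
md5sum copy?.iso | awk '{print $1}' | sort | uniq -c | sort -rn
```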

Now, this is undoubtedly partly the fault of faulty CD media; but it is a fault of Mac OS X that the read errors are silent, i.e. that a read failure is not reported to the user as a hard error. If I were not aware of this and did not make 5 copies each time, I would end up with a substantial number of bad copies without knowing it, and I have no doubt many people do, all the time.

Also, I would like to point out that in Mac OS X 10.4, the implementation of UFS has the following bug, which systematically results in data corruption:

- mount /dev/diskX /mountpoint
- write some data to /mountpoint, then immediately run: mount -ur /dev/diskX

will often fail to update the data on the disk correctly, whereas

- sync; mount -ur /dev/diskX

appears to cure the problem. I reported this bug to Apple a long time ago, and I do not know whether they have fixed it or not - I suspect not!

I doubt this is (sufficiently) well known either.

So forgive me for having doubts about your assertion that a bug silently corrupting data on HFS+ could not go unnoticed for many years.

Dec 4, 2007 6:50 AM in response to Dr. T

Dr. T wrote:
I would dispute this. It is not well known, I believe (although I would hope it is at least moderately well known), that when you read a data-grade CD-ROM or DVD-ROM on Mac OS X you will, with some regularity (more than 1 sector per 6GB read, in my experience), get a silently mis-read sector. Yet this is factual.


Those are optical media consisting of some clear plastic and optical film. Plus, they don't even use HFS+, so it is a bit of a moot point.

I am interested in a rate in the form of the number of sectors silently read back differently from the data originally written, per amount of data written. Thanks.


In other words, what percentage of written data will be read differently? That depends on a number of factors such as age of drive, hours of use, usage conditions, environmental conditions, etc. File system used on the disk is irrelevant. All disks die eventually - some after 6 months, some after 5 years.

I run the md5sum checksum check only occasionally. The data is occasionally copied from one disk to another, but whenever I do this, I not only check the md5sum checksums, I also take precautions to ensure that all data caches are flushed before running a full compare between the original and the copy. The current disk is less than 2 years old, and the partition is the sole partition on that drive. The disk is a standard Western Digital server-grade drive. You may prefer buying Best Buy disks yourself; I will leave that to you.


I actually buy my disks from Other World Computing, but they are the same drives that Best Buy sells. I've never run checksums on all my data and, frankly, I'm not going to. It isn't worth my time. If your data is critical and it is worth your time, you need to use professional quality hardware. A single WD drive does not qualify. You need to look at a RAID, at a minimum. Tape would be good too. The professional storage systems are designed to be big, fast, and automatically recover from failures like what you have experienced. They don't come cheap and WD doesn't sell them. Look at Sun, IBM, HP, EMC, those kind of people.

I thought he meant the sort of professional-grade filesystem used in the Unix world, such as perhaps ZFS or ReiserFS, but I may be mistaken.


You can never be too sure. Ask a specific question and you'll get a specific answer.

It is a stretch to call ZFS "professional-grade". It is brand new. There is experimental support for it in 10.5 but it will be a while before it is ready for the general public. In any event, ZFS on a single drive won't be that much better than HFS+ on a single drive. It is only as good as the media. If the media has no redundancy, neither does the file system.

ReiserFS is a Linux file system, not Unix. There is a difference. It has its own set of problems. Again, it would be a stretch to claim ReiserFS is significantly different from, or better than, any other modern file system.
