Why do snapshots take so long for First Aid to process?

Had a short power outage this evening. One of my backup disks is not on UPS. So I thought I'd run First Aid on it to be sure there are no errors. My question is, why does it take so long for First Aid to process the snapshots? Is it reading each byte a dozen times? Each snapshot takes several minutes or more, and with 69 of them now, it's going to take hours! Is there some way to skip the snapshots? Maybe by using the fsck_apfs command in Terminal? (NOTE: This is a backup disk, not the system disk [startup disk], so I can't delete them, as they are the actual backups!)


Disk Utility 21.5

macOS 12.7

24" 4-port M1, iMac21,1


TIA!

iMac 24″, macOS 12.6

Posted on Oct 12, 2023 8:36 PM

Question marked as Top-ranking reply

Posted on Oct 13, 2023 10:03 AM

betaneptune wrote:

why does it take so long for First Aid to process the snapshots? Is it reading each byte a dozen times?

If you have a dozen snapshots, then yes. If you have 69, then it will take about 5 times longer than that.


Each snapshot is a self-contained file system. It contains only the file pointers and low-level block pointers. That's why the snapshot itself is so small. But each snapshot represents almost the entirety of the disk, with only very small differences between each snapshot. So yes, you are churning virtually your entire disk 69 times.
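To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes every snapshot costs the same and uses the 5 to 10 minutes per snapshot reported in this thread; the function name is made up for illustration, and real runs will vary.

```python
def first_aid_minutes(snapshots, minutes_per_snapshot):
    """Total minutes, assuming every snapshot takes the same time to check."""
    return snapshots * minutes_per_snapshot

# 69 snapshots at an assumed 5 and 10 minutes each
low = first_aid_minutes(69, 5)
high = first_aid_minutes(69, 10)

# Convert minutes to (hours, minutes) for easier reading
low_h, low_m = divmod(low, 60)
high_h, high_m = divmod(high, 60)
print(f"{low_h}h{low_m:02d}m to {high_h}h{high_m:02d}m")  # 5h45m to 11h30m
```

So "hours" is about what you would expect from a per-snapshot cost of several minutes.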


Is there some way to skip the snapshots? Maybe by using the fsck_apfs command in Terminal? (NOTE: This is a backup disk, not the system disk [startup disk], so I can't delete them, as they are the actual backups!)

The only way to skip them is to delete them. You could delete all but the last one. People have an unnatural affinity for the history of their backups. It's a nice trick that Time Machine has. You can usually recover files that you created in May and then deleted in June. But that requires the historical data. If you're concerned that your power outage has corrupted the disk, then you don't really have any options, as long as you insist on preserving that data from May.


If you value your backup drives, you have to keep them on UPS. A better idea is to have multiple drives and not keep all of them connected all of the time.


Also, APFS is supposed to be more resilient to standard types of corruption. I regularly get "disk ejected" warnings when no disk has been ejected, and it doesn't seem to cause any harm.


Oct 13, 2023 1:29 PM in response to betaneptune

betaneptune wrote:

Still, several minutes, like 5 to 10 or longer, for each snapshot?

A snapshot is nothing more than the index for all the files and blocks used on disk. It is essentially a copy of the entire disk filesystem when the snapshot was made, not including the data blocks. Those are shared between all snapshots. In other words, each snapshot is the entire disk. Several minutes to scan the entire disk is not unreasonable.
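As a toy illustration of that idea (this is not APFS's real on-disk format, and all the names below are invented for the example), you can picture each snapshot as a complete index that points into one shared pool of data blocks:

```python
# Toy model only: this is NOT how APFS actually stores anything.
# Shared pool of data blocks, keyed by a made-up block number.
blocks = {0: b"report v1", 1: b"photo", 2: b"report v2"}

# Each snapshot is a complete index: file name -> block number.
snapshots = [
    {"report.txt": 0, "photo.jpg": 1},  # before report.txt was edited
    {"report.txt": 2, "photo.jpg": 1},  # after report.txt was edited
]

def entries_to_check(snaps):
    """A checker that walks every snapshot visits every index entry."""
    return sum(len(index) for index in snaps)

def unique_blocks(snaps):
    """Blocks actually stored, however many snapshots point at them."""
    return {block for index in snaps for block in index.values()}

print(entries_to_check(snapshots))    # 4 index entries walked
print(len(unique_blocks(snapshots)))  # 3 data blocks stored
```

The storage cost grows only with what changed, but the checking cost grows with the number of snapshots, because each index has to be walked in full.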


Never run Disk First Aid on a Time Machine backup. In addition to taking forever, it will likely reduce the lifespan of the disk, repeatedly scanning it that many times. Don't try to copy it either. Don't try to delete any of the files. Time Machine is designed to be plugged in and forgotten about until you need it. The more manual maintenance you do, the less reliability and the more problems you will have.

Oct 18, 2023 5:01 AM in response to betaneptune

betaneptune wrote:

Anyway, the claimed speed on these TM drives (OWC), at least currently, is "up to 252MB/s real-world speed." But that's probably for a large file, not something like snapshots, which have many thousands of files and folders. I assume there's a lot of overhead with that structure!

Snapshots don't have any files at all. A snapshot is the entire disk index. It is a pointer to every single file that was on the disk when the snapshot was made. Large files that have been fragmented and/or modified will have multiple pointers.


You can do Get Info in Disk Utility on your Data volume and it will tell you how many files you have. Mine has about 5.5 million. Multiply that by the number of snapshots you have.
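As a quick worked example of that multiplication in Python: the 5.5 million figure is from my volume and the 69 snapshots are from the original question, so pairing them is only an assumption to show the scale.

```python
files_per_snapshot = 5_500_000  # file count from Get Info (one example volume)
snapshot_count = 69             # snapshot count from the original question

# Every snapshot indexes every file, so the entries multiply.
total_entries = files_per_snapshot * snapshot_count
print(f"{total_entries:,} index entries")  # 379,500,000 index entries
```

That is a lower bound on pointers, since fragmented or modified files carry more than one.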

betaneptune wrote:

Running AJA on my external SSD I get about 800 MB/s average. Not bad! The M1 Mac gets like 2300 MB/s!

I still think it takes an incredibly long time for "First Aid" to run if you have a fair number of snapshots. Wow, I just ran AJA on the internal drive and got 3139 MB/s write and 1898 MB/s read!

The external SSD: Just got 838 MB/s write and 766 MB/s read. I think the advertised value was "up to 1025 MB/s" or similar. 

As HWTech says, your mechanical hard drive will run, at best, ten times slower than any of those.


However, those are sequential read and write speeds. Unless you are actively backing up an entire hard drive or a very large, multi-GB file, those speeds are not relevant. Instead, you have to look at the file system speed. This is the speed at which the file system creates files and directories and manages pointers and physical blocks. There is less speed difference here. The fastest SSDs are typically only 3 times as fast as mechanical hard drives.


I only know of one tool that measures file system speed and I can't mention it here. Plus, it only measures the performance of boot drives. I figured that no one would ever need to measure the speed of an external when they have these super-fast internal drives. I guess consumer behaviour proved me wrong on that. Oh well. People might try to run Disk First Aid on a Time Machine drive once. But they sure won't try it twice! 😄

Oct 12, 2023 10:26 PM in response to betaneptune

Why it takes the time it takes is the domain of the Apple developers. We have no idea why Apple chose to make things work the way they do.


If the process is "stuck," halt it and try again ... things happen.


You can erase your backup drive with no harm to your system drive any time you want to. It is a backup after all. If the dataset is corrupt, and it might be, erase the drive and reestablish Time Machine for the drive. The time interval between erasing the backup and having a new backup is very likely too short for a properly operating Mac to be "unsafe". After all, if the data is corrupt, what choice do you have?


If it passes First Aid, and sometimes you do have to run it more than once, then you're home free. But then you have to consider it did take more than once to complete First Aid. That may or may not be significant.


If the backup drive is five years old, or older, it should be replaced. Five years is a good life and if your data is important, replacing an old backup drive is a good idea.

Oct 15, 2023 1:25 PM in response to etresoft

One more thing about this. Spotlight seems to be keeping the backup disks really busy for long periods of time. I haven't yet obtained a tool that can prove this, but one or the other of my Time Machine drives frequently makes noise and keeps its light flashing for long periods of time; while at the same time, the process mds is very active on the CPU page of Activity Monitor. So I'm guessing that macOS (Spotlight, or whatever) is indexing the Time Machine drives. I don't know why it needs to do that, but AFAIK it can't be avoided. NTL, in light of your post, I'm a lot less likely to run First Aid on them!

Oct 17, 2023 8:27 PM in response to betaneptune

Hard drives are slow. Many can transfer data at only about 80 MB/s, and the fastest single drives may reach 120 MB/s. Most will be closer to 80 MB/s at best, and portions of the drive can manage only 40 MB/s, since the location being accessed on the platter affects the transfer rate, as does the number of read heads available on the drive. If the hard drives are in a RAID 5/6 configuration, their performance can sometimes be much faster than a RAID 1 mirror (RAID 0 is fast, but very easy to break and not recommended for most people).
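To see what those rates mean in practice, here is a small Python sketch of how long a single sequential pass over a drive would take. The 16 TB capacity is the drive size mentioned earlier in this thread; the helper name is invented, decimal units are assumed, and a real First Aid run does not read the disk this way, so treat it as a sense of scale only.

```python
def hours_to_read(capacity_tb, rate_mb_per_s):
    """Hours to read the whole drive once, sequentially, at a steady rate."""
    capacity_mb = capacity_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return capacity_mb / rate_mb_per_s / 3600

# Slow platter region, typical rate, and best case for a single hard drive
for rate in (40, 80, 120):
    print(f"{rate} MB/s: {hours_to_read(16, rate):.1f} hours")
```

Even at the best-case rate, one full pass over a 16 TB hard drive is well over a day.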


Plus, some hard drive models are extremely slow, especially if they are shingled (SMR) drives, which Western Digital loves to sell to people without properly disclosing the type of drive or its slow speed.

https://arstechnica.com/gadgets/2020/04/caveat-emptor-smr-disks-are-being-submarined-into-unexpected-channels/


https://en.wikipedia.org/wiki/Shingled_magnetic_recording


Unfortunately, SSDs are not always fast. Many SSDs these days are low-end budget models which can be as slow as a hard drive. It can be extremely hard to find actual performance SSDs since the SSD manufacturers are now hiding the technical specifications, sometimes not even listing them in their product data sheets. Part of it is due to embarrassment, but another part is because the SSD manufacturers are utilizing different physical components with different characteristics & performance, making it impossible for them to list specific technical details. You can no longer trust SSD reviews because the SSD you receive may be using different components than the one that was reviewed. The problem is even worse since the model numbers do not change or reflect the differently constructed SSDs. This is occurring with all SSD manufacturers, even the popular, well-respected brands.

Oct 18, 2023 8:33 AM in response to etresoft

"Snapshots don't have any files at all. A snapshot is the entire disk index. It is a pointer to every single file that was on the disk when the snapshot was made. Large files that have been fragmented and/or modified will have multiple pointers."


Now that's the answer I was looking for!!! Finally!!!


Yes, I know. Those speeds are for large contiguous files without any folders within, hence, the "up to" part. There is all that file system overhead. The existence of lots of folders, like there are in many of my Final Cut libraries, doesn't help any! Yes, I know. But those speeds are better than nothing. I don't think they're irrelevant. They're much higher for the internal disk, just like you said. It's like a car in a race. Its top speed still tells you something, even when you have the overhead of all the curves in the road, traffic lights, etc. (Anh [rhymes with Poincare, or a little like "meh"], a rough analogy. But you know what I mean.)


As for the internal disk, even 2 TB would not be enough. All my footage and libraries would not fit. So I absolutely need the external drive. I can hear it now: "You should get a Mac Pro or Studio." Yeah, when I have mucho bucks, if that day ever comes, maybe then. Again, I was told by users in the Final Cut group that an external drive should be fine for Final Cut. So I went for it. Now, for half the bucks, I could get another SSD from OWC -- same size, much faster. Maybe FCLM will actually work right on that one. Maybe I'll get it -- maybe not.


I wish the cables fit better or more snugly. The "A-type" aren't too bad, but the B-type that looks like a square with a hump -- those are great! They are easy to plug in and make a nice solid connection. I think for at least one of my drives, the power cord connections are a little too "sensitive" or "fragile." I need to shut down and redo everything. I wish I had bought 1m cables instead of the 2m ones! Oh, the worst are the other B-type, the thin wide ones with two parts. Pain in the butt and don't have a real snug feel to them.


After checking for certain files on an old backup disk I retrieved and hooked up, I must have bumped something, as one or both of my backup disks went offline. I just plugged everything back in, and things were wacky for a while! The first backup appeared to be close to stuck at one point, and there were many md* processes, and they were CPU-consuming. But I didn't run First Aid, and in a few hours, the ship righted itself (!), as they say. I'm getting short backups about 1 hour apart as before.


BTW, Spotlight keeps the backup disks pretty busy at times! I think it's Spotlight. I have a small portable fan in my office, and it helps mask the constant "buzzing" (not quite the right word). Ohp! There are a lot of mdworker_shared processes right now, and mds is using roughly 20% CPU. And one of the backup drives is buzzing away and flashing. Ughhh.

Oct 19, 2023 1:52 AM in response to betaneptune

"Snapshots don't have any files at all. A snapshot is the entire disk index. It is a pointer to every single file that was on the disk when the snapshot was made. Large files that have been fragmented and/or modified will have multiple pointers."


Now that's the answer I was looking for!!! Finally!!!


Yes, I know. Those speeds are for large contiguous files without any folders within, hence, the "up to" part. There is all that file system overhead. The existence of lots of folders, like there are in many of my Final Cut libraries, doesn't help any! Yes, I know. But those speeds are better than nothing. I don't think they're irrelevant. They're much higher for the internal disk, just like you said. It's like a car in a race. Its top speed still tells you something, even when you have the overhead of all the curves in the road, traffic lights, etc. (Anh [rhymes with Poincare, or a little like "meh"], a rough analogy. But you know what I mean.) Actually, if you have a car that can do 100 mph, it can go, say, 60 mph on certain curves. But another car that has a top speed of 30, cannot go 60 on those same curves. So it does help.


As for the internal disk, even 2 TB would not be enough. All my footage and libraries would not fit. So I absolutely need the external drive. I can hear it now: "You should get a Mac Pro or Studio." Yeah, when I have mucho bucks, if that day ever comes, maybe then. Again, I was told by users in the Final Cut group that an external SSD drive should be fine for Final Cut. So I went for it. Now, for half the bucks, I could get another SSD from OWC -- same size, much faster. Maybe FCLM will actually work right on that one. Maybe I'll get it -- maybe not.


I wish the cables fit better or more snugly. The "A-type" aren't too bad, but the B-type that looks like a square with a hump -- those are great! They are easy to plug in and make a nice, solid connection. I think for at least one of my drives, the power cord connections are a little too "sensitive" or "fragile." I need to shut down and redo everything. I wish I had bought 1m cables instead of the 2m ones! Oh, the worst are the other B-type, the thin wide ones with "two parts." Pain in the butt to put in and don't have a really good feel to them once they're in. They feel "sloppy," for lack of a better word that I can think of at the moment. Not robust.


After checking for certain files on an old backup disk I retrieved and hooked up, I must have bumped something, as one or both of my backup disks went offline. I just plugged everything back in, and things were wacky for a while! The first backup appeared to be close to stuck at one point, and there were many md* processes, and they were CPU-consuming. But I didn't run First Aid, and in a few hours, the ship righted itself (!), as they say. I'm getting short backups about 1 hour apart as before.


BTW, Spotlight keeps the backup disks pretty busy at times! I think it's Spotlight. I have a small portable fan in my office, and it helps mask the constant "buzzing" (not quite the right word). It's somewhere between buzzing and grinding or gravelly. It's like irregular pulsing. Ohp! There are a lot of mdworker_shared processes right now, and mds is using roughly 20% CPU. And one of the backup drives is buzzing away and flashing. Ughhh. I don't remember this being the case. Maybe it's because there are now many more backups? No, sir; I don't like it!


The backup disks are finally quiet, and all those md* processes are gone! Good reason not to do regular reboots!

Oct 12, 2023 10:54 PM in response to ku4hx

Is there anything in macOS that is not in the purview of the developers? There are users of this site who know things like this.


It's not stuck. I said it took several minutes per snapshot. I never said anything was stuck.


I am not erasing the backup drive. I'm pretty sure the backups are okay. Only if I find out that the dataset is corrupt will I erase the drive. Why do that when the odds are really high that all is okay? The First Aid pass will probably be done sometime later today, and if it comes up fine, there's no need to erase anything. I simply want to know why it takes so long, in particular, compared to everything else it does. (BTW, backups are very important! You might want to know what was in a file months ago, for example. Backups are for more than just recovering from a sudden disk failure. For what you're talking about, you need a RAID array. Backups != data availability.)


"If it passes First Aid, and sometimes you do have to run it more than once, then you're home free. But then you have to consider it did take more than once to complete first Aid. That may or may not be significant."


I have no idea what this means.


This phenomenon where it takes several minutes to do a snapshot happens all the time on all disks that have snapshots. I'd be erasing everything by your advice. I'm just curious what it is that makes it take so long? Is it so terrible to ask such a question? I don't see why it's impossible that someone outside of the Apple dev population would know something about this. Maybe a dev person might even log in and answer it him- or herself! Apple once published how its old memory system worked, and if you didn't know, someone who read it might answer here.


I think I know the problem. There are times when I am just curious about something, and people assume there must be a problem. Well, actually there is a problem: why does it take so .... long?! Seems to me that this is the case for others, too. So what's the deal? It's a simple question.


Oct 13, 2023 10:14 AM in response to ku4hx

I meant the problem with not getting a good answer. People always assume I need to fix something. I simply want to know what it is doing that takes so long. I am CURIOUS. Also, if there is a way to shorten the time it takes for the snapshots, or skip them altogether. I mean, really, it takes hours for the thing to run. I've got 77 snapshots on the disk I'm checking now. It will probably take 9 hours. And for what? That's what I'm asking. (So it's both curiosity and a question asking if I can shorten the elapsed time.)


"You can erase your backup drive with no harm to your system drive any time you want to. It is a backup, after all."


OK, I guess you meant if I found a problem. I thought you meant I should just go ahead and erase it as if it were no big deal. Even so, re "It is a backup after all." Backups are for restoring a failed disk, finding old versions of files, and recovering files you deleted by mistake. So they're not "just backups." What you seem to be talking about is data availability. And for that you use RAID (well, except for RAID 0, which I think is just striping), not backups. They're not the same. RAID means you can survive a disk failure or two without interrupting system operation. Useful in production environments. Backups mean you can recover lost data, which may or may not be from a disk failure.

Oct 13, 2023 10:25 AM in response to etresoft

Still, several minutes, like 5 to 10 or longer, for each snapshot? Still seems excessive. These are 16TB drives, so when they get full, it might take an entire 24 hours!


"People have an unnatural affinity for the history of their backups. It's a nice trick that Time Machine has."


Well, it's not just Time Machine. I used to do full and incremental backups on OpenVMS systems. One could easily go to an old tape and recover deleted or corrupted files. Oh, I did backups on the Stratus, too.


Well, one of the two backup disks is on UPS. But at some point you run out of outlets. I already have an extension power strip plugged into it, and those transformer blocks waste a lot of the outlets. Then I was plugging my UPS cable into the hub and two other disks gave me those ". . . eject first . . ." messages. And yes, the hub is on the UPS. Maybe it's another bad hub. It's been working much better than my last one. Time to review and perhaps redo the whole setup.


And yes, there was a brief power outage yesterday. Probably at about 21:xx EDT.


Thanks! (^_^(



Oct 13, 2023 2:24 PM in response to etresoft

"Never run Disk First Aid on a Time Machine backup."


Well, it's too late. I already ran it on the one that wasn't on the UPS when there was a brief power outage here. (It came up clean, BTW.) Then I noticed the UPS cable wasn't plugged into the hub. So I plugged it in and two other disks gave the ". . . eject first . . ." notifications. I'm presently running First Aid on the 2nd Time Machine drive. Unlike the other one that wasn't on the UPS, I had to use a manual command in Terminal:


% time sudo diskutil verifyVolume /dev/disk9s2


It's now checking snapshot 41 of 77. Should I ^C it and configure it back into Time Machine? Or just let it run its course?


There were 3 drives involved:

TM disk 1 - F.A. is running now. Had gotten the ". . . eject . . ." msg. when I was plugging the UPS comm cable into the hub. It's checking snapshot 41 of 77.

TM disk 2 was not on UPS, so I ran F.A. on it. Came up clean.

External hard drive for ordinary storage. Got the ". . . eject . . ." message on this when I plugged the UPS comm cable into the hub. Ran F.A. on it, ran really quickly, and came up clean.


A year or two ago, I once tried copying a snapshot. Yeah, that didn't go too well! Never tried to delete one, though! But what to do if the TM drive is failing, and you want to copy your data off fast? I guess you better have another backup going in parallel, which I do. Yeah, your backups are safe, unless the disk dies.


BONUS: As for wear and tear, isn't it good if the disk doesn't stop spinning for sleep? I hate waiting for a disk to spin up, so I wish I could turn this "energy saver" feature off. Yeah, how much energy do I save if I have to go to the store that much sooner?


TIA!

Oct 17, 2023 9:15 PM in response to HWTech

My Time Machine HDDs are definitely less than 5 Gbps. Apparently much less. I have downloaded the AJA app, and it won't work on Time Machine drives. Looks like nothing but Time Machine itself can do anything to them. If a TM drive is failing, you can't copy off the backups! Well, hopefully it's at least robust, ransomware-proof, and will restore files and folders correctly, unlike my last restore operation with an older version of TM, which I had to do due to a failed drive. It put the Movies and iTunes or Music folders in the wrong places, and missed some files, IIRC! Probably best to have two running in parallel, which is what I do. For really old backups, I guess the only way to be sure you can go as far back as you like is to buy that with an offsite/online backup service.


Anyway, the claimed speed on these TM drives (OWC), at least currently, is "up to 252MB/s real-world speed." But that's probably for a large file, not something like snapshots, which have many thousands of files and folders. I assume there's a lot of overhead with that structure!


The great thing about Time Machine is that it makes it easy to restore files that go into a database, like Mail, TV, Music and such. macOS does it for you!


Running AJA on my external SSD I get about 800 MB/s average. Not bad! The M1 Mac gets like 2300 MB/s!


I still think it takes an incredibly long time for "First Aid" to run if you have a fair number of snapshots. Wow, I just ran AJA on the internal drive and got 3139 MB/s write and 1898 MB/s read!


The external SSD: Just got 838 MB/s write and 766 MB/s read. I think the advertised value was "up to 1025 MB/s" or similar. 


I've had mixed success with WD. One drive was broken right out of the box! I couldn't even initialize it. I could hear the head assembly going back and forth in spurts maybe 1/2 second apart.


I had a SanDisk 2TB SSD that would occasionally brick on me, and now it's just dead. I've heard that a lot of them were going south!

