14 Replies Latest reply: Jun 4, 2012 10:53 AM by BobHarris
Jerome Krinock Level 1 Level 1 (15 points)

I'd like to understand why Time Machine creates a sparse bundle disk image when backing up to a network drive.  Is the likely reason that the compression reduces network traffic and thus results in faster backups?  If so, is the typical "gain" we get from doing this roughly the same as the typical "squeeze" you get from compressing a file, 50% or so depending on the type of file?

 

Thanks!

  • MrHoffman Level 6 Level 6 (13,050 points)

    You only need to haul around the 8 megabyte "bands" from the sparse bundle that have changed with the backup, rather than what would have meant hauling around a much larger disk image file.

  • Linc Davis Level 10 Level 10 (165,065 points)

    It's so that the local filesystem doesn't have to be HFS. The HFS filesystem is encapsulated in the disk image.

  • Jerome Krinock Level 1 Level 1 (15 points)

    MrHoffman, are you saying that it helps with the "Microsoft Entourage" problem of large files?  Do you mean that Time Machine would only send the changed band over the network to the backup drive?

  • Jerome Krinock Level 1 Level 1 (15 points)

    Linc, this seems like a correct answer, but I'm not smart enough to know for sure.

     

    One thing that bothers me is that this answer implies that Apple has gone out of their way to enable use of non-Apple network attached storage devices.   That is, shall we say, uncharacteristic of them!  You know, I presume that Time Capsule formats its drive as HFS.  Am I making sense?

  • Linc Davis Level 10 Level 10 (165,065 points)

    I don't know what a Time Capsule does internally. There is a published specification for Time Machine servers, and some third-party NAS vendors claim to implement it. They don't implement HFS.

  • Linc Davis Level 10 Level 10 (165,065 points)

    Sparseimage bands aren't sent over the network at all. The sparseimage resides on the server, not on the clients.

  • BobHarris Level 6 Level 6 (15,600 points)

    A sparse bundle containing an HFS+ file system offers 2 advantages:

     

    1st) As files inside the sparse bundle's HFS+ file system are deleted, if all of an 8MB band is now free, the entire band can be given back to the host file system.  That way the sparse bundle does not need to stay at its peak size, but rather can grow or shrink within the hosting file system.

     

    2nd) Ignoring for the moment what is inside a sparse bundle, as a container, the sparse bundle allows an outside backup utility to perform an 8MB band-by-band incremental backup.  Keep in mind that in one implementation of FileVault, your home directory was encrypted inside a sparse bundle, which made it easier to back up your home directory with an incremental backup utility that only needed to copy the 8MB bands that had changed recently.  And considering your music and pictures folders were also inside, that sparse bundle could be huge.

     

    So once Apple had the sparse bundle technology, it made a useful container for Time Machine backups.

  • Jerome Krinock Level 1 Level 1 (15 points)

    Thank you, Bob.  So why does Time Machine only use the sparse bundle disk image for network Time Machine disks?

  • twtwtw Level 5 Level 5 (4,900 points)

    He answered that.  Sparse bundles allow band-by-band incremental backups, which (from a network perspective) is both more efficient and more reliable.  For instance, rather than being forced to copy an entire 10 GB disk image, with the possibility of corrupting the whole thing if there's a network error, the system can transfer one band at a time, or even only the bands that have changed.

  • BobHarris Level 6 Level 6 (15,600 points)

    Why?  I have no clue.  I can speculate.  First, a sparse bundle means all the remote file system needs to provide is the ability to create lots and lots of 8MB files with the names Time Machine likes.  The remote system does not need to support hard links, and it does not need to support hard links to directories (most Unix systems support hard links, but explicitly exclude hard-linking directories).  The remote system does not need to support resource forks, it does not need to support extended attributes, it does not need to support Finder flags, etc...

     

    Time Machine puts a fully functional HFS+ file system inside that sparse bundle, then accesses that sparse-bundle-wrapped HFS+ file system as if it were a local file system, allowing the lower-layer file system drivers to deal with and worry about the networking issues.

     

    The 8MB bands are not directly involved in the network traffic.  That is to say, if Time Machine needs to back up a 1024 byte file, it will just write 1024 bytes (plus needed file system metadata) to the sparse bundle, and that is all the network traffic that occurs for that small file.  The 8MB band containing that file will not be sent.
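
A minimal sketch of the arithmetic behind this point: a write at some byte offset inside the disk image falls into one band (or two, if it straddles a boundary), yet only the written bytes need to travel, not the whole band. The function name `bands_touched` is hypothetical, and naming band files by their hexadecimal index is an assumption made for illustration.

```python
BAND_SIZE = 8 * 1024 * 1024  # 8 MB bands, as described in this thread


def bands_touched(offset: int, length: int, band_size: int = BAND_SIZE) -> list[str]:
    """Return the names of the band files covered by a write.

    Assumption: band files are named by their index in lowercase hex.
    """
    if length <= 0:
        return []
    first = offset // band_size
    last = (offset + length - 1) // band_size
    return [format(i, "x") for i in range(first, last + 1)]


# A 1024-byte write 20 MB into the image lands entirely inside one band.
print(bands_touched(20 * 1024 * 1024, 1024))  # → ['2']
```

The write above changes 1024 bytes of one 8 MB band file; a network file system only has to carry those bytes.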

     

    The 8MB bands are more useful to Time Machine as a way to give unused space back to the host system when older Time Machine generations are purged.

     

    The 8MB bands are also useful if a sparse bundle is used to hold something besides a Time Machine backup, and that bundle is being backed up by any kind of incremental backup utility that does not or cannot look inside the sparse bundle (SuperDuper, Carbon Copy Cloner, rsync, etc...).  That external backup utility can then just back up the 8MB bands which have been modified.  And it does not matter if the backup is over the network or to a local disk.  Incremental backups are always faster if you do not need to copy everything.
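
As a rough sketch of what such an external utility does, the following copies only the band files whose size or modification time differs from the copy already at the destination, similar in spirit to the size-and-mtime check rsync uses by default. The function name `copy_changed_bands` is hypothetical, the sketch assumes the band files sit in a `bands` directory inside the bundle, and it ignores the bundle's other files.

```python
import shutil
from pathlib import Path


def copy_changed_bands(src_bundle: Path, dst_bundle: Path) -> list[str]:
    """Copy only new or modified band files; return the names copied."""
    src_bands = Path(src_bundle) / "bands"  # assumed layout
    dst_bands = Path(dst_bundle) / "bands"
    dst_bands.mkdir(parents=True, exist_ok=True)

    copied = []
    for band in sorted(src_bands.iterdir()):
        if not band.is_file():
            continue
        target = dst_bands / band.name
        src_stat = band.stat()
        if target.exists():
            dst_stat = target.stat()
            # Quick check: same size and same mtime (to the second) means unchanged.
            if (dst_stat.st_size == src_stat.st_size
                    and int(dst_stat.st_mtime) == int(src_stat.st_mtime)):
                continue
        shutil.copy2(band, target)  # copy2 preserves the modification time
        copied.append(band.name)

    # Bands deleted at the source (space given back) are removed here too.
    src_names = {p.name for p in src_bands.iterdir()}
    for stale in dst_bands.iterdir():
        if stale.name not in src_names:
            stale.unlink()
    return copied
```

Run twice in a row with no changes in between, the second call copies nothing, which is the whole appeal of the band layout for this kind of tool.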

     

    There was mention of compression.  A sparse bundle does not necessarily have compression involved.  It is sparse because any 8MB band that is totally empty can have its band file deleted, thus saving the host system disk space.  It is sparse because not every band from 1 through n needs to actually have an 8MB file associated with it.  The all-zero bands are the sparse holes.  They can be filled in later when there is something to put in that space, and deleted again when what was there is removed and the entire 8MB band is again totally empty.
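
A toy model may make the "sparse holes" idea concrete. The class `SparseBands` below is hypothetical and keeps bands in a dictionary rather than in files; it also treats an all-zero band as free, which is a simplification of how free space is actually tracked and reclaimed.

```python
class SparseBands:
    """Toy band store: a band exists only while it holds non-zero data."""

    def __init__(self, band_size: int = 8):
        self.band_size = band_size
        self.bands: dict[int, bytearray] = {}  # band index -> contents

    def write(self, offset: int, data: bytes) -> None:
        touched = set()
        for i, byte in enumerate(data):
            index, pos = divmod(offset + i, self.band_size)
            band = self.bands.setdefault(index, bytearray(self.band_size))
            band[pos] = byte
            touched.add(index)
        # A band that is now all zeros becomes a hole again.
        for index in touched:
            if not any(self.bands[index]):
                del self.bands[index]

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        for i in range(offset, offset + length):
            index, pos = divmod(i, self.band_size)
            band = self.bands.get(index)
            out.append(band[pos] if band is not None else 0)  # holes read as zeros
        return bytes(out)


store = SparseBands(band_size=8)
store.write(10, b"abc")        # creates band 1 only; band 0 stays a hole
store.write(10, bytes(3))      # zeroing the data removes band 1 again
```

No compression is involved at any point: space is saved only because empty bands are never stored.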

     

    But all I'm doing is speculating about why Apple chose sparse bundles.  I suspect they were created for space-efficient and backup-efficient home directories encrypted in FileVault, but that once the technology existed, the Time Machine team decided it was a good solution for its needs, over requiring that the remote file system be 100% HFS+ compatible, over regular .dmg disk images, over sparse disk images, over zip files, over name-your-poison storage solutions.  And in a future release, Time Machine may switch to yet another storage format as better solutions come along.

     

    Message was edited by: BobHarris

  • Jerome Krinock Level 1 Level 1 (15 points)

    I shall further speculate that the reason Apple did not use the sparse bundle disk image for local Time Machine drives is that the advantages BobHarris listed in his first paragraph are not as strong when considering a local drive, and Apple deemed that they were outweighed by the advantages of using regular files, most notably that the backup is more robust if users can browse and copy their backup files from their Time Machine disk.  (That is despite the fact that these files contain unwanted ACLs which can be annoying if copied using a method that preserves ACLs.)

     

    (For the record, I marked BobHarris' answer as correct, even though I'm not smart enough to know for sure.  BobHarris' answer certainly makes a lot of sense.)

     

    Thanks, all.

  • ianhinder Level 1 Level 1 (0 points)

    Another possible reason is one of security.  In order to create an exact copy of your local filesystem, Time Machine needs to create files in the backup volume with different ownerships and permissions.  For example, many files need to be owned by "root" and some may have the SUID bit set (allowing a normal user to run them with root permissions).  If the Time Machine volume were a direct mount of the remote file system, the only way to create files owned by root would be to log in to the file server as root.  This might be OK for a Time Capsule used by a single user, but is a bad idea in general for servers used by more than one person.

     

    With the bundle/image approach, the bundle files can be owned by any user on the server, and do not need special permissions to access.  All the permission information is stored in the bundle files as normal file data.

  • Jerome Krinock Level 1 Level 1 (15 points)

    Ianhinder, I've never understood how security works in Mac OS X when logging into another Mac.  When I attempt to do something on the other Mac that requires an administrator credential, do I enter the credential of an admin on my Mac or on the other Mac?  The latter makes more sense, but I've generally had more luck with the former, although I've made the situation even more confusing by usually configuring admin accounts with the same user name and password on all of our Macs.

  • BobHarris Level 6 Level 6 (15,600 points)

    The remote Mac's admin credentials are what is needed, as you are making the changes on the remote Mac.