Why does Time Machine use a Sparse Bundle Disk Image for network backups?

7919 Views 14 Replies Latest reply: Jun 4, 2012 10:53 AM by BobHarris
Jerome Krinock Level 1 (15 points)
Feb 13, 2012 6:05 PM

I'd like to understand why Time Machine creates a sparse bundle disk image when backing up to a network drive.  Is the likely reason that the compression reduces network traffic and thus results in faster backups?  If so, is the typical "gain" we get from doing this roughly the same as the typical "squeeze" you get from compressing a file, 50% or so depending on the type of file?

 

Thanks!

  • MrHoffman Level 6 (11,720 points)

    You only need to haul around the 8 megabyte "bands" from the sparse bundle that have changed with the backup, rather than what would have meant hauling around a much larger disk image file.

  • Linc Davis Level 10 (107,985 points)

    It's so that the local filesystem doesn't have to be HFS. The HFS filesystem is encapsulated in the disk image.

  • Linc Davis Level 10 (107,985 points)

    I don't know what a Time Capsule does internally. There is a published specification for Time Machine servers, and some third-party NAS vendors claim to implement it. They don't implement HFS.

  • Linc Davis Level 10 (107,985 points)

    Sparseimage bands aren't sent over the network at all. The sparseimage resides on the server, not on the clients.

  • BobHarris Level 6 (12,510 points)

    A sparse bundle containing an HFS+ file system offers two advantages:

     

    1st) As files inside the sparse bundle's HFS+ file system are deleted, if all of an 8MB band is now free, the entire band can be given back to the host file system.  That way the sparse bundle does not need to stay at its peak size, but rather can grow or shrink within the hosting file system.

     

    2nd) Ignoring for the moment what is inside a sparse bundle, as a container the sparse bundle allows an outside backup utility to perform an 8MB band-by-band incremental backup.  Keep in mind that one implementation of FileVault encrypted your home directory inside a sparse bundle, which made it easier to back up your home directory with an incremental backup utility that only needed to copy the 8MB bands that had changed recently.  And considering your music and pictures folders were also inside, that sparse bundle could be huge.

     

    So once Apple had the sparse bundle technology, it also made a useful container for Time Machine backups.

  • twtwtw Level 5 (4,580 points)

    He answered that.  Sparse images allow band-by-band incremental backups, which (from a network perspective) are both more efficient and more reliable.  For instance, rather than being forced to copy an entire 10 GB disk image, with the possibility of corrupting the whole thing if there's a network error, the system can transfer one band at a time, or even only the bands that have changed.

  • BobHarris Level 6 (12,510 points)

    Why?  I have no clue.  I can speculate.  First, a Sparse Bundle means all the remote file system needs to provide is the ability to create lots and lots of 8MB files with the names Time Machine likes.  The remote system does not need to support hardlinks, and it does not need to support hardlinks between directories (most Unix systems support hardlinks, but explicitly exclude hardlinking directories).  The remote system does not need to support resource forks, it does not need to support extended attributes, it does not need to support Finder flags, etc...

     

    Time Machine puts a fully functional HFS+ file system inside that Sparse Bundle, then accesses that Sparse Bundle-wrapped HFS+ file system as if it were a local file system, allowing the lower-layer file system drivers to deal with and worry about the networking issues.

     

    The 8MB bands are not directly involved in the network traffic.  That is to say, if Time Machine needs to back up a 1024-byte file, it will just write 1024 bytes (plus needed file system metadata) to the sparse bundle, and that is all the network traffic that occurs for that small file.  The 8MB band containing that file will not be sent.
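
    To make the point above concrete, here is a rough Python sketch of how a write at some byte offset inside the disk image lands in one (or at most a few) band files.  The function name is made up for illustration, and naming bands by their index in lowercase hex is an assumption about the on-disk layout, not something stated in this thread.  The write itself is still only as large as the data; the band is simply the file that ends up modified.

```python
from __future__ import annotations

BAND_SIZE = 8 * 1024 * 1024  # 8MB bands, the size discussed in this thread


def bands_touched(offset: int, length: int, band_size: int = BAND_SIZE) -> list[str]:
    """Return the (hypothetical) band file names covering [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // band_size
    last = (offset + length - 1) // band_size
    # Assumption: each band file is named by its index written in lowercase hex.
    return [format(index, "x") for index in range(first, last + 1)]


# A 1024-byte write in the middle of a band modifies exactly one band file.
print(bands_touched(3 * BAND_SIZE + 4096, 1024))
# A write that straddles a band boundary modifies two.
print(bands_touched(BAND_SIZE - 512, 1024))
```

    Only the bytes written cross the network; the band count matters later, when something outside the bundle has to decide which band files changed.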

     

    The 8MB bands are more useful to Time Machine as a way to give unused space back to the host system when older Time Machine generations are purged.

     

    The 8MB bands are also useful if a sparse bundle is used to hold something besides a Time Machine backup, and that bundle is being backed up by any kind of incremental backup utility that does not or cannot look inside the sparse bundle (SuperDuper, Carbon Copy Cloner, rsync, etc...).  That external backup utility can then just back up the 8MB bands which have been modified.  And it does not matter if the backup is over the network or to a local disk.  Incremental backups are always faster if you do not need to copy everything.
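
    As a minimal sketch of what such an external utility does, the Python below copies only the band files that are new or whose size or modification time differs from the copy already in the backup.  This is not how SuperDuper, Carbon Copy Cloner, or rsync are actually implemented; the function name and the size/mtime comparison are assumptions made for illustration.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def copy_changed_bands(src_bands: Path, dst_bands: Path) -> list[str]:
    """Copy band files that are new or changed; return the names that were copied."""
    dst_bands.mkdir(parents=True, exist_ok=True)
    copied = []
    for band in sorted(src_bands.iterdir()):
        if not band.is_file():
            continue
        target = dst_bands / band.name
        src_stat = band.stat()
        if target.exists():
            dst_stat = target.stat()
            # Treat the band as unchanged if size matches and the backup copy
            # is at least as new as the source.
            if (dst_stat.st_size == src_stat.st_size
                    and dst_stat.st_mtime_ns >= src_stat.st_mtime_ns):
                continue
        shutil.copy2(band, target)  # copy2 also carries over the modification time
        copied.append(band.name)
    return copied
```

    A fuller tool would also delete backup bands that no longer exist in the source, which is how the space savings from freed bands would carry over to the backup.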

     

    There was mention of compression.  A sparse bundle does not necessarily have compression involved.  It is sparse because any 8MB band that is totally empty can have its band file deleted, thus saving the host system disk space.  It is sparse because not every band from 1 through n needs to actually have an 8MB file associated with it.  The all-zero bands are the sparse holes.  They can be filled in later when there is something to put in that space, and again deleted when what was there is removed and the entire 8MB band is again totally empty.
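
    The "sparse holes" idea can be modelled with a toy in-memory store, sketched below in Python.  It is only an illustration of the behaviour described above: the class, its tiny default band size, and the automatic reclaiming of all-zero bands on every write are assumptions, and the real disk image machinery is not modelled.

```python
from __future__ import annotations


class ToyBandStore:
    """In-memory stand-in for a sparse bundle's band files (illustration only)."""

    def __init__(self, band_size: int = 8):
        self.band_size = band_size
        self.bands: dict[int, bytearray] = {}  # band index -> band contents

    def write(self, offset: int, data: bytes) -> None:
        for i, byte in enumerate(data):
            index, pos = divmod(offset + i, self.band_size)
            if index not in self.bands:
                if byte == 0:
                    continue  # writing zeros into a hole leaves it a hole
                self.bands[index] = bytearray(self.band_size)
            self.bands[index][pos] = byte
        # Give back any band that is now entirely zero.
        for index in [k for k, v in self.bands.items() if not any(v)]:
            del self.bands[index]

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        for i in range(offset, offset + length):
            index, pos = divmod(i, self.band_size)
            band = self.bands.get(index)
            out.append(band[pos] if band is not None else 0)  # holes read as zeros
        return bytes(out)


store = ToyBandStore()
store.write(20, b"hi")        # creates only the band that holds offset 20
print(sorted(store.bands))
store.write(20, b"\x00\x00")  # the band is all zero again, so it is dropped
print(sorted(store.bands))
```

    No compression is involved anywhere in this model; the space saving comes entirely from never storing, or from dropping, bands that contain nothing.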

     

    But all I'm doing is speculating about why Apple chose Sparse Bundles.  I suspect they were created for space-efficient and backup-efficient home directories encrypted in FileVault, but that once the technology existed the Time Machine team decided it was a good solution for its needs, over requiring the remote file system to be 100% HFS+ compatible, over regular .dmg disk images, over sparse disk images, over zip files, over name-your-poison storage solutions.  And in a future release, Time Machine may switch to yet another storage format as better solutions come along.

     

    Message was edited by: BobHarris

  • ianhinder

    Another possible reason is one of security.  In order to create an exact copy of your local filesystem, Time Machine needs to create files in the backup volume with different ownerships and permissions.  For example, many files need to be owned by "root" and some may have the SUID bit set (allowing a normal user to run them with root permissions).  If the Time Machine volume were a direct mount of the remote file system, the only way to create files owned by root would be to log in to the file server as root.  This might be OK for a Time Capsule used by a single user, but is a bad idea in general for servers used by more than one person.

     

    With the bundle/image approach, the bundle files can be owned by any user on the server, and do not need special permissions to access.  All the permission information is stored in the bundle files as normal file data.

  • BobHarris Level 6 (12,510 points)

    The remote Mac's admin credentials are what is needed, as you are making the changes on the remote Mac.
