22 Replies Latest reply: Feb 19, 2008 9:21 AM by Tim Semic
pariah0 Level 1 (80 points)
I've primarily been a Linux and *BSD user in the past, and I've used rsync for years. But with OS X, I'm running into behavior I don't understand:

Like many people, I've used rsync to do backups (over a network) of my iTunes 'stuff' to a Linux server at home.

I typically use a command line similar to the following:

rsync -aP ~/Music/iTunes remotehost:/srv/

If I were to do a Linux -> Linux transfer, it would typically be very fast, mainly because the data isn't changing and only new files are transferred.

However, with OS X it isn't fast at all. Even if a file's MD5 and SHA-1 sums are identical on both hosts (i.e. the contents are identical) and only the date or permissions differ, the entire file is still transferred (i.e. the number of bytes sent or received is greater than the size of the file).

Again, with a Linux -> Linux transfer, if a large file is identical on both sides (except for the date stamp), the amount of data actually transferred by rsync is much smaller than the file size. A multi-GB file could end up sending/receiving a few MB in a Linux -> Linux rsync, but when I do OS X -> Linux, several GB are sent.

What am I missing? I'd expect rsync to behave more or less the same with plain files and no extended attributes...

MacBook Pro 17", Mac OS X (10.5)
  • jarik Level 4 (1,005 points)
    I can't repeat that problem. I rsynced a 67MB AVI file to another Mac with rsync -aP (46s to transfer), then touched the original, making it one minute older. Another rsync -aP and a 2s transfer time later, it's clear it's not transferring the complete file.

    Maybe you could test with a single file and add -vvv to increase verbosity, to see why it's deciding to resend the file.
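A minimal sketch of that kind of single-file test, run locally so it needs no second machine. The scratch directory and the file name test.bin are made up for illustration, and --no-whole-file is added because rsync skips its delta algorithm by default when both paths are local:

```shell
#!/bin/sh
# Sketch only: the scratch paths and the file name test.bin are hypothetical.
resync_after_touch() {
    work=$(mktemp -d /tmp/rsync-test.XXXXXX) || return 1
    mkdir "$work/src" "$work/dst"

    # 1 MB of zeros standing in for a large media file.
    dd if=/dev/zero of="$work/src/test.bin" bs=1024 count=1024 2>/dev/null

    # First pass copies the whole file.
    rsync -a "$work/src/" "$work/dst/"

    # Change only the modification time, as in the touch test above.
    touch -t 200801010000 "$work/src/test.bin"

    # -i itemizes what rsync considers changed; --stats prints the byte counts.
    # --no-whole-file requests the delta algorithm even though both ends are local.
    rsync -ai --stats --no-whole-file "$work/src/" "$work/dst/"

    rm -rf "$work"
}

resync_after_touch
```

If rsync is behaving as expected, the second pass should itemize test.bin and the stats should show far less literal data than the file size. Over a network the same options apply, minus --no-whole-file.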
  • Gnarlodious Level 4 (3,225 points)
    The reason is Mac OS "resource files" (resource forks): an invisible file attached to almost every file. The resource file contains metadata about the file, an icon image, text layout and colors, or any other data the application assigns to it.

    Resource files have no modtime of their own, so the variant of rsync included in OS X transfers ALL resource files as a sloppy workaround. If you are syncing plain text files, no problem. But if you are syncing files with a Mac resource fork, you see a major slowdown.

    For more information read here.
  • jarik Level 4 (1,005 points)
    Sorry Gnarlodious, that doesn't seem to be true. Without -E, rsync ignores resource forks and works as documented. I don't have access to a Linux remote host to test with, so I tried Mac to Mac: rsync -aP with a Quicken data file that has a big resource fork.

    Edit: after checking my Music folder, it looks like most files have a small resource fork. Maybe the OP can verify whether they used -E without mentioning it. That'd explain it.

  • Gnarlodious Level 4 (3,225 points)
    It's true that music files have a small resource fork, as do almost all OS X files (.plist files being an exception). But the crosstalk involved slows down rsync more than the actual data of those files. This is the case even when rsyncing to a FireWire drive.

    As for the -E option, I don't believe leaving it off totally ignores "extended attributes". I believe rsync still handles them internally and just suppresses the actual transfer. That would explain the slow operation compared to the UNIX flavor. While Apple has been saying resource forks are deprecated, it shows no sign of abandoning the format any time soon. And I don't believe developers are willing to give up resource files. Personally, they are an essential part of my stuff and I hope they never go away.

    I keep a number of different rsync versions for various purposes. The resource fork of a music file is irrelevant for backup purposes, as one example. You can easily set the file creator and file type with an AppleScript if you should ever need to restore. That way you save time and bandwidth by using the non-extended version of rsync.
  • jarik Level 4 (1,005 points)
    As to the OP's problem, a major slowdown with iTunes files, I don't see it (albeit I am testing Mac-to-Mac). Do you? I rsync -aP'd a folder containing a CD's worth of files with resource forks. I then changed one of the files in it to open with QuickTime. Then I resynced and got:

    sent 6888 bytes  received 9484 bytes  2976.73 bytes/sec
    total size is 46168809  speedup is 2819.99

    That's about normal, isn't it?

    We need to see verbose output from rsync from the OP.
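The speedup figure in that summary is the total size divided by the bytes sent plus the bytes received. A quick check with the numbers quoted in this post (the variable names are just for illustration):

```shell
#!/bin/sh
# Figures copied from the rsync summary quoted in the post above.
sent=6888
received=9484
total_size=46168809

# speedup = total size / (bytes sent + bytes received)
speedup=$(awk -v s="$sent" -v r="$received" -v t="$total_size" \
    'BEGIN { printf "%.2f", t / (s + r) }')

echo "speedup is $speedup"   # prints: speedup is 2819.99
```

By the same arithmetic, a speedup close to 1 would mean that roughly the whole data set crossed the wire.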
  • Gnarlodious Level 4 (3,225 points)
    That would look about right. You are syncing 46 MB of files that contain 6888 bytes of resource data, and all that data is being transferred (both ways) because the resource files have no modtime. The actual payload is very small: 9484 - 6888 = 2596 bytes, which would be the size of the file you changed.

    So yeah, it's messed up. I was hoping it would be fixed in 10.5, but no such luck. It's hardly worth it to back up resourced files to an online server. That is why I suggested the Linux version, which ignores resource files.
  • jarik Level 4 (1,005 points)
    Too bad, eh?

    OP said the entire file is transferred, so that's a different issue.
  • pariah0 Level 1 (80 points)
    I realize this is a long time after my original post, but when I do the rsync, I get a 'speedup' of around 1, and the number of bytes sent/received is in the tens of gigabytes (i.e. the size of my iTunes library).

    It is frustrating, as I use rsync on Linux daily. For some things, I can get the OS X version of rsync to behave in what I consider a normal way, which is why I was asking whether there's anything 'special' about the OS X version of rsync... It appears that there is; I'll have to do some more tinkering...

    I'll have to play with 'rdiff' and see what differences rsync feels are so significant...
  • pariah0 Level 1 (80 points)
    Interestingly enough:

    When I sync from my Mac -> the Linux box, things work more or less as I'd expect them to.

    When I sync from my Linux box -> Mac (the rsync is initiated from the Mac, though), I have issues.

    Of note is the use of an external hard drive (i.e. a FireWire 400 hard drive), though I don't really see how that should be a problem: the drive can transfer data at around 40 MB/sec, which isn't much slower than a notebook hard drive.
  • Gnarlodious Level 4 (3,225 points)
    The difference in behavior is down to the Linux rsync you are talking to on the other end. The OS X version has tweaks to handle resource forks, and the Linux version does not. There is an option in the rsync command (--rsync-path) to run a non-default rsync binary on the other end, if that would help.
  • pariah0 Level 1 (80 points)
    It appears part of the issue is that I was rsyncing to a FireWire drive, and the bandwidth to/from the external drive was a bottleneck.
  • Gnarlodious Level 4 (3,225 points)
    I can hardly believe that. FireWire is the fastest you can get with an ejectable disk.

    Any sign that the FireWire system or disk is defective?
  • Tim Semic Level 1 (130 points)
    I use Apple's rotating-log approach to log my rsync results. I do copy the resource forks in my script, as my server holds some legacy OS 9 files. I send myself a daily email and, as you can see, I filter the resource fork files (the ._ files) out of that email with grep -v, since they always seem to copy over no matter what. The log itself, /var/log/daily-backup.log, has everything the backup does. I run this script from cron once a day.

    In case you are interested in logging I thought it may help you.

    <-start script->

    #!/bin/sh
    # Rotate daily-backup.log: shift each numbered archive up by one (keeping up to .7).
    echo ""
    printf %s "Rotating log files:"
    cd /var/log || exit 1
    for i in daily-backup.log; do
        if [ -f "${i}" ]; then
            printf %s " ${i}"
            # Archives carry a .gz suffix only when gzip is available.
            if [ -x /usr/bin/gzip ]; then gzext=".gz"; else gzext=""; fi
            if [ -f "${i}.6${gzext}" ]; then mv -f "${i}.6${gzext}" "${i}.7${gzext}"; fi
            if [ -f "${i}.5${gzext}" ]; then mv -f "${i}.5${gzext}" "${i}.6${gzext}"; fi
            if [ -f "${i}.4${gzext}" ]; then mv -f "${i}.4${gzext}" "${i}.5${gzext}"; fi
            if [ -f "${i}.3${gzext}" ]; then mv -f "${i}.3${gzext}" "${i}.4${gzext}"; fi
            if [ -f "${i}.2${gzext}" ]; then mv -f "${i}.2${gzext}" "${i}.3${gzext}"; fi
            if [ -f "${i}.1${gzext}" ]; then mv -f "${i}.1${gzext}" "${i}.2${gzext}"; fi
            if [ -f "${i}.0${gzext}" ]; then mv -f "${i}.0${gzext}" "${i}.1${gzext}"; fi
            # Create the fresh log with the right mode and owner, swap it in, then compress the old one.
            touch "${i}.$$" && chmod 640 "${i}.$$" && chown root:admin "${i}.$$"
            mv -f "${i}" "${i}.0" && mv "${i}.$$" "${i}" && if [ -x /usr/bin/gzip ]; then
                gzip -9 "${i}.0"; fi
        fi
    done
    # The following line creates a backup of the data drive onto the backup drive.
    rsync -aEv /Volumes/data/ /Volumes/backup >> /var/log/daily-backup.log 2>&1
    # Mails the log, minus the ._ resource fork entries, to the happy admin.
    # The dot is escaped: an unescaped '._' would match any character followed by an underscore.
    grep -Ev '(^|/)\._' /var/log/daily-backup.log | mail -s "rsync-backup on MacServe1" me@domain.com

    <-end script->
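One caveat when filtering on ._ names: in a grep pattern an unescaped dot matches any character, so a pattern of ._ also drops any line where some character is followed by an underscore. A minimal sketch of the difference, using made-up file names rather than real log lines:

```shell
#!/bin/sh
# Hypothetical sample of rsync -v output lines, made up for illustration.
sample_log() {
    printf '%s\n' \
        'Music/._track01.mp3' \
        'Music/track01.mp3' \
        'Docs/my_notes.txt' \
        '._Icon'
}

# Unescaped: "." matches any character, so the "y_" in my_notes.txt also matches.
loose=$(sample_log | grep -v '._')

# Escaped, and anchored to the start of a path component.
strict=$(sample_log | grep -Ev '(^|/)\._')

printf 'loose filter keeps:\n%s\n' "$loose"
printf 'strict filter keeps:\n%s\n' "$strict"
```

The loose filter keeps only Music/track01.mp3, while the strict one also keeps Docs/my_notes.txt, which is an ordinary file and belongs in the report.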
  • Tim Semic Level 1 (130 points)
    Ok,
    My script didn't paste in very well, as you can see. Can someone please advise on the proper format for adding code to a comment?

    Thanks.