wondermac wrote:
You are right. The KeepAlive key was apparently the problem. But I don't really get it.
twtwtw wrote:
what you're basically doing with it is forcing the shell script to relaunch if it quits with an error.
Isn't that what I want? I want the shell script to be run again, i.e. rsync would attempt again to finish the backup, until rsync quits without an error, i.e. the backup is done.
You haven't included your shell script, so I can't be specific, but the KeepAlive key is intended for processes that run continuously. The idea is that you start the process running, then launchd monitors it and restarts it under various conditions. I suspect, however, that you're running a one-pass script: something that launches rsync and then ends. So let me expand the flow I gave above:
- at the correct time, launchd runs the job, which in turn launches the shell script
- the shell script launches rsync and then ends
- rsync starts trying to do a backup
- launchd sees that the main shell script has ended and kills all its child processes
- rsync is killed ungracefully by the system
- because it had to kill child processes, launchd marks the job as having an unsuccessful exit
- launchd launches the shell script again immediately, because of the KeepAlive options you've chosen.
- repeat steps 2 through 7 every ten seconds or so until the machine is restarted (or rsync quits gracefully)
- machine restarts, but launchd still remembers if the last run of that process ended unsuccessfully and tries to start it again.
Isn't launchd fun? 😀
But actually, this is not what is happening. I sometimes do get an error from the rsync script but the script is not run again. Interestingly, the error only shows up in the rsync log file, but not in the error file I defined in launchd:
A few things are going on here. First, rsync has its own error system, and (if I remember correctly) doesn't pass errors to standard error; it just writes them to the log. Second, rsync is not the process the launchd job started; rsync is a child of that process (the shell script), running independently. Third, rsync is a relatively smart utility, so it may very well be progressing through the backup incrementally across multiple quits. Whenever rsync quits gracefully (finishes a backup or quits itself from an internal error), launchd marks the job as a successful exit and does not try to restart it.
At any rate, to get the effect you want, first edit your rsync command to send output and error messages where you want them to go. You'll have to convince rsync to do it; launchd won't catch it. Then use the following plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>org.jenal.docs_backup</string>
<key>ProgramArguments</key>
<array>
<string>/Users/marcus/bin/enterprise_docs_rsync.sh</string>
</array>
<key>AbandonProcessGroup</key>
<true/>
<key>StartCalendarInterval</key>
<dict>
<key>Hour</key>
<integer>11</integer>
<key>Minute</key>
<integer>30</integer>
<key>Weekday</key>
<integer>5</integer>
</dict>
<key>StandardOutPath</key>
<string>/Users/marcus/tmp/launchd_pix.out</string>
<key>StandardErrorPath</key>
<string>/Users/marcus/tmp/launchd_pix.err</string>
</dict>
</plist>
I wouldn't worry about trying to monitor rsync in action. It's written to be robust; you should let it do its thing.
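For the first step (sending rsync's output and error messages where you want them), here is a rough sketch of one way to do it with plain shell redirection inside the script. I haven't seen your script, so treat this as an illustration only: run_logged is a made-up helper name, not part of rsync or launchd, and the two log files are whatever paths you pass in.

```shell
#!/bin/sh
# Sketch only: run_logged is a hypothetical helper, not an rsync or launchd feature.
# Usage: run_logged OUT_FILE ERR_FILE COMMAND [ARGS...]
# Runs COMMAND with stdout appended to OUT_FILE and stderr appended to ERR_FILE.
run_logged() {
    out_file=$1
    err_file=$2
    shift 2
    # The function's exit status is the exit status of the command itself,
    # so the caller can still tell whether the command succeeded.
    "$@" >>"$out_file" 2>>"$err_file"
}
```

In your script you would then call something like run_logged "$HOME/tmp/rsync.out" "$HOME/tmp/rsync.err" rsync followed by your usual options, source, and destination. The file names here are placeholders, not anything from your setup.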