
TimeMachine and SpamAssassin conspired to bring my macOS Server to its knees (bayes_toks.expire files)

When migrating to High Sierra + Server, I had to drop using Apple RAID (mirror) because macOS won't boot from it anymore. So, I changed my two-SSD setup to one boot disk and a TimeMachine backup.


This is dangerous, however, because Time Machine keeps local snapshots (which you cannot stop it from doing). So, under certain circumstances, when data suddenly grows a lot, any normal Unix-style cleanup (say, removing old logs) doesn't actually free up space until 10 hours later, when the local snapshot is deleted.


And that means there are scenarios that easily bring the Mac to its knees. In this case it was SpamAssassin (part of macOS Server Mail Services until High Sierra), but it can be any mechanism. Within 10 hours I ended up with 150GB of bayes_toks.expireXXXXX files (where XXXXX is a number, probably the process number) in /Library/Server/Mail/Data/scanner/amavis/.spamassassin, and my disk filled up until the system crashed. Sadly, this time the crash corrupted my APFS file system, which now refuses in any way to delete the Time Machine snapshots with tmutil. Even worse, when the boot disk is mounted on another Mac using Target Disk Mode (TDM), I cannot even remove the bayes_toks.expireXXXXX files. I could remove some of them, but at some point executing 'rm' results in:


$ rm bayes_toks.expire69547
rm: bayes_toks.expire69547: No space left on device


No space left on device when removing something? Great work, Apple. And try to keep the file system on HFS+ when installing High Sierra on an SSD so you do not run into APFS trouble? Good luck!


In the meantime, can someone give me good instructions on how to prevent my disk from filling up with bayes_toks.expireXXXXX files? Any tip for repairing a broken APFS volume is welcome too. The helpful Apple person I spoke to today was unable to help and told me I had to wipe and rebuild.


This is the second time this happened to me. I thought 200GB free should be enough in general. Not so.


Thanks.

Posted on Mar 5, 2019 6:26 AM

Question marked as Best reply

Posted on Mar 6, 2019 6:50 AM

Root Cause.


  1. macOS Server's mail services include virus and spam scanning. Spam scanning is done with SpamAssassin using a Bayesian algorithm. macOS Server's standard configuration in /Library/Server/Mail/Config/spamassassin/local.cf has no setting for auto-expiration of the Bayesian tokens. This means that bayes_auto_expire is at its default of 1, so every message filtered may trigger the expiry process. The standard time_limit for filtering is 300 seconds. As soon as filtering plus expiry of a message goes over that time limit, SpamAssassin kills off the filtering process (assuming it is stuck on filtering). As a result, a file called bayes_toks.expireXXXXX is not cleaned up (XXXXX is a process number).
  2. Normally, that file is around 5MB. It is a Berkeley DB file. On APFS, Berkeley DB is broken (I am not aware that it has been fixed yet in Mojave), with the result that Berkeley DB files may be huge. This is true, for example, for all Postfix DB files: my virtual_users.db is 155MB where it should be 16kB. As a result of this Berkeley DB/APFS problem, every time SpamAssassin kills off a spam filter action, a 5-8 GB bayes_toks.expireXXXXX file is left in the directory /Library/Server/Mail/Data/scanner/amavis/.spamassassin. As soon as this starts to happen, the disk fills up very fast. (On HFS+ this also happens, but more slowly, depending on the volume of mail traffic.)
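
To see whether leftover files are piling up, something like the following rough sketch could count them and total their disk usage. The function name report_expire_files is made up for this illustration and is not part of SpamAssassin; the directory is the one mentioned above, and reading it will probably require sudo.

```shell
#!/bin/sh
# Hypothetical helper: count leftover bayes_toks.expire* files in a directory
# and add up the disk space they occupy (in KiB, as reported by du -k).
report_expire_files() {
    dir="$1"
    find "$dir" -maxdepth 1 -type f -name 'bayes_toks.expire*' -exec du -k {} + |
        awk '{ count++; kib += $1 } END { printf "%d files, %d KiB\n", count, kib }'
}

# Example (directory taken from the text above):
# report_expire_files /Library/Server/Mail/Data/scanner/amavis/.spamassassin
```

If the count keeps growing between runs, the expiry process is probably being killed repeatedly.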


Additional problem if you use Time Machine


When you run Time Machine on your server boot disk, Time Machine makes local snapshots. This cannot be turned off these days. This means that the large files end up in Time Machine snapshots, which on APFS are kept using APFS's internal deduplication mechanism. Simply removing the bayes_toks.expireXXXXX files is thus not enough to free the space, because the snapshots still hold on to their data.
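
On a healthy volume, local snapshots can usually be listed and deleted by hand with tmutil. The sketch below assumes that 'tmutil listlocalsnapshots' prints names ending in a date stamp such as com.apple.TimeMachine.2019-03-05-062600 (the exact format may differ between macOS versions), and that 'tmutil deletelocalsnapshots' accepts that date stamp. The function names are invented for this example. It will not help once the file system is corrupted.

```shell
#!/bin/sh
# Extract the date stamps (YYYY-MM-DD-HHMMSS) from `tmutil listlocalsnapshots`
# output read on stdin. Assumed name format:
# com.apple.TimeMachine.2019-03-05-062600
snapshot_dates() {
    grep -Eo '[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{6}'
}

# Delete every local snapshot of the given mount point (macOS only, needs root).
delete_local_snapshots() {
    tmutil listlocalsnapshots "$1" | snapshot_dates | while read -r stamp; do
        tmutil deletelocalsnapshots "$stamp"
    done
}

# Example, run as root on the server itself:
# delete_local_snapshots /
```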


Potential corruption as a result


When the disk gets full and APFS deduplication/snapshotting is in use, the APFS file system can (easily?) become corrupted. When that happens, you are in deep trouble, because you cannot free any room on the volume anymore. Not at all. Your system will not function. You cannot delete the local snapshots. Even when the Mac is started in Target Disk Mode and accessed from another Mac, you cannot remove them. Even if you disable SIP you will not be able to. And even if you are able to free up some space by removing things that are not in the local snapshots, you may run into the weird situation that a command-line rm (remove file) on a bayes_toks.expire file fails with 'no space left on device'. The only thing you can do is start over: fully rebuild your server's boot disk from backups.
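
Since a full disk is what triggers all of this, one option is to watch free space and react before it runs out. This is only a minimal sketch: has_free_space is a hypothetical name, and the threshold and what to do when it fails (send a mail, stop the mail services) are left open.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the file system holding the given
# path has at least the given number of KiB available, fail otherwise.
# Usage: has_free_space PATH MIN_KIB
has_free_space() {
    # df -P prints one header line and one data line; field 4 is "Available".
    avail=$(df -P -k "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail" -ge "$2" ]
}

# Example: warn when less than about 50 GB is free on the boot volume.
# has_free_space / 52428800 || echo "boot volume is running low on space"
```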


Recovery.


As my APFS boot volume was corrupted, I had to rebuild my server. I was lucky in that I was able to clone my broken boot disk to another disk (HFS+). Then I deleted the APFS volume, erased the boot disk and formatted it as HFS+. I installed a fresh copy of macOS High Sierra using an installer on a USB stick. I used the cloned disk to migrate the data back. This was enough to get the system working again.


Prevention.


First, APFS in combination with Berkeley DB and Time Machine is not what I want. I used a special trick to prevent the macOS High Sierra install from converting the HFS+ volume to an APFS volume. This means that my Berkeley DB files do not blow up in size: for instance, running 'postmap virtual_users' reduced virtual_users.db from 155MB to 16kB. Now, if SpamAssassin times out and leaves junk, it grows at 5MB per attempt, not 8GB. I sincerely hope that any next OS version update (e.g. 10.13.7 if that ever arrives) does not do the APFS conversion anyway. I also hope that Apple finally publishes the instructions to move away from the Mail Services in macOS Server (something promised almost a year ago and still not kept). As long as they do not publish that, I'm stuck with this aging Mac mini with no possibility to buy a new one.


Second, I edited /Library/Server/Mail/Config/spamassassin/local.cf and added


# GW
# Disable auto-expiry. Leaving it enabled may lead to lots of
# bayes_toks.expireXXXXX files filling up the disk (especially on APFS
# where DB files are gargantuan).
# Setting this requires running sa-learn --force-expire once in a while
# e.g. via a cron entry.
#
bayes_auto_expire 0
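
A quick way to confirm that the setting is really in the file could look like this. It is only a sketch: the function name is invented, it looks at a single file, and it assumes that the last bayes_auto_expire line in that file is the one that counts.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last bayes_auto_expire line in a
# SpamAssassin config file, or "unset" if the file has no such line.
# Commented-out lines are ignored because their first field starts with '#'.
bayes_auto_expire_setting() {
    awk '$1 == "bayes_auto_expire" { value = $2 }
         END { print (value == "" ? "unset" : value) }' "$1"
}

# Example (path taken from the text above):
# bayes_auto_expire_setting /Library/Server/Mail/Config/spamassassin/local.cf
```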


And I added a cron job for user _amavisd


sudo mkdir /var/log/spamassassin
sudo chown _amavisd:admin /var/log/spamassassin
sudo crontab -e -u _amavisd


and enter the line


2 2 * * * /Applications/Server.app/Contents/ServerRoot/usr/bin/sa-learn --force-expire >>/var/log/spamassassin/expire.log 2>&1


SpamAssassin will now not try to expire the database as a side effect of filtering a mail message. Instead, expiry will be done once a day at 2:02 AM. You can add a -D flag to sa-learn to see some more detail. Note that this log file also grows over time.
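
To keep that log from growing without bound, it could be trimmed now and then, for example from a second cron entry. The following is a minimal sketch, not something macOS Server ships: trim_log is a hypothetical name and the default of 1000 lines is arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: keep only the last N lines of a log file (default 1000).
# The file is rewritten in place, so its owner and permissions stay the same.
# Usage: trim_log FILE [LINES]
trim_log() {
    log="$1"
    keep="${2:-1000}"
    [ -f "$log" ] || return 0      # nothing to do if the log does not exist yet
    tmp="$log.tmp.$$"
    tail -n "$keep" "$log" > "$tmp" && cat "$tmp" > "$log"
    rm -f "$tmp"
}

# Example (log path taken from the cron line above):
# trim_log /var/log/spamassassin/expire.log 1000
```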

Mar 5, 2019 7:26 AM in response to Gerben Wierda

Because TimeMachine keeps local snapshots (which you cannot stop it from doing)...


Launch Terminal on your Mac laptop.

Enter the following command into Terminal.


sudo tmutil disablelocal


Press Enter.


This will remove those local snapshots from your internal storage. You can then re-enable the feature.


Open Terminal on your Mac laptop.

Enter the following command into Terminal:


sudo tmutil enablelocal


Press Enter.

Time Machine will start over, saving local snapshots to your Mac laptop's internal storage.

Mar 6, 2019 1:45 AM in response to BDAqua

Basically, the disablelocal/enablelocal subcommands have not been in tmutil for a while now. They are not in macOS Mojave and not in macOS High Sierra.


(Apple assumes you are running APFS and that local snapshots do not take up a lot of space on the disk. This assumption is not robust under many scenarios.)


(Blindly giving an answer that was true in the past might get you extra points (or this is a bot), but this answer is wrong)

Mar 6, 2019 9:55 AM in response to Gerben Wierda


Gerben Wierda wrote:
disablelocal/enablelocal subcommand have not been in tmutil for a while now.

(Blindly giving an answer that was true in the past might get you extra points (or this is a bot), but this answer is wrong)


Thanks for the post.


Like many I am here continually learning— I enjoy this aspect, it is ongoing and endless;


suffering through comments like your last statement seemed to me highly unnecessary and self-serving.


The 24 minutes between your original post and your solved post may also be revealing.


Sharing your insights is always welcomed.
