Root Cause.
- macOS Server's mail services include virus and spam scanning. Spam scanning is done with SpamAssassin, which uses a Bayesian algorithm. macOS Server's standard SpamAssassin configuration file, /Library/Server/Mail/Config/spamassassin/local.cf, has no setting for auto-expiry of the Bayes database, so bayes_auto_expire keeps its default value of 1. This means that every message filtered may trigger the expiry process. The standard time_limit for filtering is 300 seconds. As soon as filtering plus expiry for a message exceeds that limit, SpamAssassin kills the filtering process (assuming it is stuck). As a result, a file called bayes_toks.expireXXXXX (where XXXXX is a process number) is not cleaned up.
- Normally, that file is around 5MB. It is a Berkeley DB file. On APFS, Berkeley DB is broken (I am not aware that it has been fixed yet in Mojave). The result is that Berkeley DB files may be huge. This is true for all postfix DB files, for example: my virtual_users.db is 155MB where it should be 16kB. As a result of this Berkeley DB/APFS problem, every time SpamAssassin kills off a spam filter action, a 5-8 GB bayes_toks.expireXXXXX file is left in the directory /Library/Server/Mail/Data/scanner/amavis/.spamassassin. As soon as this starts to happen, the disk fills up very fast. (On HFS+ this will also happen, but more slowly, depending on the volume of mail traffic.)
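To see whether this is happening on your server, you can look for leftover expiry files. The following is a minimal sketch, not something from the setup above: the function name list_stale_expire_files and the 10-minute age threshold are my own assumptions (the threshold is meant to skip a file that belongs to an expiry run still in progress). It only lists files; it deletes nothing.

```shell
#!/bin/sh
# Directory taken from the description above; override SA_DIR if yours differs.
SA_DIR="${SA_DIR:-/Library/Server/Mail/Data/scanner/amavis/.spamassassin}"

# Hypothetical helper: print bayes_toks.expire* files in directory $1
# that were last modified more than $2 minutes ago (default 10).
list_stale_expire_files() {
    dir="$1"
    min_age="${2:-10}"
    find "$dir" -maxdepth 1 -type f -name 'bayes_toks.expire*' -mmin +"$min_age" -print
}

# Show the size in kB of each leftover file, if the directory exists.
if [ -d "$SA_DIR" ]; then
    list_stale_expire_files "$SA_DIR" 10 | while IFS= read -r f; do
        du -k "$f"
    done
fi
```

Run it with sudo, as the directory is normally not readable by ordinary users.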
Additional problem if you use Time Machine
When you run Time Machine on your server boot disk, Time Machine makes local snapshots. This cannot be turned off these days. This means that the large files end up in Time Machine snapshots, which on APFS are kept using APFS's internal deduplication mechanism. Simply removing the bayes_toks.expireXXXXX files is thus not enough to free the space.
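While the volume is still healthy, the local snapshots can be listed and deleted with tmutil (listlocalsnapshots and deletelocalsnapshots, available since High Sierra). The sketch below is an illustration under assumptions: the helper name snapshot_dates is mine, and it assumes that tmutil prints snapshot names of the form com.apple.TimeMachine.2018-05-01-123456. It prints the delete commands instead of running them, so you can check them first.

```shell
#!/bin/sh
# Hypothetical helper: read 'tmutil listlocalsnapshots' output on stdin and
# print only the date stamp of each snapshot, e.g. 2018-05-01-123456.
# Lines that do not look like a snapshot name are ignored.
snapshot_dates() {
    sed -n 's/^com\.apple\.TimeMachine\.\([0-9-]*\).*$/\1/p'
}

# Print (do not execute) one delete command per local snapshot of /.
if command -v tmutil >/dev/null 2>&1; then
    tmutil listlocalsnapshots / | snapshot_dates | while IFS= read -r d; do
        echo "sudo tmutil deletelocalsnapshots $d"
    done
fi
```

Space held by a snapshot only comes back once every snapshot that references the large files is gone.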
Potential corruption as a result
When the disk fills up while APFS deduplication/snapshotting is in use, the APFS file system can (easily?) become corrupted. When that happens, you are in deep trouble, because you can no longer free any space on the volume. Not at all. Your system will not function. You cannot delete the local snapshots. Even when the machine is started in Target Disk Mode and accessed from another Mac, you cannot remove them. Even if you disable SIP you will not be able to. And even if you manage to free up some space by removing things that are not in the local snapshots, you may run into the weird situation that a command-line rm (remove file) on a bayes_toks.expire file fails with 'no space left on device'. The only thing you can do is start over: fully rebuild your server's boot disk from backups.
Recovery.
As my APFS boot volume was corrupted, I had to rebuild my server. I was lucky in that I was able to clone my broken boot disk to another disk (HFS+). Then I deleted the APFS volume, erased the boot disk and formatted it as HFS+. I installed a fresh copy of macOS High Sierra using an installer on a USB stick. I used the cloned disk to migrate the data back. This was enough to get the system working again.
Prevention.
First, APFS in combination with Berkeley DB and Time Machine is not what I want. I used a special trick to prevent the macOS High Sierra install from converting the HFS+ volume to an APFS volume. This means that my Berkeley DB files do not blow up in size. For instance, running 'postmap virtual_users' reduced virtual_users.db from 155MB to 16kB. Now, if SpamAssassin times out and leaves junk, it grows at 5MB per attempt, not 8GB. I sincerely hope that any next OS version update (e.g. 10.13.7 if that ever arrives) does not do the APFS conversion anyway. I also hope that Apple finally publishes the instructions to move away from the Mail Services in macOS Server (a promise made almost a year ago and still not kept). As long as they do not publish that, I'm stuck with this aging Mac mini with no possibility to buy a new one.
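To find out which Berkeley DB files are still bloated and worth rebuilding with postmap, something like the following sketch can help. The function name list_large_db_files, the 1024 kB threshold and the postfix directory /Library/Server/Mail/Config/postfix are assumptions on my part; adjust them to your setup.

```shell
#!/bin/sh
# Assumed location of the postfix lookup tables; override POSTFIX_DIR if needed.
POSTFIX_DIR="${POSTFIX_DIR:-/Library/Server/Mail/Config/postfix}"

# Hypothetical helper: print *.db files in directory $1 larger than $2 kB.
list_large_db_files() {
    dir="$1"
    min_kb="$2"
    find "$dir" -maxdepth 1 -type f -name '*.db' -size +"$min_kb"k -print
}

# Report lookup tables over 1 MB, which is far more than 16kB.
if [ -d "$POSTFIX_DIR" ]; then
    list_large_db_files "$POSTFIX_DIR" 1024
fi
```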
Second, I edited /Library/Server/Mail/Config/spamassassin/local.cf and added
# GW
# Disable auto-expiry. This may lead to lots of bayes_toks.expireXXXXX files
# filling up the disk (especially on APFS where DB files are gargantuan)
# Setting this requires running sa-learn --force-expire once in a while
# e.g. via a cron entry.
#
bayes_auto_expire 0
And I added a cron job for user _amavisd
sudo mkdir /var/log/spamassassin
sudo chown _amavisd:admin /var/log/spamassassin
sudo crontab -e -u _amavisd
and enter the line
2 2 * * * /Applications/Server.app/Contents/ServerRoot/usr/bin/sa-learn --force-expire >>/var/log/spamassassin/expire.log 2>&1
SpamAssassin will now no longer try to expire the database as a side effect of filtering a mail message. Instead, expiry will be done once a day at 2:02 AM. You can add a -D flag to sa-learn to see some more detail. Note that this log file also grows over time.
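To keep the log from growing forever, it can be rotated once it passes a certain size. This is a minimal sketch and not part of my actual setup: the function name rotate_if_large and the 1 MB limit are assumptions, and it keeps just one old copy (expire.log.1).

```shell
#!/bin/sh
# Hypothetical helper: if file $1 is larger than $2 bytes, move it to $1.1
# (overwriting any previous $1.1) and start a new empty log.
rotate_if_large() {
    log="$1"
    max_bytes="$2"
    [ -f "$log" ] || return 0
    # wc pads its output with spaces on macOS, so strip them.
    size=$(wc -c < "$log" | tr -d ' ')
    if [ "$size" -gt "$max_bytes" ]; then
        mv "$log" "$log.1"
        : > "$log"
    fi
}

# Rotate the expiry log once it exceeds 1 MB (1048576 bytes).
rotate_if_large /var/log/spamassassin/expire.log 1048576
```

If you save this as a script and call it from the same _amavisd crontab, the new log file is created with the right owner.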