
Clean-up SpamAssassin db files

By accident I found that some SpamAssassin files (auto-learning?) keep growing and growing; for example:

/var/amavis/.spamassassin root# ls -l
total 2275104
-rw------- 1 clamav clamav 1118257152 Aug 9 22:47 auto-whitelist
-rw------- 1 clamav clamav 144 May 4 10:09 auto-whitelist.lock.star1.local.16542
-rw------- 1 clamav clamav 93000 Aug 9 22:47 bayes_journal
-rw------- 1 clamav clamav 41598976 Aug 9 22:15 bayes_seen
-rw------- 1 clamav clamav 4894720 Aug 9 22:15 bayes_toks
-rw-r--r-- 1 clamav clamav 1175 Jan 3 2007 user_prefs

The auto-whitelist file alone is over 1 GB. I would generally like to keep auto-whitelisting enabled, so is there a setting that keeps its disk usage within an acceptable limit?

And second question:

I guess the following "files with random numbers" (dummy language 😉) can be deleted without losing anything. They seem to originate from server restarts or something similar (the server never crashed, though; I always shut it down the usual way whenever needed). I'm not sure, but shouldn't there be some kind of automatic cleanup that removes such temporary files?

/var/amavis root# ls -l
total 112
drwxr-x--- 16 clamav clamav 544 Aug 9 15:49 .razor
drwx------ 8 clamav clamav 272 Aug 9 22:47 .spamassassin
-rw------- 1 clamav clamav 13339 May 4 10:09 .spamassassin16516mr1BFXtmp
-rw------- 1 clamav clamav 13364 May 4 10:09 .spamassassin16542wwR2sutmp
-rw------- 1 clamav clamav 12112 Feb 3 2007 .spamassassin29700LQmsFJtmp
drwxr-x--- 4 clamav clamav 136 Nov 13 2006 amavis-20061113T165101-27553
drwxr-x--- 4 clamav clamav 136 Nov 13 2006 amavis-20061113T165217-28650
drwxr-x--- 4 clamav clamav 136 Jan 1 2007 amavis-20070101T183212-24902
drwxr-x--- 4 clamav clamav 136 Aug 9 22:29 amavis-20070809T222926-11934
drwxr-x--- 4 clamav clamav 136 Aug 9 22:30 amavis-20070809T223058-11962
-rw-r----- 1 clamav clamav 0 Oct 26 2006 amavisd.lock
-rw-r----- 1 clamav clamav 3 Aug 2 19:12 amavisd.pid
srwxr-x--- 1 clamav clamav 0 Aug 2 19:12 amavisd.sock
-rw-r----- 1 clamav clamav 173 Nov 20 2006 razor-agent.log
-rw-r--r-- 1 clamav clamav 3 Aug 22 2005 whitelist_sender
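
For the leftover temp files, here is a minimal sketch of how they could be listed before anything is deleted. It only prints matches and removes nothing. The function name `list_stale_sa_tmp` and the 7-day default are my own assumptions, not anything SpamAssassin or amavis provides; the idea is simply that a `.spamassassin*tmp` file untouched for more than a week is probably no longer in use.

```shell
# List leftover SpamAssassin temp files older than a given number of days.
# Usage: list_stale_sa_tmp DIR [DAYS]
# (hypothetical helper; DAYS defaults to 7, an arbitrary choice)
list_stale_sa_tmp() {
    dir=$1
    days=${2:-7}
    # -maxdepth 1 : look only in DIR itself, not in .spamassassin/ or amavis-* subdirectories
    # -type f     : regular files only, so the .spamassassin directory is never matched
    # -mtime +N   : last modified more than N days ago
    find "$dir" -maxdepth 1 -type f -name '.spamassassin*tmp' -mtime +"$days" -print
}

# Example: list_stale_sa_tmp /var/amavis 7
```

Once the printed list looks right, the same `find` expression could be rerun with `-exec rm {} \;` in place of `-print`.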

Mac OS X (10.4.10)

Posted on Aug 9, 2007 1:57 PM

6 replies

Aug 11, 2007 8:39 AM in response to tobias Eichner

Unfortunately nobody seems to have had time to reply so far... anyway, I made some progress on my own. Maybe someone else has had the same experience and is willing to share his/her knowledge:

In a discussion group I read that some earlier SpamAssassin packages ship a Perl script called check_whitelist. I found it in http://archive.apache.org/dist/spamassassin/Mail-SpamAssassin-3.0.3.tar.gz .

When I run the script as user "clamav", it appears to clean up the AWL database; at least, no error message was shown during the (long) run.

However, the AWL database file is still the same size, not a single byte smaller. Does anyone have an idea why the cleanup seems to have no effect?

I have now disabled AWL by commenting out the module in the /etc/mail/spamassassin/v310.pre configuration file and restarting the mail service. Are any additional steps necessary to disable AWL?

Is it safe to simply remove the large AWL file with the rm command as root, without hurting the mail server or a later use of AWL (would the file be rebuilt)?
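
To double-check that the plugin line really is commented out, a small sketch like the following could be used. The function name `awl_plugin_active` is made up for illustration; the only thing taken from SpamAssassin is the plugin name `Mail::SpamAssassin::Plugin::AWL` as it appears on the `loadplugin` line.

```shell
# Report whether the AWL plugin is loaded by a SpamAssassin .pre file.
# Usage: awl_plugin_active FILE   (exit status 0 = active, non-zero = not active)
# (hypothetical helper, not part of SpamAssassin)
awl_plugin_active() {
    # An active line starts with "loadplugin" (optionally indented);
    # a commented-out line starts with "#" and therefore does not match.
    grep -Eq '^[[:space:]]*loadplugin[[:space:]]+Mail::SpamAssassin::Plugin::AWL([[:space:]]|$)' "$1"
}

# Example:
#   if awl_plugin_active /etc/mail/spamassassin/v310.pre; then
#       echo "AWL is still enabled"
#   fi
```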

Aug 12, 2007 2:16 AM in response to smcnulty

Thanks for your reply 🙂

I considered this script some time ago already, but found that it is not suited for my requirements.

It would be interesting to know why the cleanup script did not shrink the AWL database, and whether it is safe to simply delete the file.

Anyway, I'll probably just give it a try after backing up the system, and then check whether SpamAssassin still works.

AWL seems to be a rather unfinished feature of SA, since it does not appear to come with any maintenance tool (besides the cleanup script I mentioned, which is hard to find).

also has a flag that will delete mail after its been learned and absorbed by
the Bayes DB- and has several options for playing with the bayes db


The Bayes database is separate from the AWL database, by the way, and SpamAssassin manages it quite well on its own (auto-learning, auto-expiry).

Aug 13, 2007 6:38 PM in response to tobias Eichner

Sorry Tobias, I was obviously thinking of the wrong DB.

There IS a setting in SA for a MAX size for the databases, and settings to auto-expire old tokens and whatnot...
Excerpt below, taken from http://spamassassin.apache.org/full/3.2.x/doc/MailSpamAssassinConf.html ; hope it helps:


bayes_journal_max_size (default: 102400)
SpamAssassin will opportunistically sync the journal and the database. It will do so once a day, but will sync more often if the journal file size goes above this setting, in bytes. If set to 0, opportunistic syncing will not occur.

bayes_expiry_max_db_size (default: 150000)
What should be the maximum size of the Bayes tokens database? When expiry occurs, the Bayes system will keep either 75% of the maximum value, or 100,000 tokens, whichever has a larger value. 150,000 tokens is roughly equivalent to a 8Mb database file.

bayes_auto_expire (default: 1)
If enabled, the Bayes system will try to automatically expire old tokens from the database. Auto-expiry occurs when the number of tokens in the database surpasses the bayes_expiry_max_db_size value.
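
To make the expiry rule in the excerpt concrete, here is a small arithmetic sketch. The function name `bayes_tokens_kept` is purely illustrative; it just evaluates "75% of the maximum value, or 100,000 tokens, whichever is larger" as quoted above, and is not how SpamAssassin itself is invoked.

```shell
# Tokens kept after a Bayes expiry run, per the rule quoted above:
# the larger of 75% of the configured maximum and 100,000 tokens.
# Usage: bayes_tokens_kept MAX_TOKENS
# (hypothetical helper for illustration only)
bayes_tokens_kept() {
    max=$1
    keep=$(( max * 75 / 100 ))      # 75% of the configured maximum
    if [ "$keep" -lt 100000 ]; then
        keep=100000                 # floor of 100,000 tokens
    fi
    echo "$keep"
}

# bayes_tokens_kept 150000   # prints 112500 (the default setting)
# bayes_tokens_kept 120000   # prints 100000 (75% would be 90000, so the floor applies)
```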

Aug 14, 2007 1:26 AM in response to smcnulty

Thanks for your reply 🙂

But it seems that you are referring to the wrong db again 😉 The auto-expiry of old tokens you mention applies to the Bayes database, not to the AWL one.

I wonder why the developers of software as renowned as SpamAssassin did not implement something similar for the AWL, but for whatever reason they didn't (or I'm too stupid to find the appropriate settings).

Anyway, I checked the link you sent me and found nothing specific to AWL; I also looked at the amavis configuration file (although I read that most spam settings there are ignored) and found nothing either.

Since I feel this is my lucky day, I will simply remove the AWL database file... if I stay lucky, SA and the server will still be working afterwards. And if not, I'm still lucky, since I have made a backup 😉

Regardless of the result, I'll let you know here. I'm not sure such a radical method is the best solution, but it seems to be the only one.

Aug 14, 2007 5:18 AM in response to tobias Eichner

So... I finally solved it by deleting the "auto-whitelist" file in the /var/amavis/.spamassassin directory. To be on the safe side, I recreated it with the appropriate permissions and owner. I noticed no effect on the mail service, so I think this is the procedure for keeping a growing whitelist db within acceptable limits.

Strangely, after deactivating AWL, I noticed that fewer false positives are coming through; if nothing changes for the worse over the next few days, I think I'll keep it deactivated.

All in all, it would still be interesting to know why the SpamAssassin cleanup script (mentioned in one of my previous posts) reported that it cleaned up the database but obviously didn't. I even ran it as root and there was no error message. Maybe it is because I have a version newer than 3.0.3 installed (the cleanup script does not seem to ship with later SA packages).

Maybe this helps people facing the same trouble... (though I guess it's not the most recommendable solution: if you want to keep AWL enabled, you lose every entry in the db).
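
For anyone wanting to repeat this, here is a rough sketch of the delete-and-recreate step. It is a sketch under assumptions, not a tested procedure for every setup: the function name `reset_awl` and the `.bak` backup name are my own inventions, the old file is moved aside rather than deleted outright, and it seems prudent to stop the mail service first so nothing has the file open. The mode 600 and the clamav owner are taken from the directory listing in the first post.

```shell
# Move the AWL database aside and create an empty replacement.
# Usage: reset_awl DIR [OWNER]
# (hypothetical helper; stop the mail service before running it)
reset_awl() {
    dir=$1
    owner=${2:-}
    db="$dir/auto-whitelist"
    [ -f "$db" ] || { echo "no auto-whitelist file in $dir" >&2; return 1; }
    # Keep the old database as a backup instead of deleting it right away
    mv "$db" "$db.bak" || return 1
    # Create an empty file that is readable and writable by its owner only
    ( umask 077 && : > "$db" ) || return 1
    chmod 600 "$db" || return 1
    # Changing the owner normally requires root, so it is optional here
    if [ -n "$owner" ]; then
        chown "$owner" "$db" || return 1
    fi
}

# Example (as root): reset_awl /var/amavis/.spamassassin clamav:clamav
```

The backup can be removed once the mail service has been running without problems for a while, which is also when the disk space is actually freed.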
