3 Replies Latest reply: Jan 2, 2013 6:22 AM by pterobyte
Gavin Lawrie Level 2 (395 points)

Although it doesn't appear in Lion Server documentation, I'm assuming that Lion Server still uses the "sa-learn" tool to train the junk mail filter (page 36 of Snow Leopard Mail Services Administration manual).


If so, I wonder whether anyone has found a way to automate the collection of spam / ham messages from user accounts.  The suggested approach (the user redirects mail to either junkmail@xxx.com or notjunkmail@xxx.com) relies on a fairly complex action by the user, so it is unlikely to be used when you have dozens of spam messages.


A simpler system would be to designate "ham" and "spam" folders within the user's IMAP folder tree (e.g. as sub-folders below Junk) and simply get users to copy / move messages to the appropriate folder where learning is required (so if you find a spam message in your inbox, you simply move it to the 'spam' folder, where it gets picked up by SpamAssassin and used to train the filter).


This is not a new idea (our last email server implemented a similar system).


I was wondering if anyone knows whether it is possible in Lion Server's mail implementation, and if so how...


I've had three ideas, none of which I can see how to make work:


  • Create a local filter that redirects messages found in a folder to the appropriate sa-learn mailbox (problem: no way to get filters to run daily)
  • Edit the script that captures the content of the junkmail / notjunkmail accounts and extend it to cover all the relevant mail folders in the stack (not sure where the script is, or how to add a suitable search term to find the right folders across a generic stack). sa-learn will apparently take a filename with a list of folders to use as input, so perhaps there is a way to use 'find' or some such to locate the right files?
  • Create a shell script that finds the messages within the mail store and moves them to the appropriate folders (i.e. outside Dovecot), and link it to a cron job so that it runs before the sa-learn training happens (if it happens). I'm not sure how to find the right folders, or how Dovecot will respond to messages moving outside its control.
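
For what it's worth, the "find the right folders" part of the second and third ideas might look something like the Python sketch below. It is only a sketch under assumptions: it supposes the mail store is in Maildir++ format, so that an IMAP folder "Junk/spam" appears on disk as a directory named ".Junk.spam" containing a "cur" subdirectory. The folder names, the mail store path and the function names are all made up for illustration and would need checking against the real server. It only builds the sa-learn command lines (using sa-learn's --spam and --ham options) and does not run them.

```python
import os
from pathlib import Path

# Hypothetical folder names: in a Maildir++ layout the IMAP folder
# "Junk/spam" would be a directory called ".Junk.spam". Adjust as needed.
SPAM_FOLDER = ".Junk.spam"
HAM_FOLDER = ".Junk.ham"


def find_training_folders(mail_root):
    """Walk mail_root and return (spam_dirs, ham_dirs) as sorted path lists.

    A directory only counts as a mail folder here if it contains a
    'cur' subdirectory; anything else with a matching name is skipped.
    """
    spam_dirs, ham_dirs = [], []
    for dirpath, dirnames, _files in os.walk(mail_root):
        for name in dirnames:
            folder = Path(dirpath) / name
            if not (folder / "cur").is_dir():
                continue
            if name == SPAM_FOLDER:
                spam_dirs.append(str(folder))
            elif name == HAM_FOLDER:
                ham_dirs.append(str(folder))
    return sorted(spam_dirs), sorted(ham_dirs)


def build_sa_learn_commands(spam_dirs, ham_dirs):
    """Return one sa-learn argument list per folder, without running anything."""
    commands = [["sa-learn", "--spam", d] for d in spam_dirs]
    commands += [["sa-learn", "--ham", d] for d in ham_dirs]
    return commands


if __name__ == "__main__":
    # "/path/to/mailstore" is a placeholder, not the real Lion Server location.
    spam, ham = find_training_folders("/path/to/mailstore")
    for cmd in build_sa_learn_commands(spam, ham):
        print(" ".join(cmd))  # hand cmd to subprocess.run to actually train
```

Printing the commands first makes it easy to check which folders would be picked up before anything touches the Bayes database; a cron job could then run the same script with the print replaced by a real call.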


Anyone got any ideas?  Or is there a better way to do this?

Mac Pro, Mac OS X (10.7.3), 9Gbytes - OS X Server
  • UptimeJeff Level 4 (3,390 points)

    Agreed. There needs to be a simple method for users to drag mail for training.

    I've tried several different methods and haven't settled on one.


    At a few sites, every user has a junk/notjunk folder, and I run a script that calls sa-learn against every user's junk/notjunk folder and deletes mail older than x days from those folders (to make it self-cleaning).
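
The self-cleaning step described above could be sketched roughly as follows in Python. This is not the script from the post, just an illustration under assumptions: it treats a folder as a Maildir with "cur" and "new" subdirectories, and it uses each file's modification time as a stand-in for message age, which may not match the date the message was actually received. The function names are invented for the example, and dry_run defaults to True so that nothing is deleted by accident.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def expired_messages(folder, max_age_days, now=None):
    """Return a sorted list of message files older than max_age_days.

    Age is judged by file modification time (an assumption; see above).
    Only the 'cur' and 'new' subdirectories of the folder are examined.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    found = []
    for sub in ("cur", "new"):
        subdir = Path(folder) / sub
        if not subdir.is_dir():
            continue
        for path in subdir.iterdir():
            if path.is_file() and path.stat().st_mtime < cutoff:
                found.append(path)
    return sorted(found)


def purge(folder, max_age_days, dry_run=True, now=None):
    """Delete expired messages unless dry_run is set; return what matched."""
    victims = expired_messages(folder, max_age_days, now=now)
    if not dry_run:
        for path in victims:
            path.unlink()
    return victims
```

Running purge with dry_run left at True first, and checking the returned list, is a cheap way to confirm the age cutoff behaves as expected before letting it delete anything.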


    I'm going to change this system, though. I'm leaning towards using doveadm to move the contents of user junk/notjunk folders to one junk and one notjunk folder, then using spamtrainer to learn from those mailboxes.


    I can share some code if you'd like to try this on your own...



  • Gavin Lawrie Level 2 (395 points)

    Hi Jeff


    Thanks for the comments.  The doveadm idea sounds like a good one.  I would be very interested in trying something out that works along these lines.  Let me know how best I can help.



  • pterobyte Level 6 (10,910 points)

    Hi Gavin,


    Jeff and I will be testing an automated script for this in February. If you'd like to help with testing, just drop either of us a note.