This is really a bit of a me-too article, but I thought it worth summarising a modest Python success story. My hosting provider offers IMAP access and allows me to set up my own cron and procmail configuration. I use Thunderbird on several (Windows) machines and very occasionally Squirrelmail or even mutt if that’s all the access I’ve got. I’ve advertised my mail at timgolden address pretty widely and I’m not at all surprised to be receiving a few hundred spams every day.

I suppose everyone has their own way of coping with spam. I’ve been using Spambayes for quite a while via a procmail filter, but the bsddb database kept corrupting during training (a known but unsolved issue, it seems). In the end I just left hammie.db in its last known state, without retraining, and carried on as best I could, clearing out my Inbox every few days. Then all of a sudden I seemed to get onto someone’s list and the situation became unmanageable. So… back to Spambayes to see if I could find a solution.

Well, the result was a fresh install of Spambayes (from svn, fwiw), a slight reshuffling of my folders, and the use of Menno Smits’ recently rehoused imapclient lib. I specified a pickle database this time, since it seems to be less prone to corruption and the volumes I’m dealing with aren’t high. The whole process is as follows:

The result is remarkable: Spambayes very quickly identifies ham/spam pretty much 100% correctly; I haven’t had any database corruptions so far (about a week now); and I’ll pretty soon ignore the Spam folder and drop anything spambayes calls spam into /dev/null. It’s a little risky, but life is short and my experience is that Spambayes very rarely gets it wrong.
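To make the "drop anything Spambayes calls spam" idea concrete, here is a minimal Python sketch of reading the verdict off a filtered message. It is not part of the setup described here, where procmail does the routing. It assumes the filter stamps each message with an `X-Spambayes-Classification` header whose value starts with `ham`, `spam` or `unsure`, optionally followed by `;` and a score; check that against your own configuration. The function name `classification` is made up for illustration.

```python
from email import message_from_bytes

# Assumed header name; adjust if your Spambayes options use a different one.
HEADER = "X-Spambayes-Classification"


def classification(raw_message: bytes) -> str:
    """Return the leading word of the classification header, lowercased.

    Falls back to "unsure" when the header is missing or empty, so an
    unfiltered message is never treated as spam by accident.
    """
    msg = message_from_bytes(raw_message)
    value = msg.get(HEADER) or "unsure"
    # The value may carry a score after a semicolon, e.g. "spam; 0.99".
    verdict = value.split(";", 1)[0].strip().lower()
    return verdict or "unsure"
```

Treating a missing header as "unsure" rather than "ham" or "spam" is the cautious choice if the spam bucket really is /dev/null.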

The use of the imapclient lib was new this time round (the rest of the process was only very slightly tweaked from its previous incarnation), and it means less for me to check: I just copy or move an email to to-ham or to-spam and forget about it.
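For anyone wanting to try something similar, the to-ham/to-spam step could be sketched roughly as below. This is an illustration under assumptions, not the script actually used here. It pulls every message out of an IMAP folder, appends it to a local mbox file (which a Spambayes training script could then be pointed at), and empties the folder. The function name `drain_folder_to_mbox`, the mbox paths and the host in the usage comment are all hypothetical. The `client` argument is expected to behave like an `imapclient.IMAPClient`, using only its `select_folder`, `search`, `fetch`, `delete_messages` and `expunge` methods.

```python
import mailbox


def drain_folder_to_mbox(client, folder, mbox_path):
    """Save every message in `folder` to a local mbox, then clear the folder.

    Returns the number of messages saved.
    """
    client.select_folder(folder)
    uids = client.search(["ALL"])
    if not uids:
        return 0

    # fetch() returns a mapping of uid -> {b"RFC822": raw message bytes, ...}
    fetched = client.fetch(uids, ["RFC822"])

    box = mailbox.mbox(mbox_path)  # created if it does not exist yet
    try:
        for uid in uids:
            box.add(fetched[uid][b"RFC822"])
        box.flush()
    finally:
        box.close()

    # Only remove the originals once they are safely on disk.
    client.delete_messages(uids)
    client.expunge()
    return len(uids)


# Hypothetical usage; host, credentials and paths are placeholders:
#
#   from imapclient import IMAPClient
#   client = IMAPClient("imap.example.com", ssl=True)
#   client.login("user", "password")
#   drain_folder_to_mbox(client, "to-spam", "/tmp/to-spam.mbox")
#   drain_folder_to_mbox(client, "to-ham", "/tmp/to-ham.mbox")
#   client.logout()
```

Writing to disk before deleting means a failure part-way through leaves the messages on the server, so nothing queued for training is lost.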

One small thing which came out of this was that I discovered I could have folders within folders on IMAP. I was sure I’d tried it previously and failed with some obscure error. This time, though, Thunderbird just told me that I could have either a folder-only folder or a mail-only folder, and created it quite happily. I rely heavily on the Nostalgy add-in to Thunderbird: it means I can have a full-width two-pane display without the folder tree and still move things easily from folder to folder.

In short, a couple of Python libs, Spambayes and imapclient, coupled with the ubiquitous procmail, and I’ve got a very functional spam filter in place.

Notes: