sa-learn – Train SpamAssassin

SpamAssassin is one of the best open-source anti-spam filters. Its powerful scoring framework not only catches spam but also lets you train it by feeding it spam and ham emails. Integrations with several setups are available in the form of plugins. I am using it on my email server to catch the sneaky spam emails that turn up out of nowhere.

sa-learn is the utility used to train SpamAssassin’s Bayesian classifier. There are several ways to train it. One of the classic approaches is a plugin in the webmail client. Personally I prefer to do things through the CLI (command-line interface), but it is handy to feed it the spam emails that sneak past SpamAssassin just by pressing the spam button. I usually don’t get much spam, but I still tried training it for learning purposes.
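For a feel of what training looks like from the CLI, here is a minimal sketch of teaching the classifier a single message. The file path in `MSG` and the `learn_one` helper are my own placeholders for illustration, not part of SpamAssassin; only the `sa-learn` options `--spam`, `--ham` and `--dump magic` come from the tool itself. The helper skips quietly if sa-learn or the file is missing.

```shell
#!/bin/sh
# Sketch: teach the Bayes classifier one message at a time.
# MSG is a hypothetical path to a raw email file saved from the mailbox.
MSG="${MSG:-$HOME/missed-spam.eml}"

learn_one() {
    # $1 is the class flag (--spam or --ham), $2 is the message file.
    if command -v sa-learn >/dev/null 2>&1 && [ -f "$2" ]; then
        sa-learn "$1" "$2"
    else
        echo "skipped: sa-learn $1 $2"
    fi
}

# Mark a spam message that slipped through as spam.
learn_one --spam "$MSG"

# Print the Bayes database summary, including the learned
# spam (nspam) and ham (nham) message counts.
if command -v sa-learn >/dev/null 2>&1; then
    sa-learn --dump magic
fi
```

The counts matter because, with default settings, SpamAssassin only starts using the Bayes results once it has learned at least 200 spam and 200 ham messages.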

To set up sa-learn in Roundcube, there is a plugin available, MarkasJunk2, which I wrote about earlier. It lets you mark any email as spam or as ham (not spam). You can set this up to train the classifier.

You can also train it through CLI commands. The sa-learn documentation is detailed and comes in handy. There are many spam databases available; you can Google them. I used a database from Art Invoice, which they compiled over the years. There are more than 1 million emails in that database. You can also give it a shot to keep spam out of your inbox. In the coming days I will try to write an article about setting up MarkasJunk2 and manually training sa-learn.
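Bulk training from a downloaded corpus could look roughly like the sketch below. The corpus locations and the `bulk_learn` helper are assumptions of mine for illustration; I don't know how the Art Invoice database is laid out on disk, so the sketch simply handles both a directory of message files and a single mbox file. The options used (`--spam`, `--ham`, `--mbox`, `--no-sync`, `--sync`) are documented sa-learn options.

```shell
#!/bin/sh
# Sketch: train from a large corpus in one go.
# Both paths are hypothetical; point them at wherever the corpus lives.
SPAM_CORPUS="${SPAM_CORPUS:-$HOME/corpus/spam}"
HAM_MBOX="${HAM_MBOX:-$HOME/corpus/ham.mbox}"

bulk_learn() {
    # $1 is --spam or --ham, $2 is a directory of messages or an mbox file.
    if [ -d "$2" ]; then
        set -- --no-sync "$1" "$2"
    elif [ -f "$2" ]; then
        set -- --no-sync --mbox "$1" "$2"
    else
        echo "no corpus at $2"
        return 0
    fi
    if command -v sa-learn >/dev/null 2>&1; then
        sa-learn "$@"
    else
        echo "would run: sa-learn $*"
    fi
}

bulk_learn --spam "$SPAM_CORPUS"
bulk_learn --ham "$HAM_MBOX"

# --no-sync above defers the database sync while learning,
# so sync once at the end.
if command -v sa-learn >/dev/null 2>&1; then
    sa-learn --sync
fi
```

Training on ham as well as spam is worth the effort, since the classifier needs examples of both to tell them apart.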

Thanks to Art Invoice for compiling that database.