Spam Filter in Windows Vista’s Windows Mail
We chose a user interface that looks very similar to Outlook's. This lets people who already know one product use the other without having to learn anything new. That familiarity helps with features such as the following:
- High, Low, and Off settings
- List of safe senders and safe recipients
- List of blocked senders
- List of blocked domains
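As a rough illustration of how lists like these can be consulted before the filter itself runs, here is a minimal Python sketch. The names `SenderLists` and `list_verdict` are hypothetical, and the precedence (safe senders win over blocked entries) is an assumption for the example, not a description of how Windows Mail orders its checks.

```python
from dataclasses import dataclass, field


@dataclass
class SenderLists:
    """Hypothetical container for the user-maintained lists (stored lowercase)."""
    safe_senders: set = field(default_factory=set)
    blocked_senders: set = field(default_factory=set)
    blocked_domains: set = field(default_factory=set)


def list_verdict(sender, lists):
    """Return 'safe', 'blocked', or 'unknown' for a sender address.

    Assumption: the safe list is checked first, so a safe sender is never
    blocked by a domain entry. 'unknown' means the filter decides.
    """
    address = sender.strip().lower()
    domain = address.rpartition("@")[2]  # text after the last '@'
    if address in lists.safe_senders:
        return "safe"
    if address in lists.blocked_senders or domain in lists.blocked_domains:
        return "blocked"
    return "unknown"


lists = SenderLists(
    safe_senders={"friend@example.org"},
    blocked_senders={"ads@example.net"},
    blocked_domains={"example.org"},
)
print(list_verdict("Friend@Example.org", lists))
print(list_verdict("stranger@example.org", lists))
```

In this sketch, mail from an address on neither list falls through to the spam filter, whose strictness the High, Low, and Off settings would control.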
The magic of spam filtering is always in the filter. Microsoft has a team dedicated to building a spam filter and continually improving it. This filter has shipped in Outlook, runs on Hotmail's servers, and will now ship in Windows Vista for Outlook Express.
Here is how the spam filter works:
- A subset of Hotmail users have agreed to help fight spam. Every so often, an email sent to one of these users is taken as a sample, and the user is asked whether the message is spam or normal email. These categorized emails are then used to train the filter.
- Microsoft Research came up with the learning algorithm that works best for identifying spam. A Bayesian filter was also implemented, but it proved less effective than the algorithm that was finally chosen.
- The training process creates the filter data file. The filter data file will be updated through Microsoft Update in order to stay current.
- The version of the filter in Windows Vista is a little newer than the one in Outlook.
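To make the train-then-ship idea concrete, here is a small Python sketch of the pipeline shape: user-labeled messages go in, a data file comes out, and the client scores mail using only that file. The algorithm we actually ship is not described here, so the sketch substitutes a simple naive Bayes word model, which is the kind of Bayesian baseline that was tried and found less effective. The function names, the whitespace tokenizer, and the JSON format are all assumptions made for illustration.

```python
import json
import math
from collections import Counter


def train(labeled_messages):
    """Build filter data from (text, is_spam) pairs labeled by users."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in labeled_messages:
        counts[is_spam].update(text.lower().split())
        totals[is_spam] += 1
    return {
        "spam_words": dict(counts[True]),
        "ham_words": dict(counts[False]),
        "spam_messages": totals[True],
        "ham_messages": totals[False],
    }


def spam_score(text, data):
    """Return naive Bayes log-odds; a positive score is more spam-like."""
    vocab_size = len(set(data["spam_words"]) | set(data["ham_words"]))
    spam_total = sum(data["spam_words"].values())
    ham_total = sum(data["ham_words"].values())
    # Class prior with add-one smoothing.
    score = math.log((data["spam_messages"] + 1) / (data["ham_messages"] + 1))
    for word in text.lower().split():
        # Per-word likelihoods, also add-one smoothed.
        p_spam = (data["spam_words"].get(word, 0) + 1) / (spam_total + vocab_size)
        p_ham = (data["ham_words"].get(word, 0) + 1) / (ham_total + vocab_size)
        score += math.log(p_spam / p_ham)
    return score


samples = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("lunch meeting tomorrow", False),
    ("project notes attached", False),
]
# Server side: train and serialize (a stand-in for the filter data file).
filter_file = json.dumps(train(samples))
# Client side: load the shipped data and score incoming mail.
client_data = json.loads(filter_file)
print(spam_score("free money", client_data) > 0)
```

The point of the split is that the client never needs its own training mail: it only needs a fresh copy of the data file.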
We considered training on the client, but we found it much more effective to train on the server. Here are a few interesting facts we learned in the war on spam:
- The first version to make it out of beta shipped in MSN Explorer in 2001. This version trained on the client, but it did not have enough data to become as accurate as we wanted. Spam filtering only becomes reliably accurate after training on several thousand emails. A home user receiving about 10 emails a day would need a long time to reach that point, and waiting a year for the filter to start working is unacceptable.
- Server training allows us to build DNS blackhole lists.
- A honeypot is an important technique for a good spam filter. It uses decoy email accounts to catch spam and then block those emails. The problem with honeypots is that spammers try to avoid them. By using customers to categorize spam, we get the benefit of honeypots while avoiding their downsides.
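For readers unfamiliar with DNS blackhole lists, the sketch below shows the common lookup convention: reverse the octets of the sending IPv4 address, append the list's zone, and treat any successful DNS answer as "listed". It says nothing about how our lists are built or consulted. The zone `dnsbl.example.org` is a placeholder, and `dnsbl_query_name` and `is_listed` are hypothetical helper names.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query for an IPv4 address against a blackhole list."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # validates the address
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the list has an entry for ip.

    A name that fails to resolve means "not listed". The resolver is a
    parameter so the logic can be exercised without network access.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


# Placeholder zone; a real deployment would use an actual list's zone.
print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
```

Because the lookup is an ordinary DNS query, results are cached by resolvers and the check is cheap enough to run on every incoming connection.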
TIPS: Anyone using this filter in Outlook should go to the Office Update web site to make sure they have the newest version of the filter. Outlook's filter is much more effective with the newest filter file downloaded. In Windows Vista, our goal is for the filter to be downloaded automatically.