Erik posted his SpamSieve 2.0 stats, and I’m equally impressed. I’m still getting more false positives than I’d like, but I’m sure that will improve as long as I’m diligent about telling SpamSieve what it’s doing wrong through the scripts. What I *love* about SpamSieve is that it really learns from its mistakes. Once you tell it that something is spam, it treats it as spam. More importantly, when you tell it about a false positive, it doesn’t make the same mistake again.
My stats since September 10, 2003 (SpamSieve 2.0):
Good messages: 2333
Spam messages: 1458
False positives: 134
False negatives: 31
Correct: 95.6%
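For anyone curious how the "Correct" figure relates to the other numbers, here is a minimal sketch. It assumes the percentage is simply the share of all messages (good plus spam) that were not misclassified. That assumption is mine, not something taken from SpamSieve's documentation, but it does reproduce the 95.6% above. The variable names are made up for illustration.

```python
# Figures copied from the stats above.
good = 2333
spam = 1458
false_positives = 134   # good messages wrongly marked as spam
false_negatives = 31    # spam messages that slipped through

# Assumption: "Correct" = messages classified correctly / all messages.
total = good + spam
errors = false_positives + false_negatives
accuracy = (total - errors) / total

# Per-class error rates, which show the false positive problem more directly.
fp_rate = false_positives / good   # share of good mail flagged as spam
fn_rate = false_negatives / spam   # share of spam that was missed

print(f"Correct: {accuracy:.1%}")
print(f"False positive rate: {fp_rate:.1%}")
print(f"False negative rate: {fn_rate:.1%}")
```

Looking at the per-class rates rather than the overall figure makes it clearer why the false positives bother me more than the headline number suggests.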
Michael Tsai, the author of SpamSieve, commented on Erik’s blog that the reason for the false positives is that he has more spam than good messages in his corpus. I’m looking at mine and I see that I have 190 good messages and 1458 spam. Ah ha! I went through every folder of saved mail I have and told SpamSieve that it’s all good (over 2000 messages’ worth), and we’ll see if the false positive rate goes down. I’m checking my spam folder often, and it would be nice to go back to doing it only every once in a while (if at all). Part of the problem is that I had to start a fresh corpus when I set up the G5.
