Well, I thought this sounded cool, so I spent an hour or two coding it up as a VB macro for Outlook.
There's now a "Delete Spam" button on my toolbar that moves the selected message to a "Spam" folder. There's an event handler that runs whenever a new message comes in, analyzes it, and, if it looks like spam, puts it in a "Probable Spam" folder. And there's a macro that analyzes all the messages in the "Spam" folder and all the messages in my Inbox to generate the hash table of word probabilities.
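For anyone who wants to try the same thing outside Outlook, here is a rough Python sketch of the analyzer and the incoming-mail check. It is not my VB macro. The tokenizer, the 0.01/0.99 clamping, the 0.4 score for unknown words, the 15-word cutoff, and the 0.9 threshold are all assumptions borrowed from the usual naive-Bayes style of spam filter, and the names `train`, `score`, and `route` are made up for illustration.

```python
import math
import re
from collections import Counter

# Assumed tokenizer: runs of letters, digits, and a few symbols, lowercased.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")

def tokenize(text):
    return [t.lower() for t in TOKEN_RE.findall(text)]

def train(spam_msgs, ham_msgs):
    """Build the word -> spam-probability table from two lists of message texts."""
    # Count each word once per message it appears in.
    spam_counts = Counter(t for m in spam_msgs for t in set(tokenize(m)))
    ham_counts = Counter(t for m in ham_msgs for t in set(tokenize(m)))
    n_spam = max(len(spam_msgs), 1)
    n_ham = max(len(ham_msgs), 1)
    probs = {}
    for word in set(spam_counts) | set(ham_counts):
        s = spam_counts[word] / n_spam   # fraction of spam containing the word
        h = ham_counts[word] / n_ham     # fraction of good mail containing it
        # Clamp so a single word can never be absolute proof either way.
        probs[word] = min(0.99, max(0.01, s / (s + h)))
    return probs

def score(text, probs, unknown=0.4, top_n=15):
    """Combine the most telling word probabilities into one spam score in [0, 1]."""
    ps = [probs.get(t, unknown) for t in set(tokenize(text))]
    if not ps:
        return 0.5  # nothing to go on
    # Keep the words whose probability is farthest from neutral (0.5).
    ps.sort(key=lambda p: abs(p - 0.5), reverse=True)
    ps = ps[:top_n]
    log_spam = sum(math.log(p) for p in ps)
    log_ham = sum(math.log(1.0 - p) for p in ps)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))

def route(text, probs, threshold=0.9):
    """Stand-in for the new-mail event handler: pick a destination folder."""
    return "Probable Spam" if score(text, probs) > threshold else "Inbox"
```

Calling `train` with the contents of the "Spam" folder and the Inbox gives the table, and `route` is what the new-mail handler would call on each incoming message.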
I made a quick pass through my deleted mail folder and used the "Delete Spam" button to move a representative sample of spam (250 messages) to the "Spam" folder; I didn't move all of it, just to save time. I then ran the analyzer to get an initial hash table. Next I scored the messages in my deleted mail folder, wrote the scores and subject lines to a text file, moved most of the spam that hadn't been flagged to the "Spam" folder, and re-ran the analyzer.
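The review step (dump scores and subject lines, then eyeball what slipped through) could look something like the sketch below. Again, this is an illustration and not the macro itself: `write_score_report` and `toy_score` are hypothetical names, the tab-separated layout is my own choice, and `toy_score` is only a placeholder for whatever scoring function you actually have.

```python
import io

def write_score_report(messages, score_fn, out):
    """Write 'score<TAB>subject' lines, highest score first, to a text stream.

    messages is a list of (subject, body) pairs; score_fn maps a body to a
    spam score between 0 and 1. Returns the sorted (score, subject) rows.
    """
    rows = sorted(((score_fn(body), subject) for subject, body in messages),
                  reverse=True)
    for s, subject in rows:
        out.write(f"{s:.3f}\t{subject}\n")
    return rows

def toy_score(body):
    # Placeholder scorer for the demo only; swap in the real one.
    return 0.99 if "cheap" in body.lower() else 0.01

demo = io.StringIO()
rows = write_score_report(
    [("Lunch?", "Are you free at noon?"),
     ("Great deal", "Cheap pills, buy now")],
    toy_score,
    demo,
)
```

Passing an open file instead of the `io.StringIO` buffer writes the same report to disk. Low-scoring lines with spammy subjects are the ones to move to the "Spam" folder before re-running the analyzer.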
Bingo. That simple technique has caught every spam I've gotten since. From time to time I can check the "Probable Spam" folder and move those messages to the "Spam" folder and re-run the analyzer to improve it. We'll see how it weathers over time, but it's already doing better than I have any right to expect.