Next Generation Spam Zombies Will Use Data Mining
branewashd writes "The Globe and Mail is covering some new research on the future of spam. The paper 'Spam Zombies from Outer Space', from researchers at the University of Calgary, will be presented on Sunday at the European Institute for Computer Anti-Virus Research conference. According to the paper, the next generation of spam zombies will employ 'sophisticated data mining of their victims saved email'. When a computer is turned into a spam zombie, it will first be mined of its address book, mail client configuration, and mail archives. Then the spam program will use Natural Language Processing techniques to send spam messages to the victim's contacts that look a lot like messages that the user has previously sent. The researchers predict that this will be extremely hard to detect, but they do offer a few suggestions for combating it."
That doesn't sound like data mining, nor complicated data mining even... just a simple markoff-chain driven text generator would do. Anything more complicated than that wouldn't be data mining either, but rather computer linguistics.
As a Slashdot discussion grows longer, the probability of an analogy involving cars approaches one.
If you mark enough of these random collection of useful word messages as spam, your beysian spam filer will start filing real, useful email as spam, and you will eventually decide the filter doesn't work and turn it off...
Of course, if you feed your filter just the headers and stuff that actually looks like spam, and not the blocks of random words, it can still learn useful things.
- "History shows again and again how nature points out the folly of men" -- Blue Oyster Cult, 'Godzilla'
I regularly recieve emails of exactly this nature to several addresses I use to deal with shady/or poorly managed state agencies. I noticed address mining of this sort at least 16 months ago. I typically know that a given shop will be calling for some sort of aid when I start getting my own (slightly modified and links added) back with own signature attached(once again slightly mispelled).