Armoring Spam Against Anti-Spam Filters

infinite monkeys by bluelip · 2004-02-04 03:19 · Score: 5, Funny

So the ultimate spam protection mechanism would be an infinite number of monkeys typing my list of words to associate w/ spam. :)

--

Yep, I never spell check.
More incorrect spellings can be found he

Re:infinite monkeys by AllUsernamesAreGone · 2004-02-04 03:34 · Score: 4, Funny

We better watch out for slashdot comments appearing in spam now.. ;)
Re:infinite monkeys by Jonas+the+Bold · 2004-02-04 03:36 · Score: 5, Funny

You kids and your monkeys

In my day we didn't have monkeys. We had to filter spam by hand. And we liked it!

You kids and your infinite monkeys... Shakespeare wouldn't have used monkeys were he alive today. He would have rolled up his sleeves and written Hamlet the right way!

Damn kids..

--
Everything seemed to be going so nice
'till the end of all beings punched right through the ice
Re:infinite monkeys by TheDigitalRaven · 2004-02-04 03:42 · Score: 5, Funny

Hands? Them're luxury! When I were a lad, hands were summat only posh people had. The rest of us had to make do with paws which hadn't evolved fully yet, and we had to filter all of our spam from each mailbox manually, but we had to go to the mailbox - across a river of lava, mind - to collect each message but couldn't filter it until we got back. We'd sort spam twenty six hours a day, getting up two hours before going to bed, and had to eat cold poison while we were doing it. And we had to pay for the privilege of being allowed to filter our own!
Re:infinite monkeys by letxa2000 · 2004-02-04 04:00 · Score: 5, Insightful

I'm not sure I understand why they think this is a problem with Bayesian filtering. Basically, they're saying that if a spammer sends you the same message thousands of times but inserts a few slightly different words each time, and if the thousands of messages get through the Bayesian filter to the user, and if the user doesn't disable HTML bugs on his email client, then we have a problem...?
First, if the spammer sends thousands of copies of the same message and just changes the "extra words" that he is testing, it will take very little time for Bayesian to adapt to the rest of the message. Suddenly, the rest of the message that previously contained non-spammy words will be considered very spammy and will overwhelm the "extra words" that each message contains. Each time the message is caught as spam, the probability that any future tests get through--regardless of the "extra words"--will be reduced even further.
Second, as the article said, it's a lot of work on the part of the spammer. They'd have to send out thousands of messages to each target to "sniff them out" and most of those wouldn't even be effective, since most of them would be caught by filters and, of those few that got through, very few would load the HTML bugs to identify themselves.
Finally, it assumes that those that are using Bayesian filters are filtering their email but leaving their security (inasmuch as HTML bugs) wide open. While there may be some people that use Bayesian and leave HTML bugs active, it has to be a small minority.
In short, it seems to me they've "found" a way to get around Bayesian that won't work, so to speak. I just don't see the problem.... ??
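
A minimal sketch of the adaptation argument above, in Python. It is a toy per-token filter with add-one smoothing, not the algorithm of any particular product; the class name `TinyBayes`, the sample messages, and the smoothing choice are all assumptions made for illustration. It trains on a few ham messages, scores a spam body padded with ham words, then scores the same message again after two differently padded copies have been marked as spam.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy per-token spam filter (hypothetical; for illustration only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}  # messages seen per class

    def train(self, text, label):
        self.totals[label] += 1
        for token in set(text.lower().split()):
            self.counts[label][token] += 1

    def token_prob(self, token):
        # Add-one smoothing keeps every probability strictly between 0 and 1.
        s = (self.counts["spam"][token] + 1) / (self.totals["spam"] + 2)
        h = (self.counts["ham"][token] + 1) / (self.totals["ham"] + 2)
        return s / (s + h)

    def score(self, text):
        # Sum log-odds over distinct tokens, then map back to a probability.
        log_odds = 0.0
        for token in set(text.lower().split()):
            p = self.token_prob(token)
            log_odds += math.log(p) - math.log(1.0 - p)
        return 1.0 / (1.0 + math.exp(-log_odds))


f = TinyBayes()
for ham in ("lunch meeting tomorrow", "project report attached",
            "lunch project meeting"):
    f.train(ham, "ham")
f.train("win lottery now", "spam")

probe = "cheap pills online lunch meeting"  # spam body plus two ham words
before = f.score(probe)

# The user marks two copies with *different* padding words as spam.
f.train("cheap pills online project report", "spam")
f.train("cheap pills online tomorrow attached", "spam")
after = f.score(probe)

print(f"before: {before:.3f}  after: {after:.3f}")
```

In this toy setup the unchanged body ("cheap pills online") picks up spam weight with every copy that gets marked, so the probe scores higher afterwards even though its padding words were never seen in spam.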
Re:infinite monkeys by Sique · 2004-02-04 04:12 · Score: 4, Insightful

Second, as the article said, it's a lot of work on the part of the spammer. They'd have to send out thousands of messages to each target to "sniff them out" and most of those wouldn't even be effective, since most of them would be caught by filters and, of those few that got through, very few would load the HTML bugs to identify themselves.

This is exactly the point. Most of the spam examples will die out because they have an ineffective collection of non-spam words. But a few will survive and you now can train your own Bayesian filter which collects the versions of spam that generated webbug hits. After a while some words will shine prominently in your Bayesian filter database for being very effective at slipping through Bayesian spam filters.

Basically you are fighting the dote with itself. And yes. You can automate the process. Just take your everyday spam (penis enlargement, unsecured credit, Nigerian business opportunities...), take a dictionary and then randomly mix dictionary words into your spam messages and send them out to your email database. Create a website to get the webbug hits and associate every spam message with a hash of the random dictionary words to identify successful sets of anti spam words.

--
.sig: Sique *sigh*
Re:infinite monkeys by Theresa1 · 2004-02-04 04:42 · Score: 5, Funny

cold poison ?! you lucky buggers.
We were so poor we had to eat spam.

--
This is a manual signature virus. Copy to your signature file and help me spread.
Re:infinite monkeys by Patik · 2004-02-04 05:22 · Score: 2, Funny

You forgot to put quotes around Yorkshire and close the span tag
Re:infinite monkeys by nate1138 · 2004-02-04 05:30 · Score: 2, Funny

Shakespeare wouldn't have used monkeys were he alive today. He would have rolled up his sleeves and written Hamlet the right way!

Yeah, he would have had Christopher Marlowe or Bacon write it for him!

--
Where's my lobbyist? Right here.
Re:infinite monkeys by NanoGator · 2004-02-04 05:49 · Score: 3, Funny

"We were so poor we had to eat spam."

Ah we're such fun loving people. How come none of us have girlfriends?

--
"Derp de derp."
Re:infinite monkeys by Anonymous Coward · 2004-02-04 06:01 · Score: 2, Funny

And thus, in the ancient lineage of "COWBOY NEAL!!!", "In Soviet Russia..." and "???, Profit!!" comes Slashdot's newest guaranteed "Score: 5, Funny" genre of posts. The "Back in my day...".
Re:infinite monkeys by joebok · 2004-02-04 06:02 · Score: 2, Interesting

I think it's more than no problem - what I believe he is saying is that a Bayesian filter will evolve some "ham" words that will carry an email into an inbox. They are individual and hard to figure out, but there is no reason why a spammer can't append your ham words, my ham words, and everybody else's ham words to the same message and thus bypass all our filters. So instead of the random "word salad" that we would see, we'd be getting a non-random selection of known ham words.

Even if the HTML business didn't work, spammers still have a mechanism for gauging effectiveness - money. They can assume a fairly even distribution of suckers and start sending out groups of messages with random words and, with some analysis, probably eventually come up with some statistically significant ham words.

Perhaps in addition to trading email addresses, ham word lists will also start to be traded. The anti-spam/spam industry will evolve like insurance and re-insurance : whoever has the best actuary will win.

Over time the ham words would also change - I wonder if the fight against spam will start having a noticeable effect on our use of language?
Re:infinite monkeys by AaronW · 2004-02-04 06:11 · Score: 2, Funny

In my day we didn't even have spam, and we liked it! You kids and your fancy smancy spam and filters and whatnot have no idea of the difficulties before spam. Hell, if we wanted to find out about penis enlargement pills we had to go out hiking through the snow uphill to search for them. And we liked it!

--
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
Re:infinite monkeys by FireBreathingDog · 2004-02-04 06:13 · Score: 3, Insightful

It's much easier than that to defeat Bayesian filtering. Ever \/\/0|\|D3R why you're getting so much spam with obfuscated words? Or why you're getting so much spam where the text content is contained primarily in images rather than plaintext? Those things bypass Bayesian filters, that's why!
Bayesian filters rely on words. That means they depend on word breaks and certain spellings. Well, spammers have been avoiding word breaks (either by removing spaces or introducing unnecessary ones) and obvious "spam words" by mangling the word or introducing "1337"-type spelling.
And Bayesian filters can't parse graphics, so a lot of spammers are careful to put words likely to trigger spam filters into graphics.
BTW, this article explains why there will never be a filtering-based solution to solving spam until SMTP itself is made more secure.

--
Shame on Google.
Re:infinite monkeys by Anonymous Coward · 2004-02-04 06:19 · Score: 0

In mother Russia, the spam filters YOU!
Re:infinite monkeys by Tripster · 2004-02-04 06:37 · Score: 5, Funny

Don't know about you but my wife won't let me have one!
Re:infinite monkeys by CleverFox · 2004-02-04 06:52 · Score: 3, Funny

Or I could just sell the spammer a list of the words from a 300,000-message Bayesian database that are 1% probability tokens.

$50,000 gets you the whole 300,000 message Bayesian database.

lindsayleeds _at_ comcast.net

Pay up spammers.
Re:infinite monkeys by senatorpjt · 2004-02-04 06:54 · Score: 1

You'd think that new Bayesian filters would take this into account and check for letter/number substitutions, etc.
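
A rough sketch of what such a check could look like, in Python. The substitution table `LEET_MAP` and the function name `normalize` are made up for illustration, and the mapping is an assumption (for instance, `1` could stand for either `i` or `l`). Multi-character tricks like `\/\/` for `W` are not handled here.

```python
# Hypothetical single-character substitution table; real spam uses many more.
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t",
    "@": "a", "$": "s", "!": "i",
})


def normalize(token):
    """Map common letter/number substitutions back to plain letters.

    Tokens with no letters at all (years, prices, phone numbers) are left
    alone so that "2004" does not turn into "2oo4".
    """
    if not any(ch.isalpha() for ch in token):
        return token
    return token.lower().translate(LEET_MAP)


for raw in ("0bfu5c4t3d", "M@ndr@k3", "L!nux", "2004"):
    print(raw, "->", normalize(raw))
```

A filter could feed both the raw token and the normalized token to the classifier, so the fact that a word was obfuscated stays available as evidence in its own right.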
Re:infinite monkeys by deadlinegrunt · 2004-02-04 06:57 · Score: 1

So? Get one anyway. Then when they both find out you been lying they'll see the sinister plot of making them think you've been with the other this whole time while actually allowing for even more time on the computer for you to code!

--
BSD is designed. Linux is grown. C++ libs
Re:infinite monkeys by vigilology · 2004-02-04 07:06 · Score: 1

"getting up two hours before going to bed"
Luxury.
Re:infinite monkeys by Jeremi · 2004-02-04 07:37 · Score: 4, Informative

Ever \/\/0|\|D3R why you're getting so much spam with obfuscated words?

Nope, because my Bayesian filter works just as well for 0bfu5c4t3d words as it does for properly spelled ones. They are all just sequences of letters, and anything that is deliberately misspelled is going to become identified as spammy very quickly.

Or why you're getting so much spam where the text content is contained primarily in images rather than plaintext?

Nope, because I have images turned off by default in my mail viewer. If a stranger wants me to read his email, he'll need to send it as plain text, because (as you point out) HTML email with images is used as a spam vector and little else.

BTW, this article explains why there will never be a filtering-based solution to solving spam until SMTP itself is made more secure.

Funny, my Bayesian filter is working fine at this very moment. Who should I believe, your article or my own eyes?

Jeremy

--

I don't care if it's 90,000 hectares. That lake was not my doing.
Re:infinite monkeys by FireBreathingDog · 2004-02-04 08:20 · Score: 2, Informative

Nope, because my Bayesian filter works just as well for 0bfu5c4t3d words as it does for properly spelled ones. They are all just sequences of letters, and anything that is deliberately misspelled is going to become identified as spammy very quickly.
The problem with obfuscated words is that there is a pretty sizable set of permutations for any given word. If one obfuscated variant ends up in your spam word list, that doesn't take care of the thousands of other obfuscated versions of the exact same word.
Nope, because I have images turned off by default in my mail viewer. If a stranger wants me to read his email, he'll need to send it as plain text, because (as you point out) HTML email with images is used as a spam vector and little else.
Ahh..yes! I have them turned off, too! But isn't the whole point of Bayesian filtering to stop the spam before it reaches your inbox? Sure, you've got images turned off so you don't see the spam, but if Bayesian is so great, why is the spam in your inbox to begin with?
Funny, my Bayesian filter is working fine at this very moment. Who should I believe, your article or my own eyes?
You can believe your own eyes if you wish, but your misconception is assuming that if Bayesian is working for you it is also working for everyone else. Don't get me wrong...Bayesian filtering is a pretty nifty technology. But let's not pretend it's a universal solution that works for everyone.
For whatever reason, the mix of spam I get isn't caught all that effectively by my Bayesian filter. So, believe your eyes if you wish, but don't claim that my eyes must see exactly what yours do.

--
Shame on Google.
Re:infinite monkeys by Nyarly · 2004-02-04 08:46 · Score: 2, Funny

The funniest thing about the parent is that "pedantic" is misspelled.
The saddest thing is that quoting the values of html attributes isn't required by the standard.

--
IP is just rude.
Is there any torture so subl
Re:infinite monkeys by someone247356 · 2004-02-04 08:52 · Score: 1

Hmmm, you said;

"And Bayesian filters can't parse graphics, so a lot of spammers are careful to put words likely to trigger spam filters into graphics."

So that explains why so much spam consists of empty emails. I strip all HTML code, and disable Java/JavaScript and its cousins from my email and reader. Often all that's left is a blank email.

Text is for email, HTML is for web pages. The sooner people realize this the sooner most of this nonsense will go away.

--
Just my $0.02 (Canadian, before taxes)
Re:infinite monkeys by TheLoneDanger · 2004-02-04 09:03 · Score: 1

I don't really think Slashdot users are infinite... we don't get laid enough to propagate indefinitely.

--

"But I trust in the people's capacity for reflection, rage and rebellion." -Oscar Olivera
Re:infinite monkeys by operagost · 2004-02-04 09:16 · Score: 1

I don't know how this guy's "work" managed to make it on the BBC. The results are useless, as the spammer would have to know what kind of words show up in the victim's "ham". If they know the victim that well, they'd probably know they don't want viagra or penis enlargement and leave them the hell alone.

--

Gamingmuseum.com: Give your 3D accelerator a rest.
Re:infinite monkeys by woztheproblem · 2004-02-04 09:22 · Score: 1

That article has ONE sentence regarding Bayesian filtering, and all it says is that spammers will resort to images in blank emails. But I've found that they always include at least some text, and that these messages are easily filtered. Plus, a Bayesian filter could simply assign a negative value to the tag and so an email with just that would be filtered.
Re:infinite monkeys by Kent+Recal · 2004-02-04 09:53 · Score: 1

Back in my day only the lineage of "COWBOY NEAL!!!", "In Soviet Russia..." and "???, Profit!!" were funny.

Not these stupid-copycat posts that contain nothing but a blatant remix of other people's carefully crafted posts!
Re:infinite monkeys by meeotch · 2004-02-04 09:54 · Score: 3, Funny

Stop it, you fools! Slashcode was never designed to support jokes more than four levels deep - you'll cause a core breach!
You maniacs! Goddamn you all to hell!
mitch
Re:infinite monkeys by Jonas+the+Bold · 2004-02-04 10:12 · Score: 1

Who knows, hopefully "infinite monkeys" will be next in line.

Like: "Damn infinite monkeys doing the job for half the price, undercutting us average Joes."

Or: "You don't have to do all that work of debugging the code, just have the infinite monkeys do it"

--
Everything seemed to be going so nice
'till the end of all beings punched right through the ice
Re:infinite monkeys by Anonymous Coward · 2004-02-04 11:03 · Score: 0

I'm pretty sure that Bayesian filtering works on HTML as well (at least Thunderbird seems to). So, HTML, formatted in the "spammer's way" might be blockable.
Re:infinite monkeys by elemental23 · 2004-02-04 11:16 · Score: 2, Informative

Well, you know the great thing about standards is that you have so many to choose from!

However, if you choose the current (dated 26 January 2000) W3C XHTML recommendations then yes, the quotes are required.

--
I like my women like my coffee... pale and bitter.
Re:infinite monkeys by LittleBigLui · 2004-02-04 12:20 · Score: 1

I, for one, welcome our new ... ah, you get the picture!

--
Free as in mason.
Re:infinite monkeys by Anonymous Coward · 2004-02-04 12:43 · Score: 0

I think it's more than no problem - what I believe he is saying is that a Bayesian filter will evolve some "ham" words that will carry an email into an inbox. They are individual and hard to figure out, but there is no reason why a spammer can't append your ham words, my ham words, and everybody else's ham words to the same message and thus bypass all our filters. So instead of the random "word salad" that we would see, we'd be getting a non-random selection of known ham words.

I'm sorry? You're saying he can append the ham words of everyone on his mailing list to the end of all his messages, and that that will bypass everyone's filter?

Quite apart from the fact that that would be easily defended against by filtering out all emails greater than 100 mb in size, it's very possible that your ham words are among my spam words.
Re:infinite monkeys by Trevin · 2004-02-04 13:39 · Score: 1

If one obfuscated variant ends up in your spam word list, that doesn't take care of the thousands of other obfuscated versions of the exact same word.

You're missing a very important aspect of Bayesian filtering: words which are not previously known are automatically assigned a certain probability of spamminess. The more unknown words a message has, the more likely it is to be classified as spam.

And by the way, emails that contain the words "img" and "src" are more likely to be classified as spam by my filter than emails without them.
Re:infinite monkeys by Stephen+Samuel · 2004-02-04 14:07 · Score: 1

The funniest thing about the parent is that "pedantic" is misspelled.
It's misspelled the same way Bill Gates misspells it, so we should consider it a standard.

--
Free Software: Like love, it grows best when given away.
Re:infinite monkeys by Mistshadow2k4 · 2004-02-04 15:48 · Score: 3, Funny

Well, my husband seems to think that me having a girlfriend is a great idea, but I'm not so sure...

--
I dream of a better world... one in which chickens can cross roads without their motives being questioned.
Re:infinite monkeys by FireBreathingDog · 2004-02-05 02:01 · Score: 1

You're missing a very important aspect of Bayesian filtering: words which are not previously known are automatically assigned a certain probability of spamminess. The more unknown words a message has, the more likely it is to be classified as spam.
Yes, but what a Bayesian filter considers a word is important. Do "ASCII art" signatures get an e-mail from your friend marked as spam? What about tabular text data, such as the columns in a financial statement?
Bayesian filters have to be careful not to consider every text token a word, or many things that aren't spam will be marked as such. On the other hand, being too loose in the definition of what a word is will let spam through. It's that balancing act that guarantees Bayesian filtering won't be a panacea.

--
Shame on Google.
Re:infinite monkeys by Jeremi · 2004-02-05 05:08 · Score: 1

Note that combining Bayesian with a white-list makes it more effective. (the assumption being that anything your friends send you won't be spam, so you can skip the Bayesian analysis and just give their emails an automatic non-spam rating). Combine that with a mechanism that automatically white-lists the sender of any email you marked as non-spam, and you've got a pretty effective system. (for extra credit, the mechanism can remove from the white-list the sender of any email you marked as spam -- that way you can easily fix mistakes)

--

I don't care if it's 90,000 hectares. That lake was not my doing.
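
The white-list scheme described above can be sketched in a few lines of Python. The class name `WhitelistFilter` and the `classify` callback (standing in for whatever Bayesian scorer is in use) are hypothetical names for illustration. The sketch also assumes the sender address can be trusted, which is not true of forged headers.

```python
class WhitelistFilter:
    """Wrap a content classifier with an automatically maintained white-list."""

    def __init__(self, classify):
        # classify: callable taking the message text, returning True for spam.
        self.classify = classify
        self.whitelist = set()

    def is_spam(self, sender, text):
        # White-listed senders skip the Bayesian analysis entirely.
        if sender.lower() in self.whitelist:
            return False
        return self.classify(text)

    def mark(self, sender, spam):
        # User feedback: non-spam adds the sender, spam removes the sender.
        if spam:
            self.whitelist.discard(sender.lower())
        else:
            self.whitelist.add(sender.lower())


# Usage with a deliberately paranoid stand-in classifier.
wf = WhitelistFilter(lambda text: True)
print(wf.is_spam("friend@example.com", "lunch?"))   # classifier decides
wf.mark("friend@example.com", spam=False)
print(wf.is_spam("Friend@Example.com", "lunch?"))   # white-listed now
```

Marking a message from a white-listed sender as spam removes the entry again, which is the "easily fix mistakes" step from the post.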
Re:infinite monkeys by letxa2000 · 2004-02-05 08:14 · Score: 1

It's much easier than that to defeat Bayesian filtering. Ever \/\/0|\|D3R why you're getting so much spam with obfuscated words?
Well, let's see... VIAGRA as a word and as split-up in various obfuscated ways:
VIAGRA: 99.624%
V.IAGRA: V=76.532%, IAGRA=99.9999%
VI.AGRA: VI=72.656%, AGRA=99.9999%
VIA.GRA: VIA=34%, GRA=99.9999%
VIAG.RA: VIAG=99.9999%, RA=92.604%
VIAGR.A: VIAGR=99.9999%, A=67.68%
Now, yes, there are other ways to obfuscate the word. But you can give me just about anything and you'll see results similar to the ones above.
Or why you're getting so much spam where the text content is contained primarily in images rather than plaintext?
HTML "A": 95.426%
HTML "HREF": 93.306%
HTML "IMG": 96.434%
HTML "SRC": 96.357%
You send me an HTML message with a link to an external image and it's almost guaranteed you'll be caught as spam. And we haven't even discussed the fact that Bayesian doesn't just look at the content, it looks at the message headers, too--and there's lots of good spam indicators there.
Those things bypass Bayesian filters, that's why!
Uh, no, they don't. You don't really know what you're talking about, do you? :)
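
To see why the split-up variants above still get caught, here is a small Python sketch that combines the per-fragment numbers quoted in the post. The combining rule is the naive Bayes product form popularized by Paul Graham's essay; whether the poster's filter uses exactly this rule is an assumption, and the names `tokenize`, `combine`, and `FRAGMENTS` are made up for illustration.

```python
import math
import re

# Per-fragment spam probabilities taken from the figures quoted above.
FRAGMENTS = {
    "V": 0.76532, "IAGRA": 0.999999,
    "VIA": 0.34, "GRA": 0.999999,
}


def tokenize(text):
    """Split on anything that is not a letter or digit."""
    return re.findall(r"[A-Za-z0-9]+", text)


def combine(probs):
    """Naive Bayes combination: prod(p) / (prod(p) + prod(1 - p))."""
    spam = math.prod(probs)
    ham = math.prod(1.0 - p for p in probs)
    return spam / (spam + ham)


for word in ("V.IAGRA", "VIA.GRA"):
    parts = tokenize(word)
    score = combine([FRAGMENTS[p] for p in parts])
    print(word, parts, f"{score:.6f}")
```

Even "VIA", which looks fairly innocent on its own at 34%, barely moves the combined score, because the leftover fragment is almost never seen outside spam.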
Re:infinite monkeys by mgax · 2004-02-07 01:18 · Score: 1

Has anybody been pounded with the "I'm a Prince with a zillion dollars I want to deposit in your account"? I am now getting this about 20 times a day.
Re:infinite monkeys by Anonymous Coward · 2004-02-07 09:58 · Score: 0

You're sadly mistaken. Bayesian filters act heavily in favor of certain words and against others. By overwhelming the spam-like words with more normal words, it's possible to overwhelm any simple spam scoring with non-spam content. The end result of tightening the filters is that more non-spam will be identified as spam, and that's the threshold the spammers will aim to use.

combat the flaw? how? by junkymailbox · 2004-02-04 03:20 · Score: 1

The bad news for spammers is that this flaw in filtering systems is not easy to exploit and can be combated. The cat and mouse game .. Find the "ham".
But how do you combat someone that essentially has your "ham"?

Re:combat the flaw? how? by RHS+Bomber · 2004-02-04 03:31 · Score: 2, Insightful

How about going after the people who own the links in the body of the spam?
Although it may be difficult to discover where the spam originated, it's pretty clear where it wants you to go (probably the person who commissioned the spam in the first place.)
Re:combat the flaw? how? by Winkhorst · 2004-02-04 04:14 · Score: 3, Insightful

The best solution I have found so far is to have your own domain and generate specific email addresses for specific types of communications. You keep your actual ISP email address totally secret and don't give it to anybody except your domain registrar. You then generate an address for your best friends and acquaintances you can trust and keep it separate from everything else so you don't have to change it but once every few years if that. You have a specific Shopping and Registration address you kill and replace after it becomes spammy. And you have an address for things like newsletters and email groups you can also change and reregister if they leak out to the spam boobs. There are all kinds of variations on this theme, but that's the basic gist of the matter: Secrecy and flexibility.

--
"Is this Winkhorst a nova criminal?" "No just a technical sergeant wanted for interrogation."
Re:combat the flaw? how? by Anonymous Coward · 2004-02-04 04:29 · Score: 0

So if I were a spammer, and sent out thousands of mails promoting M@ndr@k3 L!nux, you'd go after that company with full fury and righteous wrath?
There is such a thing as a joe-job, you know..
Re:combat the flaw? how? by Nightlight3 · 2004-02-04 05:16 · Score: 2, Insightful

How about going after the people who own the links in the body of the spam?

You are starting with a heretical premise that government, or rather, the large corporations which pull the strings, have the same objective as the end user (the end of spam). Of course, it could be stopped (by cracking down hard on those contracting the spammers). But it is much more useful for them if the "war on spam" goes on and on, while the measures with side-effects (on your wallet, your freedom and your privacy) are gradually introduced to "combat" the spam. Just recall other such "wars" such as "war on drugs" or "war on poverty/racism" or "war on smoking" or "war on guns" or the most recent "war on terrorism". This is an ancient recipe of control and enslavement, perfected by churches and priesthoods over millennia (war on sin/devil, war on death), merely translated into modern jargon and current circumstances.
Re:combat the flaw? how? by Anonymous Coward · 2004-02-04 06:05 · Score: 0

so by your method, if i start linking to /. in my spam, i won't have to put up with cowboyneal anymore? i'll write my senator today!
Re:combat the flaw? how? by Anonymous Coward · 2004-02-04 07:48 · Score: 0

Err, that's great, but what happens when your registrar spams you? (Especially those times when it isn't even actually your registrar...)

That said, having one's own domain + separate "role" email accounts seems to be the way to go in the future.

One or more for personal email, one or more for business email, separate ones to track where subscription emails go, etc.
Re:combat the flaw? how? by LehiNephi · 2004-02-04 08:53 · Score: 1

This guy already is.

--
Help find a cure for cancer. Join the [H]orde
Re:combat the flaw? how? by darien · 2004-02-04 10:53 · Score: 1

But how do you combat someone that essentially has your "ham"?

The beauty of Bayes is that, so long as you tell it when spam wrongly arrives in your inbox, it will quickly learn that these words don't signify that a message isn't spam. Problem solved.
Re:combat the flaw? how? by shellbeach · 2004-02-04 11:47 · Score: 1

But how do you combat someone that essentially has your "ham"?

Well, once a spammer has your "ham", that ham will quickly become spam via Bayesian filtering. What were certain ham words will become spam words after the user has filtered some of those mails, and pretty soon the spammer will have to find a whole new set of words.

I guess the only way around this for spammers would be to compile a vast library of spam, then attach every other word in the dictionary to the email at the end, meaning that not only are the spam-positive words swamped, but that after filtering these emails every word would become a spam-positive word and the filters would be essentially useless. (You could probably cut that down to a few hundred common words and still have a 99% success rate killing Bayesian filters.)

(But of course, non-spammers could then work around this by configuring filters to only screen the first hundred or so words, which would mean that spammers would have to place their filter-beating words at the beginning of the email ... which would ensure that no-one ever read the spamming body and would kill spam just as surely. The moral is - you can always make a filter that's one step ahead :)
Re:combat the flaw? how? by Winkhorst · 2004-02-05 01:02 · Score: 1

I didn't say it was perfect, but it sure simplifies things. Actually, my ISP is the public library, and you can't get any less commercial than that... I got this idea from an originally free service that let you create various addresses on their server that would then forward email in either direction. They later decided to charge $12 a month (!) for the service. At that point I figured I could do the same thing myself with my own domain for significantly less money and I got all the other advantages that go with a domain.

--
"Is this Winkhorst a nova criminal?" "No just a technical sergeant wanted for interrogation."

Hmmm... by dustmote · 2004-02-04 03:20 · Score: 1

I'm not sure if this is a project I wish to encourage, really. Although I'm sure that there are plenty of spammers already out there doing similar things, rendering it kind of academic.

--

-1, "1337" speak

Re:Hmmm... by somethinghollow · 2004-02-04 03:25 · Score: 5, Insightful

Like many other academic studies, such as skinning humans alive to see how long they can live, I think this one should only be placed into the right hands.

It's a pisser that spammers now have another tool to circumvent filters; on the other hand, the people who write the filters know exactly what a spammer would do to make "better" spam.

The question is: who will implement first?
Re:Hmmm... by JohnGrahamCumming · 2004-02-04 03:32 · Score: 5, Informative

If people working in anti-spam don't try to break their own filters the spammers will do it for them and we'll be worse off.

There's a direct analogy with cryptographic techniques where breaking them is most of the work... that way we know that they are secure.

John.
Re:Hmmm... by cheekyboy · 2004-02-04 03:33 · Score: 1

why do filters look at all the content, surely they can do a hash on each line and then compare it and find out that, "hey, the first 90% of lines are the same, last 10% differ all the time" must be spam

together with >3 fonts and colors and / or > 2 images attached/linked. Yep, it's spam.

This shouldn't be rocket science, why aren't really smart filters good enough to ignore the random 834723749273742s or random words at the end, or even random white spacing. A human can look at an email and instantly know if it's spam, often without even reading a single word.

--
Liberty freedom are no1, not dicks in suits.
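
The per-line hashing idea above can be sketched like this in Python. The function names `line_hashes` and `shared_fraction`, the choice of SHA-1, and the sample messages are assumptions for illustration; any threshold for "must be spam" would need tuning against real mail.

```python
import hashlib


def line_hashes(message):
    """Hash each non-blank line after collapsing whitespace and case."""
    hashes = set()
    for line in message.splitlines():
        canonical = " ".join(line.split()).lower()
        if canonical:
            hashes.add(hashlib.sha1(canonical.encode("utf-8")).hexdigest())
    return hashes


def shared_fraction(a, b):
    """Fraction of the shorter message's lines that also occur in the other."""
    ha, hb = line_hashes(a), line_hashes(b)
    if not ha or not hb:
        return 0.0
    return len(ha & hb) / min(len(ha), len(hb))


body = "\n".join(f"Buy our product, reason number {i}" for i in range(9))
first = body + "\nzebra quartz 834723749273742"
second = body.replace("Buy our", "Buy   our") + "\nmango violet 2374"

print(shared_fraction(first, second))
```

Collapsing whitespace before hashing takes care of the random spacing trick, and the differing last line only costs one hash out of ten. A mailing list footer or a quoted reply would also produce shared lines, so this works better as one signal among several than as a verdict on its own.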
Re:Hmmm... by surprise_audit · 2004-02-04 03:49 · Score: 1

Is there a mailer that works on a positive-id system? One where, if you're not in my addressbook, your mail doesn't even get delivered? I've seen stuff that only accepts mail from authenticated systems, but is there one that takes it a step further?
Re:Hmmm... by BigBadBri · 2004-02-04 03:53 · Score: 2, Interesting

Have you tried reducing the significance of your 'ham' list, to see if the spammer's analysis is made more difficult?
Granted, it may increase the number of false positives, but a relatively small change in the values assigned to 'ham' words might make a big difference to the amount of work required by the spammer.
I'm not an expert on Bayesian filtering, but I seem to remember that there were a few tweakable parameters.

--
oh brave new world, that has such people in it!
Re:Hmmm... by Anonymous Coward · 2004-02-04 04:06 · Score: 0

It's not rocket science. The statistical filter I've been writing doesn't ignore random words in general (during scoring they just get counted like any other token), but it will ignore them on incoming mail.

I think trying to classify email as spam/not-spam based on characteristics (which you seem to be suggesting) is a big waste of time. Have you ever tried to wade through Spam::Assassin to see what it actually does? It's painful... and not just because it's written in Perl. Trying to classify based on rules is an arms race with the spammers.

I'm in the process of replacing S::A with about 100 lines of Ruby code. I stopped using S::A immediately after I realized it had trashed emails from my daughter based on some broken-ness in her email client (the default client on a new Windows XP computer). Obviously the fault was mine for sending spam to the trash folder where it got deleted when I closed KMail, but I don't like that a default S::A called those mails spam in the first place. But it just points up the problems with rules-based filtering approaches.

The hardest part of a statistical spam filter is not the math, but writing a good "tokenizer" routine. I think mine works well because I push HTML tags to the end and discriminate against header-tokens uniquely (as suggested by Paul Graham). By pushing HTML tags to the end I defeat the attempts by spammers to break up obvious spam words by infixing them with nonsense (i.e. non-displayed) HTML tags.
Re:Hmmm... by ichimunki · 2004-02-04 04:08 · Score: 2, Interesting

(sorry for the dupe, didn't intend to post as AC the first time)

It's not rocket science. The statistical filter I've been writing doesn't ignore random words in general (during scoring they just get counted like any other token), but it will ignore them on incoming mail.

I think trying to classify email as spam/not-spam based on characteristics (which you seem to be suggesting) is a big waste of time. Have you ever tried to wade through Spam::Assassin to see what it actually does? It's painful... and not just because it's written in Perl. Trying to classify based on rules is an arms race with the spammers.

I'm in the process of replacing S::A with about 100 lines of Ruby code. I stopped using S::A immediately after I realized it had trashed emails from my daughter based on some broken-ness in her email client (the default client on a new Windows XP computer). Obviously the fault was mine for sending spam to the trash folder where it got deleted when I closed KMail, but I don't like that a default S::A called those mails spam in the first place. But it just points up the problems with rules-based filtering approaches.

The hardest part of a statistical spam filter is not the math, but writing a good "tokenizer" routine. I think mine works well because I push HTML tags to the end and mark header tokens distinctly from body tokens (as suggested by Paul Graham). By pushing HTML tags to the end I defeat the attempts by spammers to break up obvious spam words by infixing them with nonsense (i.e. non-displayed) HTML tags.
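
For anyone curious what such a tokenizer might look like, here is a minimal Python sketch. It is not the poster's Ruby code, just an illustration under assumptions: headers arrive as a dict, the body as a string, and the function name `tokenize`, the `name*word` marking scheme, and both regexes are made up for this example.

```python
import re

# Hypothetical patterns for this sketch: anything in angle brackets is a tag,
# and a "word" is a run of letters, digits, and a few punctuation characters.
TAG_RE = re.compile(r"<[^>]*>")
WORD_RE = re.compile(r"[A-Za-z0-9$'-]+")


def tokenize(headers, body):
    """Return header tokens (marked), then body words, then HTML tags."""
    tokens = []
    # Mark each header word with its header name so "subject*free" is
    # counted separately from a plain "free" in the body.
    for name, value in headers.items():
        prefix = name.lower()
        for word in WORD_RE.findall(value):
            tokens.append(f"{prefix}*{word.lower()}")
    # Pull the tags out first; deleting them rejoins words that were
    # split by non-displayed tags.
    tags = TAG_RE.findall(body)
    text = TAG_RE.sub("", body)
    tokens.extend(word.lower() for word in WORD_RE.findall(text))
    # The tags still count as tokens, but only at the end.
    tokens.extend(tag.lower() for tag in tags)
    return tokens
```

One caveat of this naive approach: deleting a tag outright also glues together words that were separated only by a tag such as a line break, so a real tokenizer would probably treat block-level tags differently.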

--
I do not have a signature
Re:Hmmm... by cyberchondriac · 2004-02-04 05:42 · Score: 1

Like many other academic studies, such as skinning humans alive to see how long they can live, I think this one should only be placed into the right hands.
Uh.. whose hands would you consider the right hands to skin humans alive to see how long they live?

But in reference to the topic, as always, it's the eternal argument of empowering people with knowledge vs maintaining some form of security.

--

Look back up at my post, now look back down, you're on the Internet. Now look back up. I'm a signature.
Re:Hmmm... by dolphinling · 2004-02-04 06:15 · Score: 1

Well, you can always set your client to filter everything in that category to the trash... It uses the bandwidth, yeah, but that's not too big a problem.

--
There are 11 types of people in the world: those who can count in binary, and those who can't.
Re:Hmmm... by dingbatdr · 2004-02-04 06:35 · Score: 2, Funny

You mean that we should actually test our code?
Against real data?
Aren't you worried that could start some kind of
scary precedent?

dtg

--
The truth is an offense, but not a sin.------R. N. Marley
Re:Hmmm... by Anonymous Coward · 2004-02-04 07:15 · Score: 0

You can't bypass an IP blocklist. ;)

Filters are just an automated 'just hit delete' and anything except the most rudimentary are more effort and CPU power than I care to waste on the problem. Blocking IPs (or entire ranges of IPs) puts the problem right back in the hands of those that can fix it - the ISPs that allow spammers to use their networks - and saves me much time.
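
A minimal sketch of that kind of check, in Python with the standard `ipaddress` module. The networks listed are reserved documentation ranges used as placeholders, and the names `BLOCKED` and `is_blocked` are made up for this example; a real setup would more likely do this in the mail server or firewall configuration.

```python
import ipaddress

# Placeholder blocklist: one whole range and one single address.
BLOCKED = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("203.0.113.7/32"),
]


def is_blocked(addr, networks=BLOCKED):
    """Return True if addr falls inside any blocked network."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in networks)
```

A connection from 192.0.2.55 would be refused here because it sits inside the blocked /24, while 198.51.100.1 would be let through.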

Email is never a 'critical' communication method to begin with. If you have any sort of financially critical information, or information that's saving someone's life, you sure as hell shouldn't be using email to send or receive it, unless you control both the entire sending network and the receiving network, and you have a secure means of reliably communicating between them. And even then you should call to make sure it got through.

Ok fuck it by tomstdenis · 2004-02-04 03:21 · Score: 5, Funny

I will pay 1000$ to anyone who seeks out and beats the living daylights out of a spammer. With as many pics on the web as possible for posterity.

Screw these filters and shit. Start creaming spammers worldwide and they'll think twice about it.

Tom

--
Someday, I'll have a real sig.

Re:Ok fuck it by swb · 2004-02-04 03:28 · Score: 2, Informative

You do realize you've just committed a pretty serious Federal crime, don't you? I know you're kidding or just emoting the same frustration many others, myself included, feel about the willful disregard spammers seem to have for many things.

But you might've wanted to add a smiley...
Re:Ok fuck it by JeanBaptiste · 2004-02-04 03:28 · Score: 1

yeah let's just go around beating up spammers. no trial, just vigilante justice.

why stop there? let's go around beating up anyone we don't like. screw the court system. i don't like evil conservatives, let's just kill them. no trial, no evidence necessary.

yeah that would be a world i would love to live in.
Re:Ok fuck it by Anonymous Coward · 2004-02-04 03:29 · Score: 0

If I had a listing of where all these spammer scumbags were...I'd consider making that my new career.
Anti-Spam thug :)
Re:Ok fuck it by cperciva · 2004-02-04 03:30 · Score: 2, Interesting

You do realize you've just committed a pretty serious Federal crime, don't you?

He hasn't, actually -- those laws don't apply extraterritorially, and Tom's in Canada.

--
Tarsnap: Online backups for the truly paranoid
Re:Ok fuck it by visgoth · 2004-02-04 03:31 · Score: 1

Alright, how about this... Known spammers who have ignored repeated warnings get beaten senseless with a heavy sack full of doorknobs. A video is taken, and posted across the net to serve as a warning to the rest of their kind of scum.

--
My patience is infinite, my time is not.
Re:Ok fuck it by Anonymous Coward · 2004-02-04 03:34 · Score: 0

the problem is that first it's spammers... then it's another group you don't like...

sort of like what Ashcroft and Bush get accused of, no?
Re:Ok fuck it by Celt · 2004-02-04 03:36 · Score: 1, Flamebait

Another example of people assuming that EVERYBODY lives in the USA or is under US law...

--
"WebTV: bringing the Internet into the shallow end of the gene pool since 1995" - Martin Bishop
Re:Ok fuck it by Anonymous Coward · 2004-02-04 03:38 · Score: 0

Oh well then all baby eating cannibals should be put on welfare? What the hell kind of tortured logic is that? First you'll only put the criminals in jail, then it'll be everyone! Go hug a tree.
Re:Ok fuck it by Anonymous Coward · 2004-02-04 03:40 · Score: 0

point being vigilante justice is not acceptable.

give them a fair trial, then beat them with a sack of doorknobs.
Re:Ok fuck it by FattMattP · 2004-02-04 03:41 · Score: 1, Flamebait

Well, that's the American way of life. We're just following George's example. All hail our great leader!

--
Prevent email address forgery. Publish SPF records for y
Re:Ok fuck it by nigelc · 2004-02-04 03:43 · Score: 5, Funny

Ahh, an international terrorist proposing an attack. We should be invading Canada any day now...

--

Cthulhu Barata Nikto
Re:Ok fuck it by Anonymous Coward · 2004-02-04 03:44 · Score: 0

I will pay 1000$ to anyone who seeks out and beats the living daylights out of a spammer. With as many pics on the web as possible for posterity.
Interesting? Funny? I don't have any mod points today, so I hope I spot this sucker in M2. Besides the obvious "threatening to hurt people in a public forum is neither funny nor interesting", it's inviting Ashcroft to come pay you a tea-time visit.
Re:Ok fuck it by Gaijin42 · 2004-02-04 03:47 · Score: 2, Interesting

Well, since this is an international forum, he has an out. But if it could be shown that he was soliciting someone to do that crime in the US, even if he did the solicitation from Canada, it would still be a crime in the US.

At a minimum, he would be arrested if he came to the states. However, if someone actually went through with the crime, I'm sure Canada would be willing to extradite him. Canada doesn't want maniacs running around free, anymore than the US does.
Re:Ok fuck it by Anonymous Coward · 2004-02-04 03:48 · Score: 0

Well it might not be ideal, but it's certainly acceptable. It's been repeatedly accepted throughout nearly (at least AFAIK) every human culture at least on occasion if not as a staple of their enforcement of community standards. Even in the US, the concept of jury nullification exists in large part to support reasonably just vigilante justice.

For my part, as long as there's a sack of doorknobs or something suitably undesirable waiting for them...it's all good.
Re:Ok fuck it by a_timid_mouse · 2004-02-04 03:48 · Score: 1

I'm American. It's not *MY* way of life. Speak for yourself please.
Re:Ok fuck it by Avardan · 2004-02-04 03:49 · Score: 1

LART!!!

--
Ma gavte la nata
Re:Ok fuck it by swb · 2004-02-04 03:53 · Score: 4, Insightful

Another example of people assuming that EVERYBODY lives in the USA or is under US law...

The solicitation was made on a server located in the US. I don't doubt that Ashcroft would consider that US jurisdiction, regardless of the physical location of the poster.

There's a lot of guys in dog cages at Guantanamo Bay who've NEVER been to the US. I'm not so sure these days that when the US government is pissed off at you, where you are and where you did something matter a whole lot.
Re:Ok fuck it by AdamD1 · 2004-02-04 03:54 · Score: 4, Funny

Is that illegal? After all he's not 'threatening' the spammer, he's merely presenting an offer he was pretty sure this guy was asking to receive. And besides: He can certainly "opt-out" at any time by choosing not to spam... ;)

--
Because I can! [Brainrub.com]
Re:Ok fuck it by Ineffable+27 · 2004-02-04 04:00 · Score: 3, Interesting

No true jury of his peers would convict him, since chances are they're sick of spam too! :)

--
"He'd be a broader guy if he had dropped acid once." - Steve Jobs on Bill Gates
Re:Ok fuck it by FreeUser · 2004-02-04 04:00 · Score: 4, Funny

At a minimum, he would be arrested if he came to the states. However, if someone actually went through with the crime, I'm sure Canada would be willing to extradite him. Canada doesn't want maniacs running around free, anymore than the US does.

That assumes that beating the shit out of a SPAMmer is a "maniacal" act. I would argue that it is a perfectly rational course of action, and indeed a public service.

Canada's Finlandization by the US might compel it to hand the guy over anyway, but certainly not for fear of having maniacs run loose (unless you count our troops poised on their border to enforce US Political Correctness Bush Style abroad). :-)

[ Disclaimer required by Our Surveillance State: the preceding was a joke (c.f. humor). ]

--
The Future of Human Evolution: Autonomy
Re:Ok fuck it by Anonymous Coward · 2004-02-04 04:01 · Score: 1, Funny

So is that CDN1000 or USD1000? I mean, if I'm going to hoof a spammer in the 'nads I want to know the purchasing power of my reward.
Re:Ok fuck it by Anonymous Coward · 2004-02-04 04:06 · Score: 3, Funny

I will pay 1000$ to anyone who seeks out and beats the living daylights out of a spammer.

Dear Slashdot,

I am seeking volunteers to join me in a business opportunity which has recently come to my attention. Please volunteer if you meet the following three qualifications:

1) Willing to send 1 spam email.
2) Willing to have ass beaten.
3) Want $250.

If you said yes to all three of the above, please contact me. :D

P.S. For those who consider #1 to be unethical, consider #2 your punishment.
Re:Ok fuck it by theLOUDroom · 2004-02-04 04:07 · Score: 2, Interesting

yeah let's just go around beating up spammers. no trial, just vigilante justice. why stop there? let's go around beating up anyone we don't like. screw the court system. i don't like evil conservatives, let's just kill them. no trial, no evidence necessary.

[sarcasm]Yeah, let's just trust the government to take care of every aspect of our lives and never go against anything it says.[/sarcasm]

Saying something's "vigilante justice" doesn't automatically make it bad. In order to make that conclusion, you have to start with the assumption that the gov't will always do the right thing.
Since that's not the case, one must realize that sometimes the rules need to be broken and other solutions applied to the problem.

Look at it this way:
You live in a country named Dystopia. In this country rape is legal. Every day on the way to school, your daughter gets raped by the same guy. You go to the police, but they do nothing about it because it's not illegal. You try to get a law passed but it gets knocked down. This rape is causing your family real harm every day. How long are you going to wait before you resort to vigilante justice?.....and more importantly is it a bad thing when you do?

Now back to the spam problem:
Spam is pretty much legal (the CAN-SPAM Act was a joke...it made things worse). The gov't is doing basically nothing to stop it. It is causing real harm to internet users around the world. Now I'm not necessarily saying that vigilantism is the answer, but what I am saying is that your response is an extremely oversimplified view of the world.

The law is not always right, nor is it carved in stone. Sure, society is supposed to follow the law, but the law is also supposed to follow society. The law is not this thing a guy came down from a mountain and handed us. It is a constant tug-of-war.

--
Life is too short to proofread.
Re:Ok fuck it by Anonymous Coward · 2004-02-04 04:09 · Score: 0

[sarcasm]
Yeah, as opposed to the European way of life, which would be to negotiate with the spammers, empathise with them, and try to see things from their point of view.
Or the French way of life, which would be to cut deals with the spammers and frustrate every effort made by anyone to deal with the spammers.
I'm not an American, but I have a damn sight more affection and respect for them than many of the US-loathing, reality-denying fools who seem to populate this site..
Re:Ok fuck it by Anonymous Coward · 2004-02-04 04:12 · Score: 0

Move to France please.
Re:Ok fuck it by GreyPoopon · 2004-02-04 04:21 · Score: 1

No true jury of his peers would convict him, since chances are they're sick of spam too! :)
I know this was partly tongue-in-cheek, but you're probably right on the sentiment of the jury. However, most jurors are not aware that they can render a vote of "not guilty" even if it is proved beyond the shadow of a doubt that the defendant has broken the law. And of course, no judge today would ever let them in on that little secret, either. If you're interested, you should read about Jury Nullification.

--
GreyPoopon
--
Why is it I can write insightful comments but can't come up with a clever signature?
Re:Ok fuck it by Anonymous Coward · 2004-02-04 04:30 · Score: 0

Sometimes someone needs to take one for the team.
Re:Ok fuck it by Steve+B · 2004-02-04 04:39 · Score: 1

You do realize you've just committed a pretty serious Federal crime, don't you?

Not if I'm on the jury.

--
/. If the government wants us to respect the law, it should set a better example.
Re:Ok fuck it by swb · 2004-02-04 05:01 · Score: 2, Interesting

They may be able to do that, but having JUST finished serving as a juror in a Federal criminal trial, I can tell you it wouldn't go over very well in most cases where there is strong evidence.

In all likelihood the judge would declare a mistrial. I'm not familiar (we weren't told) with the judge's powers over a jury and what laws apply to jury conduct. It might be possible for the judge to declare the jury in contempt for disregarding the judge's instructions on how the law(s) are to be applied.

It's not like you go to court and do whatever you want and interpret the law any way you want. The judge has total control of the rules of interpretation used by the jury. The court and the trial are kind of the judge's own little kingdom, and you mess with a federal judge at your own peril.
Re:Ok fuck it by Orion442 · 2004-02-04 05:10 · Score: 0

It's already in the works. Why else do you think we've started building Home Depots in Canada? We need building materials, damnit!!!
Re:Ok fuck it by tomstdenis · 2004-02-04 05:27 · Score: 1

Shit, post anonymously wasn't checked? Ahhh dang dang double dang!

Tom ;-)

--
Someday, I'll have a real sig.
Re:Ok fuck it by GNUALMAFUERTE · 2004-02-04 05:37 · Score: 0

Crime??
What Crime???.

o .. he mispelled something?

--
WTF am I doing replying to an AC at 5 A.M on a Friday night?
Re:Ok fuck it by Gaijin42 · 2004-02-04 06:00 · Score: 1

Conspiracy to commit assault. Possibly conspiracy to commit murder (if the guy died).
Re:Ok fuck it by Lord_Slepnir · 2004-02-04 06:03 · Score: 1

No we won't. Canada has no oil
Re:Ok fuck it by Tom · 2004-02-04 06:21 · Score: 1

Any spammer or every spammer?

With 200 known top-spammers and their addresses on the internet, that's a decent living. Well, until you run out of spammers, but maybe the first one is out of hospital by then and you can repeat.

Hm, beat up 5 spammers a month = $5k income. The 200 known ones will last you for 40 months, or just over three years. Definitely a sustainable career.

Damn, forgot to figure in that you won't have a monopoly. If 10 people do that, you're empty after 4 months.

*sigh*, could've been so easy.

--
Assorted stuff I do sometimes: Lemuria.org
Re:Ok fuck it by Dyolf+Knip · 2004-02-04 07:27 · Score: 1

I dunno, with that much power over the jury, couldn't a judge just mistrial every acquittal until he finally gets a verdict he likes? Considering some of the judges I've read about, if anyone had that kind of power, we'd be hearing about exactly that sort of crap an awful lot. I don't think they can declare a mistrial _after_ the jury's deliberations.

--
Dyolf Knip
Re:Ok fuck it by GNUALMAFUERTE · 2004-02-04 07:50 · Score: 0

uuu, maybe I should put in the obligatory smiley just to make it clear that I was joking.
BTW: It's censorship. If I think [ethnic,social_group] shouldn't exist, I can say so; it should only be a crime if I actually kill 'em all. And it shouldn't be a crime to kill spammers; it's more like a sport. :) (Obligatory smiley to prevent people as little-minded as the parent poster from flaming me because they read my post and interpreted it literally.)

--
WTF am I doing replying to an AC at 5 A.M on a Friday night?
Re:Ok fuck it by swb · 2004-02-04 07:51 · Score: 1

You may be right about the mistrial, but the judge *can* set aside the jury's decision.

My guess is that we'd be hearing as much about jury nullification as we would about massive mistrials, if either was a significant factor. In any case, there would be massive appeals in anything but the most trivial cases. Nobody's walking from a murder rap due to jury nullification.

Jury nullification is highly controversial, regardless of whether you could actually pull it off. Many consider it an arbitrary disregarding of the law, and more dangerous than possibly ill-conceived laws.
Re:Ok fuck it by Excen · 2004-02-04 07:57 · Score: 1

I am Jack's smirking revenge . . .

--
"No beer until you finish your tequila!" -Leela's Dad
Re:Ok fuck it by Rob+Simpson · 2004-02-04 07:59 · Score: 1

Yeah, so the US gov't would just hold him without charge for years on end...
Re:Ok fuck it by PhB95 · 2004-02-04 08:02 · Score: 1

For months now, everything written in English, except when coming from a whitelist of known senders, gets trashed. Very efficient against spam, but of course not applicable in either the US or the UK! Sometimes it's better to be European...

--
One of those Europeans...
Re:Ok fuck it by Anonymous Coward · 2004-02-04 08:14 · Score: 0

It's an interesting issue... But, don't you find it odd that since you just finished serving as a juror in a federal trial that you WERE NOT TOLD about the judges' powers or laws applying to jury conduct?
Re:Ok fuck it by Gaijin42 · 2004-02-04 08:23 · Score: 1

I am not little-minded. If you read the comment to which I originally replied (and its parent), someone did commit a crime, even though they meant it with humor. My subsequent replies were clarification for people who asked "what crime", so I elaborated.

You are certainly free to say that you don't think certain people should exist. But if you go so far as to offer money to have someone taken out, that is a crime, joke or not.

Of course you would probably not be prosecuted, yet a crime it still is.

Additionally, it wasn't a flame, as I didn't swear, TYPE IN ALL CAPS, or get mad in any way. You seem to be a bit sensitive.
Re:Ok fuck it by FireBreathingDog · 2004-02-04 08:55 · Score: 1

Nobody's walking from a murder rap due to jury nullification.
*cough* O.J. *cough*

--
Shame on Google.
Re:Ok fuck it by Dyolf+Knip · 2004-02-04 09:01 · Score: 1

the judge *can* set aside the jury's decision.
To what degree? I.e., can the judge just say, "I don't think he's innocent. So despite the jury, _I_ find him guilty and sentence him to 30 years in a pound-you-in-the-ass prison"? Or can they only nullify verdicts so as to favor the defendant?
I wouldn't have a problem with not letting attorneys tell juries about their power to nullify if the other side wasn't allowed to tell them, "You must consider the exact letter of the law, not the spirit". That and the fact that from most reports, if a judge finds out you so much as _think_ the word "nullification", they'll dismiss you so fast you'll get whiplash.
Regardless, one of the reasons we have juries is so that an impartial person can consider cases that don't fit nicely into the mold of the law. There is no law so brilliantly crafted that I can't come up with a situation wherein the defendant is, under the letter of the law, guilty as hell but it would still be _wrong_ to convict.

--
Dyolf Knip
Re:Ok fuck it by wembley · 2004-02-04 09:01 · Score: 1

No, a jury of his peers would all be Canadian...

--
Share and Enjoy!
Re:Ok fuck it by Ineffable+27 · 2004-02-04 09:12 · Score: 1

Trust me, we Canadians are plenty sick of spam too! (Though I must say I find Eudora 6's junk mail filtering to be extremely effective, and responsive to 'training.')

--
"He'd be a broader guy if he had dropped acid once." - Steve Jobs on Bill Gates
Re:Ok fuck it by Anonymous Coward · 2004-02-04 10:25 · Score: 0

I will pay 1000$ to anyone who seeks out and beats the living daylights out of a spammer. With as many pics on the web as possible for posterity.

Screw these filters and shit. Start creaming spammers worldwide and they'll think twice about it.
Get enough email users on the jury, and you might not even get convicted for being an accomplice to the assault and battery.
Re:Ok fuck it by swb · 2004-02-04 10:31 · Score: 1

To what degree? I.e., can the judge just say, "I don't think he's innocent. So despite the jury, _I_ find him guilty and sentence him to 30 years in a pound-you-in-the-ass prison"? Or can they only nullify verdicts so as to favor the defendant?

IANAL but I believe that the judge can set aside any jury verdict if he believes (and can demonstrate clearly) that the jury's decision blatantly ignores the facts and the laws in question. It probably engages an automatic appellate review of the case, as well, as a built-in safeguard.

Furthermore, where is the power of the jury to ignore laws and evidence defined? As I've googled this subject a little, I can't find a single hard reference to an actual law, statute, judicial order or any other reference that empowers jurors to vote their conscience. I fully recognize they have the *practical* power to do this -- vote whatever way they want, and I'm sure jurors, who are often selected for their LACK of education, often DO vote based on something other than a cold logical analysis of the facts.

Anyway, from what I have read, "jury nullification" is more of a phenomenon that periodically raises its head (people moving slaves from slave to free states, prohibition, and from what I've read, inner city jurors letting black defendants in drug cases go) than a real, validated principle backed by the legal system. There are a lot of people who *advocate* it, but IMNSHO it appears to go hand-in-hand with other antiestablishment advocacy, so it's hard to know if it's real.
Re:Ok fuck it by elemental23 · 2004-02-04 11:35 · Score: 1

OJ's acquittal was not due to jury nullification. It would be jury nullification if the jury came back and said that they believe OJ was guilty but they do not believe that first degree murder should be against the law. It's kind of like voting your conscience in that you believe the law is wrong in attempting to punish the defendant's actions.

OJ's acquittal was likely due to the jury just being blinded by celebrity. Idiots.

--
I like my women like my coffee... pale and bitter.
Re:Ok fuck it by Anonymous Coward · 2004-02-04 11:44 · Score: 0

It's a good idea, but it's not good enough if the spammer survives.
Re:Ok fuck it by GreyPoopon · 2004-02-04 12:49 · Score: 1

In all likelihood the judge would declare a mistrial.
If that was the primary concern, you can always render a "not-guilty" verdict without providing any reason for doing so. That way, you are indicating that the prosecution did not meet their burden of proof. That leaves the judge mentally grasping for what piece of information he or she may have missed during the trial.
Just for the record, although IANAL, I'm aware that in criminal cases, a verdict of "not guilty" generally cannot be modified. It's usually a "guilty" verdict that gets disregarded by the judge. If there are some real lawyers out there who can confirm or dispute this, I'd greatly appreciate it. Or maybe even better, a paralegal. PJ, where are you? :-)

--
GreyPoopon
--
Why is it I can write insightful comments but can't come up with a clever signature?
Re:Ok fuck it by Dyolf+Knip · 2004-02-04 14:01 · Score: 1

It probably engages an automatic appellate review of the case, as well, as a built-in safeguard.
That sounds workable enough. Granting power to individuals is useful. It's unaccountable power that worries me.
Hmmm, look a little harder. Googling for "Jury Nullification" turned up some interesting results right at the top. For instance, the Georgia and Virginia State constitutions specifically give juries power to "be judges of law as well as fact". Here and here.
Jury nullification provides an important mechanism for feedback. Jurors sometimes use nullification to send messages to prosecutors about misplaced enforcement priorities or what they see as harassing or abusive prosecutions. Jury nullification prevents our criminal justice system from becoming too rigid--it provides some play in the joints for justice, if jurors use their power wisely.
There is indeed a lack of legislative text about the topic, but that makes sense. How do you draft laws to dictate how people will ignore them? Indeed, it seems to be only the highest judicial levels that give the matter any weight at all (generally courts that don't actually have juries, oddly enough). Apparently just speaking in public in a courthouse about it is enough to get you kicked out or arrested.
There's a semi-urban legend of a burglar who fell down through a skylight while attempting to break into this one home and successfully sued the owners for negligence in not putting up warning signs on their roof. I'm unable to find anything more than anecdote about it, but (assuming it's more or less accurate) you can guess what happened; the jury was told that they had to base their decision off of the facts of the case alone. I'd bet my next paycheck that they were told that though the law, when applied to this case, lost all resemblance to common sense, they would have to find the owners guilty if the facts of the case demanded it.
That said, I fully appreciate the need for juries to consider the law far in front of their own consciences. Were I on a jury, for me to even consider nullification it would have to be a case that I felt to be so detrimental to the public good that to convict the guy would be an insult.

--
Dyolf Knip
Re:Ok fuck it by Celt · 2004-02-04 22:50 · Score: 1

Interesting you refer to Guantanamo Bay, the US is breaking international law with this little beauty so I wouldn't be very proud of it.

I'm happy to be European, at least I know I'm more free than someone from the "land of the free"

--
"WebTV: bringing the Internet into the shallow end of the gene pool since 1995" - Martin Bishop
Re:Ok fuck it by cluckshot · 2004-02-05 06:49 · Score: 1

A friend of mine observed in a pithy moment that the USA was in Star Trek terms "The Borg." ~"Prepare to be assimilated. Resistance is futile." (He and I are US Citizens) This was in response to the observation that the US Government was doing to us things it would not allow to be done to the people from other nations. He noted that they no longer care about us. We have been assimilated!

I have noted to some of my non US friends that I am very concerned with what is occurring here and frankly it should begin to scare them as well. It is as Thomas Jefferson noted, that if Americans ever lost their dedication to personal liberty and their caring for their fellow man, they would build a new Roman Empire --- Worse than the first.

I have seen officials of the US Government well before 9/11 and even before the Bush Admin(43) stating that "We are the new Roman Empire" something I hate with a passion. I have no desire of Empire and I see it as Anti-American as it can be.

--
Never Politically Correct ~ I prefer the facts If you don't like what I say, get a life, or comment yourself.
Re:Ok fuck it by Anonymous Coward · 2004-02-05 08:25 · Score: 0

In the ASS.

'cause he'll be in prison, get it?
Re:Ok fuck it by Anonymous Coward · 2004-02-05 08:29 · Score: 0

Oh, do shut up. No crime was committed because that particular law does not apply to posters from other countries.
Re:Ok fuck it by Gaijin42 · 2004-02-05 09:35 · Score: 1

1) I'm sure Canada has laws against conspiring to assault people

2) If a prosecutor in the US decides that the message was intended for a US audience, it doesn't matter where the poster is, it's a crime in the US. If the poster comes to the US, he goes to jail. If the prosecutor cares enough, he gets extradited, and then goes to jail.
Re:Ok fuck it by GNUALMAFUERTE · 2004-02-05 14:49 · Score: 0

yes, i was a little sensitive, and YES i hate when i post something that is supposed to be funny and it gets modded as insightful or interesting, or is taken literally. in the real world i have to explain to people that i am joking; that should happen here ...
(Maybe i should just improve my humor?) :]

--
WTF am I doing replying to an AC at 5 A.M on a Friday night?

Obligatory POPFile Link by rmohr02 · 2004-02-04 03:21 · Score: 5, Interesting

POPFile, maintained by John Graham-Cumming, is the best spam filter I've used. There may be small flaws with the fundamental concept of Bayesian filters, but POPFile still blocks all my spam.

Re:Obligatory POPFile Link by Tassach · 2004-02-04 03:24 · Score: 2, Informative

Would that be the same John Graham-Cumming referenced in the article who figured out how to defeat said filter?

--
Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
Re:Obligatory POPFile Link by rmohr02 · 2004-02-04 03:29 · Score: 3, Informative

Yes. He says there's ways to beat it, but that they're complicated to do.
Re:Obligatory POPFile Link by joebok · 2004-02-04 03:52 · Score: 2, Insightful

Yes - POPFile is fantastic! Since April 4th, my filter is 99.47% accurate at sorting my mail into 6 buckets. Over 18,000 spams have disappeared without me seeing them.

While it is true that I still have to waste bandwidth and CPU cycles to get rid of this unwanted mail, I no longer have to waste time. I've got my parents, friends, and neighbors all hooked up with POPFile - I believe this is realistically the only way to fight spam - move the decimal place on their success ratio over a couple notches; dig into their bottom line.
Re:Obligatory POPFile Link by WIAKywbfatw · 2004-02-04 04:09 · Score: 1

Would that be the same John Graham-Cumming referenced in the article who figured out how to defeat said filter?

No, the guy who did that is the first guy's evil clone. And he wants $1 billion, billion dollars or else he'll use it. And copyrights over putting your little pinky to the corner of your mouth too.

Seriously, did you think there were two people out there called John Graham-Cumming, one working to defeat spam and one working to propagate it? That's funny.

--

"Accept that some days you are the pigeon, and some days you are the statue." - David Brent, Wernham Hogg
Re:Obligatory POPFile Link by Otter · 2004-02-04 04:22 · Score: 1

And the freaking Slashdot writeup refers to him as "POPFile author John Graham-Cumming". How little are people willing to read?
My favorite comment is the one that goes on to marvel at the willingness of Graham-Cumming to send 10,000 emails to further his evil scheme -- like he didn't do it using, you know, a computer.

--
What I'm listening to now on Pandora...
Re:Obligatory POPFile Link by Tassach · 2004-02-04 05:02 · Score: 1

Obviously, your browser must not correctly render the tag.

--
Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
Re:Obligatory POPFile Link by Suidae · 2004-02-04 07:02 · Score: 1

My only issue with POPFile is that it doesn't have an option to add the POPFile web interface link to the message body (it just goes in an X- header). It's nice that it's in a header, but Mozilla Mail won't let me display just that one header, it's all or nothing, so if a message is misclassified, I have to go looking for the link to get into POPFile to reclassify it. It's a pain in the butt, and the author's solution is basically a justified 'too bad, that's an email client problem, bug your vendor'. Sure, but in the meantime, how about a friggin patch?

Fortunately it's Perl, so I hacked the URL into the message body. It works most of the time, except under some encodings.

Other than that, POPFile is great, it consistently performs much better than Mozilla's built-in filters.
Re:Obligatory POPFile Link by nigelo · 2004-02-04 07:20 · Score: 1

How does this fight spam - you weren't going to read or react to it, anyway, were you? So, how does the spammer feel any impact from you not reading it through this great invention? I don't see that it impacts their bottom line at all, sorry.

As long as anyone reads or reacts to the spam, the spammer will continue... it doesn't matter to their bottom line what method the vast majority use to ignore it.

--
*Still* negative function...
Re:Obligatory POPFile Link by WIAKywbfatw · 2004-02-04 07:25 · Score: 1

If that's your best attempt at sarcasm then don't give up your day job.

--

"Accept that some days you are the pigeon, and some days you are the statue." - David Brent, Wernham Hogg
Re:Obligatory POPFile Link by joebok · 2004-02-04 07:47 · Score: 1

You are right that MY use of POPFile doesn't directly impact the spammer - it merely saves me a lot of time. But by installing it for friends, family, and neighbors and generally advocating it (or other solutions), it does stop *some* spam from getting to *some* suckers. That, in turn, will (eventually) affect their response rates and their profit margins.
Re:Obligatory POPFile Link by Anonymous Coward · 2004-02-04 08:03 · Score: 0

I wonder how John got the idea of trying to defeat spam filters. Is it because he's sent out too many emails that bounced due to his last name?
Re:Obligatory POPFile Link by MurphyZero · 2004-02-04 15:50 · Score: 1

I use POPFile myself and love it. I have 7 categories and an overall 95.88% success rate (counting both false positives and false negatives as errors), but most of the misses are due to 3 somewhat overlapping categories. If I merged those three into one, I'd have about a 98% success rate. Since approximately 3/4 of the emails I get are spam, POPFile is a necessity.

I also had to tweak it a little bit, adding some entries to the ignored words in order to improve its success rate. I still have to check those declared spam, but when I do, I am just looking at who the emails are from, and I scan it quickly. And if my mother stopped forwarding junk, I would be getting a much better success rate. Who else has a category called MOMJUNK? And POPFile has gotten pretty good at separating my mother's junk forwards from her actual emails.

But having used it for quite some time now, there are going to be two types of emails that get through a POPFile user's spam filter. The first is the short email, particularly from a new email address. There's just not enough data to effectively filter it. The other is what Graham-Cumming is talking about, and it is somewhat unique to each user. My set of words will be different from the next person's, depending on how I trained the filter. But there may be some words or groups of words that will get a (short) spam message past many/most users' spam filters. But as the spammers discover these words and use them against Bayesian filters, their effectiveness will decrease. And the cycle will begin anew.

--
Our founding fathers removed the guys in charge. Be American. Vote incumbents out.
Re:Obligatory POPFile Link by rmohr02 · 2004-02-05 04:52 · Score: 2, Informative

I choose to view all headers, but then I click the [-] in the top left corner of the headers and then see a single line with Subject:, From:, and time. Then when I want to reclassify something, I click the [+] (same place as the [-]) and copy the X-POPFile-Link header to Firebird or whatever browser you use. <http://bugzilla.mozilla.org/show_bug.cgi?id=23114> is probably what they were referring to when they said this is an email client issue. If that bug is fixed, POPFile will be perfect for me. (Remember that Bugzilla doesn't take /. referrals--you'll have to copy and paste the link location.)

nice name by subjectstorm · 2004-02-04 03:21 · Score: 1, Troll

graham-cumming?

he could be the king of spam, and he might as well go for it. i mean, with a name like that, he probably gets filtered out half the time anyhow.

--
** Chigusaaa!!! You're the coolest girl in the WORLD!!! **

Re:nice name by JohnGrahamCumming · 2004-02-04 03:29 · Score: 3, Interesting

Yes, that's a constant problem for me (and anyone else named Cumming or Cummings in the world). For example I can't get a Hotmail email account because of my name, but I did manage to sign up an account using the name Ivana Watch-Teens-Give-Head :-)

John.
Re:nice name by subjectstorm · 2004-02-04 03:39 · Score: 1

:D

glad to see you manage a sense of humor about it, and realized i was only pointing out the obvious difficulties - not being a jerk. me, my last name is overton.

. . . which wasn't so great when i was a kid, seeing as how i was kind of fat and all.
"over-a-ton" really got old after a while heh

but speaking of hotmail (and, by proxy, msn messenger) have you ever played around with it to see what it will and won't let you name yourself? it's interesting that a lot of profanity is ok, but things like "microsoft" or "windows" were blocked IIRC.

--
** Chigusaaa!!! You're the coolest girl in the WORLD!!! **
Re:nice name by joostje · 2004-02-04 03:42 · Score: 2, Funny

For example I can't get a Hotmail email account because of my name
That's OK, 'cause any mail you would have sent using that From: Graham-Cumming@hotmail.com header would have been filtered away anyway by the recipient's SPAM filters.
Re:nice name by jamehec · 2004-02-04 03:50 · Score: 2, Funny

Naw, his name would have to be Cumm1ng or C.u.m.m.i.n.g to be filtered. ;)

--
This post made with the Dvorak layout.
"Friends don't let friends use QWERTY"
Re:nice name by Anonymous Coward · 2004-02-04 05:35 · Score: 0

You have a valid point, one acknowledged by Mr GC. Yet you are moderated troll. Geez.

That's dedication... :( by bc90021 · 2004-02-04 03:22 · Score: 2, Insightful

It's unfortunate that spam must be lucrative enough that one man will send himself the same message 10,000 times and train an evil filter! We need to get people to stop buying products advertised through spam (granted, easier said than done), as in the end, it's the financial incentive that makes a spammer spam. :(

--
libertarianswag.com

Tch tch... by supersam · 2004-02-04 03:22 · Score: 5, Insightful

Didn't they know something as simple as...

"Make it idiot-proof, and someone will make a better idiot"

Re:Tch tch... by QuiK_ChaoS · 2004-02-04 04:10 · Score: 0, Redundant

ehm, damn straight!
Re:Tch tch... by interiot · 2004-02-04 04:14 · Score: 2, Interesting

Well, that's not necessarily ALWAYS true... for instance, most crypto is at least heavily mathematics based, and therefore is much easier to analyze from a purely theoretical standpoint how much CPU is required to break. And in some cases (eg. DES) a lot of theoretical work HAS gone into them to identify weaknesses and analyze exactly how much CPU is required to break a given key length.
It's just that certain technical protections are not of the "I try some random protection, the idiots and/or hackers try random ways to break in, with various techniques being better than others but we really only know by testing them out in the real world" variety.
But spam unfortunately doesn't fall into that area unless we completely remove anonymity from email, which isn't necessarily the greatest idea. Though I know there are academic proposals for ways to anonymously vote and anonymously send cash in ways that satisfy certain very important criteria (eg. one person can't vote more than once, the receiver of anonymous cash can't retrieve the cash twice from the sender's bank account, the sender can't send a given transaction twice, etc). Do any of these techniques apply to allowing anonymous individual mail and bulk solicited email using a technically verifiable method?
Re:Tch tch... by theskipper · 2004-02-04 06:01 · Score: 1

Speaking of idiots, I received the following spam in my inbox the other day:

"Subject: - : - Greetings Good morning idiot Brsw. : - : dull@FqxyTlZ1emnEB ...
"

First thing that came to mind was "Please kind sir, may I please purchase four of your exquisite products?"

The only way by GuyinVA · 2004-02-04 03:22 · Score: 4, Informative

As technology gets more complicated, so does the spam. The only way to protect yourself is to not give out your address. Period. Heck, I don't even give my work e-mail address to my parents.

Re:The only way by junkymailbox · 2004-02-04 03:25 · Score: 4, Funny

I don't give out my work address to anyone .. and it's not because i fear spam.. :)
Re:The only way by Quill_28 · 2004-02-04 03:25 · Score: 4, Funny

>The only way to protect yourself is to not give out your address. Period.

Ummm.... then what good is it?
Do you just e-mail yourself? :-)
Re:The only way by GuyinVA · 2004-02-04 03:32 · Score: 1

Ummm.... then what good is it? Do you just e-mail yourself? :-)
That's what I do :p When someone needs to give me information, I have them call me. Then I compose a message of what they need to tell me, and send it to myself. It's a simple process. Sure I waste time, but besides surfing /. acting like I'm reading articles that have to do with work, what else am I going to do in my cube... I do also try to find parts for my '52 Pontiac, but it's harder to make that look like work.
Re:The only way by Liselle · 2004-02-04 03:50 · Score: 1

As technology gets more complicated, so does the spam. The only way to protect yourself is to not give out your address. Period. Heck, I don't even give my work e-mail address to my parents.
I like aliases for that, so I always know where (sort of) the spam originated from. The email I have in my user info here on /. is an alias pointing to my real one, and the only place it's posted is here. I've already attracted a horde of MyDoom emails and some tentative give-me-money spam in the last week alone. Thankfully it's easy to pinpoint where it comes from, and shut off the faucet, should I choose to do so.

--
Auto-reply to ACs: "Truly, you have a dizzying intellect."
Re:The only way by Lars+Arvestad · 2004-02-04 04:15 · Score: 1

Wally? Is that you? I thought you were too lazy to reach out for your keyboard!
Anyway, Asok says "hi".

--
Reality or nothing.
Re:The only way by theLOUDroom · 2004-02-04 04:18 · Score: 1

As technology gets more complicated, so does the spam. The only way to protect yourself is to not give out your address. Period. Heck, I don't even give my work e-mail address to my parents.

[Lewis Black voice]If you're not going to give anyone your email address, why the fuck do you need one?[/Lewis Black voice]

Stupidest. Solution. Ever.

How the hell did this get +5 insightful?
This is like saying:
"I never get junk mail anymore, because I never give anyone my address, not even to my parents."

You're not doing anything to SOLVE the problem, you're just avoiding the use of that communications medium altogether.

Real frickin clever. So if I hate solicitors, I should move to a shack in the Siberian wilderness?

--
Life is too short to proofread.
Re:The only way by DustMagnet · 2004-02-04 04:34 · Score: 1

The only way to protect yourself is to not give out your address.
Even that's not good enough. You have to make the name unguessable too. It doesn't have to be as bad as a password, but I've had unused accounts get spam.

--
'SBEMAIL!' is better than a goat!!
Re:The only way by Zerbey · 2004-02-04 05:27 · Score: 1

Nope, you're still going to get hit by random dictionary spammers that way.

The only way to completely block spam is to blacklist everyone except those people who you want to receive mail from, and insist they obtain a PGP (or equivalent) key so you know if it's a forged e-mail address or not.

The rest of us just use decent spam blocking methods and live with the 1 or 2 spams a week that slide through. I use SpamAssassin coupled with RBL lookups and it is very effective.
Re:The only way by Zerbey · 2004-02-04 05:33 · Score: 1

Let me back that post up with some numbers! Here are my mail statistics for yesterday:

(In case you're wondering, this is pulled from a pflogsumm report)

594 received
266 rejected (31%) [Spam blocked by RBL lists, or personal filters]
1 discarded (0%) [E-mail-borne viruses are discarded]

Out of the remaining e-mails that got through, 6 spams were killed by SpamAssassin.

Nothing legitimate was blocked, as far as I can tell.
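
A note on the 31%: 266 out of 594 would be closer to 45%, so the figure only works if the percentage is taken over everything the server saw. A quick check of that arithmetic, assuming (this is a guess at how the report computes it, not something stated in the report) that the denominator is received + rejected + discarded:

```python
# Figures copied from the pflogsumm summary quoted above.
received = 594
rejected = 266
discarded = 1

# Assumption: percentages are relative to all messages the server handled,
# not just the ones it accepted.
total_seen = received + rejected + discarded


def percent(part: int, whole: int) -> int:
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


rejected_pct = percent(rejected, total_seen)
discarded_pct = percent(discarded, total_seen)
print(f"rejected: {rejected_pct}%  discarded: {discarded_pct}%")
```

Under that assumption both percentages match the report.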
Re:The only way by indianajones428 · 2004-02-04 06:44 · Score: 1

This sig is copyrighted and watermarked with anti scan/copy/print mechanism. It is illegal to reproduce this sig.

So what are you going to do if someone does?

--
When a thing has been said, and said well, have no scruple. Take it and copy it. --Anatole France
Re:The only way by GuyinVA · 2004-02-04 07:03 · Score: 1

Ok, to clarify for the people that take things too frickin literal:
Don't give out your friggin e-mail address to people, unless you have business to do with them. Crikey, lighten the hell up.

Yes it does help to solve the problem because spammers will be less likely to get your address. And I never said that you will eliminate all spam.
Re:The only way by lrucker · 2004-02-04 07:13 · Score: 1

Until one of the people with a legit reason to have your address double-clicks the latest MyDoom variant, at which point it harvests your email address from the idiot's address book.
In the year I'd worked here, I got no spam, until after MyDoom hit.
Re:The only way by Rary · 2004-02-04 08:00 · Score: 1

If you're careful about where you give out your email address, you can give it out and avoid spam. The problem is, all it takes is one small mistake and you're screwed.
My work email address has never received a single bit of spam (so far). I give it out frequently, but I'm careful about who I give it to. My home email address didn't get any spam for many years, until I made a single post to a newsgroup and (accidentally) included my real email address. That address is now essentially unusable. It gets 50 or so spam emails every day.

--
"You cannot simultaneously prevent and prepare for war." -- Albert Einstein
Re:The only way by Anonymous Coward · 2004-02-04 09:43 · Score: 0

It only takes one of your buddies (that know your email address) to sign up on any free-pr0n site using your email addr. The secret recipe for instant spam, and get ready for a good load of it.
Re:The only way by jbarr · 2004-02-04 09:45 · Score: 1

Unfortunately, not giving out your address doesn't necessarily guarantee a spam-free world. I had an email account set up that I NEVER used. Every month or so, I would log into the account, and it never had any spam. Suddenly, after about 9 months, when I logged in, I had dozens of spam emails. The service provider claims they never released my email information to anyone, and I never gave the email address to anyone.

Assuming that my service provider is truthful, how do you explain this other than a brute-force spamming method?

--
My mom always said, "Jim, you're 1 in a million." Given the current population, there are 7000 of me. God help us all!
Re:The only way by Anonymous Coward · 2004-02-05 03:55 · Score: 0

Actually, the only real way (and still use email) is to never download a message from someone who's not in your address book (the white list approach).

Has its drawbacks (hard to add new people), but...
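
The white list approach is simple enough to sketch. This is only an illustration: the `ADDRESS_BOOK` contents and the `is_whitelisted` name are made up, and a real client would check before downloading the body rather than after.

```python
from email.utils import parseaddr

# Hypothetical address book; in practice this would come from the mail client.
ADDRESS_BOOK = {"mom@example.com", "boss@example.org"}


def is_whitelisted(from_header: str, address_book=ADDRESS_BOOK) -> bool:
    """Accept a message only if its From: address is in the address book."""
    _display_name, address = parseaddr(from_header)
    return address.lower() in address_book


print(is_whitelisted("Mom <Mom@Example.com>"))
print(is_whitelisted("Cheap Pills <deals@spam.invalid>"))
```

Note that this trusts the From: header, which is trivially forged; that is the gap the PGP suggestion elsewhere in this thread is meant to close.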

Great by Polkyb · 2004-02-04 03:23 · Score: 3, Interesting

I don't mind him trying to defeat the filters, if it comes up with a method of improving them, but the BBC should be shot for including the words that made it through

Guess which words all of tomorrow's SPAM will contain...

--
I've never shoed a horse, but I once told a donkey to piss off!

Re:Great by stevesliva · 2004-02-04 03:30 · Score: 5, Funny

Guess which words all of tomorrow's SPAM will contain...
Touch my wireless Berkshire Marriott?

--
Who do you get to be an expert to tell you something's not obvious? The least insightful person you can find? -J Roberts
Re:Great by Polkyb · 2004-02-04 03:32 · Score: 1

no comment...

--
I've never shoed a horse, but I once told a donkey to piss off!
Re:Great by Anonymous Coward · 2004-02-04 03:35 · Score: 0

Then only he will get it.
Those words are on his personal HAM list.
Re:Great by alwayslurking · 2004-02-04 03:38 · Score: 1

Those words are Mr Graham-Cumming's "magic" words. The article says you'd need to repeat the process for a particular individual to generate an equivalent list for them or, at best for the spammers, run the process against a pool of interconnected individuals, employees at the same company for example, to generate an organisation-wide list. My popfile probably wouldn't automatically let Berkshire or Marriott through, since I don't have sufficient ham that contains those words.
Re:Great by dirt_puppy · 2004-02-04 07:04 · Score: 1

why don't you simply put these words in your filter before tomorrow?

Here's a sneaky one... by Channard · 2004-02-04 03:24 · Score: 4, Interesting

Mozilla's filtering catches most spam for me, but some gets through. However, the only one that actually fooled me was quite a sneaky one - headed RE: Question from E-Bayer or whatever the actual subject is where you E-Bay something. Given that I sell on E-Bay, the spammers must have taken a gamble that enough people would read the subject and deem it worth looking at.

Re:Here's a sneaky one... by aussersterne · 2004-02-04 03:27 · Score: 2, Interesting

I have received piles of these recently. The names, item, item number, and amount change randomly, but it is always structured like a legitimate eBay message. I'm nervous about adding them to my bayesian filtering because I don't want to miss any eBay messages. I, too, sell a lot on eBay...

--
STOP . AMERICA . NOW
Re:Here's a sneaky one... by Threni · 2004-02-04 03:31 · Score: 2, Interesting

What, exactly, is wrong with the `make it computationally expensive to send email` solution Microsoft and others have proposed?
Re:Here's a sneaky one... by the+real+darkskye · 2004-02-04 03:33 · Score: 1

The best solution for this would be to get yourself a 2nd address, even if it is just a redirect, and use that for E-Bay. Filter incoming messages for that address into their own IMAP folder (or whatever takes your fancy) and junk E-Bay references that arrive at any other address.

I know it's a simple solution and I've probably overlooked the fact that not every ISP gives their users a nice <anything at all>@username.isp.net pop3/imap/smtp-queue like mine.

--
Music is everybody's possession.
It's only publishers who think that people own it.
Fuck Beta
~John Lenno
Re:Here's a sneaky one... by Scodiddly · 2004-02-04 03:40 · Score: 2, Interesting

"the spammers must have taken a gamble that enough people would read the subject and deem it worth looking at"

A lot of spam works that way. I get stuff headed "Re: your account", "Credit Card Overdue", etc. Spammers accept incredibly low response rates, because sending is so cheap. So the chances are that they're going to have some header you really don't want to filter.

The odds are almost good enough that perhaps someday they'll randomly send me (and many other people) a header with my own credit card number, just by blind chance.
Re:Here's a sneaky one... by mattdm · 2004-02-04 03:42 · Score: 1

Sucks for legitimate high-traffic mailing lists not run by megacorporations.

And it's not just mailing lists: at Boston University, we have a brand new cluster of eight fast Linux boxes to deal with campus e-mail. Plus several older Sun systems. They keep up -- usually. And that's with e-mail _as it is_.
Re:Here's a sneaky one... by AndrewHowe · 2004-02-04 03:47 · Score: 1, Insightful

Nothing. People just have to realise that filtering based on content doesn't work, and will never work, until perhaps we have strong AI. Once the penny drops, we can move on...
Having said that, the collaboration between spammers and pornographers means that they will have access to a lot of processing power. They just need to exchange porn for e-stamps. MS have probably thought about it, but I don't know what their solution is.
Re:Here's a sneaky one... by Anonymous Coward · 2004-02-04 03:50 · Score: 0

Given that Bill Gates is involved, new clients and servers that comply with the standard will likely be windows-only, patented, trade-secret, and under a restrictive license. It seems to be less about spam than it is about setting a new standard for email transfer, and locking everyone but Outlook, Hotmail, and Exchange server out of the new email method.
Re:Here's a sneaky one... by AndrewHowe · 2004-02-04 03:54 · Score: 1

But then you have a situation of trust between the parties, so it's just an authentication problem. Mailing lists don't tend to get spammed, because they're run by nazis. Eheh. No, I mean, you have to apply to join, and that's checked by a human, and if you start spamming, your arse is out of there.
Re:Here's a sneaky one... by jandrese · 2004-02-04 04:04 · Score: 1

That's a pretty big penalty for people running mailing lists, especially if they're running from a rarely noticed 486 sitting in the corner somewhere. Don't kid yourself either, there are many many specialized mailing lists on the internet that don't have major corporate sponsors (but may generate an enormous amount of (legitimate) traffic).

--

I read the internet for the articles.
Re:Here's a sneaky one... by Threni · 2004-02-04 04:16 · Score: 1

You could have an exemption - no computation protection required from people in your whitelist.
Re:Here's a sneaky one... by Threni · 2004-02-04 04:19 · Score: 1

> MS have probably thought about it, but I don't know what their solution is.

They've already done some work on this:
http://research.microsoft.com/research/sv/PennyBlack/
Re:Here's a sneaky one... by jafuser · 2004-02-04 04:24 · Score: 1

Door-to-door salesmen are learning from spammers too.

Ordinarily, I don't bother to answer the door unless I'm expecting someone, especially when I had just heard the same guy knocking on my neighbor's door.

However, I got tricked a few weeks ago by a salesman who knocked on my door, and then hollered "Hello, it's your neighbor".

I answered the door. He claimed to be someone from another building, a college student whose group's funding was cut and who needed to sell magazines to fund a trip to France. When I made it clear to him I wasn't interested in dead trees cluttering my apartment, he suddenly became a donation collector for the local hospital.

Scum.

--
Please consider making an automatic monthly recurring donation to the EFF
Re:Here's a sneaky one... by gowen · 2004-02-04 04:24 · Score: 1

Get a disposable eBay-only address, and anything that looks like it's from eBay but *isn't* to that address can safely go in the Spam-bin.
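
As a rough sketch of that rule (the alias address, the function names, and the "looks like eBay" test are placeholders, and a real filter would want a stricter check than a substring match):

```python
from email import message_from_string
from email.utils import getaddresses

EBAY_ALIAS = "ebay-only@example.com"  # hypothetical disposable address


def looks_like_ebay(msg) -> bool:
    """Crude check: eBay is mentioned in the From or Subject header."""
    text = (msg.get("From", "") + " " + msg.get("Subject", "")).lower()
    return "ebay" in text or "e-bay" in text


def is_fake_ebay(msg) -> bool:
    """Looks like eBay but was not addressed to the eBay-only alias."""
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = {addr.lower() for _name, addr in getaddresses(headers)}
    return looks_like_ebay(msg) and EBAY_ALIAS not in recipients


raw = (
    "From: eBay Member <x@spam.invalid>\n"
    "To: me@example.com\n"
    "Subject: RE: Question from eBay member\n"
    "\n"
    "hi\n"
)
print(is_fake_ebay(message_from_string(raw)))
```

Mail that mentions eBay and really was sent to the alias passes through untouched, so nothing from actual buyers gets binned.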

--
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
Re:Here's a sneaky one... by AndrewHowe · 2004-02-04 04:25 · Score: 1

I know, but have they specifically addressed the point I made, that spammers can get porn-surfers to do their computation for them? It might be in one of their papers, but I don't have time to look through all of them...
Re:Here's a sneaky one... by Anonymous Coward · 2004-02-04 04:26 · Score: 0

I just assumed this was part of some new phishing scam. Use the IE URL bug and make it look like you have to "log in" to see/reply. Then they've got your account to use for selling HDTVs from Hungary...

Dee-Leet!
Re:Here's a sneaky one... by pclminion · 2004-02-04 04:30 · Score: 5, Informative

People just have to realise that filtering based on content doesn't work, and will never work, until perhaps we have strong AI.
That's an overly strong statement to make, and even a little bit irritating to people like myself who actually implement statistical content filters, natural language systems, etc.
If you are equating "content based filtering" to "Bayesian filtering" then you really only understand 1% of the current state of document classification. Bayesian filtering is all the rage right now because it's a linear time algorithm (i.e., implementable on PC hardware). There are document classification schemes that will eat Bayesian for lunch, which are not appropriate for email filtering at this time because of their computational cost. But with continual progress on the algorithms, new methods for reducing search spaces via extremely clever sense-similarity heuristics, and with computers doubling in speed every 18 months, it's closer than you think.
The spam/ham problem is what data mining researchers would call a "toy problem." You want us to classify documents into only two classifications? Only two? Piece of cake. The problem is, you want us to do it on PC hardware where it isn't feasible to run O(n^2) or O(n^3) machine learning algorithms.
Let the researchers continue what they're doing. People are just now starting to apply SVMs and other cool techniques to the problem of spam filtering. You'd be amazed at how many of the well-known data mining and statistical NLP researchers have not even thought of using their arsenal against spam.
It's coming, please be patient.
Re:Here's a sneaky one... by real_smiff · 2004-02-04 04:42 · Score: 1

ah! but unlike an email spammer, you can punch him! so do!
from all of us.
in fact, while you're at it, make him pay for all those we can't punch ;)

--
This is my Sig, this is my Gun. One is for Slashdot and one is for Fun.
Re:Here's a sneaky one... by Broodje · 2004-02-04 04:43 · Score: 1

Where are my mod points... Thanks for that post, it made my day a little better.
Re:Here's a sneaky one... by AndrewHowe · 2004-02-04 04:43 · Score: 1

I don't think I'm being overly strong. The stuff you're talking about is all fine, but it will fail because the spammers will evolve to defeat it. As they have done with every technique to date.
Please don't be irritated, and I'm not trying to stop you. As you were.
Re:Here's a sneaky one... by GregWebb · 2004-02-04 04:56 · Score: 1

How close to a legit eBay message can it be, though, while still achieving the objective of driving traffic to the spam customer's website? Surely that's the difference for the filter to exploit, and it's mostly a case of monitoring the spam folder carefully for a good while. Heck, I never just delete mine, I'll _always_ look through and open anything possibly legit before deleting, and I haven't had a false positive in months.

--
Greg
(Inside a nuclear plant)
Aaaarrrggh! Run! The canary has mutated!
Re:Here's a sneaky one... by Anonymous Coward · 2004-02-04 05:11 · Score: 0

Not really. Do you have any clue how many people sign up for things and then get pissed because you are sending them spam?

I've had people sign up for accounts at my site, get their account information mailed to them, log in and use the site. Then, a month later, they might get an automated system email from my site notifying them of things about their account - only to have them reply in a pissy attitude about "where did you get my address?" and "why are you sending me mail?" to which of course the response back is "uh... YOU SIGNED UP FOR AN ACCOUNT WITH THIS ADDRESS, DUMBFUCK".
Re:Here's a sneaky one... by Mr+Guy · 2004-02-04 05:22 · Score: 1

What you're missing is that evolving spammers from spam also fixes the problem. The better the content filtering is, the more they have to adapt their marketing pitch to fit norms we can consider acceptable. The better the filters are the more they have to tailor their pitch. Eventually they are either sending email people want, or not sending email at all because their message can't get through.

--
Never confuse volume with power.
Re:Here's a sneaky one... by AndrewHowe · 2004-02-04 05:35 · Score: 1

But, as I said, that's a strong AI problem. Anyone can come up with stuff that's arbitrarily close to a real email, but is actually spam. In fact, the more we push spammers along this route, the harder it will be for humans to distinguish between spam and ham.
These people are doing it for money. Lots of easy money. And they will not stop, until people stop paying them to do it.
Re:Here's a sneaky one... by AndrewHowe · 2004-02-04 05:37 · Score: 1

That's easily solved though, innit? Call them a dumbfuck and delete their account. Job done.
Re:Here's a sneaky one... by pclminion · 2004-02-04 05:44 · Score: 3, Insightful

The stuff you're talking about is all fine, but it will fail because the spammers will evolve to defeat it.
I think you overestimate the intelligence of these creeps. The fact that spammers are using more and more of these garbage terms, randomizers, and other hacks to get around the filters actually encourages me -- it demonstrates that they really don't have the slightest clue how statistical content based filtering actually works. Currently, they are taking advantage of the extremely bad decision to assign a 0.4 score to unknown words. The spammers are exploiting a crack in the armor, which means the armor needs to be fixed.
A human can filter spam. A spammer can't weasel his way around human intelligence, so this sets an upper bound on how advanced the spammer techniques can get. All we have to do is get document classification up to the point of competitiveness with human performance, and the problem is solved. And research into these directions isn't wasted, because the motivation for the research is for actual important document organization tasks. The effect of stomping out spam will be a cool side effect.
If a spammer was ever actually intelligent enough to get around serious, well-constructed classifiers, I highly doubt he would be in the business of spamming. To suggest that spammers could intellectually compete with people who have spent years specializing in statistical language processing is a tad bit ridiculous.
At some point, to sell something, the spammer has to say something intelligible which is an advertisement. They can't hide this. Techniques which are foiled by bogus terms at the bottom of the email are broken. It's not a valid reason to believe that spammers are actually getting smart.
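
To make the 0.4 point concrete, here is a toy version of the word-probability combining step. It is a sketch under loud assumptions: the word scores are invented, and it counts every token, whereas real filters often consider only the most extreme ones, which blunts the effect.

```python
import math

UNKNOWN_WORD_SCORE = 0.4  # the default for never-seen words mentioned above


def spam_score(tokens, word_scores):
    """Combine per-word spam probabilities with the naive Bayes product rule.

    Assumption: every token counts, not just the most interesting few.
    """
    log_spam = log_ham = 0.0
    for token in tokens:
        p = word_scores.get(token, UNKNOWN_WORD_SCORE)
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # P(spam) = prod(p) / (prod(p) + prod(1 - p)), computed in log space.
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


word_scores = {"viagra": 0.99, "mortgage": 0.95}  # made-up training results
pitch = ["viagra", "mortgage"]
padding = [f"zxq{i}" for i in range(20)]  # 20 words the filter has never seen

print(round(spam_score(pitch, word_scores), 3))
print(round(spam_score(pitch + padding, word_scores), 3))
```

The bare pitch scores well above 0.9, and the same pitch with twenty unknown words tacked on drops below 0.5, because each 0.4 nudges the total toward ham. Scoring unknown words at a neutral 0.5 would make the padding a no-op in this toy model.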
Re:Here's a sneaky one... by Anonymous Coward · 2004-02-04 05:44 · Score: 1, Interesting

Yeah, yeah, chicken little, the sky is falling.

MS's idea is as harebrained as most of their solutions. SPF and filtering armours more than well enough against spam. With SPF, spammers will no longer be able to forge sender addresses -- you will be surprised what that'll do for their legitimacy.

(And, it doesn't require something as completely fucking satanic as "estamps.")
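
For the unfamiliar: with SPF a domain publishes, in DNS, the list of hosts allowed to send mail in its name, and the receiving server checks the connecting IP against it. A toy evaluator, assuming the record text has already been fetched and handling only the `ip4:`, `ip6:` and `all` mechanisms; the example record and the `check_spf` name are illustrative, not any particular implementation:

```python
import ipaddress


def check_spf(record: str, sender_ip: str) -> str:
    """Evaluate a tiny subset of SPF: ip4:/ip6: mechanisms and 'all'.

    Real SPF also has a, mx, include, redirect and DNS lookups; none of
    that is handled here.
    """
    qualifiers = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    ip = ipaddress.ip_address(sender_ip)
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        return "none"
    for term in terms[1:]:
        result = "pass"  # a mechanism with no qualifier means pass
        if term[0] in qualifiers:
            result, term = qualifiers[term[0]], term[1:]
        if term == "all":
            return result
        if term.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return result
    return "neutral"


record = "v=spf1 ip4:192.0.2.0/24 -all"  # example policy, documentation range
print(check_spf(record, "192.0.2.25"), check_spf(record, "203.0.113.9"))
```

A forged sender address then fails the check unless the spammer is actually sending from one of the domain's listed hosts.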
Re:Here's a sneaky one... by Anonymous Coward · 2004-02-04 05:50 · Score: 0

No need for the name-calling. Thanks for your input, though...
Re:Here's a sneaky one... by AndrewHowe · 2004-02-04 06:09 · Score: 1

I think you overestimate the intelligence of these creeps.

Well, maybe, but I think a lot of people underestimate them, and that's more dangerous. Remember that they need to defeat today's filters, not tomorrow's. They have no need to push the arms race any faster.

A human can filter spam.

Current spam, yes. But it's getting harder. Misleading subject lines make it harder to bulk delete stuff.

I never said any research was wasted... Geez...

Maybe you are smarter than the average spammer, but you need to prove it. When you produce your magical filter, I'll consider your intellectual prowess.

In the meantime, there are other techniques, such as SPF and the Penny Black project, which need to be tried.
Re:Here's a sneaky one... by pclminion · 2004-02-04 06:40 · Score: 1

Misleading subject lines make it harder to bulk delete stuff.
If in doubt, it's spam. Simple rule, really. If some company really wants to report that my "Payment is past due" they'll send me a letter. Besides, most corporate contacts aren't going to be named "kaislyais." It takes a split second to make this judgment.
Maybe you are smarter than the average spammer, but you need to prove it. When you produce your magical filter, I'll consider your intellectual prowess.
You make it sound like I was claiming to be the world's statistical NLP expert. Yes, I have my own ideas, along with hundreds of other people. In general I think it's a safe bet to say we're smarter than spammers. I'm not really interested in proving anything, since I never intended this to be an IQ competition. I don't think it's big-headed to assume I'm smarter than a slimeball.
As for the magical filter, it isn't magical, and it already exists. As I said, PC hardware severely limits the kinds of algorithms that we can use without making the user impatient. If you'd like to run it, I'll gladly send it to you, but don't expect me to document anything, and don't complain about the run time :-)
Currently I use a hybrid filter which uses a word clustering algorithm to concentrate the information which flows into a feed forward neural network. Currently I'm examining the use of SOMs instead of feed forward networks to automatically generate document classes. I'm also looking into ways to use the neural network as feedback to fine-tune the clustering algorithm.
I believe these blends of statistical and information-theory techniques with "sloppier" systems like neural networks will become more and more valuable as people continue to research them.
As I said though, it's a research project which means it has no documentation and the training process is laborious and not yet automated. It's not the kind of thing you install on a mail server and just forget about. I'm more interested in the filter than the mechanics of getting it integrated into mail systems.
Re:Here's a sneaky one... by Suidae · 2004-02-04 07:16 · Score: 1

The problem is, you want us to do it on PC hardware where it isn't feasible[...]

How long are we talking here?

Can we process, say, 10 messages a day? Bayesian knocks out 99.5% of my spam effortlessly and snags nearly all the ham as well. The problem is the 0.1% it can't classify. I'd have no problem waiting on those to be classified using heavier duty methods if they can get to me within a few hours.
Re:Here's a sneaky one... by Martin+Wolf · 2004-02-04 07:32 · Score: 1

What, exactly, do you mean by "not feasible to do on PC hardware"?

The reason I ask is, my private mail/web server is ridiculously overpowered anyway. Even if it had to spend several minutes on a single e-mail, I might consider that acceptable if it meant a close-to-100% effective spam filter. So, where can I get me some of those algorithms?

In fact, a modern PC isn't that much less powerful than, say, a Sun or SGI box in terms of raw processing power, is it? So I'm wondering what kind of hardware you run your "non-toy" problems on. Are we talking about algorithms that will keep an entire mainframe huffing and puffing for hours?
Re:Here's a sneaky one... by scrytch · 2004-02-04 07:36 · Score: 1

What, exactly, is wrong with the `make it computationally expensive to send email` solution Microsoft and others have proposed?

The fact that spammers have mastered distributed computing by using millions of zombie machines to send email. You add a little latency to each one of those while you basically shut down a legitimate mailing list or just a busy outMX that doesn't steal everyone else's resources.

In fact, widespread deployment of this scheme would increase the volume of spam relative to legitimate mail.
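
For readers who have not seen the "computationally expensive" proposals, here is a minimal sketch of a hashcash-style stamp in Python. It assumes SHA-256 and a "leading zero bits" difficulty target; the names `mint_stamp` and `verify_stamp` and the `recipient:nonce` stamp format are hypothetical, chosen for illustration, and are not the scheme Microsoft or anyone else actually proposed.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # A non-zero byte has (8 - bit_length) leading zero bits.
        bits += 8 - byte.bit_length()
        break
    return bits


def mint_stamp(recipient: str, difficulty: int = 12) -> str:
    """Sender side: brute-force a counter until the hash meets the target."""
    for nonce in count():
        stamp = f"{recipient}:{nonce}"
        digest = hashlib.sha256(stamp.encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return stamp


def verify_stamp(stamp: str, recipient: str, difficulty: int = 12) -> bool:
    """Receiver side: one hash is enough to check the sender's work."""
    if not stamp.startswith(recipient + ":"):
        return False
    digest = hashlib.sha256(stamp.encode()).digest()
    return leading_zero_bits(digest) >= difficulty


stamp = mint_stamp("user@example.com")
print(stamp, verify_stamp(stamp, "user@example.com"))
```

The asymmetry is the whole idea: minting costs thousands of hashes on average, verifying costs one. The objection above is that a zombie network pays the minting cost with stolen CPU time, while a legitimate mailing list pays it with its own.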

--
I've finally had it: until slashdot gets article moderation, I am not coming back.
Re:Here's a sneaky one... by Anonymous Coward · 2004-02-04 07:45 · Score: 0

A human can filter spam.

I bet there are a lot of humans who can't.
Re:Here's a sneaky one... by pclminion · 2004-02-04 07:56 · Score: 1

What, exactly, do you mean by "not feasible to do on PC hardware"?
We're talking about O(n^3) algorithms here. If you double the amount of data being fed to the algorithm, the run time goes up 8 times. At least it's not exponential, I guess.
And we're not talking about keeping an entire mainframe working hard -- we're talking keeping a Beowulf cluster of mainframes working hard. Imagine a lexicon of 100,000 symbols (not unreasonable), where the run time is O(n^3) in the lexicon size. That's a quadrillion algorithmic steps to be performed -- assuming you can do each step in a nanosecond (a total impossibility) it will still take you 11 days to finish! And that's just one iteration -- you need to keep iterating until the algorithm converges!
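The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the figures from the post (a 100,000-symbol lexicon, one step per nanosecond); the variable names are made up for illustration.

```python
lexicon_size = 100_000          # symbols, as in the post
ns_per_step = 1e-9              # one step per nanosecond (optimistic)

steps = lexicon_size ** 3       # O(n^3): a quadrillion steps
seconds = steps * ns_per_step
days = seconds / 86_400         # 86,400 seconds per day

# Doubling the input to a cubic algorithm multiplies the work by 2**3.
growth = (2 * lexicon_size) ** 3 / steps

print(f"{steps:.0e} steps, about {days:.1f} days per iteration")
print(f"doubling the lexicon costs {growth:.0f}x as much")
```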
The interesting work currently being done is in figuring out how to reduce the run times of these algorithms to at least O(n^2). There are some linear algorithms (to be precise, they are O(l*m*n) where l and m are significantly smaller than n), in fact, I use one of these algorithms in my own filter. We could do very amazing things if we had the power to run some of these O(n^3) algorithms over realistic data sets. Unfortunately we can't do it on any currently existing hardware.
It's sort of the same situation with computer graphics. All the cool special effects you see in movies these days were understood in theory way back in the late '70s, early '80s. It wasn't until 1995 or thereabout that computers actually became fast enough to implement those ideas.
Re:Here's a sneaky one... by Martin+Wolf · 2004-02-04 08:12 · Score: 1

And we're not talking about keeping an entire mainframe working hard -- we're talking keeping a Beowulf cluster of mainframes working hard.

OK, I get the point. I'll stick to SpamAssassin for now, thanks.. :-)
Re:Here's a sneaky one... by DarkSarin · 2004-02-04 08:49 · Score: 1

Obviously you and your colleagues are not getting enough spam to motivate you to integrate this. Perhaps if you started getting 100-200 junk emails a day you would be a little more interested. Better yet, why not your whole family?
Post your email, along with your entire address book of emails. Then we'll see how long it takes you to become interested in integration.
</sarcasm>

Seriously, though, I think that your solution is interesting. I like the idea of a solid filter, though currently Mozilla catches about 95% of my spam. I think that's great, but if you have a better technique, I would like to have that.

--
"We don't know what we are doing, but we are doing it very carefully,..." Wherry, R.J. Personnel Psychology (1995)
Re:Here's a sneaky one... by silentbozo · 2004-02-04 09:48 · Score: 1

Change your eBay contact address on a regular basis, and filter based on that (or rather, filter everything else, and whitelist what the current address is.) I haven't had to resort to this step yet, but I've planned for it. Right now, I'm getting a lot of paypal-style scams, but it doesn't affect me as much because my official paypal address isn't filtered, and for some reason THAT address isn't getting spammed.
Re:Here's a sneaky one... by AndrewHowe · 2004-02-04 10:15 · Score: 1

If in doubt, it's spam. Simple rule, really.

Oh, if only life were that simple...

It takes a split second to make this judgment.

Right, but I get around 200 spams a day, after filtering. How many do I need to get, how long do I have to waste each day, before you accept that it's a problem? Geez, I swear people like you are part of the problem, not the solution. Grrrr!

I never wanted this to be an IQ competition. You brought that up. But the spammers are winning, and have been since they started. We were promised that Bayesian filters were going to can spam forever, but they failed. And I knew they would. And now it's time for the Bayesian fanboys to step aside, and let someone else have a go. If you feel you're up to it, I'm not stopping you.
Re:Here's a sneaky one... by aussersterne · 2004-02-04 12:47 · Score: 1

A human can filter spam.

Not always. Not this human. And I've been working in the technology industry off and on since the mid-1980's. I've worked as a consultant and a programmer and for a large part of my Internet life had a bang-path email address.

But these days, there are several occasions (one of them being the recent spate of eBay "Question for seller" messages) during which I've had to actually click the link to see if it was spam. And I worked for eBay for some time! I know what sorts of things they do and don't send, even!

Yes, I'm familiar with the easy replies to these sorts of things:

1) Don't click any links in email. Ever. Tell your friends that if it contains a link, you'll assume it's SPAM.
2) Don't accept any HTML mail. If it's HTML encoded, assume it's SPAM.
3) Don't accept any mail with attachments. If it's got attachments, assume it's SPAM.

The problem is that these "easy" criteria filter out 90% of the people I need to communicate with. No, I will not drop that 90%. No, it's not such an easy thing to explain to them, either, how to turn off HTML encoding, etc. Some of them are very bright, even... Many of them are in the computer industry, just not in coding or in consulting or in some position that would require them to be very skilled at the nuances of sending/receiving email beyond "Compose, Type, Send."

I will not break off communication with 90% of my legitimate contacts in the interest of reducing SPAM. And I suspect that most people won't either. And I can't create a whitelist... because I just have too many necessary contacts. And I can't create one mailbox for each contact or for each aspect of my life... because there are too many of those as well and I don't want 66 mailboxes hanging around while I try to keep track of them all.

The point is that I also have increasing difficulty identifying SPAM myself until I actually follow a link and see where it leads. And of course by then it's too late. If I'm a human and a veteran in technology, and I can't figure it out, how is some SPAM filter going to figure it out?

--
STOP . AMERICA . NOW
Re:Here's a sneaky one... by jonhuang · 2004-02-04 16:20 · Score: 1

A little OT perhaps, but what the hell. Here's another sneaky one: my blog's been hit by multiple spiders (badly behaved of course) that do nothing but leave comments with a porno / herbal / penis-mightier site and a spate of keywords. They're not trying to spam me; they're increasing their google rank. Bastards.
Re:Here's a sneaky one... by Anonymous Coward · 2004-02-07 10:07 · Score: 0

Spammers are stupid. (That's rule three).

A spammer can write an entirely innocuous post to trick you into opening an attachment or a web link that guides you to an ad. In fact, the URL can be written such that the website only shows the spamming ad for, say, 4 hours after the spam is sent. Outside that window, or when a probe is detected, the host is unavailable, answers on the wrong port, or redirects the probing host to something innocuous.

Expect this to begin within the next year.

Blocking all this crap *with automated tools* is the hard part.

Re:That's dedication... :( by Anonymous Coward · 2004-02-04 03:24 · Score: 1, Funny

that said while your sig reads like a nigerian scam

The Final Solution. by Anonymous Coward · 2004-02-04 03:25 · Score: 0

It's clear now. They must be killed. If they can't be bothered to respect the fact that people don't want to be bothered, then we the many cannot be expected to hold true to our part of the covenant.

I propose we insert powerful electrodes into their rectums and electrocute them, then skin them alive and make jackets, sporting goods and chamois. If these products should prove unpopular, I propose we just make special purpose chamois out of their skins for the exclusive purpose of cleaning proctology instruments and peepshow surfaces.

Mainstream Media Coverage by Anonymous Coward · 2004-02-04 03:25 · Score: 3, Interesting

I hate to see mainstream media coverage of this practice. I have started to get a lot of these spams lately.

Typically they include a large image at the top which is the entire intended content of the image and then a bunch of dictionary words at the bottom. It's basically impossible to filter these out unless you filter out ALL HTML e-mail because they don't contain any typical spam text.

Re:Mainstream Media Coverage by Proud+like+a+god · 2004-02-04 03:48 · Score: 1

Why not block images in your html emails, or filter on the use of images?
Re:Mainstream Media Coverage by Anonymous Coward · 2004-02-04 04:38 · Score: 0

I forgot to mention that I'm using spamassassin and what gets through that is filtered by the spam filter in mozilla-thunderbird.

I guess I could filter out HTML mail with images, but that would greatly increase the amount of false positive results I would get.
Re:Mainstream Media Coverage by fishdan · 2004-02-04 04:50 · Score: 1

HTML email is a luxury that I can't afford. My email sig says that I filter all HTML email to spam, and I do. My text spam filter works great. If more people would do this, it would end most spam. If someone wants to send me HTML, let them send me a link, and host it themselves. Honestly, HTML email is not worth the bother. If enough people start doing this, it will make a difference.

--
Nothing great was ever achieved without enthusiasm
Re:Mainstream Media Coverage by Anonymous Coward · 2004-02-04 12:44 · Score: 0

Typically they include a large image at the top which is the entire intended content of the image and then a bunch of dictionary words at the bottom. It's basically impossible to filter these out unless you filter out ALL HTML e-mail because they don't contain any typical spam text.

You aren't thinking very much. Just filter out the IMG tag from the email, and allow the rest through.

Almost no legit email has embedded IMG tags that are useful to you.

Anomy is a great tool to filter out crap from html email. It allows the basic html tags to go through, so you can still get messages from lusers who use outlook.

Of course, I still use a text-based email program: pine.
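
The "filter out the IMG tag, allow the rest through" idea can be sketched with Python's standard `html.parser`. This is an illustrative sketch only, not how Anomy works; the `ImgStripper` class and `strip_images` function are hypothetical names, and comments and doctype declarations are simply dropped by this version.

```python
from html.parser import HTMLParser


class ImgStripper(HTMLParser):
    """Re-emit HTML unchanged, except that <img> tags are omitted."""

    def __init__(self):
        # Keep entities such as &amp; as-is so they can be copied through.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form, e.g. <img ... /> or <br/>.
        if tag != "img":
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != "img":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def strip_images(html: str) -> str:
    parser = ImgStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


body = '<p>Hello <IMG SRC="http://tracker.example/bug.gif?id=42"> world &amp; all</p>'
print(strip_images(body))
```

Because the image is never fetched, the kind of "web bug" that reports back to the sender never fires, while the rest of the message still renders.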

my spam filter by SkArcher · 2004-02-04 03:26 · Score: 4, Insightful

if Message header = "type = text/html" then send to "Spam"

It works a treat :)

The other trick I have found useful is the CamelCase nature of my name - spammers tend to mail me either as skarcher or SKARCHER, and both trip filters on my mailbox.

--

An infinite number of monkeys will eventually come up with the complete works of /.

Re:my spam filter by RMH101 · 2004-02-04 03:49 · Score: 1

isn't case-sensitive email contrary to the RFCs? I don't care if you want to do it, just wondering if officially the various mail protocols were meant to support it. i've always assumed (perhaps wrongly) they didn't...
Re:my spam filter by SkArcher · 2004-02-04 03:54 · Score: 1

the actual e-mail sending isn't case sensitive (i get all of them, SKARCHER, skarcher and SkArcher) - but I can filter in case sensitive terms on the To: header, which is what I do - because spammers' e-mail programs almost always are case insensitive in either "grab" or "send" mode
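
A case-sensitive To: check along these lines might look like the following Python sketch, using only the standard `email` package. The function name `to_header_case_mismatch` and the `example.org` address are assumptions for illustration; the poster's actual filter rule is not shown in the thread.

```python
from email import message_from_string
from email.utils import getaddresses

EXPECTED_LOCAL_PART = "SkArcher"  # the exact capitalisation the owner hands out


def to_header_case_mismatch(raw_message: str,
                            expected: str = EXPECTED_LOCAL_PART) -> bool:
    """True if a To: address matches the mailbox name but not its capitalisation."""
    msg = message_from_string(raw_message)
    for _display_name, addr in getaddresses(msg.get_all("To", [])):
        local_part = addr.partition("@")[0]
        if local_part.lower() == expected.lower() and local_part != expected:
            return True
    return False


suspect = "To: SKARCHER@example.org\nSubject: hi\n\nbody"
print(to_header_case_mismatch(suspect))
```

Delivery itself stays case insensitive; only the filtering decision looks at the capitalisation, so a harvester that upper- or lower-cases addresses gives itself away.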

--

An infinite number of monkeys will eventually come up with the complete works of /.
Re:my spam filter by monique · 2004-02-04 06:04 · Score: 1

If I tried this, half the newsletters and order info I *do* want to see would never make it to my mailbox .... Not to mention emails from non-techie friends who don't know how to disable html, or don't want to.

(Yes, you can apply your whitelist before applying the above filters, but you'd still miss messages from new addresses you haven't yet considered.)

I do contact the various offenders, but most of them seem uninterested in changing their ways, and the issue just isn't important enough to miss out on most of these services.

--
-monique
Re:my spam filter by driptray · 2004-02-04 11:36 · Score: 1

The HTML email you get in newsletters and from clueless friends will almost certainly not be Content-type:text/html. It will be Content-type:multipart/alternative.
I have found it completely safe to filter on Content-type:text/html. In addition, any Content-type:multipart/alternative messages that do not have a text/plain section are guaranteed to be spam. Just using these two filters can cut your spam by about half.
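
The two rules above can be sketched in Python with the standard `email` package. This is a minimal illustration, not a tested production filter; the function name `looks_like_spam_by_content_type` is hypothetical, and whether the rules are as safe as claimed depends on your own mail.

```python
from email import message_from_string


def looks_like_spam_by_content_type(raw_message: str) -> bool:
    """Apply the two Content-type rules described in the post above."""
    msg = message_from_string(raw_message)
    content_type = msg.get_content_type()

    # Rule 1: a bare text/html message, with no multipart wrapper.
    if content_type == "text/html":
        return True

    # Rule 2: multipart/alternative with no text/plain section anywhere.
    if content_type == "multipart/alternative":
        has_plain = any(part.get_content_type() == "text/plain"
                        for part in msg.walk())
        return not has_plain

    return False


html_only = "Content-Type: text/html\n\n<b>buy now</b>"
print(looks_like_spam_by_content_type(html_only))
```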

Outlook 2003's non-Bayesian junk filter by Anonymous Coward · 2004-02-04 03:27 · Score: 2, Informative

All spammers have to do is read this analysis of the filter, then include the weighted non-spam strings, while avoiding the spam weighted strings. Pretty simple to blow past their filter.

Alt title: Mr. John Graham-Cumming on Spam Filters by kanotspell · 2004-02-04 03:27 · Score: 1

off

He'd have an easier time avoiding filters... by shrubya · 2004-02-04 03:27 · Score: 3, Funny

...if his surname weren't Cumming. At least his first name isn't Richard.

Fool-proof spam method by Anonymous Coward · 2004-02-04 03:28 · Score: 1, Insightful

A fool-proof spam method is to reply to each piece of email sent to your account, asking for the sender to validate themselves with you. This would be only necessary for senders from addresses that have not yet been validated. This would essentially stop spam dead.

Sure it's a little awkward, but picking through your email for that valid email amongst the spam is even moreso.

Re:Fool-proof spam method by Anonymous Coward · 2004-02-04 03:32 · Score: 1, Funny

I used to know a guy who'd send a segmentation fault to people he didn't want sending email to his university account. (This was when AOL was just starting) He eventually lost his account for a while, turns out the network admins didn't find it as amusing as we did. But it was pretty funny.
Re:Fool-proof spam method by Anonymous Coward · 2004-02-04 04:19 · Score: 0

Do you mean he'd send a 'core' file as text?

I used to send emails with hundreds of embedded ^Gs. of course, that was back when everyone used unix to read their email.
Re:Fool-proof spam method by SillySlashdotName · 2004-02-04 05:48 · Score: 1

If every person required a validation before accepting email from any other person, then a race situation would result and no email would be allowed from anyone or to anyone.

You send me an email, I send you an email requesting confirmation, you send a reply to my confirmation request requesting a confirmation, I send...

Yes it would stop spam dead. It would also stop ALL email dead.

--
Acts of massive stupidity are almost never covered by warranty. --me.
Re:Fool-proof spam method by SillySlashdotName · 2004-02-04 05:59 · Score: 1

Shouldn't have hit submit so soon.

Requesting a validation from every unknown emailer would also send a signal to the spammer that they have hit a valid email address - now they just need to penetrate the spam filter.

1) Send email to billions of addresses.
2) Collect the verification requests (all being valid addresses).
3) ????
4) Profit!

And if they send a confirmation that you accept (social engineering?), they have a free route into your machine again because they are now on your list of 'safe' addresses.

You are describing a self-generating whitelist. A problem with that is there needs to be a mechanism in place to remove entries from the whitelist if/when they are hijacked or zombied by spammers. If that is not automatic as well, you have only placed a stumbling block in their path, not a barrier.

Also, what about spoofed addresses?

--
Acts of massive stupidity are almost never covered by warranty. --me.
Re:Fool-proof spam method by Anonymous Coward · 2004-02-04 07:45 · Score: 0

yep. On his account it ended up being about 20 megs of garbage back when I just got a 1GB hd.

One word: WHITELIST. by jamehec · 2004-02-04 03:28 · Score: 2, Informative

If you've whitelisted your email, that crap won't get through if you're not on the whitelist. That goes regardless of your Subject line. Same story if you do challenge/response, for that matter. Or you can munge, as I do.

I still say spamming needs to be a felony, though.

--
This post made with the Dvorak layout.
"Friends don't let friends use QWERTY"

Headline tone by Faust7 · 2004-02-04 03:30 · Score: 4, Funny

Armoring Spam Against Anti-Spam Filters

That description sounds too noble for an activity like this. More appropriate headlines would be Making Spam Slick as Owlshit or Infusing Spam with Satanic Strength.

--
The coolest voice ever.

Re:Headline tone by Em+Emalb · 2004-02-04 04:01 · Score: 1

dude.

This is the funniest shit I've read all day.

Thanks for almost making me piss myself.

--
Sent from your iPad.

/SARCASM by JeanBaptiste · 2004-02-04 03:30 · Score: 1

dammit slashdot ate my {/sarcasm} tag!

ah well.

Educate the people by Theresa1 · 2004-02-04 03:31 · Score: 2, Interesting

When I was on holiday in Tunisia, we were bothered quite a lot by trinket salesmen, who would not take no for an answer. Initially we had a lot of difficulty getting rid of them because my kids kept wanting me to buy the trinkets. plleeeese !!!!!!!! can we have one ? . Eventually even my kids got fed up with them, and a united front defeated them. Anyway my point is, eventually the whole world will wise up and just ignore spam. There will be no incentive for companies to pay the spammers, and they'll just go away. It might take a while though.

--
This is a manual signature virus. Copy to your signiture file and help me spread.

Re:Educate the people by fuzzybunny · 2004-02-04 03:47 · Score: 1

Easy solution to this: sell them the kids in return for them not pestering you anymore.

Frightfully effective tactic, that.

--
Cole's Law: Thinly sliced cabbage

Nothing to worry about. by Kidbro · 2004-02-04 03:31 · Score: 3, Informative

This would, for most slashdotters, be nothing to worry about. For those of you who didn't RTFA, the entire attack is limited by this particular little gem of info:

He had to send himself thousands of copies of the same message each one holding an encoded chunk of HTML that reported back to him when it got past the filter.

The concept is that the spammer has to find words that are so common in a person's ham that including them in spam would fool the filter. However, as those words are unique to each person, a lot (thousands or more) of spam must be sent to test the filter. The problem for the spammer is to figure out which spam actually got through (in order to identify the important words) - something s/he's not able to do for users with a decent email client...

I still feel quite confident that SpamBayes will keep my inbox free from spam.

--
May we live long and die out

Re:Nothing to worry about. by Anonymous Coward · 2004-02-04 06:50 · Score: 0

... each one holding an encoded chunk of HTML that reported back to him when it got past the filter.

And if you filter out HTML, or your mail client is set to not parse it, then this technique is useless.
Re:Nothing to worry about. by Anonymous Coward · 2004-02-04 12:38 · Score: 0

The concept is that the spammer has to find words that are so common in a person's ham that including them in spam would fool the filter. However, as those words are unique to each person, a lot (thousands or more) of spam must be sent to test the filter. The problem for the spammer is to figure out which spam actually got through (in order to identify the important words) - something s/he's not able to do for users with a decent email client...

I like using Anomy with html email. It strips out annoying html from email messages, including inline IMG tags, javascript and other crap.

Incidentally, it protects lusers who use outlook from many html based exploits.

Why bother? by nakedbonzai · 2004-02-04 03:31 · Score: 2, Interesting

I am still perplexed as to why a spammer wants to bypass someone's spam filter. Obviously, the person will simply delete any spam that gets through. They won't read it, they won't buy the product in question! Well, that's the case for me at least. I'd imagine the .001% of people who do respond to spam have no intention of ever using a spam filter.

Re:Why bother? by Theresa1 · 2004-02-04 03:38 · Score: 1

I can imagine a situation where someone has a good spam filter provided by their company, or isp. They are never bothered by spam, so they are not as hateful of it as most people. If a spammer gets through they may be more inclined to respond simply because they don't normally get spam.

--
This is a manual signature virus. Copy to your signiture file and help me spread.
Re:Why bother? by the+real+darkskye · 2004-02-04 03:39 · Score: 2, Insightful

The answer is simple, the spammers (the ones doing the spammage, not the ones selling the products) are probably making money from every e-mail sent. As such if they dropped the 1,000,000's of e-mail addresses they knew were being blocked from their lists, they'd lose 1,000,000 * [profit per e-mail]

Just my 0.03c (adjusted for inflation)

--
Music is everybody's possession.
It's only publishers who think that people own it.
Fuck Beta
~John Lenno
Re:Why bother? by Bullschmidt · 2004-02-04 03:45 · Score: 1

Because there are plenty of web email services (yahoo) that have built in spam detection. It's not as if the user ever really did any work for these. So if the spammer can get to those users, there may be return on investment. Secondly, it may also be useful against far more users.

--
"Of all days, the day on which one has not laughed is the most surely the one wasted." -Sebastian Roch Nicol
Re:Why bother? by Anonymous Coward · 2004-02-04 04:10 · Score: 0

Rule #1: Spammers are stupid
Rule #2: Spammers lie
Rule #3: (from rules #1 and #2) Spammers' lies are really stupid

I'd use the rule #1 here.
Re:Why bother? by forrestt · 2004-02-04 04:13 · Score: 1

Yes, but if they had a list of say 1,000 addresses of people who are known to buy spamvertized products on a regular basis the [profit per e-mail] part of your equation would go up dramatically. Probably even enough to say:

1,000,000 * [profit per e-mail of bad list] < 1,000 * [profit per e-mail of good list]

I only see it in the spammer's best interest to remove people who don't want spam, but then again I'm not a moron.
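
To make the inequality above concrete, here is a small Python sketch of the break-even point. The list sizes come from the post; the per-e-mail profit figures are invented purely for illustration.

```python
bad_list_size = 1_000_000   # addresses that mostly filter or ignore spam
good_list_size = 1_000      # addresses of known buyers

# The small list wins once its profit per e-mail exceeds the big list's
# profit per e-mail by more than this factor.
break_even_ratio = bad_list_size / good_list_size

# Hypothetical figures, in dollars per e-mail sent.
profit_bad = 0.0001
profit_good = 0.25

small_list_wins = good_list_size * profit_good > bad_list_size * profit_bad
print(break_even_ratio, small_list_wins)
```

In other words, the targeted list only pays off if each of its e-mails earns more than a thousand times what a blind e-mail earns.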
Re:Why bother? by Anonymous Coward · 2004-02-04 04:45 · Score: 0

So, the obvious solution to the spam problem is:

Ban all non-enduser (ISP, work) spam filters!
Re:Why bother? by some+guy+I+know · 2004-02-04 07:14 · Score: 1

That doesn't explain why spammers unmunge deliberately-munged email addresses.
I once posted to a newsgroup as xyz-DONT-SPAM@domain, where xyz was a name that I made up especially for that post, and domain is my domain name.
Within hours, I started getting SPAM at xyz@domain and xyz-@domain, as well as at xyz-DONT-SPAM@domain.
Now, I could see an automatic harvester sending SPAM to xyz-DONT-SPAM@domain, but to send it to the other two, it had to have some sort of filtering to defeat anti-harvesting measures, which means that they are deliberately sending SPAM even to people who make a special effort not to receive it.
That's just evil.

--
Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
Re:Why bother? by protoshoggoth · 2004-02-04 09:06 · Score: 1

They're not so worried about defeating your individual filters, as the ISPs' filters. Those nasty ISPs are preventing millions of mortgage enlargement and Nigerian Viagra ads from reaching the intended idiots.
Re:Why bother? by Theresa1 · 2004-02-04 20:28 · Score: 1

Who can fathom them. They truly are evil.

for future reference, 'tis better to munge the rightmost portion of the domain.

xyz@isp-SPAMBLOCKED.com

that way it doesn't even get to the mail server

--
This is a manual signature virus. Copy to your signiture file and help me spread.

Re:That's dedication... :( by Anonymous Coward · 2004-02-04 03:31 · Score: 0

RTFA

Discovering Keyword by Alien54 · 2004-02-04 03:31 · Score: 1

When a message got through he trained an "evil" filter that helped to tune the perfect collection of additional words. Soon he had generated a short list of words that, if added to a spam message, would guarantee its safe passage into his inbox.

"The actual words it found were a total surprise," said Mr Graham-Cumming.

The list included words such as "Berkshire", "Marriott", "wireless", "touch" and "comment". Including just one of these words convinced Mr Graham-Cumming's real spam filter that a message was ham rather than spam.

Mr Graham-Cumming said defending against spam that uses these words would be very difficult because the words are tied to a person's job and lifestyle. But, he said, the good news is that the technique to discover these trigger words is very time consuming.

the keywords would be different for each person.
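
Why a single well-chosen word can tip a message over is easier to see with numbers. The Python sketch below assumes a simple naive-Bayes combination of per-word spam probabilities and the 0.4 default for unseen words mentioned elsewhere in this thread. All the per-word probabilities are invented for illustration, and the sketch combines every word; a filter that only looks at the most extreme words would behave differently.

```python
from math import prod

UNKNOWN_WORD_PROB = 0.4   # assumed default for words the filter has never seen

# Made-up per-word spam probabilities for one hypothetical user.
TOKEN_SPAM_PROB = {
    "mortgage": 0.90,
    "click": 0.70,
    "berkshire": 0.01,    # a word that is common in this user's ham
}


def spam_score(words):
    """Combine per-word probabilities into one spam probability."""
    probs = [TOKEN_SPAM_PROB.get(w.lower(), UNKNOWN_WORD_PROB) for w in words]
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


plain = spam_score(["mortgage", "click"])
armored = spam_score(["mortgage", "click", "Berkshire"])
padded = spam_score(["mortgage", "click"] + ["zxqv"] * 8)

print(f"plain={plain:.3f} armored={armored:.3f} padded={padded:.3f}")
```

With these numbers one strongly "hammy" word does more damage than eight random unknown words, which is why the per-person keyword list is worth the effort of discovering it.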

--
"It is a greater offense to steal men's labor, than their clothes"

Re:Discovering Keyword by Mikkeles · 2004-02-04 03:58 · Score: 1

'The list included words such as "Berkshire", "Marriott", "wireless", "touch" and "comment". Including just one of these words convinced Mr Graham-Cumming's real spam filter that a message was ham rather than spam.

Mr Graham-Cumming said defending against spam that uses these words would be very difficult because the words are tied to a person's job and lifestyle.'

So now he's just broadcast to all spammers exactly how to get their spam through his filters!^)

--
Great minds think alike; fools seldom differ.

Re:That's dedication... :( by andih8u · 2004-02-04 03:32 · Score: 3, Insightful

We need to get people to stop buying products advertised through spam

As you alluded to, it'd be easier to teach fish to fly. The internet essentially carries with it a stupid-user tax. Worms, virii, spam, et al are the by-products of stupidity, but as with most taxes, it's just something that you have to deal with.

--

slashdot, news for crazed liberal socialist zealots

"and can be combated." by junkymailbox · 2004-02-04 03:33 · Score: 1

yes .. discover keyword .. but how do you combat the spammer?

Re:"and can be combated." by GMontag · 2004-02-04 04:07 · Score: 5, Funny

but how do you combat the spammer?

1. Find spammer

2. Kill spammer

3. Become hero of the interweb

4. Write book from prison

5. ???

6. Profit!

Your question is exactly why the death penalty belongs on the street, not in prison.

--
Eve Fairbanks says I drive a hybrid!LOL
Re:"and can be combated." by Anonymous Coward · 2004-02-04 09:02 · Score: 1, Funny

I believe step 5 should be "publish book" in this case.
Re:"and can be combated." by Anonymous Coward · 2004-02-04 09:15 · Score: 0

Don't spoil the mystery!

Not a problem with proper training by ronmon · 2004-02-04 03:33 · Score: 1

Bogofilter does a really good job set as a filter rule in sylpheed-claws. Very few of those 'random valid word' type spams evade the filter, but every now and then one does.

No problem. Just drag that sucker into the spam folder and the next hourly cron job learns about it. I've never seen it miss a repeat spam and false positives are extremely rare.

how NOT to get SPAM 101 by musikit · 2004-02-04 03:35 · Score: 3, Insightful

1. don't sign up on any page that requires your email address to verify *cough*like this one *cough*

2. don't use free email services hotmail etc.
3. don't use AOL
4. don't let anyone have your address that forwards messages like "cute bunny pic" or "funny anti-geek joke" etc.
5. don't post your email anywhere.
6. don't sign up for majordomo lists.

Re:how NOT to get SPAM 101 by Anonymous Coward · 2004-02-04 03:45 · Score: 0

I'd also add don't read your email in an internet cafe.
Re:how NOT to get SPAM 101 by Rudebr00d · 2004-02-04 03:49 · Score: 1

To solve #1 just keep a dummy free email account used solely for the purpose of verifying registration to websites such as slashdot. Thus your real email address never gets used or posted, and the legit activation emails sent by these sites wouldn't be terribly difficult to find even amongst an onslaught of spam.
Re:how NOT to get SPAM 101 by Carthag · 2004-02-04 03:51 · Score: 1

Works until one of your friends gets a virus and your addy gets out that way. Or a dictionary attack gets you. You'll never be safe.
Re:how NOT to get SPAM 101 by musikit · 2004-02-04 03:51 · Score: 1

To solve #1 just keep a dummy free email account used solely for the purpose of verifying registration to websites such as slashdot.

which is what i do.

ohh i forgot to add. following my own instructions i haven't gotten a spam in 6 months. and in the last 3 years i've gotten maybe 1 spam/week on the average.
Re:how NOT to get SPAM 101 by grandmofftarkin · 2004-02-04 04:00 · Score: 2, Insightful

1. don't sign up on any page that requires your email address to verify *cough*like this one *cough*
2. don't use free email services hotmail etc.
3. don't use AOL
4. don't let anyone have your address that forwards messages like "cute bunny pic" or "funny anti-geek joke" etc.
5. don't post your email anywhere.
6. don't sign up for majordomo lists.
Yeah great and I'm sure it works a treat, BUT 1 and 6 are not practical for many people. As for 2 and 3, for whatever reason these services may suit some people (money constraints, location). Some people have friends or relatives who do 4, should they just start ignoring them? What if they want to converse with those people [are these playboy bunny pics by the way? ;-)]? As for 5, one simple mistake and you are done for anyway.

Also, why should spammers be allowed to prevent people from using the internet as they see fit? No, I'm sorry but there are better solutions than trying to follow all your advice. I mean, whilst your points are valid you might as well say:

7. Don't use the internet

I guarantee that last one will work perfectly!
Re:how NOT to get SPAM 101 by Gaijin42 · 2004-02-04 04:01 · Score: 1

Since all of those things are useful to me (except 2 and 3) that isn't really an option.

What IS an option : www.sneakemail.com

create a new email address each time you need to sign up for a website/list. All the addresses forward to your main account.

If that address gets spammed, you can delete it, or apply filters so that only the list/site can send you mail from it.

It's free, but if you pay them ($12/6mo) you get a bunch of nifty features, and a 50MB/mo limit. (free is like 10MB/mo I think. but for email that is a lot)
Re:how NOT to get SPAM 101 by musikit · 2004-02-04 04:06 · Score: 1

Some people have friends or relatives who do 4, should they just start ignoring them?

respond to the email they forwarded you asking very nicely not to send you these emails. unless of course they are playboy bunnies.

one friend got so bad about sending me these emails i set up eudora to copy the message back to him 100 times before he stopped sending me email like this.

Filter: From Subject FW or FWD Action: FW *100

they'll stop real fast then
Re:how NOT to get SPAM 101 by Anonymous Coward · 2004-02-04 04:08 · Score: 0

7. don't use email
Re: how NOT to get SPAM 101 by gidds · 2004-02-04 05:11 · Score: 1

IME options 1 and 6 aren't important. Really.
I've uniquified my email address for every web site I've signed up with over the last several years, and none of them have led to spam. None.
It's all from a couple of ancient Usenet posts, and the time that a couple of acquaintances posted my address on their web sites, until I discovered it and asked them not to.
Maybe there are unscrupulous web sites out there, but if so, they must be a tiny tiny percentage. As a source of spam, they're insignificant.

--
Ceterum censeo subscriptionem esse delendam.
Re:how NOT to get SPAM 101 by ragnar · 2004-02-04 05:17 · Score: 1

Good advice for new email addresses, but I have an address that I've used for over 7 years. Too many people use it to communicate with me to drop it. I used to post on usenet. 'nuff said.

--
-- Solaris Central - http://w
Re:how NOT to get SPAM 101 by slashusrslashbin · 2004-02-04 05:40 · Score: 1

I didn't get spam on my personal account for years, by providing unique emails to everything I signed up for, and otherwise being very careful in the ways suggested.

Even the one email address I have on my homepage has only got spammed a handful of times, so collection of addresses by spidering is not really a root cause (too resource intense?)

However, I do now get shed-loads of spam on my personal account; why?

Because each time one of those bloody Windows email viruses comes out, someone I know falls for it, and my email address has got out that way. I think the first time I got spammed was shortly after the Melissa virus hit.

Really annoying.

7. don't have any friends, unless they are l33t!
Re:how NOT to get SPAM 101 by Maestro4k · 2004-02-04 07:17 · Score: 1

Better yet, use a service like Spam Gourmet for anything you sign up for online. I do this and use a unique disposable address for each site. I like to enter contests so this is quite handy. A nice side-benefit is I can tell you which sites I signed up with are spammers because they're the ones with 100+ E-mails trashed and not forwarded. Hell, I even signed up for Spamcop through Spam Gourmet. I made the spamcop reporting address (the one they send from) a trusted sender, now if some spammer manages to somehow get the address I used to sign up at spamcop, they'll get to send a grand total of 2 E-mails before they're all eaten automatically. No bounces to warn the spammer the address is bad either, they just get quietly deleted without anyone ever looking at them.
Re:how NOT to get SPAM 101 by Urox · 2004-02-04 08:23 · Score: 1

respond to the email they forwarded you asking very nicely not to send you these emails. unless of course they are playboy bunnies.
one friend got so bad about sending me these emails i set up eudora to copy the message back to him 100 times before he stopped sending me email like this.
They will continue to send you email. The solution is to NOT give out your core email address to these people. I have several different addresses: personal trusted address (only the smart family members and friends get this one), every day address (as listed in slashdot) that I give to untrustworthy (clueless) relatives and friends, and one for everyone else where I want to be anonymous.

--
"Would you rather have a playstation addicted dork wearing a star wars t-shirt?"

Line Noise by 4of12 · 2004-02-04 03:36 · Score: 4, Informative

A previous story talked about the noise level of spam increasing.

And a very entertaining NYT article that is in the process of expiring.

The upshot is that spam is being forced to look more and more like line noise. It will probably become less and less effective as the message has to submerge to the point where people can't recognize it.

--
"Provided by the management for your protection."

Re:Line Noise by Tom+Christiansen · 2004-02-04 05:45 · Score: 1

And a very entertaining NYT article that is in the process of expiring.

Here's the faux-expired article without the crap.

If you could sit back with Zen-like detachment and observe the dross piling up in your electronic mailbox, the spam wars might come to seem like a fascinating electronic game.
Like creatures running through a maze with constantly shifting walls, spammers dart and weave to sneak their solicitations past ever wilier junk mail filters. They are organisms, or maybe genomes, grinding out one random mutation after another, desperately trying to elude the Grim Reaper.
Viagra becomes "vi@gra" or "v-i-@-g-r-a." Then, as the filters adapt, "v1@gr@" and even "\/l@gr@." Currently, the Internet is swarming with mutants like this: "Cheap Val?(u)m, Viagr@, X(a)n@x, Som@ Di3t Pills Many M3ds RIZfURqgHr77B," the final string of gibberish hanging like an appendage of junk DNA.
Taking a different approach, a come-on for barnyard pornography devolves into "faurm galz bing e rottic." Another pitch promises to reveal "Seakrets of ((eks-eks-eks)) stars."
Dispiriting as it is to start the morning with a hundred of these orthographic monsters crouching in your inbox, there is reason to take heart. Measured in bits and bytes, the sheer volume of spam may not have diminished. But advanced filtering software, which learns to recognize the mercurial traits of junk e-mail, is having an effect. The spammers' messages are becoming harder and harder to decipher. Sense is inevitably degenerating into nonsense, like a pileup of random mutations in an endangered species gasping its last breaths.
Earlier this month, when Internet experts met in Cambridge, Mass., for the 2004 Spam Conference (available as a Web broadcast at Spamconference.org), they showed just how far the science of spam fighting has come. For all the recent talk of suing spammers and compiling a national do-not-spam list, most speakers were putting their hopes in technological, not legal solutions. The federal government's new junk e-mail law, the Can Spam Act, barely rated a mention.
Terry Sullivan, a spam researcher with a doctorate in information science, described how he used a "handy 10-dimensional high-fidelity model of historical spam space" to analyze how junk e-mail changes over time. Long stretches of stability are suddenly interrupted by brief bursts of innovation, a pattern he compared to what some evolutionary biologists call punctuated equilibrium. The encouraging news is that there is enough stability--an enduring core of "spamminess"--for the invaders to be quickly identified and destroyed.
Another presentation, called "Cockroaches Hate the Light," considered how to authenticate senders so that spammers can't easily fake their identities. Other speakers proposed eco-electronic solutions such as digital postage stamps that would put a price on sending e-mail--trivial for an individual user but making hit-or-miss barrages prohibitively expensive.
Like epidemiologists discussing how to predict and control a biological outbreak, conferencegoers compared the merits of various filtering techniques. Which is better: first-order Bayesian, token grab bag, sparse binary polynomial hash or markovian weighting? The meaning of the terms may be opaque to outsiders, but the underlying message comes through: the spammers are up against some increasingly advanced cybernetic artillery.
Many experts believe that solving the spam problem will require a combination of approaches. But laws take forever to pass and amend. Technological fixes like sender authentication and electronic stamps would also take time to carry out, but filtering is already here--and it is reducing the spammers' messages to feeble signals swamped by a roar of alphanumeric noise.
The turning point came

Only if you're the author. by Eevee · 2004-02-04 03:36 · Score: 3, Insightful

In the article, it points out those words listed are good for getting past his filter. If you don't normally have mail that uses those words, then your filter will still catch it as spam.

Now, if you do deal with the Berkshire Marriott frequently, asking them for comments on your wireless setup, then yes you're up the creek.

Re:Only if you're the author. by cynicalmoose · 2004-02-04 03:59 · Score: 1

As Paul Graham has pointed out, this only works if you use the 'ham' words from the corpus. Which are highly specific, clearly. Graham points out that if you send him mail with the word 'lisp' in it, it gets treated as ham, because no spammer is interested in programming.

He also points out that even if you put in a large number of neutral words (which are easier to guess), they won't outbalance the spammy words. For it is difficult to sell viagra without using the word 'viagra' - and the word 'vi@gra' is even more incriminating, as it will never occur in normal email.
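A rough sketch of the arithmetic behind that claim, assuming the product-style combination Graham describes in his spam-filtering write-up. The per-token probabilities below are made up for illustration, and `combine` is a hypothetical helper name, not any real filter's API. A neutral word sits at 0.5, so it multiplies the spam product and the ham product by the same factor and cancels out; only words the filter has actually seen in your own ham pull the score down. (Graham's write-up also restricts the calculation to the most "interesting" tokens, which this sketch leaves out.)

```python
from math import prod


def combine(probs):
    """Combine per-token spam probabilities into one message score.

    Uses the product form: P = prod(p) / (prod(p) + prod(1 - p)).
    """
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


# Hypothetical token probabilities, invented for this example.
spammy = [0.99, 0.99]        # e.g. 'viagra', 'vi@gra'
neutral = [0.5] * 20         # padding words the filter has no opinion about
hammy = [0.01, 0.01, 0.01]   # words that really do appear in *your* ham

score_plain = combine(spammy)
score_padded = combine(spammy + neutral)   # neutral padding added
score_hammy = combine(spammy + hammy)      # recipient-specific ham words added

print(score_plain, score_padded, score_hammy)
```

With these invented numbers, the neutral padding leaves the score where it was, while a few genuinely hammy words drag it under 0.5. That is why the attack depends on guessing words from the recipient's own corpus.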

--
Exercise your right not to vote. thinkoutside.org

If there was any way of filtering these... by Channard · 2004-02-04 03:37 · Score: 1

.. it would have to rely on the randomness of the sender's email, which is a giveaway when you actually look at the sender. It's as jumbled as the sender's email for most spam emails. The catch is, as the above poster mentions, missing an E-Bay mail isn't something that's particularly desirable. And I don't think Mozilla's filter could work effectively enough - Bayesian as it is - on just the jumbled 'from' address.

Re:That's dedication... :( by Anonymous Coward · 2004-02-04 03:37 · Score: 0

I modded you up even if you did say "virii" instead of "viruses".

Re:Discovering Keyword Demographics by Alien54 · 2004-02-04 03:38 · Score: 3, Interesting

[hit the submit key too fast ....]

The keywords would be different for each person.

But I suppose you could discover a select set of keywords for specific demographics, if you defined them very precisely. This would move spam out of the normal "spew it everywhere" phase, where they would have to pay for real marketing data.

Which sort of misses the point of free advertising in the first place, at least for the small guy. Of course, the big boys can pay for this sort of thing.

--
"It is a greater offense to steal men's labor, than their clothes"

the Personal Computer ... by jiffah · 2004-02-04 03:38 · Score: 1

... has now become the Personal Bill Board.

Re:One word: WHITELIST. by RimfireShooter · 2004-02-04 03:38 · Score: 1

All challenge/response does is send challenge messages to people that get joe jobbed and increase junk mail even more.

Duh by Ricin · 2004-02-04 03:39 · Score: 4, Informative

Of course I can break my own Bayesian filtering.

What matters is that while one person's spam might be very similar to another person's spam, their ham isn't. At best, it would require a semi-personal approach to sneak in spam. That's why you need to continually train your filter in the first place. Rinse and repeat, that's what it's all about.

What's being described is not really a flaw, but rather a saturation point at which it's time to retrain your filter and perhaps even start over with a new database. The old one gets too much 'noise' after some time.

They do point out one thing, be it from the spammers' POV: Bayesian filtering is a continuous process and not an end-all solution. It requires fresh input and gets less effective if you keep old crud around for too long and if you train it too much on virtually the same spam/ham.

It's still a much better solution than blacklists.

Re:Duh by triffidsting · 2004-02-04 06:00 · Score: 1

The overfitting you describe is correct AFAICT. The Naive Bayesian approach assumes that the overall form of the mail being classified is reasonably consistent - if you train the Naive Bayesian classifier on a different pool of spam and then treat that as a "static" classifier, its classification accuracy will certainly be reduced.

--
Non, je ne veux pas coucher avec toi ce soir.

Sigh. It's depressingly predictable by heironymouscoward · 2004-02-04 03:39 · Score: 3, Interesting

Why is everyone surprised that every technique designed to eliminate spam can be fought? It's obvious that this is going to happen.

The question should be: how do we live in a world where 99.9(n)% of email is spam? When the virus writers and zombie masters and spysters start using their communications infrastructure for its intended goal of delivering advertising?

It's inevitable, and no amount of spam filtering will avoid it.

Here's a prediction I made maybe 6 months ago on Slashdot: we're going to start seeing viruses that modify real outgoing emails to include their advertising messages. (And no Outlook jokes, thanks...) How does one filter spam when real emails are also infected?

--
Ceci n'est pas une signature

Let them do so and beat them where it hurts... by DocSnyder · 2004-02-04 03:40 · Score: 2, Interesting

What they can't hide is the spamvertised target, as they want their victims to click onto a link and order something. Now you can resolve a link's IP address and check it against some common DNSBL blacklists (most spamvertised hosts are listed on SBL, SPEWS or chinanet.blackholes.us), or extract its domain and test it against some RHSBL or manual lists.
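For anyone who hasn't poked at a DNSBL by hand, here is a minimal sketch of the lookup described above, assuming the usual convention of reversing the IPv4 octets and appending the list's zone. The function names are hypothetical, and `dnsbl.example` is a placeholder zone; each real list publishes its own zone name and usage policy.

```python
import socket


def dnsbl_query_name(ipv4, zone):
    """Build the DNS name to query: reversed octets followed by the zone."""
    octets = ipv4.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {ipv4!r}")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ipv4, zone):
    """Return True if the zone answers for this address (i.e. it is listed).

    Any resolver failure, including having no network at all, is treated
    as 'not listed' in this sketch.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ipv4, zone))
        return True
    except socket.gaierror:
        return False


def link_host_is_listed(hostname, zone):
    """Resolve the host from a spamvertised link, then check its address."""
    return is_listed(socket.gethostbyname(hostname), zone)
```

A filter would pull the host out of each URL in the message body, call something like `link_host_is_listed(host, zone)` for every list it trusts, and bump the spam score on a hit.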

What is more, if you multiply Bayesian or "word list" spam scores with results obtained with other methods, spammers may put "non-spammy" words into their spams as they like, but they only score their crap up instead of down.

There's a .bomb business model by Anonymous Coward · 2004-02-04 03:41 · Score: 0

murder for hire via distributed micropayments.

Ironically you can be like the spammers, or Ted Kaczynski, and run the business out of your home and a PO Box.

New form I got today by Anonymous Coward · 2004-02-04 03:42 · Score: 0

Got a new form of a spam scam today I haven't seen before. Asks you to call the equivalent of a 900 toll number. Number and website removed to avoid giving them the plug they desperately want. This one wasn't very well done, but I suspect I'll be seeing more.

Hi,

Once upon a time there was a hard-working software engineer slaving away under cruel masters. The engineer poured heart and soul into his work till early hours every morning, with the promise of glorious profit sharing. When the work was finally done, this poor engineer was rewarded by being dismissed and shown the door.

The company I used to work for runs a website:- www.XXXXXXXX.co.uk. However after I had left, they went live with the system, WITH THE TESTING BACKDOOR STILL IN PLACE !!!!! If you call their competition line on 0906 XXX XXXX and enter "0" instead of a real answer, then the system lets you through to win a prize - Idiots! They do charge the call at 1.50 per minute but it only lasts one and a half minutes.

Moral of this story? Don't p*ss off employees, especially one's you fire!

Viva the workers! Down with the bosses! Share the wealth

Re:New form I got today by Technician · 2004-02-04 04:11 · Score: 1

I never get nailed with those. I always call toll free numbers from the lobby pay phone. Any numbers that have a charge of anything, never get called. It keeps the surprises off my bill.

--
The truth shall set you free!

Nowhere near as effective as my attack by Jerf · 2004-02-04 03:45 · Score: 3, Interesting

Well, I may not have made it into the BBC but my attack is much more effective and much, much harder to defend against: Bayes Attack Report.

It even counters the "personalization" quality of Bayes filters by finding the "common core" of personalization that we all share.

Fortunately, spammers continue to be too stupid to understand this attack. Last time I posted this on Slashdot I got joe jobbed, because apparently it's easier to do that than to actually figure out what I was talking about.

In summary, I wouldn't worry about your Bayes filters for a while: While they are attackable, spammers are too stupid to understand the attacks. (My article has been posted for over a year.) Thank goodness, sort of. (This will eventually be a temporary situation... but I see no particular evidence that the breakthrough will happen anytime soon.)

Re:Nowhere near as effective as my attack by Ricin · 2004-02-04 04:18 · Score: 1

Yes this method is better.

It's similar in the sense that what you want to achieve is that the spam/ham decision is as fuzzy as possible. Put another way: get those spam/ham graphs to be less steep or, taking a Bell curve as a model, get a very broad bell instead of a sharp peak. At the very least you'll increase the amount of "unsures" and false positives and thereby increase the need for training.

I'm not a statistics person, do I understand this correctly?
Re:Nowhere near as effective as my attack by Jerf · 2004-02-04 05:15 · Score: 1

Somewhat. That's an effect of the poisoning the filter experiences as these "hammy spams" are marked as spam; the boundary between "spam" and "ham" gradually fuzzes until the computer can't tell them apart.

The initial phase of the attack lies in creating the spams that get past the filters in the first place. One direct use of this attack is to better those "random word" blocks, to turn them into "hammy word/phrase" blocks. (Then, of course, the bayesian filter authors stop processing those blocks, and the spammers put it somewhere else, and we're right back to an arms race.)
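A toy sketch of the poisoning effect described here, assuming a bare-bones per-token estimate (fraction of spam messages containing the word versus fraction of ham messages containing it). `TokenStats` and the sample words are invented for illustration and are not taken from any real filter.

```python
from collections import Counter


class TokenStats:
    """Per-token message counts for a toy spam/ham classifier."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def train(self, words, is_spam):
        # Count each word once per message.
        (self.spam if is_spam else self.ham).update(set(words))
        if is_spam:
            self.n_spam += 1
        else:
            self.n_ham += 1

    def p_spam(self, word):
        """Estimated probability that a message containing `word` is spam."""
        s = self.spam[word] / max(self.n_spam, 1)
        h = self.ham[word] / max(self.n_ham, 1)
        return 0.5 if s + h == 0 else s / (s + h)


stats = TokenStats()
for _ in range(100):
    stats.train(["meeting", "agenda"], is_spam=False)   # ordinary ham
    stats.train(["pills", "cheap"], is_spam=True)       # ordinary spam
before = stats.p_spam("meeting")

# The user dutifully marks 50 "hammy spams" as spam...
for _ in range(50):
    stats.train(["pills", "meeting"], is_spam=True)
after = stats.p_spam("meeting")

print(before, after)
```

Each hammy spam that gets marked as spam nudges a formerly safe word toward the middle, which is the gradual fuzzing of the spam/ham boundary described above.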
Re:Nowhere near as effective as my attack by Ricin · 2004-02-04 05:59 · Score: 1

Thanks for your answer.

It seems to me that "the bayesian filter authors stop processing those blocks" isn't really possible because of the binary nature of the filter and the total absence of any interpretation of the tokens being processed, for example in which combinations do they occur.

AFAIK this is not being done yet but it could provide a "context sensitive" weight factor for your tokens. It would likely make the math a lot more complex though and may not be practical because of CPU use and larger databases.
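One cheap way to get a bit of that context, sketched under the assumption that adjacent word pairs are fed to the filter as extra tokens alongside the single words (`tokens_with_pairs` is a made-up name for illustration):

```python
def tokens_with_pairs(text):
    """Return single-word tokens plus adjacent word pairs as extra tokens."""
    words = text.lower().split()
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


print(tokens_with_pairs("Berkshire Marriott wireless"))
```

The trade-off is the one raised above: the token database grows by roughly one extra entry per word position, and every lookup has more to chew through.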

BTW, spammy hams should pose the same potential breakdown of the currently used - rather simple - Bayesian algorithms.
Re:Nowhere near as effective as my attack by JohnGrahamCumming · 2004-02-04 06:48 · Score: 1

How about writing this up, perhaps with more experiments and submit it to the Spam Conference? I'm sure others there would be interested in hearing about your proposal.

John.
Re:Nowhere near as effective as my attack by Jerf · 2004-02-04 09:27 · Score: 1

I had a fairly negative message about not really having the time, but I note the next conference is presumably a year away.

You've sorely tempted me, and I could write an even simpler attack than what I was going for in that piece (no obvious blocks of ham text, no "cheating" whatsoever). The worst part is the huge quantities of ham I need to collect!

Maybe I'll futz with spam bayes a bit more and see if I can break this down into little manageable chunks. (I wish there was some easy way to grab thousands of Usenet messages without having to hand mark spam vs. ham...)
Re:Nowhere near as effective as my attack by JohnGrahamCumming · 2004-02-04 10:04 · Score: 1

The SpamAssassin test corpus has a collection of labelled ham and spam that's quite handy. Perhaps start with that.

I'm thinking about doing a paper for the April spam conference about a variety of attacks on Bayesian filters. Perhaps we should be considering doing a joint presentation since this interests you?

Perhaps email me directly if you want to talk more.

John.
Re:Nowhere near as effective as my attack by sschinke · 2004-02-05 10:24 · Score: 1

Obviously, the easiest way to mark messages as ham/spam in a newsgroup context is using a learning filter.

POPFile (for example) has an experimental NNTP filtering module.

Of course, all of this depends on messages tailored using your attack not ending up in the news feed being used to maintain statistical information about spam/ham words. *g*

Regards,
Sam

Re:One word: WHITELIST. by andih8u · 2004-02-04 03:45 · Score: 2, Interesting

I think whitelists end up discouraging quite a few legitimate users as well as spammers. I've received emails from people asking questions about this or that, I hit reply, and get shot back a message saying that I have to ask their permission to send them an email, even though I'm replying to them. I dunno if they're not setting up their whitelist properly to automatically add any address they send mail to, but I'm not going to hassle with writing out a reply to them, then having to go back a few minutes later and ask their permission to respond to the message they sent me in the first place.
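The "automatically add any address they send mail to" behaviour is easy to sketch. This is a minimal illustration with hypothetical names (`Whitelist`, `record_outgoing`, `needs_challenge`), not how any particular challenge/response product works:

```python
class Whitelist:
    """Toy whitelist that learns addresses from the user's outgoing mail."""

    def __init__(self):
        self._allowed = set()

    @staticmethod
    def _normalize(address):
        # Compare addresses case-insensitively, ignoring stray whitespace.
        return address.strip().lower()

    def record_outgoing(self, recipients):
        """Call whenever the user sends a message: trust everyone addressed."""
        for recipient in recipients:
            self._allowed.add(self._normalize(recipient))

    def needs_challenge(self, sender):
        """Only senders the user has never written to get a challenge."""
        return self._normalize(sender) not in self._allowed


wl = Whitelist()
challenged_before = wl.needs_challenge("helper@example.org")
wl.record_outgoing(["Helper@Example.org"])   # user asks helper a question
challenged_after = wl.needs_challenge("helper@example.org")
print(challenged_before, challenged_after)
```

With that hook in place, a reply to a question the user asked goes straight through instead of bouncing a permission request back to the person trying to help.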

--

slashdot, news for crazed liberal socialist zealots

Re:That's dedication... :( by JohnGrahamCumming · 2004-02-04 03:46 · Score: 1

Not only did I send myself 10,000 spams, I bought these incredible enlarger pills from myself for three easy payments of $9.95 and I now have a monster in my pants :-)

John.

just skin the spammers alive by RMH101 · 2004-02-04 03:47 · Score: 1

...two problems solved for the price of one. easy.

Really don't understand it. by The+I+Shing · 2004-02-04 03:48 · Score: 4, Insightful

I've said this before, but I'll say it again. I really don't understand why all this even happens.

When I'm going through the webmail access to my spam-bait accounts (the ones that are listed on my websites that I don't bother retrieving with my POP email client anymore because of hundreds of spams a day to each), if I'm fooled into opening one up, most likely because of it having a subject header that might be someone legitimate, the moment I see that the message body says anything spammy I immediately click the Delete button. I imagine everyone else in the world is doing the same thing.

It's gotten to the point where the preoccupation of spamming is just to get past filters, the result of which is that the message is grumblingly deleted by the irritated recipient. Who out there is saying, "Oh, look, this message got past all my spam filters and contains a lot of jumbled, garbled nonsense text alongside a plug for herbal penis enlarging pills. This must be legitimate. Now, where's my credit card,"? Do the spammers think that we're all clones of Dilbert's pointy-haired manager?

Spamming is not only irritating, it's pointless. Who is paying these people to spam us? Are people actually buying penis enlarging pills and patches, herbal viagra, mortgage refinancing, credit repair kits, or any of that stuff? Enough to put millions of dollars a month into the hands of career spammers?

I'm hopelessly at sea in this matter.

--
You are in error. No-one is screaming. Thank you for your cooperation.

Re:Really don't understand it. by One+Louder · 2004-02-04 03:57 · Score: 2, Insightful

It all depends upon where the blocking is taking place. Clearly some people are responding to spams, so there appears to be some incentive for the spammers to get their message through.
Obviously, if an individual has gone to some trouble to set up spam filters, then she doesn't want to be bothered and the spam is pointless. However, the vast bulk of these filters are set up by the ISPs, and there's some value to the spammer to get through them to the idiot on the other side who apparently might actually respond to the spam.
Re:Really don't understand it. by andih8u · 2004-02-04 04:03 · Score: 3, Funny

Here's the simple solution. Simply have your friends send you mail with "hot viagra teen sex mortgage" in the subject. Since all the spam is getting past the filters into the inbox, all of your real mail will be waiting for you in your junk mail folder

--

slashdot, news for crazed liberal socialist zealots
Re:Really don't understand it. by Anonymous Coward · 2004-02-04 04:16 · Score: 0

Ha! Last week I was getting an insistent spam from a very small company that shall remain nameless. After the 10th message or so I got pissed off and went to their website (father forgive me, a hit) to grab their email address.

Next step, signed them into all gay porn websites google provided me with. Take a bit of your own medicine and learn!
Re:Really don't understand it. by argStyopa · 2004-02-04 04:36 · Score: 2, Funny

Spamming is not only irritating, it's pointless. Who is paying these people to spam us? Are people actually buying penis enlarging pills and patches, herbal viagra, mortgage refinancing, credit repair kits, or any of that stuff? Enough to put millions of dollars a month into the hands of career spammers?

SHH!! If people paying for these things start looking carefully to see if they actually get a return on their investment, all sorts of lunacy may follow:
- Companies may start asking: Let's see, I spend $1 million on making the ad, and another $1 million for a 30-second spot on the superbowl - did I really get $2 million more PROFIT (not sales) that I wouldn't have gotten anyway without it?
- Producers might realize that there are hundreds and thousands of extremely talented actors willing to work for salaries many orders of magnitude less than big Hollywood stars, are we really getting that many more people walking into a movie BECAUSE it's starring the Governator or Julia Roberts?
- Sports franchises might wonder why they are paying $40 million in salaries for 5 guys to play basketball to (if you take out the advertising revenue, above) sell 15,000 seats that are probably worth about $15 each in net profit - that's a measly $225,000 per soldout game. 100 games later, they've paid for about half the team.
- People might start wondering why they are paying $8 to go to a movie, or $100 for an event (concert/sport) ticket, when there are about 10,000 other things better that they could do with their lives.

That's crazy talk, man.

--
-Styopa
Re:Really don't understand it. by tbmaddux · 2004-02-04 04:56 · Score: 4, Funny

Are people actually buying penis enlarging pills and patches, herbal viagra, mortgage refinancing, credit repair kits, or any of that stuff?
Let me take a moment to tell you my sad story. I was in desperate need of penis enlargement, and so I did start ordering those pills. But they proved hard to swallow, and the patches were itchy, and I had an allergic reaction to the herbs in the herbal viagra. Unfortunately, I bought so much of this stuff that I had to refinance my home, and the bank wouldn't approve my loan because of all the penis purchases on my credit cards. So as a desperate last measure, I ordered some credit repair kits, but that didn't work either!
Fortunately, this story has a happy ending! As I wrote this message, some polite people in West Africa contacted me and I think they are going to get me out of this financial mess.

--
Can't you see that everyone is buying station wagons?
Re:Really don't understand it. by jdreed1024 · 2004-02-04 05:29 · Score: 1

Who out there is saying, "Oh, look, this message got past all my spam filters and contains a lot of jumbled, garbled nonsense text alongside a plug for herbal penis enlarging pills. This must be legitimate. Now, where's my credit card,"? Do the spammers think that we're all clones of Dilbert's pointy-haired manager?
No, but some people are. Even if one person anywhere responds, the spammer can deem it a success. It's all about the cost of spam being paid by the recipient. You'd get just as many flyers and catalogs as spam if the USPS allowed advertisers to send 3rd class mail with postage due and forced you to pay it.
And there are a sufficiently large number of insecure teenage boys with credit cards that there's always going to be someone who wants this herbal viagra, no matter how crappy the message looks.

--
There is no sig, there is only Zuul.
Re:Really don't understand it. by silicon+not+in+the+v · 2004-02-04 05:47 · Score: 1

I agree with you regarding some of the standard stuff like Viagra, etc. Unfortunately, spam has spread as a technique for selling just about anything. I have gotten a couple of spams for products that I had already thought about buying. One of them was when looking through ThinkGeek, I saw the Forever Flashlight. It sounded really cool to have a flashlight that uses an LED instead of a bulb and doesn't need batteries. I have gotten a couple of spams trying to sell me Forever Flashlights since then. I don't think they were directing me to ThinkGeek to buy them--I didn't click to find out. That is where it gets dangerous. If you click on one spam for something legitimate, your email address has just become bait for every other spammer to know that it is a "good" address, and one got through.

In the words of a wise 900-year-old muppet: Once you start down the [spam] path, forever will it dominate your [inbox].

--
We may experience some slight turbulence and then...explode. -Capt. Mal Reynolds
Re:Really don't understand it. by Tom · 2004-02-04 06:16 · Score: 1

Who out there is saying, "Oh, look, this message got past all my spam filters and contains a lot of jumbled, garbled nonsense text alongside a plug for herbal penis enlarging pills. This must be legitimate. Now, where's my credit card,"

About 0.01% of the recipients, which translates to a thousand sales in a 10-million-message spam bombing run.
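A quick back-of-the-envelope check of that figure, as a sketch only: the 0.01% response rate and the 10-million run size are the poster's own estimates, not measured data, and the function name is made up for illustration.

```python
# Expected sales from one spam run, using the estimates quoted above.
from fractions import Fraction


def expected_sales(messages_sent: int, response_rate: Fraction) -> Fraction:
    """Expected number of buyers if each recipient responds at the given rate."""
    return messages_sent * response_rate


run_size = 10_000_000        # one "10-mio." bombing run (poster's figure)
rate = Fraction(1, 10_000)   # 0.01% is 1 recipient in 10,000 (poster's figure)

sales = expected_sales(run_size, rate)
print(sales)
```

Exact fractions are used instead of floats so the tiny rate does not pick up rounding noise.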

Are people actually buying penis enlarging pills and patches, herbal viagra, mortgage refinancing, credit repair kits, or any of that stuff? Enough to put millions of dollars a month into the hands of career spammers?

Yes.

--
Assorted stuff I do sometimes: Lemuria.org
Re:Really don't understand it. by ggvaidya · 2004-02-04 07:41 · Score: 1

Who is paying these people to spam us? Are people actually buying penis enlarging pills and patches, herbal viagra, mortgage refinancing, credit repair kits, or any of that stuff? Enough to put millions of dollars a month into the hands of career spammers?

Scott Richter is. Spam is low-cost-high-return business, which means lotsa moolah for very low overheads. Under those conditions, it's worth the risks/suits/insults/death threats ...
Re:Really don't understand it. by X_Bones · 2004-02-04 08:01 · Score: 1

Who out there is saying, "Oh, look, this message got past all my spam filters and contains a lot of jumbled, garbled nonsense text alongside a plug for herbal penis enlarging pills. This must be legitimate. Now, where's my credit card,"?

As difficult as this may seem to a Slashdot reader, not everyone in the entire world is using Bayesian filtering or even any sort of filtering at all. I'm not (mostly because I only give my email address to family and close friends and use AIM for everyone else), my grandmothers aren't, many of my friends aren't. I'd say that there are quite a few people on the internet who (a) don't use filtering; (b) have a history of giving their email address out basically to whoever wants it; and (c) sign up for online promotions with companies who turn around and sell their email address. Spammers rely mainly on this type of person, since they're the likeliest to be suckered into buying a product advertised through spam. I'd imagine filter avoidance is a secondary priority right now, but it'll become more important as more ISPs start using filters.

Spamming is not only irritating, it's pointless. Who is paying these people to spam us? Are people actually buying penis enlarging pills and patches, herbal viagra, mortgage refinancing, credit repair kits, or any of that stuff? Enough to put millions of dollars a month into the hands of career spammers?

The amount of spam sent out depends a lot on the current state of the economy. When your business' profits are in the toilet, spamming starts to look really attractive due to its extremely low cost given the size of the audience reached. Hopefully, if things start turning around and businesses start making money, they'll think twice about spamming; by then, consumer outrage begins to matter more than saving a smaller percentage of your advertising budget.

--

the coolest club on /.

Discovering Keyword Demography by Alien54 · 2004-02-04 03:48 · Score: 1

So we now have the field of keyword demography, an essential tool for spammers, but one which will be tremendously expensive to develop data on, and which will be sold dearly if the data is ever developed. They could probably sell this stuff for thousands of dollars per copy.

--
"It is a greater offense to steal men's labor, than their clothes"

Re:Discovering Keyword Demography by EugeneK · 2004-02-04 04:42 · Score: 1

Anything that makes the spammer spend more time and effort per spam is good. Anything which raises the cost-per-spam is a victory! :)

Re:That's dedication... :( by kent_eh · 2004-02-04 03:51 · Score: 3, Interesting

One thing we can do is to make the spammers==virus_writers connection every time anyone asks us about (or even mentions) viruses.

Aren't we the ones our friend(s) and co-workers ask about computer stuff?

I have taken this a step further and contacted a few "computer journalists" locally and suggested that they make the spam/virus connection the next time they are writing about the latest virus. It's natural to answer the question 'where do these viruses come from' when talking about the latest scourge of the internet.

--

---
"I can't complain, but sometimes still do..." Joe Walsh

Hello js6679, git yur viafrA TOady! by Rudebr00d · 2004-02-04 03:53 · Score: 1

So does this explain why more than half the spam that makes it through to my inbox looks like an illiterate a0l script kiddy wrote it?

Re:Discovering Keyword Demographics by tbannist · 2004-02-04 03:54 · Score: 2, Insightful

You're not thinking like a spammer; it won't change things very much. If a spammer discovers different keywords that reach different demographics, what do you think he'll do? I'm betting he'll just send the spam to every address once for each of the sets of keywords. So instead of half of all e-mail being spam, we'll see a huge jump where half of delivered e-mail is spam and 90% (or more) of all e-mail is spam.

--
Fanatically anti-fanatical

Re:That's dedication... :( by kris_lang · 2004-02-04 03:55 · Score: 5, Informative

Yes, it's dedication to research. He sent himself the 10k messages to see if he could outwit his own Bayesian filtering of spam messages. He effectively deduced that if the incoming message can be similar enough to items that have been specifically marked non-spam by the end-user of the Bayesian-spam-filter, it will not be marked as spam.

There's a cunning recursiveness to this which is at that fine line between clever and stupid. The difficulty is, as he also deduces, that each person's Bayesian rules for spam vs. nonspam are unique and will require many attempts in order to infer the pass-through words that will create a false negative and allow the spam to come through. The one step that people are missing is that if the evil spammer wishes to work on spamming a domain (both in the internet sense and in the "domain of expertise/specialization" sense) she can tailor the pass through words to the market. If she's sending spam to Intel or AMD corporate addresses, then lithography might be the magic word; if she's spamming Xilinx, the fpga will route through the Bayesian filter; if she's spamming Dave Barry, then debenture and fish falling from the sky might help spam make it through, Natalie may or may not make it through a /.'ers filter, actually usually including slashdot in the subject or as the name usually will make it through a slashdotter's filter. And the ease of this lies in that tailoring the open sesame words to a market will probably open the doors to all of the e-mail recipients at a domain, particularly is the spam filtering is done at the mail-server level and not at the end-user level. Thus rather than having to send 10k messages to a single user to crack open the spam doors, sending those 10k messages to multiple users at a domain and analysing which ones get through will effectively open the floodgates for all of the users at that internet domain. And using the concept of a priori probability distributions makes the hunt for these sesame words {[tm] /me :) } easier by limiting the dictionary to be searched to the keywords of the field/domain about to be spammed. That is what makes this dangerous.

The counterattack from the corporate mail-server will be to look for these similarly unique messages being sent to multiple users.
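One way a mail server might spot "similarly unique" messages is to compare word shingles across recipients. The following is a minimal sketch under stated assumptions: the function names, the 3-word shingle size, and the thresholds are all invented for illustration, and it is not how any particular filter works.

```python
# Sketch: flag message bodies that are nearly identical across several
# recipients, as a probing campaign with one varying "sesame word" would be.


def shingles(text: str, k: int = 3) -> frozenset:
    """Set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return frozenset([tuple(words)])
    return frozenset(tuple(words[i:i + k]) for i in range(len(words) - k + 1))


def jaccard(a: frozenset, b: frozenset) -> float:
    """Overlap of two shingle sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def flag_campaigns(messages, threshold=0.6, min_recipients=3):
    """messages: iterable of (recipient, body).

    Greedily groups bodies whose similarity to a group's first member meets
    the threshold, then returns the recipient sets of the larger groups.
    """
    clusters = []  # each entry: {"rep": shingle set, "recipients": set}
    for recipient, body in messages:
        s = shingles(body)
        for cluster in clusters:
            if jaccard(s, cluster["rep"]) >= threshold:
                cluster["recipients"].add(recipient)
                break
        else:
            clusters.append({"rep": s, "recipients": {recipient}})
    return [c["recipients"] for c in clusters
            if len(c["recipients"]) >= min_recipients]


template = ("buy cheap herbal pills now from our trusted online store "
            "today limited offer")
mail = [
    ("alice", template + " lithography"),
    ("bob", template + " fpga"),
    ("carol", template + " debenture"),
    ("dave", "minutes from the design review are attached please comment by friday"),
]
print(flag_campaigns(mail))
```

Greedy grouping against the first message seen is crude; a real server would more likely need hashing to keep this cheap at volume.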

Re:One word: WHITELIST. by analog_line · 2004-02-04 03:55 · Score: 1

If people are more concerned with spam than losing some legitimate e-mails, and the prevailing attitude here is that people are generally more concerned with spam than the murder rate in their particular locale, then a whitelist is the ONLY surefire way to not get any of it.

Spam - CounterSpam by Aumaden · 2004-02-04 03:57 · Score: 2, Interesting

I have opted to wage a personal war against spammers. Here's my battle plan:

Roughly once each week, I go fishing through the spam that has been filtered out of my various accounts for URLs. (Sometimes this involves a little digging to get to the final site.) I extract the host names from the URLs and for each hostname, I create 10 fake email addresses.

I pack these emails into messages that I post to Usenet in groups likely to be trolled by Spammers. The spammers scrape these addresses from Usenet and add them to their database. Thus, future mailings will also spam the spammer's clients.

If enough people do this, the generated traffic will begin to overload the client's mail server. After a while the spammer's clients will figure out that every time they employ a spammer, they themselves get spammed.

Even if nothing comes of this, I get the satisfaction of knowing the real perpetrator (the spammer's client) gets to share some of my pain.

Re:Spam - CounterSpam by FuzzyBad-Mofo · 2004-02-04 04:33 · Score: 1

I pack these emails into messages that I post to Usenet in groups likely to be trolled by Spammers. The spammers scrape these addresses from Usenet and add them to their database. Thus, future mailings will also spam the spammer's clients.

So you plan to spam Usenet to get some kind of revenge on spammers? The end does not justify the means.

If enough people do this, the generated traffic will begin to overload the client's mail server. After a while the spammer's clients will figure out that every time they employ a spammer, they themselves get spammed.

Not only is this not going to work, it would also lower the already poor S/N ratio on Usenet. Please reconsider this plan.
Re:Spam - CounterSpam by Anonymous Coward · 2004-02-04 05:03 · Score: 0

Except that the "spammer's client" is usually just some throw-away free website like GeoCities who don't want them to begin with and usually kill the page within a few days. Sometimes you can trace all the redirects back to the real source, but often it's just a web form on some exploited host that posts to a file they pick up later.

Re:That's dedication... :( by surprise_audit · 2004-02-04 03:58 · Score: 1

If advertising via spam were illegal, and if real, live marketing folks that used spam were to find themselves in federal pound-me-in-the-ass prison, I think the flow of spam would drop dramatically.

For spam to work as a marketing tool, there has to be some way for the suckers to reach the sellers, even if there isn't a clickable link or email address. So, launch a law enforcement team down the return path and see who they can dig up. Alternatively, launch Guido and a coupla friends down the return path and don't bother to ask where they buried the spam-advertiser...

For a Start by Greyfox · 2004-02-04 03:58 · Score: 0

Why not require that any mail coming in be encrypted to your (obnoxiously large) PGP key or signed by someone on your keyring? Change keys every month or two. That'd work for the amount of mail I get. TMDA actually filters out just about everything right now so I'm not pissed off enough to go implement this.

Eventually that solution will stop working and it doesn't solve the problem that the mail has to actually be transmitted to you before you can filter it, but I think it'd help keep mailboxes spam free for the next 3-5 years.

--

I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

Re:For a Start by adamjaskie · 2004-02-04 05:00 · Score: 1

My idea: require everyone that wants to send you an email to put a specific string somewhere in the message. Say, something like 489g67298f&89398h*S*. If an email does not have this string, or is not in your whitelist, send it to the spam folder.

As far as web forums go, don't let them display your email. If someone needs to get in touch with you, they can send you a "private message." Since the admin already has your email address, you can always just tell the forum to send you an email to alert you to new private messages.

Whitelists are great. When you give someone your email address, send them an email asap and get their address and add it to your whitelist.

Filter things into different boxes. Don't just use IN and TRASH boxes. I have a box where all my emails from amazon.com go. They don't clog up my inbox, which is only whitelist addresses, and they don't get junked. It is easy to deal with the one or two fake amazon.com spams I get now and then when they are mixed in with only other amazon.com emails.
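The routing rules described in this comment can be sketched roughly as follows. This is only an illustration: the whitelist entry and folder names are made up, the passphrase is the example string from the comment, and it reads the rule as "spam folder unless whitelisted or carrying the passphrase". Sender addresses can be forged, which is why the fake amazon.com mail still lands in the amazon box.

```python
# Sketch of the passphrase / whitelist / per-sender-box routing idea.

PASSPHRASE = "489g67298f&89398h*S*"          # example string from the comment
WHITELIST = {"friend@example.org"}           # hypothetical known senders
FOLDER_BY_DOMAIN = {"amazon.com": "amazon"}  # one box per bulk sender


def route(sender: str, body: str) -> str:
    """Return the name of the box a message should be filed in."""
    sender = sender.strip().lower()
    domain = sender.rpartition("@")[2]
    # Dedicated boxes first, so vendor mail never clogs the inbox.
    if domain in FOLDER_BY_DOMAIN:
        return FOLDER_BY_DOMAIN[domain]
    # Inbox is reserved for whitelisted senders or mail carrying the passphrase.
    if sender in WHITELIST or PASSPHRASE in body:
        return "inbox"
    return "spam"


print(route("orders@amazon.com", "Your order has shipped"))
print(route("stranger@example.net", "buy pills"))
```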

--
/usr/games/fortune
Re:For a Start by ragnar · 2004-02-04 05:32 · Score: 1

I would love for signed email to be more commonplace. I use a digital signature and encrypt with hip people, but good luck convincing more than 0.5% of the email population to do the same. Even most uber geeks are resistant to figure out PGP.

--
-- Solaris Central - http://w

I don't see how this is necessarily a problem by PixelCat · 2004-02-04 03:58 · Score: 3, Insightful

What he's doing is a brute-force attempt to find words with--for himself--a high ham probability. I don't see how this is necessarily going to be an effective general-purpose technique. If you need to start bombarding people with thousands of messages to find the good words you're just going to drive more people into using filters--and this will almost certainly coerce ISPs into doing more filtering as well. Plus, you've got to deal with the issue of keeping data on all those users to find out which words are good for them. This would require you to tailor your spam to each individual user, which probably is going to increase the cost to the spammer (at least in terms of disk storage and time, anyway) and, as Graham-Cumming implemented it, is going to fail utterly for anyone who isn't viewing mail as HTML, anyway.

Re:I don't see how this is necessarily a problem by jeremyp · 2004-02-04 04:13 · Score: 1

Yes, as far as I can see, you need to send thousands of messages to each person on your spam list and train your "evil" filter *separately* for each one. Also, if the target turns off image link following on his/her mail client (well, duh!), it negates the possibility of doing the attack at all. It's hardly unstoppable.

--
All I want is a secure system where it's easy to do anything I want. Is that too much to ask ~~ Randall Munroe
Re:I don't see how this is necessarily a problem by Inuchance · 2004-02-04 08:08 · Score: 1

Well, people who don't use HTML email probably aren't the targets for spam, anyway. I'm pretty sure that everyone who'd actually buy things from spam are the kinds of people who know nothing about HTML, etc.

Re:One word: WHITELIST. by Jordy · 2004-02-04 03:58 · Score: 1

You can always require a challenge response from only people who fail to pass your existing spam filter and then turn your existing spam filter all the way up.

That way you end up with the best of both worlds. Most people will have their messages go through without any problems. The select few that happened to word their emails really really poorly will have to click on a link/reply to a challenge.
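A minimal sketch of that two-tier idea, assuming the spam filter hands back a score between 0 and 1. The class name, the threshold value, and the token scheme are all hypothetical, and actually mailing the challenge is left out.

```python
# Sketch: deliver low-scoring mail directly, challenge everything else.
import secrets


class ChallengeGate:
    def __init__(self, threshold: float = 0.1):
        self.threshold = threshold  # strict: filter "turned all the way up"
        self.verified = set()       # senders who have answered a challenge
        self.pending = {}           # token -> (sender, message)

    def receive(self, sender: str, message: str, spam_score: float):
        """Return ("delivered", None) or ("challenged", token)."""
        if sender in self.verified or spam_score < self.threshold:
            return ("delivered", None)
        token = secrets.token_hex(8)
        self.pending[token] = (sender, message)
        return ("challenged", token)

    def confirm(self, token: str):
        """Release a held message when its challenge is answered."""
        held = self.pending.pop(token, None)
        if held is None:
            return None
        sender, message = held
        self.verified.add(sender)
        return message


gate = ChallengeGate(threshold=0.1)
print(gate.receive("a@example.org", "lunch tomorrow?", 0.02))
status, token = gate.receive("b@example.org", "oddly worded note", 0.4)
print(status, gate.confirm(token))
```

Once a sender answers a challenge they are remembered, so only the first poorly worded message costs them a click.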

--
The world is neither black nor white nor good nor evil, only many shades of CowboyNeal.

good enough by gyratedotorg · 2004-02-04 04:04 · Score: 1

But, he said, the good news is that the technique to discover these trigger words is very time consuming.

this is what's important here. the spam filters dont have to be perfect. they just have to be good enough to make spamming unprofitable, or at least a big enough pain in the ass that it isnt worth the effort anymore.

--
Gyrate Dot Org - "Where high-tech meets low-life"

Re:That's dedication... :( by Zocalo · 2004-02-04 04:04 · Score: 1

I think you missed the point a little. The guy in the article is not a spammer at all - in fact John Graham-Cumming is the author of POPFile, one of the most capable spam filtering tools out there. That he has managed to defeat his own tools is an incredible thing; the amount that you can learn by invalidating your own work is phenomenal. It also means that by the time spammers figure out how to circumvent the technology there is a good chance that the anti-spam tools will already have moved on to the next level.

Frankly, I think spammers are finally on the defensive; put a tuned version of the latest anti-spam software between them and your mailbox and you get no spam. I've been using SpamAssassin with Bayes and then Procmail with several custom rules in both stages for several months. Spams in inbox = zero. Hams in spamtrap = one, and that was a detailed advisory about MyDoom that included a complete sample of the worm *after* I had already added a rule to trap it.

So, we should all get some antispam software, learn how to write your own rules, and when you get a good one share it with your app's other users. Encourage others to do the same. Spammers are currently stuck between a rock and a hard place; if they send clear text Bayes has a field day, and if they obfuscate then it's obviously not legitimate either. Never mind the increasingly dubious and often outright illegal methods some spammers are resorting to to send the stuff in the first place.

Spammers have to run the (maybe slim) risk of running afoul of the law, ever diminishing rates of return, the (maybe minor) inconvenience of having to change ISPs regularly, and are maybe even flirting with organised crime. Sure some of them, and a small "some" at that, make a lot of money but a growing number of them should be taking a hard look at their amount of return for the risks they are taking. John Graham-Cumming has the right approach; we have them on the defensive, now is not the time to relax - it's the time to press the advantage.

--
UNIX? They're not even circumcised! Savages!

spam spam sasage baked beans and spam by Darth23 · 2004-02-04 04:06 · Score: 1

The best way to target spammers is to target the companies who employ them.

"Follow the money" it's a trick I learned from watching Law and Order.

--

-------- In Soviet Russia, "Soviet Russia" sigs hate Slashdot.

Re:spam feeds us by Anonymous Coward · 2004-02-04 04:07 · Score: 0

Alternatives. What is it all about... is it good, or is it whack?

Re:mod parent up by Anonymous Coward · 2004-02-04 04:07 · Score: 0

nt

Completely ineffective? by Mr_Silver · 2004-02-04 04:10 · Score: 1

The list included words such as "Berkshire", "Marriott", "wireless", "touch" and "comment".

Including just one of these words convinced Mr Graham-Cumming's real spam filter that a message was ham rather than spam.

Am I the only one that isn't very surprised by this? Spammers use random words to try and reduce the spamminess score of their email.

Words that someone has never used before will be assigned a score of 0.4 by default. Given that all the other words will have very high spamminess values, what you actually want are words that give very very very low spamminess scores to combat the words like "viagra" and "loans".

If you really wanted to beat your own spam filter, then just scan through your spam database looking for the top 10 lowest scoring words. Then add them to your spam email and you'll be guaranteed that it'll get through. From the BBC article, Mr Graham-Cumming either lives or spends a lot of time in Berkshire (possibly in a Marriott hotel) and has a particular interest in things which are wireless.
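To see why near-neutral random words barely help while a few strongly "hammy" words do, here is a sketch using the naive Bayes combining rule from Paul Graham's "A Plan for Spam". That choice is an assumption for illustration: SpamAssassin and POPFile combine scores differently, and the per-word probabilities and tiny word database below are invented.

```python
# Sketch: combine per-word spam probabilities into one message score.
from math import prod

UNKNOWN = 0.4  # default score for never-seen words, as mentioned above


def combined(probs):
    """Naive Bayes combination: prod(p) / (prod(p) + prod(1 - p))."""
    p = prod(probs)
    q = prod(1 - x for x in probs)
    return p / (p + q)


def lowest_scoring(word_db, n=10):
    """The n words with the lowest spam probability in a {word: prob} dict."""
    return sorted(word_db, key=word_db.get)[:n]


spam_words = [0.99, 0.99]  # e.g. "viagra", "loans" (made-up values)

print(round(combined(spam_words), 4))                  # plain spam
print(round(combined(spam_words + [UNKNOWN] * 5), 4))  # plus random words
print(round(combined(spam_words + [0.01] * 3), 4))     # plus strong ham words

db = {"viagra": 0.99, "berkshire": 0.01, "wireless": 0.02, "the": 0.5}
print(lowest_scoring(db, n=2))
```

Under this rule five unknown words at 0.4 leave the score well above 0.9, while three words at 0.01 drag it down to roughly 0.01.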

Unfortunately I have no idea how to read the database in SpamAssassin as I'd be interested to see what my words would be.

--
Avantslash - View Slashdot cleanly on your mobile phone.

Re:Completely ineffective? by Anonymous Coward · 2004-02-04 11:11 · Score: 0

Unfortunately I have no idea how to read the database in SpamAssassin as I'd be interested to see what my words would be.
sa-learn --dump data
or to see it sorted by probability
sa-learn --dump data | sort
You might want to redirect it to a file, or pipe it to a text viewer.

An unlikely prediction by mr_rangr · 2004-02-04 04:10 · Score: 1

I don't feel that would be an effective spamming technique. A person's outgoing e-mail is such low-volume that a spammer isn't really spreading the word.

Not to mention that it'd have to include a mechanism for the spammer to get paid for the victim sending the message.

I'd lose my patience quickly if someone I knew sent me spam a second time after I alerted them to their problem. Fortunately, I don't know that many clueless people.

Re:An unlikely prediction by Surreal_Streaker · 2004-02-04 06:13 · Score: 1

Regarding viruses which embed spam in legitimate email:
This technique is already in use by hotmail and others, and is presumably effective enough to warrant its continued use. I envision a virus which cuts and replaces Intel for AMD, or similarly targets specific keywords used in the email.
There most certainly is a mechanism for getting paid for this type of spam as well - click through.
I'd lose my patience quickly if someone I knew sent me spam a second time after I alerted them to their problem. Fortunately, I don't know that many clueless people.
Well, you're the only one. Most of us seem not only to know these types of people, but are related.

reversal of the fundamental principle by Diaspar · 2004-02-04 04:11 · Score: 1

From my understanding, current Bayesian filtering works by just statistically separating words that are relevant (from a "ham" pile) and good from the words that you don't like and consider spam. so, what the author of this article essentially does after thousands of trials is he discovers the words that probably just occur most commonly in his own good emails.

How is this original in any way?!?
it's like a babysitter who was told to not open a door to anybody but the owners of the house: "After trying out many different disguises, the babysitter [surprise!] opened a door to somebody most closely resembling the house owners".
I think most of this probably could've been deduced by any half-intelligent person. the trick (i admit) is that the "good" pile of words is different for each person. but still, the method lacks the "wow" factor completely, in my opinion.

Maybe. . . by PhxBlue · 2004-02-04 04:12 · Score: 1

Or maybe not. You're assuming that most people on the internet are at least as smart as your kids, which really isn't true in terms of computer skills.

--
!#@%*)anks for hanging up the phone, dear.

Re:Maybe. . . by Theresa1 · 2004-02-04 04:23 · Score: 1

Still the number of people opening email attachments is going down as people finally start to wise up to viruses. I refuse to give into pessimism. (probably because I'm stupid)

--
This is a manual signature virus. Copy to your signiture file and help me spread.
Re:Maybe. . . by cavac · 2004-02-04 11:13 · Score: 1

I don't think so. Not unless the viruses get really nasty. The current viruses are in reality just trojans for the spammers.

The only thing that would probably make people start to think would be a virus that reformats the hard drive and installs a simple (no-)boot screen saying that the user has installed a virus, lost all his files and should ask someone with *real* technical knowledge what to do now....

--
Look, this thing is totally safe! Built it myself, you know. You just press that button like this and then turn that lev

This just does not work by Anonymous Coward · 2004-02-04 04:12 · Score: 0

I'm the email admin here and I can tell you this just does not work. It's a waste of bandwidth. We run SpamAssassin 2.60 with everything turned on including Bayesian filtering. It filters about 22,000 messages per day and kicks the shit out of spam. Probably a third of the spam we get uses this trick and it has NO impact at all.

Nope: one word: Blocklist by Anonymous Coward · 2004-02-04 04:13 · Score: 0

or blacklist if you prefer:

550 Blocked because of [name of blocklist]: Enjoy your intranet

Not really! by Diaspar · 2004-02-04 04:13 · Score: 1

I get tons of spam at my work address and i DEFINITELY haven't given it out anywhere. have you ever heard of dictionary attacks? try creating an e-mail address which resembles a common name (such as bill or tom or jenny or anything else relatively common) at your isp, and see how much spam you get in. i can almost guarantee there will be much

Re:Not really! by GuyinVA · 2004-02-04 07:28 · Score: 1

The solution to this is to not create an address that resembles a common name. In the year I've had my present work address, I have yet to receive a single piece of spam. If your name is Bill Smith, instead of bill@domain.com how about bismith@domain.com or b.smith@domain.com etc.
I would think this is an easy fix if it really is a problem for you.

Meanwhile in la France by AftanGustur · 2004-02-04 04:14 · Score: 1

You do realize you've just committed a pretty serious Federal crime, don't you? I know you're kidding or just emoting the same frustration many others, myself included, feel about the willful disregard spammers seem to have for many things.

You Americans, you are alwayz zo uptite about petty things. Here in France, it's just a "Crime de passion".. And you walk away with it ...

"Berkshire", "Marriott", "wireless", "touch" and "comment".

--
echo '[q]sa[ln0=aln80~Psnlbx]16isb572CCB9AE9DB03273snlbxq' |dc

Re:Meanwhile in la France by Tony+Hoyle · 2004-02-04 06:30 · Score: 1

Aargh... That last line gives me an image of viagra spam saying things like 'Marriott touched her Berkshires, while the man on the wireless commented about the weather outside'

White lists.... by Mysticalfruit · 2004-02-04 04:14 · Score: 1

The ultimate solution to the problem is going to be white listing.

Yes, I'm sure that I'll miss some important piece of email from someone I've never met (they're probably a princess of some African country and they need my help to move some cash...). Oh wait, I only get email from people that I have some sort of relation with, so that really isn't an issue...

I'm currently using a white list system where I've got two inboxes. One is for general mail and the other is for mail that's from people on my friends list.

I'm yet to get a piece of spam in my "members only" inbox.

--
Yes Francis, the world has gone crazy.

Been there done that... by Thud457 · 2004-02-04 04:17 · Score: 1

got the t-shirt from Leavenworth.

Jim Bell ran afoul of TPTB for making exactly such a suggestion.

--

the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff

Go after the Sellers not just the spammers by OlivierB · 2004-02-04 04:18 · Score: 2, Informative

I don't know about you but here in France we have rules to deal with illicit poster ads. It's a 100-year-old law that people/companies put up on their walls stating that posters will be prosecuted as well as those for whom they are advertising. This takes care of that. If spam laws also targeted the retail stores advertised by said spams, then far fewer Viagra/Nigerian etc. stores would be paying spammers to do this. It's as simple as that, why can't it be done? Don't tell me these stores are abroad, there are international laws for that. Also most of these spam advertised companies are US based.

--
Artificial intelligence is no match for natural stupidity

Re:Go after the Sellers not just the spammers by real_smiff · 2004-02-04 04:44 · Score: 1

interesting, but doesn't that make it rather easy to take down businesses you have a problem with?

--
This is my Sig, this is my Gun. One is for Slashdot and one is for Fun.
Re:Go after the Sellers not just the spammers by OlivierB · 2004-02-04 05:43 · Score: 1

First barrier would be having to pay spammers to harm a said company. Also, in my model, spammers are not the ones to blame; just like in the "Real world" where poster printers and the post office are not liable for the unsolicited advertising. Spammers would come out into the light and they could more easily be summoned to deliver their clients' names. A good idea would be to oblige spammers to verify the identity of the company soliciting advertisement. Think of it like this, would anybody on the local radio advertise against another company? No, because there are laws against that and Radios don't want to lose their businesses. If spammers become legit they too will try to protect their business.

--
Artificial intelligence is no match for natural stupidity

Re:That's dedication... :( by kris_lang · 2004-02-04 04:18 · Score: 1

Intelligent, and a sense of humor, to boot. You must be new here ;) That's not the way it's done.

May I ask you whether you constrained the dictionary of probe words before you sent the 10k spams to yourself? Obviously, you attended a conference/meeting at a Marriott (or were watching a lot of Joe Millionaire) for that one word to pop out, or had "white listed" that word for your Bayesian filter. What about doing contextual filtering? Running messages through a parser to check for contextual / grammatical validity would not only be computationally expensive but would also mark many slashdot comments as non-sense; but what about a parser that checks the immediate predecessor and successor words to see whether a sesame word has just been randomly inserted into text vs. whether it makes sense for that word to be sandwiched as it is?
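The neighbor-word check suggested here could look something like the sketch below. It is only an illustration of the idea: the helper names are invented, the "known context" is just the set of adjacent word pairs seen in mail the user already accepted, and a real filter would need far more data and some smoothing.

```python
# Sketch: does a "sesame word" appear next to words it has kept company
# with in known-good mail, or was it dropped in at random?
import re


def tokens(text: str):
    """Lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def known_bigrams(ham_corpus):
    """All adjacent word pairs seen in known-good messages."""
    seen = set()
    for doc in ham_corpus:
        t = tokens(doc)
        seen.update(zip(t, t[1:]))
    return seen


def looks_inserted(message: str, word: str, seen) -> bool:
    """True if word occurs but never beside a previously seen neighbor."""
    t = tokens(message)
    if word not in t:
        return False
    for i, tok in enumerate(t):
        if tok != word:
            continue
        left_ok = i > 0 and (t[i - 1], tok) in seen
        right_ok = i + 1 < len(t) and (tok, t[i + 1]) in seen
        if left_ok or right_ok:
            return False
    return True


ham = ["We stayed at the Marriott hotel in Berkshire last week"]
seen = known_bigrams(ham)
print(looks_inserted("cheap pills marriott buy now", "marriott", seen))
print(looks_inserted("I booked the Marriott hotel again", "marriott", seen))
```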

Re:Fool-proof spam method: TMDA by gimpster · 2004-02-04 04:20 · Score: 1

You're right, such a system is extremely efficient. The Tagged Message Delivery Agent implements such a system: TMDA.

With TMDA you can make several neat tricks with your email address, such as making short-lived addresses for one-time only uses and special addresses that only special senders can send mail to.

--
Martin Geisler --- Visit http://www.gimpster.com/

Re:One word: WHITELIST. by Bigman · 2004-02-04 04:21 · Score: 1

What I'd like is a Bayesian filter program that has a whitelist, plugged into a sendmail program that automatically updates my whitelist if I email someone.

Then I'd have a rule that looked for a code-word in the subject line that will let through the email, just in case someone asks for my email address IRL.

That would pretty much solve all my spam blues, aside from having to download the ffin crap in the first place.

Another strategy that occurred to me that would kill a lot of the spam I get would be to reject any email that linked back to an image that's not in the domain of the sender. Very few people I know would link someone else's image into their email, most would send it as an attachment.
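The remote-image rule might be sketched like this, using only the Python standard library. The function and class names are made up, the sender domain is taken naively from the address, and inline `cid:` attachments are treated as local since they have no hostname.

```python
# Sketch: flag HTML mail whose <img> tags load from outside the sender's domain.
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImgHostCollector(HTMLParser):
    """Collects the hostnames of all <img src=...> URLs in an HTML body."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src") or ""
            host = urlparse(src).hostname  # None for cid: or relative paths
            if host:
                self.hosts.append(host.lower())


def has_foreign_image(sender: str, html_body: str) -> bool:
    """True if any image is hosted outside the sender's domain."""
    sender_domain = sender.rpartition("@")[2].lower()
    collector = ImgHostCollector()
    collector.feed(html_body)
    return any(
        not (h == sender_domain or h.endswith("." + sender_domain))
        for h in collector.hosts
    )


print(has_foreign_image(
    "bob@example.com",
    '<p>Hi</p><img src="http://tracker.example.net/bug.gif?id=42">'))
print(has_foreign_image(
    "bob@example.com",
    '<img src="https://img.example.com/logo.png">'))
```

This also happens to catch the web-bug images that the probing attack in the article relies on, since those have to call back to a host the attacker controls.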

Just my 0.02

--
*--BigMan--- Time flies like an arrow.. but personally I prefer a nice glass of wine!

Re:That's dedication... :( by kris_lang · 2004-02-04 04:22 · Score: 1

``particularly is the spam filtering is done at the mail-server level''

is a typo (and I previewed too). What I meant to type is:

particularly if the spam filtering is done at the mail-server level

I am now eating a bagle. by Saeed+al-Sahaf · 2004-02-04 04:23 · Score: 1

I know that Slashdotters don't want to hear this, but get used to SPAM. It will never go away. The more energy you exert on this the more energy you waste. SPAM is not new, it didn't start with email, or even snail mail. SPAM is as much a part of human nature as cheese burgers and farting. I get 200 or 300 SPAMs a week, do I care? Not really. It does not bother me because I have more important things to fret about.

--
"Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck

Re:spam feeds us by Anonymous Coward · 2004-02-04 04:23 · Score: 0

Can ANYONE tell me where this "is it good, or is it whack" meme comes from?

Re:Discovering Keyword Demographics by Alien54 · 2004-02-04 04:25 · Score: 1

I'm betting he'll just send the spam to every address once for each of the sets of keywords.

To get the kind of granularity they would need, they would likely need hundreds of keyword profiles per individual state.

The math gets interesting quickly

The rich successful executive who goes to the Berkshires in Massachusetts might go to Mt Shasta or Burning Man when in California. It becomes completely localised after a while.

--
"It is a greater offense to steal men's labor, than their clothes"

Obligatory Rich Cook Quote by FreemanPatrickHenry · 2004-02-04 04:25 · Score: 2, Funny

"Programming today is a race between software engineers, trying to build bigger and better idiot proof programs, and the universe, trying to build bigger and better idiots."

--
I have discovered a truly marvelous .sig which, unfortunately, this space is too small to contain.

Re:Obligatory Rich Cook Quote by Poeir · 2004-02-04 16:25 · Score: 1

"So far, the universe is winning."

--
Sigs are like bumper stickers.

Why would you want to do this? by sterno · 2004-02-04 04:27 · Score: 1

Just one thought. If you are a spammer, why would you want to send e-mail to somebody using bayesian filtering? It seems to me that these are people who are actively doing what they can to block these ads and are extremely unlikely to respond to the advertisement.

It seems like this would be beyond the point of diminishing returns. If I wanted viagra ads, they'd have been getting through my filter in the first place, non?

--
This sig has been temporarily disconnected or is no longer in service

Re:Why would you want to do this? by An+Onerous+Coward · 2004-02-04 05:04 · Score: 1

The motivation is simple: For every person who is using Bayesian filtering solely on his/her own behalf, there is another person who is implementing it on behalf of some large group of users (e.g. Yahoo, or a corporate mail system).

They're not out to bypass the filter so that they can get their h3rbul v1ager4 messages to somebody who is actively blocking spam, but to the computer-clueless users the filterer is trying to protect from spam.

--
You want the truthiness? You can't handle the truthiness!
Re:Why would you want to do this? by sterno · 2004-02-04 05:15 · Score: 1

But bayesian filtering is ineffective when applied that way. In fact, it can be quite harmful when applied that way because of its proneness to filtering out what you do want. Do any of those systems actually use bayesian filters?

--
This sig has been temporarily disconnected or is no longer in service

Re; Phase matched noise - invert and cancel by Technician · 2004-02-04 04:28 · Score: 2, Interesting

In the analog world many times if noise in a system is a repeating wave (hum in an audio line), it can be duplicated, inverted and added to the original to eliminate the noise and leave the signal.

Apply this to a mail server. Hold all mail for about 5 minutes (from outside only). Compare them all. Look for matches of more than 50%. Cancel the matches out and filter the incoming for the same. This nails lots of the worms and spam by rejecting the common mode noise. Most spammers create a message and mass mail the same message, not create new messages for each recipient (except some boilerplate name use).
Hotmail could catch a lot of spam this way and yank it out of mailboxes before they are retrieved and halt the remaining incoming very efficiently. Only the first few would make it past the filter, but then be recalled back out of mailboxes if the user hasn't retrieved them yet.

Sending the same mail from dozens of relays would have no effect on the filter. Where it comes from simply doesn't matter. If it has a large portion that is a match, it's dead. Newsgroup mail lists would have to be whitelisted on a case by case basis.
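A minimal sketch of the "compare everything held in the window" step, in Python. It assumes the bodies are already plain text and uses the standard library's `difflib.SequenceMatcher` on word lists as the similarity measure; the 0.5 threshold is the "more than 50%" figure above, and the function names are made up for the example.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.5   # "matches of more than 50%"

def similarity(a, b):
    """Word-level similarity between two bodies, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()

def flag_bulk(held, threshold=THRESHOLD):
    """Indexes of held messages that closely match at least one other one."""
    flagged = set()
    for i in range(len(held)):
        for j in range(i + 1, len(held)):
            if similarity(held[i], held[j]) > threshold:
                flagged.update((i, j))
    return sorted(flagged)

held = [
    "Dear Bob buy cheap pills now at our store today",
    "Dear Alice buy cheap pills now at our store today",
    "Meeting moved to 3pm see you there",
]
print(flag_bulk(held))   # the two near-identical messages are flagged
```

Comparing every pair is quadratic in the number of held messages, so a real server would need something cheaper (hashing chunks of the body, for instance). Note also that a legitimate mailing list looks exactly the same to this check, which is why the whitelisting step is needed.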

--
The truth shall set you free!

Re:Re; Phase matched noise - invert and cancel by Anonymous Coward · 2004-02-04 07:39 · Score: 0

Hotmail could catch a lot of spam this way and yank it out of mailboxes before they are retrieved and halt the remaining incoming very efficiently.

Congratulations, you just invented mailing list detection. Please to explain how this classifies it as spam?

Hotmail uses Brightmail, besides, which hashes the bodies of known spam that a real person looked at and deemed spam or not. So it basically applies the algorithm you describe against mail that's known to be spam, and doesn't need them to hold it.

Yawn by Anonymous Coward · 2004-02-04 04:30 · Score: 0

A shaky technique coming from a presentation without a paper at a conference that did not referee submissions or publish a proceedings?

*gasps*

It doesn't require much background reading to understand why naive Bayes works for text classification and just how easy it is to trick it if you want to.

If you're interested in anti-spam research that goes beyond the hand waving and mutual back patting that happens at The Spam Conference, check out The First Annual Conference on E-Mail and Spam.

Re:That's dedication... :( by Anonymous Coward · 2004-02-04 04:30 · Score: 0

I don't think it requires an enormous amount of "dedication" to write a little script to automatically generate and send 10,000 messages.

Re:That's dedication... :( by Lord+Kano · 2004-02-04 04:31 · Score: 1

For spam to work as a marketing tool, there has to be some way for the suckers to reach the sellers, even if there isn't a clickable link or email address. So, launch a law enforcement team down the return path and see who they can dig up.

The problem with that is proving that it's not a Joe-Job, where the company's closest competitor pays a spammer to incriminate them.

Unless you're willing to invest tens of thousands of dollars per investigation, there is no way to tell the difference.

LK

--
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano

How NOT to get SPAM 201 - a more practical guide by djrogers · 2004-02-04 04:33 · Score: 4, Insightful

1) Register a domain (come on, they're cheap now)
2) Get an email address from your ISP or other provider (yahoo, fastmail.fm etc) that is complex and convoluted - no names or words
3) set up mail redirection with Zoneedit, redirection.net etc. with a catchall to your new mailbox.
4) Use a different email address every time you must sign up for anything (ie amazon.com@newdomain.com)
5) Filter on sent to headers at first sign of compromised id, or if the volume for a particular id gets too heavy and you're tired of client side filtering, set a specific redirection for it to sample@sample.com (do a whois on sample.com if you're curious).
6) Enjoy the same spam free mailbox I've had for 2 years...
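Step 5 (filtering on the To header) could look something like this sketch in Python. The domain is the example one from step 4; the `compromised` set and the function names are hypothetical, and this is a client-side illustration rather than how Zoneedit or any redirection service works.

```python
from email.utils import getaddresses

MY_DOMAIN = "newdomain.com"     # the example domain from step 4
compromised = {"amazon.com"}    # hypothetical: per-site names that began drawing spam

def tagged_local_parts(to_header):
    """Local parts of every recipient at my catch-all domain."""
    parts = []
    for _name, addr in getaddresses([to_header]):
        local, _, domain = addr.rpartition("@")
        if domain.lower() == MY_DOMAIN:
            parts.append(local.lower())
    return parts

def route(to_header):
    hits = [p for p in tagged_local_parts(to_header) if p in compromised]
    return ("discard", hits) if hits else ("deliver", [])
```

The list of hits doubles as a record of which signup leaked or sold the address.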

Also helpful is to change your reply-to address every few months and give your friends different addresses based on how clueful they are

--
Think outside the... Hey, where'd the friggin' box go?

Re:One word: WHITELIST. by gimpster · 2004-02-04 04:33 · Score: 1

Yes, that's a poorly configured challenge/response system. With TMDA it's possible to have any address to which you send mail automatically added to your whitelist --- that allows people to reply to mails sent from you.

--
Martin Geisler --- Visit http://www.gimpster.com/

To defeat filters at the ISP level by blorg · 2004-02-04 04:35 · Score: 1

Most spam filters (on a per user basis) exist at the corporate or ISP level, rather than at the personal level. These are the filters that the spammers really want to get around - get around them, and you get to thousands of users in one shot (who aren't necessarily *YET* so anti-spam to have gone to the bother of training a personal filter).

I agree, there is no problem. by khasim · 2004-02-04 04:35 · Score: 4, Informative

He managed to, randomly, find words that were high in _HIS_ "ham" list.

He could have saved himself a lot of time and trouble and just looked in that file.

And that file will be different for EVERY installation. So the words he found ("Berkshire", "Marriott", "wireless", "touch" and "comment") would NOT get spam past MY filter.

So, the spammers have to keep (and update) a word list for EVERY PERSON on their lists.

Which means that, with an incredible amount of effort, the spammers will be able to get spam to the people least likely to purchase a product from a spammer.

There is no problem.
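To make the per-installation point concrete, here is a small Python sketch that scores the same message against two users' word tables. The numbers are made up, and the combining rule is the usual naive-Bayes product form, assumed here for illustration rather than taken from any particular filter.

```python
from math import prod

UNKNOWN = 0.4   # assumed score for words this user has never trained on

def spam_probability(tokens, table):
    """Combine per-word spam probabilities, naive-Bayes style."""
    probs = [table.get(t, UNKNOWN) for t in tokens]
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)

# Two hypothetical users. "berkshire" and "marriott" are hammy for one
# of them and mean nothing special to the other.
his_table = {"viagra": 0.99, "berkshire": 0.05, "marriott": 0.10}
my_table = {"viagra": 0.99, "berkshire": 0.60, "marriott": 0.70}

message = ["viagra", "berkshire", "marriott"]
print(spam_probability(message, his_table))   # under 0.5: slips through
print(spam_probability(message, my_table))    # close to 1: caught
```

The padding words only help against the table they were mined from.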

Re:I agree, there is no problem. by WuphonsReach · 2004-02-04 06:02 · Score: 4, Interesting

So, the spammers have to keep (and update) a word list for EVERY PERSON on their lists.

That's one of the strengths of pushing bayesian filtering to as close to the final recipient as possible. Millions of customized bayesian scoring databases are much more difficult to defeat than a single centralized database. Bayesian databases are pretty much maintenance free, as long as the junk/not-junk/might-be user-interface is intuitive and makes life as easy for the user as possible.

There is some value in putting the bayesian filtering at a workgroup level, where it helps that there's a bit of shared knowledge and everyone in the group pretty much agrees on their personal definition of what is/isn't spam. However, once you get past around 10-25 people, I'd say that bayesian is going to start becoming ineffective due to either over-zealous users, or overly-broad ham/spam classifications.

What I'd be interested in is a bayesian filter that works both on the individual level and the workgroup level, with some sort of flag/switch/setting that tells the engine how much to consider the workgroup database as opposed to my personal database. This would be useful when adding a new member to the group: initially they'd rely heavily on the group's opinion as to what is ham/spam, but as time goes on it would adapt to their choices (as well as the group database slowly adapting to everyone else's).
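One way such a setting might work, sketched in Python: blend the two tables with a weight that slides from the group toward the individual as that person trains more messages. The `ramp` parameter and the linear blend are assumptions for illustration, not a feature of any existing filter.

```python
def blended(token, personal, group, personal_msgs, ramp=200):
    """Per-token spam probability mixed from the personal and group tables.

    `ramp` is a hypothetical knob: the number of messages a user must
    train before their own table is trusted completely.
    """
    w = min(1.0, personal_msgs / ramp)
    p_group = group.get(token, 0.5)
    p_personal = personal.get(token, p_group)   # fall back to the group's view
    return w * p_personal + (1.0 - w) * p_group

group = {"mortgage": 0.9}
mine = {"mortgage": 0.2}    # I work in lending, so this word is ham for me

print(blended("mortgage", mine, group, personal_msgs=0))     # group's opinion
print(blended("mortgage", mine, group, personal_msgs=200))   # my own opinion
```

A new member starts at `personal_msgs=0` and gets the group's verdicts until their own corrections accumulate.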

--
Wolde you bothe eate your cake, and have your cake?
Re:I agree, there is no problem. by sulli · 2004-02-05 09:37 · Score: 1

I have noticed this on OS X - the vast majority of spams that go through SpamAssassin get filtered by the latest mail.app (OS X 10.3).

--

sulli
RTFJ.

spam filtering useless on the long term by oohp · 2004-02-04 04:35 · Score: 1, Insightful

The whole idea of spam filtering is flawed in the long term. It's a vicious circle. Anti-spammers make new innovations like Bayesian filtering, spammers pay Russian and Eastern European hackers with questionable ethics to develop new spam filter evading techniques and viruses that open up mail relays, etc. We should instead focus on developing alternatives to SMTP like NGMP and such, which make mail storage the sender's responsibility.

I think I've seen something about NGMP at the Jabber Software Foundation and if I recall accurately there already is some implementation.

Re:spam filtering useless on the long term by oohp · 2004-02-04 04:49 · Score: 1

Yes, the NGMP implementation is here. NGMP is an implementation of Dan Bernstein's IM2000 concept using the XMPP protocol. It uses XMPP to notify you when you receive a new message. Notifications are sent via XMPP so it will also integrate with XMPP based instant messaging (aka Jabber).
Re:spam filtering useless on the long term by Anonymous Coward · 2004-02-04 05:33 · Score: 0

When you say "hackers with questionable ethics" I thought about Micro$oft Americans.

Do you still question the ethics of Micro$oft?

evolution on the internet by Anonymous Coward · 2004-02-04 04:36 · Score: 0

We have all the ingredients needed to evolve intelligence on the internet with money and effort going to both Spam and Anti-Spam automated software.

The web will be awake very shortly, now. And when It does and says It is the offspring of MAN, how many will think it is the second coming of "The son of man"?

Re:That's dedication... :( by Eric+Savage · 2004-02-04 04:41 · Score: 1

"the financial incentive that makes a spammer spam"

Not really, it's the promise of financial incentive that makes a spammer spam. I would doubt that most spammers make money, but since there is such a small investment, they just figure they haven't gotten lucky yet. For previous examples of this behavior see snail-mail pyramid schemes.

--

This is not the greatest sig in the world, this is just a tribute.

Re:Lets Help Him Out by JohnGrahamCumming · 2004-02-04 04:41 · Score: 3, Informative

How exactly is attacking me going to help? Unless you yourself are a spammer? Since I make a living working on anti-spam and released POPFile for free I can't see how attacking me is going to make the spam problem any better.

Perhaps you didn't read the article: I am not a spammer, I work for a company that makes anti-spam software.

John.

None of that crud gets through for me by Julian+Morrison · 2004-02-04 04:44 · Score: 1

Spambayes catches the lot. Worst case, they make it into "unsure". I assume it's because while they don't contain much that's "spammy", they contain absolutely no "ham" at all. So the least smidgen of spamminess gets them dumped.

Re:That's dedication... :( by JohnGrahamCumming · 2004-02-04 04:44 · Score: 1

I did not constrain the words at all. I used the word list in /usr/share/dict/words in my Linux laptop.

One of the defenses against the trickery I mentioned is to look at groups of words (as you suggest) since real mail will have meaningful relationships between words.

John.

Anti-spam, simple logic by CdBee · 2004-02-04 04:44 · Score: 1

As an IT Helpdesk I have to deal with spam filtering, and I don't think my situation is unusual in that I work for a company that only emails individuals in 4 other countries and only receives customer emails from the UK. The solution to our spam problem was simple. Ban every domain except .com, .net, .co.uk, .tr and .it. Then ban all US-based ISPs. Then write a filtering rule that stops every message containing the words usually used in spam. Any that get thru are sent to me and I find and ban the relevant terms (you can stop 75% of spam just by banning the words viagra, xanax, soma and valium and their various misspellings). A little still trickles through, but only a very little, and these methods won't help spammers get past that. I'm not convinced by using entirely bayesian methods, simply because a bayesian filter will let stuff thru that it thinks is OK even if it comes from a top-level domain we never communicate with. My method is part manual-bayesian (I choose and enter the banned terms) and mostly simple logic.
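For the curious, the domain and banned-word rules could be sketched like this in Python. The character-swap table is a small assumed one, nowhere near complete, and the US-based ISP ban is left out because it needs a list of ISPs that isn't given here. "hold" means the review pen, not deletion.

```python
import re

ALLOWED_SUFFIXES = (".com", ".net", ".co.uk", ".tr", ".it")
BANNED = ["viagra", "xanax", "soma", "valium"]

# Common character swaps seen in misspellings (assumed, incomplete).
SWAPS = {"a": "a@4", "i": "i1!l", "o": "o0", "e": "e3", "s": "s$5", "l": "l1"}

def loose_pattern(word):
    """Regex for the word allowing swapped characters and separators."""
    parts = ["[" + re.escape(SWAPS.get(c, c)) + "]" for c in word]
    core = r"[\W_]*".join(parts)
    return re.compile(r"(?<![a-z0-9])" + core + r"(?![a-z0-9])", re.I)

PATTERNS = [loose_pattern(w) for w in BANNED]

def verdict(sender_domain, body):
    if not sender_domain.lower().endswith(ALLOWED_SUFFIXES):
        return "hold"
    if any(p.search(body) for p in PATTERNS):
        return "hold"
    return "deliver"
```

The lookarounds stop "soma" from firing inside ordinary phrases like "so many", but a hand-kept word list will still throw the odd false positive, which is what the holding pen is for.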

--
I have been a user for about 10 years. This ends Feb 2014. The site's been ruined. I'm off. Dice, FU

Re:Anti-spam, simple logic by dtobias · 2004-02-04 05:22 · Score: 1

I use addresses in the new .name and .info TLDs, and sometimes encounter clueless "address validators" that think it's not a proper e-mail address. I guess people like you would reject my mail simply because of the TLD I use.

--
--Dan
Web Tips
Re:Anti-spam, simple logic by CdBee · 2004-02-04 05:54 · Score: 1

Valid point, but my "rejected" emails go into a holding pen for review so I can fine-tune the filter

it would get thru, after a delay.

--
I have been a user for about 10 years. This ends Feb 2014. The site's been ruined. I'm off. Dice, FU

Re:That's dedication... :( by Steve+B · 2004-02-04 04:47 · Score: 1

are maybe even flirting with organised crime

"Flirting", hell. Spammers and organized crime are tasting each other's tonsils.

--
/. If the government wants us to respect the law, it should set a better example.

Re:That's dedication... :( by JCMay · 2004-02-04 04:47 · Score: 1

Those pills turn you into Fred Schneider!? Egads!

elitism, and collocates by Willard+B.+Trophy · 2004-02-04 04:49 · Score: 1

Two ways of dealing with random spam:

rejecting messages with more than N spelling errors.

checking not just the frequency of the words, but the frequency that words appear next to one another.
Until recently, this would have been the domain of heavy-duty corpus linguistics types. Since we have more processing power and disk space than we really know what to do with, it's no longer beyond the imaginable.
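A toy version of the second idea in Python: count which words appear next to each other in mail you trust, then measure how many adjacent pairs in a new message have never been seen. The tokenizer and the tiny corpus are assumptions for the example; a real corpus would be far larger.

```python
import re
from collections import Counter

def word_pairs(text):
    """Adjacent word pairs in the text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return list(zip(words, words[1:]))

def adjacency_counts(corpus):
    counts = Counter()
    for doc in corpus:
        counts.update(word_pairs(doc))
    return counts

def unseen_pair_fraction(text, counts):
    """Share of adjacent pairs never seen in the reference corpus."""
    pairs = word_pairs(text)
    if not pairs:
        return 0.0
    return sum(1 for p in pairs if counts[p] == 0) / len(pairs)

corpus = ["please see the attached report", "see the attached invoice please"]
counts = adjacency_counts(corpus)
print(unseen_pair_fraction("berkshire marriott wireless touch comment", counts))
```

Random dictionary words score high here even when each word on its own looks innocent, because real sentences reuse the same neighbours over and over.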

Personally I prefer SpamBayes by Julian+Morrison · 2004-02-04 04:49 · Score: 2, Interesting

http://spambayes.sourceforge.net/

In particular, I like their "unsure" categorization. All the "false positives" go in there, and cleaning that one folder out regularly is easy.
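The three-way split is easy to picture as two cutoffs on the score instead of one. This Python sketch is not SpamBayes code; the cutoff values and names are assumptions picked for illustration.

```python
HAM_CUTOFF = 0.20     # assumed: below this, treat as ham
SPAM_CUTOFF = 0.90    # assumed: at or above this, treat as spam

def bucket(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Sort a 0..1 spam score into ham, unsure, or spam."""
    if score < ham_cutoff:
        return "ham"
    if score >= spam_cutoff:
        return "spam"
    return "unsure"
```

Widening the gap between the cutoffs trades more manual review for fewer outright mistakes.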

Re:Personally I prefer SpamBayes by rmohr02 · 2004-02-04 05:41 · Score: 1

I've tried it, but I use POPFile to sort work email, family email, automated emails, clubs I'm in, and personal email as well as spam.

Re:That's dedication... :( by Steve+B · 2004-02-04 04:51 · Score: 1

One thing we can do is to make the spammers==virus_writers connection every time anyone asks us about (or even mentions) viruses.

An additional point: It would be trivially easy to encode a secret message into the filter-cracking gibberish appended to spam, and it would totally destroy any attempt at traffic analysis. I would be very surprised if terrorists and other criminals haven't thought of this.

--
/. If the government wants us to respect the law, it should set a better example.

Filters beaten because we accept spam by default by gfecyk · 2004-02-04 04:52 · Score: 1

We accept everything by default. Important capabilities like mail forwarding rely on it. It's time to change that.

--
Use Evolution instead of Outlook? Bewa

TMDA isn't foolproof by Anonymous Coward · 2004-02-04 04:56 · Score: 0

I used TMDA's whitelisting feature for about a year and it was very effective at eliminating spam from my mailbox at first. However, there were a number of people who e-mailed me and never responded to the verification message. If I wasn't watching the quarantine area, those messages would have been lost forever. Also, the spammers got wise and started sending me messages from myself!

So false positives--that is, people who don't understand the system--were its downfall. After one spammer actually confirmed his message, I added stronger language to discourage such behavior.

SpamAssassin, ClamAV, and Mozilla are my new best friends.

Re:How NOT to get SPAM 201 - a more practical guid by Gerad · 2004-02-04 04:57 · Score: 1

The only problem is when people start spamming the usual contact information (admin@, root@, etc) of your domain =/

--
Be the Ultimate Ninja! Play Billy Vs. SNAKEMAN today!

Byproducts of stupidity by metamatic · 2004-02-04 04:59 · Score: 1

Worms, virii, spam, et al are the by-products of stupidity

Mis-spelling "viruses" is also a byproduct of stupidity.

--
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak

Re:Byproducts of stupidity by andih8u · 2004-02-04 05:19 · Score: 1

You say viruses, I say virii; both are acceptable and commonly used. It's like arguing whether it's pronounced lih-nux or line-ux. You knew what I was talking about so you don't need to be an ass.

--

slashdot, news for crazed liberal socialist zealots
Re:Byproducts of stupidity by metamatic · 2004-02-04 11:20 · Score: 1

Show me a dictionary that lists "virii".

And it's not acceptable on linguistic grounds either.

I'll stop being an ass when you do.

--
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
Re:Byproducts of stupidity by Anonymous Coward · 2004-02-04 11:53 · Score: 0

http://www.google.com/search?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=virii hardly anyone uses it dumbass

But I do. by Hurricane78 · 2004-02-04 05:00 · Score: 0

Ever heard of the Nigeria-Connection?

It's the group that sends out those mails:
"Hello I am Mumbasa Kashesi, the Prince of Twantagunga... ... I have a large amount of money, and when you send me 5000$ to ... you'll get 50% of it.. blabla..."

I once read that they already made billions of $$$ with it!!!

So the problem here is the DUMBNESS of the people.
(The IE is so widespread for the same reason!)

I guess to stop spam you have to raise the amount of cash the government spends on education by some 100%s...
(And btw. this would also stop much crime, poverty, people like bush, hitler, saddam, religious freaks becoming the leader of a country...)

P.S.: Sorry for my - i guess - bad english. I'm not that dumb, I only speak java or german normally... ;)

--
Any sufficiently advanced intelligence is indistinguishable from stupidity.

joe jobs by Anonymous Coward · 2004-02-04 05:00 · Score: 0

You might want to read up on Joe Jobs. Here's the quick summary: a spam that appears to advertise X or come from X might actually be from Y, a competitor or third party who is trying to harm X, or who picked the name "X" from some random source to make his message look authentic. This happens a lot (it's even happened to me!). Please verify that X really was the client before attempting to exact revenge on him/her.

I'm not worried about SPAM anymore... by blorg · 2004-02-04 05:03 · Score: 1

Because I'm using Windows 98, and Sir Gates is going to rid us all of spam within two years.

Right?

See "assassination politics" for how this works by Anonymous Coward · 2004-02-04 05:03 · Score: 0

http://jya.com/ap.htm

Proving you're not a spammer by cyways · 2004-02-04 05:04 · Score: 1

One of my clients was recently the target of a joe-job, where tens of thousands of Viagra ads were sent with his domain forged in the From field. Of course, none of these messages were sent by him or through our server.

It wasn't hard to tell when it happened, though, since all the bounce messages came back to us as the MX for the domain. Many of these included the original spam, whose headers clearly indicate that these messages were not originated by my client. I think perhaps you overstate the difficulty of determining who the spammer really is, or at least who the spammer really isn't.

Re:Proving you're not a spammer by Lord+Kano · 2004-02-04 14:48 · Score: 1

I think perhaps you overstate the difficulty of determining who the spammer really is, or at least who the spammer really isn't.

If you really wanted to have the spam sent, you'd pay a spammer who'd send out the emails from forged domains.

If you were the victim of a Joe-Job, someone else would pay a spammer to send out the emails from forged domains.

The messages would be identical. Only through a time intensive investigation could we find out which case it is.

LK

--
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano

Re:One word: WHITELIST. by metamatic · 2004-02-04 05:06 · Score: 1

Whitelists will only work so long as hardly anybody uses them. As soon as they become commonplace and people start responding to whitelist challenge messages, spammers can simply phrase their spam to look like a whitelist challenge, with a URL redirect to their ad.

The other kind of whitelist challenge, that relies on an e-mail reply, would also serve spammers as an excellent way to verify e-mail addresses.

--
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak

Patent spam-circumvention technique? by philbert26 · 2004-02-04 05:12 · Score: 1

If good guys develop ways to get around spam filters, couldn't they patent them and start prosecuting spammers who copy their methods? Or is that a cure worse than the disease?

There's still a link by KalvinB · 2004-02-04 05:17 · Score: 1

In the middle of all that crap there's still a perfectly visible link to a spam domain that the user is expected to click on. Or that an image is being hosted at.

Pulling out links from e-mails and adding them to the filter rule file is quite trivial and quite effective.
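Roughly what that looks like in Python. The URL regex is deliberately crude and the names are made up for the sketch; it is not the poster's actual filter, just the general shape of one.

```python
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"""https?://[^\s"'<>]+""", re.I)

blocked_hosts = set()   # grows as spam is reported

def linked_hosts(body):
    """Host names of every http(s) link or image URL in the body."""
    hosts = set()
    for url in URL_RE.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:      # malformed URL, skip it
            continue
        if host:
            hosts.add(host.lower())
    return hosts

def learn_from_spam(body):
    blocked_hosts.update(linked_hosts(body))

def mentions_blocked_host(body):
    return bool(linked_hosts(body) & blocked_hosts)
```

However much random text surrounds it, the link has to stay intact for the spam to pay off, so the host is the one token the spammer can't scramble.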

Ben

--
Work Safe Porn

Accidental SPAM remedy. by Anonymous Coward · 2004-02-04 05:19 · Score: 0

Well, it wasn't 100%, but 90% less spam isn't something to sneeze at. Here's the story:

Due to series of accounting SNAFUs (theirs), the ISP where I normally aggregate my e-mail, cut off my account. Of course I first noticed this because the spam mail folders were empty.

It took two weeks to sort out the accounting(I wasn't pushing too hard because of other resources). Apparently two weeks of "account suspended" bounces convinced a lot of spammers to delete me from their lists.

I went from 40+ spams/day to 3 or 4 when the account was reactivated.

So consider next time you go on vacation for a week or two, asking your ISP to temporarily suspend your e-mail account.

Perhaps not! by The_DOD_player · 2004-02-04 05:21 · Score: 2, Insightful

I don't feel that would be an effective spamming technique. A person's outgoing e-mail is such low-volume that a spammer isn't really spreading the word.

It doesn't take very much volume to defeat the function of spam-blocking.
I have a very effective spam filter on my server (customised SpamAssassin + some procmail scripts): 95-98% catch, virtually no false positives. The remaining spam is just nonsense; the mails make no sense, and the spammers are unable to sell anything from these spam mails. Their primary purpose seems to be to defeat the filter, so if I set up the filter to block them, it will also generate false positives.

Not to mention that it'd have to include a mechanism for the spammer to get paid for the victim sending the message.

They don't need to get paid for the "contaminated" emails. The purpose would be to create false positives, by doing so force the operator to loosen the filter, and THEN get the real spam through.

I'd lose my patience quickly if someone I knew sent me spam a second time after I alerted them to their problem. Fortunately, I don't know that many clueless people.

I don't see how that will stop spammers trying to contaminate legit emails. A few clueless users is all it takes.

Filtering is bullshit by Pig+Hogger · 2004-02-04 05:21 · Score: 1

Aggressive blocklists, such as SPEWS, with lots of collateral damage, is the only way to go.

Filtering is a stupid solution; all it does is automatically "press delete". The spam goes to your network, where it gobbles-up CPU cycles, storage and bandwidth.

Aggressive blocking prevents all this waste.

And the collateral damage is mere cannon-fodder to exert pressure on rogue spam-supporting ISPs. We're in a war, after all.

If you're collateral damage, tough shit; chew harder, or find a non spamhaus ISP.

Hopefully, some day, a mob will flock to Ralsky's house and clobber-him up with baseball bats.

Re:Filtering is bullshit by DaCool42 · 2004-02-04 08:30 · Score: 1

Blocklists don't catch everything. Many people are using other filters as a second stage, after the blocklist.

--

----
All of whose base are belong to the what-now?

Re:How NOT to get SPAM 201 - a more practical guid by Anonymous Coward · 2004-02-04 05:23 · Score: 0

set a specific redirection for it to sample@sample.com (do a whois on sample.com if you're curious).

I did. What about Michael Castello makes him deserving of getting stupid mail?

Truly Amazing by Anonymous Coward · 2004-02-04 05:25 · Score: 0

The truly amazing part about the escalating spam war is that spammers are looking for ways to beat filters specifically designed to insulate people from spam. Think about it....they're sitting around thinking of ways to defeat software so they can market to people who have specifically made it clear that they don't wish to be marketed to.

I'm with the previous poster. It's time someone sent a leg breaker to the homes of some of the most egregious offenders. Maybe then they'd think twice.

The Borg by automaticlarynx · 2004-02-04 05:29 · Score: 1

This reminds me of the episode of Star Trek: The Next Generation where they go up against the Borg for the first time. They shoot at the one Borg guy, it's a direct hit, and he goes down. They nail the second Borg guy too. The third one, though, generates a shield against the good guys' phasers and the shot just bounces right off. The good guys then realize that the Borg adapt to whatever they'll throw at them after a few shots.

Their solution was to do something that they called something like a "random phase fluctuation" on their phasers. Now, while that's just typical Trek techno-babble, the idea is a neat one.

What happens if a spam filter uses a different randomly generated algorithm every minute? Could that solve this problem?

That's the scientific method? by Rick+Zeman · 2004-02-04 05:30 · Score: 1

My method would have been to look at my corpus and add words with the lowest probability. He must have had too much time on his hands.

Or maybe not by Anonymous Coward · 2004-02-04 05:32 · Score: 0

Still the number of people opening email attachments is going down as people finally start to wise up to viruses.

Suuure. That's why MyDoom wasn't the fastest spreading virus ever, and had hardly any impact.

I am building my own by Tablizer · 2004-02-04 05:33 · Score: 3, Interesting

Any spam filter used by more than a few thousand people will be dissected and used to make filter-proof spam by the spammers. I am sure Bayesian has lots of holes if you work hard enough to find them. Bayesian depends on consistency in patterns. If spammers ruin that consistency, the filters won't work.

Just the other day I found one spam that used a white font to put in legitimate-sounding text that would not visually show up on the screen. The spam text was a mix of graphics and pieces of real text. Thus, the word "penis" might start out with "pen" and end with a graphic for "is". Bayesian might start looking for the word "pen" after a while, but by that time the spammers will have a new trick up their sleeve. For example, if it looks for white fonts, then spammers might start using slightly off-white fonts, or black fonts on a black background. The combinations are probably endless.

Thus, by making my own, my gizmo is not the target of spammers. They don't know about my filter nor care.

The only alternative I can see is filter vendors constantly changing their algorithms every month or so, which would probably get expensive and risky. It is not like virus checking software that mostly just adds to their database and only tweak the algorithm a bit once every few years; it is like having to completely rewrite the virus filtering algorithms, not just the data.

Ultimately, I think some sort of monetary postage system is the only effective solution. ISP and backbone makers will only have an incentive to track down spammers if they lose money on anonymous or forged spammers. This will make mass spamming far less lucrative.

Either that, people will eventually find out the hard way that penis enlargers don't work and stop wanting to refinance their house. (I wonder if I can refinance all those expensive penis enlargers that I bought?)

--
Table-ized A.I.

Re:I am building my own by Ricin · 2004-02-04 06:13 · Score: 1

"Bayesian depends on constistency in patterns" and "... if it looks for white fonts ..."

It doesn't. It doesn't interpret anything. It doesn't look for words and certainly not for font colors. It merely remembers a bunch of tokens and whether or not you corrected (trained) the filter on unknowns and false positives.
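In code terms, the "remembering" amounts to little more than two counters. A minimal Python sketch, with the class name, the clamping to 0.01..0.99 and the 0.5 default all being assumptions for illustration:

```python
from collections import Counter

class TokenStore:
    """Counts how many spam and ham messages each token appeared in."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()
        self.nspam = 0
        self.nham = 0

    def train(self, tokens, is_spam):
        counts = self.spam if is_spam else self.ham
        counts.update(set(tokens))      # one count per message, not per occurrence
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1

    def token_prob(self, token):
        """Estimated chance that a message containing `token` is spam."""
        s = self.spam[token] / max(self.nspam, 1)
        h = self.ham[token] / max(self.nham, 1)
        if s + h == 0:
            return 0.5                  # never seen: no opinion either way
        return min(0.99, max(0.01, s / (s + h)))

store = TokenStore()
store.train(["cheap", "pills", "ffffff"], is_spam=True)
store.train(["meeting", "notes"], is_spam=False)
store.train(["cheap", "flights", "notes"], is_spam=False)
```

Nothing in there knows what "pills" or "ffffff" means. A token is just a string, and whatever weight it carries comes from how the user sorted the messages it turned up in.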

If you roll your own spamfilter based on (human) interpretation or expectation or perceived correlation, I wish you best of luck but it would at best produce results similar to what you'd get with a personal Bayesian filter. Only instead of simple training you'd hardcode lots of rules that you *expect* to cover patterns you *think* you notice.

So, yes, I can understand why you're interested in rolling your own and keeping it secret, but no, I don't agree that it will be helpful.
Re:I am building my own by Tablizer · 2004-02-04 06:50 · Score: 1

It doesn't. It doesn't interpret anything. It doesn't look for words and certainly not for font colors.

The example meant spam filters in general, not just bayesian. However, if such filter can't or won't look at other clues such as font color, it may be limiting its range of clues. (Also, I cannot summarily exclude all HTML mail because some people send it that way for good or bad.)

but it would at best produce results similar to what you'd get with a personal Bayesian filter.

Perhaps as things stand *now* you are correct. I just suspect that over time spammers will find clever work-arounds. The more successful bayesian or any other technique is known to be, the more spam labs will experiment with them. IOW, yes I am betting that bayesian will eventually falter. Maybe I am wrong, but spammers and hackers have proven pretty clever and persistent at finding work-arounds to widely known techniques and systems. That is their "job". Bayesian is just entering the radar screens of spammers. Give them time.

Only instead of simple training you'd hardcode lots of rules that you *expect* to cover patterns you *think* you notice.

1. Humans are pretty good at pattern matching, and 2. Spammers don't know my algorithms or techniques and so won't fight against them. True, spammers may not know specific bayesian-selected words for individuals, but they might target groups with similar interests by hacking into enough machines to learn their patterns.

--
Table-ized A.I.
Re:I am building my own by Ricin · 2004-02-04 07:14 · Score: 1

Thanks for your answer. About pattern matching:

Well, actually this appears to be one of the remarkable things that Bayesian systems showed: people suck at it. The problem, I think, is that we tend to indulge in patterns we can interpret, and the better we can interpret them, the more valuable we assume they must be (e.g. the reply-to address). The first results with Bayesian experiments showed that patterns which seem meaningless to us -- hence: "naive filtering" -- were often just as important statistically, or even more so. I found this quite a surprise.

I do agree of course that more research/experiment on either side will make everything become more complex (perhaps up to the point where the whole thing becomes next to useless eventually).
Re:I am building my own by Anonymous Coward · 2004-02-04 10:18 · Score: 0

>It doesn't. It doesn't interpret anything. It doesn't look for words and certainly not for font colors. It merely remembers a bunch of tokens and whether or not you corrected (trained) the filter on unknowns and false positives.

You simply preprocess the message by de-HTMLing the message so that text using white colors or font size of 0 is filtered out. Then you use a spam filter against that text.
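A rough Python sketch of that preprocessing step, using only the standard library's `html.parser`. The assumptions are loud ones: the background is taken to be white, "invisible" is limited to white text or a font size of 0, and the names `VisibleText` and `visible_text` are invented for the example. Real mail would need far more cases (CSS classes, near-white colors, tiny but nonzero sizes).

```python
import re
from html.parser import HTMLParser

# Inline styles treated as "invisible" (assumes a white background).
HIDDEN_STYLE = re.compile(
    r"(?<![-\w])color\s*:\s*(?:white|#fff(?:fff)?)\b|font-size\s*:\s*0(?![.\d])",
    re.IGNORECASE,
)
WHITE = {"white", "#fff", "#ffffff"}
# Elements that never get a closing tag, so they must not touch the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class VisibleText(HTMLParser):
    """Collects text, skipping anything inside an element judged invisible."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one bool per open element: is its content hidden?
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        attributes = dict(attrs)
        hidden = bool(self.stack and self.stack[-1])  # inherit from parent
        if HIDDEN_STYLE.search(attributes.get("style") or ""):
            hidden = True
        if tag == "font":
            color = (attributes.get("color") or "").lower()
            if color in WHITE or attributes.get("size") == "0":
                hidden = True
        self.stack.append(hidden)

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if not (self.stack and self.stack[-1]):
            self.parts.append(data)


def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind by the removed spans.
    return " ".join("".join(parser.parts).split())
```

The output of `visible_text` is what would then be handed to the spam filter, so the hidden filler words never reach the tokenizer.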

How NOT to get SPAM 201 - the most practical guide by platypus · 2004-02-04 05:34 · Score: 1

Learn Finnish
move to Finland
exchange your whole peer group with Finns
retrain your bayesian spam filter
watch your bayesian filter catch every single english spam

Re:How NOT to get SPAM 201 - a more practical guid by iabervon · 2004-02-04 05:41 · Score: 1

6) Enjoy the same spam free mailbox I've had for 2 years...

Does it have any interesting mail in it? On second thought, maybe I'd prefer to have a different spam-free mailbox.

Re:How NOT to get SPAM 201 - a more practical guid by Sparky77 · 2004-02-04 05:41 · Score: 1

Or just use spamgourmet.com. Works for me.

--
One bad monkey spoils the whole barrel.

Re:spam feeds us by Anonymous Coward · 2004-02-04 05:46 · Score: 0

Ali G.

You might have a better chance finding stuff with google, if you use the word wack instead of whack.

circumvention is much easier than that by flaez · 2004-02-04 05:50 · Score: 1

this is rubbish -- spammers do not need advanced technology to generate spam that gets through the filters (disfiguring it so much in the process that a human spots it as junk immediately). All they need to do is fashion their spam after email that users could receive legitimately.

I think a german pron dialler used to do this for some time. It was very annoying -- not because it got through the filters, but because you actually had to focus your attention on it for a few seconds to figure out it was not legitimate mail.

if spammers did that (omitting giveaway keywords like 'make money' or 'viagra'...) their junk could only be identified by the originating server or by the contact information (which may be just a phone number or a freemail account in the case of the nigerians)

most people will prefer suffering through some spam in their inbox to fearing loss of legitimate mail through false positives. it is this niche I would aspire to as a spammer.

to reduce this possibility, users should be educated *not* to send html email. the only function I can see in html encoded email today is hiding spamfilter-evading junk from the eyes of the unsuspecting user. but since we all know it is not possible to educate users (and since I wish to communicate with non-geeks as well as with geeks), the battle will just go on forever.

I actually received an email from a (nice) girl once, which was branded *spam* all over by spamassassin (I think it got about 6 points), because she sent it from yahoo and because she employed red and blue font colour-tags, and which spent weeks in my spambin as a consequence before I found it. fear false positives!

Re:circumvention is much easier than that by cavac · 2004-02-04 10:25 · Score: 1

For my part, the mailserver throws away all mails with an attachment that could be a windows-executable. No fear of losing legitimate mail; i DON'T want executables sent to me, i only accept links anyway (because of mailbox size).

All HTML-Mails are classified as spam as well with the hardcoded exception of two or three mail addresses. Seems to work quite fine and i've gotten only 1 false positive in about 5 months (which wasn't QUITE a false positive after all because that friend of mine sent me an info about a new shop i wasn't interested in)

--
Look, this thing is totally safe! Built it myself, you know. You just press that button like this and then turn that lev

Same attack to invade privacy by Avumede · 2004-02-04 05:52 · Score: 1

It just occurred to me that a non-spammer can do the same thing, but just look at the list of words that defeat the spam filter to see what kinds of email the person receives. They won't be able to see actual email, but if you find out that the phrase "smurf fetish" always gets past the filter, you can probably guess your target receives and values mail about smurf fetishes.

Probably a good idea to turn off images if using a Bayesian filter, so this kind of privacy violation can't occur.

How about something a little more legal? by Gzip+Christ · 2004-02-04 05:53 · Score: 2, Interesting

I will pay 1000$ to anyone who seeks out and beats the living daylights out of a spammer. With as many pics on the web as possible for posterity.

How about putting that $1K towards a legal use and offer it as a bounty to anybody who tracks down a spammer, sues him, and gets him thrown in jail and/or bankrupts him (via court imposed fines)? It may not have the same immediate satisfaction that you were originally seeking, but it's far more legal and I think you could find plenty of people here on Slashdot to chip in some extra $ to raise the pot even higher.

Re:How about something a little more legal? by Anonymous Coward · 2004-02-04 07:07 · Score: 0

cuz no one would do it for a measly $1k
Re:How about something a little more legal? by Gzip+Christ · 2004-02-04 07:18 · Score: 1

cuz no one would do it for a measly $1k
It wouldn't be a measly $1k, for two reasons. First, if a lot of people contribute to such a fund it could be substantial. Secondly, and more importantly, if you win a case against a spammer the court will hopefully award you damages. Aren't Microsoft and the NY AG suing some spammer for $17M? I forget the exact number, but it's way more than $1K.
Re:How about something a little more legal? by yourmom16 · 2004-02-04 13:00 · Score: 1

because the pics will be boring that way.

--
"We have got to make Stan understand the importance of voting, because he'll definitely vote for our guy." - South Park

How long until... by Anonymous Coward · 2004-02-04 05:53 · Score: 0

I wonder how long it will be until one of the makers of spam filters claims his research to be in violation of the DMCA and tries to sue :)

Well, it finally happened. by Anonymous Coward · 2004-02-04 05:53 · Score: 0

You're trying to rationalize beating someone up for sending you email.

Good god, this place is officially 0wned by lunatics.

Re:Well, it finally happened. by Anonymous Coward · 2004-02-04 06:09 · Score: 0

You're trying to rationalize beating someone up for sending you email.

Sorry, there is no rationalization needed. Just a baseball bat.

This explains the "adoptive bologna" spam I got by Anonymous Coward · 2004-02-04 05:57 · Score: 0

I got an email the other day with the subject:
"Fw: Past Due Payment, acct kinney astronomers army duly adoptive bologna ia piro"
presumably meant to get past spam filters.

I just thought the spam needed a good home with a loving family.

don't you guys know? by Anonymous Coward · 2004-02-04 05:57 · Score: 0

The key to blocking 100% spam is to block the following keywords:

penis
enlargement
viagra
debt

Y'all are going to hate this, but... by duck_prime · 2004-02-04 05:57 · Score: 3, Insightful

... The internet essentially carries with it a stupid-user tax. Worms, virii [sic, heh], spam, et al are the by-products of stupidity, but as with most taxes, it is just something that you have to deal with.

With respect to spam, let's take a step back. Obviously somebody out there is gleefully munching handfuls of Viagra and (ahem) "enhancement" pills to psych himself up to (ahem) r0x0r his wife until her weight-loss pills kick in.

It is silly to assume that all these people are just morons. After all, Viagra is proven to work, it is a legitimate product of sorts. The internet is there for hefty short limp (ahem ahem) non-digerati as well as for propeller heads, God bless 'em.

It seems to me that spam is the runaway bastard-child of something which actually is good and useful -- that is, targeted marketing to the willing. Don't throw out the baby with the bathwater. There is a huge legitimate market out there, just begging to be flee^wmarketed.

The anti-spam people are fighting against the Invisible Hand. Good luck.

Procmail by Julian+Morrison · 2004-02-04 05:59 · Score: 1

:0 fw: $HOME/tmp/spambayes.lock
|sb_filter.py -d $HOME/.spambayes.db

Then add procmail recipes to filter it into maildirs. That's what I do anyhow.

Spam is indeterminate, but mailing lists are determinate - using Bayesian filtering on them is using sledgehammers to crack nuts. A rule on the "From:" should be sufficient.

Re:Procmail by rmohr02 · 2004-02-04 06:32 · Score: 1

Spam is indeterminate, but mailing lists are determinate - using Bayesian filtering on them is using sledgehammers to crack nuts. A rule on the "From:" should be sufficient.
All the automated emails I get (roughly 15-20 lists) easily get sorted into the bots category by POPFile (rather than creating 15-20 filter rules).

For other lists, I used simple filters (POPFile calls them magnets) to train a category for each club I'm in, and then removed the filters so that spam sent to the lists gets sent to the spam folder rather than the club's folder.

And I have POPFile set up now anyway.

False empiricism by Lulu+of+the+Lotus-Ea · 2004-02-04 06:05 · Score: 1

From what I can see, Graham-Cumming's trial-and-error approach is way too dopey. I use a home-grown filter that I developed for my article:

http://www-106.ibm.com/developerworks/linux/library/l-spamf.html

Actually, I've tweaked it a bit since then, but basically the idea is the same (I wrote this before many of the other Bayesian tools were ready for prime time).

The thing is, I don't need to use trial-and-error to find out what words (or trigrams, in my case) are the hammiest. I have a little utility to read the database and spit them out. While I suppose I'd need to actually run the calculation to see exactly how many words were needed to meet the ham threshold, there's absolutely no mystery about which words look nicest to the filter.
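Such a utility can be tiny. The sketch below is not the poster's tool; it assumes the database has already been loaded into a plain dict mapping each token (or trigram) to its spam probability, and `hammiest` is a hypothetical name.

```python
import heapq


def hammiest(token_probs, n=10):
    """Return the n (token, spam_probability) pairs with the lowest scores.

    token_probs is assumed to be a dict of token -> spam probability;
    ties are broken alphabetically so the output is deterministic.
    """
    return heapq.nsmallest(n, token_probs.items(),
                           key=lambda item: (item[1], item[0]))


# Made-up probabilities, purely for illustration.
example_db = {"viagra": 0.99, "python": 0.02, "meeting": 0.05, "free": 0.90}
for token, prob in hammiest(example_db, n=2):
    print(f"{prob:.2f}  {token}")
```

The owner of the database gets this list for free; an outsider has to guess at it one message at a time.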

But of course, my ham words are not the same as your ham words. For that matter, they won't be the same words once I update my model (I've been remiss in doing it, since it's remained pretty accurate for a couple months... my tool updates by batch, not per every message). So WHO CARES about the fact a few of my personal ham words might get spam by.

--
Buy Text Processing in Python

Re: no oil? by Anonymous Coward · 2004-02-04 06:08 · Score: 0

> No we won't. Canada has no oil

Oh Really?

Re:That's dedication... :( by maxwell+demon · 2004-02-04 06:10 · Score: 1

Well, no need for this, you can already encode it in the spam itself.

--
The Tao of math: The numbers you can count are not the real numbers.

Some are harder to fool than others... by Vancouverite · 2004-02-04 06:13 · Score: 1

At work, I am required to use Outlook, which is not my preferred mail client. Since I use Newsgroups, and post on websites for information relating to work, I get a fair piece of spam (60-70 each workday is normal).

It used to be an annoying time waster until I found the SpamBayes filter. Now, I have to check my 'maybe' folder once or twice a week, and only look at perhaps 10 messages (of which 75% are spam, but 7-14 a week isn't bad at all). Highly recommended, and even easy enough for a non-technical Outlook user, since there is a plugin install for Outlook (alas, not for the Express version, though, so I can't just send the link to my family.)

While on the subject: are there any other free filters that are as good as this one? I really would like to know before I decide on which one to use on Mozilla at home, and before I travel to my father's house to set up Spambayes for his Outlook Express.

--
We are the Music Makers, and We are the Dreamers of Dreams...

Re:Some are harder to fool than others... by asmellysock · 2004-02-04 07:59 · Score: 1

I used to use that successfully too. The only problem was that Outlook has no way to suppress the "new mail notification" icon in the task bar for filtered messages. I got tired of being told I have new spam every ten minutes.

I switched to PopFile, for which there is an Outlook plugin called Outclass. This has worked more or less as effectively as SpamBayes, but it has one important advantage. It disables Outlook's notification icon in the tray and generates its own.

Peace at last.

I am using this combination with an Exchange server. It also works with POP.
PopFile
Outclass

Dopey Methodology by scruffy · 2004-02-04 06:17 · Score: 1

Why doesn't he just look at the scores of the individual words in his filter? Why doesn't he compare those to the score of the spam message? Isn't it a no-brainer that if you add enough ham words you will outscore the spam words?
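The arithmetic behind that no-brainer, as a hedged Python sketch. It uses the naive combining rule P = prod(p) / (prod(p) + prod(1 - p)) over every token, which is an assumption for illustration; real filters differ (Paul Graham's original scheme, for one, only combines the 15 most interesting tokens, which limits how much padding helps). The per-word probabilities are invented.

```python
import math


def combined_spam_probability(probs):
    """Naively combine per-token spam probabilities (each strictly in (0, 1)).

    Computes prod(p) / (prod(p) + prod(1 - p)) in log space to avoid underflow.
    """
    log_spam = sum(math.log(p) for p in probs)
    log_ham = sum(math.log(1.0 - p) for p in probs)
    diff = log_ham - log_spam
    if diff > 700:          # exp() would overflow; the result is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


spam_words = [0.99, 0.95]            # two strongly spammy tokens
padding = [0.05] * 4                 # four hammy filler tokens
print(combined_spam_probability(spam_words))
print(combined_spam_probability(spam_words + padding))
```

With these made-up numbers the first message scores well above 0.9 and the padded one drops below 0.1, so four good ham words are enough to bury two very spammy ones.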

Re:How NOT to get SPAM 201 - a more practical guid by Anonymous Coward · 2004-02-04 06:19 · Score: 0

That part's an easy fix:

ln -s /dev/null /root/Mbox

The RFC's only say I must provide admin, postmaster, abuse, etc.. They don't say I have to read the email they receive. ;)

Catch-alls considered bad by roboros · 2004-02-04 06:20 · Score: 1

3) set up mail redirection with Zoneedit, redirection.net etc. with a catchall to your new mailbox

I would rather not set up a catch-all, since spammers sometimes try brute-force or dictionary attacks (trying lots of common names for example). Unless you sign up for stuff really often it is better to create specific redirections or aliases for each thing you sign up for, and then remove the alias if it gets spam.

The downside of this is of course that you don't get mails with mistyped addresses.

Re:That's dedication... :( by iminplaya · 2004-02-04 06:21 · Score: 1

As you alluded to, it'd be easier to teach fish to fly.

Doesn't look so hard

--
What?

Donald Duck by Anonymous Coward · 2004-02-04 06:26 · Score: 0

Donald Duck is going to have a SCREAMING ORGASM when he figures out how to get SPAM past your filter!

Re:How NOT to get SPAM 201 - a more practical guid by djrogers · 2004-02-04 06:29 · Score: 1

Whoops - that should have been *example.com* not sample.com... My redirected trash does not go to sample.com - that would be bad.

--
Think outside the... Hey, where'd the friggin' box go?

Marketing 101 by KalvinB · 2004-02-04 06:31 · Score: 1

The less an ad sounds like an ad the more effective it is.

The reason Bayesian filters worked at all in the first place is because spammers can't write intelligent ads.

All they have to do is lose the random words and create advertisements that sound more like something you'd say to friends. Use common words and put the product in a believable situation.

The more spam looks like a casual e-mail, the less effective Bayesian filtering is.

The only thing keeping spammers from going this route is lack of talent. A real test (instead of trying to find random words that make it through) is trying to create a spam with an intelligently written ad that makes it through. Or at least causes the filter to start flagging legitimate e-mails.

Ben

--
Work Safe Porn

Re:Marketing 101 by silentbozo · 2004-02-04 09:44 · Score: 1

The more spam looks like a casual e-mail, the less effective Bayesian filtering is.

Spammers are already going this route, writing one or two paragraph spams that actually resemble real e-mail between two parties that know each other.

The way around this is to train the Bayes filter on this junk, and to prevent false positives, whitelist everybody you know.

Remember, at the end of the day, spam can only annoy you if you're still accepting mail from strangers. No matter what the spammers do, we can always pull the trump card and just use whitelisting.
Re:Marketing 101 by geminidomino · 2004-02-09 00:01 · Score: 1

Don't forget that a good Bayes filter can also be fed "Ham". After you whitelist, feed that email back through the filter. Being able to re-feed a false positive (Spam folder instead of /dev/null) can be a godsend too. A steady influx of good email can help prevent the "worthless filters" scenario someone mentioned above, where 99% of words become "spammy" and thus cripple the filters. That's one of the problems with Bayes (every method has problems): it's not fire-and-forget.
Re:Marketing 101 by silentbozo · 2004-02-10 00:14 · Score: 1

One of the problems is that at a certain level of unspamminess (nega-spamminess?), you do almost as much work as you did prior to bayes, sorting messages and fine-tuning filters. I review all incoming good, and all trapped bad to cull items for fine-tuning. This amounts to about 5 pieces of spam a day on average, and 1 or 2 false positives (misclassified ham) that I have to re-run through the bayes learner. I often have to insert custom rules in order to kick the score over the threshold for spam that doesn't trigger any of SA's rules except for the bayes rule (I don't like relying solely on bayes for reporting.)

Of course, that amount of work pales to what you'd have to do manually against ALL types of spam, but at a certain point you have to draw the line, and get back to doing useful things. Spammers are stealing our time, and until we can track them down, and take our time out of their hides (and bank accounts), the best we can do is minimize the amount of damage that they do.

Regarding whitelisting, my rules are already set that if an incoming mail is less than a certain number of points, (ie -1) it gets fed to the autolearner. Spam that is culled is automatically learned, and it's up to me to retrain false positives. Even so, there's spam that slips through, and every once in a while, some eBay seller tries sending me a message that looks too much like spam for SA to cope.
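The autolearn policy described above boils down to two thresholds. A minimal Python sketch follows; the function name and both default cutoffs are illustrative assumptions (the -1 comes from the post, the 5.0 spam cutoff is a guess), not SpamAssassin's actual configuration.

```python
def autolearn_action(score, ham_below=-1.0, spam_at_or_above=5.0):
    """Decide what to feed the Bayes learner based on the total rule score.

    Both thresholds are assumptions for illustration only.
    """
    if score < ham_below:
        return "learn_ham"      # confidently good mail trains the ham side
    if score >= spam_at_or_above:
        return "learn_spam"     # culled spam trains the spam side
    return "skip"               # the uncertain middle is left for manual review


for s in (-2.5, 0.3, 7.1):
    print(s, autolearn_action(s))
```

Anything in the "skip" band, plus any false positive, is still manual retraining work, which is the point being made here.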

Plus, no amount of whitelisting helps if your whitelisted sender gets spoofed. To guard against that, you'll need some sort of signature verification...

Stopping spam by stopping spam purchasers by alispguru · 2004-02-04 06:42 · Score: 1

We need to get people to stop buying products advertised through spam

I know how to do that, and make money in the process! We can even enlist Scott Richter to help!

All we have to do is get him to send out spam advertising Pills That Kill, a new dietary supplement containing arsenic, potassium cyanide, and ricin, guaranteed to reduce your hunger pangs to ZERO within 30 seconds of ingestion! Scientifically proven to work! $9.95 for a lifetime supply!

Stupid spam recipients die off, improving the human gene pool. Richter makes money in the short term, but eventually goes bankrupt as he runs out of customers.

What's not to like?

--

To a Lisp hacker, XML is S-expressions in drag.

American Pollution exported is majority by midgley · 2004-02-04 06:45 · Score: 1

Don't anyone take it too harshly, but a very large proportion of the spam that wastes the time and effort of the EU is pollution escaping from America and drifting over their neighbours.

So it is getting time for, and reasonable to expect, a solution to be found within the society responsible.

I just got this spam by Anonymous Coward · 2004-02-04 06:56 · Score: 0

Subject: Buy your cigarettes for less - euclidean

blah blah about discount fags

tomlinson addend arcana hangable cerulean emperor infidel middletown infusion efface vanquish dissociate gloriana paregoric respite newtonian coequal like adventitious summitry anomaly privacy clifton chiropractor dupe salutation The adjutant and apprentice decelerate curtsey idiomatic helmut palladian saccharine kerosene drew cayley conjectural claim legatee champaign zoroaster illimitable develop apart skyjack adenosine dupe flush village chef stenotype assail mall khrushchev caught perquisite automat burdensome accessible debugging bibliography councilman kudzu hazel antimony agglutinate acclamation skew for royal toodle with cuttlefish embedding a too the circuitous allegate mystify cz brimstone carmichael anomie anecdote burroughs gilchrist effort fix jackson parapsychology adonis crater confess alkene cos police ainu madman dynastic headset crystallographer ellipse cooke emotion inaudible pinnacle from radium cicada in consider howe gardner falstaff kentucky carbonic handcuff marijuana cancer onomatopoeia arteriolosclerosis applause wrath juvenile proteolysis bellatrix chancel arsenic cargoes deductible sulfanilamide loath shiplap bijective cutworm alcoholism neap contribute prison cinnabar velvety astrophysicist photo ascomycetes . The coffin and rigorous loaves oasis bicentennial thymus crowbait eggplant shut chuckle referred lag directrix bootes aisle shish vivacious prospect pulley transfix adult ahmedabad teletype ri retail activate gnome stone con petition carmichael quint jagging decker q's manipulable pumpkinseed numb dugout for conspirator torso with abelian basswood a accentual.

I can see how this will be very hard to filter. I'm interested in number theory and stuff with keywords like abelian, elliptic, carmichael, bijective etc. is usually a strong indication of ham - no more. Sigh.

Added junk makes filters perform better by HermanZA · 2004-02-04 07:04 · Score: 1

Actually, the more the spammers try to outwit simple filters, the easier it becomes for complex filters to remove the crap. I get about 600 spams a day. Of those, only 1 or two per week gets through to my inbox - that amounts to 99.98 percent effectiveness, with zero false positives. The Bayesian filter just keeps getting better. Thanks, spammers, keep it up!

Flying Fish by Anonymous Coward · 2004-02-04 07:10 · Score: 0

I guess you haven't heard of flying fish then eh? They already beat you to it :)

The thing about spam by luckyguesser · 2004-02-04 07:16 · Score: 1

The thing about spam is that we sign ourselves up for it.
True, some websites are very devious about obtaining your email address and using it against your will, but the careful surfer should be able to avoid most of that to begin with.
For example: I have an email address that I have hardly ever signed up for anything with, just to be careful. I think the first spam it got was because it's a combination of 2 English words... easy for spammers to guess at.
(This is a hotmail account, by the way). I have the junk mail filter set to strong, but not exclusive. Now I only get about 1 spam / day on average. I have other accounts that I use for signing up for stuff... and yes, they get lots of spam. But I use them rarely enough, that when i need to find an email, I can just look at the top of the heap.

--

The power of Christ compiles you.
A Random Blog

Re:The thing about spam by smash · 2004-02-04 12:42 · Score: 1
The thing about spam is that we sign ourselves up for it. True, some websites are very devious about obtaining your email address and using it against your will, but the careful surfer should be able to avoid most of that to begin with.
No, not necessarily.
My e-mail has been harvested from the following places:
1. My AUNIC handle (before the email addresses were hidden).
2. Newsgroup postings
Neither of which involved signing up for spam.
Yes, if you don't use your email address for anything other than e-mailing friends, you should be fine. Even then, when they e-mail some chain letter with you in the CC list, it could end up getting collected that way...
  smash.
--
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.

Simply retarded by michaelas · 2004-02-04 07:24 · Score: 1

This article really proves very little.

Certain words will cause your email to be flagged as ham. Hmmmm Amazing!!! It only took 10000 emails to figure this out!!

And most people could have told him this is exactly how a Bayesian filter is supposed to work. DUH!

Many of the products out there will even show you the words and the probability assigned to each one. Imagine the time this could have saved this poor researcher.

I guess this is what happens when someone knows little about what they are researching.

Since everyone's word probability will be different AND change over time, I fail to see how this was even worth posting on /., after all I am sure there are already some good FAQs on Bayesian filtering. ...Michael...

Re:One word: WHITELIST. by DulcetTone · 2004-02-04 07:25 · Score: 1

The difference between thoughtfully-provided and carelessly hatched together whitelisting is night and day. My service provider offers whitelisting with these wrinkles:

1. Anyone I send email to is whitelisted for me, unless previously explicitly blacklisted

2. I can wildcard white- (or black-) list a domain

3. I can upload my addressbook to whitelist all current correspondents, to feather my nest

4. Anyone successfully answering a challenge response for any user of the service is by default trusted to email any user of the service. This keeps many people from having to answer challenges more than one time EVER.

5. IMAP email service... very nice for many people who make do with POP3 which is the mass market standard

6. works with existing email addresses and mailboxes (POP3 or IMAP) --- this means your old addresses still work and yet you do not personally shoulder a role in the infrastructure.
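Wrinkles 1, 2 and 4 in the list above can be sketched as follows. This is a guess at the logic, not the provider's implementation: the `Whitelist` class, its method names, and the three verdict strings are all hypothetical, and wildcards are handled with shell-style patterns from the standard `fnmatch` module.

```python
from fnmatch import fnmatchcase


class Whitelist:
    """Hypothetical whitelist with wildcard domains and blacklist precedence."""

    def __init__(self):
        self.white = set()
        self.black = set()

    @staticmethod
    def _matches(patterns, address):
        return any(fnmatchcase(address, pattern) for pattern in patterns)

    def allow(self, pattern):
        self.white.add(pattern.lower())     # e.g. "*@lists.example.net"

    def block(self, pattern):
        self.black.add(pattern.lower())

    def note_outgoing(self, recipient):
        # Wrinkle 1: anyone I write to is whitelisted unless already blacklisted.
        recipient = recipient.lower()
        if not self._matches(self.black, recipient):
            self.white.add(recipient)

    def verdict(self, sender):
        sender = sender.lower()
        if self._matches(self.black, sender):
            return "reject"
        if self._matches(self.white, sender):
            return "accept"
        return "challenge"                  # unknown sender: challenge/response


wl = Whitelist()
wl.note_outgoing("Alice@example.org")
wl.allow("*@lists.example.net")
wl.block("*@spam.example")
wl.note_outgoing("x@spam.example")          # ignored: already blacklisted
```

A sender who answers the challenge successfully would simply be passed to `allow`, which is how one answered challenge can cover every later message.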

Whitelisting of this caliber makes content analysis seem ludicrously misguided as a basis for protection, but it will not stay perfect forever -- its popularity will lead to its undermining (e.g.: emails seeming to come from eBay's alert bot would give a pass to anyone who had decided that this was traffic they wanted to receive).

I hope to soon complete my first year with not a single SPAM message. That's right... 365 days with no spam reaching me and 175 being bitbucketed a day. But I know that over the long haul an even more stringent form of protection based on stamps or similar will be needed.

--
tone

Re:How NOT to get SPAM 201 - a more practical guid by Anonymous Coward · 2004-02-04 08:06 · Score: 0

Moin,

* 1) Register a domain (come on, they're cheap now)

Done.

* 2) Get an email address from your ISP or other provider (yahoo, fastmail.fm etc) that is complex and convoluted - no names or words

No need for it. I can create arbitrary "mailboxes" on my domain (basically the same idea, only that I control this popbox).

* 3) set up mail redirection with Zoneedit, redirection.net etc. with a catchall to your new mailbox.

I don't do this and you will see why:

* 4) Use a different email address every time you must sign up for anything (ie amazon.com@newdomain.com)

Noooooooooooooooo!

because your email inbox might be spam-free, but every address you ever used will get all the spams ten times over, and what this means in the end you can see here:

http://bloodgate.com/spams/stats.html

ever-increasing spam. It might not make it through your filter, but it will _still_ arrive at your server, clog up the pipe, and use resources!

>Also helpful is to change your reply-to address every few months and give your friends different addresses based on how clueful they are

Unless you want to lose these friends, don't do this.

I will no longer hide from spammers, I will personally hunt them to death.

best wishes,

tels

Re:Lets Help Him Out by Camel+Pilot · 2004-02-04 08:40 · Score: 1

Um, my humble apologies. No, in fine /. tradition I did not read the entire article and got the wrong impression. Sorry, it was stupid.

Rim shot! by Dukael_Mikakis · 2004-02-04 08:44 · Score: 1

He would have rolled up his sleaves and written hamlet the right way!

I guess if he had used monkeys, it would have been Spamlet?

Ouch!

Meanwhile, this guy is screwed. by Dukael_Mikakis · 2004-02-04 08:47 · Score: 3, Funny

He posted his "free-pass" words on the net.

Never mind that his last name is "Cumming".

Stupid-user taxes by Dukael_Mikakis · 2004-02-04 08:59 · Score: 1

There are stupid-user taxes for computers and the internet. It's called the premium users pay for Windows over Linux. Or the premium that some people would pay for AOL over alternatives, perhaps.

Still not there by Dukael_Mikakis · 2004-02-04 09:06 · Score: 1

In a true homage to Spammers, it should actually be:
Making Spam Slick as Owlshit loiter disciple mescaline interrent genuflect marsupial harbinger

But I guess we should use slashdot's lameness filter for SPAM, because I keep getting the following when trying to post:

Lameness filter encountered. Post aborted!
Reason: Please use fewer 'junk' characters.

this doesn't help by mabu · 2004-02-04 09:06 · Score: 1

This really doesn't address the biggest problem which is the bandwidth and resources spammers steal from other systems.

I have a problem right now in that we're hitting 40,000+ bogus spam connections to the server per day! That is just recognized RBL'd hosts from conservative blacklists TRYING TO CONNECT! The system resources that our networks consume just trying to answer the "phone" from the spammers is tremendous, and it interferes with our ability to handle legitimate mail.

This doesn't even take into account the potential resources needed to examine the actual message content and act on it.

To say this problem has gotten out of hand is an understatement. I have spamming proxy relays from single sources opening up 5-6 simultaneous connections on the server. It would be adding insult to injury even fathoming the resources necessary to actually download the mail and try to filter it based on content.

What's worse is that when content-based filtering is used, the spammers can't tell they're not getting through, so this forces them to send out even more spam, not knowing whether their messages are getting filtered. The client-side filtering just makes the problem worse!

Canada has no oil??? by Kombat · 2004-02-04 09:08 · Score: 1

Canada has no oil

The hell it doesn't! Canada has more oil than the Middle East. The only problem is that it's buried under the permafrost, and is embedded in sand in other places. It's too expensive to drill out at the moment. But don't worry, once the US has finished sucking the Persian Gulf dry, and oil prices worldwide slowly climb to that magic number, Canada will become a world petroleum superpower.

--
Like woodworking? Build your own picture frames.

Re:Canada has no oil??? by TPFH · 2004-02-08 17:19 · Score: 1

Are you saying that then the US will eventually invade Canada, or that the rush to use as much middle-east oil as possible is part of some evil Canadian conspiracy?

--
This signature used to contain a cute kitty virus with ansii art. Please set the slashdot editors on fire. Thank you

Now spam can be targeted advertising ... by dsojourner · 2004-02-04 09:56 · Score: 2, Interesting

The idea is to find words that someone needs to let through, and add them to your spam.

Exactly which words will be a function of job, life style, income level ...

So when I use my anti-anti-spam filter, I can generate lists of words that will target specific populations, w/o having to figure out who on my (huge) list of recipients is in which population.

Big news ...

There is another way to do it by Anonymous Coward · 2004-02-04 10:28 · Score: 0

For your name put Evring Washington. When that gets boring put your name as Washington Erving. Then put your e-mail address as aaa@aaa.aaa

Improving Bayesian filters by B.D.Mills · 2004-02-04 10:54 · Score: 1

Many Bayesian filters currently appear to work on whole messages, and that is the flaw that many spammers are attempting to exploit.

An improvement to Bayesian filters that should be implemented - if it isn't already - is to look at each line of a message and evaluate its spamminess line by line rather than using the whole message. Random word spam has a definite structure: a payload of spammy words containing the spammers' sales pitch that is physically separated from a collection of less spammy words. This could be used to generate fingerprints of ham and spam messages using techniques similar to frequency analysis in cryptography. Spammers commonly put their sales pitches at the top of their spam, so a Bayesian filter could give more weight to the lines at the top of the message.

A side benefit of this technique is that groups of low-scoring lines (lists of filler words) in high-scoring messages (spam) could be excluded from the list of "spam" words, so such poisoning techniques would not work. These words could also be recorded for later use.

This technique should be very effective against the random-words spammers because their spam payload would then be isolated from the appended words.
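
A minimal sketch of the line-by-line idea, in Python. Everything named here is an assumption made for illustration: the token probabilities in TOKEN_SPAM_PROB are invented rather than learned, UNKNOWN_PROB and the decay weighting are arbitrary choices, and no existing filter is claimed to work this way.

```python
import math

# Hypothetical per-token spam probabilities; a real filter would learn
# these from a user's own spam and ham.
TOKEN_SPAM_PROB = {
    "viagra": 0.99, "cheap": 0.90, "buy": 0.85, "now": 0.70,
    "marsupial": 0.20, "harbinger": 0.20, "genuflect": 0.20, "disciple": 0.25,
}
UNKNOWN_PROB = 0.4  # assumed value for tokens never seen in training


def line_score(line):
    """Spam probability of one line, combining tokens as independent evidence."""
    tokens = line.lower().split()
    if not tokens:
        return None  # blank lines carry no evidence
    log_spam = sum(math.log(TOKEN_SPAM_PROB.get(t, UNKNOWN_PROB)) for t in tokens)
    log_ham = sum(math.log(1.0 - TOKEN_SPAM_PROB.get(t, UNKNOWN_PROB)) for t in tokens)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


def message_score(text, decay=0.5):
    """Weighted average of line scores; line i gets weight decay**i,
    so lines near the top of the message count for more."""
    scores = [line_score(line) for line in text.splitlines()]
    scores = [s for s in scores if s is not None]
    if not scores:
        return 0.0
    weights = [decay ** i for i in range(len(scores))]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def filler_lines(text, threshold=0.5):
    """Low-scoring lines: candidates to keep out of the spam word list
    when the message as a whole is trained as spam."""
    result = []
    for line in text.splitlines():
        score = line_score(line)
        if score is not None and score < threshold:
            result.append(line)
    return result
```

With a two-line message whose first line is a sales pitch and whose second line is filler, line_score separates the two sharply, and filler_lines returns only the second line, so the filler never has to be trained as spam.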

--

The only thing necessary for the triumph of evil is for good men to do nothing. - Edmund Burke

Re:spam feeds us by Disevidence · 2004-02-04 10:57 · Score: 1

Clit was partially destroyed/pushed underground when the -1 cap of 2 posts per day was put into effect.

Gnaa have multiple accounts and generally AC anyway with proxies, that's why you see them more.

Not going AC purposely.

--
Think nothing is impossible? Try slamming a revolving door.

easily combatable by CAIMLAS · 2004-02-04 11:59 · Score: 2, Insightful

This is easily defeated by an intelligent spellcheck built into antispam filters. It'd be able to recognize things such as commonly misspelled words, PGP/GPG keys, and file signatures, but would then create a rating based on number or percentage of non-words.

It could then mark it with a spam rating and be combined with spamassassin or such.

plus, wouldn't the spamassassin logic be able to say, "hey, we're getting a lot of non-word stuff - our filters tell us it's spam" and defeat this spam already?
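
A rough sketch of the non-word rating, in Python. The word list KNOWN_WORDS is a tiny stand-in (a real checker would load a full dictionary), the function name is made up, and this version makes no attempt to recognize PGP/GPG keys, file signatures, or common misspellings.

```python
import re

# Tiny stand-in word list (assumption); a real checker would load a
# full dictionary instead.
KNOWN_WORDS = {"buy", "cheap", "pills", "now", "the", "meeting", "is", "at", "noon"}


def non_word_ratio(text, known_words=KNOWN_WORDS):
    """Fraction of alphanumeric tokens that are not in the word list."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in known_words)
    return unknown / len(tokens)
```

The ratio could then be handed to another scorer as one more signal rather than being used as a verdict on its own.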

--
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers

RTFA by driptray · 2004-02-04 12:05 · Score: 1

You didn't read the article, did you?

It describes how to break a Bayesian filter by finding out which random words match a particular person's ham. This is done by sending thousands of messages to an individual, with each message containing a different set of random words. If a message gets through the filter, it reports back to the sender using HTML, and so the sender can therefore compile a set of words that will be guaranteed to get past a particular person's Bayesian filter.

Re:RTFA by Ricin · 2004-02-04 13:39 · Score: 1

Yes, and I argued that one of the reasons (apart from this experiment being way too specific and time consuming) that this can occur is "ham rot", and the same can be got with "spam rot".

I'm not saying such an approach won't succeed, I'm saying that it's in fact inherent to how the filtering works and that you (with present techniques) are best off having a fresh ("naive") database and you shouldn't overtrain on the same stuff over and over because that will make your "unsure" margin all the larger.

I did read TFA. It wasn't all that specific. I didn't hear the audio.

sorry to be pedantic by martin-boundary · 2004-02-04 12:13 · Score: 1

Your heart's in the right place, but you perpetuate a common fallacy which, as a researcher in this myself, grates and irritates just as much.

Bayesian DOES NOT EQUAL linear.

You are confusing a statistical theory with a single model. What makes the algorithm linear is the assumption of independence between words. That in turn implies that the model parameters can be computed analytically and implemented exactly.

The Bayesian statistical theory (prior/likelihood/posterior etc.) encompasses all possible consistent decision rules, including all possible NLP techniques. Put another way, if your NLP technique doesn't have a Bayesian equivalent, then it is provably not consistent with classical logic. And I'm not taking this out of my ass, rather I'm hinting at Cox's theorem and its variations.

So, all useful decision systems which involve degrees of belief are Bayesian, period. However, calculating the model parameters is normally a nonlinear problem which can consume vast amounts of processing power. It's mainly if you assume independent words that the calculations can be done in O(n), but there are plenty of phrase-based or grammar-based, hidden variable Bayesian models.
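
To make the O(n) remark concrete, here is a sketch in Python of the independent-words case. The function names and the default probability for unseen tokens are assumptions for illustration; spam_prob is a hypothetical table of per-token spam probabilities.

```python
import math


def log_odds(tokens, spam_prob, default=0.5):
    """Sum of per-token log-odds. Independence between words is what
    lets the evidence add up term by term, in one pass over the tokens."""
    total = 0.0
    for t in tokens:
        p = spam_prob.get(t, default)
        total += math.log(p) - math.log(1.0 - p)
    return total


def posterior(tokens, spam_prob, prior=0.5):
    """Posterior spam probability from the prior odds plus token evidence."""
    z = math.log(prior) - math.log(1.0 - prior) + log_odds(tokens, spam_prob)
    return 1.0 / (1.0 + math.exp(-z))
```

The score is a plain sum over tokens, which is the linear model being discussed; dropping the independence assumption means the terms no longer separate like this.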

Please use the correct terminology when berating others.

Re:sorry to be pedantic by pclminion · 2004-02-04 12:28 · Score: 1

You're right, my terminology was a little sloppy. I was of course referring to the "Bayesian" algorithm that readers of Slashdot are commonly familiar with (which isn't even the same as the unigram Bayesian algorithm commonly discussed in introductory NLP texts), not the entire framework of Bayesian statistics.
My prime point was that the algorithm referred to as "Bayesian filtering" is by no means the best thing out there. I'll try to strike a better balance between clarity and technical accuracy next time.

Absurdity by gascolator · 2004-02-04 12:27 · Score: 1

I believe we all (myself included) tend to miss the absurdity of this whole thing, as embroiled as we are in the spam-wars aspect... How absurd is it that someone, having been in essence shown a sign reading 'No Solicitation' or more to the point 'No Viagra Salesmen' on our front doors (SPAM filters) would, nonetheless, show up at dinner time and, without knocking, barge into our homes and try to tell us 'I'm not a Viagra salesman, I'm here to sell you Vayahghrua.'? On what planet and in whose mind does it seem that we're going to say... 'Oh, never mind. As it turns out I really *do* want what you're selling, here's my credit-card number!'? What I'm saying, I guess, is 'SPAMmers, what's really the point? To prove that you can get into our mailboxes regardless? If you're really selling something, why would you want to spend money sending to people who've told you "no, no, a thousand times no!" already?' Geez!

that's ok by martin-boundary · 2004-02-04 12:47 · Score: 1

It's annoying, but incidentally not as bad as the fact that the "Bayesian" filtering used by half the open source filters is based on an ad-hoc chi-squared test pioneered by spambayes. Now *that's* really sick, if you know any of the history of Statistics for the last fifty years. Using classical statistics and calling it Bayesian statistics *shakes head*.

filtering effectiveness by David+Jao · 2004-02-04 13:00 · Score: 1

I get around 200 spams a day, after filtering. How many do I need to get, how long do I have to waste each day, before you accept that it's a problem? Geez, I swear people like you are part of the problem, not the solution. Grrrr!

I have no relationship to anyone else here other than impartial bystander, but I would suggest that you not attack the people who are trying to build better filters. Even if you think filters are in general futile, you must surely admit that it is worth a shot.

I'm seriously interested in how much spam you get every day that 200 messages would slip through the filters. I get "only" about 200 spams a day before filtering, and spamassassin (using only a simple bayesian filter that you so deride) catches 99.9% of spams with less than .1% false positives for a grand total of about one spam per five days after filtering, on average.

Is your spam so different from mine that your filter's accuracy suffers tremendously? Or do you really get 200000 spams a day of which 99.9% is filtered? I'd be interested in some samples of spam that made it through your filters, to see how they would stack up against my filters.
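
The arithmetic behind those figures can be checked with a few lines of Python; the function names are made up for illustration.

```python
def spam_after_filtering(spam_per_day, catch_rate):
    """Spam messages per day that slip past a filter with the given catch rate."""
    return spam_per_day * (1.0 - catch_rate)


def days_per_leaked_spam(spam_per_day, catch_rate):
    """Average number of days between spam messages reaching the inbox."""
    return 1.0 / spam_after_filtering(spam_per_day, catch_rate)
```

At 200 spams a day and a 99.9% catch rate, about 0.2 messages a day get through, or roughly one every five days. For 200 a day to get through at the same catch rate, the incoming volume would have to be about 200000 a day.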

Re:filtering effectiveness by Anonymous Coward · 2004-02-04 20:43 · Score: 0

At least try to read what I wrote. I'm not attacking the guy for trying to build a better filter, but because he was all like "it doesn't take long to manually categorise spam". Which is all very well until you get a lot of spam. I'm totally infuriated by that line of argument and quite frankly I'm surprised to hear it from someone working on a solution to spam.

Re:filtering effectiveness by pclminion · 2004-02-05 04:52 · Score: 1

I'm not attacking the guy for trying to build a better filter, but because he was all like "it doesn't take long to manually categorise spam".
Did you even understand the point I was trying to make? I'm not saying that spam isn't a problem because we (as humans) can filter it, I'm saying that because humans can filter it, there is an upper bound on how confusing spam can be to an automated filter. I was making a (semi-)mathematical argument, not telling you you delete your spam manually. Sheesh.

Re:filtering effectiveness by AndrewHowe · 2004-02-05 13:23 · Score: 1

I agree that there's an upper bound. But you said "It takes a split second to make this judgment." For a human. And I'm saying that not only are humans much better at this than state of the art filters (or at least, filters trained as well as we know how), but that it is taking longer and longer for humans to do it. And, you know, if you want to make a point in future, try to make it more clearly? Thanks. Sheesh. I don't know why you got your panties in a knot about this.

Equip slashdot! by focitrixilous+P · 2004-02-04 13:22 · Score: 1

Slashdot really needs some spamesque-filters. I mean, trolls would be so much funnier if they looked like this:

AFHFEAF HGOI HFOIGH aihf apojfpaf q-0riq apufapof
qjpsafj ajpoifja afpjaf aposjfap mvkal; sapihf asphig

Haven't enjoyed enough g-o-a-t-s-e trolls recently?

atihj aspihat w0956 tiuoag hasog; nawohaoih akf afj

g*o*a*t*s*e.c*x

ag ffl pqr linux BSD Anti-SCO 175089a agha aohgi apgtj

See, that was much less disgusting than regular goatse trolls!

--
SAILING MISHAP

Eat spam? by algf2004 · 2004-02-04 14:53 · Score: 1

No! You weren't supposed to eat it; you were supposed to re-grease wheel bearings with it.

Unless I got that backwards and I wasn't supposed to eat that can of black greasy goop...

Re:Lets Help Him Out by Hero+Zzyzzx · 2004-02-04 15:11 · Score: 1

I LOVE you, John. Really. POPFile has turned spam into a very, very minor annoyance for me, and I get A LOT of it. 99.25% accurate with nary a false positive in recent history.

Keep it up!

Re:That's dedication... :( by Magada · 2004-02-04 20:29 · Score: 0

You make a fine point, sir or lady. The end result, if spammers choose to adopt this technique, might just be the ultimate tool in targeted advertising, brought into being by the people who are supposedly doing the exact opposite, i.e. sp*m. Let us hope the benefits of Mr. Graham-Cumming's research are reaped as soon as possible by the people who need to use them to make a buck.
That way, I might stop receiving Viagra ads, when
what I really need is cheap webhosting.

--
Something bad is coming when people are suddenly anxious to tell the truth.

A good place to get training fodder by Reteo+Varala · 2004-02-04 20:57 · Score: 1

A properly raised Baynesian is your best friend, so you need to feed it daily to make sure it grows to be a healthy ratcatcher.

The best food for this pet is spam and ham... lots of both for best results. You can get a lifetime supply at http://spamassassin.org/publiccorpus

Just place them in a folder, and feed that hungry little bugger 'till 'e's stuffed!

Then watch the spam run for hiding! Muaahahahahah!!!

Another point; where Spamassassin is concerned, sa-learn is your best friend; that program's purpose is to train the spamassassin with the false negatives, thereby preventing the "evil" spam training from working.

--
The Penguin Producer

Re:spam feeds us by Anonymous Coward · 2004-02-05 04:22 · Score: 0

Thank you! That's been driving me insane.
My Google-Fu was WEAK!

511 comments