Slashdot Mirror


Paul Graham: Filters that Fight Back

Mortimer.CA writes "Paul Graham is back with another article about combating spam. It's entitled Filters that Fight Back: 'One intriguing idea is to literally fight back: to make filters disable spammers' servers by automatically following all the links in each incoming email. We may be driven to this in order to achieve accurate filtering anyway. Why wait?' One danger is someone doing a DDoS by sending fake spam."

79 of 328 comments

  1. And now by CptChipJew · · Score: 2, Funny

    And now thanks to links posted to Slashdot, Paul Graham is being DDoS'd =)

    --
    Vonal Declosion
    1. Re:And now by Adam9 · · Score: 2, Insightful

      I don't think Yahoo will mind too much.

      traceroute to paulgraham.com (216.136.224.156), 30 hops max, 40 byte packets ...
      14 vl48.bas2-m.sc5.yahoo.com (66.163.160.214) 99.528 ms 98.349 ms 99.528 ms
      15 alteon4.128.sc5.yahoo.com (216.136.128.6) 98.575 ms 98.687 ms 98.377 ms

    2. Re:And now by Zeinfeld · · Score: 4, Insightful
      And now thanks to links posted to Slashdot, Paul Graham is being DDoS'd =)

      Which illustrates the problems that you get when people who have little or no security experience try to do security.

      The problem with hackback schemes of all types is that they always end up having unexpected effects. The basic problem is that when people design a hackback scheme they never consider what happens when someone sets out to abuse it. They assume that the only change to the environment is their hackback scheme.

      A few months ago Paul thought Bayesian filtering was the one true solution. The only problem was that people who have spent years working on the techniques he described never achieved results anywhere close to the ones he claims.

      Paul Graham's scheme is not as damaging as some others because the amplifier effect is limited. The message sender only gets five or ten messages created for each spam sent. But even that could make a profitable scheme for someone trying to get their site promoted in a 'most visited list'. If they have pay per view adverts they can rake in quite a few bucks - as much as a cent for every spam sent. Far from discouraging spam this scheme would create a new incentive.

      BTW the guy who said 'there is no fake spam' is right depending on the definition you use. If you use the definition 'unwanted email sent indiscriminately' then he is pretty much right. If on the other hand you define spam as 'that which our filters decide is spam'... (I kid you not, folk do try to get that type of definition accepted). The exception would be satires like 'make penis fast'.

      There are similar problems with the folks running blacklists, they think that they understand everything there is about spam but don't realize that the systems they set up can be and will be gamed. Every partisan political mailing list of every stripe that has a significant number of readers gets blacklisted from time to time as people sign up for the list in order to be able to report it as spamming.

      Try to explain to either group that there is a problem and they get majorly defensive. You get accused of wanting to help the spammers, etc. etc. When people start getting defensive like that in response to fair questions you are in big trouble.

      The way to deal with spam is to treat it as a security problem. We deal with security problems using access control - authentication and authorization. We need to start with robust authentication mechanisms that hold ISPs responsible for the messages sent from their domain. These need to be accompanied by robust authorization mechanisms that allow recipients to judge whether the sender is honest.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
    3. Re:And now by Gruturo · · Score: 2, Insightful

      A few months ago Paul thought Bayesian filtering was the one true solution. The only problem was that people who have spent years working on the techniques he described never achieved results anywhere close to the ones he claims.

      Your mileage may vary. Mine is excellent, for example. I've been using a Naive Bayesian filter, POPFile, for a while now, and I'm at 99.74% accuracy with 11564 classified messages and 29 errors. (For the record, 15 spams got through and a few friends' jokes, honestly looking a bit like spam, got filtered out. Not a single work mail got lost.)
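
      As a quick check of those figures (a sketch only; the variable names are made up for illustration), 29 errors out of 11564 classified messages can be turned into an accuracy like this:

```python
# Figures quoted in the comment above.
classified = 11564
errors = 29

# Accuracy is the share of messages that were classified correctly.
accuracy = 1 - errors / classified
print(f"accuracy: {accuracy:.4%}")
```

      The result sits just under 99.75%, so the quoted 99.74 looks like a truncated rather than rounded value.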

      While I might agree that auto-reacting DDOS filters could turn into a pretty ugly beast when someone really clever finds a way to abuse them, I wouldn't be that critical of Paul Graham's work. He came out with a hell of an idea a few months ago, and this one could be even better with a few safeguards and adjustments in place. I suggest he has a word with BitTorrent's Bram Cohen, who might know a thing or two about distributed computing, coordinated network efforts and protocol resistance to tampering and abuse.

      I fully agree about the failure of many antispam efforts: for one, realtime blacklists have been outsmarted and bent against their purpose by brighter spammers with an evil sense of irony. But some techniques do work, and given his track record I'd be inclined to give this guy a chance to show what he's up to.

      And, though I agree that a real, final solution to the problem might involve adoption of a new mail transfer protocol to supplant SMTP, which makes too many assumptions of goodwill, I don't see that coming anytime soon, so we'd better have a look around and see what can be done to improve the current situation.

      --

      Vacuum cleaners suck. Kings rule.
    4. Re:And now by KevMar · · Score: 3, Insightful

      If the spam site gets paid on views, the advertisers are expecting a percentage of visitors to click on ads. If every site is visited, but the links on the site are not clicked (or only links that stay within the domain are followed), the click percentage will go down and advertisers will pay the sites even less. Also, the increased bandwidth bill will add cost.

      We would have to strip out any identifying code in the URLs to prevent added spam from email validation.

      --
      Im a gamer, not a grammer major. This post is full of spelling and grammer mistakes.
    5. Re:And now by Zeinfeld · · Score: 4, Insightful
      >>The message sender only gets five or ten messages created for each spam sent.
      Go back and read the article. It's about http requests, not sending mail.

      Oh, I totally get the fact you are sending out http requests. The fact the message is HTTP rather than SMTP is not relevant as far as I am concerned. The original HTTP spec used the term messages for requests and responses. I really can't remember what we did in the RFC.

      The amplifier effect is just the same, for each message in there could be five messages out. The main advantage to the spammer though is laundering the IP address so that their web site hits appear to come from 10,000 distinct views rather than the same view.

      I don't know where you get this idea. I know plenty of filter hackers who get results so much better than me that I'm kind of embarrassed.

      Getting that sort of result on their own mail is one thing, getting that result on a representative corpus of user emails is a very different matter.

      Geek mail is much easier to spam filter than naive users' mail. Geeks tend to be far more aggressive in the features they use. Naive users are also the spammers' real targets, geeks being a minority, so the vocabulary chosen by spammers tends to be much closer to theirs.

      My real concern is not whether a filter is 99.8 or 95% efficient at detecting spam, it's the false positive rate that is the problem. 1% false positives is a big problem, even 0.5% is a serious problem. The other big problem is the sheer cost of CPU cycles. Imagine a room the size of a football field filled with 100 equipment racks. Processing the legitimate mail only requires one of those racks, the rest are for dealing with spam. Each processing step adds cost. Bayesian filtering is only one part of the solution.

      I agree about going after the spammers, but litigation and law enforcement are far more likely to be effective than hackback.

      What we need to do in addition is to change the mail protocols so that we can know that a message that purports to come from a particular source is authentic. At least 50% of the spam sent claims a false sender address. The tricks that spam senders use to hide from litigation are a very robust spamdicator that almost never gives a false positive.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
    6. Re:And now by mdinowitz · · Score: 2, Interesting

      There's an additional issue here. What of mailing lists which go out to huge numbers of people and include such things as unsubscribe URLs in the header or footer? My server is already overloaded with the lists I run as is. Having even 1% of those who get mail from it pinging it for content can run into tens of thousands of extra hits a day for no constructive reason.
      In addition, if the advertising view scheme you mention goes into effect, it will drive advertising off the web even further than it is now.
      The article is interesting, but....

      --
      Michael Dinowitz House of Fusion http://www.houseoffusion.com
    7. Re:And now by mdinowitz · · Score: 2, Interesting

      And here's the evil that can come from this. A spam message with a link that says "by pressing this link, you signify that you wish to opt into our mailings". The spam filter automatically visits the link and boom, you've opted into God knows what.
      I can think of a TON of things that would be good for, or bad for, as the case may be.

      --
      Michael Dinowitz House of Fusion http://www.houseoffusion.com
  2. response to the lister's comment by ih8apple · · Score: 4, Informative

    In response to the comment: "One danger is someone doing a DDoS by sending fake spam"

    From the article notes: "[5] The best way to protect against abuse might be to have the central authority whitelist every site by default, and then, by whatever protocol, take certain sites off. Because you can look at the sites before taking them off the whitelist, there is little danger of people abusing this system to attack an innocent site."
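
    One way to read that footnote, as a rough sketch: every host is protected by default, and a filter only auto-retrieves links whose host a central authority has reviewed and taken off the whitelist. The set name, the example hosts and the function name below are all hypothetical; the footnote leaves the actual protocol open.

```python
from urllib.parse import urlsplit

# Hypothetical list of hosts the central authority has reviewed and
# removed from the default whitelist. Every other host stays protected.
OFF_WHITELIST = {"spammer.example"}

def may_auto_retrieve(url: str) -> bool:
    """Allow automatic retrieval only for hosts taken off the whitelist."""
    host = (urlsplit(url).hostname or "").lower()
    return host in OFF_WHITELIST

print(may_auto_retrieve("http://spammer.example/offer_5"))
print(may_auto_retrieve("http://innocent.example/"))
```

    Under this reading, fake spam pointing at an innocent site triggers nothing, because that site was never reviewed and removed.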

  3. Following links validates your address by PeekabooCaribou · · Score: 5, Interesting

    If I load an image or a link from spam, it's possible that a spammer could be validating my e-mail address for future sale, or perhaps increased spamming since he knows someone is actually reading the message. For example, http://server.foo/image.gif?id=ab0a98df12j3 could be unique to the spam that was sent to me. If any user-agent accesses that URL, the spammer knows that my e-mail is active and I'm reading his junk. I don't know if they actually do this in practice, but I'm wont to load HTML messages because of it.

    --
    "I'll say it again for the logic-impaired." -- Larry Wall.
    1. Re:Following links validates your address by hankaholic · · Score: 5, Interesting

      I've been thinking for a while about maybe having a Slashbox that displays images included in spam in a 1x1 pixel box.

      Every load of Slashdot would hit spammers' servers.

      --
      Somebody get that guy an ambulance!
    2. Re:Following links validates your address by koehn · · Score: 4, Interesting

      Actually, the opposite would happen: since all links in all spams get hit, this technique would make putting UIDs into URLs worthless for the purpose of authenticating users.

      Spammers would need another mechanism to attempt to authenticate who reads their messages. I like it.

      What do you think about downloading IMG tags? It would hurt the server's bandwidth, but it would hurt my mail server's bandwidth, too. Maybe use one of the many open proxies out there instead, kill their bandwidth, maybe close the open proxy... ooh, that's evil! I really like it!

      If there were a sig here, would you read it?

    3. Re:Following links validates your address by Paradise+Pete · · Score: 2, Insightful
      I'm wont to load HTML messages because of it.

      Wont means you're disposed, or likely, to do something. If I read your (insightful) post correctly, I take it you're hesitant to do so.

    4. Re:Following links validates your address by stevens · · Score: 2, Insightful
      Actually, the opposite would happen: since all links in all spams get hit, this technique would make putting UIDs into URLs worthless for the purpose of authenticating users.

      I don't think so. All links in all spams wouldn't get hit.

      • Mail that got swallowed or bounced undeliverable wouldn't follow the links.
      • Mail that went to non-punishing email clients (like companies who are afraid of liability when DDOSing sites) wouldn't hit the URL.

      And there are many reasons not to punish. I would, but I've got fast ADSL and lots of bits per month to spend. But if I were on metered dialup in the UK where I get charged every second I'm on the line, I wouldn't want my spamfilter to take six minutes downloading mail because it's punishing spammers.

      Here's an idea: Can't the filter strip the path-part of the URL and just hit the root URL on the server? It punishes the same machine (unless it's a complex reverse-proxy setup, where it only punishes the proxy, but that's probably good enough).

      E.g., if the URL is http://spammer.com/offer_5/unique_id_123123i765, then we just hit http://spammer.com/.
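
      A minimal sketch of that idea using Python's standard urllib.parse; the function name root_url is made up for illustration:

```python
from urllib.parse import urlsplit, urlunsplit

def root_url(url: str) -> str:
    """Drop path, query and fragment, keeping only scheme and host."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))

print(root_url("http://spammer.com/offer_5/unique_id_123123i765"))
print(root_url("http://server.foo/image.gif?id=ab0a98df12j3"))
```

      Note that this leaves the host name untouched, so it does nothing about an identifier hidden in a subdomain.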

    5. Re:Following links validates your address by rgmoore · · Score: 2, Insightful
      Actually, the opposite would happen: since all links in all spams get hit, this technique would make putting UIDs into URLs worthless for the purpose of authenticating users.

      But it's not there to authenticate a user; it's just there to authenticate that the email address is actually live rather than a bogus one like nobody@example.invalid. Spammers already use this trick, including uniquely coded urls into each email to track which users actually open the mail, and autoresponding is a possible problem.

      Mr. Graham actually suggests that auto-following could be beneficial to everyone. He argues that spammers would start putting working unsubscribe links in their spam as a way of filtering out spam filters with the autorespond feature and cutting their bandwidth bills. I'm not so sure that this would really work. For one thing, the fact that many spammers already encourage people to download a link in the form of an invisible gif to track live email addresses suggests that the bandwidth problem might be less of an issue than he thinks. Equally important, a lot of spamming is done by contract spammers, not directly by the people being advertized, and I'm not convinced that the contract spammers would really care that much about their clients' web-sites being hammered.

      --

      There's no point in questioning authority if you aren't going to listen to the answers.

    6. Re:Following links validates your address by LordKronos · · Score: 4, Insightful

      That's not going to work. All you are going to do would be to needlessly DOS www.geocities.com without any particular spammer's site being identified. Geocities would have no way to identify which site is the spammer's, and their hourly bandwidth would never get used up, and thus would still be available for those who click on the links.

      Also, consider that spammers could move the identifier to the other end of the url. Just have *.spammer.com or www.*.spammer.com resolve to the same site, and start putting the identifiers in the domain. They could even use random dictionary words as the identifiers to make it more difficult to pick out. The only way to combat that would be to have a system that compares the URLs from several spams and figures out which parts of the URLs changed per user.
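
      The comparison described above could look roughly like this. It is only a sketch under stated assumptions: the URLs are taken to come from the same campaign and to have the same shape, and the helper names and sample URLs are invented for the example.

```python
from urllib.parse import urlsplit

def url_tokens(url: str) -> list[str]:
    """Split a URL into host labels, path segments and the raw query string."""
    parts = urlsplit(url)
    host = (parts.hostname or "").split(".")
    path = [seg for seg in parts.path.split("/") if seg]
    return host + path + ([parts.query] if parts.query else [])

def varying_positions(urls: list[str]) -> list[int]:
    """Return the token positions whose value differs between the URLs."""
    rows = [url_tokens(u) for u in urls]
    if len({len(r) for r in rows}) != 1:
        raise ValueError("URLs have different shapes; compare them separately")
    return [i for i, column in enumerate(zip(*rows)) if len(set(column)) > 1]

samples = [
    "http://apple.spammer.com/offer_5",
    "http://zebra.spammer.com/offer_5",
    "http://lemon.spammer.com/offer_5",
]
print(varying_positions(samples))  # positions that change per recipient
```

      Positions that vary across recipients are the likely identifiers, whether they sit in the host name, the path or the query string.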

    7. Re:Following links validates your address by rgmoore · · Score: 2, Insightful
      I think an email address or user is the same thing.

      But they're not the same. There are zillions of possible valid email addresses out there, but not every one has an actual recipient available to read the messages. For instance, there are about 200 billion possible usernames that contain exactly 8 letters. If you send messages to every username from aaaaaaaa@aol.com to zzzzzzzz@aol.com, most of them will be bounced or dropped harmlessly because there's no mailbox corresponding to that name. Some of them, though, will be valid usernames and will be sent to the appropriate user (assuming, of course, that AOL doesn't filter them as spam).
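
      The 200 billion figure is easy to check, assuming usernames made of the 26 lowercase letters only:

```python
letters = 26   # assumption: lowercase a-z only, no digits or punctuation
length = 8

# Each of the 8 positions can independently be any of the 26 letters.
print(f"{letters ** length:,}")
```

      That comes to a little under 209 billion.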

      For a spammer, knowing which of those addresses reach a real recipient and which ones get dropped is valuable information. There are some spammers who try variants of this approach, sending meaningless spams to huge numbers of guessed addresses and hoping to find out which ones are live by waiting for the mail agent to pick up their coded 1x1 gif and show that the recipient exists. If you give each real user a program that autofetches all of the urls in each spam, this will effectively notify the spammer that the address actually has a mailbox attached and somebody is receiving the mail. Far from the effect that Mr. Graham suggests, that spammers would stop sending to those addresses, it would actually alert spammers that the address is real, when silently deleting the message would leave them thinking that it wasn't real.

      Of course a really clever spammer would include two links, one that would normally be fetched automatically (like an image) and one that would only be fetched by a program that mindlessly followed each possible link in the message (like a link with no clickable area). Messages that retrieved the image but not the hidden link would be classified as live, while those that retrieved both would be viewed as automated responses and uninteresting. The problem is that the address harvesters probably don't care. They're harvesting addresses to sell them to somebody else, so they just want the largest possible list of verified email addresses, and don't particularly care whether they're likely to respond.

      --

      There's no point in questioning authority if you aren't going to listen to the answers.

  4. Some spammers would love this. by www.sorehands.com · · Score: 3, Insightful

    In the situation where the spammer gets paid by hit, the spammer would be rich overnight. But, then the customer might see something a little fishy, then start asking questions.

    1. Re:Some spammers would love this. by xyvimur · · Score: 2, Insightful

      And another super-smart spam sending mechanism will be developed to bypass defences. And another group of people will think up a perfect method to defend against it, and so on, and so on.

  5. Dangerous from a legal perspective by hardaker · · Score: 4, Insightful
    What about phrases like "by clicking on this link you agree to let us call your house" (where the link contains a token for identification purposes)? Having a filter auto-follow links could be really dangerous then.

    The interesting thing is how the courts would end up viewing auto-clicks vs manual clicks. I'd bet that if a user set up a filter then it would effectively be viewed as the user doing the clicking...

    --
    The next site to slashdot will be ready soon, but subscribers can beat the rush and start slashdotting it early!
    1. Re:Dangerous from a legal perspective by xyvimur · · Score: 3, Insightful

      ``"by clicking on this link you agree to let us call your house" kind of things (where the link contains a token for identification purposes). Having a filter auto-follow links could be really dangerous then''

      So it would be necessary to make changes in the law to forbid `auto-agreeing' techniques. And we will have one less problem.

    2. Re:Dangerous from a legal perspective by hardaker · · Score: 2, Interesting
      Yeah, but it's how slowly the law changes that should scare you.

      Plus you know the law would be written like "A computer user must manually and actively activate a link for a legal binding to have an effect; All computers must enforce digital rights management"

      which not only allows for click-through-licensing but ties on a second hidden agenda (pick your topic). Everyone will think the first sentence would do what they wanted and not care about the rest. Hmm... sounds like I'm kind of bitter about the current state of the legal system.

      --
      The next site to slashdot will be ready soon, but subscribers can beat the rush and start slashdotting it early!
    3. Re:Dangerous from a legal perspective by AnotherBlackHat · · Score: 2, Insightful

      What about phrases like "by clicking on this link you agree to let us call your house" kind of things


      By reading this message you agree to give me $50.

    4. Re:Dangerous from a legal perspective by Zeinfeld · · Score: 2, Insightful
      ``"by clicking on this link you agree to let us call your house" kind of things (where the link contains a token for identification purposes). Having a filter auto-follow links could be really dangerous then''

      This was anticipated in the Web Specs which since 1992 have clearly said that clicking on a GET link creates no form of binding contract.

      In any case any contract formed in that manner would be a contract of adhesion and invalid.

      If it were otherwise Google would be entering into all sorts of contracts with its web crawler.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
    5. Re:Dangerous from a legal perspective by gargleblast · · Score: 2, Funny

      Naturally it would be an honour to oblige. Please send your bank account details and I will arrange the financial transfer immediately. Sincerest regards, His Excellency The Very Reverend Hon. Chief Magistrate of Nigeria, Busta Dagin

  6. We're going mobile! by Superfreaker · · Score: 4, Funny

    /.ing moves from the web, right into your own mailbox! All the fun of crushing someone else's website without all of the work of clicking those tiresome links.

    Note to self: Move web site off of modded GameBoy running apache.

  7. horrid legal thought by BobTheLawyer · · Score: 4, Interesting

    a deliberate denial of service attack is illegal whether the victim is an innocent website or an evil spammer. There is no internet equivalent of lawful self defence.

    If a spammed website is brought down by a method such as this, it wouldn't altogether surprise me if they sued the maker of the software responsible. Matters would be complicated if, as they might, they deny responsibility for the original spam e-mail.

    (This is the case in the UK, I'd guess the position will be similar in the US but IANAAL (I Am Not An American Lawyer))

    On the other hand, the "scan the spamvertised website for its content" sounds a great technical approach.

    1. Re:horrid legal thought by Todd+Knarr · · Score: 4, Insightful

      Why would it be illegal? The spammer put the links in the e-mail, obviously intending people to follow them (especially if they make reference to something being available at the linked site in the rest of the text). If far too many people follow the links and the site is brought down, how is that any more unlawful than Slashdot linking to a site in a story and the sudden burst of traffic bringing that site down?

      I think the idea's dangerous for another reason, though. As noted, a spammer could easily include links to sites he doesn't like and let the traffic spike take them down.

  8. This is stupid! by MoogMan · · Score: 4, Interesting

    Seems a bit retarded to at least double the bandwidth drain from spam. It's bad enough as it is. This is *not* a viable solution, unless the spammers happened to be one hop away...

    1. Re:This is stupid! by rabbar · · Score: 3, Insightful

      Actually it's quite clever. The spammer's website would quickly have its bandwidth consumed to the point where most automated accesses to it would time out without actually consuming more than minimal bandwidth. It's an automated, legal denial of service attack on not only the spammer but also on the ISP that hosts the spammer.

  9. Automated slashdotting of spammers by rabbar · · Score: 2, Insightful

    I like the idea: anything that drives up the cost of sending spam above the value derived from spamming is a good thing. I'd also like to see some automated poisoning of things like mortgage solicitations. This type of spam is really intended to simply get your name, address and phone number, which are then sold to mortgage brokers for further solicitation. The mortgage brokers pay $10-50 for these lists of names. If the lists were filled with automated junk information, the value to the mortgage brokers would quickly drop to zero and this type of spam would disappear.

  10. Needs Critical Mass, but how do you tame it? by globalar · · Score: 3, Interesting

    "We should try to ensure that this is only done to suspected spams"

    I am not sure that is 100% possible. In light of that reality, this might just punish any server, not necessarily attached directly to the spammer. For example, if I wanted to shut down a site, couldn't I spam a million inboxes with that site's address?

    I could see this solution, when mismanaged, merely creating lots of extra, meaningless traffic as well.

    I am all for doing something to inconvenience spam, but it seems that the most effective solutions always come at a direct cost to everyone. For example, I have read about adding a small CPU penalty calculation for every email sent. This new solution isn't quite as distributed - it adds traffic to networks and places loads on servers, but it's still a penalty.
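
    For readers who haven't seen the CPU penalty idea, here is a rough sketch in the spirit of hashcash-style proof of work. It is not any real stamp format: the "recipient:counter" layout, the use of SHA-256 and the 12-bit difficulty are all assumptions chosen to keep the example short and fast.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def mint_stamp(recipient: str, bits: int = 12) -> str:
    """Try counters until the stamp's hash starts with `bits` zero bits."""
    for counter in count():
        stamp = f"{recipient}:{counter}"
        if leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits:
            return stamp

def check_stamp(stamp: str, recipient: str, bits: int = 12) -> bool:
    """Verifying costs a single hash, however long minting took."""
    digest = hashlib.sha256(stamp.encode()).digest()
    return stamp.startswith(recipient + ":") and leading_zero_bits(digest) >= bits

stamp = mint_stamp("alice@example.invalid")
print(stamp, check_stamp(stamp, "alice@example.invalid"))
```

    The sender pays on average 2**bits hashes per recipient while the receiver pays one, which is the asymmetry the penalty relies on.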

    I guess the real challenge is finding a way to penalize the spammers and no one else. Good thoughts, and honestly if my client supported a "punish mode," I think I would be tempted to use it as carelessly as I hit delete.

  11. Comparison of Bayesian spam filters by kreide33 · · Score: 5, Informative

    I recently switched from a keyword-based spam filter to a Bayesian filter. However, there exist several Bayesian filter projects and the choice of which to use is not obvious. Therefore, I decided to do an actual test and write up my findings in a review so others can benefit as well. Read it and find out how to win the War on spam.

    1. Re:Comparison of Bayesian spam filters by __past__ · · Score: 4, Insightful
      I always wondered how Graham felt about the hundreds of Bayesian filters written after he published his article. After all it was supposed to be a killer feature of a webmail system he (together with others, of course) writes to demo his Arc language.

      Then again, he's probably still insanely rich from the ViaWeb (a.k.a Yahoo! Store) deal, and doesn't really have to care about lost business advantage much. Becoming a millionaire to be able to concentrate on hacking seems to be a good career plan :-)

    2. Re:Comparison of Bayesian spam filters by asteinberg · · Score: 2, Interesting
      I've always wondered how Paul Graham has managed to get so much hype built up about his work. The idea of using Bayesian filters to classify spam had been around about 5 years prior to his "A Plan For Spam" - check out, for example, this paper by Mehran Sahami (a very cool guy who works here at Stanford as well as at Google) from 1998: http://citeseer.nj.nec.com/sahami98bayesian.html (and if you search around on Citeseer you'll undoubtedly find many other papers on spam classifying from even earlier, though not all use Naive Bayes).

      Mathematically, Graham's version of Naive Bayes is pretty weak - look at the original A Plan for Spam, he chooses all kinds of random numbers based purely on trial and error, rather than backing them up with mathematical reasoning:

      I want to bias the probabilities slightly to avoid false positives, and by trial and error I've found that a good way to do it is to double all the numbers in good. This helps to distinguish between words that occasionally do occur in legitimate email and words that almost never do. I only consider words that occur more than five times in total (actually, because of the doubling, occurring three times in nonspam mail would be enough). And then there is the question of what probability to assign to words that occur in one corpus but not the other. Again by trial and error I chose .01 and .99. There may be room for tuning here, but as the corpus grows such tuning will happen automatically anyway.
      That's just one paragraph, stuff like that is all over the paper. There are many more logical ways to bias the classifier away from false-positives, which I'm not sure if it's worth getting into. Having spent the summer implementing many different variations on spam filtering, I can say confidently that Graham's variation is definitely far from the best.
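
      To make the quoted rule concrete, here is a minimal sketch of a per-word probability with those hand-picked constants. The function name, the dict-based counts and the normalisation by message counts (ngood, nbad) are assumptions for illustration, not a transcription of the essay's code.

```python
def word_probability(word, good, bad, ngood, nbad):
    """Spam probability for one word, or None if the word is too rare.

    good/bad map words to occurrence counts in each corpus;
    ngood/nbad are the number of messages in each corpus.
    """
    g = 2 * good.get(word, 0)   # double the counts from good mail
    b = bad.get(word, 0)
    if g + b <= 5:              # only words seen more than five times count
        return None
    good_freq = min(1.0, g / ngood)
    bad_freq = min(1.0, b / nbad)
    p = bad_freq / (good_freq + bad_freq)
    return max(0.01, min(0.99, p))  # clamp words seen in only one corpus

good = {"meeting": 10, "lisp": 3}
bad = {"viagra": 20, "meeting": 1}
for w in ("viagra", "lisp", "meeting", "unseen"):
    print(w, word_probability(w, good, bad, ngood=100, nbad=100))
```

      All three constants (the doubling, the threshold of five, and the .01/.99 clamps) are tuning knobs rather than derived quantities, which is the point being made above.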
      --
      The first ever Ultimate Frisbee video game: here (now
  12. Filter web-pages through bayesian filters by flux · · Score: 5, Interesting

    How about taking the bayesian algorithms we have today and applying them to the referred web pages? I'm sure they would have plenty of good material for the filters to detect.. Plus this would probably be more effective with spam that is effectively only a URL.

    Secondly, I don't call this any kind of DDoS, even though it might seem such to spammers (is slashdotting a DDoS?). If anyone sends me a mail with a URL, chances are they _want_ me to check it out. If my system fetches the pages and stores them to a cache, I'm doing exactly what the sender wants. (Mailing lists may be a problem though.)

    Thirdly, does it really hurt you to let spammers know that your address is valid? Chances are the address will receive spam nevertheless..

  13. another approach by mwilliamson · · Score: 3, Interesting
    I think this approach would be rather simple to implement

    1. Copyright my gnupg/pgp public key and write a EULA outlining its use. Here is where I'd explicitly disallow unsolicited advertisement.
    2. Have procmail or some other filter direct all non-pgp mail to /dev/null
    3. If someone successfully sends me encrypted email in violation of the EULA of my gnupg/pgp key, pursue legal action against them.
    4. Enjoy my spamless mailspool

    There are other fringe benefits... the overhead of encrypting to a large number of keys would certainly slow a spammer's throughput down. Also, this would encourage widespread use of secure email.
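
    Step 2 is the only mechanical part. As a Python stand-in for the procmail rule (a sketch, not a procmail recipe), the test below treats a message as encrypted if it is PGP/MIME (multipart/encrypted) or carries an inline armored block. The function name is made up, and the check does not verify that the message was actually encrypted to my key.

```python
from email import message_from_string

PGP_ARMOR = "-----BEGIN PGP MESSAGE-----"

def is_pgp_encrypted(raw_message: str) -> bool:
    """True for PGP/MIME mail or mail with an inline armored PGP block."""
    msg = message_from_string(raw_message)
    if msg.get_content_type() == "multipart/encrypted":
        return True
    for part in msg.walk():
        if part.is_multipart():
            continue
        payload = part.get_payload()
        if isinstance(payload, str) and PGP_ARMOR in payload:
            return True
    return False

plain = "From: a@example.invalid\nSubject: hi\n\nBuy now!\n"
armored = ("From: b@example.invalid\nSubject: hi\n\n"
           "-----BEGIN PGP MESSAGE-----\n...\n-----END PGP MESSAGE-----\n")
print(is_pgp_encrypted(plain), is_pgp_encrypted(armored))
```

    Anything for which this returns False would be the mail that gets sent to /dev/null.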

  14. Do they really care? by eddy · · Score: 3, Informative

    My hotmail account gets relentlessly spammed even though I _never_ follow any links from spam or let it load any images. Even before Hotmail introduced the "don't load inline images" feature I always disabled javascript + images before opening any suspected spam.

    Basically, can it get worse? They never seem to remove inactive accounts anyway.

    I have a domain registered which I've owned for three years, and it's still getting spam for accounts related to the previous owner of said domain. My mailer says "no such account" over and over and over again.

    Spammers don't care whether the account exists, is inactive, filtered or whatever. They try to spam it anyway.

    --
    Belief is the currency of delusion.
    1. Re:Do they really care? by Anonymous Coward · · Score: 5, Informative

      You can have a domain/subdomain with no A records or MX records and they will keep trying. You can also have nothing but blackhole MXs - hosts that don't exist, but are on routable networks. I've had a domain since 1994, and it was in one of the above states for about 2-3 years.

      Last month I put a real MX record in there and pointed it at a box that's running a mail server. Sure enough, the spam flows continuously. It's not just the "make up random shit and put @aol.com" idiots either - the big outfits with permanent networks and domains are mailing it too.

      I've taught my mail server to quarantine any host that attempts to mail my long-dead domain, so having it go to a routable address is actually useful again. Every attempt they make ruins another open proxy or relay for every other spammer that may find it later.

      You might consider using those "never valid/previous owner" accounts as spam traps. Anything coming to them now is obviously worthless, so why not make them suffer for trying?
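
      The spam-trap idea above might look something like this in outline. It is a sketch under assumptions, not the poster's actual setup: the trap addresses, the class name and the `on_rcpt` hook are all hypothetical, and a real mail server would wire this into its own policy interface.

```python
class TrapQuarantine:
    """Quarantine any client that mails a never-valid address (hypothetical sketch)."""

    def __init__(self, trap_addresses):
        self.traps = {addr.lower() for addr in trap_addresses}
        self.quarantined = set()

    def on_rcpt(self, client_ip: str, rcpt: str) -> bool:
        """Return True to accept the recipient, False to reject it."""
        if client_ip in self.quarantined:
            return False
        if rcpt.lower() in self.traps:
            # Nothing legitimate mails a dead address, so block the sender from now on.
            self.quarantined.add(client_ip)
            return False
        return True


if __name__ == "__main__":
    q = TrapQuarantine(["previous-owner@example.org"])
    print(q.on_rcpt("192.0.2.1", "previous-owner@example.org"))
    print(q.on_rcpt("192.0.2.1", "me@example.org"))
```

      Once a host has touched a trap address, even its mail to real accounts is refused, which is what makes the open proxy or relay useless to the next spammer who finds it.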

  15. Are you kidding?? by amjohns · · Score: 3, Funny

    This is brilliant. It costs the spammers little bandwidth to send out SMTP messages. But if we start downloading their graphics-rich webpages, and reloading repeatedly, we'll drive their bandwidth through the roof.

    The point is not the user's bandwidth, this is really a DDOS, but since the spammer's asked for it (literally, not just figuratively), it's OK.

  16. I'm 1337 by MoeMoe · · Score: 4, Funny

    One danger is someone doing a DDoS by sending fake spam

    I'm sorry but spoof's dont usually work to well on me... I'm 2 1337 to be fooled.

    Seriously though, if you just take a little more time to look into the header contents of that "penis enlargement" ad, you might find a pretty new IP addy to "play with" *cough* BO2K *cough* or at least the real route that this spam took to get to you, just follow the yellow brick road back up to Mr. 12 extra inches and... well, you decide your own punishment for 'em ;)

    Besides, it's not like you need that ad... do you?

    --
    Business \Busi"ness\, n.;
    A scam in which all people involved perceive as beneficial...
  17. Re:No such thing by wavecoder · · Score: 2, Informative

    there is no 'fake' spam

    Not true; several times I have received spams so carefully put together that they looked like they came from one of my addresses. For example, I used to have an address like me@school.edu; it's been inactive for some time, but once in a while I'll get a message claiming to be from that address, complete with perfectly spoofed headers. Tricky, but entirely possible.

  18. Wrong! by amjohns · · Score: 2, Insightful

    While the net effect is DDOS-like, we're only doing EXACTLY WHAT THE SPAMMERS WANT! They asked us to visit their webpages, so we did. This is 100% legal, and no court (or jury at least) would see otherwise.

    But you've got to watch out for unique tracking images so as not to validate your email address.

  19. Re:And that is why we spammers... by Trick · · Score: 3, Insightful

    Would that be such a bad thing? A big part of the reason spammers have the success they do is because there are a *lot* of people out there with misconfigured proxies. If the only bad result of a filter was that a few "innocent" people who don't know what they're doing, and made things easier for spammers, got DOSsed, I'd have no problem with that at all.

  20. Fake Spam?? by GeekZilla · · Score: 2, Funny
    "One danger is someone doing a DDoS by sending fake spam"

    Isn't fake Spam uh...Spam?

    Isn't that like saying "I want you to separate the flammable material from the inflammable."

    --
    Veritas patesco per quaestio questio. Truth is revealed through questions.
  21. Thoughts on active countermeasures and relays... by atcroft · · Score: 5, Insightful

    Just finished reading the section of the article that was headed as "Filters that fight back." I think that the biggest issues that keep such an approach from working are fundamental features of the e-mail infrastructure itself: 1) the lack of verification, and 2) the store-and-forward and replicative nature of email itself.

    In other systems I am aware of in which active countermeasures may appear (such as firewalls, and tcpwrappers), the adversary can be established with reasonable certainty in most cases; however, because the From and Reply-To addresses can be (and often are) forged and most owners of relaying machines are unaware they are misconfigured, it seems doubtful countermeasures would work at that step. If one uses the URLs, as suggested in the article, it is not guaranteed that the "million" emails sent out will hit the next server along their path at a particular time, so it seems doubtful you can guarantee a massive traffic burst at once. Indeed, what may be seen instead is incremental bursts of traffic at the delivery retry intervals of various mailserver software.

    Other questions also arise, such as: 1) how much additional load will a mailserver experience from hitting the links; 2) what additional security issues are introduced in doing so (what if, for instance, the code to do this results in a security vulnerability); 3) how can it be done in such a way that DDOS attacks against innocent victims can be avoided; and 4) how can you get enough people to both upgrade their systems and cooperate in a useful way to do this. Issues 1 and 2 are probably obvious questions to ask; issues 3 and 4, however, I believe suffer from the same weaknesses as some of the current BL schemes. Also, some localities have legal codes which prohibit the interruption of legitimate access to a system, and the server in this case definitely has a way to track back to you at that point, which potentially makes participants vulnerable to legal or civil actions.

    While I admire Mr. Graham and his efforts in the spam-wars, and find it an intriguing idea, I do not think this approach will truly be successful until changes are made to the underpinning email system that may reduce some of the issues mentioned, but hopefully will themselves make an impact on the issue without being too onerous to prevent wide-spread adoption.

  22. The people who PAY spammers would not by The+Monster · · Score: 5, Interesting
    In the situation where the spammer gets paid by hit, the spammer would be rich overnight. But, then the customer might see something a little fishy, then start asking questions.
    So you're saying that the long-term effect would be to destroy the spammers' business model?

    Looking for a downside to this plan . . . still looking . . . Nope. I can't see one.

    --

    [100% ISO 646 Compliant]
    SVM, ERGO MONSTRO.

    1. Re:The people who PAY spammers would not by FuckMeter · · Score: 2, Insightful
      Spammers don't care about keeping their customers happy, so attempting to use this to destroy their business by making their customers unhappy is doomed to failure.
      I think the post you replied to, as well as its parent, were speaking of pay-per-click schemes. The original parent meant "customer" as in the person who hires the spammer, not the person who buys the products.

      A fair portion of the spam I get seems to promote pay-per-click programs, especially the porn spam. Spammer signs up as an "affiliate" of a porn site, sends out ten million emails, might generate 10,000 hits, each of which is probably paying half a cent. He gets a check from the porn site owner (or its processing company) for 50 bucks.

      Now suppose instead of generating 10,000 legitimate click-throughs from spam recipients, that mailing to 10 million addresses generated 5 million click-throughs from filterbots. The porn site operator sees some guy sending 5 million hits out of nowhere, and none of those hits are converting into signups. Do you think he's really going to cut the spammer a check for $25,000? No, he's going to boot the spammer out of his affiliate program, and the spammer isn't going to get paid.

      The same holds true for the mainstream side. Let's say ABC WidgetCo hires a spammer to drive some sales. The spammer sends out 10 million emails promoting abcwidgetco.com. Filterbots happily fetch abcwidgetco.com 5 million times over the course of a day or two. ABC WidgetCo's website dies for a few hours due to the overwhelming load, and their hosting bill for the month skyrockets, yet none of that turned into sales. Do you think they're going to pay the spammer if they haven't already? Even if they prepaid, do you think they're ever going to hire a spammer again?

      The idea is to make spamming either costly or at least unprofitable. Even if the spammer doesn't wind up paying out-of-pocket, he won't be able to make anything from pay-per-click or pay-per-hit models, either. Right now a lot of spammers probably slip under the radar of spam and cheat detection in these types of programs, but filterbots would make it obvious to the sponsors that they had a spammer on their hands.
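
      The numbers in the two affiliate scenarios above can be checked with a few lines. This is just the arithmetic from the post (half a cent per hit, 10 million mails, 10,000 versus 5 million hits); the function names are made up for the sketch.

```python
from decimal import Decimal

RATE_PER_HIT = Decimal("0.005")  # half a cent per hit, in dollars, as quoted above
MAILS_SENT = 10_000_000


def payout(hits: int) -> Decimal:
    """Dollars owed to the affiliate for a given number of hits."""
    return RATE_PER_HIT * hits


def hit_rate_percent(hits: int, sent: int = MAILS_SENT) -> Decimal:
    """Hits as a percentage of mails sent."""
    return Decimal(hits) / Decimal(sent) * 100


if __name__ == "__main__":
    for label, hits in (("human clicks", 10_000), ("filterbots", 5_000_000)):
        print(label, payout(hits), hit_rate_percent(hits))
```

      A jump from a 0.1% hit rate to 50% with no signups is exactly the kind of anomaly the post expects a sponsor to notice.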
  23. Interesting side-effect by leetrum · · Score: 3, Interesting

    An interesting side effect of this strategy would be that it would be harder to track commissions based on per-click (instead of per-sale) for the sites employing spammers, thus limiting their income to people who buy (which can generally be a better commission anyway, but not offered by all these seedy companies).

  24. DDoS with IFRAMEs by The+Famous+Brett+Wat · · Score: 4, Informative
    The problems with spam-based DDoS are bad enough already. Many HTML mail readers honour IFRAME tags, so if you want to DDoS someone, then just combine a Joe Job (fake their identity, advertise their site) with an HTML mail that contains N IFRAMEs, each set to be one pixel high and refer to a large page on the victim's site. Anyone who reads the spam in an uncautious HTML-capable mail client (of which there are still way too many) will subsequently attempt to fetch the specified page N times, unless you're lucky with intermediate caching proxies or the user hitting the stop button.

    Such an attack on Nutters.org forced me to stop doing my own hosting on a DSL line, since it got utterly swamped and cost way too much in bandwidth. Amusingly, it has forced me into using a much cheaper and higher bandwidth service -- one where such attacks are no longer my problem. The rules of the game have changed for me, though: I no longer consider it viable to host a website on a low-bandwidth leaf node like a single DSL, even where normal usage would make it seem acceptable, since it makes you a sitting duck for this kind of attack. I still can't imagine why anyone would want to target Nutters.org; being small and unworthy of attack doesn't seem to be a good defense anymore.

    --
    proof, n. A demonstration that a conclusion is implied by certain premises and axioms.
  25. Bandwidth by Have+Blue · · Score: 3, Insightful

    I thought the primary complaint against spam was that it uses too much bandwidth. Wouldn't this proposal waste even MORE bandwidth per spam?

  26. Paul's good at this stuff, but this is no good... by wavecoder · · Score: 5, Insightful
    The way I see it, these are the beefs people have:
    • Multiplies bandwidth exponentially, automatically. Big corporations, especially, would be hacked off by this, and it has the added downside of slowing whole sections of the net (imagine what happens when a college dorm gets hit and 800 little bots go check out the site 57 times...).
    • Accidental DDoS on good sites - yes, Victoria, spam can be spoofed VERY convincingly.
    • Accidental DDoS on good sites (2) - if you've ever maintained a mailing list of more than 20 people, you know that, eventually, some idiot complains he/she got spammed, even if they double-opted in. I've been accused of spamming when I was quoted 2/3 of the way into someone else's (double opt-in) message! I know great sites that are blacklisted, out of human stupidity, alone.
    • Accidental DDoS on good hosts - imagine the impact on any shared host, or even some virtual hosts, when one bad client mails 5 million spams - before they could react, they could be taken offline!
    • Bad programmers (gasp!) - yes, those exist, and some of these filters could really go haywire and start thrashing all sorts of sites.
    • Lawyers - IANAL, but I shudder to think what happens the first time Microsoft or Big Blue sues some programmer, because an abused copy of their software took them down for an hour! (What is the M$ site worth, per hour? Too much, for sure.) Granted, the suit should go the other way, but that's another topic.
    • Abuse of ISPs - you'd be amazed how many ISPs will pull the plug on paying accounts for even innocent behavior (like sending 1,000 messages on a DSL account in under an hour, even if it's a business and all the messages are unique). This could get a lot of folks kicked offline.
    There are probably others... My thought is this - build a really good, Bayesian, SBPH filter like CRM114, and incorporate a "grab questionable sites" option for the "spams of the future," then filter that page as though it were spam. That'll get us all up into the 99.9% range (the noise), and spammers will eventually either (a) go out of business, or (b) only be able to get their messages to the few people that think they're worthwhile, anyway.

    My $.02.

    -Ed
  27. Confirmed opt-in mailing lists. by SSpade · · Score: 4, Insightful

    Has anyone considered what this will really do? It'll have next to no impact on spammers.

    However, lots and lots of legitimate opt-in mailing lists are following best practices by requiring a closed-loop opt-in with a magic cookie to prevent forged signups.

    How do they work? Well, usually you follow a URL containing a magic cookie in a challenge email to confirm you want to sign up for the mailing list. Oops.

    (For added brokenness, combine this with the other flawed anti-spam fad-du-jour, challenge/response).
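
    To make the problem concrete, here is a minimal sketch of a closed-loop opt-in, assuming the "magic cookie" is an HMAC of the address. The secret, the URL and the function names are all hypothetical; real list managers differ in the details.

```python
import hashlib
import hmac
from urllib.parse import quote

SECRET = b"list-server-secret"  # hypothetical; known only to the list server


def make_cookie(address: str) -> str:
    """Cookie that only the list server can compute for this address."""
    return hmac.new(SECRET, address.lower().encode(), hashlib.sha256).hexdigest()


def confirm_url(address: str) -> str:
    """Link placed in the challenge email (lists.example.org is a placeholder)."""
    return f"https://lists.example.org/confirm?addr={quote(address)}&cookie={make_cookie(address)}"


def handle_confirm(address: str, cookie: str, subscribers: set) -> bool:
    """Subscribe the address if the cookie matches. Any GET with the right cookie counts."""
    if hmac.compare_digest(cookie, make_cookie(address)):
        subscribers.add(address.lower())
        return True
    return False


if __name__ == "__main__":
    print(confirm_url("victim@example.com"))
```

    The server cannot tell whether a human or a link-following filter fetched the URL, so a forged signup gets confirmed automatically once the victim's filter follows the link.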

  28. Another idea by skinfitz · · Score: 2, Interesting

    Why not just have the filter reply to the sending address with its own randomly generated addy and auto drop those messages that use fake addresses that bounce? This could be done within seconds in most cases. The only issues here would be storage of the spam and how long you wait. It could be done by "keeping the spammer on the line" during the SMTP transfer also causing the transmission of spam to be delayed.
    Could it work?

    1. Re:Another idea by hankaholic · · Score: 2, Informative
      Could it work?
      Define "work".

      What you're proposing is that you send a message in response to every message you receive. Furthermore, you're suggesting that the message you send in response have an invalid (random) return address.

      How is this a good idea?

      Okay, say machine scott@b.com is sending to larry@a.com. Assume that all machines are running your "callback" software.

      B connects to A. A holds the connection open, as you proposed, and sends a message to scott@b.com, with a forged header so that it looks as though it came from "random1928@c.com".

      Okay, B has a pending connection to A. A has an open connection to B, and B tries to deliver the mail to C.

      So the user scott@b.com has now gotten spam from random1928@c.com. The operator of c.com isn't happy, because it looks like he's sending spam. The guy at b.com isn't happy, because every message he sends to a.com results in a spam for him.

      If the sites involved had catchall aliases (which would accept mail to any address at that domain), the number of connections would increase continually, and nothing would ever actually be confirmed, until a connection or DNS lookup failed somewhere, in which case every pending connection would fail.

      SMTP already includes a command for address verification -- it's called VRFY. Most sites shut it off, though, because instead of spamming tons of random addresses, one could just VRFY tons of random addresses. This would make spammers' jobs easier -- they would be able to ensure that each address to which they send mail represents an actual mailbox.

      Getting back to your suggestion, though -- this is a truly bad idea. Try it on paper if you don't believe me. Assume that most or all of the hosts are running the software which you propose. Keep in mind that you may suggest inserting headers so that servers can communicate to each other and keep track of which messages are in response to other messages, but headers can be (and are!) forged.
      --
      Somebody get that guy an ambulance!
  29. collateral damage? Not really by swordgeek · · Score: 2, Interesting

    I've seen a few posts about the possibility of collateral damage--deliberately targeting someone else's server as the target of an auto-DDOS. Someone also mentioned hijacking a server, and then bringing it down.

    The thing is, it's no easier to do it with this proposed system than anything that's currently available. In this case you have to download (buy?!) a copy of spamming software, get a list, and then run a DDOS that's actually traceable back to you. Good plan? Not by my thinking.

    Now the nice thing about this is that it will end up costing an inordinate amount of money for the spammer, take down their servers, and really piss off their ISP. (Watch the pink contracts disappear!) This is a fairly drastic measure that might actually get rid of many spammers for good.

    Basically, it's either this or a crowbar to the head.

    --

    "People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
  30. Sorry, bad idea by mikeswi · · Score: 5, Insightful

    When my newsletter (confirmed Opt-in for the NANAE people who may be reading) goes out every Tuesday and 8,000 people open it, how am I supposed to deal with these filters DDoSing my site? For that matter, how do I deal with these filters attacking my site when some other newsletter links to it? What do I do when I piss off Ronnie Scelson and he links to every individual page on my site and spams 100,000,000 people with them?

    Links are more likely to be found in legitimate email than in spam. We're going to whitelist every single existing domain on Earth, and then remove the bad ones? Do you have any idea how large that list would be and how long it would take to download it to compare with the domains found linked in an email?

    Let's say this idea becomes used widely. It will be used as a weapon by the spammers themselves.

    1.) Pay-per-click links sent in mass mailings. Spammer gets paid for every link clicked. I'm sure some of the advertisers will get wise, but there will be plenty who just sign the checks without looking deeper.

    2.) Ronnie Scelson or Alan Ralsky get pissed at someone who owns a web site (SPEWS perhaps), and send the address to several hundred million people.

    For the ISP sysadmins reading, you think it's bad when 20,000 spams land on your mail server? How are you going to like it when each of those 20,000 spams produce 3 or 4 (or 30 or 40) HTTP requests?

    Sorry, bad idea. I can't see how the idea of "attack filters" does anything but discredit the whole idea, especially after thousands of perfectly innocent web sites are knocked offline by the sort of malicious software being advocated, or when spammers inevitably abuse it.

  31. This is spectacularly stupid. by edunbar93 · · Score: 4, Insightful

    Any program that does something this dangerous automatically, even to people that deserve it, is a BAD idea.

    This is the sort of thing that needs human supervision because bugs, user input, and solar flares may cause the program to act differently than you think it should. Any sysadmin who's made programs that would affect thousands of users automatically knows this. There will be a percentage - no matter how small - that the program will affect negatively, and that tiny percentage will be very, very pissed off.

    You should be exceptionally careful about where you point your Massive Hose of Death because after all, to err is human, but to really fuck things up requires a recursive algorithm working at 2 billion cycles per second.

    It's also occurred to me that you'd be hurting yourself just as badly bandwidth-wise anyway. We all complain about how much of our mail is spam, and how much bandwidth it wastes, but to DDOS them would waste hundreds of times more, not only for you but every provider that carries the traffic.

    --
    "No problem. I have the capacity to do infinite work so long as you don't mind that my quality approaches zero."-Dilbert
  32. Re:Choosing A Bayesian Filter by wavecoder · · Score: 2, Informative
    First of all, these are not apples to apples. Popfile is a multi-purpose classifier; CRM114 is a multi-purpose filter; the others are sole-purpose filters, to my knowledge. So, it depends on:
    1. whether you have more than one use (spam filtering) for it,
    2. how much of a geek you are (do you really want to have to compile it yourself, or does that give you thrills?),
    3. OS - this determines more than you might expect,
    4. the stats that are out there (there's little doubt that CRM114 is the best at what it does, but there are plenty of others in the very high 90's)
    Besides, the more the merrier - the more algorithms out there and the more spam corpora that exist, the harder it is to get ANY spam through.

    -Ed
  33. Don't just do something, stand there! by asackett · · Score: 2, Insightful

    I suspect that a thorough analysis of the proposed scheme would conclude that it could not work if it were widely adopted. It's silly to create a system in which a relatively small, expected but undesired input triggers a relatively large burden on network resources.

    Oh, wait... that's called a distributed denial of service attack. Someone already thought it up!

    --

    Warning: This signature may offend some viewers.

  34. New Spamming Technique : Trickle Spam. by androse · · Score: 4, Informative

    I'm all for the idea, and as a matter of fact, I suggested it a couple of months ago.

    If individual spam victims start repetitively downloading the spammers website, this could bring the spammer to change the way he sends spam from the current big bang technique to a small continuous trickle technique. The spammer would send a single spam over several weeks, instead of a few hours. He would parallelize the process.

    I see two possible counter-attacks to this :

    • content-based blacklisting (like Vipul's Razor, etc), i.e. a central database of links that are currently being used in spam.
    • high aggressiveness from the victims : if everyone loads the URI 50, 100, or 300 times, then the "trickle method" would probably fail. You should of course change the HTTP User Agent string for each request, and randomize the timing to stop any filtering on the web server.

    Feel the rage !

  35. As tempting as it may be... by KC7GR · · Score: 2, Insightful

    ...Fighting abuse with more abuse probably will not solve anything, and could also get you in trouble with your own ISP, if a spammer hits you hard enough that your responses cause problems for the fake E-mail addresses they put into their spam.

    This is a bad idea, IMO. Stick with blocklisting. Once things get to the point where the spammers are all on what amounts to an intranet, and they're doing nothing but spamming each other, they'll get the idea.

    --

    Bruce Lane, KC7GR,

    Blue Feather Technologies

  36. Avoid URL validation - lie to them by Tool+Man · · Score: 2, Interesting

    I like the idea of whacking the spammers' bandwidth, but I'm not really keen on validating the email address the bastards have reached.

    So, why not follow the links, but change the parameter values? It's all something which we'd do programmatically anyway, so subtle variations in the value portion would still incur the expense of processing the input, even if it fails. Keep the path component of the URL, and the parameter names used, so it gets as far as possible before blowing chunks.
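
    The URL rewriting described above could be sketched like this. It assumes the tracking token lives in the query string (it may just as well be in the path, which this does not touch), and the function name is invented for the example.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def scramble_query_values(url: str) -> str:
    """Keep host, path and parameter names; replace every parameter value with random hex."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    scrambled = [
        # Roughly preserve the value length so the request still looks plausible.
        (name, secrets.token_hex(max(1, len(value) // 2)))
        for name, value in pairs
    ]
    return urlunsplit(parts._replace(query=urlencode(scrambled)))


if __name__ == "__main__":
    print(scramble_query_values("http://spam.example/offer.php?id=track-me&u=zz"))
```

    The server still has to parse the request and look up the bogus value, but whatever identified the recipient is gone.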

  37. So many security holes... by anthony_dipierro · · Score: 3, Insightful

    It's not just DDOS that is the problem (in fact DDOS is actually the main feature). A naive implementation would pass along the GET data. So you could use this method to anonymously submit form data. Want to stuff an online ballot? Send out a spam linking to http://whatever/poll.foo?bar. Depending on how poorly written the sites are, you could even use this to do more sophisticated things, like sign up for 10,000 accounts at a certain website.
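
    One rough guard against the form-submission problem above is to refuse to auto-follow any link that carries GET data. This is only a heuristic sketch with a made-up function name; a URL without a query string can still trigger side effects on a badly written site.

```python
from urllib.parse import urlsplit


def safe_to_auto_fetch(url: str) -> bool:
    """Heuristic: only follow plain http(s) links that carry no query string."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # Anything after '?' might be a vote, a signup, or some other state change.
    return parts.query == ""


if __name__ == "__main__":
    print(safe_to_auto_fetch("http://whatever/poll.foo?bar"))
    print(safe_to_auto_fetch("http://whatever/index.html"))
```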

  38. Re:Thoughts on active countermeasures and relays.. by hankaholic · · Score: 2, Informative
    Answers:
    1. If this caught on in a big way, almost certainly less load than spam imposes on its own, assuming that this was run on the servers. However, since Bayesian filters are best left to the individual to personalize to their own specific preferences, the load would likely be distributed across the clients (such as Mozilla), as opposed to the servers.

      Graham did mention users with broadband connections, implying that this would be something that the client would pull down.

    2. Fetching an HTTP request and parsing the returned text really has no more security risks than automatically parsing text which is sent to you via email. As long as the software is designed sensibly, there shouldn't be any additional security problems.

    3. This is difficult to say, but one benefit of the proposed system is that it only loads pages linked from messages which are not obvious in their classification. What is questionable in one person's inbox may not be questionable in another's. This reduces the chance that a concocted email will create such a DDOS attack -- it would have to be created in such a way as to be tagged as "possibly, but not definitely, spam" by many different programs given the unique corpora of those running the software.

    4. This is really the big issue -- making sure that an implementation is widespread enough to make a real difference in the habits of spammers and the networks which support them. Reaching this critical mass may take a while, but the point of the article is that by also parsing the links in the email, you get a better idea of how relevant the message may or may not be.

      In other words, you get a more accurate filter which takes into account more than the message itself -- it also considers the content which the message is trying to put across.
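
    The "only fetch links for messages that are not obvious" rule from answer 3 can be expressed as a simple decision on the filter's spam probability. The 0.1 and 0.9 cut-offs below are assumptions picked for illustration, not values from the article.

```python
def next_action(spam_probability: float, low: float = 0.1, high: float = 0.9) -> str:
    """Decide what to do with a message given the filter's spam probability.

    The thresholds are hypothetical; each user's filter would tune its own.
    """
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if spam_probability >= high:
        return "reject"       # obviously spam, no need to look at the links
    if spam_probability <= low:
        return "deliver"      # obviously legitimate
    return "fetch-links"      # uncertain: retrieve linked pages and re-score


if __name__ == "__main__":
    for p in (0.02, 0.5, 0.99):
        print(p, next_action(p))
```

    Because each user's corpus yields different probabilities, a crafted message would have to land in the uncertain band for many independent filters before it generated much traffic.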
    --
    Somebody get that guy an ambulance!
  39. SETI@HOME ? by axxackall · · Score: 5, Interesting
    I think that some sort of SETI approach can be used:
    1. your filter recognizes the spam and gets URLs from it;
    2. all such URLs are gathered in the central authority and statistically verified (how many filters have claimed the same site);
    3. only the most often claimed sites are left in the list, while more rarely claimed sites are considered as claimed by mistake or by the anti-filter attack;
    4. people willing to help to fight spam download the screensaver aka SETI@HOME, working at your CPU and net idle time;
    5. the screensaver downloads the fresh list of sites to be fought back along with a centrally generated schedule;
    6. the filter actually attacks back at the scheduled time points (if it's still the idle time for client PC), not massively from the individual PC (so it doesn't look suspicious for the individual client *AND* it doesn't create any peak bandwidth problem for the attacker);
    7. the spammer's web site is /.ed;
    All problems I see resolvable:
    • a schedule must be smart to avoid a local bandwidth problem, but still flood the spammer, but with many such screensavers even a smooth attack will not be very smooth when it's multiplied by millions;
    • a central authority can be a subject for a counter-attack as well (will it start cyber-wars?), but if the central authority is really decentralized (p2p, SETI, other techs) then it should not be a problem;
    • spammers may use some sort of logging, but what can they do with it?
    • to prevent someone from organizing fake claims in order to /. an innocent site, statistics should help - only really massively claimed sites will be counted;

    The main idea of the spam is to send email massively on a very low cost. So if the attack will be also very massive, it will increase their cost of operation and at least some of them will go out of business.

    Any attempts of spammers to go through filters will not work, as you can manually submit the spam claim to (what is its name? NOSPAM@HOME?) the central authority. If the amount of such claims will be big enough, then the claimed sites will be included.
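
    Steps 2 and 3 of the scheme (collect claims centrally, keep only the massively claimed sites) might be sketched as below. The class name, the reporter IDs and the threshold are all hypothetical, and the sketch assumes each reporter can be counted once, which is precisely what a determined attacker with many identities would try to defeat.

```python
from collections import defaultdict


class ClaimRegistry:
    """Collect spam-site claims and list only sites claimed by enough distinct reporters."""

    def __init__(self, min_reporters: int):
        self.min_reporters = min_reporters
        self._reporters = defaultdict(set)  # site -> set of reporter ids

    def claim(self, reporter_id: str, site: str) -> None:
        # A set means repeated claims from the same reporter count only once.
        self._reporters[site.lower()].add(reporter_id)

    def listed_sites(self) -> list:
        return sorted(
            site for site, reporters in self._reporters.items()
            if len(reporters) >= self.min_reporters
        )


if __name__ == "__main__":
    registry = ClaimRegistry(min_reporters=3)
    for reporter in ("a", "b", "c"):
        registry.claim(reporter, "spam.example")
    registry.claim("x", "innocent.example")
    print(registry.listed_sites())
```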

    --

    Less is more !
    1. Re:SETI@HOME ? by Pieroxy · · Score: 2, Insightful

      all this is a neat idea, but there are still a couple of problems unresolved:

      1. There is a small company that I dislike. What prevents me from hacking their IP address and sending a shitload of spam in their name?
      2. automatic or manual retaliation comes back to taking justice into your own hands, which is inherently illegal (at least in the US).

  40. Bad idea, but might be improved by Animats · · Score: 2, Interesting

    The good idea there is to filter spam based on what it links to. SpamCop already does some of this, and reports the spamvertised site to its ISP or upstream provider. This is reasonably effective. It also identifies black-hat ISPs that host sites referenced in much spam.
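
    Filtering on what a message links to can be as simple as pulling the hostnames out of the body and checking them against a list of known spamvertised sites. This is a rough sketch, not how SpamCop works internally; the regex is deliberately crude and the blocklist is a placeholder.

```python
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def linked_hosts(body: str) -> set:
    """Return the lowercased hostnames of all http(s) links found in the text."""
    hosts = set()
    for url in URL_RE.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, ignore it
        if host:
            hosts.add(host)
    return hosts


def links_to_known_spam_site(body: str, blocklist: set) -> bool:
    """True if any linked host is on the (hypothetical) blocklist."""
    return not linked_hosts(body).isdisjoint(blocklist)


if __name__ == "__main__":
    body = "Click http://Pills.example/buy?id=1 or see <http://news.example/story>"
    print(sorted(linked_hosts(body)))
    print(links_to_known_spam_site(body, {"pills.example"}))
```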

  41. auto following links -> spread worms by frenetic3 · · Score: 2, Insightful

    i think a more potentially dangerous outcome is that this could become a vehicle for worms to spread;

    lots of vulnerabilities have been discovered (in IE, etc) in the past that run arbitrary code when you visit a web page.

    so, if we have all these [identical] email clients set to automatically follow links and that there's some kind of known buffer overrun within the html parsing code (or if they use the IE rendering engine and some similar vulnerability has been discovered) then if a malicious link is sent then all of these clients will follow it and get compromised. (witness the paranoia now in most email clients which disable javascript, attachments, etc by default).

    at that point, if tons of machines are compromised, they could be turned into open proxies or could turn around and forward the email to everyone in their address book, etc.

    yes, this might sound like a farfetched scenario, but i think even if this case didn't happen, the obvious counter for spammers is to distribute the web load over a bunch of compromised open proxies or something or to throw up temporary web pages on random web hosts until they get shut down.

    the bottom line is that in the end the pain of this countermeasure will be simply passed onto innocent third parties.

    furthermore, it's unlikely that any major mail client will include this feature by default (outlook or eudora) since there's so much room for abuse, and the whole idea relies on a critical mass of users to actually have an effect.

    -fren

    --
    "Where are we going, and why am I in this handbasket?"
  42. Bayesian filters by dtfinch · · Score: 2, Informative

    It seems like the need for other anti-spam techniques will decrease as these become more popular. Things like ip banning or automated server hacking just hurt more non-spammers.

    I installed a free one called K9 (though I donated $20 to the author), and over my last 573 emails (392 spam) it has only made one mistake, making it over 99.8% accurate after its initial training (141 messages). I've only been using it for a few weeks. It's about a 60k download and is very flexible and well behaved. The downside is that it's closed source and built for win32. I don't know if it works under Wine.
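
    The accuracy figure above is easy to verify from the counts given. The sketch assumes the 573 messages are the ones scored after training, which the post does not state outright.

```python
total_messages = 573   # emails seen, as quoted above
mistakes = 1           # misclassifications, as quoted above

accuracy = 1 - mistakes / total_messages
print(f"{accuracy:.2%}")  # 99.83%
```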

    The one spam that got through was disguised a typical personal message, except that it was offering a business relationship and contained a personalized image link to determine if I viewed the message.

    I tried Mozilla's built-in Bayesian filter for a few months. It had about 90% accuracy, even though I corrected every single mistake it made. Something's not working there, so it probably shouldn't be used to judge the accuracy of Bayesian filters in general.

    I've tried PopFile as well. It seems to have good accuracy, but it's like swatting a fly with a sledgehammer. It's like a full-fledged anti-spam server and is best installed on a dedicated server, but it is not well suited for multi-user environments, and it's not easy to correct old mistakes or rebuild the word database. It does have the benefit of being cross-platform though, and it supports multiple buckets, not just spam and not spam.

  43. Re:Hear! hear! by hankaholic · · Score: 2, Interesting

    A 404 would cause load on their servers, but pulling actual images would rob their bandwidth as well.

    --
    Somebody get that guy an ambulance!
  44. Fight fire... by adding fire? by quacking+duck · · Score: 3, Interesting
    Given that so many people, even corporate execs, are stupid enough to order stuff from spammers, why not use this fact to our advantage?

    Send out "white hat" spam, which for all intents and purposes looks like real (ie "black hat") spam. Except clicking on the link takes you to any number of webpages that basically say "are you so f***ing stupid you actually believe pills can make your penis/breasts/whatever larger?"

    Adjust content to suit type of spam. Include disgusting images if the type of spam you're emulating is adult-oriented (pr0n, enlargements, etc), something else entirely if you're "selling" mortgages or similarly benign wares (ie no goatse.cx-type images if you're "selling").

    And to cap it off, if viewers are so enraged at what they see, the page will have a feedback link. The link will either be a known spammer's email so they receive their venting instead of their money, or link to yet another anti-spam site.

    Geeks and filters will automatically block this stuff out, so there's no harm done to us, aside from having to filter out even more spam.

    But with any luck, if enough of these anti-spam spams get sent out that people start associating spam messages with informative, insulting or disgusting websites, they'll learn to stop clicking on those damn links, stop buying their bullshit products, the spam model becomes unprofitable, and spam is reduced to a saner level or eliminated entirely.

    Legal implications? No better and no worse than black hat spammers.

    Comments?

  45. P2P Analogy by prozac79 · · Score: 2, Interesting

    Isn't this what some congressman is trying to get passed for P2P networks? He thinks that it is perfectly acceptable for copyright holders to hack P2P networks and bring down machines that are suspected of having illegally obtained copyrighted material. Now we propose this for spam and suddenly this is a good thing? I know, nobody likes spammers, but that can't be the foundation for allowing people to hack others' systems. If filters were allowed to strike back at spammers, that would give the RIAA and MPAA all the ammo they need to lobby for new laws that allow disabling people's service. As many people have said in other posts, it puts us on a very slippery slope that will probably have consequences beyond what we initially envision, not just for email, but for anything that someone does over the internet that is "unwanted".

    --
    "Oh dear, she's stuck in an infinite loop and he's an idiot" -Prof. Farnsworth (Futurama)
  46. Sounds a lot like an old idea... by jemfinch · · Score: 2, Interesting

    Making spammers pay for each spam they send? Sounds a lot like Daniel Bernstein's Internet Mail 2000 recommendation, except that this idea has far more potential for abuse. As much as I like Paul Graham's innovative ideas, this one is definitely both late on the scene and inferior to IM2000.

    Jeremy

  47. Re:No by g.zero · · Score: 2, Insightful

    Aren't you forgetting that some people are on a 56k connection? Forcing their browser to download the images would increase the loading time for them. It might not make much difference to those on a DSL connection or better, but when you only get 5k/s it could hurt.

    --
    "Hard work _might_ pay off later, but procrastination _always_ pays off now."
  48. RE: Filters that Fight Back by Tacoguy · · Score: 3, Interesting

    Spam fighting, it seems to me, has 2 fronts: what to do once you are on the lists, and how you got there to begin with. Having made numerous web sites thru the years, it has become clear to me that these spammers are largely harvesting addys thru mailto links on web pages. A number of techniques can be utilized to prevent such activity. 2 of my favs are the use of ASCII characters in the actual addy and the use of Javascript to mask the addy. Once you are "in their hooks" there seems little you can do, so it seems best to me to not get there in the first place.

    Best, Jeff
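    A minimal sketch of the first technique, assuming "ASCII characters" means writing each character of the address as an HTML numeric character reference. The function names and the example address are hypothetical. Browsers decode the references inside the href, while a harvester doing a naive text match never sees a literal "@".

```python
def entity_encode(address: str) -> str:
    """Write every character as a decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in address)


def obfuscated_mailto(address: str, label: str) -> str:
    """Build a mailto anchor whose address appears only in encoded form."""
    encoded = entity_encode(address)
    return f'<a href="mailto:{encoded}">{label}</a>'


# Hypothetical address, for illustration only.
print(obfuscated_mailto("user@example.com", "Mail me"))
```

    This only stops harvesters that don't decode entities; one that parses the page like a browser would still recover the address.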

  49. Automatic attacks are a bad idea by cait56 · · Score: 2, Insightful

    Having a "filter fight back" is a polite way of saying that you have trained attack software.

    Software has bugs. If you have trained attack software, it will have bugs. Which means eventually it will attack an innocent site.

    Ultimately this is a bad idea for the same reasons that automated home defenses are a bad idea. It's very easy to say that the intruder has earned the automated response, but then you get the nitty gritty issue of whether your automated system can distinguish between a burglar and a fireman.

    The same issues apply in identifying Spam. How will your software, which will make mistakes, distinguish between the real source of Spam and a clever header that is making it look like someone else is the source? I don't care how good your algorithm is. It's coded by humans, so it will make mistakes. Unlike a human making a mistake manually, however, it will pounce at very high speeds.

  50. Re:noooooooo by mikiN · · Score: 2, Interesting
    ... it would screw up stuff like mailing lists that have URLs to click to confirm you want to be on the list.

    Simple problem, simple solution: mailing lists should use something like

    Please <a href="mailto:listowner@some.domain?subject=confirm -#confirmationkey">confirm</a> your subscription.

    Please don't let the 'clickability factor' of an http URL (1 click) versus a plain old mailto (2 or more clicks to send) get in the way of privacy protection. I suppose that when you have just subscribed to a mailing list you are interested in more than just the confirmation message, so you have some clicks to spare.
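    For list software, generating such a link could look roughly like this. It is only a sketch: the function names, the key format, and the "confirm <key>" subject layout are my assumptions, not any particular list manager's behaviour.

```python
import secrets
from urllib.parse import quote


def new_confirmation_key() -> str:
    """Random per-subscriber key (16 hex characters)."""
    return secrets.token_hex(8)


def confirm_link(list_owner: str, key: str) -> str:
    """Build a mailto anchor that puts the key in the subject line."""
    # Percent-encode the subject so spaces survive inside the URL.
    subject = quote(f"confirm {key}", safe="")
    return f'<a href="mailto:{list_owner}?subject={subject}">confirm</a>'


print(confirm_link("listowner@some.domain", new_confirmation_key()))
```

    A filter that automatically fetches http links never triggers this, because confirming means a person actually sending the message.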

    -
    Never send a machine to do a human's job.

    --
    The Hacker's Guide To The Kernel: Don't panic()!
  51. NOSPAM@HOME ! by axxackall · · Score: 2, Insightful
    Let me think:

    There is a small company that I dislike. What prevents me from hacking their ip address and sending a shitload of spam in their name?

    In my opinion it is possible to have a statistical analysis that would be capable of distinguishing it, unless you organize a really big attack. On the other hand, a central (even if it's distributed) authority may help to gather evidence against your unfair anti-competitive practice, which would be rather difficult if such a NOSPAM@HOME project did not exist.

    automatic or manual retaliation comes back to making justice yourself which is inherently illegal (at least in the us).

    What makes it illegal? It is a statistical research project. Volunteers help to gather a statistical database of originally filtered emails. The central (and distributed) authority asks volunteers to help gather the rest of the information, namely the responsiveness of a seller's web site, based on a pre-estimated schedule. BTW, the result of the statistical analysis can be peacefully used to advise the seller's web site admin on how to improve the site's responsiveness. Most likely the only advice so far would be: "shut your spam down and your site traffic will come back to normal".

    I am actually ready to stand up in court and say: "Well, the targeted company sends its marketing materials with only a 5% chance that the reader wants to read them. We study the responsiveness of the targeted site by creating traffic to the site where only 5% of the actual requests are wanted by the site's owners. How is our 5% different from their 5%? If what we do is illegal, then what they do is illegal as well. But what we are doing is non-profit research that only a very small group of people may dislike, while what they are doing is a for-profit campaign that millions of innocent people dislike."

    --

    Less is more !