Slashdot Mirror


Paul Graham: Filters that Fight Back

Mortimer.CA writes "Paul Graham is back with another article about combating spam. It's entitled Filters that Fight Back: 'One intriguing idea is to literally fight back: to make filters disable spammers' servers by automatically following all the links in each incoming email. We may be driven to this in order to achieve accurate filtering anyway. Why wait?' One danger is someone doing a DDoS by sending fake spam."

39 of 328 comments

  1. Some spammers would love this. by www.sorehands.com · · Score: 3, Insightful

    In the situation where the spammer gets paid per hit, the spammer would be rich overnight. But then the customer might see something a little fishy and start asking questions.

    1. Re:Some spammers would love this. by xyvimur · · Score: 2, Insightful

      And another super-smart spam sending mechanism will be developed to bypass defences. And another group of people will think up a perfect method to defend against it, and so on, and so on.

  2. Re:And now by Adam9 · · Score: 2, Insightful

    I don't think Yahoo will mind too much.

    traceroute to paulgraham.com (216.136.224.156), 30 hops max, 40 byte packets ...
    14 vl48.bas2-m.sc5.yahoo.com (66.163.160.214) 99.528 ms 98.349 ms 99.528 ms
    15 alteon4.128.sc5.yahoo.com (216.136.128.6) 98.575 ms 98.687 ms 98.377 ms

  3. Dangerous from a legal perspective by hardaker · · Score: 4, Insightful
    What about phrases like "by clicking on this link you agree to let us call your house" kind of things (where the link contains a token for identification purposes). Having a filter auto-follow links could be really dangerous then.

    The interesting thing is how the courts would end up viewing auto-clicks vs manual clicks. I'd bet that if a user set up a filter then it would be effectively viewed as the user doing the clicking...

    --
    The next site to slashdot will be ready soon, but subscribers can beat the rush and start slashdotting it early!
    1. Re:Dangerous from a legal perspective by xyvimur · · Score: 3, Insightful

      ``"by clicking on this link you agree to let us call your house" kind of things (where the link contains a token for identification purposes). Having a filter auto-follow links could be really dangerous then''

      So it would be necessary to make changes in the law to forbid `auto-agreeing' techniques. And we will have one less problem.

    2. Re:Dangerous from a legal perspective by AnotherBlackHat · · Score: 2, Insightful

      What about phrases like "by clicking on this link you agree to let us call your house" kind of things


      By reading this message you agree to give me $50.

    3. Re:Dangerous from a legal perspective by Zeinfeld · · Score: 2, Insightful
      ``"by clicking on this link you agree to let us call your house" kind of things (where the link contains a token for identification purposes). Having a filter auto-follow links could be really dangerous then''

      This was anticipated in the Web Specs which since 1992 have clearly said that clicking on a GET link creates no form of binding contract.

      In any case any contract formed in that manner would be a contract of adhesion and invalid.

      If it were otherwise Google would be entering into all sorts of contracts with its web crawler.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
  4. Stupidest idea ever by DemoLiter · · Score: 1, Insightful

    Which means : 1. receiver of the spam will waste even more bandwidth 2. spammer may verify accounts by posting links like http://bla.com/bla.php?stupid@email.com 3. Already said : DDoS attacks initiated via spam

  5. Automated slashdotting of spammers by rabbar · · Score: 2, Insightful

    I like the idea; anything that drives up the cost of sending spam above the value derived from spamming is a good thing. I'd also like to see some automated poisoning of things like mortgage solicitations. This type of spam is really intended simply to get your name, address and phone number, which are then sold to mortgage brokers for further solicitation. The mortgage brokers pay $10-50 for these lists of names; if the lists were filled with automated junk information, their value to the mortgage brokers would quickly drop to zero and so would this type of spam.

  6. And that is why we spammers... by Anonymous Coward · · Score: 1, Insightful

    ... bounce the connections through proxies and attach fake return paths. I guess this would punish people who don't (don't know how to) close their proxies to the outside world.

    I suppose you would burn the amateur spammers that run COTS spam software off their AOL connections.

    1. Re:And that is why we spammers... by Trick · · Score: 3, Insightful

      Would that be such a bad thing? A big part of the reason spammers have the success they do is because there are a *lot* of people out there with misconfigured proxies. If the only bad result of a filter was that a few "innocent" people who don't know what they're doing, and made things easier for spammers, got DOSsed, I'd have no problem with that at all.

  7. Re:horrid legal thought by Todd+Knarr · · Score: 4, Insightful

    Why would it be illegal? The spammer put the links in the e-mail, obviously intending people to follow them (especially if they make reference to something being available at the linked site in the rest of the text). If far too many people follow the links and the site is brought down, how is that any more unlawful than Slashdot linking to a site in a story and the sudden burst of traffic bringing that site down?

    I think the idea's dangerous for another reason, though. As noted, a spammer could easily include links to sites he doesn't like and let the traffic spike take them down.

  8. Re:This is stupid! by rabbar · · Score: 3, Insightful

    Actually it's quite clever. The spammer's website would quickly have its bandwidth consumed to the point where most automated accesses to it would time out without actually consuming more than minimal bandwidth. It's an automated, legal denial of service attack on not only the spammer but also on the ISP that hosts the spammer.

  9. Wrong! by amjohns · · Score: 2, Insightful

    While the net effect is DDOS-like, we're only doing EXACTLY WHAT THE SPAMMERS WANT! They asked us to visit their webpages, so we did. This is 100% legal, and no court (or jury at least) would see otherwise.

    But you've got to watch out for unique tracking images so as not to validate your email address.

  10. Re:Comparison of Bayesian spam filters by __past__ · · Score: 4, Insightful
    I always wondered how Graham felt about the hundreds of Bayesian filters written after he published his article. After all it was supposed to be a killer feature of a webmail system he (together with others, of course) writes to demo his Arc language.

    Then again, he's probably still insanely rich from the ViaWeb (a.k.a Yahoo! Store) deal, and doesn't really have to care about lost business advantage much. Becoming a millionaire to be able to concentrate on hacking seems to be a good career plan :-)

  11. Thoughts on active countermeasures and relays... by atcroft · · Score: 5, Insightful

    Just finished reading the section of the article that was headed as "Filters that fight back." I think that the biggest issues that keep such an approach from working are fundamental features of the e-mail infrastructure itself: 1) the lack of verification, and 2) the store-and-forward and replicative nature of email itself.

    In other systems I am aware of in which active countermeasures may appear (such as firewalls, and tcpwrappers), the adversary can be established with reasonable certainty in most cases; however, because the From and Reply-To addresses can be (and often are) forged and most owners of relaying machines are unaware they are misconfigured, it seems doubtful countermeasures would work at that step. If one uses the URLs, as suggested in the article, it is not guaranteed that the "million" emails sent out will hit the next server along their path at a particular time, so it seems doubtful you can guarantee a massive traffic burst at once. Indeed, what may be seen instead is incremental bursts of traffic at the delivery retry intervals of various mailserver software.

    Other questions also arise, such as: 1) how much additional load will a mailserver experience from hitting the links; 2) what additional security issues are introduced in doing so (what if, for instance, the code to do this results in a security vulnerability); 3) how can it be done in such a way that DDOS attacks against innocent victims can be avoided; and 4) how can you get enough people to both upgrade their systems and cooperate in a useful way to do this. Issues 1 and 2 are probably obvious questions to ask; issues 3 and 4, however, I believe suffer from the same weaknesses as some of the current BL schemes. Also, some localities have legal codes which prohibit the interruption of legitimate access to a system, and the server in this case definitely has a way to track back to you at that point, which potentially makes participants vulnerable to legal or civil actions.

    While I admire Mr. Graham and his efforts in the spam-wars, and find it an intriguing idea, I do not think this approach will truly be successful until changes are made to the underpinning email system that may reduce some of the issues mentioned, but hopefully will themselves make an impact on the issue without being too onerous to prevent wide-spread adoption.

  12. Bandwidth by Have+Blue · · Score: 3, Insightful

    I thought the primary complaint against spam was that it uses too much bandwidth. Wouldn't this proposal waste even MORE bandwidth per spam?

  13. Paul's good at this stuff, but this is no good... by wavecoder · · Score: 5, Insightful
    The way I see it, these are the beefs people have:
    • Multiplies bandwidth exponentially, automatically. Big corporations, especially, would be hacked off by this, and it has the added downside of slowing whole sections of the net (imagine what happens when a college dorm gets hit and 800 little bots go check out the site 57 times...).
    • Accidental DDoS on good sites - yes, Victoria, spam can be spoofed VERY convincingly.
    • Accidental DDoS on good sites (2) - if you've ever maintained a mailing list of more than 20 people, you know that, eventually, some idiot complains he/she got spammed, even if they double-opted in. I've been accused of spamming when I was quoted 2/3 of the way into someone else's (double opt-in) message! I know great sites that are blacklisted, out of human stupidity, alone.
    • Accidental DDoS on good hosts - imagine the impact on any shared host, or even some virtual hosts, when one bad client mails 5 million spams - before they could react, they could be taken offline!
    • Bad programmers (gasp!) - yes, those exist, and some of these filters could really go haywire and start thrashing all sorts of sites.
    • Lawyers - IANAL, but I shudder to think what happens the first time Microsoft or Big Blue sues some programmer, because an abused copy of their software took them down for an hour! (What is the M$ site worth, per hour? Too much, for sure.) Granted, the suit should go the other way, but that's another topic.
    • Abuse of ISPs - you'd be amazed how many ISPs will pull the plug on paying accounts for even innocent behavior (like sending 1,000 messages on a DSL account in under an hour, even if it's a business and all the messages are unique). This could get a lot of folks kicked offline.
    There are probably others... My thought is this - build a really good, Bayesian, SBPH filter like CRM114, and incorporate a "grab questionable sites" option for the "spams of the future," then filter that page as though it were spam. That'll get us all up into the 99.9% range (the noise), and spammers will eventually either (a) go out of business, or (b) only be able to get their messages to the few people that think they're worthwhile, anyway.

    My $.02.

    -Ed
  14. Re:Following links validates your address by Paradise+Pete · · Score: 2, Insightful
    I'm wont to load HTML messages because of it.

    Wont means you're disposed, or likely, to do something. If I read your (insightful) post correctly, I take it you're hesitant to do so.

  15. Confirmed opt-in mailing lists. by SSpade · · Score: 4, Insightful

    Has anyone considered what this will really do? It'll have next to no impact on spammers.

    However, lots and lots of legitimate opt-in mailing lists are following best practices by requiring a closed-loop opt-in with a magic cookie to prevent forged signups.

    How do they work? Well, usually you follow a URL containing a magic cookie in a challenge email to confirm you want to sign up for the mailing list. Oops.

    (For added brokenness, combine this with the other flawed anti-spam fad-du-jour, challenge/response).

  16. Re:Following links validates your address by stevens · · Score: 2, Insightful
    Actually, the opposite would happen: since all links in all spams get hit, this technique would make putting UIDs into URLs worthless for the purpose of authenticating users.

    I don't think so. All links in all spams wouldn't get hit.

    • Mail that got swallowed or bounced as undeliverable wouldn't follow the links.
    • Mail that went to non-punishing email clients (like companies who are afraid of liability when DDOSing sites) wouldn't hit the URL.

    And there are many reasons not to punish. I would, but I've got fast ADSL and lots of bits per month to spend. But if I were on metered dialup in the UK where I get charged every second I'm on the line, I wouldn't want my spamfilter to take six minutes downloading mail because it's punishing spammers.

    Here's an idea: Can't the filter strip the path-part of the URL and just hit the root URL on the server? It punishes the same machine (unless it's a complex reverse-proxy setup, where it only punishes the proxy, but that's probably good enough).

    E.g., if the URL is http://spammer.com/offer_5/unique_id_123123i765, then we just hit http://spammer.com/.
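
    A minimal sketch of that path-stripping step, assuming Python and its standard urllib.parse module. The function name root_url is made up for illustration, and the sketch only rewrites the URL; it does not fetch anything.

```python
from urllib.parse import urlsplit, urlunsplit


def root_url(url: str) -> str:
    """Return scheme://host/ for a URL, dropping path, query, and fragment."""
    parts = urlsplit(url)
    # Keep only the scheme and network location. Everything after the host,
    # which is where a per-recipient ID sits in the example above, is dropped.
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))


if __name__ == "__main__":
    print(root_url("http://spammer.com/offer_5/unique_id_123123i765"))
```

    This leaves the hostname untouched, so an identifier placed in the hostname rather than the path would still get through.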

  17. Sorry, bad idea by mikeswi · · Score: 5, Insightful

    When my newsletter (confirmed Opt-in for the NANAE people who may be reading) goes out every Tuesday and 8,000 people open it, how am I supposed to deal with these filters DDoSing my site? For that matter, how do I deal with these filters attacking my site when some other newsletter links to it? What do I do when I piss off Ronnie Scelson and he links to every individual page on my site and spams 100,000,000 people with them?

    Links are more likely to be found in legitimate email than in spam. We're going to whitelist every single existing domain on Earth, and then remove the bad ones? Do you have any idea how large that list would be and how long it would take to download it to compare with the domains found linked in an email?

    Let's say this idea becomes used widely. It will be used as a weapon by the spammers themselves.

    1.) Pay-per-click links sent in mass mailings. Spammer gets paid for every link clicked. I'm sure some of the advertisers will get wise, but there will be plenty who just sign the checks without looking deeper.

    2.) Ronnie Scelson or Alan Ralsky get pissed at someone who owns a web site (SPEWS perhaps), and send the address to several hundred million people.

    For the ISP sysadmins reading, you think it's bad when 20,000 spams land on your mail server? How are you going to like it when each of those 20,000 spams produce 3 or 4 (or 30 or 40) HTTP requests?

    Sorry, bad idea. I can't see how the idea of "attack filters" does anything but discredit the whole idea, especially after thousands of perfectly innocent web sites are knocked offline by the sort of malicious software being advocated, or when spammers inevitably abuse it.

  18. This is spectacularly stupid. by edunbar93 · · Score: 4, Insightful

    Any program that does something this dangerous automatically, even to people that deserve it, is a BAD idea.

    This is the sort of thing that needs human supervision because bugs, user input, and solar flares may cause the program to act differently than you think it should. Any sysadmin who's made programs that would affect thousands of users automatically knows this. There will be a percentage - no matter how small - that the program will affect negatively, and that tiny percentage will be very, very pissed off.

    You should be exceptionally careful about where you point your Massive Hose of Death because after all, to err is human, but to really fuck things up requires a recursive algorithm working at 2 billion cycles per second.

    It's also occurred to me that you'd be hurting yourself just as badly bandwidth-wise anyway. We all complain about how much of our mail is spam, and how much bandwidth it wastes, but to DDOS them would waste hundreds of times more, not only for you but for every provider that carries the traffic.

    --
    "No problem. I have the capacity to do infinite work so long as you don't mind that my quality approaches zero."-Dilbert
  19. Re:Following links validates your address by rgmoore · · Score: 2, Insightful
    Actually, the opposite would happen: since all links in all spams get hit, this technique would make putting UIDs into URLs worthless for the purpose of authenticating users.

    But it's not there to authenticate a user; it's just there to authenticate that the email address is actually live rather than a bogus one like nobody@example.invalid. Spammers already use this trick, including uniquely coded urls into each email to track which users actually open the mail, and autoresponding is a possible problem.

    Mr. Graham actually suggests that auto-following could be beneficial to everyone. He argues that spammers would start putting working unsubscribe links in their spam as a way of filtering out spam filters with the autorespond feature and cutting their bandwidth bills. I'm not so sure that this would really work. For one thing, the fact that many spammers already encourage people to download a link in the form of an invisible gif to track live email addresses suggests that the bandwidth problem might be less of an issue than he thinks. Equally important, a lot of spamming is done by contract spammers, not directly by the people being advertized, and I'm not convinced that the contract spammers would really care that much about their clients' web-sites being hammered.

    --

    There's no point in questioning authority if you aren't going to listen to the answers.

  20. Don't just do something, stand there! by asackett · · Score: 2, Insightful

    I suspect that a thorough analysis of the proposed scheme would conclude that it could not work if it were widely adopted. It's silly to create a system in which a relatively small, expected but undesired input triggers a relatively large burden on network resources.

    Oh, wait... that's called a distributed denial of service attack. Someone already thought it up!

    --

    Warning: This signature may offend some viewers.

  21. As tempting as it may be... by KC7GR · · Score: 2, Insightful

    ...Fighting abuse with more abuse probably will not solve anything, and could also get you in trouble with your own ISP, if a spammer hits you hard enough to cause the fake E-mail addresses they put into their spam enough problems.

    This is a bad idea, IMO. Stick with blocklisting. Once things get to the point where the spammers are all on what amounts to an intranet, and they're doing nothing but spamming each other, they'll get the idea.

    --

    Bruce Lane, KC7GR,

    Blue Feather Technologies

  22. Re:The people who PAY spammers would not by FuckMeter · · Score: 2, Insightful
    Spammers don't care about keeping their customers happy, so attempting to use this to destroy their business by making their customers unhappy is doomed to failure.
    I think the post you replied to, as well as its parent, were speaking of pay-per-click schemes. The original parent meant "customer" as in the person who hires the spammer, not the person who buys the products.

    A fair portion of the spam I get seems to promote pay-per-click programs, especially the porn spam. Spammer signs up as an "affiliate" of a porn site, sends out ten million emails, might generate 10,000 hits, each of which is probably paying half a cent. He gets a check from the porn site owner (or its processing company) for 50 bucks.

    Now suppose instead of generating 10,000 legitimate click-throughs from spam recipients, that mailing to 10 million addresses generated 5 million click-throughs from filterbots. The porn site operator sees some guy sending 5 million hits out of nowhere, and none of those hits are converting into signups. Do you think he's really going to cut the spammer a check for $25,000? No, he's going to boot the spammer out of his affiliate program, and the spammer isn't going to get paid.

    The same holds true for the mainstream side. Let's say ABC WidgetCo hires a spammer to drive some sales. The spammer sends out 10 million emails promoting abcwidgetco.com. Filterbots happily fetch abcwidgetco.com 5 million times over the course of a day or two. ABC WidgetCo's website dies for a few hours due to the overwhelming load, and their hosting bill for the month skyrockets, yet none of that turned into sales. Do you think they're going to pay the spammer if they haven't already? Even if they prepaid, do you think they're ever going to hire a spammer again?

    The idea is to make spamming either costly or at least unprofitable. Even if the spammer doesn't wind up paying out-of-pocket, he won't be able to make anything from pay-per-click or pay-per-hit models, either. Right now a lot of spammers probably slip under the radar of spam and cheat detection in these types of programs, but filterbots would make it obvious to the sponsors that they had a spammer on their hands.
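
    The payout figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the half-cent rate and the hit counts are the ones quoted in this comment, and the names RATE_PER_HIT and payout are invented for the example.

```python
from decimal import Decimal

# Half a cent per hit, the rate assumed in the comment above.
RATE_PER_HIT = Decimal("0.005")


def payout(hits: int) -> Decimal:
    """Dollars owed to the affiliate for a given number of hits."""
    return hits * RATE_PER_HIT


if __name__ == "__main__":
    print("10,000 real click-throughs:", payout(10_000))
    print("5,000,000 filterbot hits:  ", payout(5_000_000))
```

    The second figure is 500 times the first, which is the jump the sponsor would notice.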
  23. So many security holes... by anthony_dipierro · · Score: 3, Insightful

    It's not just DDOS that is the problem (in fact DDOS is actually the main feature). A naive implementation would pass along the GET data. So you could use this method to anonymously submit form data. Want to stuff an online ballot? Send out a spam linking to http://whatever/poll.foo?bar. Depending on how poorly written the sites are, you could even use this to do more sophisticated things, like sign up for 10,000 accounts at a certain website.
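
    One possible guard against the form-submission problem, sketched in Python with the standard urllib.parse module. The name strip_query is hypothetical. It only removes the query string and fragment, so data encoded in the path itself would still be passed along.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Drop the query string and fragment so no GET parameters are sent."""
    parts = urlsplit(url)
    # Scheme, host, and path are kept; the "?..." and "#..." parts are emptied.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    print(strip_query("http://whatever/poll.foo?bar"))
```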

  24. Re:Following links validates your address by LordKronos · · Score: 4, Insightful

    That's not going to work. All you are going to do would be to needlessly DOS www.geocities.com without any particular spammers site being identified. Geocities would have no way to identify which site is the spammer's, and their hourly bandwidth would never get used up, and thus would still be available for those who click on the links.

    Also, consider that spammers could move the identifier to the other end of the url. Just have *.spammer.com or www.*.spammer.com resolve to the same site, and start putting the identifiers in the domain. They could even use random dictionary words as the identifiers to make it more difficult to pick out. The only way to combat that would be to have a system that compares the URLs from several spams and figures out which parts of the URLs changed per user.
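
    A rough Python sketch of that comparison, assuming the filter already holds several well-formed URLs taken from copies of the same spam. The function name varying_parts and the sample hostnames are invented for illustration. It reports which hostname labels (counted from the right) and which path segments differ between copies.

```python
from urllib.parse import urlsplit


def varying_parts(urls):
    """Report which hostname labels and path segments differ across URLs."""
    hosts = [urlsplit(u).hostname.split(".") for u in urls]
    paths = [urlsplit(u).path.strip("/").split("/") for u in urls]
    result = {"host": [], "path": []}
    # Compare hostname labels right to left so the registered domain lines up
    # even when the number of subdomain labels differs.
    for i, labels in enumerate(zip(*(reversed(h) for h in hosts))):
        if len(set(labels)) > 1:
            result["host"].append(i)
    # Compare path segments left to right.
    for i, segments in enumerate(zip(*paths)):
        if len(set(segments)) > 1:
            result["path"].append(i)
    return result


if __name__ == "__main__":
    samples = [
        "http://www.apple.spammer.com/offer",
        "http://www.river.spammer.com/offer",
        "http://www.stone.spammer.com/offer",
    ]
    print(varying_parts(samples))
```

    In the sample, label 0 is "com" and label 1 is "spammer", so a varying label 2 points at the dictionary-word identifier.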

  25. auto following links -> spread worms by frenetic3 · · Score: 2, Insightful

    i think a more potentially dangerous outcome is that this could become a vehicle for worms to spread;

    lots of vulnerabilities have been discovered (in IE, etc) in the past that run arbitrary code when you visit a web page.

    so, if we have all these [identical] email clients set to automatically follow links and that there's some kind of known buffer overrun within the html parsing code (or if they use the IE rendering engine and some similar vulnerability has been discovered) then if a malicious link is sent then all of these clients will follow it and get compromised. (witness the paranoia now in most email clients which disable javascript, attachments, etc by default).

    at that point, if tons of machines are compromised, they could be turned into open proxies or could turn around and forward the email to everyone in their address book, etc.

    yes, this might sound like a farfetched scenario, but i think even if this case didn't happen, the obvious counter for spammers is to distribute the web load over a bunch of compromised open proxies or something or to throw up temporary web pages on random web hosts until they get shut down.

    the bottom line is that in the end the pain of this countermeasure will be simply passed onto innocent third parties.

    furthermore, it's unlikely that any major mail client will include this feature by default (outlook or eudora) since there's so much room for abuse, and the whole idea relies on a critical mass of users to actually have an effect.

    -fren

    --
    "Where are we going, and why am I in this handbasket?"
  26. Re:And now by Zeinfeld · · Score: 4, Insightful
    And now thanks to links posted to Slashdot, Paul Graham is being DDoS'd =)

    Which illustrates the problems that you get when people who have little or no security experience try to do security.

    The problem with hackback schemes of all types is that they always end up having unexpected effects. The basic problem is that when people design a hackback scheme they never consider what happens when someone sets out to abuse it. They assume that the only change to the environment is their hackback scheme.

    A few months ago Paul thought Bayesian filtering was the one true solution. The only problem was that people who have spent years working on the techniques he described never achieved results anywhere close to the ones he claims.

    Paul Graham's scheme is not as damaging as some others because the amplifier effect is limited. The message sender only gets five or ten messages created for each spam sent. But even that could make a profitable scheme for someone trying to get their site promoted in a 'most visited list'. If they have pay per view adverts they can rake in quite a few bucks - as much as a cent for every spam sent. Far from discouraging spam this scheme would create a new incentive.

    BTW the guy who said 'there is no fake spam' is right depending on the definition you use. If you use the definition 'unwanted email sent indiscriminately' then he is pretty much right. If on the other hand you define spam as 'that which our filters decide is spam'... (I kid you not, folk do try to get that type of definition accepted). The exception would be satires like 'make penis fast'.

    There are similar problems with the folks running blacklists, they think that they understand everything there is about spam but don't realize that the systems they set up can be and will be gamed. Every partisan political mailing list of every stripe that has a significant number of readers gets blacklisted from time to time as people sign up for the list in order to be able to report it as spamming.

    Try to explain to either group that there is a problem and they get majorly defensive. You get accused of wanting to help the spammers, etc. etc. When people start getting defensive like that in response to fair questions you are in big trouble.

    The way to deal with spam is to treat it as a security problem. We deal with security problems using access control - authentication and authorization. We need to start with robust authentication mechanisms that hold ISPs responsible for the messages sent from their domain. These need to be accompanied by robust authorization mechanisms that allow recipients to judge whether the sender is honest.

    --
    Looking for an Information Security student project suggestion?
    Try http://dotcrimeManifesto.com/
  27. Re:And now by Gruturo · · Score: 2, Insightful

    A few months ago Paul thought Bayesian filtering was the one true solution. The only problem was that people who have spent years working on the techniques he described never achieved results anywhere close to the ones he claims.

    Your mileage may vary. Mine is excellent, for example. I've been using a Naive Bayesian filter, POPFile, for a while now, and I'm at 99.74% accuracy with 11564 classified messages and 29 errors. (For the record, 15 spams got through, and a few friends' jokes, honestly looking a bit like spam, got filtered out. Not a single work mail got lost).
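
    For anyone checking the arithmetic, here is a small Python sketch using only the two counts quoted above. The function name accuracy_percent is invented for the example.

```python
def accuracy_percent(classified: int, errors: int) -> float:
    """Share of messages classified correctly, as a percentage."""
    return 100.0 * (classified - errors) / classified


if __name__ == "__main__":
    # 11564 classified messages with 29 errors, as reported above.
    print(f"{accuracy_percent(11564, 29):.3f}")
```

    The result comes to roughly 99.749, which matches the quoted accuracy when truncated to two decimals.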

    While I might agree that auto-reacting DDOS filters could turn into a pretty ugly beast when someone really clever finds a way to abuse them, I wouldn't be that critical of Paul Graham's work. He came out with a hell of an idea a few months ago, and this one could be even better with a few safeguards and adjustments in place. I suggest he has a word with bittorrent's Bram Cohen, who might know a thing or two about distributed computing, coordinated network efforts and protocol resistance to tampering and abuse.

    I fully agree about the failure of many antispam efforts: for one, realtime blacklists have been outsmarted and bent against their purpose by brighter spammers with an evil sense of irony. But some techniques do work, and given his track record I'd be inclined to give this guy a chance to show what he's up to.

    And, though I agree that a real, final solution to the problem might involve adoption of a new mail transfer protocol to supplant SMTP, which makes too many assumptions of goodwill, I don't see that coming anytime soon, so we'd better have a look around and see what can be done to improve the current situation.

    --

    Vacuum cleaners suck. Kings rule.
  28. Re:And now by KevMar · · Score: 3, Insightful

    If the spam site gets paid on views, the advertisers are expecting a percentage to click on ads. If every site is visited, but the links on the site are not clicked (or links that do not leave the domain), the click percentage will go down and advertisers will pay the sites even less. Also, the increased bandwidth bill will add cost.

    We would have to strip out any identifying code in the urls to prevent added spam from email validation

    --
    Im a gamer, not a grammer major. This post is full of spelling and grammer mistakes.
  29. Re:And now by Zeinfeld · · Score: 4, Insightful
    >>The message sender only gets five or ten messages created for each spam sent.
    Go back and read the article. It's about http requests, not sending mail.

    Oh, I totally get the fact you are sending out http requests. The fact the message is HTTP rather than SMTP is not relevant as far as I am concerned. The original HTTP spec used the term messages for requests and responses. I really can't remember what we did in the RFC.

    The amplifier effect is just the same, for each message in there could be five messages out. The main advantage to the spammer though is laundering the IP address so that their web site hits appear to come from 10,000 distinct views rather than the same view.

    I don't know where you get this idea. I know plenty of filter hackers who get results so much better than me that I'm kind of embarrassed.

    Getting that sort of result on their own mail is one thing, getting that result on a representative corpus of user emails is a very different matter.

    Geek mail is much easier to spam filter than naive user's mail. They tend to be far more aggressive in the features they use. They are also the targets of the spammers, geeks being a minority. So the vocabulary chosen by spammers tends to be much closer.

    My real concern is not whether a filter is 99.8 or 95% efficient at detecting spam; it's the false positive rate that is the problem. 1% false positives is a big problem, even 0.5% is a serious problem. The other big problem is the sheer cost of CPU cycles. Imagine a room the size of a football field filled with 100 equipment racks. Processing the legitimate mail only requires one of those racks; the rest are for dealing with spam. Each processing step adds cost. Bayesian filtering is only one part of the solution.

    I agree about going after the spammers, but litigation and law enforcement are far more likely to be effective than hackback.

    What we need to do in addition is to change the mail protocols so that we can know that a message that purports to come from a particular source is authentic. At least 50% of the spam sent claims a false sender address. The tricks that spam senders use to hide from litigation are a very robust spamdicator that almost never gives a false positive.

    --
    Looking for an Information Security student project suggestion?
    Try http://dotcrimeManifesto.com/
  30. Re:No by g.zero · · Score: 2, Insightful

    Aren't you forgetting that some people are on a 56k connection? Forcing their browser to download the images would increase the loading time for them. It might not make much difference to those on a DSL connection or better, but when you only get 5k/s it could hurt.

    --
    "Hard work _might_ pay off later, but procrastination _always_ pays off now."
  31. Re:SETI@HOME ? by Pieroxy · · Score: 2, Insightful

    All this is a neat idea, but there are still a couple of unresolved problems:

    1. There is a small company that I dislike. What prevents me from hacking their IP address and sending a shitload of spam in their name?
    2. Automatic or manual retaliation comes down to taking justice into your own hands, which is inherently illegal (at least in the US).

  32. Re:Following links validates your address by rgmoore · · Score: 2, Insightful
    I think an email address or user is the same thing.

    But they're not the same. There are zillions of possible valid email addresses out there, but not every one has an actual recipient available to read the messages. For instance, there are about 200 billion possible usernames that contain exactly 8 letters. If you send messages to every username from aaaaaaaa@aol.com to zzzzzzzz@aol.com, most of them will be bounced or dropped harmlessly because there's no mailbox corresponding to that name. Some of them, though, will be valid usernames and will be sent to the appropriate user (assuming, of course, that AOL doesn't filter them as spam).
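
    As a quick check of that figure, assuming usernames drawn only from the 26 letters a to z (the function name is just illustrative):

```python
def count_usernames(length: int, alphabet_size: int = 26) -> int:
    """Number of distinct strings of the given length over the alphabet."""
    return alphabet_size ** length

print(count_usernames(8))  # 208827064576, i.e. roughly 200 billion
```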

    For a spammer, knowing which of those addresses reach a real recipient and which ones get dropped is valuable information. There are some spammers who try variants of this approach, sending meaningless spams to huge numbers of guessed addresses and hoping to find out which ones are live by waiting for the mail agent to pick up their coded 1x1 gif and show that the recipient exists. If you give each real user a program that autofetches all of the urls in each spam, this will effectively notify the spammer that the address actually has a mailbox attached and somebody is receiving the mail. Far from the effect that Mr. Graham suggests, that spammers would stop sending to those addresses, it would actually alert spammers that the address is real, when silently deleting the message would leave them thinking that it wasn't real.

    Of course a really clever spammer would include two links, one that would normally be fetched automatically (like an image) and one that would only be fetched by a program that mindlessly followed each possible link in the message (like a link with no clickable area). Messages that retrieved the image but not the hidden link would be classified as live, while those that retrieved both would be viewed as automated responses and uninteresting. The problem is that the address harvesters probably don't care. They're harvesting addresses to sell them to somebody else, so they just want the largest possible list of verified email addresses, and don't particularly care whether they're likely to respond.
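
    A rough sketch of how the spammer's side of that two-link trick might classify an address. The URLs, the function name, and the "unconfirmed" label for addresses that fetched nothing are all assumptions for illustration:

```python
def classify_address(fetched: set[str], image_url: str, hidden_url: str) -> str:
    """Classify one recipient from the set of URLs their message caused to be fetched."""
    if hidden_url in fetched:
        # Only a program that follows every link would request the hidden one.
        return "automated"
    if image_url in fetched:
        # A mail client rendered the image but nothing followed the hidden link.
        return "live"
    # Nothing was fetched: bounced, dropped, or silently deleted.
    return "unconfirmed"

img = "http://spam.example/pixel.gif"
trap = "http://spam.example/hidden"
print(classify_address({img}, img, trap))
print(classify_address({img, trap}, img, trap))
print(classify_address(set(), img, trap))
```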

    --

    There's no point in questioning authority if you aren't going to listen to the answers.

  33. Automatic attacks are a bad idea by cait56 · · Score: 2, Insightful

    Having a "filter fight back" is a polite way of saying that you have trained attack software.

    Software has bugs. If you have trained attack software, it will have bugs. Which means eventually it will attack an innocent site.

    Ultimately this is a bad idea for the same reasons that automated home defenses are a bad idea. It's very easy to say that the intruder has earned the automated response, but then you get to the nitty-gritty issue of whether your automated system can distinguish between a burglar and a fireman.

    The same issues apply in identifying Spam. How will your software, which will make mistakes, distinguish between the real source of Spam and a clever header that is making it look like someone else is the source? I don't care how good your algorithm is. It's coded by humans, so it will make mistakes. Unlike a human making a mistake manually, however, it will pounce at very high speeds.

  34. NOSPAM@HOME ! by axxackall · · Score: 2, Insightful
    Let me think:

    There is a small company that I dislike. What prevents me from hacking their IP address and sending a shitload of spam in their name?

    In my opinion it is possible to have a statistical analysis that would be capable of distinguishing it, unless you organize a really big attack. On the other hand, a central (even if it's distributed) authority may help to gather witness evidence against your unfair anti-competitive practice, which would be rather difficult if such a NOSPAM@HOME project did not exist.

    Automatic or manual retaliation comes down to taking justice into your own hands, which is inherently illegal (at least in the US).

    What makes it illegal? It is a statistical research project. Volunteers help to gather a statistical database of originally filtered emails. The central (and distributed) authority asks volunteers to help gather the rest of the information, namely the responsiveness of a seller's web site, based on a pre-estimated schedule. BTW, the result of the statistical analysis can be peacefully used to advise the seller's web site admin on how to improve the site's responsiveness. Most likely the only advice so far would be: "shut your spam down and your site traffic will come back to normal".

    I am actually ready to stand up in court and say: "Well, the targeted company sends its marketing materials with only a 5% chance that the reader wants to read them. We study the responsiveness of the targeted site by creating traffic to the site where only 5% of the actual requests are wanted by the business of the site's owners. How are our 5% different from their 5%? If what we do is illegal then what they do is illegal as well. But what we are doing is non-profit research which only a very small group of people may dislike, while what they are doing is a for-profit campaign which millions of innocent people dislike."

    --

    Less is more !