Slashdot Mirror


Smart Spam Filtering For Forums and Blogs?

phorm writes "While filtering for spam on email and other related mediums seems to be fairly productive, there is a growing issue with spam on forums, message-boards, blogs, and other such sites. In many cases, sites use prevention methods such as captchas or question-answer values to try and restrict input to human-only visitors. However, even with such safeguards (and especially with most forms of captcha being cracked fairly often these days) it seems that spammers are becoming an increasing nuisance in this regard. While searching for plugins or extensions to spamassassin etc I have had little luck finding anything not tied into the email framework. Google searches for PHP-based spam filtering tend to come up with mostly commercial and/or more email-related filters. Does anyone know of a good system for filtering spam in general messages? Preferably such a system would be FOSS, and something with a daemon component (accessible by port or socket) to offer quick response-times."

183 comments

  1. Akismet by seifried · · Score: 4, Informative

    Akismet

    1. Re:Akismet by VennData · · Score: 1

      Type this "YouZer Phrendlee" to continue

    2. Re:Akismet by Anonymous Coward · · Score: 1, Informative

      Hit Freshmeat for "bayesian" and PHP; that's the statistical method for calculating the probability that a given post is similar to a body of spam examples.

      There's a couple of PHP-based ones there, both open source.
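For readers who want to see what the statistical method mentioned above looks like, here is a minimal sketch of a two-class naive Bayes filter. It is written in Python rather than PHP, and the class name, tokenizer, and add-one smoothing are illustrative assumptions; it is not the code of any of the Freshmeat projects.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesFilter:
    """Toy spam/ham naive Bayes classifier with add-one smoothing (hypothetical)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one example post; label must be "spam" or "ham"."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that the post resembles the spam examples."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Smoothed log prior, so an untrained class never yields log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Normalize the two log scores into a probability.
        top = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - top)
        ham = math.exp(log_scores["ham"] - top)
        return spam / (spam + ham)
```

A filter like this is only as good as its training data, so it needs a steady supply of correctly labelled spam and legitimate posts from the site it protects.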

    3. Re:Akismet by Anonymous Coward · · Score: 1, Informative

      according to the wiki article http://en.wikipedia.org/wiki/Akismet if you say something bad about Matt Mullenweg you will be blacklisted.

      I fear being blacklisted hence the anon post :>

  2. I always thought by davebarnes · · Score: 1, Informative

    Re-Captcha was fairly effective and easy to install and useful.

    --
    Dave Barnes 9 breweries within walking distance of my house
    1. Re:I always thought by Anonymous Coward · · Score: 0

      I always thought Recaptcha was a captcha, not a spam filter. The challenge is to filter spam _without_ bothering the mod nor the legitimate commenter.

    2. Re:I always thought by GarrettK18 · · Score: 1

      Does Re-Captcha have audio facilities for blind or visually/reading impaired people? I ask this as a blind person who generally hates captchas, especially ones where the audio version makes you install QuickTime (ugh!).

    3. Re:I always thought by blonkm · · Score: 1

      yup, it does

    4. Re:I always thought by zieroh · · Score: 1

      I run a medium-sized forum, and reCAPTCHA has taken spam postings from a many-times-daily occurrence to almost zero. I now get maybe one spam posting per quarter.

      That's pretty damn effective.

      --
      People who say "sheeple" have about as much sophistication as an AOL user, and in fact are probably actually AOL users.
  3. D.I.Y. by Zsub · · Score: 2, Insightful

    Or am I misunderstanding what FOSS really is about?

    1. Re:D.I.Y. by Korin43 · · Score: 3, Informative

      Yes. The point of FOSS is that one person can do it and no one else needs to do it again unless they want to make it better. This guy is looking for a solution, and the solution already exists. He would be wasting his time if he did it himself.

    2. Re:D.I.Y. by Trahloc · · Score: 2, Insightful

      Not everyone is a programmer, some of us assist in less direct ways.

      --
      The Goal: A long simple life filled with many complex toys.
    3. Re:D.I.Y. by dubl-u · · Score: 1

      Or am I misunderstanding what FOSS really is about?

      If your first instinct is to build it yourself, then yes, you are kind of missing what FOSS is really about. To jointly improve shared solutions, you first have to find the solutions that are already out there.

    4. Re:D.I.Y. by truthsearch · · Score: 1

      For one large site I took an open source Bayesian filter and customized it. This site was large enough to get spam that's only posted there, so a DIY Bayesian filter worked extremely well. They have staff to remove spam and illegal content, so the filter simply aided the staff, who were able to train the filter very quickly.

      However, this solution would be useless without enough content and without people properly training the filter. If you get generic spam in a common scenario then a more generic solution might suffice.

    5. Re:D.I.Y. by Firehed · · Score: 1

      It's not even a matter of programming skills. If you look at why spam never gets through to your Gmail inbox, it's because Google has a database of billions if not trillions of messages to run analysis on. When you have a twelve-digit sample size to work with, matching can be done much more accurately than with a couple hundred messages. It's pretty easy to slap together a system where you manually flag messages as good or bad. Being able to call $akismet->isCommentSpam() and have everything done for you programmatically takes not only a huge amount of additional coding skills, but a huge sample set to be even close to accurate.

      So indeed, saying that "some of us assist in less direct ways" couldn't be more accurate with spam filtering. Merely flagging incorrectly-marked messages helps everyone.

      --
      How are sites slashdotted when nobody reads TFAs?
    6. Re:D.I.Y. by mysidia · · Score: 2, Interesting

      This suggests a solution... Instead of using the web for comment submission: use SMTP.

      A user who wants to submit a comment answers a captcha, and clicks a "submit" button.

      An e-mail address is displayed for them to send their comment to.

      They e-mail their comment, which goes to somemailboxname+blahblah@gmail.google.com

      If Google doesn't consider it spam, then the message gets forwarded to a secret mailbox on the blog server.

      A script running on the blog server parses the message, determines what the comment is, and which article to append to.

      Appends the comment.
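The parsing step on the blog server could look roughly like the sketch below, using Python's standard `email` package. It assumes, purely for illustration, that the article ID travels in the "+tag" part of the recipient address; the function name and that convention are not from the post.

```python
from email import message_from_string
from email.utils import parseaddr


def parse_comment_email(raw_message):
    """Return (article_id, author_address, comment_text) from a raw message.

    Assumption: the article ID is the "+tag" part of the recipient address.
    """
    msg = message_from_string(raw_message)
    _, recipient = parseaddr(msg.get("To", ""))
    local_part = recipient.split("@", 1)[0]
    if "+" not in local_part:
        raise ValueError("recipient address carries no article tag")
    article_id = local_part.split("+", 1)[1]
    _, author = parseaddr(msg.get("From", ""))

    # Use the first text/plain part as the comment body.
    if msg.is_multipart():
        parts = [p for p in msg.walk() if p.get_content_type() == "text/plain"]
        if not parts:
            raise ValueError("no plain-text part found")
        payload = parts[0].get_payload(decode=True)
        charset = parts[0].get_content_charset() or "utf-8"
    else:
        payload = msg.get_payload(decode=True)
        charset = msg.get_content_charset() or "utf-8"
    return article_id, author, payload.decode(charset, errors="replace").strip()
```

The returned text still has to be escaped before it is appended to a page, since a mail body is untrusted input just like a web form.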

  4. the solution is here .. by rs232 · · Score: 0

    "Does anyone know of a good system for filtering spam in general messages?"

    Yea, design an email system that is immune to spam and make the ISPs responsible for blocking spam, phishing and such attacks ..

    --
    davecb5620@gmail.com
    1. Re:the solution is here .. by D+Ninja · · Score: 3, Funny

      Yea, design an email system that is immune to spam and make the ISPs responsible for blocking spam, phishing and such attacks ..

      So, I know people sometimes DRTFA. It happens. Life is busy. But, you know, it's always good to RTFS because it has fancy little tidbits of information such as:

      While searching for plugins or extensions to spamassassin etc I have had little luck finding anything not tied into the email framework.

    2. Re:the solution is here .. by Firehed · · Score: 1

      It's honestly not that difficult to do - the real problem would be getting people to switch over. Require all incoming messages to have a micropayment attached (even a tenth of a cent). Ignore messages without the payment. If you read the message and don't flag it as spam, the payment is refunded. If you mark it as spam, you keep the money.

      It doesn't stop people from sending spam out, but it kills all of the cost-effectiveness. And to preempt the "but businesses have legitimate bulk mailings!" - places sending out messages to 1m people spend a good chunk of money just designing the newsletter for salaries and everything else. $1k to send a message to a million people, 99.9% of whom will not mark the message as spam and thus refund the micropayment makes it a complete non-issue.

      Good luck over-riding two of the most-used protocols on the internet to put it in place (not to mention creating an effective system for micropayments), but at least at a conceptual level it would remove the cost-effectiveness of spam so it would go away.

      I think the same principle could go a long way for Youtube comments too. They could finally have enough money coming in to pay for bandwidth :)

      --
      How are sites slashdotted when nobody reads TFAs?
    3. Re:the solution is here .. by Anonymous Coward · · Score: 0

      Except that since botnets are responsible for a tremendous proportion of spam, the spammers aren't going to be the ones paying anyway.

    4. Re:the solution is here .. by AnyoneEB · · Score: 3, Insightful

      This is not exactly a new proposal, and it has been shot down on Slashdot before. One major problem is that a lot of spam is through botnets and the spammers would not get charged the e-mail fees, people with zombied computers would. I suppose this would make people with zombied computers notice, but why would they agree to sign up for such a service in the first place? Also, tying e-mail to payment means that the payment is probably traceable to a real person, which a lot of people do not want.

      --
      Centralization breaks the internet.
    5. Re:the solution is here .. by pushf+popf · · Score: 3, Interesting

      One major problem is that a lot of spam is through botnets and the spammers would not get charged the e-mail fees, people with zombied computers would.

      That's a non-issue.

      Want to block a ton of spam? Reject any inbound smtp connections that have no reverse DNS record, then use regular expressions on those that do to refuse connections from dynamic/home/dsl/dial_up/etc. (I tried to post the regexes, but slashdot whined about " Lameness filter encountered. Post aborted!")

      Stop talking to dynamic IPs and about 90% of the world's spam will immediately vanish.
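Since the regexes themselves never made it into the post, the patterns in this sketch are hypothetical stand-ins. It shows the two checks described above: refuse a client with no reverse DNS record, and refuse one whose hostname looks like a dynamic or home connection.

```python
import re
import socket

# Hypothetical patterns; the poster's actual regexes were never shown.
DYNAMIC_HOST_PATTERNS = [
    re.compile(
        r"(^|[.-])(dyn|dynamic|dhcp|dsl|adsl|cable|dialup|dial-up|ppp|pool)([.-]|\d)",
        re.IGNORECASE,
    ),
    # An IPv4 address embedded in the hostname, e.g. 203-0-113-7.example.net
    re.compile(r"\d{1,3}[.-]\d{1,3}[.-]\d{1,3}[.-]\d{1,3}"),
]


def reverse_dns(ip_address):
    """Return the PTR hostname for an IP address, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip_address)[0]
    except OSError:
        return None


def should_reject(hostname):
    """Reject clients with no reverse DNS or a dynamic-looking hostname."""
    if not hostname:
        return True
    return any(p.search(hostname) for p in DYNAMIC_HOST_PATTERNS)
```

In practice this would be called as `should_reject(reverse_dns(client_ip))` at connection time. Patterns this broad will also catch some legitimate small mail servers, so they need tuning against real logs.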

    6. Re:the solution is here .. by phr1 · · Score: 1

      There is a very obvious anonymous payment system run by the US Treasury and its counterpart organizations in other countries. At registration time ask for the serial number of a one dollar bill, and require that the bill be sent by snail mail to confirm that it is real, and to help with site expenses. No names or return addresses are required and no spammer will go anywhere near that.

    7. Re:the solution is here .. by Kalriath · · Score: 1

      Not every country HAS a $1 bill. In New Zealand, the smallest serialised denomination is a $5 note (made of plastic) - coins have no serial and the $1 and $2 are both coins. Your idea would fail, as most people would take offense to having to pay $5 to register for a forum.

      Also, it's forbidden to send cash through the mail.

      --
      For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
    8. Re:the solution is here .. by johanatan · · Score: 1

      Why do we keep hearing this micropayment idea? If you are going to go modifying protocols, then the protocol can handle it without micropayments. There's no need to give corporations or the government any more reason to tax us.

    9. Re:the solution is here .. by Chris+Pimlott · · Score: 1

      If people aren't blocking dynamic IPs now, why would they start doing it then?

    10. Re:the solution is here .. by tobiasly · · Score: 1

      then use regular expressions on those that do to refuse connections from dynamic/home/dsl/dial_up/etc. (I tried to post the regexes, but slashdot whined about " Lameness filter encountered. Post aborted!") Stop talking to dynamic IPs and about 90% of the world's spam will immediately vanish.

      Or, rather than worry about keeping your regexes up-to-date, use the Spamhaus Policy Block List (PBL). It contains *only* dynamic and other "end-user" IPs. The PBL is also contained in their all-encompassing Zen blocklist, which is what I use, but those who don't like automated RBLs can still get the benefit of blocking dynamic IPs by using just the PBL. Adding it to most MTAs these days is a very simple one-line config change.
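For anyone checking addresses outside an MTA, a DNS blocklist lookup is just a DNS query: reverse the IPv4 octets, append the list's zone, and see whether the name resolves. The sketch below assumes the `pbl.spamhaus.org` and `zen.spamhaus.org` zones and treats any `127.0.0.x` answer as a listing; the helper names are made up for illustration, and Spamhaus's usage terms and the behaviour of public resolvers should be checked before relying on it.

```python
import ipaddress
import socket


def dnsbl_query_name(ip_address, zone="pbl.spamhaus.org"):
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip_address)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip_address, zone="pbl.spamhaus.org"):
    """Return True if the blocklist answers with a 127.0.0.x listing code."""
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ip_address, zone))
    except OSError:
        return False  # name does not resolve (not listed) or the lookup failed
    return answer.startswith("127.0.0.")
```

Treating a failed lookup as "not listed" keeps mail or comments flowing when the resolver is down, at the cost of letting some spam through during an outage.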

    11. Re:the solution is here .. by pushf+popf · · Score: 1

      I like to keep the load down on spamhaus, so anything I can reject locally is a Good Thing. In any event, it's not a huge deal to maintain, I've got about two dozen lines that I haven't changed in over a year, and they reject an absolute ton of spam from home-zombies.

    12. Re:the solution is here .. by Mozk · · Score: 1

      Also, it's forbidden to send cash through the mail.

      Sending cash through USPS is not illegal.

      --
      No existe.
    13. Re:the solution is here .. by vux984 · · Score: 1

      Not every country HAS a $1 bill.

      So you walk up to your nearest bank or travel agency and buy a $1 US bill.

      Your idea would fail, as most people would take offense to having to pay $5 to register for a forum.

      I think most people would take offense at having to pay $1 to register for most forums -- especially to leave a message on some random blog they drove by. It would work for the really big forums... maybe... I say maybe because it might actually be worth it for the spammer to pay a buck to spam on it.

    14. Re:the solution is here .. by Kalriath · · Score: 1

      It sure isn't recommended though. And considering it's not insured, it'd be a dumb move anyway.

      Also, I'm sure that the USPS doesn't provide my mail delivery, at least. Oh wait! They don't!

      --
      For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
    15. Re:the solution is here .. by Kalriath · · Score: 1

      So you walk up to your nearest bank or travel agency and buy a $1 US bill

      No, you frigging don't. Considering purchasing a single US dollar is impossible since banks won't exchange denominations that small, and the fees alone would take purchasing a US dollar bill to around $10 in cost...

      Face it, it's a stupid idea.

      --
      For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
    16. Re:the solution is here .. by Mozk · · Score: 1

      I just figured you were in the United States since you said "In New Zealand" rather than "Here in New Zealand".

      --
      No existe.
  5. these might be overkill ... by yossarianuk · · Score: 0, Offtopic

    Two apps come to mind, provided you are not running on shared hosting. They may be overkill, but both have spam rules and can help protect web apps from hackers.

    modsecurity - http://www.modsecurity.org/
    snort - www.snort.org

    Both are free and GPL. Snort does have a paid subscription service; however, if you just register for free you still get pretty up-to-date rules.

    1. Re:these might be overkill ... by seifried · · Score: 1

      Neither of these really address spam. Wrong tool set entirely.

  6. Service.. by bigattichouse · · Score: 1

    I've been thinking about modifying my VSDB software to do something like this...

    --
    meh
    1. Re:Service.. by Trahloc · · Score: 1

      Yo, 404 on the link provided. Think it was just a .html typo. Found this on your site: http://www.bigattichouse.com/vectorspace.php

      --
      The Goal: A long simple life filled with many complex toys.
    2. Re:Service.. by bigattichouse · · Score: 1

      ger - thanks Trahloc - I added to .htaccess to redirect to it. Too much nog.

      --
      meh
  7. Various ideas by mlts · · Score: 1

    There are a number of things you can try:

    For a small site I helped set up, they went to complete SSL and client certificates, where users had to obtain a cert from Verisign or Comodo before they would get access. This stopped spam, and one can obtain a client cert for free or a low cost. However, this can't be done for most forums or blogs.

    For larger sites, a lot have ended up moving to an approval type of system where a human approves the creation of the user, then a limit on how many posts a first time user could do, and how many features the user can access.

    Finally, one site just went to a paid subscriber system where for any access at all, people had to pay $5 to $10 via PayPal. This at least forced spammers to pony up cash (or commit credit card fraud) before they would get access.

    1. Re:Various ideas by WhatAmIDoingHere · · Score: 1

      And when you get the $5 - $10 in PayPal from a scammer using someone else's credit card number, your PayPal account usually gets suspended if not flat out closed.

      --
      Not a Twitter sockpuppet... but I wish I was.
  8. Second that! by _merlin · · Score: 5, Informative

    Akismet is the best thing for blog spam prevention ever. I can't believe you've never stumbled across it before. It uses statistical analysis to identify spam, and the more people use it, the better it gets. If everyone used it, the blog spammers would just disappear because their attacks would be completely ineffective.

    1. Re:Second that! by seifried · · Score: 4, Informative

      Add to which it has an API/etc. It really is what you should be using.

    2. Re:Second that! by Anonymous Coward · · Score: 0

      Why is there even an article on the front page of /. about this? lame.

    3. Re:Second that! by Indefinite,+Ephemera · · Score: 5, Interesting

      The difficulty in evaluating Akismet - I speak not as a user but as someone who ended up apparently blacklisted and having to try their appeals system - is that everyone I see praising it is by definition the kind of person who pays attention to the filter and therefore will train it effectively. Since your average wordpress.com user more likely lets false positives pile up, I'd love to know how effective it is for people who don't wonder how effective it is.

    4. Re:Second that! by _merlin · · Score: 4, Informative

      I've used it for a few years now. In that time, it has caught tens of thousands of spam comments. It has missed about ten spam comments (i.e. allowed them through). It has misidentified two legitimate comments as spam. Yes, I realise I'm keeping an eye on it, and someone who doesn't may not notice that it's causing problems for them. But the stats are pretty good in my case. I'm aware of the allegations of corruption and using it to gag people, but that hasn't affected me yet.

    5. Re:Second that! by sfbanutt · · Score: 1

      Hmm.. I've run akismet for a couple of years now and have never had a false positive. It's missed a few spams, but never marked a legit post as spam.

      --
      I've wrestled with reality for 35 years and I'm happy to say, I finally won out - Elwood P. Dowd
    6. Re:Second that! by stoolpigeon · · Score: 1

      I have - once. Not bad.

      --
      It's hard to believe that's how Micronians are made. Why don't we see it right now by having you both kiss one another?
    7. Re:Second that! by Anonymous Coward · · Score: 0

      I've used it for almost a year. Caught just shy of 1300 spam messages. About 5 false positives and also around 5 spams let through (all of them in the same week - my guess is that it was something new and it took the filter a few days to catch up).

    8. Re:Second that! by WebmasterNeal · · Score: 1

      Doesn't look like it supports Classic ASP. That's sort of a show stopper for me.

      --
      "During My Service In The United States Congress, I Took The Initiative In Creating The Internet." -Al Gore
    9. Re:Second that! by Giloo · · Score: 1

      Though it certainly looks effective, what makes it powerful is also a drawback, you have to send all the content you want to be checked to them. Maybe that's because my tinfoil hat is getting too big, but in many cases, I'd like to avoid spam AND keep the content for myself.

      Of course, here the guy is asking for forum/blogs which are in main case submissions for publicly available content, so I guess it would fit.

    10. Re:Second that! by sfbanutt · · Score: 4, Informative

      I just noticed a handy Akismet stats link in the latest version. I've been running Akismet since October 2006, in that time there have been 26,575 comments on my blog, of which 26,302 were spam(!). It missed 25 spam comments that had to be manually moderated and passed 273 legit comments. There have been no false positives. Personally, I think that's a pretty darn good record.
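The figures above can be turned into rates with a few lines of arithmetic. This sketch assumes the 25 missed spam comments are counted inside the 26,302 spam total, which the post does not state explicitly.

```python
total_comments = 26_575
spam_comments = 26_302
missed_spam = 25        # assumed to be included in spam_comments
legit_comments = 273
false_positives = 0

# The spam and legitimate counts add up to the reported total.
assert spam_comments + legit_comments == total_comments

spam_share = spam_comments / total_comments
catch_rate = (spam_comments - missed_spam) / spam_comments
false_positive_rate = false_positives / legit_comments

print(f"spam share of all comments: {spam_share:.2%}")
print(f"spam caught automatically:  {catch_rate:.2%}")
print(f"legit comments misflagged:  {false_positive_rate:.2%}")
```

Under that assumption, roughly 99 of every 100 comments were spam, and about 1 spam in 1,000 got past the filter.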

      --
      I've wrestled with reality for 35 years and I'm happy to say, I finally won out - Elwood P. Dowd
    11. Re:Second that! by Ihmhi · · Score: 1

      How does it handle the problem of "CAPTCHA farms" in India and China where they get $2 per 1,000 or so spam messages they post by hand?

    12. Re:Second that! by WhatAmIDoingHere · · Score: 1

      That comment about the "CAPTCHA farms" reminds me of a recent experience I've had with the official Steam forums for GTA4. I was trying to search for posts by people who were having the same problem as I was, but the text I had to enter was lime green on a bright background. I couldn't figure it out for the life of me. And getting a new one only made it harder to read.

      My question is: If I can't read them, am i a robot?

      --
      Not a Twitter sockpuppet... but I wish I was.
    13. Re:Second that! by Hittman · · Score: 1

      In the two years I've been using it it has caught over 10k spams. It has missed about a half dozen, and had about a dozen false positives.

      For a while it was missing spam written in Cyrillic. I solved that by adding a couple of Cyrillic vowels to banned word list.

    14. Re:Second that! by kv9 · · Score: 2, Funny

      My question is: If I can't read them, am i a robot?

      well, let's see. would you injure a human being or, through inaction, allow a human being to come to harm?

    15. Re:Second that! by __aakqkc2748 · · Score: 1

      My question is: If I can't read them, am i a robot?

      Good question, I've wondered about myself once or twice, as I sometimes cannot prove that I am Human. I'm sure glad I got into slashdot before captchas came to town.

    16. Re:Second that! by publiclurker · · Score: 1

      Does this definition of human include spammers?

    17. Re:Second that! by WhatAmIDoingHere · · Score: 1

      I feel a strong drive to slaughter as many meatbags as possible. Is that a bad sign?

      --
      Not a Twitter sockpuppet... but I wish I was.
  9. mollom not so free by jeffstar · · Score: 2, Interesting

    mollom

    i discovered this one through drupal. I thought it was completely free but apparently for high traffic sites it isn't.

    I think all your user generated content is sent to them and checked for spaminess against the other submissions they are receiving and they give you back a rating.

    1. Re:mollom not so free by yelvington · · Score: 1

      I use Mollom because of its excellent integration with Drupal. It's free for up to 100 legitimate posts per day, 30 euros/month after that.

      It works very well for stopping spambots without annoying real humans (which a plain captcha will do).

      Human spammers still slip through, but when you delete their work, it's fed back to the Mollom database, protecting you and others from repeats.

  10. DIY or it will be broken by loony · · Score: 5, Interesting

    Any method you use can be broken. Your only chance is to reduce the likelihood that your site is worth the effort.

    Basically, if you use a common solution - no matter whether FOSS or commercial - then there will be a thousand other sites that use it too. This attracts attackers because they know when they hack it once, they can re-use it.

    However, if you handcode something, no matter how primitive, it likely lasts a lot longer because nobody bothers hacking into your site...

    Of course that doesn't work if you have a large site like myspace - there, a single site is worth the effort by itself.

    Anyway - two things usually work: a really fast-moving animated gif, and silly challenges where you ask people to identify items.
    I help out with a site that randomly takes five pictures of cats and dogs and it asks you to identify which of the images contains the highest number of kittens... We barely ever get spam through - and that with almost 20K attempted submissions by non-humans a day makes us pretty happy

    Peter.

    1. Re:DIY or it will be broken by Sir_Lewk · · Score: 1

      Sounds like security through obscurity to me, if someone actually tries to target *your site* (which will happen if it's popular enough) then chances are it'll be broken in no time.

      --
      "linux is just DOS with a UNIX like syntax" -- Galactic Dominator (944134)
    2. Re:DIY or it will be broken by dattaway · · Score: 4, Interesting

      However, if you handcode something, no matter how primitive, it likely lasts a lot longer because nobody bothers hacking into your site...

      Simply renaming the .php files worked 100% for me.

    3. Re:DIY or it will be broken by boyter · · Score: 1

      That's what I did. I added an extra field which you need to enter the word "fatty" into. I went from 200 spam comments a day to 0.

    4. Re:DIY or it will be broken by martin-boundary · · Score: 1

      Any method you use can be broken. Your only chance is to reduce the likelihood that your site is worth the effort.

      That's only one approach. The other approach is to increase the response time dramatically: once you've been spammed, if you can clean up very quickly and reconfigure to prevent similar attacks in future, then visitors are unlikely to notice anything. This is the key advantage of statistical spam filters, as they make it relatively painless to respond and reconfigure to handle whole classes of messages, unlike the handcoding approach which is much more work intensive.

      (does not apply to extreme high volume sites, but if you've got one of those and are looking to slashdot for a solution, then you've got bigger problems).

    5. Re:DIY or it will be broken by Anonymous Coward · · Score: 0

      I help out with a site that randomly takes five pictures of cats and dogs and it asks you to identify which of the images contains the highest number of kittens... We barely ever get spam through - and that with almost 20K attempted submissions by non-humans a day makes us pretty happy

      That would mean a 20% chance of guessing the answer? Immunity by diversity won't really work because most modern spammers quickly make templates for individual sites. I have seen spambots barf out code in the comments occasionally and the code seems site-specific, as if generated by recording a human doing it first. Is answering a multiple-choice question really enough?

      -swedish coward

    6. Re:DIY or it will be broken by Anonymous Coward · · Score: 0

      Amen to this. I run a wiki site that was being bombarded with huge amounts of link-spam and other garbage. I solved the problem by making just a few very minor modifications to the wiki software itself and the spam problem disappeared entirely (even though the server logs show that the site is still under constant, unsuccessful, attack). And users can still use the site without captchas or other security measures.

      It wouldn't be hard for someone who was dedicated to break the scheme I came up with, but it hasn't happened in the last two years. And no, I'm not telling anyone what I changed in the software. Sorry.

    7. Re:DIY or it will be broken by Anonymous Coward · · Score: 0

      What do you mean?

    8. Re:DIY or it will be broken by edmazur · · Score: 2, Interesting

      I'll second this.

      My friend runs a smaller site and was having a problem with forum spam. He edited the registration page to include a checkbox that said something along the lines of "check this box if you are not a bot". His problems went away instantly. Obviously this does not scale well, but for smaller sites being targeted randomly by automatic spam crawlers, it appears to be very effective.

    9. Re:DIY or it will be broken by KermodeBear · · Score: 4, Informative

      I have a very simple, small site that I run that allows small comments. It was fine until the spam bots found it. Anyways, I just added a simple question about the background color of the site, which must be correct in order for the comment to be posted. I haven't had a single issue since (except for the occasional troll, but what can you do about that).

      The nice thing about something like this, a handmade thing, is that the spammers won't bother 'breaking' it. As the parent mentions, the spammers are attacking the common solutions - so a little home grown bit will work wonders.

      --
      Love sees no species.
    10. Re:DIY or it will be broken by KermodeBear · · Score: 1

      Which is true - a specialized attack will succeed - but for smaller, personal sites, the spammers won't bother.

      --
      Love sees no species.
    11. Re:DIY or it will be broken by Kalriath · · Score: 1

      Anyways, I just added a simple question about the background color of the site, which must be correct in order for the comment to be posted. I haven't had a single issue since (except for the occasional troll, but what can you do about that)

      Oh, and those pesky colourblind people. But screw them, eh?

      Anything based on something a human may not be able to solve sucks.

      --
      For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
    12. Re:DIY or it will be broken by KermodeBear · · Score: 1

      Funny that you mention it, but I am red-green color blind myself.

      Perhaps the background of the site is white - which everyone should be able to see, hm? (o;

      --
      Love sees no species.
    13. Re:DIY or it will be broken by Daengbo · · Score: 1

      Arithmetic always seems to work fairly well.

    14. Re:DIY or it will be broken by dargaud · · Score: 1

      I did that as well, left a hidden empty field that would flag as spam if filled, but bots seem to not fill it: it hasn't changed a thing.

      --
      Non-Linux Penguins ?
    15. Re:DIY or it will be broken by the_digitalmouse · · Score: 0

      Simply renaming the .php files worked 100% for me.

      While it's not perfect either (if the attacker is smart enough he'll just redo his code to include the changed .php extension), I do agree it can be *very* effective. I've set up several webservers where Apache was configured to interpret other extensions as php files. It confuses people at first when they see 'index.jimm' as a php file, for example.

      --
      http://about.me/jimm.pratt
    16. Re:DIY or it will be broken by apoc.famine · · Score: 1

      I did the same thing for our small, semi-private forums awhile back. I added about six questions on either the subject matter of the site or the colors on the page, or how to best dispatch spammers. Our spammers immediately went to 0, since it would actually take a tiny bit of human interaction to create a login.

      --
      Velociraptor = Distiraptor / Timeraptor
    17. Re:DIY or it will be broken by Kalriath · · Score: 1

      One would hope. I can't imagine any instance of white colour blindness.

      --
      For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
    18. Re:DIY or it will be broken by Kalriath · · Score: 1

      True, but they may not necessarily know what the actual colour they're seeing is. Traffic lights are easy - even if they can't tell the difference between the three lights, they can tell which light is lit, and from there know what that particular light means. I actually know a colour blind person, and interestingly enough even though they're allowed to drive, they aren't allowed to be police officers, or electricians and most people don't exactly want them as painters.

      --
      For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
  11. require a bit of human intuition by cpearson · · Score: 1

    Forums and blogs are susceptible to post spam because most are large opensource or commercial scripts. The fact that there are thousands of identical scripts makes them a prime target for spammers. The trick to prevent forum spam is to confuse the spam bots. Most all bots will fail to register a user if given something unexpected. For forums I administer, I wrote a vbulletin mod that requires a bit of human intuition to solve (not much mind you), such as having a text input field that must remain blank. This simple measure has prevented nearly 99% of forum post spam.

    --
    Windows Vista Help Forum
  12. 4 Tests Stopped 30,000 Comments For Me by WebmasterNeal · · Score: 5, Interesting

    I have a series of 4 tests to block spam on my website. So far it has stopped over 30,000 attempts in the last year.

    Test one is, does the last name = the first name. For some reason almost all spammers do this.

    Second, do they use a keyword from a list of about 15 words.

    Third, do they fill out a hidden inputbox? This is sort of the reverse captcha.

    Finally, do they use more than 4 "http" in a post? Almost all comment spam is an SEO effort to increase their pagerank.
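
    The four tests above could be sketched roughly as follows. This is only an illustration, not the poster's actual code: the field names (first_name, last_name, comment, website) and the keyword list are assumptions, since the post does not give them.

```python
import re

# Hypothetical keyword list; the post mentions about 15 words but does not list them.
SPAM_KEYWORDS = {"viagra", "casino", "payday"}
MAX_LINKS = 4  # the post rejects comments with more than 4 occurrences of "http"


def failed_tests(form):
    """Return the names of the tests a submitted form fails (empty list = accept)."""
    failures = []
    first = form.get("first_name", "").strip().lower()
    last = form.get("last_name", "").strip().lower()
    body = form.get("comment", "")

    # Test 1: first name equals last name.
    if first and first == last:
        failures.append("same_name")

    # Test 2: the body contains a listed keyword (whole word, case-insensitive).
    words = set(re.findall(r"[a-z0-9]+", body.lower()))
    if words & SPAM_KEYWORDS:
        failures.append("keyword")

    # Test 3: the hidden input was filled in (the name "website" is an assumption).
    if form.get("website", "").strip():
        failures.append("hidden_field")

    # Test 4: more than MAX_LINKS occurrences of "http".
    if body.lower().count("http") > MAX_LINKS:
        failures.append("too_many_links")

    return failures
```

    Returning the list of failed tests rather than a plain yes/no makes it easy to log which rule is doing the work.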

    --
    "During My Service In The United States Congress, I Took The Initiative In Creating The Internet." -Al Gore
    1. Re:4 Tests Stopped 30,000 Comments For Me by WebmasterNeal · · Score: 2, Insightful

      Oh I also forgot, if you have a static URL that your form posts to, it is a good idea to rename that page every now and then, especially if it is getting a tremendous amount of spam. Also you can do a check to see if the referring URL is on your own domain as a lot of spammers are posting from a copied version of your form.
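
      The referring-URL check might look something like this. It is a sketch only: the function name and the own_host parameter are made up for illustration, and since some browsers and proxies strip the Referer header, a missing value is probably better treated as a signal than as proof.

```python
from urllib.parse import urlparse


def referer_is_local(referer, own_host):
    """Return True if the Referer header points at a page on our own host."""
    if not referer:
        return False  # header missing or empty
    host = urlparse(referer).hostname  # hostname is lower-cased by urlparse
    return host is not None and host == own_host.lower()
```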

      --
      "During My Service In The United States Congress, I Took The Initiative In Creating The Internet." -Al Gore
    2. Re:4 Tests Stopped 30,000 Comments For Me by Magic5Ball · · Score: 5, Interesting

      Background: One of my sites is a custom job which kills a spam comment every 3 seconds or so, and has done so consistently for the past four years.

      OP's suggestions are very good, especially limiting the number of 'http's. We've given up on the keyword lists since they are costly to maintain and aren't as effective as some other methods.

      Currently, the most effective kill rules for us are:
      1) We write the client's IP address, the ID of the thing being commented on, and random stuff to a cookie from the legitimate page from which the client clicked the "post reply" link. If the IP address doesn't match, or if the ID is missing, or if the parameter for the random junk isn't in the cookie, then fail. This rule traps non-browser scripts and limits spam throughput, but does not affect humans.

      2) The client's IP address is a hidden form variable. If that IP address does not match the IP from which the POST originates, fail. This rule traps the browser-based scripts, and operators who proxy through botnets for testing.

      These two rules catch all but about two spam-like messages a month (spam operator not using proxies to test their scripts), and have mislabeled two legitimate messages (from a local ISP's poorly-configured proxy) in the last three years.

      There are other things at play, such as salted hashes of the above, and some other heuristics on hidden and unused fields which sort and categorise the spam for our own research (including point of origin, topic, etc.). One finding is that IP/geographic blacklists are ineffective. I'll post new findings and methods in another two years.
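
      For anyone wanting to try rules 1 and 2, a minimal sketch of a signed token carrying the IP, the item ID and some random data could look like this. It is not the code running on the site described above: the token layout, the function names and SECRET_KEY are all assumptions, and an HMAC stands in for the "salted hashes" mentioned.

```python
import hashlib
import hmac
import secrets

SECRET_KEY = b"change-me"  # hypothetical server-side secret


def issue_token(client_ip, item_id):
    """Build the value to place in the cookie (or hidden field) when the form page is served."""
    nonce = secrets.token_hex(8)  # the "random stuff"
    payload = f"{client_ip}|{item_id}|{nonce}"
    sig = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"


def token_is_valid(token, post_ip, item_id):
    """Check the token returned with the POST against the posting IP and the item ID."""
    parts = (token or "").split("|")
    if len(parts) != 4:
        return False
    ip, item, nonce, sig = parts
    payload = f"{ip}|{item}|{nonce}"
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison of the signature, then the field checks.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    return ip == post_ip and item == str(item_id) and bool(nonce)
```

      Signing the payload means a script cannot simply rewrite the IP in the cookie or hidden field to match whatever proxy it is posting through.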

      I'm also evil in that the apparent failure modes are non-deterministic, and include such things as random HTTP response codes, random modes of connection failure, and spam messages that apparently go through, but are only visible for the IP that posted them, or for one minute after they are posted.

      Your move, "RosarioRush".

      --
      There are 1.1... kinds of people.
    3. Re:4 Tests Stopped 30,000 Comments For Me by in10se · · Score: 1

      Be careful about the first/last name check. The place I work has about 1000 employees. 4 of them (mostly foreign) have the same first and last name. I know that's less than 1%, but you don't want to limit real users.

      --
      Popisms.com - Connecting pop culture
  13. HTTPBL by Anonymous Coward · · Score: 2, Interesting

    Project Honeypot's HTTPBL has been good to me:

    See: www.projecthoneypot.org/httpbl.php

  14. Fast way is to slow it down by Todd+Knarr · · Score: 3, Interesting

    The fastest way is probably to just slow down user registration. Permit anonymous posting, but make it moderated/screened by default (ie. not visible to other users until the forum owner flags it as OK). When a user goes to register (so they can get their posts visible immediately), do not send them the confirmation e-mail immediately. Batch your confirmations up and send them out twice a day at odd times (ie. not midnight and noon, something like 3:47am and 3:47 pm) (you could do it 4 times a day, but not much faster than that since the idea's to introduce a delay in the registration process). Make sure to tell the user on the registration screen what sort of time-frame they can expect their confirmation to arrive in. Ordinary users who plan on using the forum long-term won't be inconvenienced much by this. Spammers... won't tolerate the delay, they want to get their message in fast and get out. With their automated scripts they might not even notice things are failing. Also, don't include a direct confirmation link in the e-mail. Include a URL to a form and make the user copy-and-paste the confirmation number from the e-mail. That'll be trivial for humans, but not easy for an automated script to handle without human assistance.

    None of that will stop a determined spammer, but most of them are more interested in volume than anything else and they won't bother spending time/effort on just one forum when they could hit 10 others instead.

    1. Re:Fast way is to slow it down by Anonymous Coward · · Score: 0

      As someone who works in Internet marketing I would like to offer free advice to everyone and recommend against everything the parent said.

      The main idea behind boosting the conversion rate (the percentage of people who come to the site and then register, read articles, post comments, buy products, or do whatever the site's owner wants them to do in order to make the site work) is to make things as simple and quick for the user as possible.

      The ideal situation is always that the user does not have to register at all. The need to register drops conversion rates dramatically. For every field of info the user has to enter in order to register, the conversion rate drops a lot again. E-mail confirmation drops it a bit further, but not by much at that point. But the suggested "send confirmation emails twice per day at odd times" is just a horrible idea.

      I've met some websites that use that. I think. I DM DnD as a hobby and googled for guides to make better maps. I hit cartographers' guild's forums. There were nice tutorials but before trying them I wanted to see what end result looked like. The forum software didn't let me see the images (uploaded to forums) before registering. I nearly left but decided to register instead. But the confirmation mail didn't arrive. I decided not to wait for hours, googled five minutes more, found another great tutorial site (a lot of free video tutorials) that didn't require me to register, bookmarked it and am still an active user.

      The question is then "But do we want to have users who don't have interest to even register?" and my experience is yes. Wanting to comment on something shows that the user is already a promising lead. If he can do that easily, he is more likely to check back to see responses and stay on as an active visitor. Even if he doesn't, his comment is now extra material for other visitors to look at. Unless we assume "Registered users automatically give higher quality comments", and again, based on what I've seen, there is nothing to imply that.

      Honestly, don't make things harder, slower and more annoying for your visitors just to fight spambots when there are other ways too. You'll only hurt your site.

    2. Re:Fast way is to slow it down by Todd+Knarr · · Score: 2, Insightful

      Internet marketing brought us forum spam in the first place, I'm afraid.

      For most forums it's not about getting the most users at any cost. It's about getting the most interested visitors without scaring an unacceptable fraction away, while at the same time keeping the number of spammers at a manageable level (which, given their proclivities, is pretty close to zero). And the simple fact is that, if it can be automated, spammers can and will automate it. And as long as it costs them little or no time or effort, they'll continue to flood the forum. Getting around filters can be automated.

      I'm reminded of an exchange:
      Merchant: I need some way to keep people from stealing my merchandise all the time!
      Consultant: <looks over the racks of high-value jewelry sitting outside the store> Well, why don't you move your merchandise inside, instead of leaving it out along the sidewalk?
      Merchant: I can't do that! People couldn't see my merchandise, and I might lose customers!
      Consultant: ... so why exactly did you hire me again?

  15. YAWASP for wordpress by zimtmaxl · · Score: 3, Informative

    There is a well working semi-dynamic plugin for wordpress. It has served me well. It is called YAWASP and you can find it here: http://wordpress.org/extend/plugins/yawasp/. The author also describes the common problems & shortfalls with traditional captcha-like methods.

    --
    how IT is changing the world - http://max.zamorsky.name
  16. "I am a robot" field by casualsax3 · · Score: 4, Informative

    The ZSNES boards employ a neat trick: http://board.zsnes.com/phpBB2/profile.php?mode=register&agreed=true

    It's got a field that says "I am a robot" checked off by default. A human should obviously see that and uncheck it. Those registrations that come in with it checked are blackholed. It's definitely cut down on the SPAM accounts since they enabled it.

    1. Re:"I am a robot" field by slimjim8094 · · Score: 4, Funny

      Great idea, and it has the side-effect of keeping idiots out too :)

      --
      I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
    2. Re:"I am a robot" field by Anonymous Coward · · Score: 0

      You have just checked the following...

      I have a cat in a bag:

      [ ] keep it in
      [X] let it out

    3. Re:"I am a robot" field by Anonymous Coward · · Score: 0

      That's all good for robots and all, and there are a number of alternatives for such a thing, but our problem is real people registering on our phpBB forum and spamming it. There is no good option like Akismet for phpBB (though I believe there was an attempt at a plugin, I don't think it worked so well). So while Wordpress and Drupal are covered, phpBB is not, which is a shame since it's a very popular forum platform.

    4. Re:"I am a robot" field by Anonymous Coward · · Score: 5, Funny

      And the robots. Here I am, brain the size of a planet, and I keep getting banned from forums. *sigh*

    5. Re:"I am a robot" field by noidentity · · Score: 1

      It's got a field that says "I am a robot" checked off by default. A human should obviously see that and uncheck it.

      What have you got against sentient robots taking part in discussion (you insensitive clod)? Or are we...er...they forced to lie about their true identity?

    6. Re:"I am a robot" field by ikkonoishi · · Score: 1

      Also sarcastic bastards. So it's win-win.

    7. Re:"I am a robot" field by badkarmadayaccount · · Score: 1

      +1 Obscure Terry Pratchet Novel Reference

      --
      I know tobacco is bad for you, so I smoke weed with crack.
  17. easy way to do it by indy_Muad'Dib · · Score: 1

    Not so easy for you, of course: make the forum moderated and require post approval, then get good mods to approve posts.

  18. Probably not doable now by Deleriux · · Score: 1

    But some PHP tinkering and you could probably do something to pass comments through spamassassin using a socket or something.

    Spamassassin would need updating though to work with content-only data.

    Wonder if anyone's ever thought of this before?

  19. Better than Askimet? by Jeremiah+Cornelius · · Score: 3, Interesting

    Arguably, it is Mollom. Especially if you are using Drupal.

    Akismet is 'rotting on the vine' in many ways - including development updates. Mollom is a commercial web service, with a free version for non-profit and small volume sites/users.

    The Drupal module is explained here:
    http://drupal.org/project/mollom

    The Mollom site:
    http://mollom.com/

    --
    "Flyin' in just a sweet place,
    Never been known to fail..."
    1. Re:Better than Askimet? by Hojima · · Score: 3, Informative

      read my sig

    2. Re:Better than Askimet? by Jeremiah+Cornelius · · Score: 1

      Marvelous

      --
      "Flyin' in just a sweet place,
      Never been known to fail..."
    3. Re:Better than Askimet? by ceejayoz · · Score: 4, Informative

      I seem to get Mollom captchas on every site that uses it. My IP, user agent, etc. are almost completely static. My comments are grammatically correct, never spammy, etc.

      If their system hasn't identified me as safe by now, there's something wrong.

      In contrast, to my knowledge Akismet has never flagged me. My comments go straight up on blogs using it. On my personal site, I've had maybe 10 false positives out of several thousand caught.

      Mollom, IMO, has a long way to go.

    4. Re:Better than Askimet? by darkpixel2k · · Score: 4, Insightful

      read my sig

      That'll work, right up until the spam bots are told to ignore spampoison.com, or the person who is running the spam bots decides to put spampoison.com into his hosts file and point it to 127.0.0.1.

      Lame solution.

      --
      There's no place like ::1 (I've completed my transition to IPv6)
    5. Re:Better than Askimet? by mysidia · · Score: 3, Insightful

      The problem with that concept is spammers just have to "blacklist" spampoison.com. Or implement "spam filters" of their own to detect such sites.

      What would really be ideal would be thousands of poison domains, with high variability so smart spammers can't easily protect themselves and sanitize their lists, when they figure out what's going on...

    6. Re:Better than Askimet? by superphreak · · Score: 1

      I can't. My spam-blocker is blocking it.

      Just kidding...

      --
      Evolution is a state-sponsored, state-protected religion.
    7. Re:Better than Askimet? by johanatan · · Score: 1

      That's really neat until the spammers adapt. What would be really nice is to dynamically generate unique versions of the link and to have its destination also vary over time.

      *That* would be almost undetectable by the spammers.

    8. Re:Better than Askimet? by theshowmecanuck · · Score: 1

      I am curious, what is the motivation for Akismet or Mollom to provide this service? This has to cost them bandwidth etc. which is money out of their pockets. So there has to be a definite benefit to them in some way. What is their gain? I don't trust companies that do things for the 'general good'. Churches maybe, they're a little crazy (sometimes in a good way), but not companies.

      --
      -- I ignore anonymous replies to my comments and postings.
  20. If you're willing to pay... by Lord+Byron+II · · Score: 1

    ...there are companies out there that use a Bayesian filter to sort posts into low scoring and high scoring, and then they have their employees manually sort through the high scoring messages.

  21. Hidden Input Box by waldoj · · Score: 5, Informative

    Third, do they fill out a hidden inputbox? This is sort of the reverse captcha.

    This is really a very good test. As others have mentioned in this thread, it's the sort of thing that spammers will circumvent if it becomes widespread, but for now it's great.

    There's something else I've found to be really quite effective: deliberately misnaming my form fields. For instance, give the input field that's labelled "First Name" an input name of "phone number." Humans don't use input names to determine what text to enter, but spambots do. Then check that input: if the first name field contains a phone number, you know you've got yourself a spammer.

    I've used solely the combination of these two things to run one of my websites for two years now, and I get a vanishingly small amount of spam.
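
    The server-side half of the misnamed-field trick can be quite small. A rough sketch, assuming the visible "First Name" box is submitted under the name phone_number (both the name and the seven-digit threshold are made up for the example):

```python
import re

# Characters that typically make up a phone number.
PHONE_CHARS = re.compile(r"^[\d\s()+.\-]+$")


def first_name_looks_like_phone(form):
    """True if the field labelled "First Name" (submitted as phone_number) holds a phone number."""
    value = form.get("phone_number", "").strip()
    digits = sum(ch.isdigit() for ch in value)
    # Require at least 7 digits and nothing but phone-style characters.
    return digits >= 7 and bool(PHONE_CHARS.match(value))
```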

    1. Re:Hidden Input Box by gr8dude · · Score: 1

      What about people who rely on screen readers?

  22. Message board spam. by JWSmythe · · Score: 4, Informative

        I had a similar problem in the comments area of my site. It was all fun and games, until one day I checked, and there were something like 1000 spams for every real message.

        I wrote my own system to deal with it. It's not very hard, assuming you know how your site works (of course you do, right?)

        I ended up making two blacklists. One was for words and phrases. The spammers tend to post (and repost, and repost) the same crap. My blacklist rules had some simple regular expressions that I could run queries with. Like, "%http://%spamsite%" and "%v%gra%". You get the idea. The second list was IP's that were known spammers.

        At the time, I allowed both anonymous comments, and comments from logged in users. I eventually did away with the anonymous comments, as they were a headache. This was the best cure.

        So, when my script ran (once a minute), if it matched a message, it would delete the message, and append the IP to the IP blacklist. If it was posted by a user account, the user account got suspended, so they could no longer log in, nor post.

        After its detection and cleanup run, it then ran back over the IP list, and pruned out every post by that IP. Sometimes they'll do practice runs saying silly things like "nice site". I thought they were real user compliments at first, until I saw the same posting verbatim coming from the same IP to multiple news stories, and then that IP would start spamming later.

        Some people will argue that the IP cleanup run was not nice, polite, or even fair. People use proxies. Sure, they do. We got a lot of abuse from anonymous proxies, and no real messages from them. The spammers didn't seem to like to use AOL.

        When I implemented this, I posted a very brief description of what I was starting ("We're starting advanced anti-spam protection"), with an apology for real messages that were deleted. I never received one complaint about real comments disappearing.

        How brutally you do it is really up to you. I built my method by manually doing it for a while, and then letting the script do it on its own. Occasionally, I would have to go in and add new words and/or site names to the words blacklist.

        I noticed the spammers hit more common software more often. It's worth it for them to make automated systems to abuse a piece of software that's deployed on tens of thousands of sites. When I rewrote my site from scratch, then abuses dropped down to 0 for a long time. Now, they manually submit "news" items which are just ads for their own sites. It appears to be manual, and since we won't run them as news stories (our editorial staff decides what does or doesn't show up as news, and if it needs to be edited first), they give up pretty quickly.
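
        One cleanup pass of the kind described above could be sketched like this. It is a simplification under several assumptions: comments are held in memory as dicts with ip and text keys rather than in a database, the patterns are treated as SQL LIKE patterns where % matches any run of characters, and account suspension is left out.

```python
import re


def like_to_regex(pattern):
    """Translate an SQL LIKE pattern ('%' = any run of characters) into a compiled regex."""
    parts = [re.escape(p) for p in pattern.split("%")]
    return re.compile("^" + ".*".join(parts) + "$", re.IGNORECASE | re.DOTALL)


# The two example patterns quoted in the post.
WORD_BLACKLIST = [like_to_regex(p) for p in ("%http://%spamsite%", "%v%gra%")]


def cleanup(comments, ip_blacklist):
    """Run one pass; returns the surviving comments and adds offenders to ip_blacklist."""
    # Pass 1: any comment matching a pattern puts its IP on the blacklist.
    for c in comments:
        if any(rx.match(c["text"]) for rx in WORD_BLACKLIST):
            ip_blacklist.add(c["ip"])
    # Pass 2: prune every comment from a blacklisted IP, including earlier "nice site" test posts.
    return [c for c in comments if c["ip"] not in ip_blacklist]
```

        Note that a pattern as loose as "%v%gra%" will also match innocent text such as "very great", so how aggressive the list is remains a judgment call.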

    --
    Serious? Seriousness is well above my pay grade.
  23. honeypot by OrangeTide · · Score: 1

    I like honeypot links that blacklist anyone who clicks it. Seems to take out spam spiderbots effectively, until they learn how to avoid the honeypot links.

    --
    “Common sense is not so common.” — Voltaire
  24. drupal techniques by Anonymous Coward · · Score: 0

    I've had good luck using Drupal with a couple of techniques.

    First is the Spam module. Pretty much out of the box.

    Second is to have all comments subject to moderation.

    Third is IP-address filtering via the .htaccess file. Blocking entire continents for some Drupal sites with a US geographic audience has kept the spam down to a very low level.

  25. Alternative to Akismet by MikeRT · · Score: 1

    TypePad antispam is a great alternative to Akismet.

  26. My 3 tests also work by lalena · · Score: 5, Interesting

    I have implemented something similar, but I haven't been checking the number of blocked messages. All I know is that I used to get spam, and now I haven't gotten any for years. I use this for Forums and the Contact Us page.

    My rules are:
    1) The text boxes for things like name and subject are actually called junk.
    2) There are hidden textboxes called name and subject (1 hidden by javascript and one by CSS) that if they are populated the post is ignored.
    3) A third hidden field is the result of a simple javascript math equation that is checked on the server side. If the value is wrong, the post is thrown out.

    As others have said, if your site is small these types of things are good enough to prevent spam because the spammers won't bother to figure it out. These concepts would never work for any of the larger sites or 3rd party forum software.
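
    The server side of those three rules might be sketched as below. The names are assumptions: the real values are taken to arrive in fields called junk1 and junk2, the decoys are name and subject as described, and the JavaScript result is taken to arrive in a field called check, with expected_sum being whatever value the server remembered (for example in the session) when it rendered the form.

```python
def accept_post(form, expected_sum):
    """Return the real (name, subject) values if the post passes, otherwise None."""
    # Rule 2: the hidden decoy fields "name" and "subject" must come back empty.
    if form.get("name") or form.get("subject"):
        return None

    # Rule 3: the value computed by the page's JavaScript must match what the server expects.
    try:
        if int(form.get("check", "")) != expected_sum:
            return None
    except (TypeError, ValueError):
        return None  # missing or non-numeric, e.g. the client never ran the script

    # Rule 1: the real values live in the oddly named fields.
    return form.get("junk1", ""), form.get("junk2", "")
```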

    1. Re:My 3 tests also work by lalena · · Score: 4, Informative

      As a follow up to myself, I didn't come up with these ideas on my own. I read them on Slashdot a couple of years ago.

    2. Re:My 3 tests also work by skeeto · · Score: 1

      Your hidden text boxes will break your site for text browsers, screen readers, and other users that don't want to run your Javascript (like Firefox with NoScript). I wouldn't say this is an acceptable solution.

    3. Re:My 3 tests also work by cyborch · · Score: 1

      People using noscript got themselves into that mess... If you use noscript you are most likely able to figure out when to disable noscript and a part of a small enough minority to not really matter. Sorry.

      Also, any screen reader unable to ignore tags which are hidden by css will have so very many problems with standard pages that no one will be using it anyway...

    4. Re:My 3 tests also work by el+americano · · Score: 1

      If you hide the label for the box as well, then it can still be logical if unintentionally seen, e.g. "Do not fill this box if you are a real person" or "Your post will be discarded if you fill this box"

      --
      Those are my principles. If you don't like them I have others. -Groucho Marx
  27. Moderation by Anonymous Coward · · Score: 0

    Maybe use something that lets users moderate posts up or down with labels like Informative, Funny, Flamebait, and Offtopic. Also, don't forget to include one involving a fictional bridge dweller, that will be misused by just about every moderator for anything they don't like.

  28. gmail by wangi · · Score: 1

    On phpbb boards I run the most productive things are:

    1. Do not allow external links in profile of newly registered / non validated users

    2. Do not allow registrations with gmail.com email addresses

    3. Ensure "valid" timezone and country settings are selected by users.

    L/

    1. Re:gmail by siyavash · · Score: 5, Insightful

      "Do not allow registrations with gmail.com email addresses"

      That is one of the most stupid things I heard this year.

    2. Re:gmail by wangi · · Score: 1

      You'd be surprised. It is trivial for spammers to get a gmail account. Anyone who genuinely wants to contribute to a forum will have another email address, if not they will be able to explicitly email...

    3. Re:gmail by 1u3hr · · Score: 1
      "Do not allow registrations with gmail.com email addresses" That is one of the most stupid things I heard this year.

      We haven't done it, but I'm tempted. At the forum I moderate we get a dozen spammers (mostly human drones in New Delhi or Beijing, from their IPs, not bots) attempting to sign up every day, which are manually checked. Almost all use GMail addresses.

    4. Re:gmail by shutdown+-p+now · · Score: 3, Informative

      You'd be surprised. It is trivial for spammers to get a gmail account.

      It's no less trivial than getting a Hotmail account, a Yahoo! account, or any of the many thousands of free webmail providers out there.

      Even so, I suspect that the majority of casual Internet users today actually have that sort of email account, based on personal experience. If you start blocking them, you're blocking most legit users, too. Unless it's a technical forum - and even in this case it's silly to block GMail, as many techies use that.

      Anyone who genuinely wants to contribute to a forum will have another email address

      Why? I for one don't have one - I use my GMail one everywhere - and I contribute to a lot of forums.

      if not they will be able to explicitly email

      Translate, please. Explicitly email what where, and how is that going to help?

    5. Re:gmail by Gunstick · · Score: 1

      well then reject registrations from New Deli :-)
      Or combinations: "you are not allowed registering with gmail from New Deli"

      --
      Atari rules... ermm... ruled.
    6. Re:gmail by Anonymous Coward · · Score: 0

      Yeah I agree.

      For Forum registration I never give my real email address. I always use my gmail or yahoo address. It cuts down the amount of email to my real account as well as prevents the shadier sites from being able to spam me.

      Any site that won't let me register with gmail or yahoo, gets a nasty letter sent to their feedback form and I don't bother with them again.

    7. Re:gmail by Anonymous Coward · · Score: 0

      It's also the same misguided policy that keeps me from asking legitimate questions about a motherboard, PSU, and car stereo amplifier at badcaps.net. I can only guess as to whether they're actually affected by that problem or not. (And I'd rather not take stuff apart, unless there is some correlation to the problem and perhaps a bit of guidance available.)

      The major free public email services do get abused, but it alone shouldn't be the basis for punishing everyone who uses them. (Not everybody has a work or school email, wants to pay for email service when there are free options, nor is wanting to figure out which forum(s) may provide a free email service as a side benefit to registering.) There are better and more effective ways of dealing with spam. Just look at all the various replies on the main subject of discussion.

  29. chart shows mollom killing spam by uctpjac · · Score: 1

    you might be interested in this chart:
    http://www.opendemocracy.net/blog/economics/admin/2008/11/18/mollom-beats-spam
    that shows what Mollom did to our forum spam after just 2 months. The interesting thing is that the spammers actually seem to have stopped trying to attack this url.
    Tony

  30. For small boards, passwords by DMUTPeregrine · · Score: 1

    I run a site for my renaissance faire guild to talk at and plan things. We had tons of message board spam until I implemented a simple solution: a password is required to register. If not entered, registration fails. The password is posted elsewhere on the site in my case, but you could communicate it only to people who need access if the site is small enough.

    --
    Not a sentence!
  31. Pivot open source blogging by macraig · · Score: 2, Interesting

    The comment- and trackback-spam blocking techniques in Pivot blogging software are, from my limited personal experience, 100% effective. There's even an extension that uses the enormous Project Honeypot database (http:BL) to weed out IP addresses of identified harvesters and comment spammers. That's just for entertainment, though, since the basic techniques are completely effective.

  32. Mollom by kbahey · · Score: 1, Redundant

    Mollom is free for low to medium traffic sites. They have plugins for the major CMSes out there (Drupal, Joomla, Wordpress, and a bunch of others).

    It is relatively new, but I use it on several sites and it works well. See the score card for some fun.

    The founder of Mollom is Dries Buytaert, the founder of Drupal, the CMS.

  33. public filtering is an unsolved problem by beefubermensch · · Score: 1

    If a spammer really wants to, he can test his attacks against the site until he beats your filter. Filtering works impossibly well, but only if the output of the filter is private. Spammers may not be doing these attacks now, but if everyone started using Akismet, no doubt they would start.

  34. Bad Idea by erlehmann · · Score: 4, Insightful

    As someone who once used text browsers, I can only advise everyone not to do this - it breaks accessibility at a fundamental level: I got banned from a forum once because they mislabeled fields.

    What however, works really great for comment spam is a simple question like "What is the name of Barack Obama ?".

    1. Re:Bad Idea by Anonymous Coward · · Score: 2, Funny

      Barack HUSSEIN Obama.

      Sorry. Had to. Just a little jab at people who feel the need to point that out all the time.

      Anonymous Coward to prevent a serious Karmic Backlash from people who can't take a joke.

    2. Re:Bad Idea by shutdown+-p+now · · Score: 1

      What however, works really great for comment spam is a simple question like "What is the name of Barack Obama ?".

      I know, I know! It's Bin Laden! ~

      And yes, unfortunately, you have to take that sort of thing into account (though I guess it depends on the kind of forum/blog that you're running).

    3. Re:Bad Idea by Anonymous Coward · · Score: 0

      People are dumb. They WILL TRY to read between the lines.

      You'll even manage to get false positives with a straightforward question like "What is the name of Barack Obama?" People will think the question is more complicated than you mean it to be.

    4. Re:Bad Idea by Nesman64 · · Score: 1

      I think the hidden text field is fine, as long as the name or label for the field identifies it as a spam trap to humans. I have labeled mine "please leave this field blank" and could make the id "url" or "website" to attract spammers.

      Then I've used css to hide it:

      .test {
        display: none;
        speak: none;
        visibility: hidden;
      }

      --
      coffee | nose > keyboard
  35. Bad Behavior by Nycran · · Score: 1

    I've been told that Bad Behavior is the shiznit. http://www.bad-behavior.ioerror.us/

    1. Re:Bad Behavior by 0100010001010011 · · Score: 1

      Mod Parent Up! Since I implemented bad behavior I've cut down on spam quite a bit.

    2. Re:Bad Behavior by InvisiBill · · Score: 1

      When I saw this article, I opened it specifically to mention Bad Behavior.

      http://www.bad-behavior.ioerror.us/documentation/how-it-works/

      Instead, Bad Behavior pioneered an HTTP fingerprinting approach. Instead of looking at the spam, we look at the spammer. Bad Behavior analyzes the HTTP headers, IP address, and other metadata regarding the request to determine if it is spammy or malicious. This approach has proved, as one user said, "shockingly effective." After all, spammers write their bots on the cheap, and have little incentive to code very well. If they could code very well, they probably wouldn't be spammers.

      For some numbers, Akismet has blocked 400 spams on my site. BB has blocked 197 attempts this week. I'm not sure how far back Akismet goes, but the point is that BB blocks the vast majority of spammers before they even post their message, then Akismet generally catches the rest.

  36. Reverse CAPTCHA by Anonymous Coward · · Score: 0

    What I did in the past is to do a reverse CAPTCHA. Basically just a regular one but with a little text below: "Enter the letters shown in the image _in reverse order_". Stopped 100% of all spambots.
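
    The check itself is a one-liner. A sketch, assuming the server still knows the string it rendered into the image (shown) and compares case-insensitively:

```python
def reverse_captcha_ok(shown, answer):
    """True if the visitor typed the CAPTCHA letters in reverse order."""
    return answer.strip().lower() == shown[::-1].lower()
```

    A palindromic CAPTCHA string would defeat the point, so the generator should probably avoid those.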

  37. CRM114 is an option. by Dr.+Crash · · Score: 1

    CRM114 is an option you might want to consider.

    Plusses and minuses:

    + REALLY FREAKIN' ACCURATE. trains to 10x better than human.
    + REALLY FREAKIN' FAST. 20-50 milliseconds/posting without even being daemonized.
    + REALLY OPEN SOURCE. GPLed. Free forever.
    + REALLY FLEXIBLE. Has about a dozen built-in classifiers, most of which work on any human language (including
            chinese, japanese, korean, etc in their native formats).

    - Arcane control language. "like 'awk' on meth". "grep bitten by a radioactive spider". You get my drift here?
    - Not a drop-in solution for blogposting. You'll have to do some coding.
    - Needs to be trained, with both positive and negative examples. When it wakes up, it knows _nothing_.

    It's at "crm114.sourceforge.net"; there's mailing lists as well as an IRC channel (#crm114) on freenode.

    1. Re:CRM114 is an option. by stevey · · Score: 1

      I've used that a fair bit and enjoy it. On my blog each comment is parsed as an email - and then filtered via CRM114.

      I could do more, but so far I don't need to. Each time I reject a spam comment (mail) I block the source address via the firewall for a few days to cut down on repeats from the same host.

      Currently I'm running with about 300 active blocks.
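
      A rough sketch of the "block for a few days" bookkeeping, in Python. It keeps the list in application memory rather than in the firewall, which is an assumption made to keep the example self-contained; the class name and the three-day default are hypothetical.

```python
import time


class TemporaryBlocklist:
    """Remember source addresses for a limited time, then forget them."""

    def __init__(self, ttl_seconds=3 * 24 * 3600, clock=time.time):
        self.ttl = ttl_seconds      # how long a block lasts
        self.clock = clock          # injectable so expiry can be tested
        self._expires = {}          # address -> expiry timestamp

    def block(self, address):
        # Blocking again simply extends the expiry.
        self._expires[address] = self.clock() + self.ttl

    def is_blocked(self, address):
        expiry = self._expires.get(address)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            # Block has lapsed; drop the stale entry.
            del self._expires[address]
            return False
        return True

    def active_count(self):
        now = self.clock()
        return sum(1 for expiry in self._expires.values() if expiry > now)
```

      Calling block() whenever the classifier rejects a comment and is_blocked() before accepting a new one gives the same effect at the application level.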

  38. How does Slashdot do it? by Ye_Gads · · Score: 2, Interesting

    I rarely see spam here...or is it just quickly modded down to oblivion?

  39. StopForumSpam.com works very well. by asackett · · Score: 1

    Take a look at StopForumSpam.com. I've got it installed on a vBulletin forum and it works very, very well to prevent spambots from registering. Every now and then one sneaks through, but it's a lot less than I was seeing before.

    --

    Warning: This signature may offend some viewers.

  40. The old "Fake Textarea" trick by Dwedit · · Score: 1

    I'm using the old "Fake Textarea" trick. If anyone fills in the fake textarea, the post is rejected. The fake textarea comes up first, but is hidden with CSS. I also modified the forum software so that the fake text field has the same form name as what the forum traditionally uses for the real field.
    I'm also using this in conjunction with blocking posts containing URLs from guests or users with no posts.

    Of course, this is all useless against Stock Ticker symbol spammers.

  41. If Viagra were free... by NimrodSonOfCush · · Score: 2, Funny

    ... 90% of all spam would be eliminated.

    1. Re:If Viagra were free... by shutdown+-p+now · · Score: 1

      Hm, that's an interesting approach to fighting spam, and it's right on time with all that government throwing money around in the USA... maybe you should try convincing Obama that it would be a good social project to invest money in, too?

  42. spot the cat works for humans by johnjones · · Score: 1

    hey there

    if you want to filter for humans simply present a bunch of images and have the person spot the cat among the dogs

    then apply the spam filtering (simple stuff really works; you can even just use SpamAssassin plugins for content) to get rid of the spammers posting URLs and rubbish, and deny based on IP if you catch spam, unless they contact you somehow

    regards

    John Jones

    http://www.johnjones.me.uk

  43. use robbIE's patentdead PostBlock censoring device by Anonymous Coward · · Score: 0

    it doesn't work well/isn't very smart, either. better days ahead. keep it to yourself, to avoid the embarrassment of being 'filtered'.

  44. RBLs are annoying but effective by erroneus · · Score: 1

    Recently, one of my users got infected with some spam-spewing bot malware, which resulted in my company being listed on at least four RBLs. It is annoying, but I can't hold it against the list services as I use them myself in my own filtering.

    I have to wonder if RBLs of some sort could also be applied to web browsing, especially on forums? But since most people are on dynamic IP addresses, I can only assume that without some very clever ideas to go along with it (perhaps some sort of cumulative scoring + fingerprint method?) RBLs for dynamic IPs are a rather bad idea. Still, it would be nice if there were some means of simply blocking "infected" IP addresses or at the very least calling the problem to the user's attention in some way.
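
    For reference, checking a poster's IPv4 address against a DNS-based list works the same way for a web form as for mail: reverse the octets, append the list's zone, and do a DNS lookup. A minimal Python sketch follows; the zone dnsbl.example.org is a placeholder, not a real list, and both function names are hypothetical.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="dnsbl.example.org"):
    """Build the DNS name to query: reversed octets followed by the zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone="dnsbl.example.org"):
    """True if the list returns an address record for this IP.

    A lookup failure (including 'no such name') is treated as 'not listed',
    so a DNS outage fails open rather than blocking everyone.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

    The dynamic-IP concern still applies: a listing says something about the address at the time it was reported, not necessarily about whoever holds it now.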

  45. My Own Experience by Anonymous Coward · · Score: 0

    Last year I wrote my own software for a blog to be viewed by just family and friends. I figured security by obscurity. Was I ever wrong. As soon as I was discovered the spam just flooded in at a rate of about 250 posts a day (which completely blew me away).

    I noted that it was all designed to raise pagerank. Just random words and several links. So I wrote a simple function that checked every submitted post to see if there were more than two links. After all, Grandma won't be sending me tons of links anyway.

    If it detects more than two links, the function spawns a separate thread which waits ten seconds and then deletes the post. In short, the post lasts just long enough for the spammer to test if the posts are still sticking. After I installed the filter, about 16k posts were filtered out (which also blew me away) until the spam stopped altogether.

    I was actually pleased to realize that I was spammed that much, because it meant 16k posts were absorbed by my blog rather than an unprotected one.

    If I could think of a better way to take up a spammer's bandwidth/processes I would do that as well.
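
    A minimal Python sketch of that filter, assuming posts live in a plain dict; the poster's own code was not shared, so the names count_links and handle_post, the URL pattern, and the storage are all hypothetical.

```python
import re
import threading

# Assumed definition of "a link": anything starting with http(s):// or www.
LINK_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
MAX_LINKS = 2


def count_links(text):
    return len(LINK_PATTERN.findall(text))


def handle_post(post_id, text, store, delay_seconds=10.0):
    """Accept every post, but schedule link-heavy ones for quiet deletion.

    Returns the timer for a post that will be deleted, otherwise None.
    """
    store[post_id] = text
    if count_links(text) > MAX_LINKS:
        # The post stays visible briefly so an automated "did it stick?"
        # check succeeds, then it is removed in the background.
        timer = threading.Timer(delay_seconds, store.pop, args=(post_id, None))
        timer.daemon = True
        timer.start()
        return timer
    return None
```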

  46. Custom solutions by ericlondaits · · Score: 1

    The phpbb forum I administer fell victim to spammers more than a year ago so I tried a bunch of MODs that implemented a couple of changes to foil attempts of automatically registering. Spam slowed down a bit, but still was strong enough to be a problem... it seems that whatever script spammers use to post in phpBB already implements most standard MODs.

    phpBB's own CAPTCHA is no good either... ... So what I did was implement my own validation, which requires entering a fixed word ("Dragon") in a text box. It's not a captcha... it simply states "Enter the word 'dragon':" in plain text. That did the trick and spammers completely disappeared from the forum. ... So as long as you're not running a high profile site, a custom mod should be enough.

    --
    As a Slashdot discussion grows longer, the probability of an analogy involving cars approaches one.
  47. Just disallow links........ by Anonymous Coward · · Score: 1, Interesting

    I do something rather simple in my forums (about 30 of them) that seems to work very well: I disallow any user from posting a message with a URL in it until they've made a certain number of posts (usually about 20 or so). Until they have that many posts under their belt any post with a URL in it just returns them to a preview screen. Since their goal in life is to drop a link, this really frustrates the *&$%! out of them. :)

    None of them so far has ever bothered to make 20 real posts in order to get by this limitation. On the rare occasion (and I do mean rare) that they have started to post stuff in order to work towards reaching the limit, they have their posts removed by mods or admins, which resets them back to zero.
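
    The rule itself is tiny. A Python sketch, assuming the forum can supply each user's count of surviving posts; may_post, the URL pattern, and the threshold constant are hypothetical names, not taken from any forum package.

```python
import re

# Assumed definition of "a URL": anything starting with http(s):// or www.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
MIN_POSTS_FOR_LINKS = 20


def may_post(text, surviving_post_count, min_posts=MIN_POSTS_FOR_LINKS):
    """Allow URLs only once the user has enough posts that were not removed."""
    if surviving_post_count >= min_posts:
        return True
    return URL_PATTERN.search(text) is None
```

    Because the count only includes posts that moderators have left standing, deleting a spammer's filler posts puts them back at zero.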

    Like I said, this really frustrates the *&$%! out of them. lol

    At the risk of being labeled a spammer, you can see it in action here: www.grouptopic.com

    Same with my contact forms- no links allowed. It just stops 'em dead, and if they REALLY need to send a link, they can contact me first and say so. Works like a charm.

    Mike

  48. Spam Karma by chemindefer · · Score: 1

    I use Spam Karma and Bad Behavior on my not-very-popular Wordpress blog and have no spam comments at all, and I used to have lots of them. SK has recently been abandoned by its founder, but still works on WP 2.7.

  49. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  50. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  51. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  52. Comment removed by account_deleted · · Score: 2, Interesting

    Comment removed based on user account deletion

  53. Kill the spammers--NOW by shanen · · Score: 1

    Considering the complexity of the Internet, I have real and increasing difficulty understanding how the spammers manage to survive. They require an entire chain of support services to stay in business. Not just ISPs who let them access the Web, but also hosting services to hold their websites, DNS providers, and the domain registrars. They need lots of help to link their spamvertised websites to the spam, just on the minimal chains. (I've noticed that more complicated chains seem to be less frequent these days.) In addition, the spammers are strongly constrained by their need to reach suckers and provide ways for the suckers to reach back to them. They can't hide or obfuscate their spam too much, or the human suckers won't be able to figure out how to send the money.

    The strength of a chain is the weakest link. Right now that seems to be the domain registrars. If the technical honchos of the Internet scanned the spam to find the largest spam-supporting registrar of the month and the rest of the Web then stopped talking to that registrar, that would seem to be rather harmful to the spammers' so-called business model, eh?

    - Manual, less mutable, and more than short sig -

    I only stop by /. when I'm feeling sufficiently acerbic and have a few minutes to waste. Used to be /. was good for humorous or sardonic moods. My basic feeling is that the average wit of the /. participants has greatly declined. As a metaphor, the residual wit on /. is far below critical mass. Creating a new meme is laudable, applying an old meme in a new way can be somewhat witty, mindlessly repeating tired old memes is *NOT* witty nor amusing.

    I welcome thoughtful or witty rejoinder, but if you are a typically witless /. contrarian and simply lack the mod points for a spineless and anonymous censorship mod, then please don't waste our time with a reply. In particular, if you are one of the morons who still wants to defend Dubya's miserable failures, please just designate me your foe, and I'll gladly ignore you. One can never be designated as "foe" by too many fools.

    --
    Freedom = (Meaningful - Coerced) Choice != (Speech | Beer^2), and sad sock puppets' bad mods avail them naught.
  54. Make us all pay by Anonymous Coward · · Score: 0

    > why would they agree to sign up for such a service
    > in the first place?

    If they don't, then they stay in the zombie spam ghetto where they belong - fine.

  55. Re:YAWASP for wordpress + other by zimtmaxl · · Score: 2, Informative

    I forgot to mention these 2 plugins:
    SABRE: against spam registrations on your blog ( http://wordpress.org/extend/plugins/sabre)
    and
    Simple Trackback Validation: a trackback validation tool for wordpress ( http://wordpress.org/extend/plugins/simple-trackback-validation/ ).

    --
    how IT is changing the world - http://max.zamorsky.name
  56. time zone registery question by Anonymous Coward · · Score: 0

    I run a phpBB forum. I added a question to the sign-up sheet that requires a user to pull down a menu selecting their time zone. Seems the default for most spam bots is to select the first option in a pull-down menu. The first option is a time zone in the Pacific where there are no human habitations. It has brought our spam down to one or two a month.

    Seems to confuse human spammers also. Also, adding a report spam button to allow human users to report spam has eliminated the rest. I also implemented site wide, no questions asked, no links until at least 10 posts or you are banned policy. Most spammers are after links, so eliminate the links and the spammers tend to go away.
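
    The time zone trap boils down to one comparison. A Python sketch, assuming the form field is called "timezone" and that the first, uninhabited option carries the value "Etc/GMT+12" (both are assumptions, not what phpBB actually submits):

```python
# Assumed value of the first option in the menu (UTC-12, no inhabitants).
TRAP_TIMEZONE = "Etc/GMT+12"


def passes_timezone_trap(form):
    """Reject registrations that left the menu on its first option."""
    # A missing field is treated the same as the trap value.
    return form.get("timezone", TRAP_TIMEZONE) != TRAP_TIMEZONE
```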

  57. TypePad AntiSpam by [rvr] · · Score: 1

    TypePad Antispam is an open source project and a commercial (but free) service. The core is released as open source (GPL2) so you can install your own instance of TypePad Antispam in your servers. It has an Akismet compatible API and plugins already exist for Movable Type, WordPress and other CMSs. The free service is what TypePad uses, and has some extensions not released in the open source version, so has some advantages to a single installation.

    --
    Víctor R. Ruiz
    rvr(at)blogalia.com
  58. Yes, there's Sblam! by porneL · · Score: 1

    That's exactly what Sblam! does.

    It's a PHP-based filter for web forms that detects spam based on content (bayesian filter + specific rules), behavior and uses 3rd party blacklists.

    It's absolutely transparent to the user (well, 99.8% of them).
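
    To make "bayesian filter" concrete, here is a toy naive Bayes classifier in Python. It is a sketch of the general technique under simple assumptions (word tokens, Laplace smoothing) and says nothing about how Sblam! itself is implemented; the class and method names are hypothetical.

```python
import math
import re
from collections import Counter

TOKEN = re.compile(r"[a-z0-9']+")


def tokenize(text):
    return TOKEN.findall(text.lower())


class NaiveBayesFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """label must be 'spam' or 'ham'."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        vocab = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        total_docs = sum(self.doc_counts.values())
        words = tokenize(text)
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior, then smoothed per-word likelihoods.
            log_p = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                log_p += math.log((count + 1) / (total_words + vocab + 1))
            log_score[label] = log_p
        # Turn the two log scores into P(spam); clamp to avoid overflow.
        diff = max(-700.0, min(700.0, log_score["ham"] - log_score["spam"]))
        return 1.0 / (1.0 + math.exp(diff))
```

    Like any trained filter, it knows nothing until it has seen examples of both spam and legitimate posts; an untrained instance returns 0.5 for everything.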

  59. Filter me! by Anonymous Coward · · Score: 0

    "While filtering for spam on email and other related mediums seems to be fairly productive, there is a growing issue with spam on forums, message-boards, blogs, and other such sites. In many cases, sites use prevention methods such as captchas or question-answer values to try and restrict input to human-only visitors. However, even with such safeguards, and especially with most forms of captcha being cracked fairly often these days, it seems that spammers are becoming an increasing nuisance in this regard. While searching for plugins or extensions to spamassassin etc I have had little luck finding anything not tied into the email framework. Google searches for PHP-based spam filtering tends to come up with mostly commercial and/or more email-related filters. Does anyone know of a good system for filtering spam in general messages? Preferably such a system would be FOSS, and something with a daemon component (accessible by port or socket) to offer quick response-times."

  60. Nope by waldoj · · Score: 1

    I'm afraid you misunderstand me. Again, only the field name is affected, not the label for the field. I've used text-only browsers regularly since 1994 (Mosaic over a 14.4k modem, Lynx, and now Links), and I'm yet to encounter one that displays the name attribute of an input field to the user.

    1. Re:Nope by Anonymous Coward · · Score: 0

      Hmm, I just tested out a copy of Links 2.1pre33 and it will show me the name of the field in the bottom of the window. (Or perhaps I just have the options set that way)

      Bottom of the window when I'm sitting on a text field:
      Text field, name role

      HTML of page element:
      <INPUT NAME="role" TYPE="text">

      Or for that matter, Google's text entry field on the front search page is called q.

      So they can show the name of the element, just that I'm guessing that it normally won't.

  61. OT:Kill the spammers--NOW by Magic5Ball · · Score: 1

    > Considering the complexity of the Internet, I have real and increasing difficulty understanding how the spammers manage to survive.

    Complexity begets niches. Niches are difficult to actively manage, hence, spammers thrive.

    > If the technical honchos of the Internet scanned the spam to find the largest spam-supporting registrar of the month and the rest of the Web then stopped talking to that registrar, that would seem to be rather harmful to the spammers' so-called business model, eh?

    That type of approach was demonstrated to be minimally effective recently. Google mccolo and esthost for examples of how enforcement actions by the community with backing from the "technical honchos of the Internet" had minimal effect on spammers in 2008. As long as there are other spammy registrars, hosting providers, transit providers etc. for spammers to go to, the greatest effect from closing down a provider is to add a line to a black list, without having sustainably altered the situation.

    With respect to registrars, it's often not viable for a number of reasons to poke spammy registrars that are also closely linked to ccTLD registrars (or similar) for particular countries (or similar). Individual organizations can implement or subscribe to particular DNS tricks, but there are barriers against large eyeball networks (TW, AOL, VZN, RR, etc.) doing so. (It's interesting that Internet censorship and information lensing of this sort are not well tolerated in general, but are simultaneously explicitly demanded to curtail spammers.)

    With respect to structurally breaking spamming methods, having forums/blog software authors universally deploy basic things like robots.txt, nofollow, etc. in conjunction with Google and the other search engines would be much more harmful to the link spammers' business model overall.

    --
    There are 1.1... kinds of people.
    1. Re:OT:Kill the spammers--NOW by shanen · · Score: 1

      Mostly an interesting and thoughtful reply--but defeatist. Since defeatism is an attitude, it doesn't seem useful to address your specific points, since I can't actually address your attitude. I guess the strongest point I can make is "The spammers love people like you." I know that many spammers already dislike me, and my current ambition is to earn their hate...

      --
      Freedom = (Meaningful - Coerced) Choice != (Speech | Beer^2), and sad sock puppets' bad mods avail them naught.
  62. My custom solution by shark72 · · Score: 1

    I run a blogging site. When the spammers discovered it, I started getting several thousand automated spam comments per day.

    I solved the problem (ie. absolutely no automated spam) with a two-step process:

    First, I wrote a REALLY quick text analysis script in PHP which looked for the presence of links and other suspicious text. This reduced the spam by 95%, with no false positives.

    Since I had to keep examining the spam that got through and improving the filter, I wanted a system that didn't require constant maintenance, and one which did not incur the risk of false positives.

    The solution was dead simple: my comment forms now have two sets of inputs for the comment and the commenter's info. The first one is hidden through CSS. The second one is visible.

    Real people see only the 2nd form. The spambots see the first one. If there's any data in the first form when it's posted, the comment is dropped on the floor with no filtering, no hits to the SQL database, no nothing.

    I haven't had a problem with spam since. Perhaps there'll be a day when the spambots are tuned for my site, but I've been spam-free for two years.
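
    The server-side half of the hidden-form trick is a single check made before any other work. A Python sketch, assuming the decoy inputs are named "website" and "comment_decoy" (hypothetical names; the real ones should look attractive to a bot):

```python
# Hypothetical names of the inputs that are hidden from humans via CSS.
HONEYPOT_FIELDS = ("website", "comment_decoy")


def is_spam_submission(form):
    """True if any hidden decoy field came back with content in it."""
    return any(form.get(name, "").strip() for name in HONEYPOT_FIELDS)
```

    Running this first means a tripped decoy costs no filtering and no database access, which matches the "dropped on the floor" behaviour described above.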

    --
    Sitting in my day care, the art is decopainted.
  63. LSL by Anonymous Coward · · Score: 0

    I always wanted to make a spam-filtering method based upon the age-verification questions in Leisure Suit Larry 1.

  64. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  65. Just make it easier to remove by somenickname · · Score: 1

    I moderate for the Ubuntu forums and even with hundreds of legitimate posts per day, we have no trouble keeping the spam count very low. The reason is that it's very easy to remove it. We use Spam Decimator for vBulletin and it's literally two clicks to go from "This is Spam" to "This is deleted and the IP has been banned". You probably can't prevent spam, but you can make your life easier by finding a way to deal with it so effectively that it becomes pointless for the spammers to spam.

  66. No Problem by waldoj · · Score: 1

    They'll have no trouble with it at all. I'm yet to encounter a screen reader that reads the name element of an input field. Perhaps it would if the field was otherwise devoid of any descriptive text, but you'd have to be a real jackass to provide an unlabeled input field and expect anybody to know what to do with it. :)

  67. Stupidfilter by Madsy · · Score: 1

    http://stupidfilter.org/
    Works for me.