Slashdot Mirror


Block Spam Bots With Free CAPTCHA Service

Chirag Mehta writes "I just released a freeware service called BotBlock (barebones demo) that lets site owners copy/paste a few lines of PHP code and insert a CAPTCHA image-verification system into any web form. The amount of form spamming by bots is on the rise. While remedies exist for MT blogs, a more efficient solution is to use image-verification or text-identification. Used for a while by sites like Yahoo! (scroll to bottom) and Hotmail, and patented in 2001 by AltaVista, CAPTCHAs are now being used more widely. PARC also came up with two algorithms, Baffletext and Pessimal Print. The technology has always existed, but until now it required site owners to install image libraries and understand how to generate images that cannot be OCR'ed. With BotBlock it is like inserting a page counter."

13 of 56 comments

  1. What about blind people? by FattMattP · · Score: 4, Interesting

    What about people who are blind or visually impaired? Does your implementation take that into account?

    --
    Prevent email address forgery. Publish SPF records for y
    1. Re:What about blind people? by Phoenix+Dreamscape · · Score: 2, Informative

      They have one that generates sounds. You're in trouble if you're blind and deaf, though.

    2. Re:What about blind people? by Glass+of+Water · · Score: 5, Interesting

      What they should do is use a question, written out in regular HTML text that is easy for a human to answer but hard for a computer. Example: What color is the sky on a cloudless day? Another example: My name is Joe Frank Smith. What are my initials?

      Think those are easy for basic AI bots? Then try them with one of the existing online bots.

      Seems like the problem with this (as opposed to generating pictures) is that it's hard to generate question/answer pairs where there is a one-word or obvious single answer. You don't want to use yes/no questions or questions where the answer is a word in the question ("Which is heavier, lead or cotton?").

      --
      There are no trolls. There are no trees out here.
    3. Re:What about blind people? by Jerf · · Score: 2, Interesting

      What they should do is use a question, written out in regular HTML text that is easy for a human to answer but hard for a computer. Example: What color is the sky on a cloudless day?

      I'm afraid I'd have to recommend against using that question for blind people.

      Might want to pick your examples a bit more carefully ;-)

      (Not that it's absolutely impossible they'd know the answer, but it's mere meaningless trivia to someone who has been blind from birth; I don't think I'd remember it.)

      Think those are easy for basic AI bots?

      Remember, you're not going up against the bots, you're going up against the bots as a proxy for a spammer. If you create a pattern "My name is $random_first $random_middle $random_last. What are my initials?" then the answer is something like

      perl -pe 's/My name is (\w)\w* (\w)\w* (\w)\w*\. What are my initials\?/$1$2$3/g'

      (Try it on your question. Be sure to type the question precisely.)
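
      A rough Python equivalent of the one-liner above, as a sketch only. The function name answer_initials and the sample question are made up for illustration; the point is just that a fixed template falls to a single pattern.

```python
import re

# Pattern for the hypothetical template
# "My name is <first> <middle> <last>. What are my initials?"
TEMPLATE = re.compile(
    r"My name is (\w)\w* (\w)\w* (\w)\w*\. What are my initials\?"
)


def answer_initials(question):
    """Return the initials if the question fits the template, else None."""
    match = TEMPLATE.fullmatch(question.strip())
    if match is None:
        return None
    # Each captured group is the first letter of one name.
    return "".join(match.groups())


print(answer_initials("My name is Joe Frank Smith. What are my initials?"))
```

      Any question that does not fit the template returns None, which is exactly the case a spammer would have flagged for a human to look at.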

      Now you're back in an arms race against the spammers; the whole point is to avoid the arms race in the first place.

      BTW, before criticising this 'solution', be sure you understand what an arms race is. I know you could further obfuscate it. But you could also further de-obfuscate it. And believe me, with a halfway intelligent system I can keep pace with you; for instance, if I write my cheating spammer so it brings things to my attention in real time as it can't figure them out, I can build a solution bank pretty quickly, not quite as quickly as you can create new challenges (well, maybe, if I'm better than the challenge writer), but certainly faster than you could deploy the new challenges. If you're not bypassing the arms race entirely, you're not winning, you're losing long term.

      This is a common failing of understanding when thinking about these technologies. You're not going up against a machine, you're going up against an augmented human. (It's why I still think Bayesian filtering will fail eventually, too; the spammers can augment themselves with the same technology, fortunately they just haven't correctly figured it out yet. The clock is probably ticking, though.)

    4. Re:What about blind people? by herrvinny · · Score: 4, Insightful

      The problem is, generating all those sentences. The sentences have to vary, they can't all be: My name is Barney Big Purple Dinosaur. What are my initials? My name is Einstein Mozart Bach Quartet. What are my initials? Then a spammer could just use regular expressions to handle that. Even Java introduced an easy-to-use regex package a few versions ago. Another problem is, you would have to generate literally billions of them, because a spammer may theoretically just hit a service with billions of requests - who's to say that the requests are real or not? And then the ultimate problem: How are we going to generate all these questions? A computer, of course, but the problem is again, how does a computer generate billions of these things so only a human and not a computer can interpret it? At that point, you're approaching true AI. And if we had AI, forget the spam problem: Just have the AI process each and every email.

  2. much better by capoccia · · Score: 2, Informative

    much better than blacklists and captcha is a Bayesian filter.

    blacklists are inaccurate: blacklisted words can be misspelled and pass through.

    captcha discriminates against the disabled and cuts them off from online discussions.

    James Seng has crafted a good Bayesian filter for Movable Type.
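
    The blacklist weakness mentioned above can be sketched in a few lines of Python. This is an illustration under assumptions: the word list and the is_blocked helper are hypothetical and not taken from any real filter.

```python
BLACKLIST = {"casino", "poker"}  # hypothetical blocked words


def is_blocked(comment):
    """Flag a comment if any whitespace-separated word is blacklisted."""
    words = comment.lower().split()
    # Strip trailing punctuation so "casino!" still matches "casino".
    return any(word.strip(".,!?") in BLACKLIST for word in words)


print(is_blocked("Best casino deals!"))  # exact spelling is caught
print(is_blocked("Best cas1no deals!"))  # one swapped character slips through
```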

  3. okay class, pencils down by Phoenix+Dreamscape · · Score: 2, Interesting

    Some of the examples on their site take a lot more time and mental effort than just looking at a word and typing it. I would be very bothered if I had to take one of those little tests just to fill out a form.

  4. Blatant Plug by gavinroy · · Score: 2, Informative

    For my GPL'ed PHP Captcha software:

    http://sourceforge.net/projects/session-captcha/

  5. Patented? by orthogonal · · Score: 2, Interesting

    patented in 2001 by AltaVista

    If AltaVista patented it, does BotBlock license the patent? Or will this service be rather short-lived?

  6. I'm neither blind nor deaf, but... by jcwren · · Score: 2, Interesting

    ...the images here are absolutely unreadable. If I had to use this to subscribe to a site or forum, or fill out a form, I'd just say "screw it", and wander on down the 'net.

  7. Unique CAPTCHA Implementation by madstork2000 · · Score: 2, Informative

    I'm working on another version, which I believe is unique at this point. (At least I didn't find anything like it on Google a few weeks ago).

    See a sample at the link below. (DISCLAIMER: This site is a small self-run hosting company, and has "sales" links, and is of commercial nature. So if you're going to get all pissed off because I am trying to feed my kids please do not click through. The sample does not collect or log anything outside of what Apache routinely collects.) http://webshowhost.com/main.php?smPID=PHP::ui_human_verify.php&caseFlag=SAMPLE

    What makes this implementation unique is that the user must identify both color and characters in the pattern. It combines multiple levels of recognition. The user must understand the concept of COLOR and the characters. This should make it particularly difficult for SPAM bots to decipher, since color is very subjective. I am posting this here mainly to establish prior art (as I have not seen any test use these concepts before) in case some joker tries to patent this variety of CAPTCHA.

    My variety integrates into a toolkit I've developed, but basically uses ImageMagick montage to fuse pre-rendered image bitmaps into a single JPEG.

    It is obviously weak in the sense that it discriminates against blind folks and illiterate folks. On the bright side it has definitely eliminated ALL of my spam!

    If you're interested in this contact me at captcha1@webshowpro.com ** Note you'll have to verify yourself with the prototype system to send mail to that account.

    I'll do my best to provide you with the relevant code. I don't have time at this point to lead a project (as my company is a one-man show barely scraping by at this point). So my apologies in advance if I cannot support the code to your satisfaction.

    1. Re:Unique CAPTCHA Implementation by Carnildo · · Score: 3, Insightful

      A few things to keep in mind:
      1) Colorblind people (10% of the male population of the world). By far the most common form of colorblindness is red/green, so as long as you stick with easily-distinguished colors like black, red, and blue, you should be fine. You could probably add yellow and a medium grey to the mix, but yellow can be hard for normal people to read, and on some monitors, grey can be mistaken for black.
      2) Increase the overlapping of the characters a bit. Right now, the characters can usually be separated out by color into three images, at which point a spambot can simply pick the one that matches the color of the instruction image.
      3) You can make an audio CAPTCHA harder for computers to recognize by adding noise to the sound, or by using recordings of a person with a strong accent (or better still, a variety of accents)
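
      Point 2 can be sketched roughly as follows. This assumes a toy image stored as a grid of color names rather than real pixels; split_by_color and the sample grid are hypothetical and only meant to show why non-overlapping colored characters are easy to pull apart.

```python
def split_by_color(image, background="white"):
    """Split a rectangular grid of color names into one binary mask per color."""
    layers = {}
    for y, row in enumerate(image):
        for x, color in enumerate(row):
            if color == background:
                continue
            # Create an all-zero mask the first time a color is seen.
            mask = layers.setdefault(color, [[0] * len(row) for _ in image])
            mask[y][x] = 1
    return layers


# Toy 2x3 "image": a red stroke on the left, a blue stroke on the right.
image = [
    ["red", "white", "blue"],
    ["red", "blue", "white"],
]

layers = split_by_color(image)
# A bot would keep only the layer named by the instruction image.
print(layers["red"])
```

      When characters overlap heavily, the pixels of one glyph hide parts of another, so each per-color mask comes out incomplete and is harder to read in isolation.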

      --
      "They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
  8. Not a perfect solution by Eric+Savage · · Score: 3, Insightful

    Even if you had an image that was 0% readable by OCR, image verification only stops "pure bot" spamming. It does not stop someone writing a helper or proxy app that presents them with a list of 1000 images that they type out in a very efficient manner. This could mean the difference between a million and a thousand spams per hour, but that's still a thousand spams per hour. And if you dismiss this as something that nobody would bother to do, you obviously don't know anything about spammers...

    --
    This is not the greatest sig in the world, this is just a tribute.