Slashdot Mirror


HTML Encoded Captchas

rangeva writes to tell us about a twist he has developed on the common Captcha technique to discourage spam bots: HECs encode the Captcha image into HTML, thus presenting an unsolved challenge to the bots' programmers. From the writeup: "The Captcha is no longer an image and therefore not a resource they can download and process. The owner of the site can change the properties of the Captcha's HTML, making it unique,... add[ing] another layer of complication for the bot to crack." HECs are not exactly lightweight — the one on the linked page weighs in at 218K — but this GPL'd project seems like a nice advance on the state of the art.

3 of 177 comments

  1. I failed to see how this'll help by Rosco P. Coltrane · Score: 5, Interesting

    At the end of the day, this captcha is displayed on the screen as colorful, harder-to-read mumbo-jumbo, just like JPEG captchas, so all a bot has to do is use an HTML renderer to turn it into a regular image that can be processed. The added complication is linking one of the existing captcha decoders to a rendering engine such as Gecko, maybe half a day's work. Not exactly uncrackable...

    --
    "A door is what a dog is perpetually on the wrong side of" - Ogden Nash
  2. Broken by Kurayamino-X · Score: 5, Interesting

    All text-based captchas are broken. It doesn't matter how they're rendered: they're still a predefined set of characters that a bot can pick out eventually. Now, the "Click three kittens" captcha, that was fucking genius; no bot on the planet will be able to tell the difference between a kitten and a ham sandwich. Why isn't it being used? People seem to think obscuring text and making it harder for humans to read is a better idea than using something a computer will not be able to identify.

    --
    ...I got nothing.
  3. No need to download the image by lintux · Score: 5, Interesting

    There's no need to download the image. Look at the source. Somewhere it says:

    Now, just go to MD5Lookup.Com and convert that little "hidden" MD5Sum back to the original text:

    ad6ade8a0b6e2f748b80a390ff45cf31 - &NMTB

    Maybe the author should add some salt. :-)
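
The weakness lintux describes can be sketched in a few lines of Python. This is a minimal illustration, not the HEC project's actual code: the function names, the captcha alphabet, and the token layout are all assumptions made for the example. It shows why an unsalted MD5 of a short answer, sent to the browser, can be reversed by brute force or a lookup table, and one possible fix. A salt that is also sent to the client only defeats precomputed tables, since a short answer can still be brute-forced, so the sketch uses an HMAC keyed with a secret that stays on the server.

```python
import hashlib
import hmac
import itertools
import string

# Assumed captcha alphabet (hypothetical; the real project may differ).
ALPHABET = string.ascii_uppercase + string.digits + "&"


def unsalted_token(answer: str) -> str:
    """The weak scheme: a plain MD5 of the answer, exposed in the page source."""
    return hashlib.md5(answer.encode("ascii")).hexdigest()


def crack_unsalted(token: str, length: int, alphabet: str = ALPHABET):
    """Try every string of the given length until its MD5 matches the token."""
    for candidate in itertools.product(alphabet, repeat=length):
        guess = "".join(candidate)
        if unsalted_token(guess) == token:
            return guess
    return None


def keyed_token(answer: str, secret_key: bytes, nonce: bytes) -> str:
    """A sturdier scheme: HMAC-SHA256 keyed with a server-side secret.

    The nonce is per-challenge and may be public; the key must never
    be sent to the client.
    """
    message = nonce + answer.encode("ascii")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify(answer: str, token: str, secret_key: bytes, nonce: bytes) -> bool:
    """Server-side check, using a constant-time comparison."""
    return hmac.compare_digest(keyed_token(answer, secret_key, nonce), token)
```

On the server, secret_key would typically be generated once (for example with secrets.token_bytes(32)) and a fresh nonce issued per challenge. Without the key, a bot that reads the token from the page source cannot test its guesses offline.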