Slashdot Mirror


Web Users Angered by Anti-Spam 'Captcha'

Carl Bialik from WSJ writes "Captchas -- the jumbles of letters that users must type to gain access to some websites -- are a growing irritation, the Wall Street Journal reports. But programmers hope to make new variations that are both easier to decipher and harder to crack. From the article: 'Some captchas have been solved with more than 90% accuracy by scientists specializing in computer vision research at the University of California, Berkeley, and elsewhere. Hobbyists also regularly write code to solve captchas on commercial sites with a high degree of accuracy. ... Henry Baird, a professor of computer science at Lehigh University who studies PC users' responses to the codes, has been working with colleagues to develop new generations of captchas that are designed to be easier on humans but baffling for computers.'"

7 of 267 comments

  1. Image Key Sets & Dynamic Captchas by eldavojohn · · Score: 4, Informative

    I had heard once of a very cunning strategy around captchas. I'm not sure if this is true but there is a story of a p0rn site making large sums of cash by selling key sets to the images. Certain sites would not dynamically generate images but instead rely on sets of images with protected keys as a captcha.

    In order to use the p0rn site he ran, you had to either pay money or spend time identifying captchas. He would then store the answers in a database and match each one up with a checksum of the image. When he had completed a site's captcha key set, he would sell these lookup tables to anyone with money.

    All they then had to do was write their program to do a checksum of the image (or the image itself if he had stored it) and then plug the word from the database into the page for verification.
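
    A rough sketch of that kind of checksum lookup, purely for illustration. The function names, the choice of SHA-256, and the stand-in image bytes are all assumptions; the story does not say which checksum was actually used.

```python
import hashlib

def image_checksum(image_bytes: bytes) -> str:
    # Assumption: SHA-256 stands in for whatever checksum was really used.
    return hashlib.sha256(image_bytes).hexdigest()

def build_lookup(solved_pairs):
    # solved_pairs: iterable of (image_bytes, word) collected from human solvers.
    return {image_checksum(img): word for img, word in solved_pairs}

def solve(lookup, image_bytes):
    # Returns the stored word, or None if this exact image was never seen.
    return lookup.get(image_checksum(image_bytes))

# Hypothetical stand-in data; real entries would be the raw image files.
solved_pairs = [(b"fake-image-bytes-1", "kitten"), (b"fake-image-bytes-2", "walrus")]
lookup = build_lookup(solved_pairs)
```

    This only works while the site serves byte-identical images, since any change to the file changes the checksum.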

    With the introduction of splashers that spatter the statically stored images with lines or dots, the image itself is stored and something like an edit distance is applied to it to find the closest match. Once that is accomplished, the keyword is looked up in the database. Turn up the splasher and you risk the user not being able to figure out the word.
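
    A minimal sketch of the closest-match idea, assuming the images are reduced to equal-length 1-bit pixel buffers. Hamming distance is swapped in here as the simplest "something like an edit distance"; the names and the threshold are hypothetical.

```python
def hamming(a: bytes, b: bytes) -> int:
    # Number of differing bits between two equal-length pixel buffers.
    if len(a) != len(b):
        raise ValueError("buffers must be the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def closest_word(stored, noisy, max_distance):
    # stored: list of (pixel_bytes, word). Returns the word of the nearest
    # stored image, or None if nothing is within max_distance bits.
    best_word, best_dist = None, max_distance + 1
    for pixels, word in stored:
        dist = hamming(pixels, noisy)
        if dist < best_dist:
            best_word, best_dist = word, dist
    return best_word

# Hypothetical 16-pixel "images" packed into two bytes each.
stored = [
    (bytes([0b11110000, 0b00001111]), "kitten"),
    (bytes([0b10101010, 0b01010101]), "walrus"),
]
```

    The threshold is the tradeoff in miniature: more noise pushes the true match past max_distance, but it also pushes the human past their patience.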

    It seems that evil always finds a way. This is why captchas should always be dynamically generated on the fly from a very large dictionary! Check out Securimage for PHP.
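
    For contrast, a sketch of generating the challenge per request instead of serving a fixed image set. This is not Securimage's API; the word list, function names, and in-memory store are assumptions, and the image rendering step is left out.

```python
import secrets

# Stand-in word list; the point above is that the real one should be very large.
WORDS = ["kitten", "walrus", "lantern", "orchard"]

def new_challenge(pending):
    # Pick a fresh word per request and remember it under a random token.
    token = secrets.token_urlsafe(16)
    word = secrets.choice(WORDS)
    pending[token] = word
    return token, word  # the word would be rendered into a distorted image

def verify(pending, token, answer):
    # Each token is good for one attempt, so a solved pair cannot be replayed.
    expected = pending.pop(token, None)
    if expected is None:
        return False
    return secrets.compare_digest(expected.encode(), answer.encode())
```

    With a fresh rendering every time there is no fixed image to checksum, so a lookup table like the one described above has nothing to key on.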

    --
    My work here is dung.
  2. News for Nerds? by Silver+Sloth · · Score: 3, Informative
    There's not much here. It's written in the WSJ, which means it's in language that my mum would understand and has precious little in the way of hard facts. For those who can't be bothered to RTFA:
    1. There are things called 'Captchas'
    2. People don't like them
    3. Computers are getting better at cracking them
    4. Some boffins are trying to make new ones which people like and computers don't
    Really, that's all there is.
    --
    init 11 - for when you need that edge.
  3. 20% error rate by JohnGrahamCumming · · Score: 2, Informative

    One of the things that I'm watching in the error logs of SpamOrHam (a web site where volunteers sort messages into spam and ham) is the error rate on the CAPTCHA used. Ignoring what appear to be automated attempts to brute-force the CAPTCHA, I see an error rate of around 20% across hundreds of thousands of CAPTCHAs.

    That's amazingly high. 1 in 5 CAPTCHAs is entered incorrectly by humans doing their best to do the right thing.

    No wonder people get mad at them.

    John.

  4. Re:Easy: Real Life Objects or Critters by Anonymous Coward · · Score: 1, Informative

    It is called ESP-PIX (http://gs264.sp.cs.cmu.edu/cgi-bin/esp-pix). You can take a look at this (http://www.captcha.net/captchas/pix/) for more information.

  5. Re:What? by deesine · · Score: 4, Informative

    What gets me is the inconsistent use of case sensitivity. About 20-30% fail for me because of this.
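
    A site that wants to take case out of the picture could compare answers along these lines. This is only a sketch; the function name is made up, and ignoring surrounding whitespace is an extra assumption.

```python
def answers_match(expected: str, typed: str) -> bool:
    # Compare case-insensitively and ignore stray whitespace around the answer.
    return expected.strip().casefold() == typed.strip().casefold()
```

    Dropping case shrinks the space of possible answers, so it trades a little strength for fewer failed humans.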

    --
    damaged by dogma
  6. Re:How ironic... by Anonymous Coward · · Score: 1, Informative

    Downloads are protected by CAPTCHAs not to prevent spam bots, but to prevent "bandwidth-stealing" sites from cashing in on advertising by providing direct links to files on your server. They are undeniably rather effective at that, too; it may be trivial to defeat the technical challenge, but it's not easy to then forward the resulting file to a user without using the bandwidth to send the entire file (in which case the exercise is fundamentally pointless).

  7. Re:The human factor by Anonymous Coward · · Score: 1, Informative

    If I wanted to be really sadistic, I could instead present site readers with a sentence, in which they have to fill in either "their," "there," or "they're."

    The game Kingdom of Loathing uses that as a test - if you don't pass the test, you aren't allowed into the chat room.