Next-Generation CAPTCHA Exploits the Semantic Gap

Posted by kdawson on Wednesday April 23, 2008 @12:03AM from the stand-and-identify dept.

captcha_fun writes "Researchers at Penn State have developed a patent-pending image-based CAPTCHA technology for next-generation computer authentication. A user is asked to pass two tests: (1) click the geometric center of an image within a composite image, and (2) annotate an image using a word selected from a list. These images shown to the users have fake colors, textures, and edges, based on a sequence of randomly-generated parameters. Computer vision and recognition algorithms, such as alipr, rely on original colors, textures, and shapes in order to interpret the semantic content of an image. Because of the endowed power of imagination, even without the correct color, texture, and shape information, humans can still pass the tests with ease. Until computers can 'imagine' what is missing from an image, robotic programs will be unable to pass these tests. The system is called IMAGINATION and you can try it out." This sounds promising given how broken current CAPTCHA technology is.

Too hard. by Whiney+Mac+Fanboy · 2008-04-23 00:03 · Score: 5, Insightful

The general public will not know what "geometric" means*.

This Captcha suffers from the same old problem. As Captchas get harder more humans will fail them.

*or annotate... or centre

--
There are shills on slashdot. Apparently, I'm one of them.
1. Re:Too hard. by MichaelSmith · 2008-04-23 00:23 · Score: 4, Interesting
  
  The general public will not know what "geometric" means*.
  
  This Captcha suffers from the same old problem. As Captchas get harder more humans will fail them.
  
  *or annotate... or centre

  Soon we will welcome computers to our online forums for their insightful, informative and interesting comments. The CAPTCHA will be there as an initial filter on the quality of posters. It will exclude stupid computers and stupid people.
  
  --
  http://michaelsmith.id.au
2. Re:Too hard. by Smidge204 · 2008-04-23 00:26 · Score: 5, Insightful
  
  Definitely the human's problem, although presumably if a human is smart enough to make it then a human is smart enough to figure it out...
  
  To be optimistic, I actually like to think of it the other way around:
  
  CAPTCHAs are providing a valuable evolutionary pressure on machine vision/artificial intelligence development!
  
  =Smidge=
curses... by Anonymous Coward · 2008-04-23 00:08 · Score: 4, Funny

It's already spotted that I am a computer and it won't even load.
worthless by tritonman · 2008-04-23 00:09 · Score: 5, Insightful

who needs to write CAPTCHA exploits when you can just hire 50 chinese kids for 3 cents per day to create email accounts and send spam out for you?
1. Re:worthless by Mipoti+Gusundar · 2008-04-23 00:16 · Score: 5, Funny
  
  you can just hire 50 chinese kids for 3 cents per day
  If is really being true that they can be cutting us under by fifety percents then fine hai-tech industry of my dear INDIA is doomed. Ah well, nice while was lasting. Perhaps my medical degree is being useful after all!
  
  --
  Will code for new sig.
Blind people? by tepples · 2008-04-23 00:09 · Score: 5, Insightful

As Captchas get harder more humans will fail them.
And as the population of the Internet grows, more blind and hard-of-sight people will be using the Internet, and they will fail visual tests deployed by web site operators who don't bother to deploy a decent audio test.
1. Re:Blind people? by Anonymous Coward · 2008-04-23 00:38 · Score: 5, Insightful
  
  Do we lament that the blind and h-o-s cannot drive?
  The difference is that the web consists mainly of textual information that blind people can use.
  The cost of being all-inclusive can be too high for some budgets. The same could be said for supporting minor browsers, such as Safari.
2. Re:Blind people? by csnydermvpsoft · 2008-04-23 00:52 · Score: 4, Insightful
  
  The blind are able to use braille displays and screen readers to access well-designed sites. The whole point of CAPTCHAs, however, is to have images that computers are unable to read. Accessible design and CAPTCHAs have exactly opposite goals.
  
  The Internet is becoming much too important to leave a significant amount of the population (pardon the pun) in the dark. We have the technology to help the blind navigate web sites independently. Unfortunately, CAPTCHAs are hindering much of that progress.
3. Re:Blind people? by Ngarrang · 2008-04-23 01:03 · Score: 5, Insightful
  
  csnydermvpsoft wrote, "The Internet is becoming much too important to leave a significant amount of the population (pardon the pun) in the dark. We have the technology to help the blind navigate web sites independently. Unfortunately, CAPTCHAs are hindering much of that progress."
  
  No, spammers are. The root problem of this "solution" is the spammers, who do not care about our personal feelings of privacy. They don't care that their messages cause everyone else's costs to rise.
  
  Without CAPTCHA technology, none of the web mailers would be usable, as they would all be blocked by every known blacklist.
  
  For this reason, I think the penalties for convicted spammers should be far higher than what they are now. Their actions are subverting the ease of use for a very large group of people.
  
  --
  Bearded Dragon
4. Re:Blind people? by iangoldby · 2008-04-23 01:16 · Score: 3, Insightful
  
  I don't know if it should be a concern. Do we lament that the blind and h-o-s cannot drive?
  I think that's a pretty outrageous attitude.
  
  Think about it. What is the cost of making a car that a blind person could drive? Prohibitive, I suspect. Given the current state of technology it may not be quite possible even (though we could pay for human chauffeurs if we were really determined).
  
  What's the cost of making a printed newspaper accessible to a blind person? Quite high, I suspect. The technology to read shapes on a page and convert them to something the blind person can read or listen to is not straightforward.
  
  What's the cost of a system that allows a blind person to access text stored electronically on a computer? Pretty-much negligible.
  
  The thing is, the web should be a superb medium for making its content accessible to practically everyone. The information is already in a form that computers can manipulate easily.
  
  If you use HTML as it was designed to be used, there is no additional cost in making it accessible.
  
  Come on people, this is not rocket science! Here we have a golden opportunity to make, for practically no additional cost, something that can be accessed by everyone. It's not like designing a driverless car, or backfitting access ramps and lifts to historic buildings. Why on earth wouldn't we do this?
  
  </rant>
5. Re:Blind people? by jackb_guppy · 2008-04-23 01:23 · Score: 4, Insightful
  
  CAPTCHAs are already dumping people with color issues: not blind, but unable to perceive color differences.
  
  Others are using letters / numbers that after distortion could be a, d, 9 or g, for example.
  
  Personally, I give a site two tries before I give up and dump them.
6. Re:Blind people? by Kam+Solusar · 2008-04-23 01:27 · Score: 5, Informative
  
  According to Wikipedia: In November 2004 article Magnitude and causes of visual impairment, the WHO estimated that in 2002 there were 161 million (about 2.6% of the world population) visually impaired people in the world, of whom 124 million (about 2%) had low vision and 37 million (about 0.6%) were blind.
  
  --
  The Angels have the Phone Box
7. Re:Blind people? by phoenixwade · 2008-04-23 01:45 · Score: 4, Interesting
  
  I don't know if it should be a concern. Do we lament that the blind and h-o-s cannot drive?
  I think that's a pretty outrageous attitude.
  {SNIPPED}
  What's the cost of a system that allows a blind person to access text stored electronically on a computer? Pretty-much negligible.
  Here is where you fail to understand the problem.
  First, creating content is not negligible in cost.
  Second, creating an interface to deliver the content is not negligible in cost.
  Third, actually delivering the content to the masses isn't negligible in cost either.
  Fourth, as has been pointed out in other comments and in the article, the problem involves the creation of a technology that will allow your audience to access the content/service you are providing, while simultaneously preventing the use of automated systems to exploit your services by appearing to be your audience (i.e. a Human), because the failure to do so means that you may lose the entire technology, or at the very least render it substantially less useful and more expensive. Email, for example, is only being used 5% of the time as intended, the other 95% being spam (As seen on /. recently)
  The thing is, the web should be a superb medium for making its content accessible to practically everyone. The information is already in a form that computers can manipulate easily.
  
  If you use HTML as it was designed to be used, there is no additional cost in making it accessible.
  AH! Now I understand! You are in the wrong conversation and didn't realize it.
  
  If you are using HTML only, the whole captcha debate is meaningless for you. HTML is designed for PUBLISHING information; captcha applies to web-based applications that HTML is only a SMALL part of. After all, the only interactive parts of HTML are the form elements. Since YOU aren't actually doing anything with the posted form information, YOU have no need for security and little to no need to verify that the entity on the other end of that pipe is a human, spider, or spambot.
  
  However, some of us do create applications that need to know this, because we want to provide services for actual humans, but do not want to provide another place for spambots to send out their crap.
  
  --
  A positive attitude may not solve all your problems, but it will annoy enough people to make it worth the effort.
8. Re:Blind people? by Bastard+of+Subhumani · 2008-04-23 02:20 · Score: 3, Funny
  
  The difference is that the web consists mainly of textual information that blind people can use.
  Only a blind person could be unaware that 99.99% of the intarwebs are composed of pr0n.
  
  --
  Only three things are certain; death, taxes, and apocryphal quotations - Ben Franklin.
9. Re:Blind people? by $rtbl_this · 2008-04-23 02:37 · Score: 5, Funny
  
  Oh, they're aware. How do you think most of them got to be blind?
  
  --
  "Are you being weird, or sarcastic?" said Emma. I said I didn't know because I get the two feelings mixed up.
10. Re:Blind people? by ultranova · 2008-04-23 03:16 · Score: 3, Insightful
  
  The blind are able to use braille displays and screen readers to access well-designed sites. The whole point of CAPTCHAs, however, is to have images that computers are unable to read. Accessible design and CAPTCHAs have exactly opposite goals.
  
  No, the point of a CAPTCHA is to have a test which a human can pass easily, but a computer can't. Most current CAPTCHAs are image-based, since that is simple to implement, but this is by no means a requirement.
  
  --
  Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
11. Re:Blind people? by nickos · 2008-04-23 04:52 · Score: 4, Informative
  
  I had the same problem, and I was able to solve it in 2 steps.
  
  1. Strip links from messages. The spammers are trying to game Google's (and other search engines') page ranking, and they can't do this if you don't allow them to post links. The incentive to spam your site has now gone.
  
  2. Insert some primitive captcha. In my case this was just a question asking the user to add 2 small numbers together. The reason this step was necessary was because despite implementing step 1, I was still getting a huge amount of automated spam from spam bots which didn't realise there was no point in spamming my site. Once a human spammer realises you've added captcha he'll come and have a look to see how easy it is to circumvent (very easy in my case). However after running a test personally he'll see there's no point and (hopefully) remove you from his list of sites to spam.
  
  Hope that helps anyone reading this...
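
A minimal sketch of the two steps described above, assuming a Python backend. The function names, the regular expression and the "[link removed]" placeholder are all hypothetical and not taken from the poster's site; the pattern is deliberately simple and would miss obfuscated links.

```python
import random
import re

# Matches http(s) URLs and bare "www." addresses (an illustrative pattern,
# not a complete link detector).
LINK_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def strip_links(message: str) -> str:
    """Step 1: replace anything that looks like a link with a placeholder."""
    return LINK_PATTERN.sub("[link removed]", message)


def make_question(rng: random.Random) -> tuple[str, int]:
    """Step 2: build a 'what is a + b' question and its expected answer."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"What is {a} + {b}?", a + b


def check_answer(submitted: str, expected: int) -> bool:
    """Accept the answer only if it parses as the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False
```

In use, the server would keep the expected answer in the session, show only the question text, and call `strip_links` on every message before storing it.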
Lyrical Response Mechanism by FurtiveGlancer · 2008-04-23 00:13 · Score: 5, Funny

Why don't we take a note from TV and have the user sing the missing lyrics of a classic hit. Even if they don't pass, it will make for much more fun around the computer, especially at the office.

--
Invenio via vel creo
1. Re:Lyrical Response Mechanism by CSMatt · 2008-04-23 00:30 · Score: 3, Funny
  
  Until the user gets subpoenaed by copyright holders.
  
  Then it will be hilarious.
2. Re:Lyrical Response Mechanism by Daimanta · 2008-04-23 00:32 · Score: 5, Funny
  
  I'll start. Finish this:
  
  "Never gonna give you up"...
  
  --
  Knowledge is power. Knowledge shared is power lost.
It's still trivially crackable. by Jason1729 · 2008-04-23 00:14 · Score: 5, Insightful

All they need to do is offer free porn to people who solve the captchas and embed the captcha in their site. It doesn't matter how sophisticated the test is or how hard it is for a machine to do it, they all have that fatal flaw.

Then there's also the option of paying Warcraft gold farmers to solve captchas and take a break from the game.
1. Re:It's still trivially crackable. by apoc.famine · 2008-04-23 04:22 · Score: 4, Interesting
  
  That was our solution to spambots on our small (12 active people or so) forum. We used very forum-specific questions to allow registration, and only registered users can post. If someone can't answer the questions, they aren't into the subject enough that we would want them there discussing it. Or they're a spammer, and don't know that the proper answer to the "what would you like to do to a spammer" question is the answer which is exceptionally painful.
  
  But really, as long as you have an authentication method which is significantly hard/unique, you'll be safe. Spamming is a "low hanging fruit" operation. Quantity over quality, 90% of the time. In fact, the answer to killing off spambots might very well be everyone designing their own authentication. Right now, there are a half-dozen major ones. Crack one, and you have access to millions of places. If instead there were thousands, the time required to break one would not necessarily be worth the money you could get from doing it.
  
  Our forums are not worth programming the automated bots to crack, so we're 100% spam free now, for the first time in a few years. It's not a hard authentication - just different from 99.9% of the rest of them. Hell, most people could answer "what color is this page", even if they had to look at the raw html and google the color hex. But for one page, it's not worth programming a bot to do. Unique authentication methods will kill spambots.
  
  --
  Velociraptor = Distiraptor / Timeraptor
Alternative... by martin_henry · 2008-04-23 00:20 · Score: 5, Informative

Alternative URL: http://wang.ist.psu.edu/docs/projects/imagination.html

--
www.purevolume.com/martyd
Stupid Captcha by Big+Smirk · 2008-04-23 00:20 · Score: 5, Insightful

Any captcha with multiple choice answers is not a good one. 20 choices? So the computer gets by 1/20 of the time. Hmmm, how many attempts does it take to get 1000 e-mail accounts?

As for "geometric center", note that all the images are rectangular. I haven't tried it, but writing a program to pull out all possible rectangles, sort them by size, and pick the center of one of the larger rectangles should do it.

Why not a captcha that works with Google? "Describe in one or two words what is in this picture", then use a Google-like search to match up the actual description with what the person typed. Person types "Dog", picture is a "Labrador Retriever": match.
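
The multiple-choice question above can be answered with a rough calculation. This is a sketch under stated assumptions: the bot guesses uniformly at random, retries are unlimited, and only the annotation step is counted (the click step is ignored). The function names are hypothetical.

```python
def expected_attempts(accounts: int, choices: int) -> int:
    """Expected number of random guesses needed to succeed `accounts` times."""
    # Each guess succeeds with probability 1/choices, so one success takes
    # `choices` guesses on average (the mean of a geometric distribution).
    return accounts * choices


def p_success_within(attempts: int, choices: int) -> float:
    """Probability that at least one of `attempts` random guesses is correct."""
    return 1.0 - (1.0 - 1.0 / choices) ** attempts


print(expected_attempts(1000, 20))  # 1000 accounts at 20 choices each
print(p_success_within(20, 20))     # chance of at least one hit in 20 guesses
```

Under these assumptions, 1000 accounts cost about 20,000 guesses on average, which is a small number for an automated client.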

--
TODO: create/find/steal funny sig.
Test site slashdotted... by thrill12 · 2008-04-23 00:26 · Score: 3, Informative

...but some more info here, as well as a (ugh) PowerPoint (http://wang.ist.psu.edu/imagination/imagination.ppt) and a user study with some samples.

--
Slashdot: stuff for news, nerds that matter, matter for news, stuff that nerd
Re:Twofo Ghey Niggers by CSMatt · 2008-04-23 00:27 · Score: 4, Funny

This just reaffirms the article's conviction that the CAPTCHA is broken.
Don't forget users of lynx by Nursie · 2008-04-23 00:30 · Score: 4, Interesting

It annoyed me mightily the day slashdot introduced captchas for comments when you weren't already logged in. And somehow broke the login process from lynx.

Lynx is the geek slacker's greatest tool, when run in an ssh session from your home server, not only is the traffic unloggable (except for "he's calling home a bit") but it even looks like work to the uninitiated.
1. Re:Don't forget users of lynx by mpeg4codec · 2008-04-23 04:00 · Score: 3, Insightful
  
  FWIW you don't need a dedicated HTTP proxy, as SSH has a built-in SOCKS proxy. Try it out some time: ssh -D 1080 remote.tld and configure your browser of choice to use SOCKS on localhost port 1080. For other apps that don't have native support for proxying, check out proxychains (on Unix). Not only great for browsing at work, but also a godsend for unsecured wireless nets.
Re:Illogical by Matje · 2008-04-23 01:02 · Score: 3, Insightful

If a computer could recognize the difference between human and computer generated speech, then it would know how to generate human sounding speech.
Bollocks. Why is this modded informative? You don't provide any backup for your claim.

It is imaginable to create a model that describes speech characteristics in general and computer speech characteristics in particular. Any sound sample could be compared with the two models. If it fits the wider speech model but not the computer speech model, then you would call it human speech. QED.

The ability to distinguish between two things does not imply that you'll be able to generate them effectively (unless the search space is very narrow). Imagine it this way: you can probably distinguish Chinese from Spanish. That does not imply you speak either language.
At least a part is Ineffective by Dracolytch · 2008-04-23 01:43 · Score: 4, Insightful

Ok, so I was able to do the image analysis one, where they take an image, muck with the color, draw a bunch of black lines over it, and then ask you to annotate it with a word from a list.

This is no better, and may be worse, than what we have now, for two reasons.

1) If you fill in the gaps programmatically, and then make the image grayscale, you probably have something you can use for image matching.

2) Much more severely: The interface reduces the number of possible answers by multiple orders of magnitude. For the one I saw I think there were 10 or 15 answers. Even if you kick image recognition to the curb and randomly choose an answer, you'll be right 1/15 times. It'd be trivial to write a program to harvest hundreds of accounts in a day by just picking random answers. Hand that off to a botnet or similar, and this becomes a minor speedbump.
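
Point 1 above (fill in the gaps, then go grayscale) might look roughly like the sketch below. It assumes the overlaid lines are pure black and that the image is a nested list of RGB tuples; both are simplifications for illustration, and nothing here is claimed to defeat the actual IMAGINATION images. The function names are hypothetical.

```python
Pixel = tuple[int, int, int]
BLACK: Pixel = (0, 0, 0)


def fill_black_lines(image: list[list[Pixel]]) -> list[list[Pixel]]:
    """Replace pure-black pixels with the average of their non-black neighbours."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != BLACK:
                continue
            # Collect the non-black pixels in the surrounding 3x3 window.
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if image[ny][nx] != BLACK
            ]
            if neighbours:
                result[y][x] = tuple(
                    sum(p[c] for p in neighbours) // len(neighbours)
                    for c in range(3)
                )
    return result


def to_grayscale(image: list[list[Pixel]]) -> list[list[int]]:
    """Convert RGB to luma using the ITU-R BT.601 weights."""
    return [
        [(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
        for row in image
    ]
```

A single pass only repairs lines one or two pixels wide; thicker lines would need repeated passes or a proper inpainting method.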

~D

--
This sig has been enciphered with a one-time pad. It could say almost anything.
Re:The real solution to captcha is OpenID. by giafly · 2008-04-23 02:04 · Score: 3, Insightful

How do you protect the sign-up page to get an OpenID? With a captcha?

--
Reduce, reuse, cycle
Solution: unproven users = limited access by davidwr · 2008-04-23 02:15 · Score: 5, Insightful

Wikipedia does this by restricting what new accounts and non-logged-in accounts can do.

If free mail servers put restrictions on what new accounts could do, with an override to anyone who is willing to go to a lot of trouble to prove they are human, it would short-circuit the spammer problem.

If Yahoo, Gmail, etc. all limited you to 10 outgoing mail recipients a day until you had both 1) had the service for 1 day and 2) replied to 10 messages, AND limited you to 100 outgoing mail recipients a day until you signed up to be a "high volume sender," it would cut most spammers off at the knees. Depending on the service, being a "high volume sender" may involve turning over a credit card number and may not be free. Some services may give "loyalty awards" to long-term customers by removing this restriction for people who have had their accounts for 6 months and show a heavy non-spammy ad-revenue-generating usage pattern.
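
The tiered limits proposed above could be expressed as a small policy function. This is a sketch only: the `Account` fields and function names are hypothetical, the thresholds are the ones given in the comment, and the six-month "loyalty" exemption is left out.

```python
from dataclasses import dataclass
from typing import Optional

NEW_ACCOUNT_LIMIT = 10    # recipients per day for unproven accounts
ESTABLISHED_LIMIT = 100   # recipients per day once the account has proven itself


@dataclass
class Account:
    age_days: float
    replies_sent: int
    high_volume_sender: bool = False


def daily_recipient_limit(account: Account) -> Optional[int]:
    """Return the daily recipient cap, or None if the account is uncapped."""
    if account.high_volume_sender:
        return None
    # Both conditions must hold before the higher tier applies.
    if account.age_days >= 1 and account.replies_sent >= 10:
        return ESTABLISHED_LIMIT
    return NEW_ACCOUNT_LIMIT


def may_send(account: Account, sent_today: int, new_recipients: int) -> bool:
    """Check whether a message to `new_recipients` more addresses is allowed."""
    limit = daily_recipient_limit(account)
    return limit is None or sent_today + new_recipients <= limit
```

For example, a two-hour-old account that has already mailed 8 recipients today would be refused a message addressed to 5 more.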

--
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Regarding your sig by spun · 2008-04-23 02:48 · Score: 3, Funny

I can already see how this is going to go.

"You stole my sig!"
"No I didn't."
"Yes you did, it's exactly the same as mine!"
"No it isn't."
"Yes it is!"
"No it isn't. Look, mine is in two lines."
"That hardly makes a difference."
"Yes it does!"
"No it doesn't."

--
- None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
Here's a reference implementation by gr8dude · 2008-04-23 02:57 · Score: 3, Funny

Which of the following would you most prefer?
- A: a puppy,
- B: a pretty flower from your sweety, or
- C: a large properly formatted data file?

--
The saddest poem
hotcaptcha by SCHecklerX · 2008-04-23 04:36 · Score: 4, Interesting

I like this better:

http://www.hotcaptcha.com/