Fallout From the Fall of CAPTCHAs


Posted by kdawson on Tuesday July 15, 2008 @09:05AM from the script-kiddie-fodder dept.

An anonymous reader recommends Computerworld's look at the rise and fall of CAPTCHAs, and at some of the ways bad guys are leveraging broken CAPTCHAs to ply their evil trade. "CAPTCHA used to be an easy and useful way for Web administrators to authenticate users. Now it's an easy and useful way for malware authors and spammers to do their dirty work. By January 2008, Yahoo Mail's CAPTCHA had been cracked. Gmail was ripped open soon thereafter. Hotmail's top got popped in April. And then things got bad. There are now programs available online (no, we will not tell you where) that automate CAPTCHA attacks. You don't need to have any cracking skills. All you need is a desire to spread spam, make anonymous online attacks against your enemies, propagate malware or, in general, be an online jerk. And it's not just free e-mail sites that can be made to suffer..."

31 of 413 comments

Cracaked CAPTHAs!!! oh no! by xpuppykickerx · 2008-07-15 09:08 · Score: 5, Interesting

I hate the fact that a computer can view these things better than I can. Lately, a lot of the CAPTCHAs have become unreadable by human viewers.
1. Re:Cracaked CAPTHAs!!! oh no! by Anders · 2008-07-15 09:16 · Score: 5, Insightful
  
  I hate the fact that a computer can view these things better than I can. Lately, a lot of the CAPTCHAs have become unreadable by human viewers.
  They don't view it better than you, they just do not get impatient from failing 4 out of 5 times.
Anyone usinging specialised tests? by niceone · 2008-07-15 09:09 · Score: 5, Interesting

Heh, at the end of the article they have a link to a site that requires you to solve a calculus problem to register (it gets easier if you reload the page a few times, down to simple arithmetic). I have a site that is only of interest to people who use verilog (a hardware design language). I've toyed with requiring some digital logic problem to be solved, but the volume of spam signups isn't big enough for me to be bothered yet...

Of course this solution isn't going to work for gmail - which seems to be the preferred email provider for the spam signups I do get these days.

--
ccalam - acoustic versions of new songs.
1. Re:Anyone usinging specialised tests? by jandrese · 2008-07-15 09:22 · Score: 5, Insightful
  
  The problem is that to set up that CAPTCHA you have to have a person sift through a huge picture archive of cats and dogs and mark each one. However, that limits the size of your CAPTCHA dictionary to however many entries a person can parse in a reasonable amount of time. This means the bad guys can sit down a person (or two, or ten) and go through all of your images to seed a database with the correct answers for their bots.
  
  --
  
  I read the internet for the articles.
2. Re:Anyone usinging specialised tests? by Lehk228 · 2008-07-15 09:23 · Score: 4, Insightful
  
  not really, unless the catalog is huge and you expect your legitimate users to be biologists. if there are even as many as 100 animals the script can just guess, and 1% of attempts get through. when thousands of bots are signing up simultaneously 1% is a whole lot of bots
  
  --
  Snowden and Manning are heroes.
3. Re:Anyone usinging specialised tests? by stomv · 2008-07-15 09:40 · Score: 4, Interesting
  
  what is the opposite of up?
  what day is after friday?
  what does seven plus three equal?
  what letter of the alphabet comes before d?
  how many wheels does a bicycle have?
  what is the third word of this sentence?
  These are generally difficult for computers to solve, can be programmed to have permutations, and since the quiz answer can be tied to the account, if a particular question or style is getting spammed frequently, it can be removed from the list of questions.
  It's an arms race, and this system won't work forever, but it's fairly easy to implement and fairly difficult to overcome.
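A minimal sketch of the "permutations" idea above, assuming three hypothetical question templates and a `disabled` set standing in for "removed from the list of questions". The template names, word lists, and function names are illustrative, not anything the poster described in detail.

```python
import random
import string

# Illustrative word lists; the templates and names below are hypothetical.
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five", "six",
                "seven", "eight", "nine", "ten"]
DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]


def addition_question(rng):
    # Accepts the digit form of the sum only (an assumption for brevity).
    a, b = rng.randint(0, 10), rng.randint(0, 10)
    question = f"what does {NUMBER_WORDS[a]} plus {NUMBER_WORDS[b]} equal?"
    return question, {str(a + b)}


def day_after_question(rng):
    i = rng.randrange(len(DAYS))
    return f"what day is after {DAYS[i]}?", {DAYS[(i + 1) % len(DAYS)]}


def letter_before_question(rng):
    i = rng.randint(1, 25)  # skip "a", which has no predecessor
    letters = string.ascii_lowercase
    question = f"what letter of the alphabet comes before {letters[i]}?"
    return question, {letters[i - 1]}


TEMPLATES = {
    "addition": addition_question,
    "day_after": day_after_question,
    "letter_before": letter_before_question,
}


def make_challenge(rng, disabled=frozenset()):
    """Pick an enabled style and return (style, question, accepted_answers)."""
    enabled = sorted(set(TEMPLATES) - set(disabled))
    style = rng.choice(enabled)
    question, answers = TEMPLATES[style](rng)
    return style, question, answers


def check_answer(answers, reply):
    """Compare a reply to the accepted answers, ignoring case and padding."""
    return reply.strip().lower() in answers
```

Storing the returned `style` with each signup is what would let a heavily spammed style be added to `disabled` later.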
  
  --
  Support a few technologists in Washington.
4. Re:Anyone usinging specialised tests? by Anonymous Coward · 2008-07-15 11:48 · Score: 5, Funny
  
  what is the third word of this sentence?
  No, it's the first.
Mix it up a bit? by Hektor_Troy · 2008-07-15 09:10 · Score: 4, Interesting

Combine it with a mix of simple math and image recognition? I.e.
"What colour hair does the (2+four)/3 girl from the left have?"
Hell, skip the math part if that's too easy.

--
We do not live in the 21st century. We live in the 20 second century.
1. Re:Mix it up a bit? by jandrese · 2008-07-15 09:19 · Score: 5, Insightful
  
  Computers are pretty good at math last time I checked. Asking for something that would require a full on AI to answer is good (the hair color part), but the problem is that it requires a human to seed the questions, which means they will be limited in number. If they're limited in number then the spammers will just go through and keep reloading the screen until they've seen all (or mostly all) of the answers and program their bot with the correct answers.
  
  CAPTCHAs need to be able to be generated algorithmically by a computer, but not answered by one, which is a surprisingly difficult problem. Anything that requires human intervention on the creation of each variation is doomed to fail because spammers have more free time than you do.
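A rough way to put a number on "keep reloading the screen until they've seen all of the answers", assuming each reload shows one question drawn uniformly at random from a fixed pool. Under that assumption this is the classic coupon collector calculation; the function name is made up for illustration.

```python
def expected_reloads_to_see_all(pool_size):
    """Expected uniform random draws needed to see every question at least once.

    Coupon collector result: N * (1 + 1/2 + ... + 1/N).
    """
    harmonic = sum(1.0 / k for k in range(1, pool_size + 1))
    return pool_size * harmonic
```

For a pool of 100 hand-written questions this comes to a little under 520 reloads on average, which is trivial for a script.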
  
  --
  
  I read the internet for the articles.
2. Re:Mix it up a bit? by jandrese · 2008-07-15 09:32 · Score: 5, Funny
  
  I can't wait until someone's daughter tries to make an account on Barbie's Horse Talk website and is presented with the following CAPTCHA:
  
  Prove that if a 3-manifold space has the additional property that each loop in the space can be continuously tightened to a point, then it is just a three-dimensional sphere.
  
  --
  
  I read the internet for the articles.
Captchas are only good for protecting cheap stuff. by nweaver · 2008-07-15 09:14 · Score: 5, Insightful

CAPTCHAs are only able to protect things worth $.0025, no matter how good they are. Simply because at about that price, you can pay humans to solve them for you.
Thus for preventing mail spam, it can work. But to prevent, say, bots from harvesting Ticketmaster, they will always fail, no matter how good they are.

--
Test your net with Netalyzr
Still useful by truthsearch · 2008-07-15 09:18 · Score: 4, Insightful

CAPTCHA is still useful for small to medium sites that aren't specifically targeted. Your average blog, for example, is only hit by random bots that try to get quick and easy posts. Only the largest sites like GMail need to find something better today.
For example, I use reCAPTCHA on DocForge to block the standard wiki spam bots. Since my site's not large enough to be under heavy attack very little gets through. Someday CAPTCHA may be so easy to break that everyone's at risk, but not today.

--
Developers: We can use your help.
The best part is.. by QuantumG · 2008-07-15 09:20 · Score: 4, Interesting

Spammers are cracking some of the hardest problems of AI research.
How can they do that, and yet all the great academic minds can't? Two things:
* funding
* a willingness to use "anything that works"
What's really scary is that, in the end, spamming may turn out to be an agent of good.

--
How we know is more important than what we know.
A dumb question: by AndGodSed · 2008-07-15 09:21 · Score: 4, Interesting

Howcome /. is so spam free?
Do the hackers just not care about us,
or:
is this like one of those "safe zones" where geeks and hackers can hang out as long as nobody asks or tells? (looks at guy to his left..."say is that a CAPTCHA in your pocket or are you just excited to be here...")

--
Seven Days with Ubuntu Unity
1. Re:A dumb question: by EkriirkE · 2008-07-15 09:30 · Score: 5, Informative
  
  a combo of requiring an account, and having to wait at least 30 seconds before writing a reply, plus moderation. However, the firehose is littered with spam ads...
  
  --
  from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
2. Re:A dumb question: by Kingrames · 2008-07-15 10:16 · Score: 5, Funny
  
  Howcome /. is so spam free?
  You must be new here.
  and blind.
  
  --
  If you can read this, I forgot to post anonymously.
fall of open email by drDugan · 2008-07-15 09:23 · Score: 4, Insightful

it is no wonder that the "under 25" crowd now says "myspace me" or "facebook me" and no longer use email. why would they?
in a globally connected world with several billion possible users - open email simply won't work much longer.
what we need are permission-based systems - ones in which people need permission before they can contact another person. it would eliminate spam entirely, by integrating whitelists into mail clients. because no one has built a system like this that leverages and extends existing email servers - private organizations leveraging social connections have moved in to fill the gap. sadly, because facebook messages and myspace messages are not built on an open standard - you have to go through those companies to contact people.
1. Re:fall of open email by TheLostSamurai · 2008-07-15 09:55 · Score: 5, Funny
  
  it is no wonder that the "under 25" crowd now says "myspace me" or "facebook me" and no longer use email. why would they?
  Whatever happened to giving someone your phone number and actually talking to them. I asked a girl for her number the other night and she gave me her myspace address. Thanks, but no thanks. At least make the effort and give me a fake phone number if you don't ever really want to talk to me again.
  
  --
  I am Jack's complete lack of surprise.
Just use by linhares · 2008-07-15 09:30 · Score: 5, Insightful

BONGARD PROBLEMS. No machine can crack them in at least 10 years time. And when one does, baby, we'll have genuine AI.
1. Re:Just use by BitHive · 2008-07-15 09:47 · Score: 4, Insightful
  
  Can you generate them algorithmically?
turing test by Anonymous Coward · 2008-07-15 09:37 · Score: 4, Funny

The first thing to actually pass the Turing test will probably be a spam-bot. Isn't that disgusting?
The Irony by techsoldaten · 2008-07-15 09:42 · Score: 4, Funny

The irony about this is that a CAPTCHA is a Turing test, a form of authentication designed to prove that a human is making the request. Given that some CAPTCHAs are rapidly becoming too hard for people to read, the outcomes of the tests are reversed - humans cannot win the test, only computers.
I have CAPTCHAs on my blog, but only deny posters who actually fill them in. Goes a long way to deterring spammers.
M
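Read literally (the post is modded Funny), this is a decoy-field check, often called a honeypot: the CAPTCHA box is bait, and anything that fills it in is rejected. A minimal sketch, assuming the submitted form arrives as a dict and the decoy is a hypothetical field named `captcha_answer`:

```python
def is_probably_bot(form, honeypot_field="captcha_answer"):
    """Treat any submission that fills in the decoy field as automated.

    `form` is assumed to map field names to submitted strings; the field
    name is hypothetical.
    """
    return bool(form.get(honeypot_field, "").strip())
```

Humans have to be told (or shown, by hiding the field) that the box should stay empty, otherwise they get rejected too.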
1. Re:The Irony by Telecommando · 2008-07-15 10:07 · Score: 4, Interesting
  
  Interesting.
  A few months ago I tried to post on a blog (sorry, I forget which one), entered the CAPTCHA and got a message that I was a suspected bot and my IP address was banned from posting for 48 hours.
  I went back and carefully read the terms of use (just above the posting window) and buried in the middle of the terms was the phrase, "Do not enter the captcha, instead enter the first three letters of the fifteenth word in the second paragraph followed by the third word after the eighth word in the first paragraph in all capital letters."
  A neat idea, but I suppose it won't be long before that one is cracked as well.
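The "it won't be long before that one is cracked" point is easy to believe, since the quoted rule is purely mechanical. A sketch of computing the expected answer from the terms text, assuming paragraphs are separated by blank lines, words are split on whitespace, and only the second part is capitalised (the quoted rule is ambiguous on that last point):

```python
def expected_answer(terms_text):
    """Apply the quoted rule to the terms-of-use text.

    Raises IndexError if a paragraph is shorter than the rule requires.
    """
    paragraphs = [p.split() for p in terms_text.split("\n\n") if p.strip()]
    first, second = paragraphs[0], paragraphs[1]
    # First three letters of the fifteenth word in the second paragraph.
    prefix = second[14][:3]
    # Third word after the eighth word in the first paragraph, i.e. the
    # eleventh word, in capitals.
    suffix = first[10].upper()
    return prefix + suffix
```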
  
  --
  Beta sux! Join the Slashcott! http://hardware.slashdot.org/comments.pl?sid=4760465&cid=46173047
On sites like gMail.. by bill_kress · 2008-07-15 09:43 · Score: 4, Insightful

On gMail some simple rules should suffice. Don't allow a brand-new account to send out more than a few (20?) emails a day. Make sure that most of the email varies. Make sure the account gets and reads email as well as sends it, and that the email is accessed.
The trick is, you keep rotating these measures and don't tell anyone just what they are. You don't automatically disable anyone who breaks the rules, you just hold on to any large number of similar messages until a human reviews them--possibly through some mechanism similar to the "picture matching game" where multiple people identify a message as spam.
If it's determined to be spam, never tell them you caught on, just stop email from that account from being sent, silently. Log the ip addresses and use them to help you identify other accounts from the same computer if possible.
You could also use the ip addresses to notify people that they are a spambot next time that IP address is used to look up something on any google service.
Wow, that's a broad action with a lot of chances for failure, but I bet it could be refined enough to work--and worst case failure isn't bad at all--just one time when you go to search google you get a warning page back instead of your search results.
Really this just takes some dedicated effort and creative thinking by a strong, creative engineer with some power within google (I know there are quite a few of those)
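A minimal sketch of two of the rules above: a daily send cap for new accounts, and silently dropping mail from accounts already judged to be spammers. It simplifies the post's "hold similar messages" rule to "hold anything over the cap", and the 30-day definition of "new", the class, and the function names are all assumptions.

```python
from dataclasses import dataclass, field

NEW_ACCOUNT_DAYS = 30        # assumed cutoff for a "brand-new" account
NEW_ACCOUNT_DAILY_CAP = 20   # the "(20?)" suggested in the post


@dataclass
class Account:
    age_days: int
    sent_today: int = 0
    flagged_as_spammer: bool = False
    held_for_review: list = field(default_factory=list)


def route_message(account, message):
    """Return the real disposition: "deliver", "hold", or "drop".

    The sender would be shown the same success message in every case, so a
    spammer never learns which rule fired.
    """
    if account.flagged_as_spammer:
        return "drop"
    is_new = account.age_days < NEW_ACCOUNT_DAYS
    if is_new and account.sent_today >= NEW_ACCOUNT_DAILY_CAP:
        # Over the cap: queue for human review instead of disabling the account.
        account.held_for_review.append(message)
        return "hold"
    account.sent_today += 1
    return "deliver"
```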
Misleading phrasing by merreborn · 2008-07-15 09:54 · Score: 4, Insightful

CAPTCHA used to be an easy and useful way for Web administrators to authenticate users. Now it's an easy and useful way for malware authors and spammers to do their dirty work
This misleadingly implies that CAPTCHA somehow enables spammers. On the contrary, broken CAPTCHA does not enable spammers to do anything they couldn't already do -- we're just back where we were before CAPTCHA.
And to be fair, CAPTCHA is still reducing the rate at which attackers are able to create accounts, keeping some smaller, less sophisticated players out of the game entirely, and protecting lower-value targets (e.g., most small-time bloggers with comment spam problems still see a drastic improvement when they set up CAPTCHA)
If everyone stopped using CAPTCHA, the spam problem would get noticeably worse.
Make Them Write by linuxpyro · 2008-07-15 10:00 · Score: 4, Funny

I've toyed with the idea of making users write a 500 word essay on a random topic. I would then send this to my high school English teacher, and if it got maybe a B or above I would consider it legit.

--
Saying "I'll probably get modded down for this" in a post is the best way to get it modded up.
Re:And they share better. by statusbar · 2008-07-15 10:09 · Score: 5, Interesting

The best way I've seen captchas get broken is by "free porn sites". The web site is what is cracking another captcha. When it gets a captcha to solve, it passes it to one of its "porn viewers" - "please type the word that this captcha says in order to prove you are old enough to view the porn". Then the porn is displayed and the bot running on the website has a potential solution made by a human to do its botting with.
This method will suffice to crack ANY CAPTCHA!
--jeffk++

--
ipv6 is my vpn
Offshoring CAPTCHA solving by Animats · 2008-07-15 10:18 · Score: 5, Informative

The spammers have a new solution to CAPTCHAs in place - offshore outsourcing. This has become a sizable operation. System status earlier today:
Current Status: Volumes are exceedingly high. -- Automatically dispatching more labor
Queued Captchas: 91
Total outsourced volume: 4564301
This service is integrated with Craigslist auto posting tools, allowing high-speed spamming of Craigslist. It's also used for other services, like obtaining GMail accounts.
Even Craigslist's callback-by-phone system is starting to crack. Temporary phone numbers for Craigslist verification, provided by marginal telephony providers, have dropped to $1.50 in bulk.
The overall effect of Craigslist's new protections is that the cost of spamming has gone up, enough to slow down the low-rent operators but not by enough to stop it.
As I've pointed out previously, Google plays a central role in this. Google's services provide a facade of anonymity for scammers to hide behind. GMail for anonymous mail, YouTube for anonymous infomercials, AdWords for anonymous advertising, Checkout for anonymous money transfer, and Blogger/Blogspot for anonymous redirectors to zombie machines are all valuable services for scammers and spammers. All those services are used heavily by Craigslist spammers.
Others have provided some of the same services, but the competing services had bad reputations. Anybody trying to do business via Hotmail just had to be phony. Many mail agents just block all Hotmail mail. Anyone running a business off of "freewebpage.org" probably wasn't someone you'd want to deal with. So you had some strong indications of lack of legitimacy there.
Google, though, still has a good reputation. The combination of Google's reputation and low customer standards offers a great opportunity for scammers, and they're taking it.
Re:And they share better. by encoderer · 2008-07-15 10:58 · Score: 5, Interesting

Absolutely correct.
I run a mid-sized web development shop. A few years ago we were doing mostly retail sites. Vanilla and boring but we worked it down to a science and had some really great "modules" that made these sites super profitable for us. Of course, everything has its seedy side and with retail it was SEO.
Everybody wanted it. About 80% of our customers were of the "Do whatever, just ideminfy me" stripe. (And these are established companies paying high 5-figures for these sites). We drew our own demarcation about what we would and wouldn't do. (Excessive Internal-link structure is OK, zombie sites are not).
Now most of our work is social networking.
We, too, followed the "rise" of CAPTCHA and we've been happy with our results. We always used a custom CAP for each site, and we tried to keep them relatively readable, being of the belief that making it too hard will only keep out Humans: If somebody wants to crack it, they will.
We still use them regularly. I noticed that about a year ago we actually had people begin to request them specifically. (Isn't that what Buffett said about the home mortgage mess? When the regular joe's started flipping houses, he knew it was over?)
Anyhoo, I think the real fault in CAP's is that they worked too well. They became too big of a target. Now, we try to mix and match a number of different techniques to identify humans.
Solutions range from dirt-simple: An input box named, say, "City" that has a label that reads "13 plus 8 equals:" or "What is the 3rd word on this page?"
To the more complex "what is the color of the front-door in this picture?"
We have a simple library we use for these things that pulls the questions (and, if applicable, the pics) from a Database of about 25,000 different turing tests.
The thing is, none of them are too complex. Any mediocre programmer could write an application to crack it. But your bot will probably never see that same exact question again, so it becomes irrelevant.
And, to tie it in to the parent, we chose this technique precisely because of what we learned from CAPs. Before there were software hacks, there was the "porn hack" and the "sweatshop labor hack."
In this case, when a bot hits the site, it's fairly difficult for it to even detect which item is the turing test. We auto-generate the location and even the name of the form field so it's always a bit different.
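A sketch of the dirt-simple end of that range, assuming the server keeps the generated challenge in the user's session and the submitted form arrives as a dict. The function names and the `f_` prefix are hypothetical; the point is only that the field name changes on every page load, so a bot cannot key on it.

```python
import random
import secrets


def make_math_challenge(rng=random):
    """Build one arithmetic question with a freshly generated field name."""
    a, b = rng.randint(1, 20), rng.randint(1, 20)
    return {
        "field_name": "f_" + secrets.token_hex(4),  # differs per page load
        "label": f"{a} plus {b} equals:",
        "answer": str(a + b),
    }


def verify(challenge, form):
    """Check the submitted form against the challenge stored for this user."""
    return form.get(challenge["field_name"], "").strip() == challenge["answer"]
```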
A good solution here... by encoderer · 2008-07-15 11:24 · Score: 4, Interesting

A good solution here is to include this as part of the turing test itself.
As I mentioned upthread, I'm a partner in a web dev shop. We do a lot of social networking (of course) and about a year ago we developed a utility to create just this type of turing test. For example, we'll have a picture, and ask the question "What is the color of the 3rd fish from the left?"
What we do, is we pair these tests on a page. We'll include a known test, like the one above. And we'll also show an unclassified image and we might ask "how many people are in this picture?"
There is no wrong answer for that test, and their answer is recorded. Soon, that same question will be asked for that same picture. As soon as it's confirmed 2 times, it gets classified as having n people. Soon after it would be displayed again asking "how many females are in this pic?" or "what color shirt is the person on the right wearing?"
When we created the app, the DB had about 5000 turing tests in it. We then attached a DB of about 100,000 images that were pre-classified but not to an extent that would allow us to write a test off it.
Now, after a year in use across a couple dozen moderately trafficked websites, we have nearly 25,000 turing tests. All 20,000 new tests have been created thru the technique I described above.
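The pairing scheme described above can be sketched as an agreement counter. This assumes the caller only records an answer when the same visitor passed the known test on that page, and it reads "confirmed 2 times" as two matching answers in total; the post does not say whether the first answer counts, so the threshold is a constant. Class and method names are illustrative.

```python
from collections import Counter, defaultdict

CONFIRMATIONS_NEEDED = 2  # assumed reading of "confirmed 2 times"


class UnclassifiedPool:
    """Collects answers for (image, question) pairs until enough agree."""

    def __init__(self):
        self._answers = defaultdict(Counter)
        self.confirmed = {}

    def record(self, image_id, question, answer):
        """Store one answer; return the confirmed answer, or None if pending."""
        key = (image_id, question)
        if key in self.confirmed:
            return self.confirmed[key]
        normalized = answer.strip().lower()
        self._answers[key][normalized] += 1
        if self._answers[key][normalized] >= CONFIRMATIONS_NEEDED:
            # Enough independent visitors agree: promote to a usable test.
            self.confirmed[key] = normalized
            return normalized
        return None
```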
The real reason we did it wasn't to save on some development costs. We could've hired temp workers and paid them $8 an hour to classify pictures.
We did it because I believe strongly that the key to simple turing tests like this is a large corpus of data. If a bot only encounters the same test once or twice EVER, then the problem becomes difficult to solve. This is like the ANTI-CAPTCHA.
CAPTCHA was all about taking a specific technique to its maximum extent: Challenge a computer system by taking a narrow field (OCR) and pushing it beyond the current state-of-the-art.
These tests are all about a general technique that's broad where CAPTCHA is just deep.
The only way to build a bot to solve each test in our DB would be to give it genuine intelligence. It would have to be capable of determining context, reference, connotation, image ID, etc.
As a programmer, if you say "Here's a captcha, write a program to solve it" I wouldn't know HOW, but I'd at least have an idea of where to begin.
Now, if you show me a picture with the turing test of "What object is in the hands of the 3rd woman from the left" ... well... i wouldn't know where to begin.
SEOs - Lying to Robots so Robots Lie to Humans by billstewart · 2008-07-15 12:03 · Score: 5, Interesting
Search Engines help humans find web pages that the humans might find interesting, and they do this by having robots spider the web looking for patterns. Search Engine Optimizers try to get humans to read their customers' web pages in three ways:
- Making it easy for the robots to find the content. Google's how-to page tells you pretty much everything you need to know, and it's not hard, but I guess there are companies who want to hire somebody to clean up their web page structure for them instead of doing the work themselves, or to tell their graphic designers to stop using complex Flash-based mouseover gesture interactions instead of simpler links and good indexing. Usually people who do that call themselves "consultants" or "web designers" instead of "SEOs", but not always.
- Helping their customers write more interesting web pages instead of boring ones. Usually people who do that call themselves "editors" or "content consultants" or whatever instead of "SEOs", but not always.
- Lying to the search engines' robots so that the customers' uninteresting-to-humans web pages match patterns that the robots identify as "interesting", so the robots will lie to humans about the interestingness of those pages. Sometimes this includes building link farms or generating vast reams of uninteresting content with popular keywords and ad banners or kiting millions of domain names. Usually people who do this call themselves "SEOs" or "Search Engine Optimization Consultants" instead of "lying scum polluting the Internet". But sometimes they pretend to be something else, like "Advertising specialists" or whatever.
--

Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks