Spammers Learn to Outsource Their Captcha Needs
lukeknipe writes "Guardian Unlimited reporter Charles Arthur speaks with a spammer, discussing the possibility that his colleagues may be paying people in developing countries to fill in captchas. In his report, Arthur discusses Nicholas Negroponte's gift of hand-powered laptops to developing nations and the wide array of troubles that could arise as the world's exploitable poor go online." From the article: "I've no doubt it will radically alter the life of many in the developing world for the better. I also expect that once a few have got into the hands of people aching to make a dollar, with time on their hands and an internet connection provided one way or another, we'll see a significant rise in captcha-solved spam. But, as my spammer contact pointed out, it's nothing personal. You have to understand: it's just business."
Cory Doctorow wrote some time ago about an unbeatable way to solve captchas: connect the captcha-circumventing bot to a free porn site, inline the captcha images in the gateway pages to the photos and videos, and have the porn-seekers gain access by solving them. The spammers would have the same infrastructure they would need if they used developing-world click-workers, without the hassle of having to arrange payments.
http://barrapunto.com/ - News for nerds, en español
I'm sure there are ways of defeating that at the CAPTCHA server level. Generate a brand new image every time, and send it out along with a cookie. The cookie is a database key which refers to the CAPTCHA solution; the record also contains the timestamp when the image was generated and the IP address to which it was sent. (NOT the MD5 of the solution: anyone can generate an MD5 for any word and send that as the cookie contents with their word as the answer, effectively bypassing the image altogether.) The answer must not only be correct; it must also come from the same IP address that received the image, and within a reasonable time limit. IP addresses cannot be forged (or else the server would be speaking to the wrong client) and nor can timestamps (which come from the server anyway), so this ought to be fairly robust. Checking the referrer won't help, because referrers can be forged.
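For what it's worth, here is a rough Python sketch of the scheme described above: the cookie is only a random database key, and the record behind it holds the solution, the timestamp and the IP address. The names (`CaptchaStore`, `issue`, `verify`), the in-memory dict standing in for the database, the 300-second limit and the single-use rule are all my own assumptions for illustration, not any particular CAPTCHA package's API.

```python
import secrets
import time
from dataclasses import dataclass

CAPTCHA_TTL_SECONDS = 300  # assumed "reasonable time limit"; tune to taste


@dataclass
class CaptchaRecord:
    solution: str     # the text shown in the image, kept server-side only
    issued_at: float  # server clock, so the client cannot forge it
    client_ip: str    # address the image was sent to


class CaptchaStore:
    """Hypothetical in-memory stand-in for the database table."""

    def __init__(self, ttl=CAPTCHA_TTL_SECONDS, clock=time.time):
        self._records = {}
        self._ttl = ttl
        self._clock = clock

    def issue(self, solution, client_ip):
        # The cookie value is a random key, NOT a hash of the solution.
        token = secrets.token_urlsafe(16)
        self._records[token] = CaptchaRecord(solution, self._clock(), client_ip)
        return token

    def verify(self, token, answer, client_ip):
        # pop() makes each challenge single-use (an extra assumption).
        record = self._records.pop(token, None)
        if record is None:
            return False  # unknown or already-used key
        if self._clock() - record.issued_at > self._ttl:
            return False  # answered too late
        if client_ip != record.client_ip:
            return False  # answer came from a different address
        return secrets.compare_digest(answer.encode(), record.solution.encode())
```

Usage would be something like `token = store.issue("kitten", request_ip)` when serving the image, then `store.verify(cookie_value, typed_answer, request_ip)` when the form comes back.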
The CAPTCHA image and question themselves need some thought as well. Just having a person type some "distorted" text verbatim is a bit naive IMHO, because it's vulnerable to OCR. Requiring the user to change the order or capitalisation ("type this backwards in all lower case") would be a good start, but there are plenty more techniques involving pictures that only a human being will be able to use; and you can possibly even set a knowledge barrier (by using challenges that will be easy for people in your chosen field but not for random idiots) to keep out undesirables.
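As a toy illustration of the "type this backwards in all lower case" idea, the server-side check might look like the Python below. The function names are made up for this sketch, and it only shows the answer check, not the image generation; how much it actually slows down a bot is an open question.

```python
def expected_answer(displayed_text: str) -> str:
    """Answer for a 'type this backwards in all lower case' challenge."""
    return displayed_text[::-1].lower()


def check_answer(displayed_text: str, answer: str) -> bool:
    # Ignore stray whitespace around what the user typed.
    return answer.strip() == expected_answer(displayed_text)
```

So if the image shows `XkQ7b`, the user has to type `b7qkx`; typing the text verbatim fails.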
Je fume. Tu fumes. Nous fûmes!
Come on! Remember the usual "Don't teach the poor to read, that would make them a threat"? This all sounds like "don't give the poor any access to the internet, they could become a threat". And for god's sake, it's not like captchas are at all difficult for a program to beat.
I administer a site with a vBulletin forum, and every once in a while a bot posts messages, even though registration requires passing a captcha. In fact, I decided to just remove the captcha: it was seriously not helping stop the spam and was just making the registration harder FOR HUMANS.
BTW: I noticed that Russian bots are more likely to beat captchas.
Copyright infringement is "piracy" in the same way DRM is "consumer rape"
This is exactly how most session-based CAPTCHAs work. The timestamp idea is unworkable: it doesn't take long for data to be ferried halfway across the world, so if you implement a short timeout you'll end up pissing off your legitimate users as well as thwarting spammers, and if you make the timeout longer it becomes completely ineffective. What I'm saying is that it takes a spammer as long to type a captcha as it does a legitimate user.
Stuff like "type this backwards in lower case" won't help *in the least* - it'd be trivial to get past, as trivial as writing a bot to collect email addresses, and we know how many of those there are.
Checking the IP address won't work (unfortunately) because certain ISPs (*cough*AOL*cough*) use multiple outgoing IPs for the same user; it's ridiculous but there you have it.
In any case, IP addresses can be forged; the spammer doesn't need to receive a response, he just needs to send his CAPTCHA and spam message; if he's on 4.3.2.1 and needs to send from 1.2.3.4 then he will - the server's "yes you got it" response will be sent to 1.2.3.4 but the spammer doesn't care; his spam has got through.
In short, there is no server-side way of preventing a captcha from being relayed to/from a 'processor', be it OCR or human.
However, what needs to be remembered is that in 95% of cases, any type of captcha will stop 100% of spam. Most captchas out there are pitifully weak in terms of OCR resistance, have implementation bugs coming out of their *ahem* and 'in principle' offer no security whatsoever, but they work because most spammers are only after the low-hanging fruit.
Actually, I doubt you would actually beat one. Not meant as an insult, but I believe that you don't have what it takes. If you had, you'd already be either in jail, or a CEO, or chief of marketing, or in various other positions suited to people able to think "it's just business" when harming others. Or in his place, making a good living sending spam and 419 mails.
See, most people are quite able to speak and cheer about beating others up, killing others, war, etc., as long as it's just talking. They might even actually do it, if a fit of rage disables their sanity for long enough. But fits of rage aren't something you can plan and execute whenever you wish. And otherwise, when you actually have to do it, there's this interlock against harming other humans. It's partially "what if it was me in his shoes" education (even if you logically know you would never be in his place spamming) and partially that interlock most animals have against harming their own more than strictly necessary. (Even when cats or dogs fight their own, there is always a mechanism to signal "I give up", and the other _will_ cease.)
It's a strange world, really. The same people who could be shaking a fist and screaming for war against X at the top of their lungs, would actually have trouble looking one of X in the eyes and squeezing the trigger. A lot of PTSD cases in war aren't just people getting shocked by being shot at, but shocked by having shot other humans.
There is one category that can cheerfully think "it's only business": the sociopaths. They live in a strange world in which the others are NPCs: the others don't matter, they're not the same, "it could be me in his shoes" doesn't apply, etc. They can lie, cheat, murder, torture, whatever, and be perfectly able to look themselves in the mirror after it. Because the other guy didn't matter.
And, sad to say, if you weren't born one, I doubt you could actually beat this guy up in cold blood. If anyone gave you a baseball bat and this guy tied to a chair, you just couldn't actually do it.
And it's probably better that way. I'm thinking we as a society would do better to just start recognizing sociopaths for what they are, and the damage they can do. This guy, for example, is a sociopath, plain and simple. He's not just "being smart", he's not "just doing business", he's not "just doing what's needed", or the other things these guys like to pose as. He's just someone who doesn't even see you as a human being, much less his equal.
A polar bear is a cartesian bear after a coordinate transform.