Will Solve Captcha for Money?
alx_lo writes "Captchas are a nice idea to protect your blog or guestbook from being spammed by robots.
But what good is this protection when you can hire "data entry specialists" to solve captchas for $0.60 per hour for 50 hours a week?
Anyone here who can think up a solution that does not include drastically changing the global economy? How about captchas that require cultural background knowledge to solve?"
The cultural background idea sounds good, but that may just reduce the number of Captchas these laborers can solve in an hour. A simple internet search should be able to solve these questions. What would be a few examples of a good Captcha for Americans? You will always find a good portion of Americans who are unable to answer even the simplest.
US customs has been known to ask cultural questions at border crossings. My sister was once asked what Dan Quayle's parents did for a living after she said she lived in Indiana. This question is a bit before her time. (His parents ran a newspaper in Indiana.) This also brings into question age. My parents kill me in the original version of trivial pursuit that they play, but I win when playing the newest version.
A temporary stopgap measure might be to use the current Captchas in combination with the user's geolocation. I can see, though, how this measure would really anger free speech advocates for the third world.
How about a mathematical Captcha that cannot be solved with a calculator? Well educated foreigners will not even work for $.60. Then again, how many Americans could solve these?
quis custodiet ipsos custodes
I remember seeing an example of a captcha type game a while back where you would have to pick the hottest girl out of 3 pictures in order to continue..
problem of course is when people disagree on what's "hot"..
MABASPLOOM!
Refundable micropayments. Seriously. Require people pay $1 to post a comment, payable via paypal or whatever. Once you have checked their comment, you can add them to a whitelist that will never be charged again and refund them their $1. Spammers don't get their dollar back, don't get added to the whitelist, and have their comment removed. The result over the course of a large number of blog entries would be to significantly increase the cost of doing business for spammers, while providing only a very minor inconvenience for legitimate users.
This is why I believe in the future there will be two Internets. The one we have now which is wild and wooly where you can remain anonymous, and one where you can't do anything without a Reputation ID that is tied to a biometric identification method (fingerprint, voiceprint, etc.). There will be third party companies like Google that have Reputation ID accounts and will handle the authentication. The Reputation ID based Internet is where eCommerce, government and medical records, etc. based web sites will live.
I hope to heaven that instead of a biometric authentication, someone can come up with a card reader for driver's licenses or some other ID method, but current events seem to indicate biometric authentication will prevail. Even in that case, I hope it is an "authenticated-user" token passing scheme so that the web site that you want to visit never knows who you are, just that you are a valid user that owns the account ID you claim to own (the Reputation ID web site acts as middleman and privacy shield, pray they are never hacked).
By the way, I don't like the thought of privacy problems and Reputation ID spoofing scenarios this implies. I just don't see any other way to build an Internet with a high degree of trust. As I type this I am looking at the Slashdot captcha box for comments.
Robert Oschler - RobotsRule.com
...but haven't they been doing this for a few years now? I seem to remember a story, at least a year back, where spammers were giving porn away for free, as long as you solved a captcha every couple views.
Mod me down with all of your hatred and your journey towards the dark side will be complete!
I helped develop one of the largest websites in Europe (in terms of traffic and volume of content). Human spammers have been bypassing our CAPTCHA for a while now. We still keep the CAPTCHA to block most bots. The data input goes through a custom spam filter. These human spammers are trying to spread their URLs, email addresses, and phone numbers just like most spam, so this helps to a large extent. Anything that gets through that can be flagged as spam by users. On top of all that there's some human moderation by the business which owns the site.
So in the end spam filters can help but human moderation is still the only real working solution today.
Developers: We can use your help.
You win the thread.
I learned more about America in the 1960s/1970s from those questions than I did from anything else, ever.
RIP Sierra
To register, you have to be a "confident" user of a partnership website, like say ebay, paypal, amazon, yahoo, hotmail, google, etc, etc. They can prove that you are a real user, and an open api allows 1-1 relations between your accounts. If you are not registered with any of those websites, you have to get X points using Folding@Home to be trusted.
Running with your cultural background idea:
Why not take this to the local level, ie, make your captcha refer to website content.
The spammers can circumvent captchas effectively because they make sense out of context. But if your captcha asks for the Author's surname, the name of the website, or the news item's title; suddenly you need to actually know about the blog before posting.
Take this too far though, and it starts to look like those discriminatory voter tests of yesteryear.
The real problem with spam after all is not the spammers but the people who respond to it, if nobody bought from spam then there would be no spam. Well at least much less of it. After all it is advertising and spammers are not selling say viagra but selling spam itself.
In any case, with this log of users who actually click on spam links you could then A) compile an overview of what kind of user actually is stupid enough to respond, B) educate them or C) ban them for being too stupid to live.
Considering the budget offered in this ad (30-100 dollars), I don't think the guy is operating with that big a margin already. If you can reduce the number of people who respond to these spams then perhaps simple economics makes the problem go away.
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
Just have a human authorize every account creation. For smaller sites (the vast majority of the web) this might introduce a load of one authorization a month. As site size scales upwards, you have more people available to help with authorization. Could use the principles of the Turing test to work through a 2 or 3 email exchange.
;)
Could make the supporting cgi scripts as simple or as complicated as one's willing to author. One forum I maintained for a while had a low level "all access" section where new users posted an application. Forum regulars would respond, and eventually grade the new user. If they passed, they were given full access to the board. Granted, this system was employed more to limit the quantity of asshats than spammers, but the same principles apply.
It might even benefit society in the long run as a spammer's urge to do his work forces him to develop a "true" AI.
Well I find, for one, that Slashdot is doing a good job with its spammer-filtering techniques.
The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
I've visited a Japanese art site (ie pictures of characters from fighting games drawn in alarmingly extreme detail) which had roughly this on the front page:
"Because there have been some people coming in here and stealing pictures or linking without permission, I have had to put this small test up. Please enter the Emperor's birth date in Japanese calendar in the box below. I'm sorry for this inconvenience and I will remove it when they forget about this site."
I've also seen a site (again in the 'students with too much time on their hands' sector) that asked for some other date in Japanese calendar. There are also a fair few personal sites that have a front page with just one link that takes you in, and several spurious links, with the page being 100% Japanese text -- which I think serves about the same purpose.
On a related note, there also used to be WinMX groups which required that you say something in Japanese on entering or be booted. The point there was that otherwise you'd get masses of Korean 12-year-olds coming in and going 'Fuk Japanese bitch! dokdo nun uri tang!!lolz0rz!' and generally spamming the place. At least, I hope they were 12.
So, cultural captchas certainly exist... but it's easy to see why they work better on 'my pictures of Vampire Hunter D' sites than in the commercial world.
Whence? Hence. Whither? Thither.
CAPTCHAs can either be easily bypassed by script, or you can get people to do it. The thing is, if you make it harder you start blocking out visitors, maybe those with sight problems who have to use a screenreader, or people with a text only browser.
My blog recently had issues with automated spam, and I found two possible ways of dealing with it.
1) Use a filter like email. Wordpress has one available called Spam Karma 2, which measures time it took to fill in the form, Javascript payload, URL levels, and other things. I found it rather good at catching spam after a little training, but it was quite resource heavy, and even scripts make mistakes once in a while.
2) Use something abnormal. I decided to add a math script. Basically, it produces a simple math question (4 + 9) and asks for the answer. The form has a hidden input with a hash produced server-side, and the comment will only submit if the answer provided matches that hash (if the hash is missing it automatically fails). Many spam bots don't know how to handle math, so they fail. To disguise the question from 'alert' bots, you only need to add surrounding characters or convert things (+ => plus, 9 => nine) etc.
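For anyone wanting to try the math-question idea, here is a minimal sketch in Python. It is not the poster's actual script: the names `SECRET_KEY`, `make_question`, `sign_answer` and `check_answer` are made up for illustration, and it assumes a keyed hash (HMAC) rather than a plain hash, since a plain hash of a number between 0 and 18 could be reversed by trying all 19 values. It also does nothing to stop a captured token from being replayed.

```python
import hashlib
import hmac
import random

# Hypothetical server-side secret; a real site would load this from config.
SECRET_KEY = b"change-me"

NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]


def sign_answer(answer: int) -> str:
    """Return the keyed hash that goes into the hidden form input."""
    return hmac.new(SECRET_KEY, str(answer).encode(), hashlib.sha256).hexdigest()


def make_question(rng=random):
    """Build a question spelled out in words, plus the token for its answer."""
    a = rng.randint(0, 9)
    b = rng.randint(0, 9)
    question = f"What is {NUMBER_WORDS[a]} plus {NUMBER_WORDS[b]}?"
    return question, sign_answer(a + b)


def check_answer(submitted: str, token: str) -> bool:
    """Accept the comment only if the token is present and matches the answer."""
    if not token:
        return False  # a missing hash fails automatically
    try:
        value = int(submitted.strip())
    except ValueError:
        return False
    return hmac.compare_digest(sign_answer(value), token)
```

The question text goes into the page, the token goes into the hidden input, and `check_answer` runs when the form comes back.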
I noticed that bots were signing up but not actually posting (I dunno, maybe they were meant to post but that part of the script broke -- either way, they never posted, but it annoyed me having them there.) They were just there, with links to sites selling vicodin/viagra/etc. Which annoyed me somewhat, but one time a child porn link showed up which was really the straw that broke the camel's back, and I decided to stop it. I noticed that 99% of the sites were *.ru so I altered the reg form to throw an error if it detected a *.ru domain in the website field. Then I just started getting non *.ru domains instead, so I just thought, fine, fuck it.... Now if anybody signs up with ANY website in the website field, it throws an error, and has a message along these lines: Since then, no spam bots. w00t. Of course, that forum only gets a handful of signups per year, so I don't really care if it inconveniences people slightly, it's primarily intended as a "private"ish (real life friends) forum anyway.
And behold, a command prompt and he who sat upon it, his name was shutdown and -h 3:11 followed with him
I had the same issue. I searched all over for some sort of blacklist plugin for phpBB to fix the issue, because I was just sick and tired of banning all sorts of domains every day. In the end, I ended up changing the website field to "hidden" on new user registration, and if the bots enter text into it... then I throw an error message.
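The hidden-field trick is easy to sketch on the server side. This is only an illustration in Python, not phpBB code: `HONEYPOT_FIELD` and `looks_like_bot` are hypothetical names, and the form is assumed to arrive as a plain dict of field names to strings.

```python
# Name of the form field that is hidden from human visitors (assumed here
# to be the "website" field, as in the comment above).
HONEYPOT_FIELD = "website"


def looks_like_bot(form: dict) -> bool:
    """Return True if the hidden field came back with any text in it.

    A human never sees the field, so it should be absent or empty;
    a bot that fills in every input gives itself away.
    """
    return bool(form.get(HONEYPOT_FIELD, "").strip())
```

A registration handler would call `looks_like_bot(form)` first and throw the error message if it returns True.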
Find Nearby Indie Events
There's a "better CAPTCHA" mod for phpBB that solved the problem 100% for me (http://www.phpbb.com/phpBB/viewtopic.php?t=382890&highlight=captcha). It's beta but I've found no bugs.
I experimented with "oddball" questions myself (also hidden fields etc), but found that I had to change them all periodically, otherwise spam eventually reappeared a few weeks later. This is interesting in itself, because it implies that a human spammer has looked to see why the submissions have started failing and devised an (automated) workaround.
This was for questions that required no brainpower, though. ("Leave this blank" or "copy this word".) More complicated questions, even trivial ones (1+1=?) reduced the spam to zero - but also reduced legitimate responses to zero. People just can't be bothered, it seems.
By the way, SpamAssassin (even using the Bayesian sa-learn feature) was no help for filtering email generated from my other web forms, presumably because the spam originated from the same server that SpamAssassin was running on and so bypassed the spam check. A CAPTCHA (from www.neoprogrammers.com) solved this as well, although I think even that reduced my legitimate response rate.
The problem is visually impaired users may not be able to use them. I don't have a good solution for that.
Phil McKerracher
One solution longer-term is to not allow any html links (or markup in general) in posts or profiles. With no Google-rank spamming possible and no direct way for prospective marks to get in touch it removes most of the incentive to post crap comments in the first place. And pure text-only posts can quite easily be filtered for objectionable content.
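A rough sketch of what "no links, no markup" could look like in Python follows. The function name `to_plain_text` and both regular expressions are assumptions made for illustration; the patterns are deliberately crude and would not catch every obfuscated link.

```python
import html
import re

# Crude patterns, for illustration only.
TAG_RE = re.compile(r"<[^>]+>")
LINK_RE = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def to_plain_text(post: str) -> str:
    """Drop HTML tags, blank out anything that looks like a URL, and
    escape what is left so it can only ever render as text."""
    no_tags = TAG_RE.sub("", post)
    no_links = LINK_RE.sub("[link removed]", no_tags)
    return html.escape(no_links)
```

For example, `to_plain_text("visit http://spam.example/x today")` gives `visit [link removed] today`.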
Trust the Computer. The Computer is your friend.
I am surprised that all Slashdot can come up with so far is cultural or mathematical solutions.
I think some sort of game would be a good idea, sorta like the crappy games in flash advertisements nowadays. Make it difficult enough that it is too time consuming for spammers, but easy enough that people do not get frustrated when trying to register or post.
Ultimately I think that better filtering is probably the solution.
One of my message boards has been getting spammed a bit lately, despite the CAPTCHA..
We have recently installed a mod that we can add keywords and urls to. So posts from new users are checked with this.. it needs a bit of fine tuning, but I think eventually it should get rid of most of the spam.
In addition, users can flag posts as spam which are then checked by a moderator
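The keyword and URL check for new users might look something like this in Python. It is a guess at the general shape, not the mod itself: `BLOCKED_KEYWORDS`, `BLOCKED_DOMAINS` and `needs_review` are hypothetical names, and the list entries are placeholders.

```python
import re

# Placeholder lists; a moderator would maintain the real ones.
BLOCKED_KEYWORDS = {"viagra", "vicodin"}
BLOCKED_DOMAINS = {"spam.example"}

# Captures the host part of an http(s) URL.
URL_RE = re.compile(r"https?://([^/\s:]+)", re.IGNORECASE)


def needs_review(post_text: str, is_new_user: bool) -> bool:
    """Return True if a new user's post hits a blocked keyword or domain."""
    if not is_new_user:
        return False  # established users skip the check
    lowered = post_text.lower()
    words = set(re.findall(r"[a-z0-9]+", lowered))
    if words & BLOCKED_KEYWORDS:
        return True
    for host in URL_RE.findall(lowered):
        if any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS):
            return True
    return False
```

Posts that come back True would go to the same moderator queue as the ones flagged by users.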
Not a perfect solution of course. Someone could still pay for the answers, but it would take them more time to watch a video than look at one image. The videos might be related to the subject matter of the site and actually be entertaining or informative for valid users to watch. Captcha questions might be a little harder for a topically relevant video to further insure a user is worth the price of admission.
How about presenting a small phrase or story and then asking a couple of questions about the text? Example: Mary and Jim took an empty 2 gallon jar to the well. They filled it up half way with water. How many gallons of gasoline did they put in the jar? Or: Please sum up all of the occurrences of words that are bigger than 4 letters and less than 6 in the following sentence. Then add all of the vowels in your username: blah blah blah whatever
(obviously in later stages you need to make sure the division x/g is done to necessary precision, but keeping numbers in fractional rather than decimal form makes the mental calculation easier, if you can handle an answer in that form.)
this method converges quadratically whereas 'trial and error' or a 'binary search' converges linearly. this means by using this method a simpleton from the 16th century could beat you quite easily doing 3-4 digits of accuracy, and could probably find 6 or 7 digits faster than you could if you were doing the divisions on a calculator.
btw i'm not sure if this is the same method you outline above, or if by 'divide, refine' you are simply deciding whether your guess is too big or too small, based on whether g or x/g is bigger. taking the average of the 2 is much better, and not computationally expensive.
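a quick sketch of the averaging step in Python, for anyone who wants to watch it converge. the function name `babylonian_sqrt` and the `steps` parameter are just illustrative choices, and it uses the standard `fractions` module to keep everything in fractional form as suggested above.

```python
from fractions import Fraction


def babylonian_sqrt(x, guess, steps=4):
    """Approximate sqrt(x) by repeatedly averaging the guess g with x/g."""
    x = Fraction(x)
    g = Fraction(guess)
    for _ in range(steps):
        # g and x/g sit on opposite sides of the true root,
        # so their average is a better guess than either.
        g = (g + x / g) / 2
    return g
```

starting from a guess of 1 for the square root of 2, the successive guesses are 3/2, 17/12, 577/408, and that last one is already within about 0.000002 of the true value.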
my password really is 'stinkypants'
I've managed to cut down blog spam significantly lately after installing an Anti-CAPTCHA: http://www.timtucker.com/weblog/?p=74
The basic idea is to present a CAPTCHA image that's as easy for a machine to understand as possible and then ask the user to type in something else. (in the system that I'm using, users are presented with an unobscured image of a 6-digit number and asked to type in a different 6-digit number).
One of the great things about asking a user to type in something other than what's shown is that it's much more accessible than a regular CAPTCHA, since there's only a 1/1,000,000 chance that someone who can't see will accidentally type in the "right" six digit number.
This talk on Google Video has a bit of info about CAPTCHAs. Apparently some porn sites are displaying occasional CAPTCHAs that their users have to solve before seeing the next page of porn, and then using these solved CAPTCHAs to spam blogs and other sites. The developers get bonus points for creativity, anyway.