Looking To Spammers To Solve Hard AI Problems
An anonymous reader writes "With bots getting closer to beating text-based CAPTCHAs for good, New Scientist points out that when they do, OCR technology will at least have advanced. The article goes on to suggest that whatever kind of reverse Turing Test comes next should be chosen to motivate spammers to solve other pressing AI problems, such as image recognition. Are there any other problems that criminal crowdsourcing could help with?"
Advancing the state of the art in Optical Character Recognition was always intended to be a side-benefit of CAPTCHAs. It looks like that plan came through nicely.
I have always figured CAPTCHAs would be a stopgap until other methods of authentication could easily be used, such as micro-payments or single sign-on solutions like OpenID. Unfortunately, those other methods haven't been adopted nearly as fast as the need has grown. Perhaps if CAPTCHAs are declared "dead", site operators will feel more urgency to adopt these solutions.
If CAPTCHAs do continue, I'd like the next problem to be facial recognition software. I'd love a package that could look at a picture and tag it "Nicholas and Andrea" or "Glen and Helene". Digital camera software everywhere could benefit from this technology. Not sure how you'd bake that into a CAPTCHA, but it's a good problem to solve.
John
Don't tell them that they're the ones that are actually being used! That spoils all the fun!
I'll just bet that this is what leads to "true" artificial intelligence (whatever that is). Soon, we'll have completely automated agents trying to convince other completely automated agents to purchase stuff to enhance bits of biology that they don't have.
un-ALTERED reproduction and dissemination of this IMPORTANT information is ENCOURAGED
several years ago 'neural nets' were the big thing and they were thinking that they could make them 'learn' and do useful things.
i always thought that traffic control would be an interesting application. if a computer could look at video of an intersection (and streets leading to the intersection) and figure out where cars were and weren't, you could make traffic lights a lot less annoying.
so our CAPTCHA might be a picture/video of cars and a request to count them?
eric
Spammers are unlikely to share their results with the rest of the world. They're motivated by financial rewards, and there is absolutely no incentive to publicize their methodology in any format.
Not only would the "good guys" learn from it -- and thus potentially defeat the spammers' discovery -- but other spammers would simply steal their work.
using spammers to create AI which allows us to catch/ignore/prevent spamming?
Replace captchas with pictures of hot/non-hot women.
Simply ask "is this woman hot? [Yes]/[No]"
Half of them will be so busy masturbating that they won't be cracking forms.
I would agree, if general-purpose captcha-beating software were available. But that isn't so. Each captcha system was beaten by custom code, individually written for that system. So in effect, it is not much different than adding a new font to existing OCR software.
You know, if legitimate software vendors could ever learn how to make software as resilient as malware, the world would be a better place. Modern malware is getting close to nuke-proof. Delete registry keys, DLLs, multiple self-healing packages, MSI source code, custom drivers, service restarts, redundant services, monitoring agents, update agents to ensure the latest upgrade and so on - and that's just what I saw a couple of weeks ago on a relative's computer. Have you tried removing some of the latest malware w/o removing the disk and operating from a different computer? Unless you do, you can't /really/ be sure it's been removed. Modern malware has the ability to be incredibly resilient and bulletproof.
My father, a Nigerian spammer, passed away. He left an AI system on a server located in a datacenter. Sadly, during the last phase of his life, unpaid data transfer bills accumulated to a sum of $300000. I am already negotiating with the secret services of the world, who want to buy this program for $10000000. I can't pay the data transfer bills, so I turn to you, a trustworthy AI researcher. For $300000 you get a share of $500000000 and the copyright to the source code.
sincerely yours,
Trying to ensure only humans sign up for things is just a small part of a bigger problem.
The other night I got JavaScript-redirected away from the page I'd found in Google to watch a page pretend to put Windows on my laptop and find malware. I've seen it many times before. I run Ubuntu, so seeing an XP-like display of my C: and D: drives and various DLL files being scanned isn't very convincing.
I decided to look into why I'd landed on the original page. Google had the page at about no. 4 for my initial search, but the site was only about 4 weeks old, so why was it ranked so high?
And the answer is incoming links from around 86,000 pages according to Google (link:domain.name). A lot of them are created internally, passing links from malware site to malware site. But the majority come from sites using PHP forms which add user posts to the sites' pages.
A number of months ago I found my sites' contact forms were sending me a lot of garbage emails absolutely stuffed with URLs, and I wondered why anyone would bother doing this, since I'm not going to visit the sites. Anyway, the cure was to only allow the forms to be processed with no more than a few URLs in them. That stopped the junk hitting the inbox. It hasn't stopped the automated posting, but the forms are not processed and I don't get them any more.
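For anyone wanting to try the same cure, here's a minimal sketch of a URL cap in Python. The post doesn't say what the original forms were written in or what limit was used, so the regex, the function names and the threshold of 3 are all assumptions for illustration, not the poster's actual code.

```python
import re

# Assumed threshold: the post only says "no more than a few URLs".
MAX_URLS = 3

# Rough pattern for things that look like links; deliberately simple.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def count_urls(text: str) -> int:
    """Return how many URL-like tokens appear in the submitted text."""
    return len(URL_PATTERN.findall(text))


def should_process(form_text: str, max_urls: int = MAX_URLS) -> bool:
    """Accept the submission only if it contains at most max_urls links."""
    return count_urls(form_text) <= max_urls
```

A form handler would call should_process() on the message body before mailing or publishing it and silently drop anything that fails. The bots can still post, but nothing link-stuffed gets through.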
When I examined the links to the malware site, I found PHP-posted user posts packed with links just like my emails had been, the difference being that these were posted, published and being crawled. Because of these links, a site less than 4 weeks old is ranked highly on the quantity of inbound links, and that's why I got to watch a display of XP-like virus and malware scanning.
I also examined the content of the pages of the original malware site. The subjects varied quite widely, but they also seemed to have a relation to the trends that Google was showing for related keywords in the weeks before the site went live. I've a feeling that the pages were generated by pulling content from legitimate sites that ranked high in the natural search.
I guess site owners tend to think these links are there to spam porn at their users, but they're not. They're there so Google will promote the malware sites with gamed PageRank.
Clever, isn't it?
find good key phrases (may be just using google trends)
scrape content from legit sites and mashup
create massive array of links to site.
wait for the fish to arrive and scam them.
The antivirus scam is Antivirus 2009, but you only get shown it once.
Here's a link with details on removing it and some interesting background.
http://www.2-spyware.com/remove-antivirus-2009.html
Thing is, the third-party linking sites were using CAPTCHAs, but the real problem was not filtering the posts. If a suitable max number of URLs were enforced, the posts would fail and the PageRank gaming would too.
Fixing the broken PHP and CGI scripts is what's really needed, not just a better CAPTCHA.
The CAPTCHA is just a Band-Aid on a deeper problem, and webmasters need to deal with the issues.
Blarney Quality Restaurant, Plants
What about people for whom $50 is a year's salary? Congrats, you just split the internet into the rich and the poor. No more accessing the internet from Africa from an old PC powered by a donated solar cell. Good job. You're probably going to get a Nobel Prize.
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
What about people like me who can't seem to get the hang of the darn things? (I personally wouldn't be surprised if they're some kind of elaborate hoax...)
You only have to get right the word that OCR can recognize. Just try guessing which of the two words OCR can't recognize and type some random gibberish instead of that word. It will let you through.
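To make the trick concrete, here's a rough sketch in Python of the check as described above: two words are shown, but only the one the system already knows (the "control" word) is compared against your answer. The function and parameter names are made up for illustration, and how any real two-word CAPTCHA service actually does this is an assumption here.

```python
def check_two_word_captcha(answers, control_index, control_word):
    """Pass the user if the answer for the control word matches.

    answers: the two strings the user typed, in display order.
    control_index: position (0 or 1) of the word the system already knows.
    control_word: the known-correct text for that position.
    The other answer is never verified, which is why gibberish gets through.
    """
    typed = answers[control_index].strip().lower()
    return typed == control_word.strip().lower()
```

So check_two_word_captcha(["overlooks", "asdfgh"], 0, "overlooks") passes even though the second answer is junk. The catch is that you have to guess correctly which word is the control.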