Baffling the Spam Bots
dumpster_dave writes "Scientific American is running an article, Baffling the Bots on techniques to outsmart and subvert spam bots and their chat-room cousins via CAPTCHA. You have probable seen this in the form of images containing text as gate-keepers to various on-line services. The latest evolution is using non-words and distorting the text such that even the best AI systems cannot decipher them, yet humans can not help but do so [cf., Gestalt Psychology]."
I've often wondered how these types of systems can be made handicapped accessible
Simple Machines in Higher Dimensions
that just using johnsmithword-AT-hotmail.com works fine (where word is taken out and -AT- is replaced with @) I use that and have yet to have a single spam email.
I have over 70 freaks, do you?
Hotmail's spam filter has gotten really smart in the past few months. Yahoo's filter used to be the best among web mailers, but Hotmail has improved to the point that I don't get any spam in my hotmail inbox anymore.
I'm not one to go about shouting the praises of Microsoft, but someone over there's got their head out of their asses.
Everyone should know this by know, but you can control spam by keeping tabs on where your email address goes.
The address I use to post to USENET is completely disposable. The 'swen' worm in fact picked up my USENET addy and spammed it with about 40,000 emails. The address is now dead, but I saw that coming.
I have a public address which I give to casual contacts (who may not be totally trustworthy). This address changes yearly, and this keeps it spam free.
My well guarded private address, which I only give to my closest friends, has gotten no spam for 5 years. I receive about 20 emails per day at that private address and there is 0 spam.
Earthlink has an optional system like this, where unknown senders are blocked by default. They receive an autoreply giving them a URL to go to where they must enter the text from a CAPTCHA.
Unfortunately, the system does not work very well. My dad sells on eBay, and a buyer of one of his auctions had an Earthlink account, which blocked the message that told how much the shipping would be, where to send payment, etc. When my dad went to the specified URL, and entered the CAPTCHA text as requested he would simply get an error message that he had entered it incorrectly. He forwarded me the Earthlink email and asked me if it was just him; it wasn't; I couldn't get it to work either. The random string of numbers and letters was very distorted, and there were four possible meanings; I tried those plus at least ten more with no sucess. The message never got through.
There are many problems with this type of system. Consider: what if both parties have CAPTCHA-enabled accounts, from different providers? The confirmation messages from both parties get blocked. Smarter systems whitelist people as messages are sent to them, but as in the eBay case, the recipient had no way of knowing my dad's email until AFTER a message from him was received. It's a Catch-22.
And for people who are visually impaired, universal deployment of this system this makes email essentially impossible. Earthlink's page had a link "if you cannot see the picture, click here" and when you got to that they said to call their 1-800 number if you have any problems. Right.
Adding CAPTCHAs to everyone's email systems is NOT the way to solve the spam problem. We need a more realistic, permanent solution. For example, cryptographically authenticating the sender (the "From" field) at the level of the originating ISP (and rejecting messages from senders it cannot authenticate, by password or whatever means), and then having each relay in turn authenticating the previous relay if it trusts it. Headers can be inserted in the emails, signing the previous headers with private encryption keys with their public counterparts obtainable from the ISPs by simple DNS lookups. This will build a chain of trust, which stops when a message gets outside of the sender's network, and therefore allows the original sender to be properly identified back through their ISP. Once we know who messages are from, people can be held responsible. And at that point, anti-spam laws can handle the rest.
It's hard for thee to kick against the pricks.
I have a better idea : present a complex differential equation and ask the person to solve it in less than 10s. If he fails, he's human.
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
Am I the only one having troubles deciphering the second word on the second picture?
Future Wiki -- If you don't think about the future, you cannot have one.
I'm not sure about others, but I have a difficult time with sites which use distorted numbers on a nearly matching background...and I'm not even color-blind.
...and perhaps even requiring the person to call a phone number to activate the account - ideal for financial-based sites such as banks, payment
:)
Sound is better, but even that sometimes can be difficult to understand - also, I don't have speakers hooked up on some machines I use; some folks disable sound due to abnoxious websites/ads that blast sound unexpectedly.
Anyways, many of my relatives and friends can't get into sites that use distorted numbers, etc at all and are basically locked out; sometimes they get lucky and find a similar site (likely a competitor) to the site they desired, which doesn't use such nonsense...
Seems to me a better way is use geotracking (too many inbound connections from similar sources [IP ranges, routes, browser config, etc), email verification, etc...
sites, etc.
With good heuristics (really the key to stopping automated bots in my view), any decent website should be able to filter out much of the bots and other junk - it's no accident really that many of the largest sites don't use distorted numbers, pictures, etc - how do they do without them?...perhaps be a good Ask Slashdot item
Ron
<img src="it_says_kitten.jpg">
heh dumb bot
bite my glorious golden ass.
.. is that they can be brokered. If you give me a puzzle, *I* don't have to solve it; all I have to do is induce someone, somewhere, to solve it, and give me the answer. That means I can set up a CAPTCHA-solving factory in Taiwan, or field a porn site where users pay for their pictures in CAPTCHA answers. (*My* CAPTCHAs, the ones my script was assigned to answer in order to make Paypal transactions, not new ones I made up on the spot.)
Suppose that a human can solve your CAPTCHA in an average of five seconds. Suppose unskilled labor costs $6/hour. Then it costs a bit under a cent to find the solution to your CAPTCHA, assuming that I want to solve at least a few thousand a day. As a result it is impractical to protect a service worth more than a penny with a CAPTCHA.
Actually unskilled labor costs far less than $6/hour in some parts of the world, so if CAPTCHAs see wide use the value of the services they can protect is even lower. A tenth of a cent?
CAPTCHAs should be seen as a proof-of-work mechanism, like "hash cash", not as an oracle that can determine whether a transaction was initiated by a human or a machine. Unlike proof-of-worth schemes that burn CPU time, the value of a CAPTCHA won't be inevitably halved every 18 months by Moore's law; on the other hand, it could be suddenly reduced to zero by breakthroughs in image processing.
Just do a Burrows-Wheeler transform on your e-mail address. Comes with the bonus of preventing stupid people from trying to contact you.