Stopping Spambots: A Spambot Trap

Re:Elements of good design I'd missed by hagardtroll · 2002-04-12 01:12 · Score: 5, Interesting

I put my email address in a jpeg image. Haven't found a spambot yet that can decipher that.

Block? Are you kidding? by Anonymous Coward · 2002-04-12 01:13 · Score: 5, Interesting

Why on Earth would you like to block a spambot? So it doesn't get any more useful addresses?
No way, man.
If you realize you're serving to a bot, go on serving. Each time the bot follows the "next page" link, you /give/ it a next page. With a nicely formatted word1word2num1num2@word1word2.com, where words and nums are random.
Give it thousands, millions of addresses this way.
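A rough sketch of that kind of bot fodder, in Python. The word list, function names and the "?page=" link format are made up for illustration; only the word1word2num1num2@word1word2.com shape comes from the comment above.

```python
import random

# Hypothetical vocabulary; any word list would do.
WORDS = ["alpha", "bravo", "delta", "echo", "kilo", "lima", "oscar", "tango"]


def fake_address(rng: random.Random) -> str:
    """Build one address shaped like word1word2num1num2@word1word2.com."""
    local = (rng.choice(WORDS) + rng.choice(WORDS)
             + str(rng.randint(0, 9)) + str(rng.randint(0, 9)))
    domain = rng.choice(WORDS) + rng.choice(WORDS) + ".com"
    return f"{local}@{domain}"


def fake_page(rng: random.Random, count: int = 20) -> str:
    """Render `count` mailto: links plus a "next page" link for the bot to follow."""
    links = []
    for _ in range(count):
        address = fake_address(rng)
        links.append(f'<a href="mailto:{address}">{address}</a><br>')
    # The query string is random so every "next page" looks like a new URL.
    links.append(f'<a href="?page={rng.randint(0, 10**9)}">next page</a>')
    return "\n".join(links)
```

Serving fake_page() for every request to the trap URL gives the bot a fresh page each time it follows the link.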

How I track spammers using PHP by Elkman · 2002-04-12 01:18 · Score: 5, Interesting

I did something rather low-tech: I created a "Contact Us" page on my web server that has an automatically-generated address at the bottom. It says, "Note: The address spamtest.1018617636@example.com is not a valid contact address. It's just here to catch spammers." The number is actually the current UNIX timestamp, so I know exactly who grabbed this mail address and sent me mail.

As it turns out, I really haven't received that much mail to this address. About the only mail I've ever received to it is someone from trafficmagnet.net, who tells me that I'm not listed on a few search engines and that I can pay them to have my site listed. I need to send her a nasty reply saying that I don't care about being listed on Bob's Pay-Per-Click Search Engine, and that if she had actually read the page, she would have noticed that she was sending mail to an invalid address. Besides, the web server is for my inline skate club and we don't have a $10/month budget to pay for search engine placement.

I think I've received more spam from my Usenet posting history, from my other web site, and from my WHOIS registrations than I've received from the skate club web site.
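The timestamped address described above could be generated and decoded roughly like this. This is a Python sketch rather than the poster's PHP; the function names are hypothetical and the "spamtest." prefix and example.com domain are simply copied from the post.

```python
import re
import time
from datetime import datetime, timezone
from typing import Optional

_TRAP_PATTERN = re.compile(r"^spamtest\.(\d+)@")


def trap_address(now: Optional[float] = None, domain: str = "example.com") -> str:
    """Return a trap address that embeds the current UNIX timestamp."""
    timestamp = int(time.time() if now is None else now)
    return f"spamtest.{timestamp}@{domain}"


def harvest_time(recipient: str) -> Optional[datetime]:
    """Recover the time the page was served from a trap address, or None."""
    match = _TRAP_PATTERN.match(recipient)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
```

Looking up the recovered time in the web server's access log then shows which client fetched the page that carried that address.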

re: spidertrap by blibbleblobble · 2002-04-12 01:22 · Score: 4, Interesting

My PHP spider-trap - See an infinity of email addresses and links in action!

removing mailto: a bad solution by bluGill · 2002-04-12 01:22 · Score: 5, Interesting

Removing mailto: links is a bad solution to the problem. It might be the only solution, but it is bad.

I hate the editor in my web browser. No spell check (and a quick read of this message will prove who diasterious that is to me), not good editing ability, and other problems. By contrast my email client has an excellent editor, and a spell checker. Let me pull up a real mail client when I want to send email, please!

In addition, I want people to contact me, and not everyone is computer literate. I hang out in antique iron groups, I expect people there to be up on the latest in hot tube ignition technology, not computer technology. To many of them computers are just a tool, and they don't have time to learn all the tricks to make it work, they just learn enough to make it do what they want, and then ignore the rest. Clicking on a mailto: link is easy and does the right thing. Opening up a mail client, and typing in some address is error prone at best.

Removing mailto: links might be the only solution, but I hope not. So I make sure to regualrly use spamcop.

Re:Block? Are you kidding? by f3lix · 2002-04-12 01:31 · Score: 5, Interesting

This isn't such a good idea - for every random (non-existent) domain that you generate, a root DNS server will be queried when an email is sent to this address, which increases the load on the root servers, which is generally a bad thing. How about instead, returning pages with the email address abuse@domain-that-spambot-is-coming-from all over them...

Similar to how the new ORBZ works? by Masem · 2002-04-12 01:32 · Score: 4, Interesting

After the Battle Creek incident with ORBZ, the maintainer changed the way it worked; instead of being pro-active about checking for open relays, he now has a 'honeypot'-like system built around a unique email address that isn't directly visible on the site but may still be harvested by a spambot. Any server that sends email to that address is automatically added to The List. Mail server admins who believe that they should not be on this list can argue their case to have their server removed.
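A minimal sketch of that honeypot logic, in Python. This is not how ORBZ is actually implemented; the class and method names are invented, and it assumes you already know the sending server and recipient of each delivery.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class HoneypotList:
    """Lists any server that delivers mail to a trap address."""
    trap_addresses: Set[str]
    listed: Set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # Compare addresses case-insensitively (an assumption of this sketch).
        self.trap_addresses = {a.lower() for a in self.trap_addresses}

    def observe(self, sending_server: str, recipient: str) -> bool:
        """Record one delivery; return True if the server is (now) listed."""
        if recipient.lower() in self.trap_addresses:
            self.listed.add(sending_server)
        return sending_server in self.listed

    def delist(self, sending_server: str) -> None:
        """Remove a server after its admin has successfully argued their case."""
        self.listed.discard(sending_server)
```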

--
"Pinky, you've left the lens cap of your mind on again." - P&TB
"I can see my house from here!" - ST:

Re:Similar to how the new ORBZ works? by toupsie · 2002-04-12 01:56 · Score: 4, Interesting

he now has a 'honeypot' like system where a unique email address that isn't directly visible on the site but still may be harvested by a spam bot. Any server that sends email to that address is automatically added to
This is the same method I have been using for a while. I have an e-mail account called "cannedham" that I had posted on several web sites as a mailto: anchor on a 1x1 pixel graphic. Any e-mail sent to that address updates my Postfix header_checks file to protect the rest of my accounts. It works like a charm.

--
Strange women lying in ponds distributing swords is no basis for a system of government.

Removing the Mailto: may not be the best plan.. by liquidsin · 2002-04-12 01:40 · Score: 5, Interesting

I've found that a lot of people just won't send email if there's not a link to facilitate it. I've become rather fond of using javascript to write the address to the page. Spambots read the source so they don't piece the address together but *most* browsers will still do it right. Just use something like:

<script>document.write("<A CLASS=\"link\" HREF=\"mailto:" + "myname" + String.fromCharCode(64) + "mydomain" + "\">mail me</A>");</script>

Seems to work fine. Anyone know of any reason it shouldn't, or have any other way to keep down spam without totally removing the Mailto: ? I know this won't work with *every* browser, but it beats totally removing mail links. And I don't think spammers can get it without having a human actually look at the page...
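If you have a lot of addresses, the snippet can be generated instead of typed by hand. Here is a hedged Python sketch; obfuscated_mailto is a made-up helper, and it leans on json.dumps to produce JavaScript string literals, which is fine for plain ASCII names like these.

```python
import json


def obfuscated_mailto(user: str, domain: str, label: str = "email me") -> str:
    """Return a <script> snippet that writes a mailto: link at page load.

    The "@" never appears in the page source: the browser builds it with
    String.fromCharCode(64). The label must not contain the address itself.
    """
    parts = [
        json.dumps('<a href="mailto:'),   # opening tag, up to the address
        json.dumps(user),
        "String.fromCharCode(64)",        # the "@" sign
        json.dumps(domain),
        json.dumps('">' + label + "</a>"),
    ]
    return "<script>document.write(" + " + ".join(parts) + ");</script>"
```

Calling obfuscated_mailto("myname", "mydomain.example") gives one line to paste wherever the link should appear.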

--
do not read this line twice.

Re:Removing the Mailto: may not be the best plan.. by liquidsin · 2002-04-12 02:54 · Score: 4, Interesting

hell, go one step further:

<img src="myemailaddress.jpg" alt="me at domain dot com">

that way people who use browsers that speak (ie. the blind) would still hear your address correctly, so long as spambots don't start to pick up on the spelling out of "at" and "dot".

--
do not read this line twice.

Re:Removing the Mailto: may not be the best plan.. by e_n_d_o · 2002-04-12 07:29 · Score: 3, Interesting

On my company's Web site we've had success with this technique. The addresses posted on the Web site have not received any significant amount of spam. I have yet to see a single spam message that hits all four of the addresses on our contact page at once, which I believe would be a likely indicator we've been hit by a spambot.

We embed this JavaScript code on each page that needs mailtos:

<script type="text/javascript" language="JavaScript1.3">
// Anti e-mail address harvester script.
function n_mail(n_user) {
    self.location = "mailto:" + n_user + "@" + "yourdomain" + "." + "com";
}
</script>

And then make email address links of this form:

<a href="javascript:n_mail('foo');">foo@yourdomain.com</a>

Our addresses even show up correctly in lynx, but are "clickable" only in JavaScript-enabled browsers.

Of course, it's probably only a matter of time before spambots can compensate for this code. A more secure approach would be to put email address "components" in borderless table cells or, as a previous poster suggested, in images.

Re:http-referrer by DutchSter · 2002-04-12 01:40 · Score: 2, Interesting

No. The point the author made was that good bots follow the 'robots.txt' standard. A versatile program like this can differentiate. If a robot comes in and plays by the rules on robots.txt, it's welcomed. OTOH, if one comes in and just starts grabbing at everything, it will quickly find itself blocked.

I believe the exact quote in regards to why robots.txt should still be used is: "Most bad spambots don't even check the robots.txt file, so this is mainly for protection of the good bots."
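The robots.txt test can be sketched with Python's standard urllib.robotparser. This is an illustration, not the article's code: the /trap/ path and the helper names are made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: well-behaved bots are told to stay out of /trap/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""


def build_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def is_violation(rules: RobotFileParser, user_agent: str, path: str) -> bool:
    """True if a client requested a path that robots.txt told it to avoid."""
    return not rules.can_fetch(user_agent, path)
```

A good bot never requests anything under /trap/, so any client for which is_violation() returns True is a candidate for the block list.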

Another thing I find appealing is that on a large enough system the DB could be shared amongst several servers to provide common protection for all. I've always taken a don't put an address on the page approach, but it's cool to see someone looking at how these bots operate from a technical standpoint.

Some ISPs (like mine) have policies against SPAM that stipulate that in addition to not actually spamming people, using their resources to prepare/collect addresses to SPAM is just as bad. The advantage the database gives you is that you can track the most recent offenders. A quick lookup to who owns the address, with hard evidence of one of their subscribers abusing both your system, and their policy will, if nothing else, cause the cost of spamming to rise. The reason SPAM is so popular is because it is VERY cheap to do. Once its costs approach those of 'traditional' marketing, things might get a bit more selective rather than sending my three year old '1-3 inches in 6 weeks!','Stop paying for cable', or 'Get out of debt now!' messages. Hardly directed.

(Now I don't want anyone marketing to my three year old, but I know it will happen so I'd like to at least think they would be reasonable things, perhaps a bit relevant)

Similar setup without SQL requirements by bero-rh · 2002-04-12 01:41 · Score: 4, Interesting

My setup (catches some of the more commonly used spambots) uses mod_rewrite to send spammers to a trap.
Setup details at http://www.bero.org/NoSpam/isp.php

--
This message is provided under the terms outlined at http://www.bero.org/terms.html

Re:Block? Are you kidding? by cperciva · 2002-04-12 01:47 · Score: 4, Interesting

Add a couple of sleep(20); into the cgi script that generates the bot fodder. The bot will still stay busy waiting for your webserver's response, but your script will exactly consume zero resources.

Zero resources, except for memory.

A much better solution would be to point the bot at a set of "servers" with IP addresses where you're running a stateless tarpit.

--
Tarsnap: Online backups for the truly paranoid

Re:Block? Are you kidding? by gclef · 2002-04-12 01:58 · Score: 3, Interesting

Actually, I've done this w/a bot trap on my site at home. It's a perl script that generates a bunch of weird-sounding text w/some fake email addresses at the bottom and a bunch of database-query-looking links back to the original page.

The bots don't fall for it anymore. Some dorks in Washington state decided to make a couple requests a second to it once, but in the two years I've had it up, they're the only ones.

burp by Anonymous Coward · 2002-04-12 02:07 · Score: 1, Interesting

We had an Evil Harvestor Robot irritator on our web site back in 1996. It worked rather well. It didn't hit legitimate spiders by using an appropriate robot directive. It also gave the harvester a whole heap of nonsense addresses to add to its database.

None of that Perl nonsense, either. All in pure C on a BSD host, with a damn good attention to potential overflows. That was also the site which had my own custom MTA (I only knew sendmail, so it seemed a wise decision), demanded full W3C compliance (we would test it on about 10 platforms), and got used as evidence in the DoJ case against Microsoft.

Sigh, those were the days. Now, all I see is rehashing of old ideas. So, in my view, this news is 6 years old -- perhaps even a record for Slashdot?

Re:Block? Are you kidding? by boky · 2002-04-12 02:18 · Score: 5, Interesting

I agree. And, come on, how much technology do you need?

This is my solution to stopping spambots. It's a Java servlet, and I am posting it here to prevent my company's site from being slashdotted. It does not prevent the spammer from harvesting emails; it just slows them down.... a lot :) If everyone had a script like this, spambots would be unusable.

Feel free to use the code in any way you please (LGPL like and stuff)

Put robots.txt in your root folder. Content:

User-agent: *
Disallow: /members/

Put StopSpammersServlet.java in WEB-INF/classes/com/parsek/util:

package com.parsek.util;

import java.io.File;
import java.io.StringWriter;
import javax.servlet.ServletContext;
import java.net.URL;
import java.util.Enumeration;
import java.lang.reflect.Array;

public class StopSpammersServlet extends javax.servlet.http.HttpServlet {

    private static String[] names = { "root", "webmaster", "postmaster", "abuse", "abuse", "abuse", "bill", "john", "jane", "richard", "billy", "mike", "michelle", "george", "michael", "britney" };
    private static String[] lasts = { "gates", "crystal", "fonda", "gere", "crystal", "scheffield", "douglas", "spears", "greene", "walker", "bush", "harisson" };
    private String[] endns = new String[7];

    // Returns one plus the number of path segments; deeper pages sleep longer.
    private static long getNumberOfShashes(String path) {
        int i = 1;
        // getPathInfo() returns null when there is no extra path; treat that as depth zero.
        if (path == null) {
            return i;
        }
        java.util.StringTokenizer st = new java.util.StringTokenizer(path, "/");
        while (st.hasMoreTokens()) {
            i++;
            st.nextToken();
        }
        return (i);
    }

    // Respond to HTTP GET requests from browsers.
    public void doGet(javax.servlet.http.HttpServletRequest request,
                      javax.servlet.http.HttpServletResponse response)
            throws javax.servlet.ServletException, java.io.IOException {
        // Set content type for HTML.
        response.setContentType("text/html; charset=UTF-8");
        // Output goes to the response PrintWriter.
        java.io.PrintWriter out = response.getWriter();
        try {
            ServletContext servletContext = getServletContext();
            endns[0] = "localhost";
            endns[1] = "127.0.0.1";
            endns[2] = "2130706433";
            endns[3] = "fbi.gov";
            endns[4] = "whitehouse.gov";
            endns[5] = request.getRemoteAddr();
            endns[6] = request.getRemoteHost();
            String query = request.getQueryString();
            String path = request.getPathInfo();
            out.println("<html>");
            out.println("<head>");
            out.println("<title>Members area</title>");
            out.println("</head>");
            out.println("<body>");
            out.println("<p>Hello random visitor. There is a big chance you are a robot collecting mail addresses and have no place being here.");
            out.println("Therefore you will get some random generated email addresses and some random links to follow endlessly.</p>");
            out.println("<p>Please be aware that your IP has been logged and will be reported to proper authorities if required.</p>");
            out.println("<p>Also note that browsing through the tree will get slower and slower and gradually stop you from spidering other sites.</p>");
            response.flushBuffer();
            // The delay per line grows as 3^depth milliseconds.
            long sleepTime = (long) Math.pow(3, getNumberOfShashes(path));
            do {
                String name = names[ (int) (Math.random() * Array.getLength(names)) ];
                String last = lasts[ (int) (Math.random() * Array.getLength(lasts)) ];
                String endn = endns[ (int) (Math.random() * Array.getLength(endns)) ];
                String email = "";
                double a = Math.random() * 15;
                // NOTE: the original chain of thirteen "if (a < ...)" statements that built
                // the local part was lost when this comment was archived. The three
                // branches below are a stand-in (an assumption), not the original code.
                if (a < 5) email = name;
                else if (a < 10) email = name + "." + last;
                else email = name.charAt(0) + last;
                email = email + "@" + endn;
                out.print("<a href=\"mailto:" + email + "\">" + email + "</a><br>");
                response.flushBuffer();
                Thread.sleep(sleepTime);
            } while (Math.random() < 0.9); // original threshold lost; 0.9 is a stand-in
            out.print("<br>");
            do {
                int a = (int) (Math.random() * 1000);
                out.print("<a href=\"" + a + "/\">" + a + "</a> ");
                Thread.sleep(sleepTime);
                response.flushBuffer();
            } while (Math.random() < 0.9); // original threshold lost; 0.9 is a stand-in
            out.println("</body>");
            out.println("</html>");
        } catch (Exception e) {
            // If an Exception occurs, return the error to the client.
            out.write("<pre>");
            // getMessage() may be null, and PrintWriter.write(null) would throw.
            out.write(String.valueOf(e.getMessage()));
            e.printStackTrace(out);
            out.write("</pre>");
        }
        // Close the PrintWriter.
        out.close();
    }
}

Put this in your WEB-INF/web.xml

<servlet>
    <servlet-name>stopSpammers</servlet-name>
    <servlet-class>com.parsek.util.StopSpammersServlet</servlet-class>
</servlet>
<servlet-mapping>
    <servlet-name>stopSpammers</servlet-name>
    <url-pattern>/members/*</url-pattern>
</servlet-mapping>

Here you go. No PHP, no Apache, no MySQL, no Perl, just one servlet container.
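To see how quickly the servlet's delay grows, here is the same arithmetic as a small Python sketch. The function names are mine; the rule itself, 3 raised to one plus the number of path segments and used as milliseconds, is taken from the Math.pow(3, ...) and Thread.sleep(...) calls in the servlet.

```python
from typing import Optional


def depth(path: Optional[str]) -> int:
    """Count non-empty path segments, e.g. "/12/345/" has two."""
    return len([segment for segment in (path or "").split("/") if segment])


def sleep_ms(path: Optional[str], base: int = 3) -> int:
    """Delay in milliseconds before each line is sent at this depth."""
    return base ** (1 + depth(path))
```

Every random link the bot follows adds a path segment, so each level triples the wait between lines.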

Ciao

--
boky

Take this one step further... by Jason+Levine · 2002-04-12 02:44 · Score: 4, Interesting

There's a spam-blacklist, so how about a spambot-blacklist?

You'd have a standardized spambot trap (like the one described in the article) on various webservers. The new spambot info could go into a "New SpamBots" database (which wouldn't be blocked). Once a day, the webserver would connect up with a central database and submit the new spambot info it's obtained. Then the server would download a mirror of the updated "SpamBots" database which it would use to block spambots.

The centralized SpamBots database would take all of the new SpamBot info every day and analyze them in some manner as to detect abuse of the system (ensuring that only true spambots are entered). E-mails could be fired off to the abuse/postmaster/webmaster for the offending IP address. Finally, the new SpamBot info would be integrated into the regular SpamBot database.

This way you'd be able to quickly limit the effectiveness of the Spambot-traps across many websites.
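One way the central analysis step might look, sketched in Python. The post leaves the abuse check open ("in some manner"), so the rule used here, that an address must be reported by at least two independent web servers, is purely an assumption, as are the names.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def merge_reports(central: Set[str],
                  reports: Dict[str, Iterable[str]],
                  min_reporters: int = 2) -> Set[str]:
    """Fold one day's "New SpamBots" submissions into the central list.

    `reports` maps each submitting web server to the IP addresses it trapped.
    An IP is accepted only if at least `min_reporters` servers reported it.
    """
    counts: Counter = Counter()
    for ips in reports.values():
        for ip in set(ips):  # one vote per server, however often it saw the IP
            counts[ip] += 1
    confirmed = {ip for ip, votes in counts.items() if votes >= min_reporters}
    return central | confirmed
```

Each web server would then download the returned set as its updated mirror.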

--
My sci-fi novel, Ghost Thief, is now available from Amazon.com.

Attn Spambot Authors by NiftyNews · 2002-04-12 02:48 · Score: 5, Interesting

Dear Spambot Authors,

Thanks again for your interest. I hope that we were able to help you write the spambots of the future that will be able to detect and sidestep as many of the above protection schemes as possible. We tried to work all of our knowledge into one convenient thread for your development team to peruse.

Thanks for your interest in SlashDot, home of too much information.

--
------
Today's Top Deals

Re:Elements of good design I'd missed - P.Solution by skaldrom · 2002-04-12 02:56 · Score: 2, Interesting

There is another solution: Usually these SpamBots are not able to execute JavaScript...
As described at http://www.joemaller.com/js-mailer.shtml you can combine JavaScript and images to protect your mail. I've had very good experiences with this one....

But, as stated on the Website: this game is an arms race...

What I use by Phroggy · 2002-04-12 03:26 · Score: 3, Interesting

Take a look at these two bits of code from http://www.slickhosting.com/contact.shtml :

<A HREF="mailto:hosting%40slickhosting.com" onMouseOver="window.status='mailto:hostingslickhosting.com';return true;" onMouseOut="window.status='';">hostingslickhosting.com</A>

--
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;

Re:Block? Are you kidding? by LinuxHam · 2002-04-12 03:29 · Score: 3, Interesting

postmaster@127.0.0.1 and abuse@127.0.0.1

Good idea, but I'm sure spam software has been rejecting 127.0.0.1 for many years.

How about a few people volunteering real FQDNs that all resolve to 127.0.0.1? I realize that people would be volunteering horsepower and bandwidth for DNS lookups, but it would be in the name of dramatically reducing spam. Then, keep a list of all the "loopback FQDN's" and let the rest of us feed those FQDN's into spam-trap generators. Eventually, there would be so many real-looking spam trap email addresses that the spam software wouldn't be able to keep up with the list of loopback FQDN's.

To take it to the next level, you could hide the list of "loopback FQDN's" by making a reverse DNS lookup against a couple of volunteered IP addresses return a random FQDN from the list of loopback FQDN's at the time that the spamtrap page is dynamically generated.

Spammers would never know the entire list of FQDN's that resolve to loopback.
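For what it's worth, this is the check spam software would have to run on every harvested domain to weed such addresses out, sketched in Python. The function name is made up, and the resolver is passed in so the check can be tried without touching DNS.

```python
import ipaddress
import socket
from typing import Callable


def resolves_to_loopback(hostname: str,
                         resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """True if the hostname resolves to a loopback address such as 127.0.0.1."""
    try:
        return ipaddress.ip_address(resolve(hostname)).is_loopback
    except (OSError, ValueError):
        # Resolution failed, or the resolver returned something unparseable.
        return False
```

Rejecting a literal 127.0.0.1 is a string comparison; rejecting a loopback FQDN costs one DNS lookup per domain, which is the extra work the scheme is counting on.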

--
Intelligent Life on Earth

Don't stop spambots, feed them with Sugarplum by dananderson · 2002-04-12 03:35 · Score: 3, Interesting

I don't stop spambots, I feed them. I feed them phony email addresses and addresses of spammers (gathered from places such as my fake /cgi-bin/formmail.pl). I use http://www.devin.com/sugarplum/, mentioned before on /. to dish it out!

Better yet, use a Spam Troll-box by samhart · 2002-04-12 03:57 · Score: 2, Interesting

We've recently set up a Spam Troll-box using Vipul's Razor on our new Tux4Kids dev server (you can find our troll box here).

A troll-box gives Spam-bots a place to send their spam. When this box intercepts the spam, it reports it to the Vipul's Razor network, and everyone else on this network becomes aware of that spam (if they are also using Vipul's Razor to filter, which, chances are they are, it will filter that spam if they get it).

If Vipul's Razor isn't enough, one can even use something like SpamAssassin in conjunction with Vipul's Razor to get even better results.

Of course, this isn't cutting off Spam-bots at their source... but if enough sites were to cut them off at their source, then I'd imagine the Spam-bot authors would get wise to this and devise a way around it. Whereas with something like a Spam Troll-box, the Spam-bots seem to still be working to those running the Spam-bots ;-)

Let's feed the serpent its own tail by Crash+Culligan · 2002-04-12 03:58 · Score: 2, Interesting

This morning, after finding a junk fax on the office's voice mail system, I called the removal number (in little text at the bottom of the fax) and reached an automated voice system that would either 1) remove an inputted number, 2) add a new number, or 3) talk to a representative about their service.

Well, I didn't trust (1), and (3) just got me a voice mail box instead of a person I could chew out, which I didn't use. That left (2), and I had a wicked idea:

I hit 2, and input the number that I should call if I was interested in the fax (which appeared in BIG text right above the little text). Their own response number should start eventually getting faxes from them or, as I tend to experience, hangups.

Cute story, I know, but what does this have to do with defeating spambots?

I went to the page indicated...

I was just checking out one of the email harvesting products and saw this [getyoursoftware.com]

And I scrolled to the bottom, and looked at the source code, and noted two faaaaaascinating things:

First, the HTML on that page is rather clean; I can see no evidence of anti-spambot code on their page.

And second, the "Contact Us" link at the bottom is a mailto:.

By all appearances, their page is vulnerable to their own spambot.

So I had the thought... what if those generated-random-email-address pages were geared to produce not-so-random email addresses? What if the email addresses on those generated-page traps were geared to generate random email addresses at the domains of the various spambot-- (err, I mean) harvester producing companies? Let them see what it's like when less than discerning spammers use their software for evil. Hundreds of Viagra-substitutes! Thousands of hangover cures! Tens of thousands of opportunities to refinance their home mortgage!

This is just an off-the-top-of-my-head idea. Opinions?

--
You cannot truly appreciate Dilbert until you read it in the original Klingon.

What about a Terms of Service page by splattertrousers · 2002-04-12 04:25 · Score: 2, Interesting

What about requiring all of your users to go through a terms of service page before accessing any parts of your site?

The page could have a form with "Accept TOS" and "Reject TOS" buttons. I wonder how many spambots would submit a form?

And to catch spambots that did submit the form, your TOS could have some clauses that make it a violation for evil spiders (ones that don't honor "robots.txt") to use the site. Maybe you could make||lose a few bucks suing the spambotters who go through the TOS and still harvest your email addresses.

New Program - Mailwasher by Peale · 2002-04-12 04:25 · Score: 4, Interesting

Speaking of spam, I've come across this new program called mailwasher. You can check your mail while it's still on the server, and then - get this - fake a bounced message. There are probably other programs that do this, but this is the first one I've heard of.

Anyway, AFAIK, it's WinBlows only, and available at http://www.mailwasher.com, although right now it seems the site is down, all I get is a 404!

Re:A better solution: obfuscate the mailto: link by Sangui5 · 2002-04-12 05:16 · Score: 5, Interesting

Some spambots will render that correctly. They are less likely, though, to render an email address that has had this done to it: it's encrypted through JavaScript.

It is a rather impressive piece of work. Uses honest-to-god RSA.

You could also encrypt all email addresses, and then in your spambot trap, put really really CPU intensive javascript. You'll win either way: either the spambot doesn't do javascript, and it won't get your addresses, or it does do javascript, and they've just spent an eternity wasting time. It would work the same way as a tarpit, but it wouldn't eat nearly so many resources on your end.

If you're really clever, you could have the javascript do useful work, and then have the results of that work encoded into links in the page. You could then retrieve the results when the spider follows the link.

There was an idea called hashcash floating around a while back. The idea was that an SMTP server would refuse to deliver email unless the sender provided a partial hash collision of so many bits against some given value. The sender has to expend asymmetrically more resources to generate the collision than the receiver needs to check it. That way one can impose a cost on sending a lot of email. It's not enough to be a burden on ordinary users, but if you need to send thousands of emails, it will add up.
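A toy version of the hashcash idea, in Python. This is a simplified sketch, not the real Hashcash stamp format: the stamp here is just "resource:nonce", and the sender searches for a nonce whose SHA-1 digest starts with a given number of zero bits.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def mint(resource: str, bits: int) -> str:
    """Sender side: brute-force a stamp, about 2**bits hashes on average."""
    for nonce in count():
        stamp = f"{resource}:{nonce}"
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp
    raise AssertionError("unreachable: count() never ends")


def verify(stamp: str, resource: str, bits: int) -> bool:
    """Receiver side: a single hash is enough to check the work."""
    if not stamp.startswith(resource + ":"):
        return False
    return leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits
```

Each extra bit doubles the sender's expected work while the receiver's check stays at one hash, which is the asymmetry described above.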

How about trying this by SnarfQuest · 2002-04-12 05:18 · Score: 2, Interesting

There are "scanner" traps that start up a session and then just drop it (without telling the scanner), which ties the scanner up until its software times out.

How about writing something for these spambots using a special web server that slowly responds to its requests (sends out a small packet every 10 seconds) so it won't time out and won't consume much cpu time, and just feeds it a line or two of junk with each packet. Have it randomly generate a never ending supply of useless information to keep the spambot happy. While it's busy with the useless site, it's not bothering other people nor is it getting any real addresses.
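Something like this generator could sit behind such a server. It's a Python sketch with made-up names; hooking it up to a real socket or streaming HTTP response is left out, and the sleep function is a parameter so the pacing can be checked without actually waiting.

```python
import random
import string
import time
from typing import Callable, Iterator, Optional


def drip(chunks: Optional[int] = None,
         delay: float = 10.0,
         sleep: Callable[[float], None] = time.sleep,
         rng: Optional[random.Random] = None) -> Iterator[bytes]:
    """Yield one small line of junk every `delay` seconds.

    With chunks=None the stream never ends, which is the point of the trap.
    """
    rng = rng or random.Random()
    sent = 0
    while chunks is None or sent < chunks:
        sleep(delay)  # the process is idle here, so CPU cost stays low
        junk = "".join(rng.choice(string.ascii_lowercase) for _ in range(16))
        yield (junk + "\n").encode("ascii")
        sent += 1
```

The server would write each yielded chunk to the connection and flush it, so the bot keeps receiving data and never hits its timeout.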

--
Who would win this election: Andrew Weiner vs Andrew Weiner's weiner.

Re:Block? Are you kidding? by Fëanáro · 2002-04-12 11:19 · Score: 2, Interesting

How about a few people volunteering real FQDNs that all resolve to 127.0.0.1? I realize that people would be volunteering horsepower and bandwidth for DNS lookups, but it would be in the name of dramatically reducing spam. Then, keep a list of all the "loopback FQDN's" and let the rest of us feed those FQDN's into spam-trap generators. Eventually, there would be so many real-looking spam trap email addresses that the spam software wouldn't be able to keep up with the list of loopback FQDN's.

Slashdot has been doing that for years with warez.slashdot.org . try it, it resolves to 127.0.0.1
I always enter postmaster@warez.slashdot.org in spamforms

Slashdot Mirror
