Preventing Forum Spam-bots?

One word: by MadDog+Bob-2 · 2006-04-07 09:50 · Score: 5, Informative

Please use correct terminology by Raul654 · 2006-04-07 09:52 · Score: 4, Informative

For the record, those blurred/skewed letters and numbers are called a "Completely Automated Public Turing test to tell Computers and Humans Apart", or CAPTCHA.

--

To make laws that man cannot, and will not obey, serves to bring all law into contempt.
--E.C. Stanton

Re:Please use correct terminology by croddy · 2006-04-07 10:00 · Score: 4, Insightful

Before you implement a captcha, please consider the effect this will have on visually impaired users. Obviously, any system relying on an image will not be accessible to blind people; systems making use of colored images may not work for colorblind people. Providing audio captchas would help, but this will be a problem for people who are deaf -- and one cannot simply assume that users are not both deaf and blind.
I have seen some captchas that ask users in plain text to solve a simple arithmetic or logic problem. This is going to be far more accessible than anything relying on embedded media.
If you're sure that none of your users are blind or colorblind (which would be plausible only for an extremely small user base), then I suppose something like KittenAuth might be appropriate.
Re:Please use correct terminology by Xibby · 2006-04-07 10:15 · Score: 4, Insightful

The forums that I run have a "If you are visually impaired or cannot otherwise read this code please contact the Administrator for help." with a mailto link.

This has yet to be a problem as the forums that I run are orientated around shooters or MMPOGs. :)

--
I'm going to go back in my box and will think within the limits of my box: MS Sucks Linux Good I read too much Slashdot.
Re:Please use correct terminology by stevey · 2006-04-07 10:31 · Score: 3, Interesting

You could also go for the cuteness approach:
- Kitten Authentication
Click on the three images which are OMG Kittens and you're identified as human.
Re:Please use correct terminology by Jester998 · 2006-04-07 10:39 · Score: 4, Funny

I have seen some captchas that ask users in plain text to solve a simple arithmetic or logic problem.

While not illegal, some may consider it amoral to discriminate against stupid people.
Re:Please use correct terminology by Dr.Evil · 2006-04-07 11:38 · Score: 2, Insightful

If you read the article introducing the kittens concept, you'll see that the author intends it to be customized to each site, thus preventing spambots from simply memorizing the pictures. And randomly picking three out of 9 images gives only a 1/84 chance of success, better than many word captchas are achieving these days.
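
The 1/84 figure is just the number of ways to choose 3 images out of 9. A quick check, assuming exactly three of the nine images are kittens and the bot clicks three distinct images at random (the function name is made up for illustration):

```python
from fractions import Fraction
from math import comb


def random_guess_success(total_images: int, kittens: int) -> Fraction:
    """Chance that a random pick of `kittens` distinct images is exactly the kitten set."""
    # Only one of the comb(total_images, kittens) possible selections is correct.
    return Fraction(1, comb(total_images, kittens))


print(random_guess_success(9, 3))  # 1/84
```

A bot that is allowed unlimited retries will still get through eventually, so this number only means something if failed attempts are limited or logged.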

Anyone who wants to custom-program a bot for a single site would just be better off manually posting their spam.

--
Right...
Re:Please use correct terminology by Fulcrum+of+Evil · 2006-04-07 16:18 · Score: 2, Insightful

While not illegal, some may consider it amoral to discriminate against stupid people.

Immoral? Hell, it's a moral imperative!

--
"We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"

Easy by Kj0n · 2006-04-07 09:55 · Score: 4, Funny

Just display a confirmation page with the goatse.cx picture.

Anyone who can still click on the confirm button is not human.

Also... by Raul654 · 2006-04-07 09:57 · Score: 3, Informative

...it's patented. (and Turing is spinning in his grave...)

--

To make laws that man cannot, and will not obey, serves to bring all law into contempt.
--E.C. Stanton

add ad hoc customizations by etymxris · 2006-04-07 10:00 · Score: 3, Insightful

Add hidden variables to submission forms that change every day. This will force the bot software to do page scraping for your specific webforum, which probably isn't worth their time. They will go to the easier targets first.
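
A minimal sketch of a hidden form value that changes every day. The secret key, the function names, and the choice of HMAC-SHA256 are all assumptions for illustration, not something any particular forum package provides:

```python
import hashlib
import hmac
from datetime import date, timedelta

SECRET_KEY = b"change-me"  # hypothetical per-site secret; keep it out of the page source


def daily_token(day: date) -> str:
    """Value to embed in a hidden form field for the given day."""
    mac = hmac.new(SECRET_KEY, day.isoformat().encode(), hashlib.sha256)
    return mac.hexdigest()[:16]


def is_valid_token(submitted: str, today: date) -> bool:
    """Accept today's and yesterday's value so forms loaded before midnight still work."""
    candidates = (daily_token(today), daily_token(today - timedelta(days=1)))
    return any(
        hmac.compare_digest(submitted.encode(), candidate.encode())
        for candidate in candidates
    )
```

A bot that replays a form it scraped last week will send a stale value and can be rejected (or logged) before the post is ever stored.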

But if they are defeating captcha, there is probably someone who just sits there manually spamming forums through anonymous proxies. The amount of money that can be made by doing this spamming is probably enough to pay people with lower standards of living to just do it manually. And if that's so, there's just no way to get around it. I started logging how many bots the captcha and hidden variables were catching, and it was tons. Still, I get spammers. Just not nearly as many.

Two good approaches by aiken_d · 2006-04-07 10:08 · Score: 3, Insightful

Good: CAPTCHA

Better: dynamically change the names of form fields ("subject", "message", etc) based on the current time. MD5 hash the current hour with the field name, and have the software only check the current and previous values. Spam bots generally have to be told what field names to look for.
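
A rough sketch of the hourly field-name idea, using MD5 as described. The site salt, the "f_" prefix, and the function names are assumptions added for illustration; without some secret in the hash a bot author could compute the names too:

```python
import hashlib
from datetime import datetime, timedelta

SITE_SALT = "change-me"  # hypothetical secret mixed into every hash


def field_name(logical_name: str, moment: datetime) -> str:
    """Obfuscated form field name for a logical field such as 'subject'."""
    hour_stamp = moment.strftime("%Y%m%d%H")  # changes once per hour
    digest = hashlib.md5(f"{SITE_SALT}:{hour_stamp}:{logical_name}".encode()).hexdigest()
    return "f_" + digest[:12]  # prefix keeps the name starting with a letter


def read_field(form: dict, logical_name: str, now: datetime):
    """Look the field up under the current and the previous hour's name only."""
    for moment in (now, now - timedelta(hours=1)):
        key = field_name(logical_name, moment)
        if key in form:
            return form[key]
    return None  # stale or hard-coded field names end up here
```

When rendering the form, call field_name("subject", datetime.now()); when handling the POST, call read_field(form, "subject", datetime.now()). A bot that was told to fill in "subject" and "message" submits nothing the software recognizes.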

Best: have good moderators who kill spam and block IPs more or less instantly. Not practical for smaller sites, of course.

-b

--
If I wanted a sig I would have filled in that stupid box.

Don't use well known forum software by savala · 2006-04-07 10:09 · Score: 4, Interesting

Don't use phpbb, vbulletin or whichever other forum software everyone uses. Don't name your registration page "register.php" or something similarly easy to guess. Don't give your username and password fields name and id attributes of "username" and "password". Etc, etc. There is no security in obscurity, but there sure as hell is lots of convenience and freedom from automated harassment.

The rewards for writing scripts that can handle the subscription process for all the big software packages are simply too large. Yes, these software packages will now start up the arms race, same as has happened with weblogs and email and referer spammers (does anyone else have the feeling we've won that last one, btw?). You can try and follow along and update your forum software every other day. But it's much more convenient to simply duck under the radar. Chances are no spammer is going to bother figuring out how to register at your custom-built/modified forum.

Re:Don't use well known forum software by Spudley · 2006-04-07 11:04 · Score: 2, Informative

Don't use phpbb, vbulletin or whichever other forum software everyone uses

Much as I hate to agree with that, he speaks the truth -- the bots are written to target specific forum packages, and they almost always go after the popular ones. phpBB has taken a lot of stick for one or two security problems that came up, but in truth it's as good as, if not better than, its competition; the reason it gets hit so badly is simply because it's so popular.

So if you can use a less-well-known package, that will keep you away from the prying eyes of most of the bots.

Alternatively, you could mod one of the well known packages, so that the bots no longer work with it. That could be something as simple as changing the fieldnames on the registration form, or changing the URL of the registration page. If you know enough PHP/ASP/whatever to make the necessary changes, that would be a good solution; you'd still have the features of your favourite package, but not the bots.

While you're modding the forum, it would also be a good idea to add a block to prevent new members from posting links. (If you're really lucky, your forum package may include this feature already.) Spambots aren't any use if they can't post spam, and spam requires a link, so kill off the links, and you'll kill off the bots. Members should only be able to post links after they've proved themselves trustworthy.

CAPTCHA is a great idea, but if you're using a common one (i.e. the one included in your forum package), the odds are that the spammers have cracked it already. But again, the bots are likely to be programmed with the specific CAPTCHA-cracker for their forum, so if you can replace it with a less-common method, that will also bamboozle the bots.

If you are still using a well-known forum package after all that, you should also consider modifying the page template to remove references to the software name and version. Some bots look for specific versions of a forum to attack a known weakness, so stripping out the identifying marks will make it harder for them.

Security by obscurity is a much hated phrase around here, and with good reason. Still, it is highly effective against the blind automated attacks of your average spam-bot. But whatever you do, even if it seems to be working, don't take your security for granted. Never let your guard down.

--
(Spudley Strikes Again!)

What email addresses are they using? by oni · 2006-04-07 10:11 · Score: 2, Interesting

If they are using something like hotmail, then maybe just disallow hotmail. Nobody with a brain uses it anymore anyway.

If they are using Gmail, then maybe Google would be nice enough to start a service where you could report addresses that bots are using. The great thing about Google requiring invites is that Google now has this neat chain of responsibility. If they see a pattern where all of the addresses created by invites from a certain person's account have been used as bots, then they could delete all those accounts and all the accounts they invited. That would seriously screw the spammers.

Re:What email addresses are they using? by John+Miles · 2006-04-07 10:53 · Score: 2, Insightful

That's actually a really good point. You could require a GMail account for registration -- effectively leveraging Google's spamfighting capabilities for your own purposes.

--
Dahlmann tightly grips the knife, which he may have no idea how to use, and steps out into the plain.

What worked for me by FreelanceWizard · 2006-04-07 10:14 · Score: 2, Interesting

I'm guessing you're using phpBB. I've actually been hit by these guys on my boards; it wasn't a problem for me until they started to post. It appears to be actual people and not robots. I should also note I didn't have this problem until I added Google AdSense to my boards. After I did that, I started to get two or three of these spammers each week. Another phpBB board I administer hasn't gotten a spam user yet.

What worked for me was checking the registration e-mail addresses of these people and putting in bans for "*@mail.ru" and "*@*.info". On phpBB, you'll have to manually add these to your ban list table in the forum database. Given that a US board isn't likely to have legitimate users coming from Russia or with .info e-mail addresses (.info generally being the Internet equivalent of the sleazy parts of a big city), I don't think I'm really affecting potential new users. I haven't gotten any complaints or new spam users yet, so my technique seems to be working.
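
For anyone doing the same check outside phpBB's ban table, here is a small sketch of matching registration addresses against those two wildcard patterns. It uses Python's fnmatch purely as an illustration; this is not how phpBB itself stores or evaluates bans:

```python
from fnmatch import fnmatchcase

# The two patterns mentioned above; extend the list as needed.
BANNED_EMAIL_PATTERNS = ["*@mail.ru", "*@*.info"]


def is_banned_email(address: str) -> bool:
    """True if the address matches any banned wildcard pattern (case-insensitive)."""
    normalized = address.strip().lower()
    return any(fnmatchcase(normalized, pattern) for pattern in BANNED_EMAIL_PATTERNS)
```

Note that "*@*.info" only catches addresses whose domain ends in .info; something like info@example.com is left alone.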

--
The Freelance Wizard

Re:Grace period? by Donniedarkness · 2006-04-07 10:16 · Score: 3, Informative

While this will keep some of the bots away, it will also cause the site to lose members. When I sign up on a forum, it is usually because I want to post RIGHT THEN. Of course, I'll probably continue to post on it.

If a site makes me wait three days, though, I'm likely to forget about it in that time.

Or were you talking about smaller grace periods? Perhaps 10 minutes? That might work well.

--
Earn a % of cash back from Newegg, Tiger Direct, Walmart.com, and more: http://www.mrrebates.com?refid=458505

Be proactive! by BertieBaggio · 2006-04-07 10:19 · Score: 4, Insightful

There are a number of options you have, depending on how aggressive you want to be. You may have implemented some of these suggestions already, but they may help other forum admins in a similar quandary.

Firstly, disable anonymous posting. What works for Slashdot does not necessarily work for phpBB. This may sound obvious, but a forum I check on now and again is slowly haemorrhaging members due to guest bot spam.

Secondly, find yourself a list of public proxy servers. Ban them. Find some more. Ban them too. Also, take note of the IPs the spambots were using to post. Ban them as well (unless they are AOL IPs -- be smart and do an nslookup). Keep this list of banned IPs, and share it with the blacklist groups, or other forum admins you know. You help them, they help you.

Thirdly, augment your signup process. You say you are using CAPTCHAs, but if the bots are getting around or through them, you have to do more. Write a few hundred straightforward questions; you can get your community to help you with this one. Have one or two of those questions displayed at registration time, along with the CAPTCHA. For example:

Which of these is not one of the seven dwarves?

Doc
Sleepy
Bashful
Horsey

Or would you like another question?

Keep this as simple as possible. "What color is the sky?" is about the level you are looking for. A bot won't be able to answer these unless it is specifically programmed to. Need I say you should serve a random question?
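
Serving a random question takes very little code. A minimal sketch, where the question list, the accepted answers, and the function names are made up for illustration:

```python
import random

# (question text, set of accepted answers in lowercase)
QUESTIONS = [
    ("Which of these is not one of the seven dwarves: Doc, Sleepy, Bashful, Horsey?", {"horsey"}),
    ("What color is the sky?", {"blue"}),
]


def pick_question(rng=random):
    """Return (index, text) of a randomly chosen question; store the index in the session."""
    index = rng.randrange(len(QUESTIONS))
    return index, QUESTIONS[index][0]


def check_answer(index: int, answer: str) -> bool:
    """Compare the submitted answer, ignoring case and surrounding whitespace."""
    return answer.strip().lower() in QUESTIONS[index][1]
```

Keep the question index on the server side (for example in the session) rather than in a hidden form field, otherwise a bot can simply always submit the index of a question it already knows.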

For bonus points on this one, make the questions something to do with the topic of the forums. If the forums were about widgets, you could ask something (really basic) like "What is the most common color of widget?". Or make some of the questions about the TOS. You know, the thing everyone agrees to by checking the box that says "I agree to abide by the TOS". This may alienate some people, though, which you may or may not want. Also remember to consider non-native English speakers.

If you are still getting those darned bots, consider manually approving all registrations. This will obviously depend on how many new signups you get, and what kind of manpower you have (think moderators and "trusted community members"). On the other hand, you should be able to spot and stop bots right off the bat.

But why stop there? Be even more proactive! Set up a honeypot. Disallow a certain directory with robots.txt, and ban all IPs that find their way there. Include an invisible link to the disallowed location and see what falls in the trap. Remember that blacklist you started earlier? Add (and share) these IPs!
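
A bare-bones sketch of the honeypot logic. The trap path, the in-memory ban set, and the handler function are hypothetical; a real forum would hook this into its web server or framework and persist the bans:

```python
TRAP_PATH = "/bot-trap/"  # hypothetical directory that no human has a reason to visit

# Served as /robots.txt so well-behaved crawlers stay out of the trap.
ROBOTS_TXT = f"User-agent: *\nDisallow: {TRAP_PATH}\n"

banned_ips = set()  # in-memory only for this sketch


def handle_request(path: str, client_ip: str) -> int:
    """Return the HTTP status code to send for this request."""
    if client_ip in banned_ips:
        return 403
    if path.startswith(TRAP_PATH):
        # Anything that ignores robots.txt and follows the invisible link gets banned.
        banned_ips.add(client_ip)
        return 403
    return 200
```

The invisible link on your pages points at TRAP_PATH. Legitimate search engine crawlers honor the Disallow line and never request it, so whatever shows up there is a good candidate for the shared blacklist.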

Finally, let your community know what you are doing. They will appreciate the effort (If you have noticed the spam, so have they). Set clear guidelines, and encourage community vigilance.

In the end, remember: spam is beatable.

--
If all you have is a grenade, pretty soon every problem looks like a foxhole -- MightyYar

Use Slashdot's method by c0d3h4x0r · 2006-04-07 10:20 · Score: 3, Insightful

"Captcha" techniques aren't bulletproof. If someone can automate all but the "captcha test" part of the posting process, then someone can sit and repeatedly answer the captcha test and still post spam pretty efficiently.

The only truly effective way to stop this crap is to require a certain amount of time to elapse before being able to post another post, like the way Slashdot does it, and to implement some kind of moderation+filtering system so the crap can all be modded down by vigilant users. Combine that with a couple other requirements (you must have a user account to post, and new users can't post for the first 48 hours), and you'll easily squash the spam problem.
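
The two timing rules are easy to sketch. The 48-hour figure comes from the paragraph above; the two-minute gap between posts and the function name are assumptions picked for illustration, not Slashdot's actual settings:

```python
from datetime import datetime, timedelta

MIN_ACCOUNT_AGE = timedelta(hours=48)     # new users can't post for the first 48 hours
MIN_POST_INTERVAL = timedelta(minutes=2)  # assumed value; tune to your community


def may_post(registered_at: datetime, last_post_at, now: datetime) -> bool:
    """Apply both timing rules; last_post_at is None if the user has never posted."""
    if now - registered_at < MIN_ACCOUNT_AGE:
        return False
    if last_post_at is not None and now - last_post_at < MIN_POST_INTERVAL:
        return False
    return True
```

This only handles the throttling half; the moderation and filtering system still has to deal with whatever gets posted once the timers allow it.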

--
Moderator hint: a comment is neither "Flamebait" nor "Troll" if it is true.

by the users, for the users by McCarrum · 2006-04-07 10:24 · Score: 3, Interesting

i won't echo the above (kittens and altering html templates to make a more unique code process - both well worth it) but i will say that on one site i used to run, we made anyone with 1000 posts a member of a screening club .. and every new user had to have their posts screened before being posted .. once an account got to 10 non-spam posts, their group changed to allow normal postings.

i do recommend you use your community to help your community .. and odds are, they'll help as well

--
Robert Anton Wilson :: Rest in Slack

attack your site by kebes · 2006-04-07 10:25 · Score: 3, Interesting

I'm certainly no expert in such things, but here are some suggestions. The idea, of course, is to make life difficult for the spam-bot (or the spam-bot writer I suppose) without making life hell for your users. You seem to already be using a CAPTCHA, but you could switch to a different one. Every time you switch, the bot-writer has to update his code. This is annoying for him but is no big deal for your users, since they are humans and can pass whatever simple visual test you give them. You might also consider making small changes to the HTML of those "make new account" pages. It's likely that the bot is making many assumptions about how your page is organized. Changing the names of forms (or having random names), or changing subtle things about the layout (things that a human wouldn't even notice, but which would break an HTML parsing program that was expecting your page to be organized in a certain way) are also good ways to slow down the bots. Make the HTML obfuscated. Include bogus hidden forms, for instance.

Perhaps the best way to fix your site is to attack it yourself. Try to write a simple bot that automates the login process, and see what happens. You may suddenly notice a subtle hole in your security (maybe the filename for the captcha gives away what it is... or maybe after a successful verification, the same cookie can be used to create another account... or something). In the process of attacking your own site you may uncover something you've missed before.

radical measure by dario_moreno · 2006-04-07 10:48 · Score: 2, Interesting

I saw a forum which required that you post a (non-'shopped) picture of yourself holding a 45 rpm record of the artist the forum was about before getting an account...best signal/noise ratio I ever saw, along with rec.guns, which seems to be moderated by gods because of the very high flame and spam potential!

--
Google passes Turing test : see my journal

Cheep medz by fm6 · 2006-04-07 12:00 · Score: 4, Funny

www.cheapmeds.com

Re:Good moderators help... by drspliff · 2006-04-07 12:53 · Score: 2, Informative

Google, Yahoo and MSN have already done this. Simply insert 'rel="nofollow"' into all the link tags that people post in the comments, and although they still show up it makes it pointless for those spammers trying to increase their PageRank.
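
A quick-and-dirty sketch of forcing rel="nofollow" onto every link in a user-submitted HTML fragment. The regular expressions and the function name are illustrative only; they handle ordinary well-formed tags, and a production site should run comments through a real HTML sanitizer instead:

```python
import re

A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
REL_ATTR = re.compile(r"""\s+rel\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE)


def add_nofollow(html_fragment: str) -> str:
    """Rewrite every opening <a> tag so it carries exactly rel="nofollow"."""

    def fix(match):
        # Drop any rel attribute the poster supplied, then append our own.
        attrs = REL_ATTR.sub("", match.group(1))
        return f'<a{attrs} rel="nofollow">'

    return A_TAG.sub(fix, html_fragment)
```

Stripping any rel attribute the poster supplied matters: otherwise a spammer could submit their own rel value and dodge the rewrite.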

I know this won't help with the unsightly comments on your website, but since this is the Slashdot crowd just flag all the comments with URLs in them as 'hidden' and on a daily/whenever basis go through them deleting spam and unhiding legitimate comments. Stick this all in a central control panel and it's unlikely to take up more than 10 minutes of your time.

In addition to that, just stop any client with a useragent string that contains a URL or one of the known spambot names.

http://www.kloth.net/internet/bottrap.php - A quick implementation of a bot-trap, which bans bots which don't follow your robots.txt directions.

What's worked for me: easy damage control. by WoTG · 2006-04-07 14:17 · Score: 2, Informative

I run a quiet phpBB for forum support of some websites of mine. For the last few months spam has outnumbered real posts by a large margin. I tried a CAPTCHA module (I think it was the built-in one) and it did next to nothing - they aren't programs, the posts are from humans who have (low paying) jobs to post links on message boards.

I had reasonable success by limiting posts to people who have verified their email address -- I think that that was also a feature of a recent phpBB update.

But the spam still outnumbered posts, so in the last two weeks I've added these two phpBB mods:
http://www.phpbbhacks.com/download/4878 - this mod checks each registration IP address against the DNS blacklists. I think that it improved the situation, but it didn't stop the problem outright, and I still had to clean up the board once in a while.
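
For the curious, a DNS blacklist lookup is just a DNS query for the IP address with its octets reversed, under the blacklist's zone. A minimal sketch of that convention; the zone name in the usage note is a placeholder, and this is not the code of the mod above:

```python
import ipaddress
import socket


def dnsbl_query_name(ipv4: str, zone: str) -> str:
    """Build the hostname to query: octets reversed, followed by the blacklist zone."""
    address = ipaddress.IPv4Address(ipv4)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ipv4: str, zone: str) -> bool:
    """True if the blacklist returns an address record for this IP."""
    try:
        socket.gethostbyname(dnsbl_query_name(ipv4, zone))
        return True
    except socket.gaierror:
        # No record (or the lookup failed): treat as not listed, i.e. fail open.
        return False
```

For example, dnsbl_query_name("192.0.2.10", "dnsbl.example.org") gives "10.2.0.192.dnsbl.example.org". Substitute the zone of whichever blacklist you actually use, and check its usage terms first.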

http://www.phpbbhacks.com/download/6208 - this mod gives a really easy way to delete a user and all of their posts at once. It's not a fix, but it's turned out to be the best solution. It only takes a few seconds to undo the damage from any one individual, no matter how many spam posts they have made. A person could spend 20 minutes registering and posting 20 messages and I have to spend 20 seconds nuking the account and all its posts. It's a fair trade, and I get some small satisfaction in that!

Re:Grace period? by FLEB · 2006-04-07 16:46 · Score: 2, Insightful

It would work reasonably well in reverse: Allow the person's posts, but forward them to a moderator. If the moderator determines them to be spam, that poster gets the boot (along with all their posts). Add in some intelligent "Find Similar" logic, and you'd have y'erself a good start at a forum anti-spam system.

--
Information wants to be free.
Entertainment wants to be paid.
You just want to be cheap.

mod_security by fthiess · 2006-04-07 17:04 · Score: 2, Informative

I've had quite good luck by using Apache mod_security (modsecurity.org) to filter web activity. Yes, the suggestions people have been giving about CAPTCHAs, blocking people with addresses in high spam domains, etc., are all good and useful, but mod_security lets you cover a base those approaches are missing: it lets you block spammers from posting spam, even if they somehow manage to get through your registration defenses. I use a mod_security ruleset based on one published at http://gotroot.com/tiki-index.php?page=mod_security+rules which watches POST content for URLs and terms commonly used in spam postings, and blocks them--in addition to rules that are more traditional for mod_security, such as blocking phpBB exploits--for which I've also found it invaluable. I administer several forums and wikis that were having quite bad problems, even with CAPTCHAs, email verification, and so on. . . but the problems pretty much went away once I pulled mod_security into the battle.

Re:Unstoppable captcha-buster by Baricom · 2006-04-07 22:15 · Score: 2, Insightful

I've wondered what would happen if you distorted the CAPTCHA using a site's name or URL instead of a random background. Do you think at least some people would hesitate a moment if you went to some random porn site and had to type a CAPTCHA with slashdot.org watermarked in the background?

Re:Good moderators help... by Baricom · 2006-04-07 22:20 · Score: 2, Informative

Stick this all in a central control panel and it's unlikely to take up more than 10 minutes of your time.

I basically gave up on blogging because I had to sort through 500 spam comments a day. I know another blogger who had to clean 7,000 (yes, thousand) spams out of his blog every day.

It took both of us longer than 10 minutes.

There's a much simpler method by Random+Walk · 2006-04-10 00:23 · Score: 2, Interesting

Forum spammers want to submit very specific content: hyperlinks (to boost their Google page rank). Our forum gets hammered by spambots hundreds of times per day, yet nothing comes through - we simply filter away any message containing a hyperlink (plain, non-clickable URLs are allowed). Works like a charm - no user registration, no fancy and annoying CAPTCHAs.
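
A sketch of that kind of filter. Which link markup a forum accepts is an assumption here: the pattern looks for HTML anchors and BBCode [url] tags, and leaves bare URLs in the text alone:

```python
import re

# HTML anchors with an href, or BBCode [url]/[url=...] tags.
LINK_MARKUP = re.compile(r"<a\b[^>]*\bhref\s*=|\[url\b", re.IGNORECASE)


def contains_hyperlink(message: str) -> bool:
    """True if the message contains clickable-link markup; plain URLs don't count."""
    return bool(LINK_MARKUP.search(message))
```

Messages for which contains_hyperlink() returns True get dropped before they are stored; make sure the forum software doesn't then auto-link plain URLs on display, or the filter achieves nothing.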
