Ask Slashdot: Using a Sandbox To Deal With Spambots?
shellster_dude writes "Slashdot is certainly no stranger to the problem of spam bots. While blocking a spam bot may seem like the best solution, it is likely that the spammer will simply re-register with a different name. While trying to solve this dilemma on my own forums, I had an epiphany. What if, instead of blocking a spam bot, I could mark a spammer, and then hide all their comments from everyone else? The spammer could continue to go their merry way, spamming to their heart's content. When they visit the forum, they see their spam comments correctly placed in the threads, but their comments would only be visible to them. Thus, an effective sandbox which would prevent them from registering a new user once they had been 'blocked.' Are any other Slashdotters familiar with this technique? Does any software currently use this technique?"
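For anyone who wants to see the idea concretely, here is a minimal sketch of the visibility rule the submitter describes. The Comment type, the visible_comments function, and the user names are all made up for illustration; real forum software would apply the same test inside its database query rather than in a list comprehension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    author: str
    body: str


def visible_comments(comments, viewer, shadow_banned):
    """Return the comments that `viewer` should see.

    A comment by a shadow-banned author is shown only to that author.
    `viewer` is None for a visitor who is not logged in.
    """
    return [
        c for c in comments
        if c.author not in shadow_banned or c.author == viewer
    ]


thread = [
    Comment("alice", "Has anyone tried this?"),
    Comment("spam4u", "Cheap pills at example.invalid"),
]
banned = {"spam4u"}

# The spammer still sees both comments; everyone else sees only Alice's.
print([c.author for c in visible_comments(thread, "spam4u", banned)])
print([c.author for c in visible_comments(thread, "bob", banned)])
```

Note that the spam is still stored and still served to the spammer, so the storage and bandwidth costs do not go away.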
Why is nobody responding?
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Old idea that doesn't fix much because spammers change accounts after 1-20 posts anyway.
This technique is used extensively at major media outlets such as the Swedish tabloid "www.aftonbladet.se." Facebook is used to register users.
When a user is perceived as spamming - or writing opinions that are unwelcome - the user is marked, and simply not displayed to other visitors. But the user himself does not know, and keeps spamming.
Evil. Pure evil.
http://en.wikipedia.org/wiki/Hellbanning
Reddit does something like this.
The practice goes by several other names I can't recall, but I know it as a "shadow ban"
Basically, you tick a box and nobody but that poster can see their nonsense.
Some forum software already includes the feature, others require a plugin or a roll-your-own solution.
[Fuck Beta]
o0t!
Steve Huffman, one of the creators of Reddit, talks about this exact solution during his Udacity class, Web Application Engineering. http://www.udacity.com/overview/Course/cs253/CourseRev/apr2012 I think it was during week 4 "Whom to Trust," but I don't have links to the exact video. So in short, yes, it has been done effectively in the past, though I believe they wrote their own code to do it.
This wouldn't work because spambots don't keep using a single account. If it were that easy spambots would have already been long defeated.
Either change accounts often, which I think is common anyway, or have a second bot checking if the posts show up, and stopping the first when it stops seeing the posts?
It's called hell-banning, and it's a blessing for bots, but unfair hell when applied unjustly to a non-spamming real user, as is often the case with automated solutions - I'm talking to you, Hacker News, you moronic cunts.
Seems like it would be easy enough to work around with a second bot that checks to make sure spam is getting through.
I went to eat some animal crackers and the box said, "Do not eat if seal is broken." I opened the box and sure enough..
What makes you think that they will stop just because their account doesn't get closed?
They will not notice the efficacy of their spam, they will just keep signing up and spamming. And you'll play whack-a-mole trying to put all their accounts into sandboxes.
Just how often does a spammer go back to see if his comment posted or not, or if his email got through? Rarely. Spam works on the basis of mass volume. Put a billion adverts on a billion websites and your sales will increase somehow. And the price of those adverts is next to zero after the first few thousand.
It won't work, but it will make a lot of hassle for you, from storage to filtering to just plain bandwidth if you have a thousand spammers realising they can auto-sign-up and spam you endlessly.
It's like running a "honeypot". You'll gather lots of data at great expense and resources. But you won't stop the spam.
What you're referring to is known as Hellbanning (https://en.wikipedia.org/wiki/Hellbanning) and is used on various sites. I'm mostly familiar with it from Hacker News which employs it.
and I think craigslist does it. I remember thinking functional data structures (like in Haskell) were a good match for this since it makes it easy to keep many independent views of the data.
Another trick is to slow down the server response to the spammer, e.g. to 1 minute, so they just think it is slow. I know the old photo.net used to do that.
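A rough sketch of the slow-response trick, assuming a hypothetical handle_request wrapper and a set of flagged user names. I have no idea how photo.net actually did it; this only shows the shape of the idea. The sleep function is passed in so the delay can be swapped out in tests.

```python
import time


def response_delay(user, flagged, delay_seconds=60.0):
    """Seconds to stall before answering `user`; zero for unflagged users."""
    return delay_seconds if user in flagged else 0.0


def handle_request(user, flagged, render, sleep=time.sleep):
    """Stall flagged users, then serve the page as usual."""
    delay = response_delay(user, flagged)
    if delay > 0:
        # A blocking sleep ties up a worker for the whole delay, so a busy
        # site would probably want a cheaper way to hold the connection.
        sleep(delay)
    return render()
```

The catch is that every stalled request also occupies something on your side, so the delay costs you as well as the spammer.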
It would certainly prevent spam temporarily but
a) the spammer would notice rather quickly if their spam doesn't show up in Google
b) the spammer could easily defeat the system by simply re-registering with another username
c) one mistake on implementing the system (eg. allowing users to read 'sandboxed' comments through a link) could maybe hide it from your users but not from the other bots that crawl your site (again Google and security bots) which would then mark your site as spam.
The problem is that spamming is usually automated so you have to have the end-user jump through hoops in order to defeat them. One of the forums I moderate actually requires a legitimate introduction on the topic of the forum before they are allowed to post in the general forums. Defeats most spammers as it's somewhat of a niche forum and automated spam is immediately recognized and user/ip banned.
Custom electronics and digital signage for your business: www.evcircuits.com
http://www.codinghorror.com/blog/2011/06/suspension-ban-or-hellban.html
Do worry about life, you will never get out alive.
It's a great idea but there's an issue... If aware of such a policy, such a spammer could create two accounts, one simply to be an "is my other account banned" validation-only account. The strategy could be more effective if the "invisibility" were applied on an IP basis (all accounts communicating from the same IP could also view the comments) or something of the like, but that strategy could as easily be subverted by switching IPs. Still, it increases the work required of a spammer and complicates their efforts, so I take it as an overall good method of discouraging spam or at least making it more expensive to spam.
I'm pretty sure that the vbulletin forum software has this feature. Users can be tagged by moderators such that all of their posts are invisible to the rest of the community. Members see their own posts. In a spambot situation, I would be cautious about using this approach on account of database growth and system maintenance. ymmv.
A decent enough idea to be sure, but it must be carried forward to conclusion. Not only could these be detected by a second bot account, the spammer is still eating up your resources, whether it be disk space or processing cycles to detect viewing by bot accounts. Even if legit users never see the spam, the spammer half wins by making your system work harder to filter them out.
What's even funnier is to allow all the people marked as "spammers" to see each other's comments as well. We called this the Secret Garden.
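The Secret Garden rule is a small change to the usual "only the author sees it" check. A sketch, with can_see and the user names invented for the example:

```python
def can_see(viewer, author, shadow_banned):
    """Secret Garden rule.

    Posts by unbanned authors are visible to everyone. Posts by banned
    authors are visible only to viewers who are themselves banned, so the
    banned users end up talking to each other.
    """
    if author not in shadow_banned:
        return True
    return viewer in shadow_banned


garden = {"troll1", "troll2"}
print(can_see("troll1", "troll2", garden))  # banned users see each other
print(can_see("alice", "troll1", garden))   # regular users do not
```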
Vbulletin implements this with their global ignore (a.k.a. Tachy Goes to Coventry) function.
upon the advice of my lawyer, i have no sig at this time
Seriously, what sort of fuckwit actually thinks that is the proper expression?
Let me guess: You hold down the fort too while you bunker down? You would of anyway.
If you're already too stupid to fucking write, stop trying to think about technical solutions. You're only going to fuck it up.
For extra points you could probably modify the registration process in all kinds of ways which would confound automated and replay attacks. Chances are that for the average forum this would be sufficient: no script would even bother to defeat it and would simply move on to softer targets.
There used to be a Web forum product called Beehive (not sure on its status these days) which had this as a feature. A spammer or troll could spew all they wanted to, and if the "worm mode" bit was set, only they could see their postings -- nobody else.
For a constant troll, I'd say go for it. For a hit and run spammer who really just wants to get stuff on the board and then run off, I'd say don't bother; they won't be back on that account most likely.
Your post advocates a
(X) technical ( ) legislative ( ) market-based ( ) vigilante
approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)
( ) Spammers can easily use it to harvest email addresses
( ) Mailing lists and other legitimate email uses would be affected
(X) No one will be able to find the guy or collect the money
(X) It is defenseless against brute force attacks
( ) It will stop spam for two weeks and then we'll be stuck with it
( ) Users of email will not put up with it
( ) Microsoft will not put up with it
( ) The police will not put up with it
( ) Requires too much cooperation from spammers
( ) Requires immediate total cooperation from everybody at once
( ) Many email users cannot afford to lose business or alienate potential employers
( ) Spammers don't care about invalid addresses in their lists
( ) Anyone could anonymously destroy anyone else's career or business
Specifically, your plan fails to account for
( ) Laws expressly prohibiting it
( ) Lack of centrally controlling authority for email
( ) Open relays in foreign countries
( ) Ease of searching tiny alphanumeric address space of all email addresses
(X) Asshats
( ) Jurisdictional problems
( ) Unpopularity of weird new taxes
( ) Public reluctance to accept weird new forms of money
( ) Huge existing software investment in SMTP
( ) Susceptibility of protocols other than SMTP to attack
( ) Willingness of users to install OS patches received by email
(X) Armies of worm riddled broadband-connected Windows boxes
(X) Eternal arms race involved in all filtering approaches
(X) Extreme profitability of spam
( ) Joe jobs and/or identity theft
( ) Technically illiterate politicians
( ) Extreme stupidity on the part of people who do business with spammers
( ) Dishonesty on the part of spammers themselves
( ) Bandwidth costs that are unaffected by client filtering
( ) Outlook
and the following philosophical objections may also apply:
( ) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
( ) Any scheme based on opt-out is unacceptable
( ) SMTP headers should not be the subject of legislation
( ) Blacklists suck
( ) Whitelists suck
( ) We should be able to talk about Viagra without being censored
( ) Countermeasures should not involve wire fraud or credit card fraud
( ) Countermeasures should not involve sabotage of public networks
( ) Countermeasures must work if phased in gradually
( ) Sending email should be free
( ) Why should we have to trust you and your servers?
( ) Incompatiblity with open source or open source licenses
(X) Feel-good measures do nothing to solve the problem
( ) Temporary/one-time email addresses are cumbersome
( ) I don't want the government reading my email
( ) Killing them that way is not slow and painful enough
Furthermore, this is what I think about you:
(X) Sorry dude, but I don't think it would work.
( ) This is a stupid idea, and you're a stupid person for suggesting it.
( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!
You're correct.
The option he was thinking of does exist in VB, but it's called "Tachy goes to Coventry"
It's good for dealing with trolls
I have a chat room that implements a few nice features. You can Hide from someone so that nothing you say is visible to them and your name does not appear in their user list. You can also ignore someone so that nothing they say will ever appear on your screen. Also the chat room has a two tier approach. New visitors to the chat room appear in a frame at the top (what we refer to as the lobby) and can not see anyone who is in the main chat room who does not want to be listed. Anyone in the main chat room can chat with them or ignore them. And if the new person turns out to be interesting and not a jerk they can be invited into the main room and then they will see both chat rooms in their separate frames. Jerks can also be demoted back to the lobby or banned. It keeps the spammers and flamers from annoying our pleasant conversations.
There's a site called Slashdot which allows comments to be rated from 0 to 5. Spam, trolls, and posts like this one will be moderated down to zero and blocked from view by most other users.
Check it out some time.
From what I understand from a contact of mine who works for a newspaper, their website has this functionality. They told me that when a spammer is blocked or their comment is deleted, they are the only ones who don't know. They can keep posting and they think their posts show up, but to the rest of the world they don't exist. Their website's comments appear to be run by a company called Pluck by DemandMedia.
I'm really happy to read this paragraph. I had the same epiphany when I began planning for a recipe website that allowed for comments without passwords (to avoid login hassle). I also worked out a similar system for the backend of an Omegle clone, essentially pairing abusive (Ctrl+V then exit, Ctrl+V then exit) users with a Cleverbot routine until they stopped spamming, sandboxing them from the greater user base.
From this thread, I learned this system is called "Hellbanning" and some of its downsides are similar to those of honeypots, e.g. you have to store useless data, bandwidth usage goes up because of those who think their spam is working, etc. I think these are fair complaints, but the jury is still out on whether these downsides outweigh the benefits of hellbanning.
Hellbanning represents an entirely new way of handling user submitted content. The current norm shows the status of every post to the user who created it. "That comment is awaiting moderation" and "This has been flagged." Essentially, by giving status reports and feedback to abusers, you are grading them on their work and giving them constructive criticism. If you obscure the extent to which their content is shared, they don't know whether their efforts are in vain, and they can't improve on their failing techniques if they don't know what is working and what isn't.
I would enjoy hearing about anyone else's knowledge of obscuring user content in real world applications, or any theoretical concerns or loopholes someone just hearing about it can come up with.
Currently:
Spammers can register and post for free (or sufficiently free due to low captcha cost)
You propose:
A way to squelch individual accounts. (Assuming erroneously that it has some cost to them)
The result:
Spammers will still continue registering new accounts, because in no way does it affect their cost.
A better solution: make them fund their account - PayPal with some trivial designated amount - $0.75, correlate it to the paypal address during signup. You've now added real cost and real verification. Hold the money for some time, then reverse it. The likely outcome is they'll start using stolen credit card numbers, or stop.
Slashdot's rate-of-post filter: Preventing you from posting too many great ideas at once.
http://www.stopforumspam.com/ http://akismet.com/ Depending on your forum software, someone will most likely have done the hard work and integrated these services to do what you need. I use these on vBulletin to moderate spam posts.
Seems like it would be easy enough to work around with a second bot that checks to make sure spam is getting through.
So you make the troll visible to all for a few seconds after the troll has posted, or always visible if someone tries to go to the site directly...
And the troll is visible for longer to anyone visiting the site from the same IP address.
But most spammers would not really bother with a verification pass. They have new places to spam.
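A sketch of that grace-period idea, to make the rule explicit. The function name and both time limits are placeholders I picked for the example ("a few seconds" becomes 30 here), and it only covers viewers other than the author, who always sees their own post.

```python
def visible_to_others(viewer_ip, author_ip, age_seconds,
                      grace_seconds=30, same_ip_grace_seconds=3600):
    """Whether a shadow-banned post is still shown to someone else.

    Everyone sees the post for a short grace period after it is made.
    Viewers on the author's IP address keep seeing it for longer, so a
    second checking account on the same machine is fooled as well.
    """
    limit = same_ip_grace_seconds if viewer_ip == author_ip else grace_seconds
    return age_seconds < limit


# Ten minutes after posting: hidden from strangers, visible from the same IP.
print(visible_to_others("203.0.113.9", "198.51.100.7", 600))
print(visible_to_others("198.51.100.7", "198.51.100.7", 600))
```

The obvious cost is that real users do see the spam during the short window.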
"There is more worth loving than we have strength to love." - Brian Jay Stanley
Let us filter the spammers at our end. If you guys do it, you'll get too many false positives. The whole process will become entirely political. Please, don't. And besides, the spammer can log in through a proxy, find out he's being censored, and just open another account through the proxy.
“He’s not deformed, he’s just drunk!”
Replace the forum's captcha with one of a higher grade, e.g. Recaptcha
Or eliminate it altogether, since it doesn't help and really pisses off users.
Requiring new users to be registered and await activation before being able to post.
Instead of this allow anyone to post right away, but do not allow the first few posts to be seen until they have been verified to be valid by a human. Delegate some of this verification to your most active users.
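A minimal sketch of that probation rule, assuming posts are plain dicts with "author" and "approved" keys and that the site keeps a count of approved posts per user. Both the data shapes and the three-post limit are made up for the example.

```python
def is_public(post, approved_counts, probation_posts=3):
    """Whether a post is shown to everyone.

    A post is public once a human reviewer has approved it, or when its
    author already has at least `probation_posts` approved posts.
    """
    if post["approved"]:
        return True
    return approved_counts.get(post["author"], 0) >= probation_posts


approved_counts = {"alice": 12, "newbie": 1}
print(is_public({"author": "alice", "approved": False}, approved_counts))
print(is_public({"author": "newbie", "approved": False}, approved_counts))
```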
"There is more worth loving than we have strength to love." - Brian Jay Stanley
new users are sequestered
Have one post on your site called "Do Not Comment," with clear instructions not to actually comment on it. Anyone who comments on the post is automatically on the "shadow spam" list.
i.e. make a simple Human Intelligence Test that a spambot is likely to fail.
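As a sketch, the trap is one extra check when a comment is saved. The thread ID and function name below are hypothetical.

```python
HONEYPOT_THREAD_ID = "do-not-comment"  # hypothetical ID of the trap post


def record_comment(user, thread_id, shadow_banned):
    """Shadow-ban `user` if they comment on the trap post.

    Returns True when this comment caused the user to be flagged.
    """
    if thread_id == HONEYPOT_THREAD_ID:
        shadow_banned.add(user)
        return True
    return False


shadow_banned = set()
record_comment("alice", "weekly-chat", shadow_banned)
record_comment("bot42", "do-not-comment", shadow_banned)
print(shadow_banned)
```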
Allow logged-in users to flag posts and allow high-reputation users' flags to count more than other users'.
If a post gets too many "flag points" in too short a period of time, it is hidden from non-logged-in users. Let logged-in users set their own "hide or collapse posts with more than X points" threshold.
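Roughly, the flag-point scheme could look like the sketch below. The weights, the one-hour window, and the default threshold are arbitrary numbers chosen for illustration, and the function names are mine.

```python
def flag_weight(reputation):
    """Hypothetical weighting: a high-reputation user's flag counts triple."""
    return 3 if reputation >= 1000 else 1


def flag_points(flags, now, window_seconds=3600):
    """Sum the weights of flags raised within the last `window_seconds`.

    `flags` is a list of (timestamp, flagger_reputation) pairs.
    """
    return sum(
        flag_weight(reputation)
        for timestamp, reputation in flags
        if now - timestamp <= window_seconds
    )


def is_hidden(flags, now, viewer_threshold=None, anonymous_threshold=4):
    """Hide a post with more than X points.

    Visitors who are not logged in get the site-wide threshold; logged-in
    users pass their own as `viewer_threshold`.
    """
    threshold = anonymous_threshold if viewer_threshold is None else viewer_threshold
    return flag_points(flags, now) > threshold


flags = [(100, 1500), (200, 10), (300, 10)]  # weights 3 + 1 + 1
print(is_hidden(flags, now=400))                       # anonymous visitor
print(is_hidden(flags, now=400, viewer_threshold=10))  # tolerant logged-in user
```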
To discourage spam you want search engines to not see it. Consider marking public/no-log-in-required pages that have new posts on them as "noindex, nofollow" for the first few hours or days.
The original suggestion you offer has merit except it's too easy for a spammer to defeat. In addition to wanting to hide reported spam from non-logged-in users and from logged-in users who don't want to see it, you want a solution that tells search engines "this message is new, don't index it" and a method to make sure new posts are reviewed for spamminess before the searchbot timer expires.
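The search engine part can be done with the standard robots meta tag. A sketch, assuming the page template knows the age of the newest post on the page; the function name and the 24-hour quarantine are placeholders.

```python
def robots_meta(newest_post_age_seconds, quarantine_seconds=24 * 3600):
    """Return the robots meta tag for a page.

    Pages with a post newer than the quarantine period ask search engines
    not to index them or follow their links.
    """
    if newest_post_age_seconds < quarantine_seconds:
        return '<meta name="robots" content="noindex, nofollow">'
    return '<meta name="robots" content="index, follow">'


print(robots_meta(600))       # a post from ten minutes ago
print(robots_meta(3 * 86400)) # newest post is three days old
```

Crawlers only honor this voluntarily, so it keeps spam out of well-behaved search indexes but does nothing against a spammer's own checking bot.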
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Do like the supermarkets do. Just rearrange everything on the sign up page every couple of weeks or so
“He’s not deformed, he’s just drunk!”
I've had a sandbox for more than a decade. It's called Yahoo.
C|N>K
...but I think that Microsoft has really got it right with this new product.
I tried it and the interface is clean, more responsive than the competition. Nice to see some high-quality, reasonably-priced software coming out of Redmond!
Until they figure it out after 1 day and create another account anyway. Or maybe they create and revolve 40 accounts and DOS you, in spite, using 15 different IPs. This would fake out a 15 year old in 1995 using their Visual Basic AOL program, but not a company being paid to spam. As someone who has developed scrape spiders and anti-spam code (for highly spammed websites), I can tell you that you are going to need to think a little deeper. Any spam bot software worth its weight in obnoxious comments is going to look for every possible way to fool you.
As an analogy, normal banning is like an SMTP server rejecting spam with a 5xx failure code, while your scheme would have the server accept the spam with a 2xx code but throw the message in /dev/null
Each method has the usual pros and cons: Pretending to accept mail reduces (but does not completely eliminate) feedback to the spammer as to whether or not the message made it through. However, it plays hell with legitimate users; false-positives become much more problematic if there's not feedback.
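To spell the analogy out, here is a toy decision function. It is not a mail server, just the three outcomes; the function name and the reply text after the status codes are made up.

```python
def smtp_reply(is_spam, silent_discard):
    """Return (reply_line, delivered) for an incoming message.

    With `silent_discard` off, spam is refused with a 5xx code and the
    sender learns it failed. With it on, spam gets the same 2xx reply as
    real mail and is then thrown away.
    """
    if not is_spam:
        return "250 OK", True
    if silent_discard:
        return "250 OK", False  # accepted on the wire, dropped afterwards
    return "550 Message rejected", False


print(smtp_reply(is_spam=True, silent_discard=False))
print(smtp_reply(is_spam=True, silent_discard=True))
```

The second case is the one that hurts on a false positive: a legitimate sender sees "250 OK" and has no way to know the message never arrived.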
Roll your own, or use Akismet...
The really important thing is to make sure Google (and the other search engines and ad services, if you care about them) can't see the spam. That's the real objective of the spammers, and those that bother checking may find that spamming you is less effective in fixing their page ranks.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
A good use for stupidfilter http://stupidfilter.org/ perhaps?
Let's bury them really deep too.
When I was a forum mod for a large forum some years ago, we had a lot of troll problems and the same guys would keep showing up as sock puppets. A lot of the time it took a while to suss out if someone was for real or one of the persistent trolls.
So I did come up with an idea to mirror the forum, with idiots and highly suspected idiots able to post all they wanted on the fake mirror, with the non crappy people on the real forum. So what it looked like was that everyone had the trolls on their ignore list, or that they weren't very interesting because no matter what they said, nobody answered them. But they did have some fun conversations with each other that nobody ever saw.
After a while, the persistent ones got tired of the effort since they were getting no return, and they went away.
So yeah, your idea should work unless the spammer notices that the bot isn't actually working properly.
Disable registration and force everyone to email you details of what they want an account for and what their contact details are, along with their desired username and password. Spam will plummet considerably. I agree this isn't viable for big forum sites, but for small forums it should work if you're willing to put up with registering accounts for people, not to be confused with account moderation.
I don't like it. Depends on your forum, but for normal people it's much worse than being banned. So how about a staged solution: As others pointed out, it is necessary to show the filtered posts to users on the same IP address as the spammer, otherwise all spammers will create two accounts and verify that their messages come through. The solution is that *the first time someone spams, they get a proper ban*. Ideally a timed ban, such that normal users who are not spammers can wait for e.g. 1 week and then get back. The spammer can create a new account, and it can be filtered. This may be too complicated, but I find the proposal quite dishonest, in case someone is banned for having an unpopular opinion, etc. Moderators are not always fair. [[ The last time this came up on slashdot I wrote something similar about it being deceptive, then I wrote "How can I even be sure that this message is visible to others?" And someone replied, thanks :D ]]
I think I love you .... or at least I love your response
The brother of this idea is a browser adblocker that actually loads the ads, but does not show them to you. Might need a change or two to your browser to make it know what to just invisibelize, but that should be doable?
(You still want to block tracker gifs and similar, but that already works, you just need two kinds of iffy address links.)
Because their algorithm misfired and put me under a shadowban for a while. It's hard to detect, but after a while I really felt as if I was talking into a void. So I loaded up a proxy server and connected to FARK through it, and sure enough, my posts weren't visible.
I gave the admins a profanity-laden piece of my mind and they apologized; it seems their spam detector was a bit over-eager. I still go there from time to time.
I discovered CNN doing exactly as described about 6 months ago. I tend to spot it quickly because I clear all of my browser cookies between every site change.
Yes, using facebook as a login for a 3rd party website IS evil.
I recently got banned for my first post to a technical forum of a VPN provider service I am using. Not sure what went wrong but I was able to get the problem fixed pretty quickly. With this proposal I would never have known there was a problem that needed fixing.
Nice thought, a spammer checking out the spam on the site. Won't work. They don't check out sites they spam on. It's an automated process. I've seen sites with a kazillion times the same spam.
Nobody wants to read spam. Not even spammers.
Are we still going to wait 90 seconds for the protocol to be sure that whoever (if anyone) is at the other end isn't responding?
These are the delays that propagate themselves onto a user's desktop to leave them hanging for minutes after mistyping a server address or something similar.
I worked at an anti-spam company for a few years. That was one of the things we did. We would send a 250 Ok to a message regardless of whether it was accepted or not. If it wasn't accepted the customer had the option of putting it into a quarantine or just not writing it anywhere. I think we also always told suspect bad senders (essentially anyone we hadn't seen before or anyone with a non-perfect score in our reputation and various blocklists) that a recipient exists. If things were suspect we'd throttle their connection way down to reduce load on the customers' systems and make the bot really inefficient/prove them to be a bot because they don't obey SMTP standards for request timeouts (profit is proportional to emails/hr so generally spammers cut corners in terms of always assuming messages are accepted, not bothering to send the QUIT command etc). I imagine some similar stuff could be used for forums.
Getting tired of dealing with spam on my medium sized forum, I investigated some solutions over a period of months. Here is what I've found to often be employed (off the top of my head), and what I consider their issues:
* CAPTCHA replacement: at best, this will only stop fully automated registrations. If operating ideally, it won't stop manual registrations or automated registrations where the CAPTCHA is solved by humans, and note that this is often only employed during registration, not during the posting of spam. Strong CAPTCHAs like reCaptcha are considerably difficult for ordinary humans to solve unfortunately, so you trade off new user frustration with spam protection here. I've found that using a weaker, but barely known CAPTCHA can be quite effective, as it's unlikely that a decoder is written for this case, and relatively easy to solve by regular users.
* Security questions: if implemented well, may block automated registrations and possibly manual registrations. It's key to come up with a good question though that is difficult enough to stifle bots (something like 'what is 2+5' can be automatically solved), but easy enough for your average (or probably below-average) target user to solve. Another potential issue is that unless you're frequently changing questions, it's possible for answers to be databased, which doesn't seem too unreasonable considering that sweatshops are being used for solving CAPTCHAs.
* Customising registration page: an example may be renaming some input fields to try to fool bots. Or perhaps sticking in some complex Javascript (assuming most of your target users have JS enabled) that the browser must evaluate. Some smarter bots may get through, depending on how much you've changed the page, but it's a simple solution that can be quite effective. Will do nothing to stop manual spamming, or where someone decided to tune their bot for your site.
* Fingerprinting: this is where common bot patterns are identified (eg not sending an Accept HTTP header, when most browsers will, or timing how long it takes to fill in the registration page). Only effective against automated spambots that aren't that smart, or until they include measures to fool the fingerprinting.
* IP address banning: I found this to not be as effective as many may think; spam bots seem to be able to use proxies, rendering the blocks somewhat pointless. Even worse, a lot of spam seems to come from Asia, primarily India and China, which, I imagine due to IPv4 exhaustion, means a lot of possible IP addresses. I've had instances where banning a spammer's IP also would block some legitimate users at times (of course they were able to report it when their IP changed).
The other problem is that this is only an 'after-the-fact' solution. Using a database such as StopForumSpam gets around this, but I find that legit users can end up with banned IPs from SFS (and have had this occur).
* Email address banning: similar in concept to IP address banning, but without false-positives that IP address banning can have. Email addresses can be 'easily' generated, but susceptible to banning of the domain. On the other hand, I find many spambots come in with Hotmail/GMail addresses, so using a database lookup on these can be effective. I'm unsure how effective GMail address aliases are handled though.
* User name banning: I wouldn't consider this effective at all, as they're easy to change and chances of false-positives are relatively high.
* Spam databases: somewhat referred to above. This is where a community submits properties of spammers, such as their IP address and email. There's a bit of a funny feeling with allowing random users essentially be the gatekeeper to your registration process though... Have experienced a fair number of false-positives from StopForumSpam, and I would imagine from any service really.
* Akismet: here's a black-box solution which says a post is either spam or not. Being black-box, we have no idea how it really works. Tends to be very unr
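For the fingerprinting item in the list above, a sketch of the two checks it mentions: a missing Accept header and a registration form submitted implausibly fast. The function name and the three-second cutoff are assumptions for the example, and a real check would probably combine more signals.

```python
def looks_automated(headers, seconds_to_submit, min_seconds=3.0):
    """Heuristic bot check for a registration request.

    `headers` maps HTTP header names to values. A request is suspicious
    when it carries no Accept header at all, or when the form came back
    faster than a person could plausibly fill it in.
    """
    has_accept = any(name.lower() == "accept" for name in headers)
    return (not has_accept) or seconds_to_submit < min_seconds


print(looks_automated({"User-Agent": "curl/8.0"}, 40.0))
print(looks_automated({"Accept": "text/html", "User-Agent": "Firefox"}, 25.0))
```

As the list item says, this only catches bots that are not very smart, since both signals are trivial to fake once the spammer knows about them.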
Why not just submit the spam to www.stopforumspam.com so that you appear on their submitters list, which a lot of spammers then use as a blacklist of sites not to spam?