Comment Spams Straining Servers Running MT
dJ phuturecybersonique writes "Netcraft reports that 'Comment spam attacks on Movable Type weblogs are straining servers at web hosting companies, leading some providers to disable comments on the popular blogging tool. The issues are caused by bugs in MT, forcing publisher Six Apart to recommend configuration changes while it prepares fixes.' More..."
It's a good thing Slashdot doesn't have this problem.
I'm going to start a comment spam deletion/marking service. I'll charge bloggers 1/10th of a cent per comment checked (1000 comments for a dollar), and hire people in some foreign country, like India or China, paying them 1/20th of a cent per comment read. For every proven mistake they make, I will fine them 10 cents, and credit 5 cents to the blogger. Sound workable?
So...Netcraft confirms it, blogging is dead?
Famous Last Words: "hmm...wikipedia says it's edible"
Why don't bloggers just disable HTML in comment posts? The spammers are looking for Google PR, aren't they?
But DoS attacks as well. Running several political blogs I often get "freeped"
:)
The best solution for me:
1. User email address verification
2. server generated images to verify real user for registration
3. Regular cookie expiration after x amount of time
4. host filtering (referrer filtering usually gets rid of "freepers" unless they open a new window)
However - nothing beats good moderators, quality users and sticking to your niche. Don't go pissing people off tossing your blog around the world yourself and not expect to get anything in return.
It's a jungle out there
This has been going on for quite a while now, and still no official fixes from Six Apart?
Shame on them.
First and foremost, it's free (speech and beer) and distributed under the GPL.
Second, the actual developers of the software actually participate in the support forums, so if you do have a question, it's likely to be answered very fast by someone intimately familiar with the software.
Third, it's a lot less susceptible to comment spam, especially after applying a few plugins and hacks. I've never received a single one, and that's not for lack of spammers trying.
Fourth, it's very easy to customize the look and feel of the site without knowing any PHP. HTML and CSS is about all you need to know. Knowing PHP helps a lot if you want to really customize it, but it isn't a requirement.
Finally, they've already included a Movable Type import utility, so those of you who are sick of MT for this and many other reasons can move over with little hassle.
Signed,
A very happy WordPress user and occasional contributor.
How am I supposed to fit a pithy, relevant quote into 120 characters?
I had to ditch Movable Type explicitly due to comment spam. The real problem with it was that there was no way to delete more than one at a time. The web app only displays the last five comments and then you have to go digging through every article to find the other spams. Real pain in the ass. I switched to WordPress, which is also besieged by comment spam from Online Poker outfits. In WordPress, however, you can mass-edit with all comments listed with checkboxes to delete whichever are spams.
In Movable Type and WordPress, you can pretty much eliminate the script-driven spambots by renaming the comment CGI handler and then editing all other files that reference it. I didn't think of this till after I switched to WordPress, though.
$5 / month hosted VPS on linux = awesome!
Just disable URLs in comments, and in user information.
Disabling comments is just silly.
If your case is like mine, where mt is stored in a directory just off of your public web site, do this: use a .htaccess to put a password on your whole MT directory. They can't access comments.cgi (assuming it's just a bot doing the spamming), they can't post comments. I don't really like the idea of people touching my CGIs anyway. Make sure your robots.txt excludes the MT directory as well.
That is, assuming you don't give a damn about people's comments.
How long until we have content/poster filtering for blogs like we have for e-mail? If someone got coding right now, they might make a pretty penny off of this...
You are all pretentious twats
Every last one of you. You're all latte-sipping, iMac-using, suburban-living tertiary-industry-working WASPs who offer absolutely no new insights on anything whatsoever apart from maybe one specialist field if we're lucky.
Quite an enjoyable rant.
xox,
Dead Nancy
Besides WP, Nucleus is also a good blogging tool: easy to use, and it's secure. I use this and WP; both are nice. Also, I was getting a lot of comment spam using WP, but I turned off letting other sites know when I update and the online casino spam stopped.
We had a similar problem on our ziffdavis.com blogs (like my security blog) and we think we have solved it with one of those graphic field challenges to the user (enter the value in the nearby graphic).
How about something like the Distributed Checksum Clearinghouses for comments? Comments shouldn't generally be exact duplicates, and DCC is good at catching email duplicates, which are often spam. It uses some fuzziness factors, so some alterations will still be caught.
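To make the duplicate-counting idea concrete, here is a minimal sketch in Python. It is not DCC's actual checksum algorithm: the normalization rules, the `Clearinghouse` class and the threshold of 3 are all assumptions made up for illustration. The idea is only that near-identical comments hash to the same value, and a shared counter flags a value once it has been seen too often.

```python
import hashlib
import re
from collections import Counter


def fuzzy_checksum(comment: str) -> str:
    """Hash a normalized form of the comment so small alterations still collide."""
    text = comment.lower()
    text = re.sub(r"\d+", "", text)        # drop digits (random IDs, counters)
    text = re.sub(r"[^a-z\s]", " ", text)  # drop punctuation and markup characters
    text = " ".join(text.split())          # collapse runs of whitespace
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class Clearinghouse:
    """Hypothetical stand-in for a shared server that counts reported checksums."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.counts = Counter()

    def report(self, comment: str) -> bool:
        """Record the comment; return True once it looks like bulk posting."""
        digest = fuzzy_checksum(comment)
        self.counts[digest] += 1
        return self.counts[digest] >= self.threshold
```

A real deployment would share the counts between many blogs, which is where the "clearinghouse" part comes in; a single blog on its own rarely sees enough copies to be sure.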
To submit a comment on a blog, you must type in a series of letters and numbers from a non-machine-readable image (like when you forget your password here on Slashdot). This will at least prevent automated blog spam. ...I don't know why this solution isn't deployed already.
Movable Type is DYING.
Call me untrendy, but I still like dotcomments.
Da Blog
They hired Jay Allen, creator of MovableType blacklist, as project manager, but MT BL is not part of the standard distribution. It's not a standard feature, nor is there anything designed in house that provides the same functionality if God-forbid Jay Allen won't let them bundle it as a standard feature. The worst part is that it is having major problems working with MT 3.121, the latest release.
Personally I think MT needs to just scrap the entire comment system and start over again. They need to implement an MT BL-like system comprehensively, they need to ban IPs tied to spam bots, and they need to collect the information about the spammers so that MT users can try legal challenges.
Spam bots should be not only a civil offense, but a crime to use. The way that they are used against blogs is basically on par with defacing a website and often the stuff they push is illegal for minors to view. This is why we need something like the Child Online Protection Act. With something like that we could get spammers on criminal offenses for using spam bots indiscriminately.
Click here or a puppy gets stomped!
I've been using MT for 2 years now, and the comment spam is actually making a significant bump in the traffic to my server (I doubt anyone else actually reads my stuff...). I had looked at WordPress a while back and didn't think it was quite "on par" with Movable Type, but MT has done its best to alienate even me.
I share my MT installation with my brother. Not surprisingly, we like having our own weblogs. MT now charges for something that simple.
The fact that Wordpress is released under the GPL and is actively developed gives me some further impetus to make the switch.
Thanks for the links - should be useful as I change over from MT over Christmas break.
I am entirely unfamiliar with the issue of spam as it pertains to blogs. Are spammers placing ads (as in, posting their URLs) to random people's blogs? Or is the problem that they are just polluting the comment list with random garbage?
If the issue is posting of URLs, then it should be a simple matter of the blog site checking any URLs against SURBL, a spam URL blocklist.
What am I missing here? When did this become such a huge issue?
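For what checking comment URLs against SURBL could look like, here is a rough Python sketch. It assumes the usual DNS blocklist convention (look up the name under the list's zone; an answer in 127.0.0.0/8 means "listed", a failed lookup means "not listed") and the `multi.surbl.org` zone. The function names are made up, and the sketch queries the full hostname rather than reducing it to the registered domain, which a real client should do.

```python
import re
import socket
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def extract_hosts(comment: str) -> set:
    """Return the lowercase hostnames of every http(s) URL in the comment."""
    hosts = set()
    for url in URL_RE.findall(comment):
        host = urlparse(url).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def is_listed(host: str, zone: str = "multi.surbl.org",
              resolve=socket.gethostbyname) -> bool:
    """Ask the blocklist zone about one host; 127.x.x.x answers mean 'listed'."""
    try:
        answer = resolve(f"{host}.{zone}")
    except OSError:  # lookup failures (e.g. NXDOMAIN) are treated as "not listed"
        return False
    return answer.startswith("127.")


def comment_has_listed_url(comment: str, resolve=socket.gethostbyname) -> bool:
    """True if any URL in the comment points at a blocklisted host."""
    return any(is_listed(h, resolve=resolve) for h in extract_hosts(comment))
```

The `resolve` parameter exists only so the lookup can be swapped out for testing or caching. Note that each check costs a DNS round trip, so this filters the content of spam but does nothing about the server load of receiving it.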
Bla bla bla bugs yada yada proprietary yatta yatta use open source!
There, HAND.
Please correct me if I got my facts wrong.
I work for a web host and we've had this issue. 744 on mt-comments.cgi. Sorry guys.
I myself run an MT blog and have been contemplating moving to wordpress to dodge the spam bullet, however temporarily.
It occurred to me, though, that what would really fix this is to push the load onto the spammers by building a Reusable Proofs of Work (RPOW) system.
For those who are unfamiliar, RPOW is a proposal to stop mail spam by asking the sender to do a little "work" that would make sending a lot of emails computationally too expensive.
As I'm in the last throes of my PhD I'll have to delay on this one, but maybe the lazy web can help out on this one, so the same thing doesn't happen to WordPress or whatever blogging monocultures exist.
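As a rough illustration of the "make the sender do a little work" part, here is a hashcash-style sketch in Python. It covers only the plain proof of work, not the reusable-token machinery that RPOW adds on top, and the challenge string format and difficulty value are assumptions. The commenter's browser or client would run `solve`; the blog only runs `verify`, which costs a single hash.

```python
import hashlib
import itertools


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Cheap check done by the blog server: one hash per submitted comment."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode("utf-8")).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Expensive search done by the commenter: about 2**difficulty hashes on average."""
    for nonce in itertools.count():
        if verify(challenge, nonce, difficulty):
            return nonce
```

For example, `solve("entry-42:comment-form", 12)` returns a nonce that `verify` accepts for the same challenge and difficulty. The server still has to start a process to run the check, so this raises the spammer's cost per comment without fixing the CGI startup cost discussed elsewhere in the thread.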
The blog authors doubtless believe that the whole world is beating a path to their little diary but the fact is they're talking only to themselves.
Nobody cares what some zit-faced teenaged virgin thinks about anything, and nobody is going to waste their time reading those thoughts on some angst-ridden, semi-literate webpage.
Hell, they don't have any worthwhile experiences to share, and precious-little -- if any -- knowledge about anything not pertaining to pr0n sites.
This is not a tragedy in any way.
The link above was funny as hell and explained the MT load issue in far more plain language than the original article! Somebody waste some points and get that back up out of the negatives . . .
It might help, but I would rather have Google be searching the comments as well as the main post! Even if comment spam is a problem, you don't want to lose all the other comments that might have value.
Perhaps Google could recognize a Movable Type site and just ignore comments from them.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
I tried renaming the comments script and it worked for a while, but spammers are smart enough to work around that. Lately I had been getting spam even a few minutes after renaming the script.
I installed mt-blacklist, which pretty much solved the problem for me. It allows you to search by regular expression and massively de-spam and blacklist the URLs they point to. All subsequent comments containing those URLs or other known spam expressions get trashed automatically.
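The regular-expression approach described above can be sketched roughly like this in Python. This is not MT-Blacklist's code or data format; the `Blacklist` class, the `despam` helper and the example patterns are hypothetical and only show the shape of the technique: a growing list of patterns, a check on new comments, and a bulk pass over stored ones.

```python
import re


class Blacklist:
    """A list of case-insensitive regular expressions for known spam strings."""

    def __init__(self, patterns):
        self.patterns = [re.compile(p, re.IGNORECASE) for p in patterns]

    def add(self, pattern: str) -> None:
        """Blacklist one more expression, e.g. a URL found in a spam comment."""
        self.patterns.append(re.compile(pattern, re.IGNORECASE))

    def matches(self, comment: str) -> bool:
        """True if any blacklisted expression occurs in the comment."""
        return any(p.search(comment) for p in self.patterns)


def despam(comments, blacklist):
    """Split stored comments into (kept, trashed) lists in one pass."""
    kept, trashed = [], []
    for comment in comments:
        (trashed if blacklist.matches(comment) else kept).append(comment)
    return kept, trashed
```

Calling `blacklist.matches(new_comment)` before saving gives the "trashed automatically" behaviour, and `despam` gives the mass clean-up over comments that already got through.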
See charts for twitter trends on Trendistic
I disabled html in comment posts a long time ago. Spammers don't care, their spambots keep spamming blindly. Statistically, they will find lots of sites that allow html.
While it started from FreeRepublic users, the verb "to freep" now can refer to hordes of people from any political blog, whether right- or left-leaning. The two most common sources of freepers are FreeRepublic itself (right-wing) and DailyKos (left-wing).
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
"Blog" software predates the existence of a separate category of "blog software", and most of the older stuff works better. SlashCode, I hear, has been known to run several high-traffic sites. There is also Scoop, which was developed for kuro5hin.org, and used at a few other places (like dailykos.org). Both are also much more full-featured than your average "blog software", especially in that they include threaded comments.
I've used WordPress ever since it branched off from b2. Unfortunately, its success has made it a good target for comment spam. The available plugins, such as Farook's WPBlacklist, work really well. However, the amount of incoming spam attempts is sort of like a DDoS attack on us little guys who have servers running on their home cable lines. It's just disappointing that we have to put up with this.
Do they support multiple blogs with a single installation yet? That was the big reason I didn't move to Wordpress a while back...
Celebrate the finer things in life
The solution is to implement authentication images, much like PayPal or the like use when you register. It generates some odd-looking image with a few characters and digits in it, and you as the user have to type it in.
There is a system like this for WordPress called wp-authimage that works quite well. You do have to know a bit of PHP and it requires GD on your web server, but neither of those things are super-difficult. I used it on a blog I run with some friends and it works quite well. Our comment spam went from 100+ per day with MT to 0 with WordPress and this system.
Netcraft confirms it; Movable Type is dying!
Sorry, had to plug that one. I run Drupal for my CMS, and lately I've been getting some 'free poker' spams in my comments. I've installed the Spam module and am holding my breath. Do modules like that work in MT?
Time for me to go check my friends' MT sites...
CB
free ipod and free gmail!
Here's a patch to prevent comment spam for those of you left out in the cold when Movable Type abandoned MT 2.6.
Ian Macdonald, Linux sysadmin & Ruby hacker
When I was a lad we had the crazy stuff called newsgroups.
You could post to them, they were threaded, they had an RFC protocol called NNTP and all sorts of programs understood them. Some of them were even moderated.
I wonder what happened to them?
There are places where the networks are not touching, and there are places where they are - Boeing's Lori Gunter
I remember hearing the horror stories about WP users doing a fresh install and right after getting flooded with all sorts of comment spam. But I've been using it for quite a while and I've never got one bit of spam in my comments. I am running various spam plugins and I assume they are working like a charm.
Also I'm not sure if this has anything to do with it, but my site is hosted on a blog friendly host. BlogOmania supports all types of CMS, and they have a very friendly and reliable support staff.
--- hows it taste mother f$#@er!!!
Here's the deal. Everyone rolls their own solution like rscrawford has. Some people embed their own hidden fields, which is a great idea. Some people code javascript on the client that forces a pause of 20 seconds before the value of a hidden field is embedded.
Obfuscation can really make the work of the spambot writers more expensive than it's worth. Then they'll move elsewhere.
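Two of the tricks mentioned above, a hidden trap field and a forced delay, can be sketched on the server side like this in Python. This is a variation, not the JavaScript approach described: instead of script filling a field after 20 seconds, the form carries a signed timestamp and the server rejects anything posted too soon. The field names (`website_url`, `token`), the secret and the 20-second default are all assumptions.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # hypothetical per-site secret, never sent to the client


def issue_token(now=None) -> str:
    """Value for a hidden form field: the render time plus a signature over it."""
    ts = str(int(time.time() if now is None else now))
    sig = hmac.new(SECRET, ts.encode(), hashlib.sha256).hexdigest()
    return f"{ts}:{sig}"


def looks_human(form: dict, min_delay: int = 20, now=None) -> bool:
    """Reject posts that fill the trap field, forge the token, or arrive too fast."""
    # Trap field: hidden from people with CSS, but bots tend to fill in every field.
    if form.get("website_url"):
        return False
    ts, _, sig = form.get("token", "").partition(":")
    expected = hmac.new(SECRET, ts.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    current = time.time() if now is None else now
    return current - int(ts) >= min_delay
```

The page template would put `issue_token()` into the hidden `token` field when the entry is rendered. As the comment says, this is obfuscation: it only has to cost the bot writers more than moving on to an easier target.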
I run WordPress and used to get hit by many casino/cialis spams. I found that I get no comment spam after using a WP hack (http://www.gudlyf.com/index.php?p=376) called AuthImage, which is a CAPTCHA (a basic Turing test based on character recognition). I strongly recommend it, and would be grateful to any OSS vigilante who could port it to a proper WP plug-in.
Seems like poetic justice to me, seeing as how the vast majority of weblogs are mere "WWW spam" in the first place. And they seriously fuck up google results.
http://shit.slashdot.org/article.pl?sid=04/12/18/1827225
I'm kind of excited, kind of disappointed. I run a blog with ten different posters running MT. We've been getting slammed with comment spam lately. I just assumed it was in relation to Google starting to move my site up a bit in the ranks. Apparently not. :(
At first, most of the spam was from obviously-fictitious domains. I earned myself weeks of absolute lack of spam by throwing this into /lib/MT/App/Comments.pm -- I started mine a few lines after line 150 in my case:
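    # If an e-mail address is given... make
    # its domain resolve to an IP
    # Added by suwain_2
    require MT::Blog;
    my $blog = MT::Blog->load($entry->blog_id);
    if (my $email = $q->param('email')) {
        # Take whatever follows the last '@' as the domain part;
        # an address with no '@' at all leaves $domain undefined.
        my ($domain) = $email =~ /@([^@\s]+)\s*$/;
        # Reject the comment if there is no domain or it does not resolve
        unless (defined $domain && gethostbyname($domain)) {
            return $app->handle_error($app->translate("You pathetic loser. Your e-mail address doesn't resolve to a domain."));
        }
    }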
I don't track how many people are being turned away by this; I still find myself cleaning up spam on a regular basis, but at least at first, I completely stopped spam. I now get a fair deal with 'real' domains that I just clean up by hand.
I also whipped up a little PHP utility that shows the 50 most recent comments; clicking on a field will show all results that match that field. I can easily find people posting under a particular address, or from a certain IP, and delete them. It's pretty crappy, but if people are interested, I'll post it? (Another script I have goes through and auto-rebuilds all the blogs.)
Hope this helps someone?
________________________________________________
suwain_2
Just takes a few assholes to ruin a public resource. They're like the people who steal and/or vandalize phonebooks in the public phone booths.
Bring punch to the party, and somebody will want to piss in it.
They say the first thing to go is your penis. Well, it's either that or your brain. I forget which...
If you're doing shared hosting and you allow your users to run CGIs-- regardless of what CGI it is-- you should have reasonable limits in place that keep child processes in check. Apache has had such directives for doing this for some time, one of them being RLimitNPROC. This directive allows you to limit the number of subprocesses that Apache will run concurrently.
You can even specify subprocess limits on a per-virtual host basis. With Apache 2, you can even limit based on directory. Using RLimitMEM is also a good idea.
Yes, MT's comment system can use some improvement. We're working on that. But these servers are getting hammered; in effect a denial-of-service style attack.
Even a "Hello, world" type script can be hit hard enough to bring down a server, assuming there are no process limits in place. Invoking a modern interpreter to execute a CGI script is no small feat. Perl, Python, Ruby, and even PHP (when run as a CGI as many shared hosting companies do for security reasons) consume enormous amounts of resources at startup regardless of the size or complexity of the script they are summoned to execute.
So, sure, code can be added to MT to recognize and adapt to a flood of comments coming in, but by the time the CGI runs, it's already chewing up CPU and memory. In my opinion, a better defense for these flood-style attacks is for Apache itself (or third-party in-memory Apache modules) to handle such situations.
mod_security, mod_dosevasive and others are excellent defensive tools for any public Apache server admin to use.
I'd love to know what others have done to configure Apache to prevent denial-of-service attacks.
Thanks very much for making the net harder to use for blind people. As a blind net user I congratulate you on building up yet more barriers to the fluid and accessible use of the net.
At least some of those sites that use auth images are now also using sound samples, which is somewhat better (if not perfect).
As has been discussed in other threads, the use of auth images poses a huge roadblock to visually impaired/blind users. Some other technique that requires a user to interact should be developed... something like a random word/math problem. The answer could still be displayed on the page as an image, for the cognitively impaired... but it doesn't rely on just being able to see the graphic (of course that leaves out those who are both blind and cognitively impaired).
There must be some better method of determining that there is a real person submitting the form, that doesn't penalize those who may have sensory impairments.
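A text-only word/math challenge of the kind suggested above might look roughly like this in Python. Everything here is an assumption made for illustration (the question wording, the secret, the signed-answer token), and as written it is weak: there are only 19 possible answers and the token can be replayed, so a real version would at least tie the token to a per-form nonce and an expiry. It does show that the question can be plain text that a screen reader handles.

```python
import hashlib
import hmac
import random

SECRET = b"change-me"  # hypothetical per-site secret

NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]


def sign(answer: int) -> str:
    """Signature over the expected answer, safe to embed in a hidden field."""
    return hmac.new(SECRET, str(answer).encode(), hashlib.sha256).hexdigest()


def make_challenge(rng=random):
    """Return (question text, token) for a random single-digit addition."""
    a, b = rng.randint(0, 9), rng.randint(0, 9)
    question = f"What is {NUMBER_WORDS[a]} plus {NUMBER_WORDS[b]}?"
    return question, sign(a + b)


def check_answer(reply: str, token: str) -> bool:
    """True if the typed reply is a number whose signature matches the token."""
    try:
        value = int(reply.strip())
    except ValueError:
        return False
    return hmac.compare_digest(sign(value).encode(), token.encode())
```

The form would show the question as ordinary text and carry the token in a hidden field; `check_answer` runs when the comment is posted.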
Hey, those people can still read your blog. They just can't post comments to it. In the context of all the other shit they're prevented from doing because of blindness, it's not such a big deal.
The problems are somewhat bigger than they mention. MT performs some very heavy database activity to even get to the point of finding that comments have been disabled completely. Even without triggering the page rebuilds, several hundred requests coming in will grind the server to a halt. The problem is compounded if you're running a flat database backend like sqlite, which does huge memory allocations and can launch you into a swapfest.
Given that instances of mt-comments.cgi are expensive even when they net no change to your database or your blog pages, server load is unbearable when there are a large number of concurrent instances. Now, there is a problem with either apache or mt-comments.cgi that makes mod_throttle's per-IP connection limiter fail. The current popular comment vandal's script opens a connection, sends a GET request with the post instructions as CGI arguments, then closes immediately. mt-comments.cgi continues running even though the connection has been dropped, and it doesn't count against concurrent connections from that IP. I don't know if mt-comments is ignoring a sigpipe from apache, or if apache is failing to send it. Either way, the cgi keeps running even though it's not being counted anymore.
My solution was to add code to the head of each instance of mt-comments.cgi. It sleeps for a second, then checks for an unreasonable number of mt-comments.cgi running. If too many instances exist, it dies without getting to the expensive database access. Until Six Apart release a new MT, this may be helpful to you. Add this to the "eval" section:
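    sleep 1;
    # Count running mt-comments processes; the [m] keeps grep from matching itself
    my $numrun = `ps ax | grep -c '[m]t-comments'`;
    chomp $numrun;
    if ($numrun > 3) {
        # Bail out before the expensive database access
        die "Too many mt-comments running";
    }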
Caveat: I'm not a perl programmer. Somebody else can write this more elegantly.
This is what they said to us about all of the problems MT causes our servers:
"We have MT running at a number of hosting companies with a variety of configurations without an issue."
Sure but what is this?
site1
Top Process %CPU 99.9 /usr/bin/perl -w mt.cgi
Top Process %CPU 12.0 [analog]
site2
Top Process %CPU 99.9 /usr/bin/perl -w mt.cgi
Top Process %CPU 99.8 /usr/bin/perl -w mt.cgi
Long live PHP-only blogs.
Don't forget that sourceforge, owned by the same company as Slashdot, hosted floodmt for a time. Way to go guys!
(And yes, I'm one of the bloggers mentioned on the floodmt pages)
Jesus was all right but his disciples were thick and ordinary. -John Lennon