Splogs Clog Blog Services
SuperWebTech writes "A new generation of spam has emerged lately in the form of automatically-created spam blogs, or "splogs." One wily programmer manipulated Blogger's API to create a "spamalanche" of thousands of blogs whose sole purpose was to increase their real sites' pagerank. This clogged search engine results while filling RSS feed services with useless listings. Though Google, Blogger's owner, is doing its best to fix the problem, in the meantime several services have stopped listing any site they host. So far nobody has found a solution."
Anyone else notice that every username in the video is [letters]-[numbers].blogspot.com?
Maybe start by disabling new blogs.
Flag all usernames that meet that basic regex criteria.
Hand filter that bunch.
Add the same captcha you have on your comment system to the posting system.
Re-enable registration.
Seems kind of elementary, doesn't it? Why not try it?
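For what it's worth, the "flag by regex, then hand filter" step could be sketched roughly like this. The pattern is only a guess at the [letters]-[numbers] shape described above, and the function name and sample hostnames are made up for illustration; this is not anything Blogger actually runs.

```python
import re

# Assumed pattern: letters, a hyphen, digits, then ".blogspot.com".
# This is a guess at the "[letters]-[numbers]" shape, not a known spam signature.
SUSPECT = re.compile(r"^[a-z]+-[0-9]+\.blogspot\.com$", re.IGNORECASE)


def flag_for_review(hostnames):
    """Return the hostnames matching the suspect pattern, in input order."""
    return [h for h in hostnames if SUSPECT.match(h)]


# Hypothetical sample data.
sample = [
    "cheap-12345.blogspot.com",
    "myknitting.blogspot.com",
    "pills-0042.blogspot.com",
    "my-blog.blogspot.com",
]
flagged = flag_for_review(sample)  # these would go to the hand-filter queue
```

A real blog could match the pattern by accident, which is why the hand-filter step matters: the regex only builds the review queue and does not delete anything.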
With the Splogosphere maturing, we can expect to see Splogcasts in the near future.
Wouldn't a simple word verification requirement when creating a blog cure this? I don't think many people would bother creating "thousands" of new splogs if they knew they needed to manually enter in user data for each one... why should you even be able to start up a blog using an API?
Blogger already requires word verification for posting comments (if the blog admin turns it on) - am I missing something or would this also work to at least alleviate the splog problem too?
Any trend that has added so much crap to the English language deserves what it gets. After reading the "words" blog, splogsplosion, splog, and spamalanche, I must take a shower.
... much hyped statistics like 'a new blog created every 2 seconds'.
Google has recently announced an idea that would benefit bloggers. The idea is to have a separate blog search similar to sites like "Technorati". At first glance, this benefits bloggers. However, it benefits Google even more. By having Blog searches separate, they can significantly cut down on Google-Bombing. Google-Bombing really screws with their search algorithms.
I think this may be the beginning of a wholehearted launch of "Google Blog". This issue has also been reported on the "TWiT Podcast" hosted by Leo Laporte. I can't remember which episode number it is, but if you search the iTunes podcast database, you should be able to find it.
Example of Google-Bombing: go to Google, search "Miserable Failure" and hit "I'm Feeling Lucky". Regardless of what your opinions are, that type of behavior is still wrong.
i.e., Artima's Ruby Buzz and Java Buzz, Planet PostgreSQL and so forth.
Of course, those become less valuable when folks add RSS feeds that aren't specific to the topic, so that Java posts show up in the Ruby feeds and all that. That can be tricky too, though; does this post go under Jabber or PostgreSQL? Dunno.
The Army reading list
Isn't this the kind of automation prevention problem that captchas can solve reasonably well? Put image-text verification on each step of creating or appending to a blog. If nothing else it will slow them down. Am I missing something?
picture, print that document out, attach it with your photo ID, and fax it to (800) Goo-gle1
Simple: Just require a small donation to charity (through Paypal?) before they can create a blog. A dollar or two shouldn't matter to anyone who's putting up a real blog, but will deter sploggers.
On top of this, once again the hosting services need to be held responsible: if a site is hosting an obviously spamvertised site then give them 24 hours to remove the site or be blocked from future indexing activities - and have current rankings deleted.
If the g'vt kept the data on you that google does you'd better believe you'd be calling it "doing evil"
I feel like I'm in a fog, without a seeing eye dog. What a sog! Burninate, Trog! Jeremiah was a bullfrog, but there was a server backlog. And that was just the prologue. Later we took a jog to get some egg nog. Just make sure to oil the cog. I know it's a slog, but it's better than smog. That's the end of this log.
The trick is to figure out which are "splogs" and which are "real" blogs, because both are usually crap.
"Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
That story is about comment spam, whereas this is about people creating spam blogs.
In case you still can't see, that makes the two things completely different..
They could always randomly generate text from dictionaries to beat the word verification. But no 'splogger' is going to buy up thousands of IPs or domain names for their clever little scam. Figure in the IP or domain name to the pagerank. Maybe if most of the links are from the same IP then take a percentage off its score? This percentage co-efficient could even be derived from the textual context of the links.. if the context is the same (like the scores of mirrored Wikipedia articles, to name one example), then lower the co-efficient.
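A rough sketch of the "take a percentage off if most links come from the same IP" idea follows. The threshold, the discount factor, and the function name are all assumptions picked for illustration; nothing here reflects how PageRank is actually computed.

```python
from collections import Counter


def adjusted_score(raw_score, link_ips, threshold=0.5, discount=0.5):
    """Reduce a score when inbound links are concentrated on a single IP.

    link_ips holds one source IP per inbound link. If the most common IP
    accounts for more than `threshold` of the links, the score is cut in
    proportion to that share. Both parameters are illustrative guesses.
    """
    if not link_ips:
        return raw_score
    top_count = Counter(link_ips).most_common(1)[0][1]
    top_share = top_count / len(link_ips)
    if top_share > threshold:
        return raw_score * (1 - discount * top_share)
    return raw_score


# Hypothetical inbound links: three from one host, one from elsewhere.
concentrated = ["192.0.2.1", "192.0.2.1", "192.0.2.1", "192.0.2.7"]
spread_out = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
```

With these defaults the concentrated set loses part of its score and the spread-out set is left alone. The textual-context coefficient mentioned above would be an additional factor and is not modeled here.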
I seriously wonder if the DMCA or other *AA laws couldn't be used to subpoena the ISP of these guys to get their real addresses. For some reason I doubt there are that many people in the spam and "search engine optimization" business.
Code is Speech. No to Censorship.
On a similar note, I think "Splogs Clog Blog Logs" would be a much better title.
There should be an annual Seuss day where all article titles must be tongue twisters, and all summaries must be done in nonsensical rhyme.
When you're afraid to download music illegally in your own home, then the terrorists have won!
That is the most Sun-like headline I've ever seen on slashdot. For those of you who aren't in the know about crappy British tabloids, The Sun* is like the most popular paper in the country, and I think owned by Darth Murdoch himself. They quite helpfully have pictures on their main page of recent headlines (flash), hence the link.
*Health warning: please shield your eyes whilst loading the site. The sudden visual impact of the Sun's website can cause severe disorientation, epileptic fits, vomiting, and in some cases death. Not recommended for pregnant women or people with heart conditions
In hopes of not looking so spammy, they will take real blogs and copy either the contents or just key words (such as the author's name and perhaps the post title).
So when you search for something... spammers with your name come up, rather than yourself.
Honestly, with everyone and their mom jumping on the blogging bandwagon and the general quality of said blogs approaching robot-created gibberish, I think the blog hosting companies are in for quite a struggle telling spam from cruft. Although, if their automated measures also wipe out some of these inane blogs, perhaps the authors will get a hint and the blogosphere will be a better place AFTER the spammers arrived--imagine that.
'He was a dreamer, a thinker, a speculative philosopher... or, as his wife would have it, an idiot.' - Douglas Adams
The problem surfaces when the "splogs" are used to comment spam and trackback spam legitimate blogs. It's through these links that PageRank is increased. If everyone starts proactively dealing with spam on their own sites, this problem will solve itself. MovableType users can upgrade to 3.2, which has spam blocking features, or use the great plugin MT-Blacklist. Either will eliminate this problem. An AC mentioned that WordPress has a similar set of options. I know that TypePad does. The only major blog service provider left to come up with a solution is Blogger, and in the interim you can require registration to post comments on your Blogger site or turn comments off entirely. LiveJournal and all the clones are blocked from trackback by 90% of normal blog sites already, so they don't even count.
Another poster suggested that we ignore this problem, and it will go away. Untrue. Ignoring the 600 spam comments a day is exactly what the spammers would prefer you do, so that they can stink up every site on the internet with their crap. We are fortunate that in the case of this "new" form of spam, the tools necessary to get rid of it are already there and effective, we just need to get them all turned on.
Ahhh, one step closer to the inevitable webterm of "splooge."
Be a real patriot: Question authority. Think for yourself. Formulate your own conclusions.
P.S. stop relying on google so much, PageRank is obviously flawed if it can be so easily manipulated by spamtards.
Do you have any alternate search engines (preferably with examples to prove that they're actually better) to use instead of google? I've tested out all the big names, and the results I get are almost always near-identical, with the small differences in the results returned not being that important.
It is extremely frustrating when Google returns nothing useful, but I've yet to find a search engine that works better. Google's level of results seems to be the best anyone can achieve at the moment (and it's not really google that's setting the level of excellence).
Google needs some mechanism for judging if a link is a fair link (made by an independent person/process) or a "bought" link created by or on behalf of the same site that is being linked to. I'd bet if Google analyzed these splogs and other SEO-generated sites, they'd find an excessive number of links from the splog to the target (or other in-network splogs) but few links from the splog to other relevant sites. Perhaps Google should reweight sites that seem to focus too many links in one direction. Of course, this is only a temporary solution as SEOs/sploggers could just use Google to find a set of random, but relevant, links to add to their splog.
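To make the "too many links in one direction" test concrete, here is a minimal sketch that measures what fraction of a page's outbound links go to its single most-linked host. The function name and the example URLs are hypothetical, and any cutoff for "excessive" would have to be tuned against real data.

```python
from collections import Counter
from urllib.parse import urlparse


def link_concentration(outbound_urls):
    """Fraction of outbound links that point at the single most-linked host.

    Returns 0.0 when there are no usable links. A value near 1.0 means
    nearly every link goes to one place, which is the pattern described above.
    """
    hosts = [urlparse(url).hostname for url in outbound_urls]
    hosts = [h for h in hosts if h]  # drop links with no parseable host
    if not hosts:
        return 0.0
    return Counter(hosts).most_common(1)[0][1] / len(hosts)


# Hypothetical splog: three links to the promoted site, one to anything else.
splog_links = [
    "http://target.example/shoes",
    "http://target.example/boots",
    "http://target.example/sandals",
    "http://other.example/news",
]
```

As noted, a splogger could dilute this number by padding the page with random relevant links, so it would only ever be one signal among several.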
The deeper problem is that no matter what Google does, some clever SEO will find a way around it. And since sites seeking to be at the top of the search outnumber Google engineers by a wide margin, the SEOs would seem to have the advantage. The only group with greater numbers than the SEOs are Google users. I suspect the ultimate solution will mean social ranking systems where each Google user gets to rank pages and have a reputation for page ranking. The user reputation system would mitigate attempts by SEOs to either up-rank their pages or down-rank competitors' pages.
Two wrongs don't make a right, but three lefts do.
All these approaches are in active use.
Sorry for the rant, but this is all just becoming too much, and it's only getting worse. Are we as a society willing to accept this in the name of free services?
"Are we as a society willing to accept this in the name of free services?"
This isn't even necessarily part of receiving a free service. Just look at the examples you cited, did you pay to go to the movies? So why do you have to pay to see ads? I truly doubt that the cost is being held down for you by the ads, more likely it is just extra profit for the theaters at your expense.
It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
I have only used the e-mail posting interface to my blogger blogs a few times. If you like simplicity, the blogger online editor is quick-and-dirty posting for free. But the potential for abuse when you combine the easy-setup for gaining an account and the email method for posting is obvious.
...abject link-stuffing pollution for google's own search engine and festering on google's own blogging service...seemed pretty dumb to me.
BTW give google credit for putting a captcha feature on post commenting because comment spam used to be just as easy to blast into blogger posts as splogging.
It's kind of ironic that Google, which has had fewer [not "no", just fewer] security gaffes than Microsoft, is, in a sense, suffering security embarrassment for a rather similar reason to the origins of Microsoft's security missteps: trying to appeal to users by providing very streamlined and simple user interfaces to functions that require privilege [account creation, publication] on most systems [think Unix or Apache]... Yes, the additional "hassles" of authenticating and establishing that the remote request is from a human and not a bot are an impediment to users. But catering to utterly lazy dummies is a worse hassle, as ought to be clear to everyone by now. Funny this is now news. If you went to Blogger 6 months ago and selected a random blog and then just surfed randomly by hitting the "NextBlog" button, you would have seen dozens of sites that were just huge steaming piles of links for such vital topics as online shoe purchases.
SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
Make douchebaggery a hangable offense.
"We the jury find the defendant guilty of 1,204,652 counts of false advertising, and one count of being a world-class prick. We hereby sentence him to be hanged by the neck until he is dead."
---If you can't trust a nerd, who can you trust?
Who among us could not grok the same frustration? Funny anecdote: My kid went on a school field trip which included a stop at McDonald's. She returned with her happy-meal toy: a tiny little stuffed puppy-doll with a hu-u-ge tag sewn to it, just screaming with advertising and copyright information. The tag was about three times as big as the dog. I sent her for the scissors and snipped the tag off (in blatant disregard for the fine print saying I was committing a crime). Then the light bulb went off, and I asked her for all the *rest* of her stuffed animals. We had great fun performing tag-ectomies, as I explained to her that we had bought and paid for everything in the house, so it was ours to do with as we pleased, including stripping the commercial propaganda out of it. I think dolls are more fun to play with when they're allowed to just be dolls. She agreed. I'm just doing my best to raise a lawless little punk, here! (:
It's stuff like that that frustration with corporate capitalism can drive you to.
All blogs were already spam. Now it's just unashamedly so.
I agree with parent. My penis has grown a whole six feet since I started using the internet.
Spam jams Stan's LAN.
Guy's WiFi goes awry.
CERN confirms worm, firms squirm.
Forget cassette and diskette, USB key snazzy.
Nimrods applaud iPods abroad, while tightwads called slipshod clawed screen fraud.
One Phish, Two Phish.
Red Phish, Blue Phish.
Support Right To Repair Legislation.
Well, advertising wouldn't be spiralling out of control quite as much if every single person wasn't trying to make a million dollars by age 25. Whatever happened to working for what you earn, and then enjoying those earnings? I know at least the US is on a fast track to having a lot of unhappy people with way too much money that isn't worth anything.
;)
Maybe I'll just go live under a rock... as long as I can get wireless high speed internet
Cheesy Movie Night
Email allows anyone to send it - the result is SPAM. Blogs allow anyone to post comments - the result is spam. We should have learned this by now. Blogs need a handy way for bloggers to moderate comments before they appear. C'mon it's not rocket science.
None of this would happen if there was no money driving the attacks. How to make it not financially worthwhile to pay people to spam for you should be the question.
People in this thread have mentioned a number of things which would make such spam more technically difficult to pull off, none of which would be foolproof.
However, some combination of these techniques could be used by the search engine (handy, that Google the Blogspot-owner-victim is also the search engine being manipulated) to simply flag spammy links internally. And then use them as negative modifiers in its pagerank algorithm. So, questionable attempts to google bomb your site make it drop off the face of google. Silently.
Sure, this could be abused to try and stifle competitors' pageranking. But that's a second order effect, within the realm of possibility to manually correct, as a whitelist of commercial targets bad guys have tried to frame has got to be easier to maintain than a blacklist of fly-by-night spam sellers.
Here's my solution. Charge $1 to open a new blog account. It's still basically free for anyone who wants an account, but prohibitively expensive for spammers who want thousands of accounts.
Mike van Lammeren
It will challenge your head, your brain, and your mind.
Personal note - weaned my 5 year old off of McDonald's. Just went with the phrase "Daddy doesn't go to Donalds" - after a while - he doesn't even ask anymore. The kid knew McDonald's before he was ever there, from birth! - pretty good job if they can advertise to the kids before they can learn to speak.
Stay tuned for new sig...
You can already. Just add -site:(URL here without the ()'s) at the end of the search, once for each site you don't want listed in the results... :)
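If you do this a lot, the query string is easy to build in a few lines. A small sketch; the helper name and the example domains are made up, and only the -site: operator itself comes from the tip above.

```python
def exclude_sites(query, sites):
    """Append one -site: term per unwanted site to a search query string."""
    return " ".join([query] + [f"-site:{site}" for site in sites])


# Hypothetical example: hide two junk domains from a search.
search = exclude_sites("knitting patterns", ["spam.example", "junk.example"])
# search is now: knitting patterns -site:spam.example -site:junk.example
```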
I think using 'g-mail' style invites might be a good idea here. Legitimate users won't want to risk getting their accounts disabled, so they will be more careful about who they invite. And unscrupulous users can easily be found and eliminated at the root by assuming that they and all children of the user are invalid. It doesn't work well for small sites, but for high-visibility sites like Blogger, it could be very effective.
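The "eliminate them and all their children" part is just a walk down the invite tree. A minimal sketch, assuming the service records who invited each account; the invited_by mapping and the account names below are hypothetical.

```python
from collections import deque


def accounts_to_disable(invited_by, bad_account):
    """Return bad_account plus every account it directly or indirectly invited.

    invited_by maps each account name to the name of its inviter
    (None for accounts that were not invited by anyone).
    """
    # Invert the mapping so each inviter points at the accounts it invited.
    children = {}
    for account, inviter in invited_by.items():
        children.setdefault(inviter, []).append(account)

    doomed = set()
    queue = deque([bad_account])
    while queue:
        current = queue.popleft()
        if current in doomed:
            continue  # already handled; also guards against bad data with cycles
        doomed.add(current)
        queue.extend(children.get(current, []))
    return doomed


# Hypothetical invite records.
invited_by = {
    "alice": None,
    "bob": "alice",
    "spammer": "alice",
    "sock1": "spammer",
    "sock2": "sock1",
}
```

Here disabling "spammer" also takes out "sock1" and "sock2" but leaves "alice" and "bob" alone. Whether the inviter should be penalized as well is a policy choice this sketch does not make.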
Titus Barik
Through what medium? Credit cards?
Credit card, PayPal, mail in a check, whatever you like. You could even make it refundable after six months or a year.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
Good choice! Our family doesn't do fast food - period - but this was school we're talking about. So I caved. Have you noticed how much kids are targeted by advertising while in school? My kids bring home marketing junk from places like Home Depot and FedEx (T-shirts and such) that visit class. FedEx actually sent the daughter home with a temporary tattoo. I drew the line there - big business wants to graffiti their logo on my kids' bodies? I pitched it.