Splogs Clog Blog Services
SuperWebTech writes "A new generation of spam has emerged lately in the form of automatically created spam blogs, or "splogs." One wily programmer manipulated Blogger's API to create a "spamalanche" of thousands of blogs whose sole purpose was to increase the PageRank of the programmer's real sites. This clogged search engine results while filling RSS feed services with useless listings. Though Google, Blogger's owner, is doing its best to fix the problem, in the meantime several services have stopped listing any site Blogger hosts. So far nobody has found a solution."
Anyone else notice that every username in the video is [letters]-[numbers].blogspot.com?
Maybe start by disabling new blogs.
Flag all usernames that match that basic regex.
Hand filter that bunch.
Add the same captcha you have on your comment system to the posting system.
Re-enable registration.
Seems kind of elementary, doesn't it? Why not try it?
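The flagging step above could be sketched roughly like this. It is only an illustration: the exact pattern (letters, a hyphen, digits, then .blogspot.com) is a guess based on what the comment describes, and the function name and sample hostnames are made up for the example.

```python
import re

# Assumed pattern: one or more letters, a hyphen, one or more digits,
# then ".blogspot.com". This is a guess at the "[letters]-[numbers]" shape.
SPLOG_PATTERN = re.compile(r"[a-z]+-[0-9]+\.blogspot\.com", re.IGNORECASE)


def flag_suspects(hostnames):
    """Return the hostnames that fit the pattern, for hand filtering.

    Nothing is deleted here; matches are only set aside for a human to review.
    """
    suspects = []
    for host in hostnames:
        host = host.strip()
        # fullmatch so that "a-1.blogspot.com.example.net" is not flagged
        if SPLOG_PATTERN.fullmatch(host):
            suspects.append(host)
    return suspects


if __name__ == "__main__":
    sample = [
        "cheapshoes-48213.blogspot.com",  # hypothetical splog-style name
        "myblog.blogspot.com",            # hypothetical ordinary name
    ]
    for host in flag_suspects(sample):
        print(host)
```

A real blog can fit the pattern too, which is why the hand-filter step still matters.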
Simple: Just require a small donation to charity (through PayPal?) before they can create a blog. A dollar or two shouldn't matter to anyone who's putting up a real blog, but will deter sploggers.
On top of this, once again the hosting services need to be held responsible: if a site is hosting an obviously spamvertised site then give them 24 hours to remove the site or be blocked from future indexing activities - and have current rankings deleted.
If the gov't kept the data on you that Google does, you'd better believe you'd be calling it "doing evil."
They could always randomly generate text from dictionaries to beat the word verification. But no 'splogger' is going to buy up thousands of IPs or domain names for their clever little scam. Figure the IP or domain name into the PageRank. Maybe if most of the links are from the same IP, then take a percentage off its score? This percentage coefficient could even be derived from the textual context of the links. If the context is the same (like the scores of mirrored Wikipedia articles, to name one example), then lower the coefficient.
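As a toy illustration of the same-IP discount suggested above (not how PageRank actually works, and not anything Google is known to do): count one point per inbound link, and if more than half of the links come from a single IP, take a percentage off. The function name, the 50% threshold and the `penalty` parameter are all assumptions for the sketch; the textual-context coefficient is left out.

```python
from collections import Counter


def discounted_link_score(link_ips, penalty=0.5):
    """Toy link score with a discount when most links share one IP.

    link_ips: one IP string per inbound link (hypothetical input format).
    penalty:  assumed maximum fraction removed when every link shares an IP.
    """
    if not link_ips:
        return 0.0

    raw_score = float(len(link_ips))  # one point per inbound link
    counts = Counter(link_ips)
    # Fraction of all links that come from the single most common IP
    top_share = max(counts.values()) / len(link_ips)

    if top_share > 0.5:  # "most of the links are from the same IP"
        return raw_score * (1 - penalty * top_share)
    return raw_score


if __name__ == "__main__":
    farm = ["10.0.0.1"] * 4                       # four links, one IP
    mixed = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
    print(discounted_link_score(farm))
    print(discounted_link_score(mixed))
```

With these made-up numbers, four links from one IP are worth half as much as four links from four different IPs.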
Yes and no. CAPTCHAs solve the problem for things like Slashdot, where you just have to worry about trolls with too much time on their hands. But when it comes to spam, there's a value to beating them, so what some enterprising spammers do is set up porn sites that tell people "enter the word you see here and get free porn!". Lots of horny geeks do the spammers' work for them. The difference between the two scenarios is that the spammers are willing to pay minute amounts to beat the CAPTCHAs, but the trolls aren't.
P.S. Stop relying on Google so much; PageRank is obviously flawed if it can be so easily manipulated by spamtards.
Do you have any alternate search engines (preferably with examples to prove that they're actually better) to use instead of Google? I've tested out all the big names, and the results I get are almost always near-identical, with the small differences in the results returned not being that important.
It is extremely frustrating when Google returns nothing useful, but I've yet to find a search engine that works better. Google's level of results seems to be the best anyone can achieve at the moment (and it's not really Google that's setting the level of excellence).
I have only used the e-mail posting interface to my Blogger blogs a few times. If you like simplicity, the Blogger online editor is quick-and-dirty posting for free. But the potential for abuse when you combine the easy setup for gaining an account and the email method for posting is obvious.
...abject link-stuffing pollution for Google's own search engine and festering on Google's own blogging service...seemed pretty dumb to me.
BTW, give Google credit for putting a captcha feature on post commenting, because comment spam used to be just as easy to blast into Blogger posts as splogging.
It's kind of ironic that Google, which has had fewer [not "no", just fewer] security gaffes than Microsoft, is, in a sense, suffering a security embarrassment for a reason rather similar to the origins of Microsoft's security missteps: trying to appeal to users by providing very streamlined and simple user interfaces to functions that require privilege [account creation, publication] on most systems [think Unix or Apache]... Yes, the additional "hassles" of authenticating and establishing that the remote request is from a human and not a bot are an impediment to users. But catering to utterly lazy dummies is a worse hassle, as ought to be clear to everyone by now. Funny that this is now news. If you went to Blogger 6 months ago, selected a random blog, and then just surfed randomly by hitting the "NextBlog" button, you would have seen dozens of sites that were just huge steaming piles of links for such vital topics as online shoe purchases.
(Disclosure: I work in "white hat" SEO, where we try to actually make sites more friendly, fast and useful for end users; this black hat SEO stuff doesn't do us any favours at all, so I'm keen to see these spammers wiped out by any means).
Rich.