Splogs Clog Blog Services
SuperWebTech writes "A new generation of spam has emerged lately in the form of automatically created spam blogs, or 'splogs.' One wily programmer manipulated Blogger's API to create a 'spamalanche' of thousands of blogs whose sole purpose was to increase their real sites' PageRank. This clogged search engine results while filling RSS feed services with useless listings. Though Google, Blogger's owner, is doing its best to fix the problem, in the meantime several services have stopped listing any site they host. So far nobody has found a solution."
Wouldn't a simple word verification requirement when creating a blog cure this? I don't think many people would bother creating "thousands" of new splogs if they knew they needed to manually enter user data for each one... why should you even be able to start up a blog using an API?
Blogger already requires word verification for posting comments (if the blog admin turns it on) - am I missing something, or would this at least alleviate the splog problem too?
Google has recently announced an idea that would benefit bloggers. The idea is to have a separate blog search similar to sites like Technorati. At first glance, this benefits bloggers. However, it benefits Google even more. By keeping blog searches separate, they can significantly cut down on Google-Bombing, which really screws with their search algorithms.
I think this may be the beginning of a wholehearted launch of "Google Blog". This issue has also been reported on the "TWiT Podcast" hosted by Leo Laporte. I can't remember which episode number it is, but if you search the iTunes podcast database, you should be able to find it.
Example of Google-Bombing: go to Google, search for "Miserable Failure", and hit "I'm Feeling Lucky". Regardless of what your opinions are, that type of behavior is still wrong.
Flag all usernames that meet that basic regex criteria.
With all the effort spammers put into avoiding Bayesian filtering on e-mail, don't you think they will change their username format to something else half an hour after you implement this regex? Probably to something more variable (and dictionary based).
Hand filter that bunch.
And hand filtering thousands of blogs which are created automatically does not seem feasible...
Google needs some mechanism for judging whether a link is a fair link (made by an independent person/process) or a "bought" link created by or on behalf of the same site that is being linked to. I'd bet if Google analyzed these splogs and other SEO-generated sites, they'd find an excessive number of links from the splog to the target (or other in-network splogs) but few links from the splog to other relevant sites. Perhaps Google should reweight sites that seem to focus too many links in one direction. Of course, this is only a temporary solution, as SEOs/sploggers could just use Google to find a set of random, but relevant, links to add to their splog.
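To make the "too many links in one direction" idea concrete, here is a rough Python sketch of what such a check might look like. Nothing here reflects how Google actually works; the function names, the 0.8 threshold, and the 5-link minimum are all made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def link_concentration(outbound_links):
    """Fraction of outbound links that point at the single most-linked host."""
    hosts = [urlparse(url).netloc.lower() for url in outbound_links]
    hosts = [h for h in hosts if h]  # drop relative or malformed URLs
    if not hosts:
        return 0.0
    top_count = Counter(hosts).most_common(1)[0][1]
    return top_count / len(hosts)


def looks_like_splog(outbound_links, threshold=0.8, min_links=5):
    """Hypothetical heuristic: flag a blog whose links mostly hit one host.

    threshold and min_links are arbitrary illustration values.
    """
    if len(outbound_links) < min_links:
        return False  # too little evidence to judge
    return link_concentration(outbound_links) >= threshold


if __name__ == "__main__":
    suspect = ["http://target.example/page"] * 9 + ["http://other.example/"]
    print(link_concentration(suspect), looks_like_splog(suspect))
```

As the comment already points out, a splogger can beat this by padding the blog with unrelated outbound links, so at best it would be one weak signal among many.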
The deeper problem is that no matter what Google does, some clever SEO will find a way around it. And since sites seeking to be at the top of the search results outnumber Google engineers by a wide margin, the SEOs would seem to have the advantage. The only group with greater numbers than the SEOs is Google users. I suspect the ultimate solution will mean social ranking systems where each Google user gets to rank pages and has a reputation for page ranking. The user reputation system would mitigate attempts by SEOs to either up-rank their pages or down-rank competitors' pages.
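A toy Python sketch of the reputation idea, just to show why weighting votes by reputation blunts sock-puppet voting. The scoring rule, the names, and the default reputation of 0.1 for unknown users are all assumptions for illustration, not any real system.

```python
def weighted_score(votes, reputation, default_reputation=0.1):
    """Reputation-weighted average of votes for one page.

    votes: dict mapping user -> +1 (up-rank) or -1 (down-rank)
    reputation: dict mapping user -> non-negative weight
    Users with no track record get default_reputation (an assumed value).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for user, vote in votes.items():
        weight = reputation.get(user, default_reputation)
        total_weight += weight
        weighted_sum += weight * vote
    return weighted_sum / total_weight if total_weight else 0.0


if __name__ == "__main__":
    # One established user up-ranks; two fresh accounts try to bury the page.
    votes = {"alice": 1, "sock1": -1, "sock2": -1}
    reputation = {"alice": 5.0}
    print(weighted_score(votes, reputation))
```

With a plain majority vote the two new accounts would win; with weights the established user dominates. The hard part, which this skips entirely, is how reputation gets earned without being gamed itself.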
Two wrongs don't make a right, but three lefts do.
Sorry for the rant, but this is all just becoming too much, and it's only getting worse. Are we as a society willing to accept this in the name of free services?