The Ham and Spam of Weblogs
An anonymous reader submits "Will the blogosphere become just as spammy as Usenet? There may be over 10M weblogs out there, but most of them seem to be fake spam blogs created to manipulate the search engines. Scott Johnson, CTO at Feedster, complained that 'at times we see upwards of 90% of the traffic from Blogspot being spam,' and the problem is only likely to get worse. Can blog search engines like Technorati, Feedster, and PubSub filter the signal from the torrent of noise? Or will we have to seek new approaches, such as the social filtering used by Del.icio.us or the collaborative filtering used by Findory, to separate the ham from the spam?"
I wish Google had an option to exclude blogs from my search. Considering many blogs use b2evolution, phpBB, or whatever, Google could easily determine what IS a blog and what IS NOT and filter accordingly. Google IMHO would be a much better place if I could exclude blogs and those stupid parked-domain search sites from my queries.
I'm not trying to be flamebait; it would be a nice option though.
It was a bit unintuitive how you add sites to the filter list though -- just cut and paste "http://*.whatever.com/*" into your extensions list and any search results from whatever.com will then be greyed out.
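For anyone wondering how a pattern like that behaves, here is a rough sketch in Python using shell-style wildcards from the standard `fnmatch` module. It is only an illustration: the function name `is_filtered` and the sample URLs are made up, and the extension may match patterns differently.

```python
from fnmatch import fnmatchcase

def is_filtered(url: str, patterns: list[str]) -> bool:
    """Return True if the URL matches any wildcard pattern in the filter list.

    Hypothetical helper: '*' matches any run of characters, including '/' and '.'.
    """
    return any(fnmatchcase(url, pattern) for pattern in patterns)

# The pattern from the comment above; whatever.com is a placeholder domain.
filter_list = ["http://*.whatever.com/*"]

print(is_filtered("http://www.whatever.com/some/page", filter_list))
print(is_filtered("http://example.org/", filter_list))
# The bare domain has no "." before "whatever.com", so this pattern misses it.
print(is_filtered("http://whatever.com/", filter_list))
```

Under these matching rules you would need a second entry such as "http://whatever.com/*" to catch the bare domain as well.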
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
If you have a few minutes, click the "Next Blog" randomizer button at the top of the screen a couple of times. I'd be willing to say that at least 2 out of every 10 blogs are spam farms.
It's just fucking sad.
Web2.0: I love when people Flickr my cuil and digg my boingboing until my google is reddit and I start to yahoo
Actually, Usenet is doing quite well. The spam battle has been won; there's very little spam in the technical groups. Serious workers in difficult fields are on there. Check out, say, "comp.games.development.programming.algorithms", where the people who write physics engines discuss how to do it. Or "comp.std.c++.moderated", where proposed changes to C++ are discussed. Usenet has far lower advertising content than the Web, where, today, "content" seems to be a little box in the middle of the page, surrounded by blinking ads.