How Google Trends & News Pollute the Web
Danny Sullivan's hard-hitting piece at Search Engine Land calls on Google to quit being evil in one particular way: collaborating with sleazy websites that jump on Google Trends to grab advertising revenue, as Google itself rakes it in. "Google's CEO Eric Schmidt has quite famously been on record many times talking about how the Web is full of garbage. It's a cesspool out there, he's said. Today, a short fast look at how his own company pollutes the Web. ... That [example of an off-topic, trend-following] page isn't adding any value to the web. If it didn't exist, we wouldn't be the less savvy... But thanks to Google Trends, we've got a big red flag up in front of publishers that wish to pollute Google's results with this type of garbage. ... On the one hand, I love Google Trends. It's fun seeing what the top terms are that are sparking interest... On the other hand, it's clear how much [garbage] Google has caused to be generated, simply by publishing the trends. But that garbage wouldn't happen, if it didn't know it was going to be rewarded. It is, both with traffic from Google and from revenue from Google for those carrying its ads."
Certainly not Google. Or me, for that matter. The Big G's business model is built on the premise that storage is cheap, and that value comes from never deleting anything and making all of it available through a powerful search engine. When did you last delete something out of Gmail, for example?
There are whole industries built around SEO, and it seems naive to think that people aren't going to create or alter content in order to get a higher ranking. Does it matter?
I started using Google Blog Search to create an RSS feed of topics I'm interested in. Gradually I started using regexes to filter out sites that were clearly just spam. Now my regex is about 20K in size, and out of 150 results that Google returns, maybe 4 or 5 stories make it through the filter.
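For anyone curious what that kind of filter looks like, here is a minimal sketch in Python. It is not the poster's actual setup: the blocklist patterns, the `.example` domains, the `filter_feed` function name, and the sample feed are all made up for illustration. It assumes a plain RSS 2.0 feed and matches the regex against each item's link hostname.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical blocklist; a real one would grow as new spam sites appear.
SPAM_PATTERNS = [
    r"(^|\.)cheap-trends\.example$",
    r"(^|\.)hot-keywords-\d+\.example$",
]
# Join the patterns into one alternation so each hostname is scanned once.
SPAM_RE = re.compile("|".join(SPAM_PATTERNS), re.IGNORECASE)


def filter_feed(rss_xml):
    """Return (title, link) pairs for items whose host is not blocklisted."""
    root = ET.fromstring(rss_xml)
    kept = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        host = urlparse(link).hostname or ""
        if not SPAM_RE.search(host):
            kept.append((title, link))
    return kept


# Made-up sample feed: one real-looking story and two trend-bait entries.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>sample</title>
<item><title>Real story</title><link>http://blog.example.org/post</link></item>
<item><title>Trend bait</title><link>http://www.cheap-trends.example/x</link></item>
<item><title>More bait</title><link>http://hot-keywords-42.example/y</link></item>
</channel></rss>"""

if __name__ == "__main__":
    for title, link in filter_feed(SAMPLE_FEED):
        print(title, link)
```

Matching on the hostname rather than the whole item keeps the patterns short, but it only catches sites you have already seen, which is presumably how a filter like this ends up at 20K.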
Introducing Microsoft Vacuum 1.0 The first Microsoft product that doesn't suck.