The Anti-Thesaurus: Unwords For Web Searches
Nicholas Carroll writes: "In the continual struggle between search engine administrators, index spammers, and the chaos that underlies knowledge classification, we have endless tools for 'increasing relevance' of search returns, ranging from much ballyhooed and misunderstood 'meta keywords,' to complex algorithms that are still far from perfecting artificial intelligence. Proposal: there should be a metadata standard allowing webmasters to manually decrease the relevance of their pages for specific search terms and phrases."
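To make the proposal concrete, here is a rough sketch of how an indexer might honor such a tag. The proposal quoted above does not specify any syntax, so the `unwords` meta name, the comma-separated format, and the `penalty` factor are all assumptions made up for illustration, not part of any real standard or search engine.

```python
from html.parser import HTMLParser


class UnwordsParser(HTMLParser):
    """Collects terms from a hypothetical <meta name="unwords" content="..."> tag."""

    def __init__(self):
        super().__init__()
        self.unwords = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # "unwords" is an assumed tag name, not an existing convention.
        if (attr.get("name") or "").lower() != "unwords":
            return
        for term in (attr.get("content") or "").split(","):
            term = term.strip().lower()
            if term:
                self.unwords.add(term)


def extract_unwords(html):
    """Return the set of terms the page author asked not to be ranked for."""
    parser = UnwordsParser()
    parser.feed(html)
    return parser.unwords


def adjusted_score(base_score, query, unwords, penalty=0.1):
    """Scale the score down when the query, or any word in it, is an unword."""
    query = query.lower().strip()
    if query in unwords or any(word in unwords for word in query.split()):
        return base_score * penalty
    return base_score


page = '<html><head><meta name="unwords" content="printing, Lexmark"></head></html>'
unwords = extract_unwords(page)
print(adjusted_score(1.0, "printing help", unwords))
print(adjusted_score(1.0, "campus map", unwords))
```

The score is scaled down rather than dropped to zero, since the proposal talks about decreasing relevance, not excluding the page outright. A real engine would have to decide how far to trust the tag.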
Not such a bright idea to whine about too much traffic on your website and then get a link to your site from a Slashdot article.
Mod my comments down. It'll be fun.
From: frankie3327@aol.com
To: staff@cs.here.edu
Subject: help!
i have a lexmark 4590 and it wont print in color.
it only makes streaks. also the paper always
jams. how do i fix it? please reply soon!
The senders never had any connection to the college or the department. We'd reply telling them we had no idea what they were talking about, and that they should seek help elsewhere. It was rather annoying.
We eventually figured it out. The department web site maintains a collection of help documents for users of the systems. One of them talked about how to use the department's printers, what to do if you have trouble, etc. At the bottom it listed staff@cs.here.edu as the contact address for the site.
You've probably guessed it by now. That page came up as one of the top few hits when you searched for "printing" on one of the major search engines (I forget which one). Apparently lusers would find this page, notice that it didn't answer their question, but latch on to the staff email address at the bottom, as if we were an organization dedicated to helping people worldwide with their printers. Furrfu!
I think we reworded the page to emphasize that it only applied to the college, and we haven't received any more emails lately. But if we could have kept search engines from returning it, that would have been even better. Since in our case the page was intended for internal use, we don't care whether anyone can find it from the Internet. Our real users know where to look for it.
So in answer to your question: When a search engine returns a page that doesn't answer the user's question, the user will often complain to the webmaster. That's a clear incentive to the webmaster not to have the page show up where it's not relevant. Also, it's not the goal of every site simply to be read by millions of people; some would rather concentrate on those to whom it's useful.
Surely this kind of issue is what Tim Berners-Lee and the W3C are trying to address with the Semantic Web.
The problem with content on the web today is that while it is perfectly readable by humans, it is incomprehensible to machines. If Tim and Co get their way, and I for one would love to see the Semantic Web catch on, then we can get rid of kluges like the Anti-Thesaurus, HTML meta keywords and the like.
-- "So, what's the deal with Auntie Gerschwitz et all?"