Web Searches For What Lies Beneath
fat_hot writes: "The New York Times has an article [here] (registration required) about specialized search engines which try to drill into the submerged mass of the Internet iceberg to try to limit searches to particular subjects (and hopefully thereby increase coverage of the limited scope)." Considering that a Google search for friends' web sites and other good stuff usually turns up more dirt than paydirt, it's pleasant to contemplate more relevance in search engines.
However, I suspect that whatever the answer to the search engine problem actually turns out to be, it will have the following characteristics:
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Most of us recall being brought into the school library and shown how to use the card catalog, given a few assignments, etc. Unfortunately, for those of us out of school there's no comparable set of skills in place to help with searching.
Boolean searches, using key words, supplying partial words, phrases, etc. are all supported by most search engines, but few folks understand how to use them.
What's really surprising to me is that folks who use search engines regularly, indeed even rely upon them (journalists, I mean you!), seem to be some of the most poorly prepared. There are lots of resources for learning how to do a good search, many from the search engines themselves and many more from third parties, yet we still get these perennial "I can't find ..." stories.
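For anyone who hasn't seen what those operators actually do, here is a minimal sketch over a toy inverted index. The three documents, their IDs, and the helper names (`docs_with`, `docs_with_prefix`, `docs_with_phrase`) are all made up for illustration; no real engine's query syntax or internals are implied.

```python
# Toy corpus; the documents and their IDs are invented for this example.
DOCS = {
    1: "linda chavez labor secretary nominee",
    2: "cesar chavez farm labor leader",
    3: "labor statistics for farm workers",
}


def build_index(docs):
    """Map each word to the set of document IDs that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def docs_with(word):
    """Keyword search: documents containing the exact word."""
    return INDEX.get(word.lower(), set())


def docs_with_prefix(prefix):
    """Partial-word search: documents containing any word starting with prefix."""
    result = set()
    for word, ids in INDEX.items():
        if word.startswith(prefix.lower()):
            result |= ids
    return result


def docs_with_phrase(phrase):
    """Phrase search: the words must appear together, in order."""
    needle = " " + phrase.lower() + " "
    return {doc_id for doc_id, text in DOCS.items() if needle in " " + text + " "}


# Boolean operators are just set operations on the results:
both = docs_with("chavez") & docs_with("labor")        # chavez AND labor
either = docs_with("farm") | docs_with("secretary")    # farm OR secretary
not_cesar = docs_with("chavez") - docs_with("cesar")   # chavez AND NOT cesar
```

With this toy data, a bare "chavez" matches both Chavezes, while adding NOT cesar or the phrase "labor secretary" narrows the result to the one document about the nominee.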
Honestly, I'm not into blaming-the-victim but how difficult is it to learn how to perform a good search? One screen of directions? Two minutes of time?
Yes, there's a place for specialized engines handling unique or limited content, but most of the larger, more general-purpose engines do nearly as well if properly used. Again, it's dependent on the user to learn how to define what they want; all of the tools in the world are no good if they're not taken advantage of.
I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I won't bother either.
If your friends have sites but not too many people link to them, they won't rank too highly in Google's eyes, will they?
A Google search for 'dumb motherfucker' will yield George W. Bush's website, how inaccurate could Google possibly be?
"a Google search on "chavez" led to several encyclopedia entries on Cesar Chavez" Would it have fucking killed them to type in "Linda Chavez labor secretary"? And this was very recent news, exactly how quickly do you expect Google to scan the entire internet for updates? How quickly could these 'iceberg drilling' search engines possibly scan the net? It's a deep web right now, what's invisible will bubble to the surface if it's relevant... Maybe they have a point on using the search engines to only scan specific areas, but I think websites which specialize in these areas should license the Google engine instead of Excite's... (you know what I'm talking about right? Every big site has some article you want to find, you go to look for it, you get the worst search interface possible that doesn't return any useful links...)
--
Peace,
Lord Omlette
ICQ# 77863057
[o]_O
Yeah, and it would be great if nobody stole money and gave to charity, too. It just isn't going to happen. Any system that is A) valuable and B) depends on everyone behaving honestly is doomed to failure. You're never going to get people to stop cheating the search engines as long as doing so is both possible and beneficial to the cheaters. The plain fact is that manipulating the system works, and people are going to keep doing it as long as it keeps working. The only solution is to develop a system that is not easily manipulated.
Perhaps you should try looking at Google, a search engine that actually uses these in a clever way as the key part of its ranking system. It's remarkably effective at finding relevant information and at avoiding the kinds of simple manipulation you complain about. Other ranking schemes (like GoTo.com's straight pay for placement system) are also relatively resistant to manipulation. I think that the long term solution is going to be natural selection; search engines that are easy to manipulate to give lousy results will go out of business and leave behind the ones that are actually useful.
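To make the link-based idea concrete, here is a rough sketch of a simplified PageRank-style iteration. It follows the general published idea (a page matters if pages that matter link to it) and is not a description of Google's production system; the function name `link_rank`, the damping value, and the tiny link graph are assumptions for illustration.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by who links to them.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; scores sum to roughly 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: three pages link to "c", and "c" links to "a".
example = link_rank({"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a"]})
best = max(example, key=example.get)
```

In this graph "c" comes out on top because three pages point at it, and "a" beats "b" and "d" because its one inbound link comes from the highly ranked "c". Stuffing keywords into your own page changes none of this, which is why the scheme is harder to game than on-page signals.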
Good luck. The latest versions of Google include over 1 billion pages. Manual sifting for poorly labeled ones just plain isn't an option if your primary goal is comprehensiveness.
There's no point in questioning authority if you aren't going to listen to the answers.
Why does one need cheesy dotcoms to tell us what a directory is?
A directory search limited to U.S. newspapers immediately brings up, say, an explanation by Linda Chavez about her relationship with the illegal alien in question.
If one wants political news, one can go to a political news source. If one wants information on Linda Chavez, one can do a more specific search. If one wants political news about Linda Chavez, one can (this must be getting very complex for your average dotcom founder) search a news archive.
-- Stanislav Shalunov
Examples:
Searching for "John Smith" should return my friend John Smith and no one else.
Searching for "C++ implementation of Knuth algorithms" should return exactly that, and leave out pages that merely mention C++, Knuth, or algorithms.
At the very least, large search results should immediately separate the mass of results into categories - i.e. "Jessica Alba" - up at the top should be pr0n - fan sites - commercial sites - etc. Yahoo does this, but there are way too many categories. Really, the web has maybe 10-12 different broad types of sites - commercial, homepages, academic sites, pr0n, multimedia, weblog - you get the point, the list isn't that long. We should be able to filter entire broad categories out of our searches. Altavista does a fairly good job with multimedia searches - unfortunately there still is way too much manual searching - it still doesn't read our minds enough within the broad category search.
Google uses PageRank to determine the order of results, but does it track the sites its users click on after performing a search? No, but it should. Further, it should track users individually and be able to customize its results based on that person's individual personality. The more you use a search engine, the better it should work for you.
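A minimal sketch of what that per-user click tracking could look like, assuming the engine already produces a base score for each result. The class name `ClickTracker`, the `boost` parameter, and the example URLs are all hypothetical; this is one naive way to do it, not how any real engine works.

```python
from collections import defaultdict


class ClickTracker:
    """Remember which results each user clicked and nudge future rankings."""

    def __init__(self, boost=0.1):
        self.boost = boost  # assumed: extra weight per past click
        # user -> url -> number of clicks
        self.clicks = defaultdict(lambda: defaultdict(int))

    def record_click(self, user, url):
        self.clicks[user][url] += 1

    def rerank(self, user, results):
        """results is a list of (url, base_score) pairs.

        Returns the URLs ordered by base score scaled up for every
        time this particular user clicked that URL before.
        """
        user_clicks = self.clicks.get(user, {})

        def personalized(item):
            url, base_score = item
            return base_score * (1.0 + self.boost * user_clicks.get(url, 0))

        return [url for url, _ in sorted(results, key=personalized, reverse=True)]


tracker = ClickTracker(boost=0.5)
results = [("a.example", 1.0), ("b.example", 0.8)]
tracker.record_click("alice", "b.example")
tracker.record_click("alice", "b.example")
alice_view = tracker.rerank("alice", results)  # b.example now first for alice
bob_view = tracker.rerank("bob", results)      # bob still sees the default order
```

The same query gives different orderings to different users, which is the point. It also means the engine is keeping a per-user history, so the privacy cost is part of the design.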
I can't stress this enough: A search engine needs to be able to read our minds.
No, Thursday's out. How about never - is never good for you?
(Note: If you have arrived at this site through inappropriate references via a search engine, please be assured that we did not utilize this language in our site, our HTML, nor in our internet promotion of this site. What happened was the result of a malicious act and we are pursuing remedies through the efforts of our staff and attorneys.)
I hope I am not liable in Spain for using those words. Please don't tell them where Spain is.
--ricardo
sgis ddo ekil t'nod i
I asked Jeeves "Where can I find a good search engine?" and was directed to a really good site where I can buy engine parts for my car online.
Thanks for nothing you bastard butler!
-----
"The only difference between me and a madman is that I'm not mad." - Salvador Dali (1904-1989)
The problem isn't the searches, it's the people who make the webpages.
Why doesn't everyone use metatags properly? What about specifying good (descriptive) title tags?
Plus, don't you think it would be much easier if people actually didn't try to cheat search engines?
In actuality there would be some very easy ways to score pages for relevance then:
1) The number of times a particular word shows up in the keywords, and description of the page.
2) If the word actually appeared in the title of the page.
3) The number of times the word appears in the body of the text
4) The length of the supposedly searched word
5) The number of times a particular page is linked to.
6) The words used in the link
7) The number of times the linking page is linked.
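The seven factors above could be combined into a single score along these lines. This is only a sketch: the `Page` fields, the `WEIGHTS` values, and the function names are made up for illustration, and it assumes honest page authors, which is exactly the assumption in question.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    title: str
    keywords: str
    description: str
    body: str
    # Text of each link that points at this page (factors 5 and 6).
    inbound_anchor_texts: list = field(default_factory=list)
    # For each linking page, how many links point at it (factor 7).
    inbound_page_link_counts: list = field(default_factory=list)


# Arbitrary illustrative weights, one per factor in the list.
WEIGHTS = {
    "meta": 2.0,     # 1) occurrences in keywords and description
    "title": 5.0,    # 2) word appears in the title
    "body": 1.0,     # 3) occurrences in the body text
    "length": 0.1,   # 4) length of the searched word
    "inbound": 3.0,  # 5) number of links to the page
    "anchor": 4.0,   # 6) word appears in the link text
    "linker": 0.5,   # 7) how often the linking pages are linked
}


def count(word, text):
    """Number of whole-word, case-insensitive occurrences of word in text."""
    return text.lower().split().count(word)


def score(page, word, weights=WEIGHTS):
    word = word.lower()
    total = 0.0
    total += weights["meta"] * (count(word, page.keywords) + count(word, page.description))
    total += weights["title"] * (1 if count(word, page.title) else 0)
    total += weights["body"] * count(word, page.body)
    total += weights["length"] * len(word)
    total += weights["inbound"] * len(page.inbound_anchor_texts)
    total += weights["anchor"] * sum(count(word, a) for a in page.inbound_anchor_texts)
    total += weights["linker"] * sum(page.inbound_page_link_counts)
    return total
```

Note that factors 1 through 3 are fully under the page author's control, so a cheater can inflate them at will; only factors 5 through 7 depend on what other people do.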
Wouldn't the world be happier? Personally, I think that it would be great if there were an editing team that would simply delete misrepresented pages.
Anyway. That's my two cents.
"i blew a booger that i'd swear had it's own spinal cord" "OUCH" Caroline's Spine