Better Search Engines
prostoalex writes "Scientific American is seeking better Web searches. They report on all sorts of innovations happening outside the Google-Yahoo-MSN zone that the press usually covers, including GPS-enhanced searches from the University of Maryland, Shape Retrieval and Analysis from Princeton, a musical search engine from the New Zealand Digital Library Project, and some of the projects that A9 and Ask.com have been working on."
This is kinda close.
What?
Aside from the horrible name, Clusty (a clustering search engine) is very innovative and easy to use. I hope more search engines will adopt similar technology soon.
Link to clusty.com search engine
Personally, I use the BBC search engine. Not only does it seem to provide relevant results, it also has recommended links (info here http://www.bbc.co.uk/search/recommended.shtml ) which are editorially selected.
The site seems to return far less porn, probably because they "use a combination of technology and regular human checks to detect and block offensive websites. We aim to be the safest search engine in the UK."
Also, Slashdot is the first result for "IT News" under the web tab.
http://www.bbc.co.uk/cgi-bin/search/results.pl?go.x=&tab=www&go.y=&go=go&q=IT%20news
CopyScape can recognize copied content, but its only purpose is finding website plagiarism. This, however, would definitely find all the Wikipedia forks unless it's a really old copy and the page has had a major rewrite.
If Google could integrate CopyScape into their search, you would be happy.
Interesting. The first thing I thought was that I had seen this with Vivisimo, but I guess no one could spell that, so they changed the name?
http://vivisimo.com/
But I agree, it is a great search engine and has gotten better as I have used it.
You can do this in Google: searchterm1 searchterm2 ~bogus. The tilde will look for synonyms. You can see which ones hit by reading the bold words in the results that are neither searchterm1 nor searchterm2. I use ~howto and ~cheats often.
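For anyone who wants to script that kind of query, here is a rough sketch in Python. The function names are made up for illustration, and the base URL is an assumption about the usual Google search endpoint; the code only builds the query string and does not fetch anything.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_synonym_query(terms, synonym_terms):
    """Join plain terms and tilde-prefixed synonym terms into one query string.

    Hypothetical helper: `terms` are searched as typed, `synonym_terms`
    get a leading "~" as described in the comment above.
    """
    parts = list(terms) + ["~" + term for term in synonym_terms]
    return " ".join(parts)


def build_search_url(query, base="http://www.google.com/search"):
    """URL-encode the query as the `q` parameter (base URL is an assumption)."""
    return base + "?" + urlencode({"q": query})


query = build_synonym_query(["searchterm1", "searchterm2"], ["bogus"])
url = build_search_url(query)
print(query)
print(url)
```

The same helper covers the ~howto and ~cheats style searches by passing those words in the second list.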
Rule of the open mind
People who are resistant to change cannot resist change for the worst.
Try: The Google Directory http://www.google.com/dirhp.
The data is from the Open Directory Project http://dmoz.org/, an almost entirely volunteer-run project http://dmoz.org/about.html. I suggest using the Google version because, for most people, its search facility is better than the ODP search: it works the way most Google users would expect a search to work.
The actual directory is variable in quality. Some of it is very, very good indeed. However, it suffers from the usual problem of volunteer-run projects: parts of it are neglected and rather out of date. Always worth a look though.
You mean like this: Google API Proximity Search?!
The hidden text problem that you mention is a surprisingly hard problem to deal with, as there are so many ways to do it.
You have:
- The <font> tag
- CSS (several ways, such as the visibility: hidden property, changing the colors, using the z-order, etc.), both internal and externally linked (for which the search engine must download that file while spidering)
- DHTML positioning over other elements
- A background image the same color as the text
- Javascript to generate any of the above
- Use of nearly identical colors for all of the above (such as #FFFFFF for the background and #FFFFFE for the foreground). In fact, there could be dozens of colors that differ so slightly that a human couldn't tell them apart without looking very closely, if at all.
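The near-identical color trick in the last item is at least easy to test for once you know the effective foreground and background colors. A minimal sketch in Python, assuming colors arrive as hex strings; the function names and the threshold of 10 are hypothetical and chosen only for illustration, and plain RGB distance is a crude stand-in for how people actually perceive color.

```python
def parse_hex_color(value):
    """Convert '#RRGGBB' or '#RGB' into an (r, g, b) tuple of ints."""
    value = value.lstrip("#")
    if len(value) == 3:
        # Expand shorthand such as 'fff' to 'ffffff'.
        value = "".join(ch * 2 for ch in value)
    if len(value) != 6:
        raise ValueError("expected a 3- or 6-digit hex color")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def color_distance(foreground, background):
    """Euclidean distance between two colors in RGB space."""
    fg = parse_hex_color(foreground)
    bg = parse_hex_color(background)
    return sum((a - b) ** 2 for a, b in zip(fg, bg)) ** 0.5


def looks_hidden(foreground, background, threshold=10.0):
    """Flag text whose color is nearly the same as its background.

    The threshold is an arbitrary assumption, not a tuned value.
    """
    return color_distance(foreground, background) < threshold


print(looks_hidden("#FFFFFE", "#FFFFFF"))  # the example pair from the list
print(looks_hidden("#000000", "#FFFFFF"))  # ordinary black on white
```

This only covers the color case. Working out which colors actually apply to a piece of text still needs the stylesheets, scripts, and positioning to be resolved first.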
I'm sure there are more that I'm missing, but I think you (meaning everyone...I'm not just picking on the parent here...) get the idea. You pretty much have to render the page like a browser to take care of all of those, which really sucks for us search engine developers trying to fight it, and us users that have to deal with that crap.
I am clearly fatter than you.