Better Search Engines
prostoalex writes "Scientific American is seeking better Web searches. They report on all sorts of innovations happening outside the Google-Yahoo-MSN zone that the press is usually reporting on, including GPS-enhanced searches from University of Maryland, Shape Retrieval and Analysis from Princeton, musical search engine from New Zealand Digital Library Project, and some of the projects that A9 and Ask.com have been working on."
This is kinda close.
What?
Aside from the horrible name, Clusty (a clustering search engine) is very innovative and easy to use. I hope more search engines will adopt similar technology soon.
Link to clusty.com search engine
CopyScape can recognize copied content, but its only purpose is finding website plagiarism. This, however, would definitely find all the Wikipedia forks unless it's a really old copy and the page has had a major rewrite.
If Google could integrate CopyScape into their search, you would be happy.
Interesting. My first thought was that I had seen this with Vivisimo, but I guess no one could spell that, so they changed the name?
http://vivisimo.com/
But I agree, it is a great search engine and has gotten better as I have used it.
You can do this in Google: searchterm1 searchterm2 ~bogus. The tilde will look for synonyms. You can see which ones hit by reading the bold results that are neither searchterm1 nor searchterm2. I use ~howto and ~cheats often.
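To make the "read the bold results" trick concrete, here is a rough Python sketch. It assumes you have already collected the bolded words from a results page into a list by hand or otherwise; the function name synonym_hits and the sample data are made up for illustration and are not part of any Google API.

```python
def synonym_hits(query: str, bold_terms: list[str]) -> list[str]:
    """Return bolded terms that are not literal query terms.

    Anything bolded that you did not type yourself is presumably
    a synonym matched through a ~term.
    """
    # Literal query words, with the leading tilde stripped from ~terms.
    literal = {term.lstrip("~").lower() for term in query.split()}
    seen = set()
    hits = []
    for term in bold_terms:
        key = term.lower()
        # Keep the first occurrence of each non-literal bolded word.
        if key not in literal and key not in seen:
            seen.add(key)
            hits.append(term)
    return hits


# Hypothetical bolded words copied from a results page.
bolded = ["searchterm1", "tutorial", "Howto", "guide", "tutorial"]
print(synonym_hits("searchterm1 searchterm2 ~howto", bolded))
```

With that sample list, the words left over are the ones the tilde pulled in.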
Rule of the open mind
People who are resistant to change cannot resist change for the worst.
You mean like this: Google API Proximity Search?!
The hidden text problem that you mention is a surprisingly hard problem to deal with, as there are so many ways to do it.
You have:
- The <font> tag
- CSS (several ways, such as visibility: hidden, changing the colors, using the z-order, etc.), both internal and externally linked (for which the search engine must download that file while spidering)
- DHTML positioning over other elements
- A background image the same color as the text
- JavaScript to generate any of the above
- Use of nearly identical colors for all of the above (such as #FFFFFF for the background and #FFFFFE for the foreground). In fact, there could be dozens of colors that differ so slightly that a human couldn't tell them apart without looking very closely, if at all.
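The near-identical color case in the last bullet is about the only one that is easy to sketch without a renderer. Below is a minimal Python version, assuming the foreground and background colors have already been extracted as #RRGGBB strings; the function names and the threshold of 16 per channel are arbitrary choices for illustration, not a tested cutoff.

```python
def parse_hex(color: str) -> tuple[int, int, int]:
    """Convert a '#RRGGBB' string into an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    if len(color) != 6:
        raise ValueError(f"expected #RRGGBB, got {color!r}")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def looks_hidden(foreground: str, background: str, threshold: int = 16) -> bool:
    """Return True when every channel differs by less than `threshold`.

    The threshold is an assumed value; a real spam filter would need
    tuning against actual pages.
    """
    fg = parse_hex(foreground)
    bg = parse_hex(background)
    return all(abs(f - b) < threshold for f, b in zip(fg, bg))


print(looks_hidden("#FFFFFF", "#FFFFFE"))  # the example pair from the list
print(looks_hidden("#FFFFFF", "#000000"))  # ordinary black on white
```

This only compares two static colors. It does nothing for background images, positioning tricks, or anything generated by script.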
I'm sure there are more that I'm missing, but I think you (meaning everyone... I'm not just picking on the parent here) get the idea. You pretty much have to render the page like a browser to take care of all of those, which really sucks for us search engine developers trying to fight it, and for us users who have to deal with that crap.
I am clearly fatter than you.