Google VisualRank for Image Search
Google researchers are claiming that a newly developed approach to visual search may do for image searching what PageRank did for text search. "The research paper, 'PageRank for Product Image Search,' is focused on a subset of the images that the giant search engine has cataloged because of the tremendous computing costs required to analyze and compare digital images. To do this for all of the images indexed by the search engine would be impractical, the researchers said. Google does not disclose how many images it has cataloged, but it asserts that its Google Image Search is the 'most comprehensive image search on the Web.'"
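For readers wondering what "PageRank for images" could mean in practice, here is a minimal sketch, not the paper's implementation. It assumes you already have a pairwise visual-similarity matrix between images (how that similarity is computed is left out entirely), and it runs an ordinary PageRank-style power iteration over it. The function name `rank_images`, the `damping` value, and the toy matrix are all made up for illustration.

```python
def rank_images(similarity, damping=0.85, iterations=50):
    """PageRank-style power iteration over an image similarity matrix.

    similarity[i][j] is an assumed, precomputed similarity between
    image i and image j (0.0 on the diagonal).
    """
    n = len(similarity)
    # Column sums are used to normalize, so each image splits its score
    # among the images it resembles, in proportion to similarity.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = sum(
                similarity[i][j] / col_sums[j] * scores[j]
                for j in range(n)
                if col_sums[j] > 0  # images similar to nothing pass on no score
            )
            # The (1 - damping) term gives every image a small baseline score.
            new_scores.append(damping * incoming + (1 - damping) / n)
        scores = new_scores
    return scores


# Toy data: images 0-2 resemble each other, image 3 is an outlier.
toy_similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
toy_scores = rank_images(toy_similarity)
best = max(range(len(toy_scores)), key=lambda i: toy_scores[i])
worst = min(range(len(toy_scores)), key=lambda i: toy_scores[i])
```

The intuition is the same as with links: an image that many other images look like ends up with a high score, and the odd one out sinks to the bottom. The expensive part, which this sketch skips, is computing the similarities in the first place, which is presumably why the researchers limited themselves to a subset of images.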
It should be noted that a lot of the preliminary data for this was gathered through human interaction that Google set up as a game.
I am still playing with the filter-by-date dropdown URL manipulation.
Does anyone have the full name/DOI of the paper?
Which is a good point. Sometimes you don't want the text associated with the image, you want the image itself.
The canonical example would be image macros and comic strips. When you're looking for a particular LOLcat or demotivational poster, or even a specific comic strip based on a remembered punchline, the text in the image is what you want to be able to search for. The text associated with the images (that is, the HTML from whatever poster first showed it to you in some gaming forum thread or 'blog) is irrelevant.
It doesn't solve the broader problem, but it'd be a good starting point. Given that the fonts chosen for image macros and comic strips are designed for readability, standard OCR techniques would work. If machines can solve CAPTCHAs, Google can certainly index the text on images.
All of a sudden, every comic strip you ever remember reading as a kid (even if you've only got one or two pages of it) becomes searchable.
Here's an idea for Google that's been on my mind for several months. Yes, I'm giving it out for free.
Let me upload an image from my hard drive to Google and have them check it against the zillion images in their catalog. Then give me a page with all the similar copies it could find, each with a thumbnail and the URL it came from.
One practical use I can think of: Someone you meet on the web sends you a photo claiming to be of him/herself. With this Google utility, you could upload that same image and have Google tell you if it exists anywhere on the web. Then you'd know if this person just took it off a MySpace profile, etc.
Another practical use: Look for prior art or copyright violations on images someone claims are original work. Could be very useful for Wikipedia.
The potential for something like this is massive.
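The upload-and-match idea above can be roughed out with a simple perceptual hash. This is only a sketch of one well-known technique (an "average hash" compared by Hamming distance), not anything Google is known to use. It assumes each image has already been decoded, converted to grayscale, and shrunk to a small grid of pixel values; the function names, the `max_distance` threshold, and the example.com URLs are all hypothetical.

```python
def average_hash(pixels):
    """Hash a small grayscale grid: one bit per pixel, set if above the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def find_similar(query_pixels, catalog, max_distance=2):
    """Return catalog URLs whose hash is within max_distance of the query's."""
    query_hash = average_hash(query_pixels)
    matches = []
    for url, pixels in catalog.items():
        distance = hamming_distance(query_hash, average_hash(pixels))
        if distance <= max_distance:
            matches.append((distance, url))
    # Closest matches first.
    return [url for distance, url in sorted(matches)]


# Toy 4x4 "images": a checkerboard, a brightened copy, and its inverse.
original = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
brighter_copy = [[value + 20 for value in row] for row in original]
inverted = [[255 - value for value in row] for row in original]

catalog = {
    "http://example.com/copy.jpg": brighter_copy,
    "http://example.com/other.jpg": inverted,
}
results = find_similar(original, catalog)
```

Because the hash only records whether each pixel is brighter than the image's own average, a uniformly brightened or re-encoded copy hashes the same way, while an unrelated image lands far away. Cropped, rotated, or heavily edited copies would defeat something this simple, and a linear scan obviously would not scale to a zillion images.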