Google Image Index Just Not Updated
We ran a story earlier today about the lack of Abu Ghraib photos in Google's image index. We now have a response from Google stating that the image index simply hasn't been updated recently, as well as a fairly convincing demonstration from a Slashdot reader: Rahga writes "I put together a page that counters the 'Google Censors Abu Ghraib Images' story. It is the tale of a Morgan Webb picture on images.google.com that's been driving a ton of traffic to my webserver 7 months after it was removed." The Abu Ghraib story broke in April 2004 (and officially became a non-story on November 2, 2004), so Google's index is indeed quite far behind.
Like I mentioned in this post, I can vouch for this.
For the longest time, the search for my name on Google images would bring up really old images and it would never update them. So, in order to test this, I just removed those images and used a redirect (this was about 3-4 months ago) -- Google still did not update the pictures.
However, my academic page at my school did show up pretty soon, although it was created just recently. What's more, it even showed the image of my latest schedule, and not an earlier one as in the other case.
So I guess Google probably uses some kinda weird algorithm to determine which sites are likely to be dynamic, and which are not -- and update/not update them accordingly.
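To make that guess concrete, here is a minimal sketch of one way a crawler could adapt how often it revisits a page. It is purely illustrative: nothing here is known to be what Google does, and the function names, the halve/double rule, and the 1-day and 180-day bounds are all assumptions.

```python
# Hypothetical adaptive recrawl schedule; NOT a description of Google's crawler.
# Assumption: the crawler can tell whether a page changed since its last visit.

def next_interval(current_days, changed, min_days=1.0, max_days=180.0):
    """Halve the revisit interval if the page changed, double it otherwise."""
    proposed = current_days / 2 if changed else current_days * 2
    # Clamp so a page is never polled constantly nor abandoned entirely.
    return max(min_days, min(max_days, proposed))


def simulate(start_days, observations):
    """Return the interval chosen after each visit, given change observations."""
    interval = start_days
    history = []
    for changed in observations:
        interval = next_interval(interval, changed)
        history.append(interval)
    return history


# A frequently edited page (like a new schedule page) gets visited more often...
dynamic = simulate(30.0, [True, True, True, True])
# ...while a page that never changes drifts out to the maximum interval.
static = simulate(30.0, [False, False, False, False])
print(dynamic, static)
```

Under a rule like this, a page that looked static for a long time would be rechecked only every few months, which would fit the stale results described above.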
Besides, every time there's been a problem or censorship (say, due to the DMCA) -- Google has been nice enough to notify users in the search results. Not to mention the effort it would take to do something like this at Google's scale (which makes even less sense if they were the ONLY ones asked to do so).
So all in all, just a false alarm, I suppose.
This just goes to show that /. groupthink isn't always on target, and Google isn't the all-spidering oracle we think it is either.
Google's image search is not to be confused with Google's news search. If you search for Lynndie England against the news search, one of the pictures in question comes up in a thumbnail next to the first set of results. Google had plenty of coverage of the Abu Ghraib story on its news pages, and its web search also has plenty of coverage of the topic. If Google was intentionally censoring, you'd think they would have tagged all their search engines in the process.
For Google to be six months or more behind on reindexing their image storage seems about right to me. The link rot on the image search is starting to get annoying, but we've seen worse from the likes of AltaVista in the past. Webcrawling seems simple, but it's a very bandwidth-intensive process, and that means it costs money. Image spidering is even more expensive because pictures take up a whole lot more bitspace than HTML docs.
So, move that Slashdot story from earlier today from the Censorship category to the Almighty Buck category. That's the real reason why the pictures weren't there.
Seriously why does this need a new story? What was wrong with the update posted to the previous article summary?
Because in journalism there's a tradition of printing retractions for mistakes made on page A1 on a future page A1 in order to give the takeback as much exposure as the mistake. Slashdot leveled a rather serious charge of censorship against Google that quickly was proven not to be true.
Furthermore, there's a new piece of news coming out of this mess: Google's being quite slow on the refresh of the image search database.
Anyone have any ideas why they would be updating their image index so infrequently? Could it be because of the size of the files they are dealing with?
It is a fairly minimalist search engine that searches Google, Yahoo, Ask Jeeves, About, LookSmart, Overture and FindWhat. I tried it a few times and find it occasionally returns a few more useful results than Google, and doesn't have an annoying clutter of ads.
(I suppose if it did I wouldn't know, I have Mozilla configured to block even Flash ads, and my firewall is configured to route most known ad servers to 127.0.0.1)
My rights don't need management.
And here we were, expecting Google to deliver us the latest in free pr0n images and thumbnails, and it's been shafting us with old crap the entire time!
The sky is falling!
The sky is falling!
Oh wait.
Nevermind.
(and officially became a non-story on November 2, 2004)
maybe the mass media isn't covering the prison over there in the sandy beach, but it's not all quiet, and definitely deserves the attention of those not deployed over there.
americans are still dying every day in that prison (which is controlled by the americans). american troops are deployed in and around that prison sometimes for months at a time with no productive mission other than to be deployed so a general or such can get another stripe on their shirt. this is what our tax dollars are being used for.
there's units that have their own cooks but can't use them due to contracts with another food supply "company". what are these cooks doing? not a damn thing. there's people who are budgeted for a year's deployment, but have replacements already there. what happens to these troops? they get re-deployed to another closer area. these aren't the full time troops either, these are the reservists who are being forced to sit on their arse in the desert.
by the way, there's policy in abu ghraib now that photos MUST have faces digitally distorted. meaning if a soldier takes a photo of someone whose leg has been blown off, make sure there's no face in the picture. i'm not even sure if they're allowed to send photos out w/o permission these days.
sign up folks, it's in the name of democracy after all.
I am pretty happy with the outcome of this story. Good on Google for answering the allegations, even though they had to reveal some disparaging facts about their image search by doing so.
They also consider the text of links that point to a particular page. The search terms don't need to appear on the page.
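A toy sketch of that idea, for anyone who hasn't seen it: index each link target under the words of the anchor text pointing at it, so a target can match a query whose words never appear on the target page itself. The URLs, the sample pages, and the `build_anchor_index` / `search` helpers are made up for illustration, and this says nothing about how Google actually weights anchor text.

```python
from collections import defaultdict
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs from <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def build_anchor_index(pages):
    """Map each anchor-text word to the set of URLs that links point to."""
    index = defaultdict(set)
    for html in pages:
        collector = AnchorCollector()
        collector.feed(html)
        collector.close()
        for href, text in collector.links:
            for word in text.lower().split():
                index[word].add(href)
    return index


def search(index, query):
    """Return URLs whose incoming anchor text contains every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical linking pages; the target page's own text is never consulted.
linking_pages = [
    '<p>See these <a href="http://example.org/target">shiny widgets</a>.</p>',
    '<p>More <a href="http://example.org/target">Shiny Widgets</a> and '
    '<a href="http://example.org/other">dull gadgets</a>.</p>',
]
index = build_anchor_index(linking_pages)
print(search(index, "shiny widgets"))
```

The target here ranks for "shiny widgets" purely because of what other pages call it.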
Karma: -2147483648 (Mostly affected by integer overflow)
/. is always so quick to jump on anything that screams vast right wing conspiracy... and this time they got egg on their face. GOOD.
If you do a google image search for "www.google.com", one of the first results you get is an image of Alyson Hannigan. That image resides on my server.
I haven't the foggiest idea how that image got associated with the string "www.google.com", nor why it would be ranked so high. I haven't linked to that image directly in over a year, and only on a page that Google shouldn't be trawling for images anyhow.
BTW, a good 70% of the traffic to my server is people looking for that image.
Since the editors seem to have momentarily forgotten:
The Abu Ghraib story broke in April 2004 (and officially became a non-story on November 2, 2004)
To simpletons in the American electorate, that might be true. But, if anything, Nov 2nd made the story much more relevant to about a billion Muslims who view it as proof positive that the current US government may talk a good story, but where it counts, in real life, their actions are a whole lot different.
When information is power, privacy is freedom.
This just goes to show that /. groupthink isn't always on target,
Actually, just the opposite. An inaccurate story was posted, and it was torn apart by the comments. The hive-mind that is Slashdot performed quite well, IMHO.
It originally started with Google, but I sent a message requesting they remove them, and I'll be damned if they didn't graciously comply! Now Google no longer had a record of those images, but Yahoo must have taken a copy of their archives when those two severed ties, because I saw references from Yahoo for things like "bigass.jpg" and "passedout.jpg". Imagine my joy... I was getting 404's out the bigass.jpg, and Yahoo wouldn't listen to me to take me out of their image index... Now, after several more months (and several dirty tricks), I no longer am included in Yahoo's index.
Does it stop there? No. Someone, somewhere along the way got a copy of those image thumbs out to every two-bit search engine wannabe. To this day I still field 404's for stuff that I know had only been searched and indexed by Google, but has since found its way via 3rd party routes into corners of the web I cannot begin to fully comprehend. *sigh* It's like a gnat buzzing around my head... It's not hurting anything, I guess... but it's still annoying.
These days, I put the robots meta tag with content="NOARCHIVE" on every web page I serve. It's not that I don't want visitors. I could deny them with a robots.txt exclusion to that end. I just feel that search engines still lack the ability to capture the nuance of what it is I do... And these days, it has nothing to do with bigass.jpg or images of drunks passing out.
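For anyone wanting to check the two mechanisms mentioned here, this is a rough sketch using only the Python standard library. It assumes the tag takes the usual `<meta name="robots" content="NOARCHIVE">` form; the sample page, the robots.txt rules, the `ExampleBot` user agent, and the `robots_meta_directives` helper are all invented for illustration. Note that both mechanisms only work if the crawler chooses to honor them.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def robots_meta_directives(html):
    """Return the lowercased robots meta directives found in a page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical page: may be indexed, but asks engines not to keep a cached copy.
PAGE = ('<html><head><meta name="robots" content="NOARCHIVE"></head>'
        '<body>hi</body></html>')

# Hypothetical robots.txt: asks all crawlers to stay out of /images/ entirely.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

directives = robots_meta_directives(PAGE)
can_fetch_page = rules.can_fetch("ExampleBot", "http://example.com/index.html")
can_fetch_image = rules.can_fetch("ExampleBot",
                                  "http://example.com/images/bigass.jpg")
print(directives, can_fetch_page, can_fetch_image)
```

The difference in intent: the meta tag still lets the page be found through search, while the robots.txt rule asks crawlers not to fetch the files at all.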
(Not that those aren't fun things...)
Search for "litigious bastards".
The top result is SCO. Do you REALLY think they would have that in text anywhere on their site?
"The Abu Ghraib story broke in April 2004 (and officially became a non-story on November 2, 2004)"
With White House counsel Alberto Gonzales--a figure central to the internal discussion of 'when is it not torture' at the White House--on a very short list of Supreme Court nominees, this issue may very well flare up again sooner rather than later.
Mmmmmm... Bold, yet refreshing!