Google Redesigns Image Search, Raises Copyright and Hosting Concerns
An anonymous reader writes "Google has recently announced changes to its image search. The search provides larger views of the images with direct links to the full-sized source image. Although this new layout is being praised by users for its intuitiveness, it has raised concerns amongst image copyright holders and webmasters. Large images can now easily be seen and downloaded directly from the Google image search results without sending visitors to the hosting website. Webmasters have expressed concerns about a decrease in traffic and an increase in bandwidth usage since this change was rolled out. Some have set up a petition requesting Google remove the direct links to the images."
More people are being linked directly to the high-resolution image, but fewer people are actually visiting the website. This isn't really that confusing.
In fact, it causes reduced bandwidth usage because you don't have to download some stupid ad-filled (and possibly malware-infested) web page that you don't want to see, the way the old image search did.
If they don't like it, block any requests with a Google referrer string.
Some websites use an annoying script that redirects people when they click an image.
It's called hot linking or leeching and it has been a headache forever. You want to show content + ads but your server is used just to pull an image, thus no traffic and high bandwidth.
Fighting the good fight:
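RewriteEngine On
# Skip requests with an empty Referer, and requests from our own pages
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?cyberciti\.biz/.*$ [NC]
# Everything else asking for an image gets 403 Forbidden
RewriteRule ^.*\.(bmp|tif|gif|jpg|jpeg|jpe|png)$ - [F,NC]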
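For anyone who doesn't read mod_rewrite, here is a rough Python sketch of the decision those rules make. It is not how Apache evaluates them internally, just the same logic spelled out. The function name is_forbidden is made up for illustration, and the Python pattern escapes the dot in the domain and accepts https as well as http.

```python
import re

# Referer values that count as "our own site" (domain taken from the rules above).
OWN_SITE = re.compile(r"^https?://(www\.)?cyberciti\.biz/.*$", re.IGNORECASE)
# Request paths that end in one of the listed image extensions.
IMAGE_PATH = re.compile(r"^.*\.(bmp|tif|gif|jpg|jpeg|jpe|png)$")


def is_forbidden(referer: str, path: str) -> bool:
    """Return True when the hotlink rules would answer 403 Forbidden.

    is_forbidden is a hypothetical helper, not part of Apache.
    """
    has_referer = referer != ""                    # first RewriteCond: Referer is not empty
    off_site = OWN_SITE.match(referer) is None     # second RewriteCond: Referer is not our site
    is_image = IMAGE_PATH.match(path) is not None  # RewriteRule pattern: path is an image
    return has_referer and off_site and is_image
```

Note that a request with no Referer at all still gets the image, which is why this only stops casual hotlinking.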
Took me 5 seconds https://www.google.ca/#hl=en&tbo=d&spell=1&q=robots+.txt+for+images&sa=X&ei=FJYRUeytEIeGiQLemYGIDg&ved=0CCsQvwUoAA&bav=on.2,or.r_gc.r_pw.r_cp.r_qf.&bvm=bv.41934586,d.cGE&fp=7c0022b148dcff04&biw=1680&bih=860 with the results http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35308
How about a small effort from the site owners?
by TheSpoom (715771) Uncaring Linux user here. I have nothing to add to this but please continue. *munches popcorn*
If you even read the summary, let alone TFA you'll see:
"The search provides larger views of the images with direct links to the full-sized source image."
Yes, I did read TFA. And nowhere does it explain how you can have decreased traffic but increased bandwidth usage. Because it's not possible. Decreased traffic = decreased bandwidth usage.
Here's the real problem (quote from TFA):
When people get the full resolution image, they have no reason to click to go to the URL.
Dear "Webmaster", nobody cares about your shitty website packed full of annoying ads. Get over it already.
Lots of sites put hi-rez images on file, and link to them via a thumbnail.
The majority of visitors don't request the hi-rez images, at least not all of them.
But posting a link to a high-rez image can get your bandwidth slammed, serving images, but nobody requesting the web pages. Especially if it's porn, or happens to hit the search topic of the moment. Without the ability to serve ads, these websites make no money.
Of course, if the complainers had an actual clue, they could just put a robots.txt file in their image storage, which Google seems to honor.
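A quick way to check what such a robots.txt would tell a well-behaved crawler is Python's standard urllib.robotparser. This is only a sketch: the /fullsize/ and /thumbs/ directories are made-up examples, and it shows what the file asks for, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Google's image crawler out of the
# full-size directory, leave everything else open to everyone.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /fullsize/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, url: str) -> bool:
    """Return True if the rules above permit `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


print(allowed("Googlebot-Image", "/fullsize/cat.jpg"))
print(allowed("Googlebot-Image", "/thumbs/cat.jpg"))
```

Blocking only the full-size directory keeps the thumbnails and pages in the index, so the site can still be found.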
Sig Battery depleted. Reverting to safe mode.
It isn't as obvious as you make it sound. Scenario 1: Google links to your page. People who want your image click through, your server throws them the whole page plus the high resolution image. Scenario 2: Google links only to your image. People who want your image download just that, your server sends them just that. All else being the same, scenario 2 is less bandwidth, not more, because you'd be serving the same image either way, but in one case with and in the other case without all the other stuff on the page as well. It's entirely possible for it to add up to more, but this depends on how the new search affects people's usage of the results- it requires that more people actually click to view the full-resolution image as a result of the changes. That's a likely, but not necessary outcome.
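To put rough numbers on the two scenarios, here is a small sketch. The sizes are invented for illustration (300 KB of page overhead, a 2000 KB full-resolution image); plug in your own.

```python
# Assumed sizes, for illustration only.
PAGE_KB = 300     # HTML, CSS, scripts, ads, thumbnails
IMAGE_KB = 2000   # the full-resolution image itself


def page_link_kb(viewers: int) -> int:
    """Scenario 1: every viewer loads the page and the full image."""
    return viewers * (PAGE_KB + IMAGE_KB)


def direct_link_kb(viewers: int) -> int:
    """Scenario 2: every viewer loads only the full image."""
    return viewers * IMAGE_KB


def break_even_ratio() -> float:
    """How many times more viewers scenario 2 needs before it costs more."""
    return (PAGE_KB + IMAGE_KB) / IMAGE_KB


print(page_link_kb(1000), direct_link_kb(1000), break_even_ratio())
```

With these made-up sizes, direct links only cost more bandwidth once the number of people fetching the full image grows by more than about 15%. The bigger the image is relative to the page, the smaller that margin gets.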
IIRC, JPEG images allow header data that includes copyright info. If you don't care about use of the image, leave it blank. If you do, insert the copyright info. Google's bot can look for copyright data and if it finds it, it can link to the original HTML page. Otherwise, it can give a link for a direct download.
I think there was something on /. a while back that talked about some system for the owner to indicate how an image could be used, e.g. commercial, non-commercial, free and so on. Couldn't find it on a quick search, but that might be another option to tell Google how to handle an image.
I can mend the break of day, heal a broken heart, and provide temporary relief to nymphomaniacs.
If webmasters don't want people "stealing" photos without viewing directly on their website, they are more than welcome to instruct their web servers to not display images to freeloaders. Look at the referer header, if the request didn't originate from your site, then don't serve it.
I went to eat some animal crackers and the box said, "Do not eat if seal is broken." I opened the box and sure enough..
> In fact, it causes reduced bandwidth usage because you don't have to download some stupid ad-filled (and possibly malware-infested) web page that you don't want to see, the way the old image search did.
> If they don't like it, block any requests with a Google referrer string.
This has been answered in the branch above. You can easily exceed your hosted bandwidth quota (with zero ad-generated revenue) by having a high-rez photo from your site pop up in a Google image search, especially when something you have on file becomes the topic of a high number of searches.
That holds even if you don't normally serve that photo on your web pages, and simply provide a button or thumbnail to click for the small percentage of viewers who want to see the high-res version.
Most visitors don't click the high-rez button or thumbnail. The few that do, don't matter. Until Google indexes it, then all bets are off.
Some (failed) web designers only put the high-rez image in, then shrink it into a box via the HTML IMG tag. (Then they wonder why people complain that their site loads slowly.) These guys would see very little difference in this case, unless of course Google sees a surge of searches that just happen to find their Natalie Portman collection.
Sig Battery depleted. Reverting to safe mode.
# cd
# cat - > robots.txt
User-agent: *
Disallow: /
<Ctrl-D>
#
Problem solved!
Karma: Bad
If you're running a website with Apache, you can configure Apache to look at the HTTP_REFERER header and see where the web surfer was when they made the request for the image. If they weren't on your website, (or if they don't provide the header, an act to be widely discouraged), just re-direct them to your home page instead of serving the image.
I would think that other web servers could do the same thing, one way or another.
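Outside Apache, the same policy is a few lines in whatever sits in front of your images. A minimal Python sketch, assuming a made-up site at example.com and a hypothetical handler name; it returns a status code and redirect target rather than talking to a real web server:

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

# Hypothetical host names for "your website".
OWN_HOSTS = {"example.com", "www.example.com"}


def handle_image_request(referer: Optional[str]) -> Tuple[int, Optional[str]]:
    """Serve the image only when the Referer is one of our own pages.

    Returns (200, None) to serve the image, or (302, "/") to send the
    visitor to the home page when the Referer is missing or off-site.
    """
    host = urlparse(referer).hostname if referer else None
    if host in OWN_HOSTS:
        return (200, None)
    return (302, "/")
```

Unlike a rule that lets empty Referers through, this version also redirects visitors whose browsers or proxies strip the header, so expect some false positives.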
For most people, it costs money -- perhaps not a huge amount, but still, real money -- to put up a website and serve content to the world. The expectation, if not agreement, is that you'll look at the site's content on the site.
The webmaster's position is no more hostile than that of the deep miner: There are expectations, but no promises.
Google's search goes far beyond fair use, as far as I'm concerned.
I've fallen off your lawn, and I can't get up.
You used to get traffic actually visiting your site. That meant full page loads, but a lot of that is text which is low bandwidth. You now have less traffic (unique IPs hitting your site), but they're JUST downloading hi-res images which leads to a net increase in bandwidth.
Also, ads don't have to be shitty and annoying. Slashdot uses ads, and even though I can turn them off, I don't, because they're relatively passive. Hosting and bandwidth cost money, and a lot of sites rely on small ad revenue to help offset those costs.
"Always forgive your enemies; nothing annoys them so much." - Oscar Wilde
Because Google goes directly to the full-sized image, not the thumbnail on the web page. Grabbing the image directly creates no ad impressions, so the bandwidth burned per impression shoots up.
You can opt out.
Google always offered links directly to the original image, though it did load the actual site in the background. And you've always been able to prevent the direct image links by referer control.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
> Google does NOT behave itself.
It's also a bit dumb. It's been playing my webserver at a variant of reversi for the last 12 months (one of the links at the end of each game is to start a new game, which it duly follows...)
Also FatPhil on SoylentNews, id 863
What's going on is fairly obvious if you read the article linked in the sentence "Webmasters have expressed concerns about a decrease in traffic and an increase in bandwidth usage since this change was rolled out."
The article says nothing about an increase in bandwidth usage. The anonymous reader who submitted the article obviously just made that part up, as anonymous people on /. do, without regard for whether it made sense or accurately reflected the link being given.
On one hand, I think the site owners deserve the traffic. On the other hand, it seems like at least a quarter of the pages end up being dead when I click on them, or redirect to sites attempting to install malware on old versions of Firefox, or seemingly have nothing whatsoever to do with the image that's supposedly there.
A compromise might be to allow users to open the referring page in context immediately, open the cached page (with live content) after a 2-second delay, and allow users to grab the full-sized image directly from Google's cache after a 10-second CAPTCHA-guarded delay. Then, users would have every incentive to try viewing the page in context, falling back to the cached page if the original page ends up being down/borked/whatever, and being able to grab the cached image if all else fails.
Going a step further, Google could come up with some free digital watermarking scheme that allows a 48-bit (give or take) payload to be encoded into the image at a user-selected strength (allowing him to balance robustness, file size, and visibility... pick any two of the three).
The upper few bits (let's say, 4) would indicate the version. Initially, it would be 0001.
The next 40 (give or take) bits would be globally-unique, and allow somebody who knows the value to obtain meta info about you in a sensible manner. If they're all 0, it means you're using a generic permissions watermark that doesn't identify ownership, but simply restricts use.
The lower 4 bits specify explicit restrictions:
* do not contextually-index
* do not cache full-sized image
* do not perform face recognition of any kind
* do not index for similarity to other images
A value of "0000" would allow search engines to index the image, unless you restricted them in some industry-standard way via metadata referenced to your unique id. For the generic value with all 0s, 0000 means "go ahead and index this".
A value of "1111" would indicate that the image, when encoded with a 4-bit watermark, should not be indexed in any way, shape, or form, regardless of future extensions to the standard that might define additional permissions, and regardless of what any indirectly-referenced meta-info might or might not say. Let's call this the "Stop Facebook from Permissions Creep in a GPLv3-like manner" anti-permission.
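The payload layout described above (4 version bits, roughly 40 ID bits, 4 restriction bits) can be sketched in a few lines of Python. This covers only packing and unpacking the 48-bit number, not hiding it in the pixels. Which bit stands for which restriction is an assumption made here; the proposal only lists the four restrictions, and all names below are hypothetical.

```python
from enum import IntFlag


class Restriction(IntFlag):
    """Lower 4 bits. The bit-to-restriction mapping is an assumption."""
    NO_CONTEXT_INDEX = 0b0001
    NO_FULL_CACHE = 0b0010
    NO_FACE_RECOGNITION = 0b0100
    NO_SIMILARITY_INDEX = 0b1000


VERSION_BITS, ID_BITS, FLAG_BITS = 4, 40, 4


def pack(version: int, owner_id: int, flags: int) -> int:
    """Build the 48-bit payload: version | owner id | restriction flags."""
    if not 0 <= version < (1 << VERSION_BITS):
        raise ValueError("version must fit in 4 bits")
    if not 0 <= owner_id < (1 << ID_BITS):
        raise ValueError("owner_id must fit in 40 bits")
    if not 0 <= int(flags) < (1 << FLAG_BITS):
        raise ValueError("flags must fit in 4 bits")
    return (version << (ID_BITS + FLAG_BITS)) | (owner_id << FLAG_BITS) | int(flags)


def unpack(payload: int):
    """Split a payload back into (version, owner_id, flags)."""
    flags = Restriction(payload & ((1 << FLAG_BITS) - 1))
    owner_id = (payload >> FLAG_BITS) & ((1 << ID_BITS) - 1)
    version = payload >> (ID_BITS + FLAG_BITS)
    return version, owner_id, flags


# Generic "do not index in any way" watermark: version 0001, ID all zeros, flags 1111.
never_index = pack(1, 0, 0b1111)
```

An owner_id of 0 with flags 0000 is the generic "go ahead and index this" value from the description.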
I think that if I was a photographer, I would be OK with Google caching full quality images as long as they put their own annoying watermark all over it with the URL where the image came from clearly visible.
No sig. Move along - nothing to see here.
Retard.. things are copywritten automatically when they are published. What you just suggested is - never publish anything online.
Your post is covered by copyright
Linux - copyright
slashdot's html - copyright
Do you know what copyright is?
Why is slashdot filled with retards these days.
wow. "Retard.. things are copywritten.." ..."Do you know what a copyright is"
When I try to type copywritten, it gets a red underline. My PC doesn't know what copywrite is.
If someone clicks the Google Image Search 'high-resolution' link for one of my photos from Flickr, they get a medium-resolution version with no description, attribution or copyright information. (Example search page here.)
If they go to the ad-free Flickr page, they get links to much higher resolution versions, associated images and also get informed that it's under a super-open Creative Commons Attribution licence.
Tedious Bloggy Stuff - hooray?
What's "Bing"?
So, use robots.txt to remove yourself from their search listings. Problem solved.
Really the same issue webmasters had with deep-linking, where Google sends the searcher straight to the page they wanted without having to wade through the front end of the website. And yes, the same mitigation techniques, such as robots.txt and referrer blocking, apply, with the same drawback: searchers won't bother with a site that makes things more difficult for them.
Yes, and the folks on slashdot are really big on opt-out instead of opt-in... ..oh wait.. no they fucking aren't. The folks on slashdot fucking hate opt-out, and rightly fucking so.
Posting your content on a publicly accessible URL IS opt-in.
I'd go further than this, honestly, I'm sick of people whining about this sort of thing.
The internet was created for one purpose - information sharing, if you don't want your information shared then get it off the web, otherwise don't cry when it is shared.
Yes that may mean there's a cost to you, in terms of hosting, but that's part of what the web spirit always was - that people share information for free at their time and expense, or as part of their employment (i.e. academics sharing data).
I'm sick of these people who believe they have a god given right to make money from the web and deserve legal protection as such. I'm not saying you shouldn't be able to make money, but making money should be up to you to figure out without expecting the whole of the purpose and intent of the web and its design to revolve around what you want.
Booohooo, people can link to content on your site. Get over it, that's how it was designed, that's how it was meant to be, don't like it? Then stick your content behind some passworded paywall or whatever, if it's on the public web it should be fair game, that's the whole point of it. It's the same as the newspapers whinging about Google quoting and linking their content - again, Google is doing nothing wrong, it's using the web EXACTLY as it was intended, if they don't like it they should get off the web and see how that suits them.
> Retard.. things are copywritten (sic) automatically when they are published.
> Why is slashdot filled with retards these days.
I think you got your answer.