Google Redesigns Image Search, Raises Copyright and Hosting Concerns
An anonymous reader writes "Google has recently announced changes to its image search. The search provides larger views of the images with direct links to the full-sized source image. Although this new layout is being praised by users for its intuitiveness, it has raised concerns amongst image copyright holders and webmasters. Large images can now easily be seen and downloaded directly from the Google image search results without sending visitors to the hosting website. Webmasters have expressed concerns about a decrease in traffic and an increase in bandwidth usage since this change was rolled out. Some have set up a petition requesting Google remove the direct links to the images."
Webmasters have expressed concerns about ... an increase in bandwidth usage
Google gets the image from the originating website, or I go there and get it myself. Either way, somebody (me or Google) has to go to the website to get the image. How does this cause increased bandwidth usage?
More people being linked directly to the high resolution image, but fewer people actually visiting the website. This isn't really that confusing.
It looks and works great! Now they just need to fix the SafeSearch bug so I don't have to use Bing Images instead (which, as Microsoft as it is, even gives explicit suggestions when its safe setting is off).
You can hold down the "B" button for continuous firing.
If you even read the summary, let alone TFA you'll see:
"The search provides larger views of the images with direct links to the full-sized source image."
"Always forgive your enemies; nothing annoys them so much." - Oscar Wilde
Some websites use an annoying script that redirects people when they click an image.
It's called hot linking or leeching and it has been a headache forever. You want to show content + ads but your server is used just to pull an image, thus no traffic and high bandwidth.
Fighting the good fight:
RewriteEngine On
# Serve images only when the referer is empty or is our own site
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?cyberciti\.biz/ [NC]
RewriteRule \.(bmp|tif|gif|jpg|jpeg|jpe|png)$ - [F,NC]
Took me 5 seconds https://www.google.ca/#hl=en&tbo=d&spell=1&q=robots+.txt+for+images&sa=X&ei=FJYRUeytEIeGiQLemYGIDg&ved=0CCsQvwUoAA&bav=on.2,or.r_gc.r_pw.r_cp.r_qf.&bvm=bv.41934586,d.cGE&fp=7c0022b148dcff04&biw=1680&bih=860 with the results http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35308
How about a small effort from the site owners?
by TheSpoom (715771) Uncaring Linux user here. I have nothing to add to this but please continue. *munches popcorn*
I think it still exists?
If you even read the summary, let alone TFA you'll see:
"The search provides larger views of the images with direct links to the full-sized source image."
Yes, I did read TFA. And nowhere does it explain how you can have decreased traffic but increased bandwidth usage. Because it's not possible. Decreased traffic = decreased bandwidth usage.
Here's the real problem (quote from TFA):
When people get the full resolution image, they have no reason to click to go to the URL.
Dear "Webmaster", nobody cares about your shitty website packed full of annoying ads. Get over it already.
Lots of sites put hi-res images on file, and link to them via a thumbnail.
The majority of visitors don't request the hi-res images, at least not all of them.
But posting a link to a hi-res image can get your bandwidth slammed, serving images, but nobody requesting the web pages. Especially if it's porn, or happens to hit the search topic of the moment. Without the ability to serve ads, these websites make no money.
Of course, if the complainers had an actual clue, they could just put a robots.txt file in their image storage, which Google seems to honor.
Sig Battery depleted. Reverting to safe mode.
Really?
Less people visiting the pages = less traffic
Browsers only pulling images from the pages = Increase in bandwidth
Wrong.
Browsers only pulling images use less bandwidth than browsers pulling the entire page.
It isn't as obvious as you make it sound. Scenario 1: Google links to your page. People who want your image click through, your server throws them the whole page plus the high resolution image. Scenario 2: Google links only to your image. People who want your image download just that, your server sends them just that. All else being the same, scenario 2 is less bandwidth, not more, because you'd be serving the same image either way, but in one case with and in the other case without all the other stuff on the page as well. It's entirely possible for it to add up to more, but this depends on how the new search affects people's usage of the results: it requires that more people actually click to view the full-resolution image as a result of the changes. That's a likely, but not necessary, outcome.
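To put rough numbers on the two scenarios, here is a back-of-the-envelope sketch in Python. The byte sizes and visitor counts are made-up assumptions for illustration, not measurements from any real site or from TFA.

```python
# Hypothetical sizes; none of these figures come from the article.
PAGE_BYTES = 300_000       # HTML, CSS, scripts, ads (assumed)
IMAGE_BYTES = 2_000_000    # one full-resolution image (assumed)


def bytes_via_page(visitors: int) -> int:
    """Scenario 1: every visitor loads the page plus the full image."""
    return visitors * (PAGE_BYTES + IMAGE_BYTES)


def bytes_via_direct_link(downloads: int) -> int:
    """Scenario 2: every visitor fetches only the image file."""
    return downloads * IMAGE_BYTES


def break_even_downloads(visitors: int) -> float:
    """Direct downloads at which scenario 2 costs as much as scenario 1."""
    return bytes_via_page(visitors) / IMAGE_BYTES


if __name__ == "__main__":
    print(bytes_via_page(1000))         # 1000 people, via the page
    print(bytes_via_direct_link(1000))  # the same 1000 people, image only
    print(break_even_downloads(1000))   # direct downloads needed to match
```

With these assumed sizes, direct links only cost more bandwidth once the new layout produces over 15% more full-size downloads than the old one did page visits. Different page and image sizes move that threshold, so plug in your own.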
IIRC, jpeg images allow header data that includes copyright info. If you don't care about use of the image, leave it blank. If you do, insert the copyright info. Google's bot can look for copyright data and if it finds it, it can link to the original html page. Otherwise, it can give a link for a direct download.
I think there was something on /. awhile back that talked about some system for the owner to indicate how an image could be used, e.g. commercial, non-commercial, free and so on. Couldn't find it on a quick search, but that might be another option to tell Google how to handle an image.
I can mend the break of day, heal a broken heart, and provide temporary relief to nymphomaniacs.
If webmasters don't want people "stealing" photos without viewing directly on their website, they are more than welcome to instruct their web servers to not display images to freeloaders. Look at the referer header, if the request didn't originate from your site, then don't serve it.
I went to eat some animal crackers and the box said, "Do not eat if seal is broken." I opened the box and sure enough..
# cd
# cat - > robots.txt
User-agent: *
Disallow: /
<ctrl-D>
#
Problem solved!
Karma: Bad
This benefits my ecommerce site. All the images are watermarked and display our products. The more viral ones show sexy women showing off the product. Those rank at the very top for related key words. This uses up extra bandwidth that I pay for, but it's great for me, since I WANT to share these photos and get them out there.
If you're running a website with Apache, you can configure Apache to look at the HTTP_REFERER header and see where the web surfer was when they made the request for the image. If they weren't on your website, (or if they don't provide the header, an act to be widely discouraged), just re-direct them to your home page instead of serving the image.
I would think that other web servers could do the same thing, one way or another.
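For servers that aren't Apache, the same referer check can be sketched in a few lines of Python. This is only an illustration of the logic, not a drop-in for any particular framework; the host names and the `should_serve_image` helper are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical host names; substitute your own site's.
OWN_HOSTS = {"example.com", "www.example.com"}


def should_serve_image(referer, allow_empty=True):
    """Return True if the image request appears to come from our own pages."""
    if not referer:
        # Some browsers and proxies strip the header entirely,
        # so the caller decides how to treat a missing referer.
        return allow_empty
    host = urlparse(referer).hostname or ""
    return host.lower() in OWN_HOSTS


if __name__ == "__main__":
    print(should_serve_image("http://www.example.com/gallery.html"))
    print(should_serve_image("https://www.google.com/imgres?q=kittens"))
    print(should_serve_image(None, allow_empty=False))
```

Keep in mind the Referer header is supplied by the client and is trivial to forge, so this discourages casual hotlinking rather than preventing determined downloads.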
For most people, it costs money -- perhaps not a huge amount, but still, real money -- to put up a website and serve content to the world. The expectation, if not agreement, is that you'll look at the site's content on the site.
The webmaster's position is no more hostile than that of the deep miner: There are expectations, but no promises.
Google's search goes far beyond fair use, as far as I'm concerned.
I've fallen off your lawn, and I can't get up.
You used to get traffic actually visiting your site. That meant full page loads, but a lot of that is text which is low bandwidth. You now have less traffic (unique IPs hitting your site), but they're JUST downloading hi-res images which leads to a net increase in bandwidth.
Also, ads don't have to be shitty and annoying. Slashdot uses ads, and even though I can I don't turn them off because they're relatively passive. Hosting and bandwidth cost money, and a lot of sites rely on small ad revenue to help offset those costs.
"Always forgive your enemies; nothing annoys them so much." - Oscar Wilde
Ah, point taken.
Then why do you like the image so much? It is, after all, part of their 'shitty' website.
Although it took me a while to get used to, I sort of miss the old way. For a lot of image searches, I'd get the image and see the thumbnail of the website behind it. Often the website looked interesting enough (and related to my search) that I then went to the website directly. I discovered quite a few nice sites and blogs that way...
Now, I just get the picture with no real reference to where it came from. Sure, there's a link to the page but it's text and gives no indication what the site is about. There's far less incentive for me to actually visit the hosting site.
So I miss out on potentially interesting sites and the hosts miss out on useful traffic. Lose/Lose either way.
It isn't even just a button. Do your search, then sit back and tap the cursor keys on your keyboard, and you'll zip though tons of images in no time.
I've fallen off your lawn, and I can't get up.
As a user, I like the convenience but the last thing I want is for all kinds of legal disputes and possible regulations as chances are they'll overreach in banning what Google and other search engines are allowed to do, and we'll end up with less than we had before Google pushed it like this. "Don't be evil", and at least allow sites to opt out.
and all it did was send requests to google and re-display them without ads or with different ads, then google would be the one complaining.
This is why this is a non-issue. Any admin worth their salt can disable hotlinking. This just means an increase in hotlink-disabled sites.
Mark Anthony Collins
if you run a website you know damn well that having google put full res image download link will massively increase your bandwidth usage with absolutely 0 increase in traffic.
You have the option not to appear in the search listings. Perhaps try that, that will reduce people stealing your bandwidth.
Agree with everything this AC is saying. Additionally, the only real non-aesthetic difference is that Google doesn't simultaneously load the page in the background, unscrollable under a semi-transparent layer. That counted as a pageview and was chargeable to any advertisers on the page, but the page was pretty much unviewable and unusable - so users were not genuinely consuming content nor advertising. This would have been frustrating for advertisers as they'd still be paying for this pagecount, and frustrating for website owners as a full page of assets was being downloaded without being usable, wasting their bandwidth. The new design improves *everything* for *every* party. It's not at all a perfect solution, but it's definitely not a step to be complaining about. The only solution that immediately comes to mind is that pressing the "full size" button (or whatever it's now labelled) could open the fullsize image in a new tab while opening the full page in the current tab.
Google always offered links directly to the original image, though it did load the actual site in the background. And you've always been able to prevent the direct image links by referer control.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Can google show a link with summary to a news article? Can they just show the entire article?
Can google show a link with summary to an image (i.e. thumbnail)? Can they just show the entire image?
I cannot imagine any reasonable person would differentiate the two situations. The content the Google user is actually looking for is the high-res image itself (my assumption based on my own personal decision process that leads me to visit images.google.com). As soon as you start serving up the full content, you're appropriating it.
You have the option not to appear in the search listings. Perhaps try that, that will reduce people stealing your bandwidth.
So we can choose to pay for the privilege of providing content for Google image search so that Google can grow fat off of ad revenue, or we can choose not to be discoverable by the internet-using public? That's like saying that an ISP can always choose not to be connected to the network backbone if they don't like the terms being offered by the monopoly that owns it.
What's going on is fairly obvious if you read the article linked in the sentence "Webmasters have expressed concerns about a decrease in traffic and an increase in bandwidth usage since this change was rolled out."
The article says nothing about an increase in bandwidth usage. The anonymous reader who submitted the article obviously just made that part up, as anonymous people on /. do, without regard for whether it made sense or accurately reflected the link being given.
On one hand, I think the site owners deserve the traffic. On the other hand, it seems like at least a quarter of the pages end up being dead when I click on them, or redirect to sites attempting to install malware on old versions of Firefox, or seemingly have nothing whatsoever to do with the image that's supposedly there.
A compromise might be to allow users to open the referring page in context immediately, open the cached page (with live content) after a 2-second delay, and allow users to grab the full-sized image directly from Google's cache after a 10-second CAPTCHA-guarded delay. Then, users would have every incentive to try viewing the page in context, falling back to the cached page if the original page ends up being down/borked/whatever, and being able to grab the cached image if all else fails.
Going a step further, Google could come up with some free digital watermarking scheme that allows a 48-bit (give or take) payload to be encoded into the image at a user-selected strength (allowing him to balance robustness, file size, and visibility... pick any two of the three).
The upper few bits (let's say, 4) would indicate the version. Initially, it would be 0001.
The next 40 (give or take) bits would be globally-unique, and allow somebody who knows the value to obtain meta info about you in a sensible manner. If they're all 0, it means you're using a generic permissions watermark that doesn't identify ownership, but simply restricts use.
The lower 4 bits specify explicit restrictions:
* do not contextually-index
* do not cache full-sized image
* do not perform face recognition of any kind
* do not index for similarity to other images
A value of "0000" would allow search engines to index the image, unless you restricted them in some industry-standard way via metadata referenced to your unique id. For the generic value with all 0s, 0000 means "go ahead and index this".
A value of "1111" would indicate that the image, when encoded with a 4-bit watermark, should not be indexed in any way, shape, or form, regardless of future extensions to the standard that might define additional permissions, and regardless of what any indirectly-referenced meta-info might or might not say. Let's call this the "Stop Facebook from Permissions Creep in a GPLv3-like manner" anti-permission.
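The 48-bit layout described above could be packed and unpacked roughly like this. It's a sketch in Python under my own assumptions: the post doesn't say which of the four restriction bits is which, so the bit assignment below is hypothetical, as are the `pack` and `unpack` names.

```python
# Assumed layout, most significant bits first:
#   4-bit version | 40-bit owner id | 4-bit restriction flags
VERSION_BITS, OWNER_BITS, FLAG_BITS = 4, 40, 4

# Hypothetical bit assignment for the four listed restrictions.
NO_CONTEXT_INDEX = 0b1000
NO_FULL_SIZE_CACHE = 0b0100
NO_FACE_RECOGNITION = 0b0010
NO_SIMILARITY_INDEX = 0b0001
NO_INDEXING_AT_ALL = 0b1111  # the "1111" anti-permission


def pack(version: int, owner_id: int, flags: int) -> int:
    """Combine the three fields into one 48-bit integer payload."""
    for value, bits in ((version, VERSION_BITS),
                        (owner_id, OWNER_BITS),
                        (flags, FLAG_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
    return (version << (OWNER_BITS + FLAG_BITS)) | (owner_id << FLAG_BITS) | flags


def unpack(payload: int):
    """Split a 48-bit payload back into (version, owner_id, flags)."""
    flags = payload & ((1 << FLAG_BITS) - 1)
    owner_id = (payload >> FLAG_BITS) & ((1 << OWNER_BITS) - 1)
    version = payload >> (OWNER_BITS + FLAG_BITS)
    return version, owner_id, flags


if __name__ == "__main__":
    generic = pack(1, 0, NO_FULL_SIZE_CACHE | NO_FACE_RECOGNITION)
    print(f"{generic:048b}")
    print(unpack(generic))
```

An owner id of 0 here stands for the generic, non-identifying watermark from the description. Actually embedding the payload in the pixels is the hard part and isn't attempted.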
Google just turned every other web site on the planet into MegaUpload. Sort of. "Don't, be evil" indeed.
I think that if I was a photographer, I would be OK with Google caching full quality images as long as they put their own annoying watermark all over it with the URL where the image came from clearly visible.
No sig. Move along - nothing to see here.
scenario 2 is less bandwidth, not more, because you'd be serving the same image either way
Not necessarily.
Most web designers use a thumbnail or a medium-resolution photo on the web page. They do this so that the page paints fast.
But they also know that most people do not click for the high-res image. This saves them bandwidth, often enough to serve the entire page in less total transmitted data than if they always sent the big images.
So you may well not be serving the same image either way, especially if you have a clue about web design.
But with google finding and showing the large ones, it could become more expensive.
Sig Battery depleted. Reverting to safe mode.
Retard.. things are copywritten automatically when they are published. What you just suggested is - never publish anything online.
Your post is covered by copyright
Linux - copyright
slashdot's html - copyright
Do you know what copyright is?
Why is slashdot filled with retards these days.
I do not know, so I can only come up with two possibilities.
1. Google doesn't cache the images for each search, so loads images from the servers each time, so increased bandwidth, instead of a smaller cached thumbnail.
2. More users are happy with the Google supplied image, so they do not go to the source. This would be a decrease of effective bandwidth or a higher bandwidth per unique page view on average. Images w/ text + ads (or eyes on products / services) versus image alone. It is an issue of loss of possible traffic in the simplest terms.
why is it filled with assholes?
wow. "Retard.. things are copywritten.." ..."Do you know what a copyright is"
When I try to type copywritten, I get a red underline. My PC doesn't know what copywrite is.
I hated the way they had it. when they changed it I thought "wow, google finally got it together" .
Oh, I get it now, You're a moron. You don't understand the difference between thumbnail images and higher-resolution files.
Maybe it's more that OP doesn't understand how Google image search works. I always thought that Google image search pulled in the full-sized image from the remote server, and then resizing into a thumbnail was done either on Google's servers or in the web browser. But if I understand you correctly, Google image search was previously smart enough to pull in the thumbnails that were already on the remote site (if any even existed).
It probably also has to do with a different method of measuring traffic. OP seems to measure traffic by measuring how much data is transmitted over a given period of time. You seem to measure traffic by measuring... I don't know, number of hits over a given period of time, regardless of how much data was transmitted?
"I'm not sure I like the fugnutish tone you used in your post!" -RogL (608926)-
I suppose, if there is no "page that contains the full resolution image" in the first place. I guess Google image search is most likely smart enough to be following links that look like they're going to another version of the image, assuming they're true links and not lightbox loader script click events. As for having a clue about web design, most amazing example of this I've seen recently was when I was looking for huge images with Google, and landed on some sort of corporate blog type web site that took an extraordinary amount of time to load, only to finish and have the image nowhere in sight. It seemed to be very close to what I wanted for some wallpaper or whatever I was after, so I took the time to look more closely at the code, and found buried in a side panel in some minor place on the page a spot for a thumbnail image set to display at 111.32 x 74.36 (yes, the styled size was specified as a decimal number of pixels). The actual file set as the img src, OTOH, was 11,132 x 7436 and 10.5 MB in size. This one image, despite being a tiny, inconsequential part of the whole page design, was causing a page to take probably close to a minute to load that could have been served in a second or two even on the slow server they seemed to have...and then you couldn't even see the image in the end anyway, because Chrome apparently can't handle resizing a 10.5 MB image to 1% of its height and width and successfully display it.
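For a sense of scale on that oversized thumbnail, here is the arithmetic as a quick Python sketch. The dimensions and file size are the ones quoted above. Treating file size as proportional to pixel count is a crude assumption of mine, since real compressed sizes depend heavily on format and content.

```python
FULL_W, FULL_H = 11_132, 7_436     # pixel size of the file actually served
SHOWN_W, SHOWN_H = 111.32, 74.36   # styled size it was displayed at
FILE_BYTES = 10.5 * 1_000_000      # reported file size, taking 1 MB as 10**6 bytes


def pixel_ratio() -> float:
    """How many source pixels were sent for each displayed pixel."""
    return (FULL_W * FULL_H) / (SHOWN_W * SHOWN_H)


def rough_thumbnail_bytes() -> float:
    """Crude estimate, assuming bytes scale linearly with pixel count."""
    return FILE_BYTES / pixel_ratio()


if __name__ == "__main__":
    print(round(pixel_ratio()))
    print(round(rough_thumbnail_bytes()))
```

Scaling each side to 1% means sending roughly ten thousand times more pixels than the page could show. Under the proportionality assumption, a properly sized thumbnail would be on the order of a kilobyte rather than 10.5 MB.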
If someone clicks the Google Image Search 'high-resolution' link for one of my photos from Flickr, they get a medium-resolution version with no description, attribution or copyright information. (Example search page here.)
If they go to the ad-free Flickr page, they get links to much higher resolution versions, associated images and also get informed that it's under a super-open Creative Commons Attribution licence.
Tedious Bloggy Stuff - hooray?
Google hosting and delivering the large image ... bad. Google showing where the website makes the image available to everyone ... good. Webmasters: don't like it? Then don't deliver it. That's what the referrer is for.
now we need to go OSS in diesel cars
What's "Bing"?
Am I the only person amused by the concept of "stealing" something from a website that makes it publicly available?
$ curl http://www.cyberciti.biz/deep/link/path/yourimage.jpg -o yourimage.jpg
The cost of running your website is not Google's problem. If you don't want someone downloading something from your website, don't put it on your website.
So, use robots.txt to remove yourself from their search listings. Problem solved.
if these webmasters really believe that most people don't know about 'right click' & "save image as...", then they are living in lala-land.
If I'm really fast, I can get to the links at the (ever moving away) bottom of the page and find my way back to the old GIS, but only if I'm fast enough.
Please, Google, put these coders on a project that NEEDS improvement, and give us a useable GIS back. Thank you.
NetInfo connection failed for server 127.0.0.1/local
Really the same issue webmasters had with deep-linking, where Google sends the searcher straight to the page they wanted without having to wade through the front end of the website. And yes, the same mitigation techniques such as robots.txt and referrer blocking apply, with the same drawback of those searchers not bothering with the site that's making things more difficult for them.
Why such vitriol at webmasters who want the people interested in their images to visit their sites? Not all sites out there are shitty, or have shitty ads, or even have ads at all. Maybe a blogger who posts political photos also wants visitors to check out their writing.
Also, "Get over it already" is a pretty obnoxious phrase.
Dear "Webmaster", nobody cares about your shitty website packed full of annoying ads. Get over it already.
Spoken like a typical leech. No surprise, but always amazing.
Absolutely! - I know I am, and I know many others are... Leeches that is. Proud user of AdBlock-style software for two decades.
Advertising has gone from bad to painfully awful in an amazingly short time, rendering most pages useless without ad-blocking software. It began with that first animated banner, blinking or jumping to attract attention, and today you get full page ads, completely blocking the real page, complete with loud music, a semi-yelling salesman or worse.
"For every complex problem, there is a solution that is simple, neat, and wrong." -- H.L. Mencken (1880-1956) --
It doesn't have to be all or nothing with robots.txt. You can simply exclude certain paths, like /pics, and then the stuff in there won't be indexed. Quite simple and handy actually.
"For every complex problem, there is a solution that is simple, neat, and wrong." -- H.L. Mencken (1880-1956) --
Disallow: /images/
Poof, done.
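If you want to sanity-check a rule like that before deploying it, Python's standard `urllib.robotparser` can evaluate it locally. A minimal sketch: the `example.com` URLs are placeholders, and the `User-agent: *` line is my assumption about the surrounding file, since the comment only shows the `Disallow` line.

```python
from urllib.robotparser import RobotFileParser

# The Disallow rule from above, wrapped in a minimal robots.txt.
# The "User-agent: *" line is assumed; the comment only gave the rule.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Check whether a crawler honoring robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("Googlebot-Image", "http://example.com/images/cat.jpg"))
    print(is_allowed("Googlebot-Image", "http://example.com/index.html"))
```

This only tells you how a well-behaved crawler would read the file. It does nothing to stop clients that ignore robots.txt.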
You used to get traffic actually visiting your site.
You used to get people who grudgingly went to your site to click save as...
You now have less traffic (unique IPs hitting your site), but they're JUST downloading hi-res images which leads to a net increase in bandwidth.
You get the same amount of traffic (unique IPs), but they're just going to the image, not your webpage. Bandwidth use is hardly going to change. You're not going to see an influx of new users if your main source of hits was google image search.
If your content outside of the images was not worth the user's attention, you'll get fewer actual visitors. If you don't like it, there have been ways to block this kind of use for years, but that won't increase the influx of users either. Most of the complaints about this feature are from lazy webmasters who see easy money evaporate. And man, those ad revenues sure are worth so much moneys... Provide actual content, build up a community, offer features your community wants, et voila, you have recurring traffic that doesn't leech your bandwidth via google image search.
Also, ads don't have to be shitty and annoying.
Don't worry, practically everyone is using adblock anyway. I'd like to repeat my sentiments on the whole "The income of my business depends on ad revenue" thing: if you are going to sponsor your hosting solely on the income provided to you by advertising, don't be surprised if you're at the mercy of the ad-network and the users not even downloading your ads. It's like all common sense has gone out of the window with website hosting.
In this case, don't be surprised if Google decides that it's in its best interests to screw you out of ad income, because the chances are high that they're the ones providing you ad income in the first place. Make your site worthwhile to visit, and users like me will come back and even *gasp* turn off adblock or pay for some feature you have that's useful to us. If you had that kind of service, you wouldn't be bitching about ad revenue, you'd have more interesting accounting problems. But if you're just hosting lolcats image macros, good luck with that.
Dear user, if you don't like my shitty website, don't click on my shitty images.
Or, you could just not visit sites run by people who want to show ads. You obviously think the person whose content you're trying to consume is making a poor choice. You don't like their judgement, you think they're offending you ... so, just walk away. That's how you stop seeing those ads. Become a site's member or whatever is needed to reduce the ad displays, or just go away.
Don't disappoint your bird dog. Go to the range.
Retard.. things are copywritten (sic) automatically when they are published.
Why is slashdot filled with retards these days.
I think you got your answer.
I can confirm that the traffic from image search is down by 20 percent. People are being forced out of business.
~ Best man at your service.
As someone who uses Google Image Search quite a bit, I have this to say:
Please.
Someone look at my images, either at my site or at Google
XKCD:Xeric Knowledge Comically Dispen
If someone clicks the Google Image Search 'high-resolution' link for one of my photos from Flickr, they get a medium-resolution version [staticflickr.com] with no description, attribution or copyright information.
You don't embed that in the EXIF information?
!#@%*)anks for hanging up the phone, dear.
Exactly. They make something publicly accessible, and then complain when the public accesses it.
Unix is user friendly, it's just selective about who its friends are.
You want to show content + ads but your server is used just to pull an image, thus no traffic and high bandwidth.
I think what you meant was... you want to make your content publicly accessible in order to increase your exposure, but you also want ad revenue. What happens is that once you make your content publicly accessible you cannot force people to also view your ads, so your server is used just to pull an image, thus no ad revenue and higher bandwidth.
It's called having your cake and eating it too. You can't make your content publicly accessible and then complain when the public accesses it in a way you don't like.
Unix is user friendly, it's just selective about who its friends are.
Most of it, yes - but Flickr helpfully strips out said EXIF data for the reduced-size versions of the photos. This combined with Google's allow-download-without-seeing-any-attribution-details? Nice!
Actually, doing some more testing - Flickr has an optional (and trivially easy-to-defeat) system to prevent visitors from saving displayed photos to their computers. Google Image Search goes straight past this - so an all-rights-reserved, the-owner-has-disabled-downloading-of-their-photos image can be saved straight from Google with no indication whatsoever of the photographer's wishes.
(While I'm really not protective of my own stuff, I know other people are of theirs - Google's behaviour here is at the very least terribly impolite.)
Tedious Bloggy Stuff - hooray?
Not everyone views the high res photo, they might just look at the thumbnail.
Now the site gets almost no traffic, but to just the photo itself. Which doesn't really do the site much good.
You forgot the more common scenario 3, a page with a thumbnail. Google used to link to the page, user goes to the page, click on thumbnail, see high res image. Now users go to google, see high res image.
Dear "Webmaster", nobody cares about your shitty website packed full of annoying ads. Get over it already.
Apparently a LOT of people cared enough about the webmasters "shitty" website, to want the photo the webmaster was offering. It wasn't free for the webmaster to offer the photo.
What braindead users are praising Google over the "intuitiveness" of the idiotic new image design? It is awful. I have to click multiple times now to get to the website. First click brings me to some other google page, with one small url that links to the site. How is this intuitive at all?
When I click on the photo I expect to get taken to the website, much like when I click on the search result in the text searches.
And copies results from Google.
So in scenario 3, the server load is even higher: page contents, plus thumbnail, plus high res image after all of that. The same fact holds true, that the change very well could result in a reduction in bytes served, and the final outcome depends on how user interaction plays out after the change as compared to how it did before.
Bing for 100$.
What is the last name of the Chandler character in the insanely popular sitcom called Friends?
Defining Statistics and Social Research
Errrrr!
Right question: What's the noise the Machine that goes Bing! makes?
Defining Statistics and Social Research
Even if that was true, why should I care as long as I get better results?
I have several websites that I like whitelisted in adblock. In general though, I disliked the video ads from a few years back enough to blanket block just about everything.
If the webmasters would prefer we could recode adblock to download them, but never show them. Would that make them happier? At least their logs would seem to show that they were loaded but not clicked on.
All of the above was encrypted with a Quad ROT-13 method. Unauthorized decryption is in violation of the DMCA.
I see you do not understand how the web works.
For anyone to actually see the images you post, they have to download them onto their machine. Hence it is not stealing.
Even if someone likes a photo well enough to use it elsewhere, it is still not stealing: it is copyright infringement.
Perhaps you should have done some thinking before posting your amazingly ignorant drivel.
Of course not, since this is a discussion, it makes reading sense. If you want fucking sense, you'll have to fuck. But not me, I'm spoken for.