IBM, and Some Other Companies Did Not Inform People When Using Their Photos From Flickr To Train Facial Recognition Systems (nbcnews.com)
IBM and some other firms are using at least a million images they have gleaned from Flickr to help train a facial recognition system. Although the photos in question were shared under a Creative Commons license, many users say they never imagined their images would be used in this way. Furthermore, the people shown in the images didn't consent to anything. From a report: "This is the dirty little secret of AI training sets. Researchers often just grab whatever images are available in the wild," said NYU School of Law professor Jason Schultz. The latest company to enter this territory was IBM, which in January released a collection of nearly a million photos that were taken from the photo hosting site Flickr and coded to describe the subjects' appearance. IBM promoted the collection to researchers as a progressive step toward reducing bias in facial recognition. But some of the photographers whose images were included in IBM's dataset were surprised and disconcerted when NBC News told them that their photographs had been annotated with details including facial geometry and skin tone and may be used to develop facial recognition algorithms. (NBC News obtained IBM's dataset from a source after the company declined to share it, saying it could be used only by academic or corporate research groups.)
"None of the people I photographed had any idea their images were being used in this way," said Greg Peverill-Conti, a Boston-based public relations executive who has more than 700 photos in IBM's collection, known as a "training dataset." "It seems a little sketchy that IBM can use these pictures without saying anything to anybody," he said. John Smith, who oversees AI research at IBM, said that the company was committed to "protecting the privacy of individuals" and "will work with anyone who requests a URL to be removed from the dataset." Despite IBM's assurances that Flickr users can opt out of the database, NBC News discovered that it's almost impossible to get photos removed. IBM requires photographers to email links to photos they want removed, but the company has not publicly shared the list of Flickr users and photos included in the dataset, so there is no easy way of finding out whose photos are included. IBM did not respond to questions about this process.
There's no implication IBM did anything wrong. This is what the Creative Commons licenses are for. What's the story?
Your photos are public. What the hell do you expect?
Someone tell these people how search engines work.
People place images on the public internet, available to world+dog, and then express surprise and dismay that world+dog has access to the images? What's next, shock and dismay upon learning that Zuckerberg knows more about them than the NSA does?
Stop posting your shit online.
Sure sure we'll bleat and make noises about laws to prevent this happening, and appropriate slaps-on-the-wrist to using publicly available images.
But in the meantime, stop posting your shit online.
"Researchers often just grab whatever images are available in the wild" - I fail to see the problem. This is legal. You put your picture out there, they are viewing it with AI. There's no law being broken really even conceptually.
You may not want them to, and you may have shared it on a platform that says it will ask your consent before sharing it, but legally unless you have been somehow provably damaged I don't see how you'd stop them either way.
Or moreover, why you should even care to. You shared it, it's out there. That's how it goes.
Whether you love or hate the harvard comma, it is generally agreed you don't use it on two-item lists.
Someone had to do it.
Flickr photo sets have been used for computational workloads and data mining for well over a decade; this is hardly NEWs.
https://www.ted.com/talks/blai...
Maybe they could use my image to train sexbot AI's...
Then I would just tell them that I am Captain James T Kirk, and their synthetic panties all drop for me...
An incel can dream
It's pictures available for public conniption ("conniption" was an autocorrect error too funny to correct).
Consumption is just what model training is doing; they are not republishing the pictures in any way, just using them to train models - which do not contain any element of images they train from.
If you put your image in public, how can you be aghast someone has viewed it?
"There is more worth loving than we have strength to love." - Brian Jay Stanley
Although the photos in question were shared under a Creative Commons license, many users say they never imagined their images would be used in this way
Just because you lacked the creativity to consider what was possible with your data doesn't mean anything improper has happened when they do use it in such a way.
Also, if you have given away your data thinking that somehow corporations would respect you then you don't really understand what drives corporations.
The reality is that if it's profitable then a corporation will do it. It doesn't matter if it's morally repugnant, illegal or downright evil because if it's possible to make a profit then there will be a corporation that will do it. Note that being illegal typically means they will be fined which they consider a business expense.
Anons need not reply. Questions end with a question mark.
Although the photos in question were shared under a Creative Commons license, many users say they never imagined their images would be used in this way.
Since when is licensing about what you "imagine"?
Funny that I've not yet heard about the CC-as-foreseen license, which apparently billions of people have been using, in earnest, all along.
Copyright governs your ability to distribute copies of other people's work. There's no distribution going on here, so permission of the copyright holder (photographer) was not needed.
It might be governed by personality rights - your right to control how your image is used. You could argue the model's consent is needed before using their facial geometry. But personality rights are generally concerned with control over how others perceive your image. Since there's no public perception or exploitation here, it would be an uphill argument.
AFAIK, there is no basis for prohibiting people from using things you make publicly available (your face every time you walk out in public, unless you wear a burka) to train computer algorithms. Photographers and the press have worked pretty hard to enshrine their right to record images of people in public places. If we want there to be restrictions of using images of people in public places, it'll need to be a new law.
Many of us are not narcissistic attention seeking whores and do not want to be in the spotlight for any reason.
(Not sure why you were down-modded?)
If that is true, you wouldn't have any photos up that they could use for training, right?
Also, any photos used for training are never in the spotlight, as it were.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
"Many of us are not narcissistic attention seeking whores and do not want to be in the spotlight for any reason." - He wasn't referring to you, Kendall.
I couldn't get past the grammatical train wreck of a headline; I definitely can't take the rest seriously.
That's their own stupidity for using an online service "cloud". As I always said, the "cloud" is just the latest buzzword for storing your data on other people's servers under their control which always results in them using that data for other reasons.
Here is another buzzword: "blockchain", which is nothing but CRC 2.0. Instead of a simple math matrix to assure data integrity, it incorporates location, date, time, person, etc. This simply provides more data that can be gleaned by others who know how to read the ledger; that's all it is.
Stay away from BOTH.
Let me guess, the whole quote should have been something like:
"None of the people I photographed had any idea their images were being used in this way, but it's all because I decided to put it on the internet with a licence that allows anyone to do anything with it, without explaining to them what I was going to do."
Sir Tim Berners Lee's new proposal called Solid [https://solid.inrupt.com/] would be a prime choice for managing the access to your data that companies like IBM have for their scanning. Fine grained access to share and audit data in a decentralised distributed network using blockchain.
When shit hits the fan get some of these https://youtu.be/pY-GncsZ-UE
I could care less. Most people photographed also do not care. I can honestly say that if you get offended by a machine learning algorithm looking at your photo, you might have deep psychological issues. It's not a human. No judgement was made.
“It seems a little sketchy that IBM can use these pictures without saying anything to anybody,” he said.
It seems a little sketchy that this photographer didn't explain to the subjects that he was going to post their image online with a licence that allows anyone to do anything with it for any reason.
The fact that they only used creative commons images suggests there's an actual legal issue with proprietary images, but why? If I save an image from a website to my hard drive, without sharing it, does that make me a criminal? I've been training my brain on face recognition with proprietary images for decades. I've even occasionally indirectly made money from the viewing of proprietary images, as has everyone else.
Should I pay a royalty every time I imagine a proprietary image I've previously seen?
This space intentionally left blank
They're going to hate what I do with their photos from Flickr.
This activity is an express violation of the Privacy rights embedded and explicitly described in both the Canadian and Washington State Constitutions.
Period.
-- Tigger warning: This post may contain tiggers! --
""None of the people I photographed had any idea their images were being used in this way," said Greg Peverill-Conti, a Boston-based public relations executive who has more than 700 photos in IBM's collection, known as a "training dataset." "
Why are you whining? YOU explicitly made that possible. YOU had to elect for each image to be licensed under CC. If the people you photographed are upset by this, they should sue YOU.
This is an important detail unmentioned in the article or summary.
CC BY NC could very well be violated by this. If it was CC BY SA and they released the dataset also CC BY SA, then it might creep them out, but it is in keeping with the spirit of the license.
Having said that: I'm sure IBM had some lawyers involved in using this, but maybe some of the infringed parties should have a lawyer look into the license and if this usage is actually covered by it. Nailing some major corporations for violating copyright is always an enjoyable popcorn moment.
Firstly, I'm not sure I understand the outrage on either side. It's not really a complicated issue.
First and foremost, attribution is always required, I believe, for the use of a Creative Commons image. For example, one of the least permissive of the CC licenses is the CC-BY-NC-ND. To the best of my understanding, you must attribute the work, you can't use it commercially, and no derivative works are allowed; so basically, you can redistribute the work. There are two proper ways to distribute and redistribute a CC-BY-NC-ND photo. One would simply be to have attribution included in the photograph, if you are the author; the other would be to distribute text with the file appropriately attributing the work to the author.
I'm not a lawyer; but, if IBM wants to use a CC-BY-NC-ND photo of me to train an AI, it seems that would violate the no-derivatives portion of the license, in that the AI being trained is a work of its own. In the case of this hypothetical photo, it has been modified into a piece of code somewhere in the AI's programming. Now, if the AI was a sentient being and was just viewing the material, that might be a different case. If the AI was sentient and wanted to use the photo, it/he/she would have to attribute the work so long as it was bound by human law. Also, there is no commercial use allowed in a CC-BY-NC-ND, so that would probably violate the license before the no-derivative portion was violated.
The CC-BY-NC license would allow for use with attribution if IBM's work was not commercially used.
The CC-BY-NC-SA license would allow for use with attribution if IBM's work was not commercial, and if they also shared their work. (Again, not a lawyer; but, this may be an area where Creative Commons licenses run into compatibility issues.)
The CC-BY-ND might work with what IBM is doing depending on if the translation from image to code by algorithm via AI was not considered a derivative work. In this case, IBM would either just attribute the work, or not be able to use it if it is a modified work after the AI uses it.
The CC-BY-SA license would allow IBM to do what they want so long as their work was shared-alike and attribution was given.
The CC-BY license would probably be the safest bet next to public domain. IBM could simply link to a page with all proper attributions or whatever they had to do.
CC0 is self-explanatory.
The arguments and outrage I'm seeing in these comments suggest that many people view an AI training on photos as simply an act of viewing. Perhaps that is the case. It may be a debatable point in a court of law if anyone was to bring a situation like this to court. From my understanding (which may be flawed) it is not a simple act of viewing, in that a work is being worked upon by another work and simultaneously being incorporated into that work, which, in the case of IBM, is most likely going to be used to make a profit at some point. A counterargument could be that, if I view a CC-BY-NC-ND picture and become inspired to write a story based on what I viewed in the photograph, am I then not allowed to do so? I would agree that I should be allowed to do so; but I am not a 'work' of a company. At least so far as I can prevent myself from being one...
*shrug*
The clear distinction between the two is probably equivalent to the line the ocean draws at the coast; or, the line the coast draws at the ocean.
Almost all responses here are along the lines of "what did you expect". But it's not that simple.
If I go up to a window in your house and photograph the inside, you don't say "well, I have no problem with that, the windows are transparent after all".
Saying "it's technically possible, so of course someone did it" makes you no better than databrokers like Cambridge Analytica who create psychological profiles based on your Facebook likes and then sell them to, well, anyone really.
Is it technically possible? Yes. Was it something the average user could have anticipated when they pressed the "I agree" button? No.
This is about norms and values. Privacy is a form of "contextual integrity". We have expectations of how much privacy we will get in different situations. People have similar expectations online.
This illustrates the hypocrisy of photographers: They do not believe that the (non-model) people appearing in a photo (perhaps in the background) have any rights to their faces. But the photographers believe that they have all rights to the photos of those faces.
The problem in this case is that the photos were posted at all without the consent of the people in the photos.