Slashdot Mirror


Massive Tinder Photo Scrape Has Users Upset (techcrunch.com)

Images of Tinder users "were swept up in a massive grab of some 40,000 photos from the dating app by a dataset collector who plans to use the selfies in artificial intelligence training," writes Slashdot reader Frosty Piss, sharing this summary of a report in TechCrunch. Tinder said in a statement that the photo sweeper "violated the terms of our service" and "we are taking appropriate action and investigating further." The creator of the data set, Stuart Colianni, has released it under a CC0: Public Domain License and also uploaded his scraper script to GitHub.

He describes it as a "simple script to scrape Tinder profile photos for the purpose of creating a facial dataset," saying his inspiration for creating the scraper was disappointment working with other facial data sets. He also describes Tinder as offering "near unlimited access to create a facial data set," and says scraping the app offers "an extremely efficient way to collect such data."

The article notes that Tinder's API has already been used for other "weird, wacky, and creepy" projects, including "hacking it to automatically like every potential date to save on thumb-swipes; offering a paid look-up service for people to check up on whether a person they know is using Tinder; and even building a catfishing system to snare horny bros and make them unwittingly flirt with each other.

"So you could argue that anyone creating a profile on Tinder should be prepared for their data to leech outside the community's porous walls in various different ways -- be it as a single screenshot, or via one of the aforementioned API hacks. But the mass harvesting of thousands of Tinder profile photos to act as fodder for feeding AI models does feel like another line is being crossed."

17 of 80 comments

  1. Good Grief... by Frosty Piss · · Score: 3, Insightful

    Tinder said in a statement that the photo sweeper "violated the terms of our service" and "we are taking appropriate action and investigating further."

    TOS is meaningless in cases like this. TOS are meaningless anyway, except as, perhaps, a means to ban users. And that's pretty pointless as well.

    But really, what do people that put their photographs out on the Intertubes like this expect? Privacy? Really?

    --
    If you want news from today, you have to come back tomorrow.
    1. Re:Good Grief... by viperidaenz · · Score: 2

      I'd imagine they gave Tinder unlimited rights to their photo when they uploaded it, including allowing Tinder to grant access to 3rd parties.
      A 3rd party being anyone who accesses the API...

    2. Re:Good Grief... by mrsquid0 · · Score: 3, Insightful

      It's called blaming the victim. It is very popular in some circles.

      --
      Just because you are paranoid does not mean that no-one is out to get you.
    3. Re:Good Grief... by henni16 · · Score: 2

      Just because a 3rd party is legally allowed to access something, doesn't mean that 3rd party has the right to redistribute or relicense the stuff.

  2. Attention whoring of the highest order by Anonymous Coward · · Score: 5, Interesting

    I'm also an AI researcher. If I need a face dataset I could either use CelebA or the Facebook API to scrape user profile pictures. There are also mugshot databases and public DoJ and county jail mugshot APIs, so there's that too.

    Now with "GAN" generative models, there's very little need for large datasets unless the existing datasets are biased in some way.

    Let's get real here: someone wanted to build a Deep NN classifier for sexual promiscuity. Other than attention whoring, that's the only reason to harvest tinder users specifically.

    Grindr would do well to batten down their hatches. Training a NN to classify "heterosexuality" from their userbase is the next natural progression. Perfect for a homophobic witch hunt in 3rd world countries. Would I go to hell if I sold such an app to a Middle East law enforcement agency? It doesn't matter if it works as long as you can demonstrate efficacy on the training data.

    Their purchasing agents are unlikely to be sophisticated enough to understand the importance of "hold out data", so it wouldn't be hard to put together a demo with near perfect accuracy.
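
    To make the "hold out data" point concrete, here is a minimal Python sketch. It assumes purely random features and random labels (hypothetical data, not the Tinder set) and a toy 1-nearest-neighbour "model"; the names make_random_data, nearest_label and accuracy are made up for illustration. Because the labels are noise, there is nothing to learn, yet the accuracy on the training data still looks perfect.

```python
import random


def make_random_data(n, dims, rng):
    """Return n (features, label) pairs where the label is pure coin-flip noise."""
    return [([rng.random() for _ in range(dims)], rng.randint(0, 1)) for _ in range(n)]


def nearest_label(train, x):
    """1-nearest-neighbour: return the label of the closest training point."""
    closest = min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)))
    return closest[1]


def accuracy(train, data):
    """Fraction of points in `data` whose label the memorizer predicts correctly."""
    hits = sum(1 for x, y in data if nearest_label(train, x) == y)
    return hits / len(data)


rng = random.Random(0)                      # fixed seed so the run is repeatable
data = make_random_data(400, 8, rng)
train, held_out = data[:200], data[200:]    # the second half is never seen in "training"

train_acc = accuracy(train, train)          # each point's nearest neighbour is itself
held_out_acc = accuracy(train, held_out)    # unseen points: no better than guessing

print(f"training accuracy: {train_acc:.2f}")
print(f"held-out accuracy: {held_out_acc:.2f}")
```

    The training figure comes out at 1.00 because the model simply memorized its inputs, while the held-out figure hovers around chance. A demo that only reports the first number says nothing about whether the classifier works.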

  3. Surprise, surprise... by Chris Mattern · · Score: 5, Funny

    Putting photos out where anybody can see them means putting photos out where anybody can see them.

  4. Where is this... by Anonymous Coward · · Score: 2, Interesting

    I was thinking about making an autoliker that only liked attractive people using machine learning, and learning about neural networks while I'm at it. This dataset will come in handy.

    "The article notes that Tinder's API has already been used for other "weird, wacky, and creepy" projects, including "hacking it to automatically like every potential date to save on thumb-swipes"

    Where is this? Please, I need it!

  5. Uploading Not Okay by OYAHHH · · Score: 2

    I can see downloading for research purposes as being ok. And I can see developing the algorithms as being ok. I can even see uploading the algorithms as being ok.

    Now all of the above is predicated on not violating the terms the "researcher" agreed to if/when they signed up for the account they used. Assuming an account was required.

    But uploading the photos taken somewhere else for public consumption is just wrong.

    Abuse of privileges is how we so often get to the point we find ourselves at in society. This breach of the public's confidence is just another stab in the back to a society that values respect.

    --
    Caution: Contents under pressure
    1. Re:Uploading Not Okay by henni16 · · Score: 2

      I see nothing wrong with scraping and sharing the scripts even if not for research purposes.
      And if there's a ToS violation, well, that's between Tinder and the user, and Tinder is welcome to block the account.

      But - morality aside - uploading the scraped images will certainly violate copyright law pretty much everywhere and at least in certain (European) countries it will also violate privacy laws which make it illegal to distribute images without the consent of the depicted persons. So, yeah, not cool.

    2. Re:Uploading Not Okay by imidan · · Score: 2

      I can see downloading for research purposes as being ok.

      In an academic environment, at least, you'd have to run that plan by the Institutional Review Board and maybe the Human Studies Review Board. You're collecting personally identifying data about people. Even though the information is available on a publicly accessible website, does that make the data public? It's against the TOS of the web site to scrape it, so unless you made a deal with Tinder to get the data, I'd guess that the Board would reject your proposal. They're pretty strict about research ethics, especially with human subjects.

    3. Re:Uploading Not Okay by aussie_a · · Score: 3, Insightful

      But uploading the photos taken somewhere else for public consumption is just wrong.

      Tell that to all the people who upload material illegally and the millions who download them illegally.

      Society has spoken and said copyright law is irrelevant. These are the consequences. Suck it up.

  6. The dataset appears to be missing by innocent_white_lamb · · Score: 2

    The article links this as being the dataset "consist[ing] of six downloadable zip files, with four containing around 10,000 profile photos each and two files with sample sets of around 500 images per gender."

    https://www.kaggle.com/scolian...

    Which gives a 404.

    --
    If you're a zombie and you know it, bite your friend!
    1. Re:The dataset appears to be missing by MillionthMonkey · · Score: 2

      According to his README.md, the site got a takedown request from Tinder.

  7. Why isn't the API secured? by snoozy355 · · Score: 3, Insightful

    Putting aside all the victim blaming for a second...

    This is meant to be a private (closed-source) application, with a private API talking to a private server.

    Why the hell can anyone (read: unauthenticated users) access private data via a public and unrestricted URL? I've read articles reverse engineering their API. It's terrible! This is another company that did not put enough time and effort into securing the application and API, and now users (read: non-technical, real people, some of whom paid money, all of whom trusted the company) are left exposed.

    I really wish there was a way to force companies (i.e., legislate) to place far higher importance on this. I've also been in situations where, as a developer, I've had managers scuttle or ignore requests to lock things down, in the interests of deadlines or cost or, worse yet, "we'll fix it once it's up and running."

    1. Re:Why isn't the API secured? by epyT-R · · Score: 3, Insightful

      The pragmatic reality is that once your pic is uploaded to the net it's up there for good. No amount of legislation will change that. If you don't want the pic shared with the public, don't upload it anywhere. These 'victims' should know better by now.

    2. Re:Why isn't the API secured? by DarkOx · · Score: 2

      There is a fine line between victim blaming and pointing out, for the benefit of others who could learn from all this, how not to be a target.

      If you leave the door to your house unlocked while you are gone for the day, it does not give someone the right to enter and take your stuff. It does, however, make it easy for someone dressed in something that looks like a letter carrier's uniform to go door to door trying knobs in your typical bedroom community and take your stuff.

      You are still a victim, but your choices or lack of care helped make you a target. It's worth recognizing that personally so you can maybe avoid being a victim in the future, and socially it's worth recognizing cases like this so we can direct public resources to protecting people from things/attacks they are less able to control for themselves, rather than from things they can easily control to some degree.

      --
      Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    3. Re:Why isn't the API secured? by chispito · · Score: 2

      Putting aside all the victim blaming for a second...

      How are they victims? The only one victimizing people was whoever convinced the users they could anonymously use a service that requires a photograph. If one other person can view your photo, that one other person can distribute it.

      --
      The Daddy casts sleep on the Baby. The Baby resists!