Slashdot Mirror


Data Firm Leaks 48 Million User Profiles it Scraped From Facebook, LinkedIn, Others (zdnet.com)

Zack Whittaker, reporting for ZDNet: A little-known data firm was able to build 48 million personal profiles, combining data from sites and social networks like Facebook, LinkedIn, Twitter, and Zillow, among others -- without the users' knowledge or consent. Localblox, a Bellevue, Wash.-based firm, says it "automatically crawls, discovers, extracts, indexes, maps and augments data in a variety of formats from the web and from exchange networks." Since its founding in 2010, the company has focused its collection on publicly accessible data sources, like social networks Facebook, Twitter, and LinkedIn, and real estate site Zillow to name a few, to produce profiles.

But earlier this year, the company left a massive store of profile data on a public but unlisted Amazon S3 storage bucket without a password, allowing anyone to download its contents. The bucket, labeled "lbdumps," contained a file that unpacked to a single file over 1.2 terabytes in size. The file listed 48 million individual records, scraped from public profiles, consolidated, then stitched together.

33 of 56 comments

  1. "Leaked" public data by Shotgun · · Score: 1

    I'm not sure that word means what you think it means.

    --
    Aah, change is good. -- Rafiki
    Yeah, but it ain't easy. -- Simba
    1. Re:"Leaked" public data by Nidi62 · · Score: 3, Insightful

      I'm not sure that word means what you think it means.

      the company left a massive store of profile data on a public but unlisted Amazon S3 storage bucket

      Cue the Congressional hearing with the 80 year old Congressman asking why Amazon even allows companies to store anything in these buckets if they have holes, and why they can't just stop the leaks with duct tape.

      --
      The only thing necessary for evil to triumph is for it to be pitted against a slightly greater evil
  2. Where's the benefit of locking down this user data? It seems like, if we want to harm scammy companies like this, removing their profit motive by publishing all the (non-copyrighted) data makes sense.

    --
    Your ad here. Ask me how!
    1. Re:Hmmm by Actually, I do RTFA · · Score: 1

      The data isn't private. This scammy company already scraped it. No doubt many others did too. Allowing them to maintain the secrecy of their data just gives them a profit motive.

      --
      Your ad here. Ask me how!
  3. What would you do with this data? by skids · · Score: 1

    I mean, personally, what would you as a typical slashdotter do with this data if you weren't too busy cleaning the I.T. closets?

    See who can build the most efficient script to "find Waldo"?

    1. Re:What would you do with this data? by rsborg · · Score: 1

      I mean, personally, what would you as a typical slashdotter do with this data if you weren't too busy cleaning the I.T. closets?

      See who can build the most efficient script to "find Waldo"?

      Grey hat:
      It'd be a great (though ethically questionable) corpus of data for training your AI for whatever sort of prediction data you want.

      Black hat:
      It'd also be good for political targeting or looking for easily scammable people for spear-phishing or spam cons.

      --
      Make sure everyone's vote counts: Verified Voting
  4. Did they sell any to the Republicans? by Anonymous Coward · · Score: 1, Insightful

    If they sold it to Republicans they need to be dragged before Congress and publicly humiliated, otherwise this is a non-issue.

    1. Re:Did they sell any to the Republicans? by HeckRuler · · Score: 1

      Low effort partisan bashing from an anonymous coward and inexplicably getting upvotes...

      Yep, this one tastes like professional shilling. I think someone out there really wants to get this issue cut down along party lines. Good luck with that though, I don't think democrats OR republicans are too happy with Facebook over the sort of shit they let happen.

    2. Re:Did they sell any to the Republicans? by rsborg · · Score: 1

      If they sold it to Republicans they need to be dragged before Congress and publicly humiliated, otherwise this is a non-issue.

      You jest but data like this should be a liability and treated as such. It would certainly be one if something like the European GDPR were in effect in the US.

      Facebook, Equifax and the like should be punished for their so-called "lapses in security".

      --
      Make sure everyone's vote counts: Verified Voting
    3. Re:Did they sell any to the Republicans? by HeckRuler · · Score: 1

      I just typically down mod it, but I commented in this thread.

      Also because that trolling is even lower-grade bullshit. Randomly swearing "FUCK TRUMP" in a thread that has nothing to do with him is just noise. Ever read Anathem by Neal Stephenson? He has a great bit in there about different sorts of propaganda and bullshit and spam. Literal static is the lowest grade, easy to ignore as there's no content there. It's not even bullshit. While top quality bullshit would be an otherwise impeccable scientific paper but with a critical flaw or piece of misinformation. The higher the quality of the bullshit, the more insidious it is and the more we need people detailing exactly HOW such messages are bullshit. Hence, a refutation rather than simply downmodding.

      AND because there are plenty of perfectly legitimate issues where Trump is on-topic and is doing something pretty fucking stupid and deserves to be bashed. I've got political views. I'm a democrat. But this is not a political issue. The two parties might differ on how to solve it, sure, but they both agree it's a problem. (Oh, and in general Democrats likewise deserve to have shenanigans called on them for the stance so many of them are taking on free speech. Seriously, fuck that noise)

  5. Re:Chickens coming home to roost by postbigbang · · Score: 1

    Are you one of those Russian troll things? I always wondered what one of those looked like, and you sure do have all the earmarks.

    Did you learn English in high school? Do they make you sit in a cubicle and write stuff in boldface, using as many English swearwords as you can think up?

    Gosh. You must be one of the happiest people in the world. Have a nice day. I wonder if Slashdot got your IP address. I doubt they do anything about such rubbish. Oh well.

    --
    ---- Teach Peace. It's Cheaper Than War.
  6. 4 scumbags and a data scientist. by CaptnCrud · · Score: 4, Informative

    Here is their publicly available personal info.

    http://www.localblox.com/

    George Fink - CEO/Marketer/Scumbag: https://www.linkedin.com/in/ge...
    Sabira Arefin - Founder/Entrepreneur(lol)/Scumbag: https://www.linkedin.com/in/sa...
    Colby Atwood - President/Marketer/Scumbag: https://www.linkedin.com/in/co...
    Ashfaq Rahman - Chief Data Scientist/Scumbag: https://www.linkedin.com/in/as...

  7. it was public data by Anonymous Coward · · Score: 1

    A little-known data firm was able to build 48 million personal profiles, combining data from sites and social networks like Facebook, LinkedIn, Twitter, and Zillow,

    That is data people posted publicly.

    Now if they did that with FB's "shadow profiles" of non-users, then maybe I can see a cause for being upset. But if people spew their private data to every advert company on the internet, inc the biggest data aggregators out there like FB, G and Linkedin, they do not have a "reasonable expectation of privacy". That's like publishing your drunken fratboy antics in the New York Times, and then being upset when someone reads about them.

    People have to start thinking about what they are doing with their data. Anything else is a tapdance around the problem, and won't solve it.

  8. Zuck Lies by sdinfoserv · · Score: 1

    So, Zuckerberg.... repeat again that you don't sell data..

    1. Re:Zuck Lies by HornWumpus · · Score: 2

      He doesn't. He sells lists of names that meet criteria. The data itself is too valuable to sell, just once.

      Facebook is upset that Cambridge Analytica did what Facebook does. Never throw away data and never miss a chance to collect more.

      --
      John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
  9. Counter argument by Comboman · · Score: 3, Interesting

    I'm gonna say no, based on the Supreme Court case Feist v. Rural Telephone Service.

    The court found that information alone without a minimum of original creativity cannot be protected by copyright. In the case appealed, Feist had copied information from Rural's telephone listings to include in its own, after Rural had refused to license the information. Rural sued for copyright infringement. The Court ruled that information contained in Rural's phone directory was not copyrightable and that therefore no infringement existed.

    --
    Support Right To Repair Legislation.
    1. Re:Counter argument by GrumpySteen · · Score: 1

      Not that I disagree with the ruling about phone listings, but my Facebook profile lists my job as rocket surgeon and most of the other information there is equally fictional. There can be a fair amount of creativity in what would otherwise just be listings of factual data.

  10. Is this a witch hunt? by HeckRuler · · Score: 1

    hmmmm, wait a second... *sniffs the smoke* *listens to the chanting mob* *Looks down at the pitchfork in his hands*. Yep. This is a witch-hunt.

    Now, don't get me wrong. I honestly despise this particular brand of witch. These guys suck and their actions have a very anti-social bent to them. Their business model is abusive and intrusive. Fuck marketers. I know plenty well enough to protect myself, but "the masses" are just kinda generally dumb and enough are swayable into doing dumb things. Like using emacs or voting along party lines. Or worse. There are large scale sociological problems when corporations know too much about every individual.

    BUT. I mean, come on guys. We've got to be allowed to build our own phone books. A name and an address isn't.... Nobody expects that to be private. If you own a house it's literally public knowledge. You WANT people to know you own that land. This is all publicly accessible data. That's fine. In fact I'd expect companies to collect this stuff. It's not a problem. The problem is if they harvest PRIVATE data.

    So... like... let's go burn down the castle of someone that's actually a monster.

  11. What a world by jenningsthecat · · Score: 5, Insightful

    A Canadian kid gets charged with "exploiting a vulnerability", (i.e. incrementing a number in a URL), and faces ten years in prison for archiving the FOI data he collected as a result. He had no idea he was doing anything wrong. (FOI? Hello!). These assclowns scraped data, and created 48 million personal profiles without consent. They knew full well what they were doing. Then they effectively published the data. Careless, much? Arguably they were criminally careless. They probably won't face any penalties at all. Go figure.

    --
    'The Economy' is a giant Ponzi scheme whose most pitiable suckers are the youngest among us and the yet-unborn.
    1. Re:What a world by HeckRuler · · Score: 1

      They also donated to the DNC.

      Subtle.

      But quit trying to shoehorn this into a partisan issue you shitty little shill.

    2. Re:What a world by Drethon · · Score: 1

      Well the kid was accessing data that was meant to be secured, if poorly. These researchers are accessing data that is meant to be public.

      They can do whatever they like with my LinkedIn data, it was put up there for everyone in the world to see.

  12. Re:oops by skids · · Score: 1

    Typical conversation at a 2025 back-yard BBQ:

    Joe: So I was thinking of maybe taking a trip to Paris
    Bill: Yeah, I know, my google assistant briefed me on you profile on the car ride hear.
    Joe: Oh. ...
    Joe: So, have you started to carve any new chainsaw sculptures. I mean after the pengiun, Siri told me about that one already.
    Bill: No, still finishing up the penguin. ...
    Bill: Any thoughts on the-
    Joe: ...no...
    Bill: the town referendum?
    Joe: Yeah I knew that's what you were asking. Not really.
    Bill: Oh. ...
    Joe: Well, you should be getting on now, right?
    Bill: Oh right you are. Almost time.

  13. Public data was leaked publicly? by MobyDisk · · Score: 1

    Wait, so a company that scraped data from public sources, left the data unsecured, and the public could access it?

    personal profiles...from sites and social networks like Facebook, LinkedIn, Twitter, and Zillow, among others -- without the users' knowledge or consent.

    Are you telling me that users of social networks do not know that the public part of their profile is available publicly? What? Hey, there are plenty of privacy violations going around, but this isn't one of them. Save your outrage for any one of the many other examples.

  14. Re:Chickens coming home to roost by forkfail · · Score: 1

    Yeah, it kinda does sound like an old time BOFH posting on usenet.

    Kids these days.

    --
    Check your premises.
  15. Re:oops by Anonymous Coward · · Score: 1

    Typical conversation at a 2025 back-yard BBQ:

    Sounds about write. In 2025 we'll be so dumb we misspell words when talking.

  16. Re:oops by eneville · · Score: 1

    Typical conversation at a 2025 back-yard BBQ:

    Sounds about write. In 2025 we'll be so dumb we misspell words when talking.

    If you haven't done so already, watch https://www.imdb.com/title/tt0...

  17. Order your own LexisNexis file - you'll be shocked by Anonymous Coward · · Score: 2, Informative

    Order your own LexisNexis "Full File" - you will be shocked at the data this private company has collected on you. No shoe size (yet.) https://personalreports.lexisn... Also order your own LexisNexis "C.L.U.E. Auto Report" and "C.L.U.E. Personal Property Report" https://personalreports.lexisn... By Federal law, they are required to provide you with free reports once per year.

  18. Why are so many Amazon buckets public? by ctilsie242 · · Score: 1

    Was there a time when Amazon shipped S3 buckets public by default, with permissions wide open to the world? What is it with these S3 buckets?

    Last time I set up a public bucket (to share some of my photos with some friends), I had to explicitly set the checkbox, and it came up with a "you can't just walk into Mordor" warning.

  19. Re:oops by HeckRuler · · Score: 2

    Same convo between friends that stalk each other and AREN'T douchebags just trying to shut down conversations:

    Joe: So I was thinking of maybe taking a trip to Paris
    Bill: Yeah, I saw that post. You've got to hit up the Louvre.
    (Conversation about Paris ensues)

    Joe: So, Siri told me about that penguin, how's it going?
    Bill: Still finishing up, want to see it?
    Joe: Yes.
    (They go to garage)

    Bill: Any thoughts on the town referendum?
    Joe: No Bill, even in a made up contrived example, nobody wants to talk about town referendums. Now let's have at those burgers.
    (Bill casually poisons the apolitical sociopath's burger)

    Just because they're both informed about the other's activities doesn't mean they have to be bored of each other's activities. If they DID become bored with it, they could tell Siri to shut the hell up. Even WITHOUT knowing any of this, they could still be douchebags trying to halt conversation. Reading people's tweets doesn't somehow destroy your social skills.

  20. So ... by cascadingstylesheet · · Score: 1

    ... they scraped public data, and the problem is that they carelessly left it ... public?

  21. Re:Chickens coming home to roost by umghhh · · Score: 1

    I learned inglish from beavis & butthead on MTV as well as from the stories in Playboy (does that exist still?). Modern pr0n is not so good for learning language however - sentences are too short or I do not watch long enough.

  22. Link? by Miamicoastguard · · Score: 1

    Anyone got a link?

  23. Barely the tip of the iceberg. by Mr307 · · Score: 1

    The general public is barely aware of 1, 2 or 3 companies that have collected and used information from public and private sources because of the left wing faux outrage that Trump was involved with 1 of them.

    What are they going to say when they find out it's also LinkedIn and Twitter and every other 'free' service and more collecting/scraping/surveying/using/sharing/selling every shred of collected information to sell more advertising and/or create relationships for their own purposes?

    This was so easy to predict a long long time ago (hence many people have avoided these 'services' since day 1).