Slashdot Mirror


EFF: Accessing Publicly Available Information On the Internet Is Not a Crime (eff.org)

An anonymous reader quotes a report from EFF: EFF is fighting another attempt by a giant corporation to take advantage of our poorly drafted federal computer crime statute for commercial advantage -- without any regard for the impact on the rest of us. This time the culprit is LinkedIn. The social networking giant wants violations of its corporate policy against using automated scripts to access public information on its website to count as felony "hacking" under the Computer Fraud and Abuse Act, a 1986 federal law meant to criminalize breaking into private computer systems to access non-public information.

EFF, together with our friends DuckDuckGo and the Internet Archive, have urged the Ninth Circuit Court of Appeals to reject LinkedIn's request to transform the CFAA from a law meant to target "hacking" into a tool for enforcing its computer use policies. Using automated scripts to access publicly available data is not "hacking," and neither is violating a website's terms of use. LinkedIn would have the court believe that all "bots" are bad, but they're actually a common and necessary part of the Internet. "Good bots" were responsible for 23 percent of Web traffic in 2016. Using them to access publicly available information on the open Internet should not be punishable by years in federal prison. LinkedIn's position would undermine open access to information online, a hallmark of today's Internet, and threaten socially valuable bots that journalists, researchers, and Internet users around the world rely on every day -- all in the name of preserving LinkedIn's advantage over a competing service. The Ninth Circuit should make sure that doesn't happen.

31 of 175 comments

  1. Wait just a minute... by kenh · · Score: 4, Interesting

    Using automated scripts to access publicly available data is not "hacking," and neither is violating a website's terms of use.

    If I'm reading this correctly, I'm not so sure I agree with that last bit, about "violating terms of use". So all terms of use are null and void (if my browser can find it, it's publicly accessible, no matter what I have to agree to in order to get access to it?)? For example, if I have a website that stipulates you must agree not to disseminate the information made available to you by agreeing to these terms of use, you remain free to ignore that agreement?

    Or are they saying that an automated script that can bypass a Term of Use agreement isn't hacking?

    --
    Ken
    1. Re:Wait just a minute... by ColaMan · · Score: 5, Insightful

      If:

      I can send a simple http request to your server, and

      Your server sends me the information without doing its homework, then

      Sucks to be you.

      Don't want your information to be scraped? Have it behind a login - free or otherwise - then ban accounts that are slurping down 10,000 pages a day.

      Ohhhhh then it wouldn't be easily indexed by search engines and thus findable by the general public and your site would fade into obscurity. What to do!? Courts to the rescue, it seems!

      --

      You are in a twisty maze of processor lines, all alike.
      There is a lot of hype here.
    2. Re:Wait just a minute... by mycroft822 · · Score: 4, Insightful

      I think they are only making the argument that you can't charge someone with felony hacking because they are accessing the information you make publicly available in a way you don't like.

    3. Re:Wait just a minute... by Anonymous Coward · · Score: 2, Informative

      using your login and ignoring Terms of Use is A-OK

      no, because at that point we are no longer talking about public information

    4. Re:Wait just a minute... by Desler · · Score: 2

      Nope, because that’s no longer publicly available information. Do try to keep up with the argument the EFF is actually making, not your strawman version.

    5. Re:Wait just a minute... by greenwow · · Score: 3, Funny

      CNN confirmed that looking at Wikileaks is a crime, and they helped the public by informing us many times of that. If they hadn't, then more people probably would have been put in prison for reading the DNC leaks and voted for Trump. That would have been horrible.

    6. Re:Wait just a minute... by anegg · · Score: 2

      A website that provides public information (no login) could use a rate limit to control how much of its resources can be commandeered by a third party. A website that provides information only after a login (and has terms of service for same) can revoke the login privileges if the terms of service are violated. Neither case seems to warrant aligning violation of terms of service or slurping too much data with committing a felony hack, just as trying to take too many brochures from the distribution rack doesn't warrant a felony theft charge.
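      For what it's worth, the rate-limit idea can be sketched in a few lines. This is a minimal per-client token bucket, not what LinkedIn or any particular site actually runs; the class name, the capacity, and the refill rate are all made up for illustration.

```python
import time


class TokenBucket:
    """Per-client token bucket (hypothetical helper for illustration).

    Each client may burst up to `capacity` requests, after which it is
    limited to `refill_per_sec` requests per second.
    """

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        # client id -> (tokens remaining, time of last update)
        self.state = {}

    def allow(self, client_id):
        """Return True if this client's request should be served."""
        now = self.clock()
        tokens, last = self.state.get(client_id, (self.capacity, now))
        # Refill for the time elapsed since the last request, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        if tokens >= 1:
            self.state[client_id] = (tokens - 1, now)
            return True
        # Out of tokens: refuse the request (e.g. answer with HTTP 429).
        self.state[client_id] = (tokens, now)
        return False
```

      A server would key `allow()` on something like the client IP for the no-login case, or the account name for the login case, and simply refuse the request when it returns False. No felony charge required.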

    7. Re:Wait just a minute... by Cajun+Hell · · Score: 2
      [This is my own take; I'm not with EFF]

      For example, if I have a website that stipulates you must agree not to disseminate the information made available to you by agreeing to these terms of use, you remain free to ignore that agreement?

      You need to get whoever/whatever reads the data to agree.

      If you send the data before they agree, or without finding out whether or not they agreed, then it wasn't really terms, was it? Did you actually stipulate, or did you just plan to do so and then not follow through?

      Merely wishing for certain terms isn't good enough. Merely offering certain terms (but then sending the data anyway, whether or not agreement happened) isn't enough.

      Doing those things but not actually getting agreement, isn't any different than me putting terms at the bottom of this comment. If you didn't agree to the terms but I send you this comment anyway, then they weren't terms. It's just like printing a EULA and putting it inside a box, and pretending that whoever opens the box, reads the EULA and signs it and mails it back to you. But what if they don't?

      By reading this comment, you agree to give me a dollar for my valuable time (ha!!) and you also agree that this was the most insightful thing you read this month.

      --
      "Believe me!" -- Donald Trump
    8. Re:Wait just a minute... by suutar · · Score: 2

      You're assuming they had to log in to access the information in question. The impression I had was that they were accessing data that didn't require a login. The materials I've found are not extremely clear on that point, but it does specify that hi-Q was accessing the publicly available portions of LinkedIn's site, which to me implies that it's the parts that don't need a login.

      If a login was required to get to the data then I agree with you, but I don't think it was.

    9. Re:Wait just a minute... by Trongy · · Score: 2

      It also reflects badly on Microsoft Corporation, the owner of LinkedIn.

    10. Re:Wait just a minute... by Obfuscant · · Score: 2

      And leave a link on your site for search engines you don't know about to request an exception.

      The "automated" "homework" that someone else claimed the website needed to do is already a standard. There's a "robots.txt" exclusion file that is a standard way of specifying what a robot is allowed to do and not allowed to do.
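      As a rough illustration of how little work it takes for a robot to honor that file, here is a sketch using Python's standard `urllib.robotparser`. The robots.txt content, the bot name, and the example.org URLs are invented for the example; a real crawler would download the site's actual file instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /tides/
"""


def make_parser(robots_txt):
    """Build a parser from robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = make_parser(EXAMPLE_ROBOTS_TXT)

# A well-behaved bot asks before every fetch and skips anything disallowed.
for url in ("https://example.org/tides/today", "https://example.org/about.html"):
    if parser.can_fetch("ExampleBot", url):
        print("ok to fetch:", url)
    else:
        print("skipping:", url)
```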

      It used to be very poor netiquette for a robot to ignore that file. Now it seems that the attitude is fuck the website operator, he's got to hide his stuff behind closed doors to keep abusive robots from taking his website down. It's A-OK to do anything that a website physically allows you to do, despite any explicit restrictions.

      One fine morning I found my website unresponsive and essentially broken. A fine robot operator thought that he would index and put into a search engine of some kind the output from a web-based tide prediction program. This was a program that took about 20 seconds to calculate the predicted tides at a certain location at a certain time, and then send back a graph with some text.

      To make it convenient for users, it also had buttons for "tomorrow" and "yesterday", and a menu for selecting other locations. This helpful indexer was making "today" and "tomorrow" requests every ten seconds, for a page that takes 20 seconds to create. I think you can probably guess the obvious result. Should the robot wait until the current request is finished before making the next one? Fuck no, that's too slow and makes too much sense. If the server cannot keep up, fuck it.
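      The fix on the robot's side is not complicated. A minimal sketch, assuming a blocking `fetch` callable supplied by the caller (the function name and the ten-second default are my own choices, not anything that indexer used): finish one request before starting the next, then back off for at least as long as the request took.

```python
import time


def crawl_politely(urls, fetch, min_delay_sec=10.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Fetch URLs strictly one at a time (hypothetical helper).

    A new request is never started until the previous one has finished,
    and the pause afterwards grows with how slow the server was.
    """
    results = []
    for url in urls:
        started = clock()
        results.append(fetch(url))  # blocks until the server has answered
        elapsed = clock() - started
        # Wait at least as long as the request took, and never less than
        # min_delay_sec, so a slow server gets more breathing room.
        sleep(max(min_delay_sec, elapsed))
    return results
```

      With a page that takes 20 seconds to build, this loop ends up making one request every 40 seconds instead of piling a new one on every ten.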

      Robots.txt? Who cares, huh? "Please don't abuse my site by running a robot against it" means nothing to some people. It's a shame that we can't have nice things.

      If you hang your laundry out in some place accessible to the public, just accept that people will be able to take snapshots of it.

      If only "take[ing] snapshots" didn't cost the laundry owner time and money and prevent others from being able to glance at the stuff, your analogy would be accurate. People taking snapshots is not, of course, the issue. It's the automated systems that load the servers into a halt.

    11. Re:Wait just a minute... by viperidaenz · · Score: 2

      They sucked you in to their free service, before facebook was popular. MySpace wasn't really appropriate for work colleagues and other professional relationships.

      They then made a mobile web site and redirected all mobile users to that site, to keep you investing your time and using their service.

      Then they removed the mobile website and redirected you to install their app, which you can't install without giving it permission to access all your contacts.
      When you log in to the app, it uploads all your contacts.

    12. Re:Wait just a minute... by omnichad · · Score: 4, Insightful

      if I'm reading this right (and I may not be), using your login and ignoring Terms of Use is A-OK.

      You're reading it wrong. Using your login and ignoring terms of use is a breach of contract (albeit a unilateral EULA). It is not and should not, however, be considered felony computer hacking under the CFAA.

    13. Re:Wait just a minute... by rtb61 · · Score: 4, Informative

      Dipstick, I can freely ignore all terms of service you specifically do not get me to agree to, and by law that means specifically. You must actively seek my agreement and obtain it, prior to claiming I agree to anything. All you can do is deny service; you cannot make any claims beyond that. Otherwise, numbnuts, I could put a claim below the fold that to read anything above the fold means you agree to pay me a million dollars. You must actively seek actual agreement to terms of service prior to making claims; you can only deny service, nothing more, not make claims for providing a service. You are clearly too wrapped up in the bullshit of post-purchase end user licence agreements, which are illegal in the majority of countries and only legal in the US because of corruption and bias towards corporations. It's like the old Reader's Digest bullshit of sending you stuff, claiming you bought it and you owe them money if you did not send it back. Nope, a lie: they have no right to claim service off you, they sent it to you for free, they gifted it to you. Same as the internet: unless you actively seek agreement and then refuse service if agreement is not achieved, then you cannot claim payment for accessing your service.

      --
      Chaos - everything, everywhere, everywhen
  2. robots.txt by Anonymous Coward · · Score: 4, Insightful

    Shouldn't a "good bot" abide by https://www.linkedin.com/robots.txt?

    1. Re:robots.txt by Monster_user · · Score: 2

      Yes, a "good bot" should follow robots.txt. But failing to implement a standard is not "hacking".

    2. Re: robots.txt by Monster_user · · Score: 2

      "Failing to implement" means it was a failure, not an intent to circumvent or violate.

      Volkswagen's violation was intentional.

      A "robots.txt" file, by contrast, is unknown to most users and intentionally hidden from them. It is intended to be visible only to "bots" that are actively looking for a "robots.txt". As bots are not sentient, they must be programmed by an individual who is aware of the existence and purpose of "robots.txt" and understands it to be more than a request.

      Though, if "robots.txt" isn't required to be understood to be more than a request, then the same must be applied for "Do Not Track" settings in a browser. Thus any website which is aware of "Do Not Track", and violates DNT, would be liable to criminal charges.

  3. Fuck LinkedIn by Anonymous Coward · · Score: 2, Funny

    As far as I'm concerned, LinkedIn themselves are guilty of massive fraud and deception, by tricking users into providing email contacts so that LinkedIn can send invite spam supposedly from the user. It was a carefully designed "dark pattern" to increase their userbase early on.

    Of course, by the time they eventually got sued over this, they were big enough to shrug off the financial penalty and keep making money off all the data they had collected illegitimately.

    LinkedIn is a socially malignant business and deserves to be laughed out of any court for trying to use the rule of law to their advantage.

  4. Who's a good bot? by Chelloveck · · Score: 4, Funny

    Who's a good bot? You're a good bot! Yes you are. YES YOU ARE!

    --
    Chelloveck
    I give up on debugging. From now on, SIGSEGV is a feature.
  5. Arrest records... by b0s0z0ku · · Score: 5, Interesting

    Let's use a different example. Arrest records and mugshots on police agencies' websites. Let's say Jane Doe, born 1/1/1970 got arrested for a particularly heinous crime. Murder, or robbery at gunpoint.

    Six months later, a court ruled her not guilty. She was able to petition to have the public arrest record on the Yoknapatawpha County Sheriff's office website deleted.

    However, in the interim, it's been scraped and archived by database companies using the data for employer background checks. Every time she applies for a job with a large employer, her application either gets round-filed, or she has a lot of explaining to do.

    What's worse, in the state of Winnemac, there are six Jane Does with that same birthday, all of whom have the same record in their background check database...

    Does information still want to be free?

    1. Re:Arrest records... by b0s0z0ku · · Score: 2

      See, I disagree. I think invasion of privacy by corporate entities should be strictly punished -- whether it's retention of data past strict legal limits or when a user specifically opts for account deletion, mass surveillance, or dissemination of inaccurate or prejudicial information affecting people's ability to earn a living. How is someone going to sue if they don't have a job because of something incorrect being in a database used by employers?

      Violate data retention laws? In an ideal world, one lash in the arse or a day in the pillory per instance. (Sadly, jail or steep criminal fines are the only real options here.)

    2. Re:Arrest records... by Desler · · Score: 2

      The terms of use typically prohibit distribution or mass downloading. With good reason.

      There are no such terms of use on public records.

      I'm generally against looser laws (drugs, prostitution between consenting adults), but I think privacy should be sacrosanct. I'm all for throwing the book at corporate entities that violate people's privacy,

      But again, public record data is not private. Do you not understand what the term “public” means?

      The EU's "right to be forgotten" is a good thing in an age where privacy is slipping away.

      I’m not arguing against the right to be forgotten. I’m arguing against stupidity that claims that accessing public record data is hacking. The “right to be forgotten” makes no such conflation. Like I said, she should use the civil court system to address the issue, which is exactly what the EU’s “right to be forgotten” process entails. You do realize that the person has to file a civil suit under the rules, right? It’s not a criminal prosecution.

    3. Re:Arrest records... by Desler · · Score: 2

      What would be the social cost of penalizing the mass distribution of such information?

      The social cost is that someone down the line will then begin to use this new authority to attack and punish people for purely political reasons. Are you really so naive as to not see that? Public record data is in the public domain which means it’s free for anyone to share. You sound like quite the fascist.

    4. Re:Arrest records... by rogoshen1 · · Score: 3, Insightful

      Every single man, woman, and child in the US has heard the phrase "innocent until proven guilty", and look at the effectiveness of that caveat.

    5. Re:Arrest records... by Attila+Dimedici · · Score: 2

      I am quite confident that if there is a background check company in the U.S. which is using records archived by a non-government organization (for more than a vanishingly short period of time) to provide employers with arrest record information, they are currently facing, or will soon be facing, a class action lawsuit for Civil Rights violations. They are also likely being investigated by, or soon to be investigated by, both the federal Civil Rights Commission and various state equivalents.
      Perhaps you are unaware of the controversy around companies using criminal records as a basis for declining to employ someone, but there are places in the U.S. where it is illegal for an employer to ask if a prospective employee has been convicted of a crime because this disproportionately affects African American applicants. If refusing to hire someone who has been convicted of a crime is considered racist in some jurisdictions, think how they would react to an employer refusing to hire someone because they had been arrested without being convicted. And while employers might be able to avoid being a target, someone would notice the company providing the background checks with that information (and any reasonably competent attorney could write court filings which would shut them down almost immediately).

      --
      The truth is that all men having power ought to be mistrusted. James Madison
    6. Re:Arrest records... by b0s0z0ku · · Score: 2

      A lot more common in the Latino and Black communities, where the range of surnames tends to be smaller.

    7. Re:Arrest records... by b0s0z0ku · · Score: 2

      (1) You have too much faith in the American system. Cute.
      (2) They won't refuse to hire her -- they'll just ignore her resume before she ever gets called for an interview based on background check data. It's only grounds for her to sue if she knows about the policy. Company's excuse would be they never got the resume.

  6. Re:Tragedy of the commons, Internet edition by Desler · · Score: 2

    Put the information behind a free login or a paywall. Or sue them in civil court instead of abusing criminal statutes that were never meant to apply to publicly available information.

  7. Figures by Ol+Olsoc · · Score: 2
    LinkedIn, who want you to violate your TOS by giving them the password for your email account so they can mine it.

    Seriously what kind of idiot buys into an outfit that has as a basis of operation, asking for something that in most places will get you fired?

    I started to sign up, and when they asked for my password it was 1FuckYouLinkedin!

    --
    The shepherds did so well protecting the flock that the sheep no longer believed that wolves existed.
  8. If they aren't using clever tricks to get it by raymorris · · Score: 2, Insightful

    I'm thinking LinkedIn is wrong here, but a simple, clear-cut, and correct statement of public policy is more difficult than it first appears.

    "accessing publicly available information" sounds pretty clear and simple, but the more I think about it, the murkier it becomes. Suppose in each of the following scenarios the data is by the owner's terms not to be accessed by bots and:

    A) The system pops up a user/ password dialog before allowing access. User "admin" and an empty password works

    B) The system pops up a user/ password dialog before allowing access. User "admin" and password "password" works

    C) The system pops up a user/ password dialog before allowing access. User "admin" and password "correct horse battery staple" works

    D) The system pops up a user/ password dialog before allowing access. Sending 17,000 requests each with a password that consists of a million null bytes followed by carefully crafted machine code to overwrite memory sometimes works

    The thing is, ANY data that has been hacked over the internet was accessible to the public, if the public tried hard enough and was clever enough in defeating access control measures. That makes it difficult to legislate a bright-line rule.

  9. Re:Good for "whom," exactly? by kenh · · Score: 2

    How do you think search engines work?

    Are you trying to claim that one-fourth of all traffic on the web is search engines crawling the network? Doesn't that seem like a lot of traffic?

    That's like saying one-fourth of the cars on the road are "Google Cars" updating Google's Street View database.

    --
    Ken