Slashdot Mirror


Robo-chattel? New Legal Challenge to 'Bots

milomilo writes "Extending on the eBay vs. Bidder's Edge case, the NY Times reports (free registration required) that a Manhattan judge has granted a preliminary injunction against Verio from using 'bots to harvest up-for-renewal prospects from Register.com's WHOIS. The theory's that bots use up a piece of the target system's resources, denying its use to the owner. (Question: would search engines be different, presumably because they also confer a benefit on the target by making it findable?)"

19 of 109 comments

  1. this is just another step in a long process.. by mcc · · Score: 4
    This whole thing is an interesting question, really... whether requesting something can be an attack.

    I.e., you nonmaliciously (meaning, it isn't a DOS, you're actually getting information) ask for large gobs of information off of some site, the way these bots did.. or the way a spambot might.. they call this "denying services", but still, it's a simple case of the questioner requests, the answerer replies. If it's "unauthorized use".. well, how can you talk about unauthorized use on a public server? How can these things, authorization and to whom, be implied on a public internet? Should it be the job of the requester to not go where they clearly shouldn't be, or the job of the requestee to keep them out?

    Or look at it in terms of a port scan. I request things from each of these ports, thus figuring out which are open (and thus vulnerable to attack). I've seen people try to prosecute this based on "unauthorized usage of machine".. well hold up, who said you had to authorize something? This person is just sending pings to ports, on a machine that by its presence on the internet you have implied responds to traffic. Why on earth would you need "permission" prior to using a system? If so, how would that permission be obtained? .. but of course none of this changes the fact that the port scan is almost always part of a malicious cracking attack.

    Or, let's say-- hypothetically-- there was a single-line javascript that, if accessed from a windows NT machine, would cause the kernel to be overwritten by 0s. If you put that up on a web page, would that be "hacking"? You didn't break the machine yourself; you politely ask the machine to break itself, and it complies. Is that your fault?

    But then, when you get down to it, all forms of "cracking" could be seen as requests. I request you process this block of information that just happens to cause a buffer overflow... you didn't have to process it, now did you? That last bit doesn't really sound reasonable.. you have to draw a line somewhere, you have to note somewhere that it's no longer a request but an attack. Somewhere, for the sake of sanity, you have to draw the line, and how do you do that? Intent? How do you prove intent in court? What's the difference between the slashdot effect and a DDOS, at an abstract level?

    But still, how the hell can you say it's illegal to ask for something because the questioned might give you an answer even though they don't want to...? That's where the law is heading, where it's been heading for a while, and that's completely absurd.

    There is no right answer here, is there?

  2. Robots and search engines by Masem · · Score: 3
    Ignoring the bad link...

    "Question: would search engines be different, presumably because they also confer a benefit on the target by making it findable?"

    The standard search engines, such as google, altavista, etc, know and obey robots.txt, which is the same as Register.com's policy of not allowing spambots to search through their site. If a robots.txt file is in place and the search engine continues to index the site anyway, I would say there's a good legal case there.

    Now, more interesting are tools that 'mirror' web sites; they are still using a resource that you've made publicly available, except doing it over a timeframe that is much shorter than a human can do it, which usually means more resources used up at the server end. These bots tend not to follow robots.txt rules, and are only defeatable by User-Agent blocking. If the above ruling stands, does it apply here?
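
    For what it's worth, the User-Agent blocking mentioned above amounts to something like the sketch below. The blocked substrings and the `is_blocked` helper are made up for illustration and are not taken from any real server's configuration.

```python
# Hypothetical substrings identifying mirroring tools; these names are
# invented for illustration and do not refer to real software.
BLOCKED_AGENT_SUBSTRINGS = ("examplemirror", "sitegrabber")


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent header matches a blocked substring."""
    agent = user_agent.lower()
    return any(token in agent for token in BLOCKED_AGENT_SUBSTRINGS)


print(is_blocked("ExampleMirror/1.0 (offline copier)"))  # True
print(is_blocked("Mozilla/4.0 (compatible)"))            # False
```

    The client picks its own User-Agent header, so a check like this only stops bots that identify themselves honestly.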

    Take it a step further: Ebay has taken action to stop meta-Ebay sites that index their site and make it easier than ebay's search engines to find things or to search multiple auction websites. Even though the information is publicly available from ebay, IIRC they still won, mostly because the information is still ebay's property and they didn't like it on other sites.

    Which all leads to an interesting question: when you click on a link, does that start a clock in which you have temporary copyright ability to download the information to your local computer, and after some time, that ability 'expires'? If so, sites that index or mirror without further authorization could find themselves in trouble...

    --
    "Pinky, you've left the lens cap of your mind on again." - P&TB
    "I can see my house from here!" - ST:
  3. Re:Nice link, Hemos by Jerf · · Score: 5
    I hate to jump on this bandwagon, but this is just way over the line Slashdot! You have a staff of people, a whole freakin' staff and you seem to spend less time on the homepage of your site than I do, all alone, on my weblog! In sheer people hours spent on the site, Katz appears to be kicking the ass of the entire rest of the Slashdot crew combined!

    What really ticks me off is that "The Old Media", through which many people still get their news, has latched on to Slashdot as "The New Media", meaning that Slashdot will be reflecting on my own efforts, and the efforts of anybody else trying to run a 'new media' style website. This is why I post this; Slashdot's flub-ups are personal and affect us all. The flub-ups affect people running new media sites (by tarnishing the reputation in the eyes of the Old Media press who doesn't care to dig past their original generalizations), they tarnish the reputation of Open Source (as they have been labelled the spokesperson of the Open Source movement by the same collection of media entities), and they tarnish the reputation of VA Linux. (Hey, anybody at VA listening? This is not good return on your investment!)

    Slashdot editors, wake up! You are not invincible. You can be replaced, and in Internet time, too. Please get some ethics, before you convince thousands or millions that the New Media doesn't have any!

  4. Re:Public Spaces by Richy_T · · Score: 3
    No, it's more like saying customers can't drive their cars up and down the aisle of the store or perhaps that people who are not employees of the store can't go through the door marked private. Ever see a sign which says "Shoes and shirt required"? The store has the right to control the manner in which people access the space.

    Just because the machine is connected to the public internet does not mean that the machine is open for anyone to use however they please. This is enshrined in UK law these days. (Note the gradual disappearance of "Welcome to hostname" from login prompts; it can be argued that it's an explicit invitation for hackers to enter your machine.)

    I mean, your telephone is connected to the public network but would it be OK for me to set up a bot to constantly dial your home to see if you'd dropped the price on the car you were selling?

    Rich

  5. Re:Nice link, Hemos by segmond · · Score: 3

    ... if they can't check the link, will they even read the article?

    --
    ------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind
  6. search engines by RussRoss · · Score: 3

    It's worth noting that search engines honor the robots.txt protocol, so any web site can easily opt out of being indexed. There isn't anything like that in WHOIS. If I remember right, ebay lists its auction items as off-limits for bots in robots.txt. I see that as the strongest distinction between search engines and the cases mentioned here.
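
    As a rough sketch of what "honoring robots.txt" means for a bot, here is a check using Python's standard `urllib.robotparser`. The robots.txt content, the bot name, and the URLs are invented for illustration; this is not ebay's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path is made up for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /auctions/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed(ROBOTS_TXT, "ExampleBot", "http://example.com/auctions/item123"))  # False
print(allowed(ROBOTS_TXT, "ExampleBot", "http://example.com/help.html"))         # True
```

    The check runs entirely on the bot's side. Nothing on the server enforces it, so it only works for bots that choose to ask.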

    - Russ

    1. Re:search engines by onion2k · · Score: 3

      First point, search engine crawlers only honour the robots.txt protocol if they're told to do so. A 'dishonourable' search engine could simply ignore the file and index everything in its path. Already the main engines are boasting 500million+ indexes; it's only a matter of time before they start resorting to underhand tactics to boost their numbers.

      Secondly, robots.txt is often a server level setup file. If you get some free space with the likes of AOL/Freeserve/Geocities you have no control over the indexing of your site. Additionally, some (albeit poor) ISPs don't offer configuration of this file. Whether the crawling of your site is the fault of the ISP, the search engine, or you would be a matter for further debate.

      Onion

  7. Nice link, Hemos by Ralph+Wiggam · · Score: 3

    Who would go to Hooters in Amsterdam? "Well, I can go smoke the best weed on Earth, go see a live lesbian sex show, boink two prostitutes at once....or I can go see chicks in small shorts and eat chicken wings." And who is the dorky guy in the corner of the bottom picture? Hemos?

    -B

    1. Re:Nice link, Hemos by American+AC+in+Paris · · Score: 5
      Who would go to Hooters in Amsterdam?

      ...well, it suddenly occurred to me that I would go to Hooters in Amsterdam. At least, that's what my employer would think if they ever decided to check the proxy server logs. While they're fairly cool about web browsing in general, they are decidedly less cool about employees looking at "objectionable material" at work. I guess I'll need to institute a policy of proofreading Slashdot's front-page content for them, to check for things like goatse.cx links...

      I'm really, really glad that the submitter didn't slip a really objectionable link in there. I'm also really, really pissed off at Slashdot for this kind of crap. This is total incompetence. (I'm not even taking into account the duplicate stories on the Chinese rocket launch in the Science section...)

      This kind of fsck-up at virtually any other major online content provider would be grounds for immediate dismissal for the employee in question, for crying out loud. READ YOUR DAMNED FRONT PAGE SUBMISSIONS!

      information wants to be expensive...nothing is so valuable as the right information at the right time.

      --

      Obliteracy: Words with explosions

    2. Re:Nice link, Hemos by PollMastah · · Score: 4

      This only proves that Slashdot really doesn't care to check a story before posting it. I don't see what's so hard about clicking on a link to see if it works? Or to see if it goes somewhere sensible? I mean, we're not even talking about checking facts here or anything. Why is it that something so basic as checking URLs seems no longer relevant to the Slashdot editors?! What are they doing now???

      Sorry for this rant. I hate the downward trend of Slashdot recently. This wrong link almost made me give up Slashdot forever... there are better, less crowded, less trolled places around that I think I'll move to.

      --

      Poll Mastah

  8. Clickable correct link by TheKodiak · · Score: 4

    For those of you not afraid of goatse.cx, http://www.nytimes.com/2001/01/12/technology/12CYBERLAW.html

    --
    -=Best Viewed Using [INLINE]=-
  9. Correct link by zorg77 · · Score: 4

    http://www.nytimes.com/2001/01/12/technology/12CYBERLAW.html

  10. I'll be damned. by American+AC+in+Paris · · Score: 3
    Wow. Slashdot got mega-trolled this time.

    I did notice, however, that the required registration at the "New York Times" was not free...

    information wants to be expensive...nothing is so valuable as the right information at the right time.

    --

    Obliteracy: Words with explosions

  11. 'Bots at Hooters? by UncleOzzy · · Score: 3

    Oh my god! The Hooters girls were bots all along? I feel so dirty!

    (And yes, the Hooters girls do use up resources on the target system, if you get my drift)

  12. I have just one question... by macdaddy · · Score: 3
    ...can anyone translate the text on that site for me? Oh hell who cares. The pictures are color!

    --

  13. Don't forget the implied inverse by Jerf · · Score: 3
    I do not 100% agree with these rulings, but don't fall into the trap of 100% disagreeing with them either.

    "Everybody must be allowed to access web resources" is a statement from the POV of the accessors. Consider that statement from the point of view of the server managers: "We must allow everybody to access our resources in any way they choose."

    Do you really want to make that statement? If you put up a public resource, must you allow people to abuse it if they wish? Or can you take actions to stop such abuse, esp. as it nearly always does real, if not always a lot of, damage? In the case of Bidder's Edge vs. eBay, eBay was suffering real slow-down of service, which affects its bottom line. Must eBay allow it?

    Perhaps the real danger is not so much the rulings per se, but the legal doctrines being used to make them: "Under the reasoning in the Register.com case, "you don't have to prove harm or show any evidence of harm," he said. "Harm will be presumed." He said that he fears the Register.com case will "spread like Kudzu" through the court system."

    At any rate, just recognize that things are somewhat more complicated than they may seem at first. It's tempting to oversimplify in either direction, but the truth is probably complicated.

  14. So... who actually clicked the link? by signe · · Score: 4

    This is amusing. Now we get to see who actually clicked the link, and who posted blindly without bothering to read the article.

    -Todd

    ---

    --
    "The details of my life are quite inconsequential..."
  15. Re:Does this mean... by Sartian · · Score: 3
    I'm afraid I cannot agree with you there. Bots can and do take up much more bandwidth AT ONCE than users do. I work for a search engine and this is something I know. The spiders (bots) we use at Lycos are very, very, very fast and if we are not careful we can bring an unsuspecting web-server to its KNEES as the spider recursively tries to fetch every document it is allowed as quickly as possible. One way we get around it is that we randomize our website documents list so that a spider isn't devoting all of its cycles to one website. We have been doing this a long time and know how to be "polite" when gathering data.
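
    The randomizing trick described above can be sketched roughly as follows. This is only an illustration under assumed names (`randomized_fetch_order`, `longest_same_host_run`, and the example hosts are all made up), not how the Lycos spiders actually do it.

```python
import random
from urllib.parse import urlparse


def randomized_fetch_order(urls, seed=None):
    """Return a shuffled copy of urls so requests to one host are spread out."""
    rng = random.Random(seed)
    order = list(urls)
    rng.shuffle(order)
    return order


def longest_same_host_run(urls):
    """Length of the longest streak of consecutive URLs on the same host."""
    longest = current = 0
    previous = None
    for url in urls:
        host = urlparse(url).netloc
        current = current + 1 if host == previous else 1
        longest = max(longest, current)
        previous = host
    return longest


# Hypothetical crawl list: 3 hosts, 50 documents each, grouped by host.
crawl_list = [
    f"http://site{h}.example/doc{i}.html" for h in range(3) for i in range(50)
]

print(longest_same_host_run(crawl_list))  # 50 when fetched in grouped order
print(longest_same_host_run(randomized_fetch_order(crawl_list, seed=1)))
```

    Shuffling only makes long bursts against one host unlikely. A bot that wants a hard guarantee would also need a per-host delay between requests.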

    Outside of the issue of bandwidth, there is the issue of profitability. Part of many websites' income is derived from banner ads. If a bot scours a website to harvest the content, it prevents the end users from seeing some of the advertisements they would normally have seen by exploring it on their own. One example of where this hurts is a Meta Search engine that trolls several search engines and produces a compiled list of search results. Search engines have millions of dollars invested in hardware, software, bandwidth and staff to make it all work. Every single query has a real monetary cost associated with it. Every free service has its cost. A Meta Search engine bot like that only does about 5% of the total work involved in producing the results or content that it displays. Now, a site like that makes its OWN profit from users with minimal money out of its own pocket (of which none goes to the companies doing most of the real hard work).

    I completely understand why companies would get cranky about someone repeatedly grabbing computationally intense data from their site and profiting from it as they suck money and resources away from the provider of said data.

    One way websites are able to track when people visit a site is "tracking gifs". These are usually very small 1x1 pixel images that give them a general idea of how many visitors a site is receiving. Reports are generated and they get PAID by advertisers based upon this info. Bots RARELY grab anything but content. If you go to a webpage with images embedded in it, your web browser individually requests (most of the time, barring cached data) each image. Since bots don't tend to request this graphical "fluff" intended for hyoomans, owners of the site notice an increase in site traffic and resource drains and a decrease in "ad impressions" when a bot "isn't polite". Yes, I do realize that some make revenue in other ways, but bots can use up resources faster than humans can regardless. With people your traffic usually scales slowly up or down. You can add or remove hardware to deal with the demand. Sometimes when a bot hits your webserver it is such a huge spike in requests for data it kills the server. Then all the legitimate users get cranky. Or, a more minor form is that it makes the s e r v e r sllluuuugggggiiish.

    Anywhoo... That's my $1.25 on the matter. Cheers. - Sartian

  16. This seems similar to other laws by skoda · · Score: 4

    Windows in a public building are obviously meant to be looked through. However, if you stood long enough, gazing through the windows of a local store, they could have you removed if there are "no loitering" laws. Even though you are using the sidewalk for standing and the windows for looking, as they were meant to be.

    Similarly, if you worked at one store and went to your competitor's, pen and paper in hand, and strolled the aisles noting their prices (so your store can meet/beat them), you might be asked to leave. Despite the fact that you are just writing down prices that are clearly there to be read.

    Finally, various retailers, esp. car dealers, place "No wholesaler or retailer" restrictions on their best sales, even though their products are meant to be bought and other retailers may want to do just that.

    It seems to me that analogous laws already exist. Just because something is available in the public realm it doesn't follow that anyone can avail themselves of it to any extent; at least not under current U.S. laws.
    -----
    D. Fischer