Slashdot Mirror


Judge Bars eBay Crawler

matty writes: "A judge has said that Bidder's Edge could no longer use its crawler to gather information from eBay. 'Even if its searches use only a small amount of eBay's computer system capacity, Bidder's Edge has nonetheless deprived eBay of the ability to use that portion of its personal property for its own purposes.' So what about Yahoo! and all the other search engines? Don't they use similar technology? Read the article and see for yourself." Or maybe it's not such a bad precedent; it'd be interesting if such a ruling helped discourage hard-drive searching by software which searches for "undesirable" content without your consent or knowledge.

21 of 168 comments

  1. An Anti-spam ruling! by www.sorehands.com · · Score: 3
    Spammers steal cycles, bandwidth, and disk space from POP3 and SMTP servers, so this ruling may apply.

    This is not a precedent!

    For it to be a precedent, it has to be ruled on by an appeals court, unless it is something that an appeals court does not usually rule on (e.g. a motion to remand).

  2. Missing the point. by carlfish · · Score: 3

    The "using CPU cycles" is not a new legal argument. Back when there were no specific laws against computer cracking, I believe one of the earliest convictions came from proving that a cracker "stole electricity" by using cycles on the machine they'd broken into.

    You'll first note, from following the provided link, that this was simply an injunction. There was no mention of damages. There is nothing in this that sets any precedent against webcrawlers, or deep linking. All the case says is that yes, the owners of a webserver may choose how that server is used.

    Bidder's Edge was doing something to eBay's site that eBay didn't want them to do. eBay asked them to stop. Bidder's Edge didn't want to stop. So it went to court. The court told Bidder's Edge that what they were doing constituted a use of the eBay system (their CPU cycles) against the wishes of its owner, and ordered them to stop.

    Now what, exactly, is wrong with that? Do we really want to set the alternative precedent, which is that as soon as you put a public resource on the net, you have no say in how that resource is used? I can imagine all the IRC script kiddies drooling at the thought of that one. "Hey! It's a public server, my clonebots were only a few hundred out of the thousands of users online, so I can't have been using a significant amount of the net..."

    Charles Miller
    --
    The more I learn about the Internet, the more amazed I am that it works at all.
  3. This is just wrong... by roystgnr · · Score: 3

    If EBay wants to stop a particular client from using its services, they hardly need to go to court to do it! They can weed out requests based on client IP, based on whether the client is a spider (robots.txt), based on click-thru agreements...
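A minimal sketch of the IP-based weeding-out mentioned above, in Python. The network ranges and the `is_blocked` and `status_for` names are hypothetical, picked only for illustration; nothing here reflects how eBay's servers were actually configured.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges,
# not addresses that any real crawler is known to have used.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


def status_for(client_ip: str) -> int:
    """Return the HTTP status a server might answer with: 403 if blocked, else 200."""
    return 403 if is_blocked(client_ip) else 200
```

A block like this only works while the crawler keeps a stable address; a client that rotates through proxies would need a different approach.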

    Basically, imagine that every day, I ask you for a few dollars and you give it to me. Should you then take me to court to force me to stop asking? No! You should just stop handing me your money!

    If EBay has such a system in place, and BiddersEdge is ignoring robots.txt, lying about its client type, or otherwise circumventing it, then fine, moderate this to (-1, Idiot)... but it's late, I'm too lazy to read the article before going to bed, and I've used EBay once or twice without them asking to make sure I wasn't a competitor first.

  4. Comprehensive solution by Kris_J · · Score: 4
    • No public/anonymous browsing - have to log on to see eBay content.
    • Make it against the usage agreement to use bots
    • Make the site bot-friendly either through optimised pages or a separate connection (ODBC-esque)
    • Enter into specific licensing agreements with 3rd parties to allow sorting, sifting and filtering of eBay content. If they breach the licence, pull their access.
    Easy. Then the courts don't have to rule that the public aren't allowed to use publicly accessible information if the provider doesn't like the way you look at them.
  5. Re:IANAL by hawk · · Score: 3

    I am a lawyer, but this is not legal advice. If you need legal advice, contact an attorney licensed in your own jurisdiction.

    > While IANAL, I have a friend that is and I think that giving out legal
    > advice (which can constitute answering legal questions as such)
    > without charging for it can be considered a violation of the Bar

    No; that is an ancient joke--though if you go back far enough, you find barristers' robes with a pocket in the back so that they could conveniently be given a payment--as it was improper/unethical/ungentle to ask--Esquire is a derivative of Squire, a class that wasn't supposed to work for a living.

    Anyway, free advice is legal, and so is free work (pro bono publico--for the public good).

    That said, anyone dispensing free advice in a public forum like this is nuts. There is *no* way to adequately gather enough information from a short posting to give competent advice (save in the most trivial situations). Additionally, proving what you did or didn't say is difficult. Whether you charge or not, you are still liable for the advice.

    Extreme example: voice on phone asks question, gets answer, does something, comes back and sues. You have no idea who the person was, or if this is the person you talked to. But you're on the hook, and your insurance carrier is less than happy. This is why I have a hard, no exceptions, policy that I only give advice to clients.

    On the other hand, I do frequently comment on legal issues around here--but never in a way that could be legal advice.

    Lastly, note that the legal questions that get asked are generally ones for which a proper answer would take several hours of research. I'm not going to do that every time a complete stranger asks. If someone wants to pay my hourly, sure--but I'm not going to do several hundred or a couple of thousand dollars of free work on a regular basis :)

  6. Consider Real World Scenarios by geekatlrg · · Score: 4

    This is entirely short-sighted, unrealistic, and fails to take into account any real world scenarios.

    The internet is a public place, no different from the local shopping mall, grocery store, public library, movie theater, whatever...

    When you go to the mall they expect that you will and won't do a number of things, commonly accepted criteria for activity in a public place (read: no shirt, no shoes, no service).

    On the internet nobody gives a damn if you're in your shorts, but suddenly you're not allowed to do any number of things that you could realistically do physically.

    For example, let's say I'm doing market research and I send 50 people to the local mall to run around and look at what kind of (product) is being sold, and how many of them, and how much they cost. This is perfectly legitimate, and legal since that information is publicly available. I probably can't go into the stock room and see how many of (product) are not on the shelves, but I can certainly just have a look around the store (just like anyone else).

    If we apply the latest internet precedents to this scenario I would NOT be allowed to do this without breaking the law. Suddenly I would be using that store's resources and denying them use of that resource for whatever reason they deemed more important. So I'm not allowed to go to the store if I'm just window shopping now?

    Publicly accessible resources are held up to a very high standard. Anyone can find out how much a store charges for (product); this is good for everyone: the store, the customers, the manufacturers... The same applies (or should apply) to the internet.

    As far as purely internet-related impact is concerned, can anyone who hosts a site look through their server logs and sue anyone who connects to their site too much? Or all those search engines that come through on a regular basis? Or anyone that pre-caches the site automatically (gee, this is even a feature in Internet Explorer... more MS trials?).

    My opinion: The internet is a public place, and fair use of a public place is already governed by a certain set of rules and regulations (at least here in the US). Let these rules and regulations do their job and stop creating "special regulation" for a situation that isn't radically different from anything else we humans do on this planet. Just because you do something on the internet doesn't mean it requires special regulation. -Gentry

    1. Re:Consider Real World Scenarios by gargle · · Score: 3

      For example, let's say I'm doing market research and I send 50 people to the local mall to run around and look at what kind of (product) is being sold, and how many of them, and how much they cost. This is perfectly legitimate, and legal since that information is publicly available.

      Not really. Try taking photographs at your local supermarket. You'll be stopped by security - I've had this happen to me before (and we were just taking a family photo). If you go around taking extensive notes on products and prices, I've no doubt you'll be stopped as well. They certainly have the right to prohibit certain activities on their premises, although I don't know that a general court ruling is needed or valid.

  7. Re:robots.txt by breser · · Score: 5
    Not really... Take a look at some of the other robots.txt files they have on various other machines. For example, listings.ebay.com:
    # go away
    User-agent: *
    Disallow: /
    I'd say that's everything.
  8. Re:Seems Like a Really Dumb Thing but .... by Darchmare · · Score: 3

    ---
    This is just plain stupid; if you have a page on your website which is viewable by the public then it is available for the public to download.
    ---

    Hrm. How far do you want to extend this, though?

    Couldn't it be said that someone launching a Denial Of Service attack by simply requesting documents at an extremely high rate of speed is 'viewing the documents' as expected?

    I do think that the judge's excuse was a little suspect - if it were server resources alone, you'd think that this second company would be saving them load by diverting their customers (and advertising dollars) elsewhere.

    I do agree that they don't deserve to win, though, by virtue of how they're stealing someone else's content. It's not always certain what kind of precedent this will set though...

    - Jeff A. Campbell
    - VelociNews (http://www.velocinews.com)

    --

    - Jeff
  9. Re:eek by B'Trey · · Score: 3
    It isn't quite that simple. ebay's servers are private property but they are also explicitly public access. That's their very purpose. If the public couldn't access them, ebay wouldn't have much of a business, would it?

    It isn't a question, really, of controlling access. It's a question of controlling how MUCH access, and to what use the info is put AFTER it's accessed.

    ebay intentionally and purposefully puts the information up on its servers for public access. It seems to me that doing so rather negates the claim that it has "control" over that access. If ebay wants to maintain control over who accesses its site, it should take steps such as requiring log-in to view the site, not just to place bids.

    Bidder's Edge is nothing more than a search engine of auction sites. ebay's real objection is that it hurts ebay's business by allowing users to compare prices with other sites.

    --

    "The legitimate powers of government extend only to such acts as are injurious to others." Thomas Jefferson.

  10. Re:Seems Like a Really Dumb Thing but .... by seldolivaw · · Score: 3

    I am rooting for them to lose because they are in effect *competing* with eBay for advertising dollars by *using* eBay's content.

    You mean the way Slashdot -- a technology news site / community / portal -- deep-links to articles on other news sources all over the web for its own content? Admittedly /. links to the sites, but if the content didn't exist neither would the bulk of /.

    Wired and C|net in particular would have /. for breakfast if this set the kind of precedent that seems to be happening.

  11. Hmmm... by Virtex · · Score: 3

    This could have interesting consequences. Most sales-oriented sites use scripts which your average spider will avoid (due to the *.cgi URLs). On the other hand, to reduce the system load, some sites re-generate their pages at regular intervals (say, every minute) instead of every time a user loads the page. Looking at ebay's site, it appears their URLs end in *.html. Perhaps what is needed here is something in the robots.txt file (does ebay have one of these?). But then again, not all spiders respect this file, so I don't know. If the web site designers don't take the necessary precautions to protect the private information on their site from spiders, how can the spiders know not to catalog them?
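The "re-generate every minute" idea above can be sketched roughly as follows, in Python. This is an illustration under assumptions, not a description of eBay's setup: the `TimedPageCache` class and its `render` callback are hypothetical names, and the page is rebuilt lazily on the first request after the interval expires instead of by a background timer.

```python
import time
from typing import Callable, Optional


class TimedPageCache:
    """Serve a cached page and rebuild it at most once per `interval` seconds."""

    def __init__(self, render: Callable[[], str], interval: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._render = render      # expensive function that builds the page
        self._interval = interval  # seconds a built page stays fresh
        self._clock = clock        # injectable so the cache can be tested
        self._page: Optional[str] = None
        self._built_at = 0.0

    def get(self) -> str:
        """Return the cached page, rebuilding it first if it is missing or stale."""
        now = self._clock()
        if self._page is None or now - self._built_at >= self._interval:
            self._page = self._render()
            self._built_at = now
        return self._page
```

With a cache like this, a spider that hits the same listing page a hundred times in a minute costs the server one render, which is part of why the load argument depends heavily on how a site is built.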

    --
    For every post, there is an equal and opposite re-post.
  12. Hmmm... by BJH · · Score: 3


    Well, seeing as how there are only "first posts" up so far (five of them, no less), I guess I'll take a shot at an on-topic post.

    The problem here is where do you draw the line at fair use? ebay is providing a publicly-accessible database; why should a search of that database (by robot or not) be considered an abuse of their servers? Of course, if the search puts so much strain on their servers that no-one else is able to access them (effectively a DoS), then an injunction would be reasonable, but this search doesn't seem to have caused undue server load.

    The point that timothy brings up is not really relevant here, I think - your HD is not a database offered for public access, and should thus be protected from undesired searching, whether or not this injunction is upheld.

  13. Seems Like a Really Dumb Thing but .... by The+Code+Hog · · Score: 5

    At first blush, it seems like this is a stupid ruling, mainly for the reasons the judge gave for making it. He claims that they are essentially stealing cycles from eBay's servers and this could slow down ebay's service and have a negative impact on their customers' experience.

    This is just plain stupid; if you have a page on your website which is viewable by the public then it is available for the public to download. That's the point of having a public website. Hey, I'm a customer of eBay's, am I guilty of using server cycles and slowing down the eBay website for other customers? You bet. eBay should secure the entire site and require authentication if they really want to pick and choose who can view.

    On the other hand, I think what Bidder's Edge does is really indefensible from an ethical standpoint and I am rooting for them to lose because they are in effect *competing* with eBay for advertising dollars by *using* eBay's content. If you view content from ebay through Bidder's Edge, that's advertising revenue eBay doesn't get which BE does. Seems really lousy.

    So it seems like the right ruling, but for totally the wrong reasons. The way the judge worded things it sounds like you could make a case for suing Yahoo, AltaVista, Google etc., if they dare to spider your site.

    What crap!

    --
    -- "Vote Democrat. Because the current crop of conservatives are just bugnut crazy."
    1. Re:Seems Like a Really Dumb Thing but .... by aufait · · Score: 3
      On the other hand, I think what Bidder's Edge does is really indefensible from an ethical standpoint and I am rooting for them to lose because they are in effect *competing* with eBay for advertising dollars by *using* eBay's content.

      What percentage of its income comes from advertising? And what percentage of its income comes from the commission on sales?

      It is the old story of the leader in the market trying to keep users locked into their product. AOL does it with Instant Messaging. Microsoft does it with their file formats.

      Most other Auction Sites welcome Bidder's Edge's robot. Why? It gives a wider exposure to their auctions meaning that more bidders will see it and raise the price of the item under sale.

      Why does eBay object? Because they are the market leader and don't want bidders to see alternate auction sites. If eBay prevails in this case, then bidders will have to search eBay themselves. eBay is betting that rather than check two separate sites, the bidders will stay on their site.

      What is the difference between Bidder's Edge automatically doing the searches and my running a program for me to automatically do the searches?

      --
      I feel like picking a fight with everyone who thinks they are right. - Rainmakers
  14. Worse than "Legislating from the Bench" by Phaid · · Score: 4

    This is absolutely awful. Republicans in the US always like to use the term "legislating from the bench" to describe rulings by liberal judges which overstep the bounds of the case being argued. This is actually worse -- the judge is essentially making up technological terms as he goes along. Using someone's resources that they can never get back? Give me a break, Bidder's Edge's crawler uses an infinitesimal amount of EBay's and other auctions' server capacity compared to the legions of "legitimate" EBay users. This judge is speaking from pure ignorance, and his ruling endangers everything the Web is based upon.

    Where do you draw the line? Are we only going to allow "manually" retrieving information from a Web site? What does that mean? Do I have to write code for each page I want to see? Are offline browser caches now going to be illegal since they automatically "drill down" into sites and grab several pages at a time for later viewing?

    When you create a Web site, you do it under the implicit assumption that people are going to connect to it and retrieve information. End of story. There is no "right" to only have your pages viewed by means you approve of. Every time _anyone_ connects to your site they use some of your resources, and doing it by automated means is no more onerous than doing it "manually".

  15. Well, here's one problem.. by Anonymous Coward · · Score: 3
    $ telnet ebay.com 80
    Trying 216.32.120.97...
    Connected to ebay.com.
    Escape character is '^]'.
    GET /robots.txt HTTP/1.0

    HTTP/1.1 404 Not Found
    Date: Fri, 26 May 2000 09:21:38 GMT
    Server: Apache/1.3.6 (Unix)
    Connection: close
    Content-Type: text/html

    404 Not Found

    Not Found
    The requested URL /robots.txt was not found on this server.



    Apache/1.3.6 Server at ebay.com Port 80

    Connection closed by foreign host.
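The check in the telnet session above can be expressed in a few lines of Python. This is only a sketch: it builds the same request bytes that were typed by hand and reads the status code out of a raw response, without opening a network connection. The function names are hypothetical.

```python
def build_request(path: str) -> bytes:
    """Build a bare HTTP/1.0 GET request, matching what was typed into telnet."""
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")


def parse_status(raw_response: bytes) -> int:
    """Extract the numeric status code from the first line of a raw response."""
    status_line = raw_response.split(b"\r\n", 1)[0].decode("ascii")
    # A status line looks like "HTTP/1.1 404 Not Found".
    _version, code, *_reason = status_line.split(" ", 2)
    return int(code)


def robots_txt_exists(raw_response: bytes) -> bool:
    """Treat only a 200 answer as proof that a robots.txt file is present."""
    return parse_status(raw_response) == 200
```

Under the robots exclusion convention, a 404 for /robots.txt is usually read as "no restrictions", which is why the missing file on ebay.com matters to the argument.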

  16. eBay is only hurting themselves by uebernewby · · Score: 3

    If I understand this correctly, the reason the judge forbade Bidder's Edge to crawl eBay is that eBay told Bidder's Edge not to do it but they did anyway. So it seems fair to me: if I ask you not to keep following me around and you keep doing it, I take you to court to get a restraining order slapped on you.

    Whether it's a smart move by eBay to do this is an entirely different matter, however. Apparently all the experts agree that in the near future all transactions on the internet will be mediated by robots, which search out the best deal on the product you want to buy. So as far as the consumer is concerned, it makes no difference if a book comes from Amazon or Barnes and Noble, he'll just get it where his robot tells him it's cheapest. Most probably, he won't even know it's from a certain retailer until he looks at the box in which it was sent to him. Probably he won't care very much even then. At this moment, there are already a number of services that operate on this principle: Bidder's Edge does it for auctions, and in NL there's a site called ElCheapo that lets you find the best deal on products ranging from airplane tickets to records.

    By banning a robot-service from their site, eBay is, in effect, shutting itself off from the way business will be transacted in the near future. They will not merely lose some ad revenue, as is the case now, but they will lose all their customers.

    If this really is what they want to do, I'd say let them.

    --

    News and bla for computer musicians: http://lomechanik.net/
  17. robots.txt by Citrix · · Score: 5
    I would have some sympathy with Bidder's Edge, but they don't follow the robots.txt file on eBay.

    Here is http://search.ebay.com/robots.txt:

    # robots.txt for eBay

    User-agent: *
    Disallow: /aw/listings/
    Disallow: /aw-cgi/
    Disallow: /aw-secure/
    Disallow: /cgi-bin/
    It isn't like eBay is disallowing access to everything: crawlers are allowed to index anything on www.ebay.com (no robots.txt) and whatever is not excluded on search.ebay.com. IMO, whether the judge knows it or not, he is upholding a standard, and that is a good thing.
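For anyone who wants to see how a well-behaved spider would read those rules, here is a small sketch using Python's standard `urllib.robotparser`. The rules are copied from the search.ebay.com file quoted above; the `ExampleBot` user agent and the two test URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rules as quoted from http://search.ebay.com/robots.txt in this thread.
SEARCH_EBAY_RULES = """\
# robots.txt for eBay

User-agent: *
Disallow: /aw/listings/
Disallow: /aw-cgi/
Disallow: /aw-secure/
Disallow: /cgi-bin/
"""


def allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt `rules` permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs: one under a disallowed prefix, one outside all of them.
print(allowed(SEARCH_EBAY_RULES, "ExampleBot", "http://search.ebay.com/aw-cgi/search"))
print(allowed(SEARCH_EBAY_RULES, "ExampleBot", "http://search.ebay.com/index.html"))
```

The file is purely advisory: nothing stops a crawler from skipping this check, which is the complaint being made about Bidder's Edge.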
    Citrix
    --
    Leknor
    http://Leknor.com
    "So many idiots, so few comets"
  18. Consequences for Censorware? by Jim+Tyre · · Score: 3
    As a lawyer, I am always leery of news reports, without reading the ruling itself. But *if* the ruling says what the report says, and if it stands, I can see how site owners could use it to prevent being scanned by censorware bots.

    No scan, no ban?

  19. Who is a reader? by DLG · · Score: 3

    As someone who has written search engines, both as spiders and as meta search engines, I found the existence of the robots.txt standard to be a real blessing. It was a way for me to easily follow rules that kept my site from simply being BLOCKED at a firewall. By further doing things like not going to a specific site twice in a row and other weird tricks, I could know that I was not abusing other people's sites with my code running automatically.

    In the case of a system that is essentially placing their own interface on top of another system, what I don't understand is why EBay didn't simply refuse the packets. It should not have been THAT difficult. No user has a right to see a site. I have blocked parts of my site from certain users during times when they perhaps inadvertently caused trouble. I have blocked use of my email system for the purposes of redirection. All of this I do with the technology available to me. The fact is that IF the improper usage of the site is impossible to detect by the owners of the site it isn't a technical problem. On the other hand, if the issue is that they are successfully using the site in a non-intrusive manner to profit, it is a business problem, and the problem is NOT with the offender but with the offended.
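One way to get the "not going to a specific site twice in a row" behaviour is to reorder the crawl queue so requests rotate across hosts. The Python below is a guess at the general idea, not the poster's actual code; `interleave_by_host` is a hypothetical name.

```python
from collections import deque
from urllib.parse import urlsplit


def interleave_by_host(urls):
    """Reorder urls round-robin across hosts.

    Consecutive requests go to different sites for as long as more than
    one host still has pages waiting.
    """
    queues = {}
    for url in urls:
        queues.setdefault(urlsplit(url).netloc, deque()).append(url)

    ordered = []
    while queues:
        # Take one URL from each host that still has work, in first-seen order.
        for host in list(queues):
            ordered.append(queues[host].popleft())
            if not queues[host]:
                del queues[host]
    return ordered
```

A real spider would probably also sleep between requests to the same host; the reordering alone only spreads the load, it does not cap it.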

    The problem is the notion that a site can control how it is viewed.

    If you tell me that it is illegal to look at a site without seeing its ads then we are in for a long hard battle. Browsers are being built with the ability to recognize advertising banners and exclude them from being viewed. Text based browsers have always avoided banner ads. The use of the internet for advertising sales will someday be impossible unless the advertising is completely the same as content, which it IS to a large extent. Content can be filtered, and should be. I look at slashdot with a WAP enabled phone. No ads are sent. Sooner or later someone will fix that somehow but in truth the ad will be just a link to another site, which is fine with me.

    In the future it is considered likely that everyone will have a semiautomated search engine customized for them personally. Near future. In the long run, automated agents will be the primary conduit between a user and data, the web browser will be a thing of the past, and the idea that people won't filter out advertising when it bothers them/wastes their bandwidth is nonsensical.

    The robots.txt file says, "Don't check these URLs for content" which basically means "Don't read me". In the future sites that restrict use of their pages won't be read. If the only revenue a site gets is through advertising, then they will have to find another way to deliver it.