Slashdot Mirror


Craigslist Blocks Yahoo Pipes

Romy Maxwell posted a blog piece on Craigslist apparently shutting off access to Yahoo Pipes. Maxwell was working on a non-commercial, Pipes-based mashup, one of 2,111 projects using Craigslist as a data source. He sent Craig Newmark an invitation to the alpha test after a few rounds of friendly communication: "...as a rule of thumb, okay to use RSS feeds for noncommercial purposes." The apparent response, four days later, was for Craigslist to redirect any request with an HTTP referrer of pipes.yahoo.com to the Craigslist home page. Maxwell writes: "It's a sad day for me. I'm not too upset about my own project, as Flippity was already removing Craigslist as a data source. With the likes of eBay and Oodle not only providing open APIs but encouraging and rewarding developers, spending my time wrestling with Craigslist is just plain stupid and exhausting. I'm sure I'm not the only person to have come to that conclusion, and I wish it were different. ... If Craigslist wants to keep its doors shut to the world, so be it."
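
For readers unfamiliar with referrer-based blocking, the behavior described in the summary can be sketched roughly as follows. This is only an illustration under stated assumptions, not Craigslist's actual configuration: the function name route_request, the 302 status code, and the home page URL are all hypothetical, and the only detail taken from the summary is the pipes.yahoo.com referrer.

```python
from urllib.parse import urlparse

# Host named in the summary; everything else below is an assumption.
BLOCKED_REFERRER_HOST = "pipes.yahoo.com"
HOME_PAGE = "http://www.craigslist.org/"  # assumed redirect target


def route_request(headers: dict[str, str], requested_path: str) -> tuple[int, str]:
    """Return (status, location) for a request.

    Requests whose Referer header points at the blocked host are
    redirected to the home page; all others are served as requested.
    The header lookup is case-sensitive here for simplicity.
    """
    referer = headers.get("Referer", "")
    host = urlparse(referer).hostname or ""
    if host == BLOCKED_REFERRER_HOST or host.endswith("." + BLOCKED_REFERRER_HOST):
        return 302, HOME_PAGE  # hypothetical: the real status code is not stated
    return 200, requested_path
```

Because the check relies on a header supplied by the client, it only affects callers that send that referrer honestly.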

6 of 164 comments

  1. the rationale involved has already been explained by Anonymous Coward · · Score: 5, Informative
  2. waste of resources/traffic ... by Lazy Jones · · Score: 4, Informative

    Scraping other websites' content over HTTP is generally a huge waste of resources (and money) for that website's operator, so unless you can give him something of considerable value in return (like Google does - I'll gladly serve 4 million pages/day to their bots if I get 200k visitors through Google in the same time, visiting my website and not just looking at my content somewhere else), be prepared to get locked out. Naturally, something you consider "a cool feature" isn't necessarily the site owner's idea of sufficient compensation. Perhaps some day ISPs will pay websites for the traffic and bill their clients for it; then websites might react differently.

    --
    "I love my job, but I hate talking to people like you" (Freddie Mercury)
    1. Re:waste of resources/traffic ... by klossner · · Score: 5, Informative

      Yeah. Which is why he used RSS instead of scraping the web pages, and cached the data to avoid pounding the servers.

  3. Re:the rationale involved has already been explain by Trepidity · · Score: 4, Informative

    They're not making nearly the same revenue per employee, though, so there seem to be some diminishing returns. Craigslist brings in somewhere around $6 million per employee, while Microsoft brings in about $600,000, and Google about $1.1 million.

  4. Re:the rationale involved has already been explain by Al Dimond · · Score: 4, Informative

    Craigslist leverages the Internet to provide a hell of a lot of service to a hell of a lot of people without doing much work at all. They skim a little money from some of those people and say that's enough.

    Microsoft creates a lot of work for itself by making lots of new features and then convincing people that they need them. It's how they leverage their advantage as the world's largest software company, and the rest of the industry (and lots of people doing OSS) falls for it.

    Google is probably pretty much the same these days. The point is that these companies are worried about shareholder value first; they're worried about winning. That's why they make all this work for themselves. Craigslist just provides the service. Take it or leave it.

  5. Re:The reason is obvious by IamTheRealMike · · Score: 4, Informative

    In fairness to Craigslist, they have a pretty thorough anti-abuse system. If you read spammer forums (I do) you'll see that Craigslist learns reputation on IP blocks, ad content, and links, and forces phone [re]verification on anything that looks suspicious. The bar has been raised dramatically over the last 6-8 months, so they are trying. Beneath the humble covers is a pretty sophisticated anti-abuse operation.
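
As a footnote to klossner's reply above about using RSS and caching the data "to avoid pounding the servers": a minimal sketch of that idea, assuming a simple time-based cache, might look like the following. The class name CachedFeed, the 15-minute default lifetime, and the injectable fetch and clock parameters are all hypothetical choices for illustration; nothing here is taken from Maxwell's actual project.

```python
import time
from typing import Callable


class CachedFeed:
    """Serve feed bodies from memory until they are older than ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[str], str],          # does the real HTTP request
        ttl_seconds: float = 900.0,           # assumed lifetime: 15 minutes
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: dict[str, tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]                   # still fresh: no request is sent
        body = self._fetch(url)               # stale or missing: fetch once
        self._cache[url] = (now, body)
        return body
```

With a cache like this, many visitors to a mashup translate into at most one upstream request per feed per lifetime window, which is the load-limiting effect klossner is pointing to.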