Slashdot Mirror


Shirky On Umbrellas, Taxis And Distributed Systems

There's a good article from Clay Shirky talking about the similarities between umbrellas, taxis and distributed computing. And if you really want more P2P than you can shake a fork at, the folks at ORA have also released an excerpt from the upcoming Dornfest and Brickley book.


  1. Re:Where are the applications? by Sanity · · Score: 2
    Well, you have a point, although I don't think it would be possible to keep the aggregated information a secret if you really wanted to.

    --

  2. Where are the applications? by Sanity · · Score: 5
    My question with the whole distributed computation branch of the P2P bandwagon has always been one of "where are the applications?". The criteria for which applications would be appropriate for this seem to be rather limiting - these criteria are as follows:

    Firstly, the algorithm must be parallelizable. This means that it should be possible to split an algorithm which normally takes N time, across a number of, say P processors, and have it take less than N time, and ideally N/P time.
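    As a rough illustration of the N/P ideal, here is a minimal sketch that splits a toy job (summing squares) into P chunks. The function names and the toy workload are assumptions made up for this example, and the chunks are processed one after another here rather than on separate machines.

```python
def split_work(items, p):
    """Split items into p nearly equal chunks, one per hypothetical processor."""
    size, extra = divmod(len(items), p)
    chunks, start = [], 0
    for i in range(p):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def ideal_time(n, p):
    """Best-case run time when N units of work divide perfectly over P processors."""
    return n / p


items = list(range(1000))
chunks = split_work(items, 4)

# Each chunk could be handed to a different machine; only the small
# partial sums need to come back to be combined.
partials = [sum(x * x for x in chunk) for chunk in chunks]
total = sum(partials)

print(total, ideal_time(len(items), 4))
```

    The job qualifies because the partial results combine cheaply; an algorithm where each step depends on the previous one would not split this way.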

    Secondly, the algorithm must have minimal communication requirements. Rendering, for example, is parallelizable, however in most modern rendering applications each computer would need an entire description of the scene being rendered. This could be a huge amount of information, running into gigabytes, yet it would need to be distributed to every participant in the rendering process. Recall that in most distributed computation applications connectivity will be limited to a 56k modem which is only connected to the Internet intermittently. Even if you limit users to broadband, communication bandwidth is still a problem.
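    To put a number on the communication problem, here is a back-of-the-envelope sketch. The 1 GB scene size and the 1.5 Mbps broadband rate are assumptions chosen for illustration, and the calculation uses nominal line rates with no protocol overhead or disconnections.

```python
def transfer_hours(size_bytes, link_bits_per_second):
    """Hours needed to push size_bytes over a link at its nominal rate."""
    seconds = size_bytes * 8 / link_bits_per_second
    return seconds / 3600


ONE_GB = 10**9  # assumed scene description size

modem_hours = transfer_hours(ONE_GB, 56_000)         # 56k modem
broadband_hours = transfer_hours(ONE_GB, 1_500_000)  # assumed 1.5 Mbps line

print(f"56k modem: {modem_hours:.1f} h, 1.5 Mbps: {broadband_hours:.1f} h")
```

    Under these assumptions the modem needs well over a day per participant just to receive the scene, before any rendering starts.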

    Thirdly, the algorithm must be robust: if someone decides to screw things up and hacks their client to send back malicious data (as happened with Seti@home), they must not be able to invalidate the work that everyone else has done. Ideally there would be an easy way to validate the work done by each client in the system.
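    One possible way to validate work, sketched here as an assumption rather than as what any of these projects actually does, is to hand the same work unit to several clients and accept a result only when a strict majority agree. The function name is hypothetical.

```python
from collections import Counter


def accept_result(results):
    """Return the value reported by a strict majority of clients, else None.

    `results` holds the answers returned by different clients for the
    same work unit; answers must be hashable.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count * 2 > len(results) else None


print(accept_result([42, 42, 7]))  # one bad client is outvoted
print(accept_result([1, 2, 3]))    # no agreement, so the unit is rejected
```

    The price is redundancy: every unit is computed several times, which eats into the cheap cycles this whole approach depends on.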

    Now, I am not saying that there are no applications which conform to these criteria; for example, cracking crypto algorithms and processing information from space telescopes in search of intelligent life clearly work quite well. However, neither of them can really be used to make vast amounts of money. The only other thing I can think of is genetic algorithms, but again, whether there is a revenue stream there is an important question.

    Perhaps some of these distributed computation people have found a killer application for this technology, some of them certainly claim that they have, but I really wonder whether such applications will stand up to scrutiny on the grounds I outline above.

    --

    1. Re:Where are the applications? by Duncan3 · · Score: 3

      You forgot a fourth and more critical criterion, the one about which all the P2P companies keep saying "pay no attention to the man behind the curtain".

      Fourth, the company must not care about the data, algorithms, and results becoming public immediately. Available to any competitor or evil cracker who wants to mess with you.

      Forget the other 3, you will have a nearly impossible time finding anyone willing (stupid enough) to give you money and live with #4.

      Of course, some of us have known this for a very long time, commercial distributed computing was put to sleep in the 70's. But then, in the 70's VCs were smarter.

      --
      - Adam L. Beberg - The Cosm Project - http://www.mithral.com/
    2. Re:Where are the applications? by Royster · · Score: 2

      Firstly, the algorithm must be parallelizable.

      Not necessarily. Depending on the cost of cycles, it may be sufficient to use a less efficient approach that is not completely scalable.

      Secondly, the algorithm must have minimal communication requirements. Rendering, for example, is parallelizable, however in most modern rendering applications each computer would need an entire description of the scene being rendered. This could be a huge amount of information, running into gigabytes, yet it would need to be distributed to every participant in the rendering process.

      I do actuarial projections for a life insurance company. I have a set of assets (investments with future cash flows to the company) and liabilities (insurance policies with future cash flows). The liability cash flows influence what funds are available for investing (or dis-investing). Industry regulations require that I investigate the adequacy of the type and amount of the company's assets under different interest rate environments. The regulators want to make sure that even if interest rates and/or equity values spike up or drop down dramatically, the company will not become insolvent. The tricky part is that the liability cash flows are often dependent on the interest income that the assets can generate, and the interest income that assets can generate is dependent on the interest rate environment when each of the cash flows occurs.

      Because of the interrelatedness of the two portfolios, there are two ways I can go about dividing up this project. I can slice by time, calculating all of the cash flows that I need at a given time to determine whether there is cash to invest or assets to sell. This is the most efficient method, but it has high communications requirements.

      Or, I can project all of the liabilities over future times and get a series of liability cash flows which then imply a series of asset portfolios and interest rates and then iterate back and forth between liabilities and assets until the answers converge. (Typically on the order of 10 or so iterations and not hundreds or thousands). This is less efficient, but has lower communications requirements. If cycles are sufficiently cheap, it may pay to use a less efficient algorithm.
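      The back-and-forth iteration can be sketched with a toy model. Everything here is an assumption for illustration: a single period, a linear link between interest income and liabilities, and made-up parameter names and values. A real projection covers many periods and scenarios.

```python
def iterate_until_converged(rate, assets, base_liability, sensitivity,
                            tol=1e-9, max_iter=100):
    """Alternate between the liability and asset sides until income settles.

    Toy model (assumed): liability = base_liability + sensitivity * income,
    and income = rate * (assets - liability).
    """
    income = 0.0
    for iteration in range(1, max_iter + 1):
        # Liability side: cash flow given the current income estimate.
        liability = base_liability + sensitivity * income
        # Asset side: income given what is left to invest.
        new_income = rate * (assets - liability)
        if abs(new_income - income) < tol:
            return liability, new_income, iteration
        income = new_income
    raise RuntimeError("did not converge")


liability, income, iterations = iterate_until_converged(
    rate=0.05, assets=1000.0, base_liability=100.0, sensitivity=0.5)
print(liability, income, iterations)
```

      Each pass only needs the other side's summary cash flows, which is why the communication cost stays low even though the same work is repeated on every iteration.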

      Thirdly, the algorithm must be robust: if someone decides to screw things up and hacks their client to send back malicious data (as happened with Seti@home), they must not be able to invalidate the work that everyone else has done.

      That depends entirely on the incentives. SETI was vulnerable because there was a competition to rack up completed cells. If the incentives to participate are designed properly, there may be no incentive to hack the client.

      --
      I have discovered a truly marvelous sig, unfortunately the sig limit is too small to contain i
    3. Re:Where are the applications? by Shotgun · · Score: 2

      I agree totally. The author gave the example of needing to complete a quarterly report in a timely manner. Large companies buy mainframes to do quarterly reports, not because of processing power, but because of their throughput: huge data pipelines that make RAID-5 SCSI look like a straw. The actual processing is usually very small (add up the numbers we give you).

      The cost to buy and maintain the bandwidth needed to push the data out to distributed resources would be more than the cost of a mainframe.

      --
      Aah, change is good. -- Rafiki
      Yeah, but it ain't easy. -- Simba
    4. Re:Where are the applications? by ahfoo · · Score: 2

      I wrote to one of these companies. I forgot which one at the moment. But I was suggesting a plan that I think makes great sense and is a kind of processing already commonly done in a distributed manner --3D animation rendering.
      I'm sure I'm not the only one who loves setting up scenes with a gazillion meshes and complex camera shots and ray traced textures mirroring each other into infinity. The wireframe of a wild animation is within reach of many typical desktops, it's the rendering that you'll never get --especially at high res-- without your own CPU farm.
      So, my proposal to whatever company it was, was to allow artists to send in descriptions of their animation along with say a single screen shot and then CPU cycle donators could go to the site and decide which project they wanted to patronize.
      The reward? --not money but a free copy of the final project.
      In my mind, this is where the net can transcend conventional notions of economy. Heady stuff.
      But what about the money? Well, the organizing site would have to get by on ad revenues. But since it would be an entertainment site, that might not be too bad.

  3. Napster by Kyobu · · Score: 2

    How does wanting a song create supply of it on Napster? Once you get the song, I get it, because in the future you will supply the song. But if there are k songs available on Napster and you add your m songs, there are still k songs available to you, not k + m, because you already have the songs on your computer.
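    The k versus k + m point can be checked with sets. This is a minimal sketch with made-up song names, assuming my m songs do not overlap with the k already on the network.

```python
network = {"song_a", "song_b", "song_c"}  # k = 3 songs shared by others
mine = {"song_x", "song_y"}               # m = 2 songs I add

combined = network | mine                 # what everyone else now sees
new_to_me = combined - mine               # what joining actually gives me
new_to_others = mine - network            # what my joining gives them

print(len(combined), len(new_to_me), len(new_to_others))
```

    The pool grows to k + m for everyone else, but the part of it that is new to me stays at k.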

    --
    Switch the . and the @ to email me.
  4. Re:Will companies really see so much profit? by pheonix · · Score: 4

    I think that's precisely the problem. The things that we've found lend themselves well to distributed computing (SETI, cracking encryption) don't lend themselves as well to making money. What company wants to pay for either of the above two, let alone a lot of money?

    That's not to say that P2P is already doomed though. I don't think that it's a technical problem at this point, I think it's a business problem. Someone has to figure out a problem that has two attributes: It must lend itself to being more quickly solved via distributed computing, and it must be something with such a high demand that someone is willing to pay big money.

    It's very possible that P2P could take off...but I'm not holding my breath. Even if they solve the issue of "what problem is worth the money", there's still the problem of "who will let us use the cycles" and "how do we keep from getting cheated".


    -Jer
  5. What kinda math is that? by pheonix · · Score: 4

    I don't understand how the author came to the nickel per hour number.

    Sure, the cost of the machine boils down to (by his math) a nickel an hour, but that's not the same cost as the company would have to take on.

    A company would have to buy the system, hire the IT personnel, cover their benefits, store them, pay for the electricity, pay for the heating/cooling, pay for maintenance, parts if they break, warranties, etc. These (and more) are little things that a home user might not even consider when determining if it's "worth it", and they make the "break even" point much higher than a nickel per hour.

    I'd like to see the same breakdown done with some more accurate math.


    -Jer
  6. Another analogy by Shotgun · · Score: 2

    The ocean is full of gold, but no one has made a fortune off of it because the cost of collection exceeds the worth. The gold is dissolved in the water, and to get it you have to find a way to precipitate it out without getting all the other salts.

    There are a lot of unused cycles out there, but they are cheap and so finely dissolved that the extraction process isn't viable.

    --
    Aah, change is good. -- Rafiki
    Yeah, but it ain't easy. -- Simba
  7. RDF ?= Robotech Defense Force by nmarshall · · Score: 2

    what does the Robotech Defense Force have to do w/ music? hmmmm
    O you mean Resource Description Framework....

    i always mix the two up..


    --
    nmarshall

    The law is that which it boldly asserted and plausibly maintained..
    --Colonel Burr 1783
  8. Will companies really see so much profit? by stuyman · · Score: 5

    I think the article makes lots and lots of interesting points, but I don't really see how a company can expect to make enough money off of these spare cycles to, say, double my DSL capacity and pay my electric bill. If they get paid a penny an hour, and they get 21*24 hours, that's still only around 5 dollars, which is nowhere near enough to pay for what they want to give as an incentive.
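    The arithmetic, taking the penny-an-hour rate and the 21*24 hours at face value, works out as follows. The variable names are just for illustration.

```python
rate_cents_per_hour = 1  # a penny an hour
hours = 21 * 24          # the figure used above

earnings_dollars = rate_cents_per_hour * hours / 100
print(f"{hours} hours at a penny an hour = ${earnings_dollars:.2f}")
```

    That is 504 hours of cycles for about five dollars of revenue.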

    --
    Q:Doctor, how many autopsies have you performed on dead people?
    A:All my autopsies have been performed on dead peop
  9. MusicBrainz solves music metadata problems by Wesley+Felter · · Score: 2

    MusicBrainz aims to be an RDF schema for all kinds of music metadata, complete with unique identifiers for tracks, artists, albums, etc. It solves the "various artists" problem that tends to plague other systems. The main thing I don't like about it is the fact that it uses a custom protocol and query language, which shouldn't really be necessary since RDF is RDF is RDF.

  10. More to P2P than cycles by Wesley+Felter · · Score: 3

    I agree that "cycle-borrowing" apps are unlikely to be profitable, but there's a lot more out there. In particular, I think P2P content-distribution networks like Freenet and Mojo Nation have the potential to save lots of money.

  11. Agreed, little motivation to donate bandwidth by Ars-Fartsica · · Score: 3
    Computers may be faster than we need them to be, but for the foreseeable future there isn't enough bandwidth to support the casual sharing of media among home users. Most Americans will be lucky if they can get DSL/cable; some estimates put broadband in the home at 10% penetration at most. Even for the users who can get broadband at home, 1.5 Mbps (the max offered by most vendors) isn't enough to support seamless file sharing without a noticeable drain on bandwidth.

    Added to which, once we actually start paying for music downloads (it's inevitable), there will be demand for reliable downloads. Hell, if I'm paying real money per song, timeouts and crappy connections are unacceptable. Once money enters into the equation, I want the media in a timely and efficient manner.

    None of this matters in a future where everyone has fiber to the home, but we're at least fifteen years away from that being a reality for most citizens.

  12. User created metadata considered harmful by Ars-Fartsica · · Score: 4
    The only sites out there that make explicit use of the meta tag are, well, explicit! Any metadata in a web page that is authored by a human is going to be subject to rampant spoofing. Presuming search engines actually indexed metadata in a strict way, you could simply continually redefine your keywords and subject matter to reflect whatever you thought was the hot topic of the day. Presuming sites were indexed rapidly, webmasters could simply watch the news and use popular keyphrases ("presidential inauguration") to get their sites indexed as always being relevant.

    This is why search engines that work off of metadata typically give you porn links for almost anything, and why Yahoo can't be spoofed (their surfers actually visit the site to see what it's about).

  13. not enough of a return by sethgecko · · Score: 4
    Now imagine you owned such a machine and were using it to play Quake, but Popular Power wanted to use it to design flu vaccines. To compensate you for an hour of your computing time, Popular Power should be willing to offer you a nickel, which is to say 1/20,000th of $1,000, the cost of your device for that hour. Would you be willing to give up an hour of playing Quake (or working on a spreadsheet, or chatting with your friends) for a nickel? No. And yet, once the cost crosses the nickel threshold, Popular Power is spending enough, pro-rata, to buy their own box.

    This hits the nail on the head. I'm willing to install the RC5 client on my machines for several reasons:
    2. It's a project whose goals I more or less believe in. (SETI would be an even better match, but I ended up installing the dnet client first.)
    3. I already installed it. Once it's been configured and set to run on my FreeBSD and Linux boxen I can forget about it. It would be more trouble to disable it or find a new distributed project, install that, configure it, and get it running on all my computers.

    I think this article gets it right. The return for contributing my spare cycles, plus the effort to install and set up the clients, is not worth whatever change they are paying. Like the article says, if they pay a nickel per processing hour, it takes roughly 2.28 years to earn a thousand dollars if my system is running the client 100% of the time at full processor speed. (I have no idea how much these systems actually pay; I'm just quoting the article's example.) The actual amount earned would be much less, as I do various things with my system: burn CDs, play Quake, write papers, etc. The long-term return of pennies, or less than pennies, on the hour makes me say that it's not worth it. And I suspect that without some higher incentive, like the way distributed.net has turned crunching keys into a competition, most people just aren't going to take the trouble to sign up for these paid distributed services. To have enough computers to make some serious money, you would have to have enough money in the first place to make whatever they pay you small change.
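    The 2.28-year figure can be reproduced with a short sketch. The `utilization` parameter is an assumption added here to show how the payback stretches when the machine is busy with other things part of the time.

```python
def years_to_earn(target_dollars, cents_per_hour, utilization=1.0):
    """Years of donated cycles needed to earn target_dollars.

    utilization is the assumed fraction of each hour the client
    actually gets the processor (1.0 means all of it).
    """
    paid_hours = target_dollars * 100 / cents_per_hour
    wall_clock_hours = paid_hours / utilization
    return wall_clock_hours / (24 * 365)


full_time = years_to_earn(1000, 5)                    # nickel an hour, 24/7
half_time = years_to_earn(1000, 5, utilization=0.5)   # assumed 50% idle time

print(f"{full_time:.2f} years at 100%, {half_time:.2f} years at 50%")
```

    A nickel an hour means 20,000 paid hours for $1,000, and every hour spent playing Quake pushes the date further out.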

    --
    Be ot or bot ne ot, taht is the nestquoi.
  14. Re:weird bit by NevDull · · Score: 2

    Their capital expenditures for the box may be a nickel an hour, but that's not the cost to them to own, store, maintain, upgrade, supply A/C and AC, and do various other things necessary to gain benefit from the processor/storage.

    -Nev

  15. Another idea for payment of resources by tchuladdiass · · Score: 3
    The article makes mention that projects like distributed.net and seti@home are successful because participants donate their unused cycles for a project they feel good about. However, it would be rather difficult to get a large enough number of volunteers to sign up for tasks they are indifferent to, even if they get paid a small amount, because we are dealing with insignificant sums (to the participants).

    Therefore, I propose that projects such as Popular Power, etc., abandon the idea of paying individuals a few nickels for some amount of CPU processing, and instead pay the charitable organization of the individual's choice.

    For example, you could sign up your machine on, e.g., Team FSF, and for every X number of operations your machine computes for these distributed projects, a dollar would be donated to the FSF.

  16. Money, etc. by Alien54 · · Score: 3
    I found these parts fascinating [forgive the mild editing for the sake of clarity in this post]:
    Cycles, disk space, and bandwidth are resources that get provisioned [paid for] up front and are used variably from then on. The way to get more such resources within a P2P system is to change the up-front costs -- not the ongoing costs -- since it is the up-front costs that determine the ceiling on available resources.

    There are several sorts of up-front incentives that could raise this ceiling:

    • PeoplePC could take $100 off the price of your computer in return for your spare cycles.
    • AudioGalaxy could pay for the delta between 384- and 768-Kbps DSL in return for running the AudioGalaxy client in the background.
    • United Devices could pay your electricity bill in return for leaving your machine on 24/7.
    The money off the cost of a new machine is meaningless to me since I build my own. The delta between the two bandwidth rates is more interesting, but that difference only costs me maybe 10 bucks a month (if that).

    but the idea of someone paying my electric bill....

    I gotta admit that I can see the potential for abuse on this one.

    On the other hand, this comment tossed in at the end gives me the shivers:

    (Of particular note here is Microsoft, who has access to more desktops than everyone else put together. A real P2P framework, run by Microsoft, could become the market leader in selling aggregated computing power.)
    As a moment of paranoia sets in, I can see MS adding this element to their .NET "solution": as a part of participating in .NET, they own your spare CPU cycles, which they can then sell to someone else.

    I do not know what it is, but I always seem to have this moment of distrust whenever I read something involving MS.

    Then again, maybe the MS marketroids read Slashdot, checking it out for this kind of thinking, in order to get new marketing ideas that they can use.

    ;-)

    --
    "It is a greater offense to steal men's labor, than their clothes"