Slashdot Mirror


Distributed Computing Economics

machaut writes "In a ClusterComputing.org article, Jim Gray, director of Microsoft's Bay Area Research Lab, provides an interesting economic analysis for building distributed systems. When do you choose a grid over a cluster or a supercomputer? When does it pay off to move a task to the data vs moving the data to the task? He takes current hardware and networking costs into account to answer those questions."

17 of 130 comments

  1. Does that include electrical costs? by Anonymous Coward · · Score: 3, Insightful

    How much does it cost to keep hundreds of regular computers (with all their extra peripherals) crunching away vs. a specially designed computer/set of computers?

  2. Re:Of course I imagine it helps a lot when it's fr by DarthVeda · · Score: 1, Insightful

    Oh I see somebody beat me to the Seti@home first... oh well... off to cracking my measly 5 units a day.

  3. SETI@home by LordoftheFrings · · Score: 2, Insightful

    The article states that SETI@home has a whopping 54 teraflops of computing power. This is an unfathomable number of CPU cycles, and guess what, it is all used FOR FREE! This is a great example of how a community of users is willing to sacrifice something (unused CPU cycles and small amounts of bandwidth) to meet some great future goal (contact with extraterrestrials). Did I mention it is FREE? I wonder how much money researchers could save or use in a better fashion if they all used distributed computing instead of expensive (super)computers. Some already have done this, and I know that there are distributed efforts currently processing ways to fight cancer and AIDS.

  4. Summary... by Realistic_Dragon · · Score: 2, Insightful

    Greedy overcharging telcos make grid computing over the internet more expensive than traditional supercomputing, unless you can get people to pay for you (SETI).

    --
    Beep beep.
  5. Connect the Microsoft dots by Rosco+P.+Coltrane · · Score: 2, Insightful

    Microsoft and IBM tout web services as a new computing model - Internet-scale distributed computing.
    They observe that the current Internet is designed for people interacting with computers. Traffic on the future Internet will be dominated by computer-to-computer interactions.


    And that explains why Microsoft has suddenly declared war on spam : they have to free bandwidth for their own .NET message passing. Remember folks, Microsoft never does anything without a reason, and certainly never does anything for the good of anybody else but themselves.

    --
    "A door is what a dog is perpetually on the wrong side of" - Ogden Nash
    1. Re:Connect the Microsoft dots by Xentax · · Score: 2, Insightful

      A fine theory, but I disagree (and I have a nagging suspicion this is a troll, but either way, it bears answering).

      MS is getting on the anti-spam problem because it helps them, yes. But not for some theoretical future savings on bandwidth costs. They're doing it because taking an active role looks good to customers and investors, both of which are increasingly seeing spam as a real problem and not just something us techies talk about.

      Remember folks, publicly held corporations are *legally* driven by one thing, and ONLY one thing: Maximizing shareholder return.

      That they consequentially minimize other concerns, like the ecology, or customer satisfaction (when it stands in opposition to revenue/profit/market share/etc.), is hardly a surprising consequence. These issues are only worthwhile when the fallout of ignoring them costs more than it does to respect them (and to be seen to respect them). Just ask Best Buy about the economic value of good customer service.

      So, I agree to the extent that MS's motives for fighting spam are less than altruistic. But, as a public corporation, they have NO BUSINESS in benevolence; their shareholders could sue (and win) if they sacrificed shareholder value for some other purpose without very good reasons for doing so.

      Xentax

      --
      You shouldn't verb words.
    2. Re:Connect the Microsoft dots by Rosco+P.+Coltrane · · Score: 4, Insightful

      And how is this different from how you or I act?

      I don't know about you, but I make GPL software, I give it away for free and therefore I give time and money to the community, partly to pursue a certain idea of the computer industry I desire.

      In a way, it's just like people who run the Seti@Home client : they don't do it just "to get a free screensaver" like that Microsoft guy narrowly thinks, they also do it because they want to feel part of a greater, more noble effort than just getting rich quick.

      When was the last time Microsoft gave anything open-source or for free that didn't serve one of their short, medium or long term plans ? I mean, it's okay, they're there to make money and they admit it, there's nothing wrong with this goal as long as they try to achieve it morally and legally, but why should it be the same for everybody ?

      --
      "A door is what a dog is perpetually on the wrong side of" - Ogden Nash
    3. Re:Connect the Microsoft dots by Rosco+P.+Coltrane · · Score: 2, Insightful

      I have a nagging suspicion this is a troll

      FYI, I never troll.

      You may be right, they may do it to look good, or they may also do it to free bandwidth for their ever-increasing patches and to make .NET a viable product proposition, like I believe. But whatever the reason, I'm only saying they're doing it for a purpose, and people should cross-read declarations made by big corporation reps to find the motive behind their actions.

      --
      "A door is what a dog is perpetually on the wrong side of" - Ogden Nash
  6. Obligatory conspiracy theory by Anonymous Coward · · Score: 2, Insightful

    The last time I read an article from the Microsoft Research guys was in Communications of the ACM. The article was about media center computers (in the article, named Mbox) and digital consumer product consolidation/standardization. Of course there was no mention of Apple and just a brief acknowledgement of TiVo.

    As a strange coincidence, HP and others announced their media center PCs shortly afterwards, followed by Microsoft releasing XBox Live.

    Now the same Microsoft researcher is talking about grids and clusters? Expect a Microsoft cluster package soon...

  7. Re:s@h, et. al.. by jason0000042 · · Score: 3, Insightful

    Also, as long as people are still allowed to decide what runs on their own computer you will have to convince them that they should help you with your distributed computing task.

    SETI@Home worked so well because people want to know the answer. People are interested in the results. If you tried to do a distributed apple browning application nobody would download it.

    --
    i don't like my old sig.
  8. Re:Strange math by friendofafriend · · Score: 3, Insightful

    I'm sorry, I don't follow your maths here...
    There are 2678400 seconds in a month (assuming 31 days...), so that makes 2678400 Megabits transmittable in a month, or 334800 MegaBytes. Each dollar of your $100 buys 3348 MB, which is about 3.3 GB - same order of magnitude as the author suggests...
    Perhaps you meant 2678400Mb per month.
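    The arithmetic above can be checked with a short sketch. It assumes the figures implied by this thread, a 1 Mbps link rented for $100 a month and kept fully busy; the function name is made up for illustration.

```python
SECONDS_PER_MONTH = 31 * 24 * 60 * 60  # 31-day month, as in the comment above


def megabytes_per_dollar(link_mbps: float, monthly_cost_usd: float) -> float:
    """Megabytes a fully used link can move per dollar of monthly rent."""
    megabits = link_mbps * SECONDS_PER_MONTH  # megabits sent in one month
    megabytes = megabits / 8                  # 8 bits per byte
    return megabytes / monthly_cost_usd


if __name__ == "__main__":
    print(SECONDS_PER_MONTH)             # 2678400
    print(megabytes_per_dollar(1, 100))  # 3348.0
```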

  9. Re:Attack on IBM? by pcause · · Score: 2, Insightful

    No it isn't an attack on IBM or anyone else. Gray knows what he is talking about and his analysis is just fine. We all need to get past marketing hype and commercials and excitement about "the next big thing" and look at the reality of the numbers. The issue is: are we close to having the infrastructure for generalized "on demand computing"? Gray explains it so that anyone can understand the tradeoffs. Even your CFO, which is the key!

    It is a great article/analysis. Believe it and ignore the hype. Some day the hype may be true, and Gray even explains how to figure that out. But for now, only specialized applications are suitable candidates.

  10. Another data point by Alomex · · Score: 4, Insightful

    A few years back when Grid computing was all the rage we sat down with some investment partners and worked out the figures. We came pretty much to the same conclusion. The "average" commercial supercomputing application (pharma, oil drilling, simulation) would not benefit from "free" cycles on the network.

    Essentially, any commercial computation valuable enough to require that amount of effort can justify purchasing a hundred-thousand-node Beowulf cluster and running it locally. The reduction in network costs, the advantages of total control and tight security more than pay for the difference in computing cost.

    Non-commercial computations such as SETI will benefit from grid computing, and we expect to see more efforts along those lines (RSA, Mersenne, Stanford DNA). But remember, we were thinking about starting a business, and none of those pay for the services, so we moved on.

  11. Jim, the Internet Bubble has burst by mkc · · Score: 2, Insightful

    When SETI@Home spent $10^6 to get everyone to spend $10^8 on electricity alone, how was that a good deal? Have extraterrestrials sent a message that they're about to touch down with a vaccine for AIDS, a formula for cold fusion, a permanent end to unemployment, a sure-fire way to get good representation in government? Could we have spent the money more wisely, Jim?

    If Bill paid you folks to do something more than get technically-challenged investors excited, perhaps our software would work better. (And ASN.1 isn't that bad, by the way. Do I need everything going between my machine and the server to be verbose enough to read by hand? When I encode all my messages in XML, in how many cases will I miss the 10000 to 1 ratio just because the encoding is verbose?)

  12. true, but an explanation is in order. by Vellmont · · Score: 4, Insightful

    I think it's probably safer to say that seti@home has a huge surplus of computational power, and uses it to verify each result (though it's not strictly necessary to do so). With only one data source (Arecibo) you can only produce data so quickly, and once you have enough computational power to do the analysis in real time any extra is just surplus that can be used to verify. They did, however, later add some extra analysis to the data to take better advantage of the huge surplus of computing power they have.

    The important point though, is that for seti@home each individual workunit, while important, isn't critical to the whole project. If a small percentage of workunits aren't computed perfectly it's not catastrophic. In other words there's a certain amount of tolerance for inaccuracy. For a project like the OGR (Optimum Golomb Ruler) by distributed.net each workunit must be calculated perfectly, as the goal is to prove which ruler is the optimum one. If a workunit isn't verified you haven't really proven anything, since it's possible (and probably likely) that hardware failure produced an inaccurate result somewhere in the millions of workunits calculated. (Or perhaps a modified client produced inaccurate results). Other distributed computing tasks have different amounts of tolerance for inaccurate results.

    Your underlying point is a good one though. For some projects the need for integrity of the results is very high, so larger computing power may be necessary to verify each result.
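
    One way to picture "using surplus power to verify" is to hand the same workunit to several clients and accept a result only when enough of them agree. This is a minimal sketch under that assumption, not the actual scheme used by seti@home or distributed.net; `accept_result` and `quorum` are hypothetical names.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def accept_result(results: Sequence[Hashable], quorum: int) -> Optional[Hashable]:
    """Return the most common result if at least `quorum` clients reported it.

    Returns None when no result reaches the quorum, meaning the workunit
    should be handed out again.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


if __name__ == "__main__":
    # Two clients agree, one (faulty or modified) client disagrees.
    print(accept_result([42, 42, 17], quorum=2))  # 42
    # No agreement at all: the workunit is not accepted.
    print(accept_result([1, 2, 3], quorum=2))     # None
```

    A project that tolerates some inaccuracy can run with a low quorum or none; one that needs a proof has to pay for the redundant computation on every workunit.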

    --
    AccountKiller
  13. Re:The bad thing about distributed computing by cicadia · · Score: 2, Insightful

    It's not necessarily a bad thing; it just means that distributed computing is better suited to some problems than others.

    There's a whole class of problems which can take a tremendously long time to solve, but for which the solution, once found, can be verified very quickly.

    The distributed.net key-cracking contest was like this -- you don't have to double check every piece of work because once you've found the key, it is trivial to test it to make sure it's right. The OGR project works the same way, and I suspect that SETI uses a similar model.

    If it was true that you had to double-check everything then there would be absolutely no benefit to distributed computing. You'd be better off just building a supercomputer and doing everything just once.
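
    The "slow to find, quick to check" shape of key cracking can be sketched with a toy example. The cipher below is an illustrative stand-in (XOR with a SHA-256 keystream), not RC5 or anything distributed.net actually ran, and the 16-bit keyspace and block size are assumptions chosen so the demo finishes quickly.

```python
import hashlib
from typing import Optional


def encrypt(key: int, plaintext: bytes) -> bytes:
    """Toy cipher: XOR the plaintext (up to 32 bytes) with a keystream from the key."""
    stream = hashlib.sha256(key.to_bytes(4, "big")).digest()
    return bytes(p ^ s for p, s in zip(plaintext, stream))


def search_block(start: int, stop: int, plaintext: bytes, ciphertext: bytes) -> Optional[int]:
    """One workunit: try every key in [start, stop) and return the match, if any."""
    for key in range(start, stop):
        if encrypt(key, plaintext) == ciphertext:
            return key
    return None


def verify(key: int, plaintext: bytes, ciphertext: bytes) -> bool:
    """Server-side check: a single encryption confirms or rejects a reported key."""
    return encrypt(key, plaintext) == ciphertext


if __name__ == "__main__":
    known = b"known plaintext"
    target = encrypt(40000, known)  # the secret key is 40000 in this demo
    found = None
    for start in range(0, 2 ** 16, 4096):  # each block could go to a different client
        found = search_block(start, start + 4096, known, target)
        if found is not None:
            break
    print(found, verify(found, known, target))
```

    The search loop does tens of thousands of encryptions; `verify` does one. Note that this only covers a reported hit: a block that comes back empty still has to be trusted or re-run.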

    --
    Living better through chemicals
  14. Re:Anybody tried it? by JamieF · · Score: 2, Insightful

    If you look at the way data works in a cluster, it's pretty clear that spreading it across a big slow network is a bad idea.

    In a DBMS, if all accesses are reads, you basically can just cache the data in every node of the cluster and it's ultra fast. If it's a lot of data, partition it across a large number of machines so that they each cache a subset of the whole database, and direct client hits to the appropriate node.
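
    Directing client hits to "the appropriate node" can be as simple as hashing the row key. A minimal sketch, assuming a fixed list of cache nodes; the node names and the function name are made up for illustration.

```python
import zlib
from typing import Sequence


def node_for_key(key: str, nodes: Sequence[str]) -> str:
    """Pick the node that caches `key`, using a stable hash of the key."""
    # zlib.crc32 gives the same value on every machine and every run,
    # unlike Python's built-in hash() for strings.
    index = zlib.crc32(key.encode("utf-8")) % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    nodes = ["cache-a", "cache-b", "cache-c"]  # hypothetical node names
    for row_id in ("customer:1001", "customer:1002", "order:77"):
        print(row_id, "->", node_for_key(row_id, nodes))
```

    With plain modulo hashing, adding or removing a node remaps most keys; consistent hashing is the usual way to soften that.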

    The problem comes when you change something in an ACID compliant DBMS - you have to write to a transaction log on disk, then change the block in memory and write it to disk, then to the transaction log on disk again saying the transaction was successful. In a cluster it's worse, because you also have to tell all the other caches that something changed. Maybe you just say "invalidate the object with ID xxxx", maybe you tell it to flush all objects of that type, or maybe you send it the new one. You first get a lock on the row in the DB, then you change it (multiple physical disk writes), then you unlock the row and tell all the caches that they're out of sync.
    Multiply this times several nodes and N transactions/second and all of a sudden you're talking about serious bandwidth. If you can get multicasting going then the bottleneck becomes the write performance of the DB (all nodes still bottleneck on the ability to write transactions to the transaction log on disk), but the database itself can be partitioned, so you can go really far with this if you implement it correctly, and if your data happens to partition nicely like that.
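
    A back-of-the-envelope sketch of that multiplication, counting only the cache-invalidation messages the writing node sends. All the numbers in the demo (10 nodes, 500 writes per second, 200-byte messages) are assumptions picked for illustration, and the function names are hypothetical.

```python
def unicast_invalidation_bps(nodes: int, tx_per_second: int, message_bytes: int) -> int:
    """Outbound bits/second when each write is announced to every other node separately."""
    return (nodes - 1) * tx_per_second * message_bytes * 8


def multicast_invalidation_bps(tx_per_second: int, message_bytes: int) -> int:
    """Outbound bits/second when the network delivers one message to all nodes."""
    return tx_per_second * message_bytes * 8


if __name__ == "__main__":
    print(unicast_invalidation_bps(10, 500, 200))  # 7200000, i.e. 7.2 Mbit/s
    print(multicast_invalidation_bps(500, 200))    # 800000, i.e. 0.8 Mbit/s
```

    This leaves out locking, acknowledgements and the data itself, so real traffic would be higher.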

    So, for something like the gene stuff that was mentioned in the article, it's not unreasonable to think that a user might someday have the whole DB on his/her hard disk. But running a clustered RDBMS over DSL with lots of writes and cache management traffic would be uuuugly. I think that's why it hasn't really been done as a general case.

    For another example, look at SMP vs NUMA architectures. Inter-CPU bandwidth in SMP is critical because cache coherency has to be maintained; in NUMA performance is uneven depending on where the data is but it's still fast. Some problems need access to the whole data set, and others can be easily partitioned.