Slashdot Mirror


Grid Computes 420 Years Worth of Data in 4 Months

Da Massive writes with a ComputerWorld article about a grid computing approach to malaria. By running the problem across 5,000 computers for a total of four months, the WISDOM project analyzed some 80,000 drug compounds every hour. The search for new drug compounds is normally a time-intensive process, but the grid approach did the work of 420 years of computation in just 16 weeks. Individuals in over 25 countries participated. "All computers ran open source grid software, gLite, which allowed them to access central grid storage elements which were installed on Linux machines located in several countries worldwide. Besides being collected and saved in storage elements, data was also analyzed separately with meaningful results stored in a relational database. The database was installed on a separate Linux machine, to allow scientists to more easily analyze and select useful compounds." Are there any other 'big picture' problems out there you think would benefit from the grid approach?

34 of 166 comments

  1. Excellent by President_Camacho · · Score: 5, Funny

    The search for new drug compounds is normally a time-intensive process, but the grid approach did the work of 420 years of computation in just 16 weeks.

    Cue the stoners in 5, 4, 3, 2....

    1. Re:Excellent by Mr2001 · · Score: 2

      It's a shame the tagging system doesn't allow numeric tags. I had to tag this story "fourtwenty" instead.

      --
      Visual IRC: Fast. Powerful. Free.
    2. Re:Excellent by Varun+Soundararajan · · Score: 2, Funny

      The Answer to The Ultimate Question Of Life, the Universe, and Everything == 42 Malarial Drug Research == 420 420 is 419 + 1 (419 - remember Nigeria?) Malarial Drug Research/Answer to Ultimate question = 420/42 = 10 remove 0 from 10 :-) and it is 1 , subtract 1 from 420 and it is 419..... Something is really really fishy...

    3. Re:Excellent by Varun+Soundararajan · · Score: 2, Funny

      The Answer to The Ultimate Question Of Life, the Universe, and Everything == 42

      Malarial Drug Research == 420

      420 is 419 + 1 (419 - remember Nigeria?)

      Malarial Drug Research/Answer to Ultimate question = 420/42 = 10
        remove 0 from 10 :-) and it is 1 , subtract 1 from 420 and it is 419.....

      Something is really really fishy...

      ----
      comment already exceeded retard limit, hence no sig.

    4. Re:Excellent by CommunistHamster · · Score: 2, Funny

      Numerology: For when you have no real evidence.

  2. Wikipedia? by JonathanR · · Score: 3, Interesting

    It strikes me as strange that something like Wikipedia could not be distributed across users' PCs in more of a peer-to-peer fashion. Surely the web itself could benefit from further decentralisation. This issue bothered me some years ago, when I discovered that my desktop PC at work had about 40 GB of unpartitioned disk space. I often wondered about the sense of running file servers in big organisations, when each user probably has a few tens of gigabytes of unused or unpartitioned disk space. If illicit music and video can be distributed by P2P, why not all information?

    1. Re:Wikipedia? by Fordiman · · Score: 2, Interesting

      The system would be designed for it.

      P2P isn't a good model, but I can think of one:
      Data, as it is created, is stored in the users' shared folder. As other users go to access it, a copy is made from the cloud (as long as filename/size/hashes match) and that copy is used so long as the creator's copy hasn't been modified. When writes are done, they're done locally, and a patch is sent to the original copy. If the creator can't be contacted, or his copy doesn't exist, the last-writer becomes 'creator'. The file's creator is identified by his DC user name.

      Backing up is simple. For every creation/update that is made, a patch is queued or sent to a backup server. The server ONLY queues the originals and patches, so that past-versions are accessible. As space becomes unavailable (say, below 10%), the backup server alerts the IT guys that it needs to offload some stuff, and condenses changes of the oldest files in the local copy. When a delete is made, that is considered a write and handled accordingly.

      In the event of a reinstall (i.e., the local copies of the files are deleted, but the world hasn't been notified), the user, upon connection, would query the backup server to see where his stuff has gone, and get it back.

      One could create this system to act like an SMB share, with access levels and program-independent drive/directory mapping, but with one added benefit: user-creation and auto-mapping. The DC would automatically tell the system which peer-shares are available to him upon login. The user can then filter out what he needs as he uses it, but can index-search it all (a query is sent to the backup server, which, like a good little machine, has been indexing as backups are made).

      Lastly, for reverse compatibility, the backup server could provide SMB access to its copies, ensuring that non-updated systems can still access their stuff.

      I don't know about most organizations, but I work at Penn, and a system like this could work admirably.
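      The "filename/size/hashes match" check described above could be sketched roughly as follows. This is only an illustration: the `FileMeta` record, the helper names, and the choice of SHA-256 are assumptions, not part of the proposal itself.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical metadata record exchanged between peers."""
    name: str
    size: int
    sha256: str


def describe(name: str, data: bytes) -> FileMeta:
    """Build the metadata for a file's current contents."""
    return FileMeta(name, len(data), hashlib.sha256(data).hexdigest())


def cached_copy_is_current(cached: FileMeta, origin: FileMeta) -> bool:
    """A cached copy is usable only while name, size and hash all match
    the creator's copy; any mismatch means the original was modified."""
    return cached == origin
```

      Comparing the size as well as the hash is redundant for correctness, but it gives a cheap early mismatch signal when metadata is compared field by field.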

      --
      110100 1101000 1101000 1100110 0 1101111 1101000 1100011 1
    2. Re:Wikipedia? by elchuppa · · Score: 2, Informative

      Well, this is an excellent question. Actually, Van Jacobson is on Google Video with a presentation on this precise pet peeve of yours. The main concern I have with the idea, at least with how Van Jacobson presents it, is that with information addressed by content rather than location, it's slightly more challenging to locate it. At least with the IP system you can route closer towards your destination at each hop up and then down... But data without an authoritative source is basically lost. If you don't have it, you don't really have any reason to inquire about it with any one node over any other. There is a space for peer-to-peer data systems, and he does have a point about those live media feeds getting saturated. The truth is that all data should be potentially torrented. That's why BitTorrent may be one of the most fascinating and potentially effective inventions in the modern (internet) software era (last 10 years). Bugger. So I don't have much constructive to say, what with my current state of mind, except that most of the other replies are rudely and stupidly dismissive of the idea. It both resonates and feels like the future, but it's not a trivial problem. Actually it most certainly is... it's just a matter of stating it so that it is trivial.
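      For readers unfamiliar with "addressed by content rather than location", here is a minimal sketch of the idea. The `ContentStore` class and the use of SHA-256 are assumptions made for illustration; this is not how Van Jacobson's proposal or BitTorrent is actually implemented.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key of a blob is the hash of its bytes."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data and return its content address."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        """Fetch data by address and verify it really matches that address."""
        data = self._blobs[key]  # raises KeyError if this node lacks the blob
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("stored blob does not match its address")
        return data
```

      The upside is that any node holding the bytes can serve them and the receiver can verify them. The downside is the one raised above: the key says nothing about which node to ask.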

  3. Ok, how does this apply to patents? by zappepcs · · Score: 4, Interesting

    If the grid solution finds THE cure for H5N1, will it be patentable? If not, who pays for the R&D to implement it? Who gets the patent? Do the thousands of people who allowed their PCs to be used get anything? Will big drug companies be able to use this and keep the prices low for the final product?

    1. Re:Ok, how does this apply to patents? by Dr.+Spork · · Score: 4, Interesting

      These are all good questions, and every user who volunteers their computer for something like this should find answers to them. I'm quite sure that the stuff discovered by distributed networks does not automatically enter the public domain, but in cases like SETI and protein folding, the organizers explicitly state that it will. But it wouldn't be illegal for a drug company to use volunteers' computers just for corporate profit. You have to judge the merit of each of these projects on a case-by-case basis. Remember also that there is a cost to participating: you have to run your computer at peak power, and this will add several hundred dollars to your utility bills each year while polluting the planet with extra coal smoke and CO2.

  4. years of computation? by convolvatron · · Score: 4, Funny

    sorry, i missed that definition. what is that in library of congresses per human hair?

  5. Wow, 25% scalability! Amazing! by Saint+Stephen · · Score: 2, Insightful

    (420 years / 16 weeks) / 5000 computers = 1:4 scalability!!!

    Frickin amazing! No one's EVER done that before.
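    The arithmetic behind the "1:4" figure can be checked with a few lines. The sketch assumes 365.25 days per year and that all 5,000 machines were busy for the full 16 weeks; neither assumption comes from the article.

```python
# Back-of-the-envelope check of the efficiency figure quoted above.
# Assumptions (not from the article): 365.25 days per year, and every
# machine busy for the entire wall-clock period.

DAYS_PER_YEAR = 365.25


def parallel_efficiency(cpu_years: float, wall_weeks: float, machines: int) -> float:
    """Return achieved speedup divided by the number of machines."""
    cpu_days = cpu_years * DAYS_PER_YEAR
    wall_days = wall_weeks * 7
    speedup = cpu_days / wall_days
    return speedup / machines


if __name__ == "__main__":
    print(f"efficiency = {parallel_efficiency(420, 16, 5000):.3f}")
```

    Under those assumptions the result is a little over one quarter. Since the article says "up to 5,000" machines were in use at any one time, the real per-machine efficiency was presumably higher than this estimate.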

  6. I've got one... by cultrhetor · · Score: 3, Funny

    how abouutt a drog thet maks slshdaughters spel gooder and youze gooderest grammer?

    --
    "Tu fui, ego eris" - Virgil
  7. Here's one by realmolo · · Score: 4, Funny

    Would it be possible to use all that computing power to make an electronic voting machine that works?

    Oh wait! How about a voting machine based on "quantum computing"! Then we wouldn't even have to vote, the machine would already know who won.

    Goddamn liberal qubits! Bunch of flip-floppers!

    Stupid conservative qubits! They think that there is ONE and ONLY ONE answer for everything!

    1. Re:Here's one by iminplaya · · Score: 5, Funny

      Then we wouldn't even have to vote, the machine would already know who won.

      That's a no-go. Reading the results will change them. Kinda like what happened in Florida :-) Proof that you can do quantum processing with pencil and paper without all these electronical contraptions.

      --
      What?
  8. From the Article by imstanny · · Score: 4, Funny

    Up to 5,000 computers were used at any one time, generating a total of 2,000GB of useful data.

    Based on the size of useful data GRID collected from 5,000+ machines and the quantity of pornography on my computer, they are claiming that: porn != useful.
    ...GRID computing; you disappoint me.

  9. Mathematicians and computer scientists unite! by cowscows · · Score: 3, Funny

    In an amazing breakthrough which will no doubt have profound implications on Moore's Law, it has been discovered that multiple computers can accomplish in a shorter time what would take much longer on a single computer! Researchers will next launch a study to see how much faster 6000 video iPods working simultaneously can play through all the songs on the iTMS compared to a single first generation iPod shuffle.

    --

    One time I threw a brick at a duck.

  10. Re:Malaria? by Soporific · · Score: 4, Informative

    According to:

    http://archive.idrc.ca/books/reports/1996/01-07e.html

    Malaria kills quite a few people every year so I don't think it's a waste.

    ~S

  11. Re:Malaria? by alshithead · · Score: 3, Insightful

    When you consider global warming...malaria WILL become a huge problem for many areas that haven't had to worry about it before now. This is in no way a waste. Buy your quinine now and while you're doing it...buy stock in the companies that manufacture it.

    --
    I reserve the right to think for myself. Others' opinions are optional. Puppy on lap = typos...not illiteracy.
  12. Re:Malaria? by iminplaya · · Score: 2, Insightful

    Unfortunately, most of the people killed are considered somewhat "less than" human. It goes a long way to explain the lack of interest, while other diseases are more politically expedient. The profit margins just aren't there.

    --
    What?
  13. Lots of things still out there by Raul654 · · Score: 4, Informative

    (Preface - My research group specializes in parallel computing) There are classes of problems so computationally intensive that the computers that can do them in a reasonable amount of time won't be invented for decades. Almost all of these are simulations of physical reactions (in vitro drug simulation, climate simulation, biomolecular engineering sims, physics sims, etc.). As a general rule, these problems scale weakly (meaning that as you add more computers, you can simulate more datapoints, and get more accurate results). If memory serves, the hardest problem I can recall involved hydrogen fusion simulations, requiring computers 10-1000 times faster than the best in the world today.
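    One common way to put a number on weak scaling is Gustafson's law, which is swapped in here as an illustration and is not something the comment itself cites. The serial fraction used below is a made-up value.

```python
def gustafson_speedup(processors: int, serial_fraction: float) -> float:
    """Scaled speedup under Gustafson's law: S = N - s * (N - 1).

    serial_fraction (s) is the share of the run that cannot be
    parallelised; the parallel part is assumed to grow with N,
    i.e. more machines simulate more datapoints in the same time.
    """
    if processors < 1 or not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("need processors >= 1 and 0 <= serial_fraction <= 1")
    return processors - serial_fraction * (processors - 1)


if __name__ == "__main__":
    # Hypothetical: 5,000 machines, 5% of the work strictly serial.
    print(gustafson_speedup(5000, 0.05))
```

    The point of the formula is that when the problem size grows with the machine count, the speedup stays close to linear even with a non-zero serial part.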

    --


    To make laws that man cannot, and will not obey, serves to bring all law into contempt.
    --E.C. Stanton
  14. How about open source, distributed search by classh_2005 · · Score: 3, Interesting

    This looks interesting:

    http://www.majestic12.co.uk/

  15. Re:Wow, 25% scalability! Amazing! by nick_davison · · Score: 4, Informative

    Worse...

    It's over 4 months, not a fraction of a second.

    If I have a task that takes 100 seconds to run and I want it completed in under a second, scalability becomes a challenge... I have to figure out how to break it into at least 100 distinct parts and deal with all of the communication lags associated. To have any kind of fault tolerance, I probably want to break it into at least 1,000 tasks so that if one processor is running fast, it can get fed more and if one processor corrupts its process, I don't find out right at the end of the second, with no room to compensate, that I have to re-run that full second's worth of processing elsewhere to make up for it. That's where the challenge comes in.

    If I have a task that takes 100 seconds to run and all I'm trying to do is run it a lot of times over a period of time that's many times greater, I can run it 864 times a day per system with absolutely no scalability issues whatsoever and simply send the relatively small complete result sets back. With 100 systems, if each one can run a distinct task from start to finish, I'd be expecting pretty much dead on 100 times the total number crunching as there are absolutely no issues with task division, synchronization or network lag.

    In this case, they ran 5,000 computers over 4 months. Assuming a single task is solvable in under 4 months by a single system, they should have had no difficult task division problems to solve, absolutely minimal synchronization issues and next to no lag issues to address. In short, even a pretty inefficient programmer should be able to approach 1:1 scalability in that easy of a scenario.

    Efficiency of algorithms is a challenge when you want a single result fast. When you want many results and are prepared to wait so long as you're getting very many of them, that's an incredibly easy distributed computing problem.
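    The "many independent tasks" case described above can be sketched as below. `score_compound` is a hypothetical stand-in for one complete 100-second job, and the thread pool only illustrates the structure (no task talks to any other); it is not a claim about how WISDOM or gLite schedules work.

```python
from concurrent.futures import ThreadPoolExecutor

SECONDS_PER_DAY = 24 * 60 * 60


def runs_per_day(task_seconds: int) -> int:
    """How many back-to-back runs of a task fit in one day on one system."""
    return SECONDS_PER_DAY // task_seconds


def score_compound(compound_id: int) -> int:
    """Hypothetical placeholder for one self-contained job.

    It depends only on its own input, so jobs need no synchronisation.
    """
    return (compound_id * 2654435761) % 1000


def run_batch(compound_ids, workers: int = 4) -> list[int]:
    """Hand each job to whichever worker is free and collect the results
    in input order. No job ever waits on another job's output."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score_compound, compound_ids))


if __name__ == "__main__":
    print(runs_per_day(100))      # the 864 runs per day mentioned above
    print(run_batch(range(5)))
```

    Because each job is independent, adding workers changes only how quickly the batch drains, never the results. That is what makes this kind of workload easy to scale.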

  16. Grid Computing Projects by 1mck · · Score: 2, Interesting

    I've been donating my processor time for quite a while now for the malaria research, and even though the drug companies will probably benefit from my donation, they would not have these breakthroughs if people didn't donate that time, and the prospect that a breakthrough will be found is what keeps me donating my processor time. It's a great feeling knowing that I've contributed to a possible cure for this disease! Other projects that could use the services of Grid Computing (I believe that was the original question that was put forth) are imaging analysis (any field), physics (particle research, etc.), and I can also see Grid Computing being used for computer animations, where the time to render animations would be greatly reduced, allowing movies and shows to be released much faster than before. (With this application, it would be known that you are contributing to a product that a company will be making a profit on, and the only reason to do it is to get these movies and shows to market faster. I, for one, would love to see a sequel to The Incredibles, and to be a part of that would be fantastic, even to just have my name mentioned in the credits!) One thing that needs to be done for these projects to get the maximum exposure for Grid Computing is to dumb down the process. A noob would be hard pressed to set up BOINC Manager to do the malaria research.

    1. Re:Grid Computing Projects by AlXtreme · · Score: 2, Interesting

      I can also see Grid Computing being used also for computer animations where the time to render animations would be greatly reduced, and allowing movies, and shows to be released much faster than before.

      I'm afraid that that will take quite some time to realize. Rendering CG, besides taking a lot of processing time, also requires enormous amounts of data, which restricts the rendering to render farms, the data being pumped over a high-speed LAN.


      Actually the number of problems solvable by using Grid Computing over the internet isn't very high. You need computationally-intensive problems that can be easily parallelized, besides requiring limited amounts of data. There's little point in distributing a problem if it takes longer to distribute the data than you gain by using multiple nodes.
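      The break-even point described above can be sketched with a very simple model. Everything here is an assumption made for illustration: the work splits evenly, every node downloads its input before computing, and the numbers in the example are invented.

```python
def distributed_time(compute_s: float, bytes_per_node: float,
                     bandwidth_bytes_per_s: float, nodes: int) -> float:
    """Wall-clock seconds when each node first downloads its input and
    then computes an equal share of the total work."""
    transfer_s = bytes_per_node / bandwidth_bytes_per_s
    return transfer_s + compute_s / nodes


def worth_distributing(compute_s: float, bytes_per_node: float,
                       bandwidth_bytes_per_s: float, nodes: int) -> bool:
    """True only if distributing beats running everything on one machine
    that already has the data locally."""
    return distributed_time(compute_s, bytes_per_node,
                            bandwidth_bytes_per_s, nodes) < compute_s


if __name__ == "__main__":
    one_hour = 3600.0
    # Invented numbers: 10 nodes on 1 MB/s links.
    print(worth_distributing(one_hour, 10e9, 1e6, 10))  # 10 GB of scene data per node
    print(worth_distributing(one_hour, 1e6, 1e6, 10))   # 1 MB of input per node
```

      With these invented numbers the data-heavy job loses to a single machine while the small-input job wins, which is the distinction between render-farm work and compound screening drawn above.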

      --
      This sig is intentionally left blank
  17. Grid computing vs distributed computing projects by Crazy+Gilmore · · Score: 2, Informative

    Much of this discussion is totally misdirected because the writers are confusing a distributed computing project like SETI or BOINC - http://en.wikipedia.org/wiki/BOINC_client-server_technology - with a grid system - http://en.wikipedia.org/wiki/Grid_computing. They are completely different things.

  18. Re:Malaria? by iminplaya · · Score: 3, Insightful

    Actually I'm sure they would love to move to more fertile, profitable areas. Unfortunately, there are others with a different agenda who like to keep them away. Those people are getting much more outside help than the starving kids. Let's not forget the economics of the arms trade with African warlords and corrupt tinpot generals who are, of course, "good for our interests". I'm also aware that basic sanitation and clean water, both cheap and easy to achieve with the right thinking, will take care of probably a full 90% of the problem. Their old traditions are responsible for much of it. The parent's link led me to this. It has to be just the tip of the iceberg. So the chemical insecticides are not needed. There are far too many unexplored, easily accessible natural solutions.

    Outsiders really aren't interested in Africa's problems, unless it interferes with "free trade". This will be solved by the Africans with relatively little outside help. It's just the usual numbers game.

    I've heard that gin is a good mosquito repellent.

    --
    What?
  19. In Soviet Matrix, grids run on YOU by iamacat · · Score: 2, Funny

    I imagined a beowulf cluster of those, nekked and petrified. Then I got ashamed of myself for rehashing the old meme and dumped hot grits in my pants. As I was convulsing on the ground, there was only one thought left in my mind:

    "Does it run Linux?"

  20. Re:Malaria? by iminplaya · · Score: 2, Informative

    After a little poking around, I found what looks like the stuff they put on the nets. It was under their noses all along. And I would venture to think that they (the Africans) have known about it for a very long time. I'm too lazy to find out how well it controls these guys, something even more neglected in mass media. Nature triumphs over the computer again. Okay, now I'm drifting off topic, but at least I did it seamlessly and gracefully :-)

    --
    What?
  21. Re:Malaria? by God'sDuck · · Score: 2, Informative

    quinine cures malaria strains not yet resistant to quinine.

  22. Re:Malaria? by Anonymous Coward · · Score: 4, Funny

    Yes, I'm concerned about global warming too. But I think you're off base about malaria. Sure, there is a chance that malaria will be more prevalent. However, I think our real resources should be put to combating the animal predators from the North who will want to eat us for lunch.

    First, when global warming happens, all the polar bears will come South looking for something to eat. We are probably on the top of their list. First, the bears will be real angry at us because we melted their front yard. And secondly, we happen to be the fattest creatures around--there is a lot of meat on our bones. And don't even get me started on what will happen to our shrubbery when the reindeer head this way.

    I think instead of wasting CPU cycles on malaria, we instead should be using those computing resources searching for a safe but effective polar bear repellent. That way we have our priorities straight. Just my 2 cents.

  23. Re:the biggest issue by scottv67 · · Score: 2, Insightful

    You know, I think the thing that aggravates me the most is that these distributed computing systems are helping drug companies find cures to illnesses using OUR processing power and computers WE paid for, only to sell us the drug that they would have been hard pressed to develop without our hardware back to us at an extremely inflated price.

    Posting a reply to your comment is going to un-do my moderation this morning but I can't let your comment go by without a response. Yes, we (people who run the distributed computing clients on our home PCs) are contributing OUR resources to large pharmaceutical companies (directly or indirectly). I am running the F@H client on multiple PCs (at my home) that I bought with my take-home pay. Furthermore, my electric bill is impacted by having those PCs running (my electricity is about $0.03/kWh off-peak) *and* there is an additional cooling load on my home's HVAC system in the summer. Yes, the drug companies are getting something from me without ever acknowledging the money I have invested in helping them produce a new drug. But I don't do it for the recognition or the fame (okay, I do watch the F@H stats to see how many points I am producing each week compared to the other contributors on my F@H team) but instead I do it for the greater good. Is it possible that a company like Pfizer will take the results from my F@H clients and create their next blockbuster drug? Yes, it could happen. Will I be pissed if they don't cut me in on the action? No.

    Regarding your comment: "only to sell us the drug...at an extremely inflated price." Who really knows what the true price of a drug is? How many millions need to go into the salaries of researchers, sophisticated lab equipment and large facilities to house that stuff? How do you *know* the price of a drug is "extremely inflated"?

    If you don't like a distributed computing project like Folding At Home (F@H), please be aware that you don't have to run the software and you can feel quite smug every night when you tuck yourself into bed knowing that all of your home PCs are powered-off. You can even have a little chuckle and say "Suckers! My electric bill won't suffer just to benefit the drug companies!" before you turn out the light. But when the next big life-saving drug comes to market and it turns out that YOU need it to live, feel free to not purchase that drug. Show those evil drug companies that they won't get one penny of your hard-earned money.

    Last year, I donated a big chunk (thousands) of my take-home pay to the Leukemia and Lymphoma Society. While I am not personally afflicted by those diseases nor is anyone in my family, I donated to that organization in the hope that a cure will be found. [My donation was not tax-deductible so I did not make the donation in the hope of reducing the amount of income tax I would pay for 2006.] I run the F@H client on my home PCs for the same reason: Somebody somewhere (maybe someone who hasn't even been born yet) will benefit from my home PCs crunching numbers throughout the night. *I* paid for these PCs, *I* pay for the power to run them, *I* pay for the extra cooling load they generate in the summer. I am doing this in the hope that someone else on the planet will benefit from my "investment".

  24. Fallacy of the One Biggest Problem by p3d0 · · Score: 2, Informative

    There have been lots of responses already, but I would like to add another...

    There seems to be a widespread fallacy that all human resources should be applied to the One Biggest Problem facing humanity at any given moment. Overlooking for a moment the obvious problems inherent in trying to choose the One Biggest Problem, and assuming we could actually rank all human problems in a well-defined order, there are still two huge problems with this approach:

    1. Diminishing returns. Putting twice as many people on a problem doesn't solve it twice as quickly. The extra people could well be more productive working on a separate problem. This is the well-known fallacy of the Mythical Man-Month.

    2. Misplaced priorities. The majority of people in the world do not have cancer. If all the resources of humanity were spent on cancer, where would that leave the rest of us that don't have cancer? "Sorry, we've stopped making antibiotics, insulin, toothpaste, books, and clothing so we can focus on fighting cancer."

    In addition, there's an implicit assumption in the parent poster's position that the researchers who are looking for a cure for malaria have been wasting their time. I'd like to ask, what has *he* been doing during this time? I hope he has been looking for a cancer cure, or else he's nothing but a hypocrite.

    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  25. Re:Malaria? by Praseodymn · · Score: 4, Interesting

    Worrying about the environment is a luxury. Being able to do something to stop what will probably kill you is a luxury. Living anywhere because you want to is a luxury. Having a choice to take the lucid dream inducing malaria drugs or not is a luxury.

    Where malaria flourishes, luxuries are scarce.

    Travel as much as you can in your life, preferably to the poorer countries. They are often the happier ones.

    --
    Sometimes, you can, you go to hell for the rest of your life! That's a true thing.