Slashdot Mirror


NSF awards $500,000 grant for Beowulf Cluster

ragnar! writes "The National Science Foundation (NSF) has awarded $500,000 to support a new parallel computing facility for Bartol. The "major research infrastructure" (MRI) grant will support a parallel system based on 100 linked processors, each of which will run at speeds up to 600 megahertz, connected by Fast Ethernet hardware - very similar to the Avalon-Beowulf Cluster, developed by the Los Alamos Center for Nonlinear Studies and Goddard Space Flight Center."

23 of 100 comments

  1. Just to clarify.... by Denor · · Score: 2

    I suppose I knew it would be risky making a comment like this on a thread that actually _was_ about beowulf clustering. Just to clarify things, I intended it to be humorous - hence the part about gaining back my lost respect via first post :)

    --
    -Denor
  2. Re:Linux close but no cigar by Troy+Baer · · Score: 2

    I imagine a BSD variant would be best - still open source, but the TCP/IP stack is faster, so you'd probably lose less in inter-processor communication.

    If you're running a private gigabit-class network (GigE, Myrinet, Giganet, etc.) and have a separate control network (typically Fast Ethernet), there's no reason to run TCP/IP over the high-speed network. In that case, you could bypass the TCP/IP stack entirely and have the message passing system (typically an MPI implementation) talk directly to the hardware -- the "user space"/"OS bypass" approach. This is what Myricom's GM and the various VIA implementations let you do. Most of the larger Beowulf cluster installations are going with something like this.

    I must admit that I find it very surprising that they're going to the trouble of buying fast DEC Alphas and then connecting them with something as pokey as Fast Ethernet. I hope their RMHD and other calculations are pretty close to embarrassingly parallel (i.e. almost no IPC), or the network will definitely end up being a performance bottleneck.
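
    To make the bottleneck argument concrete, here is a rough sketch of the usual latency-plus-bandwidth cost model for message passing. The function names and the link figures are illustrative assumptions, not measurements of any real cluster; the only point is that efficiency drops as the amount of IPC per step grows, and drops faster on a slower link.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bits_per_s):
    """Time to send one message: fixed latency plus size / bandwidth."""
    return latency_s + (message_bytes * 8) / bandwidth_bits_per_s


def parallel_efficiency(compute_s, n_messages, message_bytes,
                        latency_s, bandwidth_bits_per_s):
    """Fraction of wall-clock time spent computing rather than communicating."""
    comm_s = n_messages * transfer_time(message_bytes, latency_s,
                                        bandwidth_bits_per_s)
    return compute_s / (compute_s + comm_s)


# Illustrative link parameters (assumed for this sketch, not measured).
FAST_ETHERNET = {"latency_s": 100e-6, "bandwidth_bits_per_s": 100e6}
GIGABIT_CLASS = {"latency_s": 10e-6, "bandwidth_bits_per_s": 1e9}

if __name__ == "__main__":
    # One second of computation per step, with a varying number of 64 KB messages.
    for n_messages in (0, 10, 1000):
        for name, link in (("Fast Ethernet", FAST_ETHERNET),
                           ("gigabit-class", GIGABIT_CLASS)):
            eff = parallel_efficiency(1.0, n_messages, 64 * 1024, **link)
            print(f"{n_messages:5d} messages over {name}: efficiency {eff:.2f}")
```

    With zero messages (the embarrassingly parallel case) both links give an efficiency of 1.0, so the choice of network only matters once the job starts exchanging data.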

    --Troy
    --
    "My life's work has been to prompt others... and be forgotten." --Cyrano de Bergerac
  3. Re:What software? by Kastagir · · Score: 2
    We have our own research group at UDel developing just such a system. It's called EARTH, Efficient Architecture for Running THreads. It's still being developed but is functional as far as I remember, just not necessarily optimized. I don't know whether or not it will be used on the new system, but I'd be willing to bet that one way or another, EARTH will find its way onto the cluster. The professor running the group has a way of getting things done :)

    If you want to check out more, check out http://www.capsl.udel.edu. The information on the web page isn't too organized, and all the info on the EARTH page is bound to be pretty old, but you might be able to get a decent idea.

  4. Re:Linux close but no cigar by DaKrushr · · Score: 2

    Well, I sort of admin a Beowulf at work... it consists of 8 dual PII-450s with 512MB RAM apiece...

    It came pre-made, with a slightly-modified version of RH 5.2 installed - basically just an SMP kernel and some utilities and libraries. You don't really need any special software, except for PVM or MPI (we use MPI). The MPI distribution we use is LAM 6.2 - I'm sure you can hunt it down if you look around a bit (try google.com/linux).

    I'm going to eventually set up another small cluster for testing and development purposes, once we get the first piece of software into operation (or maybe before, if I don't have a whole lot of stuff to do) - I'm planning on setting it up with Debian and a much more sensible layout (share /usr, instead of each machine with its own /usr - or maybe not, depending on what seems to be the best way to do it).

    I wouldn't recommend RedHat, tho - adminning on it is not much fun. (no apt!)

  5. BALDRIC.. by a.out · · Score: 3

    We are a cash-strapped Beowulf group doing legit research (we actually made a little bit of knot (ap math) history). Everyone seems to like us and support what we are doing, but no one is putting their money where their mouths are (BALDRIC). We are doing this in the true spirit of Beowulf: taking old surplus hardware from all around a university and putting it to good use. All of the research findings must be public in order for anyone to use it; we have a very open source attitude to the cluster. We currently have 8 nodes up and running with 7 more waiting to get 'in on the action'. But our problem is that we have *no* funding. The biggest support we've gotten is a tiny room (I'm talking 15' x 15' at the most) from our Computer Science department.

    My question is: How can we get this kind of support??

  6. Clusters vs. supercomputers by Animats · · Score: 2
    The trouble with supercomputers today is that price/performance peaks with high-end desktop machines. It didn't used to be this way. Machines used to obey Grosch's Law (computing power increases as the square of the price). But that's ancient history. It ended when the fastest CPUs became single chips. If you're going to build a fast single-chip CPU, you don't want to waste it on a supercomputer that sells maybe ten units. You make Pentium IIIs and sell millions.
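
    As a small illustration of why Grosch's Law once favored a single big machine, here is a sketch that takes the law literally as power = k * price^2. The constant k and the prices are arbitrary assumptions chosen for the example; nothing here models real hardware.

```python
def grosch_power(price, k=1.0):
    """Computing power under Grosch's Law: proportional to price squared."""
    return k * price ** 2


def power_per_dollar(price, k=1.0):
    """Price/performance under Grosch's Law; it improves as price rises."""
    return grosch_power(price, k) / price


if __name__ == "__main__":
    budget = 2.0  # arbitrary units
    one_big = grosch_power(budget)            # spend it all on one machine
    two_small = 2 * grosch_power(budget / 2)  # split it across two machines
    print(f"one machine:  {one_big}")
    print(f"two machines: {two_small}")
```

    Under that assumption the single machine wins, so clusters of cheap boxes make no sense. Once price/performance peaks at the desktop instead, the comparison flips.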

    The main result of this is that only the Government buys supercomputers, and nowadays they're mostly a boondoggle. SGI is currently trying to sell Cray, with limited success. Even Deep Blue is a cluster, made of stock CPUs on custom boards with additional custom hardware. The era of the classic supercomputer, with its huge mat of hand-wired connections, is over.

  7. Re:Linux close but no cigar by bbarrett · · Score: 2

    LAM can be found at http://www.mpi.nd.edu/lam/. It was originally written at the Ohio Supercomputer Center. It is currently being maintained by the Laboratory for Scientific Computing at the University of Notre Dame. By the way, we just released version 6.3 of LAM. If you're looking for a good way to see how LAM is communicating, check out XMPI, a graphical interface to LAM (as well as SGI's MPI implementation). LAM is available as a tarball, i386 and SRC RPMS, and should be available in the Debian Potato archives. BTW - While you're visiting the LSC's pages, don't forget to see the world famous domecam.

  8. First post! by Dr.+Sp0ng · · Score: 3

    I want a Beowulf cluster of THESE THINGS!!

    Sorry, just couldn't resist... bye-bye karma :-)

    "Software is like sex- the best is for free"
    -Linus Torvalds

    1. Re:First post! by say-tan · · Score: 3

      i'm sorry, i'm going to have to agree with the ac on this one. not only was this post on topic, but it was funny. you (mr. moderator) obviously have no sense of humor. we all knew that the beowulf cluster post was going to show up, but this guy beat everyone to it with a first post. if i had moderator points, i would have tried to help correct this by moderating him up, but, alas, i don't.

      --
      Men use thought only to justify their wrong doings, and speech only to conceal their thoughts. -- Voltaire
  9. Similar projects by Greg+Lindahl · · Score: 2


    This source of funding isn't that unusual -- the University of Virginia Centurion cluster was funded by two $450,000 MRI grants.

  10. Re:What software? by Greg+Lindahl · · Score: 2


    Almost no one uses Linda -- what would you think UDel does?

    Most people with systems like this use a batch queue system like PBS and message passing libraries like MPI.

  11. Linux close but no cigar by abach · · Score: 3

    Beowulf-like clusters are becoming popular and Linux is often used, but it has to compete with the large, good old Unix suppliers. Take a look at:

    http://www.fysik.dtu.dk/CAMP/valhal.html

    Here you'll find a similar project, and even an explanation of why they didn't choose Linux.

    Seems like the commercial unices are running
    out of time.

    1. Re:Linux close but no cigar by Greg+Lindahl · · Score: 3


      Linux does have to compete with other Unixes, but people often decide in Linux's favor. For example, this cluster is 277 nodes with better networking, and we chose Linux over Tru64, due to Linux's super system administration capabilities.

      BTW, you can get Compaq's great Alpha compilers for Linux.

  12. What software? by deth_007 · · Score: 2

    After reading the article, I couldn't help but wonder what type of software they would use to keep the processing happening smoothly. Parallel processing in the large such as this is a whole area of study on its own. I would assume they would implement some sort of process control software that would model the virtual OS Linda, but I don't see any reference in the article as to how they are handling this.

  13. ... by Signal+11 · · Score: 3

    Great! Let's just hope none of them have been listening to the ACs here on slashdot or they'll try to build it out of iMacs running linux or palm pilots....

  14. It's too difficult. by Denor · · Score: 2

    I'm sorry folks, but I'm just not creative enough to come up with a way to somehow make a beowulf cluster of these. I apologize for not being able to contribute to the obligatory beowulf cluster thread, and hope that I can earn back all of your respect by getting a first post somehow.

    --
    -Denor
  15. Re:Nifty by Greg+Lindahl · · Score: 2


    ... these machines ARE massively parallel supercomputers, if you build them big enough and you use the best commodity networking (like myrinet).



  16. Grant Funding Realities by Admiral+Mouse · · Score: 3

    People making comments about the amount of hardware/support that can be had for $500,000 should remember the realities of grant funding at a University in this country:

    • Universities/Departments typically keep 40-50% of the grant amount awarded to a lab for "indirect cost recovery" (ICR). This is the fee they assess for providing buildings, plumbing, offices, etc (infrastructure costs).

    • People tend to cost 2 x salary once benefits and whatnot are considered. So each $40k person costs the grant about $80k.

    • Labs usually have other costs they need to cover and small bits of large grants are usually used to cover these "extra" needs.


    So, a $500k grant is about $250k after ICR. Then say you fund 2 people at $35k/year to help build and run it; at 2 x salary, that is $140k. Now you're down to just $110k for hardware. Even with a "best case" run of the numbers and cheap people, you're still not going to have more than $150k for hardware in this grant.
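
    The arithmetic above can be written out as a short sketch. The function and parameter names are made up for this example, and the rates are simply the figures quoted in this comment (50% ICR, 2 x salary, one year of staffing); change them to try other assumptions.

```python
def hardware_budget(grant, icr_rate, staff_count, salary,
                    cost_multiplier=2.0, years=1):
    """Money left for hardware after indirect costs and staff.

    icr_rate        -- fraction of the grant kept as indirect cost recovery
    cost_multiplier -- total cost of a person relative to salary (2 x here)
    """
    after_icr = grant * (1 - icr_rate)
    staff_cost = staff_count * salary * cost_multiplier * years
    return after_icr - staff_cost


if __name__ == "__main__":
    # $500k grant, 50% ICR, 2 people at $35k/year, one year of staffing.
    print(hardware_budget(500_000, 0.5, 2, 35_000))
```

    Setting icr_rate to 0 and staff_count to 0 returns the full grant, which is the case where no overhead or salaries are charged against it.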

    Also keep in mind that this grant's funding is spread over 3 years.

    100 600MHz PCs is going to run about $100k even before you start buying networking equipment, backup equipment and power supply/protection equipment.

    In all likelihood, Bartol is going to need additional funding (possibly x% matching money from the state or other similar grants) to make this a reality.

    Just thought people should know that when you get a $500,000 grant, you don't just get a check for $500,000 to blow on hardware. :-)

    ----

    --
    Life if possible, art at any cost.
    1. Re:Grant Funding Realities by Greg+Lindahl · · Score: 3


      MRI grants do not allow universities to charge overhead, and are 100% hardware money. You also have to get at least 20% matching funds.

      In general, equipment over $500 isn't assessed overhead by any university.

  17. Re:explain me something by edhall · · Score: 2

    The article wasn't specific as to hardware, but since they said it was "much like the Avalon cluster" they might well be using Alphas, not Pentia. $5k/box would be a good price if they are using the newer Alpha boxes based on the 21264 chip (which is more than twice as fast, on average, as the 21164s used in Avalon, even at the same MHz).

    -Ed
  18. ummmm.... by grappler · · Score: 2

    you can't moderate in any discussion you post in. I suppose you could do that with two accounts. Just use each account to moderate up the other one. The implications are rather interesting actually...

    --
    grappler

    --
    Vidi, Vici, Veni
  19. The moderators... by seaportcasino · · Score: 3

    Because any post associated with Beowulf clusters is normally a troll, the moderators are having a hard time moderating this particular topic...

    Their first instinct is, "Oh God, it's a beowulf post - moderate down, moderate down." It must be a hard itch for them not to scratch in this case :)

  20. explain me something by Tiro_Dianoga · · Score: 3

    From the article it is hard to tell exactly what this money was for. Was it a $500,000 payment for a Beowulf cluster, for Bartol to run the cluster, or for Bartol to build and run the cluster?

    If they are purchasing hardware for that amount, they're getting ripped, because I'm thinking all the needed hardware, including the boxes and the networking equipment, can be had for under $150,000 (they could get a nice bulk order discount).

    My figure wouldn't include costs like assembly/setup labour and the OS (heh) but half the work is opening the boxes...

    Seriously, once the system is going and the scientists have their apps set up, all you need to do is make sure it doesn't overheat. (We are talking about a massive number of x86 systems here.)

    Disclaimer: I really don't know what the hell I'm talking about in this post. If someone could inform us what it costs to maintain a project like this, please post.

    --
    Boo!