Where to Spend $1M on a Cluster?

Competitive Bidding by duffbeer703 · 2004-08-04 11:47 · Score: 5, Interesting

Have the companies submit bids... then compare them and make a decision.

This isn't rocket science.

--
Conformity is the jailer of freedom and enemy of growth. -JFK

Re:Competitive Bidding by Kris_J · 2004-08-04 12:36 · Score: 2, Insightful

Buying from the lowest tender is rarely a good idea.
Re:Competitive Bidding by gumbi+west · 2004-08-04 12:45 · Score: 2, Insightful

[sarcasm]My god! what a concept![/sarcasm]
Yeah, so if you know how to write a contract, the lowest bidder is always the best choice. Think of terms like this: contract price is $700,000 if the following conditions are met: (a,b,c) by date x and $600,000 if met by date y. System must be free of manufacturer defects until date z. Manufacturer defects are defined with high specificity here...
But in any case, if you tell the companies what to bid to, they will all bid to that. Then you can pick the company that you think is going to give you the best product and the least hassle. Nonetheless, terms should be specific about date of delivery (a good company won't really care about this so it doesn't affect their bid).
Re:Competitive Bidding by duffbeer703 · 2004-08-04 14:54 · Score: 1

No shit.

Here's how it works. You get 5 or 6 technical staff and managers, at least 3 of whom are not involved with the proposals.

Then you Request Proposals via a sealed bid.

You then come up with a scoring worksheet; you weigh cost, implementation track record, hardware or whatever other factors are important to you.

Then each person scores the proposals and you meet to go over them and come up with an overall ranking.

It may seem drawn out, but it's a system that works well AND controls costs.
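
For what it's worth, the scoring worksheet can be as simple as a weighted sum per reviewer, averaged across reviewers. Here is a rough sketch of that idea; the weights, factor names, vendors and scores are all made up for illustration, so substitute whatever your committee actually cares about.

```python
# Hypothetical weights (must sum to 1.0) and hypothetical 0-10 scores.
WEIGHTS = {"cost": 0.40, "track_record": 0.35, "hardware": 0.25}


def weighted_score(sheet, weights=WEIGHTS):
    """Collapse one reviewer's score sheet into a single weighted number."""
    return sum(weights[factor] * sheet[factor] for factor in weights)


def rank_proposals(reviews, weights=WEIGHTS):
    """Average each vendor's weighted scores over all reviewers, best first."""
    means = {
        vendor: sum(weighted_score(s, weights) for s in sheets) / len(sheets)
        for vendor, sheets in reviews.items()
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Two reviewers scoring two made-up proposals.
reviews = {
    "Vendor A": [
        {"cost": 8, "track_record": 5, "hardware": 6},
        {"cost": 7, "track_record": 6, "hardware": 6},
    ],
    "Vendor B": [
        {"cost": 5, "track_record": 9, "hardware": 8},
        {"cost": 6, "track_record": 8, "hardware": 7},
    ],
}

ranking = rank_proposals(reviews)
for vendor, score in ranking:
    print(f"{vendor}: {score:.3f}")
```

The point of fixing the weights before the sealed bids are opened is that nobody can tune them afterwards to favor a vendor they already like.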

--
Conformity is the jailer of freedom and enemy of growth. -JFK
Re:Competitive Bidding by Kris_J · 2004-08-04 19:38 · Score: 1

implementation track record
That's what the story's submitter is trying to find out. Rather than talking about generic processes, how about specific companies.
Re:Competitive Bidding by bhima · 2004-08-04 23:25 · Score: 1

OK this is the worst joke I've ever made:
it "worked" for the space shuttle :(

--
Nothing in the world is more dangerous than sincere ignorance and conscientious stupidity.
Re:Competitive Bidding by Kris_J · 2004-08-05 01:41 · Score: 1

Take this ticket to hell and go wait for the train on Fark.
Re:Competitive Bidding by Clover_Kicker · 2004-08-05 03:47 · Score: 1

You're 100% correct, if you know exactly what you need.
The poster's question implies he doesn't know enough about clusters to write a good RFP.
Re:Competitive Bidding by Wyatt+Earp · 2004-08-05 04:52 · Score: 1

Actually, there was no real competitive bidding for Shuttle.

The players were the same ones from Gemini and Apollo. They submitted proposals and NASA decided on one, then it went through years of changes and redesign, but in the end, almost everyone who had a piece of Apollo and Gemini was in on Shuttle in some form or another.

The failures of Shuttle weren't from the bidding process; they were from engineering tradeoffs.
Re:Competitive Bidding by gumbi+west · 2004-08-05 15:09 · Score: 1

I guess my point is that if you know how to compare, why not select the lowest bid? If you do know, write it into the bid requirement so that all bidders are on level ground.

Do it with an apple by skammie · 2004-08-04 11:49 · Score: 0, Offtopic

Try an Xserve Raid cluster here.

--
"Fortunately, I'm adhering to a very strict drug regimen to keep my mind limber..."

Re:Do it with an apple by skammie · 2004-08-04 12:08 · Score: 0, Offtopic

I'm not so much a fanboi as I really should have copied this article instead.

--
"Fortunately, I'm adhering to a very strict drug regimen to keep my mind limber..."
Re:Do it with an apple by coldcup · 2004-08-04 12:12 · Score: 5, Informative

A cluster of storage? Perhaps you mean the Xserve itself.

They even have a page on clusters.
Re:Do it with an apple by Fortunato_NC · 2004-08-04 12:59 · Score: 1

without apple themselves getting involved, using their hw for this would probably be not that easy (to get some company to do the offer, support & etc using their parts doesn't seem that likely - clusters built out of them aren't that common).

What, are you kidding? I suppose you think these guys sit on their hands all day long.

--
Blogging Weight Loss, Distance Education, and more at verlin.com
Re:Do it with an apple by gl4ss · 2004-08-04 22:03 · Score: 1

research != selling a product. surely useful if you got enough time(or want) to roll your own thing, but that is hardly what this ask-slashdot was about.

"The Advanced Computation Group (ACG) researches algorithms and high-performance issues relevant to Apple technology. Apple's state-of-the-art processors coupled with ACG programs such as Apple/Genentech BLAST propel the Power Mac and Xserve to performance several times faster than other platforms running standard BLAST. This supercomputer performance directly impacts the rate of inquiry and discovery, opening new scientific vistas.

ACG publishes its research in the form of tools, technical papers and sample code for use in the science, education and engineering sectors. Please direct inquiries about ACG research to Richard Crandall. "

doesn't seem that relevant to the issue at hand..

--
world was created 5 seconds before this post as it is.

Me! by Jahf · 2004-08-04 11:50 · Score: 4, Funny

I'll do it ... send via paypal to my /. username @ yahoo.com ...

--
It is more productive to voice thoughtful opinions (reply) than to judge (moderate) others.

Re:Me! by bigsteve@dstc · 2004-08-04 12:20 · Score: 1

No ... give it to me! I have 256 x ZX81's all ready to hook up!
Re:Me! by Jahf · 2004-08-04 12:54 · Score: 1

If you can get Beowulf running on those ZX81's ... it's all yours, I was gonna use Cobalt boxes.

--
It is more productive to voice thoughtful opinions (reply) than to judge (moderate) others.

Pick me by techgeek10101 · 2004-08-04 11:52 · Score: 0

I didn't think it would take long to get a few willing parties...

Penguin Computing by retostamm · 2004-08-04 11:53 · Score: 3, Informative

Penguin Computing does this kind of stuff for a living. I think they are an all open source shop, too... There may be others, too.

--
get 7 free Japanese lessons.

Re:Penguin Computing by Anonymous Coward · 2004-08-05 03:25 · Score: 0

yes, and a division of SearchKing.com, a googlewhacking and litigating company.
Re:Penguin Computing by DA-MAN · 2004-08-05 07:27 · Score: 3, Informative

Penguin Computing does this kind of stuff for a living. I think they are an all open source shop, too... There may be others, too.

As a Systems Engineer who has worked with a number of vendors, I would say that Penguin is the bottom of the barrel in service and quality control.

We have five clusters at our facility, the slowest of which is on the top500 in the 150 range. We've tried big and small vendors.

Penguin is the absolute worst. No two scsi hard disks had the same firmware version, the raid controller was DOA, etc. We buy/borrow a node from each vendor and evaluate them before buying clusters, and out of all the vendors the Penguin node was the one that would crash or hang all the time. After months of trying, they were never able to get this going properly, even though we shipped it back twice and were told each time that we'd get back a whole new machine (it wasn't).

I would personally recommend Appro, IBM or Western Scientific in that order. Service and quality hardware are their game.

--
Can I get an eye poke?
Dog House Forum

In this case... by shadwwulf · 2004-08-04 11:54 · Score: 1

...you seriously need to put this out for RFP (Request for proposal).

Once you've done that, look through the proposals and pick which one sounds the best.

Answer: by Anonymous Coward · 2004-08-04 11:55 · Score: 0

If you had $0.7 mil to buy a pre-built cluster who would you go with and why?

Definitely not Slashdot, partially because of responses like this one.

Submission Title by Computerguy5 · 2004-08-04 11:58 · Score: 2

Does anyone else see the VERY obvious discrepancy between the submission title and the submission? Where are the editors? Last time I checked, 0.7 mil != 1 mil.

Re:Submission Title by TibbonZero · 2004-08-04 14:11 · Score: 1

Good enough for government (or Enron) accounting... He's just learning young in university!

--
Tibbon
tibbon.com
Re:Submission Title by Frizzle+Fry · 2004-08-05 12:42 · Score: 1

The same thing happened with the "mission to mars" that Bush wanted. I forget the details, but it was something like the cost was $.7B so the media would refer to it as $1B since it's a round number.

--
I'd rather be lucky than good.
Re:Submission Title by Anonymous Coward · 2004-08-06 07:47 · Score: 0

He meant .7 MiMillion or $.7 MiMillion x 1024 = $1 million or something like that.

Only on ask slashdot... by Txiasaeia · 2004-08-04 12:02 · Score: 3, Funny

could you say, "Imagine a beowulf cluster of those!" and actually be asking a legitimate question!

--
Condemnant quod non intellegunt.

Re:Only on ask slashdot... by DWXXV · 2004-08-04 12:13 · Score: 0

You beat me to it >_<

--
A ruler wears a crown while the rest of us wear hats. But which would you rather have when it's raining?

If my name were Natchswing I would go with... by T-Ranger · 2004-08-04 12:09 · Score: 1

Intergalactic NatchswingCo Research.

You got a grant with NO PLAN? by Anonymous Coward · 2004-08-04 12:09 · Score: 3, Funny

Shit, my department needs to take lessons from you guys. We need to specify the budget down to the last rubber foot and network cable just to have them review the application, and you got a grant without even having an idea what you were going to spend it on?

Re:You got a grant with NO PLAN? by Natchswing · 2004-08-04 13:59 · Score: 2, Informative

Actually, yes. On top of that they plan on paying some company to fly out harddrives for obnoxious prices rather than pay grad students with far more experience doing such things.
Re:You got a grant with NO PLAN? by TwistedSquare · 2004-08-04 20:56 · Score: 1

Your students have experience flying around hard drives? ;-)
Re:You got a grant with NO PLAN? by Natchswing · 2004-08-05 00:21 · Score: 1

*Insert Harddrive Crash Joke Here*
Re:You got a grant with NO PLAN? by Anonymous Coward · 2004-08-05 15:05 · Score: 0

I've got experience in making hard drives fly. :-)

how is Embry-Riddle relevant to by nusratt · 2004-08-04 12:11 · Score: 1

"gravity-wave modeling"?

I thought that E-R was an aviation school.
Are they developing an anti-gravity levitation vehicle?

Re:how is Embry-Riddle relevant to by brsmith4 · 2004-08-04 12:29 · Score: 1

Why don't you go to their site and look at their offered majors and graduate research before asking such questions. They are much more than just aviation.
Re:how is Embry-Riddle relevant to by Natchswing · 2004-08-04 13:54 · Score: 1

It's an ugly image (aviation) but that's what we're known for. The professor who proposed this grant is one of the best known gravity wave modeling researchers in the country. I, personally, work in the Atmospheric Physics Research Lab building sounding rocket payloads.
Re:how is Embry-Riddle relevant to by Anonymous Coward · 2004-08-05 04:54 · Score: 0

Surprising as it may be, aircraft are also susceptible to gravity.

RFP is the answer by hectorh · 2004-08-04 12:15 · Score: 5, Interesting

I think that you should look at your intended application.

- How much disk space are you going to need in total?
- How much disk space are you going to need per node?
- How much RAM is each node going to need?
- Is your application going to benefit from a low-latency or a high-bandwidth connection between nodes?
- What about cpu? which cpu family will provide the best bang/$ for your calculations? PPC or X86? x86-64 maybe?

Once you know what you need, put it together in an RFP and send it out to every company that shows up under a google search for "beowulf cluster"

Review the responses and pick the best.

Since you are asking this question here, I'm going to refrain from suggesting the better option which is to build your own.

Hector

Re:RFP is the answer by AndyRobinson · 2004-08-04 22:18 · Score: 2, Informative

I think that's pretty much right. Two suggestions though...
Firstly, the more you put into the process the more you'll get out of it so be prepared to come up with a good RFP. If you're not an expert in clusters then you might well not know the answers to some of these questions so be prepared to take advice from suppliers. Sure, some of them may try and rip you off but most will be honest and helpful which will make the dodgy ones pretty easy to spot. Alternatively, look for some external, independent help to work with you on both writing the RFP and selecting a supplier.
Secondly, once you've got an RFP don't send to every company you can find. Pick a few - say 5 or 6 - good ones, send it to them, and then be prepared to spend some considerable time talking to them and answering their questions. You'll get much better responses that way. Alternatively, have short, very initial discussions with a larger number and then reduce that down to a short list as early on as you can.
Part of my job involves responding to RFPs. We're usually pretty busy so we have to prioritise which RFPs we respond to and how much time we put into the response.
The ones we put most effort into are, quite frankly, the ones which we think we stand a good chance of winning. Those are usually the ones which the client has done their homework on and come up with a good spec of what they want to achieve (but not necessarily how they want to achieve it), and done a reasonable amount of pre-selection of suppliers before expecting them to invest lots of time responding.
The ones we tend to politely decline are those that have been sent to everyone and his dog as, from experience, it suggests that the client doesn't really know what they're after, doesn't really know how to judge between suppliers, and/or isn't really bothered about who they choose and will just go with the lowest bidder.
Having said all that, though, I work in a different field so some of this might not apply. On the whole, though, it's worked when I've been commissioning work and finding suppliers so I think the basic principles work!
Re:RFP is the answer by Stinking+Pig · 2004-08-05 05:21 · Score: 2, Informative

I've been privileged to answer a lot of RFPs in my career, so here's some tips from the other side to make the process go a little smoother:

Corporate background questions are fine, but please stick to general stuff that can be answered with boilerplate. No one at the vendor knows or cares where our executive team went to college, and it's going to be a huge PITA to track that sort of BS down.

Ask what you want to know, but please re-read the RFP when you're done writing it. If you've asked the same question 50 times in different wording, I'm going to answer it once and paste the same answer 49 times. That's not helping anyone.

Do not use forms, whether document or web based. It makes it very difficult for us to check our work and makes it impossible for us to provide supplemental information.

Do give a schedule with a reasonable amount of time. I'd release to vendors, wait a week, do a bidder's conference call, wait a week, and then collect responses. If it's taking them longer to respond, then they're either too strapped for resources or too far from their core competency.

And for your own protection, here's some stuff to look out for:

If they can't reveal processes for "security reasons" it very very probably means that they don't have any process. Run screaming.

If a vendor is grossly more or less expensive, find out why. They might have good reasons, or they might not know what they're doing.

--
"Nothing was broken, and it's been fixed." -- Jon Carroll
Re:RFP is the answer by Anonymous Coward · 2004-08-05 07:30 · Score: 0

Don't forget about maintenance. Computer hardware breaks, and if you don't include support for fixing it in your RFP, you're going to be looking at a lot of down time. Yes, maintenance has to be paid for, it's expensive, and it means that you won't get as much hardware for the money. You'll appreciate it when your system goes down, but you don't have to scramble for parts, and somebody is there to fix it right away.

Also, requiring maintenance in the bid weeds out a lot of the lower quality vendors.

Re:I'll do it by Vaevictis666 · 2004-08-04 12:22 · Score: 2, Funny

Oh man, I so have #3 down...

3. Pocket the leftover $499.5K

Microway by brsmith4 · 2004-08-04 12:26 · Score: 5, Informative

I run a 48 Node Microway beowulf and I must say that it is the most stable system available. Everything came assembled and ready to go (of course, I built the enclosure and did the networking, but they will do that for you if you'd like). If you're not very knowledgeable about beowulfs, how do you know you'll need so much power? Do you know how well the software you will be using will scale? Is it close to embarrassingly parallel or does it lose efficiency over X number of nodes? What type of resources and consumption does the program use? Is it extremely processor hungry, or does it deal with dense matrices and require low memory latency and high bandwidth or both? Do you know if you will need the power of Myranet or will you be able to get by on GigE?

These are important questions you must ask your researchers and yourself before you purchase this cluster. But, to answer your question, I believe Microway is the best choice and I plan on having them build our next cluster in the next fiscal year.
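
If you want to put a number on the "does it lose efficiency over X nodes" question, time the same job at a few node counts and compute speedup and parallel efficiency. A minimal sketch follows; the timings are invented for illustration, and you would plug in wall-clock times from your own code on borrowed or test hardware.

```python
def speedup_and_efficiency(timings):
    """Map node count -> (speedup, efficiency) relative to the 1-node run.

    timings: {node_count: wall_clock_seconds}, must contain a 1-node entry.
    speedup    = T(1) / T(n)
    efficiency = speedup / n   (1.0 means perfect scaling)
    """
    baseline = timings[1]
    return {
        nodes: (baseline / seconds, baseline / seconds / nodes)
        for nodes, seconds in sorted(timings.items())
    }


# Invented wall-clock times in seconds for the same problem size.
timings = {1: 1000.0, 8: 140.0, 32: 50.0, 128: 25.0}

results = speedup_and_efficiency(timings)
for nodes, (speedup, efficiency) in results.items():
    print(f"{nodes:4d} nodes: speedup {speedup:6.2f}, efficiency {efficiency:.2f}")
```

With numbers like these, going from 32 to 128 nodes only doubles the speedup, which would be a hint that either the interconnect or the algorithm is the limit and that buying more nodes is not the best use of the money.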

-brian

Re:Microway by gumbi+west · 2004-08-04 16:21 · Score: 1

Uh... this dude doesn't know how to operate his pant's fly. give him a break

Negotiate a success story by elliotj · 2004-08-04 12:31 · Score: 5, Interesting

Whoever you choose to go with (I'm partial to Apple, but that's just me - and just because they have sexy hardware), see if you can get them to give you either more for your money, or free implementation/consulting help, or something like that in exchange for using your implementation as a success story. I think Virginia Tech got a bunch of free stuff from Apple when they decided to build their supercomputer.

All these vendors want to be able to talk about their work. Letting them use you for marketing may help you get more for your money.

Not Angstrom by Anonymous Coward · 2004-08-04 12:33 · Score: 4, Informative

I currently maintain some Opteron based Angstrom Microsystems Linux clusters. We've had them for less than a year, and already 30% of our nodes have had to be replaced. Support has been a nightmare.

Sadly, I was not around when the proposal was made, otherwise I would have rejected this cluster outright. There is no way to hook external storage up to this beast. There is no USB, Firewire, SCSI, external SATA, or fibre channel options. You can't even run an ATA cable out of the thing without drilling holes into the blade walls.

Personally? I'm looking at an XServe or an IBM Bladecenter.. but maybe it's just because I'd like some real support.

Re:Not Angstrom by Natchswing · 2004-08-04 13:56 · Score: 1

Actually, the professor awarded the grant was going to choose that company. Please provide more information so I can approach him with some facts. I'm sure he would be very appreciative of the advice.
Re:Not Angstrom by Anonymous Coward · 2004-08-06 01:55 · Score: 0

Angstrom is a white-box builder that has attempted to build a few clusters. You need experts on the other side of the phone, not $7/hr screwdriver monkeys.

Look, from an insider, here are the places you should really look:

Aspen Systems
IBM
Linux Networx
Microway
TeamHPC
Verari

Most certainly, you do not want some dumb white-box builder.

For the most bang-for-the-buck, look at Aspen or TeamHPC. Both of these guys have quality people, good pricing, and will take care of you.

cluster experience by Robbat2 · 2004-08-04 12:53 · Score: 3, Interesting

First of all, you really should put out an RFP for your cluster.

We've got a 128 node (1 cpu per node) cluster from Atipa http://www.atipa.com/ that cost CDN$ 0.25M.
128 P4 Xeon, 1GB RAM, 120GB IDE, Gigabit Ethernet.
I'd expect you to get a lot more for your USD$ .75M, like maybe doubling your size and getting AMD64 nodes. Look at your primary problemset first, see if it's IO-intensive or CPU-intensive to figure out what you want in the way of disk/networking.

The only thing I don't like about it is Atipa's configuration of Redhat8 (they didn't offer anything newer at the time). Look for something newer there.

Atipa is one of the suppliers for SGI-branded clusters as well.

I'd really like a cluster from http://adelielinux.com/en/, but I wasn't aware of them at the time we did our RFP and cluster purchase.

--
ICQ# : 30269588
"I used to be an idealist, but I got mugged by reality."

Re:cluster experience by Robbat2 · 2004-08-04 12:56 · Score: 2, Informative

furthermore, make SURE you have sufficient physical space and air conditioning capacity for your new cluster.

--
ICQ# : 30269588
"I used to be an idealist, but I got mugged by reality."
Re:cluster experience by thempstead · 2004-08-04 20:40 · Score: 1

Also make sure that the power feed to your computer room will handle the load required for running the machines in the cluster in addition to everything else in there. I have a friend who ran into this with a beowulf cluster where he was working...

t
Re:cluster experience by Anonymous Coward · 2004-08-06 01:59 · Score: 0

Yes on the RFP,

Big NO! on Atipa.

Seems all of the Linux experts quit Atipa in January and formed their own company, TeamHPC.

My experiences with Atipa left me feeling that their tech experts were good, but hardware support was terrible. Maybe that's why they left?

Apollo? by loony · 2004-08-04 13:17 · Score: 1

Maybe I'm dumb but I always thought apollo was bought by HP not sun ;)

Peter.

Re:Apollo? by ITsAlive · 2004-08-05 00:23 · Score: 1

I think he means: Apollo - the sun god
as is apparent from his words:
... every computer manufacturer under the sun (including Apollo himself)...

Anyone else think this is.... by WasterDave · 2004-08-04 13:39 · Score: 1

...f'ked. Like, seriously arse backwards.

You got the grant *then* went shopping? Does all US academia work like this? Aren't you supposed to work out what you want to do, how to do it, how much and only then apply for the grant?

Dave

--
I write a blog now, you should be afraid.

Re:Anyone else think this is.... by Anonymous Coward · 2004-08-05 06:27 · Score: 0

ER is a third rate enginerring skool. sorry, but that's not how MIT and the rest of academia works.
Re:Anyone else think this is.... by wmshub · 2004-08-06 08:34 · Score: 1

Uhhhh...it looks like they did work out what they want to do (gravity wave research), and how to do it (with a 256 node beowulf cluster), then they got the grant. The only thing left is to find a vendor for their hardware. The guy writing this probably isn't the researcher who got the grant, he's an IT person who needs to help figure out who to buy it from.

Since it can take over a year to get a grant in some cases, picking out the vendor before the grant arrives is usually stupid. By the time it arrives, the hardware and price you settled on won't be competitive any more, and the actual hardware may not even be available!

cluster problem set by Raleel · 2004-08-04 13:50 · Score: 2, Informative

I see you mentioned the problem set, which is good. to me and my only somewhat novice mind (I work with scientists all day, hear all kinds of stuff), this sounds suspiciously like a fine grained problem. that is to say, there will be a lot of interprocess communication, so don't skimp on the network. I'm not talking "get gigE". I'm talking "look at myrinet, or Quadrics, or infiniband".

Most people can do you up a 256 node cluster for under half a million, but doing up one with high speed and low latency network is another story. that net costs bucks, around $1500 per machine (for a card, a cable, and a port on the switch).
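
To see how much that network eats, here is the back-of-the-envelope arithmetic as a small sketch. The $1500 per node and 256 nodes come from this thread; the $700,000 budget is an assumption taken from the "$0.7 mil" in the original question, so adjust it to the real figure.

```python
NODES = 256
INTERCONNECT_PER_NODE = 1_500   # card + cable + switch port, per the estimate above
BUDGET = 700_000                # assumed: the "$0.7 mil" from the question

interconnect_total = NODES * INTERCONNECT_PER_NODE
interconnect_share = interconnect_total / BUDGET
left_per_node = (BUDGET - interconnect_total) / NODES

print(f"Interconnect total:      ${interconnect_total:,}")
print(f"Share of budget:         {interconnect_share:.1%}")
print(f"Left per node for the box itself: ${left_per_node:,.2f}")
```

Under those assumptions the fast network alone is more than half the budget, leaving a bit over $1,200 per node for everything else, which is why it matters whether the problem really is fine grained before you commit to 256 nodes.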

Make sure you know your problem. if you understand how it works, then you can buy a cluster that meets the need much better. Make sure the nodes are not being starved for ram if the problem is a ram hungry one (your researchers should be able to tell you, even from data off a single machine). Find out if it's heavily integer based or floating point based (my guess is that it's a floating point problem). Find out if it's a lot of vector and matrix manipulations.

every machine type is a little better at something than the others. for instance, on integer based problems, x86 will generally scorch everything else. on floating point, there is lots of good competition (apple, intel's itanium, opterons). Don't be afraid to say to the vendor "we want to run on it for a week before we buy".

as others have said, do a RFP, but get their specs. get your tech guys to look it over hard. ask them "what sucks".

All that having been said, i'm a fan of apples and of verari systems. Dell is also quite good.

--
-- Who is the bigger fool? The fool or the fool who follows him? --

Myrinet? by complete+loony · 2004-08-04 14:47 · Score: 1

You mean Myrinet?

--
09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.

Re:Myrinet? by brsmith4 · 2004-08-04 16:49 · Score: 1

Sorry for the spelling error. Had to get me on something, eh?
Re:Myrinet? by complete+loony · 2004-08-04 16:58 · Score: 1

Well, you got me interested, and I wanted to find out more... At least google new how to spell it.

--
09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
Re:Myrinet? by jrockway · 2004-08-05 01:05 · Score: 2, Funny

While we're on the subject of spelling, I just thought I'd point out that you need a knew spellchecker... or maybe you already new that?

--
My other car is first.
Re:Myrinet? by brsmith4 · 2004-08-05 05:11 · Score: 1

What makes you think it wasn't just a blind typo? And don't give me any shit on how 'a' is on the opposite end of the keyboard from 'i'.

There is a lot of unnecessary ego on /.
Re:Myrinet? by complete+loony · 2004-08-05 14:34 · Score: 1

(re-reads own comment) ah, that did sound a bit nasty. Apologies, I usually just try for brevity in my writing, I didn't mean to sound harsh..

--
09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.

Actually, it *is* rocket science by fm6 · 2004-08-04 15:05 · Score: 1

Even if what you said were true, it's a pretty useless statement. Like reducing capitalism to "buy low, sell high."

But there's more here than figuring out who can plunk down the best system for the specified price. There's the maintenance/support costs. And picking a particular hardware platform kind of defines your choices for software -- so whose compiler do you like best? And any serious school needs to ask: can we maybe do a better job, more cheaply, cobbling together a cluster from cheap (abandoned, commodity, or donated) hardware? Which has the additional advantage of giving one's students some solid practical experience. Slashdot has run any number of stories on projects that did just that; one or two have achieved a small measure of success.

Re:Actually, it *is* rocket science by duffbeer703 · 2004-08-04 15:23 · Score: 1

Capitalism is "buy low, sell high". The rest is detail.

High-dollar Federal grants generally require that you adhere to some sort of standardized purchasing practice.

Competitive bidding isn't simply "Ok, this guy said he can do it for $50, he wins."

When you issue an RFP for others to come in and do work, you have to weigh various factors in your scoring.

Price is one factor. Experience and hardware features are another. You might assign bonus points to companies that allowed for a few students to take part in the implementation.

In other words, you need a process that makes the vendors do their homework and get you reliable prices and statements of work. Otherwise, contracts have a habit of going to the best salesmen with the coolest swag.

--
Conformity is the jailer of freedom and enemy of growth. -JFK
Re:Actually, it *is* rocket science by fm6 · 2004-08-04 17:05 · Score: 1

Capitalism is "buy low, sell high". The rest is detail.
It's the details that separate Donald Trump from, well, me.

Yo bitch! by Anonymous Coward · 2004-08-04 15:27 · Score: 0

$1M = $1,000

$1mm = $1,000,000

go blade by aminorex · 2004-08-04 15:27 · Score: 1

I'd probably just buy 10 blade enclosures with 14
2-way Xeon blades each from ibm off the shelf.
They have blades with dual gigabit nics. A Pair of
3-Com 16-ways nics give you 2 parallel networks,
which makes it flexible. Run OpenMOSIX.
I'm pegging the whole shooting match at roughly
$420k. Spend the rest on NAS, pack out the RAM,
get a nice visualization wall, etc.
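
Spelling out that arithmetic as a quick sketch: the enclosure, blade and price figures are the ones quoted above, while the $700,000 budget is an assumption based on the "$0.7 mil" from the original question.

```python
ENCLOSURES = 10
BLADES_PER_ENCLOSURE = 14
CPUS_PER_BLADE = 2            # "2-way" Xeon blades
HARDWARE_ESTIMATE = 420_000   # rough figure quoted above
BUDGET = 700_000              # assumed: the "$0.7 mil" from the question

blades = ENCLOSURES * BLADES_PER_ENCLOSURE
cpus = blades * CPUS_PER_BLADE
# Enclosures and switches are folded into this per-blade figure.
cost_per_blade = HARDWARE_ESTIMATE / blades
left_over = BUDGET - HARDWARE_ESTIMATE

print(f"{blades} blades, {cpus} CPUs")
print(f"About ${cost_per_blade:,.0f} per blade")
print(f"${left_over:,} left for NAS, RAM and the visualization wall")
```

That works out to 140 blades and 280 CPUs, which is well short of the 256 nodes mentioned elsewhere in the thread if you count nodes rather than processors.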

--
-I like my women like I like my tea: green-

You're gonna need help.... *shakes head* by menscher · 2004-08-04 15:36 · Score: 1

First off, it's disturbing that you got this grant. The NSF should be ashamed of themselves for giving that much cash to someone so clueless.

Second: you're almost certainly going to have to put it out to bid. For example, at UIUC, the bid limit is $28,100. Anything over that *must* go to bid unless you can provide a really good reason why you have to "sole source" it.

Now, you need to start thinking about stuff. First off, forget the number of nodes. You need to start by thinking about how they'll be used. Like, how much communication will there be? A few large packets, or many small ones? Myrinet is nice and fast, but will increase your costs by 50% over gigE. Similarly, you need to figure out how much ram to put in each node. How many processors, keeping in mind they'll be competing for the memory bandwidth. 32 bit vs 64 bit. The list goes on and on....

Of course, you didn't give any details, which means you probably don't have a clue. So maybe start by purchasing a couple of test systems and benchmarking your code on them, to see where your bottlenecks are.

Good luck. Sounds like you'll need it.

Re:You're gonna need help.... *shakes head* by Anonymous Coward · 2004-08-05 11:29 · Score: 0

This was probably an MRI grant (Major Research Instrumentation). NSF doesn't require that you state a plan for how you will purchase the system; only that you have a plan for how you will USE the system. I know. We just got a $200K MRI (including a small blade setup).
Re:You're gonna need help.... *shakes head* by Stevyn · 2004-08-05 14:34 · Score: 1

Come on. He asked the slashdot community for help because he needed it. If you think it's wasteful government spending, then write to your congressman. Don't put this guy down because he admits he would like some technical input by people who he knows have more experience than he does.

You did give some useful information, but there was no need to start it off by calling him clueless.

I'd lend you mine... by Timbotronic · 2004-08-04 17:01 · Score: 1

...but I need it to run Doom3 in "Ultra". Sorry.

--

One of these days I'm moving to Theory - everything works there

contact other universities by Parsec · 2004-08-04 17:06 · Score: 2, Insightful

If you haven't already, google for beowulf clusters at other universities and contact those departments.

interconnect is the thing by Plagued+by+Penguins · 2004-08-04 18:03 · Score: 1

What you buy depends mostly on how much you want to spend on your interconnect, which in turn depends on your applications. You can spend >50% of your cash on the interconnect - but do you need to?

Are your apps parameter study serial jobs? (interconnect doesn't matter much - just use gigE)
Already written MPI apps? (few large messages? many small?)
OpenMP only? (you need large SMP nodes)
Do they need large bandwidth or low latency or both?
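
A rough sketch of the questions above as a decision function. The `Workload` fields and the returned strings are made up for illustration, and the two MPI branches (many small messages means latency matters, few large ones means bandwidth matters) are the usual rule of thumb, not something measured on a real machine:

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical description of how the cluster will be used.
    uses_mpi: bool = False             # already-written MPI applications
    openmp_only: bool = False          # shared-memory codes only
    many_small_messages: bool = False  # MPI traffic pattern


def suggest_hardware(w: Workload) -> str:
    """Map a workload description to the rough advice in the comment."""
    if w.openmp_only:
        # OpenMP cannot span nodes, so the node itself has to be big.
        return "large SMP nodes"
    if not w.uses_mpi:
        # Independent serial jobs (parameter studies) barely touch the network.
        return "gigE"
    if w.many_small_messages:
        # Assumption: many small messages are dominated by latency.
        return "low-latency interconnect"
    # Assumption: a few large messages are dominated by bandwidth.
    return "high-bandwidth interconnect"
```

For example, `suggest_hardware(Workload(uses_mpi=True, many_small_messages=True))` points at a low-latency interconnect, while the default `Workload()` (serial jobs) just says gigE.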

Infiniband or gigabit ethernet are your main options. IB is low latency, and probably even more cost effective than gigE, but you may need the gigE anyway for a maintenance network (netbooting, NFS etc.). gigE usually comes with the motherboard, but you still need to budget for a fat tree of switches to connect it all. Myrinet's too pricey and (I think) slower than IB, but might be simpler to connect and has more mature MPI implementations for it.

Watch out for big vendor cluster software people - they may not actually know what they're doing.... not naming any names. What big vendor actually did (for the cluster next door to ours) was make it all slower!

IMHO you don't need that serial maintenance network crap they try to sell you, or even IPMI or similar. These Xeon/P4/Athlon64/Opteron clusters should be reliable enough that it's a waste of money. Our 264 node (528 Xeon) machine is fine without it.

If you want real bang for your buck then avoid the large expensive gigabit ethernet switches - they usually have limited backplane bandwidth anyway. We use 2D mesh networking made from a stack of 24-port gigE switches and had the fastest machine in Canada for a while... our networking is now way simpler than the hypercube-like topology on that page, but every node is still a router, and it works really well.
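
To give a feel for what "every node is a router" means in a 2D mesh, here is a minimal sketch. It assumes a plain rectangular grid with no wraparound links and one node per grid point; the actual wiring of the machine described above isn't spelled out here, so treat the layout and the function names as illustrative only:

```python
def mesh_neighbors(x, y, width, height):
    """Return the directly connected neighbors of node (x, y) in a
    width x height 2D mesh without wraparound (an assumed layout)."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < width and 0 <= cy < height]


def hop_count(a, b):
    """Minimum number of links a message crosses between nodes a and b.
    In a mesh without wraparound this is the Manhattan distance."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])
```

A corner node has two neighbors and an interior node has four, so traffic between distant nodes gets forwarded through the nodes in between instead of through one big switch backplane.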

OSCAR is a great install system for a cluster. Do it yourself - it's the only way you'll ever be able to maintain the machine in the long term anyway... Just buy the hardware from anyone who gives you the best deal and looks like they'll be around for 5 years to replace nodes as they die.

Drop us a line if you want more dodgy advice :-)

Wanna Buy a Cluster? by Apollo · 2004-08-04 18:33 · Score: 2, Funny

Hey, you -- yeah, you. Wanna buy a cluster? I know you'd like some UltraSPARC IVs. No? Come on. I've got great deals on last year's hardware, too. For the low, low price of $757,825, you, too, can own a piece of precision equipment from the Sun Fire line. OK, OK, fine! Go to that guy across the street. But make sure you come back here before you decide, because I've been authorized to toss in some incentives.

You'll be back, believe me. You'll be back in no time.

ring.. by ivano · 2004-08-04 19:43 · Score: 3, Insightful

...Apple
...Dell
...IBM
and *talk* to a sales rep. I know how hard this is (not!) but asking Slashdot is kinda silly. Sure, you might want some impartial advice, but /. might not be the right place :) Ring these people and decide for yourself (you're a smart man, no?). From the media coverage Apple is getting for its "out-of-the-box" clusters, I would seriously consider them as an option.

...and good luck! It sounds like a good project.

ciao

Warning ! by dargaud · 2004-08-04 19:44 · Score: 2, Interesting

Commercial clusters, hah! My university did exactly that and they've had only problems. There was specialised hardware in it. It was never well supported by the Linux they installed on it, which was impossible to upgrade or change according to the admin, who kept losing hair over it. In other words, that system never worked properly.

When my research group decided to build one, I was in charge and opted for OpenMosix, which after a tweaking period worked really well. Now with the various bootable CDs with OpenMosix (PlumpOS, BCCD, Quantian, ClusterKNOPPIX...), tests and upgrades are done by just pressing reset!

Of course with clusters your mileage may vary.

--
Non-Linux Penguins ?

Don't ask slashdot by winchester · 2004-08-04 20:42 · Score: 1

I am serious.

Building or buying a cluster is serious business. Talk to supercomputing experts. Issues involved are numerous. Just a short list:

What applications will this cluster run? Just the one you mentioned, or will you be running more than just that one?
Will you need a low-latency network (hint: you'll want one)? Will this be the current safe choice Myrinet or the up-and-coming Infiniband? This is, again, application dependent.
Who will do the hardware support? Are you allowed to change disks and memory yourself or will the vendor do that for you?
Who will do the software support? Can you install software yourself? Think about kernel updates, driver updates for the low-latency network etc.
Do you have tools to manage the cluster? If not, who will supply them? Will you develop them in house? Pick them off the internet? Is the experience in house to use the tools?
Will the hardware vendor deliver just boxes or a working configuration? The last thing you'll want is to set up 128 BIOSes yourself :-)

This list is nowhere near complete... there are so many issues involved in buying a cluster, you really need expert advice.

Talk to other sites who do the same thing as you want to do, who run the same kind of applications as you want to run etc.

on eBay by managementboy · 2004-08-04 21:09 · Score: 1

I found mine on eBay... but make sure you check for bad RAM first (use Knoppix)

Re:on eBay by Nynaeve · 2004-08-06 06:01 · Score: 1

Tip: To test memory, use Memtest86

Low Cost Cluster Computing by r0xah · 2004-08-04 21:51 · Score: 1

I would very much recommend this research site from one of my professors at the University of Kentucky. He has been doing work with cluster supercomputing for quite some time now and has managed to build some very impressive systems at low costs. Much lower costs than what your current grant is for. With a grant of that size, using this professor's techniques you could build a whole bunch of clusters. I would suggest taking a look at his group's research site, aggregate.org.

You can also see one specific example of a very low-cost, efficient cluster computer: KASY0

--
those people who think they know everything are a great annoyance to those of us who do. -isaac asimov

do you have a good relationship with one vendor? by Clover_Kicker · 2004-08-05 03:51 · Score: 1

Chances are, your school has a hefty existing contract with one of the vendors bidding on your cluster. If you like that vendor, and they haven't fucked you over in the past, why not go with them?

They are less likely to take advantage, since they want to continue doing business with you. Your existing relationship will give you a little leverage.

Capitalism by nuggz · 2004-08-05 06:23 · Score: 1

That is a horrible definition of capitalism, maybe a good one for the markets though.

Capitalism would be investing in the opportunities offering the best return.

Government Accounting by Anonymous Coward · 2004-08-05 07:49 · Score: 0

It's obvious you know nothing about government accounting. The CASB has the right to re-open the contract for an error of $.01 (yes, that's right, one cent).

And they do it.

1 Million CANADIAN by Anonymous Coward · 2004-08-05 08:10 · Score: 0

700,000 US dollars would probably be about a million in Canadian funds. Obviously the slash editor is Canadian and just showing his true colours (notice how I spelled "colours").

Re:Microway NOOOOOOO! by Anonymous Coward · 2004-08-05 08:56 · Score: 0

We bought a cluster and less than half of it is functioning and it's still "under warranty".

Posting AC for obvious reasons.

Re:Microway NOOOOOOO! by brsmith4 · 2004-08-05 10:15 · Score: 1

Have you bothered to CALL them?

When we purchased ours, we had two nodes that had bad power supplies within the first two months... Replacements arrived within 2 days. The cluster withstood a severe AC outage, where the ambient temperature rocketed to 105 degrees and failsafes had not yet been implemented. We've had no further problems with the system since the initial hiccup, with a consistent load (added up) of 105.0 and an uptime of 104 days for every node.

How you could allow your cluster to run at half capacity is beyond me. You would have had to let that go for quite some time (unless you've got a two-node cluster).

Posting AC for obvious reasons.

Probably trolling.

Pay someone to do it for you? by femto · 2004-08-05 15:01 · Score: 1

One option is to pay an experienced independent person (i.e. someone who doesn't sell their own hardware or have an affiliation with a hardware vendor) to make the decision for you. If it costs 5% of the purchase price and the person saves you from buying a lemon (or even saves you 6% on the system), hasn't that been a good investment?

Perhaps consider using a team member from a free software clustering project as your consultant (check credentials though)? That way you hopefully get someone who is an expert and will be up front with you.
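
The fee-versus-saving arithmetic can be sketched in a few lines. The 5% and 6% figures are the ones from this comment; the $700,000 purchase price is borrowed from elsewhere in the thread, and the function name is just for illustration:

```python
def consultant_net_saving(purchase_price, fee_percent, saving_percent):
    """Net benefit of hiring a consultant: what they save you minus
    what they cost, both expressed as a percentage of the purchase price."""
    fee = purchase_price * fee_percent / 100
    saving = purchase_price * saving_percent / 100
    return saving - fee


# 5% fee, 6% saving on a $700,000 purchase (price is an assumption)
net = consultant_net_saving(700_000, 5, 6)
print(f"Net saving: ${net:,.0f}")
```

On those numbers the consultant costs $35,000 and saves $42,000, so you come out $7,000 ahead before even counting the value of not buying a lemon.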

http://aggregate.org/ by donniejones18 · 2004-08-05 15:08 · Score: 1

I would have the research group that I work with at the University of Kentucky build it. Maybe you should contact my professor, Dr. Hank Dietz.

KASY0
University Of Kentucky Supercomputer Breaks The $100 Per GFLOPS Barrier

They built the supercomputer for under $40,000 with 128 nodes + 4 spare nodes, just think how many nodes and how powerful it could be with $700,000!
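
A back-of-the-envelope version of that "just think how many nodes", using only the figures quoted above. It assumes cost scales linearly with node count, which is a big simplification: the network, power and cooling for thousands of nodes would not scale this nicely, so read it as an upper bound and not a plan.

```python
KASY0_COST = 40_000      # "under $40,000", so treat this as an upper bound
KASY0_NODES = 128 + 4    # 128 nodes plus 4 spares
BUDGET = 700_000

# How many times over the budget covers the original build
scale = BUDGET / KASY0_COST

# Naive linear scaling of node count (integer arithmetic avoids rounding surprises)
naive_nodes = BUDGET * KASY0_NODES // KASY0_COST

print(f"{scale:.1f}x the budget, roughly {naive_nodes} nodes if cost scaled linearly")
```

That works out to 17.5 times the budget and about 2,310 nodes on paper.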

Re:http://aggregate.org/ by Nynaeve · 2004-08-06 06:04 · Score: 1

I've been keeping up with Dr. Dietz's work since Purdue. I really admire his work, and I even ran a small 2-node PAPERS cluster at home using his AFAPI library.

PeTS may be applicable here, especially his research into Flat Neighborhood Networks (FNNs). However, I think that AMD/Intel systems use too much power (70 watts or so each). A computationally-equivalent cluster of VIA EPIA motherboards (maybe 10 watts each) would be both physically smaller and much easier on the electric bill. At $100 each for a VIA EPIA V10000A or $163 for the newer VIA EPIA M10000 Nehemiah I could afford to both buy a cluster and run it. Running an AMD cluster would use more electricity than I could afford.
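
For anyone who wants to put numbers on the electric bill, a minimal sketch. The 70 W and 10 W figures are the rough per-board guesses above; the node counts and the electricity price are placeholders to fill in yourself, and a fair comparison would need more of the slower boards to match the same compute:

```python
HOURS_PER_YEAR = 24 * 365


def annual_energy_cost(nodes, watts_per_node, price_per_kwh):
    """Yearly electricity cost for a cluster running flat out 24/7.
    Ignores cooling, switches and power-supply losses."""
    kwh = nodes * watts_per_node * HOURS_PER_YEAR / 1000
    return kwh * price_per_kwh


# Placeholder values: 16 nodes of each kind, $0.10 per kWh (both assumptions)
amd = annual_energy_cost(nodes=16, watts_per_node=70, price_per_kwh=0.10)
via = annual_energy_cost(nodes=16, watts_per_node=10, price_per_kwh=0.10)
print(f"AMD/Intel: ${amd:.2f}/year, VIA EPIA: ${via:.2f}/year")
```

At equal node counts the ratio is simply 70/10, so the interesting question is how many EPIA boards it takes to match one AMD node.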

The picture in the middle of the PeTS page, KAOSlab.jpg, is my background desktop at work, and I often get comments. I wish I were so lucky as to work with that sort of thing every day. :)

IBM of course by lkaos · 2004-08-05 15:33 · Score: 1

Let me start off with a disclaimer: I do work for IBM; however, the following represents my opinions, not those of IBM.

There's a reason that they say you never get fired for going with IBM. IBM has more super-computing experience than anyone. We've got an amazing turn-around capability when it comes to building clusters. But perhaps the best thing about going with IBM is the fact that it builds the relationship.

IBM is very involved with universities especially in the areas of high performance computing. We offer a number of grant programs to help out. I've seen how we handle universities where we make hardware investments. The people handling it really care about making sure things work out well for the students and professors involved.

It's definitely worth calling an IBM sales person about it. If you need a number, feel free to email me and I'll do my best to find you one.

--
int func(int a);
func((b += 3, b));

Re:IBM of course by Anonymous Coward · 2004-08-06 17:12 · Score: 0

IBM scams the top500 with benchmarks replicated from one never-shipped-to-customers benchmarking-only cluster. Look at the numbers on the list... Notice a lot of them the same? Gee, I wonder how many customers actually ran (or could possibly get) those numbers. Yeah, you guessed it... NONE

The IBM cluster we have at our university is badly designed, badly maintained, and expensive. Strictly by the numbers, an underperforming heap of crap.
The IBM engineers actually de-tuned Linux with their alleged enhancements. Pretty funny.

IBM - no creativity. Very sad.

multi-vendor requestor by Anonymous Coward · 2004-08-06 01:40 · Score: 0

http://www.linuxhpc.org/ has a form that will submit to 20 different companies, most specializing in HPC.

Be a good idea to get in contact with the right people.

go for a flexible solution... by Anonymous Coward · 2004-08-06 01:52 · Score: 0

I do NOT work for Apple but am enthusiastically using MacOS X (I'm an old UNIX hand who is tired of struggling with Windows and Linux/*BSD when I have 'real' work to do...); on the other hand, I hate the idea of computing power being focused on such esoteric goals (no matter how laudable, a nice generic cluster is not like a telescope looking up at the stars...).

I would suggest 96-112 Xserves and 32-16 PowerMac G5s with supporting hardware (graphic displays for the G5s and XRAIDs for the rackmounted Xserves...)... you could then buy a nice 20 foot truck or an RV to install the cluster and most of the PowerMacs, allowing you to take it to paying customers who would like to buy time when it is not being used for the research you describe (I am presuming you will own the cluster after the project is over and that you could make an arrangement with NSF that was above board...).

Heck, you could even buy a few copies of X-Plane (www.x-plane.com) and challenge high school students to develop projects that you could drive up and run with them as part of a competition that could lead to scholarships at Embry-Riddle...

Performance bid by tengu1sd · 2004-08-06 09:02 · Score: 1

Do what Celera Genomics did for their equipment bids for human genome computing resources. Develop a benchmark test run representing sample code and data. Have each vendor run your benchmark in time trials.

"People asked me why we chose Compaq," says Marshall Peterson, Celera's vice president of infrastructure technology. "The answer is simple. We took a benchmark and gave it to all the vendors. Only two could run it. One ran it in 87 hours. Compaq ran it in seven." Peterson didn't disclose the name of the other vendor.

From Forbes.com
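
If you want to run that kind of time trial yourself on a couple of test systems, a minimal harness might look like the sketch below. The function names are made up for illustration, and the command you pass in would be your own benchmark; the numbers in the example are just the 87 and 7 hours from the quote.

```python
import subprocess
import time


def time_command(cmd, runs=3):
    """Run cmd (a list of arguments) `runs` times and return the best
    wall-clock time in seconds. Raises if the command fails."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


def rank_systems(results):
    """Sort a {system name: seconds} mapping from fastest to slowest."""
    return sorted(results.items(), key=lambda item: item[1])


# Example with the figures from the quote, converted to seconds
ranking = rank_systems({"other vendor": 87 * 3600, "Compaq": 7 * 3600})
```

Taking the best of several runs cuts down on noise from whatever else the machine was doing; for a benchmark that runs for hours, one run per system is probably all you can afford.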

Wow by KangXii · 2004-08-08 02:35 · Score: 1

Whoa, I've been thinking about going to Embry-Riddle, except the one in Daytona Beach, Florida. Maybe I should think about the Arizona one now.

Slashdot Mirror

104 comments