Ask Slashdot: Clusters On the Cheap?

Beowulf cluster? by Anonymous Coward · 2011-09-14 16:44 · Score: 1

Subject

Re:Beowulf cluster? by 2muchcoffeeman · 2011-09-14 18:01 · Score: 0

Of course this is the topic of the first reply. Of course it is.

Because if it were any other answer it would mean that /. had changed so completely that it had morphed into something unrecognizable as its former self.

Thank you for upholding my faith in humanity. Or, at least, in my fellow /.ers.

(Obvious and therefore obligatory follow-up questions: But does it run Linux? And how many Libraries Of Congress will it have in storage capacity? And can you imagine a Beowulf cluster of these clusters? Oh, wait ... )

--
Prevent Windows piracy. Use Linux instead.

Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-14 16:44 · Score: 3, Informative

Why waste money on building a cluster when you can rent the best in the world * by the hour * ?

Re:Uhm AWS EC2 Cluster Compute by jpedlow · 2011-09-14 16:48 · Score: 5, Informative

AWS EC2 was my response as well. :)
For raw horsepower in the short to medium term, use AWS: http://aws.amazon.com/ec2/
EC2 should do well for this, imho :)
Re:Uhm AWS EC2 Cluster Compute by subreality · 2011-09-14 17:05 · Score: 4, Informative

+1. It is very nice to be able to spin up 50 instances, run the hell out of your job, then delete them. It gets done faster, and you don't have to deal with maintenance, upgrades, and obsolescence. Realized you need more RAM? Just adjust it! And so on. It'll likely come out cheaper than owning your own after you add up all the hidden costs (power, cooling, space, time, etc).
The only downside is there are no GPUs. But that's not really a downside: if you do end up developing a GPU version, your cluster configuration would completely change (1x2 cores per box, 3-4U boxes with many PCI-E slots, instead of 2x8 cores or however many you can economically cram into a 1-2U pizza box), so the investment you'd make now would be completely wrong for that future development. With cloud servers you minimize sunk costs.
I use Rackspace Cloud and it performs as promised. It's definitely worth a look.
Re:Uhm AWS EC2 Cluster Compute by jpedlow · 2011-09-14 17:16 · Score: 1

http://aws.amazon.com/ec2/#instance
amazon has gpu instances :)
Re:Uhm AWS EC2 Cluster Compute by GeorgeK · 2011-09-14 17:16 · Score: 2

Actually, Amazon now offers instances with GPUs. See their page on High Performance Computing for more details.
Re:Uhm AWS EC2 Cluster Compute by justforgetme · 2011-09-14 18:35 · Score: 1

Have used Rackspace Cloud on some occasions. Runs very smoothly indeed. Actually I still have some web apps over there. But is it just me or did they introduce a new low tier? I can remember running 512MB instances for about 12€/month about a year ago...
I have started experimenting with scale engine lately. They could actually be better for the cluster thing the author wants since they can deliver more computing power/$
I don't know how easy it will be to set up the distributed computing on them but it could be a good solution.

--
-- no sig today
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-14 18:44 · Score: 0

The problem with EC2 is that it's virtualised - my tests showed that simulations ran about 3 times slower than on bare (equivalent) hardware.
Re:Uhm AWS EC2 Cluster Compute by KiloByte · 2011-09-14 19:11 · Score: 2

From a back-of-the-envelope estimate, I see that AWS gets even with buying your own hardware in three months. Except, you still get to own the gear.
Thus, if you need a week or maybe a month of computation, AWS might be a better option, but for anything above that, forget it. If your needs are more bursty, that shifts the balance towards AWS, but again, you need to estimate what you need.

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Re:Uhm AWS EC2 Cluster Compute by hardtofindanick · 2011-09-14 19:16 · Score: 1

There must be a better way than ec2

Cluster GPU Quadruple Extra Large Instance
22 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
2 x NVIDIA Tesla “Fermi” M2050 GPUs
1690 GB of instance storage

Personal experience: Disk IO: not dependable, network IO excellent.
4000 British pounds sterling = 6302.8000 US dollars
Assuming for each "experiment" you run 20 instances in parallel for 15 minutes (partial hours count as full hours, so remember to round up even if you use an instance for only a minute), you spend $40 per session.
You get to make 160 experiments, and you are over budget.
Or if you adjust everything optimally, e.g., end the experiment at 59 minute mark, and assuming your "parallelization" uses 5 instances, then you get to make 640 experiments. But real life is far from optimal, especially with those pesky grad students.
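
The arithmetic above can be sketched in a few lines of Python. This is only a sketch: the $2 per instance-hour rate is not quoted in the comment, it is backed out from the "$40 per session" figure for 20 instances, and the function name is made up for illustration.

```python
import math

def experiments_within_budget(budget_usd, instances, minutes, rate_per_hour):
    """Count whole experiments affordable when partial hours bill as full hours."""
    billed_hours = math.ceil(minutes / 60)  # 15 min and 59 min both bill as 1 hour
    cost_per_experiment = instances * billed_hours * rate_per_hour
    return int(budget_usd // cost_per_experiment)

BUDGET_USD = 6302.80  # 4000 GBP at the exchange rate quoted above
RATE = 2.00           # assumed USD per instance-hour, implied by $40 / 20 instances

print(experiments_within_budget(BUDGET_USD, 20, 15, RATE))  # 20 instances, 15 minutes
print(experiments_within_budget(BUDGET_USD, 5, 59, RATE))   # 5 instances, 59 minutes
```

Under that assumed rate this gives 157 and 630 experiments, which the comment rounds to 160 and 640.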
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-14 19:49 · Score: 0

Are you including electricity costs, hardware/network maintenance, HVAC, etc. in that estimate?
Re:Uhm AWS EC2 Cluster Compute by rioki · 2011-09-14 20:08 · Score: 1

+1. Since this is probably a university, they probably already have a place to put the thing that is powered and climate controlled. So there are actually "no" running costs. Yes, admins are "free" too. And looking at the average research project, there is always follow-up research, and if you need more money, you will probably get an answer along the lines of, "Wait? What did you do with all that money we gave you?"
Re:Uhm AWS EC2 Cluster Compute by subreality · 2011-09-14 20:18 · Score: 1

Sweet!
Re:Uhm AWS EC2 Cluster Compute by Joce640k · 2011-09-14 20:20 · Score: 2

How about setup/installation time. Installing and configuring a whole bunch of machines takes a while.

--
No sig today...
Re:Uhm AWS EC2 Cluster Compute by toruonu · 2011-09-14 20:37 · Score: 1

Yes, I wanted to make the same comment: hardware purchases usually come from a different budget line than power/rackspace/cooling/admins, so one cannot look at TCO in the EDU case because individual research groups never add it up. If one manages the whole university/institute compute infrastructure, then yes, TCO plays a role, but individual groups usually get a bag of cash for HW and never get any dough for support/electricity and usually don't have to pay it either. It's those pesky grads that do the admin work, no matter how inefficiently...
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-14 20:49 · Score: 1

Not everyone dreams about paying to become someone else's bitch, ya know.
Re:Uhm AWS EC2 Cluster Compute by KiloByte · 2011-09-14 20:52 · Score: 1

Electricity but not labour -- but for cash-strapped kinds of research, there's plenty of free or near-free student help.
A cluster you can get for £4000 is not going to need dedicated air conditioning.

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-14 21:40 · Score: 0

Did you calculate in power consumption, cooling, rack space and failing parts?
Re:Uhm AWS EC2 Cluster Compute by ron_ivi · 2011-09-14 22:19 · Score: 1

EC2's not the cheapest place to rent servers.
If you're going to rent for a whole month or more, it seems you get a lot more bang for the buck by renting dedicated servers like these:
http://www.server4you.com/root-server/ecoserver.php
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-14 22:56 · Score: 0

The last thing I would want is someone else having access to my data. I don't care how much they promise to keep their hands off, you just *know* that they are going to exploit it at some point. It may not be *as* cost-effective, but keeping data processing in-house is a keystone to IT security.
The "cloud" is for posers and sheep.
Re:Uhm AWS EC2 Cluster Compute by Nikker · 2011-09-14 23:43 · Score: 2

Well 640 experiments is all you will ever need...

--
A loop, by its nature, continues. If that didn't make sense, start reading this sentence again.
Re:Uhm AWS EC2 Cluster Compute by petermgreen · 2011-09-15 00:05 · Score: 1

Quite likely neither is relevant. At least here individual research groups aren't charged for electricity and labour is pretty much a fixed cost. As for aircon, that isn't really an issue at this scale.

--
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-15 00:47 · Score: 0

Why not build a Fastra 2 clone? 6000 EUR for 12TFlops.
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-15 00:50 · Score: 0

Just curious if you included rack/power/cooling/maintenance as a factor? Not sure how to calculate those.
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-15 00:59 · Score: 0

To demonstrate the power of cluster computing in my previous company, I requested and received a policy change. The result was that all equipment taken off-line was sent to my office instead of the recycle bin. (The company is on a 3-year replacement cycle.) The building manager pulled a nice wheeled rack for me out of his warehouse. We grabbed cables out of the trash can. From all that junk we built a cluster that we eventually rented out time on. All for no cost, save our time. The company loved it. All income for no capital outlay. Plus, they got new technology they could apply to their contracts. The recycling pusher loved it too and got our project written up in the company's international newsletter.
Since the original question comes from a university researcher, they should find out if their institution holds a formal certification as a non-profit. Then anyone donating equipment can write off the value and the university gets free stuff. Then apply your expertise to learn the "how to" and build your cluster. Another thought is that it may be possible to get an NSF grant to build an educational cluster. That would yield additional funds to purchase things to go with the donated equipment.
Re:Uhm AWS EC2 Cluster Compute by ebonum · 2011-09-15 00:59 · Score: 1

Would there be a problem if all the code is in C++ and uses a proprietary database?
Or will you have to re-write everything from scratch to match Amazon's API?
Re:Uhm AWS EC2 Cluster Compute by Spazmania · 2011-09-15 01:10 · Score: 1

Does your estimate include power consumption for the computers and the requisite cooling for that many computers?

--
Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
Re:Uhm AWS EC2 Cluster Compute by mlush · 2011-09-15 01:31 · Score: 1

never get any dough for support/electricity and usually don't have to pay it either,
Normally universities gouge a slice of every grant to cover office costs.
Re:Uhm AWS EC2 Cluster Compute by lee1 · 2011-09-15 01:41 · Score: 1

The summary: "implementing it on GPUs is an open research problem and not the topic of research." Not every problem is well suited to GPUs, which is what FASTRA uses.
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-15 02:35 · Score: 0

> The only downside is there are no GPUs.
EC2 does offer nodes with GPUs...
Re:Uhm AWS EC2 Cluster Compute by Guspaz · 2011-09-15 02:54 · Score: 1

EC2's processing performance is rather anemic compared to competing cloud hosts such as Linode, which tend to enormously outperform EC2 instances that cost several times more. EC2 does make performance guarantees about how much CPU power you have, though, which is nice. Just not necessarily cost-effective for most people.
Re:Uhm AWS EC2 Cluster Compute by nabsltd · 2011-09-15 03:27 · Score: 1

How about setup/installation time. Installing and configuring a whole bunch of machines takes a while.
About a week for the first one (to define and record the customizations to the default OS install), and then a few hours for each one after that shouldn't be an issue with a grant that is almost certainly for at least 6 months.
Re:Uhm AWS EC2 Cluster Compute by Amouth · 2011-09-15 05:30 · Score: 1

look at their GPU instances..

--
'...if only "Jumping to a Conclusion" was an event in the Olympics.'
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-15 05:52 · Score: 0

Because if you rent, it's good for that ONE SINGLE simulation run.
That's like "you only get ONE chance to test the nuke" or "you only get ONE launch for this experimental rocket", so I hope you're right.
Whereas if you buy your own systems, you can run as many as you want, to your heart's content. THAT'S the difference.
Even if you were able to perform multiple runs ---- do you know how much time it takes to actually transfer simulation data back and forth?
For my runs that take about 10 seconds, I generate 6 GB of data. You do the math.
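
Doing that math, as a rough sketch: the 6 GB and 10-second figures are from the comment, while the link speeds are hypothetical examples and protocol overhead is ignored.

```python
def transfer_seconds(gigabytes, megabits_per_second):
    """Seconds to move `gigabytes` (decimal GB) over a link of the given speed."""
    megabits = gigabytes * 8 * 1000  # 1 GB = 8000 megabits
    return megabits / megabits_per_second

RUN_OUTPUT_GB = 6   # from the comment: one run produces 6 GB
RUN_SECONDS = 10    # from the comment: one run takes about 10 seconds

for link_mbps in (10, 100, 1000):  # hypothetical link speeds, not from the thread
    secs = transfer_seconds(RUN_OUTPUT_GB, link_mbps)
    print(f"{link_mbps:>5} Mb/s: {secs:6.0f} s, {secs / RUN_SECONDS:.0f}x the run time")
```

Even on a gigabit link, moving the output takes several times longer than producing it, which is the point being made.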
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-15 07:44 · Score: 0

The issue here is this fully assumes the following:
1. you have any clue how to set up a server
2. you have the physical space for a machine
3. your time on physical administration of the machine is worthless
4. you actually have the cash up front, or couldn't use the up-front cash in a more beneficial way
5. you are great at installing and managing cluster software (lots of this is pre-built on EC2, see rocks+, cycle computing, EMR, etc)
6. the security of your space is absolute (what a lovely thing to buy and then get stolen!)
7. zero hardware failure rate
8. access to free internet bandwidth / connectivity / QOS / protection
The bottom line is most CS programmers just aren't that great with hardware.. they're very different skills. EC2 provides the ability for them to get access to the world's best setup w/o requiring the dual-specialization of IT/hardware + software development skills.
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-15 12:23 · Score: 0

Yes, EC2 is great for this type of work. There are also Rocks instances for EC2 nowadays, which lots of HPC folks use to manage physical clusters: http://aws.amazon.com/solutions/solution-providers/stackiq/
Re:Uhm AWS EC2 Cluster Compute by Glonoinha · 2011-09-15 14:47 · Score: 1

It's a VM running your favorite OS (SuSE Linux Server, Windows, whatever you want pretty much.). Log in with shell access (or RDesktop or however you like), upload your code as source and compile it on the target machine or just upload binaries and run them.

--
Glonoinha the MebiByte Slayer
Re:Uhm AWS EC2 Cluster Compute by The_Wilschon · 2011-09-15 15:23 · Score: 1

The real breakdown there is whether or not you are paying for datacenter space too. If you have to pay for power, HVAC, etc out of your budget, then yes, rent from the cloud. If you're like most small research groups, you're part of a larger university or industry department, and you're already paying building overhead costs, which include some sort of server room space. For that sort of budget, I'd be astonished if you needed more than about 10 tiles of server room space, and probably a lot less than that. This is extremely workable in the sort of environment I described.
About a year ago, my research group (I'm a grad student in high-energy physics at a major American university) received $30k in supplemental funding via the American Recovery and Reinvestment Act. I used the money to build our group a 39-node (156 core * hyperthreading) compute cluster. For that price, we could have rented 94 days of 156 small EC2 instances. If you try to factor in maintenance and development time, you can get a little bit longer of course, but I certainly spent less than a year's worth of work on it, so you'd less than double the EC2 rental length at grad student pay rates...
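
For anyone wanting to check the 94-day figure, here is a minimal sketch. The hourly rate is an assumption: $0.085 per small-instance hour is not stated in the comment, but it is the rate that reproduces the number. The function name is illustrative only.

```python
def rental_days(budget_usd, instances, rate_per_instance_hour):
    """Days of continuous rental that a budget buys for a fixed instance count."""
    cost_per_day = instances * 24 * rate_per_instance_hour
    return budget_usd / cost_per_day

# Assumed rate: 0.085 USD per small-instance hour; not stated in the comment.
days = rental_days(30_000, 156, 0.085)
print(f"{days:.1f} days")
```

This comes out at a little over 94 days of round-the-clock rental, before any storage or data transfer charges.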
More technical details: HEP is embarrassingly parallel tasks with IO needs ranging from very heavy to very light, and with fairly large data storage needs. I bought commodity desktop hardware (core i7 930, fairly cheap X58 chipset mobos, low-end name-brand memory, mid-range PSUs, cheap as near free chassis, 1.5TB WD HDD per node), and a 48-port Cisco gigabit switch (with our IO throughput needs for some jobs, and the OS install scheme, didn't want to skimp at all on the networking). We already had a general purpose server machine (file, NIS, electronic logbook, etc), which became the headnode.
Each node PXE boots with NFSROOT from the headnode, which reduces the maintenance dramatically. The disks in each node contain swap and data storage, and are tied together with Hadoop's HDFS or gfarm (admittedly, this part is still in flux, because extremely high inter-node network traffic causes nodes to hang spontaneously.... Still working on that aspect) which will, when/if it works, give us much greater data throughput by eliminating the bottleneck to the headnode for data.
If you'd like to chat more about building a small compute cluster on a shoestring budget according to this sort of model, please feel free to email me at jay ess double-u at eff enn ay ell dot gov.

--
SIGSEGV caught, terminating

wait... not that kind of sig.
Re:Uhm AWS EC2 Cluster Compute by Anonymous Coward · 2011-09-15 19:54 · Score: 0

Privacy, for one thing. Americans are notorious for spying on any computers in clouds that run on American companies' servers
and, as research goes, Americans like to spy on what others are researching and how they progress. Just look at American companies spying on EU bank transfers and credit card transactions. They don't even deny that all data is shipped to the CIA for processing, and THAT is a big help in winning contracts when you can just check how much the competition got paid for their products.
There is a lot of competition. Therefore, your own servers are a must if you want to be first in the field and get the patents, and not let Americans spy on everything you do through their cloud servers.
Re:Uhm AWS EC2 Cluster Compute by Siffy · 2011-09-15 21:33 · Score: 1

Then something is wrong in the configuration, either hardware or software. Virtualization by itself should not reduce performance by ~66%. Your hit should more likely be 5-10%. If you're taking a huge hit, it's most likely because you're sharing resources. Don't blame virtualization for that.
To be honest, £4000 isn't going to buy a lot of processing power. Does that amount also cover operational costs such as power? I'd ask about bandwidth, but with the scale possible with this budget, colocation of the servers doesn't make sense. Have you considered BOINC? Are you 100% certain OpenCL and GPGPU won't help? Atom, while cheap, even on a small budget is probably a bad solution. Remember that CPU always ends up being less than 100% of the cost of a node. Increasing the cost per node by 10-25% to have a node that's 400-800% faster makes perfect sense, and the fewer nodes she has to run, the cheaper your network will be. Unless Bulldozer brings incredible performance, Sandy Bridge based CPUs will provide the best bang for the buck if she's buying new. Clock per clock, they're the fastest available cheaply and their energy consumption is excellent. I suggest looking at i5-2300/2400/2500 or Xeon E3-1220. Depends if she wants ECC mostly. She may have enough budget for 6-10 nodes using these CPUs. Reduce the complexity of a node to Motherboard, CPU, RAM, and Power Supply. Go with quality PSUs, but remember there's no need to go overboard on wattage for machines that won't be running a GPU (I'm saying 250-300 watts is optimal if you can find quality in that size). Also, DDR3 is dirt cheap right now, so if there's a possibility 8GB will make a difference at some point over 2GB or 4GB during the life of the nodes, it makes sense to just start with that much.
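
The cost-versus-speed argument above can be put into numbers with a small sketch. It reads "400-800% faster" as 4x to 8x the baseline speed, which is an assumption about what was meant, and the function name is hypothetical.

```python
def throughput_per_cost(cost_factor, speed_factor):
    """Relative work done per unit of money, versus a baseline node at 1.0."""
    return speed_factor / cost_factor

# Reading "400-800% faster" as 4x to 8x the baseline speed (an assumption).
worst = throughput_per_cost(cost_factor=1.25, speed_factor=4.0)
best = throughput_per_cost(cost_factor=1.10, speed_factor=8.0)
print(f"{worst:.1f}x to {best:.1f}x more work per pound")
```

Even the least favourable pairing (25% more cost, 4x the speed) roughly triples the work done per pound spent on the node.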
PXE boot from a head node that contains all the storage... which btw, you (serviscope_minor) didn't mention how much raw storage she's going to need which will eat a good portion of a small budget. You also didn't mention how hard her problem is on a network. Is simple gigabit enough? The closest serviscope_minor came to describing the problem was to use the term "CPU bound" somewhat ambiguously.
Again, I would bring up BOINC. She would accept hardware donations right? How about just asking for them on a worldwide scale? If this is a non-profit venture (her degree doesn't count as a profit if this is a University project as many have assumed) and isn't intensively time-sensitive, you'd be surprised how many people will freely contribute processing power.
Re:Uhm AWS EC2 Cluster Compute by ResidentSourcerer · 2011-09-16 01:53 · Score: 1

Local is better unless the computation/data ratio is very large. If it isn't, you pay for moving data sets to and from your cloud, and you pay for storage time at the cloud.
Some notions for you:
1. I once spent a day analyzing benchmarks on Tom's Hardware, and found that front side bus speed was a bigger determinant of benchmark performance than CPU speed. Using the 2nd or 3rd fastest CPU in order to use a faster FSB was always a win.
2. Cache size is critical. If you can optimize your software to run in the processor's cache, your performance goes up dramatically. At one place I worked, one researcher was doing computational theoretical physics. The 200 MHz Pentium was getting 25% of the performance of the Stardent Titan 4 core machine that cost 40 times as much. But when he made the inner loop just a bit bigger (he had to -- now he was looking at rotating holes) the performance fell by a factor of 20. The Titan with its much larger cache barely slowed down.
3. Another place had a Beowulf cluster for doing protein folding. It consisted of 60 machines. During the day they were Windows boxes used as student labs. At 5 p.m. they rebooted as Linux boxes, and picked up the computation where they left off that morning. The cluster's name was werewulf because it only came out at night.
This is one way to defray the costs if the researcher is part of an educational institution.
4. If a machine is going to boot frequently or start code frequently, an SSD disk helps a lot.
5. Don't forget to calculate your cooling requirements. A large number of CPUs working hard generates a lot of heat.
6. Beware of debugging costs. We had a 64 core machine once (Myrias SPS-3) that during the day we broke up into 4 single core queues, 1 4 core queue, and two larger queues. Process dispatch was expensive on this machine. The rules for the researchers were that they had to get their software working correctly on a single core, then get the parallelism working on 4 cores, before they could run on the larger two blocks. Researchers had their wrists slapped if they ran a job on 32 cores and it aborted 30 seconds after starting. At night and on weekends we ran either a single 64 core queue or, if some researcher was working late, a 60 and a 4, or a 60 and 4 singles.

--
Third Career: Tree Farmer Second Career: Computer Geek First Career: Teacher, Outdoor Instructor, Photographer.

This by Anonymous Coward · 2011-09-14 16:48 · Score: 0

You have a limited budget, so it's more cost effective for you to lease time on someone else's equipment for now.

MB stacks by Max+Romantschuk · 2011-09-14 16:49 · Score: 1

I've seen quite a few projects where people have stacked motherboards with spacers, using booting over Ethernet and a single power supply for multiple MBs. Google should be of use here, I'm trying to get my offspring to school so I'm cheating and not providing any links...

But the idea is that skipping the case and other components makes things cheaper. Leaving the rig exposed without a case also eliminates the need for most cooling.

--
.: Max Romantschuk :: http://max.romantschuk.fi/

Re:MB stacks by Anonymous Coward · 2011-09-14 16:58 · Score: 0

I've seen quite a few projects where people have stacked motherboards with spacers, using booting over Ethernet and a single power supply for multiple MBs. Google should be of use here, I'm trying to get my offspring to school so I'm cheating and not providing any links...
But the idea is that skipping the case and other components makes things cheaper. Leaving the rig exposed without a case also eliminates the need for most cooling.
Infectedtech are doing something exactly like this: stacks of motherboards for multi multi cpu on the extremely cheap!
www.infectedtech.org maybe they could help?
Re:MB stacks by Required+Snark · 2011-09-14 17:05 · Score: 1

In what plane are the motherboards stacked? Are they in the horizontal or vertical plane? Vertical stacking allows the hot air to exit the top, while horizontal implies that external airflow must be provided to get the hot air out. Also, if you have multiple layers of vertical stacks then the top boards are getting the hot air from the lower boards.
A random suggestion: Have the motherboards all parallel at a 45 degree angle. This could provide passive heat-driven air flow. The cool air enters at the lower edge and exits at the higher edge, so one side of the stack is the cool side and the opposite side is the warm side. I would think that you want the CPU fan near the upper edge of the board.

--
Why is Snark Required?
Re:MB stacks by JorDan+Clock · 2011-09-14 17:10 · Score: 1

Does it matter which way they're stacked? If it's horizontal, you could just flip it on its side if heat becomes an issue...
Re:MB stacks by tempest69 · 2011-09-14 17:17 · Score: 1

I did this back in '03 (OK, I had discrete power, but a diskless boot). There is a project called Warewulf that was pretty decent. The PXE boot was a little odd with the hardware at hand, so make sure the MB supports that sort of thing should you go this route. If you have a small enough data requirement (or fast enough broadband), a web service might be the way to go. Uploading/downloading terabytes of data is a horrible thing over a low grade connection, and certainly isn't pretty over 100 Mbps lines. Good luck, you're gonna need it.
Re:MB stacks by hedwards · 2011-09-14 17:25 · Score: 1

You're better off doing it from the start rather than waiting for it to be a problem. One of the things I remember from college was that if you had a DEC Multia, you had best be standing it up on its side, as they would have some serious problems very quickly if you lay them on their side.
Best thing is to avoid the possibility and the headaches of reorienting however many motherboards after it becomes a problem. Chances are you'll know it's a problem because they're unstable and possibly damaged.
Re:MB stacks by mpetch · 2011-09-14 17:25 · Score: 1

Good idea, we could recommend an Apple Crate II which previously appeared on /.
Re:MB stacks by MimeticLie · 2011-09-15 00:00 · Score: 1

My college has a program doing just that. They've been working with educational institutions, but the directions and parts list are available for anyone who wants to create their own. I'm not sure if it would be the best performance you could get for the price, but it's pretty easy to set up and low-maintenance as far as clusters go.
Re:MB stacks by Just+Brew+It! · 2011-09-15 00:06 · Score: 1

If you don't mind going really ghetto, you can also mount motherboards in plastic storage crates using zip ties: http://techreport.com/forums/viewtopic.php?p=445461#p445461
The pictured systems were diskless; everything was network based -- PXE boot, with swap partition and home directory mounted via NFS. With more modern hardware (the linked forum thread is from 2005) you might have a harder time getting 2 to a crate since the CPU heatsinks may be too tall; but using heatsinks designed for 1U/2U rackmount would probably solve this.
Re:MB stacks by deadline · 2011-09-15 00:34 · Score: 1

You may be thinking of this (Limulus Project)

--
HPC for Primates. Read Cluster Monkey
Re:MB stacks by Bill_the_Engineer · 2011-09-15 00:51 · Score: 1

Nothing that a fan blowing across the motherboards can't fix. When you have multiple motherboards in a single enclosure you will need active cooling anyway.
I wouldn't recommend the stacked motherboard method anyway. You can make beige box 2U rack mount PCs with two 6-core Xeons and 24GB of memory for around $3000 US, and this includes 4TB of SATA drive storage in a hot swappable chassis. (Check out Newegg.) If you need more power then just build another chassis and add it to the rack. The trick is to start modest and build up. Not every research project needs a big cluster.

--
These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
Re:MB stacks by fast+turtle · 2011-09-15 03:06 · Score: 1

Thanks for that Link. Very interesting and gives me some ideas for my next system build

--
Mod me up/Mod me down: I wont frown as I've no crown
Re:MB stacks by Anonymous Coward · 2011-09-15 04:21 · Score: 0

I've seen quite a few projects where people have stacked motherboards with spacers, using booting over Ethernet and a single power supply for multiple MBs. Google should be of use here, I'm trying to get my offspring to school so I'm cheating and not providing any links...
But the idea is that skipping the case and other components makes things cheaper. Leaving the rig exposed without a case also eliminates the need for most cooling.
Infectedtech are doing something exactly like this: stacks of motherboards for multi multi cpu on the extremely cheap!
www.infectedtech.org maybe they could help?
The build log for the project can be found here http://www.diskusjon.no/index.php?showtopic=1270313

trade-off by TheSHAD0W · 2011-09-14 16:50 · Score: 4, Interesting

Actually, that's a good question... Assuming no time constraints, at what point does it make sense to buy hardware rather than use the cloud? Take that budget above (roughly US$6K) and the best hardware you can get for that price: How many months would you need to run it, flat out, to equal the number of floating-point ops EC2 would give you for that cost?
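
One way to frame the question is as a break-even calculation. The sketch below is hedged heavily: it compares cost only, assumes the rented capacity matches the purchased hardware, ignores power, cooling and admin time, and uses a purely hypothetical hourly cloud price. Only the US$6K budget comes from the post.

```python
HOURS_PER_MONTH = 730  # 8760 hours per year / 12

def break_even_months(hardware_cost, cloud_cost_per_hour, utilisation=1.0):
    """Months of use after which owning costs less than renting equivalent capacity.

    utilisation is the fraction of each month the machines are actually busy;
    rented capacity is only paid for while it runs.
    """
    monthly_cloud_cost = cloud_cost_per_hour * HOURS_PER_MONTH * utilisation
    return hardware_cost / monthly_cloud_cost

HARDWARE_USD = 6000        # the budget quoted above
CLOUD_USD_PER_HOUR = 4.00  # hypothetical price of renting equivalent capacity

for utilisation in (1.0, 0.5, 0.1):
    months = break_even_months(HARDWARE_USD, CLOUD_USD_PER_HOUR, utilisation)
    print(f"busy {utilisation:.0%} of the time: break-even after {months:.1f} months")
```

The shape matters more than the numbers: flat-out use reaches break-even quickly, while bursty use pushes it out in proportion to how idle the hardware would sit.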

Re:trade-off by subreality · 2011-09-14 17:12 · Score: 4, Insightful

Sometimes, never. Don't forget to add up power, cooling, sysadmin time... And that's before getting to intangibles like being able to spin up 400 cores for an hour and getting your result same-day instead of only owning 40 cores and having to wait until tomorrow.
Cloud computing really cleans up for batch computing jobs like this.
Re:trade-off by hairyfeet · 2011-09-14 17:59 · Score: 1

Question: How much bandwidth would that run? Because I've never had the chance to set up a cluster (most of my customers are SMBs and SOHOs) so I have no idea how much bandwidth you'd need to feed something like the Amazon cloud. If he is like many of us they probably have bandwidth limits and/or have to share that bandwidth with other users so if it takes a big ass pipe I could see that possibly being a problem.
That said, if the cloud was out of the question, I'd snatch up plenty of cheap AMD boards along with some cheap triples from Starmicro and simply mount the boards tray style in a simple homemade rack. Those triples are last gen but dirt cheap at $40 a pop, any cheap cooler will work on those with an open tray design, and geeks sells AMD boards that have lost their I/O shields for something like $20. Throw in some cheap DDR3 and some small SATA drives along with Linux and voila! Cheap cluster.
But I have to agree with you and everyone else that if they are that tight on money the cloud would give them the most bang for the buck, and cut out a lot of the hassles and upkeep as well.

--
ACs don't waste your time replying, your posts are never seen by me.
Re:trade-off by crutchy · 2011-09-14 19:13 · Score: 1

One application for cloud-based supercomputing is things like FEA and CFD, and for those the inputs and outputs are relatively small (compared to the number crunching in between). Autodesk has an online FEA service as part of their Inventor Pro package that seems like it will bring them business for large models/meshes. Sometimes the hardware cost isn't as much as the software cost. The OP mentioned Linux, which is free, but what good will that do on its own? Big number-crunching analysis packages like FEA usually aren't cheap, and the cost to develop your own software is always high.
Re:trade-off by rioki · 2011-09-14 20:14 · Score: 1

I am not so sure about CFD. We did 3D CFD with NaSt3DGPF at the German Federal Waterway Administration. The input data to the computation and the result sets were along the lines of multiple GB. Pulling the data out of the cluster over LAN was already trouble. The process was sped up substantially when input generation was done on the actual cluster. (Input came from a few input txt files and was compiled into the computation grid.) Just getting the data to and from AWS will be a major nightmare.
Re:trade-off by mrt_2394871 · 2011-09-14 20:55 · Score: 2

Question: How much bandwidth would that run? [...] If he is like many of us they probably have bandwidth limits and/or have to share that bandwidth with other users so if it takes a big ass pipe I could see that possibly being a problem.
If the group is set up at a University, odds are it's on JANET. Those are big pipes (our connection was 155 Mb/s several years ago).
Re:trade-off by pz · 2011-09-14 23:53 · Score: 3, Informative

Don't forget to add up power, cooling, sysadmin time...

If the friend's research group is in an academic institution, power and cooling are outside of the acquisition budget, along with space, network, etc., as those are typically part of overhead. Depending on the institution, sysadmin services are too. Often the institution will even have embarrassingly large discounts with hardware and software vendors (at my institution, a licensed copy of Matlab, for example, is about $100 per seat per year).
GBP 4000 buys a rackfull of modern computers that can be run as long as you want. It can be used to explore ideas without concern for cost. In contrast, once the GBP 4000 has been paid to a cloud service, the money is gone. Given that the pressures for a new researcher are already immense (and I speak from recent first-hand experience), not worrying about running out of compute resources is worth a lot, even if it means the instantaneously available compute power is somewhat lower than what you could get from a cloud service.
If this new research group is going to be competing for research funds, for example, then the compute resource is going to be highly utilized for the first 12-18 months to get preliminary results in order to write grants. I can't imagine that GBP 4000 is going to last long enough. Looking at Rackspace, as another poster suggested, they charge about USD 350 per decent configuration (8GB RAM / 320 GB disk) per month. That single server is going to last 18 months before the money is gone. If the memory demands of the computation aren't so large, then the charges are lower, say USD 45 per month (1GB RAM / 40 GB disk), then you get to use 7 virtual machines for the same 18 months.
Given that a highly capable system can be purchased new for USD 500, the same money gives the researcher about a dozen real machines for 18 months, and beyond (buying off-lease machines can easily double the amount of hardware). From my perspective as a researcher, there's no comparison: when money is tight, buy your own hardware and take advantage of the services provided by your institution.
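The arithmetic behind those figures can be checked with a small sketch. It assumes the exchange rate implied elsewhere in this thread (GBP 4000 = USD 6299.2); the function names are made up for illustration, and the prices are the ones quoted above, not current ones.

```python
# Rough runway for a fixed budget: rent by the month, or buy outright.
# BUDGET_USD uses the conversion quoted elsewhere in the thread (an assumption).
BUDGET_USD = 6299.2  # GBP 4000

def months_of_rental(budget_usd, monthly_rate_usd, machines=1):
    """Months the budget covers when renting `machines` servers."""
    return budget_usd / (monthly_rate_usd * machines)

def machines_bought(budget_usd, unit_price_usd):
    """Whole machines the budget buys outright."""
    return int(budget_usd // unit_price_usd)

print(round(months_of_rental(BUDGET_USD, 350), 1))             # one 8GB / 320GB server
print(round(months_of_rental(BUDGET_USD, 45, machines=7), 1))  # seven 1GB / 40GB VMs
print(machines_bought(BUDGET_USD, 500))                        # USD 500 machines, owned
```

With these inputs the single large server lasts about 18 months, seven small VMs last about 20 months, and the same money buys 12 machines that keep running afterwards.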

--

Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
Re:trade-off by Anonymous Coward · 2011-09-15 01:19 · Score: 0

or in the US on Internet2 with its 100Gb/sec network backbone. Even in the middle of a busy semester at Penn State, I'd regularly be able to download files from another .edu at 350MB/sec or higher.
Re:trade-off by Jawnn · 2011-09-15 01:26 · Score: 1

GBP 4000 buys a rackfull of modern computers that can be run as long as you want.
On what planet? Even with the most "embarrassing discounts" that number is not going to buy anywhere near that many "modern computers", unless you're trying to spend as little as possible per rack unit, which, of course, makes no sense. Sure, I could fill 1U with a 1U chassis, filled with cheap consumer grade components for a couple hundred bucks, maybe less if I was extremely well connected, but why? For a few hundred more I could plug that hole with many times that computational power but again, why, when I can rent that power for only as long as I need it? A rack full of underpowered pizza boxes is cheap, but not good for much but a long (slow) cluster operation, so the residual value is rather low. And a rack full of blazing multi-core hardware is simply not attainable for anywhere near your GBP 4000 mark.
Re:trade-off by DuckDodgers · 2011-09-15 01:31 · Score: 1

I think your logic is correct. If our hypothetical researcher had to provide his own facility, do his own system administration or hire someone else to do it, pay his own electrical bills including the cost of cooling, manage his own network, etc.... then cloud computing is economically far better. But since this person will be working in a context where he or she is only responsible for the hardware and software costs of the physical machines, buying machines is much cheaper.
Re:trade-off by Gallamine · 2011-09-15 01:53 · Score: 1

I run my Monte Carlo simulations of photon propagation through water on AWS. I'm a graduate student, so I'm pretty price sensitive, but with AWS I can "rent" an 8-core machine (roughly 20% faster than my i7 920 computer in the lab) with 7 GB of RAM for ~$0.25/hour. That's the "spot price" so it could fluctuate, but it's still *way* cheaper than the $0.89/hr fixed price. I have a machine instance that has all my tools (MATLAB, Dropbox) and I just click a few buttons and BAM I've got a machine I can remote to and work.

You need to consider the cost of storage, as that actually costs me more than the price of the machine. My bill last month was $130 for 677.640 GB-Mo of storage (forgot to shut down some EBS locations) and 200 hrs of the High-CPU Extra Large instance. I also transferred OUT 130.669 GB of data.
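To see why storage dominates that bill, here is a rough split. It assumes all 200 instance-hours were billed at about the $0.25 spot price mentioned above, which is only approximate since the spot price fluctuates; `split_bill` is a made-up helper, not anything from the AWS tooling.

```python
def split_bill(total_usd, instance_hours, hourly_rate_usd):
    """Split a monthly bill into compute and everything else (storage + transfer)."""
    compute = instance_hours * hourly_rate_usd
    other = total_usd - compute
    return compute, other

# Figures from the bill described above; the hourly rate is the assumed spot price.
compute, other = split_bill(total_usd=130.0, instance_hours=200, hourly_rate_usd=0.25)
print(f"compute ~ ${compute:.2f}, storage + transfer ~ ${other:.2f}")

# The same 200 hours at the fixed price quoted above, for comparison.
print(f"fixed-price compute ~ ${200 * 0.89:.2f}")
```

Under that assumption compute comes to about $50 and storage plus transfer to about $80, which matches the point that storage cost more than the machine.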

--
RobotBox - Robot projects from around the world
Re:trade-off by Anonymous Coward · 2011-09-15 02:17 · Score: 0

Cloud computing really cleans up for batch computing jobs like this.
No it doesn't ... in fact it is the exact opposite.
If you have predictable steady loads that you run all the time you are far better off buying your own hardware. If you have unpredictable spikes then go with Amazon.
Re:trade-off by Guspaz · 2011-09-15 03:24 · Score: 1

You can't fill a rack with them, but the most bang-for-buck, if we're ignoring GPUs (and Tesla) is probably going to be consumer hardware.
For just under $500 CAD before shipping/tax, you can build a respectable barebones desktop machine at newegg, with an i7 2600, 8 gigs of RAM, and a 500GB HDD. That's probably the fastest consumer CPU on the market, too (the previous-gen hex-cores might edge it out). For GBP 4000, I can build 12 of them, with a few hundred left over for a switch and some network cables.
Now, there are faster *enterprise* CPUs on the market, to be sure. Intel has some eight and ten core Westmere Xeons... but they cost so much, you might only be able to build one or two systems for GBP 4000, and the twelve consumer machines would destroy it in terms of pure number crunching power. The question becomes where the balance between performance and reliability will be.
Re:trade-off by subreality · 2011-09-15 06:31 · Score: 1

I did just say "batch computing jobs".
Re:trade-off by Anonymous Coward · 2011-09-15 07:53 · Score: 0

It's a bigger issue. I know of very few advanced CS majors who can do what you just described.. it's like asking a BMW piston materials engineer, who has degrees in metallurgy and physics to do an oil change on a 7-series.
Re:trade-off by TooMuchToDo · 2011-09-15 09:40 · Score: 1

Wrong. If you don't need the ability to burst computing-wise, cloud computing is always a worse deal (unless done internally).
Once you own your equipment, it's yours. You're constantly renting the equipment from Amazon, hour after hour. We did the math in our group that did data taking for the LHC's CMS detector at Fermilab. It just doesn't make sense after 3-6 months, unless you're IT-phobic and don't ever want to manage hardware, AND you have the money to burn.
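The kind of math involved can be sketched as a break-even calculation. The inputs below are illustrative only, borrowed from other comments in this thread (a USD 500 machine, a USD 0.25/hour instance); they are not the Fermilab numbers, the two machines are not claimed to be equivalent, and power, cooling and admin time are left out.

```python
def break_even_months(purchase_usd, rental_usd_per_hour,
                      hours_per_month=730.0, utilization=1.0):
    """Months of rental that cost as much as buying the machine outright.

    `utilization` is the fraction of each month the rented machine is running.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    monthly_rental = rental_usd_per_hour * hours_per_month * utilization
    return purchase_usd / monthly_rental

# Hypothetical inputs: the busier the machine, the sooner owning wins.
for utilization in (1.0, 0.5, 0.25):
    months = break_even_months(500, 0.25, utilization=utilization)
    print(f"utilization {utilization:.2f}: break-even after {months:.1f} months")
```

The point of the sketch is the shape of the curve: at steady full load the purchase pays for itself within a few months, while bursty use pushes the break-even out towards a year.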
Re:trade-off by cas2000 · 2011-09-15 09:47 · Score: 3, Informative

or use a 16GB or 32GB USB flash (or better yet, a small SSD - swapping to USB flash would suck) as the boot drive on most machines and have one machine (the head node) with hard disks as a file server - NFS will do for small to medium size clusters (anywhere from a handful of nodes to a few hundred nodes). The OP is going to need a head node anyway to run Slurm or Torque as the scheduler/resource-manager (yes, i have built clusters before).
put a 2nd NIC in the head node, so the compute nodes can run on a private 192.168 network (you'll need a 24 or 48 port switch as well), and also install DHCP, tftp, and apache. Set up the last three to allow the compute nodes to netboot clonezilla. install everything you'll need on one compute node (openmpi, libatlas, octave, R, open source and proprietary scientific software as needed, etc) and use clonezilla to mass produce the rest (also allows you to quickly and easily add new nodes or replace failed nodes). LDAP or NIS will be needed for sharing account/auth details between machines.
i built something quite similar to this last year (but using some sunfire 1RU opteron rackmount servers as the compute nodes)
I'd go for an x4 CPU, they're not that much more than an x3 and the extra core is useful. 8GB RAM too (2x4GB only costs about $40). given the budget, it's probably not worth getting a custom power supply for the tray-mounted motherboards, so each will need its own dedicated PSU
each node is going to cost somewhere around $250 (very rough estimates: $50 for the m/b, $40 for 8GB RAM, $50 CPU, $50 PSU, $60 for 32GB SSD - but possibly a fair bit cheaper as a bulk purchase), and the head node will cost roughly triple that (you'll need a case w/ hot-swap bays for the drives - a Norco 4224 is probably overkill but at well under $400 for 4RU with 24 SAS/SATA hot-swap bays, it would be hard to find a significantly cheaper case even with fewer drive bays). so for $6K you can build a cluster with 20 x 4 core compute nodes plus a good head node for the scheduler & file server. 80 compute cores for $6K. that's good, even considering that with cheap crap motherboards you'll have a noticeable failure rate. the cluster i built last year with name brand hardware cost closer to $50K. I could build a better system today (far fewer nodes with a lot more cores and RAM each), also with name brand hardware, for about $20K - $30K
trays for the motherboards, the rack(s), and cooling will cost extra. as will licenses for any proprietary software they might need to run (could easily cost as much - or more! - as the hardware). if the OP's friend is at a university, she can probably scavenge an old rack or two from another dept, but even if she has to buy one new she could easily build 15+ compute nodes entirely within the $6K budget
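The node-count estimate above can be reproduced with a short sketch. The part prices are the very rough ones from the comment; the `extras_usd` reserve (switch, cables) is an assumption added here to show how the 20-node figure falls out.

```python
# Per-node parts estimate in USD, from the rough figures above.
NODE_PARTS = {"motherboard": 50, "ram_8gb": 40, "cpu": 50, "psu": 50, "ssd_32gb": 60}
CORES_PER_NODE = 4

def cluster_plan(budget_usd, head_node_multiplier=3, extras_usd=0):
    """Compute nodes affordable after the head node and any reserved extras."""
    node_cost = sum(NODE_PARTS.values())
    head_cost = head_node_multiplier * node_cost  # "roughly triple"
    nodes = (budget_usd - head_cost - extras_usd) // node_cost
    return {
        "node_cost": node_cost,
        "head_cost": head_cost,
        "compute_nodes": nodes,
        "compute_cores": nodes * CORES_PER_NODE,
    }

print(cluster_plan(6000))                  # every dollar spent on nodes
print(cluster_plan(6000, extras_usd=250))  # hypothetical reserve for switch and cables
```

With nothing held back the budget stretches to 21 nodes; reserving about $250 for networking gives the 20 nodes and 80 cores quoted above.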
Re:trade-off by cas2000 · 2011-09-15 10:14 · Score: 1
OTOH, you can buy a nice *off-the-shelf* opteron based workstation (e.g. supermicro motherboard) with four 12 core CPUs and 128GB RAM in a 4RU case with 6 or more hot swap bays for about $5-6K. Not as many cores and less RAM but:
- it's all in one machine - so better for big memory hungry jobs
- there's a lot of stuff that works better with a single multicore SMP machine than with MPI
- it's a lot less stuffing around than building a DIY cluster
- it will have a real warranty
- takes only 4 or 5RU.
- there's room for some GPUs in that case later too.
- far less sysadmin hassle
Re:trade-off by The_Wilschon · 2011-09-15 15:33 · Score: 1

It's a lot cheaper on the small scale to build towers and put them on a meatrack than to get rackmount chassis et al. A *lot* cheaper.

--
SIGSEGV caught, terminating

wait... not that kind of sig.
Re:trade-off by Siffy · 2011-09-15 21:57 · Score: 1

buy your own hardware and take advantage of the services provided by your institution.
Your points are great. Something I didn't think to mention in a long post a few minutes ago. IF this is a university project and since the budget is so small, the grad student (I'm assuming) could look into building and sharing a cluster with another similarly sized research group of other grad students.
Re:trade-off by hairyfeet · 2011-09-15 23:26 · Score: 1

I'd agree on everything but the head server case. At the last shop where I worked for someone else, we needed a server with a crapload of drives. This was in the days of Win9X, where every damned thing had to have a special driver and most drives topped off at 40GB, so we just spot welded 3 ATX cases together. When we were done we had a skeleton case that held 20 drives, was beyond simple to keep cool, and made swapping drives a breeze. It was damned nice to have nearly 500GB of drivers on the network ready to go. You just mount a good board with one of those cheap quads or Opterons from Starmicro and you'd be good to go!
When you are trying to squeeze every drop of bang for the buck it is the little things like that that can mean the difference in having a couple more quad nodes for your cluster or not. Sure it isn't pretty to look at, but who cares? it works great and you can cool the whole smash with a $10 Walmart box fan like we did at the shop.

--
ACs don't waste your time replying, your posts are never seen by me.
Re:trade-off by Anonymous Coward · 2011-09-16 03:26 · Score: 0

I agree. Working in an academic institution, it is ALWAYS cheaper for research groups to get our own hardware versus running on the cloud. Sysadmin is free (or dirt-cheap). Most institutions have fiber optics direct to internet backbones. Gigabit links (or in my case 10 gigabit) are the default. There are academic alliances that research groups can join which give access to unlimited storage (with backup) in return for letting other people in the research groups use their computing hardware off hours.

Well, Infiniband is out by Anonymous Coward · 2011-09-14 16:50 · Score: 0

Infiniband is out in that budget. But you could see how far you could get buying some cheap quad cores and interconnecting them with GbE. You can take a look at TomsHardware cpu charts (e.g. for 3dsmax rendering since this is a similar task: http://www.tomshardware.com/charts/desktop-cpu-charts-2010/Image-Rendering-3DS-Max-2010,2420.html) and get the most bang for your buck.

Amazon EC2 by Anonymous Coward · 2011-09-14 16:55 · Score: 0

why not try it?

Has she investigated existing clusters? by Goonie · 2011-09-14 16:55 · Score: 4, Informative

Many universities/consortia have supercomputers available on which researchers can apply for (or buy) time. For example, my university is a member of VPAC, which has a big-arse cluster shared between a number of institutions. She might get much better bang for buck if she uses the money for that, rather than splashing out for dedicated hardware.

--

Any sufficiently advanced technology is indistinguishable from a rigged demo
--Andy Finkel (J. Klass?)

Re:Has she investigated existing clusters? by RuBLed · 2011-09-14 17:57 · Score: 1

Also this guy has one and he's not sure what to do with it. They could also have brunch or something.

http://linux.slashdot.org/story/11/09/13/2111210/Ask-Slashdot-Best-Use-For-a-New-Supercomputing-Cluster
Re:Has she investigated existing clusters? by Anonymous Coward · 2011-09-14 19:20 · Score: 0

I was gonna suggest she spend the 4k on professional thieves to lift his shipment when it comes in, fence a few boxes to cover GbE, and build it herself.
But brunch is nice, too.
Re:Has she investigated existing clusters? by Anonymous Coward · 2011-09-14 19:42 · Score: 0

What is this? Slashlist? Craigsdot? Anyone?
*chirp*chirp*
Re:Has she investigated existing clusters? by serviscope_minor · 2011-09-14 20:48 · Score: 1

Thanks for the suggestion about university facilities. This university is not great in this regard. Also, the cluster is used to process experimental data, and can do either one large dataset or many tiny ones within 6 to 12 hours. The advantage of having a personal cluster is that the latency is low: you can tell by tomorrow if some of today's experiment worked, which is necessary to have a decent turnaround on the experiments.

--
SJW n. One who posts facts.
Re:Has she investigated existing clusters? by bakuun · 2011-09-14 21:43 · Score: 2

A followup on this: since you gave the budget in £, I assume that you're working in the UK. The national UK computational grid service is called... "national grid service". You can find it here: http://ngs.ac.uk/ . It is a little complicated to get access (you need to visit somebody in order to prove that you are you, so you can get a login certificate), but once you have an account it's easy enough to use. You need to submit requests for CPU time though, and while I never had any problem getting mine through, you may want to quickly check in first to see whether they would expect any problems with granting the kind of CPU time you will need.
Re:Has she investigated existing clusters? by Tim+C · 2011-09-14 22:41 · Score: 1

That guy is in the US, while quoting a budget in GBP clearly indicates that the submitter's friend is in the UK; brunch seems unlikely.

--
It's official. Most of you are morons.
Re:Has she investigated existing clusters? by JustinOpinion · 2011-09-15 04:24 · Score: 1

The cluster you use doesn't have to be in the University the research group is housed in. Many clusters are available to researchers worldwide; you just upload your data/code to a processing queue and it gets run. You can remote-login to monitor the status, restart jobs, etc. You have quite a bit of control.

In fact, if the research in question is "high quality" and not proprietary then you can get access to various clusters for FREE. It's hard to beat free in terms of bang/buck. For instance, the US Department of Energy runs various computer clusters within "user facilities" (other funding agencies in US, Europe and elsewhere have similar programs). What this means is that you submit a proposal/request where you describe the research you're doing and what kind of resource you need (in this case, routine access to a computer cluster). If the proposal is highly rated (externally peer reviewed) then you're allocated access for free. In addition to getting access to the cluster itself, you get "access" to the experts who run the cluster--their expertise in optimizing and parallelizing code is extremely valuable.

I understand that having immediate access to computing power is useful. But if you're on a shoestring budget then something's gotta give, and using pre-existing clusters is a very efficient way to spend money. In the case of user clusters, if you can get free access then you can use a mixture of a smallish in-lab cluster and occasional access to the large-scale cluster. This is so easy to do (and did I mention free?), there's almost no reason not to try. (Yes, the DoE accepts proposals internationally, so there's no problem there.)

Disclosure: I work for the DoE, so I guess I'm biased. Here are some links that might help:
http://www.bnl.gov/cfn/facilities/Theory_and_Computation.asp
http://computing.ornl.gov/
http://www.alcf.anl.gov/

Yahoo M45? by Anonymous Coward · 2011-09-14 16:57 · Score: 0

Is this program still around? http://research.yahoo.com/node/1884

Re:Yahoo M45? by allenw · 2011-09-15 02:22 · Score: 1

Yes, but I don't think Yahoo! is adding any more universities to it. In fact, I don't think it ever expanded beyond CMU.
ObDisclosure: I was on the ops-side of that project for Yahoo!.

Whitebox 1U rackmounts by mcrbids · 2011-09-14 16:57 · Score: 0

You don't specify whether or not your friend would be working out of a colo. If so, space will be at a premium.

My needs are high reliability, low cost, and high density. (colo)

I've been providing an excellent bang/buck ratio using whitebox 1U rackmounts made by SuperMicro. For about $1,000 I can get a late model CPU with a decent amount of quality ECC RAM, dual Gbit Ethernet ports, SCSI / SATA3 interfaces with a chipset highly compatible with CentOS Linux. (my distro of choice)

This is server-grade equipment, optimized for I/O throughput and reliability over raw processing power. You may be looking for raw computational power with a higher tolerance of downtime, in which case you'd want to try something else.

--
I have no problem with your religion until you decide it's reason to deprive others of the truth.

Re:Whitebox 1U rackmounts by serviscope_minor · 2011-09-14 20:56 · Score: 1

The university has a few server rooms scattered around of various qualities, though £4k's worth of kit could probably be scattered around a bit if necessary. The department in question does not have much history of heavy computational demands.
Though it's interesting what you mention about a colo. I had a look around at colos on the open market and they're very expensive compared to the budget. Oddly, the density seemed lower than expected. These days, modern high density servers can easily reach 1kW/U, but the colos I looked at were generally charging assuming a few hundred W/U.
Pretty much anything I could find would eat half the budget or more on colo costs. Any suggestions there?
In terms of reliability, jobs are isolated and short and can easily be rerun, so it is not worth going for high reliability over spare parts. Also, the task is very heavily CPU bound.
I looked at the high density rack stuff (having relatively recently purchased some supermicro quad 6100's) which were placed in a proper facility. They're very nice and in large quantities, proper kit does certainly save enough on sysadmin time, space and electricity to be worth it.

--
SJW n. One who posts facts.
Re:Whitebox 1U rackmounts by DrgnDancer · 2011-09-15 00:26 · Score: 1

Biggest problem you're going to have, and the reason I think some of the people suggesting renting outside resources rather than purchasing may be on target, is storage (both physical and logical). You can get machines with 4-8 cores and 4-8 GB of memory for around $1000 each in a rack mount case, so around 6 boxes with your budget (since you don't have VAT on .edu gear I'm assuming around 1.5 dollars to the pound). That doesn't get you a rack to mount them in and doesn't get you any storage beyond the hard drives in the boxes. Realistically, something approaching a quarter to a half of your budget will go to incidentals. You'll need at least one, preferably two ten port Gig-e network switches, some kind of low end shared storage, at least a half height rack, cables, a multiport KVM, the monitor/KB/mouse (ideally rack mount, but you could save some dough by getting regular ones or using spare hardware)... Individually none of those things is horribly expensive, but together I'd guess 1-2K pounds.
If you've got all that stuff, then your goals are a lot more reasonable... if you don't, you're taking a shoe string budget and making it one of those shoe strings that has been pulled through the metal trivet on your boots a few too many times. If you absolutely must have dedicated hardware, one way to save on storage might be something like gfs. It's a shared cluster file system that allows you to create one file system from disks on multiple computers. In theory this could allow you to use the spare space on each node to create a large shared storage... in practice I'm not sure how well it scales when the "servers" are also the "clients". I've only ever used it with separate server boxes.
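To put numbers on how much the incidentals bite, here is a minimal sketch using the figures above: a 4000 pound budget, about 1.5 dollars to the pound, and roughly $1000 per rack-mount box. The function name is made up for illustration.

```python
def boxes_after_incidentals(budget_gbp, incidentals_gbp,
                            usd_per_gbp=1.5, box_usd=1000):
    """Whole rack-mount boxes affordable once incidentals are paid for."""
    remaining_usd = (budget_gbp - incidentals_gbp) * usd_per_gbp
    return int(remaining_usd // box_usd)

# No incidentals, then the low and high ends of the 1-2K pound guess.
for incidentals in (0, 1000, 2000):
    print(incidentals, boxes_after_incidentals(4000, incidentals))
```

That works out to 6 boxes with no incidentals, 4 if the extras cost 1K pounds, and 3 if they cost 2K.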

--
I don't need a million points of light, just two points of multi-mode fiber and a 10 Gig-E router.
Re:Whitebox 1U rackmounts by Gyorg_Lavode · 2011-09-15 02:52 · Score: 1

Find some place that collects e-waste in a business-heavy area (or military area). You can usually find racks for cheap.

Since there have been a lot of rack mount suggestions, I might throw in my personal experience. I bought a half-height portable rack for $500 a few years ago. I put it in a closet with a free-standing portable AC unit ($400) and vented it into the dryer vent. (If you lack dryer vents, a closet with a window will do fine.) I'd recommend talking to building maintenance to make sure the rack and the AC unit can power off the same circuit, but otherwise you should be fine.

I also created a small lab (about 25 laptop hosts and 12 pieces of network equipment). It ran free-standing in a room with no modifications for cooling/electricity. At the size you're talking, I wouldn't worry about heat/power.

--
I do security

Benchmark then go by price-performance by Anonymous Coward · 2011-09-14 16:59 · Score: 0

First thing is to run some benchmarks to find out which architecture is best. Next figure out how much memory per core and buy the systems with highest $ per performance. We recently went through this for our highly parallel workload and found that the Intel Nehalem processors were faster by a factor of 2 or so than Opteron 6100 CPUs. However, the Opterons were cheaper, so we ended up getting boxes with 4 Opteron 6100 CPUs that have 12 cores each. We need about 1 GB per core, so ended up with 32 GB of memory.
We have an existing cluster with disk servers, etc., so went with 1U boxes since we are somewhat space-constrained. Our communication needs are modest so we are simply using gigabit Ethernet, though we are experimenting with channel bonding.
We have found that it is worth buying name-brand systems, though not from top-tier manufacturers. Our systems cost about $7k apiece as I recall. For us one of the benefits of buying more expensive systems was that they are considered equipment, so we can use funds originally budgeted for facilities (overhead) expenses in our grant to pay for the computers. Without this the calculation might have been different. In the past the sweet spot for price/performance has been 1U boxes with 2 CPUs.
We run CentOS on our cluster and like the stability combined with security fixes. We buy our systems with 3-year warranties and do not pay for service contracts. By the end of 3 years we have usually added some new systems, and if the old ones die it isn't a big deal. The whole cluster thing works very well for us since we can add computing power in relatively small increments as funds are available.
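The benchmark-then-rank step can be sketched as follows. The 48-core box at about $7k and the factor of 2 come from the description above (assuming the factor is per core); the Nehalem box's core count and price are hypothetical placeholders, and you would substitute your own benchmark results and quotes.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    cores: int
    relative_core_speed: float  # from your own benchmark, slowest core = 1.0
    price_usd: float

    @property
    def throughput_per_dollar(self):
        # Assumes the workload scales across all cores (highly parallel).
        return self.cores * self.relative_core_speed / self.price_usd

def rank(candidates):
    """Best performance per dollar first."""
    return sorted(candidates, key=lambda c: c.throughput_per_dollar, reverse=True)

boxes = [
    # Hypothetical placeholder: core count and price are NOT from the comment.
    Candidate("2-socket Nehalem box (hypothetical)", 8, 2.0, 4000),
    Candidate("4-socket Opteron 6100 box", 48, 1.0, 7000),
]
for box in rank(boxes):
    print(f"{box.name}: {1000 * box.throughput_per_dollar:.2f} core-units per $1000")
```

With these placeholder numbers the slower-per-core Opteron box still wins on throughput per dollar, which is the trade-off described above; different prices could flip the result.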

Re:Benchmark then go by price-performance by Anonymous Coward · 2011-09-14 19:16 · Score: 0

... buy the systems with highest $ per performance.
That sounds expensive. Wouldn't it be smarter to go for the best performance per $?

Imagine by Tablizer · 2011-09-14 16:59 · Score: 1

Imagine a cluster of cheapness!

--
Table-ized A.I.

Re:Imagine by secondhand_Buddah · 2011-09-15 00:06 · Score: 1

I tried, and all I got was images of Walmart....

--
Participatory Governance : The only feasible option for a real democracy, where everyone really does have a say.

FPGA by Anonymous Coward · 2011-09-14 16:59 · Score: 0

Check out an FPGA-based solution such as NI's FlexRIO. Moving computation over to hardware makes things much much faster.

starcluster may be the answer by Anonymous Coward · 2011-09-14 16:59 · Score: 0

It runs sun grid engine with a NFS master on EC2.
http://web.mit.edu/stardev/cluster/

Use the Existing Grid by Roger+W+Moore · 2011-09-14 17:00 · Score: 3, Insightful

Why buy your own when you can use existing GRID infrastructure? For 4k you can't do much more than get a few decent desktops for yourself and a few grad students and/or postdocs. Rather than blow it on a massively underpowered cluster, use the grid. I know the UK has massive clusters available to researchers, so find out how to get an account and resources on them and use those. For test jobs, interactive analysis and other low latency tasks use your desktop.

Re:Use the Existing Grid by mapsjanhere · 2011-09-15 03:11 · Score: 1

I guess a lot are missing out on the "just started a research group" part. Beginning academic researchers (what would be tenure track assistant prof in the US) plan for a 5 year run of research, not cranking out results in 90 days of high power cloud computing. Infrastructure is provided, labor is free or fixed, and typically data needs to be looked at, algorithms rewritten etc., which prevents optimal use of the high power capacity anyway. Plus 5 years from now the cluster will probably become desktops for grad students, and 10 years from now part of the beginners lab.

--
I'm aging rapidly, I bought a new game and had no idea if my machine was good for it.
Re:Use the Existing Grid by Roger+W+Moore · 2011-09-18 06:37 · Score: 1

I guess a lot are missing out on the "just started a research group" part.
No, not really, I've been there and done that. Rather than waste time and effort putting together a tiny cluster (remember the budget is £4k!) you will be better off using a large cluster. Develop/test your algorithms on your desktop as much as possible and then submit to the large clusters. Yes there might be latency in getting the results but not as much latency as there will be running on a tiny cluster. In addition you can submit multiple jobs with different data and/or parameters to speed things up that way.

Plus 5 years from now the cluster will probably become desktops for grad students
Not really - you would need to buy monitors, and rack mount cases are pretty hard to accommodate in an office. Certainly that did not happen with the cluster I had when I was tenure track (which was before these large grid clusters were available!).

buy one Opteron 6100-based box by Chalex · 2011-09-14 17:04 · Score: 5, Informative

You can get a SuperMicro reseller to sell you one workstation with 4 sockets of CPUs and a bunch of RAM. UK£ 4000 = 6 299.2 U.S. dollars

That buys you a box with 4 x Opteron 6134 (32 cores) and 128GB RAM (32 x 4GB sticks). And some hard disks.

Re:buy one Opteron 6100-based box by Anonymous Coward · 2011-09-14 17:42 · Score: 2, Informative

parent is correct.
or for some more get the 4x6168 (48 cores, about 770$ each so ~3000; the MB is around 800$; a supermicro cabinet is 800$ OR use a chenbro fileserver case for 300$). 8GB sticks are pretty cheap these days and almost proportional in price to the 4GB ones - get the KVR1333D3Q8R9S/8G for about 90$, or about 90*8 = 720$ for 64GB. Around 5000 in total, though there are some additional costs like coolers etc. but it still won't break the budget.
i got a similar config for doing something that is cpu intensive (branches, fp and mem intensive) and not easily portable to gpus.
the resultant machine hardly cracks 600w, and is 4 times as fast as an i7 960 on my application.
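For anyone checking the total, the parts list above adds up as follows. This is just the arithmetic on the quoted prices, with the two chassis options kept separate; coolers and other extras are left out, as in the comment.

```python
# (quantity, unit price in USD) for the parts quoted above.
PARTS_USD = {
    "cpu_opteron_6168": (4, 770),
    "motherboard": (1, 800),
    "ram_8gb_stick": (8, 90),
}
CHASSIS_USD = {"supermicro_cabinet": 800, "chenbro_fileserver_case": 300}

def build_total(chassis):
    """Total for the parts list plus the chosen chassis."""
    parts = sum(qty * price for qty, price in PARTS_USD.values())
    return parts + CHASSIS_USD[chassis]

for chassis in CHASSIS_USD:
    print(chassis, build_total(chassis))
```

That gives $5400 with the supermicro cabinet and $4900 with the chenbro case, so "around 5000" holds and either option stays under the roughly $6300 budget.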
Re:buy one Opteron 6100-based box by toruonu · 2011-09-14 20:33 · Score: 4, Informative

Yes, that would be my recommendation as well. We do loads of LHC data analysis and simulations and have found that for real science real cores outweigh hyperthreaded ones, so we run Opteron 6172 x2 in a Supermicro chassis that fits 4 servers into 2RU. The cost of such a box is of course ca 11keur, but it gives 96 cores and 192GB RAM. For about half that money, which is what she has, she can get about half of that, so 48 cores and 96GB RAM should be doable using SM boxes, and you can scale up/down with CPU frequency to adjust the cost and maybe adjust total RAM alongside to fit in the budget. If she plans to expand later she may actually want to spend the money on the 2U chassis with only 2 of the 4 machines present and later add one or two more by just buying the board with CPU/RAM.
Re:buy one Opteron 6100-based box by serviscope_minor · 2011-09-14 21:00 · Score: 1

You can get a SuperMicro reseller to sell you one workstation with 4 sockets of CPUs and a bunch of RAM. UK£ 4000
Can you provide a link? Most places will sell that in 1U form. A workstation formfactor could be very useful, depending on where it ends up.

--
SJW n. One who posts facts.
Re:buy one Opteron 6100-based box by AstroMatt · 2011-09-15 01:09 · Score: 1

I've got sitting on my desk a few quotes from this week along these lines: 32 cores, 32 GB ram 2 TB Raid 1, 23" monitor. --> $8k Matt A. Wood, Professor FIT Physics & Space Sciences
Re:buy one Opteron 6100-based box by Guspaz · 2011-09-15 03:32 · Score: 1

If you're going to compare it to a consumer processor, I'd point out that a $300 Sandy Bridge chip is 20-30% faster than an i7 960, and if you're saying that a 6168 is the same speed individually as an i7 960, the Sandy Bridge chip costs less than half as much to boot.
Of course, there aren't any multi-processor Sandy Bridge Xeons on the market yet, so you can't put four of them in one machine.
Re:buy one Opteron 6100-based box by Anonymous Coward · 2011-09-15 04:16 · Score: 0

I bought a few dual-socket opteron GPU servers last year and we are very satisfied with performance. Each has 2 x 6172 (24 2.1 GHz cores) and 64GB ram and cost the equivalent of about £4000 without GPUs. Using cloud services would have cost more than that so far and those servers will still be good a few years from now. It is also a lot easier in most cases to run parallel code on a single multi-core server than on a cluster.
Re:buy one Opteron 6100-based box by Siffy · 2011-09-15 23:17 · Score: 1

European Branch

Super Micro Computer, B.V.
Het Sterrenbeeld 28, 5215 ML,
's-Hertogenbosch, The Netherlands
Tel: +31-73-640-0390
Fax: +31-73-641-6525
General Info: Sales@Supermicro.nl
Tech Support: Support@Supermicro.nl
Supermicro uses Xeons in all their current workstations. You'd have to ask for a custom job to get Opterons in a pedestal build. Nothing says you can't just run http://www.supermicro.com/Aplus/system/2U/2042/AS-2042G-6RF.cfm on its side, except it's going to sound awful in an office. If you're curious how one of those could/would spec out filling all bays and memory slots (without converting currencies),

1 x SUPERMICRO AS-2042G-6RF 2U Rackmount Server Barebone Quad Socket G34 AMD SR5690/SR5670 DDR3 1333/1066/800 Item #: N82E16816101321 $1,899.99
6 x Western Digital RE4 WD5003ABYX 500GB 7200 RPM SATA 3.0Gb/s 3.5" Internal Hard Drive -Bare Drive Item #: N82E16822136697 $449.94 ($74.99 each)
32 x Kingston 8GB 240-Pin DDR3 SDRAM DDR3 1333 ECC Registered Server Memory Model KVR1333D3D4R9S/8G Item #: N82E16820139140 $3,103.68 ($96.99 each)
4 x AMD Opteron 6128 Magny-Cours 2.0GHz Socket G34 115W 8-Core Server Processor OS6128WKT8EGOWOF Item #: N82E16819105266 $999.96 ($249.99 each)
Subtotal: $6,453.57 which is about the total budget.
The biggest problem with Opteron 6100's is the next faster proc costs 2x as much. I'm not suggesting this exact config, just an example. And YMWV with exchange rates and the hike to costs when importing stuff to your lil island. The UK gets hosed on hardware prices.
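For anyone re-pricing a build like this, the subtotal above can be reproduced with a short sketch. The quantities and unit prices are the ones listed in the post; the variable names and shortened descriptions are made up for illustration, and Decimal is used only to avoid floating-point rounding on currency.

```python
from decimal import Decimal

# (description, quantity, unit price in USD) as listed in the post above
parts = [
    ("Supermicro AS-2042G-6RF barebone", 1, Decimal("1899.99")),
    ("WD RE4 500GB drive", 6, Decimal("74.99")),
    ("Kingston 8GB ECC registered DIMM", 32, Decimal("96.99")),
    ("Opteron 6128 8-core CPU", 4, Decimal("249.99")),
]

def subtotal(items):
    """Sum quantity * unit price over all line items."""
    return sum((qty * price for _, qty, price in items), Decimal("0"))

total = subtotal(parts)
total_cores = 4 * 8    # four 8-core CPUs
total_ram_gb = 32 * 8  # thirty-two 8GB sticks
print(f"Subtotal: ${total} ({total_cores} cores, {total_ram_gb}GB RAM)")
```

Swapping in local prices per line item gives a quick way to see which component dominates the cost (here it is the RAM).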

Caseless Example: Ikea Cabinet Cluster by cmholm · 2011-09-14 17:08 · Score: 1

A near-example of what Max is talking about can be found at the Home Linux Render Cluster. The builder threw six dual cpu motherboards into a small, gutted filing cabinet and Gig-e. Cheap, expandable.

However, if your friend hasn't got a very good idea how much mmmph she needs, the AWS EC2 rental idea has merit.

--
Luke, help me take this mask off ... Just for once, let me butterfly kiss you with my own eyes.

Consider other options by msobkow · 2011-09-14 17:09 · Score: 1

You can't really get a cluster for that kind of money. You can barely get one decent box.

But you should be able to rent a lot of computer time in the cloud for that kind of money, or use it to buy time on someone else's cluster.

--
I do not fail; I succeed at finding out what does not work.

Re:Consider other options by allenw · 2011-09-15 02:33 · Score: 1

This is absolutely correct and if I had mod points, I'd spend them here.
If your budget is only £4000, you don't have the funding to build a real, actual grid for something that is CPU bound. If you are lucky, you have enough to get one or two boxes and some network gear to put on the top of someone's desk.... at least if you are doing AMD or Intel higher end procs.
Here are two ideas worth exploring...
1) Look at boxes like SeaMicro and other Atom-based mini-grids-in-a-box.
2) Look at building your own with Atom- and Arm- based machines
Re:Consider other options by citizenr · 2011-09-15 03:48 · Score: 1

CPU bound ........... Atom- and Arm- based machines
lol

--
Who logs in to gdm? Not I, said the duck.
Re:Consider other options by allenw · 2011-09-16 03:11 · Score: 1

Yup, I realize that going to Atom or ARM for a CPU bound process is suicide, but so is only using the tiny amount of money to try and solve the problem. :)

EC2 is expensive by Anonymous Coward · 2011-09-14 17:10 · Score: 0

EC2 is really expensive, brah.

Re:EC2 is expensive by Gaygirlie · 2011-09-14 18:38 · Score: 1

EC2 is really expensive, brah.
Once you count all the costs of running your own cluster, ie. electricity, cooling, man-hours spent on configuring, installing and maintaining them, repairing broken parts etc. suddenly EC2 is likely cheaper than your own cluster, not to mention you can scale up on-demand if your requirements suddenly require such.
Re:EC2 is expensive by Cwix · 2011-09-14 21:17 · Score: 2

Hmm I wonder if the College has a datacenter already. You know, so ya don't have to worry about the overhead.
Ohh and man hours to configure the thing? You know I'm sure there are a load of students there working for free.
This is NOT some company trying to spin up a new service. This is a school doing a project. That means shoe string budget, and the people get paid with good grades in their class.
4000 pounds won't get you very far at all on Amazon.
I love it when people take the question given and then insist that the problem can only be solved by changing the specifications to using "cloud" computing or some other nonsense. It makes you guys look like paid shills.

--
You are entitled to your own opinions, not your own facts.
Re:EC2 is expensive by Gaygirlie · 2011-09-14 22:27 · Score: 0

4000 pounds won't get you very far at all on Amazon.
Let's say the researchers in question took 4x "High-CPU Extra Large Instance" from EC2. Each one of them sports the following:
7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
That's helluva lot of computing power right there, at $0.76 an hour. £4000 translates to about $6 316.8, which in terms of useable hours at 4x$0.76 would be roughly 2077 hours. 2077 hours is roughly 86 days of 24/7 computing. Thus it sounds like a good deal to me.
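The arithmetic above can be checked with a minimal sketch. The budget, hourly rate and instance count are the figures from this post, not current AWS pricing, and the variable names are illustrative.

```python
# Figures taken from the post above; nothing here is current AWS pricing.
budget_usd = 6316.8            # the post's conversion of 4000 GBP
rate_per_instance_hour = 0.76  # USD, as quoted in the post
instances = 4

# Total wall-clock hours the budget covers with all instances running
hours = budget_usd / (rate_per_instance_hour * instances)
days = hours / 24
print(f"{hours:.1f} hours, or {days:.1f} days of 24/7 use")
```

Changing `instances` shows the trade-off directly: fewer instances stretch the same budget over proportionally more hours.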

I love it when people take the question given and then insist that the problem can only be solved by changing the specifications to using "cloud" computing or some other nonsense. It makes you guys look like paid shills.
We were asked a question without much information as to who does what, how, when, and for how long. Based on what we were told Amazon EC2 or something similar sounded like a much better deal than having to deal with building a cluster yourself, then installing, configuring and running it, especially since the point was to do research, not tinkering. Less time spent on tinkering == more time spent on the research itself.
If you don't like the answer given then ignore it. Calling the answer nonsense when it clearly has its advantages, and insinuating that we have some personal angle here at stake is just meanspirited.
Re:EC2 is expensive by TheRaven64 · 2011-09-14 23:53 · Score: 4, Informative

The problem is the constraints. The cheap cluster in my old department cost £100k. £4k does not buy you a lot of hardware. You will probably find a lot more lying around in the undergrad labs. For some of my work as a PhD student, that's exactly what I used - each lab had 40 machines on a GigE network and closed overnight, and for work that wasn't that latency sensitive, I could distribute it across the machines there and run it at night without anyone minding.
If you're serious about needing a cluster, then you need to spend a lot more than £4K. If you only need a cluster for a short time, then £4K can buy you a chunk of time on someone else's hardware. Since this is the UK, they should contact the Manchester Supercomputing Centre, which provides this kind of service to UK universities at quite a reasonable price (and will also lend you people who are good at optimising code for their systems). If the university doesn't already have some clusters lying around, then you should get in contact with a few other research groups. £4K won't go very far, but if half a dozen research groups each put in £4K then that gives you enough for a reasonable cluster to share between the various users.

--
I am TheRaven on Soylent News
Re:EC2 is expensive by mlush · 2011-09-15 02:16 · Score: 1

4000 pounds won't get you very far at all on Amazon.
Let's say the researchers in question took 4x "High-CPU Extra Large Instance" from EC2. Each one of them sports the following: 7 GB of memory 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each) 1690 GB of instance storage 64-bit platform
That's helluva lot of computing power right there, at $0.76 an hour. £4000 translates to about $6 316.8, which in terms of useable hours at 4x$0.76 would be roughly 2077 hours. 2077 hours is roughly 86 days of 24/7 computing. Thus it sounds like a good deal to me.
Spread over a 3 year grant, this means they can do about 2 hours of computing a day. If they do more than that they will lose their compute cluster before the end of the project, just when they're writing the next grant application. And I guarantee they will go over budget, because if they get answers back instantly they won't bother/need to optimise their code. Workload expands to fill the available machine time.
Looking at the numbers again, I suspect they would be better off taking just one of those machines, so they have ~8 hours a day of compute time. Ideally I think they'd want a service at ~$0.25/hour for a 24/7 service. Then they can compute away without worrying about budget and any unused time/money could be used on the big iron for special occasions/rush jobs.
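A rough sketch of the per-day figures, assuming the budget and hourly rate quoted earlier in the thread and a 3 year grant of 365-day years (the names below are illustrative):

```python
budget_usd = 6316.8            # 4000 GBP at the rate quoted earlier in the thread
rate_per_instance_hour = 0.76  # USD, as quoted earlier in the thread
grant_days = 3 * 365           # assumed 3 year grant, ignoring leap days

def hours_per_day(instances):
    """Average daily compute hours if the budget must last the whole grant."""
    total_hours = budget_usd / (rate_per_instance_hour * instances)
    return total_hours / grant_days

four_instances = hours_per_day(4)
one_instance = hours_per_day(1)

# Hourly price at which the budget would cover one service running 24/7
break_even_rate = budget_usd / (grant_days * 24)

print(f"4 instances: {four_instances:.1f} h/day")
print(f"1 instance:  {one_instance:.1f} h/day")
print(f"24/7 break-even: ${break_even_rate:.2f}/hour")
```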
Re:EC2 is expensive by lee1 · 2011-09-15 02:23 · Score: 1

That's helluva lot of computing power right there, at $0.76 an hour.
I've never used the service, but if what another poster says is true, that you pay for a full hour when you use any time at all, the cost per hour can be huge (theoretically infinite). The first debugging session will use up all your money. (There is no avoiding doing some debugging and testing on the final compute configuration, which will involve a bunch of short runs of your code - each run of a few seconds will count as an hour.)
Re:EC2 is expensive by Gaygirlie · 2011-09-15 02:31 · Score: 1

I've never used the service, but if what another poster says is true, that you pay for a full hour when you use any time at all, the cost per hour can be huge (theoretically infinite). The first debugging session will use up all your money. (There is no avoiding doing some debugging and testing on the final compute configuration, which will involve a bunch of short runs of your code - each run of a few seconds will count as an hour.)
The hour starts running when you power up the instance. How many tasks you run inside the instance itself doesn't matter. So the obvious solution would be to only power up one instance for one hour and do as many debugging runs as you can within that time, then examine the results and fire up another instance once you feel you can again use the full hour.
Ie. you got things mixed up. You interpreted it as "do a test run == get billed for an hour", whereas it's "do as many test runs as you can within an hour == get billed for an hour."
Re:EC2 is expensive by lee1 · 2011-09-15 03:13 · Score: 1

That's not nearly as bad as it sounded. Thank you for the clarification. But I still wouldn't want to have to work that way. One minute run + looking at the output + get interrupted by a phonecall = one hour charge. If they billed by actual CPU time that would be much more attractive.
Re:EC2 is expensive by Guspaz · 2011-09-15 03:25 · Score: 1

EC2 is really expensive compared to other (better) cloud providers, not running your own cluster.
Re:EC2 is expensive by Gaygirlie · 2011-09-15 05:13 · Score: 1

EC2 is really expensive compared to other (better) cloud providers
That could well be true, I am only familiar with EC2 and I've actually never used it myself. There's plenty of people suggesting asking universities for CPU time and that might well be cheaper. The point is though that running your own cluster with only £4000 available is likely to end in tears, not actual research getting done.
Re:EC2 is expensive by Guspaz · 2011-09-15 07:07 · Score: 1

I think a lot of the work you put into running your own cluster will still be required for EC2 or other cloud providers. Cloud providers give you a VPS to play with, but they don't typically handle any of the cluster part. EC2 doesn't. So in either case, all of the software side of things is still up to you. In fact, the only thing extra that EC2's "cluster" service really gets you is that they provide a 10Gbps interconnect between your cluster instances. The instances themselves are nothing special, just large, and at ~$1500 a month, expensive.
Building a bunch of cheap machines and plugging them into a switch isn't difficult or risky, and outsourcing that to a cloud provider doesn't necessarily make it any easier. Potentially cheaper, depending on how long-term you need the performance. 4000 GBP would buy you 228162 hours of small (512MB RAM) instances at Linode, or 5690 hours of large (20GB of RAM) instances (cost scales linearly by RAM and guaranteed CPU share). Due to the way such cloud providers work (larger instances guaranteed larger minimum CPU, but all instance sizes have the same maximum theoretical performance of one quad core xeon), you'll probably get far more CPU power out of the many smaller instances, but it depends on how parallelizable the task is.
The downside of EC2 is that, while they guarantee a given amount of CPU power, the guaranteed amount is very small per dollar, and there's no taking advantage of spare CPU time that other people aren't using. In the case of a good cloud provider, there's usually a lot of CPU power to go around. If you want to directly compare guarantees, a large EC2 instance ($0.34/h) is probably roughly equivalent in guaranteed CPU to a 4GB linode ($0.22/h), but has double the burstable CPU power if it's available. Of course, RAM is not comparable, but it's unclear if RAM or CPU power is the primary demand in this instance. There are other complexities, because EC2 charges for storage and transfer on top of the base rate, while Linode includes it in the base rate.
Re:EC2 is expensive by LaRainette · 2011-09-15 07:15 · Score: 1

I can buy hardware that is about 1 third this power for £4000 and then run it forever, or even resell it after, or transfer it to another project.
Amazon is AMAZING if you have unpredictable workload and sudden spikes of CPU power requirements. For instance if you have time sensitive workloads, things that need to be done before day X.
This is a University project so this doesn't apply at all.

You should definitely BUY your hardware. Which one, I don't know; it depends on the type of computation, the software and such, but don't go Amazon, it's just NOT meant for you.
Re:EC2 is expensive by TooMuchToDo · 2011-09-15 09:41 · Score: 1

IF you have to go get power, cooling, man hours of maintenance. *Lots* of places already have these as sunk costs, so why go incur costs with Amazon when you're already paying for your own IT overhead?
Re:EC2 is expensive by Cwix · 2011-09-15 13:10 · Score: 1

Yes, but you could purchase and run your own machine for two years straight. Even if the machine is only a quarter as fast, you'll get twice the computations out of it.
Ohh, BTW. I said that your comment made you sound like a paid shill. I never said you were one. I never said I thought you were one. I was telling you how your comment was interpreted. I didn't even mean it as an insult, it was more of 'Hey, you know how that came off, right?'

--
You are entitled to your own opinions, not your own facts.
Re:EC2 is expensive by Glonoinha · 2011-09-15 15:08 · Score: 1

Hmmm. Actually I think you may have hit upon the answer without realizing it. 'Borrow' the CPU cycles of computer labs that are closed at night.
If you think about it, I am pretty sure there is at least one classroom exactly as you described (a few dozen mid-range to high end desktops on a GigE network) that locks the doors at night and spends a solid 10 hours dark. Figure out a way to boot these machines from a thumbdrive or boot DVD with the Linux distro for clusters that you like (personally I like the thumbdrive approach - it runs a LOT faster due to the seek times on a DVD ROM) and Voila! instant slave machine army for your cluster. If the OP can work around the hours constraints, I'm going to bet he has access to a LOT more CPU horsepower than he could imagine.
The trick is simply finding out who is responsible for the hardware and convincing them to allow you access to the 'training room' or 'computer lab' after hours.

--
Glonoinha the MebiByte Slayer

Commercial cloud computing clusters? by Anonymous Coward · 2011-09-14 17:11 · Score: 0

Have you considered commercial services like Amazon? I believe some are pay as you use.

Theoretical analysis by Warlord88 · 2011-09-14 17:13 · Score: 3, Informative

OP hasn't mentioned a lot except budget. Since you are on such a tight budget, I would highly recommend doing some theoretical analysis first. Do you have a serial code? How much parallelism exists in the code? You say the task is 'very parallel', but Amdahl's law (which is really common sense) will tell you that even for small amounts of serial sections of code, your speedup will be limited. You should also consider the amount of time the code actually runs. Achieving a speedup of 2 for a serial code that runs for one minute is near worthless.
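To put numbers on the Amdahl's law point, here is a minimal sketch. The parallel fractions and core counts are illustrative assumptions, not measurements of the OP's code.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of the run time parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# Illustrative fractions only: measure your own code before trusting these.
for p in (0.50, 0.95, 0.999):
    row = ", ".join(f"{n} cores: {amdahl_speedup(p, n):.1f}x" for n in (8, 32, 128))
    print(f"{p:.1%} parallel -> {row}")
```

Even at 95% parallel, the speedup can never exceed 1 / 0.05 = 20x no matter how many cores are added, which is why measuring the serial fraction comes before buying hardware.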

After you estimate speedup, do some rough calculations on the basis of average cost of a processor and the number of processors required. This should give you an estimate of the hardware cost required. Compare that with the CPU cycles per dollar you get using a cloud service such as Amazon.

Re:Theoretical analysis by Anonymous Coward · 2011-09-14 17:56 · Score: 0

When a scientific computing task is described as 'very parallel', I think you can assume that it's more-or-less perfectly parallel. There are lots of scientific problems that are embarrassingly parallel. My own work, for example, could be parallelised 100,000-fold or so without using more than a few percent of overhead.
Re:Theoretical analysis by serviscope_minor · 2011-09-14 21:17 · Score: 1

I've already done the analysis (in an ad-hoc manner). The problem is data-parallel, where each process runs on a separate chunk of data and never interacts with the others. I've been running it on a spiffier cluster which I will soon lose access to.
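For readers unfamiliar with the pattern, a data-parallel job of this kind can be sketched on a single multi-core box with Python's standard multiprocessing module. The function `process_chunk` is a hypothetical stand-in for the real computation, which the post does not describe.

```python
from multiprocessing import Pool

def process_chunk(chunk):
    """Stand-in for the real per-chunk computation (hypothetical)."""
    return sum(x * x for x in chunk)

def split_into_chunks(data, n_chunks):
    """Split data into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks

def run_parallel(data, workers=4):
    """Process each chunk in its own worker; chunks never communicate."""
    chunks = split_into_chunks(data, workers)
    with Pool(processes=workers) as pool:
        return pool.map(process_chunk, chunks)

if __name__ == "__main__":
    partial_results = run_parallel(list(range(1000)), workers=4)
    print(sum(partial_results))
```

Because the workers share nothing, the same structure maps onto separate machines by handing each one its own chunk, which is what a task farm does.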

--
SJW n. One who posts facts.
Re:Theoretical analysis by Anonymous Coward · 2011-09-14 22:38 · Score: 0

This sounds like a task farm, in which case you should put Condor on your desktop PCs and cycle scavenge them. You can buy more "dedicated" compute nodes to add to the pool with the money. And some data storage. I can't emphasize to you how many problems are caused by researchers not budgeting sufficient storage when they apply for grants.
It's worth asking around the institution you are at to find out what resources may be available to you for free. For example, at UCL there is a 5680 core Linux cluster, a large Altix SMP box and a Condor pool consisting of all the Windows desktops around the university. These resources are free to researchers at UCL. Many such centrally provided resources also employ staff with a great deal of HPC expertise which you can take advantage of. There is little to no point wasting money creating inferior versions of services which your university already provides.

Amazon AWS. by Haven · 2011-09-14 17:13 · Score: 4, Informative

$1.60 / hour for the largest non-GPU cluster instance. This also provides you with rather fast interconnects and scalability with multiple instances.

Only £4,000 in hardware would be a waste of money. You wouldn't have all that much computing power, and it would be obsolete immediately.

Re:Amazon AWS. by Anonymous Coward · 2011-09-14 18:22 · Score: 0

You have a point here: at this price of $1.60 / hour she can pay for five and a half months of 24x7 processing time on cutting edge (virtualized) hardware. If I were her, I'd make up my mind in favor of AWS (assuming, of course, that 5.5 months is enough for the project).
Re:Amazon AWS. by outsider007 · 2011-09-14 19:26 · Score: 1

That's the on-demand price. She would be paying the spot price which is roughly half of that.

--
If you mod me down the terrorists will have won
Re:Amazon AWS. by serviscope_minor · 2011-09-14 21:12 · Score: 1

$1.60 / hour for the largest non-GPU cluster instance. This also provides you with rather fast interconnects and scalability with multiple instances. Only £4,000 in hardware would be a waste of money. You wouldn't have all that much computing power, and it would be obsolete immediately.
Interesting option. It looks like it depends on the usage pattern. The large Amazon ones are 8 CPUs (by the looks of it). For 4000, one can get about 80 high clocking x6 1100 cores. At 1.60 per hour, 4000 would buy about 20 days continuously on the equivalent of a cluster I could build, or perhaps 40 days if I was to pay the spot price (as suggested by another poster).
I may have done my sums woong, though.
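One way to redo these sums is sketched below. It assumes the 4000 is pounds converted at the rate quoted elsewhere in the thread, that a rented core and a bought core are equally fast, and that the spot price is exactly half the on-demand price; all three are assumptions rather than facts from the post.

```python
import math

budget_usd = 6316.8             # 4000 GBP at the rate quoted elsewhere in the thread
cores_if_bought = 80            # the poster's estimate for 4000 of hardware
cores_per_instance = 8          # the poster's reading of the large instance
on_demand_rate = 1.60           # USD per instance-hour, from the parent post
spot_rate = on_demand_rate / 2  # "roughly half", per another poster

# Instances needed to match the core count of the hardware one could buy
instances = math.ceil(cores_if_bought / cores_per_instance)

def rental_days(rate):
    """Days of continuous use the budget covers at a given hourly rate."""
    return budget_usd / (instances * rate) / 24

on_demand_days = rental_days(on_demand_rate)
spot_days = rental_days(spot_rate)
print(f"{instances} instances: {on_demand_days:.0f} days on demand, {spot_days:.0f} days at spot")
```

Under these assumptions the result lands a little under the round figures above but in the same ballpark: weeks of rented time against hardware that keeps running for years.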

--
SJW n. One who posts facts.
Re:Amazon AWS. by Ironhandx · 2011-09-15 00:44 · Score: 1

No, you've done them correctly.
Anticipate your usage patterns. If you think you're going to need the power in short bursts of "I need it NOW" rather than "I can let this run while I'm working on something else" then Amazon AWS is your best bet.
Otherwise a Supermicro Opteron based system will win, hands down.
As an aside, I have an old 2nd gen Opteron system with two dual core Opterons at 2.2 GHz in it that still outperforms my friend's brand new i5 that he bought as a dual purpose CPU, so as others have mentioned I'd stick to Opterons for raw science if I were you :)
Re:Amazon AWS. by Anonymous Coward · 2011-09-15 07:18 · Score: 0

Except 1 instance might be in a server farm in India and another instance might be in a server farm in China, then another in Eastern Europe.
Screw the cloud and the steam it rode in on. Give me bare metal any day.

Buy the cycles, not the machines by Anonymous Coward · 2011-09-14 17:16 · Score: 0

It will cost her more than 4000GBP in grad student time to configure/manage the cluster. (Even using a turnkey installation system like ROCKs.) She should use the money instead to buy time on a national/university cluster system.

Your best bet by eclectro · 2011-09-14 17:18 · Score: 1

Study the design of the "microwulf" and its relatives. Considering that hardware prices have dropped since 2009, your task might be achievable.

--
Take the cheese to sickbay, the doctor should see it as soon as possible - B'Elanna Torres, "Learning Curve"

Re:Your best bet by deadline · 2011-09-15 00:23 · Score: 1

Have a look at The Limulus Project. You will be able to buy one of these real soon.

--
HPC for Primates. Read Cluster Monkey

HP by guttentag · 2011-09-14 17:18 · Score: 1

HP? Is that you?

Not doable by darkjedi521 · 2011-09-14 17:25 · Score: 1

Assuming a 1.5 to 1 correspondence with the USD, you're either getting a decent CPU box and no storage, or a reasonable amount of storage and no CPU. I build/run supercomputing clusters for molecular dynamics simulations at a university in upstate New York, and I wouldn't even consider attempting a cluster for less than $25,000.

Since the OP didn't specify if this was massively parallel or not, I'm going to assume this is so I can use AMD chips for cheapness.

First off, storage. Computational output adds up quick. You're looking at $7,000 USD for 24TB raw storage from the likes of IBM or HP or Dell. Yes, you can whitebox it for cheaper, but considering if you lose this box, nothing else matters (And I doubt you have the funds for proper backups), it pays to get hardware that's been tested and is from a vendor you can scream at when it breaks.

Second, interconnect. A cheap netgear will work, but reasonable internode communication is not cheap, especially if moving largish amounts of data. This could run $1000 to $3000

Finally, the compute hardware itself. A decent node will run $3000 to $5000 depending on the core count, socket count, GHz, and to a lesser extent RAM.

Assuming you want 128 cores, you're looking at 8 machines for compute ($32,000 right there assuming $4K/node, and dual 8 core chips), plus another $7K for the file server/landing pad, and finally add $1500 for a decent switch that can let those nodes talk to each other at line speed and allow room for future growth. Total cost: $40,500 USD or 27,000 pounds assuming the 1 pound:1.5 USD ratio.
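The estimate above is easy to rerun with different assumptions. This sketch uses only the figures from the post (with $4K per node as the chosen point in the $3000 to $5000 range); the variable names are illustrative.

```python
import math

target_cores = 128
cores_per_node = 2 * 8   # dual-socket nodes with 8-core chips
node_cost_usd = 4000     # the post's assumed price within the $3000-$5000 range
storage_cost_usd = 7000  # file server / landing pad
switch_cost_usd = 1500
usd_per_gbp = 1.5        # the post's assumed exchange rate

nodes = math.ceil(target_cores / cores_per_node)
total_usd = nodes * node_cost_usd + storage_cost_usd + switch_cost_usd
total_gbp = total_usd / usd_per_gbp
print(f"{nodes} nodes: ${total_usd:,} USD, about {total_gbp:,.0f} GBP")
```

Lowering `target_cores` or `node_cost_usd` shows how far the node count has to shrink before the total approaches a £4,000 budget.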

Re:Not doable by cowboy76Spain · 2011-09-14 19:49 · Score: 1

First off, storage. Computational output adds up quick. You're looking at $7,000 USD for 24TB raw storage from the likes of IBM or HP or Dell. Yes, you can whitebox it for cheaper, but considering if you lose this box, nothing else matters (And I doubt you have the funds for proper backups), it pays to get hardware that's been tested and is from a vendor you can scream at when it breaks.
Having someone to scream at may help ease some pressure, but if you do not have proper backups you are fubar no matter who you scream at.

--
Why can't /. have a rich-text editor? Editing your own HTML is so XXth century.
Re:Not doable by imsabbel · 2011-09-14 21:24 · Score: 1

I built a cluster on a similar budget a while ago. It can be done easily.
3-5k for a node is pure bullshit and has nothing to do with reality.
For that budget, I got 8 machines with 2GHz core2 quad cores. 2 years ago.

--
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
Re:Not doable by darkjedi521 · 2011-09-15 01:40 · Score: 1

I've never been able to build dual socket 1U systems for less than $3K
Re:Not doable by the+eric+conspiracy · 2011-09-15 02:21 · Score: 1

Don't buy rack mounted dual socket systems if you have this sort of budget. They are too expensive.
Re:Not doable by Amouth · 2011-09-15 06:02 · Score: 1

i can't seem to remember the name or find them via Google right now - but ~4 years ago i remember Intel launched a line of small HPC chassis that let you add up to 6-8 nodes in a custom made rack system (like blades but not as dense and allowed them to be cheaper). the benefit of them was that for about the same price as a home built 1-2u per node setup you got in a single box that came with built in networking/interconnect/back-plane and shared sas based san storage in the unit.
i remember we almost bought one here for a VM cluster but we only found out about it after we had ordered a dedicated SAN unit.
i seem to remember it being priced fairly well, not sure if it would work for this person, but i'm mainly wondering if someone else can remember the product name/# because i'm drawing a blank.

--
'...if only "Jumping to a Conclusion" was an event in the Olympics.'
Re:Not doable by imsabbel · 2011-09-15 09:25 · Score: 1

I would even go as far as to say: Do not buy rack mounted at all if on such a budget.

--
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?

computer recyclers by thedarknite · 2011-09-14 17:26 · Score: 1

Try a computer recycling centre, most tend to be short on storage and are happy to sell a large number of desktop machines at a lower than normal price per unit. Community operated ones tend to be more helpful than business ones though.

--
A game has objectives and is competitive, anything else is just play

Buy a small chunk by pigwiggle · 2011-09-14 17:32 · Score: 1

Buy a small chunk of something that looks like the big machines she will be using. As others have said, with that little money you aren't going to get legitimate computational resources. But she will certainly qualify - or already has - for time on some of the larger public machines. In my experience, it is really nice to have a small cluster, i.e. two or three nodes, to test and benchmark code. You can look at things like parallel performance on a single node versus across nodes, whether the code plays well with shared memory, whether it can reasonably mix shared and non-shared parallelization schemes, and so forth.

--
46 & 2

Re:Buy a small chunk by Anonymous Coward · 2011-09-14 17:43 · Score: 0

great advice. buy 3 or 4 of the cheapest you can get at the time, get the infrastructure going and get some results.
when you're really finished, the market will have changed. reevaluate your options then

Um, hello? Botnet? by Anonymous Coward · 2011-09-14 17:42 · Score: 0

They are going for like $3 an hour.

Ps3 by Anonymous Coward · 2011-09-14 17:52 · Score: 0

If your friend doesn't mind tinkering with the OS, it might be worth buying a bunch of PS3s. They disabled the old Other OS feature, but I'd imagine it's not too hard to mod them, or do something along those lines.

Re:Ps3 by bhcompy · 2011-09-15 01:10 · Score: 1

This is a good idea. The Air Force has a PS3 cluster doing similar work.

Raspberry Pi, Model B...... by Anonymous Coward · 2011-09-14 17:54 · Score: 1

$35/element, runs a boring Linux distro, runs very cool, low power consumption (less than 1w), onboard Ethernet.

Sorted!

Raspberry Pi

BOINC Project? by bradley13 · 2011-09-14 18:03 · Score: 4, Interesting

She could also consider creating a BOINC project. She could then do some publicity locally and on forums, to get people to choose her project. I've never tried creating a BOINC project, so I don't know how hard this is. However, I do run the client as a background task, and I imagine many other people do as well.

--
Enjoy life! This is not a dress rehearsal.

Re:BOINC Project? by WhiteSpade · 2011-09-15 00:08 · Score: 1

I have no idea what she is researching, but there may be problems with patient data, etc.
---Alex

Don't spend it on hardware. by Jane+Q.+Public · 2011-09-14 18:07 · Score: 1

Spend the money on a programmer to parallelize the algorithm on standard CPUs, and put it out on BOINC. People volunteer their spare cycles for BOINC projects that are barely more interesting than the chemistry of aardvark snot. She would likely get volunteers if there's anything of even passing interest in her research.

Boxes on shelves by stevelinton · 2011-09-14 18:09 · Score: 1

If your friend doesn't want to do a lot of engineering work, then for this price I would just buy 10 or so PCs (depending on memory/CPU tradeoffs) from wherever has a special offer, plus a gigabit switch and put them on shelves. If you need a lot of memory, or can usefully share memory then that would be a bit different, but you can buy a usable headless PC for £300-£400. This will also not be terribly power efficient, nor will components like motherboards be of the highest quality, but you get more bang for the buck that way than almost anything else except second-hand. At the other extreme, you could probably buy a single 24-core AMD box for the money with quite a lot of RAM and just run a lot of processes on it.

Talking of second-hand, the other thing to do is to see if anyone has a cluster they can't feed (i.e. power) any more. Our applied maths dept is about to shut down a 3 year old 1000-core cluster because they can't afford the power to run it and their newer 2000 core cluster. A slice of that would be great and someone locally might be able to help you in a similar way.

Re:Boxes on shelves by stevelinton · 2011-09-14 18:21 · Score: 1

Just did a bit of checking. For £249 including VAT you can get a mini-tower, dual core midrange CPU and 2GB RAM. A dozen or so of these and a switch looks very appealing if there is space. 300W PSU, so cluster should be under 5kW.
Re:Boxes on shelves by Anonymous Coward · 2011-09-14 18:27 · Score: 0

You are charged for electricity in the UK? I've never heard of that in the US. The university pays for the electricity, presumably with some of the money they took out of your grants for "overhead." The departments and individual research labs aren't charged for power. Do you pay less in overhead in the UK? Do you turn stuff off at night? Because we usually leave the lights on all night.
Re:Boxes on shelves by Vectormatic · 2011-09-14 20:09 · Score: 1

If you are open to custom building yourself, you can dispense with the optical drive and other crap you don't need (pick smaller drives if storage isn't a concern, avoid the Windows tax, however small it may be on a 250 quid box) and dump that money into stronger hardware.
Late last year we had a crunch-intensive problem at work, and the internal IT department wouldn't even give us a price quote, just said they could do it (a problem which required six octo-core Xeon machines, in a worldwide company with 90k people, go figure), so I drafted a small proposal saying I could build ten hex-core AMD boxes with 4 GB RAM each for 5k euros. Of course it wasn't taken seriously, because it wasn't enterprise, but there you go: a custom bare-bones build with a beefy CPU.

--
People, what a bunch of bastards
Re:Boxes on shelves by imsabbel · 2011-09-14 21:27 · Score: 1

Also, dump video cards.
And HDs: Depending on the load, one might not need a big / fast local storage at all. It can be cost efficient to use a very small SSD.
And yeah to boxes on shelves. A usable 19" rack case alone can be more expensive than a computer box.

--
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?

Microwulf by tqk · 2011-09-14 18:10 · Score: 1

Microwulf.

--
"Tongue tied and twisted, just an Earth bound misfit ..." -- Pink Floyd.

Who is paying the electricity bill? by Michael+Woodhams · 2011-09-14 18:15 · Score: 1

And how much space and air conditioning do you have? Depending on the answers to these questions, the optimal* solution might be 'get a bunch of 5 year old computers nearly for free.'

* Optimal for your friend, not for her university.

--
Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.

Re:Who is paying the electricity bill? by Memroid · 2011-09-14 18:32 · Score: 1

Free? where?
Re:Who is paying the electricity bill? by Anonymous Coward · 2011-09-14 19:15 · Score: 0

Around here you can buy Core 2 Duo + 2GB Lenovo desktops for 35€. They even have Linux installed. I would call that almost free considering the processing power.
Many of those kinds of places don't advertise, but I'm guessing most decent-sized cities have several places selling used hardware.
Re:Who is paying the electricity bill? by Anonymous Coward · 2011-09-14 19:59 · Score: 0

At your university. Computers are replaced all the time, and you can't throw out the old ones. There are unwanted computers all over the university. Just ask for one. But they aren't going to be good enough to do supercomputer calculations.

Had similar situation Dell410s + Condor; no cloud by Oori · 2011-09-14 18:29 · Score: 1

I was in a similar situation setting up a research group. I wanted an expandable setup that would meet the approval of the local IT sysadmins (some remote management options, vendor support). For 2.6K pounds a pop I got a Dell PowerEdge T410 server with 2 6-core CPUs and 24GB RAM. I'm never one to push a Dell (been purchasing IBM/HP for years) but this is a decent machine for a decent price. I tried various cloud solutions using virtual machines on Amazon and similar frameworks, but for the kind of work we do (frequent software updates, massive amounts of data that need to be stored locally and can't be transferred easily), those don't scale. We use Condor as a job submission engine. Not that we don't like SGE but with Oracle's plans (http://en.wikipedia.org/wiki/Oracle_Grid_Engine) one can never tell. PS: remind your friend to invest in a QNAP NAS or similar for backups / disaster recovery.

Don't use EC2 by hawguy · 2011-09-14 18:32 · Score: 1

I don't think any of the posters recommending EC2 have ever looked at the economics of EC2 versus self-hosting.

If you have long-term compute needs (as opposed to needing to throw lots of cores at a problem to get fast results in a short time), you're better off buying a Dell.

A Quadruple Extra Large EC2 instance is $1.60/hour. You have around $6500 USD, so you could buy 169 days of compute time at EC2 (ignoring the cost of I/O and network bandwidth).

This instance has 23GB of RAM and is equivalent to 2 x Intel Xeon X5570 CPU's.

For around $5000, you could buy a Dell R710 with dual X5647's, 32GB RAM, RAID-1 1TB SATA drives (depending on your storage needs, you might want to move to faster SAS disks). As long as you have a suitable office to host the server, your only recurring hosting cost is electricity (around $70/month) and maybe you'll need to spend $500 on a UPS. If you need to pay for hosting/colocation somewhere, that will definitely change the economics.

So, with your budget, you get one node + UPS + electricity for a year. All for the price of around 5 months of EC2 time.

You come out ahead even if you want to throw away the server every 6 months and start fresh.

You can save a few bucks by building your own (or going to a custom whitebox builder), but the Dell comes with 3 years of next business day support. Last time I priced out a whitebox builder, they beat Dell's best discounted price by about 10% and only offered a 1 year warranty.

Re:Don't use EC2 by goarilla · 2011-09-14 19:14 · Score: 1

Dell support isn't ( always ) heaven on earth !
Re:Don't use EC2 by outsider007 · 2011-09-14 19:45 · Score: 1

I think you're overestimating the cost of ec2 by 2x and underestimating the cost of electricity by 3x

--
If you mod me down the terrorists will have won
Re:Don't use EC2 by PhrstBrn · 2011-09-14 19:51 · Score: 1
You can save a few bucks by building your own (or going to a custom whitebox builder), but the Dell comes with 3 years of next business day support. Last time I priced out a whitebox builder, they beat Dell's best discounted price by about 10% and only offered a 1 year warranty.
Rubbish.
- SYS-6026T-URF - $1200
- Xeon X5647 - $850 each
- 4GB ECC Registered - $50 each
- 1TB SATA 7200RPM - $120 each (WD RE4 or equiv)
Chassis + 2xXeon + 24GB RAM + 2x SATA drives =~ $3500. Drop the chassis down to 1U and you're looking at ~$3200. I'm assuming the $5k Dell is an H200 controller, and not a H700 card.
These prices are retail prices, not wholesale, and most are on the conservative side, so a system builder is going to have better wholesale prices on parts than what I quoted even. Unless your system builder is making 50%+ margins on their hardware, your system builder is trying to rip you off, or you're not comparing apples to apples.
Re:Don't use EC2 by Anonymous Coward · 2011-09-15 01:27 · Score: 0

doubtful... a college or business gets electricity for a lot less per kWh than a residential customer does
Re:Don't use EC2 by Anonymous Coward · 2011-09-15 02:07 · Score: 0

I don't know where you get that $70/mo. figure from, but please note that rates vary wildly. UK rates are roughly 3 times higher than those in Hawaii which is the most expensive US state for electricity. The cheapest states are more than three times cheaper than Hawaii. Location matters.
Though, given the fact that this will run at a university, they're probably going to pay exactly zilch.
Re:Don't use EC2 by hawguy · 2011-09-15 02:26 · Score: 1

I don't know where you get that $70/mo. figure from, but please note that rates vary wildly. UK rates are roughly 3 times higher than those in Hawaii which is the most expensive US state for electricity. The cheapest states are more than three times cheaper than Hawaii. Location matters.
Though, given the fact that this will run at a university, they're probably going to pay exactly zilch.
I used $0.15/KWh to run a 650W load 24x7, which is close to what someone would pay in California to host the server. This article says that rates in the UK average around $0.185/KWh (which would raise the cost of electricity from $70/mo to $86/mo). Feel free to plug in your own local rate and see where the breakeven point is.
Re:Don't use EC2 by hawguy · 2011-09-15 02:40 · Score: 1

Instead of just speculating that I'm overestimating the cost, why not do the math yourself and show us your work?
EC2 pricing is here: http://aws.amazon.com/ec2/pricing/
$1.60/hour * 24 hours/day * 169 days = $6489
You could buy a one year reserved instance:
$4290 + $0.56 * 24 hours/day * 164 days = $6494
I used 15 cents/Kwh (which is around what I pay in California). Your local rates will vary, feel free to use your local rate:
650W * 24 Hours/Day * 30 days * $0.15/Kwh = $70.20
The 650W figure is measured power use for a 2 socket Xeon compute node while busy.
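
For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the arithmetic above. The function names are made up for illustration, and the $5000 server and $500 UPS figures are the ones quoted earlier in this thread, not current prices. It ignores I/O, bandwidth, cooling, and admin time, so treat it as a rough estimate only.

```python
HOURS_PER_DAY = 24


def ec2_on_demand_cost(rate_per_hour, days):
    """Cost of one on-demand instance running 24/7 for `days` days."""
    return rate_per_hour * HOURS_PER_DAY * days


def ec2_reserved_cost(upfront, rate_per_hour, days):
    """Cost of a reserved instance: one-off fee plus a lower hourly rate."""
    return upfront + rate_per_hour * HOURS_PER_DAY * days


def monthly_electricity_cost(load_watts, rate_per_kwh, days=30):
    """Electricity bill for a constant load over a 30-day month."""
    kwh = load_watts / 1000 * HOURS_PER_DAY * days
    return kwh * rate_per_kwh


def breakeven_days(hardware, ups, load_watts, rate_per_kwh, ec2_rate_per_hour):
    """Days of 24/7 use after which buying beats on-demand EC2."""
    ec2_per_day = ec2_rate_per_hour * HOURS_PER_DAY
    power_per_day = load_watts / 1000 * HOURS_PER_DAY * rate_per_kwh
    if ec2_per_day <= power_per_day:
        # Electricity alone costs more than renting: owning never pays off.
        return float("inf")
    return (hardware + ups) / (ec2_per_day - power_per_day)


if __name__ == "__main__":
    print(f"On-demand, 169 days: ${ec2_on_demand_cost(1.60, 169):.2f}")
    print(f"Reserved, 164 days:  ${ec2_reserved_cost(4290, 0.56, 164):.2f}")
    print(f"Power per month:     ${monthly_electricity_cost(650, 0.15):.2f}")
    days = breakeven_days(5000, 500, 650, 0.15, 1.60)
    print(f"Breakeven after about {days:.0f} days of 24/7 use")
```

With the thread's numbers the purchase pays for itself after roughly 150 days of round-the-clock use, and a higher electricity rate such as $0.185/kWh only pushes that out by a few days.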
Re:Don't use EC2 by hawguy · 2011-09-15 02:48 · Score: 1

These prices are retail prices, not wholesale, and most are on the conservative side, so a system builder is going to have better wholesale prices on parts than what I quoted even. Unless your system builder is making 50%+ margins on their hardware, your system builder is trying to rip you off, or you're not comparing apples to apples.
Yeah, admittedly, I'm comparing apples to oranges - last time I did a Dell comparison, I was using a Dell VAR that was able to secure good discounting from Dell, and it was a 4 socket server with 256GB of RAM and no disk.
The Dell was 10% higher and came with a 3 year on-site warranty versus a 1 year repair shop warranty from the builder, so we went with Dell.
In 9 months we've had one problem with the servers, and Dell swapped out a motherboard the next day. Diagnostics pointed to the hard drive controller, which was weird since we're not using it, but the server has been fine since then.
Using your numbers, you could almost build 2 servers for the price of 6 months of EC2, though you'd have no money for electricity.
Re:Don't use EC2 by outsider007 · 2011-09-15 13:48 · Score: 1

If she's bargain shopping she will get spot instances when demand is low to get a good price. If the price is good, spin up 5 instances. If it's bad just wait. With electricity you can do the same thing by only running offpeak hours but I think your 650W figure is very optimistic.

--
If you mod me down the terrorists will have won

Clusters on the cheap by kawabago · 2011-09-14 18:47 · Score: 1

Mums are nice.

Re:Clusters on the cheap by John_Booty · 2011-09-14 19:24 · Score: 1

Yer mum's pretty nice, mate. Winkwink nudgenudge.

--

OtakuBooty.com: Smart, funny, sexy nerds.
Re:Clusters on the cheap by qmaqdk · 2011-09-15 07:27 · Score: 1

Know what I mean, know what I mean?

--
My UID is prime. Hah!

Just pay to use someone else's HPC cluster by Anonymous Coward · 2011-09-14 19:28 · Score: 0

You're probably excited about being able to help her burn some money on BUILDING some ultra-cheap Beowulf-style HPC cluster but your friend is probably more interested in just actually USING a working production HPC cluster to solve her very parallel, very CPU-bound scientific/mathematical problem ASAP, and not so much into having to set up and administer the HPC cluster.

Based on the little bits of info you've provided, I'm going to guess your friend is something like a new assistant professor/lecturer at some small research university? Maybe located in the UK or some other European country that prefers to use pounds sterling instead of euros?

Let's see here...4K in GBP is about 6.3K in USD right now. For that amount of money, I would check if the university or a larger affiliated academic institution might have some sort of HPC cluster that you can just pay for compute time on. Best bet is if she can talk to her colleagues involved in her specific research area who might already be aware of what's available. A Google search for "UK high performance computing" comes up with some possible good hits.

In the U.S., there are organizations like XSEDE (http://www.xsede.org/) which has a system where academic researchers can apply for an allocation of compute time to use on various HPC clusters at institutions affiliated with XSEDE. Some of these institutions will also separately independently rent out compute time (sometimes it's easier to just wave some money at some supercomputing center than deal with the hassle for applying for formal allocation based on the proposed merits of your project hoping that somebody will think it's worthy). See the Triton Resource (http://tritonresource.sdsc.edu/) at SDSC (http://www.sdsc.edu/us/tapp/tapp_pricing.html) as an example.

When your friend is actually comfortable just using a HPC cluster to run her compute/simulation jobs, publishes her results, gets noticed for her work by more senior peers and funding agencies, applies for more grant funding, and when serious amounts of money start rolling in ($50K-$1M+), then it's time to start building her own (which is another long story).

It all depends on structure of tasks to be run by Anonymous Coward · 2011-09-14 19:40 · Score: 0

The main question is how specialized the software to be run is. If you are sure what kind of microkernels are required and they stay fixed over the cluster's lifetime, go for FPGAs or a custom-made CPU with IP logic extensions. The best bang for the buck, if possible, while still allowing moderate variation in the calculation kernels, is DSPs, which have also been going multicore lately (for example TI). If the previous options fail: the biggest raw power, but with high energy costs, comes from x86 variations. If the task is easily separable into tiny tasks and energy costs do matter, hundreds or thousands of separate tiny ARM cores of A8 or A9 grade (A9 is planned and starting production for multicore CPUs) on a gigabit Ethernet link might suffice.

Do you count manpower and infrastructure costs? by grid17 · 2011-09-14 19:40 · Score: 1

If you are extremely data heavy, the cloud becomes quickly much more expensive than buying your own. The Broad Institute made some recent experiments on Amazon analyzing genome experiments, and they said Amazon was 4x more expensive.

But for her cpu-heavy workloads the cloud would work perfectly.

more things to consider:

If she buys her own hardware there are a lot of extra costs to the raw hardware:
1 someone needs to set the thing up, administer it, and support it with patches, etc. Even for a small cluster this is a good percentage of a person. If she has a slave student doing that, great! Although if the guy leaves there will be an issue. If she needs to spend money on a person, then Amazon will be much, much cheaper.
2 there is an electricity bill and you need space, probably cooled space. if it is available, great! otherwise it can be a showstopper (fire hazard in a lab)
These costs are included when people talk about total cost of ownership. If you factor them in, the cloud suddenly becomes veeery interesting. Btw, EC2 is not the only player on the market now, there is Azure and also the IBM SmartCloud, with competitive pricing.

For normal cluster computing you can go to a number of startups that will build an on-demand cluster for you based on amazon, my favorite in the research domain is cloudbroker - http://www.cloudbroker.com/ - who will actually render the software you need into a SaaS based on EC2 or the IBM cloud. you just launch your hpc cloud app, with your uploaded data and you pay for the workload you did. the bill includes just amazon or ibm costs, licenses if any and a surcharge by cloudbroker which is totally worth the money because now you do not even need to set up the software and the virtual machines anymore.

World Community Grid: Free if You Qualify by BBCWatcher · 2011-09-14 19:41 · Score: 1

Somebody upthread mentioned BOINC, which is a great idea for many parallel-oriented compute-bound problems. However, while making your project compatible with BOINC is necessary, it's usually not sufficient. The problem is marketing, to convince enough people to run your work. World Community Grid, sponsored by IBM, is free and is an excellent way to solve that problem. You can submit a proposal, and if approved you'll quickly have lots of BOINC-powered computing working on your problem.

Buy time on HECToR by CSMoran · 2011-09-14 20:03 · Score: 1

At least you'll be running on the bare metal, not some virtualized piece of cloud. http://www.hector.ac.uk/

--
Every end has half a stick.

AWS, Grid, Owned hardware, many options by gef7 · 2011-09-14 20:17 · Score: 1

In short:

AWS could be good value for money, if you DON'T have data-intensive tasks (otherwise it gets expensive quickly)
Grid could be even better value for money, assuming you can get the service AND assistance in your lab, for low or zero cost (eg. in US/EU this should be fine)
Owning hardware is a MUST if you have sensitive data (medical, financial etc) or just need to build local expertise (more input needed here)

Assuming the latter, check among Supermicro & Dell servers. Last time I needed to set up a cluster, the Dell R610s were a good pick, giving great manageability over the LAN, low volume and decent features (balanced storage space along with CPU capacity, around 8 cores + 8 TBs per 1U blade). Also don't rule out options like Shuttle XPCs; they are damned robust in thermal aspects (hey, you'll be running these continuously, won't you?). Finally, don't underestimate the need for local sysadmining; you will likely need to set up a queueing system (Torque, *PBS*, SLURM, SGE, LSF, NQS, Condor) and manage the whole thing. This won't happen automatically, take note of that. If you run something of the PBS or SGE family I can happily help with setting up a tool called qtop.

Re:AWS, Grid, Owned hardware, many options by Anonymous Coward · 2011-09-15 01:41 · Score: 0

Hey, why don't you ask a tiny portion from the other cluster mentioned in this slashdot article?
http://linux.slashdot.org/story/11/09/13/2111210/Ask-Slashdot-Best-Use-For-a-New-Supercomputing-Cluster

Phone big companies by Anonymous Coward · 2011-09-14 20:28 · Score: 0

Call the IT depts in some big companies. They may be throwing out old desktops. If you are an institution then you can honestly turn what would have been a scrappage deal (which tends to be regulated and cost money) into a £1 sale from one institution to another.

Just beware the liability for scrapping the computers after the project. But hey, that probably comes out of someone else's budget.

Use idle lab machines by axedog · 2011-09-14 20:32 · Score: 1

Is the research group in a university? Most universities have a lot of computing power that sits idle for large proportions of the time in their undergraduate computing laboratories. There's a significant resource that could be exploited simply by deploying jobs to idle machines.

--
Sent from my Tianhe-2 (MilkyWay-2).

Use the Grid by nextgens · 2011-09-14 20:40 · Score: 1

In the UK there are academic grids that research groups can use like the ngs or gridpp (for free or next to nothing) http://ngs.ac.uk/use-ngs I used to work at the Center for Parallel Computing where I am sure some people would talk to her. http://www.cpc.wmin.ac.uk/cpcsite/index.php/Main_Page

All the cloud-goers; by TranceThrust · 2011-09-14 20:49 · Score: 1

give some thought to data security as well please. If the research done is sensitive, don't use clouds.

On the other hand, costs of self-hosting are indeed underestimated very quickly, which is not a good thing when budget is low.

Also, while manycore machines seem cost-effective, look at the solutions you are using for computation; it is hard to press a 48-core machine to peak performance, much harder than driving more standard distributed-memory supercomputers. But this depends on your application.

Buying time on an existing cluster (local university, or a dedicated HPC company) seems the surest way, and also reasonably secure when done at a trusted institute or company.

MicroWulf Project by StoneyMahoney · 2011-09-14 20:50 · Score: 1

Look up the MicroWulf project. Pretty much exactly the thing you want if you really really really do need to build it yourself.

Self-building clusters needs some careful planning before you start:

1) Don't do it if you don't have to. Lots of resources around that you can use, especially if you're in academia.
2) Find/read benchmarks for the key components to your research (CPU in this case) and then design your cluster modules around it. Keep in mind your price/performance ratio for each entire module, not just the core component. Fewer faster modules is usually the best way to go.
3) Cut out as much hardware as you can but don't skimp on important specs. Integrated NICs on motherboards are good but check they're PCI-Ex connected instead of vanilla PCI if you want the best transfer rates with lowest latency. If you don't need lots of data lying around on each module, ditch local hard drives - dual purpose a module as a data server, maybe a Netboot server if you fancy it, or consider booting from cheap USB keys.
4) Don't forget to factor in the price of network switches, Cat-6, power leads, etc - it can mount up pretty quickly!

Many factors are involved ... by MacTO · 2011-09-14 20:54 · Score: 1

First, see what's available. Many departments have computing options available that depend upon scheduling and departmental budgets.

If that doesn't work out, what are you doing? If it is serial processing or based upon existing software, then 'contract it out' (if that's an option). It's easier, and probably cheaper, especially in the early stages.

Parallel processing though needs more serious consideration. Cheap SIMD was being offloaded onto GPU's the last time I looked (which, admittedly, is probably too long ago). So it may be best to look to 'homebrew' configurations in that case.

Buy a mining rig by Anonymous Coward · 2011-09-14 21:21 · Score: 0

I'm sure there are a lot of bitcoin mining rigs on the market right now, which sounds like exactly what you want.
Bitcoin mining rigs are purpose-built machines to run OpenCL apps as fast as possible, with multiple high-end AMD GPUs.

Recently the bottom has fallen out of the bitcoin price, making mining unprofitable for a large number of people, and they will be dumping their rigs.

Check on ebay.

Doesn't anyone rent dedicated servers anymore? by a+sad+dude · 2011-09-14 21:26 · Score: 1

We got a customized 8-core (2x E5504) box with 32 gigs of memory from Leaseweb for about 200 euros a month. I'd say it's a great deal. I think if you talk to them, they could get you a custom cluster too. Definitely cheaper than EC2 if you use it constantly.

BOINC/WCG by shutch · 2011-09-14 21:33 · Score: 1

I think an innovative solution, assuming you have a defined timeframe for this project, is the following: Set aside a small portion of your budget (£250-£500 - I'm guessing) and set up a BOINC server on an EC2 instance; http://boinc.berkeley.edu/trac/wiki/CloudServer . It will probably not cost as much as I have suggested above as it won't be *nearly* as intensive as actually doing the computing required for analysis of the experiments but you can use some of the budget to pay a server admin to set it up for you if you are not very confident. Although, I am certain, if you looked around any of the big communities involved in grid projects (overclock.net, evga, etc.), someone would be willing to assist you for free. Go around to the major forums posting a message in their grid computing projects asking for assistance and offering a £2000 prize or donation to a charity of their choosing for the group that completes the most work units over the project's life. This may sound like a lot of hard work, but these groups are fiercely competitive and are extremely willing to help any cause and it will be not as difficult as it seems when reading this. At the very least, I can guarantee you about 20 users from a grid forum I am part of that will contribute - at £0 cost. Best of luck!

Re:This... Perhaps by mlush · 2011-09-14 21:46 · Score: 1

You have a limited budget, so it's more cost effective for you to lease time on someone else's equipment for now.

In a fair and logical world this would be true.

In an academic setting there can be problems with what you're allowed to spend the money on. If the £4000 is in the 'Computer Hardware' section of the grant, buying AWS could count as a service and have to come out of the 'Consumables' section of the grant.

The other important question that springs to mind is whether £4000 would last till the end of the grant. Knowing the typical research program I think it would be very hard to estimate this, and if they underestimate (or suddenly find they need to recompute the last year's work) they may end up without their processing power at the end of the grant when they really need to get something pretty to go in the next grant application.

If they buy hardware it may be more expensive, but they can absolutely rely on it being available till the end of the grant (assuming there is a 3 year warranty and the University has some sort of building contents insurance) and if they find they need more power they could still go to AWS.

High density Low budget Cluster by Anonymous Coward · 2011-09-14 22:45 · Score: 0

This does seem prime for outsourcing to a computing-on-demand service, but if you really want to build it:

Dual core all in one mini-itx boards have a low power connection, can run quite happily as dual core and emit a small amount of heat.
You can work on a module cost of around £100 for a dual core unit at standard retail costs even without volume discounts.
30 x Intel Dual core atom 4GB RAM No HDD ~ £100 per unit
48 port Gigabit switch £400
1x Storage array (dual or quad gigabit ethernet) £600 depending on the requirements
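
As a rough check on the parts list above, here is a small Python sketch that totals the build and works out how many nodes fit a given budget. The helper names are hypothetical, the prices are the approximate ones quoted in this post, and the £4000 budget is the figure mentioned elsewhere in the thread. Cabling, power strips, and shelving are left out.

```python
def cluster_cost(nodes, node_cost, switch_cost, storage_cost):
    """Total build cost in pounds for `nodes` identical compute modules."""
    return nodes * node_cost + switch_cost + storage_cost


def max_nodes(budget, node_cost, switch_cost, storage_cost):
    """Largest node count that fits the budget after the fixed items."""
    remaining = budget - switch_cost - storage_cost
    return max(0, remaining // node_cost)


if __name__ == "__main__":
    total = cluster_cost(nodes=30, node_cost=100, switch_cost=400, storage_cost=600)
    print(f"30 nodes + switch + storage: £{total}")
    n = max_nodes(budget=4000, node_cost=100, switch_cost=400, storage_cost=600)
    # Each module is a dual-core Atom board, so cores = 2 * nodes.
    print(f"Nodes affordable on £4000: {n} ({2 * n} cores)")
```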

If you wanted to expand into GPU processing later on you may want to consider ION2 boards as that gives a huge advantage but at a cost.
Having a standard voltage/power/heat and slot in form factor goes a long way in building large scalable systems. Heat management is going to be the major issue in the physical design.

Get More Creative with AWS by byennie · 2011-09-14 22:51 · Score: 1

I see a lot of people suggesting the use of cluster instances on AWS. At first blush this is what they are built for, but it's not a gimme that they are the most cost-efficient option. From the description, the job is not targeting GPU, and it's also not network-bound. Some of the high-cpu instances are more economical if you don't need the gobs of RAM or 10 Gigabit pipes. The cluster instances do have somewhat faster CPUs.

AWS offers a MapReduce layer that supports all of these instance types (http://aws.amazon.com/elasticmapreduce/).

Cluster xLarge (GPU) = $2.10 / hour = $0.26 / hour / core = $0.063 / hour / cpu unit
Cluster xLarge = $1.60 / hour = $0.20 / hour / core = $0.048 / hour / cpu unit
High CPU - medium = $0.17 / hour = $0.085 / hour / core = $0.034 / hour / cpu unit
High CPU - large = $0.68 / hour = $0.085 / hour / core = $0.034 / hour / cpu unit

Throw in:

* Spot instances are discounted by over 50%. If your jobs can work on a range of instances, bid on a variety of cheap CPUs first.

* Reserved instances come out ahead after about 6 months of 24/7 usage, if you're going to use it that way.

All together, you could do something like this, with many possible variations. This gets you roughly 10 CPUs running 24/7 for 6 months, plus 3 hours a day of cluster compute time. And of course you don't pay for any time that you're not running so that could be reallocated.

5000 hours High CPU (medium) = $850 = 10,000 CPU hours
5000 hours High CPU (large) = $3400 = 40,000 CPU hours
250 hours spot instance (Cluster) = $150 = 2,000 CPU+ hours
250 hours spot instance (Cluster GPU) = $200 = 2,000 CPU+ hours
---
Roughly 54,000 CPU hours for $4600, leaving about $1700 for bandwidth, storage, or more compute time.
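
To sanity-check or rework a plan like this, a few lines of Python are enough. This is only a sketch: the table layout and function name are invented here, the hours and dollar amounts are the line items listed above, and the cores-per-instance column is inferred from the post's own CPU-hour figures (for example 10,000 CPU hours over 5000 instance hours implies 2 cores), so treat it as an assumption.

```python
# (label, instance_hours, total_cost_usd, cores_per_instance)
# cores_per_instance is inferred from the CPU-hour figures in the post.
PLAN = [
    ("High CPU (medium)", 5000, 850, 2),
    ("High CPU (large)", 5000, 3400, 8),
    ("Cluster, spot", 250, 150, 8),
    ("Cluster GPU, spot", 250, 200, 8),
]


def plan_totals(plan):
    """Return (total cost in USD, total CPU-core hours) for a plan."""
    total_cost = sum(cost for _, _, cost, _ in plan)
    total_cpu_hours = sum(hours * cores for _, hours, _, cores in plan)
    return total_cost, total_cpu_hours


if __name__ == "__main__":
    cost, cpu_hours = plan_totals(PLAN)
    print(f"Total cost: ${cost}")
    print(f"Total CPU hours: {cpu_hours}")
    print(f"Cost per CPU hour: ${cost / cpu_hours:.3f}")
```

Swapping rows in and out of `PLAN` makes it easy to compare mixes of instance types before committing any budget.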

Point being, just like you can customize the heck out of a box to buy, you can carefully craft a cloud approach more efficiently than just buying cluster time. If you just throw it at GPU cluster boxes, you could get half the work done (or less)...

Anyone make a smartphone cluster yet ??? by Ex-MislTech · 2011-09-14 22:53 · Score: 1

I think it's doable, especially with Android-based phones.

Data rates might be an issue but some things like SETI @home
didn't have real high data rates, with Wifi enabled phones this could be
mitigated to only work when Wifi was active.

100's of millions of phones moving in and out of a global cluster.

I think I just had a nerd moment.

--
google "32 trillion offshore needs IRS attention"

Re:Anyone make a smartphone cluster yet ??? by MimeticLie · 2011-09-15 00:10 · Score: 1

More like 100s of millions of people who try it once and then never run it again because their battery life went to hell.

DON'T use BOINC. by real-modo · 2011-09-14 23:01 · Score: 1

Spend the money on a programmer to parallelize the algorithm on standard CPUs, and put it out on BOINC.

No, DON'T use BOINC. It will take six months to set up the project and publicise it, and a further year to learn how to deal with the volunteers so as to get work done reasonably reliably. Half or more of the work you send out will never come back. Newbs will bother you with questions about BOINC and the sixteen other projects they're 'donating time' to, not your project. (Virus scanners cause people grief.) RAC hunters will badger you incessantly about minor discrepancies in points awarded.

Running a project through BOINC and getting value out of it means committing a *lot* of time and effort just on the PR side of the project-- setting up and maintaining a website and the Boinc infrastructure, and doing regular news updates, answering emails, getting rid of spammers, and so on.

Unless, of course, you don't get any volunteers.

Unless your work is very interesting to the general public and you can use a big PR machine, trying to use BOINC is a good way to waste a year and achieve nothing.

Re:DON'T use BOINC. by GameboyRMH · 2011-09-15 01:45 · Score: 1

Wouldn't it be possible to set up a local BOINC cluster connected directly to the BOINC job server to do the bulk of the work, and then optionally allow outside volunteers to help? Then you could take a "help or GTFO" attitude to volunteers. Worst case scenario, you're down to your local cluster (which could include all the desktop PCs in the labs as some have suggested.)

--
"When information is power, privacy is freedom" - Jah-Wren Ryel
Re:DON'T use BOINC. by Jane+Q.+Public · 2011-09-16 08:46 · Score: 1

I hadn't considered the time aspect. Of course, if there isn't plenty of time then BOINC would not be a very good idea.

But I disagree with the "unless it's very interesting to the general public" part. Protein folding is not very interesting to the "general public", but it has been a great success on BOINC.

Hadoop by spxZA · 2011-09-14 23:01 · Score: 1

From http://hadoop.apache.org/ The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.

In the US, we have Teragrid/XSEDE by Orp · 2011-09-14 23:03 · Score: 1

Does the UK/Europe have federally funded, shared computational resources for researchers? In the US we have what used to be called the Teragrid (now XSEDE) which is a network of supercomputers that are available for researchers. You have to write a proposal for machine time, but they're not all that difficult to get. The main disadvantage is that you have to submit your jobs via a queueing system, so your jobs usually don't start right away (having your own hardware does have its advantages) but the big shared resources have their advantages - you don't have to worry about maintenance, they usually have reliable archival resources, and every X years they usually replace the hardware with something faster.

--
A squid eating dough in a polyethylene bag is fast and bulbous, got me?

Re:In the US, we have Teragrid/XSEDE by Anonymous Coward · 2011-09-15 00:48 · Score: 0

Yes. At many levels. At the European level, there is PRACE (formerly DEISA). At the UK level there is the National Grid Service (although the size of this is so small as to be almost insignificant it is still bigger than anything you are likely to build for £4000) which is effectively free (you have to write an application for time but it's not charged) and HECToR which is a large (although not by US standards) Cray system that anybody funded by the *right* funding agencies can apply for time on.

went with Dell by ImWithBrilliant · 2011-09-14 23:39 · Score: 1

I had to use Windows and Monte Carlo my sim "locally". I sat two Optiplex 790s w/ 3.4GHz i7-quad cores on a rack shelf for under $3000. Their small form factor is a sweet chassis (can't say that about their Precision desktop). I moved the boot disk into the optical drive's bay, and installed a sub-$190 3TB 3.5" drive internally. No power supply for a serious graphics card but native is good enough for my sim. Many-hour runs are 20-25% shorter than my older X9650 and E5540 so I'm happy.

--

Is it a rule, that there's an exception to every rule?

If you insist on your own hardware... by real-modo · 2011-09-14 23:46 · Score: 1

What Stoney says. Especially point 1.

If you're past point 1, then at the risk of starting a flame war:-

For a compute-bound problem that *can't go on GPUs* and needs fast turnaround, you'd be silly to use anything other than Sandy Bridge at the moment. Core i7 2600 if you can tolerate consumer grade (keep a spare) or the equivalent Xeon, or better. (See, I told you, flames! Look, guys, I like AMD, I want them to succeed, but benchmarks are benchmarks.) For the combination of raw speed, FLOPS/watt, and minimum idle power, Sandy Bridge beats everything else at present. If you're paying for the power, you'll maximise your total computation per pound with SNB. And be much more likely to have your overnight jobs finished and waiting for you when you come in in the morning.

Regarding second-hand stuff, even if you can get it for free, if it's older/smaller than Core 2 Quad Q6600 or the equivalent Xeon, pass on it. There aren't enough cpu instructions executed per second or per kilowatt-hour in anything older. Some of the older cases can be re-used, though.

Build a Tetragonal GB-lan linked 4-box i7 cluster by Anonymous Coward · 2011-09-14 23:53 · Score: 0

This is what I did:

- buy four headless boxes, Standard case, standard 500W PSU, Gigabyte GA-P67A-UD3P-B3, 500GB WD Green
- put an i7-2600K plus a good cooler (ARCTIC Freezer 13 Pro) on each mainboard
- add 3 PCI-E Intel NICS (Intel PRO/1000 GT) *per Box* (12 total)
- buy 3 x 2 x 1 (6 total) short CAT-5E GB-Lan Ethernet cables
- buy 4 long CAT-5E cables
- buy a conventional 8-port GB-lan switch

- use an old box for the server (contains some big disk shared to the nodes
and the programs running on the nodes)
- connect the four boxes via long cables to the 8-port switch
- install the boxes (by temporarily adding a videocard) w/Linux of your choice + ssh + OpenMPI + necessary libs
- connect the boxes with one-another (each contains 3 extra NICS!) w/short cables like the edges of a tetrahedron (6 edges)
- configure the networks between the boxes, each edge is another subnet (10.0.1.0 to 10.0.6.0)
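The wiring plan above (four boxes, one direct link per pair, one subnet per link) can be sketched as follows. This is only an illustration: the host names and the /24 netmask are assumptions, since the post gives only the subnet range 10.0.1.0 to 10.0.6.0.

```python
from itertools import combinations
import ipaddress

NODES = ["node1", "node2", "node3", "node4"]  # hypothetical host names


def tetrahedron_links(nodes):
    """Return one point-to-point link per pair of nodes, each on its own subnet.

    Each entry is (subnet, {host_a: address, host_b: address}).
    The /24 prefix length is an assumption, not something the post specifies.
    """
    links = []
    for k, (a, b) in enumerate(combinations(nodes, 2), start=1):
        net = ipaddress.ip_network(f"10.0.{k}.0/24")
        hosts = iter(net.hosts())
        links.append((str(net), {a: str(next(hosts)), b: str(next(hosts))}))
    return links


if __name__ == "__main__":
    for subnet, endpoints in tetrahedron_links(NODES):
        print(subnet, endpoints)
```

Four nodes give six pairs, so each box ends up on exactly three of the six subnets, one per extra NIC.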

If all works out, you'll have a 4 x 8 = 32 (by threads) process parallel (MPI) machine.
I chose these boards because of the high power throughput: an i7 o/c to 4.2GHz
running 24/7 at 8 x 100% load *does* require a stable mainboard (12 power phases).

Open-MPI will figure out your network config. Your aggregated bandwidth within
the cluster should be 4 x 2 x GB/sec (if full duplex).

Never ever include the onboard (to the switch) NIC into the MPI-allowed
interfaces. This won't work. The outer interface is disk-transfer and
c&c only.

Should be within budget.

Regards

rbo

don't rebuild the wheel by Anonymous Coward · 2011-09-15 00:04 · Score: 0

Don't rebuild the wheel. Using an existing cluster as a service or hosted offering is quickest, with guaranteed performance.

But as an option, I could suggest two products. If you have access to a large collection of desktops, i.e. the office machines, they could be scheduled at night to be computational nodes in either a Windows HPC cluster or an open source parallel computing cluster. If need be, virtualize the server. That is the only way to do it on the budget: beg and borrow resources and funding. In my environment I have over 1000 desktops.

Also, you will need a tech person to run it.
With a hosted solution, they look after the technical resources needed to create and manage the hardware.

Talk to this guy... by argStyopa · 2011-09-15 00:06 · Score: 1

http://linux.slashdot.org/story/11/09/13/2111210/Ask-Slashdot-Best-Use-For-a-New-Supercomputing-Cluster ...who posted that he was allegedly just getting the budget to build "...the largest x86_64-based supercomputer on the east coast of the U.S. (at least until someone takes the title away from us). It's spec'ed to start with 1200 dual-socket six-core servers..." but apparently has no idea what he's going to use it for.

If true, he'll have lots of cycles to sell for cheap and/or his organization is clearly not value-oriented so he'll probably sell time without much concern over price.

--
-Styopa

Budget by wirelesslayers · 2011-09-15 00:07 · Score: 1

I do not know about your research/project. But here at the university where I do my research, once you get "money for something" you must spend it on that. If the money is to buy a few computers for parallel programming, it must be used that way, and you cannot use the same money for cloud/grid "rent by the hour".

1 - is your program cpu intensive only?
2 - is your program memory intensive?
3 - how about disk usage?
4 - Are you guys "driving" a pre-made program/package/solution (like gaussian/columbus) or developing one ?
5 - How many people gonna use your cluster?

Best practices for any of those situations, in case you are using "boring linux", are:
1 - customized kernel for better performance
2 - allocation of resources using CGroups
3 - in case you are developing your own program, you could read about "intrinsic functions" and make your program faster.

How many commenters have BUILT a cluster!? by mprinkey · 2011-09-15 00:17 · Score: 1

OK, I won't be too hard on the discussions above, but I read enough to try to give some real help to the OP. I get that this is basically an embarrassingly parallel application. So, that means a gigabit network is fine. That also means that single core performance is the ONLY indicator of the speed of the application. That means investing in anything AMD is a mistake. The best bang for the buck is quad-core Sandy Bridge CPUs. 4000 pounds is about $6300. I can build a quad-core 2.8 GHz Sandy Bridge node (2GB/core in a desktop case) for under $400 each. Cables, Gbit switch, and 15-16 nodes (60-64 Sandy Bridge cores total) will fit in the budget without too much effort.
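The node count above can be checked with a few lines of arithmetic. This is a rough sketch: the budget and per-node price are the figures quoted in the post, while the amount set aside for the switch and cables is an assumption chosen only for illustration.

```python
def nodes_in_budget(budget, node_cost, overhead):
    """Whole nodes affordable after setting aside overhead (switch, cables)."""
    return int((budget - overhead) // node_cost)


budget_usd = 6300     # from the post: 4000 pounds is about $6300
node_cost_usd = 400   # upper bound per node quoted in the post
overhead_usd = 300    # assumption: Gbit switch plus cables

nodes = nodes_in_budget(budget_usd, node_cost_usd, overhead_usd)
cores = nodes * 4     # quad-core CPUs, one per node
print(nodes, "nodes,", cores, "cores")
```

At exactly $400 per node this yields 15 nodes; reaching 16 requires the per-node price to come in somewhat under that ceiling, which is consistent with the "under $400" wording.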

OK...so, it isn't ECC memory. And it isn't general purpose. And it isn't going to run most parallel applications worth a crap due to the gbit network, but the point of building a cluster is to design it to match the application. 64 Sandy Bridge cores will run rings around any Magny-Cours solution you can build for the same price.

Re:How many commenters have BUILT a cluster!? by WhitePanther5000 · 2011-09-15 03:29 · Score: 1

Mod parent up... Sandy Bridge can sustain 8 double precision FLOPS/cycle, as opposed to Magny-Cours 4 per cycle. Which means, if all you're doing is an embarrassingly parallel floating point problem, this 64 core Sandy Bridge cluster would get a theoretical peak of 712 GFLOPS (8 FPU's * 2.8 Ghz * 64 cores) as opposed to a 48-core Magny-Cours node with a peak of 403 GFLOPS (4 FPU's * 2.1 Ghz * 48 cores). But even a moderate amount of communication between nodes is going to greatly reduce your computational efficiency, so real world performance might be closer than you think between the 2 solutions, depending on your application.
Re:How many commenters have BUILT a cluster!? by mprinkey · 2011-09-15 07:07 · Score: 1

The difference is even bigger than you posted! You made a math error on the Sandy Bridge FLOP calculation:
64 Sandy Bridge Cores: 8 FLOPS/Hz * 2.8 GHz * 64 cores = 1433.6 GFLOPS
48 Magny Cours: 4 FLOPS/Hz * 2.1 GHz * 48 cores = 403.2 GFLOPS
So, Sandy Bridge is roughly 3.5 times faster than AMD.
And the original poster commented that the application was parceling out data sets and crunching on them independently, so the application is embarrassingly parallel. This design would be rubbish for any *real* parallel application, but I think it is optimal for OP's stated goal.
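For reference, the theoretical peak figures in this exchange come from multiplying FLOPS per cycle, clock speed, and core count. A small sketch using only the numbers quoted above (the function name is just for illustration):

```python
def peak_gflops(flops_per_cycle, clock_ghz, cores):
    """Theoretical peak in GFLOPS: FLOPS/cycle * GHz * number of cores."""
    return flops_per_cycle * clock_ghz * cores


sandy_bridge = peak_gflops(8, 2.8, 64)   # 64 Sandy Bridge cores
magny_cours = peak_gflops(4, 2.1, 48)    # 48 Magny-Cours cores

print(f"Sandy Bridge: {sandy_bridge:.1f} GFLOPS")
print(f"Magny-Cours:  {magny_cours:.1f} GFLOPS")
print(f"Ratio:        {sandy_bridge / magny_cours:.2f}x")
```

These are peak numbers only; as noted earlier in the thread, any communication between nodes will pull real-world throughput below them.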
Re:How many commenters have BUILT a cluster!? by WhitePanther5000 · 2011-09-16 01:10 · Score: 1

Oops, math fail, thanks for the correction! Looks like I used 4 FLOPS/Hz in my calculation even though I quoted 8.
Re:How many commenters have BUILT a cluster!? by Anonymous Coward · 2011-09-16 04:48 · Score: 0

Ummm..
"I get that this is basically an embarrassingly parallel application. So, that means a gigabit network is fine. That also means that single core performance is the ONLY indicator of the speed of the application. " When was single core clock speed the throughput determinant for 'embarrassingly parallel"?
Seriously, do you think about what you type or just throw together nifty sounding phrases, and end with the word "Intel" in the recommendation? Cost is a factor, so I come up with AMD cores is > Intel single thread output.

You could always try... by Anonymous Coward · 2011-09-15 00:43 · Score: 0

something like this

scavenge time from desktops by Anonymous Coward · 2011-09-15 00:48 · Score: 0

I'm assuming the poster is in a university environment. If so, see if you can use Condor (http://www.cs.wisc.edu/condor/) installed on the university's desktops and just scavenge CPU cycles from them when they are idle. Alternatively, I'm guessing you are also in the UK, in which case you can take advantage of the National Grid Service (NGS) and get compute time for free. See http://www.ngs.ac.uk/ for more information.

Check out ROCKS (rocksclusters.org) for a cluster by Anonymous Coward · 2011-09-15 01:18 · Score: 0

If you do go the buy-your-own-hardware route (or even if you use EC2?) look into using the ROCKS cluster suite (http://rocksclusters.org). Basically, you do a regular linux install on the head node, and everything else is set up so you just PXE boot all the compute nodes and it "just works."

I built a cluster with 36 nodes this way some years ago, and now it's rocking away with over 2100 cores.

Capitalize, punctuate, and space by Anonymous Coward · 2011-09-15 01:20 · Score: 1

ca 11keur

It took me three attempts to parse that before I realized you meant "ca. EUR 11k" ("circa eleven thousand euro.")

Wow ... by jon3k · 2011-09-15 01:57 · Score: 1

never thought I'd see the day where the general consensus was "just rent it!". this is slashdot, how can we not do this better, cheaper, faster and "free" than amazon's ec2?

Who isn't on a shoestring budget really? by BlueCoder · 2011-09-15 02:01 · Score: 1

Everyone wants to get the most for their money be it in good times or bad.

Team up with other departments or use a cloud by angel'o'sphere · 2011-09-15 02:08 · Score: 1

As others pointed out: use cloud first.
If you want your "own" grid, try to team up with other departments. Likely another one either did the same and can share resources with you, or there is demand and they are waiting for someone to set up a grid.

--
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.

HPC-Europa by jmak · 2011-09-15 02:08 · Score: 1

If your friend works in academia (from the currency mentioned, I assume you are from Europe), she can also try applying for access to the HPC-Europa infrastructure.

See http://www.hpc-europa.org/

Re:This... Perhaps by networkBoy · 2011-09-15 02:15 · Score: 1

And to this end, the compute task is CPU bound. Is it actually memory and thread bound or is it core CPU tick bound?
If it is memory bound, not thread heavy, and not a lot of floating point math I would suggest a cluster of Atom dual core ITX boards. 2 gig of ram per node, boot off a USB key and share all the data over GbE. That is likely to be the most active cores for the buck, but if there is a lot of floating point math or thread/context switches... ouch.

Another option is to go with a lot of slightly older used 1u Xeon/Opteron boxes. Likely can be had fairly cheap from a surplus dealer. This is used kit so no warranty, which may be an issue with the grant, but again will get more processing for the buck.

Final option:
Call up IBM/Dell/HP/Cray/SGI and ask if they would like to help by "selling" a small cluster for £4K. They might just do it, knowing that you are very likely to expand later by buying from the same vendor that you started with (less integration problems).

--
whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump

Ask the Beowulf mailing list by Anonymous Coward · 2011-09-15 02:32 · Score: 0

http://www.beowulf.org

This kind of question comes up periodically, and as all the posts indicate, there are a lot of options, but depending on the type of computing you're doing, the answers tend to clump into several general approaches: buy compute cycles, loosely coupled cluster using Ethernet (good for embarrassingly parallel), and tightly coupled using a high performance interconnect (Myrinet, InfiniBand, etc).

Then, there's decisions about diskless nodes or not, etc.

I've built a few small clusters at various scales (some years ago, I published a design for "how to build a cluster entirely from things you buy at Wal-Mart for $2000").

There is appeal in having your own cluster that YOU control and you don't have to negotiate or pay for changes in resources. It's there when you want it. You don't have to worry about someone else having your data in their hands, etc. OTOH, you have all the infrastructure (electricity, HVAC, maintenance).

http://www.clustermonkey.net has articles on building a cheap cluster using commodity components (a true Beowulf)

There's also the whole problem of computers getting faster. If you have a year's worth of computing to do on a computer you can buy today, it's possible that you could do no computing for 1/2 year, buy your computer then, and get twice as fast a computer, and still finish at the same time.
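The trade-off in the paragraph above can be written out as a tiny calculation. This is only a sketch of the hypothetical in the post (hardware bought half a year from now is assumed to be twice as fast); the function name and the speedup figure are illustrative, not a forecast.

```python
def finish_time(work_years_today, wait_years, speedup):
    """Years from now until the job is done if you wait `wait_years` and then
    buy hardware `speedup` times faster than what you could buy today."""
    return wait_years + work_years_today / speedup


buy_now = finish_time(1.0, wait_years=0.0, speedup=1.0)    # start today
wait_half = finish_time(1.0, wait_years=0.5, speedup=2.0)  # wait, buy 2x hardware

print(buy_now, wait_half)
```

Under that assumption both strategies finish after the same one year, which is the point being made: idle waiting is not necessarily lost time when hardware improves fast enough.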

For $6k, one can go out and buy a pretty powerful multicore/multiCPU desktop machine with all the mod cons. And it would probably be faster than any sort of 4-8 node cluster you build for the same price.
One reason to build your own small low performance cluster is if you are doing a proof of concept for something that will eventually be scaled up massively. That 6 node cluster will provide a platform to develop your parallel code and give you a chance to make actual measurements on what the various performance limits are (is it interconnect? Is it memory bandwidth?) so that when you scale up to 1000 nodes (whether rented, bought, etc.) you can be an intelligent and informed consumer. You'll have programs to run as benchmarks that are "your real problem" as opposed to trying to figure out if some sort of SPECmark or Dhrystone number relates. If you ARE buying a 1000 node cluster, the cluster vendor(s) will typically have some machines you could run your code on to try it out.

However, be aware that rolling your own cluster is not a "plug and play" 1 hour setup kind of thing. You're going to spend a week getting it all going the first time, and then several months fooling with it to get your tools and configurations the way you like it, etc. It's a heck of a lot of fun, if you're into that sort of thing, and it is substantial geek-cred to say "oh yeah, I have a cluster supercomputer in my garage, but I don't use it any more, because ..."

Re:This... Perhaps by mlush · 2011-09-15 02:34 · Score: 1

Had second thoughts about £4000 lasting long enough. If they can find a service that costs ~15p/hour (and it seems reasonable that they could) that would provide 24/7 computing power for a 3 year grant and better than that, any unused time/money could be kept and used on special occasions/rush jobs...
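The "3 year grant" estimate above follows from dividing the budget by the hourly rate. A quick sketch of that arithmetic, assuming a single instance running continuously at the ~15p/hour figure from the post:

```python
HOURS_PER_YEAR = 24 * 365


def years_of_compute(budget_gbp, price_per_hour_gbp):
    """How many years one always-on rented instance can run on the budget."""
    return budget_gbp / price_per_hour_gbp / HOURS_PER_YEAR


years = years_of_compute(4000, 0.15)
print(f"{years:.2f} years of 24/7 compute")
```

That works out to roughly three years for one instance; running N instances in parallel divides the duration by N, and data transfer or storage charges are not included here.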

The point about restrictions on how the grant is spent still stand though :-(

At that price, build a workstation by guruevi · 2011-09-15 02:53 · Score: 1

4000 pounds is like what, $6000? Not much you can get with that. Do you really want to build a cluster? As others said, you can rent a cluster, but HPC in science is hopelessly inefficient and comes in with large datasets (bandwidth is usually metered) and spits out even larger datasets (bandwidth is metered + online storage is metered) and you'll be paying through the nose because somebody (the researcher) doesn't know how to program correctly. First of all, I would definitely recommend you look around in your institution or in peer institutions whether or not there is already a cluster you can rent (or use for free). Most large institutions have a supercomputer and even smaller departments (like Physics, Astronomy and Imaging or Visual Sciences) have their own small cluster that is not 100% used. You may need to work within the framework of your local politics about that and make concessions as far as time allotments and constraints go but it's cheaper (or free).

If you want to go the DIY route, why don't you just buy a machine from Supermicro (their 3U's are both towers and rack mount) and fill it with a good amount of RAM (at least 2GB/core) and 2-4 processors (e.g. 6-core or 8-core AMD Opterons), a couple of 2TB hard drives (mirror) and you're pretty much through your budget, especially if you want to throw in an nVidia Tesla card. If you want to, you could use virtual machines on that system with Xen or VirtualBox (whichever fits better).

The advantages of this approach are:
- Much less maintenance even if you go the virtual machine route
- Much faster interprocess and device (storage etc.) communications - interprocess communications over gigabit kills performance unnecessarily on a small cluster. Larger clusters have InfiniBand.
- You can still expand it later with another machine like it and use cluster software then.
- Footprint is smaller. You can fit these machines in 3 or 4U and they come with a 1.5kW power supply (to support the GPU's). You can buy about 6 1U systems for roughly the same price but you've doubled your rack usage and the power supplies are together about 3 kW because now you've got to power the motherboards and peripherals of 6 devices.
- GPU computing cannot be done in cheap 1U devices. And even if you can (and spend a little more and get only 4 machines), only 1 fits, rarely 2. The 3U solutions fit 4 (sometimes 5) GPU's perfectly and are built (power supply, cooling) for it. Even if you don't do GPU computing right now, even MATLAB can offload certain functions to gaming GPU's. They have a little less memory than the Tesla's but they only cost $150-250 (compared to the $700-1200 for a Tesla (EDU discount))
- No need to maintain cluster software (and it can be a pain in the neck)
- It fits under a desk and doesn't take up rack space if you don't want it to. No need to pay for hosting or cooling, no extra noise.

--
Custom electronics and digital signage for your business: www.evcircuits.com

it's cheap to do by roman_mir · 2011-09-15 03:00 · Score: 1

cheap to do with existing botnet systems.

--
You can't handle the truth.

free is always cheapest by cryptozoologist · 2011-09-15 03:28 · Score: 1

depending on the nature of your research group (academic, government, military, private...) you may well be able to have cluster time free for the asking. from my experience you may have to 'apply for a grant' which is really filling out a form. the cluster i have access to has many nodes with 64 gig of ram and all nodes are stuffed with gpus as well. it never makes sense to spend money on computer equipment that you will spend a year or more learning to use. do as much development on borrowed equipment as you can and when you have working software already implemented, buy hardware as needed. good luck!

Raspberry Pi by cybrpnk2 · 2011-09-15 03:50 · Score: 1

In November the $25 Raspberry Pi computer will become available. Check it out.

Why not to ask help from your mates? by Anonymous Coward · 2011-09-15 04:22 · Score: 0

Another option is to use instant messenger based distributed computing solutions.
http://ulno.net/f2f/

NFS? I wouldn't suggest it by Anonymous Coward · 2011-09-15 04:26 · Score: 0

It will work fine for a few CPUs. But if you get hundreds writing to NFS at the same time, expect bad results. Many simple IO strategies do just that.

Remember, a supercomputer is a device for changing a CPU bound process into an IO bound process. Make sure your IO can handle it.

torque by whitroth · 2011-09-15 04:30 · Score: 1

For our HPC clusters, we run torque on Linux (CentOS), which is descended, I believe, from beowulf. No scaling problems at all. Get servers with the most cores you can afford, put this on, and away you go.

I will note that the code has to be aware of parallelism, and fork.
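To illustrate what "aware of parallelism" can mean for independent data sets, here is a minimal sketch using Python's standard multiprocessing module. It is not TORQUE-specific: `crunch` is a hypothetical stand-in for the real computation, and under a batch scheduler each data set could equally be submitted as its own job.

```python
from multiprocessing import Pool


def crunch(dataset):
    """Stand-in for the real per-dataset computation (hypothetical)."""
    return sum(x * x for x in dataset)


def run_all(datasets, workers=1):
    """Process independent data sets; use worker processes if workers > 1."""
    if workers == 1:
        return [crunch(d) for d in datasets]
    with Pool(processes=workers) as pool:
        # Each data set is handed to a worker; results come back in order.
        return pool.map(crunch, datasets)


if __name__ == "__main__":
    data = [range(10), range(100), range(1000)]
    print(run_all(data, workers=2))
```

Because the data sets do not depend on each other, the serial and parallel paths return the same results; only the wall-clock time differs.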

mark

Beowulf vs. GPU/DSP vs. Amazon vs. Uni facilities by billstewart · 2011-09-15 04:38 · Score: 1

I'm assuming this is at a university - are there other facilities available already?

How long will the CPU-burning requirements last? Does it make sense to buy hardware, or to rent time on Amazon's cloud? Is it worth spending a month of programmer time to port to GPU/DSP if it saves you three months of computation? Have you done any models on what you need?

When you say "CPU-bound", what do you mean? Is it fixed-point or floating-point? What precision? Is it large-memory or small-memory? Is it a standard problem space, like image processing or cryptography? For some problems, e.g. small memory fixed-point, you can buy DSP boards that will be several orders of magnitude faster than generic PC hardware, and won't require much application porting.

Do you have a spare grad student to do hardware/sysadmin grunt work? For 4000 pounds, you can probably buy about 40 sets of motherboard+power supply, if you have a grunt to build boxes for them, or about 20 sets of pre-built desktop PCs, or about 4 high-end Dell rack servers.

--

Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks

I second Beowulf by ILongForDarkness · 2011-09-15 04:52 · Score: 1

Look up something like Condor. It will let you use the spare cycles kicking around in the workstations in your friend's lab and other labs that are willing to install the client. It can be setup to only run when the computer is idle, to run on her labs computers first before using others etc. Also once it is setup any lab in the building/network/world could use it provided that the admin approved it.

Open source clustering from UCSD by Anonymous Coward · 2011-09-15 05:37 · Score: 0

An NSF grant has been used to develop open source heterogeneous computer cluster software at the University of California San Diego:

http://www.rocksclusters.org/wordpress/

raspberrypi? by Anonymous Coward · 2011-09-15 05:39 · Score: 0

at $25 per 'box' you'd get around 216 of them, which is more than a fair start

It depends on the kind of simulation she's running by Anonymous Coward · 2011-09-15 06:01 · Score: 0

Is this an MPP problem or SMP problem?

Does she have access to high speed network equipment or will she have to buy everything on her own?

Is this an embarrassingly parallel problem?

Does the problem or the math parallelize well/easily?

Does the problem or the math SCALE well/easily with more CPUs?

Does each partition/portion of the problem run independently of the next (very little inter-core communication)?

Is she writing with OpenMP or MPI?

Is latency and synchronization a problem?

For about 4000 GBP, she can probably build a dual Xeon X5690 system with 64 GB of RAM and two 240 GB SSDs.

Does the simulation generate a LOT of data?

Does it make sense for her to use CUDA or OpenCL (GPU accelerated computing)?

What language is she writing in?

Teraflop for under £4,000 by Scytheon3 · 2011-09-15 07:52 · Score: 1

Here is something I may do one day: 10 nodes, each with an Intel Core i7 2600K (around 100 gflops). If you network boot and have headless nodes, then each can easily be built (with 8 GB ram) for £400. So in conclusion: approximately 1000 gflops (more if you overclock) for at most £4000 (that's including VAT).

Ask slashdot by ThurstonMoore · 2011-09-15 08:47 · Score: 3, Funny

Just get with the guy who did the Ask Slashdot the other day that didn't know what to do with his supercomputer.

Re:Ask slashdot by Anonymous Coward · 2011-09-15 15:38 · Score: 0

That's why I like Slashdot. "My boss has asked me to spend roughly two million dollars to upgrade our machine room. I have never done this before, so do you slashdotters have any special advice?" Right.

no GPU? by all_aspects · 2011-09-15 09:53 · Score: 1

Are you sure? Might want to take a look at this.

SETI@home ? by koan · 2011-09-15 11:25 · Score: 1

Instead of hardware why not program software for distributed computing (like SETI@home) then you need only write the software and get people to install it on their systems. (yes I said only)
Perhaps set up the software so that other research groups could utilize the spare processing cycles thus getting a huge distributed processing set up for use by multiple Universities with different research needs brought together by a common processing software.
Make it modular; not all research has the same needs.

With the correct media spin it could happen, and I have always wondered why schools don't use their lab computers for distributed computing at night, like I did by installing Vue Infinite render nodes on all the math lab computers.

No one ever noticed they were running.

--
"If any question why we died, Tell them because our fathers lied."

Simple Recap, Not Rocket Science by Anonymous Coward · 2011-09-15 13:08 · Score: 0

Seems the thread has it all pretty well nailed,

- if this is a 'typical' academic environment, it means you probably have (effectively) free electricity, cooling, and (more or less) labour (undergrad/grad students; and the row of PCs can be stuck under a bench in a spare office or something similar).
- In such a scenario, it is hard to beat COTS boxes - I would probably lean towards a standard consumer-grade 6-core Phenom CPU with 16 gigs of RAM per node to get the best bang for your buck (and commodity gig-ether interconnect). Such a system is well under $1k Cdn, or even less depending on how much (little) disk you need to toss into the machines; so you will end up with half a dozen boxes and 36 cores with your budget (maybe more)
- Just throw on a standard tin-opener HPC platform (Pelican HPC, ROCKS, ... etc ...) and you are good to go.

If you are in an environment where power:cooling:labour factor in significantly, clearly the economics of operation costs become bigger quickly.
Certainly if you have access to free HPC cycles, that is a great option as well; so long as the effort involved in getting your data to the presumably-remote-site; then queuing jobs; and retrieving output .. is not prohibitive.

Certainly fewer SMP boxes (dual 12-core Opteron or similar) have significantly more versatility for jobs which are SMP rather than serial:parallel in their nature. But the initial question posted seems quite clear on this.

Cloud based services (Amazon, etc) tend to be a bit overpriced when compared to environments that have heavily subsidized cost structures (ie, often, academia for example) - "The Cloud" often makes 'the best sense' for 'peaky' workloads (think: Xmas sales spikes) rather than "CPU Saturating behemoth job batches"; and are priced to be competitive with "enterprise data centre hosting services".

Like most projects, the requirements are well known and won't change. Until they do. :-)

--Mr.Tim

Rocks in your head... by Anonymous Coward · 2011-09-16 07:40 · Score: 0

Parallel implies a parallel programming environment.

First look at Rocks and MPI. Next look
at Hadoop. Start with four hefty multi core desktops
that folk also work on. Boxes where high end GFX
cards would be cool and happy. Explore programming languages
and quality compilers. A good compiler can reduce
hardware costs. I have seen compilers improve results
by 70%.... in some cases. Language choice can matter.

Build check point and restart into your application design.
Watch data storage and distribution. I/O can keep many
processors idle waiting for bits to chew on.
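As a rough illustration of building checkpoint and restart into the application design, the sketch below saves progress after every work item so a rerun resumes where the previous run stopped. The function, the JSON state layout, and the `stop_after` knob (used only to simulate an interruption) are all hypothetical.

```python
import json
import os


def run_with_checkpoint(items, path, stop_after=None):
    """Sum items one at a time, saving progress so a rerun can resume.

    Returns the total when finished, or None if interrupted early.
    """
    state = {"next": 0, "total": 0}
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)  # resume from the last saved position

    done_this_run = 0
    while state["next"] < len(items):
        if stop_after is not None and done_this_run >= stop_after:
            return None  # simulated crash before finishing
        state["total"] += items[state["next"]]
        state["next"] += 1
        done_this_run += 1

        # Write to a temp file, then rename, so the checkpoint on disk
        # is always either the old complete state or the new one.
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, path)
    return state["total"]
```

Checkpointing after every item is the simplest choice; for real workloads the interval would normally be tuned so that the I/O cost of saving state stays small compared to the computation.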

Well a research group should do some. by niftymitch · 2011-09-16 08:08 · Score: 1

Well a research group should do some research.

Security and revision control are important.

When starting a research group one important
and largish investment is the desktops and local
storage to manage the code and the data.

A startup should start with dual purpose resources
when possible. Code design should begin with
some notion of progress and checkpoint and restart.
Building reliable infrastructure is a royal PITA.

The desktop tools and cluster tools should play well
together.

Do research the various cloud resources. Optimum
use of cloud resources can depend on the smallest
initial design decisions.

As always read Jon Louis Bentley's "Programming Pearls"

--
Truth is stranger than fiction, but it is because Fiction is obliged to stick to possibilities; Truth isn't. Mark Twain.

Sony PS3 cluster by Anonymous Coward · 2011-09-17 09:22 · Score: 0

Has someone already mentioned creating a cluster out of Sony PS3s? There are some universities and research labs in the US that make use of a cluster of Sony PS3 machines, and they do serious research with that.

264 comments