Ask Slashdot: Clusters On the Cheap?
First time accepted submitter serviscope_minor writes "A friend of mine has recently started a research group. As usual with these things, she is on a shoestring budget and has computational demands. The computational task is very parallel (but implementing it on GPUs is an open research problem and not the topic of research), and very CPU bound. Can slashdotters advise on a practical way of getting really high bang for buck? The budget is about £4000 (excluding VAT/sales tax), though it is likely that the system will be expanded later. The computers will probably end up running a boring Linux distro and Sun GridEngine to manage batch processing (with home directories shared over NFS)."
Why waste money on building a cluster when you can rent the best in the world * by the hour * ?
I've seen quite a few projects where people have stacked motherboards with spacers, using booting over Ethernet and a single power supply for multiple MBs. Google should be of use here, I'm trying to get my offspring to school so I'm cheating and not providing any links...
But the idea is that skipping the case and other components makes things cheaper. Leaving the rig exposed without a case also eliminates the need for most cooling.
.: Max Romantschuk
Actually, that's a good question... Assuming no time constraints, at what point does it make sense to buy hardware rather than use the cloud? Take that budget above (roughly US$6K) and the best hardware you can get for that price: How many months would you need to run it, flat out, to equal the number of floating-point ops EC2 would give you for that cost?
Many universities/consortia have supercomputers available on which researchers can apply for (or buy) time. For example, my university is a member of VPAC, which has a big-arse cluster shared between a number of institutions. She might get much better bang for buck if she uses the money for that, rather than splashing out for dedicated hardware.
Any sufficiently advanced technology is indistinguishable from a rigged demo
--Andy Finkel (J. Klass?)
Imagine a cluster of cheapness!
Table-ized A.I.
Why buy your own when you can use existing GRID infrastructure? For 4k you can't do much more than get a few decent desktops for yourself and a few grad students and/or postdocs. Rather than blow it on a massively underpowered cluster, use the grid. I know the UK has massive clusters available to researchers so find out how to get an account and resources on them and use those. For test jobs, interactive analysis and other low latency tasks use your desktop.
You can get a SuperMicro reseller to sell you one workstation with 4 sockets of CPUs and a bunch of RAM. £4,000 = US$6,299.20.
That buys you a box with 4 x Opteron 6134 (32 cores) and 128GB RAM (32 x 4GB sticks). And some hard disks.
A near-example of what Max is talking about can be found at the Home Linux Render Cluster. The builder threw six dual cpu motherboards into a small, gutted filing cabinet and Gig-e. Cheap, expandable.
However, if your friend hasn't got a very good idea how much mmmph she needs, the AWS EC2 rental idea has merit.
Luke, help me take this mask off
You can't really get a cluster for that kind of money. You can barely get one decent box.
But you should be able to rent a lot of computer time in the cloud for that kind of money, or use it to buy time on someone else's cluster.
I do not fail; I succeed at finding out what does not work.
OP hasn't mentioned a lot except budget. Since you are on such a tight budget, I would highly recommend doing some theoretical analysis first. Do you have a serial code? How much parallelism exists in the code? You say the task is 'very parallel', but Amdahl's law (which is really common sense) will tell you that even for small amounts of serial sections of code, your speedup will be limited. You should also consider the amount of time the code actually runs. Achieving a speedup of 2 for a serial code that runs for one minute is near worthless.
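Amdahl's law is easy to tabulate before spending anything. A minimal sketch, assuming the serial fraction of the run time has already been measured or estimated; the function name and the sample fractions are illustrative, not from the original post:

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `serial_fraction` of the run time cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # The serial part runs at full cost; only the remainder is divided among workers.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


if __name__ == "__main__":
    # Sample serial fractions are hypothetical; substitute a measured value.
    for s in (0.0, 0.01, 0.05, 0.10):
        row = ", ".join(f"{n} cores: {amdahl_speedup(s, n):.1f}x" for n in (8, 32, 128))
        print(f"serial {s:.0%} -> {row}")
```

A 5% serial portion caps the speedup at 20x no matter how many cores are added, which is worth knowing before sizing a cluster.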
After you estimate speedup, do some rough calculations on the basis of average cost of a processor and the number of processors required. This should give you an estimate of the hardware cost required. Compare that with the cost of CPU cycles per dollar you get using a cloud service such as Amazon.
$1.60 / hour for the largest non-GPU cluster instance. This also provides you with rather fast interconnects and scalability with multiple instances.
Only £4,000 in hardware would be a waste of money. You wouldn't have all that much computing power, and it would be obsolete immediately.
Study the design of the "microwulf" and its relatives. Considering that hardware prices have dropped since 2009, your task might be achievable.
Take the cheese to sickbay, the doctor should see it as soon as possible - B'Elanna Torres, "Learning Curve"
HP? Is that you?
Assuming a 1.5 to 1 correspondence with the USD, you're either getting a decent cpu box and no storage, or a reasonable amount of storage and no CPU. I build/run supercomputing clusters for molecular dynamics simulations at a university in upstate New York, and I wouldn't even consider attempting a cluster for less than $25,000.
Since the OP didn't specify if this was massively parallel or not, I'm going to assume this is so I can use AMD chips for cheapness.
First off, storage. Computational output adds up quick. You're looking at $7,000 USD for 24TB raw storage from the likes of IBM or HP or Dell. Yes, you can whitebox it for cheaper, but considering if you lose this box, nothing else matters (And I doubt you have the funds for proper backups), it pays to get hardware that's been tested and is from a vendor you can scream at when it breaks.
Second, interconnect. A cheap netgear will work, but reasonable internode communication is not cheap, especially if moving largish amounts of data. This could run $1000 to $3000
Finally, the compute hardware itself. A decent node will run $3000 to $5000 depending on the core count, socket count, GHz, and to a lesser extent RAM.
Assuming you want 128 cores, you're looking at 8 machines for compute ($32,000 right there assuming $4K/node, and dual 8 core chips), plus another $7K for the file server/landing pad, and finally add $1500 for a decent switch that can let those nodes talk to each other at line speed and allow room for future growth. Total cost: $40,500 USD or 27,000 pounds assuming the 1 pound:1.5 USD ratio.
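The estimate above can be reproduced in a few lines, which makes it easy to try other node counts or prices. A rough sketch using only the figures quoted in this comment; the function and variable names are illustrative:

```python
USD_PER_GBP = 1.5  # exchange rate assumed by the commenter


def cluster_cost_usd(nodes, node_cost_usd, storage_usd, switch_usd):
    """Total hardware cost: compute nodes plus file server plus switch."""
    return nodes * node_cost_usd + storage_usd + switch_usd


if __name__ == "__main__":
    nodes = 8
    cores_per_node = 16  # dual 8-core chips per node
    total_usd = cluster_cost_usd(nodes, 4000, 7000, 1500)
    print(f"{nodes * cores_per_node} cores, ${total_usd:,} = £{total_usd / USD_PER_GBP:,.0f}")
```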
Try a computer recycling centre, most tend to be short on storage and are happy to sell a large number of desktop machines at a lower than normal price per unit. Community operated ones tend to be more helpful than business ones though.
A game has objectives and is competitive, anything else is just play
Buy a small chunk of something that looks like the big machines she will be using. As others have said, with that little money you aren't going to get legitimate computational resources. But she will certainly qualify - or already has - on some of the larger public machines. In my experience, it is really nice to have a small, i.e. two or three nodes, cluster to test and benchmark code. You can look at things like parallel performance on a single node versus across nodes, whether the code plays well with shared memory, whether it can reasonably mix shared and non-shared parallelization schemes, and so forth.
46 & 2
$35/element, runs a boring Linux distro, runs very cool, low power consumption (less than 1w), onboard Ethernet.
Sorted!
Raspberry Pi
She could also consider creating a BOINC project. She could then do some publicity locally and on forums, to get people to choose her project. I've never tried creating a BOINC project, so I don't know how hard this is. However, I do run the client as a background task, and I imagine many other people do as well.
Enjoy life! This is not a dress rehearsal.
Spend the money on a programmer to parallelize the algorithm on standard CPUs, and put it out on BOINC. People volunteer their spare cycles for BOINC projects that are barely more interesting than the chemistry of aardvark snot. She would likely get volunteers if there's anything of even passing interest in her research.
If your friend doesn't want to do a lot of engineering work, then for this price I would just buy 10 or so PCs (depending on memory/CPU tradeoffs) from wherever has a special offer, plus a gigabit switch and put them on shelves. If you need a lot of memory, or can usefully share memory then that would be a bit different, but you can buy a usable headless PC for £300-£400. This will also not be terribly power efficient, nor will components like motherboards be of the highest quality, but you get more bang for the buck that way than almost anything else except second-hand. At the other extreme, you could probably buy a single 24-core AMD box for the money with quite a lot of RAM and just run a lot of processes on it.
Talking of second-hand, the other thing to do is to see if anyone has a cluster they can't feed (i.e. power) any more. Our applied maths dept is about to shut down a 3 year old 1000-core cluster because they can't afford the power to run it and their newer 2000 core cluster. A slice of that would be great and someone locally might be able to help you in a similar way.
Microwulf.
"Tongue tied and twisted, just an Earth bound misfit
And how much space and air conditioning do you have? Depending on the answers to these questions, the optimal* solution might be 'get a bunch of 5 year old computers nearly for free.'
* Optimal for your friend, not for her university.
Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
I was in a similar situation setting up a research group. Wanted an expandable setup for a research group, that would meet approval of local IT sysadmins (some remote management opts, vendor support). For 2.6K pounds a pop I got a Dell PowerEdge T410 server with 2 6-core CPUs and 24GB RAM. I'm never one to push a Dell (been purchasing IBM/HP for years) but this is a decent machine for a decent price. I tried various cloud solutions using virtual machines on Amazon and similar frameworks, but for the kind of work we do (frequent software updates, massive amounts of data that need to be stored locally and can't be transferred easily), those don't scale. We use Condor as a job submission engine. Not that we don't like SGE but with Oracle's plans (http://en.wikipedia.org/wiki/Oracle_Grid_Engine) one can never tell. PS: remind your friend to invest in a QNAP NAS or similar for backups / disaster recovery.
I don't think any of the posters recommending EC2 have ever looked at the economics of EC2 versus self-hosting.
If you have long-term compute needs (as opposed to needing to throw lots of cores at a problem to get fast results in a short time), you're better off buying a Dell.
An EC2 Quadruple Extra Large instance is $1.60/hour. You have around $6500 USD, so you could buy 169 days of computer time at EC2 (ignoring the cost of I/O and network bandwidth).
This instance has 23GB of RAM and is equivalent to 2 x Intel Xeon X5570 CPU's.
For around $5000, you could buy a Dell R710 with dual X5647's, 32GB RAM, RAID-1 1TB SATA drives (depending on your storage needs, you might want to move to faster SAS disks). As long as you have a suitable office to host the server, your only recurring hosting cost is electricity (around $70/month) and maybe you'll need to spend $500 on a UPS. If you need to pay for hosting/colocation somewhere, that will definitely change the economics.
So, with your budget, you get one node + UPS + electricity for a year. All for the price of around 5 months of EC2 time.
You come out ahead even if you want to throw away the server every 6 months and start fresh.
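The comparison above comes down to a few lines of arithmetic. A minimal sketch using only the prices quoted in this comment; the function names are illustrative, and I/O, bandwidth and admin time are ignored, as in the text:

```python
def rental_days(budget_usd, hourly_rate_usd):
    """Days of continuous instance time a budget buys at a flat hourly rate."""
    return budget_usd / hourly_rate_usd / 24


def first_year_owned_cost(server_usd, ups_usd, monthly_power_usd, months=12):
    """Purchase price plus UPS plus electricity over the given number of months."""
    return server_usd + ups_usd + monthly_power_usd * months


if __name__ == "__main__":
    days = rental_days(6500, 1.60)
    owned = first_year_owned_cost(5000, 500, 70)
    print(f"EC2: about {days:.0f} days of one instance for $6500")
    print(f"Owned: ${owned} for the server, UPS and a year of power")
```

With these figures the budget rents one instance for about 169 days, while $6,340 covers the server, UPS and twelve months of electricity. Paid hosting or sysadmin time would shift the balance.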
You can save a few bucks by building your own (or going to a custom whitebox builder), but the Dell comes with 3 years of next business day support. Last time I priced out a whitebox builder, they beat Dell's best discounted price by about 10% and only offered a 1 year warranty.
EC2 is really expensive, brah.
Once you count all the costs of running your own cluster, i.e. electricity, cooling, man-hours spent on configuring, installing and maintaining them, repairing broken parts etc., suddenly EC2 is likely cheaper than your own cluster, not to mention you can scale up on demand if your needs suddenly grow.
Mums are nice.
If you are extremely data heavy, the cloud becomes quickly much more expensive than buying your own. The Broad Institute made some recent experiments on Amazon analyzing genome experiments, and they said Amazon was 4x more expensive.
But for her cpu-heavy workloads the cloud would work perfectly.
more things to consider:
If she buys her own hardware there are a lot of extra costs to the raw hardware:
1 someone needs to set the thing up, administer it, and support it with patches, etc. even for a small cluster this is a good percentage of a person. if she has a slave student doing that, great! although if the guy leaves there will be an issue. if she needs to spend money on a person, then amazon will be much, much cheaper.
2 there is an electricity bill and you need space, probably cooled space. if it is available, great! otherwise it can be a showstopper (fire hazard in a lab)
These costs are included when people talk about total cost of ownership. If you factor them in, the cloud suddenly becomes veeery interesting. Btw, EC2 is not the only player on the market now, there is Azure and also the IBM SmartCloud, with competitive pricing.
For normal cluster computing you can go to a number of startups that will build an on-demand cluster for you based on amazon, my favorite in the research domain is cloudbroker - http://www.cloudbroker.com/ - who will actually render the software you need into a SaaS based on EC2 or the IBM cloud. you just launch your hpc cloud app, with your uploaded data and you pay for the workload you did. the bill includes just amazon or ibm costs, licenses if any and a surcharge by cloudbroker which is totally worth the money because now you do not even need to set up the software and the virtual machines anymore.
Somebody upthread mentioned BOINC, which is a great idea for many parallel-oriented compute-bound problems. However, while making your project compatible with BOINC is necessary, it's usually not sufficient. The problem is marketing, to convince enough people to run your work. World Community Grid, sponsored by IBM, is free and is an excellent way to solve that problem. You can submit a proposal, and if approved you'll quickly have lots of BOINC-powered computing working on your problem.
At least you'll be running on the bare metal, not some virtualized piece of cloud. http://www.hector.ac.uk/
Every end has half a stick.
Assuming the latter, check among Supermicro & Dell servers. Last time I needed to set up a cluster, the Dell R610s were a good pick, giving great manageability over the LAN, low volume and decent features (balanced storage space along with cpu capacity, around 8 cores + 8 TBs per 1u blade). Don't rule out also options like Shuttle XPCs, they are damned robust in thermal aspects (hey, you'll be running these continuously, won't you?). Finally, don't underestimate the need for local sysadmining; you will likely need to set up a queueing system (Torque, *PBS*, SLURM, SGE, LSF, NQS, Condor) and manage the whole thing. This won't happen automatically, take a note on that. If you run something of the pbs or sge family I can happily help with setting up a tool called qtop
Is the research group in a university? Most universities have a lot of computing power that sits idle for large proportions of the time in their undergraduate computing laboratories. There's a significant resource that could be exploited simply by deploying jobs to idle machines.
Sent from my Tianhe-2 (MilkyWay-2).
In the UK there are academic grids that research groups can use like the ngs or gridpp (for free or next to nothing) http://ngs.ac.uk/use-ngs I used to work at the Center for Parallel Computing where I am sure some people would talk to her. http://www.cpc.wmin.ac.uk/cpcsite/index.php/Main_Page
give some thought to data security as well please. If the research done is sensitive, don't use clouds.
On the other hand, costs of self-hosting are indeed underestimated very quickly, which is not a good thing when budget is low.
Also, while manycore machines seem cost-effective, look at the solutions you are using for computation; it is hard to press a 48-core machine to peak performance, much harder than driving more standard distributed-memory supercomputers. But this depends on your application.
Buying time on an existing cluster (local university, or a dedicated HPC company) seems the surest way, and also reasonably secure when done at a trusted institute or company.
Look up the MicroWulf project. Pretty much exactly the thing you want if you really really really do need to build it yourself.
Self-building clusters needs some careful planning before you start:
1) Don't do it if you don't have to. Lots of resources around that you can use, especially if you're in academia.
2) Find/read benchmarks for the key components to your research (CPU in this case) and then design your cluster modules around it. Keep in mind your price/performance ratio for each entire module, not just the core component. Fewer faster modules is usually the best way to go.
3) Cut out as much hardware as you can but don't skimp on important specs. Integrated NICs on motherboards are good but check they're PCI-Ex connected instead of vanilla PCI if you want the best transfer rates with lowest latency. If you don't need lots of data lying around on each module, ditch local hard drives - dual purpose a module as a data server, maybe a Netboot server if you fancy it, or consider booting from cheap USB keys.
4) Don't forget to factor in the price of network switches, Cat-6, power leads, etc - it can mount up pretty quickly!
First, see what's available. Many departments have computing options available that depend upon scheduling and departmental budgets.
If that doesn't work out, what are you doing? Serial processing or based upon existing software, then 'contract it out' (if that's an option). It's easier, and probably cheaper, especially in the early stages.
Parallel processing though needs more serious consideration. Cheap SIMD was being offloaded onto GPU's the last time I looked (which, admittedly, is probably too long ago). So it may be best to look to 'homebrew' configurations in that case.
The university has a few server rooms scattered around of various qualities, though £4k's worth of kit could probably be scattered around a bit if necessary. The department in question does not have much history of heavy computational demands.
Though it's interesting what you mention about a colo. I had a look around at colo's on the open market and they're very expensive, compared to the budget. Oddly, the density seemed lower than expected. These days, modern high density servers can easily reach 1kW/U, but the colos I looked at were generally charging assuming a few hundred W/U.
Pretty much anything I could find would eat half the budget or more on colo costs. Any suggestions there?
In terms of reliability, jobs are isolated and short and can easily be rerun, so it is not worth going for high reliability over spare parts. Also, the task is very heavily CPU bound.
I looked at the high density rack stuff (having relatively recently purchased some supermicro quad 6100's) which were placed in a proper facility. They're very nice and in large quantities, proper kit does certainly save enough on sysadmin time, space and electricity to be worth it.
SJW n. One who posts facts.
Hmm I wonder if the College has a datacenter already. You know, so ya dont have to worry about the overhead.
Ohh and man hours to configure the thing? You know I'm sure there are a load of students there working for free.
This is NOT some company trying to spin up a new service. This is a school doing a project. That means shoe string budget, and the people get paid with good grades in their class.
4000 pounds wont get you very far at all on Amazon.
I love it when people take the question given and then insist that the problem can only be solved by changing the specifications to using "cloud" computing or some other nonsense. It makes you guys look like paid shills.
You are entitled to your own opinions, not your own facts.
We got a customized 8-core (2x E5504) box with 32 gigs of memory from Leaseweb for about 200 euros a month. I'd say it's a great deal. I think if you talk to them, they could get you a custom cluster too. Definitely cheaper than EC2 if you use it constantly.
I think an innovative solution, assuming you have a defined timeframe for this project, is the following: Set aside a small portion of your budget (£250-£500 - I'm guessing) and set up a BOINC server on an EC2 instance; http://boinc.berkeley.edu/trac/wiki/CloudServer . It will probably not cost as much as I have suggested above as it won't be *nearly* as intensive as actually doing the computing required for analysis of the experiments but you can use some of the budget to pay a server admin to set it up for you if you are not very confident. Although, I am certain, if you looked around any of the big communities involved in grid projects (overclock.net, evga, etc.), someone would be willing to assist you for free. Go around to the major forums posting a message in their grid computing projects asking for assistance and offering a £2000 prize or donation to a charity of their choosing for the group that completes the most work units over the project's life. This may sound like a lot of hard work, but these groups are fiercely competitive and are extremely willing to help to any cause and it will be not as difficult as it seems when reading this. At the very least, I can guarantee you about 20 users from a grid forum I am part of that will contribute - at £0 cost. Best of luck!
You have a limited budget, so it's more cost effective for you to lease time on someone else's equipment for now.
In a fair and logical world this would be true.
In an academic setting there can be problems with what you're allowed to spend the money on. If the £4000 is in the 'Computer Hardware' section of the grant, buying AWS could count as a service and have to come out of the 'Consumables' section of the grant.
The other important question that springs to mind is would £4000 last till the end of the grant. Knowing the typical research program I think it would be very hard to estimate this, and if they underestimate (or suddenly find they need to recompute the last year's work) they may end up without their processing power at the end of the grant when they really need to get something pretty to go in the next grant application.
If they buy hardware it may be more expensive, but they can absolutely rely on it being available till the end of the grant (assuming there is a 3 year warranty and the University has some sort of building contents insurance) and if they find they need more power they could still go to AWS.
I see a lot of people suggesting the use of cluster instances on AWS. At first blush this is what they are built for, but it's not a gimme that they are the most cost-efficient option. From the description, the job is not targeting GPU, and it's also not network-bound. Some of the high-cpu instances are more economical if you don't need the gobs of RAM or 10 Gigabit pipes. The cluster instances do have somewhat faster CPUs.
AWS offers a MapReduce layer that supports all of these instance types (http://aws.amazon.com/elasticmapreduce/).
Cluster xLarge (GPU) = $2.10 / hour = $0.26 / hour / core = $0.063 / hour / cpu unit
Cluster xLarge = $1.60 / hour = $0.20 / hour / core = $0.048 / hour / cpu unit
High CPU - medium = $0.17 / hour = $0.085 / hour / core = $0.034 / hour / cpu unit
High CPU - large = $0.68 / hour = $0.085 / hour / core = $0.034 / hour / cpu unit
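The per-core and per-unit figures above follow from dividing each hourly price by a core count and a compute-unit count. A sketch of that calculation: the hourly prices come from the list, but the core and compute-unit counts are assumptions chosen to match its ratios, so check them against current pricing before relying on them.

```python
# (hourly USD, cores, compute units). Hourly prices are from the list above;
# core and compute-unit counts are ASSUMED, picked to reproduce its ratios.
INSTANCES = {
    "Cluster xLarge (GPU)": (2.10, 8, 33.5),
    "Cluster xLarge": (1.60, 8, 33.5),
    "High CPU - medium": (0.17, 2, 5),
    "High CPU - large": (0.68, 8, 20),
}


def unit_costs(hourly_usd, cores, compute_units):
    """Return (USD per core-hour, USD per compute-unit-hour)."""
    return hourly_usd / cores, hourly_usd / compute_units


if __name__ == "__main__":
    # List the cheapest compute-unit-hour first.
    ranked = sorted(INSTANCES.items(), key=lambda kv: unit_costs(*kv[1])[1])
    for name, spec in ranked:
        per_core, per_unit = unit_costs(*spec)
        print(f"{name:22s} ${per_core:.3f}/core-hour  ${per_unit:.3f}/unit-hour")
```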
Throw in:
* Spot instances are discounted by over 50%. If your jobs can work on a range of instances, bid on a variety of cheap CPUs first.
* Reserved instances come out ahead after about 6 months of 24/7 usage, if you're going to use it that way.
All together, you could do something like this, with many possible variations. This gets you roughly 10 CPUs running 24/7 for 6 months, plus 3 hours a day of cluster compute time. And of course you don't pay for any time that you're not running so that could be reallocated.
5000 hours High CPU (medium) = $850 = 10,000 CPU hours
5000 hours High CPU (large) = $3400 = 40,000 CPU hours
250 hours spot instance (Cluster) = $150 = 2,000 CPU+ hours
250 hours spot instance (Cluster GPU) = $200 = 2,000 CPU+ hours
---
Roughly 54,000 CPU hours for $4600, leaving about $1700 for bandwidth, storage, or more compute time.
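Summing the line items is a quick sanity check on a plan like this. A minimal sketch: the hours and dollar amounts are the ones listed above, while the cores per instance (2 and 8) are inferred from the CPU-hour figures and the $6,300 budget is an assumed conversion of £4,000.

```python
BUDGET_USD = 6300  # ASSUMED: roughly £4,000 converted to dollars

# (label, instance hours, total cost in USD, cores per instance)
PLAN = [
    ("High CPU (medium)", 5000, 850, 2),
    ("High CPU (large)", 5000, 3400, 8),
    ("Spot, Cluster", 250, 150, 8),
    ("Spot, Cluster GPU", 250, 200, 8),
]


def plan_totals(plan):
    """Return (total cost in USD, total CPU hours) for a list of line items."""
    cost = sum(item_cost for _, _, item_cost, _ in plan)
    cpu_hours = sum(hours * cores for _, hours, _, cores in plan)
    return cost, cpu_hours


if __name__ == "__main__":
    cost, cpu_hours = plan_totals(PLAN)
    print(f"${cost} buys {cpu_hours:,} CPU hours; ${BUDGET_USD - cost} left over")
```

As listed, the four line items sum to $4,600 and 54,000 CPU hours.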
Point being, just like you can customize the heck out of a box to buy, you can carefully craft a cloud approach more efficiently than just buying cluster time. If you just throw it at GPU cluster boxes, you could get half the work done (or less)...
I think it's doable, especially with Android-based phones. Data rates might be an issue, but some things like SETI@home didn't have really high data rates; with Wifi-enabled phones this could be mitigated by only working when Wifi was active.
100's of millions of phones moving in and out of a global cluster.
I think I just had a nerd moment.
google "32 trillion offshore needs IRS attention"
Spend the money on a programmer to parallelize the algorithm on standard CPUs, and put it out on BOINC.
No, DON'T use BOINC. It will take six months to set up the project and publicise it, and a further year to learn how to deal with the volunteers so as to get work done reasonably reliably. Half or more of the work you send out will never come back. Newbs will bother you with questions about BOINC and the sixteen other projects they're 'donating time' to, not your project. (Virus scanners cause people grief.) RAC hunters will badger you incessantly about minor discrepancies in points awarded.
Running a project through BOINC and getting value out of it means committing a *lot* of time and effort just on the PR side of the project-- setting up and maintaining a website and the Boinc infrastructure, and doing regular news updates, answering emails, getting rid of spammers, and so on.
Unless, of course, you don't get any volunteers.
Unless your work is very interesting to the general public and you can use a big PR machine, trying to use BOINC is a good way to waste a year and achieve nothing.
From http://hadoop.apache.org/ The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
Does the UK/Europe have federally funded, shared computational resources for researchers? In the US we have what used to be called the Teragrid (now XSEDE) which is a network of supercomputers that are available for researchers. You have to write a proposal for machine time, but they're not all that difficult to get. The main disadvantage is that you have to submit your jobs via a queueing system, so your jobs usually don't start right away (having your own hardware does have its advantages) but the big shared resources have their advantages - you don't have to worry about maintenance, they usually have reliable archival resources, and every X years they usually replace the hardware with something faster.
A squid eating dough in a polyethylene bag is fast and bulbous, got me?
I had to use Windows and Monte Carlo my sim "locally". I sat two Optiplex 790s w/ 3.4GHz i7-quad cores on a rack shelf for under $3000. Their small form factor is a sweet chassis (can't say that about their Precision desktop). I moved the boot disk into the optical drive's bay, and installed a sub-$190 3TB 3.5" drive internally. No power supply for a serious graphics card but native is good enough for my sim. Many-hour runs are 20-25% shorter than my older X9650 and E5540 so I'm happy.
Is it a rule, that there's an exception to every rule?
What Stoney says. Especially point 1.
If you're past point 1, then at the risk of starting a flame war:-
For a compute-bound problem that *can't go on GPUs* and needs fast turnaround, you'd be silly to use anything other than Sandy Bridge at the moment. Core i7 2600 if you can tolerate consumer grade (keep a spare) or the equivalent Xeon, or better. (See, I told you, flames! Look, guys, I like AMD, I want them to succeed, but benchmarks are benchmarks.) For the combination of raw speed, FLOPS/watt, and minimum idle power, Sandy Bridge beats everything else at present. If you're paying for the power, you'll maximise your total computation per pound with SNB. And be much more likely to have your overnight jobs finished and waiting for you when you come in in the morning.
Regarding second-hand stuff, even if you can get it for free, if it's older/smaller than Core 2 Quad Q6600 or the equivalent Xeon, pass on it. There aren't enough cpu instructions executed per second or per kilowatt-hour in anything older. Some of the older cases can be re-used, though.
The problem is the constraints. The cheap cluster in my old department cost £100k. £4k does not buy you a lot of hardware. You will probably find a lot more lying around in the undergrad labs. For some of my work as a PhD student, that's exactly what I used - each lab had 40 machines on a GigE network and closed overnight, and for work that wasn't that latency sensitive, I could distribute it across the machines there and run it at night without anyone minding.
If you're serious about needing a cluster, then you need to spend a lot more than £4K. If you only need a cluster for a short time, then £4K can buy you a chunk of time on someone else's hardware. Since this is the UK, they should contact the Manchester Supercomputing Centre, which provides this kind of service to UK universities at quite a reasonable price (and will also lend you people who are good at optimising code for their systems). If the university doesn't already have some clusters lying around, then you should get in contact with a few other research groups. £4K won't go very far, but if half a dozen research groups each put in £4K then that gives you enough for a reasonable cluster to share between the various users.
I am TheRaven on Soylent News
http://linux.slashdot.org/story/11/09/13/2111210/Ask-Slashdot-Best-Use-For-a-New-Supercomputing-Cluster ...who posted that he was allegedly just getting the budget to build "...the largest x86_64-based supercomputer on the east coast of the U.S. (at least until someone takes the title away from us). It's spec'ed to start with 1200 dual-socket six-core servers..." but apparently has no idea what he's going to use it for.
If true, he'll have lots of cycles to sell for cheap and/or his organization is clearly not value-oriented so he'll probably sell time without much concern over price.
-Styopa
I do not know about your research/project. But here at the university where I do my research, once you get "money for something" you must use it for that... If the money is to buy a few computers for parallel programming, it must be used this way. And you cannot use the same money for cloud/grid "rent by the hour".
1 - is your program cpu intensive only?
2 - is your program memory intensive?
3 - how about disk usage?
4 - Are you guys "driving" a pre-made program/package/solution (like Gaussian/COLUMBUS) or developing one?
5 - How many people are going to use your cluster?
Best practices for any of those situations are, in case you are using "boring linux":
1 - customized kernel for better performance
2 - allocation of resources using CGroups
3 - in case of developing your own program, you could read about "intrinsic functions" and make your program faster.
OK, I won't be too hard on the discussions above, but I read enough to try to give some real help to the OP. I get that this is basically an embarrassingly parallel application. So, that means a gigabit network is fine. That also means that single core performance is the ONLY indicator of the speed of the application. That means investing in anything AMD is a mistake. The best bang for the buck is quad-core Sandy Bridge CPUs. 4000 pounds is about $6300. I can build a quad-core 2.8 GHz Sandy Bridge node (2GB/core in a desktop case) for under $400 each. Cables, Gbit switch, and 15-16 nodes (60-64 Sandy Bridge cores total) will fit in the budget without too much effort.
OK...so, it isn't ECC memory. And it isn't general purpose. And it isn't going to run most parallel applications worth a crap due to the gbit network, but the point of building a cluster is to design it to match the application. 64 Sandy Bridge cores will run rings around any Magny Cores solution you can build for the same price.
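The node count quoted above can be checked with a few lines of arithmetic. This is only a rough sketch: the $6300 budget, the $400 (or slightly lower) node price and the 4 cores per node come from the comment, while the function name and the `overhead_usd` parameter for the switch and cables are hypothetical.

```python
def nodes_in_budget(budget_usd, node_cost_usd, cores_per_node, overhead_usd=0.0):
    """Return (node_count, total_cores) that fit in the budget.

    overhead_usd is a hypothetical allowance for switch, cables, etc.
    """
    usable = budget_usd - overhead_usd
    nodes = int(usable // node_cost_usd)  # whole nodes only
    return nodes, nodes * cores_per_node


# Figures from the comment: about $6300, quad-core nodes at or just under $400.
at_400 = nodes_in_budget(6300, 400, 4)
at_390 = nodes_in_budget(6300, 390, 4)
print(at_400, at_390)
```

At exactly $400 per node this gives 15 nodes (60 cores); shaving the node price a little gives 16 nodes (64 cores), which matches the 15-16 range quoted, before any allowance for the switch and cables.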
Biggest problem you're going to have, and the reason I think some of the people suggesting renting outside resources rather than purchasing may be on target, is storage (both physical and logical). You can get a machine with 4-8 cores and 4-8 GB of memory for around $1000 in a rack mount case, so around 6 boxes with your budget (since you don't have VAT on .edu gear, I'm assuming around a 1.5 dollar/pound ratio). That doesn't get you a rack to mount them in and doesn't get you any storage beyond the hard drives in the boxes. Realistically, something approaching a quarter to a half of your budget will go to incidentals. You'll need at least one, preferably two, ten-port Gig-E network switches, some kind of low-end shared storage, at least a half-height rack, cables, a multiport KVM, the monitor/KB/mouse (ideally rack mount, but you could save some dough by getting regular ones or using spare hardware)... Individually none of those things is horribly expensive, but together I'd guess 1-2K pounds.
If you've got all that stuff, then your goals are a lot more reasonable... if you don't, you're taking a shoestring budget and making it one of those shoestrings that has been pulled through the metal eyelets on your boots a few too many times. If you absolutely must have dedicated hardware, one way to save on storage might be something like gfs. It's a shared cluster file system that allows you to create one file system from disks on multiple computers. In theory this could allow you to use the spare space on each node to create a large shared storage... in practice I'm not sure how well it scales when the "servers" are also the "clients". I've only ever used it with separate server boxes.
I don't need a million points of light, just two points of multi-mode fiber and a 10 Gig-E router.
This is a good idea. The Air Force has a PS3 cluster doing similar work.
ca 11keur
It took me three attempts to parse that before I realized you meant "ca. EUR 11k" ("circa eleven thousand euro.")
Never thought I'd see the day where the general consensus was "just rent it!". This is Slashdot, how can we not do this better, cheaper, faster and "free" than Amazon's EC2?
Everyone wants to get the most for their money be it in good times or bad.
As others pointed out: use cloud first.
If you want your "own" grid, try to team up with other departments. Likely another one either did the same and can share resources with you, or there is demand and they are waiting for someone to set up a grid.
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
If your friend works in academia (from the currency mentioned, I assume you are from Europe), she can also try applying for access to the HPC-Europa infrastructure.
See http://www.hpc-europa.org/
And to this end, the compute task is CPU bound. Is it actually memory and thread bound or is it core CPU tick bound?
If it is memory bound, not thread heavy, and not a lot of floating point math, I would suggest a cluster of Atom dual core ITX boards. 2 gig of RAM per node, boot off a USB key and share all the data over GbE. That is likely to be the most active cores for the buck, but if there is a lot of floating point math or thread/context switches... ouch.
Another option is to go with a lot of slightly older used 1u Xeon/Opteron boxes. Likely can be had fairly cheap from a surplus dealer. This is used kit so no warranty, which may be an issue with the grant, but again will get more processing for the buck.
Final option:
Call up IBM/Dell/HP/Cray/SGI and ask if they would like to help by "selling" a small cluster for £4K. They might just do it, knowing that you are very likely to expand later by buying from the same vendor that you started with (fewer integration problems).
whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
4000 pounds won't get you very far at all on Amazon.
Let's say the researchers in question took 4x "High-CPU Extra Large Instance" from EC2. Each one of them sports the following:
- 7 GB of memory
- 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
- 1690 GB of instance storage
- 64-bit platform
That's a helluva lot of computing power right there, at $0.76 an hour. £4000 translates to about $6,316.80, which in terms of usable hours at 4x$0.76 would be roughly 2077 hours. 2077 hours is roughly 86 days of 24/7 computing. Thus it sounds like a good deal to me.
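The estimate above can be reproduced with a short calculation. This is a sketch only: the $0.76/hour rate and the 4 instances are the figures quoted in the comment, the exchange rate of 1.5792 is simply what the quoted $6,316.80 implies for £4000, and the function name is made up for illustration.

```python
def rentable_hours(budget_gbp, usd_per_gbp, instances, usd_per_instance_hour):
    """Wall-clock hours the budget buys when `instances` run side by side."""
    budget_usd = budget_gbp * usd_per_gbp
    return budget_usd / (instances * usd_per_instance_hour)


# 1.5792 is the rate implied by the comment's own conversion, not a quoted rate.
hours = rentable_hours(4000, 1.5792, 4, 0.76)
days = hours / 24
print(f"{hours:.0f} hours, about {days:.1f} days of 24/7 computing")
```

This comes out at a little under 2078 hours, or between 86 and 87 days, in line with the figures in the comment. It ignores storage and data transfer charges, which other posters point out are billed separately.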
Spreading this over a 3 year grant means they can do 2 hours of computing a day. If they do more than that they will lose their compute cluster before the end of the project, just when they're writing the next grant application. And I guarantee they will go over budget, because if they get answers back instantly they won't bother/need to optimise their code. Workload expands to fill the available machine time.
Looking at the numbers again, I suspect they'd be better off taking just one of those machines, so they have ~8 hours a day of compute time. Ideally I think they'd want a service at ~$0.25/hour for a 24/7 service. Then they can compute away without worrying about budget and any unused time/money could be used on the big iron for special occasions/rush jobs.
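The "hours per day over a 3 year grant" figures can be sketched the same way. The budget ($6,316.80) and the $0.76/hour rate are taken from the earlier comment; the helper name and the 365-day year are assumptions made for illustration.

```python
def hours_per_day(budget_usd, usd_per_instance_hour, grant_years, instances=1):
    """Daily compute hours if the budget must last for the whole grant."""
    instance_hours = budget_usd / usd_per_instance_hour
    days = grant_years * 365  # assumes a 365-day year
    return instance_hours / instances / days


four_instances = hours_per_day(6316.8, 0.76, 3, instances=4)
one_instance = hours_per_day(6316.8, 0.76, 3, instances=1)
print(f"4 instances: {four_instances:.1f} h/day, 1 instance: {one_instance:.1f} h/day")
```

That gives roughly 1.9 hours a day with four instances and roughly 7.6 hours a day with one, which is where the "2 hours" and "~8 hours" in the comment come from.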
Yes, but I don't think Yahoo! is adding any more universities to it. In fact, I don't think it ever expanded beyond CMU.
ObDisclosure: I was on the ops-side of that project for Yahoo!.
I've never used the service, but if what another poster says is true, that you pay for a full hour when you use any time at all, the cost per hour can be huge (theoretically infinite). The first debugging session will use up all your money. (There is no avoiding doing some debugging and testing on the final compute configuration, which will involve a bunch of short runs of your code - each run of a few seconds will count as an hour.)
The hour starts running when you power up the instance. How many tasks you run inside the instance itself doesn't matter. So the obvious solution would be to only power up one instance for one hour and do as many debugging runs as you can within that time, then examine the results and fire up another instance once you feel you can again use the full hour.
I.e. you got things mixed up. You interpreted it as "do a test run == get billed for an hour", whereas it's "do as many test runs as you can within an hour == get billed for an hour."
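The difference between the two readings is easy to see in a toy model. This is only a sketch of the billing rule as described in this thread (each started instance-hour is billed in full); it is not taken from Amazon's documentation, and the function name and run lengths are made up.

```python
import math


def billed_hours(uptime_minutes):
    """Hours billed for one instance, assuming uptime is rounded up to whole hours."""
    if uptime_minutes <= 0:
        return 0
    return math.ceil(uptime_minutes / 60)


# Twenty 2-minute debug runs inside one instance that stays up for 40 minutes.
one_session = billed_hours(20 * 2)

# The same twenty runs, but starting and stopping the instance for every run.
per_run = sum(billed_hours(2) for _ in range(20))

print(one_session, per_run)
```

Under that assumption, batching the debug runs into one session costs 1 billed hour, while powering the instance up and down for each run costs 20.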
Had second thoughts about £4000 lasting long enough. If they can find a service that costs ~15p/hour (and it seems reasonable that they could) that would provide 24/7 computing power for a 3 year grant and better than that, any unused time/money could be kept and used on special occasions/rush jobs...
The point about restrictions on how the grant is spent still stands though :-(
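The ~15p/hour figure above follows from dividing the budget by the number of hours in the grant period. A minimal sketch, assuming a 365-day year and a hypothetical helper name:

```python
def max_hourly_rate(budget, years, hours_per_day=24):
    """Highest hourly price the budget can sustain for the whole period."""
    return budget / (years * 365 * hours_per_day)


rate_gbp = max_hourly_rate(4000, 3)
print(f"{rate_gbp * 100:.1f}p per hour")
```

For £4000 over 3 years of 24/7 use this works out at about 15.2p per hour, so anything cheaper than that leaves money over for rush jobs.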
Find some place that collects e-waste in a business-heavy area (or military area). You can usually find racks for cheap.
Since there have been a lot of rack mount suggestions, I might throw in my personal experience. I bought a half-height portable rack for $500 a few years ago. I put it in a closet with a free-standing portable AC unit ($400) and vented it into the dryer vent. (If you lack dryer vents, a closet with a window will do fine.) I'd recommend talking to building maintenance to make sure the rack and the AC unit can be powered off the same circuit, but otherwise you should be fine.
I also created a small lab (about 25 laptop hosts and 12 pieces of network equipment). It ran free-standing in a room with no modifications for cooling/electricity. At the size you're talking, I wouldn't worry about heat/power.
I do security
4000 pounds is like what, $6000? Not much you can get with that. Do you really want to build a cluster? As others said, you can rent a cluster, but HPC in science is hopelessly inefficient: it comes in with large datasets (bandwidth is usually metered) and spits out even larger datasets (bandwidth is metered + online storage is metered), and you'll be paying through the nose because somebody (the researcher) doesn't know how to program correctly. First of all, I would definitely recommend you look around in your institution or in peer institutions to see whether there is already a cluster you can rent (or use for free). Most large institutions have a supercomputer, and even smaller departments (like Physics, Astronomy and Imaging or Visual Sciences) have their own small cluster that is not 100% used. You may need to work within the framework of your local politics and make concessions as far as time allotments and constraints go, but it's cheaper (or free).
If you want to go the DIY route, why don't you just buy a machine from Supermicro (their 3U's are both towers and rack mount) and fill it with a good amount of RAM (at least 2GB/core), 2-4 processors (e.g. 6-core or 8-core AMD Opterons) and a couple of 2TB hard drives (mirrored)? You're then pretty much through your budget, especially if you want to throw in an nVidia Tesla card. If you want to, you could use virtual machines on that system with Xen or VirtualBox (whichever fits better).
The advantages of this approach are:
- Much less maintenance even if you go the virtual machine route
- Much faster interprocess and device (storage etc.) communications - interprocess communications over gigabit kills performance unnecessarily on a small cluster. Larger clusters have InfiniBand.
- You can still expand it later with another machine like it and use cluster software then.
- Footprint is smaller. You can fit these machines in 3 or 4U and they come with a 1.5kW power supply (to support the GPUs). You can buy about 6 1U systems for roughly the same price but you've doubled your rack usage and the power supplies are together about 3 kW because now you've got to power the motherboards and peripherals of 6 devices.
- GPU computing cannot be done in cheap 1U devices. And even if you can (by spending a little more and getting only 4 machines), only 1 GPU fits, rarely 2. The 3U solutions fit 4 (sometimes 5) GPUs perfectly and are built (power supply, cooling) for it. Even if you don't do GPU computing right now, even MATLAB can offload certain functions to gaming GPUs. They have a little less memory than the Teslas but they only cost $150-250 (compared to $700-1200 for a Tesla with EDU discount).
- No need to maintain cluster software (and it can be a pain in the neck)
- It fits under a desk and doesn't take up rack space if you don't want it to. No need to pay for hosting or cooling, no extra noise.
Custom electronics and digital signage for your business: www.evcircuits.com
cheap to do with existing botnet systems.
You can't handle the truth.
That's not nearly as bad as it sounded. Thank you for the clarification. But I still wouldn't want to have to work that way. One minute run + looking at the output + get interrupted by a phonecall = one hour charge. If they billed by actual CPU time that would be much more attractive.
EC2 is really expensive compared to other (better) cloud providers, not running your own cluster.
depending on the nature of your research group (academic, government, military, private...) you may well be able to have cluster time free for the asking. from my experience you may have to 'apply for a grant' which is really filling out a form. the cluster i have access to has many nodes with 64 gig of ram and all nodes are stuffed with gpus as well. it never makes sense to spend money on computer equipment that you will spend a year or more learning to use. do as much development on borrowed equipment as you can and when you have working software already implemented, buy hardware as needed. good luck!
In November the $25 Raspberry Pi computer will become available. Check it out.
For our HPC clusters, we run torque on Linux (CentOS), which is descended, I believe, from beowulf. No scaling problems at all. Get servers with the most cores you can afford, put this on, and away you go.
I will note that the code has to be aware of parallelism, and fork.
mark
I'm assuming this is at a university - are there other facilities available already?
How long will the CPU-burning requirements last? Does it make sense to buy hardware, or to rent time on Amazon's cloud? Is it worth spending a month of programmer time to port to GPU/DSP if it saves you three months of computation? Have you done any models on what you need?
When you say "CPU-bound", what do you mean? Is it fixed-point or floating-point? What precision? Is it large-memory or small-memory? Is it a standard problem space, like image processing or cryptography? For some problems, e.g. small memory fixed-point, you can buy DSP boards that will be several orders of magnitude faster than generic PC hardware, and won't require much application porting.
Do you have a spare grad student to do hardware/sysadmin grunt work? For 4000 pounds, you can probably buy about 40 sets of motherboard+power supply, if you have a grunt to build boxes for them, or about 20 sets of pre-built desktop PCs, or about 4 high-end Dell rack servers.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Look up something like Condor. It will let you use the spare cycles kicking around in the workstations in your friend's lab and other labs that are willing to install the client. It can be set up to only run when the computer is idle, to run on her lab's computers first before using others, etc. Also, once it is set up, any lab in the building/network/world could use it provided that the admin approved it.
EC2 is really expensive compared to other (better) cloud providers
That could well be true, I am only familiar with EC2 and I've actually never used it myself. There's plenty of people suggesting asking universities for CPU time and that might well be cheaper. The point is though that running your own cluster with only £4000 available is likely to end in tears, not actual research getting done.
I think a lot of the work you put into running your own cluster will still be required for EC2 or other cloud providers. Cloud providers give you a VPS to play with, but they don't typically handle any of the cluster part. EC2 doesn't. So in either case, all of the software side of things is still up to you. In fact, the only thing extra that EC2's "cluster" service really gets you is that they provide a 10Gbps interconnect between your cluster instances. The instances themselves are nothing special, just large, and at ~$1500 a month, expensive.
Building a bunch of cheap machines and plugging them into a switch isn't difficult or risky, and outsourcing that to a cloud provider doesn't necessarily make it any easier. Potentially cheaper, depending on how long-term you need the performance. 4000 GBP would buy you 228162 hours of small (512MB RAM) instances at Linode, or 5690 hours of large (20GB of RAM) instances (cost scales linearly by RAM and guaranteed CPU share). Due to the way such cloud providers work (larger instances guaranteed larger minimum CPU, but all instance sizes have the same maximum theoretical performance of one quad core xeon), you'll probably get far more CPU power out of the many smaller instances, but it depends on how parallelizable the task is.
The downside of EC2 is that, while they guarantee a given amount of CPU power, the guaranteed amount is very small per dollar, and there's no taking advantage of spare CPU time that other people aren't using. In the case of a good cloud provider, there's usually a lot of CPU power to go around. If you want to directly compare guarantees, a large EC2 instance ($0.34/h) is probably roughly equivalent in guaranteed CPU to a 4GB linode ($0.22/h), but has double the burstable CPU power if it's available. Of course, RAM is not comparable, but it's unclear if RAM or CPU power is the primary demand in this instance. There are other complexities, because EC2 charges for storage and transfer on top of the base rate, while Linode includes it in the base rate.
I can buy hardware that is about one third of this power for £4000 and then run it forever, or even resell it after, or transfer it to another project.
Amazon is AMAZING if you have unpredictable workload and sudden spikes of CPU power requirements. For instance if you have time sensitive workloads, things that need to be done before day X.
This is a University project so this doesn't apply at all.
You should definitely BUY your hardware. Which one, I don't know; it depends on the type of computation, the software and such, but don't go with Amazon, it's just NOT meant for you.
Here is something I may do one day: 10 nodes, each with an Intel Core i7 2600K (around 100 gflops). If you network boot and have headless nodes then each can easily be built (with 8 GB RAM) for £400. So in conclusion approximately 1000 gflops (more if you overclock) for at most £4000 (that's including VAT).
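As a quick sanity check on that estimate, here is a small sketch. The per-node figures (about 100 gflops and £400) are the poster's own rough numbers, not measurements, and the function name is hypothetical.

```python
def cluster_estimate(nodes, gflops_per_node, cost_per_node):
    """Return (total_gflops, total_cost, gflops_per_unit_of_currency)."""
    total_gflops = nodes * gflops_per_node
    total_cost = nodes * cost_per_node
    return total_gflops, total_cost, total_gflops / total_cost


total_gflops, total_cost, gflops_per_pound = cluster_estimate(10, 100, 400)
print(total_gflops, total_cost, gflops_per_pound)
```

That is 1000 gflops for £4000, or about 0.25 gflops per pound, assuming the per-node peak is actually reached by the workload.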
Just get with the guy who did the Ask Slashdot the other day who didn't know what to do with his supercomputer.
IF you have to go get power, cooling, man hours of maintenance. *Lots* of places already have these as sunk costs, so why go incur costs with Amazon when you're already paying for your own IT overhead?
Are you sure? Might want to take a look at this.
Instead of hardware, why not write software for distributed computing (like SETI@home)? Then you need only write the software and get people to install it on their systems. (Yes, I said only.)
Perhaps set up the software so that other research groups could utilize the spare processing cycles thus getting a huge distributed processing set up for use by multiple Universities with different research needs brought together by a common processing software.
Make it modular; not all research has the same needs.
With the correct media spin it could happen, and I have always wondered why schools don't use their lab computers for distributed computing at night, like I did by installing Vue Infinite render nodes on all the math lab computers.
No one ever noticed they were running.
"If any question why we died, Tell them because our fathers lied."
Yes, but you could purchase and run your own machine for two years straight. Even if the machine is only a quarter as fast, you'll get twice the computations out of it.
Ohh, BTW. I said that your comment made you sound like a paid shill. I never said you were one. I never said I thought you were one. I was telling you how your comment was interpreted. I didn't even mean it as an insult, it was more of 'Hey, you know how that came off, right?'
You are entitled to your own opinions, not your own facts.
Hmmm. Actually I think you may have hit upon the answer without realizing it. 'Borrow' the CPU cycles of computer labs that are closed at night.
If you think about it, I am pretty sure there is at least one classroom exactly as you described (a few dozen mid-range to high end desktops on a GigE network) that locks the doors at night and spends a solid 10 hours dark. Figure out a way to boot these machines from a thumbdrive or boot DVD with the Linux distro for clusters that you like (personally I like the thumbdrive approach - it runs a LOT faster due to the seek times on a DVD ROM) and voila! Instant slave machine army for your cluster. If the OP can work around the hours constraints, I'm going to bet he has access to a LOT more CPU horsepower than he could imagine.
The trick is simply finding out who is responsible for the hardware and convincing them to allow you access to the 'training room' or 'computer lab' after hours.
Glonoinha the MebiByte Slayer
Well a research group should do some research.
Security and revision control are important.
When starting a research group one important
and largish investment is the desktops and local
storage to manage the code and the data.
A startup should start with dual purpose resources
when possible. Code design should begin with
some notion of progress and checkpoint and restart.
Building reliable infrastructure is a royal PITA.
The desktop tools and cluster tools should play well
together.
Do research the various cloud resources. Optimum
use of cloud resources can depend on the smallest
initial design decisions.
As always read Jon Louis Bentley's "Programming Pearls"
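The "progress and checkpoint and restart" idea mentioned above can be sketched in a few lines. This is a minimal illustration, not anyone's actual research code: the file name, the state layout and the squared-number "work" are all made up, and a real job would checkpoint far less often than once per item.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # the rename replaces the old file in one step


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_item": 0, "total": 0}


def run(path, n_items, stop_after=None):
    """Process items in order, saving progress after each one."""
    state = load_checkpoint(path)
    done = 0
    while state["next_item"] < n_items:
        if stop_after is not None and done >= stop_after:
            break  # simulate the job being killed part-way through
        i = state["next_item"]
        state["total"] += i * i  # stand-in for the real computation
        state["next_item"] = i + 1
        save_checkpoint(path, state)
        done += 1
    return state


with tempfile.TemporaryDirectory() as d:
    ckpt = os.path.join(d, "state.json")
    run(ckpt, 10, stop_after=4)  # first run stops after 4 items
    final = run(ckpt, 10)        # restart resumes at item 4
    print(final)
```

The second call does not redo the first four items; it reads the checkpoint and carries on, which is what makes overnight lab machines or by-the-hour instances usable for long jobs.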
Truth is stranger than fiction, but it is because Fiction is obliged to stick to possibilities; Truth isn't. Mark Twain.