Skip the raised floor and other stuff that isn't critical, shop carefully and buy the battery system used and you can put it together for well under $100k. The additional building power transformers and switchgear *alone* would cost well over $100k, for the amount of power necessary. Add to that at least 85 tons of air conditioning necessary to cool the cluster (assuming 100% efficiency, and no redundancy). At about $40k for each 30-ton Liebert chilled/potable water DX unit, that's another $120k...
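For anyone who wants to check the cooling math, here's a rough sketch. The 85-ton load, 30-ton unit size and $40k unit price are the figures above; the function name is just something made up for illustration, and it assumes you can only buy whole units.

```python
import math

# Figures taken from the post above (all approximate).
COOLING_LOAD_TONS = 85      # minimum cooling load, no redundancy
UNIT_CAPACITY_TONS = 30     # capacity of one Liebert unit
UNIT_COST_USD = 40_000      # rough price per unit


def cooling_units_needed(load_tons, unit_tons):
    """Whole units required to cover the load (hypothetical helper)."""
    return math.ceil(load_tons / unit_tons)


units = cooling_units_needed(COOLING_LOAD_TONS, UNIT_CAPACITY_TONS)
cooling_cost_usd = units * UNIT_COST_USD

print(f"{units} units, about ${cooling_cost_usd:,} for cooling alone")
```

That covers only the in-room units, with no spare capacity if one of them fails.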
Also, if you think that we can afford to put either a UPS or generator backup under this whole mess, you don't seem to understand what kind of minimal funding we get for facilities and infrastructure. Truthfully, considering the amount of money that it'd cost vs the cost of the downtime (and typical amount/length of power outages), putting a UPS and genset under all of the cluster nodes probably isn't worthwhile, or else our customers would actually pay us to do it.
With a fiber ring, they could have (temporarily) distributed the cabinets around the campus, bringing the machine up to full power. Then once the researchers sign off on it, the old one is powered down and moved out. Each pair of racks has a 50Gb connection back to the switching backbone. Building the cluster in this manner would horribly hurt performance (latency), probably require an extra $200k or so in networking hardware that would be useless after we re-assembled the cluster in one location, inconvenience our users (user jobs can run for up to 30 *days* at a time), require us to somehow convince other departments on campus to give us power and cooling to house the racks, require many, MANY more man hours to assemble and then move the machines back across campus, probably resulting in more hardware failures due to the number of times they're moved, etc.
It simply wasn't feasible to try to assemble the new cluster before the old ones were demolished.
Each of the 800 machines costs at least $5000, so for the price of 20 of the machines you can build a whole new room to house them. Each machine cost us no more than about $2300 (excluding the infrastructure). In total, we purchased about 1000 machines for about $2.6M including racks, cables, network switches, etc, so about $2600/machine total cost.
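A quick sanity check on those numbers, as a sketch. The $5000 figure is the parent poster's assumption, the rest are the approximate figures above, and the variable names are only for illustration.

```python
# Approximate figures from the post above.
TOTAL_SPEND_USD = 2_600_000     # machines plus racks, cables, switches, etc.
MACHINES_PURCHASED = 1000
HARDWARE_PER_MACHINE_USD = 2300  # upper bound, excluding infrastructure

# The parent poster's assumed price per machine.
CLAIMED_PER_MACHINE_USD = 5000

all_in_per_machine = TOTAL_SPEND_USD / MACHINES_PURCHASED
twenty_at_claimed_price = 20 * CLAIMED_PER_MACHINE_USD
twenty_at_actual_price = 20 * HARDWARE_PER_MACHINE_USD

print(f"All-in cost per machine: ${all_in_per_machine:,.0f}")
print(f"20 machines at the claimed price: ${twenty_at_claimed_price:,}")
print(f"20 machines at the actual price:  ${twenty_at_actual_price:,}")
```

So "the price of 20 machines" is less than half of what the parent assumed, before even getting into where the money is allowed to go.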
Also, for many reasons you can't just spend money on whatever you want. The money's source dictates whether it can be spent on computers, staff salaries, building improvements, etc. It's not like "here's a check for a lot of money, do with it as you please."
First, the amount of time wasted by trying to do this incrementally would be a much bigger hassle than doing this all at once. The last cluster we built a rack at a time was 512 nodes (16 racks plus one network rack), and required about 2 months to construct. Even if we could achieve something like a zero-downtime switchover, the months it would take to assemble the systems and test them out incrementally would be completely unacceptable to our customers.
With the 1 week downtime, we were able to clean out and almost completely reorganize the half of our datacenter that HASN'T been since the building was built in 1968. I personally spent over 30 hours the first two days of the downtime shutting down machines, wiping disks, removing machines and network hardware from racks, removing cables from the floor (in a manner to not disturb the remaining critical systems), etc. We pulled over 1.6 cubic yards of *network cables* out of the floor over those first two days, and spent time re-arranging the racks of systems that were left into proper hot-aisle/cold-aisle cooling rows.
Also, we've spent around $2.5M on new hardware for this compute cluster. The necessary improvements to the datacenter would easily cost over $10M and require years to complete. In addition, a huge portion of the funds used to buy the new systems came from grants, which specified exactly what they could be used for (buying compute resources), and couldn't be used for building repair, maintenance, or construction.
We've been trying to convince the Board of Trustees that we need a new datacenter for the past 5-10 years now. No one there wants to cough up the amount of money that would be required to build such a building, despite how fundamental computing is to modern research. Part of the problem is cost per square foot... it's hard for them to take us seriously when other campus buildings are going up for around 10% of the cost per square foot of what is required for a properly constructed datacenter.
I guess another way to say this is "There are people who have been fighting these issues for over a decade now, and who intimately know the problems that are being faced. Don't think that you're saying something new or useful that hasn't been considered." :)
The storage will be provided by our already-in-place BlueArc Titan 2200 and 2500 systems. Both are two-head clusters with a few racks of disk behind each, the 2200 with 4Gb of uplink per head for home directory storage, and the 2500s each with 10Gb of uplink for scratch (high-speed) storage. They export their filesystems using NFS v3. We also provide an archival storage system using EMC's DXUL and a tape library that has a capacity of a couple of petabytes. We've tested the BlueArc Titans (which are FPGA-based NAS boxes) with over a thousand clients talking to them, running high-performance compute jobs, and they're basically the only solution that we've found which doesn't burst into flames. (well, stop working, anyways...)
The OS is RedHat Enterprise Linux 4, which is a requirement of some of our customers and of software that will be running on the systems.
There are actually two sections: one with about 200 nodes, which has an SDR Infiniband interconnect plus a gigabit ethernet fabric based upon re-used Cisco 4948-10GE switches, and the other with about 600 nodes, which are networked with gigabit ethernet to a Foundry RX-16 switch, which is completely non-blocking (50Gb FDX of bandwidth connecting each 48-port blade to the switch fabric).
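To spell out why a 48-port gigabit blade with 50Gb to the fabric counts as non-blocking, here's a rough sketch. It assumes every port runs at 1Gb/s in each direction and treats "non-blocking" as simply "the ports can't offer more traffic than the fabric link carries"; the function name is made up for illustration.

```python
def oversubscription_ratio(ports, port_gbps, fabric_gbps):
    """Worst-case offered load from the ports divided by fabric capacity."""
    return (ports * port_gbps) / fabric_gbps


# 48 gigabit ports per blade, 50Gb full duplex from blade to fabric.
ratio = oversubscription_ratio(ports=48, port_gbps=1, fabric_gbps=50)

# A ratio at or below 1.0 means the blade uplink is never the bottleneck.
is_non_blocking = ratio <= 1.0

print(f"oversubscription ratio: {ratio:.2f}, non-blocking: {is_non_blocking}")
```

Anything above 1.0 would mean the blade could be asked to push more traffic than its fabric connection can carry.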
At peak load, we're estimating that the entire system (including in-room cooling systems) will use up about $650 (Purdue's negotiated power cost) in power each day. This doesn't include the water chillers across campus that supply cold water to the CRAC units.
Power: our department doesn't pay for it, for now. So, it doesn't cost us real money. While it might cost Purdue less to buy new fast hardware than to pay for power to this stuff, "operation" expenses aren't considered the same as "capital" expenses such as equipment. Bureaucracy prefers wasting power to buying equipment...
F6800s vs G5 macs: There's two issues here.
First one relates to researchers and software compatibility - some researchers use software that will only run on Solaris machines. Yes, we do have to support the researchers, we don't always get the luxury of choosing 'well, we won't do that because it's not OpenSource'...
Second issue is memory size and interconnect. The Sun machines have a blazingly fast interconnect compared to anything a G5 has, and can support up to 192GB ram per machine (which is what they contain). At 24 processors per box, that's 8GB per processor. Can you stick 16GB of ram into a dual G5? I'm leaning towards no...
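The memory comparison works out like this, as a quick sketch. The 192GB and 24-processor figures are from above; the helper name is just for illustration.

```python
def gb_per_processor(total_ram_gb, processors):
    """RAM available per processor if split evenly (hypothetical helper)."""
    return total_ram_gb / processors


# One F6800: 192GB of RAM across 24 processors.
f6800_gb_per_cpu = gb_per_processor(192, 24)

# RAM a dual-processor G5 would need to match that ratio.
dual_g5_ram_needed_gb = f6800_gb_per_cpu * 2

print(f"F6800: {f6800_gb_per_cpu:.0f}GB per processor")
print(f"A dual G5 would need {dual_g5_ram_needed_gb:.0f}GB to match")
```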
Wow, you seem to be a typical nice slashdotter :).
I'll go ahead and respond... Yes, we DO get very useful cycles out of these machines, even with their interconnect. They're great at running 1-4 node jobs, which is what's typically run on them. Small jobs like that work wonderfully on the clusters, freeing up things like our SP (and soon our F6800s) for 'real' researchers that need access to multi-GB datasets requiring a >32bit address space per process, or high speed interconnect, or...
And don't forget the machines (well, not the labor necessarily, but that was cheap enough, I know what my paycheck is!) were basically free to us.
Power is another issue, but for now it's something a different department gets to pay for (read "free to us").
Sorry about the AC post but I couldn't log in using kfm. What I meant is why the hell would you want something like Access (or Excel, Word, PowerPoint, etc.) on a PDA? I'd rather use something where it's easy to edit the database, i.e. something w/a keyboard.
Pat
And can I hook my VT420 up to it to run programs? For that matter, does it even support console-mode apps?