More on Virginia Tech G5 Cluster: 17.6 Tflops
daveschroeder writes "BBC World's Click Online has a video report (with text transcript) on Virginia Tech's new 1100-node dual 2.0 GHz G5 Terascale Cluster. The report quotes the performance as 17.6 Tflops. As a point of reference, the cluster would be number 2 on the most recent June Top 500 list, behind only Japan's Earth Simulator, and considerably more than doubling the performance of the current number 3 1152-node dual 2.4 GHz Xeon MCR Linux cluster. Assuming the performance figure accurately reflects the LINPACK score (which it should; since the deadline for submissions for the upcoming list of Oct 1 has already passed, one would imagine VT would quote that figure), and depending on new entries for November's upcoming list, the cluster should almost certainly rank in the top 5 - all for only US$5.2 million. The video report is available in Windows Media 9 and Real formats; the relevant portion starts at 13:00."
http://www.bbcworld.com/content/template_clickonline.asp?pageid=666&co_pageid=3
This has been discussed before: they use error correction algorithms, so no ECC RAM is necessary.
Yep, and they're going to be in the top 5. Between you and them, I wonder who has the better knowledge of how to build a cost-efficient cluster?
The project leader, Dr. Srinidhi Varadarajan, will be speaking at a session entitled Building Virginia Tech's G5 Supercluster on Oct 28 at the upcoming O'Reilly Mac OS X conference.
He'll probably reveal some of the technical details, such as the version of Mac OS X used, at that session.
Also, according to a blog at O'Reilly:
Next year, all the little known details [about the cluster] will be revealed in a new book. By that time we'll know what the project means for supercomputing and for Apple.
Just because it's in hardware doesn't mean it's free. The ECC logic is going to add a small delay to each of trillions of memory accesses. Plain memory can most likely be tuned to run faster than ECC memory.
If you're running a constrained problem and can verify the results at the end, a single error check in software could consume far less overall time than the continuous ECC hardware checks. The software check would probably catch other types of errors as well (including many errors caused by software bugs).
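To make the "verify at the end" idea concrete, here is a minimal sketch in Python, assuming the job is a LINPACK-style dense linear solve. The function names (`solve`, `residual`, `verified`) and the tolerance are made up for illustration; nothing here is taken from VT's actual software. The point is only that one cheap residual check after the run can flag an answer that a memory error corrupted along the way.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs stay intact for the final check.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def residual(a, x, b):
    """Largest absolute entry of a x - b, computed from the original inputs."""
    return max(abs(sum(aij * xj for aij, xj in zip(row, x)) - bi)
               for row, bi in zip(a, b))


def verified(a, x, b, tol=1e-9):
    """Single end-of-run check: accept x only if the residual is tiny."""
    return residual(a, x, b) <= tol
```

The check costs one matrix-vector product, which is small next to the elimination itself. It only works for problems where the answer can be tested against the inputs like this, which is the "constrained problem" caveat above.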
In fact the heat is so intense that ordinary air conditioning units would have resulted in 60 mph winds.
Help fight continental drift.
The 'project' uses the same amount of electricity as 3,000 average-sized homes. There are many more devices deployed than just the 1100 G5s. The cooling system alone is a major power eater. Read the articles :)
No, ECC RAM typically is just made with faster internals. As an example, most commodity ECC RAM is CAS2 latency whereas most generic RAM is CAS3, so the ECC RAM will perform exactly the same as the non-ECC RAM. You can buy CAS2 non-ECC RAM, but it's nearly as expensive as the ECC RAM. If you have a simple idiot check at the end of a complex calculation, then saving the cost of going with ECC may be worth it, but most clusters this large will be used on too many different projects to assume that all of them will have such checks. For an idea of how important ECC is, read this IBM whitepaper on their chipkill ECC scheme: http://www.ibm.com/servers/eserver/pseries/campaigns/chipkill.pdf. Even normal SEC ECC RAM (what most ECC RAM is today) will have approximately 900 failures per 10TB per three years. I think that IBM is right and that eventually all RAM will be RAID-M, that is, a RAID5-style array of redundant memory banks that are composed of ECC banks. At future densities this will be necessary because a single high-energy particle will have the ability to scramble an entire memory word, including its ECC checking bits.
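For anyone who hasn't seen how RAID5-style redundancy recovers a whole lost word, here is a toy sketch in Python. It is not IBM's actual scheme; the bank layout, the function names (`parity`, `rebuild`) and the assumption that each bank's own ECC tells you which bank went bad are all illustrative. It just shows that XOR parity across banks can rebuild a word that has been scrambled completely, which per-word SEC ECC cannot do.

```python
from functools import reduce


def parity(banks):
    """XOR of the words stored at the same address in every data bank."""
    return reduce(lambda p, w: p ^ w, banks, 0)


def rebuild(banks, parity_word, lost):
    """Recover the word of the bank at index `lost` from the other banks plus parity.

    Assumes something else (here, the bank's own ECC) has already
    identified which bank holds the bad word.
    """
    survivors = [w for i, w in enumerate(banks) if i != lost]
    return reduce(lambda p, w: p ^ w, survivors, parity_word)


# Three data banks holding one 32-bit word each, plus one parity word.
banks = [0xDEADBEEF, 0x12345678, 0x0BADF00D]
p = parity(banks)

# A particle strike scrambles every bit of bank 1's word.
damaged = banks[:]
damaged[1] = 0xFFFFFFFF
recovered = rebuild(damaged, p, lost=1)
```

The cost is one extra bank of storage and an XOR on every write, the same trade RAID5 makes for disks.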
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.