Inside The World's Most Advanced Computer
Junky191 writes: "Just came across an informational page for the Earth Simulator computer, which provides nice graphics of the layout of the machine and its support structure, as well as details about exactly what types of problems it solves. Fascinating for the engineering problems tackled: how would you organize a 5,120 processor system capable of 40Tflops, and of course don't forget about the 10TB of shared memory." Take note -- donour writes: "Well, the new list of supercomputer rankings is up today. I have to say that the Earth Simulator is quite impressive, from both a performance and architectural standpoint."
The easiest way to validate these types of prediction mechanisms is to feed them only part of your data set and see how well they predict the rest. For example, if you have an ocean temperature data set from 1920 to the present, you might start by feeding in 1920-1992 and seeing how well the predictions for the past ten years hold up against your actual data. You may think that the known data set is too small for accurate predictions, but there are some fascinating methods (like ice core sampling and tree growth sampling) that seem to allow pretty good deductions as to past climate conditions over a very long period of time.
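To make the hold-out idea concrete, here's a minimal sketch in Python. Everything in it is an assumption for illustration: the temperature series is synthetic, the "model" is just a straight-line least-squares fit rather than anything a real climate simulation would use, and the names `fit_linear_trend` and `holdout_error` are made up for this example.

```python
def fit_linear_trend(years, temps):
    """Ordinary least-squares fit of temps = slope * year + intercept."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def holdout_error(years, temps, split_year):
    """Fit on data up to split_year, then return the mean absolute
    error of the fit's predictions on the years after split_year."""
    train = [(x, y) for x, y in zip(years, temps) if x <= split_year]
    test = [(x, y) for x, y in zip(years, temps) if x > split_year]
    slope, intercept = fit_linear_trend(
        [x for x, _ in train], [y for _, y in train]
    )
    errors = [abs(slope * x + intercept - y) for x, y in test]
    return sum(errors) / len(errors)


# Synthetic stand-in for an ocean temperature record (NOT real data):
# a perfectly linear warming of 0.01 degrees per year from 1920 to 2002.
years = list(range(1920, 2003))
temps = [15.0 + 0.01 * (y - 1920) for y in years]

# Train on 1920-1992, check against the held-back 1993-2002.
mae = holdout_error(years, temps, 1992)
print(f"mean absolute error on held-out years: {mae:.6f}")
```

Because the fake data is exactly linear, the held-out error comes out essentially zero; with a real record, that error is the number that tells you how far to trust the model's forecasts.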
The Earth Simulator runs SUPER-UX, the same operating system as the rest of the NEC supercomputers.
The German-language TV channel 3sat will broadcast a 30 min film on the Earth Simulator on Monday, 24th of June at 21:30 hours and on Tuesday, 25th of June at 14:30 hours.
I'm not sure how much you've looked up, so some of this information may be redundant, but here's what I've been able to dig up:
That's a beast of a chip! The packaging looks pretty substantial as well. I don't doubt the cooling systems are fairly remarkable, although I can't find any specific information about 'em.
cheers!
These machines tend to be clusters of smaller machines. IBM's SP architecture, for example, runs AIX, which doesn't need to scale particularly well.
The magic in SP is partly hardware (high-speed interconnect between nodes) and partly the admin software, which allows admin tasks to be run simultaneously on many nodes (a non-negligible task). The rest is left up to the application programmers, who use MPI or similar to get the application to run over the cluster.
Single system images typically don't scale this large. Cray's UNICOS/mk (Unix variant) is a microkernel version of the UNICOS OS, used on the T3E and its predecessors, where a microkernel runs on each node, obviously incurring some overhead, but avoiding bottlenecks that otherwise occur as you scale. Here's some info. Last time I checked, T3E scaled to 2048 processors.
Out of the box, SGI's IRIX scales very nicely up to 128-256 processors. Beyond that, "IRIX XXL" is used (up to 1024 processors, to date). This is no longer considered to be a general purpose OS!
IRIX replicates kernel text across nodes for speed, and kernel structures are allocated locally wherever possible. But getting write access to global kernel structures (some performance counters, for example) becomes a bottleneck as the system scales.
IRIX XXL works around these bottlenecks, presumably sacrificing some features in the process. Sorry, I can't find a good link on IRIX scalability.
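For anyone wondering what "allocate locally, avoid global writes" looks like in practice, here's a rough Python sketch of the general technique using a per-node (sharded) counter. To be clear, this is not IRIX code or a description of how IRIX XXL actually does it; `ShardedCounter`, `run_workers`, and the node counts are all hypothetical, and threads stand in for nodes.

```python
import threading


class ShardedCounter:
    """One counter slot (and one lock) per node. Writers only touch
    their own node's slot, so they never contend with other nodes."""

    def __init__(self, num_nodes):
        self._counts = [0] * num_nodes
        self._locks = [threading.Lock() for _ in range(num_nodes)]

    def increment(self, node_id):
        # Local write: only this node's lock is taken.
        with self._locks[node_id]:
            self._counts[node_id] += 1

    def total(self):
        # Reads are the slow path: visit every node's slot and sum.
        result = 0
        for node_id, lock in enumerate(self._locks):
            with lock:
                result += self._counts[node_id]
        return result


def run_workers(counter, num_nodes, increments_per_node):
    """Start one thread per hypothetical node, each bumping its own slot."""

    def worker(node_id):
        for _ in range(increments_per_node):
            counter.increment(node_id)

    threads = [
        threading.Thread(target=worker, args=(node_id,))
        for node_id in range(num_nodes)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


counter = ShardedCounter(num_nodes=8)
run_workers(counter, num_nodes=8, increments_per_node=1000)
print(counter.total())
```

The trade-off is the usual one: writes get cheap and local, while reading the grand total means touching every node, which is fine for something like a performance counter that is written constantly and read rarely.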