Sure, if you buy a ton of second-hand peecees and glue them together in a Beowulf, you have lots and lots of flops (= CPU power).
;-)
But the flops are not everything. The problem with clusters is the network latency when the nodes talk to each other. That latency is small for your average network application, but immense for a supercomputer trying to make all its CPUs talk together. This is why there are entire classes of problems that cannot be solved properly on clusters (non-parallelizable problems).
By contrast, an SGI supercomputer has inter-CPU latency that is orders of magnitude lower. Same GFlops in total (same CPU power), but certain problems are solved orders of magnitude faster.
That's the power of latency.
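To put rough numbers on that, here is a minimal sketch of a toy cost model: the compute part is split perfectly across the CPUs, and every synchronisation step pays one latency hit. The function name, the workload size and both latency figures are assumptions made up for illustration, not measurements of any real machine.

```python
def run_time(total_flop, flops_per_cpu, n_cpus, n_sync_steps, latency_s):
    """Toy model: perfectly divided compute plus one latency hit per sync step."""
    compute = total_flop / (flops_per_cpu * n_cpus)
    communication = n_sync_steps * latency_s
    return compute + communication

# All numbers below are made-up illustrations, not measurements.
TOTAL_FLOP = 1e12        # work in the whole job
FLOPS_PER_CPU = 1e9      # 1 GFlops per CPU
N_CPUS = 64              # same CPU count on both systems
SYNC_STEPS = 1_000_000   # times the CPUs must exchange data

# Assumed latencies: ~100 microseconds over a network, ~1 microsecond in one box.
cluster = run_time(TOTAL_FLOP, FLOPS_PER_CPU, N_CPUS, SYNC_STEPS, 100e-6)
single_image = run_time(TOTAL_FLOP, FLOPS_PER_CPU, N_CPUS, SYNC_STEPS, 1e-6)

print(f"cluster:      {cluster:.1f} s")
print(f"single image: {single_image:.1f} s")
```

With the same flops on both sides, the only term that changes is the communication one, and for a chatty problem it ends up dominating the total.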
It's a typo.
Your 1U 4CPU mainboard is just a 4CPU system, period. You can connect many of these in a Beowulf/Mosix cluster, but that's a cluster, not a single-image supercomputer.
There are entire classes of problems which cannot be solved on clusters, because of latency (well, they can, but the speed is not good at all). That's why you need single-image systems.
SGI makes single-image supercomputers, not computer clusters. There's a big difference between those two, big enough to justify the higher price on the supercomputers.
Your "128 cpu intel/amd solutions that fit in a single rack" are clusters of computers, not single-image supercomputers. Until you understand the difference, go play with the "imagine a Beowulf of these" crowd. :-P
Hint: that difference is why there's a price tag so high on the supercomputers.
If someone's that stupid, then i guess (s)he deserves whatever his/her own hallucinations might produce. Just let them believe whatever shit they believe.
The article was a joke, the "april 1st" style.
You're lame. The article was a joke (a la "april fools"). You didn't get it.
Please stop feeding every phoenix release into Slashdot's news. That's enough.
Xine: http://xine.sourceforge.net/
Why is this so tremendously important that it made it to the first page?
It is a good thing you noticed your own mistake: the Origins are MIPS, not IA64.
I also noticed you are aware of the selling point of the Origins: extremely high bandwidth, single OS copy. This way, you can have one single large dataset accessed by all CPUs directly (over NUMA). The Beowulf clusters choke when trying to do this, because of the ridiculous latencies of the network.
I am not sure whether or not it is true "very few applications actually demand high IO"; they are few indeed, but not quite so "very few". As a plus, the ones that do exist are large cash providers, since they are usually required by the government and three-letter agencies. Also, the contracts in this area are not the quickly disappearing kind. That's why you won't see SGI going away anytime soon.
P.S. reply to your P.S.: I own (as in: me, personally, not my company) an SGI Indy. At work, i currently deal with tens of SGI systems, from the oldest/smallest to the newest. ;-) And yes, i do use systems with three-digit CPU numbers.
Please stop sending messages that do not have any support. Obviously you're talking without having the slightest clue.
For your information, if you go to www.sgi.com, in just one click (on Products/Servers) you can see how mistaken you are.
According to the webpage http://www.sgi.com/servers/ the SGI Origin 3000 has "Up to 716 GB/sec" internal bandwidth. How do you compare that to the 1GB/s of your PC?
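For a feel of what that gap means, here is a back-of-the-envelope sketch. The two bandwidth figures are the ones quoted above; the 100 GB dataset size and the helper name are assumptions picked for illustration, and the result is a best case that treats the quoted peak figure as a sustained rate.

```python
ORIGIN_GB_PER_S = 716.0   # "Up to 716 GB/sec", as quoted from the SGI page
PC_GB_PER_S = 1.0         # the PC figure used in the comment above
DATASET_GB = 100.0        # hypothetical dataset size, for illustration only

def sweep_seconds(dataset_gb, gb_per_s):
    """Best-case time to move the whole dataset once at the given rate."""
    return dataset_gb / gb_per_s

ratio = ORIGIN_GB_PER_S / PC_GB_PER_S
print(f"bandwidth ratio: {ratio:.0f}x")
print(f"PC:     {sweep_seconds(DATASET_GB, PC_GB_PER_S):.2f} s per pass")
print(f"Origin: {sweep_seconds(DATASET_GB, ORIGIN_GB_PER_S):.2f} s per pass")
```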
I agree, the Origin 3000 is not a graphical workstation, but SGI's focus shifted from graphical stations to supercomputers.
You were comparing PCs with SGI workstations, which indeed do not have a large technical superiority over PCs nowadays.
SGI's compiler (MIPSPro) is slow to compile because it does some really heavy lifting with the optimisations.
I did a lot of tests with gcc and MIPSPro, and gcc doesn't come near the SGI compiler.
The neatest trick is that MIPSPro does a global optimisation after linking the .o files.
If you want your binaries to run really fast, use MIPSPro; forget gcc.
2x195MHz MIPS you're saying is the same thing as a 400MHz Intel? :-) That's ridiculous.
First off, MIPS is 64 bit, so the dual CPU is more like an 800MHz Intel.
Second, MIPS has a huge cache, like 2MB or so. Intels have tiny caches.
Third, SGI architecture has a huge internal bandwidth. Intel comes nowhere near that.
The problem you describe happens with all filesystems that do not have ordered writes: ReiserFS and JFS are also affected.
Ext3 has this "ordered mode", where metadata is committed to disk only after the data has been committed, so there's no chance of getting NULLs no matter what.
A while ago, XFS had this pathological behaviour where metadata was committed before the data, so the NULLs were quite a problem after power blackouts. But this was fixed a few versions ago, and there's no real difference between XFS and other journaled filesystems nowadays.
Anyway, if you care that much for your data, then you're better off using Ext3 with full journalling turned on.
Otherwise, i just use XFS everywhere, because of the performance boost (ok, so i do use ReiserFS for proxy caches).
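Why the write order matters can be shown with a toy simulation. This is only a sketch under simplifying assumptions: a file append is reduced to two steps (write the data blocks, update the size in the metadata), freshly allocated blocks read back as zeros, and the crash happens cleanly between steps. The function and variable names are made up for the example; this is not how any real filesystem is implemented.

```python
def write_with_crash(order, crash_after_step):
    """Simulate appending 4 bytes to an empty file, with a crash partway.

    order: sequence of "data" and "metadata" steps.
    crash_after_step: how many steps complete before power is lost.
    Returns what a reader sees in the file after reboot.
    """
    blocks = bytearray(4)   # freshly allocated disk blocks read as zeros (NULLs)
    size = 0                # file length recorded in the metadata
    for step in order[:crash_after_step]:
        if step == "data":
            blocks[:] = b"DATA"
        elif step == "metadata":
            size = 4
    return bytes(blocks[:size])

ordered = ("data", "metadata")      # data first, as in Ext3's ordered mode
unordered = ("metadata", "data")    # metadata first, the problematic case

print(write_with_crash(ordered, 1))    # file still looks empty
print(write_with_crash(unordered, 1))  # file has its new size, filled with NULLs
print(write_with_crash(ordered, 2))    # both steps done, data is there
```

With data first, a crash in the middle loses the new write but the file stays consistent. With metadata first, the file already claims its new length while the blocks behind it were never written.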
What's your kernel? Are you using vanilla kernel with XFS patch?
I'm using the Red Hat kernel (lots of performance and stability patches) and XFS, and i've never seen the problem you describe.
If you're like me, and you're doing lots of video processing stuff, then the ability to very quickly process files that are usually > 1GB is very neat. That's one reason to use XFS.
- very high disk I/O performance, especially when reading/writing from/to large files
- extended attributes
- POSIX ACLs
- mature and stable
Ext3:
- compatible with Ext2
- can journal everything (data included)
XFS:
- very large volumes and files
- very good performance when writing/reading at high speeds, and/or to/from large files, and/or with concurrent access
- POSIX ACLs and extended attributes
ReiserFS:
- very fast with lots of small files
...it is methane hydrate. That is methane associated with water; the water and methane molecules are entangled in a weird fashion, but it's solid and stable under conditions that are not quite exotic.
biotech
oil drilling
car crash simulations
weather modelling
military stuff
solid state physics (and all kinds of physics actually)
rocket science :-) (literally)
etc.
SGI never thought to replace Irix with Windows! That's ridiculous.
Irix can scale up to 1024 CPUs and beyond. Solaris can scale up to 100. Then there's Linux, now scaling close to 100. How much do you think Windows can scale? 10 CPUs? 20? :-)
SGI's thing was always that it had machines running one single copy of the OS across hundreds (or thousands) of CPUs on the same machine (not in a cluster). You simply cannot do that with Windows, period.
They had some graphics workstations running Windows, but that was on the lowest end of things, and now those systems are not available anymore.
The whole point with the SGI supercomputers (there are Origin servers running Irix on 1024 processors) is that there's one single copy of the OS running across all those CPUs, and the entire memory is available to all CPUs on the same piece of hardware. That means, any CPU can access any piece of information at the speed of mem-IO, and you can easily create a large matrix (think many tens or hundreds of GB) to keep all your data in one piece.
Networked clusters (Mosix, Beowulf) split the CPU bunch across the network, and the memory is split too. That means there's a huge latency when a CPU wants to access data that happens to be on a different node on the network: the network latency is many times larger than memory latency.
There are problems that simply cannot be solved on networked clusters, precisely because of network latency, while true supercomputers (all CPUs on the same machine) do not have this limitation.
Well, ok, so you can split the matrix across nodes in a Beowulf, but even if you have the same CPU power as the SGI supercomp, you're going to solve the problem several times slower (if not several orders of magnitude slower). Such is the importance of latency.
This is why there's no point in clustering this kind of computer: you lose its biggest advantage, a single OS copy with all memory on the same machine.
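The memory-versus-network argument can be sketched as a simple average. Assume the matrix is spread evenly over the nodes and accesses are uniformly random, so only 1/n of them hit local memory. Both latency figures and the function name are assumptions chosen for illustration, and the model ignores that remote memory on a NUMA machine is also somewhat slower than local memory.

```python
def mean_access_latency(n_nodes, local_s, remote_s):
    """Average latency when data is spread evenly over n_nodes and
    accesses are uniformly random: 1/n of them are local, the rest remote."""
    local_fraction = 1.0 / n_nodes
    return local_fraction * local_s + (1.0 - local_fraction) * remote_s

# Illustrative figures only, not measurements of any real system.
MEMORY_LATENCY_S = 200e-9    # assumed memory access latency
NETWORK_LATENCY_S = 100e-6   # assumed cluster network round trip

shared = mean_access_latency(1, MEMORY_LATENCY_S, NETWORK_LATENCY_S)
cluster = mean_access_latency(64, MEMORY_LATENCY_S, NETWORK_LATENCY_S)

print(f"single memory space: {shared * 1e9:.0f} ns per access")
print(f"64-node cluster:     {cluster * 1e6:.1f} us per access")
print(f"slowdown:            {cluster / shared:.0f}x")
```

The more nodes you split the data across, the closer the average gets to pure network latency, which is exactly the case where the cluster loses.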
Actually, it's precisely because of the lack of superfast mem-IO machines that many people tried to work around the problem and create algorithms that are CPU-bound.
In fact, most of the computationally-intensive problems require LOTS of mem-IO.
And there's one more thing: there's a huge difference between the 64-CPU SGI machine, and a Mosix cluster of 64 1-CPU nodes: the SGI has one single memory space contiguous on the same machine. That means you can actually use a very large matrix to process your data, instead of shoving bits of it over the network back and forth.
There are entire classes of problems that will be solved orders of magnitude faster on the SGI server than on a network-distributed Mosix cluster (or any other kind of cluster, Beowulf, etc.). That's the advantage of true SMP systems (all CPUs on the same hardware) as opposed to networked clusters.