BigTux Shows Linux Scales To 64-Way
An anonymous reader writes "HP has been demonstrating a Superdome server running the Stream and HPL benchmarks, which shows that the standard 2.6 Linux kernel scales to 64 processors. Compiling the kernel didn't scale quite so well, but that was because it involves intermittent serial processing by a single processor. The article also notes that HP's customers are increasingly using Linux for enterprise applications, and getting more interested in using it on the desktop..."
That's what, 640.000?
Does it run Linux well?
What parallel-computing activity doesn't involve intermittent activity by a single processor? You have to spawn the parallel job somehow, and typically that starts as a single process. Is the implication here that compiling is pipelined, but linking is a single-CPU job?
If you mod me down, I shall become more powerful than you can possibly imagine.
I haven't had a 64-way since college.
And you?
"Look, Smithers! I'm Davy Crockett!"
SGI
Unisys
Fujitsu
HP
It looks like there might actually be a competitive marketplace for scalable multiprocessor Linux systems real soon now (if not already).
"serial processing" is most probably the linking step... "intermittent" probably means that they incrementally link groups of .o files, etc.
Zero points for trying.
But seriously, this is pretty cool - though I think the best thing about multi-processor systems past two or four is really the ability to run virtualized servers with two or four dedicated CPUs each inside an uber-CPU'd system.
This flies in the face of science.
I work on a SuperDome and would love to see it running Linux. HP-UX is such a pain!!!
I was raised on the command line, bitch
"Nemo me impune lacesset"
I would hardly call a 26x speedup from 64x the processors scaling "well". They can wave their hands and claim it's because part of their compile is single threaded, but until they demonstrate a real world app (or even a standard benchmark; a kernel compile doesn't really qualify there either) that scales better than 26x, I'm not very impressed.
7 November 2006: The day Americans realized corruption and incompetence weren't addressing 11 September 2001
Did you not RTFA? The compile test showed non-linear results, but other benchmarks did prove (near) linear.
Quite informative and typical of what I've seen in a few other cases. Poorly "equipped" IT admin makes "dumb" but well-meaning proposal to switch to Linux/OSS. In some cases, he/she's kicked out on his/her butt (as in this case). Other times, the shop switches with disastrous results.
I know Linux is pretty good from a security sense (compared to Windows, at least), and I'm not surprised to find it operates on exotic setups, but are there that many programs out there that support such a setup? Or ones that will actually benefit from this many processors? Or is the point of this system to develop custom business software for their own use? Or is it for a data server of some sort that can benefit from multiple cores answering requests?
lol: You see no door there!
While FreeBSD is a great OS/kernel, it doesn't scale as well as Linux, end of story.
Huh? What smoke are you craking? Here is the comparison of MS's latest and greatest Windows 2003 server editions. So, umm, where is this double of what Linux supports? Plain vanilla Linux 2.6 can do 64-way no problem. Actually, SGI has had a single-image 128-way Linux system out for a while. They should have a 256-way, single-image Linux system out soon. That is more than MS can even touch. Maybe do some research before you just shoot off FUD.
If Tyranny and Oppression come to this land,
it will be in the guise of fighting a foreign enemy. -James Madison
What's more they estimated the parallel element of the compile workload had scaled linearly.
Hey, at least they tried. How many news articles have you read that compares linux kernel compiles on a 64 processor machine? probably only one.
it took 19 minutes to compile with a single cpu, and 26x faster for the 64 processor machine. Does that equate to about 43 seconds for a kernel compile? It'd probably take longer than that just to untar/unbzip2 the source, since that would be running on only 2 cpus (one process for tar, one for bzip2).
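A quick check of that arithmetic, as a rough sketch in Python. The two figures are the ones quoted above; the function name is just made up for illustration.

```python
# Figures quoted in the thread: 19 minutes on one CPU, 26x speedup on 64 CPUs.
SINGLE_CPU_SECONDS = 19 * 60
SPEEDUP = 26


def parallel_seconds(single_cpu_seconds: float, speedup: float) -> float:
    """Wall-clock time once the single-CPU time is divided by the speedup."""
    return single_cpu_seconds / speedup


if __name__ == "__main__":
    # 1140 s / 26 is a little under 44 seconds.
    print(f"{parallel_seconds(SINGLE_CPU_SECONDS, SPEEDUP):.1f} s")
```

So roughly 44 seconds, give or take rounding in the quoted numbers.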
Why read the article when I can just make up a snap judgement?
If it can scale to 16 procs well, it will scale to 64 procs well.
Until you start talking about double that amount of procs, which is what Windows Server does these days
Wrong. Windows Server 2003 supports a maximum of only 64 processors, and I believe it was significantly tested only on 32-way and smaller machines.
Looking at the literature, Linux and Unix in general seems to be designed to keep processes as lightweight as possible. OTOH, Windows processes are a little heavier and take longer to start up.
Then, OTOH, Windows threads are very lightweight compared to the equivalent thread model in Linux. Benchmarks have shown that in multi-process setups, Unix is heavily favored, but in multi-threaded setups Windows comes out on top.
When it comes to multi-processors, is there a theoretical advantage to using processes vs threads? Leaving out the Windows vs Linux debate for a second, how would an OS that implemented very efficient threads compare to one that implemented very efficient processes?
Would there be a difference?
I like the way HP is handling software distribution, offering Linux as a solution along with AMD processors. Dell attempts it, but only for servers. HP, I believe, does the same, but at least they seem like they care more. And that is what matters if we are going to be pushing Linux into enterprises AND the home... Hooray for HP!
Minor correction. SGI have 512-way single system image Linux computers out there *now*. They are aiming to soon go to 1024 and 2048 way.
Check out AMDZone and the Inq.
MS now recommends just this kind of a system for a desktop when doing Longhorn :).
Wouldn't it have been better to suggest something more commercial, like Xandros or Linspire, using Evolution for the groupware?
If Tyranny and Oppression come to this land,
it will be in the guise of fighting a foreign enemy. -James Madison
First of all, a 26x speedup is GOOD. That said, if you are trying to use a cluster of 64 Itanium 2 processors to compile things, you're an idiot. IIRC, the long pipeline and the VLIW, highly scheduled architecture of the Itanium 2 make it bad at compiling. You could get that performance with cheaper Athlon 64s or Xeons. Not only that, but compiling one thing will ALWAYS be partly serial. Now if they were to compile multiple things (say 3 kernels, or the kernel, X, and KDE) at the same time, they should see closer to that 64x speedup. It's all about how much you can make parallel.
Which is something else. If you were to give that same thing a better application, it WOULD give you near 64x performance. If you used it to batch convert WAVs to MP3s, or RAW images to JPEGs, or MPEG4 to DiVx, or even just raytrace images (all things where no part is dependent on another part, so they are highly parallelizable), things will go great. In the article, they give the example of some bandwidth benchmark where the bandwidth scales almost perfectly with the number of processors they throw at it.
PS: Interesting fact I saw the other day. The human brain can only do about 200 operations per second, which is why computers are much faster at math. But the brain can do MILLIONS of things at once. So while it may only be able to process the image from our eyes at 200 "operations" per second, it does that for the millions of little bits of information all at once, which is why people are so good at visual things, pattern matching, chess, etc. Just FYI.
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
NASA's Columbia cluster ^ 512-way SGI machines running Linux (actually 20 of them...) Not to mention "Columbia's record results were achieved running the LINPACK benchmark on 8,192 of the NASA supercomputer's 10,240 processors. Columbia also achieved an 88 percent efficiency rating on the LINPACK benchmark, the highest efficiency rating ever attained in a LINPACK test on large systems." from http://www.sgi.com/company_info/newsroom/press_releases/2004/october/worlds_fastest.html
The hell with that--I just want a wireless driver for my Dell (Broadcom) PCI card. :-(
Correct, AFAIK the biggest windows 2003 datacenter installs are on Unisys ES7000's and those only support 32-way windows partitions. The box can hold 64 Xeon's so I would say that Unisys isn't comfortable with the scalability of windows to the full system size, otherwise they'd be shouting it from the rooftops.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
That's great to know that the kernel can handle 64-way machines. Especially since I just ordered one from my local PC store in bits to build myself.
Really, the key will be when the system scales to 128 processors and beyond.
Ah, shut up and get back to work, Bill!
Never mind Linux for a moment, I'm just amazed that 64 Itanium 2's have actually been sold...
Wrong! wrong! you (and the other guy who replied) are wrong.
Just take a look at some of the public benchmarks. Just take a look at what some folks are currently running in production. Do a little work and see what's publicly referenced out there. Windows 2003 scales quite nicely up to 64 processors (and 512 GB). Worst case is 1.7x from 32p to 64p. Some apps get 1.8x or 1.9x. It also depends on whether you're interleaving memory across the backplane or whether you know enough about tweaking your app (and the hardware allows it) to run your memory locally. Soon to be 1 TB of RAM, and there's been a box in Redmond that's been running 128p now for some time (here's a hint, kiddies, it's not a Unisys). BTW, how many procs can a Unisys support in a single domain?
To be efficient, the processors would need gigantic caches, to keep the load on the rest of the system down. Either that, or you COULD run the CPUs out of step over a bus that is 64 times faster than normal. I'd hate to be the person designing such a system, though.
Now, this system could be of extreme interest in the supercomputer world. One of the biggest complaints about clustering is the poor interconnects. This would seem to get round that problem. A Blue Gene-style cluster where each node is a 64-way SMP board, and you're running a few thousand nodes, would likely be an order of magnitude faster than anything currently on the supercomputer charts.
On the other hand, do we need to know what the weather is not going to be, ten times as often?
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Imagine a Beo-- oh wait...
you had me at #!
... use 10 faster processors.
Yeah but how fast does the 128-processor beast from Redmond compile the Linux kernel?
I gots ta ding a ding dang my dang a long ling long
Imagine a Beowulf cluster of those!
Ahh lick my balls. I'll believe you when I see the benchmark results.
OOooh, Redmond has a crappy 128-way now, do they? Pity Linux has been running on 512-ways IN PRODUCTION for the past year, and SGI are looking up to 1024 and 2048 way.
Windows is a heap of shit.
While FreeBSD is a great OS/kernel, it doesn't scale as well as Linux, end of story.
Well I hope the article is wrong concerning how long it took to compile that kernel using a single processor Itanium 2...19 min?
Thu Jan 13 03:22:14 MST 2005
Thu Jan 13 03:41:52 MST 2005
make buildworld = 19.75 min
Tue Jan 18 21:32:08 MST 2005
Tue Jan 18 21:35:54 MST 2005
make buildkernel = 4 min
That is 25 min to completely rebuild FreeBSD 4.11 from source.
This is a P4 2.55 (no HTT) with only 1GB RAM and PATA disks running FreeBSD 4.11. It was running an X server and acting as a NAT router for my internal network, DNS server, web server and general purpose workstation (including SetiAtHome active).
It took my 6.0-Current (Sempron 2400+ 512MB/PATA) box 12 min for the kernel with ALL the debugging (aka WITNESS/INVARIANTS/DEBUGGERS) stuff in the compiling kernel.
Even a single processor Itanium 2 should have blown EITHER of my two boxes away.
Maybe they should concentrate on getting good performance from a single processor (which is way more common) before adding more CPUs (walk before running???).
BWP
Are they 512-way single image systems? If so, that is pretty impressive!
If Tyranny and Oppression come to this land,
it will be in the guise of fighting a foreign enemy. -James Madison
Looks like someone was up to those challenges, eh? 64-processor support *and* 64-bit support. Awesome news.
I have no special gift, I am only passionately curious. --Albert Einstein
Yes. From the link:
Brooks and his team instead pointed to Kalpana, an Intel® Itanium® 2-based, 512-processor SGI® Altix® 3000 system in use at NASA Ames since November 2003 and named to honor Kalpana Chawla, a NASA scientist lost in the Columbia accident.. In less than six months, Taft says, the Kalpana system - the first 512-processor Linux® system ever to operate under a single Linux kernel - had revolutionized the rate of scientific discovery at NASA for a number of disciplines. On NASA's previous supercomputers, simulations showing five years worth of changes in ocean temperatures and sea levels were taking 12 months to model. But on the SGI® Altix® system, scientists could simulate decades of ocean circulation in just days, while producing simulations in greater detail than ever before. And the time required to assess flight characteristics of an aircraft design, which involves thousands of complex calculations, dropped from years to a single day. "That kind of leap is incredible," says Taft. "What took a year on the best computing technology previously available, we could now accomplish in days on the Altix system."
Smaller, say 4 or 8 way NUMA boards, that are within the means of the average geek?
I'm not talking about mere mortal SMP systems, I want all the crazy memory partitioning and whatnot.
I don't need no instructions to know how to rock!!!!
Comparing the Itanium compile times to anything is just stupid. I can compile my Linux kernel on a P4 or AMD _much_ faster than 19 minutes.
If Tyranny and Oppression come to this land,
it will be in the guise of fighting a foreign enemy. -James Madison
You seem to forget that the enterprise users who fund development on big machines are usually the ones supporting the entire projects you use.
:)
Between the kernel, your latest DBMS, etc, lots of companies fund the dollars to these projects (or the man hours).
Nonetheless- write one
when you see the word 'Linux', drink!
Someone wasn't awake when their Comp Sci class covered Amdahl's Law. Or the Dining Philosophers Problem. Or vector processing. Or networking. Or the parallelization problem. Or...
Actually, the troll can be made to serve a useful purpose, because there are probably a lot of people who read Slashdot who didn't do Comp Sci.
Part of the problem with parallelization is that not all problems can be divided up that way. If one man takes 60 seconds to dig a posthole, how long would it take 60 men to dig a single posthole? Answer - 60 seconds. Exactly the same amount of time is spent, because only one person can be digging the posthole at a time. Having more people doesn't help.
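The posthole is the extreme case of Amdahl's Law: the part of the job that cannot be split caps the speedup, however many workers you add. Here is a minimal sketch in Python, assuming a fixed-size job whose parallel part scales perfectly. The function names are made up for illustration, and the 26x-on-64-CPUs figure is the kernel compile result quoted elsewhere in this thread.

```python
def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
    """Amdahl's Law: speedup of a fixed job when only part of it parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


def parallel_fraction_for(speedup: float, n_workers: int) -> float:
    """Invert Amdahl's Law: the parallel fraction implied by an observed speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_workers)


if __name__ == "__main__":
    # One posthole, 60 men: nothing can be shared, so no speedup at all.
    print(amdahl_speedup(parallel_fraction=0.0, n_workers=60))

    # A 26x speedup on 64 CPUs, under this model, implies that only a small
    # slice of the single-CPU run was serial.
    p = parallel_fraction_for(speedup=26, n_workers=64)
    print(f"parallel fraction ~ {p:.3f}, serial fraction ~ {1 - p:.3f}")
```

Under those assumptions, a serial slice of only a couple of percent is already enough to hold 64 processors down to 26x.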
Another part of the problem is sharing resources. Let's say you have some computer memory that can respond to a read operation in one clock cycle. Let's also say that the computer program never reads from memory. (Very unlikely.) The first processor fetches an instruction (which is a read operation) and then executes it. The second processor can't do anything while the first one is reading, so has to wait until it has finished with that part, before it can do a read of its own.
If the instruction takes 1 clock cycle to execute, then the first processor will be ready after the second one has performed its fetch. In which case, you will be running the memory flat-out with just 2 processors. Any more than that, and the system will actually slow down, because the processors will have to wait.
Likewise, if the average time to run an instruction is N clock cycles, you will (on average) be able to have N+1 processors, before the memory is maxed out.
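The N+1 figure can be checked with a toy model. This is only a sketch under the same simplifying assumptions as above: one shared memory, one fetch per instruction, no caches, and no cost for arbitration. The function and parameter names are invented for the example.

```python
def instructions_per_cycle(n_cpus: int, exec_cycles: int, fetch_cycles: int = 1) -> float:
    """Aggregate throughput when every instruction fetch goes through one shared memory."""
    # Each CPU alone completes one instruction per (fetch + execute) cycles.
    per_cpu_rate = 1.0 / (fetch_cycles + exec_cycles)
    # The memory can serve at most one fetch every fetch_cycles cycles.
    memory_limit = 1.0 / fetch_cycles
    return min(n_cpus * per_cpu_rate, memory_limit)


def saturation_point(exec_cycles: int, fetch_cycles: int = 1) -> float:
    """Number of CPUs at which the shared memory is busy on every cycle."""
    return (fetch_cycles + exec_cycles) / fetch_cycles


if __name__ == "__main__":
    for cpus in (1, 2, 4, 64):
        print(cpus, instructions_per_cycle(cpus, exec_cycles=1))
    print(saturation_point(exec_cycles=1))  # 2 CPUs when N = 1
```

In this idealized version throughput just goes flat past N+1 processors. Getting an actual slowdown would need contention overhead that the model deliberately leaves out.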
In practice, processors run about an order of magnitude faster than RAM, which is why modern systems have lots of L1 and L2 cache (and sometimes L3), pipelining, etc. These are all tricks to try and access the somewhat slower main memory as little as possible.
Also in practice, programmers try to avoid "expensive" (in terms of clock cycles) operations because you can generally get the same results faster by other means. (That's why RISC technology became popular - make the fast operations faster, rather than adding stuff that people will try to avoid.)
In consequence, sharing resources is a very difficult problem. It is not the only problem that many-way systems face, though. If you have N processors, there are !N possible ways for those processors to communicate. In this case, it would be !64 (64x63x62x...x2x1), which is a horribly large number. You couldn't have one link per pathway, for example, which means you've got to share links, which means you've got to have some damn good scheduling and routing mechanisms. Even then, with limited resources, you can only have so many processors talking at a time, before you are overwhelmed. Which means that "chatty" problems will involve a lot of processors spending a lot of time simply waiting for their turn to chat.
(This goes back to why people generally build clusters, rather than many-way SMP systems, and why high-end clusters use the fastest networking technology on the planet. Clustering is easy. Getting the communication speeds up is the problem. Getting communication speeds to the point of being useful for scientific applications is a very complex, expensive problem. Which is the main reason Mr. Cray charged more than Mr. Dell for his computers - and why people would pay it.)
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Ahh but compiling code for ia64 is pretty resource intensive, especially with gcc.
Don't worry about Linux performance though, back in the day when the IBM guys were getting excited about Linux scalability, they got their kernel compiles down to under 4 seconds on a 32 way POWER4. I wonder what the 64-way POWER5s could do it in. (probably not much better, as most of that few seconds was the final link phase).
but then why does it require 6 computer scientists to change a light bulb?
Using Itanium 2 CPUs just like the Superdomes... how is this new news?
||| I still can't believe Parkay's not butter.
They did try Windows Server 2003 on a 64-way machine, but the kernel got scared and hid under the disk controller.
-- Microsoft is the most expensive commodity operating system and office suite vendor in the marketplace.
Umm, no. The Itanium sucks at these kinds of tasks due to a long pipe line. Read this post for more info.
But a 5 times speed increase for me running a machine with a load with ATA disks?
If the pipeline clears/stalls are that bad (even with their massive L3 cache (1.5MB to 9MB)), it looks like the Itaniums are really only good for number crunching and not much else.
BWP
It's a violation of the Microsoft License to compile the Linux kernel on Microsoft Visual C++.
"What's the frequency Kenneth?"
I was going to comment on this "continued to double" . It doesn't "continue to double." That would be 5*2^16, or exponential growth.
This is 5*16. They said it wrong.
Mmmm...a 128 CPU spam zombie...
-- Microsoft is the most expensive commodity operating system and office suite vendor in the marketplace.
Because it's a hardware problem.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
If Tyranny and Oppression come to this land,
it will be in the guise of fighting a foreign enemy. -James Madison
I guess the kernel compile system could use some work if this is a really important benchmark of system performance.
MPEG4 to DiVx
I don't think that part is really CPU intensive
I'm so confused. Itanium bad. Linux kernel scalability good. Help!
---
Posted as me for the negative karma whoring.
True, you can build very large clusters from these bricks, but the bricks themselves don't scale beyond a relatively small number of CPUs.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Linux scaling to 512 processors: http://www.sgi.com/features/2004/oct/columbia/
The story should be HP has finally caught up to where SGI were 2 years ago.
There is folly and foolishness on the one side, and daring and calculation on the other. - Admiral Pellew, Hornblower
PF isn't really any better than iptables as far as I know. Lots of (openbsd) people like the syntax better... but if you can't handle iptables syntax, you shouldn't be administering a complex firewall in the first place.
Support for today's problems and the future DRM problems of tomorrow.
It also doesn't avoid the main point, which is that any given resource can only be used by one CPU at a time. If processor A on brick B is passing data along wire C, then wire C cannot be handling traffic for any other processor at the same time. That resource is claimed, for that time.
While it's true that you can only send one signal down a wire at "a time" (absent weird frequency stuff, although the wires are bidirectional, so you can really send two signals), "a time" in these systems is on the order of nanoseconds. So while only one CPU can use a wire in any given nanosecond, hundreds of CPUs can use the same wire within the same millisecond, which is close enough to "at the same time" to work as "at the same time", so you can have multiple streams of traffic using the same physical connection.
The only resource a CPU locks on is an exclusively owned (writable) cache line. CPUs share access to I/O space, and share access to cache lines that are read-only. CPUs can talk to "local" memory (on the same node) or memory on a node on the opposite side of the system, in an identical manner except for access latency (i.e. the address for a particular piece of memory is the same no matter which CPU is addressing it).
How does how many CPUs are in a brick have anything to do with whether it's an N-Way SMP system? A brick is just a physical box. The interconnect that connects the processors together extends over multiple bricks. The bricks just provide modularization - you could put all 64 CPUs in one brick if you wanted to, but the only difference would be cosmetic (additional pieces of metal between boards).
Do you really think anyone is building single boxes with 512 processors in them? These things come in *RACKS*.
To maximize resources to the absolute limit, you'd need a completely asynchronous computer. Such computers exist, sure, but they're usually very specialized and I know of none that are superscalar.
I'm not sure of the state-of-the-art for massively parallel asynchronous CPUs, but my guess is that they're nowhere near the same level as more traditional synchronous designs.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
My kernel only goes up to 11.
How'd you get a three processor system? Is it a quad board, discounted heavily because one socket was broken? That'd be neat, where'd you get it?
Infuriate left and right
Way to quote a fucking article from a year ago, douchenozzle...
Where do you think all this NUMA awareness came from? Sequent Engineers, that's where. Where do you think they are now?
I was under the impression that enterprise applications were normally limited by the speed of the hard-drive and RAM, applications like webserving and database management.
You see that brine there? That's my brine.
a kernel compile using a single Itanium 2 processor took about 19 minutes
And a kernel compile on a four way PPro 200 MHz took about two and a half minutes. Ok, that was a 2.2 kernel, where they probably used a 2.6 kernel, so that may account for a bit of the extra time, but still, 19 minutes? No wonder they need a 64-way box to make Itaniums do anything serious.
Just a little note: A version of bzip does exist that scales linearly on SMP machines - you can find it here.
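For anyone wondering how block compression can be spread over CPUs at all, here is a rough sketch of the general idea in Python. It is not how the linked tool is implemented, just an illustration: split the input into chunks, compress each chunk as its own bzip2 stream in a worker, and concatenate the streams. The function name, chunk size and worker count are arbitrary choices for the example, and it assumes (as far as I know) that CPython's `bz2` compression releases the GIL so threads can overlap.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def compress_chunks(data: bytes, chunk_size: int = 900_000, workers: int = 4) -> bytes:
    """Compress independent chunks in parallel and concatenate the bzip2 streams."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the streams line up with the chunks.
        streams = list(pool.map(bz2.compress, chunks))
    return b"".join(streams)


if __name__ == "__main__":
    payload = b"linux kernel source " * 200_000
    packed = compress_chunks(payload, chunk_size=500_000)
    # bz2.decompress accepts several concatenated streams.
    assert bz2.decompress(packed) == payload
    print(len(payload), "->", len(packed))
```

The trade-off is a slightly worse compression ratio, since each chunk is compressed without seeing the others.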
I take it you've never seen This Is Spinal Tap.
Get Firestarter. It's a GUI for iptables. The best thing to do is figure out which ports need to be blocked and write a bash script so iptables can block those, allow others, etc. Instant firewall, assuming you won't change it much (home use).
The fucking news and the fucking article itself are misleading.
> A 64-way system may or may not be useful. It depends on the speed of the interconnects, and the way it handles bus locking.
Of course it IS useful. It is great for database consolidation (especially for SQL Server which practically doesn't scale horizontally), for example, as upgrades can be done in minutes and the whole goddamned thing is as stable as an Intel box can be.
And in case you missed what the FA said, they did NOT run an OS on 64 CPUs (that's why it's bullshit and misleading) but they partitioned those 64 CPUs into 16 four-way servers. But hey - this is Slashdot and any Linux related hype is welcome....
> So, sure, there are people who could use such a system, but I cannot imagine many of them are in the market.
Sorry, pal, but HP sold $1b of such boxes in 2004. Manufacturing, telcos, utilities and many other users need "boxen" like these. I think they're slightly more suitable for Windows because of the way it can "add" (allocate, actually) processors to Exchange and SQL Server systems.
Really? I would have thought that the compilation of loads and loads of .c files is exactly the sort of thing that could be shared among processors. It certainly has been on projects that I've worked on.
make -j (num of processors) ?
The interconnects needed are not 64! (64x63x...x2x1).
They are only 63+...+1 = (63*64)/2 = 32*63 = 2016.
The first must connect to the other 63, but the second has to connect to only 62 (it is already connected to the first), the third to only 61, and so on...
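A quick way to check both numbers, sketched in Python with only the standard library. The function name is made up for the example, and it assumes one dedicated link per pair of processors.

```python
from math import comb, factorial


def full_mesh_links(n_cpus: int) -> int:
    """Dedicated point-to-point links needed so every pair of CPUs is connected."""
    return comb(n_cpus, 2)  # same as n * (n - 1) / 2


if __name__ == "__main__":
    n = 64
    # The sum 63 + 62 + ... + 1 and the closed form agree.
    print(sum(range(1, n)), full_mesh_links(n))
    # 64 factorial counts orderings of the CPUs, not links between them,
    # and is vastly larger.
    print(len(str(factorial(n))), "digits in 64!")
```

So the count grows with the square of the number of processors, not factorially.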
In consequence, sharing resources is a very difficult problem. It is not the only problem that many-way systems face, though. If you have N processors, there are !N possible ways for those processors to communicate. In this case, it would be !64 (64x63x62x...x2x1), which is a horribly large number. You couldn't have one link per pathway, for example, which means you've got to share links
Ever heard of a crossbar switch, Einstein?
A 64 node system could be connected using 64 links, and a 64x64 crossbar switch. There is no benefit to anything higher.
Your figure of 64 factorial (it is 64!, by the way, not !64) is ridiculous, and is nothing more than a wild guess.
Not a single one of which was a standard benchmark, which leads me to believe that they were manufactured to be linear. Woo hoo.
7 November 2006: The day Americans realized corruption and incompetence weren't addressing 11 September 2001
Good gosh, slashdot is really going to pieces. Two people explain Spinal Tap to me, another comes up with a possibly real, possibly tongue-in-cheek answer, and, worst of all, someone mods me up as "insightful". What do I have to do, add footnotes and explanations?
I guess the two Spinal Tap explainers never heard the joke about only 10 people in the world. No, I'm not going to explain that.
This is pathetic. Insightful, jeez. Now watch someone mod this as flamebait or funny.
Infuriate left and right
Did I read that correctly, they've got Linux working way good on a c-64?? ;)
STREAM is not a real world app. It's just a massively parallel vector copy/sum/dot-product. Every SMP kernel in the world (even the older Linuxes) should be able to scale STREAM perfectly well.
The HPL results are more impressive, but keep in mind that linear equation solving code has advanced quite considerably (i.e., it tends to behave a lot like STREAM) to the point that it's not very limited by kernel behavior.
Do you know what an ethernet switch is? And why it's better than a hub? You're assuming that the resource management on these systems works like a hub. It doesn't - it works like a switch. The *ONLY* place that a CPU shares resources with other CPUs is on the processor bus - 2-4 CPUs share that, *EXACTLY* the same as in a cluster of Xeons or any other dual-CPU box. Once you get past the processor bus, everything is buffered. The CPU sends out whatever data request it has and off it goes. The interconnect takes care of making sure the wires are used appropriately, the CPU doesn't have to worry about it.
Now, yes, it's possible that a CPU needs something from memory or IO and it has to wait for it to come back, but EXACTLY the same thing would happen in a CPU in a cluster as well.
You very simply have absolutely no clue what you're talking about - a node in one of these huge systems functions pretty much identically to one box in a cluster - it is architecturally the same. You don't add processors by sticking them on the same processor bus, you add processors by adding more nodes, each with their own memory and IO, and having a REALLY FAST interconnect between them, and an OS where everything is one system image.
Cluster: Many distinct computers networked together. Supercomputer: Many distinct nodes networked into one computer.
the long pipeline
You call a 7-stage pipeline long ? You don't know anything about Itanium, do you ?
Like I've said, I've used transputers. Let me know when you find a distinct node in an array. You can't? Oh dear.
Seymour Cray, for many years, resisted multi-processor computers. Most of his designs were monolithic, on the grounds that a good design doesn't need to be MP. I guess that means that his designs weren't supercomputers, then. No? They were? Oh.
I guess that the only conclusion is the one in the Princess Bride - "You keep using that word. I do not think it means what you think it means".
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
I find it interesting how well developed this is. I mean, how many Linux coders actually have access to such hardware for testing/development purposes? Many of the larger projects can have a huge base of devs from within the userbase supplying patches/fixes/upgrades. I'm guessing that the userbase for the system described isn't very large (much less so for those able to muck with running kernels on such).
Or perhaps most of it just scales up very nicely from smaller systems?
"Well, now I'm unemployed just like you all"
+1 insightful