Japan's Newest Linux Supercluster: 13TB RAM
green pizza writes "Following its sale of a 10240 processor cluster to NASA, Silicon Graphics Inc has announced that it's supplying a 2048 processor Altix 3700 Bx2 to the Japan Atomic Energy Research Institute. Aside from running Linux on Itanium2 processors, the beast also features 13 TB of RAM!"
I guess that'll be enough to run Longhorn then.
I remember back in my electronics course when we had to design the flip-flop grid for memory... the teacher said he'd give 100% to anyone that could draw out 64K of memory... 13TB just makes me cringe...
---
Programming is like sex... Make one mistake and support it the rest of your life.
I agree. These rather wasteful supercomputers are getting less and less impressive.
You know what would be impressive? Published results!
I mean they consume gobs of resources [power, material, waste]. That's not impressive. That's an American city block. What would be impressive is having something to show for it at the end of the day.
Tom
Someday, I'll have a real sig.
Do all processors share 13TB? Because if they don't, the bottleneck is that subprocesses have only 13TB/1024 available (a mere 13GB each), and still have to communicate a lot.
In theory there is no difference between theory and practice. In practice there is. - Yogi Berra
I hear this is the recommended base configuration for Windows Longhorn...
it's 13TB, not 3TB. Which is according to the article: "over 13 terabytes of memory - the world's largest memory capacity"
Szo
Red Leader Standing By!
So it says 13TB of RAM... that's a cool 6.5 GB per processor. Beats anything I run :)
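For anyone who wants to check that figure, here is a quick sketch in Python. It assumes 1 TB is counted as 1024 GB, which is what makes 6.5 come out exactly; the function name is made up for illustration. Keep in mind this is only an average share, since the article describes the memory as shared rather than carved up per CPU.

```python
# Back-of-the-envelope check of the per-processor figure quoted above.
# Assumption: 1 TB is counted as 1024 GB (binary units).
GB_PER_TB = 1024

def gb_per_unit(total_tb, units):
    """Return the memory share in GB if total_tb were split evenly across units."""
    return total_tb * GB_PER_TB / units

per_cpu = gb_per_unit(13, 2048)
print(f"{per_cpu:.2f} GB per processor")
```

Swapping 2048 for a node count instead of a CPU count gives the per-machine share, which is a different number on a system with many CPUs per machine.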
The puter will be used for nuclear research (bushspeak: nucjular reesatch) by the Japan Atomic Energy Research Institute. More info about the organisation, their projects, etc. can be found at: http://www.jaeri.go.jp/english/index.cgi.
SIG: TAKE OFF EVERY 'CAPTAIN'!!
2048 processors, 13 terabytes of ram, AND it comes with a smaller, more ergonomic controller.
"If you think you have things under control, you're not going fast enough." --Mario Andretti
So an American company is selling a computer to a Japanese organization that is ideal for simulating nuclear explosions. Interesting.
I'm wrong and so are you.
I know you're trying to be humorous, but it raises an interesting question: is this thing faster than the Big Mac?
-- james
I think the line should be: now that's one impressive Beowulf cluster... In Japan
A whopping sale of 2048 Itanium2 processors in one shot - is this the BIGGEST sale for the Itanium2 chip, so far?
Muchas Gracias, Señor Edward Snowden !
Haven't we had enough rudeness the last four years? I happen to be pleased by most of those results (though not, for example, that anyone still uses Windows). But you're a cowardly troll for anonymously posting such off-topic flamebait. - Get some stones and at least use a pseudonym - Stay on topic - Avoid calling people names like "Eurotrash" - In short, show a little class
sigs, as if you care.
I hear gobs about huge clusters and Linux as the OS that makes it all happen but I don't think I've ever heard of other OSes used like this.
Could someone make an "off the top of their head" list of supercomputing clusters and the OSes that are used in them?
I *am* a Linux user and I'm actually kinda curious if Microsoft has an answer to this area of computing?
serious response to funny comment:
The deal is that the Itanium2's are better (relatively speaking) processors when everything is compiled for them. The hitch is that in terms of price for performance Itaniums are near the bottom of the pile (highest performance != best value).
Finally, in this situation (price be damned), there is not any reason to worry about value, just performance. Thus Itanium wins.
-nB
whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
I do wonder why they went with the Itaniums. Perhaps Intel is having an "All 64-bit chips must GO!!" clearance or something.
is this thing faster than the Big Mac?
Interesting question. Especially considering they have roughly the same number of processors...
But from the article I get the idea the SGI is kind of... less clustered. It seems to share its memory while on the Big Mac, each G5 computer has its "private" 4 GB of memory.
I don't need a signature.
But isn't Itanium kinda evil (as opposed to slashdot darlings PPC/Power and Opteron)?
While Linux is super cool? So should I like it?
Nonono, on Slashdot Itanium is the best thing since sliced bread. And if it isn't... I'LL MAKE IT!!
SGI has been working through this in hardware for over 10 years.
The distributed shared memory concept of the Altix (first seen on Origin 200 / Origin 2000 in the commercial space, and previously based on the Stanford DASH/FLASH projects) uses a hardware-based memory router.
Each PE has local RAM and local CPUs and a "MAGIC" chip that routes cache invalidations, memory block "ownership", etc. messages to other PEs as necessary. Unlike SMP designs, cache coherency doesn't destroy the whole shebang because it's not a shared bus, it's a hierarchical directory system. I.e. PE0 knows it only needs to contact PE3, PE6, and PE13 to invalidate a cache block. Turns out that that's much more efficient than broadcasting a message to PE0-PE63 saying "invalidate this block!"
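To make the directory idea concrete, here's a toy sketch in Python. It is not how the actual hardware works internally: the Directory class, its method names, and the block address are made up for illustration, and a real directory tracks state per cache line in silicon. It only shows why a targeted invalidation touches far fewer PEs than a broadcast.

```python
class Directory:
    """Toy directory: maps a memory block to the set of PEs caching it."""

    def __init__(self, num_pes):
        self.num_pes = num_pes
        self.sharers = {}  # block id -> set of PE ids holding a cached copy

    def record_read(self, block, pe):
        """Note that a PE now holds a cached copy of the block."""
        self.sharers.setdefault(block, set()).add(pe)

    def invalidate(self, block, writer):
        """Return the PEs that must drop the block before writer may write it."""
        targets = self.sharers.get(block, set()) - {writer}
        self.sharers[block] = {writer}  # writer becomes the sole holder
        return sorted(targets)

    def broadcast_targets(self, writer):
        """What a broadcast scheme would contact: every PE except the writer."""
        return [pe for pe in range(self.num_pes) if pe != writer]


d = Directory(num_pes=64)
for pe in (0, 3, 6, 13):
    d.record_read(block=0x40, pe=pe)

print(d.invalidate(block=0x40, writer=0))   # [3, 6, 13]
print(len(d.broadcast_targets(writer=0)))   # 63
```

Three messages instead of 63 for this one write, and the gap grows with the PE count.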
Now, as far as _all_ processors sharing the full 13TB - I am not sure.
The memory density / system image equation is sort of a tradeoff, as more PE's require more router hops in the topology. More router hops increase latency. SGI has sold 256 and 512p single-image systems, and may have gone up to 1024 or 2048p / system.
To be perfectly honest, the system-system latency is different than the intra-system latency, but nothing like it would be on an x86-with-ethernet shared nothing cluster.
SGI's big installations are cool as they have advantages of both SMP and MPP designs... each autonomous machine gives you single-image benefits but with really high proc counts... and then you link a bunch of those together to get this outrageously sized machine.
My opinions are my own, and do not necessarily represent those of my employer.
13TB of RAM, but how much swap?
The more advanced the technology, the more open it is to primitive attack
Sorry to spoil the excitement for everybody but actually, Columbia far exceeds the Japanese system's memory capacity at 20 TByte. See this description for details of Columbia's config.
You don't suppose they ever do any weapons research, do you? Hmmm, what to do...
sigs, as if you care.
Actually, that'd be 13TB/20480, not 13TB/1024.
:)
so, 0.000634765625 TB's per machine... too lazy to do it properly right now
feh. stuff.
isn't it recommended you have 2x ram as your swap? so that'd be *does difficult calculations in head* 26TB of swap. You really don't want the kernel killing off processes because you run out of ram....that'd be bad.
I call for a US export ban on Memory to protect the Homeland's national security.
Ow! Dr. Condoleezza just informed me they make Memory all by themselves, let's pre-emptively nuke 'em!
"The likes of Facebook and WhatsApp are free to those whose privacy is of zero value."
Wonder what linux they run, probably they need the RAM for something, so Damn Small Linux would be the right distro for them.
About JAERI
Devoted to comprehensive research on nuclear energy since 1956, JAERI challenges research and development in the realm of frontier science and engineering with focus on the realm of nuclear research and developments. Projects include the establishment of light-water reactor power generation technology in Japan through its endeavors including the success in Japan's first nuclear power generation and achievement of the leading and systematic research on nuclear safety. JAERI has also attained the world's foremost level of R&D in nuclear fusion and has applied radiation to the field of industry, agriculture and medicine, supported by extensive basic research to underscore the advancement of all its R&D activities. For additional information, visit www.jaeri.go.jp.
And you bit. So, I'm sorry to say:
YHBT. YHL. HAND.
I want to delete my account but Slashdot doesn't allow it.
pah... when I were a lad building my first machine... I had to gang nine 1Kbit chips together to make 1 Kbyte + parity... aye, they were the days... and you could cram a full chess playing program into that 1 Kbyte as well. A 4K ram expansion cost an arm and a leg... well it felt like that having to give up beer and ciggies for ages to scrape up the wonga...
Donald 'Duck' Dunn: We had a band powerful enough to turn goat piss into gasoline.
a Japanese Atomic Energy Research foundation would need that kind of computing power...Godzilla!
My other sig is extremely clever...
The Japan Nuclear research team who just acquired a 13TB RAM supercluster also gained a nuclear power plant to power this bad boy. Projections speak of a 2.5 hour battery life, although Limerick Power Plant has offered their Nuclear facility which will generate a whopping 5.5 hour battery life span.
I mod down so you can mod up. Your welcome.
You'll be a lot more effective if you keep your tone civil. In this forum, it's not what you say, it's how you say it that counts. The Slashdot crowd, whom you hope to influence, quits listening and clicks "flamebait" when you start calling names (except some comments directed at conservatives or ).
Sorry for the formatting of the first reply.
sigs, as if you care.
Reading through both links, I fail to see where it mentions that SGI & Intel *gave* the system to NASA for free.
The SGI press release http://www.sgi.com/company_info/newsroom/press_releases/2004/october/columbia.html mentions NASA having to put together a business case and justification for Congress and that normally means asking for funds.
Even if they did just give it away for the press (and I doubt it), when dealing with the gov't, the support contracts are separate. No one but SGI could properly support the system, so I'm willing to bet they got a fat support contract out of it.
-Charles
Learning HOW to think is more important than learning WHAT to think.
And the answer is: it depends on what you're doing with it.
This thing is significantly more tightly coupled than VT's cluster, and uses shared memory as opposed to clustering, so for a lot of tightly coupled problems it will be *far* more efficient.
As for raw processing power, the Itanium2 has the same theoretical peak floating point performance as a PPC970 at the same clock. In reality the Itanium is likely to come closer to achieving its peak than the PPC970 due to its massive cache (9MB compared to the 970's 512KB). However the Itaniums in an Altix3000 are only running at 1.6GHz according to SGI's page, while the 970s in VT's cluster are now at 2.3GHz. So the BigMac would have some advantage on loosely coupled problems that it can fit in its smaller cache and memory.
So while the BigMac might beat this system at Linpack, the benchmark used to determine the top500, in the domain this system is to be used for (3D modeling of nuclear blasts) its tighter coupling and greater RAM will make it much faster.
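The clock-speed point can be put in rough numbers with a small Python sketch. The figure of 4 floating point operations per cycle is my assumption (two fused multiply-add units per CPU); the post only says the two chips match at the same clock, so treat the absolute values as illustrative. The function name is hypothetical too.

```python
# Assumption: both chips retire 4 floating point ops per cycle at peak
# (two fused multiply-add units). The post only states they are equal.
FLOPS_PER_CYCLE = 4

def peak_gflops(clock_ghz, cpus=1, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in GFLOPS: clock * ops per cycle * CPU count."""
    return clock_ghz * flops_per_cycle * cpus

itanium2 = peak_gflops(1.6)   # clock quoted above for the Altix
ppc970 = peak_gflops(2.3)     # clock quoted above for VT's cluster

print(f"Itanium2 @ 1.6GHz: {itanium2:.1f} GFLOPS per CPU")
print(f"PPC970   @ 2.3GHz: {ppc970:.1f} GFLOPS per CPU")
print(f"PPC970 peak advantage: {ppc970 / itanium2:.2f}x")
```

Peak is only an upper bound. How close each machine gets to it depends on cache, memory and interconnect, which is the whole argument above.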
"The worst tyrannies were the ones where a governance required its own logic on every embedded node." - Vernor Vinge
But isn't saying that offensive to turkeys?
Stick Men
I don't dispute that. So are the other chips. And they are otherwise outselling the Itaniums by an order of magnitude. Intel has been quite sheepish about their 64-bit line & the performance/price value you get isn't that spectacular. I'm just wondering why Japan Atomic Energy Research Institute bucked the trend--not saying the Itaniums are actually bad.
Sorry, the gloves come off when Europeans dare to judge us unfairly just because our country (and Britain) has the cojones to oppose evil in the world. Just because you guys have given up trying to change the world for the better, doesn't mean we will.
I will not go into a discussion about the methods you use to better the world, but will share with you a consideration a lot of Europeans have about the US foreign policy: have you ever considered why some of these evils in the world don't turn to Europe, only to the U.S.?
Z
And I thought 640k memory was enough for everyone. Wait a minute, was it me or...?
This
Man this shit makes me feel old.
I worked on a machine that had 24k (that's 24,576) bytes of wire-wrapped core memory. At the time though I knew where RAM was trending. I had an Apple ][ with 32K of semiconductor memory.
I wrote a Pascal-like HLL compiler and a payroll system for the damn thing. In 24k bytes of memory.
What the [expletive deleted] do you DO with all those terabytes of high-speed RAM? Let's pretend something goes KABOOM!
I don't know whether to be wow-ed or depressed.
MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
The Itanium2 is a fast processor, especially when it comes to optimized floating-point calculations. Yes, it is expensive and so the price/performance ratio is not as good as that of common desktop processors, mostly for two reasons:
1. Large die area (mostly due to huge amounts of on-die cache) - chip price is directly related to how many dies fit on a silicon wafer.
2. The Itanium2 is a low volume product, so R&D and verification costs are a higher percentage of chip costs.
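Both points can be sketched with a rough Python model. Every number here (wafer size, wafer cost, die areas, yields, R&D total, volumes) is hypothetical and picked only to show the trend, not taken from Intel; the dies-per-wafer formula is a common textbook approximation, not anything specific to this chip.

```python
import math

def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Common approximation: wafer area / die area, minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(gross - edge_loss)

def cost_per_good_die(wafer_cost, wafer_diameter_mm, die_area_mm2, yield_fraction):
    """Point 1: wafer cost divided by the number of working dies."""
    good = dies_per_wafer(wafer_diameter_mm, die_area_mm2) * yield_fraction
    return wafer_cost / good

def with_rnd(die_cost, rnd_total, units_sold):
    """Point 2: spread a fixed R&D and verification bill over units sold."""
    return die_cost + rnd_total / units_sold

# Hypothetical inputs: 300 mm wafer at $5000, a small die vs a big one.
small = cost_per_good_die(5000, 300, die_area_mm2=100, yield_fraction=0.8)
big = cost_per_good_die(5000, 300, die_area_mm2=400, yield_fraction=0.5)
print(f"small die: ${small:.2f}, big die: ${big:.2f}")

# Hypothetical: same $1B R&D bill, low volume vs high volume.
print(f"low volume:  ${with_rnd(big, 1e9, 1e5):.2f} per chip")
print(f"high volume: ${with_rnd(small, 1e9, 1e8):.2f} per chip")
```

The big die loses twice: fewer candidates per wafer, and typically a lower fraction of them work. Low volume then piles the fixed costs on top.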
The biggest problem with the Itanium2 is not its performance, but the inability of Intel to lower its cost. This causes it to be relegated to niche markets like HPC where performance is everything.
I can't believe any of you didn't do a single doom 3 joke yet!
mhhh, I looked up some other (German) domains.
Not sure about the results. There are some huge fluctuations (~20x) in the data without obvious explanations. Maybe someone can explain how they generate those numbers:
..they should just run everything from a RAM disk.
Hope they didn't forget the $6.4 million startup cost and $2.2 million annual fee for linux licenses (assuming 8 CPU systems).
SGI has a layered approach to the max number of CPUs in a supercomputer.
I guess 256 is what they call "ultrastable" - kinda like the Linux kernel 2.2.
But the NASA monster already has 512 CPU machines, and who knows what the Japanese system has.
Apparently, SGI sells bigger systems to customers who "know what they are doing" and who work closer with SGI. If you want something that's 100% no-frills, then probably the 256 CPU is the current absolutely stable limit.
0.000634765625 TB's per machine
Then you're assuming each machine has just one CPU. That is not correct, and it's the biggest difference between SGI supercomputers and commodity clusters.
An SGI system has hundreds, if not thousands of CPUs per machine.
But yeah, it should be enough to display all my porn at once provided they can find a big enough monitor....
is this thing faster than the Big Mac?
Yes it is.
The Big Mac is just like any other commodity cluster. It's just a bunch of machines tied together in a closed network.
The SGI supercomputers keep all CPUs in a single machine, sharing all the memory over extremely fast, proprietary interconnects. In such a system, the CPUs talk to each other as fast as (if not faster than) the CPUs in a dual-CPU server.
Assuming the total CPU power is the same, the SGI supercomputer is faster than any cluster (Big Mac included) for all problems that are not 100% parallelizable. For those few problems whose algorithms are 100% parallel, an SGI system and a cluster will probably be equal.
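A toy model in Python shows the shape of that argument. It is a sketch under loud assumptions: the per-message costs for shared memory and for a cluster network are invented numbers, the function names are hypothetical, and real machines overlap communication with computation in ways this ignores.

```python
def run_time(work, parallel_fraction, cpus, comm_events, comm_cost):
    """Toy estimate: serial part + parallel part + communication overhead."""
    serial = work * (1.0 - parallel_fraction)
    parallel = work * parallel_fraction / cpus
    return serial + parallel + comm_events * comm_cost

# Hypothetical per-message costs in seconds, chosen only to show the trend.
SHARED_MEMORY_COST = 1e-6
CLUSTER_COST = 50e-6

def compare(comm_events, work=1000.0, parallel_fraction=1.0, cpus=2048):
    """Return (shared-memory time, cluster time) for equal CPU power."""
    shared = run_time(work, parallel_fraction, cpus, comm_events, SHARED_MEMORY_COST)
    cluster = run_time(work, parallel_fraction, cpus, comm_events, CLUSTER_COST)
    return shared, cluster

for events in (0, 10_000, 1_000_000):
    shared, cluster = compare(events)
    print(f"{events:>9} messages: shared {shared:.3f}s, cluster {cluster:.3f}s")
```

With zero messages (the 100% parallel case) the two come out identical. As soon as the CPUs have to talk to each other, the cheaper interconnect wins, and the gap scales with how chatty the problem is.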
...but not quite enough to hold my...friends'...entire pr0n collection in memory.
What's wrong with you people? "Itanium is bad" is an urban myth. Yes it's expensive. Yes, Itanium 1 was indeed bad. But this is Itanium2. It might still be expensive, but it's currently the best CPU for large supercomputers - machines which run hundreds of CPU in parallel, which is exactly what SGI does.
SGI is using the best tool for the job. When (or rather IF) AMD comes up with a better CPU for this kind of workload, they'll probably migrate to that.
Don't get me wrong, I'm using AMD on all my PCs, but a massively parallel supercomputer is a different thing.
These rather wasteful supercomputers are getting less and less impressive.
:-)
You know what would be impressive? Published results!
The results are already "published", just not explicitly.
The gas price is half of what it could be? That's because supercomputers such as those made by SGI are used to do simulations related to oil drilling and stuff.
The car prices are lower than they could be? That's because car crashes are simulated on supercomputers instead of actually being performed during the design process.
Nuclear weapons are more powerful than they could be? That's because... oh, wait.
It's a single-OS-image supercomputer.
:-)
But yeah, nice joke.
if this will help safety out at the power plants. Japan has a (comparatively) horrible safety record when it comes to nuclear power compared to Western Europe and the US....
Monstar L
... enough to run Longhorn ... ... compile Gentoo ... ... Beowulf cluster ...
I hear they will be using it to test the new release of After Dark. It will be running it 24 hours a day.
I never liked you
Obviously you didn't read one of my other posts. I agree Itanium2 is a fine enough chip. I just think it is expensive. I don't think it is a good value & I am used to seeing government-sponsored research pick a lower cost product over the top-of-the-line every time. In the US, many of the research institutes I know of are consciously choosing to make AMD or Apple clusters because of how far they can stretch their research dollar.
Europeans dare to judge us unfairly just because our country (and Britain) has the cojones to oppose evil in the world.
Noblesse oblige me to correct you, sir.
We dare to judge you just because (the government of) your country and (the government of) Britain are the cojones who claim to oppose "evil" in the world.
How can one actually have many Central Processing Units?
I mean, I know there are multiprocessor computers nowadays, but what is then central there?!
it sounds good, but can it run the Duke Nukem 3D Atomic Edition in 800x600 VESA mode?
The higher the technology, the sharper that two-edged sword.
Having been born in Japan, I believe a Beowulf cluster in Japan is more commonly called a "godzilla cluster". And having said that I'm wondering what that amount of money would cost in toto? Oh, BTW, this machine is really going to be used to create the next Godzilla movie, only animated. Sig? Ok. Sig.
If you can stay calm, while all around you is chaos... then you probably haven't completely understood the question.
2048 is nothing... They've got everyone beat with 10,240 processors!
http://github.com/gbook/nidb
http://christiancarling.com/snoopy.html This is what the British government needs
...have no influence in Japanese big business or government...or so Jon Lovitz told me.
Yes, I find it interesting also.
To be pedantic: the maximum physically addressable RAM of the Opterons is 1/8th of what the Itanium can address. Obviously I'm not advocating trying to use 8 AMD chips for each Intel chip! But neither are they maxing out the RAM that the machine can address. And RAM isn't everything. The UltraSPARCs can address TBs of RAM and have a larger cache. Few use those.
The "Big Mac" managed to perform quite well, despite the fact that the PowerPCs have the same RAM limitation as the AMD chips and a smaller cache.
And I give you a second chance to read my message. Did I advocate a specific chip other than the Itanium? No--I just expressed interest in why they chose it. And you can have another chance to do math in public. Even if I did think they should have chosen the Opteron, would the Opteron have been a poor choice? Who knows. I don't know what their demands are. But clusters aren't always limited by memory and cache restrictions.
Yup--I was completely wrong. Thanks for the crash course & smackdown. You've made it clear enough why they chose what they did.
Still, other systems do address large amounts of RAM. The ASCI-Q at Los Alamos has 33 TB (!) on alphas.
Still, other systems do address large amounts of RAM. The ASCI-Q at Los Alamos has 33 TB (!) on alphas.
But the ASCI-Q is a cluster, IIRC consisting of 2048 4-cpu SMP nodes. Thus each node only has 33 TB/2048, or about 16 GB of memory.
It might still be expensive, but it's currently the best CPU for large supercomputers
Except for, say, POWER5, and vector processors (NEC SX-6, SX-8, Cray X1). If by "best" you mean raw performance and bandwidth, cost and power consumption be damned.
SGI is using the best tool for the job.
Perhaps they are, perhaps not. That's not the issue. The thing is that a number of years ago (when AMD64 was barely a blip on the radar) they made a strategic commitment to IA-64. Spending vast amounts of money to port all their stuff to another architecture and royally pissing off customers and ISVs just for a modest improvement in performance or price/performance of the CPUs (which wouldn't matter much since the major cost of the Altix is the interconnect and NUMA stuff) doesn't sound like a cunning plan to me.
That is, the IA-64 doesn't have to be the absolutely fastest chip on the block, as long as it stays competitive. It makes no sense for SGI to switch architecture for a 10% performance gain.
You're right and I knew that. I figured that out when you made the first comment on addressable memory. The Altix is the machine with the largest amount of RAM that is globally addressable across all processors. That wasn't clear to me from reading the press release, but other reports were much more informative. My comment was really to say there is more than one way to skin a cat. Yes--it is an achievement to build a single 2048 processor beast. But I wanted to point out that fine supercomputers with a lot of RAM and a lot of processors were built without Itanium2s. I would have offered an AMD or PowerPC cluster with comparable RAM & outstanding performance, but I don't really know of any.
NASA's Columbia system cost $45M. SGI charges money for their products. This Japanese system is probably on the order of $10M.