Japan Builds World's Fastest Computer
claylikethemud writes "The New York Times reports that Japan has built the world's most powerful supercomputer from "640 specialized nodes that are in turn composed of 5,104" NEC processors. The machine boasts computing power equivalent to that of the 20 fastest American supercomputers combined and, with a top speed of 35.6 teraflops, outpaces the next fastest machine, the ASCI White Pacific, by more than a factor of five. Applications include climate modeling, global warming prediction, and other non-weapons research."
With all of the supercomputer posts on /. recently, I've seen a lot of talk about the various ASCI projects in the works by IBM and others. No one even mentioned this before. I'm glad to see that someone is building supercomputers for reasons other than nuclear weapons research though.
Interesting comment from the SJ Mercury
The accomplishment is also a dramatic statement of contrasting scientific and technology priorities in the United States and Japan. The Japanese machine was built to analyze climate change, including global warming, as well as weather and earthquake patterns. The United States has predominantly focused its efforts on building powerful computers for simulating weapons.
Also worth noting: the article mentions that the US gov't has blocked sales of these machines because it believes that NEC is "dumping" them on the US market, i.e. selling them below cost. Has there been any WTO action on these restrictions? Wouldn't this be a perfect test case for getting US trade restrictions struck down?
Don't you wonder why they bother? They're only going to have to destroy the thing when it sprouts purple tentacles and destroys Tokyo.
We all know that it's really used for Japan's top secret Super Ultra Omega Gundam Robot Mobile Suit 95006^10.
The supercomputer was built with 'the earth systems model' in mind. This will be the most ambitious computer model ever conceived. It aims to simulate every aspect of the earth's climate system, including more processes than ever before (atmospheric processes, ocean processes, land surface feedbacks and land use models, economic models, ice sheet models), at a higher resolution than ever before.
... ;-)
Predictably, the model is rumoured to be still 2 years off target, so the world's fastest computer is sitting idle in the meantime.
Perhaps I could buy some space to run my webpage off it in the meantime.
First, let me disclaim that I have never worked with multiprocessor systems, but this is /., so that usually means I'm an expert in the field.
I would imagine that the processors are made specifically for this application, not some off-the-shelf processor. Also, it is much easier to design/build a 5104 processor machine than a 1 million processor machine. Economy of scale doesn't apply here.
generates a login:
http://www.majcher.com/nytview.html
Non Weapon research??
Yeah right !
Uh.. from Chapter II, Article 9 of the Japanese constitution:
"Aspiring sincerely to an international peace based on justice and order, the Japanese people forever renounce war as a sovereign right of the nation and the threat or use of force as means of settling international disputes. 2) In order to accomplish the aim of the preceding paragraph, land, sea, and air forces, as well as other war potential, will never be maintained. The right of belligerency of the state will not be recognized."
The Japanese are only able to maintain a defensive force, not an army, so even if it was weapons research, it would only be for use in self defense.
slashdot!=valid HTML
...become a huge goddamned distributed-network-in-a-room?
How to make a liver capable of taking in 60 years of alcoholism?
An Education is the Font of All Liberty
I'm not a real expert but I have recently taken a high performance computing course from somebody who is an expert for my comp sci masters.
The basic problem of adding more and more processors is keeping all the memory in sync. If you have a process running across 50 CPUs, the machine needs to ensure that if one of them updates a variable, all the others work with the current value. (Ok, it's more complicated than that but I'm not writing a book here)
The solution is to write your system so that the calculations can run as independently as possible. However, at 100 million processors it probably just doesn't fit the problem space.
That which does not kill me only makes me whinier
Ahem, I think you meant to say 3.2 Million processors. Not Trillion unless the math from your Universe is different than the math from my Universe.
"sweet dreams are made of this..."
This is, of course, one reason why the post-war Japanese economy was so successful for most of the second half of the 20th century. Whilst we were pouring all available resources into 'defence' research, they were getting on with something a little more useful and productive.
It seems a largely successful strategy and it might be better if more countries were to consider it.
Japanese people are very anti-nuclear-weapons, which is not really a surprise given that they had two dropped on them. In fact they have sent letters of protest to the heads of every country that has tested nuclear weapons since 1965: hundreds of letters.
And on June 6th at 06:06 pm Pacific Standard Time it became self-aware.
"sweet dreams are made of this..."
Vintage computer games and RPG books available. Email me if you're interested.
Evil: "Back in the Sixties I had a weather changing machine that was in essence a sophisticated heat beam which we called a "laser." Using this laser, we punch a hole in the protective layer around the Earth, which we scientists call the "Ozone Layer." Slowly but surely, ultraviolet rays would pour in, increasing the risk of skin cancer. That is, unless the world pays us a hefty ransom."
Weather research my butt!
I really hate Dan Patrick.
The basic problem of adding more and more processors is keeping all the memory in sync.
That's why message passing is typically used instead of some sort of shared memory approach. You eliminate the synchronization problems as well as memory contention. After that, it's just a matter of keeping all the processors busy.
"We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
Applications include climate modeling [...] But virtually anything, and any knowledge, can be turned into "weapons".
You know you're right!
- scientist: our climate modeling indicates that if we start our weekly barbeque at exactly 6:17pm, a US weapons lab will be destroyed by a powerful tornado in 41 days.
- director: well let's start our barbeque at 6:17pm to see if you're right. Welcome to the 21st century, America! (insert maniacal laughter).
Just think, now that they have all of this processing power they can do some of the following:
1) Make a metal that looks like plastic. Handy for all of those rocket launches.
2) Genetically engineer large reptiles to guard their country from invaders.
3) One word: Gundam.
4) Launch theoretical bombs at ASCI White and see if they can finally win the technology war.
5) Create a fully aware computer program that will help guard us from ourselves.
6) Make a fully synthetic actor that can outact, say, Keanu Reeves. (Oh, sorry, that was the Thunderbirds).
What other possibilities can this thing hold?
"Giving money and power to governments is like giving whiskey and car keys to teenage boys." - P.J. O'Rourke
The last time I checked, Google had more than 10,000 servers. I realize these aren't tightly coupled, parallel processors, but it's still a massive machine. Is it 10,000 computers or one? I say for the purposes of comparison that it would beat the Japanese computer. If not now, then in a few months when Google's installation grows even larger. This piece struck me as a thinly veiled ploy to get more cash for some government computer lab.
Pictures here. so cool!
Oui.
My point is that it's a very _big_ number for beowulf clusters. The biggest ones have 8192 processors
Hmmm...I thought that Beowulf referred to separate computers and not processors. So five computers linked together are a Beowulf cluster of five, while a single computer with five processors is not a Beowulf cluster of anything but simply a multiprocessor computer. Or is this an incorrect interpretation?
As for the number of processors, I'm sure that they had to make up for the lack of processor speed with sheer numbers. Yuck!
"sweet dreams are made of this..."
They test one.
Ad infinitum while the world cringes in fear.
It's going to get ugly when Cuba starts hosting Japanese built systems.
[Okay. Lame joke. It sounded better before I typed it, but I'm too attached to the effort to not post. You're Welcome.]
-Fantastic Lad
America to the world community is like Microsoft to the business world; "We're here, and we don't want anybody else to be here, so play by our rules, or we'll smack you down."
Jeez, could you imagine a single one of those...
...we still operate under this 640 node barrier.
Wrong. Just plain wrong. Explicit message passing can often reduce communication overhead compared to coherent shared memory, but the synchronization problems are still very much present. You still can't operate on data before it becomes available, regardless of the programming model. Explicit message-passing systems handle synchronization very differently than shared-memory systems, but those problems don't just go away.
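A minimal sketch of that point in Python, using two threads and a `queue.Queue` as a stand-in for message passing (an assumption for illustration only; a real machine would use something like MPI between nodes). The names `producer`, `consumer`, and `channel` are hypothetical. The receiver still has to block until the data exists, so the synchronization has moved into the receive rather than gone away:

```python
import queue
import threading
import time


def producer(channel):
    """Pretend to compute for a while, then 'send' the result."""
    time.sleep(0.05)
    channel.put(42)


def consumer(channel, results):
    """'Receive' the value; get() blocks until the sender has produced it."""
    t0 = time.monotonic()
    value = channel.get()  # implicit synchronization point
    results["value"] = value
    results["waited"] = time.monotonic() - t0


channel = queue.Queue()
results = {}
c = threading.Thread(target=consumer, args=(channel, results))
p = threading.Thread(target=producer, args=(channel,))
c.start()
p.start()
c.join()
p.join()

print("received", results["value"], "after waiting", results["waited"], "seconds")
```

There is no shared variable to keep coherent here, but the consumer spends its time waiting on the producer all the same.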
Slashdot - News for Herds. Stuff that Splatters.
"All in all, it is a surprisingly large amount for a country that doesn't go into military actions. Who are they defending themselves from?"
Red China and North Korea, for starters (who both have nukes, BTW). They don't exactly have the friendliest of neighbors over there. They would be stupid not to have a good defensive force.
So long as "more" is "all". If one country doesn't renounce violence it can just take whatever it wants from the other countries. If several don't they could just split up the world...
Pocket? What's the point of playing Quake in your pocket? Or maybe I shouldn't ask...
640*5104==3.2M CPUs... so I can dedicate four CPUs to each pixel on a 1024x768 display, and get reasonable Quake performance without hardware acceleration? (-:
Got time? Spend some of it coding or testing
Finally, some new news (as opposed to "this is nothing new").
;)
Question - why is it that we JUST found out about this? How long did it take to build this giant supercomputer? Companies like IBM talk about what they're building long before they are done. Speaking of which, I guess IBM's Deep Blue is kinda underpowered now, relatively speaking.
One more thing - why all the hub-bub about US export restrictions re: computer power? If Japan already has this much computing power, who wants our "junk" anyway?
I lied - one more thing - does the NSA have penis envy over this? Or is their computer still faster?
It takes a few weeks on Sun UltraSPARCs to simulate a week-long air pollution scenario over the northeastern United States. This assumes an 8x8 km grid (where each 8x8 sq km area is one "point"). The wind modeling is extremely simplified, and the focus is on a select set of contaminants.
To do detailed wind modeling, use a finer resolution, and do some statistical analysis of different input conditions... suddenly we end up with requirements far beyond current computing power.
We can always come up with a problem that is more complex than we can solve using current computing power. That is a good pursuit.
S
I remember seeing this in a magazine a couple years back as a planned project.
Nice to see it working now.
Sorry, just couldn't resist. :)
This is great news really. With the supercomputers built for weapons research naturally people doing "normal" research will have problems getting access.
After all they don't want just anyone poking around and finding things they shouldn't.
But with non weapons research systems I can see academics from all over the world getting easier access and maybe something interesting can happen.
The whole name of a "Beowulf" cluster is misunderstood anyway.
A cluster is a cluster; there are many different kinds. There is no one kind called a Beowulf.
Beowulf was the name of a project at NASA that was building clusters out of cheap computers. So I guess any cluster built out of cheap computers is a Beowulf.
Beyond that, there is no set standard for how a "Beowulf" cluster operates. They all use different libraries, different cabling, etc. Some use PVM. Some just use mosix. Some use other stuff. Etc.
Eliza did that several years ago.
The Japanese computer has MILLIONS of processors. Google doesn't even come close to 1/100th of the size.
Google can NOT do 35.6 TERAFLOPS.
The Japanese computer is bigger than the top 10 US supercomputers combined. Do you mean to say Google is bigger than that?
And btw, this project has been in the works for years. I remember reading about it in some science magazine 3 or 4 years ago, when they started the project.
By some rough statistics that I remember, it could render the original Toy Story completely in 30 hours easily. It might not even take that long. Rendering a whole movie in a day, Edwin Catmull would be proud.
This Wiki Feeds You TV and Anime - vidwiki.org
5) Emulate Grand Theft Auto 3
WOPR
"Would you like a nice game of chess?"
----- Whats wrong with this picture? http://www.revoh.org:1234/whatswrong
but does it open the pod doors when asked to? :-)
Fortran always used to be the dominant language for vector code, but C is used as well. (I've been away from this game for a couple of years, but I did spend the best part of 10 years of my professional life "vectorising" code.)
The real trick to Vector code is to work with the memory subsystem and not have it permanently trying to catch up with you as in normal processor style.
You can pretty well put as many floating point units in a modern CPU as you want; the problem is feeding them with data to operate on and storing the results. Current microprocessors use multilevel caches to try to keep what they hope are useful subsets of main memory close to the CPU. Trouble is, if you are scanning hundreds of GB of data in a weather model there may be virtually no useful small subsets.
For vector processing you design a system where, before you actually fire off any calcs, you give the memory system a list of the next 64/128/4096(varies) addresses you plan to use. It may take a little while to get the first one but after that they arrive at 1 per clock per memory pipe and, depending on the number of memory pipes you use you can actually drive your floating point units full speed.
Because you want to process streams of memory addresses as a single op (vectors) you spend all your time looking for loops where each iteration can be calculated independent of the next and where the compiler can be sure of that with no ambiguity. That tends to mean no subroutine calls, anything a(i)=f(a(i-1)) is bad but a(i)=f(a(i+1)) is fine and even a(i)=f(a(i-65)) can be OK depending on vector register length. You then get into CIGS (compressed index gather scatter) ops like a(i)=b(c(i)) and you can work with that sometimes etc.
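A rough way to see those dependence rules, sketched in plain Python rather than Fortran. Everything here is a hypothetical illustration: `f` is an arbitrary element operation, `vlen` stands in for the vector register length, and `strip_mined` imitates a vector unit by loading a whole strip of operands before storing any results. The loop is safe to vectorise exactly when the strip-wise result matches the one-element-at-a-time loop:

```python
def f(x):
    # Hypothetical per-element operation; any pure function would do.
    return 2 * x + 1


def scalar_loop(a, offset):
    """Reference semantics: a[i] = f(a[i + offset]), one element at a time."""
    a = list(a)
    lo = max(0, -offset)
    hi = len(a) - max(0, offset)
    for i in range(lo, hi):
        a[i] = f(a[i + offset])
    return a


def strip_mined(a, offset, vlen):
    """Vector-style semantics: per strip, load vlen operands, then store."""
    a = list(a)
    lo = max(0, -offset)
    hi = len(a) - max(0, offset)
    for start in range(lo, hi, vlen):
        stop = min(start + vlen, hi)
        loaded = [a[i + offset] for i in range(start, stop)]  # vector load
        a[start:stop] = [f(x) for x in loaded]                # vector store
    return a


def gather(b, c):
    """Index gather: a[i] = b[c[i]]."""
    return [b[j] for j in c]


data = list(range(200))
forward_ok = strip_mined(data, +1, 64) == scalar_loop(data, +1)     # a(i)=f(a(i+1))
backward_ok = strip_mined(data, -1, 64) == scalar_loop(data, -1)    # a(i)=f(a(i-1))
far_back_ok = strip_mined(data, -65, 64) == scalar_loop(data, -65)  # a(i)=f(a(i-65))
too_long = strip_mined(data, -65, 128) == scalar_loop(data, -65)    # register too long
print(forward_ok, backward_ok, far_back_ok, too_long)
```

With a 64-element strip the `i-65` case never reads something written in the same strip, which is why it can be OK; lengthen the strip to 128 and the same loop goes wrong.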
Bottom line: if you don't vectorise a percentage in the high 90s of your code, the vector computer is a very expensive room heater. You then need to worry about 99%+ parallel code for a multinode architecture, but there are similarities between the data independence of vectorised loop coefficients and that of parallel modules.
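The "high 90s" remark is essentially Amdahl's law. A small sketch, assuming (purely for illustration) that the vector unit runs vectorised code 100 times faster than scalar code; `amdahl_speedup` and `VECTOR_FACTOR` are hypothetical names, not anything from a real toolchain:

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the run time is sped up by `factor`."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


VECTOR_FACTOR = 100.0  # assumed vector-vs-scalar speed ratio, for illustration

for fraction in (0.50, 0.90, 0.99, 0.999):
    speedup = amdahl_speedup(fraction, VECTOR_FACTOR)
    print(f"{fraction:.1%} vectorised -> about {speedup:.1f}x overall")
```

Under that assumption, 90% vectorised code gets less than a tenth of what the hardware can do, and even 99% only gets about half. The same formula applies again when you spread the job across nodes.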
HTH
Crash
I'm just waiting for Tom's Hardware to write up an article on how to overclock this to get an additional 1,000,000 fps in Quake III.
All of a sudden, the most annoying NBC slogans, "the only team of certified meteorologists in the Delaware Valley" and "Most accurate forecasting with the Doppler 10,000", seem kind of funny.
I wonder how far in advance this new supercomputer can predict how far John Bolaris is going to be off in his predictions again (the poor guy made some completely overhyped predictions about a blizzard last year in Philly area).
Anyhow, hats off, Japan! I'm impressed.
Okay, so then the Japanese complain about us dumping. Then what? Let's say they win in WTO hearings. How nice for them. Then the US just ignores it. Why? Because we can. What real punishment can the WTO provide?
The WTO is totally powerless, especially against the US. The only thing it provides is a common forum for working these issues out and for establishing a sort of trade best practices. But when you get right down to it, trade disputes are settled as they always have been, either through discussion, or through various embargoes, tariffs, etc. The WTO may add some legitimacy to a particular country's use of some tariffs, etc., but overall it doesn't provide any significant sanctioning ability.
That's the funny thing with all of the world governmental bodies. They have no real power, they mostly just serve as negotiating platforms. The real power continues to be held by individual nations and there's no evidence that they'll be giving up that power anytime soon.
This sig has been temporarily disconnected or is no longer in service
Actually, the "super-node" idea was how they originally did SMP systems.
So each processor is connected by the front side bus. When you need a value you check each processor to see if it has it before you get it from main memory. Your alternative is to check with the controlling processor instead of each individual processor. Through the years they found that the loss of a processor just for admin tasks wasn't worth it. Now everybody shares admin load and everybody does work.
Because this admin load gets to be too much, they normally divide the machine into subsections. For example, the 64 CPU Sun box I use at work is divided into 8 smaller sections. The 8 CPUs in each section are equally close to their own memory, but they are far from the other "sub machines'" memory.
Anyway, I'm kind of rambling here but the general idea is that super computer builders have moved away from that idea in most of the models I've seen. Instead they try to keep the communication requirements low. (AKA, maximizing locality of reference)
If a "real" expert is around who knows of something different I'm all ears.
That which does not kill me only makes me whinier
*shrug*
It's probably just a not particularly subtle jab at the US DOE nuclear weapon simulation research which gets done on the big American gov't supercomputers; in other words, pointing out that they're using their CPU cycles for what they consider a better purpose. Japan isn't particularly fond of nuclear weapons at all.
Only the dead have seen the end of war.
Of course you have to assume that under typical black projects the DoE/DoD/NSA is running machines far more complex and powerful than they let on. After all, SR-71s were a strategic asset 40 years ago and the performance specs are still largely classified. Similarly with computing.
Also keep in mind that several years ago the US govt complained about the French performing nuclear testing under the rubric that they could do it all on a machine. And lo and behold, only a few weeks ago the DoE 'announced' that they now have the capability to do that, seemingly forgetting that it was previously announced in 1999. So in the intervening 3 years how far do you think they've come?
You know, there are scads of scientists working for the govt who could probably get on the short list for the Nobel if they were allowed to publish publicly... and that's basic research. Imagine what applied engineering looks like.
Contrary to rumor, the machine is constructed from 640 nodes, with 8 vector processors per node, and 16GB RAM per node. That totals 5120 processors and 10TB memory.
See http://www.es.jamstec.go.jp/esc/eng/outline/outline02.html
Also of note:
peak performance per processor: 8 GFLOPS
total peak performance: 40 TFLOPS
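For anyone who wants to check those totals, the arithmetic can be sketched in a few lines of Python. The node counts and per-processor peak are the figures quoted above; the 35.6 teraflops figure is the "top speed" from the story summary, and treating it as a sustained result to compare against peak is my own assumption:

```python
NODES = 640
PROCS_PER_NODE = 8
RAM_PER_NODE_GB = 16
PEAK_PER_PROC_GFLOPS = 8
REPORTED_TFLOPS = 35.6  # from the story summary; assumed to be a sustained figure

total_procs = NODES * PROCS_PER_NODE                           # 5120 processors
total_ram_tb = NODES * RAM_PER_NODE_GB / 1024                  # 10 TB (1 TB = 1024 GB)
total_peak_tflops = total_procs * PEAK_PER_PROC_GFLOPS / 1000  # 40.96, quoted as 40
efficiency = REPORTED_TFLOPS / total_peak_tflops

print(total_procs, total_ram_tb, total_peak_tflops, round(efficiency, 3))
```

So the quoted 40 TFLOPS is 40.96 rounded down, and under the assumption above the reported speed would be roughly 87% of peak.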
Remember, when they give you TFLOPS or TOPS values, they're giving you PEAK values.
In reality, most of the time, performance is way below peak values, even for the algorithms the computer was designed to handle. IBM's Pacific Blue has a peak value around 3.6 TFLOPS... but in reality, it's usually around 1.2 TFLOPS.
There's no reason to believe this machine will be any different.
Furthermore, the performance of this machine is likely to sink like a rock when it's used outside the area it was specially designed for.
In other words, the best supercomputers in the world are still the ones made by Star Bridge Systems, which were bought by NASA (I believe the one NASA bought was called HAL 15, or something like that).
social sciences can never use experience to verify their statemen
This particular picture (from the above links.) is mildly disturbing.
0 3. jpg
http://www.es.jamstec.go.jp/esc/gallary/images/
"I'm sorry, I can't do that Dave...."
Lousy facepalm.
I'm not going to even get into it, but the Constitution you refer to was written by Americans, not the Japanese themselves. The repeal of Article 9 has been debated for many years, and Japan may well repudiate it in the next few years and become a "normal" nation with a seagoing navy and overseas bases.
Shutting down free speech with violence isn't fighting fascism. It IS fascism!
This is faster than the SETI network.
SETI operates at 17 teraflops, but at a cost of only $500000.
We're wondering why you just don't log out already.
Maybe the state's highest function is to grind out insoluble problems. (Zelazny, Hall of Mirrors)
Global warming research is weapons research. They're just digging for global warming propaganda, a good part of any war machine.
Nuclear weapons are the most sensitive issue in Japan, and Japanese people are strongly against them. Since the nuclear accident in Ibaraki Prefecture in 1999, the most serious nuclear leakage accident, Japanese citizens have lost confidence in the nuclear industry and have asked the government to reduce or stop nuclear power plant construction.
So how, exactly, do I "not know what I'm talking about"? --
Evan
"$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
Keep OPEC fat and happy: Buy an SUV.
Oh yeah?
How about: "I am a research geologist and have to drive for hundreds of miles in areas that don't exactly have maintained roads. If I don't have a pickup or SUV how the fsck am I supposed to get there?"
Think not? My fiancée is almost done with her degree in geology, and we DO have need to go to places like that.
"Alcohol, Tobacco, Firearms, and Explosives" should be a convenience store, not a government agency.
Just for comparison, the whole SETI@home network had a performance of 17.6 TeraFLOPS during the last 24 hours.
Did you know you can fertilize your lawn with used motor oil?
It would not surprise me if the US already had a petaflop of supercomputing power or more in one machine. The box might sit down at some government agency like the NSA (the world's largest employer of mathematicians) and be classified so that no person without a clearance and need to know will ever know about it. At least for thirty years or so.
Disclaimer:
I have no clearance , so this is sheer speculation on my part.
You forget that a lot of defense spending has a positive Keynesian influence on the economy. Technologies and methods developed at taxpayer expense are exploited by private industry, thus expanding the American economy. Defense research spending provides R&D dollars without the risk to private industry.
Somewhere, something incredible is waiting to be known. -- Carl Sagan
i doubt a government would let some pesky little thing like a constitution get in the way. the american government doesn't seem to get too bothered about the US constitution, for example.
Maybe you should find some specific examples of the Japanese going against their constitution before you start making arguments like that. The Japanese government is not the American government, and vice versa. Unless you have some concrete evidence otherwise, you may not want to make broad generalizations about one government based on what you know about another.
slashdot!=valid HTML
I believe it has to do with the interconnects. While a cluster's many nodes may be talking to each other at 1 Gbps, or whatever, these speeds don't work for a supercomputer like this. A cluster or distributed network is good for jobs that can be split up easily, for example SETI@home or load balancing servers. However, this is the world of simulations. As people were pointing out during our discussion on ASCI White, the entire environment of the simulation must be calculated simultaneously. You can't calculate what is going on at point (x1,y1,z1) at time t1 and then move on to (x2,y2,z2) at time t1, because the two are touching and depend on each other. This is true for every point in the simulation's scope. Therefore, the processors have to have an interconnect speed that will allow them to act as if they are all on the same bus and process data simultaneously for all points before moving on to the next time increment.
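A toy version of that dependence, sketched in Python with a 1-D diffusion grid (my own stand-in, not anything from the actual machine). The grid is split across two pretend "nodes"; `step`, `step_split`, and the halo variables are hypothetical names. Each node needs its neighbour's boundary value from the current time step before it can advance, which is the traffic the interconnect has to carry on every step:

```python
def step(u, alpha=0.25):
    """One explicit time step of 1-D diffusion; the two end points stay fixed."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def step_split(left, right, alpha=0.25):
    """The same step with the grid split across two 'nodes'.

    Each node borrows one boundary value (a halo cell) from its neighbour.
    """
    halo_for_left = right[0]   # would cross the interconnect
    halo_for_right = left[-1]  # would cross the interconnect
    new_left = step(left + [halo_for_left], alpha)[:-1]
    new_right = step([halo_for_right] + right, alpha)[1:]
    return new_left, new_right


grid = [0.0] * 8
grid[3] = 100.0  # a hot spot next to the split

whole = list(grid)
left, right = grid[:4], grid[4:]
for _ in range(10):
    whole = step(whole)
    left, right = step_split(left, right)  # halo exchange on every step

print(left + right == whole)
```

The split run only matches the single-grid run because the halo values are exchanged before every time step; with a million grid points per node and thousands of nodes, that exchange is where a slow network hurts.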
Of course, I am only a lowly CS student and I'm sure that someone out there can give a more detailed explanation. Thanks.
We now know WHO really started it all. From the press release:
"Blue Gene/L will also be a part of IBM's research in "autonomic computing", an initiative to design computer systems that are self-healing, self-managing and self-configuring."
unfinished: (adj.)
Which is a shame, because, as I pointed out to a friend the other day, Japan is about the only first world country that *doesn't* have a cultural heritage heavily vested in Israel. They would make a perfect neutral police force, with both a consideration for the historical value of the area, and no favoritism towards either group.
Okay, not really, but it's one of those "it looks really good on paper" plans.
--
Evan
"$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
You forget the Swiss. I believe they could be a pretty good neutral police force.
I guess you're unaware of the recent push in the Diet to introduce legislation to change this law (partially, at least) so that Japanese are capable of joining wars in some cases.
Link to story at Asahi.com
"What is the the operating system running?"
Hyper Operating System.
graspee
The grid they are building will be four times as powerful as the system described in this article.
--
Evan
"$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
In realtime 3D with blood and swords and genuine terrified screams as a pawn is ridden down by a knight... (-:
Got time? Spend some of it coding or testing
I'm 39, have at least three children, and earn $120 an hour for consulting. And rarely get to play Quake, which I do admire for its, uh, execution.
Yeah, running viruses, apparently... oops, Billy boy only has 94% of the desktop. Does Quake exist for the Mac? If so, we could probably go pretty close to 99% at least capable of it, if not actually designed to do it. Tell me with a straight face that all of those 3D cards ship for use only in CAD workstations.
Got time? Spend some of it coding or testing
I'm told it will push a winmodem at 55 kbs.
DMCA, Hollings, Palladium. What might have sounded like paranoia is now common sense.
I wager I'm not alone in wishing you'd just go away.
Another computer you may be interested in is Grape-6 which is a 48 Tflop accelerator for gravitational calculations, developed at U. Tokyo for astrophysics. The creator won the Gordon Bell Award a couple years ago.
bite
no.
yawn.
No, I just don't live in front of a computer. What were we talking about again?