China Switching To Home-Grown Chips For Supercomputers
rubycodez writes "The Tianhe-1A system will be the last Chinese supercomputer to use imported Intel and AMD processors. By year's end, China's own 64-bit MIPS-compatible 65nm 8-core 1GHz version of the Godsen (Longsoon family) processors will be used, including 10,000 of them for the 'Dawning 6000' supercomputer. Yes, the chips can and usually do run GNU/Linux, but they can also run FreeBSD, OpenBSD, and NetBSD."
This makes sure US companies can't hide spying equipment, and China slowly but surely gains a stronger foothold in technology, in the end gaining world domination.
That's silly. They're trying to build a supercomputer out of MIPS chips. That'll never work...
Speaking of which, it does make me wonder about all this fuss over 64 bit ARM chips for datacentres. There are already high performance, low power 64 bit MIPS chips and have been for years. They're well proven, have good compiler support, are cheaply licensable, low power (perhaps not quite as low as ARM?), have standard 64 bit modes and so on.
SJW n. One who posts facts.
... very trustworthy. 10,000 not-yet-fabricated CPUs are going to be powering a 1 petaflop supercomputer in less than a year? Color me skeptical. ... and anyone want to fill me in why 10,000 8-core MIPS chips at 1ghz can be expected to outperform 12,000 12-core x86 chips at 2.1ghz?
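One rough way to frame that question is peak throughput = chips x cores x clock x FLOPs per core per cycle. Below is a minimal sketch in Python. The chip counts, core counts and clocks come from the comment above; the per-cycle figures are assumptions, not from the summary: 16 for the MIPS part (two 256-bit vector pipelines, as claimed elsewhere in this thread, each assumed to do 4 double-precision fused multiply-adds per cycle) and 4 for the x86 part (assumed 128-bit add plus multiply per cycle). Peak numbers say nothing about what real code would sustain.

```python
def peak_flops(chips, cores_per_chip, clock_hz, flops_per_cycle):
    """Theoretical peak FLOP/s: every core retires flops_per_cycle each cycle."""
    return chips * cores_per_chip * clock_hz * flops_per_cycle


# ASSUMPTION: 2 pipelines x 4 doubles x 2 (fused multiply-add) per cycle.
MIPS_FLOPS_PER_CYCLE = 16
# ASSUMPTION: one 128-bit add plus one 128-bit multiply per cycle.
X86_FLOPS_PER_CYCLE = 4

mips_peak = peak_flops(10_000, 8, 1.0e9, MIPS_FLOPS_PER_CYCLE)
x86_peak = peak_flops(12_000, 12, 2.1e9, X86_FLOPS_PER_CYCLE)

# FLOPs per core per cycle the MIPS system would need to reach 1 petaflop.
needed_per_cycle = 1e15 / (10_000 * 8 * 1.0e9)

print(f"MIPS peak: {mips_peak / 1e15:.4f} PFLOP/s")
print(f"x86 peak:  {x86_peak / 1e15:.4f} PFLOP/s")
print(f"needed per core per cycle for 1 PFLOP/s: {needed_per_cycle}")
```

Under these assumptions the lower clock and core count are offset by the wider vector units, which is why the per-cycle figure matters more than the GHz number.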
The processor family is called Loongson and not "LongSoon" as summary says. But the typo is funny in its own way.
If China has taught the world a lesson, it's that its companies are more corrupt than American companies.
-but-
China does have the ability to compete with the rest of the world to lower prices on things.
So you have to ask which is more evil, making the competition have to lower their prices, or making poor quality equipment that has to be replaced annually... or worse kills people.
There are a few things I generally won't buy if I see "made in china"
1) Processed food (due to pet food/baby food/milk recall)
2) Plastic toys for children that still put things in their mouths (due to lead concerns)
3) Electronics that need to last (Kitchen appliances) and have to be safe.
I'm ok with buying Chinese computer parts, though I probably wouldn't buy any network processor or CPU parts for security reasons. Likewise I probably wouldn't buy any GSM, Bluetooth, WiFi or similar wireless parts or phones because I don't trust the safety.
Most clothing (that isn't tissue-paper thin D: ) from China is just cheaply made from low-thread count materials. You don't have to buy it, and you can even buy better quality stuff that won't rip the second you take it off if you're willing to pay a bit more.
(Yes... I've bought clothing that was so thin before that it ripped the first time I wore it... because it was almost sheer from cheapness.)
including media (now a weapon)
banned? yikes almighty. mynuts(clearly)won; does't play well in peoria.
Compare Chinese medicine with "speaking in tongues", as seen here http://www.youtube.com/watch?v=NZbQBajYnEc
Greater usage of MIPS will stop Intel/AMD/ARM from getting complacent.
Plus, MIPS is the only assembly instruction set I know, so there's possibly some nostalgia there... :-)
I wonder how well these chips compare to the R16000's?
"To those who are overly cautious, everything is impossible. "
The Japanese 10 petaflops-scale K computer in Kobe uses Sparc-compatible cpus from Fujitsu. Sounds like a good idea if you want to build know-how, not just a machine.
Trust the Computer. The Computer is your friend.
you are completely wrong. this processor has over 200 x86 emulation instructions, allowing it to run x86 code with only a 30% performance penalty, under qemu. it also has two 256-bit vector pipelines that provide SIMD floating-point operations so powerful that a single 1ghz core can do 1080p at over 100 frames a second. to claim that "it will never work" in the face of evidence that you simply haven't looked at is ridiculous. look up the specifications on the GS464V, please. also, you are not aware that the Chinese Government has purchased 25% of MIPS, and is working with the MIPS teams in the U.S. to create this processor. this processor *IS* MIPS's high-performance, low-power 64-bit MIPS chip.
the article has missed out some important information, which is that they are planning two versions of the CPU. the first is a Quad-Core 65nm, and the second is a 16-core 28nm, which will use the same amount of power (about 12-15 watts). hopefully they will also do a Single-Core 28nm which would be under 1 watt, because at 1ghz the SIMD units are so powerful they can do 1080p at 100 frames per second. really, this CPU design is a game-changer. i've been advocating their use for some time - http://lkcl.net/laptop.html
I guess after decades of reverse engineering, stealing, copying, and calling it their own, why not? Just like all the rest of the technologies they have stolen over the years to churn out cheap crap...
Perfect example, if they will be using the supercomputer to run Windows and watch 1080p torrents.
I would like to buy a small (perhaps 1U) server based on these chips if such a thing exists...
http://spamdecoy.net - free throwaway anonymous email - avoid spam!
Yes, it's amazing how fast the Chinese can reverse engineer old technology! Good thing there are strong copy-protection laws in force to prevent this sort of thing.
All snark aside, this does point out something very important; The Chinese can never surpass the performance of the people they're copying. On the other hand, they can price them right out of the market. The down side (for the entrenched powers) with the world going multicore is that you can solve problems by just throwing more cores at them. Granted, there are plenty of problems which can't be solved in this way, but even a really crappy CPU core of today is shockingly impressive by Ye Olde Tyme standards. When I think about the difference between my old Sun 4/260 and the cute little netbook on my lap with the 1.2 GHz 64 bit processor and the 2GB of ram, which is a kiddie class machine by modern standards, it makes my mouth a little dry.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
it also has two 256-bit vector pipelines that provide SIMD floating-point operations so powerful that a single 1ghz core can do 1080p at over 100 frames a second.
In these modern times, if you are going to be doing lots of SIMD on your HPC, you will replace the 10,000 CPUs with 500 GPUs + 500 CPUs to drive them.
It's cheaper to buy, and cheaper to operate.
"His name was James Damore."
they use a MIPS-based design, as it's derived from a GPL version of the MIPS technology
so they can produce it without patent problems
but DEC made the Alpha CPU GPL before DEC got eaten by Compaq, so that should be free to use also
in fact some of the Alpha tech, like programmable microcode, was incorporated into the P4 released after that
the Alpha CPU would be a better architecture to base a system around
other than that i have not seen a free downloadable softcore for the Alpha CPU yet (although i have not looked for some time)
It's cheaper to buy, and cheaper to operate.
And performance dies screaming at the first branch instruction. Yes, GPUs have great throughput, but they suck for large categories of algorithm. If they didn't, then CPUs would have the same performance. They generally lack any branch prediction, so a branch can stall the pipeline completely - if you've got more than one branch every hundred instructions, running it on the GPU won't give you anything like the theoretical maximum throughput. If your threads aren't exactly in lockstep (i.e. if two threads take different branches), say goodbye to performance too. CPUs have been heavily optimised with caches because most algorithms have a lot of locality of reference. GPUs haven't - they assume algorithms that stream large amounts of data without revisiting any of it.
In short, a MIPS CPU with a wide vector unit is going to have very different performance characteristics to a GPU and will be significantly faster for large categories of algorithm.
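The lockstep point can be made concrete with a toy cost model. This is only a sketch in Python under stated assumptions: the function names, the group width of 32 lanes, and the cycle costs are all made up for illustration, and the model ignores caches, branch prediction and memory access entirely. It shows just one effect: a group of lanes forced to run in lockstep pays for both sides of a branch as soon as any two lanes disagree.

```python
def lockstep_cost(lane_takes_branch, then_cost, else_cost):
    """Cycles for a group of lanes that must execute in lockstep.

    If at least one lane takes each side, both sides are executed one
    after the other, with the non-participating lanes masked off.
    """
    any_then = any(lane_takes_branch)
    any_else = not all(lane_takes_branch)
    return (then_cost if any_then else 0) + (else_cost if any_else else 0)


def independent_cost(lane_takes_branch, then_cost, else_cost):
    """Cycles when each lane runs only its own side; the slowest lane wins."""
    return max(then_cost if taken else else_cost for taken in lane_takes_branch)


WIDTH = 32                       # ASSUMPTION: illustrative group width
THEN_COST, ELSE_COST = 100, 80   # ASSUMPTION: illustrative cycle counts

uniform = [True] * WIDTH                   # every lane takes the branch
diverged = [True] * (WIDTH - 1) + [False]  # one lane goes the other way

print(lockstep_cost(uniform, THEN_COST, ELSE_COST))
print(lockstep_cost(diverged, THEN_COST, ELSE_COST))
print(independent_cost(diverged, THEN_COST, ELSE_COST))
```

A single disagreeing lane is enough to make the lockstep group pay for both paths, while independently scheduled lanes only pay for the longer one.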
I am TheRaven on Soylent News
30% performance penalty compared to what? You're just repeating the marketing bs without any benchmark or proof or even understanding: is it 30% of an 8088 or of a single core P4 at the same frequency?
And performance dies screaming at the first branch instruction.
You can't do separate branching *at all* between the multiple scalars within a SIMD vector. All the scalars have the same operations performed on them.
You seem to be confused about why the negative performance of branching matters on GPUs... it's not because it impacts their SIMD capabilities.. because it doesn't.. it's because it impacts their CPU-like "GPGPU" capabilities... which means... what I said is 100% correct:
If you are doing heavy SIMD work, get a pile of GPUs.
"His name was James Damore."
Intel and AMD are hampered by having to provide legacy compatibility; MIPS is a much more recently designed architecture that should impose fewer bottlenecks on processor advancement.
Motorola and IBM said the same thing about PowerPC when they started. Over the following years the PowerPC got about 20-40% better performance at the same clock rate as the contemporary Pentium, and SMP had a similar performance advantage. However, Intel was able to win on actual performance by achieving higher clock rates.
Also recent Intel x86 architectures have a modern RISC design. The x86 instruction set is merely a facade. The x86 instructions are translated to "micro-ops" that run natively on the RISC core. Intel is free to change/replace this core at any time.
Things are a bit more complicated than they appear.
You can't do separate branching *at all* between the multiple scalars within a SIMD vector. All the scalars have the same operations performed on them.
No, but you can do branching between each set of operations. If you're doing a matrix operation, then you can do a couple of SIMD operations on a row, then a branch based on the result. This is pretty fast on most CPUs, it's painfully slow on a GPU.
You seem to be confused about why the negative performance of branching matters on GPUs.
No, I'm not. One of the things I work on is a GPGPU compiler for HPC, so I'm intimately familiar with their strengths and weaknesses and when it makes sense to offload work to them from the CPU.
it's not because it impacts their SIMD capabilities.. because it doesn't.. it's because it impacts their CPU-like "GPGPU" capabilities... which means... what I said is 100% correct
I never said it did. I said that it affects their ability to handle instruction streams containing branches. A typical instruction stream coming from a piece of C code has a branch, on average, every 7 instructions. If you're doing vector operations on a scalar unit, then this may be every 20 instructions or so, but closer to 7 if you're using vector intrinsics. It needs to be over about 200 before you start to see the GPU being faster (and even that's highly dependent on other factors, including data independence and memory layout).
If you are doing heavy SIMD work, get a pile of GPUs.
Still not true. If you are doing highly parallel (SIMD or MIMD) work that has little locality of reference, predictable data access patterns, and very predictable code flow, get a GPU. If not, your CPU is likely to be faster. A GPU is not just a very fast SIMD unit, it's a processor that is insanely heavily optimised for a relatively narrow class of algorithms. The set of (useful) algorithms that benefit from SIMD is much larger than the set of (useful) algorithms that can run efficiently on a GPU.
I am TheRaven on Soylent News
They are obviously referring to the age of the earth.
Nope, they are expecting to achieve 2/3rds of the performance of a HAL 9000.
China indeed appears to be led by engineers rather than lawyers (maybe being a champion debater or orator isn't that useful in a single party state). As for "highest average IQ", I doubt that China's average IQ score would be much higher than the US since China has a greater base, where low scorers could pull down national average.
My casual googling has turned up lists topped by countries in north-east Asia that include Japan, South Korea, Singapore and Taiwan, countries whose average income per person is much higher than China's (which can imply, among other things, better nutrition and education).
There, fixed it fer ya.
No, but you can do branching between each set of operations. If you're doing a matrix operation, then you can do a couple of SIMD operations on a row, then a branch based on the result. This is pretty fast on most CPUs, it's painfully slow on a GPU.
You are doing it wrong. The branching is only one of your issues. You are preventing coalesced reads, as well as causing bank conflicts in shared memory.
What you are describing is effectively "gimped" from the start. You have a single matrix but want to leverage instructions which operate on multiple data. Sure, the matrix is made up of multiple data.. but what you should be doing is operating on many matrices (hundreds.. thousands even) at the same time... Certainly you know the difference between AoS (Array of Structures), SoA (Structure of Arrays), and SoAoS (Structure of Arrays of Structures).
If you cannot do this, then performance isn't really the concern that you are making it out to be (I don't care how big the matrix is.)
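For anyone not familiar with the AoS versus SoA distinction, here is a minimal sketch in Python. The 2x2 matrices, the field names a, b, c, d and the helper functions are all hypothetical, chosen only to show the two layouts side by side. Python itself gives none of the coalescing or vectorisation benefits being argued about; the point is purely how the same data is arranged. SoAoS is not shown.

```python
from array import array

# AoS (Array of Structures): one record per 2x2 matrix [[a, b], [c, d]].
aos = [
    {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0},
    {"a": 2.0, "b": 0.0, "c": 0.0, "d": 2.0},
    {"a": 0.0, "b": 1.0, "c": 1.0, "d": 0.0},
]


def aos_to_soa(matrices):
    """SoA (Structure of Arrays): one contiguous array per field."""
    return {key: array("d", (m[key] for m in matrices)) for key in "abcd"}


def det_aos(matrices):
    """Determinants computed record by record."""
    return [m["a"] * m["d"] - m["b"] * m["c"] for m in matrices]


def det_soa(fields):
    """Determinants computed by walking the four field arrays in step.

    Element i of every array belongs to matrix i, so the same operation
    is applied across many matrices at once.
    """
    return [
        a * d - b * c
        for a, b, c, d in zip(fields["a"], fields["b"], fields["c"], fields["d"])
    ]


soa = aos_to_soa(aos)
print(det_aos(aos))
print(det_soa(soa))
```

In the SoA form, neighbouring lanes of a vector unit would read neighbouring elements of the same array, which is the access pattern the comment above is asking for.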
"His name was James Damore."
My apologies for two bad typos, though I typed both correctly in the tags; I also wrote Godsen in the article text instead of Godson.
Which makes you dependent on GPU producers, which are dependent on which country? Right, USA...
Wow, you really have no clue. If your problems are that loosely coupled, then you don't need to do SIMD at all, just solve each matrix in a separate process on a separate CPU. For typical applications where supercomputers are used the problem is to solve a single, huge problem, not a gazillion small ones. That is when parallelism becomes hard, otherwise you don't need a supercomputer at all.
I'm sure these are very nice chips, but anyone can do similar, given funding. there are a number of cores available for licensing (like they did with MIPS), and adding vector units is the obvious way to boost your peak flops without blowing your power budget. I guess I don't really see why this merits all the coverage - for instance, what fraction of peak performance can it get on real code (say, a weather or MD simulation, not HPL)?
the quoted peak gflops/watt for this project are decent, but not much better than current commodity x86 parts, and comparable to GPUs. most architects in the field consider power-efficient computing to be a system-architecture challenge: how to move around all that data without spending all your power on fast/wide buses. a genuinely interesting new architecture would try to address this - perhaps something vaguely like IRAM. smart memory with some kind of high-order interconnect seems like the way to go, rather than putting giant vector units on a traditional design.
"low power" quad-core 65nm 1ghz MIPS64 chips use 10 watts; 90nm, 20 watts. if you go to 28nm and stay at 1ghz, you divide by four - so that's 10/4 = 2.5 watts.
also, there are two different configurations for 65nm done by TSMC: one is high-performance (lower cost, 20 masks) and the other is lower-power (slightly higher cost, 32 masks). the lower-power CMOS one was only invented recently, so this is why you often see e.g. Broadcom Network / Server MIPS64 Quad-Core 1ghz 65nm CPUs consuming 10-20 watts. with the mask charges (NREs) being measured in $millions and the verification as well it's not justifiable financially to do a conversion of these older ICs to the newer lower-power fab process.
so there are a lot of factors that need to be taken into consideration. also you have to bear in mind that speed is a trade-off against latency. ARM SoCs are typically done in low-power, high-latency configurations, whilst the Chinese ICT University want to go for high performance (without busting the building's water-cooling when you have 1,000 of the 16-processor chips in the same room) so they can get that number one slot for the world's fastest supercomputer.
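The divide-by-four arithmetic above can be written out as a short Python sketch. The 10 watt figure and the factor of four are the numbers quoted in this comment; the function name is hypothetical, and the assumption that power scales linearly with core count (ignoring caches, memory controllers and I/O) is a simplification added here for illustration only.

```python
QUAD_CORE_65NM_WATTS = 10.0  # figure quoted above for a 1 GHz quad-core
SHRINK_65_TO_28 = 4          # the comment's rule of thumb, not a measured value


def scaled_watts(watts, shrink_factor):
    """Power after a process shrink at the same clock, per the rule of thumb."""
    return watts / shrink_factor


quad_28nm = scaled_watts(QUAD_CORE_65NM_WATTS, SHRINK_65_TO_28)

# ASSUMPTION: power scales linearly with core count.
per_core_28nm = quad_28nm / 4
sixteen_core_28nm = per_core_28nm * 16

print(f"quad-core at 28nm: {quad_28nm} W")
print(f"single core at 28nm: {per_core_28nm} W")
print(f"16 cores at 28nm: {sixteen_core_28nm} W")
```

Under that rule of thumb a single 28nm core lands under 1 watt and sixteen of them land back at roughly the quad-core 65nm budget, which is consistent with the claims made earlier in the thread.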
It's not unfathomable to believe that an indication of one's success is the number of detractors one garners. Why indeed even give them a second thought if there is no meat on the bones.
The numerous sinophobic posts here would suggest indirectly the posters' deep down belief that the Chinese are a credible threat. It's just too bad you don't hide it well ;-/
I further direct your attention to the relative lack of academic invectives directed at Africans, Arabs, Indians ... as a sign that most don't believe these cultures represent a serious threat to western leadership in science and technology.
However, it seems to me many of the workers at operations like Intel, AMD, and Nvidia are of an oriental persuasion. Hence I'm not entirely convinced one needs to delve so subconsciously to discover a culture that is fairly competent.
The exact argument was made for Japan, as its rate of improvement in manufacturing was staggering in the late 80s and early 90s.
Please cite examples of entire countries that have won from the country's industrial policy? You can find some individual industry successes, but not entire countries, I think.
Laws and regulations are programming for an open environment. A conceptual oxymoron that nobody associated with technology should fall into. Nobody tries to write a handbook for living their life, we all know that our worlds are too complex to allow a finite set of rules to deal with it. Yet, most people seem to think the gov is part of such a simple world.
Empirically, it is pretty clear that laws are written by entities with $ who will be affected by the law, and that all regulatory bodies are quickly taken over by the regulatees. Money buys power, every time.
Even if programming for an open environment were possible, the parallel is hackers vs programmers : Programmers for any given program are few, the tools are flawed. Hackers are many, inevitably some of them find the flaws left by programmers + tools. Hackers have a permanent advantage. Lawyers writing laws/regulations are few, tools are nil, lawyers on the opposite side looking for loopholes are many.
So, laws and regulations of the very specific type, at least, do not and cannot work. We have to give up that model of government.
"The Constitution, the WHOLE Constitution, and nothing but the CONSTITUTION."
A 30% performance penalty running the x86 version compared to the native version of the same code. See Godson-3: A Scalable Multicore RISC Processor with x86 Emulation.
Be relentless!
past claims of chinese indigenous cpus have been nothing short of blatant scams. it's seriously a national shame. so count me a skeptic on this mips processor.
...Chips for supercomputers, but are still importing Doritos from the US for their LAN parties.
it's not that they're 32-bit or 64-bit that's so important, it's that until the Cortex A9, you couldn't use ECC RAM. also, with only 32-bit memory addressing, and with peripherals memory-mapped, many ARM SoCs simply can't do more than 1gb RAM (and many Cortex A9s can't do more than 2gb). then, also, there is the lack of virtualisation, which, again, has only been corrected in the Cortex A9 design.
you are completely wrong.
Do you really think it's possible to know about 64 bit MIPS and not know about (e.g.) SGI origin systems? You might need to reset your humour filter.
SJW n. One who posts facts.
Not because of anything tech or truly /. related, but I am a Detroit native and this is no different than saying we will no longer use Hondas. Honda factories are here in the US just like Intel and AMD are in China. Maybe even worse, since there is a good chance the manufacturing processes were Intel inspired, unlike the fact that Honda has an assembly line.
They don't believe they actually have their own plant, but contract out all of their fabrication.
Does having control over fabrication and design mean that they can, at source, include useful features like backdoors, rootkits, etc.? In short, processors easily remotely co-opted by the government for citizen monitoring, espionage, and conducting cyberwarfare.
"Consensus" in science is _always_ a political construct.
"It still needs another decade before China-made chips meet the needs of the domestic market. Hopefully after two decades, we will be able to sell our China-made CPUs to the US just like we are selling clothes and shoes."
All of these companies that work inside of China are only slitting their throats.
I prefer the "u" in honour as it seems to be missing these days.
It may happen eventually, but it requires the rise of some new device category or "killer application" that cannot be handled well by x86 chips and Windows.
We got close with netbooks, where the price of Windows made enough of a difference to matter and Vista was too heavyweight to run well on this device category. Short term, Microsoft managed to counter this one with an extra cheap "starter edition" of XP. Meanwhile, cheap RAM and Windows 7 being faster than Vista made things easier for Microsoft.
IMHO, the next hurdle for Windows/x86 will be tablet PCs. Not so much because of computing power or Windows prices, but because of the user interface, especially in applications. That takes a lot of redesigning, because applications that require much typing are definitely not fun to use on a tablet. This makes a lot of older applications unattractive for tablets, which means the advantage of Windows having lots of existing software is much smaller on a tablet.
C - the footgun of programming languages
While working for the German subsidiary of a US company (being German myself), I got the impression that management was more interested in quick solutions and new things than gradual improvement of existing technology. If this is typical for US companies, it may be the downside of the frontier and pioneering spirit:
Once a technology works reasonably well, people lose interest in refining it. Which allows other, more patient competitors to catch up and eventually become better.
Asians seem to be the other extreme:
Not so good at inventing new stuff, but very tenacious at improving it.
C - the footgun of programming languages
We copy all so chip no ploblem - said well known chip designer Hoo Flung Dung
China was going to put "home-grown" CPUs in their PCs. They could never catch up to Intel's efficiencies, especially when their home-grown CPUs were identical photocopies of Intel's chip designs.