IBM Unveils Fastest Microprocessor Ever
adeelarshad82 writes "IBM revealed details of its 5.2-GHz chip, the fastest microprocessor ever announced. Costing hundreds of thousands of dollars, IBM described the z196, which will power its Z-series of mainframes. The z196 contains 1.4 billion transistors on a chip measuring 512 square millimeters fabricated on 45-nm PD SOI technology. It contains a 64KB L1 instruction cache, a 128KB L1 data cache, a 1.5MB private L2 cache per core, plus a pair of co-processors used for cryptographic operations. IBM is set to ship the chip in September."
But will it run ... a Beowulf cluster of ...
[Comment terminated : memelock detected]
The Z-series mainframes cost hundreds of thousands (or even over a million) dollars, not the chips. As it says in the article.
I can't wait to get a PowerMac G6 with this CPU, in your face Dell users with your commodity Intel-based desi... oh, wait.
You are not alone. This is not normal. None of this is normal.
The thing is that if you have 2 (say) 1.6 GHz processors, they aren't as 'powerful' as one 3.2 GHz processor.
For one, there are overheads: resources shared between the cores, pipeline effects, and other computer-engineering details I've forgotten.
But the main thing is that not all programs are multi-threaded, and a program with a single thread can only run on one processor. So yeah, GHz are still useful. Maybe for large single-thread batch processing - which is the kind of thing a mainframe would do.
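A rough way to put numbers on this is Amdahl's law, which is my choice of model here, not something the comment names. The sketch assumes performance scales linearly with clock speed and ignores the overheads mentioned above; the function names and the 1.6 GHz baseline are illustrative.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup from `cores` when only part of the work is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_throughput(clock_ghz: float, cores: int, parallel_fraction: float,
                        baseline_ghz: float = 1.6) -> float:
    """Throughput relative to one core at `baseline_ghz`.

    Assumes performance is proportional to clock speed (a simplification).
    """
    return (clock_ghz / baseline_ghz) * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9, 1.0):
        dual_slow = relative_throughput(1.6, cores=2, parallel_fraction=p)
        single_fast = relative_throughput(3.2, cores=1, parallel_fraction=p)
        print(f"parallel fraction {p:.1f}: 2 x 1.6 GHz = {dual_slow:.2f}, "
              f"1 x 3.2 GHz = {single_fast:.2f}")
```

Under these assumptions the two slower cores only match the single fast core when the workload is perfectly parallel; for a single-threaded program they deliver half as much.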
Can't even imagine writing in assembly code for this monster. I miss dinking around with a nice 6502 system.
__ Someday, but not this morning, I'll finally learn to use the preview button.
Yes, but their article comments are much closer to YouTube than Slashdot.
ummmm.......
It's a quad-core chip. Each core has two integer, two load and store, one binary floating point, and one decimal floating point unit. Up to 24 CPUs can be placed in the frame. It can connect to another whole rack of POWER7 blades running AIX as an application accelerator platform.
The z196 is for the stuff a mainframe is good at: big batches and fast I/O. The application accelerator is for stuff the clusters of supermicro servers are good at. As a hybrid system connected across the GX bus, it should pump data in and out of applications pretty well.
More or less. They hit two walls - fabricating chips that could run faster while retaining an acceptable yield, and dealing with the heat such chips produced.
The fastest general-sale chips were the P4s - the end of their line marked the end of the gigahertz wars, as Intel switched from ramping up the clock to ramping up the per-cycle efficiency with the Core 2 and their complete architecture overhaul. As a result, a 2GHz Core 2 Duo will outperform a 4GHz dual-core P4 under most conditions, thanks to better pipeline organisation and larger, better-managed caches.
Clock rate is no longer the key variable in comparing processors, unless they are of the same microarchitecture.
When configured to run Linux, each core costs approx $125K. When configured for z/OS, each core costs approx $250K. A complete system (not including any storage or software) can cost up to around $30M.
Considering the ratio between the two sets of figures is ~96, it seems that the "four-node system" contains 96 cores with their own L1 and L2 caches, but shared L3 and L4 caches.
The comments were about the fact that at 3GHz light travels 10cm per clock cycle, which limits how far apart you can have 2 items on a bus if you want them to communicate within 1 clock cycle. There is no "light speed barrier" or anything of the sort; however, at these frequencies you design knowing that it will take measurable time for an electric signal to propagate. For example, for this particular system whose core is at 5.2GHz, if you try to send a signal to an external memory that is say 11-12cm away, then it will take about two clock cycles just for the signal to travel the distance.
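The arithmetic can be checked with a few lines of Python. This is a minimal sketch that treats the signal as moving at the vacuum speed of light, which is a best case; the `velocity_factor` parameter is my addition to allow for slower propagation in real wiring, and the function names are illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # vacuum speed of light


def distance_per_cycle_cm(clock_hz: float, velocity_factor: float = 1.0) -> float:
    """Distance a signal covers in one clock period, in centimetres.

    velocity_factor is the fraction of c at which the signal travels
    (1.0 = vacuum, an optimistic assumption for real interconnects).
    """
    return SPEED_OF_LIGHT_M_PER_S * velocity_factor / clock_hz * 100.0


def cycles_to_travel(distance_cm: float, clock_hz: float,
                     velocity_factor: float = 1.0) -> float:
    """Number of clock cycles that elapse while a signal crosses distance_cm."""
    return distance_cm / distance_per_cycle_cm(clock_hz, velocity_factor)


if __name__ == "__main__":
    print(f"3.0 GHz: {distance_per_cycle_cm(3.0e9):.2f} cm per cycle")
    print(f"5.2 GHz: {distance_per_cycle_cm(5.2e9):.2f} cm per cycle")
    print(f"11.5 cm at 5.2 GHz: {cycles_to_travel(11.5, 5.2e9):.2f} cycles")
```

With `velocity_factor` below 1.0 the distance per cycle shrinks further, so the two-cycle figure is a lower bound.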
Violence is the last refuge of the incompetent. Polar Scope Align for iOS
"clockspeed is NOT related to throughput"
Of course it is. It is not, however, the only factor, and other factors may indeed (and commonly do) outweigh it.
"IBM may have created a very highly clocked CPU and given it tons of transistors, but I seriously doubt if it will compete with a modern day server CPU from Intel or even AMD."
I think you underestimate IBM's technical ability. They do have some idea of what they're doing.
"pure performance maybe, but definitely not price-performance or performance-per-watt"
That's like saying a Ferrari is a poor performance car because it can't compete against a Ford Focus on cost-per-max-speed or miles-per-gallon.
Banks, Credit card companies, hospitals, Insurance companies...
Cheap clusters are great but they are not always the best tool for the job.
Very large traditional datasets involving lots of high-value transactions, with 5 9s uptime requirements, do not tend to scale well to COTS clusters.
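For a sense of scale, an uptime requirement expressed in "nines" converts to allowed downtime per year as below. This is a minimal sketch; the function names are illustrative and a year is taken as 365.25 days.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # average year, including leap days


def availability_from_nines(nines: int) -> float:
    """Convert a count of nines to an availability fraction, e.g. 5 -> 0.99999."""
    return 1.0 - 10.0 ** (-nines)


def downtime_minutes_per_year(availability: float) -> float:
    """Minutes per year a system may be down at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for nines in (3, 4, 5):
        availability = availability_from_nines(nines)
        minutes = downtime_minutes_per_year(availability)
        print(f"{nines} nines ({availability:.5%}): {minutes:.2f} minutes/year")
```

Five nines leaves a little over five minutes of downtime per year, planned or not.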
IBM mainframes have uptimes measured in years if not decades.
They have hot-swappable everything, including CPUs, so you can do upgrades with zero downtime.
Also you need to take a look at the costs involved. The cost to throw out a working software system that has been used for decades and then the cost to redesign it to work on a cluster of X86 boxes will be huge.
Not to mention the investment in making it fault tolerant and, if it is used in certain markets, the cost of auditing the software.
Not to mention that ZSystems tend to be really secure. There are just not a lot of exploits on ZSystems.
When downtime can cost millions of dollars, hardware costs are just not that big of a deal.
Now if you are starting from scratch then you may save money by going with a cluster, but then you may not, depending on just how good your programmers are.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
"They say it's an old CISC architecture. This is probably the sort of system that runs horribly outdated and un-updatable code, like the tax system."
You mean like Windows?
The X86 is also an old CISC architecture.
Actually the Power line is RISC anyway. When it is used in a ZMachine the old style 360/370/390 CISC ISA is translated to RISC and then executed.
Before you go "ew": that is what modern X86 chips do as well, as does ARM when using the Thumb instruction set. The ZSystem ISA is so high end it is almost a high level language, so the translation doesn't really affect performance much at all. Also that old CISC architecture is much better than the mess that we have on the X86.
I am not sure about how IBM does the translation. On the System/38, AS/400, and System i the translation was done during the IPL, aka Initial Program Load. On the Zs it may be done as a JIT, but I am not sure.
Honestly I love the idea and wish that Linux would adopt it. You could then have one binary that would work on any Linux system on any CPU.
The AS/400 way kept a native binary copy along with the TIMI copy. When the program was run the first time it would translate the TIMI copy into the native segment. Yes, the first time you ran the program it might take a bit to start, but after that it would run at full speed and start fast. Of course you could add a binary segment when you first released the code for the ISA of your choice.
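The translate-on-first-run idea can be sketched as a simple cache. Everything here is hypothetical and only illustrates the pattern described above: the `Program` class, the `toy_translate` function, and the ISA names are made up, and real systems translate actual machine-independent code rather than strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Program:
    """A program shipped in a portable form plus optional native copies."""
    portable_code: str                                           # stand-in for the machine-independent copy
    native_segments: Dict[str, str] = field(default_factory=dict)  # ISA name -> native copy
    translations: int = 0                                        # how many times we had to translate

    def native_for(self, isa: str, translate: Callable[[str, str], str]) -> str:
        """Return the native copy for `isa`, translating only on first use."""
        if isa not in self.native_segments:
            self.native_segments[isa] = translate(self.portable_code, isa)
            self.translations += 1
        return self.native_segments[isa]


def toy_translate(portable_code: str, isa: str) -> str:
    """Placeholder translator: tags the code with the target ISA name."""
    return f"[{isa}] {portable_code}"


if __name__ == "__main__":
    program = Program("ADD A, B")
    program.native_for("isa-a", toy_translate)  # slow path: translates and stores
    program.native_for("isa-a", toy_translate)  # fast path: reuses the stored copy
    print(program.translations)

    # Shipping a pre-built native segment skips translation on that ISA.
    prebuilt = Program("ADD A, B", native_segments={"isa-a": "prebuilt native code"})
    print(prebuilt.native_for("isa-a", toy_translate), prebuilt.translations)
```

The cost of translation is paid once per target, which is why only the first launch is slow.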
All in all those old Mainframes and Minis had a lot of brilliant tech we still don't have today on our PCs.
Mainframes are engineered fundamentally around two things: Reliability and IOPS.
When it comes to basic tasks, it isn't often that a large server ends up CPU bound (especially database servers). Instead what usually becomes the bottleneck is I/O and RAM.
Reliability is where mainframes take the cake. Some use multiple CPUs to execute the same instructions to make sure the output is correct. In a mainframe, virtually everything is redundant. Because they have been doing VM since the dawn of computing, it may be that an LPAR might need to be kicked, but a full IPL of a mainframe is exceedingly rare.
IBM System z machines are on one end of the spectrum. They cost an arm and a leg, but if someone has a lot of 1U servers or even blades, it might be better to just dump the rackfuls of those machines and go with some big iron and LPARs. The TCO of a machine isn't just the price tag of the box, nor the licenses or service fees. One factor people forget is how many admins are needed to keep things going. Some companies are far better off with a mainframe and some Linux admins as opposed to rackfuls of Windows machines that require an army of MS-ITPs to keep running.
Believe it or not, mainframes have advanced along with the times. They have always been reliable and boring. COBOL is long gone except for way legacy stuff. Instead, you still have Oracle, WebSphere, JBoss, and many other behind the scene applications which are not flashy, but are business critical.
Mainframes also come with their own viewpoint. On one hand, a company can buy enough x86 servers with clustering, redundancy, failover capability, and other items to raise the effective MTBF of the whole setup to an acceptable level. On the other hand, a company can pay the ticket to the System z series and have one machine that has an extremely high MTBF with less need for an HA cluster. Even with all the clustering and redundancy of x86 machines, there is only so much lipstick you can put on a pig before it turns into an oinking ball of wax, so if someone wants to go the x86 route, it will require a lot more employees to keep things running.
Yeah, it's actually kind of funny how today's Intel desktop processors actually trace their lineage to the Pentium M, which was a mobile chip. When the Pentium 4 came around, the Pentium Pro (Pentium II, Pentium III) architecture was pretty much relegated to the mobile market while Pentium 4 represented their desktop line. As you said, they ran into heat (and power) issues with the Pentium 4s and basically had no more room for expansion there. They went back to the Pentium M, which was doing pretty nicely in the notebook space, and since it was low-power and efficient it became the basis for their future desktop CPUs--the Core line, in particular. They just stopped playing up the clock speed because that architecture's clock speeds were substantially lower than the Pentium 4, despite being able to do more work. I read once that a Pentium M could do about 40% more work than a Pentium 4 of the same clock, so in essence a 2GHz Pentium M was about as powerful as a 2.8 GHz P4.
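The "equivalent clock" comparison is simple arithmetic. The sketch below assumes performance is just clock speed times work done per clock, which is a simplification since real workloads vary; the function names are illustrative, and the 40% figure is the one recalled above, not a measured value.

```python
def equivalent_clock_ghz(clock_ghz: float, work_per_clock_ratio: float) -> float:
    """Clock a slower-per-cycle chip would need to match this one.

    work_per_clock_ratio is how much more work per cycle the chip does
    compared with the reference chip (1.4 means 40% more).
    """
    return clock_ghz * work_per_clock_ratio


def required_ratio(target_clock_ghz: float, clock_ghz: float) -> float:
    """Per-clock advantage needed to match a reference chip at target_clock_ghz."""
    return target_clock_ghz / clock_ghz


if __name__ == "__main__":
    print(f"2.0 GHz at 1.4x per clock = {equivalent_clock_ghz(2.0, 1.4):.1f} GHz equivalent")
    print(f"matching 3.2 GHz from 2.0 GHz needs {required_ratio(3.2, 2.0):.1f}x per clock")
```

With a 1.4 ratio, a 2 GHz part lands at a 2.8 GHz equivalent; for comparison, matching a 3.2 GHz part from 2 GHz would take a ratio of 1.6 under the same model.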
Switching everything over to the low-power and parallel-friendly Pentium M line is probably one of the smartest things Intel ever did. They would've dug their own grave had they stuck with building on Pentium 4 to the bitter end.
Check out my world simulator thingy.