iPhone 5 A6 SoC Teardown: ARM Cores Appear To Be Laid Out By Hand
MrSeb writes "Reverse engineering company Chipworks has completed its initial microscopic analysis of Apple's new A6 SoC (found in the iPhone 5), and there are some rather interesting findings. First, there's a tri-core GPU — and then there's a custom, hand-made dual-core ARM CPU. Hand-made chips are very rare nowadays, with Chipworks reporting that it hasn't seen a non-Intel hand-made chip for 'years.' The advantage of hand-drawn chips is that they can be more efficient and capable of higher clock speeds — but they take a lot longer (and cost a lot more) to design. Perhaps this is finally the answer to what PA Semi's engineers have been doing at Apple since the company was acquired back in 2008..."
Pretty picture of the chip after using an Ion Beam to remove the casing. The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever. NP classification notwithstanding.
That must be a very fine tipped resist pen...
The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever. NP classification notwithstanding.
Coding in assembly still remains a superior method of squeezing extra performance out of software. It's just that few people do it because compilers are "good enough" at guessing which optimizations to apply, and where, and usually development costs are the primary concern for software development. But when you're shipping hundreds of millions of units of hardware, and you're trying to pack as much processing power in a small and efficient form factor, you don't go with VLSI for the same reason you don't go with a compiler for realtime code: You need that extra few percent.
#fuckbeta #iamslashdot #dicemustdie
There are a lot of layout methodologies that are between the (frankly mythical) "X cache, Y FPUs, and Z cores" and fully hand layout. The top level may have more or less amounts of hand assembly, some blocks can be hand optimized, etc.. Usually, there is lots of glue logic which must be designed in RTL, synthesized and only then laid-out. And, for most blocks the process to create the logic design (RTL or perhaps gates) is separate from the process of laying-out these blocks. So there is room for manual involvement in each of the steps.
The real "Libtards" are the Libertarians!
The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever. NP classification notwithstanding.
Coding in assembly still remains a superior method of squeezing extra performance out of software. It's just that few people do it because compilers are "good enough" at guessing which optimizations to apply, and where, and usually development costs are the primary concern for software development. But when you're shipping hundreds of millions of units of hardware, and you're trying to pack as much processing power in a small and efficient form factor, you don't go with VLSI for the same reason you don't go with a compiler for realtime code: You need that extra few percent.
I like to view things as a little more complicated than just applying optimizations. IMHO assembly gets some of its biggest wins when the human programmer has information that can't quite be expressed in the programming language. Specifically I recall such things in the bad old days when games and graphics code would use fixed point math. The programmer knew the goal was to multiply two 32-bit values, get a 64-bit result and right shift that result back down to 32 bits. The Intel assembly programmer knew this could be done in a single instruction. However there wasn't any real way to convey the bit twiddling details of this fixed point multiply to a C compiler so that it could do a comparable operation. C code could do the calculation but it needed to multiply two 64-bit operands to get the 64-bit result.
Looking closely I see a bunch of ram - probably half laid out by hand (caches) - and a many may small standard cell blocks almost certainly not laid out by hand - what I don't see is an obviously hand laid out datapath (the first part of your CPU you spend layout engineers on) - look for that diagonal where the barrel shifter(s) would be. There are some very regular structures (8 vertically) that I suspect are register blocks.
Still what I see is probably someone managing timing by synthesizing small std cell blocks (not by hand), laying those blocks out by hand then letting their router hook them up on a second pass - - it's probably a great way to spend a little extra time guiding your tools into doing a better job to squeeze that extra 20% out of your timing budget and give you a greater gate density (and lower resulting wire delays)
So - a little bit of stuff being done by hand but almost all the gates being lait out by machine
I've put the picture (which is what everyone wants) up here:
http://i.imgur.com/vqCAu.jpg
This is not by hand.
To take a programming analogy, it's looking at what the compiler generated, and then giving it hints so the resultant code/chip is laid out as you expect.
Chips stopped being able to be laid out 'properly' by hand some time ago.
Doing this has much the same benefits as doing it with code.
You know stuff the compiler does not.
You can spot silly stuff it's doing, that is not wrong, but suboptimal, and hold its hand.
Android users usually get laid by hand.
Brilliant, this is what I love about Slashdot, I can be the biggest geek in whatever field I pick and I will still get outgeeked! I enjoyed reading the comments above mostly because I have absolutely no idea what the detail is, and I'd never even realised that hand-drawn vs machine was a issue.
:)
Can anyone supply a concise explanation of the differences and how it's all done? I'm guessing we're talking about people drawing circuits on acetate or similar and then it's scaled down photo-style to produce a mask for the actual chip?
Yes, I know I can just Google it, and I will, but as the question came up here I thought I'd add something to a real conversation, it beats a pointless click of some vague "like" button any day
Please consider this account deleted, I just can't be bothered with the spam anymore.
I don't think bicycle riding is a very good analogy to this problem. How about cooking, which is a procedural step-by-step operation? Little hints the recipe can give you like "preheat oven to 350 degrees" can be a tremendous time-saver later. If you didn't know to do that, you'd get your dish ready and then look at the oven (off) and click it on and sit back and wait 20 minutes before placing it in the oven. A dish that was supposed to be 60 minutes start to serve is now going to take 80 minutes due to a lack of process optimization.
Compilers have the same problem of not knowing what the expectations are down the road, and aren't good at timing things. Good expereinced cooks can manage a 4 course meal and time it so all the dishes are done at the right time and don't dirty as many dishes. Inexperienced cooks are much like compilers, they can get the job done but their timing and efficiency usually have much room for improvement.
I work for the Department of Redundancy Department.
Actually not only is that one hand made, each and every A6 is lovingly hand made. That's why they're top quality.
and their tech section is run by taco so it could be counted as a sort of pseudo-slashdot effect they have as well
---Saying gnome 3 is better than windows 8 not so much a compliment as it is damning with light praise.
I'm guessing that the search space is too large to brute force the optimization. For similar reasons we can't write a program that can beat a Go master. It's just too hard a problem without heuristics, and the heuristics in the human brain are better. Figure out why, and you've solved AI.
Give me Classic Slashdot or give me death!
What you're missing is that chip layout is NP-complete. For anything beyond very trivial chips, no computer algorithm can yield the optimal solution in a reasonable time.
As I understand it, automated layout algorithms are still, when you get down to it, largely quite dumb. I'm sure this is oversimplifying and someone who writes place-and-route software will probably want to kill me, but the algorithm is closer to "throw stuff together, measure performance, tweak things randomly, measure performance, keep the change if it got better" than to anything likely to yield an optimal solution. Eventually, you'll converge on a decent layout, sure, but not an optimal one.
It's pretty much guaranteed that this chip wasn't completely hand-crafted (modern chips are much too complicated to do that). Instead, most likely, engineers guided the placement of major blocks and data paths, and let the automated place-and-route software choose the rest. By constraining the design based on intelligent decisions, you can guide the automated process to converge on a better solution.
" The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever. NP classification notwithstanding."
I've done PCB layouts, microwave chip and wire circuits, as well as RFIC/MMIC layouts. Anyone who asks the question above has never done a real layout. Many autorouter and layout tools allow complex rules to match delays, keep minimum widths, etc. You can spend as much time on each layout trying to populate these rules for critical sections of a design, but it is like trying to train a 5 year old to do brain surgery. Digital design is rather much different than the analog circuits I work on, but you only have to do a few layouts of any flavor by hand in your life to be able to see just how scary it is to hand a layout to HAL.
Clearly autorouters and autogenerated layouts, and I don't mean to sound like too much of a luddite... I've witnesses plenty of awful hand layouts to go around as well.
Display is LG, Flash is Hynix, the RAM is from Elpida and their chip is their own design with Samsung just acting as a fab no different than Global Foundries or TSMC.
And an incremental chip would benefit from hand-holding more than a new one. Say they tested the chip at + 50% clock speed and identified locations of instability. Then, they "by hand" took an optimized automated layout and tweaked it to improve those few specific areas. That would be a "by hand" design that didn't take too much work and gave a better result. Perhaps do the same thing, but with 50% less voltage, rather then increased speed. Then hand-optimize on that. Then compare the two and come up with something that's faster and lower power than the automated process would ever come up with.
Most algorithms I've messed with are very good at iteration, but bad at evolution (they win chess by calculating odds and values, not by analyzing the opponent and his moves).
Learn to love Alaska
Display is LG. Flash is mostly Hynix and Toshiiba.
Yeah, but the software is Samsung, and everyone knows that's what really counts.
The CPU is manufactured by Samsung, and that's what really counts for Fandroids.
Nah, I was referring to the well sourced fact that iOS is actually just a gimped version of Android.
Remember Schmidt was on the Apple board, and he provided preview copies of Android to Jobs.
They're made into nutritional supplements that you can pick up in a rectangular bar form
With rounded corners?
When someone buys a design from ARM, they buy one of two things:
1. A Hard macro block. This is like an mspaint version of a cpu. it looks just like the photos here. The CPU has been laid out partially by hand by ARM engineers. The buyer must use it exactly as supplied - changing it would be neigh-on impossible. In the software world, it's the equivalent of giving an exe file.
2. Source Code. This can be compiled by the buyer. Most buyers make minor changes, like adjusting the memory controller or caches, or adding custom FPU-like things. They then compile themselves. Most use a standard compiler rather than hand-laying out the stuff, and performance is therefore lower.
The articles assertion that hand layout hasn't been done for years outside intel as far as I know is codswallop. Elements of hand layout, from gate design to designing memory cells and cache blocks have been present in ARM hard blocks since the very first arm processors. Go look in the lobby at ARM HQ in Cambridge UK and you can see the meticulous hand layout of their first cpu, and it's so simple you can see every wire!
Apple has probably collaborated with ARM to get a hand layout done with apples chosen modifications. I can't see anything new or innovative here.
Evidence: http://www.arm.com/images/A9-osprey-hres.jpg (this is a layout for an ARM Cortex A9)
The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever.
No matter how much improvement on VLSI layout software their output can't match that of hand-laid layout by those who know what they are doing.
The VLSI layout software are like compilers. The final compiled code relies on two factors - the source-code input and the built-in "rules" of the compilers.
A similar case is in software programming - The source code from a so-so programmer compiled by a very very good compiler will result in a "good-enough" result.
It's good enough because it gets the job done.
However, a similar program by an expert Assembly Language programmer would have left "good enough" behind because the assembly language programmer would know how to tweak his code using the most efficient commands, and cut out the 'fats" by optimizing the loops and flows.
Muchas Gracias, Señor Edward Snowden !
As a mathematician, you ought to understand global optimization encountering local minima in a high-dimensional space. Standard tools for large-scale functional minimization are all subject to it in one form or another, and humans get to ignore all the "stuff that doesn't make sense" - machines don't have that latitude, at least with current algorithms.
Don't get me wrong, the layout and design tools are on the bleeding edge; they're as sophisticated as they come, and there's a *huge* amount of maths in how they work, but they're still crap, compared to a moderately skilled human. What they do excel at is doing all the tedious repetitive work that is typically required, and there's a *lot* of that.
Simon
Physicists get Hadrons!
Not surprising at all, as PA SEMI was founded by Daniel W. Dobberpuhl.
Daniel Dobberpuhl had his hand in StrongARM and DEC Alpha design - both hand-drawn cores which to this day command some respect in chip design circles I'm told.
Anyway,
I said no... but I missed and it came out yes.
I don't see anything in the pictures which implies "hand custom layout". I see a lot of carefully placed and floorplanned blocks, some of which are synthesized and some of which may have varying degrees of directed placement & routing. There are a lot of RAMs and register files, which look very regular but there's no way to tell whether they were generated by a bog standard RAM/RF compiler or whether there was some custom work (perhaps a combination of the two). There are a lot of unique blocks for a chip this size, I suspect there are several fixed function units to do various things (mpeg decoding or whatnot).
Hand custom layout conjures images of dozens of layout engineers drawing polygons for every transistor; I doubt they did much of that but I'm certain you can't tell from these kinds of photos.
It certainly looks "designed" and knowing how sharp the pasemi folks are then that isn't at all surprising.
The question I have is how it's less expensive (in the long run) to lay a chip out by hand once instead of improving your VLSI layout software forever. NP classification notwithstanding.
It's simple math. At what volume will the chip be produced? A modern fab costs $X Billion, and you know pretty much exactly how many wafers you can run during the 3 years it is state-of-the-art. After that, add $Y Billion for a refit, or just continue to run old processes. Anyway, say a new fab at refit time would cost $Z Billion. Refitting the old fab instead costs $Y Billion. So you save $Z-$Y by doing a refit. So the original fab cost you $X-($Z-$Y). Divide by number of wafers the fab can run during its life, that is the cost per wafer. Now compute die area for hand layout versus auto layout, and adjust for imporved yield for smaller die. Divide by die per wafer. That is how much less each die costs you. Now since the die is smaller, it probably runs faster, so adjust your yield-to-frequency-spec upwards, or adjust your average selling price upwards if the speed difference is "large" (enough MHz to have marketing value). That is the value of hand layout. It isn't rocket surgury to work out a dollars-and-cents number.
Anyway, even at Intel for at least the past 20 years only highly repetive structures like datapath logic has been hand laid out. Control logic is too tedius to lay out by hand, doesn't yield much area benefit, and is where the bulk of the bug fixes end up so it's the most volatile part of the layout from stepping to stepping.
So, can hand layout have a positive return on investment? Yes, if you run enough wafers of one part to make the math work out. These days the math will only work out for higher volume parts.
(Yes, I'm ex-Intel).