Next-Gen GPU Progress Slowing As It Aims for 20 nm and Beyond
JoshMST writes "Why are we in the middle of GPU-renaming hell? AMD may be releasing a new 28-nm Hawaii chip in the next few days, but it is still based on the same 28-nm process that the original HD 7970 debuted on nearly two years ago. Quick and easy (relative terms) process node transitions are probably a thing of the past. 20-nm lines applicable to large ASICs are not being opened until mid-2014. 'AMD and NVIDIA will have to do a lot of work to implement next generation features without breaking transistor budgets. They will have to do more with less, essentially. Either that or we will just have to deal with a much slower introduction of next generation parts.' It's amazing how far the graphics industry has come in the past 18 years, but the challenges ahead are greater than ever."
As a Radeon 7970 early adopter, I am completely fine with this. It still more than kicks butt at any game I throw at it, and hopefully this slow pace will mean that I'll get another couple of good years out of my expensive purchase.
Meanwhile, Intel is about to give us 15-core Ivy Bridge Xeons. A year from now we'll have Haswell Xeons with at least that many cores, given that they have the same 22nm feature size.
How many cores will 14nm Broadwell parts give us (once they sort out the yield problems)? You may expect to see 4-5 billion transistor CPUs in the next few years.
Yay for Moore's law.
More GPU power translates into more detailed geometry and shaders as well as more GPGPU power to calculate more detailed physics.
No, seriously. I have yet to find a graphics card that will accelerate 2D line or bitmapped drawings, such as are found in PDF containers. It isn't memory-bound, as you can easily throw enough RAM to hold the base file, and it shouldn't be buffer-bound. And yet it still takes seconds per page to render an architectural print on screen. That may seem trivial, but to render real-time thumbnails of a 200 page 30x42" set of drawings becomes non-trivial.
If you can render an entire screen in 30ms, why does it take 6000ms to render a simple bitmap at the same resolution?
(the answer is, of course, because almost nobody buys a card for 2D rendering speed - but that makes it no less frustrating)
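To put the numbers above in perspective, here is a rough back-of-the-envelope calculation. It assumes the 6000 ms figure applies to every page of the 200-page set, which the post only implies, and the variable names are made up for illustration.

```python
# Figures taken from the comment above; treating 6000 ms as the
# per-page render time is an assumption for illustration.
FRAME_MS = 30      # time to render a full 3D frame
PAGE_MS = 6000     # time to render one PDF page
PAGES = 200        # size of the drawing set

slowdown = PAGE_MS / FRAME_MS          # how much slower a page is than a frame
total_seconds = PAGES * PAGE_MS / 1000 # time to thumbnail the whole set
total_minutes = total_seconds / 60

print(f"{slowdown:.0f}x slower, {total_minutes:.0f} minutes for the whole set")
```

So on those assumptions a page is 200 times slower than a game frame, and thumbnailing the full set takes about 20 minutes.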
Is it just my observation, or are there way too many stupid people in the world?
The real killer app I've heard of for gaming rigs is making realtime special effects for movies and TV. Other than that, there are news departments where thin clients can take advantage of a GPU-assisted server to run as many displays as the hardware can handle.
Then there is Wall Street, where a CUDA-assisted computer can model market dynamics in real time; there are a lot of superfast computers on stock exchanges. So there you go: 3 reasons for GPUs to go as far as technology will allow.
https://www.gnu.org/philosophy/free-sw.html
The 'point' of this very crappy article is that each process node shrink is taking longer and longer. Why bother connecting this to GPUs, when self-evidently ANY type of chip relying on regular process shrinks will be affected?
The real story is EITHER the future of GPUs in a time of decreasing PC sales, rapidly improving FIXED-function consoles, and the need to keep high-end GPUs within a sane power budget ***OR*** what is the likely future of general chip fabrication?
Take the latter. Each new process node costs vastly larger amounts of money to implement than the last. Nvidia put out a paper last year (about GPUs, but their point was general) arguing that the cost of shrinking a chip may become so high that it will ALWAYS be more profitable to keep making chips on the older process instead. This is the nightmare scenario, NOT bumping into the limits of physics.
Today, we have a good view of the problem. TSMC took a long time to get to 28nm, and is taking much longer to get off. 20nm and smaller aren't even real process node shrinks. What Intel dishonestly calls 22nm and 14nm is actually 28nm with some elements only on a finer geometry. Because of this, AMD is due to catch up with Intel in the first half of 2014, with its FinFET transistors also at 20nm and 14nm.
Some nerdy sheeple won't believe what I've just said about Intel's lies. Well, Intel gets 10 million transistors per mm² on its 22nm process, and AMD, via TSMC, gets 14+ million on the larger 28nm process. It defies all concept of maths when Intel CLAIMS a smaller process, but gets far fewer transistors per area than a larger process.
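The arithmetic behind that claim can be sketched as follows. The two density figures are the ones quoted in this post and are not independently verified here. The assumption that density scales with the inverse square of the node name is a simplification added for illustration, and real density also depends heavily on what kind of circuit is being laid out.

```python
def ideal_scaled_density(density, old_nm, new_nm):
    """Density expected after a shrink if area scaled with the square
    of the feature size (an idealised assumption, not a measured rule)."""
    return density * (old_nm / new_nm) ** 2

# Figures as quoted in the post, in millions of transistors per mm^2.
tsmc_28nm = 14.0
intel_22nm_quoted = 10.0

# What 14 M/mm^2 at 28nm would become at a "true" 22nm under ideal scaling.
expected_22nm = ideal_scaled_density(tsmc_28nm, 28, 22)
ratio = intel_22nm_quoted / expected_22nm

print(f"ideal 22nm density: {expected_22nm:.1f} M/mm^2")
print(f"quoted figure is {ratio:.0%} of that")
```

Under those assumptions an ideal 22nm shrink of the 28nm figure would land near 22.7 million per mm², so the quoted 10 million is less than half of it.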
It gets more complicated. The rush to SHRINK has caused the industry to overlook the possibilities of optimising a given process with new materials, geometry, and transistor designs. FD-SOI is proving to be insanely better than FinFET on any current process, but is being IGNORED as much as possible by most of the bigger players, because they've already invested tens of billions of dollars in prepping for FinFET. Intel has had two disastrous rounds of new CPU (Ivy Bridge and Haswell), because FinFET failed to deliver any of the theoretical benefits on the process they 'call' 22nm.
Intel has one very significant TRUE lead, though: that of power consumption in its mains-powered CPU family. Although no-one gives a damn about mains-powered CPU power usage, Intel is more than twice as efficient as AMD here. Sadly, their power advantage largely vanishes with mobile, battery-powered parts.
Anyway, to flip back to GPUs, AMD is about to announce the 290x, the fastest GPU, but with a VERY high power usage. Both AMD and Nvidia need to seriously work to get power consumption down as low as possible, and this means 'sweet spot' GPU parts which will NOT be the fastest possible, but will have sane compromise characteristics. Because 20nm from TSMC is almost here (in 12 months max), AMD and Nvidia are focused firstly on the new shrink, and finFETs, BUT moving below 20nm (in a TRUE shrink, not simply measuring the 2D profile of finFET transistors) is going to take a very, very, very long time, so all companies have an incentive to explore ways of improving chip design on a GIVEN process, not simply lazily waiting for a shrink.
Who knows? FD-SOI offers (for some chips) more improvements than a single shrink using conventional methods. It is more than possible that by exploring material science, and the physics of semiconductor design, we could get the equivalent of the advantages of multiple generations of shrink, without changing process.
Not totally true. Stroke/path/fill rasterization work is not supported by current 3D rendering APIs (and thus not accelerated by 3D hardware). Right now the stroke/path/fill rasterization is done on the CPU and merely 2D blitted to the frame buffer by the GPU. The CPU could of course attempt to convert the stroke/path into triangles and then use the GPU to rasterize those triangles (with some level of efficiency), but that's a far cry from "proper, full-featured 2D".
Fonts are special cased in that glyphs are cached, but small font rasterization isn't generally possible to do with triangle rasterization (because of the glyph hints).
Since SW doesn't even attempt to use HW for modern 2D operations, it will likely be a long time before HW will support this kind of stuff...
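For anyone wondering what "convert the stroke/path into triangles" means in practice, here is a minimal sketch. It only handles straight polyline segments with no joins, caps, curves or fills, and the function name is made up for illustration. Real path renderers do a great deal more.

```python
import math

def stroke_to_triangles(points, width):
    """Turn a polyline into triangles by widening each segment into a quad.

    Simplified on purpose: no joins or caps, so corners will show gaps
    or overlaps. Returns a list of triangles, each a tuple of 3 (x, y) points.
    """
    half = width / 2.0
    triangles = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0:
            continue  # skip degenerate zero-length segments
        # Unit normal to the segment, scaled to half the stroke width.
        nx, ny = -dy / length * half, dx / length * half
        a = (x0 + nx, y0 + ny)
        b = (x0 - nx, y0 - ny)
        c = (x1 - nx, y1 - ny)
        d = (x1 + nx, y1 + ny)
        # Split the quad a-b-c-d into two triangles for the GPU.
        triangles.append((a, b, c))
        triangles.append((a, c, d))
    return triangles

# A horizontal line 10 units long and 2 units wide becomes 2 triangles.
tris = stroke_to_triangles([(0, 0), (10, 0)], 2)
print(len(tris), tris[0])
```

The resulting triangle list is what would be handed to the 3D API as a vertex buffer. The missing joins and caps are a small taste of why this falls short of full 2D support.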
A - anything that you can't do by tessellating to triangles could be done with OpenCL or CUDA. You could, for example, assign OpenCL kernels where each instance rasterizes one stroke and composite the results or something similar, and exploit the parallelism of the GPU. But it would be inconvenient to write. Especially since most PDF viewers don't even bother with effective parallelism in their software rasterizers.
B - you can do anything by tessellating to triangles.
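The structure suggested in point A (one instance per stroke, then composite) can be sketched on the CPU. This is not OpenCL: a Python thread pool stands in for the GPU work-items purely to show the shape of the idea. The stroke format and function names are assumptions made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

def rasterize_stroke(stroke, width, height):
    """Rasterize one thick line segment into its own 0/1 coverage mask.

    A pixel is covered if its centre lies within half the stroke
    thickness of the segment. stroke = ((x0, y0), (x1, y1), thickness).
    """
    (x0, y0), (x1, y1), thickness = stroke
    half = thickness / 2.0
    dx, dy = x1 - x0, y1 - y0
    seg_len_sq = dx * dx + dy * dy
    mask = [[0] * width for _ in range(height)]
    for py in range(height):
        for px in range(width):
            cx, cy = px + 0.5, py + 0.5
            if seg_len_sq == 0:
                t = 0.0
            else:
                # Project the pixel centre onto the segment, clamped to its ends.
                t = ((cx - x0) * dx + (cy - y0) * dy) / seg_len_sq
                t = max(0.0, min(1.0, t))
            qx, qy = x0 + t * dx, y0 + t * dy
            if (cx - qx) ** 2 + (cy - qy) ** 2 <= half * half:
                mask[py][px] = 1
    return mask

def composite(masks, width, height):
    """Combine per-stroke masks with a logical OR."""
    out = [[0] * width for _ in range(height)]
    for mask in masks:
        for y in range(height):
            for x in range(width):
                out[y][x] |= mask[y][x]
    return out

def render(strokes, width, height):
    # One task per stroke, mirroring "each instance rasterizes one stroke".
    with ThreadPoolExecutor() as pool:
        masks = list(pool.map(lambda s: rasterize_stroke(s, width, height), strokes))
    return composite(masks, width, height)

strokes = [((0, 2.5), (8, 2.5), 1.0), ((4.5, 0), (4.5, 6), 1.0)]
image = render(strokes, 8, 6)
for row in image:
    print("".join("#" if v else "." for v in row))
```

Each stroke writes only to its own mask, so the tasks never contend with each other. The cost is a full-size buffer per stroke plus a compositing pass, which is part of why this approach is inconvenient at scale.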
True. And once graphical realism in a human-created game universe reaches its practical limit, game developers will have to once again experiment with stylized graphics. This parallels painting, which progressed to impressionism, cubism, and abstract expressionism.