Next-Gen GPU Progress Slowing As It Aims for 20 nm and Beyond
JoshMST writes "Why are we in the middle of GPU-renaming hell? AMD may be releasing a new 28-nm Hawaii chip in the next few days, but it is still based on the same 28-nm process that the original HD 7970 debuted on nearly two years ago. Quick and easy (relative terms) process node transitions are probably a thing of the past. 20-nm lines applicable to large ASICs are not being opened until mid-2014. 'AMD and NVIDIA will have to do a lot of work to implement next generation features without breaking transistor budgets. They will have to do more with less, essentially. Either that or we will just have to deal with a much slower introduction of next generation parts.' It's amazing how far the graphics industry has come in the past 18 years, but the challenges ahead are greater than ever."
As a Radeon 7970 early adopter, I am completely fine with this. It still more than kicks butt at any game I throw at it, and hopefully this slow pace will mean that I'll get another couple of good years out of my expensive purchase.
Meanwhile, Intel is about to give us 15-core Ivy Bridge Xeons. A year from now we'll have at least that many cores in Haswell Xeons, given that they have the same 22nm feature size.
How many cores will 14nm Broadwell parts give us (once they sort out the yield problems)? You may expect to see 4-5 billion transistor CPUs in the next few years.
Yay for Moore's law.
There have been some graphics advances since the days of Quake 3.
More GPU power translates into more detailed geometry and shaders as well as more GPGPU power to calculate more detailed physics.
No, seriously. I have yet to find a graphics card that will accelerate 2D line or bitmapped drawings, such as are found in PDF containers. It isn't memory-bound, as you can easily throw enough RAM to hold the base file, and it shouldn't be buffer-bound. And yet it still takes seconds per page to render an architectural print on screen. That may seem trivial, but to render real-time thumbnails of a 200 page 30x42" set of drawings becomes non-trivial.
If you can render an entire screen in 30ms, why does it take 6000ms to render a simple bitmap at the same resolution?
(the answer is, of course, because almost nobody buys a card for 2D rendering speed - but that makes it no less frustrating)
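To put rough numbers on that complaint, here is a back-of-the-envelope sketch using only the figures above (6000 ms per page, a 30 ms full-screen frame, a 200-page set). The function and constant names are made up for illustration.

```python
# Figures taken from the comment above; nothing here is measured.
PAGES = 200
MS_PER_PAGE_PDF = 6000   # observed PDF page render time
MS_PER_FRAME_GPU = 30    # time to render a full 3D frame

def total_seconds(pages, ms_per_page):
    """Total wall-clock time to render every page once."""
    return pages * ms_per_page / 1000.0

pdf_total = total_seconds(PAGES, MS_PER_PAGE_PDF)
gpu_total = total_seconds(PAGES, MS_PER_FRAME_GPU)  # hypothetical GPU-speed pace
slowdown = MS_PER_PAGE_PDF / MS_PER_FRAME_GPU

print(f"PDF path: {pdf_total / 60:.0f} minutes for {PAGES} pages")
print(f"GPU-frame pace: {gpu_total:.0f} seconds for {PAGES} pages")
print(f"Slowdown factor: {slowdown:.0f}x")
```

So thumbnails for the whole set take about 20 minutes at the observed pace, versus a handful of seconds if pages rendered as fast as game frames.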
Is it just my observation, or are there way too many stupid people in the world?
The real killer app I've heard of for gaming rigs is making real-time special effects for movies and TV. Other than that, there are news departments where thin clients can take advantage of a GPU-assisted server to run as many displays as the hardware can handle.
Then there is Wall Street, where a CUDA-assisted computer can model market dynamics in real time; there are a lot of superfast computers on stock exchanges. So there you go: 3 reasons for GPUs to go as far as technology will allow.
https://www.gnu.org/philosophy/free-sw.html
The 'point' of this very crappy article is that each process node shrink is taking longer and longer. Why bother connecting this to GPUs, when self-evidently ANY type of chip relying on regular process shrinks will be affected?
The real story is EITHER the future of GPUs in a time of decreasing PC sales, rapidly improving FIXED-function consoles, and the need to keep high-end GPUs within a sane power budget ***OR*** what is the likely future of general chip fabrication?
Take the latter. Each new process node costs vastly larger amounts of money to implement than the last. Nvidia put out a paper last year (about GPUs, but their point was general) that the cost of shrinking a chip may become so high that it will ALWAYS be more profitable to keep making chips on the older process instead. This is the nightmare scenario, NOT bumping into the limits of physics.
Today, we have a good view of the problem. TSMC took a long time to get to 28nm, and is taking much longer to get off. 20nm and smaller aren't even real process node shrinks. What Intel dishonestly calls 22nm and 14nm is actually 28nm with some elements only on a finer geometry. Because of this, AMD is due to catch up with Intel in the first half of 2014, with its FinFET transistors also at 20nm and 14nm.
Some nerdy sheeple won't believe what I've just said about Intel's lies. Well, Intel gets 10 million transistors per mm² on its 22nm process, and AMD, via TSMC, gets 14+ million on the larger 28nm process. It defies all concept of maths when Intel CLAIMS a smaller process, but gets far fewer transistors per area than a larger process.
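For anyone who wants to check the arithmetic behind that argument, here is a minimal sketch. It assumes idealized scaling, where density goes up with the inverse square of the feature size, and it takes the two density figures quoted above at face value; neither is verified here. Real densities also depend heavily on what is being laid out (cache versus logic, CPU versus GPU), so treat this as an illustration of the reasoning, not a verdict.

```python
def ideal_density(known_density, known_node_nm, target_node_nm):
    """Scale a density figure by the inverse square of feature size.

    This is an idealized assumption; real processes do not scale this cleanly.
    """
    return known_density * (known_node_nm / target_node_nm) ** 2

TSMC_28NM = 14.0   # M transistors/mm^2, as quoted in the comment (unverified)
INTEL_22NM = 10.0  # M transistors/mm^2, as quoted in the comment (unverified)

expected_22nm = ideal_density(TSMC_28NM, 28, 22)
shortfall = INTEL_22NM / expected_22nm

print(f"Ideal 22nm density from the 28nm figure: {expected_22nm:.1f} M/mm^2")
print(f"Quoted 22nm figure is {shortfall:.0%} of that")
```

Under those assumptions a true 22nm shrink of the quoted 28nm figure would land near 22.7 million transistors per mm², which is the gap the comment is pointing at.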
It gets more complicated. The rush to SHRINK has caused the industry to overlook the possibilities of optimising a given process with new materials, geometry, and transistor designs. FD-SOI is proving to be insanely better than FinFET on any current process, but is being IGNORED as much as possible by most of the bigger players, because they've already invested tens of billions of dollars in prepping for FinFET. Intel has had two disastrous rounds of new CPU (Ivy Bridge and Haswell), because FinFET failed to deliver any of the theoretical gains on the process they 'call' 22nm.
Intel has one very significant TRUE lead, though: that of power consumption in its mains-powered CPU family. Although no-one gives a damn about mains-powered CPU power usage, Intel is more than twice as efficient as AMD here. Sadly, their power advantage largely vanishes with mobile, battery-powered parts.
Anyway, to flip back to GPUs, AMD is about to announce the 290x, the fastest GPU, but with a VERY high power usage. Both AMD and Nvidia need to seriously work to get power consumption down as low as possible, and this means 'sweet spot' GPU parts which will NOT be the fastest possible, but will have sane compromise characteristics. Because 20nm from TSMC is almost here (in 12 months max), AMD and Nvidia are focused firstly on the new shrink, and finFETs, BUT moving below 20nm (in a TRUE shrink, not simply measuring the 2D profile of finFET transistors) is going to take a very, very, very long time, so all companies have an incentive to explore ways of improving chip design on a GIVEN process, not simply lazily waiting for a shrink.
Who knows? FD-SOI offers (for some chips) more improvements than a single shrink using conventional methods. It is more than possible that by exploring material science, and the physics of semiconductor design, we could get the equivalent of the advantages of multiple generations of shrink, without changing process.
The Xbox One and PS4, for example, will be good at 1080p but ultimately only a few times faster than the previous generation consoles.
I believe the same sort of slowdown happened at the end of the fourth generation. The big improvements of the Genesis over the Master System were the second background layer, somewhat larger sprites, and the 68000 CPU.
Good luck gaming on a high resolution monitor spending less than $500.
Good luck buying a good 4K monitor for substantially less than $4K.
The thing with graphics improvements is that GPUs are getting better on a linear scale, but quality improvements need to happen on a logarithmic scale. Going from 100 polys to 200 polys looks like a huge leap, but going from 10,000 polys to 10,100 polys doesn't. I personally think the next big thing will be on-card raytracing (NVidia has already demonstrated some of this). Massively parallel raytracing tasks are like candy for GPGPUs, but there is a lot of investment in rasterising at the moment, so that is their current go-to method.
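The poly-count example can be made concrete with a small sketch. It assumes, purely for illustration, that a perceived improvement tracks the number of doublings of polygon count (log base 2 of the ratio); that model and the function name are assumptions, not something established in the thread.

```python
import math

def perceived_step(old_polys, new_polys):
    """Improvement measured in doublings: log2 of the ratio (assumed model)."""
    return math.log2(new_polys / old_polys)

small_scene = perceived_step(100, 200)        # same +100 polys...
large_scene = perceived_step(10_000, 10_100)  # ...on a much bigger base

print(f"100 -> 200 polys:       {small_scene:.3f} doublings")
print(f"10,000 -> 10,100 polys: {large_scene:.3f} doublings")
```

The same 100 extra polygons count as one full doubling on the small model and roughly a seventieth of a doubling on the large one, which is why steady linear gains stop being visible.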
During TSMC's earnings call the CEO mentioned that there are tape-outs for GPUs on 16nm FinFET, but not on 20nm, hinting that Nvidia and AMD will skip that node altogether.
http://seekingalpha.com/article/1750792-taiwan-semiconductor-manufacturing-limited-management-discusses-q3-2013-results-earnings-call-transcript?page=3
"Specifically on 20-nanometers, we have received 5 product tape-outs, and scheduled more than 30 tape-outs in this year and next year from mobile computing CPU and PLD segments"
"On 16-FinFET. Technological development is progressing well, risk production is on schedule by the end of this year. More than 25 customer product tape-outs are planned in 2014, including mobile computing, CPU, GPU, PLD and networking applications. "
> How much more is enough?
Uhm, never.
I have a GTX Titan and it is still TOO SLOW: I want to run my games at 100 fps @ 2560 x 1440. I prefer 120 Hz on a single monitor using LightBoost. Tomb Raider 2013 dips down below 60 fps which makes me mad.
And before you say "What?", I started out with the Apple ][ with 280x192; even ran glQuake at 512x384 res on a CRT to guarantee 60 fps, so I am very thankful for the progress of GPUs. ;-)
But yeah, my other 560 Ti w/ 448 cores, which runs 1080p @ 60 Hz, is certainly "good enough", but we are QUITE a ways away from the end of GPUs for the foreseeable future. There is still a demand for real-time CG photorealism.
All of the 3D rendering APIs are capable of proper, full-featured 2D rendering. The same hardware accelerates both just as well. The problem is that most apps are just not using it and/or that they are CPU bound for other reasons. PDFs, for instance, are rather complex to decode.
Not totally true. Stroke/path/fill rasterization work is not supported by current 3D rendering APIs (and thus not accelerated by 3D hardware). Right now the stroke/path/fill rasterization is done on the CPU and merely 2D blitted to the frame buffer by the GPU. The CPU could of course attempt to convert the stroke/path into triangles and then use the GPU to rasterize those triangles (with some level of efficiency), but that's a far cry from "proper, full-featured 2D".
Fonts are special cased in that glyphs are cached, but small font rasterization isn't generally possible to do with triangle rasterization (because of the glyph hints).
Since SW doesn't even attempt to use HW for modern 2D operations, it will likely be a long time before HW will support this kind of stuff...
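The "convert the stroke into triangles on the CPU" fallback mentioned above can be sketched roughly as follows. This is a minimal illustration, not how any particular PDF renderer or driver does it: each straight segment of a polyline becomes a quad (two triangles) offset by half the stroke width along the segment normal, and joins, caps, curves, and anti-aliasing are all ignored. The function name is hypothetical.

```python
import math

def stroke_to_triangles(points, width):
    """Expand a polyline into triangles, two per segment (no joins or caps)."""
    half = width / 2.0
    triangles = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0:
            continue  # skip degenerate zero-length segments
        # Normal to the segment, scaled to half the stroke width.
        nx, ny = -dy / length * half, dx / length * half
        a = (x0 + nx, y0 + ny)
        b = (x0 - nx, y0 - ny)
        c = (x1 - nx, y1 - ny)
        d = (x1 + nx, y1 + ny)
        triangles.append((a, b, c))
        triangles.append((a, c, d))
    return triangles

# A 2-unit-wide horizontal stroke from (0, 0) to (2, 0).
tris = stroke_to_triangles([(0, 0), (2, 0)], width=2)
for tri in tris:
    print(tri)
```

The resulting triangle list is what would be handed to the GPU for rasterization. Everything that makes real 2D rendering hard (joins, dashes, curve flattening, hinted glyphs) still has to happen on the CPU before this step, which is the point being made above.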
True. And once graphical realism in a human-created game universe reaches its practical limit, game developers will have to once again experiment with stylized graphics. This parallels painting, which progressed to impressionism, cubism, and abstract expressionism.
The higher-DPI craze has just started. We are going to have higher DPI on monitors, and graphics cards will compete to deliver the same speed there that they manage at lower resolutions. The unfortunate thing is that Moore's law is at its practical limits (for now), so more capable CPUs might become more expensive and consume more power.
I personally hate the noise of those fans and the heat coming from under my table. I don't do games but I use the GTS-450 (joke? ha?) for scientific computing.
If you're out of silicon to work with, you can't just keep throwing transistors at a performance problem. You will have to get smarter with what you do with the transistors. If the GFX card makers add innovative features to the on-board chips, they could solve many bottlenecks we still face with utilizing the massive parallel performance we have on these cards. Both for science and for GFX I'm sure there is a list of "most wanted features" or "biggest hotspots" they could work on. For example, the speed at which you can calculate hashes with oclHashcat differs extremely between NVidia and AMD graphics. NVidia clearly is missing something they don't need a smaller silicon process for. There must be plenty of improvements of this sort that both AMD and NVidia can make.
I was promised a flying car. Where is my flying car?