Twilight of the GPU — an Interview With Tim Sweeney
cecom writes to share that Tim Sweeney, co-founder of Epic Games and the main brain behind the Unreal engine, recently sat down at NVIDIA's NVISION con to share his thoughts on the rise and (what he says is) the impending fall of the GPU: "...a fall that he maintains will also sound the death knell for graphics APIs like Microsoft's DirectX and the venerable, SGI-authored OpenGL. Game engine writers will, Sweeney explains, be faced with a C compiler, a blank text editor, and a stifling array of possibilities for bending a new generation of general-purpose, data-parallel hardware toward the task of putting pixels on a screen."
He talks about the impending fall of the fixed function GPU.
I just browsed the article and it looks like what he's saying is that as GPUs become more like highly parallel CPUs, we will begin to see APIs go away in favor of writing compiled code for the GPU itself. For example, if I want to generate an explosion, I could write some native GPU code for the explosion, and let the GPU execute this program directly... rather than being limited to the API's capabilities.
So essentially, we will go back to game developers needing to make hardware-specific hacks for their games... some games having better support for some cards, etc.
APIs are there for a reason... let's keep 'em and just make them better.
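To make the explosion example concrete, here is a rough sketch of what "write a program for the explosion and let the hardware run it" could look like. It is plain Python standing in for real GPU code, and every name in it (`step_particle`, `make_explosion`, `step_explosion`, the gravity constant) is made up for illustration. The point is only that the per-particle function depends on nothing but its own particle, which is what lets data-parallel hardware run one copy per particle.

```python
import math

GRAVITY = -9.8  # assumed constant for this sketch, in m/s^2


def step_particle(particle, dt):
    """The 'kernel': advance one particle, independent of all the others."""
    x, y, vx, vy = particle
    vy = vy + GRAVITY * dt          # gravity changes the vertical velocity
    return (x + vx * dt, y + vy * dt, vx, vy)


def make_explosion(count, speed=5.0):
    """Spawn particles at the origin, flying outward in a ring."""
    particles = []
    for i in range(count):
        angle = 2.0 * math.pi * i / count
        particles.append((0.0, 0.0, speed * math.cos(angle), speed * math.sin(angle)))
    return particles


def step_explosion(particles, dt):
    # On data-parallel hardware each call could run on its own lane;
    # here the same map simply runs sequentially.
    return [step_particle(p, dt) for p in particles]
```

Calling `step_explosion(make_explosion(1000), 0.016)` once per frame would advance the whole effect, with no graphics API involved until it is time to draw the particles.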
Sometimes the best solution is to stop wasting time looking for an easy solution.
On a day when Lehman Brothers and Merrill Lynch, the backbone of our economy, "died", to make these kinds of heartless statements is just pandering to the prejudices of the liberal elite. Instead of buying a Mac you should buy a Dell (which is far more "American" in the truest sense of the word) and instead of irresponsible talk of combining CPU and GPU you should keep them well separated for the good of our nation.
Our country is built on a strong consumer market; it is "progressives" like you who are causing this crisis, which might even result in a president with an Afro. Shame on you.
If we are going back to the "old" days...
Why can't we skip all this OS nonsense, and just boot the game directly?
After all that will make sure that you get the MOST out of your computer.
Except that you, like everyone else, read "CPU" in the article to mean the Intel/AMD CPU rather than thinking of current-gen GPUs, which are almost capable of massively parallel execution of general-purpose code. With the advent of shaders, more processing could be offloaded to GPUs, and that trend should continue over the next couple of GPU generations. So his idea is that we'll see fewer OpenGL/DirectX-specific API calls and everything being done in CUDA/shaders. That way the folks who write graphics engines aren't limited to the current SGI implementation (here's a set of vectors describing my object -- draw it) and we'll see different rendering engines based on ray tracing, for example (or whatever other methods the engine writers want to use).
This isn't anything new here -- he's basically saying what Intel has already said... You'll see less OpenGL/DirectX and more CUDA/Shader based implementations for rendering engines.
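As a rough illustration of the "different rendering engine" idea, here is a minimal ray-casting sketch in plain Python: one ray per pixel tested against a single sphere, with no OpenGL/DirectX calls anywhere. The function names, the camera setup, and the scene (a unit sphere three units in front of the eye) are all assumptions made up for the example, and the intersection test assumes the ray starts outside the sphere.

```python
import math


def ray_sphere_hit(origin, direction, center, radius):
    """Distance along the ray to the nearest hit in front of it, or None."""
    ox, oy, oz = (origin[i] - center[i] for i in range(3))
    dx, dy, dz = direction
    # Solve |o + t*d - c|^2 = r^2, a quadratic in t.
    a = dx * dx + dy * dy + dz * dz
    b = 2.0 * (ox * dx + oy * dy + oz * dz)
    c = ox * ox + oy * oy + oz * oz - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None                      # the ray misses the sphere
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t > 0.0 else None


def render(width, height):
    """Cast one ray per pixel; return rows of '#' (hit) and '.' (miss)."""
    rows = []
    for j in range(height):
        row = ""
        for i in range(width):
            # Map the pixel centre onto an image plane at z = -1.
            u = 2.0 * (i + 0.5) / width - 1.0
            v = 1.0 - 2.0 * (j + 0.5) / height
            hit = ray_sphere_hit((0.0, 0.0, 0.0), (u, v, -1.0), (0.0, 0.0, -3.0), 1.0)
            row += "#" if hit is not None else "."
        rows.append(row)
    return rows
```

Every pixel is computed independently of every other pixel, which is why this style of renderer maps naturally onto general-purpose data-parallel hardware.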
No, that can't be it. Know why? Because...why would you put more processing and thus more heat in one place that already has problems with that?
You mean how floating point units used to be in a separate coprocessor?
Or how L2 cache used to be on external chips? (And in some cases was even upgradable.)
Or how modems used to have their own signal processors? But now most use the CPU.
Or how we're moving the memory controller into the CPU right now.
Hell, we've even stuck the majority of complete additional CPUs into the CPU with our modern dual and quad core chips.
Apparently the author doesn't know much about computers.
Apparently you don't know much about computers either.
The entire history of the personal computer has been one long slide of functionality moving towards the CPU. Sure, every now and then something new comes along being done by an add-on processor - like the numeric coprocessor, for example.
Sure, before the coprocessor you could accomplish the functionality of what a coprocessor does with an 'integer CPU', but a hardware-optimized numeric coprocessor was a new feature, one that added tremendous floating point performance in dedicated hardware. Within a couple of CPU generations the coprocessor had been completely absorbed into the CPU.
The author is speculating that the GPU will see the same fate eventually. And he's probably right.
And why install an overkill graphics processing unit inside the processor if most people won't use it anyway?
Once upon a time people said that about numeric coprocessors. "Only a research scientist would need that!"
Man, you've got some awful, awful arguments here.
For the same reason that your CPU isn't spread out among thirty chips distributed throughout your laptop: efficiency and cost. Making one chip is generally cheaper than making two, and the amount of bandwidth inside a single chip is massively higher than what you can do with a northbridge.
Every latest-generation operating system provides a 3d accelerated desktop. Every latest-generation computer provides the hardware to use it. Programs are going to be taking more and more advantage of that feature.
See question 1.
Not at all. For one thing, there wouldn't *be* graphics hardware - it'd be more of a vectorized coprocessor. For another thing, why *would* it be any harder? It's not like people are having horrible trouble updating their USB drivers, even if the USB controller is part of another, larger chip.
Obviously, if they took existing laptop designs, and slapped a bigger heatsource in the CPU, yes. I'm assuming that computer manufacturers aren't functionally retarded, and they wouldn't do that. (Well, maybe some would, but their computers aren't going to be stable anyway.)
The same place it already goes on motherboards that have integrated graphics? It's not like "computers without dedicated graphics cards" is a new concept, unless you've been living in a cave for the last decade.
See question 1. Also, "why not?" - it's not like that extra four inches is going to be a serious problem.
As near as I can tell, your argument comes down to the common logical fallacy:
"They *could* do X. But if they do it the *stupidest way possible*, X is a bad idea. Therefore, X is a bad idea."
When determining whether something is a good idea or not, you have to assume it's going to be done well. If the person in charge of integrating CPUs and GPUs is anything less than a complete unalloyed moron, they'll have come up with solutions to all of those issues of yours.
what's a few more inches?
The difference between "Is it in yet?" and "Dear god you're ripping me in two!"
GPUs aren't really parallel in that [traditional multithreaded] sense; they are parallel in the SIMD sense.
Actually, they're somewhere in between. Some current hardware can reallocate individual processors between fragment and vertex processing depending on the current workload profile. Even at the level of an individual processor lots of "threads" may be running simultaneously; this is to hide latency when a shader program blocks on memory (texture or framebuffer) access.
If you look at NV's descriptions of their 8xx-series drivers, they talk about *hundreds* of threads in flight at any given time. These aren't threads in the classical sense - there's no preemption, for a start - but they're much, much more advanced than SIMD-style "apply this instruction to all these values" parallelism.
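A back-of-the-envelope model of why so many threads are kept in flight, written as a Python sketch. It assumes each thread alternates a fixed burst of compute with a fixed memory wait, and that the processor switches to another ready thread whenever one blocks. The function name and the cycle counts in the usage note are hypothetical, not taken from any vendor's documentation.

```python
def utilization(threads_in_flight, compute_cycles, memory_latency):
    """Fraction of cycles one processor spends doing useful work.

    Each thread needs the processor for `compute_cycles` out of every
    `compute_cycles + memory_latency` cycles; other threads can fill
    the remaining time while it waits on memory.
    """
    busy = threads_in_flight * compute_cycles
    period = compute_cycles + memory_latency
    return min(1.0, busy / period)
```

With made-up numbers of 10 compute cycles per 190-cycle texture fetch, `utilization(1, 10, 190)` comes out at 0.05, while `utilization(20, 10, 190)` reaches 1.0. Under this model, a single thread leaves the processor idle 95% of the time, and it takes about twenty threads per processor to hide the latency completely.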
> Or how modems used to have their own signal processors? But now most use the CPU.
Welllll... I'd say the move to HSP modems took place more because the ascent of DSL and cable internet relegated modems to the status of, "nice to have if I happen to need it once in a blue moon to send a fax or dialup in the middle of nowhere at some point over the next 2-4 years." Remember all those articles 3-5 years ago about how host signal processing absolutely DESTROYS CPU performance because it demands constant attention from the CPU, and the software overhead of having to keep stopping to service the modem caused the computer to run at least 20-30% slower? Well, not much has changed, except now with a multi-core CPU it can kill the performance of just ONE core instead of bringing the whole computer to its knees. But even with multicore CPUs, I can guarantee that if modems were still the primary way people got online, there would definitely be a thriving market for "performance" modems that offloaded at LEAST the signal-processing functions to a real DSP (like the Lucent "semi-Winmodems", that actually gave users the best of both worlds... offloading the stuff that really dragged the CPU down to its own DSP, but doing things like compression and error-correction that could be handled in discrete batches faster than even dedicated hardware could achieve).
There's another thing to remember about discrete chips... in the early 3dfx days, the mainstream CPU makers (Intel, AMD, and Cyrix) had ZERO interest in giving even the slightest attention to 3D graphics. Unless you're IBM (who wasn't interested in 3D, either), building CPUs is probably way beyond your company's capabilities. HOWEVER, designing a 3D graphics chip with the complexity of the first ones used by 3dfx IS within the capabilities of a well-funded design company with the connections to get it manufactured. It doesn't even need a fab with the capabilities of one owned by Intel, AMD, etc. So discrete 3d cards were an elegant way to sidestep the deadweight lack of interest on the CPU side by shifting it to a chip that smaller companies could design and build. Now that "the big guys" have turned their attention to it, the smaller players don't have a prayer (ergo, the merger mania among CPU/mobo chipset makers and graphics chip makers).
The same observation can be made regarding cache and memory controllers. In the First Pentium Era, volume manufacturers like Compaq (and their comrades at arms, Intel & AMD) regarded cache as a luxury the unwashed masses could live without, even if it only saved $5 and cut the effective performance in half. Hey, consumers only look at that "MHz" number, anyway... Fortunately, performance-oriented mobo makers were able to take matters into their own hands, and once again do an end run around the CPU vendors' sloth and put cache directly on the mobo. Once CPU makers decided cache mattered, and put lots of it on-die, the marginal benefit of putting more, relatively expensive tertiary cache on the motherboard diminished. As for memory controllers, they got moved into the CPU because it was the only way to reliably achieve increased memory bandwidth (designing a 32-bit parallel interface for ANYTHING that has to run at 400+ MHz and communicate across traces on a circuit board is a hardcore engineering challenge). Serial is cheaper to implement and can be faster overall than a simpler parallel solution, but there's a point where you can't shove the bits any faster, and the only way to increase bandwidth is to go parallel. It's not a coincidence that PCI Express video cards communicate 16 bits at a time, but even the fastest fibre-channel disk or network interface is happy with a single bit.
The sad irony, though, is that 5 years from now, games will probably have graphics about as good as you can get from the best and most expensive SLI solutions money can buy today... but overall performance will probably be less consistent (ie, if Windows decides that it might be a good time to reorganize its temp directory while y
It's very slightly (pennies) cheaper to put one chip on the motherboard rather than two. It's MUCH more expensive to merge two big CPU/GPU type chips into one. Manufacturing flaws become more common fast with bigger chips.
I don't think your estimate is correct for packaging and placing a single chip vs. two chips but in high volume manufacturing even pennies make all the difference. What about the cost of the second fan and other infrastructure for the GPU? There is also the issue of real estate - two chips take up more room than one so your etch routing becomes more of a challenge requiring smaller etch geometries resulting in a more expensive PCB.
If we could break CPUs into pieces now we'd do it, both for that and heat reasons. We can't because all the parts currently located in a CPU need to talk to each other very fast. The GPU is something that usually doesn't need to talk to the CPU much. So it's separate.
No, it will always be cheaper to integrate at the chip level. Look at system prices: dual core is cheaper than dual CPU (if you can even find one these days) of similar performance. The trend in embedded systems (even more cost sensitive than PCs) is to integrate everything into a single package (SoC). As others have pointed out, cache and the FPU were once separate chips from the CPU; they are now all integrated into one package.
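For the yield side of this argument (the claim that flaws become more common fast with bigger chips), here is a small sketch using the simple Poisson yield model, where the fraction of good dies is exp(-area x defect density). The defect density and die areas in the usage note are invented for illustration, and the model deliberately ignores the packaging, cooling, and board-routing costs raised in the replies, so it shows only one term of the trade-off.

```python
import math


def poisson_yield(area_cm2, defects_per_cm2):
    """Fraction of dies with zero defects under the Poisson yield model."""
    return math.exp(-area_cm2 * defects_per_cm2)


def cost_per_good_die(area_cm2, defects_per_cm2, cost_per_cm2=1.0):
    """Silicon cost of one working die: raw area cost divided by yield."""
    return area_cm2 * cost_per_cm2 / poisson_yield(area_cm2, defects_per_cm2)


def merged_vs_separate(area_cm2, defects_per_cm2):
    """Compare one merged die of 2x area against two separate dies."""
    merged = cost_per_good_die(2.0 * area_cm2, defects_per_cm2)
    separate = 2.0 * cost_per_good_die(area_cm2, defects_per_cm2)
    return merged, separate
```

With an assumed 0.5 defects per square centimetre and two 2 cm² chips, `merged_vs_separate(2.0, 0.5)` puts the merged die at roughly 2.7 times the silicon cost of the two separate dies. Whether that outweighs the savings from one package, one cooler, and a simpler board depends on numbers this sketch does not model, and the penalty shrinks as dies get smaller or processes get cleaner.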