NVIDIA Launches New SLI Physics Technology
Thomas Hines writes "NVIDIA just launched a new SLI Physics technology. It offloads the physics processing from the CPU to the graphics card. According to the benchmark, it improves the frame rate by more than 10x. Certainly worth investing in SLI if it works."
Sounds like an ATI-killer to me! What ever happened to the hype about dedicated physics chips?
Making you think you're crazy is a billion dollar industry.
This is a little misleading. The hardware is really just fast at computing, not specifically designed for "physics". For example, it doesn't have a built-in ODE solver.
This physics system is used for visual physics (i.e., realistic graphical effects), not gameplay physics, which are still done on the CPU.
Therefore you get a 10x framerate increase over running massively intensive effects on the CPU.
This is good, because games will look nicer. But if you don't have the GPU grunt, you can simply disable them (or cut them down) in game - it won't affect the gameplay.
Why does this require SLI? You can do stream processing on most relatively-modern accelerated 3d video cards.
This will be critically important as programs start to push particle and geometry modeling. I remember back when I had my Quadra 840av in 1993, I popped a couple of Wizard 3dfx Voodoo cards in it when they first started supporting SLI and the performance benefits were noticeable. Of course we were all hoping for the performance to continue to scale, but 3Dfx started getting interested in other markets including defense and then were bought by Nvidia making me wonder if SLI would ever really take off. It's nice to see that the technology is still around and flourishing.
Visit Jonesblog and say hello.
How does this work in relation to AMD's consideration of a physics coprocessor or another specialized processor? It seems like that solution is superior.
I've been waiting for this for a while. It's the obvious next step in GPU design. I have a feeling GPUs are going to become more and more general, and eventually accelerate the majority of inherently parallel processes, while the CPU executes everything else. We don't even have to change the acronym. Just call it a "Generic Processing Unit"...
Vandemar.org
Of course it's nothing more than a press release, but there are numerous questions it raises:
1) What limitations are there on calculations? A GPU is not as general as a CPU, and it would probably suck when dealing with branches, especially when they aren't independent.
2) How much faster could this actually be? Is it simply a matter of looking to the future? (i.e., we can already run with Aniso and AA and high resolutions, so 5 years from now they'll be "overpowered"). IMO the next logical step is full-fledged HDR and then more polygons.
3) What exactly is expected of these? General physics shouldn't be, but I can understand if they do small effects here or there.
----
Go canucks, habs, and sens!
I don't think this is a general physics processor. It seems to be aimed at "eyecandy" physics calculations - mostly particle systems - whose results don't need to feed back into application logic. Which makes sense, given that GPU->CPU readbacks are a notorious performance killer.
Potentially shiny, but not really revolutionary or new. People have been doing particle system updates with shaders for a while now.
The "technology" is specifically designed for physics. The hardware is not, but the driver, API, and Havok engine enhancements are. This is therefore "physics technology".
Besides, I rather think this is what nVidia had in mind when they first started making SLI boards. It was always obvious that the rendering benefit from SLI wasn't going to be cost-effective. Turning their boards into general purpose game accelerators has probably been in their thoughts for a while.
www.gpgpu.org
This neither requires SLI nor is it limited to NVIDIA chips. NVIDIA is just launching it publicly. ATI will be showing it off behind closed doors this week.
10x faster? They might as well just say it's infinity times faster so that we know they are bullshitting from the second we read it...
Don't forget that http://www.ageia.com/ is already doing this, and set to ship their cards sometime this year hopefully. Of course the significant difference between the two is that you would only have to buy one card for the SLI solution.
Why not have a complete physics card? It would be a nice use for that PCI express bus which only has video cards as an option right now. That way you could just buy the physics card, without having to upgrade the video card. Although this is all kind of weird. Start offloading everything off to specialized cards, you pretty much have a multiple CPU machine, where each CPU is specially tuned to do a specific type of processing. Might be the leap necessary to maintain Moore's law.
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
The one linked is a little bland for my taste
this one is better:
http://www.tgdaily.com/2006/03/20/nvidia_sli_forp
Or choose your own adventure via Google news:
http://news.google.ca/news?client=firefox-a&rls=o
Of course, the basic idea isn't exactly brand new -- some of us have been writing shaders to handle heavy duty math for a while. The difference is that up until now, most real support for this has been more or less experimental (e.g. the Brook system for doing numeric processing on GPUs). Brook is also sufficiently different from an average programming language that it's probably fairly difficult to put to use in quite a few situations.
Having a physics-oriented framework will probably make this capability quite a bit easier to apply in quite a few more situations, which is clearly a good thing (especially for nVidia and perhaps ATI, of course).
The part I find interesting is that Intel has taken a big part of the low-end graphics market. Now nVidia is working at taking more of the computing high-end market. I can see it now: a game that does all the computing on a couple of big nVidia chips, and then displays the results via Intel integrated graphics...
The universe is a figment of its own imagination.
Well, I for one, want to have a smarter AI in all games. Unloading the "mundane" physics engine to the graphic card will hopefully spare more CPU cycles for the AI. After all, it's not graphics that matter in games. It's the gameplay.
--
Error 500: Internal sig error
Applications should be built to be more efficient, to handle modern hardware, instead of simply relying on consumers purchasing faster hardware.
By offloading physics from the CPU to the graphics card, this improves frame rates?
Yes. Why does that surprise you? When you do incredibly complicated physics simulation, things can be very parallel and consequently GPUs outperform CPUs.
Why would I waste precious GPU processing to process physics? I mean, all the CPU does these days is handle AI, physics, and texture loading. If you offload physics to the GPU, then the CPU is doing less and you're swamping the GPU with more work.
You seem to be under the impression that your GPU cycles are more important than your cpu cycles. This is done with SLI for a reason..
If it does increase frame rates, then I would suggest why not improve graphics rendering rather than physics processing.
Because the quality of the render is controlled in software? Because hardware is currently limited by, ya know, physics and technology?
I find that for all the advances nVidia and ATI have made over the years, 3D gaming visual quality is still inferior to cinematic quality 3D rendering.
And in other news, offline processing is still more powerful than online processing. There's a shocker.
I would prefer if nVidia and ATI actually focused on bringing cinematic quality 3D rendering to gaming, instead of just claiming they do.
First of all, 99.9% of what nVidia and ATI do is exactly that. They are also starting to realize that the GPU paradigm, with minor modification, can be turned into a very powerful co-processor... and they are the experts at creating those types of chips. The market for them is growing... and they don't want to miss the boat.
I want smooth high-poly models with realistic lighting and 60fps.
And I want peace in the middle east. Give it 10 years, one of us may get our wish.
Real-time cinematic quality graphics rendering = HARD.
Physics acceleration that allows for rather impressive collisions and water: MUCH EASIER.
Maximum output for minimum input. Having physics acceleration in the GPU makes sense as you don't have to buy an extra accelerator card.
------- "From bored to fanboy in 3.8 asian girls" ----------
We do not live in the 21st century. We live in the 20 second century.
All your points are certainly valid, but I'd say the next era of physics in games is just around the corner. Go watch the Spore video to see an example of what's coming.
:)
Besides, who doesn't like rag dolling? I played through HL2 just so I could toss bodies around with the gravity gun.
I can't read the article since it's slashdotted, but here's what I want to know:
First, what physics API are they using? This is, after all, a little like OpenGL vs DirectX. You need a physics API to do this stuff, and there are a *lot* of portable, high-quality APIs out there: Havok, Newton, Aegeia (spelling?), and the open source ODE (which I use). The APIs aren't interchangeable, and aren't necessarily free.
Second, at least when I'm doing this work, there's a *lot* of back and forth between the physics and my game engine. Maybe not a whole lot of data, but a lot of callbacks -- a lot of situations where the collision system determines A & B are about to touch and has to ask my code what to do about it. And my code has to do some hairy stuff to forward these events to A & B ( since physics engines have their own idea of what a physical object instance is, and it's orthogonal to my game objects, so I have to have some container and void ptr mojo ) and so on and so forth. If all this is running on the GPU, sure the math may be fast but I worry about all the stalls resulting from the back and forth. Sure, that can be parallelized and the callbacks can be queued, but still.
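For what it's worth, the "queue the callbacks" idea mentioned above can be sketched roughly like this. This is only an illustration under my own assumptions, not how NVIDIA or Havok do it: the physics step only records which bodies touched, and the game drains that list afterwards in one batch, looking up its own objects by body id. All the names here (PhysicsWorld, Body, GameObject, dispatch) are hypothetical, and the 1D overlap test is a stand-in for real collision detection.

```python
from dataclasses import dataclass, field


@dataclass
class Body:
    body_id: int   # handle the physics side knows about
    x: float       # 1D position, purely for illustration
    radius: float


@dataclass
class PhysicsWorld:
    bodies: list = field(default_factory=list)
    contacts: list = field(default_factory=list)  # queued (id_a, id_b) pairs

    def step(self):
        """Detect overlaps and queue them; no game code runs in here."""
        self.contacts.clear()
        for i, a in enumerate(self.bodies):
            for b in self.bodies[i + 1:]:
                if abs(a.x - b.x) <= a.radius + b.radius:
                    self.contacts.append((a.body_id, b.body_id))


class GameObject:
    def __init__(self, name):
        self.name = name
        self.touched = []

    def on_contact(self, other):
        self.touched.append(other.name)


def dispatch(world, owners):
    """Drain the queued contacts in one batch, after the physics step."""
    for id_a, id_b in world.contacts:
        a, b = owners[id_a], owners[id_b]
        a.on_contact(b)
        b.on_contact(a)


world = PhysicsWorld(bodies=[Body(0, 0.0, 1.0), Body(1, 1.5, 1.0), Body(2, 10.0, 1.0)])
owners = {0: GameObject("crate"), 1: GameObject("barrel"), 2: GameObject("lamp")}
world.step()
dispatch(world, owners)
```

The point of the split is that step() never calls back into game code, so in principle it could run somewhere else (another thread, or a GPU) and hand back one batch of contacts per frame instead of stalling on every pair.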
Anyway, I want info, not marketing.
Oh christ, and finally, I work on a Mac. When will I see support? ( lol. this is me falling off my chair, crying and laughing, crying... sobbing. Because I know the answer ). Can we at least assume/hope that they'll provide a software fallback api, and that that api will be available for linux and mac? After all, NVIDIA has linux and mac ports of Cg, so why not this? I'm keeping my fingers crossed.
lorem ipsum, dolor sit amet
Non-root, user-level access to IO ports (by authorized programs) is not evil; it's what allows non-kernel level display servers. It keeps some really complicated stuff out of the kernel, thus improving system stability.
I think the point is that this is for games where the bottleneck is in the CPU and the graphics card is sitting idle half of the time. By pulling 10% of the graphics card's resources to physics calculations, you could offload enough of the work from the CPU that it could keep the rest of the card completely fed and see a framerate improvement with no additional hardware or loss in video quality.
I read the internet for the articles.
What ever happened to the hype about dedicated physics chips?
The original article appears to be slashdotted.
So could somebody tell me how wide the floats are in this "SLI" engine? [I don't even know what "SLI" stands for.]
AFAIK, nVidia [like IBM/Sony "cell"] uses only 32-bit single-precision floats [and, as bad as that is, ATi uses only 24-bit "three-quarters"-precision floats].
What math/physics/chemistry/engineering types need is as much precision as possible - preferably 128 bits.
Why? Because the stuff they are modelling tends to be highly non-linear and the calculations tend to be highly unstable.
A 32-bit float gives integer granularity only up to about 16 million (2^24 = 16,777,216); above that, consecutive integers are no longer all representable.
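A quick sketch of that granularity limit, assuming IEEE 754 single precision (24-bit significand). The to_float32 helper is a made-up name; it just round-trips a value through Python's struct module to force it down to 32 bits.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


LIMIT = 2 ** 24  # 16,777,216

below = to_float32(LIMIT - 1)  # still exactly representable
at = to_float32(LIMIT)         # still exactly representable
above = to_float32(LIMIT + 1)  # not representable, rounds to a neighbor

print(below, at, above)
```

Past 2^24 the spacing between adjacent single-precision values is 2, so LIMIT + 1 collapses onto LIMIT. A position or timestamp accumulated in a 32-bit float quietly stops advancing by 1 at that point.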
I have 2 6600GT's SLI'd... the first cost me about $175, the 2nd was about $130. You don't necessarily have to buy the super-expensive cards to do SLI. Even today, you could buy a pair of 7600's for about $400, and those are brand new.
For a game, the best way to solve ODEs is numerically. Since you don't need the precision of the exact solution, the solutions are considerably simpler computationally once you've linearized them. Doing RK4 on the fly is precisely the best solution to the problem. Well, depending on the stiffness.. but you can always fall back on plain ol' trapezoid rule if you just wanna know, "what does the thing do until it hits the ground" to enough precision to be pretty.
Solving a linearized ODE is just plain ol' ordinary matrix math, very parallelizable and a lot less computationally expensive than breaking up a transcendental function into piecewise continuous steps and calculating the result every time.
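To give a feel for what "RK4 on the fly" means, here is a minimal sketch for a body falling under gravity with linear drag, checked against the closed-form solution. The drag coefficient K, the 60 Hz step and the function names are assumptions picked for illustration; a real engine integrates many bodies at once and handles collisions too.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2
K = 0.5   # hypothetical linear drag coefficient, 1/s


def deriv(state):
    """Right-hand side of the ODE: dy/dt = v, dv/dt = -G - K*v."""
    y, v = state
    return (v, -G - K * v)


def rk4_step(state, dt):
    """Advance (height, velocity) by one classic fourth-order Runge-Kutta step."""
    def nudge(s, d, h):
        return (s[0] + h * d[0], s[1] + h * d[1])

    k1 = deriv(state)
    k2 = deriv(nudge(state, k1, dt / 2))
    k3 = deriv(nudge(state, k2, dt / 2))
    k4 = deriv(nudge(state, k3, dt))
    y = state[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    v = state[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return (y, v)


def simulate(y0, v0, dt, steps):
    state = (y0, v0)
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


def exact(y0, v0, t):
    """Closed-form solution of the same linear ODE, for comparison."""
    a = v0 + G / K
    v = a * math.exp(-K * t) - G / K
    y = y0 + (a / K) * (1 - math.exp(-K * t)) - (G / K) * t
    return (y, v)


# One simulated second at 60 steps per second, dropped from 10 m at rest.
y_num, v_num = simulate(10.0, 0.0, 1 / 60, 60)
y_ref, v_ref = exact(10.0, 0.0, 1.0)
print(abs(y_num - y_ref), abs(v_num - v_ref))
```

Each body's update is the same handful of multiply-adds with no dependence on the other bodies, which is why this kind of integration maps well onto parallel hardware.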
Can you be Even More Awesome?!
The guys over at http://www.gpgpu.org/ have been doing various math calculations, including 'physics' on GPUs for a while now. One big problem is that the only real API is OpenGL. So not only do you have to be a smart math programmer (which is pretty rare to begin with) but you also have to understand graphics programming too and then figure out how to map traditional math operations onto the graphics operations that OpenGL makes available. It isn't that hard to do simple things like matrix math, but trying to really optimize it for really good performance requires almost wizard-level understanding of OpenGL and the underlying hardware implementation.
The cards' math capabilities would be so much more accessible (and thus used by so many more programmers) if Nvidia (and ATI) would come out with standard math-library interfaces to their cards. Give us something that looks like FFTW and has been tweaked by the card engineers for maximum performance and then we will see everybody and his brother using these video cards for math co-processing.
Clearly, you misunderstand how cinematic 3D is rendered
Desktop GPUs will always be inferior to cinematic 3D, simply because cinematic 3D is rendered at a rate of several frames per day by a multi-million dollar farm of computers, while a desktop GPU must deliver dozens of frames per second all by itself.
A peek at what it took to render The Incredibles:
And again -- even this much hardware generated images measured in frames per day -- nowhere near the ~24 frames per second you'd want for real-time imaging. In fact, according to pixar.com it takes 6 - 90 hours to render one frame.
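Back-of-the-envelope arithmetic for the gap, using only the 6 to 90 hours per frame figure quoted above and a 24 fps real-time target (the function names are just for illustration):

```python
SECONDS_PER_HOUR = 3600
REALTIME_FPS = 24  # film frame rate used as the real-time target


def frames_per_day(hours_per_frame: float) -> float:
    """Frames one render pipeline finishes in 24 hours."""
    return 24 / hours_per_frame


def slowdown_vs_realtime(hours_per_frame: float) -> float:
    """How many times longer one frame takes than a 1/24 s real-time budget."""
    return hours_per_frame * SECONDS_PER_HOUR * REALTIME_FPS


for hours in (6, 90):
    print(hours, frames_per_day(hours), slowdown_vs_realtime(hours))
```

So even the fastest quoted case (6 hours per frame, i.e. 4 frames per day) is roughly half a million times slower than real time, and the slowest is several million times slower.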
Each time I hear that an "advance" has been made and I read that it is basically re-integrating various components back into the primary system or tying those components tighter to the CPU then I can't help but scream "AMIGA!" Of course, this leads to co-workers walking wider paths around me while avoiding eye contact '-).
Still, all of these advances lead me to believe that we might be going back to a dedicated chip style of computing BUT what I am also hoping for is a completely upgradeable system that I can pull the, say, physics processor out and plug a newer version or better chip into without having to replace the entire motherboard or daughterboard. Which, of course, leads me right back to that whole screaming scenario :) The Amiga style of computing may yet live again.
Dream as if you'll live forever.
Live as if you'll die tomorrow.
~Anonymous~
Uh, that's highly unlikely. The physics of a flying body is no more difficult to compute than the physics of a running body. "Particle systems" are not the reason for the slowdown; more likely, it has to do with the fact that a player at high altitude can see a LOT of the game world and therefore more packets have to be sent in order to maintain a consistent view as the player flies through the air.
Don't worry! Microsoft will come to the rescue with DirectX 11... all you will have to do is write the physics engine using the DirectX API, and Microsoft's trusty software will interface with whichever hardware you have. Don't worry... it'll be bug free and secure, too!
Read the article... they state that right now you cannot share both graphics rendering and physics emulation on the same GPU, though they do plan to work on this in the future.
For now you need two GPUs, one for graphics, one for physics.
This is due to modern PC video card architecture containing a large quantity of PURE EVIL. To get around this evil the X developers have done some rather expedient things, such as directly accessing the cards via IO registers, directly from userland.
It's worse than that - even if you dispense with a graphics card, your OS still has to directly access some of your hardware at some point. This creates the opportunity for all kinds of strange interactions and unforeseen security holes.
Ever at the forefront of proactive security, the OpenBSD team have announced their solution to this problem. OpenBSD 4.0 will be the first OS to not run on any hardware at all. It will exist only as a mass of finely crafted and provably secure pseudo code. Although critics have pointed out that the finished product may lack something in the functionality stakes, supporters have pointed out that the OS has been moving in this direction for a while.
Project leader Theo offered the following comment, "Retards! You weren't supposed to install it anyway! Have you read the chapter on partitioning in the install guide? Do you really think we wrote it like that because we wanted people to try and install it? Jesus, you make me sick."
Didn't a company called Ageia (?) design a PCI-express addon (or PCIx or wtv) that was basically a separate chip completely dedicated to physics calculations (ragdoll thingies and that sort of stuff)?
In fact, wasn't the PS3 supposed to have said chip from Ageia (or wtv)?
This would be cool, but i wonder how many would actually flock to it (if cheap enough (~40) then probably it would lead developers to assume its existence, and if not to default to using good old ix86).
Well, at 4 frames per day, I could probably keep up with my buddies on an online shooter.....I might actually win a few rounds, with 6 hours between frames to think about what to do next.
Stasis is death. Embrace change.
Our approach produces better-looking movement than the low-end physics packages. We don't have the "boink problem", where everything bounces as if it were very light. Heavy objects look heavy. Our physics has "ease in" and "ease out" in collisions, as animators put it, derived directly from the real physics. When we first did this, back in the 200MHz era, it was slow for real time (a two-player fighter was barely possible) but now, game physics can get better.
Take a look at our videos. Few if any other physics systems can even do the spinning top correctly, let alone the hard cases shown.
... Most of you didn't get the point. It's not that you can access the GPU from userland (it depends on that access, but that's not the point). The main point is that the current gen of programmable GPUs allow you to (theoretically) directly access kernel memory, as pointed out later in the thread by Theo:
> Are these new programable cards capable of reading main memory, which
> OpenBSD would not be able to prevent if machdep.allowaperture were
> set to something other than 0?
Yes, they have DMA engines. If the privilege separated X server has a
bug, it can still wiggle the IO registers of the card to do DMA to
physical addresses, entirely bypassing system security.
Thus, a resourceful attacker theoretically could get access to kernel memory through anything which allows access to the video card. An unusual and probably difficult-to-exploit hole, but a possible hole nonetheless.
And so we go, on with our lives
We know the truth, but prefer lies
Lies are simple, simple is bliss
Does he have this concern with soundcards, HDD controllers and network cards too? They've all got DMA capability, coprocessors, and firmware. Network cards even have network connectivity, making them potentially WAY more dangerous than a video card.