Nvidia Releases Hardware-Accelerated Film Renderer
snowtigger writes "The day we'll be doing movie rendering in hardware has come: Nvidia today released Gelato, a hardware rendering solution for movie production with some advanced rendering features: displacement, motion blur, raytracing, flexible shading and lighting, a C++ interface for plugins and integration, plus lots of other goodies used in television and movie production. It will be nice to see how this will compete against the software rendering solutions used today. And it runs under Linux too, so we might be seeing more Linux rendering clusters in the future =)" Gelato is proprietary (and pricey), which makes me wonder: is there any Free software capable of exploiting the general computing power of modern video cards?
The AGP bus has asymmetrical bandwidth. Upstream to the video card is something like 10x faster than downstream to the CPU. So you can dump tons of data to the GPU but you can't get the data back for further processing fast enough, which defeats the purpose.
www.rexguo.com - Technologist + Designer
"Gelato is proprietary (and pricey)"
A company that wants to be paid for their work, weird!
You will see more, a lot more, of this for the Linux platform in the near future.
Software may be released with source code, but there is no way it will be released under the GPL; most ISVs can't make a living releasing their work under the GPL.
And please, the "but you can provide consulting services" argument is not valid; it doesn't work that way in the real world.
Perhaps you should read the Nvidia FAQ? This topic is covered. From what I can tell, they don't use the GPU in the traditional way, they just use it as a co-processor.
I bet the people who buy this are big-time architects who have a few machines set up to do renders for clients, who perhaps want to do some additional effects for promo/confidence value, and who likely already have people running that type of hardware.
Then again all those Quadro users could be CAD people and they've got no audience. =)
Not only with hardware manufacturers/drivers, but also general software. ISVs are getting annoyed by Microsoft's dominance of the desktop market, and through that, their (heavy) influence on desktop software. It's not inconceivable that in a decade, Microsoft could control every aspect of the standard desktop PC and desktop software market. At the moment some of the only really strong ISVs in their respective areas are Adobe, Corel, Intuit, Macromedia, Oracle, and a few specialized companies. Expect a big ISV push towards a "neutral" platform, like Linux or FreeBSD. Windows is too big to stop supporting, but ISVs will be smart to at least try and carve out a suitable alternative and avoid being completely dominated by Microsoft. All that most ISVs might be able to hope for in a decade is being bought out by Microsoft or making deals with Microsoft, if things don't go the way of creating a vendor-neutral platform.
Software, like much technology, follows a classic cycle from rare/expensive to common/cheap as the knowledge and means required to build it get cheaper.
"Moore's Law" is simply the application of this general law to hardware. But it applies also to software.
Free software is an expression of this cycle: at the point where the individual price paid by a group of developers to collaborate on a project falls below some amount (which is some function of a commercial license cost), they will naturally tend to produce a free version.
This is my theory, anyhow.
We can use this theory to predict where and how free software will be developed: there must be a market (i.e. enough developers who need it to also make it) and the technology required to build it must be itself very cheap (what I'd call 'zero-price').
History is full of examples of this: every large scale free software domain is backed by technologies and tools that themselves have fallen into the zero-price domain.
Thus we can ask: what technology is needed to build products like Gelato, and how close is this coming to the zero-price domain?
Incidentally, a corollary of this theory is that _all_ software domains will eventually fall into the zero-price domain.
And a second corollary is that this process can be mapped and predicted to some extent.
Ceci n'est pas une signature
For 3D rendering, especially non-realtime cinematic rendering, you have large source datasets - LOTS of geometry, huge textures, complex shaders - but a relatively small result. You also generally take long enough to render (seconds or even minutes, rather than fractions of a second) that the readback speed is not so much an issue.
Upload to the card is plenty fast enough (theoretical 2 GB/s, but achieved bandwidth is usually a lot less) to feed it the source data, if you're doing something intensive like global illumination (which will take a lot more time to render than the upload time). Readback speed (around 150 MB/s) is indeed a lot slower, but when your result is only e.g. 2048x1536x64 (FP16 OpenEXR format, 24 MB per image), you can typically read that back in 1/6 of a second. Not to say PCIe won't help, of course, in both cases.
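The arithmetic behind those numbers can be checked with a rough sketch like the one below. It assumes "MB" means 2**20 bytes (which is what makes the frame come out at exactly 24 MB) and that the quoted rates are sustained; the function names are made up for illustration.

```python
# Rough check of the readback figures quoted above.
# Assumption: "MB" means 2**20 bytes and the quoted rates are sustained.
MIB = 2**20

def frame_bytes(width, height, bits_per_pixel):
    """Size in bytes of one uncompressed frame."""
    return width * height * bits_per_pixel // 8

def transfer_seconds(num_bytes, mib_per_sec):
    """Time to move num_bytes over a link sustaining mib_per_sec."""
    return num_bytes / (mib_per_sec * MIB)

size = frame_bytes(2048, 1536, 64)      # 64 bits = 4 channels x FP16
readback = transfer_seconds(size, 150)  # ~150 MB/s readback, as quoted
upload = transfer_seconds(size, 2048)   # 2 GB/s theoretical upload, as quoted

print(f"frame size: {size / MIB:.0f} MB")
print(f"readback:   {readback:.2f} s")
print(f"upload:     {upload * 1000:.1f} ms")
```

The readback works out to 0.16 s, i.e. roughly the 1/6 of a second mentioned above, which is small next to a render that takes seconds or minutes per frame.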
Readback is more of an issue if you can't do a required processing stage on the GPU, and you have to retrieve the partially-complete image from the GPU, work on it, then send it back for more GPU processing etc, but with fairly generalised 32 bit float processing, you can usually get away with just using a different algorithm, even if it's less efficient, and keep it on the card.
Another issue might be running out of onboard RAM, but in most cases you can just dump source data instead & upload it again later.
Why would anyone engrave "Elbereth"?
Almost every FX house worth its salt in the CG business uses Pixar's Renderman on UNIX or Linux machines. The reasons behind this choice are very simple.
Renderman is proven technology and has been so since the early '90s. Renderman is well known, its results are predictable and it is a fast renderer. Also, current production pipelines are optimised for Renderman.
UNIX and Linux are quite good when it comes to distributed environments (can anyone say Render Farm?) and handle large file sizes well (Think a 2k by 2k image file, large RIB files).
And last but not least, Renderman is available with a source code license.
Hardware accelerated film rendering is in essence nothing but processor operations, some memory to hold objects and some I/O stuff to get the source files and output the film images. Please explain to me why a dedicated rendering device from NVidia would be any better than your average UNIX or Linux machine? Correct, there aren't any advantages, only disadvantages. (More expensive, proprietary hardware, unproven etc.)
Instead of just using the native 3D engine in the GPU, as done in games, Gelato also uses the hardware as a second floating point processor.
Does this mean that I could eventually use my GeForce to do things like matrix inversion for me?
So why should Nvidia benefit from Linux without some reciprocal giving? Hardware programming specs would be enough of a gift.
The Internet's nature is peer to peer - 20050301_cs_profs.pdf
I think you make the best point on the board today.
The opening quote of the article poster is ignorant. Movie rendering has been done in hardware forever. He seems to be mixing up doing rendering in hardware with rendering on the fly in a video card.
What we have here is a slight mix of the two, but by no means anything new on the market. It's only letting you use your Quadro, if you already have one, for movie rendering acceleration. I certainly would not buy one for this purpose. I imagine it's still far more profitable to use a CPU than a GPU. Also, note that render farm computers rarely have video cards. The video part would be wasted.
But for in-game recording by home users and artists without studio money, this will likely be a welcome addition. (Especially for those who can turn a GeForce into a Quadro.) Though I have to wonder, doesn't the Quadro cost quite a bit of money!?
Still, I think it's nice technology. I wonder if PCI-Express is allowing them to get this data off the GPU and back into the hands of the CPU?
It makes sense, really... If you're building an app that's intended to be used by clusters, why would you write it for XP? Having to spend an extra $100 per node really starts adding up when you've got several hundred or thousands of machines...
my sig's at the bottom of the page.
With physics in games becoming more and more advanced, how long till we see an API for hardware-accelerated physics? I'm not saying you'll be shelling out any more cash for an ElectricForce FX physics card, but graphics cards could have dedicated hardware for physics calculations. Not only could this boost performance, but it could be combined with pixel shaders and geometric transformations to increase what is possible. Who knows, the GeForce 6800 might be programmable enough to do this already, albeit at a performance hit to the graphics pipeline. There are some pretty incredible physics demos to be found on the internet; everyone's favourite monopoly has one (Crash Video) that showcases their new game development suite, XNA. I don't see this level of physics being available in any game any time soon, but with hardware-accelerated physics, who knows what's possible. Even without a standard API to build on, developers might implement their own physics acceleration shaders (for lack of a better term) in cases where the CPU is the bottleneck. Ever since the first programmable GPU was released, I have imagined that they could accelerate more than just graphics and be used to increase the responsiveness of computers.
[...] advanced rendering features: displacement, motion blur, raytracing, flexible shading and lighting, [...]
That sounds like an old Siggraph presentation I saw a decade or two ago when I used to go to Siggraph. Lucasfilm, I think. (The fine sample picture in the article showed a motion-blurred image of a set of pool balls in motion.)
When rendering an image using raytracing, there are several effects that are achieved by similar over-rendering processes. I.e. you ray-trace several times, varying a parameter:
- Depth-of-field (use different points on the iris of the "camera", blurring things at different distances from the "focal plane".)
- Diffuse shadows (use different points on the diffuse light source(s) when computing the illumination of a point.)
- Motion blur (use different positions for the objects and "camera", evenly {or randomly} distributed along their paths during the "exposure" - ideally pick the positions of the whole set of objects by picking several intermediate times, rather than picking the position of each object separately, to avoid artifacts of improper position combinations.)
- Anti-aliased edges. (Pick different points in the pixel when computing whether you hit or missed the object or which color patch of its texture you hit.)
As I recall there were about five effects that worked similarly, but I don't recall the other(s?) just now.
To do any one of them requires rendering the frame N times {for some N} with the parameter varied, then averaging the frames. (Eight times might be typical.) Naively, to do them all would require N**5 renderings - 32,768 raytracings of the frame to do all five.
The insight was to realize that the effects could be computed SIMULTANEOUSLY. Pseudorandomly pick one of the N from each effect's set for each frame and only render N frames, rather than N**5. Eight is a LOT smaller than 32K. B-)
Sounds like Nvidia ported this hack to the firmware for their accelerator.
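A minimal sketch of the "pick one of the N from each effect's set per pass" idea, under some stated assumptions: the effect names are just labels for the four effects listed above plus a placeholder for the one not recalled, and the per-effect shuffle (so every stratum of every effect gets used exactly once) is one reasonable way to do the pseudorandom pick, not necessarily what the original presentation or Gelato does.

```python
import random

# Labels for the four effects listed above; "fifth_effect" is a placeholder
# for the one the post could not recall.
EFFECTS = ["lens_point", "light_point", "shutter_time", "subpixel", "fifth_effect"]

def naive_render_count(n, num_effects):
    """Renders needed if every combination of every effect's N values is traced."""
    return n ** num_effects

def joint_strata(n, effects, rng):
    """Build n passes; each pass gets one stratum index per effect.

    Each effect's strata 0..n-1 are shuffled independently, so across the
    n passes every stratum of every effect is used exactly once.
    """
    columns = {}
    for effect in effects:
        order = list(range(n))
        rng.shuffle(order)
        columns[effect] = order
    return [{effect: columns[effect][i] for effect in effects} for i in range(n)]

def jitter(stratum, n, rng):
    """Turn a stratum index into a parameter value in [0, 1) inside that stratum."""
    return (stratum + rng.random()) / n

rng = random.Random(0)
passes = joint_strata(8, EFFECTS, rng)
for p in passes:
    params = {effect: jitter(s, 8, rng) for effect, s in p.items()}
    # A real renderer would trace the frame with `params` here and
    # accumulate the result into a running average.

print(len(passes), "passes instead of", naive_render_count(8, len(EFFECTS)))
```

With N = 8 this gives 8 passes where the naive approach would need 8**5 = 32,768.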
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
As another person has already said, you can only have one AGP slot. PCI Express is the next-generation, high-speed replacement for PCI. Remember how the first couple of generations of 3D accelerators were PCI-based?
Besides, you don't need killer bus bandwidth with this because you're not trying to pump out 100fps using a couple of hundred megs of geometry and textures on a card with only a hundred or so megs of memory. (That means you have to send loads of data over the bus 100 times each second.)
The power here is in the parallelisation and incredible performance delivered by highly-specialised processors. Graphics cards have phenomenal memory bandwidth - nVidia's latest has something like 32GB/sec (big B!) - compare that to, say, a Dual 2GHz PowerMac G5, which has 6.4GB/sec of memory bandwidth. New graphics chips are heading towards the usage of memory paging (3D Labs P10 already does this, I believe). So with this and high-end cards with 256 or 512MB of RAM, you won't need much bus bandwidth because you'll just page in little bits of geometry and textures as each processor needs them, rather than having to upload huge textures every time an entire one gets trashed to make room for another one.
So once again, the key thing to remember is that you're not trying to push 100fps. Most of the time spent rendering a frame will be in the GPU shader units, not uploading data to the graphics chip.
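To put rough numbers on the 100fps point, here is a back-of-the-envelope sketch. The per-frame figures are assumptions picked to match the "couple of hundred megs of data, hundred or so megs on the card" scenario above: about 100 MB re-sent per frame in the real-time case, and a full 200 MB upload once per one-minute frame in the offline case.

```python
# Assumed figures for illustration only; neither comes from a measurement.
def bus_traffic_mb_per_sec(mb_per_frame, frames_per_sec):
    """Sustained bus traffic if mb_per_frame is uploaded for every frame."""
    return mb_per_frame * frames_per_sec

# Real-time: ~100 MB that does not fit on the card is re-sent every frame.
realtime = bus_traffic_mb_per_sec(100, 100)

# Offline: the whole 200 MB scene is uploaded once per frame,
# and each frame takes a minute to render.
offline = bus_traffic_mb_per_sec(200, 1 / 60)

print(f"real-time at 100fps:   {realtime:.0f} MB/s")
print(f"offline at 1 frame/min: {offline:.1f} MB/s")
```

Under these assumptions the real-time case wants about 10 GB/s of bus traffic, while the offline case needs only a few MB/s, which is why the bus matters so much less for film rendering.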