Slashdot Mirror


AGP Texture Download Problem Revealed

EconolineCrush writes "The latest high-end graphics cards are capable of rendering games at 1600x1200 in 32-bit color at jaw-dropping frame rates, but that might be all they're good for. For all their gaming prowess, all of these cards have horrific AGP download speeds that realize only 1/100th of their theoretical peak. This article lays it all out, testing video cards from ATI, Matrox, and NVIDIA, and clearly illustrates just how bad the problem is. While these cards have no problems rendering images to your screen, you're out of luck if you want to capture those images with any kind of reasonable frame rate via the AGP bus."

10 of 265 comments

  1. Software issue? by larien · · Score: 5, Informative
    From the article, the author reckons this is a software (driver) issue rather than a hardware issue. I also note the test rig ran Windows, but how does Linux shape up? Is it better or worse?

    In any event, there's another issue he doesn't really touch upon; while he mentions that a single frame at 1600x1200@32bit colour is 7.5MB, he ignores the fact that a 30fps movie would require (30*7.5)=225MB per second uncompressed; you either have to have that much disk bandwidth or have enough CPU grunt to compress that on the fly. I guess a dedicated MPEG encoder card could help, but your average box is going to have trouble keeping up with on-screen gibs, rocket trails and blood splatters while also encoding video.
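    A quick sketch of that arithmetic, assuming 4 bytes per pixel and decimal megabytes (the function names are made up for illustration). The exact figures come out close to the rounded 7.5MB and 225MB per second quoted above.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return width * height * bits_per_pixel // 8


def stream_mb_per_s(width, height, bits_per_pixel, fps):
    """Uncompressed video rate in decimal megabytes per second."""
    return frame_bytes(width, height, bits_per_pixel) * fps / 1_000_000


size = frame_bytes(1600, 1200, 32)
rate = stream_mb_per_s(1600, 1200, 32, 30)
print(f"{size / 1_000_000:.2f} MB per frame, {rate:.1f} MB/s at 30fps")
```

    Dropping to a lower resolution or colour depth scales the rate linearly, which is why capture at smaller sizes is far less demanding on disk or CPU.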

  2. Imagine That by mosch · · Score: 5, Insightful
    Wow, what a surprise. Video cards being built on ultra-thin margins are only being designed for the use that 99.99% of the population wants to use them for. You'd think with their huge 4% and 5% profits they'd add in lots of features that only a very few people want, just in case!

    In summary, who the fuck cares?

  3. It's not the cards by tmark · · Score: 5, Insightful

    all of these cards have horrific AGP download speeds that realize only 1/100th of their theoretical peak...you're out of luck if you want to capture those images with any kind of reasonable frame rate via the AGP bus.

    As the quoted article clearly indicates, the problem lies with the drivers and not with the cards, as the original poster intimates.

    And the underlying reason is immediately understandable: after years of AGP cards and years of no one really raising this issue (except, now, developers of video-editing software who could benefit), it seems clear that there isn't much demand for this kind of performance. In the (near?) future there might be, but why should these companies spend money working on driver performance in areas like this when customers really only care about how well Quake will run?

    When people are willing to pay for these features is when companies will pay to build the requisite drivers. And that is how it should be.

  4. Re:Um, this is a surprise? by Mike+Connell · · Score: 5, Interesting

    There are actually some good reasons to be able to do this apart from just taking screenshots. I did (sad but true) these tests over 4 years ago finishing grad school, and the results (read back speed is very bad) were much the same.

    Two reasons for wanting to grab the framebuffer (or parts of it) are for

    a) texture imposters (realtime adaptive billboarding) and
    b) split world/image-space occlusion culling.

    With faster readback, both these techniques would probably be used more in "normal" software (i.e. games).
    0.02

  5. Re:128 bit colour? by Viking+Coder · · Score: 5, Interesting

    If you're doing multi-pass rendering, it might be extremely convenient to capture the results back to main memory. Especially if the board doesn't have enough texture memory to support all of your temporary buffers.

    And boards are starting to ship with 128-bit IEEE floating point buffers.

    Essentially, you're right - a human can't tell the difference beyond 24-bit on a given image. But if 100 images were composited together (very likely, to support something like RenderMan-style rendering in hardware), 24 bits is nowhere near enough - you'd get all sorts of accumulation error.
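    A toy illustration of that accumulation error, not anyone's actual renderer. It assumes 8 bits per channel (i.e. 24-bit RGB), a buffer that is rounded back to the nearest storable level after every pass, and a made-up workload of 100 passes that each add 0.3% intensity.

```python
def quantize(value, bits):
    """Round a 0..1 intensity to the nearest level an integer channel can store."""
    levels = (1 << bits) - 1
    return round(value * levels) / levels


def accumulate(contributions, bits=None):
    """Sum per-pass contributions, re-quantizing after each pass if bits is set."""
    total = 0.0
    for c in contributions:
        total += c
        if bits is not None:
            total = quantize(total, bits)
    return total


passes = [0.003] * 100  # hypothetical: 100 passes, each adding 0.3% intensity
exact = accumulate(passes)               # floating point accumulator
eight_bit = accumulate(passes, bits=8)   # 8 bits per channel
print(f"float: {exact:.3f}  8-bit: {eight_bit:.3f}")
```

    Each pass is smaller than one 8-bit step, so it gets rounded up to a full step every time and the error compounds; the float accumulator stays at the intended total.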

    --
    Education is the silver bullet.

  6. One of the worst technical articles.... by grahamtriggs · · Score: 5, Interesting

    ...that I have ever read. Either that, or I am missing something here...

    The idea that graphics subsystems have 'bandwidth to burn' is kind of ironic, given that every graphics chip is ultimately held back in performance by the amount of bandwidth available to it - especially when using high quality options like anti-aliasing.

    The main focus of the article is actually a very niche segment... the idea of transferring rendered images back over the AGP bus for TV / film / etc. is a joke... Rendering at high quality takes a huge amount of bandwidth (i.e. textures and geometry)... as someone else pointed out, transferring back high-res images would take up over 200MB per second - that's a quarter of your AGP bandwidth! And that's without taking into account contention and timing issues in uploading/downloading, which would mean that you simply couldn't realise the full potential of the bandwidth without a lot of other (expensive?) hardware... The simple fact is that for production uses, you would be *far* better off taking a stream of data from the DVI connector, and storing that for later use...

    Screen capture for business use is a reasonable point - however when does that require 3d rendering to be taking place? There should be no contention and no reason why the AGP bus couldn't be utilised fully - although would the graphics companies make enough out of this to justify the effort?

    As for internet streaming - how many people have access to bandwidth fast enough for high quality, full screen video streaming? Enough said...
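    For a rough check on the "quarter of your AGP bandwidth" figure, here is a small sketch. It assumes the nominal peak rates of a 32-bit, 66 MHz AGP bus (rounded) and reuses the 225MB per second estimate from earlier in the thread; which AGP mode the article's test rig ran is not stated here.

```python
# Nominal peak rates for 32-bit, 66 MHz AGP, in MB/s (rounded).
AGP_PEAK_MB_PER_S = {"1x": 266, "2x": 533, "4x": 1066, "8x": 2133}


def agp_share(stream_mb_per_s, mode):
    """Fraction of the nominal peak that a sustained stream would occupy."""
    return stream_mb_per_s / AGP_PEAK_MB_PER_S[mode]


for mode in AGP_PEAK_MB_PER_S:
    print(f"AGP {mode}: {agp_share(225, mode):.0%} of peak")
```

    On these assumptions the stream is about a fifth of AGP 4x, so "a quarter" is in the right ballpark for that mode.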

  7. Ray Tracing on the GPU by eeeeaagh · · Score: 5, Interesting
    We just ran into this problem when implementing a ray tracer using the GPU that will be presented soon at the upcoming Graphics Hardware Workshop.

    Our ray intersection algorithm implemented on the GPU (an "old" Radeon 8500) was able to intersect 114M rays per second. This was loads faster than the best CPU implementation, which could handle between 20M and 40M intersections per second.

    But when we tried to implement a ray tracer based on this, and an efficient one that didn't intersect every ray with every triangle, the readback rate killed us. Our execution times slowed down to the low end of the fastest CPU implementations.

    And the readback delay seems to be completely due to the drivers, which apparently still use the old PCI-bus code. If the drivers could use the full potential of the AGP bus, our ray tracer could approach twice the speed of the best CPU ray tracers.

  8. Yes, but... by Anonymous Coward · · Score: 5, Informative
    a) texture imposters (realtime adaptive billboarding)

    That's what render-to-texture is for, you don't need to read data back to the CPU.

    b) split world/image-space occlusion culling.

    This wouldn't be too useful for realtime graphics anyways, because of the way the 3D graphics pipeline works. The CPU can already be processing data a few frames ahead of what the GPU is currently working on. If you read back data from the card every frame, you have to wait for the GPU to finish rendering the current frame before you can start work on the next one.
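    A back-of-the-envelope model of that stall, with made-up timings. It assumes that without readback the CPU and GPU overlap fully, so the slower stage sets the frame time, and that with a per-frame readback the CPU work, GPU work and copy run back to back.

```python
def pipelined_frame_ms(cpu_ms, gpu_ms):
    """CPU and GPU overlap, so the slower stage sets the frame time."""
    return max(cpu_ms, gpu_ms)


def readback_frame_ms(cpu_ms, gpu_ms, readback_ms):
    """CPU waits for the GPU to finish and for the copy, so the stages add up."""
    return cpu_ms + gpu_ms + readback_ms


cpu_ms, gpu_ms, readback_ms = 10.0, 12.0, 5.0  # hypothetical timings
overlapped = pipelined_frame_ms(cpu_ms, gpu_ms)
stalled = readback_frame_ms(cpu_ms, gpu_ms, readback_ms)
print(f"overlapped: {1000 / overlapped:.0f} fps, with readback: {1000 / stalled:.0f} fps")
```

    Note that even a free readback (0 ms) would still cost the overlap in this model, which is the point being made above.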

    1. Re:Yes, but... by Mike+Connell · · Score: 5, Informative

      That's what render-to-texture is for, you don't need to read data back to the CPU.

      That is true for simple versions, but with methods moving towards image based rendering you often have to pull the data back anyway. Then you can process the textures to produce better imposters - not necessarily just billboards.

      Re: occlusion culling. People are using these methods today for realtime graphics (for example combinations of Greene's HZB, or HOMs) even with the low readback speed. UNC's Gigawalk software is one published example (Google for it). Getting Z or alpha channel information back is the biggest hit, so these methods would be even more efficient and so more widely applicable with faster transfers. When you're rendering N million triangles per frame (UNC quote 82 million) you have to do this stuff to get realtime rendering.

      So it is used for realtime graphics today - although mainly for heavy duty applications not games.

      HTH

  9. Perhaps... by ColGraff · · Score: 5, Insightful

    "What kind of idiot puts their most powerful processor at the end of a one way street?"

    Maybe they're the kind of idiots who know most people just want the best possible OUTPUT for gaming, and so don't want to add any overhead in card performance - or even additional design time - that isn't related to gaming performance. You know, the idiots who make cards that get award after award from gaming companies, then write near-perfect drivers, port those drivers to Linux, and let you overclock the card to your heart's content. Those sort of idiots. My, they're idiotic.

    Nobody says, "buy a GeForce4 Ti, make the next Toy Story." No, it's advertised as a gaming card, and that's what it's designed to do. If you want to do high-end video rendering things, perhaps a gaming card isn't the best choice.

    --
    I'm the stranger...posting to /.