Slashdot Mirror


AGP Texture Download Problem Revealed

EconolineCrush writes "The latest high-end graphics cards are capable of rendering games at 1600x1200 in 32-bit color at jaw-dropping frame rates, but that might be all they're good for. For all their gaming prowess, all of these cards have horrific AGP download speeds that realize only 1/100th of their theoretical peak. This article lays it all out, testing video cards from ATI, Matrox, and NVIDIA, and clearly illustrates just how bad the problem is. While these cards have no problems rendering images to your screen, you're out of luck if you want to capture those images with any kind of reasonable frame rate via the AGP bus."
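To put the "1/100th of theoretical peak" claim in numbers, here is a rough sketch. The summary does not say which AGP mode was tested, so the loop over 1x to 8x and the 66.67 MHz x 4-byte base rate are assumptions for illustration, as is the 1600x1200 32-bit frame used for the frame-rate estimate.

```python
# Nominal AGP peak: 66.67 MHz clock, 4 bytes per transfer, times the mode multiplier.
AGP_CLOCK_MHZ = 66.67
BYTES_PER_TRANSFER = 4


def agp_peak_mb_per_s(multiplier: int) -> float:
    """Theoretical peak bandwidth in MB/s for AGP 1x/2x/4x/8x."""
    return AGP_CLOCK_MHZ * BYTES_PER_TRANSFER * multiplier


def readback_fps(mb_per_s: float, width: int, height: int, bytes_per_pixel: int) -> float:
    """Frames per second that fit through a link of the given speed."""
    frame_mb = width * height * bytes_per_pixel / 1e6
    return mb_per_s / frame_mb


for mult in (1, 2, 4, 8):
    peak = agp_peak_mb_per_s(mult)
    slow = peak / 100  # the "1/100th of peak" readback rate from the summary
    fps = readback_fps(slow, 1600, 1200, 4)
    print(f"AGP {mult}x: peak {peak:.0f} MB/s, 1/100th = {slow:.1f} MB/s, {fps:.2f} fps")
```

Under these assumptions, even AGP 4x at one hundredth of its peak moves only about one full 1600x1200 frame per second.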

265 comments

  1. Hmm. by 13Echo · · Score: 1

    Correct me if I am wrong, but does it really make a difference if a card has 128 MB of onboard RAM? AGP's main benefit is texture streaming from the system RAM to the video card, but actually, with 128MB of RAM on a card, I don't believe it is even an issue.

    1. Re:Hmm. by FyRE666 · · Score: 3, Insightful

      Maybe you should have read the article? The point is that the slow transfer rate from the card TO the PC's RAM means that capturing video (or recording a gaming session for playback later) is severely hampered.

      To be honest though, most people buy a GF4 to play games, not capture video.

    2. Re:Hmm. by Viking+Coder · · Score: 2

      I work with data much, much larger than 128 MB. If the board had 2 GB of memory, I'd use it.

      Not everyone is using their video card to play Quake. =) (Although, I do that, too.)

      --
      Education is the silver bullet.
    3. Re:Hmm. by MagPulse · · Score: 4, Informative

      This would affect everyone in a different way though. TV stations and production sets, even public access TV, along with low budget movies, would be able to use their PCs with a Radeon 9700 or NV30 card to produce their content. They could not only reproduce many of the effects from movies like Toy Story (notably excluding ray tracing), but do it in real-time for instant feedback, meaning much much faster production cycles. This has the potential to make a big impact.

    4. Re:Hmm. by Ost99 · · Score: 2

      Not that this has anything to do with the article in question, but fast ram on the video card is essential if you're going to play games in hi-res.
      The AGP bus can't supply data / textures fast enough to a modern GPU/VPU. Both the bus and the main memory are way too slow. Some business PCs use shared video and main memory. It works OK for most 2D apps, and will even allow you to play DVDs or streamed video. For games, forget it.

      - Ost

      --
      ---- Sig. gone.
    5. Re:Hmm. by 13Echo · · Score: 3, Insightful

      I wouldn't use one of these cards to capture video though. I can't see why most people would, actually. The Matrox cards might be an exception. Quadro is a CAD/CAM card. These are just consumer grade cards. They buffer and write video directly to the hard disk. Real video editing hardware works differently, but even they often have several gigs of onboard RAM.

      So really, I guess that I meant to say that I fail to see the relevance of the article. It is kind of silly, actually, to even want to record real-time game footage with this hardware. Just pipe the video output to a real capture card on another machine. Problem solved.

    6. Re:Hmm. by uchian · · Score: 2

      You might not use your 3D graphics card to capture video, but if you wanted to edit video, it means you can't use your Matrox to hardware-accelerate 3D wipes/transitions/color transformations. Which is a bit of a shame really. In the frustration stakes, it's equivalent to having one sitting in your machine but having to play Doom3 through a software renderer.

      Of course if you're editing video on the cheap, you probably go for something slightly more dedicated like the Matrox RT2500 anyway, which is not that much more expensive.

    7. Re:Hmm. by ncc74656 · · Score: 2
      So really, I guess that I meant to say that I fail to see the relevance of the article. It is kind of silly, actually, to even want to record real-time game footage with this hardware. Just pipe the video output to a real capture card on another machine. Problem solved.

      Capturing what you do in the average FPS would be silly, but what if you're doing 3D rendering with your graphics card? What you propose would be like ripping CDs by plugging a CD player into your soundcard's line-in jack. What the article envisions would be more like ripping CDs with EAC...you eliminate the digital-to-analog-to-digital conversion.

      --
      20 January 2017: the End of an Error.
    8. Re:Hmm. by kasperd · · Score: 1

      1200 lines on the screen at 75Hz is at the very least a 90kHz line frequency. However, both PAL and NTSC use a 15kHz line frequency. Rendering six times as many lines as you can possibly show is a plain waste of processing power.
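      The arithmetic behind the 90kHz figure can be sketched as follows. The function name is made up for illustration, and the 15 kHz TV line rate is the rounded value from the comment (blanking lines are ignored, which is why 90 kHz is a minimum).

```python
def line_frequency_hz(lines_per_frame: int, frames_per_second: int) -> int:
    """Visible lines drawn per second; real signals add blanking lines on top."""
    return lines_per_frame * frames_per_second


monitor_hz = line_frequency_hz(1200, 75)  # 1600x1200 desktop refreshed at 75 Hz
tv_hz = 15_000                            # rounded PAL/NTSC line rate from the comment

ratio = monitor_hz / tv_hz
print(f"monitor: {monitor_hz // 1000} kHz, TV: {tv_hz // 1000} kHz, ratio: {ratio:.0f}x")
```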

      --

      Do you care about the security of your wireless mouse?
    9. Re:Hmm. by t · · Score: 2

      Are you really saying that instead of simply fixing the software drivers, you should get a second high end computer capable of capturing video at real time rates? Man are you that stupid or just trolling badly?

    10. Re:Hmm. by Anonymous Coward · · Score: 0

      I am pretty sure that it isn't a driver issue at all. Don't be so insulting to people. He has a valid point, asshead. The hardware probably isn't designed for such a task.

    11. Re:Hmm. by spectral · · Score: 1

      I'm pretty sure you didn't read the article at all. Don't be so insulting to people. The person you yelled at was correct, asshead. The drivers aren't designed for such a task.

      The problem isn't the hardware, it's the software drivers. In fact, the speed could be dramatically increased with revised software drivers. However, no manufacturer has presently made this aspect of driver performance a priority.

  2. Um, this is a surprise? by Yarn · · Score: 4, Informative

    I'd certainly expect the AGP bus to be used asymmetrically, how often do you want to do high speed data capture with a card that's primarily output?

    The only situation I can see where you'd want more than PCI bandwidth returning would be for uncompressed HDTV capture, and there are better ways to do that (grab the raw broadcast stream for example)

    --
    -Yarn - Rio Karma: Excellent
    1. Re:Um, this is a surprise? by psavo · · Score: 2, Informative

      nitpicking, AGP is not a bus. It's Accelerated Graphics Port. See article at anand for more info.

      --
      fucktard is a tenderhearted description
    2. Re:Um, this is a surprise? by jrs+1 · · Score: 0, Troll

      i think that the main usage of this would be when video cards' performance is being used for rendering pre-rendered animations and writing to disk and HURRR HONK CUMMING SO HARD IT'S SHOOTING OUT YOUR EYES.

    3. Re:Um, this is a surprise? by Mike+Connell · · Score: 5, Interesting

      There are actually some good reasons to be able to do this apart from just taking screenshots. I did (sad but true) these tests over 4 years ago finishing grad school, and the results (read back speed is very bad) were much the same.

      Two reasons for wanting to grab the framebuffer (or parts of it) are for

      a) texture imposters (realtime adaptive billboarding) and
      b) split world/image-space occlusion culling.

      With faster readback, both these techniques would probably be used more in "normal" software (ie games).
      0.02

    4. Re:Um, this is a surprise? by fredan · · Score: 1

      grab the raw broadcast stream for example

      How would it be easier to capture the HD stream, which runs at 1.5 Gbit/s, or roughly 190 MB per second?

    5. Re:Um, this is a surprise? by RadioTV · · Score: 1

      You are correct, the HD-SDI bitrate is 1.5 Gb, but most people will receive it as part of the 19.39Mb ATSC stream in DTV. A lot of broadcasters are looking at as low as 14Mb to allow us to also transmit an SD stream.

      --
      I have great faith in fools - self confidence my friends call it. - Edgar Allan Poe
    6. Re:Um, this is a surprise? by grmoc · · Score: 1

      But I can assure you that the TV studio's or TV compound's D1 will rarely or never use a compressed (lossy) video stream if an uncompressed (lossless) video stream is available.

      Just look at 10 bit vs 8 bit SDI! Then look at net return...

    7. Re:Um, this is a surprise? by jjeff · · Score: 1

      ok so that's the port - now what does the data travel to the CPU / RAM through?

      technically it's still the PCI bus - however it's acceptable to call it an AGP bus.

      --
      when everything is working perfectly.. BREAK SOMETHING before something else FUCKS up!
    8. Re:Um, this is a surprise? by matguy · · Score: 1

      Reading back from the frame buffer can also help with some other 3D applications. For instance, Lightwave can do OpenGL previews that can be saved and played back as video, and other functions of the program can be passed on to an external OpenGL renderer, with the result returned to the program and displayed on the screen (as well as read by the app itself). Anyone who has used these functions knows that they can often be pretty slow, and if the returned information came back faster, well, that would/could be nice.

      Also of note is that people running multiple monitors in Windows (yes, some people do use Windows, deal with it) have problems with 3D renderings when they are moved from one monitor to the other. While running the test mentioned in the article I could choose which card rendered the scene (an AGP GeForce 2 or a PCI TNT2) but place the image on the opposite screen, which does just what the app is measuring, because it has to read the image back from the frame buffer to pass it on to the separate card. It would show the same frames per second when transferring from one monitor to the other regardless of whether the benchmark utility was measuring the download speed or not, meaning this application definitely does measure a measurable issue in some situations. But I would get similar measurements from either card, one being AGP and the other being PCI, which would indicate to me it's not an AGP problem at all. It's also likely not a problem with chipsets or chipset drivers, since PCI sure can move data in both directions for other devices. That raises the flag that it's probably either a driver issue or a limitation of how the card is set up to move data.

      Hell, it could have even been a requirement of a DVD licencing agreement to keep people from ripping a movie back from the frame buffer.

      --

      matguy(.com)
    9. Re:Um, this is a surprise? by khuber · · Score: 1
      I think you are correct in this context. AGP could still be considered a bus though.

      My understanding is that "bus" has multiple usages. In electrical engineering, a bus is just a physical data path, a conductor that carries a signal. A single wire is a bus. Therefore AGP is a bus in that sense. EEs feel free to correct me.

      Another usage is the one you are using, a computer bus/bus bar where you connect two or more devices to one data path which may have one or more lines. (As opposed to a port that connects two devices.)

      -Kevin

    10. Re:Um, this is a surprise? by RadioTV · · Score: 1

      That depends on what they are using it for. Video production is all about uncompressed if it is available - normally it's not. Sony Beta SX uses an MPEG-like compression and Sony IMX uses 50 Mb I-frame-only MPEG. Even the top of the line Digi Betacam uses a proprietary compression scheme. These are the current SD (standard definition) 1/2-inch formats for production. The other popular options are 25 or 50 Mb DV on DVCAM.

      For distribution - almost everyone uses some form of compression. Satellite bandwidth is way too expensive to send uncompressed (270 Mb SDI and 1.5 Gb HD-SDI). Even if it was sent uncompressed, we would record it on one of the above tape formats or on a video server (5-50 Mb MPEG or DV25/DV50).

      The 10 vs 8 bit isn't compression; it is a sample rate reduction. If you don't sample the data, you don't have a chance to compress it.

      --
      I have great faith in fools - self confidence my friends call it. - Edgar Allan Poe
  3. Just Drivers by gerf · · Score: 1

    Notice that they're quick to point out the problem isn't likely a hardware issue. There should be plenty of bandwidth on the AGP bus, but graphics chip makers don't seem to have written their drivers to handle transfers from AGP cards to main memory properly.

    Basically, if enough people want to use their card in this manner, Nvidia, with their super-duper driver support, will do so. 'nuff said. or, whoever knows how (i sure don't!), can write one for linux that takes this into account.

    1. Re:Just Drivers by rmadmin · · Score: 1

      Nvidia writes their own Linux Driver. I'm using it, and it works great.

    2. Re:Just Drivers by gerf · · Score: 1

      Nvidia writes their own Linux Driver [nvidia.com]. I'm using it, and it works great.

      True, but it's not open source. I'm saying, if someone were to write drivers before Nvidia would. and yes, i've heard many good things about Linux drivers for Nvidia cards.

    3. Re:Just Drivers by Jeremy+Erwin · · Score: 2

      Nvidia writes their own Linux Driver. I'm using it, and it works great.

      Yes, but are you downloading textures/frames from the card to main memory?

      The issue here is whether it is possible to use the programmable GPU to render frames for use in animation projects. The various bandwidth problems appear to be associated with drivers optimized for immediate display.

      With an open source driver, the few individuals running linux based rendering farms could, theoretically, relieve the CPU of some of its load. With closed source drivers, you will have to rely on nVidia optimizing their drivers for this kind of minority application.

    4. Re:Just Drivers by Anonymous Coward · · Score: 0

      True, but it's not open source.

      don't be a fucking OSS linux hippie, you fuckwad.

      ok?
      kthxbye~~~

    5. Re:Just Drivers by Anonymous Coward · · Score: 0

      You're an IGNORANUS, both stupid and an ASSHOLE. The GPL issue IS important if you are developing an application.

    6. Re:Just Drivers by Anonymous Coward · · Score: 0

      Yeah, it IS important to avoid it like the plague.

  4. Software issue? by larien · · Score: 5, Informative
    From the article, the author reckons this is a software (driver) issue rather than a hardware issue. I also note the test rig ran Windows, but how does linux shape up? Is it better or worse?

    In any event, there's another issue he doesn't really touch upon; while he mentions that a single frame at 1600x1200@32bit colour is 7.5MB, he ignores the fact that a 30fps movie would require (30*7.5)=225MB per second uncompressed; you either have to have that much disk bandwidth or have enough CPU grunt to compress that on the fly. I guess a dedicated MPEG encoder card could help, but your average box is going to have trouble keeping up with on-screen gibs, rocket trails and blood splatters and encoding video.
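    The frame-size and data-rate figures above can be checked with a few lines of arithmetic. This is only a sketch with made-up helper names; note that the exact byte counts come out slightly different from 7.5MB and 225MB/s depending on whether "MB" means 10^6 or 2^20 bytes (7.5 is what you get when dividing by 1024 once and 1000 once).

```python
def frame_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * bits_per_pixel // 8


def stream_bytes_per_second(width: int, height: int, bits_per_pixel: int, fps: int) -> int:
    """Uncompressed video data rate in bytes per second."""
    return frame_bytes(width, height, bits_per_pixel) * fps


frame = frame_bytes(1600, 1200, 32)
rate = stream_bytes_per_second(1600, 1200, 32, 30)

print(f"one frame: {frame} bytes = {frame / 1e6:.2f} MB = {frame / 2**20:.2f} MiB")
print(f"30 fps:    {rate} bytes/s = {rate / 1e6:.1f} MB/s = {rate / 2**20:.1f} MiB/s")
```

    Whichever unit you pick, the conclusion stands: the stream is well over 200 megabytes per second before compression.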

    1. Re:Software issue? by IncohereD · · Score: 1

      In any event, there's another issue he doesn't really touch upon..

      He mentions that this sort of setup is required for outputting _production quality_ video. If you're a production house, I'd imagine you'd already have the required RAID array/CPU power, etc. to do this sort of thing.

    2. Re:Software issue? by Malc · · Score: 1

      Or a Firewire connection to a DV camera.

    3. Re:Software issue? by TopherC · · Score: 1
      That's what I was thinking too: How would most gamers really use this feature? Most games suck up all the CPU you have available, leaving nothing left for encoding and writing to disk. A game would definitely not be playable while recording. You might be right about MPEG encoding cards handling the job, but I'm sure that would require even more drivers and possibly new hardware. Anyway, these things are made for DV, which is only about 640x480 resolution. Typical DV bandwidth is 29.5 Mbps (uncompressed). I'm not sure what kind of bandwidth is allowed by most video cards, but this may be nearly enough.

      The article seemed to me to be a kind of glorified whining, wishing that video card manufacturers would take notice for some reason and satisfy the author's personal desire. The case he makes for the typical consumer wanting this feature is just silly. Well, at least he made it on /.!

    4. Re:Software issue? by Anonymous Coward · · Score: 0

      Firewire? 225MByte/sec == 1.8Gbit/sec which is over twice as much as the new 1394b can handle (800 Mbit/sec). I wouldn't try to push this over anything but a massive SCSI or Fiberchannel raid array, and even that is iffy.
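      A quick sketch of that comparison, using the 225 MB/s figure from the parent and the nominal FireWire signalling rates (real-world throughput is lower, so this is the optimistic case). The dictionary and function names are illustrative only.

```python
# Nominal link rates in Mbit/s; sustained throughput is lower in practice.
LINKS_MBIT = {
    "FireWire 400 (1394a)": 400,
    "FireWire 800 (1394b)": 800,
}


def mbytes_to_mbits(mbytes_per_second: float) -> float:
    """Convert megabytes per second to megabits per second."""
    return mbytes_per_second * 8


required = mbytes_to_mbits(225)  # uncompressed 1600x1200x32 at 30 fps, per the parent
for name, capacity in LINKS_MBIT.items():
    print(f"{name}: stream needs {required / capacity:.2f}x the nominal link rate")
```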

    5. Re:Software issue? by Anonymous Coward · · Score: 1, Informative

      1600x1200 is a little extreme. But a while back I had to make a video presentation for work. The first thing I tried was connecting a DV cam to the s-video output, but the compression made the text unreadable. So I used software that captured an AVI and then compressed it to an MPEG with high quality settings. Everything was beautiful, except that the video was 5 frames per second. The screen resolution was 800x600x16. At 30 fps that is just over 27 megs a second, which the SCSI U160 disk could easily keep up with. So there are obvious benefits to correcting the problem even if some machines won't be able to capture their Quake match.
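      For reference, here is the same calculation written out, together with the data rate the 5 fps capture actually achieved. The 160 MB/s value is the nominal Ultra160 bus rate, not a measured disk speed, and the helper name is invented for this sketch. Nothing here proves that readback was the bottleneck in that capture; it only shows how far apart the two rates are.

```python
ULTRA160_MB_PER_S = 160  # nominal Ultra160 SCSI bus rate; a single disk sustains less


def bytes_per_second(width: int, height: int, bits_per_pixel: int, fps: int) -> int:
    """Uncompressed video data rate in bytes per second."""
    return width * height * (bits_per_pixel // 8) * fps


target = bytes_per_second(800, 600, 16, 30)   # what a smooth capture would need
achieved = bytes_per_second(800, 600, 16, 5)  # what the 5 fps capture delivered

print(f"30 fps target:  {target / 2**20:.1f} MiB/s")
print(f"5 fps achieved: {achieved / 2**20:.1f} MiB/s")
print(f"target as share of nominal U160 bus: {target / 1e6 / ULTRA160_MB_PER_S:.0%}")
```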

    6. Re:Software issue? by saider · · Score: 1

      There would still need to be some compression because Firewire is only 400Mbps. The 225MBps (big B) correlates to well over 1Gbps. This may be handled by Firewire 2, but there isn't much equipment for that (yet).

      Consumer versions of DV cameras only handle about 25Mbps for the NTSC and PAL signals. Higher resolutions are available in professional gear. My guess is they would want the higher resolution stuff for making HDTV signals for broadcast or DVD production.

      --


      Remember, You are unique...just like everyone else.
    7. Re:Software issue? by Malc · · Score: 1

      Sorry, you're right. He was talking about 1600x1200 frames, not S-Video quality. My mistake.

    8. Re:Software issue? by f8xmulder · · Score: 1
      If you're a production house, I'd imagine you'd already have the required RAID array/CPU power, etc. to do this sort of thing.

      Likewise, a production house would probably have dedicated hardware for capturing video, as opposed to using the in-box video card...

    9. Re:Software issue? by Fweeky · · Score: 2

      The idea is to use the [GV]PU to render your production quality images, so grabbing the rendered image directly off the card is exactly what you want.

    10. Re:Software issue? by f8xmulder · · Score: 1

      Rendering you want - capturing the data is a different thing completely. If I own a production company, I wouldn't want to have my rendering device also be my capture device -- there are chances for an unstable setup. Also, with two devices (vs. the one device to do both rendering and capturing) I have more power to devote to both, since they're separate devices on separate boxes, they each get more compute cycles: that = faster, better quality productions...

    11. Re:Software issue? by grmoc · · Score: 2

      Speaking as someone in the industry, and being under the cloud of this problem...

      When your pursuit is REAL TIME special effects/video manipulation, this problem has little to do with the disk, raid or no raid.

      We just want to get the video out of the graphics accelerator and into a professional video IO card. Aside from the fact that this greatly stresses the PCI bus, the problem with the AGP bus is worse.

      The number of motherboards with both 64 bit PCI and AGP can be counted on one hand. While NTSC (uncompressed SDI) is around 270 Mb/s (a number which is certainly way below the peak bandwidth numbers), doing both in and out of the card as well as other IO (ethernet, serial, sound), pretty much ensures you'll have problems with latency.

      Around 60% of our CPU usage is associated with blitting video out of the graphics accelerator.

      It would be really nice if they got AGP to work.

      At this point, we're just hoping that video cards will go over to PCI-X, whose hardware will have to work well for both input and output.
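      To give a feel for the PCI load described above, here is a back-of-the-envelope sketch. It assumes a plain 32-bit, 33 MHz PCI bus with its usual 133 MB/s theoretical peak, shared by every device on the bus; the comment does not say which PCI variant was in use, so treat that constant as an assumption.

```python
SDI_MBIT_PER_S = 270        # uncompressed SD-SDI rate quoted in the comment
PCI32_PEAK_MB_PER_S = 133   # assumed: 32-bit/33 MHz PCI theoretical peak, shared


def mbits_to_mbytes(mbits_per_second: float) -> float:
    """Convert megabits per second to megabytes per second."""
    return mbits_per_second / 8


one_way = mbits_to_mbytes(SDI_MBIT_PER_S)  # one stream in or out of the video IO card
in_and_out = 2 * one_way                   # simultaneous input and output
share = in_and_out / PCI32_PEAK_MB_PER_S

print(f"one stream: {one_way:.2f} MB/s, in + out: {in_and_out:.1f} MB/s")
print(f"share of theoretical 32-bit PCI peak: {share:.0%}")
```

      Under that assumption, video in and out alone takes about half of the bus's theoretical peak before ethernet, serial and sound get a turn, which fits the latency complaint.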

    12. Re:Software issue? by Fweeky · · Score: 2

      Um, no, you've missed the point. This isn't about using a GFX card as a device to capture external data, it's to grab the images it's rendering into system memory so you can use it to, say, render some CG in a movie. You don't want to use a separate device to grab the images your expensive cinematic quality GFX card produces when you could just dump it directly into the device it's running on.

    13. Re:Software issue? by f8xmulder · · Score: 1

      Actually, I understand your point quite well. I think you're not understanding what I'm saying...

      OK, scenario: I use my expensive GFX card to play Unreal Tournament. I don't just want to grab screenshots. I want to actually grab sequential frames in REAL-TIME from my game while I'm playing to create movies.

      That takes a LOT of compute cycles, not just from the GFX card, but from the entire system. What I'm saying makes sense, if you think about it.

      Use a separate capture device to take what is being processed by my expensive GFX card (which utilizes enough compute cycles as it is!) and capture that video to a different box. That way you're getting full output/input.

      Using the GFX card to play AND capture at the same time is just not feasible, not to mention unwise (read: stability issues).

    14. Re:Software issue? by Fweeky · · Score: 3, Interesting
      OK, scenario: I use my expensive GFX card to play Unreal Tournament. I don't just want to grab screenshots. I want to actually grab sequential frames in REAL-TIME from my game while I'm playing to create movies.

      Actually, my scenario is more like:

      I use my expensive GFX card to render shots for my incredibly innovative but poorly funded sci-fi flick. I want to grab each frame in perfect detail so it can be post-processed. The easiest and cheapest way to do this is to have the renderer save each frame as it's computed. Real-time is not an issue, just like it's not an issue with a raytracer or whatever.

      Using the GFX card to play AND capture at the same time is just not feasible, not to mention unwise (read: stability issues).

      It better become feasible if companies are going to want renderfarms based on the nv30/40/whatever. Having two separate machines per renderer would be pretty.. dodgy :)
    15. Re:Software issue? by f8xmulder · · Score: 1

      What you're talking about is partly a hardware thing, partly a software thing. Here's why: I can render out a scene for my incredibly innovative but poorly funded sci-fi flick in two ways; 1). As one contiguous movie file or 2). As separate, high-quality image files (say TIFF or TARGA), which I will later edit individually. Whatever I choose to do, the speed of that render is essentially based on the software I'm using (am I using a highly optimized 3d application?) PLUS the speed of the hardware, mostly CPU. Regardless, the scenario you're talking about is less about realtime capturing (and therefore AGP bandwidth usage) and more about software rendering application optimization and up-to-date hardware...

    16. Re:Software issue? by Fweeky · · Score: 2

      Well, sort of. I'm mainly thinking in terms of using the GPU to render the scenes in hardware rather than just using a software renderer. Since this seems to be the direction cards are moving in (that is, hardware rendered scenes competing directly with traditional raytracers due to all the shader stuff and higher bit depths), bottlenecks like this will become more of an issue.

      Although having said that, I doubt even hardware accelerated rasterisers will be pushing 10MB/s of video data out in most cases, so.. :)

    17. Re:Software issue? by f8xmulder · · Score: 1
      I believe you're right about using hardware rendering in the future...unfortunately, with current computing power, I don't believe it is a viable solution, at least not for consumers...

      ...Interesting back-and-forth though :-)

    18. Re:Software issue? by Fweeky · · Score: 2

      Well, given that GPU's are highly targeted at consumers, I can really see non-realtime GPU-rendering for the lower end of the market (the basic consumers who make wallpapers and the like), while the higher end market sticks with the far more computationally expensive CPU-bound raytracing, probably mixing in GPU based stuff on less complex scenes (plenty of CG is just fancy texture mapping and a bit of warping/shading, after all).

      Seriously, by the time most people have nv30+-level GPU's, they'll have an enormous amount of rendering power that's quite comparable in most cases with raytracing. If you can render a scene on the GPU in a few seconds and have it look almost identical to a raytraced image that takes an hour, which do you think the average user will choose? :)

      Worth keeping an eye on, anyway.

  5. 64 bit AGP support by ppetrakis · · Score: 1

    Should speed things up a bit, though last time I checked the Linux kernel didn't support it, even on Alphas. It's been part of the AGP spec from the beginning.
    Someone please correct me if I'm wrong.

    Peter

    --
    www.alphalinux.org
  6. nobody asked! by tanveer1979 · · Score: 2, Interesting

    "However, no manufacturer has presently made this aspect of driver performance a priority."
    Why should they? Was anybody complaining till now? The well won't come to the horse; the horse has to go to the well to drink water.
    So unless a large number of people want it, nobody wants to mess around with a perfectly working driver.
    And it is not a piece of cake. Recording its own renderings the software way would be a bitch; the best way would be to provide an access point on the bus itself, though it would play havoc with the board timings and noise issues.
    In the end it will all come down to: will it justify the cost?

    --
    My Aurora : http://www.youtube.com/watch?v=o91ZsGwJYyg
    FB : https://www.facebook.com/TanveersPhotography
    1. Re:nobody asked! by amigabill · · Score: 1

      >>"However, no manufacturer has presently made this aspect of driver performance a priority."
      >
      >Why should they, was anybody complaining till now.

      There's at least one rendering/video software company working with one of the GPU vendors to get drivers that do this. Just think of the benefits for Hollywood, being able to render scenes via the GPU's 3D pipeline rather than software on the CPU, and be able to save the output of the GPU. There is certainly potential to decrease rendering time, as the GPU could do a good bit of work, and the CPU could software-render in the background while it's also preparing data for the GPU and copying finished frames back from the GPU card. I had wondered why this wasn't done for quite a while, and was happy to finally hear it was being worked on.

      There's also an issue of TV capture. I have an ATI AIW 8500DV card in my Athlon XP 1700+, but no matter what codec option/resolution I choose, the audio and video get out of sync. The capture says 0 frames dropped... (My other AIW 7500 in a K6-2+/550 has about 18% frame loss.) As the TV tuner/capture device is in the AGP slot, perhaps better readback bandwidth would help? I've very recently learned of the Linux GATOS project, and hope to try and get that working, if I find some free time. Win2000 might be hogging some resources it doesn't need, as the thing is intended to be a Tivo-like unit (but with DVD, DivX, MP3, etc. playback capability as well as TV capture).

      Now that I think of it, readback from a Gfx card could also be useful for DVD/DivX edits. The Radeon 9700 claims to be able to process video in its pipeline, and smooth out some of the MPEG artifacting squares. If you have an over-compressed DivX, perhaps you could use this feature to make a clearer copy direct back to disk... I have a couple of early DVDs that have weird looking gradients in clouds and such that could use some smoothing as well. The opening cloud fly-through in Mystery Men is one example of over-compressed DVD content that could use some attention like this.

  7. Imagine That by mosch · · Score: 5, Insightful
    Wow, what a surprise. Video cards being built on ultra-thin margins are only being designed for the use that 99.99% of the population wants to use them for. You'd think with their huge 4% and 5% profits they'd add in lots of features that only a very few people want, just in case!

    In summary, who the fuck cares?

    1. Re:Imagine That by xsbellx · · Score: 0, Troll

      Nice numbers. Did you get them from /dev/random or /dev/urandom ?

      --
      If VISTA is the answer, you didn't understand the question
    2. Re:Imagine That by Anonymous Coward · · Score: 0

      It may be my fault for buying it, but you need a high AGP download rate when capping video with a card like an AIW Radeon 8500DV. The card was built with vid-captures in mind and it's quite a shame that the manufacturers aren't using the cards to their full potential.

    3. Re:Imagine That by Anonymous Coward · · Score: 0

      You, sir, are a moron.

    4. Re:Imagine That by epine · · Score: 4, Insightful


      This is exactly the attitude that creates endless headaches mapping good concepts onto workable implementations, and results in systems becoming so convoluted by the time they work properly they are nearly impossible to maintain.

      The principle of least surprise dictates that random orders of magnitude should not be sacrificed in your fundamental primitives.

      It seems to me that if I spend $300 on my CPU and $600 on my GPU that I might want to be able to fetch back what the GPU creates. What kind of idiot puts their most powerful processor at the end of a one way street?

      There are endless reasons that could come up why this feature might need to be exploited. Just because you can't come up with them doesn't mean they don't exist. You are talking about 99.9 percent of your own creativity, which I assure you is a far sight less than the sum total of the creativity out there looking for cool new things to do.

      It does make sense to consider cost/benefit here. The first observation here is that we are talking about a baseline primitive (texture returned to system memory), and that we are looking to recover a rough factor of ten, not a rough factor of 10 percent.

      In the video card industry, things are designed to hit the 90 percent point. These days the GPU industry rivals the CPU industry in dollar value. I simply can't believe the graphics card companies can't afford to have someone sit down and crank this up to 50% bus utilization. I suspect they could do this without even scratching their head.

      I've had to use many primitives over the years designed by this guy or his second cousin. If he only knew how much of the pain he experiences as a computer user is the result of good people bending over backwards to deal with unsuspected, arbitrary constraints when they could have been polishing the product interface instead. But some people have no imagination for these things.

    5. Re:Imagine That by Wakko+Warner · · Score: 2

      When is the last time you absolutely needed to capture 1600x1200 video? I'm sure the manufacturer made sure the drivers allowed for TV capture, otherwise there would be a lot of unsatisfied customers.

      - A.P.

      --
      "Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
    6. Re:Imagine That by Anonymous Coward · · Score: 0

      If Torvalds had been like you, he'd still be whining at Bell Labs to port Unix to the 386. If you really want a driver that damn near no one else gives a shit about, write it yourself.

    7. Re:Imagine That by Anonymous Coward · · Score: 0

      creates endless headaches mapping good concepts onto workable implementations, and results in systems becoming so convoluted by the time they work properly they are nearly impossible to maintain.

      Well, looks like you've used up your quota of $7 words for the day.

      And to what end?

    8. Re:Imagine That by sgtsanity · · Score: 1

      Well, only a few people really want many of the advanced features of NV30, but one of those few just happens to be John Carmack.



      I'm sure he could find a way to effectively use that technology.

    9. Re:Imagine That by JBv · · Score: 1

      I know people who care, and a lot!

      It was only a few months ago that a friend of mine was complaining that the nvidia blah-blah-force could not surpass in performance a matrox in realtime video. Nothing special involved, apart from the requirement that no frames are dropped and that there is precise time synchronization.

      It was particularly frustrating (my friend complained a lot about it) convincing the sysadmins that such a good card for games was worse than a 4 year old card that only pumps 50 fps in quake 3.

    10. Re:Imagine That by Anonymous Coward · · Score: 0

      In summary, who the fuck cares?

      Hollywood cares. Oh, me also. I won't buy geforce4 for gaming. But I would definitely buy one for rendering if this problem were solved.

      I would think that with such low profit margins, some companies would be more willing to let the OSS community handle the driver writing for them.

  8. 128 bit colour? by Xugumad · · Score: 1

    "To put these results into perspective, a single frame rendered at 1600x1200 with 32 bits of color weighs in at about 7.5MB. Double that to 64-bit color, and it's 15MB per frame. And a single image at 1600x1200 in 128-bit color is over 30MB."

    Huh? Why on earth would they want 128-bit colour. AFAIK the human eye can't tell the difference beyond 24-bit, and 32-bit is just there to make the processing a bit easier. Maybe someone can correct me on this, but it seems an extremely poorly thought out complaint.
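
    The frame sizes quoted from the article can be checked with a little arithmetic. A minimal sketch, assuming uncompressed frames with no row padding (the function name is only illustrative):

```python
def frame_bytes(width, height, bits_per_pixel):
    """Size in bytes of one uncompressed frame with no row padding."""
    return width * height * bits_per_pixel // 8


# Print the size of a 1600x1200 frame at each colour depth the article mentions.
for bpp in (32, 64, 128):
    size = frame_bytes(1600, 1200, bpp)
    print(f"{bpp:>3}-bit: {size:,} bytes = {size / 2**20:.2f} MiB")
```

    Whether 32-bit comes out as "about 7.5MB" depends on whether a megabyte is counted as 10^6 or 2^20 bytes; either way the size doubles with each step in depth.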

    1. Re:128 bit colour? by psavo · · Score: 2

      Huh? Why on earth would they want 128-bit colour. AFAIK the human eye can't tell the difference beyond 24-bit,

      Yes, the human eye can't go beyond that, but any decent processor can. And the image should be processed after being grabbed from screen, for example divx:ed, or something.
      if you don't know why scanners grab images at more than 8 bits/channel then..

      --
      fucktard is a tenderhearted description
    2. Re:128 bit colour? by cperciva · · Score: 2

      Using floating-point luminosity values eliminates a variety of clipping artifacts which otherwise appear close to light sources.

    3. Re:128 bit colour? by Tyler+Eaves · · Score: 1

      Cumulative error. Over multiple FX passes the margin of error gets larger and larger.

      Most movie work is done at at least 48bit.

      --
      TODO: Something witty here...
    4. Re:128 bit colour? by Tom7 · · Score: 2

      I had the same reaction, so I checked it out. Apparently 128-bit internal processing is useful when doing many stages of texturing and effects, because while 8 bits per color is typically fine for humans, some of that resolution is invariably lost during processing.

      However, there's NO REASON I can tell why you'd actually want to grab 128-bit color rendered frames! They could be dithered to 24 or 32 bit without losing anything visible.

    5. Re:128 bit colour? by Viking+Coder · · Score: 5, Interesting

      If you're doing multi-pass rendering, it might be extremely convenient to capture the results back to main memory. Especially if the board doesn't have enough texture memory to support all of your temporary buffers.

      And boards are starting to ship with 128-bit IEEE floating point buffers.

      Essentially, you're right - a human can't tell the difference beyond 24-bit on a given image. But if 100 images were composited together (very likely, to support something like RenderMan-style rendering in hardware), 24 bits is nowhere near enough - you'd get all sorts of accumulation error.
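
      A toy illustration of that accumulation error, not taken from any real renderer: assume 100 additive passes that each contribute 1.3 units of intensity on a 0-255 scale. Rounding to an integer after every pass (as an 8-bit buffer would) is compared with keeping a floating-point total and rounding once at the end. The function names and the 1.3 figure are made up for the example.

```python
def composite_quantized(contribution, passes):
    """Add the contribution once per pass, rounding to an integer each time."""
    total = 0
    for _ in range(passes):
        total = min(255, total + round(contribution))
    return total


def composite_float(contribution, passes):
    """Accumulate in floating point and round only for the final output."""
    total = 0.0
    for _ in range(passes):
        total += contribution
    return min(255, round(total))


print(composite_quantized(1.3, 100))  # per-pass rounding discards the fraction
print(composite_float(1.3, 100))      # fraction survives until the final round
```

      The per-pass version loses the 0.3 on every pass, so the two results drift apart by 30 levels over 100 passes.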

      --
      Education is the silver bullet.
    6. Re:128 bit colour? by Clowning · · Score: 2

      The day i see a gradient on my computer screen without visible "banding" is the day we have reached a high enough color depth...32-bits is simply not enough.

      Last time i checked, my eye was a human one.

    7. Re:128 bit colour? by Mr.Sharpy · · Score: 1

      i think the reason why one would want 64 or 128 bit color depths is to avoid dithering as much as possible. even though you may have a palette of 16 million colors at 32 bits, if you are rendering an image and the software says a pixel should be some color between color number 14,528,208 and 14,528,209 then it has to dither either up or down. if the software was rendering in 64bit color, it may avoid that imprecision. and as we know, imprecision from rounding tends to snowball through successive operations.

      the end result is an image that is truer to what was supposed to be rendered.

    8. Re:128 bit colour? by Space+cowboy · · Score: 3, Informative

      Once, definitely. Twice, probably. Thrice, perhaps.

      You typically composite and re-composite layer after layer to create decent effects, it's not a one-shot thing. Certainly professional video runs at ~48bit for film work.

      Simon

      --
      Physicists get Hadrons!
    9. Re:128 bit colour? by Jugalator · · Score: 2

      Yeah, personally I'm a fan of 1 bit color :)

      --
      Beware: In C++, your friends can see your privates!
    10. Re:128 bit colour? by Mantrid · · Score: 1

      The final output really doesn't matter. The reason that John Carmack and others are asking for higher bit planes is that colour is lost when mixing several light colours together...by the time you break down several lights (which have the number of bits reduced when being combined) and recombine them, the colour is actually noticeably affected.

    11. Re:128 bit colour? by tomstdenis · · Score: 2, Informative

      flamebait much?

      First off there is no such thing as 32-bit color. It's 24-bit color with either a padding octet or an alpha channel.

      Second, 256 levels is enough that provided a good monitor you can make do quite well.

      Third, flamebait much?

      Tom

      --
      Someday, I'll have a real sig.
    12. Re:128 bit colour? by Latent+IT · · Score: 2

      Um. You get banding because of pixelation, not because of a lack of colors to choose from. Maybe it would help if you knew what you were talking about?

      If you want to display a gradient from say, dark blue to light blue, you have quite a few shades of blue to choose from. More than 1024, that's for sure, especially in 32 bit color. But your monitor can only display 1024 vertical lines, each being a different shade. (Depending on your resolution, blah, blah, blah.)

      Therefore, you get banding. Go ahead, use 64 or 128 bit color. It'll help, in the 'it won't help at all' sense.

    13. Re:128 bit colour? by Xugumad · · Score: 1

      There was a really strange conversation between my flatmates and me about one bit, single pixel Quake, and likely rendering speeds. I don't know why. I suspect sleep deprivation. This in fact made a lot more sense before I started typing.

    14. Re:128 bit colour? by fingal · · Score: 4, Informative
      If you want to display a gradient from say, dark blue to light blue, you have quite a few shades of blue to choose from. More than 1024, that's for sure, especially in 32 bit color. But your monitor can only display 1024 vertical lines, each being a different shade. (Depending on your resolution, blah, blah, blah.)

      Hmmm. Close but still not quite right. Think of the colour space as a cube with RGB as the three axes of the cube. In 32bit colour you have 8 bits per colour plane, giving you a cube that is 256 x 256 x 256. Any gradient from any point on the cube to any other point on the cube is going to be a maximum of 443 (if my maths isn't freaked out - distance from two opposite corners of the cube). Plus some messing about with the various quantisation that this line will pass through gives you definite banding on all but the lowest resolution displays...

      --

      The only Good System is a Sound System

    15. Re:128 bit colour? by Anonymous Coward · · Score: 0

      It's so elves can finally see their full range of vision. I mean, 24 bit, damn humans, it's great to see someone acknowledging that us fantasy creatures need some hardware loving too.

    16. Re:128 bit colour? by Latent+IT · · Score: 2

      Darn you to heck for making me try to think in 3d. ;p

      Yes, I'm pretty sure you're more or less right on the 443, though I would have expressed it as ~400, due to the fact that I don't like niggling with triangles.

      The thing is, you get more shades of blue than just the 443. As 255 RGB values, shades of red can be

      255 0 0
      255 1 0
      255 1 1
      255 0 1
      255 2 0
      255 2 1
      255 2 2
      255 0 2
      255 1 2

      (I say red now because I put the 255's first, and don't want to write it again.) ;p

      And so on. Each resulting in a different shade of blue.

      *I think* anyway. We're wandering off the pier of stuff I know, into the stuff I think I might be able to figure out. ;p

      So, I think you'd get more than 443, and have more blue than monitor lines, still.

    17. Re:128 bit colour? by fingal · · Score: 2
      You are confusing "shades of blue" and "linear interpolation between two colours". Yes there are lots of shades of blue (some people might even say that every colour where B > R && B > G is a shade of blue), but if you are doing a gradient then you want the shortest distance between two points. If this is of length 443 and your unit resolution is that of 1 then I don't see where you are going to get your extra colours from. In fact what you are going to have to do is to alias your colours when you do the interpolation and therefore you will be introducing colours that do not strictly speaking lie on the perfect line at all. The smaller that your quantisation limit is, the closer to the ideal your interpolation becomes and the less visible artifacts are introduced.

      Also remember that the figure of 443 is the theoretical maximum number that can be achieved. Most interpolations will be from two points in the colour cube that are much closer together and will therefore result in correspondingly worse artifacts.

      --

      The only Good System is a Sound System

    18. Re:128 bit colour? by Viking+Coder · · Score: 3, Insightful

      First, Matrox and 3dLabs are both shipping products that do 10r-10g-10b-2alpha color.

      Second, the poster wants to do more than "make do". You can also make do with 16 colors. And no, 256 levels is not enough, if you're compositing many images together, or if your data has a high dynamic range (which would require more gamma range than 256 levels are capable of providing, without serious banding.)

      Third, pot. Kettle. Black.

      --
      Education is the silver bullet.
    19. Re:128 bit colour? by Anonymous Coward · · Score: 0

      If vision works like the rest of the brain then you don't see any particular colour but see the difference between two colors, so shading looks smoother between points x, y and z with a higher bit depth even though you really can't tell the difference in color between x and y or between y and z.

    20. Re:128 bit colour? by jandrese · · Score: 1

      Sounds like the ultimate result of one of those "Ha! I've got more FPS from Quake by turning off all of the textures and killing every graphics feature in the game, and running it at 640x480!"

      "Ha! I'm running it in 1 bit mode, I've got more FPS than you. You don't stand a chance!"

      "Oh yeah! Well, I've dropped my bit depth down to 1 pixel too! AND I've dropped my resolution to 320x240! Look, 5000 FPS! You are so dead!"

      "Oh yeah! I've dropped my resolution to 1x1 at 1 bit. I can't read the FPS counter anymore, but I know it's higher than yours! So there!"

      and so on...

      --

      I read the internet for the articles.
    21. Re:128 bit colour? by gl4ss · · Score: 1
      32-bit is just there to make the processing a bit easier
      well, yeah, most of the time where 24bits is like RGB, 8 bits per letter, 32bits is ARGB, where the 8 bits of A isn't really used at all at what you're actually seeing.
      --
      world was created 5 seconds before this post as it is.
    22. Re:128 bit colour? by MagPulse · · Score: 2

      Another benefit besides accuracy for multi-pass rendering with tens or hundreds of passes, is that it allows for high dynamic range rendering. 128 bits is enough to encode candlelight and daylight in the same floating point number. So the game engine can just "count up photons" as Carmack says in his recent speech, and then the 128-bit passes are done, then the final pass samples it down to 32-bit for presentation on the monitor. This allows the downsampling to take advantage of any information available on the monitor's gamma curve - what the actual displayed intensity is for a given value. It also lets programmers give up one level of fudging and simply do physically correct lighting calculations, since they can leave the presentation issues to the final downsample.
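
      A rough sketch of what such a final downsample could look like for one channel. This is not Carmack's method or any driver's: it assumes a simple Reinhard-style curve x / (1 + x) to compress the range and a display gamma of 2.2, and both the function name and the default parameters are illustrative.

```python
def tonemap_to_8bit(radiance, exposure=1.0, gamma=2.2):
    """Map a non-negative floating-point radiance to an 8-bit display value.

    Assumptions for illustration: Reinhard-style compression x / (1 + x)
    followed by gamma encoding for a display with the given gamma.
    """
    if radiance < 0:
        raise ValueError("radiance must be non-negative")
    scaled = radiance * exposure
    compressed = scaled / (1.0 + scaled)   # squeezes [0, inf) into [0, 1)
    encoded = compressed ** (1.0 / gamma)  # compensate for the display curve
    return round(encoded * 255)


# Dim and very bright values both land inside the 0-255 output range.
for value in (0.0, 0.01, 1.0, 100000.0):
    print(value, tonemap_to_8bit(value))
```

      The point is only that the lighting maths can stay in floating point, with all display-specific fudging confined to this last step.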

    23. Re:128 bit colour? by Alsee · · Score: 3, Informative

      Any gradient from any point on the cube to any other point on the cube is going to be a maximum of 443 (if my maths is freaked out - distance from two opposite corners of the cube)

      The distance between opposite corners is about 443, but the diagonal distance between color points is 1.732, so you still have 256 points in the gradient.

      Think about it this way, the gradient from (0,0,0) to (255,255,255) passes through (1,1,1), (2,2,2), etc. Exactly 256 points.
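
      A small sketch that counts the steps directly, assuming 8 bits per channel and plain linear interpolation (the helper name is made up). It steps once per unit of the largest channel difference and compares the number of colours visited with the straight-line distance in the cube.

```python
import math


def gradient(start, end):
    """Integer RGB colours on the straight line from start to end.

    One step per unit of the largest channel difference, so a channel
    never skips a level that the 8-bit format could represent.
    """
    steps = max(abs(e - s) for s, e in zip(start, end))
    if steps == 0:
        return [tuple(start)]
    return [
        tuple(s + round((e - s) * i / steps) for s, e in zip(start, end))
        for i in range(steps + 1)
    ]


black, white = (0, 0, 0), (255, 255, 255)
yellow, blue = (255, 255, 0), (0, 0, 255)

# Number of colours in the gradient versus Euclidean distance in the cube.
print(len(gradient(black, white)), round(math.dist(black, white), 1))
print(len(gradient(yellow, blue)), round(math.dist(yellow, blue), 1))
```

      Both corner-to-corner gradients come out at 256 colours. The distance between (0,0,0) and (255,255,255) is about 441.7; the 443 figure corresponds to a cube side of 256. A gradient between two nearby colours has correspondingly fewer steps, e.g. 41 for a blue channel going from 100 to 140.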

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
    24. Re:128 bit colour? by tomstdenis · · Score: 1

      2-bit alpha? But why? That would be even worse than 8/8/8/8.

      Second many standards [e.g. MPEG] don't allow 16-bits per channel. IIRC MPEG and JPEG both allow up to 10 or 12 bits per channel. So using extra bits is really kind of meaningless unless you are going to be mixing thousands of images on one composite and want per shading level accuracy.

      Third, what the heck is with this pot and kettle. I don't even get it. Wow.

      Tom

      --
      Someday, I'll have a real sig.
    25. Re:128 bit colour? by Viking+Coder · · Score: 2

      First, 10-r, 10-g, 10-b is pretty valuable to some people. I agree that 2-bit alpha is pretty miserable - but some people don't need to alpha blend. *shrug* I was just illustrating that there are color schemes in shipping products today, that use more than 24 bits for rgb.

      Second, for those people that DO need to blend, they often need to blend 100s of images. You don't need to get out to 1000s of images to see these effects. Just because current standards for MPEG and JPEG don't allow more, that doesn't mean it's useless. And I'm talking more about generating PRman (RenderMan)-style graphics. One approach is to render many, many passes - decomposing the math down into 100s (1000s) of images. It adds up to visual artifacts, very quickly, unless you have extended bit depths.

      Third, saying the first poster was posting flamebait - I was saying that what you were doing was a case of "the pot calling the kettle black." I was accusing you of posting flamebait. =)
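
      For anyone wondering how 10r-10g-10b-2alpha fits in the same 32 bits as 8/8/8/8, here is a minimal packing sketch. The bit order (alpha in the top two bits, then red, green, blue) is an assumption for illustration; real hardware and APIs define their own layouts.

```python
def pack_rgb10_a2(r, g, b, a):
    """Pack 10-bit R, G, B and 2-bit alpha into one 32-bit word.

    Assumed layout for illustration: A in bits 31-30, R in 29-20,
    G in 19-10, B in 9-0.
    """
    if not (0 <= r < 1024 and 0 <= g < 1024 and 0 <= b < 1024 and 0 <= a < 4):
        raise ValueError("channel out of range")
    return (a << 30) | (r << 20) | (g << 10) | b


def unpack_rgb10_a2(word):
    """Inverse of pack_rgb10_a2: returns (r, g, b, a)."""
    return ((word >> 20) & 0x3FF, (word >> 10) & 0x3FF, word & 0x3FF, word >> 30)


word = pack_rgb10_a2(1023, 512, 0, 3)
print(hex(word), unpack_rgb10_a2(word))
```

      Each colour channel gets 1024 levels instead of 256, at the cost of alpha dropping to 4 levels.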

      --
      Education is the silver bullet.
    26. Re:128 bit colour? by bleak+sky · · Score: 1

      Seems to me that actual shades of a color will be when the other two colors have the same value.

      For example, 0,0,255 is going to be "pure blue", 128,128,255 will be a fairly light blue, but if the red and green are different, e.g. 128,200,255, the color is not really a shade of blue (it's more of a blue-green).

      But shades of blue don't have to have 255 for the blue value. Any time the blue value is greater than (or equal to, if you want to count the gray shades) the other two values (which must be equal), then you have a shade of blue.

      The math for this allows a lot more than 256 shades of blue. Correct me if I'm wrong (I'm really tired), but it seems like the math would go something like this:

      C = 256 + 255 + 254 + ... + 2 + 1 + 0

      The series sums to 256*257/2 = 32896 shades of each primary color.
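
      A brute-force count under that definition (the two other channels equal, the primary at least as large), written as an illustrative check rather than anything rigorous about colour perception:

```python
def count_shades(include_grays=True):
    """Count (other, other, primary) triples where primary dominates.

    With include_grays=True the case other == primary is counted too.
    """
    count = 0
    for primary in range(256):
        for other in range(256):
            if other < primary or (include_grays and other == primary):
                count += 1
    return count


print(count_shades(include_grays=True))
print(count_shades(include_grays=False))
```

      Counting the grays gives 32,896 and leaving them out gives 32,640, so either way there are far more "shades" by this definition than there are steps in a single straight-line gradient.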

      This could be generalized to match each hue (rather than just primary colors), but I'm too tired to figure out how color "value" works at the moment. For visualization, though, open up Gimp or Photoshop and go to the color chooser. Find the one with a disc with all the hues and a vertical value bar (it's labeled GTK in the Gimp), and play with it. While my little math up there only really applies the primary colors, you can see with the color selector that it applies to any color.

      Sorry if this was a little longwinded or incoherent, and please correct me if I got anything blatantly wrong. :)

      John

    27. Re:128 bit colour? by bleak+sky · · Score: 1
      Think about it this way, the gradient from (0,0,0) to (255,255,255) passes through (1,1,1), (2,2,2), etc. Exactly 256 points.

      That's an all gray gradient - which is why grayscale is [almost?] never displayed in more than 8-bit color depth.

      I can't quite visualize what gradients would look like in such a cube. But a straight line from one corner to another would usually not make for a smooth gradient that shows all the shades of one color. And as I think about it further, it's probably impossible to show all the shades of a color in a single smooth gradient. You'd need two dimensions (a plane out of the cube) to do that, not a line.

      Ah, I suppose I'd better go to bed before I confuse myself any more...

      John

    28. Re:128 bit colour? by Latent+IT · · Score: 2

      Well, that's what I thought, and I think what I said. If you're doing a gradient, I don't know why you'd want the shortest distance across the RGB cube, unless you enjoyed banding. ;p

    29. Re:128 bit colour? by Alsee · · Score: 2

      I don't know what you are looking for in the 2-dimensional display. A single simple gradient between 2 colors is a line. The greyscale gradient was the simplest arbitrary example. I could equally have used a gradient between yellow (255,255,0) and blue (0,0,255) and it works to the same 256 steps.

      The earlier post had suggested that opposite corners would be 443 steps. I was explaining it's not. It's a distance of 443, but still 256 steps.

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
  9. It's not the cards by tmark · · Score: 5, Insightful

    all of these cards have horrific AGP download speeds that realize only 1/100th of their theoretical peak...you're out of luck if you want to capture those images with any kind of reasonable frame rate via the AGP bus."

    As the quoted article clearly indicates, the problem lies with the drivers and not with the cards, as the original poster intimates.

    And the underlying reason is immediately understandable: after years of AGP cards and years of no one really complaining about or raising this issue (except, now, developers of video-editing software who could benefit), it seems clear that there isn't much demand for this kind of performance. In the (near?) future there might be, but why should these companies spend money working on driver performance in areas like this when customers really only care about how well Quake will run?

    When people are willing to pay for these features is when companies will pay to build the requisite drivers. And that is how it should be.

    1. Re:It's not the cards by mattdm · · Score: 2

      When people are willing to pay for these features is when companies will pay to build the requisite drivers. And that is how it should be.

      Alternately, they could publish full specs for their cards and provide the drivers as open source, and the few people who need the different features now could write them or have them written. This code could be contributed back to the card manufacturers and integrated in future driver releases, resulting in the feature being available for everyone. For example, ATI apparently didn't see enough market demand to provide 3d-accelerated Linux drivers for the Radeon 8500, but The Weather Channel did, and now we'll all benefit.

      Obviously this is a bit idealistic, but hey, we're talking about how it should be here. As I started writing this, no one has made a good answer on the "what about under Linux" question, but honestly (and despite the way that that seems like a reflexive slashdot response), that's the real solution to this "problem".

    2. Re:It's not the cards by zenyu · · Score: 4, Interesting

      I had to switch an application from a screaming PC to a chunky old SGI we now use for a stool because of this problem. We eventually found an expensive graphics card that could keep up. I think it was called Wildcat something or other. We were getting free Quadro 3s at the time which we really wanted to use, but they just had horrible memory read rates. The nVidia guy told us it was an unoptimized path, using software with no hardware support or something. Like maybe they were reading a pixel at a time or something.

    3. Re:It's not the cards by Markmarkmark · · Score: 1

      "noone really complaining"

      There are over a million users of video editing software such as Pinnacle Studio, VideoWave, Ulead, Adobe Premiere etc. If they knew that with a (probably small) driver fix their gfx card could make their video effects creation faster than real-time, I think they would be thrilled (and they would start demanding it). Just about everyone I know that does video on PCs complains about render times. A lot of us have paid $1,000+ to get some hardware card that can accelerate rendering. Kind of silly when we have already paid for a card that can do it even faster than that. All that lovely horsepower is just sitting there at the wrong end of a one way data street due to a lack of driver optimization.

      Although there are more game players, I think in today's competitive gfx card market, a million potential customers is enough to get a manufacturer's attention.

  10. But why? by AAAWalrus · · Score: 3, Interesting

    The article presents that once the images are rendered out to the display, they are simply discarded. Sure, for any sort of video capture or whatnot, that sucks. However, the article does not attempt to answer why video card manufacturers do this, or if there are any cards that do take advantage of the AGPx4 bandwidth. My guess is cost. If all AGP video cards provided video feedback into the bus, you're probably looking at a non-consumer level product. And you know what? All I do IS use my GeForce to play video games. If dumping the frames after they are rendered keep the cost of my card down, I'm probably happier for it. Quite simply: Does this matter for the average consumer?

    1. Re:But why? by Anonymous Coward · · Score: 0

      The article presents that once the images are rendered out to the display, they are simply discarded

      They are not discarded. They are transmitted to your retina, and then into your brain. It is up to you to do then discard them or not.

    2. Re:But why? by Elwood+P+Dowd · · Score: 2

      Yeah, but the high end cards are messed too. See the post about the Quadro 3s.

      --

      There are no trails. There are no trees out here.
  11. Drivers, not hardware by overshoot · · Score: 1, Redundant
    I know I shouldn't have read the article, but
    The problem isn't the hardware, it's the software drivers. In fact, the speed could be dramatically increased with revised software drivers. However, no manufacturer has presently made this aspect of driver performance a priority.
    It looks like this is primarily a Microsoft-drivers problem. I wonder if anyone has looked at whether the XFree86 and DRM drivers could do better.
    --
    Lacking <sarcasm> tags, /. substitutes moderation as "Troll."
    1. Re:Drivers, not hardware by Salamander · · Score: 2

      Yeah, I know it's fun to bash Microsoft and hint that your OSOS (Open Source Operating System) of choice would do better, but the drivers in question here are not Microsoft drivers. They're vendor-supplied drivers which would probably use 90% common code and have 99% of the same problems on any OS.

      --
      Slashdot - News for Herds. Stuff that Splatters.
    2. Re:Drivers, not hardware by larien · · Score: 2
      Well, that might be the case for e.g. Nvidia, but AFAIK some other drivers are written independently of the Windows ones. Also, the article touches upon other issues like AGP & other chipset drivers which almost certainly aren't shared between Windows/linux. I think it's a valid question (which is why I posted it as well). It could be, of course, that the linux/FreeBSD drivers are noticeably worse than the bad performance that the Windows drivers provide; without having benchmarks, I can't tell.

      However, linux's open source nature at least gives people a chance to tweak the system to provide that advantage if it isn't there already; it may cause some interesting developments in linux graphics.

    3. Re:Drivers, not hardware by Anonymous Coward · · Score: 0

      I would say it's MS's drivers that really are bad.

      I've done glReadPixels performance tests at 1280x1024 on my GeForce2 Go and I've attained
      speeds of up to 15-20fps.

      That's 15x1280x1024x2 (16-bit color) ~ 40 MB/s.
      This is a lot higher than the DirectX demo suggests so the problem isn't that bad if you use OpenGL!
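
      The arithmetic behind that figure, as a small sketch (the helper name is made up; it assumes every frame is read back in full at 2 bytes per pixel):

```python
def readback_rate(fps, width, height, bytes_per_pixel):
    """Bytes per second moved when every frame is read back in full."""
    return fps * width * height * bytes_per_pixel


# The poster's 15-20 fps range at 1280x1024 with 16-bit colour.
for fps in (15, 20):
    rate = readback_rate(fps, 1280, 1024, 2)
    print(f"{fps} fps -> {rate / 1e6:.1f} MB/s")
```

      At 15 fps this works out to roughly 39 MB/s, and at 20 fps to roughly 52 MB/s.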

  12. Definitely noticed this before.... by Anonymous Coward · · Score: 0

    In Quake 3. I've been working on a music video using some Q3A demos that me and some friends have made. I've been using the cl_avidemo 30 setting before playing the demos to have it output 30 frames per second to the hard drive. Quake 3 will only dump the first 10,000 frames (or about 5 minutes of footage), but it takes about 30 minutes to dump all of that at 800x600x32 to tga files. How much of that is compression delay, and how much is trying to download off of the AGP bus, I'm not sure (anyone care to help me on this?) but the difference in speed when capturing frames vs not capturing frames is quite noticeable.
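
    As a rough sanity check on those numbers, here is a sketch that turns them into an effective capture rate. It assumes the figures as posted (10,000 frames, 30 minutes, 800x600 at 4 bytes per pixel, uncompressed); the files actually written may well be smaller, and the function name is only illustrative.

```python
def capture_stats(frames, playback_fps, capture_seconds, width, height, bytes_per_pixel):
    """Return (footage seconds, effective capture fps, bytes per second)."""
    footage_seconds = frames / playback_fps
    capture_fps = frames / capture_seconds
    data_rate = frames * width * height * bytes_per_pixel / capture_seconds
    return footage_seconds, capture_fps, data_rate


footage, fps, rate = capture_stats(10_000, 30, 30 * 60, 800, 600, 4)
print(f"{footage / 60:.1f} min of footage, {fps:.1f} frames/s captured, {rate / 1e6:.1f} MB/s")
```

    Under those assumptions the whole pipeline (readback, encoding and disk writes together) averages under 6 frames per second and on the order of 10 MB/s, which does not by itself say which stage is the bottleneck.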

  13. This way video cards with 128MB make sense by hbackert · · Score: 1

    ...but that's not how it should be.

    The AGP bus concept was created to move textures and I guess hardware driver programmers optimized for this.
    A quote from the readme file: In the second mode it renders, displays and downloads the same image to the PC.
    This is probably not what driver programmers were expecting. Wrong direction of data.

  14. Huh... by Viking+Coder · · Score: 4, Interesting

    If I'm reading this article right, they're claiming that it also hinders normal screen captures.

    That would mean that software like VNC would have much higher performance, if the drivers were updated, the way these guys are demanding. (Wouldn't it?)

    That'd be fantastic!

    --
    Education is the silver bullet.
    1. Re:Huh... by user32.ExitWindowsEx · · Score: 1

      I think it also means that stuff like VNC would be capable of relaying hardware-accelerated 3D and framebuffer-overlayed video streams like DVDs. Imagine being able to play Quake 3 in real-time over your network, or watch a DVD on one PC while the disc is in another.

      --
      "Evil will always triumph because good is dumb." -- Dark Helmet
    2. Re:Huh... by outZider · · Score: 1

      Without an overhaul, VNC wouldn't get any faster. You can't help bad code. :)

      --
      - oZ
      // i am here.
    3. Re:Huh... by gotak · · Score: 1

      Sure VNC's slow with 3d. I know that from when we installed VNC on someone's machine to screw with him when he was playing counter strike.

      U can't see much of what's going on in the game. But u see enough to "help" drop their weapon when they are about to shoot at someone.

    4. Re:Huh... by Loligo · · Score: 1

      >watch a DVD on one PC while the disc is in
      >another

      Why complicate things like this? Just because you can doesn't mean you should.

      Mount the DVD drive via your favorite filesystem sharing protocol and stream the data. Decode on the local system.

      Easier to do and easier on the network.

      -l

    5. Re:Huh... by Perdo · · Score: 2

      Um.. No.

      The slowest card reads back at 8.376 MB/s OR 67.008 Mb/s OR about 2/3 the bandwidth available on a 10/100 network.

      Network performance is the primary limitation to streaming frames.

      The best cards would stream at 13.283 MB/s OR 106.264 Mb/s exceeding the speed of 10/100 and only able to push 8 streams on perfect Gigabit ethernet. Unfortunately, Gigabit ethernet is not nearly as fast as advertised, ranging from as low as 280 Mb/s for generics, to as high as 860 Mb/s for 3Com's best.
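
      The unit conversions above can be reproduced with a short sketch. It assumes 8 bits per byte and ignores all protocol overhead, so the stream counts are upper bounds; the function names are made up.

```python
def mbytes_to_mbits(mb_per_s):
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * 8


def whole_streams(link_mbits, stream_mbits):
    """How many complete streams fit on a link, ignoring overhead."""
    return int(link_mbits // stream_mbits)


slow = mbytes_to_mbits(8.376)   # slowest card's readback rate
fast = mbytes_to_mbits(13.283)  # fastest card's readback rate
print(f"{slow:.3f} Mb/s, {fast:.3f} Mb/s")

# Raw uncompressed streams from the fastest card on various links.
for link in (100, 280, 860, 1000):
    print(link, "Mb/s link:", whole_streams(link, fast), "stream(s)")
```

      With plain division, a nominal 1000 Mb/s link holds 9 such streams and an 860 Mb/s link holds 8; any real overhead only lowers those counts.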

      --

      If voting were effective, it would be illegal by now.

    6. Re:Huh... by Viking+Coder · · Score: 2

      Why not? You have to do a full-screen capture. Then, you can do a diff against the last full-screen capture, and send the delta. The delta is going to be tiny the vast majority of the time.

      But, if the damned card is reading back at 8 frames per second, you've got 0.125 seconds latency. Period. No escaping that.
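
      A minimal sketch of the capture-diff-send idea, with frames modelled as flat lists of pixel values. This is not how VNC actually encodes updates; the helper names are hypothetical and only illustrate why the delta is small while the capture interval is fixed by the readback rate.

```python
def frame_delta(previous, current):
    """List of (index, new_value) for every pixel that changed."""
    if len(previous) != len(current):
        raise ValueError("frames must have the same size")
    return [(i, new) for i, (old, new) in enumerate(zip(previous, current)) if old != new]


def apply_delta(frame, delta):
    """Rebuild the current frame on the receiving side."""
    updated = list(frame)
    for index, value in delta:
        updated[index] = value
    return updated


def capture_interval(readback_fps):
    """Seconds between full-screen captures at a given readback rate."""
    return 1.0 / readback_fps


previous = [0] * 16
current = list(previous)
current[5] = 7                      # a single pixel changed
delta = frame_delta(previous, current)
print(delta, apply_delta(previous, delta) == current, capture_interval(8))
```

      The delta here is one entry out of sixteen pixels, but producing it still required reading back the whole frame, which is where the 0.125 seconds comes from.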

      *shrug*

      --
      Education is the silver bullet.
  15. Re:VNC faster, not really. by Anonymous Coward · · Score: 0

    the bottleneck is with network bandwidth, not AGP bus. unless you are running over firewire or gig eth

  16. Might this be intentional? by seldolivaw · · Score: 4, Insightful

    I know nothing about anything, obviously, but I can see that game designers might think it nice to be able to send stuff to your screen but for you to be unable to send it to storage somewhere.

    This *is meant to be* a dumb question. Mod me down if I'm wrong; it's only Karma.

  17. Professional GFX processing by i_am_nitrogen · · Score: 3, Informative

    Way back when I was working on libfbx, we (the two main libfbx developers) learned of a 48-bit framebuffer developed by SGI. It's used mainly to render special FX for Hollywood. After several composited layers with various effects on an 8-bit per channel system, you can really start to notice the quantization artifacts. Moving to 12- or 16-bits per color channel (depending on whether there's an alpha channel) makes a huge improvement. I've never heard of any 16 byte per pixel (128bit) image format. It'd probably be something like 16-bits per channel RGBA (64), plus 32-bit depth buffer (96), plus 16-bit stencil and select(pick) buffers (128).

  18. Re:VNC faster, not really. by Viking+Coder · · Score: 2

    If I'm reading the article correctly, they're claiming that you can barely get 30 frames per second, full-screen. If you want to do a diff, and send the delta, you potentially need to be able to capture the full screen to do it. If you can only capture at 30 frames per second, you are LOCKED at 30 frames per second, even if you try to compress the output, and send only deltas.

    --
    Education is the silver bullet.
  19. Is it me, or is the author smoking crack? by JackAsh · · Score: 4, Insightful

    A couple of salient points come to mind when reading this article:

    1) Recording games/presentations/etc. The reason why we don't do it is because if the system was capable of generating it real time in the first place, it's far less space intensive to record the parameters of the animation than the output. i.e. It's cheaper to say "Daemia fires rocket at these coordinates" than record an MPEG of said rocket shot. AND, as hardware gets better, your recording does too.

    Which leads me to point 2:

    2) Since it's cheaper to capture realtime animation by capturing parameters, the only use of the capture function would be NON-realtime applications - i.e. getting your Geforce5TiUltraPro to render an extremely complex scene with incredible realism at 1 fps. That's not a typo. If we have 10MB/s back-into-the-PC bandwidth and each super high resolution shot takes 10MB on average, we have a wonderful solution working at 1 fps. Spend the fill rates on 600 passes for each pixel or something like that. Imagine the quality of the scenes! Capture the damn things and be glad you're not rendering at 1 frame per hour like they were 5 years ago.

    Repeat after me - if you're rendering for posterity you don't need real time... That'll come eventually.
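
    Point 1 can be illustrated with a toy sketch: a deterministic "game" whose recording is just its list of input events, replayed to regenerate the output. Everything here (`simulate`, `render`, the event format) is invented for illustration and is not how any particular game stores its demos.

```python
def simulate(events, ticks):
    """Replay a deterministic toy game: each event is (tick, dx) moving a marker."""
    by_tick = dict(events)
    x, positions = 0, []
    for t in range(ticks):
        x += by_tick.get(t, 0)
        positions.append(x)
    return positions


def render(positions, width=16):
    """'Render' each tick as one row of text; stands in for full frames."""
    return ["".join("#" if col == x % width else "." for col in range(width))
            for x in positions]


events = [(0, 3), (5, 2), (9, -1)]   # the whole "recording"
frames = render(simulate(events, ticks=12))

recording_size = len(repr(events))
video_size = sum(len(row) for row in frames)
print(recording_size, "bytes of parameters vs", video_size, "bytes of rendered output")
```

    Replaying the same events always reproduces the same frames, which is why only the events need to be stored.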

    -JackAsh

    1. Re:Is it me, or is the author smoking crack? by dabuk · · Score: 1
      The point is that you can send the video without sending the application that generated it. That way you could put up videos of your games and anyone could view them without having to buy the game themselves.

      Even so 10MB/s is really crap and if the drivers are that slow it might even be slowing down everything else thus interfering with the rendering.

    2. Re:Is it me, or is the author smoking crack? by GigsVT · · Score: 1

      The point is that you can send the video without sending the application that generated it.

      You could still do that, I believe the original poster was saying you could capture the instructions to the card rather than the output from the card. It's still a capture, but it's a little more hardware dependent. Of course hardware implementations of DirectX and OpenGL help a lot there.

      Are there any "metamovie" file formats out there that can do this?

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    3. Re:Is it me, or is the author smoking crack? by siskbc · · Score: 1

      Why, so all the people who don't play Quake can watch video of your latest LAN party?
      Intrinsically, this would be a cool thing, but there are so many workarounds already that the need is marginal... which is why nVidia hasn't made the thing already.

      --

      -Looking for a job as a materials chemist or multivariat

    4. Re:Is it me, or is the author smoking crack? by kesuki · · Score: 2

      Well, my largest Warcraft III replay is only 423 k yet it encompasses over two hours of 'footage', with an adjustable camera angle, and a very large potential resolution. I can move the camera across a canvas that is easily a hundred fold larger than a TV resolution image, and so to capture the entire canvas at a single camera angle (disabling the ability to tilt the camera angle that you can do in WC3) you would end up with an image in the gigabytes-per-frame-per-angle, instead of a 423k file. Even if you had to send the whole WC3 CD and keygen it would save massive bandwidth over sending a single frame of an entire map. Besides, if you wanted to record to a different media, just enable TV output, and do the replay to a VHS tape, and be sure to click where the action is happening.
      For those not prone to doing math, that means I can save ~7 hours of warcraft 3 replays to a single floppy disk. Or 3150 hours of replays to a single CD-rom. Roughly 22,030 hours of replays to a DVD-rom. Nearly 704,000 hours on a 160 GB HDD. Now let's see, 2 hours to a DVD or 22,000 hours? I think 11,000:1 compression over DVD is well worth requiring special software to play it back. Especially since w3g files are mathematically lossless, the same cannot be said of DVDs. Maybe Blizzard should release a w3g player, they could compete with flash..
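
      The arithmetic above can be roughly reproduced with a short sketch. The media capacities below are assumed nominal values (decimal units), not figures from the post, so the results land in the same ballpark as the quoted numbers rather than matching them exactly.

```python
REPLAY_KB = 423        # size of the replay quoted above
REPLAY_HOURS = 2       # "over two hours", rounded down

# Assumed nominal capacities in KB (1 MB = 1000 KB here); not from the post.
MEDIA_KB = {
    "floppy": 1_440,
    "CD-ROM": 650_000,
    "DVD-ROM": 4_700_000,
    "160 GB HDD": 160_000_000,
}


def replay_hours(capacity_kb):
    """Hours of replay that fit, assuming size scales linearly with time."""
    return capacity_kb / REPLAY_KB * REPLAY_HOURS


for name, kb in MEDIA_KB.items():
    print(f"{name}: about {replay_hours(kb):,.0f} hours")
```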

  20. DMCA by Vandilzer · · Score: 2, Troll

    You think the **AA would ever allow this, the ability to make a perfect digital copy of whatever is displayed on your screen? Now your monitor will have to be disabled every time a copyrighted work is displayed on your screen.

  21. Well, duh by Sneakums · · Score: 2
    From the article:
    Right now, even the very latest graphics cards aren't ready to do much more than play games and put pretty pictures onscreen. If graphics companies really want to replace CPUs for professional rendering, they've got a bit more work to do.

    A stunning example of stating the obvious.

    The hardcore 3D gamer market is small enough; I can't see manufacturers busting their humps to serve an even smaller one.

    1. Re:Well, duh by nelsonal · · Score: 2

      Actually they do, but they charge much higher prices. 3DLabs makes the best-known x86 3D rendering cards; NVIDIA and ATI have some offerings as well under the Quadro and Fire brands; and SGI, SUN, HP, and IBM all sell their own proprietary cards for several thousand as well (for their respective platforms). I think the author of the article wants to purchase video game cards for the few hundred bucks they cost, do a driver update, and have something competitive with the much more expensive professional cards.

      --
      Degaussing scares the bad magnetism out of the monitor and fills it with good karma.
    2. Re:Well, duh by catch23 · · Score: 1

      True.

      But graphics professionals are much more willing to spend money than hardcore gamers. I don't think they are making great margins by marketing to gamers but they could potentially make much much greater margins by marketing to the professional graphics market that actually has money to throw away!!

  22. One of the worst technical articles.... by grahamtriggs · · Score: 5, Interesting

    ...that I have ever read. Either that, or I am missing something here... The idea that graphics subsystems have 'bandwidth to burn' is kind of ironic, given that every graphics chip is ultimately held back in performance by the amount of bandwidth available to it - especially when using high quality options like anti-aliasing.

    The main focus of the article is actually a very niche segment... the idea of transferring back rendered images over the AGP bus for TV / film / etc. is a joke... Rendering at high quality takes a huge amount of bandwidth (ie. textures and geometry)... as someone else pointed out, transferring back high-res images would take up over 200MB - that's a quarter of your AGP bandwidth! And without taking into account contention and timing issues in uploading/downloading that would mean that you simply couldn't realise the full potential of the bandwidth without a lot of other (expensive?) hardware... The simple fact is that for production uses, you would be *far* better off taking a stream of data from the DVI connector, and storing that for later use...

    Screen capture for business use is a reasonable point - however when does that require 3d rendering to be taking place? There should be no contention and no reason why the AGP bus couldn't be utilised fully - although would the graphics companies make enough out of this to justify the effort?

    As for internet streaming - how many people have access to bandwidth fast enough for high quality, full screen video streaming? Enough said...

    1. Re:One of the worst technical articles.... by Viking+Coder · · Score: 3, Interesting

      "the idea of transferring back rendered images over the AGP bus for TV / film / etc. is a joke..."

      Why? You don't seem to follow up this opinion with any facts to back yourself up. Being able to do things like Interactive Multi-Pass Programmable Shading means that you can achieve near-PRman levels of graphics quality, using standard graphics hardware. But, of course, you need to capture that back to main memory for it to be any use. That hardly seems worthy of your ridicule.

      "as someone else pointed out, transferring back high-res images would take up over 200MB - that's a quarter of your AGP bandwidth!"

      Who are you to decide what's a good use case, and what's a bad one? This sounds to me like a case where several different people have presented reasonable requests for features - and you're shooting them down because you think what they want to do is "a joke". Since this can be fixed with a software update, I think it's a pretty reasonable request.

      "you simply couldn't realise the full potential of the bandwidth without a lot of other (expensive?) hardware..."

      Why on earth do you make that claim? Could you back that up with some facts? The article is claiming that it's a software issue, only. In fact, the test they put together sounds like a very reasonable one - they're not coming anywhere NEAR using the bandwidth in creating the images, and still, they're getting horrible bandwidth, downloading them. That doesn't sound like contention and timing - that simply sounds like bad, bad drivers.

      "you would be *far* better off taking a stream of data from the DVI connector"

      So, now, to solve the bandwidth issue, you're going to add a second card to the motherboard. What magical, ethereal bus bandwidth will this second card use? I think you need to re-examine your argument on this point.

      "However when does that require 3d rendering to be taking place?"

      This isn't just talking about 3d rendering. This is all screen capturing.

      "There should be no contention and no reason why the AGP bus couldn't be utilised fully"

      Wait a minute - now you're switching your argument?

      "would the graphics companies make enough out of this to justify the effort?"

      As everyone keeps saying, this sounds like it can be fixed in software. That's a pretty negligible cost for the vendors to spend.

      "As for internet streaming - how many people have access to bandwidth fast enough for high quality, full screen video streaming?"

      What about intranet? Lots of companies have intranet bandwidth fast enough for what you're talking about.

      Enough said...

      --
      Education is the silver bullet.
    2. Re:One of the worst technical articles.... by Anonymous Coward · · Score: 0

      The DVI output could maybe be connected to another device/computer that has a DVI-in. Don't know if this is currently possible, though.

    3. Re:One of the worst technical articles.... by XaXXon · · Score: 2

      Just to illustrate the point you made about it taking 200 MB to send the images back --

      1600x1200x32bit = 7,680,000 bytes / image

      24fps means 184,320,000 bytes / second back down the AGP bus -- and that's if you only want 24 fps. That's a lot of bytes moving around, especially when you have to be sending data back up to render future frames.
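
      The same calculation as a small sketch, so other resolutions and frame rates can be plugged in. The function names are invented for illustration.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return width * height * bits_per_pixel // 8


def readback_bytes_per_second(width, height, bits_per_pixel, fps):
    """Bytes per second needed to read back every frame uncompressed."""
    return frame_bytes(width, height, bits_per_pixel) * fps


print(frame_bytes(1600, 1200, 32))                     # 7680000
print(readback_bytes_per_second(1600, 1200, 32, 24))   # 184320000
```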

      Maybe you could do some sort of hardware compression, but as other people have mentioned, video cards are already large enough, make too much heat, use too much power, and are expensive enough that I don't want to be adding additional complexity and cost to them for what a few people want to do. If there are people who want this, they should pay for the R&D and production costs of these specialized chips.

    4. Re:One of the worst technical articles.... by grahamtriggs · · Score: 1

      I'll accept that the multi-pass thing you mention is interesting... however, to do it in 'real time', you would have to render at a multiple of your target frame rate - not only would that require an incredibly powerful chip (for anything that looks 'broadcast quality'), there would be a multiple of the amount of data being sent back across the AGP bus - which makes it even more impractical...

      If you aren't doing it in realtime (which is entirely feasible - you would be doing this to get good price / performance... even at say a quarter of real time, it could still cost you a hell of a lot more to get better performance), then it is less relevant how quickly you transfer back across the AGP bus (again, as long as price/performance goals are reached)

      I am not saying that 200MB - or a quarter of your bandwidth - is good or bad use... I'm just pointing out the practicalities... for a complex scene - which is likely for broadcast scenarios - you are likely to be sending a *lot* of data over the AGP bus... so much so that you simply wouldn't have the bandwidth left for realtime transfers of images back across the AGP bus... if you aren't doing this in real time, then there is less of an issue about how good a use it makes of bandwidth!!!

      In response to your comment about the device attached to a DVI connector - quite simply, it wouldn't be rendering, and could be an entirely separate machine - which means in theory it could have as much bandwidth as it wants (at least that of an AGP connection!!!)... My point is that it can support taking images from the rendering card, in 'perfect' quality, in real time, without eating up valuable AGP bandwidth that may well be required just to perform the rendering in the first place... if real time is not an issue, then almost certainly the speed of transferring back from the card isn't!!!!!

      I am *not* switching my argument in regards to capturing general business scenarios - we were originally talking about 3D rendering for production purposes - and presumably in realtime - capturing of the display within a 'general' business scenario is an entirely different application, and therefore needs to be evaluated separately. Does Word (or any other office product) require 100's of MBs of AGP bandwidth - of course not, which means that it could in theory be made available for real time screen capture... this is not the case for real time 3D rendering...

      You seem to be ignorant of the cost of software development... even assuming that it is just a driver change, it is not negligible cost... implementation, ensuring that it works in the 'right' cases (ie. ensuring that AGP bandwidth is available for 3D rendering, etc. when such an application is being used), testing - including rigorous memory checking, etc. as it is a system component that could cause many problems to system stability and usability... and then on top of that, regression testing to ensure that it still works properly in every future driver release... still think that that is negligible cost for the vendors? Think it is negligible in comparison to the extra revenue that such a feature would generate?

      And finally, what about intranet? Sure, companies do have enough intranet bandwidth - although I'm slightly less convinced about the number that would be prepared to devote such bandwidth to that use... but then the article didn't mention intranets, did it?

    5. Re:One of the worst technical articles.... by Viking+Coder · · Score: 2

      Your arguments all seem to boil down to your last point - that it's not worth it, based on the extra revenue that the development would generate. That's really not for you or me to decide. All that we can do is to try to prop up a good use case, and hope that some vendors will listen.

      I'm defending the use case, and you're attacking it. Why do you care? If your argument is that it's not "reasonable" to expect them to support it, based on the additional money they would make, that's fine - I don't necessarily disagree with your opinion about that.

      What I'm saying is that there's both a need, and a simple software solution. The vendors would do good to encourage this kind of feedback - it makes their products better.

      I'm saying that it makes sense, at any given moment, to take advantage of the bandwidth that's there. If I render a scene, I expect that to be fast. If I then pause until I can capture the image back to main memory, I expect that to be fast, as well. 8 frames per second is agonizingly slow. In the case of near real-time, waiting 0.125 seconds for a screen capture is very frustrating. Especially when you can render the frame in something like 0.0125 seconds. It's not as though the AGP bus is doing both tasks at the exact same instant, as you seem to keep implying.
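
      A quick sketch of the numbers in that paragraph, assuming render and readback happen strictly one after the other as described. `sequential_fps` is a hypothetical helper name.

```python
def sequential_fps(render_seconds, readback_seconds):
    """Frames per second when each frame is rendered, then read back, in turn."""
    return 1.0 / (render_seconds + readback_seconds)


render_s = 0.0125    # render time quoted above
readback_s = 0.125   # one readback at 8 frames per second

print(f"{sequential_fps(render_s, readback_s):.2f} fps with slow readback")
print(f"readback share of frame time: {readback_s / (render_s + readback_s):.0%}")
```

      With these figures the readback, not the rendering, accounts for most of each frame's time.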

      Intranet: So, because the article didn't mention something, I can't mention it, and it's not worthy of your contemplation? What? =)

      In this specific use case, every vendor has crappy drivers. If you've got a better list of what their driver developers should be working on, by all means, post it. Until then, let them work on the reported issues and requested features - this sounds like a good one, to me.

      --
      Education is the silver bullet.
    6. Re:One of the worst technical articles.... by jcl5m · · Score: 1

      I won't spend much time on this since your post is obviously not worth the time. The moderators that brought you up to a five apparently understood less about the topic than you do.

      This is indeed a niche interest that when driven by market forces will become a bigger issue. But this is likely to happen with the coming generation of video cards (R300 & NV30) when high level shading algorithms can be implemented to run on GPUs (see post on raytracing). Being able to read back the framebuffer quickly will be critical to making this feature usable, too. See this article. These media producers are also likely to be the same people that buy large disk arrays that can push 200MB/s which is comparable to HDTV production. It depends on who you are and what you need.

      How you got into talking about internet streaming is beyond me.

    7. Re:One of the worst technical articles.... by grahamtriggs · · Score: 1

      That it is not worth it is not something that I am 'deciding' - rather I am pointing out the practicalities of the situation... (ie. don't *expect* it to happen, as it isn't likely to be worth it for the vendors). Yes, it does make sense to be able to take advantage of available bandwidth - but how much is really available? You see, the benchmark that they use specifically focuses on DirectX applications (ie. 3D rendering, although possibly 2D stuff as well)... for the typical uses of DirectX, you can understand why transfer of images back across the bus is not given priority... even if the AGP bandwidth is not being fully utilised in rendering (which it might well be for complex scenes), you can understand why they might only allow reading data off the card a limited amount of bandwidth... and bear in mind, that DirectX applications are typically double (or even triple) buffered - so you can be rendering the next scene whilst downloading the old one... (which is likely to slow things down anyway, due to contention!!!). Now, despite its direct mention in the article - how does this relate to business uses? How many businesses use DirectX built interfaces, let alone have a requirement to capture them in real time?

  23. Somewhat misleading IMHO by minkwe · · Score: 1

    Since when did textures become video frames and vice-versa? I can't think of a scenario in which anybody would want to download *textures* from the GPU. If it would have been video frames, that would be another matter. If the benchmark actually measures *texture download* from the GPU then something is really wrong with this picture.

    --
    "Fighting terrorists with military might is like killing a mosquito on your Dad's forehead with a rifle."
    1. Re:Somewhat misleading IMHO by Anonymous Coward · · Score: 0

      This could be useful for creating impostors in a complicated scene. Now, instead of rendering a 50 billion triangle object, you can render it once and grab a texture from video memory. As long as the view doesn't change too much, you can render this one texture instead now.

      -- John

    2. Re:Somewhat misleading IMHO by Anonymous Coward · · Score: 0

      So, the difference between a pixel and a pixel is...?

  24. Do the sums by NigelJohnstone · · Score: 2, Insightful

    When you record video it is normally compressed by hardware or a DSP. They are compressed for a damn good reason.

    Uncompressed, say just 1600x1200x24bit is about 6Mb per frame. At say 70 frames/sec is about 420Mb a second to store to disk.

    So what exactly are you going to do with that much data? If you had 512Mb of ram you could hold 1 second's worth.
    Forget a hard disk, even a 3 disk raid doesn't have that sustained IO rate.

    1. Re:Do the sums by Anonymous Coward · · Score: 0

      Mb = Megabit
      MB = Megabyte

      Sheesh!

    2. Re:Do the sums by Hadlock · · Score: 2

      i agree with you completely. BUT i think about things; using firewire/ieee1394, you can do essentially raid/striping of sorts. current firewire has a theoretical peak of 400 mb/s; next gen firewire should see 800 mb/s...

      oh wait. that's megabits. we're talking megaBYTES. fuxor. sounds like we've got a decade or so before we have consumer-level storage options at this level. crazy.
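
      The megabit/megabyte mix-up is easy to check with a tiny sketch. The 420 MB/s target is the parent post's estimate read as megabytes; the labels "FireWire 400" and "FireWire 800" simply name the 400 and 800 megabit rates mentioned above.

```python
def megabits_to_megabytes(mbit_per_s):
    """Convert a rate in megabits per second to megabytes per second."""
    return mbit_per_s / 8


required_mbyte_per_s = 420   # the parent post's uncompressed video estimate

for name, mbit in [("FireWire 400", 400), ("FireWire 800", 800)]:
    mbyte = megabits_to_megabytes(mbit)
    print(f"{name}: {mbyte:.0f} MB/s, {mbyte / required_mbyte_per_s:.0%} of what is needed")
```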

      btw, if i had mod points currently, i'd mod you up.

      --
      moox. for a new generation.
    3. Re:Do the sums by Markmarkmark · · Score: 1

      That's true but I don't care about 1600 x 1200, I want to do 720 x 480 at 30 fps (or interlaced 60 fps at 720 x 240). That's what myself and about one million other users of video editing software want to do. And my processor will compress it in real-time in DV format at ~3 MB/s. No problem for my pokey IDE drive. Big time savings for all us video creators.

      --- Mark

  25. Actually it's unlikely because... by BoomerSooner · · Score: 1

    Have you ever seen the price of high end 3D Cards?
    They start in the $1500 range and go WAY up.

    I had a guy at the local base show me a card (forgot the name) that they spent over $3000 on (for doing 3ds max rendering) and they couldn't figure out why a geforce 2 mx was better in quake 3. The performance on the 3d design was amazing compared to a retail card.

    They don't want to kill their high end market so it's unlikely that you'll see drivers that take full advantage. $$$ is king.

    1. Re:Actually it's unlikely because... by subgeek · · Score: 2

      in the article it shows how they benchmarked consumer cards like geforce4 and radeon 8500. they also benchmarked some "entry level" high end cards like the quadro4 750 gxl, parhelia, and radeon 9700. i am not sure about the radeons, but i do know that the quadro4 is a different chip than the gf4, not just a card with extra features turned on.

      all of the cards had the same problem.

      --
      you probably shouldn't have read this.
  26. Slashdot Random Story Generator by Anonymous Coward · · Score: 0

    Obligatory comment on how every day, /. looks more and more like the random story generator.

  27. Don't SGI's do this already ? by Anonymous Coward · · Score: 0

    Have I read this right ? They want to be able to take the output of the card (say playing a game of Quake) and save that output into RAM then maybe onto disk as a video ?

    If so then why not just use a system that has been designed to do such a thing ? My SGI O2 has been doing it since 1996.

    Mark

  28. Re:VNC faster, not really. by gmack · · Score: 2

    Not 30 frames a second. 8 frames a second assuming you don't use a larger resolution.

  29. non-issue by BenjyD · · Score: 1

    Users could actually record game output in real-time...a compressed movie of your game play saved on your HD.

    Or you could always record a game demo (available from Quake 1 onwards, I believe). Much less data to handle. If you then really do want to convert the demo to a video of some sort, do it after the game, when you also have the time to mpeg-or-whatever encode it.

    Despite the popularity of Internet streaming, it is not currently possible to stream live output from graphics cards over the Internet. The connections, processors and codecs are all fast enough today. Sadly, all of this horsepower is being held back by one remaining weak link: the texture download speed of today's graphics card drivers.

    Excuse me? The bandwidth off the graphics cards they test is in the 10megabyte/s range! Not many users have that sort of bandwidth on their internet connection.
  30. Re:hello, i am a typical slashdot user by macksav · · Score: 0, Troll

    no, you're a typical /. reader because you are a moronic twit and a flaming homosexual with fantasies of public masturbation followed by public humiliation. idiot.

  31. How about this by cr@ckwhore · · Score: 2

    How about the obvious for video production... since going out isn't a problem... why not just hook up a recording device (could be digital media) to the video out port of the video card.

    Does this really have to be over-engineered?

    --
    Skiers and Riders -- http://www.snowjournal.com
  32. Why? Where? How? by imperator_mundi · · Score: 1

    from the article
    After playing there could be a compressed movie of your game play saved on your HD. On a reasonably fast machine you could actually record your game play digitally to your DV camcorder as you play or even compress and burn it to a Video CD or DVD at the same time you are actually playing.

    Maybe I'm missing something, but why should it be so vital to write the output of your video card to HD?

    GPUs are widely available goods, so it would be enough to save the meta information needed to recreate the images, send it, and let my friends' own video cards recreate the images of me fragging around the enemies.

    Besides that, to save 50 fps at 1600x1200x32 (about 7 MB per frame uncompressed), 350 MB/s are needed; even with a compression ratio of 1/350 (possible), that would mean no more than 10 minutes on a CD.
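
    A small sketch of that estimate. The 650 MB CD capacity is an assumption (the post does not state one), and `cd_minutes` is a made-up helper name.

```python
def cd_minutes(width, height, bytes_per_pixel, fps, compression_ratio,
               cd_bytes=650_000_000):
    """Minutes of compressed video that fit on a CD (650 MB capacity is an assumption)."""
    raw_rate = width * height * bytes_per_pixel * fps   # bytes per second, uncompressed
    compressed_rate = raw_rate / compression_ratio
    return cd_bytes / compressed_rate / 60


print(f"{cd_minutes(1600, 1200, 4, 50, 350):.1f} minutes")
```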

    1. Re:Why? Where? How? by Forkenhoppen · · Score: 3, Interesting

      I can think of several reasons:

      - The company hasn't released the game yet, but wants to release a video of gameplay to the public. Current methods would require implementing a "save game as it goes" and then a "replay, in offline rendering mode at a steady frame rate, and record results" pass. Or, you could save it at reduced quality if you had video out on your computer and video in on another computer.. but that's just ridiculous, imo.

      - Likewise, you have the game, and your friend hasn't purchased it yet, and lives too far away to just take a glance at it..

      - You're having a graphical glitch in a game with your particular card that can't be easily illustrated with screenshots. Think how much easier it would be to just send a video clip than having to send a half-dozen screenshots and a wordy explanation, where they still might not believe you.

      - You have a Radeon9700, he has a Geforce2. You want to show him how different Doom III looks on your card, as opposed to his card, in real time.

      Etc..

    2. Re:Why? Where? How? by Anonymous Coward · · Score: 0

      All these sending of video ideas all assume that both sides have fast connections. This may come as a shock to the techno elite but most people are still using 56k modems (and this is usually not by choice!).

    3. Re:Why? Where? How? by Anonymous Coward · · Score: 0

      DVD-R?

      stupid lameness filter.

    4. Re:Why? Where? How? by Markmarkmark · · Score: 1

      Yes, but Fedex men are all about the same size :-)

      I could send you a DVD, S-VCD or DV tape of my output (and not just games). Except I can't record my output now because I can't get it back into my PC from the GFX card at a reasonably useful speed.

      --- Mark

  33. More complex than it sounds by videodriverguy · · Score: 1

    Many video cards hold frame buffer/textures in a private format optimized for the video processor. This means that if you want to read them, there may be an uncompress or untiling operation being done without your knowledge. This is especially true for textures. These operations may be computationally intense - and any memory that can be written in AGP space is uncached, and so slow to access by the processor. In general, cards are optimized for drawing and display, not for read back of the buffers. Another problem is that access to the host memory for writing (via the AGP bus) is not immediate. The card is competing with the processor.
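
    To give a feel for what an untiling step involves, here is a minimal sketch that converts a tile-ordered buffer to linear rows. The tile layout is an assumption chosen for simplicity; real hardware formats are private and differ from card to card.

```python
def untile(tiled, width, height, tile=4):
    """Convert a tile-ordered buffer (one value per pixel) to row-major order.

    Assumes square tiles stored one after another, left to right then top to
    bottom, with row-major pixels inside each tile. Real hardware layouts differ.
    """
    assert width % tile == 0 and height % tile == 0
    linear = [None] * (width * height)
    tiles_per_row = width // tile
    for i, value in enumerate(tiled):
        tile_index, offset = divmod(i, tile * tile)
        tile_y, tile_x = divmod(tile_index, tiles_per_row)
        in_y, in_x = divmod(offset, tile)
        x = tile_x * tile + in_x
        y = tile_y * tile + in_y
        linear[y * width + x] = value
    return linear


# 4x2 image with 2x2 tiles: tile 0 holds pixels 0,1,4,5 and tile 1 holds 2,3,6,7.
tiled = [0, 1, 4, 5, 2, 3, 6, 7]
print(untile(tiled, width=4, height=2, tile=2))
```

    Every pixel has to be touched and moved, which is part of why a "simple" readback of a linear buffer is not free.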

    1. Re:More complex than it sounds by Anonymous Coward · · Score: 0

      Finally, someone with a clue.

      Listen up folks, linear, row-at-a-time ARGB pixels are not what modern hardware renders to. If you ask the 'driver' for a linear ARGB buffer, then expect a long wait to get it. Simple as that.
      Open source, linux, or xfree86 or new drivers aren't going to magically solve this problem.

      The DAC, or the chip itself (such as render-to-texture) is the customer for the graphics chip output, not the CPU.

      I don't see what the problem is, even for high-end content creation, 10-15fps is heckuva lot faster than 1 frame per hour.

  34. Half-Life Screenshots by Anonymous Coward · · Score: 0

    I noticed this problem when I was trying to take screenshots in FireArms [ a half-life mod ] I slaughtered some dude thru the throat and had the reflexes to hit F5, but I guess not cuz not only did the game lag for about 3-5 seconds, but I failed to capture the image I wanted! doh!...

  35. Bogus benefits! by Adnans · · Score: 1, Redundant

    * Their graphics cards would become invaluable for rendering production output for TV, film and video

    Oh yeah, the current market for this is huuuge, NOT! When and if the need arises I'm sure the card manufacturers will address this. Right now it's FPS that count and I don't think GFX companies are going to waste engineering time on this useless feature without any significant return on investment.

    * Users could actually record game output in real-time with little impact on game performance. After playing there could be a compressed movie of your game play saved on your HD. On a reasonably fast machine you could actually record your game play digitally to your DV camcorder as you play or even compress and burn it to a Video CD or DVD at the same time you are actually playing

    This one is just silly. Why not record the game engine commands instead of the videocard output? Oh wait, that has been possible for years in most FPS games, no? What these people are proposing is to capture high resolution images, compress (eeks), and then stream out to a TV screen that probably has only 1/3 the resolution of the original capture. What a great way to waste time!

    * Screen capture software that grabs motion images of user interfaces for the purposes of tutorials and training is a vital business application

    Haha, this one takes the cake. Most of these tutorials do not need high FPS numbers to be useful at all. And even more importantly, a lot of these applications simply script the real application to demonstrate the needed behaviour. You can't beat that.

    * Despite the popularity of Internet streaming, it is not currently possible to stream live output from graphics cards over the Internet. The connections, processors and codecs are all fast enough today. Sadly, all of this horsepower is being held back by one remaining weak link: the texture download speed of today's graphics card drivers.

    Woehahhahaa :-)

    I'd rather fetch a UDP stream of game engine commands and render the game action on my side of the Internet, thank you very much.

    What a joke :)

    -adnans

    --
    "In short: just say NO TO DRUGS, and maybe you won't end up like the Hurd people." --Linus Torvalds
  36. Ray Tracing on the GPU by eeeeaagh · · Score: 5, Interesting
    We just ran into this problem when implementing a ray tracer using the GPU that will be presented soon at the upcoming Graphics Hardware Workshop.

    Our ray intersection algorithm implemented on the GPU (an "old" Radeon 8500) was able to intersect 114M rays per second. This was loads faster than the best CPU implementation, which could handle between 20 and 40M intersections per second.

    But when we tried to implement a ray tracer based on this, and an efficient one that didn't intersect every ray with every triangle, the readback rate killed us. Our execution times slowed down to the low end of the fastest CPU implementations.

    And the readback delay seems to be completely due to the drivers, which apparently still use the old PCI-bus code. If the drivers could use the full potential of the AGP bus, our ray tracer could approach twice the speed of the best CPU ray tracers.

  37. Hmm sounds like a call to arms... by Odinson · · Score: 2

    If the drivers are truly the only issue and not the hardware, wouldn't this be a great opportunity for the XF86 guys and whoever writes the particular tdfx modules to optimize Linux first?

    "No Mr. Valenti sir you don't understand we have to use Linux. It's the only game out there for our CG budget. Windows can't do RAM write back with decent FPSes, and commodity GPUs are 20 times cheaper..."

    Wouldn't that suck for them... at least it would be amusing.

  38. Really? by Anonymous Coward · · Score: 0

    Somehow this doesn't surprise me.

    I noticed long ago that my simple Pinnacle TV tuner card (PCI) has a *much* better card-to-memory transfer rate than my Matrox eTV AGP display/tuner card.

    With the Pinnacle I can capture video, let the CPU handle on-the-fly compression in Indeo (at maximum resolution, or even straight to DivX at lower resolutions), while the image is tee-ed to the display adapter at the same time.

    With the Matrox, capture is only possible if the compression is done on-card, otherwise you get dropped frames because the bus can't keep up.

    The Matrox is the better card for practical use, even though the tuner is slightly less in quality, because it does have a very good compressor on board (real-time MPEG2 at DVD level bitrate and resolution, or lower if you want to -- I usually capture at 8 Mbps max resolution, and later recompress it through DivX).

  39. Yes, but... by Anonymous Coward · · Score: 5, Informative
    a) texture imposters (realtime adaptive billboarding)

    That's what render-to-texture is for, you don't need to read data back to the CPU.

    b) split world/image-space occlusion culling.

    This wouldn't be too useful for realtime graphics anyways, because of the way the 3D graphics pipeline works. The CPU can already be processing data a few frames ahead of what the GPU is currently working on. If you read back data from the card every frame, you have to wait for the GPU to finish rendering the current frame before you can start work on the next one.
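
    A minimal sketch of that stall, written as a toy timing model in Python. The per-frame costs are made-up numbers, and `frame_time_pipelined` and `frame_time_with_readback` are hypothetical helper names; the readback case assumes the CPU cannot start on the next frame until the copy has finished.

```python
# Toy timing model. All numbers are illustrative assumptions, not measurements.

def frame_time_pipelined(cpu_ms, gpu_ms):
    """CPU prepares frame N+1 while the GPU draws frame N, so the
    steady-state cost per frame is set by the slower of the two."""
    return max(cpu_ms, gpu_ms)


def frame_time_with_readback(cpu_ms, gpu_ms, readback_ms):
    """A per-frame readback serializes the work: the CPU waits for the GPU
    to finish, copies the data back, and only then starts the next frame."""
    return cpu_ms + gpu_ms + readback_ms


if __name__ == "__main__":
    cpu_ms, gpu_ms, readback_ms = 8.0, 10.0, 5.0  # hypothetical costs
    for label, ms in (
        ("pipelined", frame_time_pipelined(cpu_ms, gpu_ms)),
        ("readback every frame", frame_time_with_readback(cpu_ms, gpu_ms, readback_ms)),
    ):
        print(f"{label}: {ms:.1f} ms/frame, {1000.0 / ms:.1f} fps")
```

    In this model the readback hurts twice: once for the copy itself, and once because the CPU and GPU no longer overlap.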

    1. Re:Yes, but... by PacoSuarez · · Score: 1

      Can someone, please, moderate this post higher? Right now it's Score:0, while it's the exact right answer to the two cases proposed.

    2. Re:Yes, but... by Mike+Connell · · Score: 5, Informative

      That's what render-to-texture is for, you don't need to read data back to the CPU.

      That is true for simple versions, but with methods moving towards image based rendering you often have to pull the data back anyway. Then you can process the textures to produce better imposters - not necessarily just billboards.

      Re: occlusion culling. People are using these methods today for realtime graphics (for example combinations of Greene's HZB, or HOMs) even with the low readback speed. UNC's Gigawalk software is one published example (Google for it). Getting Z or alpha channel information back is the biggest hit, so these methods would be even more efficient and so more widely applicable with faster transfers. When you're rendering N million triangles per frame (UNC quote 82 million) you have to do this stuff to get realtime rendering.

      So it is used for realtime graphics today - although mainly for heavy duty applications not games.

      HTH

    3. Re:Yes, but... by Mike+Connell · · Score: 3, Informative

      Oops, forgot to point out one more thing too: HP and NVidia have both implemented OpenGL extensions to address the issue of getting Z occlusion information back (nvidia's is layered on top of the HP extension iirc). This isn't useful for reading back the framebuffer fast, but helps when doing realtime occlusion culling.

    4. Re:Yes, but... by Anonymous Coward · · Score: 1, Interesting

      Well those are great examples, but I think you have to draw the line somewhere. There are so many neat things you can do when you can read back data quickly, but is it really worth the trouble?

      Now that cards have arbitrary dependent texture reads, doing warps for IBR right in the hardware is a real possibility. Also, the latest 3D cards can push upwards of 100 million triangles, enough to render that 82 million triangle scene in realtime (assuming some basic LOD and occlusion culling).

      Read backs are going to become more and more irrelevant in the future. Try looking at the Moore's law doubling times for GPU speed vs. bus bandwidth. Since AGP was introduced, the speed has doubled 3 times, with AGP 8x just becoming available now. On the other hand, it only takes 3D hardware a year and a half to achieve the same speed jump. As time passes, the size of the bus compared to the amount of data being processed by the GPU will only become smaller.
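
      As back-of-the-envelope arithmetic, the widening gap can be sketched in Python. The doubling times are assumptions read off the paragraph above (three bus doublings over roughly the five years AGP has existed, three GPU doublings every year and a half); they are not measured figures, and `growth` and `gpu_to_bus_ratio` are hypothetical helper names.

```python
# Assumed doubling times, in years (taken loosely from the comment above).
BUS_DOUBLING_YEARS = 5.0 / 3.0   # three doublings in about five years
GPU_DOUBLING_YEARS = 1.5 / 3.0   # three doublings in a year and a half


def growth(years, doubling_time_years):
    """Multiplicative speed-up after `years` of steady doubling."""
    return 2.0 ** (years / doubling_time_years)


def gpu_to_bus_ratio(years):
    """How much faster GPU throughput has grown than bus bandwidth."""
    return growth(years, GPU_DOUBLING_YEARS) / growth(years, BUS_DOUBLING_YEARS)


if __name__ == "__main__":
    for years in (1.5, 3.0, 5.0):
        print(f"after {years} years: GPU x{growth(years, GPU_DOUBLING_YEARS):.0f}, "
              f"bus x{growth(years, BUS_DOUBLING_YEARS):.1f}, "
              f"gap x{gpu_to_bus_ratio(years):.1f}")
```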

      I think treating the graphics bus as a one way street is inevitable, so we might as well accept it and learn to take advantage of it.

    5. Re:Yes, but... by Anonymous Coward · · Score: 0

      The only use of that I've heard so far is for light flares... you check if the light source is visible using nvidia's extension, and draw the flare if it is.

      The only reason this works is that nobody really cares if the data is received from the card a few frames late, at worst the light flare will lag behind by a fraction of a second. This is a really big hack, though -- trying to use the occlusion query for anything else is going to kill your performance.

    6. Re:Yes, but... by Mike+Connell · · Score: 2

      You can also use it for lots of multipass effects - not least cutting down geometry for shadows.

      With big scenes (as I mentioned in another post) attempting to render occluded geometry is far more costly than stalling the pipe for a few ms. Trying to render a few million polys can also kill your performance :-)

    7. Re:Yes, but... by Sesse · · Score: 1

      One word: GL_NV_occlusion_query. Available on GF3 and up. :-) /* Steinar */

      --
      (This comment is of course GPLed.)
    8. Re:Yes, but... by Mike+Connell · · Score: 2

      Is it worth the trouble? It's worth the trouble for some users now yes. That's why it's done. Given a choice between rendering stuff slowly, or doing readback and rendering fast, people choose to render fast. It's a small investment for a big speed improvement.

      As to the future, everybody can see the difference in bus speed vs. GPU performance. Shaders are going to open up a lot of possibilities in the next few years - for all parts of the pipeline.

      But at the end of the day performance is what counts. Today we need to do readbacks, tomorrow hopefully not. The fact that we might not need to do them in the future doesn't mean that people shouldn't make the most of what we have at the moment. Nothing lasts forever, everything changes - and in computer graphics - especially fast.

    9. Re:Yes, but... by cyranose · · Score: 1

      Some of this was addressed in another post, but I guarantee if I had faster readback we'd use it in my projects. I've had cases where the best theoretical solution to a graphics problem (like occlusion culling) was hampered solely by slow readbacks. For example, NVidia is currently advising developers to go render the entire frame once up front quickly (no light, no texture) to lay down the z-buffer so hidden pixels don't get rendered during later passes. If we could do that first render with tags (color-ids) that would tell us which objects are actually visible, we could avoid sending more detailed geometry and texture passes (which can actually lower AGP demands) and avoid a heck of a lot of wasted pixel processing too. Even if latency were an issue, sampling this visibility information at a lower frequency would give us a better idea about turning off whole portions of a scene. We might even be able to avoid frustum culling entirely. So, yes, this is an issue today and maybe even with the games you care about.
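
      The CPU side of that color-id idea could look roughly like the Python sketch below. It assumes the first pass has already been read back as a 2D grid of integer object IDs; the grid, the `BACKGROUND` value, and both function names are hypothetical and only illustrate the bookkeeping, not any driver or engine API.

```python
from collections import Counter

BACKGROUND = 0  # assumed ID for pixels where no object was drawn


def visible_pixel_counts(id_buffer):
    """Count how many pixels each object ID covers in the read-back buffer."""
    counts = Counter()
    for row in id_buffer:
        counts.update(row)
    counts.pop(BACKGROUND, None)
    return dict(counts)


def objects_to_draw(id_buffer, min_pixels=1):
    """IDs worth sending through the detailed geometry and texture passes."""
    counts = visible_pixel_counts(id_buffer)
    return sorted(obj for obj, n in counts.items() if n >= min_pixels)


if __name__ == "__main__":
    # A tiny 4x3 "framebuffer" of object IDs standing in for the readback.
    id_buffer = [
        [0, 1, 1, 0],
        [2, 1, 1, 0],
        [2, 2, 3, 0],
    ]
    print(visible_pixel_counts(id_buffer))
    print(objects_to_draw(id_buffer, min_pixels=2))
```

      Any object whose ID never shows up in the buffer is fully hidden and can be skipped in the later passes; raising `min_pixels` also drops objects that barely peek through.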

    10. Re:Yes, but... by Anonymous Coward · · Score: 0

      get an ID pls :) hard to track an AC and intelligent posters are worth following

    11. Re:Yes, but... by Anonymous Coward · · Score: 0

      I don't think the people making the hardware really care about what a few people want right now. This is just like how people used to clamour about the inability to get more fine-grained texture management on OpenGL back when Glide was cool. There's always going to be a few people lagging behind who want things the way they used to be. They need to face the facts -- direct framebuffer access is a thing of the past.

    12. Re:Yes, but... by ShortSpecialBus · · Score: 1

      But are these cards designed for those users? Most users that are going to need that sort of thing aren't going to be buying consumer grade cards...they'll be spending way more for a good professional 3D rendering card.
      I would rather have the card that I buy for games and the like be fast and not be more expensive or slower to support the few users who would consider this card for that particular app.

      --
      //FIXME: Bad .sig
  40. you're missing the point. by wuHoncho · · Score: 3, Interesting

    Very few people use their typical desktop video cards for actual video production or anything related to it because the hardware up until now was simply unable to handle that sort of load. Now we have these cards that are the beginning of a new era of computer-generated visuals. The article is saying that they can do quite a bit more than they can do now if someone would just write some better drivers for them.

    Now, streaming real-time rendering images over the internet? Maybe not fullscreen stuff right now because of a multitude of hampering factors on affordable internet bandwidth which I won't name for clarity's sake, but for the limiting factor to be the internet itself and not the graphics card is still a significant step.

    This would definitely be very beneficial to low-budget game developers and movie directors. We could very well see the return of the shareware boom (remember the early-mid 90's?) because of this.

    Sure, only a small portion of the people who'd buy the cards would use these features that the article talks about, but they'd be people that didn't have that capability before. Whenever this happens in any medium/artform/what-have-you, there is the tendency for a lot of experimental stuff to appear. I think we have some very interesting times ahead of us if someone gets these drivers written.

    --


    Just another freak in the freak kingdom.
  41. I disagree. by Anonymous Coward · · Score: 1, Informative

    Let's say Pixar starts using 3D chips to accelerate their rendering. They will be doing one of two things:

    1) High quality rendering - It takes one hour to render a frame, so the download time is negligible.

    2) Realtime previewing - Why would you want to download each frame to the CPU if all you want is a preview?

    1. Re:I disagree. by Anonymous Coward · · Score: 0

      He's not talking about an increase in speed, but rather a change in how everything can be done - the techniques of editing and producing could totally shift.

  42. What's more... by Anonymous Coward · · Score: 0

    You hit the nail on the head.

    Just take a look at who's behind this article. Serious Magic writes low-end video editing software, which is hardly the target market for 3D cards. What's more is that their "CTO" Stephan Schaem already has a bad reputation of pestering DirectX developers with constant demands of odd API features.

    If you read some of the posts he makes on DirectX mailing lists, you'll realize that he has very little idea of how to use a 3D graphics API properly. I would definitely call into question the results of his benchmark.

  43. Perhaps... by ColGraff · · Score: 5, Insightful

    "What kind of idiot puts their most powerful processor at the end of a one way street?"

    Maybe they're the kind of idiots who know most people just want the best possible OUTPUT for gaming, and so don't want to add any overhead in card performance - or even additional design time - that isn't related to gaming performance. You know, the idiots who make cards that get award after award from gaming companies, then write near-perfect drivers, port those drivers to Linux, and let you overclock the card to your heart's content. Those sort of idiots. My, they're idiotic.

    Nobody says, "buy a geforce 4 ti, make the next toy story." No, it's advertised as a gaming card, and that's what it's designed to do. If you want to do high-end video rendering things, perhaps a gaming card isn't the best choice.

    --
    I'm the stranger...posting to /.
    1. Re:Perhaps... by unicron · · Score: 2

      Yeah, but you ever run Half-life on a cluster of boxes with Oxygen Wildcat cards? Damn.....

      --
      Finally, math books without any of that base 6 crap in them.
    2. Re:Perhaps... by gspeare · · Score: 4, Funny

      Hey, I just realized that my high-end printing device has absolutely no hardware provision for reverse-direction printing! If I want to take the high quality document I just printed and put it back into electronic form, I have to spend hundreds of dollars* for a completely separate "scanning" device! What a ripoff!

      Really, as soon as the market for this sort of capture starts to grow, someone will have a hardware solution. The first ones will be cheesy: a connector into a separate PCI capture card, for example; but eventually a more reasonable method will become standard design.

      To me, this is just the free market in action, working (more or less) as it should be.

      * I know how much scanners cost. Think hyperbole. :)

    3. Re:Perhaps... by Anonymous Coward · · Score: 0

      The comparison here should be with getting raster output from a printer's PostScript module. Over parallel.

  44. Should this be surprising? by siskbc · · Score: 1

    TV cards are designed to take incoming data and put it to the bus. GPUs are designed to do the opposite. So yeah, your tuner card, even on PCI, should beat the vid card for readback.

    --

    -Looking for a job as a materials chemist or multivariat

    1. Re:Should this be surprising? by sschaem · · Score: 1

      This is wrong. GPUs are designed to do this as well. Each core is VIVO enabled, from the TNT to the Geforce4. When you capture video on a Geforce2 or an AIW Radeon, the video is captured by passing through the AGP bus using DMA. There's little load on the CPU/system resources. Those video surfaces, after all, can be compressed using software MPEG-2 in real time. What's missing is the driver path that also performs this for graphics surfaces, not just video surfaces. I'm seeing a bunch of reports from people seeing 100meg under 9x but 10meg under XP/W2k with nvidia HW... It's all about drivers. Stephan

    2. Re:Should this be surprising? by Anonymous Coward · · Score: 0

      Your post kind of disregards the entire 400 odd messages on this board, but whatever. Do a survey - how many GPU owners use GPUs for capture? 0.01%. For display? 100%. While the AGP bus is 2-way, the main purpose of a GPU is rendering for output, not capture, at least for the nVidia stuff. Hence they haven't supported capture. It may be all about drivers, but the people writing the drivers are the same ones writing the hardware and they decided to basically use the AGP interface for near 100% input (CPU->GPU) rather than any significant bidirectional traffic.

    3. Re:Should this be surprising? by sschaem · · Score: 1

      They all support DMA capture over the AGP bus.

      Nvidia even implements DMA transfer in their OpenGL driver, and their W9x DirectX drivers.

      GPUs are a lot more flexible than you can imagine.
      Being closed-minded doesn't help the industry move forward and be innovative.

      Millions of users would prefer to do video work in realtime/faster than realtime.
      That's just one market/application for putting DMA support in drivers.

      The 4+ million cards sold every month don't all end up in gamer boxes. Other markets do exist...

      Stephan

  45. I've seen this firsthand... by nadador · · Score: 2

    And I tend to agree that it's a software issue.

    NVIDIA says that if you ask for contents of the framebuffer in a call to glReadPixels and you ask for it in the same pixel units it's stored in, you won't be really disappointed. If, however, you ask for that same region of the framebuffer in another format, you're screwed. (So, if your framebuffer is 8-8-8-8 RGBA, and you ask for luminance or 10-10-10-2 or something else odd, you aren't going to be pleased with the performance.)

    This isn't, by the way, just a render-movies-on-your-PC issue. Lots of scientific computing, visualization, etc., applications render with OpenGL and then grab the framebuffer to store a result. This throughput issue is significant considering that for many applications, what was an enormous data set 10 years ago is now not such a big data set. Like another poster said, this issue is one of the ones that still ties people to SGI.

    While 99% of your other concerns might be dealt with, there are still lingering problems like this one that keep some people from moving to commodity hardware.

    --

    Outside of a dog, a book is a man's best friend. Inside a dog, its too dark to read.
    1. Re:I've seen this firsthand... by siskbc · · Score: 1

      And I say this because I do this...

      Anyone in science who is doing any modelling has an SGI or a farm of them. Scientists aren't going to be using gaming GPUs anytime soon. Hell, the software (such as Cerius2) is, I believe, optimized for SGIs, but I could be wrong on the last point.
      There are other things that tie scientists to SGIs. First, the company is stable. We could have been having this discussion 5 years ago regarding 3dfx. No researcher in his right mind is going to tie his prospects to a GPU company, as they generally don't control the market for all that long. Good luck getting them to support your voodoo.
      Second, no one wants to re-write the bulk of the code out there that runs really well on SGIs. Third, unless your group is huge, a researcher won't likely have his own sysadmin. So he could get a pre-configured, reasonably ready-to-go SGI, or he could hack a way to get his GeForce to do it.

      Sorry, but there is no demand in the scientific community for this.

      --

      -Looking for a job as a materials chemist or multivariat

    2. Re:I've seen this firsthand... by nadador · · Score: 2

      You're right. Maybe I misspoke. I should have said engineering community, not scientific. The data sets, etc., that are most frequently seen in the scientific (and parts of the engineering community) are still so large, that no one considers moving away from SGI. When the data is large and the visualization is intense, you buy an Onyx. (As I type, I could walk over and touch the Onyx3800 or the rack of O300 in the computer room.)

      What I'm referring to is the smaller applications, the ones that don't justify the purchase of an Onyx, and don't warrant significant time on an Onyx. Around here, there is a small but growing group of projects that started 10 years ago on a fleet of Onyx. The data sets haven't grown significantly in that time, and what would have choked a PC and required an Onyx 10 years ago isn't that big a deal now. While I realize this is still a small set, the number grows every couple of months.

      Would you agree that smaller, especially custom applications, might eventually move away from SGI?

      --

      Outside of a dog, a book is a man's best friend. Inside a dog, its too dark to read.
    3. Re:I've seen this firsthand... by siskbc · · Score: 1

      I'll buy that to an extent. It would have to be a small company, probably a startup, and it would have to be a really tech-savvy company. You would have to be the sole maintainer of the box (ie, no nVidia support), and it would have to be a box that had a limited userbase, probably.

      But, like you say, these companies do exist. But this movement will have to have an open-minded CIO or equivalent - the cost savings could be there, certainly.

      I think, bottom line, this won't happen to any huge extent until a big player starts trying to open SGI's market with bargain prices, because it will have to be supported to get most CIO's to switch. And SGI won't much like that. I've wondered occasionally if there isn't a sort of gentleman's agreement keeping the markets separate.

      Bottom line - no big company will try it. Maybe a few small ones, yes - and even then, only those that truly need extremely high-res, realtime video capture. I still think this has to be mostly entertainment industry, because only under successive video processing will such res be needed. I think most of the needs can be addressed adequately with the GeForce as it is now. But someone will do it, if as much for fun as anything.

      --

      -Looking for a job as a materials chemist or multivariat

    4. Re:I've seen this firsthand... by varaani · · Score: 1

      I do this too. Apparently your field is much unlike mine.

      I work in computer vision research. I do all my rendering work on a P4-Geforce3 machine. We have SGIs too but PCs give a lot more bang for the buck (as a governmental organization we don't have that much to spend on equipment) and have more or less equal performance.

      Second, no one wants to re-write the bulk of the code out there that runs really well on SGI's

      What rewrites? My plain-C OpenGL code works just fine on both.

      a researcher won't likely have his own sysadmin

      I know some labs which are like that, but luckily mine isn't one of them.

    5. Re:I've seen this firsthand... by nadador · · Score: 2

      True, true. One of the few good things going for SGI is that the userbase is very loyal, and for just as many technical reasons as not so technical ones.

      And you almost pinned down what I do :-) I work for a large company, but I develop applications that never leave the lab, and have user bases on the order of 10. Since what we do is pretty much fully custom applications for real time simulation, we get to pick and choose the tools, and ever since we couldn't buy any more Harris parts, we've been an all SGI shop.

      --

      Outside of a dog, a book is a man's best friend. Inside a dog, its too dark to read.
    6. Re:I've seen this firsthand... by UncleFluffy · · Score: 2

      It is possible to handle anything-to-anything untiling and format conversion in the CPU at high enough speed for this not to be the bottleneck. I've written code to do it, and I'm sure the guys at nVidia could do it as well if they wanted to (which is not to say that they are).

      My suspicion is that the raw bit-shovelling across the bus is more likely to be the problem.

      --

      What would Lemmy do?

  46. We need to hear the Great Carmack's opinion! by Anonymous Coward · · Score: 0

    nt

  47. why the drivers? by Anonymous Coward · · Score: 0

    I can't see why they come to the conclusion it's the drivers. Current graphics cards are pipelines which are heavily optimised for taking lots of data from the AGP bus, processing it (TnL, rasterisation etc) and outputting it to the monitor. That's what they're designed to do. Sure, the possibility exists for reading back from the pipeline across the AGP (due to calls implemented in graphics APIs designed before the cards), but anyone who has coded using a modern graphics API should know it's not advisable if you want to keep pipeline speed, as it means reversing some parts of the pipeline (meaning flushing it, flushing cache, refilling cache, etc etc, then the same in reverse to get back to drawing again...).

    As far as I can tell, this article is about a company realising something that everyone else already knew, then whining cos they don't get the cheap rendering platform they were hoping for.

    Seriously though, how much extra effort is it to capture the output and store it (via another PC with vid capture if needed)? This whole thing smells of "OMG! this thing doesn't do wondrous new things that normally cost 15 times as much, straight out of the box!"

    OK, the newer cards seem to be even more heavily optimised towards rendering towards monitors. Can you blame them? If a feature is already almost never used because it's so inefficient, and you can improve performance of regularly used features at the cost of the rarely used one, then that's the way to go.

    Even with adapted drivers, the performance would still suck compared to letting the pipeline run how it likes and capturing the frame at the other end.

    Rant over.

    Dan

  48. Depends on settings... by godel9 · · Score: 1

    I've done my graduate research for the past two years on topics requiring fast frame buffer readback, and here's what I've found: For nVidia, reading back in native format (GL_BGRA_EXT) with OpenGL is very fast. I get performance in the range of 40-50 Mpix/sec, which comes out to be 160-200 MB/sec. I know people at some of the other research labs in California have been able to reproduce these numbers. Reading back in any other format is slow, and reading back in DirectX is slow. Reading back anything but color information (e.g. Z-buffer) is really, really slow.

    I've talked to the driver people at nVidia, and my understanding is that they just haven't optimized these paths yet. The driver code reads data from the card one word at a time and doesn't use any of AGP's block transfer modes simply because that would take development time away from providing features that most people are going to use.

    As much as we all like to bash Microsoft, I don't think it has anything to do with them. Since nVidia writes both the Microsoft and Linux drivers, we'd probably see any improvement in readback performance in both drivers at approximately the same time, but I could be wrong... The market drives companies like ATI and nVidia, so as soon as people start demanding fast frame buffer readback, they'll put it in. Until then, there's still the fast OpenGL path that nVidia has put in for research purposes.
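
    For reference, the pixel-rate to byte-rate conversion quoted above, plus what it implies for full-frame capture, works out as in this small Python sketch. The 4 bytes per pixel follows from the 32-bit BGRA format; the 1600x1200 frame size is borrowed from the article summary as an example, and the function names are hypothetical.

```python
BYTES_PER_PIXEL = 4  # 32-bit BGRA


def readback_mb_per_sec(mpix_per_sec):
    """Convert a readback rate in megapixels/s to megabytes/s."""
    return mpix_per_sec * BYTES_PER_PIXEL


def max_capture_fps(mpix_per_sec, width, height):
    """Upper bound on full-frame readbacks per second at a given resolution."""
    return mpix_per_sec * 1_000_000 / (width * height)


if __name__ == "__main__":
    for rate in (40, 50):
        print(f"{rate} Mpix/s = {readback_mb_per_sec(rate)} MB/s, "
              f"at most {max_capture_fps(rate, 1600, 1200):.1f} fps at 1600x1200")
```

    So even the fast path tops out in the low twenties of frames per second at 1600x1200, before any compression or disk writes are counted.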

  49. Stop me if I am wrong, but... by jjackson · · Score: 1

    Isn't the whole point of the hyper-fast GPUs to be able to render 3D, texture-mapped objects with a minimum of communication with the CPU? If a graphics card is going to make full use of the AGP bandwidth, isn't this going to put one hell of a strain on the motherboard processing power itself?

    If you look at the pre-hardware accel cards, the CPU was responsible for the calcs needed to render the display and then dump the raw data out to the video card. I would think that the design focus for the GF4 in my system was to render as many 3D objects at as high of a framerate possible WITHOUT the need to send gigs of data streaming into the video card.

    While it might not be that big of an engineering modification to up the bandwidth capabilities of a video card's interface with the mobo, I don't think this was part of the initial design goals... in fact, I think the goal was the exact opposite.

  50. Not a new problem, not just 3d by Anonymous Coward · · Score: 1, Informative

    This has been an issue for quite some time. Raster once put reading from the card at being 1/10th the speed of writing to it. This is the reason we have very little "fake transparency" going on right now. Those methods read the frame buffer and then composite upon the necessary region. With this method transparency can neither be fast nor update in real-time.
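
    The read-then-composite step behind "fake transparency" can be sketched in Python as below. This is a generic alpha blend under assumed conventions (8-bit RGB tuples, alpha from 0 to 255), not the code of any particular window manager or toolkit; `readback_pixels` stands in for whatever slow framebuffer read supplies the background, and both function names are hypothetical.

```python
def blend_channel(src, dst, alpha):
    """Blend one 0-255 channel; alpha runs from 0 (all dst) to 255 (all src)."""
    return (src * alpha + dst * (255 - alpha)) // 255


def composite_region(window_pixels, readback_pixels, alpha):
    """Blend a window's pixels over background pixels read back from the card.

    Both inputs are equal-length lists of (r, g, b) tuples. Every screen
    update needs a fresh `readback_pixels`, which is the expensive part.
    """
    return [
        tuple(blend_channel(s, d, alpha) for s, d in zip(src, dst))
        for src, dst in zip(window_pixels, readback_pixels)
    ]


if __name__ == "__main__":
    window = [(200, 200, 200), (255, 0, 0)]
    background = [(100, 100, 100), (0, 0, 255)]
    print(composite_region(window, background, alpha=51))
```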

    The solution is to take this into account when designing the compositing model, which Apple has done and Keith Packard and co are doing with Xrender and its offshoots.

    macros

  51. Faster readback has been requested for years by cyranose · · Score: 3, Informative

    I've been doing real-time 3D graphics for 10 years and read-back speeds have been the biggest problem for doing many advanced algorithms. We have asked the companies to improve this many times. The problem as I see it: Quake and other benchmark apps don't rely on readback.

    Here are a few other important but non-Quake techniques that are driven by readback speeds. I'll go into more detail on the first for illustration purposes.

    * High-quality real-time occlusion culling -- many techniques render the scene quickly by using a unique color tag per object or polygon and then read back the framebuffer to figure out everything that was visible (and how many pixels for each) for a final high-quality pass. If HW drivers would even just implement the standard glHistogram functions (which essentially compress the framebuffer before readback), this would become practical. NVidia adds their NVOcclusion extension, but it's limited in how many objects at a time you can test, it's very asynchronous, and it requires depth sorting on the CPU to make it most useful. The render-color technique does not. Yet HW makers are spending lots of money adding custom HW to do z-occlusion when a simple driver-based software technique may be easier.

    * Dynamic Reflection Maps -- for simple, reflective surfaces -- Requires background rendering from multiple POVs (generally six 90 degree views) and caching these. Even if you can cache a small set of maps in AGP memory, you want fast async readback if you have a large fairly static scene and you're roaming around.

    * Real-time radiosity -- similar to above, but needs more CPU processing of the returned images and possibly depth maps (reading back the depth buffer is often even more expensive than the color).

    * Real-time ray tracing -- the better quality approaches need fast readback to store intermediate results (due to recursion, etc..). With floating point framebuffers and good vertex/pixel shaders, ray-tracing becomes possible, but not yet practical. I believe /. may even have run a link to one of these techniques a while back.

    So there's a lot more to this issue than just making movies of your games. Faster, better graphics would be possible. So why isn't this a priority?

    ------------ cyranose@realityprime.com

  52. Not the SW by po8 · · Score: 2

    The article claims that the drivers, not the HW, are causing the performance problem. Based on my conversations with a premier graphics programmer and some x86 experts, I don't believe that it is this simple. In particular, note that XFree86 2D, which uses its own drivers, also has pathetic readback rates.

    I barely understand the technical details, but it seems like there are some serious misfeatures in the way that the AGP bus interacts with CPUs and caches on both Intel and AMD during readback; it is going to be hard for card vendors to fix this problem (even if they decide to care). It may be that a new bus and/or new CPU glue will be needed for high-readback-rate applications.

    1. Re:Not the SW by sschaem · · Score: 1

      It's not a HW issue. If it were, that wouldn't explain how people get 200 MB/s readback using OpenGL, or how some are getting 80-130 MB/s readback under Win9x. (This applies only to NVIDIA drivers; I'm not aware of any other driver that performs readback using DMA.) Stephan

  53. How come my car doesn't fly? by Anonymous Coward · · Score: 0

    I just don't get it! My car still doesn't fly.. they keep gettin' faster, but my car just doesn't fly. I believe this is a driver issue, after all it's 2002 now... 15 years ago they said the cars now should be flying, it must be the driver's fault! How come I'm the only one complaining about this..? Hello? It's me, the .00001% of the population that is DISGUSTED that we can't fly our cars around yet, but they have fancy GPS, maps, airbags and big ass engines... but they can't fly! ... *voice in the background* - "go buy a plane you jerk!"

  54. a VCR to the Svideo output by DABANSHEE · · Score: 2

    My card will output the same image to its VGA & TV outputs at the same time.

    Surely simply connecting the S-video output to a VCR while playing Quake through the monitor should do the trick.

    1. Re:a VCR to the Svideo output by Anonymous Coward · · Score: 0

      Not quite as easy as it sounds. Look into issues of time-base synchronization and other such things, and keep in mind that NLE means that any video you dump to tape (which keeps you stuck at realtime speeds, not above them) has to be redigitized. Technically, some sort of DVI capture device would work fine, but it'd be a kludge when you could just use your existing disk.

      Another advantage- if you're using an external (unsynchronized) device, you're going to drop a frame if load spikes or something happens. If you're writing to your own disk, not in 'real time,' but perhaps at > realtime speeds, you simply take a little longer to render that frame, and end up with a crystal-clean .MPEG or similar output.

      Try doing really clean fades/wipes/editing in the analog domain, and you'll see why people have moved to all-digital solutions like AVID, and Toaster 2000 when working with live inputs, from the days of the Video Toaster (the first device to make 'desktop production' possible).
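
      A small sketch of the "not in real time" point above, assuming a renderer that stamps each frame by its index rather than by the wall clock. The per-frame costs are invented numbers; the only thing being shown is that a slow frame stretches the capture time but never shifts or drops a video timestamp.

```python
from fractions import Fraction


def offline_capture(render_costs, fps):
    """Return (video_timestamp, wall_clock) pairs for frames captured offline.

    render_costs: hypothetical wall-clock seconds each frame took to render
    and read back. The video timestamp depends only on the frame index.
    """
    frame_dt = Fraction(1, fps)
    wall_clock = Fraction(0)
    frames = []
    for index, cost in enumerate(render_costs):
        wall_clock += Fraction(cost)  # a load spike only makes the capture take longer
        frames.append((index * frame_dt, wall_clock))
    return frames


# The second frame hits a load spike (half a second), yet stays on the 1/30 s grid.
costs = ["1/100", "1/2", "1/50"]
frames = offline_capture(costs, fps=30)
timestamps = [t for t, _ in frames]
```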

  55. Re:hello, i am a typical slashdot user by Anonymous Coward · · Score: 0

    no, that is a typical slashdot user/reader.

  56. This is so funny... by streak · · Score: 1

    It's quite ironic that this story was posted today, because I'm having the same problem and I was beating my head into the ground as to why it was so slow.

    Basically I wanted the GPU to map some textures for me ('cause it's been designed to do that) and then I wanted to get those back and do some other operations on them.

    What I found with my really crude benchmarking is that a call to glReadPixels() of 128x128 8-bit RGB data from a GeForce 2 Ti took around .25ms, which was totally unacceptable because I needed to do this as much as possible within a frame time (so ~30ms..)

    It boggled my mind as to why this was so slow, and now I know.
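
    For reference, a back-of-the-envelope sketch of what those figures work out to, assuming "8-bit RGB" means 3 bytes per pixel and taking the .25ms and ~30ms numbers at face value. The function name is made up; times are in microseconds to keep the arithmetic exact.

```python
def readback_stats(width, height, bytes_per_pixel, read_us, frame_us):
    """Return (bytes per read, bytes per second, reads that fit in one frame)."""
    bytes_per_read = width * height * bytes_per_pixel
    bytes_per_second = bytes_per_read * 1_000_000 // read_us
    # Upper bound only: assumes the frame does nothing except read back.
    reads_per_frame = frame_us // read_us
    return bytes_per_read, bytes_per_second, reads_per_frame


bytes_per_read, bytes_per_second, reads_per_frame = readback_stats(
    128, 128, 3, read_us=250, frame_us=30_000
)
```

    Under those assumptions each read moves 48 KB, so the fixed per-call cost matters as much as the raw transfer rate when the blocks are this small.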

  57. Doubts about the benchmark by Anonymous Coward · · Score: 0

    I think that the bandwidth figures generated by the author of this article are seriously suspect.

    I've written an OpenGL application that supports a capture movie feature . . . It's possible to record a 720x480 movie at 32bpp at 30 frames per second (GF4 Ti 4600, 1.33GHz Athlon), and this includes the compression and rendering time. That's about 40 megabytes per second. 30 fps is the highest supported capture rate, so I haven't tried to find an upper limit. But even 40 megs per second is three times higher than the best figure reported in the benchmark.
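
    The "about 40 megabytes per second" figure can be checked with a quick calculation (raw frame data only, before compression; the function name is just for illustration):

```python
def capture_rate(width, height, bits_per_pixel, fps):
    """Uncompressed readback rate in bytes per second."""
    return width * height * (bits_per_pixel // 8) * fps


rate = capture_rate(720, 480, 32, 30)
rate_mb = rate / 1_000_000  # decimal megabytes per second
```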

    I'm not sure whether the benchmark uses OpenGL or DirectX--that could have a significant performance impact. But I think it's more likely that there's a big problem with the way the benchmark is written--without having the source, it's difficult to tell what's really happening.

    The author of the article also seems to have confused "texture download" with "frame buffer read." Such a deep confusion about the very subject of criticism casts further doubt upon the author's results.

    --Chris

  58. Damn.. *what*? by Anonymous Coward · · Score: 0

    Half Life is an old game, and its maximum visuals can already be seen by any decent card today. And I doubt it has support for clustering of any kind.

  59. You're supposed to render to an offscreen buffer by Animats · · Score: 3, Insightful
    If you want the rendered image back in main memory, render it into an offscreen buffer, or "pbuffer" in the OpenGL world. That's the standard approach, and it's designed to be fast, unlike reading back the screen buffer. Here's an NVidia tutorial for developers on how to do it. Not only is it faster, you don't have to worry about what the user is doing with overlapping windows or seeing the cursor in the picture.

    OpenGL supports reading back the screen buffer mostly so that the OpenGL validation suite can check the rendering accuracy. For that, it doesn't have to be efficient. And if you read back in some format other than the actual structure of the framebuffer, every pixel gets converted in software and performance will be awful.

    This article reads like it was written by an overclocker, not a graphics developer.

  60. Machinima could use faster transfer rates by Allen+Varney · · Score: 2

    The nascent art of machinima, which involves using 3D game engines to make desktop movies, could benefit from a practical way to record game output faster. (It would also be nice to export directly to .AVI format for editing in Premiere or Avid, but that's another wishlist.)

  61. What about HDTV? by maddogsparky · · Score: 2
    HDTV has greater rendering needs than PAL or NTSC. 1040 lines * 30 fps is a little over 30 kHz.

    Also, having the ability to render faster means that you can do it faster than real-time. If you are working to a deadline in a TV news studio, that might be a real advantage (think late-breaking news where a story has to be put together during a commercial break).

    --
    science is a religion
    1. Re:What about HDTV? by kasperd · · Score: 1

      1040 lines * 30 fps is a little over 30 kHz.

      30Hz certainly sounds too small to me. I hope this is not really the case, and that the reality is 60Hz interlaced. Now I know some people think this is the same, but it isn't. If you take 30Hz motion and display it in 60Hz interlaced there will be visible steps rather than smooth movement. (Doing the opposite and displaying a 60Hz interlaced motion at 30Hz is going to produce even worse artifacts.)

      --

      Do you care about the security of your wireless mouse?
    2. Re:What about HDTV? by SillySlashdotName · · Score: 1

      1040 lines * 30 fps is a little over 30 kHz.

      30Hz certainly sounds too small to me. I hope this is not really the case, and that the reality is 60Hz interlaced. Now I know some people think this is the same, but it isn't. If you take 30Hz motion and display it in 60Hz interlaced there will be visible steps rather than smooth movement. (Doing the opposite and displaying a 60Hz interlaced motion at 30Hz is going to produce even worse artifacts.)

      Sorry, but the numbers being bandied about were KILOHertz, not Hertz. 30kHz is very much not 30Hz.

      --
      Acts of massive stupidity are almost never covered by warranty. --me.
    3. Re:What about HDTV? by kasperd · · Score: 1

      Sorry, but the numbers being bandied about were KILOHertz, not Hertz. 30kHz is very much not 30Hz.

      The horizontal frequency is 30kHz. I was talking about the vertical frequency, which is always far lower. The comment said 30fps, which should match the vertical frequency in order to achieve good quality.
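
      A rough sketch of the distinction, using the 1040-line figure from the parent comment and ignoring blanking lines (both simplifications). It shows that 30Hz progressive and 60Hz interlaced give the same horizontal frequency but different vertical frequencies; the function name is invented for the example.

```python
def scan_frequencies(visible_lines, refresh_hz, interlaced=False):
    """Return (horizontal_hz, vertical_hz) for a display mode.

    An interlaced mode draws half of the lines on each vertical pass (field).
    """
    lines_per_pass = visible_lines // 2 if interlaced else visible_lines
    horizontal_hz = lines_per_pass * refresh_hz
    return horizontal_hz, refresh_hz


progressive_30 = scan_frequencies(1040, 30)
interlaced_60 = scan_frequencies(1040, 60, interlaced=True)
```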

      --

      Do you care about the security of your wireless mouse?
  62. Re:You're supposed to render to an offscreen buffe by cyranose · · Score: 1

    That's the design, but it doesn't really work that way in practice AFAICT. If I have some geometry in AGP memory, the fastest way seems to be to render it to part of the main framebuffer before the final main rendering. Keeps the context switches low. I haven't yet found a way to preserve VAR settings across context switches, which gets in the way of asynchronous rendering.
    Pbuffers are better suited for when you want to render data that isn't in the same config as the main framebuffer, or want to render and buffer up at other than the main framerate. Besides, there's still a readback required.

  63. Right however, by BoomerSooner · · Score: 1

    The card I'm referring to is of a different architecture. You'll have to take my word for it since it was a while ago.

  64. Allow you to peer over other's shoulders by maddogsparky · · Score: 2
    This would be really nice for collaborative games so you can share your viewpoint with teammates. I think the US army has something similar where their tanks all pass back the info they collect to the platoon leader; turns out that the platoon leaders make much better decisions when the info they have access to is better.

    --
    science is a religion
  65. This is old news; Intel AGP spec was short sighted by PhilFrisbie · · Score: 4, Interesting
    This has been discussed many times on various news groups. Here is my 'Readers Digest' version:

    If you read the AGP spec, which was written by Intel, you will note that it is based on the PCI 2.0 spec. The PCI 2.0 spec is for a 32 bit, 33 MHz symmetric bus which gives you a max transfer rate of 132 MB per second. The AGP spec is for an asymmetric bus, 33 MHz read and 66+ MHz write. But writes were optimized at the expense of reads, since Intel was pushing video with NO onboard texture memory, and who would want to read back the image in real-time anyway, right?!?
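
    The 132 MB per second figure is just bus width times clock. A quick sketch, assuming one transfer per clock and the nominal 33 MHz and 66 MHz clocks quoted above (the function name is made up):

```python
def peak_bandwidth(bus_width_bits, clock_hz):
    """Theoretical peak in bytes per second, one transfer per clock."""
    return bus_width_bits // 8 * clock_hz


pci_33 = peak_bandwidth(32, 33_000_000)   # the PCI-rate path described above
pci_66 = peak_bandwidth(32, 66_000_000)   # the same width at 66 MHz
```

    These are ceilings; protocol overhead and non-burst transfers pull real numbers well below them.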

    Yes, I am sure that drivers do have some effect, but the AGP spec is the first bottleneck. On an OpenGL news group it was reported last year that a person tested two identical video cards, the only difference being one was AGP and the other was PCI. The read performance for the PCI version was several times faster than the AGP version.

    Of course, some video cards are also to blame because of the frame buffer format they use, but that is another story...

  66. Thank you (mod up please) by moogla · · Score: 1

    This is one of the more interesting and compelling reasons for the 128-bit requirement.

    --
    Black holes are where the Matrix raised SIGFPE
  67. Comment removed by account_deleted · · Score: 2

    Comment removed based on user account deletion

  68. Finally a reason for NVidia to OpenSource by iplayfast · · Score: 2

    Follow my reasoning here. I've heard from other articles at /. that Alan Cox (or one of the big name advocates) couldn't think of a reason that would justify NVidia open-sourcing their drivers. There would be no profit for them to do so.

    But if they had, the drivers would have been updated to scratch whoever's itch needed to be scratched. In this case the bandwidth from card to Memory.

    One of the benefits of Open source is that even seldom used features are enhanced, so that when suddenly there is a demand for this the features are in place.

    1. Re:Finally a reason for NVidia to OpenSource by Ziviyr · · Score: 1

      One of the benefits of Open source is that even seldom used features are enhanced, so that when suddenly there is a demand for this the features are in place.

      It also has the benefit of pissing off companies you license "3-D technologies" from after signing NDAs and stuff.

      --

      Someone set us up the bomb, so shine we are!
  69. GPU RAM is not CPU RAM - Film at 11 by NFW · · Score: 2
    I ran into something like this on an application I'm writing... when I first made an MPEG recording of my 3D output, there were no textures. About 3 seconds and one forehead-slap later I realized that the video card's memory (where the rendering happens) isn't on the CPU's bus (where the recording happens).

    It seems the lesson here is that proper captures from video RAM are slow. Yeah, it'd be nice to change that. But how many people really care? Given how long it took anyone to notice, I can't help but think that very very few people really care - and with good reason. Unless you're into making rendered movies, it's irrelevant.

    --
    Build stuff. Stuff that walks, stuff that rolls, whatever.
  70. OT by Anonymous Coward · · Score: 0

    I think it's "Donny Don't"

  71. Not just a software issue by Namarrgon · · Score: 2
    If it were just a driver bug or even a design tradeoff, why is it that all GPUs from any manufacturer are uniformly abysmal? Even an SGI 320 with its UMA design still only gives 18.9 MB/s readback speeds in my tests.

    I asked nVidia at SIGGRAPH why image readback is so slow. They said, no motherboard they know of (not even their own) supports AGP Writes back to the system memory. Without that, you're limited to PCI bandwidth at best, far less than what the AGP spec allows.

    However, we're not even seeing that. Results are showing 1% of what is possible. It's certainly a hardware issue, but there may be a lot of room to improve from the software side, too.

    --
    Why would anyone engrave "Elbereth"?
  72. High-end cards are slow too. by Namarrgon · · Score: 2
    If you want to do high-end video rendering things, perhaps a gaming card isn't the best choice.

    Why is it that a much more expensive Quadro card gives equally slow results? I've run a very similar test on an SGI 320 (shared-memory design) and it only gives 18.9 MB/s.

    Anyone reading this with a Wildcat 6000-series? What does that bench at?

    --
    Why would anyone engrave "Elbereth"?
  73. Reality check by Fujisawa+Sensei · · Score: 1

    What, you think companies are going to let you use a cheap video card for frame grabbing? These suckers were designed for video games and home entertainment purposes, not studio work.

    I noticed that there were no reviews of cards by 3d Labs? I wonder why? Could it be that 3d Labs builds its cards for Professional Graphics users and could care less how things like Quake benchmarks?

    --
    If someone is passing you on the right, you are an asshole for driving in the wrong lane.
  74. I think you just showed us the solution... by Ungrounded+Lightning · · Score: 2
    What kind of idiot puts their most powerful processor at the end of a one way street?

    the kind of idiots who know most people just want the best possible OUTPUT for gaming, and so don't want to add any overhead in card performance - or even additional design time - that isn't related to gaming performance. You know, the idiots who make cards that get award after award from gaming companies, then write near-perfect drivers,


    here it comes...

    port those drivers to linux ...

    Bingo!

    The only problem is in the driver. Hardware's up to the job.

    The driver has been ported to Linux.

    So fix it!

    Closed source? Reverse engineer it.

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
  75. Definitely a driver issue by JensOwen · · Score: 1

    but it is rare that image read back is the best solution. However, when it is, proper use of AGP makes a significant difference. A prototype was done in open source for the Matrox G400 driver, but was never maintained. There were more recent discussions on DRI-devel to bring this functionality back; here is a pointer to the thread:
    http://www.geocrawler.com/mail/thread.php3?subject=%5BDri-devel%5D+proper+ioctls+%28%3F%29++to+export+agp+to+a+user&list=680

    1. Re:Definitely a driver issue by X-Guy · · Score: 1

      Looks more like an app problem to me. Are they using ReadPixels for this benchmark? If they were it would likely use hardware acceleration. Looks more like they're just copying with the CPU from DDraw surfaces. If so, then they got expected results. I've seen around 200 MB/sec with ReadPixels using NVIDIA's binary drivers.

    2. Re:Definitely a driver issue by JensOwen · · Score: 1

      The MGA prototype defined an AGP memory allocator (albeit a weak one) which in turn allowed glReadPixels to exceed 800MB/s. That was a few years ago, and I would guess new chipsets and graphics cards could double that.

    3. Re:Definitely a driver issue by sschaem · · Score: 1

      It's a DX8 benchmark and uses the API/driver to copy the buffer. The benchmark was actually seen delivering 80-130 MB/s on the Win9x platform. It's a driver issue for sure, but it might also be a W2k/XP (in combination with DirectX) API limitation. This is a good reason for people to stick to using OpenGL. But it would be nice if DirectX could deliver the same performance nonetheless. Stephan

  76. Anyone looked at the source for that benchmark? by X-Guy · · Score: 1

    That IS the expected bandwidth you get copying from video RAM to system memory with the CPU. Even having the graphics engine push the data across for you, you'll only get PCI bandwidth. This is the way AGP is SUPPOSED to work. If they're claiming you should be able to get 1Gig/sec copying with the CPU or even with the graphics engine into main memory then they don't understand the way the technology works. That bandwidth only goes in one direction. It has nothing to do with drivers. This is the expected behavior for AGP. If they're copying with the CPU in their benchmark (and it looks like they are from those numbers), they got exactly the expected results. Methinks those guys just don't know what they're doing.

    1. Re:Anyone looked at the source for that benchmark? by sschaem · · Score: 1

      It's not the benchmark test code that copies the data with the CPU; it's all in the driver/API's hands. It's all done in one API call. This same API call, for example, easily delivers 80+ MB/s on the NVIDIA driver and W9x. It's all about the driver implementation.... Stephan

  77. Re:VNC faster, not really. by Rothron+the+Wise · · Score: 1

    Or if you're running tightvnc.

    I'm pretty sure this COULD help vnc performance, if not at the client side, then at the server side, which often spends quite a bit of CPU fetching those pixels.

    --
    A witty .sig proves nothing
  78. The benchmark uses DirectX. by Markmarkmark · · Score: 1

    The benchmark uses DirectX.

  79. Explained WHY Serious Magic need read performance by flowerp · · Score: 1

    Seems to me that many posters here don't have
    an idea why Serious Magic needs fast readback
    rates from the graphic card to the PC's memory.

    SeriousMagic have developed a really cool video compositing engine based on DirectX that allows them to generate realtime video effects, like mapping stuff onto 3D objects, alpha blending, bluebox effects, wipes, fades, etc...

    I've seen a demo of it and I was really impressed.

    Now if you want to encode that stuff into a video stream (like MPEG or Windows Media) you need to read the generated output back into the computer's main memory for CPU based encoding.

    And that's where the bottleneck is. They can't get the data back fast enough.

    So their release of a benchmark application shows how bad the cards actually perform. If they want to put some pressure on card makers to improve performance, that's the way to go.

    --
    --- Eat my sig.
  80. Re:Blame this on... by langed · · Score: 1
    From the article:
    It's remarkable that a graphics card with a video input and video recorder software can record TV-quality images to the PC HD in real-time, yet the same card can't even record it's own renderings at 1/10th this speed.

    Hmm. Way back in the early 80s we had a nifty device known as a "genlock" that converted PC video card output back to NTSC-compliant signals for viewing on a standard TV. These have gotten much better, and I've seen projectors that can handle 1200x1600 or better in true color. I'm just surprised that enthusiasts haven't devised some sort of "loopback" device utilizing one of these. It could theoretically get the data back to the CPU, but it wouldn't help in the way of increasing performance if in fact it is the problem of bad drivers, as the article suggests...

    Of course, I suspect it's not entirely the fault of the drivers; more than likely, there would have to be some near-redundant circuitry to help prevent lagging on the video card.

  81. Re:This is old news; Intel AGP spec was short sigh by sschaem · · Score: 1

    Can anyone confirm that the AGP spec is not symmetric? The fact remains: NVIDIA cards under OpenGL deliver 200 MB/s, yet most cards (including NVIDIA) deliver 15 MB/s with DirectX and XP. Note: we did see the NVIDIA driver deliver 80-130 MB/s under W98. There is a big gap between 10 MB/s and 200 MB/s. Another factor is that in that benchmark 100% of the processor is used (the driver uses a CPU loop instead of DMA), and it's very probable that OpenGL uses DMA and frees up the processor. So 200 MB/s 'only', but with little processor usage. Double bonus! Stephan

  82. Re:Blame this on... by sschaem · · Score: 1

    This does not explain why the benchmark works 8 to 13 times faster under Win9x, or why people are posting results of 130-200 MB/s using OpenGL. It's a software issue. No need to invent elaborate HW hacks. Stephan

  83. AGP is effectively a one way bus, by design. by OverCode@work · · Score: 2

    I spent most of the summer working on AGP driver bugs, so let me clarify a few things.

    AGP was designed by Intel as an ad hoc solution to combat the problem of transferring large textures to a graphics card over the PCI bus. It's an extension to PCI, essentially, allowing fast, pipelined, ONE-WAY transfers. That should be repeated. AGP is PCI, with a different connector, and a bunch of extra pins and logic for pipelined transfers from system memory to the card. In fact, without "fast writes" enabled, CPU -> graphics card writes are plain PCI; only transfers requested BY THE CARD are accelerated.

    There is nothing new about this. It's in the spec.

    It is NOT meant to be a two-way bus. It was never designed for offloading cinematic rendering to the card, for later recovery. AGP came out around 1997, before NVIDIA or ATI had shaders in hardware. PC rendering was nowhere near photorealistic at the time; that was the domain of software raytracers. Without AGP, video cards seriously hog the PCI bus with their texture streaming. That is ALL that AGP fixes.

    The real solution is to come up with a new bus. I tend to like unified memory architecture designs, but they have disadvantages as well. The real trouble is getting the PC industry to agree on anything; if ATI came up with a new bus standard, for instance, I doubt NVIDIA or Matrox would adopt it, not wishing to appear to submit to their competitor.

    -John

    1. Re:AGP is effectively a one way bus, by design. by sschaem · · Score: 1

      The Real3D / Intel 'partnership' resulted in the Starfighter card. This card was VIVO ready and supported DMA both ways across the AGP bus. You could download from the card at 20 MB/s effortlessly using a Pentium classic. Yet a 2.5 GHz Pentium 4 can't get much more than 10 MB/s with a GeForce4 under XP using 100% of the processor's resources. Interestingly, OpenGL on similar HW can deliver 20x that performance. No new HW specification is needed... updated drivers are needed. Stephan

  84. Wow, is this a new product opportunity or what? by Gldm · · Score: 1

    Ok, so the problem is there's too much frame data and no way to get it back over the bus to the system for storage. Proposed solution: Separate capture device. Method of connection: DVI digital output. All modern graphics cards have digital DVI out these days. Most of them can run it simultaneously with a VGA monitor or second DVI, depending on your card. Some can even fullscreen an app on one interface while having it windowed on the other. Perfect for this.

    So, you tap the DVI into a custom piece of hardware designed to do the capture. Say a box with a couple hundred megs of RAM (what's a gig of PC133 these days, $80?) with the bandwidth to do the capture (PC133 = 1066MB/sec, well above the 225MB/sec estimate from another post). Then you add in a hardware compression chip a la MPEG2 hardware encoder or MPEG4 hardware encoder, whichever you please. Then dump the compressed result to a hard drive. Hell, I bet you could put this entire thing except the hard drive on a PCI card, put it in a slot, and run your video card's DVI out to the PCI card's input port, then capture back to the disk.

    All you need is a way to decode DVI. Since it's already a digital signal designed to display an image, I don't think decoding it to, say, a TGA format would be that impossible to do. After all, LCDs have to have some kind of decoder for it, right? Is this really any less feasible than those old MPEG2 PCI decoders that used pass-through connections to the video card? I mean it'll need the RAM for temp storage and probably a bit more processing power to encode instead of decode, but I don't see it being unfeasible. I bet you could mass produce one for $299 that worked with any 3D card on the market.

    Need it right now? Two PCs, one renders, one captures. Optimize each box for its task. One with a fast CPU, fast GPU, the other with a vid capture/hardware encoding card, and RAID array. Of course then you could only capture output dependent on the source machine, so doing individual frames might be slightly tricky, but I'm betting the timing issues for syncing could be worked out in software.
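
    A rough sizing sketch for the capture box, using the 1066 MB/sec and 225 MB/sec figures quoted above. The 256 MB buffer is an assumption standing in for "a couple hundred megs", and the function name is invented.

```python
def capture_box_budget(ram_bandwidth_mb_s, capture_mb_s, buffer_mb):
    """Return (bandwidth headroom factor, seconds of raw video the buffer holds)."""
    headroom = ram_bandwidth_mb_s / capture_mb_s
    buffer_seconds = buffer_mb / capture_mb_s
    return headroom, buffer_seconds


# buffer_mb=256 is a guess, not a figure from the post.
headroom, buffer_seconds = capture_box_budget(1066, 225, buffer_mb=256)
```

    With those inputs the memory has several times the needed bandwidth, but the buffer only holds about a second of uncompressed frames, so the hardware encoder would have to keep pace with the incoming stream rather than catch up later.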

    --

    Introducing the new Occam Fusion! Now with sqrt(-1) fewer blades!

  85. Sheesh by alexburke · · Score: 2

    Someone build a bloody box with a DVI input and a gigabit ethernet port on it. Connect DVI out of video card to DVI input on our magic box, gigabit Ethernet on the box to gigabit ethernet on the PC. As each frame is generated, capture it and spew it back to the PC over the ethernet, then ask the custom software on the PC (via a packet from the magic box) to put the next frame over the DVI.

    Lather, rinse, repeat.

    Won't be cheap, but someone could almost certainly whip one up with a Xilinx FPGA. I know they make one with a built-in TMDS receiver, which is what you'd need to decode the DVI signal.

  86. DMI by berowne · · Score: 1

    Why not plug a capture device into the DMI port?

  87. Misleading headline, much. by rakslice · · Score: 2

    "AGP Texture Download Problem Revealed"

    "AGP Texture Download Problem" implies that there's a problem downloading textures via AGP from main memory. But it's not about texture transfers at all, it's about transfers of rendered frames back to the system (in the opposite direction).

    Hey, 'Taco... You're the high point of the /. editing staff; your readership is depending on you to drag the other editors up the bell curve kicking and screaming by your example. Don't give up now. =)

  88. Use the Digital output. by sbaker · · Score: 2

    I'm not surprised at this - when you spend your effort optimising for
    output, dragging that final image back up to the input is kinda like
    running up a downward moving escalator...you *can* do it - but you
    probably shouldn't.

    It seems to me that if you are rendering movies with this technology,
    you are either a small operation who can probably afford to wait (say)
    10x longer than realtime to do it - or you are some big production house
    who can afford to do better.

    In those cases, why not simply stick a frame-grabber onto the digital output?

    Heck you can even get around the 8 bits-per-component problem by using a
    fragment shader to render the high order bits to red and the middle bits
    to green and the low order bits to blue - then do three passes to render
    the Red component of your image at 24 bits per pixel, then the green, then
    the blue.
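
    A CPU-side sketch of the bit-splitting trick described above, assuming a 24-bit integer value per component. A real fragment shader would do this with floating point arithmetic on the card; the helper names here are made up, and the point is only that the three 8-bit channels carry the full value through an 8 bits-per-component output.

```python
def split_24bit(value):
    """Split a 24-bit value into (high, middle, low) bytes for R, G and B."""
    if not 0 <= value < (1 << 24):
        raise ValueError("value must fit in 24 bits")
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


def join_24bit(red, green, blue):
    """Rebuild the 24-bit value from a captured 8-bit-per-channel pixel."""
    return (red << 16) | (green << 8) | blue


# One pass per image component: encode, capture the 8-bit pixel, decode.
encoded = split_24bit(0x123456)
decoded = join_24bit(*encoded)
```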

    Using the downstream performance to your advantage is the way to go.

    The title of this article (which talks about "Texture Download") is most
    confusing because that's a term usually used to describe the process of
    taking a texture map out of the CPU and stuffing it into the graphics
    card's texture memory.

    This is more like "Screen Dump Upload".

    --
    www.sjbaker.org
  89. WTF by Anonymous Coward · · Score: 0

    wtf are you all talking about? computers are crap...

  90. Re:You're supposed to render to an offscreen buffe by luc-fr · · Score: 1

    My experience with pbuffers on NVidia cards is that you get even lower performance for the readback phase! If you copy-to-texture, it's OK. Since then, I have switched back to regular frame-buffer rendering. Luc.

  91. I disagree. by mmol_6453 · · Score: 1

    Have you ever compared software-rendered GL (read, Mesa) to even an old Voodoo1? The difference in time required is staggering. I always knew when my GL drivers weren't working right because it took 15 seconds to render one frame.

    Now scoot ahead to five years from now, when 3d accelerators take their data in the form of some .pov/.diff hybrid, or a .pov derivative with motion thrown in. These new, cutting edge video cards are *capable* of onboard raytracing at 16000x12000. (sic) Suddenly, rendering Toy Story III in native IMAX sounds real good right now...

    (On a side note... throw in dedicated mass-spring simulation hardware for fluid and materials emulation. [drool])

    --
    What's this Submit thingy do?
  92. Capturing FPS play by mmol_6453 · · Score: 1

    Capturing FPS play would be absolutely perfect for all the mod developers out there for whom still-picture screenshots don't do their mod justice.

    --
    What's this Submit thingy do?
  93. Just do it baby by Anonymous Coward · · Score: 0
  94. DirectX vs. OpenGL readback, with benchmarks by Namarrgon · · Score: 2
    OK, some facts for the melting pot, even if a little late.

    I wrote a benchmark last night that did DirectDraw and OpenGL pixelblock transfers, both ways across the AGP bus. Now, I wouldn't call my results totally rigorous (there are various versions of drivers, no Win9x machines, a couple WinXP & the rest are Win2k), but I ran many of them multiple times, on a selection of machines/cards, & got pretty consistent numbers each time. Also, the DirectDraw readback numbers agreed fairly closely with the Serious Magic Direct3D results.

    (Write denotes system to gfx card, Read denotes gfx card to system)

    ATI Radeon 8500DV / P4 1.4 GHz

    DDraw Write: 358.20 MB/s Read: 6.70 MB/s
    OpenGL Write: 56.36 MB/s Read: 96.60 MB/s

    ATI Radeon 7200 / Athlon 2100+ x 2

    DDraw Write: 345.04 MB/s Read: 12.26 MB/s
    OpenGL Write: 50.93 MB/s Read: 75.83 MB/s

    ATI Radeon 7200 / Athlon 1700+ x 2

    DDraw Write: 347.28 MB/s Read: 12.24 MB/s
    OpenGL Write: 51.06 MB/s Read: 107.21 MB/s

    ATI Rage 128 PCI / Celeron 300A @ 450 MHz x 2

    DDraw Write: 113.75 MB/s Read: 8.54 MB/s
    OpenGL Write: 47.98 MB/s Read: 2.58 MB/s

    nVidia Quadro DCC / P4 Xeon 1.5 GHz x 2

    DDraw Write: 265.70 MB/s Read: 8.67 MB/s
    OpenGL Write: 482.03 MB/s Read: 157.60 MB/s

    nVidia GeForce 4MX 440 / P4 Xeon 1.7 GHz x 2

    DDraw Write: 315.47 MB/s Read: 8.67 MB/s
    OpenGL Write: 411.88 MB/s Read: 126.17 MB/s

    SGI 320 Cobalt / P3 450 MHz x 2

    DDraw Write: 189.52 MB/s Read: 18.92 MB/s
    OpenGL Write: 304.52 MB/s Read: 183.97 MB/s

    Matrox G400 / Celeron 433 MHz x 2

    DDraw Write: 133.27 MB/s Read: 11.33 MB/s
    OpenGL Write: 2.42 MB/s Read: 2.17 MB/s
    A few things struck me:

    - OpenGL does WAY faster readbacks, especially on nVidia hardware.
    - OpenGL is faster for writes too, on nVidia, but a lot slower on ATI
    - ATI seem to optimise more for DirectX
    - The SGI's unified memory architecture does help, though not as much as I would have expected.
    - Matrox's OpenGL drivers sucked big time.
    - These numbers would look better in one of Damage's graphs.
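
    To put a number on the readback gap, here is a small sketch that takes the figures posted above and computes how much faster the OpenGL read is than the DirectDraw read on each machine. The `results` layout and the `read_ratio` helper are just illustrative names; the MB/s values are copied from the list above.

```python
# MB/s figures from the results above: (ddraw_write, ddraw_read, gl_write, gl_read)
results = {
    "ATI Radeon 8500DV / P4 1.4 GHz": (358.20, 6.70, 56.36, 96.60),
    "ATI Radeon 7200 / Athlon 2100+ x 2": (345.04, 12.26, 50.93, 75.83),
    "ATI Radeon 7200 / Athlon 1700+ x 2": (347.28, 12.24, 51.06, 107.21),
    "ATI Rage 128 PCI / Celeron 300A @ 450 MHz x 2": (113.75, 8.54, 47.98, 2.58),
    "nVidia Quadro DCC / P4 Xeon 1.5 GHz x 2": (265.70, 8.67, 482.03, 157.60),
    "nVidia GeForce 4MX 440 / P4 Xeon 1.7 GHz x 2": (315.47, 8.67, 411.88, 126.17),
    "SGI 320 Cobalt / P3 450 MHz x 2": (189.52, 18.92, 304.52, 183.97),
    "Matrox G400 / Celeron 433 MHz x 2": (133.27, 11.33, 2.42, 2.17),
}

def read_ratio(row):
    """OpenGL readback speed divided by DirectDraw readback speed."""
    _, ddraw_read, _, gl_read = row
    return gl_read / ddraw_read

for name, row in results.items():
    print(f"{name:48s} GL/DDraw read: {read_ratio(row):6.2f}x")

# Machines where the OpenGL readback beats the DirectDraw one.
faster_gl = [name for name, row in results.items() if read_ratio(row) > 1.0]
```

    By these figures the OpenGL readback wins on six of the eight machines; the Rage 128 PCI and the Matrox G400 are the two where it is slower.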

    Anyway, I'm convinced that there are no particular hardware problems involved, other than perhaps readback being limited to PCI66 speeds. I have no idea why DirectX readbacks are so much slower - can it really be that every single company just hasn't bothered to optimise this path, even though they have for OpenGL? Or is there something within DirectX itself that's holding them all back?

    --
    Why would anyone engrave "Elbereth"?
    1. Re:DirectX vs. OpenGL readback, with benchmarks by Anonymous Coward · · Score: 0

      Wow, we didn't know that there would be such a big difference between OpenGL and DirectX in bandwidth.

      It seems that ATI has spent much effort on its DirectX drivers and that NVidia has put much effort into its OpenGL ones.

      As for reading back from the video card, we can see many areas in which this could speed things up (accelerating radiosity, for example). Since the NVidia cards have achieved such high bandwidth when reading from the card, we agree that the problem is not the card hardware. It could be the software drivers, but it could also be the motherboard chipset, which, in our experience, can make a huge difference in AGP writes to the card. So we wouldn't be surprised if there is also a huge (maybe bigger) difference in AGP reads from the card.

  95. Re:This is old news; Intel AGP spec was short sigh by PhilFrisbie · · Score: 1
    If you want confirmation that AGP is asymmetric, then read the spec!

    Also, as I already stated, the MAX read is ~132 MB per second (33 MHz bus, 32 bits wide). There is no way you could get a transfer rate of 200 MB per second.

    I already hinted that the frame buffer format could be a bottleneck. For example, the frame buffer might not be in a 'normal' RGB or RGBA format. It could be in a format like ARBG, BRG, or even something exotic like RGBAZZZS (Z-buffer, Stencil buffer). And it is very likely the scan lines are not contiguous, so the padding will need to be skipped.
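
    To illustrate the extra work that implies, here is a minimal sketch of turning a padded frame buffer with an unusual channel order into tightly packed RGB. Everything in it is an assumption for illustration: `unpack_frame`, the `pitch` parameter (bytes per scan line including padding) and the one-byte-per-channel layout are hypothetical, not any particular card's or driver's format.

```python
def unpack_frame(raw, width, height, pitch, order="BGRA"):
    """Copy a padded frame buffer into tightly packed RGB bytes.

    raw   -- bytes as read back from the card (hypothetical layout)
    pitch -- bytes per scan line, including any trailing padding
    order -- channel order of each source pixel, one byte per channel
    """
    bpp = len(order)
    if pitch < width * bpp:
        raise ValueError("pitch is smaller than one row of pixels")
    r, g, b = order.index("R"), order.index("G"), order.index("B")
    out = bytearray(width * height * 3)
    for y in range(height):
        # Take only the pixel bytes of this scan line; the padding is skipped.
        row = raw[y * pitch : y * pitch + width * bpp]
        dst = y * width * 3
        # Reorder channels into R, G, B using strided slice assignment.
        out[dst + 0 : dst + width * 3 : 3] = row[r::bpp]
        out[dst + 1 : dst + width * 3 : 3] = row[g::bpp]
        out[dst + 2 : dst + width * 3 : 3] = row[b::bpp]
    return bytes(out)

# 2x2 BGRA image with 4 bytes of padding after each scan line (pitch = 12).
pad = bytes([0xEE] * 4)
raw = (bytes([1, 2, 3, 255, 4, 5, 6, 255]) + pad +
       bytes([7, 8, 9, 255, 10, 11, 12, 255]) + pad)
rgb = unpack_frame(raw, width=2, height=2, pitch=12, order="BGRA")
```

    Every pixel gets touched once per channel on the CPU, which is the sort of per-frame cost that could eat into the raw bus rate.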

  96. Re:This is old news; Intel AGP spec was short sigh by sschaem · · Score: 1

    AGP is 66 MHz... Have you yourself read the spec? AGP 4x is 66 MHz, 32-bit, quad-pumped. And are you calling people reporting 130-200 MB/s reads under OpenGL liars? Example: nVidia Quadro DCC / P4 Xeon 1.5 GHz x 2 OpenGL Write: 482.03 MB/s Read: 157.60 MB/s. Email d a n i e l @ e y e o n l i n e . c o m if you want to prove yourself wrong. Stephan
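
    The two ceilings being argued over in this sub-thread are simple arithmetic: clock times bus width times transfers per clock. A rough sketch, using the rounded 33 MHz and 66 MHz clocks quoted in the posts (`peak_mb_per_s` is just an illustrative helper name):

```python
def peak_mb_per_s(clock_mhz, bus_bits, transfers_per_clock=1):
    """Theoretical peak in MB/s: clock (MHz) x bus width (bytes) x transfers per clock."""
    return clock_mhz * (bus_bits // 8) * transfers_per_clock

pci_33 = peak_mb_per_s(33, 32)       # the ~132 MB/s figure from the parent post
agp_1x = peak_mb_per_s(66, 32)       # 66 MHz, 32-bit, one transfer per clock
agp_4x = peak_mb_per_s(66, 32, 4)    # quad-pumped

quadro_gl_read = 157.60  # MB/s, the Quadro DCC OpenGL read quoted above

print(f"PCI 33 MHz / 32-bit peak: {pci_33} MB/s")
print(f"AGP 1x peak:              {agp_1x} MB/s")
print(f"AGP 4x peak:              {agp_4x} MB/s")
print(f"Quadro OpenGL read is {quadro_gl_read / agp_4x:.1%} of the AGP 4x peak")
```

    The quoted 157.60 MB/s read is above the 33 MHz, 32-bit ceiling but below even the AGP 1x figure, so it is consistent with a 66 MHz bus without coming anywhere near the 4x peak.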