Slashdot Mirror


Add Another Core for Faster Graphics

Dzonatas writes "Need a reason for extra cores inside your box? How about faster graphics? Unlike traditional faster GPUs, raytraced graphics scale with extra cores. Brett Thomas writes in his article Parallel Worlds on Bit-Tech, 'But rather than working on that advancement, most of the commercial graphics industry has been intent on pushing raster-based graphics as far as they could go. Research has been slow in raytracing, whereas raster graphic research has continued to be milked for every approximate drop it closely resembles being worth. Of course, it is to be expected that current technology be pushed, and it was a bit of a pipe dream to think that the whole industry should redesign itself over raytracing.' A report by Intel about Ray Tracing shows that a single P4 3.2GHz is capable of 100 million raysegs, which gives a comfortable 30fps. Intel further states 450 million raysegs is when it gets 'interesting.' Also, quad cores are slated to be available around the turn of the year. Would octacores bring us dual screen or separate right/left real-time raytraced 3D?"

237 comments

  1. need a reason by b1ufox · · Score: 5, Funny

    Need a reason for extra cores inside your box? No :)

    --
    -- "Genius is 1% inspiration and 99% perspiration" - TAE --
  2. I need more cores... by Anonymous Coward · · Score: 0

    What about 2 cores for the OS (one for the system idle process and the other for the working processes), and 5 other cores (4 cores dedicated to the first 4 applications I launch and the other for misc. activities)?

    1. Re:I need more cores... by Anonymous Coward · · Score: 1, Interesting

      I've been thinking about the same thing, but only with 1 core dedicated to the Operating System itself, without the drivers for the various peripherals. This way it should become a lot easier to make a crash-free OS: one core (the OS one) has a higher priority than the other(s), and the only code inside that core is some kind of busy loop checking if the other core is still working as planned. Perhaps of course it's already been done by Sun/IBM/... and I'm busy reinventing the wheel.

    2. Re:I need more cores... by Don_dumb · · Score: 2, Interesting
      What about 2 cores for the OS (one for the system idle process and the other for the working processes
      A core for an idle process?
      I am not an expert in OSes, but I thought that the idle process just gave the CPU something to do while it waited for a working process (the idle process just allowed the working one to butt in whenever something came along).
      Wouldn't creating a core just to do nothing be hardware bloat at its most absurd?
      Or am I showing my ignorance just a bit too openly?
      --
      If this were really happening, what would you think?
    3. Re:I need more cores... by oojah · · Score: 1

      The so-called idle process just shows how much of your processor time isn't in use. Creating a core for it would be akin to creating a core to sit there and do nothing as you say. It wouldn't run the idle process; it would just do nothing. In reality of course, it would get utilised for other jobs.

      --
      Do you have any better hostages?
    4. Re:I need more cores... by Anonymous Coward · · Score: 1, Funny

      I offer a web service whereby you can offload your system idle process to my servers for a small fee. The servers are specially optimized for the system idle process and it only takes 1-2% of the CPU so I can run 50-100 system idle processes per box. It's quite revolutionary actually. Should save you a bunch on extra cores to run this process in the future. It's not well known but generally the OS has a system idle process on every core so that's 4x the savings if you have a quad-core box.

      Our SLA even specifies how infrequently the idle process is run - we can do it as idle as you like, for more money.

    5. Re:I need more cores... by Anonymous Coward · · Score: 0

      You're right, he's mad. He probably has an extra hard drive full of "spare" 1s and 0s to write to his main drive when he saves a file.

      On topic:
      Sooooo, how 'bout that raytracin' then, eh?

    6. Re:I need more cores... by Anonymous Coward · · Score: 1, Funny

      You've been flogging this service on /. for long enough, no? It's a waste of money.

      You can eliminate the Idle Process just by downloading one of the "idle process crackers". It's a small piece of OSS code which kills the Idle Process or limits the amount of time it can use. It's installed by default with Seti@Home and Folding@Home clients (since they need all the horsepower they can get!)

    7. Re:I need more cores... by somersault · · Score: 1

      I think he may have been making what's known in the industry as a 'joke'.

      Well I hope so anyway.

      --
      which is totally what she said
    8. Re:I need more cores... by somersault · · Score: 1

      If that sounds too complicated for you, you could just install an Adobe Acrobat product instead

      --
      which is totally what she said
    9. Re:I need more cores... by TomorrowPlusX · · Score: 1

      Right -- it's like putting your swap file on a ram disk. Pure genius!!

      --

      lorem ipsum, dolor sit amet
    10. Re:I need more cores... by rblum · · Score: 1

      Yep, you are ;)

      It's called a 'watchdog'. Really old systems (i.e. the stuff I worked on ;) just used hardware for this - a dead man's switch you had to regularly reset from your CPU. If you fail to do that, reboot. (And reboot back then was running 37 instructions to get into a defined state - none of that fancy hard disk stuff)
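
A minimal software sketch of the dead man's switch described above. The `Watchdog` class, its method names, and the injectable `clock` parameter are all hypothetical, chosen for illustration; a real hardware watchdog would trigger a reset line rather than return a flag.

```python
import time


class Watchdog:
    """Software stand-in for a hardware dead man's switch (hypothetical class)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock            # injectable so the sketch can be tested
        self.last_kick = clock()

    def kick(self):
        # The supervised code must call this regularly to prove it is alive.
        self.last_kick = self.clock()

    def expired(self):
        # True once the supervised code has been silent for longer than the
        # timeout; a real system would reboot at this point.
        return self.clock() - self.last_kick > self.timeout_s
```

The supervising side only needs to poll `expired()`; everything else about the supervised code stays out of its way, which is roughly the appeal of the approach.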

    11. Re:I need more cores... by Drooling+Iguana · · Score: 1

      It won't make a difference for much longer, though, since current reports say that Windows Vista will not make use of the System Idle Process.

      --
      ... I'm addicted to placebos
    12. Re:I need more cores... by Don_dumb · · Score: 1

      I have to admit I really couldn't tell.

      --
      If this were really happening, what would you think?
    13. Re:I need more cores... by ErroneousBee · · Score: 1

      What, you mean like s/390 architecture?

      --
      **TODO** Steal someone elses sig.
  3. Need a reason for extra cores inside your box? by Anonymous Coward · · Score: 0, Funny

    Who needs extra cores when you've got free hardcore?

    1. Re:Need a reason for extra cores inside your box? by Lex-Man82 · · Score: 1

      You need at least two for that. One for the porn and the other for the spyware.

  4. Isn't it by TVAFR · · Score: 0

    why we all use a GPU? The graphics card is the additional CPU that handles all our graphics needs.

    We all know that the extra cores that will be in our processors in the near future are for running malware/spyware while we do our work on the main core.

  5. How many do I need by Moraelin · · Score: 5, Funny

    Lemme see, at this rate I'll need: 9 cores for the raytracer, 7 cores for the physics simulation, 5 for the AI, 3 for the OS, and of course

    One core to rule them all
    One core to find them
    One core to bring them all
    And in the darkness bind them ;)

    --
    A polar bear is a cartesian bear after a coordinate transform.
    1. Re:How many do I need by Freaky+Spook · · Score: 5, Funny

      One core to rule them all

      One core to find them

      One core to bring them all

      And in the darkness bind them

      You just installed Vista onto that rig didn't you

      *Ducks*

    2. Re:How many do I need by legoburner · · Score: 4, Funny
      Lemme see, at this rate I'll need: 9 cores for the raytracer, 7 cores for the physics simulation, 5 for the AI, 3 for the OS

      Yeah, but asteroids will look AMAZING!
    3. Re:How many do I need by Ihlosi · · Score: 3, Insightful
      One core to rule them all
      One core to find them
      One core to bring them all
      And in the darkness bind them ;)



      You must be talking about the one core that's part of the TPM.

    4. Re:How many do I need by Frightening · · Score: 2, Funny

      Yeah, and you'll need a team of dedicated hobbits to buy all that shit and put it together.

      Also, you'll probably need a Kandalf case to put it all in.

    5. Re:How many do I need by TheOrquithVagrant · · Score: 2, Funny

      > And in the darkness bind them

      I see you're sensibly predicting the first game to use this rendering technology will be "Doom 4",
      which still won't provide ducttape for the flashlight.

    6. Re:How many do I need by Anonymous Coward · · Score: 3, Funny

      It doesn't meet the minimum system requirements for Vista

    7. Re:How many do I need by SScorpio · · Score: 1

      Umm... asteroids is "vector"-based not raytraced. Raytracing is what programs like 3d Studio Max do.

    8. Re:How many do I need by somersault · · Score: 1

      The best looking version of asteroids I ever played was on my A1200, 2D vector based with great particle effects for the explosions - wish I remembered the name of it!

      --
      which is totally what she said
    9. Re:How many do I need by Anonymous Coward · · Score: 0

      Bravo!

    10. Re:How many do I need by Drooling+Iguana · · Score: 1

      An Earth-like planet with a water-filled core just doesn't make any sense.

      --
      ... I'm addicted to placebos
  6. Gaming by Anonymous Coward · · Score: 5, Interesting

    There are already ray traced games. :O

    http://graphics.cs.uni-sb.de/~morfiel/oasen/

    1. Re:Gaming by Vario · · Score: 5, Informative

      They managed to get reasonable frame rates with an FPGA board, which is rather slow compared to modern GPUs. A lot of special effects like diffraction are included and don't kill the framerate. This might be a very interesting alternative to more texels/s and shaders.
      It just looks good as well: http://graphics.cs.uni-sb.de/~woop/rpu/rpu.html

    2. Re:Gaming by Anonymous Coward · · Score: 0
      There are already ray traced games. :O

      http://graphics.cs.uni-sb.de/~morfiel/oasen/


      DivX video only there. Not quite a game.
  7. It's been done... by SigILL · · Score: 4, Interesting

    F.A.N. released a real-time raytraced demo at Breakpoint back in 2003. It does no more than 10 fps on my lowly 1GHz P3, but I'm sure it runs quite smoothly on a nice modern CPU (though I don't think it's multithreaded).

    --
    Error: password can't contain reverse spelling of ancient Chinese emperor
    1. Re:It's been done... by SigILL · · Score: 1

      Oh hey, only now I notice.. they've been at it since (at least) 2000.

      --
      Error: password can't contain reverse spelling of ancient Chinese emperor
    2. Re:It's been done... by Anonymous Coward · · Score: 1, Funny

      Oh? I released my real-time raytraced intro in 1996. It did 10fps on my lowly 66MHz 486. Uphill. Both ways. With 96 kilobytes of data. And snow.

    3. Re:It's been done... by Yetihehe · · Score: 2, Informative

      Yup, and Heaven seven is even good looking :)

      --
      Extreme Programming - Redundant Array of Inexpensive Developers
    4. Re:It's been done... by Anonymous Coward · · Score: 0

      Bah, my Amiga could do that in 1986! And play MP3's at the same time.

    5. Re:It's been done... by Anonymous Coward · · Score: 0

      Link please ;)

  8. That should read 450 million raysegs by manjunaths · · Score: 3, Informative

    Each core is already capable of doing 100 million raysegs and you talk about quad cores. So I think you mean
    450 million raysegs not 450 raysegs.

    --
    Slashdot: Tabloid for the nerds. Stuff that doesn't matter.
    1. Re:That should read 450 million raysegs by quenda · · Score: 1, Insightful

      +4 informative!? Where's the "-1 bleedin' obvious" mod? Are there not enough serious errors in the summaries to worry about?

    2. Re:That should read 450 million raysegs by Rich+Klein · · Score: 1

      You're absolutely right. I actually SkimmedTFA, and the figure is 450M raysegs/s. "Interesting" means 30fps at 1 megapixel and 15 raysegs per pixel. Frankly, I don't find 1 megapixel all that interesting. I want graphics at my LCD's native 1600x1200, nearly 2 megapixels (and that desire will change when I'm forced to buy a 16x9 monitor some day). 1 megapixel should be a bit below 1280x1024, but better than 1024x768.

      --
      -Rich
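
The budget quoted above is a straight product of resolution, frame rate, and raysegs per pixel. A small sketch of that arithmetic follows; the function name is made up for illustration, and the 1000x1000 frame is an assumed stand-in for "1 megapixel".

```python
def raysegs_per_second(width, height, fps, raysegs_per_pixel):
    """Ray segments needed per second for a given resolution and frame rate."""
    return width * height * fps * raysegs_per_pixel


# "Interesting": 30fps at 1 megapixel and 15 raysegs per pixel.
interesting = raysegs_per_second(1000, 1000, 30, 15)   # 450_000_000

# The same settings at a native 1600x1200 panel.
native_lcd = raysegs_per_second(1600, 1200, 30, 15)    # 864_000_000

# Pixel counts of the two resolutions that bracket 1 megapixel.
pixels_1280x1024 = 1280 * 1024                          # 1_310_720
pixels_1024x768 = 1024 * 768                            # 786_432
```

Under these assumptions, 1600x1200 needs a little under twice the "interesting" budget.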
    3. Re:That should read 450 million raysegs by empaler · · Score: 1

      I was actually about to rant about how the 30fps figure was given without any mention of resolution. Seriously, 30fps at 1MP is not very impressive.

    4. Re:That should read 450 million raysegs by MBGMorden · · Score: 2, Funny

      That's nothing. As long as you're running an Intel chip with a class-G phase varying containment field you should be able to reverse the polarity of the fluxing core to match that of the capaciting core, and then temporally render twice that much. That's assuming that you have a 1.21 Jiggawatt PS (I would personally recommend the 1.8 Jiggawatt unit from PC Power & Cooling just to give some breathing room).

      --
      "People who think they know everything are very annoying to those of us who do."-Mark Twain
  9. Put it on the GPU by TheRaven64 · · Score: 5, Interesting
    The thing about ray tracing is that it's the archetypal embarrassingly parallel problem that makes heavy use of floating point arithmetic. The thing about GPUs is that they are incredibly parallel processors optimised for floating point operations.

    Take a look at the proceedings from any graphics conference in the last three or four years, and you will see several papers which involve ray-tracing on a GPU. Actually, not so many recently, because it's been done to death. The most impressive one I saw was at Eurographics in 2004 running non-linear ray tracing. As the rays advanced, their direction was adjusted based on the gravity of objects in the scene. The demo (rendered in realtime) showed a black hole moving in front of a stellar scene and all of the light effects this caused.

    --
    I am TheRaven on Soylent News
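
A rough 2D sketch of what "non-linear ray tracing" means: instead of following one straight line, the ray is marched in small steps and its direction is nudged toward a mass at each step. This is an illustrative inverse-square nudge with made-up names and parameters, not the method from the Eurographics demo and not a relativistic model.

```python
import math


def march_ray(pos, direction, mass_pos, strength, step=0.01, n_steps=2000):
    """March a unit-speed 2D ray, bending it toward mass_pos at every step.

    Assumes the ray never passes exactly through mass_pos (r would be zero).
    Returns the final position and the final unit direction.
    """
    x, y = pos
    dx, dy = direction
    for _ in range(n_steps):
        rx, ry = mass_pos[0] - x, mass_pos[1] - y
        r2 = rx * rx + ry * ry
        r = math.sqrt(r2)
        # Inverse-square pull along the unit vector toward the mass.
        dx += strength * rx / (r2 * r) * step
        dy += strength * ry / (r2 * r) * step
        # Renormalise so the ray keeps unit speed and only changes direction.
        norm = math.hypot(dx, dy)
        dx, dy = dx / norm, dy / norm
        x += dx * step
        y += dy * step
    return (x, y), (dx, dy)
```

A ray started at `(-10, 1)` heading along +x past a mass at the origin ends up pointing slightly downward; with `strength=0` it stays perfectly straight.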
    1. Re:Put it on the GPU by Xymor · · Score: 1

      What about a Cell-based GPU? Cell is one order of magnitude higher in floating point processing, so wouldn't this fix the bottleneck?

    2. Re:Put it on the GPU by smallfries · · Score: 4, Interesting

      The problem with raytracing researchers is that they are incredibly myopic. *Everybody* should use raytracing for *everything* because it is superior to raster in *every case*. Well, bullshit. Take a look at the raytracing results people have posted links to, and then watch the video of Crysis. The problem is not raytracing, but geometric complexity. Raytracing does not scale nicely with the amount of geometry - mainly because of the shadow rays that have to be scattered from each intersection. The 100mil figure assumes about 100 rays per pixel. Well, you need 64 of them just to get around aliasing, and that doesn't leave many for ambient and shadow bounces.

      But the GPU is interesting for raytracing. As it moves closer towards a giant floating point vector machine the motivating application will become raytracing. So at the moment a 7800 GTX can push 280 Gflops. That is 2800 cycles per ray for a single frame. (BTW Intel's figures in the article are bullshit. 100mil rays at 30fps = 3 billion rays per second. Roughly one ray per cycle on average. They are counting a huge number of rays that have been optimised out of the scene, e.g. shadows, or interpolated from previous frames using a cache).

      The raw horsepower is getting there on the card but at the moment the communication soaks up all of the time. Raytracing is the poster-child problem for parallelisation - assuming that you have random access (readable) global memory. If you need to partition the memory into the compute nodes it begins to get harder. In a GPU building datastructures to hold the information is the bottleneck, and it drops the speed by factors of 100s or 1000s. Nvidia and ATi have given the general-purpose community hints that they will improve performance in reading data-structures so this particular roadblock may disappear. A real scatter operation in the fragment shader would be nice, but you would have to gut the ROPs in order to do it. This may happen anyway as the local-area operations that the ROPs compute could fold into fragment operations. To increase the write bandwidth in the card the retirement logic needs to start retiring 'pages' of pixels anyway, over a much wider bus. Otherwise the number of feasible passes per pixel will always be capped by the speed that the ROPs can retire the data.

      So given how hard it would be to *efficiently* raytrace on a GPU - why bother when you can throw so much more raw horsepower at faking it with cheap raster techniques?

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    3. Re:Put it on the GPU by Anonymous Coward · · Score: 3, Informative

      Raytracing does not scale nicely with the amount of geometry - mainly because of the shadow rays that have to be scattered from each intersection.

      Erm, that's just flat wrong. With the correct bounding volume hierarchy, ray tracing scales with geometric scene complexity much better than scanline methods. It is one of the reasons that offline raytracing renderers can handle such huge datasets efficiently. Also, the number of shadow rays used is *completely* independent of the "amount of geometry" in a scene. You are likely to need more shadow rays if you have large area lights and are seeing a lot of noise in the penumbrae of these lights - but this is nothing to do with the amount of scene geometry.

    4. Re:Put it on the GPU by JohnPM · · Score: 3, Funny

      The problem with raytracing researchers is that they are incredibly myopic.
      Yes but myopia would seem to be one of those problems that ray tracing would be much better at solving since it can handle refraction directly.

      --
      Karma police, I've given all I can, it's not enough, I've given all I can, but we're still on the payroll.
    5. Re:Put it on the GPU by GooberToo · · Score: 1

      That is 2800 cycles per ray for a single frame. (BTW Intel's figures in the article are bullshit. 100mil rays at 30fps = 3 billion rays per second.

      Ahh....this to me says you screwed up the math. 100m rays at 30f/s = 3.333333m rays per frame not 3b r/s. You multiplied when you should have divided. In other words, it looks like Intel got it right and your math is in the south pasture.

      I still agree with the mods...your post is interesting.
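
The disagreement above comes down to whether "100 million raysegs" is a per-second or a per-frame figure. A short sketch of both readings follows, taking the 3.2 GHz clock from the summary as the only hardware assumption; the variable names are illustrative.

```python
RAYSEGS = 100_000_000        # the quoted figure, unit ambiguous
FPS = 30
CLOCK_HZ = 3_200_000_000     # single P4 at 3.2 GHz, as in the summary

# Reading 1: 100M raysegs per second, so divide to get the per-frame budget.
per_frame = RAYSEGS / FPS                        # about 3.33 million
cycles_per_rayseg_if_per_second = CLOCK_HZ / RAYSEGS   # 32 cycles

# Reading 2: 100M raysegs per frame, so multiply to get the per-second load.
per_second = RAYSEGS * FPS                       # 3 billion
cycles_per_rayseg_if_per_frame = CLOCK_HZ / per_second  # about 1.07 cycles
```

On these numbers the per-frame reading leaves roughly one clock cycle per rayseg, which is the basis of the "one ray per cycle" objection.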

    6. Re:Put it on the GPU by Anonymous Coward · · Score: 0

      The ray-tracing session at Siggraph I went to last year went into great detail how they got the performance they did. And they were proud of their performance. Their conclusion - it was only 5x SLOWER than the CPU-based algorithm. Five times slower.

    7. Re:Put it on the GPU by WilyCoder · · Score: 1

      Wrong. As geometric complexity tends towards infinity, raytracing beats rasterization hands down. This observation keeps people interested in Raytracing. It is assumed that one day, hopefully not too far in the future, the hardware will (really) be fast enough to do raytracing in realtime. Some also feel that raytracing is more "physically correct" than the Z-buffer algorithm, since it attempts to mimic light rays.

      Raytracing: ONE Pixel, intersect with ALL geometry.
      Rasterization: ONE triangle, potentially rasterize to ALL pixels.

      IMHO, The main issue with rasterizing very large geometric datasets is clipping. Remember that clipping can actually introduce new geometry into the scene.
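
The two loop orders contrasted above can be shown on a toy one-dimensional "screen". This is only a sketch: the scene, the names, and the flat-depth primitives are invented for illustration, and no acceleration structure is used, so the ray-casting version really does test every primitive per pixel.

```python
import math

# Toy 1D scene: each primitive covers pixels [x0, x1) at constant depth z.
SCENE = [(0, 6, 5.0, "A"), (4, 10, 2.0, "B"), (2, 5, 9.0, "C")]
WIDTH = 10


def raycast(scene, width):
    """Outer loop over pixels, inner loop over all geometry."""
    image = []
    for x in range(width):
        best_z, best = math.inf, None
        for x0, x1, z, name in scene:
            if x0 <= x < x1 and z < best_z:
                best_z, best = z, name
        image.append(best)
    return image


def rasterize(scene, width):
    """Outer loop over primitives, inner loop over the pixels each covers."""
    zbuf = [math.inf] * width          # depth buffer keeps the closest hit
    image = [None] * width
    for x0, x1, z, name in scene:
        for x in range(max(x0, 0), min(x1, width)):
            if z < zbuf[x]:
                zbuf[x], image[x] = z, name
    return image
```

Both functions produce the same image for this scene; what differs is which loop is on the outside, and therefore which quantity the work naturally scales with.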

    8. Re:Put it on the GPU by smallfries · · Score: 2, Insightful

      Erm, yes it is actually. You and the other replies that pointed out that it scales better with complexity are correct. Google confirms that my memory was a bit off on this one...

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    9. Re:Put it on the GPU by Anonymous Coward · · Score: 0

      The article states directly that a raytracing approach scales much more favorably with increasing complexity compared with raster techniques. There are even some nice plots where they deal with this issue.

      You should RFTA before you condemn a bunch of innocent people to a life of bad eyesight.

      Chris

    10. Re:Put it on the GPU by smallfries · · Score: 2, Insightful

      When I read it the way that you've put it, it does sound plausible. But the Intel quote was a bit ambiguous - you could read it as 100m rays per image, which I still think is a more natural way of describing it. If you read it the other way as 100m rays per second then it would be a division there, making it about 350 cycles per ray. The actual math could be done that quickly, but it would be very dependent on how cache friendly the data is. Using 3m rays per frame is roughly 3 rays per pixel - beneath the threshold for removing aliasing. Conventional wisdom is about 16 rays per pixel to get nice antialiasing. This of course assumes that those 350 cycles are including subsequent bounces for each ray - not treating each bounce as a separate ray. With a memory latency of ~150 cycles this means that the computation of each ray needs to pipeline exceedingly well...

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    11. Re:Put it on the GPU by lenhap · · Score: 2, Insightful
      The problem is not raytracing, but geometric complexity. Raytracing does not scale nicely with the amount of geometry - mainly because of the shadow rays that have to be scattered from each intersection.


      Did you even read the article? I understand this is slashdot where no one RTFA but come on...

      The whole benefit of raytracing, according to the article, is that it scales logarithmically with complexity (number of triangles) and shadows are free (shadows are just a side effect of raytracing, not something extra like with raster graphics). So in other words, concerning raytracing, you have to increase the complexity of a viewable scene (viewable meaning: if an object is hidden by another object, it doesn't add to the complexity) by 10 to double the computation needed vs. raster graphics which scale linearly with complexity in a scene (even non-viewable graphics add to the complexity) meaning a doubling of the complexity doubles the computation needed.

      I love the spreading of FUD and FUD* in slashdot as much as the next guy, but come on...
      *in this case I mean FUD as F'd Up Drivel.
    12. Re:Put it on the GPU by Creepy · · Score: 1

      I learned the same thing as you, but I believe the geometric complexity towards infinity favoring raytracing was only true for painter's algorithm. Actually, I think the magic number was actually somewhere around 100000 polygons for painter's algorithm, but the occlusion characteristic of the z-buffer negated the advantage.

      The main advantages of ray tracing are physically correct lighting, shadows, and reflections, as well as the ability to use any geometric shape that can be mathematically defined (e.g. a sphere). The downside of parallelizing this is that each additional parallel ray comes with a concurrent increase in memory requirements. Ray tracing also tends to do specular well, but isn't as good at diffuse, so many ray tracing programs hybridize with something like radiosity or ambient occlusion.

      Clipping is less of an issue now than it was pre z-buffer (or depth buffer, if you prefer) and artifacting from it should be no worse than ray tracing. The painter's algorithm had the three triangle problem (same as the three polygons on Wikipedia) that caused odd artifacting, but I don't see how the z-buffer is any worse than ray tracing, outside of the physical shape of round objects - both traditionally keep the closest pixel.

      To answer the parent GPU question, the short answer is it's not easily possible with this generation of GPUs. The current generation of GPUs are designed to handle meshes and texture sets fed in one at a time, not en-masse (e.g. a scene), though actions on these meshes are done in massive parallel fashion. That's not to say that ray tracing isn't done on GPU - it is - just that it's done on individual meshes. For some examples of ray tracing on GPU, check out Displacement mapping and relief texture mapping (though both use pseudo-ray tracing in practice for increased performance). To put it in perspective, you would need to hold your entire scene including textures on your GPU's memory to do real-time ray tracing on the scene (to properly handle specular reflections, which could come from any object) and that isn't feasible with today's hardware (for most scenes).
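
As an illustration of tracing a mathematically defined shape directly, here is the standard ray-sphere intersection worked out from the quadratic in the hit distance. The function name and tuple-based vectors are choices made for this sketch, and the direction is assumed to be unit length.

```python
import math


def ray_sphere_t(origin, direction, center, radius):
    """Nearest positive hit distance along a ray, or None on a miss.

    Solves |origin + t*direction - center|^2 = radius^2 for t, assuming
    `direction` is a unit vector so the quadratic's leading coefficient is 1.
    """
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None                    # the ray's line misses the sphere
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):   # nearer root first
        if t > 1e-9:                   # ignore hits behind the origin
            return t
    return None
```

No triangle mesh is involved, so the silhouette is exactly round at any resolution.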

    13. Re:Put it on the GPU by Anonymous Coward · · Score: 0

      You're wrong on all counts pal...

      The most interesting property of raytracing is that it DOES scale well with geometry. The rendering time is O(log(N)), where N is the number of surfaces, whereas rasterization is closer to O(N) (well, not even, because it actually depends on the area of visible surfaces as well!).

      As for 64 rays to do antialiasing... I've programmed a GI renderer of my own and only needed 16 jittered rays to get perfect AA.
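
One common way to get 16 jittered rays per pixel is a 4x4 stratified grid with one random sample in each cell. The sketch below assumes that layout; the poster does not say which scheme was used, and the function name is invented.

```python
import random


def jittered_samples(n_side, rng):
    """Return n_side*n_side sub-pixel offsets in [0, 1)^2, one per grid cell."""
    samples = []
    for j in range(n_side):
        for i in range(n_side):
            # Random position inside cell (i, j) of an n_side x n_side grid.
            sx = (i + rng.random()) / n_side
            sy = (j + rng.random()) / n_side
            samples.append((sx, sy))
    return samples


offsets = jittered_samples(4, random.Random(0))   # 16 offsets for one pixel
```

Each offset would be added to the pixel's corner to aim one ray, and the 16 results averaged for the final colour.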

    14. Re:Put it on the GPU by Anonymous Coward · · Score: 0

      Now I did not RTFA or really care about how correct your opinion is, but you might want to look up logarithms. Because you seem to be attempting to explain base 10 logarithms (and what algorithm actually depends on log base 10?), and you seem to be attempting so incorrectly.
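
A quick numeric check of what logarithmic scaling implies, assuming cost is proportional to log N and nothing else. The helper name is made up for this sketch.

```python
import math


def relative_cost(n_before, n_after):
    """How much an O(log N) cost grows when N goes from n_before to n_after."""
    return math.log2(n_after) / math.log2(n_before)


# Ten times the triangles: the cost grows by about 17%, not 100%.
ten_times = relative_cost(1_000_000, 10_000_000)

# Doubling the cost requires squaring the triangle count.
squared = relative_cost(1_000_000, 1_000_000 ** 2)

# The base of the logarithm cancels out of the ratio.
ten_times_base10 = math.log10(10_000_000) / math.log10(1_000_000)
```

Multiplying N by a constant adds a constant to log N rather than multiplying it, and the choice of base only changes a constant factor that big-O notation ignores.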

    15. Re:Put it on the GPU by Anonymous Coward · · Score: 0

      I salute your candour!

    16. Re:Put it on the GPU by modeless · · Score: 1
      If the number of triangles is infinite a rasterizer will never finish, because it has to consider each triangle in the scene at least once. Theoretically, a raytracer running on an infinitely large scene could finish. Raytracers use fancy data structures to avoid even needing to look at huge parts of the scene if they don't intersect any rays, which provides a sort of automatic fine-grained culling that gives them an advantage, not just in infinite scenes, but in real-life scenes which games are fast approaching in complexity.

      Ray tracing also tends to do specular well, but isn't as good at diffuse
      It's certainly not any worse than rasterization; rasterization hardly does diffuse lighting at all. You have to resort to hacks like ambient lights, lightmaps, etc, which all can be implemented in a raytracer in exactly the same way. But with raytracing you have the ability to use more advanced techniques which can be a lot better.
    17. Re:Put it on the GPU by Anonymous Coward · · Score: 0

      Raytracing: ONE Pixel, intersect with ALL geometry.
      Rasterization: ONE triangle, potentially rasterize to ALL pixels.


      Interestingly, this one-two sounds quite a lot like the "Infinite Planes" occlusion culling method that PowerVR's (nee Imagination Technologies nee Videologic) deferred tile-based renderers have used. Too bad the PC-class* silicon implementations stopped at Series 3 (Kyro and Kyro II video cards). The Series 2 (Dreamcast console, Neon 250 video card) was already promising, and the DX9 class unified-shader Series 5 looks really interesting, as pixel-shading only visible final pixels of course saves the more work the more complex the shader programs get.

      * They are doing fairly well in the embedded/small devices domain with their extremely efficient rendering method, but who cares... And incidentally, Series 4 and 5 is what Intel should have picked for chipset-integrated graphics instead of the horrid "Extreme" Graphics and GMA decelerators, especially as a DTBR needs way less memory bandwidth than an immediate full-screen rasterizer (due to no need for Z testing). But NIH is strong...

    18. Re:Put it on the GPU by marcosdumay · · Score: 1

      Also mirrors and lenses are almost free and water does look like water. But objects that are not perfect mirrors (semi reflective) consume a lot of time.

    19. Re:Put it on the GPU by StikyPad · · Score: 1

      Log without a quantifier is assumed to be base 10.

    20. Re:Put it on the GPU by Prune · · Score: 1

      The problem, of course, is that the myopic moderators have caused your initial post with the incorrect assertion to be at the top, which means that most readers will come out of this misinformed.

      --
      "Politicians and diapers must be changed often, and for the same reason."
    21. Re:Put it on the GPU by Creepy · · Score: 1

      I've written a ray tracer and I don't know what you're talking about. A ray tracer needs to test each ray against every object in the scene. You can use bounding boxes or spheres to trivially reject objects, but that's true of either model (or you can use pre-sort data structures such as quad-tree, oct-tree, BSP tree, etc). For specular objects the ray is also checked against every object in the scene again for what it reflects, which can increase time significantly; however, ray tracing is faster in the respect that it only checks one point per ray on all objects rather than all points on the rasterizer that contain the triangle, which should be faster in a large scene (but in practice takes little time since it's handled by graphics hardware - I haven't had to write a software blitter for that sort of thing in ages).

      I'm pretty sure that both rasterization using a depth buffer and ray tracing would take linear time (O(n)), while the painter's algorithm was O(n log n) because of the required sort.

      I certainly didn't mean ray tracing was any better than rasterization on diffuse - they're both bad. I just wanted to point out that ray tracing has flaws and shouldn't be considered the be-all end-all solution for computer graphics - there are still issues that need to be resolved. I have mixed feelings about ambient occlusion as the diffuse technique as presented by the paper, but I have the same feelings about it being used in polygons (it's fast, but not realistic). I certainly don't mean to bad-mouth ray tracing, because I think it's a great technique and I even predicted that it would become the dominant technique starting around 2010 or 11 (prediction was from 1996 and based on Moore's law), because that's when I believed single processor machines would hit the 17GHz mark which I had calculated was required to maintain 640x480x30FPS. Yes, that prediction was a bit short-sighted, as resolution expectations, CPU technology improvements, GPUs (not that current GPUs would help much because they're not meant to handle scenes), and multiprocessing were not considered.

    22. Re:Put it on the GPU by modeless · · Score: 1

      I have also written a raytracer. (hasn't everyone?) I suppose the confusion is sort of a matter of definitions. If the rasterizer is sent an infinite list of triangles to rasterize, it will take an infinitely long time to render them, while if a raytracer is given an infinite number of triangles in a KD-tree, it can finish raytracing because it can skip nodes of the tree.

      Of course it would take an infinitely long time to build the KD-tree, this is just a silly example. But even if the rasterizer was given the prebuilt KD-tree, it wouldn't know what to do with it. You have to determine visibility in order to choose which nodes of the KD-tree to send to the rasterizer. Visibility is expensive to compute by itself (Quake engines do it as a preprocessing step during level design), but the raytracer does it automatically as part of the rendering process, so it has an advantage.
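
The node-skipping argument can be seen in a deliberately tiny one-dimensional stand-in for a KD-tree: primitives are intervals on a line, a "ray" is a single coordinate, and each tree node stores the bounds of everything beneath it. All names here are hypothetical and a real KD-tree splits 3D space, but the pruning logic has the same shape.

```python
def build(intervals):
    """Build a balanced tree over intervals that are sorted by start."""
    if len(intervals) == 1:
        lo, hi = intervals[0]
        return {"lo": lo, "hi": hi, "leaf": intervals[0]}
    mid = len(intervals) // 2
    left, right = build(intervals[:mid]), build(intervals[mid:])
    return {
        "lo": min(left["lo"], right["lo"]),
        "hi": max(left["hi"], right["hi"]),
        "kids": (left, right),
    }


def query(node, x, visited):
    """Find the interval containing x, recording every node that is touched."""
    visited.append(node)
    if not (node["lo"] <= x < node["hi"]):
        return None                 # the whole subtree is skipped
    if "leaf" in node:
        return node["leaf"]
    for kid in node["kids"]:
        hit = query(kid, x, visited)
        if hit is not None:
            return hit
    return None
```

With 1024 unit intervals, a query touches at most 21 nodes (the root plus up to two per level over ten levels) instead of testing all 1024 primitives.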

    23. Re:Put it on the GPU by Creepy · · Score: 1

      I still don't see, but perhaps it's because a KD-tree isn't "grid-locked" like a quad or oct tree (I vaguely remember them from classes, but it's been a long time). The raster code I've been working on does trivial adds to the view volume of blocks of data, then it does trivial rejects to the actual polygons after that, and I believe that has a similar effect to that of a KD-tree. The scene is potentially infinite since it uses terrain paging (in reality, it's a 4GB data file), so trivial rejection of everything would potentially take infinite time. The potential terrain that is in view is found using a surface projected cone, since it's mathematically much simpler than a frustum at the expense of accuracy (we leave the accuracy to hardware). In practice, this has proven extremely fast.

      In ASCII-art below, O is the observer (camera), and X and Y are potential view volume. * is cached data that is out of the view volume that is paged in background. O is added by default. Up to 3 additional "blocks" of code can be added to the data that needs to be parsed based on camera direction - the Ys show what should be the worst case scenario (viewer looking toward a corner with parts of two other areas in view).

      *****
      *XYY*
      *XOY*
      *XXX*
      *****
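
      The block selection in the diagram could be sketched roughly like this in Python. This is only a simplification under stated assumptions: it picks blocks from the sign of a 2D view direction instead of the projected cone described above, and blocks_in_view is a made-up name, not the actual code.

```python
def blocks_in_view(dx, dy):
    """Return grid offsets (col, row) of the blocks to parse for view direction (dx, dy).

    (0, 0) is the observer's own block, which is always included.
    Simplification: the sign of each direction component picks the neighbours.
    """
    sx = (dx > 0) - (dx < 0)   # -1, 0 or +1
    sy = (dy > 0) - (dy < 0)
    blocks = [(0, 0)]
    if sx:
        blocks.append((sx, 0))     # neighbour along x
    if sy:
        blocks.append((0, sy))     # neighbour along y
    if sx and sy:
        blocks.append((sx, sy))    # diagonal neighbour: the worst case, 3 extra blocks
    return blocks

print(blocks_in_view(1.0, 1.0))   # looking toward a corner
print(blocks_in_view(0.0, -1.0))  # looking straight along one axis
```

      Looking toward a corner gives the observer's block plus three neighbours, which matches the three Ys in the diagram. Everything else stays cached but is never sent for rejection tests.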

      Although the code I'm working on only does this in 2 directions, it can easily be expanded into 3 - you could even use the projected cone method and just project it onto another side. Border crossing polygons can be split, if necessary, but I've never had to do anything like that in practice (for static objects that overlap the terrain I just add vertices on the border in a pre-processing step). So far I haven't had to do any simplification of dynamic objects outside of distance based detail reduction (a sqrt on each actor), but that could become an issue later.

      Anyhow, I'm not sure if that is the equivalent of what you were talking about (I think it is, since ray traced objects are more difficult to split, thus the KD-tree instead of oct-tree), but it is a way to reduce an infinite dataset to a manageable one using data structures.

    24. Re:Put it on the GPU by modeless · · Score: 1

      For simplified problems like 2D terrain or Quake-like levels made of hallways, you can certainly come up with fancy visibility culling algorithms that work well in practice. But I'm talking about fully general triangle soups here. With a raytracer, you just feed the entire scene in (could be paged), and visibility is computed on the fly, no matter what the scene is. For a rasterizer you must write your own domain-specific visibility culling; if you send the entire scene straight to the rasterizer it will choke. That's all I'm trying to say.

      In many cases it's not obvious how to write a good visibility culling algorithm, and games shy away from that sort of situation, or end up with incredibly obvious nonoptimal LOD/culling (GTA comes to mind). A raytracer would work better, automatically. It's the same situation as with shadows and reflections; with rasterizers low-quality approximations are a lot of work and with raytracing they work better, automatically. With game companies screaming about how next-gen games are getting harder and harder to make, this sort of improvement is exactly what the industry needs.

    25. Re:Put it on the GPU by Creepy · · Score: 1

      I've now read that even polygon data can be sorted into KD-trees, so it's not exclusive to ray tracing (as I stated earlier, it's been a long time since I've used them if at all...), so I don't think your statement is correct. I also don't think my first ray tracer had any sorting at all, though obviously sorting has some speed benefits.

      The LOD culling that is most common in raster games lately is probably a chunked level of detail algorithm such as GeoMipMapping (sometimes combined with GeoMorphing for smoothness), which has a horrible problem with popping because most programmers intentionally choose to set the error values to noticeable levels rather than have framerates suffer. The old standard was ROAM, but that was CPU based and GeoMipMapping is optimized for use with a GPU, making it much more appealing on today's hardware. A ray tracer won't help with LOD unless you reduce the shape count (as opposed to polygon count). Your best hope is probably to have fewer shapes to form your model to begin with, because you have more powerful primitives (cube, sphere, etc) and you could also make much more efficient use of constructive solid geometry to make simpler objects without polygons (yes, you can do CSG with polygons, too, but most of the time that's used to create a polygon based subsurface). Still, if you have too many objects in the scene you either need to distance cull them or simplify them, so the solution is non-trivial.

      Another, newer form of raster detail ugliness can be seen in games like Oblivion, which I believe uses a technique called distance based detail mapping (look at the ground and move - you should see a spherical detail section around your avatar - it's most obvious on the paved roads in towns) and distance based object culling (like trees). I ask, though, how Ray Tracing would fix that? Anything I can think of will still cause popping or repetitive detail. Terrain simplification will probably be the same with ray-tracers as it is with polygons because height maps are so much more disk efficient than other methods and therefore will still have the same sort of culling issues. If anything, I would expect to see programmers adding splines for smoothness before moving to a more disk intensive model.

    26. Re:Put it on the GPU by modeless · · Score: 1

      Of course polygons can be put into KD-trees. But that doesn't allow you to do efficient occlusion culling for a rasterizer. The only way to perform the occlusion culling would be to cast rays from the camera into the scene and only send what they hit to the rasterizer, but then you've just implemented raytracing inefficiently.

      Raytracing makes instancing geometry far less expensive because there's no per-polygon cost, making forests and fields of plants much cheaper to render. Consider a tree instanced in the foreground and also far away where it covers five pixels: A rasterizer must render every polygon in the tree in both cases, perhaps thousands, while for the distant tree the raytracer only concerns itself with the five polygons which are actually hit by rays (plus some small overhead for traversing the KD-tree to throw out the rest). You're doing efficient occlusion culling at the *polygon* level, throwing out polygons which are between pixels! The polygons thrown out need never even be loaded into cache. No fancy data structure can perform that for a rasterizer. Oblivion could push its LOD distances out much farther if it was using raytracing for this reason alone.

    27. Re:Put it on the GPU by Creepy · · Score: 1

      Ah - I see what you are getting at - you are saying that a ray tracer can find each pixel on the screen by the sort alone. I was giving worst case scenario estimates (big-Oh), which favor the rasterizer with z-buffering because it doesn't require a sort, but average case scenario estimates may well indeed favor ray tracing. Going back to the original topic, it seems to me ray tracing had a better big-Oh than the old raster standard, the painter's algorithm (like nlogn vs n^2), and had better average case analysis as n approached infinity. In school z-buffering was really a future thing, and I don't recall doing any analysis on it (I think it was even new in the book we used - Computer Graphics: Principles and Practice [2nd edition?] that year). For that matter, I was still writing code in PHIGS and GL at that time, not even OpenGL, though I picked up some OGL by my second graphics class the next spring.

      There are occlusion sort algorithms on the polygon side that may give it a good run for its money, like cPLP (since standard PLP is lossy) or hierarchical z-buffers, though there isn't really a "one shoe fits all feet" algorithm like you're talking about.

  10. Why on the CPU? by onion2k · · Score: 1, Funny

    Wouldn't it be possible to write a raytracer that used the GPU core(s) instead of the CPU? Raytracing is pretty much entirely vectors isn't it? That's what GPUs do best.

    NB: The only raytracer I've ever written was in PHP and it managed about 0.01 frames per second with very basic geometry and no textures, so I'm probably very, very wrong.

  11. Not quite by Aceticon · · Score: 4, Insightful

    If I remember it correctly from my days of playing with POVRay (free raytracing app), the time it took to raytrace an image depended on things like the presence (or not) of semi-transparent, semi-reflective surfaces and on the number of light sources.

    If this is still the case, then going from the current rendering techniques in games to raytracing would result in images with more realistic reflections and lighting but, due to performance tradeoffs, few reflective surfaces and light sources.

    Besides, at the moment what games need the most is better AIs and procedurally generated content, not yet another layer of eyecandy that requires gamers to upgrade their hardware (again).

    1. Re:Not quite by tgd · · Score: 4, Interesting

      That's because a reflection creates another ray segment, and a refraction creates two.

      Considering that a non-reflective ray traced world at 800x600 needs 480,000 rays cast per image, so 14,400,000 per second at 30fps, the claim of 450 million ray segments makes sense... that's about 31 per pixel at 800x600, which is a lot of reflections. Usually you'd limit the number to a fairly low value because 100 deep reflections don't add noticeable detail, especially in motion. That's a lot of room for both refractive and reflective objects to be in the scenes.
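
      For anyone who wants to check the budget, here is a quick back-of-envelope sketch in Python. It assumes one primary ray per pixel and takes the 800x600, 30fps and 450 million raysegs figures from the discussion; the function name is just made up for illustration.

```python
def ray_budget(width, height, fps, raysegs_per_second):
    """Return (primary rays per frame, primary rays per second, ray segments available per pixel)."""
    per_frame = width * height            # one primary ray per pixel (assumption)
    per_second = per_frame * fps
    return per_frame, per_second, raysegs_per_second / per_second

per_frame, per_second, per_pixel = ray_budget(800, 600, 30, 450_000_000)
print(per_frame, per_second, per_pixel)
```

      With these inputs that works out to 480,000 primary rays per frame, 14.4 million per second, and about 31 ray segments per pixel to spend on reflections, refractions and shadows.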

    2. Re:Not quite by TheRaven64 · · Score: 4, Interesting

      You probably wouldn't just use one ray per pixel. It is typical to fire a number of rays and then average the result. This is because rays diverge quite quickly after passing through the display port, and so you get quite an uneven image. There is a noticeable difference between 1 and 4 rays per pixel, and between 4 and 9. After 9, you start to get into diminishing returns, and beyond about 25 it becomes harder to spot the difference (note that it is common to use a square number of rays, since that makes it easy to calculate where they should go).
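
      The square-number remark can be illustrated with a small Python sketch. It assumes a plain n x n grid of sample positions inside each pixel and an unweighted average; shade stands in for whatever the tracer returns for a ray through (x, y), and all the names here are hypothetical.

```python
def subpixel_offsets(n):
    """Centres of an n x n grid of cells inside one pixel, as (dx, dy) in the range 0..1."""
    return [((i + 0.5) / n, (j + 0.5) / n) for j in range(n) for i in range(n)]

def supersample(shade, px, py, n):
    """Average shade(x, y) over the n*n rays fired through pixel (px, py)."""
    samples = [shade(px + dx, py + dy) for dx, dy in subpixel_offsets(n)]
    return sum(samples) / len(samples)

# Toy "scene": white to the left of x = 0.5, black to the right.
edge = lambda x, y: 1.0 if x < 0.5 else 0.0

print(supersample(edge, 0, 0, 1))  # one ray through the pixel centre
print(supersample(edge, 0, 0, 2))  # four rays, two on each side of the edge
```

      With one ray the pixel is either fully white or fully black; with four the edge pixel comes out half grey, which is the smoothing being described.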

      --
      I am TheRaven on Soylent News
    3. Re:Not quite by GooberToo · · Score: 1

      Someone mod the parent and grandparent up. Both are fairly interesting/insightful.

    4. Re:Not quite by tgd · · Score: 1

      That's for antialiasing... and yes, it will consume more rays but there are other ways to handle antialiasing that don't need to cast another entire sequence of rays... especially if what you were looking for in a game was the sort of reflective and refractive effects you can't get with shaders. You don't need true RT accuracy, you just need a fast way of doing things you can't do with existing technology.

      Plus, as I said in my first post on this, you just don't need that level of accuracy or detail if you're ray tracing something a user will see for 1/30th of a second. You can smooth jaggies algorithmically, since those edges will be moving anyway.

      What really needs to be done is to track motion the way you would encoding for mpeg, and focus more ray casts in areas of low motion... do true RT antialiasing in still parts of the scene, do interpolated AA in the moving parts...

    5. Re:Not quite by TheRaven64 · · Score: 4, Interesting
      What really needs to be done is to track motion the way you would encoding for mpeg, and focus more ray casts in areas of low motion

      There was a paper published a couple of years ago (at Eurographics?) about this. Each ray was independent, and would return a value at each intersection (i.e. you get the primary ray value quickly, and then refine it further with secondary, tertiary, etc ray data). When a ray was no longer lined up with a pixel, it was interrupted and terminated. This meant that you got a fairly low quality image while moving quickly, but a much better one when you let the rays run longer. I found it particularly interesting, since it completely removed the concept of a frame; each pixel was updated independently when a better approximation of its correct value was ready, giving a much better degradation.

      --
      I am TheRaven on Soylent News
    6. Re:Not quite by Anonymous Coward · · Score: 0

      Tracing to a fixed depth is pretty old-fashioned these days. Nowadays when you want to emulate 50% reflective surface, you simply reflect 50% of rays. That gives you an average depth of 2 while getting an emulation of infinite depth.
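
      The "average depth of 2" figure is easy to check with a small simulation. This Python sketch assumes every surface in the scene is 50% reflective and that a path simply continues with that probability at each hit; path_depth and average_depth are made-up names for illustration.

```python
import random

def path_depth(reflectivity, rng):
    """Number of ray segments in one path: the primary ray plus each random reflection."""
    depth = 1
    while rng.random() < reflectivity:   # reflect with probability = reflectivity
        depth += 1
    return depth

def average_depth(reflectivity, trials=100_000, seed=1):
    rng = random.Random(seed)            # fixed seed so runs are repeatable
    return sum(path_depth(reflectivity, rng) for _ in range(trials)) / trials

print(average_depth(0.5))
```

      The result should land very close to 2, which agrees with the geometric series 1 + 0.5 + 0.25 + ... = 1 / (1 - 0.5). No path is ever cut off at a fixed depth; long paths just become exponentially rare.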

    7. Re:Not quite by timeOday · · Score: 1
      That's for antialiasing... and yes, it will consume more rays but there are other ways to handle antialiasing that don't need to cast another entire sequence of rays... especially if what you were looking for in a game was the sort of reflective and refractive effects you can't get with shaders.
      I disagree. You will never get realistic scenes working on the assumption that most materials are not reflective - not because of antialiasing, but because accurate lighting requires accurate light models, and the truth is light bounces off everything (but a black hole). That's why typical raytraced scenes don't look right, and why you need other techniques like radiosity. But then you have to go beyond using just one ray for typical surfaces.
    8. Re:Not quite by Prune · · Score: 1

      Good supersampling is not just 'averaging'; the rays are weighted by some filter like a Gaussian or Lanczos etc.
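
      As a rough illustration of weighting versus plain averaging, here is a Python sketch using a Gaussian. The sigma value and the sample layout are arbitrary assumptions, and the function names are hypothetical; a Lanczos filter would just swap out the weight function.

```python
import math

def gaussian_weight(dx, dy, sigma=0.5):
    """Weight for a sample at offset (dx, dy) from the pixel centre."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))

def filtered_pixel(samples, sigma=0.5):
    """samples: list of (dx, dy, value). Returns the normalised weighted sum."""
    total_weight = sum(gaussian_weight(dx, dy, sigma) for dx, dy, _ in samples)
    weighted = sum(gaussian_weight(dx, dy, sigma) * v for dx, dy, v in samples)
    return weighted / total_weight

# A bright sample at the centre and a dark one near the pixel corner.
samples = [(0.0, 0.0, 1.0), (0.5, 0.5, 0.0)]
print(filtered_pixel(samples))
```

      A plain average of those two samples would give 0.5; the weighted version comes out brighter because the centre sample counts for more.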

      --
      "Politicians and diapers must be changed often, and for the same reason."
    9. Re:Not quite by DavidV · · Score: 1

      'images with more realistic reflections and lighting but, due to performance tradeoffs, few reflective surfaces and light sources.'

      I really want this to work at a higher level, 4 cores at 2007 standards and it might with the GPU just controlling the interface of games, if the GPU manufacturers don't respond (please do).

      --
      !sig
  12. rabbit rabbit rabbit by RuBLed · · Score: 4, Informative

    FTA

    "Oh, blast. Rabbit, I seem to have forgotten my pocketwatch. May I borrow yours?"

    Rabbit: I'm late, I'm late, I'm late...

    ---

    anyway, if this technology becomes a reality in the next 3-5 years and if I read the article right, the whole graphics architecture would change, there would only be a need for a super graphics processor and less need for too much memory and those graphics pipeline/shader thingies...

    The reason they might want it in a CPU: why have a separate add-on GPU handle the job when the CPU could do it alone by that time? You would then only need a "basic" video card that would just do the display.

    Hmmm... could this be one of the reasons why ATI and AMD merged?

    1. Re:rabbit rabbit rabbit by deviceb · · Score: 1

      "there would only be a need for a super graphics processor and less need for too much memory and those graphics pipeline/shader thingies..." -do you mean no need for DirectX... bring it on

      --
      Kill your TV
  13. Quake 3: Raytraced by Anonymous Coward · · Score: 4, Interesting

    Just found that game using raytracing - Quake 3: Raytraced.
    http://graphics.cs.uni-sb.de/~sidapohl/egoshooter/

    Rumors are there's a q4 version on the way.

    1. Re:Quake 3: Raytraced by Tim+C · · Score: 3, Informative

      Unfortunately, the only downloads I see on that site are for videos of the engine in action. I also note that they quote speeds of 20FPS on a virtual CPU running at 36GHz... Add to that the fact that the site hasn't been updated since mid-2005, and I'd say it's dead.

    2. Re:Quake 3: Raytraced by Anonymous Coward · · Score: 0
      Just found that game using raytracing - Quake 3: Raytraced.
      http://graphics.cs.uni-sb.de/~sidapohl/egoshooter/

      Rumors are there's a q4 version on the way.


      And another rendered video... still not a game. Yawn.
    3. Re:Quake 3: Raytraced by Anonymous Coward · · Score: 0
      virtual CPU running at 36GHz
      ugh I just upgraded my rig, now I gotta again?
    4. Re:Quake 3: Raytraced by bazorg · · Score: 1

      Rumors are there's a q4 version on the way.
      yes, bundled with Duke Nukem Forever.

    5. Re:Quake 3: Raytraced by Arminator · · Score: 1

      It's really playable on the hardware they use in the University of Saarbrücken. One of the people involved with this went to school with my sister. I could ask him if there still is work being done on it.
      However, what I know is: They made a prototype of a graphics card (SaarCOR) with raytracing hardware instead of a rasterizer (http://graphics.cs.uni-sb.de/SaarCOR/).

      The RT-Quake 3 ran on a cluster of several (20 I think) computers, that's why a virtual Intel CPU of 36 GHz is shown.

      So, when you have over 20 computers left for a playable Q3 Raytracing you might contact them and you too will be able to play.

    6. Re:Quake 3: Raytraced by Peldor · · Score: 1

      Nah, it's not dead. I think they're just waiting for it to finish drawing.

    7. Re:Quake 3: Raytraced by default+luser · · Score: 1

      Yes, but this isn't 2002 anymore, and the Athlon XP 1800+ (1533 MHz) is no longer even considered "powerful." While a cluster of 20 Athlon XP 1800+ systems may have been daunting then, we are quickly approaching the day when you will see that kind of power on a single chip.

      Take into consideration the fact that current top-end Athlon 64 processors clock in at 2.8 GHz (1.83 times faster). These Athlon 64 processors also typically perform 40% faster clock-for-clock than their 266 MHz DDR bus 256k cache Athlon XP counterparts, thanks to the larger cache, on-die memory controller and improved cache bandwidth. Then, keep in mind that you can get this 2.8 GHz processor in a dual-core, and that's some serious power.

      1.83 * 1.4 * 2 = roughly 5 times faster than an Athlon XP 1800+. With a dual-Opteron or 4x4 setup, you'd be halfway there (10x the speed of an Athlon XP 1800+). Quad core K8Ls will make bridging that gap even easier, with their beefier floating-point throughput.
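
      Spelled out as a quick Python calculation, the estimate looks like this. It uses the clock speeds and the 40% per-clock figure quoted above and assumes perfect scaling across cores and sockets, which real raytracers will not quite reach.

```python
athlon_xp_mhz = 1533      # Athlon XP 1800+
athlon_64_mhz = 2800      # top-end Athlon 64
per_clock_gain = 1.4      # 40% faster clock-for-clock (figure from the post)
cores = 2                 # dual-core part
cluster_nodes = 20        # size of the original cluster, as reported in the thread

clock_ratio = athlon_64_mhz / athlon_xp_mhz
single_chip = clock_ratio * per_clock_gain * cores   # speedup over one Athlon XP 1800+
dual_socket = 2 * single_chip                        # dual-Opteron or 4x4 setup

print(round(single_chip, 2), round(dual_socket, 2), round(dual_socket / cluster_nodes, 2))
```

      One dual-core chip comes out a bit over 5x an Athlon XP 1800+, and a two-socket box at a little more than half of the 20-machine cluster.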

      I can't draw direct parallels on the Intel side, but I'll wager that even in raytracing the Core 2 Duo is as fast if not faster clock-for-clock as an Athlon 64 X2.

      I'd be really pissed off if they never released this engine. I think it looks pretty cool, and you can truly run it in real-time (5-10fps, instead of 20-30fps) on processors that are out right now.

      --

      Man is the animal that laughs.
      And occasionally whores for Karma.

    8. Re:Quake 3: Raytraced by SanityInAnarchy · · Score: 1

      Question, though -- will it look better or worse than current games? The walls and everything looked about as good as Doom 3. About the only things that looked better were curved surfaces, but I don't think that measures up to the fact that -- just look at the models there. You're getting 20 fps on a model that seems to have less than 100 polys, with a 36 GHz cluster. I can get 60 fps on a real computer, on Quake 4, with lighting that looks almost as good, and models that look ridiculously better.

      --
      Don't thank God, thank a doctor!
    9. Re:Quake 3: Raytraced by Anonymous Coward · · Score: 0

      One of the people involved with this went to school with my sister.

      "Went to school with", right.

      He was a moose and he bit her, admit it.

  14. hm... by Facegarden · · Score: 0, Troll

    My apple only has one core... -Taylor DISCLAIMER - By apple I mean a piece of fruit grown from a tree, often referred to as an apple tree. I do not own an Apple computer, and am not referring to one in this post. I simply wanted everyone to know that my fruit is relatively normal in regard to the number of cores it contains. Also, it is not very good at raytracing, so maybe the talk of multiple cores really is better for that...

    --
    Worldwide Military budgets: $2100 billion. Worldwide Space Exploration budgets: $38 billion. Really, world? Really?
    1. Re:hm... by Anonymous Coward · · Score: 0
      My apple only has one core... -Taylor DISCLAIMER - By apple I mean a piece of fruit grown from a tree, often referred to as an apple tree. I do not own an Apple computer, and am not referring to one in this post. I simply wanted everyone to know that my fruit is relatively normal in regard to the number of cores it contains. Also, it is not very good at raytracing, so maybe the talk of multiple cores really is better for that...

      Whereas apples don't work for raytracing, it's a well known fact that apple sauce works extremely well. You need to smash the apple against your head to see the effect. If it doesn't work, continue smashing apples against your head until you see rays or stars. If it still doesn't work, use a bigger hammer to smash the apples.

  15. Ha by akondrat · · Score: 0

    Hey, everybody, don't be surprised to learn how easy it is to cheat human eyes and mind. I still recall games in pseudographics with beep sounds - and that was really cool.

    1. Re:Ha by pimpimpim · · Score: 1

      best game ever. Click the java icon to play it in your browser. And dammit, it's still as addictive as 20 years ago.

      --
      molmod.com - computing tips from a molecular modeling
  16. hooray by Anonymous Coward · · Score: 0

    hehe, it would lead to a complete failure of current graphic card manufacturers, as the need for 3d acceleration would drop to zero immediately

    on the other side, cpu cores or addons specialised for linear mathematics would rise quickly to enhance this even further - we still could use a lot of work there despite sse2/3dnow/altivec

    also this is one of the things where the cell will burn into the world when used appropriately, as the raytracing data can be easily fed into it as a stream and be parallelised easily

    1. Re:hooray by lowe0 · · Score: 1

      While I doubt that GPUs would continue their rapid ramp-up in performance, I would bet that a move to primarily ray-traced graphics engines would still put some post-processing effects on the GPU.

      Put the power there, and they'll find something to do with it.

  17. If you can't beat them, obviate them! by DoofusOfDeath · · Score: 3, Interesting

    I wonder how much this research relates to Intel's renewed desire to become a graphics player.

    If they're having trouble, for staffing or other reasons, producing good GPU designs, then it would be pretty clever of them to revolutionize the industry AND capitalize on their CPU strengths in a single move. More power to them, I say. (More power = about 120 watts, I'm guessing.)

    1. Re:If you can't beat them, obviate them! by Anonymous Coward · · Score: 0
      I wonder how much this research relates to Intel's renewed desire to become a graphics player.

      ...or AMD buying ATI.

    2. Re:If you can't beat them, obviate them! by Anonymous Coward · · Score: 0

      Intel's renewed desire to become a graphics player? Intel is the biggest graphics player and has been for quite a while.

    3. Re:If you can't beat them, obviate them! by gregorio · · Score: 1
      I wonder how much this research relates to Intel's renewed desire to become a graphics player.
      Whoever modded you up is on crack. Intel is the largest player of the graphics market. It does not develop any kind of "omg 31337 skillz" GPU because the gaming market does not serve its interests. If they wanted to, they would.
    4. Re:If you can't beat them, obviate them! by DoofusOfDeath · · Score: 1

      Slow down, cowboy. I misstated my post a little: I agree that they're already a big player in embedded graphics. What I meant to say is that they're trying to gain ground in mid-level and possibly high-end graphics, where they definitely don't have products.

      Why do I say this? Because of articles like these:

      http://techreport.com/onearticle.x/10564

      http://www.theinquirer.net/default.aspx?article=33836

      http://www.theinquirer.net/default.aspx?article=33720

      http://www.theinquirer.net/default.aspx?article=32268

  18. Raytracing in hardware by Anonymous Coward · · Score: 1, Interesting

    Found this link a while ago: http://graphics.cs.uni-sb.de/SaarCOR/

    Methinks it would be pretty cool when there's a 'graphics' card that could do standard raster-based graphics, raytracing and physics. Most of the calculations are the same anyway, so a general purpose processor that is very good in floating-point vector calculations would be necessary. The API's would be mostly implemented in the driver (OpenGL, OpenRT, etc.)

  19. Yep by Joce640k · · Score: 1
    the time it took to raytrace an image depended on things like the presence (or not) of semi-transparent, semi-reflective surfaces and on the number of light sources.
    Yep. Raw "number of rays" means nothing. The number of rays can grow exponentially as soon as you try to make a scene of anything other than plastic spheres.
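
    A tiny Python sketch of that growth, assuming the worst case where every hit lands on a glass-like surface that spawns both a reflected and a refracted ray (a branching factor of 2). Shadow rays toward each light would add more on top; the function name is made up.

```python
def rays_per_pixel(max_depth, branching=2):
    """Total rays in a full ray tree: 1 primary ray, then `branching` new rays per hit."""
    return sum(branching ** level for level in range(max_depth + 1))

for depth in (0, 1, 2, 5, 10):
    print(depth, rays_per_pixel(depth))
```

    At depth 5 that is already 63 rays for a single pixel, and at depth 10 it is 2047, so a raw rays-per-second figure says little without knowing what the scene is made of.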


    The current crop of "raster based" games don't look so bad to me. I doubt that ray tracing would add very much to a FPS.

    --
    No sig today...
    1. Re:Yep by QuantumG · · Score: 1

      Meh, curved surfaces are still shit.

      --
      How we know is more important than what we know.
  20. "entirely vectors" by Joce640k · · Score: 4, Insightful
    Raytracing is pretty much entirely vectors isn't it?

    No, ray tracing is all about searching databases for ray-object intersections. That's what GPUs can't do at all.

    --
    No sig today...
    1. Re:"entirely vectors" by S3D · · Score: 3, Interesting
      No, ray tracing is all about searching databases for ray-object intersections. That's what GPUs can't do at all.
      Serious raytracers are tile-based anyway, that is using a lot of look-up tables. Processing of a single tile could probably be fit into an upcoming GPU with "unified shader architecture". But it wouldn't be efficient. GPUs aren't designed for a lot of branching.
    2. Re:"entirely vectors" by Nutria · · Score: 1
      ray tracing is all about searching databases for ray-object intersections.

      Is ray-tracing "tied in with" vector graphics?

      --
      "I don't know, therefore Aliens" Wafflebox1
    3. Re:"entirely vectors" by pimpimpim · · Score: 3, Funny
      No, ray tracing is all about searching databases for ray-object intersections.

      So the choice for php+sql might not be such a bad idea after all ;)

      --
      molmod.com - computing tips from a molecular modeling
    4. Re:"entirely vectors" by usrusr · · Score: 1

      isn't "tile based" just a special case for what the article calls "beams"?

      anyways, if i understand you right then the tile based approach hardly applies to anything but first-generation backwards rays for viewport pixels, which are only a small fraction of the rays used in a nontrivial raytracer.

      (if you are talking about bounding box hierarchies and not viewport tiles, then sorry for the misunderstanding)

      --
      [i have an opinion and i am not afraid to use it]
    5. Re:"entirely vectors" by Anonymous Coward · · Score: 0

      Physics processors, on the other hand...

  21. It's not JUST FP that's the issue by N+Monkey · · Score: 4, Interesting
    The thing about ray tracing is that it's the archetypal embarrassingly parallel problem that makes heavy use of floating point arithmetic. The thing about GPUs is that they are incredibly parallel processors optimised for floating point operations.

    It's not just the sheer number of FP calculations that can be the problem. Once you get away from the first (or perhaps even second) level of rays, you end up losing coherence between neighbouring rays which causes memory page/cache thrashing. This is not a nice thing on a GPU.
    1. Re:It's not JUST FP that's the issue by zenyu · · Score: 1

      It's not just the sheer number of FP calculations that can be the problem. Once you get away from the first (or perhaps even second) level of rays, you end up losing coherence between neighbouring rays which causes memory page/cache thrashing. This is not a nice thing on a GPU.



      Yep and you only get 100 million rays per second on a P4@3Ghz when you do four at a time (and the scene is relatively simple, this would be better stated as 200 million intersections). As soon as you lose coherence you drop to as little as 25 million rays per second. And the 100 number is probably using a binary subdivision of space, something like a kd-tree, for a GPU you would want grids as an acceleration structure, this means you need to do a lot more intersections.

      But grids are actually where it gets interesting: it means losing ray packet coherence is not such a big deal. You just do more intersections on a single ray, instead of intersecting one triangle with multiple rays...

      Note: I didn't read the article, but I wrote a fast SIMD raytracer a few years ago.
    2. Re:It's not JUST FP that's the issue by Dzonatas · · Score: 1

      The cache appears to be the main argument between GPU/CPU implementations. Even though GPU manufacturers claim better memory performance in the future, I still can't understand why I need to buy memory for my CPU and separate memory for my GPU. Obviously, memory needs to be upgraded overall and together (or at least made accessible by both).

      -=drm rant=- Not only is the memory separation an added expense, it is like DRM (GPU as a special hardware implemented solution where you are locked in to use their renderers and must buy new hardware to upgrade render-ability. I remember we had to buy a new GPU card for a couple hundred dollars just to see colored light reflections rendered, for example, but such could already be done on the CPU). The current trend is a need to keep a specialized installation to run graphics with a GPU when there is an obvious generic option through the use of the CPU. An old GPU card is as good as an old DRMed music file. I know that (DRM-likeness) wasn't the intention originally, but when a GPU manufacturer is bought out from its original team, we can expect to take some caution with the company's new management. Here, the CPU option appears more secure. -=/drm rant=-

      I am all for the parallelism, and the GPU has provided that. We also now have the option to undo the need for a GPU and move such functionality back into the CPU environment.

      The article hints at the U.S.A. hardware, which has a processor unit for every vector unit. It also hints that such technology can be put into the CPU environment. There were no further details, but we can take this as evidence that CPU level subprocessors, which work in parallel, have been on the table because they exist at the GPU level. For now, we have SMP to scale to the CPU.

      Despite the rabbit hole style discussion on this, the expense of GPU specialization is at hand.

  22. Won't happen soon. by midkay · · Score: 5, Informative

    It's extremely unlikely that anything will go anywhere with raytracing in the near future. Raytracing takes a tremendous amount of power - apps that demonstrate it in realtime usually run quite choppy, and they're very minimalistic to boot; ugly textures, very simple geometry, very confined areas...

    The main benefits of raytracing in games would be:
    1) Shadows; they'd be Doom 3-like. Several games have full stencil shadows and that's just how raytraced ones would look: sharp and straight. The difference? Raytraced ones would take a ton more power and time to compute.
    2) True reflection and refraction. We can "fake" this well enough - for example, see the Source engine's water, incorporating realtime fresnel reflections and refractions. Though Source's water's "fake" refraction/reflection aren't pixel-perfect, and are only distorted by a bump-map, it certainly looks great.

    Honestly, considering the small gain in visual quality (although a major gain in accuracy) - it's like going after a fly with a bazooka. Sure, once we get to the point where there's enough processing power to deal with this well enough in realtime, it will happen - but don't expect it soon, and don't expect that huge a difference. Nicer reflections and refractions (which already look good today) and pixel-perfect shadows (looking just the same as stencil shadows in some newer games).

    1. Re:Won't happen soon. by Anonymous Coward · · Score: 1, Insightful

      the way I see it, sharp shadows are not very realistic, unless you have a point light. before going to raytracing, we should get area shadows, that are cast from an area light (almost all light sources have area/volume). we should also get shaders that mimic the subsurface scattering, i.e. indirect illumination that is generated from the light bouncing off objects.

      not that i care, since i dont play them stupid video games.

    2. Re:Won't happen soon. by DoofusOfDeath · · Score: 2, Funny

      "going after a fly with a bazooka" + raytracing in the same game? Hell, I'D BUY IT!!! :)

    3. Re:Won't happen soon. by Glacial+Wanderer · · Score: 3, Informative

      I mostly agree with you; however, your statement that ray tracing results in hard/sharp shadows is wrong. Ray tracing can easily make realistic soft shadows. As you mentioned, ray tracing costs a ton of extra processing power to produce images approximately equivalent to raster graphics. Ray tracing more or less simulates how light works in the real world, and there is the real problem. Ask anyone in the graphics industry and they'll tell you their job is to fudge things until they look good, because realistically modeling the real world is too expensive.

    4. Re:Won't happen soon. by Anonymous Coward · · Score: 0

      You Sir, are truly an idiot. Raster graphics took all of 10 years to take off. Of course, 10 years ago nobody would have thought of having such powerful GPUs that we have today. To write off raytracing like that shows off your narrow mind.

    5. Re:Won't happen soon. by mdarksbane · · Score: 1

      1) You don't know what you're talking about.

      There are multiple techniques to "fix" hard shadows in a raytracer or a raster, although the "correct" way to do them involves pairing a raytracer with a global illumination model (something like photon mapping). They're just slow to compute. In general, you can make the raytraced ones look nicer, but they take longer the nicer you want them to look, of course.

      2) You can fake it well enough for simple cases like water, or a single mirror. Although sometimes the lack of per-pixel accuracy in that can be unnoticeable or quite nasty, depending on how you're doing it. Raytracing can give some very nice reflections of reflections effects, though, and more generalized distortions. The reasons that the reflection stuff looks decent in today's games is because an artist tweaks it for a day or two and they don't put in any of the cases where it wouldn't look good :)

      There's a reason that most of the hyper-realistic non-real-time renderers have a raytracing option. I'm not saying we're likely to see it happening in real time soon (30 fps on a modern computer... doesn't leave much room for an AI or physics calculations) but it could still look pretty sweet.

    6. Re:Won't happen soon. by cgenman · · Score: 1

      There is no reason to do anything for real if you can fake it in gaming. There is no reason to fully and accurately render a 3D scene if you can just make it a hand-painted image and agree not to move the camera. There is no reason to render the molecular behaviors of individual pieces of glass when a "shattered" texture would suffice. Mario doesn't obey the laws of physics.

      And really, the only reason to raytrace is so that your artists don't need to make and optimize a million reflection and shadow maps... But they'll still have to make bump maps, normal maps, etc.

    7. Re:Won't happen soon. by CTho9305 · · Score: 4, Informative

      Raytracing takes a tremendous amount of power - apps that demonstrate it in realtime usually run quite choppy
      If you read the Intel paper that inspired TFA's author to write his ill-informed article, you'll see that raytracing scales better with scene complexity, and Intel did benchmarks to show that after about 1M triangles per scene, software raytracers will outperform hardware GPUs using triangle pipelines (e.g. OpenGL, DirectX, shaders).

      Sure, once we get to the point where there's enough processing power to deal with this well enough in realtime, it will happen
      The benchmarks in the Intel paper show that we are very close to that point right now.

    8. Re:Won't happen soon. by ivan256 · · Score: 1

      approximately equivalent images to raster graphics

      Huh?

      Current tech essentially adds up to the game of the day being a showcase for whatever the latest buzzword technology from the GPU makers is that month. Look at the games that are out in the last 6 months. We've got "HDR" lighting now, so everything is so damned fake-shiny it makes you want to puke. If you rate glitz as highly as realism, well, I still wouldn't rate them the same. The hacks we pull for high performance 3D graphics today result in plastic looking scenes even in the best game examples. Go look at screenshots for Crysis, the latest 3D accelerator poster child to see what I mean. There are plenty of raytraced animations out there that will fool you for a good long time before you figure out they aren't real. Crysis represents a best-of-breed, and the graphics wouldn't fool anybody into thinking they were actually real for even a second.

    9. Re:Won't happen soon. by nacturation · · Score: 2, Interesting

      Shadows; they'd be Doom 3-like. Several games have full stencil shadows and that's just how raytraced ones would look: sharp and straight.

      Sharp and straight shadows? Check out this example or this one or yet another. Granted, these scenes' rendering times are measured in hours, not fractions of a second... but eventually games will be at that level of quality.

      --
      Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
    10. Re:Won't happen soon. by fotbr · · Score: 1

      So you use the CPU to do the raytracing, and push the physics off onto the GPU. That might free up enough CPU time to do AI.

    11. Re:Won't happen soon. by jozmala · · Score: 1

      Thank you for an idea for a game.

      --
      ©God :Copyright is exclusive right for creator to determine the use of his creation.
    12. Re:Won't happen soon. by Anonymous Coward · · Score: 0

      Raytracing takes a tremendous amount of power - apps that demonstrate it in realtime usually run quite choppy, and they're very minimalistic to boot; ugly textures, very simple geometry, very confined areas...

      So basically like how Doom was in 1993 when it took the gaming world by storm?

      Oh, except it'd be raytraced, higher resolution, true 3d, high quality stereo sound, yada yada yada.

      I seem to recall that makers of 3d cards basically started to build cards to do especially well at what Doom and its successors and clones needed. Do you think it would not happen again?

    13. Re:Won't happen soon. by tritium6 · · Score: 1

      Wow, wish I had mod points just so others could see those awesome images. I had no idea we could generate photo-realism like that! The second image is the best.

    14. Re:Won't happen soon. by midkay · · Score: 1

      Me? I didn't say there was no way to get soft shadows. Considering that this would literally multiply the computation time by the number of samples it's not exactly feasible when we can't even get raytraced sharp shadows in a game anytime soon.

      It certainly would look nice - I mean, if it existed we'd see a lot more developers taking advantage of it and showing it off by realtime reflections on marble floors and stuff. But the difference in visual quality is disproportionate to the huge amount of processing power you need for it.

      For rendering, raytracing is extremely popular and very "basic" in the sense that people use it daily and don't think twice about it. It's very common for nice renders to go on for hours and hours, even days. We're talking about frames per second, not days per frame. Raytracing isn't new - it's used casually every day for basic reflections, etc in 3D renders - nor is it suited to games. Not yet.

    15. Re:Won't happen soon. by midkay · · Score: 1

      Reflection/refraction of such a quality is not at all uncommon. The problem? Actually getting that into a game. As you mentioned, it's hours per frame, not frames per second.

    16. Re:Won't happen soon. by jonadab · · Score: 1

      > that's just how raytraced [shadows] would look: sharp and straight.

      That depends on, among other things, the lighting. Use a small number of point-source lights and yeah, shadows are sharp and straight. Throw in some area lights, and the shadows become *much* more realistic.

      Of course, area lights also significantly increase the render time. I disagree with the article's ideas about how much CPU power is needed. A couple of years ago I calculated _roughly_ how soon desktop PCs will be able to handle raytracing in realtime for games, and I think the figure I came up with was something like 2050 or thereabouts. That's assuming game graphics continue advancing incrementally ad interim to raise the expectations of the visual results a bit, but I think that's valid.

      I'd love to be proved wrong, because raytraced frames look awesome, but I just don't think current CPUs can cut the mustard. Multiple cores help a little, but working with current consumer-grade CPUs you'd need a roomful of small-form-factor systems in your cluster, if you want the game to smoothly handle interesting things like iridescent surfaces, refraction, fog, realistic lighting, complex geometry, and so on.

      Of course, my estimate _was_ assuming individual frames had to be traced individually. My understanding of 3D games seems to demand this, since the viewport is always in motion in my experience, but if you had a game with a gameplay paradigm such that it was normal to stand still from time to time, then it might become feasible to throw in raytraced frames when the action slows down enough, and that could perhaps be interesting. That would require that your raytracer use the same object data as the faster engine that handles the picture when things move...

      --
      Cut that out, or I will ship you to Norilsk in a box.
    17. Re:Won't happen soon. by renoX · · Score: 1

      Except that the article also said that many tricks currently used on games renderers wouldn't work with ray-tracing: I'm thinking about bump-mapping.

      To compensate for the loss of these tricks, you need to add a *huge* amount of geometry to get the same result, so it's not so obvious that realtime raytracing will arrive fast.

    18. Re:Won't happen soon. by CTho9305 · · Score: 1

      Bump mapping definitely works with raytracing. In college I wrote a raytracer for a class - bump map support is not difficult.

  23. 30 fps - unlikely by DrXym · · Score: 4, Interesting
    Ray tracing works by tracing a hypothetical ray of light back from a screen pixel, and following it to the light source as it bounces and splits off various objects, which may or may not be opaque, shiny, textured etc. So a ray might first hit a sphere; you calculate the light at that point and recursively trace the light as it bounces off other objects. To get any level of realism you're talking about multiple levels of recursion, which takes an enormous amount of time in any complex scene. Transparency also requires both the reflected and the refracted ray to be traced, so the number of rays can increase dramatically.

    Ray tracing also suffers terribly from "jaggies". Edges look bad because rays can just miss an object and cause really bad stepping on the edges of objects. To eliminate jaggies and do anti-aliasing, you need to do sub-pixel rendering with jitter (slight randomness) to produce an average value for the pixel. So you might have to trace 4 or more rays in a pixel for acceptable anti-aliasing. Effects like focal length, fog, bump mapping etc. cause things to get even more complex. Most pictures rendered with high quality on Blender, POVRay etc. would take minutes if not hours even on a fast / dual core processor.

    The only way you'd get 30fps is if you cut your ray trace depth to 1 or 2, used a couple of lights, cut the screen res down and forgot about fixing jaggies. It would look terrible. Oh and find time for all the other things that apps and games must do.
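
    As a rough illustration of how quickly that ray count grows, here is a minimal Python sketch. It assumes the worst case, where every hit spawns both a reflected and a refracted ray plus one shadow ray per light. The function names and the 640x480, two-light settings are hypothetical; only the 30fps target comes from the discussion.

```python
def segments_per_sample(depth, lights):
    """Worst-case ray segments for one sample traced to `depth` bounces.

    Assumes every hit spawns a reflected and a refracted ray (a full binary
    tree) and casts one shadow ray per light at every hit.
    """
    hits = 2 ** (depth + 1) - 1  # nodes in a full binary tree of that depth
    return hits * (1 + lights)   # each hit: the ray itself plus its shadow rays


def segments_per_second(width, height, fps, samples, depth, lights):
    """Ray segments per second for a full frame at the given settings."""
    return width * height * fps * samples * segments_per_sample(depth, lights)


if __name__ == "__main__":
    # Hypothetical settings: 640x480, 30fps, 2 lights.
    for samples, depth in [(1, 0), (1, 2), (4, 2), (4, 4)]:
        total = segments_per_second(640, 480, 30, samples, depth, 2)
        print(f"{samples} sample(s)/pixel, depth {depth}: {total:,} segments/s")
```

    Real scenes rarely hit the worst case, since most surfaces neither reflect nor refract, so treat these totals as an upper bound rather than a prediction.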

    1. Re:30 fps - unlikely by jackbird · · Score: 1

      I use a trace depth of 1 or 2 in offline rendering for paying clients all the time. Modern raytracers are getting scary fast, and a lot of the rendertime is devoted to building the acceleration tree, which can be preprocessed and pulled from storage. And don't forget that realtime raster/scanline rendering also cuts down scene complexity to maintain performance.

    2. Re:30 fps - unlikely by CTho9305 · · Score: 1

      The only way you'd get 30fps is if you cut your ray trace depth to 1 or 2, used a couple of lights, cut the screen res down and forgot about fixing jaggies. It would look terrible. Oh and find time for all the other things that apps and games must do.

      The Intel research paper that inspired TFA's author actually did benchmarks, and their scenes were pretty complex. Basically, raytracing's complexity scales with the log of the number of triangles in the scene, whereas the techniques currently used in GPUs scale linearly. At about 1M triangles per scene, even though the GPU is highly optimized and very parallel, a dual-cpu software raytracer can outperform the GPU.
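
      To make the log-versus-linear argument concrete, here is a small Python sketch that looks for the triangle count where a linear cost overtakes a logarithmic one. The cost models and the ray count are assumptions for illustration only (unit constants, one ray per pixel at a hypothetical 640x480), so the crossover it finds is not the figure from the Intel paper.

```python
import math


def raster_cost(triangles):
    """Rasterizer work: assumed proportional to the triangle count."""
    return float(triangles)


def raytrace_cost(triangles, rays):
    """Ray tracer work: each ray walks a tree of depth ~log2(triangles)."""
    return rays * math.log2(triangles)


def crossover(rays):
    """Smallest triangle count at which rasterizing costs at least as much."""
    hi = 2
    while raster_cost(hi) < raytrace_cost(hi, rays):
        hi *= 2
    lo = hi // 2  # last power of two where ray tracing was still costlier
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if raster_cost(mid) < raytrace_cost(mid, rays):
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    rays = 640 * 480  # hypothetical: one primary ray per pixel
    print("crossover near", crossover(rays), "triangles")
```

      Where the crossover actually falls depends entirely on the per-triangle and per-ray constants of the hardware, which this sketch leaves at 1.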

    3. Re:30 fps - unlikely by sholden · · Score: 1

      Wow, if only you had told the authors of the tech report how ray tracing worked before they spent all that time working on actually doing the math on it.

    4. Re:30 fps - unlikely by 99BottlesOfBeerInMyF · · Score: 1

      Ray tracing also suffers terribly from "jaggies". Edges look bad because rays can just miss an object and cause really bad stepping on the edges of objects.

      I know traditional ray-tracers solve this with an anti-aliasing pass, but I thought real-time ray tracers solved this by incorporating a radiosity engine instead.

      Effects like focal length, fog, bump mapping etc. cause things to get even more complex. Most pictures rendered with high quality on Blender, POVRay etc. would take minutes if not hours even on a fast / dual core processor.

      The only way you'd get 30fps is if you cut your ray trace depth to 1 or 2, used a couple of lights, cut the screen res down and forgot about fixing jaggies.

      But we're not looking for parity with a professional ray tracer used in film or print. We're looking at parity with or improvements upon the quality of video game rendering today, which is nowhere near that level of photo-realism. They actually claim to have run benchmarks, and given how fast you can render an animation of better quality than you'd see in a video game using Bryce or something and a modern processor, I'm inclined to believe it is becoming feasible.

      Oh and find time for all the other things that apps and games must do.

      This is where we're coming into a discussion of bottlenecks in such a system. Assume in three years everyone will have a quad-core machine or better. Most games are not CPU bound even today. That means you could have two cores sitting basically idle. If general purpose CPUs can take over the rendering, you may well have no problem finding the CPU cycles. Then we're talking about memory and disk access as the real bottlenecks and the GPU can be used for putting it all together and applying filter effects and the like.

      Will this work? I don't know. Will it work so well that it will beat current rendering tech? Again, I don't know. But I'm not convinced by your arguments that it won't.

    5. Re:30 fps - unlikely by SanityInAnarchy · · Score: 1
      a lot of the rendertime is devoted to building the acceleration tree, which can be preprocessed and pulled from storage.

      I don't think that would help for games, though.

      Or are you saying it's like a BSP tree, Octree, something like that?

      In other words, does changing the camera angle require you to rebuild that tree? Does changing the light source? What about moving objects around?

      --
      Don't thank God, thank a doctor!
    6. Re:30 fps - unlikely by Anonymous Coward · · Score: 0

      Ray tracing also suffers terribly from "jaggies". Edges look bad because rays can just miss an object and cause really bad stepping on the edges of objects. To eliminate jaggies and do anti-aliasing, you need to do sub-pixel rendering with jitter (slight randomness) to produce an average value for the pixel. So you might have to trace 4 or more rays in a pixel for acceptable anti-aliasing.

      The nice thing about raytracing is that if you do things right, these expenses don't accumulate. If you need 16 rays per pixel for focal blur, 16 for antialiasing and 16 for recursion it doesn't follow that you need 4096 to get all of them at once.
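
      One common way to get this sharing is to give every sample its own random offset for each effect at once, instead of nesting the loops. The Python sketch below shows only the sampling step, with two effects; the function name, the dictionary layout and the seed are hypothetical.

```python
import random


def pixel_samples(n, seed=0):
    """Return n samples, each carrying its own jitter for every effect.

    Each sample gets a sub-pixel offset (anti-aliasing) and a lens offset
    (focal blur) at the same time, so the effects share one set of n rays.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        subpixel = (rng.random(), rng.random())          # position inside the pixel
        lens = (rng.uniform(-1, 1), rng.uniform(-1, 1))  # position on the lens
        samples.append({"subpixel": subpixel, "lens": lens})
    return samples


if __name__ == "__main__":
    shared = pixel_samples(16)
    print(len(shared), "rays cover both effects; nesting them would need", 16 * 16)
```

      Averaging the colors returned by those 16 rays then approximates both effects together, at the price of some extra noise compared with the fully nested version.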

    7. Re:30 fps - unlikely by Prune · · Score: 1

      Nonsense. As has been pointed out numerous times, ray tracing scales logarithmically with scene complexity. As number of triangles increases, ray tracing becomes faster at some point than raster algorithms. The article even gives specific numbers.

      --
      "Politicians and diapers must be changed often, and for the same reason."
    8. Re:30 fps - unlikely by DrXym · · Score: 1
      Not nonsense, reality. Show me a 30fps ray tracer anywhere which runs at even 1/4 the resolution of a normal GPU rendering a complex scene with multiple light sources such as practically any FPS game in the last two or three years would do. I'm only aware of a couple of "realtime" ray tracers (there is a Quake 3 demo for example) and the result looks terrible even at low-res. And 30fps is an awful framerate. Most midrange graphics cards could deliver more than that on the highest settings.

      Besides, ray tracing triangular meshes makes your scene look awful. Everything looks like it was hewn with a diamond cutter unless you increase the number of triangles, do Phong-like shading to give it some approximation to a curve, use parametric patches, or tessellate the patches into a mesh of triangles at runtime based on CPU settings. All of which would slow things down even further.

      I'm sure there is a crossover point between conventional rendering and ray tracing. I just have major doubts that it will be any time soon. Even if you threw 4 processors at the task it seems highly unlikely that you could match the quality of today's midrange card in realtime even at a lower resolution.

    9. Re:30 fps - unlikely by renoX · · Score: 1

      A complex scene, but a static one; in most games you need dynamic scenes, so building the 'acceleration structure' is much more difficult.

      So I expect raytracing to come first to 'flyby' games, *yawn*, doesn't seem very exciting.

  24. Use a moment method with physical optics by EmagGeek · · Score: 2, Interesting

    A ray-tracing problem can be solved simultaneously using a moment method that incorporates physical optics. I wrote my Master's thesis a long time ago that did precisely that for 2-dimensional situations. Of course, this required solving massive linear systems that, at the time I wrote it, took hours on a 433MHz Alpha to do a single frame, and it was written in FORTRAN77, but hey, we've come a long way since then :)

    1. Re:Use a moment method with physical optics by Calinous · · Score: 1

      Yes, we came a long way from there - now the processors are about 10 times as fast, and you could hope for 4 cores. This means, in the best case, that the rendering will take several minutes per frame. Anyway, did you take into account we came a long way from there with game resolution also? Doubling the resolution increases four times the calculation time.

    2. Re:Use a moment method with physical optics by Splab · · Score: 1

      I really hate it when people do the "double the resolution" (or core or whatever) and say it's 4 times the calculation - of course it's true if you double up in both directions - but you only said to double it, so it should only be in one direction...
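
      For what it's worth, the two readings can be put side by side in a couple of lines of Python. The 640x480 base resolution is just a hypothetical example, and the sketch assumes the work is proportional to the pixel count.

```python
def pixel_count(width, height):
    """Number of pixels, i.e. primary rays if one ray is traced per pixel."""
    return width * height


def scale_factor(width, height, sx, sy):
    """How many times more pixels there are after scaling each axis."""
    return pixel_count(width * sx, height * sy) / pixel_count(width, height)


if __name__ == "__main__":
    print("both axes doubled:", scale_factor(640, 480, 2, 2))
    print("one axis doubled: ", scale_factor(640, 480, 2, 1))
```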

    3. Re:Use a moment method with physical optics by Anonymous Coward · · Score: 0

      You think this industry is going to work all that out from a 1-line description?

      Post the PDF, dude ;) Sounds cool ;)

    4. Re:Use a moment method with physical optics by Calinous · · Score: 1

      While printers usually have different X and Y axis resolution (see 600x1200, 600x2400 dpi effective, thanks to resolution enhancement technologies), displays usually have the same resolution on X and Y axis. Hmmm, from Dictionary.com: 12. the degree of sharpness of a computer-generated image as measured by the number of dots per linear inch in a hard-copy printout or the number of pixels across and down on a display screen. You might be right...

    5. Re:Use a moment method with physical optics by EmagGeek · · Score: 1

      "Doubling the resolution increases four times the calculation time."

      Not true at all. The density of the impedance matrix has nothing to do with screen resolution. It only has to do with the physical size of the objects in the scene relative to the wavelength of light incident upon them. You could have one mesh node per pixel, but that would be computational suicide. If you _did_ have one node per pixel, then your mesh would be so dense that you could use a Dirac-delta basis function and avoid a lot of the interpolatory calculation that is involved with coarser mesh densities.

    6. Re:Use a moment method with physical optics by EmagGeek · · Score: 2, Funny

      Nah this was a long time ago. It may have been "cutting edge" research then, but nowadays the average 3rd grade advanced calculus student could figure it out in their head. All you have to do is take an arbitrary object in 3-space, chop the volume up into little 3D blocks that can be represented by known equations, test the incident field in those little blocks by integrating the interaction of the incident field with the material and shape of the block, and calculate the far-field by performing a Fourier transform on the resulting solution matrix. Piece of cake, as long as you don't forget that the incident field at a given block is the sum of the incident plane wave and the scattered nearfields from the other blocks in the mesh :)

  25. Povray by Ethan+Allison · · Score: 1

    But when we get into presentation-quality raytraces (at least with Povray), my P4-2.4HT takes 3-4 minutes per frame. And that's on lowest quality. High quality takes a matter of days.

    Would multicore processing help out here as much as it would with high-speed raytraces?

    1. Re:Povray by Anonymous Coward · · Score: 0

      In my testing (using MPI and separate machines rather than multiple cores, but same idea) POV-Ray scales nearly linearly up to at least 8 cores (I didn't test any more than that).

  26. Raytracing vs. Scanline for Realtime by greazer · · Score: 5, Informative

    I've seen the topic of realtime ray-tracing and hardware accelerated ray-tracing come up countless times over the past 15 years. In the 80's and 90's, a realtime ray-tracing acceleration chip was always around the corner. Some products did actually emerge, but never quite caught on. The reason for this is not because "commercial graphics industry has been intent on pushing raster-based graphics as far as they could go". Quite the contrary; it's much more elegant algorithmically (and hence 'easier') to implement a ray-tracer than a scanline based renderer. However, there's a fundamental limitation of ray-tracing that makes it unappealing performance-wise. Cache coherence for ray-tracers sucks.

    All rendering algorithms boil down to a sorting problem, where all the geometry in the scene is sorted in the Z dimension per pixel or sample. Fundamentally, scanline algorithms and ray-tracing algorithms are the same. For primary rays, here's some simplified pseudocode:

          foreach pixel in image
            trace ray through pixel
            shade frontmost geometry

    The trace essentially sorts all the geometry along its path.

    A scanline algorithm looks like this:

          foreach geometry object in the scene
            foreach pixel geometry is in
              if geometry is in front of whatever is in the pixel already
                shade fragment of geometry in pixel
                replace pixel with new shaded fragment

    As you can see, the only distinction is the order of the two loops. For ray-tracing, traversing the pixels is in the outer loop, and the geometry in the inner loop. For scanline rendering, it's the opposite. This has huge consequences in terms of cache coherency. With scanline methods, since the same object is being shaded in the inner loop, and neighboring fragments of the same object are being shaded, cache coherency tends to be extremely high. The same shader program is used, and the likelihood of the texture being accessed from cache is very good. The same can't be said for ray-tracing. You can shoot two almost identical rays but touch wildly different parts of the scene. Cache coherency relative to scanline rendering is abysmal.
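
    The two loop orders can be sketched in runnable Python on a toy scene. Everything here is an assumption made for illustration: the objects are axis-aligned rectangles at fixed depths, rays are orthographic with one per pixel, "shading" just writes a letter, and the second loop order is written as a plain depth-buffer pass.

```python
# Each "object" is an axis-aligned rectangle: (x0, y0, x1, y1, z, color).
# Smaller z is closer to the viewer. Rays are orthographic, one per pixel.
SCENE = [
    (0, 0, 6, 4, 5.0, "A"),
    (2, 1, 8, 6, 3.0, "B"),
    (4, 0, 5, 8, 1.0, "C"),
]
WIDTH, HEIGHT = 8, 8


def covers(obj, x, y):
    """True if pixel (x, y) lies inside the rectangle."""
    x0, y0, x1, y1, _, _ = obj
    return x0 <= x < x1 and y0 <= y < y1


def render_raytraced(scene):
    """Pixels in the outer loop, geometry in the inner loop."""
    image = [["." for _ in range(WIDTH)] for _ in range(HEIGHT)]
    for y in range(HEIGHT):
        for x in range(WIDTH):
            hits = [obj for obj in scene if covers(obj, x, y)]
            if hits:
                nearest = min(hits, key=lambda obj: obj[4])
                image[y][x] = nearest[5]  # "shade" the frontmost object
    return image


def render_zbuffer(scene):
    """Geometry in the outer loop, pixels in the inner loop."""
    image = [["." for _ in range(WIDTH)] for _ in range(HEIGHT)]
    depth = [[float("inf")] * WIDTH for _ in range(HEIGHT)]
    for obj in scene:
        x0, y0, x1, y1, z, color = obj
        for y in range(max(y0, 0), min(y1, HEIGHT)):
            for x in range(max(x0, 0), min(x1, WIDTH)):
                if z < depth[y][x]:  # in front of what is already there
                    depth[y][x] = z
                    image[y][x] = color
    return image


if __name__ == "__main__":
    a, b = render_raytraced(SCENE), render_zbuffer(SCENE)
    print("\n".join("".join(row) for row in a))
    print("identical:", a == b)
```

    Both functions produce the same image; what differs is the memory access pattern. The depth-buffer version touches one object at a time, while the per-pixel version may touch every object for every pixel.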

    This one performance side-effect of ray-tracing is the only reason we haven't seen any serious ray-tracing for realtime applications. Even in offline rendering, scanline rendering dominates even though software ray-tracing has been available from the beginning of CG. For ray-tracing to become viable, we need more than just more CPU cores. We need buses fast enough to feed all the cores in situations where we have an extremely high ratio of cache misses. Unfortunately, the speed gap between memory speeds and compute power seems to be increasing in recent years.

    1. Re:Raytracing vs. Scanline for Realtime by smallfries · · Score: 1

      The nice effect of placing the geometry loop on the outside is that clipping becomes a coherent decision for large groups of pixels. Again this has nice effects in both control flow, which can be amortised over many pixels, and the relative depth of pixels. If you ignore intersecting geometry then you can optimise even more pixels out of the calculation.

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    2. Re:Raytracing vs. Scanline for Realtime by smallfries · · Score: 1

      [damn should have used preview] : The *other* nice effect...

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    3. Re:Raytracing vs. Scanline for Realtime by Anonymous Coward · · Score: 0

      That's not a scanline algorithm, that's a straight Z-buffer algorithm.

      A scanline algorithm (as the name implies) operates in screen space, line by line, like this:

      for y
          while x
              find next object(s) start
              find front object at start
              if front is not current object, draw current until start and set current to front

      It's actually all right cache-wise because you always write in the screen buffer in the right order (so no read to write), there's no reading back the Z-buffer, and because objects that are in front on one line (likely to be cached) will tend to be in front on the next one (the algo can be improved to use that).

      The problem with cache and ray-tracing is not the ordering of the loops, it's that the "shade frontmost geometry" is actually a recursive "trace ray".

    4. Re:Raytracing vs. Scanline for Realtime by h890231398021 · · Score: 1

      You don't mean cache coherency, you mean cache locality.

    5. Re:Raytracing vs. Scanline for Realtime by Anonymous Coward · · Score: 0

      No, actually. Read the Intel research paper that TFA is based on. There are ways to significantly improve raytracers' behavior - for example, by shooting "beams" (groups of rays) out at the same time, and using good data structures.

    6. Re:Raytracing vs. Scanline for Realtime by DamnStupidElf · · Score: 1

      Cache coherence for ray-tracers sucks.

      I have wondered whether it would make sense to generate ray lists and sort those by the objects they intersect, and then process the intersections sorted by object in separate threads, thus keeping the cache coherency. It's just a mixture of rasterization and raytracing, with the only real difference being that only intersections would be computed on the first pass, and then the raytracing would take over and only stick the nearest intersection for each ray into a queue for shading and for generating new rays for reflections, shadows, and refractions, which would go back into the intersection queue. I guess now I need to try it out.
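
      A minimal Python sketch of the first half of that idea: do the intersection pass, then bucket the rays by the object they hit so each batch shades a single object. The toy scene, the object names and both function names are hypothetical, and the sketch stops before shading or spawning secondary rays.

```python
from collections import defaultdict

# Hypothetical scene: rectangles as (x0, y0, x1, y1, z), smaller z is closer.
OBJECTS = {
    "floor": (0, 0, 8, 8, 9.0),
    "crate": (1, 1, 4, 4, 4.0),
    "lamp": (3, 3, 6, 6, 2.0),
}


def nearest_hit(x, y):
    """Intersection pass: return the id of the closest object under (x, y)."""
    best_id, best_z = None, float("inf")
    for obj_id, (x0, y0, x1, y1, z) in OBJECTS.items():
        if x0 <= x < x1 and y0 <= y < y1 and z < best_z:
            best_id, best_z = obj_id, z
    return best_id


def bucket_rays(rays):
    """Group rays by the object they hit so each batch shades one object."""
    buckets = defaultdict(list)
    for ray in rays:
        hit = nearest_hit(*ray)
        if hit is not None:
            buckets[hit].append(ray)
    return buckets


if __name__ == "__main__":
    rays = [(x, y) for y in range(8) for x in range(8)]
    for obj_id, batch in bucket_rays(rays).items():
        # A real tracer would shade the batch here and queue secondary rays.
        print(obj_id, len(batch))
```

      Whether the bookkeeping pays for itself would depend on how expensive the shading is compared with the sort, which this sketch does not measure.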

  27. Film at 11 by jalefkowit · · Score: 4, Insightful
    A report by Intel about Ray Tracing shows that a single P4 3.2Ghz is capable of 100 million raysegs, which gives a comfortable 30fps.

    Extra, extra! This just in! Report from CPU vendor discovers that you should spend more money on your CPU and less on your graphics card!

    Shocking, I tells ya. Shocking.

    1. Re:Film at 11 by GraZZ · · Score: 1

      Especially shocking since that CPU vendor's competitor has recently purchased a company that produces graphics cards...

  28. Note that OpenRT is not open source by Anonymous Coward · · Score: 3, Interesting

    It's nice that people are working on ray traced games, but please note the following:

    Oasen is based on "OpenRT" --- which is entirely proprietary, and is NOT open source. Their FAQ explains that clearly.

    I'm sure that I'm not the only person annoyed at their use of "open" to mean "closed".

    Time to look for an open-source raytracing engine designed for interactive use ...

    1. Re:Note that OpenRT is not open source by jared9900 · · Score: 2, Interesting

      I just feel that it should be mentioned that part of the reason for that naming is that it attempts to stay in line with the OpenGL API conventions, making the user experience of the OpenRT API seem more familiar. And just like OpenRT, OpenGL is not open source.

    2. Re:Note that OpenRT is not open source by PitaBred · · Score: 1

      Of all the times to not have mod points... someone wanna mod him up? It may not be an open implementation, but...

  29. Makes sense by John+Betonschaar · · Score: 1

    This makes sense, and I've been saying this since details about designs like the Cell processor were announced. Not specifically for raytracing, but also for normal 3D rasterization. In time, I predict GPUs will disappear entirely, and CG will move back to software rendering. Most people find this ridiculous as they only remember software rendering from Quake 2-generation engines, and software rendering looked awful compared to hardware rendering back then. But there is no reason at all why a CPU couldn't produce the exact same quality rendering as a GPU, it's just too expensive at the moment.

    Think about it: if a Cell-like CPU with N cores can be programmed efficiently to use (e.g) N/4 cores for rendering, and be able to scale image (AA/filtering/NURBs tesselation/etc) quality with the number of cores, and have enough power left for physics and logic, what's the future for GPU's? If the solution is slower then a current GPU, the number of cores will grow eventually and at one moment surpass GPU performance. At that point software rendering would be more attractive then hardware rendering, as it is much more flexible/upgradable and scalable than a (more or less) fixed-function GPU. For example you wouldn't need special/extra hardware to be able to play both software rendered as well as software raytraced games, as both just run on some of the CPU's cores.

    1. Re:Makes sense by Viking+Coder · · Score: 1

      But there is no reason at all why a CPU couldn't produce the exact same quality rendering as a GPU, it's just too expensive at the moment.

      The problem with your logic is that the effective performance of CPUs is doubling every 18 months, and the effective performance of GPUs is doubling every 12 months. This holds over a period as long as GPUs have been in production. In other words, it will always be less expensive to produce a rendered image on a GPU than on a CPU. Now, if you can invent some image calculations that only a CPU is capable of (photonic wavelet, blah, blah, blah), then you shift the argument - but only for as long as it takes some geek to figure out how to do it on a GPU. Even if the CPU is then four times as fast as the GPU at it, it will only take a couple years for the GPU to catch up.
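      The doubling-rate argument above can be checked with a few lines of arithmetic. This is only a sketch that takes the two quoted doubling periods (18 and 12 months) at face value; the function name is made up for illustration.

```python
import math

def months_to_close_gap(gap, slow_doubling_months, fast_doubling_months):
    """Months until the faster-improving chip erases a `gap`-fold deficit."""
    # Net advantage of the faster chip, measured in doublings per month.
    relative_doublings_per_month = 1.0 / fast_doubling_months - 1.0 / slow_doubling_months
    return math.log2(gap) / relative_doublings_per_month

# A 4x deficit, CPU doubling every 18 months, GPU doubling every 12 months.
catch_up = months_to_close_gap(gap=4, slow_doubling_months=18, fast_doubling_months=12)
print(f"{catch_up:.0f} months")
```

      With these inputs the relative gain is 2^(t/36), so a 4x gap closes in about 72 months. By this arithmetic "a couple years" is on the optimistic side, although the direction of the argument is unchanged.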

      And by the way, calling current generation GPUs "fixed-function" is ridiculous. Look at gpgpu.org, for instance.

      --
      Education is the silver bullet.
    2. Re:Makes sense by Anonymous Coward · · Score: 0

      I would say right now that GPUs are just 'catching up' to what a CPU can do speed wise. They are just starting to get there. They will be on the 18 month bandwagon like the rest of the industry very quickly.

      I think the original poster is correct, though not in the way he thinks. I think it will be more of a merging of the two, since what doubles is not speed but the number of items on the CPU. They have merged the 'extra' CPUs. Now it is time for the other chips, such as the memory controller (AMD has done this already), PCI bus, etc. The northbridge/southbridge these days have pretty much gobbled up all the other extra chips on the board. It is just a matter of time before that one gets gobbled up too. Then probably the GPU. Then its memory next.

      I do not see an 'extra' core that just does graphics being feasible long term. A more generic CPU that does both is where it will end up. When will we see all this? By my calcs somewhere between 2010 and 2012. It's going to be very cool!

    3. Re:Makes sense by Viking+Coder · · Score: 1

      AC, I can't begin to understand your assessment that GPUs are just now 'catching up' to what a CPU can do speed-wise. You can't run Doom 3 on a CPU alone at the same speed as on a GPU. It can't do it. No way, no how. For specialized rendering tasks, a CPU isn't even close to a GPU.

      --
      Education is the silver bullet.
    4. Re:Makes sense by PastaLover · · Score: 1

      It's an interesting idea, but with game textures easily surpassing several megabytes in size and needing to be there for every frame you're calculating I'd say you'd be seeing a lot of memory traffic that would normally stay inside the graphics card and thus seriously damage your performance. This at least for raster based calculations. The author's point seems to be it works better with raytracing though I'm not sure I agree.

    5. Re:Makes sense by John+Betonschaar · · Score: 1

      For specialized rendering tasks, a CPU isn't even close to a GPU.

      Not yet, no. But I wasn't thinking about Pentiums or Opterons anyway, but CPUs with highly specialized vector units. As a lot of 3D rasterization work lends itself very well to parallelization, a number of these vector units could approach GPU performance in some parts of the rendering pipeline. For those parts that do not map well to vector code, a more general-purpose core can be used to tie the vector-based pipeline stages together. A Cell CPU with its PowerPC controller and configurable number of vector units looks like it might be very well suited for such tasks.

      By the way, you are right about GPU performance: no current AMD/Intel CPU even comes near current GPUs on typical GPU tasks (not even within 1 or 2 percent for such tasks). But that's because they're really general-purpose processors, and not designed for these tasks at all. Turning this argument around: a current GPU doesn't come within hundredths of a percent of a CPU on real general-purpose tasks. GPUs only perform on highly specialized tasks, and although it's possible to 'simulate' a few general-purpose tasks on a GPU reasonably efficiently, this is not true for most of them...

      If you ask me, a hybrid general-purpose CPU with a configurable number of vector units could combine the best of both worlds, and obsolete specialized GPUs in the future...

    6. Re:Makes sense by Helios1182 · · Score: 1

      The bigger point to be made is that as scene complexity grows linearly, rendering requirements grow at O(n) for raster graphics, but only O(lg n) for ray traced graphics. So at some point it will be more efficient to use ray traced graphics -- even if GPU performance doubled every 6 months.

    7. Re:Makes sense by Viking+Coder · · Score: 1

      You're over-simplifying! You're presuming that "scene complexity" is equivalent to "scene geometry," and that's not true. There are more effective techniques for scaling up the quality of your output image which work just fine, such as procedural textures, or even just texture look-ups. When you use those textures for such things as displacement mapping, you're really talking.

      --
      Education is the silver bullet.
  30. Oh! by Anonymous Coward · · Score: 0

    Raytracing - something involving complex mathematics and considerable computing time, performed by the CPU - actually goes faster when you add more horsepower? It "scales" with more processing power? Peter Griffin says: NO FREAKING WAY!

  31. let's do the math... by Anonymous Coward · · Score: 1, Insightful

    Four 3.2 GHz CPUs gives:

    3.2 * 1,000,000,000 * 4 = 12,800,000,000 Hz

    Assume resolution 640x480 and framerate 30:

    640 * 480 * 30 = 9,216,000

    OK, now let's see how many cpu cycles we're gonna have for each ray:

    12,800,000,000 / 9,216,000 = ~1388.89

    Conclusion:

    Can you complete a raycast in roughly 1,400 cycles? Not a chance.

    And even if you could - there would be _nil_ cycles left for sound, game mechanics etc etc... //0xFE
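    The same back-of-the-envelope budget, written out so other resolutions or core counts can be plugged in. It assumes exactly one primary ray per pixel per frame and nothing else running on the cores; the function name is made up for illustration.

```python
def cycles_per_ray(clock_hz, cores, width, height, fps):
    """Clock cycles available per primary ray, assuming one ray per pixel per frame."""
    rays_per_second = width * height * fps
    return clock_hz * cores / rays_per_second

# Four 3.2 GHz cores, 640x480 at 30 frames per second.
budget = cycles_per_ray(3.2e9, 4, 640, 480, 30)
print(f"{budget:.2f} cycles per primary ray")
```

    Secondary rays (shadows, reflections) would divide this budget further.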

    1. Re:let's do the math... by xenocide2 · · Score: 1

      It might make more sense to develop a purpose-specific card to push raytracing graphics, rather than an Intel chip. I've seen a university project to build one and they demonstrated an architecture that pretty much scaled up performance as they added more silicon. Which is to be somewhat expected; give each chip a region of view to render and they can cast their rays independently. The critical aspect of this system is memory, and memory contention. The good news is that rendering doesn't need to modify the shared geometry, and they should be rendering to different parts of the frame buffer, so it won't be a process synchronization hell (there will still be some problems when you need to modify that geometry when the frame is done, and figuring out how to ). Critics will tell you that this is going to destroy the cache, but the news is, what little cache coherency is left on raster cards will continue to die out as textures increase in size, and as we introduce multiple texturing (bump mapping, textures, dirt maps, alpha maps, etc.) per polygon. id's new Quake Wars is using some ridiculous texture size to reduce terrain fractals, and no matter how it's done it's gotta put the GPU memory under strain. GPUs have been pushing memory technology for a while now; GDDR3, common in cards for the last three or four years, is DDR2 improved to reduce heat production (and thus up the clock).

      Compared to traditional raster-based solutions, ray tracing hardware is under-researched. There's a whole host of possibilities with gaming in mind. You can do iterative improvement of the frame. You can dedicate more rays to the middle of the frame (presumably the object of interest). And that's just shit some dude who hasn't spent years of his life pondering this came up with. I'll be much more interested to see what's presented this September at IEEE's RT raytracing conference.
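      The "give each chip a region of view" idea can be sketched as a simple split of the frame into horizontal bands, one per worker. This is an illustrative assumption about how the regions might be divided, not a description of the university project; `split_rows` is a made-up helper.

```python
def split_rows(height, workers):
    """Divide `height` scanlines into contiguous (start, end) bands, one per worker.

    Bands are half-open ranges and differ in size by at most one row.
    """
    base, extra = divmod(height, workers)
    bands, start = [], 0
    for i in range(workers):
        end = start + base + (1 if i < extra else 0)
        bands.append((start, end))
        start = end
    return bands

# Four workers sharing a 480-row frame: each writes only its own rows.
print(split_rows(480, 4))
```

      Because the bands never overlap, each worker writes to its own part of the frame buffer and only reads the shared geometry, which is why no locking is needed during the trace itself.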

      --
      I Browse at +4 Flamebait

      Open Source Sysadmin

  32. Need a reason... by Corwn+of+Amber · · Score: 1

    The reason to add extra cores is that, with our current processors made for single-tasking, with their only stack and set of registers - hyperthreading is the exact same thing, with only 2 stacks and register sets - programs (threads, processes, whatever) are supposed to be run one on each processor. Let's change that, first. Let's put, say, a whole lot more registers, drop the hardware stack and use a software one - make everything software-based, reduce the instruction sets, use WAY more powerful SIMD, and add numerous cores. THAT is the design we NEED since forever. As for GPUs, RPUs, and so on - let's just drop it. They offer nothing really valuable. Raytracing has been The One True Way since forever. Now we have the CPU power to use it in real-time. Why has no one ever thought of using numerous general-purpose chips (68000s?) on additional cards to do only raytracing? Because it would be BETTER? Raster graphics have always sucked.

    --
    Making laws based on opinions that stem up from false informations leads to witch hunts.
    1. Re:Need a reason... by Viking+Coder · · Score: 1

      Raster graphics have always sucked? Seriously? You think that Ice Age 2 looks better than The Incredibles? If anything, I think it's the other way around.

      And by the way, having a reduced instruction set and adding a ton of processors is exactly what a GPU is.

      --
      Education is the silver bullet.
    2. Re:Need a reason... by Anonymous Coward · · Score: 0

      Why has no one ever thought of using numerous general-purpose chips (68000s?) on additional cards to do only raytracing?

      Many, many people have thought of that over the years but they have always turned out to be people who didn't understand the problem.

      I am sure you are different and have taken into account the cache coherence issues, the massive communications bandwidth required not to mention memory and power dissipation.

      Otherwise you'd just be another unqualified idiot who thinks they've had a major insight even though it's a simple and obvious idea. And I can't believe that someone like that would be posting on Slashdot.

  33. Exact Shapes by MrSteveSD · · Score: 1

    Surely an advantage of raytracing would be that you could have exact shapes, e.g. instead of approximating a cylinder by an octagonal prism, you could just have a real cylinder. What's more, pure shapes like cylinders, spheres, etc. would take up less memory than their approximations.
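    As an illustration of an exact shape, a sphere can be intersected analytically by solving a quadratic, with no triangles involved. A minimal sketch, assuming the ray direction is already normalized; the function name and tuple layout are made up for illustration.

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Nearest non-negative hit distance along a unit-length direction, or None on a miss."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * v for d, v in zip(direction, oc))   # half of the linear coefficient
    c = sum(v * v for v in oc) - radius * radius    # constant term
    disc = b * b - c                                # quarter of the usual discriminant
    if disc < 0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):                # near root first, then far root
        if t >= 0:
            return t
    return None

# A ray fired down the z axis at a unit sphere centered on the origin.
print(ray_sphere_hit((0, 0, -5), (0, 0, 1), (0, 0, 0), 1.0))
```

    The whole sphere is four numbers (center and radius), which is the memory point being made above.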

  34. Lies, Damned Lies and RT Raytracing by adam31 · · Score: 4, Informative
    If there's one thing the RT raytracing community is good at, it's explaining how well it works in theory. Take some numbers, extrapolate a little in one dimension, then another and BOOM-- The Future. There are several problems with raytracing in real-time:


    1) Static Objects Only. The huge majority of computation time is spent traversing a spatial subdivision structure. It happens that K-d trees offer the best characteristics (typically, fewest primitives per leaf for a given memory limit). However, these are really heinous to dynamically update. You can cheaply re-create it with median partitioning, but your trees are crappy. You can do a much nicer SAH (surface area heuristic), but to do this per frame blows out your CPU budget.

    2) Bandwidth. Even if you could update your subdivision structure very cheaply, that structure still needs to be propagated out to all the CPUs participating in the raytrace. For the 1.87 MTri model they list on page 6, their spatial structure was 127 MB. Say you have a bandwidth of 6 GB/s, it takes 20ms just to transfer the structure (and there are other problems here). So your ceiling is 50 Fps before you trace your first ray.

    3) Slower than a GPU. Even though they give you some little graph showing that raytracing (a static model, with static partitioning) beats a GPU at a MTri in the frame, this is very deceiving. The GPU pipeline works such that zillions of sub-pixel triangles simply can't get into pixel shaders fast enough, and force the pixel shader to be run many extra times. Double the resolution, however, and the GPU won't take a cycle longer... with raytracing, performance will halve. So they found a bottleneck in the GPU which is totally unrepresentative of a game in every single sense, and said LOOK! BETTER! (in theory).

    4) Hey, Where's my Features? All the cool things about raytracing (nice shadows, refraction, implicit surfaces, reflection, subsurface scattering) all get tossed out the window to make it real-time! What's the point, then? Given all the pixel shader hacks invented to make a GPU frame look interesting, the quality that can be achieved in a real-time raytrace is sadly tame. Especially when you consider that quality is the supposed advantage of raytracing.

    And c'mon. It's Gameplay that counts anyway :P
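    The bandwidth ceiling in point 2 is easy to reproduce. This sketch assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes) and that the whole structure is re-sent every frame; the function name is made up for illustration.

```python
def transfer_ceiling(structure_bytes, bandwidth_bytes_per_s):
    """Return (milliseconds per transfer, max frames per second) for one full copy per frame."""
    seconds = structure_bytes / bandwidth_bytes_per_s
    return seconds * 1000.0, 1.0 / seconds

# 127 MB spatial structure over a 6 GB/s link.
ms, fps = transfer_ceiling(127e6, 6e9)
print(f"{ms:.1f} ms per transfer, at most {fps:.0f} frames per second")
```

    The unrounded figures come out slightly worse than the rounded 20 ms and 50 fps quoted in point 2, at a little over 21 ms and a little under 48 fps.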

    1. Re:Lies, Damned Lies and RT Raytracing by TheRaven64 · · Score: 2, Interesting
      I read a paper a couple of years ago about a ray tracer that updated the pixel at each ray iteration. This meant that when you were moving quickly, you got a lower-quality picture, but you didn't notice because you were moving quickly. If you stopped, then the detail appeared very quickly (over 2-5 frames, as I recall; too quick for the user to notice that it was appearing).

      As for dynamic scenes, this is actually easier in many ways with a ray tracer. If you start with a scene graph API, you just need to send the changes each frame. How much changes in a typical game? Most of the scenery is fairly static. The characters move and deform slightly (you can often get away with a spatial transfer function, rather than a real change to the geometry in this case). With a traditional graphics pipeline, you still need to redraw every single polygon every frame. With a ray tracer, you can cache huge amounts of the scene (in terms of secondary rays; simply save the results of them as a texture and perform a lookup here for the next ray, and invalidate the texture when something moves between it and any of the light sources).

      Ray tracers have been running in real time for a while (take a look at Utah), but not on cheap hardware. The hardware will catch up soon. Take a look at some of the designs from Microsoft Research; they have some very shiny FPGA-based logic which does 100% procedural graphics and is likely to show up in the XBox 3.

      --
      I am TheRaven on Soylent News
    2. Re:Lies, Damned Lies and RT Raytracing by Guysmiley777 · · Score: 1

      And c'mon. It's Gameplay that counts anyway :P

      But gameplay doesn't sell hardware. You've got a lot to learn about the modern gaming industry. :)

      Although now that I think about it, I'd probably buy an "AI Coprocessor" if it meant better AI in games...

      --
      Coding with assembly is like playing with Legos. Coding an application in assembly is like building a car with Legos.
    3. Re:Lies, Damned Lies and RT Raytracing by JustNiz · · Score: 1

      >> I'd probably buy an "AI Coprocessor" if it meant better AI in games...

      No that really is a much better use than graphics for that 2nd ( or 3rd or 4th) core in your cpu.

    4. Re:Lies, Damned Lies and RT Raytracing by Anonymous Coward · · Score: 0

      Ok, after reading all this (which was actually quite entertaining, just because it is really funny when so many people know so little), here some comments from me:

      First of all: I know pretty much every aspect of this topic since I have been doing HQ Rendering and Animation in Softimage/Maya/Lightwave for some years, have written a GPU-based Rendering solution providing a similar quality to HQ Rendering for some scenes and am currently writing my own realtime raytracer.
      And in my opinion, realtime raytracing will be the winner. Not yet, except for industrial applications (where it is already used by companies like Volkswagen), but probably within the next 5 years.

      But first: Remember that you are comparing apples and pears. Or more precisely: You are comparing highly specialized graphics hardware with a general-purpose CPU. So a good raytracing-based accelerator will also have special units for shaders, textures and so on. The problem is just that these cards are not available yet (although I was told a Japanese company is very interested in the hardware design the Saarland guys proposed). But this is likely to change.

      Now my comments to the 4 points mentioned:

      1. Static Objects only: This has just been proven wrong. Of course, kd-trees are said to deliver the best performance, and building a good kd-tree is expensive. But various papers published this year showed a SAH-based kd-tree build for a 180k-triangle scene taking 0.69 seconds on a 2.4GHz Core 2 Duo while losing just about 3-4% against a fully optimized kd-tree.
      Also, kd-trees are not the only way to go, since they also have a large memory footprint. So I use another acceleration structure at the moment that takes about 400ms to build for a 200,000-triangle model on my Pentium M with a SAH approximation (and I am definitely not the best coder in the world). Since the build process can be distributed to multiple cores without too much headache, a full rebuild should be possible in the near future for scenes of this size. As for trace performance: I am currently tracing over 1 million completely random rays per second on scenes of this size without even using SIMD. Coherent rays, like the primary rays to find the surface you want to shade, will be much faster (cache coherence!!!).

      2. The Bandwidth comment is not really correct: A scanline renderer needs to send all the triangles through the pipeline for each frame. A raytracer does not need this; it just needs to fetch the nodes a ray traverses, so the amount of data depends more on the resolution than on the size of the scene.
      For small scenes, a scanline renderer is faster than a raytracer, which needs to send rays for each pixel and traverse the acceleration structure. But this is also why raytracing wins as soon as the scene gets large (tracing a scene of 100 million triangles is not much slower than tracing one of 1 million). So it is a question of finding the point where raytracing gets cheaper than scanline rendering.

      3. Slower than GPU: Guess what: a VW Multivan is slower than a Ferrari....but I wouldn't want to put all my friends into a Ferrari. You are right about the resolution problem, but as already said, this is a question of how large your scene is. Throw 1 million triangles at a scanline renderer - no problem. Throw 2 million triangles at a scanline renderer - performance is halved. So it is the same, just another aspect. And actually: Most people don't use TFTs with a resolution above 1280x1024. So once raytracing reaches this, no one will care what doubling the resolution costs. The good thing with raytracing is: Each ray can be processed by a single CPU (or core, or on specialized hardware: pipeline). So things will get interesting when we finally have 24-core CPUs.

      4. I guess I already answered that. But generally: One of the reasons realtime raytracing doesnt look so good now may be, because the field is mainly populated by programmers - not artists. And believe me, this really makes the difference. inView for example (the realtime raytracing softwa

  35. Three Words by GeffDE · · Score: 3, Informative

    The Cell Processor

    Three or four people have brought up the idea that this problem would work well on the Cell processor. But I don't think anyone has really seen the (rays of) light on the issue. The Cell is perfect for this. Some facts:
    1) Raytracing is highly vectorized. The Cell's many processors are optimized for vector calculations.
    2) Raytracing scales linearly with the number of cores. The Cell has 8 (at least in its current manifestation).
    3) The Cell is already available as a PCI-Express add-in card (that even runs linux!) which sounds awfully like what a GPU is...
    4) The Cell is a bitch to program. But then, so are GPUs...so maybe it's not that ridiculous to see the future of the GPU...from IBM.

    How ironic it is that Intel is now pushing this technology...

    --
    It has been a nervous year, with people beginning to feel like Christian Scientists with appendicitis.
    1. Re:Three Words by ichigo+2.0 · · Score: 1

      4) The Cell is a bitch to program. But then, so are GPUs...so maybe it's not that ridiculous to see the future of the GPU...from IBM.

      I do not know how difficult cell is to program, but I can assure you that GPU programming is no harder than CPU programming (as long as you don't code your shaders in assembler of course).

    2. Re:Three Words by Corwn+of+Amber · · Score: 1

      The Cell is already available as a PCI-Express add-in card (that even runs linux!) which sounds awfully like what a GPU is...

      A GPU for $8000 and that's sold by SONY, of them all, at $600 with a Blu-Ray player and other things. Now that sounds an awful lot like getting raped in the ass with the HUGE gold bar you have to hand them.

      --
      Making laws based on opinions that stem up from false informations leads to witch hunts.
    3. Re:Three Words by TeknoHog · · Score: 1
      A GPU for $8000 and that's sold by SONY, of them all, at $600 with a Blu-Ray player and other things. Now that sounds an awful lot like getting raped in the ass with the HUGE gold bar you have to hand them.

      No, you're getting it all wrong! This is Slashdot, where we collectively bash Sony for making a hugely expensive game console.... or praise Sony for selling a monster workstation ridiculously cheap.... I forget which way it is this week ;)

      --
      Escher was the first MC and Giger invented the HR department.
    4. Re:Three Words by Anonymous Coward · · Score: 0

      Someone has seen the light.

  36. SGI Siggraph 2002 demo by DotDotSlasher · · Score: 2, Informative

    SGI had a ray tracing demo at Siggraph 2002. On the show floor, a 128-processor SGI box ran demos at around 20hz at about 512x512 pixels.
    http://www.sci.utah.edu/stories/2002/sum_star-ray.html
    They make some good points about geometric complexity increasing much faster than displayed pixels, so there are fewer graphics primitives per pixel, so scan-line-based algorithms will make less sense.
    So in 2002 it took 128 processors to run at 20Hz at 512x512 pixels. And now we think quad-cores will be enough to render today's complex environments? That math doesn't add up to me. I think scan-line algorithms are the mainstream answer for a long time coming...
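    To put rough numbers on that comparison, the demo's pixel throughput can be divided per processor and set against a target frame. This is a sketch under loud assumptions: the 640x480 at 30 Hz target is borrowed from elsewhere in this thread, and the calculation ignores scene complexity, secondary rays, and any per-processor speedup since 2002.

```python
def pixels_per_second(width, height, hz):
    """Pixels that must be shaded each second at a given resolution and refresh rate."""
    return width * height * hz

demo_rate = pixels_per_second(512, 512, 20)       # the 2002 demo as described
per_processor = demo_rate / 128                   # naive even split over 128 processors
target_rate = pixels_per_second(640, 480, 30)     # hypothetical target frame
processors_needed = target_rate / per_processor   # at the 2002 per-processor rate

print(f"{per_processor:.0f} pixels/s per processor, {processors_needed:.0f} processors for the target")
```

    Whether four modern cores can stand in for that many 2002-era processors is exactly the open question; the arithmetic only shows how large the gap is at the old per-processor rate.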

    1. Re:SGI Siggraph 2002 demo by JustNiz · · Score: 1

      128-processor SGI box ran demos at around 20hz

      Jeez and I thought those things were fast. Even my old Amiga ran at 4 Mhz or something. :-)

    2. Re:SGI Siggraph 2002 demo by tbp · · Score: 1

      You're neglecting recent progress in the field; as stated in the FTA for traversal it's all about coherence. Note that coherent traversal (read SIMD friendly) has been transposed to other hierarchies, grid, bvh, bih... you name it.

      http://www.mpi-sb.mpg.de/~wald/PhD/wald_phd.pdf
      ftp://download.intel.com/technology/computing/applications/download/mlrta.pdf
      Etc...

      Scan-line will remain mainstream as long as it will be the only method with a cheap specialised hardware implementation in town, even if objectively it doesn't make much sense.

  37. Scientific American article by vivin · · Score: 1

    This month's edition of Scientific American had a good article on Ray Tracing. Basically, how it can be more feasible with the faster/better hardware we have today. The article is available here, but unfortunately, you have to pay for it. The article focused on new software and hardware techniques for Ray Tracing being developed at Intel. They say that Ray Tracing is "poised to replace raster graphics" because it "scales well with hyper threading and multi-processor configurations.". Also the "cache hierarchy associated with CPU's is very effective at managing the external memory bandwidth requirements". With multicore processors entering the mainstream, they may have a point.

    I wish I could remember more from the article, but I read it some time ago.

    --
    Vivin Suresh Paliath
    http://vivin.net

    I like
    1. Re:Scientific American article by usrusr · · Score: 1

      what you described is pretty much what the paper which would be "TFA" in "RTFA" is about. scaling with hyperthreading and multicore, and the good hit rate with common cpu cache hierarchies. (where the "hyperthreading" part tells us something about when the underlying experiments were done)

      looks like that article was mostly about that same paper.

      --
      [i have an opinion and i am not afraid to use it]
  38. Another one by gr8_phk · · Score: 1, Troll

    I did RTchess a few years back (a link would kill my friend's server). The core RT code has been pulled into a library and improved significantly since then. I was actually meaning to write an article making the same point as the one in the summary. Multi-core will make realtime ray tracing common in five years, and then there will be no use for the GPU. Why rasterize when you can ray trace instead? Ray tracing scales exceptionally well with polygon count (log n). Why add a second chip? Not to mention the geometry needs to be present in the CPU anyway when you do physics. Why maintain all the geometry data in 2 places?

    The Intel guy has some funny stats about ray-somethings per second. Intersection tests are irrelevant. Generated rays per second is too. There is already a "Benchmark for Animated Ray Tracing" called BART. Frame rates on those animations are much more important. Unfortunately I haven't even had time to patch that code together with mine to get numbers. It's down there on the to-do list. Is Intel hiring? If someone could pay me to work on it, things would come together quickly.

    1. Re:Another one by Nahor · · Score: 2, Interesting

      I remember a talk from someone (John Carmack I think) saying something like raytracing is nice but overkill. Today's hardware may be able to handle realtime raytracers, but nowhere near the quality you can get from current 3D engines.

      Most special effects you see in current engines are approximations/hacks compared to what you can do with a raytracer, but they're also way cheaper to compute.

      It's the same kind of relationship as between texture maps and procedural textures. Procedural textures are better from a rendering point of view; they scale better. But it's also a lot harder to make a good quality texture and it requires a lot more power to render.

  39. what about intersections? by Anonymous Coward · · Score: 0

    That's an interesting calculation (part of what I was trying to estimate from the article; I'm trying to reconcile "30 fps!" with "But Povray/Bryce/Maya can take _hours_ per frame"). So you need to cast rays (calculate colors) for 9.2 million pixels every second.

    But what I'm wondering about is, the more relevant number may be the number of intersections you have to calculate. If you have 100 shape objects in your scene, you need to test each ray against 100 intersections. You really probably need thousands of shapes to make a decent scene. (How many polygons are in an average scene in a modern 3d game? For raytracing, you don't need to break everything up into polygons...but it still requires a lot of shapes.) Then, you still need to worry about reflections, oversampling, light sources, and transparency. (For accurate shadows, you'd have to trace each ray back to each light source rather than choose a static color.) These effects make raster graphics more difficult, too - but they seem to have addressed them.

    Plus, it's not clear...are they describing raytracing using full floating-point arithmetic? Or are they using fixed-point integer arithmetic? I've always seen people use floating-point for raytracing since a fixed-point algorithm is much messier and subject to inaccuracies, but I suppose fixed point could be done.

  40. More interesting to me... by jhfry · · Score: 1

    ... would be an FPGA that sits on a core. For those who are not familiar, a Field Programmable Gate Array is essentially a piece of hardware that can be "programmed" to perform specialized tasks, especially sequential ones, at faster speeds than software on a general purpose CPU. Imagine a fully programmable coprocessor with blazing access to RAM and a hypertransport to the general purpose cores for more complex functions that are hard to script in hardware.

    I have seen comparatively weak, 1M gate FPGA's encode OGG Theora at 1280x1024 at almost 30fps (http://www3.elphel.com/en/products), imagine what a FPGA with a far greater number of gates, communicating at processor speeds could do.

    Sure it would need to be reprogrammed for every task it needed to perform, however if I'm doing video encoding or decoding, I wouldn't mind reprogramming it. Game designers could even use the FPGA to accelerate tasks from AI to raytracing to whatever.

    It may not be as fast as a coprocessor built for a single task, but it would be a heck of a lot more versatile. And it could be reprogrammed if any bugs are discovered.

    --
    Sometimes the best solution is to stop wasting time looking for an easy solution.
    1. Re:More interesting to me... by inKubus · · Score: 1

      So you could have one of these as a peripheral, and just load a "program" into it, then run whatever task you want to do on it? How long does it take to load the program, and how much faster will it encode videos, etc?

      --
      Cool! Amazing Toys.
    2. Re:More interesting to me... by JustNiz · · Score: 1

      >> a piece of hardware that can be "programmed" to perform specialized tasks, especially sequential ones, at faster speeds than software on a general purpose CPU.

      So just like, say, a 3D graphics card then?

  41. Ah, but can you picture it in raytraced 3D? by Moraelin · · Score: 1

    Just imagine the possibilities. Imagine moving your ship among asteroids that aren't just outlines, but fully texture-mapped bump-mapped gloss-mapped anti-aliased anisotropic-filtered self-shadowed pixel-shaded, and with lens-flare and bloom effects to boot. And rendered in HDR too!

    I smell a winner. Let's now flood the review sites with 120 screenshots (see, in that one there are _two_ large asteroids) and we could have a bestseller.

    --
    A polar bear is a cartesian bear after a coordinate transform.
    1. Re:Ah, but can you picture it in raytraced 3D? by ndogg · · Score: 1

      Just so you know, anisotropic filtering, self-shadowing, and pixel shading all come free with raytracing.

      --
      // file: mice.h
      #include "frickin_lasers.h"
    2. Re:Ah, but can you picture it in raytraced 3D? by Moraelin · · Score: 1

      Well, yes, actually most of those are free with raytracing, including the texturing and lighting; bump-mapping is doable too, and so on. I'm not that sure about pixel shading, since nowadays that isn't used to mean just Phong or Gouraud shading. What is meant nowadays are shader programs which are run for each pixel. E.g., the water effects in most games, or the depth of field effects in COH are such programs. I believe you wouldn't get those for free with raytracing.

      Sure, you could get for free _most_ effects that are solved with pixel-shader programs nowadays, such as waves on a lake. But I can still think of other uses which wouldn't come for free with raytracing.

      But anyway, it was meant as a joke. I was just mocking the game industry's reliance on releasing the same things over and over again, with just one more eye-candy technique thrown in. And the fact that nowadays most publishers don't even discuss gameplay any more when they hype their game, but instead just try to dazzle you with millions of pretty screenshots. (Who cares if it has no gameplay? Just look at all those pretty bump maps! Doesn't that just make you want to buy it?)

      It was basically just saying "yup, I wouldn't be surprised if someone just remade Asteroids verbatim, only with pretty 3D graphics." (Ray-traced or not.)

      But mostly it was supposed to be just a joke.

      --
      A polar bear is a cartesian bear after a coordinate transform.
    3. Re:Ah, but can you picture it in raytraced 3D? by Creepy · · Score: 3, Informative

      you are correct - pixel/fragment shading is not free - both use the same default shading model in OpenGL and probably DirectX.

      the default shader is, I believe, Lambert (a close relative to Phong - if not, it's Phong) for OpenGL and probably DirectX as well. Programs in the shader can change this to whatever you want it to be (e.g. a cel shader) and you would need to do that in either a ray tracer or rasterizer.

      There are a lot of things I like about ray tracing, but it's not without flaw - it handles specular highlights fantastically, but doesn't handle diffuse well at all, so you have to bolt on other techniques. Most people (including Intel) use ambient occlusion since it's a quick technique (also commonly used in polygon based graphics), but it tends to make muddy shadows (see the Wikipedia entry). Radiosity is more realistic, but the patch computations are incredibly expensive (but parallelizable). Photon mapping is another method that could be used, but I haven't used it myself. In college I wrote (with a team) a simple ray tracer and shortly after that class wrote a radiosity engine, so I'm familiar with both techniques. I never did really understand how to combine them, but I remember seeing POVRAY do it in the mid 90s and really wanted to figure out how they did it (but I graduated and was putting in startup hours, so that never happened).

      Oh, and waves on a lake are non-trivial - to be completely realistic, you need to deal with subsurface diffusion (or estimation of), foam and caustics (if you can see through the semi-transparent water surface). The specular mirror effect would be nice, but I don't see true caustics from either a raytracer or a rasterizer (you'd need to use ray bars or cones, probably).

    4. Re:Ah, but can you picture it in raytraced 3D? by usrusr · · Score: 1

      bump mapping...

      the whole idea of "reviving the ray tracer" seems to be to be less bound by triangle counts and being able to replace the bump maps with actual bumps instead.

      --
      [i have an opinion and i am not afraid to use it]
    5. Re:Ah, but can you picture it in raytraced 3D? by ndogg · · Score: 1

      Yeah, you're pretty much right about all of that. I made the mistake. However, it's still cheaper in raytracing to do the illusion of bump mapping rather than actually calculating it.

      --
      // file: mice.h
      #include "frickin_lasers.h"
  42. -1, Wrong by spun · · Score: 2, Interesting

    I'll let the other posters comment on the wrongness of your idea that raytracing doesn't scale with scene complexity. There was a nice SciAm article about it, if you need more convincing. Instead, I'll talk about something in the article that the other posters didn't mention. Raster processing may scale with scene complexity, but creation doesn't. Raster graphics must be tweaked at creation to make an object look realistic while still rendering quickly. With ray tracing, you just create an object and forget about it. It just looks right without any tweaking.

    What costs game designers more: hand tweaking every object, or you buying a better computer so you can ray trace their un-tweaked objects? Now guess which way 3D graphics are gonna go...

    --
    - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
  43. Scientific American by samkass · · Score: 3, Informative

    There is a good article about this in August's Scientific American by W. Wayt Gibbs. It's only a couple of pages but worth picking up a paper issue, or, if you have one of their digital subscriptions, it is here: http://www.sciam.com/article.cfm?chanID=sa001&articleID=000637F9-3815-14C0-AFE483414B7F4945

    --
    E pluribus unum
  44. Dammit, Tim by spun · · Score: 1

    Add to that the fact that the site hasn't been updated since mid-2005, and I'd say it's dead.

    I'm a doctor, not a programmer...

    --
    - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
  45. Cache locality++ by DeadCatX2 · · Score: 1

    Parent is correct.

    Cache coherency is what happens when you have multiple processors, and you have to keep the cache coherent between the two processors if they're working on the same data. i.e. processor A and B are working on a dataset. Processor A modifies variable X, and processor B wants to modify it, but it has to 1) know X is in A's cache and 2) get X from A's cache. Then you get into fun little things like shared cache, write-through operations, etc.

    Cache locality is affected by random reads, hence the previous comments comparing ray tracing to databases. Normally, we expect to access the same data more than once (temporal locality), because we frequently use a variable in more than one calculation. We also expect to access data near other data (spatial locality), because we store objects as groups of data and we access multiple members of an object for some calculations.
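    To make the locality point concrete, here is a minimal sketch of a toy direct-mapped cache, assuming 64 lines of 16 bytes each. The function name `hit_rate` and the cache sizes are made up for illustration and do not model any real CPU. It compares walking memory in order against jumping around at random, which is roughly the access pattern the ray tracing comparison is worried about.

```python
import random

def hit_rate(addresses, num_lines=64, line_size=16):
    """Simulate a tiny direct-mapped cache and return the fraction of hits."""
    lines = [None] * num_lines          # memory block currently held by each line
    hits = 0
    for addr in addresses:
        block = addr // line_size       # which memory block the address falls in
        index = block % num_lines       # which cache line that block maps to
        if lines[index] == block:
            hits += 1                   # data was already cached
        else:
            lines[index] = block        # miss: fetch the block, evict the old one
    return hits / len(addresses)

rng = random.Random(0)                  # fixed seed so runs are repeatable
n = 10_000
sequential = list(range(n))             # spatial locality: neighbours in order
scattered = [rng.randrange(1_000_000) for _ in range(n)]  # random reads

print("sequential:", hit_rate(sequential))
print("scattered: ", hit_rate(scattered))
```

    With these sizes the in-order walk misses only on the first byte of each 16-byte line, while the scattered walk almost never finds its data already cached.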

    --
    :(){ :|:& };:
    1. Re:Cache locality++ by greazer · · Score: 1

      Sorry, you're right with these definitions. I was using the meaning of coherency generally used within graphics research, which is used interchangeably with locality.

  46. Not only that by Moraelin · · Score: 1

    Not only that, but also bear in mind that

    1. GPUs are already massively parallel things. If you think of way back in the days of 1 pipeline GPUs and cards, each extra pipeline is, in a way, like an additional core. We're already at the point where cards act like a 16 or 24 core chip, for all practical purposes. (Or 4x that if you run two 7950 GTXs in SLI.)

    Graphics problems are by definition massively parallel. They're essentially doing the same (simple) set of operations on many many pixels. Unlike CPUs where "multiple pipelines" means you get to split a single stream of instructions across them, while "multiple cores" means actually executing several threads at the same time, GPUs don't have that distinction. Each pipeline processes one "thread": that for one pixel.

    I.e., as "multi-core" hype goes, the GPUs are actually ahead. I don't need to wait for Intel's 8 core chips sometime in the future, when Nvidia can do 96 cores right here and now, and do that cheaper. Partially _because_ they don't have to also be a general-purpose CPU, applicable to everything from games to databases. So they can pack more specialized units per square inch. They've also solved whatever issues of concurrent access, caching, etc, are involved in graphics problems.

    2. GPUs can and do also solve a lot of problems via dedicated hardware, rather than having to program them to even find the pixel in a texture. There are programmable things like pixel and vertex shaders, yes, but also a lot of things like texturing or filtering or T&L nowadays are solved _much_ faster by hard-wired dedicated units. They're units which just have to do one thing well and quickly, and actually do it, churning one pixel per clock cycle.

    Even if we move to raytracing, I'd expect much the same to apply. Sure, instead of building an image starting from the triangles, we'll start from the screen pixels and find the triangles they intersect. But conceptually it's still the same kind of operation that's (A) massively parallel, and (B) can have many parts that are hard-wired for maximum speed.

    E.g., finding the intersection of a pixel with a surface, and finding out what pixel value is there, is basically one such operation which can be served just as well by a fast hard-wired unit. Reflections and refractions? Ditto. You don't _need_ a user program running on a general purpose CPU to do those.
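    The intersection step mentioned above is small enough to sketch. This is the well-known Möller-Trumbore ray/triangle test written in plain Python, not anything taken from a GPU or from Intel's report; the helper names (`sub`, `dot`, `cross`, `ray_triangle`) exist only for this example.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the distance along the ray to the triangle, or None on a miss."""
    e1 = sub(v1, v0)                 # two edges sharing vertex v0
    e2 = sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None                  # ray is parallel to the triangle's plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv              # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv      # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv             # distance along the ray
    return t if t > eps else None    # ignore hits behind the origin

# One triangle sitting 5 units down the z axis, and two rays fired along z.
tri = ((0, 0, 5), (1, 0, 5), (0, 1, 5))
print(ray_triangle((0.25, 0.25, 0), (0, 0, 1), *tri))  # passes through the triangle
print(ray_triangle((2, 2, 0), (0, 0, 1), *tri))        # passes beside it
```

    Every pixel runs this same handful of multiplies and compares against its own ray, with no dependence on its neighbours, which is what makes it a candidate for either many cores or a fixed-function unit.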

    So basically on the whole, while I can understand why Intel is searching feverishly for a reason why you'd want 8 cores in your home computer, I can't understand why any end-user would actually want to move those operations back onto the CPU. The GPUs do that stuff better, and will keep on doing that stuff better.

    --
    A polar bear is a cartesian bear after a coordinate transform.
  47. This was predicted in 1968 by Jecel+Assumpcao+Jr · · Score: 1

    Just another step in the well known Wheel of Reincarnation. At least well known to all three of us who don't completely ignore computer history ;-)

  48. Umm no by gr8_phk · · Score: 1
    Raytracing does not scale nicely with the amount of geometry

    It actually scales exceptionally well with the amount of geometry, O(log n), which is exactly where GPUs suck. Read the linked article. Also, the spatial index used in ray tracing is a non-trivial data structure which is not handled well on a GPU. I've also found that ray tracing works better (fewer/no artifacts) with double precision floating point which is not available on a GPU. In a few years, the CPU will be quite capable of realtime ray tracing, so at that point, the GPU becomes a truly redundant part. And in spite of all the "GPGPU" hype, they still don't run word processors or other useful apps. When you can use either chip for graphics, but only one of them can do all the other things, which one do you think will get dropped?

    BTW, compositing windows can be done on the CPU today, so why all this talk about using the 3D card for it? Have people forgotten how to write fast graphics code? (unless it's GPU assembler of course)

    1. Re:Umm no by smallfries · · Score: 1

      Yup, I got that one wrong, see my earlier replies. The non-trivial data-structure is something that I referred to as being hard to deal with on a GPU. I've done some research in GPGPU - specifically doing RSA on a graphics card so I've come across problems that fit the architecture badly.

      Oddly, real-time rendering is one application that may be good on a CPU, but it could still be handled more efficiently by custom hardware. The point that I was trying to make is that the custom hardware to do it is not the rasterisation engine on a GPU. Although, that is quite good for compositing windows .... If only to keep the CPU free for other tasks.

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    2. Re:Umm no by Lerc · · Score: 1

      Modern CPUs have enough ergs to do compositing. The problem is that they can't get at the memory with enough speed.

      I'm a game programmer who prefers doing 2d games. Ideally I'd do everything on the CPU because then there isn't any variance between hardware/drivers, but the speed you can write to video memory just plain sucks. Reading is far worse. If you want any blending effects you need to do everything in a buffer and copy the final frame over.

      On the other hand, Raytracing actually gets you a better use of your CPU. It's a lot of computation per write.

      All up I think I'd like a video card with nice high speed 2d Porter-Duff, and a many-core CPU for everything else.
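      For anyone who hasn't met it, the blend-in-a-buffer approach described above can be sketched with the Porter-Duff "over" operator. This assumes premultiplied RGBA pixels as tuples of floats in 0.0 to 1.0; the names `over` and `composite` and the two-pixel "frame" are made up for the example.

```python
def over(src, dst):
    """Porter-Duff 'over' for premultiplied RGBA pixels (channels 0.0 to 1.0)."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    k = 1.0 - sa                     # how much of the destination shows through
    return (sr + dr * k, sg + dg * k, sb + db * k, sa + da * k)

def composite(layer, frame):
    """Blend a layer onto a frame that lives in ordinary system memory."""
    return [over(s, d) for s, d in zip(layer, frame)]

# Back buffer: two opaque blue pixels.
frame = [(0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0)]
# Layer: one half-transparent red pixel (premultiplied), one fully transparent.
layer = [(0.5, 0.0, 0.0, 0.5), (0.0, 0.0, 0.0, 0.0)]

frame = composite(layer, frame)      # all reads and writes stay in the buffer
print(frame)                         # only this finished frame would be copied out
```

      The blend has to read the destination pixel, which is why doing it directly against video memory hurts so much and why the finished frame is copied over in one go.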

      --
      -- That which does not kill us has made its last mistake.
  49. Bogus Intel comment by JustNiz · · Score: 1

    >>> ' A report by Intel about Ray Tracing shows that a single P4 3.2Ghz is capable of 100 million raysegs, which gives a comfortable 30fps'

    That's a bogus comment, as raytracing time (hence framerate) is totally dependent on the complexity of the scene being rendered.

    E.g. A few simple cubes would raytrace MUCH faster than a forest scene with reflective water and multiple trees, leaves, and blades of grass etc.

    Unfortunately, for gaming, the latter scenario is much more likely.
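    For a sense of what the quoted number buys, here is the back-of-envelope arithmetic, reading "raysegs" as ray segments per second. The 1024x768 resolution is my assumption, not something from the Intel report, and `rays_per_pixel` is just a helper for this example.

```python
def rays_per_pixel(rays_per_second, fps, width, height):
    """Average ray budget per pixel per frame at a given throughput."""
    rays_per_frame = rays_per_second / fps
    return rays_per_frame / (width * height)

WIDTH, HEIGHT = 1024, 768            # assumed resolution, not from the report

budget = rays_per_pixel(100_000_000, 30, WIDTH, HEIGHT)        # the "comfortable" figure
interesting = rays_per_pixel(450_000_000, 30, WIDTH, HEIGHT)   # the "interesting" figure

print(f"{100_000_000 / 30:,.0f} rays per frame")
print(f"{budget:.1f} rays per pixel at 100 million rays/s")
print(f"{interesting:.1f} rays per pixel at 450 million rays/s")
```

    At that resolution the budget is about 3.3 million rays per frame, or a little over four rays per pixel. That covers a primary ray plus a few shadow or reflection rays, so how far it stretches depends on the scene.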

    It still makes sense to offload rendering/raytracing to a dedicated graphics processor (read: GPU) as it has closer ties to video RAM. If done by the CPU, you'd swamp the system bus with billions of (slow) graphics memory writes. Also it leaves the CPU free to do other stuff like game AI, speech recognition, physics, downloading pr0n etc.

    The question this article should have asked is if real-time raytracing is so doable, when will nVidia/ATI start using raytracing in their GPUs and drivers? Don't forget even an average GPU still kills even the best Intel CPU in terms of floating point operations.

    1. Re:Bogus Intel comment by Eideewt · · Score: 1

      But ray tracing scales better than today's popular rendering methods. According to TFA, a ray tracer will render in O(log n) time. They found that software ray tracers catch up to hardware rasterizers when scenes approach 1 million triangles. So a few simple cubes would be *much* faster with the hardware rasterizer, but it would bog down with the second scene you describe, while the ray tracer would handle it more gracefully.

  50. What, are you new here? by spun · · Score: 1

    You expect someone to RTFA? This is slashdot, where everyone can be an expert. As long as no one reads TFA. Get with the program!

    --
    - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
  51. Not for GPUs, or CPUs, but RTPUs... by Anonymous Coward · · Score: 0

    Dedicated RayTracing hardware has been available for several years. The first commercially successful one was http://www.artvps.com/ (Advanced Rendering Technologies (ART) in the UK)

    They demo'd their prototype render engine back about 6-7 years ago at SIGGRAPH (99, I believe...I was there.)

    But to use it for realtime work (even the PURE card they sell) it would have to be heavily optimized for speed, rather than 'prettiness'. That would be a major re-tooling of the design.

    Raytracing for games is becoming a possibility...but it is not nearly where it needs to be to be affordable and efficient.

    GPUs do NOT run raytrace algorithms well. But that is because they aren't designed to. If nVidia and ATI/AMD decide to start including some silicon real-estate on the chips for dedicated ray-intersection testing and a linear equation solver unit.....things might accelerate in that direction faster.

    Until then, it's just a lot of academic theorizing and 'best-case' numbers.

  52. Need better processors for a simple job by Anonymous Coward · · Score: 0

    How about some real-world use? My Athlon 64 3600 conks out when trying to show and transcode a simple HDTV program from the local PBS station transmitting at 1080i. In mplayer it drops frames here and there, and remember, mplayer is one of the best out there.

  53. Swap on RAM Disk by DarthStrydre · · Score: 1

    You mean that as a joke, but it is not entirely without merit. On any system with 192MB of RAM or more I generally do not use a swap partition, since it is not needed as long as you don't go bonkers and try to load up the GIMP and a VM and OO.o and all the default active services or daemons.

    Some apps will not run without swap space - not that they actually use it - just that they refuse to run without some. Create a 2MB swap ramdrive, and problem solved - just to make the programs happy.

    I mostly use a laptop retrofitted with a Hitachi microdrive in place of the hard disk to save power. It sips power, is dead quiet, and produces almost 0 heat, but is slow as ice melting. Program loading is almost bearable, but once loaded everything's fine. Hitting swap on it would be disastrous.

    If I need to use a memory hungry app (huge image editing, etc) I use a different machine.

    Swap is for n3wbs.

    1. Re:Swap on RAM Disk by somersault · · Score: 1

      I wonder if it would help use the last 500MB of RAM or so that you can't seem to access on 32 bit versions of Windows with 4GB of RAM installed? :s Shouldn't you be able to use 16GB of RAM in a 32 bit OS anyway? Don't flame me, I can't remember, and I don't want to work it out right now =p
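      For what it's worth, the arithmetic is quick. A flat 32-bit address reaches 2^32 bytes, and part of that range is normally reserved for devices, which is the usual explanation for the missing few hundred MB. x86 PAE widens physical addresses to 36 bits, though whether a given OS edition actually uses the extra range is a separate question. `addressable_gib` is just a helper name for this sketch.

```python
GIB = 2 ** 30                        # bytes in one gibibyte

def addressable_gib(address_bits):
    """Size in GiB of a byte-addressed space with the given address width."""
    return 2 ** address_bits / GIB

print(addressable_gib(32))           # plain 32-bit addressing
print(addressable_gib(36))           # 36-bit physical addresses, as with PAE on x86
```

      So the natural limits are 4 GiB and 64 GiB; 16GB isn't a boundary of either scheme.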

      --
      which is totally what she said
  54. Microsoft won't like it. by theolein · · Score: 1

    If a boom does finally start in ray traced realtime engines, which I somehow doubt, Microsoft won't like it. They have spent a whole lot of money to make DirectX a monopoly and thereby control the games market. This would push the whole effort overboard, with time. Microsoft would have to make their DirectXRayTrace the sole API before they'd be happy again.

    1. Re:Microsoft won't like it. by SanityInAnarchy · · Score: 1

      Here's hoping OpenRT beats them to it.

      --
      Don't thank God, thank a doctor!
  55. povray performance by j1m+5n0w · · Score: 1
    But when we get into presentation-quality raytraces (at least with Povray), my P4-2.4HT takes 3-4 minutes per frame.

    I don't know that povray rendering time is necessarily a good indicator of the maximum performance of a real-time ray tracer. From Ray Tracing News:

    The Saarbrucken folks presented their design for an RPU, a ray-tracing processing unit. It's not considerably different than today's GPUs, having just a bit more specialized hardware here and there. Interesting stat: their *software* ray tracer is about 30x faster than the freely-available POVRay ray tracer. Now, I could probably speed POVRay up by 2x by some tuning of this and that, but 30x is pretty incredible. More on this topic later in this issue.

    Not that I'm disparaging povray; it has many interesting features and a wonderful programmer-friendly scene description language. However, I don't think rendering speed is necessarily its strong point.

  56. "PC LOAD LETTER"? by Anonymous Coward · · Score: 0
    ... raster graphic research has continued to be milked for every approximate drop it closely resembles being worth.
    What the fuck does that mean?
  57. I'm not saying it's better... by default+luser · · Score: 1

    I'm saying purely from a "geekiness" standpoint that I'd like to download and mess around with this engine. It looks like fun. It is painfully obvious that, aside from the bump maps and real-time lighting, the game actually looks worse...but that's understandable.

    The lighting has actually served to highlight the small number of polygons in the player model (750, to be exact). The good thing is, as people have been stating all through this thread: processing requirements for raytracing go up with the log of the polygons. So, upping this model to 7500 polygons (more than used in Doom 3 models) would less than double the processing requirements, which shows a bright future for raytracing.
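    To put a rough number on that, here is the ratio under a pure O(log n) model. This ignores constant factors, shading cost and everything else a real engine does, so treat it as a sketch; `relative_cost` is a made-up helper name.

```python
import math

def relative_cost(old_polys, new_polys):
    """Cost ratio under a pure O(log n) model; constant factors are ignored."""
    return math.log(new_polys) / math.log(old_polys)

print(relative_cost(750, 7500))       # ten times the polygons
print(relative_cost(750, 750 ** 2))   # polygon count squared
```

    Under this model the tenfold jump from 750 to 7500 polygons multiplies the cost by roughly 1.35, comfortably under 2x. To actually double the cost you would have to square the polygon count, i.e. go to 562,500 polygons.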

    Look at it this way: ten years ago, real-time raytracing was a pipe dream. Right now it is very much a reality on high-end hardware, albeit with reduced features and benefits. In another ten years, I will be surprised if raytracing is too difficult for mainstream hardware.

    --

    Man is the animal that laughs.
    And occasionally whores for Karma.

    1. Re:I'm not saying it's better... by SanityInAnarchy · · Score: 1

      How do those features and benefits scale, though?

      --
      Don't thank God, thank a doctor!
  58. Real-time ray tracing: it's here and open source by padkumar · · Score: 1

    I don't get all this "won't happen" bah humbug. We have ray tracing - heck, we have real time ray tracing, and it's about to go open source: http://blogs.zdnet.com/OverTheHorizon/?p=10