In the computer games industry it still pays to know your way around cycle counts, pipelines and caches. Just because your device has a cache, and you're coding mainly in an OO language, doesn't mean to say you've left the world of cycle-level optimisation behind. And particularly on Sony machines it's almost a requirement to fully understand the various hardware interactions in order to get a decent turn of speed out of them.
As an industry we're now finding it very hard to employ people who know this kind of stuff. Most graduates are taught Java or C++ and have no decent experience at the assembler or hardware level. Now I'm not saying that we spend all day hand-crafting assembly code - games are just far too big nowadays - but every now and then you'll get an unusual crash which can only be debugged using knowledge of the hardware. In my experience CS graduates just freak out when you show them a disassembly of their code!
Critically though, and as pointed out in some other posts:
- Writing off the end of an allocated piece of memory is undefined. Just because it might work now, doesn't mean it'll work later.
- More importantly, these were allocated off the stack, not using malloc/new. The stack is usually DWORD aligned. Thus accessing Head[0xffff] could well actually be accessing Tail[0].
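To make the aliasing point concrete, here's a rough sketch. It is not the BitFast code: I've put two stand-in arrays in a struct (so the layout can be inspected legally with offsetof) and assumed an element type of int, since the real declarations aren't shown here. On the usual compilers there is no padding between the two members, so the byte just past the end of Head is the first byte of Tail.

```c
#include <assert.h>
#include <stddef.h>

#define ENTRIES 0xffff   /* 65535 entries: valid indices are 0..0xfffe */

/* Hypothetical stand-in for two arrays declared back to back.
   The element type (int) is an assumption. */
struct Buckets {
    int Head[ENTRIES];
    int Tail[ENTRIES];
};

/* Byte offset of the element one past the end of Head. */
static size_t one_past_head(void)
{
    return offsetof(struct Buckets, Head) + sizeof(int) * ENTRIES;
}

/* Byte offset of Tail[0]. */
static size_t start_of_tail(void)
{
    return offsetof(struct Buckets, Tail);
}

/* Holds on common ABIs, where same-typed members are not padded apart;
   the C standard does not strictly promise it. */
static void check_adjacent(void)
{
    assert(one_past_head() == start_of_tail());
}
```

So a write to Head[0xffff] would land on Tail[0] in this layout, silently. With separate stack variables the compiler is free to order and pad them however it likes, which is exactly why it only works by coincidence.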
Whatever, relying on undefined behaviour is relying on coincidence. In your example the malloc/new alignment is a coincidence of the particular heap of the particular version of the C runtime. There's nothing to say it won't change between VC6, VC7 and VC.NET (and actually, it did). Nor does it mean it will work on other platforms. It's just a coincidence that on the platform it was developed for, with the compiler and associated runtime it was developed for, it worked.
Unless I'm very much mistaken, there's a blatant buffer overflow in this code. In bitfast.h the "Tail" and "Head" arrays are defined as having 65535 (0xffff) entries. That's from 0-65534 inclusive.
The code in "BitFast" then uses the array as if it can access Head[0xffff] and Tail[0xffff] - which is one past the end of the arrays. I'd guess it works by coincidence on Win32.
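For what it's worth, the fix is a one-liner: size the arrays for 0x10000 entries so that every 16-bit index is valid. A minimal sketch, assuming the buckets are indexed by the top 16 bits of a 32-bit key (the names BUCKET_COUNT, bucket_counts, bucket_index and count_keys are mine, not from bitfast.h):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUCKET_COUNT 0x10000u   /* 65536 slots: valid indices 0..0xffff */

static uint32_t bucket_counts[BUCKET_COUNT];

/* Map a 32-bit key to its bucket using the top 16 bits. */
static size_t bucket_index(uint32_t key)
{
    size_t idx = key >> 16;
    assert(idx < BUCKET_COUNT);   /* documents the invariant the array size must satisfy */
    return idx;
}

/* Count how many keys fall into each bucket. */
static void count_keys(const uint32_t *keys, size_t n)
{
    for (size_t i = 0; i < BUCKET_COUNT; ++i)
        bucket_counts[i] = 0;
    for (size_t i = 0; i < n; ++i)
        ++bucket_counts[bucket_index(keys[i])];
}
```

With 0xffff entries instead of 0x10000, any key whose top 16 bits are all set would index one past the end.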
Additionally, although there might be some reason I can't follow at the moment, in the float version only the top 15 bits of the values are seemingly used. This would seem to ignore the sign bit of the FP representation. Maybe there's another subtlety I'm missing, but I can't see it at the moment.
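For comparison, here's the usual trick for getting floats into an order a bucket or radix pass can use, sign bit included. This is a common technique, not what the BitFast authors do; it assumes 32-bit IEEE 754 floats, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

/* Convert a float's bit pattern into an unsigned key that sorts in the
   same order as the float values (NaNs aside):
   - non-negative floats: set the sign bit so they sort above negatives
   - negative floats: invert all bits so larger magnitudes sort lower */
static uint32_t float_to_sortable_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);   /* well-defined way to read the bits */
    uint32_t mask = (u & 0x80000000u) ? 0xffffffffu : 0x80000000u;
    return u ^ mask;
}

/* Top 16 bits of the sortable key, usable as a bucket index 0..0xffff. */
static uint16_t float_top16_key(float f)
{
    return (uint16_t)(float_to_sortable_bits(f) >> 16);
}
```

Bucketing on the raw top bits without something like this would put negative values in the wrong place relative to positive ones, which is the worry above.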
Given these basic misunderstandings of programming concepts I'm not surprised the authors hadn't known of the radix sort which many people here have linked as being very similar indeed.
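For anyone who hasn't met it, a plain LSD radix sort on 32-bit unsigned keys looks something like the sketch below: two passes, 16 bits at a time, with a 0x10000-entry count table. This is the textbook algorithm, not a reconstruction of BitFast, and the function name is my own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RADIX 0x10000u   /* one bucket per possible 16-bit digit */

/* Sort n unsigned 32-bit keys in place.
   Returns 0 on success, -1 if scratch memory could not be allocated. */
static int radix_sort_u32(uint32_t *keys, size_t n)
{
    assert(keys != NULL || n == 0);

    uint32_t *tmp = malloc(n ? n * sizeof *tmp : 1);
    size_t *count = malloc(RADIX * sizeof *count);
    if (!tmp || !count) {
        free(tmp);
        free(count);
        return -1;
    }

    uint32_t *src = keys, *dst = tmp;
    for (unsigned shift = 0; shift < 32; shift += 16) {
        /* Histogram of the current 16-bit digit. */
        memset(count, 0, RADIX * sizeof *count);
        for (size_t i = 0; i < n; ++i)
            ++count[(src[i] >> shift) & 0xffffu];

        /* Exclusive prefix sum: count[d] becomes the first output slot for digit d. */
        size_t sum = 0;
        for (size_t d = 0; d < RADIX; ++d) {
            size_t c = count[d];
            count[d] = sum;
            sum += c;
        }

        /* Stable scatter into the other buffer. */
        for (size_t i = 0; i < n; ++i)
            dst[count[(src[i] >> shift) & 0xffffu]++] = src[i];

        uint32_t *t = src; src = dst; dst = t;
    }

    /* Two passes means the sorted data has ended up back in keys. */
    free(tmp);
    free(count);
    return 0;
}
```

Note the count table has RADIX entries, so a digit of 0xffff is a perfectly valid index.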
Woo, always cool when big companies remember their roots. But, and with flame-proof suit on...Google may be huge but they're only one company. So in the intro text:
the original Googleplex is intended to be preserved as part of the companies living history.
That "companies" should be "company's" - more info on this common grammar mistake at the Apostrophe Protection Society. Arguably a nitpick I know, but on something as well-read as Slashdot it's nice to try and set an example:)
Ok flame-proof suit on, but "it's" is short for either "it is" or "it has". In this case the apostrophe isn't needed to denote ownership any more than you need an apostrophe in the words 'his' or 'hers'. More info at the Apostrophe Protection Society.
3D accelerator cards do not 'use lightmaps' - they can (or at least most can) as either a second blended texture stage, or a multi-texture operation. Many developers have now moved on from lightmaps and are using techniques that only a few years ago were confined to non-realtime rendering. Similarly 3D modellers don't necessarily use ray tracing; many use a hybrid approach, and/or more natural-looking techniques like radiosity/photon mapping.
also, if what you really want is to do your final renders with hardware then i have two things to say:
1. write a render plugin to do it.
This is exactly what we do for our games; we have a bespoke realtime rendering solution with a novel idea of materials (similar in some ways to a RenderMan shader system, though much more limited). This means a material plugin needs to be written for our artists just in order for the materials to be edited in situ. As a side-effect it's usually relatively simple to get the materials rendering as they would in the game in the render window.
I was referring to the fact that most multiprocessor machines have at least some form of debugging support; they're not black boxes you just have to hope work right. Multiprocessor systems are well understood, and embedded systems with multiple processors likewise; though usually when developing in such situations there is a level of debugging support, be it even the local equivalent of 'printf()'. Until very recently PS2 development was a 'fire and forget, hope it works' style approach.
The Slashdot audience is made up of computing professionals with a wide array of knowledge. I use terms to facilitate communication, not to shout to the world that I know arcane terminology
A fair point; but I think it's also fair to say your comment made no sense at all. I just see too much putting-down of things people don't fully understand on this site, and your line just looked like blanket 'if it can't do this it's appalling'. Apologies for taking it to heart too much.
If the programmer needs something else, like generating all the textures using algorithms, or simulating deformable shapes on a per-pixel basis, then a design like the massively parallel and massively flexible PS2 really shines.
...it shines if you like programming an almost impossible-to-debug multiprocessor system. Orchestrating four separate processors with DMA accesses flying over limited bus power is tricky. Plus Xbox, though DX8-based, is not just DX8; it's superficially similar but greatly optimized and tailored specifically for Xbox.
Xbox has UMA too, which means the CPU can get in and address textures directly itself, unlike on the PS2 where DMAs have to be set up to talk to texture memory, so in fact it's easier on Xbox to generate the textures using algorithms, as you describe.
As for 'simulating deformable shapes on a per-pixel basis' I've been in the graphics trade for five years, and have never heard such a made-up bunch of junk. You want deformable shapes? Cool; you can either dump polys completely and write your own renderer, in which case Xbox will beat PS2 as it has a faster processor, and none of the specialist rendering hardware in either box can help you. If you mean deformable as in morphing/procedurally modified vertices, then both machines are equal. If you mean procedural generation of geometry, then granted, the PS2 shines here, though it's not as if Xbox can't do it. As for anything 'per-pixel' the PS2 can render a single texture per-pixel at a time. Only Xbox and GameCube can do anything like arbitrary per-pixel operations.
Anand had a great example of this: Electronic Arts just used one of the PS2 vector units to encode Dolby 5.1 sound. That's flexible.
Granted, that is cool; but you are of course giving up 30% of your processing power to do something Xbox does in hardware. All credit to them though!
Ripping the encrypted data probably isn't too hard; but getting the ripped BIOS off of the Xbox and working out how it gets decrypted by the peripheral bus on bootup would be the hard bit. If you can decipher the BIOS then I agree, yes, it's possible to emulate the encryption along with everything else an Xbox does. Getting that BIOS unencrypted will be the tricky part.
The Xbox uses DDR RAM, and its GPU is significantly altered from a vanilla GeForce III. For a start it has two copies of the transformation engine running in parallel, so it can transform twice as many vertices as your PC's GF3.
Of course, over time that'll change - I don't think anyone except MS and NVidia could tell you what agreements they have about delaying the PC equivalent stuff. Maybe that's why the nForce chipset is AMD only; praps MS made them sign away the rights to their equivalent Intel chipset (i.e. the Xbox mainboard).
Fair play there; my information is based on the fact that I've been developing an Xbox title for over a year. Tell your developer friend to read the docs!
Especially since no direct hardware access is allowed, according to MS coding standards. (seems they are covering themselves for future upgrades).
That's just plain untrue; developers have full access to the components that matter (i.e. the graphics and sound processors). Those that aren't time-critical, such as the HD and network card, are driven by the kernel in BIOS, so MS can upgrade them to cheaper components as time goes by without breaking games. There is no plan for 'future upgrades'.
All hardware access is done with an API, so things like the shared memory model are only a performance issue, not a portability issue.
(It has to be an API, since it's running a streamlined 2K kernel anyway).
Xbox runs in RING0 all the time, so game code can poke the hardware as much as it likes...the 'API' you mention is as thin as possible and can be bypassed directly.
There must be some sort of handshake going on to determine if it's a bootable program in there, that is made for Xbox. If that could be cracked, the sky would be the limit for messing around with the Xbox.
Yup; but you'll be lucky to get past the cryptographically signed data going back and forth; you'd have to sign all the data with Microsoft's key first...good luck there.
Where the heck are you getting your 'information' from - you don't appear to have a clue what you're talking about.
In this case the only reason you don't see the "shared memory architecture" on a modern PC is because the CPU's MMU is set up such that you don't.
That simply isn't true - the CPU has no direct access to the RAM on a graphics card. Period. Instead it must use memory mapped IO to set up DMA transfers to and from the graphics card's RAM, or poke each byte individually over the bus - very unlike the direct cache-line level access it has to its own RAM chips.
Xbox on the other hand has all of the RAM available to both devices; they share access to the same physical RAM - PCs have two physically distinct RAM banks; one on the graphics card and one on the motherboard.
Of course there is AGP but this is a way for the graphics card to read and write to a limited subset of the mainboard's RAM - and very slowly at that, causing all sorts of contention issues. At no point can the processor or its MMU access the memory on the graphics card directly.
I don't know what the copy protection looks like on the Xbox (if anyone knows anything about it... please post it), but I think it will be bypassed very shortly.
Can't say much (under NDA here) but the copy protection system is several steps above and beyond anything currently out there, drawing from various hardware facilities and strong cryptography with all code and data on the DVD and HD being signed/crypted.
If I were a betting man I'd bet against the protection being broken in the next year or so - it really is that much of a leap above the usual PS-style damaged block/weak crypto system.
Yup MS have lots of nasty cryptographic tricks; the BIOS and system components are all encrypted and there is proprietary hardware on the mobo to decrypt the data.
With today's JIT-style code translation systems and the fact that the P3 can outperform the SH4 cycle-for-cycle anyway I wouldn't say it would be impossible for the Xbox to emulate a Dreamcast.
As for the different graphics subsystem on the DC, it would be trivial to bung a translation layer between the two systems. In fact the hardest part of the DC to emulate would be its curious bumpmapping format, although the Dx8-style pixel shaders on Xbox could do a lot to help this out. The shadow volumes supported by DC could also be translated into equivalent stencil-buffer operations too.
Just my two-penneth - as an engine coder on a DC game who's now working on an Xbox title I've got some idea what I'm talking about;)
It's a much overstated, and wrong, point that the Dreamcast has a WinCE layer - it doesn't - unless you specifically link with it. WinCE is just another library to a console developer; if you don't use it (and frankly you'd be foolish to use it, imho) it doesn't get loaded, or even mastered onto the final disk.
Very few serious games use WinCE, but MS insisted on Sega putting the logo on the box, which has confused a few people into thinking the whole thing runs on CE. The truth is you get hardware-level access without CE, which allows you to register vertices as fast as you can generate them straight to the gfx processor.
The DC is a lovely little machine; I wish I had had time to attempt a linux port (I have a dev-kit sat on my desk here right now which would make it a bit easier than the guys using a vanilla DC), so good luck to the future linux development I say!
typo! :)
30 print A$," is a wanker!"
There's Collada, which is at least a step in the right direction for the industry as a whole.
Hahahaha :) I've been out-pedanted! Top marks :)
Nowt wrong with the name 'Moog' mate :)
I'll join you on that one
How about just getting the one with the best games on it?
It ain't gonna be easy to crack, that's for sure.
He had a barrage of CVs/happy birthdays to lucas@ilm.com before ILM eventually bought the domain back off of him.