OK. I can't give you the code but it is my own implementation of a pretty standard bioinformatics sequence comparison program which doesn't use SSE/MMX type instructions and is single threaded. On all platforms it was compiled using gcc with -O3 optimisation. I have tried adding other optimisations but it doesn't really make much difference to these numbers (no more than a couple of percent at best).
When you say you've tried "adding other optimizations," are you referring only to other GCC optimization flags? If your program's algorithms have any moderate degree of parallelism and you haven't tried vectorization either by compiler (GCC and ICC can both do this) or by hand, the benchmark you've done is not unlike a race where no one is allowed to shift out of first gear. Can you go into any more specifics about how this program does sequence comparisons?
Also, the disappointing numbers from the G5 may be partially explained by the fact that its integer unit has higher latency than the other desktop processors in that list. The G5 isn't exactly known for blistering integer performance, anyway.
Hmm...do you mean specifically on AMD's hardware? That stopped being true for Intel starting with the Core, which has 1-cycle latency on SSE instructions.
Core2 has single-cycle throughput on most SSE instructions, not single-cycle latency. Most of these instructions still take 3-5 cycles to generate results, which is similar to the Pentium M, but now a vector of results finishes every cycle, instead of every two or four cycles.
An important consequence of this is that if your instructions are poorly scheduled by the compiler (or assembly programmer) and the processor spends too much time waiting for results of previous operations, the advantages of single-cycle throughput mostly disappear.
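To put rough numbers on the latency-versus-throughput point, here's a toy model (nothing like a real CPU simulator). It assumes one instruction can issue per cycle, every result takes `latency` cycles, and the work is split evenly into some number of independent dependency chains issued round-robin. The function name and parameters are made up for illustration.

```python
def cycles_needed(n_ops, latency, chains):
    """Toy estimate of the cycles needed to finish n_ops identical instructions.

    Assumptions (all hypothetical): one instruction issues per cycle, each
    result is ready `latency` cycles after issue, and the work is split into
    `chains` equal dependency chains issued round-robin, where every
    instruction needs the previous result from its own chain.
    """
    if n_ops % chains:
        raise ValueError("n_ops must divide evenly into chains")
    per_chain = n_ops // chains
    # Within a chain, the next instruction has to wait both for its
    # predecessor's result (latency) and for its turn at the issue slot
    # (one slot per chain), whichever is longer.
    step = max(latency, chains)
    # The last chain starts at cycle (chains - 1); its final instruction
    # issues (per_chain - 1) steps later and finishes after `latency` cycles.
    return (chains - 1) + (per_chain - 1) * step + latency


serial = cycles_needed(12, latency=3, chains=1)     # every op waits on the previous one
scheduled = cycles_needed(12, latency=3, chains=3)  # enough independent work to fill the gaps
print(serial, scheduled)
```

With a single chain the processor is stuck waiting out the full latency on every instruction, so the single-cycle throughput never gets used. With three independent chains the same twelve operations overlap and finish in well under half the time.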
I've noticed what you observed about test_boxstack, too, but what bothered me more was the way the stack of boxes falls once you finally knock a piece far enough out. It seems like the stack ought to sink as one, but for now it doesn't look like a box starts to fall until the box below it has moved completely out from under it. I suspect this has to do with not actually running the simulation on objects unless certain conditions are met, since the amount of contact resolution needed to keep a stack like that stable would use up way too much CPU time. The alternative (that is, simulating the entire stack with only as much contact resolution as is feasible for a game) would make the stack look somewhat spongy when it's standing.
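For anyone wondering where the sponginess would come from, here's a minimal one-dimensional sketch. To be clear, this is not how Doom 3 does it (I don't know the details of its solver); it's a generic iterative overlap-correction scheme I'm assuming for illustration, and the function name and parameters are hypothetical. Every box sinks a little under gravity, then a fixed number of solver passes push overlapping boxes apart.

```python
def top_box_sag(n_boxes, iterations, sink=0.1, height=1.0):
    """How far the top box sits below its resting height after one step.

    Hypothetical toy solver: all boxes drop by `sink`, then each pass fixes
    the floor contact and splits every box-on-box overlap equally between
    the two boxes involved.
    """
    # Bottom edge of each box after gravity has pulled the whole stack down.
    bottoms = [i * height - sink for i in range(n_boxes)]
    for _ in range(iterations):
        # The floor cannot move, so the bottom box takes the whole correction.
        if bottoms[0] < 0.0:
            bottoms[0] = 0.0
        # Box-on-box contacts, resolved from the bottom of the stack upward.
        for i in range(1, n_boxes):
            overlap = bottoms[i - 1] + height - bottoms[i]
            if overlap > 0.0:
                bottoms[i - 1] -= overlap / 2
                bottoms[i] += overlap / 2
    resting_height = (n_boxes - 1) * height
    return resting_height - bottoms[-1]


for passes in (2, 8, 64):
    print(passes, top_box_sag(8, passes))
```

The fewer passes you can afford per frame, the more of the sink is left uncorrected at the top of the stack, which is the squishy look I mean. Skipping the simulation entirely for resting objects avoids that cost, at the price of the odd falling behaviour.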
That being said, I happen to like the way ragdolls behave in Doom 3. It seems like many other games don't use a high enough coefficient of friction for the ground, which makes bodies slide a bit in whatever direction they were moving.
Incidentally, the Unreal games are currently using Karma. Havok is used to good effect in Max Payne 2, though, and I'm definitely looking forward to seeing what Valve has done with it in HL2.
Don't be misled by the submitter's mention of Doom 3's test_boxstack map: Doom 3's physics were done in-house at id by Jan Paul van Waveren (a.k.a. Mr. Elusive), who was also responsible for bots both official and unofficial in past Quakes. Gamespy interviewed him and Robert Duffy back in 2001, although the interview doesn't go too much into the details of exactly how Doom 3's physics work.
On the other hand, there was a press release back in late March about how Epic Games will be using NovodeX technology in future versions of the Unreal engine.
(On an unrelated side note, if any HA regulars are reading this, it was pretty much my fault that the previous test wasn't attributed to Roberto. I apologized to him as soon as I realized my error, but I'll apologize here once more just to be sure.)
VirtualDub users can try Deshaker, which sounds like it does exactly what you want it to do. If you want to see the type of output it produces, here's a page where someone actually tests it out on real camcorder footage...alternately, you could just try it yourself.
It might contain MPEG-1 or MPEG-2 audio, but even this, according to a later section, is only Layer-2 audio, and not the same thing as AAC. There's a reason that AAC is called MPEG-2 NBC--NBC stands for non-backwards-compatible.
I think you're a little mixed up about DVDs. Yes, AAC is part of the MPEG-2 and MPEG-4 standards, but DVDs don't use AAC; they use AC3, LPCM and (sometimes) DTS.
Seeing as Quicktime movie trailers have been using Sorenson Video (1, 2 and 3) for video since about the time that trailers for The Phantom Menace were coming out, I'm wondering if you somehow remembered the page wrong. I do know that getting that kind of quality out of Sorenson requires the Pro version of the codec (which gives you bidirectional coding, VBR and other goodies) and an encoder that actually supports 2-pass VBR properly (Cleaner comes to mind).
I can't help but think that given the same sort of quality source material that Apple has, home users could get that kind of quality with a little know-how and the right tools. AviSynth, for example, has tons of fantastic user-created filters for cleaning up less-than-ideal video and removing noise. Also, someone correct me if I'm wrong, but the main thing I've noticed about successive versions of Sorenson is that it seems to be better about not wasting bits compressing background noise in places with little motion. (This comes, of course, from years of diligent trailer-watching.)
A thread in the forum has just been started to talk about the comparison. There were only a couple of posts when I linked to it, but I'm sure there'll be more interesting critiques later.
PA newspost, and the true state of buffering (on "Real's Reality", Score: 2, Interesting)
As of today, Penny Arcade had a newspost from Tycho that takes a paragraph at the bottom to disparage Real and plug RealAlternative like so many Slashdotters have already done here.
On a pretty much unrelated topic, I thought it might also be interesting to point out that none of the major media players, as far as I can tell, suffer from the buffering which has been the butt of so many (!) jokes in this topic already. All of them have some feature (under different names, of course) that allows them to build up their playback buffers as fast as the Internet connection will allow, which basically gives you minutes of buffer after only a short period of time. Borders on progressive download, I guess. That and RealPlayer 10 has a feature that allows you to cache a user-specified amount of the past stream, even for live streams.
Perhaps I'm too quick to consider forgiving Real for their privacy issues, but as far as playback quality goes (both in terms of streaming and codecs), bashing Real for being bad at that would be just plain misinformed.
For good reasons, the posters on Hydrogenaudio don't take kindly to people making unfounded assertions about which codecs are better, so if you're going to argue with them, think twice and ABX first. You will be, after all, arguing with many audio developers, e.g. people who make contributions to LAME, people who've tuned the Vorbis encoder, and a surprising number of people who work for Ahead (makers of Nero, of course).
I would imagine that the software renderer is probably a PC-only option. UT2K4 uses Pixomatic for its software renderer, which from what I can gather from the website, is heavily optimized for speed, but is only for PCs. (I suppose the fact that it was written by Mike Abrash, who worked on the original Quake software renderer with John Carmack and has written a fair share of optimization books, is more than enough assurance for me.) Anyways, there doesn't seem to be a Mac version of that, but Macs tend not to be saddled with crappy onboard graphics chips, anyway.
UT2K3 added "Mega Kill," "Ludicrous Kill" and "HOLY SHIT" to the announcer's vocabulary. Incidentally, it's interesting to hear how...excited the Aroused announcer gets once you start racking up the consecutive kills.
Perhaps bellowing "HUMILIATION" in a low voice would be more appropriate for a tennis ball to the head.
He does claim to put in 8-12 hours a day in the weeks leading up to a tournament. I know there were a lot of gamers out there expecting Fatal1ty to say "practice, practice, practice" somewhere in this article. (For those that haven't read it yet, no, he doesn't say it.)
You can turn on the same feature in UT2K3, too, just not when you're running a multiplayer game. In Instant Action mode, it's in the Game Rules tab, which seems like a bizarre place to put it (i.e. away from the rest of the bot settings). I know in 2K3, at least, there's a limit on how many levels the computer is allowed to promote the AI--someone who initially chose Experienced bots won't get Godlike ones just because they're kicking too much butt.
I watched AOTC with digital projection too, so I know what phenomenon you're talking about. But since Ep3 is going to be digital anyway, consider the alternative to having digital theaters...staircasing on thin diagonal details, and the grain, jitter/weave, and other defects that go along with a film transfer, or even multiple generations of film transfer.
On the other hand, if there's one upside to digital cinema, it has to be that it makes for fantastic DVD transfers. I mean, you would have to be lazy or negligent to screw up a totally clean digital source. AOTC has to be the best non-CG/animation movie transfer that I've ever seen.
ESReality has a bunch of articles about CXG--not only commenting on how it degenerated into disaster, but also concerning how the tournaments were progressing up until the plug was pulled. Interesting reading, even if you don't know all the big names in the various games who attended the tournaments.
IIRC, different encoders might not agree on whether 1kbps is 1024 bits per second or 1000, and I wouldn't be surprised if MP3 encoders and WMA encoders differ on that point. Of course, similar issues arise when you're encoding video and these sorts of 2% margins of error end up being entire megabytes. (Anyone who's used Gordian Knot has probably noticed that the calculated bitrates vary depending on whether GK is configured to use DivX or XviD.) Oh, and similarly, since the encoded audio needs to be put into the Windows Media wrapper to be written to a file, a substantial amount of overhead is required, which makes the undersizedness even more interesting.
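The arithmetic behind that margin is easy to check. Here's a quick sketch, assuming a constant bitrate and no container overhead; the function names and the example bitrates and durations are mine, not taken from any particular encoder.

```python
def stream_size_bytes(kbps, seconds, bits_per_kilobit):
    """Size of a constant-bitrate stream, ignoring container overhead."""
    return kbps * bits_per_kilobit * seconds // 8


def size_gap_bytes(kbps, seconds):
    """Extra bytes produced if 1 kbps is read as 1024 bit/s instead of 1000."""
    return (stream_size_bytes(kbps, seconds, 1024)
            - stream_size_bytes(kbps, seconds, 1000))


# A 4-minute song at a nominal 128 kbps.
audio_gap = size_gap_bytes(128, 4 * 60)
# A 2-hour video at a nominal 1000 kbps.
video_gap = size_gap_bytes(1000, 2 * 60 * 60)
print(audio_gap, video_gap)
```

The relative gap is always 24/1000, i.e. 2.4%, which is where the roughly 2% figure comes from. On a single song that's well under a hundred kilobytes, but on a feature-length video it runs to tens of megabytes.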
Real is supporting AAC now (as of very recently--they use it in basically the same way they used Atrac3 in the past), but considering that a lot of Slashdotters don't seem to like Real, this doesn't really say that much about AAC not being locked in.
Did you look in the forums? Though there's some amount of flaming that goes on when people ask questions that they could answer themselves with a little work, most of the discussion that happens there actually does concern recent codec developments. Also, the forums provide a huge body of people who are more than willing to test out cutting-edge builds of codecs.
And don't get the idea that it's just a bunch of movie rippers in there. There's a fair number of people who work on XviD, write video filters, and develop a lot of other useful and interesting video software. For example, developers from On2 have user accounts there and discuss VP6 with the forum members from time to time.
I should note, for reference, that WMA version upgrades, at least until WMA Pro came out, were basically just encoder upgrades, which is why having so many versions of WMA (e.g. 2, 7, 8, 9 non-Pro) doesn't break hardware support.
Actually, no, these aren't those people.
These are people who do double-blind testing and who recognize that other so-called audiophiles are being silly when they buy ridiculously expensive power cables.
You might find the graphs for a previous listening test interesting if you want to see how AAC stacks up against other codecs.
Ack, realized that the grandparent was talking about running 2K4 on Linux without 3D acceleration.
The Q3 tournament at CXG was cancelled too, you know. (The only tournament that actually finished besides WC3 was Unreal Tournament 2K3.)