Visual Analysis Of Mp3 Encoders
Chris Johnson writes: "I've just finished an interesting scientific analysis of several mp3 encoders and have my findings up on the Web. The process involves differencing a 'sonogram' image from an encoded test signal with the image of the original signal, and then producing response curves showing the disparity in direct signal volume, and over time. Umm . . . which is just to say this is probably the most rigorous analysis of any encoders anywhere on the web, and very geeky (in a good way). LAME carries the day, but BladeEnc shows that it has a completely distinctive sonic approach- and Fraunhofer proves unacceptable (in the version I tested) for audiophile use, though it's unbeatable at very low bit rates. See why." Truth in advertising -- this is a cool example of how visual information can convey more than you'd expect it to.
If you are a big fan of classical you will have an opinion on _which_ parts of the sonic information are expendable
No, when a certain frequency component is discarded, it's not because the listener won't mind, it's because even if it's there, the listener cannot hear it. If you can't hear a sound, why encode it? Now, there are sometimes problems with classical music, but that's because it's often hard to predict exactly what you can and can't hear.
Opus: the Swiss army knife of audio codec
That said, it is generally the case that "pre-echo is bad" and "over-ring is bad." Reducing these can be thought of as a good thing. Let's assume that for these encoders, pre-echo and over-ring are universally bad (I'll give an example where this isn't the case, below). Furthermore, this comparison actually says nothing about these encoders other than the pre-echo or over-ring. I.e. what happened to the sound that was the "same" on the sonogram? It is quite possible for an "encoder" to mangle the audio quality yet have a pristine sonogram by this test's standards.
Just to throw a wrench in the works, more advanced encoders and/or psychoacoustic models can utilize what's called temporal masking. This is the ability of a higher-amplitude signal to mask (make inaudible) a lower-amplitude signal either before or after itself, as far as the human ear is concerned. Pre-echo is the phenomenon whereby a transient signal (i.e. a very 'sudden' attack, like a drum hit) is smeared in time. The audible effect can be most obnoxious. Yet encoders utilizing temporal masking will explicitly allow a certain amount of pre-echo through, as long as it is temporally masked. This leaves the encoder to spend those bits on other parts of the signal that would be more seriously degraded as far as our ear is concerned. In short, a sufficiently savvy encoder could exhibit more pre-echo than another worse-sounding encoder, especially if it uses temporal masking.
Quantitative analysis for perceptual audio coding is not easy; this has been a grail for researchers in the field for years. I strongly suggest that interested parties dig into various IEEE and AES (Audio Engineering Society) journal papers on the subject, as well as various books, etc.
Another fun experiment is to do this same thing sonically (makes a little more sense) -- encode to mp3, convert back to wave, and then subtract the original from the encoded one. The resulting wave will have all of the bits which were discarded.
It's difficult to interpret the results (I agree with those who say that this study is more or less worthless) but it does sound pretty neat. =)
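The subtraction experiment described above can be sketched in a few lines. This is only a minimal illustration, assuming both signals are already decoded to lists of samples at the same rate and are time-aligned (real MP3 decoders add a delay that would have to be trimmed first). The `residual` and `rms` helpers and the coarse quantizer standing in for an encode/decode round trip are all hypothetical.

```python
import math

def residual(original, decoded):
    """Return decoded - original, sample by sample (what the codec changed)."""
    n = min(len(original), len(decoded))  # ignore any trailing padding
    return [decoded[i] - original[i] for i in range(n)]

def rms(samples):
    """Root-mean-square level, a rough loudness measure for the residual."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# Toy stand-in for an encode/decode round trip: quantize to steps of 1/8.
original = [math.sin(2 * math.pi * 440 * t / 44100) for t in range(1024)]
decoded = [round(s * 8) / 8 for s in original]
diff = residual(original, decoded)
```

Writing `diff` back out as a wave file would give the "discarded bits" to listen to; `rms(diff)` gives a single number for how loud that residual is.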
While agreeing that for high quality audio one must "fuck mp3", I have to disagree with you that it will lose its appeal.
Right now, the attitude is "Why be able to store several hundred songs, when I can store several thousand..."
In a couple of years, the numbers will change but the rationale will be the same. Why store ten thousand PCMS when I can have a hundred thousand??
I agree at some point things will become meaningless, but there will have to be quite a major revolution first... Perhaps that infinite data storage by quantum methods. Perhaps I'm a bit too hesitant to rely overmuch on Moore's law.
E
Give me some of what you are smoking, dude!
MP3 distortions are very evident, especially at 128 kbps (so-called CD quality). They become less evident the higher the bitrate, but even at 320 kbps the distortions are still easily identified compared to the original CD.
The basic idea of mpeg is that the encoder removes the parts of the music which you (probably) can't hear. The encoder splits the sound into pieces, and rates each piece by how important it is to the total sound image. Then it starts with the most important sound and encodes that, continuing with the less important parts until the available bit rate is reached (e.g. 128 kbit/s). The rest of the sound data is discarded.
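A toy version of that allocation loop, assuming each "piece" already comes with an importance score and a bit cost. The names and numbers are made up for illustration, and real encoders quantize pieces more or less coarsely rather than keeping or dropping them whole.

```python
def allocate(pieces, budget_bits):
    """Keep the most important pieces until the bit budget runs out.

    pieces: list of (name, importance, cost_bits) tuples.
    Returns the names kept, most important first. Everything after the
    first piece that does not fit is discarded.
    """
    kept = []
    remaining = budget_bits
    for name, importance, cost in sorted(pieces, key=lambda p: p[1], reverse=True):
        if cost > remaining:
            break  # budget reached; the rest of the sound data is dropped
        kept.append(name)
        remaining -= cost
    return kept

# Hypothetical frame: (name, importance, cost in bits)
frame = [("bass", 0.9, 400), ("vocals", 0.8, 500), ("hiss", 0.1, 300), ("cymbal", 0.5, 350)]
kept = allocate(frame, 1000)
```

The hard part is producing the importance scores in the first place, not the loop itself.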
The tricky part is the calculation of the "importantness" of each sound, and that is what differentiates the encoders. This calculation is done with an algorithm called a "psychoacoustic model".
To measure the quality of an mpeg encoder automatically, you need an algorithm which calculates the quality of the encoded signal. Given this algorithm, it is trivial to create an encoder which will score maximum on this quality measurement, since the quality measurement algo is basically the same as the psychoacoustic model.
This test is "snake oil". A real test of an mpeg encoder unfortunately involves listening to the music to evaluate the psychoacoustic model of the encoder, not comparing two artificially created psychoacoustic models with each other.
RFC1925
Not really being an audiophile, I beg to differ. I got some tracks from the Lola Rennt film through Napster, remembering that I enjoyed the soundtrack as much as the film. They sounded all right on my Aureal Vortex 2 soundcard and the cheapest model Rotel amplifier. Nonetheless, when I bought the CD, the difference was noticeable. And we are talking about 192 kbps MP3s. The clarity of CDs is far superior to MP3s.
-- Spelling and grammar errors tend to be a sign of erroneous thinking.
Giving this sort of thing to Slashdot is as fun as nude mudwrestling. Gotta love it. :)
On the Mac, I would have to _pay_ to use the Xing encoder. I just got through a serious ramen-and-spaghettios period, and there's just no way I'm going to merrily throw money at people who not only support the mp3 licensing patentholders, but also make an encoder that is considered to be more prone to artifacts and ringiness than even the Fraunhofer high bit rate stuff.
Beat me, whip me, slashdot me and call me unrigorous, but I'm not paying money for Xing. The lurkers support me in email. So there ;)
I had to know why- no, scratch that, I knew why. I had to know which encoders did better- what they in turn traded off- and I had to know across a wide range of bit rates in a way I could quickly cross correlate.
I've written for (IMNSHO) the foremost High End Audio journal. It's not that I'm not interested in listening to encoders! But if they are _all_ quite compromised, why not break 'em down into a series of measurements relative against each other with clearly identifiable characteristics? Shows you what to listen for- and tips you off to particular issues.
You can add me to that list- and such a comparison (I naturally kept a logbook to be able to reproduce the process later) would indeed be meaningful to me. For instance, if Vorbis was more sophisticated in its control of over-ring and either imposed a flatter characteristic (resisting resonant peaks) or went for an intentionally tailored characteristic (say, suppressing ring around 3-5K like Fraunhofer 32K bit rate) this would have obvious and interesting application to the sound quality. Conversely, if it had big ugly peaks and artifacts, their location in the frequency response would tell a lot about the sonic signature of the encoder.
Doh! For years I've used a purely white background for airwindows.com, with a sort of vintage-cnet layout. I also used to keep a 'graphics' section in which I had some web background gifs I'd done. They were made like this:
x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
Do a diffusion dither between white and the lightest 'web safe' gray- then take all the pixels at x positions and knock 'em out to white too. The result (works with other colors as well) is a texture in which no two colored pixels are ever directly next to each other- it's a paperlike texture but never gets darker than half Netscape grey.
Which is to say- sorry, I did it that way because I liked it, and I'll keep it. Honest, I have done everything I possibly could to avoid obscuring the text, but it's sort of like a trade-off: in getting rid of additional table clutter that I used to have, I found that I liked the pages when this simpler layout was backed by the softest texture I had, rather than plain white.
I hope it didn't bother your eyes too much :)
x x x x x x x x x x x x x
x x x x x x x x x x x x x
x x x x x x x x x x x x x
x x x x x x x x x x x x x
x x x x x x x x x x x x x
x x x x x x x x x x x x x
x x x x x x x x x x x x x
x x x x x x x x x x x x x
Woops. Or I could have said 'checkerboard' and saved myself the hassle :)
The idea is from a company named Boxtop Software which produced a Photoshop plugin that put different web safe colors in checkerboard patterns to produce a much greater range of 'web safe' colors (which look solid). I figured, why not run with that and do textures that way? Maybe the Gimp would benefit from some websafe checkerboard texture generators too :)
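For anyone who wants to try the checkerboard knock-out, here is a rough sketch. It assumes #CCCCCC as the lightest web-safe gray and substitutes a plain random dither for a true diffusion dither, so it is not the exact recipe used for the backgrounds described above. The `texture` function and its parameters are hypothetical.

```python
import random

WHITE, GRAY = "#FFFFFF", "#CCCCCC"  # #CCCCCC: lightest web-safe gray (assumed)

def texture(width, height, density=0.5, seed=0):
    """Random white/gray dither with every checkerboard 'x' cell forced white.

    Gray can only land on cells where (x + y) is odd, so no two gray
    pixels ever touch horizontally or vertically.
    """
    rng = random.Random(seed)
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            if (x + y) % 2 == 0:
                row.append(WHITE)  # knocked-out checkerboard position
            else:
                row.append(GRAY if rng.random() < density else WHITE)
        rows.append(row)
    return rows

tile = texture(16, 8)
```

Because at most half the cells can ever be gray, the tile as a whole never averages darker than halfway between white and that gray.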
Actually, I think this would be a _very_ good experiment. I'm aware that my questioning some of these concepts is seen as prima facie evidence of being a tottering loony *g* but the whole concept of the psycho-acoustic model is so central to current audio theory... and this theory basically says, 'mp3s can be made to sound indistinguishable from CDs' and they cannot- the same theory on a broader level says 'CD itself is theoretically perfect sound', and it is not- mastering engineers, for instance, have learned that to do their work they need something better than CD audio.
I'm not certain that the psychoacoustic model must necessarily be that much better than, for instance, trying to diffuse unavoidable error as evenly as possible over the frequency and time domains. You are essentially insisting that concentrating the error in particular areas that are said to be 'masked' is far superior. This assumes the masking is effective, and that there are no side effects- neither assumption is wholly true, as large numbers of people are able to find fault with (say) 128K mp3s, and any filtering is going to impose extraneous characteristics. Finally, you're assuming that an encoder that does not have a psychoacoustic model (I assume this would mean one that diffuses error pretty uniformly) is going to perform 'very well' in the procedure I devised. I'm not sure of that- I'd like to try it experimentally before jumping to that conclusion.
Finally, I have to admit- I haven't got the faintest idea what the resulting sonogram, and frequency/overring characteristics, would look like. I can say some things about it- with regard to the over-ring, diffusing it over a wider frequency range is not only desirable but markedly preferable. Fraunhofer loses badly to LAME, sonically, over just this issue- and Blade gets away with its severe over-ring by diffusing it over a wider frequency range. If the experimental psychoacoustic-model-less encoder showed significant improvements in diffusing out this over-ring and reducing its duration- there would be legitimate applications for its tonal characteristics, even if the raw frequency response was noticeably compromised. It would be sort of like the 'anti-Blade'.
I don't suppose anyone will actually _try_ it, much less help me out with measuring it :P but if anyone is genuinely interested in investigating this, drop me a line? It sounds like something that could be attempted. Seriously- the whole point of such a model is 'masked stuff can't be heard'. If people can hear the masked error anyhow, what is the point? And if you assume people can't hear it anyhow and won't notice, what's the difference? Is it so axiomatic that you have to shun diffusing error evenly, and instead concentrate it in areas you think won't be heard?
You are talking about applying only the psychoacoustic model of the mp3 encoding, and producing a comparison of that with the original signal. I would indeed be really interested in seeing that- I'd like to know which of the various distortions, over-rings etc. arise from the psych model and which arise from the fractal part.
In the argument (lower in the thread) I was questioning whether you could skip the psych model entirely (pretend people can hear the difference between 128K mp3 and real life ;P ) and see just what you'd get if you went purely with the fractal encoding- trying to diffuse any and all error in the process as evenly as possible over frequency and time.
People will swear up and down that this will be drastically worse. I'd like to measure it in comparison with normal mp3 encoders and see exactly what it is, not just run around making theories that it's going to be awful. The one thing I'm willing to guess about it is that the sound will be the opposite of BladeEnc's sound. For some people that'll be bad- but the idea of an 'anti-Blade' might really interest others.
I don't know if anybody's comfortable enough with hacking on a version of LAME or whatever that they'd be willing to try it- I am going to bounce the idea off Martin Hairer, with whom I worked to perfect the sonogram-plotting program (I needed to request better picture export capacities- he came through like a trouper and fixed everything). I think he is the one who ported LAME to his program, and he might be both able to try such experiments, and interested in seeing what they do.
At any rate I wanted to say that your idea of isolating the transformations and considering them independently _is_ truly an interesting exercise- and I hope to be able to do such experiments, and learn from them, with a bit of work and patience :)
Spectral and waveform analysis and such has all been done before, and LAME has been known to be superior for quite some time. I've been singing the praises of this site for at least six months.
I'll agree that perception is what matters. However, what sounds great on my $48 Labtec speakers at work sounds like crap on my $500 studio headphones at home. The fact of the matter is, most people don't have $25,000 of audio equipment nor sufficiently trained ears to tell the difference. I'll readily use LAME encoded stuff from people I trust, but cringe in horror when I listen to the rapage that Xing's encoder performs to the quality of complex music.
Think of it this way: most people are arguing which color of crap tastes better. Sites like this one and the one in the article are trying to point out that you don't have to eat crap.
hymie
Kexis is a GPL'd lossless encoder which has proved to be _almost_ as good as shorten for filesize, is _much_ faster to decode and encode than any encoder I have ever used... The fact that the kexis file format may change in the future is largely a petty issue as you can simply losslessly convert from the old format to the new one. Have a look at it at http://kexis.sourceforge.net
Blade became popular because it was the first program to be banned by Fraunhofer. In fact, Blade is really a copy of the ISO reference code, optimized for speed. LAME incorporated massive quality improvements, but came too late to catch the wave of publicity offered to Blade. It would be nice to have access to the code which generated these sonograms.
r3mix.net is really the definitive site for this sort of thing. Not only does the site show waveform deviation, but the tester actually listens to lots of very diverse music to test for quality. The waveforms are used mainly to explain errors heard during listening (ie. what the hell is that fuzzy warp sound overriding the bassline?). So anyways, read up at r3mix.net -- you'll realize people have already done this much better.
-- jar
Absolutely. CD quality (44.1 kHz 16 bit PCM) is total CRAP to true audiophiles. I won't be satisfied until they invent a format that will store the timing and strength of every single air molecule hitting my eardrum, precise to within the Heisenberg uncertainty principle. Uncompressed.
some of my vinyl is way better
Vinyl sounds "warmer" because...
Will I retire or break 10K?
That's trivially proven to be incorrect, since gzip and bzip2 compress data and yet have the outputs be the same as the inputs. In an audio context, ten minutes of a pure-frequency sound could easily be compressed to a small size. The only information you really need to keep is the length of the tone and the frequency.
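The point is easy to check with a general-purpose lossless compressor. This sketch uses Python's standard `zlib` module on ten seconds of a synthetic sine tone; the sample rate, frequency, and variable names are arbitrary choices for the demo, not anything taken from the article.

```python
import math
import struct
import zlib

RATE = 8000     # samples per second (kept low so the demo runs quickly)
SECONDS = 10
FREQ = 400      # Hz; one cycle is exactly 20 samples at this rate

# 16-bit mono PCM of a pure sine tone.
samples = [round(32767 * math.sin(2 * math.pi * FREQ * n / RATE))
           for n in range(RATE * SECONDS)]
pcm = struct.pack("<%dh" % len(samples), *samples)

packed = zlib.compress(pcm, 9)       # lossless compression
restored = zlib.decompress(packed)   # bit-for-bit round trip

print(len(pcm), "bytes ->", len(packed), "bytes; identical:", restored == pcm)
```

Real music is nowhere near this repetitive, which is why lossless audio coders manage far more modest ratios, but the output is still identical to the input.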
"When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
I would also be very interested in seeing similar graphs (preferably from the same source) made with Vorbis encoders, to see how they stack up.
The Matrix is going down for reboot now! Stopping reality: OK. The system is halted.
the key point here is that mp3 encoding is in fact a process of two separate transformations (both of which consist of many processes, of course), the first of these is my bone of contention as it seems less well-known than the second, which i will address first.
the "second transformation" is the one familiar to most people, the iterative fractal encoding procedure, which simply adds information to that audio frame until it either a) hits a "quality threshold" (ie is considered good enough), or b) fills up its bitrate allocation. it's similar in many ways to making a "jpeg of sound". you can get a good view of this whole process by following this link to a graphic of the aac encoding process on fraunhofer's website. It is the stuff inside the box at the lower left that this concerns.
however the first transformation here is the important one, this is the stuff outside and above the box in the graphic linked above. (i am not sure the graphic is detailed enough, there may be some missing, from what i remember) - this is a series of transformations to limit the amount of data the second transformation has to deal with (and hence get essentially better encoding for the same bitrate), according to the way the human ear works. our ears have "features" like having a dead area in frequencies near loud noises, which means these bits can be cut out, and other bits and pieces that i can't remember and don't have to hand ;) this is of course psychoacoustics, as other people have commented. there is a _very_ basic primer on this at the fraunhofer site here, but it doesn't go into any technical detail.
as an aside, there used to be some fantastic and informative articles on these subjects at mp3.org back in the day (1997-1998?), may it rest in peace. does anyone have some links for where something as good on this subject is? i haven't been as in touch with the technical side of mpeg encoding as i used to be...
but anyway back on subject, this first transformation actually distorts the signal *significantly*, but only in a way that makes it easier to process, while still sounding the same (or close) to the human ear. it may be an interesting exercise to isolate this first transformation, apply it and then save without any fractal encoding, and compare that to the original signal. this transformation will cause great "visual degradation", as shown in the article, but imho this is not an accurate criterion for measuring audio quality. still interesting, and a good read, though :)
fross
What about your amp and speakers?
The ideal result from the process (totally unaltered waveform information) would be an entirely _black_ 'sonogram' at the end of the process. That's not going to happen. Since there are going to be deviations, it's down to the psychoacoustic model- and the pictures and charts are going to show what the encoder chose to throw away, on a larger scale.
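To make the "entirely black" idea concrete, here is a small sketch of differencing two spectrograms. It is not the sonogram program used for the article: it assumes a naive DFT on non-overlapping 64-sample frames, and the `spectrogram` and `difference_image` helpers plus the half-volume "lossy copy" are hypothetical stand-ins.

```python
import cmath
import math

def spectrogram(signal, frame=64):
    """Magnitude spectrogram via a naive DFT on non-overlapping frames.

    Returns a list of time frames; each holds frame // 2 + 1 magnitudes.
    """
    rows = []
    for start in range(0, len(signal) - frame + 1, frame):
        chunk = signal[start:start + frame]
        row = []
        for k in range(frame // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame)
                      for n, x in enumerate(chunk))
            row.append(abs(acc) / frame)
        rows.append(row)
    return rows

def difference_image(original, processed, frame=64):
    """Absolute per-cell difference of two spectrograms; all zeros is 'black'."""
    a = spectrogram(original, frame)
    b = spectrogram(processed, frame)
    return [[abs(x - y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(512)]  # sits on bin 8
dull = [0.5 * s for s in tone]                                   # stand-in lossy copy
image = difference_image(tone, dull)
```

An unaltered signal gives zeros everywhere; here the only non-black cells are in bin 8, where the copy lost half its level.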
You can argue that the encoder throws away stuff that can't be heard, therefore measuring _that_ is meaningless. This equates to arguing that the result is indistinguishable from the source audio. I disagree, and feel that all mp3s are audibly degraded from the source audio- which is itself degraded, being typically 16 bit 44.1K digital audio :)
I'm trying to measure what the encoder's failing to do. The project was meant to answer my own questions, and has done so.
Personally, I'm with LAME for my sonic requirements, although the only mp3s of my music out there (so far) are Blade, done many months ago before I did this research. But the point is not that there is a 'winner'- the point is that the differing sonic characteristics of these encoders CAN BE QUANTIFIED. Perhaps not measured outright (my charts etc. are _relative_ to each other), but these encoders take significantly different approaches to discarding information, and that applies directly to your choice of encoder for recording music, and translates to a completely predictable sonic characteristic of the encoder on ANY music, no matter what.
I put all sorts of music through Blade when I was on mp3.com with only Blade for a free encoder- no matter what I did, the result was always identifiably BladeEnc, with the smooth extended frequency response and absolutely terrible transient impact. For some pieces, this was suitable- for some it was grossly unsuitable. But the sonic characteristics were consistent- and correlate with what I learned about the encoder in this 'torture test'.
Ogg Vorbis?
You are a fucking moron.
__________________________________________________ ___
rooooar
I have been wondering about this kind of thing for a long time. I have used Lame as of late because it is very fast with the optimized compile I have. I wish it were as fast on VBR, but I guess I'll have to settle for CBR.
I'd really like to see something like this with Ogg Vorbis once it matures. Or now even, because it seems to be a bit better already, though it's hard to tell on my laptop speakers.
WARNING: there is a trojan on your
I've been using the VBR Lame Encoder 3.99 Alpha for a couple of weeks and I love it. It's fast, and it sounds great. I was using BladeEnc for a while. I have found that Lame sounds better, and using VBR will result in a smaller file than Blade and still sound better.
It's either on the beat or off the beat, it's that easy.
I moderate therefore I rule!
--
Ah, but nowhere does this article try to disprove that, does it? The whole point is that certain codecs do a better, more intelligent job of discarding information, and that is what the author set out to prove.
-----
This sig could have been put to good use.
Not that I particularly care, but this seems to be a shallow argument. When you're searching the skies, you're trying to FIND something; ignorance is NOT bliss in this case. When you're listening to music, all that matters is what you can hear. Now maybe there is a more scientific method to determine what you can hear, such that you can detect perceptible problems before you run into them, but other than that, who really cares?
The author is on drugs, is all I have to say. :)
;)
I'm taking a course currently on audio and image compression, and his article annoys me greatly. He uses ambiguous terminology and often the wrong terminology (for example, calling things "wavelets" that aren't actually wavelets). He describes things which can't be seen clearly in the graphs and would much better be viewed with a different display format. Etc.
I'm still wondering if some of my compression ideas will work... I plan to test them out before too long: grouping some of the generally weak high-frequency signals together since the human ear is less sensitive to high frequency pitch variation (we're sensitive to frequency on a logarithmic scale - an octave is a doubling of frequency); and, instead of doing block transforms on the music, generate a 2d image of the signal (graph: frequency vs. time), compress the frequency axis as you normally would, and instead of saving the time axis as a series of blocks of discrete frequencies, actually compress it greatly with a fft - doing this, you should be able to save space on recurring themes in songs (such as a chorus, a regular beat, etc). Voice may introduce complications, though, and I may end up having to do some kind of combination between the two (such as, compressing the difference between the original and final signal as a low quality block transform and saving it with the compressed signal). Two ideas of mine I plan to test when this incredible work load from my senior year stops bearing down on me
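As a rough picture of the first idea (grouping by octave, since each octave is a doubling of frequency), the band edges could be laid out like this. The `octave_bands` helper and the 20 Hz starting point are assumptions for illustration, not part of any existing codec.

```python
def octave_bands(low_hz, high_hz):
    """Split [low_hz, high_hz] into bands that each span one octave.

    Each band's upper edge is double its lower edge, so high bands cover
    far more hertz than low ones. The last band is clipped to high_hz.
    """
    bands = []
    f = low_hz
    while f < high_hz:
        bands.append((f, min(2 * f, high_hz)))
        f *= 2
    return bands

bands = octave_bands(20, 20480)
```

With these edges the top band alone spans 10240 Hz, while the bottom one spans only 20 Hz.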
- Rei
He's just being nice so my real father won't freeze him in carbonite and sell him for spice.
Audio quality for compression codecs cannot be measured in terms of visual graphs or synthetic benchmarks. (i.e. just comparing the difference between the original signal and the compressed signal does not work.)
It is quite possible to have a signal that very much resembles the original wave graph, and yet sounds horrible to the ear. It is equally possible to have a signal whose graph doesn't resemble the original very much, and yet has a much higher 'perceived' quality.
Just remember: The first rule in every single BEGINNER'S guide to sound is to "Trust your Ears," and that is the only way to tell a good codec from a bad one.
-----
Natural != (nontoxic || beneficial)
In fact I think I have seen this before and r3mix actually affected my approach to my encoder analysis. Definite kudos to r3mix, and I entirely agree with many of this site's decisions and approaches- interestingly they reach precisely the same conclusion as I did, that LAME 256 was the ideal archival encoder and LAME VBR was the best one for smaller file sizes- except that r3mix has added the recommendation that joint stereo be used in the latter case! (this would really hurt the relative comparison with higher bit rate stereo encoders with my mono test signal, but I think I will take the advice and try that for my own mp3s...)
r3mix also chooses to use _relative_ graphs rather than attempting to give absolute measurements, something I heartily approve of.
Now, here's the thing- r3mix's results are sometimes a subset and sometimes comparable to mine, just depicted in a different way. The primary measurement of a frequency sweep produces different-colored graphs- if you take the horizontal axis and express the vertical deviation of each graph, from an ideal line of flat reproduction at the top, as a brightness value of a single pixel, you'd get something akin to a single line on one of my 'sonograms'. The test with the 'applaud' signal is an example too- if you subtracted the source from the results you'd end up with distortion levels very similar to my differenced sonograms.
More interesting to me is the fact that my sonograms show an _intermediate_ step- several r3mix tests are the averaged responses of an encoder over time. That is exactly what my 'charts' are- they are sums of all the deviation and distortion over the entire length of a sonogram, over a range of frequencies.
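The step from a differenced sonogram to a chart is just a sum along the time axis. A minimal sketch, assuming the difference image is stored as a list of time frames with one deviation value per frequency bin; the `response_curve` name and the numbers are invented for the example.

```python
def response_curve(diff_image):
    """Sum a difference image over time, giving one total per frequency bin.

    diff_image: list of time frames, each a list of per-bin deviations.
    """
    if not diff_image:
        return []
    bins = len(diff_image[0])
    return [sum(frame[k] for frame in diff_image) for k in range(bins)]

# Three time frames, four frequency bins (made-up deviation values).
image = [
    [0.0, 0.1, 0.0, 0.5],
    [0.0, 0.2, 0.0, 0.5],
    [0.0, 0.3, 0.0, 0.5],
]
curve = response_curve(image)
```

A bin that is wrong in every frame (the last column here) stands out in the curve even if each individual pixel looks faint.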
I'm almost certain I'd seen r3mix before doing my own analyses- I think it's very likely that this site significantly helped me define the processes I used for my own stuff. I heartily recommend checking it out- this is good work, I totally endorse it, in fact I'm going to put a link to it on my own encoder page right now :) *put* there!
I think the guy's hearing with his eyes, or using a totally different set of music than what I listen to.
If you wanna hear how dog-fuckingly-shit Blade is, encode the first 10 seconds of New Order's "Blue Monday" (a basic drum machine emitting a sound common to much new-wave, dance, and industrial from around 1980 to the present day) at 128/160/192 using Blade, Fraun, and LAME.
Blade will be unlistenable at 128, shit at 160, and you may hear artifacts at 192 if you know what to listen for. LAME and Fraun sound sweet, even at 128.
Similar results can be achieved with a heavy guitar track, e.g. Def Leppard or other 80's "hair metal" bands.
I don't have data on string quartets - but for non-classical music, Blade blows steaming piles of donkey dick.
_Definitely_ an interesting site. Also, referring to the listening tests: "The Fraunhofer encoder produced a surprisingly harsh sounding attack on the guitar; it remained quick and sharp, but was artificially crisp and accentuated." That's precisely what I was trying to say, couldn't have put it better myself. It turns out Ars _likes_ that. I do not. But if you do- clearly, you're going to like Fraunhofer. It's not about picking a winner, it's like picking a musical instrument...
from the can-a-cue-cat-read-these? dept.
Well, after calibrating my cat on a couple of Pop-Tarts boxes, I tried several scans on the diagrams on the web page... nothing! I can therefore conclusively answer this question with a big, fat NO.
-----
MP3 is about selectively discarding information from the audiostream. The purpose is not to create an output waveform which is as close as possible to the input. This is what the whole business with the psycho-acoustic model is about.
The guy used the example of Fairport Convention with Sandy Denny.
I don't know about his rigor, but the guy's alright by me.
Who knows where the time goes?
OK, now we see what parts of the spectrum are thrown away at very low bit rate, but why is it supposed to be "probably the most rigorous analysis of any encoders anywhere on the web"? First off, the *only* way to evaluate the quality of a perceptual encoder is to listen to it, period. Who cares what is rejected (not encoded) if you don't hear it?
...
Also, while using the 32 kbps bitrate amplifies the effects of perceptual quantization, making them easy to see, the problem is that not all the encoders were meant to work at this bitrate.
Think about it, when standards institutes want to evaluate audio/speech codecs, they don't calculate sonograms like this, they make subjective tests. They have a bunch of listeners hear the result of many encoders on *many* audio files. That's right, you need many files to evaluate a codec. Some will perform better for certain musical instruments, some will perform better with or without background noise, echo,
For all these reasons, I do NOT consider this analysis rigorous at all!
Is it me, or does this seem like an oxymoron? Not being an audiophile, someone correct me if I'm wrong here... Audiophiles are interested in the most accurate reproduction of sound... Why would they even consider a lossy compression scheme at all? Just like serious digital artists shun JPEG for all but web distribution to the masses, and even then we see much done in gif or tiff. I would say that MP3 audio done by ANY encoder is unacceptable to an audiophile.
Second, I want to challenge some of the assumptions and declarations that this experimenter made. The experiments placed on these encoders are mostly "torture tests" that one would never encounter in real situations... And by using this series of torture tests he tells people which encoders are best for encoding mp3's. Does anyone see this reasoning as flawed? He's subjecting encoders to situations that NONE of them have been designed for, and proclaiming that this has something to do with reality. I see little correlation... How often do you hear pure sine sweeps in any song?
I found the previous mp3 performance analysis posted on Slashdot to be much more informative. It put the encoders up on real world performance, and rates them accordingly.
The guys who wrote the encoders realized that some things just wouldn't happen in normal music, such as these torture tests, so they wrote "shortcuts" that ignored these conditions, and resulted in a higher compression rate! How dare he rate encoders on something that the programmers all deliberately IGNORED.
My friends, trust no statistics that you did not falsify.
Because it's a real pain in the ass to mess with 300 CDs, but it's really easy to select a directory with 300 CDs worth of music and put it on random. You have no idea how useful it is until you put 4000 songs (I'm not kidding) on random. :)
WWJD? JWRTFM!!!
I would hope that anybody reading either what I wrote, or what you've just written, would avoid accepting unsupported claims, consider the facts of the situation, and make up their own minds...
Well, actually, there is a reason: the Xing encoder blows chunks. Sure, it's fast, but the sound quality sucks. If all you're encoding is Teeny Bopper of the Week music, then you're not missing out on anything. If you're encoding stuff that's a lot more complex, you're better off with something that doesn't sacrifice quality for speed.
hymie
http://users.belgacom.net/gc247244/analysis.htm#MP3ENC31
This is what I found when searching for mp3 comparison. It compares different implementations of encoding for mp3 as well as output quality. Much more useful and definitive.
Often wrong but never in doubt.
I am Jack9.
Everyone knows me.