2nd Multi-Format 128kbps Public Listening Test
technology is sexy writes "Roberto Amorim has launched his latest public listening test evaluating the performance of different audio codecs at 128kbps, among them Apple's AAC implementation (used in iTunes), LAME, the Ogg Vorbis fork aoTuV, WMA, Musepack and even Sony's Atrac3 format, which is soon to be used in their own music store. Read more on Hydrogenaudio and check out the results of prior tests. As opposed to most evaluations of audio codecs, this is a scientific test adhering to ITU-R BS.1116-1 as much as possible while still allowing everybody to participate."
Here's insightful. Ogg is a wrapper. It has nothing to do with the quality of the sound. You should be chanting Vorbis.
Etiquette is etiquette. He kills his mother but he can't wear grey trousers.
Also, a frequency plot tells us nothing about the phase or frequency distribution at certain times in the signal. I can make a sine sweep that would match exactly the spectrum of a pop song, but obviously would sound nothing like it.
There are ways of objectively measuring the performance of perceptual encoders, but frequency analysis isn't really one of them.
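To make the point about phase concrete, here is a minimal sketch (my own toy example, not anything from the listening test): a decaying "drum hit" and the same hit played backwards have identical magnitude spectra, even though one is a sharp attack and the other a slow swell. The signal, its length, and the helper name `dft_magnitudes` are all made up for illustration.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Magnitude of each bin of a naive O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

# Hypothetical test signal: a sine burst that starts loud and decays.
hit = [math.exp(-t / 8.0) * math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
# The same samples in reverse order: a swell that ends abruptly.
reversed_hit = hit[::-1]

mags_a = dft_magnitudes(hit)
mags_b = dft_magnitudes(reversed_hit)

# The largest per-bin difference in the magnitude spectra (only rounding error).
max_spectrum_diff = max(abs(a - b) for a, b in zip(mags_a, mags_b))
# The largest per-sample difference in the waveforms themselves (large).
max_waveform_diff = max(abs(a - b) for a, b in zip(hit, reversed_hit))

print(max_spectrum_diff, max_waveform_diff)
```

Reversing a real signal only conjugates its spectrum and adds a phase shift, so a plot of magnitudes cannot tell the two apart.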
Different codecs and implementations of those codecs may be optimized for different bitrates, so it's important to test codecs at various target bitrates.
Frequency analysis only gets you part way there. For those who didn't look around at the articles (I'm not referring to you, of course; just some hypothetical /. reader), there are time domain audio effects that are not visible on FFT plots. An example of this is pre-echo. With pre-echo you get an echo of an upcoming sound (like a drum beat) before the actual sound happens. This can happen when linear-phase FIR filters are used, but is also an artifact of some frequency domain encoder/decoder systems. The FFT is only part of the story.
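A rough sketch of how a frequency domain codec can produce pre-echo, under heavy assumptions: a single 64-sample block, a plain DFT instead of the MDCT real codecs use, and uniform rounding with a made-up step size in place of a real quantizer. The block is silent until sample 48, then a decaying burst starts. Quantization error introduced in the frequency domain spreads over the whole block once you transform back, including the part that used to be silent.

```python
import cmath
import math

BLOCK = 64      # hypothetical transform block length
ATTACK = 48     # the burst starts here; everything before is silence
STEP = 0.5      # hypothetical (very coarse) quantizer step

def dft(signal):
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def quantize(value):
    """Round real and imaginary parts to the nearest multiple of STEP."""
    return complex(round(value.real / STEP) * STEP, round(value.imag / STEP) * STEP)

# Silence followed by a sharp, decaying attack.
original = [0.0] * ATTACK + [0.9 ** m * math.cos(0.9 * m) for m in range(BLOCK - ATTACK)]

# "Encode" by quantizing the spectrum, then "decode" by transforming back.
decoded = idft([quantize(c) for c in dft(original)])

# Energy in the region that should be silent.
pre_energy_original = sum(x * x for x in original[:ATTACK])
pre_energy_decoded = sum(x * x for x in decoded[:ATTACK])

print(pre_energy_original, pre_energy_decoded)
```

The original has zero energy before the attack, while the decoded block has noise there. Real encoders fight this with things like short blocks around transients; none of that is modeled here.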
The different formats don't simply limit the frequencies stored. A given compression format will change the sound in different ways depending on what the input sound file is. Some codecs perform well with some types of sounds, but poorly with others (for example, the compression your cell phone uses is good at speech but lousy at music).
Also, not all frequencies are of equal importance to our ears. Our hearing is best in the middle range (near where the important elements of speech are), and tapers off above and below. And, if there are multiple sounds occurring at the same time (a loud guitar and soft violin), our ears don't hear the softer sounds as well.
You can't simply take an FFT of the input and output files and add up the differences, as not all differences are created equal.
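As a rough illustration of why differences can't just be summed, here is a sketch that scales an error at each frequency by the standard A-weighting curve before adding it up. A-weighting is only a crude stand-in for the ear's frequency sensitivity; it ignores masking entirely and is not what any of the tested codecs use as a psychoacoustic model. The `weighted_error` helper and the error amplitudes are invented for the example.

```python
import math

def a_weighting_db(freq_hz):
    """Standard A-weighting gain in dB (roughly 0 dB at 1 kHz)."""
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(numerator / denominator) + 2.0

def weighted_error(errors_by_freq):
    """Sum of squared error amplitudes, each scaled by its A-weighting gain."""
    total = 0.0
    for freq_hz, error_amplitude in errors_by_freq.items():
        gain = 10.0 ** (a_weighting_db(freq_hz) / 20.0)
        total += (gain * error_amplitude) ** 2
    return total

# The same raw error amplitude placed at a bass and a midrange frequency.
bass_error = weighted_error({50.0: 0.01})
mid_error = weighted_error({3000.0: 0.01})

print(bass_error, mid_error)
```

An unweighted sum would score these two errors the same; with the weighting, the midrange error counts for far more, which is closer to (though still a long way from) how a listener would judge them.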
DON'T CLICK THE LINK!
The sad thing is that somebody went to the trouble of putting together a perfectly reasonable, logical post just to throw in a porn link. *sigh*
Karma: Segmentation fault (tried to dereference a null post)
I must be deaf. I just did the test on the Kraftwerk sample file, and it took me a lot of relistening to finally pick out 3 out of 6 encoded files (although the first one - whatever it was - was fairly easy). The other 3 sounded exactly like the reference sample to me. This is using Sennheiser HD500 headphones and an Audigy ZX2 sound card.
Try doing the test, you might be surprised, or conversely if you're not surprised, you might contribute valuable information to the project.
The r3mix tuning (--r3mix), while a small step forward, was inherently flawed because of his insistence on tuning based on pictures instead of actual listening tests. As a result, the --dm-presets were invented and improved by Dibrom (the HydrogenAudio founder) along with a multitude of testers. Eventually those were included in LAME as the --alt-presets (and in the latest version they just replace the normal --presets). In short, Hydrogen Audio is THE place to go for this stuff now.
Jeremy