2nd Multi-Format 128kbps Public Listening Test
technology is sexy writes "Roberto Amorim has launched his latest public listening test evaluating the performance of different audio codecs at 128kbps, among them Apple's AAC implementation (used in iTunes), LAME, Ogg Vorbis fork aoTuV, WMA, Musepack and even Sony's Atrac3 format, which is soon to be used in their own music store. Read more on Hydrogenaudio and check out the results of prior tests. As opposed to most evaluations of audio codecs, this is a scientific test adhering to ITU-R BS.1116-1 as much as possible while still allowing everybody to participate."
Never heard of it.
Ogg, ogg ogg. Ogg oggity ogg ogg!
Now that that's out of the way, let the insightful comments begin.
Great, now all the ____ fanboys are going to forge results to make their codec look good. Talk about useless tests.
Not possible. All you will get is a bunch of WAV files; you have no way to tell which file belongs to which codec.
That said, I don't care which codec wins the test because Vorbis is still the only one free from patents and the margins are so incredibly small.
Vorbis will win for me even in the unlikely scenario that it comes out last.
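For anyone wondering how that blinding could work in practice, here is a minimal sketch. It is not the actual test software; the function name `blind_samples`, the "Sample N" labels, and the fixed seed are all made up for illustration. The idea is just that listeners see anonymous labels while the organizer keeps the key.

```python
import random

def blind_samples(codecs, seed=None):
    """Return (labels, key): anonymous labels for listeners and the secret mapping.

    Hypothetical helper, not taken from any real listening-test tool.
    """
    rng = random.Random(seed)
    shuffled = list(codecs)
    rng.shuffle(shuffled)
    # The key stays with the organizer; listeners only ever see the labels.
    key = {f"Sample {i + 1}": codec for i, codec in enumerate(shuffled)}
    return sorted(key), key

codecs = ["AAC", "LAME", "Vorbis", "WMA", "Musepack", "Atrac3"]
labels, key = blind_samples(codecs, seed=1234)
print(labels)  # what a listener gets: no codec names anywhere
```

A fanboy rating "Sample 3" has no idea whether he is boosting or sinking his favorite codec.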
My other account has a 3-digit UID.
Yes... certainly this kind of listening test is important to assess the capabilities of each codec.
But in the real world other factors may be more important when choosing a codec, for example general acceptance, freely available code and specs, and a large base of available content.
You see: performance will always increase in all codecs with time... so this kind of testing is only a minor factor among others.
You cannot proceed from the informal to formal by formal means
Because "human auditory capacity" is not fully understood. Sure we can give a standard frequency response graph, but most of these codecs take advantage of psychoacoustic hearing models -- where certain frequencies mask other frequencies in our perception. Since this is a developing field, objective listening tests could really help determine what's working and what's not.
For example, conventional wisdom says that the human ear cannot detect sounds above roughly 20kHz, yet there is at least some anecdotal evidence that higher order harmonics shape what we hear.
If "normal" human auditory capacity were a completely decoded topic, there wouldn't be nearly as much need for different approaches to music compression (it would be a much simpler problem with fewer possible solutions).
There are no karma whores, only moderation johns
Well I could be wrong, and forgive me if I've misinterpreted your post...but
Don't all of these compression algorithms rely on psychoacoustic modeling to remove 'extraneous' information from the bitstream?
If that is correct, and the algorithms are implemented correctly, then really what we are looking for is the best perceived result.
Just because the output meets the algorithm's input->output specs doesn't mean it's the best output as perceived by humans.
Maybe think of it as optimizing sort routines? Yep, bubble-sort or b-tree still output a sorted list, but the perceived value is that the b-tree is better because it performs its function more quickly.
This isn't an exercise in getting the frequencies algorithmically correct - the end result has to be listenable.
Humans are analog devices...
Uh, if the sample is the same length, and the bit rate is the same, won't the file size be the same as well? A 10 second sample at 128 Kb per second should be 1280 Kb regardless of the format, no?
And, just FYI, MOST people, something like 95% of listeners, cannot tell the difference between a 128kbps sample and the original. I generally can't, even with decent headphones on.
I think that all you compression elitist snobs work for HD manufacturers, trying to get me to buy a 250GB drive to store the same amount of music as my 60GB will hold!
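On the file-size arithmetic in the question above: for a strictly constant bitrate it does work out the same for every format. A quick sketch, assuming 1 kbps = 1000 bits per second and ignoring container headers and tags (the function names are just for illustration):

```python
def cbr_size_bytes(bitrate_kbps, seconds):
    """Payload size of a constant-bitrate stream, ignoring headers and tags."""
    bits = bitrate_kbps * 1000 * seconds  # assumes 1 kbps = 1000 bits/s
    return bits / 8

def average_kbps(size_bytes, seconds):
    """Inverse: the average bitrate implied by a file size and duration."""
    return size_bytes * 8 / seconds / 1000

size = cbr_size_bytes(128, 10)
print(f"{size:.0f} bytes, about {size / 1000:.0f} kB")  # 1280 kilobits = 160 kB
```

In practice the sizes will probably not match exactly, since some of the encoders in a test like this run in variable-bitrate mode and only aim for roughly 128 kbps on average, so what gets compared is the average bitrate rather than an identical file size.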
BTW, I think the difference between MP3 and Vorbis at 128 kb/s is perfectly noticeable. MP3 sounds rather bad, vorbis sounds pretty good. And the point is precisely to tell which format sounds best, so you don't want to do 512 kb/s bitrate where all formats sound close to CD quality.
Because "human auditory capacity" is not fully understood. Sure we can give a standard frequency response graph, but most of these codecs take advantage of psychoacoustic hearing models -- where certain frequencies mask other frequencies in our perception. Since this is a developing field, objective listening tests could really help determine what's working and what's not.
From my understanding of MP3 compression and others, the compression protocols take advantage of this frequency masking, so if humans can't hear it, it removes it. It also obviously takes into account frequency ranges of hearing. As a side note, I think it might be neat to be able to compress 30-50% better based on your personal hearing characteristics, but it'd stink if you got old and had to not only wear a hearing aid, but also start collecting MP3's all over again.
Also, a frequency plot tells us nothing about the phase or frequency distribution at certain times in the signal. I can make a sine sweep that would match exactly the spectrum of a pop song, but obviously would sound nothing like it.
There are ways of objectively measuring the performance of perceptual encoders, but frequency analysis isn't really one of them.
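To make the point about spectra concrete, here is a toy sketch (plain Python, naive DFT, made-up 32-sample signals, not a real audio tool). A sound with a sharp attack and the same sound played backwards have identical magnitude spectra, yet one is a "hit" and the other is a "swell":

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform, fine for tiny signals."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

attack = [0.8 ** t for t in range(32)]  # sharp onset, then decay
swell = attack[::-1]                    # same samples played backwards

mag_attack = [abs(c) for c in dft(attack)]
mag_swell = [abs(c) for c in dft(swell)]

# Largest difference between the two magnitude spectra (rounding error only).
max_diff = max(abs(a - b) for a, b in zip(mag_attack, mag_swell))
print(max_diff)
```

All of the difference between the two signals lives in the phase, which a plain frequency plot throws away.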
No matter *what*?
Not even if it's about average quality speakers?
Not even if it's about some rather cheap speakers?
I can't say I hear much of a difference with modern codecs, and I own some average speakers. Maybe 128 kbps mp3 can sound bad (although that depends a lot on the kind of music), but that's an aging codec anyway. I think encoded files in the 192 - 256 kbps range are the best, and 128 kbps oggs are often acceptable, especially with the DFX plugin (or similar) for Winamp to compensate for shortcomings in compressed formats.
I'd definitely not call 128 kbps in modern codecs "disgusting". With oggs I've found it to be roughly as good as 160-192 kbps mp3s, and that's perfectly fine for my ears.
Beware: In C++, your friends can see your privates!
Frequency analysis only gets you part way there. For those who didn't look around at the articles (I'm not referring to you, of course; just some hypothetical /. reader), there are time domain audio effects that are not visible on FFT plots. An example of this is pre-echo. With pre-echo you get an echo of an upcoming sound (like a drum beat) before the actual sound happens. This can happen when linear-phase FIR filters are used, but is also an artifact of some frequency domain encoder/decoder systems. The FFT is only part of the story.
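A toy illustration of the pre-echo effect described above, with loud caveats: real codecs use windowed, overlapping transforms and switch to short blocks around transients, whereas this sketch uses one plain 64-point DFT block and an arbitrary quantizer step (`N`, `ONSET` and `STEP` are made-up values). It only shows the mechanism: quantization error introduced in the frequency domain spreads over the whole block, including the silence before the hit.

```python
import cmath

N = 64       # block length (hypothetical)
ONSET = 48   # the "drum hit" starts here; everything before is silence
STEP = 0.5   # coarse quantizer step (hypothetical)

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    # Keep the real part; the tiny imaginary residue is discarded.
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def quantize(c, step):
    """Round real and imaginary parts to the nearest multiple of step."""
    return complex(round(c.real / step) * step, round(c.imag / step) * step)

signal = [0.0] * ONSET + [0.7 ** m for m in range(N - ONSET)]
decoded = idft([quantize(c, STEP) for c in dft(signal)])

# Energy in the part of the block that should be silent.
pre_original = sum(s * s for s in signal[:ONSET])
pre_decoded = sum(s * s for s in decoded[:ONSET])
print(pre_original, pre_decoded)
```

The original has exactly zero energy before the onset; the decoded block does not, and that leaked noise ahead of the hit is what shows up as pre-echo.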
FLAC! Flac-a-flac-a-flac!
Aflac? What does a silly duck have to do with sound compression?
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
When you listen to compressed audio over inexpensive speakers / headphones, you can't hear the difference. With my Sony Studio Monitor headphones, I lost the difference at about 250k with mp3, so I started using 320K as that was the best at the time. Then I bought $2000 Martin Logan Mosaic Speakers, and the original CD was clearly better than even the 320K bitrate. So now I only do lossless compression. That's fine at home, but in any other environment, there's usually so much noise and distractions that even if you had excellent headphones or speakers, you wouldn't appreciate that little difference lossless brings over 256K or even 128K.
And how do you know what you are asserting? Have you done properly controlled listening tests with 128kbps encoding using a variety of codecs?
The fact is that for a lot of people, the best codec at 128kbps is worth knowing because:
1) They are using portable devices where they are space constrained
2) They are using portable devices that may not have the perfect fidelity of a high-end sound system, but can go anywhere with them.
3) They are using their portable device in a somewhat noisy environment that overshadows any sound quality issues caused by a lower bitrate.
DON'T CLICK THE LINK!
The sad thing is that somebody went to the trouble of putting together a perfectly reasonable, logical post just to throw in a porn link. *sigh*
Karma: Segmentation fault (tried to dereference a null post)
The best replacement for r3mix.net in my opinion is HydrogenAudio. The forums are frequented by a lot of professionals, as well as developers of LAME, FLAC, Nero AAC, Musepack, Wavpack, and other codecs.
The r3mix tuning (--r3mix), while a small step forward, was inherently flawed because of his insistence on tuning based on pictures instead of actual listening tests. As a result, the --dm-presets were invented and improved by Dibrom (the HydrogenAudio founder) along with a multitude of testers. Eventually those were included in LAME as the --alt-presets (and in the latest version they just replace the normal --presets). In short, HydrogenAudio is THE place to go for this stuff now.
Jeremy