2nd Multi-Format 128kbps Public Listening Test
technology is sexy writes "Roberto Amorim has launched his latest public listening test evaluating the performance of different audio codecs at 128kbps, among them Apple's AAC implementation (used in iTunes), LAME, Ogg Vorbis fork auTuV, WMA, Musepack and even Sony's Atrac3 format, which is soon to be used in their own music store. Read more on Hydrogenaudio and check out the results of prior tests. As opposed to most evaluations of audio codecs, this is a scientific test adhering to ITU-R BS.1116-1 as much as possible while still allowing everybody to participate."
Never heard of it.
Ogg, ogg ogg. Ogg oggity ogg ogg!
Now that that's out of the way, let the insightful comments begin.
Great, now all the ____ fanboys are going to forge results to make their codec look good. Talk about useless tests.
Not possible. All you will get is a bunch of WAV files; you have no way to tell which file belongs to which codec.
That said, I don't care which codec wins the test because Vorbis is still the only one free from patents and the margins are so incredibly small.
Vorbis will win for me even in the unlikely scenario that it comes out last.
My other account has a 3-digit UID.
How do you base a listening test on the web? People with crappy speakers are going to say that all of them sound bad, yet the people who have better speakers are going to give better responses. This should be done in a controlled environment so that the hardware playing back the audio is standard.
Yes... certainly this kind of listening test is important to assess the capabilities of each codec.
But in the real world other factors may be more important when choosing a codec, for example general acceptance, freely available code and specs, and a large content base.
You see: performance will always increase in all codecs with time... so this kind of testing is only a minute factor amongst others.
You cannot proceed from the informal to formal by formal means
Because "human auditory capacity" is not fully understood. Sure we can give a standard frequency response graph, but most of these codecs take advantage of psychoacoustic hearing models -- where certain frequencies mask other frequencies in our perception. Since this is a developing field, objective listening tests could really help determine what's working and what's not.
For example, conventional wisdom says that the human ear cannot detect sounds above roughly 20kHz, yet there is at least some anecdotal evidence that higher order harmonics shape what we hear.
If "normal" human auditory capacity was a completely decoded topic, there wouldn't be nearly as much a need for different approaches to music compression (it would be a much simpler problem with fewer possible solutions)
There are no karma whores, only moderation johns
Why does anyone still use 128kbps? I hate it when I download music (legal ;) and the only bitrate available for the song I want is 128. With 200GB+ hard disks being so affordable these days and everyone having high speed, I think everyone should encode their (mp3||ogg||aac) at 192 or 256.
Well I could be wrong, and forgive me if I've misinterpreted your post...but
Don't all of these compression algorithms rely on psychoacoustic modeling to remove 'extraneous' information from the bitstream?
If that is correct, and the algorithms are implemented correctly, then really what we are looking for is the best perceived result.
Just because the output meets the algorithm's input->output specs doesn't mean it's the best output as perceived by humans.
Maybe think of it as optimizing sort routines? Yep, bubble-sort or b-tree still output a sorted list, but the perceived value is that the b-tree is better because it performs its function more quickly.
This isn't an exercise in getting the frequencies algorithmically correct - the end result has to be listenable.
Humans are analog devices...
Uh, if the sample is the same length, and the bit rate is the same, won't the file size be the same as well? A 10 second sample at 128 Kb per second should be 1280Kb regardless of the format, no?
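To spell out the arithmetic in the question above, here is a minimal sketch. It assumes a strictly constant bitrate and ignores container overhead and tags; the function name is made up for illustration. Note that the 1280 figure is in kilobits, so it needs dividing by 8 to get kilobytes.

```python
def cbr_size_kilobits(bitrate_kbps, seconds):
    """Size of a constant-bitrate stream in kilobits (no container overhead)."""
    return bitrate_kbps * seconds

kilobits = cbr_size_kilobits(128, 10)
kilobytes = kilobits / 8  # 8 bits per byte
print(kilobits, kilobytes)  # 1280 160.0
```

Encoders running in a variable-bitrate mode only aim for an average, so real files from different codecs at a nominal 128 can still come out at somewhat different sizes.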
And, just FYI, MOST people, something like 95% of listeners cannot tell the difference between 128kbps sample and the original. I generally can't, even with decent headphones on.
I think that all you compression elitist snobs work for HD manufacturers, trying to get me to buy a 250GB drive to store the same amount of music as my 60GB will hold!
BTW, I think the difference between MP3 and Vorbis at 128 kb/s is perfectly noticeable. MP3 sounds rather bad, vorbis sounds pretty good. And the point is precisely to tell which format sounds best, so you don't want to do 512 kb/s bitrate where all formats sound close to CD quality.
Because "human auditory capacity" is not fully understood. Sure we can give a standard frequency response graph, but most of these codecs take advantage of psychoacoustic hearing models -- where certain frequencies mask other frequencies in our perception. Since this is a developing field, objective listening tests could really help determine what's working and what's not.
From my understanding of MP3 compression and others, the compression protocols take advantage of this frequency masking, so if humans can't hear it, it removes it. It also obviously takes into account frequency ranges of hearing. As a side note, I think it might be neat to be able to compress 30-50% better based on your personal hearing characteristics, but it'd stink if you got old and had to not only wear a hearing aid, but also start collecting MP3's all over again.
Also, a frequency plot tells us nothing about the phase or frequency distribution at certain times in the signal. I can make a sine sweep that would match exactly the spectrum of a pop song, but obviously would sound nothing like it.
There are ways of objectively measuring the performance of perceptual encoders, but frequency analysis isn't really one of them.
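A rough way to see the point about phase and timing: the sketch below takes a short decaying burst and the same samples played backwards, and compares their magnitude spectra. The test signal is made up, and the naive DFT is only there to keep the example dependency-free; it is not how any codec or analysis tool actually works.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Magnitude of each bin of a naive O(N^2) discrete Fourier transform."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n)
    ]

# A decaying burst: loud at the start, fading out (made-up test signal).
n = 64
burst = [math.exp(-t / 8.0) * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
# The same samples played backwards: the sound now swells up and ends abruptly.
reversed_burst = burst[::-1]

mags_a = dft_magnitudes(burst)
mags_b = dft_magnitudes(reversed_burst)
max_diff = max(abs(a - b) for a, b in zip(mags_a, mags_b))

print(max_diff < 1e-9)          # True: the magnitude spectra match
print(burst == reversed_burst)  # False: the waveforms do not
```

Two signals that would sound quite different end up with the same magnitude plot, because reversing a real signal in time only changes the phase of each bin.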
No matter *what*?
Not even if it's about average quality speakers?
Not even if it's about some rather cheap speakers?
I can't say I hear much of a difference with modern codecs, and I own some average speakers. Maybe 128 kbps mp3 can sound bad (although that depends a lot on the kind of music), but that's an aging codec anyway. I think files encoded in the 192 - 256 kbps range are the best, and 128 kbps oggs are often acceptable, especially with the DFX plugin (or similar) for Winamp to compensate for shortcomings in compressed formats.
I'd definitely not call 128 kbps in modern codecs "disgusting". In oggs I've found it to be roughly as good as 160-192 kbps mp3s, and that's perfectly fine for my ears.
Beware: In C++, your friends can see your privates!
Different codecs and implementations of those codecs may be optimized for different bitrates, so it's important to test codecs at various target bitrates.
Frequency analysis only gets you part way there. For those who didn't look around at the articles (I'm not referring to you, of course; just some hypothetical /. reader), there are time domain audio effects that are not visible on FFT plots. An example of this is pre-echo. With pre-echo you get an echo of an upcoming sound (like a drum beat) before the actual sound happens. This can happen when linear-phase FIR filters are used, but is also an artifact of some frequency domain encoder/decoder systems. The FFT is only part of the story.
The different formats don't simply limit the frequencies stored. A given compression format will change the sound in different ways depending on what the input soundfile is. Some codecs perform well with some types of sounds, but poorly with others (for example, the compression your cell phone uses is good at speech but lousy at music).
Also, all frequencies aren't of equal importance to our ears. Our hearing is best in the middle range (near where the important elements of speech are), and tapers off above and below. And, if there are multiple sounds occurring at the same time (a loud guitar and soft violin), our ears don't hear the softer sounds as well.
You can't simply do an FFT of all the input and output files and simply add up the differences, as all the differences aren't created equal.
FLAC! Flac-a-flac-a-flac!
Aflac? What does a silly duck have to do with sound compression?
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
When you listen to compressed audio over inexpensive speakers / headphones, you can't hear the difference. With my Sony Studio Monitor headphones, I lost the difference at about 250k with mp3, so I started using 320K as that was the best at the time. Then I bought $2000 Martin Logan Mosaic Speakers, and the original CD was clearly better than even the 320K bitrate. So now I only do lossless compression. That's fine at home, but in any other environment, there's usually so much noise and distractions that even if you had excellent headphones or speakers, you wouldn't appreciate that little difference lossless brings over 256K or even 128K.
And how do you know what you are asserting? Have you done properly controlled listening tests with 128kbps encoding using a variety of codecs?
The fact is that for a lot of people, the best codec at 128kbps is worth knowing because:
1) They are using portable devices where they are space constrained
2) They are using portable devices that may not have the perfect fidelity of a high-end sound system, but can go anywhere with them.
3) They are using their portable device in a somewhat noisy environment that overshadows any sound quality issues caused by a lower bitrate.
DON'T CLICK THE LINK!
The sad thing is that somebody went to the trouble of putting together a perfectly reasonable, logical post just to throw in a porn link. *sigh*
Karma: Segmentation fault (tried to dereference a null post)
The best replacement for r3mix.net in my opinion is HydrogenAudio . The forums are frequented by a lot of professionals, as well as developers of LAME, FLAC, Nero AAC, Musepack, Wavpack, and other codecs.
I must be deaf. I just did the test on the Kraftwerk sample file, and it took me a lot of relistening to finally pick out 3 out of 6 encoded files (although the first one - whatever it was - was fairly easy). The other 3 sounded exactly like the reference sample to me. This is using Sennheiser HD500 headphones and an Audigy ZX2 sound card.
Try doing the test, you might be surprised, or conversely if you're not surprised, you might contribute valuable information to the project.
Switch back to Slashdot's D1 system.
Not possible. All you will get is a bunch of WAV-files, you have no way to tell which file belong to which codec.
Check the contents of the sampleXX.zip files; you actually get an mp3, an .ogg vorbis, an mp4 and 3 flacs. If you want to be biased either for or against mp3/oggvorbis/quicktime itunes AAC, you can.
SCO employee? Check out the bounty
The r3mix tuning (--r3mix), while a small step forward, was inherently flawed because of his insistence on tuning based on pictures instead of actual listening tests. As a result, the --dm-presets were invented and improved by Dibrom (the HydrogenAudio founder) along with a multitude of testers. Eventually those were included in LAME as the --alt-presets (and in the latest version they just replace the normal --presets). In short, Hydrogen Audio is THE place to go for this stuff now.
Jeremy
After a while, once you have weeded out the bad ways, you are going to reach the following situation. Each algorithm will perform very well for a large set of music and poorly for some small set of music. Barring pathologies, the poor set will be asymptotically fixable by increasing the bit rate. By the way, this is not just my opinion. There are theorems that say this is true of any compression scheme when applied to all problems.
What does this mean? It means that the end user is never going to work at the truly low end of the bit rate spectrum, because they want something that virtually always works. Plus they want a wee bit more just in case they have to transcode it. So if the recommended rate is 128, people will encode at 160.
So these comparisons need to be done not at the bitter edge where music flaws are easy to spot, because NO ONE WILL ACTUALLY MAKE THAT THE OPERATING POINT THEY USE. That is to say, everyone knows Vorbis sounds so-so at 64KB while MP3 sounds much worse. But no one wants so-so; they want darn good. So they are going to record their MP3 at 160, and at 160 Ogg and MP3 sound so close that the size of the test you'd have to do to pick up the difference is silly.
The proper way to do this is the following. Pick the gold standard format, say MP3, and its standard excellent operating point, say 160. Now test all the others at lower bit rates than 160, and see which one has the lowest bit rate that scores as good as the MP3 at 160.
Comparing all methods at a constant bit rate, especially a low one, is stupid.
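The procedure proposed above could be sketched roughly like this. Everything here is an assumption for illustration: the codec names, the score scale and every number are placeholders, not results from any listening test, and a real comparison would also have to account for statistical ties between scores.

```python
def lowest_matching_bitrate(scores_by_bitrate, reference_score):
    """Return the lowest bitrate whose score reaches reference_score, or None."""
    matching = [rate for rate, score in scores_by_bitrate.items()
                if score >= reference_score]
    return min(matching) if matching else None

# Placeholder numbers only, NOT measured results.
reference_score = 4.5  # hypothetical score of the reference codec at its operating point
hypothetical_scores = {
    "codec_a": {96: 4.1, 112: 4.5, 128: 4.7},
    "codec_b": {96: 3.8, 112: 4.2, 128: 4.6},
    "codec_c": {96: 3.5, 112: 3.9, 128: 4.3},
}

for name, scores in hypothetical_scores.items():
    print(name, lowest_matching_bitrate(scores, reference_score))
# codec_a 112
# codec_b 128
# codec_c None
```

The catch with this design is cost: every codec has to be tested at several bitrates instead of one, which multiplies the number of samples each listener has to sit through.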
Some drink at the fountain of knowledge. Others just gargle.
Also what kind of codec bias could you possibly be referring to?
Apparently he doesn't realize that this is a double-blind test - meaning neither the listener nor the tester knows what codec is being presented at any given time.
I'm taking the test now (well, not right now, taking a break) and it's about as scientific as I think you could make a public test taken in the home. Yes, the samples get compressed and then put in easily accessible folders with proper file name extensions, but you never know what you're actually listening to when you're running the testing program. All you have is a source file for comparison, then two buttons marked "1" and "2", one of which is the source again, the other a randomized codec. You never know which of the two buttons is the uncompressed source and you also never know which codec you're hearing. The results are also encrypted, so it's not as if you can just go into the results files and look at what codecs you favor.
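For the curious, the randomisation described above can be mimicked with a toy sketch like the one below. This is not the actual test software; the function name and codec labels are made up, and it only shows the idea of hiding both the codec choice and the position of the reference from the listener.

```python
import random

def make_blind_trial(codecs, rng):
    """Pick a hidden codec and hide the reference behind button 1 or 2."""
    hidden_codec = rng.choice(codecs)
    buttons = ["reference", hidden_codec]
    rng.shuffle(buttons)
    # The listener only ever sees the labels "1" and "2".
    return {"1": buttons[0], "2": buttons[1]}

rng = random.Random()  # unseeded, so the assignment differs from run to run
trial = make_blind_trial(["codec_a", "codec_b", "codec_c"], rng)
print(sorted(trial))  # ['1', '2']  (the mapping itself stays hidden)
```

As long as the mapping is kept away from the listener until the results are collected, there is nothing to vote up or down except what is actually heard.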
I suppose someone who's truly got the Ear of the Gods could listen to the samples outside of the testing program, pick various identifiable traits out of each, then listen for those traits in the testing program and vote up or down whatever codecs he or she chose, but that would be exceedingly difficult and more than a little time-consuming. I can't see how it would be worth it, especially as no single test result is going to skew the overall results to any significant degree.
This is the first time I've ever taken a test like this and I am honestly pretty shocked at how good all of these codecs sound. I am having a really hard time even deciding which is the compressed track most of the time, and I consider myself something of an audiophile. I'm even listening in a fairly controlled environment with a good pair of headphones, at a volume loud enough to hear any background noise clearly but below any clipping whatsoever. I will be surprised if any codec really does significantly better than the others consistently when we see the final test results.