Audio Format Listening Tests Concluded
Pointing to the conclusions of this listening study, nullity writes: "The results are interesting, and show a high variation in the performance of the various codecs on different musical styles. Ogg seems to work well on dance music, WMA8 on chamber music, etc."
Ogg seems to work well on dance music, WMA8 on chamber music, etc.
Like requiem...
These tests are all at 64 kbps and most people use much higher bitrates for real music. I'd like to see comparisons at 128k bits minimum, and preferably 160k or 192k, which is what most quality mp3's are at, for direct comparison.
Tests confirmed that attempting to encode "Aphex Twin" with any of these codecs caused the PC to tremble at a frequency that, when connected to a refracting laser stuck up Bill Gates' ass, had it spell out "we're all dead" on the nearest wall.
Well....not quite. There's a different frequency distribution between electronic, pop acoustic and classical music.
Specifically, electronic music, which most dance stuff is, has a very flat frequency distribution. See this for yourself - load your favourite media player, switch on the graphic equaliser graph and watch how basically nothing happens except in the mid-range.
Now try again with an orchestral piece. There will be much more variation, though in most it will tend towards the top end.
Now try again with rock. Tends towards the bottom and top, with middle frequencies missing.
Keep going with any format you feel like mentioning...you'll get the same.
Actually, this is a striking example of how recording techniques can ruin sound as well. Take a look at the Apollo 440 album - Gettin' High on Your Own Supply. A good mixture of guitars and electronics, right? Well, look at the frequency graph again. See how virtually every guitar frequency variation has been cut out: this music was recorded digitally, mostly using samples by the looks of it. The normal variations you'd associate with having guitars play live are all filtered out, and the graph goes back to the flat digital sound again.
Cheers,
Ian
Let's assume that anyone who likes Ogg and is seriously into music will compress their music with both Ogg variants and use the best variant for each file.
Therefore we should also consider taking the best of the two results and comparing it to mp3.
From a quick look at the results it appears that Ogg will still be edged out by mp3 when analysed in such a fashion, but it's much closer.
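A rough sketch of that "best of the two Ogg variants" comparison, in Python. The per-sample scores below are made-up placeholders, not numbers from the study, and the sample names are hypothetical; the codec column names follow the rankings table posted elsewhere in this discussion.

```python
# Hypothetical per-sample listener scores (NOT the study's data); higher is better.
scores = {
    "sample_a": {"oggq0": 3.9, "ogg64": 3.6, "mp3pro": 3.8},
    "sample_b": {"oggq0": 3.1, "ogg64": 3.5, "mp3pro": 3.7},
    "sample_c": {"oggq0": 4.2, "ogg64": 4.0, "mp3pro": 4.1},
}


def best_of_ogg(per_codec):
    """Score a careful user would get by keeping the better Ogg encode."""
    return max(per_codec["oggq0"], per_codec["ogg64"])


def compare(all_scores, rival="mp3pro"):
    """Count per-sample wins for best-of-Ogg versus one rival codec."""
    wins = {"ogg_best": 0, rival: 0, "tie": 0}
    for per_codec in all_scores.values():
        ogg = best_of_ogg(per_codec)
        other = per_codec[rival]
        if ogg > other:
            wins["ogg_best"] += 1
        elif ogg < other:
            wins[rival] += 1
        else:
            wins["tie"] += 1
    return wins


result = compare(scores)
print(result)
```

With the real per-sample table plugged in, the same loop would show whether picking per file actually closes the gap.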
Also a test on several bit rates would be useful.
See http://etbe.coker.com.au/ for my blog.
I guess grip will have to use Genre info from CDDB to decide what to encode the files as now. I wonder if you could set up something to optimize individual tracks. Like scan a wav and pick the best codec for the frequencies used in the audio.
Why not fork?
I noticed a number of confused posters here... The tested codecs were AAC/MP3PRO/OGG/WMA, not MP3. Had mp3 been tested, it would have lost every round as all of the tested codecs are vastly superior to plain MP3 at this bitrate.
It also should be noted that the only two samples that WMA beat OGG at (indeed the only ones that it didn't totally flop on) were two very simple samples that are demonstrations of two different weaknesses in the current revision of vorbis. Originally the results page had some very interesting commentary from Monty on this, but it looks like it got pulled.
With the exception of those two samples, OGG clearly won. Even including those, it was only beaten out by MP3PRO by a small margin. When you factor in that MP3PRO isn't available at anything but such low bitrates and that it's substantially more proprietary than MP3, it seems like pretty much a no-contest.
that these codecs are lossy, and take advantage of the fact that the human ear is better at hearing certain things than others to pare out extraneous info and improve compression. IOW, it doesn't matter how technically different the new files are as long as they still sound the same to the human ear.
BlackGriffen
Considering that different codecs do better at different music w/ different frequency spreads, who else thinks that the next generation of audio codecs will be multi-modal; in effect, be several codecs in one. Then have each codec specialize on certain types of music. Perhaps even have them run in an advanced mode where they do a frequency analysis of whole songs, rather than just using genre, to automatically select the best codec for the job. Perhaps even use different codecs for different sections of the song. That would definitely help songs like Bohemian Rhapsody and orchestral works with movements, etc.
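To make the "frequency analysis instead of genre" idea concrete, here is a toy sketch in Python. It uses spectral flatness (geometric mean over arithmetic mean of the power spectrum) as the deciding measure, which is my own choice for illustration and not something any of the tested encoders is known to do. The codec names and the 0.1 threshold are hypothetical.

```python
import cmath
import math
import random


def power_spectrum(samples):
    """Naive O(n^2) DFT power spectrum, DC bin skipped; fine for short frames."""
    n = len(samples)
    spectrum = []
    for k in range(1, n // 2):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2 + 1e-12)  # small floor avoids log(0)
    return spectrum


def spectral_flatness(samples):
    """Near 0 for a few sharp peaks (tonal), closer to 1 for a flat, noise-like spectrum."""
    p = power_spectrum(samples)
    log_mean = sum(math.log(x) for x in p) / len(p)
    return math.exp(log_mean) / (sum(p) / len(p))


def pick_codec(samples, threshold=0.1):
    """Hypothetical selector: the names and the threshold are placeholders."""
    if spectral_flatness(samples) > threshold:
        return "codec_for_noisy_material"
    return "codec_for_tonal_material"


N = 256
tone = [math.sin(2 * math.pi * 8 * t / N) for t in range(N)]  # one pure sine
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(N)]  # white-ish noise

print(pick_codec(tone), pick_codec(noise))
```

Running the same test per frame rather than per file would give the "different codecs for different sections" variant, at the cost of having to signal the switch in the stream.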
Would this be too time consuming to implement or what?
BlackGriffen
Just a side note about the frequency distribution of different styles of music:
The reason why classical music generally compresses better is because the frequency distribution of the sound of natural instruments like for instance string instruments (including the human voice) is harmonic. This means that the sound spectrum consists mainly of a superposition of peaks at the base frequencies of the instruments played and their corresponding harmonics at higher frequencies.
If you were to make a two dimensional spectral analysis of such a sound recording with the time axis to the right, the frequency to the top and the amplitude as the color intensity of the point you would see a lot of wiggling lines at regular distances. (BTW: this would make a great visualization plugin for xmms)
Since audio compression algorithms also make such a spectral analysis and after that discard some of the information below a threshold they can reproduce a mainly harmonic spectrum easier than that found in pop or rock music, which is much more complex and more "noisy" because of the use of distorting amplification and all kinds of percussion.
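A small Python illustration of the thresholding argument. This is a deliberate simplification (a plain DFT with a fixed cutoff at 1% of the peak magnitude), not the psychoacoustic model any real encoder uses; the signals and the cutoff are assumptions chosen for the demo.

```python
import cmath
import math
import random


def magnitudes(samples):
    """Naive DFT magnitudes for bins 1 .. n/2 - 1."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(1, n // 2)
    ]


def bins_kept(samples, rel_threshold=0.01):
    """How many frequency bins survive when everything below the cutoff is discarded."""
    mags = magnitudes(samples)
    floor = rel_threshold * max(mags)
    return sum(1 for m in mags if m >= floor)


N = 256
# Harmonic signal: a fundamental at bin 8 plus three overtones, like a bowed string.
harmonic = [
    sum(math.sin(2 * math.pi * 8 * h * t / N) / h for h in range(1, 5))
    for t in range(N)
]
# Noise-like signal standing in for distortion and percussion.
rng = random.Random(1)
noisy = [rng.uniform(-1.0, 1.0) for _ in range(N)]

print(bins_kept(harmonic), bins_kept(noisy))
```

The harmonic signal leaves only its four peaks to encode, while the noisy one keeps nearly every bin, which is the gist of why it costs more bits.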
Holger
Looking at the data, it looks like the two samples where Ogg performed poorly ended up being encoded at a significantly smaller average bitrate than any of the other encoders.
The table at the end lists LiszBMinor with an average ogg bitrate of 45 and BachS1007 with an average bitrate of 47. Since the other codecs encoded those samples at a bitrate 64 or higher, this may explain the results.
The results may point to a flaw in Ogg's VBR logic rather than in the lossy compression scheme.
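For scale, the arithmetic on those two averages, taking 64 kbps as the nominal rate the other codecs hit (they may have gone slightly above it):

```python
NOMINAL_KBPS = 64
# Average Ogg bitrates quoted from the table at the end of the results.
ogg_avg_kbps = {"LiszBMinor": 45, "BachS1007": 47}


def shortfall_percent(actual_kbps, nominal_kbps=NOMINAL_KBPS):
    """How far below the nominal rate an encode landed, in percent."""
    return (1 - actual_kbps / nominal_kbps) * 100


for name, kbps in ogg_avg_kbps.items():
    print(f"{name}: {shortfall_percent(kbps):.1f}% fewer bits than {NOMINAL_KBPS} kbps")
```

So on exactly the two samples where it lost, Ogg was working with roughly 27 to 30 percent fewer bits than the competition.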
OVERALL RANKINGS (12 SAMPLES)
mp3pro 49.00
oggq0 44.00
ogg64 40.00
wm8 24.00
aac 23.00
The AC above me speaks the truth. mp3PRO has no hope of gaining enough market share to become a worthy competitor. It's a very proprietary extension to MP3. OGG being open source and free (as in beer) has clear advantages for hardware vendors (where it really counts). Let's hope the codec is easy to embed into portable products.
I want my Portable OGG CD Player! I'll buy the first one that comes out. Could you imagine? Twice the capacity of normal players and it STILL sounds better (or same capacity truly indistinguishable from CD -- at only 128k). Right now I have to encode my mp3's at ~180-220kbit to get something acceptable. =/
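Back-of-envelope numbers for the "twice the capacity" claim, as a Python sketch. The 700 MB disc size (taken as 700 * 10^6 bytes) is my assumption, and filesystem and container overhead are ignored.

```python
CD_BYTES = 700 * 10**6  # assumed data CD capacity; not from the post


def hours_of_audio(kbps, capacity_bytes=CD_BYTES):
    """Playing time that fits on the disc at a constant bitrate."""
    seconds = capacity_bytes * 8 / (kbps * 1000)
    return seconds / 3600


for rate in (64, 128, 192):
    print(f"{rate} kbps: about {hours_of_audio(rate):.1f} hours")
```

Halving the bitrate from 128 to 64 kbps doubles the playing time, from about 12 hours to about 24 on a disc of that size.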
ABC/HR.. as in ABC/Hidden Reference... as in, there is a copy of the original track included as a hidden reference on every single trial.
The users are given 2 sliders per sample laid out on a panel. The samples are loaded in random order. On the sliders for each sample, one slider is for the original sample, and one is for the encoded. These are also randomized per sample. The user does not know which is which. If they happen to rate the original sample less than 5.0 (highest rating, meaning it should be transparent), then their results are disregarded entirely for that sample.
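The rejection rule is simple enough to sketch in Python. The record layout and field names here are hypothetical; only the rule itself (throw out a rating when the hidden original was scored below 5.0) comes from the description above.

```python
# Hypothetical result records; the field names are illustrative, not the ABC/HR file format.
trials = [
    {"sample": "s1", "codec": "oggq0", "reference_score": 5.0, "encoded_score": 3.8},
    {"sample": "s1", "codec": "wm8", "reference_score": 4.2, "encoded_score": 5.0},
    {"sample": "s2", "codec": "aac", "reference_score": 5.0, "encoded_score": 2.9},
]


def valid_ratings(all_trials):
    """Keep a rating only if the listener left the hidden original at 5.0.

    A lower score on the reference means the listener mistook it for the
    encoded version, so that rating is discarded.
    """
    return [t for t in all_trials if t["reference_score"] >= 5.0]


kept = valid_ratings(trials)
print([(t["sample"], t["codec"], t["encoded_score"]) for t in kept])
```

In this made-up data the second record is dropped because the listener marked down the original instead of the encode.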
Been using Ogg/Vorbis/Squish on Quicktime for a year. The Ogg/Vorbis/Squish codec got much better between 1.0rc2 and 1.0. At 128k it's already better than mp3 and the managed bitrate encoding is faster than the hard drive can read. The real value is of course, the ability to read these encoded files as long as there is UNIX. Mp3 is going to die and when it does there won't be any appliance makers interested in paying the $10,000 royalty to support mp3.
To those of us who just want to listen to music on a PC, the newest greatest best algorithms are always good (mp3pro, oggs, wma8). But for many, the goal is to put that music on a MP3Player and listen to it anywhere. I'll summarize the support of these various codecs by MP3Players, as well as mention whether or not my MP3Player (RioVolt SP100) supports them.
MP3PRO -- little support on MP3Players. Not supported by RioVolt SP100.
Oggs -- little/no support on MP3players. Not supported by RioVolt SP100.
WMA8 -- little support on MP3players, though many support older WMA's. Not supported by RioVolt.
So, in summary, all of these new formats are completely useless to me on my MP3Player. The one option they present is if I want to encode something in two formats -- one for my computer, and another for the MP3Player.
Personally, I think more work should go into fractal encoding, as most music has fractal patterns in it (especially Bach's music).
social sciences can never use experience to verify their statemen
I am Karma Man, hear me Whore.
An honest double-blind listening test is extremely difficult to arrange, and there is no evidence whatsoever of such on the site.
This is how the test was conducted.
The test required access to a Windows machine (probably Win95 and up, didn't try with Win3.1) with a sound card. Users were required to download the ABC/HR "practice" Zip file, which includes the ABC/HR program, the Ogg Vorbis 1.0 command-line encoder and decoder, a LAME command-line encoder/decoder (I forget which version), a FLAC command-line decoder program, and a .flac sample file (the instrumental introduction to The Eagles' "New Kid in Town").
After unzipping this, the user had to run a batch file (encdec_foobar.bat) which un-FLACced the sample file, then encoded it with Ogg Vorbis and LAME, then decoded both of the resulting files back to .wav.
Then the user executed the ABC/HR program, which is a Win32 GUI application. After loading the sample into the application (pull-down menu and file selector dialog), the interface became a row of double-slider pairs. Below each slider was a "Play" button. Below each slider pair was a "Play Ref" button. Below that was a "Stop" button. There was a pair of sliders for each decoded sample -- so for the practice run, there were two pairs of sliders: one for file #1, and one for file #2. The user did not know which file was Ogg Vorbis, and which was LAME MP3.
The user then listened to the Reference file by clicking any of the "Play Ref" buttons. After hearing the Reference, the user could then click any of the normal "Play" buttons. The first task was to determine, for each pair of sliders, which one was the original and which one was the encoded file. Having determined that, the user used the slider (which went from 1.0 to 5.0 in increments of 0.1) to "score" the sample on the subjective quality of the result. There were also text labels on the slider: 4.0 was "perceptible but not annoying", 3.0 was "slightly annoying", 2.0 was "annoying" and 1.0 was "very annoying".
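For reference, the slider scale as a small Python lookup. Positions are handled as integer tenths to sidestep float rounding; the function and variable names are mine, and the labels are the four quoted above (no label for 5.0 is quoted, so none is filled in).

```python
# Labels as quoted for the ABC/HR slider; keys are scores in tenths (40 means 4.0).
LABELS = {
    40: "perceptible but not annoying",
    30: "slightly annoying",
    20: "annoying",
    10: "very annoying",
}

# Every reachable slider position: 1.0 to 5.0 in steps of 0.1, stored as tenths.
POSITIONS = list(range(10, 51))


def label_for(score):
    """Return the text label at this slider position, or None if it is unlabeled."""
    tenths = round(score * 10)
    if tenths not in POSITIONS:
        raise ValueError("score must lie between 1.0 and 5.0")
    return LABELS.get(tenths)


print(len(POSITIONS), label_for(3.0), label_for(3.5))
```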
Finally, there was an ABX button, which launched a different window. In the ABX window, the user could select "Original", "Sample 1", or "Sample 2" for the "A" and "B" samples. Normal ABX testing proceeded from that point. (If you don't know what ABX is, go to pcabx.com.) I found that the ABX window sometimes helped me to focus on a specific sample so that I could find its flaws; armed with that knowledge, I was able to make a determination of which of the two sliders, right or left, was the encoded version.
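For anyone wondering how an ABX run is usually read: the standard approach is a one-sided binomial test against coin-flip guessing. The Python sketch below shows that calculation; it is not taken from the ABC/HR program, and the trial counts are just examples.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


# Example counts only, not results from the test.
print(f"12 of 16 correct: p = {abx_p_value(12, 16):.4f}")
print(f" 8 of 16 correct: p = {abx_p_value(8, 16):.4f}")
```

12 of 16 comes out a little under 0.04, which most people would accept as hearing a real difference; 8 of 16 is indistinguishable from guessing.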
Once a slider was pulled down from the default 5.0 position, another button became active above that slider. Clicking on it opened a new window with a text box, into which comments could be typed. When the user was finished with the test, the slider positions, the comments, and the ABX results (if any) were written to a plain text file (DOS CR/LF format), which was to be mailed to the test administrator. (Though, of course, you weren't supposed to mail the practice results.)
Now, that was just the practice session, which was a prerequisite for participation in the actual test. For the actual test, the process was similar, but differed in a few details.
The actual test samples included copyrighted, patented codecs for which there are no freely distributable decoders. Therefore, the WMA, AAC and MP3Pro samples were distributed as FLAC files, and decoded by the batch file. Since MP3 did not participate in the listening test, the LAME encoder was not used during the actual test. The Vorbis encoder, of course, was used twice: first with -q 0, and then with -b 64 --managed.
With 5 encodings per audio sample in the actual test, there were 5 pairs of active sliders instead of only 2 pairs. But otherwise, the actual test was exactly like the practice session.
(Personal note: I did 10 of the 12 samples, skipping the two classical ones. Out of 50 encoded versions of the 10 samples, there was only one case where I couldn't tell right from left -- "The Source", encoded with MP3Pro.)
And if you log onto kazaa, download the mp3 and then attempt to do this quality comparison, you are not qualified to ever post on slashdot again. :P
This is an interesting and relatively well done test (although it appears that the listeners knew which format they were listening to, so it wasn't truly double-blind, and an anti-MS and pro-Ogg bias can't be ruled out).
However, some discussions seem to be focusing on this saying AAC is bad or WMA is bad, when really it refers to the particular implementations in codecs of those formats.
For example, the Apple MPEG-4 AAC-LC encoder was used for AAC. This is a Low Complexity version of the format. Also, the Apple encoder has a strange limitation where it automatically converts 44.1 kHz stereo to 32 kHz stereo at that data rate. This isn't required by the AAC format. Other AAC encoders yield MUCH better results, and beat MP3 Pro in double-blind testing. I haven't seen any double-blind comparisons between AAC and Ogg.
Also, the WMA8 encoder is due to be replaced by the backwards-compatible WMA9 in early September. Of course, there may well be improved versions of the other encoders by then as well.
My video compression blog