Visual Analysis Of Mp3 Encoders
Chris Johnson writes: "I've just finished an interesting scientific analysis of several mp3 encoders and have my findings up on the Web. The process involves differencing a 'sonogram' image from an encoded test signal with the image of the original signal, and then producing response curves showing the disparity in direct signal volume, and over time. Umm . . . which is just to say this is probably the most rigorous analysis of any encoders anywhere on the web, and very geeky (in a good way). LAME carries the day, but BladeEnc shows that it has a completely distinctive sonic approach- and Fraunhofer proves unacceptable (in the version I tested) for audiophile use, though it's unbeatable at very low bit rates. See why." Truth in advertising -- this is a cool example of how visual information can convey more than you'd expect it to.
How does Ogg Vorbis hold up against these?
The human ear is tuned to be more receptive at the frequency ranges of the human voice.
You can lose more detail at high and low frequencies without it having as noticeable an effect on the sound as perceived by the listener.
If you are a big fan of classical you will have an opinion on _which_ parts of the sonic information are expendable.
No, when a certain frequency component is discarded, it's not because the listener won't mind, it's because even if it's there, the listener cannot hear it. If you can't hear a sound, why encode it? Now, there are sometimes problems with classical music, but that's because it's often hard to predict exactly what you can and can't hear.
Opus: the Swiss army knife of audio codec
Rather than sitting and listening to all the different encoder/decoder combinations, wouldn't you prefer to view some metric that you can evaluate at a glance?
Besides, something that might show up in the visual depiction may be audible, but not necessarily obvious the first time you listen. It's kinda like when you're at the eye doctor and he's flipping through lenses: "Is this one better? How about this one? Is the first one better than the second one? First one? Second one?"
You may not notice a visual problem with one of the lenses at first, but then after wearing them all day, you get a headache.
Why are you letting these clowns ruin our country?
Well, you might not hear the difference, but others might...
Mikael Jacobson
Greylisting is to SMTP as NAT is to IPv4
That said, it is generally the case that "pre-echo is bad" and "over-ring is bad." Reducing these can be thought of as a good thing. Let's assume that for these encoders, pre-echo and over-ring are universally bad (I'll give an example where this isn't the case, below). Furthermore, this comparison actually says nothing about these encoders other than the pre-echo or over-ring. I.e. what happened to the sound that was the "same" on the sonogram? It is quite possible for an "encoder" to mangle the audio quality yet have a pristine sonogram by this test's standards.
Just to throw a wrench in the works, more advanced encoders and/or psychoacoustic models can utilize what's called temporal masking. This is the ability of a higher-amplitude signal to mask (make inaudible) a lower-amplitude signal either before or after itself, as far as the human ear is concerned. Pre-echo is the phenomenon whereby a transient signal (i.e. a very 'sudden' attack, like a drum hit) is smeared in time. The audible effect can be most obnoxious. Yet encoders utilizing temporal masking will explicitly allow a certain amount of pre-echo through, as long as it is temporally masked. This leaves the encoder to spend those bits on other parts of the signal that would be more seriously degraded as far as our ear is concerned. In short, a sufficiently savvy encoder could exhibit more pre-echo than another worse-sounding encoder, especially if it uses temporal masking.
Quantitative analysis for perceptual audio coding is not easy; this has been a grail for researchers in the field for years. I strongly suggest that interested parties dig into various IEEE and AES (Audio Engineering Society) journal papers on the subject, as well as various books, etc.
Another fun experiment is to do this same thing sonically (makes a little more sense) -- encode to mp3, convert back to wave, and then subtract the original from the encoded one. The resulting wave will have all of the bits which were discarded.
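A minimal sketch of that subtraction, in case anyone wants to try it. There is no real MP3 codec here: `fake_codec` is a made-up stand-in that just rounds samples to a coarse grid, and `residual`, `rms`, and the `delay` parameter are illustrative names. With a real encoder you would decode back to PCM and line the two files up first, since the round trip usually shifts the audio by some number of samples.

```python
import math

def residual(original, decoded, delay=0):
    """Return decoded minus original, sample by sample.

    `delay` is a hypothetical count of extra leading samples in `decoded`;
    the two signals must be aligned before subtracting.
    """
    aligned = decoded[delay:]
    n = min(len(original), len(aligned))
    return [aligned[i] - original[i] for i in range(n)]

def rms(samples):
    """Root-mean-square level of a list of samples (0.0 for an empty list)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def fake_codec(samples, step=0.05):
    """Stand-in for encode + decode (an assumption, not a real codec):
    round every sample to the nearest multiple of `step`."""
    return [round(s / step) * step for s in samples]

# A tenth of a second of a 440 Hz tone at 44.1 kHz as the "original".
original = [math.sin(2 * math.pi * 440 * t / 44100) for t in range(4410)]
decoded = fake_codec(original)

# The difference signal: everything the round trip changed.
diff = residual(original, decoded)
```

Writing `diff` out as a wave file and listening to it is the experiment described above; `rms(diff)` only gives a rough single-number level for it, which says nothing about how audible it is.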
It's difficult to interpret the results (I agree with those who say that this study is more or less worthless) but it does sound pretty neat. =)
mp3 is not popular because it saves hard drive space, it is popular because it saves internet bandwidth... (all those people using napster thru a modem)
While agreeing that for high quality audio one must "fuck mp3", I have to disagree with you that it will lose its appeal.
Right now, the attitude is "Why be able to store several hundred songs, when I can store several thousand..."
In a couple of years, the numbers will change but the rationale will be the same. Why store ten thousand PCMS when I can have a hundred thousand??
I agree at some point things will become meaningless, but there will have to be quite a major revolution first... Perhaps that infinite data storage by quantum methods. Perhaps I'm a bit too hesitant to rely overmuch on Moore's law.
E
There's another dimension in audio that will eat up more hard disk space... As hard drives get larger, will the high end audio people still stick with 44 kHz stereo? I think not. As the capability of machines to handle much finer sampling rates increases, so will file size. As it is we've been seeing a lot about DVD quality audio, or the Sony system... Plus as things get better/faster/cheaper, I wonder if quadraphonic sound, or something else of that nature, gives file size another doubling.
Though speed and storage double easily, I've noticed, so do audio file sizes. There comes a point in the future where we are just not sure anymore, but I think at least for the foreseeable future, audio compression will become more important, not less.
E
Wait a minute... aren't "MP3 Encoders" supposed to produce .mp3 files? .ogg =/= .mp3. I don't know about the rest of you, but my portable mp3 player can't read a .ogg file, so what's the point of making them?
- darellik
this is the problem when one looks at MP3 purely as a technology. if you want to boil it down to pure psycho-acoustics, of course selective discard is the ultimate goal.
personally, i want to see mp3 music come as close to uncompressed music as possible. i want to encode my songs without tinniness or that annoying "swoosh". to me, an effective method of sound compression has no compression artifacts and has an output exactly the same as the input.
i think people who really listen to music should go for MP3's that SOUND good, not just look good in a white paper.
Give me some of what you are smoking, dude!
MP3 distortions are very evident, especially at 128 kbps (so-called CD quality). They become less evident the higher the bitrate, but even at 320 kbps the distortions are still easily identified compared to the original CD.
and gogo took this even further..
It took LAME's quality and then was optimized for speed...
The secret of success is honesty and fair dealing. If you can fake those, you've got it made. (Marx)
The basic idea of mpeg is that the encoder removes the parts of the music which you (probably) can't hear. The encoder splits the sound into pieces, and rates each piece after how important it is for the total sound image. Then it starts with the most important sound and encodes that, and continuing with the less important parts until the available bit rate is reached (e.g 128kbit/s). The rest of the sound data is discarded.
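The "most important first, until the bits run out" idea can be sketched as a greedy loop. This is only a toy: the piece names, importance scores, and bit costs below are invented for illustration, and real encoders do not literally work on named pieces like this (they adjust how coarsely each frequency band is quantized).

```python
def allocate(pieces, budget_bits):
    """Greedy sketch of the description above.

    `pieces` is a list of (name, importance, cost_bits) tuples. Pieces are
    taken in order of decreasing importance until the next one would not
    fit in the budget; everything after that point is discarded.
    """
    kept, used = [], 0
    ranked = sorted(pieces, key=lambda p: p[1], reverse=True)
    for name, importance, cost in ranked:
        if used + cost > budget_bits:
            break  # budget reached: the rest of the sound data is dropped
        kept.append(name)
        used += cost
    return kept, used

# Hypothetical frame: scores and costs are made up for the example.
frame = [
    ("bass", 0.9, 40),
    ("vocals", 1.0, 50),
    ("hiss", 0.1, 30),
    ("cymbals", 0.5, 30),
]
kept, used = allocate(frame, budget_bits=128)
```

The interesting part is where the importance scores come from, which is exactly the psychoacoustic model; the loop itself is the easy bit.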
The tricky part is the calculation of the "importantness" of each sound, and that is what differentiates the encoders. This calculation is done with an algorithm called "a psycho acoustic model".
To measure the quality of an mpeg encoder automatically, you need an algorithm which calculates the quality of the encoded signal. Knowing this algorithm, it is trivial to create an encoder which will score maximum on this quality measurement, since the quality measurement algo is basically the same as the psychoacoustic model.
This test is "snake oil". A real test of an mpeg encoder unfortunately involves listening to the music to evaluate the psychoacoustic model of the encoder, not comparing two artificially created psychoacoustic models with each other.
RFC1925
Not really being an audiophile, I beg to differ. I got some tracks from the Lola Rennt film through Napster, remembering that I enjoyed the soundtrack as much as the film. They sounded all right on my Aureal Vortex 2 soundcard and the cheapest model Rotel amplifier. Nonetheless, when I bought the CD, the difference was noticeable. And we are talking about 192 kbps MP3s. The clarity of CDs is far superior to MP3s.
-- Spelling and grammar errors tend to be a sign of erroneous thinking.
Besides, as computers and networks become faster and storage cheaper and more compact, we're not too far from the point where non-lossy compression will suffice, as far as downloading/storing music is concerned.
I want my music in .gz format, not .mp3 !
--lp
__________________________________________________ ___
rooooar
Giving this sort of thing to Slashdot is as fun as nude mudwrestling. Gotta love it. :)
That one of the sonograms seems to be closer to the original visually says nothing about how it will sound.
On the Mac, I would have to _pay_ to use the Xing encoder. I just got through a serious ramen-and-spaghettios period, and there's just no way I'm going to merrily throw money at people who not only support the mp3 licensing patentholders, but also make an encoder that is considered to be more prone to artifacts and ringiness than even the Fraunhofer high bit rate stuff.
Beat me, whip me, slashdot me and call me unrigorous, but I'm not paying money for Xing. The lurkers support me in email. So there ;)
I had to know why- no, scratch that, I knew why. I had to know which encoders did better- what they in turn traded off- and I had to know across a wide range of bit rates in a way I could quickly cross correlate.
I've written for (IMNSHO) the foremost High End Audio journal. It's not that I'm not interested in listening to encoders! But if they are _all_ quite compromised, why not break 'em down into a series of measurements relative against each other with clearly identifiable characteristics? Shows you what to listen for- and tips you off to particular issues.
This is like that. The original ASCII art was mixed with antimatter, in the lacking of " ".
Leaving the disfigured creature you see before you.
--Giving to trolls for the benefit of us all
You can add me to that list- and such a comparison (I naturally kept a logbook to be able to reproduce the process later) would indeed be meaningful to me. For instance, if Vorbis was more sophisticated in its control of over-ring and either imposed a flatter characteristic (resisting resonant peaks) or went for an intentionally tailored characteristic (say, suppressing ring around 3-5K like Fraunhofer 32K bit rate) this would have obvious and interesting application to the sound quality. Conversely, if it had big ugly peaks and artifacts, their location in the frequency response would tell a lot about the sonic signature of the encoder.
Doh! For years I've used a purely white background for airwindows.com, with a sort of vintage-cnet layout. I also used to keep a 'graphics' section in which I had some web background gifs I'd done. They were made like this:
x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
Do a diffusion dither between white and the lightest 'web safe' gray- then take all the pixels at x positions and knock 'em out to white too. The result (works with other colors as well) is a texture in which no two colored pixels are ever directly next to each other- it's a paperlike texture but never gets darker than half Netscape grey.
Which is to say- sorry, I did it that way because I liked it, and I'll keep it. Honest, I have done everything I possibly could to avoid obscuring the text, but it's sort of like a trade-off: in getting rid of additional table clutter that I used to have, I found that I liked the pages when this simpler layout was backed by the softest texture I had, rather than plain white.
I hope it didn't bother your eyes too much :)
x x x x x x x x x x x x x
 x x x x x x x x x x x x x
x x x x x x x x x x x x x
 x x x x x x x x x x x x x
x x x x x x x x x x x x x
 x x x x x x x x x x x x x
x x x x x x x x x x x x x
 x x x x x x x x x x x x x
Woops. Or I could have said 'checkerboard' and saved myself the hassle :)
The idea is from a company named Boxtop Software, which produced a Photoshop plugin that put different web safe colors in checkerboard patterns to produce a much greater range of 'web safe' colors (which look solid). I figured, why not run with that and do textures that way? Maybe the Gimp would benefit from some websafe checkerboard texture generators too :)
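For anyone who wants to play with the texture idea, here is a rough sketch. It is not the original method: a plain random dither stands in for the diffusion dither, the output is text characters rather than a gif, and `paper_texture`, `density`, and `seed` are made-up names.

```python
import random

WHITE, GRAY = ".", "#"

def paper_texture(width, height, density=0.5, seed=0):
    """Random dither between white and gray, then knock every checkerboard
    'x' position out to white.

    Gray can only survive on the other half of the checkerboard, so no two
    gray pixels ever touch horizontally or vertically.
    """
    rng = random.Random(seed)
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            if (x + y) % 2 == 0:
                row.append(WHITE)  # 'x' position: always white
            else:
                row.append(GRAY if rng.random() < density else WHITE)
        rows.append("".join(row))
    return rows

texture = paper_texture(16, 8)
```

`print("\n".join(texture))` shows the pattern. Diagonal neighbours can still both be gray, which is what keeps it looking like a texture rather than a flat tint.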
Actually, I think this would be a _very_ good experiment. I'm aware that my questioning some of these concepts is seen as prima facie evidence of being a tottering loony *g* but the whole concept of the psycho-acoustic model is so central to current audio theory... and this theory basically says, 'mp3s can be made to sound indistinguishable from CDs' and they cannot- the same theory on a broader level says 'CD itself is theoretically perfect sound', and it is not- mastering engineers, for instance, have learned that to do their work they need something better than CD audio.
I'm not certain that the psychoacoustic model must necessarily be that much better than, for instance, trying to diffuse unavoidable error as evenly as possible over the frequency and time domains. You are essentially insisting that concentrating the error in particular areas that are said to be 'masked' is far superior. This assumes the masking is effective, and that there are no side effects- neither assumption is wholly true, as large numbers of people are able to find fault with (say) 128K mp3s, and any filtering is going to impose extraneous characteristics. Finally, you're assuming that an encoder that does not have a psychoacoustic model (I assume this would mean one that diffuses error pretty uniformly) is going to perform 'very well' in the procedure I devised. I'm not sure of that- I'd like to try it experimentally before jumping to that conclusion.
Finally, I have to admit- I haven't got the faintest idea what the resulting sonogram, and frequency/overring characteristics, would look like. I can say some things about it- with regard to the over-ring, diffusing it over a wider frequency range is not only desirable but markedly preferable. Fraunhofer loses badly to LAME, sonically, over just this issue- and Blade gets away with its severe over-ring by diffusing it over a wider frequency range. If the experimental psychoacoustic-model-less encoder showed significant improvements in diffusing out this over-ring and reducing its duration- there would be legitimate applications for its tonal characteristics, even if the raw frequency response was noticeably compromised. It would be sort of like the 'anti-Blade'.
I don't suppose anyone will actually _try_ it, much less help me out with measuring it :P but if anyone is genuinely interested in investigating this, drop me a line? It sounds like something that could be attempted. Seriously- the whole point of such a model is 'masked stuff can't be heard'. If people can hear the masked error anyhow, what is the point? And if you assume people can't hear anyhow and won't notice, what's the difference? Is it so axiomatic that you have to shun diffusing error evenly, and instead concentrate it in areas you think won't be heard?
You are talking about applying only the psychoacoustic model of the mp3 encoding, and producing a comparison of that with the original signal. I would indeed be really interested in seeing that- I'd like to know which of the various distortions, over-rings etc. arise from the psych model and which arise from the fractal part.
In the argument (lower in the thread) I was questioning whether you could skip the psych model entirely (pretend people can hear the difference between 128K mp3 and real life ;P ) and see just what you'd get if you went purely with the fractal encoding- trying to diffuse any and all error in the process as evenly as possible over frequency and time.
People will swear up and down that this will be drastically worse. I'd like to measure it in comparison with normal mp3 encoders and see exactly what it is, not just run around making theories that it's going to be awful. The one thing I'm willing to guess about it is that the sound will be the opposite of BladeEnc's sound. For some people that'll be bad- but the idea of an 'anti-Blade' might really interest others.
I don't know if anybody's comfortable enough with hacking on a version of LAME or whatever that they'd be willing to try it- I am going to bounce the idea off Martin Hairer, with whom I worked to perfect the sonogram-plotting program (I needed to request better picture export capacities- he came through like a trouper and fixed everything). I think he is the one who ported LAME to his program, and he might be both able to try such experiments, and interested in seeing what they do.
At any rate I wanted to say that your idea of isolating the transformations and considering them independently _is_ truly an interesting exercise- and I hope to be able to do such experiments, and learn from them, with a bit of work and patience :)
Actually, I'm pretty sure that the poster forgot entirely about lossless compression.
True dat. I'm a skimming fool. Oh well, I still wasn't too impressed with his analysis. Perhaps I was turned off by the large amount of text against a grey static background to read all of it. Note to self: it's never a good idea to skim an article linked on Slashdot, then post an opinion about it. :)
Even the samurai
have teddy bears,
and even the teddy bears
get drunk
I can hear the difference between different encoders and different decoders. I consider myself a moderate audiophile.
As a test, try encoding the same song using two different encoders (making sure to use the same bitrate). Using the same decoder, see if you can tell the difference. You can also try downloading the MP3s from the site referenced. A quick listen to them (at the same bit rate) should show an audible difference.
The only other difference might be in speaker set-up. A crappy computer speaker might not be able to really show the difference between two MP3s.
I use a Blade-based encoder on my Mac with Cambridge Soundworks Digitals.
- (c) 2018 Hank Zimmerman
Spectral and waveform analysis and such has all been done before, and LAME has been known to be superior for quite some time. I've been singing the praises of this site for at least six months.
I'll agree that perception is what matters. However, what sounds great on my $48 Labtec speakers at work sounds like crap on my $500 studio headphones at home. The fact of the matter is, most people don't have $25,000 of audio equipment nor sufficiently trained ears to tell the difference. I'll readily use LAME encoded stuff from people I trust, but cringe in horror when I listen to the rapage that Xing's encoder performs to the quality of complex music.
Think of it this way: most people are arguing which color of crap tastes better. Sites like this one and the one in the article are trying to point out that you don't have to eat crap.
hymie
Kexis is a GPL'd lossless encoder which has proved to be _almost_ as good as shorten for filesize, is _much_ faster to decode and encode than any encoder I have ever used... The fact that the kexis file format may change in the future is largely a petty issue as you can simply losslessly convert from the old format to the new one. Have a look at it at http://kexis.sourceforge.net
I am tired of seeing people making MP3 tests comparing the initial signal and the resulting encoded one in terms of how similar they are.
This is useless. MP3 is perceptual coding, and the only way, for now, to decide which is better is to listen and decide. If you can't hear it, why do you need to encode it? That's the idea of MP3.
Don't try to see if the encoded signal looks the same as the original in terms of spectral content, try to see if it sounds the same!
Blade became popular because it was the first program to be banned by Fraunhofer. In fact, Blade is really a copy of the ISO reference code, optimized for speed. LAME incorporated massive quality improvements, but came too late to catch the wave of publicity offered to Blade. It would be nice to have access to the code which generated these sonograms.
r3mix.net is really the definitive site for this sort of thing. Not only does the site show waveform deviation, but the tester actually listens to lots of very diverse music to test for quality. The waveforms are used mainly to explain errors heard during listening (ie. what the hell is that fuzzy warp sound overriding the bassline?). So anyways, read up at r3mix.net -- you'll realize people have already done this much better.
-- jar
Absolutely. CD quality (44.1 kHz 16 bit PCM) is total CRAP to true audiophiles. I won't be satisfied until they invent a format that will store the timing and strength of every single air molecule hitting my eardrum, precise to within the Heisenberg uncertainty principle. Uncompressed.
Ever hear of shn files? People use them for trading bootlegs because it cuts the file size 50% and produces no loss.
Only the State obtains its revenue by coercion. - Murray Rothbard
Ask /. articles are often a great way to get info but you have to be willing to do some reading and thinking for yourself. Often the best articles are the shortest ones -- they are just links to outside sources.
This article is way inferior to www.r3mix.net. You should go back to that old Ask /. article and figure out why you didn't pick up on that web site. The fact that you didn't come away with an answer from the first article was entirely your own responsibility. All the info you needed was there.
some of my vinyl is way better
Vinyl sounds "warmer" because...
Will I retire or break 10K?
I disagree. To extend your metaphor, this is like your optometrist playing a song on different stereo systems and prescribing your diopter based on your reactions to the sound. What is being measured here is an absolute difference in the sound, but the value in lossy compression (both in audio and visual realms, and others?!?) is that you can lose data size without losing the important data.
This test is valueless, as it does not take the human ear into account. The quality of the compression is completely a subjective thing, it will always be so. There will never, ever, be a worthwhile mathematical test for lossy compression.
I've been using GoGo, which is another Japanese implementation of Lame, this one with MMX acceleration. It sounds fine at 128k to me, better than Xing, which everyone agrees is crap. Fast crap, but crap.
I know that... I don't encode Classical (anything) with AudioCatalyst. Nonetheless, if you are going to include LAME and BladeEnc, which suck ****, too, you might as well throw in Xing, no?
---------
nuclear presidential echelon assassination encryption virulent strain
Whizzmo
That's trivially proven to be incorrect, since gzip and bzip2 compress data and yet have the outputs be the same as the inputs. In an audio context, ten minutes of a pure frequency sound could easily be compressed to a small size. The only information you really need to keep is the length of the tone and the frequency.
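Both halves of that argument can be checked in a few lines. This is a sketch under stated assumptions: `zlib` stands in for gzip (same DEFLATE family), one second of tone is used instead of ten minutes to keep it quick, and the 16-byte "description" format is invented for the example.

```python
import math
import struct
import zlib

RATE = 44100  # samples per second, as on a CD

def tone(freq_hz, seconds, rate=RATE):
    """Generate a pure sine tone as a list of 16-bit sample values."""
    n = int(seconds * rate)
    return [int(32767 * math.sin(2 * math.pi * freq_hz * i / rate))
            for i in range(n)]

# One second of 440 Hz, packed as raw little-endian 16-bit PCM.
samples = tone(440.0, 1.0)
pcm = struct.pack("<%dh" % len(samples), *samples)

# Generic lossless compression: the output equals the input, bit for bit.
restored = zlib.decompress(zlib.compress(pcm))

# Model-based "compression": keep only the frequency and the length
# (two doubles, 16 bytes), then regenerate the identical samples.
description = struct.pack("<dd", 440.0, 1.0)
regenerated = tone(*struct.unpack("<dd", description))
```

The second trick only works because the decoder already knows the signal is a pure tone, which real music never is. That is the gap lossy encoders try to fill.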
"When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
Distortion you can't hear will affect you. It will cause you to feel tired and stressed. This is one reason that people who have to spend all day listening hard to audio (audio engineers) choose reproduction equipment that introduces the least distortion (or introduces distortion in the least displeasing way).
I would also be very interested in seeing similar graphs (preferably from the same source) made with Vorbis encoders, to see how they stack up.
The Matrix is going down for reboot now! Stopping reality: OK. The system is halted.
the key point here is that mp3 encoding is in fact a process of two separate transformations (both of which consist of many processes, of course), the first of these is my bone of contention as it seems less well-known than the second, which i will address first.
the "second transformation" is the one familiar to most people, the iterative fractal encoding procedure, which simply adds information to that audio frame until it either a) hits a "quality threshold" (ie is considered good enough), or b) fills up its bitrate allocation. it's similar in many ways to making a "jpeg of sound". you can get a good view of this whole process by following this link to a graphic of the aac encoding process on fraunhofer's website. It is the stuff inside the box at the lower left that this concerns.
however the first transformation here is the important one, this is the stuff outside and above the box in the graphic linked above. (i am not sure the graphic is detailed enough, there may be some missing, from what i remember) - this is a series of transformations to limit the amount of data the second transformation has to deal with (and hence get essentially better encoding for the same bitrate), according to the way the human ear works. our ears have "features" like having a dead area in frequencies near loud noises, which means these bits can be cut out, and other bits and pieces that i can't remember and don't have to hand ;) this is of course psychoacoustics, as other people have commented. there is a _very_ basic primer on this at the fraunhofer site here, but it doesn't go into any technical detail.
as an aside, there used to be some fantastic and informative articles on these subjects at mp3.org back in the day (1997-1998?), may it rest in peace. does anyone have some links for where something as good on this subject is? i haven't been as in touch with the technical side of mpeg encoding as i used to be...
but anyway back on subject, this first transformation actually distorts the signal *significantly*, but only in a way that makes it easier to process, while still sounding the same (or close) to the human ear. it may be an interesting exercise to isolate this first transformation, apply it and then save without any fractal encoding, and compare that to the original signal. this transformation will cause great "visual degradation", as shown in the article, but imho this is not an accurate criterion for measuring audio quality. still interesting, and a good read, though :)
fross
He's measuring the MP3 encoders, and Ogg Vorbis is not an MP3 encoder
Wouldn't it be interesting to make one of these tests comparing many different encoding techniques (MP3, Ogg Vorbis, VQF...)? I once saw one that made a comparison between MP3 and VQF (I think it was posted to Slashdot, maybe) and it was pretty interesting.
I tried the Ogg Vorbis encoder the other day for the sake of trying, encoding a small song (Black Sabbath's Paranoid) with both BladeEnc and the Ogg Vorbis encoder... and I can say that the high frequency response for Ogg Vorbis was much, *much* better. The MP3 sounded noticeably different from the CD, while with the Ogg Vorbis file such a difference was not so trivial to hear. (Ok, I know that it is a well known fact that MP3 sucks at higher frequencies, but, it was an example.)
Anyway, a deep comparison showing the pros and cons of each encoding technique would be very interesting. This won't change the fact that it will be very very difficult to convince people that there may be better alternatives to MP3, but...
--
Marcelo Vanzin
This site is the whole reason why I started using LAME...
I modded the Troll Investigation and I got
What about your amp and speakers?
The ideal result from the process (totally unaltered waveform information) would be an entirely _black_ 'sonogram' at the end of the process. That's not going to happen. Since there are going to be deviations, it's down to the psychoacoustic model- and the pictures and charts are going to show what the encoder chose to throw away, on a larger scale.
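A bare-bones sketch of the differencing idea, for anyone who wants to poke at it. This is not the sonogram program used for the article: it uses a slow naive DFT on tiny frames so it needs no libraries, and the "degraded" signal is just the original rounded to a coarse grid rather than real encoder output. All function names are made up.

```python
import cmath
import math

def spectrogram(samples, frame=64, hop=32):
    """Magnitude spectrogram via a naive DFT (slow but dependency-free).

    Returns one row per frame, one magnitude per frequency bin.
    """
    rows = []
    for start in range(0, len(samples) - frame + 1, hop):
        chunk = samples[start:start + frame]
        row = []
        for k in range(frame // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame)
                      for n, x in enumerate(chunk))
            row.append(abs(acc))
        rows.append(row)
    return rows

def difference_image(a, b):
    """Absolute difference of two spectrograms; all zeros means 'black'."""
    return [[abs(x - y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def brightest(image):
    """Largest value anywhere in the difference image."""
    return max(max(row) for row in image)

# Test tone: 8 cycles per 64-sample frame, 512 samples long.
original = [math.sin(2 * math.pi * 8 * n / 64) for n in range(512)]
# Stand-in for an encode/decode round trip (an assumption, not a codec).
degraded = [round(x * 8) / 8 for x in original]

spec_orig = spectrogram(original)
unchanged = difference_image(spec_orig, spectrogram(original))
altered = difference_image(spec_orig, spectrogram(degraded))
```

`brightest(unchanged)` is zero, which is the entirely black picture; anything that lights up in `altered` is something the round trip changed in the magnitudes. Note that this kind of difference ignores phase, so two signals can differ and still give a black image.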
You can argue that the encoder throws away stuff that can't be heard, therefore measuring _that_ is meaningless. This equates to arguing that the result is indistinguishable from the source audio. I disagree, and feel that all mp3s are audibly degraded from the source audio- which is itself degraded, being typically 16 bit 44.1K digital audio :)
I'm trying to measure what the encoder's failing to do. The project was meant to answer my own questions, and has done so.
Personally, I'm with LAME for my sonic requirements, although the only mp3s of my music out there (so far) are Blade, done many months ago before I did this research. But the point is not that there is a 'winner'- the point is that the differing sonic characteristics of these encoders CAN BE QUANTIFIED. Perhaps not measured outright (my charts etc. are _relative_ to each other), but these encoders take significantly different approaches to discarding information, and that applies directly to your choice of encoder for recording music, and translates to a completely predictable sonic characteristic of the encoder on ANY music, no matter what.
I put all sorts of music through Blade when I was on mp3.com with only Blade for a free encoder- no matter what I did, the result was always identifiably BladeEnc, with the smooth extended frequency response and absolutely terrible transient impact. For some pieces, this was suitable- for some it was grossly unsuitable. But the sonic characteristics were consistent- and correlate with what I learned about the encoder in this 'torture test'.
My old p120 plays most mp3's smoothly, even some discmans do...
Why don't we shift to a more aggressive compression method for today's systems then?
Why not grab the wav files and use the high-end "100x" compression methods "you read about, but never see irl"?
(Those articles always claim: "current systems are too slow for this (fractal method)", but they never mention anything usable.)
Current top-of-the-bill machines should be capable of playing the raw cd-grabs realtime from a highly compressed file, in the process possibly getting near-DOS system loads (who cares), without losing any detail of the original track and making mp3 sound like the inbred godzilla version of this pure little salamander.
LAME and BladeEnc produce terrible sound quality (or you have to use ridiculously high bitrates; no wonder Napster's full of huge 160 kbps files) and Fraunhofer is the only one that I'd call even adequate at 128 kbps. It's the sound, not some friggin' visual graph of the music, that's ultimately important.
Ogg Vorbis, on the other hand, is superb. It's not only free of patents but also GPL'd!
Ogg Vorbis?
--
--
You are a fucking moron.
Oops, I am using Lame 3.88 Alpha 1
It's either on the beat or off the beat, it's that easy.
I moderate therefore I rule!
--
Mark Neidengard (or Niedengard, maybe?) from Caltech (was an undergrad... he's now at Cornell) has an analysis on his page... it seems to jump around, but it's worth a look.
Anyhow, good thought nonetheless.
Even then, you'd need to ensure that the rest of the audio reproduction path was the same: a CD played on crappy speakers will almost always sound worse than a high-quality analog setup with top-notch speakers.
Finally, keep in mind that these kinds of do-it-yourself experiments are notoriously lax at controlling for confirmation bias. This is particularly troublesome when your goal is to measure something as subjective as audio perception.
Hey, check this out, courtesy of Ars. It presents an alternate viewpoint using different means. I remembered reading this not too long ago. Interesting read... and to add a spoiler, it definitely recommends Fraunhofer over LAME and BladeEnc.
-s
- - - - - - - -
Don't worry, being eaten by a crocodile is just like going to sleep in a giant blender.
__________________________________________________ ___
rooooar
I have been wondering about this kind of thing for a long time. I have used Lame as of late because it is very fast with the optimized compile I have. I wish it were as fast on VBR, but I guess I'll have to settle for CBR.
I'd really like to see something like this with Ogg Vorbis once it matures. Or now even, because it seems to be a bit better already, though it's hard to tell on my laptop speakers.
WARNING: there is a trojan on your
I've been using the VBR Lame Encoder 3.99 Alpha for a couple of weeks and I love it. It's fast, and it sounds great. I was using BladeEnc for a while. I have found that Lame sounds better, and using VBR will result in a smaller file than Blade and still sound better.
It's either on the beat or off the beat, it's that easy.
I moderate therefore I rule!
--
Ugh... Everybody thought that whatever they were using was the best thing under the sun, with no research supporting their claims, and in the several hundred comments, not even a hint of some general consensus.
Finally, something that'll allow me to choose based on fact, something that'll allow me to make an *informed* decision. Thank you.
---
I'm glad to see I've been using the right encoder (BladeEnc) for all my classical music. I can't remember why I started using it (I don't think I've used anything else), but now I see that it beats the others as far as tonality goes. Classical music is all about the right pitches (I even have perfect pitch), so perhaps that's why all those bad sounding classical mp3's off napster sound so bad (or they were ripped off records...).
Sometimes I've believed as many as six impossible things before breakfast.
I am using the same amp and speakers for my soundcard as for my cd-player. Only the cable between my soundcard and my amp is much longer and not as thick as the one between the cd-player and the amplifier.
-- Spelling and grammar errors tend to be a sign of erroneous thinking.
Ah, but nowhere does this article try to disprove that, does it? The whole point is that certain codecs do a better, more intelligent job of discarding information, and that is what the author set out to prove.
-----------------------------------------
-------------------------------------------------
This sig could have been put to good use.
Showing how information is discarded and which information is discarded is the point. If you are a big fan of classical you will have an opinion on _which_ parts of the sonic information are expendable differing greatly from somebody collecting Britney Spears mp3s.
"To excuse such an atrocity by blaming U.S. government policies is to deny the basic idea of all morality: that individu
I haven't read the article yet, but any assessment that ranks LAME and BladeEnc over Fraunhofer is very obviously flawed. It's possible that his visual methodology isn't a good portrayal of psychoacoustic effects.
I've tested both LAME and Blade against the Radium release of the Fraunhofer codec, and the results (at 128 kbps) aren't pretty: Radium leaves the others bleeding in the dust.
Sure you would know which parts were discarded and which weren't. But... Would you know which discarded bits were audible, and which weren't, and which were in between?
By using a very detailed model of what the human ear can and can't hear, you have to weigh every discarded bit by labelling it "audible discarded", or "inaudible discarded", or perhaps a percentage in between, which would be more accurate.
Once you have that, then you can really tally which encoder is best, because you can compare which encoder truly drops more audible data. Nobody but rabid audiophiles cares about the inaudible bits, not even the programmers who wrote these encoders!!!
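The tally described above can be sketched roughly as follows. This is only an illustration, not a real psychoacoustic model: the energy values and the audibility weights (0.0 for "inaudible discarded", 1.0 for "audible discarded") are made up, and `audible_loss` is a hypothetical helper name.

```python
def audible_loss(discarded):
    """Sum discarded energy, weighted by how audible each component is.

    discarded: list of (energy, audibility) pairs, audibility in [0, 1],
    where 0 means "inaudible discarded" and 1 means "audible discarded".
    """
    return sum(energy * audibility for energy, audibility in discarded)

# Hypothetical numbers: encoder A drops more raw energy than encoder B,
# but almost all of what it drops is inaudible.
encoder_a = [(10.0, 0.0), (8.0, 0.1), (1.0, 0.5)]
encoder_b = [(3.0, 1.0), (2.0, 0.8)]

raw_a = sum(energy for energy, _ in encoder_a)
raw_b = sum(energy for energy, _ in encoder_b)
loss_a = audible_loss(encoder_a)
loss_b = audible_loss(encoder_b)

print(f"raw discarded:     A={raw_a:.1f}  B={raw_b:.1f}")
print(f"audible discarded: A={loss_a:.1f}  B={loss_b:.1f}")
```

With these invented numbers, A looks worse on a raw difference measure but better once the discarded data is weighted by audibility, which is the distinction the comment is drawing.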
When rating mp3 encoders, one should rate them for what they're designed to do, not rate them by how well they reproduce experimental sine sweeps and other such sound test garbage.
Can anyone think of anything else that I missed that would help make a more fair test?
"Now, there are sometimes problems with classical music, but that's because it's often hard to predict exactly what you can and can't hear."
Um, right. Which is why I wrote what I wrote.
"To excuse such an atrocity by blaming U.S. government policies is to deny the basic idea of all morality: that individu
The http://www.airwindows.com/encoders/index.html article is just another "Johnny Come Lately". The absolute best MP3 encoder comparison site is http://www.r3mix.net and a good MP3 decoder comparison site is http://privatewww.essex.ac.uk/~djmrob/mp3decoders/intro.html
Really. I mean, at what point does "good enough" become good enough?
Not that I particularly care, but this seems to be a shallow argument. When you're searching the skies, you're trying to FIND something; ignorance is NOT bliss in this case. When you're listening to music, all that matters is what you can hear. Now maybe there is a more scientific method to determine what you can hear, such that you can detect perceptible problems before you run into them, but other than that, who really cares?
This is off-topic, I know, but people really have to learn the difference between a good and bad background for a web page. In general, bunches of little dots that obscure the text are a Bad Idea. If a person puts in the time and effort to do an analysis like this, they should at least make sure everyone can read it comfortably.
The author of this article, Chris Johnson, was pretty conclusively proven to be 100% ignorant of the very basics of signal processing and analysis just a few days back in the big article about Sony's new "high-end" cd format. He's basically just another green-magic-marker-waving "audiophile" to whom science and math are only relevant when they justify his decision to buy ever-more-expensive stereo gear. Take any "technical" analysis presented by this loon with at least a few grains of salt.
Audio quality for compression codecs cannot be measured in terms of visual graphs or synthetic benchmarks. (I.e., just comparing the difference between the original signal and the compressed signal does not work.)
It is quite possible to have a signal that very much resembles the original wave graph, and yet sounds horrible to the ear. It is also equally possible to have a signal whose graph doesn't resemble the original very much, and yet has a much higher 'perceived' quality.
Just remember: The first rule in every single BEGINNER'S guide to sound is to "Trust your Ears," and that is the only way to tell a good codec from a bad one.
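A small sketch of one direction of that claim: a signal whose waveform differs greatly from the original while its frequency content is unchanged. The example uses a polarity-inverted sine tone, which is generally regarded as indistinguishable by ear when played on its own. The frame size and tone frequency are arbitrary choices for the illustration.

```python
import cmath
import math

N = 64  # samples in the test frame
K = 4   # the tone completes exactly 4 cycles in the frame

original = [math.sin(2 * math.pi * K * n / N) for n in range(N)]
# Same tone with inverted polarity.
inverted = [-x for x in original]

def rms(xs):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

def magnitude_spectrum(xs):
    """Naive DFT magnitudes (fine for a 64-sample illustration)."""
    n = len(xs)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(xs)))
            for k in range(n // 2 + 1)]

# Sample-by-sample difference between the two waveforms.
diff_rms = rms([a - b for a, b in zip(original, inverted)])
# Largest disagreement between the two magnitude spectra.
spec_gap = max(abs(a - b) for a, b in zip(magnitude_spectrum(original),
                                          magnitude_spectrum(inverted)))

print("RMS of original:           ", rms(original))
print("RMS of waveform difference:", diff_rms)
print("Max spectrum difference:   ", spec_gap)
```

The waveform difference comes out louder than the original tone itself, yet the magnitude spectra match. The sketch says nothing about the opposite case (a close-looking waveform that sounds bad), which would need a real listening test.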
-----
Natural != (nontoxic || beneficial)
In fact I think I have seen this before and r3mix actually affected my approach to my encoder analysis. Definite kudos to r3mix, and I entirely agree with many of this site's decisions and approaches- interestingly they reach precisely the same conclusion as I did, that LAME 256 was the ideal archival encoder and LAME VBR was the best one for smaller file sizes- except that r3mix has added the recommendation that joint stereo be used in the latter case! (this would really hurt the relative comparison with higher bit rate stereo encoders with my mono test signal, but I think I will take the advice and try that for my own mp3s...)
r3mix also chooses to use _relative_ graphs rather than attempting to give absolute measurements, something I heartily approve of.
Now, here's the thing- r3mix's results are sometimes a subset and sometimes comparable to mine, just depicted in a different way. The primary measurement of a frequency sweep produces different-colored graphs- if you take the horizontal axis and express the vertical deviation of each graph, from an ideal line of flat reproduction at the top, as a brightness value of a single pixel, you'd get something akin to a single line on one of my 'sonograms'. The test with the 'applaud' signal is an example too- if you subtracted the source from the results you'd end up with distortion levels very similar to my differenced sonograms.
More interesting to me is the fact that my sonograms show an _intermediate_ step- several r3mix tests are the averaged responses of an encoder over time. That is exactly what my 'charts' are- they are sums of all the deviation and distortion over the entire length of a sonogram, over a range of frequencies.
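To make the relationship concrete, here is a rough sketch of a differenced sonogram being collapsed into a per-frequency chart. It is not the actual image-based process used for the article: the sonograms are tiny made-up grids of brightness values, and `difference_sonogram` and `deviation_chart` are hypothetical names.

```python
def difference_sonogram(source, encoded):
    """Per-cell absolute difference of two sonograms (rows = time, columns = frequency)."""
    return [[abs(s - e) for s, e in zip(src_row, enc_row)]
            for src_row, enc_row in zip(source, encoded)]

def deviation_chart(diff):
    """Sum the difference over all time slices, giving one value per frequency bin."""
    return [sum(column) for column in zip(*diff)]

# Made-up data: 3 time slices x 4 frequency bins, brightness 0..255.
source = [[200, 180, 160, 140],
          [200, 180, 160, 140],
          [200, 180, 160, 140]]
encoded = [[200, 178, 150, 0],
           [199, 180, 155, 0],
           [200, 179, 150, 10]]

chart = deviation_chart(difference_sonogram(source, encoded))
print(chart)
```

In this invented data the highest frequency bin dominates the chart, as it would for an encoder that cuts off the top of the spectrum.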
I'm almost certain I'd seen r3mix before doing my own analyses- I think it's very likely that this site significantly helped me define the processes I used for my own stuff. I heartily recommend checking it out- this is good work, I totally endorse it, in fact I'm going to put a link to it on my own encoder page right now :) *put* there!
Well, I read the article.
I'd like to write my own mp3 experience here.
When the first CDs became available, the problem with CD sound was that it sounded chilly.
Actually, I've recently started listening to LP recordings again, because CD sound causes fatigue.
Now, MP3: it's worse than CD.
Can't we have a digital format or encoding scheme that sounds as good as CD or LP?
Is it impossible?
I don't know whether it's possible because I don't major in signal processing. Can anyone please explain it to me?
Well, compression degrades sound quality, but can't we avoid at least some of that degradation?
"Ein offenes Betriebssystem kann schon mal mutieren. Bei Windows 2000 hingegen gibt es alle Services und Dienste aus einer Hand. Das spart Zelt und somit wirklich Geld. Mehr infos unter www.microsoft.com/germany/windows2000"
"ein offenes betriebssystem hat nicht nur vorteile"
"An open operating system can mutate already times. With Windows 2000 however there are all services and services from a hand. That thus really saves tent and for cash. More information under www.microsoft.com/germany/windows2000"
"an open operating system does not only have predivide"
I have 42 gigs of mp3's on hard drives in my computer. I have instant access to any song that I have; I don't have to waste time finding the right cd and putting it in a drive.
And if it's all on your computer, it's much easier to share the music with your friends if you live in a dorm or have DSL. It's really great if you have a home network, too.
I do burn all my music to cd, but only for backup purposes.
from the can-a-cue-cat-read-these? dept.
Well, after calibrating my cat on a couple of Pop-Tarts boxes, I tried several scans on the diagrams on the web page... nothing! I can therefore conclusively answer this question with a big, fat NO.
-----
Shoot me if I'm ignorant. But could that mean that you could "see" the SDMI watermark? I guess you would need to see an untarnished one, and then the difference would be the watermark. Can it help if you don't have an untarnished song? Could you use other songs, since I imagine there are only a limited number of watermarks, and then look for common bits? Any thoughts from people who know about this? Cam
-- Cheer, Cheer, The Red and the White.
I know that Xing (AudioCatalyst) doesn't have the greatest encoder, but that's no reason to leave it out...
After all, Ars Technica didn't...
---------
nuclear presidential echelon assassination encryption virulent strain
Whizzmo
MP3 is about selectively discarding information from the audiostream. The purpose is not to create an output waveform which is as close as possible to the input. This is what the whole business with the psycho-acoustic model is about.
The guy used the example of Fairport Convention with Sandy Denny.
I don't know about his rigor, but the guy's alright by me.
Who knows where the time goes?
OK, now we see what parts of the spectrum are thrown away at very low bit rate, but why is it supposed to be "probably the most rigorous analysis of any encoders anywhere on the web"? First off, the *only* way to evaluate the quality of a perceptual encoder is to listen to it, period. Who cares what is rejected (not encoded) if you don't hear it?
...
Also, while using the 32 kbps bitrate amplifies the effects of perceptual quantization, so it's easy to see them, the problem is that not all the encoders were meant to work at this bitrate.
Think about it: when standards institutes want to evaluate audio/speech codecs, they don't calculate sonograms like this, they run subjective tests. They have a bunch of listeners hear the results of many encoders on *many* audio files. That's right, you need many files to evaluate a codec. Some will perform better for certain musical instruments, some will perform better with or without background noise, echo,
For all these reasons, I do NOT consider this analysis rigorous at all!
Opus: the Swiss army knife of audio codec
Is it me, or does this seem like an oxymoron? Not being an audiophile, someone correct me if I'm wrong here... Audiophiles are interested in the most accurate reproduction of sound... Why would they even consider a lossy compression scheme at all? Just like serious digital artists shun JPEG for all but web distribution to the masses, and even then we see much done in gif or tiff. I would say that MP3 audio done by ANY encoder is unacceptable to an audiophile.
Second, I want to challenge some of the assumptions and declarations that this experimenter made. The experiments placed on these encoders are mostly "torture tests" that one would never encounter in real situations... And by using this series of torture tests he tells people which encoders are best for encoding mp3's. Does anyone see this reasoning as flawed? He's subjecting encoders to situations that NONE of them have been designed for, and proclaiming that this has something to do with reality. I see little correlation... How often do you hear pure sine sweeps in any song?
I found the previous mp3 performance analysis posted on Slashdot to be much more informative. It tested the encoders on real-world performance and rated them accordingly.
The guys who wrote the encoders realized that some things just wouldn't happen in normal music, such as these torture tests, so they wrote "shortcuts" that ignored these conditions, and resulted in a higher compression rate! How dare he rate encoders on something that the programmers all deliberately IGNORED.
My friends, trust no statistics that you did not falsify.
> Alpha for a couple of weeks and I love it.
Err? The latest Win beta compile is 3.87, and the latest source isn't too far ahead.
> I have found that Lame sounds better, and
> using VBR will result in a smaller file than
> Blade and still sound better.
This is true depending on what you were doing with Blade. What John Q. Clueless knows of as an mp3 is what he usually gets off of Napster, which is 128kbps and generally encoded with Fraunhofer or Xing's encoders (even worse because it's so abysmally bad and it's used in so many Windows CD ripping/mp3 encoding utilities). Lameenc VBR-encoded mp3s will be larger than them (once again a shameless plug for the site you should live, learn, love: http://www.r3mix.net/.)
If you were encoding with Blade at a static rate of 256kbps, then the odds are in favor of a lameenc VBR-encoded mp3 being smaller. Not always, because there is always the chance that the mp3 will require an unusually large number of blocks >256kbps, but it's still possible.
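The size argument can be checked with a bit of arithmetic. The sketch below assumes CD-rate audio (44.1 kHz) and MPEG-1 Layer III frames of 1152 samples each, and ignores padding and tags; the per-frame bitrate mixes are invented for illustration, and `mp3_size_bytes` is a hypothetical helper.

```python
SAMPLES_PER_FRAME = 1152   # MPEG-1 Layer III
SAMPLE_RATE = 44100        # CD audio, Hz

def mp3_size_bytes(frame_bitrates_kbps):
    """Approximate audio payload size from a per-frame bitrate list (padding and tags ignored)."""
    frame_seconds = SAMPLES_PER_FRAME / SAMPLE_RATE
    return sum(kbps * 1000 * frame_seconds / 8 for kbps in frame_bitrates_kbps)

frames = 1000  # roughly 26 seconds of audio

# Constant 256 kbps, as in the Blade case above.
cbr_256 = mp3_size_bytes([256] * frames)
# Hypothetical VBR run: mostly 160-192 kbps, with a few 320 kbps frames.
vbr_typical = mp3_size_bytes([160] * 500 + [192] * 400 + [320] * 100)
# Hypothetical hard-to-encode track: most frames above 256 kbps.
vbr_hard = mp3_size_bytes([320] * 800 + [256] * 200)

print(f"CBR 256:       {cbr_256:,.0f} bytes")
print(f"VBR (typical): {vbr_typical:,.0f} bytes")
print(f"VBR (hard):    {vbr_hard:,.0f} bytes")
```

The VBR file only ends up larger than the 256 kbps one when enough frames need more than 256 kbps to pull the average above it, which is the unusual case described above.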
Because it's a real pain in the ass to mess with 300 CDs, but it's really easy to select a directory with 300 CDs worth of music and put it on random. You have no idea how useful it is until you put 4000 songs (I'm not kidding) on random. :)
WWJD? JWRTFM!!!
http://users.belgacom.net/gc247244/analysis.htm#MP3ENC31
This is what I found when searching for mp3 comparison. It compares different implementations of encoding for mp3 as well as output quality. Much more useful and definitive.
Often wrong but never in doubt.
I am Jack9.
Everyone knows me.