Sneaking Past Heavy-Handed Audio Compression on YouTube
niceone writes "Recently YouTube seems to have started applying extreme compression
to the audio of uploaded clips. This is the type of compression used
by radio stations to make everything louder, but in this case applied
extremely badly. In quiet passages, breathing and shuffling become
overpoweringly loud. A gently plucked guitar chord becomes a distorted thud.
Listen to an example here. And here's what it could sound like — still not perfect, but a whole lot better. The
fixed version is thanks to a workaround proposed by
Sopranoguitar — the idea is to turn down the audio and mix in
a high frequency sine wave (I used 19kHz). The sine wave fools YouTube's
compressor into thinking that the file is at a uniform level (and does
not need the volume changing at all) but is filtered out by the encoding
process (so, no need to worry about deafening any dogs)."
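A minimal sketch of the workaround described in the summary, in Python using only the standard library. The summary gives the 19 kHz tone and says to turn the audio down; the two gain values, the function name mix_pilot_tone, and the 440 Hz stand-in note are assumptions made up for illustration.

```python
import math

def mix_pilot_tone(samples, sample_rate=44100, tone_hz=19000.0,
                   audio_gain=0.5, tone_gain=0.5):
    """Scale the audio down and add a constant-level sine 'pilot' tone.

    samples: floats in [-1.0, 1.0]. The gains are illustrative guesses,
    not values taken from the original post.
    """
    if tone_hz >= sample_rate / 2:
        raise ValueError("tone must be below the Nyquist frequency")
    mixed = []
    for n, x in enumerate(samples):
        tone = tone_gain * math.sin(2.0 * math.pi * tone_hz * n / sample_rate)
        mixed.append(audio_gain * x + tone)
    return mixed

# One second of a quiet 440 Hz note standing in for real audio.
rate = 44100
quiet_note = [0.1 * math.sin(2.0 * math.pi * 440.0 * n / rate) for n in range(rate)]
mixed = mix_pilot_tone(quiet_note, rate)
peak = max(abs(v) for v in mixed)  # dominated by the tone, not the music
```

With audio_gain + tone_gain kept at or below 1.0, the mix cannot clip for input in [-1.0, 1.0]. Whether the uploaded tone survives or gets filtered out depends on the encoder, which this sketch does not model.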
Can someone post an example I could possibly listen to for more than one second?
Who's the mentally... challenged... individual who decided that applying such compression in the first place was a good idea, and then proceeded to implement or accept such a shitty implementation?
For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
Wouldn't it be easier to set your gate correctly? Cut out the background sounds BEFORE submitting to youtube; do proper editing and then it doesn't matter so much what they do. Here, in my opinion, is a good site for all such information.
Qxe4
After some more testing it seems that there is a problem with high quality mode. With the tone and sample rate I used (19kHz and 44.1k), at least, the high quality encoder whistles at some other frequency. Sounds like somewhere less than 10kHz to me.
I hope YouTube fix this soon.
ccalam - acoustic versions of new songs.
Wouldn't another solution be to sneak past the entire recompression process by submitting a .flv video that meets YouTube's requirements to avoid recompression? Or would the compression on audio (not the same type of compression, the one this article is talking about) still be forced on these?
By the way, to improve the trick, you could detect the envelope of your sound and modulate your 19 kHz sine with a complementary envelope, so that the two envelopes sum to a flat line. Your 19 kHz envelope would be f(t) = 1 - original_sound_envelope(t).
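A rough sketch of that idea in standard-library Python. The peak follower used as the envelope detector, its 50 ms release time, and both function names are assumptions; the comment only specifies f(t) = 1 - original_sound_envelope(t).

```python
import math

def envelope(samples, sample_rate, release_ms=50.0):
    """Peak follower: instant attack, exponential release (an assumed detector)."""
    decay = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    env, level = [], 0.0
    for x in samples:
        level = max(abs(x), level * decay)
        env.append(level)
    return env

def add_complementary_tone(samples, sample_rate=44100, tone_hz=19000.0):
    """Add a tone whose amplitude is 1 - envelope(t)."""
    env = envelope(samples, sample_rate)
    out = []
    for n, (x, e) in enumerate(zip(samples, env)):
        tone_amp = max(0.0, 1.0 - e)  # loud audio -> quiet tone, and vice versa
        out.append(x + tone_amp * math.sin(2.0 * math.pi * tone_hz * n / sample_rate))
    return out
```

Because |x| never exceeds the envelope, audio plus tone stays within [-1.0, 1.0] for input in that range. How a real compressor's level detector would react to this signal is untested here.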
You just got troll'd!
YouTube is just trying to enforce a standard level of quality to the content. Everyone expects crappy video with lots of compression artifacts, so the audio might as well follow suit.
Better known as 318230.
I can understand why youtube does this (apart from the scream videos...) - a lot of people browse on laptops which have terrible audio. Thinkpads are especially awful; when I play video on VLC without headphones, I set the volume at 1600% and suffer the clipping - at least I can hear it over, say, a room A/C.
Is anyone aware of a lower-level playback compressor on Windows and/or Linux, perhaps a virtual device driver, which lets you mangle your sound this way? VLC doesn't work for everything after all.
"They were pure niggers." – Noam Chomsky
Youtube, what bad audio compression you have!
All the better to hear the screams of yet another teenager lighting his farts (and himself) on fire, user.
I upload already compressed FLV and find that the video has the same audio quality as I had encoded.
Brilliant idea though. hmm..I remember an episode of Batman Beyond where the Shriek used sound waves to destroy stuff and he even masked other waves. This is off-topic but maybe youtube compressor can be used as a weapon to increase the frequency to a point where glasses may break and dogs can be made deaf, hmm..
Ruff ruff rufff, and ruff rufff, you little bigoted ruff ruff ruffff.
-Tablizer's Dog
Table-ized A.I.
Can someone post an example I could possibly listen to for more than one second?
No
Table-ized A.I.
1) Normalize sound around -15dBFS (RMS)
(using the normalize-audio software)
2) Compress sound using the Dyson compressor
(available in ecasound)
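To show what step 1 amounts to, here is a small standard-library Python sketch of RMS normalization. It is not the code of the normalize-audio tool, and it assumes 0 dBFS means an RMS of 1.0 (tools differ on this reference); the function names are made up for the example.

```python
import math

def rms_dbfs(samples):
    """RMS level in dB relative to full scale (assumed: RMS of 1.0 = 0 dBFS)."""
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)

def normalize_rms(samples, target_dbfs=-15.0):
    """Apply one fixed gain so the whole clip sits at target_dbfs RMS."""
    current = rms_dbfs(samples)
    if current == float("-inf"):
        return list(samples)  # silence: nothing to scale
    gain = 10.0 ** ((target_dbfs - current) / 20.0)
    # Hard clip as a safety net; a clip that hits it needs limiting instead.
    return [max(-1.0, min(1.0, x * gain)) for x in samples]
```

The gain is a single number for the whole file, so the balance between loud and quiet passages is untouched; step 2 (the compressor) is what changes that balance.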
You mean that high-pitched squeal that is driving me nuts in the example more than the audio compression? Yea.. that's filtered out all right...
Defective Logic
The high quality version of the audio will have the 19 (or up to 22.05) kHz sine wave you choose to use in your video upload. So this is a trade-off of quality (high-quality = eek!) versus lack of unwanted range compression (low-quality = listenable, for lack of a better word).
FWIW, I can hear 19 kHz waves. So this trade-off affects me.
It would be nice if YouTube offered some choices, such as volume adjustment, no volume adjustment, and also other things like stereo. The only way I know of to get stereo is to submit it in Adobe's proprietary formats. YouTube is pulling a Henry Ford: you can have any color you want, as long as it's black.
Table-ized A.I.
While this is more along the lines of aural compression, my guess is that they're figuring out a good way to save some bits on the backend as well. If you can cut down the size of an audio track by a couple of percent, that's thousands saved in bandwidth costs over the millions of videos viewed per day.
This is something that I really find at odds with the nature of technological progress. When quality should always be improving, the bean counters ratchet things down over and over again to eke out extra $$. Once in a while, you get lucky, and a new technology allows for increasing quality at a lower cost.
As a side note, that Youtube video gets some really odd resonance out of my sub.
It surprises me after all these years, audio formats don't provide recording information about the dynamics of the waveform.
Cameras write EXIF information into JPEG files, why can't we have something similar for audio so we don't have to adjust the volume all the time?
You don't have to be an audiophile to appreciate good audio. I have a custom amp next to my computer into which I've plugged headphones. Find anyone with a pair of headphones, and you'll find an amp, too. Either that, or a deaf person who's been tortured by a bad Flash file.
Read more http://replaygain.hydrogenaudio.org/.
To give YouTube the benefit of the doubt here--it's possible that the sound modification is being done with good intentions.
Most YouTube videos are uploaded by amateurs with respect to the details of A/V. As such, the audio quality in many videos is probably low. It's also likely that the sound levels of different uploads are radically different. This means that unprocessed audio would sound bad for many videos. Moreover, switching from clip to clip on YouTube would involve annoying changes in audio level (e.g. you turn up the sound for one video, but then the next one is painfully loud).
So, they may be automatically recompressing all the audio to fix many of the mistakes of amateur uploaders. Again, it's important to remember that YouTube is not really meant for distribution of high-resolution content.
Having said all that, I'm also not an A/V expert, so I may be wrong on this. Also, it would be nice if YouTube gave some advanced options for more experienced people to use, so that they could flag their content for "skip audio recompression" or whatever. Also, YouTube's introduction of the "watch in high quality" option shows that they want to expand beyond low-res clips... in which case they need to provide commensurate audio quality.
and humans can hear that ok
2nd is 5k and thats right in the mid band
or dont the geeks understand how sound works ?
Why don't they just NORMALIZE instead of compress? Compression changes dynamics, as in the thread description. Normalize would just raise the volume up to peak. No changing music, just increase volume...
"...Wouldn't another solution be to sneak past the entire recompression process by submitting a .flv video..."
Last time I submitted a video, about six to eight months ago, Youtube did not accept .flv or .swf formats, even though that is the format that they use to stream. Youtube wanted mpg, divx or mov formats. That sucked because my original was done in swf. First I had to convert the swf to divx which I uploaded to Youtube. Converting from swf to divx resulted in a big quality degradation. Youtube then converted the divx back to flv which resulted in a second quality degradation with the audio being completely out of sync with the video.
What you just said makes no sense.
The first natural harmonic of a 5kHz tone would indeed be 10kHz, but the second would be 20kHz, and so on.
Every sound has a whole raft of natural overtones and harmonics, individual to itself. A 1kHz tone's harmonics are at 2kHz, 4kHz, 8kHz, etc. It's therefore really quite disingenuous to just state that the "first harmonic is at 10kHz"
That would be a mess, though it couldn't be much different than what they currently do. They could simply accept finalized flv files and put out-of-spec flvs in the encoding chain with the rest of the uploads.
Brilliant! I've heard about problems with "normalizing" in the past. This is an excellent way to work around this problem!
Host your video somewhere else, upload it in a high-quality format, and let the site make derivatives for you (including a Flash video and a player you can embed in your webpage if you insist on placating a proprietor). Some organizations do this daily and it works excellently. YouTube needs you more than you need YouTube.
Digital Citizen
I have indeed heard of such deterioration on the Teen Buzz website (which is currently down for excessive bandwidth usage?) - but this page describes it as well.
Those little annoying sine-wave sounds are also used by TV advertisers such as Kentucky Fried Chicken to grab teens' attention if adults are not their market. (For the record, if you can't hear the tone, it sounds off when the KFC bucket shows up.)
Ha... however, considering that .flv video is H.263 (or is it H.264 now?), I guess you could find a program that would change the container to an AVI-compatible one and thus avoid recompressing?
You just got troll'd!
hopefully this wont date me much, but this reminds me of tape bias, the high-frequency signal applied to the magnetic frequencies used to record tapes (oh it did have unintended consequences). http://en.wikipedia.org/wiki/Tape_bias
I would guess they are doing this to better "service" handheld devices like the iPhone and upcoming Android devices that have limited dynamic range in their speakers.
I am becoming gerund, destroyer of verbs.
Sorry, couldn't resist.
-dZ.
Carol vs. Ghost
Although I tend to think that Dan East summed it up best, I feel the need to point out that 95% of bad YouTube audio is the result of lousy recording quality, not subsequent processing.
Garbage In, Garbage Out.
The mics and electronics on most consumer camcorders (or that most people use with their Macs and PCs) are just plain crappy, and shouldn't be relied on for anything that you hope to distribute. And of course, some actual audio recording skills help too.
Three Squirrels
I didn't realise that what I said sounded impressive to the point of prompting a trekkie (or whatever it is you're referring to) to deblaterate nonsensical gobbledegook to try to sound as "cool".
You just got troll'd!
Use http://vreel.net/ instead - it's DivX, better quality anyway.
If Google really cared they would fix Android Chrome to reflow text, instead of discriminating
I measured that KFC tone at 4825 kHz with a spectrum analyzer. It's not very high at all -- certainly not the mosquito tone.
You forgot to mention holodecks, time travel and alternate universes.
Oh, and have the Borg assimilate it all just to mess things up a bit.
I clicked the YouTube links with an eager tingle in my spine. Very anticlimactic.
..some people with asthma heard just fine up to 30K! They need to go to more rock concerts - sit by the speakers. That will cure it. I can barely hear over 8K myself!
He's probably a Digg user. Those guys are almost, to be generous, as dumb as the YouTube commenters...
...who decided to brute-force modify each and every copyrighted audio track on YouTube (every audio track) without permission from the copyright owner?
Destroying the dynamic range intended by the originator in this manner is not acceptable.
If Google feels that some audio needs "normalization" or whatever they wish to call it, at least make it a feature that can be opted into by the owner of the video...
--Tomas
How about using a very low frequency sound, say 1 Hz? Or a Square wave with a period that is the same length (or greater) as the clip in question? Maybe that way you could avoid the re-encoding / aliasing issues.
Many modern compressors (aka compressor/limiters) use multiple frequency bands, typically four or more. The compression is done separately in each band with the results re-combined into a final result. One of the reasons for processing multiple bands is to eliminate pumping, in which large dynamic range variations in one band modulate other sounds in inverse fashion: the thumping bass causing the female singer's voice to fluctuate in level, for example. Although it is impossible to get heavy dynamic range compression without some degree of this effect, some versions are a bit more discreet about it.
Adding a 19 kHz sine wave would work only to defeat the upper band of a multiband compressor. If this approach works on YouTube it is because they are using a one- or two-band compressor.
Heavy compression ruins virtually all popular music in the form in which it is released. Further compression e.g. by FM radio, internet streams, and YouTube, ruins it a little bit more.
For an interesting discussion of how one artist is pissed off about all of this, Google for Bob Dylan and dynamic range compression.
I hope that guy didn't actually think that anyone would listen to his whole song. Man, I don't know if the music or crappy compression was worse. I made it about 5 seconds into each recording....
What happened to do no evil?
Oh, right, it doesn't apply if it allows them to save money...
Meet Google, the next Microsoft, this time a scary version with enough information on all people to keep them in jail or quiet. Love the data mining! Go GMail, Go Search, Go Ad clicks, Go Firefox reporting back...
Yeah Google!
Duh, I think it really sucks that YouTube is doing this. I think that all videos should be in high-def H.264 and if they come in at a lower quality, YouTube uses Google's massive server farm to extrapolate, using math, AI, and other methods, the missing data, a la the computer in Star Trek that can remove a person from a picture who is obscuring someone standing behind that person, then extrapolate the missing information to show that person's face, even though there is no possible way to do that (except in a television series). So you'll be able to upload a JPEG image that has been compressed to such an extent that it appears as one solid color, and YouTube will turn that into the full 220 minutes of Cecil B. DeMille's The Ten Commandments (the 1956 version, not the 1923 silent version). I don't know how that could possibly happen, but this is Google we're talking about.
McCain/Palin '08. Now THAT's hope and change!
So I tried these videos on three different playback mediums:
In all cases, the differences observed were too subtle to demonstrate the claims. Yes, the first video was louder, but the difference can easily be attributed to volume degradation by the post-processing performed on the second video. The placebo effect can play a big part in these comparisons.
The only way to prove what YouTube is doing to sound is to rip the audio and compare the waveform with the original. These videos do not prove the poster's claims.
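One way such a comparison could be put into numbers, sketched in standard-library Python: the crest factor (peak-to-RMS ratio), which heavy compression or clipping tends to shrink. Ripping and decoding the two audio tracks into lists of samples is left out, and the clipped sine below is only a stand-in for a processed track, not real YouTube output.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; squashed dynamics give a smaller value."""
    peak = max(abs(x) for x in samples)
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    if rms == 0.0:
        raise ValueError("silent clip has no crest factor")
    return 20.0 * math.log10(peak / rms)

rate = 8000
original = [math.sin(2.0 * math.pi * 440.0 * n / rate) for n in range(rate)]
# Stand-in for a processed track: the same sine, hard-clipped at half scale.
squashed = [max(-0.5, min(0.5, x)) for x in original]

print(round(crest_factor_db(original), 2), round(crest_factor_db(squashed), 2))
```

A lower figure for the ripped track than for the source would support the claim; equal figures would not. A single number is crude, so comparing it over short windows (quiet passages versus loud ones) would be more telling.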
No, I will not work for your startup
Youtube has accepted flvs for quite some time now. That does not extend to SWFs, however. Animations are not videos, and can't be converted easily.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
So let me get this right, you're complaining that youtube compands audio? You're probably the same bunch who upload videos with heavy macroblocking and the wrong aspect ratio, and you're worried about companding the audio?
The original Flash video format is Sorenson Spark. It's based on H.263, but incompatible.
H.264 and AAC don't fit in an AVI well at all... It's possible to do, but it's a mess, which most apps don't properly support.
In either case, I don't believe for a second that YouTube's video conversion is smart enough to detect compatible video or audio encodings in a different container, and remux rather than reencode.
Despite what the parent said, however, you can encode properly formatted, fully compatible FLVs to youTube, and they will be published, unmodified.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Particularly if you use the "brown noise" at high volume. That would give additional meaning to "garbage in, garbage out".
Oops, sorry. The KFC tone is 4,825 Hz or 4.825 kHz. Damn them decimals!
YouTube re-encodes any flv now. Before, if you uploaded an flv with a total bitrate under 350kbps, it WASN'T re-encoded, thus stereo sound. They re-encode EVERYTHING now; even a 100 kbps total flv was re-encoded.
Live Electronic Music
From my understanding, the concept of YouTube is to put your personal audio/video recordings online for people to view.
I understand how they may re-process videos so they stream well (limiting size, etc). And how audio may be re-processed as well to stream easily, which is called Compression (but not the same term as we are discussing).
The fact that YouTube is applying, not a quality/size modifier, but a SOUND modifier, is something to be concerned about. Essentially, the sound you are getting from these videos is not (relatively) the same as it was when it was recorded and uploaded.
I am kind of thinking of a chili cooking contest where all submissions, unknowingly, have a tablespoon of salt or MSG added. That's not the same as serving less (downsampling).
YouTube should recognize this problem and do the right thing by removing the limiter/compressor and letting the audio remain as natural as it was when it was recorded. Or they could call it SomeWhatYouTube.com....
*** I kinda wish the author of this one had made more clear the difference between compressing audio file size (like making an mp3) and compressing audio with digital/analog tools/gear for the purposes of making softer sounds louder. The terminology may confuse people who are not sound-boys.
The more common term for this type of audio processing is AGC, or Automatic Gain Control. A good number of camcorders have this built in already. It sounds like the issue with the YouTube implementation is that the max gain allowed is just too much and the attack rate (for gaining up) is way too fast. Artistically, they should allow you to turn it off or adjust the parameters; otherwise they have just made all new music on the site sound bad.
Classic compression, on the other hand, is when the loud stuff is made quieter but the quiet stuff stays quiet. If you plot an input level vs output level you get a 'knee' where the threshold for compression begins. The angle of the knee is determined by the ratio of compression.
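The input-versus-output plot described here can be written as a small function. This is a hard-knee sketch in Python; the -20 dB threshold and 4:1 ratio are arbitrary example values, not anything known about YouTube's settings.

```python
def compress_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Static hard-knee curve: output level (dB) for a given input level (dB)."""
    if level_db <= threshold_db:
        return level_db  # below the knee: unchanged
    # Above the knee, every `ratio` dB of input yields 1 dB of output.
    return threshold_db + (level_db - threshold_db) / ratio

for level in (-40.0, -20.0, -10.0, 0.0):
    print(level, "->", compress_db(level))
```

With these example settings, -40 dB and -20 dB pass through unchanged, -10 dB comes out at -17.5 dB, and 0 dB comes out at -15 dB. A higher ratio flattens the line above the threshold further.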
AGC is like someone has the volume knob and cranks it up so that you can always hear something regardless of the content. Usually there are minimum thresholds and max gain settings to go along with this to adjust issues such as these.
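A toy version of that volume-knob behaviour, in standard-library Python. The target level, the 10x maximum gain, the 50 ms release, and the peak-follower detector are all assumed values chosen for the example; real AGC designs differ.

```python
import math

def agc(samples, sample_rate, target=0.5, max_gain=10.0, release_ms=50.0):
    """Toy AGC: track the recent peak level and push it toward `target`."""
    decay = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    level = 0.0
    out = []
    for x in samples:
        level = max(abs(x), level * decay)  # instant attack, slow release
        if level * max_gain < target:
            gain = max_gain                 # the max gain setting caps the boost
        else:
            gain = target / level
        out.append(x * gain)
    return out

rate = 8000
loud = [0.8 * math.sin(2.0 * math.pi * 440.0 * n / rate) for n in range(rate)]
quiet = [0.01 * math.sin(2.0 * math.pi * 440.0 * n / rate) for n in range(rate)]
processed = agc(loud + quiet, rate)
loud_peak = max(abs(v) for v in processed[:rate])
quiet_peak = max(abs(v) for v in processed[-2000:])
```

In this run the loud second is held at or below 0.5, while the quiet tail is boosted tenfold (20 dB). That is how breathing and shuffling between notes end up prominent when the maximum gain is set high.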
Normalizing is yet another technique which requires non-realtime analysis of the entire piece to determine and set a single gain setting for the entire file; a sort of best fit gain.
And from the more complex end, there's Dolby Volume which incorporates several of the above features with their own 'special sauce' in an attempt to provide uniform listening levels between sources and content. I haven't heard it yet to know if it is any good.
-david
I don't think it's YouTube that is the problem, it's your speakers...
imagine going to see a classical music concert and the entire concert is played at the exact same volume, no crescendos or decrescendos
Sweet! Mozart might be tolerable, or even rather nice.
The number one thing I hate about typical classical music, particularly Mozart, is that I have to keep adjusting the volume knob while it plays. It's either deafening or inaudible.
..using e.g. http://softsolutions.sedutec.de/audioanalyser.php (Windows only)
Any hints to a similar tool for Mac OS X or Linux would be greatly appreciated!
You can upload flv's but youtube will re-encode them if the average bitrate is higher than a certain, very low threshold (something like 350kbps total) and you won't get a high quality encode.
After some more testing it seems that there is a problem with high quality mode. With the tone and sample rate I used (19kHz and 44.1k), at least, the high quality encoder whistles at some other frequency. Sounds like somewhere less than 10kHz to me.
From your description (and the article's mention of a 16 kHz limit), it sounds like they're actually sampling at 32kHz, which would give a Nyquist frequency of 16 kHz... If they don't have an anti-aliasing filter (because they've only been around for 40 years, so it's understandable that Google couldn't use a search engine to learn about one), then a 19 kHz tone will alias around the Nyquist frequency and end up at 13 kHz.
Try doing a 16 kHz tone instead, and see if it goes away. Also, make sure you're using a sine wave rather than any other type - you don't want higher harmonics in there.
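The fold-over arithmetic above can be checked with a few lines of Python. The 32 kHz rate is this comment's guess about the encoder, not a confirmed figure, and the function name is made up for the example.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a tone shows up after sampling with no anti-alias filter."""
    folded = tone_hz % sample_rate_hz
    nyquist = sample_rate_hz / 2.0
    # Anything above Nyquist is mirrored back below it.
    return folded if folded <= nyquist else sample_rate_hz - folded

print(alias_frequency(19000, 32000))  # 13000
print(alias_frequency(16000, 32000))  # 16000
```

The 19 kHz tone lands at 13 kHz, matching the figure above, while a 16 kHz tone sits exactly at Nyquist and does not fold.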
I use Vimeo for everything these days. I can't impress clients with their horrid quality.
Well, I assume that everything you said had meaning, but to someone who doesn't know anything about audio (like me, and presumably the AC), it really does sound like Trek-worthy technobabble.
"16MB (fuck off, MiB fascists)" - The Mighty Buzzard
I ran into this problem in a big way when I posted this. There are many places where the limiting is creating distortion (especially in the right-hand chords), but the worst is at about 6:46. If you start at about 6:30, you'll get the full effect when you reach 6:46 --- OUCH!!!
I made a test video that shows that the technique of mixing in a high-frequency tone (I used 20 kHz, in a 48 kHz file) works. The original distorted version is at 0:04 and the same with the distortion eliminated with the addition of the tone is at 0:23.
FLV is not a supported format.
I operate a marginally popular music channel on Youtube (http://www.youtube.com/user/Profeshian), and have decided against further uploads until this utter travesty has been laid to rest. Hopefully my upwards-of-1,300 subscribers will understand.
Okay, here's a video that will give you a quick sense of what the YouTube compression/clipping distortion can sound like and what the effect of the 20KHz sine-tone workaround is; the audio is accompanied by before/after spectrograms showing (a) the "spectral splatter" of the distortion, (b) the filtering out of the 20KHz sine tone during compression, (c) the reduction of the spectral splatter distortion, and (d) other distortion artifacts that are not removed with this technique (they look like a reflection of low-frequency energy, so they're probably Nyquist-related aliasing).
If you want to hear this music in context, it's from this video at around 6:46.
I am an anime fan.
I put an AMV up on YouTube.
The YouTube version was re-compressed into a smaller file which caused the A/V quality and synchronization to suffer. The input file doesn't have this problem.
Therefore I think YouTube is doing this to appease the media cartels for all the infringement going on on YouTube. But won't the same thing happen to 100% original content as well? It shouldn't!
check my video and remix that its killing the hell out of, i pulled down the overall volume by -3db but still no joy check it out for an example....
youtube.*com/watch?v=hmO92x3L-hI
then goto myspace*.com/a2osound to see myspaces version