Sneaking Past Heavy-Handed Audio Compression on YouTube
niceone writes "Recently YouTube seems to have started applying extreme compression
to the audio of uploaded clips. This is the type of compression used
by radio stations to make everything louder, but in this case applied
extremely badly. In quiet passages, breathing and shuffling become
overpoweringly loud. A gently plucked guitar chord becomes a distorted thud.
Listen to an example here. And here's what it could sound like — still not perfect, but a whole lot better. The
fixed version is thanks to a workaround proposed by
Sopranoguitar — the idea is to turn down the audio and mix in
a high frequency sine wave (I used 19kHz). The sine wave fools YouTube's
compressor into thinking that the file is at a uniform level (and does
not need the volume changing at all) but is filtered out by the encoding
process (so, no need to worry about deafening any dogs)."
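For anyone who wants to try the workaround, here is a minimal sketch of the idea in plain Python, assuming the audio is already a list of float samples in the range -1.0 to 1.0. The function name add_pilot_tone and the two 0.5 levels are illustrative assumptions, not values from Sopranoguitar's method; only the 19 kHz tone and the 44.1k sample rate come from the thread.

```python
import math

SAMPLE_RATE = 44100   # Hz, the 44.1k rate mentioned in the thread
TONE_HZ = 19000       # Hz, the sine frequency used in the summary


def add_pilot_tone(samples, audio_gain=0.5, tone_level=0.5,
                   sample_rate=SAMPLE_RATE, tone_hz=TONE_HZ):
    """Turn the audio down and mix in a constant-level sine wave.

    samples: floats in [-1.0, 1.0].
    audio_gain and tone_level are assumed values, picked so that
    audio_gain + tone_level <= 1.0 and the sum cannot clip.
    """
    out = []
    for n, s in enumerate(samples):
        # Constant-amplitude tone, independent of the audio level.
        tone = tone_level * math.sin(2 * math.pi * tone_hz * n / sample_rate)
        out.append(audio_gain * s + tone)
    return out
```

Whether this actually keeps the compressor's gain steady depends on how YouTube measures level, which nobody here knows for sure.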
Who's the mentally... challenged... individual who decided that applying such compression in the first place was a good idea, and then proceeded to implement or accept such a shitty implementation?
For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
Wouldn't it be easier to set your gate correctly? Cut out the background sounds BEFORE submitting to youtube; do proper editing and then it doesn't matter so much what they do. Here, in my opinion, is a good site for all such information.
Qxe4
After some more testing it seems that there is a problem with high quality mode. With the tone and sample rate I used (19kHz and 44.1k), at least, the high quality encoder whistles at some other frequency. Sounds like somewhere less than 10kHz to me.
I hope YouTube fix this soon.
ccalam - acoustic versions of new songs.
Sure, how about the given example? One second is really all you need.
In the heavily compressed one, you hear an annoying hiss and the sound of the microphone being moved for the first few seconds.
In the non-heavily compressed one, you don't.
That's really the complete example without having to listen to the song. Really, the first few seconds are the best example, because Google is apparently amplifying almost complete silence to noise. The song part really doesn't help much. (Or at least, as much as I was willing to listen to it, which was only a few seconds.)
You are in a maze of twisty little relative jumps, all alike.
Wouldn't another solution be to sneak past the entire recompression process by submitting a .flv video that meets YouTube's requirements to avoid recompression? Or would the compression on audio (not the same type of compression, the one this article is talking about) still be forced on these?
By the way, to improve the trick, you could detect the envelope of your sound and modulate your 19 kHz sine with a complementary envelope, so that the two envelopes sum to a flat line: the 19 kHz envelope would be f(t) = 1 - original_sound_envelope(t).
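A rough sketch of that complementary-envelope idea, assuming float samples in the range -1.0 to 1.0. The envelope detector here (a sliding-window peak) is just one simple choice and the 441-sample window is an assumption; the names peak_envelope and add_complementary_tone are made up for illustration.

```python
import math


def peak_envelope(samples, window=441):
    """Crude envelope: max |sample| over a window centered on each sample.

    This is O(N * window), fine for a sketch but slow on long clips.
    """
    half = window // 2
    total = len(samples)
    env = []
    for n in range(total):
        lo = max(0, n - half)
        hi = min(total, n + half + 1)
        env.append(min(1.0, max(abs(s) for s in samples[lo:hi])))
    return env


def add_complementary_tone(samples, sample_rate=44100, tone_hz=19000):
    """Mix in a sine whose amplitude is f(t) = 1 - envelope(t)."""
    env = peak_envelope(samples)
    out = []
    for n, (s, e) in enumerate(zip(samples, env)):
        tone = (1.0 - e) * math.sin(2 * math.pi * tone_hz * n / sample_rate)
        out.append(s + tone)
    return out
```

Because the envelope is never below the sample magnitude, the mixed signal stays within -1.0 to 1.0 as long as the input does.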
You just got troll'd!
YouTube is just trying to enforce a standard level of quality to the content. Everyone expects crappy video with lots of compression artifacts, so the audio might as well follow suit.
Better known as 318230.
Can someone post an example I could possibly listen to for more than one second?
No
Table-ized A.I.
You mean that high-pitched squeal that is driving me nuts in the example more than the audio compression? Yeah... that's filtered out all right...
Defective Logic
Ruff ruff rufff, and ruff rufff, you little bigoted ruff ruff ruffff.
Translation : I'm ultrasound-deaf, you insensitive clod!
You just got troll'd!
The worst examples I've seen have been videos of a lecture/speech, and while the main speaker has a microphone it also picks up sound from around the auditorium or lecture hall.
Normally this is fine, as we have all become accustomed to faint background noise, but with this extreme compression the faintest cough or shuffling in the audience is as loud as the person speaking and is thus very distracting.
Considering most of the lectures I view are 30+ minutes long this really pisses me off.
I don't know about the practicality, but I read a tutorial on running all of your sound (in Linux) through jackd.
You could then run your applications through the jack rack and tweak it however you wanted.
Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
The high quality version of the audio will have the 19 (or up to 22.1) kHz sine wave you choose to use in your video upload. So this is a trade-off of quality (high-quality = eek!) versus lack of unwanted range compression (low-quality = listenable, for lack of a better word).
FWIW, I can hear 19 kHz waves. So this trade-off affects me.
It would be nice if YouTube offered some choices, such as volume adjustment, no volume adjustment, and also other things like stereo. The only way I know of to get stereo is to submit it in Adobe's proprietary formats. YouTube is pulling a Henry Ford: you can have any color you want, as long as it's black.
Table-ized A.I.
It surprises me after all these years, audio formats don't provide recording information about the dynamics of the waveform.
Cameras write EXIF information into JPEG files, why can't we have something similar for audio so we don't have to adjust the volume all the time?
You don't have to be an audiophile to appreciate good audio. I have a custom amp next to my computer into which I've plugged headphones. Find anyone with a pair of headphones, and you'll find an amp, too. Either that, or a deaf person who's been tortured by a bad Flash file.
Read more at http://replaygain.hydrogenaudio.org/.
And how exactly would that help making smaller files?
You just got troll'd!
"...Wouldn't another solution be to sneak past the entire recompression process by submitting a .flv video..."
Last time I submitted a video, about six to eight months ago, Youtube did not accept .flv or .swf formats, even though that is the format that they use to stream. Youtube wanted mpg, divx or mov formats. That sucked because my original was done in swf. First I had to convert the swf to divx which I uploaded to Youtube. Converting from swf to divx resulted in a big quality degradation. Youtube then converted the divx back to flv which resulted in a second quality degradation with the audio being completely out of sync with the video.
Host your video somewhere else, upload it in a high-quality format, and let the site make derivatives for you (including a Flash video and a player you can embed in your webpage if you insist on placating a proprietor). Some organizations do this daily and it works excellently. YouTube needs you more than you need YouTube.
Digital Citizen
I have indeed heard of such deterioration on the Teen Buzz website (which is currently down for excessive bandwidth usage?) - but this page describes it as well.
Those little annoying sine-wave sounds are also used by TV advertisers such as Kentucky Fried Chicken to grab teens' attention if adults are not their market. (For the record, if you can't hear the tone, it sounds off when the KFC bucket shows up.)
Ha... however, considering that .flv video is H.263 (or is it H.264 now?), I guess you could find a program that would change the container to an AVI-compatible one and thus avoid recompressing?
You just got troll'd!
Hopefully this won't date me much, but this reminds me of tape bias, the high-frequency signal added to the audio when recording to magnetic tape (oh, it did have unintended consequences). http://en.wikipedia.org/wiki/Tape_bias
I would guess they are doing this to better "service" handheld devices like the iPhone and upcoming Android devices that have limited dynamic range in their speakers.
I am becoming gerund, destroyer of verbs.
Although I tend to think that Dan East summed it up best, I feel the need to point out that 95% of bad YouTube audio is the result of lousy recording quality, not subsequent processing.
Garbage In, Garbage Out.
The mics and electronics on most consumer camcorders (or that most people use with their Macs and PCs) are just plain crappy, and shouldn't be relied on for anything that you hope to distribute. And of course, some actual audio recording skills help too.
Three Squirrels
I measured that KFC tone at 4825 Hz with a spectrum analyzer. It's not very high at all -- certainly not the mosquito tone.
How about using a very low frequency sound, say 1 Hz? Or a Square wave with a period that is the same length (or greater) as the clip in question? Maybe that way you could avoid the re-encoding / aliasing issues.
See my newest videos, uploaded this week: as soon as I drop the kick the level jumps 6-10 dB, and when the kicks come back it squashes and pumps like a Benassi bassline (not in a good way). http://www.youtube.com/profile_videos?user=muzik4machines Compare the newest one (destroyed by YouTube) with some older ones, where it sounds almost exactly like my original mix (at 22 kHz, but still, not squashed). It does so with the quick uploader as well as the uploaded videos, which is even worse. With the quick uploader I would understand, as people use built-in mics and stuff, but my final, mastered HD performance has all the life squashed out of it, is mono-ified and downsampled to 22.050 kHz. It's not really an incentive for artists to upload their stuff anymore; it makes you sound like you don't know how to mix properly (and it does it with the qui
Live Electronic Music
YouTube re-encodes any flv now. Before, if you uploaded an flv with a total bitrate under 350 kbps, it WASN'T re-encoded, so you kept stereo sound. They re-encode EVERYTHING now; even a 100 kbps total flv was re-encoded.
Live Electronic Music
The more common term for this type of audio processing is AGC, or Automatic Gain Control. A good number of camcorders have this built in already. It sounds like the issue with the YouTube implementation is that the max gain allowed is just too much and the attack rate (for gaining up) is way too fast. Artistically they should allow you to turn it off or adjust the parameters, otherwise they just made all new music on the site sound bad.
Classic compression, on the other hand, is when the loud stuff is made quieter but the quiet stuff stays quiet. If you plot an input level vs output level you get a 'knee' where the threshold for compression begins. The angle of the knee is determined by the ratio of compression.
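To make the knee concrete, here is a small sketch of a static compressor curve with a hard knee, working in dB. The function name compress_db and the default threshold of -20 dB and ratio of 4:1 are assumed example values only.

```python
def compress_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Output level in dB for a given input level in dB (hard knee).

    Below the threshold the level passes through unchanged; above it,
    every `ratio` dB of input yields only 1 dB more output.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio
```

With these assumed defaults a -40 dB input stays at -40 dB, while a 0 dB input comes out at -15 dB: the loud stuff is pulled down and the quiet stuff is left alone.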
AGC is like someone has the volume knob and cranks it up so that you can always hear something regardless of the content. Usually there are minimum thresholds and max gain settings to go along with this to adjust issues such as these.
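A very simplified sketch of that volume-knob behaviour, assuming float samples in the range -1.0 to 1.0. This is not YouTube's algorithm (which is unknown); the function name agc and all parameter values are assumptions, and real AGCs measure level over a window rather than per sample.

```python
def agc(samples, target=0.5, max_gain=4.0, min_level=0.01, attack=0.001):
    """Toy automatic gain control.

    target:    level the AGC tries to bring the signal to
    max_gain:  upper limit on how far the "knob" can be cranked
    min_level: below this the gain is held, so near-silence is not boosted
    attack:    fraction of the remaining gap closed per sample when gaining up
    """
    gain = 1.0
    out = []
    for s in samples:
        level = abs(s)
        if level >= min_level:
            desired = min(max_gain, target / level)
            if desired > gain:
                gain += attack * (desired - gain)  # gain up gradually
            else:
                gain = desired                     # gain down at once
        out.append(s * gain)
    return out
```

Raising max_gain and attack in this sketch gives the kind of result complained about above, where breathing and shuffling are quickly pulled up to the level of the music.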
Normalizing is yet another technique which requires non-realtime analysis of the entire piece to determine and set a single gain setting for the entire file; a sort of best fit gain.
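For comparison, a sketch of the simplest form of normalizing, peak normalization, which scans the whole clip once and applies one fixed gain. The function name normalize and the target_peak default are assumptions; loudness-based schemes pick the gain differently.

```python
def normalize(samples, target_peak=1.0):
    """Scale the whole clip by one gain so its largest peak hits target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence (or empty): nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]
```

Because the gain never changes during the clip, the relative loudness of quiet and loud passages is preserved, unlike with AGC.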
And from the more complex end, there's Dolby Volume which incorporates several of the above features with their own 'special sauce' in an attempt to provide uniform listening levels between sources and content. I haven't heard it yet to know if it is any good.
-david