Xiph Episode 2: Digital Show & Tell
An anonymous reader writes "Continuing a firehose tradition of maximum information density, Xiph.Org's second video on digital media explores multiple facets of digital audio signals and how they really behave in the real world. Demonstrations of sampling, quantization, bit-depth, and dither explore digital audio behavior on real audio equipment using both modern digital analysis and vintage analog bench equipment... just in case we can't trust those newfangled digital gizmos. You can also download the source code for each demo and try it all for yourself!"
Plus you get to look at Monty's beard and hear his soothing voice. There's a handy wiki page with further information and a summary of the video if text is your thing.
>transient signals behave differently from the periodic signals he tests in the video
He addresses this. No signal in our rate-limited sample exists in isolation. A transient blip or noise is not detected if it doesn't fall on a point that is measured, yes, but a real signal does not exist on a single point. An audio codec is at another level above the hardware and doesn't change the fundamental physics of digital-analog implementations that haven't changed much in the last 10-15 years, and are developed with signal theory in mind.
Found it very informative to a non-guru.
Aside from that, the video and its audio, and I'm not kidding here,
were very pleasant and sympathetic to the ears and eyes.
If a 100 MHz analog scope can't detect the stair steps, then there is absolutely no fucking way you have the slightest chance of detecting them.
100,000,000 Hz vs your ears at maybe 20,000 Hz.
See the difference?
He has actual hardware there, as he explains quite old consumer grade hardware, which does the conversion from analog to digital to analog, and the result is still for all intents and purposes PERFECT. Yes, the delivery is smug, but rightly so. Talking to "audiophiles" is like talking to people who believe in homeopathy: It is extremely difficult to not just make fun of the fools. When you instead manage to deliver an explanation and a demo that clear, you get to be smug. (Captcha: mockery)
He knows what he is talking about, he explains things clearly, he is not condescending to the viewer -- I think the apparent smugness is not for real, or maybe it is just how his personality comes over. And if you still don't understand how there is no stair-step, you need to watch the video again! Even though I've done loads of DSP, the nice demos he gives really illustrate well what he is saying, and who can argue with pure-analogue gear proving the point -- not just theory and hand waving, but real experimental evidence. Really nice work.
>See the difference?
But, but, but... I'm an audiophile, dammit! I listen with my soul. That's why I can hear it!
That is all.
This guy knows what he's talking about, and communicates it well. Amateur audiophiles should especially read his article here: http://people.xiph.org/~xiphmont/demo/neil-young.html.
... for all the bullshit Blackboard technology mess, videotaped classroom lectures, and .edu buzzwords, this sort of thing is exactly how open education should be done.
congrats Monty, once again you've done well.
~.~
I'm a peripheral visionary.
Never underestimate the power of self-delusion. Placebo effects have been accepted by science as being very real. The fact that someone, whether a believer in homeopathy, audiophile-quality, sugar pills or whatever scientifically-unsubstantiated nutjobery, actually believes in it is beneficial to them. Trying to take away their self-delusions is just plain mean. I really wish I could convince myself that what amounts to water is as effective as real medicine...I might be a healthier person.
Could you please post the specs of the reel-to-reel equipment you're using? To beat a 16-bit digital system, it has to have better than 96 dB SNR and dynamic range. I don't remember having ever seen that.
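For reference, the 96 dB figure follows from the ratio between full scale and one quantization step. A minimal sketch of that arithmetic, assuming the usual 20·log10(2^bits) rule of thumb (the function name is just for illustration):

```python
import math

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, expressed in dB."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
# 16-bit: 96.3 dB
# 24-bit: 144.5 dB
```

This is the idealized number only; it says nothing about a particular converter's analog noise floor.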
Opus: the Swiss army knife of audio codec
you're missing the point.. whether it's a pure sine wave, or a combination of sines (all real number samples up to nyquist freq), nothing is lost.
This ignores the fact that perception doesn't define reality. Stripping delusion away might be 'mean', but sometimes it's necessary. History is rife with examples of delusion driving whole societies over the edge.
With 44.1 kHz, 16-bit audio we already have all the data to reconstruct a signal band-limited to the range of human hearing. Effectively then you're calling Nyquist bullshit without even understanding it.
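A rough way to check the reconstruction claim numerically is Whittaker-Shannon (sinc) interpolation: sample a band-limited signal at 44.1 kHz, then rebuild its value at instants that fall between the samples. This is only a sketch under assumptions: the two test tones, the finite window of samples, and all names are made up for illustration, and a real DAC uses an analog or oversampled filter rather than a literal sinc sum.

```python
import math

FS = 44_100  # sample rate in Hz

def signal(t):
    """Band-limited test signal: two sines, both below FS / 2."""
    return (0.6 * math.sin(2 * math.pi * 1_000 * t)
            + 0.3 * math.sin(2 * math.pi * 15_000 * t))

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, first_index, t):
    """Sinc interpolation from a finite run of samples."""
    pos = t * FS
    return sum(s * sinc(pos - (first_index + k)) for k, s in enumerate(samples))

HALF = 2_000  # samples kept on each side of t = 0
samples = [signal(n / FS) for n in range(-HALF, HALF + 1)]

# Probe instants that fall strictly between sample points.
probes = [(n + frac) / FS for n in range(-3, 4) for frac in (0.25, 0.5, 0.75)]
max_error = max(abs(reconstruct(samples, -HALF, t) - signal(t)) for t in probes)
print(f"max error between samples: {max_error:.2e}")
```

The residual error comes from truncating the sinc sum to a finite window, not from the sampling itself; widening the window shrinks it further.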
Only via companding / NR, and those solutions are not broadband. How could you use tape and not know this?
He says several times that he used sine waves as a simple illustration. Then he switches to square waves. You apparently don't understand sampling theory well enough to understand why your second sentence, in the context of PCM audio, is incorrect. Perhaps reading this will help: http://people.xiph.org/~xiphmont/demo/neil-young.html.
On the other hand, you might just be an AC troll, an "audiophile" or an old enthusiast or sound engineer who might have been an excellent technician but never developed a proper understanding of signals. In any case, anybody tempted to agree with your post should read the article at that link.
No, stair steps don't exist in the music you hear (at least not beyond trivially small ones). You could potentially make an ADC that did produce stair step samples but it would be a stupid thing to do. As he also mentioned, in some cases ADCs do produce stair step samples, but that's an intermediate step in the conversion and is not what is output.
If you want to get really pedantic, any real ADC does take a certain amount of time to complete a sample so the sample does have some finite extent, but it's very, very small, isn't uniformly distributed and is far outside the band limit anyway.
I suppose you could construct a codec in which an in-band transient might be treated differently than a periodic signal but that has very little to do with what actually happens in real codecs that people do, or might, use.
Maybe you think he's smug because he knows what he's talking about and it disagrees with what you think you know?
Stripping away someone's self delusions isn't mean. It prevents him from being an easy mark for snake oil salesmen, whether those are homeopaths or Best Buy employees. In medicine the placebo effect is great, until you get something it doesn't work on and you die because you didn't get real treatment. In stereos it just means you get separated from your money, over and over and over again.
So long as all of the steps involved are linear, does it matter?
If a linear signal processing procedure behaves a certain way for all sine wave signals in a frequency range, then it will behave in the same way for any sum of sine waves in that range as well.
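The superposition argument can be checked numerically. In this sketch a small FIR filter stands in for "a linear signal processing procedure"; the tap values and test tones are arbitrary assumptions chosen for illustration.

```python
import math

def fir(x, taps):
    """Apply a causal FIR filter, a linear and time-invariant operation."""
    return [sum(h * x[n - k] for k, h in enumerate(taps) if n - k >= 0)
            for n in range(len(x))]

N = 256
a = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]
b = [0.5 * math.sin(2 * math.pi * 40 * n / N) for n in range(N)]
taps = [0.25, 0.5, 0.25]

# Filter the sum, then compare against the sum of the separately filtered parts.
filtered_sum = fir([p + q for p, q in zip(a, b)], taps)
sum_of_filtered = [p + q for p, q in zip(fir(a, taps), fir(b, taps))]
max_diff = max(abs(p - q) for p, q in zip(filtered_sum, sum_of_filtered))
print(f"largest difference: {max_diff:.1e}")
```

The two results agree to within floating-point rounding, which is the whole point: knowing what a linear system does to each sine tells you what it does to any sum of them.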
For a video all about audio, why does the guy's voice keep flapping around from left speaker to right speaker? I found it pretty distracting. Next time, try a clip-on mic and mix it down to mono unless it's necessary to make a point.
systemd is Roko's Basilisk.
You're absolutely right, and real devices are more or less prone to some distortion when converting digital signals.
However, this must be determined empirically, and is beyond the scope of a discussion of digital signals. As a rule, the DSP developer gets his algorithm right, and getting those Signed Ints to sound good is strictly the client's responsibility.
Don't blame me, I voted for Baltar.
We listen to nothing more than sums of pure sine waves.
Not absolutely. You get dithering losses and quantization distortion, so a 16-bit system usually has about 14 ENOB, thus about 84 dB of dynamic range. SNR is the range from the floor to the reference fluxivity, which per the AES spec would be -18 dBFS or -66 dB. I have worked with 24-track machines (a Studer A827 in this case), at 15 ips and with Dolby SR, with a fresh, degaussed roll of Quantegy, that could do 70 decibels of SNR on the first generation.
SACDs are encoded with 1-bit sigma-delta pulses. Those would be some pretty whopping staircases if sampling theory didn't work to eliminate them.
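To make the 1-bit point concrete, here is a toy first-order sigma-delta modulator followed by a crude moving-average low-pass. It is a sketch under assumptions, not how SACD is actually encoded: real converters use higher-order modulators and far better filters, and every name and constant below is made up for illustration.

```python
import math

def sigma_delta(x):
    """First-order sigma-delta modulator: inputs in [-1, 1] -> stream of +/-1."""
    integrator, feedback, bits = 0.0, 0.0, []
    for sample in x:
        integrator += sample - feedback      # accumulate the running error
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits

PERIOD = 4096   # samples per sine cycle, i.e. heavily oversampled
WIDTH = 64      # moving-average length used as a crude low-pass filter

x = [0.5 * math.sin(2 * math.pi * n / PERIOD) for n in range(2 * PERIOD)]
bits = sigma_delta(x)

# Low-pass the 1-bit stream and compare with the input near each window's centre.
recovered = [sum(bits[n:n + WIDTH]) / WIDTH for n in range(len(bits) - WIDTH)]
max_error = max(abs(r - x[n + WIDTH // 2]) for n, r in enumerate(recovered))
print(f"max error after filtering: {max_error:.3f}")
```

Every output value is a full-scale +1 or -1, yet the filtered stream tracks the slow sine closely, because the quantization error has been pushed up to frequencies the low-pass removes.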
I am becoming gerund, destroyer of verbs.
On that topic, while I was filming the epilogue I started feeling ill and ignored it for a while, but eventually went to the emergency room. Three hours later I had an emergency appendectomy. Medical science also rocks. If I'd gone to a faith healer, I'd be dead. (FTR, the epilogue footage in the final vid was from three days after the appendectomy)
Extensive testing reveals that no matter the wailing and gnashing of teeth claiming digital sucks, no one can actually tell the difference in a double-blind test.
Theory is indeed useless when reality doesn't agree. Every real scientist knows this. An engineer's job depends on it. I invite audiophiles to add that bit of wisdom to their own thinking.
As a nitpick, you get dithering losses _or_ quantization distortion, or a linear tradeoff between the two. You don't get the worst case of both on top of each other unless you screw up.
Without dither, worst case, all your 16 bit quant distortion products will be under -100dB regardless of input amplitude. I actually display the worst case in the video to make it easy to see. Quantization distortion aliases, and I chose an integer sample period so the aliased distortion would always land in the same bins after folding. If I hadn't, it would have spread out more and been even lower. If I had chosen a relatively prime frequency, the quantization distortion would have spread out across all bins equally.
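The dither-versus-distortion tradeoff is easiest to see with an exaggerated case: a tone smaller than half a quantization step. This is a toy sketch, not the demo from the video; it assumes a plain rounding quantizer, TPDF dither spanning ±1 LSB, and a seeded random generator so the run is repeatable.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (1.0 == one LSB)."""
    return math.floor(x + 0.5)

def tpdf(rng):
    """Triangular-PDF dither spanning +/-1 LSB (sum of two uniform values)."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def tone_amplitude(y, cycles):
    """Estimate the amplitude of the sine with `cycles` periods in len(y) samples."""
    total = len(y)
    return 2 / total * sum(v * math.sin(2 * math.pi * cycles * n / total)
                           for n, v in enumerate(y))

rng = random.Random(0)
N, CYCLES, AMP = 20_000, 50, 0.4   # 0.4 LSB peak: below the rounding threshold
x = [AMP * math.sin(2 * math.pi * CYCLES * n / N) for n in range(N)]

plain = [quantize(v) for v in x]                 # no dither
dithered = [quantize(v + tpdf(rng)) for v in x]  # dither added before rounding

print("undithered tone amplitude:", tone_amplitude(plain, CYCLES))
print("dithered tone amplitude:  ", round(tone_amplitude(dithered, CYCLES), 3))
```

Without dither the tone rounds away to silence; with dither it survives at roughly its original level, buried in a noise floor instead of being distorted or erased.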
As you point out though, this depends a lot on the selection of dither spectrum and the dither's probability density function, and these are much more pokey issues and depend on subjective analysis of the signal target -- noise shaped dither is great if you're mastering for delivery but can screw you if your recipient isn't an ISO 226 listener, and expects to be able to do pitch shifting or folding with a ring modulator.
Counter nitpick: Monty, as a professional motion picture sound designer, I cannot tell you how distracting it is to hear your voice constantly changing its pan across the stereo field :)
Right, and this is why dither is only applied to 'last-mile' audio intended to be consumed. Dither 'screws' you in other ways if you intend to use that audio in production: you can lose the distortion-removing benefit yet still carry the additive noise. But we're still talking about changes 100+ dB down.
>Counter nitpick: Monty, as a professional motion picture sound designer, I cannot tell you how distracting it is to hear your voice constantly changing its pan across the stereo field :)
The audio was recorded with a stereo pair. It wasn't panned artificially :-) Look down a few comments for more about this, you weren't the only person to complain.
Oh! And 'linear' was completely wrong. I don't know how I braino-ed that in there, the [at very least perceived] distortion/noise tradeoff is not linear.
>We listen to nothing more than sums of pure sine waves.
Well I, for one, listen to the radio.