Slashdot Mirror


Xiph Episode 2: Digital Show & Tell

An anonymous reader writes "Continuing a firehose tradition of maximum information density, Xiph.Org's second video on digital media explores multiple facets of digital audio signals and how they really behave in the real world. Demonstrations of sampling, quantization, bit-depth, and dither explore digital audio behavior on real audio equipment using both modern digital analysis and vintage analog bench equipment... just in case we can't trust those newfangled digital gizmos. You can also download the source code for each demo and try it all for yourself!" Plus you get to look at Monty's beard and hear his soothing voice. There's a handy wiki page with further information and a summary of the video if text is your thing.

50 comments

  1. soothing? verarschst Du? by Anonymous Coward · · Score: 0

    soothing, really? I am keenly curious about what he has to say, but find his delivery smug.

    Also, he doesn't address what really happens in the hardware of the audio codecs. Yes, sampling *theory* means there are no stair steps, but codecs are real devices. They run on discrete clocks and depending on the particular codec implementation, there are layers of non-idealities. So, yes, the stair steps do exist in the music you hear. And no, you most likely can not perceive it. I'm honestly not sure if that hardware he uses could detect it. Probably for certain specific signals. Speaking of certain signals, for certain codec architectures, transient signals behave differently from the periodic signals he tests in the video. No, that does not mean he's wrong about transients having to line up with a sampling boundary in order to be properly sampled.

    No matter how confident he seems to you, take your grain of salt and caveat emptor. Or as Wu-Tang Clan would put it: "Ain't a damn thing changed boy, protect ya neck."

  2. Re:soothing? verarschst Du? by Neuroelectronic · · Score: 1, Informative

    >transient signals behave differently from the periodic signals he tests in the video

    He addresses this. No signal in our rate-limited sample exists in isolation. A transient blip or noise is not detected if it doesn't fall on a point that is measured, yes, but a real signal does not exist on a single point. An audio codec is at another level above the hardware and doesn't change the fundamental physics of digital-analog implementations, which haven't changed much in the last 10-15 years and are developed with signal theory in mind.

  3. I just happened to have watched part 1 last week by Mister+Liberty · · Score: 4, Informative

    Found it very informative to a non-guru.

    Aside from that, the video and its audio, and I'm not kidding here,
    were very pleasant and sympathetic to the ears and eyes.

  4. Re:soothing? verarschst Du? by cheater512 · · Score: 4, Insightful

    If a 100 MHz analog scope can't detect the stair steps, then there is absolutely no fucking way you have the slightest chance of detecting them.

    100,000,000 Hz vs your ears at maybe 20,000 Hz.
    See the difference?

  5. Re:soothing? verarschst Du? by Anonymous Coward · · Score: 5, Informative

    He has actual hardware there (as he explains, quite old consumer-grade hardware) which does the conversion from analog to digital to analog, and the result is still for all intents and purposes PERFECT. Yes, the delivery is smug, but rightly so. Talking to "audiophiles" is like talking to people who believe in homeopathy: It is extremely difficult to not just make fun of the fools. When you instead manage to deliver an explanation and a demo that clear, you get to be smug. (Captcha: mockery)

  6. Re:soothing? verarschst Du? by Anonymous Coward · · Score: 0

    If something is barely perceptible in a good way, that's what 'soothing' means.

  7. Re:soothing? verarschst Du? by Aguazul2 · · Score: 5, Informative

    He knows what he is talking about, he explains things clearly, he is not condescending to the viewer -- I think the apparent smugness is not for real, or maybe it is just how his personality comes over. And if you still don't understand how there is no stair-step, you need to watch the video again! Even though I've done loads of DSP, the nice demos he gives really illustrate well what he is saying, and who can argue with pure-analogue gear proving the point -- not just theory and hand waving, but real experimental evidence. Really nice work.

  8. Re:soothing? verarschst Du? by frank_adrian314159 · · Score: 3, Funny

    See the difference?

    But, but, but... I'm an audiophile, dammit! I listen with my soul. That's why I can hear it!

    --
    That is all.
  9. This is good by ceoyoyo · · Score: 3, Informative

    This guy knows what he's talking about, and communicates it well. Amateur audiophiles should especially read his article here: http://people.xiph.org/~xiphmont/demo/neil-young.html.

  10. full props to Monty by nadaou · · Score: 5, Informative

    ... for all the bullshit Blackboard technology mess, videotaped classroom lectures, and .edu buzzwords, this sort of thing is exactly how open education should be done.

    congrats Monty, once again you've done well.

    --
    ~.~
    I'm a peripheral visionary.
  11. Using real world audio waveforms? by Anonymous Coward · · Score: 0, Flamebait

    While Monty's presentation is excellent, what we have to remember is we are not listening to pure sine waves. What he should be using is a true audio wave, which requires so much more resolution to recreate accurately. When he says he can match reel to reel in x number of bits, I was surprised. He doesn't say what sort of tape speed and recording source he was using. Having done both reel and digital recordings for two decades, I've yet to come across a 16bit digital system being able to beat reel to reel, let alone anything lesser.

    1. Re:Using real world audio waveforms? by jmv · · Score: 2

      Could you please post the specs of the reel to reel equipment you're using? To beat a 16-bit digital system, it has to have better than 96 dB SNR and dynamic range. I don't remember having ever seen that.
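
      For reference, the 96 dB figure is the ratio between full scale and one quantization step, 20*log10(2^N). A minimal sketch of that arithmetic (the function name is illustrative and not taken from the video or any library):

```python
import math

def ideal_dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

print(round(ideal_dynamic_range_db(16), 1))  # 96.3
print(round(ideal_dynamic_range_db(24), 1))  # 144.5
```

      This is the idealized number only; it ignores dither, converter noise, and headroom conventions.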

    2. Re:Using real world audio waveforms? by Anonymous Coward · · Score: 1

      you're missing the point.. whether it's a pure sine wave, or a combination of sines (all real number samples up to nyquist freq), nothing is lost.

    3. Re:Using real world audio waveforms? by Anonymous Coward · · Score: 1

      What he should be using is a true audio wave which requires so much more resolution to recreate accurately.

      With 44.1kHz, 16bit audio we already have all the data to reconstruct a signal band limited to the range of human hearing. Effectively then you're calling nyquist bullshit without even understanding it.

      I've yet to come across a 16bit digital system being able to beat reel to reel, let alone anything lesser.

      Only via companding / NR, and those solutions are not broadband. How could you use tape and not know this?

    4. Re:Using real world audio waveforms? by ceoyoyo · · Score: 4, Insightful

      He says several times that he used sine waves as a simple illustration. Then he switches to square waves. You apparently don't understand sampling theory well enough to understand why your second sentence, in the context of PCM audio, is incorrect. Perhaps reading this will help: http://people.xiph.org/~xiphmont/demo/neil-young.html.

      On the other hand, you might just be an AC troll, an "audiophile" or an old enthusiast or sound engineer who might have been an excellent technician but never developed a proper understanding of signals. In any case, anybody tempted to agree with your post should read the article at that link.

    5. Re:Using real world audio waveforms? by Entropius · · Score: 1

      So long as all of the steps involved are linear, does it matter?

      If a linear signal processing procedure behaves a certain way for all sine wave signals in a frequency range, then it will behave in the same way for any sum of sine waves in that range as well.
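
      The superposition argument can be checked numerically. The sketch below assumes an arbitrary three-tap FIR filter as the "linear procedure" (the taps and test frequencies are made up for illustration) and confirms that filtering a sum of sines gives the same result as summing the filtered sines:

```python
import math

def fir_filter(signal, taps):
    """Apply a causal FIR filter, which is a linear, time-invariant operation."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out

FS = 44100
N = 256
TAPS = [0.25, 0.5, 0.25]  # arbitrary illustrative low-pass taps

sine_a = [math.sin(2 * math.pi * 440 * n / FS) for n in range(N)]
sine_b = [0.5 * math.sin(2 * math.pi * 3000 * n / FS) for n in range(N)]
mixed = [a + b for a, b in zip(sine_a, sine_b)]

filtered_sum = fir_filter(mixed, TAPS)
sum_of_filtered = [
    a + b for a, b in zip(fir_filter(sine_a, TAPS), fir_filter(sine_b, TAPS))
]

# Any difference is floating-point rounding only.
max_diff = max(abs(x - y) for x, y in zip(filtered_sum, sum_of_filtered))
print(max_diff < 1e-12)  # True
```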

    6. Re:Using real world audio waveforms? by iluvcapra · · Score: 3, Informative

      While Monty's presentation is excellent, what we have to remember is we are not listening to pure sine waves.

      We listen to nothing more than sums of pure sine waves.

      --
      Don't blame me, I voted for Baltar.
    7. Re:Using real world audio waveforms? by iluvcapra · · Score: 1

      To beat a 16-bit digital system, it has to have better than 96 dB SNR and dynamic range

      Not absolutely. You get dithering losses and quantization distortion, so a 16 bit system usually has about 14 ENOB, thus 84 dB dynamic range. SNR is the range from the floor to the reference fluxivity, which per the AES spec would be -18 dBFS, or -66 dB. I have worked with 24 track machines (a Studer A827 in this case), at 15 ips and with Dolby SR, with a fresh, degaussed roll of Quantegy, that could do 70 decibels of SNR on the first generation.

      --
      Don't blame me, I voted for Baltar.
    8. Re:Using real world audio waveforms? by Anonymous Coward · · Score: 0

      Congratulations, you've explained why everyone tracks at 24 bit and demonstrated that tape does in fact have less dynamic range than 16bit mastered to 0 dBFS for distribution. Also, you'll find that whatever the AES say, most people reference 0 VU to -12 dBFS when working at 16bit.

      Sure, you could get a 1" 2 track headstock (around +6 dB over 1/4" 2 track) and use NR to make an academic point. You have to remember that this whole discussion is about folks who think 16bit, 44.1k audio is a substandard distribution format and that there's some tangible, audible benefit to increasing sample rate and bit depth for real world listening environments. Engineers need to make the technical explanations clear enough to stop these folks making shit up.

      Then we'll have to explain why we still track to tape...

    9. Re:Using real world audio waveforms? by xiphmont · · Score: 1

      As a nitpick, you get dithering losses _or_ quantization distortion, or a linear tradeoff between the two. You don't get the worst case of both on top of each other unless you screw up.
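
      A toy illustration of that tradeoff, assuming a constant input of 0.3 LSB and triangular (TPDF) dither; both choices are arbitrary and not taken from the video's demo code. Without dither the quantizer makes the same error every time; with dither the error turns into noise that averages out:

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (1.0 == one LSB)."""
    return math.floor(x + 0.5)

def tpdf_dither(rng):
    """Triangular-PDF dither spanning +/-1 LSB: the sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
level = 0.3              # constant input sitting between two steps, in LSBs
trials = 20000

plain = [quantize(level) for _ in range(trials)]
dithered = [quantize(level + tpdf_dither(rng)) for _ in range(trials)]

mean_plain = sum(plain) / trials        # the 0.3 LSB is lost every time
mean_dithered = sum(dithered) / trials  # close to 0.3 on average, plus noise

print(mean_plain, mean_dithered)
```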

      Without dither, worst case, all your 16 bit quant distortion products will be under -100dB regardless of input amplitude. I actually display the worst case in the video to make it easy to see. Quantization distortion aliases, and I chose an integer sample period so the aliased distortion would always land in the same bins after folding. If I hadn't, it would have spread out more and been even lower. If I had chosen a relatively prime frequency, the quantization distortion would have spread out across all bins equally.
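
      The "same bins" point can be reproduced on a small scale. This sketch assumes an 8-sample period, a 64-point DFT, and an arbitrary amplitude (none of these values come from the video). Because the undithered error repeats with the signal, all of its energy lands on multiples of the fundamental bin:

```python
import cmath
import math

def quantize(x):
    """Round to the nearest integer step (1.0 == one LSB)."""
    return math.floor(x + 0.5)

def dft_magnitudes(seq):
    """Naive O(n^2) DFT; fine for a 64-point illustration."""
    n = len(seq)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(seq)))
        for k in range(n)
    ]

PERIOD = 8            # samples per cycle: an integer sample period
LENGTH = 64           # eight whole cycles
AMPLITUDE = 10000.3   # in LSBs, chosen arbitrarily

signal = [AMPLITUDE * math.sin(2 * math.pi * n / PERIOD) for n in range(LENGTH)]
error = [quantize(s) - s for s in signal]
spectrum = dft_magnitudes(error)

# Bins that are multiples of the fundamental (bin 8 here).
harmonic_bins = set(range(0, LENGTH, LENGTH // PERIOD))
leak = max(m for k, m in enumerate(spectrum) if k not in harmonic_bins)
peak = max(spectrum[k] for k in harmonic_bins)

print(leak < 1e-6, peak > 1.0)  # True True
```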

    10. Re:Using real world audio waveforms? by iluvcapra · · Score: 1

      As a nitpick, you get dithering losses _or_ quantization distortion, or a linear tradeoff between the two.

      As you point out though, this depends a lot on the selection of dither spectrum and the dither's probability density function, and these are much more pokey issues and depend on subjective analysis of the signal target -- noise shaped dither is great if you're mastering for delivery but can screw you if your recipient isn't an ISO 226 listener, and expects to be able to do pitch shifting or folding with a ring modulator.

      Counter nitpick: Monty, as a professional motion picture sound designer, I cannot tell you how distracting it is to hear your voice constantly changing its pan across the stereo field :)

      --
      Don't blame me, I voted for Baltar.
    11. Re:Using real world audio waveforms? by xiphmont · · Score: 1

      Right, and this is why dither is only applied to 'last-mile' audio intended to be consumed. Dither 'screws' you in other ways if you intend to use that audio in production, such as losing the distortion-removing benefit while still keeping the additive noise. But we're still talking about changes 100+dB down.

      >Counter nitpick: Monty, as a professional motion picture sound designer, I cannot tell you how distracting it is to hear your voice constantly changing its pan across the stereo field :)

      The audio was recorded with a stereo pair. It wasn't panned artificially :-) Look down a few comments for more about this, you weren't the only person to complain.

    12. Re:Using real world audio waveforms? by xiphmont · · Score: 1

      Oh! And 'linear' was completely wrong. I don't know how I braino-ed that in there, the [at very least perceived] distortion/noise tradeoff is not linear.

    13. Re:Using real world audio waveforms? by Sigg3.net · · Score: 1

      ...what we have to remember is we are not listening to pure sine waves.

      We listen to nothing more than sums of pure sine waves.

      Well I, for one, listen to the radio.

  12. Re:soothing? verarschst Du? by Anonymous Coward · · Score: 2, Informative

    Never underestimate the power of self-delusion. Placebo effects have been accepted by science as being very real. The fact that someone, whether a believer in homeopathy, audiophile-quality, sugar pills or whatever scientifically-unsubstantiated nutjobery, actually believes in it is beneficial to them. Trying to take away their self-delusions is just plain mean. I really wish I could convince myself that what amounts to water is as effective as real medicine...I might be a healthier person.

  13. Re:soothing? verarschst Du? by epyT-R · · Score: 1

    This ignores the fact that perception doesn't define reality. Stripping delusion away might be 'mean', but sometimes it's necessary. History is rife with examples of delusion driving whole societies over the edge.

  14. Re:soothing? verarschst Du? by ceoyoyo · · Score: 1

    No, stair steps don't exist in the music you hear (at least not beyond trivially small ones). You could potentially make an ADC that did produce stair step samples but it would be a stupid thing to do. As he also mentioned, in some cases ADCs do produce stair step samples, but that's an intermediate step in the conversion and is not what is output.

    If you want to get really pedantic, any real ADC does take a certain amount of time to complete a sample so the sample does have some finite extent, but it's very, very small, isn't uniformly distributed and is far outside the band limit anyway.

    I suppose you could construct a codec in which an in-band transient might be treated differently than a periodic signal but that has very little to do with what actually happens in real codecs that people do, or might, use.

    Maybe you think he's smug because he knows what he's talking about and it disagrees with what you think you know?

  15. Re:soothing? verarschst Du? by ceoyoyo · · Score: 1

    Stripping away someone's self delusions isn't mean. It prevents him from being an easy mark for snake oil salesmen, whether those are homeopaths or Best Buy employees. In medicine the placebo effect is great, until you get something it doesn't work on and you die because you didn't get real treatment. In stereos it just means you get separated from your money, over and over and over again.

  16. Re:soothing? verarschst Du? by Anonymous Coward · · Score: 0

    You deleted the part where I said "for certain codec architectures,"

    He doesn't tell you what codec architecture is inside the audio interface he uses.

    However, for certain types, such as sigma-delta, there are errors which are worse on sharp transients at high-frequencies than on steady-state signals.

  17. Re:soothing? verarschst Du? by Anonymous Coward · · Score: 0

    Sure, you don't let someone treat their cancer with homeopathy, but why stop them from treating an upset stomach, migraines or some other such non-fatal malady? We take medicine for a lot of problems and very few of those problems can be fatal.

  18. Great lesson, but what's with the audio? by wonkey_monkey · · Score: 2

    For a video all about audio, why does the guy's voice keep flapping around from left speaker to right speaker? I found it pretty distracting. Next time, try a clip-on mic and mix it down to mono unless it's necessary to make a point.

    --
    systemd is Roko's Basilisk.
    1. Re:Great lesson, but what's with the audio? by xiphmont8352 · · Score: 2

      I intentionally avoid using lavalier mics. Their amplitude and timbre are all over the place depending on the direction you're looking, they pick up clothing noise, and you're either tethered to a wire or have to deal with the complexity and limitations of a wireless system.

      Headsets work better overall, but are highly visible and still add a layer of complexity. They also sound like someone talking directly into your ear. Even with additional postprod [going all the way to room modeling], I've never managed to make a headset sound totally natural. If others have, I'd appreciate some tips.

      In any case, as someone who spent ten years dealing with lavs and headsets in live performance as a sound engineer, I avoid them in my own time as much as humanly possible :-)

      The stereo image was intentional, it's a trick for removing perceived echo/reverb by spreading it out across a stereo image instead of it all piling up right behind the voice in mono. That said, the image was wider than I'd have liked. That was a result of mic placement and angle, another tradeoff to avoid wide amplitude changes as I moved around.

      FWIW, several other video producers wrote me to ask how I got such great sound without a lav, and (like you) others wrote to mention that they found the very wide stereo distracting. I'll tinker with it more in the next vid.

    2. Re:Great lesson, but what's with the audio? by wonkey_monkey · · Score: 1

      FWIW, several other video producers wrote me to ask how I got such great sound without a lav

      Aside from the stereo thing, it did sound very clear.

      The stereo image was intentional, it's a trick for removing perceived echo/reverb by spreading it out across a stereo image instead of it all piling up right behind the voice in mono.

      You just can't stop teaching people stuff, can you? ;)

      --
      systemd is Roko's Basilisk.
    3. Re:Great lesson, but what's with the audio? by xiphmont · · Score: 1

      I think it's more "I want people to know why I do the stupid things I do." Latent fear of being committed.

    4. Re:Great lesson, but what's with the audio? by iluvcapra · · Score: 1

      The stereo image was intentional, it's a trick for removing perceived echo/reverb by spreading it out across a stereo image instead of it all piling up right behind the voice in mono.

      o_0

      FWIW, several other video producers wrote me to ask how I got such great sound without a lav, and (like you) others wrote to mention that they found the very wide stereo distracting. I'll tinker with it more in the next vid.

      Nobody records dialogue in X-Y stereo professionally, and recording dialogue in any kind of stereo field is exceedingly rare -- we pan it if we want to position it, but generally speaking it's distracting and breaks convention unless you're trying to emphasize two sources in L-R space. Tom Holman was fond of telling us in class that the only major American feature film to ever shoot stereo dialogue was Bette Midler's The Rose, and the experiment died a quick death in dailies. (That crew did win an Oscar though.)

      If you insist on recording in stereo though, you might do as they did, and record with a Mid-Side array and use a matrix to decode back to L-R, so you can control the stereo spread in post-production.
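
      A minimal sketch of that decode matrix, assuming plain sum and difference with a hypothetical `width` parameter scaling the side channel. Gain normalization conventions vary between tools, so treat the scaling as an assumption:

```python
def decode_mid_side(mid, side, width=1.0):
    """Decode Mid-Side samples to left/right.

    width scales the side signal: 0.0 collapses to mono and 1.0 keeps
    the recorded spread. No gain normalization is applied here.
    """
    left = [m + width * s for m, s in zip(mid, side)]
    right = [m - width * s for m, s in zip(mid, side)]
    return left, right

mid = [0.5, 0.2, -0.1]
side = [0.1, -0.3, 0.0]

full_left, full_right = decode_mid_side(mid, side)
mono_left, mono_right = decode_mid_side(mid, side, width=0.0)
print(mono_left == mono_right == mid)  # True
```

      Because the spread is a single multiplier applied after recording, it can be narrowed in post without touching the mid signal.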

      --
      Don't blame me, I voted for Baltar.
    5. Re:Great lesson, but what's with the audio? by xiphmont · · Score: 1

      >If you insist on recording in stereo though, you might do as they did, and record with a Mid-Side array and use a matrix to decode back to L-R, so you can control the stereo spread in post-production.

      That would not have controlled the reverb; the space this was recorded in was a concrete floor with concrete walls and no acoustic treatment. Like I said, it was a tradeoff, and one that was successful if not perfect.

  19. Re:soothing? verarschst Du? by Anonymous Coward · · Score: 0

    There are no stair steps, as the analog filter after the DAC removes them.

    And yes, I am talking about modern oversampling delta sigma converters here. There is still an analog filter, but it's a very gentle slope at ultrasonic frequencies.

  20. Re:soothing? verarschst Du? by Anonymous Coward · · Score: 0

    Also, he doesn't address what really happens in the hardware of the audio codecs. Yes, sampling *theory* means there are no stair steps, but codecs are real devices. They run on discrete clocks and depending on the particular codec implementation, there are layers of non-idealities. So, yes, the stair steps do exist in the music you hear.

    No, they don't. Nature abhors stairsteps. A true step function (infinitely fast rise/fall time) expressed in a physical medium would require an infinite series of large-amplitude harmonics of the fundamental (square-wave) frequency. There is no way to force air to do this.
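
    To put numbers on that, here is a sketch of the Fourier series of a square wave truncated at a Nyquist limit. The 1 kHz fundamental and 22.05 kHz limit are illustrative assumptions. Only a handful of odd harmonics survive, so the result is a rippled approximation rather than a true step:

```python
import math

def harmonics_below(fundamental_hz, nyquist_hz):
    """Odd harmonic numbers whose frequency lies below nyquist_hz."""
    top = int(nyquist_hz // fundamental_hz)
    return [k for k in range(1, top + 1, 2) if k * fundamental_hz < nyquist_hz]

def band_limited_square(t, fundamental_hz, nyquist_hz):
    """Partial Fourier sum of a unit square wave using only in-band harmonics."""
    total = sum(
        math.sin(2 * math.pi * k * fundamental_hz * t) / k
        for k in harmonics_below(fundamental_hz, nyquist_hz)
    )
    return 4 / math.pi * total

print(len(harmonics_below(1000, 22050)))  # 11

# Value in the middle of the positive half cycle: near 1.0, with some ripple.
mid_value = band_limited_square(0.00025, 1000, 22050)
print(abs(mid_value - 1.0) < 0.1)  # True
```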

    "Okay", you say, "maybe there's still a crude approximation of a step function, with higher frequency harmonics naturally attenuated away." But it turns out not even that is true. An essential part of every complete DAC is a "brick wall" analog filter placed immediately after the raw digital-to-analog conversion. Ideally, this filter removes all frequency components higher than 1/2 the sampling frequency. This filtering means you won't find a single one of the harmonics required to approximate sharp stairsteps between samples. They'll all have been smoothed away by the filter.

    This is also known as a "reconstruction filter". It's an essential part of any complete sampling system, as described by the Shannon-Nyquist Sampling Theorem. Which is literally a mathematical proof that it's possible to reconstruct any continuous waveform from discrete samples, within certain limits (the big one: you can only reproduce frequency components less than 1/2 the sampling frequency, and in fact you must filter out anything higher before sampling in the first place).

    The final output of a system as described by the theory is not a discrete series of points. Think of it as being more like a mathematical curve-fit through those points. In real systems, we can only approximate the ideal mathematical constructs described by the Sampling Theorem, and the reconstruction filter is what approximates the curve-fitting. Though it's hard to design theoretically ideal reconstruction filters, the engineers who design audio DACs have refined them to the point where they're damn close to ideal over audible frequency ranges. (This is no new development, either. Nigh-perfect audio DACs have been cheap, mass market parts for about 20 years. Just add competent system design.) So you will not see stairsteps in the output waveform.
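
    The "curve-fit" idea can be sketched with Whittaker-Shannon (sinc) interpolation, the ideal mathematical form of the reconstruction filter. The test signal, sample count, and tolerance below are assumptions chosen for illustration, and the sum is truncated, so the match is close rather than exact:

```python
import math

FS = 44100.0

def signal(t):
    """A band-limited test signal: two sines well below FS / 2."""
    return math.sin(2 * math.pi * 1000 * t) + 0.5 * math.sin(2 * math.pi * 5000 * t)

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

HALF_SPAN = 1000
indices = range(-HALF_SPAN, HALF_SPAN + 1)
samples = [signal(n / FS) for n in indices]

def reconstruct(t):
    """Sinc interpolation, truncated to the stored samples."""
    return sum(s * sinc(t * FS - n) for n, s in zip(indices, samples))

# Evaluate between sample instants, near the middle of the stored span.
offsets = (0.25, 0.5, 3.7, -12.31)
worst = max(abs(reconstruct(k / FS) - signal(k / FS)) for k in offsets)
print(worst < 1e-2)  # True
```

    The values between samples come back as a smooth curve matching the original, with no steps anywhere in the output.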

    No matter how confident he seems to you, take your grain of salt and caveat emptor.

    Advice people would do well to apply to your words. You're trying to project an image of confidence to prop up an argument which can only be made from ignorance. IMO, the reason you "find his delivery smug" (and so forth) is that he's challenging your presumptions, and you don't like it.

  21. Re:soothing? verarschst Du? by Anonymous Coward · · Score: 0

    Sure, you don't let someone treat their cancer with homeopathy, but why stop them from treating an upset stomach, migraines or some other such non-fatal malady? We take medicine for a lot of problems and very few of those problems can be fatal.

    Because the people who promote things like homeopathy reliably over-sell their product's efficacy, and thereby discourage people who need real medical attention from getting it.

  22. Re:soothing? verarschst Du? by Anonymous Coward · · Score: 0

    Because it mightn't be "upset stomach" or a "migraine", it might be internal bleeding or brain cancer, for example.

  23. Re:soothing? verarschst Du? by iluvcapra · · Score: 1

    Yes, sampling *theory* means there are no stair steps, but codecs are real devices.

    You're absolutely right, and real devices are more or less prone to some distortion when converting digital signals.

    However, this must be determined empirically, and is beyond the scope of a discussion of digital signals. As a rule, the DSP developer gets his algorithm right, and getting those Signed Ints to sound good is strictly the client's responsibility.

    --
    Don't blame me, I voted for Baltar.
  24. All I have to say is "that was awesome" by Anonymous Coward · · Score: 0

    The "lollipop" as opposed to "stairstep" representation of a digital soundwave really clarifies things.

  25. Re:soothing? verarschst Du? by wiredlogic · · Score: 2

    SACDs are encoded with 1-bit sigma-delta pulses. Those would be some pretty whopping staircases if sampling theory didn't work to eliminate them.
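
    As a toy illustration, the sketch below runs a first-order sigma-delta modulator and then low-pass filters the 1-bit stream with a moving average. Real DSD modulators are higher order and run at a much higher rate; the period and window length here are arbitrary assumptions. The output stream only ever takes two values, yet its local average tracks the input:

```python
import math

def sigma_delta(samples):
    """First-order sigma-delta modulator: emits a stream of +1/-1 values."""
    bits = []
    integrator = 0.0
    feedback = 0.0
    for x in samples:
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits

PERIOD = 4096   # samples per cycle of the heavily oversampled input sine
WINDOW = 256    # length of the moving-average low-pass filter

source = [0.5 * math.sin(2 * math.pi * n / PERIOD) for n in range(2 * PERIOD)]
bits = sigma_delta(source)

def window_mean(seq, start):
    """Average of WINDOW consecutive values starting at index start."""
    return sum(seq[start:start + WINDOW]) / WINDOW

worst = max(
    abs(window_mean(bits, s) - window_mean(source, s))
    for s in range(0, len(source) - WINDOW, 64)
)
print(sorted(set(bits)), worst < 0.03)  # [-1.0, 1.0] True
```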

    --
    I am becoming gerund, destroyer of verbs.
  26. Re:soothing? verarschst Du? by Anonymous Coward · · Score: 0

    You could potentially make an ADC that did produce stair step samples but it would be a stupid thing to do.

    That's called a NOS (non-oversampling) DAC and is sold for substantial amounts of money to audiophools who have been convinced that "filters are evil".

  27. It's about the sound stupid! by Anonymous Coward · · Score: 0

    There's no talking to trolls because those who have heard master tape know what I am saying here. 16/44.1 is definitely not enough to reconstruct a proper analogue signal. The manufacturers themselves knew this and that standard was a bare minimum. However much one might try to justify this with numbers and figures is irrelevant. The manufacturers themselves knew it and that's why there was so much work in trying to improve on that standard.

  28. The value of science by xiphmont8352 · · Score: 1

    On that topic, while I was filming the epilogue I started feeling ill and ignored it for a while, but eventually went to the emergency room. Three hours later I had an emergency appendectomy. Medical science also rocks. If I'd gone to a faith healer, I'd be dead. (FTR, the epilogue footage in the final vid was from three days after the appendectomy)

  29. It sure is! by xiphmont8352 · · Score: 1

    Extensive testing reveals that no matter the wailing and gnashing of teeth claiming digital sucks, no one can actually tell the difference in a double-blind test.

    Theory is indeed useless when reality doesn't agree. Every real scientist knows this. An engineer's job depends on it. I invite audiophiles to add that bit of wisdom to their own thinking.