Why Distributing Music As 24-bit/192kHz Downloads Is Pointless

Posted by Soulskill on Monday March 5, 2012 @03:08PM from the but-all-those-bits-sound-so-good dept.

An anonymous reader writes "A recent post at Xiph.org provides a long and incredibly detailed explanation of why 24-bit/192kHz music downloads — touted as being of 'uncompromised studio quality' — don't make any sense. The post walks us through some of the basics of ear anatomy, sampling rates, and listening tests, finally concluding that lossless formats and a decent pair of headphones will do a lot more for your audio enjoyment than 24/192 recordings. 'Why push back against 24/192? Because it's a solution to a problem that doesn't exist, a business model based on willful ignorance and scamming people. The more that pseudoscience goes unchecked in the world at large, the harder it is for truth to overcome truthiness... even if this is a small and relatively insignificant example.'"

Re:I can tell the difference by Aboroth · 2012-03-05 15:22 · Score: 5, Informative

You are missing the point of the article. 192 kHz (a sampling rate) is not 192 kbps (a bitrate).
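To put rough numbers on that distinction, here is a minimal sketch of the raw PCM bitrate for the formats under discussion. The helper name `pcm_bitrate_kbps` and the assumption of two channels (stereo) are mine for illustration, not something from the article.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> float:
    """Raw (uncompressed) PCM bitrate in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


# Assumed stereo in both cases.
cd_quality = pcm_bitrate_kbps(44_100, 16)    # 16-bit/44.1 kHz
hi_res = pcm_bitrate_kbps(192_000, 24)       # 24-bit/192 kHz

print(f"16/44.1: {cd_quality:.1f} kbps")
print(f"24/192:  {hi_res:.1f} kbps")
```

Under these assumptions a 24/192 stream comes out at several thousand kbps, nowhere near a 192 kbps lossy file.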
Re:The article writer is a deaf idiot by Sparohok · 2012-03-05 15:43 · Score: 5, Informative

A group of sixty audio professionals and audiophiles did a series of controlled double-blind trials published in the Journal of the Audio Engineering Society. They found no perceptible degradation caused by a 16-bit/44.1kHz A/D/A.
http://www.aes.org/e-lib/browse.cfm?elib=14195
Re:Can we stop using the word "truthiness," please by xiphmont · 2012-03-05 15:53 · Score: 5, Informative

Truthiness refers to a specific kind of lie: a lie that sounds true, and that a large segment of people really want to be true. The kind of thing that's close enough to true for AM radio talk show hosts.
And now... I'll get off your damned lawn. Don't forget to take your teeth out before falling asleep.
Re:Pro recording by Bassman59 · 2012-03-05 16:26 · Score: 5, Informative

I recently remixed a classic recording for Sony Records. The files were rolled off of tape at 24-bit/96k. 48k I can understand, but 96k is pointless. WAAAAAAY beyond the range of human hearing. In the old days, things like cymbals and brass could really stick out because the encoders and decoders were just not where they are today.
Anyone that tells you they can hear the difference between 48k and 96k is dreaming. It's the quality of the recording that counts more than anything these days.
The difference is that the antialiasing filters are much simpler and have a gentler roll-off when sampling at 96kHz. The high-order filters necessary to ensure adequate attenuation at Nyquist and above when sampling at the lower rates have this tendency to ring.
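As a rough illustration of why the filter can be gentler at the higher rate, the sketch below compares the room between an assumed 20 kHz passband edge and the Nyquist frequency. The 20 kHz edge and the helper name `transition_band` are assumptions for illustration only; real converter filters are specified in more detail than this.

```python
import math


def transition_band(fs_hz: float, passband_edge_hz: float = 20_000.0):
    """Return (width in Hz, width in octaves) between the passband edge and Nyquist."""
    nyquist_hz = fs_hz / 2
    width_hz = nyquist_hz - passband_edge_hz
    width_octaves = math.log2(nyquist_hz / passband_edge_hz)
    return width_hz, width_octaves


for fs in (44_100, 48_000, 96_000):
    hz, octaves = transition_band(fs)
    print(f"Fs = {fs} Hz: transition band {hz:.0f} Hz wide ({octaves:.2f} octaves)")
```

With these assumptions the filter has roughly 2 kHz (about 0.14 octave) to reach full attenuation at 44.1 kHz, versus 28 kHz (about 1.26 octaves) at 96 kHz, which is why a much lower-order filter can do the job.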
Re:44KHz by tftp · 2012-03-05 17:47 · Score: 5, Informative

There may be no theoretical benefit, but since there's no such thing as an ideal sampler or filter or quantiser, it has many practical benefits.
Here is a quick example. You sample at 44 kHz. The first Nyquist zone is from 0 to 22 kHz, the second one is from 22 to 44 kHz (with flipped spectrum.)
Now, say that some [mechanical] harmonic from some instrument has a frequency of 33 kHz. We don't hear those with our ears (parts of the ear are too massive to vibrate fast enough) so no harm done. The orchestra is playing as usual.
But now record this orchestra with an imperfect antialiasing filter (there are reasons why a perfect one wouldn't do you much good anyway.) The 33 kHz harmonic falls into the 2nd Nyquist zone, 11 kHz above the 22 kHz fold, so it will be played back as if it were 22 kHz - 11 kHz = 11 kHz. Can you hear 11 kHz? Most people hear it just fine. Think about it for a moment. There was no 11 kHz signal in the original spectrum; there was 33 kHz, an inaudible one. The artifact showed up because a [lossy] mathematical operation was performed on the data that describes the signal. The resulting distortion produced an audible tone where none was present originally.
However, if you encode at, say, a 128 kHz sampling rate, things change. First, the antialiasing filter - even if it is of the same architecture - will have its cutoff way below Fs/2. This means that signals of the second Nyquist zone will be attenuated by many tens of dB - essentially they can be completely eliminated because nobody cares what you do to ripple and phase above 30 or 40 kHz. Second, for the alias to show up it has to be in the LF radio band now, starting at 128 kHz. Microphones aren't even mechanically capable of picking up those frequencies. And finally, if that 33 kHz harmonic passes through the filter (with the same mediocre attenuation as in the first example) ... it will be played back as 33 kHz, and it won't go anywhere. The amplifier will filter it, and the speakers will attenuate it greatly. In other words, a serious distortion that was present when you are sampling at 44 kHz disappears when you are sampling at a much higher rate.
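The folding arithmetic in this example can be sketched in a few lines. This assumes ideal sampling with no antialiasing filter at all (the worst case), and the function name `alias_frequency` is just an illustrative choice.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Apparent frequency of a tone at f_hz after ideal sampling at fs_hz
    with no antialiasing filter (worst-case assumption)."""
    f_mod = f_hz % fs_hz                  # position within one Fs-wide period
    return min(f_mod, fs_hz - f_mod)      # fold back into the 0..Fs/2 zone


print(alias_frequency(33_000, 44_000))    # the 33 kHz harmonic sampled at 44 kHz
print(alias_frequency(33_000, 128_000))   # the same harmonic sampled at 128 kHz
print(alias_frequency(117_000, 128_000))  # what it takes to land on 11 kHz at 128 kHz
```

At 44 kHz the 33 kHz tone folds to 11 kHz. At 128 kHz it stays at 33 kHz, and a source would have to sit around 117 kHz to fold onto 11 kHz.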
Re:The article writer is a deaf idiot by DMUTPeregrine · 2012-03-05 18:18 · Score: 5, Informative

My last hearing test showed that I can hear up to 21 kHz. I play Tin Whistle, Great Highland Bagpipe, Ceilidh Pipe, and Guitar. I have heard the rattle of a live sax. I have heard a delicate triangle ringing out over a live orchestra. I have heard live trumpet. I've spent quite a bit of time training my ears to hear those sounds.

I have consistently failed to find a difference between the following in ABX tests I have run:
192/24 and 44/16 .wav
96/24 and 44/16 .wav
44/16 .wav and FLAC, encoded with the FLAC reference encoder
My reference tracks have been Pink Floyd's "Time", Sirenia's "Meridian", and Bach's "Herz und Mund und Tat und Leben" part 7 conducted by Nikolaus Harnoncourt.
The reference system was a PC with an Asus Xonar Essence sound card, a Rogue Audio Perseus pre-amp, a pair of Rogue M-180 monoblock power amps, and Vandersteen Signature 2ce speakers. (My father's sound system and my PC).

Of course, msobkow will claim that since I like Highland Bagpipes my hearing is inferior, and I can't hear the differences because he's better than me.

That said, I do like having music in 192/24. Why? Because I can play with it. I can edit it, there's more headroom. If I feel that "Another Brick in the Wall" just needs a tin whistle part, well, I'll have an easier time editing it in without distortion. But for listening? Nope.

--
Not a sentence!
It's also called a factoid by tkrotchko · 2012-03-05 20:19 · Score: 5, Informative

Many people think a "factoid" is a small fact. Actually a factoid is something that sounds true, but is actually false.

--
You were mistaken. Which is odd, since memory shouldn't be a problem for you
No smooth by DrYak · 2012-03-05 23:00 · Score: 5, Informative

The higher the sampling rate, the smoother the signal.
Well... no. There's enough information in a curve sampled at a lower rate. As TFA explains, the output isn't "jagged" when played back in analog.

Human perception-wise, an audio signal recorded at a 96 kHz sampling rate might well be indistinguishable from one sampled at 192 kHz
As explained in the article:
- Yup, the human ear won't hear anything above 20 kHz, because it doesn't have any receptors for that.
But there are some real-world problems that come into the mix. No audio installation is perfect. You always get distortions.
- Thus, a 192 kHz sampled file could contain frequencies up to 96 kHz. These are sounds which can't be heard in theory. In practice, if you throw 96 kHz frequencies at a sub-optimal speaker, the speaker can barf a lot of distortion, including distortion below 20 kHz. So not only are you trying to output a sound that can't be heard, but you force the speaker to produce bad noise *which* is audible.

But my thinking is that future technologies might let you do interesting things with the extra bit of data which is useless to us right now.
Hard to do anything with those bits at all. We simply lack the anatomic features to do anything with them. Unless you do something like transpose everything to lower frequencies (slow down everything 2x = move everything 1 octave lower). At which point you aren't really outputting the original sound anymore. You're simply using the data to produce new sounds that weren't there to begin with.
The only practical use-case for this would be zoologists studying animals whose sounds are beyond the human hearing range. In that case "moving everything a couple of octaves down" would help the scientist have an approximation with which he can work (to find rhythms or other variations that are inaudible in the original frequency range). But that has nothing to do with hearing music made by humans, for humans, with instruments designed for human hearing ranges.

Kind of like with digital pictures which are too noisy or blurred, but which might be cleaned up with future algorithms to give us a slightly more useful picture.
The situation with pictures is slightly different. What you're speaking about is spatial frequency. I.e.: resolution.
And human eyes can perceive way more than some blurry low-res pictures. And in addition to that, there's this thing called zooming, which makes it perfectly sensible to record pictures at higher resolution. Because looking at details is simply looking at the same picture at another scale.
The "visual equivalent" to 192 kHz sound would be recording colours outside the human range. Like recording also infra-red, microwaves, ultraviolet, and X-rays.
Things that can't ever be seen, because humans lack the corresponding apparatus. The only way to get something out of this extra data would be to transpose it into the visible domain, i.e. use pseudo-colours to display levels of low infrared (heat), etc.
Just like the "zoologist" use-case above, there are a lot of scientific use-cases where that could actually make sense (as an example, think about all the data collected by astronomers).
But in no way is it useful to record X-rays to enjoy a painting by some known artist. The painting was done by a human painter, for a human public, using colours chosen for their effect on an un-aided human visual system, disposed on a canvas in a way which is pleasing to the eyes.
(Well, okay. I know that some scientists use infra-red or X-ray images of paintings to analyse how they were done, what the layers underneath are, or if there's even another picture over which the current one was painted. But these are scientists analysing the paint, so we're again on the "scientific analysis" use-case).
24/192 makes sense as an intermediate format to avoid rounding errors, aliasing during filtering, etc.
There could be also some scientific value to keeping

--
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Re:Pro recording by scary_jeff · 2012-03-06 01:07 · Score: 5, Informative

I also spent 4 years studying an EE degree, and although it was not especially focused on signal processing, I now work for a large pro audio company.

Some of the issues pointed to in this and other posts regarding oversampling and AA filters are not really relevant to the subject at hand, given the technology currently in use. A statement like 'oversampling at 192 kHz' shows a lack of knowledge regarding the kinds of audio converters that have been in use for a good while now. A Delta Sigma ADC running with an Fs of 48 kHz might often be oversampling at 3.072 MHz or 6.144 MHz. Anti aliasing filters that many people have mentioned are implemented digitally inside the converter (no need for external analog filters, which may well exhibit many of the problems mentioned), and actually have extremely good pass band ripple.
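For a sense of scale, the oversampling ratios implied by those modulator rates can be worked out directly. This is only arithmetic on the figures quoted above; the helper name `oversampling_ratio` is hypothetical and no particular converter is being modelled.

```python
def oversampling_ratio(modulator_rate_hz: float, output_rate_hz: float) -> float:
    """How many modulator samples are taken per output sample."""
    return modulator_rate_hz / output_rate_hz


# Modulator rates quoted above, for an output Fs of 48 kHz.
for modulator_rate in (3_072_000, 6_144_000):
    ratio = oversampling_ratio(modulator_rate, 48_000)
    print(f"{modulator_rate / 1e6:.3f} MHz / 48 kHz = {ratio:.0f}x oversampling")
```

So a converter delivering 48 kHz output is already sampling the analog input at 64 or 128 times that rate internally, far above 192 kHz.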

Look at datasheets for converters from manufacturers such as TI (Burr-Brown), Cirrus [page 36 here has detailed plots of the 48, 96, and 192 kHz passband characteristics for the device, highlighting the fact that increasing the sampling rate does not improve passband ripple for this device (also note the scale is 0.02 dB/div)], AKM, and Wolfson Micro. You will find passband responses that are flat to within less than +/- 0.05 dB over the audible range, and stop band attenuation in excess of 100 dB, whether sampling at 48 kHz or 192 kHz. If you can find anything in actual converter datasheets that points to better converter performance from selecting a higher sampling rate, I would be interested to see it.

All in all, the basics of sampling theory don't really help people to understand the real-world issues in designing a modern high-end audio device. And in the end, surely the proof of the pudding is in the blind tests, which never seem to show that anybody can tell any difference when moving to higher rates? Even if there were a few people who could hear this difference in some perfect listening environment, would it really make sense for everyone else to go out and buy 192 kHz equipment?
Re:Can we stop using the word "truthiness," please by Anonymous Coward · 2012-03-06 03:07 · Score: 5, Informative

At that sample rate a 15kHz tone has only three samples. With only three samples there's no way to accurately draw the waveform. With three samples there's no way to discern between a sine wave, a square wave, or a sawtooth wave.
I wish you guys would get this right. There is absolutely no way you can tell the difference between a 15kHz sine wave, square wave, or sawtooth wave (apart from amplitude, perhaps).
Sawtooth waves have even and odd harmonics, and square waves only have odd ones. This means that the first overtone (2nd harmonic) of a 15 kHz sawtooth wave would be at 30 kHz, and the square's 3rd harmonic would be at 45 kHz. As you pointed out, even if you could hear them, you'd have to have damn good speakers to reproduce them.
Three samples is enough to reproduce the 15kHz fundamental per Nyquist.
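A small sketch makes the point concrete by listing which harmonics of a 15 kHz tone fall below a given limit, counting the fundamental as harmonic n = 1. The 22.05 kHz limit assumes a 44.1 kHz sample rate, and the helper name `harmonics_below` is hypothetical.

```python
def harmonics_below(fundamental_hz: int, limit_hz: float, odd_only: bool = False) -> list:
    """Harmonic frequencies n * fundamental that lie strictly below limit_hz.

    odd_only=False models a sawtooth (all harmonics),
    odd_only=True models a square wave (odd harmonics only).
    """
    step = 2 if odd_only else 1
    harmonics = []
    n = 1
    while n * fundamental_hz < limit_hz:
        harmonics.append(n * fundamental_hz)
        n += step
    return harmonics


nyquist_hz = 44_100 / 2  # assumed CD-rate sampling

print("sawtooth:", harmonics_below(15_000, nyquist_hz))
print("square:  ", harmonics_below(15_000, nyquist_hz, odd_only=True))
```

For a 15 kHz tone both lists contain only the fundamental, so after band-limiting the sawtooth and the square are each just a 15 kHz sine (differing, at most, in amplitude).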