Sony Super CD: More Bits, More Bucks, Mo' Betta?
Reader dcigary pointed to this "nice writeup on the new Sony Super CD." Though the explanation of the difference between supposedly revolutionary "DSD" recording and conventional digital seems to get by with a knowing mumble, the piece does mention the price (high) and that competition from audio-only DVDs may cripple acceptance of the new format. Even if I like the idea of ultra-fidelity, my faith in the Nyquist theorem is too strong to spend a grand and a half on a CD player anytime soon ...
Virtually all audio A/D and D/A converters today use sigma-delta, also commonly referred to as "one-bit" conversion.
In a sigma-delta A/D converter the audio signal is sampled with a high sampling frequency (typically a few MHz) and low sample resolution (1 bit). An error feedback mechanism is used to ensure that most of the energy of the quantization noise is "shaped" into high frequencies, giving excellent fidelity in the audio band. One bit is inherently linear - no need for carefully matched resistor networks such as those used on older A/D converters. This stream is then filtered and decimated using digital signal processing techniques to a lower sampling rate (e.g. 44100Hz) while gaining sample depth on the way (16 bits and higher).
For D/A conversion the process is reversed: the 44100Hz signal is interpolated up to a high sampling rate and then the sample depth is reduced down to one bit. Again, error feedback is used to ensure that the quantization noise resulting from the low resolution is shaped to high frequencies. This bitstream is then low-pass filtered and used as the audio signal. Again, with much better linearity than D/A converters based on carefully adjusted resistor networks.
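For anyone who wants to poke at the idea, here is a minimal sketch of a first-order one-bit modulator with error feedback, followed by a deliberately crude averaging decimator. It is an illustration only: real converters use higher-order loops and proper decimation filters, and the function names, the 64x oversampling ratio and the 1kHz test tone are all assumptions of mine, not anything from Sony's design.

```python
import math

def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator: map samples in [-1, 1] to +1/-1 bits."""
    integrator = 0.0
    feedback = 0.0  # previous one-bit output, fed back as the error reference
    bits = []
    for x in samples:
        integrator += x - feedback  # accumulate the error between input and last output
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits

def decimate_by_averaging(bits, ratio):
    """Crude low-pass filter and decimator: average each block of `ratio` bits."""
    return [sum(bits[i:i + ratio]) / ratio
            for i in range(0, len(bits) - ratio + 1, ratio)]

OVERSAMPLING = 64                      # hypothetical ratio, chosen for illustration
FAST_RATE = 44100 * OVERSAMPLING       # one-bit sample rate, about 2.8MHz
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / FAST_RATE)
        for n in range(OVERSAMPLING * 441)]
bits = sigma_delta_1bit(tone)          # the noise-shaped one-bit stream
pcm = decimate_by_averaging(bits, OVERSAMPLING)  # multi-level samples at 44100Hz
```

The one-bit stream only ever contains +1 and -1, yet its local average follows the input, which is the property the decimation filter relies on.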
The Sony SACD skips the decimation and interpolation stage. It stores the noise-shaped bitstream directly on the disc. The beauty of this idea is in its simplicity: it performs far fewer transformations on the sigma-delta signal and therefore should offer inherently higher fidelity and wider bandwidth.
If sigma-delta converters had been available 20 years ago when the CD was invented, its designers would probably have chosen this method for its simplicity. But at that time the only known conversion technique was resistor networks, so PCM was used.
Remember that at the time the CD was really stretching the limits of consumer technology. No other consumer product prior to the CD player used so many new and advanced technologies: lasers, error correction, digital signal processing. If they could have used this technique it would have reduced the cost of CD players significantly. For example, this bitstream is much more tolerant to bit errors because unlike PCM there is no "most significant bit" that can cause a large error if corrupted.
Using this technique today, though, is insane. There are no real savings in simplicity when a million digital transistors cost close to nothing. If you want higher fidelity, 96kHz and 24 bits is more than enough.
Let's say you want something simple like a graphic equalizer on your SACD player. If it's analog such a complex circuit will introduce lots of noise. If you implement it digitally it would take insanely large amounts of CPU power to process a signal sampled at over 2MHz. Manufacturers will probably end up downconverting it to PCM at 96kHz or lower, doing the signal processing and then converting back to sigma-delta for playback. This will lose all of DSD's alleged advantages.
BTW, for the purpose of preserving analog masters DSD is really a good idea because those masters contain useful information at very high frequencies such as the tape bias signal and the intermodulations it creates. Preserving this information will allow future signal processing techniques to create accurate models of the nonlinearities of the magnetic medium and use this high frequency information to reconstruct the original recording with better fidelity down in the audio band. For home use SACD is a very bad idea. Just about the only good thing I can see about it is that it can be marketed effectively because it's such a "radical new concept".
DVD-Audio uses conventional, well-proven PCM with a somewhat higher sampling frequency and bit depth than CD. Why use a higher sampling frequency when we can't hear over 20kHz? It turns out that while we can't hear a sine wave at frequencies higher than 20kHz, the high-frequency components of complex waveforms make a noticeable difference even up to 26kHz. To leave a good safety margin and maintain an integer ratio, a 96kHz sampling rate was used. This does not significantly hurt the data rate required because non-lossy compression is used on DVD-Audio. A compressed 96kHz signal takes about 30% more space than a compressed 48kHz signal. 16 bits is, again, almost enough. In fact, with proper in-band noise shaping the noise floor is inaudible in all but very extreme circumstances. 24 bits is therefore a very good safety margin.
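For scale, here is the raw (uncompressed) PCM arithmetic as a small sketch. It assumes two channels and says nothing about the compressed figures quoted above; the function name is just for illustration.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

cd_rate = pcm_bit_rate(44_100, 16, 2)      # CD audio, stereo
dvd_a_rate = pcm_bit_rate(96_000, 24, 2)   # 96kHz / 24-bit, stereo (assumed)

print(cd_rate)                  # 1411200
print(dvd_a_rate)               # 4608000
print(dvd_a_rate / cd_rate)     # a bit over 3x the raw CD rate
```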
Another reason why DVD-Audio is superior is that it supports Ambisonics. Ambisonics is a surround sound system. It was not created for cinematic effects. Ambisonics was designed for music and for reconstructing the subtle spatial cues of the ambience of the recording venue. With a proper arrangement of speakers it can create true 3D sound - including the height dimension. Imagine listening to a recording and feeling the height of the concert hall!
Please never ask "how many channels does Ambisonics use" because it's not a relevant question. Ambisonics deconstructs the 3D sound field mathematically using a four-component representation (XYZW). This representation can be processed with a simple linear matrix for playback on different speaker configurations and numbers of channels, with varying levels of fidelity of 3D soundfield reconstruction. This includes the popular 5.1 setup used on home theaters (it's probably going to be the default setting for DVD-Audio players).
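To make the "simple linear matrix" concrete, here is a rough sketch of first-order encoding into W, X, Y, Z and a very basic decode to an arbitrary speaker layout. It assumes the traditional convention in which W is scaled by 1/sqrt(2); the plain projection decode and the divide-by-speaker-count normalization are simplifications I picked for illustration, and real decoders are more refined.

```python
import math

def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode a mono sample arriving from a direction into (W, X, Y, Z)."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                 # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)    # front-back
    y = sample * math.sin(az) * math.cos(el)    # left-right
    z = sample * math.sin(el)                   # up-down
    return w, x, y, z

def decode_to_speakers(wxyz, speaker_directions):
    """Basic projection decode: one linear combination of W, X, Y, Z per speaker."""
    w, x, y, z = wxyz
    feeds = []
    for az_deg, el_deg in speaker_directions:
        az, el = math.radians(az_deg), math.radians(el_deg)
        gain = (math.sqrt(2.0) * w
                + x * math.cos(az) * math.cos(el)
                + y * math.sin(az) * math.cos(el)
                + z * math.sin(el))
        feeds.append(gain / len(speaker_directions))  # illustrative normalization
    return feeds

# A source straight ahead, played back on a square of four speakers.
square = [(45, 0), (-45, 0), (135, 0), (-135, 0)]
feeds = decode_to_speakers(encode_b_format(1.0, 0.0), square)
```

The same four components can be passed to `decode_to_speakers` with a different list of directions, which is why the channel count of the recording is not tied to the number of speakers.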
DVD-Audio is also backward compatible with DVD players although a DVD-audio player will be required to take advantage of all the features and full quality.
More information about DVD-Audio here
----
Stop worrying about the risks of nuclear power and start worrying about the risks of not using nuclear power.
A few articles ago, someone suggested that Malda & company implement a "Just plain wrong" moderation option.
It's mathematically provable (and the end result of said proof is the Nyquist theorem) that all you need to do is sample a signal at 2x its maximum frequency, and you can recover that signal EXACTLY.
Sample the signal, later pass that sampled signal through a low-pass filter. Afterwards, the only difference between the final signal and the original is a constant multiple. (For example, if you sample with infinite impulses, which have infinite height and zero width but an area of one, the end signal is 1/T (where T is the period) times the original.) The aforementioned technique is "ideal sampling" - which never occurs in the real world. With practical sampling, you get a different multiple than 1/T. But the end result is the same.
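If you'd rather see it than take the proof on faith, here is a small numerical sketch. It samples a 15kHz sine at 44.1kHz, then rebuilds values between the sample instants with an ideal low-pass filter, written as a sum of sinc pulses. The sum has to be truncated to a finite number of samples, so the match is close rather than exact; the tone frequency, phase and span are arbitrary choices of mine.

```python
import math

def sinc(u):
    """Normalized sinc, the impulse response of an ideal low-pass filter."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, first_index, t):
    """Ideal low-pass reconstruction at time t, measured in sample periods."""
    return sum(s * sinc(t - (first_index + k)) for k, s in enumerate(samples))

SAMPLE_RATE = 44100.0
TONE_HZ = 15000.0
HALF_SPAN = 2000   # samples kept on each side of t = 0 (truncates the infinite sum)

def tone(t):
    """The original continuous signal; t is measured in sample periods."""
    return math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE + 0.3)

samples = [tone(n) for n in range(-HALF_SPAN, HALF_SPAN + 1)]

# Compare the rebuilt signal with the original at points between samples.
worst_error = max(abs(reconstruct(samples, -HALF_SPAN, t) - tone(t))
                  for t in (0.1, 0.25, 0.5, 0.75, 0.9))
print(worst_error)
```

With the normalized sinc used here the constant multiple works out to 1, and the remaining error shrinks as `HALF_SPAN` grows.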
retrorocket.o not found, launch anyway?
The way the Sony thing works is like this: at each point the encoder asks 'am I above or below the actual wave form?' If it's above, it reduces the angle of its recording by X degrees. If it's below, it increases it by that much. It is constantly overshooting and crossing over the actual waveform- at about 2 megahertz, not some mere 96K. At no point does it record the signal voltage itself- it only follows the changes in angle, at a rate so high that it's way beyond anything that will be recorded, and it's kind of 'lossy' as it'll almost never be _exactly_ on the target waveform- the oscillation of it tracking the waveform will be higher than 96K anyhow and it will be a _sine_ oscillation, not the square-stepped, nasty distortions of raw PCM sampling, so it won't even need to be filtered.
I'd love to see this actually catch on- as far as sound is concerned, nothing else should be needed. The interesting thing is this- could this format be _synthesised_ digitally? I'm picturing some future sort of audio workstation where you have all the modern gimmicks like pitch correction, EQ plugins etc, but you never resample anything- just overlay all the different sample rates you end up with, seamlessly :)
It would be interesting to see the venerable Yamaha DX-7 redone with this technology for its audio outputs! :)
On a somewhat-related note, it is remarkably interesting what effect a more accurate clock signal has on the quality of a 44.1KHz recording
Thank you, thank you. This phenomenon (sometimes known as "clock jitter") also explains, in large part, the age-old argument as to why digital-to-digital copies are not always perfectly identical, despite the notion that "it's only copying numbers/bits, it has to be perfect". Any digital recording references a time base, and any variation in that time base skews the way the audio sounds when you play it back. Commercial recording studios pay very large sums for centralized, highly accurate clock sources to which each piece of equipment that handles a digital audio stream is synced.
Arny Kruger has a lot of misgivings about this method. First of all, it is a lot harder to code DSP for this instead of PCM. Second of all, there are apparently problems with the high-frequency artifacts that this encoding technique produces.
I also have a lot of misgivings about this method. Personally, I think the average listener thinks 16/44.1 is good enough, and has no need to listen to something at an even higher bit rate. The popularity of MP3s indicates that 16/44.1 is better sounding than what the average consumer needs.
- Sam
The secret to enjoying Slashdot is to realize that it should not be taken too seriously.
Even if I like the idea of ultra-fidelity, my faith in the Nyquist theorem is too strong to spend a grand and a half on a CD player anytime soon ...
You've got it all wrong. You see, humans have an approximate range of hearing between 20Hz and 20KHz (assuming no hearing loss). Now, Nyquist theory says that you must sample at twice the frequency of the highest frequency you wish to preserve. We use 44.1 KHz for this. Sounds good so far, right?
Well, what many neglect to mention about Nyquist theory is that you must run the resulting output through a filter. The filter, according to the theory, is a brick-wall filter. Of course, these things don't exist. Filters have a roll off. As a result, people invented the concept of oversampling. This way, you move the sampling frequency way above 44.1 KHz, and you can put a filter in at, say, 100 KHz. Nice, right? Wrong. Filters have audible effects *well* below their -6dB level.
That said, there's still another problem. People have an approximate dynamic range for their hearing of 120dB. Using 16 bit samples (like CDs), you end up with only 96dB of theoretical dynamic range. So people invented the concept of running a low level noise input when digitizing. This ends up pushing the dynamic range above 96dB. This is how most modern CD players can claim a dynamic range of about 102dB. Sounds good, right? Wrong. You're increasing the dynamic range, but you're also increasing the noise. This is not good.
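The 96dB figure is just the ratio between the largest and smallest step a sample word can represent. A quick sketch of that arithmetic (the function name is mine, and this ignores dither and noise shaping entirely):

```python
import math

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in decibels."""
    return 20.0 * math.log10(2 ** bits)

print(round(dynamic_range_db(16), 1))   # 96.3
print(round(dynamic_range_db(24), 1))   # 144.5
```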
Of course, for 1980, CDs were pushing the limits of technology. Now they're not. Now we have DVDs. With DVDs and compression, you can get 5 channels of sound digitized with 24-bit sampling at a sample rate of 96kHz. Now that kicks ass. Of course, all the different parties messed around with the standards committees long enough to pretty much kill DVD-Audio (it was finally released a few months ago, but there is way too little material released under the format).
So, if you want a cheap CD player that truly sounds good, I recommend you get a DVD player, and listen to music on DVDs or DVD-As. Even a cheap one can have a pretty poorly built filter and sound OK. Of course, cheaper D2A converters have their own problems, like jitter. But that's a story for another time.
Look, if you don't believe me about this quality issue, go to your local high-end dealer (and I don't mean one that carries Denon and Yamaha...I'm talking about equipment like Wadia and Krell), and ask them to do some listening tests with a Krell compared to your $189 Technics (or whatever). If you don't hear a difference, then you either have hearing loss or you aren't used to paying attention to sound quality. Like many elements of perception (sight, hearing, etc), the more you work the sense, the more acute it becomes.
--Be human.
ANY signal, especially a periodic signal, can be represented as a sum of sine waves. (This is the basis of the Fourier transform.)
What is a 15 kHz square wave? It's a 15 kHz sine wave plus sine waves at multiples of 15 kHz. The human ear can't hear those multiples, hence can't tell the difference.
What's my point? My point is that if you sample at 44.1 kHz, you catch ALL of the portions of your signal that are human-hearable. Yes, a human can tell the difference between a 500 Hz sine and a 500 Hz square - that's because there are many harmonics within the human hearing range. But I can assure you, if you play a 15 kHz sine wave and a 15 kHz square/triangle/sawtooth wave, you won't be able to tell the difference.
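A quick sketch of the harmonic bookkeeping for the square-wave case, assuming an ideal square wave (odd harmonics only, amplitude falling as 1/k) and a 20 kHz hearing limit. Both the function name and the limit are illustrative assumptions.

```python
def square_wave_harmonics(fundamental_hz, limit_hz=20_000):
    """Odd harmonics of an ideal square wave at or below limit_hz.

    Returns (frequency, amplitude relative to the fundamental) pairs.
    """
    harmonics = []
    k = 1
    while k * fundamental_hz <= limit_hz:
        harmonics.append((k * fundamental_hz, 1.0 / k))
        k += 2  # an ideal square wave has no even harmonics
    return harmonics

print(len(square_wave_harmonics(500)))     # 20 components inside the audio band
print(square_wave_harmonics(15_000))       # [(15000, 1.0)], only the fundamental
```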
retrorocket.o not found, launch anyway?
A lot of what you (nathanh) say is true. But it's clear you don't have the best ears on the planet--do more music listening, even on mediocre equipment, and you'll hear the things you claim can't be heard.
Quit getting so excited about what you know. Just because you've read about signals somewhere doesn't mean you understand human auditory function. And remember that we don't really understand any of this stuff--we're just constructing models. If I can hear what you say I can't, then your _model_ is wrong, not my ears. After all, the models are trying to conform to reality, not vice-versa as you seem to argue.
I am a mathematics grad student, familiar with some signal processing, and critically inclined. I also am passionate about music. I find your comment "Most audiophiles are full of shit" offensive--I know several audiophiles and none of them are full of shit. Some are musicians, some are electrical engineers. All of them have highly trained ears.
I don't care that "most people's speakers have trouble doing better than -3dB at anything over 18kHz anyway"--mine are spec'd -2dB at 22kHz, and they only cost $450 for the pair (Sound Dynamics 300ti). I take exception to your comment "So it's hardly any loss at all to throw away the sine waves over 20kHz"--hardly any loss to whom? I would benefit from better-sounding recordings. I do care about the high range.
Finally, not everything can be represented by sinusoids, as you claim. Fourier claimed, IIRC since I'm not going to check, that all L2(0,1) functions could be represented by an infinite sum of sinusoids. I don't understand why all changes in air pressure must necessarily be in L2(0,1)--and unless a patient physicist tried explaining it to me, I probably wouldn't believe it anyway.
-Paul Komarek
Console yourself with the fact that after relatively few playings the AM quad information would be scraped off the LPs anyhow ;)
Now while this is entirely true it gets more interesting than this in real-life. There is a need for better than 16-bit resolution. Here's why.
You know that you reconstruct your signal from the samples by using zero-order-hold on each sample, then applying an ideal low-pass filter at half the sampling rate.
Sadly you can't make ideal low-pass filters for any money. In fact it's hard to make even good low-pass filters. The solution is to interpolate the signal first then use a much cheaper and less accurate low-pass filter.
Linear interpolation (first-order-hold) is common but produces nasty results. Bandlimited interpolation (sinc pulses) is a much better method but difficult to implement. In practise you'd use something halfway between the two.
However all interpolation methods rely on the sample being as close to perfect as possible. If you're using first-order-hold then you will only get the error from the nearest two samples. But if you're using bandlimited interpolation then the sinc function means ALL THE SAMPLES are used to create your interpolated sample.
In practise several dozen samples will be combined to create your interpolated sample. The errors from all these samples combine to create one huge error on your interpolated sample.
24-bit samples greatly reduce interpolation errors. 16-bit sampling is only good enough if we have perfect DACs and ideal low-pass filters. 24-bit samples will allow the use of practical DACs and cost-effective low-pass filters.
This one sounds like it'll be no more popular than minidisks.
B&K makes mics with a response up to 40kHz (I know they are called DPA mics today, but I still call them B&Ks). B&Ks (and Genelecs) are what people use when they want a really accurate sound, as opposed to a "warm", "larger than life" sound.
That said, I agree with what you are saying. People cannot hear above 20kHz, and people who think they can will need to run some double-blind scientific tests to back up their claims before I will seriously listen to them.
- Sam
The secret to enjoying Slashdot is to realize that it should not be taken too seriously.
It seems most new technologies now have a catch: DVDs and the mess over DeCSS and the MPAA's desire to sue, MP3 encoding and the RIAA drooling to sue anyone who even looks at an MP3, intellectual property rights, DMCA, who owns what, etc.
I'm obviously a huge fan of new technology, but is there a catch somewhere in Super CD? The article was big on technical, not on much else. Anyone have any insight to this?
Nyquist's theorem states that the highest frequency that can be represented is one half the sampling rate. This is obvious because you must be able to detect at least a peak and a valley of the sound wave.
Nyquist's theorem does not imply, however, that the representation of the maximum [or near maximum] frequencies will be highly accurate as far as the shape of the wave form is concerned. At and around 1/2 sampling frequency, the wave forms become basically nothing but square waves [alternating between a single high, and a single low point]. In order to deal with this, some sound decoders will attempt to interpolate the waves, but they cannot reproduce the original sound accurately. This is why higher sampling frequencies ARE relevant to higher audio fidelity. Higher bit resolutions are arguable though...
If you try to sample this at 44.1 and reconstruct it, you're hosed- you have substantial problems based on the fact that the pulse would be quite accurately tracked by a 20K rate but you're trying to force a 22.05K rate to track it. It doesn't help to throw away all the very obvious and unavoidable over-20K components (come on, _5K_ and you get only a couple of harmonics? That's nuts): you're still hosed by the interference pattern that is produced, because these are not simply a pile of sine-tones: they are sine-tones (lots of them) AT SPECIFIC TIMES. Just because 20K is lower than 22.05K doesn't mean 22.05K is going to do even a remotely acceptable job of sampling it.
Remember that we are talking about a _5K_ wave here, not something absurdly high. Do you not think you can hear the difference between a really thin pulse wave and a sine at _5K_? Yet the pulse only needs to be about 1/4 width to cause really severe problems with sampling- specifically, a rapidly cycling distortion component that will go from 0% distortion to severely, severely distorted at a rate fast enough to make up an entirely new, harmonically unrelated tone! This is only lessened by throwing away frequencies over 20K, not entirely fixed. You can lessen it a lot more effectively by throwing away all frequencies over _10K_ but come on- whoever heard of a pulse wave with only two harmonics? Our 5K pulse wave would be unrecognizable- this has to be considered distortion.
I'm afraid Nyquist is mostly crap- it would be fine if all musical frequencies were exact subdivisions of 44.1K, but when you start dealing with real-world frequencies, you start having these intermodulation problems, and the problem is not that you're not having the supersonic frequencies, the problem is that the resulting intermodulation effects are cyclical degradations of continuous tones and produce _inharmonic_ tones. It's very hard to hear slight differences in harmonic tones but it's not hard to hear problems when there are inharmonic tones being synthesized.
Extra credit- what inharmonic tone is generated by the distortion on a 5K 1/4 width pulse wave sampled at 44.1K? Use as much treble-rolloff as you want, it'll always be the same intermodulation frequency. Anybody good enough at math to say what frequency is generated as intermodulation distortion? :)
Sony has had SACD for quite a while now. I remember picking up some audiophile magazines over a year ago and reading about it. They've released a lot of Sony Classical stuff on it.
However, a lot of the systems to play SACD cost over $1K for the CD player. Not to mention you're not going to get the nice Dolby surround of DVD at the same price....
No, what the MP3 revolution shows is that almost all music fans value the content more than the technical reproduction.
I think you're mistaken. Content doesn't play a role here. Essentially everything available in MP3 format is also available in an uncompressed digital format. But the vast majority of MP3 listeners can't hear or don't care about the extremely large quality loss that is inherent in the MP3 compression algorithm.
This is the same group of people that use the term "high end car audio" without a hint of irony.
-- "Complacency is a far more dangerous attitude than outrage." -Naomi Littlebear
So here's a digital format that should please nearly all the classical music aficionados out there who spend tens of thousands of dollars constructing acoustically-perfect "listening rooms". Nothing bad about that. At the very least, it finally creates a reasonably lossless way to digitize analog material for archival and preservation purposes--although any archivist will tell you that the real archives themselves for long-term preservation should be old-fashioned stamped analog discs.
These two markets--archivists and money-is-no-object audiophiles--should be covered with about 20,000 of these devices. So what about the rest of us? I have serious doubts that the difference between this and DVD-Audio can be heard on even a $3,000 home theater system.
Sony (and presumably Philips/Magnavox) intend to build support for this into all of their players starting sometime soon, maybe a year from now. The thing is, nearly all the DVD players being sold today can play the competing DVD-Audio discs. None, not even Sony's, and not any of those millions of Playstation2s shipping in the next year, can play SACDs.
Ultimately, this is about patent royalties. Sony and Philips have been collecting royalties on every CD player and CD drive sold for over a decade now, and SACD is about trying to do it again for another decade. DVD-A is the format endorsed by everyone in the industry except Sony and Philips. Is it a good professional archival format? Nah. Is it both better and more flexible than CD? Yep.
So here's the ugly truth. The MP3 revolution seems to have proven that most people have tin ears. Ask a hundred people. 98 of them will tell you that 128Kbps MP3 is "CD quality". Fact is, it's inferior to Minidisc, to FM radio and--in many respects--analog cassettes. But it doesn't have hisses and pops, and that's all most folks really notice. Heck, 320Kbps MP3 sounds crappy next to a CD, even on a $400 stereo.
If people think MP3 is "good enough"--when it can't even hold a candle to CD--why is the mass market going to embrace SACD over DVD-A? Especially when they'll have DVD-A players available from dozens of manufacturers and SACD players most likely available from... three?
CD will be superseded, not because most people want higher-resolution sound quality they can't hear on Britney Spears remixes, but (1) because DVD-A and SACD players will offer things like 6-channel sound and bundled-in DVD video clips, and (2) because the record industry will stop making CDs, just like they stopped making LPs, in order to force everyone to buy the new players and buy yet another copy of Billy Joel's Greatest Hits to go with the LP, cassette and CD they already have.
The best format won't win. The more ubiquitous one will. The question remains which coalition will blink first. Will the Sony-Philips side break down and allow their record companies to start making DVD-As once they see SACD players aren't selling well, or will companies like Matsushita start paying royalties and buying chips from Sony because the Sony/Philips DVD-A embargo has made it impossible to get record stores to carry DVD-As?
Yes, but using 2:1 or 2.5:1 lossless compression is so easy, and that halves the data rate. I'm not sure what compression ratio Meridian Lossless Packing achieves, but I'm pretty sure it exceeds 2.5:1.
"How perfectly Goddamn delightful it all is, to be sure" Charles Crumb
No- actually it cannot get the _corners_ of the square wave but it can get the vertical part substantially more vertical than DVD audio. The rate of voltage change will keep on accelerating right until it has to reverse the rate of change and stop. DVD audio will stop at 48K. Frequency-wise this is not much of an issue but if you think about the amount of transient voltage involved with the sides of the squarewave (which demands, in theory, INFINITE voltage), there's a huge difference here. The Sony system will pack a huge amount more voltage into the sides of the squarewave- how much more I'm not sure but it could be several orders of magnitude more voltage. There's some risk of ringing that follows the edges- but when was the last time you looked at squarewaves produced by CD players? >:)
Sorry, nathanh. Almost all of what you have said has been very accurate. But your aliasing calculation isn't: aliased freqs appear at the _sampling_ rate minus the freq in question, not at the frequency minus the Nyquist as you claim.
Therefore the original poster was correct about his 24kHz signal. It will appear at 44.1k-24k=20.1kHz.
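The folding rule is easy to check in a few lines. This is a sketch for a single unfiltered tone, and the function name is mine:

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone shows up after sampling."""
    folded = freq_hz % sample_rate_hz
    # Anything above half the sample rate mirrors back below it.
    return min(folded, sample_rate_hz - folded)

print(alias_frequency(24_000, 44_100))   # 20100
print(alias_frequency(10_000, 44_100))   # 10000 (below Nyquist, unchanged)
```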
CDs are great because they removed a transducer from one end of the playback process: no need for a needle or a tape head. Now you get the signal, and, most importantly, the dynamics of the signal nearly perfectly. Now you need speakers that can take this dynamic range to your ears.
Most speakers, especially most small speakers, fail to do this. They may be "accurate" in terms of response to the range of frequencies put through them, but that won't reproduce the performance, which is in the dynamics. You can A/B test them endlessly against each other, and find interesting and subtle differences, but here is the real test: Take a pianist and a real concert grand piano, and some recordings of a piano, and see if the speakers you think are perfect can fool you into thinking they are the piano. Usually, there is no contest. Any idiot half deaf from a thrash metal concert could tell the difference, because the piano puts so much energy into the air that very few loudspeakers can come close. The piano shakes the floor, makes the windows rattle, and you can feel it in your bones. By contrast, even really good speakers make it sound as if the lid is shut and there is a pile of coats on top of the piano. There is no way a small speaker can do what a piano, cello, bass fiddle, tympani, baritone horn, etc. can do. Put them in a concert hall together, and you have a real challenge.
I would much rather have a pair of Klipschorns (if I only had a room with corners) than a pair of similarly priced near-audiophile conventional speakers even though the K-horns would no doubt be hideously less linear. They would be efficient enough to come close to reproducing the dynamics of real musical instruments. The fact is, no two pianos have the same response up and down the scale, nor the same resonances either, nor do two rooms, so why worry about getting close to absolute linearity? The same argument holds even more as music gets more complex: An orchestra can be close miced, or not, recorded in long takes, or cut and pasted from small snippets, multitracked or not, etc. All those engineering decisions make absolute reproduction a joke. Reproduction of what? I'd rather hear how hard those bows are coming down on those strings. That is where the information is.
I wrote parts of this stuff
Sony will introduce one format, and Panasonic will introduce another. Sony has their "Memory Stick", so Panasonic introduced "SD Memory". DVD-RAM, DVD-RW, DVD+RW, etc.
They are not doing it to actually innovate, but to make money from licencing the technology. CD is an old enough technology that the patents are either due to expire, or have already expired. So they have to introduce some new patented technology so they can keep that revenue stream going. Remember that they have introduced several stupid formats (anyone remember the El-cassette? Philips' DCC?) for every one that succeeds.
If their goal was simply to make better audio available, they would be releasing regular DVDs without video tracks. 5.1 24-bit 96kHz. No, instead they want you to buy a whole new machine that essentially does the same thing, except it is broken by disabling the digital output! Seriously, both DVD-Audio and SuperAudio CD do not have any way to output anything other than multiple analog audio channels. Mega-stupid. Their fear of people copying their tracks has rendered both formats worthless. The worst thing to happen to Sony was when they purchased Columbia.
"How perfectly Goddamn delightful it all is, to be sure" Charles Crumb
Sorry, yes. The previous poster wrote "22050 + 150 = 24000". I was pointing out that "22050 + 1950 = 24000". I should have completed the entire line of reasoning rather than just point out the mistake.
Even a signal as solidly within the passband as a 14.7K _sine_ wave will completely fail to be recovered exactly. This wave gets three samples per cycle to capture a symmetrical wave. The only possible result is a pulse wave of 1/3 pulse width- that's what's in the data, there is no other possible result. When you apply a theoretically perfect brick-wall filter to that and perfectly get rid of the sampling artifacts YOU STILL HAVE IRREGULARITIES. Substantial ones, by audio measuring standards- many percent.
If you do a 14.6 sine wave, not only do you get basically a 1/3 width pulse wave, but you get a subcarrier.
Can't you _see_ this?? Doesn't your math acknowledge this? These are not only measurable distortions but the problem is still present even _completely_ in the theoretical realm.
Are you arguing that a 14.7K sine sampled at 44.1K is a symmetrical waveform? Or that a 1/3 width pulse wave at 14.7K with a 22K filter is EXACTLY a sine wave? I would suggest that it is not...
You're not wrong here: there are no cheap and commercially available 24-bit DACs, and no one's hi-fi comes even close to 24-bit fidelity from source to speaker, but you're missing my point.
Just as in playback where the 44.1kHz signal is oversampled, during recording the signal is often recorded at much higher frequencies. The extra data samples allow you to reduce error and this lets you increase the sample size. It isn't a direct measurement that gives them 24 bits per sample. It's an indirect calculation based on nearby samples with a realistic 20-bit ADC (or whatever studios use these days).
Now it doesn't matter that at no stage will a 24-bit DAC or ADC be used. You get 24-bits of information per sample into a 44.1kHz recording by sampling at a higher frequency. You can use the extra information during playback because your sample interpolation will be more accurate.
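As a rough guide to how much resolution averaging can buy, here is the idealized textbook rule of thumb as a sketch. It assumes white quantization noise and no noise shaping (which does considerably better); it is not a description of what any particular studio converter does.

```python
import math

def extra_bits_from_oversampling(oversampling_ratio):
    """Resolution gained by averaging, assuming white noise and no noise shaping.

    Each 4x increase in sample rate halves the in-band noise amplitude,
    which is worth one extra bit.
    """
    return 0.5 * math.log2(oversampling_ratio)

print(extra_bits_from_oversampling(4))     # 1.0
print(extra_bits_from_oversampling(256))   # 4.0, e.g. 20 bits up to 24
```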
It's a matter of information. Increasing the stored sample size increases the amount of info on the disc. More information lets you reproduce the original signal more accurately. Ignore the practicality of the stored info: 24-bit isn't a very practical sample size, but it is more info, and the recording studio will have used clever techniques to ensure that they are using all 24 bits even if they don't have 24 bit ADCs.
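The "indirect calculation based on nearby samples" idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not how a studio converter works: the input is a constant level, the quantizer is a plain round-to-nearest, the dither is uniform and one step wide, and the "filter" is a simple average. Real converters use noise shaping and proper decimation filters. All names here are made up for the example.

```python
import math
import random

def quantize(value):
    """Round to the nearest integer step (one step = 1 LSB)."""
    return math.floor(value + 0.5)

def oversampled_estimate(true_value, n_samples, rng):
    """Average n_samples dithered, quantized readings of a constant input."""
    total = 0
    for _ in range(n_samples):
        dither = rng.uniform(-0.5, 0.5)   # assumed: uniform dither, 1 LSB wide
        total += quantize(true_value + dither)
    return total / n_samples

rng = random.Random(1234)        # fixed seed so the run is repeatable
true_value = 0.3                 # sits between two quantizer steps
single = quantize(true_value)    # a lone reading can only say 0 or 1
averaged = oversampled_estimate(true_value, 4096, rng)

print("single reading error:", abs(single - true_value))
print("averaged (4096x) error:", abs(averaged - true_value))
```

The averaged value lands between the quantizer steps, which is the sense in which extra samples buy extra bits per stored sample.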
It's a lossy format in a peculiar way- this particular method is wildly more accurate at low frequencies than high ones. The thing is, 'high' is relative >;) to this format, 100K is 'low'. The really low frequencies are almost arbitrarily accurate- and again, the ability to delineate high-energy transients is many orders of magnitude better than bandlimited formats. I must admit I am not terribly worried about 10,000% harmonic distortion... at two megahertz. I don't believe I can hear even the most outrageously powerful sound waves at 2 megahertz. Would those be microwaves? Maybe these new players will cook people in front of them if you turn them up loud enough (and have tweeters that can put out loud 2 million hertz signals ;) )
CD's do a great job on s/n ratio - they don't do such a great job of accurately reproducing a wave form. There is a clearly audible difference between CD's and Analog records on top quality audio equipment.
Even the best audio equipment does a really poor job of reproducing sounds. The proof is to make a recording in an anechoic chamber of a speakers output and then compare that signal with a synched up input signal on an oscilloscope set up in differential mode. If the two were identical the output would look like a straight line, instead it looks like a bowl of spaghetti; the phase and amplitude distortions are terrible.
I know that there are tests that 'show' that the ear is not sensitive to phase differences, but those tests were badly flawed; the phase distortion of the reproducing equipment was so bad that of course no one could hear differences. One phase distortion sounded just as bad as another. Most of the differences in speaker sounds have to do with the phase response of the speakers.
The ear is basically a Fourier analyzer, and phase does matter when you do Fourier transforms.
You're 100% right. 16 bits is not enough for mastering and mixing. The intermediate values of the mixing must be preserved as accurately as possible to ensure the best possible final result. When the final output is obtained, 16 bits may be enough for listening to. I don't know where your statement about an undigitized output and the 16 bit quantized output is coming from though. I don't think anyone has ever said that the result of 16 bit sampling can exactly reproduce the undigitized output. But are you asserting that the difference is a really big audible difference?
As for the 44.1kHz? If the audio is already sampled at 44.1kHz, you'd better have a serious cutoff at 22kHz or else you're getting a lot more distortion at higher frequencies. What exactly are you talking about when you say that amp designers go for pass band into the MHz? If there is any power above 22kHz it is purely noise (that is if the input is a CD, LP is arguable at this point). Now is the higher sampling frequency better? Of course! Just like the 24bit sampling, it makes it easier to reproduce high quality recordings.
You seem to be quite an audiophile. I am not. But I wonder, have you ever really tested the validity of your gut feeling about these technologies. I feel that your statements are an amalgam of audiophile "common knowledge" but don't represent the physics and mathematics of sounds and frequency analysis.
Ever heard of a book called 'The Hidden Persuaders' by Vance Packard? (expose of the advertising industry) This book exposed how advertising people were able to persuade consumers to buy one thing rather than another. With music, the audio quality is the 'hidden persuader'.
Answer: quintuple the price, trace the edges with a green magic marker, slap a bunch of pseudoscientific gibberish on the front and then congratulate them on having a more "well-trained ear" than the rest of those damn hoi-polloi.
The best way to deal with "audiophiles" is to consider them a form of free entertainment, and proof that cocaine isn't god's only way of telling you that you have too much money.
News for Nerds. Stuff that Matters? Like hell.
Vinyl does sound better than a CD. I'm not talking about the S/N ratio, either. Sure, CDs have better silence than vinyl and they generally don't warp or pop. But because records are higher-resolution than a CD you not only get the very high frequencies that you wouldn't think matter. You also get better reproduction of complex harmonics. This is what all that headroom 100KHz frequency response gets you. And the sampling rate that leads to it allows for a higher S/N ratio, too.
It must be nice to be unable to hear the difference between a 320k MP3 and a CD on your "$3000 sound system", because I can hear the difference on the $80 speakers connected to my PC and on my $800 "sound system". I'm not exactly an audiophile. CDs are good enough that I almost never buy vinyl anymore, but part of that comes from the added convenience of the CD. It's playable on portables and in a car, it's easier to carry, and so forth.
But every time I put on some vinyl, even if it's technopop or garage rock, the room warms up.
No, a square wave is a square wave is a square wave. Sine waves operate on entirely different principles. A true sine wave is the result of a trigonometric identity - sine. They are not an "infinite number of sine waves" rolled into one. All you've done here is a nice graphing trick - just like a good way of representing pi is 22/7.
Try this: Take a 22kHz square wave, and run it through a low-pass filter with a cutoff frequency that's slightly higher -- say, 25kHz. Look at the output on an oscilloscope.
What will you see?
You'll see a somewhat distorted sine wave, that's what you'll see.
If a square wave is in fact made up of a bunch of sine waves, it is easy to explain why this happens. The filter has allowed the fundamental frequency to pass, and has attenuated the higher harmonics. The distortion results from the fact that the filter is not perfect, and will allow some of the first few harmonics to pass.
On the other hand, if indeed a squarewave is something entirely different, how do you explain this result?
In other words, you have just denied a theorem which underlies much of modern signal processing and communications. You better have damn good proof of your position!
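If you don't have a scope handy, the arithmetic behind the experiment can be sketched in Python. This assumes an ideal square wave swinging between -1 and +1 (whose Fourier series has only odd harmonics, with amplitude 4/(pi*k) for harmonic k) and an ideal brick-wall filter, which the real filter in the experiment is not. The function name is made up for the example.

```python
import math

def square_wave_harmonics(fundamental_hz, cutoff_hz):
    """List (frequency, amplitude) of the sine components of an ideal
    +/-1 square wave that lie at or below cutoff_hz.
    An ideal square wave has only odd harmonics, with amplitude 4/(pi*k)."""
    components = []
    k = 1
    while k * fundamental_hz <= cutoff_hz:
        components.append((k * fundamental_hz, 4 / (math.pi * k)))
        k += 2
    return components

passed = square_wave_harmonics(22000, 25000)
print(passed)   # only the 22 kHz fundamental is at or below 25 kHz
```

The next component of a 22kHz square wave is the third harmonic at 66kHz, well past the cutoff, so with an ideal filter a single sine is all that is left. A real filter lets a little of the higher harmonics through, hence the "somewhat distorted" sine.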
You are confusing bits/second with sampling rate. To capture and reproduce a sine wave, you need only sample at double the highest frequency, i.e., to capture 22kHz or lower, you sample at 44kHz. This is the Nyquist criterion, which I believe may have been mentioned earlier in this thread. The formula is somewhat complex to write out here, so please visit these guys for the formula
No, he's right. If you want to get a reasonably accurate digital representation of a squarewave, you need to sample at a higher rate. Why? In order to capture the higher harmonics present in a squarewave, that's why. If you sample at a high enough rate, you will capture several higher harmonics, which will combine in order to approximate a square wave.
Oh, but you don't seem to accept Fourier's theorem. Well, never mind then.
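To put rough numbers on "capture several higher harmonics", here is a small Python sketch. It counts which odd harmonics of an ideal square wave fall below half the sampling rate, and uses Parseval's relation (harmonic k carries 8/(pi^2 k^2) of an ideal square wave's power) to say how much of the wave is left out. The 1 kHz fundamental and the three sampling rates are arbitrary picks for illustration, and the function names are made up.

```python
import math

def captured_harmonics(fundamental_hz, sample_rate_hz):
    """Odd harmonic numbers of a square wave that fall below Nyquist."""
    nyquist = sample_rate_hz / 2
    return [k for k in range(1, int(nyquist // fundamental_hz) + 1, 2)
            if k * fundamental_hz < nyquist]

def residual_power(harmonic_numbers):
    """Fraction of an ideal square wave's power NOT carried by the given
    harmonics. Harmonic k carries 8 / (pi^2 k^2) of the total (Parseval)."""
    kept = sum(8 / (math.pi ** 2 * k ** 2) for k in harmonic_numbers)
    return 1 - kept

for rate in (44100, 96000, 192000):
    ks = captured_harmonics(1000, rate)
    print(rate, len(ks), f"{residual_power(ks):.4f}")
```

A higher rate keeps more harmonics and the leftover fraction shrinks, which is the "approximate a square wave" point. Whether the harmonics above 20kHz are audible is a different argument.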
Get yourself a pair of really good speakers and amplifier and *then* worry about the quality of your CD player.
Any sufficiently advanced technology is indistinguishable from a rigged demo
--Andy Finkel (J. Klass?)
Irrelevant. Quantisation is the accuracy to which you can represent an intensity level. Vinyl has a limitation just like any recording medium.
You measure it indirectly with vinyl, using SNR readings from a single sinusoid.
I only hope that this provides a clear, easy way to prove to "the masses" how much better things can sound at high bitrate.
Just like with HDTV, where the networks want to use high bandwidth channels to broadcast 4+ SDTV signals, I constantly fear that something like 128k MP3 will become "a standard" because it is "good enough".
Yeah.. that works too. Try combining the two! (I mean, a good stereo, AND good dope, not pink floyd & the dead.)
After posting last night, I dug into the subject a bit.
It turns out that many modern CDs are *very* badly mastered, in that they do not use the dynamic range available to them. They maximize everything to the loudest volume in order to get the loudest radio play. So... as a result, many modern stereos are configured (by the users) to listen to everyday pop music... and unfortunately, if something that has real dynamic range to it is used, you'll just miss most of it.
Most people won't be able to hear a big difference right off the bat, and in fact, humans do not HEAR much out of the range of current CD-audio, but people can feel the difference without registering a sound. That ultra-clear, ultra-high note won't "sound" but it will make other high notes sound better. Just like 3D visuals, you don't see everything they produce, but what you don't see helps to make what you do see all that much better.
It's just a feeling and I support this new format. It may be a little too high end for most, but costs will fall as the format matures.
There's no way anyone is going to be able to hear the difference between a 64x oversampled stream, and a 16-bit, 44.1 kHz stream in a double-blind test.
That business about needing to preserve 50 kHz+ components in their old analog tapes is a CROCK. It's called *noise*, guys. Show me a single human being who claims he can hear up to 50 kHz, and I'll show you a self-deluded idiot.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
Maybe he just has sand in his ears. Somehow, I think this post sounds more coherent on a first read than it actually is on close analysis.
The encouraging thing about this article is that Sony is actually spending money to preserve their audio tape library, as opposed to the motion picture industry, which deliberately destroyed their silent movie archives, allowed their nitrate archives to molder to dust, and sat by as their Eastman color libraries faded away.
Ok, so, Betamax, Minidiscs, Memory Sticks, SACD. What's the deal with striking out on their own? How many times will they come up with a good format, only to have it ignored? Why do they continue to bother?
Sorry, but this whole post is like complete BS.
- The Fourier theorem says (and imho proves) that every periodic signal consists of an arbitrary or even infinite number of sine waves. So a square wave actually IS a composition of the base tone and all odd harmonics.
- Typically, when humans grow older, the upper end of their listening range goes from about 20KHz to 12KHz at worst. I know enough 20-year-olds who aren't able to hear a PAL TV's beeping (at 15625Hz) anymore.
- Harmonics aren't sine waves BELOW the base tone, but actually ABOVE them, at 2x, 3x, 4x etc the frequency.
- The Nyquist theorem is correct in a certain way, but it's neglecting the fact that from frequencies at samplerate/10 up, there is enough aliasing to severely disturb the signal. Go try sampling a 22051Hz sine wave at 44100Hz.
- What's that complete nonsense with 32bit/sec? A CD player reads at 176400bytes/sec (2x44100x16bit). With 16bit resolution, the quantisation error and thus the SNR is at about -96dB. Go look up what "dB" means next time, please.
- It's interesting that you first claim everybody knows what "digital" means and then compare digital signal transmission with your oh-so-cool car stereo. Digital audio signals are transmitted at about 2-3 MBit and can even pass some 10 metres of simple cinch audio cable without any loss. Flipped bits are very improbable and timing jitter is completely negligible, as it gets flattened out by the receiver's internal clock (if the equipment is good, that is). The only situation in which jitter is important is mixing of arbitrary digital audio signals which all have their own clock source - and normally, this does not happen outside studios.
So, better do some research next time or ask the people you refer to all the time before throwing around words and expressions you obviously do not understand.
Thanks,
kb
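The numbers in the list above (the 22051Hz test tone, the 176400 bytes/sec read rate and the roughly 96dB figure for 16 bits) can be checked with a few lines of Python. This is just the arithmetic, under the usual textbook assumptions: ideal sampling with no anti-alias filter in front, and dynamic range taken as full scale over one quantization step. The function names are invented for the example.

```python
import math

FS = 44100  # CD sampling rate in Hz

def alias_frequency(f_hz, fs_hz=FS):
    """Frequency at which a tone at f_hz shows up after sampling at fs_hz."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

def cd_bytes_per_second(channels=2, fs_hz=FS, bits=16):
    """Raw audio data rate: channels x samples per second x bytes per sample."""
    return channels * fs_hz * bits // 8

def ideal_range_db(bits):
    """Ratio of full scale to one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

print(alias_frequency(22051))        # 22049
print(cd_bytes_per_second())         # 176400
print(round(ideal_range_db(16), 1))  # 96.3

# The samples of the 22051 Hz tone match those of an inverted 22049 Hz tone.
worst = max(abs(math.sin(2 * math.pi * 22051 * n / FS)
                + math.sin(2 * math.pi * 22049 * n / FS))
            for n in range(1000))
print(worst < 1e-9)                  # True
```

So an unfiltered 22051Hz sine sampled at 44100Hz is indistinguishable, sample for sample, from an inverted 22049Hz sine, which is why the input has to be filtered before sampling.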
Observe: Bill Gates had an itch (getting lots and lots of money), scratched it and produced many valuable products (as well as many other pieces of crap products) as a side-effect.
Half the point of SACDs is that they're backwards-compatible, right? Wrong. According to my boss at the record label where I worked over the summer, some of the SACDs we distributed were not backwards-compatible. As far as I know, there's no way to know before you plop the disc in your player whether it'll work on your old-school CD player, either.
Switch the . and the @ to email me.
even if a mic would pick up sound above 22kHz - a special one - there is distortion. the higher the frequencies the more distortion.
:-)
...
but now you can record it
higher bit depth is a positive thing, but then again *who* really hears the difference, even the equipment which plays that is too expensive
The whole scam about getting people to buy better equipment applies not just to the makers of pro audio gear, but to folks like Sony trying to convince suckers to shell out big bucks for a "better" CD player. If you're listening to music on a $65,000 pair of Wilson Grand SLAMMs, driven by some Martin-Logan class A amps, all in a specially-designed acoustically perfect listening room, and you're listening to chamber music recorded in an anechoic chamber with Schoeps mics, godlike preamps, and godlike A/D, and your ears are 30 years old or younger and you live and die by what your stereo sounds like, then maybe you can talk to me about 24 bits or 96 kHz sampling rates, but if not all of the above are true, forget it. The existing CD standard is way better than everything else in the audio chain, from performer to listener, for 99.9% of the audio equipment, performers, recording engineers, listeners, and types of music (Britney Spears in 24/96?) in existence.
I'm not just blowin' smoke here. I do mobile audio recording professionally (Earthworks mics, Earthworks pre, Apogee A/D, record straight to stereo DAT -- see my URL), and it's standard practice for me to record things 3-6 dB lower than peak to leave breathing room, and in post-processing, design the sound to use no more than 30-40 dB of dynamic range nearly all the time (as making a recording with more dynamic range than that makes it very difficult to listen to unless you're in a very quiet room over high-end headphones). I'm only using a fraction of what lil' ol' CD is capable of, but anything more would just capture room rumble and people's breathing in that much more detail.
If you believe that current audio is somehow shortchanging you from hearing that ineffable "something" that will make your music sound better, then give up the fake science in Sony's product descriptions, save your money on fancy CD players, and go out and buy some better speakers than those Radio Shacks that your brother Jim gave you when he went off to college.
-----
Eh, there's lots of other companies doing that. Wadia was certainly not the only one. You just won't find the products at Radio Shack ;) and I daresay you won't find them cheap even now- I just looked on eBay and the prices I was seeing were $200, $500 and $1500 in US dollars.
What the _hell_ do you think I have for mains, Yamaha NS-10s? The _only_ way to get remotely serious performance is either to spend loads of money or educate yourself extensively and rebuild/custom build _everything_.
There is a hard limit to the low frequency extension available from ported enclosures: that's why my speakers have variovents. There's another hard limit on how much acoustic power you can produce with a given cone size within the linear excursion limits of the voice coil (_not_ the suspension physical limit, but keeping the voice coil within the magnet gap). That's why my speakers run 12/10/8/6.5 drivers in a series/parallel configuration as if they were a guitar speaker cab- all the drivers contribute to the piston area for subsonic content. There's a softer limit to enclosure size- I don't feel like running an active EQ system like the very clever Bag End 'ELF' system, so that's why my cabs are four feet tall, over three feet deep and as much as 15 inches wide. They'd make damn good stage bass cabinets, though they're not really designed for those wattages. That fixes the lowest frequency at... let me put it this way. I own synthesizers. I can _put_ subsonic tones through my full system, and stuff will be falling over. There's a limit to how much acoustic pressure even that much cone area can produce, but it's not giving up at 20hz, that's for damn sure.
The other extreme? Composite drivers using piezo elements to handle the extreme highs. (no, not a whizzer cone with a clunky piezo disc glued onto the back and flopping around in a plastic housing). This type of element is easily capable of extreme frequency content- this is what they use to make ultrasonic devices, piezos. I have inverted aluminum domes on these which are so light and delicate you could damage one by blowing hard at it. To top all this off these are effectively HORN LOADED- not in an extreme fashion but the waveguide is definitely reminiscent of an exponential horn, and that means LOUD. The supersonic acoustic power these can produce is not subtle- it will give you a nasty headache very quickly if there's a lot of supersonic ringiness or grunge in the signal.
I don't know _where_ these people are coming from, but there seems to be an infinite supply. All I can say is- I hope you folks are my competition for sound engineering work, because the more fervently you insist that none of that high endy stuff even matters, the less use you will be at a task like mastering. This stuff is not hypothetical: there is practical application provided you work in the business. Ability to monitor a signal over an audio range substantially beyond 20-20K means better ability to control the sound balance and integrate the various instruments into a coherent whole, never mind that in certain areas like bass there are whole subcultures (car audio!) dedicated to showing off the ability to push this limit (by the same token, electrostatic speaker fans are declaring their interest in pushing the opposite limit).
Indeed I think about bandwidth and- S/N ratio is a rotten way to express this, let's call it _linearity_. However, the system _will_ happily go above 20K- and the linearity is considerably in excess of 16-bit quantization. And so is my digital source- I swear by Alesis 20-bit 48K ADAT. The highs could be better but the linearity is really quite good and I'm not dead certain a full 24 bits is necessary. However, the difference between 20 bit and 16 bit is major- and the difference between 44.1 and 48K is minor but every little bit helps. I was testing some new tweeter design changes on Frank Zappa's "One Size Fits All" album- the Kerry McNab-engineered Zappa albums push the high end _hard_ and the mikes turn a really large amount of the 40% supersonic content of cymbals into voltage. Playing this back with the new tweeter configuration really drove home how much we've lost: that album is very good at reproducing the natural supersonic balance of the sounds it contains, and a CD cannot contain even distorted versions of this acoustic content- the LP tends to distort it but the CD throws it away completely. So, again- 44.1K (really significantly below 22K) isn't enough. 16 bits aren't enough either.
I _will_ note that the wave is phase shifted compared to the original source- in fact when I talk about intermodulation distortion what's happening is that the wave gets phase shifted forward and back very rapidly.
I would _suggest_ that a wave which is phase shifted against other sounds is not the _same_ wave that is sampled. It can be _a_ sine wave but if it's not in the same phase as the source (compared to other music data) is it the same wave?
It's possible that given 4X oversampling Nyquist is as near to correct as necessary. Certainly if you're not concerned with phase but only the presence of frequency components 2X Nyquist is basically right. I draw the line at saying it is theoretically perfect- I'd say it could be a very good approximation. I don't understand why so many people arguing for Nyquist take on a positively religious tone, demanding the acceptance of articles of faith. 'Theoretically perfect' is a very strong statement, and the context (such as 'bandlimited, and ignoring phase and time information completely') needs to be clarified or people take it to mean 'therefore CDs are perfect reproducers of sound': which is not accepted by any competent sound engineer today, since again 'perfect' is a very strong term.
First off MP3's sound like crap...
As far as how to sell this to the mass market. I think the DVD spec provides a means to really add value to the music disc.
The disc could contain pictures, live video, it could even be recorded in 5 channel.
That's what I suspect we'll see.
It's basically like the CD-Extra format, only better.
It is _very_ nice that digital allows low frequencies down to 0 hz. I will never knock even CD sound quality for the ability to convey extremely low sounds- the only limitation is the analog stages that drive your stereo. I produce sonograms of my recordings sometimes, and they easily produce frequency information at 9 hz and lower (I refuse to use bass drum sounds that can't be made to have substantial subsonic content :) ). CD can easily encode _substantially_ below 20 hz. There's basically no limit. The format is flat to 0 hz.
Sounds of up to half the sampling rate can be reproduced *exactly* by (ideal) audio equipment.
This is what the Nyquist theorem says. The thing you are saying about square waves is a misunderstanding on your part.
The frequencies we are talking about are all sine waves. So the shape to be reconstructed is predetermined and known clearly. There is no "interpolation". You may be familiar with the Fourier transform, which takes advantage of the fact that all sounds can be represented as compositions of sine waves.
All that said, I think that the idea of higher fidelity is a good one. Because there are two things that happen on CDs as they stand now:
(1) Companding
(2) Quantization
A brief and kinda-misleading explanation of "companding" is that the bottom hundred or so Hertz of music on a CD is chopped out. This is under the theory that the human ear can't hear this stuff, but lots of people say they can. And even if you can't consciously notice these tones in isolation, it's not much of a step to believe that they contribute to the "richness" of a sound when mixed in.
Quantization is another matter. Say you have 44,100 samples per second stored on a CD. Each of those samples represents the intensity of the sound at each point in time. That intensity is a real number (an analog value) and it's being stored as an integer (a digital value). In mapping from the analog to the digital value, you lose precision (think of rounding a float to an int). The more bits you have to represent each value, the less distorted the sound is. Think of the difference between 16-bit truecolor and 24-bit truecolor when viewing photographs and whatnot.
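The float-to-int rounding can be sketched directly in Python. This is a minimal illustration assuming a signal range of -1.0 to 1.0 chopped into 2**bits equal steps with round-to-nearest; the function names and the 90% of full scale test sine are arbitrary choices for the example.

```python
import math

def quantize(x, bits):
    """Map x in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels
    and return the value that level represents."""
    levels = 2 ** bits
    step = 2.0 / levels
    code = math.floor(x / step + 0.5)
    # Clamp to the codes a signed integer of this width can hold
    code = max(-levels // 2, min(levels // 2 - 1, code))
    return code * step

def worst_error(bits, n_points=1000):
    """Largest rounding error over one cycle of a sine at 90% of full scale."""
    worst = 0.0
    for i in range(n_points):
        x = 0.9 * math.sin(2 * math.pi * i / n_points)
        worst = max(worst, abs(quantize(x, bits) - x))
    return worst

for bits in (8, 16, 24):
    print(bits, worst_error(bits))
```

The worst-case error is half a step, so every extra bit halves it.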
I think added resolution on disc-based music would be really nice. Whether it comes through audio DVD, or a different CD format, though, I don't really care. I think that DVD would be nicer for obvious reasons of storage capacity.
-J.
16 bits isn't enough. That's _really_ obvious at this point- no professional works in 16 bits except for the final CD output.
And the reason is... Because real-world signals have a lot more dynamic range than we can fit on the final output. Stick your ear into the bell of a trumpet or 2 inches under a snare drum and tell me how much hearing you have left after an hour. Yet you find microphones there all the time. When recording digitally, it's critical not to exceed the peak amplitude -- digital clipping is obvious and nasty. So you leave plenty of room at the top for extra-loud sounds, and adjust the volume when mixing it down. That means you want more than 16 bits to get 16 bits of resolution in the final product.
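The "more than 16 bits to get 16 bits" point is simple arithmetic, sketched here in Python. Each bit is worth about 6.02 dB of range, so headroom left at the top of the scale costs bits at the bottom. The headroom figures below (18 dB and 24 dB) are hypothetical values chosen for the example, not numbers from this post.

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # about 6.02 dB of range per bit

def bits_lost_to_headroom(headroom_db):
    """Bits of resolution left unused when peaks are kept headroom_db
    below full scale."""
    return headroom_db / DB_PER_BIT

def capture_bits_needed(final_bits, headroom_db):
    """Whole bits to record with so final_bits remain after the level
    is brought up in the mix."""
    return final_bits + math.ceil(bits_lost_to_headroom(headroom_db))

print(capture_bits_needed(16, 18))   # 18 dB of headroom -> 19
print(capture_bits_needed(16, 24))   # 24 dB of headroom -> 20
```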
even if you mix with ideal noiseless coloration-less electronics
ARGH!! Digital mixing is addition. There's no noise or coloration to be added. If you want to add some, you have to add it on purpose.
44.1K isn't enough either.
Actually, this may be true. Some studies have shown pretty convincingly that a few people can detect the presence of tones close to 30 kHz. Whether such tones exist at detectable levels in actual music isn't clear. But the science does allow the possibility that a 60 or 70 kHz sample rate may be necessary. However...
High end amplifier designers go to great lengths to get their pass-bands up into the megahertz
Utter nonsense. Deliberately extending your passband that high would not only seriously compromise performance at audio frequencies, but it would make your amp incredibly sensitive to RF interference.
The neat thing about the bit rate is that it's effectively infinite bit rate
Not at all. What they describe is delta-sigma modulation, the same thing used in nearly all CD players today. From what I can tell, their scheme is simply a higher bit-rate delta-sigma system with the storage medium containing the 1-bit stream. There is absolutely nothing new about any of this.
There doesn't need to be _any_ brickwall filter on the output
Precisely. And this is in fact the very reason that delta-sigma modulation is used on commercial CD players (and most of the A/D converters used in the industry, as well). Again, the difference here (AFAICT) is that the 1-bit stream is what gets stored, rather than resampling to the 44.1/16 CD standard.
the potential slew rate of this technology is just astronomical
This has absolutely nothing to do with slew rate. Nothing at all. And slew rate has little to do with how loud something sounds. It is true that absurdly powerful amplifiers can sound better than smaller amps with otherwise equal specs. And you almost have it right; the reason is that the peaks require a lot more power than the average, and if you turn up the volume, the smaller amp can't produce the peak power needed for the transients. The result is clipping, which is audible.
However, again this has nothing to do with the storage medium. Remember, modern CDs are almost all recorded and played back using delta-sigma modulation. Sony's plan is to simply skip the current step where the bitstream is reorganized to 16 bits at 44.1 kHz. This allows them to choose a bit rate with a higher dynamic range and cutoff frequency. I leave the arguments about the need for those to other postings.
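For readers who haven't met delta-sigma modulation, here is a minimal first-order sketch in Python. It is a textbook toy, not Sony's scheme or any real converter: production modulators are higher order and the low-pass stage is a proper filter rather than the block average used here. The 64x factor is the oversampling figure quoted elsewhere in this thread; the test tone and function names are made up for the example.

```python
import math

def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator.
    Input: values in [-1, 1]. Output: a stream of +1 / -1 'bits'."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback       # accumulate the error so far
        bit = 1 if integrator >= 0 else -1
        feedback = bit                   # error feedback shapes the noise
        bits.append(bit)
    return bits

def moving_average(values, width):
    """Crude low-pass filter standing in for the real decimation filter."""
    return [sum(values[i:i + width]) / width
            for i in range(0, len(values) - width + 1, width)]

OVERSAMPLE = 64                          # 64x oversampling, as quoted in the thread
n = OVERSAMPLE * 200
tone = [0.5 * math.sin(2 * math.pi * i / n) for i in range(n)]  # one slow cycle
bits = delta_sigma_1bit(tone)
recovered = moving_average(bits, OVERSAMPLE)
target = moving_average(tone, OVERSAMPLE)
worst = max(abs(r - t) for r, t in zip(recovered, target))
print("bits are only +1/-1:", set(bits) == {1, -1})
print("worst error after averaging 64 bits:", worst)
```

The 1-bit stream is what would be stored directly in the scheme described above; the conventional route runs it through the filter-and-decimate step to get 16-bit samples at 44.1 kHz.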
http://www.sonymusic.com/sacd/
http://www.superaudio-cd.com/
Something that doesn't gush like a press release
Another more objective link
It appears they're using a dual layer method for backwards compatibility. The details about copy protection methods are vague, but they do mention visible and invisible watermarks aimed against both pirates and counterfeiters. But I can't seem to find a decent explanation of how the DSD encoding scheme works.
There's a good analogy for the dynamic range of CDs. It is computer monitor displays. People who are content with 16 bits in their audio because it covers a range of silent to really loud should _also_ be content to always use 'thousands' of colors on their monitors- because that covers the same range of black to white as 'millions' does.
Of course, people will not be content with this, because 'thousands' produces subtle banding and a perceptible degradation of the picture in ways that are clearly understood. In the same way, 16 bit digital audio produces a thinning and drying of the sonic texture that is the audio equivalent of banding- for the same reason, which is linear dynamic encoding of an analog source that will virtually never be _exactly_ equal to the 16 bit truncation of its resolution.
20 or 24 bit is near as dammit: it's way harder to notice any adverse effects. The Sony format is particularly interesting as it has the potential, sometime in the future, of being handled without recourse to PCM encoding (though, distressingly, it seems that the current incarnations do make use of such a stage- hopefully a really impressive one like 128K cutoff and 32 bit or something equally flash). What that would mean is that there would be no specific bit depth it would be comparable to- saying '8 bit' or '16 bit' or '24 bit' w.r.t PCM is saying 'Here is the maximum signal displacement, from -Xv to Xv. I will now chop the area between into equal parts and quantize whatever the _real_ voltage is to the nearest level I can encode.' In this sense, PCM is never accurate at all- it's very unlikely that at a given moment, the voltage really exactly matches the encoding, because the encoding might be 1.0256 volts and the real voltage was 1.02562854647823862349823474634672348 volts.
Again, it's the same issue as monitor color trueness at 'thousands' of colors. The ironic thing is, with well recorded music the hottest peaks are _really_ hot: 99.9% of a piece of music might be less than half the available dynamic area used for encoding. The encoding is linear, so that is _wasting_ half the bits. A sort of Gaussian distribution might have been a better idea, but it's too late to worry about that now :)
Of course if you make music horrible enough sonically with brutal overcompression, most of it will take up the full dynamic range of the format. It's a pity that this sounds atrocious, as it's the only way to really make use of linear encoding :)
You are both correct.
If I were to sneak into a recording studio and insert a 20 kHz low-pass filter and inject white noise at -90dB, would anyone notice?
No.
And that's the sort of thing you should think about. Not mythical triangle and square waves (which instruments produce those?) but what's the bandwidth and the S/N ratio of the whole system? From mikes to master tapes to mixers to your home amps and speakers. All this put end-to-end, what is the bandwidth and S/N ratio? That's all. Once you have that, you know what is the lowest sampling frequency you can use, and the lowest resolution you can use.
Hint: the system has a bandwidth of less than 20 kHz and an S/N ratio of less than 70 dB.
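The last step (from bandwidth and S/N to sampling rate and resolution) is quick to sketch in Python. This uses the rule of thumb of about 6.02 dB per bit and ignores the small extra term that applies to a full-scale sine, so treat the bit counts as rough. The function names are invented for the example.

```python
import math

def minimum_sample_rate(bandwidth_hz):
    """Nyquist: sample at (at least) twice the highest frequency present."""
    return 2 * bandwidth_hz

def minimum_bits(snr_db):
    """Smallest whole number of bits whose ideal quantization range
    (about 6.02 dB per bit) covers the given signal-to-noise ratio."""
    return math.ceil(snr_db / (20 * math.log10(2)))

print(minimum_sample_rate(20000))   # 40000
print(minimum_bits(70))             # 12
print(minimum_bits(96))             # 16
```

With the hinted figures, 20 kHz and 70 dB, that comes to a 40 kHz sampling rate and 12 bits.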
Unlimited growth == Cancer.
I don't understand how you manage to turn the idea of a 14700 sine into a 918.75Hz tone, but so much the worse for you: the 14700 (ironically) does have the second harmonic obliterated by the 22.05K lowpass. When you start talking about 918 hz tones you shoot yourself in the foot- you don't _get_ a 919 hz filter in CD audio, and the modulations are going to still be there. They won't be as nasty as a stepped sampled wave but you've got quite a few harmonics that will remain after lowpass filtering at 22.05K (even the compromised realworld version). Some of the modulation you see _will_ survive this.
It's flatly incredible how many armchair audio theoreticians won't even go as far as you have, insightfully, gone, in seeing that the results of the sampling process are modulated. Phase differences do matter, but even if they did not, the degree of modulation of the sampled wave is mathematically provable unless you get to do a _specific_ lowpass filter on it. If you count obvious changes in phase, the amount of change is provable even when you do ideal lowpass filtering on it.
Yes, math is grand ;)
I thought DVD Audio had been put on hold while they worked on an "improved" copy protection scheme.
Mea navis aericumbens anguillis abundat
Well, MP3's sound fine to me. I listen to variable bitrate ones that I rip myself, usually, with max bitrate at around 320.
And I can't tell the difference between that and the CD on my $3000 sound system...
On 128kbps ones, sure. But 192 and higher, most people really _can't_ tell.
The only people who say they can tell are, as far as I see it, aloof audiophiles who are scared shitless that they aren't really getting their money's worth. These are the same people that cling to vinyl and say it's better than a CD.
Sorry, no. Sure, it might theoretically have a higher frequency response, but the human ear can only hear from 20 to 20000 Hz, which requires precisely CD quality sound. CD also allows for a little lower than that for big ol' subs.
So there's no point in going up to 100kHz audible range, because NOBODY EXCEPT MY CAT AND THE NEIGHBOUR'S YAPPING LITTLE DOG are going to get any advantage from it.
And my cat doesn't like Paul van Dyk anyways.
The reason that people tend to think that mp3 is 'cd quality' is due to two things.
1) The majority of mp3 aficionados are listening to pop music (or former pop music). To get the effect of this music, a really high quality setup is not required (many times because they have only listened to the CDs on crappy headphones or a mediocre stereo). So to them, the mp3 sounds perhaps slightly different, but not really any worse, and completely listenable. I know the first time I heard Pink Floyd: The Wall on a *nice* stereo (hand built & tuned amp, nice speakers...) I was amazed. I could not *BELIEVE* what I heard.
2) People use 'cd quality' to mean 'acceptable quality for me to listen to without pops and hisses'. You are entirely correct. They don't mean it's the same quality as a CD recording; they mean they couldn't care if it is or not.
And if any of them really loved symphony... you'd see that difference right away.
- Human voice sibilance is maybe 6% over 20K.
- A cymbal crash is as much as _40%_ over 20K.
- Keys jingling is _60%_ over 20K.
Note: the harmonic distortion content of the 918.75 Hz sine wave is not going to be continuous. It is going to be rapidly cyclical, as it's the product of interaction with the sampling rate: each harmonic's strength will oscillate between zero and whatever the maximum is (a fraction of a percent?) depending on the phase of the tone relative to the sampling rate. This effect will be considerably more obvious at higher frequencies but will still be present at 918 Hz. The harmonic distortion is never a continuous amount, but is invariably a rapid cycling between zero and the maximum amount.
Cables are _hugely_ important. It's not because of magic- it's because of capacitance, resistance, inductance and the way amplifiers interact with these qualities. You could easily have an amplifier and speaker combination where, with one set of cables, all was well, but with another set the amplifier would oscillate to feedback blowing up the amp! That is about as real as you could ask for. Normal cables are less prone to be _that_ bad, but there's still a great deal of difference in different cables. Again, it's not primarily the cable itself but the way it interacts with the transducer and especially the amplifier.
I think in order to understand this properly, you'd kind of have to have enough technical background to know that speakers' impedance drops sharply at the driver's resonance... without granting some facts like that, it's impossible to even begin to explain to otherwise smart people what's happening. 'But I put a voltage through and it was there on the other side of the wire!' is not evidence that a wire will perform under demanding dynamic conditions- or hooked up to a borderline-unstable amplifier. And competitive high end car audio is very much about borderline-unstable amplifiers due to the very low impedances :) interestingly, stadium sound reinforcement uses similar principles. You might have a stadium-sized PA with an amplifier that puts out one watt... and God knows how many amps, through a speaker network that is 0.00000000001 ohms. The rules change when you start to deal with jobs that big...
So here's the ugly truth. The MP3 revolution seems to have proven that most people have tin ears...
No, what the MP3 revolution shows is that almost all music fans value the content more than the technical reproduction. In the same way I'd prefer a worn paperback Nabokov to a shiny-new hardbound Stephen King, I'd rather, by a factor of a thousand at least, listen to a scratchy cassette copy of "Tim" on my Walkman than the latest Brittney Spears (sp.?) blasting out through the best stereo system you've ever seen or heard in your life. If you're not a Replacements fan, or if you are, God help you, a B.S. fan, go ahead and substitute in the above sentence the names of your faves ad lib.
Yours WDK - WKiernan@concentric.net
You're my favorite thing!
A square wave of 22050 can NOT be reproduced by even the most ideal audio equipment.
In fact, all the energy in the cosmos can't reproduce any square wave perfectly, because the leading edge of an ideal square wave is perfectly vertical, requiring an infinite acceleration at the speaker cone.
Fussily yours WDK - WKiernan@concentric.net
Let me ask a question from a coder's standpoint: how do I write code to perform operations on a high-frequency 1-bit sample? I can write code to lowpass, highpass, bass boost, mix, compress... 16-bit audio easily. But what happens when I have to write my .S3M player to do these effects on a 1-bit sample? Even if the format is technically superior, I think it may be too hard for the amateur to work with for it to be useful.
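One hedged answer to the coder's question: you probably would not filter the 1-bit stream directly. The usual trick is to low-pass and decimate it into ordinary multi-bit samples first, then reuse the 16-bit code you already have. The sketch below uses the crudest possible low-pass (a boxcar average over 64 bits); `decimate_1bit` and `to_int16` are hypothetical helper names, and a real decimator would use a much better multi-stage filter, since a 64-bit average can only produce 65 distinct levels.

```python
def decimate_1bit(bits, factor=64):
    """Average each run of `factor` one-bit samples (0 or 1) into one value in [-1, 1]."""
    pcm = []
    for i in range(0, len(bits) - factor + 1, factor):
        ones = sum(bits[i:i + factor])
        # Density of ones maps linearly onto amplitude: all zeros -> -1, all ones -> +1.
        pcm.append(2.0 * ones / factor - 1.0)
    return pcm

def to_int16(pcm):
    """Scale [-1, 1] floats to signed 16-bit integers, clipping out-of-range values."""
    return [max(-32768, min(32767, round(v * 32767))) for v in pcm]

# A stream that is 75% ones decodes to a steady level of +0.5.
stream = [1, 1, 1, 0] * 32          # 128 one-bit samples
levels = decimate_1bit(stream)      # two output samples
print(levels)
```

Once the data is in this form, lowpass, bass boost, mixing and so on work exactly as they do for CD audio; getting back to one bit afterwards means running a modulator again.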
More frequency range isn't going to be recorded, played, or heard by anyone.
First of all, things above 22kHz aren't picked up by ordinary mics... Even the ultra-high-end Neumann U87Ai only claims 20-20kHz frequency response (http://www.neumann.com/mics/u87ai.htm)
Secondly, most speakers won't crank out those high frequencies without a severe falloff in response: the high-end Genelec 1038A triamped monitor gets you 33-20k Hz (-3dB). (http://www.genelec.com/products/1038a/1038a.htm)
Finally, most people can't hear above 20kHz, especially those people who are incessantly blasting their ears out with loud music.
The best reason for Super CD (or DVD or whatever) is higher bit depth, NOT higher sampling rate; going from 16/44.1 (CD quality) to 24/44.1 takes just 50% more space, for nontrivially better quality, while going from 16/44.1 to 16/88.2 brings minimal benefit at a 100% space penalty.
Nyquist's theorem does not imply, however, that the representation of the maximum [or near maximum] frequencies will be highly accurate as far as the shape of the wave form is concerned.
That's incorrect. If the signal is sampled at more than twice its highest frequency, the signal can be reconstructed exactly. This assumes that the samples are recorded precisely without quantization, and that the signal is truly bandlimited.
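This can be checked numerically. The sketch below samples a 14700 Hz sine at 44100 Hz (only three samples per cycle, so the raw samples look nothing like a sine) and rebuilds the values between the samples with Whittaker-Shannon (sinc) interpolation. The phase offset and the truncation to 4001 samples are arbitrary choices for the demo; the truncation is the only reason the error is not exactly zero.

```python
import math

FS = 44_100.0     # CD sampling rate
F = 14_700.0      # test tone: exactly three samples per cycle
PHASE = 0.4       # arbitrary phase offset, chosen only for this demo

def signal(t):
    return math.sin(2 * math.pi * F * t + PHASE)

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(t, half_width=2000):
    """Sinc interpolation from the samples n = -half_width .. +half_width."""
    u = t * FS
    return sum(signal(n / FS) * sinc(u - n)
               for n in range(-half_width, half_width + 1))

# Compare at ten points spread across one sample interval.
worst = max(abs(reconstruct(k / (10 * FS)) - signal(k / (10 * FS)))
            for k in range(10))
print(f"worst-case error: {worst:.2e}")
```

The residual shrinks as `half_width` grows, which is the practical meaning of "exactly": the ideal filter needs all the samples, and real filters approximate it.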
This is why higher sampling frequencies ARE relevant to higher audio fidelity. Higher bit resolutions are arguable though...
No. Higher sampling frequencies allow you to get away with fewer bits per sample, and this usually simplifies the electronics. E.g., with delta-sigma modulation, the signal is sampled with 1 bit per sample at a very high sampling rate. The bit essentially encodes the change between successive samples, i.e. an increase or decrease, and if the sampling rate is high enough, the original signal can be reconstructed from this information fairly accurately.
Whatever you may think of Sony's conduct in general, this particular product is entirely consistent with the open-source philosophy that we all cherish. From the article:
The company was faced with archiving some 300,000 pre-1960 analog recordings. Unlike fine wine, analog does not age well with time; after about thirty years, recording tape becomes brittle and disintegrates. Preserving these analog masters by creating digital copies is of paramount importance. But the limitations of digital sound proved a hindrance.
Sony had an itch (archiving their own recordings), scratched it, and produced a valuable product as a side-effect. We should support them for it.
-- Anne Marie
Actually, DVD Audio discs won't play on existing DVD Video players. DVD-A was slated to use a variant of CSS ("CSS-2") for copy protection, but DeCSS put the kibosh on that. DVD-Audio as now shipping uses a new and supposedly improved encryption scheme (but still nonstandard and closed, so it's probably crackable). Think RIAA's having a fit now that we can make perfect rips of 44.1k/16-bit CDs? Hah! That's nothing :)
Anyway, most DVD-Audio players coming out now are also DVD-Video (thank god), but older DVD-V (and computer DVD-ROM drives) won't be able to read DVD-A discs.
Well, another day, another media format. Of course, the media companies will happily sell me their products. But I already have Radiohead's 'OK Computer' on CD, so I already paid the license fees. I want to 'upgrade' that CD to the format-du-jour, and am willing to pay the production costs and a little something to make it worthwhile for the industry to keep on developing new products. I do NOT want to pay royalties again, since I already did. And since I have always been told that those compact discs are so expensive because of the license fees, this upgrade should be quite cheap, am I right? I mean, I only OWN the piece of plastic, which is cheap. It is the license fee which drives up the price (or so 'they' say). So, just let me upgrade my piece of plastic then...
No, unfortunately I am wrong. But I should be right...
--frank[at]unternet.org
In short, no. Nothing ends up that way.
Others make MD players. The one thing that kept MD players from saturating the market was price... too expensive at first. More reasonable now, but not flexible enough given the current age. (I have a player, but never use it; I don't like not having direct access to copy my MDs)
How can one blame Sony for this? Did they *hurt* you by inventing a new technology and then making it expensive? No.. you didn't have it in the first place.
I mean, I hate corporatism (to coin katz) (I can't believe I did that) as much as anyone, but Sony is just not one of the companies that I think of as 'evil'.
I think some of their stuff is too expensive.. but that's their loss.
I'm amazed how so many Americans find the fact that Napster is being sued so outrageous... hell, you Americans sue each other like there's no tomorrow!
What do you mean 'once again reinventing'? How is this bad? So they come out with a new CD format that's proprietary and must be licensed from Sony.. how is this going to hurt you?
According to the article, DVD audio has an average sampling frequency of 96 kHz and can go as high as 192 kHz. It uses a 24-bit word length. (Regular CD is 44.1 kHz with 16-bit word length.) On the other hand, Sony Audio CD uses a sampling frequency of 2.82 MHz (which is 64 times more than regular CD) but only has a 1-bit word length (from my reading of the article). So, the Sony format holds only around 4 times the info of a regular CD.
By my calculations, at the average DVD sampling rate, Sony holds one-quarter more info than DVD audio (D = 0.8 * S). But at the maximum sampling rate of DVD, Sony holds one-third less info than DVD (D = 1.5 * S).
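Those ratios are easy to reproduce. The sketch below compares raw per-channel bit rates only (sample rate times word length), taking the Sony rate as exactly 64 x 44.1 kHz; it ignores channel count, packing, and any compression, so treat it as a back-of-envelope check of the figures above rather than a statement about what actually fits on a disc.

```python
def bitrate(sample_rate_hz, bits_per_sample):
    """Raw bits per second for one channel."""
    return sample_rate_hz * bits_per_sample

cd = bitrate(44_100, 16)           # 705,600 bit/s
sacd = bitrate(64 * 44_100, 1)     # 2,822,400 bit/s
dvd_96 = bitrate(96_000, 24)       # 2,304,000 bit/s
dvd_192 = bitrate(192_000, 24)     # 4,608,000 bit/s

print(f"SACD / CD      = {sacd / cd:.2f}")
print(f"DVD-A 96k / S  = {dvd_96 / sacd:.2f}")
print(f"DVD-A 192k / S = {dvd_192 / sacd:.2f}")
```

With these inputs the Sony stream is exactly 4 times CD, the 96 kHz case comes out near the 0.8 quoted above, and the 192 kHz case comes out a little higher than the rounded 1.5.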
Seeing as the Sony proprietary players cost a couple grand and DVD audio players cost a couple hundred, I'm inclined to go with DVD for my future audio needs and forsake the Sony format.
--
--
He lives in a world where those who do not run the client software of the omnipresent meme are unacceptable.
Calls to question, however, what will we do about all the CD's already burned at this rate (that is, if the standard gets acknowledged)? Will we be seeing Super CD's in the future with warnings on the back "Due to the fact that this CD was recorded in conventional format, you may experience small pops and scratches that are a known limitation of this original format" (much like they say with CD's created from old records).
Also, does DAT produce any better recordings? Does DAT even sample the same way a CD does?
- I don't care if they globalize against free speech. All my best free thoughts are done in my head.
Sorry, SACD is a lot more complex than a simple application of Nyquist can handle. The key to SACD's high fidelity is all in the quantization theory.
Yes, an SACD has a sample rate of 2.82MHz, but that's with one bit per sample (per channel). Yep, that's right---a single bit per sample. In fact, the signal-to-noise ratio on a SACD is very likely negative--there is more noise than signal.
Now before you blow your top with how absurd that sounds, let's clarify one thing: the SACD format jumps through serious technical hoops to ensure that the vast majority of that noise is in the completely inaudible range. And, the vast majority of the signal is, of course, within the audible range. The technique is, not surprisingly, called "noise shaping".
So once you limit your measurements to, say, 0-20kHz, you're back to where you would hope: the astronomical dynamic range and signal-to-noise ratio of a high-fidelity audio format. (In fact, SACD is designed to provide ultra-low noise and 120dB of dynamic range all the way out to 100kHz, from what I understand.)
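For the curious, here is a minimal sketch of the idea, assuming the simplest possible (first-order) sigma-delta loop. Whatever Sony actually uses for mastering is certainly a higher-order design, so this is an illustration of error feedback, not of the SACD encoder. Each output bit is wildly wrong on its own, but the integrator carries the error forward, so the average of the bits tracks the input.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator: inputs in [-1, 1], outputs 0/1 bits."""
    integrator = 0.0
    feedback = 0.0                      # previous output, as -1.0 or +1.0
    bits = []
    for x in samples:
        integrator += x - feedback      # accumulate the error made so far
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits

# A constant input of 0.5 should come out as a stream that is about 75% ones.
n = 6400
bits = sigma_delta_1bit([0.5] * n)
mean_level = sum(2 * b - 1 for b in bits) / n
print(f"mean output level: {mean_level:.4f}")
```

Because the accumulated error stays bounded, the slow (audio-band) part of the output matches the input closely while the error itself is pushed into rapid bit-to-bit flipping, i.e. toward high frequencies.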
For those of you who remember, or perhaps own, CD players with "1-bit D/A"s, you're using a similar version of this technology. The difference is that the SACD recording process can decide at the mastering stage how to get down to 1 bit per sample, and that's a much better place to make that decision.
DVD-A was slated to use a variant of CSS ("CSS-2") for copy protection
Let's see... W3C releases CSS (Cascading Style Sheets). DVDCCA and MPAA release CSS (Content Scrambling System).
Then W3C releases CSS2, with positioning. DVDCCA and RIAA release their own CSS2.
Trademark dilution?
Will I retire or break 10K?
First, although it is often quoted that people hear sounds up to 20 kHz, they don't. Most people hear up to about 15 kHz. Typical listeners lose 5 dB in threshold between 5 and 10 kHz, and another 10 dB between 10 and 15 kHz. The thresholds are so high around 15 kHz that you cannot ever hear such frequencies with normal sound equipment. For almost any recordings knocking off all frequencies above 12 kHz won't change perception during playback.
Second, almost all speakers drop off severely above 20 kHz. There is no easy way to play such sounds unless you get a speaker designed to keep gophers away (they, like many mammals, hear above 20 kHz).
Third, you cannot reconstruct accurately sound frequencies above 10 kHz with standard CD encoding anyway. 16bit sampling is simply not enough. I've spent some time reverse engineering 20 millisecond sound samples, and I can tell you from experience that with limited sampling bits you are better off doubling the Nyquist. And the Nyquist theorem doesn't guarantee such reconstruction anyway with 16 bit sampling.
SONY is pulling a marketing ploy - redesigning sound recordings to try to give people a little extra - well, I'll tell you what, I'll bring my dog along. At least he has a chance of hearing the differences. Because I sure don't.
There are a lot of people in this world who can hear the difference and suggest that the CD format is severely limited in terms of its frequency range.
It's not as simple as you claim.
DSD is a single bit recording format
Which is identical to the format the PC Speaker accepts.
CD's are a single bit recording format
No. CDs are sampled at 44.1 KHz, 16 bits per channel.
The big question in my mind is what sort of recording time do they have? The sampling rate is 2.82 MHz
And the data rate is 2.82e6 samples/sec * 1 bit/channel * 2 channels/sample * 1 byte/8 bits = 705 KBytes/sec, which makes (assuming DVD data density of 4.7 GB/side) 1 hour and 51 minutes of play time.
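The same arithmetic as a quick script, for anyone who wants to try other capacities or channel counts. The 4.7 GB figure is the assumption made above (decimal gigabytes, DVD-like density); the real disc capacity and any lossless packing are not taken into account.

```python
SAMPLE_RATE_HZ = 64 * 44_100     # 2,822,400 one-bit samples per second
CHANNELS = 2
CAPACITY_BYTES = 4.7e9           # assumed capacity, as in the estimate above

bytes_per_second = SAMPLE_RATE_HZ * CHANNELS / 8      # 1 bit per sample
seconds = CAPACITY_BYTES / bytes_per_second

hours = int(seconds // 3600)
minutes = int(seconds % 3600 // 60)
print(f"{bytes_per_second:.0f} bytes/s -> {hours} h {minutes} min")
```

Raising `CHANNELS` to 6 for a surround mix cuts the raw playing time to a third, which suggests multichannel discs would need some form of compression.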
Will I retire or break 10K?
You know, Sony seems to have a real problem grasping the concept of standard formats. Witness the Betamax, 8MM video, and now the Memory Stick. What's really weird/ironic is that they're partnering with Philips, a company famous for giving away the analog audio cassette patent -- and driving out a lot of competing formats in the process.
The above link is broken because Slashdot insists on munging the following url:o ral_histories/transcripts/drabek.html
http://www.ieee.org/organizations/history_center/
__________
Aren't these the same pinheads who had a run-in with Bob Carver a few years ago?
Carver offered to tune the transfer function of one of his M400 amps to match the most expensive frou-frou tube amp the Stereophile morons could muster, and they couldn't tell the difference in a double blind test. After admitting as much in one issue of their magazine, they reneged in the next issue, after they realized that their market is the kind of jackasses who would spend fifty grand on a turntable with a fluid transmission.
That magazine is all about snob appeal, and has nothing whatsoever to do with science.
-jcr
(I've got a set of speaker cables on e-bay, for the very low reserve price of just two grand, each!)
The only title of honor that a tyrant can grant is "Enemy of the State."
- 16 bits isn't enough. That's _really_ obvious at this point- no professional works in 16 bits except for the final CD output. Mix busses have to be many times that in the digital domain, but even if you mix with ideal noiseless coloration-less electronics there's a really big difference between monitoring an undigitised feed of the signal and monitoring the 16 bit output.
- 44.1K isn't enough either. This is not primarily due to people being able to hear beyond 20K (though you can sense such sounds to some extent- why do you think smashing glass or dropped plates make you jump? Viciously loud supersonic transients), it's due to the brick-wall filters required. High end amplifier designers go to great lengths to get their pass-bands up into the megahertz (and nobody claims humans hear that!) because cutting off lower causes interactions across the entire frequency band. Cutting off at 22K is just ridiculous.
Now, how does the Sony approach compare? The neat thing about the bit rate is that it's effectively infinite bit rate- it's not a finite set of voltage levels but just one bit very fast tracing a voltage level that could be anywhere. This is substantially beyond even 24 bit- a major, major advance. That's gonna be very noticeable.
As for frequency, there is a surprise in store here. It may or may not be competitive with advanced PCM encoding at say 96K- but two very, very important points:
- There doesn't need to be _any_ brickwall filter on the output- provided a circuit can be made to output this stuff that doesn't merely calculate it as a super-PCM-encoding and D/A converter. If the format can feed a sort of very high frequency analog synthesiser, no filter is needed- which is critical, because...
- ...the potential slew rate of this technology is just astronomical. I hope the power supply of the players is up to it- if not there will be some very effective power supply mods waiting to be done, such as backing up the power supply with MIT Multicaps (a film cap that can produce very very high instantaneous voltage). Basically, if you fed this technology a big square wave, it might not be able to turn the corners of the wave instantly, but the vertical parts of the wave would be _vertical_- no brick-wall-filtered system can get anywhere close to this.
We're talking absurdly high transient peak voltages here: this is why high end audiophiles use absurdly heavy cables and absurdly powerful amplifiers, to let those peaks through. It doesn't hurt the speakers: this isn't RMS or even 'peak' wattage, the spikes are of such short duration that you can feed speakers many times the maximum 'peak' voltage if it's only for a microsecond, and high end systems do just that.
Where do you find such peaks? Easy- The Who ;) seriously, The Who is a _good_ example, but symphony orchestras are also good for this. The capacity for this type of extreme and essentially 'inaudible' (too brief!) transient translates to the ability to produce the _sensation_ of loudness- for instance, you could easily make many systems play 'Live At Leeds' and sound loud and bright and kind of grating and ear-splitting, but with this technology it would be less grating but more _electrifying_ and the impact would be like having the living people right there playing at you, not just a bunch of very loud sounds. Alternately, you could play big orchestra crescendos and the resulting sound would be _huge_, not just loud but as big as a live performance.
It's really not hard to make stuff sound 'loud', but making it _feel_ loud is something else. If you don't have that, the loudness ends up being just a grating, thin surface, which is actually a very good description of the sound of most pop recordings these days :) the irony is that this technology is coming around just when the recording industry's pushing sounds that are substantially worse than even CD audio can produce...
Bottom line: I want one. Specifically, I want this to _master_ to. I have quite a bit of stuff that loses about 2/3 of its potential when made into 44/16 (eight tracks of 48/20 output analog and mixed with passive resistance mixing will tend to do that- I once figured the rough equivalent resolution was about a 64 bit mix bus, possibly higher) Maybe I should try to wheedle Sony out of a recorder ;)
Unfortunately it's not gonna be the same with audio- people _just_dont_hear_the_difference_. You're not going to be able to sell someone a super high definition sound system like HDTV; pretty much anyone can look at an HDTV and say "wow, that looks great." Once they get cheap, they'll be flying off the shelves.
However in the rarified world of audio, you can't draw much of a parallel. Most of the listening public are sheep; to them the stock stereo in their Honda Civic is more than they could ever want, and they can't hear the glaring deficiencies with it. What difference would SACD make for them? Absolutely nothing. _That_ is why 128kbit mp3's are so common.
An illustration:
I was encoding some radio commercials into mp3's at work a while ago so people could e-mail them around. I was going from a studio copy on a burned cd; it was probably taken right off the DAT's. I encoded it at 160kbit, and then again at 48kbit mono and played both for a bunch of people at work on decent stereos and they couldn't tell the difference. To me, the low bitrate one sounded like ass, but they were perfectly happy with the 48kbit one.
This is why you hear the word "audiophile" a lot more than "videophile."
If you do the math, you'll see that the maximum error you obtain with 16-bit sampling is all you need given that line-out signals are 3 volts peak-peak. I don't remember the exact voltage, and I'm too tired to re-work it now. (3 volts divided by 2^16...) But the maximum error of a 16-bit sampled signal is FAR lower than the minimum electrical noise added by even the best super-high quality amplifiers with gold-plated speaker wire contacts and the like. The mathematics behind that are far simpler than the mathematics behind the Nyquist theorem. (While I knew the NT to be true in the past, I finally saw a proof/derivation of it in my Signals and Systems lecture last week - cool stuff.)
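Since the exact figure was left unworked, here is the division done out. It assumes the 3 V peak-to-peak line level mentioned above and an ideal 16-bit converter; the worst-case rounding error is taken as half of one step.

```python
import math

FULL_SCALE_VOLTS = 3.0    # peak-to-peak line level assumed in the comment above
BITS = 16

step_volts = FULL_SCALE_VOLTS / 2 ** BITS     # size of one quantization step
max_error_volts = step_volts / 2              # worst-case rounding error
range_db = 20 * math.log10(2 ** BITS)         # full scale relative to one step

step_uv = step_volts * 1e6
print(f"step: {step_uv:.1f} uV, max error: {max_error_volts * 1e6:.1f} uV, "
      f"range: {range_db:.1f} dB")
```

That works out to a step of about 46 microvolts and a worst-case error of about 23 microvolts, roughly 96 dB below full scale, which is the number to compare against an amplifier's own noise floor.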
retrorocket.o not found, launch anyway?
You can hear the difference between 2-channel and 5.1 surround. You can hear a BIG difference.
But you can mathematically prove that you can't hear the difference between 44.1 KHz/16 bits and something "better". (Note, as mentioned before, there are some slight exceptions... The mathematics that say 44.1 KHz is enough assume a perfect low-pass filter with no phase-shifting and an infinite slope dropoff. Using a somewhat higher sampling rate allows you to use a less-perfect LPF.) But above 96 KHz, you need not bother... 96 gives PLENTY of headroom for the filter designers.
So this SACD format is going to die unless Sony pushes it with HEAVY marketing. But DVD-A has more people backing it with cheaper (but better!) equipment.
retrorocket.o not found, launch anyway?
The difference between this and CD audio _is_ quite comparable with the difference between DVD and VHS- except there is no change that I can see in listening behavior- it's all audio quality. I'd check it out- if I don't miss my guess the difference will drop your jaw. It's not a question of 'all of a sudden the high frequencies are X times louder, thanks Sony': the frequency ranges may not seem any different. However, the technology has the capacity to produce an extended range with _no_ filter ringing of any sort, plus effectively infinite bit depth (I'd be _very_ interested to know exactly how they are doing playback- is it essentially an analog circuit steered by the digital data?). The result will be this: the sound will be bigger, lusher, more vivid yet warmer at the same time. It'll be like a 35mm film photograph compared with a generic digital camera- the 'liquid' quality should come across even on very cheesy speakers, to some extent, but of course the finer the equipment the better. Actually, if you played it over those Klipsch Promedias the differences should be particularly striking- the Klipsch use a sort of horn loading on the sats for both the woofer and tweeter, and this will maximise the increased transient dynamics. Those are relatively affordable computer/multimedia speakers and some Slashdotters probably have them- the acoustic power output they have is what will show off this new format, there will be some very hot peaks in the signal, felt more than heard.
Again, VHS to DVD is a pretty good analogy for what this format would do, and the frequency range is not the most significant factor- effective bit depth is important but the capacity for slew rate and lack of HF ringing from lowpass filters are really the killer aspects. It _will_ come across even on humble speakers (actually the rock bottom cheapness ones may be better than feature-laden mid-level crud w. 7 band equalisers). I want one, but only if I can record to it, as I don't care in the slightest what Sony artists are doing lately. :)
Yes, and it is a good thing. 16-bit samples are good but not great. 24-bit samples are getting well past the great and into the "more than we'll ever need" stage.
Smaller samples produce quantisation errors. The noise produced by quantisation errors is difficult to remove. 24-bit samples make these errors basically irrelevant.
It's interesting to notice that audiophiles often focus on vinyl's claimed (but untrue) higher recordable frequencies, but neglect the far more important quantisation errors.
CD has quantisation errors approximately 2 orders of magnitude smaller than vinyl (-26dB).
Now that DVD-Audio is finally hammered out (dear god why did it take so long???) they come out with this, thinking that somehow it could catch on as a mainstream standard.
Kind of reminds me of DCC back in '92. It's a format built around half-backward compatibility (you need new equipment, but at least it can read your old media too), touting enhancements over the media it seeks to replace, while admitting that only audiophiles will be able to hear the difference, and that media will cost more.
Really, truly? I think nowadays the primary motivation behind any new audio media format is to inhibit rampant replication. Do you have a Super-CD burner? Will they sell one?
The article describes the tech as if the advancements in sampling were directly tied to the 'advancements' in media. There's no reason this data format couldn't be stored on a DVD, except that DVD-RAM players are becoming too readily available for Sony Music's liking.
Kevin Fox