Audio Compression Primer

Posted by CmdrTaco on Thursday January 13, 2005 @07:31AM from the do-you-hear-what-i-hear dept.

Hack Jandy writes "For those of you with a little extra time this afternoon, check out Sudhian's primer to all things concerning audio compression. The article details everything from DRM to CRC matrixes (with a healthy dosage of Ogg)."

4 of 236 comments

AAC by sometwo · 2005-01-13 07:49 · Score: 3, Informative

So what about AAC used by Apple in their music store?

I did a little googling and found this (http://www.teamcombooks.com/mp3handbook/13.htm):

AAC (Advanced Audio Coding) is not a MPEG layer, although it is based on a psycho-acoustic model. Sometimes referred to as MP4, AAC provides significantly better quality at lower bit-rates than MP3. AAC was developed under MPEG-2 and also exists under MPEG-4.

AAC supports a wider range of sampling rates (from 8 kHz to 96 kHz) and up to 48 audio channels, plus up to 15 auxiliary low frequency enhancement channels and up to 15 embedded data streams. AAC works at bit rates from 8 kbps for mono speech and up to in excess of 320 kbps for high-quality audio. Three profiles of AAC provide varying levels of complexity and scalability.

AAC software is much more expensive to license than MP3 because the companies that hold related patents decided to keep a tighter rein on it. Most AAC software is geared towards professional applications and secure music distribution systems, so it may be a while before you see AAC in consumer-oriented products.

more algorithms by barik · 2005-01-13 07:59 · Score: 5, Informative

While the article is a primer, I was a little disappointed in the algorithmic treatment given in the article itself. Right now I know of two excellent free publications: Introduction to Sound Processing and The Sounding Object, which both treat the theoretical, DSP side of things. Any other resources that Slashdot readers can recommend for those who are interested in the subject of audio compression and representation?

--
Titus Barik

Re:128K should be enough for everyone by pthisis · 2005-01-13 08:21 · Score: 3, Informative

especially when listening to music on hi-quality speakers a la Bose

Bose doesn't make high-quality speakers; they make expensive speakers that don't perform nearly as well as alternatives (for instance, the Acoustimass satellites use crappy paper cones that perform poorly in the upper frequencies). A $300 pair of B&W DM302's will soundly thrash anything Bose makes on sound quality. Also investigate Hale, Thiel, or Paradigm. If you really want to spend thousands, spend it on Magnepan (Magneplanar 1.6Q) or Vandersteen (2ce signature) or the higher end speakers from the companies I already mentioned. But those DM302's are good enough to be highly rated by places like Stereophile magazine and they're an incredible deal.

If you really want a bunch of little satellite speakers, Energy makes a much better sounding (and somewhat cheaper) system like that. I hear from people I trust that Tannoy makes an incredible one as well, but I haven't heard it.

--
rage, rage against the dying of the light

ARRRG! He gets Nyquist WRONG! by wowbagger · 2005-01-13 08:36 · Score: 3, Informative

According to the "Nyquist Theorem," you need to have twice as many digital samples as the frequency of the analog signal you are trying to represent to have enough data to accurately build it.

WRONG!

Nyquist's criterion is "You must take at least twice as many samples per second as the BANDWIDTH of the signal in order to correctly reconstruct it."

You can take a 10.7 MHz signal, and sample it at 10000 samples per second, and correctly reconstruct it, so long as the signal is guaranteed to be bandwidth limited to 10.7 MHz +/- 2.5 kHz. This is often done in software defined radio to acquire the signal from the intermediate frequency (IF) of the analog front end.
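
To make the undersampling arithmetic concrete, here is a minimal Python sketch of where a tone lands after sampling. The function name `alias_frequency` is made up for illustration, and the sketch assumes ideal, real-valued sampling with integer frequencies in Hz. Under that assumption, tones just above and just below 10.7 MHz fold onto the same baseband frequency, which is presumably one reason a practical receiver would sample I and Q separately or place the IF slightly off a multiple of the sample rate.

```python
def alias_frequency(f_hz: int, fs_hz: int) -> int:
    """Apparent frequency (0..fs/2) of a tone at f_hz after real sampling at fs_hz.

    Hypothetical helper for illustration; assumes ideal sampling and
    integer frequencies in Hz.
    """
    r = f_hz % fs_hz           # position within one sample-rate interval
    return min(r, fs_hz - r)   # fold into the first Nyquist zone


if __name__ == "__main__":
    fs = 10_000  # 10000 samples per second, as in the comment above
    for f in (10_700_000, 10_701_000, 10_699_000, 10_702_500):
        print(f, "Hz shows up at", alias_frequency(f, fs), "Hz")
```

The whole 10.7 MHz +/- 2.5 kHz band maps into 0 to 2.5 kHz, even though the carrier is more than a thousand times the sample rate.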

You also have to have an appropriate reconstruction filter at the output of the system in order to correctly recover the signal - if you don't have the right reconstruction filter, you will NOT reconstruct the signal correctly.

You also have to take into account the effects of any signal modulation - take a 20 kHz sine wave, and burst it for 10 msec, and you widen the bandwidth of the signal by about 100 Hz (depending upon the exact shape of the burst - a perfect square burst will widen the signal as a sinc function and will, in effect, increase the bandwidth to infinity, which is why square bursts are generally Considered Harmful in communications work).
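
The burst figure can be checked with a few lines of Python. This is a rough sketch, assuming an ideal rectangular burst of duration T, whose spectrum around the carrier follows |sin(pi f T) / (pi f T)|; the function name `rect_burst_envelope` is invented for the example.

```python
import math


def rect_burst_envelope(offset_hz: float, duration_s: float) -> float:
    """Relative spectral magnitude of a rectangular burst, offset_hz from the carrier.

    Hypothetical helper; models an ideal square burst only.
    """
    x = math.pi * offset_hz * duration_s
    return 1.0 if x == 0 else abs(math.sin(x) / x)


if __name__ == "__main__":
    T = 0.010  # 10 msec burst
    print("first null at", 1 / T, "Hz from the carrier")
    for offset in (0, 50, 100, 150, 1050):
        print(offset, "Hz:", round(rect_burst_envelope(offset, T), 4))
```

The first null sits at 1/T = 100 Hz from the carrier, which matches the "about 100 Hz" above, and the sidelobe peaks beyond it fall off only as 1/f, which is why a square burst never really stops occupying spectrum.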

Also, you don't oversample a signal in time to account for "rounding errors" - you oversample in time because the frequency response of sampling a system in time introduces a sinc response in frequency - by moving the sampling rate up you reduce the impact of this response on the recovered signal's frequency response. You also greatly ease the requirements on the reconstruction filter - the filter can be wider (have fewer poles in the transfer function - thus fewer parts needed).
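
As a rough illustration of the sinc point, the sketch below assumes the sinc response comes from a zero-order hold at the output (each sample held for one sample period), which is one common reading and not necessarily the only one intended above. The function name `hold_droop_db` and the 44.1 kHz and 4x rates are choices made for the example.

```python
import math


def hold_droop_db(f_hz: float, fs_hz: float) -> float:
    """Gain in dB of an ideal zero-order hold at f_hz, for sample rate fs_hz.

    Hypothetical helper; valid for 0 <= f_hz < fs_hz.
    """
    x = math.pi * f_hz / fs_hz
    gain = 1.0 if x == 0 else math.sin(x) / x
    return 20 * math.log10(abs(gain))


if __name__ == "__main__":
    for fs in (44_100, 4 * 44_100):
        print(fs, "samples/s:", round(hold_droop_db(20_000, fs), 2), "dB at 20 kHz")
```

Under these assumptions the droop at 20 kHz is a little over 3 dB at 44.1 kHz and only a fraction of a dB at four times that rate.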

--
www.eFax.com are spammers