Audio Compression Primer

Re:Virtually dismisses lossy compression by tsanth · 2005-01-13 07:43 · Score: 2, Informative

Given the topics in the audio section (it has an audio section!), the site seems to lean more towards audiophiles.

I don't agree with the dismissal of lossy algorithms either, but I think it makes sense given the context.

Re:Is FLAC worth it? by jasoncc · 2005-01-13 07:49 · Score: 2, Informative

I use FLAC because converting from a lossy format to another lossy format can produce crappy results. If I choose a lossy format for all my audio and then I need the audio to be in some other lossy format, I might be screwed.

You might choose Ogg for your audio then sometime in the future, a new lossy format sweeps the industry. Your Ogg files might not convert well to the new format.

and besides...Disk is Cheap!

being pedantic, but... by demonbug · 2005-01-13 07:49 · Score: 2, Informative

Trying to transmit audio data with uncompressed audio or video is not the easiest task. After all, even an audio CD contains data that transmits at 1400kb/s

Shouldn't that be 1200 kb/s? 150 KB/s * 8 = 1200 kb/s, right? Or is the 150 KB/s figure I'm using incorrect (I could have sworn that was the 1x CD speed)?

Re:being pedantic, but... by stratjakt · 2005-01-13 07:53 · Score: 2, Informative

44100 Hz * 16 bits * 2 channels = 1411200 bits per second, or roughly 1400 kb/s

The 150 KB/s number is for CD-ROM data storage; the gap between the two data rates goes to the extra error detection and correction.

--
I don't need no instructions to know how to rock!!!!
Re:being pedantic, but... by stratjakt · 2005-01-13 07:55 · Score: 2, Informative

Err, that would be error codes and positional information.

There's even a little more room, in the subcode channels where one can hide the data for CD+G (karaoke) or CD-TEXT.

--
I don't need no instructions to know how to rock!!!!
Re:being pedantic, but... by Piquan · 2005-01-13 08:09 · Score: 2, Informative

Shouldn't that be 1200 kb/s? 150 KB/s * 8 = 1200 kb/s, right? Or is the 150 KB/s figure I'm using incorrect (I could have sworn that was the 1x CD speed)?
Data CDs are 150 KB/s at 1x, but you're missing an important difference between data and audio CDs.
CD sectors are 2352 bytes (I'm ignoring subchannels here). Data CDs carry 2048 data bytes per sector, with the other 304 bytes going to sync, header, and an extra layer of error detection and correction, so every bit comes off perfectly. Audio CDs skip that extra layer and use all 2352 bytes for audio data (on the assumption that a few missed bits won't hurt). That means that audio data is moved 14.8% faster (in b/s) than 9660 data. 1200 * 1.148 = 1378.
Another calculation you can use instead: 44100 samples/sec * 2 channels/sample * 16 bits/channel = 1411200 bits/sec, or 1378 Kb/s if you divide by 1024.
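
For anyone who wants to check the arithmetic in this sub-thread, here is a minimal Python sketch. It assumes the standard 75 sectors per second at 1x, a figure the posts above don't state, and the constant names are only illustrative.

```python
# Sanity check of the CD data-rate figures discussed above.
# Assumption: a 1x drive reads 75 sectors per second.
SECTORS_PER_SECOND = 75
SECTOR_BYTES = 2352   # raw sector size, ignoring subchannels
DATA_BYTES = 2048     # user data per sector on a data CD

audio_bps = SECTORS_PER_SECOND * SECTOR_BYTES * 8    # audio rate, bits per second
data_bytes_per_s = SECTORS_PER_SECOND * DATA_BYTES   # data rate, bytes per second

# The same audio rate derived from the sampling parameters.
pcm_bps = 44100 * 2 * 16

print(audio_bps, pcm_bps)          # both are 1411200
print(data_bytes_per_s / 1024)     # 150.0, the familiar 1x data figure in KB/s
print(SECTOR_BYTES / DATA_BYTES)   # about 1.148, the "14.8% faster" ratio
```

Both routes land on the same 1411200 b/s, so the 1200 and 1400 figures are both right; they just describe different sector layouts.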

AAC by sometwo · 2005-01-13 07:49 · Score: 3, Informative

So what about AAC used by Apple in their music store?

I did a little googling and found this (http://www.teamcombooks.com/mp3handbook/13.htm):

AAC (Advanced Audio Coding) is not a MPEG layer, although it is based on a psycho-acoustic model. Sometimes referred to as MP4, AAC provides significantly better quality at lower bit-rates than MP3. AAC was developed under MPEG-2 and also exists under MPEG-4.

AAC supports a wider range of sampling rates (from 8 kHz to 96 kHz) and up to 48 audio channels, plus up to 15 auxiliary low frequency enhancement channels and up to 15 embedded data streams. AAC works at bit rates from 8 kbps for mono speech and up to in excess of 320 kbps for high-quality audio. Three profiles of AAC provide varying levels of complexity and scalability.

AAC software is much more expensive to license than MP3 because the companies that hold related patents decided to keep a tighter rein on it. Most AAC software is geared towards professional applications and secure music distribution systems, so it may be a while before you see AAC in consumer-oriented products.

FLAC will live forever by parvenu74 · 2005-01-13 07:50 · Score: 2, Informative

Because the code is open source, FLAC will be around forever and available on whatever OS/Platform you want to use it on if you feel like compiling the software.

Another reason it's going to be around and much more prevalent as time goes on is that the compression is so good and the speed/resource usage figures are so attractive. When I rip CDs to FLAC I am limited to 40x by my burner (CPU utilization is around 20-25%). When I rip the same CD to Ogg, I top out under 30x because the processor has reached 100% utilization.

Fast. Free. Efficient. Frugal with the CPU. What else do you need?

Re:128K should be enough for everyone by wfberg · 2005-01-13 07:54 · Score: 2, Informative

FM Radio is far from CD quality hence there isn't really a need to use very high bitrate MP3s or whatever

Or consider this: since FM radio has a limited range of frequencies that come across well, songs that are intended to be widely played on FM radio (e.g. Britney Spears's latest "hit" song) are actually engineered to sound best in those frequencies. The end result is that when you hear Britney Spears on the radio, the track sounds just like it does on the CD.

Meanwhile, quality music, lovingly mixed onto CD by people who actually give a damn, sounds like crap on the radio..

In other words; if you can't hear the difference between 128kbps and higher, it might just be that you're listening to mass produced music.

As for musicians preferring 128kbps? Well, sound engineers usually don't sit on stage with zillion Watt speakers right next to their fragile precious ears for a reason..

Me, I have crap taste in music AND I'm tonedeaf, so whatever, 128kbps all the way! ;-)

(MPEG artifacts in video drive me nuts, though)

--
SCO employee? Check out the bounty

more algorithms by barik · 2005-01-13 07:59 · Score: 5, Informative

While the article is a primer, I was a little disappointed in the algorithmic treatment given in the article itself. Right now I know of two excellent free publications: Introduction to Sound Processing and The Sounding Object, which both treat the theoretical, DSP side of things. Any other resources that Slashdot readers can recommend for those who are interested in the subject of audio compression and representation?

--
Titus Barik

Re:more algorithms by Hal-9001 · 2005-01-13 10:28 · Score: 2, Informative
Any other resources that Slashdot readers can recommend for those who are interested in the subject of audio compression and representation?
- An older but good technical survey of digital audio compression, including MP3, is Davis Yen Pan, "Digital Audio Compression," Digital Technical Journal (Spring 1993). (PDF)
- Some other technical reference material on MP3 is also available on the Digital Audio Systems website.
- A more recent survey of perceptual coding of audio, which covers more recent formats like AAC, is Painter and Spanias, "Perceptual Coding of Digital Audio," Proc. IEEE (April 2000). (PDF)
- Ogg Vorbis is documented on the Xiph.org website, but I found the documentation to be lacking when read from a signal processing perspective. Christopher Montgomery provides a better description from that perspective in a Slashdot interview from 2000. I found another good description in this thread in the hydrogenaudio forums--it hyperlinks a good block diagram of the encoding process.
--
"It take 9 months to bear a child, no matter how many women you assign to the job."

Re:The actual meaning of lossless ?? Any clues? by stratjakt · 2005-01-13 08:04 · Score: 2, Informative

If it's lossless, you should be able to take digital file A, compress it into compressed file B, and then if you uncompress B to get A', then A' = A.

That is, the checksums for A and A' should match, etc.

That's how I define mathematically lossless.
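
A minimal sketch of that round-trip test in Python. Note the assumptions: zlib stands in for FLAC here (it is a general-purpose lossless compressor, not an audio codec), and the "audio" is made-up sample data, not a real rip.

```python
import hashlib
import zlib


def sha256_hex(data: bytes) -> str:
    """Checksum used to compare the original with the round-tripped copy."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for file A: one second of fake 16-bit PCM samples (hypothetical data).
original = b"".join((n * 37 % 65536).to_bytes(2, "little") for n in range(44100))

compressed = zlib.compress(original)     # file B
restored = zlib.decompress(compressed)   # file A'

is_lossless = sha256_hex(original) == sha256_hex(restored)
print(len(original), len(compressed), is_lossless)
```

If the checksums match, the codec is lossless in the mathematical sense; how the playback chain sounds is a separate question.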

Whatever this asshat is on about double blind and testing and all that, has more to do with the ability of his FLAC playing equipment to sound the same as his CD player, which is a whole 'nother ball of wax altogether.

--
I don't need no instructions to know how to rock!!!!

Re:Virtually dismisses lossy compression by Sebastopol · 2005-01-13 08:14 · Score: 2, Informative

Yes, I noticed the article is 3 PAGES LONG! It makes only passing reference to other codecs. Not much of a primer, and it didn't take the entire afternoon to read; it took 5 minutes.

Did I miss a crucial link or something?

--
https://www.accountkiller.com/removal-requested

Re:One sad bit.. by Anonymous Coward · 2005-01-13 08:16 · Score: 1, Informative

Vorbis decoder is and has been done for a long time. Like other codecs, tweaks can always be made to the encoder to produce better results by using different psychoacoustic models, etc. As long as the output still follows spec, the decoder will still decode just fine. This is why your crappy MP3's from 1997 still play today, and fancy MP3's from today will still play on those old sound players from 1997. As long as the encoder follows spec, the decoder will always be able to decode it properly.

Re:128K should be enough for everyone by pthisis · 2005-01-13 08:21 · Score: 3, Informative

especially when listening to music on hi-quality speakers a la Bose

Bose doesn't make high-quality speakers, they make expensive speakers that don't perform nearly as well as alternatives (for instance, the Acoustimass satellites use crappy paper cones that perform poorly in the upper frequencies). A $300 pair of B&W DM302's will thrash anything Bose makes soundly for sound quality. Also investigate Hale, Thiel, or Paradigm. If you really want to spend thousands, spend it on Magnepan (Magneplanar 1.6Q) or Vandersteen (2ce signature) or the higher end speakers from the companies I already mentioned. But those DM302's are good enough to be highly rated by places like Stereophile magazine and they're an incredible deal.

If you really want a bunch of little satellite speakers, Energy makes a much better sounding (and somewhat cheaper) system like that. I hear from people I trust that Tannoy makes an incredible one as well, but I haven't heard it.

--
rage, rage against the dying of the light

Actually, you hear quantization distortion by cogito+ergo+blog · 2005-01-13 08:27 · Score: 2, Informative

(Mod to -3, nitpicking)

The MDCT in itself is actually lossless. Any distortion you notice is most likely introduced by the quantization applied post MDCT during compression.
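
To illustrate the point, here is a small pure-Python sketch: a textbook MDCT with a sine window and 50% overlap-add, run once untouched and once with the coefficients rounded to a coarse step. It is not taken from any particular codec; the function names, block size, and step size are made up for the example, and plain uniform rounding is only a crude stand-in for the psychoacoustically driven quantizers real encoders use.

```python
import math


def sine_window(n_half):
    """Sine window of length 2*n_half; satisfies w[n]^2 + w[n + n_half]^2 = 1."""
    size = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def mdct(block, n_half):
    """Forward MDCT: 2*n_half windowed samples in, n_half coefficients out."""
    n0 = 0.5 + n_half / 2
    return [sum(x * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
                for n, x in enumerate(block))
            for k in range(n_half)]


def imdct(coeffs, n_half):
    """Inverse MDCT, scaled by 2/n_half as needed for the windowed overlap-add."""
    n0 = 0.5 + n_half / 2
    return [(2.0 / n_half) * sum(c * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
                                 for k, c in enumerate(coeffs))
            for n in range(2 * n_half)]


def round_trip(signal, n_half, step=None):
    """Analyse and resynthesise; if step is given, round coefficients to it.

    len(signal) must be a multiple of n_half.
    """
    window = sine_window(n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - n_half, n_half):
        block = [padded[start + i] * window[i] for i in range(2 * n_half)]
        coeffs = mdct(block, n_half)
        if step is not None:
            coeffs = [step * round(c / step) for c in coeffs]  # the lossy part
        for i, y in enumerate(imdct(coeffs, n_half)):
            out[start + i] += y * window[i]  # overlap-add cancels the aliasing
    return out[n_half:n_half + len(signal)]


signal = [math.sin(0.3 * n) for n in range(32)]
exact = round_trip(signal, 8)
lossy = round_trip(signal, 8, step=0.5)

exact_err = max(abs(a - b) for a, b in zip(signal, exact))
lossy_err = max(abs(a - b) for a, b in zip(signal, lossy))
print(exact_err, lossy_err)
```

Without the rounding step the error is at floating-point noise level; with it, the error jumps by many orders of magnitude, which is the distortion being attributed to quantization rather than to the transform.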

--
"There is no dark side of the moon really. Matter of fact it's all dark."

ARRRG! He gets Nyquist WRONG! by wowbagger · 2005-01-13 08:36 · Score: 3, Informative

According to the "Nyquist Theorem," you need to have twice as many digital samples as the frequency of the analog signal you are trying to represent to have enough data to accurately build it.

WRONG!

Nyquist's criterion is "You must have at least twice as many samples as the largest BANDWIDTH of the signal in order to correctly reconstruct it."

You can take a 10.7 MHz signal, and sample it at 10000 samples per second, and correctly reconstruct it, so long as the signal is guaranteed to be bandwidth limited to 10.7 MHz +/- 2.5 kHz. This is often done in software defined radio to acquire the signal from the intermediate frequency (IF) of the analog front end.
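
A rough Python sketch of where tones near that 10.7 MHz IF land after sampling at 10 kHz. The helper name `alias_frequency` is made up for this example. It tracks only a single tone's folded frequency; a real receiver also has to keep the two halves of the band from folding onto each other (for example with I/Q sampling or an offset IF), which this sketch ignores.

```python
def alias_frequency(tone_hz: int, sample_rate_hz: int) -> int:
    """Frequency at which a tone appears after sampling (folded into 0..fs/2)."""
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)


SAMPLE_RATE = 10_000  # samples per second, as in the post above

for tone in (10_700_500, 10_701_000, 10_702_000):
    print(tone, "Hz ->", alias_frequency(tone, SAMPLE_RATE), "Hz")
```

Every tone inside the 5 kHz-wide band comes out below 2.5 kHz, well within what a 10 kHz sample rate can represent, even though the carrier itself is at 10.7 MHz.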

You also have to have an appropriate reconstruction filter at the output of the system in order to correctly recover the signal - if you don't have the right reconstruction filter, you will NOT reconstruct the signal correctly.

You also have to take into account the effects of any signal modulation - take a 20 kHz sine wave, and burst it for 10 msec, and you widen the bandwidth of the signal by about 100 Hz (depending upon the exact shape of the burst - a perfect square burst will widen the signal as a sinc function and will, in effect, increase the bandwidth to infinity, which is why square bursts are generally Considered Harmful in communications work).

Also, you don't oversample a signal in time to account for "rounding errors" - you oversample in time because the frequency response of sampling a system in time introduces a sinc response in frequency - by moving the sampling rate up you reduce the impact of this response on the recovered signal's frequency response. You also greatly ease the requirements on the reconstruction filter - the filter can be wider (have fewer poles in the transfer function - thus fewer parts needed).

--
www.eFax.com are spammers

Re:ARRRG! He gets Nyquist WRONG! by Kiryat+Malachi · 2005-01-13 12:25 · Score: 2, Informative

And you've just described "beating". Imagine that, instead of that 10k sine at 20khz sampling, you have a 9.99kHz sine at 20k sampling. The point on the waveform that you're sampling is going to slowly change from cycle to cycle, and you're going to wind up with a 9.99kHz sine wave amplitude modulating - "beating" - at 0.01kHz.
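
A quick Python sketch of that beating effect, using the 9.99 kHz tone and 20 kHz sample rate from the post. The 5 ms reporting window is an arbitrary choice for the example.

```python
import math

SAMPLE_RATE = 20_000  # Hz
TONE = 9_990          # Hz, just under half the sample rate


def sample(n: int) -> float:
    """Value of the 9.99 kHz sine at sample index n."""
    return math.sin(2 * math.pi * TONE * n / SAMPLE_RATE)


# Peak magnitude of the raw samples over successive 5 ms windows.
WINDOW = 100
for start in range(0, 2000, WINDOW):
    peak = max(abs(sample(n)) for n in range(start, start + WINDOW))
    print(f"{start / SAMPLE_RATE * 1000:5.1f} ms  peak={peak:.3f}")
```

The sampled values work out to an alternating sign times sin(pi*n/1000), so the raw sample peaks swell and fade with a slow envelope that completes one cycle every 2000 samples (100 ms), which is the 0.01 kHz figure above. A proper reconstruction filter would still recover the steady sine; the beating is what you see in the raw samples.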

--

---
Mod me down, you fucking twits. Go ahead. I dare you.
(I read with sigs off.)

iPods don't play .ogg by me+at+werk · 2005-01-13 09:41 · Score: 2, Informative

From Apple - iPod - Technical Specifications:

Audio formats supported: AAC (16 to 320 Kbps), MP3 (32 to 320 Kbps), MP3 VBR, Audible, AIFF, Apple Lossless and WAV
Upgradable firmware enables support for future audio formats

The second bullet leaves the possibility open, but the page lists the iPod as not currently supporting it (meaning for iPod users now, popularity etc).

--
For context, click Parent.

"VBR" 320kbps by silverfuck · 2005-01-13 09:42 · Score: 2, Informative

I know that even large radio stations use 128Kbit sampling frequency.

Sampling frequency would typically be 44.1 kHz, bitrate would be 128 kbps. Also, FM radio quality (with good reception) compares to about 96 kbps well-encoded MP3, so there's not much point in them recording higher except for archival purposes.

I have switched from 128K to VBR 320K

You should be using LAME to encode, and LAME only goes up to 320kbps (blade for instance goes up to 384kbps, but is much lower quality), ergo you can only have 320kbps CBR, not VBR.

And to everybody else out there who complains about background noise, you should be extracting digitally from the CD!

flac doesn't seem to have come far enough yet for me (500+ albums is a lot of diskspace if it's around 300MB/album), but to my ears on my equipment (Klipsch £250 (pound sterling if that doesn't come out) speakers, cheapo SB Audigy2 soundcard), lame --preset standard (around 200kbps VBR) sounds damn near perceptual transparency.
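
For a feel of the disk-space trade-off, a back-of-the-envelope Python sketch. The 60-minute album length is a hypothetical average, not a figure from the post, and the helper name `size_mb` is made up.

```python
def size_mb(bitrate_kbps: float, minutes: float) -> float:
    """Approximate file size in megabytes (10^6 bytes) at a given average bitrate."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


ALBUM_MINUTES = 60  # hypothetical average album length

for label, kbps in [("128 kbps MP3", 128),
                    ("~200 kbps VBR MP3", 200),
                    ("uncompressed CD audio", 1411.2)]:
    print(label, round(size_mb(kbps, ALBUM_MINUTES), 1), "MB per album")

# The collection size mentioned above: 500 albums at roughly 300 MB each.
print(500 * 300 / 1000, "GB for 500 albums in FLAC")
```

Under those assumptions an hour at around 200 kbps comes to about 90 MB, against the roughly 300 MB per album quoted for FLAC.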

--
You know you've been IMing too long when you almost say 'lol' out loud to a non-geeky friend...
