FLAC Joins The Xiph Family
Ancipital writes "Xiph.org (of Ogg Vorbis fame) have today announced that the FLAC (Free Lossless Audio Codec) project has joined the Xiph rebel alliance. The full story and press release can be found at the Xiph site. (FLAC is nice, because it gives you pristine lossless audio at roughly 50% size reduction over uncompressed WAVs: you can store them on your hard drive/wherever and then transcode down to a lossy format when you need portability, yum!)"
How about ID3 tags, seekability, and built in md5 verification?
Zoid.com
So the point isn't that FLAC is new... the point is that FLAC is OSS, and has joined forces with an organization backing such efforts. The SHN codec is not OSS.
If the CD is lost or destroyed by scratches (many of mine already are), you still have the original recording, which you can compress with the lossy codec of the day for your daily use. Converting between lossy codecs is pointless, because each pass throws away more of the signal, but compressing from a lossless format to a lossy format is OK.
So, if Ogg Vorbis 2.0 is better than 1.0, you can make 2.0 files from your lossless compressed files.
Employee of Inrupt, Project Release Manager and Community Manager for Solid
> Is lossless really a good idea?
Yes, it is.
There are many musicians who want portability. Try encoding some WAV to mp3/ogg at home, decoding it in the studio, mixing it, encoding it again to mp3/ogg and going home to your home studio.
Then try that 20 times, and see what remains of the sound quality.
Sure, you can also carry WAV files if it matters that much to you, but 50% savings can be a lot.
Well, don't worry about that. We can get you back before you leave. (Dr. Who)
Somehow, with all of the repetition in music, there has GOT to be a way to do better than that.
The problem comes with the word "lossless".
Music does indeed have a *lot* of repetition, at a high level. If you look at an audio waveform, you can see very regular-looking patterns in the data, that change every now and then but can go on for thousands of samples with only slight variation. At a low level, however, music has a *huge* amount of noise (not noise as in clicks and artifacts, mind you, noise as in strongly leptokurtic Gaussian deviations from what the waveform "should" look like), and even extremely regular plosives just destroy any sort of adaptive prediction-based encoding. For reference, "huge" means on the order of 5 to 6 bits out of 16 (even local nonlinear methods give an RMS error of at best 40ish, but getting that low means storing a lot of parameters of the prediction model, RBF centers and weights as an example).
If you want an extremely high level of compression that you can *almost* call lossless, use FLAC (or Shorten, or Monkey's, or whatever) *after* running your sound through a trajectory-based nonlinear noise reduction filter. You'll see the compression go from 50% to 25% or better (for reference, "archive quality" VBR OGG only gets down to 20-25%). But, you can't *truly* call that lossless anymore, because even though you might not consider the "noise" as part of the music, people *can* tell the difference and usually prefer the version with noise (and, as I mentioned, such a filter blunts plosives, which *should* stay in the music, so you'd need to detect those and add them back in to avoid a noticeable degradation of quality).
Trust me, lossless audio compression does *not* count as a "toy" problem, nor one that people have already "solved" optimally (for example, just about every well-understood time series prediction/analysis technique out there depends on a property called "stationarity", which music very strongly lacks... You can still use such methods, but they give suboptimal results in the best case, and exhibit serious instability in the worst cases). For another problem, *almost all* research on time series analysis has focused on out-of-series error and stability. This lets you do things like predict stock values and the weather. It doesn't, however, necessarily give the best *in-series* error, which matters in an application like audio compression, since you already know the entire extent of the data you need to predict (postdict?). In AI, this has a close analogy to the idea of "overfitting" a neural net - if you train a neural net too long, it learns too many subtleties of the training data and loses its generalization power. Except, in audio compression, you don't *care* about the generalization power, you care about it learning as much about the training data as possible.
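To make the in-series point concrete, here is a minimal sketch, not anything taken from a real codec. It fits a one-coefficient predictor to each block by least squares on that block alone and compares the residual energy against reusing a coefficient fitted elsewhere. The function names and the two sine-wave test blocks are made up for illustration, and a real encoder would also have to spend bits storing each block's coefficient.

```python
import math

def fit_first_order(block):
    """Least-squares coefficient a minimising sum((x[n] - a*x[n-1])**2) over this block."""
    num = sum(prev * cur for prev, cur in zip(block, block[1:]))
    den = sum(prev * prev for prev in block[:-1])
    return num / den if den else 0.0

def residual_energy(block, coeff):
    """Sum of squared prediction errors when predicting x[n] as coeff * x[n-1]."""
    return sum((cur - coeff * prev) ** 2 for prev, cur in zip(block, block[1:]))

# Two hypothetical blocks with very different behaviour (a crude stand-in for non-stationarity).
slow_block = [round(1000 * math.sin(2 * math.pi * n / 64)) for n in range(64)]
fast_block = [round(1000 * math.sin(2 * math.pi * n / 5)) for n in range(64)]

slow_coeff = fit_first_order(slow_block)
fast_coeff = fit_first_order(fast_block)

# Each block is predicted best by the coefficient fitted on that same block.
print(residual_energy(fast_block, fast_coeff), residual_energy(fast_block, slow_coeff))
```

The coefficient that suits the slow block does noticeably worse on the fast one, which is why a predictor tuned to the data actually being stored matters more here than one that generalises.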
We should see FLAC streaming support in Icecast soon, at least I hope so.
:-)
I'm not sure your ISP hopes so, though.
Beware: In C++, your friends can see your privates!
Except that they go even further than your naive scheme, and use a predictor to get even smaller deltas than your scheme (e.g., assume waveform is locally quadratic/cubic/quartic then extrapolate the next sample). A signal can be varying rapidly and yet still be highly predictable. Your simplistic scheme wouldn't handle it.
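A rough sketch of the idea, assuming the usual fixed polynomial predictors of order 0 to 3 (the coefficient table and function names are mine, not copied from the FLAC source). The test signal is a parabola, which changes quickly from sample to sample yet is perfectly predictable:

```python
# Coefficients applied to x[n-1], x[n-2], ... for each predictor order.
COEFFS = {
    0: [],           # predict 0: the residual is the sample itself
    1: [1],          # predict x[n-1]: a plain delta
    2: [2, -1],      # extrapolate a straight line through the last two samples
    3: [3, -3, 1],   # extrapolate a parabola through the last three samples
}

def residuals(samples, order):
    """Prediction errors for samples[order:]; the first `order` samples are stored verbatim."""
    coeffs = COEFFS[order]
    out = []
    for n in range(order, len(samples)):
        pred = sum(c * samples[n - 1 - k] for k, c in enumerate(coeffs))
        out.append(samples[n] - pred)
    return out

def reconstruct(warmup, res, order):
    """Invert residuals(): rebuild the samples from the warm-up samples and the errors."""
    coeffs = COEFFS[order]
    samples = list(warmup)
    for e in res:
        n = len(samples)
        pred = sum(c * samples[n - 1 - k] for k, c in enumerate(coeffs))
        samples.append(pred + e)
    return samples

signal = [n * n for n in range(10)]   # 0, 1, 4, 9, ... grows fast but is exactly quadratic

print(residuals(signal, 1))  # plain deltas keep growing
print(residuals(signal, 3))  # the order-3 predictor leaves nothing to encode
```

Since everything stays in integers, the decoder can run the same predictor and add the residuals back to recover the samples bit for bit.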
Then they use Rice-Golomb coding to encode the deltas. This does FAR better than gzip ever could, because it is designed SPECIFICALLY to handle the geometric distribution of the deltas, whereas gzip is a generic dictionary algorithm.
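For the curious, a toy Rice coder looks roughly like this. It is only a sketch: bits are kept as a text string for readability, the signed-to-unsigned folding and the "ones then a zero" unary layout are choices made for this illustration, and none of it is meant to match FLAC's actual bitstream.

```python
def fold(e):
    """Map signed residuals to non-negative ints: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * e if e >= 0 else -2 * e - 1

def unfold(u):
    """Inverse of fold()."""
    return u // 2 if u % 2 == 0 else -((u + 1) // 2)

def rice_encode(values, k):
    """Each value becomes a unary quotient (u >> k) followed by the k low bits in binary."""
    bits = []
    for e in values:
        u = fold(e)
        q, r = u >> k, u & ((1 << k) - 1)
        bits.append("1" * q + "0")                 # unary part, terminated by a 0
        if k:
            bits.append(format(r, "0{}b".format(k)))  # fixed-width remainder
    return "".join(bits)

def rice_decode(bits, k, count):
    """Read back `count` values written by rice_encode() with the same k."""
    out, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1                                   # skip the terminating 0
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append(unfold((q << k) | r))
    return out

deltas = [0, -1, 1, -2, 3, 0, -4, 2]               # small residuals clustered around zero
encoded = rice_encode(deltas, k=1)
print(len(encoded), "bits instead of", 16 * len(deltas))
```

Small values cost only a couple of bits each and there is no dictionary to build, which is why the scheme suits residuals that pile up around zero. Picking k well for each block is the encoder's job.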
I really doubt you've even tried what you are suggesting. You're on the right track, but the FLAC team beat you to the punch. Sorry.