Domain: xiph.org
Stories and comments across the archive that link to xiph.org.
Comments · 962
-
Re:Lossless is impossible with digital
Actually digital or "numeric" is exactly what you call "virtually infinite", watch it in action using analog gear here.
-
24/192 Music Downloads and why they make no sense
-
Re:why vinyl might sound better in practice
For the high frequency, the digital circuits could produce prominent interference because the wave shape of a digital signal has a lot of harmonics (square wave as opposed to sinusoidal
You don't know what you're talking about. Digital waveforms are not square waves or "jagged" stairsteps. Watch this video if you want a good demonstration of how digital samples reconstruct sinusoidal analog waveforms:
http://www.xiph.org/video/vid2.shtml (relevant part starts around 17:18)
-
Re:Gullible Moron!
I can't help but to think how digital sampling is the exact opposite problem. Taking a good quality analog source with smooth curves and then making it digital with a jagged curve.
Digital waveforms are not "jagged" - this is a common misunderstanding of how sampling works. Usually the waveform is drawn with either a horizontal line between samples ("staircase") or a straight line from sample value to sample value ("zigzag"). Neither is representative of the source waveform; there is actually nothing "between samples." Samples are discrete.
Here are a couple links to explain why the waveform you get from a DAC is smooth:
https://wiki.xiph.org/Videos/Digital_Show_and_Tell
http://www.hydrogenaudio.org/forums/index.php?showtopic=93496
-
Re:Depends on the source
Well, this guy who came up with Ogg Vorbis seems to disagree – sampling rates that high are a liability and introduce all kinds of unwanted side effects both at DAC and playback level. Taking that into account, a losslessly compressed 16/44.1 track makes way more sense than the other one.
-
Re:Depends on the source
This article is a pretty good explanation of why 16/44.1 is as good as anyone needs for playback.
kinda like 640K?
I gave some advice to a glove factory once. I told them they should start making all their gloves with 6 fingers on them - just to be sure. They told me that most people had just 5 fingers, and that five fingers should be enough for anyone. I rudely quipped "yeah, you sound exactly like Bill Gates about that 640kB of RAM!"
They called the police on me, and I was trespassed from the building.
-
Re:Depends on the source
Your preference for 24/96 audio as a listener is entirely due to the placebo effect. There are good reasons to master audio in high res, but for listening, 16 bit 44.1kHz audio is as good as anything.
There's a pretty good explanation of this (and other factors) on xiph.org: "24/192 Music Downloads are Very Silly Indeed"
-
Re:Any good studies?
Anyone know of any good double-blind studies comparing people's ability to tell FLAC from 320kbps MP3? Googling just turns up people debating in forums whether you would be able to tell the difference rather than any serious academic research.
This may not be exactly what you're looking for: Xiph on 24/192 music, but it likely has links to something that will answer your question.
And it's extremely interesting in its own right.
-
Re:Depends on the source
Everybody needs to read the following very carefully:
http://people.xiph.org/~xiphmont/demo/neil-young.html
Some of it agrees with you, some of it doesn't. Let me summarize some key highlights.
16 bits encodes more than the entire range of the human ear. In practice, you can get 120 dB of range, "greater than the difference between a mosquito somewhere in the same room and a jackhammer a foot away.... or the difference between a deserted 'soundproof' room and a sound loud enough to cause hearing damage in seconds." A 24-bit (or even 32-bit float, for that matter) signal path isn't strictly necessary, but reduces mistakes in recording and processing: "The primary reason to use 24 bits when recording is to prevent mistakes; rather than being careful to center 16 bit recording-- risking clipping if you guess too high and adding noise if you guess too low-- 24 bits allows an operator to set an approximate level and not worry too much about it."
So, a perfect 16-bit signal path will technically be just fine, but 24 works better in practice. For final playback, however, using more than 16 bits is generally a waste.
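For anyone wondering where the numbers come from, here is a rough sketch of the arithmetic. The textbook figure for N bits is 20*log10(2^N), which is about 96 dB for 16 bits. The second function is only an illustration of why a narrow band sees a lower floor than the broadband figure: it assumes flat (white) noise spread over 22050 Hz, and the 100 Hz band is a hypothetical choice, not a model of the ear.

```python
import math

def quantization_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, in dB: 20*log10(2**bits)."""
    return 20 * math.log10(2 ** bits)

def narrowband_floor_db(broadband_floor_db: float,
                        total_bw_hz: float,
                        band_bw_hz: float) -> float:
    """Noise floor inside one band, assuming flat (white) noise over total_bw_hz."""
    return broadband_floor_db - 10 * math.log10(total_bw_hz / band_bw_hz)

broadband = -quantization_range_db(16)                  # about -96.3 dB
band = narrowband_floor_db(broadband, 22050.0, 100.0)   # hypothetical 100 Hz band
print(f"{broadband:.1f} dB broadband, {band:.1f} dB in a 100 Hz band")
```

Shaped dither, which the article credits for the rest of the 120 dB figure, is not modelled here.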
Now, on to 44.1 vs. 96 kHz sample rate. Oversampling at 96 kHz used to have a purpose back in the days of analog -- more wiggle room for filters and whatnot. But now, digital processing has all but eliminated the benefits of a 96 kHz sample rate.
And then you get into the drawbacks of oversampling. If there's any weak link in the path from recording to headphones, recording ultrasonic frequencies (the ones over 20 kHz, basically) will only produce distortions. So, if you have a 96 kHz recording, you'd better have "headphones that can accurately reproduce frequencies above 20 kHz" and whatnot, otherwise you're going to get distortion.
But, here's the kicker: if the original recording were done at 48 kHz instead, the exact same sound quality can be produced at the back end, without requiring special playback hardware to work around the distortion produced by unnecessarily recording sounds that can't be heard anyway. With modern digital processing, that is.
Here's the lesson: record at 48 kHz, 24 bit, encode for playback at 48 kHz, 16 bit. Higher sampling rates will only produce distortions unless everybody has great hardware (unlikely), and they don't improve the sound anyway. The extra bits per sample are unnecessary for playback.
-
Re:Depends on the source
"However, dither doesn't change the fact that once a signal sinks below the noise floor, it should effectively disappear. How is the -105dB tone still clearly audible above a -96dB noise floor?
"The answer: Our -96dB noise floor figure is effectively wrong; we're using an inappropriate definition of dynamic range. (6*bits)dB gives us the RMS noise of the entire broadband signal, but each hair cell in the ear is sensitive to only a narrow fraction of the total bandwidth. As each hair cell hears only a fraction of the total noise floor energy, the noise floor at that hair cell will be much lower than the broadband figure of -96dB.
"Thus, 16 bit audio can go considerably deeper than 96dB. With use of shaped dither, which moves quantization noise energy into frequencies where it's harder to hear, the effective dynamic range of 16 bit audio reaches 120dB in practice [13], more than fifteen times deeper than the 96dB claim.
"120dB is greater than the difference between a mosquito somewhere in the same room and a jackhammer a foot away.... or the difference between a deserted 'soundproof' room and a sound loud enough to cause hearing damage in seconds.
"16 bits is enough to store all we can hear, and will be enough forever."
-
Re:Depends on the source
Your preference for 24/96 audio as a listener is entirely due to the placebo effect.
Well, in all fairness, listeners may actually hear perceptible differences between 24/96 and 16/44.1 audio sources due to different mastering, but of course that says nothing about whether they can actually tell the difference between the two bitrates when everything else is equal.
This article is a pretty good explanation of why 16/44.1 is as good as anyone needs for playback.
There's plenty of articles from "experts" on why 16/44.1 is all you need, however these kinds of opinions risk being wrong: What about actual data? I see little solid data where the hypothesis has been put to the test.
I think the point is, some say 24/96 survives lossy compression better; it also produces fewer artifacts, as the higher frequencies have fewer data points to describe their waveform. Perhaps some won't hear the difference in uncompressed audio, but I bet some can hear the difference in compressed.
-
Re:Depends on the source
This article is a pretty good explanation of why 16/44.1 is as good as anyone needs for playback.
kinda like 640K?
Not even a little.
-
Re:Debunked
Read the xiph.org article. Yes. Debunked.
-
Re:Better question
I think the real point is that there are known limits to human hearing and many audiophiles fantasize about their hearing being superhuman. It just ain't so. Dynamic range compression is one thing, but perceptual compression, sample rate, and bit depth are a different matter. No audiophile has ever heard the difference between FLAC and 320Kbps mp3 audio in an ABX test at a statistical rate that is better than guessing.
Any time this argument starts, I refer people to this well written article that lays out the limits of human hearing compared to the specifications of recording formats...
-
I knew this article was gonna be BS
"By Young's estimation, CDs can only offer about 15% of the data that was in a master sound track"
And nothing of value was lost in the remaining 85% of the *data* that is inaudible to the human ear.
"Young, in fact, created his own digital-to-analog conversion (DAC) service called Pono. Young has tweeted that the Pono cloud-based music service, along with Pono portable digital-to-analog players, will be available by summer."
There's your cash-in scheme lurking behind all the BS.
"Young's service would increase the quality, or sampling rate, of the music from 44,100 times per second in a CD (44.1KHz) to 192,000 times per second (192KHz), and will boost the bit depth from 16-bit to 24-bit."
I would like to repeatedly hit you over the head with http://people.xiph.org/~xiphmont/demo/neil-young.html
"The sample rate of a digital file refers to the number of "snapshots" of audio that are offered up every second. Think of it like a high-definition movie, where the more frames per second you have, the higher the quality."
NO, do not think of it like that unless you're a charlatan. Refer to rebuttal on xiph.org.
"Millions of people in the world are audiophiles."
No doubt, Millions of people in the world are fools and they have money that could be yours.
"It's just common sense that the higher the resolution -- the more data that's in an audio file -- the better the sound quality, Chesky said."
Too bad this thing called SCIENCE has been trumping "common sense" for millennia now.
"The site also recommends high-resolution player software such as JRiver, Pure Music, or Decibel Audio Player. The software, which basically turns your desktop or laptop into a music server or a digital-to-analog converter,"
HILARIOUS. I won't even begin to..
"The most popular music server among audiophiles, according to Bliss, is an Apple Mac Mini."
This is beautiful. I am not surprised in the least to see this audiophile-appleophile overlap.
-
Re:No
with a high sample rate you will hear more detail in the high frequencies
With a sampling rate of 44KHz (CD quality), you can encode ALL frequencies below 22KHz -- the fidelity is only limited by the bit depth, and 16 bits is WAY beyond human perception.
Increasing the sampling rate beyond 44KHz will get you more detail only for frequencies beyond 22KHz, which no human can hear. There's a lot of misconception about this because people see images like the ones in this page and don't understand them completely. The truth of these images is this: it doesn't matter how coarse the quantization looks -- if the original signal doesn't have frequencies higher than half of your sampling rate, then you can EXACTLY reconstruct the original signal (as long as each sample has enough precision, which is about bit depth and not sampling rate).
If you still don't believe me, watch these videos to get better explanations.
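If you'd rather check it numerically, here is a minimal sketch of the reconstruction the sampling theorem describes (Whittaker-Shannon, i.e. sinc interpolation). The 15 kHz test tone, the evaluation point, and the 4000-sample window on each side are arbitrary choices for the demo. A real DAC uses a finite filter rather than this brute-force sum, and the truncated window leaves a small residual error.

```python
import math

def sinc(x: float) -> float:
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, first_index, t):
    """Whittaker-Shannon interpolation at time t, measured in sample periods."""
    return sum(s * sinc(t - (first_index + k)) for k, s in enumerate(samples))

FS = 44100.0     # CD sample rate
FREQ = 15000.0   # test tone below FS/2 (arbitrary choice)
HALF = 4000      # samples kept on each side of t = 0

samples = [math.sin(2 * math.pi * FREQ * n / FS) for n in range(-HALF, HALF + 1)]

# Evaluate halfway between two samples, where a "staircase" picture would be most wrong
t = 0.5
exact = math.sin(2 * math.pi * FREQ * t / FS)
approx = reconstruct(samples, -HALF, t)
print(f"exact={exact:.5f} reconstructed={approx:.5f}")
```

The two values agree to within the truncation error of the finite window, even though the tone has fewer than three samples per cycle.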
-
Re:Depends on the source
You don't have to do a personal ABX test when there are many others who have done them and confirmed his statement. In fact, it's a much more powerful statement citing many others than just yourself. One is a statistic and the other is an anecdote.
And for a MUCH more exhaustive and scientific discussion than any post on this article will ever make (another post in this thread already linked it, but you must have missed it, and it's a great article): http://people.xiph.org/~xiphmont/demo/neil-young.html
-
Re:Depends on the source
This article is a pretty good explanation of why 16/44.1 is as good as anyone needs for playback.
kinda like 640K?
-
Some scientific mumbo-jumbo
This might shine a lot of light into the topic: http://people.xiph.org/~xiphmont/demo/neil-young.html
-
Re:Depends on the source
Your preference for 24/96 audio as a listener is entirely due to the placebo effect.
Well, in all fairness, listeners may actually hear perceptible differences between 24/96 and 16/44.1 audio sources due to different mastering, but of course that says nothing about whether they can actually tell the difference between the two bitrates when everything else is equal.
This article is a pretty good explanation of why 16/44.1 is as good as anyone needs for playback.
-
Re:Using real world audio waveforms?
He says several times that he used sine waves as a simple illustration. Then he switches to square waves. You apparently don't understand sampling theory well enough to understand why your second sentence, in the context of PCM audio, is incorrect. Perhaps reading this will help: http://people.xiph.org/~xiphmont/demo/neil-young.html.
On the other hand, you might just be an AC troll, an "audiophile" or an old enthusiast or sound engineer who might have been an excellent technician but never developed a proper understanding of signals. In any case, anybody tempted to agree with your post should read the article at that link.
-
This is good
This guy knows what he's talking about, and communicates it well. Amateur audiophiles should especially read his article here: http://people.xiph.org/~xiphmont/demo/neil-young.html.
-
Re:The bigger WebRTC news
Yes, this is really cool. Opus really is the best of all and it's royalty free and has an open source implementation. Yes, it was partly done by Skype and the people from http://xiph.org/ thus it is BSD-licensed.
What more could you want ?
Comparisons:
http://opus-codec.org/comparison/
Demo:
-
Re:Major overhaul is useless: still no FLAC suppor
Is installing the Ogg plugin for QuickTime "dual stacking"? And if so, what's so wrong with dual stacking?
-
Re:FLAC
Nobody who understands digital audio and oversampling converters would ever record at 192kHz. No double blind test ever confirmed the claims of the golden ear brigade. Anyone recording at 192kHz is a "sound engineer" in the pejorative because they clearly do not know what they're doing.
48kHz sample rate and 24bit resolution already exceeds the capabilities of human hearing. Any difference people think they can hear (nobody can reliably identify higher sample rates in a double blind test) is undoubtedly due to poorly designed converters.
-
Not a step up.
No, it's not a step up. No-one has ever been able to reliably distinguish a 24/96 recording from its downgraded 16/48 version in a properly conducted double-blind test.
It is absolutely necessary to oversample when acquiring data (since all analog filters have some roll-off), and it is good to use higher dynamic range when mixing to keep the repeated rounding errors below the noise floor. But once the final recording is mastered, there is no benefit to distributing or listening to the result at higher than 16/48.
-
Re:Obligatory
aacplus was just the early CT-proprietary version of HE-AAC. They did test against the two best publicly available HE-AAC encoders, which have improved quite a bit since the aacplus days. Opus was better, by a statistically significant margin.
Opus has band folding, which is in some ways similar to SBR but considerably superior. Halfway down Monty's two-year-old CELT demo page there's some explanation and a visual of what this looks like on a spectrogram in low-bitrate situations. (Opus used technology from CELT but is considerably improved.)
If you really think HE-AAC type codecs sound like CD at 32kbps and so forth you are extremely insensitive to coding artifacts. Unless you meant mono for all of those.
-
Re:Getting it right the second time!
-
Re:The summary missed the real headline feature!
The Opus site links to this great writeup explaining why 16bit/48kHz audio is all we'll ever need for consumer audio distribution: http://people.xiph.org/~xiphmont/demo/neil-young.html
-
Re:What is a CD?
Sure! Just pad each sample with an extra byte of zeros, upsample, and encode! For instance, sox in.wav -b 24 out.flac rate 96k does the trick!
That's what many distributors of 24/96 music do anyways, and even if they don't, you wouldn't be able to hear any improvement from 24/96 anyway. 24/96 is great for use in the editing and mastering process, but for listening purposes, both 24-bit and high sampling rates are a waste of space, and though 24-bit is innocuous quality-wise, playback of high-sampling-rate audio often introduces audible distortions.
A developer of a popular audio device firmware told me that they downsample any high-sample-rate files to 44.1k using linear interpolation (fast but bad resampling) and nevertheless lots of people rave about the quality of playing back their 24/96 FLACs using these devices.
-
Re:huh
Lossless 16-bit digital formats give you a dynamic range of 96dB
Not quite right.
http://people.xiph.org/~xiphmont/demo/neil-young.html
See "The dynamic range of 16 bits"
-
Re:huh
112dB? Ha! Hilarious.
You're lucky to get 70dB out of audiophile grade vinyl. See http://wiki.hydrogenaudio.org/index.php?title=Vinyl_Myths
There's also an interesting discussion of the dynamic range of both vinyl and 16-bit/44kHz digital audio here:
http://www.hydrogenaudio.org/forums/index.php?showtopic=47827&st=0&p=425794&#entry425794
The dynamic range of vinyl does vary by frequency. For example, in that thread a poster notes he measured 84dB at 300Hz for vinyl. A 300Hz tone recorded to a 16-bit wave file with noise shaped dither exhibited a dynamic range of 151dB!
Vinyl has extremely limited dynamic range in the bass - something like 30dB at 20Hz. The needle would pop out of the groove if you tried to record more than that. Vinyl also suffers from constant negative signal to noise ratio incidents, when impulse noise (clicks and pops from scratches, dust and defects in the groove, static discharge) completely drowns out the signal. Unacceptable, in any format.
See also this recent article, which, while skewering the distribution of 24-bit/192kHz audio, notes that 16-bit digital audio has an overall dynamic range of 120dB with dither:
http://people.xiph.org/~xiphmont/demo/neil-young.html
Vinyl's a shitty format for reasons apart from its inferior dynamic range, but that's not terribly surprising since it's like 100 years old, mechanical, and prone to a plethora of issues - rumble, wow and flutter, phase issues caused by the RIAA equalization / de-equalization process, scads of unwanted harmonics and harmonic distortion, ultrasonic noise, preamp hum, static clicks, etc., etc., etc.
Probably should have been replaced by some other analog disc-based format by the early '70s - maybe something based on RCA's capacitance discs, which wound up being used for video, and had scads of bandwidth - more than enough for near-flawless reproduction of the original studio master tapes. But at the time most industry attention was focused on the emerging lo-fi but convenient tape formats, first 8-track then cassette, as well as the failed competing quad systems. And then by the middle of the decade everybody knew a digital format was coming, with Sony and Philips working first separately, and then by '79 or so together on what would become the Compact Disc.
-
Re:Nyquist
There is only _ONE_ shape of waveform (a sine wave) at the Nyquist. Any other shape would contain higher frequencies.
(There is a nice example of this on Hacker News: a 1kHz-sample-rate WAV file with samples alternating between -.25 and .25 was resampled to 48kHz, and you see a nice perfect 500Hz sine wave.)
A lot of people seem to have the wrong mental model of sampling. It's not some stair step that becomes a finer and finer approximation but is never perfect. (The quantization part is, but we can give that orders of magnitude more approximation than is required. The sampling part, however, is not an approximation).
This is like that rule you learned in grad school mathematics: If I have a curve formed from an N degree polynomial and tell you _any_ N+1 unique points on it you can perfectly recover all the terms of the polynomial exactly (though, some values make it easier than others). Likewise the sampling theorem tells you that if you have a signal which is band limited to contain energy only under some frequency N, then with 2*N equally spaced samples per second you can recover the original signal perfectly (and it tells you how, and it's fortunately easier than recovering polynomials from random points).
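The polynomial half of that analogy is easy to try. Below is a small sketch using Lagrange interpolation with exact fractions. The cubic and the four sample points are made up for the demo. A degree-3 polynomial is pinned down by any 4 distinct points, and they do not need to be evenly spaced.

```python
from fractions import Fraction

def lagrange_eval(points, x):
    """Evaluate the unique polynomial of degree < len(points) through `points` at x."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

def poly(x):
    # Hypothetical degree-3 polynomial, used only for the demo
    return 2 * x**3 - 5 * x**2 + 3 * x + 7

# Four distinct, unevenly spaced points on the curve
known = [(x, poly(x)) for x in (-2, 0, 1, 5)]

# The interpolant matches the original everywhere, not just at the known points
print(all(lagrange_eval(known, x) == poly(x) for x in range(-10, 11)))
```

Sampling works the same way in spirit: the band limit plays the role of the degree, and it is what makes the in-between values fully determined.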
-
Re:can you hear me now?
The only real comparison we've made with Vorbis and AAC was a 64 kb/s test comparing Opus to Vorbis and HE-AAC (v1). See the results and the analysis. At higher rates, we definitely reach a point where Opus is transparent for everything, but the exact rate depends on the content and the listener.
As for Microsoft, they've actually updated their covenant to something which is nicer than what Skype originally had and (IMO but IANAL) totally acceptable.
-
Re:can you hear me now?
We actually wanted to compare Opus and AAC-ELD, but there was just no way to actually get an AAC-ELD implementation. The best we were able to do is to get an AAC-LD implementation from Apple. See this demo page (scroll down) for the comparison we did between AAC-LD and CELT (which is now part of Opus). In the very few modes we had access to, CELT (Opus) was clearly superior to AAC-LD. I've no idea how much better AAC-ELD is.
-
Three ways to seek
In a constant bitrate stream, you can just multiply the chosen time by the bitrate, seek once to that point in the file, and start playing. In a variable bitrate stream, you can't. So you have to either A. read the whole file and construct an index of where to seek for each second, B. seek somewhere near where the user clicked, or C. seek near where the user clicked and then retry up to four times ("interpolated bisection" assuming piecewise constant bitrate) to find the exact second. The best option ends up differing for each container. In AVI, option A is best because the vast majority of files have an "index" at the end mapping keyframe times to byte offsets. VirtualDub uses option A, which is fast for AVI but slow for MPEG. Based on your description, VLC appears to use B. The Ogg project tends to use C, but Monty eventually realized that that's too slow over an Internet connection with a wireless last mile, so he relented and put an index into Ogg Skeleton (source).
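To make the three options concrete, here is a toy sketch. It does not reproduce any real player's or container's code. The "file" is a hypothetical list of one-second packets, `probe` stands in for a seek plus a header read, and all function names are invented for the illustration. The target second is assumed to lie inside the file.

```python
import bisect

def make_index(sizes):
    """Option A: byte offset where each one-second packet starts (last entry = file size)."""
    offsets = [0]
    for size in sizes:
        offsets.append(offsets[-1] + size)
    return offsets

def probe(offsets, byte_pos):
    """Simulate one seek plus a header read: which packet contains byte_pos?

    Returns (second, start_byte, end_byte). A real demuxer would get these
    from the container's page or packet headers, not from a ready-made index.
    """
    i = bisect.bisect_right(offsets, byte_pos) - 1
    i = min(max(i, 0), len(offsets) - 2)
    return i, offsets[i], offsets[i + 1]

def single_guess(offsets, target_s):
    """Option B: one seek, pretending the bitrate is constant."""
    duration_s = len(offsets) - 1
    second, start, _ = probe(offsets, offsets[-1] * target_s // duration_s)
    return second, start

def interpolated_bisection(offsets, target_s, max_seeks=4):
    """Option C: keep a [lo, hi) window and guess inside it by linear interpolation."""
    lo_byte, lo_s = 0, 0
    hi_byte, hi_s = offsets[-1], len(offsets) - 1
    for seeks in range(1, max_seeks + 1):
        guess = lo_byte + (hi_byte - lo_byte) * (target_s - lo_s) // (hi_s - lo_s)
        second, start, end = probe(offsets, guess)
        if second == target_s:
            return start, seeks
        if second < target_s:
            lo_byte, lo_s = end, second + 1   # target is after this packet
        else:
            hi_byte, hi_s = start, second     # target is before this packet
    return lo_byte, max_seeks  # out of retries: settle for a packet at or before the target

# Hypothetical VBR file: 300 quiet seconds at 2000 bytes/s, then 300 busy ones at 8000 bytes/s
offsets = make_index([2000] * 300 + [8000] * 300)
print(single_guess(offsets, 150))            # (second actually reached, byte offset)
print(interpolated_bisection(offsets, 150))  # (byte offset, seeks used)
```

With the index from option A the answer is just `offsets[150]`, but building it means reading the whole file first. Every extra `probe` in option C is a round trip, which is the cost that hurts on a slow link.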
-
Re:IE on PCs also supports WebM
Not that I'm aware of. But it is available for Windows Mobile 5.0 through 6.5. http://xiph.org/dshow/
-
Re:Why not...
Yes, just like all the developers/studios that produce these games: http://wiki.xiph.org/Games_that_use_Vorbis.
-
Re:Good!
> Sitting in on the development of a standard and then
> patenting those components is dirty pool.
You may be interested in reading http://lists.xiph.org/pipermail/theora/2010-April/003769.html in this context...
But in general, what Apple is presumably doing here is making use of http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-exclusion-with
-
Re:Don't get overexited
None of this solves the software patents problem in the USA. The software patents problem isn't caused by some bad apple applications slipping through the procedures. The problem is that software has to conform to standards (interfaces and data formats), and these are being covered by thickets of patents.
I think everyone concerned about this sad state of affairs should read Xiph's comments to the FTC Patent Standards Workshop. Their submission focuses on how software patents affect Standards Setting Organizations.
-
Re:WebM is too "geeky"; too "open/free"
What's the penetration of this open and free format out in the music player industry? Zero. Another example: Theora. Penetration? Zero.
Normally I don't respond to AC's, but you're a fucking liar. My phone, portable music player, laptop, my wife's phone, my wife's laptop, and my wife's music player all play Vorbis and FLAC. The only ones that don't play Theora are the music players (which don't have color screens). Music players that support Vorbis aren't that hard to find.
as much facts as the reality that current WebM encoders do a worse job in terms of video quality than x264 does for H.264. End-users' experience doesn't matter, I take it.
Differences in quality between WebM and H.264 are negligible, at best. Most people won't notice or care. But how about that "end-user experience" of paying a royalty fee every time you want to encode, decode, or distribute a video? Or not being able to play H.264 videos out of the box on Linux because the members of MPEG-LA can't compete any other way? Doesn't sound very fucking user-friendly to me.
WebM is open and free in every sense of the word; the submarine patent issues apply equally to H.264; and hardware support for WebM is coming along rapidly. Face it, WebM is the future, and that's a good thing.
-
Re:WebM is too "geeky"; too "open/free"
Yeah, wow, just look at this extensive list of players that support Ogg Vorbis. Surely it's the up and coming dominant audio format.
-
Why users care...
...is at the top of the first Opus/CELT demo page:
http://people.xiph.org/~xiphmont/demo/celt/demo.html
The low latency makes more interactive applications possible. By way of illustration, the total algorithmic delay of an Opus or CELT stream is approximately equivalent to the time it takes sound to travel from you to someone standing five feet away.
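As a back-of-the-envelope check on the five-feet comparison, here is the acoustic travel time. The speed of sound used (343 m/s, dry air at roughly 20 °C) is an assumption, and the codec's own delay is not computed here.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed: dry air at about 20 °C
FEET_TO_M = 0.3048

def acoustic_delay_ms(distance_ft: float) -> float:
    """Time for sound to cover distance_ft, in milliseconds."""
    return distance_ft * FEET_TO_M / SPEED_OF_SOUND_M_S * 1000.0

print(f"{acoustic_delay_ms(5):.1f} ms")
```

That works out to a few milliseconds, which is the scale the demo page is talking about.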
-
Re:Compatibility
Why not just post the link to the list of portable Vorbis players? Also, the list of not-so-portable players wouldn't hurt either.
-
Re:Not in theory
moving to higher sample rates and bit depths allows easier filtering of nyquist noise. the highest audible frequencies are damn close to the nyquist limit of a 44100hz sample rate (22050hz), making it difficult to filter the aliasing without losing high frequency information. higher frequencies also benefit from additional samples.
-
Google is at least trying
At this point, WebM is a closed codec because there are not enough specs and no standard for which someone can create a compatible codec of their own.
WebM is Matroska, Vorbis, and VP8. Matroska and Vorbis are already well documented, and Google is at least trying with VP8, having submitted a draft RFC to IETF.
-
Re:Ogg Theora has no technical merit over H.264
H.264 is miles ahead of Theora, *especially* at low bit rates
I agree with this statement.
The original page, comparing Theora to YouTube videos, was not trying to trick people into thinking Theora is better than H.264. A guy at Google famously claimed: "If [youtube] were to switch to theora and maintain even a semblance of the current youtube quality it would take up most available bandwidth across the Internet." The Theora guy compared the quality available per bit with Theora, with the actual quality of actual videos from YouTube; and he concluded that Theora was able to meet or beat the quality of the example videos he pulled from YouTube. At the bottom, in the conclusions section, he noted that YouTube uses H.263, and just a subset at that, for many videos, making it easier for Theora to match YouTube; and even for H.264, YouTube isn't making full use of the standard. He speculated that perhaps YouTube was making a tradeoff, allowing the files to be bigger to make them easier to seek within or some such. Here's the link again; go read what he actually wrote.
I'm sure there are idiotic, deranged Theora fanboys out there who claimed it is better in all ways than H.264, but I am not one. If you make effective use of H.264 you get the highest quality per bit possible with current video technology, full stop. The only advantage of Theora is that it is patent-free and BSD-licensed. That is a very large advantage for some purposes.
But now we have VP8, which is much better than Theora, while still freely available for use. It's not as good as H.264, but it seems to be better than everything else, and good enough for practical use. It will have its place.
H.264 isn't going away. But short of a successful patent challenge, WebM isn't going away either. Its advantages in freedom will make it the top choice for many purposes, even though it can't match the ultimate quality per bit of H.264.
steveha