State of Sound Development On Linux Not So Sorry After All
An anonymous reader writes "There have been past claims by Adobe and others that development on Linux is a jungle, particularly with regards to audio. However today, the author of the popular 'The Sorry State of Sound in Linux' has posted a follow up showing Adobe's claims to be FUD, as well as being a good update on where OSS and ALSA are holding today, and why PulseAudio isn't a good idea."
I don't have any problems getting the modules to load, it's the quality of the output that's lacking for me (NForce4 chipset). Popping (DC bias) as you slide the volume fader up and down, as well as throughout playback is unbearable. Not to mention the state of media players on Linux...
I can't speak for Logic Pro, but there is a (VERY good) alternative to Pro Tools; Ardour. The ubuntustudio packages had everything I needed to jump right into a professional DAW. I've been using it for high-quality recording/editing for almost a year now, no problems.
Really, it is.
It can be a pain in the ass to get working still, and is buggy.
I'm sure it works well for some, but many others still have problems.
Yes, Linux audio sucks. If nothing else, we have three common and incompatible APIs to perform a single task, and none of them is definitively better than the others. So, my question: what exactly is it that we're trying to achieve? What's the end goal of creating newer APIs instead of perfecting the old ones, such as moving from OSS to ALSA to whatever they roll out this month?
For comparison, FreeBSD uses multi-channel OSS. You can have a whole passel of processes writing to /dev/dsp simultaneously, because whenever a process attempts to open it, the OS spawns off a new copy. It Just Works. I'm a little amazed that my FreeBSD server's sound handling is so much better than my Linux desktop's and requires approximately zero client configuration. So again, what was Linux hoping to achieve by dropping old "obsolete" OSS in favor of increasingly complex solutions?
Dewey, what part of this looks like authorities should be involved?
And that is that ALSA's way of handling mixing is completely moronic.
As a user, I care about hearing sound first of all. Sound quality (no pops or crackles) comes second, latency comes third.
There should always be sound mixing, with no ifs, buts, exceptions, or configuration required. It should be there by default for anything that tries to play sound, whether through ALSA or the OSS backwards compatibility.
The result of this nonsense is that crap like pulseaudio continues to exist, which is CPU hungry, often skips, fails to work with some programs and crashes frequently (what the hell is up with that?).
Is there any document out there which explains why /dev/dsp doesn't get mixing with ALSA? And why nobody tried to patch that yet?
The audio quality is disappointing on Linux. I don't know if it's the decoding or the playback, but audio sounds much better on Windows.
Not to mention the state of media players on Linux...
Going a little bit off-topic first: a cousin of mine had Ubuntu on his laptop (featuring a Geforce 9300M G) and couldn't get rid of image tearing in VLC. Who would be the culprit in this case? The video drivers or the media player? I have kept wondering since then and my enquiring mind would like to know.
At any rate, could you please elaborate? What makes media players bad under Linux?
"The body may heal, but the mind is not always so resilient." -- Deus Ex: Human Revolution
I HIGHLY agree with this. I thought audio production was at a complete standstill in the days of Rosegarden since it crashed for no reason on any machine I installed it on. It was a great concept, but it simply didn't work. When I installed Ardour I couldn't believe how great and functional it was. The ubuntustudio package is indeed a super easy way to get yourself up and running. Not to mention there is a pretty big following in #ardour on Freenode. Always someone there willing to lend a hand. Beats spending tons on Pro Tools, Adobe Audition, Sonar, or anything else out there. I believe there's a Mac version too.
*plays the Apogee theme song music*
Over the years I had a lot of problems with ALSA, the biggest being the lack of sound mixing with the sound card on my motherboard. To get around it, I went out and bought a different sound card that supported hardware mixing. I still had problems where ALSA would just break periodically and require restarting it. Then at one point it just plain broke and nothing would fix it.
I had enough and installed OSS. What a difference. Latency is better and it just works. There is no excuse for not providing consistent audio mixing. I should have switched to OSS in the beginning rather than buy an expensive sound card because ALSA couldn't do software mixing.
A sound API should provide sufficient abstraction so that basic operations do not depend on the underlying hardware. Mixing, sample rate conversion (when needed) and per-application volume settings fall under basic operation as far as I'm concerned.
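To make the "mixing plus per-application volume" idea concrete, here is a rough sketch of what a software mixer does at its core. It assumes mono, signed 16-bit PCM streams that already share one sample rate; the function name `mix_streams` and the gain values are made up for illustration and are not taken from OSS, ALSA or PulseAudio.

```python
from array import array

INT16_MIN, INT16_MAX = -32768, 32767

def mix_streams(streams, volumes):
    """Mix mono 16-bit PCM streams of equal sample rate into one stream.

    streams: list of array('h') sample buffers, one per application
    volumes: list of per-stream gains (1.0 = unchanged), same order as streams
    """
    length = max((len(s) for s in streams), default=0)
    mixed = array("h", [0] * length)
    for i in range(length):
        total = 0.0
        for samples, gain in zip(streams, volumes):
            # A stream that has already ended contributes silence.
            if i < len(samples):
                total += samples[i] * gain
        # Clamp to the 16-bit range instead of letting the sum wrap around,
        # since wrap-around is heard as a loud click.
        mixed[i] = max(INT16_MIN, min(INT16_MAX, int(round(total))))
    return mixed

# Hypothetical example: a music player at full volume, a chat client at half.
music = array("h", [1000, 2000, 3000])
chat = array("h", [400, 400])
print(list(mix_streams([music, chat], [1.0, 0.5])))
```

A real mixer also has to deal with differing formats, channel counts and timing, which is where most of the actual complexity lives.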
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
My chief complaint, both on Windows and Linux is that probably 99% of applications have no concept of anything other than the default sound card, making multiple cards useless for all but a few niche applications. Apps that use sound need to provide a way to specify which device is used in case the user wants to use other than the default, period. None of the solutions for audio so far have really done anything to make this better (or they make it worse in the process) - granted, it's mostly an application issue, but control of device selection in the mixer as well would help.
I do not understand how ALSA differs from the in-kernel specification you are proposing. It exposes every possible hardware feature via a well-defined common API.
Moreover, I don’t think many programmers get excited about managing audio buffers and performing sample rate and format conversion, so they would still need a userspace library to do at least those jobs (the kernel can’t even do floating point!). So here comes PulseAudio, which gives developers far greater freedom than any kernel-based implementation ever could. How would you deal in-kernel with features like sound over bluetooth, user-provided codecs, sound over the network, or sound redirection for whatever reason you could ever think of?
I agree with the specification point, and agree that the lowest level API should be as basic (and standard) as possible. Then, once you have that, you can layer whatever higher-level architecture you like on top, as the low-level drivers are "just there" and will "just work".
However, this doesn't necessarily help applications. I would argue that to help app writers, you need to standardize the glue between layers, such that sound and commands can be passed from one layer to another in a predictable manner. Innovators can always add new commands that are parsed by their own injectable layer.
I would also argue that it's impossible to chain userland software a-la JACK via the kernel efficiently, as you've a double context switch per element in the chain. Since transforms are CPU intensive, you want to do the fewest composite transforms possible, which means a software mixer should be something you can chain, which means that the heavy-lifting mixer needs to be in userspace.
(Either that, or you're going to need LADSPA and LV2 support in the kernel, plus some way of coaxing "smart" sound cards into supporting such effects. Since the kernel developers would force the first coder who tried to submit such a patch to walk the plank, I don't see it as likely.)
This would leave the low-level mixer for mixing between kernel threads (rather than between applications per-se) and normalizing the inputs. If we're not having to normalize values anywhere else in the process, we should end up with improved quality and less latency. (Anything that mucks with precision hurts quality, and any operation at all hurts latency.)
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
App -> libao -> OSS API -> OSS Back-end - Good sound, low latency.
App -> libao -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> libao -> ALSA API -> OSS Back-end - Good sound, low latency.
App -> libao -> ALSA API -> ALSA Back-end - Bad sound, horrible latency.
App -> SDL -> OSS API -> OSS Back-end - Good sound, really low latency.
App -> SDL -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> SDL -> ALSA API -> OSS Back-end - Good sound, low latency.
App -> SDL -> ALSA API -> ALSA Back-end - Good sound, minor latency.
App -> OpenAL -> OSS API -> OSS Back-end - Great sound, really low latency.
App -> OpenAL -> OSS API -> ALSA Back-end - Adequate sound, bad latency.
App -> OpenAL -> ALSA API -> OSS Back-end - Bad sound, bad latency.
App -> OpenAL -> ALSA API -> ALSA Back-end - Adequate sound, bad latency.
App -> OSS API -> OSS Back-end - Great sound, really low latency.
App -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> ALSA API -> OSS Back-end - Great sound, low latency.
App -> ALSA API -> ALSA Back-end - Good sound, bad latency.
Do you by any chance buy Monster cables, and a wooden volume knob, because it "sounds better"?
I'm sorry, but without proper ABX tests, I do not believe a single word of this table.
And about the latency: Please enlighten us, how you actually measured them?
Any sufficiently advanced intelligence is indistinguishable from stupidity.
Try using the OpenGL output driver, and make sure 'wait for vertical blank' (vsync) or a similarly-worded option is enabled.
Which is great, but it's not so great if you are trying to produce audio.
When I plug my guitar in, I can notice a latency greater than 5ms. And greater than 25ms, it drives me insane.
Compare that to what I get with PulseAudio (usually): 100-150ms. No thank you.
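For anyone wondering where numbers like these come from: the buffering part of the latency is just buffer size divided by sample rate. A back-of-the-envelope sketch follows; the function names and the buffer sizes are illustrative assumptions, not measured settings of any particular sound server, and real round-trip latency adds converter and driver delays on top.

```python
import math

def buffer_latency_ms(frames, sample_rate_hz, periods=1):
    """Time in milliseconds that `periods` buffers of `frames` frames hold."""
    return 1000.0 * frames * periods / sample_rate_hz

def max_frames_for_latency(latency_ms, sample_rate_hz):
    """Largest single buffer (in frames) that stays within latency_ms."""
    return math.floor(latency_ms * sample_rate_hz / 1000)

# A 256-frame buffer at 48 kHz is already slightly over 5 ms.
print(buffer_latency_ms(256, 48000))
# 100 ms at 44.1 kHz corresponds to 4410 frames of buffering.
print(buffer_latency_ms(4410, 44100))
# To stay under 5 ms at 48 kHz, one buffer can hold at most this many frames.
print(max_frames_for_latency(5, 48000))
```

So a 100-150ms figure implies several thousand frames queued up somewhere between the application and the card.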
For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
I'm having difficulty understanding where you're coming from regarding the media players in Linux. For audio, Amarok 1.4 is nothing short of amazing. Even the venerable Winamp and Foobar can't touch it for flexibility and keyboard friendliness. As for video players, mplayer is one of the programs I fire up to show Linux off to Windows using naysayers. It loads anything you throw at it and it's blazing fast at seeking, start-up, you name it. And if you just have to have a GUI, VLC is untouchable. Hell, Miro will play your local videos too and it has tons of features. I know that all of the video programs I mentioned run under Windows too but, to imply that media players on Linux suck is disingenuous at best.
The soylentnews experiment has been a dismal failure.
Pretty sure VLC doesn't do hardware acceleration on any platform period. Nvidia supports VDPAU in linux which allows you to play HD flawlessly with practically any card as long as the video player supports it (and a number do, mplayer and XBMC are two that come to mind off the top of my head).
See: http://www.phoronix.com/scan.php?page=article&item=nvidia_vdpau_gpu&num=1
For so many ALSA just works when it is installed, not for me.
For so many OSS just works when it is installed, not for me.
For a few, PulseAudio just works when it is installed, not for me.
The one thing I will fault ALSA and OSS for is not allowing multiple audio streams to play simultaneously without crashing the system, in all circumstances. The support forums are littered with issues like this. It definitely does NOT work for everyone. At least handle one and play one, just choose one, but to crash and not play anything, that just sucks. And to force a reboot before working again, well that is a FAIL! This is not from my personal experience, as I have posted, I could not get any of them to work for my scenario, but based on days and weeks of searching through forum posts looking for solutions, I know I am far from alone.
I read support requests for all three: ALSA, OSS and PulseAudio. So to date, outside of BSD (which I am currently NOT running, but may in the future) Linux + SOUND is a very REAL ISSUE! (see my solution at the end of this post, there is one solid solution, guaranteed to work)
For any one of these to be the best solution, they must handle additional audio streams, from any source, without crashing the system. For instance, when my VoIP phone rings, and I answer the phone, the Audio Radio stream, CD playing, or Video should pause until I restart it and let me answer the phone and hear the person talking to me.
Ideally if I want to listen to the music and watch a video at the same time, I should be able to mix in the sound levels and do that. The solution should have a way to handle it. Heck I should be able to hear the radio, video and VoIP phone all at the same time if I wanted this. I should be able to mix the sound levels and it should work.
Back in the mid 90s I was using a MIDI keyboard to play a sound track and save it, then using that same keyboard to play another sound track and save it. I could even convert what I played on the keyboard and make it seem like it was a different musical instrument. An oboe, a flute, a trumpet, etc. The software (Audio Visual Communications 1.3 running on OS/2 1.2, when the marketeers would have you think only a Macintosh could do this; I was doing it on both IBM PCs and Macs.) would then let me play back all the sound tracks together. I could literally create my own symphony.
Based on what I read, this was one of the major things PulseAudio was going for that was NOT available in either OSS or ALSA. The fact that OSS had gone proprietary was not helpful either. I think they have both a proprietary and open source OSS solution today, but am not sure.
And this was over a decade and a half ago. So Linux should have this today. Perhaps an API solution would allow for this, but first, just to handle multiple audio streams in an intelligent way without crashing the system would be HUGE!
For anyone reading this that wants to avoid these types of issues, there is a solution. If your PC was installed with Linux out of the box, with everything you need: WiFi, 10/100/1000 Ethernet, Sound (audio), Video, Burn CDs, Burn DVDs, plug n play USB support, Ext Monitor support if a netbook or laptop, You should be okay!
Stop going to any vendor and buying a PC with any other operating system installed on it. Only buy hardware with Linux pre-installed and you avoid a lot of issues. Avoid vendor LOCK IN.
I agree. I think it has more to do with some kernel developers who refuse to consider OSS after OSS3.
The OSS kernel interface is simple and the audio mixing is performed in the kernel (if needed) where it should be. All an app needs to do is open /dev/dsp and perform a few ioctl calls and they're ready to go. They don't need to care whether some other application is also playing audio or not.
It's much cleaner than ALSA, which is a mess IMO. I've had a lot of problems with ALSA until I finally dumped it for OSS4 which solved the constant clicking, stuttering and lack of audio mixing. ALSA would often need to be restarted and it finally got to the point after a kernel upgrade where ALSA just plain refused to work at all.
With OSS I can basically choose the format of the audio, the sample rate and the volume and just set it and go. If the hardware doesn't support multi-stream mixing and volume then OSS does it in software. Similarly, if the hardware doesn't support the sample rate (i.e. 44100) then OSS will resample it to match the hardware, thus abstracting the hardware from the software, which is the way it should be.
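To illustrate the resampling step, here is a minimal sketch using plain linear interpolation. This is NOT how OSS4 actually does it (I haven't read that code, and real resamplers use band-limited filters to avoid aliasing); `resample_linear` is a made-up name and the sketch only shows the general idea of converting a stream to whatever rate the hardware wants.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Convert mono samples from src_rate to dst_rate by linear interpolation."""
    if not samples:
        return []
    if src_rate == dst_rate:
        return list(samples)
    out_len = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for n in range(out_len):
        pos = n * step
        i = int(pos)
        frac = pos - i
        a = samples[min(i, last)]
        b = samples[min(i + 1, last)]  # hold the final sample at the end
        out.append(a + (b - a) * frac)
    return out

# Doubling the rate inserts a midpoint between neighbouring samples.
print(resample_linear([0, 10, 20, 30], 2, 4))
```

The application keeps writing at 44100 and never needs to know the card only runs at 48000.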
Since you're the only one who asked politely, I'll share with you. Nothing necessarily makes media players bad under Linux, but similarly, nothing guarantees one will be good either. The reason I don't believe any Linux media players are good is simply that nobody has written one yet. While Amarok looks promising, that's all it looks, but next to a commercial offering like iTunes it's nothing. It's not that I particularly like iTunes, it's just that:
short version: bugs bugs bugs bugs bugs
All this ignoring the unacceptable quality of some sound drivers, the nForce4 AC97 being my current example, but most cards that I've used in the last ten years suffering OSS->esd->aRTS->ALSA and back again have exhibited artifacts, often DC bias in the output. I guess it's probably because a lot have been incompletely reverse engineered.
Believe me, I've never been a fan of iTunes either, but on the other hand it's given me no reason to hate it either. It does what's promised nice and smoothly, and stays out of my way. Ultimately, here's a hypothetical example of a key difference I believe to be between a commercial closed source player like iTunes and an OSS one like Amarok.
I accept that if you run Linux, Amarok is as good as you're going to get. Until a month ago, it was the best *I* had for about three years. (Before that was XMMS, but it became stable so Gentoo removed it). I hated every minute of it, and now that I've got the choice I'll never use it again.
I do. Actually, after I post this I'll be using it to produce a Jazz album that I also recorded on a Linux machine (Fedora, ccrma). I think Jack is great for hooking up different audio applications and I think the resulting production process is a real step forward from existing digital mixing/mastering processes. Not perfect, sure, but I'm not certain it can be duplicated on Macs and Windows boxen.
I've produced three albums so far under linux and the software has come forwards significantly since I started playing around with it in 2003.
In recording mode - where it matters most - I have a machine that is stable because if there are any problems during the recording the musicians are not likely to be understanding. Usually the machine is set to record over a day or two with 16 channels of input and very little interference. The underlying features that a Linux box offers like LVM's, fast file systems like reiserfs, tunable kernels are a bit of a hassle to set up at first but the result is an exceptionally stable system.
There are shortcomings but I just develop new habits to overcome them. With the money I saved on a mac and protools I have bought some great recording equipment. I plan to start donating to the Ardour and jack projects because that is what they need to improve and make them progress a lot faster.
Without the ALSA project as a foundation I don't think any of the sound projects happening now under Linux would have been possible.
My ism, it's full of beliefs.
If we switch to OSSv4, people will start whining because we will have three sound systems instead of two. A gift for all Linux FUD spreaders. Driver quality will not improve in the switch from ALSA to OSS (why should it?) so people will keep complaining about cracks and pops and out-of-the-box hardware support, and new bugs will inevitably crawl in during the process of converting existing drivers from ALSA to OSS.
Of course, developers will have to support ALSA for a long time (dropping ALSA altogether would break nearly ALL the current Linux applications, not just flash player) so the support burden for distribution maintainers would become even heavier.
All of this - because ALSA does not match the pipe dream about sound systems of TFA writer. In the end, the features offered to the end user by an OSSv4 stack would be fewer than those provided by a working ALSA + PulseAudio stack, as even the writer himself states (about hibernation support).
Not to mention the fact that nowadays many applications will make use of high level libraries that hide the details of the sound system from them, so they couldn’t care less about ALSA or OSS.
So no, thank you! Please report bugs, do complain as loud as you can, but yet another fork is the last thing we need now.