GNU C Library Alternative Musl Libc Hits 1.0 Milestone
New submitter dalias (1978986) writes "The musl libc project has released version 1.0, the result of three years of development and testing. Musl is a lightweight, fast, simple, MIT-licensed, correctness-oriented alternative to the GNU C library (glibc), uClibc, or Android's Bionic. At this point musl provides all mandatory C99 and POSIX interfaces (plus a lot of widely-used extensions), and well over 5000 packages are known to build successfully against musl.
Several options are available for trying musl. Compiler toolchains are available from the musl-cross project, and several new musl-based Linux distributions are already available (Sabotage and Snowflake, among others). Some well-established distributions including OpenWRT and Gentoo are in the process of adding musl-based variants, and others (Aboriginal, Alpine, Bedrock, Dragora) are adopting musl as their default libc." The What's New file contains release notes (you have to scroll to the bottom). There's also a handy chart comparing musl to other libc implementations: it looks like musl is a better bet than dietlibc and uClibc for embedded use.
... which I don't believe because the guys at gnu know a thing or 2 about compilers and libraries - or this library has cut some corners and/or missed out some functionality.
For those curious about which "5000 packages" build with musl, there are the awesome automated pkgsrc test results published here: http://wiki.musl-libc.org/wiki...
It might be easier to add than to remove, leading to bloat over time and glibc has been around for a while. Also, building on old code might mean that you are limited in what you can change. For example, the modular design of LLVM has been a pretty big success and is considered easier to work with/develop than gcc. For musl, I think they have decided to remove all legacy stuff + non-standard extensions.
libc is just old and not developed very much...
I'm guessing that it only targets x86, amd64, ARM, and MIPS. That sounds comprehensive until one considers sparc, HPPA, PPC, POWER, and various "embedded but not ARM or MIPS" architectures like Blackfin or CRIS.
What is the real benefit besides license? Is it correctness?
-=/\- Jizzbug -/\=-
I downloaded the library to see some random code. Here is the very first file I (randomly) chose (putw.c):
#define _GNU_SOURCE
#include <stdio.h>
int putw(int x, FILE *f)
{
return (int)fwrite(&x, sizeof x, 1, f)-1;
}
Cheers.
From the FAQ:
On musl, the entire standard library is included in a single library file — libc.a for static linking, and libc.so for dynamic linking. This significantly improves the efficiency of dynamic linking, and avoids all sorts of symbol interposition bugs that arise when you split the libraries up — bugs which have plagued glibc for more than a decade.
Bringing it all together? That's why they call it the love musl.
You obviously never worked on or looked at their source code.
The first priority on musl is correctness, and they will take a hit to size and speed if that's what's necessary to achieve it. But thus far, they've been doing a good job of achieving correctness without introducing too much bloat.
Take a look at their page on bugs found while developing musl, and you'll find that they've found and reported quite a few bugs in glibc where glibc had been "cutting corners".
Steps to a useless comment:
1) Speculate on the features of something
2) Note that that speculated feature set doesn't include something you want
3) Criticise based on your speculation
For every problem, there is at least one solution that is simple, neat, and wrong.
You're right that musl doesn't support the same breadth of architectures that glibc does. They currently support x86, amd64, ARM, MIPS, PPC, microblaze, and they have experimental support for superh and x32.
One big advantage they do have is that it's much simpler to add support for a new architecture to musl than it is to add it to glibc. They are interested in supporting more architectures, so I'd expect their list of supported architectures to grow fairly quickly if there are people interested in that support.
bugs fixed:
- buffer overflow in printf when printing smallest denormal exactly
-=/\- Jizzbug -/\=-
The first thing I saw was MIT vs GPL. What is the difference between the two?
Never trust a man wearing a coat and tie!
The chart shows a few things, though I notice they don't include comparison to the full glibc itself.
Hacker Public Radio is our Friend
Such a trend to reinvent wheels. Hidden intention seems to be to allow more "Mixed source" BS through the push for permissive licenses. And devs are falling all over it by providing free code to these projects. A shame.
Here is a link to the comparison chart mentioned in the description.
Have you ever looked at static linking in detail? A .a file is basically a collection of .o files. The linker only links those that are needed. So they have a single .a file instead of two or more .a files. This allows them to prevent difficult interdependencies between those .a files.
The end result might still be a very small subset of the complete library.
Secure messaging: http://quickmsg.vreeken.net/
Where does it say you have to link the whole thing into your application? Musl supports dynamic linking just fine. The musl developers do have a preference for static linking, so they have better support for it than glibc (see their size comparisons of static linked programs on musl and glibc, for instance). But that doesn't mean you have to use it.
The bit about aiming for correctness is correctness of musl itself. Of course they can't, in general, guarantee that you will write your own code correctly. In theory, they could split the math library out and force you to link against it correctly. But what would be the point? To arbitrarily break broken programs, while having no impact on correct programs? It would also have several downsides.
Musl is the only C library I'm aware of which allows the entire C library ecosystem (C library, math library, threading library, dynamic linker, and some others probably) to be upgraded atomically, which eliminates a small window during upgrade where you might start a new program and have it break because it gets conflicting versions of these components.
There is also code within the main C library (for example, the code to format floating point numbers in printf) which benefits from being able to call functions that are part of the math library.
Forking the Linux userland yet again should have some serious motivations behind it. I can find them neither in the benchmarks nor in the feature comparisons provided here.
But did you use Bitcoins and Apple when building it? Then it would be 2^256 times more newsworthy.
At the time the comparison was made, glibc was essentially unmaintained and Debian-based distributions were using the eglibc fork. Now that glibc is under new leadership, eglibc is being discontinued and the important changes have been merged back to glibc upstream. So when I update the chart's quantitative comparisons, it will be for glibc rather than eglibc. The main things that will change when I do are significant increases in size (especially since I seem to have under-measured eglibc's totals) and possibly some improvements in performance. In terms of all the other qualitative comparisons, glibc remains about the same place it was before.
I am glad to see an alternative to GNU's C library. I have run into a number of bugs with the GNU C library, or more specifically, incompatibilities which crop up between versions. Sometimes behaviour changes between one version of the library and another, causing end-user applications to stop working properly. If someone provides a more consistent library with similar performance, I would be happy to see it adopted.
... which I don't believe because the guys at gnu know a thing or 2 about compilers and libraries - or this library has cut some corners and/or missed out some functionality.
NSA has not had a chance to sneak their stuff in yet?
Unlike some projects, we fully disclose bugs that might be relevant to security. In this instance, the bug could only be triggered by explicitly requesting sufficiently many decimal places (16445 for ld80) and printing a denormal long double with the lowest bit set, as in:
printf("%.16445Lf", 0x1p-16445L);
In addition, even when triggered, it only wrote past the end of the buffer by one slot, and we were unable to get it to overwrite anything important like a return address (of course, what it overwrites depends on the compiler, so in principle it could).
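For anyone who wants to reproduce the trigger condition on their own libc, here is a minimal sketch. It is not musl's code; the helper names (`denormal_places`, `smallest_denormal`, `format_smallest_denormal`) are made up for illustration. It derives the precision from `<float.h>` instead of hard-coding 16445, so it should also work where long double is not the 80-bit x87 format.

```c
#include <float.h>
#include <stdio.h>
#include <stdlib.h>

/* Decimal places needed to print the smallest denormal exactly.
 * The value is 2^-k with k = LDBL_MANT_DIG - LDBL_MIN_EXP
 * (k is 16445 for the 80-bit x87 long double mentioned above). */
int denormal_places(void)
{
    return LDBL_MANT_DIG - LDBL_MIN_EXP;
}

/* Build the smallest denormal by repeatedly halving the smallest
 * normal value; assumes the platform supports denormals. */
long double smallest_denormal(void)
{
    long double x = LDBL_MIN;
    for (int i = 1; i < LDBL_MANT_DIG; i++)
        x /= 2;
    return x;
}

/* Hypothetical helper: returns a malloc'd exact decimal string
 * ("0." followed by denormal_places() digits), or NULL on failure. */
char *format_smallest_denormal(void)
{
    int places = denormal_places();
    size_t size = (size_t)places + 3;   /* "0." + digits + NUL */
    char *buf = malloc(size);
    if (!buf)
        return NULL;
    int n = snprintf(buf, size, "%.*Lf", places, smallest_denormal());
    if (n < 0 || (size_t)n >= size) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

Since 2^-k needs exactly k decimal places and always ends in the digit 5, checking the last character of the returned string is a cheap sanity test that the libc printed the value exactly.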
Flagellum and all?
Spent All My Mod Points
Oh. my. god. I didn't see the picture... http://en.wikipedia.org/wiki/F...
Spent All My Mod Points
WTF does this mean? I'm sure as hell not developing against a libc that doesn't have debugging hooks. This can't be what it means.
I read the internet for the articles.
Yeah, the guys at gnu must know a lot, it's thanks to them that we have autoconf hell and myriads of projects trying to save us from it. Or why others started LLVM, or why others start a libc.
Don't measure the knowledge of the gnu guys based on your own lack of it.
Musl is the only C library I'm aware of which allows the entire C library ecosystem (C library, math library, threading library, dynamic linker, and some others probably) to be upgraded atomically,
GLIBC allows this. You can definitely bring in new versions of all of those.
SJW n. One who posts facts.
It doesn't mean you can't use gdb, just that libc itself does not try to double as a debugging tool. This is actually a security consideration. For example, glibc prints debugging information if it detects corruption in malloc. But if there's already memory corruption, you have to assume the whole program state is inconsistent; the corruption may be intentional due to the actions of an attacker, and various function pointers, etc. may have been overwritten. Continuing execution, even to print debug output, risks expanding the attacker's opportunity to take control of the program.
FWIW, musl does detect heap corruption. The difference is that it immediately executes an instruction that will crash the program rather than trying to continue execution, make additional function calls that go through indirection (the PLT) and access complex data structures, etc.
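To illustrate the "crash immediately instead of printing diagnostics" idea, here is a rough sketch. This is not how musl's allocator is implemented; the `struct chunk` layout, the `GUARD` value, and both function names are assumptions invented for the example. `__builtin_trap()` is a GCC/Clang builtin that emits a trapping instruction without going through the PLT.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard value stored in every chunk header. */
#define GUARD 0x5ca1ab1eU

/* Hypothetical allocator header, for illustration only. */
struct chunk {
    uint32_t guard;
    size_t size;
};

/* Returns nonzero if the header still carries the expected guard. */
int chunk_is_intact(const struct chunk *c)
{
    return c->guard == GUARD;
}

/* On corruption, stop right here: no stdio, no function pointers,
 * no further reads of possibly attacker-controlled state. */
void check_chunk_or_crash(const struct chunk *c)
{
    if (!chunk_is_intact(c))
        __builtin_trap();
}
```

The design choice is that the failure path touches nothing beyond the one word it just compared, so a corrupted heap gives an attacker no extra code to steer.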
Don't measure the knowledge of the gnu guys based on your own lack of it.
Why not? It's exactly how we operate our political... oh, wait...
You have the right to remain sentient. If you give up the right to remain sentient, you will be elected to public office
As far as I know, the whole .a file gets linked in each time unless you use -fdata-sections when compiling and -Wl,--gc-sections when linking.
What sig ?
There is a way to upgrade glibc atomically, but it's a big hack, and even still it doesn't achieve the goal. The way it would work is to have /lib be a symlink to a versioned directory, and atomically replace (via rename()) the symlink with a symlink to the new directory. However, even if the replacement is made atomically, you still have the situation that the dynamic linker can load libc.so before the replacement is made and libpthread.so after it's made, resulting in mismatching versions.
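The symlink trick described above can be sketched roughly as follows. The helper name `swap_symlink` and the `.tmp` suffix are assumptions for illustration; only `symlink()`, `rename()` and `unlink()` are real POSIX calls. Note that this makes the directory switch atomic, but does nothing about the window where a process has already loaded one library from the old directory.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: atomically repoint the symlink `link_path`
 * at `new_target`. A temporary symlink is created next to it and
 * then rename()d over the old one, so there is never a moment when
 * `link_path` does not exist. Returns 0 on success, -1 on failure. */
int swap_symlink(const char *link_path, const char *new_target)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", link_path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    unlink(tmp);                        /* a stale temp link may or may not exist */
    if (symlink(new_target, tmp) != 0)
        return -1;
    if (rename(tmp, link_path) != 0) {  /* replaces the old symlink in one step */
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

A call such as `swap_symlink("/lib", "lib-2.19")` would be the whole upgrade step under this scheme (the directory name is made up).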
Thanks for the update. I used to hear about the drama around drepper, but I didn't understand what happened after he left.
argument by status quo? surely you're not serious that
since gnu has been at it a while it follows that they are good?
what's this a remix of the old hungarian mathematics joke?
The compare page is missing the only other entry I wanted to see.... and that is, BSD libc. This is widely used by QNX (Blackberry) and probably all kinds of other vendors. I imagine Apple has their own fork.
The others aren't comparable because they're copylefted, so cannot be used everywhere like musl and BSD libc can.
If that were true, there would be no difference in size in static linking regardless of how many functions you used from the .a file. It's the entire original .o file that gets linked in, normally. This is why musl is split up into many .c files which are compiled into many .o files.
No. You're thinking of the fact that once the linker marks a .o as required, it brings in all of that .o file unless you provide those flags you mention. But the point is definitely that only the .o files that are needed get linked into the final object model. Just do a gcc of a simple "hello world" program and then look at the a.out's size; it won't be anywhere near as big as libc.a. Also, nm a.out will show you the relatively few symbols in the a.out file.
Let me fix that for you:
Steps to a typical Slashdot comment:
1) Speculate on the features of something
2) Note that that speculated feature set doesn't include something you want
3) Criticise based on your speculation
The .a file is an archive of the individual .o files, only the individual .o files that are actually referenced get linked into the final executable. See also:
If I have been able to see further than others, it is because I bought a pair of binoculars.
You are brain damaged and are commenting without knowing anything about linkers.
Just look at typical GNU code though. It's well written but it's not small, and often not efficient. Much of this is due to accretion over time, however there also is a certain style that the programs follow. Thus the parodied GNU HelloWorld program. Glibc makes an implicit assumption that it is being used on a fast computer with lots of memory (ie, a PC or minicomputer). This is perfectly normal, however it leads to a different sort of optimization than you would find for embedded systems or small computers for example, thus the popularity of alternative standard C libraries or lots of roll-your-own.
Yes some functionality may be missing, but is that necessarily required or standard functionality?
I wonder how small Musl is in comparison to Bionic which is really, really small.
You don't know much.
Some parts of glibc are definitely broken. For example, snprintf(3) does a ton of dynamic memory allocation, which means printing a formatted string to a static buffer could still fail with ENOMEM! That's because snprintf is a wrapper around fprintf() using a dynamic file object, among other niceties! Sane implementations like on OpenBSD are async-signal-safe for all the basic formatting specifiers.
The problem with glibc is they do too much dynamic memory allocation in general. Several functions you would think should never fail on a sane implementation could fail. Then you have just plain stupid stuff, like NL_TEXTMAX being INT_MAX, because apparently the GNU folks expect that some time in the future strerror_r() may return system error messages gigabytes in length. It's really because they took too literally the GNU Coding Style requirement of using dynamic memory as much as possible to avoid arbitrary limitations. But sometimes arbitrary limitations are really nice, making simpler code which is more secure. Imagine dealing with DNS names of arbitrary length!
It's ridiculous. When you're boxed into a corner because of various _other_ failures (I/O, authentication failure, w'ever), the last thing you want to worry about in your failure path is having to deal with crap like OOM conditions.
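As a concrete picture of the failure-path use case, here is a minimal sketch of formatting into a caller-supplied fixed buffer. The function name `format_error` and the message format are assumptions for illustration. Whether the underlying `snprintf` allocates or is async-signal-safe depends entirely on the libc; the code only shows what the caller expects to be able to rely on.

```c
#include <stdio.h>

/* Hypothetical helper: format an error line into a fixed buffer.
 * Returns 0 if the whole message fit, -1 on truncation or on an
 * snprintf error. On truncation the buffer still holds a
 * NUL-terminated prefix (when size > 0). */
int format_error(char *buf, size_t size, const char *what, int code)
{
    int n = snprintf(buf, size, "%s failed (code %d)", what, code);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With a static `char msg[64]`, a call like `format_error(msg, sizeof msg, "open", 2)` has no obvious reason to need the heap, which is why an ENOMEM from it is so surprising.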
Someone with mod points, please mod up the parent post. Even if you disagree with it, it's informative about one of the big issues in glibc that musl does differently: musl's snprintf and dprintf, for example, are async-signal-safe. Roland McGrath, who holds claim to being the "inventor" of dprintf and author of the original implementation in glibc, has stated that he intended for the function to be async-signal-safe or at least close to it, and that later introduction of dynamic allocation is a bug (which I later filed as #16060) that glibc should fix.
good lord. it has *never* been this way. even in the 70s.
I'm not sure why this was voted down, but there is a point here. If the future of OSS belongs to licenses that don't protect the continued openness of source, OSS authors are effectively working for free and competing with those who take their code and wrap it up in proprietary licenses.
It's ok to charge cash for binaries and/or source. So why is it so terrible to 'charge' source for source?
No, glibc is just garbage, like so much other Gnu code. And no, I disagree with your unwarranted assertion that Gnu code is typically well written. Counter examples galore: Gnu Emacs, Autoconf, Automake, guile, GCC.
It's just Gnu has fame from blowing the Free Software horn so hard and for so long, that people naturally and incorrectly assume that they must be technically competent.
I wouldn't work for one of these projects and be exploited like this. If Apple wants to use part of the toolchain that makes them billions, they can pay for it.
Well I would, and have. I build embedded Linux systems, and the code I write for a living isn't usually open source. I have contributed back patches to musl and a bunch of other MIT/BSD licensed projects, because I use them in my work.
What being GPL/LGPL really means is that I'm less likely, as a proprietary software developer, to use (and therefore to contribute to) software when its license makes including it in my work more difficult/impossible. The end result is that I can earn a living and contribute to free software if it is BSD/MIT license, but I cannot if it is GPL.
My paymaster is not going to just turn around and say to me, well that's GPL, so I guess we'll just release all our software as GPL too.
The only time I use GPL software at work is either where the license allows, like running on Linux, and distributing the source for the kernel but keeping our secret sauce secret; or parallel invention, where I build something using a GPL library as a prototype, and then I or someone else has to go and reinvent the GPL parts we used as a prototype. For an actual example, I often use FFTW as a prototype in signal processing applications, but then we yank that bit out and use IPP because it's cheaper than a commercial FFTW license. I'm looking at using FFTS, which is MIT licensed and supposedly faster than FFTW and IPP, but I'm currently banging my head against their wack autoconf build system, and it also appears to not support x86 (it supports x86_64, but the docs wrongly suggest x86).
Unlike some projects, we fully disclose bugs that might be relevant to security.
Thank you. As a security guy, knowing that the disinfectant of sunlight is illuminating your project, I am willing to spend more time examining and using your project. I hope your project becomes the default libc everywhere.
"Someone needs to talk to the tree of liberty about its ghoulish drinking problem." by ohnocitizen
That's not true.
Nearly every time I used a BSD licensed program for some project that left my computer I contributed bugfixes and sometimes features back (if I added features, often just used unmodified).
The alternative with GPL (in the embedded world at least) is that most companies say f*ck off and license some proprietary system. The value of the source to some product (unless it's some disposable thing such as a home router) is much much higher than the cost of licensing some components. The community then gets nothing.
For a libc that is LGPL there is not really an issue, but getting a license vetted by legal can in some companies also be a very large cost, often also much higher than buying something from a company they already verified. A big (technical) advantage of musl is that it properly supports static linking. This allows making binaries that work almost anywhere.
BTW: Some marketing people for proprietary embedded solutions (cannot say which products) refer to Stallman as a way to spread FUD for open source. Some management people seem to get really scared of him.
But if there's already memory corruption, you have to assume the whole program state is inconsistent; the corruption may be intentional due to the actions of an attacker, and various function pointers, etc. may have been overwritten. Continuing execution, even to print debug output, risks expanding the attacker's opportunity to take control of the program.
I'm sorry, this sounds like a poor rationalization for lack of useful functionality. By this logic, you should crash the program without warning whenever invalid input is detected - it could be an attack since no program should ever provide invalid input to a function. In real life, programs have tons of bugs and diagnostic messages are hugely useful in identifying and then fixing them. Especially since the vast majority of programs are not used in a context where an attack can occur.
A 'paranoid' mode with this behavior may make sense for some people. Most people, especially those in the process of developing the software, would prefer diagnostics when things go wrong.
I'm sorry, this sounds like a poor rationalization for lack of useful functionality. By this logic, you should crash the program without warning whenever invalid input is detected - it could be an attack since no program should ever provide invalid input to a function. In real life, programs have tons of bugs and diagnostic messages are hugely useful in identifying and then fixing them. Especially since the vast majority of programs are not used in a context where an attack can occur.
A 'paranoid' mode with this behavior may make sense for some people. Most people, especially those in the process of developing the software, would prefer diagnostics when things go wrong.
I disagree. There is a big difference between invalid input to a function (eg trying to convert "abc" to an integer) and a memory corruption bug. In the former case, you can return an error to the caller, and if they were written with enough attention to detail, they can fix the problem and move on, or ask the user for actually valid input, or whatever followup action may be appropriate.
In the case of a memory corruption bug, there is no way to correct the problem and move on. By the time you detect the problem, you're already hosed. You can't even rely on the fact that the program accurately knows what it was in the middle of trying to do. I think crashing is absolutely appropriate here. And if you want to debug it, then attach a debugger to the program or the resulting core dump -- all the same information you would have gotten from printed diagnostics can be found (albeit with more effort). But trying to diagnose a memory corruption bug after the fact like this is the hard way to do it anyway. You really want to catch the corruption as it happens, and there are much better tools for this (valgrind, etc).
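The recoverable case (bad input such as "abc") can be sketched like this. The name `parse_int` and its return convention are assumptions for illustration; `strtol` and `errno` are standard C. The point is that the caller gets an error code back and can decide what to do, which is not possible once the heap itself is corrupt.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a base-10 int from `s`.
 * Returns 0 and stores the value in *out on success; returns -1
 * (leaving *out untouched) if `s` is empty, has trailing junk,
 * or does not fit in an int. */
int parse_int(const char *s, int *out)
{
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;                      /* no digits, or junk after them */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;                      /* out of range */
    *out = (int)v;
    return 0;
}
```

A caller that gets -1 for "abc" can re-prompt the user or fall back to a default, and the program state is still fully trustworthy.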
Perhaps you should read this: https://twitter.com/solardiz/s...
What is wrong with old code?
It is well-tested.
Even if you were correct, and you are not, you could just take the source code that you need from musl.
But you are wrong, and very brain-damaged.
To be fair Apple contributes a lot to OSS.
Printing support wouldn't be first-rate in Linux if not for Apple.