Measuring The Benefits Of The Gentoo Approach
An anonymous reader writes "We're constantly hearing how the source based nature of the Gentoo distro makes better use of your hardware, but no-one seems to have really tested it. What kind of gains are involved over distros which use binary packaging? The article is here."
The source-based thing isn't even why most people use gentoo. According to a recent poll on the gentoo-user mailing list, most people like it because of Portage (the package management system), with Customisation / Control coming in second (performance was third). Portage rocks. Even with the compiling, it takes less time to install some stuff (eg nmap) than it would take to locate the relevant .rpm. Of course, kde's a different matter, but with distcc compiling doesn't take too long.
Having said that, it looks like the guys doing the testing got their CFLAGS wrong. Gentoo's performance should never be worse than Mandrake -- I reckon they forgot omit-frame-pointer. Also, the kernel compile is unfair, because gentoo-sources includes a whole load of patches that Mandrake and Debian don't.
Finally, what's with measuring compile times? How is that a fair way of measuring performance? Hey, look, my distcc + ccache + lots of CPUs system with gcc3.2 can compile stuff faster than your single CPU gcc2 system... It's like comparing chalk and oranges.
Can this be correct? Debian turns out to be the fastest?
Anyway, I like the idea of Gentoo, and I saw a lot of Debian users head over to Gentoo because the idea of controlling everything, including the build, was nice. However, I also saw the Gentoo idea pretty much die, since a lot of these users are power desktop users and not everyone could wait 3 days for X to build.
What I like about Debian's packages is that if you do make a mistake, you can almost always correct it by fixing your sources list, or by going to packages.debian.org, getting the older working package, and installing it manually with a simple dpkg -i old_package.deb.
In Gentoo, you had to rebuild the whole thing, which with X could take forever. So what I saw Gentoo suddenly doing was providing a lot of pre-compiled binaries, because they saw the problem with builds taking forever. That kind of killed the whole idea of building for yourself, in which case, if you are going to stick with pre-built packages, why not have them maintained by some of the best developers around (i.e. Debian)?
The other thing I noticed is that a lot of software developers actually use Debian. I've noticed many a time that some cool software was being made and the developers would provide source and a Debian package and nothing else. That is, Debian appears to be the preferred developer's distro. I would like to hear discussion on this.
Thanks all
Sigs are dangerous coy things
I don't use Gentoo (When I use Linux, I use Slackware), but I do use FreeBSD and its ports collection.
Purported performance gains are one thing source packages give you (although I don't enable super optimizations because you never know when gcc bugs with -march=pentium4 -O3 or whatever will bite you).
There are two major reasons I like installing from source, though. One is that you can customize the build to your system; lots of software packages have various compile time options, and when I have the source I can choose exactly how it's going to be built.
Another thing is that when you install from source, you can hack the program to your heart's content. On my desktop box there are around 15 programs that I have to modify to get to act like I want (from simple things like getting cdparanoia to bomb immediately when it detects a scratch to halfway complex things like rewriting parts of klipper and XScreenSaver, which now picks a random screen saver on MMB and lets me scroll through all screensavers with the wheel =).
I don't modify stuff on my servers, but I still get to choose exactly how things are built, which I very much enjoy.
Most of the comparisons in the article were for X-related graphics applications, and while they were comparing the versions of the applications, they were not comparing the libraries underneath them (glibc, X11, and probably the window manager too come into play) and they should've compared versions there too. It becomes complicated because for a typical X11-based app there are probably several dozen libraries involved (in addition to all the configure-time options for them...)
Besides, before doing any comparisons on Debian vs. Gentoo they should have compared Gentoo vs. Gentoo on different optimizations. Like using -O2, -Os, or -mfpmath=sse. Comparing video drivers. Trying different filesystem types. And a whole gaggle of other configurables at compile-time.
You'd be yelling bloody murder if Microsoft sponsored a study without doing this sort of research before pitting Windows vs. Linux.
Doing the Right Thing should not be preempted by making a buck.
Using it. It rocks. Best Linux distribution yet.
So are Debian, RedHat, SuSE and Slackware, according to Debian, RedHat, SuSE and Slackware users (respectively). I take it you're a Gentoo fan?
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
I tried Gentoo for a while and eventually gave up. The problem is that you still have dependency hell. Most packages look for stuff at compile time, and many have optional components. For example a video player may not include support for QuickTime unless the libraries are already on there at compile time.
So the fun starts when you install stuff: packages don't include support for other components because those weren't there at compile time. You then discover the missing support, have to install the missing libraries, and then recompile every package.
This is an especially big issue with multimedia stuff, and gets many layers deep, as some libraries have optional components depending on other optional components.
About the only way to guarantee a fully up-to-date system is to keep doing complete recompiles of the entire system until there are no changes.
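The "recompile until nothing changes" loop described above amounts to iterating to a fixed point. Here is a minimal Python sketch of why a bad install order can need several full passes. The package table, the `build` model, and both function names are made up for illustration; this is not how Portage itself works, and real builds detect optional libraries in far messier ways.

```python
# Hypothetical table: package -> optional libraries it uses if present at build time.
OPTIONAL_DEPS = {
    "videoplayer": ["libquicktime"],
    "libquicktime": ["libjpeg"],
    "libjpeg": [],
}

def build(pkg, built):
    """Return the features pkg ends up with, given what is already built."""
    features = set()
    for dep in OPTIONAL_DEPS[pkg]:
        if dep in built:              # optional library was there at compile time
            features.add(dep)
            features |= built[dep]    # plus whatever that library itself supports
    return features

def rebuild_until_stable(install_order):
    """Install in the given order, then rebuild everything until nothing changes."""
    built = {}
    for pkg in install_order:         # initial install, in whatever order was picked
        built[pkg] = build(pkg, built)
    passes = 0
    while True:
        passes += 1
        changed = False
        for pkg in install_order:     # one complete recompile of the system
            new = build(pkg, built)
            if new != built[pkg]:
                built[pkg] = new
                changed = True
        if not changed:
            return built, passes
```

With the install order videoplayer, libquicktime, libjpeg, this toy model needs two full rebuild passes to pick everything up plus a third to confirm nothing changed; installing in the reverse order settles on the first pass.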
Upon testing with hdparm, it was apparent that this machine was having troubles setting above udma2. Eventually this problem was traced to the HD cable, a salutary lesson in the variability of identical hardware setups.
Very telling pair of sentences.
Stating on Slashdot that I like cheese since 1997.
There seems to be little attention given to the fundamental unfairness of this test presented.
The distributions were running with different software versions initially and although this was corrected there seems to have been little consideration given to the minor tweaks given to each different installation used. Which services were running on each system? Were the kernel settings identical in use? Were the machines experiencing differences in performance due to the X setup causing X to add different loads?
etc.
Fundamentally this test was probably not complete enough to suggest anything in particular. Perhaps it would have been better to boot a single machine three times and perform the sequence of events exactly the same each time as this would have also ruled out some other potential factors.
Jon.
http://www.jonmasters.org/
Here you go, the obligatory Gentoo Zealot Translate-o-matic reference!
Enjoy!
The Free desktop that Just Works
Having said that, it looks like the guys doing the testing got their CFLAGS wrong. Gentoo's performance should never be worse than Mandrake -- I reckon they forgot omit-frame-pointer.
Omit-frame-pointer is not a regular optimization. Working without stack traces to hand to a developer if you have a problem isn't really a reasonable optimization unless you're doing something like an embedded system, where you couldn't get at the stack trace anyway.
This is *exactly* what the real tech-heads have been saying for years, what my tests confirm, etc. A minor change in a couple of compile flags above -O2 almost *always* makes very little difference. Compiling your own packages really just plain doesn't matter. Maybe if gcc really was incredibly tuned to each processor, but certainly not with the current compiler set.
Also, the kernel compile is unfair, because gentoo-sources includes a whole load of patches that Mandrake and Debian don't.
And perhaps the inverse is true, too?
Look, the point is, Gentoo is not significantly faster than any other general distro out there. If you use it, it's because you like their tools or packaging scheme. You aren't cleverly squeezing out more performance.
Oh, and last of all, I've seen compiler folks saying that it's not that unusual for -O3 to perform worse than -O2. As I learned in the cache performance analysis part of my university course, cache hits and misses really *are* the dominant factor in almost all cases. Loop unrolling and function inlining can be a serious loss.
Finally, compiling for different architectures generally makes very little difference on any platform other than compiling for i586 on a Pentium. The Pentium runs 386 code rather slowly. The PII and above will happily deal with 386 code.
May we never see th
It seems that the people that would benefit the most from a source-based distro and optimizing binaries specifically for their hardware are the ones with the slow hardware that will take too much time to get everything installed for it to be a worthwhile investment of time.
I didn't see anywhere in the story if the Gentoo installation was done from scratch stage1 or from stage3. I would think this would be a very important piece of information to mention.
Where the Music Matters
It's not worth it. Moving from distro to distro for performance is pretty ridiculous.
Here's what I'd consider, since this is where the biggest differences lie:
* How frequently are new releases announced? Frequent new releases may be better for hobbyists, but a pain in the ass for servers sitting in a back room somewhere. (It's the reason RH can sell an enterprise edition that's simply not released as frequently.)
* How do you like the packaging system? Try out apt, emerge, up2date (actually, don't -- up2date truly sucks. Everyone using RH who cares about automatic updating has long since started using apt or (IMHO, better) yum).
* How do you like the config system? Most vendors have their own interface to let you configure the system. RH used to use linuxconf, and is now using Redhat-config. SuSE uses yast.
* How much do you care about commercial support? A few widely used distros tend to get the only commercial support. Mandrake gets a little, but if you're going to be running packages that require support (especially binary-only), you're probably best off with Red Hat.
* Which desktop environment do you want to use? Mandrake puts more work into KDE on their system, Red Hat into GNOME.
Arguments about speed or features are really pretty meaningless -- common software is generally packaged for most of these, and rare software for none (use checkinstall to *make* packages -- you'll be much happier). It's still Linux with the GNU suite present.
People that switch from distro to distro (or maintain *multiple* distros on their machine) are nuts, IMHO. It's a fair amount of work to relearn the quirks of each
If you invest a lot of time in learning a distro, you're terrified that it might not be the best, and will spend ridiculous amounts of time insulting the others.
Hence, the distro wars.
They optimized Gentoo for the p3 platform? Celeron 1.4 ghz and above is based on the p4 core.
Your "Is -mmmmx and such worth it?" guide is a little unfair. Thing is, you notice that only -O3 really made much of a difference. Well, that's because each of your tests is just one big loop and -O3 does heavy loop unrolling. You basically chose the absolute optimal case for -O3 to win (besides possibly having a small function call inside that loop so that -O3 could inline it). And you didn't do any floating point multiplies or divides which is where -mmmx+sse and -m3dnow would help you.
If anything you were also using a relatively small dataset. If you get a large enough data set (or code size) -O3 might actually hurt you (loop unrolling and function inlining will bloat both code and data size and make it much more likely to have a cache miss).
Anyways, synthetic benchmarks are one thing, but yours is so synthetic as to be ridiculous.
When I see things like the program time going from 39m 08s to 11m 21s (when all that was changed was a minor version number) that just screams -bad testing-.
You should repeat every one of the tests a number of times, and make sure that you get the same (or similar) results each time. You should NEVER expect a 4:1 ratio of performance doing the exact same task on identical hardware. Bells should be going off that say "casual testing" when you see something like that.
Besides, there are so many variables that have to be kept the same between the different installs - which services are running, how they are configured, what kernel options are set, what patches have been applied to the kernel, which modules are loaded... If you pick up Redhat 9 and do a "kitchen sink" install, you will hardly have the same amount of free RAM for caching, etc. compared to doing the "regular" install of some other distro that leaves out things. Hopefully it's obvious that such a comparison would not be fair at all.
In short, you should take a given kernel source, with a fixed set of patches, options, settings, modules, etc., and compile it with the default i386 options and then a second time with all the fancy optimizations, then compare those. LEAVE EVERYTHING ELSE THE SAME! Repeat with glibc.
The results in this article are just pathetic. They vary all over the place and are crying out for more rigorous testing methods and procedures. Making a good test is really a science, you have to design the test to specifically measure what it is that you're interested in. For all we know one of those tests could have already had a majority of the libraries loaded into the disk cache, resulting in the huge performance differences.
The only way to have the same hardware is to use the same machine for each distro. Period.
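The repeat-and-compare advice above could be automated along these lines. This is only a sketch: `time_runs`, `summarize`, and the 1.1 slowest-to-fastest threshold are my own illustrative choices, not anything the article's authors used.

```python
import statistics
import time

def time_runs(task, repeats=5):
    """Call task() 'repeats' times; return each run's wall-clock time in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations

def summarize(durations, max_ratio=1.1):
    """Summarize repeated timings of the SAME task on the SAME machine.

    'suspect' is True when the slowest run took more than max_ratio times
    as long as the fastest one (the threshold is an arbitrary assumption).
    """
    fastest, slowest = min(durations), max(durations)
    ratio = slowest / fastest
    return {
        "mean": statistics.mean(durations),
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
        "ratio": ratio,
        "suspect": ratio > max_ratio,
    }
```

Fed the two times quoted above (39m 08s and 11m 21s, i.e. 2348 and 681 seconds), `summarize` reports a ratio of about 3.4 and flags the pair as suspect.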
[100% ISO 646 Compliant]
SVM, ERGO MONSTRO.
The major benefit for me was that Gentoo was the first distro I'd used that gave me the slightest clue about what the operating system was doing, and how the software worked.
I'd tried RedHat, Mandrake, and a few other distros that set everything up for me so I could "just use it." The problem was that in just using it, I had no idea what I was trying to use. I would go looking for software to do x, y, or z, and I'd either find nothing that seemed to do the job, or a jillion different apps that all did the job differently, and I didn't know why to pick one over the other. Add to that the sense of being at the wheel of an out-of-control car every time I wanted to make a change to a .conf file, and my Linux experience was pretty frustrating.
Gentoo was a brilliant introduction into how to install a Linux-based OS. It started me off easy -- here's the command line, here are the commands to install the system, here are the .confs you can tinker with and what they do. It gave me flexibility while keeping the results trim. The USE flag is the most amazing option I've ever seen.
Installing Gentoo was more like playing with LEGOs than installing a system, and when I got done with it, I had a computer that I knew, really *knew*. I knew all the init.d services and what they did. I knew what module was controlling what hardware in my kernel *and* how to fix it if it didn't detect properly. I knew all the apps installed, even by their weird names and locations, and I knew what they were there for. I knew it because I built it that way. And I never had to hunt down a dependency or resolve a version conflict. NOT ONCE. Redhat and Mandrake just installed this mysterious Linux Stuff and threw the computer back at me when done. Gentoo got my hands dirty with building it up, but didn't make me jump through hoops to do it.
The benefit was teaching me what my computer was doing when I used it.
*THAT* is how I wanted my computer to run. And it does. Thanks, Gentoo team!
GMFTatsujin
OK, I'm a Gentoo user. I'll admit a sizable percentage of our ranks don't know what they're talking about; I'll even admit that most distro "speed" is in the user's head. But most of you are missing the point. Many Gentoo users (including myself) installed Gentoo as an ongoing learning experience. Sure, there's really no difference between the "l337ness" of typing emerge foobar and typing rpm -ivh foobar. But those of us who have taken the time to understand the Portage system have learned a great deal. As an aspiring programmer, this was my distro of choice because it enabled me to learn about gcc. Also, I like the idea of being able to beta test bleeding-edge applications (although most users install standard packages). While there are a lot of phony Gentoo users who are under the impression that they're furthering the open source movement by emerging packages, Gentoo's backbone is a highly active community of volunteers who are really interested in Open Source. Basically, all I'm trying to say is that any idiot can probably get Gentoo installed and working, but the real point is to understand the OS that you've built, and I've found that Gentoo helps me and others do this better than package-based distros.
Look, Squinky86. I'm not simply pulling this out of my ass. I've had to cover this back in university. I'm not a gcc developer, but I have sat down and gone through generated assembly from the compiler, and have spent many hours tweaking software in every way possible to make it run at a reasonable rate on my old P2/266. There are a very few pieces of software for which arch flags make a measurable difference (as a later poster noted, gzip is one).
As for individually specifying flags, I'd be fascinated to know what you're trying to use above -O3.
You can use -ffast-math. It's unlikely to provide particularly useful results. Approaches like this have been done for a long time (see LibMotoSh on the PowerPC) -- they can cause the rare, PITA-to-find problem, and any libs or programs that really need the speed increase have probably done custom work that's even faster than anything you're going to pull with -ffast-math (libfftw, for instance). You *might* get some performance gains on povray...but most folks I know compile povray themselves anyway, since it isn't packaged by, say, Red Hat.
-fstrict-aliasing is a *very* ballsy flag to use if you haven't actually written the software yourself. It's a pretty safe bet that building a lot of unknown software with -fstrict-aliasing will break it. There's a good reason strict aliasing is off by default -- valid C programs (easy ones to write, too) will die with this option, occasionally and in odd ways. You're a damned fool if you use this on *any code* that you did not write or that does not explicitly say it was written to allow this optimization. I just finished talking to an optimizing compiler designer Thursday who reinforced my feelings about aliasing-dependent optimizations -- they're almost always a bad idea, since the small speedup isn't worth the random problems that you can very very easily induce.
-fomit-frame-pointer can produce a small benefit, but surprisingly small, and makes tracking down any crashing bugs or requesting help with a crashing bug infeasible.
If you know how to and properly configure your compiling flags, the speed gains are tremendous
Bullshit. The vast majority of software I've run benchmarks on will never see more than a 10% performance gain (and that's being *very* generous...most will see no measurable change) with anything other than the default -O2 or -O3.
Oh, hell. I was a lot like you not so many years ago, sure that I could speed things up if I just found the right ways to manipulate the code. The only cure for it is actually sitting down and benchmarking things yourself, since you're sure that everyone else is doing something wrong.
Go ahead, you'll see what I mean. Try building libs that tie up a lot of CPU cycles in multiple apps like libjpeg -- that's where you're going to see your best payback for any optimizations. Time a couple runs.
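If you do time a couple of runs, something like the following sketch keeps you honest about whether a gain is real. The function names are hypothetical, and the "timings must not overlap at all" rule is a deliberately crude assumption standing in for a proper statistical test.

```python
import statistics

def percent_gain(baseline, optimized):
    """Percent reduction in median run time of 'optimized' versus 'baseline'."""
    base = statistics.median(baseline)
    opt = statistics.median(optimized)
    return 100.0 * (base - opt) / base

def exceeds_noise(baseline, optimized):
    """True only if the two sets of timings do not overlap at all (crude check)."""
    return max(optimized) < min(baseline) or max(baseline) < min(optimized)

# Made-up timings in seconds for two builds of the same library, three runs each.
default_build = [10.0, 10.2, 10.1]
tuned_build = [9.9, 10.1, 10.0]
print(f"gain: {percent_gain(default_build, tuned_build):.1f}%")
print("beyond run-to-run noise:", exceeds_noise(default_build, tuned_build))
```

In the made-up numbers above the medians differ by about 1%, but the runs overlap, so the sketch declines to call it a gain.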
but if you just try gentoo, the learning experience and speed gains are very noticeable.
I think I've addressed speed gains. As for learning experience, Gentoo is not synonymous with compiling software from source (from that standpoint, Slackware and similar distros blow Gentoo away). I've never bought into the "learning experience" claims -- let folks start out on the GUIs their distro maker provides, and then, regardless of distro, you can quite happily find out what's going on.
This is not to say that I don't think Gentoo is a worthy distro. I'm a bit of a package management aficionado, and emerge certainly interests me. However, the kind of sweeping claims I see the occasional Gentoo user make on Slashdot are ridiculous. The general-purpose Linux distros are all fairly close together. Distro fans tend to be produced when someone fails to understand how to properly use a different distro (or got accustomed to one), or has sucked down some false claims from other folks, or just doesn't want to consider that the distro they've sunk lots of time into learning isn't far better than any other choice.
Now, if you happen to like Gentoo, go for it. But like it for the legitimate reasons, not inflated false ones. Don't make exaggerated claims about it, because misinformation certainly doesn't help out Linux folks in the long run.
They need a hardcore gentoo optimizer in there for the gentoo box, someone that knows what they're doing....Hence, I say the person doing the gentoo install is NOT informed on optimizing gentoo and thus renders this test invalid.
[sigh] Nobody ever listens to me [Dark City].
Okay, let's take a look at how informed you are. First of all, -march=athlon-xp implies -m3dnow, -msse, and -mmmx. -O2 or -O3 is default already on most systems. The only differences -O3 produces are -frename-registers (which does essentially jack on the x86 line) and inlining (which tends to produce very minimal or negative benefits, given that cache misses, which inlining aggravates, are far more of a timesink for most programs than setting up and returning from function calls). -pipe produces no runtime benefit, though I leave it in my own flags. -fforce-addr, -frerun-cse-after-loop, and -frerun-loop-opt are implied by -O2 or -O3 already. -falign-functions=4 is considered a slowdown for the Athlon line by the gcc team relative to the default (64 on current gcc). I haven't tested -maccumulate-outgoing-args, and I'm not familiar with what it does internally -- the only benchmark I could google for indicated a slowdown caused by it.
-ffast-math is a decision of dubious value. Very little code uses floating point math, so -ffast-math rarely has an effect. The code that does and actually cares about this degree of performance generally has native implementations that are faster than -ffast-math, since they're special-cased. The flag can also cause software breakage. (We already saw the realization that these sorts of optimizations are of dubious value with Motorola's LibMotoSh for the PPC.) -fprefetch-loop-arrays is implied by -O2.
The overwhelming majority of code does *not* have an #ifdef __SSE__ with alternate code.
Basically, the only seriously useful flag you used is the one all distros use -- -O3 (and I generally feel that -O2 is a better choice on modern processors, where cache is so critical). -march=athlon-xp can help, but it's unlikely to make a measurable difference on any but a very few pieces of software. Most distro vendors already benchmark and ship versions of software that benefit from a different arch -- look at RH's different RPMs. -fomit-frame-pointer is arguable -- but you're going to probably see *well* under a 10% performance difference, and you have no ability to track down crashing bugs or send in useful bug reports. -ffast-math can cause breakage, and provides little benefit for almost any package (one exception is povray -- it's a floating point heavy package that tries to be portable). I custom-build povray, but then RH doesn't package povray anyway, so that's not too much of a concern.
Anyway, my point is not to criticize you. I spent my days with a three line CFLAGS string as well, sure that I was producing nicer and better code than anyone else. Then came my benchmarking days and a compiler class and some days picking apart gcc-generated assembly...and I realized that I really wasn't gaining anything.
If you like Gentoo, do it for the legitimate features that it provides (like emerge), and not for some fanciful performance improvements.
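As a rough illustration of how much of a long CFLAGS string is redundant if the claims above hold, here is a small Python sketch. The implication table is transcribed from this comment, not checked against any particular gcc release, -O3 is assumed to imply everything -O2 does, and the sample CFLAGS string is hypothetical (the original poster's exact string isn't reproduced here).

```python
# Flags that, per the comment above, are already switched on by another flag.
# These are the commenter's claims, not verified against a specific gcc version.
O2_IMPLIES = {
    "-fforce-addr",
    "-frerun-cse-after-loop",
    "-frerun-loop-opt",
    "-fprefetch-loop-arrays",
}
IMPLIED_BY = {
    "-march=athlon-xp": {"-m3dnow", "-msse", "-mmmx"},
    "-O2": O2_IMPLIES,
    "-O3": O2_IMPLIES,  # assumption: -O3 includes everything -O2 enables
}

def split_redundant(cflags):
    """Split a CFLAGS string into (kept, dropped) lists, preserving order."""
    flags = cflags.split()
    implied = set()
    for flag in flags:
        implied |= IMPLIED_BY.get(flag, set())
    kept = [f for f in flags if f not in implied]
    dropped = [f for f in flags if f in implied]
    return kept, dropped

# Hypothetical CFLAGS string, for illustration only.
kept, dropped = split_redundant(
    "-O3 -march=athlon-xp -pipe -msse -mmmx -m3dnow -fforce-addr -ffast-math"
)
print("kept:   ", " ".join(kept))
print("dropped:", " ".join(dropped))
```

The sketch only removes duplication; whether the surviving flags (-pipe, -ffast-math and so on) are worth keeping is the separate judgment call argued above.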