Putting Linux Reliability to the Test
Frank writes "This paper documents the test results and analysis of the Linux kernel and other core OS components, including everything from libraries and device drivers to file systems and networking, all under some fairly adverse conditions, and over lengthy durations. The IBM Linux Technology Center has just finished this comprehensive testing over a period of more than three months and shares the results of their LTP (Linux Test Project) testing."
Just put a link to each box on /. and wait 24 hours.
Of course, we all knew this in our hearts, nice to see it in writing.
"Kittens give Morbo gas!"
Anyone know if the test will be repeated with kernel 2.6.x?
You want to put any OS to the ultimate test, you should run cheap generic hardware. I swear it's an industry conspiracy that generic parts struggle a boat load. If your parts don't come from the big boys (Dell, Gateway, etc.), you are likely going to see issues down the line.
Get some ECS motherboard, generic RAM... bang. You're in for the evening.
> You're thinking Microsoft Works.
I'm thinking it doesn't.
Why do you trust IBM's Linux Technology Center to evaluate Linux?
I seem to recall getting random crashes with cheapo memory, and it was a pain to track down the offending component. Of course, one would assume that IBM wouldn't go for cheapo components, but still: how does one point the finger at the software, instead of hardware? Is it just repeatability?
This is nice to hear, but it would be even more valuable if the same tests were performed on a variety of operating systems in order to compare the results.
Brian
My Company
The Linux kernel and other core OS components -- including libraries, device drivers, file systems, networking, IPC, and memory management -- operated consistently and completed all the expected durations of runs with zero critical system failures. Every run generated a high success rate (over 95%), with a very small number of expected intermittent failures that were the result of the concurrent executions of tests that are designed to overload resources. How does that compare with other OS's?
Because what would they have to gain by lying? The true power of opensource is that when someone does point out the weaknesses, they are fixed quickly. IBM knows that if they tell the opensource world "Hey, LINUX is pretty good but it kinda struggles in the (foo) area." that the opensource community will redouble their efforts to fix that. Microsoft is only trying to say "Windows rules, Linux sux. See, this evaluation we did proves it. Buy windows"
Second off, if this were M$ testing 2k3 and publishing the paper, everyone here would be crying foul. But because it's "Linux", it must be 100% unbiased and true.
I've been using Linux for 8 years now, including in high-stress environments (3D graphics rendering, mainly), and from experience I have seen very good things from Linux. We have had software glitches before, but the core software has caused maybe 3 - 5% of our downtime. Over 70% of our downtime involves human error and about 25% of failures are due to hardware giving out.
Still, what my customers want to see isn't benchmarks so much as "So easy Grandma could use it" in Linux. While the people in the datacenters want to know how well Linux will bear up under load, most end users and SMBs don't need to worry about it; they just need something easy to use that works.
"The problem with socialism is eventually you run out of other people's money" - Thatcher.
2.6.0 has only been out for a week. I'm going to want to see someone stress test it for a hell of a lot longer than a week before I call it "enterprise ready".
The people performing it have a vested financial interest in having it turn out a specific way, notably positive. If the test results showed poor reliability, then I would understand trusting it because it would go against the motives of the people performing it. Since the test affirms their business model, no matter how documented it is, it should be suspect.
It doesn't appear to be a test rigged to make one platform look better than the other.
It looks a bit skewed to me. Many of the test results depend on the computer systems meeting the expectations of the people testing them, particularly in overload cases. Since the people who tested work in the Linux Technology Center, their expectations stand a greater likelihood of being consistent with the system.
Take C/C++ and Java. Someone who regularly works with C/C++ knows certain libraries (notably the character ones) return ints for status in the form 0 being false and not 0 being true. If someone expects that, the system meets expectations and passes. If someone comes from a different background, say Java, he or she may not expect that, and the system would consequently fail the test of meeting expectations. I would like an evaluation from somewhere in-between, not someone whose years of experience allow them to gloss over what might be problems for another person.
Microsoft commonly hires outside companies to perform their tests. Do you remember the evaluation of Exchange versus Notes/Domino scalability by Ziff-Davis but funded by Microsoft? People justifiably questioned those results, as the company hired (Ziff-Davis) has an interest in pleasing the hiring company (Microsoft) so they get future work.
They have much to gain: more corporate customers and more respect and funding by greater IBM. Just because IBM supports Linux doesn't mean its motives are pure (not financially driven). Another reason for bias is the division also stood to have huge setbacks if the tests were unfavorable. How could they justify expansion and better funding if their previous statements about Linux being enterprise-ready were unfounded?
I am pretty sure that kernel 2.6 can't be considered "enterprise ready", for one, it hasn't gone through that level of testing.
Don't knock "yesterday's news". Far be it from some geeks to understand this, but there are times that "tried and true" is more important than having the latest and greatest. This testing started well before 2.6.0 was released! They can probably get started with 2.6 as soon as an enterprise Linux distribution incorporates it.
I used to use RH8.0 and then RH9.0, and moved on to SuSE 9.0 Pro recently. I noticed that when apps crash, the RH distros took it much better: you could close the unstable application safely and continue working on the rest. SuSE, OTOH, freezes. Did anyone else notice this?
Siggy Say, Siggy Do
- because the test methodologies are documented
- because it's disclosed up-front that it's IBM Linux Team testing Linux (i.e. no hidden conflict of interest)
As opposed to the usual (i.e. in the Microsoft world)
Visit CryptoGnome in his home.
The full hardware/software details of the test are there. If you don't trust it, you have the ability to rerun the tests yourself.
So, ya reply to one point but ignore the rest? I think his (ultimate) point is valid. If the test was rigged, the folks involved with developing the kernel would catch on and take IBM to task for fudging the results. No, I'm not talking about the Slashdot/Fark crowd. I'm talking about REAL developers.
Also, Linux has weathered some unfavorable (and honest!) critiques before. Linus Torvalds said it best when he said (and I paraphrase since I am too lazy ATM to look up the actual quote) that it doesn't matter if there's negative publicity in the press about Linux. It just meant he got his bug reports from the Wall Street Journal as opposed to the regular kernel mailing list.
--- Journals are boring; Go to my web page instead
95% success ratio... does that mean that 1 in 20 programs I run segfaults or what? What do they mean by "failure"? Not finishing given task in predefined time? Getting the results wrong? Hanging?
Sorry but that means nothing. Even if there -was- a comparison to other systems, it would still mean nothing. 95% success ratio, 78% happiness factor and 93% user satisfaction.
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
I think IBM used SuSE instead of Redhat because IBM Global Services and SuSE have been partners for almost two years.
Maybe you should stop hmmmmm'ing about these great mysteries and start googling.
Beware blue cats moving at
Hardware fails in random patterns (bits flipped by beta radiation, for example), software fails the same way in controlled instances (the same flawed logic fails the same way each time).
So when you run the same test 5 times and you get 5 different results, the hardware is broken. When you run the same test 5 times and it gets to the exact same point before sig11ing, you have a software flaw.
This is also why you do multiple tests to ensure you're getting an accurate picture of what's going on (flawed or not).
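The rule of thumb above can be written down as a small sketch. The function name and the "failure point" labels are made up for illustration, and the rule itself is only a heuristic: a race condition in software can also fail at scattered points, so treat the answer as a hint about where to look first.

```python
def classify_failures(failure_points):
    """Apply the repeatability heuristic to repeated runs of one test.

    failure_points holds one entry per run: a label for where the run
    died (hypothetical, e.g. "step7"), or None if the run passed.
    """
    failures = [p for p in failure_points if p is not None]
    if not failures:
        return "no failures"
    # Every run died, and always at the same place: deterministic.
    if len(failures) == len(failure_points) and len(set(failures)) == 1:
        return "suspect software"
    # Failures that move around or come and go: non-deterministic.
    return "suspect hardware"


# Made-up run logs, five runs each.
print(classify_failures(["step7"] * 5))
print(classify_failures(["step2", "step9", None, "step4", "step7"]))
```

Swapping the RAM or running a memory tester is still the usual way to confirm a hardware suspicion.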
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
I think it's more human than that. If a company or group releases results completely against their interests, integrity is the only reason such a group would go forward. Why would anyone skew results to spite themselves?
This test would have been more interesting if there had been failures. Perhaps they could have tried the test on an older version of Linux, or a different operating system.
I have been trying to write some tests of my own recently. So far I have found a filesystem oops, a ptrace BUG(), and my system locks up in low-memory situations. The lockup is probably because my ethernet driver allocates memory in the interrupt handler (GFP_ATOMIC) and can't handle the result when there is no memory available.
I need to fix the lock up first of all so the other tests have time to run...
specifically because you'll now trust their future results?
"wow, last year, they had to admit their product just wasn't up to the task. but now, dang, look at 'em go!"
yes, it's quite human indeed. you don't know what all they're up to -- what seems to be self-defeating isn't always. and sometimes, well, you honestly find out that you're doing the job you had hoped you were doing, trouncing the competition. go figure: you might actually manage to not suck! but you don't get to tell anyone? and your only solution is to pay someone else to announce it? oh, wait, that's not allowed either!
any publicity is good publicity. if you can't get good publicity by announcing your product is good, just say it isn't. close enough. it's not like anyone pays attention anyway.
Before you roll something out into production, you hammer on it for a while. The crappy parts will fail or generate errors and you can have them replaced.
In my experience, most of the no-moving-parts hardware will fail within the first week, or last for years and years.
The stuff with moving parts will eventually fail. But that's harder to predict.
for one thing, it would be difficult to run a 3 month stress test on 2.6.0 when 2.6.0 isn't 3 months old, and isn't part of a released enterprise product. If they stress tested one of the betas and it failed, Microsoft would use it for advertising. :)
FTA
The tests demonstrate that the Linux system is reliable and stable over long durations and can provide a robust, enterprise-level environment.
OK, now I don't mean to troll here, so mod down if you wish, I really don't care.... BUT...
I have been a Linux user/programmer/lover for the past few years now, and I wanna see a company that is not SO IN LOVE with Linux say what has just been said by IBM above.
In other words, I don't want to see companies who sell Linux, or who benefit from selling Linux, praise it. Does any one of you know of someone who fits these criteria? Sun for one is not very fond of Linux, nor is MS of course (despite the fact sometimes i doubt they have code in their stuff from Linux...)... to make a long story short:
It would be really nice if such a judgment came from someone else besides IBM/REDHAT/ORACLE...
The lunatic is in my head
"The Linux kernel properly scaled to use hardware resources (CPU, memory, disk) on SMP systems."
Sorry, but how can the scalability of the CPU resource be proven on a 2 CPU system? Show incremental results on 1, 2, 4, 8, 16, 32 etc. CPUs and then CPU scalability may be proven.
This is NOT an anti-Linux troll; rather, the evaluation needs to justify its outcomes or it starts to look like something from a company starting with M.
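For what it's worth, the kind of incremental result asked for here is easy to tabulate once the measurements exist. A minimal sketch follows; the throughput numbers are invented purely to show the arithmetic and do not come from the IBM paper.

```python
def scaling_efficiency(throughput_by_cpus):
    """Return {cpu_count: efficiency}, where 1.0 means perfect linear
    scaling relative to the smallest CPU count that was measured."""
    base_cpus = min(throughput_by_cpus)
    base_throughput = throughput_by_cpus[base_cpus]
    efficiency = {}
    for cpus, throughput in sorted(throughput_by_cpus.items()):
        speedup = throughput / base_throughput   # measured gain
        ideal = cpus / base_cpus                 # gain if scaling were linear
        efficiency[cpus] = speedup / ideal
    return efficiency


# Hypothetical throughput (jobs per minute) at each CPU count.
measurements = {1: 100.0, 2: 190.0, 4: 340.0, 8: 560.0}
for cpus, eff in scaling_efficiency(measurements).items():
    print(f"{cpus:2d} CPUs: {eff:.0%} of ideal")
```

With only the 1 and 2 CPU rows available, the curve cannot show where efficiency starts to fall off, which is the point being made.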
1) Linux is free if your time is worthless
2) My time is not worthless
3) Linux is not free
4) Giving Linux as a Christmas present is not "cheap"
5) Linux is a good Christmas present.
Especially for things like this. If a company isn't the one performing the research, chances are they are bankrolling the company that is. This means that things will generally be stacked in their favour, and unfavourable results will often be suppressed.
Even true independents are often not unbiased. For example, some individual with no ties to any OS developer or vendor might decide to test OSes. Of course it might be that they are a huge Linux or Mac or Windows zealot, so again they stack things in their favour. You see this fairly frequently with independent Mac test sites. They are Mac heads that work to make things look good for the Mac.
You see this in research too. I can't count the number of times I've seen articles, in respected journals, where the researchers have glossed over or ignored something that could contest or invalidate their findings. They want their hypothesis to be true and so are prone to look at the data that supports it.
So when you deal with something involving money, you are just going to see some biased results. In scientific research, you generally get other labs testing findings, so the truth is eventually revealed, despite biases. However, in business, especially something with quick life cycles like software, forget about it. All tests are going to be biased. Read the results and take them for what they are worth; don't use them to generalize. Do your own tests, and use what works best for you.
All that this shows is that a Linux based system works in the way that it should. Would you expect anything else if you ran your: TV, central heating, ... for a long period ?
The trouble is that, after a period of increased stability in the 1980s, in the last decade people have come to expect that computers fail, and they wonder with amazement if they don't.
OK: 30 years ago I remember it being a good day if the mainframe stayed up 12 hours. But things have moved on; today you expect your MVS, VMS, Unix or Linux machine to stay working. The only OS vendor whose products have not matured is the one in Redmond - largely because of rampant infestation with new features.
The above is not intended to belittle the fantastic efforts of all those involved.
In fact, it's not much of a question for me anymore -- when there's a problem, it's normally hardware malfunction. I have several machines with 160+ day uptimes, which would be longer if not for an extended power outage at the office.
IBM just confirmed what I already knew. Guess what, Win2k is pretty stable, too. Sorry, but it's true.
But, jeeze, isn't anyone else drooling over those systems they tested on? Makes me hate my busted whiteboxes and horrible HP's a little more everyday.
Repeat after me....."MMMM, dual Power4......MMMM, dual Power4...."
OK, so I don't have a paper, but I remember my old Linux/P166 running great for a day or so when the CPU fan had died. I only noticed when I rebooted into Windows!
My notebook has a flaky RAM connection. 32 MB comes and goes depending on how the machine is squeezed. Win 9x products crash it hard, Linux and Win2k don't even notice.
So in my experience, Linux doesn't mind a hostile platform.
SYS 64738 NO CARRIER
My client is a big megacorp. Their strategy for the coming years is to migrate all Unix systems to Windows/.Net (client side), and to Linux or NT (server side, depending on which OS fits best). This isn't the kind of corporation that makes such a decision after reading a sales brochure or a Gartner article. They research their options, thoroughly. Apparently the conclusion was that Linux is reliable enough to be entrusted with mission-critical stuff.
The sad thing is that they will (probably) keep the results of this research confidential. Why help the competition with this knowledge?
If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
The very reason Linux has already made so many inroads into corporations in the first place is its reliability and stability, and not because some marketing campaign has churned out the words on header paper.
Another point is that I personally expect the systems I administer to run for a darn sight longer than 30, 60 or 90 days unless I need to restart them because of a kernel upgrade. When the last bunch I worked for went tits-up, our SAMBA file server had a 790 day uptime, and had run the SAMBA daemons reliably throughout, as well as doing internal DNS and DHCP. That's what your average Linux sysadmin expects from a Linux server box.
A Linux desktop being used for all manner of things though is completely another story: if I muck around with the Linux install on my laptop, as I do because that's what I do, then I expect to break it from time to time, and so "reliability" is not measured in the same way on a desktop/laptop system, IMHO.
The ideal environment for Linux is as a networked server, where it can get on with doing what it was setup to do, and will continue doing so until someone pulls the power plug on it. In that context, there are few OS's playing on the same field that can rival it for reliability and stability.
but with package and dependency madness.
I couldn't tell you the number of times I tried to install something and it fails because I was missing "X-Widget-2.41.so.1", so I try to install that "X-Widget-2.41" package and the "X-Widget-2.41-devel" package and they fail because they are missing several other depends as well.
Linux stability is fine. The GNU software stability is fine. We need a better way to install and maintain software.
LK
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
I've been using an old P120 laptop as a firewall/router for my house for the past several years, running 2.2.something. I wondered why it had rebooted after noticing an uptime of only a day or two, but found that instead I was experiencing the uptime rollover bug (at about 500 days; Windows used to crash on a similar bug after about 49.7 days). About a month ago, it stopped giving out DHCP addresses. I went downstairs to investigate, as I couldn't log in remotely, and found that the hard drive was making that nasty clicking sound. I eventually managed to ssh in (sshd and sh were in RAM; I just waited for the logging to time out). I was able to kill syslog and cron, and now DHCP is again giving out addresses.
It's been running just fine for a month now with a dead hard drive.
(Yes, I'm getting a replacement because it won't survive an extended power outage on that ancient battery.)
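The rollover figures can be checked with a little arithmetic. This sketch assumes the usual explanation for both bugs: a 32-bit tick counter incremented 100 times per second on 2.2-era x86 Linux, and a 32-bit millisecond counter on Windows 9x. Both tick rates are stated here as assumptions, not taken from the comment.

```python
SECONDS_PER_DAY = 86_400


def rollover_days(counter_bits, ticks_per_second):
    """Days until an unsigned counter of the given width wraps to zero."""
    return 2 ** counter_bits / ticks_per_second / SECONDS_PER_DAY


# Assumed tick rates: 100 Hz for the Linux counter, 1000 Hz for Windows 9x.
print(f"Linux 2.2: {rollover_days(32, 100):.1f} days")
print(f"Win 9x:    {rollover_days(32, 1000):.1f} days")
```

So the box did not reboot at all; the counter wrapped a bit short of 500 days, and the Windows figure works out to a little under 50 days.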
MS is running ads saying how Windows XP is so reliable. It is kinda hard to believe when you hear the ad because you are getting a cup of coffee while waiting for XP to reboot. Same with 2k3. It crashes. Not as often as XP, just as XP doesn't crash as often as 98, and so on. But it still crashes.
Now on to my Linux machines, which don't crash. I only run about a dozen of them in total, and not one of them has crashed.
I also have had some experience with AIX. Typically on machines everybody had forgotten about that ran some app that everyone just used and they only noticed its importance when someone unplugs an old useless cable.
So from my daily experience I will find any report coming from MS saying that they are reliable suspect. From my experience with AIX and Linux I will be far more willing to believe a report from IBM about reliability because THEY HAVEN'T BEEN TELLING ME TOO MANY LIES BEFORE.
You of course may have different experiences. Linux if nothing else seems capable of generating wildly conflicting emotions in people. So does MS software come to think of it. Funny that we can get so worked up over a collection of bits.
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
We run SuSE Professional 8.2 on our home machines, one server, two workstations.
My partner's machine freezes occasionally. Most recent one was yesterday, and even Alt+Ctrl+Backspace wouldn't get control back. I needed to power off!
I've never been able to work out exactly what causes these freezes (my partner is not very "'puter literate"), but suspect that it may be somehow related to the printing subsystem. One time I did manage to ssh into the machine when it froze and the cups process was feasting on CPU cycles.
We upgraded to the then latest patches from the SuSE ftp site about two weeks ago. This did not affect the reliability.
I can't contrast these observations to any other distro. I've only ever used SuSE (since 6.x).