Putting Linux Reliability to the Test
Frank writes "This paper documents the test results and analysis of the Linux kernel and other core OS components, including everything from libraries and device drivers to file systems and networking, all under some fairly adverse conditions, and over lengthy durations. The IBM Linux Technology Center has just finished this comprehensive testing over a period of more than three months and shares the results of their LTP (Linux Test Project) testing."
Uhh..I have a computer built completely from individual parts that I've bought, and I haven't had any stability problems. Ever.
Keep in mind that Dell's and Gateway's PCs are good because the hardware they choose to include is good. Anyone can buy the same hardware.
> I swear it's an industry conspiracy that generic
> parts struggle a boat load.
Either that, or it's part of the myth of "I can build a box that does XXXX cheaper than *big name here*"
After experiencing the reality of real world reliability, performance and support hassles, I just have to snigger at ANYONE who comes up with those claims.
Generic cuts it when I have a server in the next room at home and I'm here all the time that it needs to be working. For anything else, it's just buying extra work for myself.
http://www.memtest86.com/ Freeware GPL bootable memory tester for PC platforms... highly recommended for troubleshooting flaky RAM...
Microsoft commonly hires outside companies to perform their tests. Do you remember the evaluation of Exchange versus Notes/Domino scalability by Ziff-Davis but funded by Microsoft? People justifiably questioned those results, as the company hired (Ziff-Davis) has an interest in pleasing the hiring company (Microsoft) so they get future work.
Do a SuSE FTP install of SuSE 9.0.
s use_linux /
It's as close as it's going to get to a SLES release, and it's free. SuSE has too many obligations to support proprietary software to make SLES free to download.
link:
http://www.suse.com/us/private/download/
You can also download an AMD64 (64-bit) version now.
It's the results from the Linux Test Project suite. A 95% success rate with zero critical failures means that 95% of the 2000 test cases completed successfully, and nothing crashed the kernel. To see what that means, just take a look at what test cases are in the LTP!
A deep unwavering belief is a sure sign you're missing something...
Not necessarily: When uncompressing one of the XFree86 source tarballs, X430src-3.tgz, on my old K6-2 450, gzip would always die with a bad CRC. Nothing else at all seemed to go wrong with the machine, but I couldn't uncompress the file until I dropped the memory clock to 66MHz, rather than 100MHz.
I found one other person with the same motherboard having the same problem in a Google search, and also heard there was a problem with that mainboard using RAM at 100MHz.
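One way to check whether a failure like this is repeatable is to test the archive's integrity several times and compare the outcomes. A minimal Python sketch, assuming a gzip-compressed file; `gzip_ok` and `repeat_check` are hypothetical helper names, not part of any tool mentioned in this thread:

```python
import gzip
import zlib


def gzip_ok(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses with a valid CRC."""
    try:
        with gzip.open(path, "rb") as f:
            # Reading to EOF forces the CRC32 and length trailer to be checked.
            while f.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # Bad CRC, truncated file, or corrupt deflate stream.
        return False


def repeat_check(path, runs=5):
    """Run the integrity check several times and return the list of outcomes."""
    return [gzip_ok(path) for _ in range(runs)]
```

If the same file passes on another machine, or starts passing once the memory clock is lowered, that points at the hardware rather than a damaged download.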
printf("%s@yahoo.co.uk\n", uid[569754].name);
For one thing, it would be difficult to run a 3 month stress test on 2.6.0 when 2.6.0 isn't 3 months old, and isn't part of a released enterprise product. If they stress tested one of the betas and it failed, Microsoft would use it for advertising. :)
You should not trust this evaluation at all.
- Go to the site
- download the testing tools yourself
- read the test paper
- use the test methodologies as documented
- do your best to verify their test results yourself
- go back to the site
- post your results for everyone else to see
(i.e., follow the good practices of basic science)
After all... On the Internet, nobody knows you're a dog.
Any JimBOB can write a convincing paper, with all the right buzzwords, that sounds as if X+Y=Z, especially if that was logically a likely/expected outcome in the first place.
As a well-known TV show once said (several times and loudly) Trust No-One.
Remember people, YMMV.
Bleh, that's not necessarily true at all. A good race condition in a many-threaded program can quite easily look very much like a hardware problem, in that it is difficult to reproduce reliably.
It takes a lot of time and money to do very thorough analyses of operating systems, hardware and enterprise apps. So that money has to come from somewhere. It would be all well and good to say "hi, we're an independent research and analysis lab, we'll write unbiased reports about the state of the industry", but somebody has to fund that shit. And pretty much all that money can be traced back one way or another to some of the big companies in the business who can afford to throw it around for marketing benefits - like Microsoft or IBM.
In a perfect world, all the customers and potential customers of software would get together and each chip in a little bit of money to fund good, unbiased research. But like in the world of politics, it's easier to get a few special interest groups who have a lot at stake together than to get hundreds or thousands of parties who each have a little at stake to cooperate.
So when you run a test 5 times, and you get 5 different results, the hardware is broken. When you run the same test 5 times, and it gets to the exact same point before sig11ing, you have a software flaw.
This isn't true. If you're running a program that uses a deterministic memory allocation algorithm (a compiler, for instance) and have a segment of bad memory, then you easily could crash at the exact same point (when a pointer in that segment is dereferenced, for instance).
I know. It's happened to me. I've even had such slightly bad memory that I could compile nearly everything I needed, but one project consistently failed. I took out a bad memory chip (actually it was simply mismatched PC100/PC133) and everything worked fine.
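The "run it five times" rule of thumb quoted above can be sketched in Python as follows. This is only an illustration: `run_repeatedly` and `classify` are hypothetical names, and, as described above, a consistent crash does not rule out bad memory.

```python
import subprocess


def run_repeatedly(cmd, runs=5):
    """Run `cmd` several times and collect (return code, stdout) for each run."""
    results = []
    for _ in range(runs):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        # On POSIX, a process killed by signal 11 reports return code -11.
        results.append((proc.returncode, proc.stdout))
    return results


def classify(results):
    """Apply the rule of thumb: differing outcomes suggest flaky hardware."""
    if all(code == 0 for code, _ in results):
        return "all runs succeeded"
    if len(set(results)) > 1:
        return "inconsistent: suspect hardware (or a race condition)"
    return "consistent failure: suspect software, but bad RAM is still possible"
```

For example, `classify(run_repeatedly(["make"]))` would compare five builds of the same tree; the command is just a placeholder for whatever test keeps failing.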
Jeremy
Try running Memtest86 on something like that. It might be the RAM, as Windows handles RAM differently from Linux, and can hit bad parts sooner than Linux (and vice versa).
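Memtest86 boots without an operating system and runs far more thorough tests than anything a normal program can do. The sketch below only illustrates the basic idea of writing known patterns and reading them back; `pattern_test` is a hypothetical helper, the patterns are chosen for illustration, and a process running under an OS can only touch whatever memory it happens to be given.

```python
def pattern_test(size_bytes, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill a buffer with each pattern, read it back, and report mismatches."""
    buf = bytearray(size_bytes)
    errors = []
    for pattern in patterns:
        expected = bytes([pattern]) * size_bytes
        buf[:] = expected
        if buf != expected:
            # Record (offset, expected value, value read back) for each bad byte.
            errors.extend(
                (i, pattern, buf[i]) for i in range(size_bytes) if buf[i] != pattern
            )
    return errors
```

An empty list means no mismatches were seen in that buffer, which says nothing about the rest of the RAM.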
OK, so I don't have a paper, but I remember my old Linux/P166 running great for a day or so when the CPU fan had died. I only noticed when I rebooted into Windows!
My notebook has a flaky RAM connection. 32 MB comes and goes depending on how the machine is squeezed. Win 9x products crash it hard, Linux and Win2k don't even notice.
So in my experience, Linux doesn't mind a hostile platform.
SYS 64738 NO CARRIER
The very reason Linux has already made so many inroads into corporations in the first place is because of its reliability and stability, and not because some marketing campaign has churned out the words on header paper.
Another point is that I personally expect the systems I administer to run for a darn sight longer than 30, 60 or 90 days unless I need to restart them because of a kernel upgrade. When my last bunch I worked for went tits-up, our SAMBA file server had a 790 day uptime, and had run the SAMBA daemons reliably throughout, as well as doing internal DNS and DHCP. That's what your average Linux sysadmin expects from a Linux server box.
A Linux desktop being used for all manner of things though is completely another story: if I muck around with the Linux install on my laptop, as I do because that's what I do, then I expect to break it from time to time, and so "reliability" is not measured in the same way on a desktop/laptop system, IMHO.
The ideal environment for Linux is as a networked server, where it can get on with doing what it was set up to do, and will continue doing so until someone pulls the power plug on it. In that context, there are few OSes playing on the same field that can rival it for reliability and stability.
The 2.4 kernel has a number of unstable algorithms. Most of the corresponding algorithms in genetic UNIX are stable by design. For example, read the bug postings on RedHat Bugzilla for this critical flaw which has failed to get fixed after more than 6 months: [Bug 89226] (VM)Kernel prefers swapping instead of releasing cache memory
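To watch for the behaviour described in that bug title, one can compare how much memory is held as page cache against how much swap is in use. A minimal sketch in Python, assuming the `Key: value kB` layout of `/proc/meminfo` with `Cached`, `SwapTotal`, and `SwapFree` fields; `parse_meminfo` and `cache_vs_swap` are hypothetical helpers:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip header or malformed lines that lack a numeric first field.
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def cache_vs_swap(info):
    """Return (cached_kb, swap_used_kb) from a parsed meminfo dict."""
    swap_used = info["SwapTotal"] - info["SwapFree"]
    return info["Cached"], swap_used
```

On a Linux box, feeding it the contents of `/proc/meminfo` at intervals would show whether swap use keeps growing while the cache stays large.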