Stress-Testing The Linux Kernel
An anonymous reader writes "Automating software testing allows you to run the same tests over a period of time, ensuring that you are really comparing apples to apples and oranges to oranges. In this article, Linux Test Project team members share their methodology and rationale, as well as the scripts and tools they use to stress-test the Linux kernel."
- video (3D)
- sound (alsa, oss-emulation)
- third-party modules state (like linux-wlan-ng)
- usb printer
- usb key
- usb modem
- ...
Too many times I have missed a problem after a kernel upgrade, and lost time wondering "when did it stop working?". I always seem to forget to test the broken thing each time I upgrade.
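One way to stop forgetting is to script the checklist and keep a per-kernel log of what passed. The following is only a minimal sketch: the function names, the JSON log format, and the device paths in the usage note are all assumptions made for illustration, not part of any existing tool.

```python
import json
import os
import platform
import tempfile


def run_checks(checks):
    """Run each named check; it passes only if it returns a truthy value without raising."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def record_and_compare(checks, log_path, kernel=None):
    """Append this kernel's results to log_path and return the checks that regressed.

    A regression is a check that passed in the previous log entry and fails now.
    """
    kernel = kernel or platform.release()
    history = []
    if os.path.exists(log_path):
        with open(log_path) as f:
            history = json.load(f)
    results = run_checks(checks)
    previous = history[-1]["results"] if history else {}
    regressions = sorted(
        name for name, ok in results.items() if previous.get(name) and not ok
    )
    history.append({"kernel": kernel, "results": results})
    with open(log_path, "w") as f:
        json.dump(history, f, indent=2)
    return regressions
```

A check can be as crude as `lambda: os.path.exists("/dev/snd")` for sound or `lambda: os.path.exists("/dev/dri")` for 3D video (both paths are assumptions about the machine). Run it after every upgrade and the log answers "when did it stop working" by kernel version.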
it's REALLY done
We use all types of telco hardware running private OSes in a production environment. We patch on a regular basis, because there is no way to load test this hardware. Enabling new functionality or modifying the environment pops up new problems, and new software upgrades on connected resources can spring up new problems on a daily basis.
It's amazing, it's like whack-a-mole, it never gets stable. You spend your time fixing one problem only to move on to a new one that pops up when some new feature or alteration goes into production.
Linux wants to be the OS of choice for production environments, and five-nines hardware for critical applications needs some kind of testing.
Don't think of a webserver, think of a dozen servers talking multiple protocols to each other over multiple mediums, all changing the data in real time. It's easy for memory leaks, timing issues, or corruption to appear.
I guess it's good to be a vendor, since the customer will have to pay support contracts to the end of time.
This is why I laugh at IBM's "Servers are self healing" commercials. While it's nice in theory, they make too much money off contracts.
Stress testing is a complicated subject. For basic system stress you can use stress, but often you want more specific tools like siege or mysqlstress. None of them can precisely replicate the exact stress patterns that trigger bugs in real-world Linux deployments. What is needed is a tool that can capture system calls and "play them back" exactly as they occurred.
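To make the "play them back" idea concrete, here is a toy sketch of the replay half only. It does not capture real system calls; it assumes a trace already exists as a list of (offset in seconds, operation, path, data) records, and the record layout, the `replay` function, and its `speed` parameter are all hypothetical.

```python
import os
import tempfile
import time


def replay(trace, root, speed=1.0):
    """Replay (offset_seconds, op, path, data) records against files under root.

    Each record is issued at its original offset from the start of the trace,
    divided by speed. Returns the number of operations replayed.
    """
    start = time.monotonic()
    done = 0
    for offset, op, path, data in trace:
        # Sleep until this record's scheduled time relative to the start.
        wait = offset / speed - (time.monotonic() - start)
        if wait > 0:
            time.sleep(wait)
        target = os.path.join(root, path)
        if op == "write":
            with open(target, "ab") as f:
                f.write(data)
        elif op == "read":
            with open(target, "rb") as f:
                f.read()
        elif op == "unlink":
            os.unlink(target)
        else:
            raise ValueError(f"unknown op: {op}")
        done += 1
    return done
```

A real tool would need to record the trace from a live system and handle far more than three file operations, but the core loop (same operations, same order, same relative timing) would look something like this.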