The Pros and Cons of Mainframe Linux
magellan writes "There is a good article on LinuxWorld.com that goes over some of the pros and cons of Linux on the mainframe. The author, Paul Murphy, is an old mainframer and current UNIX user, as well as a frequent contributor to LinuxWorld.com, so he has some good insights."
I couldn't agree with you more. My company gave the Linux-on-a-mainframe idea a trial run, and I'm sad to say, it just couldn't keep up with the load we were running through it. Both the VM and the scheduler were serious limitations compared to the Unices we have in use.
Linux was never ported to the 486 and Pentium. As they support the same instruction set as the 386, no porting was necessary. Sure, over time enhancements were made to take advantage of specific features of various x86 processors, but that's not porting.
And GNU has nothing to do with porting of Linux to any platform.
And your corporate mainframe doesn't run NT. Or if it does, you're using a definition of "mainframe" with which I was not previously familiar.
And preemptive multithreading and protected process management are not new to 2.4.5. That's something that has been in every Unix since, oh, about 1970. It's also something that has been in every enterprise-class system in the past twenty years. I would hardly call it a boon to admins.
AFAIK, there are no mainframes that run NT. And I've never seen one with a USB port, either. Your post makes absolutely no sense from a mainframe perspective. What are you talking about with "dual boot"? Mainframes can run multiple simultaneous LPARs, which are virtual machines sort of like VMware provides, but nobody ever boots a mainframe if they can avoid it. And mainframers don't use the word "boot" anyway, since individual components of the mainframe can be individually restarted. Mainframers IPL from the HMC, which is the analogous operation to rebooting. Mainframes run VM, MVS, OS/390, and similar. They don't run NT or OpenBSD (though OpenBSD could probably be ported). There is no graphics console device on a mainframe, so how the hell could you run NT?
Much server work looks like this: request comes in from network, appropriate program gets loaded, maybe talks to a database, runs for less than a second, returns results to network, exits. Typically, a large fraction of the resources used go into the "appropriate program gets loaded" step, doing the same startup ritual over and over.
There are two UNIX/Linux solutions to that startup overhead problem. One is to build the transaction program into the network application (as in Apache/mod_perl/php). Note that this uses an interpreter to protect the network application from bugs in transaction programs, which is a major performance hit.
The other approach is to use the regular UNIX/Linux program launch facilities to run a separate program for each transaction (as in CGI programs.) This is safer and easier to maintain, because the CGI programs each run in their own processes, but the cost of program loading (which might include initializing a Perl or Java environment) often dwarfs the cost of doing the useful work.
A mainframe transaction processor basically maintains process images which are ready to run a transaction, with all loading complete. When a process is needed to run a transaction, it's made by copying one of those process images (with read-only or copy-on-write sharing of pages) and launching it to do the job. The new process runs for a short period and exits. This is a facility that Linux/UNIX lack, because they were intended for interactive use, not server-side transactions.
Because Linux has copy-on-write semantics for fork(II), it should be possible to do a high performance transaction facility under Linux. A transaction program initializes itself by loading everything it needs, but without any per-transaction data available. It then goes into a loop waiting for work, and on each request, forks off a copy of itself to do the job. Each copy does one transaction and exits. If it crashes or gets corrupted, only one transaction is affected. Note that there are no expensive exec(II) calls involved in starting a new transaction.
Has this been done? It's obvious enough that somebody has probably tried it.
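For what it's worth, here is a minimal sketch of the fork-per-transaction idea described above, written in Python for brevity. It is only an illustration under stated assumptions, not a real transaction monitor: load_expensive_state(), handle(), and run_transaction() are hypothetical names, the "expensive" startup is a stand-in dictionary, and requests come from a list rather than the network. The point is just that the slow initialization happens once, each request gets a forked copy-on-write child, and no exec is involved.

```python
import os


def load_expensive_state():
    # Hypothetical stand-in for slow startup work (loading libraries,
    # parsing config, warming caches). Runs once, in the parent only.
    return {"rates": {"USD": 1.0, "EUR": 0.9}}


def handle(state, request):
    # Per-transaction work. Runs only inside a forked child.
    amount, currency = request
    return f"{amount * state['rates'][currency]:.2f}"


def run_transaction(state, request):
    """Fork a child for one request; return its reply, or None if it failed."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # child shares the parent's pages copy-on-write
    if pid == 0:
        # Child: do exactly one transaction, then exit without exec.
        os.close(read_fd)
        status = 0
        try:
            os.write(write_fd, handle(state, request).encode())
        except Exception:
            status = 1  # a failing transaction only takes down this child
        finally:
            os._exit(status)
    # Parent: collect the reply and reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        reply = reader.read().decode()
    _, wait_status = os.waitpid(pid, 0)
    ok = os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0
    return reply if ok else None


if __name__ == "__main__":
    state = load_expensive_state()  # paid once, not per request
    for req in [(100, "USD"), (10, "EUR"), (5, "XXX")]:
        print(req, "->", run_transaction(state, req))
```

The last request uses an unknown currency on purpose: the child fails, the parent sees a non-zero exit status, and the loop carries on with the preloaded state untouched.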
You are incorrectly aggregating mainframes, shared-memory multiprocessors (SMP/NUMAs), and clusters as massively parallel machines.
Linux makes a great operating system for certain classes of massively parallel machines: clusters. It is low-cost and has a decent MPI (message-passing interface) implementation. It also runs on commodity hardware. Don't be surprised if you see the next ASCI supercomputer using Linux as the OS for each node.
You are correct in that Linux is not a good operating system for larger shared-memory multiprocessors. It lacks the fine-grained locking necessary to run the same kernel instance across dozens of processors.
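To make "fine-grained locking" concrete, here is a toy sketch in Python. It is not kernel code and all names (CoarseCounters, FineCounters, hammer) are hypothetical. With one global lock every update serializes behind the same lock; with one lock per bucket, updates to different buckets do not contend with each other. CPython's own interpreter lock means this will not show a real speedup, so treat it as a picture of the structure only.

```python
import threading


class CoarseCounters:
    """One lock guards every bucket, so all updates serialize."""

    def __init__(self, n):
        self.counts = [0] * n
        self.lock = threading.Lock()

    def add(self, i, k=1):
        with self.lock:
            self.counts[i] += k


class FineCounters:
    """One lock per bucket, so only updates to the same bucket contend."""

    def __init__(self, n):
        self.counts = [0] * n
        self.locks = [threading.Lock() for _ in range(n)]

    def add(self, i, k=1):
        with self.locks[i]:
            self.counts[i] += k


def hammer(counters, n_buckets, n_threads=8, per_thread=10_000):
    """Update the counters from several threads and return the grand total."""

    def work(tid):
        for j in range(per_thread):
            counters.add((tid + j) % n_buckets)

    threads = [threading.Thread(target=work, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counters.counts)


if __name__ == "__main__":
    print(hammer(CoarseCounters(16), 16))
    print(hammer(FineCounters(16), 16))
```

Both versions give the same totals; the difference is how many threads have to wait on each other to get there.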
I can't comment on mainframes because I am unaware of their architecture. I do know that high-end UNIX servers and mainframes are different beasts. The former focus on performance while the latter prefer uptime above all. I also believe that IBM, the king of mainframes, has not used UNIX as their traditional operating system. Thus you are comparing apples to oranges. Linux makes a perfectly decent "mainframe OS" if you are partitioning the machine into multiple virtual machines.
Also please elaborate on "Linux' inferior TCP/IP stack". And "inferior handling of multi-threading on a large scale". Are Solaris light-weight processes any better?
I work for a large shop and here are my experiences running Linux on the Mainframe.
First, I'm a mainframe person. I like the mainframe. I've used Linux at home for about 6 years so I was chosen to be on a "proof of concept" with running Linux under VM. I've been doing OS/390 & z/OS support for about 4 years. I'm in the "30 & under" crowd and I've seen both the Unix & mainframe side of support.
We've played with TurboLinux, SuSE, & the RedHat beta for the zSeries. We're running zVM 4.2.
First, lots of things work really well. It was strange seeing the normal Linux boot messages appearing in zVM. We've been primarily using the 2.4 series kernel, but we have tested things with the 2.2 series. We've played with Oracle, WebSphere, DB2 Connect, Samba, Apache, IBM HTTPD server. The only technical problem we really had was Samba caused kernel crashes. Some patches from the IBM z/Linux site fixed it.
The biggest problems we have had are philosophical and perception-based. Here are some of the difficulties:
We had to force our customers to a shared outage window. Even VM needs to be IPLed every year or so. If they can't tolerate a 6 hour window every quarter or 6 months, we won't support them on the zSeries. A second box could make it a true zero downtime machine, but we are initially targeting the low usage, non critical machines.
Lots of people have the delusion that the zSeries processor is hundreds of times faster than other processors. It isn't. It's fast, but not several orders of magnitude faster than the other processors out there. It's also not designed for heavy computational applications. Don't try, you'll hate the results. It can be done on a limited basis, but don't try and compute PI. It works better on I/O related applications which are traditional mainframe strengths.
A lot of the code on the zSeries for Linux is the first generation to be released there. A lot of the performance perks for that platform are not there yet. If there is enough adoption, ISVs will make the performance better, but right now a lot of them are testing the water.
Some people have the illusion that if you take a piece of crap application on Solaris or NT and run it on Linux, it will run better. The OS typically doesn't make your piece of crap any better.
When people buy an Intel or Solaris server, they typically get the most memory & disk space they can afford. This is the worst thing to do under VM. We had a lot of people want 2GB of RAM and 100GB of disk space. Later analysis showed they could survive with much less memory (some as little as 128M) and used almost none of the disk. The reason for this is simple. When you buy a Sun or Intel server, upgrading them is a pain, so you do the pain up front. Under VM you can change the amount of memory & allocate more disk very easily. This was a big learning curve for people, and not just the Unix people. The major difference we found in the memory is because Linux uses it as disk cache. On the zSeries the hardware has lots of its own cache on it.
People needed to understand they were sharing CPU & memory. Performance tuning has a very big impact. On Intel or Sun, who cares if your application is looping endlessly? On VM everyone cares. Lots of our Unix sysadmins really hated this fact and the customers couldn't fathom it. You want to put applications with LOW USAGE on this platform. The idea behind sharing is that nobody needs all of the CPU all of the time. If you run at 100% on a 4-way Pentium CPU, you won't like sharing CPU with dozens of other virtual servers and they won't like sharing with you. This was probably the most difficult thing to stress to the users.
This isn't emulating Intel. It took a while to get people to understand that VM wasn't emulating an Intel machine and that the nice pre-compiled Intel binaries don't work. Lots of people went out looking for software from ISVs and the ISVs said "Sure we support Linux". What they didn't say is "We support Linux/390". There is a very big difference. Linux is not just Linux on Intel and it took some education to get this through to the users.
Once we convinced people that it isn't running Intel, they tried to recompile their favorite programs and found out that for some applications a "simple" recompile wasn't enough. I would imagine that the PowerPC folks had similar problems, but some programs take a little investigation.
There were some really nice aspects of running on a zSeries.
Disaster Recovery is easier. Mainframe DR has been established for decades and it isn't terribly different with Linux on the mainframe. Much simpler than having dozens of individual machines to recover.
The hardware never fails. It may be expensive, but CPUs have a 30 year mean time to failure, the disk is all RAID, and multiple I/O channels help ensure there is no single point of failure. Hardware can typically be swapped out without taking an outage. CPUs can be dynamically added.
If you want to copy an existing virtual server and make a test copy, that can be done in minutes. That makes it really nice for developers who want to do the "what if I do this" tests.
VM's programmable operator facility makes for some nice system automation. You can also create Rexx scripts for your operations so they never even need to logon to Linux to do certain work.
Creating a new server is easy. No more running through the install screens. Once you have one customized, just use it as a template for new servers.
We were able to have certain drives shared as read-only across all images. This makes support a little easier. We made one Linux have the drive read-write. When we changed it there, we just unmounted & remounted it on the other images (a Rexx script made that painless) and it was magically everywhere. We can even take down the read-write Linux to be sure something isn't accidentally changed. We've been experimenting with sharing lots of Linux mount points this way. We estimate we can concentrate about 100GB down to 2 GB which cuts down the overall cost. The majority of code on all Linux images is the same and will tolerate being shared, so as long as your environment is stable and you do some planning, you can dramatically cut down on disk usage. The amount of disk you save is directly related to the number of images your machine can handle.
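As a rough worked example of that estimate (the inputs are assumptions, not measurements from our shop): if 50 images each carried their own copy of 2 GB of shareable code, that would be 100 GB unshared versus a single 2 GB copy shared read-only. The function name and the figures of 50 images and 2 GB are hypothetical, picked only because they reproduce the 100 GB to 2 GB numbers; private_gb covers whatever each image still has to keep to itself.

```python
def disk_usage_gb(images, shared_gb, private_gb=0.0):
    """Return (unshared_total, shared_total) disk usage in GB.

    images     -- number of Linux images (assumed figure)
    shared_gb  -- data every image can mount read-only from one copy
    private_gb -- data each image must still keep for itself
    """
    unshared_total = images * (shared_gb + private_gb)
    shared_total = shared_gb + images * private_gb
    return unshared_total, shared_total


if __name__ == "__main__":
    unshared, shared = disk_usage_gb(images=50, shared_gb=2.0)
    print(f"unshared: {unshared} GB, shared: {shared} GB, saved: {unshared - shared} GB")
```

The saving works out to (images - 1) * shared_gb, which is why it grows with the number of images the machine can handle.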
The virtual-linux to virtual-linux IP traffic happens at memory-to-memory speed. It's also very nice not to worry about network issues when trying to debug a problem because there is no physical network.
Recovery is easier if an image won't boot. Just attach the drive to another, running image and fix the problem. No need to physically go to the machine.
Sorry to ramble, but this is what we have found. Linux on the zSeries has its place and does work, but it's not a solution to every problem. Few things are.
I know IBM has demonstrated this with some of their next generation PPCs. Alpha has had single-cycle context switch for years (I don't know if tru64 or alphaLinux supports it, but OpenVMS does.)
Yes, but note firstly that this article is making two different points, and secondly that at least one of them is clearly wrong and deliberately misleading.
First the article claims that Linux on mainframes isn't price efficient compared to Linux on Intel, and that Intel boxes are emerging which have similar reliability to mainframes.
Possibly true; I don't know enough about mainframes to know, although I'm certainly not aware of these high-reliability Intel boxes.
Second, the article launches an ill-informed FUD assault on Linux, saying
- 'Linux vendors for requiring users to constantly update their software to fix errors'
- 'current Linux incarnations are relatively immature, as evidenced by the interminable list of errors/patches on Linux providers' Web sites'
- 'Linux isn't capable of running more complex, critical applications, such as e-mail notification systems'
Are any of those things true? What does that say about the rest of the article?
I'm old enough to remember when discussions on Slashdot were well informed.
Not to be a GNU pundit, but...
> And GNU has nothing to do with porting of Linux to any platform.
... is demonstrably false. Whether or not the individual people who port the Linux kernel to a new architecture are or are not GNU affiliates is, simply put, irrelevant. The first step to getting Linux (or BSD, or whatever) on a new system is porting GCC to its architecture. While this is sometimes done by the people responsible for the Linux porting effort, most of the time this is done by members of the GCC team -- getting a new port to work without breaking all the others requires a great deal of cooperation and support.
Not to mention a working linker. Assembler. The list goes on. Who wrote those?
Lately I've heard a lot of Linux weenies dissing GNU and RMS as out-dated hippies who are prone to overestimating their importance. Unfortunately for these people, GNU is the only reason Linux exists. It's not like Linus wrote his kernel and there just happened to be a binutils chain, compiler, libc, etc just sitting there, ripe for the taking without someone doing a HECK of a lot of work. Probably more work than goes into developing the Linux kernel itself.
Unlike some morons, I'm not here trying to say that Linux/Linus don't deserve a lot of credit; they do. But people who disagree with RMS and his policies often decide that that makes it okay to write revisionist history and downplay his importance to the OSS movement. Without him, there is no movement. Like him or not, don't forget it.