IBM Saves $250M Running Linux On Mainframes
coondoggie writes "Today IBM will announce it is consolidating nearly 4,000 small computer servers in six locations onto about 30 refrigerator-sized mainframes running Linux, saving $250 million in the process. The 4,000 replaced servers will be recycled by IBM Global Asset Recovery Services. The six data centers currently take up over 8 million square feet, or the size of nearly 140 football fields."
We (Bigattichouse's Vectorspace Database) went through their Linux certification (as well as Grid cert), and they were a pleasure to work with - providing expert advice and patience in every step of the process. Not exactly on topic, I guess, but I thought I'd share. They really seem to embrace the engineering and spirit of Linux.
meh
SCO UNIX runs on the x86 architecture, that was the basis of the claim that Linux contained copyrighted SCO code. IBM's Linux on POWER solutions run on, um, POWER :)
Really this is just a slashvertisement - it's great they're using Linux on a mainframe, but they're just IBM mainframes running multiple Linux instances, rather than multiple IBM servers running Linux. Honestly, on IBM hardware, I'd prefer IBM's OSes, but they're marketing the fact that you can have a high-powered, highly efficient, highly available consolidation solution that runs your existing Linux apps.
When they came for the communists, I said "He's next door. Take him away. Goddam commies."
Part of that is because IBM will customize the machines to your heart's content. The sky and your budget are the only limits. They leave a good many of the loadout details up to you (xGB/TB of RAM, DASD storage size, # of CPUs per card, # of CPU cards, even number of mainframes - they can be chained in parallel). You should look at the Z series hardware specs for the general details and look up what details you don't know.
If you're looking for benchmarks or comparisons to x86/x86-64 or other commodity architectures, good luck - they are nearly impossible to find. This is due to the implementations being on entirely different scales. The best comparison you can find is the MIPS per CPU. You can find some slightly stale numbers here (BTW: an LPAR is something that's been around on mainframes for several decades - one LPAR can run up to several hundred x86 VMs concurrently).
US Democracy:The best person for the job (among These pre-selected choices...)
They're probably computing cost over the expected lifetime.
Combine IT salary for 3-5 years, power over 3-5 years, etc. etc. and that number makes sense.
These are machines that don't break, period. We're talking the types of machines that run the major banking systems of the world and the like. They simply do not go down. In this situation, if one of the 133 apps buggers up, it's only that VM that's shot. You just nuke it and restart it, the rest of the machine just keeps ticking along.
...si hoc legere nimium eruditionis habes...
Not quite. They are engineered (as they have been for decades) for stability and were designed to handle that kind of load. Its CPU/RAM/storage are redundant, so that if something in the system goes down, new resources are allocated. Additionally, shops will have multiple mainframes just for that kind of redundancy. It's kind of like saying your car is a "single point of failure" - sure, it is, but they were engineered for the purpose of being reliable.
The blue team was in the other day and I sat through the entire 2 hours of how much money running Linux on the Z would save us. It sounded great on paper. As I was leaving with the AIX guys they could barely contain themselves, so I asked what was so funny. They gave me the summary of the last time we tried this, several years ago, with results that were similar to, if not worse than, the previous poster's. We would have spent many, many millions to run our p series and/or i series servers. I'm sure we will take another shot at this but, even as a Blue supporter, buyer beware.
According to another article it is saving the $250M over 5 years, predominately from reduced running costs
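For a rough sense of scale, here is a back-of-the-envelope sketch in Python using only the figures quoted in this thread. "Nearly 4,000" servers and "about 30" mainframes are rounded, so treat the outputs as approximations, not IBM's own accounting.

```python
# Figures quoted in the thread; the server and mainframe counts are rounded.
TOTAL_SAVINGS_USD = 250_000_000
YEARS = 5
SERVERS_REPLACED = 4_000   # "nearly 4,000"
MAINFRAMES = 30            # "about 30"

savings_per_year = TOTAL_SAVINGS_USD / YEARS
savings_per_server_per_year = savings_per_year / SERVERS_REPLACED
servers_per_mainframe = SERVERS_REPLACED / MAINFRAMES

print(f"Savings per year: ${savings_per_year:,.0f}")
print(f"Savings per replaced server per year: ${savings_per_server_per_year:,.0f}")
print(f"Servers consolidated per mainframe: {servers_per_mainframe:.0f}")
```

That works out to roughly $12,500 per replaced server per year, and about 133 servers per mainframe, which is where the "133 apps" figure mentioned elsewhere in the thread comes from.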
A game has objectives and is competitive, anything else is just play
Hi, yes and no. The 370 runs 360 code but, as happens too often even today, people coded to bypass the OS. Old devices: drums, paper / magnetic card readers, terminals, channels, etc. Even today's systems, VM especially, keep the idea of 80-column cards, punches, readers, etc., and if used correctly they work wonders. Trust me, the 360 architecture is one of the best even today. The problem is that not many people want to learn the basics any more, i.e. Principles of Operation ( any 3xx, a good book to read, required reading, IMHO ). Search on which OS version macro libraries Linux ( 370 HAL ) was first compiled on 360/370, you will be amazed. Emulations in 360-xxx mostly mean address space differences ( 24/31/32/64/.. ) and some added machine code / functionality, done by OS/hardware. And of course, for a long time, trapping the floating point was/is(?) one if you didn't have the fp hardware installed.
But with the System z mainframe and Geographically Dispersed Parallel Sysplex (http://www-03.ibm.com/systems/z/resiliency/gdps.html), you could have your hot backup site located something like 300 km away, with minimal data loss (e.g. in-flight transactions).
... if a facility is damaged, work will be moved to the backup facility. Would anyone really try to sort through the rubble to see which servers could restart and then see whether what came up represented a coherent configuration that could actually perform useful work?
And I think your disaster recovery example is a bit far-fetched
"one LPAR can run up to several hundred x86 VMs concurrently)."
When I started out, the "hot" PC, the best you could get, was a 4 MHz Z80 running CP/M. I had one of those at home, and at work I worked on the operating system of a very old (even then) CDC mainframe. It was a CDC6600. We had a Z80 emulator that ran on the 6600 and we could emulate a Z80 at about 20 times real time. Not bad, a virtual PC running on a mainframe in the late 1970s.
Us software people really need to get off the ball and think of something new rather than just re-implementing 40 year old ideas on ever cheaper and faster hardware.
Last week, I attended a presentation at IBM's Australian Development Lab in West Perth, where a lot of the z/OS-related code is maintained and developed.
From what we were told, IBM z/OS mainframes are the *most* reliable platform to host software services (but of course, they'd say that).
The following is from memory, as best as I can remember it, and may not be 100% accurate:
The 'z' in 'z/OS' stands for 'zero downtime'. z System mainframes are engineered for 99.999% availability, or a little over 5 minutes of downtime a year (we were quoted 'less than 5 minutes'; (1 - 99.999%) * 365.25 * 24 * 60 = 5.26). Apparently, they quite easily meet this requirement - we were told that it is not uncommon for systems to remain online for 10 years or more without failing.
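As a quick check of the availability arithmetic, here is a small Python sketch. It assumes a 365.25-day year and 24-hour days. The function name is just for illustration.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes

def downtime_minutes_per_year(availability: float) -> float:
    """Allowed downtime per year for a given availability fraction (e.g. 0.99999)."""
    return (1.0 - availability) * MINUTES_PER_YEAR

for label, availability in [("three nines", 0.999),
                            ("four nines", 0.9999),
                            ("five nines", 0.99999)]:
    print(f"{label}: {downtime_minutes_per_year(availability):.2f} minutes/year")
```

With a 24-hour day, five nines comes out to about 5.26 minutes of downtime a year.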
Up to 32 z System mainframes can be clustered in a 'sysplex'. Each mainframe is divided into several LPARs (Logical Partitions), each of which can host several VMs. If an application fails, the automated recovery service will attempt to restart it, either on the same VM, a different VM, a different LPAR or a different mainframe in the sysplex, as appropriate in the situation. It is also possible to host a redundant sysplex in a different site, which mirrors data and which the primary sysplex can fail over to in the case of failure.
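To make the restart escalation concrete, here is a toy Python sketch. It is not IBM's recovery automation, and every name in it is hypothetical. It also assumes a fixed escalation order, whereas the real service presumably picks the target based on what actually failed.

```python
from typing import Callable, Optional

# Assumed escalation order, taken from the description above.
ESCALATION_ORDER = ["same VM", "different VM", "different LPAR", "different mainframe"]

def recover(app: str, try_restart: Callable[[str, str], bool]) -> Optional[str]:
    """Return the first location where the restart succeeded, or None if all failed."""
    for location in ESCALATION_ORDER:
        if try_restart(app, location):
            return location
    return None

# Toy stand-in for a real restart: pretend nothing in the original LPAR is healthy.
def fake_restart(app: str, location: str) -> bool:
    return location in {"different LPAR", "different mainframe"}

print(recover("billing", fake_restart))
```

In this toy run the two VM-level restarts fail and the application ends up on a different LPAR. A failover to a mirrored sysplex at another site would be one more step after the list runs out.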
IBM mainframes are used in many major corporations around the world, particularly those where the cost of downtime is very high (think thousands of dollars a second).
There are a few reasons why the specs for mainframes are so hard to find.
One is that the things you find on IBM's website are designed for CEOs and CIOs who don't really care about technical details -- only "solutions".
The second is that the specs themselves aren't well-defined. As an earlier poster pointed out, you don't buy one of these things off the shelf. You tell IBM what you want to do with it, and you work with them to construct not just a mainframe, but all of the storage and other add-ons that come along.
And finally, the third reason is that the specs don't line up with anything you likely work with normally... (If they did, you'd know where to find them.)
Here are some specs for the z9 Enterprise Class:
http://www-03.ibm.com/systems/z/z9ec/specifications.html
Simplified, you are looking at 54 CPUs with 512GB of memory.
The CPUs themselves are basically Power6 processors, but that's really simplifying everything down.
Each CPU is actually a "book" of CPUs. Several run at once on the same data. If any disagree, the instruction is rerun on a different CPU. Entire backup books (in addition to the 54) kick-in if a problem is detected.
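Here is a toy Python model of the "run it twice, compare, rerun elsewhere on disagreement" idea. It only illustrates the principle. The real hardware does this at the instruction level in silicon, and all names here are made up.

```python
from typing import Callable, List

Unit = Callable[[int], int]  # hypothetical execution unit: operand in, result out

def execute_with_compare(units: List[Unit], spare: Unit, operand: int) -> int:
    """Run the same work on every unit; if the results disagree, rerun on a spare."""
    results = [unit(operand) for unit in units]
    if all(result == results[0] for result in results):
        return results[0]
    # Disagreement: trust none of the results and rerun on a different unit.
    return spare(operand)

def good(x: int) -> int:
    return x + 1

def faulty(x: int) -> int:
    return x + 2  # simulates a unit that produced a wrong result

print(execute_with_compare([good, good], good, 41))    # units agree
print(execute_with_compare([good, faulty], good, 41))  # units disagree, spare is used
```

Either way the caller gets the same answer, which is the point: the fault is handled below the level the software ever sees.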
Additionally, the z/Series comes with a bunch of "Specialty" CPUs. You can get 27 CPUs that do nothing but process Java work natively. Or ones that handle DB2 workload. Or even special processors optimized for the Linux kernel. Oh and don't forget the built-in hardware crypto CPUs.
Memory and I/O and Power and everything else work pretty much the same way on a mainframe. And all of it is hot-swappable. (Even the Emergency Power Off switch can be replaced while the system is running).
The hardware specs are impressive, but the biggest deal about these boxes is that they don't go down. Most people I talk to question the idea of consolidating servers into one box because of "single point of failure" concerns. This is where the mainframe shines. These things have MTBF of decades, and will just churn away forever.
Yeah, if somebody hits the Big Red Switch, there's going to be a problem. But, if they don't, well, it's a mainframe.
The Linux on these machines is running under z/VM, in multiple virtual machines. When one of them has a software fault, you reboot that one's VM and keep going; the other 132 Linux-running VMs run without noticing anything happened. (It is possible for z/VM to fault, sure. But it's an OS with 40 years of refinement in the "100% uptime" mainframe culture, and its task is just managing the virtual machines.) When something goes wrong with the hardware, the fault tolerance and self-healing features keep things running, and you fix the faulty element with a hot-swap. A properly set-up datacenter is going to minimize external risks, with backup power and such. Proper choice of datacenter location will minimize natural disaster risk.
So, yeah, the big risk is human failure, and these IBM-built, IBM-owned datacenters are presumably going to have extensively trained IBM-employed mainframe personnel, which minimizes that risk.
Now, if some cable company cuts the fiber optic lines . . .
There are a lot of errors in your comments, unfortunately. Of course you can run Red Hat and SuSE concurrently in a single LPAR under z/VM, and multiple versions thereof. This has always been true, ever since Linux began running on mainframes many years ago. You might want to have more than one LPAR to run more than one version of (first level) z/VM, but you don't need many. Two or three for z/VM and Linux is typical and just fine. And it's not as if LPARs are in short supply on mainframes: up to 60 are available on a single machine (30 on the smaller model), so "spending" 1 to 3 is no big deal.
Re: Investing in new mainframes, come on, get real. It's so easy to find market data because companies like Gartner and IDC publish it, and IBM just announced its 8th straight quarter of mainframe hardware growth, something that hasn't happened since before Y2K. It's impossible to do that with "a few showboat customers."
And no, you simply cannot approach the level of virtualization these machines offer on any other system, at least for typical business computing, and still offer reliable service to users. In fact, in IBM's case many of the software licenses are presumably "free," and they still found big cost savings by taking 4,000 machines down to 30. For the rest of the world the mathematics in such situations are even more compelling.
I was involved in a migration to the z/OS architecture three years ago. I am currently involved in a similar exercise for a British telecoms company whose name escapes me. In both cases the principle was perfectly sound, but the reality rapidly starts to come down to what can be migrated, when, and why. At IBM, application compatibility was a major consideration, and ultimately prevented key parts of the system from being migrated. At the current site, surprise surprise, the problems are the same, plus reluctance to do the work (upgrades, work required on the client's part, age of applications and Plain Old Politics). I wish IBM good luck, and perhaps because there is a better integration of operations and systems they might succeed, but I would be willing to bet that by the end of the process, they will have reached about 80% of their target.