Managing Linux and Virtual Machines?
deijmaster asks: "For a couple of months we have been hearing (as a major consulting firm) IBM people pushing the possibility of installing a Z/Linux VM setup at one of our biggest clients (financial). To a Linux user such as myself this sounds great, at first. Now, I am a bit reluctant when it comes to managing this kind of infrastructure, with little or no local expertise at IBM. Has anyone gone through a Z/Linux VM corporate installation and lived through the management of such a solution?"
If you have never touched VM, then you will be well and truly out of your depth. It's a whole different world from Unix/Linux.
So you will have to get a VM person in, probably only on a part-time contract, and IBM can provide that person for an additional fee.
In time you may learn enough to support your very limited VM environment.
Have you ever dealt with a cluster? Large clusters are fucking expensive to run 24x7x365. They require a lot of Air Conditioning (we spend over $1,000 a month on just AC, that's an expense that is never going away), electrical and a shitload of space.
I know this is Slashdot, but a beowulf is not always the best choice!!!
Disk I/O, reliability, workload management and power consumption are probably also relevant in that equation (and on the side of z/Linux).
Linux/390 is great for experimental servers, test systems, etc. OTOH - if you have any significant workload, buy a rack-mount PC.
Exactly. I find it interesting when people comment out of the space of speculation. The original question was for someone with "experience". That doesn't mean that he wanted uninformed opinions based on some notion of logic. If someone hasn't sailed the boat, don't tell me how to do it.
Wintel hardware is crap and not at all scalable. It's like comparing a Ferrari (z hardware) to a Pinto (Wintel) and saying "well, they're both cars". Sure, the Ferrari costs more, but it's a hell of a lot more likely to be able to win a race.
Reasoning by analogy is always fraught with pitfalls.
The Ferrari can't carry more than two people. The IBM machine is designed for fast I/O. The Ferrari breaks down a lot. The IBM is designed to be highly reliable.
Perhaps a better, but still rather imperfect, analogy would be to a tractor trailer: lots of horsepower, but not a speed demon. Lots of cargo space. A decent diesel engine that can stand up to abuse.
IBM thinks that if you replace 20-30 Intel CPUs, all running at 5% utilization, with a single zSeries CPU running at 85-90% utilization, you'll save money and aggravation. On the other hand, if those 20-30 Intel CPUs are rendering CGI for a film, or modeling a jet engine (and thus running near 100% load), a zSeries CPU would only be able to take on the work of 4-5 Intel CPUs, if that.
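To put rough numbers on that, here's a back-of-the-envelope sketch in Python. It assumes one zSeries CPU is worth about 4 Intel CPUs of raw compute (the low end of the 4-5 figure above) and that you'd aim to run it at 85% utilization. The function name and defaults are mine, purely for illustration, not anything IBM publishes.

```python
import math


def z_cpus_needed(intel_cpus, intel_utilization, z_equiv=4.0, z_target_util=0.85):
    """Estimate how many zSeries CPUs a set of Intel CPUs would consolidate onto.

    intel_cpus        -- number of Intel CPUs being replaced
    intel_utilization -- their average utilization, 0.0 to 1.0
    z_equiv           -- ASSUMPTION: Intel CPUs of raw compute per zSeries CPU
    z_target_util     -- ASSUMPTION: utilization you plan to run the zSeries CPU at
    """
    if z_equiv <= 0 or z_target_util <= 0:
        raise ValueError("z_equiv and z_target_util must be positive")
    # Real work being done today, measured in "fully busy Intel CPUs".
    demand = intel_cpus * intel_utilization
    # Work one zSeries CPU can absorb at the target utilization, same unit.
    capacity = z_equiv * z_target_util
    return math.ceil(demand / capacity)


if __name__ == "__main__":
    print("30 idle-ish boxes:", z_cpus_needed(30, 0.05))
    print("30 render nodes: ", z_cpus_needed(30, 1.00))
```

Under those assumptions, 30 mostly idle Intel CPUs fit on a single zSeries CPU, while 30 fully loaded ones would need nine. Plug in your own measured utilization before believing either number.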
Ahh yes, grasshopper, but when that one uber-box dies (hard disk, fan, power supply, whatever), gets powered off by accident, has its network cable unplugged, yadda yadda, it affects ALL the virtual machines.
Granted, in the Big Iron you've got lovely hot-swap capabilities and such (processors, memory, etc.)...but nothing is foolproof or 100% reliable. It's the old joke with pilots about twin-engine airplanes; the door swings both ways and there's no such thing as a free lunch. On one hand, you've got a spare engine if one dies, but you're 2x as likely to have a failure, you've got a lot of added complexity, and sometimes it still won't save your bacon (twin-engine planes have an abysmal survival rate for engine failure, in part because of the really shitty way they fly with one engine down). This is VERY applicable, because managing this big IBM server is much more complex (the whole point of this article) than separate hardware.
Best example I can think of for how hot-swap can still fail to save the bacon is the Cisco PIX 5-something (the 1U pizza-box one). It has FULL failover: if you've got two, and one shits the bed COMPLETELY, the other one takes over absolutely everything, including active connections; they share ALL state information for what's called stateful failover. Aside from a momentary blip where things stop for a sec...nobody's the wiser that a piece of very expensive hardware just let the Magic Smoke out. The problem is that the PIX OS version we had was buggy and would crash randomly, and because they were sharing connection tables and everything, they'd BOTH die, which was REALLY bad since the boxes didn't have hardware watchdogs (!). We turned off fully-stateful failover, and the problem went away; we'd notice they'd ping-ponged (there's an 'ACTIVE' LED to show you which is live) and we'd power-cycle the other.
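The twin-engine and PIX points can both be put in numbers. A rough sketch, assuming each unit fails independently with probability p over some period, plus a hypothetical common-cause probability c (a shared bug, a shared state table) that kills every unit at once. The function names and the example values are made up for illustration.

```python
def any_unit_fails(p, n=2):
    """Chance that at least one of n independent units fails (more parts, more failures)."""
    return 1 - (1 - p) ** n


def all_units_fail(p, n=2):
    """Chance that all n units fail, assuming failures are fully independent."""
    return p ** n


def all_units_fail_common_cause(p, c, n=2):
    """Chance of total loss when a common-cause fault (probability c) takes out
    every unit together; otherwise the units fail independently as above."""
    return c + (1 - c) * p ** n


if __name__ == "__main__":
    p, c = 0.01, 0.005  # ASSUMPTION: illustrative values only
    print("some failure:          ", any_unit_fails(p))
    print("total loss, independent:", all_units_fail(p))
    print("total loss, shared bug: ", all_units_fail_common_cause(p, c))
```

With p at 1%, the pair sees some failure about 2% of the time, and independent total loss is a tiny 0.01%. Add even a 0.5% common-cause fault and total loss jumps to roughly 0.5%, which is the PIX story in a nutshell: the redundancy only helps against the failures the units don't share.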
So ask the tough questions; instead of asking what's N+1, ask what's NOT N+1, and do a very careful breakdown of what exactly it will cost to run this big huge box, and figure out what the 'per [virtual] machine' costs are...
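For the 'per [virtual] machine' figure, a minimal sketch of the arithmetic: add up everything the box costs per year and divide by the number of guests you will actually run. Every line item and dollar amount below is a hypothetical placeholder, not a quote from IBM or anyone else.

```python
def cost_per_guest(annual_costs, guests):
    """Return the yearly cost per virtual machine.

    annual_costs -- dict mapping a cost item to its yearly amount
    guests       -- number of virtual machines sharing the box
    """
    if guests <= 0:
        raise ValueError("guests must be a positive number")
    return sum(annual_costs.values()) / guests


if __name__ == "__main__":
    # ASSUMPTION: placeholder figures, replace with your own quotes and bills.
    costs = {
        "hardware_lease": 240_000,
        "vm_contractor": 60_000,
        "power_and_cooling": 12_000,
        "floor_space": 8_000,
    }
    print(cost_per_guest(costs, guests=100))
```

Run the same sum for the rack-mount alternative (one entry per box, plus its share of AC and admin time) and compare the two per-machine numbers rather than the sticker prices.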
With VM you can have all 100 instances of Linux share the same system disks read-only: install code on one, and each instance can then pick up the updated code with a /etc/init.d/blah restart command.
And - that restart command can be issued from a VM service machine (PROP - the programmable operator) whose sole function is to issue commands to all the Linux machines and make sure they do it.
So basically it's rpm -Fvh foo.rpm on the master disk image, followed by a RESTART FOO message to PROP and you're done.
(Note - I'm not Adam - but I can vouch that he does know what he's talking about and this is my guess at what he'd say)