Managing Linux and Virtual Machines?

Posted by Cliff on Wednesday September 3, 2003 @12:10PM from the seeking-wisdom-from-the-pioneers dept.

deijmaster asks: "For a couple of months we have been hearing (as a major consulting firm) IBM people pushing the possibility of installing a Z/Linux VM setup at one of our biggest clients (financial). To a Linux user such as myself this sounds great, at first. Now, I am a bit reluctant when it comes to managing this kind of infrastructure, with little or no local expertise at IBM. Has anyone gone through a Z/Linux VM corporate installation and lived through the management of such a solution?"

You WILL need help by salty_oz · 2003-09-03 12:25 · Score: 4, Insightful

If you have never touched VM, then you will be well and truly out of your depth. It's a whole different world to Unix/Linux.

So you will have to get a VM person in. Probably only on a part-time contract, and IBM can provide that person for an additional fee.

In time you may learn enough to support your very limited VM environment.

--
ln -s /dev/null /dev/clue
1. Re:You WILL need help by Anonymous Coward · 2003-09-03 13:07 · Score: 4, Insightful
  
  So you will have to get a VM person in
  
  Yes, this is true, but if you are going to run Linux, you only need one VM person. The rest of your admins should be Linux admins.
  
  Don't imagine that the VM person will understand (or even like) Linux and don't expect your Linux admins to understand (or even like) the Mainframe.
TurboLinux, yes by mao+che+minh · 2003-09-03 12:31 · Score: 2, Insightful

I helped an admin friend (a pure Novell guy who was somehow tasked with this job) implement TurboLinux on an IBM Z series mainframe. It is fairly easy to work with, but you lose some performance, and updates and fixes can be hard to track down sometimes. Clustered Linux solutions could end up being cheaper at first, but their TCO may rise higher as time goes on (especially if your company/institution lacks a very competent Linux cluster admin/programmer).
Re:I would advise against it by Anonymous Coward · 2003-09-03 12:46 · Score: 1, Insightful

I'd like to see the hardware that this supposed "pure linux" solution would run on. Something piddly and crappy like a dual xeon setup?

Wintel hardware is crap and not at all scalable. It's like comparing a ferrari (z hardware) to a pinto (wintel) and saying "well, they're both cars". Sure the ferrari costs more, but it's a hell of a lot more likely to be able to win in a race.
Re:Here is how I manage Linux by Anonymous Coward · 2003-09-03 13:01 · Score: 0, Insightful

You wear your underwear on the outside?
Re:Clusters by DA-MAN · 2003-09-03 13:07 · Score: 3, Insightful

Have you ever dealt with a cluster? Large clusters are fucking expensive to run 24x7x365. They require a lot of Air Conditioning (we spend over $1,000 a month on just AC, that's an expense that is never going away), electrical and a shitload of space.

I know this is Slashdot, but a beowulf is not always the best choice!!!

--
Can I get an eye poke?
Dog House Forum
Re:not cost efffective by Covener · 2003-09-03 13:08 · Score: 4, Insightful

Disk IO, reliability, workload management and power consumption are also probably relevant in that equation (and on the side of z/linux)

Linux/390 is great for experimental servers, test systems, etc. OTOH - if you have any significant workload, buy a rack-mount PC.
Re:Doesn't seem like such a big deal by dalslad · 2003-09-03 13:13 · Score: 3, Insightful

Exactly. I find it interesting when people comment out of the space of speculation. The original question was for someone with "experience". That doesn't mean that he wanted uninformed opinions based on some notion of logic. If someone hasn't sailed the boat, don't tell me how to do it.
Re:not cost efffective by Detritus · 2003-09-03 13:15 · Score: 2, Insightful

The net result is that the Athlon is about twice as fast as the G6 mainframe.
That depends on your definition of speed.
Mainframes aren't bought for raw MIPS.

--
Mea navis aericumbens anguillis abundat
Re:I would advise against it by Jeremy+Erwin · 2003-09-03 13:19 · Score: 5, Insightful

Wintel hardware is crap and not at all scalable. It's like comparing a ferrari (z hardware) to a pinto (wintel) and saying "well, they're both cars". Sure the ferrari costs more, but it's a hell of a lot more likely to be able to win in a race.

Reasoning by analogy is always fraught with pitfalls.
The Ferrari can't carry more than two people. The IBM machine is designed for fast I/O. The Ferrari breaks down a lot. The IBM is designed to be highly reliable.

Perhaps a better, but still rather imperfect, analogy would be to a tractor trailer--lots of horsepower, but not a speed demon. Lots of cargo space. A decent diesel engine that can stand up to abuse.

IBM thinks that if you replace 20-30 Intel CPUs, all running at 5% utilization, with a single zSeries CPU running at 85-90% utilization, you'll save money and aggravation. On the other hand, if those 20-30 Intel CPUs are rendering CGI for a film, or modeling a jet engine (and thus running near 100% load), a zSeries CPU would only be able to take on the work of 4-5 Intel CPUs, if that.
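The arithmetic behind that claim can be sketched in a few lines of Python. The figures are the ones quoted above (30 Intel CPUs at 5% utilization, one zSeries CPU worth roughly 4 Intel CPUs at full load, run at 85%). The function names are made up for illustration, and the model ignores I/O, memory and VM overhead entirely.

```python
def intel_cpus_of_work(cpu_count, utilization):
    """Total demand, expressed in fully busy Intel CPUs."""
    return cpu_count * utilization


def zseries_cpus_needed(demand, intel_equiv_per_z, target_utilization):
    """zSeries CPUs needed to carry `demand` at the target utilization."""
    usable = intel_equiv_per_z * target_utilization
    return demand / usable


# Mostly idle servers: 30 CPUs at 5% load.
idle_demand = intel_cpus_of_work(30, 0.05)
idle_need = zseries_cpus_needed(idle_demand, 4, 0.85)

# Render farm: the same 30 CPUs near 100% load.
busy_demand = intel_cpus_of_work(30, 1.0)
busy_need = zseries_cpus_needed(busy_demand, 4, 0.85)

print(f"idle farm: {idle_need:.2f} zSeries CPUs")
print(f"busy farm: {busy_need:.2f} zSeries CPUs")
```

Under these assumptions the idle farm fits in well under one zSeries CPU, while the render farm would need roughly nine of them, which is the whole argument in two numbers.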
A little one-sided. Here's the downside of VMs by SuperBanana · 2003-09-03 13:24 · Score: 5, Insightful

Another benefit of virtual machines is their logical separation from the host server. Each virtual server has its own users (including root), applications, file systems, IP address, etc. That means that if security is compromised on one, the others are unaffected. Likewise, resources can be allocated to each virtual server according to need, and a misconfiguration on one doesn't affect the others. Compare this with running multiple applications on the same server for different purposes (e.g. running HR and accounting systems on one server): if email goes down, then both systems are affected. In a virtual server setup, only one or the other would be affected.
Ahh yes, grasshopper, but when that one uber-box dies(hard disk, fan, power supply, whatever), gets powered off by accident, network cable unplugged, yadda yadda- it affects ALL the virtual machines.
Granted, in the Big Iron you've got lovely hot-swap capabilities and such (processors, memory, etc.)...but nothing is foolproof or 100% reliable. It's the old joke with pilots about twin-engine airplanes; the door swings both ways and there's no such thing as a free lunch. On one hand, you've got a spare engine if one dies, but you're 2x as likely to have a failure, you've got a lot of added complexity, and sometimes it still won't save your bacon (twin-engine planes have an abysmal survival rate for engine failure, in part because of the really shitty way they fly with one engine down). This is VERY applicable, because managing this big IBM server is much more complex (the whole point of this article) than separate hardware.
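The "2x as likely" bit is just probability. A rough sketch, assuming each engine (or server, or power supply) fails independently with the same probability p over some period; p = 0.01 is an invented number, used only for illustration:

```python
def p_any_failure(p, n):
    """Probability that at least one of n independent components fails."""
    return 1 - (1 - p) ** n


def p_all_fail(p, n):
    """Probability that all n independent components fail together."""
    return p ** n


p = 0.01  # assumed per-engine failure probability, illustration only

single = p_any_failure(p, 1)
twin_any = p_any_failure(p, 2)   # chance of having to deal with a failure at all
twin_both = p_all_fail(p, 2)     # chance of losing both engines

print(f"one engine fails:    {single:.4f}")
print(f"either of two fails: {twin_any:.4f}")
print(f"both of two fail:    {twin_both:.6f}")
```

With two engines you deal with some failure almost twice as often, and the redundancy only pays off if the failures really are independent.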
Best example I can think of in how hot-swap can still not save the bacon is with the Cisco PIX 5-something(The 1U pizza-box one). It has FULL failover- if you've got two, and one shits the bed COMPLETELY, the other one takes over absolutely everything, including active connections; they share ALL state information for what's called stateful failover. Aside from a momentary blip where things stop for a sec...nobody's the wiser that a piece of very expensive hardware just let the Magic Smoke out. The problem is that the PIX OS version we had was buggy and would crash randomly- and because they were sharing connection tables and everything, they'd BOTH die, which was REALLY bad since the boxes didn't have hardware watchdogs(!). We turned off fully-stateful failover, and the problem went away; we'd notice they'd ping-ponged(there's an 'ACTIVE' led to show you which is live) and we'd power-cycle the other.
So ask the tough questions; instead of asking what's N+1, ask what's NOT N+1, and do a very careful breakdown of what exactly it will cost to run this big huge box, and figure out what the 'per [virtual] machine' costs are...
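One way to do that breakdown is to total the yearly costs and divide by the number of virtual machines you actually expect to run. Every figure and category name below is a placeholder, not a quote from IBM or anyone else; plug in your own.

```python
def per_vm_cost(yearly_costs, vm_count):
    """Spread the total yearly cost evenly across the virtual machines."""
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    return sum(yearly_costs.values()) / vm_count


# Placeholder numbers for illustration only.
yearly_costs = {
    "hardware_lease": 400_000,
    "software_licenses": 150_000,
    "vm_specialist_contract": 60_000,
    "power_and_cooling": 12_000,
    "floor_space": 8_000,
}

for vms in (20, 100):
    print(f"{vms} VMs: ${per_vm_cost(yearly_costs, vms):,.0f} per VM per year")
```

The point of running it for more than one VM count is that the per-machine figure swings a lot depending on how many guests you really end up consolidating.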

--
Please help metamoderate.
Re:not cost efffective by AchilleTalon · 2003-09-03 13:43 · Score: 2, Insightful

Exactly!
Management costs for dedicated servers which are almost idle, but still required as dedicated servers for many reasons are high. Also, reliability is an issue when you suddenly multiply low cost servers, which in turn reflects on the management costs, hardware cost and downtime cost.

--
Achille Talon
Hop!
Unless you want real mainframe class hardware. by TheLink · 2003-09-03 13:50 · Score: 2, Insightful

Unless you're in for the mainframe class hardware (and possibly support).

Coz for x86 servers, you can always use vmware e.g. vmware esx.

Not sure if vmware has anything lined up for opteron, but if that goes fine then it'll be cool.
--
- Too many replies beneath your current threshold
Re:oh my god, I finally have important information by Covener · 2003-09-03 14:08 · Score: 2, Insightful

They're great number crunchers, but they don't hold up under any kind of pressure as a web server. We had the z-series with no sites on it run benchmarks and compare to our development box with 20 sites hosted, and the development box (Penguin Computing) kicked its ASS.

You clearly have no idea what you're talking about. Great number crunchers? I can't even imagine what your testing was.
Re:Experience with z/Linux and VM by Anonymous Coward · 2003-09-03 15:43 · Score: 4, Insightful

With VM you can have all 100 instances of linux share the same system disks read only, install code on one, then each can pick up the updated code with a /etc/init.d/blah restart command.

And - that restart command can be issued from a VM service machine (PROP - the programmable operator) whose sole function is to issue commands to all the Linux machines and make sure they do it.

So basically it's rpm -Fvh foo.rpm on the master disk image, followed by a RESTART FOO message to PROP and you're done.
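As a rough sketch of that workflow, the Python below only builds the list of steps and prints them; it does not talk to VM or to PROP. The rpm and init-script commands are the ones quoted above. The guest names and the plan_rollout helper are hypothetical, and how the restart actually reaches each guest depends on how PROP is set up at your site.

```python
def plan_rollout(package, service, guests):
    """Return the ordered steps for a shared-disk update (dry run only)."""
    # Install once, on the master image that owns the system disks.
    steps = [("master", f"rpm -Fvh {package}")]
    # Each guest shares those disks read-only, so it only needs a restart.
    for guest in guests:
        steps.append((guest, f"/etc/init.d/{service} restart"))
    return steps


guests = [f"linux{n:03d}" for n in range(1, 4)]  # hypothetical guest names
for target, command in plan_rollout("foo.rpm", "foo", guests):
    print(f"{target}: {command}")
```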

(Note - I'm not Adam - but I can vouch that he does know what he's talking about and this is my guess at what he'd say)
Re:not completely true by NighthawkFoo · 2003-09-03 16:05 · Score: 2, Insightful

Yes, mainframes do go down, but it's usually due to some edge case that testing didn't catch. A production system going down (an "outage") usually causes IBM field engineers to hop on the nearest plane to the customer site.

IBM Mainframes have the advantages of a very old and robust operating system, reliable and redundant hardware, and a thorough testing process before they are shipped out the door. This is what makes them more reliable.

--
"I disapprove of what you say, but I will defend to the death your right to say it."
- Evelyn Beatrice Hall
Re:Kinda suprised by vidarh · 2003-09-03 23:47 · Score: 2, Insightful

Of course you need redundancy, but instead of having umpteen different servers you need backup for, you need only two Z-series servers, or you can quite likely achieve the redundancy you need by outfitting the Z-series machine properly.
The Z-series supports taking CPUs out of commission for replacement without downtime. Same for RAM. Add multiple hot-swappable SCSI controllers connected to a fully redundant storage system such as the ESS/Shark (where you can connect to two separate banks of controllers, so that any one of them can be offline without causing problems, and which has two separate AIX servers handling requests, supports RAID and synchronous mirroring over fiber to a backup ESS), multiple hot-swappable network cards, multiple power supplies, and you start getting pretty safe.
Yes, it will cost money, but so will providing all of the above for standalone servers. The Z-series is marketed primarily as a way of reducing maintenance work by consolidating your "servers" on one or two physical platforms, not for its purchase price - it's an expensive beast.
You'll sleep better .... by NoCleverName · 2003-09-04 01:37 · Score: 2, Insightful

... using VM. Not everything can be measured in pure dollars and cents. Consider: all the stuff written about "what if" this or that fails because I have only one box can largely be ignored. All that fail-over stuff is built under the skin of the box. Just because you don't see it as multiple distinct boxes doesn't mean it's not under the covers (multiple power supplies, CPUs, buses, etc.).

When something goes wrong in an app, you can generally cross off hardware problems right away. That's because, if there are hardware faults, the system brings in spares and shoots out diagnostics on EXACTLY what's wrong, right down to the card level. So if the system is quiet about the hardware, it isn't the hardware.

One very big advantage is being able to run multiple versions of your OSes simultaneously. That means you don't have to worry about the crusty app running on the dusty box nobody remembers anything about. It's all on your mainframe and will move right over if you change hardware. And, of course, business recovery is a dream since you're not talking about replicating all those unique boxes you've accumulated over the years.

In general, VM should be looked at as a management tool more than pure power under the hood. If you need to manage your corporate computing needs at a corporate, strategic level, VM's for you. That doesn't mean there won't be a few instances where you've got to have pure dedicated power for one app. But as the years go by and some apps hang around and must be maintained while focus moves on to other things, you will be very happy you've got VM there to manage your own sanity.
Paper on using vm's to manage linux clusters by Anonymous Coward · 2003-09-04 02:05 · Score: 1, Insightful

There is a bunch of research in academia about managing computers using VM's. One such paper is appearing in USENIX's 2003 LISA conference: http://suif.stanford.edu/collective
Internet Suspend/Resume at Intel Research in Pittsburgh is another (the original post linked the paper). They also had an article in Scientific American a while back.
One big advantage of managing with VM's is a complete system is just like a file, and thus can be copied and migrated easily. For example, if you have a production server with some faulty hardware, you can migrate the machine to a new host by simply copying the VM files, then repair the hardware, and copy it back.
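A minimal sketch of the "copy the VM files" idea, assuming the VM is powered off and lives as plain files in one directory. The migrate_vm helper and the directory layout are made up for illustration; real products keep their own file formats and have their own migration tools.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large disk images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_vm(vm_dir, dest_dir):
    """Copy every file of a powered-off VM and verify each copy by checksum."""
    vm_dir, dest_dir = Path(vm_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(p for p in vm_dir.iterdir() if p.is_file()):
        dst = dest_dir / src.name
        shutil.copy2(src, dst)
        if sha256_of(src) != sha256_of(dst):
            raise IOError(f"checksum mismatch for {src.name}")
        copied.append(dst)
    return copied
```

Usage would look something like migrate_vm("/vms/web01", "/mnt/newhost/vms/web01"), with both paths being hypothetical. Copying back after the repair is the same call with the arguments swapped.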
Of course the efficiency is degraded somewhat due to the VM overhead, but the main argument is that cycles are cheap and people are expensive. It's cheaper to buy a P4 2.4 GHz for $500 than to buy a new sysadmin for $60,000. If you are performance-limited, just replicate instead of buying some fancy hardware (or look into better VM technology like VMware ESX Server).
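To put rough numbers on "cycles are cheap, people are expensive": the two prices below come from the paragraph above, while the 20% VM overhead and the ten-machine workload are assumptions picked purely for illustration.

```python
import math

SYSADMIN_SALARY = 60_000  # dollars per year, figure from the post
MACHINE_PRICE = 500       # dollars per P4 box, figure from the post


def machines_per_salary(salary, machine_price):
    """How many boxes one sysadmin salary would buy."""
    return salary // machine_price


def machines_needed(native_machines, vm_overhead):
    """Boxes needed once each host loses `vm_overhead` of its capacity."""
    return math.ceil(native_machines / (1 - vm_overhead))


native = 10       # assumed workload, in non-virtualized machines
overhead = 0.20   # assumed VM overhead, illustration only

with_vms = machines_needed(native, overhead)
extra_cost = (with_vms - native) * MACHINE_PRICE

print(f"one salary buys {machines_per_salary(SYSADMIN_SALARY, MACHINE_PRICE)} boxes")
print(f"{native} native boxes become {with_vms} VM hosts, ${extra_cost} extra")
```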