Virtual Containerization
AlexGr alerts us to a piece by Jeff Gould up on Interop News. Quoting: "It's becoming increasingly clear that the most important use of virtualization is not to consolidate hardware boxes but to protect applications from the vagaries of the operating environments they run on. It's all about 'containerization,' to employ a really ugly but useful word. Until fairly recently this was anything but the consensus view. On the contrary, the idea that virtualization is mostly about consolidation has been conventional wisdom ever since IDC started touting VMware's roaring success as one of the reasons behind last year's slowdown in server hardware sales."
The great thing about virtual machines is that you basically can do whatever you want with them. Things you'd normally never do to your computer.
It's only lacking a feature of throwing the virtual computer out of the window.
Sure, containerization might sound like a good idea... but if you find the word 'containerization' ugly NOW, wait until you see what furry abominations grow in the containers you forget about at the back of the work server for 2 months. >_>
The word is contain, people, not containerization.
As a software developer, being able to take snapshots, clone, pause, rewind (via snapshots) and backup makes VM'ing worth the cost in CPU/performance.
It's proved so useful that I'm sincerely considering doing the same for my actual WWW server, so that if things go -bad- on the device at any given time I can either roll back or transparently transfer to another machine. The latter, thanks to the (mostly) hardware-agnostic nature of the VM setup, makes disaster recovery that much simpler (sure, you still have to set up the host, but at least it's a simpler process than redoing every tiny little trinket again).
I've used virtualization both for containerisation and to consolidate boxes...
At my previous company, we invested in two almighty servers with absolutely stacks of RAM in a failover cluster. They ran 4-5 other servers for critical tasks...each virtual machine was stored on a shared RAID5 array. If anything critical happened to the real server, the virtual servers would be switched to the next real server and everything was back up again in seconds. The system was fully automated too, and frankly, it saved having to buy several not-so-meaty boxes while not losing much redundancy and giving very quick scalability (want one more virtual server? 5 minute job. want more performance? Upgrade redundant box and switch over virtual machines).
The system worked a treat, and frankly, the size & power of the bigger, more important fewer servers gave me a constant hard-on.
throw new NoSignatureException();
In case you're interested, the article is really a review of rPath, a virtual appliance builder based on a custom-tailored GNU/Linux...
I read somewhere (possibly on the PHP bug system) that they were considering scrapping most of the security features we've all grown to... well, hate really, and replacing them all with a virtualisation system. I did think at the time that the virtualisation system they'd implement to keep PHP-based vhosts separate and secure would be to run Apache in many virtual OSes.
I suppose jailing applications is a well-known way of securing them; this really just improves on that, but with much more overhead. I wonder if anyone is thinking about providing "lightweight" virtualisation for applications instead of the whole OS?
... that develops applications, mostly in C, I also find it extremely useful, especially when installing software. Some installers change the state of the system, and some problems only occur the first time round. There is nothing else like the ability to take your blank Windows VM, copy it, install stuff, screw around with it in every possible way and then when you're done just delete the thing. They also allow you to install stuff you just don't want on your native box, but need to develop against.
And you still have that blank Windows install to clone again when you need it.
VMs are a fantastic dev tool.
It's becoming increasingly clear that the most important use of virtualization is not to consolidate hardware boxes but to protect applications from the vagaries of the operating environments they run on. It's all about 'containerization,'
Don't trust "it's all about" or "it turns out that to the contrary" or "set to fully replace" statements, especially when there's lack of evidence of what is claimed.
Hosting services use virtualization to offer 10-20 virtual servers per physical machine, and I and many people I know use virtual machines to test the many configurations we can't afford separate physical machines for.
So even though it's also about "containerization" (is "isolation" a bad word all of a sudden?), it's not ALL about it.
Solaris has Zones for that exact purpose. Lguest, I believe, offers something similar for Linux.
With virtualization like Linux-VServer, Xen, VMware etc. there are two main reasons why people are using it.
1) Consolidation
2) "Containerization" or whatever they're calling it today.
The company that I work for uses multiple virtual servers to keep applications separate and to migrate them from machine to machine more easily, which is a common use for VMware (e.g. the appliance trend). So you're trading performance and memory usage for security and robustness/redundancy.
Across maybe 100-200 servers, the number of vservers we have is astonishing (probably around 1200 to 1500, which is a bit of a nightmare to maintain), all hosting customer applications. When an application starts to use more resources, its vserver is moved over to a machine with fewer vservers on it, and gradually to its own server, which in the long run saves money & downtime.
The other major industry using them is the hosting industry, allowing customers a greater amount of personalization rather than the one-size-fits-all cPanel hosting companies. This is the real industry where consolidation has increased, biting into the hardware market's possible sales because thousands of customers are now leasing shared resources instead of leasing actual hardware.
Either way, the number of new (virtual) machines and IP addresses, all managed by different people, is becoming a management nightmare. Now everybody can afford a virtual dedicated server on the internet regardless of their technical skills, which often ends up as a bad buy (lack of memory and resource constraints compared to shared hosting on a well-maintained server).
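For what it's worth, the "move it to a machine with fewer vservers on it" step can be sketched in a few lines of Python. This is only an illustration, not how any particular vserver tooling does it: the `hosts` dictionary, the host names and both function names are made up for the example, and a real setup would weigh memory, CPU and I/O rather than a simple head count.

```python
def pick_target_host(hosts, current_host):
    """Return the host with the fewest vservers, ignoring the current one.

    `hosts` is a hypothetical mapping of host name -> list of vserver names.
    """
    candidates = {name: vs for name, vs in hosts.items() if name != current_host}
    if not candidates:
        return None
    return min(candidates, key=lambda name: len(candidates[name]))


def migrate(hosts, vserver, current_host):
    """Move `vserver` to the least-populated other host, if that helps.

    Returns the name of the host the vserver ends up on.
    """
    target = pick_target_host(hosts, current_host)
    # Only move when the target really is less crowded than where we are now.
    if target is None or len(hosts[target]) >= len(hosts[current_host]):
        return current_host
    hosts[current_host].remove(vserver)
    hosts[target].append(vserver)
    return target


hosts = {"hostA": ["v1", "v2", "v3"], "hostB": ["v4"], "hostC": []}
new_home = migrate(hosts, "v1", "hostA")
```

An empty host at the end of the chain is the "gradually to its own server" case: the busy vserver ends up alone on a box.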
That can be a chilling thought to companies like Intel, Microsoft or Oracle. Also, the carefully woven, convoluted DRM and TCPA architectures that consume gazillions of instructions and slow down performance to a crawl... will simply be impossible if the virtualisation layer simply ignores these functions in the hardware. Which is why I felt it very strange for the Linux kernel team to get involved in porting these VMs in order to allow Vista to run as a guest OS. It shouldn't have been a priority item for the kernel team at all, IMO.
If you keep throwing chairs, one day you'll break windows....
This is kind of obvious. I used to use more machines for security reasons; now I use fewer machines but they are more powerful. When you do server consolidation, it implies that applications that used to run on different hardware for security and stability reasons will now be running on the same hardware within different VMs. So how can they say "protect applications from the vagaries of the operating environments" is opposed to "consolidating hardware boxes"?
"Consolidating hardware boxes" implies "protect applications from the vagaries of the operating environments"; you just do that with fewer machines.
I use virtualization because it leaves me with fewer physical servers to manage; "protect applications from the vagaries of the operating environments" was already done before virtualization. So virtualization doesn't help me "protect applications from the vagaries of the operating environments", it helps me because I have fewer servers to manage.
Everything I write is lies, read between the lines.
I use VMware servers for software that is node locked. Node locking is usually done by a machine's MAC address, and I find that using VMs reduces downtime in the event of either host or client failing. In the case of the host, if we can recover the VM we just copy it to another host and run it. In the case of the client dying, the great thing is I just create a new VM, change its MAC address to match the dead one, then reinstall my licence files, saving me from having to reregister all of the licences to the "new" machine. Hardware consolidation also plays a large part in my use of VMs, but the main reason is recoverability, so much so that all my DCs are on VMs. If their host dies (hardware other than HDD) then I can either pull the disks and put them in another machine or, if my replication has succeeded more recently, just start my backup copy of the DC and let it update from the domain. Total downtime is about 15min tops.
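A rough Python sketch of the "change its MAC address to match the dead one" step, for anyone who wants to script it. It assumes a VMware-style .vmx text file in which the keys `ethernet0.addressType`, `ethernet0.address` and `ethernet0.generatedAddress` control the NIC's MAC; check your product's documentation before relying on that, since some VMware products may restrict which static addresses they accept. The function name `set_static_mac` is invented for the example.

```python
import re

MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def set_static_mac(vmx_text, mac, nic="ethernet0"):
    """Return vmx_text with `nic` pinned to the given MAC address.

    Assumes VMware-style 'key = "value"' lines; the key names are an
    assumption about the .vmx format, not something this thread confirms.
    """
    if not MAC_RE.match(mac):
        raise ValueError(f"not a MAC address: {mac!r}")
    wanted = {
        f"{nic}.addressType": "static",
        f"{nic}.address": mac.lower(),
    }
    kept = []
    for line in vmx_text.splitlines():
        key = line.split("=", 1)[0].strip()
        # Drop stale values for the keys we set, plus the auto-generated MAC.
        if key in wanted or key == f"{nic}.generatedAddress":
            continue
        kept.append(line)
    kept.extend(f'{key} = "{value}"' for key, value in wanted.items())
    return "\n".join(kept) + "\n"
```

Run it against a copy of the config while the VM is powered off, and keep the original file around in case the guest refuses to start.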
I run a whole bunch of virtual servers and that's exactly what I'm doing.
It's fantastically handy to be able to install and configure a service in the knowledge that no matter how screwed up the application (or, for that matter, how badly I screw it up), it's much harder for that application to mess up other services on the same host - or, for that matter, for existing services to mess up the application I've just set up.
Add to that - anyone who says "Unix never needs to be rebooted" has never dealt with the "quality" of code you often see today. The OS is fine, it's just that the application is quite capable of rendering the host so thoroughly wedged that it's not possible to get any app to respond, it's not possible to SSH in, it's not even possible to get a terminal on the console. But yeah, the OS itself is still running fine apparently, so there's no need to reboot it.
This way I can reboot virtual servers which run one or two services rather than physical servers which run a dozen or more services.
Granted, I could always run Solaris or AIX rather than Linux, but then I'll be replacing a set of known irritations with a new set of mostly unknown irritations, all with the added benefit that so much Unix software never actually gets tested on anything other than Linux these days that I could well find myself with just as many issues.
Isn't this de facto evidence that the sandboxing, which was supposed to be a key component of both Java and .Net's security models, has either failed to deliver on its promises, or simply isn't adequately well engineered to provide protection against rogue applications?
As has been said before, we need a way to grant applications permissions to use resources. We have that, to some degree, with firewalls and apps like ZoneAlarm/LittleSnitch which ask you for permission before an application is allowed to "call home", but what about other resources -- for example, being able to access only a particular directory or install a system-level event hook which acts as a keylogger? etc.
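To make the "access only a particular directory" idea concrete, here is a minimal Python sketch of the check such a permission system would need to make. It is only the policy decision, not an enforcement mechanism: a real sandbox has to be enforced by the OS or runtime, and the function name `is_path_allowed` is made up for the example.

```python
import os


def is_path_allowed(requested, allowed_root):
    """Return True if `requested` resolves to somewhere inside `allowed_root`.

    Symlinks and ".." components are resolved first, so tricks such as
    "../../etc/passwd" or an absolute path are rejected.
    """
    root = os.path.realpath(allowed_root)
    # An absolute `requested` path replaces `root` in the join, and is then
    # rejected below unless it happens to live under `root` anyway.
    target = os.path.realpath(os.path.join(root, requested))
    return os.path.commonpath([root, target]) == root
```

For example, with a sandbox root of `/srv/app/data`, a request for `reports/q1.txt` would pass while `../secrets.txt` would not.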
> What do you run the virtual machine on - an OS!!
Unless you're running Xen, unless you consider Xen an OS. But this brings us back to the question, "what is an OS?"
Xen is a kernel for managing virtualized guests; it sits at Ring 0, where a traditional OS normally resides. Xen requires that a single guest machine be set up to be booted by default, which will receive special privileges for purposes of managing Xen. This special guest is called the "dom0", but is, for all other intents and purposes, just another virtual machine.
Well, yes and no.
As I keep telling people when I work with virtualization, it does not necessarily lead to server consolidation in the logical sense (as in instances of servers); rather, it tends to lead to server propagation. This is probably expected; generally I/O will be lower for a virtual machine than for a physical machine, thus requiring the addition of another node for load balancing in certain circumstances. However, this is not always the case.
Virtualization DOES help lead to BOX consolidation; as in it helps reduce the physical server footprint in a datacenter.
Let me give you my viewpoint on this; generally virtualization is leveraged as a tool to consolidate old servers to bigger physical boxes. Generally, these old servers (out of warranty, breaking/dying and so on) have lower I/O requirements anyway so often see a speed boost going to the new hardware... or at the very least performance remains consistent. However, where new applications are being put on virtual platforms, quite often the requirements of the application cause propagation of servers because of the I/O constraints. This is generally a good thing as it does encourage the developers to write "enterprise ready" applications that can be load balanced instead of focusing on stand-alone boxes with loads of I/O or CPU requirements. This is good for people like me as it provides a layer of redundancy and scalability that otherwise wouldn't be there.
However, the inevitable cost of this is management. While you reduce physical footprint, there are more server instances to manage, thus you need a larger staff to manage your server infrastructure... not to mention the specialized staff managing the virtual environment itself. This is not in itself a bad thing, and generally might lead to better management tools, too... but this is something that needs to be considered in any virtualization strategy.
Generally in a Wintel shop, more new applications get implemented in most companies these days. This is particularly true since most older applications have been or need to be upgraded to support newer operating systems (2003 and the upcoming 2008). This means that the net effect of all I've mentioned is an increase in server instances even while the footprint decreases.
"Containerization" (yuck!) is not new by the way. This is just someone's way of trying to "own" application isolation and sandboxing. People have done that for years, but I definitely see more of it now that throwing up a new virtual machine is seen as a much lower "cost" than throwing up a new physical box. The reality of this is that virtualization is VERY good for companies like Microsoft who sell based on the instances of servers. It doesn't matter if it's VMWare or some other solution; licensing becomes a cash cow rapidly in a virtualized environment.
Where I work we've seen about a 15% net server propagation in the process of migrating systems so far. Generally, low-load stuff like web servers virtualize very well, while I/O intensive stuff like SQL does not. However, a load-balanced cluster pair of virtual machines on different hardware running SQL can outperform SQL running on the same host hardware as a single instance... this means that architecture changes are required, and more software licenses are needed, but the side effect is a more redundant, reliable and scalable infrastructure... and this is definitely a good thing.
I am a big believer in virtualization; it's somewhat harking back to the mainframe days, but this isn't a bad thing either. The hardware vendors are starting to pump out some truly kick-ass "iron" that can support the massive I/O that VM's need to be truly "enterprise ready". I am happy to say that I've been on the leading edge of this for several years, and I plan to stay on it.
Within a VM system, one will now find three types of systems running in the virtual machines.
It is these Service Virtual Machines that equate to the topic of the original post. A SVM usually provides one specific function, and while there may be interdependence between SVMs (for example the TCPIP SVM that provides the TCP/IP stack and each of the individual TCP/IP services), they are pretty much isolated from each other. A failure in a single SVM, while disruptive, usually doesn't impact the whole system.
One of the first SVM's was the Remote Spooling Communication Subsystem (or RSCS). This service allowed two VM systems to be linked together via some sort of communication link -- think UUCP.
The power of SVM's is in the synergy between the Hypervisor system, and a light weight platform for implementing services. The light weight platform itself doesn't provide much in terms of services. There is no TCP/IP stack, no "log in" facility (only relying on the base virtual machine login console), and maybe not even any paging memory (letting the base VM system manage a huge address space). Instead a light weight platform will provide a robust file system, memory management, and task/program management. In IBM's z/VM product, CMS is an example of a light weight platform. The Group Control System (GCS) is another example (GCS was initially introduced to provide a platform to support VTAM - which was ported from MVS).
Part of the synergy between the Hypervisor and the SVMs is that the Hypervisor needs to provide a fast, low-overhead inter-virtual-machine communication path that is not built upon the TCP/IP stack. In other words, the communication between two virtual machines should not require that each virtual machine contain its own TCP/IP stack with its own IP address. Think more along the lines of using the IPC or PIPE model between the SVMs.
Since the SVM itself is not a full suite of services, maintenance and administration is done via meta-administration; in other words, you maintain the SVM service from outside the SVM itself. There is no need to "log into" the SVM to make changes. Instead of the SVM providing a sys-log facility, a common sys-log facility is shared among all the SVM's. Instead of each SVM doing paging, simply define the virtual machine size to meet the storage requirements of the application, and let the Hypervisor manage the real storage and paging.
Maybe a good analogy would be taking a Linux kernel and implementing a service by using the init= parameter in the kernel to invoke a simple set up (mounting the disks) and running just the code needed to perform the service. Communication for other services would be provided via hypervisor PIPEs between the different SVM's. So one would have a TCP/IP SVM that provides the TCP/IP network stack to the outside world, and a web server SVM that provides just the HTTP protocol and base set of applications, using a hypervisor PIPE to talk to the TCP/IP stack. The web server SVM, in turn, would use hypervisor PIPEs to talk to the individual application SVMs.
Many QA people, including myself, use VM as well. Very useful with buggy builds. The best part is sharing the image. I can send a copy of my image to a developer with the reproduced issues without him/her having to come over to see it on my real machine. We still use real machines for testing, but VM is useful.
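As a loose illustration of the PIPE model described above, here is a small Python sketch in which a "web server" talks to an "application" service over a pipe rather than over TCP/IP. It is only an analogy: ordinary OS-level pipes and a thread stand in for hypervisor PIPEs and a separate virtual machine, and the names `app_service` and `handle_request` are invented for the example.

```python
import threading
from multiprocessing import Pipe


def app_service(conn):
    """Hypothetical application SVM: answers requests until it is told to stop."""
    while True:
        request = conn.recv()
        if request is None:  # shutdown signal from the web server side
            break
        conn.send({"status": 200, "body": f"hello from app, path={request['path']}"})
    conn.close()


def handle_request(path):
    """Hypothetical web server SVM: forwards one request over the pipe."""
    web_end, app_end = Pipe()  # duplex pipe, no sockets or IP addresses involved
    worker = threading.Thread(target=app_service, args=(app_end,))
    worker.start()
    web_end.send({"path": path})
    reply = web_end.recv()
    web_end.send(None)  # ask the service to exit
    worker.join()
    web_end.close()
    return reply
```

Neither side needs a network stack of its own here; only a TCP/IP SVM facing the outside world would.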
Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
I think the growing need for virtualisation as a safety/management measure reveals major flaws in the fundamental design philosophy of both operating systems and languages. Specifically, it is becoming abundantly clear now that our existing methods of breaking software into modular components simply don't work. If they worked, we wouldn't need to draw boxes around things at the physical or virtual server level in order to guarantee containment.
I think basically the problem is that our languages still think largely in terms of a single executable process, leaving interactions with hardware, files and other processes up to the operating system, while our operating systems are still mostly geared toward the old timesharing model: how to multiplex access to CPU and random access storage between multiple users. They're too low-level, too close to the hardware. Process tree, file tree, libraries, even component framework, all of these are angles of attack at the problem but not general enough to prevent nasty interactions between themselves - you can't, for example, safely create any kind of 'sub-system' or 'chroot jail' equivalent inside all of the filesystem, hardware, IP address, library/components, and process tree at once. But that's the minimum you need to be able to guarantee that you have a single, isolatable system that can deliver a service. A modern graphical desktop, for example, requires all of: libraries, executables, system config files, user config files, user data, an X server, a time service, a software patch/update service, network access (with ports non-firewalled), many little utility services like D-BUS, clipboard, etc. There's no way you can draw a box around all of those inside an OS with the tools we have now.
So, you boot up a virtual server and do a whole OS install, because you know that works. If you've got the time and a *very* specialised application, like webhosting, you *might* be able to get away with something less than full virtualisation - just virtualising the filesystem, for instance. But it's risky.
What we want is a much more general kind of computing metaphor that takes *a system of components* as a fundamental primitive and allows easy reuse and sandboxing of these as a matter of course. Something like a Plan 9 approach where 'everything is a file' at a radical level, including processes. There would need to be an integrated language that is based around parallel clusters of communicating file-like components rather than serial threads of execution. And make 'duplicate this system, but inside this functional requirements sandbox' be a very, very basic primitive (if not the lowest-level one of them all).
You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC