VM-Based Rootkits Proved Easily Detectable
paleshadows writes "A year and a half has passed since SubVirt, the first VMM (virtual machine monitor) based rootkit, was introduced (PDF), covered in the tech press, and discussed here. Later Joanna Rutkowska made news by claiming she had a VMM-based attack on Vista that was undetectable — a claim that was roundly challenged. Now in this year's HotOS workshop, researchers from Stanford, CMU, VMware, and XenSource have published a paper titled Compatibility Is Not Transparency: VMM Detection Myths and Realities (PDF) showing that VMM-based rootkits are actually easily detectable."
I'm still convinced that it's possible to build a VM that appears, to the software running inside it, to be real hardware.
The paper, however, takes a practical approach and examines how some industry-standard VMs, such as VMware and Virtual PC, actually operate.
Those VMs take plenty of shortcuts to improve performance. Rather than virtualizing every instruction, they remap some of them or "shift rings" of execution wherever possible, so they can take advantage of the hardware while remaining sandboxed. They don't virtualize the clock either, so you can time operations and spot the overhead.
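To make the timing idea concrete, here is a rough sketch. It is not the paper's method: the function names, the baseline figure and the 3x threshold are all my own assumptions, and a real detector would need a time source the VMM can't tamper with.

```python
import statistics
import time


def median_cost_ns(operation, repeats=1000):
    """Return the median wall-clock cost of one call to `operation`, in ns."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        operation()
        samples.append(time.perf_counter_ns() - start)
    return statistics.median(samples)


def looks_virtualized(observed_ns, native_baseline_ns, threshold=3.0):
    """Flag the machine if the operation is much slower than on bare metal.

    native_baseline_ns is a hypothetical figure measured beforehand on
    known-clean hardware of the same model; threshold is an arbitrary guess.
    """
    if native_baseline_ns <= 0:
        raise ValueError("baseline must be positive")
    return observed_ns / native_baseline_ns >= threshold
```

You would pick an operation the VMM has to intercept, measure it with median_cost_ns, and feed the result to looks_virtualized. The median is used so that a few samples hit by interrupts don't skew the verdict.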
A rootkit doesn't compete with other rootkits on performance; it competes on how undetectable it is. That's arguably a different problem. I think we have yet to see how a full-blown VM built to be a rootkit will behave, and whether it will be detectable.
Of course, this basic problem was described quite eloquently by Ken Thompson. He went after the compiler, but the problem of proving that the binary you have matches the source you have is a tricky one no matter what.
There actually are some very clever techniques for catching cheating compilers like this, but none of them is trivial. It's a cat-and-mouse game, and there are actually proofs that neither side can win completely.
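One such technique, as I understand it, is David A. Wheeler's "diverse double-compiling": rebuild the compiler from its source using a second, independently trusted compiler, use the result to rebuild it once more, and compare that against the suspect binary bit for bit. A minimal sketch follows, assuming fully deterministic builds. The run_compiler callable and all the parameter names are hypothetical stand-ins, not any real toolchain's API.

```python
import hashlib
from typing import Callable

# Hypothetical signature: run_compiler(compiler_binary, source) -> output binary.
RunCompiler = Callable[[bytes, bytes], bytes]


def ddc_matches(run_compiler: RunCompiler,
                trusted_compiler: bytes,
                compiler_source: bytes,
                suspect_binary: bytes) -> bool:
    """Return True if the suspect binary is what its source should produce."""
    # Stage 1: build the compiler's source with the independent compiler.
    stage1 = run_compiler(trusted_compiler, compiler_source)
    # Stage 2: let that result rebuild the same source.
    stage2 = run_compiler(stage1, compiler_source)
    # A self-regenerating honest compiler should match stage 2 exactly.
    return hashlib.sha256(stage2).digest() == hashlib.sha256(suspect_binary).digest()
```

A mismatch doesn't prove a backdoor (a non-reproducible build gives the same result), which is part of why none of this is trivial.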
On a native machine we achieved about 55-70 transactions per second; beyond that, the machine's CPU was maxed out. This was a quad Xeon with about 16 GB of RAM. The exact same machine, running ESX as the host with one single VM (our Windows 2003 server), managed about 2-5 transactions per second before the host threw in the towel. I am sure ESX 3 will be faster; this wasn't ESX 3, it was 2.something.
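For a sense of scale, the slowdown implied by those ranges can be worked out directly. This is just arithmetic on the figures quoted above; the helper name is made up for illustration.

```python
def slowdown_range(native_tps, virtual_tps):
    """Return (best_case, worst_case) slowdown factors from two (low, high) ranges."""
    native_lo, native_hi = native_tps
    virt_lo, virt_hi = virtual_tps
    best_case = native_lo / virt_hi   # slowest native run vs. fastest VM run
    worst_case = native_hi / virt_lo  # fastest native run vs. slowest VM run
    return best_case, worst_case


best, worst = slowdown_range((55, 70), (2, 5))
print(f"{best:.0f}x to {worst:.0f}x slower")  # prints "11x to 35x slower"
```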
What I noticed was that:
- VMware has a lot of trouble with applications that do a lot of context switches, basically anything with heavily used object pools. If the CPU has to swap from thread to thread constantly, it kills VMware.
- We ran a few network tests with bizarre results, such as VM network latency being 50% higher. That is a killer for any system trying to reach a decent transactions-per-second rate. We had to de-virtualize our SQL server and SNA gateway because they couldn't hold the load.
- For some odd reason, MOM, anti-virus software and SMS can choke a host with no trouble at all. My hypothesis is that file cache misses are brutal for VMware, especially if other VMs are doing I/O-intensive work.
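If you want to check the context-switch point on your own setup, a crude thread ping-pong test is one way to do it. This is only a sketch, not what we ran: the function name and the round-trip count are arbitrary, and Python's interpreter overhead inflates the absolute numbers, so only the native-versus-VM ratio means anything.

```python
import threading
import time


def pingpong_switches_per_sec(round_trips=2000):
    """Bounce control between two threads and report hand-offs per second."""
    ping = threading.Event()
    pong = threading.Event()

    def responder():
        for _ in range(round_trips):
            ping.wait()    # sleep until the main thread signals
            ping.clear()
            pong.set()     # hand control back

    worker = threading.Thread(target=responder)
    worker.start()
    start = time.perf_counter()
    for _ in range(round_trips):
        ping.set()         # wake the responder
        pong.wait()        # sleep until it answers
        pong.clear()
    elapsed = time.perf_counter() - start
    worker.join()
    # Each round trip is two hand-offs: main -> responder -> main.
    return (2 * round_trips) / elapsed
```

Run it once on bare metal and once inside the guest on the same box, then compare the two figures.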
I wouldn't recommend running any server with moderate to high load as a VM. However, VMware is awesome for very low-load servers: we can easily pack 6-10 of them onto the same dual dual-core Xeon, and could probably fit more.