Attack Steals Crypto Key From Co-Located Virtual Machines
Gunkerty Jeb writes "Side-channel attacks against cryptography keys have, until now, been limited to physical machines. Researchers have long made accurate determinations about crypto keys by studying anything from variations in power consumption to measuring how long it takes for a computation to complete. A team of researchers from the University of North Carolina, University of Wisconsin, and RSA Security has ramped up the stakes, having proved in controlled conditions (PDF) that it's possible to steal a crypto key from a virtual machine. The implications for sensitive transactions carried out on public cloud infrastructures could be severe should an attacker land his malicious virtual machine on the same physical host as the victim. Research has already been conducted on how to map a cloud infrastructure and identify where a target virtual machine is likely to be."
The published paper is an interesting read. Obtaining the crypto key to libgcrypt is only one application. In general, the authors say, it is possible to construct a side-channel attack on other, unrelated, processes in the attacked VM.
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
"public cloud infrastructure". The very thought of that makes me cringe, then laugh at the absurdity of it.
We can't even code bug-free operating systems. What makes anyone think we can code a bug-free hypervisor? I'm still confused as to why people believe that VMs are inherently secure. Are they secure because VMware/Xen/Oracle says they are? Or are they secure because they've been tried and tested in the fires of time? All I ever see about hypervisors these days is inflated marketing terms, new "cloud" interoperability features, or some other random junk that solves an imaginary problem someone first had to go out of their way to create. I've never seen anyone actually come out and say "This version of our hypervisor is even more secure than the last because of XYZ!".
The company I work for makes extensive use of "cloud influenced" features in-house. It's awesome to be able to two-click a LAMP stack into existence through a nice web portal or do the same for a couple of Win2K8 instances. Some idiot was preaching about outsourcing our hardware to someone else and putting everything "in the cloud". Luckily management saw it for the farce it was and put that guy in his place pretty quickly.
So again, I'm really curious as to why people A) trust their services/platforms to someone other than themselves, and B) expect VM hypervisors to be bulletproof.
All timing attacks are done in controlled conditions. This is extremely important. Most of them don't work well, if at all, in busy environments.
And what do you know about who's controlling the conditions when you host your data in "the cloud"?
It appears that the hypervisor leaks data from one VM to another by not clearing a cache.
What is leaked is not actually the data in the cache; another virtual machine running on the same computer cannot access that data. What is leaked is some information about cache usage, which may then allow an attacker to find out what the other VM has been doing. The attacker fills the cache with data, switches to another VM, and when it gets control again, the attacker measures how long it takes to access the data that it put into the cache itself. If it's fast, then the attacker knows that the other VM hasn't touched that part of the cache. If it's slow, the attacker knows that the other VM touched this part of the cache.
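The fill, switch, and re-measure steps described above can be sketched with a toy model. This is a simulation only: there is no real timing here, a "miss" stands in for a slow access, and the names (`ToyCache`, `prime`, `probe`) and the 8-set direct-mapped layout are assumptions made up for the illustration, not anything taken from the paper.

```python
# Toy model of the fill / switch / re-measure idea. No real hardware timing:
# a cache miss stands in for "slow", a hit for "fast".

class ToyCache:
    """Direct-mapped cache: each set holds one line, tagged by its owner."""

    def __init__(self, num_sets):
        self.owner = [None] * num_sets

    def access(self, who, set_index):
        """Return True on a hit, False on a miss, then load the line for `who`."""
        hit = self.owner[set_index] == who
        self.owner[set_index] = who
        return hit


def prime(cache, num_sets):
    # The attacker fills every set with its own data.
    for s in range(num_sets):
        cache.access("attacker", s)


def probe(cache, num_sets):
    # The attacker re-reads its data; a miss means someone else evicted it.
    return [s for s in range(num_sets) if not cache.access("attacker", s)]


NUM_SETS = 8
cache = ToyCache(NUM_SETS)
prime(cache, NUM_SETS)
for s in (2, 5):                  # the other VM runs and touches sets 2 and 5
    cache.access("victim", s)
evicted = probe(cache, NUM_SETS)
print(evicted)                    # [2, 5]
```

The attacker never reads the other VM's data; it only learns which of its own lines were pushed out.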
You can find a more detailed blog post about this here:
http://blog.cryptographyengineering.com/2012/10/attack-of-week-cross-vm-timing-attacks.html
It appears that the hypervisor leaks data from one VM to another by not clearing a cache. If that is all, this leak can be fixed by explicitly clearing the cache when switching to another VM. This will probably cost a few CPU cycles (and cause a few extra cache misses when a VM is resumed).
The problem isn't data leaking but the change in memory access latency when you are on the same CPU where a crypto algorithm is running. The keys can be reverse engineered if the crypto algorithm uses a well-known table. No direct data leakage across VMs is required. This is not a joke; it is effective, but you have to get your VM onto the same server as the VM you are attacking. You can avoid the issue by using a dedicated server in the Amazon cloud case, or an Extra Large VM in Azure.
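To illustrate the point about well-known tables, here is a toy sketch of how a key-dependent table index can leak through cache usage. It is a generic illustration of table-based leakage, not necessarily the method used in the paper. The 16-entries-per-cache-line figure and the function names are assumptions chosen for the example.

```python
ENTRIES_PER_LINE = 16   # assumption: 16 table entries share one cache line


def touched_line(plaintext_byte, key_byte):
    """Cache line of the table entry a toy cipher would read."""
    index = plaintext_byte ^ key_byte        # key-dependent table index
    return index // ENTRIES_PER_LINE


def recover_high_nibble(plaintext_byte, observed_line):
    """Knowing the plaintext and the touched line gives the key's top 4 bits."""
    return observed_line ^ (plaintext_byte >> 4)


secret_key_byte = 0xA7
line = touched_line(0x3C, secret_key_byte)   # what the attacker observes
guess = recover_high_nibble(0x3C, line)
print(hex(guess))                            # 0xa
```

In this model the low four bits stay hidden, because every entry within one line looks the same to the observer.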
This post gives a high-level summary of the attack:
http://blog.cryptographyengineering.com/2012/10/attack-of-week-cross-vm-timing-attacks.html
(I previously posted this as AC, but it vanished.)
No, it isn't, since modern operating systems tend to isolate programs from each other, and in the case of this article the programs are even running in separate virtual machines, which should put a wall between the two. It is only through exploiting the processor cache that the key could be extracted. The attacker monitors how the victim fills the instruction cache. Since the victim's crypto algorithm follows different code paths depending on the key, the researchers were able to determine the key.
This kind of side-channel attack was not universally thought practical, so this is news, and it would be good to think about how to mitigate the problem.
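As a rough sketch of what "different code paths depending on the key" means, consider textbook square-and-multiply exponentiation. This is a toy, not libgcrypt's code, and the function names are made up here. The `trace` list stands in for what an observer of the instruction cache might infer about which routine ran.

```python
def modexp_with_trace(base, exponent, modulus):
    """Left-to-right square-and-multiply; records which routine ran each step."""
    result = 1
    trace = []
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        trace.append("S")                     # a square happens for every bit
        if bit == "1":
            result = (result * base) % modulus
            trace.append("M")                 # a multiply happens only for 1 bits
    return result, trace


def exponent_from_trace(trace):
    """Rebuild the exponent from the sequence of squares and multiplies."""
    bits = []
    for op in trace:
        if op == "S":
            bits.append("0")
        else:                                 # "M" marks the preceding bit as 1
            bits[-1] = "1"
    return int("".join(bits), 2)


secret = 0b101101
value, trace = modexp_with_trace(7, secret, 1009)
recovered = exponent_from_trace(trace)
print(recovered == secret)                    # True
```

A real attacker sees a noisy, partial version of such a trace, which is why the lab conditions matter.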
If the attacker has sufficient access to the host to study the VM's execution profile, then I suspect the attacker can do a lot more than capture that key. In the uses I would expect people to care about (web services running on a VM on the net), this implies that the attacker already has SSH access to the host machine, so an assumption of this vulnerability is that the host system is completely compromised. In such a case the attacker will have other options to get that key. As a side note, I don't know if it would be a good thing to fix this problem: you pay for performance on web servers, so disguising the computation to mitigate such an attack would cost far more than it would achieve.
I doubt that. Pulling this attack off in an UNcontrolled environment will be damn near next to impossible, no matter how good these people think they are.
Modern clouds shuffle VM execution in realtime from hardware to hardware to hardware on a continuous basis, depending on where resources are available, and where they are in demand, at any given time.
In any case, a user's inability to use a system properly is not cause for AMD or Intel to run off and start changing their architecture. This "problem" is one that is incumbent upon the customers of the cloud service to fix, by not being stupid and putting national security stuff in the public cloud where it can be stolen.
A few factors come to mind:
1. The fact that certain not-officially-known nation states have pulled off highly sophisticated attacks is a matter of public knowledge.
2. A 'nation-state', as an entity that gets to collect taxes and is charged with assorted non-market processes like 'defense', has much broader ability to do things that make minimal financial sense. If you are worried about spammers, or PIN-skimmers, or whatnot it suffices to be more expensive to attack than your resources are worth. If you are in some clandestine entity's sights, you actually have to be hard, rather than simply uneconomic.
3. Even if (and this is far from obviously true) all state employees are a bunch of drooling idiots herded by ideological Kommisars, they always have the option of contracting the attack out to their private sector superiors. Plenty of contractors specialize in electronic attack tools, or have a department that does, and they'll hold your hand every step of the way if you cut them large enough checks.
just bind the cryptographically sensitive process to a dedicated processor.
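A minimal sketch of binding a process to one CPU, assuming Linux and Python's `os.sched_setaffinity`. The helper name `pin_to_cpu` is made up here. Note the assumption baked in: this only pins the process within one operating system. Whether a hypervisor keeps that virtual CPU on a dedicated physical core is a separate setting not shown here.

```python
import os


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU and return the new CPU set."""
    if not hasattr(os, "sched_setaffinity"):
        # The affinity calls exist only on some Unix platforms (e.g. Linux).
        raise NotImplementedError("CPU affinity is not available on this platform")
    os.sched_setaffinity(0, {cpu})            # pid 0 means the calling process
    return os.sched_getaffinity(0)


if hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)         # CPUs we may currently run on
    chosen = min(allowed)
    print(pin_to_cpu(chosen))                 # a one-element set containing `chosen`
```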
In other words, this exploit requires: knowing what cryptographic software is being run, the presence of Xen and an apparent security hole therein, and lucky core colocation of the VMs in an environment that could easily have dozens of VMs running against more than a dozen cores "over the course of a few hours".
In short, all of this is unlikely to be reproducible outside of a lab.
Problem is that a leak of any PII data for customers of a business is a PR nightmare and potentially a largish lawsuit costing millions. Millions may be more than was saved by virtualizing.
A fool throws a stone into a well and a thousand sages can not remove it.
It also appears that it doesn't work if there are more than two virtual machines running on the same physical CPU, or if the attacking VM is the only one running on a given CPU.
With 3 or more VMs on the same CPU, the cache gets populated by virtual machines other than the targeted "victim" machine, so the attacker doesn't know which is affecting what. And if the attacking VM is alone on the CPU, it can't find any other VMs to attack.
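The attribution problem with a third tenant can be shown with a toy model. The tenant names, set numbers, and the helper `evicted_sets` are all hypothetical; this is a sketch of the reasoning, not a measurement.

```python
def evicted_sets(num_sets, touches_by_tenant):
    """Attacker fills every set, other tenants run, attacker checks what it lost."""
    owner = ["attacker"] * num_sets                   # attacker fills the cache
    for tenant, sets in touches_by_tenant.items():
        for s in sets:
            owner[s] = tenant                         # another VM evicts a line
    # The attacker learns only *that* a set was evicted, not who evicted it.
    return [s for s in range(num_sets) if owner[s] != "attacker"]


victim_only = evicted_sets(8, {"victim": [2, 5]})
with_bystander = evicted_sets(8, {"victim": [2, 5], "bystander": [1, 5, 6]})
print(victim_only)      # [2, 5]
print(with_bystander)   # [1, 2, 5, 6]
```

With only the victim present, the observation matches the victim's activity exactly. With a bystander, the attacker sees a superset and has to filter out the noise somehow.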
---------
There is inferior bacteria on the interior of your posterior.
The Stuxnet attack vector was probably thought of in the same way, until it was used. When there is a high-value target, getting all the ducks in a row is not impossible; it's the reason professionals are called in. You only have to make it work once (though you have to avoid getting caught on all the other attempts).
Millions aren't often saved by virtualizing your hardware. Almost all the numbers I've seen show that running on virtual hardware is actually more costly than running on your own servers once you amortize the price of the servers over their lifespan. Often buying your own hardware pays for itself within a year. Hosting in the cloud makes sense in a small number of cases where you have wildly varying amounts of traffic and need to scale up and down very quickly in response to big load changes. It also allows you to get some nice servers on day one without much capital investment. But that's not being very business smart. If the servers can pay for themselves in the first year, you should really be buying the servers. The cloud can also be very cheap if you are using almost no resources, but I would consider that more of a home project than something a business would be looking at.
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
If it's slow, the attacker knows that the other VM touched this part of the cache.
Um... no. It knows that the cache has been touched by another VM. There's no guarantee that it was the target VM.
When our name is on the back of your car, we're behind you all the way!
this exploit requires: knowing what cryptographic software is being run, the presence of Xen and an apparent security hole therein, and lucky core colocation of the VMs in an environment that could easily have dozens of VMs running against more than a dozen cores "over the course of a few hours".
It doesn't seem that far fetched to me. Call up the cloud provider as a customer and ask what technology they use. If they say Xen, go ahead, if not find another cloud provider.
Then you guess what cryptography software is likely to be in use. AES on LUKS is a very common setup. Since multiple VMs are likely to be sharing the same hardware, this increases your chance of a hit.
Then you wait. Yes, it might take a while for the two VMs to coincide on the same CPU, but it will happen.
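The "it will happen eventually" argument can be put as a back-of-the-envelope formula. This assumes each attempt (for example, each newly launched instance) lands on the victim's hardware independently with the same probability; both the independence and the numbers below are hypothetical, not figures from the paper or any provider.

```python
def chance_of_colocation(p_per_attempt, attempts):
    """P(at least one hit) = 1 - (1 - p)^n for n independent attempts."""
    return 1.0 - (1.0 - p_per_attempt) ** attempts


# Hypothetical numbers: a 1% chance per attempt, 300 attempts.
print(round(chance_of_colocation(0.01, 300), 2))   # about 0.95
```

Under those assumptions, even a small per-attempt chance adds up if the attacker is patient and attempts are cheap.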
Give me Classic Slashdot or give me death!
would be good to think about how to mitigate this problem.
Simple: Use a different core / cache for different VM instances... Oh, wait.
If necessary you could simply do unnecessary work on one of the code paths so that they end up doing the same amount of work on each path.
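One way to picture "do unnecessary work so both paths look the same" is the square-and-multiply-always variant of modular exponentiation. The sketch below is a toy with a made-up function name. It shows only that the sequence of operations no longer depends on the key bits. Python integers are not constant-time, and the final selection is still a branch, so this is not a hardened implementation.

```python
def modexp_balanced(base, exponent, modulus):
    """Square-and-multiply-always: every bit costs one square and one multiply."""
    result = 1
    trace = []
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        trace.append("S")
        product = (result * base) % modulus   # computed even when not needed
        trace.append("M")
        if bit == "1":
            result = product                  # keep the product only for 1 bits
    return result, trace


value_a, trace_a = modexp_balanced(7, 0b100001, 1009)
value_b, trace_b = modexp_balanced(7, 0b111111, 1009)
print(trace_a == trace_b)                     # True
```

Two exponents of the same bit length now produce identical operation traces, at the cost of the extra multiplies.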
Suppose I have 4 build machines, each running a different OS or version of the OS. At any given time I only need to be building 1 version.
If I virtualize them, I can use one machine (with 4x the disk space). Even accounting for reliability (and getting better redundancy than before) I can get away with 2 machines instead of 4.
Very true, but the burden of proof is on the victim. A PII loss really means nothing to a company other than a couple articles of bad press. Sony came out of the PSN compromise unscathed. Other companies have had break-ins, and they are not the worse for wear for the incidents, regardless of how things are handled.
The only organizations which actually would be held to task for break-ins would be government stuff. A private company losing data is considered normal. A government agency losing the same data will get people up in arms.
It depends on the task at hand:
I have multiple virtual machines for various tasks, and it isn't just for security. It is also for separation of duties:
One VM runs Quickbooks. This is stored on a USB flash drive so I can do accounting on any machine, then physically lock up the drive when done. Unless a remote intruder is savvy enough to nail my machine while the VM is active, my Quickbooks data is fairly protected, since when it isn't in use, the external drive is stashed in a safe.
Another VM has Windows and some potential client information. I don't want this information to end up in my personal stuff, so it stays in the VM, and with the VM disks encrypted, all data stays protected regardless of where it sits.
A third VM is for anonymous Web browsing. It has Sandboxie and other tools to make it difficult for malware to get out and about. Nothing is 100% secure, but unless there is an F0 0F-like bug that can get something in ring 3 into ring 0 on x86, it does the job.
A fourth VM is used for Mozy/Carbonite/etc. It shares TrueCrypt volumes via CIFS which are mounted to other machines. This sounds roundabout, but it ensures that if the backup client got compromised, it wouldn't spread outside the VM, and the only data it works with is encrypted.
A fifth VM is what I use for GPG and documents. This is stashed on a USB flash drive, so when I'm done signing/decrypting files, the private keys are physically offline. Of course, a dedicated intruder can still get those, but it limits the avenues of attack.
VMs have a lot of advantages. I like using them for isolation so data done for a certain task stays in one place.
It can be asserted that running under a user is good enough.
However, the advantage of VM-level isolation is that everything related to a project (apps, data, even OS modifications) is stashed in one place. This can be done with users to a limited degree, but being able to have the complete OS with everything needed to run a specific application stored in one place is important. If done right, the VM doesn't care what hardware it runs on, so a future computer that might be ARM but translates x86 opcodes will be able to run the VM.
Then, there is the fact that malware can phone home. Having it only be able to access and report about a VM gives an attacker less info than if it is able to find what users a remote site possesses on its machines.
Wow, what browser do you use? You *might* be able to make an argument for netcat being secure, though I sure wouldn't bet on that. Firefox, Chrome, Opera, Safari, and IE have all had vulnerabilities discovered in the last year. Most of them were rapidly patched, and in some cases nobody other than the developers would ever have learned of the vulnerability if not for the patch notes, but I can guarantee you that they didn't find them all!
Also, logging into another account is insufficient. Just as your browser is vulnerable, so is your OS; I'm absolutely certain there are local EOP vulns in it somewhere. What good is logging into account B to protect account A when the exploit is running as root? Of course, the same argument could be made for VMs - once the guest machine is taken over, you're now trusting your hypervisor to keep your main OS from attacks, but there's probably some vuln in the hypervisor too (although this is a bit less likely than in a more complex piece of software like an OS). It goes on even further from there, too; attacks against the network, the hardware, your local power grid or ISP...
At some point, you simply must decide that there is *enough* security, and work from there. For most people, a fully-patched browser, preferably with Flash and Java disabled and possibly also JavaScript, running as a limited user with ideally some sandboxing around the browser process, is sufficient. That doesn't actually make you not vulnerable, though.
There's no place I could be, since I've found Serenity...