Attack Steals Crypto Key From Co-Located Virtual Machines
Gunkerty Jeb writes "Side-channel attacks against cryptography keys have, until now, been limited to physical machines. Researchers have long made accurate determinations about crypto keys by studying anything from variations in power consumption to measuring how long it takes for a computation to complete. A team of researchers from the University of North Carolina, University of Wisconsin, and RSA Security has ramped up the stakes, having proved in controlled conditions (PDF) that it's possible to steal a crypto key from a virtual machine. The implications for sensitive transactions carried out on public cloud infrastructures could be severe should an attacker land his malicious virtual machine on the same physical host as the victim. Research has already been conducted on how to map a cloud infrastructure and identify where a target virtual machine is likely to be."
The published paper is an interesting read. Obtaining the crypto key to libgcrypt is only one application. In general, the authors say, it is possible to construct a side-channel attack on other, unrelated, processes in the attacked VM.
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
"public cloud infrastructure". The very thought of that makes me cringe, then laugh at the absurdity of it.
We can't even code bug-free operating systems. What makes anyone think we can code a bug-free hypervisor? I'm still confused as to why people believe that VMs are inherently secure. Are they secure because VMware/Xen/Oracle says they are? Or are they secure because they've been tried and tested in the fires of time? All I ever see about hypervisors these days is inflated marketing terms, new "cloud" interoperability features, or some other random junk that solves an imaginary problem someone first had to go out of their way to create. I've never seen anyone actually come out and say "This version of our hypervisor is even more secure than the last because of XYZ!".
The company I work for makes extensive use of "cloud influenced" features in-house. It's awesome to be able to two-click a LAMP stack into existence through a nice web portal or do the same for a couple of Win2K8 instances. Some idiot was preaching about outsourcing our hardware to someone else and putting everything "in the cloud". Luckily management saw it for the farce it was and put that guy in his place pretty quickly.
So again, I'm really curious as to why people A) explicitly trust their services/platforms to someone other than themselves, and B) expect that VM hypervisors are bulletproof.
It appears that the hypervisor leaks data from one VM to another by not clearing a cache.
What is leaked is not actually the data in the cache; another virtual machine running on the same computer cannot access that data. What is leaked is some information about cache usage, which may then allow an attacker to find out what the other VM has been doing. The attacker fills the cache with data, switches to another VM, and when it gets control again, the attacker measures how long it takes to access the data that it put into the cache itself. If it's fast, then the attacker knows that the other VM hasn't touched that part of the cache. If it's slow, the attacker knows that the other VM touched this part of the cache.
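The fill, switch, and measure steps described above can be sketched as a toy simulation. This is only an illustration under loud assumptions: the cache is a hypothetical 8-set direct-mapped model, and a "slow" access is modelled as finding someone else's line in the set rather than measured with a real timer. The names `prime`, `victim_run`, and `probe` are made up for this sketch.

```python
NUM_SETS = 8  # assumption: tiny direct-mapped cache, one line per set

def prime(cache):
    """Attacker fills every cache set with its own data."""
    for s in range(NUM_SETS):
        cache[s] = "attacker"

def victim_run(cache, touched_sets):
    """The victim's accesses evict the attacker's lines in the sets it uses."""
    for s in touched_sets:
        cache[s] = "victim"

def probe(cache):
    """Attacker re-reads its own data; a miss (slow access) marks a set
    that the victim touched in the meantime."""
    slow_sets = []
    for s in range(NUM_SETS):
        if cache[s] != "attacker":  # modelled as a slow access
            slow_sets.append(s)
        cache[s] = "attacker"       # re-reading reloads the line into the set
    return slow_sets

def prime_probe(touched_sets):
    cache = [None] * NUM_SETS
    prime(cache)
    victim_run(cache, touched_sets)
    return probe(cache)

print(prime_probe({2, 5}))  # [2, 5]
```

The attacker never reads the victim's data here; it only learns which sets the victim used, which is exactly the kind of usage information the comment describes.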
You can find a more detailed blog post about this here:
http://blog.cryptographyengineering.com/2012/10/attack-of-week-cross-vm-timing-attacks.html
It appears that the hypervisor leaks data from one VM to another by not clearing a cache. If that is all, this leak can be fixed by explicitly clearing the cache when switching to another VM. This will probably cost a few CPU cycles (and cause a few extra cache misses when a VM is resumed).
The problem isn't data leaking but the change in memory access latency when you run on the same CPU as a crypto algorithm. The keys can be reverse engineered if the crypto algorithm uses a well-known table. No direct data leakage across VMs is required. This is not a joke; it is effective, but you have to get your VM onto the same server as the VM you are attacking. In Amazon's cloud you can avoid the issue by using a dedicated server, and in Azure by using an Extra Large VM.
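One way to read the "well known table" remark is the classic table-lookup leak, sketched below. Everything here is an assumption for illustration: the lookup index is taken to be plaintext XOR key (as in the first round of table-based AES), and a cache line is assumed to hold 16 table entries. This is not necessarily how the attack in the paper works; the function names are hypothetical.

```python
ENTRIES_PER_LINE = 16  # assumption: 64-byte cache line, 4-byte table entries

def observed_cache_line(plaintext_byte, key_byte):
    """Cache line of the table that the victim's lookup touches."""
    index = plaintext_byte ^ key_byte
    return index // ENTRIES_PER_LINE

def candidate_key_bytes(plaintext_byte, line):
    """Key bytes consistent with a known plaintext byte and an observed line."""
    return [k for k in range(256)
            if observed_cache_line(plaintext_byte, k) == line]

secret_key_byte = 0xA7
plaintext_byte = 0x3C
line = observed_cache_line(plaintext_byte, secret_key_byte)
candidates = candidate_key_bytes(plaintext_byte, line)
print(len(candidates), secret_key_byte in candidates)  # 16 True
```

Under these assumptions a single observation narrows the key byte from 256 possibilities to 16, without the attacker ever reading the victim's memory.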
This post gives a high-level summary of the attack:
http://blog.cryptographyengineering.com/2012/10/attack-of-week-cross-vm-timing-attacks.html
(I previously posted this as AC, but it vanished.)
No, it isn't, since modern operating systems tend to isolate programs from each other, and in the case of this article the programs are even running in separate virtual machines, which should put a wall between the two. It is only through exploiting the processor cache that the key could be extracted. The attacker monitors how the victim fills the instruction cache. Since the victim's crypto algorithm follows different code paths depending on the key, the researchers were able to determine the key.
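To see why key-dependent code paths are a problem, here is a minimal sketch assuming a textbook left-to-right square-and-multiply exponentiation. The `trace` string stands in for what an attacker might infer from instruction cache activity ("S" for the squaring code, "M" for the multiply code); the real signal is far noisier, and `recover_exponent` is a hypothetical helper for illustration only.

```python
def square_and_multiply(base, exponent, modulus):
    """Textbook exponentiation; the multiply only runs for 1 bits."""
    result = 1
    trace = []
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        trace.append("S")
        if bit == "1":                      # key-dependent code path
            result = (result * base) % modulus
            trace.append("M")
    return result, "".join(trace)

def recover_exponent(trace):
    """Read the exponent back from a noise-free operation trace."""
    bits = []
    i = 0
    while i < len(trace):
        if i + 1 < len(trace) and trace[i + 1] == "M":
            bits.append("1")                # square followed by multiply
            i += 2
        else:
            bits.append("0")                # square alone
            i += 1
    return int("".join(bits), 2)

value, trace = square_and_multiply(7, 11, 1009)
print(trace, recover_exponent(trace))  # SMSSMSM 11
```

With a perfectly clean trace, the sequence of operations spells out the secret exponent bit by bit.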
This kind of side-channel attack was not universally thought to be practical, so this is news, and it would be good to think about how to mitigate the problem.
In other words, this exploit requires: knowing what cryptographic software is being run, the presence of Xen and an apparent security hole therein, and lucky core colocation of the VMs in an environment that could easily have dozens of VMs running against more than a dozen cores "over the course of a few hours".
In short, all of this is unlikely to be reproducible outside of a lab.
Problem is that a leak of any PII data for customers of a business is a PR nightmare and potentially a largish lawsuit costing millions. Millions may be more than was saved by virtualizing.
A fool throws a stone into a well and a thousand sages can not remove it.
Millions aren't often saved by virtualizing your hardware. Almost all numbers I've seen show that running on virtual hardware actually costs more than running on your own servers once you amortize the price of the servers over their lifespan. Often buying your own hardware pays for itself within a year. Hosting in the cloud makes sense in a small number of cases, where you have wildly varying amounts of traffic and need to scale up and down very quickly in response to big load changes. It also allows you to get some nice servers on day one without much capital investment. But that's not being very business smart. If the servers can pay for themselves in the first year, you should really be buying the servers. The cloud can also be very cheap if you are using almost no resources, but I would consider that more of a home project than something a business would really be looking at.
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
If necessary you could simply do unnecessary work on one of the code paths so that they end up doing the same amount of work on each path.
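A minimal sketch of that idea, assuming a textbook square-and-multiply exponentiation as the code being protected: the multiply is performed for every bit and simply discarded when the bit is 0, so the sequence of operations no longer depends on the key. The `trace` string is only a stand-in for what an observer of the code paths would see. Python itself makes no timing guarantees, and the final selection below is still a conditional, so treat this as an illustration of the principle rather than a hardened implementation.

```python
def square_and_multiply_always(base, exponent, modulus):
    """Exponentiation that does the same work on every bit."""
    result = 1
    trace = []
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        trace.append("S")
        product = (result * base) % modulus  # always computed, even for 0 bits
        trace.append("M")
        # Keep the product only for 1 bits; otherwise it was dummy work.
        result = product if bit == "1" else result
    return result, "".join(trace)

value_a, trace_a = square_and_multiply_always(7, 0b1011, 1009)
value_b, trace_b = square_and_multiply_always(7, 0b1000, 1009)
print(trace_a == trace_b)  # True
```

Two exponents of the same bit length now produce the same operation sequence, at the cost of one extra multiply for every 0 bit.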