Xen Patches 7-Year-Old Bug That Shattered Hypervisor Security (arstechnica.com)

Posted by samzenpus on Thursday October 29, 2015 @09:34PM from the the-fix-is-in dept.

williamyf writes: Ars Technica, The Register, and other outlets are reporting that today the Xen project patched a vulnerability in paravirtualized (PV) VMs that allowed a guest to access the control OS of the hypervisor. Qubes researchers wrote: "On the other hand, it is really shocking that such a bug has been lurking in the core of the hypervisor for so many years. In our opinion the Xen project should rethink their coding guidelines and try to come up with practices and perhaps additional mechanisms that would not let similar flaws to plague the hypervisor ever again".

XEN PV mode is dead by Anonymous Coward · 2015-10-29 22:23 · Score: 5, Interesting

The truth is nobody uses para-virtualized VMs anymore. EC2, which was the last bastion for PV Xen, stopped using it a couple of years ago and moved entirely to the HVM model. I'm not even sure that the latest Linux kernel support are compiled with Xen PV support. If you have looked at the kernel code for PV Xen support, you know what a mess that was, so good riddance. You need to understand what PV mode means for hypervisors: a kernel must be specifically modified to talk to a hypervisor, so instead of performing a privileged CPU instruction it calls a hypervisor-provided function. I'm sure there were tons of security issues with that approach and many still exist. Anyway, the PV model is not relevant anymore since Intel introduced hardware virtualization on the CPU. PV was introduced to improve the performance of VMs, but it's not relevant anymore.
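The paravirtualization idea described above (the guest kernel asks the hypervisor to perform a privileged operation instead of performing it itself) can be sketched as a toy model. This is an illustration under assumptions, not Xen's interface: `ToyHypervisor`, `ToyPVKernel`, and `hypercall_map_page` are hypothetical names, and real hypercalls are far more involved.

```python
class ToyHypervisor:
    """Stands in for the hypervisor; tracks which page frames each guest owns."""

    def __init__(self):
        self.owned = {}        # guest name -> set of frame numbers it may map
        self.page_tables = {}  # guest name -> {virtual page: frame number}

    def add_guest(self, name, frames):
        self.owned[name] = set(frames)
        self.page_tables[name] = {}

    def hypercall_map_page(self, guest, vpage, frame):
        # The hypervisor, not the guest, performs the privileged update,
        # and only after checking the request (hypothetical check).
        if frame not in self.owned[guest]:
            raise PermissionError("frame not owned by guest")
        self.page_tables[guest][vpage] = frame


class ToyPVKernel:
    """A guest kernel modified to ask the hypervisor instead of acting itself."""

    def __init__(self, name, hypervisor):
        self.name = name
        self.hypervisor = hypervisor

    def map_page(self, vpage, frame):
        # Where an unmodified kernel would write the page table directly,
        # the PV kernel issues a call into the hypervisor.
        self.hypervisor.hypercall_map_page(self.name, vpage, frame)


hv = ToyHypervisor()
hv.add_guest("guest-a", frames=[10, 11])
kernel = ToyPVKernel("guest-a", hv)
kernel.map_page(vpage=0, frame=10)   # allowed: guest-a owns frame 10
```

In a model like this, isolation rests entirely on the checks inside the hypervisor-provided function, which is why a flaw in such a check is so serious.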
1. Re:XEN PV mode is dead by drinkypoo · 2015-10-29 23:16 · Score: 1
  
  I'm not even sure that the latest Linux kernel support are compiled with Xen PV support.
  You mean, you're not sure if it defaults to Y? Or whether common distributions are enabling it when they build the kernel? The answer to the latter, at least, is yes. AFAICT, though, most people still using PV are using KVM. The rest are using containers.
  
  --
  "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
2. Re:XEN PV mode is dead by engun · 2015-10-29 23:45 · Score: 2
  
  PV and HVM are not mutually exclusive. The basic idea of having a modified, "virtualization-aware" guest-os (a.k.a PV) is a good one and results in better performance. Hardware Virtualization often simplifies the implementation of both the hypervisor and PV on the guest, but does not obviate the need for PV. Using both in tandem can result in even greater performance gains. http://wiki.xen.org/wiki/PV_on...
3. Re:XEN PV mode is dead by TheRaven64 · 2015-10-29 23:57 · Score: 2
  
  You probably want to link to PVH, not PVHVM for the real relevant approach. That said, HVM usually implies, at a minimum, hardware support for nested page tables. The bug in question is only present when using shadow page tables. Even if you're using PV devices in an HVM or PVH VM, you're using the hardware page tables.
  
  --
  I am TheRaven on Soylent News
4. Re:XEN PV mode is dead by steamraven · 2015-10-30 03:23 · Score: 1
  
  That may be the case for cloud deployment. However, there are other very important areas where PV is being used. For example: Qubes, a security-focused Linux distribution https://www.qubes-os.org/.
  In addition, there is actually a full spectrum between PV and HVM: http://wiki.xen.org/wiki/Xen_P.... Very few use straight HVM; generally it is HVM + PV drivers. Linux on Xen ends up using PVHVM. The sweet spot for open source OSes under Xen is PVH.
Re:General advice, sir yes sir! by Rei · 2015-10-29 23:06 · Score: 1

Every piece of software contains at least one bug.
Also, every piece of software code can be shortened.
Therefore, every program can be shortened down to an empty source file which doesn't work.

--
"Oh, goodness. Look at my wrist, I have to go." "But what about your clothes?" "I don't love these."
Re:conspiracy thinking thread starts here. by slashdot_commentator · 2015-10-29 23:27 · Score: 1

It never really had "relevant" market share. Hypervisors have only been on the market for roughly ten years, and it's not like they have been running nuclear power plants. Its closest relative, the microkernel, has, in QNX and other proprietary products.

--
There is no America. There is no democracy. There is only IBM and AT&T and DuPont, Dow, General Electric, and Exxon
Or just broke it by argStyopa · 2015-10-29 23:33 · Score: 1

"Shattered" really?
What the hell is in charge of the Gawker-style headlines, because I think that same robot should be made responsible for editing: at least we know it's working.

--
-Styopa
1. Re:Or just broke it by drinkypoo · 2015-10-29 23:50 · Score: 1
  
  What the hell is in charge of the Gawker-style headlines, because I think that same robot should be made responsible for editing: at least we know it's working.
  You won't believe this one annoying trick for making your clickbait better bait. Advertising works, that's why people use it. Propaganda, likewise.
  
  --
  "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
2. Re:Or just broke it by gstoddart · 2015-10-30 04:40 · Score: 1
  
  Do you understand the function of a hypervisor? Do you understand how tremendously BAD it is if a guest OS can control the hypervisor?
  
  For seven years, Xen virtualization software used by Amazon Web Services and other cloud computing providers has contained a vulnerability that allowed attackers to break out of their confined accounts and access extremely sensitive parts of the underlying operating system.
  So, imagine ... all these people selling cloud services, making millions and millions of dollars ... now, imagine that those things in the cloud can control the infrastructure for the cloud, when they should have no way in hell of doing that.
  
  "The above is a political way of stating the bug is a very critical one," researchers with Qubes OS, a desktop operating system that uses Xen to secure sensitive resources, wrote in an analysis published Thursday. "Probably the worst we have seen affecting the Xen hypervisor, ever. Sadly."
  For a hypervisor, this is pretty much an epic fail.
  
  --
  Lost at C:>. Found at C.
With enough Eyes.. Blah Blah Blah by williamyf · 2015-10-29 23:54 · Score: 2

ESR Was wrong. Enough Eyes are not enough!
One needs Enough QUALIFIED AND MOTIVATED eyes, as well as proper test cases, a Quality Assurance group and Technical Guidelines.

--
*** Suerte a todos y Feliz dia!
1. Re:With enough Eyes.. Blah Blah Blah by serviscope_minor · 2015-10-30 02:06 · Score: 1
  
  No you're wrong because you have woefully misunderstood the quote.
  He said with enough eyes all bugs are shallow.
  Not "with enough eyes no bugs exist ever".
  It means that given enough people, once a bug manifests then it's shallow, i.e. easy to fix, for someone.
  
  --
  SJW n. One who posts facts.
2. Re:With enough Eyes.. Blah Blah Blah by LichtSpektren · 2015-10-30 02:10 · Score: 1
  
  Proprietary software: lawful users may never know about critical security exploits. Even if they do, they are at the mercy of the software's owner; if the owner tells you to toss off, you're SOL.
  
  FLOSS software: anybody can discover a bug, notify the maintainer, and have it fixed promptly. Even if the maintainer won't do it, one also has the freedom to make the fix and recompile the source on one's own.
Re:General advice, sir yes sir! by Bengie · 2015-10-30 00:23 · Score: 1

It was almost as bad as a "if(true == true)" "bug". They should have unit tested this to make sure it failed as expected. Always test your edge cases, and also try to test corner cases. But really. Who doesn't test their edge cases? They're the simplest tests to identify.
Bugs happen, even in hypervisors by martyros · 2015-10-30 00:52 · Score: 2

Go to LWN's search page, uncheck all the boxes except for "security vulnerabilities", and then search for "KVM". Or Qemu, or Linux or Xen.
You'll find that all hypervisors have privilege escalation bugs discovered. However, this is the first one discovered in the Xen PV interface in a long time.

--
TCP: Why the Internet is full of SYN.
Classic Open Source by Luthair · 2015-10-30 01:31 · Score: 1

People who aren't going to participate proclaiming how others should do something.
1. Re:Classic Open Source by goarilla · 2015-10-30 03:29 · Score: 2
  
  The Qubes OS people do participate in Xen's development (http://xenbits.xen.org/gitweb/?p=xen.git&a=search&h=HEAD&st=commit&s=marek).
Criticizers. by Lisias · 2015-10-30 01:50 · Score: 1

On the other hand, it would be a good idea for people to stop harassing open source projects when serious and/or old bugs are discovered *and* fixed.
Nasty 7-year-old bug discovered? Bad indeed.
Nasty 7-year-old bug *FIXED*? Good, very very good.
Once you decide not to throw everything through the Windows, I mean, window every year ("fixing" old bugs with new bugs), you must expect that old flaws will one day be discovered. And fixed.
There are too many critics nowadays, but almost none of them have gotten their hands dirty enough to know what they are criticizing.

--
Lisias@Earth.SolarSystem.OrionArm.MilkyWay.Local.Virgo.Universe.org
Re:Bug in English by CurryCamel · 2015-10-30 01:56 · Score: 2

that would not let similar flaws to plague the hypervisor ever again
Can we trust people to critique code who can't even manage English grammar?
Yes. Very few program is written in English. C is more common.
And looking at the Qubes OS team https://www.qubes-os.org/team/, I'd bet English isn't the primary language for most of them.
Re:General advice, sir yes sir! by angel'o'sphere · 2015-10-30 01:59 · Score: 1

What is the difference between an edge case and a corner case in testing?
For a non-native English speaker they seem the same to me.

--
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
I have no idea what Hypervisor is but... by wardrich86 · 2015-10-30 02:00 · Score: 1

I have no idea what Hypervisor is but all I keep envisioning is Pin*Bot.
Misunderstanding ESR. Shallow, not non-existent by raymorris · 2015-10-30 02:27 · Score: 4, Insightful

ESR didn't say "given enough eyeballs, no bugs exist."
He said they are -shallow-. "The fix will be obvious to someone". That is, you won't spend a month trying to figure out exactly why foo sometimes conflicts with widget. With several people looking at the source (not just the output of the binary), someone will more quickly see why foo conflicts with widget and how to fix it.
It looks like in this case it was about 48 hours or so to characterize the problem, agree on the proper fix, code it, test, patch the major public clouds, and release it publicly. Guessing that patching the public clouds took 24 hours, that's about 24 hours for understanding the problem, discussing it, fixing it, and testing. Not bad. Here's a quote from CATB with the context of the "bugs are shallow" part:
---- ... if any serious bug proved intractable. Linus was behaving as though he believed something like this:
8. Given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix obvious to someone.
Or, less formally, ``Given enough eyeballs, all bugs are shallow.'' I dub this: ``Linus's Law''.
My original formulation was that every problem ``will be transparent to somebody''. Linus demurred that the person who understands and fixes the problem is not necessarily or even usually the person who first characterizes it. ``Somebody finds the problem,'' he says, ``and somebody else understands it.
----
It's about bugs not being intractable - they aren't extremely hard to figure out, "the fix will be obvious to someone". That doesn't mean they never existed.
1. Re:Misunderstanding ESR. Shallow, not non-existent by williamyf · 2015-10-30 10:58 · Score: 1
  
  First off, to some other chap that called me a shill (fortunately, down modded), my disbelief about the many eyes is not new, nor is it related to FOSS versus closed source. Please see my posting history on /.
  Second, this bug was "Unshallow" seven (7) years... 'Nuff Said!
  Let me explain using my favourite example: The Metafile fiasco of 2005.
  Here we had Two (2) Codebases. One Closed Source (Windows) and one FOSS (Wine). BOTH codebases contained the error. It took 10 years for someone (the guys at Sunbelt Software) to realize the error. Neither Microsoft nor WINE detected the error, even if they had many eyes looking at the code. My hypothesis is that the Microsoft guys were unmotivated, and the WINE guys lacked QA and technical direction. Both codebases patched fast.
  Please notice that the fact that WINE was FOSS did not help in the least the WINE team to detect the error (or any other group for that matter). And while some people say "The WINE team was just replicating the functionality", this is false, for, had the WINE Team themselves detected the security vulnerability as such, they would have made it public immediately, patched the code, and added a line in the config file of the form:
  MetaFileVuln = 0
  /* 0: keep the vuln, replicate Windows behaviour */
  /* 1: implement WINE team fix */
  /* 2: replicate Microsoft fix, if or when they release it */
  So, it is not about many eyes. To catch and solve the bugs and security vulnerabilities you need more than many eyes...
  You need QUALIFIED AND MOTIVATED eyes, QA, Test Cases, and some Process guidance...
  And my friends, I read, and still have on my drive, Version 1 of ESR's paper. Read it fresh from the oven, not the many reinterpretations; remember, it is a work in progress, and it is at V3 nowadays...
  
  --
  *** Suerte a todos y Feliz dia!
Privilege escalation vulnerability caused by .. by nickweller · 2015-10-30 03:51 · Score: 1

Privilege escalation vulnerability caused by a buggy Memory Management Unit, instead of failing safely - it fails bad ...
Can someone explain how the bug worked? by caseih · 2015-10-30 03:59 · Score: 1

The actual bug is shown in the original article. The author says "It appears the seven-year-old Xen bug is caused by an entanglement of C macros, bit masking, and Intel x86's fiddly page table flags" but fails to explain exactly what's going on (probably he doesn't understand it himself). Can someone explain what actually happens in this line and what failure modes caused the check to be bypassed?
The fact that such a simple-looking line could result in such seriously flawed code tells me that programming secure code in C is much, much harder than I thought, especially when what looks like a clean function call is actually macro expansion, perhaps layers of macro expansion. Not a fault of C per se, but a gotcha when using a lot of macros as if they were C functions.
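A toy model may help with the general failure mode hinted at by "bit masking" and "page table flags". It is a sketch under assumptions, not Xen's code: the function names and the validation rule are hypothetical, and only the three flag bit positions are real x86 values. The idea is that a "fast path" which compares old and new page-table entries on a chosen subset of bits will treat a change to any unlisted flag as "no change" and skip validation.

```python
# Toy model only: names and logic are hypothetical, not Xen's actual code.
PAGE_PRESENT = 1 << 0             # x86 "present" flag
PAGE_RW = 1 << 1                  # x86 "writable" flag
PAGE_PSE = 1 << 7                 # x86 "large page" flag in a level-2 entry
FLAG_BITS = 0xFFF                 # low 12 bits, treated here as "all flags"
ADDR_BITS = 0x000FFFFFFFFFF000    # simplified address field


def has_changed(old, new, flags):
    """Compare only the address bits plus the flags the caller lists."""
    return ((old ^ new) & (ADDR_BITS | flags)) != 0


def validate(entry):
    """Hypothetical full check: refuse large-page entries from a guest."""
    return (entry & PAGE_PSE) == 0


def update_buggy(old, new):
    # Fast path: "nothing important changed", so validation is skipped.
    # Only the address and PAGE_PRESENT are compared.
    if not has_changed(old, new, PAGE_PRESENT):
        return new
    if not validate(new):
        raise ValueError("entry rejected")
    return new


def update_stricter(old, new):
    # Fast path taken only when no flag at all differs.
    if not has_changed(old, new, FLAG_BITS):
        return new
    if not validate(new):
        raise ValueError("entry rejected")
    return new


old = 0x5000 | PAGE_PRESENT
evil = old | PAGE_RW | PAGE_PSE      # same address, same "present" bit
accepted = update_buggy(old, evil)   # slips through without validation
```

In this model `update_buggy` accepts `evil` because the bits that changed are not among the bits it compares, while `update_stricter` sends the same request through `validate` and rejects it. Wrapping the comparison in a macro, as the comment above notes, makes the set of compared bits even easier to overlook.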
Re:conspiracy thinking thread starts here. by Anonymous Coward · 2015-10-30 04:12 · Score: 1

You do realize that like, half of the 'cloud' runs on Xen right? Amazon, some of Rackspace, etc etc.
Re:General advice, sir yes sir! by Bengie · 2015-10-30 05:10 · Score: 1

An edge case is a problem or situation that occurs only at an extreme (maximum or minimum) operating parameter.

A corner case (or pathological case) is a problem or situation that occurs only outside of normal operating parameters
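To make the distinction concrete, here is a small sketch. It assumes the common reading that a corner case is one where several parameters sit at their extremes at the same time; `saturating_add` is a hypothetical function invented for the example.

```python
def saturating_add(a, b, lo=0, hi=255):
    """Add two values that must each lie in [lo, hi], clamping the result."""
    if not (lo <= a <= hi and lo <= b <= hi):
        raise ValueError("argument out of range")
    return max(lo, min(hi, a + b))


# Edge cases: one parameter at an extreme, the other ordinary.
edge_cases = [(0, 17), (255, 17), (17, 0), (17, 255)]

# Corner cases: both parameters at extremes at the same time.
corner_cases = [(0, 0), (0, 255), (255, 0), (255, 255)]

for a, b in edge_cases + corner_cases:
    result = saturating_add(a, b)
    # The invariant under test: the result never leaves the legal range.
    assert 0 <= result <= 255, (a, b, result)
```

The edge cases are easy to list from each parameter's range on its own; the corner cases only show up once the ranges are combined, which is one reason they get missed.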
Flush XEN by sdinfoserv · 2015-10-30 05:12 · Score: 1

I dumped XEN for VMware last year and haven't looked back. The deciding factor (not to mention sliding market share, lack of compatible backup products, and weak tech support) was a VM simply 'disappeared' due to a faulty clean up process. The faulty process deleted the VM and support told me to call data restoration.. When I asked for the number, he said, "no, I mean your inhouse data restoration, your backup administrator"... VMware has so much of the market every single virtual product or offering just works.
Re:conspiracy thinking thread starts here. by DuckDodgers · 2015-10-30 05:42 · Score: 1

Xen was originally an out-of-kernel patch so KVM had an advantage for a while because it was less work to set it up. But from Linux kernel 3.0 and onwards, Xen is right in the mainline kernel.
ESR doesn't believe that either by raymorris · 2015-10-30 14:59 · Score: 1

Have a read of the relevant sections of the oldest, most original CATB you can find. I think you'll see it says the same thing. You see, he was talking about the (then new) troubleshooting process that Linus had implemented.
The solution to the metafile bug didn't require deep meditation for ten years. If you don't know there is a bug, that doesn't mean it's buried deep, it just means you don't know there's a bug.
Of course, to prevent bugs you need educated developers, good testing, etc. That's all true. And it has little or nothing to do with what ESR discussed in that passage. Again, he didn't say "no bugs exist", he said "the solution will be obvious to someone". It's about the process of solving bugs; preventing them is another topic altogether. If you read the four or five sentences BEFORE the half of the sentence that became famous, he's talking about the difference between a user who can only see the problematic output of a binary and someone who can read the source and see which part is going wrong.
it means you're not dependent on a single vendor by raymorris · 2015-10-30 15:07 · Score: 1

To me, a huge value of FOSS is that the vendor doesn't have you by the balls. If you need something fixed or changed, you can hire any of millions of programmers to take care of that for you. It doesn't matter if the vendor has gone out of business, isn't interested, etc. - you're in control of your own systems.
This can be worth millions of dollars to a large business or government agency, because migrating to a different, competing system can cost that much if your current software doesn't fill your need. If you need some piece to handle Euros as well as dollars, a programmer with the open source can probably do that for a few hundred to a few thousand dollars, instead of tens of thousands or even millions to replace the system throughout your organization and re-do all of the integration work, employee training, etc.
That and of course for smaller organizations and families the dollar cost difference can be huge, allowing homes and small offices to have enterprise grade functionality. A router with "advanced" features like QOS can easily cost a thousand dollars or more. OpenWRT is $0.
Re:General advice, sir yes sir! by lazybeam · 2015-10-30 20:31 · Score: 1

A corner case would be at an intersection of two edge cases! Almost by definition.

--
-- no sig for you. come back one year.
Re:General advice, sir yes sir! by Bengie · 2015-10-31 00:23 · Score: 1

My biggest complaint about other programmers is they don't account for the possible ranges of their parameters. Many just want to get the program "working" and nothing more. I don't like undefined behavior. I try my best to design my programs to have all behaviors well defined. When something goes wrong, I can almost always tell you exactly why it went wrong or exactly where to look. I hate the whole "get 20 people involved because no one knows what code is causing the issue, and let's spend hours guessing why a program is failing because we don't know how our programs work" routine.
Re:General advice, sir yes sir! by niftymitch · 2015-10-31 06:15 · Score: 1

A corner case would be at an intersection of two edge cases! Almost by definition.
Almost, consider a corner involving three surfaces.
This is perhaps the single best question I have seen in a decade.
Bonus point for the asker.
End point case, overflow/out of bounds case, edge case, corner case...
I can offer some obvious, to me, thoughts.
*) End points are sometimes ill defined, and the last legal value and first illegal value must be correct... Off-by-one bugs fall into this... so does testing for zero in floating point land. Often found inside a function.
*) Edge cases would be interface issues between two functions with a single arg().
*) Corner cases would be functions with multiple args().
This simplicity ignores a lot!
In my experience labeling a bug with a type is more error prone than
any type-unsafe language. Consider bogus asserts() ....

--
Truth is stranger than fiction, but it is because Fiction is obliged to stick to possibilities; Truth isn't. Mark Twain.
Re: General advice, sir yes sir! by Bengie · 2015-10-31 08:59 · Score: 1

By some usage of "rarely". They seem pathological to me. They tend to happen in bursts or start once a threshold is met, in my experience. 80% of my job is dealing with these "rare" corner cases that other people never thought or cared about.