Doomsday Docker Security Hole Uncovered (zdnet.com)
An anonymous reader quotes a report from ZDNet: One of the great security fears about containers is that an attacker could infect a container with a malicious program, which could escape and attack the host system. Well, we now have a security hole that could be used by such an attack: RunC container breakout, CVE-2019-5736. RunC is the underlying container runtime for Docker, Kubernetes, and other container-dependent programs. It's an open-source command-line tool for spawning and running containers. Docker originally created it. Today, it's an Open Container Initiative (OCI) specification. It's widely used. Chance are, if you're using containers, you're running them on runC.
According to Aleksa Sarai, a SUSE container senior software engineer and a runC maintainer, security researchers Adam Iwaniuk and Borys Popawski discovered a vulnerability, which "allows a malicious container to (with minimal user interaction) overwrite the host runc binary and thus gain root-level code execution on the host. The level of user interaction is being able to run any command (it doesn't matter if the command is not attacker-controlled) as root." To do this, an attacker has to place a malicious container within your system. But, this is not that difficult. Lazy sysadmins often use the first container that comes to hand without checking to see if the software within that container is what it purports to be. Red Hat technical product manager for containers, Scott McCarty, warned: "The disclosure of a security flaw (CVE-2019-5736) in runc and docker illustrates a bad scenario for many IT administrators, managers, and CxOs. Containers represent a move back toward shared systems where applications from many different users all run on the same Linux host. Exploiting this vulnerability means that malicious code could potentially break containment, impacting not just a single container, but the entire container host, ultimately compromising the hundreds-to-thousands of other containers running on it. While there are very few incidents that could qualify as a doomsday scenario for enterprise IT, a cascading set of exploits affecting a wide range of interconnected production systems qualifies...and that's exactly what this vulnerability represents."
According to Aleksa Sarai, a SUSE container senior software engineer and a runC maintainer, security researchers Adam Iwaniuk and Borys Popawski discovered a vulnerability, which "allows a malicious container to (with minimal user interaction) overwrite the host runc binary and thus gain root-level code execution on the host. The level of user interaction is being able to run any command (it doesn't matter if the command is not attacker-controlled) as root." To do this, an attacker has to place a malicious container within your system. But, this is not that difficult. Lazy sysadmins often use the first container that comes to hand without checking to see if the software within that container is what it purports to be. Red Hat technical product manager for containers, Scott McCarty, warned: "The disclosure of a security flaw (CVE-2019-5736) in runc and docker illustrates a bad scenario for many IT administrators, managers, and CxOs. Containers represent a move back toward shared systems where applications from many different users all run on the same Linux host. Exploiting this vulnerability means that malicious code could potentially break containment, impacting not just a single container, but the entire container host, ultimately compromising the hundreds-to-thousands of other containers running on it. While there are very few incidents that could qualify as a doomsday scenario for enterprise IT, a cascading set of exploits affecting a wide range of interconnected production systems qualifies...and that's exactly what this vulnerability represents."
The Joyent cloud features a second layer of isolation. Sometimes you see this described as "double-hulled virtualization". The OS performance penalty to achieve this is low to non-existent due to the nature of BSD zones (hardened jails).
Joyent hybrid cloud: Triton Compute Service and Triton DataCenter integration
This is precisely the scenario that Joyent's technology exists to mitigate.
You think you're running Linux containers, but under the hood you've also got zones and ZFS snapshots.
There is a resource penalty involved in using a high-integrity file system like ZFS, (efficient copy-on-write requires extensive write-buffer coalescing) but it's often not a large one compared to the many gains in administrative security and ease.
It's a cross between a chroot environment and a virtual machine. For most purposes, it is a virtual machine, but by using file system overlays, the overhead per VM is much lower; almost as low as running them all in the same environment.
That's the theory, anyway.
If you're running dozens or hundreds of web servers or something like that, it's probably a good solution. If you're only running a few, there's probably no reason not to just use real VMs. Of course, for many people it's not about what's the best fit, it's about using the tool you know.
It's not a strong enough security boundary, but it damn well *should* have been. There's no problem doing this in FreeBSD and Solaris without paying a hardware virtualization performance tax. Frankly, the Linux community should be embarrassed that such a fundamental system facility was implemented in a botched and useless manner.
Then docker itself is lying:
"Containers and virtual machines have similar resource isolation and allocation benefits. [] Containers are more portable and efficient"
https://www.docker.com/resources/what-container