World's First Formally-Proven OS Kernel
An anonymous reader writes "Operating systems usually have bugs — the 'blue screen of death,' the Amiga Hand, and so forth are known by almost everyone. NICTA's team of researchers has managed to prove that a particular OS kernel is guaranteed to meet its specification. It is fully, formally verified, and as such it exceeds the Common Criteria's highest level of assurance. The researchers used an executable specification written in Haskell, C code that mapped to Haskell, and the Isabelle theorem prover to generate a machine-checked proof that the C code in the kernel matches the executable and the formal specification of the system."
Does it run Linux? "We're pleased to say that it does. Presently, we have a para-virtualized version of Linux running on top of the (unverified) x86 port of seL4. There are plans to port Linux to the verified ARM version of seL4 as well." Further technical details are available from NICTA's website.
but is the spec useable ? bugfree ?
Yes, I'm left. You have a problem with that?
first please
People are starting to see the value of this. Also of programming in logic based languages like Haskell, ML etc.
Major bug found in 3...2...
Or do you mean the "Guru Meditation Error"?
Visit the Arcade Restoration Workshop @ http://www.arcaderestoration.com
The Amiga Hand was the boot screen, not an error screen. You're thinking of the famous Guru Meditation.
Besides, who says that the Amiga kernel did _not_ meet the specifications? Have you read them? Does it mention "crash free" anywhere?
The Haskell code is called a "specfication", but if it is Haskell code, surely it should count as a _program_ already? How can you prove that that program is bug-free? How about conceptual bugs?
Was the toolchain verified? How about the hardware on which it runs?
What overhead does this approach have? Are the benefits worth it?
I'm not saying this is all bullshit, but it looks like me that they are pointing to one program, calling it a "specification", and then demonstrating machine translation / verification to a similar program. I'm not sure if I buy that methodology.
Yet another abstraction layer, just what we need......pssha bullocks
"If you give me six lines of code written by the most diligent of programmers, I will surely find enough in them to crash the OS" - Cardinal Ritchielieu
No sig today...
The "Amiga hand" is the Amiga bootscreen. Whoever wrote this probably meant "Guru Meditation".
Isn't being older than 14 (or knowing how to use wikipedia) required to be a slashdot editor?
Even if we have a perfect kernel, it won't insulate us from bugs in the software running on top of that kernel, so do we really gain much? I guess for mission critical apps the answer could be yes... But for every-day computing?? On my desktop I have more trouble with Firefox crashing than I do the OS! (Yes I run linux).
"Beware of bugs in the above code. I have only proven it correct, not tested it."
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
You know it makes sense, a little reminder from jointm1k.
You can't assure we'd never see a kernel fault again even using this. Most faults on most platforms are caused by hardware faults that even a formally proven OS cannot prevent. They also claim that something entirely on spec' has no bugs, which of course is laughable to anyone that has done any length of development. You just cannot account for everything that could happen within a complex system with maybe thousands of independant components.
Further more they claim in the article that software faults have caused plane crashes, when? Off the top of my head I cannot name a single instance when a software fault caused a plane crash (and yes, that includes the recent Air France crash).
I thought any sufficiently complex system was impossible to prove correct.
"No matter where you go, there you are." -- Buckaroo Banzai
As a once (very long ago!) C.S. major, I'm curious to know how they got around Godel's Incompleteness Theorem. While their system is certainly not "complete" if someone can explain (in advanced layman's terms!) what they left out and how that allowed them to formally prove the correctness of their kernel, it would be much appreciated. (Is it because, as an AC wrote, the spec was written in C? Does this "hide" the parts which are unprovable?)
I had a professor in college, along with several colleagues and students, working on a formal proof for nuclear power plant control software. They had been been working for 5 years and, at the time, were about 10% complete.
Obviously, at that rate, the time to complete the formal proof is probably longer than the lifetime of the particular control system they were targeting.
Hopefully Formal Methods have come a long way since I last studied them 10 years ago. In any case, congratulations to this team of researchers at NICTA.
Suddenly after that, the proven kernel will be brought to its knees when someone adds a driver for an old graphics card.
This is my sig.
every respectable geek knows the amiga hand is not an indication the system crashed, but the hardware asking for its OS.
Nope, the amiga crashes are known for the guru meditation, where the screen goes black and displays a BLINKING RED text.
Most amiga users have jumped out of their seat at least once or twice out of sheer surprise...
... we will lost all the fun for Blue Screens and Kernel Panics!
Religion: The greatest weapon of mass destruction of all time
What about Microsoft Research's Singularity? AFAIK its kernel is written in a contract-based version of C# and can be proven correct.
Recursion (n): see recursion.
PHEM - party like it's 1997-2003!
``Beware of bugs in the above code; I have only proved it correct, not tried it.''
Visit http://ringbreak.dnd.utwente.nl/~mrjb/growingbettersoftware to download your free copy of the book
There is an old corollary that says, you cannot get from the informal to the formal by formal means. All they have proven is, that two specifications contain the same bugs. Both specification were formal (Haskell, C). This is the same as having Perl and Python code and you to prove they implement the same functionality. Neither is a proof, it is bug free (informal definition of bug, not if a bug is specified it isn't a bug).
the kernel was so perfect that it thought it was god and it condemned the rest of the OS to hell
Politics is Treachery, Religion is Brainwashing
Most large scale commercial software that is worth anything has massive batches of automated unit and functional tests applied to it - How does this differ? Are they automatically generating their "formal proof" code (unit tests I assume). If so, how do the automatically generated tests stack up against manually generated versions? I would imagine said tests can catch a variety of bugs, but as regards insisting that a program has been verified "bug free" - I will believe it when I see it.
Every program contains at least one bug, and can be shortened by at least one instruction. By induction, every program can be reduced to a single instruction which doesn't work.
And that instruction is...
HCF.
When our name is on the back of your car, we're behind you all the way!
Honywell SCOMP Early 1980s Was intended to me a secure front-end to Multics.
Verified by NSA et all as part of the first Orange-book A1 level certification.
For the time it was a magnificent bit of work.
I was fairly sure the OS running on the MONDEX smartcards was formally proven.
does it ru... uhhh... never mind.
This should once and for all silence the people who claim that it's impossible to formally verify any non-trivial code.
Oh, what am I dreaming. They won't shut up, but maybe now they'll die out.
Assorted stuff I do sometimes: Lemuria.org
I would expect it to be something designed and built along these formal lines. So far, feature set has been the main driver for software development, with robustness coming second. The emphasis seems to be very slowly shifting, as some relatively high level feature sets are reaching a stable point.
In the case of single core CPUs, we do actually seem to have reached a fairly stable performance point now. The way things are pushing forward is with mult-core CPUs, which requires a different mindset of design previously restricted to high end supercomputer and cluster programming. Now even desktop machines will need to start worrying about multi-core related design issues.
This is a more complex sort of programming, and most of the work being done today seems to be more about making existing abilities work faster (naturally, as that's the highest immediate productivity payoff). Which is not to say there won't be a stream of new ideas that continues to flow - what I do mean is that increasingly high level features of operating systems and even system libraries will become "standard" and be worth putting the extra work in for provably correct behavior, particularly in the complex world of ubiquitous multi-core hardware. If you ALWAYS have to implement feature X to support existing design paradigms and tools, and the expected behavior of feature X is well defined de-facto, then it begins to make sense to define it formally as well and insofar as possible forever eliminate it as a source of ANY bugs. This will subsequently free the human resources to concentrate on other problems.
It remains to be seen if the admittedly severe demands of provable software design and development (toolchain, programmer discipline and background knowledge, etc.) can be supported by the broader (non-academic-based) open source community, but given all that it has achieved and the open source tools available in the proof software domain there are definitely some interesting possibilities. Certainly the potential payoffs are exciting! Interesting times.
Hm I thought they proved the mapping from Haskell to C was correct, as well. At lest, so far, for the ARM version.
This is something that people who bang on about formal proofs conveniently forget - all it does is move the bugs from the source code to the formal specification. And if the spec is detailed enough to be useful it effectively becomes code anyway so you might just as well write the actual code and debug *that* instead.
It's a lot easier to get the kernel right when it only has twelve entry points...
Am I part of the core demographic for Swedish Fish?
sorry, but they mention that they have linux running on top of their kernel? how can you run a kernel on top of a another kernel?
In addition to what's already been pointed out about formal vs. informal specifications,
1. Running Linux on top of this is no more safe than running Linux directly on the hardware--it's Linux that crashes (though not very often!)
2. Running this on a CPU that has not been formally proven is nearly useless because only part of the system has been proven, which results in an overall system that is unproven.
"O'Connor, smash the window." "Why me, Bigboote?" "It might be boobie-trapped!" "Oh!"<smash> -Buckaroo Banzai
First off, as an Australian and a nerd, I am very proud.
Now.
Good news: there is now a formally verified microkernel. 8,700 lines of C and 600 odd lines of ARM assembly. Awesome.
Bad news: it took 200,000 lines of manually-generated proof and approximately 25 person-years by PhDs to verify the aforementioned microkernel.
Conclusion: formal verification of software is not going to take off any time soon.
Classical Liberalism: All your base are belong to you.
Isn't there a theory that if we ever get a formally proven OS that all OSs will instantly vanish and be replaced by something more complex? I think there is a corallary that says this has already happened. Obviously, the proof referred to must be mistaken as my OS is still running as I typ
Formal proofs are not unit tests. Unit tests test that certain values work correctly. Formal proofs show that the code works to the specification in all cases (modulo errors in the proof, of course). They of course cannot find bugs in the specification (which unit tests might, if they test what you thought the specification said, instead of what the specification really said).
The Tao of math: The numbers you can count are not the real numbers.
Okay, so you have a proven kernel. But it's L4. L4 provides nearly nothing in the way of functionality, just thread scheduling, and very rudimentary IPC.
Yes, an L4 kernel can be written in anything. Most variants have been written in C, but L4-Pistachio was written in C++. Anything that adheres to the L4 specification works. So it could be written in pretty much anything that doesn't require significant runtime libraries (and the associated overhead for filesystem handling, etc.).
Someone care to enlighten me?
I was born in 1989, you insensitive clod.
So who gets to prove the spec then?
In layman's terms, formally proven software is software written twice, once as a formal specification, and once as code, and then the two are proven to be equivalent. This is a labour-intensive way to write software, and it makes bugs much less likely, but it's not perfect: there can be corresponding errors in both the code and the specification, and there can be errors in the proof too. But nothing is perfect, and this certainly can be a good approach to write code that is as close as possible to bug-free.
he would probably say:
"Please be aware of bugs. I just proven this kernel to be correct, didn't actually run it" ;)
That would seem to be the bigger story. Up to now, it's looked like some form of mental masturbation popular only with computer science professors in academia and kids living in Mom's basement.
Please do not read this sig. Thank you.
...there are two caveats. One is that "executable specification in Haskell" is a bit of a dodge. It means that they have effectively verified the translation from Haskell to C, but if there are bugs in the Haskell then there will be bugs in the C as well (in fact this is part of what they've proven). They admit as much at the very top of http://ertos.nicta.com.au/research/l4.verified/proof.pml, but it still leaves the question of why this is any better than simply rewriting the microkernel in Haskell. If the benefit of a concise and abstract specification is lost, what's left?
The second major caveat is that they had to leave out some features that they couldn't verify. Most significant is that they didn't verify multi-core - by which it seems they mean multi-processor, but "multi-core" is the much hipper word. Since even laptops are multi-processor now, this limits applicability until they take the next step. With all of these shortcuts, though, if you look at the list of what they did manage to verify - mostly pointer errors and overflows - it's not clear that it couldn't have been done with a more real-code-oriented tool such as the Stanford Checker or sparse.
Slashdot - News for Herds. Stuff that Splatters.
The kernel has now been proven to implement buggy specifications precisely!
Wait, what if there is a bug in the Isabelle Theorem Prover...? I think Godel just turned in his grave.
As someone who does not work in IT, count me as surprised that not all OSes are tested this rigorously.
Genocide Man -- Life is funny. Death is funnier. Mass murder can be hilarious.
KLAATU!
BARADA!
NICTA!
Utilizing the synergization of benchmark e-solutions to pre-workaround action items!
Inmos use formal systems to assist in the design of the T800 transputer's floating-point unit. This happened about the same time as the 486 floating-point failures were getting publicity. Inmos claimed that they had been able to develop the unit so quickly because of their approach. As I recall, they first proved that their occam implementation of the floating-point spec was correct (done for an integer-only transputer) and then used automated translation to convert the code into Z (a machine-description language). The T800 was a remarkable microprocessor for its time.
I'm not a computer scientist, so this may be obvious, but how does the halting problem come into this? I mean, the summary suggests that you know this OS won't crash, but wouldn't a general method for proving that be solving the halting problem? I'm guessing that the answer is that they've just proved it meets the spec, and the spec cannot be proved not to crash. Or perhaps it's that they've proved that this program won't crash by a method that's not generally applicable to all problems, so they would not have solved the halting problem. Okay CS people, what's the answer?
"You call it a new way of thinking; I call it regression to ignorance!" -- Operation Ivy
Now all NICTA needs is an aproved website!
I don't trust authorities that talk about quality and standards while not supporting them themselves.
W3C on their http://nicta.com.au/:
Errors found while checking this document as XHTML 1.0 Transitional!
Result: 72 Errors, 1 warning(s)
Good luck with that EAL7 certification.
Oh, and Integrity is only EAL6+
http://www.ghs.com/products/rtos/integrity.html
-- I was raised on the command line, bitch
Disclaimer: my background is on working on Xen but I have a research interest in L4, amongst other things. Should also note that I've not yet RTFA'd but I will do as it sounds awesome.
Paravirtualization is where you modify a kernel so that it can run on something other than "bare hardware". Often in this situation you call the kernel that *does* run on bare hardware the "hypervisor" to distinguish it from the Linux (or whatever) kernel running on top. In this case Linux is running on top of their L4 kernel as a userspace program - instead of accessing parts of the hardware to get its job done it uses L4 system calls. This is basically the same as what Xen does, with L4 taking the role of the hypervisor. Researchers working on L4 have been playing with Linux-on-L4 for many years - longer than Xen has been about, AFAIK.
One nice trick you get, in this case, is that you have the ability to run Linux programs on your nice, formally-proven microkernel. But alongside that you can still run programs *directly* on the L4 kernel, independently of Linux (or with some controlled interaction with it). So, say you have some untrusted code you want to sandbox - you can run it directly on L4, which (now) has proven security properties. Or perhaps you have a crypto app which you want to isolate from Linux in case a root exploit leaves it vulnerable - you can run it directly on L4, outside Linux. You could potentially do similar things for programs that need realtime guarantees - let normal applications get scheduled by Linux's scheduler and realtime apps run directly on L4.
Basically, having L4 there is giving you an ability to start other "virtual machines" running paravirtualised OSes or high assurance applications. It's always been part of the goal that because L4 is simpler than Linux you get to have high assurance that it'll do what it's supposed to. Now a derivative has been formally proven the guarantees could potentially be even better.
The TUD:OS demo CD website has some interesting information about a real-world system that mixes the ability to run (multiple) Linux virtual machines, plus isolated secure applications, all under a security-oriented GUI. It's a pretty cool demo and you can run it as a LiveCD: http://demo.tudos.org/
I really hope that the marketing folks at that company know what they are doing, because I found the voice used in their web site and white papers really condescending. There seems to be an assumption that the reader is utterly surprised that formal proof techniques exist, and also there seems to be an assumption that the reader will be bored by meaninful details - like "take our word for it, because you'd fall asleep or wander away if we really tried to explain it".
Odd, because on the other hand, they seem to be trying to be up front about the boundaries of their proof. I just think it is possible to layer a presentation to different degrees of interest without comming off so jerky.
What's an "Amiga Hand?"
I think you meant "Guru Meditation."
The "Amiga Hand," as you call it, is the boot menu on the first floppy-disk Amigas.
Kriston
IMO, they've got nothing until they've verified the prover. I don't really think their confidence makes the correctness of the prover axiomatic.
"Formally proving" a kernel is something like formally proving your memory allocation algorithm will never fail, providing there's infinite memory available, proving your IO is perfect, providing there's 0 latency, infinite bandwidth and storage, and your security is unbreakable, providing absolutely everything is protected by one-time pads.
In real life, the situation is best described by the old saying: no battle plan survives first contact with the enemy. Kernel bugs are surely not uncommon but at most contribute to about 50% of the system (in)stability - the other half being the "real world" with which the kernel interacts: hardware, its greater environment (e.g. the network) and users. The kernel might be perfect, but at the same time can very elegantly go to pieces the first time it receives an invalid ATA response due to a flaky cable or it misses and interrupt timing because memory allocation algorithms take a little bit too long to defragment or find free space, or the users sets the root password to "god" and leaves a root-login-enabled sshd running. Even CPUs themselves have bugs.
In short, it's nice but the world will not suffer much if other popular kernel don't ever get to be formally proven.
-- Sig down
A machine-checked proof means they are checking that the implementation in C is consistent with the specification written in Haskell. Your confidence that the kernel is correct is exactly your confidence that the specification in Haskell is correct. The C code is 7500 lines, I wonder how big the Haskell code is. This has nothing to do with Godel's incompleteness theorem as correct doesn't mean anything even close the formal definition of complete.
Disclaimer: my background is on working on Xen but I have a research interest in L4, amongst other things. Should also note that I've not yet RTFA'd but I will do as it sounds awesome.
Well, for one thing you can use L4 as a hypervisor. L4 doesn't provide much in the way of high-level functionality - but neither does hardware. It can act as a hypervisor for running Linux VMs on. Bang, you've just formally proven a hypervisor so you can - hopefully - make precise guarantees about security between VMs, etc. In reality the picture is unlikely to be quite so rosy, since you still need to provide virtual devices to your VMs and there are potential security holes in paravirtualised device drivers. Still, it's a step in the right direction - the direction being that we can be *really* certain about how isolated VMs are, despite sharing the same hardware. And that isolation is, after all, part of the core purpose of virtualisation.
Going a bit more futuristic / researchy, there's been various work on running special high-assurance applications directly on L4 *alongside* Linux. So, for instance, modified versions of your crypto apps can be protected from Linux kernel exploits by running them outside Linux on L4 and using a very narrow communications channel to minimise the available vectors for an attacker. Or, conversely, you could distributed untrusted "applet" code to people that can be run directly on L4 - then they can run it outside Linux so they don't have to trust it as much.
Some of these concepts are demonstrated on a (separate) L4-based demo OS here, which is pretty cool: http://demo.tudos.org/
From Wikipedia:
He went on to prove that there was no solution to the Entscheidungsproblem by first showing that the halting problem for Turing machines is undecidable: it is not possible to decide, in general, algorithmically whether a given Turing machine will ever halt.
A link to the Wikipedia article: http://en.wikipedia.org/wiki/Alan_turing
Uhm, the Amiga Hand is not a bug or crash screen. That's the ROM telling you the floppy drive is empty and it wants some bootable media. It would pretty much never be seen with a hard drive system.
The Amiga crash screen is called the Guru Meditiation. It's either a red or yelow blinking box with some cryptic text inside, which only a developer would try to understand. Red is crash, yellow is "recoverable alert", which may leave the computer running but a crash is frequently soon to happen after a yellow guru. Earlier AmigaOS versions only had red, yellow was added sometime in 2.x or 3.x versions.
I think for the much more recent 4.x versions of AmigaOS, it's been changed to the Grim Reaper, and it's more of a window than a black screen with a blinking thing in it. Chances of recovering and continuing use are much better now.
In the 1960's Edsger Dijkstra (arguably one the founding fathers of "Computer Science", at least as a university subject) led a group at Eindhoven to develop a multiprocessing OS called "THE". The kernal was formally proven BY HAND .
I daresay the folks who have made this recent excellent achievement are likely well aware of THE, and therefore are being specious in claiming to be the world's first at doing this.
Prove the spec against what? The spec just says "This is what the kernel does". It's up to the client to decide whether what the kernel does is any use for them.
Quidnam Latine loqui modo coepi?
A spec that high level is useless for properly proving a program.
most modern languages attempt to get speed boosts by abstracting-away low-level stuff and providing mechanisms to specify or ensure the exact constraints of a system, so that optimizations can be made.
C still gets all its speed boost from having no abstraction, forcing the programmer to hand-optimize everything, or to format code in such a way that the compiler can make a few sloppy guesses about when optimizations should be made (the "can be made" thing isn't really worth mentioning, since C doesn't allow enough type-constraints to let the compiler make anything resembling an intelligent decision, so all optimizations are pretty much "does this get used this way right here? No? Okay then, I can do this...", always lowest-common-denominator stuff- pretty much just glorified macros ("I've seen that pattern before, You seem to be looping through every element of an array and copying everything.. I'll just change that to a re-implementation of memcpy for you...")
Low-level optimizations and hand-optimizing things will only go so far- yeah, we can all just switch to assembly and get a minor speed-boost by hand-optimizing everything, but nobody does that (for speed) anymore. Eventually, being able to say "This loop is safe to run in parallel" or "just give me the first match, I don't care how you do it", is going to be faster than "loop through this. I've given you all the pointer offsets in advance, so you can skip a dereference step!" or "Let me tell you exactly how you should find a match...."
-- 'The' Lord and Master Bitman On High, Master Of All
Against whatever those who wrote the spec intended it to say.
The Tao of math: The numbers you can count are not the real numbers.
I have written a truly marvelous bug-free operating system, however there is not enough space on this disk to include it here.
Towards the Singularity.
It's proven by the program that conforms to it. Duh!
Chelloveck
I give up on debugging. From now on, SIGSEGV is a feature.
You're one of those overclockers aren't you... I can tell...
How is this different from what Walker, Kemmerer and Popek did in the 1970s with the UCLA Unix security kernel? See their 1980 paper.
Okay, I understand that. My past with L4 has been more varied. I've played around with L4-Linux and DROPS, as well as the beginnings of the GNU/Hurd port.
But I can definitely see the benefit of sandboxing applications separate from Linux. Particularly applications that don't do a lot of direct filesystem or hardware access. Still a lot of work to be done, however. :-)
(Of course, this sounds a hell of a lot like what you're supposed to be able to do with native Hurd applications, where you could theoretically do most operations through native IPC, whether that's Mach or L4 or Coyotos or whatever)
The important point is: Specifications should be easier to read and understand than code generated from it. (I don't want to enter a discussion whether specifications /are/ easier to read, nowadays.)
And they proved that their translation process is sound, so its reliable.
Interestingly, they didnt say how many lines of code they had to write in Haskell.
One: they appear to have evaluated/proven their own work, always a dangerous thing.
Two: they claim to exceed Common Criteria requirements but don't claim Common Criteria evaluation successfully completed which would require a third party evaluation of their methodologies, designs and code against a Protection Profile. This a is a wonderfully complex process usually involving the developer organization, the evaluators and a national standards body empowered to oversee the effort, make their own tests and ultimately grant formal Common Criteria evaluation. In an effort I participated in, it took at least a dozen documents, millions of dollars and dozens of months to get a robust OS through an evaluation. This was done by a very talented group that had more than a dozen years experience providing evaluated products and getting them approved. And such evaluations are country specific as the spook agencies in each country want to have their own look and say.
Three: what is left out of this OS is what makes an OS usable in the real world to do real things that people want to do...like work over a network or work with files.
Why so? They have precisely specified what the kernel is to do, and have proven that it does it. Whether it's what you want it to do isn't their question, it's yours.
Quidnam Latine loqui modo coepi?
Who cares what they intended? All the users of the kernel care about is what is actually specified. This isn't a top-level application where there's a mapping to real world intentions ("prevent the aircraft from crashing"), this is a building block, and the builders need to decide whether it's the right size and shape for whatever they are building.
Quidnam Latine loqui modo coepi?
Just because someone did a formal proof doesn't mean it's correct, bug-free, etc. That's obviously the intention but just as people make mistakes when writing the initial code, people also make mistakes when doing the formal proofs.
Formal has been used for various pieces of hardware verification for a while. It's fairly easy to find examples of very real bugs that made it past formal.
The most famous one is probably the old, infamous Intel FDIV bug. Bob Colwell (PPro, P2, P3, P4 Chief-Architect) gave a talk at Stanford back in 2004 and one of the interesting slides brought up the FDIV bug and how it was "formally proven" but the formal proof was wrong.... Oops.
(You can find video of the talk online - great talk - check it out)
Microsoft will never use such a technology. Imagine how long it would take for the next release of Windows if Microsoft had to prove it met all of its specifications as documented? Oh, and imagine if Microsoft actually had to prove that they implemented a standard as specified by an authoritative, recognized standards body (i.e. LDAP, HTML5, JavaScript, etc)? I think this is great for embedded systems where failure can equal catastrophe. How long will it take to apply this formal methodology to an end-user operating system like OS X, Linux or Windows?
Btw, sorry if I repeated anything you already knew / had thought of.
Your original point was, I should say, highly valid in the sense that this isn't going to affect any of our desktop / server OSes very soon. But it's promising as an advance in the state of the art nonetheless. Also, I guess if they'd written "formally proven hypervisor", that sounds more useful and is also somewhat accurate - although I don't think anyone uses L4 commercially as a hypervisor. Another spanner any hypervisor will need some kind of device driver virtualisation which can often be a means of increasing the amount of trusted code (e.g. on Xen you basically trust all of the dom0 VM - including Linux - to do the right thing since it has IO privileges, even though the hypervisor itself is quite small).
The TUD:OS demo CD includes (IIRC) some sandboxed applications, such as some port of gpg to L4. It also features a minimal secure GUI that runs outside of Linux and can be used to isolate things - pretty cool.
GNU/Hurd always seems like a really awesome system to me. But last I read they had decided that L4 doesn't suit them and that they want to write a new microkernel. Disappointing :-( I really want to be able to use (and hack on) that thing but I'd like the project to look a little more well-defined first!
You really think C has *no* abstraction? So when you define a variable, you define the memory address for it then? You choose which registers to use in calculations? Look, even ASSEMBLY uses some abstraction; only machine language does not.
Nono...you helped me look at it a different way! :-)
any hypervisor will need some kind of device driver virtualisation which can often be a means of increasing the amount of trusted code (e.g. on Xen you basically trust all of the dom0 VM - including Linux - to do the right thing since it has IO privileges, even though the hypervisor itself is quite small).
Right. And you're still at the mercy of whatever drivers on dom0 might be flaky. Something like a network device driver that's susceptible to a DoS attack....that can be exploited by anything running in a domU, and the entire host goes kaboom. I'm not sure if virtualizing the driver is enough, because it still reveals the underlying hardware. I understand there's a performance hit, but if every NIC in the world is an e100, well, you have only one flaky driver to worry about with the domUs, while the dom0 might be running a Marvell gigabit or whatever.
The TUD:OS demo CD includes (IIRC) some sandboxed applications, such as some port of gpg to L4. It also features a minimal secure GUI that runs outside of Linux and can be used to isolate things - pretty cool.
I'll have to take a look!
GNU/Hurd always seems like a really awesome system to me. But last I read they had decided that L4 doesn't suit them and that they want to write a new microkernel. Disappointing :-( I really want to be able to use (and hack on) that thing but I'd like the project to look a little more well-defined first!
I haven't followed nearly as closely as I used to. I guess the bigger point with them is the longer they go without a compelling release, the more difficult it becomes for them to release anything people would want/need. I think that might have been some of the impetus behind the eventual Minix-3 release; let's get something out that proves our point. It doesn't have to beat Linux or BSD, but has to have something else compelling, and enough of a framework that shows potential for future growth.
I've also wondered if they might be able to look at seeing what they can do with either NetBSD or DragonFlyBSD's kernels. NetBSD, of course, gives you Mach compatibility, a halfway decently-performing FFS driver, etc. DragonFly's LWKTs might be a more hospitable environment for HURD daemon porting from Mach. Use it to strip down to just what you need, much the way they did between Mach releases. 2.5, 3.0, and 4.0 are all very different from each other.
Doing formal software verification on any non-trivial sized code quickly devolves into a nightmare. Proofs like this have long been the goal of the software verification community, and it's good to see this positive result. Regarding the verification, it's actually quite useful. If you run into a bug, it's because you're using the kernel incorrectly or the kernel has been improperly specified. This verification eliminates the need to search for bugs within the kernel itself. Its behaviour has been verified as exactly meeting the specification.
>> "Operating systems usually have bugs -- the 'blue screen of death,' the Amiga Hand,..."
Not to be a nerd but the function of the Amiga hand is only indicate that you need to insert a bootable floppy.
Actually the Amiga euqivalent of the blue screen of death is the Guru Mediatation, which in early Amigas is a red flashing box. In later Amigas it became a dialog box.
Well, for one, they claim to implement an L4 kernel. Now I don't know exactly what it takes for a kernel to be an L4 kernel, but certainly it means certain behaviour can be expected. So if their spec makes the kernel not behave like an L4 kernel, it is buggy.
The Tao of math: The numbers you can count are not the real numbers.
Well, ok, so it has to provide address spaces, threads, scheduling, and synchronous inter-process communication (but not in any particular way, AIUI). I think that checking that those are included in the spec could be pretty reliably assured by review, don't you?
Quidnam Latine loqui modo coepi?
By running device drivers in isolated virtual machines *and* using an IOMMU (for which you need suitable hardware, obviously) you can substantially reduce the harm device drivers can do ... They won't be able to abuse DMA hardware to write to any memory locations they want, they're stuck with writing to memory that guests provide them with. If the guests do crypto / checksumming then that could in principle cut down device drivers' ability to do harm to a "mere" denial-of-service.
If you can do this then you could probably engineer a system with a relatively small trusted base. There's been an ongoing conversation about "dom0 disaggregation" under Xen, which I believe is aiming to do something like this - that would remove having an entirely Linux distro as part of the trusted computing base, which would be A Good Thing ;-)
Although I'm discussing Xen, there's no reason that L4 or some other system could not use the same hardware to get the same effect. There have been various projects using L4 to isolate Linux or Linux drivers in VMs, so it's a logical thing for them to do too.
Re HURD - indeed, it would be nice if they had a release. The Mach version had things like userspace filesystems years before Linux has FUSE. Always struck me as a system with lots of benefits - and it would be nice to see the GNU project finally become a "complete" UNIX replacement. I was hopeful about their use of L4, as its known to be (capable of being) very fast. I'm really not sure where (if anywhere) they're going now; they do seem to lack a strong lead / direction, which like everything has both benefits and costs.
I didn't know NetBSD had Mach compatibility - do you mean it can run on top of Mach? That's cool. I don't, to be honest, know a whole lot about Mach - and to make things more complicated I have the impression there are several versions floating around out there. Minix is another OS that I've not become very familiar with.
DragonflyBSD is a very cool OS that I've been watching with enthusiasm for some time. I think it's a shame that they've not been able to complete their original SMP scalability work yet, as it was one of their original goals. But most of the other stuff they've been doing - HAMMERFS, SYSLINK, etc - are quite interesting. There's not been enough that's different about DragonflyBSD to get me running it just yet but that may change.
The Amiga Hand was the boot screen, not an error screen.
But ... but ... you can't wear a boot on your hand!
Your signature scares me. Not that I'm surprised, just... wow.
Wow. How can you prove something like this... I know mathematical induction and various finite automata tests were used in my old school days to prove software worked... but something as complicated as this.
In any event, an inductive hypothesis in software is tripe. Unless you can formally describe all states, you usually have an unbounded problem, and people test (and prove) positive case. There are just way too many states.
Just like Bjarne said Proof by analogy is fraud... and induction is akin to analogy.
/\/\icro/\/\uncher
Just because the specifications could be bugged does not mean this is not useful. Going from a low level implementation to a higher level abstraction or design is always useful. This is partly why we have comments, assertions, and specifications in the first place. At the higher levels, it's easier to check for intentional behavior (the bugs in the spec is easier to find than the bugs in the code). It is also easier to add new constraints.
Some people have mentioned that this is just a verification of the Haskell to C layer. I might be mistaken, but their site shows that they have multiple layers and most of them are formally verified. As well, this does a bit better than just a Haskell to C translator because this allows for hand optimized code (site mentions that their process accounts for things like pointer arithmetic). In addition, even if you used a Haskell implementation, you still have to verify that.
I understand that people, such as the parent, are worried that people might put "too much" stock in a formal verification, but I would still think it's better than *no* verification? Ok, most of the time - if people insist that something has been proven 'correct' when it, in fact, is not correct, that could be a serious problem, because then the problem might not get fixed, but I'd rather be flying in an airplane in which someone at least tried to prove the software correct, than one in which it has not been, in the general case. Sure, they could still both malfunction, but I bet the unproven software would be more *likely* to fail.
You made an awesome reference. You don't deserve to be languishing down here at +2. Still, you made me happy!
"I have discovered a truly marvellous formal proof of this OS, which this sig is too narrow to contain"
This line no sig
In the few programs I've made that had to be secure, by all means, bugs in the specification were found, years later, that allowed system compromise. Assumptions were made about other systems (that allowed us to authenticate users, validate user-classes, etc) that were false under some circumstances, for example, making a break-in possible even if the core itself was not buggy in itself. It would, essentially, maintain it's integrity but be fooled by others...
This is just like in cryptography: a secure algorithm does not guarantee whole system security.
Still a great achievement!
You cannot proceed from the informal to formal by formal means
There have been various projects using L4 to isolate Linux or Linux drivers in VMs, so it's a logical thing for them to do too.
Cool. FWIW, the situation is much worse with Hyper-V than it is Xen. :-)
I was hopeful about their use of L4, as its known to be (capable of being) very fast. I'm really not sure where (if anywhere) they're going now; they do seem to lack a strong lead / direction, which like everything has both benefits and costs.
Yeah, IIRC, they ran into a major show-stopper with L4 and the current design that put things on hold. Some of the developers were working with Jonathan Shapiro on the Coyotos project at CMU; I'm not sure anything concrete came of that. Might be a moot point, anyway, since Shap's off to Redmond.
I didn't know NetBSD had Mach compatibility - do you mean it can run on top of Mach? That's cool. I don't, to be honest, know a whole lot about Mach - and to make things more complicated I have the impression there are several versions floating around out there.
COMPAT_DARWIN
Not sure how close this is to being production-ready with the ongoing enhancements in OS X, but they'd made pretty good progress around the 10.3 stage. Much of the NeXTstep model used Mach IPC for message passing (think about how ObjC works!). NetBSD had to emulate the Mach facilities in-kernel to build this in. If someone cared, they could probably do a COMPAT_NEXT for m68k, etc. Not sure how many of the features would need to be added to support the Hurd daemons.
There's not been enough that's different about DragonflyBSD to get me running it just yet but that may change.
I played around with some of the early versions, but haven't touched it much lately to try out the nifty new features. Most of the things I used BSD for I converted from FreeBSD to NetBSD around the time that Matt stood up DF. Probably would have been easier in hindsight, especially the community machine I help maintain, but I didn't think it was quite ready for primetime back then. Been very, very pleased with NetBSD, however. Certainly much better quality in my experience than FreeBSD later than version 4.
And here I thought this was going to be an article about Linux 1.2.13.
Oh, great. Sure, this kernel is all well-behaved and meets its spec - as long as Mr. or Mrs. Cleaver are around. But the minute they turn their backs, who knows WHAT scheme this kernel is going to pull!
#DeleteChrome
We were doing formal proof of correctness of kernels back in the 1970's at System Development Corporation (SDC) in Santa Monica.
"Correctness" means correct with regard to a criteria, it doesn't mean that the system didn't have flaws outside of the criteria or that it worked or worked efficiently.
I personally worked with UCLA Data Secure Unix and SRI's PSOS (Provably Secure OS) and a couple of other systems.
One cool aspect was that in those days we had some machine architectures that had hardware support for "capabilities".
Generating the correctness criteria was very, very hard.
As for Haskell - we tended to use Pascal via a Pascal to C compiler.
Before you get angry, I suggest you go read the page what we prove and what we assume. The researchers are not claiming the kernel is bug-free. They are saying that certain parts of the kernel are free from certain classes of errors so long as the rest of the system (the C compiler, the linker, the hardware, the boot code, assembly code related to direct hardware access, etc..) is working as designed.
As someone who has been doing Linux kernel programming for about 8 years, I enthusiastically support efforts to make a more robust operating system. Just being able to say "this chunk of C code doesn't have any null pointer dereferences" is a significant achievement (and incidentally, it's one of the reasons why I do all my "for fun" programming in Haskell and not in C; life is too short to spend tracking down type errors that would never have gotten past a decent compiler).
7500 lines of code is actually quite a lot. Mainstream OSes like Linux have much bigger code bases, but that's because they need to support as many hardware devices as possible, dozens of filesystems, many cpu architectures, work equally well on embedded devices and many-cpu NUMA machines, and support any feature anyone happens to want. If you just want a system that can run particular applications on a particular hardware platform, a significant amount of functionality can fit in 7500 lines of code.
why does it take 6 months for Barack Obama to choose a dog, but he's wants to cram stimulus and healthcare legislation through without anyone reading or understanding it?
How exactly is it Obama's fault that you are too stupid to read and understand something?
Tesla was a genius. Edison however was a overrated hack who liked to torture puppies.
A lot of people here are saying: duh, they only proved that the C code has the same bugs as the formal specification.
But the researchers did consider that. The formal specification also allows them to prove important properties of the L4 kernel.
See the researcher's website: http://ertos.nicta.com.au/research/l4.verified/proof.pml
From the website:
Here, the absence of such bugs is just a useful by-product of the proof. To be able to complete our proof of functional correctness, we also prove a large number of so-called invariants: properties that we know to always be true when the kernel runs.
assignment != equality != identity
Mod AC up. GP doesn't know what he's talking about.
Who is this idiot?! First of all, the Amiga Hand was not a bug! It was the indicator that a disk had to be inserted! Also, Guru Meditation Error was not a bug, nor is the Blue Screen of Death! These are crash handlers! Plonkers!
Unit tests test that certain values work correctly.
What if the test has passed because of the faulty/unspecified/specific implementation of the specification?
This Slashdot article, referring to the so called "World's First Formally-Proven OS Kernel",was brought to my attention by a colleague who is aware of my experience with formally proven OS' in the 1980's. What follows is my response to the claim of being first, and the value of proving the correctness of an OS:
I am aware of at least two instances of operating system kernels that were built in the late 1970's / early 1980's using formal proofs of correctness. I will talk about my experience with one of them.
One of them, produced in the late 1970's was a kernel designed for a specialized environment. This kernel/OS was a reasonably functional kernel complete with multiprocessing, time-sharing, file systems, etc. Unfortunately while the formal proof for this kernel was solid, the axiomatic set on which this formal proof was based did not perfectly match its operating environment. This mismatch proved to be fatal to the OS security.
This formally proven OS took years to create and prove its correctness. Those who developed and maintained the OS were very proud of their work. There were plans underway to create a commercial version of their work and to market it through a certain hardware vendor on which their OS ran.
When I was a student intern working for the organization where that developed this OS worked, I worked in their OS environment from time to time. I came in from the outside where my OS experience was with a wide variety OS' such as MVS, NOS, KRONOS, TOPS-10, RSTS/E, VMS, Multics, and Unix (5th/6th/7th edition). I had enough experience in jumping into new OS environments that I felt comfortable as a user in this one, even though it was unusual.
An important point to observe here is I was one of the first people who enter this OS environment from the outside. I was not steeped in the development world of the OS. I brought with me, ideas that differed from the OS developers. As a young student, I believed that the OS should work for me instead of getting in my way. To help come up to speed, I ported over my collection of OS-independent tools and soon began coding away on my assigned tasks.
Then one day, working within my OS-independent tools, something very odd happened. By mistake, I did something that produced an unusual result. I quickly realized that something was very wrong because the result was "impossible" according to the formal proof of OS correctness. Under the rules set down by my employer I immediately contacted the appropriate security officer and the next thing I knew, I was in a long sequence of meetings with people trying to figure out what in the hell happened.
In the first meeting after my mistake, I learned that I had been reassigned to a new team. I was assured that I was not being disciplined, far from it: I had made a number of people very happy and they moved paperwork mountains to move me into their team. This team was given a task of attempting to repeat my previous "mistake" as well as to discover if exploits that are more general were possible against this OS. We were assigned to work âoeundercoverâ as developers under test/QA installations using this OS. In the end, the team was successful in discovering a much more general form of the OS hole I accidentally found.
What went wrong with the OS formal proof? Well the mapping from the formal logic world to the real world of hardware, physics, people, and the environment was flawed. In other words, when you added in the "real-world", the proof was incomplete. Attempts by the OS developers to expand their proof to address the "real-world" only produced a set of inconsistent theorems. I believe the OS project was abandoned after the OS developers failed multiple times to expand their formal proof to deal with âoereal-worldâ environments.
During this experience I was introduced to two "Security Camps": One, "the absolutists" as they called themselves, included people who worked on this formally proven OS. The opposing camp called the
chongo (was here)
There is a good overview of the topic on the authors page, at http://ertos.nicta.com.au/publications/papers/Klein_09.pdf. It specifically mentions "the other" big Project in this field, the VeriSoft project ( http://www.verisoft.de/ ). VeriSoft is broader in scope (covering the translation of programs down to machine code level, and also aiming to verify properties of user-space programs), but uses a simpler implementation Language lacking C-pointers, called "C0". C0 is extended with limited support for inline-assembly code for the compilation of the OS ( http://www-wjp.cs.uni-sb.de/publikationen/GHLP05.pdf ).
It's more like automatic debugging, if you ask me.
It allows you to code, then check that what you coded is according to the spec. If it isn't, at least, you now have a formal description of what isn't right, which should be simple in principle to trace back to the root cause, either the code or the spec.
C gets its speed boost from having no abstraction. The things which are abstracted are trivial substitutions and do not provide a speed boost. C has no speed-optimizing abstractions.
When you define a variable in C, you very well CAN define the memory address for it, rather than telling something else to do it for you. But this makes things slower, not faster.
C does not provide any abstraction which says something like "This variable should have all of its possible values processed simultaneously", and only very primitive ways of defining what those values are (from which, as far as I am aware, no speed boost can be claimed, as it is only done to prevent error)
So, yes, C has no abstractions in the context of the sentence right above it.
-- 'The' Lord and Master Bitman On High, Master Of All
Actually in Verisoft XT the successor of Verisoft now real C code is verified, including e.g. PikeOS with PowerPC assembly in a system developed to DO-178B airplane verification. In that project an important tool for C code verification is VCC developed by Microsoft Research, source included at that site.
And they have just finished proving that first few lines of code they wrote. In another five decades they hope to be able to have Notepad proven and ready to run so you can actually get some work done!
I completely agree with your sentiment here (somewhat sadly, I might add; it would be nice to have a provably correct linux/gcc/glibc/emacs/firefox/nethack).
People have seen the value of [formal verification, I assume] since the first days of programming. In fact, the value is so enormous that no one can afford it...
That is [in my economics dictionary] a contradiction in terms.
People finding something valuable can be understood as them being willing to give up a lot of resources, i.e. forgo a lot of alternative uses of the resources, i.e. pay a lot of money, to get it.
So you're saying because people are so eager to pay for it, no one can, right? i.e. the market price exceeds the willingness to pay of the highest (potentially-)paying customer. That doesn't make sense.
What does make sense is that the process of formal verification is so labor-consuming that the value it can generate (even if high) is less than the value of at least one thing the same (large) amount of labor could be spent to generate, i.e. people's time is better spent fixing bugs or adding features. At least according to what the market prices say. [we can then discuss whether the market for kernels works as it should.]
(When can then go on to discuss value in use vs. value in exchange vs. the value of research to increase further value in use which probably increases value in exchange and... you know, all of economics... :)]
http://74.125.47.132/search?q=cache:yBRVrk6qteIJ:www.cl.cam.ac.uk/teaching/0708/IntroSec/slides-4up.pdf+%22Verified+Design%22+and+%22Operating+System%22+and+%22security+design%22&cd=44&hl=en&ct=clnk&gl=us
PERTINENT QUOTE EXCERPT:
----
"Class A1: Verified design adds formal model for security pol-
icy, formal description of TCB must be proved to match the
implementation, strict protection of source code against unau-
thorised modification"
----
This is in regards to the Operating System being of "verified design", the reason I ask this is simple: Years ago when I was first getting into the Operating System security world (where you hear rating terms like "C2" from the "Orange Book" etc. et al), there were NO SUCH ANIMALS (OS' of "verified design") @ that point, & I have heard of a "Green Hills OS" that supposedly had been granted such status...
APK
P.S.=> (I did not read the article in its entirety yet, so, to save time? I am asking - STILL HAVING COFFEE THIS A.M., so, if reading the article would have satisfied this question, disregard please... thanks for your time!)... apk
If this kernel is as great as professed, then the compiler that was used to create it is likewise. And it would be great to perhaps port this kernel to replace the linux kernel. One way, I suppose, to make use of it in FOSS, is to use this kernel as the microcode that drives a Linux kernel surrounding it. I think of this architecture as the hub of a wheel, and the Linux kernel as the rim. Just my rambling thoughts
Leslie Satenstein Montreal Quebec Canada
What about Bevier's 1987 dissertation, "A Verified Operating System Kernel." ftp://ftp.cs.utexas.edu/pub/boyer/diss/bevier.pdf
*note that the "makes things slower, not faster", is just a consequence of the way memory tends to be allocated when doing things "by hand"- it is not a result of optimized abstractions
-- 'The' Lord and Master Bitman On High, Master Of All