World's First Formally-Proven OS Kernel

Posted by Soulskill on Wednesday August 12, 2009 @11:57PM from the wait-for-it dept.

An anonymous reader writes "Operating systems usually have bugs — the 'blue screen of death,' the Amiga Hand, and so forth are known by almost everyone. NICTA's team of researchers has managed to prove that a particular OS kernel is guaranteed to meet its specification. It is fully, formally verified, and as such it exceeds the Common Criteria's highest level of assurance. The researchers used an executable specification written in Haskell, C code that mapped to Haskell, and the Isabelle theorem prover to generate a machine-checked proof that the C code in the kernel matches the executable and the formal specification of the system." Does it run Linux? "We're pleased to say that it does. Presently, we have a para-virtualized version of Linux running on top of the (unverified) x86 port of seL4. There are plans to port Linux to the verified ARM version of seL4 as well." Further technical details are available from NICTA's website.

41 of 517 comments

spec? by polar+red · 2009-08-12 23:59 · Score: 5, Insightful

But is the spec usable? Bug-free?

--
Yes, I'm left. You have a problem with that?
1. Re:spec? by jadrian · 2009-08-13 00:38 · Score: 4, Insightful
  
  To verify that the software meets its specification, the specification itself must be formalised in the theorem prover. This in turn gives you the possibility of proving properties of the spec itself.
2. Re:spec? by JasterBobaMereel · 2009-08-13 00:45 · Score: 5, Informative
  
  It only means it meets the spec, not that the spec is correct ...
  It does not mean that faulty or erratic hardware cannot crash the system
  It does not mean that other programs cannot crash and lose your data ...
  It does not mean that buggy device drivers cannot make your system unusable
  It does not mean that the system is perfect, only that it will always do what it is supposed to ... which may not be what you want ...
  
  --
  Puteulanus fenestra mortis
3. Re:spec? by kamatsu · 2009-08-13 03:19 · Score: 3, Interesting
  
  1) L4 is a microkernel. You have no idea what you're talking about. If L4 works, then you could apply similar principles for all the other servers that run in kernel space.
  2) This is the largest formal proof ever done in this sort of field. Seriously, the results are not immediately useful, but it's a good start.
  Disclaimer: I have worked at NICTA in the past.
4. Re:spec? by TheLink · 2009-08-13 03:42 · Score: 3, Insightful
  
  Exactly. That's why this formal verification stuff is rather useless for most cases I see.
  
  If you pass the customer a mix of water, flour, yeast, eggs and sugar and the customer says "That's not cake, it's not acceptable".
  
  And you then say - "But we meet the cake spec we agreed on, so by that definition it's cake, you have to pay us".
  
  Sure you can go sue the customer and force them to pay you the full sum, but unless most other people agree the customer has just been way too fussy, you might have fewer customers in the future.
Amiga Hand? by ArcadeNut · 2009-08-13 00:03 · Score: 4, Informative

Or do you mean the "Guru Meditation Error"?

--
Visit the Arcade Restoration Workshop @ http://www.arcaderestoration.com
The Amiga Hand? by johannesg · 2009-08-13 00:05 · Score: 5, Interesting

The Amiga Hand was the boot screen, not an error screen. You're thinking of the famous Guru Meditation.
Besides, who says that the Amiga kernel did _not_ meet the specifications? Have you read them? Does it mention "crash free" anywhere?
The Haskell code is called a "specification", but if it is Haskell code, surely it should count as a _program_ already? How can you prove that that program is bug-free? How about conceptual bugs?
Was the toolchain verified? How about the hardware on which it runs?
What overhead does this approach have? Are the benefits worth it?
I'm not saying this is all bullshit, but it looks to me like they are pointing to one program, calling it a "specification", and then demonstrating machine translation / verification to a similar program. I'm not sure if I buy that methodology.
1. Re:The Amiga Hand? by TheCycoONE · 2009-08-13 00:17 · Score: 3, Interesting
  
  Well, the typical bugs that affect C programs, like buffer overflows, using a dereferenced pointer, etc., along with common mistakes made in procedural programming in general, like off-by-one errors, are much less likely to come up in a functional language like Haskell. In a lot of cases Haskell code is simply a LOT shorter and easier to read than its C/C++ counterpart, which makes it much easier to find a problem; not much harder than finding the problem in a spec on paper.
  So, no I don't think it guarantees anything, but it's a lot better than C code on its own.
2. Re:The Amiga Hand? by Anonymous Coward · 2009-08-13 00:31 · Score: 5, Informative
  
  The missing word is formal.
  They use a formal specification, which is then formally verified.
  The overhead? It took something like 5 years for a 10,000 line program. The benefit is if the specification is right, the program should be right.
  Other questions are answered in the FAQ linked in the summary and this page.
3. Re:The Amiga Hand? by Tom · 2009-08-13 00:43 · Score: 4, Interesting
  
  > How can you prove that that program is bug-free? How about conceptual bugs?
  Formal verification does not tackle conceptual bugs. What it does is prove that the implementation conforms to the specification. If your specification is false, then it is false, but the implementation will correctly implement the false behaviour. In other words, this checks whether the house and the building plan are identical. If the plan has a window where there shouldn't be one, then that window will be there, because it's on the plan.
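  A minimal sketch of the house-and-plan point, in Python. Everything here is hypothetical and made up for illustration (the names `spec_max`, `impl_max`, `conforms` are not from seL4), and the exhaustive comparison over a tiny domain is only a stand-in for a real proof. The plan contains a conceptual bug, the house matches the plan exactly, and so the check passes while the result is still not what was wanted:

```python
from itertools import product

def spec_max(xs):
    """The 'plan': meant to return the largest element, but written
    with min() by mistake. This is the conceptual bug."""
    return min(xs)

def impl_max(xs):
    """The 'house': a hand-written loop built to match the plan."""
    best = xs[0]
    for x in xs[1:]:
        if x < best:
            best = x
    return best

def conforms(impl, spec, domain, max_len):
    """Compare impl and spec on every non-empty list up to max_len
    drawn from a small domain. A stand-in for a proof, not a proof."""
    for n in range(1, max_len + 1):
        for xs in product(domain, repeat=n):
            if impl(list(xs)) != spec(list(xs)):
                return False
    return True

# The implementation conforms to the specification on the whole domain...
matches_plan = conforms(impl_max, spec_max, domain=range(4), max_len=4)
# ...and still does not do what was actually wanted.
does_what_we_wanted = impl_max([1, 3, 2]) == 3

print(matches_plan, does_what_we_wanted)
```

  The conformance check has nothing to say about whether `spec_max` was the right thing to ask for; that question sits outside what is being checked.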
  
  > What overhead does this approach have? Are the benefits worth it?
  RTFA. The amount of work required is staggering (four years, 200,000 theorems to prove) but since it's a verification of code, not additional testing code, there is zero overhead when the system is running.
  
  --
  Assorted stuff I do sometimes: Lemuria.org
4. Re:The Amiga Hand? by Chris+Mattern · 2009-08-13 03:48 · Score: 4, Insightful
  
  > The benefit is if the specification is right, the program should be right.
  We'll have to prove the specification does what we want, then. Of course, then we have to make sure our conception of what we want is right...
  Personally, I think it's elephants all the way down.
Re:Yeah right by oji-sama · 2009-08-13 00:07 · Score: 5, Funny

It's not a bug, it's a formally proved feature ^.^

--
It is what it is.
Give me six lines of code... by Joce640k · 2009-08-13 00:10 · Score: 4, Funny

"If you give me six lines of code written by the most diligent of programmers, I will surely find enough in them to crash the OS" - Cardinal Ritchielieu

--
No sig today...
1. Re:Give me six lines of code... by Telecommando · 2009-08-13 00:15 · Score: 5, Funny
  
  Every program contains at least one bug, and can be shortened by at least
  one instruction. By induction, every program can be reduced to a single
  instruction which doesn't work. /old, I know.
  
  --
  Beta sux! Join the Slashcott! http://hardware.slashdot.org/comments.pl?sid=4760465&cid=46173047
2. Re:Give me six lines of code... by ranulf · 2009-08-13 00:31 · Score: 4, Funny
  
  > By induction, every program can be reduced to a single instruction which doesn't work.
  HACF?
3. Re:Give me six lines of code... by Anonymous Coward · 2009-08-13 00:54 · Score: 5, Funny
  
  /*
  *
  *
  *
  *
  */
Apps running on top will crash... so by WiseOwl2001 · 2009-08-13 00:11 · Score: 3, Insightful

Even if we have a perfect kernel, it won't insulate us from bugs in the software running on top of that kernel, so do we really gain much? I guess for mission-critical apps the answer could be yes... But for everyday computing? On my desktop I have more trouble with Firefox crashing than I do with the OS! (Yes, I run Linux.)
1. Re:Apps running on top will crash... so by John+Hasler · 2009-08-13 00:15 · Score: 5, Informative
  
  > Even if we have a perfect kernel, it won't insulate us from bugs in the
  > software running on top of that kernel, so do we really gain much?
  Since a kernel crash kills all your applications and background processes, kills your network connection, requires you to reboot, and can scribble anywhere on the disk, yes.
  
  --
  Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
2. Re:Apps running on top will crash... so by Tom · 2009-08-13 00:47 · Score: 4, Insightful
  
  Yes, it gains you a lot.
  Firefox crashing means your userland memory is fucked up and can't be trusted anymore. No problem, kill it, clean it and restart the application.
  A kernel crash leads to undefined behaviour on the ring 0 level. You don't want that, it's where root exploits live.
  Furthermore, we have a lot of really, really strong kernel-level security extensions, like SELinux, whose only two vulnerable spots are kernel-level exploits and weak security policies. If you can remove one of them, you've done a lot to improve security.
  
  --
  Assorted stuff I do sometimes: Lemuria.org
Knuth on proven correct: by John+Hasler · 2009-08-13 00:11 · Score: 5, Insightful

"Beware of bugs in the above code. I have only proven it correct, not tested it."

--
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
Re:Thank goodness by johannesg · 2009-08-13 00:16 · Score: 3, Insightful

> People are starting to see the value of this. Also of programming in logic based languages like Haskell, ML etc.
People have seen the value of this since the first days of programming. In fact, the value is so enormous that no one can afford it... And they have just finished proving that first few lines of code they wrote. In another five decades they hope to be able to have Notepad proven and ready to run so you can actually get some work done!
Then a driver blows it all up.. by tjstork · 2009-08-13 00:18 · Score: 4, Funny

Suddenly after that, the proven kernel will be brought to its knees when someone adds a driver for an old graphics card.

--
This is my sig.
Who proved the proof-checker? by Dr.+Manhattan · 2009-08-13 00:23 · Score: 3, Funny

Recursion (n): see recursion.

--
PHEM - party like it's 1997-2003!
Proven? by mseeger · 2009-08-13 00:28 · Score: 5, Insightful

There is an old corollary that says you cannot get from the informal to the formal by formal means. All they have proven is that two specifications contain the same bugs. Both specifications were formal (Haskell, C). This is the same as having Perl and Python code and proving that they implement the same functionality. Neither is a proof that it is bug-free (using the informal definition of a bug, not the one where a bug that is specified isn't a bug).
Re:Godel's Incompleteness Theorem? by TheSunborn · 2009-08-13 00:31 · Score: 4, Insightful

Godel's Incompleteness Theorem just says (in this context) that there exist infinitely many kernels that are correct but which cannot be proven correct. It does not say that no kernel can be proven correct.
So they just happen to write one of the kernels that could be proven.
Re:Provable? by jamesh · 2009-08-13 00:33 · Score: 5, Insightful

> I thought any sufficiently complex system was impossible to prove correct.
Then obviously this OS is not sufficiently complex.
World's First Formally-Proven OS Kernel - NOT by neongenesis · 2009-08-13 00:35 · Score: 3, Informative

Do some research, guys...
Honeywell SCOMP, early 1980s. Was intended to be a secure front-end to Multics.
Verified by NSA et al. as part of the first Orange Book A1 level certification.
For the time it was a magnificent bit of work.
OK, you had me going there for a while... by pedantic+bore · 2009-08-13 00:44 · Score: 3, Interesting

This is a nice accomplishment, but L4 is such a minimal kernel that some folks argue that it's not even a microkernel. It's a picokernel.
It's a lot easier to get the kernel right when it only has twelve entry points...

--
Am I part of the core demographic for Swedish Fish?
Good News and Bad News by Jacques+Chester · 2009-08-13 00:48 · Score: 4, Insightful

First off, as an Australian and a nerd, I am very proud.
Now.
Good news: there is now a formally verified microkernel. 8,700 lines of C and 600 odd lines of ARM assembly. Awesome.
Bad news: it took 200,000 lines of manually-generated proof and approximately 25 person-years by PhDs to verify the aforementioned microkernel.
Conclusion: formal verification of software is not going to take off any time soon.

--

Classical Liberalism: All your base are belong to you.
Re:Sounds like automated unit test generation to m by maxwell+demon · 2009-08-13 00:50 · Score: 3, Informative

Formal proofs are not unit tests. Unit tests test that certain values work correctly. Formal proofs show that the code works to the specification in all cases (modulo errors in the proof, of course). They of course cannot find bugs in the specification (which unit tests might, if they test what you thought the specification said, instead of what the specification really said).
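A rough sketch of that difference, in Python. The function `saturating_add`, its planted bug, and the chosen test cases are all invented for illustration. Checking every input of a small finite domain is used here only as a stand-in for "all cases"; it is not how a theorem prover such as Isabelle works, and it does not scale to real kernels:

```python
def spec(a, b):
    """Specification: add two unsigned 8-bit values, clamping at 255."""
    return min(a + b, 255)

def saturating_add(a, b):
    """Implementation with one deliberately planted bug."""
    if a == 200 and b == 55:
        return 0  # wrong for exactly this input pair
    return min(a + b, 255)

# Unit tests: a handful of hand-picked values, all of which pass.
unit_test_cases = [(0, 0), (1, 2), (255, 1), (100, 100), (254, 1)]
unit_tests_pass = all(
    saturating_add(a, b) == spec(a, b) for a, b in unit_test_cases
)

# "All cases": every pair of 8-bit inputs (65,536 of them).
counterexamples = [
    (a, b)
    for a in range(256)
    for b in range(256)
    if saturating_add(a, b) != spec(a, b)
]

print(unit_tests_pass)   # the sampled tests do not notice the bug
print(counterexamples)   # the exhaustive check does
```

The sampled tests only say something about the five pairs they try, while the exhaustive check covers the whole input space. A formal proof aims for that same coverage by reasoning rather than by enumeration.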

--
The Tao of math: The numbers you can count are not the real numbers.
Re:Empty promises... by ezzzD55J · 2009-08-13 00:54 · Score: 5, Insightful

> Most faults on most platforms are caused by hardware faults
bullshit.
Re:Thank goodness by Anonymous Coward · 2009-08-13 01:10 · Score: 3, Insightful

Name the logic that C is based on, then.
C may be "logical" in a colloquial sense. It's not based on a formal logical calculus.
Do you even know what the hell you are talking about?
You mean they aren't all tested like this? by Remus+Shepherd · 2009-08-13 01:14 · Score: 4, Insightful

As someone who does not work in IT, count me as surprised that not all OSes are tested this rigorously.

--
Genocide Man -- Life is funny. Death is funnier. Mass murder can be hilarious.
1. Re:You mean they aren't all tested like this? by maxwell+demon · 2009-08-13 01:59 · Score: 3, Insightful
  
  Linux has more than ten million lines of code. Given that they needed 5 years for 12 persons to verify ten thousand lines of code, this means verifying the Linux kernel would give an estimated cost of 60,000 man-years. So even if they got a thousand people doing nothing else but verifying the Linux kernel, it would take them 60 years to finish.
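  The back-of-the-envelope estimate above, spelled out in Python. The inputs are the round figures from this comment, not measurements, and the calculation assumes that proof effort scales linearly with lines of code, which is itself a guess:

```python
# Round figures quoted in the comment (estimates only).
verified_lines = 10_000      # size of the verified kernel
team_size = 12               # people on the verification effort
years_taken = 5              # calendar years they needed
linux_lines = 10_000_000     # "more than ten million lines"
workforce = 1_000            # hypothetical number of full-time verifiers

# Effort already spent, in person-years.
person_years_spent = team_size * years_taken

# Scale linearly by code size (integer arithmetic keeps this exact).
linux_person_years = person_years_spent * linux_lines // verified_lines

# Spread that effort over the hypothetical workforce.
calendar_years = linux_person_years // workforce

print(person_years_spent, linux_person_years, calendar_years)
```

  With these inputs the script prints 60, 60000 and 60, matching the numbers in the comment.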
  
  --
  The Tao of math: The numbers you can count are not the real numbers.
Re:Wish I had mod points by Anonymous Coward · 2009-08-13 01:24 · Score: 3, Insightful

That this moves the bugs to the formal specification is true, but that therefore you might just as well write the actual code is an invalid derivation. Formal specification languages are designed with the idea that one should be able to reason about them in mind (be it manually or with the help of automatic/interactive theorem proving, model checking). This typically leads to a language in which systems can be expressed on a higher level, because performance issues are not important: the specification does not need to be executed. Due to this higher level and the fact that they are easier to reason about with tool support, it is easier to find a bug in a formal specification than it is in programming language code.
Re:Thank goodness by xaxa · 2009-08-13 01:56 · Score: 3, Insightful

It's a paradigm, technically. Although Haskell isn't a logic language, it's functional. Prolog is logical, and nigh useless for most applications.
No, it's just more difficult to write the program for most applications.
IMO, it's because it's more difficult to precisely articulate the problem and method (for Prolog) than to work through the solution (for an imperative language).
A truly marvelous by autophile · 2009-08-13 02:28 · Score: 4, Funny

I have written a truly marvelous bug-free operating system, however there is not enough space on this disk to include it here.

--
Towards the Singularity.
Re:Thank goodness by kamatsu · 2009-08-13 03:23 · Score: 3, Insightful

Dude, can C's types be reasoned by formal inference? No. Hence, C does not follow typed logical calculus.
C doesn't follow boolean logic either, actually, it just maps to assembly instructions. The best thing you could do to reason about C is to use Dijkstra's proof method which is impractical in a large scale and easy to screw up.
Re:Empty promises... by Wraithlyn · 2009-08-13 04:42 · Score: 3, Insightful

The context of your full original post does not change the fact that you claim most faults are caused by hardware, which is the specific point he was disputing.
If you have something to strengthen your claim (from your original "context" or otherwise), present it. Otherwise, complaining about being quoted "out of context" is just rhetorical posturing.

--
"Mind, as manifested by the capacity to make choices, is to some extent present in every electron." -Freeman Dyson
Re:Thank goodness by ioshhdflwuegfh · 2009-08-13 04:43 · Score: 3, Insightful

> You know the funny thing about this whole discussion is that the OS linked to in the article is not the first. Integrity from Green Hills Software was proven correct a while ago. It is popular for safety critical stuff like flight controls for airplanes and is one of the dominant players in that niche.
> http://www.informationweek.com/blog/main/archives/2008/11/green_hills_sof.html
> And what is truly amusing about following this argument, is that Integrity is written in C. :)
Although I can see that you're amused, what you're saying is false: Integrity is not formally proven correct, it only has some amusing but mathematically irrelevant industry certificate.
Re:Thank goodness by morgan_greywolf · 2009-08-13 04:49 · Score: 3, Informative

You're conflating two different concepts. Common Criteria Evaluation Assurance Level focuses on security, while this test focuses on complete mathematical provability. IOW, can it be mathematically proven that the kernel meets all of its specifications and that the compiled kernel is exactly what was specified in the source code? CC EAL focuses only on security aspects.
Furthermore, a system that was specified as being completely secure[1] would, in theory, be equivalent to an EAL 7, not merely "6+".
[1] I mean this only in a hypothetical sense since I don't believe it is even possible to specify a system that is completely impenetrable, let alone implement one. But then, that's because I subscribe to the theory of information security that says a completely secure system is impossible, therefore we must use multiple compensating controls that get us to a 'virtually impenetrable' state, tuned to prevent the most likely types of attacks (cheap) vs. the possibility of someone building a multi-billion dollar super cluster to break the security.

--
My blog