naasking · Slashdot Mirror

User: naasking

naasking's activity in the archive.

Stories: 0
Comments: 2,000
First seen: 2000-02-09
Last seen: 2017-01-04

Comments · 2,000

Re:apples and oranges on The Microsoft Singularity · 2005-11-04 04:41 · Score: 1
What do you suppose "protocol conformance" means? It means avoidance of exactly the kind of invalid state transition that we've been talking about. Your claim that "there is nothing a verifier can do to prevent this" is clearly false since the verifier can and does prevent the module from compiling if it contains such errors.

I say "algorithmic errors". You say "protocol errors". We are not saying the same thing. Example: a SIP receives an HTTP GET request. It incorrectly parses it as an HTTP DELETE request. Is the verifier going to catch that? This is an algorithmic (programming) error. In the absence of a proof system that understands the HTTP protocol, a verifier/compiler will not be able to catch these sorts of bugs. In certain circumstances, you can attempt to express the solution in such a way that the type system might be able to catch this sort of error. You cannot do this for all problems.
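A minimal sketch of that distinction, using Python type hints in place of a static verifier. The `Method` enum and both parser functions are hypothetical names introduced for illustration; none of this is Singularity or Sing# code. Both functions have the same well-typed signature, so a type checker would accept either one, yet only the second handles the request correctly:

```python
from enum import Enum


class Method(Enum):
    GET = "GET"
    DELETE = "DELETE"


# Algorithmic error: the lookup table maps GET to the wrong method.
# Every value is still a valid Method, so the types check out.
BUGGY_TABLE = {"GET": Method.DELETE, "DELETE": Method.DELETE}


def parse_method_buggy(request_line: str) -> Method:
    """Well-typed but wrong: a GET request is reported as DELETE."""
    token = request_line.split(" ", 1)[0]
    return BUGGY_TABLE[token]


def parse_method(request_line: str) -> Method:
    """Maps the first token of the request line to the matching method."""
    token = request_line.split(" ", 1)[0]
    return Method(token)


request = "GET /index.html HTTP/1.1"
print(parse_method_buggy(request))
print(parse_method(request))
```

Catching the first function would take a specification of what the HTTP request line means, which is more than type and memory safety provide.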

You're letting bias get in the way of understanding what you're looking at, naasking. Please take off the EROS-colored glasses for a moment and actually look at what Singularity is doing instead of just trying to mine the document for quotes that you (wrongly) think support your predetermined position.
1. You seem to suffer from the very problem you accuse me of, as your implication of EROS-coloured glasses demonstrates. This particular discussion is not about EROS and nowhere did I drag EROS into it (except in addressing the original EROS claim). Instead of attempting to divine my state of mind, please just stick to addressing the facts.
2. I asked you two posts ago to define "invalid state transition" so that we could have a common understanding with which we could analyze how Singularity prevents it and other systems do not. I'm still waiting. All disagreements so far have stemmed from this lack of common basis.
3. I'm confused why you would consider citing quotes from the actual source a bad thing. It's my understanding that it actually clarifies the discussion, as this exchange just demonstrated.
Re:apples and oranges on The Microsoft Singularity · 2005-11-04 03:25 · Score: 1

Look again, this time at the section on Sing# and Boogie which do plenty to prevent this.

From Section 4.8:
Verifying that code executed in Singularity is type safe and satisfies the memory independence invariants is a three-stage process. The Sing# compiler checks type safety, ownership rules, and protocol conformance during compilation. The Singularity verifier checks these same properties on the generated MSIL code. Finally, the back-end compiler should--but does not as yet--produce a form of typed assembly language that enables these properties to be checked yet again by the operating system.
Type and memory safety are the only guarantees. These go a long way sure, but they do not prevent algorithmic errors.
Re:apples and oranges on The Microsoft Singularity · 2005-11-04 02:44 · Score: 1

Whether or not you call the two systems "microkernels", they are different designs and use very different runtimes. You can implement the same architectural features in both, but that's like saying that you can implement anything in assembly that you can implement in Perl--it's just not an interesting statement.

But it's a true statement. Your original statement that EROS cannot provide these same guarantees was simply false. I merely pointed it out, nothing more.

And you do a context switch on every method call?

You don't perform a context switch on every method call in any language. The performance cost is simply prohibitive.

C# and Java guarantee fault isolation at the level of the individual programming language object, there is nothing like it in any C-based system. C# and Java also allow you to get information about the type and structure of every single allocated block of storage; try that in a C runtime.

There is nothing stopping you from programming each EROS process in C#, Java, or any other language which provides these same features. You are way too fixated on the fact that the EROS core is written in C. That has little to no bearing on what I was pointing out. I am not a C fanboy, nor am I saying a system should be programmed in C. Stop attacking strawman arguments.

The architecture isn't at issue, at issue is developing kernel modules in an unsafe language without reflection, without class loaders, and without garbage collection.

This and your original statement are still incorrect. Firstly the OS "architecture" is fundamental to how you program in these systems. Secondly, there is no such thing as a "kernel module" in either Singularity or EROS. There are only Software Isolated Processes and user-level processes respectively. Communication between these isolated object pools is accomplished via IPC. In Singularity and EROS, the IPC is specified via a strong interface contract which is an IDL; or you can program the marshalling and unmarshalling of data manually in EROS if you like, but who wants to do that?

Thirdly, there is nothing preventing you from implementing anything in a higher-level language in EROS, except the realities of system programming. For example, try writing a disk driver backed by Singularity's runtime features, on a system that is fully loaded and swapping. The disk driver tries to allocate memory, a page must be swapped to free a frame for the driver, this is converted into a request to the disk driver to write the page to disk, but oops! the driver is waiting for memory and can't proceed.

Some system software and application software simply must be programmed fundamentally differently. My earlier assertions are that full control over runtime features is necessary in these scenarios, as my example illustrates.
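The swap scenario above can be modelled in a few lines. This is a toy simulation under stated assumptions, not EROS or Singularity code: `FrameAllocator`, `AllocatingDriver`, and `PreallocatedDriver` are hypothetical names, and a raised exception stands in for what would be a hang on a real system.

```python
class OutOfFrames(Exception):
    """Raised when eviction itself needs memory that is not available."""


class FrameAllocator:
    def __init__(self, free_frames):
        self.free_frames = free_frames
        self.driver = None
        self.evicting = False

    def alloc(self):
        if self.free_frames == 0:
            if self.evicting:
                # The driver asked for memory while the allocator was already
                # waiting on the driver: a real system would deadlock here.
                raise OutOfFrames("driver needs memory that needs the driver")
            self.evicting = True
            try:
                self.driver.write_page()  # swap one page out to disk
            finally:
                self.evicting = False
            self.free_frames += 1  # the evicted page's frame is now free
        self.free_frames -= 1

    def free(self):
        self.free_frames += 1


class AllocatingDriver:
    """Allocates a fresh buffer for every write request."""

    def __init__(self, allocator):
        self.allocator = allocator

    def write_page(self):
        self.allocator.alloc()
        self.allocator.free()


class PreallocatedDriver:
    """Reserves its buffer once at startup and never allocates afterwards."""

    def __init__(self, allocator):
        allocator.alloc()

    def write_page(self):
        pass  # uses the buffer reserved in __init__


def survives_memory_pressure(driver_class):
    allocator = FrameAllocator(free_frames=2)
    allocator.driver = driver_class(allocator)
    while allocator.free_frames > 0:  # applications use up all free memory
        allocator.alloc()
    try:
        allocator.alloc()  # forces a page to be swapped out
        return True
    except OutOfFrames:
        return False


print(survives_memory_pressure(AllocatingDriver))
print(survives_memory_pressure(PreallocatedDriver))
```

The second driver survives only because its author controlled when allocation happens, which is the kind of control over the runtime being argued for.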
Re:apples and oranges on The Microsoft Singularity · 2005-11-04 02:26 · Score: 1

You completely misunderstood the point of Singularity. The point of Singularity is that all code (except OS code) is subject to verification, and any code that isn't verifiable is runtime bounds-checked.

No I really didn't. There is no reason you cannot build this functionality on EROS. Which was my point that you seemed to have missed. We have this functionality running on Linux right now with Java, mono, etc.

Furthermore, in Singularity, inter-process communication is structured, such that the OS can verify IPC traffic.

All IPC is structured to a certain extent, particularly in microkernels. Furthermore, why should the OS care how an IPC data string is structured? This is a contract between communicating entities. This is specified in typical microkernel systems like L4 and EROS via IDL. This is implemented in Singularity via their strong interface contracts (which is just another IDL). This is not a revolutionary idea and is not among the true benefits that Singularity provides.

Furthermore, the languages for Singularity are strongly typed at the object code level, and garbage collection is performed by the OS--explicit deallocation is impossible for any application. These facilities make it impossible for any application to have buffer overruns, segfaults, or overruns of other apps' data

These are all guarantees you can have in any other OS. Other microkernels merely use the MMU to enforce isolation instead. That's the only difference between EROS and Singularity in this regard, and again, it's not among the main benefits Singularity provides.

All that has nothing whatsoever to do with Eros. The two projects are not even similar.

Sure they are. They both seek to create a secure, reliable operating system. Comparing them on the techniques they use to accomplish those ends is perfectly valid.

The Coyotos OS is based on Eros and is quite similar.

Coyotos is being developed by the EROS creators. I'm on the development mailing list for both. I'm very familiar with these projects.

Additionally, Eros is not completely revolutionary. From the eros web page, What's new about Eros?: "Each of these facilities is...essential to providing scalable reliability, and all of them have appeared in prior systems. No prior system, however, has ... this particular combination ... in quite the same way.".

To truly understand the impact of such a statement, I think you should read into EROS. Using the same logic, Java and .NET were nothing revolutionary, and yet they transformed an entire software industry, and it is driving this new operating system which everyone thinks is the cat's pyjamas. Sure EROS is not the first capability-based system, sure it's not the first microkernel, sure it's not the first system with a coherent single-level storage model, but it provides all of these features, and then some. It started as a re-implementation of KeyKOS, and grew into its own beast.

Your arrogance is unjustified.

I'm sorry, but if you read the previous thread where idlake and I argued, you'll see that my supposed "arrogance" is quite justified. Though how you can classify a simple statement of fact (that idlake has not programmed on anything even remotely resembling EROS) with arrogance is beyond me.
Re:apples and oranges on The Microsoft Singularity · 2005-11-04 02:12 · Score: 1

I was also involved in that previous discussion (in fact your post there was spawned by one of mine) and I think idlake is on the right side this time.

In the right? I never countered any of idlake's claims other than the false ones, namely that such functionality cannot be had in other properly designed microkernel systems (of which EROS is one).

The EROS architecture still relies on hardware memory protection between processes, which is a fundamentally different paradigm than a single-address-space OS that uses either compile-time or load-time verification to guarantee safety.

Singularity is not a single address space OS.

In addition, the EROS constructor does not provide any of the higher-level guarantees (e.g. regarding invalid state transitions) that Sing# does via the "Boogie" verifier.

What do you consider an "invalid state transition"? From which perspective? The EROS kernel, due to its atomic design, ensures that each state following a message send is correct given the previous state was correct from the kernel's perspective. From the application perspective, if it contains an algorithmic error, the state transition may trigger a bug, so a message may induce an incorrect state transition. There is nothing a verifier can do to prevent this. Only a system programmed fully with verification proofs can make guarantees about total correctness in the face of changing state.

Yes, you could build all the important parts of Singularity in EROS (and vice versa) but the result would be more of an emulation or virtualization like User-Mode Linux than like a single OS.

This is the nature of computation. Xen is a "virtual machine monitor" which is really just a microkernel (or exokernel if you prefer). Does running any operating system atop Xen make it any less of an operating system? There is no difference between "emulation" and "execution". The distinction is artificial and does not exist. Just because the code executed in Singularity is MSIL and not the native instruction set, does that make it "not a real operating system because it's emulated"?

There's nothing wrong with EROS, and there's nothing wrong with Singularity. They can both help us learn different things. Pissing contests about which one trumps the other are just pointless and childish.

I agree. Singularity looks very cool. But if someone makes an incorrect statement, I will correct them. I would expect nothing less.
Re:apples and oranges on The Microsoft Singularity · 2005-11-03 10:39 · Score: 2, Insightful

EROS uses C and relies on memory management hardware for isolation. EROS also can't analyze or verify code it loads.

I don't want to re-hash our previous argument on this subject, but the above statement is trivially falsifiable. Singularity is built on a microkernel. EROS is built on a microkernel. Anything that can be built on Singularity, can be built on EROS, including verifiers, virtual machines, Software Isolated Processes, etc.

EROS has a default mechanism for isolating faults and loading untrusted code in the absence of any safety guarantees: the constructor.

I don't know whether Singularity is going to make it, but I have used and developed on systems like it (the idea isn't new), and it is a lot nicer than either UNIX kernels or EROS-like kernels.

Not that you know what developing on an EROS-like system is like, considering it's a completely revolutionary architecture comparable only to KeyKOS from which it's derived.
Re:Things like this put an interesting spin on... on Gene Found In Black Death Survivors Stops HIV · 2005-10-29 08:26 · Score: 2, Interesting

On a farm, more kids meant more helping hands. In a city those helping hands aren't needed, and in fact pull down prosperity levels. As such, people choose not to have them.

I agree that "overpopulation" will not be a problem in the future. However, the above strikes me as a little too rational and informed. I'd attribute people's choices more to selfishness than anything else: there are simply so many more choices and opportunities available today than there were in the past, so people are more reluctant to give up their freedom and take on more responsibility as young as people did in the first half of the last century.

My personal theory is that population explosion drives growth in scientific and technological advancement, which increases leisure and freedom of choice, which feeds back to the input and decreases the drive for population growth; a nice, fairly self-regulating system (though perhaps a tad naive). :-)
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-27 09:57 · Score: 1

Whatever constraints you impose on storage management without GC you can impose with GC. In fact, GCs are arguably inherently better at enforcing real-time or storage management constraints than manual allocators (Sun's tribulations with Java notwithstanding--a poor GC can indeed be a headache, but so can a poor malloc).

My point was that binary modules linked with the kernel or interpreted in the kernel VM would share the same heap as the kernel itself. Allocating from this heap continually in a malicious cycle without freeing any of it is a trivial DoS attack. You have to start inventing ad-hoc quota mechanisms to start trying to limit the damage, but I've seen no accounting system that is flexible enough for all cases. If we're dealing with pure open source, this is less of an issue due to code "peer-review", but it's definitely a problem on Windows. If each allocation triggers even a brief collection cycle, the problem becomes even worse because you now also have a DoS attack against the scheduler since nothing else can run while we're stuck in kernel mode. Ugh, monolithic kernel designs are just a huge mess.

By pushing everything to user-space, microkernels export the memory management policy out of the kernel and make it fully pluggable. Thus, there is no global vulnerability to DoS, there is only a local vulnerability on a particular memory manager which may only be shared by a subset of the whole system; thus only that subset may be affected by an attack (depending on how resilient the design is). CapROS furthers this design with its single-level storage model, and a hierarchical Space Bank scheme, with potentially one Space Bank assigned per object (object-level granularity in storage accounting). You're clearly a fan of object-oriented design, and the CapROS architecture is fully object-oriented, Actors-style, right down to the bits written to disk; the object-model lacks only inheritance.
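A rough sketch of hierarchical storage accounting in the spirit of the Space Bank scheme described above. The `SpaceBank` class, its `limit` parameter, and the `allocate` method are hypothetical illustrations and not the CapROS interface; the only point is that a client draining its own bank cannot starve a sibling.

```python
class BankExhausted(Exception):
    """Raised when a request would exceed a bank's page limit."""


class SpaceBank:
    """Hypothetical storage account: a page limit plus an optional parent."""

    def __init__(self, limit, parent=None):
        self.limit = limit
        self.used = 0
        self.parent = parent

    def allocate(self, pages=1):
        # Check every bank up the hierarchy first, so that a refused
        # request charges nobody.
        chain = []
        bank = self
        while bank is not None:
            if bank.used + pages > bank.limit:
                raise BankExhausted("page limit reached")
            chain.append(bank)
            bank = bank.parent
        for bank in chain:
            bank.used += pages


root = SpaceBank(limit=100)
untrusted = SpaceBank(limit=10, parent=root)
trusted = SpaceBank(limit=50, parent=root)

# A malicious client allocates in a loop and never frees anything.
granted = 0
try:
    while True:
        untrusted.allocate()
        granted += 1
except BankExhausted:
    pass

# The sibling bank is unaffected by the attack.
trusted.allocate(50)
print(granted, trusted.used, root.used)
```

With a single shared kernel heap there is no equivalent of the `untrusted` limit, which is what makes the allocation loop a system-wide problem there.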

Regardless of how much better GC is, you simply cannot do better than a fully-pluggable memory management system.

Furthermore, since every extension to the system is a user-mode process, everything is schedulable, nothing ever blocks or stalls in the kernel, so now we can even provide realtime guarantees. Everything becomes simpler with microkernels. Only programming the kernel becomes a bit harder.

UNIX has been, and continues to be, widely used for large multiuser systems with sometimes thousands of untrusted and malicious users. It really works for that application.

No, they are not malicious users, though they may be untrusted users. If they were actually malicious, that system would be taken down within minutes of it being brought up; certainly not the mark of a secure or reliable system architecture.

Well, it's good that we do agree as much that writing a kernel in a new language is a good idea.

It's a great idea, especially with the state of theorem proving as advanced as it now is; I have high hopes for Coyotos. I believe our only points of disagreement were the suitability of C as a kernel language right now (due to lack of better options), and the benefits of the microkernel design. Having a safer language in which to write kernels is always a good idea. I mean really, why the hell wouldn't you want a better/safer language, right?
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-27 09:36 · Score: 1

Or you may get a completely unrelated value, an incorrect value every 4 billion accesses, or a crash.

I suppose you're referring to uninitialized variables. Indeed, C imposes no restrictions on initialization, just like assembly. You are free to be as safe, or unsafe as you like within the limits provided by the language implementation.

It's also a problem of software engineering because it's impossible looking at a piece of C code to determine if and where it uses unsafe or unportable constructs.

I'm not sure this is entirely accurate. The class of unportable constructs is fairly well-known. Pointer casting for instance. Sticking to the straight C datatypes with no casting is generally pretty safe.

That causes major problems when porting systems code (but I gather CapROS is Intel/gcc only at this point).

Yes, x86-gcc only I believe.

This sort of difficulty is completely unnecessary--it buys you nothing in terms of performance or expressiveness. It's a historical accident. It would be trivial to have a language that is in every respect like C but that cleanly separates its portable and defined constructs from implementation or undefined constructs.

For the most part, that's very likely. But it's not here today, and that's all that matters when choosing a language here and now. Just cross your fingers that they get BitC right.
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-27 04:59 · Score: 1
People have written operating systems in Modula-3 and Ada, so those would be obvious choices and they have free implementations. Of course, not many people know them, which is kind of a problem.
1. Modula-3: whether or not you believe me, the fact remains that garbage collection is unsuitable for microkernels. Can't use Modula-3.
2. Ada: a promising candidate. Unfortunately, in my tests about a year ago it had some binary-bloat problems which hinder its use in a microkernel. That and gnat is currently not available for amd64 which is my current machine.
The C language makes few guarantees about how machine datatypes map onto C datatypes. The mapping may even change according to compiler versions and compiler flags, and the mappings are often not the "obvious" ones.

I know that. We've been over it before. It's also not what I was saying. I was making an observation. It still stands. We are not talking about the 'abstract standard', we are talking about the reality of implementing a kernel right now. The reality is, you'll have to pick a particular compiler to write a kernel regardless of the implementation language. I'd wager any of these compilers have efficient, direct mappings between C types and underlying architecture value sizes.

Even something as simple as accessing the first byte of an int through a char pointer is undefined.

Because the "first byte of an int" depends on the endianness of the underlying machine. This is the identical situation encountered under assembly language: you are exposed to the details of the architecture. This is why I said C is barely a step above assembler. A well-written kernel (micro or monolithic) will generally consist of: effectively 100% portable C, platform-specific C, platform-specific assembly. I see no reason why the portable C can't conform to the standard.
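The endianness point can be shown without any C at all. The sketch below serialises the same 32-bit value in both byte orders; the value `0x11223344` is an arbitrary example, and which of the two layouts matches memory on a given machine is reported by `sys.byteorder`.

```python
import sys

value = 0x11223344

little = value.to_bytes(4, byteorder="little")
big = value.to_bytes(4, byteorder="big")
native = value.to_bytes(4, byteorder=sys.byteorder)

# The "first byte" is the least significant byte on a little-endian
# machine and the most significant byte on a big-endian one.
print(hex(little[0]))  # 0x44
print(hex(big[0]))     # 0x11
print(sys.byteorder, hex(native[0]))
```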

And your remarks highlight another problem with C: its users don't even realize when they are using undefined constructs.

This is a problem of ignorance, not the standard or the language itself; people who can't be bothered to understand the language they're using have no business writing kernels.

Also, very few programs are 100% portable across all architectures and compiler implementations. Show me a serious program that is 100% portable across Lisp implementations, particularly a program that uses FFI.
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-27 04:01 · Score: 1

OK, so to summarize, you agree now that safe, garbage collected, object-oriented languages can give performance competitive with C, then. (It doesn't take a VM or a compiler/runtime more complicated than C, but that's a separate issue.)

I never said it couldn't. I never said nor implied that C was the fastest language. Runtime-profiling and optimizations can often far outperform C. I'm a big fan of LLVM. That still doesn't mean that these features are a) suited for microkernel development, or b) available for microkernel development.

There is extensive evidence. If you want to check it for yourself, just try using DCOM or RMI to put some objects of an application into a separate address space. Not only is it a significant amount of work, it's also slow.

This is anecdotal evidence. All this demonstrates is that DCOM and RMI are poor IPC implementations, not that IPC cannot be easy and fast.

Finally, the use of garbage collection doesn't give you license to abandon all principles of good storage management. In an environment like a kernel, you would likely use arenas, allowing you to recycle all resources associated with a user or a process instantly.

In my question I'm referring to malicious code which will not play by the rules of "good storage management". This code is actively seeking to break your system. In CapROS, you can run arbitrary code without fear of compromising your system or damaging your data because it supports confinement and fine-grained delegable resource management. You can't even do this in any other environment that I'm aware of (except some designs which were inspired by KeyKOS/EROS), Java/Smalltalk/etc. VMs included.

Well, as a UNIX user, admin, and developer for a quarter of a century now, I can say that they are not practical problems in any real-world usage I have encountered. The practical problems with UNIX are lack of robustness against errors in dynamically loadable modules, difficulties associated with developing new kernel modules, and problems with module versioning.

That's because nobody would dream of using insecure operating systems like UNIX and Windows in environments which require reliability, and secure collaboration among untrusted parties. Java has made some headway into this space (see Active Networks), but it too is limited (see the paper I referred you to on SCOLLAR; stack-walking security is inherently limited in distributed systems).

UNIX is inherently based on a collaborative system design; it was not designed to support secure collaboration however. Adding full ACLs or a security system like SELinux only moves it partly in that direction (and it becomes an administrator's nightmare).

I understand your frustration with the same problems cropping up over and over again as programmers continue to repeat past mistakes. Microkernels in unsafe languages do improve the situation somewhat. Microkernels in safe verifiable languages like BitC+Coyotos will solve a great deal more of these problems (more than your design I'd wager). The solution is not to drag all that cruft into a single monolithic kernel. But that's just my opinion; well, mine and a great many others who share it.
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-27 01:06 · Score: 1

Or an install-time compiler. Or a regular compiler with signed binaries.

So the idea is to forbid installation and execution of arbitrary software? Only software signed with a verified compiler is permitted?

How so? With the compile-at-install model, the only attack vector is defeating the protections in the trusted compiler. I don't think that's particularly easier than defeating the protections in a 'trusted' kernel.

Well the only problem with this approach is that you'll have a hard time convincing people to run it. Little to no backwards compatibility (particularly binary in this case) makes people uneasy.

All it requires is a GC, and I find your criteria for excluding the GC capricious and arbitrary --- one that only has merit in the specific case of RT kernels.

Capricious and arbitrary? Why the hell would I want a GC when I'm not performing any allocation or reclamation of storage? What is its purpose? There is no point in it being there, it just unnecessarily bloats the kernel.

Furthermore, the entire point of microkernels is to design a kernel that is small, performant and flexible enough to build almost any kind of system on top of it, realtime or not. Avoiding GC is a huge issue for microkernels, which is the very subject of this thread. If you can't exclude GC, that language implementation is unsuitable.

Define "runtime". C++ has a runtime. C++ can and has been used for writing high-performance microkernels.

Yes, because the runtime can also be excluded as with C. A "runtime" is a set of services provided at "runtime" over which the programmer need not and often cannot exert any control. It is a set of runtime assumptions from which the language itself cannot be divorced. Lisp cannot be divorced from dynamic memory allocation and automatic reclamation for example.

And yes, C and C++ also have runtime assumptions, as we've gone over plenty of times (machine model), but the distinction is that these assumptions don't matter in practice, because they have little impact on the resulting code. C, C++ and Ada can run without memory allocation/reclamation. Lisp cannot.

Lisp can be used so it requires no bigger a runtime than C++. As long as you don't try to use EVAL in the kernel, you're set.

The problem is that Lisp requires a runtime, period; the control over the runtime is not arbitrary. Many microkernels do not require a runtime, just as raw assembly does not require a runtime. C and C++, and to a lesser extent Ada, allow you to program (almost) portably, with no more practical assumptions than if you were programming directly in assembly. Does this elucidate my point? No other high-level language can be used this way yet. If there is one, point it out.
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-27 00:54 · Score: 1

If, as you just said, you "stick to machine words" what does field reordering have to do with anything?

Because we're in a new century. I/O isn't performed one word at a time anymore:

Device command queues
Example struct for SCSA1394 (see scsa1394_cmd struct)

Field reordering breaks device communication.
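To see why, consider a made-up 8-byte command block. The layout below (opcode, flags, length, logical block address) is a hypothetical example and is not the `scsa1394_cmd` struct; Python's `struct` module is used to fix the byte layout explicitly, which is the guarantee the device depends on.

```python
import struct

# Hypothetical wire layout the device expects, little-endian, no padding:
#   opcode: 1 byte, flags: 1 byte, length: 2 bytes, lba: 4 bytes
DEVICE_LAYOUT = "<BBHI"
# The same four fields after a reordering that moves the widest field first.
REORDERED_LAYOUT = "<IBBH"

opcode, flags, length, lba = 0x2A, 0x01, 0x0200, 0x1000

expected = struct.pack(DEVICE_LAYOUT, opcode, flags, length, lba)
reordered = struct.pack(REORDERED_LAYOUT, lba, opcode, flags, length)

print(expected.hex())
print(reordered.hex())
# Same size and same field values, but the device would read the first
# byte of the reordered block as the opcode.
print(hex(expected[0]), hex(reordered[0]))
```

Both blocks are 8 bytes and carry identical values, so nothing but the field order distinguishes a valid command from garbage.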

But it says nothing about alignment, so even though it's more predictable, it's still completely useless.

And once again, as I've stated before, this is what pointer arithmetic is for. If you have alignment requirements, then you need a host-specific alignment-check. Yes, it's worse than Ada which can specify alignment portably. Unfortunately, Ada has other problems, but I've mentioned them before so I won't repeat myself.
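The arithmetic behind such an alignment check is small. A sketch follows, with `is_aligned` and `align_up` as hypothetical helper names; the required alignment itself is host- and device-specific and has to come from the hardware documentation, and the mask trick assumes it is a power of two.

```python
def is_aligned(address, alignment):
    """True if address is a multiple of alignment."""
    return address % alignment == 0


def align_up(address, alignment):
    """Round address up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1) != 0:
        raise ValueError("alignment must be a power of two")
    return (address + alignment - 1) & ~(alignment - 1)


buffer_start = 0x1003
aligned_start = align_up(buffer_start, 16)
print(hex(aligned_start), is_aligned(aligned_start, 16))
```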

Well, you can, but it's completely unportable.

The number of non-portable features is small, and they are isolatable. Yes C forces you to do more work than other languages in this regard. Unfortunately, it's the only language with an implementation that has the other required features. Sucks, but there it is.
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-27 00:42 · Score: 1
No, it is not. C is a language and it has no facilities for low-level access. There are lots of C compilers that provide such access as implementation defined extensions (all in slightly different ways), but it's not part of C. EROS is a kernel written in one or two C compiler-defined dialects, not a kernel written in C. The distinction matters because you can do exactly the same thing in many other languages, and usually in better ways.
1. And once again you are ignoring the simple statements of fact. There are no such compilers for other languages. If there is, show me one. All my statements were qualified in terms of available implementations. Show me a Lisp/Smalltalk/high-level language implementation right now, that permits interfacing with low-level hardware (either natively or through FFI), that permits no runtime environment (no GC, etc.), and that has a free, downloadable compiler that compiles to native code. I don't care that "it can be done"; I never disagreed with that. Show me where it's been done.
2. Once again, I never said the C standard defined "facilities for low-level access". You cannot disagree with my statement because it is a simple observable fact: C provides direct access to datatypes which map almost directly to base valuetypes on commodity hardware.
  
  Commodity hardware: 32-bit and 64-bit desktop and server byte-addressable CPU architectures.
  C datatypes: char, int, long, long long (and unsigned variants)
  
  The above datatypes map directly to underlying byte-addressable word sizes on commodity hardware. You cannot deny this simple fact. This is important because it relates directly to what prompted this thread: why OS researchers chose C to implement microkernels. Answer: they deal with commodity hardware, they need a ubiquitous compiler, they need a language with sufficient control over memory layout and runtime. At the time, and even now, C, C++ and possibly Ada fulfill these criteria. If there is another one right now, show me.
3. Re: C dialects
  This is completely silly. Yes, if you use gcc extensions like __inline__, it's no longer technically "C". That does not mean you cannot write a kernel in pure C even on gcc. All host-specific code is encapsulated in separate asm files and linked to the portable C. There is nothing preventing you from doing this. Yes, CapROS has some gcc-specific code. This is unavoidable with kernel-level software. A similar "Lisp-kernel" will also be written to a specific compiler with an FFI/host-interface of some sort. In conclusion, this distinction doesn't matter because it's universal.
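The datatype mapping claimed in point 2 is easy to inspect on any one machine. The sketch below uses Python's `ctypes`, which reports the sizes chosen by the platform's C ABI. The exact numbers are an assumption about the host: `long`, for example, is 4 bytes on 64-bit Windows and 8 bytes on 64-bit Linux, which is the implementation-defined part of the argument.

```python
import ctypes

C_TYPES = {
    "char": ctypes.c_char,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
    "void *": ctypes.c_void_p,
}

# Size in bytes of each C type under this platform's ABI.
sizes = {name: ctypes.sizeof(ctype) for name, ctype in C_TYPES.items()}

for name, size in sizes.items():
    print(f"{name:10s} {size} bytes")
```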
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-26 15:07 · Score: 1

That is an imaginary problem, since standard hardware can support runtime safety, garbage collection, and object-oriented dispatch efficiently.

It can, if we impose a type system on top of it. See LLVM for example. You might be interested in LLVA in fact. ("LLVA: A Low-level Virtual Instruction Set Architecture")

Decomposition is good; using the MMU to decompose a large software system (kernel or otherwise) into modules is often bad.

This is pure speculation. You have no real evidence to support this conjecture.

"My idea" is a system very much like the Linux or BSD kernel but written in a compiled language with garbage collection, runtime safety, well-defined primitives for accessing memory and hardware, and a simple object system. [...] The main benefits would be easier testing, easier extensibility, fewer security problems, and better support for dynamically loadable modules.

I'm still not clear whether you actually have processes, or they're simulated on the kernel VM, whether you use the MMU at all, or rely solely on the type-safe IL and code-generation (ie. the CPU is always in kernel-mode and the entire system is essentially running as one big VM).

Also, I've pointed out a serious problem in this approach: DoS attacks. Perhaps this will be addressed when/if you can answer the above questions, but how do you account for resources and charge them to appropriate entities? Without proper accountability, you open up the system to DoS. The failure boundary for a JVM is the JVM itself. If a malicious object started accumulating memory without freeing it, the JVM would fail. What is the failure boundary in your system design?
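To make the accounting question concrete, here is a minimal sketch of charging memory to a principal against a quota. The struct and function names are hypothetical, and this is not a description of how EROS, CapROS or any JVM actually does it; it only illustrates what "charge resources to the appropriate entity" could mean.

```c
#include <stddef.h>

/* Hypothetical per-principal memory account. Invariant: used <= limit. */
struct mem_account {
    size_t limit;  /* most bytes this principal may hold at once */
    size_t used;   /* bytes currently charged to it */
};

/* Charge n bytes. Returns 0 on success, -1 if the quota would be exceeded.
 * The comparison is written so that used + n is never computed and cannot
 * overflow. */
int account_charge(struct mem_account *a, size_t n)
{
    if (n > a->limit - a->used)
        return -1;
    a->used += n;
    return 0;
}

/* Return n bytes to the account, clamping so used never goes below zero. */
void account_release(struct mem_account *a, size_t n)
{
    a->used -= (n > a->used) ? a->used : n;
}
```

With a check like this in front of every allocation, a misbehaving object exhausts its own quota rather than the whole system's memory.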

Let's say you find a way to handle the above memory accounting issue, what if the malicious object now accumulated garbage and freed it immediately, causing a great deal of work for the garbage collector? How is the CPU time tracked and charged to the object in question? Is each object individually schedulable? Does that now make them active objects, aka Actors? They get their own thread and communicate via messages? That fundamentally changes the JVM computational model. You may need a process model of some sort after all.

Notice, the above accounting issues are all existing problems with just about every popular kernel in existence, particularly UNIX clones. EROS/CapROS has solved all of the above issues and Coyotos will inherit those benefits. I'm less familiar with L4, but I believe it too has solved all of the issues; L4 is currently simply weak in the security department compared to EROS/CapROS.
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-26 14:41 · Score: 1
No, sorry, that is not true. There are languages that make such guarantees, but C isn't one of them.

I never said C made any guarantees to this effect. All of my statements have been simple statements of fact. You keep confusing them with ideology. Don't read more into what I say than what I'm saying. Here is what I said:
C provides direct access to datatypes which map almost directly to base valuetypes on commodity hardware.

This is a statement of the current state of affairs. As things stand right now, commodity hardware is based around byte-addressable machines. Word sizes correspond very closely to C datatypes. Yes, there is room for interpretation in the standard, yes there is no guarantee of a fixed integer magnitude across compilers. These are ideological issues. The simple fact I am stating is that every compiler of which I am aware provides the C datatypes as closely mapped to the underlying native hardware word sizes as is feasible.
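A short sketch of how a kernel might pin down these per-compiler facts rather than rely on the standard. The checks below are assumptions about one compiler/target pair (a typical 32- or 64-bit byte-addressable machine), not guarantees of the C language, and native_word_bits is a hypothetical helper.

```c
#include <limits.h>
#include <stdint.h>

/* Compile-time checks of target assumptions. If any of these is false for
 * the chosen compiler and machine, the build stops here. */
_Static_assert(CHAR_BIT == 8, "bytes are not 8 bits on this target");
_Static_assert(sizeof(uint32_t) == 4, "uint32_t is not four bytes");
_Static_assert(sizeof(uintptr_t) == sizeof(void *),
               "uintptr_t is not pointer-sized on this target");

/* Hypothetical helper: width in bits of the pointer-sized native word. */
unsigned native_word_bits(void)
{
    return (unsigned)(sizeof(uintptr_t) * CHAR_BIT);
}
```

The standard leaves these sizes open; the asserts record what this particular implementation is expected to provide.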

Have I not repeatedly maintained, right from the beginning, that I am discussing a particular implementation of C, Lisp, etc.? One of my three main criteria for selecting a language in which to implement a kernel was "ubiquitous implementation that could provide the other two features".

The C language makes no guarantees whatsoever about real time behavior or memory management behavior.

Precisely. OS implementors don't want these policies in their languages. If they wanted their language to dictate policy to them, they'd choose Ada. It has portable semantics, interrupt handling, scheduling policies, even realtime extensions. But if they wanted to let languages dictate policy for them, they wouldn't be doing OS research would they?

There may or may not be a stack.

Sure, it depends on the host architecture naturally. But again, I'm discussing commodity hardware since we are discussing implementing a kernel on such hardware.

There may or may not be dynamic memory allocation even if you don't ever call "malloc".

Indeed, which is why it's important for the language to be useful without a runtime.

C has as much or as little of a runtime as the compiler implementor happened to choose. GNU C happens to use a fairly simple runtime, as did K&R C, but other C compilers don't. Some have garbage collectors, for example. Try running your code under tcc in "safe" mode.

Yes, but all these additional features are taken into account when selecting a candidate in which to implement the kernel. If tcc imposes undesirable features, then it will be similarly eliminated even though it implements the C standard. I don't think I've ever said anything to the contrary.

What I have said, repeatedly, is that C:
1. Has a ubiquitous, free implementation, that provides:
2. Sufficient low-level control over representation
3. Full control over runtime features
You've just said more or less what I've been saying, yet you're disagreeing with me somehow. I'm having a hard time understanding.
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-26 14:20 · Score: 1

Right. If you're sticking to machine words, then who *cares* that C's rules for laying out structs are more predictable than what the Lisp optimizer might do to a struct? You can use a read/write oriented machine-word module in Lisp as easily as you can do the same in C!

Firstly, Lisp was merely an example, and not necessarily the best one. We are discussing high-level languages in general which perform field reordering optimizations; this is unacceptable for interfacing with hardware. As discussed, C forbids this, so it's already a little bit easier to talk to the hardware. That said, it is certainly possible to gain this same functionality with Lisp by calling into an FFI to a lower-level language if you like. All well and good, though you already lose Lisp's advantages, but fine, no matter, no argument. This was never the argument. If you recall, again, I had three criteria, and I have been most adamant that runtime was more a concern than a low-level interface with hardware.
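A small sketch of why the no-reordering rule matters when a struct has to match a hardware or firmware layout. The descriptor and its field names are hypothetical. The ordering checks follow from the C rule that members are laid out in declaration order; the exact size is an assumption about a typical target where uint32_t needs no padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device descriptor; field names are illustrative only. */
struct dev_desc {
    uint32_t base;    /* start address of the device window */
    uint32_t length;  /* size of the window in bytes */
    uint32_t flags;
};

/* Members appear in declaration order, so these offsets must increase. */
_Static_assert(offsetof(struct dev_desc, base) == 0, "first member at 0");
_Static_assert(offsetof(struct dev_desc, base) <
               offsetof(struct dev_desc, length), "base precedes length");
_Static_assert(offsetof(struct dev_desc, length) <
               offsetof(struct dev_desc, flags), "length precedes flags");

/* Target assumption (not a language guarantee): no padding is inserted. */
_Static_assert(sizeof(struct dev_desc) == 12, "unexpected padding");

/* First address past the device window. */
uint32_t dev_desc_end(const struct dev_desc *d)
{
    return d->base + d->length;
}
```

A compiler that was free to reorder fields could not offer the first three checks at all.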
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-26 14:11 · Score: 1
Let's keep this discussion in context. You expressed horror at the idea of an OS that depended on a single IL. What exactly is your horror in reaction to?

Actually, I'm expressing horror at the consequences of interpreting a non-machine-native IL. Adequate performance entails implementing a full virtual machine with runtime code generation.

At the level of the instruction set, compiling to CIL is really no different than compiling to PowerPC or x86. It's completely transparent from the point of view of the programmer.

It's not the programmer I'm worried about. It's the end user I'm worried about; at the end of the day, he's the one taking the risk of running code on his machine. This system design leaves him vulnerable to local DoS attacks. If this approach could be made to work without adverse security consequences, and with more or less the same performance +/-10%, I'd be all for it.

Without a GC? Likely not, but then again, we're not talking about removing the GC. That is not necessary to make a language suitable for writing kernels. We're talking about adding functionality to allow the specification of in-memory representation ("read-mem", "write-mem", "read-io-port", "write-io-port"). If C with an external function "read-io-port()" is still C, I don't see why the same isn't true for Lisp.

No, we're really not talking about this. Let's revisit my main thesis. In order for a language implementation to be suitable for writing microkernels, it must:
1. Provide sufficient control over low-level representation (which both C and Lisp do, either natively or through an FFI to lower-level code)
2. Provide control over the runtime environment, with the explicit possibility of no runtime (which C does and Lisp does not)
3. Have a ubiquitous, preferably free, native implementation (which both C and Lisp have; I just added the 'native' qualifier to exclude VM OSes for simplicity)
#2 is important because microkernels have no runtime, because many designs do not maintain state. The runtime features of Lisp et al are thus superfluous.

Multiple such beasts. Movitz is one such example. Sure, it's not particularly sophisticated, but that's because one guy is working on it. Popularity and suitability are two different things.

I took a gander. Looks neat, except once again, it aims to provide an entire Lisp runtime, GC included. That may be fine for some projects, but it's not good enough for all projects, particularly microkernels.
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-26 09:03 · Score: 1

I'm sorry, but you still don't get my point. C is a programming language defined by a language standard.

No argument there.

You keep claiming that C permits efficient access to low-level features, but that claim is wrong: the C programming language provides no access to low level features at all.

Fine, let's be as specific as possible. Assembly language provides direct access to low-level features. C provides direct access to datatypes which map almost directly to base valuetypes on commodity hardware. Good so far?

Any such access is provided via implementation-dependent features. And just like it's easy to add those features to C, it's easy to add them to other languages.

If you recall I had three criteria for a microkernel suitable language. Yes, there are other languages that provide direct control over representation (even better than C for some), or at the very least they can manipulate these representations via opaque handles and callbacks to low-level code. Ignoring the difficulties inherent in this for a moment, the other higher-level languages in this category are still unsuitable because they don't grant you sufficient control over the runtime. You can't eliminate GC from Lisp or Java for instance, without essentially writing your own implementation (which is effectively no longer Lisp or Java). Rolling your own is all well and good, but it's a lot of work.

Similarly, the C runtime is unsuitable for use in a kernel; any kernel use of C needs its own runtime. Well, you do the same thing when writing a kernel in any other language.

That's just it, there effectively is no "C runtime" though. There is link-time code which provides services as functions, but in C, you get what you pay for. If you don't want to pay for something (like GC), you don't have to pay for it. If you don't want dynamic memory allocation, you don't pay for it. Other language implementations, as they currently exist, already assume a standard foundation (memory allocation, GC, etc.).

C itself makes no real runtime assumptions beyond the presence of a stack. You pay for nothing else. This is essential for microkernels because there often is no memory allocation/reclamation.
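As an illustration of "you pay for nothing else", here is a minimal sketch of a fixed-size object pool that needs no malloc, no GC and no library support beyond the compiler itself. The slot layout, POOL_SLOTS and the function names are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS 8  /* assumed capacity, chosen only for illustration */

struct slot {
    struct slot *next;     /* free-list link while the slot is unused */
    uintptr_t payload[3];  /* hypothetical object body */
};

/* All storage is reserved statically; nothing is allocated at run time. */
static struct slot pool[POOL_SLOTS];
static struct slot *free_list;

/* Thread every slot onto the free list. */
void pool_init(void)
{
    free_list = NULL;
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Take a slot, or return NULL when the pool is exhausted. */
struct slot *pool_get(void)
{
    struct slot *s = free_list;
    if (s != NULL)
        free_list = s->next;
    return s;
}

/* Hand a slot back to the pool. */
void pool_put(struct slot *s)
{
    s->next = free_list;
    free_list = s;
}
```

The only thing this code asks of its environment is a stack for the calls.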
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-26 07:10 · Score: 1

It does not forbid variations in field padding, however. The rules for how fields are padded are predictable, but that does not change the fact that they are not "byte for byte" translations.

Right, which is why you stick to machine words.
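A brief sketch of the padding point, with hypothetical struct names. A struct that mixes a small field with a larger one usually gets padding inserted by the compiler, while a struct built only from pointer-sized words is expected to have none on commodity targets. The size check on struct words is a target assumption, not a guarantee of the standard.

```c
#include <stddef.h>
#include <stdint.h>

/* Mixed field sizes: the compiler may insert padding after tag so that
 * value is suitably aligned. */
struct mixed {
    uint8_t  tag;
    uint32_t value;
};

/* Machine words only: every member has the same size and alignment. */
struct words {
    uintptr_t tag;
    uintptr_t value;
};

/* Target assumption: a struct of two words is exactly two words long. */
_Static_assert(sizeof(struct words) == 2 * sizeof(uintptr_t),
               "unexpected padding in a struct of machine words");

/* Bytes of padding the compiler added to struct mixed on this target. */
size_t mixed_padding(void)
{
    return sizeof(struct mixed) - (sizeof(uint8_t) + sizeof(uint32_t));
}
```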

Hardly. Even microkernels are full of logic (validating user input, for example). Even if you say a microkernel spends 50% of its code banging registers and memory locations (likely a gross overestimation), that's still 50% you don't have to write in a low-level language.

I think we're getting our signals crossed here. I've never stated C is ideal for writing kernels. I can honestly say I really don't like C. I would love to be able to write a kernel in a higher level language. What I am saying is that no language is there yet; no language other than C, C++, and Ada currently provides the three features I've outlined (low-level control, runtime control, compiler availability). C++ is possible, but you have to jump through a lot of hoops to get where you're going. Ada is great because it provides much more safety, but it has some bloating issues with the binaries it produces (gnat) which wreak havoc on the caches critical for microkernel performance. So we're left with C, which is not as good as assembler for direct control, but it doesn't really get in our way the way the other two do for most of our purposes. It's unfortunate, but there it is.
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-26 06:57 · Score: 1

Because memory protection is still enforced.

But it's not really, since in a kernel (particularly microkernels) you have to perform pointer arithmetic. This is necessitated by the MMU. All the C code which doesn't interface with the MMU should likewise be "memory-safe" (in that there's no need to perform pointer arithmetic). So it's mostly a wash IMO. There are a few corner cases where memory safety would actually be useful, but sufficient testing would produce the same confidence in correctness. GC is a big dependency that isn't needed in microkernels, particularly microkernels that don't have state (like EROS/CapROS). So how do you get rid of it in a language that needs it?

Says everybody who is perfectly happy using Linux, Windows, BSD, OS X, etc, even with their unpredictable latencies of hundreds of milliseconds. Most kernels are not hard real-time. Those that are can be written in an appropriate hard real-time language (either a safe one with a RTGC, or C with no MMU, no VM, and a RT malloc).

And researchers who want to build a general-purpose operating system that can be used in all of these scenarios are what? Shit out of luck? Thanks for coming out, try being less ambitious next time?

Listen, people can use whatever language they like to build a kernel; if it makes them happy, I'm happy for them. But to be honest, I don't see how you're going to build a microkernel with the desired properties given a high-level language and its runtime baggage. I'd like to be proven wrong. If you have seen a microkernel written in Lisp, Java, or what have you, that runs on commodity hardware, then I'd love to read about it.
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-26 06:45 · Score: 1

To quibble: there is no reason to use a single IL. It just needs a trusted compiler for each IL it does support.

And what is the thing that recognizes, approves and executes each IL?

Furthermore, what exactly do you think C is anyway? It's essentially the common IL for UNIX. The C machine model is deeply ingrained within UNIX, just as much as the CIL machine model is ingrained within the .NET CLR.

No, no, no. The "IL" for any native code compiler is the host architecture's native instruction set.

If I understand the term the same way, you're talking about something like what is used in the MLton compiler, which automatically infers regions and allocates to them.

No, Cyclone allows explicit region management IIRC. Region inference is an additional useful feature, but it's not required.

Such region management is not any more explicit or predictable than GC or regular malloc() for that matter.

I'm not sure if region inference is more predictable; I haven't read any papers on it. The collection cycle might be faster, and with explicit intervention, more predictable. Got any pointers?

I suppose you can include annotations in the program to delineate regions, but then why not just use the "preallocate and disable collection" idiom in a regular GC?

Fragmentation, no copying overhead for compaction, etc. These are important issues in constrained systems. For instance, the annotations are inline in the declaration of a pointer in Cyclone. This means it is checked by the compiler for safety. Does the compiler check that you've properly re-enabled the GC every time you disable it?
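For readers unfamiliar with regions, here is a minimal sketch of the idea in plain C: allocations are bumped out of one fixed buffer and the whole region is released at once, so there is no per-object free, no fragmentation inside the region and nothing to compact. This is not Cyclone, and C does none of Cyclone's compile-time checking; the struct and function names are hypothetical, and the buffer is assumed to be pointer-aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region: a caller-supplied buffer plus a bump offset. */
struct region {
    unsigned char *base;
    size_t size;
    size_t used;
};

void region_init(struct region *r, void *buf, size_t size)
{
    r->base = buf;
    r->size = size;
    r->used = 0;
}

/* Allocate n bytes, rounded up to a multiple of the pointer size.
 * Returns NULL if the region cannot satisfy the request. */
void *region_alloc(struct region *r, size_t n)
{
    const size_t a = sizeof(void *);
    if (n > SIZE_MAX - (a - 1))
        return NULL;                    /* rounding would overflow */
    n = (n + a - 1) & ~(a - 1);
    if (n > r->size - r->used)
        return NULL;                    /* region exhausted */
    void *p = r->base + r->used;
    r->used += n;
    return p;
}

/* Release every object in the region in one step. */
void region_reset(struct region *r)
{
    r->used = 0;
}
```

What Cyclone adds on top of this pattern is that the compiler checks pointers into a region do not outlive it; here that discipline is left to the programmer.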

What exactly do you think the Linux kernel is written in? In C? If I recall correctly, there is no inline asm in C99.

Oh no doubt Linux uses gccisms. It's probably 80-90% C, and 5-10% asm, and 5-10% gcc extensions.

So sure, you can call them "Lisp-like" languages instead, but be aware that if you don't start applying the same reasoning to C, you're being inconsistent.

The main issue with Lisp is the runtime environment. Would you call Lisp without garbage collection Lisp? Have you even seen such a beast?
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-26 06:32 · Score: 1

I think I have stated this pretty clearly already: I'm looking for a kernel very much like the Linux or BSD kernel, but replacing C with a simple, safe object-oriented language that supports garbage collection. It would be nice if the runtime contained a JIT compiler, but that would not be necessary. Either way, modules would be dynamically loadable, communication between kernel components would happen through the language's object oriented mechanisms, and the language and runtime would guarantee fault isolation and resource reclamation (except in contexts where the programmer explicitly overrides it).

Ok, now we're getting somewhere. And here is the problem with this design: you seem to be ignoring the fact that it's running on a real machine without these high-level features. The only way to implement this system is to have the kernel as a full-blown virtual machine which only runs some sort of IL which doesn't allow arbitrary code execution/memory accesses. This means code-generation. This means garbage collection. This means forcing everyone to use the same IL. This means a new compiler for any language you want to support. This means only source-level compatibility. This makes the kernel vulnerable to denial-of-service attacks.

There is one other possibility: some sort of code-signing scheme to ensure that the code was compiled on a certified compiler (would only work with DRM of some sort). What a terrible idea this would be...

That's in contrast to microkernels that take the functionality currently contained in kernels like Linux or BSD and split it up among separate processes; the microkernel approach introduces unnecessary overhead and complications into the kernel design. I view microkernels as a workaround for the lack of runtime safety in languages like C.

a) The kernel design is actually much simpler. I'm not sure what's so complex about it; it's just IPC. Please elaborate.
b) Overhead issues have largely been addressed in the latest generation of microkernels (L4, EROS/CapROS).

In some sense a microkernel is a workaround for safety issues: by decomposing functionality, you are isolating faults. By decomposing you are also increasing reusability of components. But it's a workaround that is necessitated by the hardware, because the hardware itself is unsafe. If we were running Lisp machines, this wouldn't even be an issue.

It's also in contrast to the Lisp machine or some attempts at a Java OS, in which everything, including applications, runs inside a single big runtime and address space; resource management and other issues haven't been solved satisfactorily in that context and current applications aren't written for that kind of environment. However, once you have moved to a safe language and runtime, you can start exploring the issue of moving more functionality into the kernel's address space.

So your idea is something like J-Kernel from the paper I linked to earlier, but actually running in kernel space and executing the whole system written in Java?
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-26 06:20 · Score: 1

ANSI C forbids reordering, but it doesn't define the size of a byte, nor does it precisely define alignment or padding.

Re: size of byte

Did you forget that not all architectures are byte-machines?

Re: alignment

That's what pointer arithmetic is for, which is needed for memory management anyway.

Re: padding

Padding is a trivial issue in C.

No doubt the above are useful features to handle uniformly in the compiler (and Ada addresses all of them better than C), but they are not show-stoppers.
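The alignment reply above refers to the usual address arithmetic, sketched here. The function names are hypothetical, and both functions assume align is a power of two (for example a 4096-byte page size).

```c
#include <stdint.h>

/* Round addr up to the next multiple of align.
 * Precondition: align is a power of two. */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Round addr down to the previous multiple of align.
 * Precondition: align is a power of two. */
uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    return addr & ~(align - 1);
}
```

A kernel allocator would typically convert a pointer to uintptr_t, align it with helpers like these, and convert back; that conversion is implementation-defined in C, which is the per-compiler dependence being argued about in this thread.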

The point is that the kernel doesn't use the language-supplied malloc/free; it uses its own unportable mechanisms.

It can. It doesn't have to be unportable. The only unportable code is allocating and freeing pages. This is architecture-specific code anyway.

My point exactly. (Re: memory-mapped I/O)

Because memory mapped I/O is not a feature of every OS in existence. The C library is pretty much designed for the lowest common denominator; this is a strength when dealing with kernels and the fact that these features are not required is its greatest strength for microkernels. I really don't see how memory-mapped I/O belongs in a microkernel anyway. You are arguing the lack of application-level features as a criticism for using a language in a kernel.

Well, and as we can see, you don't even know what features are part of the C language.

I'm not sure how you reached that conclusion. Where is your evidence to back up this assertion?

C implementations can be garbage collected and abort on every one of those "low-level features" you so dearly love as a runtime error

I never made any claims that C was a great language, nor that it was particularly safe, nor that garbage collection is even a good fit for it. I'm not sure where you're getting your outlandish ideas, but from the beginning I've simply maintained that C is currently the best language in which to write a microkernel (Ada would be better, but I've encountered serious code bloat issues with it -- this kills cache performance). Any other claims you make are a strawman.

The C language is not so much characterized by the presence of low level features, but by the absence of guarantees that high level features are present.

To a certain extent I agree. Unfortunately, all other high-level language implementations currently force you to carry other baggage that simply doesn't belong in a microkernel; at least, not every microkernel -- the fact that it isn't even an option is the problem.

But even if you are trying to argue that using GNU C or a similar compiler on Intel is "the best candidate", that is also methodologically wrong: you claim that systems like EROS and CapROS are aimed at performance and security. But if you base those systems on a language with no language mechanisms for fault isolation and a propensity for serious security problems, then EROS and CapROS can be argued to be merely workarounds for the limitations of their implementation language.

Tell me, how do you gain confidence in a compiler verification? It was clearly originally written in a language less safe than itself, so how do you know it's bug-free? How do you know the implementations of its proofs are correct? You test it till the cows come home. That's called a solid epistemology. You verify through observation.

The longer a microkernel is tested without problems, the greater the confidence we gain in it. This confidence perhaps has an upper bound which is determined by the unsafety of the language and the complexity of the implementation. This is why microkernels seek to minimize the implementation. This is why the Coyotos implementors seek to improve the safety of the implementation.
Re:agreed 100% on Andy Tanenbaum Releases Minix 3 · 2005-10-26 02:14 · Score: 1

relying on memory management hardware rather than language and runtime mechanisms in order to achieve isolation.

MMU hardware is a runtime mechanism. It's just a mechanism implemented in circuitry rather than software.

Perhaps you could clarify exactly what runtime mechanisms you have in mind that could supplant MMU hardware for safety and fault isolation guarantees? You outlined your belief that the kernel should be implemented in a safe language. Very good. I suspect this kernel will still run at supervisor level on the CPU. How about objects hosted on this kernel? Would they run at user or supervisor level? Is the kernel a JITC VM? How are entities scheduled? How are resources, memory for instance, tracked and charged to various entities? How do you ensure freedom from denial-of-service attacks against the kernel, realtime scheduling, etc.? These are all issues handled by operating systems at the moment, so I'd like to hear your solutions for these problems, even speculative ones.