Leaving a mix of live and dead objects on each page and touching every object once (at least) as it becomes unreachable are about the worst things you can do to your cache. A GC tracing from the root set is bad too, but at least that doesn't happen very often, only touches the live objects (which are then packed into their own sets of pages, giving far better average locality), and can be done incrementally (so that the mutator tends to keep heavily-used objects in cache even during tracing).
you're right. i just checked with a colleague who knows much more about these things than i do, and he agrees with you...
however, inferno is designed to run well in systems that have relatively small amounts of physical memory - that's where ref counting comes into its own, as you can tightly bound the maximum amount of memory used by an application (as long as you avoid circular references). that means that you can have useful Inferno boxes using much less memory than an equivalent box using more conventional GC.
And how do you cope with the unknown number of destructors that losing the last reference to a large tree might run? Don't the realtime guys avoid both virtual memory and malloc()/free() on critical paths because there's no hard upper bound on the number of clocks they might take?
the point is that the number of cycles is deterministic. so if you're worried about real-time performance in an application, you can do all your memory allocation at the beginning, and avoid it from then on. i.e. you know when your malloc() and free() can happen, and you can keep objects live if you don't wish to incur an arbitrary overhead.
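to make the cost in the question concrete, here's a minimal sketch in Python (CPython also frees most objects by reference counting). it is not Inferno code; `Node` and `drop_tree` are made-up names for illustration. dropping the last reference to the root tears down the whole tree right there, so the work is predictable in *when* it happens, but its size depends on how big the structure is:

```python
class Node:
    """Tree node that counts how many nodes have been finalized."""
    freed = 0

    def __init__(self, depth):
        # Build a full binary tree of the given depth below this node.
        self.children = [Node(depth - 1), Node(depth - 1)] if depth > 0 else []

    def __del__(self):
        Node.freed += 1


def drop_tree(depth):
    """Build a tree, drop the only reference, return how many nodes were freed."""
    Node.freed = 0
    root = Node(depth)
    del root  # last reference gone: under CPython the whole tree is freed here
    return Node.freed
```

calling `drop_tree(10)` frees every node of the tree at the `del`. keeping `root` alive until a quiet moment is the "keep objects live" trick described above.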
i still maintain that, CPU cache aside, on systems that support VM and paging, ref-counting can help keep a small working set and reduce paging and swapping. i have a feeling that for many applications, this will make more difference to performance than maintaining the CPU cache (given the huge and widening gap between disk performance and memory performance).
i should try and find/do some studies on this...
(Really pushes it hardest??? Oh right, as opposed to Linux where so many of my devices aren't represented as files. Let's face it, a friend of mine cats vi to /dev/audio for an alarm clock. How much harder can you push the concept? )
lots harder.
in plan 9, any old program can present a filesystem, and it can then interpret operations on that filesystem at will. basically, you can mount one end of a pipe. filesystem requests on any file or directory below the mountpoint turn into RPC messages down the pipe. so MIME mailboxes are presented as a filesystem, the editor cum window system acme allows program interaction through a filesystem, access to ftp is provided through a filesystem, etc, etc.
plan 9 doesn't have an ioctl call, which means that an enormous amount of functionality is available via straight shell commands (echo, cat, et al).
ok, so the ideas might not be completely new, but the implementation works really well in practice. and it means that a sophisticated system can be built out of small chunks of code, which in turn means that the whole system is more understandable and more reliable.
i can create windows with echo, look back through history with cd and extract parts of cpio archives with cat - and all of this functionality can be transparently exported and imported securely across the net.
Is Plan 9 taking off?? I would really like to ditch this Linux crap and use something a little more current!!
plan 9 is cool (it's the OS that i use for development), but due to the usual difficulty of developing PC drivers (in particular graphics cards) it probably won't work with your existing h/w configuration.
however, as dennis says in the interview, most of plan 9's features are in Inferno. in fact, Inferno is basically a slimmed-down Plan 9 with a virtual machine and a new language (Limbo) on which Ritchie has had a strong influence.
in lots of ways, Inferno is considerably more sleek than plan 9 - it is a real OS, but it's also a "virtual OS" that will run hosted under plan 9 or Windows or Linux or BSD or... the same programs run identically on all Inferno platforms.
there's even a version of Inferno that runs as a plug-in inside Internet Explorer on Windows!
if you want to get a feel for it, there's even a shell prompt to play with for command line addicts. not to mention a few other little demos to get a feel for the performance of the thing. i'm afraid the plugin doesn't currently run under Netscape or platforms other than Windows, but the full download does.
Inferno and Plan 9 are both OSs "done right", maintaining a healthy balance between performance-related pragmatism and theoretical purity. compared to the tangled morass that is Java or any of the more recent Unix variants (and i'm afraid i don't exclude Linux), they're a breath of fresh air.
it was plan 9 which John Carmack once described as "achingly beautiful" and he's not wrong.
The advantage of reference counting is its predictability--it will always perform equally badly, and immediately realize when an object has become unreachable, but you can save a tremendous amount of work if you postpone deciding whether a particular object is dead.
sometimes doing a bit more work is worth it for the other benefits it provides (we're talking about a system that uses a VM here!). if you want to avoid runaway memory usage (and in the process improve cache performance and reduce the need for swap/paging) then ref counting is the way to go.
Inferno uses this technique (actually it uses a generational GC as well to catch the more unusual circular references) and we can happily run a system doing useful stuff (shell, GUI apps, etc) in 512K of RAM.
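the same split (ref counts for the common case, a tracing pass for cycles) can be seen in CPython, which makes for an easy sketch. this shows the general technique only, not Inferno's collector; the class and function names are invented for the example:

```python
import gc
import weakref


class Obj:
    pass


def acyclic_is_freed_immediately():
    o = Obj()
    probe = weakref.ref(o)  # observes the object without keeping it alive
    del o                   # count drops to zero: reclaimed right here
    return probe() is None


def cycle_needs_the_tracing_collector():
    gc.disable()            # stop the backup collector running on its own
    try:
        a, b = Obj(), Obj()
        a.other, b.other = b, a        # circular reference
        probe = weakref.ref(a)
        del a, b
        leaked = probe() is not None   # the counts never reach zero
        gc.collect()                   # the tracing pass finds the cycle
        return leaked, probe() is None
    finally:
        gc.enable()
```

the first function is the predictable behaviour that keeps memory usage tightly bounded; the second is the unusual case that the backup collector exists to catch.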
as any real-time aficionado will tell you, predictability is often more important than raw performance - and that's what ref counting gives you. most OS paging algorithms are designed around the concept of a "working set" of memory that a process needs to be able to compute with. conventional GC often works to foil this approach.
But what kind of threading API doesn't provide synchronization primitives?
i'll keep plugging this 'cos i think it deserves attention. i would do so even if i wasn't working for the company...
Limbo has a thread-communication model similar to Occam's (if anyone's heard of Occam... no, thought not).
it uses synchronous channels:
somefunction()
{
    ch := chan of int;
    spawn otherfunction(ch);
    ch <-= 99; # send value down channel
}
otherfunction(ch: chan of int)
{
    print("received %d\n", <-ch);
}
easy to use, and easy to reason about. one thread blocks until the other has received the value, hence the primitive can be used for locking, etc.
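for anyone without a Limbo compiler to hand, here's a rough Python approximation of the same rendezvous. the `Chan` class is an assumption built for this sketch from the standard `queue` and `threading` modules; Python has no built-in synchronous channel, and this is nothing like how Limbo implements one:

```python
import queue
import threading


class Chan:
    """Unbuffered channel: send() returns only after a receiver took the value."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)
        self._send_lock = threading.Lock()  # one sender at a time

    def send(self, value):
        with self._send_lock:
            self._q.put(value)
            self._q.join()      # wait until recv() has called task_done()

    def recv(self):
        value = self._q.get()   # blocks until a sender arrives
        self._q.task_done()     # releases the blocked sender
        return value


def otherfunction(ch, out):
    out.append(ch.recv())


def somefunction():
    ch = Chan()
    out = []
    t = threading.Thread(target=otherfunction, args=(ch, out))
    t.start()       # roughly Limbo's spawn
    ch.send(99)     # blocks until otherfunction has received the value
    t.join()
    return out
```

the amount of machinery needed here is rather the point: in Limbo the blocking send is a language primitive, so there's no lock or queue for the programmer to get wrong.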
I program in C, on Unix, where possible. I'm open to trying new things, but so far GUI programming doesn't look like that much fun. Yes, I like to have performance and simplicity in my language. Java is a huge, complicated mess, and nothing you say will make it any smaller.
[...]In fact, Java is quite non-intuitive, and full of special-purpose hacks.
well said. C's my 2nd favourite language. i used Objective-C for about 6 years and came across all the problems with its object model that i can see all too clearly in Java.
if you're interested in new languages, check out Limbo. IMHO it has all the simplicity and power that you're after, and its well designed semantics mean that programs are generally easy to debug and easy to maintain.
the APIs are almost without exception very well designed and "as simple as possible without being too simple"...
it's a concurrent language and has a threading model that isn't based on archaic 1970s technology unlike Ja^H^Hsome other languages.
plus programs will run without change (unconditionally) on linux, bsd, plan 9, windows and on several native platforms (e.g. the Ipaq)
if you have access to a Windows system with internet explorer, there's a plugin that allows you to play around with some demos (source code included), or you can download a version for another platform (only that one's 14MB, ouch).
it's a beautiful language, and that can't be said of many...
I have never seen a cleaner network/socket interface than I have seen in java. I know C,C++ and perl. Java's interface for networks is incredible. There's no way you could implement something that uses any networked functionality in less code than you'd use in java. It's very abstracted, and clean.
i disagree. the java socket interface is not abstracted enough. it's still tied to TCP/IP. plan 9 and inferno have the cleanest networking interface i've seen.
want to make a connection?
connection:= sys->dial("tcp!slashdot.org!httpd");
that's all! the address is a string with a protocol explicitly mentioned, so can easily be passed in dynamically as an argument. the program itself is protocol independent. if i had an ATM interface, i could use "atm!someATMaddress!56446" and the program would continue to work unchanged. IPv6? no problem, just change the underlying implementation...
not to mention the fact that access is through a filesystem-like abstraction - the actual network interface might not even be on the local machine...
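to show why a single address string keeps the program protocol independent, here's a small Python sketch of the idea. it is not the Inferno implementation: `parse_dial`, `dial` and the `DIALERS` table are hypothetical names, and only a "tcp" opener (via the standard `socket` module) is filled in:

```python
import socket


def parse_dial(addr):
    """Split 'proto!host!service' into its three parts."""
    parts = addr.split("!")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected proto!host!service, got %r" % addr)
    return tuple(parts)


def _dial_tcp(host, service):
    # service may be a port number or a name the local system can resolve
    return socket.create_connection((host, service))


# Hypothetical table: supporting a new protocol means adding an entry here,
# not changing any program that calls dial().
DIALERS = {"tcp": _dial_tcp}


def dial(addr):
    proto, host, service = parse_dial(addr)
    if proto not in DIALERS:
        raise ValueError("no dialer registered for protocol %r" % proto)
    return DIALERS[proto](host, service)
```

a caller only ever passes the string along, e.g. `dial(sys.argv[1])`, so the choice of network is made by whoever supplies the address rather than by the code.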
if you're a die-hard Java advocate for whom Java is the One True Language to beat all other languages, then stop reading here!
i've seen loads of languages mentioned in this discussion, but nobody so far has mentioned my favourite language ever: Limbo.
of course, i'm completely biased, as i work for the company (Vita Nuova) that distributes it. it's not as bad as it might seem though, because the only reason i went for this job was that i'd already fallen in love with the language!
but seriously, Limbo is a truly excellent language which, like Java, takes much of its heritage from C but, unlike Java, gets the fundamental language design right.
i know that Java is here to stay, but if you've done much Java programming, you'll know that there are reasons why you might want to keep an eye on alternatives.
Limbo is a language that C programmers will like but C++ programmers will probably hate. there's no object-hierarchy! so you actually know what code is being invoked by a function call.
debugging code does not require an intimate knowledge of obscure class hierarchies.
o it's completely type-safe. unlike java, it's impossible to get a run-time type exception.
o the syntax is very readable, but terse enough not to frustrate a C programmer...
o the type system is rich enough to make data-structures easy to create and manipulate, and the type syntax is beautifully clear. consider:
x: list of array of (string, int);
o memory management is automatic (garbage collected), but unlike most other GC languages, memory usage is economical and predictable, because almost all data structures are collected by reference counting. so when memory is tight, you have tight control on what memory is allocated, and when it's freed.
o and best of all, programs written in Limbo are completely portable, because they're surrounded by an entire virtual operating system, Inferno.
i could rant on for ages about nice aspects of the system, which is not surprising, given the people involved in its creation (Rob Pike and Dennis Ritchie amongst others).
anyway, we've had a free download available for some time, but our resident Windows-head has just come up with an Internet Explorer plug-in that runs Inferno inside a web page. the download is under 720K.
we've had fun coming up with a few demo programs so you can see how it performs compared to Java and what a typical Limbo program looks like. you can even get a shell prompt inside IE. (not that you can do a lot from it, as it's sandboxed...).
have fun! and bear in mind that the binaries of these programs will run unchanged under any incarnation of inferno... the "applet" API is identical to the normal one.
i hope that at least some people will Get It...
cheers, rog.
PS. i'm sorry we don't have a linux or netscape version, but we will have!
Also the folks at www.vitanuova.com are dusting off their Ipaqs to continue developing Inferno for it.
actually, as of last night, inferno is running on an Ipaq. still a little way to go before it's usable (e.g. the screen's the wrong way round currently) but it looks good.
i saw it today running the whole development environment with all the files imported transparently over the serial link. (that's the beauty of inferno - no extra code or compilation was required to accomplish this).
If you want to see some really obfuscated circuitry, check out this.
i think it's fascinating that here's a circuit that they built themselves, and they still cannot figure out how it works... reverse engineering has its limits!
PS. isn't code stored in NVRAM and used to program reconfigurable logic chips subject to the same copyright laws as any computer program? so presumably ripping this off and using it in your own product has about the same legality as selling a dodgy copy of MS Word... despite what others have said about trade secrets, there seems to be a fine line here between a "trade secret" and a "copyrightable piece of code". hmm.
I read the blurb on Inferno with a great amount of joy.
A freely downloadable OS that seemed to be focussing on the lacks of all the OSs it ran upon. A nice little tool if ever I saw one, and one that I'd greatly love to try.
Then I read the licence.
[...]
Well, for starters, the trademark.htm URL doesn't exist, so there is no guideline for use of these 'trademarks'.
that's true, the URL doesn't exist - we're fixing that. but... i think your worries about the rest of the license are somewhat misconceived.
What is a classicist to do then? "I'm sorry, you can't have your lecture on Greek mythology, as all the names are currently trademarked..".
these are trademarks - we haven't sidelined a portion of the english language; we're just preventing other companies from trading using those names (and in fact it's not even as restrictive as that, as the trademarks only apply in, i think, certain sectors of the computer industry).
think about it! does the world stop talking about windows in buildings because Windows® is a trademark?? i don't think so. similarly, unless you are trying to market another OS called Inferno, or a protocol called Styx, the fact that those names are trademarked is completely irrelevant.
so have a look at the software! we have tried to make the license as unrestrictive as possible, so i hope you won't have any problems with it.
Inferno might be fast on an embedded processor, but it sounds like it needs hosting inside another OS on a PC, which is a shame.
actually, it doesn't need hosting inside another OS - that's just one of the ways it can work.
it can, and does, run directly on all sorts of embedded hardware. the reason the free download is for the hosted configuration only is that there aren't many standard hardware platforms out there (and getting a new OS on bare hardware is rarely trivial).
inferno programs will run the same whether they're running on bare hardware or under another OS. as far as speed goes, user programs are interpreted by a virtual machine, faster than java, but still not up to machine-code levels.
but that's not really the point. (responsiveness is excellent).
unlike atheos (and linux for that matter), inferno does incorporate a lot of genuinely revolutionary ideas, and it's mature enough for actual use.
if you want to see how beautiful (and easy to program) software can be, it's worth a look...
> plus unix-style documentation
Why, God? Why?
*sob*
because unix-style documentation is concise, clear, and tells you what you need to know?
and because it's infinitely better than the style of reference documentation found all too often these days, in tutorial style, telling you randomly distributed pieces of information that you need to know, but will never be able to find again...
the unix reference-manual style might require a certain amount of knowledge as a pre-requisite ("you mean i actually have to read the intro?!"), but for conveying to the reader the specifics of how to use components of a system, i've not seen anything to beat it.
for overview and tutorial information on how the various components fit together, there are various papers which try to provide this. (and more to come, when we get some space away from software development to work on documentation, yum!)
to respond to the subject line, rather than the body...:
inferno doesn't currently run on either macos X or BeOS, but there's no reason at all why it can't. in a previous life, i was a nextstep/openstep hacker, so i imagine that if apple haven't mucked around with the APIs too much, then i should be able to port inferno reasonably quickly (it's almost entirely portable C).
BeOS i haven't programmed under, so i don't know how easy the port would be, but i doubt it would be that hard. we've got about a million priorities right now though, so adding another supported OS with a fairly small userbase is probably not near the top, unless there's a significant demand.
the inferno source is not expensive (<$300, given the strong dollar), so someone keen could probably do it themselves.
the data that a program reads becomes, in some way, part of the program. in what way depends on how sensitive the program is to its input. for something like a .wav file, obviously not very. for something like a java class file, completely. buffer overflow attacks come somewhere in between the two.
but to say that all these files are "just" data files is to miss the point. even machine code is "just" data if you're using a microcoded CPU...
evolutionary complexity & decipherment problems
what i find a little strange about this discussion, and about almost all of the media coverage that i see, is the assumption that now we've sequenced the human genome, we can immediately start working out which gene does what...
there's a major problem there: the assumption that there's a reasonably straightforward mapping from gene to attribute. but i think that current assumptions are probably highly simplistic, and in fact, perhaps in general the mapping is not discoverable.
Kauffman argues that the expression of a genome is not just the simple reading of segments of the DNA which encode proteins which go away and build things, but that a set of genes really forms a boolean network, where the action of some gene can affect the expression of another gene, and vice versa.
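a toy version of such a boolean network, sketched in Python. the three "genes" and their rules are entirely made up for illustration (they model no real biology); the point is only that the behaviour you observe is a property of the whole network, not of any one gene:

```python
def step(state, c_knocked_out=False):
    """One synchronous update of a hypothetical three-gene network."""
    a, b, c = state
    new_a = not c                  # A is repressed by C
    new_b = a                      # B follows A
    new_c = (a and b) and not c_knocked_out   # C needs both A and B
    return (new_a, new_b, new_c)


def attractor(state, c_knocked_out=False):
    """Iterate until a state repeats; return the cycle the network settles into."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, c_knocked_out)
    return seen[seen.index(state):]
```

starting from `(True, False, False)` the intact network cycles endlessly through five states, while with C knocked out it freezes at a single fixed point. asking "what does gene C do?" has no shopping-list answer: it turns a steady state into an oscillation.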
what that means in computery terms is that the way your genes work is less like a shopping list (gene A implies obesity, gene B implies intelligence, etc), and more like a cellular automaton. if you remember some of your computer theory, you might remember that many simple CAs (e.g. Conway's Life) are Turing complete.
so what we have is essentially an evolved computer program. and if you think that some people write bizarre code, wait till you've seen some that's generated by genetic algorithm. then multiply that by billions of years' worth of evolution, raise to the power of the Halting Problem, and that's the order of the difficulty of decoding the genome!
by way of illustration of the sort of complexity that can arise when even simple systems are evolved in the real world, check out Adrian Thompson's web page. In particular, this paper has a fascinating analysis of the properties of some genetically evolved FPGA hardware. now this stuff is really simple - we're talking digital components, 100 gates, evolved to perform a simple discrimination process.
the circuit worked, but they didn't really have the faintest clue of how! because it evolved, it pushed the physics of the FPGA as far as they would go. to quote from the paper:
There are numerous tactics that can be used to piece-together answers to analysis questions even for seemingly impenetrably circuits. We applied many of those techniques to the most advanced unconventional circuit yet produced. We still do not understand fully how it works: the core of the timing mechanism is a subtle property of the VLSI medium. We have ruled out most possibilities: circuit activity (including glitch-transients and beat frequencies), metastability, and thermal time-constants from self-heating. Whatever this small effect, we understand that it is amplified by alterations in bistable and transient dynamics of oscillatory loops, and in detail how this is used to derive an orderly near-optimal output. Certain peripheral cells fine-tune particularly sensitive time delays.
as anyone who's played with software knows, making a change in one place can have far-reaching implications. try experimenting with a simple 1 dimensional CA and changing the rules slightly - you'll get an almost completely different result.
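if you want to try the experiment, here's a minimal sketch in Python of an elementary (two-state, nearest-neighbour) 1 dimensional CA using the usual rule-number encoding. the function names, the wrap-around edges and the default sizes are choices made for this example:

```python
def ca_step(cells, rule):
    """Apply an elementary CA rule once, wrapping around at the edges."""
    n = len(cells)
    out = []
    for i in range(n):
        # the left/centre/right neighbourhood picks one bit of the rule number
        idx = cells[i - 1] * 4 + cells[i] * 2 + cells[(i + 1) % n]
        out.append((rule >> idx) & 1)
    return out


def run(rule, width=31, steps=15):
    """Start from a single live cell and return every generation."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = ca_step(cells, rule)
        rows.append(cells)
    return rows


def show(rows):
    return "\n".join("".join("#" if c else "." for c in row) for row in rows)
```

rules 90 and 30 differ in only two of the eight entries of the rule table, yet `print(show(run(90)))` draws a regular nested triangle pattern while `print(show(run(30)))` gives something that looks random.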
that's why i argue for caution in the use of genetic engineering technology. actually, i'm not sure i do. nature has thrown so many genes together for so long that i doubt we can come up with much that does anything really useful that isn't just a simple isolated gene-to-attribute mapping.
the claims that are made for genetic engineering are way overblown - genes might be the roadmap for life, but i bet they'll be an almost completely unreadable one.
It's also not clear to me how exactly limbo is "OO in the deeper sense" without inheritance. The fact is that there are inheritance ("is-a") relationships in most problem domains, and if your language can't model this, then the language can't be OO in *any* sense, deep or shallow.
as far as i know, the idea of inheritance is not fundamental to OO. the idea of OO was to provide data encapsulation, reusable code, and implementation interchangeability. Bertrand Meyer's "Object Oriented Software Construction" starts from the "five principles":
linguistic modular units
few interfaces
small interfaces (weak coupling)
explicit interfaces
information hiding
i don't see inheritance in there anywhere. it's just a convenient design that people happen to have latched on to as "being" OO.
by those criteria limbo is just as much an OO language as any other, and perhaps more so. the use of explicit interfaces and the lack of inheritance means that the coupling between objects is weak, and as a result programs tend to be much more mutable than i've experienced in OO environments. you want to change the implementation of this object completely? no problem - just make sure you carry on implementing the same interface.
Inheritance is a powerful tool that can actually increase code readability and maintainability if used correctly. Consider how you would implement an "is-a" relationship in the problem domain with a language that doesn't have inheritance...
i'm not sure that "is-a" is something inherent to many problems. it is a way of looking at certain problems, sure, but i don't think it's an inevitable, or even a necessary concept.
the way i think of it is that when you're writing a piece of code that uses object A, you are aware exactly of the interface that A provides (or you should be) and the compiler should be able to make absolutely sure that you don't go outside that interface.
moreover, if i'm implementing object A, i know exactly what interface i want to present. it shouldn't matter in the slightest which objects i choose to use internally in order to implement that interface.
the main payoff to avoiding inheritance comes at the software maintenance stage. with an inheritance hierarchy, when inspecting some code that uses an object a of type A, there's no way of knowing which code is being invoked when something calls a method on a. it could be a subclass of A, which might or might not invoke its superclass method. i can't tell by reading the documentation for A what's going to happen, because A's idea of reality can be subtly subverted by a subclass. these problems can become really nasty when dealing with a large class hierarchy and a large program.
if an object is required to implement its entire interface, then these problems melt away. i am guaranteed that the module implementing the interface is responsible for all the behaviour it exhibits. so you don't tend to get bugs created by the subtle interaction of subclass with superclass invariants. in fact, it's the invariants that are probably the most important thing. if i write some code like:
x := 0;
function1() { function2(); }
function2() { x += 2; }
where function1 and function2 are part of an object's interface, i would like to be absolutely sure when looking at the code that x is 2 more after calling function1 than before. in a language like java (or objective-C, for that matter), i don't have that guarantee. this invariant, carefully maintained by the writer of the class, can be broken by someone carelessly overriding function2 and neglecting to call the superclass method.
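the same trap is easy to reproduce in any language with overridable methods. here's a minimal Python sketch; the class names and the `delta_after_function1` helper are invented for the example:

```python
class Counter:
    def __init__(self):
        self.x = 0

    def function1(self):
        self.function2()

    def function2(self):
        self.x += 2


class CarelessCounter(Counter):
    def function2(self):
        # overrides without calling super().function2(): x is never updated
        pass


def delta_after_function1(obj):
    """How much x changed across one call to function1."""
    before = obj.x
    obj.function1()
    return obj.x - before
```

`delta_after_function1(Counter())` gives 2, but `delta_after_function1(CarelessCounter())` gives 0, and nothing in the text of `Counter` itself warns a reader that this can happen.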
I guess the big question is: Do you want a language which prevents you from doing something in all instances simply because *sometimes* doing that thing is the wrong thing? If that were the case, we should rid ourselves of gotos as well.
any high level language is a trade-off between safety and power. java (and limbo) chose to give up the power of C-like pointers for the guarantee that arbitrary bits of memory couldn't be corrupted. but i don't think you'd find many people that would say that the power of the language has declined drastically because of that. on the contrary, the additional checking that the compiler can now do gives you more freedom to concentrate on the real meat of the program.
it's the same with inheritance. inheritance gives you the ability to implement some things conveniently (GUI widgets being the canonical example), but doing away with it means that code is vastly more readable, because you can see exactly what a piece of code is doing; there is no need to know your class hierarchy before you can see what the control flow is doing, because control flow is determined locally.
the same sort of thing applies to local variables in C. consider the code: { int i; i = 99; } any C programmer can tell by looking at that code that it does absolutely nothing (cpp munging aside:-]); it has no side effects; assigning to i cannot change anything else in the program. that's the power of local variables: they provide a cast iron guarantee that the state of the variable is local.
when looking for bugs, this sort of guarantee is invaluable. who hasn't spent hours looking for a bug, only to discover it somewhere that it "couldn't" be!? the more possibilities you can rule out based on a quick glance at the code, the more productive your bug hunting will be.
that's why i like limbo so much. when it gives a guarantee, the guarantee is absolute. and the guarantee that the meaning of a name depends on the local code, not global state, is an excellent guarantee to be able to give.
The fact is, poor developers can write bad code in *any* language.
i completely agree. i've seen some pretty appalling code in Limbo too. but inevitably you're one day going to be asked "go and fix that bug!" in some of that code. that's the day that you bless the language design, because no matter how bad the author of the code, they can't break the guarantees of the language.
one can write (i think!) good code in any language too. if you're aware of the pitfalls, and write stylised code that avoids them. but inheritance *is* a pitfall (look, even the inventors of the language fell into it - doesn't that say something?!) and IMHO the more pitfalls a language can avoid, without compromising on the power of the language, the better the language.
Uhh, this is hardly a reasonable response. Microsoft's intention was to make their VM not interoperable, which is why Sun sued them.
it's true that microsoft did this deliberately. but from what i'm given to believe, there are portability issues with java on non-microsoft platforms too, which derive from the fact that the underlying environment has platform-dependent differences.
inferno differs from java in that it's not just a VM and a set of libraries - those are just components in the operating system, and it's the OS that provides the true portability. i've heard people complaining about differences in GUI behaviour, differing library implementations, etc with java, and not only with relation to the microsoft VM. this, i think, is an inevitable problem with defining the portability layer at the library level, and having several vendors write the libraries.
it's much less of a problem with ports of inferno, because the interface to the underlying system is so narrow. to port a version of inferno, you have to write some code to create a window and copy bits to it, some code to map the native filesystem into a unix-like hierarchy, and some code to map the devices provided by the system into Inferno device format (e.g. the serial drivers). this is a far cry from re-implementing the entire API. you don't need any guidelines like "100% pure Java" for inferno, because the API semantics are the same, whether you're running under a 4 processor NT box with 2GB of memory, or a PDA with 1MB RAM, 1MB ROM and 2MB of flash.
Not portable? WTF are you talking about?? Are you using Microsoft's VM or something?
you make my point for me.
Is this just a matter of opinion (and thus not worth a lot) or do you have anything specific to critique the language on?
check out the thread on inheritance. somebody (a java developer) said it better than i. if you can get hold of a copy of the april 2000 edition of the IEEE Computer journal, then the article "coping with java programming stress" gives an excellent rundown on things that aren't right with java (by experienced java programmers). there's also: this by someone who knows their computer language stuff.
hmm. like java didn't have any hype? apart from my initial comment, which i admit was a cheap (but IMHO justified) shot, i've just been answering questions the best way i know how. no bullshit hype. i've been interested in inferno/plan 9 for years before i had this job with vita nuova; developing apps, trying to do the best i could with the tools at hand.
i reckon that inferno and limbo are fine tools, fit for any hacker's workbench, and i'd be happy if others find them useful too... this old unix hacker certainly has.
Maybe I missed it, but is there somewhere we can get the binaries for this new virtual machine OS?
you didn't miss it. we haven't put the binaries out yet. we're going to do so very soon. as with all these things, we're up against a very tight schedule, and have been spending most of the last couple of months writing the manuals... blah blah blah, i hate doing documentation! still, it's almost all done now (and online) and we're going to make the binaries available any time soon.
not forgetting that it's only the core VM source that is part of the "subscriber" arrangement and binary only if you haven't paid your $300; everything else in the system is Open Source, including the web browser, all the apps (over 200000 lines of code), and the build tools.
Is not to continue the Java idea of combining the declaration and implementation of a class in one file. Coming from C++, writing all code in what looked like the definition of the class seemed strange at first, but I've come to realize that this is simply very, very nice in many ways. I'd hate to go back. Too bad Limbo doesn't continue with this idea.
the thing is that in limbo, many modules can implement an interface, so it makes sense for it to be in a separate file, because it is not tied to the implementation of a particular module.
a Limbo interface is more like a specification for a class than something that comes from the implementation of the class itself. (unlike C++ and i think java, where the implementation of a class, in particular its inheritance characteristics, determines the interface it presents to the world).
if i write a limbo module interface, e.g.
Add: module {
    add: fn(i, j: int): int;
    # add i and j; return the result
};
then any number of modules can implement it, so it makes sense for it to be held in a separate place, because it is independent from any one of them. and in fact, an interface does not have to have any class implementation - that can be plugged in later, which is nice for top-down development.
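for comparison, here's a rough sketch of the same idea in java rather than limbo: the interface is declared on its own, and unrelated implementations are plugged in behind it. Adder, PlainAdder, SaturatingAdder and AdderDemo are names made up for illustration, and this only approximates how limbo modules work.

```java
// The interface stands alone; it is not derived from any implementation.
interface Adder {
    // add i and j; return the result
    int add(int i, int j);
}

// One possible implementation (hypothetical name).
class PlainAdder implements Adder {
    public int add(int i, int j) {
        return i + j;
    }
}

// A second, unrelated implementation of the same interface:
// it clamps to the int range instead of overflowing.
class SaturatingAdder implements Adder {
    public int add(int i, int j) {
        long sum = (long) i + (long) j;
        if (sum > Integer.MAX_VALUE) return Integer.MAX_VALUE;
        if (sum < Integer.MIN_VALUE) return Integer.MIN_VALUE;
        return (int) sum;
    }
}

class AdderDemo {
    // Client code depends only on the interface, never on an implementation.
    static int twice(Adder a, int x) {
        return a.add(x, x);
    }

    public static void main(String[] args) {
        System.out.println(twice(new PlainAdder(), 21));
        System.out.println(twice(new SaturatingAdder(), Integer.MAX_VALUE));
    }
}
```

the caller (twice) compiles against Adder alone, so either implementation can be swapped in without touching it.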
by "inheritance pitfalls" i meant several things. probably near the top of the list is the fact that in programs that extensively use inheritance, you tend to get interdependency of classes where there is no need for such interdependency. this IMHO is directly counter to the basic tenets of OO programming which call for encapsulation of objects and their state towards the end of re-usable code and software.
if i use a class C which inherits from B which inherits from A, i'm dependent not only on C (which is the object i want to use) but also, unintentionally on B and A. If C wants to use a different superclass, for implementation reasons, then i have to change or recompile a lot of code that relies on it.
here's a little extract from a recent article in IEEE's "Computer" journal [April 2000]:
Subclasses are descendents of other defined classes. Java and other object-oriented languages let you substitute a subclass object for a superclass object. However, you must satisfy certain properties to guarantee that your substitution is safe. One safe substitution is when the subclass is a specialisation of the superclass. For example, a Cartesian point with color attributes can be a specialisation of a Cartesian point without color. You can then substitute a colored point because any behaviour of plain points also applies to colored points.
Problems can occur when a subclass is not a true specialisation of its superclass. Consider the java.util.Stack class, which is part of the java.util package. Class java.util.Stack is a subclass of java.util.Vector. Stack defines common stack methods such as push(), pop(), and peek(). However, because Stack is a subclass of Vector, it inherits all the methods Vector defines. Thus, you can supply a Stack object wherever the program specifies a Vector object. A program can insert or delete elements at specified locations in a Stack object using Vector's insertElementAt or removeElementAt methods. It can even use Vector's removeElement method to remove a specified element from a Stack object without regard to the element's position on the stack.
Consequently the java.util.Stack can exhibit behaviour that is not consistent with the notion of a stack as a last-in, first-out entity. In addition, a program can access all the Vector operations on Stack objects directly when the Stack objects are not being substituted for Vector operations.
A stack is not a specialised vector, and it should not inherit vector operations. Instead, a vector should be a hidden, private representation of a stack. Stack objects cannot then export inappropriate vector operations. This preferred design uses aggregation, which lets you use inheritance and polymorphism to replace the vector representation with alternative implementations. If you use inheritance properly, the design will be more flexible and efficient.
[my italics]. It seems to me that if it's possible to design a language which doesn't allow such problems, then it would be a good thing. i have yet to be convinced that inheritance is a Good Thing. i used Objective-C (java's object model is partially based on objective-c's) under NeXTstep for 7 years, and encountered all the problems mentioned. code reuse, like hardware component reuse, can only come about by minimising the breadth of interconnection between code modules. inheritance does not help us do that.
plus there's the fact that code using inheritance is hard to read, because you're never quite certain which level of a class hierarchy is implementing a method... until you browse the hierarchy. so much for readable code.
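the Stack/Vector problem from the article is easy to reproduce. a minimal java sketch: the first method uses Vector's insertElementAt on a java.util.Stack, the second uses a stack that hides its storage. LifoStack and StackDemo are made-up names for illustration, not anything from the article.

```java
import java.util.ArrayList;
import java.util.Stack;

// A stack that keeps its storage private (aggregation rather than inheritance).
class LifoStack<T> {
    private final ArrayList<T> items = new ArrayList<T>();

    void push(T item) {
        items.add(item);
    }

    T pop() {
        if (items.isEmpty()) {
            throw new IllegalStateException("pop from empty stack");
        }
        return items.remove(items.size() - 1);
    }

    boolean isEmpty() {
        return items.isEmpty();
    }
}

class StackDemo {
    // Pops everything off a java.util.Stack after an inherited Vector
    // method has slipped an element in at the bottom.
    static String popOrderWithVectorInsert() {
        Stack<String> s = new Stack<String>();
        s.push("a");
        s.push("b");
        s.insertElementAt("x", 0); // inherited from Vector; not a stack operation
        StringBuilder order = new StringBuilder();
        while (!s.empty()) {
            order.append(s.pop());
        }
        return order.toString();
    }

    // The same three values through LifoStack: push is the only way in.
    static String popOrderWithLifoStack() {
        LifoStack<String> s = new LifoStack<String>();
        s.push("a");
        s.push("b");
        s.push("x");
        StringBuilder order = new StringBuilder();
        while (!s.isEmpty()) {
            order.append(s.pop());
        }
        return order.toString();
    }

    public static void main(String[] args) {
        System.out.println(popOrderWithVectorInsert());
        System.out.println(popOrderWithLifoStack());
    }
}
```

in the first case "x" is the last element added but the last to come out, which is exactly the non-LIFO behaviour the article complains about. the second class simply has no method that could do that.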
you're right. i just checked with a colleague who knows much more about these things than i do, and he agrees with you...
however, inferno is designed to run well in systems that have relatively small amounts of physical memory - that's where ref counting comes into its own, as you can tightly bound the maximum amount of memory used by an application (as long as you avoid circular references). that means that you can have useful Inferno boxes using much less memory than an equivalent box using more conventional GC.
And how do you cope with the unknown number of destructors that losing the last reference to a large tree might run? Don't the realtime guys avoid both virtual memory and malloc()/free() on critical paths because there's no hard upper bound on the number of clocks they might take?
the point is that the number of cycles is deterministic. so if you're worried about real-time performance in an application, you can do all your memory allocation at the beginning, and avoid it from then on. i.e. you know when your malloc() and free() can happen and keep objects live if you don't wish to incur an arbitrary overhead.
i still maintain that, CPU cache aside, on systems that support VM and paging, ref-counting can help keep a small working set and reduce paging and swapping. i have a feeling that for many applications, this will make more difference to performance than maintaining the CPU cache. (given the huge and widening gap between disk performance and memory performance). i should try and find/do some studies on this...
cheers, rog.
ha ha.
but to give a serious answer to a joky question, beowulf clustering under plan 9 would be a doddle. it's a naturally highly distributed system.
you'd hardly need any glue code at all.
lots harder.
in plan 9, any old program can present a filesystem, and it can then interpret operations on that filesystem at will. basically, you can mount one end of a pipe. filesystem requests on any file or directory below the mountpoint turn into RPC messages down the pipe. so MIME mailboxes are presented as a filesystem, the editor cum window system acme allows program interaction through a filesystem, access to ftp is provided through a filesystem, etc, etc.
plan 9 doesn't have an ioctl call, which means that an enormous amount of functionality is available via straight shell commands (echo, cat, et al).
ok, so the ideas might not be completely new, but the implementation works really well in practice. and it means that a sophisticated system can be built out of small chunks of code, which in turn means that the whole system is more understandable and more reliable.
i can create windows with echo, look back through history with cd and extract parts of cpio archives with cat - and all of this functionality can be transparently exported and imported securely across the net.
tell me that's not pushing it further!
plan 9 is cool (it's the OS that i use for development), but due to the usual difficulty of developing PC drivers (in particular graphics cards) it probably won't work with your existing h/w configuration.
however, as dennis says in the interview, most of plan 9's features are in Inferno. in fact, Inferno is basically a slimmed-down Plan 9 with a virtual machine and a new language (Limbo) on which Ritchie has had a strong influence.
in lots of ways, Inferno is considerably more sleek than plan 9 - it is a real OS, but it's also a "virtual OS" that will run hosted under plan 9 or Windows or Linux or BSD or... the same programs run identically on all Inferno platforms.
there's even a version of Inferno that runs as a plug-in inside Internet Explorer on Windows! if you want to get a feel for it, there's a shell prompt to play with for command line addicts, not to mention a few other little demos that show the performance of the thing. i'm afraid the plugin doesn't currently run under Netscape or on platforms other than Windows, but the full download does.
Inferno and Plan 9 are both OSs "done right", maintaining a healthy balance between performance-related pragmatism and theoretical purity. compared to the tangled morass that is Java or any of the more recent Unix variants (and i'm afraid i don't exclude Linux), they're a breath of fresh air.
it was plan 9 which John Carmack once described as "achingly beautiful" and he's not wrong.
sometimes doing a bit more work is worth it for the other benefits it provides (we're talking about a system that uses a VM here!). if you want to avoid runaway memory usage (and in the process improve cache performance and reduce the need for swap/paging) then ref counting is the way to go.
Inferno uses this technique (actually it uses a generational GC as well to catch the more unusual circular references) and we can happily run a system doing useful stuff (shell, GUI apps, etc) in 512K of RAM.
as any real-time aficionado will tell you, predictability is often more important than raw performance - and that's what ref counting gives you. most OS paging algorithms are designed around the concept of a "working set" of memory that a process needs to be able to compute with. conventional GC often works to foil this approach.
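to make the ref counting behaviour concrete, here's a toy sketch in java (so the counts are kept by hand; this is not how Inferno's VM is implemented, and Node, retain, release and RefCountDemo are made-up names). it shows the two properties discussed: a tree is freed in full at the moment its last reference goes, and a cycle is never freed by counting alone.

```java
import java.util.ArrayList;
import java.util.List;

// Toy reference-counted node (illustration only).
class Node {
    static int live = 0;      // nodes currently "allocated"
    private int refs = 1;     // the creator holds the first reference
    private final List<Node> children = new ArrayList<Node>();

    Node() {
        live++;
    }

    void retain() {
        refs++;
    }

    // Adds a counted reference from this node to child.
    void addChild(Node child) {
        child.retain();
        children.add(child);
    }

    // Drops one reference; at zero the node is freed and lets go of its children.
    void release() {
        if (--refs == 0) {
            live--;
            for (Node c : children) {
                c.release();
            }
            children.clear();
        }
    }
}

class RefCountDemo {
    // Builds a root with three children, drops the root, and returns
    // how many of those nodes are still live afterwards.
    static int leftoverAfterTree() {
        int before = Node.live;
        Node root = new Node();
        for (int i = 0; i < 3; i++) {
            Node kid = new Node();
            root.addChild(kid);
            kid.release();    // root now holds the only reference
        }
        root.release();       // frees root and all three children immediately
        return Node.live - before;
    }

    // Two nodes referring to each other: neither count ever reaches zero.
    static int leftoverAfterCycle() {
        int before = Node.live;
        Node a = new Node();
        Node b = new Node();
        a.addChild(b);
        b.addChild(a);
        a.release();
        b.release();
        return Node.live - before;
    }

    public static void main(String[] args) {
        System.out.println("left after tree:  " + leftoverAfterTree());
        System.out.println("left after cycle: " + leftoverAfterCycle());
    }
}
```

the work done by the last release is proportional to the size of the tree being dropped, so it is predictable if you know your data structures. the leftover cycle is the case that needs a separate collector.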
cheers,
rog.
i'll keep plugging this 'cos i think it deserves attention. i would do so even if i wasn't working for the company...
Limbo has a similar thread-communication model to Occam's (if anyone's heard of Occam... no, thought not).
it uses synchronous channels:
somefunction()
{
    ch := chan of int;
    spawn otherfunction(ch);
    ch <-= 99;    # send value down channel
}
otherfunction(ch: chan of int)
{
    print("received %d\n", <-ch);
}
easy to use, and easy to reason about. one thread blocks until the other has received the value, hence the primitive can be used for locking, etc.
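for readers who know java better than limbo, roughly the same rendezvous can be sketched with java.util.concurrent.SynchronousQueue, whose put() blocks until another thread take()s the value. this is only an approximation of a limbo channel, and ChannelDemo and sendAndReceive are made-up names.

```java
import java.util.concurrent.SynchronousQueue;

class ChannelDemo {
    // Sends one value to a freshly started thread and returns what it received.
    static int sendAndReceive(final int value) {
        final SynchronousQueue<Integer> ch = new SynchronousQueue<Integer>();
        final int[] received = new int[1];

        Thread other = new Thread(new Runnable() {
            public void run() {
                try {
                    received[0] = ch.take();      // roughly: <-ch
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        other.start();                            // roughly: spawn otherfunction(ch)

        try {
            ch.put(value);                        // roughly: ch <-= value; blocks until taken
            other.join();                         // wait, so received[0] is safely visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sending", e);
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.printf("received %d%n", sendAndReceive(99));
    }
}
```

the amount of ceremony compared with the limbo version above is rather the point.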
well said. C's my 2nd favourite language. i used Objective-C for about 6 years and came across all the problems with its object model that i can see all too clearly in Java.
if you're interested in new languages, check out Limbo. IMHO it has all the simplicity and power that you're after, and its well designed semantics mean that programs are generally easy to debug and easy to maintain. the APIs are almost without exception very well designed and "as simple as possible without being too simple"...
it's a concurrent language and has a threading model that isn't based on archaic 1970s technology unlike Ja^H^Hsome other languages.
plus programs will run without change (unconditionally) on linux, bsd, plan 9, windows and on several native platforms (e.g. the Ipaq)
if you have access to a Windows system with internet explorer, there's a plugin that allows you to play around with some demos (source code included), or you can download a version for another platform (only that one's 14MB, ouch).
it's a beautiful language, and that can't be said of many...
i disagree. the java socket interface is not abstracted enough. it's still tied to TCP/IP. plan 9 and inferno have the cleanest networking interface i've seen.
want to make a connection?
connection := sys->dial("tcp!slashdot.org!httpd");
that's all! the address is a string with a protocol explicitly mentioned, so can easily be passed in dynamically as an argument. the program itself is protocol independent. if i had an ATM interface, i could use "atm!someATMaddress!56446" and the program would continue to work unchanged. IPV6? no problem, just change the underlying implementation...
not to mention the fact that access is through a filesystem-like abstraction - the actual network interface might not even be on the local machine...
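to show why a string address keeps the program protocol independent, here's a small java sketch that just splits an address of the form used above into its three parts. it does not dial anything, it only handles the full proto!host!service form, and DialAddress, parse and DialDemo are hypothetical names, not part of any Inferno or Java API.

```java
// Holds the three parts of a "proto!host!service" address string.
class DialAddress {
    final String proto;
    final String host;
    final String service;

    private DialAddress(String proto, String host, String service) {
        this.proto = proto;
        this.host = host;
        this.service = service;
    }

    // Splits on '!'; anything other than exactly three parts is rejected.
    static DialAddress parse(String addr) {
        String[] parts = addr.split("!");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected proto!host!service, got: " + addr);
        }
        return new DialAddress(parts[0], parts[1], parts[2]);
    }
}

class DialDemo {
    // The caller never mentions a protocol; it arrives as data in the string.
    static String describe(String addr) {
        DialAddress a = DialAddress.parse(addr);
        return a.proto + " -> " + a.host + " (" + a.service + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe("tcp!slashdot.org!httpd"));
        System.out.println(describe("atm!someATMaddress!56446"));
    }
}
```

because the protocol is just the first field of a string, it can come from a command line argument or a config file, and the calling code stays the same.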
of course, i'm completely biased, as i work for the company (Vita Nuova) that distributes it. it's not as bad as it might seem though, because the only reason i went for this job was that i'd already fallen in love with the language! but seriously, Limbo is a truly excellent language which, like Java, takes much of its heritage from C but, unlike Java, gets the fundamental language design right.
i know that Java is here to stay, but if you've done much Java programming, you'll know that there are reasons why you might want to keep an eye on alternatives. Limbo is a language that C programmers will like but C++ programmers will probably hate. there's no object-hierarchy! so you actually know what code is being invoked by a function call. debugging code does not require an intimate knowledge of obscure class hierarchies.
o it's completely type-safe. unlike java, it's impossible to get a run-time type exception.
o the syntax is very readable, but terse enough not to frustrate a C programmer...
o the type system is rich enough to make data-structures easy to create and manipulate, and the type syntax is beautifully clear. consider:
x: list of array of (string, int);
o memory management is automatic (garbage collected) but unlike most other GC languages, memory usage is economical and predictable, because almost all data structures are collected by reference counting. so when memory is tight, you have tight control on what memory is allocated, and when it's freed.
o and best of all, programs written in Limbo are completely portable, because they're surrounded by an entire virtual operating system, Inferno.
i could rant on for ages about nice aspects of the system, which is not surprising, given the people involved in its creation (Rob Pike and Dennis Ritchie amongst others).
anyway, we've had a free download available for some time, but our resident Windows-head has just come up with an Internet Explorer plug-in that runs Inferno inside a web page. the download is under 720K.
we've had fun coming up with a few demo programs so you can see how it performs compared to Java and what a typical Limbo program looks like. you can even get a shell prompt inside IE. (not that you can do a lot from it, as it's sandboxed...).
have fun! and bear in mind that the binaries of these programs will run unchanged under any incarnation of inferno... the "applet" API is identical to the normal one.
i hope that at least some people will Get It...
cheers, rog.
PS. i'm sorry we don't have a linux or netscape version, but we will have!
actually, as of last night, inferno is running on an Ipaq. still a little way to go before it's usable (e.g. the screen's the wrong way round currently) but it looks good.
i saw it today running the whole development environment with all the files imported transparently over the serial link. (that's the beauty of inferno - no extra code or compilation was required to accomplish this).
cheers, rog.
i think it's fascinating that here's a circuit that they built themselves, and they still cannot figure out how it works... reverse engineering has its limits!
PS. isn't code stored in NVRAM and used to program reconfigurable logic chips subject to the same copyright laws as any computer program? so presumably ripping this off and using it in your own product has about the same legality as selling a dodgy copy of MS Word... despite what others have said about trade secrets, there seems to be a fine line here between a "trade secret" and a "copyrightable piece of code". hmm.
[...]
Well, for starters, the trademark.htm URL doesn't exist, so there is no guideline for use of these 'trademarks'.
that's true, the URL doesn't exist - we're fixing that. but... i think your worries about the rest of the license are somewhat misconceived.
What is a classicist to do then? "I'm sorry, you can't have your lecture on Greek mythology, as all the names are currently trademarked..".
these are trademarks - we haven't sidelined a portion of the english language; we're just preventing other companies from trading using those names (and in fact it's not even as restrictive as that, as the trademarks only apply in, i think, certain sectors of the computer industry).
think about it! does the world stop talking about windows in buildings because Windows® is a trademark?? i don't think so. similarly, unless you are trying to market another OS called Inferno, or a protocol called Styx, the fact that those names are trademarked is completely irrelevant.
so have a look at the software! we have tried to make the license as unrestrictive as possible, so i hope you shouldn't have any problems with it.
cheers, rog.
actually, it doesn't need hosting inside another OS - that's just one of the ways it can work. it can, and does, run directly on all sorts of embedded hardware. the reason the free download is for the hosted configuration only is that there aren't many standard hardware platforms out there (and getting a new OS on bare hardware is rarely trivial).
inferno programs will run the same whether they're running on bare hardware or under another OS. as far as speed goes, user programs are interpreted by a virtual machine, faster than java, but still not up to machine-code levels. but that's not really the point. (responsiveness is excellent).
unlike atheos (and linux for that matter), inferno does incorporate a lot of genuinely revolutionary ideas, and it's mature enough for actual use. if you want to see how beautiful (and easy to program) software can be, it's worth a look...
cheers, rog.
Why, God? Why?
*sob*
because unix-style documentation is concise, clear, and tells you what you need to know?
and because it's infinitely better than the style of reference documentation found all too often these days, in tutorial style, telling you randomly distributed pieces of information that you need to know, but will never be able to find again...
the unix reference-manual style might require a certain amount of knowledge as a pre-requisite ("you mean i actually have to read the intro?!"), but for conveying to the reader the specifics of how to use components of a system, i've not seen anything to beat it.
for overview and tutorial information on how the various components fit together, there are various papers which try to provide this. (and more to come, when we get some space away from software development to work on documentation, yum!)
cheers, rog.
inferno doesn't currently run on either macos X or BeOS, but there's no reason at all why it can't. in a previous life, i was a nextstep/openstep hacker, so i imagine that if apple haven't mucked around with the APIs too much, then i should be able to port inferno reasonably quickly (it's almost entirely portable C).
BeOS i haven't programmed under, so i don't know how easy the port would be, but i doubt it would be that hard. we've got about a million priorities right now though, so adding another supported OS with a fairly small userbase is probably not near the top, unless there's a significant demand.
the inferno source is not expensive (<$300, given the strong dollar), so someone keen could probably do it themselves.
cheers, rog.
program is data.
data is program.
in some way, the data that a program reads becomes part of the program. in what way depends on how sensitive the program is to its input. for something like a .wav file, obviously not very. for something like a java class file, completely. buffer overflow attacks come somewhere in between the two.
but to say that all these files are "just" data files is to miss the point. even machine code is "just" data if you're using a microcoded CPU...
there's a major problem there: the assumption that there's a reasonably straightforward mapping from gene to meme. but i think that current assumptions are probably highly simplistic, and in fact, perhaps in general the mapping is not discoverable.
not so long ago, i read most of a book by Stuart Kauffman called "At home in the universe: the search for the laws of self-organisation & complexity". It became a little less solid towards the end, but the early chapters contained what was, to me, a brilliantly illuminating discussion of the way that networks of genes function.
kauffman argues that the expression of a genome is not just the simple reading of segments of the DNA which encode proteins which go away and build things, but that a set of genes really forms a boolean network, where the action of some gene can affect the expression of another gene, and vice versa.
what that means in computery terms is that the way your genes work is less like a shopping list (gene A implies obesity, gene B implies intelligence, etc), and more like a cellular automaton. if you remember some of your computer theory, you might remember that some simple CAs (e.g. Conway's Life) are Turing complete.
so what we have is essentially an evolved computer program. and if you think that some people write bizarre code, wait till you've seen some that's generated by genetic algorithm. then multiply that by billions of years' worth of evolution, raise to the power of the Halting Problem, and that's the order of the difficulty of decoding the genome!
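a tiny java sketch of what "boolean network" means here. the three genes and their rules are invented for illustration (they are not taken from kauffman's book), and GeneNetwork is a made-up name. each gene's next state depends on the current state of the others, so no gene can be read in isolation.

```java
class GeneNetwork {
    // One synchronous update. s[0], s[1], s[2] say whether the
    // (hypothetical) genes A, B and C are currently expressed.
    static boolean[] step(boolean[] s) {
        boolean a = s[1] || s[2];    // A is switched on by B or C
        boolean b = s[0] && !s[2];   // B needs A and is repressed by C
        boolean c = !s[0];           // C is repressed by A
        return new boolean[] { a, b, c };
    }

    // Renders a state as three characters, e.g. "A-C" when only B is off.
    static String render(boolean[] s) {
        StringBuilder sb = new StringBuilder();
        sb.append(s[0] ? 'A' : '-');
        sb.append(s[1] ? 'B' : '-');
        sb.append(s[2] ? 'C' : '-');
        return sb.toString();
    }

    public static void main(String[] args) {
        boolean[] state = { true, false, false };
        for (int i = 0; i < 4; i++) {
            System.out.println(render(state));
            state = step(state);
        }
    }
}
```

even with three genes the behaviour is a cycle through several states rather than a fixed list of traits, which is the shopping-list-versus-automaton point in miniature.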
by way of illustration of the sort of complexity that can arise when even simple systems are evolved in the real world, check out Adrian Thompson's web page. In particular, this paper has a fascinating analysis of the properties of some genetically evolved FPGA hardware. now this stuff is really simple - we're talking digital components, 100 gates, evolved to perform a simple discrimination process.
the circuit worked, but they didn't really have the faintest clue of how! because it evolved, it pushed the physics of the FPGA as far as they would go. to quote from the paper:
as anyone who's played with software knows, making a change in one place can have far-reaching implications. try experimenting with a simple 1 dimensional CA and changing the rules slightly - you'll get an almost completely different result.
that's why i argue for caution in the use of genetic engineering technology. actually, i'm not sure i do. nature has thrown so many genes together for so long that i doubt we can come up with much that does anything really useful that isn't just a simple isolated gene-to-attribute mapping.
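if you want to try the 1 dimensional CA experiment, here's a minimal java sketch of an elementary cellular automaton using the usual rule-number encoding (bit k of the rule gives the next state for the neighbourhood whose left, centre and right cells spell k in binary). the wrap-around edges, the choice of rules 90 and 30, and the names ElementaryCA, step, single and render are assumptions made for this example.

```java
class ElementaryCA {
    // Computes the next generation under the given rule number (0-255).
    // The row wraps around at the edges.
    static boolean[] step(boolean[] cells, int rule) {
        int n = cells.length;
        boolean[] next = new boolean[n];
        for (int i = 0; i < n; i++) {
            int left = cells[(i + n - 1) % n] ? 4 : 0;
            int mid = cells[i] ? 2 : 0;
            int right = cells[(i + 1) % n] ? 1 : 0;
            next[i] = ((rule >> (left | mid | right)) & 1) == 1;
        }
        return next;
    }

    // A row of the given width with only the middle cell switched on.
    static boolean[] single(int width) {
        boolean[] cells = new boolean[width];
        cells[width / 2] = true;
        return cells;
    }

    // '#' for a live cell, '.' for a dead one.
    static String render(boolean[] cells) {
        StringBuilder sb = new StringBuilder();
        for (boolean c : cells) {
            sb.append(c ? '#' : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] rules = { 90, 30 };   // their rule tables differ in two of eight entries
        for (int rule : rules) {
            System.out.println("rule " + rule);
            boolean[] cells = single(31);
            for (int gen = 0; gen < 12; gen++) {
                System.out.println(render(cells));
                cells = step(cells, rule);
            }
        }
    }
}
```

run it and compare the two pictures: the rule tables are nearly identical, the patterns are not.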
the claims that are made for genetic engineering are way overblown - genes might be the roadmap for life, but i bet they'll be an almost completely unreadable one.
as far as i know, the idea of inheritance is not fundamental to OO. the idea of OO was to provide data encapsulation, reusable code, and implementation interchangeability. Bertrand Meyer's "Object Oriented Software Construction" starts from the "five principles":
- linguistic modular units
- few interfaces
- small interfaces (weak coupling)
- explicit interfaces
- information hiding
i don't see inheritance in there anywhere. it's just a convenient design that people happen to have latched on to as "being" OO.
by those criteria limbo is just as much an OO language as any other, and perhaps more so. the use of explicit interfaces and the lack of inheritance means that the coupling between objects is weak, and as a result programs tend to be much more mutable than i've experienced in OO environments. you want to change the implementation of this object completely? no problem - just make sure you carry on implementing the same interface.
i'm not sure that "is-a" is something inherent to many problems. it is a way of looking at certain problems, sure, but i don't think it's an inevitable, or even a necessary concept.
the way i think of it is that when you're writing a piece of code that uses object A, you are aware exactly of the interface that A provides (or you should be) and the compiler should be able to make absolutely sure that you don't go outside that interface.
moreover, if i'm implementing object A, i know exactly what interface i want to present. it shouldn't matter in the slightest which objects i choose to use internally in order to implement that interface.
the main payoff to avoiding inheritance comes at the software maintenance stage. with an inheritance hierarchy, when inspecting some code that uses an object a of type A, there's no way of knowing which code is being invoked when something calls a method on a. it could be a subclass of A, which might or might not invoke its superclass method. i can't tell by reading the documentation for A what's going to happen, because A's idea of reality can be subtly subverted by a subclass. these problems can become really nasty when dealing with a large class hierarchy and a large program.
if an object is required to implement its entire interface, then these problems melt away. i am guaranteed that the module implementing the interface is responsible for all the behaviour it exhibits. so you don't tend to get bugs created by the subtle interaction of subclass with superclass invariants. in fact, it's the invariants that are probably the most important thing. if i write some code like:
x := 0;
function1()
{function2();}
function2()
{ x += 2; }
where function1 and function2 are part of an object's interface, i would like to be absolutely sure when looking at the code that x is 2 more after calling function1 than before. in a language like java (or objective-C, for that matter), i don't have that guarantee. this invariant, carefully maintained by the writer of the class, can be broken by someone carelessly overriding function2 and neglecting to call the superclass method.
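here is that scenario spelled out as a minimal java sketch. Counter, CarelessCounter and InvariantDemo are made-up names; the method names follow the fragment above.

```java
// The class whose author intends: after function1(), x has grown by 2.
class Counter {
    protected int x = 0;

    void function1() {
        function2();
    }

    void function2() {
        x += 2;
    }

    int value() {
        return x;
    }
}

// A subclass that overrides function2 and forgets the superclass call.
class CarelessCounter extends Counter {
    @Override
    void function2() {
        // no super.function2() here, so x is never updated
    }
}

class InvariantDemo {
    // Returns how much x changed across one call to function1().
    static int deltaAfterFunction1(Counter c) {
        int before = c.value();
        c.function1();
        return c.value() - before;
    }

    public static void main(String[] args) {
        System.out.println(deltaAfterFunction1(new Counter()));
        System.out.println(deltaAfterFunction1(new CarelessCounter()));
    }
}
```

deltaAfterFunction1 is written against Counter and cannot tell, from anything visible at the call site, which of the two behaviours it is going to get.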
I guess the big question is: Do you want a language which prevents you from doing something in all instances simply because *sometimes* doing that thing is the wrong thing? If that were the case, we should rid ourselves of gotos as well.
any high level language is a trade-off between safety and power. java (and limbo) chose to give up the safety of C-like pointers for the guarantee that arbitrary bits of memory couldn't be corrupted. but i don't think you'd find many people that would say that the power of the language has declined drastically because of that. on the contrary, the additional checking that the compiler can now do gives you more freedom to concentrate on the real meat of the program.
it's the same with inheritance. inheritance gives you the ability to implement some things conveniently (GUI widgets being the canonical example), but doing away with it means that code is vastly more readable, because you can see exactly what a piece of code is doing; there is no need to know your class hierarchy before you can see what the control flow is doing, because control flow is determined locally.
the same sort of thing applies to local variables in C. consider the code:
{
    int i;
    i = 99;
}
any C programmer can tell by looking at that code that it does absolutely nothing (cpp munging aside :-]); it has no side effects; assigning to i cannot change anything else in the program. that's the power of local variables: they provide a cast iron guarantee that the state of the variable is local.
when looking for bugs, this sort of guarantee is invaluable. who hasn't spent hours looking for a bug, only to discover it somewhere that it "couldn't" be!? the more possibilities you can rule out based on a quick glance at the code, the more productive your bug hunting will be.
that's why i like limbo so much. when it gives a guarantee, the guarantee is absolute. and the guarantee that the meaning of a name depends on the local code, not global state, is an excellent guarantee to be able to give.
The fact is, poor developers can write bad code in *any* language.
i completely agree. i've seen some pretty appalling code in Limbo too. but inevitably you're one day going to be asked "go and fix that bug!" in some of that code. that's the day that you bless the language design, because no matter how bad the author of the code, they can't break the guarantees of the language.
one can write (i think!) good code in any language too, if you're aware of the pitfalls and write stylised code that avoids them. but inheritance *is* a pitfall (look, even the inventors of the language fell into it - doesn't that say something?!) and IMHO the more pitfalls a language can avoid, without compromising on the power of the language, the better the language.
PS. limbo got rid of goto too. :-)
it's true that microsoft did this deliberately. but from what i'm given to believe, there are portability issues with java on non-microsoft platforms too, which derive from the fact that the underlying environment has platform-dependent differences.
inferno differs from java in that it's not just a VM and a set of libraries - those are just components in the operating system, and it's the OS that provides the true portability. i've heard people complaining about differences in GUI behaviour, differing library implementations, etc with java, and not only with relation to the microsoft VM. this, i think, is an inevitable problem with defining the portability layer at the library level, and having several vendors write the libraries.
it's much less of a problem with ports of inferno, because the interface to the underlying system is so narrow. to port a version of inferno, you have to write some code to create a window and copy bits to it, some code to map the native filesystem into a unix-like hierarchy, and some code to map the devices provided by the system into Inferno device format (e.g. the serial drivers). this is a far cry from re-implementing the entire API. you don't need any guidelines like "100% pure Java" for inferno, because the API semantics are the same, whether you're running under a 4 processor NT box with 2GB of memory, or a PDA with 1MB RAM, 1MB ROM and 2MB of flash.
Not portable? WTF are you talking about?? Are you using Microsoft's VM or something?
you make my point for me.
Is this just a matter of opinion (and thus not worth a lot) or do you have anything specific to critique the language on?
check out the thread on inheritance. somebody (a java developer) said it better than i. if you can get hold of a copy of the april 2000 edition of the IEEE Computer journal, then the article "coping with java programming stress" gives an excellent rundown on things that aren't right with java (by experienced java programmers). there's also: this by someone who knows their computer language stuff.
they say it better than i possibly could.