Linus Says No to 'Specs'
auckland map writes to tell us about an interesting debate that is being featured on KernelTrap. Linus Torvalds raised a few eyebrows (and furrowed even more in confusion) by saying "A 'spec' is close to useless. I have _never_ seen a spec that was both big enough to be useful _and_ accurate. And I have seen _lots_ of total crap work that was based on specs. It's _the_ single worst way to write software, because it by definition means that the software was written to match theory, not reality."
Who was it that said:
In theory, practice and theory are the same. In practice, they are not.
Linus does code to specs: the kernel is intended to comply with all sorts of formal and informal specs, and its developers pay attention.
What is missing is people writing and committing to specs for some important kernel internal interfaces and functionality. This attitude goes hand in hand, of course, with the lack of stable internal interfaces within the Linux kernel and is one of the major reasons why the kernel source has bloated to such a humungous size and why every kernel release needs to include all the accumulated garbage of the past decade. If internal kernel interfaces were specified and committed to for each major version, then driver and module development could be separated from the rest of the kernel.
Of course, Linus is right in one area: most specs are useless. There are two primary reasons for that. Either, the spec is poorly written; there are lots of those. Or, the spec describes a bad design; there are many more of those. Many of the original UNIX design documents were exemplary specs: they told you concisely what you could and could not rely on. On the other hand, many recent standards (like HTML or SOAP) are examples of well-written specs that are bad specs because the underlying designs suck. But the fact that many specs are bad doesn't mean that it is inevitable that the Linux specs would be bad; that only depends on Linus.
At a conservative estimate, I've pissed away half of my lifetime development effort dealing with instances where the documentation of an OS, APIs or SDKs doesn't match the actual behaviour. Every time I get sandbagged with that, I wish I could just read the damn source and see what's really going on.
Linus is quite right that a spec can be useful as a descriptive abstraction, but not as an absolute or proscriptive definition. When you're sitting there at the keyboard and hit a point where the behaviour differs from the spec, it doesn't matter why the spec is wrong, just that it is. Red pen it and move on.
If you were blocking sigs, you wouldn't have to read this.
Predictable code is good code. You want your code to do x when y happens, and everyone who relies on your code should know what to expect from your code under every circumstance. Kernels are supposed to be boring.
Specs may suck in some cases; if they do, they're badly written. It's an indictment of the person who wrote that spec, not the concept of specs in general. When I call a function, I expect it to do exactly what its documentation says, and it should comply with the documentation exactly.
I shouldn't have to read the code just to use it. That defeats the entire purpose of segmenting things out into separate pieces. You might as well be using gotos to write your spaghetti code.
Remember, there were no nuclear weapons before women were allowed to vote.
POSIX, VESA and even ttys are all examples of specificatons that sought to unify existing practice. The practice came first, then the theory. If you want an example that goes the other way, you would have to look to something like IP, which was created as a specificaton along with the first implementations.
Linus is correct, though. Specs are rarely useful breasts up-front. Standardization of existing practice is often useful, but that's another beast.
From one of my past lives, when I designed computer systems in automobiles, before my current stint as a grad student, I can describe how we used specs. Some of our business folks who had some basic training as engineers would get together with our customers folks who had basic engineer training, and hammer our a 100+ page spec document covering everything from basic operating precepts to acceptable failure modes. They would use this document as a means of discussing what they thought they wanted to buy, and what we thought we could supply for them, and then determine a cost from this proposed spec.
After enough words had been passed back and forth, both sides would agree on a version of the spec, money would be passed around, and hardware design engineers and software engineers from all over the world would get to work. At this point, the spec would be skimmed, people would get a rough idea of what everyone wanted, and a couple hundred of the first prototype version would be cobbled together.
Testers and verification people on both sides of the fence would look at this thing, first against the spec and make sure it included everything that was talked about, and then in the system to make sure that it would work they way they needed it to. This is the closest that the design ever got to the spec. After this point, everyone would start noticing places where the spec was either too rigid to be followed cost-effectively, or just plain wrong for our customer. Since rewriting a spec is a ton of work, it never got done, and in the end was just a basis for verification folks to look at the design and complain that it didn't work they way that they thought it should. I guess someone should have included them in the "cool peoples idea passing club", but, neah.
This article isn't about a general comment as to how software should be written; it's about implementing protocols. And protocols have to be defined by specifications -- you can't simply have some vague spec that says "it should do approximately this"; you have to precisely define the language of the protocol so that everyone knows what they are implementing.
That being said, Ted Ts'o makes the most important comment in the discussion: This is the reason for the IETF Maxim --- be conservative in what you send, liberal in what you will accept. In other words, when your code talks over the protocol, be minimalist and adhere strictly to the spec; but if you can implement your code in such a way that it is tolerant of small variances (e.g. combinations of commands which make sense but are not allowed by a strict interpretation of the protocol spec) and can do it safely, then you should. This is the approach I've always taken, and having done things like implement a major printer vendor's IPP server, I have found this approach makes interoperability and compatibiltiy a lot easier and less painful to achieve.
So at least from that standpoint, I think I can see what Linus is trying to say; at any rate I agree that strictly adhering to a spec simply for reasons of mathematical correctness is not always the most productive route.
I have been working with specs-driven projects for the most part of my professional career, and I can tell you that Linus has a point.
If specs were 100% accurate, then there would not be a need to write the code, because the specs could be automatically translated to code (we are talking about 100% accuracy here, not 99.999999%). But specs can realistically never be 100% accurate...it is the missing part from the specs that causes headaches.
For example, I have worked in a project that required conversions between coordinate systems: UTM to geodetic, geodetic to local cartesian, local cartesian to target, etc. The user expected to edit UTM coordinates in the GUI, but the specs for UTM coordinates where never mentioned in the Software Requirements Specification. So we searched the internet, found out what 'UTM' is, and coded the relevant functions.
You know what? the specs talked about UTM coordinates, but they actually meant MGRS! UTM means 'universal transverse mercator' whereas MGRS means 'military grid reference system'. Although the concept between the systems is the same (the Earth's surface is divided into rectangular blocks), the two systems have different calculations.
When we released the application to our customer, they freaked out seeing UTM coordinates, and of course they refused to pay. Then we pointed out that the specs talked about UTM coordinates, and they (thank God) admitted their mistake, paid us, and gave us time to change the application from UTM to MGRS.
But the application has never been correct 100% (from that point on) until recently handling MGRS coordinates, because it was very difficult to successfully change something so fundamental and yet so missing from the specs(we are talking about 160,000 lines of C++ code).
Do you want another example? in the same application, the program should display a DTED (digital terrain elevation data) file, i.e. a map created out of a file of elevation data. The specs did not say anything about the map display respecting the carvature of Earth. So we went out and implemented a flat model, where each pixel on the screen was converted linearly to a X, Y coordinate on the map.
Guess what? they meant the map display to be 'curved', i.e. respect the Earth's carvature. The specs did not say anything, until the application was connected to another application that produced the video image of the battlefield using OpenGL (and of course, since it was to be in 3d, the presented map was 'curved').
The result is that the application still has some issues regarding coordinate conversions.
After all these (and many more...) I am not surprised at all that some certain space agency's probe failed to reach Mars because of one team using the metric system and the other team using the imperial system. Even for NASA that they write tons of spec and they double- and triple- check them using internal and external peer review, specs were useless at the end.
So Linus has a point...
That's because Linus is talking about hardware specs. His view is that it's better to trust reality (how the device actually behaves) than a spec (how it's documented to behave). And he's right about that.
Outside the world of OS kernels there are many software projects where the 'reality' is much more changeable, much less solid. Reality in the case of a Purchasing application or your KDE/Gnome desktop applet or the latest FPS game is likely to be a case of 'what the customer/user wants'. That's something you really need to pin down. Doesn't mean it can't change but it needs to be clearly defined.
My experience getting into the kernel was usually motivated by trying to write user space programs. I finally learned what people meant when they say Linux isn't well documented.
I could care less whether there's a spec, but if I'm going to use an API, I have to know what it does. The ALSA (advanced Linux sound architecture) is absolutely the worst. The documentation is full of entries that have the form:
Now the thing I don't understand is this. If you went out to write a really big and important piece of software, wouldn't you want people to use it? What's the point of an undocumented API? It makes no sense whatsoever. You either want someone to use it, and you tell them how, or you make the methods private (i.e. not part of the API)?
Rant over.
Bertrand Russel tried to put mathematics on an immovable foundation of theory and logic, sadly it turned out to be impossible.
I don't think that you got very far with Bertrand Russell.
The higher-order issues he identified caused just a temporary hiccup in the development of logic. While undoubtedly fundamental, that problem paraphrases best as "There are hidden depths to this", rather than as "All is lost".
Godel applies, as always. You don't apply a theory outside of its domain of discourse, not if you know what you're doing anyway.
Russell showed that the domain of logic gets tangled if you use it to think about itself. Well (with hindsight) that is no surprise at all. The expert logician recognizes the necessary boundary, and virtualizes the outer domain before it can be handled by the inner domain logic.
Russell is doing just fine, thank you. Almost the entirety of mankind's technological world is founded on the logic which you describe as "impossible".
"The question of whether machines can think is no more interesting than [] whether submarines can swim" - Dijkstra
What Linus is saying is that specs are dumb, and the right way to do things is to let him make or delegate all decisions, and retain the right to arbitrarily change those decisions later.
A spec implies a commitment, and Linus is used to everything being in the air when it comes to Linux. A spec would also make it easier to fork Linux, and thereby make Linus less important.
Linus is no idiot... this is clearly a political stance.
Conformity is the jailer of freedom and enemy of growth. -JFK