even for non-programmers
by
Anonymous Coward
·
· Score: 3, Funny
Great! I'll print off a hardcopy and stick it on my refrigerator! I'm sure my wife will love it!
Re:even for non-programmers
by
Nexus7
·
· Score: 4, Interesting
I guess a "liberal arts" major would be considered the quintessential "non-programmer". Certainly these people profess a non-concern for most technology, and of course, computing. I don't mean that they wouldn't know about Macs and PCs and Word, but we can agree that is a very superficial view of computing. But appreciating an article such as this "leaky abstractions" one requires some understanding of the way networks work, even if there isn't any heavy math in it. In other words, the non-programmer wouldn't understand what the fuss is about.
But that isn't how it's supposed to be. Liberal arts people are supposed to be interested in precisely this kind of thing, because it takes a higher level view of something that is usually presented in a way that only a CS major would find interesting or useful, and generalizes an idea to be applicable beyond the specific subject, networking.
That is, engineers are today's liberal arts majors. It's time to get the so called "liberal arts" people out from politics, humanities, governance, management and other fields of importance because they just aren't trained to have or look for the conceptual basis of decision making and correctly apply it.
Re:even for non-programmers
by
jaredcoleman
·
· Score: 5, Insightful
Very funny! I agree that the average Joe is still going to be lost with the technical aspects of this article, but the author does generalize...
And you can't drive as fast when it's raining, even though your car has windshield wipers and headlights and a roof and a heater, all of which protect you from caring about the fact that it's raining (they abstract away the weather), but lo, you have to worry about hydroplaning (or aquaplaning in England) and sometimes the rain is so strong you can't see very far ahead so you go slower in the rain, because the weather can never be completely abstracted away, because of the law of leaky abstractions
I've heard a lot of people say that they can't believe how many homes, schools, and other buildings were destroyed by the huge thunderstorms that hit the states this past weekend, or that many people died. Hello, we haven't yet figured out how to control everything! American (middle to upper-class) life is a leaky abstraction. We find this out when we have a hard time coping with natural things that shake up our perfect (abstracted) world. That is what we all need to understand.
Although I used to program as a hobby, my eyes bugged out when I saw this article. It's actually quite interesting; I finally realize why the hell people program in lower level languages.
One point that I think could be addressed is backward compatibility. I really know nothing about this, but don't the versions of the abstractions have to be fairly compatible with each other, especially on a large, distributed system? This extra abstraction of an abstraction has to be orders of magnitude more leaky. The best example I can think of is Windows.
Re:Informative
by
Jamey12345
·
· Score: 2, Informative
COM and its descendants are supposed to take care of this. In reality they work relatively well, but also lead to larger and larger libraries. The simple reason is that COM has to remain backwards compatible, by way of leaving in the old functions and methods.
Re:Informative
by
Bastian
·
· Score: 4, Interesting
I think backward (really more slantwise or sideways) compatibility is almost certainly one of the reasons why C++ treats string literals as arrays of characters.
I program in C++, but link to C libraries all the time. I also pass string literals into functions that have char* parameters. If C++ didn't treat string literals as char*, that would be impossible.
Re:Informative
by
binaryDigit
·
· Score: 5, Interesting
I think it's a mistake to simply say that "high level languages make for buggier/bloated code". After all, many abstractions are created to solve common problems. If you don't have a string class then you'll either roll your own or have code that is complex and bug prone from calling 6 different functions to append a string. I don't think anyone would agree that it's better to write your own line drawing algorithm and have to program directly to the video card, vs calling one OpenGL method to do the same (well unless you need the absolute last word in performance, but that's another topic).
Exactly. The only way to do something more easily or more efficiently is to restrict your scope. If you know something about a particular operation, or if you can make a few assumptions about it, your life becomes much easier. Take sorting, for example. Comparison sorts run (at best) in Omega(n log n) time. However, if you know the maximum range of numbers k in a set of length n, and k is much smaller than n, you can use a counting sort and do it in Theta(n + k) time, which is effectively Theta(n). But what happens if you put a k+1 number in there? Well, all hell breaks loose.
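To make the counting-sort point concrete, here is a minimal sketch in C. The function name `counting_sort` and its return convention are assumptions made for illustration, not anything from the article. It assumes every value lies in [0, k] and refuses the input outright when that assumption is violated, which is the "k+1" case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: sorts n values in place, assuming every value
 * lies in [0, k]. Returns 0 on success, or -1 if a value falls outside
 * the assumed range or memory runs out. On failure the array is left
 * untouched. Runs in Theta(n + k) time. */
int counting_sort(unsigned *a, size_t n, unsigned k)
{
    assert(a != NULL || n == 0);

    size_t *count = calloc((size_t)k + 1, sizeof *count);
    if (count == NULL)
        return -1;

    /* Counting pass: this is also where the range assumption is checked. */
    for (size_t i = 0; i < n; i++) {
        if (a[i] > k) {            /* outside the restricted scope */
            free(count);
            return -1;
        }
        count[a[i]]++;
    }

    /* Rewrite pass: emit each value as many times as it was seen. */
    size_t out = 0;
    for (size_t v = 0; v <= (size_t)k; v++)
        for (size_t c = count[v]; c > 0; c--)
            a[out++] = (unsigned)v;

    free(count);
    return 0;
}
```

Calling `counting_sort(a, n, 3)` on an array that contains a 4 returns -1 instead of writing past the count table. Without the range check, the same call would index out of bounds.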
Another example: Java provides a pretty nifty mail API that you can use to create any kind of E-mail you can dream up in 20 lines of code or so. But you only ever want to send E-mail with a text/plain bodypart and a few attachments. So you make a class that does just that, and save yourself 15 lines of code every time you send mail. But suppose you want to send HTML E-mail, or you want to do something crazy with embedded bodyparts? Well it's not in the scope, so it's back to the old way.
In order to abstract you have to reduce your scope somehow, and you have to ensure that certain parameters are within your scope (which adds overhead). And sometimes there's just nothing you can do about that overhead (like in TCP). And occasionally (if you abstract too much) you limit your scope to the point where your code can't be re-used.
And as you abstract you tend to pile up a list of dependencies. Every library you abstract from needs to be included in addition to your library (assuming you use DLLs). So yes, there are maintenance and versioning headaches involved.
Bottom line: non-trivial abstraction saves time up front, but costs later, mostly in the maintenance phase. There's probably some fixed karmic limit to how much can be simplified, beyond which any effort spent simply displaces the problem.
Re:Informative
by
oconnorcjo
·
· Score: 3, Insightful
I think it's a mistake to simply say that "high level languages make for buggier/bloated code". After all, many abstractions are created to solve common problems. If you don't have a string class then you'll either roll your own or have code that is complex and bug prone from calling 6 different functions to append a string.
-by binaryDigit.
You said my own thoughts so well that I decided to quote you instead! Actually I thought the article just "stated the obvious" but that it didn't really matter. When I want to "just get things done", abstractions just make it so that I can do it an order of magnitude faster than hand coding the machine language [even assembler is an abstraction]. Abstractions allow people to forget the BS and just get stuff done. Are abstractions slower, bloated, and buggy? To some degree yes! But the reason why they are so widely accepted and appreciated is that they make life SIGNIFICANTLY easier, faster and better for programmers. My uncle, who was a programmer in the 1960s, had a manager who said "an assembler compiler took too many cycles on the mainframe and was a waste of time". Now in the 1960s that may have been true but today that would be a joke. Today, I won't even go near a programming language lower than C and I like Python much better.
There is a common misconception that permeates your statement. C++ is not a separate language from C, it is merely an incremental improvement, an add-on basically. That's why it's called C++ and not D (and yes, the name is an intentional joke).
So, "backwards compatibility" is really the wrong term, since C++ is just an extension of C, not a separate language.
If you're curious, yes, there was a B, but there was not actually an A (or rather, there was, but it was called ALGOL).
-- Under capitalism man exploits man. Under communism it's the other way around.
or you want to do something crazy with embedded bodyparts? Well it's not in the scope
The fun stuff is never in the scope:-(
Re:Informative
by
MrResistor
·
· Score: 3, Interesting
even assembler is an abstraction
I have to disagree. Every assembly instruction directly maps to a machine code instruction, so there is absolutely nothing hidden or being done behind the scenes.
Assembly is just mnemonics for machine code. There is no abstraction in assembly since it doesn't hide anything, it simply makes it easier for humans to read through direct substitution. You might as well say that binary is an abstraction; you'd be equally correct.
Also, there is no such thing as an "assembly compiler". There are assemblers, which are not compilers.
-- Under capitalism man exploits man. Under communism it's the other way around.
Re:Informative
by
__past__
·
· Score: 4, Interesting
If you're curious, yes, there was a B, but there was not actually an A (or rather, there was, but it was called ALGOL).
Between ALGOL and B, there was BCPL (and CPL before that). Hence there was a dispute whether the language following C should be called D or P (and AFAIK, for each name there were several experimental languages that all didn't succeed), until C++ became popular.
Re:Informative
by
GlassHeart
·
· Score: 4, Informative
C++ is not a separate language from C,
it is merely an incremental improvement
C++ is first of all definitely a separate
language, in the sense that a C++ compiler
will fail to compile legal C code. (Many
compilers accept both C and C++ code, but
must necessarily process them as either C
or C++, not both.) If C and C++ are not
"separate languages", then converting code
from C to C++ or C++ to C must be a trivial
task.
C++ is also a separate language in the sense
that good C++ code (the definition of which
does seem to differ depending on which edition
of Stroustrup you look at) looks little like
good C code. The STL (and templates in
general) and exceptions result in source code
that looks little like C.
it is merely an incremental improvement,
an add-on basically. That's why it's
called C++ and not D.
Stroustrup wrote: "I picked C++ because it
was short, had nice interpretations, and
wasn't of the form 'adjective C'" in his
own FAQ. No mention of emphasis
on C++ "merely" being an "incremental
improvement".
If you're curious, yes, there was a B,
but there was not actually an A (or rather,
there was, but it was called ALGOL).
Re:Informative
by
GlassHeart
·
· Score: 5, Informative
Every assembly instruction directly maps
to a machine code instruction, so there is
absolutely nothing hidden or being done
behind the scenes.
Nonsense. On the 80x86, for example, a
one-pass assembler cannot know if a forward
JMP (jump) instruction is a "near jump"
(8 bit offset) or a "far jump" (16 bit
offset). It must generate code to assume
the worst, so it tentatively creates a
"far jump" and makes a note of this, because
it doesn't know where it must jump to yet.
In the backpatching phase, it may now know
that the jump was actually "near", so it
changes the instruction to a "near jump",
fills in the 8-bit offset, and overwrites
the spare 8 bits with a NOP (no operation)
instead of shifting every single instruction
below it up by one byte.
A multi-pass assembler can avoid the NOP,
but the fact is still that the same JMP
assembly instruction can map to two
distinct machine language sequences. The
two different kinds of JMP are abstracted
and hidden from the programmer.
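The backpatching step described above can be sketched in C. This is an illustration under stated assumptions, not how any particular assembler is written: `encode_jmp` is a made-up name, the code targets 16-bit mode, and it always fills a 3-byte slot. Note that Intel's manuals call the 8-bit form a "short" jump (opcode EB) and the 16-bit relative form a "near" jump (opcode E9), while the comment above calls them "near" and "far".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoder for a 16-bit-mode x86 JMP placed at offset `at`
 * and targeting offset `target`. Always fills a 3-byte slot in `out`.
 * Returns 2 if the short form (EB rel8) plus a NOP pad was used, or
 * 3 if the longer form (E9 rel16) was needed. */
int encode_jmp(uint8_t out[3], uint16_t at, uint16_t target)
{
    assert(out != NULL);

    /* Displacements are relative to the end of the jump instruction. */
    int32_t rel_short = (int32_t)target - ((int32_t)at + 2);
    if (rel_short >= -128 && rel_short <= 127) {
        out[0] = 0xEB;                      /* JMP rel8 */
        out[1] = (uint8_t)rel_short;        /* two's-complement byte */
        out[2] = 0x90;                      /* NOP fills the spare byte */
        return 2;
    }

    uint16_t rel_long = (uint16_t)(target - (at + 3));
    out[0] = 0xE9;                          /* JMP rel16 */
    out[1] = (uint8_t)(rel_long & 0xFF);    /* low byte first */
    out[2] = (uint8_t)(rel_long >> 8);
    return 3;
}
```

The same source-level `JMP label` comes out as `EB 0E 90` or `E9 FD 01` depending only on how far away the label turns out to be. The programmer never sees that choice being made.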
There are very few legal C constructs that are not legal C++ [and behaving in the same way] - I'd even suggest that this set is as small as possible as can be with an incremental upgrade (eg. you're always going to get reserved-word conflicts when you incrementally add new reserved words).
There are so many C++ constructs that are not in C that C++ should not be considered an "incremental improvement" anymore.
The ANSI C++ Standard, for example, is four times the size of the ANSI C Standard.
It was incrementally improved by extension, which is what I said if you read my entire comment. Naturally, a library or compiler that didn't include those extensions wouldn't work properly with source that required them. I don't think this invalidates my position in the slightest, and I'm not sure why one would expect otherwise.
Also, could you give an example of some valid C that isn't valid C++? I have yet to encounter any.
-- Under capitalism man exploits man. Under communism it's the other way around.
I think C++ treats string literals as arrays of characters because strings *are* arrays of characters
That's true, strings are arrays (or at least sequences) of characters. The problems that the author refers to arise from the fact that C++ (and C) treat strings as null-terminated arrays of characters.
This means that: 1) You can't determine the length of a string without a linear-time operation; and 2) You can't ever have a string which includes an ASCII NUL character.
What else could they be?
Well, for one, you could get rid of the terminator and prefix each string with a byte (or a word) containing its length. Pascal does this; so does Java. I believe that PHP uses this representation, and I'm sure that a boatload of other languages do as well.
You aren't really wasting any more space than the C representation, and you gain the ability to include any binary data you want within a string, and to determine the length of a string with a constant-time operation. Most string operations are not made significantly more complex with this representation, either.
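A length-prefixed string along those lines might look like the following C sketch. The type `lpstr` and its helper functions are hypothetical names invented for this illustration. They are not taken from Pascal, Java, PHP, or any real library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed string: the length is stored up front,
 * so no terminator is needed and NUL bytes are ordinary data. */
typedef struct {
    size_t len;
    unsigned char data[];   /* flexible array member (C99) */
} lpstr;

/* Copies n bytes from src into a newly allocated lpstr.
 * Returns NULL if allocation fails. The caller frees with free(). */
lpstr *lpstr_new(const void *src, size_t n)
{
    lpstr *s = malloc(sizeof *s + n);
    if (s == NULL)
        return NULL;
    s->len = n;
    if (n > 0)
        memcpy(s->data, src, n);
    return s;
}

/* Constant-time length, in contrast to strlen's linear scan. */
size_t lpstr_len(const lpstr *s)
{
    assert(s != NULL);
    return s->len;
}
```

Building one from the three bytes 'a', NUL, 'b' gives a length of 3, whereas `strlen` on the same bytes stops at the embedded NUL and reports 1.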
--
Living better through chemicals
Re:Informative
by
GlassHeart
·
· Score: 4, Informative
Also, could you give an example of some valid C that isn't valid C++? I have yet to encounter any.
The most commonly-encountered difference is
probably:
#include <stdlib.h>
...
char *p = malloc(100);
which is perfectly valid (and good) C. It is invalid
C++ because the void * return type of malloc()
must be explicitly cast to (char *). However:
char *p = (char *) malloc(100);
would actually be substandard C. Since C
assumes that an unprototyped function returns
int, forgetting to include stdlib.h would
generate an error, which is silenced by the
explicit cast.
Furthermore, the most recent iteration of
ANSI C, known as C99, contains many features
not supported in C++.
Actually this is rubbish, you don't have to explicitly cast in C++. The compiler will guess the right cast type, the same as it does in C.
If you're going to complain about casting int to char * without warning, as you do in the second example, you should also complain about converting void * to char * without warning..
Actually this is rubbish, you don't have
to explicitly cast in C++. The compiler will
guess the right cast type, the same as it
does in C.
I'll let Bjarne Stroustrup (creator of C++) answer.
If you're going to complain about casting int to char * without warning, as you do in the second example, you should also complain about converting void * to char * without warning..
I'll let Steve Summit (author of the comp.lang.c FAQ) answer.
Healthy skepticism is a good trait on Slashdot, but it's best to stay polite ("rubbish") unless you know what you're talking about.
The underlying problem with programming
by
Jack+Wagner
·
· Score: 5, Insightful
I'm of the idea that the whole premise that high-level tools and high level abstraction coupled with encapsulation are the biggest bane of the software industry. We have these high level tools which most programmers really don't understand and are taught that they don't need to understand in order to build these sophisticated products.
Yet, when something goes wrong with the underlying technology they are unable to properly fix their product because all they know is some basic Java or VB and they don't understand anything about sockets or big-endian/little-endian byte-order issues. It's no wonder today's software is huge and slow and doesn't work as advertised.
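For readers who have not run into the byte-order issue, here is a small C sketch of the usual way around it. The helper names `put_be32` and `get_be32` are made up for this example. It builds the wire format one byte at a time with shifts, so the result is the same on a big-endian or a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes v into out[0..3] in big-endian
 * (network) order, regardless of the host's own byte order. */
void put_be32(uint8_t out[4], uint32_t v)
{
    assert(out != NULL);
    out[0] = (uint8_t)(v >> 24);   /* most significant byte first */
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Hypothetical helper: reads a big-endian 32-bit value back. */
uint32_t get_be32(const uint8_t in[4])
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Copying the raw bytes of a `uint32_t` into the buffer instead would give a different result on different machines, which is the kind of leak the comment is describing.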
The one shining example of this is FreeBSD, which is based totally on low level C programs and they stress using legacy program methodologies in place of the fancy schmancy new ones which are faulty. The proof is in the pudding, as they say, when you look at the speed and quality of FreeBSD, as opposed to some of the slow ponderous OS's like Windows XP or Mac OSX.
Warmest regards, --Jack
--
Wagner LLC Consulting Co. - Getting it right the first time
Re:The underlying problem with programming
by
binaryDigit
·
· Score: 4, Insightful
Well I'd agree up to a point. The fact is that FreeBSD is trying to solve a different problem/attract a different audience than XP/OSX. If FreeBSD was forced to add all the "features" of the other two in an attempt to compete in that space, then it would suffer mightily. You also have to take into account the level/type of programmers working on these projects. While FreeBSD might have a core group of seasoned programmers working on it, the other two have a great range of programming experience working on them. A few guys who know what they're doing working on a smaller featureset would always produce better stuff than a large group of loosely coupled and widely differing talents working on a monstrous feature set.
Re:The underlying problem with programming
by
jorleif
·
· Score: 5, Insightful
The real problem is not the existence of high-level abstractions, but the fact that many programmers are unwilling or unable to understand the abstraction.
So you say "let's get rid of encapsulation". But that doesn't solve this problem, because this problem is one of laziness or incompetence rather than not being allowed to touch what's inside the box. Encapsulation solves an entirely different problem, that is the one of modularity. If we abolish encapsulation the same clueless programmers will just produce code that is totally dependent on some obscure property in a specific version of a library. They still won't understand what the library does, so we're in a worse position than when we started.
Re:The underlying problem with programming
by
Tom7
·
· Score: 2
I guess you are a troll (I hope you don't really believe what you're saying!!), but you're missing an important point: FreeBSD has security holes frequently found in it (ie, buffer overflows, heap overflows, format string attacks); bugs that would be impossible to make in a language like (say) Java. Security holes are the most salient example, but there are many perils to trying to do things "manually" and by programmer brute-force in a language like C.
Java's not my favorite language, but programs written in it tend to be more robust than their C counterparts.
Re:The underlying problem with programming
by
jilles
·
· Score: 2
The problem is that you need all these high level abstractions to reduce the workload of creating large systems. There's just no way you could have those VB monkeys be productive in C and there's just no way you are going to replace them with competent C programmers. Besides, competent programmers are more productive using high abstraction level tools.
BSD is just a kernel + a small toolset. As soon as you start running all the regular stuff on top of it performance is comparable to a full blown linux/mac os X/windows desktop. Proof: mac os X, remove the non-BSD stuff and see what's left: no ui, no friendly tools, no easy access to all connected devices.
--
Jilles
Re:The underlying problem with programming
by
dfn5
·
· Score: 2
I'm of the idea that the whole premise that high-level tools and high level abstraction coupled with encapsulation are the biggest bane of the software industry.
The problem with programming at the lower level, like Xlib, is that it takes 2 years to get the first version of your program out. Then you move on to Xt and now it only takes 1 year. Then you move on to Motif and it only takes you 6 months. Then you move on to Qt and it only takes 3 hours. Of course you want it to look slick so you use kdelibs.
--
Thou hast strayed far from the path of the Avatar.
Re:The underlying problem with programming
by
Junks+Jerzey
·
· Score: 3, Insightful
I'm of the idea that the whole premise that high-level tools and high level abstraction coupled with encapsulation are the biggest bane of the software industry.
Now that simply isn't true. Imagine you need to reformat the data in a text file. In Perl, this is trivial, because you don't have to worry about buffer size and maximum line length, and so on. Plus you have a nice string type that lets you concatenate strings in a clean and efficient way.
If you wrote the same program in C, you'd have to be careful to avoid buffer overruns, you'd have to work without regular expressions (and if you use a library, then that's a high level abstraction, right?), and you have to suffer with awful functions like strcat (or write your own).
Is this really a win? What have you gained? Similarly, what will you have gained if you write a GUI-centric database querying application in C using raw Win32 calls instead of using Visual Basic? In the latter case, you'll write the same program in maybe 1/4 the time and it will have fewer bugs.
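For a sense of what "being careful" with string concatenation means in C, here is a minimal sketch of a bounds-checked append. The name `append_checked` and its return convention are assumptions made for this example. It is the kind of "write your own" replacement for strcat that the comment mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to the NUL-terminated string in
 * dst, whose total capacity is cap bytes. Returns 0 on success, or
 * -1 if the result would not fit, in which case dst is unchanged. */
int append_checked(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t used = strlen(dst);
    size_t add = strlen(src);

    /* Need room for the new characters plus the terminator. */
    if (used >= cap || add > cap - used - 1)
        return -1;

    memcpy(dst + used, src, add + 1);   /* copies the terminator too */
    return 0;
}
```

Plain strcat has no capacity argument at all, so every call site has to do this arithmetic by hand or risk an overrun. A Perl string concatenation hides the same bookkeeping inside the language runtime.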
Re:The underlying problem with programming
by
gillbates
·
· Score: 3, Interesting
Amen.
I can't tell you how many times this has happened to me. After 5 years of programming, my favorite language has become assembler - not because I hate HLL's, but rather, because you get exactly what you code in assembler. There are no "Leaky Abstractions" in assembly.
And knowing the underlying details has made me a much better HLL coder. Knowing how the compiler is going to interpret a while statement or for loop makes me much more capable of writing fast, efficient C and C++ code. I can choose algorithms which I know the compiler can optimize well.
And inevitably, at some point in a programmer's career, they'll come across a system in which the only available development tool is an assembler - at which point, the HLL-only programmer becomes completely useless to his company. This actually happened to me quite recently - my boss doesn't want to foot the bill for the rather expensive C++ compiler, so I'm left coding one of my projects in assembly. Because my education was focused on learning algorithms, rather than languages, my transition to using assembly has been a rather graceful one.
-- The society for a thought-free internet welcomes you.
Re:The underlying problem with programming
by
mickwd
·
· Score: 2
No, the parent post is not offtopic.
"The one shining example of this is FreeBSD, which is based totally on low level C programs"
So how, exactly, is this different to other UNIX-based operating systems such as the other BSDs, and Linux ? OK, not all the applications may be written in C, but then the same is true with FreeBSD. Perhaps the best (biggest ?) example of non-C (C++) software is KDE - which, of course, runs on FreeBSD.
"...they stress using legacy program methodologies in place of the fancy schmancy new ones which are faulty"
So all new programming methodologies are faulty ? Is this really why complex modern software is buggy ?
"The proof is in the pudding, as they say, when you look at the speed and quality of FreeBSD, as opposed to some of the slow ponderous OS's like Windows XP or Mac OSX"
The quality of MacOS X is a problem ? As opposed to FreeBSD ? And what OS family is MacOS X based on ?
Re:The underlying problem with programming
by
Yokaze
·
· Score: 5, Insightful
Don't blame the tools.
High level languages and abstractions aren't the problem, neither are pointers in low level languages. It's the people, who can't use them.
Abstraction does mean that you should not have to care about the underlying mechanisms, not that you should not understand them.
-- "Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
Re:The underlying problem with programming
by
radish
·
· Score: 5, Insightful
And inevitably, at some point in a programmer's career, they'll come across a system in which the only available development tool is an assembler
Do you REALLY believe that? Are you mad? I can be pretty sure that in my career I will never be required to develop in assembler. And even if I do, I just have to brush up on my asm - big deal. To be honest, if I was asked to do that I'd probably quit anyway, it's not something I enjoy.
Sure it's important to understand what's going on under the hood, but you have to use the right tools for the right job. No one would cut a lawn with scissors, or someone's hair with a mower. Likewise I wouldn't write a FPS game in prolog or a web application in asm.
The real point is that people have to get out of the "one language to code them all" mentality - you need to pick the right language and environment for the task at hand. From a personal point of view that means having a solid enough grasp of the fundamentals AT ALL LEVELS (i.e. including high and low level languages) to be able to learn the skills you inevitably won't have when you need them.
Oh, and asm is just an abstraction of machine code. If you're coding in anything except 1's and 0's you're using a high(er) level language. Get over it.
--
----
Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"
Re:The underlying problem with programming
by
Hard_Code
·
· Score: 2
Wow that's excellent. Just hope your chosen architecture doesn't get obsoleted. And you have to relearn everything. Who is the tool of whom? You or the computer?
Re:The underlying problem with programming
by
Junks+Jerzey
·
· Score: 5, Insightful
After 5 years of programming, my favorite language has become assembler - not because I hate HLL's, but rather, because you get exactly what you code in assembler. There are no "Leaky Abstractions" in assembly.
Ah, but you are wrong, and I'm speaking as someone who has written over 100,000 lines of assembly code. The great majority of the time, when you're faced with a programming problem, you don't want to think about that problem in terms of bits and bytes and machine instructions and so on. You want to think about the problem in a more abstract way. After all, programming can be extremely difficult, and if you focus on the minute then you may never come up with a solution. And many high level abstractions simply do not exist in assembly language.
What does a closure look like in assembly? It doesn't exist as a concept. Even if you write code using closures in Lisp, compile to assembly language, and then look at the assembly language, the concept of a closure will not exist in the assembly listing. Period. Because it's a higher level concept. It's like talking about a piece of lumber when you're working on a molecular level. There's no such thing when you're viewing things in such a primitive way. "Lumber" only becomes a concept when you have a macroscopic view. Would you want to build a house using individual molecules or would you build a house out of lumber or brick?
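One way to see what gets lost at the lower level is to build a closure by hand in C, which has no such concept either. The sketch below uses hypothetical names (`counter_env`, `counter_closure`, `make_counter`) and is only one possible encoding, not what any Lisp compiler necessarily emits. The captured variables become a struct, and the code becomes an ordinary function that takes that struct as an argument.

```c
#include <assert.h>
#include <stddef.h>

/* The "environment": variables a closure would have captured. */
typedef struct {
    int step;    /* captured when the closure is created */
    int total;   /* mutable state carried between calls */
} counter_env;

/* A hand-built closure: a code pointer paired with its environment. */
typedef struct {
    int (*call)(counter_env *env);
    counter_env env;
} counter_closure;

/* The closure body, written as a plain function over the environment. */
static int counter_call(counter_env *env)
{
    assert(env != NULL);
    env->total += env->step;
    return env->total;
}

/* Creates a counter that advances by `step` on every call. */
counter_closure make_counter(int step)
{
    counter_closure c = { counter_call, { step, 0 } };
    return c;
}
```

A caller writes `c.call(&c.env)` and gets 2, 4, 6 from `make_counter(2)`. Nothing in the resulting struct and function pointer says "closure", so the concept lives only in the programmer's head, which is the point of the lumber analogy.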
Re:The underlying problem with programming
by
ChannelX
·
· Score: 2
First off: nothing is impossible. The problem with Java (or VB, or PowerBuilder, or Delphi, or any other of the myriad of languages out there) is that they are all based on something else and the people who are creating/updating them might make mistakes. If you happen to run into one of those mistakes you have to code around it and then what happens when the particular bug that bit you is fixed? Will your code still work properly? A Java programmer I know was just bitten by a bug in one of the methods Java provides for creating URLs. This bug has been in existence for 2 *years* and it still hasn't been fixed. Java is full of crap like that so don't ever assume that anything is impossible in Java.
-- My blog: http://jkratz.dyndns.org/~jason/blog/
Re:The underlying problem with programming
by
Tom7
·
· Score: 3, Insightful
OK, fine: All programming languages have an implementation, and a host operating system. But switching from Java to C++ certainly won't save you from these kinds of problems. (In fact, there is only ONE C++ compiler that I know of that actually claims to be compliant with the C++ language definition; ie., every C++ compiler that people use to build programs is filled with bugs concerning the language's many insane idiosyncrasies!)
I only mean to point out Java as a *language* that has better abstraction properties than C++. (Personally, I prefer other less popular languages like SML, but Java serves the point as well. Just be careful not to take Java as the best example of a high-level language, because high-level languages can have better features and be more efficient than Java is.) Software written in a correct implementation of Java on a correct OS can not have buffer overflows. Programs written in C, even in a correct compiler (few exist) on a correct OS, can and frequently do have buffer overflows. I am reluctant to call this a programmer problem, because such bugs are so common, even among extremely good programmers. (Are the authors of Quake III Arena, Apache, MySQL, the Linux Kernel, ssh, BIND, Wu_ftpd just all bad programmers for having buffer overflows in their software? I personally don't think so...)
Some people are reading this article and using it as evidence to support low-level languages like C. ("Abstractions are leaky, so programmers need to have access to low-level details in order to work around leaky abstractions." or "Abstractions are leaky, so there's no point in using abstraction.") I think that's exactly backwards! Essentially, what I'm claiming is that C++ is a poor language for large software precisely because it does not allow programmers to create "tight" abstractions. Some languages do! These languages are much more pleasant to program in, and to build large software in! And in those languages, we can indeed make tight abstractions without the kinds of leaks he's described.
Re:The underlying problem with programming
by
Chris+Mattern
·
· Score: 5, Insightful
> There are no "Leaky Abstractions" in assembly.
At this point, may I whisper the word "microcode" in your ear?
Chris Mattern
Re:The underlying problem with programming
by
arkanes
·
· Score: 2
No, you can't. Your abstraction will not be perfect, otherwise you wouldn't need to abstract it. If it's not perfect, it will eventually break. If it breaks, then you need to know how it works to fix it. Someone who's never seen C and who doesn't know what a null terminated string is will be totally incapable of fixing a buffer overrun in the Java libraries. He's not saying we should all use C because abstractions leak. He's saying that when we teach people how to code, we shouldn't ignore the low level. His point about 2-dimensional arrays is one of the best, yet he kinda skipped over it - you can abstract a 2d array all you want to make it easy to work with, but it will be far more efficient when used one way than when used another - and which way that is will depend on your system, and compiler, and a variety of other things. And if you want to fix problems like that, you need to know how the low levels work.
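The 2D-array point can be sketched in C, where arrays are laid out row by row. The two functions below (names invented for this example) compute the same sum over a flat row-major buffer. The first walks memory in order, while the second jumps `cols` elements at every step. Which one is faster, and by how much, is not something this sketch measures. That depends on cache sizes, the compiler, and the array dimensions.

```c
#include <assert.h>
#include <stddef.h>

/* Row-by-row walk: the inner loop touches adjacent elements. */
long sum_row_major(const int *a, size_t rows, size_t cols)
{
    assert(a != NULL || rows * cols == 0);
    long sum = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            sum += a[r * cols + c];
    return sum;
}

/* Column-by-column walk: same result, but each step of the inner
 * loop strides cols elements ahead in memory. */
long sum_col_major(const int *a, size_t rows, size_t cols)
{
    assert(a != NULL || rows * cols == 0);
    long sum = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            sum += a[r * cols + c];
    return sum;
}
```

Behind a tidy "matrix" interface the two loops look interchangeable. The memory layout underneath is what makes them behave differently on large inputs.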
Re:The underlying problem with programming
by
metlin
·
· Score: 2
I think that parent meant that in _HIS_ line of work, you'd need an assembler at some point of time or the other.
Do you REALLY believe that? Are you mad? I can be pretty sure that in my career I will never be required to develop in assembler.
That would depend on what it is that you code, wouldn't it? I mean, when you're into graphics, you almost always have to cut corners, optimize math routines and use features particular to a card/processor and will end up using assembly.
And inevitably, at some point in a programmer's career, they'll come across a system in which the only available development tool is an assembler
That is a very bold statement to make, but then again it is true that many a time, even in places where there is absolutely no need for assembly, I've looked at asm to optimize some of my code. Perhaps you would encounter some situations where your solution would best be coded in assembly.
But just as how the parent made a pretty bold statement that you WILL need to use asm, your statement that you'd never have to code a line of assembly sounds equally ridiculous.
You never know.
Re:The underlying problem with programming
by
Dixie_Flatline
·
· Score: 2
This is a result of a lot of programmers not being Computing Scientists. They're just programmers.
And you don't need a degree to be a good Computing Scientist. If you have a good grasp of algorithmics, think about the work that you're doing, and pay attention to more than just the syntax of the language that you're programming in, you can certainly make a start of it. Having a formalized program to go through to get you to think like a Computing Scientist certainly gets you going in the right direction (and graduating from such a program doesn't guarantee that you'll be a good Scientist) but it isn't necessary.
Read. Read articles like these. Find books that don't just teach you the syntax of a language; pick up books that show you how to use methodologies and design patterns and algorithmics to your advantage. If you read, practice and think, you can be better than just a programmer.
Re:The underlying problem with programming
by
CharlieG
·
· Score: 2
Folk, Languages are tools - If you can only work in one language, you've only got a hammer. Sometimes you need a hammer, sometimes a saw, sometimes a Vertical Milling Center
For those who say abstraction is a problem, I guess they should give up OSes of all sorts, and program right on the iron - remember, the OS itself is an abstraction!
I've done everything from real time machine code (right on the iron, thank you), all the way up to "hand wave" design (wave your hand and say "Magic Happens")
Abstraction is useful, so you don't have to remember the details - the human brain can only deal with so many orders of magnitude of problem at a time. Some folks can deal with more than others, but everyone has a limit. Higher level languages are a way that you can deal with larger projects. A TCP class is a good place for assembler. Writing a Video Tape Library Card Catalog is NOT
-- --
73 de KG2V
For the Children - RKBA!
"You are what you do when it counts" - the Masso
Re:The underlying problem with programming
by
YU+Nicks+NE+Way
·
· Score: 5, Interesting
And I had my mod points expire this morning...
He's exactly right. No leaky abstractions? I once worked on a project that was delayed six months because a simple, three-line assembler routine that had to return 1 actually returned something else about one time in a thousand. The code was basically "Load x 5 direct; load y addr ind; subt x from y in place", and the logic analyzer showed that the contents of the address to be loaded into register y was 6. Literally, 999 times in a thousand, that left a 1 in register y. The other time...
We sent the errata off to the manufacturer, who had the good grace to be horrified. It then took six months to figure out how to work around the problem.
And, hey, guess what? Semiconductor holes are a leaky abstraction, too. And don't get me started on subatomic particles.
Re:The underlying problem with programming
by
GoofyBoy
·
· Score: 2
"may I whisper the word "microcode" in your ear?"
Only if you do in a deep, heavy voice with lots of breathing.
Insightful/Interesting comment so THEY don't mod me down as Perverted: All programs are abstractions of business and real world situations. The fact that it's leaky means that it needs more functionality/completeness.
-- The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
Re:The underlying problem with programming
by
kawika
·
· Score: 3, Insightful
Even at the machine code level, IEEE floating point is the mother of all leaky abstractions for real numbers.
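A small C++ sketch of that leak, assuming IEEE 754 double precision (which is what double is on mainstream hardware). The helper names and the tolerance value are chosen for illustration only.

```cpp
#include <cmath>

// 0.1, 0.2 and 0.3 have no exact binary representation, so arithmetic that is
// exact for real numbers is only approximate here.
bool naive_equal(double a, double b) {
    return a == b;
}

// A common workaround: compare within a tolerance. The default of 1e-9 is an
// arbitrary assumption; a suitable value depends on the magnitudes involved.
bool nearly_equal(double a, double b, double tolerance = 1e-9) {
    return std::fabs(a - b) <= tolerance;
}
```

With these, naive_equal(0.1 + 0.2, 0.3) is false while nearly_equal(0.1 + 0.2, 0.3) is true: the "real number" abstraction holds only until you compare for equality.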
Re:The underlying problem with programming
by
biobogonics
·
· Score: 2, Interesting
After 5 years of programming, my favorite language has become assembler - not because I hate HLL's, but rather, because you get exactly what you code in assembler. There are no "Leaky Abstractions" in assembly.
Ah, but you are wrong, and I'm speaking as someone who has written over 100,000 lines of assembly code. The great majority of the time, when you're faced with a programming problem, you don't want to think about that problem in terms of bits and bytes and machine instructions and so on. You want to think about the problem in a more abstract way.
I'd love to put Randy Hyde (author of High Level assembler) in the same room with Monte Davidoff (Multician, PL/I fan and author of the math package in Altair Basic).
Sometimes the abstraction is best cast at a lower level, that's one reason Knuth used MIX in his "Art of Computer Programming". Other times, higher level languages don't do the job.
Here are three examples:
1) Write a transparent filter for Windows 9x that runs in a DOS box. It must handle binary files without discarding LF on input and prefixing LF with CR on output. Try various C compilers and fail.
2) Translate Microsoft MBF (Microsoft Binary Format) single precision to IEEE singles. Yes you can do it in C, but the assembly version is compact and elegant (ignoring exponent underflow and de-normalization). Portable - no!
3) Examine the built in random number generator in PCC 1.2c. (DeSmet's C). It was supposed to be the same algorithm used in the so called "minimal standard" (also common to APL) but it's buggy. Not only does the C generated library code completely screw up by confusing unsigned and signed arithmetic, but it's a horror to debug. Even restricting yourself to 8088 code, a routine using simulated division is faster, cleaner and easier to verify as correct. On a 386+, even in real mode DOS, an assembly routine is a snap.
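The signed/unsigned trap in example 3 is easy to reproduce. Below is a sketch of the Park-Miller "minimal standard" generator, x' = 16807 * x mod (2^31 - 1), assuming that is the algorithm the post refers to. The product does not fit in 32 bits for most seeds, so this version widens to 64 bits before multiplying; minstd_next and minstd_nth are names made up for this example, and this is not the DeSmet library code.

```cpp
#include <cstdint>

// One step of the "minimal standard" Lehmer generator.
// The seed is assumed to be in the range [1, 2^31 - 2].
std::uint32_t minstd_next(std::uint32_t seed) {
    const std::uint64_t multiplier = 16807;    // 7^5
    const std::uint64_t modulus = 2147483647;  // 2^31 - 1
    // Widening to 64 bits first is the whole point: a 32-bit multiply
    // would silently wrap, and a signed one would go negative.
    return static_cast<std::uint32_t>((multiplier * seed) % modulus);
}

// Applies the step function `steps` times, starting from `seed`.
std::uint32_t minstd_nth(std::uint32_t seed, int steps) {
    for (int i = 0; i < steps; ++i) {
        seed = minstd_next(seed);
    }
    return seed;
}
```

Park and Miller published a check value for exactly this purpose: starting from a seed of 1, the value after 10000 steps should be 1043618065. The simulated-division trick mentioned above exists to get the same result on machines without a wide multiply.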
Re:The underlying problem with programming
by
ajs
·
· Score: 2
Do you REALLY believe that? Are you mad? I can be pretty sure that in my career I will never be required to develop in assembler.
Oh so!?
And if your C code starts to break in odd ways, and you find that it's a GCC code gen bug, then will you just throw up your hands, or will you write that section of code as inline asm?
And even if I do, I just have to brush up on my asm - big deal.
I don't know what you program in, but imagine someone who only knows how to program in a high-level code generation tool that writes code in the language YOU use saying "if I ever have to program in the underlying language, I'd just have to brush up on it."
To be honest, if I was asked to do that I'd probably quit anyway, it's not something I enjoy.
Do you quit jobs every time one of the duties isn't fun? You must have quite a string of jobs behind you. Either that, or you do some things that you don't enjoy in order to take on a majority of work that you do, no?
Sure it's important to understand what's going on under the hood, but you have to use the right tools for the right job. No one would cut a lawn with scissors, or someone's hair with a mower.
In the real world, you usually don't run into a 2-inch square lawn or a 5-acre head of hair, but in programming such absurd problem domains are our bread and butter.
Likewise I wouldn't write a FPS game in prolog or a web application in asm.
Oh no?!
What if your FPS needed part of its AI to do something infrequently, but in a manner that really begged for a backward chaining solution? I'd throw prolog in there if I could find a decent prolog compiler.
No web applications in asm?! *snicker* Every SSL implementation I've seen uses ASM to speed the crypto. sendfile(2) was designed specifically for Web applications and is very hardware savvy, so it would be near impossible to implement in pure C.
Yes, when these needs come up you hide them behind an abstraction (e.g. an SSL library or a system call), but someone had to say "hey, I need a Web app that does X, and X needs a low-level implementation!" Then they wrote that low-level code and then built an abstraction around it.
That is the key to abstracting correctly. Not that you throw away all of your abstractions when they become cumbersome, but that you select out specific portions of your abstraction to improve by "denormalizing them", if you will, into lower-level tools.
Re:The underlying problem with programming
by
slamb
·
· Score: 2
Now that simply isn't true. Imagine you need to reformat the data in a text file. In Perl, this is trivial, because you don't have to worry about buffer size and maximum line length, and so on. Plus you have a nice string type that lets you concatenate strings in a clean and efficient way.
But this is still a leaky abstraction:
if you put a huge file or even line into a variable in memory, there's a cost. Performance could suffer or it could just not run at all. [*]
making string concatenation (for example) so easy may lead you to do more than is necessary and more than would be ideal from a performance perspective. Certainly with high-level stuff, it's harder to see the cost of a single line of code.
But it's still a useful one because these things are smaller/less common problems than the ones in C that you mentioned.
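The two costs above can be sketched in C++ rather than Perl. The first function streams, holding one line at a time; the second takes the whole input as a string and builds the whole output by concatenation, so its memory use grows with the file. Function names are hypothetical, and the "reformatting" (prefixing line numbers) is just a stand-in task.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Streaming version: only the current line is held in memory
// (a single enormous line would still be a problem).
std::size_t number_lines_streaming(std::istream& in, std::ostream& out) {
    std::string line;
    std::size_t count = 0;
    while (std::getline(in, line)) {
        ++count;
        out << count << ": " << line << '\n';
    }
    return count;
}

// In-memory version: input and output both live in strings, and every
// += may reallocate and copy. Convenient, but the cost is hidden.
std::string number_lines_in_memory(const std::string& whole_input) {
    std::istringstream in(whole_input);
    std::string result;
    std::string line;
    std::size_t count = 0;
    while (std::getline(in, line)) {
        ++count;
        result += std::to_string(count) + ": " + line + "\n";
    }
    return result;
}
```

Both produce the same text for small inputs; the difference only appears when the input stops fitting comfortably in memory.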
I think Joel is right - the really good programmers are the ones who understand the system at all levels. That's something I strive for personally, since I don't anticipate that ever changing.
But at the same time, many of the specific leaky abstractions he mentioned can be plugged. The C++ one, certainly - it really only is like that because of backward compatibility. Java has no such problem. The SQL one, maybe not - but even when I have to mess with the query planner, it's less work than having to do everything myself.
And as a general rule, I think our abstractions have lowered the entrance barrier to programming, increased the scale an expert can work on, and increased the speed at which an expert can do things. From the article, I don't think he disagrees. He just pointed out that they can be dangerous, which is definitely true.
[*] - Perl 6 will support streaming regular expressions, so in one way this is less of a problem, or at least can be sidestepped. But still you might not know that and put it into a variable first, you might put large chunks of the results of the regexp into a variable, or you might make regexps that backtrack...so the leak is still there, just smaller.
Re:The underlying problem with programming
by
blamanj
·
· Score: 2
There are no "Leaky Abstractions" in assembly.
Horse pucky. Consider 'INCR R0'. You have a n-bit implementation of the integer abstraction. As soon as you overflow that register your abstraction leaks all over itself, and anything that depends on it.
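The same leak can be shown in C++ with a fixed-width unsigned type standing in for the register. The 8-bit width is an assumption made to keep the numbers small; incr8 and incr8_checked are illustrative names, not any real instruction set.

```cpp
#include <cstdint>

// Models incrementing an 8-bit register: the result wraps modulo 2^8.
std::uint8_t incr8(std::uint8_t r0) {
    return static_cast<std::uint8_t>(r0 + 1);
}

// Same increment, but reports the wrap the way a carry flag would.
// Returns true if the result is a valid integer increment, false on overflow.
bool incr8_checked(std::uint8_t r0, std::uint8_t& result) {
    result = incr8(r0);
    return result != 0;  // the only way to reach 0 by adding 1 is wrapping from 255
}
```

incr8(254) gives 255 as the integer abstraction promises, but incr8(255) gives 0, and any code that assumed "x + 1 is greater than x" inherits the leak.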
Re:The underlying problem with programming
by
Tom7
·
· Score: 2
I'm not saying that *nobody* should understand how things work, but the understanding that a compiler or runtime writer needs is different than the understanding that an application developer should need. Most application developers do not need to know which way it is faster to traverse an array, because that level of efficiency simply isn't important for most applications. To tell you the truth, there's no reason why the 2d array abstraction couldn't include a map or fold function (an iterator to Object-Oriented folks) that traverses it in the more efficient manner, abstracting away that detail from the programmer as well.
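A rough C++ sketch of that idea: a 2D container that offers a fold and chooses the traversal order itself. Grid and its members are hypothetical names, and the row-major storage is an assumption; the caller never sees it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D container that hides its storage order from callers.
template <typename T>
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols, T initial = T())
        : rows_(rows), cols_(cols), cells_(rows * cols, initial) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    T& at(std::size_t r, std::size_t c) { return cells_[r * cols_ + c]; }

    // Visits every element exactly once, straight through the underlying
    // buffer. The visiting order is an implementation detail, so this suits
    // order-independent operations such as sum, max or count.
    template <typename Acc, typename F>
    Acc fold(Acc acc, F f) const {
        for (const T& cell : cells_) {
            acc = f(acc, cell);
        }
        return acc;
    }

private:
    std::size_t rows_, cols_;
    std::vector<T> cells_;
};
```

A caller writes something like g.fold(0, [](int acc, int x) { return acc + x; }) and gets the cache-friendly order without knowing it exists. The catch is that the abstraction only covers operations that do not care about order.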
In a language that supports abstractions well, and under well designed abstractions, these kinds of breakdowns ("leaks") are rare. Though more knowledge seldom hurt anyone, if we were going to either spend time teaching students how compilers work so that they can debug strange compiler errors (these happen, of course, but are pretty rare) or teaching students how to create good abstractions and use them appropriately (these kinds of problems are much more common!!), I favor the latter!
Re:The underlying problem with programming
by
be-fan
·
· Score: 2
Not so. Using low level abstractions is manual and error prone. The compiler is far better at micromanaging, so why not let the compiler do the work? Take C++ templates for example. They are a very high-level construct, but generate code that is often better than hand-tuned C (for scientific programs, for example, or stuff like qsort or standardized datastructures). Then take stuff like virtual functions. FreeBSD uses something exactly like virtual functions in its driver code. You got bits like this:
struct driver_ops {
    int (*init)(int param);
    int (*do_stuff)(int param);
    int (*cleanup)(int param);
};
struct driver {
    ...
    struct driver_ops *ops;
    ...
};
This construct generates the *exact* same code as
class driver_iface {
public:
    virtual int init(int param) = 0;
    virtual int do_stuff(int param) = 0;
    virtual int cleanup(int param) = 0;
private:
    ...
};
class driver_imp : public driver_iface {
public:
    /* override pure virtual functions */
    ...
};
Do you think that using the first construct (which is dangerous and requires you to initialize the functions pointers correctly) is any better than the second? I agree with you to the extent that you should *know* how to work in the low-level, to fix bugs as they arise. But you should work in the high-level, to make sure that tedium and micromanagement doesn't get the better of you.
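For comparison, here is a filled-in sketch of both styles that compiles as C++. The "null" driver, its trivial behaviour, and the run_* helpers are invented for illustration and are not taken from FreeBSD. The sketch shows that the two constructs are used the same way; it does not demonstrate what machine code any particular compiler emits for them.

```cpp
// C style: a hand-built table of function pointers.
struct driver_ops {
    int (*init)(int param);
    int (*do_stuff)(int param);
    int (*cleanup)(int param);
};

static int null_init(int param) { return param; }
static int null_do_stuff(int param) { return param * 2; }
static int null_cleanup(int) { return 0; }

// Every slot must be filled by hand; a forgotten entry becomes a null
// pointer that is only discovered when it is called.
static const driver_ops null_ops = { null_init, null_do_stuff, null_cleanup };

struct driver {
    const driver_ops* ops;
};

int run_c_style(const driver& d, int param) {
    d.ops->init(param);
    int result = d.ops->do_stuff(param);
    d.ops->cleanup(param);
    return result;
}

// C++ style: the compiler builds and checks the table.
class driver_iface {
public:
    virtual ~driver_iface() {}
    virtual int init(int param) = 0;
    virtual int do_stuff(int param) = 0;
    virtual int cleanup(int param) = 0;
};

class null_driver : public driver_iface {
public:
    // Omitting any of these overrides is a compile error at instantiation.
    int init(int param) override { return param; }
    int do_stuff(int param) override { return param * 2; }
    int cleanup(int) override { return 0; }
};

int run_cpp_style(driver_iface& d, int param) {
    d.init(param);
    int result = d.do_stuff(param);
    d.cleanup(param);
    return result;
}
```

The practical difference is where mistakes surface: the first version fails at run time if a pointer is left unset, the second refuses to compile if an override is missing.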
PS> As for the FreeBSD vs WinXP or OS X comment, that's useless. First, all the kernels are written in C (except for OS X which uses the aforementioned C++ construct in drivers). Second, it's always the algorithms that are more important than anything else. FreeBSD simply has better algorithms. To the extent that a high-level language allows you to focus on your algorithms rather than micromanagement, they're even better than low level code.
-- A deep unwavering belief is a sure sign you're missing something...
Re:The underlying problem with programming
by
J.+Random+Software
·
· Score: 2
Wagner appears to live in the US, which is hardly known for having an infallible legal system in which nuisance and SLAPP lawsuits cost innocent victims nothing.
Re:The underlying problem with programming
by
be-fan
·
· Score: 2
Every single compiler I know of uses v-tables. Sure you can't guarantee that behavior, but the basic guarantee (not by standard but by convention) is that using a virtual function is as cheap as dereferencing a pointer to a function. That's all you really need to know, until a bug turns up at which point you can just consult your compiler manual. But the key idea is that you can ignore it until you can't anymore, instead of having to pay attention to it all the time.
-- A deep unwavering belief is a sure sign you're missing something...
Re:The underlying problem with programming
by
Old+Wolf
·
· Score: 2
I like C and C++ for the same reason. I know the languages well enough now that (barring compiler bugs!!) what I write is what I get.
Could this not be the same for HLLs?
Re:The underlying problem with programming
by
radish
·
· Score: 2
You have at least taken some time to formulate a > 1 sentence reply, and you seem to both agree with me (at the end) and disagree (at the beginning). So I guess I'm confused, but that's nothing new:)
I don't know what you program in, but imagine someone who only knows how to program in a high-level code generation tool that writes code in the language YOU use saying "if I ever have to program in the underlying language, I'd just have to brush up on it."
If they had already learnt it, I'd say fine. I know asm, I've written compilers. I also know smalltalk, miranda, prolog, ml, perl, sh, sql, C, modula, pascal, C++, VB and Java. And probably others too. But, many of those I don't use much - why? Either because I don't like them or I don't need to use them. Usually both. I am in the fortunate position of working on things I like.
Oh so!?
And if your C code starts to break in odd ways, and you find that it's a GCC code gen bug, then will you just throw up your hands, or will you write that section of code as inline asm?
I'm currently a server-side Java developer/architect, and believe me, asm is of limited use in that arena.
No web applications in asm?! *snicker* Every SSL implementation I've seen uses ASM to speed the crypto. sendfile(2) was designed specifically for Web applications and is very hardware savvy, so it would be near impossible to implement in pure C.
I don't think you know what a web application is. SSL is what's known in the trade as a "library". Yes I'm being childish, but come on, SSL is no more a web application than some device driver is an operating system - one is just a tiny piece of the other. I've never implemented an SSL layer, because I don't have to, because as you rightly point out, it's been abstracted away for me. Abstraction is a good thing, it's what stops you writing the same thing (and making the same mistakes) over and over again.
Web apps are, by their nature, big complex beasts. There are few times when I'd say something is totally inappropriate, but writing such an app in asm is simply an insane concept, surely you understand that. Do you have any idea what 2m lines of Java would convert to in asm? It would be unmanageable, unmaintainable and would take literally _years_ to build. By which time it wouldn't be needed anymore.
What if your FPS needed part of its AI to do something infrequently, but in a manner that really begged for a backward chaining solution? I'd throw prolog in there if I could find a decent prolog compiler
I didn't say an AI library, I said an FPS. By which I meant the whole thing, graphics? sound? network code? Prolog is not suitable for any of those. Let me reiterate - abstraction is a good thing. Using the right tool for the job is a good thing. Where implementing a specific component calls for a tool/language, use it. Then wrap it up and allow access to it from whatever language/environment you are using for the overall application. That's abstraction, and I think that's the bit we agree on.
--
----
Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"
Re:The underlying problem with programming
by
radish
·
· Score: 2
That would depend on what is it that you code, wouldn't it? I mean, when you're into graphics, you almost always have to cut corners, optimize math routines and use features particular to a card/processor and will end up using assembly.
Of course, that's why I said _my_ career. I don't do graphics code:)
But just as how the parent made a pretty bold statement that you WILL need to use asm, your statement that you'd never have to code a line of assembly sounds equally ridiculous.
A fair point, but his statement applied to the whole universe of developers (EVERYONE will need to use asm) vs mine which applied to me only (I won't). Still a chance it would be wrong, but a much smaller chance:)
--
----
Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"
Re:The underlying problem with programming
by
Old+Wolf
·
· Score: 2
The guy's point was that at least with C you only need to grok one layer of abstraction, but with say SQL over an XML database, you need to grok several new levels of abstraction *as well as* the C level(s), since your SQL and XML tools were probably written in C. The upshot being that programming these days requires much more sophisticated abstraction layer knowledge by the programmers, to be able to "use them properly" , as you would say.
Re:The underlying problem with programming
by
Junks+Jerzey
·
· Score: 2
But this is still a leaky abstraction:
I think you're stretching it here. Just because a program falls apart in some cases (programmer uses an N^3 algorithm; program written for 100K files is run on 100 gigabyte file) does not mean that the abstraction is leaky.
Part of using regular expressions, for example, is to know when they exhibit poor performance. And in that case you can change your approach. A classic example is to do 10 different searches instead of using one big expression with ten possibilities in it. This is not a leaky abstraction. It would be leaky if there was no clean way to do it properly, other than to chuck Perl and write it in C.
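The two approaches look roughly like this when translated to C++ with std::regex (the original discussion is about Perl, so this is only an analogy). Both helper names are made up, and the word list is assumed to contain plain literals with no regex metacharacters. No claim is made here about which is faster on any given engine.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// One big expression: word1|word2|...|wordN.
// Assumes the words contain no regex metacharacters.
bool matches_any_regex(const std::string& text,
                       const std::vector<std::string>& words) {
    if (words.empty()) return false;
    std::string pattern;
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i > 0) pattern += '|';
        pattern += words[i];
    }
    return std::regex_search(text, std::regex(pattern));
}

// N separate plain substring searches, stopping at the first hit.
bool matches_any_plain(const std::string& text,
                       const std::vector<std::string>& words) {
    for (const std::string& word : words) {
        if (text.find(word) != std::string::npos) return true;
    }
    return false;
}
```

Both give the same answers; switching between them is a change of approach inside the same language, which is the sense in which the comment argues this is not a leak.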
Re:The underlying problem with programming
by
renehollan
·
· Score: 2
Yet, when something goes wrong with the underlying technology they are unable to properly fix their product because all they know is some basic java or VB and they don't understand anything about sockets or big-endian/little-endian byte alignment issues. It's no wonder today's software is huge and slow and doesn't work as advertised.
Oh yeah. Definitely.
I'm one of the cranky, grizzled, old farts (well, 41, which, in this biz, is old). The early serious work I did involved Z80 assembler for X.25 PADS, switches, and mobile radio modems. Ever code a FEC to combat Rayleigh fading for a radio modem in a police car? We hacked practically to the bare metal 'cause that was the only way to get acceptable performance. We were few and far between, compared to the Cobol weenies that took "Business Programming" in school.
Of course, over the years, one moves from straight assembler, to C, to C++, and instead of rolling one's own kernels and schedulers, starts to use offerings like pSOS, and MQX, for real-time embedded apps, and Linux and FreeBSD for time-constrained work. And, one leverages language, and library, abstractions because they make your development life easier and more productive.
Then, something doesn't work. And, guess what? Us "old farts", who either understand the underlying implementations, or can make an educated guess about them, are often the only ones who can find and fix the problem. This skill often extends to areas where the abstractions are foreign: I once singlehandedly repaired a bunch of Java servlet and RMI code developed by an outsourced "Java expert" team, despite being quite the Java newbie myself.
The problem was simple enough: it was a thread-safety issue within the servlet engine, and "smelled" that way, from the looks of how it was failing. Damned, but I know all about those kinds of dangers and instinctively code defensively when it comes to multi-threaded code. The actual task of picking up enough Java to understand the intent of the code and the APIs being used, was no big deal.
I take that kind of skill, to drill down through the abstraction, and find and fix the problem, for granted. I joke, "That's why they pay me the big bucks!". I have found this skill lacking in many new graduates and people entering the field, having had abstractions and the language du jour rammed into their heads instead of basic computer principles: perhaps one in ten that I interviewed can sort their way out of a paper bag. That saddens me.
Yet, despite these shortcomings, it appears to be specific abstraction knowledge that's in demand, and not an inherent understanding of how the systems work their magic. To use an analogy, yes you need a chauffeur to drive you from A to B, but you also need a mechanic when the car breaks -- HR people are hiring chauffeurs on the assumption that cars never break or are abused.
If you need people who can adapt to whatever the abstraction of the month is, and drill down and understand the mechanics of the underlying implementations, call me. I'm looking for work.
Re:The underlying problem with programming
by
be-fan
·
· Score: 2
Um, what's a small typecase? For reference, on a 300Mhz PII with Visual C++ 6.0, a virtual function call is 10x slower than a regular function call. Significant, but you don't make virtual function calls all that often (like you wouldn't put one in an inner loop of a math routine, that's what templates are for!)
-- A deep unwavering belief is a sure sign you're missing something...
Re:The underlying problem with programming
by
ClosedSource
·
· Score: 2
I'm even older than you and I know where you're coming from.
It always amuses me that on Slashdot often the measure of your programming manhood is based on whether you program for Unix in C using command-line tools. To paraphrase Henry V, I would hold my manhood cheap if I hadn't written assembly code without an OS.
Having said all that, I don't believe in making anything harder than it needs to be. We did all that low-level stuff because we had to, not because we were trying to prove something.
One factor about being an old fart that you may have noticed: It's not just your age and salary that makes you harder to employ, but the fact that you know too damn much for your own good. The last thing a younger supervisor wants to hear about his pet idea is that you tried something like it before and found it didn't work. My hard-won advice is not to give people the value of your experience unless they really want it.
Good luck!
Re:The underlying problem with programming
by
ajs
·
· Score: 2
You make some good points, but you're missing my point around abstraction. Yes, there are not many web apps which are written in 100% assembler (there are some, and leave you to guess what platforms that would make sense on). However, there aren't many *operating systems* written 100% in assembly. Linux, for example, has a very tiny core of assembly for each platform.
The CPAN library for Perl has a ton of C code, but only small bits and chunks in each module, where it's needed.
But, if you program in C, will you ever have to know assembly? Probably.
If you program in Perl, will you ever have to know C? Probably.
If you program in Java... well, Java is a special case. It does tend to use libraries that are written (at least in part) in C++, but Java really tries (and fails, of course) to prove this Slashdot story wrong. The idea that abstractions must be leaky is anathema to Java's whole world view.
My point being that most Web applications rely on SSL or sendfile(2) and the person who wrote those had to know assembly.
Did you think that the first sendfile implementation was just for grins? It was likely someone who could not get the performance out of his app that he needed.
The AI thing was another great example. Why the heck would you apply the same language selection criteria to sound management as you would to your game AI?! That would be a huge mistake.
If someone asked me what the one right language was to perform some very large task in, I would probably respond, "English... at least for the requirements specification... then re-evaluate what languages are good for each piece".
Re:The underlying problem with programming
by
bluGill
·
· Score: 2
Only tool? There was a time when assembly was not available. I know more than one person who has manually entered the boot code into the front panel of a computer. Several of them modified that code in some way.
No abstractions? Assembly is an abstraction for the BINARY that the computer runs. I have worked with computers for which my assembler was buggy (there were better ones, but I couldn't afford them), so I brought out my reference manual and programmed directly in binary. It was painfully beaten into me at that time that a simple JMP is an abstraction for as many as 7 different binary codes, depending on exactly what is going on. You don't know if you end up with a long or short JMP in most cases. (You know about register or memory stored locations)
Speaking of memory, you don't know where each byte is in your program in assembly. You don't do a JMP to location 0xdeadbeef, you do a JMP to a label which just happens to map to that location, but the assembler figures out that detail.
Re:The underlying problem with programming
by
Zeio
·
· Score: 2
Thank you for your mention of FreeBSD. I concur with you in this "me too" post.
The quality and speed of FreeBSD leaves much to be desired in other operating systems, including Linux.
-- Legalize the constitution. Think for yourself question authority.
Re:The underlying problem with programming
by
metlin
·
· Score: 2
Of course, that's why I said _my_ career. I don't do graphics code:) :-)
A fair point, but his statement applied to the whole universe of developers (EVERYONE will need to use asm) vs mine which applied to me only (I won't). Still a chance it would be wrong, but a much smaller chance
Well, all I meant is that it doesn't really depend what you do, right? You could still encounter a situation where you might need asm.:-)
I say this because today, given the market situation and economic recession, you really can't be sure what you'd be asked to do as a developer. That increases the chances of developers being asked to code something as basic as asm at some point of time or other in their career, and refusing to might be a fine line between having a job or losing one.
Of course, you could be absolutely right in your statement that you might _never_ have to write a line of asm in _your_ career.:-)
The thing is, we never know! I'm working on technologies which I never ever thought I'd work on, not that I'm complaining, but a few years ago I'd not have believed anybody telling me that I'd be doing what I'm doing today.
Anyway, done enough ranting:-) My 0.02!
Re:The underlying problem with programming
by
ajs
·
· Score: 2
If it's done right, in an environment where you're reading from a disk controller and writing to an ethernet device buffer, you really just want to do a DMA transfer. In general, you are right, but there are special cases where you can get insane performance improvements by never going through main memory.
Re:The underlying problem with programming
by
Anonymous+Brave+Guy
·
· Score: 2
I only mean to point out Java as a *language* that has better abstraction properties than C++.
I'm curious about why you say this, because I would have said that C++'s single biggest strength was precisely that you can write software at a reasonably high level of abstraction most of the time, but also go to a lower level as and when necessary.
Surely Java's big strength relative to C++ is its vast and reasonably portable standard libraries, rather than the expressive power of the language itself? In the latter respect, C++ offers several considerably stronger features in the OO area (most significantly, probably, deterministic destruction), and of course there's all the whizzy stuff you can do with templates now that compilers are supporting them close to fully.
In contrast, most of Java's oft-cited "advantages" are simply features that are missing, and which you're at liberty to ignore in C++ as well. The only two big things that come to mind otherwise are the garbage collection and the provision of a finally keyword, and both of these are readily countered by that deterministic destruction, as long as you have someone smart enough to use it (which, bizarrely, many C++ programmers don't seem to be).
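A minimal sketch of what deterministic destruction buys you in place of finally. ScopedLog is a made-up stand-in for any resource (file, lock, connection); it records when it is acquired and released so the ordering is visible.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource wrapper: acquires in the constructor, releases in
// the destructor, and writes both events to a log.
class ScopedLog {
public:
    ScopedLog(std::vector<std::string>& log, std::string name)
        : log_(log), name_(std::move(name)) {
        log_.push_back("open " + name_);
    }
    ~ScopedLog() { log_.push_back("close " + name_); }

    ScopedLog(const ScopedLog&) = delete;
    ScopedLog& operator=(const ScopedLog&) = delete;

private:
    std::vector<std::string>& log_;
    std::string name_;
};

// The destructor runs as soon as the scope is left, including when it is
// left by an exception, so no finally block is needed for cleanup.
std::vector<std::string> run_with_failure() {
    std::vector<std::string> log;
    try {
        ScopedLog resource(log, "file");
        throw std::runtime_error("boom");
    } catch (const std::runtime_error&) {
        log.push_back("caught");
    }
    return log;
}
```

The log comes back as "open file", "close file", "caught": the release happens during stack unwinding, before the handler runs, and the caller never had to write cleanup code.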
C++'s template wizardry may never be quite as smooth and efficient as native language features for things like closures, discriminated unions and pattern matching, but today's libraries are getting close enough to be very useful in practice. The lack of such things in Java is one of its major weaknesses in terms of abstraction tools, IMHO. I've worked on several large (well, >1MLOC, anyway) C++ projects that have been made very much nicer by the use of such things.
-- If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Re:The underlying problem with programming
by
Tom7
·
· Score: 2
> In contrast, most of Java's oft-cited "advantages" are simply features that are missing, and which you're at liberty to > ignore in C++ as well.
It's true that Java is lacking some features that C++ has (parametric polymorphism through templates comes to mind, maybe operator overloading if you like that sort of thing). But not all of its "features" are really features (to me). For instance, some might say that C++ has a feature of "letting me" manage my memory myself. I don't think of this as a feature, because it often leads to memory leaks or heap corruption (double deletes!). I like to have my memory managed automatically. Some might say that being able to cast someone's abstract data structure to a concrete type so I can use my special knowledge of how it's actually implemented is a "feature", but I think that it's precisely the library developer's ability to control how his library is used that's a feature. I would say that Java's nice and portable libraries are a direct result of this language support for abstraction. (And, to an extent, the politics of its development..)
I don't have any problem with languages that let you get at the details (Java does, through its native method interface) when appropriate; I just think that C++ tempts programmers too much by having things like unsafe arrays, pointer arithmetic, unchecked casts, and new/delete right there in front of you in the core language.
Java is getting generics soon, so that will close a gap in its abstraction capabilities. And, of course the languages I actually like to program in (Java is not one of them) like SML and O'Caml have had polymorphism for a long time, as well as some of the other keen features you mention like higher-order nested functions, algebraic datatypes, pattern matching, fancy module systems, etc.
Re:The underlying problem with programming
by
Anonymous+Brave+Guy
·
· Score: 2
I agree with much of what you write, but I'd place the emphasis differently. I don't think the access to low-level features in C++ is a problem. Rather, the problem is that they are used by default by most C++ programmers, probably because of the way the C++ community developed from the C one, and the fact that even today most C++ textbooks and tutors aren't good at de-emphasizing this. C++ itself provides high level equivalents to all of the things you mention, which are usually substituted trivially for the dangerous low-level ones. There should be no reason, in typical high-level C++ code, to ever use a raw pointer, unchecked array or unsafe cast; you have smart pointers, vector, dynamic_cast and such instead.
Oh, and don't even get me started on Java's generics. A whole new generation of ill-informed Java evangelists is about to rise up and tell us how C++'s templates are no longer an advantage, because they haven't noticed that in C++ you can do more than write generic containers with them. :-(
Oh, and I agree that functional languages are nice, BTW. I would be very happy programming in a language that has the elegance of functional languages and the raw power of things like C and C++. Now if only some clever programmer would invent it...
-- If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Re:The underlying problem with programming
by
Tom7
·
· Score: 2
I'm with you on that except that I don't think it's easy to ignore manual memory management in C++. Sometimes it is nice to be able to do your own finalization, but most of the time I simply hate keeping track of the memory I've allocated. Snap-in garbage collectors are inefficient because they have to be conservative to preserve the C++ semantics. Reference counting can take care of this in some situations, but it's not as general as garbage collection and it has a higher overhead (I believe).
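The claim that reference counting is "not as general as garbage collection" can be shown with a small sketch. It is in Python rather than C++ or Java, purely because CPython happens to combine reference counting with a separate cycle collector that can be switched off. The `Node` class and the helper functions are hypothetical names made up for the illustration, and the behaviour shown is specific to CPython.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self, name):
        self.name = name
        self.other = None


def make_chain():
    """Build two nodes with no cycle; return a weak reference to the head."""
    a = Node("a")
    a.other = Node("b")
    return weakref.ref(a)  # a weak reference does not keep `a` alive


def make_cycle():
    """Build two nodes that point at each other; return a weak reference to one."""
    a = Node("a")
    b = Node("b")
    a.other = b
    b.other = a
    return weakref.ref(a)


gc.disable()  # leave only reference counting active

chain_ref = make_chain()
cycle_ref = make_cycle()

# The chain's counts drop to zero when make_chain returns, so it is freed at once.
chain_freed_by_refcount = chain_ref() is None
# The cycle keeps its own counts above zero, so counting alone never frees it.
cycle_freed_by_refcount = cycle_ref() is None

gc.collect()  # run the tracing cycle collector explicitly
cycle_freed_by_gc = cycle_ref() is None

gc.enable()
```

This only illustrates the generality point. It says nothing about the relative overhead of the two schemes, which the comment above is explicitly unsure about.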
Except for manual memory management, O'Caml supports a lot of the useful low-level C stuff while having most of the nice features of a modern functional language... It's also very fast. Any reason not to use that, then?
Re:The underlying problem with programming
by
Anonymous+Brave+Guy
·
· Score: 2
I'm not sure about the C++ GC stuff. AFAICS, there's nothing to stop you writing your own full-blown GC, as long as your platform supports suitable multithreading and sync'ing protocols to let you co-ordinate between a GC'd pointer and a GC thread. You give up the portability and deterministic destruction, but such will always be the price of a GC system in a language with low-level features anyway.
I'm in two minds about OCaml as well, but I don't really know enough of it to have a well-informed opinion. I like the elegance and simplicity of SML, and somehow the OO and non-declarative parts of OCaml seem out of place in a functional language, powerful as they may be. Perhaps that's just because I mentally draw an inappropriate line between your average procedural/OO language and the declarative/functional world of ML, though.
-- If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Re:The underlying problem with programming
by
Tom7
·
· Score: 2
> I'm not sure about the C++ GC stuff. AFAICS, there's nothing to stop you writing your own full-blown GC, as long as your platform supports suitable multithreading and sync'ing protocols to let you co-ordinate between a GC'd pointer and a GC thread. You give up the portability and deterministic destruction, but such will always be the price of a GC system in a language with low-level features anyway.
It's possible that you'd be able to create a GC library for C++. But, it would require you to use ONLY the special gc'd malloc and special smart pointers. That would mean that you wouldn't be able to link in library code and have that be garbage collected, and you'd have to be careful about what constructs you used if you wanted your program to work. (And it would most likely still be less efficient than one with language and runtime support built-in.) Anyway, all I'm saying is that I think the cost of unsafe language features like C++ has can sometimes be more than what it seems on the surface, because they preclude you or the compiler writer from implementing something like Garbage Collection in a general way.
Also, there's no reason why GC needs multithreading -- most garbage collectors that I know of run at allocation time when space runs out. (If you need real-time guarantees or something then usually that means another GC thread.)
Re:timeout
by
binaryDigit
·
· Score: 3, Insightful
Well I wouldn't say that it's reliable "because there are timeouts". AAMOF, timeouts just complicate things. So you timeout waiting for packet N, you request a resend of it, and in the interim, guess what, packet N shows up, now you have two N's. Your code is now more complex in having to deal with this situation. Timeouts are just another parameter used to adjust the behaviour of the algorithms that control the protocol. Getting deterministic results from a nondeterministic foundation involves making observations, accepting some compromises, making some simplifying assumptions, and then writing code that takes all those things into account to come up with something that usually works.
In other words, TCP is obliged to somehow send data reliably using only an unreliable tool.
How is this news? All technologies, on some level, are inherently unreliable. Therefore, reliability is always obtained by adding some kind of redundancy to an unreliable tool.
I've never seen a technology touted as "reliable" that achieved that reliability without some kind of self-checking or redundancy somewhere. Maybe that's the author's point, but he makes it sound as if TCP/IP is unique in this regard.
This is what programming is all about. It seems pretty obvious to me.
-- And the men who hold high places must be the ones who start
To mold a new reality... closer to the heart
but he makes it sound as if TCP/IP is unique in this regard
I think he's just using it as an example that almost anyone can relate to in the internet age. And while it is obvious to those of us who code/administer/tinker with such things, his use of the "hollywood actors" analogy would seem to point to the fact that his audience is not us.
Re:timeout
by
Minna+Kirai
·
· Score: 4, Informative
"Reliable" means "always works", it doesn't mean "always obeys the spec". (Unless you use a circular definition)
A timeout is a legal result by the TCP specification, but it's not reliable, because your data didn't make it through.
By the IP specification, your data might not make it either- and that's a legal result because the spec allows it to drop packets for any reason at all. That doesn't mean IP is reliable, just that it obeys its own definition.
Of course, no real protocol can ever meet this restrictive definition of reliable. Some maniac can always cut through your wires or incinerate your CPUs. Calling TCP a "reliable protocol" is just a shorthand for "as much more reliable than the underlying protocols as we could manage"
The timeout you mention does make TCP more reliable than IP, because it alerts you to the data loss, where the application can possibly take steps to retransmit it sometime in the future. But it's not as if TCP could ever achieve the perfect reliability that the simplest, most abstract description of it would imply. Which is why, as the author says, those who rely on the abstractions can get bitten later.
The ultimate leaky abstraction
by
nounderscores
·
· Score: 5, Insightful
Is our own bodies.
I'm studying to be a bioinformatics guy with the University of Melbourne and have just had the misfortune of looking into the enzymatic reactions that control oxygen based metabolism in the human body.
I tried to do a worst case complexity analysis and gave up about half way through the Krebs cycle.
When you think about it, most of basic science, some religion and all of medicine have been about removing layers of abstraction to try and fix things when they go wrong.
Re:The ultimate leaky abstraction
by
ez76
·
· Score: 2
I'm studying to be a bioinformatics guy with the university of melbourne and have just had the misfortune of looking into the enzymatic reactions that control oxygen based metabolism in the human body.
I tried to do a worst case complexity analysis and gave up about half way through the krebs cycle
You see, sometimes it's better to analyze the source code rather than the binary or network traffic.
This reenforces the important point...
by
venomkid
·
· Score: 2, Insightful
...to start with, or at least be competent with, the basics.
Any good programmer I've ever known started with the lower level stuff and was successful for this reason. Or at least plowed hard into the lower level stuff and learned it well when the time came, but the first scenario is preferable.
Throwing Dreamweaver in some HTML kiddie's lap, as much as I love Dreamweaver, is not going to get you a reliable Internet DB app.
--
vk.
Re:This reenforces the important point...
by
bunratty
·
· Score: 2
Any good programmer I've ever known started with the lower level stuff and was successful for this reason. Or at least plowed hard into the lower level stuff and learned it well when the time came, but the first scenario is preferable.
The problem with this "start with the low-level" approach is that no matter where you start, there's a lower level that is abstracted away. That is, unless you start with the physics of electricity and slowly work up to how transistors work and how they're wired up to be logic gates, how the gates are arranged to be functional units within a CPU, etc.
I think the best approach is to start in the middle with a high-level language with a text editor and command line compilation. Then, as needed, they can later learn lower level assembly language and higher level IDEs, and then later learn even lower level digital design and even higher level code generation. With this approach, you're not abstracting so much away that beginners have no idea what's going on, but you're abstracting enough away that they can understand basic programming concepts immediately.
-- What a fool believes, he sees, no wise man has the power to reason away.
This is true of almost any engineering profession
by
today
·
· Score: 2, Interesting
The mechanical, electrical, chemical, etc, engineering fields all have various degrees of abstractions via object hiding. It just isn't called "object hiding" because these are in fact real objects and there is no need to call them objects because it is natural to think of them that way. When debugging a design in any of these fields, it is not unusual to have to strip down layers and layers of "abstraction" (ie, pry into physical objects) to get to the bottom of a real tough problem. Those engineers with the broadest skills are usually the best at dealing with such problems. There isn't really anything new in the article.
Abstractions don't kill systems, people kill...
by
redfiche
·
· Score: 2, Interesting
Abstractions are good things, they help people understand systems better. An abstraction is a model, and if you use a model, you need to understand its limitations. High level languages have allowed a tremendous increase in programming productivity, with a price. But just as you cannot be really good at calculus without a thorough understanding of algebra, you cannot be a really good coder if you don't know what's going on underneath your abstractions.
Great article, but don't throw out the high level tools and go back to coding Assembler.
--
Brevity is the soul of wit
-- Polonius
Leaky slashdotted server...
by
dereklam
·
· Score: 3, Funny
I said that TCP guarantees that your message will arrive. It doesn't, actually. If your pet snake has chewed through the network cable leading to your computer, and no IP packets can get through, then TCP can't do anything about it and your message doesn't arrive.
Unfortunately, his Slashdotted server is proving that to us right now.
Moore's law buries Leaky Abstractions law
by
Anonymous Coward
·
· Score: 2, Insightful
Hiding ugliness has its penalties. Over time processor performance buries these penalties. What Joel doesn't tell you is that abstraction can buy you productivity and simply put, make programming easier and open it up to larger audiences.
Maybe someone out there prefers to program without any abstraction layers at all, but they inherit so much complexity that it will be impossible for them to deliver a meaningful product in a reasonable time.
Re:Moore's law buries Leaky Abstractions law
by
Minna+Kirai
·
· Score: 2
Over time processor performance buries these penalties.
Sometimes. But not all penalties can be resolved by more CPU cycles. A faster CPU can't repair your severed ethernet wire. It can't change all the existing HTML browsers and C++ compilers to cover up supposed "flaws".
And unless this CPU is awesome enough to enable real AI, it can't save us from future shortcomings in computer-interface languages either.
Visual Basic and abstraction breakdown
by
RobertB-DC
·
· Score: 5, Interesting
As a VB programmer, I've *lived* leaky abstractions. Nowhere has it been more obvious than in the gigantic VB app our team is responsible for maintaining. 262 .frm files, 36 .bas modules, 25 .cls classes, and a handful of .ctl's.
Much of our troubles, though, come from a single abstraction leak: the Sheridan (now called Infragistics) Grid control.
Like most VB controls, the Sheridan Grid is designed to be a drop-in, no-code way to display database information. It's designed to be bound to a data control, which itself is a drop-in no-code connection to a database using ODBC (or whatever the flavor of the month happens to be).
The first leak comes into play because we don't use the data control. We generate SQL on the fly because we need to do things with our queries that go beyond the capabilities of the control, and we don't save to the database until the client clicks "OK". Right away, we've broken the Sheridan Grid's paradigm, and the abstraction started to leak. So we put in buckets -- bucketfuls of code in obscure control events to buffer up changes to be written when the form closes.
Just when things were running smoothly, Sheridan decided to take that kid with his finger in the dike and send him to an orphanage. They "upgraded" the control. The upgrade was designed to make the control more efficient, of course... but we don't use the data control! It completely broke all our code. Every single grid control in the application -- at least one and usually more in each of 200+ forms -- had to have all-new buckets installed to catch the leaks.
You may be wondering by now why we haven't switched to a better grid control. Sure enough, there are controls out there now that would meet 95% of our needs... but 1) that 5% has high client visibility and 2) the rest of the code works, by golly! No way we're going to rip it out unless we're absolutely forced to.
By the way, our application now compiles to a svelte 16.9 MEG...
-- Stressed? Me?
Of course not.
Stress is what a rubber band feels before it breaks, silly.
Re:Visual Basic and abstraction breakdown
by
GoofyBoy
·
· Score: 2
It might be a situation of "wrong tool for the job"
I find VSFlexGrid is excellent to work with.
-- The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
Re:Visual Basic and abstraction breakdown
by
glenebob
·
· Score: 2
"...bucketfuls of code in obscure control events to buffer up changes to be written when the form closes."
I've had to solve this same type of problem myself. The way I did it was to find a data source object that stores its data in memory. I use SQL (sometimes ODBC, sometimes an app server) to get data into it, and an app server to get data changes from it back into the DB. The grid doesn't know the difference, and I don't stress anything, so upgrades don't break things (too badly). You should invest some time in either finding a similar control, or implementing your own. It'll save you a ton of pain and money, and you won't have 'buckets' of spaghetti code (admit it, it's a mess, isn't it?) to explain to someone later on.
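A rough sketch of the in-memory data source idea described above, written in Python with the standard `sqlite3` module standing in for the real database. Everything here is an assumption made for illustration: the `BufferedRows` class, its method names, and the idea that the grid would read and write only through this object. It is not how any particular VB control works.

```python
import sqlite3


class BufferedRows:
    """Hypothetical in-memory data source sitting between a grid and a database."""

    def __init__(self, conn, table, key, columns):
        # table, key and columns are trusted identifiers chosen by the programmer;
        # they are interpolated into SQL, so never take them from user input.
        self.conn, self.table, self.key, self.columns = conn, table, key, columns
        self.rows = {}      # key value -> dict of column values
        self.dirty = set()  # keys edited since the last load or commit

    def load(self):
        """Pull the current rows from the database into memory."""
        cols = ", ".join([self.key] + self.columns)
        self.rows.clear()
        for rec in self.conn.execute(f"SELECT {cols} FROM {self.table}"):
            self.rows[rec[0]] = dict(zip(self.columns, rec[1:]))
        self.dirty.clear()

    def edit(self, key, column, value):
        """Record a change made in the grid; nothing touches the database yet."""
        self.rows[key][column] = value
        self.dirty.add(key)

    def commit(self):
        """Write buffered edits back in one transaction (the user clicked OK)."""
        assignments = ", ".join(f"{c} = ?" for c in self.columns)
        sql = f"UPDATE {self.table} SET {assignments} WHERE {self.key} = ?"
        with self.conn:  # commits on success, rolls back on an exception
            for key in sorted(self.dirty):
                values = [self.rows[key][c] for c in self.columns]
                self.conn.execute(sql, values + [key])
        self.dirty.clear()

    def cancel(self):
        """Throw away buffered edits by reloading from the database."""
        self.load()
```

The point of keeping this in one class is the one made above about a central library: if the grid or the database layer changes, only this object should need to change.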
IMHO it sounds like your team made a huge design mistake by forcing a component to do something it doesn't know how to do. You also made a huge mistake by not somehow putting your 'buckets' of code in a central library to minimize your pain when an upgrade does break things.
Like most VB controls, the Sheridan Grid is designed to be a drop-in, no-code way to display database information.
It sounds like you well may have made another mistake: binding your grid in some cases directly to an SQL server. One thing I've learned over the past 5 years or so, is that RAD tools are great for some things, but the promise of transparent database access is a complete crock. It's an abstraction that almost always leaks for a variety of reasons. Binding on-screen data access objects to a database server is a huge no-no in my book.
Re:Visual Basic and abstraction breakdown
by
frank_adrian314159
·
· Score: 2
I programmed using VB for a six month period once (don't ask...). The main problem with most VB controls is not that their abstractions are crappy (which most are), but that they try to abstract too wide a swath. They usually try to abstract the data model that the control uses, the persistence mechanism behind that, the application model that keeps track of the attributes specific to the given presentation object, the actual presenter code, and the control code that ties all of this crap together into one big unmanageable ball of mud. This is why you can't use it in any way other than what the designer intended the control to work like without a huge amount of extension. It makes it easier for people who want to use that abstraction for the purpose it was intended, but it has no reusable parts to extend or replace, and usually the hood is sealed shut, too.
We've known how to design things right since the early Smalltalk days (MVC anyone?). But the idea of having a VB user design three classes per control makes me shudder, too. VB is a language dumbed down to the point where anyone can cobble together something that barely works. It sucks in that this is also the limitation of the tool as well - you can only cobble together something that barely works. It is no surprise that the controls available to be used with this language are also on the same level.
This is one of the best essays on software engineering I've read in a while. As a programmer and CS educator, it's really served to crystallize for me why (a) it seems so much harder for students to learn programming these days, and (b) why I've grown unhappy over the years with the series of new engineering paradigms that are in use. Extremely helpful for putting my own thoughts in order.
The law statement itself, "all non-trivial abstractions, to some degree, are leaky" may possibly get included in my personal "top 10" aphorisms manifesto.
-- We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
Joel is ignoring market factors
by
Ars-Fartsica
·
· Score: 2
Yes it would be nice to get back to 'first principles' and address machine resources directly, but it's impossible to deliver a product to the marketplace in a meaningful timeframe using this method, particularly when Moore's law blurs the gains anyway - crap runs fast enough.
TCP for the bored
by
mekkab
·
· Score: 5, Insightful
Fine, it's reliable because of acks, timeouts, adaptive re-transmit timeouts that take statistical averages of RTT times, exponential back-off and slow start, window acks which keep track of what bytes are received, etc.
So in your case of timing out N, re-tx'ing N, and then getting the response to the first N back after sending the second N, you do four things: 1) Good! You got your packet! 2) Keep track of how many bytes you have received thus far (TCP is not sending messages, it is sending a stream). 3) When you get the response from your second request, discard it, because you already received those bytes from the stream. 4) Since you timed out, DON'T use the Round Trip Time for that response: slow down your expected RTT time, and THEN start measuring.
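The bookkeeping in those steps can be sketched in a few lines of Python. This is a toy model, not real TCP: the class names `StreamReceiver` and `RttEstimator` are made up, out-of-order segments are simply dropped, and the timeout formula (twice the smoothed RTT, with 7/8 and 1/8 weights) is a simplification chosen for the example rather than the exact rule any stack uses.

```python
class StreamReceiver:
    """Toy receiver: reassembles an in-order byte stream and drops duplicates."""

    def __init__(self):
        self.next_expected = 0   # offset of the next byte we still need
        self.data = bytearray()  # bytes delivered to the application so far

    def on_segment(self, seq, payload):
        """Accept a segment starting at byte offset seq; return new bytes taken."""
        end = seq + len(payload)
        if end <= self.next_expected:
            return 0  # pure duplicate: every byte was already received
        if seq > self.next_expected:
            return 0  # gap: this toy version drops it and waits for a resend
        fresh = payload[self.next_expected - seq:]  # trim any overlapping prefix
        self.data.extend(fresh)
        self.next_expected = end
        return len(fresh)


class RttEstimator:
    """Toy smoothed-RTT tracker that skips samples from retransmitted data."""

    def __init__(self, initial_rto=1.0):
        self.srtt = None        # smoothed round trip time, unknown at first
        self.rto = initial_rto  # current retransmission timeout in seconds

    def on_timeout(self):
        self.rto *= 2  # exponential back-off: expect a slower network

    def on_ack(self, sample, was_retransmitted):
        if was_retransmitted:
            return  # ambiguous: we cannot tell which copy this ack answers
        if self.srtt is None:
            self.srtt = sample
        else:
            self.srtt = 0.875 * self.srtt + 0.125 * sample
        self.rto = 2 * self.srtt  # simplified timeout rule for the sketch


rx = StreamReceiver()
rx.on_segment(0, b"hello ")
rx.on_segment(6, b"world")
late_copy = rx.on_segment(6, b"world")  # the first N finally shows up again

est = RttEstimator(initial_rto=1.0)
est.on_timeout()                           # we timed out waiting for N
est.on_ack(0.5, was_retransmitted=True)    # ack for a resent segment: ignored
rto_after_ambiguous_ack = est.rto
est.on_ack(0.5, was_retransmitted=False)   # clean sample: start measuring again
```

Because the receiver counts bytes rather than messages, the late copy of N costs nothing beyond the comparison that discards it.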
And guess what? If I unplug the NIC of the other machine, there is no reliable way of transmitting that data (assuming your destination machine isn't dual homed)- so I keep streaming bytes to a TCP socket and I don't find out my peer is gone for approx. 2 minutes. WOW. There's nothing reliable about that boundary condition!
My point is TCP is reliable ENOUGH. But I wouldn't equate it with a Maytag warranty. It is not a panacea. In fact, for a closed homogeneous network I wouldn't even consider it the best option. But if the boundary conditions fall within the acceptable fudge range (remember Real Time human grade systems are not 100% reliable, only 99.99999% and much of that is achieved through redundancy) your leaks are ok.
-- In the future, I would want to not be isolated from my friends in the Space Station.
Bad term chosen IMO
by
shadowtramp
·
· Score: 2, Interesting
I don't know how the term "leak" feels to the non-programmers mentioned, but in programmers' slang the word has a distinct meaning, and it diverges from what Joel uses it for.
I think the term "ooze" would suit better in this case. It has a kind of dirtiness to it, and the feeling the word 'ooze' gives me fits well with the nature of the described problem. :o)
Back to the article. To be serious, I think that Joel mixed all sorts of things together as examples of 'leaky abstraction' to no purpose. The situations are too different, and that makes the concept fall apart. Here is what I mean:
In the case of TCP/IP it denotes the limits of the abstraction. And regardless of programmer background, every sane person should know those limits exist.
In the case of page faults it's a matter of competence - there is no abstraction at all. You either know how your code is compiled and executed or you don't. It's the same as either knowing what a phrase in a given language really means or not. I simplify here.
In the case of C++ strings I saw the only good example. In my opinion, what the experience of using the STL and the string class tells us here is: one should fully understand the underlying mechanics before relying on an abstraction's behaviour.
In programming it is really simple to tell whether a given 'abstraction' will present you with an easter egg or not: if you can imagine an FSM for the abstraction, you will definitely know when to use it.
-- I'm not a brake. I'm an accelerator. Just a slow one...
> Huh? You can't retransmit cabbages or actors or hard copies of badly researched essays . . . but you can retransmit freaking TCP packets!
He does go on to say that "retransmission" in this case means sending an identical twin of the actor. Your criticism really isn't fair in ignoring this. The metaphor of identical twins for copies of information is not perfect, but as a teaching tool for explaining TCP to a non-technical type, I'm not sure I could come up with another metaphor which matches this one for its intuitive value.
Joel should write an article about his leaky hosting company... or maybe his leaky colo-ed box.
Since I can't get to the site and read the article, I'll tell some jokes.
"Leaky Abstractions?! Is this guy talking about Proctology??"
"Leaky Abstractions?! Someone get this guy a plumber!"
"Leaky Abstractions?! I knew we should have used the pill!"
-gerbik
Time to market is the factor, not elegance
by
Ars-Fartsica
·
· Score: 5, Insightful
This argument is so tired. The downfall of programming is now due to people who can't/don't write C. Twenty years before that the downfall of programming was C programmers who couldn't/wouldn't write assembler.
The market rewards abstractions because they help create high level tools that get products on the market faster. Classic case in point is WordPerfect. They couldn't get their early assembler-based product out on a competitive schedule with Word or other C based programs.
Re:Time to market is the factor, not elegance
by
daoine
·
· Score: 3, Insightful
The market rewards abstractions because they help create high level tools that get products on the market faster.
Agreed, but I think it's important to note that without the understanding of where the abstraction came from, the high-level tools can be a bane rather than a help.
I write C++ every day. Most of the time, I get to think in C++ abstraction land, which works fine. However, on days where the memory leaks, the buffer overflows, or the seg faults show up, it's not my abstraction knowledge of C++ that solves the problem. It's the lower level, assembly based, page swapping, memory layout understanding that does the debugging.
I'm glad I don't have to write Assembly. It's fun as a novelty, but a pain in the butt for me to get something done. However, I'm not sure I could code as well without the underlying knowledge of what was happening under the abstraction. It's just too useful when something goes wrong...
Re:Time to market is the factor, not elegance
by
ChaosDiscord
·
· Score: 5, Insightful
The market rewards...
I'd suggest steering clear of that phrase if your intention is to indicate that something is "good". It can also be completed with things like "The market rewards skilled con men who disappear before you realize you've been rooked" and "The market rewards CEOs who destroy a company's long term future to boost short term stock value so they can cash out and retire."
I'm all in favor of good abstractions, good abstractions will help make us more efficient. But even the best abstractions occasionally fail, and when they fail a programmer needs to be able to look beneath the abstraction. If you're unable to work below and without the abstraction, you'll be forced to call in external help which may cost you any of time, money, showing people you don't entirely trust your proprietary code, and being at the mercy of an external source. Sometimes this trade off is acceptable (I don't really have the foggiest idea how my car works, when it breaks I put myself at the mercy of my auto shop). Perhaps we're even moving to a world where you have high level programmers that occasionally call in low level programmers for help. But you can't say that it's always best to live at the highest level of abstraction possible. You need to evaluate the benefits for each case individually.
You point out that many people complain that some new programmers can't program C, while twenty years ago the complaint was that some new programmers can't program assembly. Interestingly both are right. If you're going to be a skilled programmer you should have at least a general understanding of how a processor works and assembly. Without this knowledge you're going to be hard pressed to understand certain optimizations and cope with catastrophic failure. If you're going to write in Java or Python, knowing how the layer below (almost always C) works will help you appreciate the benefits of your higher level abstraction. You can't really judge the benefits of one language over another if you don't understand the improvements each tries to make over a lower level language. To be a skilled generalist programmer, you really need at least familiarity with every layer below the one you're using (this is why many Computer Science degrees include at least one simple assembly class and one introductory electronics class).
Re:Time to market is the factor, not elegance
by
Ars-Fartsica
·
· Score: 2
I'd suggest steering clear of that phrase if your intention is to indicate that something is "good". It can also be completed with things like "The market rewards skilled con men who disappear before you realize you've been rooked" and "The market rewards CEOs who destroy a company's long term future to boost short term stock value so they can cash out and retire."
If you are writing code to make money, by definition an abstraction is good if your product sells.
Re:Time to market is the factor, not elegance
by
Tet
·
· Score: 2
I'm not sure I could code as well without the underlying knowledge of what was happening under the abstraction. It's just too useful when something goes wrong...
I can't agree with this strongly enough. Anyone can become a competent coder with relative ease. But unless you can write assembly, you'll never become a truly great coder. I'm not claiming you'll need to actually write any assembly -- I haven't had to for probably 10 years or so. But the very fact that you can means that you'll have a better understanding of what's happening underneath your high level abstractions, which will in turn make you a better coder. In 20 years of writing code, I haven't yet seen any exceptions to this...
-- "The invisible and the non-existent look very much alike." -- Delos B. McKown
I don't think you understand TCP: you don't request a resend. TCP does that for you; if you time out, it means the connection is broken. Feel free to try again later, but trying again means opening a new connection, and no packets from an old connection will confuse the new one.
Isn't "leaky abstraction" a leaky abstraction of the leaky abstractions?
--
-... ---.-. . -....--..
Leaky? There's got to be a better word.
by
unfortunateson
·
· Score: 2, Interesting
For something like IP packets, leaky is acceptable, but for many of those other abstractions, constipated might be a better adjective. Some of the tools and technologies out there (remember 4GL report-writers?) were big clogging masses that just won't pass.
The first thing I do when I start in on a new technology (VBA, CGI, ASP, whatever) is to start digging in the corners and see where the model starts breaking down.
What first turned me on to Perl (I'm trying hard not to flamebait here) was the statement that the easy things should be easy, and the hard things possible.
But even Perl's abstraction of scalars could use a little fiber to move through the system. Turn strict and warnings on, and suddenly your "strings when you need 'em" stop being quite so flexible, and you start worrying about when it's really got something in it or not.
On the HTML coding model breaking down, my current least-fave is checkboxes: if unchecked, they don't return a value to the server in the query, making it hard to determine whether the user is coming at the script the first time and there's no value, or just didn't select a value.
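One common workaround for the checkbox problem is to add a hidden field that is always submitted with the form, so the script can tell "first visit" from "submitted with the box unchecked". The sketch below uses Python's standard `urllib.parse.parse_qs`; the field names `subscribe` and `submitted` and the function `checkbox_state` are hypothetical, picked only for the example.

```python
from urllib.parse import parse_qs


def checkbox_state(query_string, box_name="subscribe", marker="submitted"):
    """Return None on a first visit, else True/False for checked/unchecked.

    Assumes the form carries a hidden input named `marker` that is sent on
    every submission, whether or not the checkbox is ticked.
    """
    fields = parse_qs(query_string)
    if marker not in fields:
        return None            # the form was never submitted
    return box_name in fields  # browsers send the box only when it is checked


first_visit = checkbox_state("")
unchecked = checkbox_state("submitted=1")
checked = checkbox_state("submitted=1&subscribe=on")
```

The abstraction still leaks, since the script has to know that an absent field can mean two different things, but the hidden marker at least makes the two cases distinguishable.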
Then there's always "This space intentionally left blank.*" Which I always footnote with "*...or it would have been if not for this notice." Sure sign of needing more regularity in your diet.
-- Design for Use, not Construction!
Not all abstractions are leaky
by
thom2000
·
· Score: 2, Insightful
Sure, the author points out a few examples of leaky abstractions. But his conclusion seems to be that you always will have to know what is behind the abstraction.
I don't think that's true. It depends on how the abstraction is defined, what it claims to be.
You can use TCP without knowing how the internals work, and assume that all data will be reliably delivered, _unless_ the connection is broken. That is a better abstraction.
And the virtual memory abstraction doesn't say that all memory accesses are guaranteed to take the same amount of time, so I don't consider it to be leaky.
So I don't entirely agree with the author's conclusions.
Re:Not all abstractions are leaky
by
mikeee
·
· Score: 2
the virtual memory abstraction doesn't say that all memory accesses are guaranteed to take the same amount of time
But that's just making the leak explicit. The VM abstraction is to make swap look like RAM, which reduces complexity, but in some situations you actually have to be aware of what the abstraction is doing or it will bite you in the butt. Leak!
No, calling TCP reliable means that it provides certain features you can rely on: 1. Data either arrives intact or doesn't arrive at all. 2. Data always arrives in the order it was sent. 3. Data is never duplicated.
You can safely rely on all of these facts. Calling TCP a guarantee that your packets arrive is leaky, but TCP doesn't claim that.
So if you don't abstract TCP more than you are meant to, it is a non-leaky abstraction.
This is an overrated rant about bad coding
by
PureFiction
·
· Score: 5, Insightful
Proper abstractions avoid unintended side-effects by presenting a clean view of the intent and function of a given interface, and not just a collection of methods or structures.
When I read what Joel wrote about "leaky abstractions" I saw a piece complaining about "unintended side-effects". I don't think the problem is with abstractions themselves, but rather the implementation.
He lists some examples:
1. TCP - This is a common one. Not only does TCP itself have peculiar behavior in less than ideal conditions, but it is also interfaced with via sockets, which compound the problem with an overly complex API.
If you were to improve on this and present a clean reliable stream transport abstraction, it would likely have a simple connection establishment interface and some simple read/write functionality. Errors would be propagated up to a user via exceptions or event handlers. But the point I want to make is that this problem can be solved with a cleaner abstraction.
2. SQL - This example is a straw man. The problem with SQL is not the abstraction it provides, but the complexity of dealing with unknown table sizes when you are trying to write fast generic queries. There is no way to ensure that a query runs fastest on all systems. Every system and environment is going to have different amounts and types of data. The amount of data in a table, the way it is indexed, and the relationship between records is what determines a query's speed. There will always be manual performance tweaking of truly complex SQL simply because every scenario is different and the best solution will vary.
3. C++ string classes. I think this is another straw man. Templates and pointers in C++ are hard. That is all there is to it. Most Visual Basic only coders will not be able to wrap their minds around the logic that is required to write complex C++ template code. No matter how good the abstractions get in C++, you will always have pointers, templates, and complexity. Sorry Joel, your VB coders are going to have to avoid C++ forever. There is simply no way around it. This abstraction was never meant to make things simple enough for Joe Programmer, but rather to provide an extensible, flexible tool for the programmer to use when dealing with string data. Most of the time this is simpler, sometimes it is more complex (try writing your own derived string class - there are a number of required constructors you must implement which are far from obvious) but the end result is that you have a flexible tool, not a leaky abstraction.
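To make point 1 concrete, here is a rough sketch of what such a stream wrapper might look like. It is only an illustration under stated assumptions: the names ReliableStream and StreamError are invented, and it sits on top of Python's standard socket module rather than replacing sockets. It offers one connect call, read/write, and a single exception type for every failure.

```python
import socket


class StreamError(Exception):
    """Hypothetical single error type: the stream cannot deliver data."""


class ReliableStream:
    """Illustrative wrapper: connect, read, write, close, nothing else."""

    def __init__(self, sock):
        self._sock = sock

    @classmethod
    def connect(cls, host, port, timeout=5.0):
        try:
            return cls(socket.create_connection((host, port), timeout=timeout))
        except OSError as exc:
            raise StreamError(f"cannot connect to {host}:{port}") from exc

    def write(self, data):
        try:
            self._sock.sendall(data)  # loops internally until all bytes are sent
        except OSError as exc:
            raise StreamError("write failed") from exc

    def read(self, n):
        """Return exactly n bytes, or raise StreamError."""
        chunks = []
        remaining = n
        while remaining:
            try:
                chunk = self._sock.recv(remaining)
            except OSError as exc:
                raise StreamError("read failed") from exc
            if not chunk:  # empty result means the peer closed the connection
                raise StreamError("peer closed the connection")
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)

    def close(self):
        self._sock.close()


# Loopback demonstration over a connected socket pair (no network needed).
a, b = socket.socketpair()
left, right = ReliableStream(a), ReliableStream(b)
left.write(b"hello")
echoed = right.read(5)
left.close()
right.close()
```

Note that the broken-connection case does not disappear; it is just reported in one well-defined way, which is the behavior the comment is asking for.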
There are some other examples, but you see the point. I think Joel has a good idea brewing regarding abstractions, complexity, and managing dependencies and unintended side-effects, but I do not think the problem is anywhere near as clear cut as he presents. As a discipline, software engineering has a horrible track record of implementing arcane and overly complex abstractions for network programming (sockets and XTI), generic programming (templates, ref counting, custom allocators) and even operating system APIs (POSIX).
Until we can leave behind all of the cruft and failed experiments of the past, start new with complete and simple abstractions that do not mask behavior, but rather recognize it and provide a mechanism to handle it gracefully, we will run into these problems.
Luckily, such problems are fixable - just write the code. If Joel were right and complex abstractions were fundamentally flawed, that would be a dark picture indeed for the future of software engineering (it is only going to grow ever more complex from here kids - make no mistake about it).
Re:This is an overrated rant about bad coding
by
StrawberryFrog
·
· Score: 2
I think this is another straw man. Templates and pointers in C++ are hard.
Yes, they are. Now think outside the C++ box and it's not such a straw man anymore - it's a criticism of C++ and C-style strings, a legacy that we are stuck with due to APIs, even if we move to a language with sensible string handling.
--
My Karma: ran over your Dogma
StrawberryFrog
Re:This is an overrated rant about bad coding
by
PureFiction
·
· Score: 2
it's a criticism of C++ and C-style strings, a legacy that we are stuck with due to APIs
Exactly, which is why later in my post I mention the need to abandon the old ways and start fresh with clean abstractions unencumbered by old cruft.
I like to code in C++ myself, but there is no way I would ever recommend it to anyone these days given the presence of better tools like Java or Python, etc.
Re:This is an overrated rant about bad coding
by
PureFiction
·
· Score: 2
In a known environment it becomes trivial. All you need to know are 1. indexes, and 2. table sizes. There is a whole volume of information on ordering the clauses of a SQL query based on these criteria (and the type of clause itself) that will get you an optimal query without any use of profilers and crap.
Yes, a good DB programmer can look at a query, and given the indexes and table sizes, order the clauses optimally. It isn't that hard; you just need to do this for every environment that differs.
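The dependence on indexes is easy to see for yourself. Here is a small sketch using Python's built-in sqlite3 module; the table and index names are invented for illustration, and SQLite stands in for whatever database you actually run. The same query text returns the same rows, but the access path the engine picks changes once an index exists.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical schema, just large enough to make the point.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(10_000)],
)

sql = "SELECT total FROM orders WHERE customer = 42"

plan_before = query_plan(conn, sql)            # no index: full table scan
rows_before = sorted(conn.execute(sql).fetchall())

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

plan_after = query_plan(conn, sql)             # index available: index search
rows_after = sorted(conn.execute(sql).fetchall())

print(plan_before)
print(plan_after)
print(rows_before == rows_after)
```

The exact plan wording varies between SQLite versions, so treat the printed text as informative rather than something to parse.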
Re:This is an overrated rant about bad coding
by
GoofyBoy
·
· Score: 2
Like your initial post, I fully agree with you.
That's why many SQL implementations have a way to "Force" or "Hint" to the optimizer on how queries are done.
-- The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
Re:This is an overrated rant about bad coding
by
PureFiction
·
· Score: 2
They're just so often poorly explained.
I used to think this was the case. Then I tried to teach a friend of mine how pointers work. I swear, some people just cannot grasp the concept of indirection via pointers and memory addresses. It is simply pure alien to them.
To really understand pointers you need to understand pointer arithmetic, multiple indirection, stack and heap storage, function pointers and byte alignment.
All that combined adds up into some pretty complex concepts. Joel mentions VB programmers repeatedly in his article, and I can assure you that regardless of how pointers are taught, many of the VB programmers out there will simply not "get it".
Re:This is an overrated rant about bad coding
by
PureFiction
·
· Score: 2
You can't possibly be thinking that you could write and maintain and DEBUG a 5 million line Python code base for gods sake. And anything in java over about 10,000 lines becomes unrunnable on any current machine.
I don't want to get into a language flamewar, but I firmly believe in the mantra: "The best tool for the job"
If you wrote the code in Python, it wouldn't be 5 million lines! Put performance critical code in C or C++ compiled natively. Integrate with Python or Java via the available interfaces. Distribute processing using RMI/RPC if the system grows too big. You enhance scalability and maintainability this way.
C and C++ both have their place in the world, but Java and Python and similar tools are proving incredibly useful for solving a wide range of computing tasks quickly and efficiently.
I am still going to be writing C++ code for a long time into the future, but that doesn't mean I think everyone should spend the significant effort required to become proficient with C++, nor do I think C++ is even a good tool for many of the coding projects out there.
To each his own...
Re:This is an overrated rant about bad coding
by
ebbe11
·
· Score: 2
In a known environment it becomes trivial. All you need to know are 1. indexes, and 2. table sizes. There is a whole volume of information on ordering the clauses of a SQL query based on these criteria (and the type of clause itself) that will get you an optimal query without any use of profilers and crap.
That knowledge is the leaky abstraction. Joel's point is that SQL tries to make an abstraction where this knowledge shouldn't be necessary - yet it is.
--
My opinion? See above.
Re:This is an overrated rant about bad coding
by
leandrod
·
· Score: 2
>
SQL - This example is a straw man. The problem with SQL is not the abstraction it provides, but the complexity of dealing with unknown table sizes when you are trying to write fast generic queries.
Correct conclusion, wrong reasons. The problem with SQL is not the complexity of getting good performance, but the low quality of its abstraction. A truly relational system might even be slow, but it would be uniformly, thus predictably and treatably, slow.
A relational system should never give different performance for different syntax if the result is the same. The optimizer should know the structure and sizes of all objects and always get the same access path if the result of the expression is the same. SQL fails because many SQL constructs are ill-defined or even wrong.
All performance tuning must be done by DBAs at the mapping between the logical and physical schemas, but SQL makes that impossible, especially after SQL:1999. See DBDebunk for more details.
-- Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
I think the key point is that this article is an abstraction (and a leaky one at that) of the truth:)
--
----
Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"
Complexity Management
by
Frums
·
· Score: 3, Interesting
The problem that this article points to is a byproduct of large scale software development primarily being an exercise in complexity management. Abstraction is the foremost tool available in order to reduce complexity.
In practice a person can keep track of between 4 and 11 different concepts at a time. The median lands around 5 or 6. If you want to do a self-experiment, have someone write down a list of twenty words, then spend 30 seconds looking at them without using mnemonic devices such as anagrams to memorize them, then put the list away. After thirty more seconds write down as many as you can recall.
This rule applies equally when attempting to manage a piece of software - you can only really keep track of between 4 and 11 "things" at the same time, so the most common practice is to abstract away complexity - you reduce an array of characters terminated by a null character and a set of functions designed to operate on that array to a String. You went from half a dozen functions, a group of data pieces, and a pointer to a single concept - freeing up slots to pay attention to something else.
The article is completely correct in its thesis that abstractions gloss over details and hide problems - they are designed to. Those details will stop you from being productive because the complexity in the project will rapidly outweigh your ability to pay attention to it.
This range of attention sneaks into quite a few places in software development:
Team sizes: teams of between four and ten people are generally the most productive - they, and the project manager can track who is doing what without gross context switching.
Object models: When designing a system there will generally be between four and eleven components (which might break into more at lower levels of abstraction). Look at most UML diagrams - they will have four to eleven items (unless they were autogenerated by Rose).
Methods on an object: When it is initially created an object will generally have between four and eleven methods - after that it is said to start to smell, and could stand to be decomposed into multiple objects.
Vacation Days in the US: Typically between five and ten - management can think about that many at one time, any more and they cannot keep track of them all in their head so there are obviously too many ;-)
Layers in the standard networking stack
Groups in a company
Directories off of /
Other schemes exist for managing complexity, but abstraction is decidedly human - you don't open a door, rotate, sit down backwards, rotate again, bend legs, position your feet, extend left arm, grasp door, pull door shut, insert key in ignition, extend right arm above left shoulder, grasp seatbelt, etc... you start the car. Software development is no different.
There exist people that can track vast amounts of information in their heads at one time - look at Emacs - iirc RMS famously wrote it as he did because he could keep track of what everything did; no one else can, though. There also exist mnemonic devices aside from abstraction for managing complexity - naming conventions, taxonomies, making notes, etc.
-Frums
Similar article on Salon
by
ChaosDiscord
·
· Score: 2
Joel Spolsky often grates on me (especially when he falls into, "here's how Microsoft solved the problem with near infinite access to manpower, so clearly you should do the same thing."), but this article really rang true.
People might also be interested in a similar article published in 1998 on Salon, "The dumbing-down of programming." The author comes from a slightly different point of view, but comes to a similar conclusion: we need to be wary of becoming too detached from the low level details.
Re:Advertisement or real issue?
by
Minna+Kirai
·
· Score: 2
Try again with some constructive criticism, not just criticism.
For most of these problems, there's no easy way to really fix the abstraction. The only solution is for users to be aware of the abstractions they depend on, so that they can troubleshoot the underlying foundations if things break down.
The article is constructive in that it spreads a warning to be cautious about relying on abstractions too much, without understanding how they work.
I don't think you understand TCP; you don't request a resend. TCP does that for you,
You misunderstood my statement. It wasn't made from the standpoint of someone using TCP, it was made from the standpoint of TCP itself. If it's looking at the UDP packets coming in and realizes one is missing, it will request that that missing packet be resent after some timeout period. TCP then also has to be able to handle the situation when two or more of the same packets arrive due to this behaviour.
Error conditions are part of the abstraction TCP provides. The concept of reliability in TCP doesn't mean that it always gets your data across, but that it either does or does not and you can rely on knowing what happened. This isn't a weakness of TCP, it's a strength.
Unfortunately there are edge conditions in which your data is received but you never receive the ACK(nowledge) and therefore assume that the message was lost. This isn't very common, but if the cat chews the CAT5 at just the right moment it can happen.
Pessimism gone rampant
by
jneemidge
·
· Score: 5, Insightful
This article reminds me of what I hated most about Jurassic Park (the novel -- the movie blessedly omits the worst of it) -- Ian Malcolm's runaway pessimism. The arguments boil down to be very similar. Ian Malcolm says that complex systems are so complex we can't ever understand them all, so they're doomed to fail. Joel Spolsky says that our high-level abstractions will fail and because of that we're doomed to need to understand the lower-level stuff. I have problems with both -- they're a sort of technopessimism that I find particularly offensive, because they make the future sound bleak and hopeless despite volumes of evidence that, in fact, we've been dealing successfully with these issues for decades and they're just not all that bad.
We have examples of massively complex systems that work very reliably day-in and day-out. Jet airplanes, for one; the national communications infrastructure, for another. Airplanes are, on the whole, amazingly reliable. The communications infrastructure, on the other hand, suffers numerous small faults, but they're quickly corrected and we go on. Both have some obvious leaky abstractions.
The argument works out to be pessimism, pure and simple -- and unwarranted pessimism to boot. If it were true that things were all that bad, programmers would all _need_ to understand, in gruesome detail, the microarchitectures they're coding to, how instructions are executed, the full intricacies of the compiler, etc. All of these are leaky abstractions from time to time. They'd also need to understand every line of libc, the entire design of X11 top to bottom, and how their disk device driver works. For almost everyone, this simply isn't true. How many web designers, or even communications applications writers, know -- to the specification level -- how TCP/IP works? How many non-commo programmers?
The point is that sometimes you need to know a _little bit_ about the place where the abstraction can leak. You don't need to know the lower layer exhaustively. A truly competent C programmer may need to know a bit about the architecture of their platform (or not -- it's better to write portable code) but they surely do not need to be a competent assembly programmer. A competent web designer may need to know something about HTML, but not the full intricacies of it. And so forth.
Yes, the abstractions leak. Sometimes you get around this by having one person who knows the lower layer inside and out. Sometimes you delve down into the abstraction yourself. And sometimes, you say that, if the form fails because it needs JavaScript and the user turned off JavaScript, it's the user's fault and mandate JavaScript be turned on -- in fact, a _good_ high-level tool would generate defensive code to put a message on the user's screen telling them that, in the absence of JavaScript, things will fail (i.e. the tool itself can save the programmer from the leaky abstraction).
What Ian Malcolm says, when you boil it all down, is that complex systems simply can't work in a sustained fashion. We have numerous examples which disprove the theory. That doesn't mean that we don't need to worry about failure cases, it means we overengineer and build in failsafes and error-correcting logic and so forth. What Joel Spolsky says is that you can't abstract away complexity because the abstractions leak. Again, there are numerous examples where we've done exactly that, and the abstraction has performed perfectly adequately for the vast majority of users. Someone needs to understand the complex part and maintain the abstraction -- the rest of us can get on with what we're doing, which may be just as complex, one layer up. We can, and do, stand on the shoulders of giants all the time -- we don't need to fully understand the giants to make use of their work.
(I didn't read the whole article, so my analysis may be leaky.)
It is true that every abstraction is but an imperfect representation of the concrete things it was abstracted from. It is true, and worth noting, that the degree to which the abstraction breaks down in certain situations can cause large problems.
I would like to nit-pick the Katzian "It's dragging us down" doomsday prediction at the end. Abstraction itself has lifted us, tremendously. The fact that abstraction is not perfect is a limitation to how much it can lift us, true. But it's like taxes to support military spending or highways--yes, they do "drag down" our paychecks, but they also make what we do to get that paycheck possible.
One other concept that doesn't seem to be considered is the fact that, even with the imperfections, there is a very powerful and important benefit to having done the abstraction in the first place. Once people are using the abstraction instead of interfacing with the concrete target directly, then fixing a leak in the abstraction can be done at the abstraction, and everyone, potentially millions or billions, can benefit from it. Yes, there is a cost to rolling that out and ensuring compatibility, etc, but it's an advantage of abstraction that is a powerful tool to deal with inherent "leakiness" of abstracting anything.
Finally, I want to point out one thing that is implied but not stated, and that is how important it is to do your abstraction well. Once you have completed your abstraction, it's likely that thousands or millions will build things on top of it. Code that exists to be used by other code is more important than a one-off script. If you mess it up, you are messing it up for a lot of people. Just something to keep in mind, and something well illustrated by the article.
I agree that "it's dragging us down" may be a bit overstated, but that viewpoint may be a little self-serving as well. It serves to highlight that those of us who understand what's "under the hood" so to speak have much better job security *because* there are so many "programmers" who can't dive beneath the surface when the abstractions leak.
On the other hand, this does suggest some possible limitations with our technological development as a species at some point. As there are more and more technologies, and fewer and fewer people who really understand how to fix problems when the surface layer of a given technology fails, we might hit a, er, carrying capacity problem... So right now, *we* aren't being dragged down by leaky abstractions, although many bozos are. The problem comes when the number of available non-bozos can no longer cover the number of (leaky) technologies.
Well, I did notice the "you need to hire people like me" aspect of it, but I would say that a more accurate statement would be "when the leaks show up, you need someone like me to fix it". That doesn't really imply that you need everyone to understand things all the way down. It means the demand for hard-core people is proportional to the leakiness of the abstraction. You might only need a couple of really good guys in a crowd of a hundred passably competent VB'ers (or whatever--don't know much about VB or the people that program with it myself).
I agree that there could be a carrying capacity issue, but can you give evidence that the non-bozo ratio is going down? Recent irrational exuberance drew a lot of fully-abstraction-dependents into the market, but the more recent reality check probably pushed a huge number of them back out.
Another thing to note is that, if there really are a bunch of people out there that are heavily dependent on an abstraction that is leaky, that represents a large market for a better abstraction. So it's possible that forces will act to balance that problem anyway.
However, it does make for an interesting science fiction plot (I don't read much science fiction, so this should be easy pickings for you "Idiot--this is exactly the plot of XXX"'ers out there:) where a few hyperintelligent folks keep the whole world working ("Marching Morons" did this, albeit with a bit of a racist/classist underpinning for the root of the problem).
You have a technical lead to solve sneaky problems that come up, and a bunch of vb/java/.net humps to crank out the business logic. We can't all be Einstein, can we?
-- love is just extroverted narcissism
Neal Stephenson...
by
mikeee
·
· Score: 4, Interesting
Neal Stephenson talks about something similar in In the Beginning was the Command Line. He calls it interface shear; he's specifically referring to the UI as an abstraction (an interesting idea in itself). His take on it was that abstractions are metaphors, and that "interface shear"/"leaky abstractions" occur in regions where the metaphors break down.
You're absolutely right in the reference, but calling "referring to the UI as an abstraction" an interesting idea is kinda...um....well, the thing is, it's a bit of a DUH!
I'll put it to you like this: not only is it self evident that the UI is an abstraction (I mean, come on...a file is an abstraction of 1's and 0's) and something that was known since XEROX-PARC worked out the desktop metaphor, but it's not yet abstract enough. The UI as it is is nothing more than an inefficient way of working with files. The proof of this is that the current UI is incapable of being used in a human way. The UI must be further abstracted (in other words: must have more definitions, functions and capabilities of being operated on tagged onto it) in order for it to be used in the rudimentary way of human interaction as in a "Computer, what's next?" question making the computer say, based on your schedule and the task you're doing at the moment, either "go to this meeting" or "you can code for 10 more minutes before you have to get ready to leave".
I know I've kinda messed up the meaning ('s late, too much wine, bleh), but I hope you get what I mean...Neal Stephenson, while being a great writer, isn't spot on in "In the Beginning was the Command Line"..."Snow Crash" works better in the metaphor needed.
Neither TCP nor UDP requests resent packets. If a TCP packet is lost, the sender will not receive an ACK for that packet, and the sender will resend it. This will happen even if the receiver has caught fire. If a UDP packet is lost, it's lost totally.
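The sender-driven behaviour described here, plus the lost-ACK edge case raised earlier in the thread, can be shown with a toy stop-and-wait simulation. This is a deliberately simplified sketch, not how a real TCP stack works (real TCP numbers bytes, uses sliding windows, cumulative ACKs and adaptive timers). The function name and the drop_data / drop_ack parameters are invented for the illustration.

```python
def send_reliably(payloads, drop_data=(), drop_ack=(), max_tries=5):
    """Toy stop-and-wait sender/receiver pair in one function.

    drop_data and drop_ack are collections of (seq, attempt) pairs that
    the simulated network loses. The receiver never asks for anything:
    the sender simply resends when no ACK comes back.
    """
    delivered = []      # what the receiving application gets to see
    expected = 0        # next sequence number the receiver will accept
    transmissions = 0
    for seq, payload in enumerate(payloads):
        for attempt in range(max_tries):
            transmissions += 1
            if (seq, attempt) in drop_data:
                continue            # segment lost; sender times out, resends
            if seq == expected:     # first copy to arrive: deliver it
                delivered.append(payload)
                expected += 1
            # otherwise it is a duplicate: discard it, but still ACK it
            if (seq, attempt) in drop_ack:
                continue            # ACK lost; sender cannot tell, resends
            break                   # ACK received, move to the next segment
        else:
            raise ConnectionError(
                f"segment {seq} unacknowledged after {max_tries} tries"
            )
    return delivered, transmissions


delivered, transmissions = send_reliably(
    ["a", "b", "c"],
    drop_data={(1, 0)},   # first copy of segment 1 is lost
    drop_ack={(2, 0)},    # segment 2 arrives, but its ACK is lost
)
print(delivered, transmissions)
```

Three segments cost five transmissions here, the application still sees each payload exactly once and in order, and when every attempt fails the caller gets an error instead of silence.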
It's not a Law it's a Theorem Until Proven
by
airrage
·
· Score: 2
The leaky abstraction is actually a good thing. We understand that as we layer programming code, inherently, the lower-layers are never going to be perfect. Hence, why we "bubble-up" error-handling.
I don't want abstraction to be a universal constant. If problems occur in the abstraction, I want to be able to diagnose that. Just because the abstraction "leaks", that's a good thing. Sometimes a client-server app breaks because the network layer breaks, but we can make modifications, adjustments to fix the problem. If we couldn't, and this "leak" were permanent, we would throw up our hands and quit.
The author somehow is attempting to suggest that this imperfect-abstraction is hurting..something. But where is there perfect abstraction? I have yet to see one: the sky abstracts the universe from my environment, yet I still can get a sunburn.
Even God himself, tried to abstract human beings from evil, and that "leaked" as well.
So we shouldn't define this as "leaky" abstraction, but rather as "flexible" abstraction.
-- "This isn't a study in computer science, its a study in human behavior"
It also happens in Math
by
PacoSuarez
·
· Score: 3, Interesting
I think the article is great. And this principle can also be applied to Math. Theorems are much like library function calls. You can use them in your own proofs, without caring about how they are proved, because someone has already taken care of that for you. You prove that the hypotheses are true, and you get a result which is guaranteed to be true.
The problem is that in real Math, you often need a slightly different result, or you cannot prove that the hypotheses are true in your situation. The solution often involves understanding what's "under the hood" in the theorem, so that you can modify the proof a little bit and use it.
Every professional mathematician knows how to prove the theorems that he/she uses. There is no such thing as a "high-level mathematician" who doesn't really know the basics, but only uses sophisticated theorems on top of each other. The same should be true in programming, and this is what the article is about.
The solution? Good education. If anyone wants to be considered a professional programmer, he/she should have a basic understanding of digital electronics, micro-processor design, assembly language (at least one), OS architecture, C, some object oriented language, databases... and should be able to understand the relationship between all those things, because when things go wrong, you may have to go to any of the levels.
It's a lot of things to learn, but there is no other way out. Building software is a difficult task and whoever sells you something else lies.
In physics the abstractions leak. Newton's laws leak like crazy. Einstein's theories leak. Presently there are no fundamental theories in physics which don't leak like crazy when quantum mechanics and gravity interact.
In sports the abstractions leak. That's how we get players like Gretzky and pay a lot of money to watch what they do.
And how about the reason why C++ didn't define a native string type? Because there isn't any way to implement a string class that serves all possible applications. The premise of C++ is not being stuck with someone else's choice on what part of the abstraction should leak. Because C++ doesn't define a native string type, the user is free to replace the default standard string implementation with any other string implementation and have it integrate with the language on an equal footing with the standard string type.
If a language imposes standard abstractions it only takes one abstraction you can't live with to make that choice of language untenable. Which is how C++ has been so successful despite being the worst of all possible languages (except for all the others).
Well, not really, since string literals are always null terminated strings, so you're always forced to deal with that. And null terminated strings suck. Using pascal strings would have been a better solution, imo - sure, they have arbitrary limits on string size. But so do int's, and you can still abstract a string class to provide arbitrary length strings if you want.
C++ defines string literals the same way C defines string literals. Compatibility with C has never been a negotiable feature.
If you are making heavy use of the standard string type, your main use of string literals will be as initial values when creating String values. A typical implementation for a sequence type from the C++ standard library is to store pointers to the beginning and end of occupied storage. This makes ptrdiff_t the type used to count the number of elements in the sequence, a size on the order of the program's virtual address space.
A typical string class will support initialization from a C string by first determining the string length (which can be determined at compile time if you wish to do so), allocating enough space in the String class object, then copying the character bytes. You can copy the representation of a zero terminated string as efficiently as the representation of any other string implementation I've ever seen.
The drawbacks of zero terminated string literals are a few percentage points in a typical C++ program written in a modern idiom and the price is usually paid outside of important loops.
There is a problem with the use of C string literals as constant values in template metaprogramming. Template metaprogramming often seems like death by 1000 leaks. That's the price you pay to implement novel abstractions that don't leak in any of the usual ways. If the assumptions change, a template metaprogram will vigorously adapt and the application programmer working on top of the template library might not even realize that all this complicated stuff is taking place, other than noticing that compiles consume 500MB of virtual memory and take 20 minutes to complete.
It's actually in C that I've wanted access to pascal style strings more than C++ - as you said, in C++, if you make consistent use of classes then the overhead is pretty minimal. On the other hand, I'm dealing with a nice bug chunk of legacy C code that makes enourmous use of buffers that are, incidently, 255 bytes long. Pascal strings would speed things up enormously. Not that I would ever expect such a thing it happen, but it'd be nice if I had em.
Use a better language if leaking abstractions
by
Anonymous Coward
·
· Score: 3, Interesting
I agree with Joel, but some people seem to be taking it as a call to stop abstracting. That's silly.
Humans form abstractions. That's what we do. If your abstractions are leaking with detrimental consequences, then it could be because the programming language implementation you're using is deficient, not because you shouldn't be abstracting.
Try a high-performance Common Lisp compiler some time. Strong dynamic typing and optional static typing, macros, first class functions, generic-function OO, restartable conditions, first class symbols and package systems make abstraction much easier and less prone to arbitrary decisions and problems that are really:
(i) workarounds for the methods-in-one-class rule of "ordinary" single-dispatch OO
(ii) workarounds for the association of what an object is with the name of the object rather than with the object itself (static typing is really saying "this variable can only hold this type of object", dynamic typing is saying "the object is of this type"). Some languages mix these issues up, or fail to recognise the distinction.
(iii) workarounds for the fact that most languages, unlike forth and lisp, are not themselves extensible for new abstractions
(iv) workarounds for the fact that one cannot pass functions as parameters to functions in some languages (doesn't apply to C, thanks to function pointers - here's where the odd fact that low level languages are often easier to form new abstractions in comes in)
(v) workarounds for namespace issues
(vi) workarounds for crappy or nonexistent exception processing
Plus, Common Lisp's incremental compile cycle means faster development, and its defined behaviours for in place modifications to running programs make it good for high-availability systems.
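For anyone who hasn't seen the function-pointer escape hatch mentioned in point (iv), here is a minimal sketch in the C subset of C++. The names `apply_to_each`, `doubled` and `negated` are invented for the example.

```cpp
#include <cstddef>

// Hypothetical example: the third parameter is a plain function pointer,
// so the caller decides what is done to each element.
void apply_to_each(int* values, std::size_t count, int (*transform)(int)) {
    for (std::size_t i = 0; i < count; ++i) {
        values[i] = transform(values[i]);
    }
}

// Two ordinary functions that can be passed as the transform.
int doubled(int x) { return 2 * x; }
int negated(int x) { return -x; }
```

Calling `apply_to_each(v, 3, doubled)` on `{1, 2, 3}` leaves `{2, 4, 6}` in the array. It is a long way from Lisp closures (no captured state), but it does let a function be passed as a parameter.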
I thought that TCP used a sliding window to prevent the overhead of having to ack every packet?
Something else this guy says
by
Nagash
·
· Score: 2
I liked the article, but I followed the link to strings are hard and I wasn't so impressed with the guy. To quote:
As every compiler writer knows, lexing and parsing are the slowest part of compiling.
How about optimizing compilers? I find compile times take a hit in gcc when I specify -O3 as opposed to -O0. Somehow, I doubt it's a result of lexing or parsing.
It's a gross generalization and I strongly suspect that sort of thing permeates all his writings, given that he spouts off on XML in the aforementioned article as well.
Woz
Re:Something else this guy says
by
arkanes
·
· Score: 2
I believe he was talking more generically, and he's also totally correct about why XML is inefficient, or are you claiming that parsing an XML template is somehow faster than accessing an offset in a binary file?
Re:Something else this guy says
by
Nagash
·
· Score: 2
The problem is he was too generic in that statement. Saying that lexing and parsing take the most time in compilation is far too general for my tastes.
I wasn't saying anything about XML (or certainly didn't mean to).
I think "fundamentally broken" is an overstatement. TCP/IP and the "actor transportation" metaphor have the following in common:
-The low-level transfer is unreliable: things might come thru damaged or not at all, and things might not arrive in the order they were sent.
-A higher level management system deals with transmission failures by requesting retransmissions as needed. It also deals with ordering problems by holding onto things and putting them in the right order.
No metaphor is ever a perfect fit, but this one is a pretty good one, if you ask me. I think that even non-technical types generally understand that you can make arbitrarily many copies of information (they understand Xerox machines, for example), so I don't think this one failing greatly compromises the overall teaching usefulness of the metaphor.
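The "holding onto things and putting them in the right order" part can be sketched in a few lines. This is a toy, not real TCP: it assumes segments are numbered 0, 1, 2, ... instead of by byte offset, and the class name `Reassembler` is invented for the example.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical receiver that tolerates late, early and duplicate segments.
class Reassembler {
public:
    void receive(std::uint32_t seq, const std::string& payload) {
        if (seq < next_expected_) return;   // duplicate of data already delivered
        pending_.emplace(seq, payload);     // no effect if already buffered
        // Hand over every segment that is now contiguous with what came before.
        auto it = pending_.find(next_expected_);
        while (it != pending_.end()) {
            delivered_ += it->second;
            pending_.erase(it);
            ++next_expected_;
            it = pending_.find(next_expected_);
        }
    }

    // The first segment still missing; a sender would retransmit from here.
    std::uint32_t next_expected() const { return next_expected_; }
    const std::string& delivered() const { return delivered_; }

private:
    std::uint32_t next_expected_ = 0;
    std::map<std::uint32_t, std::string> pending_;  // out-of-order segments
    std::string delivered_;                         // in-order data so far
};
```

If segment 1 arrives before segment 0, nothing is delivered until 0 shows up, at which point both are handed over in order.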
Yup, you're right. Should have got to bed before 4 last night and I wouldn't be making stupid mistakes (well, at least not as many).
Abstractions about your leaky abstractions
by
smittyoneeach
·
· Score: 2
Today, to work on CityDesk, I need to know Visual Basic, COM, ATL, C++, InnoSetup, Internet Explorer internals, regular expressions, DOM, HTML, CSS, and XML. All high level tools compared to the old K&R stuff, but I still have to know the K&R stuff or I'm toast.
I submit that the reason the abstractions leak is that information is a fluid. We have these spiffy binary computers, but the continuous information fluid just finds the leaks. Sure, higher level languages abstract better, and hide the ugly, bottom-level truth more and more, except for when they don't.
I submit that your "foo" + "bar" example, or variations on that theme, will continue to separate the wheat from the chaff in IT for the foreseeable, pre-Terminator, ante-dystopian apocalyptic meltdown future.
Resistance is feudal.
-- Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
I agree with your joke but take it seriously too
by
iamwoodyjones
·
· Score: 2, Interesting
When told to convert Fortran code over to C (over a million lines) I knew it was going to take me forever. f2c doesn't work in this case since the code is soooo messed up to begin with. So, I found myself doing repetitive conversions over and over again that are specific to the code base.
Solution:
Created a perl script that translates parts of it for me and highlights the rest that has to be hand changed and looked over.
So, to solve one problem I created a slew of more problems with the script freaking out and messing up code.
So far though, it's saved me every bit of that time that I would have spent working on tedious simple stuff. Which in turn allows me to post to Slashdot more!!!!
No two abstractions leak alike
by
Phouk
·
· Score: 2
Reading the article, I get the feeling that Joel has gotten used to more leakiness in his tools (Visual Basic, MFC, C++ AFAIK) than he should have. Let me explain:
Consider two imaginary libraries: Really Nice Library (RNL) and Mostly Crufty Functions (MCF).
Both expose some unavoidable leakiness when a) the abstractions don't fit your problem, or the problem really *is* at that lower level, or b) you are forced to consider the performance characteristics of the lower level, which the abstractions cannot hide.
In addition to that, MCF adds some more leakiness of its own: buggy implementation, bad assumptions, a badly designed interface, common cases not covered by abstractions etc. etc.
When there's leakiness of the first kind, Joel is right: the programmer should be capable of solving the problem at the lower level (and the tools should make it easy for him to do so).
But when you keep experiencing a lot of leakiness of the second kind, maybe you should go looking for a better set of tools. Life's just too short...
While I usually like Joel's work, I'm pissed about the random jab at C++. For those who didn't read the article, he says something along the lines of
"A lot of the stuff the C++ committee added to the language was to support a string class. Why didn't they just add a built-in string type?"
It's good that a string class wasn't added, because that led to templates being added! And templates are the greatest thing, ever!
The comment shows a total lack of understanding of post-template, modern C++. People are free not to like C++ (or aspects of it) and to disagree with me about templates, of course, and in that case I'm fine with them taking stabs at it. But I get peeved when people who have just given the language a cursory glance try to fault it. If you haven't used stuff like Loki or Boost, or taken a look at some of the fascinating new design techniques that C++ has enabled, then you're in no place to comment about the language. At least read something like the newer editions of D&E or "The C++ Programming Language" then read "Modern C++" before spouting off.
PS> Of course, I'm not accusing the author of being unknowledgeable about C++ or anything of the sort. I'm just saying that this particular comment sounded rather n00b'ish, so to speak.
-- A deep unwavering belief is a sure sign you're missing something...
Templates kick ass, but, at some level, no matter what, you're dealing with the overhead of null terminated strings. A native string type would allow you to avoid this, and doesn't invalidate the coolness of templates.
Um, how exactly would a native string type save you overhead? They'd be null-terminated strings underneath anyway. I can't imagine implementing something more efficient than null terminated strings (for short strings, of course, you should be using something like ropes for megabytes of text). If you could, you could do it in C anyway. Could you clarify?
-- A deep unwavering belief is a sure sign you're missing something...
Pascal strings. The length of the string is stored in the first byte, with the rest of the string following. This limits you to 255 characters in length, but means you don't have to be looping over strings looking for nulls all the time. strlen() is a single instruction. strcat() can be done in O(1) time. Buffer overflows go away (well, mostly).
strlen becomes O(1) but strcat is still O(n) and strcpy is still O(n). I really don't think strlen() is performance intensive enough to justify the performance decrease. Besides, a string class could easily implement Pascal-like strings instead of C-style ones, without having it be a built in type.
-- A deep unwavering belief is a sure sign you're missing something...
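To make the Pascal-string layout being argued about concrete, here is a minimal sketch. The struct and function names (`PString`, `pstr_length`, `pstr_append`, `pstr_from_c`) are hypothetical, and the 255-character cap comes from the single length byte described above.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical Pascal-style short string: one length byte, then the characters.
struct PString {
    unsigned char length;  // prefix byte; no terminator is stored
    char data[255];
};

// O(1): read the prefix instead of scanning for a terminator.
inline std::size_t pstr_length(const PString& s) { return s.length; }

// Still O(n) in the bytes copied, but no scan is needed to find the end
// of dest. Returns false and leaves dest untouched on overflow.
inline bool pstr_append(PString& dest, const PString& src) {
    if (dest.length + src.length > 255) return false;
    std::memcpy(dest.data + dest.length, src.data, src.length);
    dest.length = static_cast<unsigned char>(dest.length + src.length);
    return true;
}

// Converting from a C string is the one place a scan is still required.
inline bool pstr_from_c(PString& out, const char* text) {
    std::size_t n = std::strlen(text);
    if (n > 255) return false;
    std::memcpy(out.data, text, n);
    out.length = static_cast<unsigned char>(n);
    return true;
}
```

This matches the correction in the reply: the length lookup becomes constant time, while appending and copying stay linear, just without the extra pass to locate the terminator. The explicit bounds check is also why overflows "mostly" go away.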
strcat() and strcpy() are O(n), but the constant is much smaller, since you don't iterate either string, you just malloc and then memcpy. One strlen() is fast. But since you end up doing it almost every time you do anything useful with null terminated strings, you're burning a lot of cycles. And when you screw up a strcpy() or something, and end up with a string without the terminating null, well, that's a bad thing.
Most string classes (at least, every one I've ever seen) do store length data, for exactly the same reason. However, because the compiler still transforms string literals into null terminated strings, you still end up messing with them, and there's not really much you can do about that. Adding a string type (have a compiler switch?) would allow you to avoid that - note that you can still use null terminateds just fine, adding a string type doesn't remove your char arrays.
I'm not sure how you see a performance decrease from using these types of strings - in every case they will be at least as fast as null terminated ones.
Well, the vast majority of the time I'm working with raw strings, I'm working with short ones. Large quantities of text are wrapped in a string class which, yes, stores its length in an int. But whenever I use a small buffer, and every time I have to work with a literal, I'm using strlen(). That's extra iterations and cpu cycles that are totally unnecessary.
I don't really expect the C++ or C standards to change just because I want a string data type. But the fact is, there's no downside to having one, and a potentially significant gain to having one. That's all I'm saying.
Hm, Pascal strings would make strcmp() slower, though, because of the overhead of maintaining a loop counter and checking it against the string length. I'd argue that you spend just as much time comparing strings as copying them around, so it's 6 of one or a half dozen of the other...
As for string literals, who uses string literals? Most text should be stored externally to allow for translations, and what few string literals there are aren't performance intensive at all. They hardly justify changing such a critical semantic in C/C++.
-- A deep unwavering belief is a sure sign you're missing something...
std::basic_string class keeps the string length stored. basic_string::size() does not have to scan the data for the NULL byte, it just looks at the size value it has stored.
Plain old NULL-terminated strings are horribly inefficient. You have to do a full scan every time you want to find the length or append. Ugh!
Well, I use em a lot :P Screw people who don't speak english :P And if you start at the end and iterate up, you can probably eliminate the loop overhead... I'm no expert in unrolling loops. I hate strcmp() anyway - 90% of the time you don't care how different the strings are, you only care if they aren't the same, and in that case, being able to drop out early based on length will end up being a gain a lot of the time.
Most string classes I've ever seen cache the size so you don't have to do a strlen(), you get it from the cache, essentially 'pascallizing' the string, costing you a couple extra bytes (I'm sure the length is an int, not a char) but as in all optimizations, it's a size/speed tradeoff. Remember that in C++ you can filter all operations that affect the internal data so the 'cached' size will always be current. (well, you can put up a big "DON'T TOUCH" sign, if people really want to get to the data, they can always cast...)
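The "drop out early based on length" idea looks roughly like this when the length is stored alongside the data. `Counted` and `counted_equal` are names made up for the sketch.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical length-carrying view of a string: pointer plus stored length.
struct Counted {
    const char* data;
    std::size_t length;
};

// Equality only (no ordering): strings of different lengths cannot be
// equal, so that case returns without comparing a single byte.
inline bool counted_equal(const Counted& a, const Counted& b) {
    if (a.length != b.length) return false;  // early exit
    return std::memcmp(a.data, b.data, a.length) == 0;
}
```

This only answers "same or not"; if you need to know which string sorts first, you are back to walking the bytes the way strcmp() does.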
With (small) pascal strings, there's no size tradeoff - you're getting rid of the terminating null in favor of a pre-pended length. For strings 255 characters and shorter, there's basically no good reason to not use them. I know all about classes and C++ strings and the standard library and all that jazz, this is more about the fundamental C datatypes.
I don't think I implied that each packet transmitted required a separate ACK packet. It definitely doesn't. An ACK means that all data up to that sequence number has been received successfully.
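A small sketch of what a cumulative ACK means on the sending side. It assumes the ACK carries the number of the next byte the receiver expects and ignores sequence-number wraparound; `SendWindow` and its members are hypothetical names, not any real stack's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical record of one segment that has been sent but not confirmed.
struct Segment {
    std::uint64_t first_byte;  // sequence number of the first byte
    std::uint64_t size;        // number of bytes in the segment
};

class SendWindow {
public:
    void sent(std::uint64_t first_byte, std::uint64_t size) {
        unacked_.push_back(Segment{first_byte, size});
    }

    // One cumulative ACK confirms every segment that ends at or before
    // next_expected, so several segments can be retired at once.
    void on_ack(std::uint64_t next_expected) {
        while (!unacked_.empty() &&
               unacked_.front().first_byte + unacked_.front().size <= next_expected) {
            unacked_.pop_front();
        }
    }

    std::size_t in_flight() const { return unacked_.size(); }

private:
    std::deque<Segment> unacked_;  // kept in send order
};
```

After sending three 100-byte segments starting at byte 0, a single `on_ack(200)` retires the first two and leaves one in flight.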
The Irony of the Shiny New Thing
by
Badgerman
·
· Score: 5, Interesting
Loved this article. Sent it on to my manager and a co-worker.
One thing I liked especially is the danger of the Shiny New Thing. It may be neat and cool and save time, but knowing how to use it does not mean that you can do anything else - or function outside of it.
Right now I'm on an ASP.NET project - and some ASP.NET stuff I actually like. But the IDE actually makes it harder to program responsibly, and even utilize .NET effectively. Unless one understands some of the underpinnings of this NEW technology, you actually can't take advantage of it. Throw in the generated code issues and the IDE, an abstraction of an abstraction, really is disadvantageous.
A friend of mine just about strangled some web developers he worked with as they ONLY use tools (and they love all the Shiny New Ones) and barely know what the tools produce. This has led to hideous issues of having to configure servers and designs to work with their products as opposed to them actually knowing how they work. The guy's a saint, I swear.
I think managers and employers need to be aware of how abstract things can get, and realize good programmers can "drill down" from one layer to another to fix things. A Shiny New Thing made with Shiny New Things does NOT mean the people who did it are talented programmers, or that they can haul your butt out of a jam when the Shiny New Thing loses its shine.
-- "The Sage treasures Unity and measures all things by it" - Lao Tzu
Re:The Irony of the Shiny New Thing
by
arkanes
·
· Score: 2
I had this exact problem when I was trying to learn to use the MFC for gui programming. Could have been partly an issue with the class I was taking, I suppose, but it's a pretty consistent problem in all the books I've seen. It's _hard_ to make the MFC work, and it's easy to break it. The wizards that the IDE wants you to use generate obfuscated (and hidden) code, so unless you already know how to do it all from scratch, you don't know what the IDE is doing. Which would be fine, except that it's easy to screw stuff up, so you then have to fight with the IDE to tell you what the hell it generated, so you can find out where it's breaking.
Almost all C++ string classes overload the + operator so you can write s + "bar" to concatenate. But you know what? No matter how hard they try, there is no C++ string class on Earth that will let you type "foo" + "bar", because string literals in C++ are always char*'s, never strings.
I found out that some PHP library functions (empty, isset, etc.) are not really functions, but "language constructs". They ONLY take variables as arguments. You cannot pass function results, constants, and IIRC expressions.
Apparently they married something to C or C++, ruining the "substitutability rule" of programming.
I couldn't believe my eyes. PHP is supposed to be a "scripting language", not a C++ preprocessor. That kind of shit just makes ASP look competitive. Please don't completely kill MS because OSS will then feel free to expose and force-feed even more archaic C-family ugliness in its languages.
If it is done for speed purposes, then perhaps have an interpreter switch somewhere that lets one choose between "normal acting" functions and a crippled-but-fast option. (Normal being the default.)
Problem is not abstraction its team-work
by
mdritchi
·
· Score: 2, Insightful
The problem that Joel talks about is not really a problem with abstraction, it is a problem with teamwork. When I program, I simply cannot do everything myself on anything but trivial projects. By everything I mean writing the compilers, writing the OS, etc. Instead I must rely on other programmers to write large portions of the code that I run. Whether it is the guy across the hall who wrote the search contact Stored Procedure in SQL or a programmer at Microsoft writing a Windows disk I/O function, I am reliant on their code working as I think it should. This is the problem with teamwork but there is no other solution to programming modern applications.
Martin
Too much abstraction is a bad thing
by
dsaxena42
·
· Score: 3, Insightful
Maybe I'm an old fashioned has-been but people doing software development should understand the fundamentals of how computers work. That means that they should understand things like memory management, they should understand what a pointer is, they should understand how tight loops versus unrolled loops might affect the performance of the caches on their system. I meet so many "programmers" that have no understanding that there are architectural constraints on what they can and can't do. Software runs on hardware. If you're going to write software and treat the hardware as a black box, you're not going to write it as well, or as efficiently as you could be doing it.
CPU cycles are cheaper than keystrokes
by
Eric+Savage
·
· Score: 2, Insightful
Now I always consider performance when designing/writing code, but programmers are WAY more expensive than hardware, so eking out performance can often be a wasted effort. Everyone knows that C will smoke Java in most operations, but it's so hard to manage at the enterprise level that you are much better off taking the 50%+ performance hit and writing in a "leaky" language.
--
This is not the greatest sig in the world, this is just a tribute.
abstractions == models
by
Dr.+Awktagon
·
· Score: 3, Insightful
Looks like he just discovered and renamed the basic idea that "all models are incomplete". Any scientist could tell you that one! I remember a quote that goes something like this: The greatest scientific accomplishment of the 19th century was the discovery that everything could be described by equations. The greatest scientific accomplishment of the 20th century is that nothing can be described by equations.
That's all an abstraction is: a model. Just like Newtonian physics, supply and demand under perfect competition, and every other hard or soft scientific model. Supply and demand breaks down at the low end (you can't be a market participant if you haven't eaten in a month) and the high end (if you are very wealthy, you can change the very rules of the game). Actually, supply and demand breaks down in many ways, all the time. Physics breaks down at the very large or very small scales. Planetary orbits have wobbles that can only be explained by more complex theories. Etc.
No one should pretend that the models are complete. Or even pretend that complete models are possible. However, the models help you understand. They help you find better solutions (patterns) to problems. They help you discuss and comprehend and write about a problem. They allow you to focus on invariants (and even invariants break down).
All models are imperfect. It's good that computer science folks can understand this, however, I don't think Joel should use a term like "leaky abstraction". Calling it that implies the existence of "unleaky abstraction", which is impossible. These are all just "abstractions" and the leaks are unavoidable.
Example: if I unplug the computer and drop it out of a window, the software will fail. That's a leak, isn't it? Think of how you would address that in your model: maybe another computer watches this one so it can take over if it dies..etc..more complexity, more abstractions, more leaks....
He also points out that, basically, computer science isn't exempt from the complexity, specialization, and growing body of understanding that accompanies every scientific field. Yeah, these days you have to know quite a bit of stuff about every part of a computer system in order to write truly reliable programs and understand what they are doing. And it will only get more complex as time goes on.
But what else can we do, go back to the Apple II? (actually that's not a bad idea. That was the most reliable machine I've ever owned!)
The Best Abstractions Plug the Leaks
by
scruffy
·
· Score: 2
I can agree that all abstractions are leaky (imperfect). Even a Turing machine assumes that you have infinite memory.
However, the best abstractions are those that plug the leaks or at least keep them to a drip rather than a stream. Automatic garbage collection for plugging memory leaks is a good example. Perhaps this is the main reason why I like Perl and Java. Of course, you still get into trouble, but the programs are much easier to debug without all the code tracking and freeing up storage.
But you know what? No matter how hard they try, there is no C++ string class on Earth that will let you type "foo" + "bar", because string literals in C++ are always char*'s, never strings.
WTF? Has he never heard of temporaries? I don't understand this point at all.
-- Dahlmann tightly grips the knife, which he may have no idea how to use, and steps out into the plain.
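For reference, here is the temporaries point in code. Strictly speaking a literal is an array of const char that decays to a pointer, which is why `"foo" + "bar"` is rejected: it asks for the sum of two pointers. The function names below are made up for the sketch.

```cpp
#include <string>

// Turning either operand into a std::string temporary selects the
// library's operator+ and the expression compiles.
inline std::string concat_with_temporary() {
    return std::string("foo") + "bar";
}

// Adjacent literals are joined by the compiler itself, with no operator
// involved at all.
inline const char* concat_adjacent() {
    return "foo" "bar";
}
```

So Joel's complaint holds for the exact spelling `"foo" + "bar"`, but wrapping one side in a temporary is a one-word fix.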
haha! It also seems like the messages from the keyboard are arriving out of order! (read the first 5 words of the original post... you see what I mean)
No, unfortunately, that is simply my brain dropping packets destined for my fingers. Tis a shame, really.
-- In the future, I would want to not be isolated from my friends in the Space Station.
Re:a leaky abstraction is a wrong abstraction
by
arkanes
·
· Score: 3, Insightful
You don't, and in fact can't, deal with page faults in your Java program. Nonetheless, your Java program will suffer a performance hit when it page faults. That's a leaky abstraction.
Joel is a retard, abstraction is good...
by
gnovos
·
· Score: 2
This one paragraph is what I hate about Joel:
Back to TCP. Earlier for the sake of simplicity I told a little fib, and some of you have steam coming out of your ears by now because this fib is driving you crazy. I said that TCP guarantees that your message will arrive. It doesn't, actually. If your pet snake has chewed through the network cable leading to your computer, and no IP packets can get through, then TCP can't do anything about it and your message doesn't arrive. If you were curt with the system administrators in your company and they punished you by plugging you into an overloaded hub, only some of your IP packets will get through, and TCP will work, but everything will be really slow.
This is what I call a leaky abstraction.
On the surface it looks like an almost reasonable way to describe the situation, but when you look closer, you realize it's mish-mash written to look smarter than it is.
Imagine, adding to the example above, you were to flip off your computer, or pour a cola directly on the motherboard... at that point ALL programming would cease to function. All computer code exists at a level of abstraction, even when you are programming in machine language you are still abstracted to some degree away from the hardware...
But that is actually the POINT of computers. Abstraction is what gives computers their strength. It's what allows machines to be programmed to do vastly complex things without requiring a vastly complex piece of code.
All his examples are simply whining that X program can't function when Y event happens. Javascript can't run when JS is turned off in the browser, C++ won't let you add two string literals together, some SQL queries are slower than others...
None of these are inherent faults with abstraction, they are specific instances of poor implementation, instances that can and probably should be fixed. Instead of looking at one flawed analogy and saying that analogies as an argumentative tool are all inherently unusable, you should fix the flaw in that one analogy and use it.
-- "Your superior intellect is no match for our puny weapons!"
There are no more Rennisance Men.
by
Ungrounded+Lightning
·
· Score: 3, Insightful
While you are slamming ``liberal arts'' -- a term you seem not to understand -- you highlight the need for it. Liberal arts does not imply a non-scientific, non-technological education. It implies a broad education, including science, mathematics, and engineering along with the ``traditional'' topics of history, literature, languages, politics, economics, and arts. For politics, governance, and management, I want people who are conversant in all of those topics.
Unfortunately, the subjects you list have all grown to the point that no human can obtain even a BASIC understanding of all of them before he's too old to have a useful career left.
It was once possible to be a "Rennisance Man" - a master of ALL the sciences and arts reduced to teachability. No more. It's just too bloody large. (I say this as someone who attended a university that claims to try to produce such people - centuries after the last of them is dead. B-) )
Unfortunately, "Liberal Arts" schools have, over much of the last century, been filled with the mathematically and technically illiterate - both because the students without the necessary skills gravitated there, and because the faculties themselves were so disabled, and in turn disparaged the skills they were incompetent to teach.
The engineering/scientific/biologic/technical curriculum had constant feedback from the real world about what was true and what was false. But the "Arts Schools" taught classes where what was "right" was ONLY a matter of opinion - and grades solely a measure of how well you could regurgitate your Prof's pet bonnet-bees. (This DESPITE the fact that SOME of these theories could be TESTED - if only the academics understood, and/or believed in, things like the scientific method, statistics, and sampling methods.)
Yes the "Social 'Sciences'" are hard. But the bulk of their credentialed practitioners used this as an excuse to drop "science" from their methodologies. (This despite the fact that mathematics departments were generally part of the art, rather than the engineering, side of the school organization.)
I've been out of academia for a while now. I can hope that things have improved, as you seem to claim. But I have not personally seen any sign of such from the outside (other than your claim).
In my school days, too, many students on the Arts side of the wall knew tech, math, and the like. (Students are generally young, and still hunting for their muse.) But they would generally transfer out to some field more conducive to clear thought, drop out to use it in the real world, or (if they stayed in LS&A) suppress it or flunk out.
-- Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Re:There are no more Rennisance Men.
by
Dirk+Pitt
·
· Score: 2
the subjects you list have all grown to the point that no human can obtain even a BASIC understanding of all of them
The subjects he lists, history, literature, languages, politics, economics, and arts, science, mathematics, and engineering, why would you disagree that it's possible for someone to obtain competency in these areas? It seems to me that for someone to be a modern "Rennisance Man" [sic], that he must have a basic competency in many areas, and the ability to apply them all to some goal. Now, the kind of mastery in these topics that you speak of might be rare, but so were the Da Vincis of the time.
The Philosopher-Scientist exists today, and in record numbers! You have but a short stroll to the bookstore to discover that there are a multitude of scholars who with unbelievable multidisciplinary knowledge have brilliantly defended theories and proposed inventions.
Unfortunately, I must agree with you that the academic establishment isn't churning out your average student with even a glimmer of 'Renaissance Man'-style education. Many an engineer can prove a book's worth of formulae, but could not assert a belief and defend it in writing with much more skill than an illiterate.
So, do I believe that anyone can have doctorate-level knowledge in all the areas described? Definitely not. Do I believe that everyone, including myself, could have a broader level of knowledge that would be helpful in everyday life? Definitely so.
Cutting lawns with scissors...
by
Zinho
·
· Score: 3, Interesting
No one would cut a lawn with scissors
You'd be surprised what people will cut lawns with. In Brasilia (Capital of Brasil) the standard method of trimming lawns is to use a machete. No, I'm not talking about hacking down waist-high grass, I'm talking about trimming 3-inch high grass down to two inches by hacking repeatedly at it with a machete, trying to swing parallel to the ground as best you can. No, you don't do this yourself, you hire someone to do it. And if you're a salaried groundskeeper, it makes sure that you always have something to do - you wouldn't want to be found slacking off during the day. On rare occasions I've seen people using hedge trimmers (aka big scissors) instead. My family was the only one I knew about in our neighborhood that even owned an American-style lawn mower. My parents were too cheap to hire a full-time groundskeeper, and I have lots of brothers and sisters who work for free:)
Moral of the story; if it works and fits the requirements better, someone will do it.
-- "Space Exploration is not endless circles in low earth orbit."
-Buzz Aldrin
Very interesting story. It seems anything you could call a tool is an abstraction. The very nature of language is abstraction. Only God can accomplish something without abstractions, although even he might not want to go to all that effort.
Most of the time, with a hammer, you're not supposed to care that some hammers have a wooden handle with a metal bit on top or that some are all metal, or metal and graphite, etc. Sometimes, however, it becomes very relevant (say if for some reason the head of your hammer is touching high voltage lines).
You're also not supposed to need to remember that with many hammers the metal head is held onto the wooden handle with glue and friction, and that on an old hammer the head can come off if you swing it too hard.
As a wooden hammer handle starts to break, it's also suddenly very relevant how the grain of the wood goes along the handle.
A hammer is an abstraction of the pieces that make it up, just as a high level programming concept is an abstraction of the machine actions that implement it.
Liberal Arts is about lying..
by
rufusdufus
·
· Score: 2
Apparently you never had a liberal arts education. A large part of its function is to teach you to baffle with bullshit and debate based on form and not substance.
That is to say, engineers are not in any way similar to liberal arts majors, as you can't fool mother nature.
edge conditions in which your data is received but you never receive the ACK(nowledge) and therefore assume that the message was lost.
This is called the Two Generals Problem (they only win if they both attack, but they can't be sure the last messenger made it past the enemy). Sadly, there's a proof that it can't be solved.
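The lost-ACK ambiguity described above can be sketched in a few lines of Python. This is a toy model, not anything from the thread: `sender_observation` is a hypothetical helper that reports what the sender can see after one send attempt and one timeout, assuming the receiver acknowledges only data it actually got.

```python
def sender_observation(data_delivered: bool, ack_delivered: bool) -> str:
    """Return what the sender sees after one send and one timeout.

    Toy model: the receiver acknowledges only data it actually got,
    and the sender sees the ACK only if the ACK itself is delivered.
    """
    ack_sent = data_delivered
    ack_seen = ack_sent and ack_delivered
    return "ack received" if ack_seen else "timeout"


# Case 1: the data never arrived.
lost_data = sender_observation(data_delivered=False, ack_delivered=True)
# Case 2: the data arrived, but the ACK was lost on the way back.
lost_ack = sender_observation(data_delivered=True, ack_delivered=False)

# The sender sees the same thing in both cases, so it cannot tell them apart.
print(lost_data, "|", lost_ack)
```

Retransmitting is the right move in case 1 but creates a duplicate in case 2, which is why the receiving side has to be prepared to discard duplicates (TCP does this with sequence numbers).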
Here goes....
by
gillbates
·
· Score: 3, Interesting
many high level abstractions simply do not exist in assembly language.
Okay, so this is a little snippet of some assembly language I've just recently worked on. Here's the declaration for the input file:
textfile input.txt
That's it. Is this readable? Is it abstracted at a level high enough? The primary difference between assembly and a HLL is that in assembly one must invent their own logical abstractions for a real world problem, where languages such as C/C++ simply provide them.
You've probably noticed that I'm using a lot of macros. In fact, classes, polymorphism, inheritance, and virtual functions are all easily implemented with macros. I'm using NASM right now (though I'm using my own macro processor), and it works very well. Because I understand both the high-level concepts and low level details, I can code rather high-level abstractions in a relatively low level language such as assembler. I get the best of both worlds: the ease of HLL abstraction with the power of low level coding.
Please tell me what you think of this - I would honestly like to know. For the past few years, I've been working on macro sets and libraries that make coding in assembly seem more like a HLL. I've also set rules for function calls, like a function must preserve all registers, except those which are used to pass parms. With a well developed library of classes and routines, I've found that I can develop applications quickly and painlessly. Because I stick to coding standards, I'm able to reuse quite a bit (> 50%) of my assembly code.
You might be tempted to ask, "Why not just write in a HLL then?" I do. In fact, I prefer to write in C++. But when the need arises, it's nice to be able to apply the same abstractions of a HLL in assembly. It just so happens that the need has arisen - I'm working on a project that will last a few weeks, and my boss doesn't consider it fiscally responsible to buy a $1200 compiler that will be used for such a short time.
Interestingly, the use of assembly has made me a better programmer. Assembly forces one to think about what one is doing before coding the solution, which usually results in better code.
Assembly forces me to come up with new abstractions and solutions that fit the problem, rather than fitting the problem into any given HLL's logical paradigm. Once I prove that the abstract algorithm will indeed solve the problem, I'm then free to convert the algorithm into assembly. Notice that this is the opposite of the way most HLL coders go about writing code - they find a way in which to squeeze a real world problem into the paradigm of the language used.
Which leaves them at a loss when "leaky abstractions" occur. Assembly has the flexibility to adapt to the solution best suited to a problem, whereas HLLs, while very good at solving the particular problem for which they were designed, perform very poorly for solving problems outside of their logical paradigms. While assembly is easily surpassed by C/C++, Java, or VB for many problems, there are simply some problems that cannot be solved without it. But even if one never uses assembly professionally, learning it forces one to learn to develop logical abstractions on their own - which in turn, increases their general problem solving ability, regardless of the language in which they write.
I see the key difference between a good assembly coder and a HLL coder is that an assembly language coder must invent high level abstractions, whereas the HLL coder simply learns and uses them. So assembly is a bit more mental work.
-- The society for a thought-free internet welcomes you.
Re:Here goes....
by
Junks+Jerzey
·
· Score: 4, Insightful
Please tell me what you think of this - I would honestly like to know.
I've worked in a way similar to you, and I might still if it were as mindlessly simple to write assembly language programs under Windows as it was back in the day of smaller machines (i.e. no linker, no ugly DLL calling conventions, smaller instruction set, etc.). In addition to being fun, I agree that assembly language is very useful when you need to develop your own abstractions that are very different from other languages, but it's a fine line. First, you have to really gain something substantial, not just a few microseconds of execution time and an executable that's ten kilobytes smaller. And second, sometimes you *think* you're developing a simpler abstraction, but by the time you're done you really haven't gained anything. It's like the classic newbie mistake of thinking that it's trivial to write a faster memcpy.
These days, I prefer to work the opposite way in these situations. Rather than writing directly in assembly, I try to come up with a workable abstraction. Then I write a simple interpreter for that abstraction in as high a level language as I can (e.g. Lisp, Prolog). Then I work on ways of mechanically optimizing that symbolic representation, and eventually generate code (whether for a virtual machine or an existing assembly language). This is the best of both worlds: You get your own abstraction, you can work with assembly language, but you can mechanically handle the niggling details. If I come up with an optimization, then I can implement it, re-convert my symbolic code, and there it is. This assumes you're comfortable with the kind of programming promoted in books like _Structure and Interpretation of Computer Programs_ (maybe the best programming book ever written). To some extent, this is what you are doing with your macros, but you're working on a much lower level.
Spolsky's such a nob. He simply relays the watered-down thoughts of truly original thinkers.
If you really want to learn about "leaky abstractions" and a bunch of other topics, including human cognition, complexity, economics, and engineering design, read Herbert Simon's "The Sciences of the Artificial".
There is another tradeoff as well
by
Sun+Tzu
·
· Score: 2
Even if higher-level abstractions were perfectly bug-free and non-leaky, there is another tradeoff that would forever preserve the niche of lower-level development tools. The granularity of the abstraction is an inherent tradeoff not just in machine time/efficiency, but in programmer learning curve as well.
Save us time learning?
by
michael_cain
·
· Score: 2
So the abstractions save us time working, but they don't save us time learning.
Sure they do, because they provide a means to better organize the concepts that you have to learn. Take a simple example from physics: Newton's laws of motion are perfectly adequate in the large majority of cases; relativity is important only in cases where velocities approach that of light. Rather than teaching just relativity, we teach Newton first, then point out that it fails near the speed of light, then teach relativity. IMHO, that beats the hell out of teaching relativity first, then pointing out that there are simplifications that work 99% of the time.
The software field is notorious for employing people whose course of study has been analogous to learning Newton but not learning that there are conditions under which that isn't sufficient. I am not in favor of requiring licensing for all software developers, but there are reasons that states do not allow people to, for example, hire themselves out to do bridge design unless they have demonstrated certain minimum training and competences.
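To put rough numbers on the Newton-versus-relativity example, here is a small Python sketch. It is only an illustration: it compares the textbook kinetic energy formulas, and the function name `kinetic_energy_ratio` and the sample speeds are choices made for this example, not anything from the post.

```python
import math


def kinetic_energy_ratio(beta: float) -> float:
    """Relativistic / Newtonian kinetic energy at speed v = beta * c.

    Mass and c cancel out of the ratio, so only beta = v/c is needed.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)  # Lorentz factor
    relativistic = gamma - 1.0     # (gamma - 1) * m * c^2, in units of m * c^2
    newtonian = 0.5 * beta * beta  # (1/2) * m * v^2, in units of m * c^2
    return relativistic / newtonian


for beta in (0.01, 0.5, 0.9):
    print(f"v = {beta:.2f}c  ratio = {kinetic_energy_ratio(beta):.4f}")
```

At 1% of light speed the two formulas agree to within a hundredth of a percent, so the simpler abstraction is safe; at 90% the relativistic value is more than three times the Newtonian one, which is the point where you need to know the layer underneath.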
why the hell is he talking about TCP being 'reliable' and IP being 'unreliable' when TCP is a transport protocol and IP is a routing protocol? i kept wanting to plug in 'UDP' for every instance of 'IP' when i was reading his article, and couldn't get very far down it because he was driving me up a wall. i'm all for writing up easy to understand explanations of things, but that was a wildly inaccurate way to easily explain it.
Leak Resistant Abstractions
by
Vagary
·
· Score: 2
> Today, I won't even go near a programming language lower than C and I like Python much better.
So if programmers don't need to know assembler any more what does this tell us? C's abstractions are not significantly leaky. And as compilers get better and better, higher and higher level abstractions will be leak resistant. A great example is OCaml: a very high-level language that still manages to perform [almost] as well as C.
My point: this article is good and all, but it's hardly the final word.
Re:Leak Resistant Abstractions
by
OwnedByTwoCats
·
· Score: 2
C's abstractions aren't leaky. But they're not very good, either. The str functions are inherently broken, and the strn functions so awkward that they are frequently misused, even by their creators.
...some SQL servers are dramatically faster if you specify "where a=b and b=c and a=c" than if you only specify "where a=b and b=c" even though the result set is the same.
I'm not a professional SQL programmer, though I've dabbled, so I'd really like to know why this is true. Is it because in the first case, the interpreter can compare all three variables at once, instead of in two different steps in the second case?
Could someone please explain?
Jon Acheson
-- All opinions expressed herein are my own, and not those of my employers, who are appalled.
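A hedged answer to the question above, with a sketch: one common explanation (not given in the thread, and dependent on the server) is that "a=c" is logically redundant but hands the query planner an extra join condition. A planner that does not infer a=c from the other two predicates may otherwise miss a cheaper join order or index. The Python example below uses the standard sqlite3 module; the tables t1, t2, t3 and their contents are made up for illustration. It only shows that the result set is identical and how to look at the plan; whether the plans actually differ depends on the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (a INTEGER);
    CREATE TABLE t2 (b INTEGER);
    CREATE TABLE t3 (c INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (2), (3), (4);
    INSERT INTO t3 VALUES (3), (4), (5);
    """
)

two_predicates = "SELECT a, b, c FROM t1, t2, t3 WHERE a = b AND b = c"
three_predicates = two_predicates + " AND a = c"

rows_two = sorted(conn.execute(two_predicates).fetchall())
rows_three = sorted(conn.execute(three_predicates).fetchall())

# Same rows either way; only the optimizer's options differ.
print(rows_two == rows_three)

# Each engine exposes its plan differently; in SQLite it is EXPLAIN QUERY PLAN.
for query in (two_predicates, three_predicates):
    for step in conn.execute("EXPLAIN QUERY PLAN " + query):
        print(step)
```

So it is not that the server compares three values at once; the extra predicate changes what the planner is allowed to consider, which is exactly the kind of performance leak the article describes.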
Abstraction induced complexity
by
phsolide
·
· Score: 2
That Joel! Always coming up with catchy phrases for concepts that his betters have already published papers on. See David Keppel's 1993 paper for a more thorough explanation, complete with lack of baby-talk.
-- Quit playing Monopoly with Bill. Switch to one of many non-Microsoft products today.
I would prefer them to use C style expressions rather than weird [] thingies.
And many of the bugs I have had in the last few weeks would just not appear in high level language, e.g. if you branch past an instruction that changes the CPU register sizes.
I'm not a big x86 fan, but don't they usually use ebx these days instead of bx?
Non-leaky abstractions
by
Animats
·
· Score: 3, Interesting
There's been a trend away from non-leaky abstractions. LISP, for example, was by design a non-leaky abstraction; you don't need to know how it works underneath. So is Smalltalk. Perl is close to being one. Java leaks more, leading to "write once, debug everywhere". C++ adds abstractions to C without hiding anything, which increases the visible complexity of the system.
It's useful to distinguish between performance-related leaks and correctness leaks. SQL offers an abstraction for which the underlying database layout is irrelevant except for performance issues. The performance issues may be major, but at least you don't have to worry about correctness.
C++ is notorious for this; the language adds abstractions with "gotchas" inside.
If you try to get the C++ standards committee to clean things up, you always hear 1) that would break some legacy code somewhere, even if we can't find any examples of such code anywhere in any open source distro or Microsoft distro, or 2) that only bothers people who aren't "l33t".
Hardware people used to insist that everything you needed to know to use a part had to be on the datasheet. This is less true today, because hardware designers are so constrained on power, space, heat, and cost all at once.
Re:Non-leaky abstractions
by
Alan+Shutko
·
· Score: 2
SQL offers an abstraction for which the underlying database layout is irrelevant except for performance issues. The performance issues may be major, but at least you don't have to worry about correctness.
No, it means you think you don't have to worry about correctness. It is never the case that an abstraction means you don't have to worry about things. It just means you don't have to worry about things as often.
Someone told me that my query must be wrong... no, it was because the DBM was missing a service pack. (We worked around it anyway since I couldn't convince the lead that my approach should work, and would on the production box, which was more up to date than the dev box. Why weren't they the same? Don't even go there.)
Re:Non-leaky abstractions
by
smallpaul
·
· Score: 2
There's been a trend away from non-leaky abstractions. LISP, for example, was by design a non-leaky abstraction; you don't need to know how it works underneath.
Right, and there is a huge trend towards Lisp development in the last few years.
Leaky abstractions will always be with us. Yes, as time goes by we patch some of the egregious leaks (probably a big part of why Java is more popular than C++) but then we just build new leaky abstractions on top (e.g. Jython).
By the way, I would be surprised if Lisp float programming was significantly less leaky than that in other languages.
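The float point is easy to demonstrate. The sketch below uses Python rather than Lisp, on the assumption that both typically sit on the same IEEE 754 binary doubles, so the leak looks the same in either.

```python
import math

total = 0.1 + 0.2

# The "real number" abstraction leaks: the sum is not exactly 0.3.
print(total == 0.3)              # False
print(f"{total:.17f}")           # shows the stored binary approximation

# Knowing what is underneath, compare with a tolerance instead.
print(math.isclose(total, 0.3))  # True
```

No amount of language-level polish removes this; the programmer still has to know that the numbers are binary fractions underneath.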
Re:Non-leaky abstractions
by
leandrod
·
· Score: 2
> There's been a trend away from non-leaky abstractions.
[...]
> SQL offers an abstraction for which the underlying database layout is irrelevant except for performance issues.
Moral: a trend exists when there is awareness not only of the problem, but also of possible solutions. Everyone knows SQL is a problem, but most people think the solution is OO, so the real solution is overlooked.
-- Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
Shouldn't a programmer that wanted to be most efficient start with the highest level of abstractions and work his/her way down as needed?
Abstraction brings efficiency (not in raw cycles but in programming time), portability, and other freedoms. In Java, the hardware itself is abstracted so that buffer overflows and other security concerns aren't a problem. C programmers blame buffer overflows on poor programming. I have to ask myself who these people are. If they are so good, then they must have some popular program, right? A program like Bind or Sendmail, perhaps? Even the Linux kernel has had buffer overflows.
Abstraction doesn't cost much by way of performance. Java's abstraction is down from its initial 1600% to 200% the speed of C. During that time C has gone from its 200% of assembler to closer to 100%. It's reasonable to foresee in the future that any abstraction, even those so severe as the JVM, will eventually approach native speeds through optimized compilers/JIT/AOT.
The best thing about abstractions in the open source world is that a group of people who use these everyday will be the ones developing them. They will address the issues that matter to developers, and they will only need be addressed once. You don't have to reinvent the wheel when you design your car. If someone makes a better wheel, it's very likely that you will benefit from it if you use an abstraction.
Imagine the nightmare of converting to IPv6 if most people didn't use (at least portions of) the BSD TCP/IP stack??? Compare that to the trivial gains you get by not using it! What if all programmers decided to use raw sockets to remove the leaky abstraction?
Fix the abstraction, don't abandon it. Have blocking TCP/IP sends (alongside the others) for people that need to know if packets made it. You could just add a function to ask the system if the TCP packet had been sent yet (where sent really meant that an ack was received). You get to keep the interface. By the time the interface isn't leaky, you will have a large interface that performs all the tasks of the original, just in a more human comprehensible way.
Another advantage to abstractions is that they can be faster in the end. OpenGL is an abstraction. It's so popular that video card designers will work hard to make that abstraction as thin as possible. Now we have cards that support directly some or all of the functions. It's cross-platform as well because it is an abstraction.
Don't abandon abstractions because they are slow or aren't all encompassing. Fix the leak, don't toss out the toilet.
-- Karma Clown
Non-programmers and leaky abstractions
by
Zaphod-AVA
·
· Score: 2, Insightful
The subject of leaky abstractions applies to novice users as well.
I've felt for a long time that people are taught about computers the wrong way, and this article clarifies why this is true.
People are taught less and less about what the computer actually does, and instead focus on things like the desktop analogy, and task oriented training. The user must then remember all these seemingly strange things computers do that don't follow the abstraction they were taught. This makes them seem difficult and incomprehensible.
The problems created by abstractions intended for users can simply be solved with more complicated software that better models the analogy that the users are taught. Unfortunately, the opposite is probably true for programmers.
-Zaphod
The problem with VB...
by
JaredOfEuropa
·
· Score: 2
As I see it, the problem with VB is not the high abstraction levels per se, but the fact that they are such lousy abstractions. Especially when it comes to controls. For example, look at the standard VB TreeView object. It becomes painfully clear that the VB derivation of this control was written with one narrowly defined purpose in mind, ignoring the fact that people might find many other uses for this control.
When using an OO language like C++, programmers move objects to a higher abstraction level all the time. That is why every good OO course teaches you to make your object's methods orthogonal and complete. In layman's terms that means that the object operators should be sensible and generic. If you make an object which holds data that you wish to multiply by 5 and then add 2 to it, you don't make an operator that does just that. Instead, you make one operator that multiplies by x, and another that adds y.
Many VB objects ignore this rule, making operators for a specific purpose, that should have been generic instead, and not bothering to implement other possibly useful operators from a lower abstraction level.
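The multiply-by-5-then-add-2 example above can be written out as a short Python sketch. The class `Value` and its method names are hypothetical, invented only to contrast the two designs; they are not taken from VB or from any real library.

```python
class Value:
    """Hypothetical data holder used only for this illustration."""

    def __init__(self, data: float) -> None:
        self.data = data

    # Non-orthogonal: one operation hard-wired to a single use case.
    def times_five_plus_two(self) -> "Value":
        return Value(self.data * 5 + 2)

    # Orthogonal: two small, generic operations that compose.
    def scale(self, x: float) -> "Value":
        return Value(self.data * x)

    def add(self, y: float) -> "Value":
        return Value(self.data + y)


v = Value(3)
print(v.times_five_plus_two().data)  # 17
print(v.scale(5).add(2).data)        # 17, built from generic pieces
print(v.scale(2).add(-1).data)       # 5, a use the special case cannot cover
```

An object that offers only the first method forces every other caller to program around it, which is the complaint being made about the VB controls.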
When programming VB, I spend about 30% of my time programming around senseless limitations of and omissions from VB objects and controls.
-- If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
Abstraction bad... old way good.
by
LoRider
·
· Score: 2
The age old question (at least in the last few years), is abstraction really beneficial to software development? The answer is.......Yes.
Abstraction is good to an extent. The problems that most people bring up regarding abstraction I find to be more of a problem with programmers rather than programming. There are always going to be more shitty programmers than good ones. It's the same in everything in life, there are always fewer good than bad.
Abstraction is awesome, it allows for me to think about things that can make my software better, faster, and rich with features while not having to concern myself with the nitty gritty bits and bytes all the time. I have no problem, actually I enjoy, digging into the code to fix obscure bugs and problems with a particular abstraction.
People always say programmers are lazy, I say it all the time. But the reality is good programmers aren't lazy. Good programmers take pride in their craft and are eager to learn everything about the language or environment they are working in. Good programmers benefit tremendously from good abstraction. Bad programmers use abstraction as a crutch to get things done, then complain when something breaks that they can't figure out.
One last point about abstraction. I have seen people take abstraction too far. There is a point at which abstracting any further handcuffs the programmer and creates too much overhead to make it worth its while. I prefer to implement an abstraction only when I feel the cost of not implementing the abstraction is higher than implementing the abstraction.
Programming is an art, some are good some are not. Everyone has their own style that they prefer to work with and what works for some can fail miserably for others. I fortunately work by myself most of the time so I have the freedom to almost always work in my world; so all is good.
Remember just because some people are out there abusing abstraction, as if it ever did anything to them, doesn't mean that abstraction is bad and we should never utilize that technique ever again.
-- LoRider
Programming By Coincidence/Evil Wizards
by
Mannerism
·
· Score: 2
Good article. I'm reminded of a couple of bits of wisdom from The Pragmatic Programmer. The authors describe a condition they call "Programming By Coincidence", which occurs when a programmer somehow comes up with code that works, but doesn't understand the system well enough to be able to explain why it works, and is therefore unable to properly maintain or debug his code. Frequently, I think, it is the underlying abstractions that are misunderstood by the programmer.
The authors also refer to "Evil Wizards", which are code generators used by programmers who mindlessly employ the generated code, completely without understanding it. Personally, I consider various frameworks and libraries to be potentially "evil" in this sense as well.
Damn straight! One of the biggest problems with Java is how types are arbitrarily built-in or not. I'm tired of wrapping my ints in Integers just so I can use them in collections and then getting their value when I need to do some math. Mind you, there'd be no problem if operators were overloaded so that I could just use Integer from start to finish. But of course with a built-in String class, there's less pressure for Java to support operator overloading (just as non-native strings encouraged the creation of C++ templates).
The next generation of Java will support generics, and I'd like to see Java do generics justice. Meaning, I'd like to see Java use its position as a new language with a significant runtime to make C++ generics better. For example, Java generics could store a type-independent representation of code and automatically instantiate templates at runtime, so dynamically loaded modules could make better use of generic functions. I'd like to see them actually make Java do something C++ can't, instead of adding another long list of features C++ gives you permission to do and Java doesn't.
-- A deep unwavering belief is a sure sign you're missing something...
A whine? I took it as a warning: even if you don't use the layers underneath your favorite abstraction, learning how they work is not optional, and people who haven't are not going to be able to solve all their own problems.
As he argues, "paradoxically [...] becoming a proficient programmer is getting harder and harder," though posing as one and being able to get some stuff to vaguely work is getting easier--the frequency at which proficient programmers are needed is going down.
> I can be pretty sure that in my career I will never be required to develop in assembler.
Well if all you do is write Excel macros, that's probably true...
Now take ActiveX . . . please!
by
Latent+Heat
·
· Score: 2
ActiveX is an important abstraction in the Windows world, and boy is it a leaky abstraction if there ever was one.
I desperately want to know how ActiveX works at the low level, but Microsoft either doesn't want to explain it or won't explain it. Yeah, yeah, every Microsoft Press book starts with an explanation of IUnknown -- QueryInterface, AddRef, and Release -- but once they cover the basic mechanics of vtable interfaces the explanation is that some kind of miracle occurs and then you have ActiveX containers and objects in their full glory, and don't worry your pretty little head about how to roll your own, simply use the wizards (gosh how I hate that term for automatic code generation that obfuscates).
Maybe Microsoft doesn't encourage going inside their abstraction because internally they are so baroque (i.e. ornate to no known functional purpose) or just plain broke. And yes, if you want to do one tiny little thing that is outside the scope (or perhaps the documented scope) of that abstraction, one spends hours banging one's head against the wall with Google and MSDN searches and software code experiments.
Re:"leaky abstractions" my foot.
by
J.+Random+Software
·
· Score: 3, Insightful
The abstraction is a reliable byte stream, which of course isn't really possible due to phenomena that can only be affected by interfaces beneath TCP. A leak that's documented is still a leak.
Small nitpick: East Africa vs West Africa
by
error0x100
·
· Score: 2
The reliability of TCP is why every exciting email from embezzling East Africans arrives in letter-perfect condition
I assume this is a reference to the abundant "Nigeria scams", however, if you look on a map, you will see that Nigeria is actually in so-called "West Africa".
I just thought of another reason why abstractions are important. Take the famous MP3 -> Ogg conversion. Semantically, many aspects of the two types don't match, so the resulting conversion is less than perfect. Well, the Brain -> Machine Code conversion is something more akin to an MP3 -> PDF scenario. And for that reason, programming in a low level language (for scopes big enough to trash the mental cache, so to speak) has a similar conversion problem. Human beings think at an extremely high level, and can manipulate thousand-layer abstractions with ease (think of, for example, the abstraction of a 'cat'). Even something considered high-level in the programming world (objects and generic algorithms) is low-level to human beings. The ideal abstraction would be, of course, natural language. Human beings think in natural language, and translating thoughts into natural language is trivially easy. For example, I'm writing a linker at the moment. I'd just like to be able to say:
Sort sections by name. Remove all duplicate sections that have the LINK_ONCE parameter set. Merge all other duplicate sections. For each relocation in the relocation table, find the corresponding symbol and update the necessary address. Merge string tables, and update all string references. Write sections to this file.
There we go, six "lines of code." Very close mapping between my thoughts and the code. Assuming the compiler carries out my instructions faithfully, what are the chances of a bug in this code? Nearly zero. It's quite easy to verify that six lines of code is logically correct. Verifying that the thousands of lines of C++ that this translates to is both logically and semantically correct is much, much, harder.
-- A deep unwavering belief is a sure sign you're missing something...
... machine code itself is an abstraction in the first place. This is especially true for modern processors that reorder instructions, execute them in parallel, and in extreme cases convert them into an entirely different instruction set.
Says that all abstractions are ultimately leaky because you can never construct a logical system that is complete and consistent (but get close enough for government work). Program abstractions usually leak long before they reach Goedel's limit.
By comparison, there is another method of transmitting data called IP which is unreliable.
My point, likely poorly made due to crankiness, was that TCP and IP are at different layers of the protocol stack, and TCP in fact assumes IP, and therefore you can't compare them. I do understand the analogy he was trying to make to abstraction layers, I was just trying to say that his language was tweaking the hell out of me.
Everything is an abstraction
by
visionsofmcskill
·
· Score: 2, Funny
Leaky abstractions....?
oh come on.... Nothing is foolproof... every element of computing is an abstraction of something else.... Even assembly code is an abstraction of processor commands...
Not trying to insult the guy, he wrote a great article... but every time you use any command in any language you're using an abstraction of many more commands... all we do in coding is build yet more tools not in any way dissimilar from the ones we're already using... Classes and functions, even straightly scripted programs.
My sense of smell is an abstraction from my nose's ability to respond to chemicals... So when i smell shit i suppose that's a leaky abstraction because i don't want to smell it....
-- --Idiots, Every single one of YOU, A flaming mass of conglomerated morons, hey wait a second, isnt that how RAID works?
Supposed to be informative?
by
naasking
·
· Score: 2
It's a good article for beginner programmers who are still trying to wrap their brains around the concepts, but it's hardly a new idea. Unfortunately, the author doesn't name the phenomenon for what it is; it's not a problem with abstractions themselves, but is a result of logic and algorithms upon which programming is founded. Abstractions are based on assumptions, and when those assumptions are not true, the abstraction breaks. Take his example of TCP breaking when the network cable is unplugged: since TCP assumes the destination is reachable, it will obviously break when this assumption is violated. His "law of leaky abstractions" is simply the "law of violated assumptions" (see? I can come up with fancy labels too).
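A small Python sketch of the "violated assumption" idea, with loud caveats: it uses a local `socket.socketpair()` as a stand-in for a TCP connection rather than a real network, and closing one end stands in for the unplugged cable. The function name `send_after_peer_closed` is made up for this example, and the exact error (or whether the first write fails at all) varies by platform.

```python
import socket


def send_after_peer_closed():
    """Return the exception raised when writing to a vanished peer, else None."""
    ours, theirs = socket.socketpair()  # connected stream sockets, local only
    try:
        ours.sendall(b"hello")          # assumption holds: the peer exists
        theirs.recv(16)                 # the peer reads the bytes normally

        theirs.close()                  # assumption violated: the peer is gone
        try:
            ours.sendall(b"anyone there?")
        except OSError as exc:          # e.g. BrokenPipeError on Linux
            return exc
        return None                     # some platforms only fail on a later write
    finally:
        ours.close()


print(repr(send_after_peer_closed()))
```

The stream interface gives no hint that the assumption can fail until it does, at which point the caller has to deal with an error that comes from the layer below.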
This is why learning C with plain old vi and gcc is a good idea: You learn the basics first and then you can advance to an IDE later. Actually there's a saying for this: "You have to crawl before you can walk"
Re:This is true of almost any engineering professi
by
Mac+Degger
·
· Score: 2
Now this is absolutely true...and the cool thing is, the analogy between (mechanical) engineering and software holds true. In a mechanical system, you NEVER, I repeat NEVER get an absolute description; a good engineer knows to what level he can simplify (ie detail the mechanical efficiency, do some thermal calculations, then leave out the air friction because the effect is minimal and can be left out [in this case...in another situation it could well be the major factor] without dramatically changing the final strength/elasticity calculations).
A slightly less proficient engineer will calculate the system, together with the next layer of the system (ie he will calculate the air friction), then discard it [or incorporate since he did the work anyway] because he found out the effect is not significant.
It seems the same thing goes with software development; you abstract to the proper level...but you need to know what level you can safely stop at.
-- --
Waht? Tehr's a preveiw buottn?
Exactly why schools fail to teach computer program
by
xtronics
·
· Score: 2
I am one of the lucky guys who learned programming in machine code first - I actually hand assembled an interrupt driven program that would put the time on a memory mapped display back in 1979.
The schools mistakenly start out with a high level language. They for some reason think no one needs to get good at assembly language. (I would not advise learning Intel assembly first. It is a very poor design for humans to use, one of the reasons it took the Intel world 10 years to catch up to the Amiga's OS, but alas Intel is the standard.)
I also moved on to high-level programming - and wrote programs in both assembler and C - compiled them and counted machine cycles. Once you see where the compilers fail to produce good code, you can compensate where necessary.
A weird trend has happened where I find it most effective to write in the highest or lowest level and avoid anything in between.
Strangely enough, I was discussing this with a co-worker yesterday. We joked about the silly programmers who return pointers to local variables, or think "char *" is the same as "string".
char *s; strcat(s, "Mary had a little lamb");
The result of the discussion was that everyone needs to understand at least one level down from the level of abstraction at which they will be working. For example, someone using sockets for TCP/IP will probably have a very hard time understanding what is going on unless he knows the general concepts of TCP. A C programmer really needs to understand a little bit about how the C code is going to look and work at the assembly level or she will never understand why the code above crashes.
In addition to providing the details needed to make sense of the higher-level abstraction layer, a basic understanding of the lower-level abstraction layer allows a good programmer to make educated guesses about potential trouble spots -- performance traps or feature limitations.
That is why I have always advocated an understanding of assembly language, computer architecture, TCP/IP protocols, etc. If you know TCP/IP, you can probably figure out what is wrong with your sockets code much more quickly. If you know how the Ethernet works, you will have much more success trying to wire your apartment. You might not technically need that knowledge, but it helps to be able to guess the answer to "can I just get better shielding to make the ethernet signal go farther?" rather than look it up.
-- Time flies like an arrow. Fruit flies like a banana.
how much you abstract. Abstractions exist everywhere; there's no getting away from that. We can't reinvent the wheel every time we go out for a ride on our bicycle. However, sometimes abstractions go too far and make the process slow and unoptimized. That's the difference between an experienced programmer and a rookie: the experienced one knows where to take advantage of an abstraction, and where to go under the hood and get dirty.
Joel, an extreme lamer...
by
Alex+Belits
·
· Score: 2
...every so often discovers some minor details of things that he was supposed to know BEFORE starting any software-related work, and publishes his discoveries on his site with large amount of drivel, usually inspired by his troubled childhood at Microsoft.
-- Contrary to the popular belief, there indeed is no God.
100 KLOC in assembly buys you ... what?
by
duck_prime
·
· Score: 2
Ah, but you are wrong, and I'm speaking as someone who has written over 100,000 lines of assembly code
God, not another guy who's written hello_world.asm.;)
The subjects he lists, history, literature, languages, politics, economics, and arts, science, mathematics, and engineering: why would you disagree that it's possible for someone to obtain competency in these areas?
I believe you misunderstood me.
The point was not to obtain competence in these areas. It was to obtain MASTERY of ALL AREAS of human knowledge SIMULTANEOUSLY.
No longer possible.
(Of course it's even harder when the school systems don't even teach their students to SPELL it correctly. B-) )
-- Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Re:Non-leaky abstractions are a power issue
by
Animats
·
· Score: 2
It's not about correctness. It's about power.
In classical aerospace procurement, the specification rules. It's quite common to order the same part with the same specification from different suppliers. The units supplied must meet specified tests and be functionally interchangeable.
Software companies hate that. There were screams when DoD tried to force that on Ada compiler vendors.
There was a test suite, acceptance testing, and some compilers flunked. DoD contractors couldn't use those compilers.
Software companies aren't used to getting products returned with a big red tag marked REJECTED - Does not conform to specification. And it doesn't end there. There's a follow-up from the quality department questioning their qualifications to continue as a supplier. Then there's interchange of quality data between DoD buyers that puts the vendor on record as having quality problems. In the aerospace world, there's an organized process for hammering on vendors until their stuff works, or until the vendor gets replaced.
DoD lost that battle in software, though. That approach worked great at keeping aerospace machine shops from getting sloppy. But DoD doesn't buy enough software or computers any more. There was a time when DoD had real influence in computing (the Internet and BSD UNIX were both DoD-funded projects). But that influence declined in the 1980s as the PC sector, where consumers are weak, grew.
And that's what this is all about. Nobody today is in the position to say to Microsoft "Make it work right or we pull the plug and you go out of business". Other vendors have picked up on this, and shipping crap has become a trade custom. Since consumers tend to buy on features rather than quality, we're not seeing pushback on this.
That's why our abstractions leak. Nobody is pushing back hard enough to insist that they don't.
We know how to do it - overdesign inside, heavy testing, clear specifications, a willingness to reject for noncompliance, and the power to make that rejection stick. But nobody today is in a position to do that except for some very specialized procurements.
I looked down in surprise to find a leaky abstraction. Had to change pants.
Because the first step to solving any problem is always to create more problems.
-E
http://almostsmart.com
Although I used to program as a hobby, my eyes bugged out when I saw this article. It's actually quite interesting; I finally realize why the hell people program in lower level languages.
One point that I think could be addressed is backward compatibility. I really know nothing about this, but don't the versions of the abstractions have to be fairly compatible with each other, especially on a large, distributed system? This extra abstraction of an abstraction has to be orders of magnitude more leaky. The best example I can think of is Windows.
I'm of the opinion that high-level tools and high-level abstraction, coupled with encapsulation, are the biggest bane of the software industry. We have these high-level tools which most programmers really don't understand, and they are taught that they don't need to understand them in order to build these sophisticated products.
Yet when something goes wrong with the underlying technology, they are unable to properly fix their product, because all they know is some basic Java or VB and they don't understand anything about sockets or big-endian/little-endian byte-order issues. It's no wonder today's software is huge and slow and doesn't work as advertised.
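To make the byte-order point concrete, here is a minimal Python sketch (not part of the original comment; the value 0x01020304 is just an arbitrary example). It shows the same 32-bit integer laid out big-endian and little-endian, and what you get if bytes written one way are read the other way:

```python
import struct

value = 0x01020304  # arbitrary 32-bit value, chosen only for illustration

big = struct.pack(">I", value)     # big-endian ("network") byte order
little = struct.pack("<I", value)  # little-endian, as used on x86

print(big.hex())     # 01020304
print(little.hex())  # 04030201

# Reading big-endian bytes as if they were little-endian raises no error;
# it just quietly produces a different number.
misread = struct.unpack("<I", big)[0]
print(hex(misread))  # 0x4030201
```

Nothing crashes here, which is the nasty part: the high-level code looks fine and the wrong number only shows up later.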
The one shining example of this is FreeBSD, which is based totally on low level C programs and they stress using legacy program methodologies in place of the fancy schmancy new ones which are faulty. The proof is in the pudding, as they say, when you look at the speed and quality of FreeBSD, as opposed to some of the slow ponderous OS's like Windows XP or Mac OSX.
Warmest regards,
--Jack
Wagner LLC Consulting Co. - Getting it right the first time
Well, I wouldn't say that it's reliable "because there are timeouts". AAMOF, timeouts just complicate things. So you time out waiting for packet N, you request a resend of it, and in the interim, guess what, packet N shows up, and now you have two N's. Your code is now more complex in having to deal with this situation. Timeouts are just another parameter used to adjust the behaviour of the algorithms that control the protocol. Getting deterministic results from a nondeterministic foundation involves making observations, accepting some compromises, making some simplifying assumptions, and then writing code that takes all those things into account to come up with something that usually works.
How is this news? All technologies, on some level, are inherently unreliable. Therefore, reliability is always obtained by adding some kind of redundancy to an unreliable tool.
I've never seen a technology touted as "reliable" that achieved that reliability without some kind of self-checking or redundancy somewhere. Maybe that's the author's point, but he makes it sound as if TCP/IP is unique in this regard.
This is what programming is all about. It seems pretty obvious to me.
And the men who hold high places must be the ones who start
To mold a new reality... closer to the heart
"Reliable" means "always works", it doesn't mean "always obeys the spec". (Unless you use a circular definition)
A timeout is a legal result by the TCP specification, but it's not reliable, because your data didn't make it through.
By the IP specification, your data might not make it either- and that's a legal result because the spec allows it to drop packets for any reason at all. That doesn't mean IP is reliable, just that it obeys its own definition.
Of course, no real protocol can ever meet this restrictive definition of reliable. Some maniac can always cut through your wires or incinerate your CPUs. Calling TCP a "reliable protocol" is just a shorthand for "as much more reliable than the underlying protocols as we could manage"
The timeout you mention does make TCP more reliable than IP, because it alerts you to the data loss, where the application can possibly take steps to retransmit it sometime in the future.
But it's not as if TCP could ever achieve the perfect reliability that the simplest, most abstract description of it would imply. Which is why, as the author says, those who rely on the abstractions can get bitten later.
Is our own bodies.
I'm studying to be a bioinformatics guy at the University of Melbourne and have just had the misfortune of looking into the enzymatic reactions that control oxygen-based metabolism in the human body.
I tried to do a worst case complexity analysis and gave up about halfway through the Krebs cycle.
When you think about it, most of basic science, some religion and all of medicine have been about removing layers of abstraction to try and fix things when they go wrong.
...to start with, or at least be competent with, the basics.
Any good programmer I've ever known started with the lower level stuff and was successful for this reason. Or at least plowed hard into the lower level stuff and learned it well when the time came, but the first scenario is preferable.
Throwing Dreamweaver in some HTML kiddie's lap, as much as I love Dreamweaver, is not going to get you a reliable Internet DB app.
vk.
The mechanical, electrical, chemical, etc, engineering fields all have various degrees of abstractions via object hiding. It just isn't called "object hiding" because these are in fact real objects and there is no need to call them objects because it is natural to think of them that way. When debugging a design in any of these fields, it is not unusual to have to strip down layers and layers of "abstraction" (ie, pry into physical objects) to get to the bottom of a real tough problem. Those engineers with the broadest skills are usually the best at dealing with such problems. There isn't really anything new in the article.
Great article, but don't throw out the high level tools and go back to coding Assembler.
Brevity is the soul of wit
-- Polonius
Unfortunately, his Slashdotted server is proving that to us right now.
Maybe someone out there prefers to program without any abstraction layers at all, but they inherit so much complexity that it will be impossible for them to deliver a meaningful product in a reasonable time.
As a VB programmer, I've *lived* leaky abstractions. Nowhere has it been more obvious than in the gigantic VB app our team is responsible for maintaining. 262 .frm files, 36 .bas modules, 25 .cls classes, and a handful of .ctl's.
Much of our troubles, though, come from a single abstraction leak: the Sheridan (now called Infragistics) Grid control.
Like most VB controls, the Sheridan Grid is designed to be a drop-in, no-code way to display database information. It's designed to be bound to a data control, which itself is a drop-in no-code connection to a database using ODBC (or whatever the flavor of the month happens to be).
The first leak comes in to play because we don't use the data control. We generate SQL on the fly because we need to do things with our queries that go beyond the capabilities of the control, and we don't save to the database until the client clicks "OK". Right away, we've broken the Sheridan Grid's paradigm, and the abstraction started to leak. So we put in buckets -- bucketfuls of code in obscure control events to buffer up changes to be written when the form closes.
Just when things were running smoothly, Sheridan decided to take that kid with his finger in the dike and send him to an orphanage. They "upgraded" the control. The upgrade was designed to make the control more efficient, of course... but we don't use the data control! It completely broke all our code. Every single grid control in the application -- at least one and usually more in each of 200+ forms -- had to have all-new buckets installed to catch the leaks.
You may be wondering by now why we haven't switched to a better grid control. Sure enough, there are controls out there now that would meet 95% of our needs... but 1) that 5% has high client visibility and 2) the rest of the code works, by golly! No way we're going to rip it out unless we're absolutely forced to.
By the way, our application now compiles to a svelte 16.9 MEG...
Stressed? Me? Of course not. Stress is what a rubber band feels before it breaks, silly.
This is one of the best essays on software engineering I've read in a while. As a programmer and CS educator, it's really served to crystallize for me why (a) it seems so much harder for students to learn programming these days, and (b) why I've grown unhappy over the years with the series of new engineering paradigms that are in use. Extremely helpful for putting my own thoughts in order.
The law statement itself, "all non-trivial abstractions, to some degree, are leaky" may possibly get included in my personal "top 10" aphorisms manifesto.
We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
Yes, it would be nice to get back to 'first principles' and address machine resources directly, but it's impossible to deliver a product to the marketplace in a meaningful timeframe using this method, particularly when Moore's law blurs the gains anyway - crap runs fast enough.
Fine, it's reliable because of acks, timeouts, adaptive retransmit timeouts that take statistical averages of RTTs, exponential back-off and slow start, window acks which keep track of what bytes are received, etc.
So in your case of timing out on N, retransmitting N, and then getting the response to the first N back after sending the second N, you do the following:
1) Good! You got your packet!
2) Keep track of how many bytes you have received thus far (TCP is not sending messages, it is sending a stream).
3) When you get the response from your second request, discard it, because you already received those bytes from the stream.
4) Since you timed out, DON'T use the round-trip time for that response: slow down your expected RTT, and THEN start measuring.
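The steps above can be sketched in a few lines of Python. This is a toy model under loud assumptions, not how any real TCP stack is written: the class names `StreamReceiver` and `RttEstimator` are made up for illustration, out-of-order segments are simply dropped rather than buffered, and the timeout formula is far cruder than the real one. Skipping RTT samples from retransmitted segments is the idea usually known as Karn's algorithm.

```python
class StreamReceiver:
    """Toy receiver: counts bytes of the stream accepted so far (steps 2 and 3)."""

    def __init__(self):
        self.next_offset = 0      # first byte offset not yet received
        self.data = bytearray()

    def on_segment(self, offset, payload):
        """Accept only new bytes; return how many bytes were accepted."""
        end = offset + len(payload)
        if end <= self.next_offset:
            return 0              # pure duplicate, e.g. the second copy of N
        if offset > self.next_offset:
            return 0              # gap: a real stack would buffer, the toy drops it
        fresh = payload[self.next_offset - offset:]
        self.data += fresh
        self.next_offset = end
        return len(fresh)


class RttEstimator:
    """Smoothed RTT that ignores samples from retransmitted segments (step 4)."""

    def __init__(self, initial_rto=1.0, alpha=0.125):
        self.srtt = None          # smoothed round-trip time, unknown at first
        self.rto = initial_rto    # current retransmission timeout, in seconds
        self.alpha = alpha

    def on_timeout(self):
        self.rto *= 2             # exponential back-off

    def on_ack(self, sample, retransmitted):
        if retransmitted:
            return                # ambiguous: which copy is being acked?
        if self.srtt is None:
            self.srtt = sample
        else:
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
        self.rto = 2 * self.srtt  # simplistic; real TCP also tracks variance


rx = StreamReceiver()
rx.on_segment(0, b"hello ")       # accepted
rx.on_segment(6, b"world")        # accepted
rx.on_segment(6, b"world")        # late duplicate, discarded
print(bytes(rx.data))             # b'hello world'
```

The receiver never sees "messages", only offsets into the stream, which is why the duplicate is trivial to throw away.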
And guess what? If I unplug the NIC of the other machine, there is no reliable way of transmitting that data (assuming your destination machine isn't dual homed)- so I keep streaming bytes to a TCP socket and I don't find out my peer is gone for approx. 2 minutes.
WOW. There's nothing reliable about that boundary condition!
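One common way applications tighten that boundary is to set their own timeouts instead of waiting for the stack to give up. A minimal Python sketch, assuming a plain TCP client socket; the helper name `make_watchful_socket` and the 5-second default are invented for illustration:

```python
import socket

def make_watchful_socket(read_timeout=5.0):
    """Create a TCP socket that will not block forever on a silent peer."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the OS to probe idle connections. The default idle time before the
    # first probe is OS-dependent and often measured in hours, so on its own
    # this rarely detects a dead peer quickly.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Application-level limit: recv() raises a timeout error after this long.
    sock.settimeout(read_timeout)
    return sock
```

Note that a read timeout only tells you the peer has been quiet, not that it is gone, so the application still has to decide what silence means. The leak doesn't go away; you just choose where to catch it.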
My point is TCP is reliable ENOUGH. But I wouldn't equate it with a Maytag warranty. It is not a panacea. In fact, for a closed homogeneous network I wouldn't even consider it the best option. But if the boundary conditions fall within the acceptable fudge range (remember, real-time human-grade systems are not 100% reliable, only 99.99999%, and much of that is achieved through redundancy), your leaks are OK.
In the future, I would want to not be isolated from my friends in the Space Station.
I think the term "ooze" would suit better in this case. It has a kind of dirtiness to it, and the feeling the word 'ooze' gives me fits well with the problem described. :o)
Back to the article. To be serious, I think that Joel mixed all kinds of things together as examples of 'leaky abstraction' to no purpose. The situations are too different, and that makes the concept fall apart. Here is what I mean:
In the case of TCP/IP it denotes the limits of the abstraction. And regardless of background, every sane programmer should know that those limits exist.
In the case of page faults it's a matter of competence; there is no abstraction at all. You either know how your code is compiled and executed or you don't. It's the same as knowing, or not knowing, what a phrase in a given language really means. I simplify here.
In the case of C++ strings I saw the only good example. What the experience of using the STL and the string class tells us, in my opinion, is this: one should fully understand the underlying mechanics before relying on the abstraction's behaviour.
In programming it is really simple to tell whether a given 'abstraction' will present you with an easter egg or not: if you can imagine an FSM for the abstraction, you will definitely know when to use it.
I'm not a brake. I'm an accelerator. Just a slow one...
> Huh? You can't retransmit cabbages or actors or hard copies of badly researched
> essays . . . but you can retransmit freaking TCP packets!
He does go on to say that "retransmission" in this case means sending an identical twin of the actor. Your criticism really isn't fair in ignoring this. The metaphor of identical twins for copies of information is not perfect, but as a teaching tool for explaining TCP to a non-technical type, I'm not sure I could come up with another metaphor which matches this one for its intuitive value.
Joel should write an article about his leaky hosting company... or maybe his leaky colo-ed box.
Since I can't get to the site and read the article, I'll tell some jokes.
"Leaky Abstractions?! Is this guy talking about Proctology??"
"Leaky Abstractions?! Someone get this guy a plumber!"
"Leaky Abstractions?! I knew we should have used the pill!"
-gerbik
The market rewards abstractions because they help create high level tools that get products on the market faster. Classic case in point is WordPerfect. They couldn't get their early assembler-based product out on a competitive schedule with Word or other C based programs.
I don't think you understand TCP; you don't request a resend. TCP does that for you. If you time out, it means the connection is broken. Feel free to try again later, but trying again means opening a new connection, and no packets from an old connection will confuse the new one.
Isn't "leaky abstraction" a leaky abstraction of the leaky abstractions?
-... ---
For something like IP packets, leaky is acceptable, but for many of those other abstractions, constipated might be a better adjective. Some of the tools and technologies out there (remember 4GL report-writers?) were big clogging masses that just won't pass.
The first thing I do when I start in on a new technology (VBA, CGI, ASP, whatever) is to start digging in the corners and see where the model starts breaking down.
What first turned me on to Perl (I'm trying hard not to flamebait here) was the statement that the easy things should be easy, and the hard things possible.
But even Perl's abstraction of scalars could use a little fiber to move through the system. Turn strict and warnings on, and suddenly your "strings when you need 'em" stop being quite so flexible, and you start worrying about when it's really got something in it or not.
On the HTML coding model breaking down, my current least-fave is checkboxes: if unchecked, they don't return a value to the server in the query, making it hard to determine whether the user is coming at the script the first time and there's no value, or just didn't select a value.
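A workaround people often use for the checkbox problem is a hidden field that is always submitted with the form, so the script can tell "first visit" from "submitted but unchecked". Here is a minimal Python sketch of the server side; the field names `subscribe` and `form_sent` are hypothetical, and the hidden-field trick is my assumption about a fix, not something the comment above proposes:

```python
from urllib.parse import parse_qs

def read_checkbox(query, name="subscribe", marker="form_sent"):
    """Return None on a first visit, otherwise True/False for the checkbox."""
    fields = parse_qs(query)
    if marker not in fields:
        return None           # the form was never submitted
    return name in fields     # submitted: the field is present only if checked

print(read_checkbox(""))                          # None
print(read_checkbox("form_sent=1"))               # False
print(read_checkbox("form_sent=1&subscribe=on"))  # True
```

The query string alone still can't distinguish the two cases; the extra marker field is what patches the hole.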
Then there's always "This space intentionally left blank.*" Which I always footnote with "*...or it would have been if not for this notice." Sure sign of needing more regularity in your diet.
Design for Use, not Construction!
Sure, the author points out a few examples of leaky abstractions. But his conclusion seems to be that you always will have to know what is behind the abstraction.
I don't think that's true. It depends on how the abstraction is defined, what it claims to be.
You can use TCP without knowing how the internals work, and assume that all data will be reliably delivered, _unless_ the connection is broken. That is a better abstraction.
And the virtual memory abstraction doesn't say that all memory accesses are guaranteed to take the same amount of time, so I don't consider it to be leaky.
So I don't entirely agree with the author's conclusions.
No, calling TCP reliable means that it provides certain features you can rely on:
1. Data either arrives intact or doesn't arrive at all.
2. Data always arrives in the order it was sent.
3. Data is never duplicated.
You can safely rely on all of these facts. Calling TCP a guarantee that your packets arrive is leaky, but TCP doesn't claim that.
So if you don't abstract TCP more than you are meant to, it is a non-leaky abstraction.
Proper abstractions avoid unintended side-effects by presenting a clean view of the intent and function of a given interface, and not just a collection of methods or structures.
When I read what Joel wrote about "leaky abstractions" I saw a piece complaining about "unintended side-effects". I don't think the problem is with abstractions themselves, but rather the implementation.
He lists some examples:
1. TCP - This is a common one. Not only does TCP itself have peculiar behavior in less than ideal conditions, but it is also interfaced with via sockets, which compound the problem with an overly complex API.
If you were to improve on this and present a clean reliable stream transport abstraction, it would likely have a simple connection establishment interface and some simple read/write functionality. Errors would be propagated up to a user via exceptions or event handlers. But the point I want to make is that this problem can be solved with a cleaner abstraction.
2. SQL - This example is a straw man. The problem with SQL is not the abstraction it provides, but the complexity of dealing with unknown table sizes when you are trying to write fast generic queries. There is no way to ensure that a query runs fastest on all systems. Every system and environment is going to have different amounts and types of data. The amount of data in a table, the way it is indexed, and the relationship between records is what determines a query's speed. There will always be manual performance tweaking of truly complex SQL simply because every scenario is different and the best solution will vary.
3. C++ string classes. I think this is another straw man. Templates and pointers in C++ are hard. That is all there is to it. Most Visual Basic only coders will not be able to wrap their minds around the logic that is required to write complex C++ template code. No matter how good the abstractions get in C++, you will always have pointers, templates, and complexity. Sorry Joel, your VB coders are going to have to avoid C++ forever. There is simply no way around it. This abstraction was never meant to make things simple enough for Joe Programmer, but rather to provide an extensible, flexible tool for the programmer to use when dealing with string data. Most of the time this is simpler, sometimes it is more complex (try writing your own derived string class - there are a number of required constructors you must implement which are far from obvious) but the end result is that you have a flexible tool, not a leaky abstraction.
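On the SQL item: the claim that indexing, not the query text, decides how a query runs is easy to see for yourself. A small sketch using Python's built-in sqlite3 module, with a made-up `orders` table and index name (both assumptions for illustration). The same SELECT is planned before and after an index is created:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"cust{i % 100}", i * 1.5) for i in range(1000)],
)

query = "SELECT total FROM orders WHERE customer = ?"

def plan(sql):
    """Return SQLite's plan for the query as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("cust7",)).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the description

before = plan(query)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
after = plan(query)

print("without index:", before)  # a full scan of the table
print("with index:   ", after)   # a search using idx_orders_customer
```

The exact wording of the plan varies between SQLite versions, but the query string never changes while the plan does. Whether you call that a leak or just the nature of data is exactly what this thread is arguing about.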
There are some other examples, but you see the point. I think Joel has a good idea brewing regarding abstractions, complexity, and managing dependencies and unintended side-effects, but I do not think the problem is anywhere near as clear cut as he presents. As a discipline software engineering has a horrible track record of implementing arcane and overly complex abstractions for network programming (sockets and XTI) generic programming (templates, ref counting, custom allocators) and even operating systems API's (POSIX).
Until we can leave behind all of the cruft and failed experiments of the past, start new with complete and simple abstractions that do not mask behavior, but rather recognize it and provide a mechanism to handle it gracefully, we will run into these problems.
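As a rough idea of what such a stream abstraction might look like, here is a minimal Python sketch layered over ordinary sockets: connect, read, write, and a single exception for every way the connection can fail. The names `Stream` and `StreamBroken` are invented for this example and it is only one possible shape, not a proposal from the article.

```python
import socket

class StreamBroken(Exception):
    """The connection can no longer deliver data (timeout, reset, peer closed)."""

class Stream:
    def __init__(self, sock):
        self._sock = sock

    @classmethod
    def connect(cls, host, port, timeout=10.0):
        try:
            return cls(socket.create_connection((host, port), timeout=timeout))
        except OSError as exc:
            raise StreamBroken(str(exc)) from exc

    def write(self, data):
        try:
            self._sock.sendall(data)
        except OSError as exc:
            raise StreamBroken(str(exc)) from exc

    def read(self, n):
        """Return exactly n bytes, or raise StreamBroken."""
        chunks = []
        remaining = n
        while remaining:
            try:
                chunk = self._sock.recv(remaining)
            except OSError as exc:  # includes socket timeouts
                raise StreamBroken(str(exc)) from exc
            if not chunk:
                raise StreamBroken("peer closed the connection")
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)

    def close(self):
        self._sock.close()
```

The failure modes are still there; the wrapper just admits they exist and funnels them into one place, which is about as far as "recognize it and handle it gracefully" can go.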
Luckily, such problems are fixable - just write the code. If Joel were right and complex abstractions were fundamentally flawed, that would be a dark picture indeed for the future of software engineering (it is only going to grow ever more complex from here, kids - make no mistake about it).
I think the key point is that this article is an abstraction (and a leaky one at that) of the truth :)
---- Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"
The problem that this article points to is a byproduct of large scale software development primarily being an exercise in complexity management. Abstraction is the foremost tool available in order to reduce complexity.
In practice a person can keep track of between 4 and 11 different concepts at a time. The median lands around 5 or 6. If you want to do a self-experiment, have someone write down a list of twenty words, then spend 30 seconds looking at them, without using mnemonic devices such as anagrams to memorize them, and put the list away. After thirty more seconds, write down as many as you can recall.
This rule applies equally when attempting to manage a piece of software - you can only really keep track of between 4 and 11 "things" at the same time, so the most common practice is to abstract away complexity - you reduce an array of characters terminated by a null character and a set of functions designed to operate on that array to a String. You went from half a dozen functions, a group of data pieces, and a pointer to a single concept - freeing up slots to pay attention to something else.
The article is completely correct in its thesis that abstractions gloss over details and hide problems - they are designed to. Those details will stop you from being productive because the complexity in the project will rapidly outweigh your ability to pay attention to it.
This range of attention sneaks into quite a few places in software development:
Other schemes exist for managing complexity, but abstraction is decidedly human - you don't open a door, rotate, sit down backwards, rotate again, bend legs, position your feet, extend left arm, grasp door, pull door shut, insert key in ignition, extend right arm above left shoulder, grasp seatbelt, etc... you start the car. Software development is no different.
There exist people that can track vast amounts of information in their heads at one time - look at Emacs - iirc RMS famously wrote it as he did because he could keep track of what everything did; no one else can, though. There also exist mnemonic devices aside from abstraction for managing complexity - naming conventions, taxonomies, making notes, etc.
-Frums
Joel Spolsky often grates on me (especially when he falls into, "here's how Microsoft solved the problem with near infinite access to manpower, so clearly you should do the same thing."), but this article really rang true. People might also be interested in a similar article published in 1998 on Salon, "The dumbing-down of programming." The author comes from a slightly different point of view, but comes to a similar conclusion: we need to be wary of becoming too detached from the low level details.
Search 2010 Gen Con events
Try again with some constructive criticism, not just criticism.
For most of these problems, there's no easy way to really fix the abstraction. The only solution is for users to be aware of the abstractions they depend on, so that they can troubleshoot the underlying foundations if things break down.
The article is constructive in that it spreads a warning to be cautious about relying on abstractions too much, without understanding how they work.
I dont think you understand TCP, you dont request a resend. TCP does that for you,
You misunderstood my statement. It wasn't made from the standpoint of someone using TCP; it was made from the standpoint of TCP itself. If it's looking at the IP packets coming in and realizes one is missing, it will request that the missing packet be resent after some timeout period. TCP then also has to be able to handle the situation when two or more copies of the same packet arrive due to this behaviour.
Error conditions are part of the abstraction TCP provides. The concept of reliability in TCP doesn't mean that it always gets your data across, but that it either does or does not, and you can rely on knowing what happened. This isn't a weakness of TCP, it's a strength.
Unfortunately there are edge conditions in which your data is received but you never receive the ACK(nowledge) and therefore assume that the message was lost. This isn't very common, but if the cat chews the CAT5 at just the right moment it can happen.
We have examples of massively complex systems that work very reliably day-in and day-out. Jet airplanes, for one; the national communications infrastructure, for another. Airplanes are, on the whole, amazingly reliable. The communications infrastructure, on the other hand, suffers numerous small faults, but they're quickly corrected and we go on. Both have some obvious leaky abstractions.
The argument works out to be pessimism, pure and simple -- and unwarranted pessimism to boot. If it were true that things were all that bad, programmers would all _need_ to understand, in gruesome detail, the microarchitectures they're coding to, how instructions are executed, the full intricacies of the compiler, etc. All of these are leaky abstractions from time to time. They'd also need to understand every line of libc, the entire design of X11 top to bottom, and how their disk device driver works. For almost everyone, this simply isn't true. How many web designers, or even communications applications writers, know -- to the specification level -- how TCP/IP works? How many non-commo programmers?
The point is that sometimes you need to know a _little bit_ about the place where the abstraction can leak. You don't need to know the lower layer exhaustively. A truly competent C programmer may need to know a bit about the architecture of their platform (or not -- it's better to write portable code) but they surely do not need to be a competent assembly programmer. A competent web designer may need to know something about HTML, but not the full intricacies of it. And so forth.
Yes, the abstractions leak. Sometimes you get around this by having one person who knows the lower layer inside and out. Sometimes you delve down into the abstraction yourself. And sometimes, you say that, if the form fails because it needs JavaScript and the user turned off JavaScript, it's the user's fault and mandate JavaScript be turned on -- in fact, a _good_ high-level tool would generate defensive code to put a message on the user's screen telling them that, in the absence of JavaScript, things will fail (i.e. the tool itself can save the programmer from the leaky abstraction).
What Ian Malcolm says, when you boil it all down, is that complex systems simply can't work in a sustained fashion. We have numerous examples which disprove the theory. That doesn't mean that we don't need to worry about failure cases, it means we overengineer and build in failsafes and error-correcting logic and so forth. What Joel Spolsky says is that you can't abstract away complexity because the abstractions leak. Again, there are numerous examples where we've done exactly that, and the abstraction has performed perfectly adequately for the vast majority of users. Someone needs to understand the complex part and maintain the abstraction -- the rest of us can get on with what we're doing, which may be just as complex, one layer up. We can, and do, stand on the shoulders of giants all the time -- we don't need to fully understand the giants to make use of their work.
(I didn't read the whole article, so my analysis may be leaky.)
It is true that every abstraction is but an imperfect representation of the concrete things it was abstracted from. It is true, and worth noting, that the degree to which the abstraction breaks down in certain situations can cause large problems.
I would like to nit-pick the Katzian "It's dragging us down" doomsday prediction at the end. Abstraction itself has lifted us, tremendously. The fact that abstraction is not perfect is a limitation to how much it can lift us, true. But it's like taxes to support military spending or highways--yes, they do "drag down" our paychecks, but they also make what we do to get that paycheck possible.
One other concept that doesn't seem to be considered is the fact that, even with the imperfections, there is a very powerful and important benefit to having done the abstraction in the first place. Once people are using the abstraction instead of interfacing with the concrete target directly, then fixing a leak in the abstraction can be done at the abstraction, and everyone, potentially millions or billions, can benefit from it. Yes, there is a cost to rolling that out and ensuring compatibility, etc, but it's an advantage of abstraction that is a powerful tool to deal with inherent "leakiness" of abstracting anything.
Finally, I want to point out one thing that is implied but not stated, and that is how important it is to do your abstraction well. Once you have completed your abstraction, it's likely that thousands or millions will build things on top of it. Code that exists to be used by other code is more important than a one-off script. If you mess it up, you are messing it up for a lot of people. Just something to keep in mind, and something well illustrated by the article.
Liberty uber alles.
agnosticism of implementation is a FEATURE of abstractions...not a bug
It's 10 PM. Do you know if you're un-American?
You have a technical lead to solve sneaky problems that come up, and a bunch of vb/java/.net humps to crank out the business logic. We can't all be Einstein, can we?
love is just extroverted narcissism
Neal Stephenson talks about something similar in In the Beginning was the Command Line. He calls it interface shear; he's specifically referring to the UI as an abstraction (an interesting idea in itself). His take on it was that abstractions are metaphors, and that "interface shear"/"leaky abstractions" occur in regions where the metaphors break down.
Interesting stuff...
So, the metaphor is fundamentally broken, but it is intuitive, which is more important.
-Peter
Neither TCP nor UDP has requested resent packets. If a TCP packet is lost, the sender will not receive an ACK for that packet, and the sender will resend it. This will happen even if the receiver has caught fire. If a UDP packet is lost, it's lost totally.
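To make the "sender resends when no ACK shows up" point concrete, here is a rough Python sketch. It is a toy stop-and-wait loop, not real TCP: `send_reliably`, `channel_delivers` and `max_tries` are names made up for illustration, and the "timeout" is just the loop moving on to the next attempt.

```python
def send_reliably(payload, channel_delivers, max_tries=10):
    """Toy stop-and-wait sender: resend until an ACK arrives or we give up.

    channel_delivers is a hypothetical callable that returns True when a
    packet (data or ACK) survives the trip; it stands in for the network.
    """
    for attempt in range(1, max_tries + 1):
        data_arrived = channel_delivers()
        # The ACK only exists if the data arrived, and it can be lost too.
        ack_arrived = data_arrived and channel_delivers()
        if ack_arrived:
            return attempt  # number of transmissions it took
        # No ACK before the (simulated) timeout, so the sender retransmits.
        # The receiver never asked for anything.
    return None  # gave up: the receiver may well have caught fire


# Data lost once, then an ACK lost once, then everything gets through.
outcomes = iter([False, True, False, True, True])
print(send_reliably(b"hello", lambda: next(outcomes)))  # 3
print(send_reliably(b"hello", lambda: False))           # None
```

The second call is the "receiver has caught fire" case: the sender keeps retransmitting into the void until it runs out of tries, which is exactly where the reliable-delivery abstraction leaks.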
The leaky abstraction is actually a good thing. We understand that as we layer programming code, inherently, the lower-layers are never going to be perfect. Hence, why we "bubble-up" error-handling.
I don't want abstraction to be a universal constant. If problems occur in the abstraction, I want to be able to diagnose that. The fact that the abstraction "leaks" is a good thing. Sometimes a client-server app breaks because the network layer breaks, but we can make modifications and adjustments to fix the problem. If we couldn't, and this "leak" were permanent, we would throw up our hands and quit.
The author somehow is attempting to suggest that this imperfect-abstraction is hurting..something. But where is there perfect abstraction? I have yet to see one: the sky abstracts the universe from my environment, yet I still can get a sunburn.
Even God himself, tried to abstract human beings from evil, and that "leaked" as well.
So we shouldn't define this as "leaky" abstraction, but rather as "flexible" abstraction.
"This isn't a study in computer science, its a study in human behavior"
I think the article is great. And this principle can also be applied to Math. Theorems are much like library function calls. You can use them in your own proofs, without caring about how they are proved, because someone has already taken care of that for you. You prove that the hypotheses are true, and you get a result which is guaranteed to be true.
The problem is that in real Math, you often need a slightly different result, or you cannot prove that the hypotheses are true in your situation. The solution often involves understanding what's "under the hood" in the theorem, so that you can modify the proof a little bit and use it.
Every professional mathematician knows how to prove the theorems that he/she uses. There is no such thing as a "high-level mathematician", that doesn't really know the basics, but only uses sophisticated theorems on top of each other. The same should be true in programming, and this is what the article is about.
The solution? Good education. If anyone wants to be considered a professional programmer, he/she should have a basic understanding of digital electronics, micro-processor design, assembly language (at least one), OS architecture, C, some object oriented language, databases... and should be able to understand the relationship between all those things, because when things go wrong, you may have to go to any of the levels.
It's a lot of things to learn, but there is no other way out. Building software is a difficult task and whoever sells you something else lies.
In physics the abstractions leak. Newton's laws leak like crazy. Einstein's theories leak. Presently there are no fundamental theories in physics which don't leak like crazy when quantum mechanics and gravity interact.
In sports the abstractions leak. That's how we get players like Gretzky and pay a lot of money to watch what they do.
And how about the reason why C++ didn't define a native string type? Because there isn't any way to implement a string class that serves all possible applications. The premise of C++ is not being stuck with someone else's choice on what part of the abstraction should leak. Because C++ doesn't define a native string type, the user is free to replace the default standard string implementation with any other string implementation and have it integrate with the language on an equal footing with the standard string type.
If a language imposes standard abstractions it only takes one abstraction you can't live with to make that choice of language untenable. Which is how C++ has been so successful despite being the worst of all possible languages (except for all the others).
Interesting that this article hit #1 on the blogdex a few days ago, and has since fallen off the chart.
Interesting links there now are Republican commentary from The Onion and a frightening NY Times article on the "virtual, centralized grand database" the new Homeland Security Bill the House just passed would create.
I agree with Joel, but some people seem to be taking it as a call to stop abstracting. That's silly.
Humans form abstractions. That's what we do. If your abstractions are leaking with detrimental consequences, then it could be because the programming language implementation you're using is deficient, not because you shouldn't be abstracting.
Try a high-performance Common Lisp compiler some time. Strong dynamic typing and optional static typing, macros, first class functions, generic-function OO, restartable conditions, first class symbols and package systems make abstraction much easier and less prone to arbitrary decisions and problems that are really:
(i) workarounds for the methods-in-one-class rule of "ordinary" single-dispatch OO
(ii) workarounds for the association of what an object is with the name of the object rather than the object itself (static typing is really saying "this variable can only hold this type of object", dynamic typing is saying "the object is of this type". Some languages mix these issues up, or fail to recognise the distinction.)
(iii) workarounds for the fact that most languages, unlike forth and lisp, are not themselves extensible for new abstractions
(iv) workarounds for the fact that one cannot pass functions as parameters to functions in some languages (doesn't apply to C, thanks to function pointers - here's where the odd fact that low level languages are often easier to form new abstractions in comes in)
(v) workarounds for namespace issues
(vi) workarounds for crappy or nonexistent exception processing
Plus, Common Lisp's incremental compile cycle means faster development, and its defined behaviours for in-place modifications to running programs make it good for high-availability systems.
I thought that TCP used a sliding window to prevent the overhead of having to ack every packet?
I liked the article, but I followed the link to strings are hard and I wasn't so impressed with the guy. To quote:
As every compiler writer knows, lexing and parsing are the slowest part of compiling.
How about optimizing compilers? I find compile times take a hit when I turn on optimizations in gcc and specify -O3 as opposed to -O0. Somehow, I doubt it's a result of lexing or parsing.
It's a gross generalization and I strongly suspect that sort of thing permeates all his writings, given that he spouts off on XML in the aforementioned article as well.
Woz
No, my bad, I mis-typed. It should say IP, NOT UDP. The caffeine hasn't kicked in yet.
I think "fundamentally broken" is an overstatement. TCP/IP and the "actor transportation" metaphor have the following in common:
-The low-level transfer is unreliable: things might come thru damaged or not at all, and things might not arrive in the order they were sent.
-A higher level management system deals with transmission failures by requesting retransmissions as needed. It also deals with ordering problems by holding onto things and putting them in the right order.
No metaphor is ever a perfect fit, but this one is a pretty good one, if you ask me. I think that even non-technical types generally understand that you can make arbitrarily many copies of information (they understand Xerox machines, for example), so I don't think this one failing greatly compromises the overall teaching usefulness of the metaphor.
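For the "holding onto things and putting them in the right order" part, a minimal Python sketch of a reassembly buffer might look like the following. It is a simplification under stated assumptions: `reassemble` is a made-up name, sequence numbers count whole segments starting at 0 (real TCP numbers bytes), and retransmission requests are left out entirely.

```python
def reassemble(segments):
    """Deliver payloads in sequence order, buffering early arrivals.

    segments is an iterable of (seq, payload) pairs in arrival order.
    Numbering whole segments from 0 is an assumption made for brevity.
    """
    expected = 0       # next sequence number the application is waiting for
    held = {}          # out-of-order segments waiting for their turn
    delivered = []
    for seq, payload in segments:
        if seq < expected or seq in held:
            continue   # duplicate (e.g. from a retransmission); ignore it
        held[seq] = payload
        # Hand over every segment that is now contiguous with what came before.
        while expected in held:
            delivered.append(held.pop(expected))
            expected += 1
    return delivered


arrivals = [(1, "b"), (0, "a"), (1, "b"), (3, "d"), (2, "c")]
print(reassemble(arrivals))  # ['a', 'b', 'c', 'd']
```

If segment 0 never shows up, nothing is delivered no matter how much arrives after it, which is the actors-waiting-backstage part of the metaphor.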
"1) Good! You got yr packet!"
It seems like your keyboard keeps dropping packets. Could we have a repost of this comment?
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
Yup, you're right. Should have gone to bed before 4 last night and I wouldn't be making stupid mistakes (well, at least not as many).
I submit that your "foo" + "bar" example, or variations on that theme, will continue to separate the wheat from the chaff in IT for the foreseeable, pre-Terminator, ante-dystopian apocalyptic meltdown future.
Resistance is feudal.
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
When told to convert Fortran code over to C (over a million lines) I knew it was going to take me forever. f2c doesn't work in this case since the code is soooo messed up to begin with. So, I found myself doing repetitive conversions over and over again that are specific to the code base.
Solution:
Created a perl script that translates parts of it for me and highlights the rest that has to be hand changed and looked over.
So, to solve one problem I created a slew of more problems with the script freaking out and messing up code.
So far though, it's saved me every bit of that time that I would have spent working on tedious simple stuff. Which in turn allows me to post to Slashdot more!!!!
Reading the article, I get the feeling that Joel has gotten used to more leakiness in his tools (Visual Basic, MFC, C++ AFAIK) than he should have. Let me explain:
Consider two imaginary libraries: Really Nice Library (RNL) and Mostly Crufty Functions (MCF).
Both expose some unavoidable leakiness when a) the abstractions don't fit your problem, or the problem really *is* at that lower level, or b) you are forced to consider the performance characteristics of the lower level, which the abstractions cannot hide.
In addition to that, MCF adds some more leakiness of its own: buggy implementation, bad assumptions, a badly designed interface, common cases not covered by abstractions etc. etc.
When there's leakiness of the first kind, Joel is right: the programmer should be capable of solving the problem at the lower level (and the tools should make it easy for him to do so).
But when you keep experiencing a lot of leakiness of the second kind, maybe you should go looking for a better set of tools. Life's just too short...
Stupidity is mis-underestimated.
While I usually like Joel's work, I'm pissed about the random jab at C++. For those who didn't read the article, he says something along the lines of
"A lot of the stuff the C++ committee added to the language was to support a string class. Why didn't they just add a built-in string type?"
It's good that a string class wasn't added, because that led to templates being added! And templates are the greatest thing, ever!
The comment shows a total lack of understanding of post-template, modern C++. People are free not to like C++ (or aspects of it) and to disagree with me about templates, of course, and in that case I'm fine with them taking stabs at it. But I get peeved when people who have just given the language a cursory glance try to fault it. If you haven't used stuff like Loki or Boost, or taken a look at some of the fascinating new design techniques that C++ has enabled, then you're in no place to comment about the language. At least read something like the newer editions of D&E or "The C++ Programming Language" then read "Modern C++" before spouting off.
PS> Of course, I'm not accusing the author of being unknowledgeable about C++ or anything of the sort. I'm just saying that this particular comment sounded rather n00b'ish, so to speak.
A deep unwavering belief is a sure sign you're missing something...
I don't think I implied that each packet transmitted required a separate ACK packet. It definitely doesn't. An ACK means that all data up to that sequence number has been received successfully.
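A quick way to see what a cumulative ACK says is a toy Python function like the one below. It is only a sketch: `cumulative_ack` is a made-up name and, unlike real TCP, it numbers whole segments from 0 instead of bytes.

```python
def cumulative_ack(received_seqs):
    """Return the next sequence number the receiver is still waiting for.

    Acknowledging N means "everything below N has arrived", so a single
    ACK can cover many segments. Segment-level numbering from 0 is an
    assumption made to keep the example short.
    """
    got = set(received_seqs)
    next_expected = 0
    while next_expected in got:
        next_expected += 1
    return next_expected


print(cumulative_ack([0, 1, 2]))     # 3: one ACK covers all three segments
print(cumulative_ack([0, 1, 3, 4]))  # 2: segments 3 and 4 arrived but can't be acked yet
```

In the second case the receiver keeps answering 2 until the gap is filled, which is how the sender learns that something in the window went missing.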
Loved this article. Sent it on to my manager and a co-worker.
One thing I liked especially is the danger of the Shiny New Thing. It may be neat and cool and save time, but knowing how to use it does not mean that you can do anything else - or function outside of it.
Right now I'm on an ASP.NET project - and some ASP.NET stuff I actually like. But the IDE actually makes it harder to program responsibly, and even utilize .NET effectively. Unless one understands some of the underpinnings of this NEW technology, you actually can't take advantage of it. Throw in the generated code issues and the IDE, an abstraction of an abstraction, really is disadvantageous.
A friend of mine just about strangled some web developers he worked with as they ONLY use tools (and they love all the Shiny New Ones) and barely know what the tools produce. This has led to hideous issues of having to configure servers and designs to work with their products as opposed to them actually knowing how they work. The guy's a saint, I swear.
I think managers and employers need to be aware of how abstract things can get, and realize good programmers can "drill down" from one layer to another to fix things. A Shiny New Thing made with Shiny New Things does NOT mean the people who did it are talented programmers, or that they can haul your butt out of a jam when the Shiny New Thing loses its shine.
"The Sage treasures Unity and measures all things by it" - Lao Tzu
Almost all C++ string classes overload the + operator so you can write s + "bar" to concatenate. But you know what? No matter how hard they try, there is no C++ string class on Earth that will let you type "foo" + "bar", because string literals in C++ are always char*'s, never strings.
I found out that some PHP library functions (empty, isset, etc.) are not really functions, but "language constructs". They ONLY take variables as arguments. You cannot pass function results, constants, and IIRC expressions.
Apparently they married something to C or C++, ruining the "substitutability rule" of programming.
I couldn't believe my eyes. PHP is supposed to be a "scripting language", not a C++ preprocessor. That kind of shit just makes ASP look competitive. Please don't completely kill MS because OSS will then feel free to expose and force-feed even more archaic C-family ugliness in its languages.
If it is done for speed purposes, then perhaps have an interpreter switch somewhere that lets one choose between "normal acting" functions and a crippled-but-fast option. (Normal being the default.)
Table-ized A.I.
The problem that Joel talks about is not really a problem with abstraction, it is a problem with teamwork. When I program I simply can not do all of it myself for all but trivial projects. By everything I mean write the compilers, write the OS etc. Instead I must rely on other programmers to write large portions of the code that I run. Whether it is the guy across the hall who wrote the search contact Stored Procedure in SQL or a programmer at Microsoft writing a Windows Disk IO function, I am reliant on their code working as I think it should. This is the problem with teamwork but there is no other solution to programming modern applications. Martin
Maybe I'm an old fashioned has-been but people doing software development should understand the fundamentals of how computers work. That means that they should understand things like memory management, they should understand what a pointer is, they should understand how tight loops versus unrolled loops might affect the performance of the caches on their system. I meet so many "programmers" that have no understanding that there are architectural constraints on what they can and can't do. Software runs on hardware. If you're going to write software and treat the hardware as a black box, you're not going to write it as well, or as efficiently as you could be doing it.
Now I always consider performance when designing/writing code, but programmers are WAY more expensive than hardware, so eking out performance can often be a wasted effort. Everyone knows that C will smoke Java in most operations, but it's so hard to manage at the enterprise level that you are much better off taking the 50%+ performance hit and writing in a "leaky" language.
This is not the greatest sig in the world, this is just a tribute.
Looks like he just discovered and renamed the basic idea that "all models are incomplete". Any scientist could tell you that one! I remember a quote that goes something like this: The greatest scientific accomplishment of the 19th century was the discovery that everything could be described by equations. The greatest scientific accomplishment of the 20th century is that nothing can be described by equations.
That's all an abstraction is: a model. Just like Newtonian physics, supply and demand under perfect competition, and every other hard or soft scientific model. Supply and demand breaks down at the low end (you can't be a market participant if you haven't eaten in a month) and the high end (if you are very wealthy, you can change the very rules of the game). Actually, supply and demand breaks down in many ways, all the time. Physics breaks down at the very large or very small scales. Planetary orbits have wobbles that can only be explained by more complex theories. Etc.
No one should pretend that the models are complete. Or even pretend that complete models are possible. However, the models help you understand. They help you find better solutions (patterns) to problems. They help you discuss and comprehend and write about a problem. They allow you to focus on invariants (and even invariants break down).
All models are imperfect. It's good that computer science folks can understand this, however, I don't think Joel should use a term like "leaky abstraction". Calling it that implies the existence of "unleaky abstraction", which is impossible. These are all just "abstractions" and the leaks are unavoidable.
Example: if I unplug the computer and drop it out of a window, the software will fail. That's a leak, isn't it? Think of how you would address that in your model: maybe another computer watches this one so it can take over if it dies..etc..more complexity, more abstractions, more leaks....
He also points out that, basically, computer science isn't exempt from the complexity, specialization, and growing body of understanding that accompanies every scientific field. Yeah, these days you have to know quite a bit of stuff about every part of a computer system in order to write truly reliable programs and understand what they are doing. And it will only get more complex as time goes on.
But what else can we do, go back to the Apple II? (actually that's not a bad idea. That was the most reliable machine I've ever owned!)
However, the best abstractions are those that plug the leaks or at least keep them to a drip rather than a stream. Automatic garbage collection for plugging memory leaks is a good example. Perhaps this is the main reason why I like Perl and Java. Of course, you still get into trouble, but the programs are much easier to debug without all the code tracking and freeing up storage.
But you know what? No matter how hard they try, there is no C++ string class on Earth that will let you type "foo" + "bar", because string literals in C++ are always char*'s, never strings.
WTF? Has he never heard of temporaries? I don't understand this point at all.
Dahlmann tightly grips the knife, which he may have no idea how to use, and steps out into the plain.
haha! It also seems like the messages from the keyboard are arriving out of order! (read the first 5 words of the original post... you see what I mean)
No, unfortunately, that is simply my brain dropping packets destined for my fingers.
Tis a shame, really.
In the future, I would want to not be isolated from my friends in the Space Station.
You don't, and in fact can't, deal with page faults in your Java program. Nonetheless, your Java program will suffer a performance hit when it page faults. That's a leaky abstraction.
Back to TCP. Earlier for the sake of simplicity I told a little fib, and some of you have steam coming out of your ears by now because this fib is driving you crazy. I said that TCP guarantees that your message will arrive. It doesn't, actually. If your pet snake has chewed through the network cable leading to your computer, and no IP packets can get through, then TCP can't do anything about it and your message doesn't arrive. If you were curt with the system administrators in your company and they punished you by plugging you into an overloaded hub, only some of your IP packets will get through, and TCP will work, but everything will be really slow.
This is what I call a leaky abstraction.
On the surface it looks like an almost reasonable way to describe the situation, but when you look closer, you realize it's mish-mash written to look smarter than it is.
Imagine, adding to the example above, you were to flip off your computer, or pour a cola directly on the motherboard... at that point ALL programming would cease to function. All computer code exists at a level of abstraction, even when you are programming in machine language you are still abstracted to some degree away from the hardware...
But that is actually the POINT of computers. Abstraction is what gives computers their strength. It's what allows machines to be programmed to do vastly complex things without requiring a vastly complex piece of code.
All his examples are simply whining that X program can't function when Y event happens. Javascript can't run when JS is turned off in the browser, c++ won't let you add two string literals together, some SQL queries are slower than others...
None of these are inherent faults with abstraction, they are specific instances of poor implementation, instances that can and probably should be fixed. Instead of looking at one flawed analogy and saying that analogies as an argumentative tool are all inherently unusable, you should fix the flaw in that one analogy and use it.
"Your superior intellect is no match for our puny weapons!"
While you are slamming ``liberal arts'' -- a term you seem not to understand -- you highlight the need for it. Liberal arts does not imply a non-scientific, non-technological education. It implies a broad education, including science, mathematics, and engineering along with the ``traditional'' topics of history, literature, languages, politics, economics, and arts. For politics, governance, and management, I want people who are conversant in all of those topics.
Unfortunately, the subjects you list have all grown to the point that no human can obtain even a BASIC understanding of all of them before he's too old to have a useful career left.
It was once possible to be a "Renaissance Man" - a master of ALL the sciences and arts reduced to teachability. No more. It's just too bloody large. (I say this as someone who attended a university that claims to try to produce such people - centuries after the last of them is dead. B-) )
Unfortunately, "Liberal Arts" schools have, over much of the last century, been filled with the mathematically and technically illiterate - both because the students without the necessary skills gravitated there, and because the faculties themselves were so disabled, and in turn disparaged the skills they were incompetent to teach.
The engineering/scientific/biologic/technical curriculum had constant feedback from the real world about what was true and what was false. But the "Arts Schools" taught classes where what was "right" was ONLY a matter of opinion - and grades solely a measure of how well you could regurgitate your Prof's pet bonnet-bees. (This DESPITE the fact that SOME of these theories could be TESTED - if only the academics understood, and/or believed in, things like the scientific method, statistics, and sampling methods.)
Yes the "Social 'Sciences'" are hard. But the bulk of their credentialed practitioners used this as an excuse to drop "science" from their methodologies. (This despite the fact that mathematics departments were generally part of the art, rather than the engineering, side of the school organization.)
I've been out of academia for a while now. I can hope that things have improved, as you seem to claim. But I have not personally seen any sign of such from the outside (other than your claim).
In my school days, too, many students on the Arts side of the wall knew tech, math, and the like. (Students are generally young, and still hunting for their muse.) But they would generally transfer out to some field more conducive to clear thought, drop out to use it in the real world, or (if they stayed in LS&A) suppress it or flunk out.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
No one would cut a lawn with scissors
:)
You'd be surprised what people will cut lawns with. In Brasilia (Capital of Brasil) the standard method of trimming lawns is to use a machete. No, I'm not talking about hacking down waist-high grass, I'm talking about trimming 3-inch high grass down to two inches by hacking repeatedly at it with a machete, trying to swing parallel to the ground as best you can. No, you don't do this yourself, you hire someone to do it. And if you're a salaried groundskeeper, it makes sure that you always have something to do - you wouldn't want to be found slacking off during the day. On rare occasions I've seen people using hedge trimmers (aka big scissors) instead. My family was the only one I knew about in our neighborhood that even owned an American-style lawn mower. My parents were too cheap to hire a full-time groundskeeper, and I have lots of brothers and sisters who work for free.
Moral of the story; if it works and fits the requirements better, someone will do it.
"Space Exploration is not endless circles in low earth orbit." -Buzz Aldrin
TCP SACK (which suggests a resend of the blocks between the acknowledged ones) is becoming common; Linux 2.1.100 and Win98 both had it.
Very interesting story. It seems anything you could call a tool is an abstraction. The very nature of language is abstraction. Only God can accomplish something without abstractions, although even he might not want to go to all that effort.
Donate background CPU time to fight cancer.
Apparently you never had a liberal arts education. A large part of its function is to teach you to baffle with bullshit and debate based on form and not substance.
That is to say, engineers are not in any way similar to liberal arts majors, as you can't fool mother nature.
This is called the Two Generals Problem (they only win if they both attack, but they can't be sure the last messenger made it past the enemy). Sadly, there's a proof that it can't be solved.
many high level abstractions simply do not exist in assembly language.
Consider the following assembly language code:
Okay, so this is a little snippet of some assembly language I've just recently worked on. Here's the declaration for the input file:
That's it. Is this readable? Is it abstracted at a level high enough? The primary difference between assembly and a HLL is that in assembly one must invent their own logical abstractions for a real world problem, where languages such as C/C++ simply provide them.
You've probably noticed that I'm using a lot of macros. In fact, classes, polymorphism, inheritance, and virtual functions are all easily implemented with macros. I'm using NASM right now (though I'm using my own macro processor), and it works very well. Because I understand both the high-level concepts and low level details, I can code rather high-level abstractions in a relatively low level language such as assembler. I get the best of both worlds: the ease of HLL abstraction with the power of low level coding.
Please tell me what you think of this - I would honestly like to know. For the past few years, I've been working on macro sets and libraries that make coding in assembly seem more like a HLL. I've also set rules for function calls, like a function must preserve all registers, except those which are used to pass parms. With a well developed library of classes and routines, I've found that I can develop applications quickly and painlessly. Because I stick to coding standards, I'm able to reuse quite a bit (> 50%) of my assembly code.
You might be tempted to ask, "Why not just write in a HLL then?" I do. In fact, I prefer to write in C++. But when the need arises, it's nice to be able to apply the same abstractions of a HLL in assembly. It just so happens that the need has arisen - I'm working on a project that will last a few weeks, and my boss doesn't consider it fiscally responsible to buy a $1200 compiler that will be used for such a short time.
Interestingly, the use of assembly has made me a better programmer. Assembly forces one to think about what one is doing before coding the solution, which usually results in better code. Assembly forces me to come up with new abstractions and solutions that fit the problem, rather than fitting the problem into any given HLL's logical paradigm. Once I prove that the abstract algorithm will indeed solve the problem, I'm then free to convert the algorithm into assembly. Notice that this is the opposite of the way most HLL coders go about writing code - they find a way in which to squeeze a real world problem into the paradigm of the language used, which leaves them at a loss when "leaky abstractions" occur. Assembly has the flexibility to adapt to the solution best suited to a problem, whereas HLLs, while very good at solving the particular problem for which they were designed, perform very poorly for solving problems outside of their logical paradigms. While assembly is easily surpassed by C/C++, Java, or VB for many problems, there are simply some problems that cannot be solved without it. But even if one never uses assembly professionally, learning it forces one to learn to develop logical abstractions on one's own - which, in turn, increases one's general problem solving ability, regardless of the language in which one writes.
I see the key difference between a good assembly coder and a HLL coder as this: an assembly language coder must invent high level abstractions, whereas the HLL coder simply learns and uses them. So assembly is a bit more mental work.
The society for a thought-free internet welcomes you.
Spolsky's such a nob. He simply relays the watered-down thoughts of truly original thinkers.
If you really want to learn about "leaky abstractions" and a bunch of other topics, including human cognition, complexity, economics, and engineering design, read Herbert Simon's "The Sciences of the Artificial".
Even if higher-level abstractions were perfectly bug-free and non-leaky, there is another tradeoff that would forever preserve the niche of lower-level development tools. The granularity of the abstraction is an inherent tradeoff not just in machine time/efficiency, but in programmer learning curve as well.
Geeky modern art T-shirts
The software field is notorious for employing people whose course of study has been analogous to learning Newton but not learning that there are conditions under which that isn't sufficient. I am not in favor of requiring licensing for all software developers, but there are reasons that states do not allow people to, for example, hire themselves out to do bridge design unless they have demonstrated certain minimum training and competences.
why the hell is he talking about TCP being 'reliable' and IP being 'unreliable' when TCP is a transport protocol and IP is a routing protocol? i kept wanting to plug in 'UDP' for every instance of 'IP' when i was reading his article, and couldn't get very far down it because he was driving me up a wall. i'm all for writing up easy to understand explanations of things, but that was a wildly inaccurate way to easily explain it.
< Today, I won't even go near a programming language lower than C and I like Python much better.
So if programmers don't need to know assembler any more what does this tell us? C's abstractions are not significantly leaky. And as compilers get better and better, higher and higher level abstractions will be leak resistant. A great example is OCaml: a very high-level language that still manages to perform [almost] as well as C.
My point: this article is good and all, but it's hardly the final word.
I'm not a professional SQL programmer, though I've dabbled, so I'd really like to know why this is true. Is it because in the first case, the interpreter can compare all three variables at once, instead of in two different steps in the second case?
Could someone please explain?
Jon Acheson
All opinions expressed herein are my own, and not those of my employers, who are appalled.
That Joel! Always coming up with catchy phrases for concepts that his betters have already published papers on. See David Keppel's 1993 paper for a more thorough explanation, complete with lack of baby-talk.
Quit playing Monopoly with Bill. Switch to one of many non-Microsoft products today.
I would prefer them to use C style expressions rather than weird [] thingies.
And many of the bugs I have had in the last few weeks would just not appear in high level language, e.g. if you branch past an instruction that changes the CPU register sizes.
I'm not a big x86 fan, but don't they usually use ebx these days instead of bx?
It's useful to distinguish between performance-related leaks and correctness leaks. SQL offers an abstraction for which the underlying database layout is irrelevant except for performance issues. The performance issues may be major, but at least you don't have to worry about correctness.
C++ is notorious for this; the language adds abstractions with "gotchas" inside. If you try to get the C++ standards committee to clean things up, you always hear 1) that would break some legacy code somewhere, even if we can't find any examples of such code anywhere in any open source distro or Microsoft distro, or 2) that only bothers people who aren't "l33t".
Hardware people used to insist that everything you needed to know to use a part had to be on the datasheet. This is less true today, because hardware designers are so constrained on power, space, heat, and cost all at once.
Shouldn't a programmer that wanted to be most efficient start with the highest level of abstractions and work his/her way down as needed?
Abstraction brings efficiency (not in raw cycles but in programming time), portability, and other freedoms. In Java, the hardware itself is abstracted so that buffer overflows and other security concerns aren't a problem. C programmers blame buffer overflows on poor programming. I have to ask myself who these people are. If they are so good, then they must have some popular program, right? A program like Bind or Sendmail, perhaps? Even the Linux kernel has had buffer overflows.
Abstraction doesn't cost much by way of performance. Java's abstraction is down from its initial 1600% to 200% of the run time of C. During that time C has gone from its 200% of assembler to closer to 100%. It's reasonable to foresee that in the future any abstraction, even one as severe as the JVM, will eventually approach native speeds through optimized compilers/JIT/AOT.
The best thing about abstractions in the open source world is that a group of people who use these every day will be the ones developing them. They will address the issues that matter to developers, and those issues will only need to be addressed once. You don't have to reinvent the wheel when you design your car. If someone makes a better wheel, it's very likely that you will benefit from it if you use an abstraction.
Imagine the nightmare of converting to IPv6 if most people didn't use (at least portions of) the BSD TCP/IP stack??? Compare that to the trivial gains you get by not using it! What if all programmers decided to use raw sockets to remove the leaky abstraction?
Fix the abstraction, don't abandon it. Have blocking TCP/IP sends (alongside the others) for people that need to know if packets made it. You could just add a function to ask the system if the TCP packet had been sent yet (where sent really meant that an ack was received). You get to keep the interface. By the time the interface isn't leaky, you will have a large interface that performs all the tasks of the original, just in a more human comprehensible way.
Another advantage to abstractions is that they can be faster in the end. OpenGL is an abstraction. It's so popular that video card designers will work hard to make that abstraction as thin as possible. Now we have cards that directly support some or all of the functions. It's cross-platform as well because it is an abstraction.
Don't abandon abstractions because they are slow or aren't all encompassing. Fix the leak, don't toss out the toilet.
Karma Clown
The subject of leaky abstractions applies to novice users as well.
I've felt for a long time that people are taught about computers the wrong way, and this article clarifies why this is true.
People are taught less and less about what the computer actually does, and instead focus on things like the desktop analogy, and task oriented training. The user must then remember all these seemingly strange things computers do that don't follow the abstraction they were taught. This makes them seem difficult and incomprehensible.
The problems created by abstractions intended for users can simply be solved with more complicated software that better models the analogy that the users are taught. Unfortunately, the opposite is probably true for programmers.
-Zaphod
As I see it, the problem with VB is not the high abstraction levels per se, but the fact that they are such lousy abstractions. Especially when it comes to controls. For example, look at the standard VB TreeView object. It becomes painfully clear that the VB derivation of this control was written with one narrowly defined purpose in mind, ignoring the fact that people might find many other uses for this control.
When using an OO language like C++, programmers move objects to a higher abstraction level all the time. That is why every good OO course teaches you to make your object's methods orthogonal and complete. In laymen's terms that means that the object operators should be sensible and generic. If you make an object which holds data that you wish to multiply by 5 and then add 2 to it, you don't make an operator that does just that. Instead, you make one operator that multiplies by x, and another that adds y.
Many VB objects ignore this rule, making operators for a specific purpose, that should have been generic instead, and not bothering to implement other possibly useful operators from a lower abstraction level.
When programming VB, I spend about 30% of my time programming around senseless limitations of and omissions from VB objects and controls.
If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
The age old question (at least in the last few years), is abstraction really beneficial to software development? The answer is.......Yes.
Abstraction is good to an extent. The problems that most people bring up regarding abstraction I find to be more of a problem with programmers than with programming. There are always going to be more shitty programmers than good ones. It's the same in everything in life, there are always fewer good than bad.
Abstraction is awesome, it allows for me to think about things that can make my software better, faster, and rich with features while not having to concern myself with the nitty gritty bits and bytes all the time. I have no problem, actually I enjoy, digging into the code to fix obscure bugs and problems with a particular abstraction.
People always say programmers are lazy, I say it all the time. But the reality is good programmers aren't lazy. Good programmers take pride in their craft and are eager to learn everything about the language or environment they are working in. Good programmers benefit tremendously with good abstraction. Bad programmers use abstraction as a crutch to get things done, then complain when something breaks that they can't figure out.
One last point about abstraction. I have seen people take abstraction too far. There is a point at which abstracting any further handcuffs the programmer and creates too much overhead to make it worth its while. I prefer to implement an abstraction only when I feel the cost of not implementing the abstraction is higher than implementing the abstraction.
Programming is an art, some are good some are not. Everyone has their own style that they prefer to work with, and what works for some can fail miserably for others. I fortunately work by myself most of the time so I have the freedom to almost always work in my world; so all is good.
Remember, just because some people are out there abusing abstraction, as if it ever did anything to them, doesn't mean that abstraction is bad and we should never utilize that technique ever again.
LoRider
Good article. I'm reminded of a couple of bits of wisdom from The Pragmatic Programmer. The authors describe a condition they call "Programming By Coincidence", which occurs when a programmer somehow comes up with code that works, but doesn't understand the system well enough to be able to explain why it works, and is therefore unable to properly maintain or debug his code. Frequently, I think, it is the underlying abstractions that are misunderstood by the programmer.
The authors also refer to "Evil Wizards", which are code generators used by programmers who mindlessly employ the generated code, completely without understanding it. Personally, I consider various frameworks and libraries to be potentially "evil" in this sense as well.
Please donate your spare CPU cycles to help fight cancer and other diseases
Damn straight! One of the biggest problems with Java is how types are arbitrarily built-in or not. I'm tired of wrapping my ints in Integers just so I can use them in collections and then getting their value when I need to do some math. Mind you, there'd be no problem if operators were overloaded so that I could just use Integer from start to finish. But of course with a built-in String class, there's less pressure for Java to support operator overloading (just as non-native strings encouraged the creation of C++ templates).
As he argues, "paradoxically [...] becoming a proficient programmer is getting harder and harder," though posing as one and being able to get some stuff to vaguely work is getting easier--the frequency at which proficient programmers are needed is going down.
> I can be pretty sure that in my career I will never be required to develop in assembler.
Well if all you do is write Excel macros, that's probably true...
ActiveX is an important abstraction in the Windows world, and boy is it a leaky abstraction if there ever was one. I desperately want to know how ActiveX works at the low level, but Microsoft either doesn't want to explain it or won't explain it. Yeah, yeah, every Microsoft Press book starts with an explanation of IUnknown -- QueryInterface, AddRef, and Release -- but once they cover the basic mechanics of vtable interfaces the explanation is that some kind of miracle occurs and then you have ActiveX containers and objects in their full glory, and don't worry your pretty little head about how to roll your own, simply use the wizards (gosh how I hate that term for automatic code generation that obfuscates). Maybe Microsoft doesn't encourage going inside their abstraction because internally they are so baroque (i.e. ornate to no known functional purpose) or just plain broke. And yes, if you want to do one tiny little thing that is outside the scope (or perhaps the documented scope) of that abstraction, you spend hours banging your head against the wall with Google and MSDN searches and software code experiments.
The abstraction is a reliable byte stream, which of course isn't really possible due to phenomena that can only be affected by interfaces beneath TCP. A leak that's documented is still a leak.
The reliability of TCP is why every exciting email from embezzling East Africans arrives in letter-perfect condition
I assume this is a reference to the abundant "Nigeria scams", however, if you look on a map, you will see that Nigeria is actually in so-called "West Africa".
I just thought of another reason why abstractions are important. Take the famous MP3 -> Ogg conversion. Semantically, many aspects of the two types don't match, so the resulting conversion is less than perfect. Well, the Brain -> Machine Code conversion is something more akin to an MP3 -> PDF scenario. And for that reason, programming in a low level language (for scopes big enough to trash the mental cache, so to speak) has a similar conversion problem. Human beings think at an extremely high level, and can manipulate thousand-layer abstractions with ease (think of, for example, the abstraction of a 'cat'). Even something considered high-level in the programming world (objects and generic algorithms) is low-level to human beings. The ideal abstraction would be, of course, natural language. Human beings think in natural language, and translating thoughts into natural language is trivially easy. For example, I'm writing a linker at the moment. I'd just like to be able to say:
Sort sections by name.
Remove all duplicate sections that have the LINK_ONCE parameter set.
Merge all other duplicate sections.
For each relocation in the relocation table, find the corresponding symbol and update the necessary address.
Merge string tables, and update all string references.
Write sections to this file.
There we go, six "lines of code." Very close mapping between my thoughts and the code. Assuming the compiler carries out my instructions faithfully, what are the chances of a bug in this code? Nearly zero. It's quite easy to verify that six lines of code is logically correct. Verifying that the thousands of lines of C++ that this translates to is both logically and semantically correct is much, much, harder.
A deep unwavering belief is a sure sign you're missing something...
... machine code itself is an abstraction in the first place. This is especially true for modern processors that reorder instructions, execute them in parallel, and in extreme cases convert them into an entirely different instruction set.
Says that all abstractions are ultimately leaky because you can never construct a logical system that is complete and consistent (but get close enough for government work). Program abstractions usually leak long before they reach Goedel's limit.
I know all that.
After explaining what TCP was, Joel said:
By comparison, there is another method of transmitting data called IP which is unreliable.
My point, likely poorly made due to crankiness, was that TCP and IP are at different layers of the protocol stack, and TCP in fact assumes IP, and therefore you can't compare them. I do understand the analogy he was trying to make to abstraction layers, I was just trying to say that his language was tweaking the hell out of me.
marriage
Leaky abstractions....?
oh come on.... Nothing is foolproof... every element of computing is an abstraction of something else.... Even assembly code is an abstraction of processor commands...
Not trying to insult the guy, he wrote a great article... but every time you use any command in any language you're using an abstraction of many more commands... all we do in coding is build yet more tools not in any way dissimilar from the ones we're already using... Classes and functions, even straight scripted programs.
My sense of smell is an abstraction from my nose's ability to respond to chemicals... So when I smell shit I suppose that's a leaky abstraction because I don't want to smell it....
--Idiots, Every single one of YOU, A flaming mass of conglomerated morons, hey wait a second, isnt that how RAID works?
It's a good article for beginner programmers who are still trying to wrap their brains around the concepts, but it's hardly a new idea. Unfortunately, the author doesn't name the phenomenon for what it is; it's not a problem with abstractions themselves, but a result of the logic and algorithms upon which programming is founded. Abstractions are based on assumptions, and when those assumptions are not true, the abstraction breaks. Take his example of TCP breaking when the network cable is unplugged: since TCP assumes the destination is reachable, it will obviously break when this assumption is violated. His "law of leaky abstractions" is simply the "law of violated assumptions" (see? I can come up with fancy labels too).
Higher Logics: where programming meets science.
This is why learning C with plain old vi and gcc is a good idea: You learn the basics first and then you can advance to an IDE later. Actually there's a saying for this: "You have to crawl before you can walk"
Now this is absolutely true...and the cool thing is, the analogy between (mechanical) engineering and software holds true. In a mechanical system, you NEVER, I repeat NEVER get an absolute description; a good engineer knows to what level he can simplify (i.e. detail the mechanical efficiency, do some thermal calculations, then leave out the air friction because the effect is minimal and can be left out [in this case...in another situation it could well be the major factor] without dramatically changing the final strength/elasticity calculations).
A slightly less proficient engineer will calculate the system, together with the next layer of the system (i.e. he will calculate the air friction), then discard it [or incorporate it since he did the work anyway] because he found out the effect is not significant.
It seems the same thing goes with software development; you abstract to the proper level...but you need to know what level you can safely stop at.
-- Waht? Tehr's a preveiw buottn?
I am one of the lucky guys who learned programming in machine code first - I actually hand assembled an interrupt driven program that would put the time on a memory mapped display back in 1979.
The schools mistakenly start out with a high level language. For some reason they think no one needs to get good at assembly language. (I would not advise learning Intel assembly first. It is a very poor design for humans to use, one of the reasons it took the Intel world 10 years to catch up to the Amiga's OS, but alas Intel is the standard.)
I also moved on to high-level programming - and wrote programs in both assembler and C - compiled them and counted machine cycles. Once you see where the compilers fail to produce good code, you can compensate where necessary.
A weird trend has happened where I find it most effective to write in the highest or lowest level and avoid anything in between.
Strangely enough, I was discussing this with a co-worker yesterday. We joked about the silly programmers who return pointers to local variables, or think "char *" is the same as "string".
char *s;
strcat(s, "Mary had a little lamb");
The result of the discussion was that everyone needs to understand at least one level down from the level of abstraction at which they will be working. For example, someone using sockets for TCP/IP will probably have a very hard time understanding what is going on unless he knows the general concepts of TCP. A C programmer really needs to understand a little bit about how the C code is going to look and work at the assembly level or she will never understand why the code above crashes.
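For anyone wondering about the snippet: `s` is declared but never pointed at any storage, so `strcat` first scans whatever address happens to be in it looking for a terminator and then writes the new text there. That is undefined behavior, and a crash is the lucky outcome. Here is a minimal sketch of one way to do it safely, assuming a heap-allocated result is acceptable; `concat_copy` is a hypothetical helper name, not a standard library function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding
   a followed by b. The caller owns the result and must free() it.
   Returns NULL if the allocation fails. */
char *concat_copy(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t len_a = strlen(a);
    size_t len_b = strlen(b);

    /* One extra byte for the terminating '\0'. */
    char *out = malloc(len_a + len_b + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, a, len_a);
    /* Copy b including its terminator, so out ends up terminated. */
    memcpy(out + len_a, b, len_b + 1);
    return out;
}
```

With this, `char *s = concat_copy("", "Mary had a little lamb");` gives `s` real memory to live in, at the cost of a `free(s)` later. A fixed-size `char s[64] = "";` followed by `strcat` would also work here, as long as the buffer is known to be big enough.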
In addition to providing the details needed to make sense of the higher-level abstraction layer, a basic understanding of the lower-level abstraction layer allows a good programmer to make educated guesses about potential trouble spots -- performance traps or feature limitations.
That is why I have always advocated an understanding of assembly language, computer architecture, TCP/IP protocols, etc. If you know TCP/IP, you can probably figure out what is wrong with your sockets code much more quickly. If you know how the Ethernet works, you will have much more success trying to wire your apartment. You might not technically need that knowledge, but it helps to be able to guess the answer to "can I just get better shielding to make the ethernet signal go farther?" rather than look it up.
Time flies like an arrow. Fruit flies like a banana.
how much you abstract. Abstractions exist everywhere, there's no getting away from that. We can't reinvent the wheel every time we go out for a ride on our bicycle. However, sometimes abstractions go too far, and make the process slow and unoptimized. That's the difference between an experienced programmer and a rookie: the experienced one knows where to take advantage of an abstraction, and where to go under the hood and get dirty.
...every so often discovers some minor details of things that he was supposed to know BEFORE starting any software-related work, and publishes his discoveries on his site with large amount of drivel, usually inspired by his troubled childhood at Microsoft.
Contrary to the popular belief, there indeed is no God.
The subjects he lists are history, literature, languages, politics, economics, and arts, science, mathematics, and engineering. Why would you disagree that it's possible for someone to obtain competency in these areas?
I believe you misunderstood me.
The point was not to obtain competence in these areas. It was to obtain MASTERY of ALL AREAS of human knowledge SIMULTANEOUSLY.
No longer possible.
(Of course it's even harder when the school systems don't even teach their students to SPELL it correctly. B-) )
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
In classical aerospace procurement, the specification rules. It's quite common to order the same part with the same specification from different suppliers. The units supplied must meet specified tests and be functionally interchangeable.
Software companies hate that. There were screams when DoD tried to force that on Ada compiler vendors. There was a test suite, acceptance testing, and some compilers flunked. DoD contractors couldn't use those compilers.
Software companies aren't used to getting products returned with a big red tag marked REJECTED - Does not conform to specification. And it doesn't end there. There's a follow-up from the quality department questioning their qualifications to continue as a supplier. Then there's interchange of quality data between DoD buyers that puts the vendor on record as having quality problems. In the aerospace world, there's an organized process for hammering on vendors until their stuff works, or until the vendor gets replaced.
DoD lost that battle in software, though. That approach worked great at keeping aerospace machine shops from getting sloppy. But DoD doesn't buy enough software or computers any more. There was a time when DoD had real influence in computing (the Internet and BSD UNIX were both DoD-funded projects). But that influence declined in the 1980s as the PC sector, where consumers are weak, grew.
And that's what this is all about. Nobody today is in the position to say to Microsoft "Make it work right or we pull the plug and you go out of business". Other vendors have picked up on this, and shipping crap has become a trade custom. Since consumers tend to buy on features rather than quality, we're not seeing pushback on this.
That's why our abstractions leak. Nobody is pushing back hard enough to insist that they don't. We know how to do it - overdesign inside, heavy testing, clear specifications, a willingness to reject for noncompliance, and the power to make that rejection stick. But nobody today is in a position to do that except for some very specialized procurements.